aboutsummaryrefslogtreecommitdiffstats
path: root/manifests/profile/pacemaker
AgeCommit message (Collapse)AuthorFilesLines
2017-07-17Merge "Add option for innodb_flush_log_at_trx_commit = 2 for Galera only"Jenkins1-35/+44
2017-07-13Fix typo in haproxy bundleMichele Baldessari1-2/+2
Change I6f4d3a5abae8f1781cfe6f69ff960aad500061e3 slipped in a typo and it removed the '$' character from a puppet manifest. Which causes a deployment to fail with: INFO: running container haproxy-bundle-docker-0 for the first time ERROR: /usr/bin/docker-current: Error response from daemon: Invalid bind mount spec "deployed_ssl_cert_path:deployed_ssl_cert_path:ro": Invalid volume destination path: 'deployed_ssl_cert_path' mount path must be absolute.. See '/usr/bin/docker-current run --help'. ERROR: docker failed to launch container Change-Id: Ic602fd443d38482bf1f924531561b2174dc38293
2017-07-13Merge "Leverage kolla config_files to copy config into containers"Jenkins6-43/+23
2017-07-12Leverage kolla config_files to copy config into containersMartin André6-43/+23
This solves a problem with bind-mounts when the containers are holding files descriptors open. At the same time this makes the template more robust to puppet changes since new config files will be available in the containers without needing to update the templates. Closes-Bug: #1698323 Change-Id: I857c94ba5f7f064d7c58df621ec5d477654b9166 Depends-On: I78dcec741a941dc21adba33ba33a6dc6ff1d217c
2017-07-10Let pacemaker bind-mount needed cert for haproxy bundleDamien Ciabrini1-5/+16
When SSL configuration is enabled, haproxy expects to load a SSL certificate file at startup. Update the bundle configuration to always bind-mount the cert file, to support both SSL and non SSL HAproxy bundle deployments. Change-Id: I6f4d3a5abae8f1781cfe6f69ff960aad500061e3
2017-07-06Add option for innodb_flush_log_at_trx_commit = 2 for Galera onlyMike Bayer1-35/+44
The innodb_flush_log_at_trx_commit flag changes the timing of when the log buffer is written to disk for writes. At its default of 1, transactions are written to disk and the buffer flushed on a per-transaction basis; but when set to 2, the flush of the buffer proceeds only once per second. This removes the durability guarantee for the single node. However the central concept of Galera is that durability is achieved via the cluster as a whole, in that transactions are replicated to other nodes before the commit succeeds (though not necessarily written to disk unless wsrep_causal_reads is set). In this model, data would only be lost of all nodes of the Galera cluster were killed within one second of each other. Percona's blog post at https://www.percona.com/blog/2014/11/17/typical-misconceptions-on-galera-for-mysql/ recommends that the value of 2 should be considered "safe" for a Galera cluster unless you are in fact worried that all three nodes will be powered off simultaneously. The value here is added as an option only, defaulting to the usual default of "1", flush per transaction. Change-Id: Id5a30f1daf978e094a74db2d284febbc9ae64bb3
2017-06-21Enable TLS for MySQL's replication trafficJuan Antonio Osorio Robles1-6/+43
This enables the options so Galera can use TLS for the replication traffic. bp tls-via-certmonger Depends-On: I9252303b92a2805ba83f86a85770db2551a014d3 Change-Id: I2ee3bf4bbda3f65f5b03440ecbc75f14225a2428
2017-06-16Merge "Ensure hiera step value is an integer"Jenkins17-17/+17
2017-06-14Merge "Do not create VIP for pacemaker OVN OCF resource"Jenkins1-29/+7
2017-06-14Ensure hiera step value is an integerSteve Baker17-17/+17
The step is typically set with the hieradata setting an integer value: {"step": 1} However it would be useful for the value to be a string so that substitutions are possible, for example: {"step": "%{::step}"} This change ensures the step parameter defaults to an integer by calling Integer(hiera('step')) This change was made by manually removing the undef defaults from fluentd.pp, uchiwa.pp, and sensu.pp then bulk updating with: find ./ -type f -print0 |xargs -0 sed -i "s/= hiera('step')/= Integer(hiera('step'))/" Change-Id: I8a47ca53a7dea8391103abcb8960a97036a6f5b3
2017-06-13Merge "Make sure the resource bundles use a location_rule"Jenkins4-0/+16
2017-06-13Merge "Configure Galera cluster with FQDNs instead of shortnames"Jenkins1-11/+13
2017-06-13Configure Galera cluster with FQDNs instead of shortnamesJuan Antonio Osorio Robles1-11/+13
This takes into use the cluster_host_map, which allows to give aliases to the pacemaker nodes (which are FQDNs), and allows us to configure the cluster using FQDNs. We need FQDNs in order to request certificates, since the default CA (FreeIPA) only allows certificates for FQDNs. Change-Id: I2f146afdd32aef2d11cf25a65fa8d67428f621f5
2017-06-12Merge "Puppet module to deploy cinder-backup bundle for HA"Jenkins1-0/+146
2017-06-12Merge "Puppet module to deploy cinder-volume bundle for HA"Jenkins1-0/+141
2017-06-11Merge "Install rsync package for galera"Jenkins1-0/+9
2017-06-11Do not create VIP for pacemaker OVN OCF resourceNuman Siddique1-29/+7
The commit with change id [1], added the pacemaker HA support for OVN DB servers. That commit created a new VIP which is really not required. This patch removes the code to create a new ip resource. Instead it expects the pacemaker ip resource (with the ip address in the 'ovn_dbs_vip' parameter and with the name "ip-$ovn_dbs_vip") to be created before ovn_northd class is called, which is the case anyway if 'ovn_dbs_vip' is taken from the ServiceNetMapDefaults (in t-h-t). [1] - I9dc366002ef5919339961e5deebbf8aa815c73db Change-Id: I94d3960e6c5406e3af309cc8c787ac0a6c9b1756 Partial-bug: #1670564
2017-06-09Make sure the resource bundles use a location_ruleMichele Baldessari4-0/+16
In composable HA we bind resources to nodes that have special node properties. We need to do this also for bundle resources otherwise there is a potential race where the bundle might be started on nodes where it is not supposed to during a small window of time. Tested with the depends-on and correctly obtained a containerized composable HA deployment: Docker container set: rabbitmq-bundle [192.168.24.1:8787/tripleoupstream/centos-binary-rabbitmq:latest] rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-rabbit-0 rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-rabbit-1 rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-rabbit-2 Docker container set: galera-bundle [192.168.24.1:8787/tripleoupstream/centos-binary-mariadb:latest] galera-bundle-0 (ocf::heartbeat:galera): Master overcloud-galera-0 galera-bundle-1 (ocf::heartbeat:galera): Master overcloud-galera-1 galera-bundle-2 (ocf::heartbeat:galera): Master overcloud-galera-2 Docker container set: redis-bundle [192.168.24.1:8787/tripleoupstream/centos-binary-redis:latest] redis-bundle-0 (ocf::heartbeat:redis): Master overcloud-controller-0 redis-bundle-1 (ocf::heartbeat:redis): Slave overcloud-controller-1 redis-bundle-2 (ocf::heartbeat:redis): Slave overcloud-controller-2 ip-192.168.24.11 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0 ip-10.0.0.7 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1 ip-172.16.2.11 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2 ip-172.16.2.9 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0 ip-172.16.1.6 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1 ip-172.16.3.7 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2 Docker container set: haproxy-bundle [192.168.24.1:8787/tripleoupstream/centos-binary-haproxy:latest] haproxy-bundle-docker-0 (ocf::heartbeat:docker): Started overcloud-controller-0 haproxy-bundle-docker-1 (ocf::heartbeat:docker): Started overcloud-controller-1 haproxy-bundle-docker-2 (ocf::heartbeat:docker): Started overcloud-controller-2 Depends-On: I44449861cbfe56304b8829c9ca10fd648353b3ae Change-Id: I48fb490040497ba08cae19937159c0efdf99e3f8
2017-06-08Puppet module to deploy cinder-volume bundle for HADamien Ciabrini1-0/+141
This module is used by tripleo-heat-templates to configure and deploy Kolla-based cinder-volume containers managed by pacemaker. We use short-lived containers that call pcs via puppet to create the needed pacemaker resources, properties and constraints. Co-Authored-By: Michele Baldesari <michele@acksyn.org> Partial-Bug: #1668920 Change-Id: I95ad4dd89b47396bea672813d87de35e64c04b2d
2017-06-08Puppet module to deploy cinder-backup bundle for HADamien Ciabrini1-0/+146
This module is used by tripleo-heat-templates to configure and deploy Kolla-based cinder-backup containers managed by pacemaker. We use short-lived containers that call pcs via puppet to create the needed pacemaker resources, properties and constraints. Co-Authored-By: Michele Baldesari <michele@acksyn.org> Partial-Bug: #1668920 Change-Id: If53495ff75d4832cc6be80dc0dc9bd540ab6583b
2017-06-05Merge "Pacemaker support for OVN DB servers"Jenkins1-0/+121
2017-06-03Merge "Clustercheck, monitor service for galera containers"Jenkins1-0/+65
2017-06-03Merge "Puppet module to deploy MySQL bundle for HA"Jenkins1-0/+302
2017-06-03Merge "Puppet module to deploy RabbitMQ bundle for HA"Jenkins1-0/+194
2017-06-02Merge "Puppet module to deploy Redis bundle for HA"Jenkins1-0/+178
2017-06-02Puppet module to deploy MySQL bundle for HADamien Ciabrini1-0/+302
This module is used by tripleo-heat-templates to configure and deploy Kolla-based mysql containers managed by pacemaker. We use short-lived containers that call pcs via puppet to create the needed pacemaker resources, properties and constraints. Co-Authored-By: Michele Baldesari <michele@acksyn.org> Partial-Bug: #1692842 Depends-On: I44fbd7f89ab22b72e8d3fc0a0e3fe54a9418a60f Depends-On: Ie9b7e7d2a3cec4b121915a17c1e809e4ec950e7f Change-Id: I3b4d8ad2eec70080419882d5d822f78ebd3721ae
2017-06-02Merge "Puppet module to deploy HAProxy bundle for HA"Jenkins1-0/+196
2017-06-01Install rsync package for galeraJames Slagle1-0/+9
Since galera is configured to use rsync, we ought to make sure the package is installed. Particularly when using deployed-server, the package is not always installed by default depending on what was used to install the servers. Change-Id: I92ee78f2dd2c0f7fd4d393b104166407d7c654e2 Closes-Bug: #1693003
2017-06-01Pacemaker support for OVN DB serversBabu Shanmugam1-0/+121
This patch enables OVN DB servers to be started in master/slave mode in the pacemaker cluster. A virtual IP resource is created first and then the pacemaker OVN OCF resource - "ovn:ovndb-servers" is created. The OVN OCF resource is configured to be colocated with the vip resource. The ovn-controller and Neutron OVN ML2 mechanism driver which depends on OVN DB servers will always connect to the vip address on which the master OVN DB servers listen on. The OVN OCF resource itself takes care of (re)starting ovn-northd service on the master node and we don't have to manage it. When HA is enabled for OVN DB servers, haproxy does not configure the OVN DB servers in its configuration. This patch requires OVS 2.7 in the overcloud. Co-authored:by: Numan Siddique <nusiddiq@redhat.com> Change-Id: I9dc366002ef5919339961e5deebbf8aa815c73db Partial-bug: #1670564
2017-05-30Puppet module to deploy RabbitMQ bundle for HADamien Ciabrini1-0/+194
This module is used by tripleo-heat-templates to configure and deploy Kolla-based RabbitMQ containers managed by pacemaker. We use short-lived containers that call pcs via puppet to create the needed pacemaker resources, properties and constraints. Co-Authored-By: Michele Baldessari <michele@acksyn.org> Co-Authored-By: John Eckersberg <jeckersb@redhat.com> Partial-Bug: #1692909 Change-Id: I0722e4a4d4716f477e8304cfa1aadd3eef7c2f31 Depends-On: I44fbd7f89ab22b72e8d3fc0a0e3fe54a9418a60f Depends-On: Ie9b7e7d2a3cec4b121915a17c1e809e4ec950e7f
2017-05-25Puppet module to deploy HAProxy bundle for HADamien Ciabrini1-0/+196
This module is used by tripleo-heat-templates to configure and deploy Kolla-based haproxy containers managed by pacemaker. We use short-lived containers that call pcs via puppet to create the needed pacemaker resources, properties and constraints. Co-Authored-By: Michele Baldesari <michele@acksyn.org> Partial-Bug: #1692908 Depends-On: I44fbd7f89ab22b72e8d3fc0a0e3fe54a9418a60f Depends-On: Ie9b7e7d2a3cec4b121915a17c1e809e4ec950e7f Change-Id: Ifcf890a88ef003d3ab754cb677cbf34ba8db9312
2017-05-25Puppet module to deploy Redis bundle for HADamien1-0/+178
This module is used by tripleo-heat-templates to configure and deploy Kolla-based Redis containers managed by pacemaker. We use short-lived containers that call pcs via puppet to create the needed pacemaker resources, properties and constraints. Co-Authored-By: Michele Baldesari <michele@acksyn.org> Partial-Bug: #1692924 Depends-On: I44fbd7f89ab22b72e8d3fc0a0e3fe54a9418a60f Depends-On: Ie9b7e7d2a3cec4b121915a17c1e809e4ec950e7f Change-Id: Ia1131611d15670190b7b6654f72e6290bf7f8b9e
2017-05-23Clustercheck, monitor service for galera containersDamien Ciabrini1-0/+65
In HA overcloud deployments, HAProxy makes use of a helper service called "clustercheck", to check whether galera nodes are available for serving traffic. This change implements a dedicated profile for clustercheck, which was originally part of the pacemaker mysql profile. The profile generates the necessary configuration files for clustercheck and let heat templates manage the associated container's lifecycle. Co-Authored-By: Michele Baldessari <michele@acksyn.org> Partial-Bug: #1692969 Change-Id: I1aabe34fa6a9c8c705a4405f275b66502c313cf2
2017-05-16Composable Role for Neutron LBaaSRyan Hefner1-0/+44
Add composable service interface for Neutron LBaaSv2 service. Change-Id: Ieeb21fafd340fdfbaddbe7633946fe0f05c640c9
2017-05-05Remove limits for redis in /etc/security/limits.dMichele Baldessari1-15/+16
Now that puppet-redis supports ulimit for cluster managed redis (via https://github.com/arioch/puppet-redis/pull/192), we need to remove the file snippet as otherwise we will get a duplicate resource error. We will need to create a THT change that at the very least sets the redis::managed_by_cluster_manager key to true so that /etc/security/limits.d/redis.conf gets created. We also add code to not break backwards compatibility with the old hiera key. Change-Id: I4ffccfe3e3ba862d445476c14c8f2cb267fa108d Partial-Bug: #1688464
2017-04-26Add a flag to rabbitmq so that we can deploy with ha-mode: all againMichele Baldessari1-2/+6
In change Ib62001c03e1e08f58cf0c6e0ba07a8879a584084 we switched the rabbitmq queues HA mode from ha-all to ha-exactly. While this gives us a nice performance boost with rabbitmq, it makes rabbit less resilient to network glitches as we painfully found out via https://bugzilla.redhat.com/show_bug.cgi?id=1441635. Will propose another THT change to actually change the default to -1 so we get this ha-mode:all by default. Change-Id: I9a90e71094b8d8d58b5be0a45a2979701b0ac21c Partial-Bug: #1686337 Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com> Co-Authored-By: John Eckersberg <jeckersb@redhat.com>
2017-04-06Make galera-ready exec refreshonlyAlex Schultz1-2/+3
Previously we were always run the galera-ready exec every step. This change switches it to be refreshonly so we only wait when the service is setup or restarted. Change-Id: I5ff9d49c2590751913b96777bcd72c8a15627a01 Closes-Bug: #1680586
2017-02-17Merge "xinetd: bind only on mysql network"Jenkins1-0/+1
2017-02-16Merge "Fix a typo in mysql.pp"Jenkins1-1/+1
2017-02-03Revert "Revert "set innodb_file_per_table to ON for MySQL / Galera""Alex Schultz1-0/+1
This reverts commit 3f7e74ab24bb43f9ad7e24e0efd4206ac6a3dd4e. After identifying how to workaround the performance issues on the undercloud, let's put this back in. Enabling innodb_file_per_table is important for operators to be able to better manage their databases. Change-Id: I435de381a0f0e3ef221e498f442335cdce3fb818 Depends-On: I77507c638237072e38d9888aff3da884aeff0b59 Closes-Bug: #1660722
2017-02-02Revert "set innodb_file_per_table to ON for MySQL / Galera"Alex Schultz1-1/+0
This reverts commit 621ea892a299d2029348db2b56fea1338bd41c48. We're getting performance problems on SATA disks. Change-Id: I30312fd5ca3405694d57e6a4ff98b490de388b92 Closes-Bug: #1661396 Related-Bug: #1660722
2017-02-01set innodb_file_per_table to ON for MySQL / GaleraMike Bayer1-0/+1
InnoDB uses a single file by default which can grow to be tens/hundreds of gigabytes, and is not shrinkable even if data is deleted from the database. Best practices are that innodb_file_per_table is set to ON which instead stores each database table in its own file, each of which is also shrinkable by the InnoDB engine. Closes-Bug: #1660722 Change-Id: I59ee53f6462a2eeddad72b1d75c77a69322d5de4
2017-01-26Support composable HA for the Ceph rbdmirror daemonGiulio Fidente1-1/+21
Follow up patch for I63da4f48da14534fd76265764569e76300534472 to support composable HA for the Ceph rbdmirror daemon. Change-Id: I3767bee4b1c7849fa85e71bcc57534b393d2d415
2017-01-26Merge "Ensure basic Ceph configuration is performed by RBD mirror"Jenkins1-0/+1
2017-01-25Merge "Composable HA"Jenkins7-47/+204
2017-01-25Composable HAMichele Baldessari7-47/+204
This commit implements composable HA for the pacemaker profiles. - Everytime a pacemaker resource gets included on a node, that node will add a node cluster property with the name of the resource (e.g. galera-role=true) - Add a location rule constraint to force running the resource only on the nodes that have that property - We also make sure that any pacemaker resource/property creation has a predefined number of tries (20 by default). The reason for this is that within composable HA, it might be possible to get "older CIB" errors when another node changed the CIB while we were doing an operation on it. Simply retrying fixes this. - Also make sure that we use the newly introduced pacemaker::constraint::order class instead of the older pacemaker::constraint::base class. The former uses the push_cib() function and hence behaves correctly in case multiple nodes try to modify the CIB at the same time. Change-Id: I63da4f48da14534fd76265764569e76300534472 Depends-On: Ib931adaff43dbc16220a90fb509845178d696402 Depends-On: I8d78cc1b14f0e18e034b979a826bf3cdb0878bae Depends-On: Iba1017c33b1cd4d56a3ee8824d851b38cfdbc2d3
2017-01-25Ensure basic Ceph configuration is performed by RBD mirrorGiulio Fidente1-0/+1
Previously we missed to perform the basic Ceph client configuration on a node where only the RBD mirror service was deployed. Change-Id: Ie6a4284a88714bcee964a38636e12aa88bb95c9d Co-Authored-By: Michele Baldessari <michele@acksyn.org> Related-Bug: #1652177
2017-01-25Fix wrong hiera key in ceph_rbdmirrorMichele Baldessari1-1/+1
There is a typo in the bootstrap check which will lead to: Could not find data item ceph_rbdmirror_bootstrap_short_node_name in any Hiera data file and no default supplied at /etc/puppet/modules/tripleo/manifests/profile/pacemaker/ceph/rbdmirror.pp We need to be using the correct one: $ hiera ceph_rbdmirror_short_bootstrap_node_name overcloud-remote-0 Change-Id: Ic343e5f99e48360bdd2d2989781a4b6ca484e8fc
2017-01-18Add Ceph RBD mirror Pacemaker profileGiulio Fidente1-0/+77
This change adds a profile for the Ceph RBD mirror service, which should be managed by Pacemaker to make sure there is always a single instance running. Change-Id: Ic63dc5cffece38942d305f538f71dd58a5d50789 Partial-Bug: #1652177
2017-01-18Do not depend on bootstrap_nodeid for any pacemaker profileMichele Baldessari7-13/+19
When we create a pacemaker resource it must happen from a single node. If it happens from multiple nodes an immediate error will be returned by pcs. For the pacemaker roles we enforce this by leveraging the recently introduced <SERVICE_NAME_bootstrap_short_node_name> which gives us the first hostname per-service, regardless of the role. (introduced via I03e8685f939e8ae1fcd8b16883b559615042505d) With this approach if a pacemaker service belongs to two different roles (say role Controller on node A and role galera on node B), it will only create the resource from one of the two and not both (which would return an error). Only setting Partial-Bug for this one, because it addresses the issue from the pacemaker resource creation POV (which is always affected). But the issue itself is a race that we're theoretically affected by since the composable roles work landed. While I have tried to fix the more general case in previous attempts, I think it is best if we start a discussion on how to fix it, because each approach has a bunch of potential drawbacks and is quite invasive on how we do things. A discussion slot for this has been proposed for the Atlanta PTG. Change-Id: I662398cab60d523d204b57a5674ca8f5c0f2e68a Partial-Bug: #1615983