summaryrefslogtreecommitdiffstats
path: root/docker/services/pacemaker
AgeCommit message (Collapse)AuthorFilesLines
2017-11-04Fix iptables rules override bug in clustercheck docker serviceMichele Baldessari1-1/+4
When deploying a composable HA overcloud with a database role split off to separate nodes we could observe a deployment failure due to galera never starting up properly. The reason for this was that instead of having the firewall rules for the galera bundle applied (i.e. those with the extra control-port for the bundle), we would see the firewall rules for the BM galera service. E.g. we would see the following on the host: tripleo.mysql.firewall_rules: { 104 mysql galera: { dport: [ 873, 3306, 4444, 4567, 4568, 9200 ] Instead of the correct mysq bundle firewall rules: tripleo.mysql.firewall_rules: 104 mysql galera-bundle: dport: [ 873, 3123, 3306, 4444, 4567, 4568, 9200 ] The reason for this is the following piece of code in https://github.com/openstack/tripleo-heat-templates/blob/master/docker/services/pacemaker/clustercheck.yaml#L62: ... MysqlPuppetBase: type: ../../../puppet/services/pacemaker/database/mysql.yaml properties: EndpointMap: {get_param: EndpointMap} ServiceData: {get_param: ServiceData} ServiceNetMap: {get_param: ServiceNetMap} DefaultPasswords: {get_param: DefaultPasswords} RoleName: {get_param: RoleName} RoleParameters: {get_param: RoleParameters} outputs: role_data: description: Containerized service clustercheck using composable services. value: service_name: clustercheck config_settings: {get_attr: [MysqlPuppetBase, role_data, config_settings]} logging_source: {get_attr: [MysqlPuppetBase, role_data, logging_source]} ... Depending on the ordering of the clustercheck service within the role (before or after the mysql service), the above code will override the tripleo.mysql.firewall_rules with the wrong rules because we derive from puppet/services/... which contain the BM firewall rules. Let's just switch to derive from the docker service so we do not risk getting the wrong firewall rules during the map_merge. Tested this change successfully on a composable HA with split-off DB nodes. Change-Id: Ie87b327fe7981d905f8762d3944a0e950dbd0bfa Closes-Bug: #1728918 (cherry picked from commit 3df6a4204a85b119cd67ccf176d5b72f9e550da6)
2017-10-10Merge "Adds pacemaker update_tasks for Pike minor update workflow" into ↵Jenkins8-3/+212
stable/pike
2017-10-10Merge "Make containerized galera use mysql_network everywhere" into stable/pikeJenkins1-0/+6
2017-10-09Adds pacemaker update_tasks for Pike minor update workflowmarios8-3/+212
Adds update_tasks for the minor update workflow. These will be collected into playbooks during an initial 'update init' heat stack update and then invoked later by the operator as ansible playbooks. Current understanding/workflow: Step=1: stop the cluster on the updated node Step=2: Pull the latest image and retag the it pcmklatest Step=3: yum upgrade happens on the host Step=4: Restart the cluster on the node Step=5: Verification: test pacemaker services are running. https://etherpad.openstack.org/p/tripleo-pike-updates-upgrades Related-Bug: 1715557 Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com> Co-Authored-By: Sofer Athlan-Guyot <sathlang@redhat.com> Change-Id: I101e0f5d221045fbf94fb9dc11a2f30706843806 (cherry picked from commit a953bda0ae615dc44d3e8a70aa7ab0160e26f3af)
2017-10-09docker: add logging(source & groups)Juan Badia Payno9-0/+18
The services that docker depends on, have logging_sources and logging_groups; but those are not set on the docker outputs so they are not used when dockers are deployed. Added logging_source & logging_groups as docker optional parameters in tools/yaml-validate.py Closes-Bug: #1718110 Change-Id: I8795eaf4bd06051e9b94aa50450dee0d8761e526 (cherry picked from commit 5dbe1121e98a794ec6a6387ff56ee34314177567)
2017-10-07Make containerized galera use mysql_network everywhereDamien Ciabrini1-0/+6
The containerized galera service generates a galera.cnf which uses short hostname to identify itself rather than the fqdn from the mysql_network (e.g. overcloud-x.internalapi.cloudname). This breaks when internal TLS is in use, because the mysql certificate does not reference this short hostname. Fix the appropriate hiera parameter to make it behave like the non-containerized galera service. Change-Id: I904cde38f2baeddab5178e8ad48d34a0c73629af Closes-Bug: #1719599 (cherry picked from commit e10aa591dc9155a2746df01279c4ba4f2133fd17)
2017-09-21Merge "Use haproxy-systemd-wrapper as pid1 in containerized Haproxy" into ↵Jenkins1-3/+2
stable/pike
2017-09-20Use haproxy-systemd-wrapper as pid1 in containerized HaproxyDamien Ciabrini1-3/+2
This wrapper binary spawns the HAproxy daemon and implements a coordinated HAproxy restart on SIGHUP. From a service's perspective, this allows reloading the HAProxy configuration with minimal service disruption, i.e. without stopping and restarting the HAProxy container. Closes-Bug: #1717521 Change-Id: Ib3ef0c0bcf1a8151e179ff4d7509cf0d6b3ac5a1 (cherry picked from commit 91cd44cd7266c15ce07fafbee9d2e33f226096ba)
2017-09-20Disable all uses of wsrep-provider in mysql_bootstrap containerDamien Ciabrini1-2/+4
During the bootstrap of the mariadb database, galera replication must be disabled while the users credentials are being set up. This is done by setting wsrep-provider=none when starting mysqld_safe. Icf67fd2fbf520e8a62405b4d49e8d5169ff3925b already disabled it when the clustercheck credentials are being set up, but Kolla also start a temporary server for setting up the root password. Disable the setting directly at the end of the mysql.cnf in the running container. That way, the default setting from galera.cnf will be overriden, all mysqld_safe calls will disable WSREP and the setting will stay ephemeral. Change-Id: If14e22992b46a35a05a16a9db5ecb360ea13df8f Closes-Bug: #1717250 (cherry picked from commit b0f50db80b10e9cd6263c4d6b3ca8dd818b658ba)
2017-09-15One time delete pacemaker resources during upgrade to containersMarius Cornea4-8/+40
This change allows running the major upgrade composable docker steps multiple times by not trying to delete the pacemaker resources if they're not reported as started or in master state. Closes-bug: 1716031 Depends-On: I8da03f5c4a6d442617b81be5793a9724cc8842bf Change-Id: Ifcf9de8c82550a90a9fb118052d43fdbcdc6ca7e (cherry picked from commit 64d7be1e3d4552e06cbc53f788572e530cc5c3bb)
2017-09-14Retry if the pacemaker_resource commands failedMathieu Bultel6-0/+36
Add a retry when the pacemaker_resource command wasn't apply correctly, more info here: https://bugzilla.redhat.com/show_bug.cgi?id=1482116 This is the same approach puppet-pacemaker uses and provides eventual consistency when multiple nodes change the cluster CIB concurrently. This change depends-on : https://review.gerrithub.io/375982 The return code is not available in the current ansible-pacemaker package. Change-Id: I8da03f5c4a6d442617b81be5793a9724cc8842bf (cherry picked from commit e92430d8d03fc2ce2d0ce192b96209f2c5c04169)
2017-09-11Enable redis TLS proxy in HA deploymentsMartin André1-26/+67
Redis does not have TLS out of the box. Let's use a proxy container for TLS termination. This commit enables redis TLS proxy for the HA deployment. bp tls-via-certmonger Change-Id: I45e539872a03878337def33c681c4577c1a5629e (cherry picked from commit c6d8df01d7aa8b44af9ac152b3bb08f07e2e02b7)
2017-09-07Merge "Support HA for OVN DBs containers using pacemaker bundle" into ↵Jenkins1-0/+140
stable/pike
2017-09-06Mount public certificate in haproxy init containerJuan Antonio Osorio Robles1-0/+1
It's being mounted on the actual haproxy container, but not the init one. Change-Id: I66b69e0bb3642dbfeec767ef5216d515786b5b19 Closes-Bug: #1715132 (cherry picked from commit 03622e89ac3037b4d69d913586823e689b210688)
2017-08-31Add --wsrep-provider=none to the mysql_bootstrap containerMichele Baldessari1-2/+2
Depending on the version of mariadb/galera installed the mysql_bootstrap command might fail. With the following unrevealing error: openstack-mariadb-docker:2017-08-28.10 "bash -ec 'if [ -e /v" 3 hours ago Exited (124) 3 hours ago The timeout is actually due to the fact that the following snippets does not complete within 60 seconds: """ if [ -e /var/lib/mysql/mysql ]; then exit 0; fi kolla_start mysqld_safe --skip-networking --wsrep-on=OFF & timeout ${DB_MAX_TIMEOUT} /bin/bash -c ''until mysqladmin -uroot -p"${DB_ROOT_PASSWORD}" ping 2>/dev/null; do sleep 1; done'' mysql -uroot -p"${DB_ROOT_PASSWORD}" -e "CREATE USER ''clustercheck''@''localhost'' IDENTIFIED BY '${DB_CLUSTERCHECK_PASSWORD}'';" mysql -uroot -p"${DB_ROOT_PASSWORD}" -e "GRANT PROCESS ON *.* TO ''clustercheck' """ The problem is that with older mariadb versions: galera-25.3.16-3.el7ost.x86_64 mariadb-5.5.56-2.el7.x86_64 The mysqld_safe process starts in galera mode (as opposed as to single local mode): 170830 17:03:05 [Note] WSREP: Start replication 170830 17:03:05 [Note] WSREP: GMCast version 0 ... 170830 17:03:05 [ERROR] WSREP: wsrep::connect() failed: 7 170830 17:03:05 [ERROR] Aborting That means that even though we specified --wsrep-on=OFF it is still starting in cluster mode. Let's add the extra --wsrep-provider=none which older versions required. Let's also add a '-x' to this transient container as that would have helped a bit because we would have understood right away that it was mysqld_safe that was not starting. I tested this successfully on an environment that showed the problem. The new option is still accepted by newer DB versions in any case. Closes-Bug: #1714057 Change-Id: Icf67fd2fbf520e8a62405b4d49e8d5169ff3925b Co-Authored-By: Mike Bayer <mbayer@redhat.com> (cherry picked from commit c19968ca852ab608513fe692aab958af25276220)
2017-08-31Support HA for OVN DBs containers using pacemaker bundleNuman Siddique1-0/+140
ovn-dbs pacemaker bundle resources are created for supporting Master/Slave HA. puppet-tripleo already supports creating ovn-dbs bundle resources. The heat template added in this patch makes use of this. Closes-bug: #1699085 Change-Id: I23c2d312cfb144f9afc14f0982a92670dc29d74c (cherry picked from commit 444a39f5983e71e3222b6b7f8f523fce60aeece7)
2017-08-30Remove src_ceph from manila kolla_configJan Provaznik1-5/+0
Pacemaker puppet module takes care of mounting /etc/ceph into manila-share container (I23b6890b4cf7f1e6fe84b6be280dde82218275fc). Change-Id: I1026b2436275b17cfe3ac85192d42c5268f0a630 Related-To: I23b6890b4cf7f1e6fe84b6be280dde82218275fc (cherry picked from commit 0d8040ca33d42dbb7e06162f2b659ff6cbc0316f)
2017-08-18Tag the ha containers with 'pcmklatest' at deploy timeMichele Baldessari7-18/+221
We need to tag the HA containers with a special tag so that the RA definition never changes. We do this step in THT as opposed to puppet because we need to guarantee that all images are tagged on all nodes *before* step 2 where the bundle gets created. NB: Getting the image name without the tag will require some more yaql work to get all the cases right. Right now this works only if we enforce that the image has a ':tag' at the end of the name. So far this is always the case. If things change we will need to amend this code. Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com> Co-Authored-By: Sofer Athlan-Guyot <sathlang@redhat.com> Change-Id: I362e6cf26fba77d3f949b7d2fc4b35a3eab9087e
2017-08-18Merge "Containerize Manila Share for HA"Jenkins1-0/+142
2017-08-17Merge "Enable TLS configuration for containerized RabbitMQ"Jenkins1-0/+15
2017-08-17Merge "Enable TLS configuration for containerized HAProxy"Jenkins1-5/+52
2017-08-16Containerize Manila Share for HAVictoria Martinez de la Cruz1-0/+142
This service allows configuring and deploying manila-share containers in a HA overcloud managed by pacemaker. The containers are managed and run by pacemaker. Pacemaker runs the standard Kolla image but overrides the initial command so that it explicitely calls manila-share. This way, we shield ourselves from any unexpected future change in Kolla. This container needs to use the 'docker_config' section to invoke puppet (as opposed to 'docker_puppet_tasks'), because due to the HA composability each resource creation needs to happen on the bootstrap node of that service and 'docker_puppet_tasks' will only run on the controller/primary role. Based on work done in fdb233e64e3d78014dd7e351abfed5aec5035866 Partial-Bug: #1668922 Change-Id: Ifa94c506db5eb667690a19d594115a93d2a790b2 Depends-On: I797eea2f7788f65411964ccb852b5707e916416f
2017-08-15Merge "Do not run clustercheck on the host after O->P upgrade"Jenkins1-0/+6
2017-08-14Merge "Enable TLS configuration for containerized Galera"Jenkins1-0/+35
2017-08-11Enable TLS configuration for containerized GaleraDamien Ciabrini1-0/+35
In non-containerized deployments, Galera can be configured to use TLS for gcomm group communication when enable_internal_tls is set to true. Fix the metadata service definition and update the Kolla configuration to make gcomm use TLS in containers, if configured. bp tls-via-certmonger-containers Change-Id: Ibead27be81910f946d64b8e5421bcc41210d7430 Co-Authored-By: Juan Antonio Osorio Robles <jaosorior@redhat.com> Closes-Bug: #1708135 Depends-On: If845baa7b0a437c28148c817b7f94d540ca15814
2017-08-09Enable TLS configuration for containerized HAProxyDamien Ciabrini1-5/+52
In non-containerized deployments, HAProxy can be configured to use TLS for proxying internal services. Fix the creation of the of the haproxy bundle resource to enable TLS when configured. The keys and certs files are all passed as configuration files and must be copied by Kolla at container startup. For the time being, disable the use of the CRL file until we find a means of restarting the containerized HAProxy service when that file expires. Change-Id: If307e3357dccb7e96bdb80c9c06d66a09b55f3bd Depends-On: I4b72739446c63f0f0ac9f859314a4d6746e20255 Closes-Bug: #1709563
2017-08-09Enable TLS configuration for containerized RabbitMQDamien Ciabrini1-0/+15
In non-containerized deployments, RabbitMQ can be configured to use TLS for serving and mirroring traffic. Fix the creation of the rabbitmq bundle resource to enable TLS when configured. The key and cert are passed as other configuration files and must be copied by Kolla at container startup. Change-Id: I8af63a1cb710e687a593505c0202d717842d5496 Depends-On: Ia64d79462de7012e5bceebf0ffe478a1cccdd6c9 Closes-Bug: #1709558
2017-08-08Merge "MariaDB: create clustercheck user at container bootstrap"Jenkins1-1/+22
2017-07-31MariaDB: create clustercheck user at container bootstrapDamien Ciabrini1-1/+22
In HA overclouds, the helper script clustercheck is called by HAProxy to poll the state of the galera cluster. Make sure that a dedicated clustercheck user is created at deployment, like it is currently done in Ocata. The creation of the clustercheck user happens on all controller nodes, right after the database creation. This way, it does not need to wait for the galera cluster to be up and running. Partial-Bug: #1707683 Change-Id: If8e0b3f9e4f317fde5328e71115aab87a5fa655f
2017-07-27Generate MySQL client config if service requires databaseDamien Ciabrini2-2/+16
Services that access database have to read an extra MySQL configuration file /etc/my.cnf.d/tripleo.cnf which holds client-only settings, like client bind address and SSL configuration. The configuration file is thus used by containerized services, but also by non-containerized services that still run on the host. In order to generate that client configuration file appropriately both on the host and for containers, 1) the MySQLClient service must be included by the role; 2) every containerized service which uses the database must include the mysql::client profile in the docker-puppet config generation step. By including the mysql::client profile in each containerized service, we ensure that any change in configuration file will be reflected in the service's /var/lib/config-data/{service}, and that paunch will restart the service's container automatically. We now only rely on MySQLClient from puppet/services, to make it possible to generate /etc/my.cnf.d/tripleo.cnf on the host, and to set the hiera keys that drive the generation of that config file in containers via docker-puppet. We include a new YAML validation step to ensure that any service which depends on MySQL will initialize the mysql::client profile during the docker-puppet step. Change-Id: I0dab1dc9caef1e749f1c42cfefeba179caebc8d7
2017-07-27Do not run clustercheck on the host after O->P upgradeDamien Ciabrini1-0/+6
Once an Ocata overcloud is upgraded to Pike, clustercheck should only be running in a dedicated container, and xinetd should no longer manage it on the host. Fix the mysql upgrade_task accordingly. Change-Id: I01acacc2ff7bcc867760b298fad6ff11742a2afb Closes-Bug: #1706612
2017-07-26Merge "Open up firewall for the control-ports in the bundles"Jenkins3-1/+26
2017-07-21Open up firewall for the control-ports in the bundlesMichele Baldessari3-1/+26
This is required when the bundles run on pacemaker remote nodes otherwise the cluster won't be able to connect to the control-ports of each bundle. The only services that need this are rabbit, redis and galera because those run pacemaker_remote inside the container (A/P resources and haproxy do not) Change-Id: I6a56d79319ef3d14973a0586dcda4d523adda7aa Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
2017-07-20Remove non-containerized pacemaker resources on upgrademarios6-10/+140
Adds upgrade_tasks to remove the pacemaker resources using the ansible-pacemaker module. Resources are disabled and removed in step2 (called only on bootstrap node) and then the cluster stop is moved to step3 The existing systemd/service call is kept but only to disable services after they are disabled/deleted from the cluster. Related-Bug: 1701485 Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com> Change-Id: Ia597d240ea5834c50a8f6c4fac0b6ed417b8535c
2017-07-15Merge "Use a single configuration file for specifying docker containers."Jenkins7-101/+14
2017-07-14Use a single configuration file for specifying docker containers.Ian Main7-101/+14
This removes the default container names from all the templates and uses a single environment file to specify the full container name and registry from which to pull. Also does away with most of DockerNamespace. Change-Id: Ieaedac33f0a25a352ab432cdb00b5c888be4ba27 Depends-On: Ibc108871ebc2beb1baae437105b2da1d0123ba60 Co-Authored-By: Dan Prince <dprince@redhat.com> Co-Authored-By: Steve Baker <sbaker@redhat.com>
2017-07-14Adds network/cidr mapping into a new service propertyGiulio Fidente7-0/+35
Makes it possible to resolve network subnets within a service template; the data is transported into a new property ServiceData wired into every service which hopefully is generic enough to be extended in the future and transport more data. Data can be consumed in service templates to set config values which need to know what is the subnet where a deamon operates (for example the Ceph Public vs Cluster network). Change-Id: I28e21c46f1ef609517175f7e7ee19e28d1c0cba2
2017-07-12Merge "Bind mount needed cert for haproxy for HA too"Jenkins1-12/+26
2017-07-10Bind mount needed cert for haproxy for HA tooMartin André1-12/+26
haproxy needs the deployed SSL cert file to function when TLS is enabled. It is also required for the docker-puppet haproxy container since the haproxy puppet module uses a validate_cmd to check the generated config file is valid that fails when the required SSL cert is not present. There is no clean way to disable this feature [1] so we need to bind mount the cert into the container. This commit applies the same change that was applied in Id2df144b678769def204961236624091d4e5c457 for the non-ha case. [1] https://github.com/puppetlabs/puppetlabs-haproxy/blob/4753ea5b2506ee093e9b4c8af6e91201d476d426/manifests/config.pp#L53-L57 Change-Id: I93e1ee86197bcf271f18a62a27c2f350ed3966ea Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
2017-07-10Copy only generated puppet files into the containerMartin André5-33/+30
This solves a problem with bind-mounts when the containers are holding files descriptors open. At the same time this makes the template more robust to puppet changes since new config files will be available in the containers without needing to update the templates. Partial-Bug: #1698323 Change-Id: Ia4ad6d77387e3dc354cd131c2f9756939fb8f736
2017-06-28Add heat parameter for all of config_volume imagesMartin André7-12/+45
This commit consistently defines a heat template parameter in the form of DockerXXXConfigImage where XXX represents the name of the config_volume that is used by docker-puppet. The goal is to mitigate hard to debug errors where the templates would set different defaults for the image docker-puppet.py uses to run, for the same config_volume name. This fixes a couple of inconsistencies on the way. Change-Id: I212020a76622a03521385a6cae4ce73e51ce5b6b Closes-Bug: #1699791
2017-06-26Merge "Containerize Cinder-backup for HA"Jenkins1-0/+152
2017-06-15Merge "Containerize Cinder-volume for HA"Jenkins1-0/+170
2017-06-12Generate HAproxy iptables rules for containerized HA deploymentsDamien Ciabrini1-10/+13
The containerized HAproxy service can only specify steps to be run in containers, i.e. it cannot runs the regular puppet steps on bare metal at the same time. A side effect is that the dedicated HAproxy iptables rules are no longer generated. Update the docker_config step to fix the creation of iptables rules for HAproxy and persist them on-disk as before. Co-Authored-By: Michele Baldessari <michele@acksyn.org> Closes-Bug: 1697387 Change-Id: Ib5a083ba3299a82645f1a0f9da0d482c6b89ee23
2017-06-08Containerize Cinder-volume for HADamien Ciabrini1-0/+170
This service allows configuring and deploying cinder-volume containers in a HA overcloud managed by pacemaker. The containers are managed and run by pacemaker. Pacemaker runs the standard Kolla image but overrides the initial command so that it explicitely calls cinder-volume. This way, we shield ourselves from any unexpected future change in Kolla. This container needs to use the 'docker_config' section to invoke puppet (as opposed to 'docker_puppet_tasks'), because due to the HA composability each resource creation needs to happen on the bootstrap node of that service and 'docker_puppet_tasks' will only run on the controller/primary role. Co-Authored-By: Michele Baldessari <michele@acksyn.org> Partial-Bug: #1668920 Depends-On: I95ad4dd89b47396bea672813d87de35e64c04b2d Change-Id: Ib6396219c3d9484c533f6f9995d565091a197bbb
2017-06-08Containerize Cinder-backup for HADamien Ciabrini1-0/+152
This service allows configuring and deploying cinder-backup containers in a HA overcloud managed by pacemaker. The containers are managed and run by pacemaker. Pacemaker runs the standard Kolla image but overrides the initial command so that it explicitely calls cinder-backup. This way, we shield ourselves from any unexpected future change in Kolla. This container needs to use the 'docker_config' section to invoke puppet (as opposed to 'docker_puppet_tasks'), because due to the HA composability each resource creation needs to happen on the bootstrap node of that service and 'docker_puppet_tasks' will only run on the controller/primary role. Co-Authored-By: Michele Baldessari <michele@acksyn.org> Partial-Bug: #1668920 Depends-On: If53495ff75d4832cc6be80dc0dc9bd540ab6583b Change-Id: Ieec823e10667592bd775bb2642f0c3790a83e85f
2017-06-04Merge "Containerize Redis for HA"Jenkins1-0/+140
2017-06-04Containerize Redis for HADamien1-0/+140
This service allows configuring and deploying Redis containers in a HA overcloud managed by pacemaker. The containers are managed and run by pacemaker. Inside there is pacemaker_remote which will invoke the resource agent managing galera. The resources themselves are created via puppet-pacemaker inside a short-lived container used for this purpose (mysql_init_bundle). This container needs to use the 'docker_config' section to invoke puppet (as opposed to 'docker_puppet_tasks'), because due to the HA composability each resource creation needs to happen on the bootstrap node of that service and 'docker_puppet_tasks' will only run on the controller/primary role. Co-Authored-By: Michele Baldessari <michele@acksyn.org> Closes-Bug: #1692924 Depends-On: Ia1131611d15670190b7b6654f72e6290bf7f8b9e Change-Id: Ie045954fcc86ef2b3e4562b6f012853177f03948
2017-06-03Merge "Containerize clustercheck galera monitor for HA deployments"Jenkins1-0/+103
2017-06-03Merge "Containerize HAProxy for HA"Jenkins1-0/+116