path: root/manifests/profile/pacemaker/database
Age  Commit message  Author  Files  Lines
2017-08-06Enable TLS configuration for containerized GaleraDamien Ciabrini1-74/+118
In non-containerized deployments, Galera can be configured to use TLS for gcomm group communication when enable_internal_tls is set to true. Fix the creation of the mysql bundle resource to enable TLS when configured. The key and cert are passed as other configuration files and must be copied by Kolla at container startup. Change-Id: If845baa7b0a437c28148c817b7f94d540ca15814 Partial-Bug: #1708135
2017-07-21Fix up the control-port for rabbitmq bundlesMichele Baldessari2-3/+3
Mistakenly this was set to 3121 which is the same port that pacemaker remote uses. Move this to 3122 which was the plan all along. Also fix a wrong port comment in redis and mysql at the same time. Change-Id: Iccca6a53a769570443091577c7d86f47119d9cbb
2017-07-17Merge "Add option for innodb_flush_log_at_trx_commit = 2 for Galera only"Jenkins1-35/+44
2017-07-12Leverage kolla config_files to copy config into containersMartin André2-30/+15
This solves a problem with bind-mounts when the containers are holding file descriptors open. At the same time this makes the template more robust to puppet changes since new config files will be available in the containers without needing to update the templates. Closes-Bug: #1698323 Change-Id: I857c94ba5f7f064d7c58df621ec5d477654b9166 Depends-On: I78dcec741a941dc21adba33ba33a6dc6ff1d217c
2017-07-06Add option for innodb_flush_log_at_trx_commit = 2 for Galera onlyMike Bayer1-35/+44
The innodb_flush_log_at_trx_commit flag changes the timing of when the log buffer is written to disk for writes. At its default of 1, transactions are written to disk and the buffer flushed on a per-transaction basis; but when set to 2, the flush of the buffer proceeds only once per second. This removes the durability guarantee for the single node. However the central concept of Galera is that durability is achieved via the cluster as a whole, in that transactions are replicated to other nodes before the commit succeeds (though not necessarily written to disk unless wsrep_causal_reads is set). In this model, data would only be lost if all nodes of the Galera cluster were killed within one second of each other. Percona's blog post at https://www.percona.com/blog/2014/11/17/typical-misconceptions-on-galera-for-mysql/ recommends that the value of 2 should be considered "safe" for a Galera cluster unless you are in fact worried that all three nodes will be powered off simultaneously. The value here is added as an option only, defaulting to the usual default of "1", flush per transaction. Change-Id: Id5a30f1daf978e094a74db2d284febbc9ae64bb3
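As a rough illustration of how such an opt-in setting could be modeled, here is a minimal Python sketch. The function name `galera_mysqld_options` and the returned dictionary are hypothetical and are not part of puppet-tripleo; only the option name and its default of 1 come from the commit message above. Restricting the value to 0, 1 or 2 follows the MySQL documentation and is an assumption for other server variants.

```python
def galera_mysqld_options(innodb_flush_log_at_trx_commit=1):
    """Return a hypothetical subset of [mysqld] options for a Galera node.

    1 (default): flush the log buffer to disk on every transaction commit.
    2: flush the log buffer to disk roughly once per second.
    """
    # Assumption: only the values documented by MySQL (0, 1, 2) are accepted.
    if innodb_flush_log_at_trx_commit not in (0, 1, 2):
        raise ValueError("innodb_flush_log_at_trx_commit must be 0, 1 or 2")
    return {"innodb_flush_log_at_trx_commit": innodb_flush_log_at_trx_commit}


# The option is opt-in: callers that pass nothing keep the usual default.
default_options = galera_mysqld_options()
relaxed_options = galera_mysqld_options(innodb_flush_log_at_trx_commit=2)
```

Keeping 1 as the default means existing deployments are unaffected unless an operator explicitly chooses the relaxed setting.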
2017-06-21Enable TLS for MySQL's replication trafficJuan Antonio Osorio Robles1-6/+43
This enables the options so Galera can use TLS for the replication traffic. bp tls-via-certmonger Depends-On: I9252303b92a2805ba83f86a85770db2551a014d3 Change-Id: I2ee3bf4bbda3f65f5b03440ecbc75f14225a2428
2017-06-14Ensure hiera step value is an integerSteve Baker4-4/+4
The step is typically set with the hieradata setting an integer value:
  {"step": 1}
However it would be useful for the value to be a string so that substitutions are possible, for example:
  {"step": "%{::step}"}
This change ensures the step parameter defaults to an integer by calling Integer(hiera('step')).
This change was made by manually removing the undef defaults from fluentd.pp, uchiwa.pp, and sensu.pp then bulk updating with:
  find ./ -type f -print0 |xargs -0 sed -i "s/= hiera('step')/= Integer(hiera('step'))/"
Change-Id: I8a47ca53a7dea8391103abcb8960a97036a6f5b3
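To make the effect of the bulk sed substitution concrete, the following Python sketch applies the same rewrite to a single line. The helper name `wrap_step_lookup` and the sample lines are made up for illustration; only the search and replacement strings come from the commit message.

```python
import re

# Same literal match as the sed expression: = hiera('step')
STEP_LOOKUP = re.compile(r"= hiera\('step'\)")


def wrap_step_lookup(line):
    """Wrap the first `= hiera('step')` on a line in Integer(...)."""
    # count=1 mirrors sed's behaviour without the `g` flag.
    return STEP_LOOKUP.sub("= Integer(hiera('step'))", line, count=1)


before = "  $step = hiera('step'),"
after = wrap_step_lookup(before)
# A line that was already converted no longer matches, so it is left alone.
after_twice = wrap_step_lookup(after)
```

Because the converted text no longer contains `= hiera('step')`, running the substitution a second time does not wrap the lookup twice.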
2017-06-13Merge "Make sure the resource bundles use a location_rule"Jenkins2-0/+10
2017-06-13Configure Galera cluster with FQDNs instead of shortnamesJuan Antonio Osorio Robles1-11/+13
This makes use of the cluster_host_map, which allows giving aliases to the pacemaker nodes (which are FQDNs), and allows us to configure the cluster using FQDNs. We need FQDNs in order to request certificates, since the default CA (FreeIPA) only allows certificates for FQDNs. Change-Id: I2f146afdd32aef2d11cf25a65fa8d67428f621f5
2017-06-11Merge "Install rsync package for galera"Jenkins1-0/+9
2017-06-09Make sure the resource bundles use a location_ruleMichele Baldessari2-0/+10
In composable HA we bind resources to nodes that have special node properties. We need to do this also for bundle resources otherwise there is a potential race where the bundle might be started on nodes where it is not supposed to during a small window of time. Tested with the depends-on and correctly obtained a containerized composable HA deployment:
  Docker container set: rabbitmq-bundle [192.168.24.1:8787/tripleoupstream/centos-binary-rabbitmq:latest]
    rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-rabbit-0
    rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-rabbit-1
    rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-rabbit-2
  Docker container set: galera-bundle [192.168.24.1:8787/tripleoupstream/centos-binary-mariadb:latest]
    galera-bundle-0 (ocf::heartbeat:galera): Master overcloud-galera-0
    galera-bundle-1 (ocf::heartbeat:galera): Master overcloud-galera-1
    galera-bundle-2 (ocf::heartbeat:galera): Master overcloud-galera-2
  Docker container set: redis-bundle [192.168.24.1:8787/tripleoupstream/centos-binary-redis:latest]
    redis-bundle-0 (ocf::heartbeat:redis): Master overcloud-controller-0
    redis-bundle-1 (ocf::heartbeat:redis): Slave overcloud-controller-1
    redis-bundle-2 (ocf::heartbeat:redis): Slave overcloud-controller-2
  ip-192.168.24.11 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
  ip-10.0.0.7 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
  ip-172.16.2.11 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2
  ip-172.16.2.9 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
  ip-172.16.1.6 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
  ip-172.16.3.7 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2
  Docker container set: haproxy-bundle [192.168.24.1:8787/tripleoupstream/centos-binary-haproxy:latest]
    haproxy-bundle-docker-0 (ocf::heartbeat:docker): Started overcloud-controller-0
    haproxy-bundle-docker-1 (ocf::heartbeat:docker): Started overcloud-controller-1
    haproxy-bundle-docker-2 (ocf::heartbeat:docker): Started overcloud-controller-2
Depends-On: I44449861cbfe56304b8829c9ca10fd648353b3ae Change-Id: I48fb490040497ba08cae19937159c0efdf99e3f8
2017-06-03Merge "Puppet module to deploy MySQL bundle for HA"Jenkins1-0/+302
2017-06-02Puppet module to deploy MySQL bundle for HADamien Ciabrini1-0/+302
This module is used by tripleo-heat-templates to configure and deploy Kolla-based mysql containers managed by pacemaker. We use short-lived containers that call pcs via puppet to create the needed pacemaker resources, properties and constraints. Co-Authored-By: Michele Baldesari <michele@acksyn.org> Partial-Bug: #1692842 Depends-On: I44fbd7f89ab22b72e8d3fc0a0e3fe54a9418a60f Depends-On: Ie9b7e7d2a3cec4b121915a17c1e809e4ec950e7f Change-Id: I3b4d8ad2eec70080419882d5d822f78ebd3721ae
2017-06-01Install rsync package for galeraJames Slagle1-0/+9
Since galera is configured to use rsync, we ought to make sure the package is installed. Particularly when using deployed-server, the package is not always installed by default depending on what was used to install the servers. Change-Id: I92ee78f2dd2c0f7fd4d393b104166407d7c654e2 Closes-Bug: #1693003
2017-05-25Puppet module to deploy Redis bundle for HADamien1-0/+178
This module is used by tripleo-heat-templates to configure and deploy Kolla-based Redis containers managed by pacemaker. We use short-lived containers that call pcs via puppet to create the needed pacemaker resources, properties and constraints. Co-Authored-By: Michele Baldesari <michele@acksyn.org> Partial-Bug: #1692924 Depends-On: I44fbd7f89ab22b72e8d3fc0a0e3fe54a9418a60f Depends-On: Ie9b7e7d2a3cec4b121915a17c1e809e4ec950e7f Change-Id: Ia1131611d15670190b7b6654f72e6290bf7f8b9e
2017-05-05Remove limits for redis in /etc/security/limits.dMichele Baldessari1-15/+16
Now that puppet-redis supports ulimit for cluster managed redis (via https://github.com/arioch/puppet-redis/pull/192), we need to remove the file snippet as otherwise we will get a duplicate resource error. We will need to create a THT change that at the very least sets the redis::managed_by_cluster_manager key to true so that /etc/security/limits.d/redis.conf gets created. We also add code to not break backwards compatibility with the old hiera key. Change-Id: I4ffccfe3e3ba862d445476c14c8f2cb267fa108d Partial-Bug: #1688464
2017-04-06Make galera-ready exec refreshonlyAlex Schultz1-2/+3
Previously we always ran the galera-ready exec at every step. This change switches it to be refreshonly so we only wait when the service is set up or restarted. Change-Id: I5ff9d49c2590751913b96777bcd72c8a15627a01 Closes-Bug: #1680586
2017-02-17Merge "xinetd: bind only on mysql network"Jenkins1-0/+1
2017-02-16Merge "Fix a typo in mysql.pp"Jenkins1-1/+1
2017-02-03Revert "Revert "set innodb_file_per_table to ON for MySQL / Galera""Alex Schultz1-0/+1
This reverts commit 3f7e74ab24bb43f9ad7e24e0efd4206ac6a3dd4e. After identifying how to workaround the performance issues on the undercloud, let's put this back in. Enabling innodb_file_per_table is important for operators to be able to better manage their databases. Change-Id: I435de381a0f0e3ef221e498f442335cdce3fb818 Depends-On: I77507c638237072e38d9888aff3da884aeff0b59 Closes-Bug: #1660722
2017-02-02Revert "set innodb_file_per_table to ON for MySQL / Galera"Alex Schultz1-1/+0
This reverts commit 621ea892a299d2029348db2b56fea1338bd41c48. We're getting performance problems on SATA disks. Change-Id: I30312fd5ca3405694d57e6a4ff98b490de388b92 Closes-Bug: #1661396 Related-Bug: #1660722
2017-02-01set innodb_file_per_table to ON for MySQL / GaleraMike Bayer1-0/+1
InnoDB uses a single file by default which can grow to be tens/hundreds of gigabytes, and is not shrinkable even if data is deleted from the database. Best practices are that innodb_file_per_table is set to ON which instead stores each database table in its own file, each of which is also shrinkable by the InnoDB engine. Closes-Bug: #1660722 Change-Id: I59ee53f6462a2eeddad72b1d75c77a69322d5de4
2017-01-25Composable HAMichele Baldessari2-9/+47
This commit implements composable HA for the pacemaker profiles.
- Every time a pacemaker resource gets included on a node, that node will add a node cluster property with the name of the resource (e.g. galera-role=true)
- Add a location rule constraint to force running the resource only on the nodes that have that property
- We also make sure that any pacemaker resource/property creation has a predefined number of tries (20 by default). The reason for this is that within composable HA, it might be possible to get "older CIB" errors when another node changed the CIB while we were doing an operation on it. Simply retrying fixes this.
- Also make sure that we use the newly introduced pacemaker::constraint::order class instead of the older pacemaker::constraint::base class. The former uses the push_cib() function and hence behaves correctly in case multiple nodes try to modify the CIB at the same time.
Change-Id: I63da4f48da14534fd76265764569e76300534472 Depends-On: Ib931adaff43dbc16220a90fb509845178d696402 Depends-On: I8d78cc1b14f0e18e034b979a826bf3cdb0878bae Depends-On: Iba1017c33b1cd4d56a3ee8824d851b38cfdbc2d3
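The retry behaviour described above (a bounded number of tries, 20 by default, to ride out "older CIB" errors) can be sketched in Python as follows. This is only an illustration of the idea: `OlderCibError`, `with_retries`, `try_sleep` and `flaky_create` are hypothetical names, and the actual puppet-pacemaker implementation is not shown in this log.

```python
import time


class OlderCibError(Exception):
    """Hypothetical stand-in for an 'older CIB' failure."""


def with_retries(operation, tries=20, try_sleep=0.0):
    """Call operation() until it succeeds or `tries` attempts are used up."""
    last_error = None
    for attempt in range(1, tries + 1):
        try:
            return operation()
        except OlderCibError as err:
            last_error = err
            if attempt < tries:
                time.sleep(try_sleep)  # give the other node time to finish
    raise last_error


# Simulated resource creation that collides with another node twice.
calls = {"n": 0}


def flaky_create():
    calls["n"] += 1
    if calls["n"] < 3:
        raise OlderCibError("CIB was changed by another node")
    return "created"


result = with_retries(flaky_create)
```

Only the specific conflict error is retried here, so unrelated failures still surface immediately instead of being repeated 20 times.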
2017-01-18Do not depend on bootstrap_nodeid for any pacemaker profileMichele Baldessari2-3/+9
When we create a pacemaker resource it must happen from a single node. If it happens from multiple nodes an immediate error will be returned by pcs. For the pacemaker roles we enforce this by leveraging the recently introduced <SERVICE_NAME_bootstrap_short_node_name> which gives us the first hostname per-service, regardless of the role. (introduced via I03e8685f939e8ae1fcd8b16883b559615042505d) With this approach if a pacemaker service belongs to two different roles (say role Controller on node A and role galera on node B), it will only create the resource from one of the two and not both (which would return an error). Only setting Partial-Bug for this one, because it addresses the issue from the pacemaker resource creation POV (which is always affected). But the issue itself is a race that we're theoretically affected by since the composable roles work landed. While I have tried to fix the more general case in previous attempts, I think it is best if we start a discussion on how to fix it, because each approach has a bunch of potential drawbacks and is quite invasive on how we do things. A discussion slot for this has been proposed for the Atlanta PTG. Change-Id: I662398cab60d523d204b57a5674ca8f5c0f2e68a Partial-Bug: #1615983
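A minimal Python sketch of the single-creator rule described above: every node evaluates the same check, and only the node whose short hostname equals the per-service bootstrap name goes on to create the resource. The function name, the case-insensitive comparison and the sample hostnames are assumptions made for illustration.

```python
def is_service_bootstrap_node(local_hostname, bootstrap_short_node_name):
    """Return True only on the node that should create the resource."""
    # Compare short names, so an FQDN such as node-a.example.com still matches.
    local_short = local_hostname.split(".")[0].lower()
    return local_short == bootstrap_short_node_name.lower()


# Hypothetical deployment: the service runs on two nodes from two roles.
nodes = ["overcloud-controller-0.localdomain", "overcloud-galera-0.localdomain"]
creators = [n for n in nodes if is_service_bootstrap_node(n, "overcloud-controller-0")]
```

Since the bootstrap name is computed per service rather than per role, exactly one of the two nodes ends up in `creators`.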
2017-01-05Fix a typo in mysql.ppCao Xuan Hoang1-1/+1
Removed redundant 'the' Change-Id: Ie2051f35ec1e7010423c46084f5512c02af85f33
2016-12-10xinetd: bind only on mysql networkDimitri Savineau1-0/+1
By default galera-monitor xinetd is binding on all the interfaces. That means that the port 9200 is exposed on the external network. Because haproxy is using the same network for the backend and the check we can reuse it for the xinetd binding. Change-Id: If1a50515593e81f46d67309bdeecbe84c1d0ebe4
2016-12-04Merge "Remove unused pacemaker profiles"Jenkins1-73/+0
2016-11-29pacemaker: create Mysql_user once Galera is ready (puppet4)Emilien Macchi1-1/+2
Puppet 4 makes ordering stricter in the catalog, which is good. Resources have to be orchestrated or Puppet will take them in the order they are found in the catalog. This patch makes sure we create MySQL users only when Galera is actually ready. Closes-Bug: #1645787 Change-Id: I536a1a128c3a7eca49bcc4f34a1307bcd60b029e
2016-10-25Remove unused pacemaker profilesMichele Baldessari1-73/+0
With the landing of HA NG in Newton we can actually remove the pacemaker profiles we do not need. The only ones that are being used in one form or the other are:
  $ grep -ir services\/pacemaker environments | awk '{ print $3 }' | sort | uniq
  ../puppet/services/pacemaker/cinder-backup.yaml
  ../puppet/services/pacemaker/cinder-volume.yaml
  ../puppet/services/pacemaker/database/mysql.yaml
  ../puppet/services/pacemaker/database/redis.yaml
  ../puppet/services/pacemaker/haproxy.yaml
  ../puppet/services/pacemaker/manila-share.yaml
  ../puppet/services/pacemaker/rabbitmq.yaml
  ../puppet/services/pacemaker.yaml
The only exception is profile/pacemaker/database/mongodbvalidator because it is included by profile/base/database/mongodb.pp
Change-Id: I80c8559bb2d915385bcc20ae71fe144ddd6591c1
2016-10-25Set redis file descriptor limit when run via pacemakerMichele Baldessari1-0/+17
The current redis file descriptor limit is 4096 because of two reasons:
- It is run via the redis user
- It is not started via systemd which has explicit LimitNOFILE set to 10240 (which matches the default configuration of maximum 10000 clients)
Create an /etc/security/limits.d/redis.conf file in order to increase the fd limit value. With this change we correctly get the following limits:
  [root@overcloud-controller-0 ~]# pcs status |grep -A2 redis
  Master/Slave Set: redis-master [redis]
    Masters: [ overcloud-controller-2 ]
    Slaves: [ overcloud-controller-0 overcloud-controller-1 ]
  [root@overcloud-controller-0 ~]# cat /proc/`pgrep redis`/limits | grep open
  Max open files 10240 10240 files
Previously this limit was set to 4096.
Change-Id: I7691581bad92ad9442cecd82cf44f5ac78ed169f Closes-Bug: #1635334
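For checking the same limit programmatically, the sketch below parses the "Max open files" row of a `/proc/<pid>/limits` style text. The helper name `max_open_files` is hypothetical, the sample text is a shortened stand-in for real output, and the parser assumes numeric limits (a value of "unlimited" would need extra handling).

```python
def max_open_files(limits_text):
    """Return (soft, hard) open-file limits from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files", soft limit, hard limit, units.
            fields = line.split()
            return int(fields[3]), int(fields[4])
    raise ValueError("no 'Max open files' row found")


# Shortened sample resembling the output quoted in the commit message.
sample = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max open files            10240                10240                files\n"
)
soft, hard = max_open_files(sample)
```

On a live system the text would come from reading `/proc/<pid>/limits` for the redis process, which is left out here to keep the sketch self-contained.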
2016-10-20Merge "pacemaker/mysql: wait step 2 to remove default accounts"Jenkins1-1/+11
2016-10-13pacemaker/mysql: wait step 2 to remove default accountsEmilien Macchi1-1/+11
remove_default_accounts is a mysql::server parameter that, when set to True, executes some MySQL commands to clean up the default MySQL accounts created by packaging. In order to successfully run the commands, we need MySQL up and running, which is not the case at step 1 but is at step 2. This patch makes sure we run the commands at step 2 on pacemaker master only. No change for scenarios without Pacemaker. Change-Id: Ifad3cb40fd958d7ea606b9cd2ba4c8ec22a8e94e Closes-Bug: #1633113
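The gating described above can be pictured as a small predicate. This Python sketch is an assumption-laden illustration: the function name is hypothetical, and treating "step 2" as "step 2 or later" is a guess, since the commit message only says the commands run at step 2 on the pacemaker master.

```python
def should_remove_default_accounts(step, pacemaker_master):
    """Decide whether the default-account cleanup may run on this node."""
    # Assumption: MySQL is up from step 2 onward, so later steps also qualify.
    return step >= 2 and pacemaker_master


decisions = {
    (step, master): should_remove_default_accounts(step, master)
    for step in (1, 2)
    for master in (True, False)
}
```

At step 1 the cleanup is skipped everywhere because MySQL is not yet running, and from step 2 it runs on the master only.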
2016-10-12pacemaker: increase timeouts for rabbitmq and redisEmilien Macchi1-0/+1
When we observe the 'stop timeout' values of pacemaker resources: rabbitmq and redis, they are set to 90s. But for all other services, it is set to 200s. The overcloud deployment sometimes fails due to this with the error: Error: Could not complete shutdown of rabbitmq-clone, 1 resources remaining Error performing operation: Timer expired This patch updates the timeout for Redis and RabbitMQ to avoid this error. Change-Id: I8a3b3951a896ee3e8e5e09778e8ea4717e76a1b4
2016-10-05Enable usage of "short names" for Galera clusterJuan Antonio Osorio Robles1-1/+6
We're not able to use FQDNs yet, so to work around this, we give precedence to a "short name" list we'll get from t-h-t. Change-Id: I4ef7786474c229d5212a0deb2ca02ee992b030d8 Related-Bug: #1628521
2016-09-28Merge "Move db syncs into mysql base role"Jenkins1-0/+5
2016-09-27Move db syncs into mysql base roleDan Prince1-0/+5
This patch moves the various DB syncs into the MySQL role. Database creation needs to occur on the MySQL server to avoid permission issues. This patch also moves database creation to step 2 so we can guarantee that all per-service databases exist at this time. This avoids complex ordering needed during step 3 where services, on different hosts, can run their own db sync's in a distributed fashion. Change-Id: I05cc0afa9373429a3197c194c3e8f784ae96de5f Partial-bug: #1620595
2016-09-26Add pameter for gmcast.listen_addr configurationJuan Antonio Osorio Robles1-4/+9
having an actual name for that configuration will allow us to pass a more proper name via t-h-t. Change-Id: Iea4bd67074824e5dc6732fd7e408743e693d80b3
2016-09-24Make mysql bind-address configurableJuan Antonio Osorio Robles1-2/+7
It used to be hardcoded that the bind-address was always coming from the $::hostname fact. This is wrong, as it disregards where we have configured the mysql address. This commit actually makes it configurable, so we'll be able to set it via hieradata. On the other hand, we use the hiera key that we already set 'mysql_bind_host' as a default; if, for some reason, that's unavailable then we fall back to $::hostname. Related-Bug: #1627060 Change-Id: I316acfd514aac63b84890e20283c4ca611ccde8b
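The lookup order described above (explicit parameter, then the existing `mysql_bind_host` hiera key, then the hostname) can be sketched in Python like this. The function name `resolve_bind_address`, the dictionary standing in for hieradata, and the sample values are hypothetical.

```python
def resolve_bind_address(bind_address=None, hieradata=None, hostname="localhost"):
    """Pick the MySQL bind-address using the fallback chain from the commit."""
    if bind_address:
        # 1. An explicitly configured value always wins.
        return bind_address
    # 2. Otherwise use the hiera key that is already set, if present.
    # 3. Fall back to the node's hostname as a last resort.
    return (hieradata or {}).get("mysql_bind_host") or hostname


explicit = resolve_bind_address("172.16.2.9", {"mysql_bind_host": "db-internal"}, "ctrl-0")
from_hiera = resolve_bind_address(None, {"mysql_bind_host": "db-internal"}, "ctrl-0")
fallback = resolve_bind_address(None, {}, "ctrl-0")
```

The hostname fallback keeps the old behaviour available for setups where the hiera key is missing.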
2016-09-16mysql: never add brackets to mysql_bind_hostEmilien Macchi1-1/+1
Don't add brackets around the mysql_bind_host parameter in the Galera config. Brackets around this parameter work with older versions of Galera but not with the newest one, so remove them entirely so that we can safely upgrade Galera in RDO. Change-Id: Ic904d4efda162f18ec8dffb91c2f383f54361f41 Closes-Bug: #1622755
2016-09-01Merge "Write restart flags to restart services only when necessary"Jenkins2-0/+14
2016-08-30Merge "Handle galera_node_names being an array"Jenkins1-1/+10
2016-08-30Write restart flags to restart services only when necessaryJiri Stransky2-0/+14
Write restart flag file for services managed by Pacemaker into /var/lib/tripleo/pacemaker-restarts directory. The name of the file must match the name of the clone resource defined in pacemaker. The post-puppet restart script will restart each service having a restart flag file and remove those files. This approach focuses on $pacemaker_master only (we don't want to restart the pacemaker services 3 times when we have 3 controllers), so it relies on the assumption that we're making the matching config changes across the pacemaker nodes. Change-Id: I6369ab0c82dbf3c8f21043f8aa9ab810744ddc12
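The flag-file handshake described above can be sketched in Python: one side drops an empty file named after the clone resource, and the post-puppet side restarts every flagged service and removes the flag. The function names are hypothetical, a temporary directory stands in for /var/lib/tripleo/pacemaker-restarts, and the actual restart command is replaced by a callback because it is not given in this log.

```python
import tempfile
from pathlib import Path


def write_restart_flag(flag_dir, clone_resource):
    """Create an empty flag file named after the pacemaker clone resource."""
    flag_dir = Path(flag_dir)
    flag_dir.mkdir(parents=True, exist_ok=True)
    flag = flag_dir / clone_resource
    flag.touch()
    return flag


def consume_restart_flags(flag_dir, restart):
    """Restart each flagged service once, then remove its flag file."""
    restarted = []
    for flag in sorted(Path(flag_dir).iterdir()):
        restart(flag.name)  # hypothetical hook; the real script is not shown
        flag.unlink()
        restarted.append(flag.name)
    return restarted


with tempfile.TemporaryDirectory() as tmp:
    flag_dir = Path(tmp) / "pacemaker-restarts"
    write_restart_flag(flag_dir, "redis-master")
    write_restart_flag(flag_dir, "rabbitmq-clone")
    restarted = consume_restart_flags(flag_dir, restart=lambda name: None)
    remaining = list(flag_dir.iterdir())
```

Removing each flag after the restart means a later run with no config changes finds an empty directory and restarts nothing.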
2016-08-29Merge "Removing WARNING: line has more than 140 characters in puppet-tripleo profiles"Jenkins1-1/+5
2016-08-29Handle galera_node_names being an arrayJiri Stransky1-1/+10
Prepare the pacemaker mysql manifest for galera_node_names being an array. Stay backwards compatible by handling a comma-delimited string too, which avoids a chicken-and-egg patch problem between t-h-t and puppet-tripleo. Change-Id: Ia0d9d59728c8771974bfbc486f4929b99a38e4fb Partially-Implements: blueprint custom-roles
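The backwards-compatible handling described above amounts to normalizing two input shapes into one. The Python sketch below shows the idea; the function name `normalize_node_names` and the sample node names are made up, and the manifest itself does this in Puppet rather than Python.

```python
def normalize_node_names(galera_node_names):
    """Accept either a list of names or a comma-delimited string."""
    if isinstance(galera_node_names, str):
        # Old format: "node-0,node-1,node-2"
        return [name.strip() for name in galera_node_names.split(",") if name.strip()]
    # New format: already an array of names.
    return list(galera_node_names)


from_string = normalize_node_names("overcloud-controller-0,overcloud-controller-1")
from_array = normalize_node_names(["overcloud-controller-0", "overcloud-controller-1"])
```

Accepting both shapes lets either repository merge its half of the change first without breaking the other.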
2016-08-17Configure galera-monitor on all controller nodesMichele Baldessari1-1/+1
When we implemented the galera composable role we accidentally moved the xinetd.d monitor service on the bootstrap node only. This meant that haproxy believed that galera was down on the non-bootstrap nodes. A shutdown of the bootstrap node meant that galera was effectively down because haproxy would refuse to redirect the traffic to the non-bootstrap node. Fix this by creating the /etc/xinetd.d/galera-monitor on all controller nodes. Change-Id: Ib5a06b3abbc32182476c2b0c81eb77a12821ad6b
2016-08-11Removing WARNING: line has more than 140 characters in puppet-tripleo profilesCarlos Camacho1-1/+5
Some lint checks are returning: WARNING: line has more than 140 characters in puppet-tripleo profiles This patch will remove those warnings by adding \'s Change-Id: I19b56c93db82948fb0498a4c9851b522c81946f8
2016-08-08Fix parameters and headers inconsistency in the puppet manifests.Carlos Camacho3-5/+2
As we are starting to manually check overcloud services the first step is to check that the puppet profiles are all aligned. Changes applied:
- No logic added or removed in this submission.
- Removed unused parameters.
- Aligned header comment structure.
- All profile parameters sorted following: "Mandatory params first sorted alphabetically then optional params sorted alphabetically."
Note: Following submissions will check pacemaker, cinder, mistral and redis services in the base profiles as some of them have the $pacemaker_master parameter defaulted to true. Change-Id: I2f91c3f6baa33f74b5625789eec83233179a9655
2016-07-27Remove global openstack-core resourceGiulio Fidente1-11/+0
The openstack-core resource is not needed by the NG Pacemaker architecture. It was moved into an isolated role by [1] so that it could optionally be enabled when wanting the older architecture. This submission removes the old openstack-core global resource. 1. I74a62973146c0261385ecf5fd3d06db51e079caa Change-Id: I16a786ce167c57848551c7245f4344c382c55b3d
2016-07-22Merge "use parameter to lookup the step instead of hiera again"Jenkins1-3/+3
2016-07-22use parameter to lookup the step instead of hiera againEmilien Macchi1-3/+3
In some profiles, we were looking up the $step by using Hiera again, while we already do it in the parameter definition. When using this class outside THT, that lookup fails; with this patch, we can just set the $step parameter and the rest of the manifest will work. Change-Id: I7082f47204fb4e529b164e4c4f1032e7bdd88f02