Age | Commit message (Collapse) | Author | Files | Lines |
|
During the bootstrap of the mariadb database, galera replication
must be disabled while the users credentials are being set up. This
is done by setting wsrep-provider=none when starting mysqld_safe.
Icf67fd2fbf520e8a62405b4d49e8d5169ff3925b already disabled it
when the clustercheck credentials are being set up, but Kolla also
start a temporary server for setting up the root password.
Disable the setting directly at the end of the mysql.cnf in the
running container. That way, the default setting from galera.cnf will
be overriden, all mysqld_safe calls will disable WSREP and the setting
will stay ephemeral.
Change-Id: If14e22992b46a35a05a16a9db5ecb360ea13df8f
Closes-Bug: #1717250
(cherry picked from commit b0f50db80b10e9cd6263c4d6b3ca8dd818b658ba)
|
|
This change allows running the major upgrade composable docker
steps multiple times by not trying to delete the pacemaker resources
if they're not reported as started or in master state.
Closes-bug: 1716031
Depends-On: I8da03f5c4a6d442617b81be5793a9724cc8842bf
Change-Id: Ifcf9de8c82550a90a9fb118052d43fdbcdc6ca7e
(cherry picked from commit 64d7be1e3d4552e06cbc53f788572e530cc5c3bb)
|
|
Add a retry when the pacemaker_resource command
wasn't apply correctly, more info here:
https://bugzilla.redhat.com/show_bug.cgi?id=1482116
This is the same approach puppet-pacemaker uses
and provides eventual consistency when multiple
nodes change the cluster CIB concurrently.
This change depends-on :
https://review.gerrithub.io/375982
The return code is not available in the current
ansible-pacemaker package.
Change-Id: I8da03f5c4a6d442617b81be5793a9724cc8842bf
(cherry picked from commit e92430d8d03fc2ce2d0ce192b96209f2c5c04169)
|
|
Depending on the version of mariadb/galera installed the mysql_bootstrap
command might fail. With the following unrevealing error:
openstack-mariadb-docker:2017-08-28.10 "bash -ec 'if [ -e /v" 3 hours ago Exited (124) 3 hours ago
The timeout is actually due to the fact that the following snippets does
not complete within 60 seconds:
"""
if [ -e /var/lib/mysql/mysql ]; then exit 0; fi
kolla_start
mysqld_safe --skip-networking --wsrep-on=OFF &
timeout ${DB_MAX_TIMEOUT} /bin/bash -c ''until mysqladmin -uroot -p"${DB_ROOT_PASSWORD}" ping 2>/dev/null; do sleep 1; done''
mysql -uroot -p"${DB_ROOT_PASSWORD}" -e "CREATE USER ''clustercheck''@''localhost'' IDENTIFIED BY '${DB_CLUSTERCHECK_PASSWORD}'';"
mysql -uroot -p"${DB_ROOT_PASSWORD}" -e "GRANT PROCESS ON *.* TO ''clustercheck'
"""
The problem is that with older mariadb versions:
galera-25.3.16-3.el7ost.x86_64
mariadb-5.5.56-2.el7.x86_64
The mysqld_safe process starts in galera mode (as opposed as to single
local mode):
170830 17:03:05 [Note] WSREP: Start replication
170830 17:03:05 [Note] WSREP: GMCast version 0
...
170830 17:03:05 [ERROR] WSREP: wsrep::connect() failed: 7
170830 17:03:05 [ERROR] Aborting
That means that even though we specified --wsrep-on=OFF it is still
starting in cluster mode. Let's add the extra --wsrep-provider=none
which older versions required.
Let's also add a '-x' to this transient container as that
would have helped a bit because we would have understood right away
that it was mysqld_safe that was not starting. I tested this
successfully on an environment that showed the problem. The new
option is still accepted by newer DB versions in any case.
Closes-Bug: #1714057
Change-Id: Icf67fd2fbf520e8a62405b4d49e8d5169ff3925b
Co-Authored-By: Mike Bayer <mbayer@redhat.com>
(cherry picked from commit c19968ca852ab608513fe692aab958af25276220)
|
|
We need to tag the HA containers with a special tag so
that the RA definition never changes. We do this step in THT
as opposed to puppet because we need to guarantee
that all images are tagged on all nodes *before* step 2 where the bundle
gets created.
NB: Getting the image name without the tag will require some more
yaql work to get all the cases right. Right now this works only
if we enforce that the image has a ':tag' at the end of the name.
So far this is always the case. If things change we will need to
amend this code.
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Co-Authored-By: Sofer Athlan-Guyot <sathlang@redhat.com>
Change-Id: I362e6cf26fba77d3f949b7d2fc4b35a3eab9087e
|
|
|
|
|
|
In non-containerized deployments, Galera can be configured to use TLS
for gcomm group communication when enable_internal_tls is set to true.
Fix the metadata service definition and update the Kolla configuration
to make gcomm use TLS in containers, if configured.
bp tls-via-certmonger-containers
Change-Id: Ibead27be81910f946d64b8e5421bcc41210d7430
Co-Authored-By: Juan Antonio Osorio Robles <jaosorior@redhat.com>
Closes-Bug: #1708135
Depends-On: If845baa7b0a437c28148c817b7f94d540ca15814
|
|
In HA overclouds, the helper script clustercheck is called by HAProxy to poll
the state of the galera cluster. Make sure that a dedicated clustercheck user
is created at deployment, like it is currently done in Ocata.
The creation of the clustercheck user happens on all controller nodes, right
after the database creation. This way, it does not need to wait for the galera
cluster to be up and running.
Partial-Bug: #1707683
Change-Id: If8e0b3f9e4f317fde5328e71115aab87a5fa655f
|
|
Once an Ocata overcloud is upgraded to Pike, clustercheck should only be
running in a dedicated container, and xinetd should no longer manage it on
the host. Fix the mysql upgrade_task accordingly.
Change-Id: I01acacc2ff7bcc867760b298fad6ff11742a2afb
Closes-Bug: #1706612
|
|
|
|
This is required when the bundles run on pacemaker remote nodes
otherwise the cluster won't be able to connect to the control-ports
of each bundle. The only services that need this are rabbit, redis and
galera because those run pacemaker_remote inside the container
(A/P resources and haproxy do not)
Change-Id: I6a56d79319ef3d14973a0586dcda4d523adda7aa
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
|
|
Adds upgrade_tasks to remove the pacemaker resources using the
ansible-pacemaker module.
Resources are disabled and removed in step2 (called only on
bootstrap node) and then the cluster stop is moved to step3
The existing systemd/service call is kept but only to disable
services after they are disabled/deleted from the cluster.
Related-Bug: 1701485
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Change-Id: Ia597d240ea5834c50a8f6c4fac0b6ed417b8535c
|
|
|
|
This removes the default container names from all the templates
and uses a single environment file to specify the full container
name and registry from which to pull. Also does away with most
of DockerNamespace.
Change-Id: Ieaedac33f0a25a352ab432cdb00b5c888be4ba27
Depends-On: Ibc108871ebc2beb1baae437105b2da1d0123ba60
Co-Authored-By: Dan Prince <dprince@redhat.com>
Co-Authored-By: Steve Baker <sbaker@redhat.com>
|
|
Makes it possible to resolve network subnets within a service
template; the data is transported into a new property ServiceData
wired into every service which hopefully is generic enough to
be extended in the future and transport more data.
Data can be consumed in service templates to set config values
which need to know what is the subnet where a deamon operates (for
example the Ceph Public vs Cluster network).
Change-Id: I28e21c46f1ef609517175f7e7ee19e28d1c0cba2
|
|
This solves a problem with bind-mounts when the containers are holding
files descriptors open.
At the same time this makes the template more robust to puppet changes
since new config files will be available in the containers without
needing to update the templates.
Partial-Bug: #1698323
Change-Id: Ia4ad6d77387e3dc354cd131c2f9756939fb8f736
|
|
This commit consistently defines a heat template parameter in the form
of DockerXXXConfigImage where XXX represents the name of the
config_volume that is used by docker-puppet.
The goal is to mitigate hard to debug errors where the templates would
set different defaults for the image docker-puppet.py uses to run, for
the same config_volume name.
This fixes a couple of inconsistencies on the way.
Change-Id: I212020a76622a03521385a6cae4ce73e51ce5b6b
Closes-Bug: #1699791
|
|
This service allows configuring and deploying MySQL/galera containers
in a HA overcloud managed by pacemaker.
The containers are managed and run by pacemaker. Inside there is
pacemaker_remote which will invoke the resource agent managing galera.
The resources themselves are created via puppet-pacemaker inside a
short-lived container used for this purpose (mysql_init_bundle).
This container needs to use the 'docker_config' section to invoke
puppet (as opposed to 'docker_puppet_tasks'), because due to the HA
composability each resource creation needs to happen on the bootstrap
node of that service and 'docker_puppet_tasks' will only run on the
controller/primary role.
Co-Authored-By: Michele Baldessari <michele@acksyn.org>
Closes-Bug: #1692842
Depends-On: I3b4d8ad2eec70080419882d5d822f78ebd3721ae
Change-Id: I790dbc30b3de1c1a3fe76d3d8f060e4d7f95e2e7
|