Age | Commit message | Author | Files | Lines |
|
This commit implements composable HA for the pacemaker profiles.
- Every time a pacemaker resource gets included on a node,
that node adds a node cluster property named after the resource
(e.g. galera-role=true).
- Add a location rule constraint so that the resource only runs
on the nodes that have that property.
- We also make sure that any pacemaker resource/property creation has a
predefined number of tries (20 by default). The reason for this is
that within composable HA, it might be possible to get "older CIB"
errors when another node changed the CIB while we were doing an
operation on it. Simply retrying fixes this.
- Also make sure that we use the newly introduced
pacemaker::constraint::order class instead of the older
pacemaker::constraint::base class. The newer class uses the push_cib()
function and hence behaves correctly when multiple nodes try
to modify the CIB at the same time.
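The bounded-retry behaviour described above can be sketched roughly as follows. This is a generic Python illustration, not the Puppet code from this change: OlderCibError and with_retries are hypothetical names, and only the default of 20 tries comes from the message.

```python
import time


class OlderCibError(Exception):
    """Hypothetical stand-in for a CIB update rejected as outdated."""


def with_retries(operation, tries=20, delay=0.0):
    """Call operation() until it succeeds or `tries` attempts are used up.

    Only OlderCibError is retried; any other exception propagates at once.
    """
    if tries < 1:
        raise ValueError("tries must be at least 1")
    last_error = None
    for _ in range(tries):
        try:
            return operation()
        except OlderCibError as err:
            # Another node changed the CIB in the meantime; try again.
            last_error = err
            time.sleep(delay)
    raise last_error
```

A caller would wrap each resource or property creation in with_retries, so a concurrent CIB change on another node costs one more attempt rather than failing the run.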
Change-Id: I63da4f48da14534fd76265764569e76300534472
Depends-On: Ib931adaff43dbc16220a90fb509845178d696402
Depends-On: I8d78cc1b14f0e18e034b979a826bf3cdb0878bae
Depends-On: Iba1017c33b1cd4d56a3ee8824d851b38cfdbc2d3
|
|
When we create a pacemaker resource it must happen from a single node.
If several nodes attempt it, pcs returns an error immediately.
For the pacemaker roles we enforce this by leveraging the recently
introduced <SERVICE_NAME_bootstrap_short_node_name> which gives us
the first hostname per-service, regardless of the role.
(introduced via I03e8685f939e8ae1fcd8b16883b559615042505d)
With this approach if a pacemaker service belongs to two different
roles (say role Controller on node A and role galera on node B), it
will only create the resource from one of the two and not both (which
would return an error).
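The single-creator rule above amounts to a hostname comparison, sketched here in Python. The function name, the case-insensitive match and the stripping of a domain suffix are assumptions made for illustration, not details taken from the manifests.

```python
def should_create_resource(local_hostname: str, bootstrap_short_node_name: str) -> bool:
    """Return True only on the node designated as bootstrap for the service.

    Assumption: names are compared as short hostnames, ignoring case.
    """
    local_short = local_hostname.split(".", 1)[0].lower()
    return local_short == bootstrap_short_node_name.lower()
```

With the per-service bootstrap name set to the first hostname running the service, exactly one node evaluates this to True, whichever role that node belongs to.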
Only setting Partial-Bug for this one, because it addresses the issue
from the pacemaker resource creation POV (which is always affected). But
the issue itself is a race that we're theoretically affected by since
the composable roles work landed. While I have tried to fix the more
general case in previous attempts, I think it is best if we start a
discussion on how to fix it, because each approach has a bunch of
potential drawbacks and is quite invasive on how we do things. A
discussion slot for this has been proposed for the Atlanta PTG.
Change-Id: I662398cab60d523d204b57a5674ca8f5c0f2e68a
Partial-Bug: #1615983
|
|
Puppet 4 ordering makes things stricter in the catalog, which is good.
Resources have to be orchestrated or Puppet will take them in the order
they are found in the catalog.
This patch makes sure we create MySQL users only when Galera is actually
ready.
Closes-Bug: #1645787
Change-Id: I536a1a128c3a7eca49bcc4f34a1307bcd60b029e
|
|
remove_default_accounts is a mysql::server parameter that, when set to
true, executes some MySQL commands to clean up the default MySQL accounts
created by packaging.
In order to successfully run the commands, we need MySQL up and running,
which is the case at step 2 but not at step 1.
This patch makes sure we run the commands at step 2 on the pacemaker
master only.
No change for scenarios without Pacemaker.
Change-Id: Ifad3cb40fd958d7ea606b9cd2ba4c8ec22a8e94e
Closes-Bug: #1633113
|
|
We're not able to use FQDNs yet, so to work around this, we give
precedence to a "short name" list we'll get from t-h-t.
Change-Id: I4ef7786474c229d5212a0deb2ca02ee992b030d8
Related-Bug: #1628521
|
|
|
|
This patch moves the various DB syncs into the MySQL role.
Database creation needs to occur on the MySQL server to
avoid permission issues.
This patch also moves database creation to step 2 so we can
guarantee that all per-service databases exist at this time.
This avoids the complex ordering needed during step 3, where
services on different hosts can run their own db syncs
in a distributed fashion.
Change-Id: I05cc0afa9373429a3197c194c3e8f784ae96de5f
Partial-bug: #1620595
|
|
Having an actual name for that configuration will allow us to pass a
more appropriate name via t-h-t.
Change-Id: Iea4bd67074824e5dc6732fd7e408743e693d80b3
|
|
The bind-address used to be hardcoded to the $::hostname fact.
This is wrong, as it disregards where we have
configured the mysql address. This commit makes it
configurable, so we'll be able to set it via hieradata.
By default we use the hiera key that we already set,
'mysql_bind_host'; if, for some reason, that's
unavailable then we fall back to $::hostname.
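The resulting lookup order can be sketched as below. This is an illustrative Python rendering, not the manifest itself: resolve_bind_address and its arguments are hypothetical, and a plain dict stands in for hieradata.

```python
def resolve_bind_address(hieradata, hostname, explicit=None):
    """Pick the bind address: explicit setting, then hiera key, then hostname."""
    if explicit:
        # Value set directly on the class, e.g. through hieradata.
        return explicit
    from_hiera = hieradata.get("mysql_bind_host")
    if from_hiera:
        return from_hiera
    # Last resort: the equivalent of the $::hostname fact.
    return hostname
```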
Related-Bug: #1627060
Change-Id: I316acfd514aac63b84890e20283c4ca611ccde8b
|
|
Don't add brackets around the mysql_bind_host parameter in the Galera
config. Brackets around this parameter work with old versions of
Galera but not with the newest one.
So let's remove them altogether, so we can safely upgrade Galera in RDO.
Change-Id: Ic904d4efda162f18ec8dffb91c2f383f54361f41
Closes-Bug: #1622755
|
|
|
|
|
|
Write a restart flag file for services managed by Pacemaker into the
/var/lib/tripleo/pacemaker-restarts directory. The name of the file must
match the name of the clone resource defined in pacemaker. The
post-puppet restart script will restart each service having a restart
flag file and remove those files.
This approach focuses on $pacemaker_master only (we don't want to
restart the pacemaker services 3 times when we have 3 controllers), so
it relies on the assumption that we're making the matching config
changes across the pacemaker nodes.
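The flag-file handshake described above can be sketched in Python as follows. The real post-puppet script is not shown in this log, so the function names and the restart callback are hypothetical; only the directory layout and the file-name convention come from the message.

```python
from pathlib import Path


def write_restart_flag(flag_dir, clone_resource_name):
    """Create an empty flag file named after the pacemaker clone resource."""
    flag_dir = Path(flag_dir)
    flag_dir.mkdir(parents=True, exist_ok=True)
    flag = flag_dir / clone_resource_name
    flag.touch()
    return flag


def process_restart_flags(flag_dir, restart):
    """Restart every flagged resource once, then remove its flag file."""
    flag_dir = Path(flag_dir)
    restarted = []
    if not flag_dir.is_dir():
        return restarted
    for flag in sorted(flag_dir.iterdir()):
        restart(flag.name)  # hypothetical callback that restarts the resource
        flag.unlink()
        restarted.append(flag.name)
    return restarted
```

Because each flag is deleted after its restart, running the second step again without new flags restarts nothing.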
Change-Id: I6369ab0c82dbf3c8f21043f8aa9ab810744ddc12
|
|
profiles"
|
|
Prepare the pacemaker mysql manifest for galera_node_names becoming an
array. Stay backwards compatible by still handling a comma-delimited
string, which avoids a chicken-and-egg patch problem between t-h-t and
puppet-tripleo.
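Accepting both input shapes boils down to a small normalization step, sketched here in Python. The helper name and the whitespace trimming are assumptions for illustration; the manifest's own handling may differ in detail.

```python
def normalize_node_names(galera_node_names):
    """Return a list of node names from either a list or a comma-delimited string."""
    if isinstance(galera_node_names, str):
        parts = galera_node_names.split(",")
    else:
        parts = list(galera_node_names)
    # Trim stray whitespace and drop empty entries such as a trailing comma.
    return [name.strip() for name in parts if name.strip()]
```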
Change-Id: Ia0d9d59728c8771974bfbc486f4929b99a38e4fb
Partially-Implements: blueprint custom-roles
|
|
When we implemented the galera composable role we accidentally
restricted the xinetd.d monitor service to the bootstrap node. This meant
that haproxy believed that galera was down on the non-bootstrap nodes. A
shutdown of the bootstrap node meant that galera was effectively down
because haproxy would refuse to redirect the traffic to the
non-bootstrap nodes. Fix this by creating
/etc/xinetd.d/galera-monitor on all controller nodes.
Change-Id: Ib5a06b3abbc32182476c2b0c81eb77a12821ad6b
|
|
Some lint checks are returning:
WARNING: line has more than 140 characters in puppet-tripleo profiles
This patch removes those warnings by adding backslashes (\).
Change-Id: I19b56c93db82948fb0498a4c9851b522c81946f8
|
|
As we are starting to manually check overcloud services,
the first step is to check that the puppet profiles
are all aligned.
Changes applied:
No logic added or removed in this submission.
Removed unused parameters.
Align header comments structure.
All profile parameters sorted as follows:
"Mandatory params first sorted alphabetically
then optional params sorted alphabetically."
Note: Following submissions will check pacemaker,
cinder, mistral and redis services in the base profiles
as some of them have the $pacemaker_master parameter
defaulted to true.
Change-Id: I2f91c3f6baa33f74b5625789eec83233179a9655
|
|
The openstack-core resource is not needed by the NG Pacemaker
architecture. It was moved into an isolated role by [1] so that
it could optionally be enabled when wanting the older architecture.
This submission removes the old openstack-core global resource.
1. I74a62973146c0261385ecf5fd3d06db51e079caa
Change-Id: I16a786ce167c57848551c7245f4344c382c55b3d
|
|
In some profiles, we were looking up the $step by using Hiera
again, while we already do it in the parameter definition.
Using these classes outside THT used to fail; with this patch, we
can just set the $step parameter and the rest of the manifest will
work.
Change-Id: I7082f47204fb4e529b164e4c4f1032e7bdd88f02
|
|
Add MySQL profiles, for non-ha and ha scenarios.
Change-Id: I7ddae28a6affd55c5bffc15d72226a18c708850e
Closes-Bug: #1601853
|