Age | Commit message (Collapse) | Author | Files | Lines |
|
min-masters strategy"
|
|
It turns out that reducing number of rabbitmq queues in cluster
significantly improves performance of cluster especially in the case of
failover recovery time. Right now the cluster uses ha-all mode for rabbitmq
queues.
It is best to change this to "ha-exactly" mode and reduce the number
of queue copies to ceil(N/2) where N is number of controllers in the
cluster - so in typical scenario of 3 controller It would be 2 by
default.
It does not make much sense to keep the copies of queues over whole
cluster since if the quorum of nodes is lost then the rest of cluster
nodes will be stopped anyway. We let the user override this with a
parameter.
I.e. for a 3 node controlplane cluster we will go from this:
pcs resource show rabbitmq
Resource: rabbitmq (class=ocf provider=heartbeat type=rabbitmq-cluster)
Attributes: set_policy="ha-all ^(?!amq\.).* {"ha-mode":"all"}"
To this:
pcs resource show rabbitmq
Resource: rabbitmq (class=ocf provider=heartbeat type=rabbitmq-cluster)
Attributes: set_policy="ha-all ^(?!amq\.).* {"ha-mode":"exactly","ha-params":2}"
According to Marin Krcmarik's testing recovery time from failure was
reduced significantly.
Partial-Bug: #1628998
Change-Id: Iace6daf27a76cb8ef1050ada0de7ff1f530916c6
|
|
They are now normalized and set in puppet-tripleo.
Change-Id: I197481c577b85894178e7899a55869da47847755
Closes-Bug: #1629279
Depends-On: Ic6de09acf0d36ca90cc2041c0add1bc2b4a369a5
|
|
Add redis_password parameter in Hiera so we can re-use it from
puppet-tripleo later for Aodh, Ceilometer and Gnocchi.
Change-Id: I038e2bac22e3bfa5047d2e76e23cff664546464d
Partial-Bug: #1629279
|
|
strategy
It may happen that one of the controllers may become unavailable and
Queue Masters will be located on available controllers during queue
declarations. Once a lost controller will be become available masters of
newly declared queues are not placed with priority to such controller
with obviously lower number of queue masters and thus the distribution
may be unbalanced and one of the controllers may become under
significantly higher load in some circumstances of multiple fail-overs.
With rabbit 3.6.0 rabbitmq introduced a new HA feature of Queue masters
distribution - one of the strategies is min-masters, which picks the
node hosting the minimum number of masters.
One of the ways how to turn such min-masters strategy on is by adding
following into configuration file - rabbitmq.config
{rabbit,[ ..
{queue_master_locator, <<"min-masters">>},
.. ]},
Change-Id: I61bcab0e93027282b62f2a97bec87cbb0a6e6551
Closes-Bug: #1629010
|
|
This patch movs the various db::mysql hiera settings into a
'mysql' specific service_config_settings section for each
service so that these will only get applied on the MySQL service
node. This follows a similar puppet-tripleo change where we
create the actual databases for all services locally on
the MySQL service node to avoid permission issues.
Change-Id: Ic0692b1f7aa8409699630ef3924c4be98ca6ffb2
Closes-bug: #1620595
Depends-On: I05cc0afa9373429a3197c194c3e8f784ae96de5f
Depends-On: I5e1ef2dc6de6f67d7c509e299855baec371f614d
|
|
|
|
This patch enables correctly setting the NTP server passed via
--ntp-server in the overcloud nodes' /etc/ntp.conf.
Change-Id: Iff644b9da51fb8cd1946ad9d297ba0e94d3d782b
|
|
|
|
|
|
Without setting this parameter, overcloud deploy fails and
'openstack stack failures list overcloud' reveals the
following error:
Error: Puppet::Type::Keystone_user_role::ProviderOpenstack: Could
not find project with name [services] and domain [Default]
Error:
/Stage[main]/Manila::Keystone::Auth/Keystone::Resource::Service_identity[manilav2]/Keystone_user_role[manilav2@services]:
Could not evaluate: undefined method `[]' for nil:NilClass
When we set manila::keystone::auth::tenant to 'service', analogous
to cinder, nova, etc., the overcloud deploy completes successfully.
Change-Id: I996ac2ff602c632a9f9ea9c293472a6f2f92fd72
|
|
|
|
|
|
|
|
|
|
This used to used mysql_bind_ip, but this parameter is quite misleading
since what it actually configures is not the bind-ip itself, but the
gmcast.listen_addr parameter. This fixes that confusion.
Depends-On: Iea4bd67074824e5dc6732fd7e408743e693d80b3
Change-Id: I2b114600e622491ccff08a07946926734b50ac70
|
|
Change-Id: I291bfb1e5736864ea504cd82eea1d4001fcdd931
|
|
This now takes into use the mysql_bind_host key, to set an
appropriate fqdn for mysql to bind to.
Closes-Bug: #1627060
Change-Id: I50f4082ea968d93b240b6b5541d84f27afd6e2a3
Depends-On: I316acfd514aac63b84890e20283c4ca611ccde8b
|
|
Depending on the environment, gnocchi workers
uses several controller resources RAM/CPU,
this option makes it configurable.
Also, configured to 1 in environments/low-memory-usage.yaml
which will reduce the service footprint in i.e. CI
Change-Id: Ia008b32151f4d8fec586cf89994ac836751b7cce
Closes-bug: #1626473
|
|
Enables configuring CephFS Native backend for Manila.
This change is based on the usage of environments like in
review https://review.openstack.org/#/c/354019 for Netapp
driver.
Co-Authored-By: Marios Andreou <marios@redhat.com>
Change-Id: If013d796bcdfe48b2c995bcab462c89c360b7367
Depends-On: I918f6f23ae0bd3542bcfe1bf0c797d4e6aa8f4d9
Depends-On: I2b537f735b8d1be8f39e8c274be3872b193c1014
|
|
This patch moves the keystone::auth settings for all
services into the new service_config_settings section. This
is important because we execute the keystone commands via
puppet only on the role containing the keystone service
and without these settings it will fail.
Note that yaql merging/filtering is used here to ensure that
service_config_settings is optional in service templates,
and also that we'll only deploy hieradata for a given
service on a node running the service (the key in
the service_config_settings map must match the service_name
in the service template for this to work).
e.g the following will result in only deploying keystone: 123
in hiera on the role running the "keystone" service,
regardless of which service template defines it.
service_config_settings:
keystone:
keystone: 123
Co-Authored-By: Steven Hardy <shardy@redhat.com>
Change-Id: I0c2fce037a1a38772f998d582a816b4b703f8265
Closes-bug: 1620829
|
|
|
|
|
|
|
|
This patch changes the default value and type of the NeutronWorkers
parameter, allowing it to be unset and let a system-dependent value to
be used (e.g. processorcount or some derivate value).
Change-Id: Ia385b3503fe405c4b981c451f131ac91e1af5602
Closes-Bug: #1626126
|
|
the konstantin-fluentd package assumes sysv init scripts, while the
fluentd package in rhel(/centos/fedora) uses systemd. this can cause
errors starting the service.
This review explicitly sets the service_provider to "systemd".
This requires https://github.com/soylent/konstantin-fluentd/pull/15, which exposes the service_provider parameter in konstantin-fluentd.
Change-Id: I24332203de33f56a0e49fcc15f7fb7bb576e8752
|
|
NeutronL3HA used to be enabled by the tripleoclient if the controller
count > 1. This functionality has been moved into the relevant heat
template, making the parameter less valuable for general use. If
necessary, deployers can override the automatic behavior through extra
config.
Change-Id: Id5bb5070b9627fd545357acc9ef51bdc69d10551
Related-Bug: #1623155
|
|
Currently we have a few keys which may be considered optional,
such as monitoring_subscription, logging* and global_config_settings.
Currently we dereference these directly via get_attr, but this will
break when heat output validation is fixed, ref bug #1599114 is fixed
(patches are up for this, so it may be soon).
Change-Id: If4eed1ca39c10ace9b1cb5ce2dc4b9c70a3dd2f4
Partial-Bug: #1620829
|
|
|
|
|
|
|
|
This patch changes the default value and type of the Glance worker
configuration to allow it to be unset and allow a system dependent
default to be used (e.g. processorcount or some derivative value). The
previous default of 0 would result in a single self contained process,
which while suitable for debugging and testing is not appropriate for
production deployments.
Partial-Bug: #1626126
Change-Id: I58a6a72a581e7083e1dc4e5ca568fdd3fdd6cdf1
|
|
We hit problems in environments which don't have a lot of RAM (e.g. dev
envs, could be also CI) that Apache ate too much memory due to
too many worker processes being spawned.
This commit allows customizing the Apache MaxRequestWorkers and
ServerLimit directives via Heat parameters. The default stays 256 as
that's the default in the Puppet module, to be suited for production
environments with powerful machines. Also low-memory-usage.yaml
environment file is added, which can be used to make dev/test/CI
overclouds less memory hungry, where the limits are now set to 32.
Change-Id: Ibcf1d9c3326df8bb5b380066166c4ae3c4bf8d96
Co-Authored-By: Carlos Camacho <ccamacho@redhat.com>
Closes-Bug: #1619205
|
|
|
|
Currently the servername is incorrectly set for the services running
over apache. It currently takes the default value which is just the
regular FQDN, when the services actually might be running on
different IPs that require alternative FQDNs.
This fixes that by filling that value from a fact in hiera that's
dependant on the service's network.
Closes-Bug: #1625677
Change-Id: Ib7ea5fd2d18a376eaa2f5a3fa5687cb9b719a8e2
|
|
|
|
The neutron metadata agent's metadata_ip field is meant to refer to the
nova metadata service, not the local address on the NeutronApiNetwork.
Change-Id: Ibb25a80ea3e66ab3f5cf63c197460d495939778d
Closes-Bug: #1625504
|
|
This is needed because currently we're not generating
nova_metadata_vip or nova_metadata_nodes_ip, and a service profile is
required for that. Unfortunately, currently puppet-nova only deploys
osapi and metadata through the same manifest, so this profile doesn't
really inject any puppet code. We can make this more elegant later.
Change-Id: Id7112111f16d0c749a6203b90e29e6d9f1e4d57e
Closes-Bug: #1625543
|
|
Currently in puppet/services/rabbitmq.yaml we hardcode the thread pool
size to 30 (via the +A30 snippet):
rabbitmq_environment:
RABBITMQ_SERVER_ERL_ARGS: '"+K true +A30 +P 1048576 -kernel inet_default_connect_options [{nodelay,true},{raw,6,18,<<5000:64/native>>}] -kernel inet_default_listen_options [{raw,6,18,<<5000:64/native>>}]"'
Upstream rabbit has gained the ability to dynamically configure the
number of threads since 3.6.2 via the following commit:
https://github.com/rabbitmq/rabbitmq-server/commit/41ce5ad808863944cd6d62ce7f7e2271f1010582
Given that the default was hardcoded in rabbit from at least 3.4.0 up
until 3.6.2 (see LP bug associated to this commit), we can actually
remove this hardcoded value as it overrides a sane default.
Before the change:
/usr/lib64/erlang/erts-7.3.1/bin/beam.smp -W w -A 64 -K true -A30 -P 1048576 ...
After the change:
/usr/lib64/erlang/erts-7.3.1/bin/beam.smp -W w -A 64 -K true -P 1048576 ...
So effectively with this change we will have the following:
- With older rabbitmq versions we keep the +A30 default
- With rabbitmq versions >= 3.6.2 the thread number is dynamically
computed to nr_cpus * 16
Change-Id: I8d30c7d141c29fcc439d40fc767498520be7966e
Closes-Bug: #1625486
|
|
This patch conditionally enables Neutron L3 HA if there are multiple
controllers but DVR has not been enabled. If the conditions are false,
the value of NeutronL3HA is used.
Change-Id: If1ebeaf417c0da99d833450e394b71cabff2c800
Closes-Bug: #1623155
|
|
|
|
While it is possible to override the pg_num, pgp_num and size for
each pool, the defaults are hardcoded. This patch uses as default
the values given via ceph::profile::params::osd_pool_default_*
parameters, if any.
Closes-Bug: 1623590
Change-Id: Iecde772e7f72fd9abedb54cff4b8f2605df8fedd
|
|
|
|
|
|
These are needed so the computes can advertize the VNC URL correctly.
Change-Id: Ic3eba9fe929ce396b584249eb84415de09ab1b62
Closes-Bug: #1623607
|
|
|
|
|
|
This implements support for installing fluentd agents as a composable
service on the overcloud.
Depends-On: I2e1abe4d8c8359e56ff626255ee50c9cacca1940
Implements: tripleo-opstools-centralized-logging
Change-Id: I23b0e23881b742158fcfb6b8c145a3211d45086e
|
|
|
|
Currently RabbitMQ cluster uses a predefined port 35672 for clustering.
This port belongs to so-called ephemeral ports range.
Ephemeral ports are the ports kernel assings to application if it
doesn't specify which port to open. So there is a small chance that this
application being started before RabbitMQ itself could grab this port.
While rather unlikely we did see this happen.
Selinux change should already be in place. On my Centos 7 we have:
rabbitmq_port_t tcp 25672
corenet_tcp_bind_rabbitmq_port(rabbitmq_t)
corenet_tcp_connect_rabbitmq_port(rabbitmq_t)
First noted via:
https://bugzilla.redhat.com/show_bug.cgi?id=1357522
Closes-Bug: #1623818
Depends-On: I0bcd0d063a7a766483426fdd5ea81cbe1dfaa348
Change-Id: I995bd96c2a17614e954ea5bbae4d58998ef420dc
|