Age | Commit message (Collapse) | Author | Files | Lines |
|
Currently we do not disable openstack-cinder-volume during our
major-upgrade-pacemaker step. This leads to the following scenario. In
major_upgrade_controller_pacemaker_2.sh we do:
start_or_enable_service galera
check_resource galera started 600
....
if [[ -n $(is_bootstrap_node) ]]; then
...
cinder-manage db sync
...
What happens here is that since openstack-cinder-volume was never
disabled it will already be started by pacemaker before we call
cinder-manage and this will give us the following errors during the
start:
06:05:21.861 19482 ERROR cinder.cmd.volume DBError:
(pymysql.err.InternalError) (1054, u"Unknown column 'services.cluster_name' in 'field list'")
Change-Id: I01b2daf956c30b9a4985ea62cbf4c941ec66dcdf
Closes-Bug: #1627470
|
|
|
|
This patch moves the keystone::auth settings for all
services into the new service_config_settings section. This
is important because we execute the keystone commands via
puppet only on the role containing the keystone service
and without these settings it will fail.
Note that yaql merging/filtering is used here to ensure that
service_config_settings is optional in service templates,
and also that we'll only deploy hieradata for a given
service on a node running the service (the key in
the service_config_settings map must match the service_name
in the service template for this to work).
e.g the following will result in only deploying keystone: 123
in hiera on the role running the "keystone" service,
regardless of which service template defines it.
service_config_settings:
keystone:
keystone: 123
Co-Authored-By: Steven Hardy <shardy@redhat.com>
Change-Id: I0c2fce037a1a38772f998d582a816b4b703f8265
Closes-bug: 1620829
|
|
|
|
|
|
|
|
This was missed during custom-roles work, and will mean deployments
break if any of the existing roles are removed from roles_data.yaml
Change-Id: Ia737b48a0dd272f8d706b7458764201fa47cb0bb
Closes-Bug: #1625755
|
|
|
|
the konstantin-fluentd package assumes sysv init scripts, while the
fluentd package in rhel(/centos/fedora) uses systemd. this can cause
errors starting the service.
This review explicitly sets the service_provider to "systemd".
This requires https://github.com/soylent/konstantin-fluentd/pull/15, which exposes the service_provider parameter in konstantin-fluentd.
Change-Id: I24332203de33f56a0e49fcc15f7fb7bb576e8752
|
|
Currently we have a few keys which may be considered optional,
such as monitoring_subscription, logging* and global_config_settings.
Currently we dereference these directly via get_attr, but this will
break when heat output validation is fixed, ref bug #1599114 is fixed
(patches are up for this, so it may be soon).
Change-Id: If4eed1ca39c10ace9b1cb5ce2dc4b9c70a3dd2f4
Partial-Bug: #1620829
|
|
Our previous no-ops stopped working because the Puppet run resources
moved under a different entry in resource registry. This is now fixed
to follow the latest way.
Change-Id: Ia5598385ddca185bfbf10e2d3babb53f6f77d1ac
Closes-Bug: #1626452
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Currently we pass numbers in (hard-coded in post.j2.yaml) but the
SoftwareConfig schema defaults to String. If puppet requires an
integer number, setting this type may help preserve the type for
the hook.
Change-Id: Ie9227d7adb58ea3c791aa459a1ab5b17ad935919
|
|
This patch changes the default value and type of the Glance worker
configuration to allow it to be unset and allow a system dependent
default to be used (e.g. processorcount or some derivative value). The
previous default of 0 would result in a single self contained process,
which while suitable for debugging and testing is not appropriate for
production deployments.
Partial-Bug: #1626126
Change-Id: I58a6a72a581e7083e1dc4e5ca568fdd3fdd6cdf1
|
|
We hit problems in environments which don't have a lot of RAM (e.g. dev
envs, could be also CI) that Apache ate too much memory due to
too many worker processes being spawned.
This commit allows customizing the Apache MaxRequestWorkers and
ServerLimit directives via Heat parameters. The default stays 256 as
that's the default in the Puppet module, to be suited for production
environments with powerful machines. Also low-memory-usage.yaml
environment file is added, which can be used to make dev/test/CI
overclouds less memory hungry, where the limits are now set to 32.
Change-Id: Ibcf1d9c3326df8bb5b380066166c4ae3c4bf8d96
Co-Authored-By: Carlos Camacho <ccamacho@redhat.com>
Closes-Bug: #1619205
|
|
The previous logic left out the default Count completely when it was
zero, which breaks nested validation and it's likely similar problems
would exist with the other optional defaults, so rework it so the
defaulting happens in the jinja2 logic, and document the interfaces
better in roles_data.yaml
Change-Id: I7f2eb4a3a0b43c5d2cd0d001ed3c73f783c95c74
Closes-Bug: #1625760
|
|
|
|
Currently the servername is incorrectly set for the services running
over apache. It currently takes the default value which is just the
regular FQDN, when the services actually might be running on
different IPs that require alternative FQDNs.
This fixes that by filling that value from a fact in hiera that's
dependant on the service's network.
Closes-Bug: #1625677
Change-Id: Ib7ea5fd2d18a376eaa2f5a3fa5687cb9b719a8e2
|
|
Running upgrade-non-controller.sh against compute and object storage did
not fail if the /root/tripleo_upgrade_node.sh failed.
This make it harder to detect error in CI system for instance.
Change-Id: I12b7d640547d3b8ec1f70104d159d6052b7638ff
Closes-Bug: 1620973
|
|
|
|
The neutron metadata agent's metadata_ip field is meant to refer to the
nova metadata service, not the local address on the NeutronApiNetwork.
Change-Id: Ibb25a80ea3e66ab3f5cf63c197460d495939778d
Closes-Bug: #1625504
|
|
This is needed because currently we're not generating
nova_metadata_vip or nova_metadata_nodes_ip, and a service profile is
required for that. Unfortunately, currently puppet-nova only deploys
osapi and metadata through the same manifest, so this profile doesn't
really inject any puppet code. We can make this more elegant later.
Change-Id: Id7112111f16d0c749a6203b90e29e6d9f1e4d57e
Closes-Bug: #1625543
|
|
Currently in puppet/services/rabbitmq.yaml we hardcode the thread pool
size to 30 (via the +A30 snippet):
rabbitmq_environment:
RABBITMQ_SERVER_ERL_ARGS: '"+K true +A30 +P 1048576 -kernel inet_default_connect_options [{nodelay,true},{raw,6,18,<<5000:64/native>>}] -kernel inet_default_listen_options [{raw,6,18,<<5000:64/native>>}]"'
Upstream rabbit has gained the ability to dynamically configure the
number of threads since 3.6.2 via the following commit:
https://github.com/rabbitmq/rabbitmq-server/commit/41ce5ad808863944cd6d62ce7f7e2271f1010582
Given that the default was hardcoded in rabbit from at least 3.4.0 up
until 3.6.2 (see LP bug associated to this commit), we can actually
remove this hardcoded value as it overrides a sane default.
Before the change:
/usr/lib64/erlang/erts-7.3.1/bin/beam.smp -W w -A 64 -K true -A30 -P 1048576 ...
After the change:
/usr/lib64/erlang/erts-7.3.1/bin/beam.smp -W w -A 64 -K true -P 1048576 ...
So effectively with this change we will have the following:
- With older rabbitmq versions we keep the +A30 default
- With rabbitmq versions >= 3.6.2 the thread number is dynamically
computed to nr_cpus * 16
Change-Id: I8d30c7d141c29fcc439d40fc767498520be7966e
Closes-Bug: #1625486
|
|
This patch conditionally enables Neutron L3 HA if there are multiple
controllers but DVR has not been enabled. If the conditions are false,
the value of NeutronL3HA is used.
Change-Id: If1ebeaf417c0da99d833450e394b71cabff2c800
Closes-Bug: #1623155
|
|
|
|
|
|
This is the initial work to have a function that migrates a full HA
architecture as deployed in Mitaka to the HA architecture as deployed in
Newton where only a few resources are managed by pacemaker.
The sequence is the following:
1) We remove the desired services from pacemaker's control. The services
at this point are still running normally via the systemd service as
invoked by pacemaker
2) We do a "systemctl stop <service>" on all controllers for all the
services that were removed from pacemaker's control. We do this to make
sure that during the yum upgrade, the %post sections that call
"systemctl try-restart" do not take ages, because at this point during
the upgrade rabbit is down. The only exceptions are "openstack-core"
and "delay" which are dummy pacemaker resources that do not exist on
the system
3) We do a "systemctl start <service>" on all nodes for all the services
mentioned above.
We should probably merge this patch only when newton has branched as it
is very specific to the M/N upgrade.
Closes-Bug: 1617520
Change-Id: I4c409ce58c1a57b6e0decc3cf168b62698b32e39
|
|
While it is possible to override the pg_num, pgp_num and size for
each pool, the defaults are hardcoded. This patch uses as default
the values given via ceph::profile::params::osd_pool_default_*
parameters, if any.
Closes-Bug: 1623590
Change-Id: Iecde772e7f72fd9abedb54cff4b8f2605df8fedd
|
|
|
|
|
|
|
|
Change-Id: I7a041dab8b1b1edc9c80248e1eef3ce7ab272292
Closes-Bug: 1615056
|
|
|
|
These are needed so the computes can advertize the VNC URL correctly.
Change-Id: Ic3eba9fe929ce396b584249eb84415de09ab1b62
Closes-Bug: #1623607
|
|
|
|
For N we cannot assume services are managed by pacemaker.
This adds functions to check if a service is systemd or
pcmk managed and start/stops it accordingly. For pcmk,
only stop/disable on bootstrap node for example, whereas
systemd should stop/start on all controllers.
There is also an equivalent change to the check_resource
which has been reworked to allow both pcmk and systemd.
Implements: blueprint overcloud-upgrades-workflow-mitaka-to-newton
Change-Id: Ic8252736781dc906b3aef8fc756eb8b2f3bb1f02
|
|
|
|
|
|
This implements support for installing fluentd agents as a composable
service on the overcloud.
Depends-On: I2e1abe4d8c8359e56ff626255ee50c9cacca1940
Implements: tripleo-opstools-centralized-logging
Change-Id: I23b0e23881b742158fcfb6b8c145a3211d45086e
|
|
|
|
|
|
|
|
|
|
|
|
|