|
The 'stop timeout' of the rabbitmq and redis pacemaker resources is
set to 90s, while for all other services it is set to 200s.
The overcloud deployment sometimes fails because of this, with the
error:
Error: Could not complete shutdown of rabbitmq-clone, 1 resources
remaining
Error performing operation: Timer expired
This patch updates the timeout for Redis and RabbitMQ to avoid this
error.
Change-Id: I8a3b3951a896ee3e8e5e09778e8ea4717e76a1b4
|
|
Release Newton RC3 5.3.0
Change-Id: I1b367dcaba4c2c0bffa9eae0b81ee81f1676d754
|
|
Aligns the way we check for enabled backends in
pacemaker/manila.pp with what we did in base/manila/api.pp with [1].
The benefit is that we don't need to emit custom hiera from the
templates.
1. I86ba8b9d5872c0f1a94e74215e97b796ad129bfb
Change-Id: I04e28a95e8d69a24cd3df109bf1802bfcbd941db
|
|
When deploying manila with cephfs, share creation fails because
'enabled_share_protocols' sticks to NFS,CIFS and does not get updated
with CEPHFS. This change aims at fixing it by building the list of
enabled protocols based on the list of enabled backends.
Co-Authored-By: Tom Barron <tbarron@redhat.com>
Closes-Bug: 1630564
Change-Id: I86ba8b9d5872c0f1a94e74215e97b796ad129bfb
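
The idea of deriving the protocol list from the enabled backends can be
sketched as follows. This is a minimal Python illustration, not the Puppet
code from the change; the backend names and the backend-to-protocol mapping
are assumptions made for the example.

```python
# Hypothetical mapping from backend name to the share protocols it serves.
# The real change derives this in Puppet; these names are illustrative only.
BACKEND_PROTOCOLS = {
    "generic": ["NFS", "CIFS"],
    "netapp": ["NFS", "CIFS"],
    "cephfsnative": ["CEPHFS"],
}


def enabled_share_protocols(enabled_backends):
    """Build a comma-separated protocol list from the enabled backends.

    Protocols keep first-seen order and are not repeated.
    """
    protocols = []
    for backend in enabled_backends:
        for proto in BACKEND_PROTOCOLS.get(backend, []):
            if proto not in protocols:
                protocols.append(proto)
    return ",".join(protocols)
```

With only the (assumed) cephfsnative backend enabled, the result is CEPHFS
rather than a fixed NFS,CIFS value.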
|
|
The name was changed to "zaqar-websocket" recently. Having the old name
in the configuration file leads to errors and confusion when overriding
URLs, as the override won't get picked up with the old name.
Change-Id: I7acf900d094e41862958b3cddbb66ff0d8a3e46f
Closes-Bug: #1630965
|
|
This change adds rspec testing for the ceph profiles in puppet-tripleo.
Change-Id: I08954e011848d6b747735f11b3cbff5707460c26
|
|
The service profile in HAProxy has the capability of creating
certificates based on a map. The idea is to standardize this, as
some of those certificates should match certain networks the services
are listening on (with the exception of the external network which is
handled differently and the tenant network which doesn't need a
certificate). So, based on which network a certain service is
listening on, we fetch the appropriate certificate.
bp tls-via-certmonger
Change-Id: I89001ae32f46c9682aecc118753ef6cd647baa62
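
The lookup described above might look roughly like this. It is a Python
sketch under stated assumptions, not the HAProxy profile itself: the
function name, the shape of the two maps, and the certificate paths are
all hypothetical.

```python
# Networks that never get a certificate from the map: the external network
# is handled differently and the tenant network does not need one.
SKIPPED_NETWORKS = ("external", "tenant")


def certificate_for(service, service_networks, certificates):
    """Return the certificate for the network a service listens on.

    service_networks maps service name -> network name (hypothetical shape).
    certificates maps network name -> certificate path (hypothetical shape).
    Returns None when no certificate applies.
    """
    network = service_networks.get(service)
    if network is None or network in SKIPPED_NETWORKS:
        return None
    return certificates.get(network)
```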
|
|
We're not able to use FQDNs yet, so to work around this, we give
precedence to a "short name" list we'll get from t-h-t. We can
migrate to using FQDNs in the next cycle.
Change-Id: Ic6fec1057439ed9122d44ef294be890d3ff8a8ee
Related-Bug: #1628521
|
|
The UI expects a Keystone endpoint URL that includes the version
(without it, it is not possible to log in). Looking at the
dist/tripleo_ui_config.js.sample configuration sample in the tripleo-ui
repository, the current expectation is a v2.0 URL so let's use that for
now.
Change-Id: I4ca04b16251fbee264cd4ce5e5433c2c1cb6d2f0
Closes-Bug: #1630546
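
As an illustration of the versioned endpoint the UI expects, here is a
small Python sketch. The helper name and the example address are
assumptions; the change itself sets the value in Puppet.

```python
def versioned_keystone_url(base_url, version="v2.0"):
    """Append the API version to a Keystone URL unless it is already there."""
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/" + version):
        return trimmed
    return trimmed + "/" + version
```

For example, http://192.0.2.1:5000/ becomes http://192.0.2.1:5000/v2.0,
and a URL that already ends in /v2.0 is returned unchanged.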
|
|
Right now we're hardcoding the server names for the services to be
the controllers. This is problematic if we start using custom roles
for services, which listen on nodes that are not controllers.
We already have the server names for each service, so using this
mapping instead fixes the issue.
Change-Id: Ic4b65edb3dc1b75abbc3421a87cab97425b058c4
Closes-Bug: #1629098
|
|
We're not able to use FQDNs yet, so to work around this, we give
precedence to a "short name" list we'll get from t-h-t.
Change-Id: I4ef7786474c229d5212a0deb2ca02ee992b030d8
Related-Bug: #1628521
|
|
It turns out that reducing the number of rabbitmq queue copies in the
cluster significantly improves cluster performance, especially the
failover recovery time. Right now the cluster uses ha-all mode for
rabbitmq queues.
It is best to change this to "ha-exactly" mode and reduce the number
of queue copies to ceil(N/2), where N is the number of controllers in
the cluster. In the typical scenario of 3 controllers it would be 2 by
default.
It does not make much sense to keep copies of the queues across the
whole cluster, since if the quorum of nodes is lost the rest of the
cluster nodes will be stopped anyway. We let the user override this
with a parameter.
I.e. for a 3 node controlplane cluster we will go from this:
pcs resource show rabbitmq
Resource: rabbitmq (class=ocf provider=heartbeat type=rabbitmq-cluster)
Attributes: set_policy="ha-all ^(?!amq\.).* {"ha-mode":"all"}"
To this:
pcs resource show rabbitmq
Resource: rabbitmq (class=ocf provider=heartbeat type=rabbitmq-cluster)
Attributes: set_policy="ha-all ^(?!amq\.).* {"ha-mode":"exactly","ha-params":2}"
According to Marian Krcmarik's testing, recovery time from failure was
reduced significantly.
Co-Authored-By: Marian Krcmarik <mkrcmari@redhat.com>
Change-Id: Ib62001c03e1e08f58cf0c6e0ba07a8879a584084
Partial-Bug: #1628998
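
The default of ceil(N/2) and the resulting policy string can be checked
with a short Python sketch. The function name and the override argument
are hypothetical; only the policy format is taken from the pcs output
above.

```python
import json
import math


def ha_policy(controller_count, override=None):
    """Return the set_policy value for a cluster of controller_count nodes.

    The number of queue copies defaults to ceil(N/2); override stands in
    for the user-facing parameter mentioned in the commit message.
    """
    if controller_count < 1:
        raise ValueError("controller_count must be at least 1")
    copies = override if override is not None else math.ceil(controller_count / 2)
    params = {"ha-mode": "exactly", "ha-params": copies}
    # Compact separators reproduce the JSON layout shown by pcs.
    return "ha-all ^(?!amq\\.).* " + json.dumps(params, separators=(",", ":"))
```

For 3 controllers this yields 2 copies, and for 5 controllers it yields 3.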
|
|
We added code in t-h-t to strip empty services from the service_names
list. (These are often the result of a service set to OS::Heat::None).
As such we can now drop this puppet reject statement.
Change-Id: Ie66f14f183de7e44a1f69af862f7d4be9a14c904
|
|
When updating the package with yum directly, a new httpd config file is
created with a different name than the one used by Puppet, causing
httpd to fail. Cleaning out the package config file and keeping it
around means it won't get overwritten on update, and is the way other
projects such as puppet-horizon handle this.
Change-Id: I539729ce4cd0898f8b0f3f26266e4e6d55b99e37
Closes-Bug: #1628983
|
|
Back in the Mitaka cycle, via the change If6b43982c958f63bc78ad997400bf1279c23df7e,
we made sure that the default start and stop timeouts for pacemaker
systemd resources are 200s (>= twice the default 90s DefaultTimeoutStopSec
in systemd). We made this change by setting puppet resource defaults for
the Pacemaker::Resource::Service class:
  Pacemaker::Resource::Service {
    op_params => 'start timeout=200s stop timeout=200s',
  }
The problem is that after the composable services rework, this does not
work anymore and the pacemaker systemd resources that still exist do not
have these timeouts set.
We want to move away from resource defaults for this because their
results depend on the inclusion order, which in tripleo is no longer
guaranteed (https://docs.puppet.com/puppet/latest/reference/lang_scope.html#scope-lookup-rules).
The only services affected in Newton are: cinder-volume,
cinder-backup, manila-share, haproxy. I preferred fixing all the
pacemaker resources because it seems the cleanest and most logical
commit.
Change-Id: If89a95706514e536a7a2949871a0002c79b6046e
Closes-Bug: #1629366
|
|
This change adds rspec testing for the ceilometer profiles. While
writing these tests, the tripleo::profile::base::ceilometer::collector
class needed to have the hiera lookups moved to class parameters to
allow for testing the possible options around the database backend.
These tests add coverage for ipv4 and ipv6 configurations for the
collector profile as well as excluding mongodb on the backend.
Change-Id: I1abae040104e8492a9fe266de74080e1e7701731
|
|
This change adds rspec testing for the aodh profile and serves as an
example of how to add spec testing that uses hieradata to provide some
required parameters. This testing improves coverage for expectations
around computed configuration items as well as for conditions around
the steps within the tripleo deployment.
Change-Id: Ic763a544289a222fea97020a98821c1e375651a3
|
|
Normalize coordination_url for Telemetry services, so we can deploy them
with IPv6.
Change-Id: Ic6de09acf0d36ca90cc2041c0add1bc2b4a369a5
Partial-Bug: #1629279
Depends-On: I038e2bac22e3bfa5047d2e76e23cff664546464d
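
Normalizing for IPv6 here amounts to wrapping a bare IPv6 address in
square brackets before building the URL. A minimal Python sketch of that
step follows; the function names, the redis scheme, and the port are
assumptions for the example, not values taken from the change.

```python
import ipaddress


def normalize_host(host):
    """Bracket bare IPv6 addresses; leave IPv4 addresses and hostnames alone."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not a bare IP literal (a hostname, or an already bracketed address).
        return host
    return "[{}]".format(addr) if addr.version == 6 else str(addr)


def coordination_url(host, port=6379, scheme="redis"):
    """Build a coordination URL that is valid for both IPv4 and IPv6 hosts."""
    return "{}://{}:{}/".format(scheme, normalize_host(host), port)
```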
|
|
The original configuration produced a 400 error for all requests. The
new FallbackResource directive accomplishes our task in a more elegant
fashion.
Change-Id: Ib5d77d158e73acc63d5c0c85d6aa6d99d2176333
Closes-Bug: 1628484
|
|
Bump to 5.2.0 which is RC2 release.
Change-Id: If5e650c52fa3d7701d3079712a9cc8db3a431e36
|
|
This patch moves the various DB syncs into the MySQL role.
Database creation needs to occur on the MySQL server to
avoid permission issues.
This patch also moves database creation to step 2 so we can
guarantee that all per-service databases exist at this time.
This avoids the complex ordering needed during step 3, where
services on different hosts can run their own db syncs
in a distributed fashion.
Change-Id: I05cc0afa9373429a3197c194c3e8f784ae96de5f
Partial-bug: #1620595
|
|
Having an actual name for that configuration will allow us to pass a
more appropriate name via t-h-t.
Change-Id: Iea4bd67074824e5dc6732fd7e408743e693d80b3
|
|
It used to be hardcoded that the bind-address was always coming from
the $::hostname fact. This is wrong, as it disregards where we have
configured the mysql address. This commit actually makes it
configurable, so we'll be able to set it via hieradata.
As a default we use the hiera key that we already set,
'mysql_bind_host'; if, for some reason, that's unavailable, then we
fall back to $::hostname.
Related-Bug: #1627060
Change-Id: I316acfd514aac63b84890e20283c4ca611ccde8b
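
The resulting precedence (explicit setting, then the 'mysql_bind_host'
hiera key, then the hostname fact) can be sketched in Python. The function
name and argument shapes are hypothetical; the change implements this in
Puppet.

```python
def resolve_bind_address(explicit, hieradata, hostname):
    """Pick the MySQL bind address.

    explicit  -- value set directly for the bind address, or None
    hieradata -- dict standing in for hiera lookups
    hostname  -- stand-in for the $::hostname fact, used as a last resort
    """
    if explicit:
        return explicit
    return hieradata.get("mysql_bind_host") or hostname
```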
|
|
The swift proxy has already been updated to use the updated
ceilometermiddleware, as indicated in [1]. Include it in the proxy
class.
[1] https://github.com/openstack/puppet-swift/commit/e8ad981eff0f97c24a53197c42caf350627d3c9f
Change-Id: Ie49f4a750368ff174b23b8d6baa743d0956d727e
|
|