Age | Commit message (Collapse) | Author | Files | Lines |
|
Adds update_tasks for the minor update workflow. These will be
collected into playbooks during an initial 'update init' heat
stack update and then invoked later by the operator as ansible
playbooks.
Current understanding/workflow:
Step=1: stop the cluster on the updated node
Step=2: Pull the latest image and retag the it pcmklatest
Step=3: yum upgrade happens on the host
Step=4: Restart the cluster on the node
Step=5: Verification: test pacemaker services are running.
https://etherpad.openstack.org/p/tripleo-pike-updates-upgrades
Related-Bug: 1715557
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Co-Authored-By: Sofer Athlan-Guyot <sathlang@redhat.com>
Change-Id: I101e0f5d221045fbf94fb9dc11a2f30706843806
(cherry picked from commit a953bda0ae615dc44d3e8a70aa7ab0160e26f3af)
|
|
The services that docker depends on, have logging_sources and logging_groups;
but those are not set on the docker outputs so they are not used when dockers
are deployed.
Added logging_source & logging_groups as docker optional parameters in
tools/yaml-validate.py
Closes-Bug: #1718110
Change-Id: I8795eaf4bd06051e9b94aa50450dee0d8761e526
(cherry picked from commit 5dbe1121e98a794ec6a6387ff56ee34314177567)
|
|
In Ocata all live-migration over ssh is performed on the default ssh port (22).
In Pike the containerized live-migration over ssh is on port 2022 as the
docker host's sshd is using port 22.
To allow live migration during upgrade we need to temporarily pin the Pike
computes to port 22 and in the final converge we can switch over to port 2022.
This also changes the default port to 2022 for baremetal computes in Pike to
enable live-migration between baremetal and containerized computes.
Change-Id: Icb9bfdd9a99dc1dce28eb95c50a9a36bffa621b1
Depends-On: I0b80b81711f683be539939e7d084365ff63546d3
Closes-Bug: 1714171
(cherry picked from commit 17fd16b9f266e1aa67bf03ebdf309e89d668ada2)
|
|
|
|
|
|
Upgrades from older versions using Management network fail.
This patch enables the management network even though it is not
enabled in any of the role definitions. This will allow upgrades
to complete using existing network environment files, without
requiring operators to switch to the new method for defining
which networks are attached to roles. Eventually these older
environment files will be removed.
Change-Id: Iadd12a559f0ad6918958a1355f189187fd327363
Closes-bug: 1717123
(cherry picked from commit 5b9fbc2b2bfa00de2fe0f437f21e05e3fc09a53d)
|
|
This adds a new config/deployment per role that will come after any
post deploy steps. It drives the same ansible config as the
upgrade_tasks but instead collects the post_upgrade_tasks for any
service in the given role.
The workflow is upgrade_tasks, then post deploy steps (either
puppet/ or docker/ depending on the env) and then the
post_upgrade_tasks added here.
This is added to the pacemaker/cinder-volume.yaml service for now
see the bug below for more info
Change-Id: Iced34fecf02ebddc91df9302de54d2f4c2cab680
Closes-Bug: 1706951
(cherry picked from commit 2e182bffeeb099cb5e0b1747086fb0e0f57b7b5d)
|
|
Using the service_ prefix seems incoherent with its use in
service_config_settings (vs config_settings).
Change-Id: Ia39f181415bee0071409dabddfa0c5c312915e1f
(cherry picked from commit 09137304b98a02ed024c0288da907cfe35ca5fe1)
|
|
Use a separate config_volume for swift_ringbuilder puppet_config tasks.
This is necessary so that the swift_ringbuilder and swift-storage
services don't both rsync files to the same bind mounted directory.
The rsync command from docker-puppet.py uses --delete-after, so when
they both use the same config_volume, they can end up deleting the files
generated by the other (depending on the order of execution).
Even though a separate config_volume is used, the rings must still end up
in /etc/swift for the swift services containers. An additional
container init task is used to copy the ring files into
/var/lib/config-data/puppet-generated/swift/etc/swift so that they will
be present when the actual swift services containers are started.
Change-Id: I05821e76191f64212704ca8e3b7428cda6b3a4b7
Closes-Bug: #1710952
(cherry picked from commit cba00abb7517efa6a8d9b8fb954563204323ffed)
|
|
There is logic in nova-base.yaml that depends on the default for
this parameter being '', and the nova-compute service only needs it
set to auto during upgrade. That will be done by [1] anyway, so it
doesn't matter what the default is. It's also not clear to me that
the nova-compute task is even needed now that we're post-Ocata, but
that's not a change I feel comfortable making.
1: https://github.com/openstack/tripleo-heat-templates/blob/master/environments/major-upgrade-composable-steps.yaml
Change-Id: Iccfcb5b68e406db1b942375803cfedbb929b4307
Partial-Bug: 1700664
|
|
These are mostly the low hanging fruit that only required a few
minor changes to fix. There are more that require a lot of changes
or might be more controversial that will be done later.
Change-Id: I55cebc92ef37a3bb167f5fae0debe77339395e62
Partial-Bug: 1700664
|
|
The key_name default is ignored because the parameter is used in
some mutually exclusive environments where the default doesn't
need to be the same.
Change-Id: I77c1a1159fae38d03b0e59b80ae6bee491d734d7
Partial-Bug: 1700664
|
|
Services that access database have to read an extra MySQL configuration file
/etc/my.cnf.d/tripleo.cnf which holds client-only settings, like client bind
address and SSL configuration. The configuration file is thus used by
containerized services, but also by non-containerized services that still
run on the host.
In order to generate that client configuration file appropriately both on the
host and for containers, 1) the MySQLClient service must be included by the
role; 2) every containerized service which uses the database must include the
mysql::client profile in the docker-puppet config generation step.
By including the mysql::client profile in each containerized service, we ensure
that any change in configuration file will be reflected in the service's
/var/lib/config-data/{service}, and that paunch will restart the service's
container automatically.
We now only rely on MySQLClient from puppet/services, to make it possible to
generate /etc/my.cnf.d/tripleo.cnf on the host, and to set the hiera keys that
drive the generation of that config file in containers via docker-puppet.
We include a new YAML validation step to ensure that any service which depends
on MySQL will initialize the mysql::client profile during the docker-puppet
step.
Change-Id: I0dab1dc9caef1e749f1c42cfefeba179caebc8d7
|
|
|
|
Add docker profiles to deploy Ceph in containers via ceph-ansible. This is
implemented by triggering a Mistral workflow during one of the overcloud
deployment steps, as provided by [1].
Some new service-specific parameters are available to determine the workflow to
execute and the ansible playbook to use. A new `CephAnsibleExtraConfig`
parameter can be used to provide arbitrary config variables consumed by `ceph-ansible`.
The pre-existing template params consumed up until the Pike release to
drive `puppet-ceph` continue to work and are translated, when possible, into
the equivalent `ceph-ansible` variable.
A new environment file is added to enable use of ceph-ansible;
the pre-existing puppet-ceph implementation remains unchanged and usable
for non-containerized deployments.
1. https://review.openstack.org/#/c/463324/
Change-Id: I81d44a1e198c83a4ef8b109b4eb6c611555dcdc5
|
|
|
|
|
|
This patch adds parameters to configure alternative version
of the Zaqar messaging and management backends.
The intent is to make use of these settings in the
containers undercloud to use swift/mysql backends as a default
thus avoiding the dependency on MongoDB.
Change-Id: Ifd6a561737184c9322192ffc9a412c77d6eac3e9
Depends-On: Ie6a56b9163950cee2c0341afa0c0ddce665f3704
Depends-On: I3598e39c0a3cdf80b96e728d9aa8a7e6505e0690
|
|
Since these are obviously global parameters they shouldn't specify
what will be using them because they are used in multiple places.
Change-Id: I5054c2d67dffe802e37f8391dd7bad4721e29831
Partial-Bug: 1700664
|
|
It seems UpdateIdentifier is an overloaded parameter - it is used
both to trigger package updates in the minor update case as well as
to trigger the upgrade steps during a major upgrade. I'm not sure
it's appropriate to change either of the descriptions as a result,
so for the moment that is added to the exclusion list.
Change-Id: Ied36cf259f6a6e5c8cfa7a01722fb7fda6900976
Partial-Bug: 1700664
|
|
Change-Id: I3ea7c0c7ea049043668e68c6e637fd2aaf992622
Partial-Bug: 1700664
|
|
This way we have one list of problems that need to be fixed and can
enable this check to avoid adding any new ones. As parameters are
fixed they can be removed from the exclusion list.
Change-Id: Icb5fc36e2da3a3bfb7eaa8a66464c685220e527f
|
|
|
|
Makes it possible to resolve network subnets within a service
template; the data is transported into a new property ServiceData
wired into every service which hopefully is generic enough to
be extended in the future and transport more data.
Data can be consumed in service templates to set config values
which need to know what is the subnet where a deamon operates (for
example the Ceph Public vs Cluster network).
Change-Id: I28e21c46f1ef609517175f7e7ee19e28d1c0cba2
|
|
With the merging of Iad3e9b215c6f21ba761c8360bb7ed531e34520e6 the
roles_data.yaml should be generated with tripleoclient rather than
edited. This change adds in a pep8 task to verify that the appropriate
role files in roles/ have been modified to match how our default
roles_data.yaml is constructed. Additionally this change adds a new tox
target called 'genrolesdata' that will all you to automatically generate
roles_data.yaml and roles_data_undercloud.yaml
Change-Id: I5eb15443a131a122d1a4abf6fc15a3ac3e15941b
Related-Blueprint: example-custom-role-environments
|
|
|
|
The ComputeHCI role is meant to be a copy of the Compute role
except it hosts CephOSD and uses StorageMgmt.
Change-Id: Ic8fc5e672361a652ef19199a941c87247ca6925d
|
|
This is necessary for accessing the bind mounted hieradata in the
container in order to determine if the node is the primary node.
With the new validation added to yaml-validate.py, we could spot
potential issues in sahara-api and keystone bootstrap tasks.
The keystone one is a false positive, as the image defaults to the root
user in order to be able to run apache. Still, it is better to be
consistent here and specify the root user nonetheless.
Change-Id: Ib0ff9748d5406f507261e506c19b96750b10e846
Closes-Bug: #1697917
|
|
Mounting host volumes when running containers via puppet_config already
works and is supported with docker-puppet.py. However, the validation in
yaml-validate.py does not allow it. This patch makes it allowed by the
validation.
It is sometimes necessary since some puppet modules expect to make
persistent file system changes other than just configuration data under
/etc.
In particular, ironic inspector expects to configure a http and tftp
root director with an ipxe configuration. See:
https://github.com/openstack/puppet-ironic/blob/master/manifests/inspector.pp
These changes would be lost if the value for those directories are not
mounted as host volumes.
Change-Id: Ie51c653f4c666fbaaef0ea80990e2e61f4b1353b
|
|
This commit consistently defines a heat template parameter in the form
of DockerXXXConfigImage where XXX represents the name of the
config_volume that is used by docker-puppet.
The goal is to mitigate hard to debug errors where the templates would
set different defaults for the image docker-puppet.py uses to run, for
the same config_volume name.
This fixes a couple of inconsistencies on the way.
Change-Id: I212020a76622a03521385a6cae4ce73e51ce5b6b
Closes-Bug: #1699791
|
|
Many of our parameters are defined in multiple templates, but
currently there is no easy way of checking that all of those
definitions match. It can be confusing when a parameter is defined
one way in one file and another way in a different file. For example,
the NovaWorkers description is:
Number of workers for Nova API service.
and
Number of workers for Nova Placement API service.
and
Number of workers for Nova Conductor service.
Which is it actually? All of them. That one parameter controls
the workers for all of the nova services, and its description should
reflect that, no matter which template you happen to look at.
This change adds a check to yaml-validate.py to catch these sorts of
inconsistencies and allow us to eventually prevent new ones from
getting into the templates.
An exclusion mechanism is included because there are some parameter
definitions we probably can't/shouldn't change. In particular, this
includes the network cidrs which are defaulted to ipv4 addresses in
the ipv4 net-iso templates and ipv6 in the ipv6 templates. It's
possible a user would be relying on one of those defaults in their
configuration, so if we change it they might break.
To get around that, the tool explicitly ignores the default field of
those parameters, while still checking the description and type fields
so we maintain some sanity. There may be other parameters where this
is an issue, but those can be added later as they are found.
For the moment any inconsistencies are soft-fails. A failure message
will be printed, but the return value will not be affected so we can
add the tool without first having to fix every divergent parameter
definition in tripleo-heat-templates (and there appear to be plenty).
This will allow us to gradually fix the parameters over time, and
once that is done we can make this a hard-fail.
Change-Id: Ib8b2cb5e610022d2bbcec9f2e2d30d9a7c2be511
Partial-Bug: 1700664
|
|
|
|
We're not going to want to list every single sample environment in
a single file, so let's also take a directory and just read every
yaml file in it. This commit adds support for that as well as
some initial environments to demonstrate its use.
Change-Id: If2c608f2a61fc5e16784ab594d23f1fa335e1d3c
|
|
Move to one common services.yaml not only reduces the duplication, but it
should improve performance for the docker/services.yaml case, because we were
creating two ResourceChains with $many services which we know can be really
slow (especially since we seem to be missing concurrent: true on one)
Change-Id: I76f188438bfc6449b152c2861d99738e6eb3c61b
|
|
When a service is enabled on multiple roles, the parameters for the
service will be global. This change enables an option to provide
role specific parameter to services and other templates.
Two new parameters - RoleName and RoleParameters, are added to the
service template. RoleName provides the role name of on which the
current instance of the service is being applied on. RoleParameters
provides the list of parameters which are configured specific to the
role in the environment file, like below:
parameters_default:
# Default value for applied to all roles
NovaReservedHostMemory: 2048
ComputeDpdkParameters:
# Applied only to ComputeDpdk role
NovaReservedHostMemory: 4096
In above sample, the cluster contains 2 roles - Compute, ComputeDpdk.
The values of ComputeDpdkParameters will be passed on to the templates
as RoleParameters while creating the stack for ComputeDpdk role. The
parameter which supports role specific configuration, should find the
parameter first in in the RoleParameters list, if not found, then the
default (for all roles) should be used.
Implements: blueprint tripleo-derive-parameters
Change-Id: I72376a803ec6b2ed93903cc0c95a6ffce718b6dc
|
|
* Split it to REQUIRED/OPTIONAL
* Move puppet_tags to OPTIONAL as it already has a
default set of tags that need not to be repeated
explicitly.
Change-Id: Ib70176f1edf61228771c983b0c3231fb7939a316
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
|
|
Note: since it replaces rabbitmq, in order to aim for the smallest
amount of changes the service_name is called 'rabbitmq' so all the
other services do not need additional logic to use qdr.
Depends-On: Idecbbabdd4f06a37ff0cfb34dc23732b1176a608
Change-Id: I27f01d2570fa32de91ffe1991dc873cdf2293dbc
|
|
We've decided to use volumes for configuration wherever possible.
This means moving away from kolla_config blocks in the templates.
Update pep8 to reflect this.
Change-Id: If1ec40d0e5a515eed35e0cd04711079294f358c3
|
|
This section will be needed for TLS-everywhere. So it should be added as
optional in the yaml-validate.
Change-Id: Ic6ea563b6c8e454cb51f640bb5aaa3adda82a5dd
|
|
This implements a host_prep_tasks hook where we can specify Ansible
tasks to perform on the host before deploying containerized
services. The hook runs in a single step, the assumption is that we will
mostly use the hook for creating per-service directories on the host to
ensure we are able to mount them into the containers. (We cannot do this
operation via Puppet because all containerized services run their Puppet
within a config container, so Puppet doesn't have access to host's
filesystem.)
Change-Id: I7d8bac39e0cd422fd651eefe29f7d10941ab4a1a
|
|
This patch adds the beginning of a set of unit tests
for the new docker services templates. This should help
us the new interfaces as they evolve.
Change-Id: I98a73cf090ebab4593a682f5f34c0950d37e010c
|
|
Until bug #1635409 is fixed we'll have to keep the default list
of services deployed by hyperconverged-ceph.yaml in sync with the
ServicesDefault list provided in roles_data.yaml
This change adds some logic in the templates validation script to
ensure that is preserved with future updates.
Change-Id: Ib767f9a24c3541b16f96bd6b6455cf797113fbd8
|
|
When fixing LP#1643487 we added ?bind_address to all DB URIs.
Since this clashes with Cellsv2 due to the URIs becoming host
dependent, we need a new approach to pass bind_address to pymysql
that leaves the DB URIs host-independent.
In change Iff8bd2d9ee85f7bb1445aa2e1b3cfbff1f397b18 we first create a
/etc/my.cnf.d/tripleo.cnf file with a [tripleo] section with the correct
bind-address option.
In this change we make sure that the DB URIs will point to the added
file and to the specific section containing the necessary bind-address
option. We do introduce a new MySQLClient profile which will hold all
this more client-specific configuration so that this change can fit
better in the composable roles work. Also, in the future it might
contain the necessary configuration for SSL for example.
Note that in case the /etc/my.cnf.d/tripleo.cnf file does not exist
(because it is created via the mysqlclient profile), things keep on
working as usual and the bind-address option simply won't be set, which
has no impact on hosts where there are no VIPs.
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Change-Id: Ieac33efe38f32e949fd89545eb1cd8e0fe114a12
Related-Bug: #1643487
Closes-Bug: #1663181
Closes-Bug: #1664524
Depends-On: Iff8bd2d9ee85f7bb1445aa2e1b3cfbff1f397b18
|
|
This reverts the changes in https://review.openstack.org/414629 for nova as
they are incompatible with cell_v2.
This is a temporary fix for HA while a long-term solution is developed.
Change-Id: I79d30a2d76a354999152c0c997ea77f104c51027
Related-bug: #1643487
Closes-bug: #1662344
|
|
Currently we are applying this validation for the services templates, this
submission moves it to run with all templates.
Also fixed those templates not using the alias name.
Change-Id: I3a2c0ce6adcc8061fdc51f73fdc6b9748c0fead9
|
|
this attempts to make the error message more useful. This error message
happens if the environment files containing endpoint map overrides
haven't been updated to match the base endpoint map (or the defaults).
Change-Id: If53d3a9d7848aed62ebb235afe8b14c18d1b284d
|
|
Quick verification to check that the release name
is used instead the date.
Im also adding here all the updated templates required
to pass the check and merge this check as soon as possible.
Change-Id: Ifdc9ac4a9d0a4872d3e21672c93fc87da2e68a4e
|
|
This validation checks that the TLS-related environment files contain
all of the services defined in the base endpoint map. This will
hopefully help to keep them updated.
Change-Id: I58df72e104d8eb74e577484405f15e0a6f92d0ce
|
|
When a service connects to the database VIP from the node hosting this
VIP, the resulting TCP socket has a src address which is by default
bound to the VIP as well. If the VIP is failed over to another node
while the socket's Send-Q is not empty, TCP keepalive won't engage and
the service will become unavailable for a very long time (by default
more than 10m).
To prevent failover issues, DB connections should have the src address
of their TCP socket bound to the IP of the network interface used for
MySQL traffic. This is achieved by passing a new option to the
database connection URIs. This option is available starting from
PyMySQL 0.7.9-2.
We use a new intermediate variable in hiera to hold the IP to be used
as a source address for all DB connections. All services adapt their
database URI accordingly.
Moreover, a new YAML validation check is added to guarantee that new
services will construct their database URI appropriately.
Change-Id: Ic69de63acbfb992314ea30a3a9b17c0b5341c035
Closes-Bug: #1643487
|
|
The first step of generating the Service chain resources via j2,
we'll then incrementally convert other resources to be created
in a similar way.
Partially-Implements: blueprint custom-roles
Depends-On: I81239991f36ed5f6453184bf9cffe930832cb68b
Change-Id: Iafa9b2afddf18a5a9833ec472a552fb256338b38
|