Age | Commit message | Author | Files | Lines |
|
In two places during upgrade we manually trigger puppet.
There can be a problem when new puppet modules are added and their
corresponding symlinks in /etc/puppet/modules are not created during
the installation, as they are installed in
/usr/share/openstack-puppet/modules. To prevent the issue tripleo sets
modulepath in the templates.
We must use the same modulepath to make sure that we don't fail
because of a missing module in the manual puppet run.
This particularly happens when you upgrade from M->N->O, as the base
image in Mitaka doesn't have the proper symlinks and they are not
created during the installation of the package.
Closes-Bug: #1684587
Change-Id: I79df6ea33f1c58e13309176a6de41b7572541fd6
(cherry picked from commit 79c2d0f3d411da9e57731d9da79d25a3e0364eb2)
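For illustration, a minimal sketch of a manual puppet run reusing that
modulepath (the manifest path is only an example):
  puppet apply \
    --modulepath /etc/puppet/modules:/usr/share/openstack-puppet/modules \
    /root/example_manifest.pp   # illustrative manifest path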
|
|
|
|
|
|
To ensure that yum update passes without issues we touch ssl.conf.
Proper fix is https://review.openstack.org/#/c/456712/
Depends-On: Ic5a0719f67d3795a9edca25284d1cf6f088073e8
Closes-Bug: #1682448
Resolves: rhbz#1441977
Change-Id: I73e5272c64df4aa5900f544a5d9f0670544ca679
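A minimal sketch of the workaround, assuming mod_ssl's usual config path
on RHEL/CentOS:
  # ensure ssl.conf exists so yum update can proceed (path assumed)
  touch /etc/httpd/conf.d/ssl.conf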
|
|
The current check tends to produce a false positive, causing unnecessary
service restarts. yum check-update will exit with return code 100 if
updated packages are available.
Change-Id: I8bd89f2b24bafc6c991382b9eb484cfa9a2f8968
(cherry picked from commit 9e4375d2762f4a26e8b0b8375f9265ad6e439ea1)
Closes-Bug: #1680634
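For reference, a minimal sketch of how the return codes can be interpreted:
  # yum check-update: 0 = no updates, 100 = updates available, anything else = error
  yum -q check-update
  case $? in
    0)   echo "No updates available" ;;
    100) echo "Updates available" ;;
    *)   echo "yum check-update failed" >&2 ;;
  esac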
|
|
This reverts commit b323f8a16035549d84cdec4718380bde3d23d6c3 and uses
the new logic in puppet-tripleo, basically doing the same.
Closes-Bug: 1665641
Depends-On: Ifd6fa5b398d98e8998630ea0c9a2ce9867ceba2b
Change-Id: Ib5cb0578be2993af0a0b8675005d838640bdb139
(cherry picked from commit 76c1c0cbba38b2f25290f5ad80e38ddd97ae834b)
|
|
In [1] we removed the previously used special case upgrade code.
However we have since discovered that for openvswitch 2.5.0-14
the special case is still required with an extra flag to prevent
the restart. This adds the upgrade code back into the minor
update and 'manual upgrade' scripts for compute/swift. The
review at If998704b3c4199bbae8a1d068c31a71763f5c8a2 is adding
this logic for the ansible upgrade steps.
Related-Bug: 1669714
[1] https://review.openstack.org/#/q/59e5f9597eb37f69045e470eb457b878728477d7
Change-Id: I3e5899e2d831b89745b2f37e61ff69dbf83ff595
(cherry picked from commit 25983882c2f7a8e8f8fb83bd967a67d008a556a4)
|
|
The attempt to check galera's cluster status fails when the galera
service is not running on the same node.
Change-Id: I27fb0841d85cd0dc86e92ac2e21eedf5f8f863ab
Closes-Bug: #1677574
(cherry picked from commit d39c952fd3150d24c9e01c15806181715d0760f8 )
|
|
|
|
|
|
The UpdateDeployment already depends on NetworkDeployment.
We should not run os-net-config unconditionally before update.
Closes-Bug: #1666227
Change-Id: I48cbf5de00d47c6fdad71ff24c00e9db05cec5d5
(cherry picked from commit b19d6306ea582dc31ebfd609475d9ac4e641e278)
|
|
stable/ocata
|
|
Removes some of the now-unused scripts and templates that were used by
the upgrades workflow in previous versions.
Closes-Bug: 1673447
Change-Id: I7831d20eae6ab9668a919b451301fe669e2b1346
(cherry picked from commit 521a8973229484d52c03e9ed04782c5dc493c1b0)
|
|
Removed from tripleo_upgrade_node.sh (major upgrade) & yum_update.sh
(minor update). The workaround is no longer needed and in fact has the
opposite effect, killing connectivity to the node. The 'normal' yum
update on nodes delivers the latest openvswitch 2.6.1 with no drama.
Also adds a 'complete' message and some extra debug echo statements
for the logs, and removes the python-zaqarclient install, which is no
longer needed.
Closes-Bug: 1669714
Change-Id: Icd1517bcade36781fa0da21d045ffd9ec68efc38
(cherry picked from commit 9025a3bc23834e31efc5021acaef80b8d0f5de73)
|
|
Package update fails on compute nodes when yum_update checks for
pacemaker status via the systemctl command. This happens because the
exit-on-error (-e) option has been enabled recently. Fix it by
executing the command only on nodes where pacemaker is enabled.
Closes-Bug: #1668266
Change-Id: I2aae4e2fdfec526c835f8967b54e1db3757bca17
(cherry picked from commit e9a2fdc0afd2a3f1242f397c5f164cf6b43c2669)
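A minimal sketch of the guarded check, assuming a hiera key named
pacemaker_enabled (hypothetical) marks nodes running pacemaker:
  if [ "$(hiera -c /etc/puppet/hiera.yaml pacemaker_enabled)" = "true" ]; then
    pacemaker_status=$(systemctl is-active pacemaker)
  fi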
|
|
Adds two checks, one for the CephMon and one for the CephOSD upgrade
tasks borrowed from ceph-ansible.
Change-Id: I0a0e60d277240130c6bd76a74ccc13354b87a30a
Co-Authored-By: Sebastien Han <seb@redhat.com>
(cherry picked from commit a3df16776dd5d7eb0a60ca4c58cef9913eb1c5cb)
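An illustrative health gate of the kind ceph-ansible uses (the exact
task commands may differ):
  # refuse to continue the upgrade unless the cluster reports HEALTH_OK
  ceph health | grep -qw HEALTH_OK || { echo "Ceph cluster is not healthy"; exit 1; }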
|
|
|
|
into stable/ocata
|
|
And change the conditional to use hiera instead.
Change-Id: Icf91dd91c0ab04e7919172fcfd130183bfd427b4
(cherry picked from commit d8e75b220efec3b17a76bed6898327784fb4e6cc)
|
|
We want to apply a puppet manifest for the non-controller role, but we
need to apply it in stages. By loading the proper hieradata we get the
needed step configuration.
Change-Id: I07bfeee7b7d9a9b8c2c20e5d5c9ed735d0bfc842
Closes-Bug: #1664304
(cherry picked from commit 237cd2004a2c0869d60d0e11e9dccd59e809ff90)
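A minimal sketch of the staged apply, with the hieradata file and
manifest path named here purely for illustration:
  for step in 1 2 3 4 5 6; do
    # expose the current step to hiera, then apply the role manifest
    echo "{\"step\": ${step}}" > /etc/puppet/hieradata/step.json
    puppet apply /root/non_controller_manifest.pp
  done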
|
|
Swift rings created or updated on the overcloud nodes will now be
stored on the undercloud at the end of the deployment. An
additional consistency check is executed before storing them,
ensuring all rings within the cluster are identical.
These rings will be retrieved (before Puppet runs) by every node
when an UPDATE is executed, ensuring they are in a consistent state
across the cluster.
This makes it possible to add, remove or replace nodes in an
existing cluster without manual operator interaction.
Closes-Bug: 1609421
Depends-On: Ic3da38cffdd993c768bdb137c17d625dff1aa372
Change-Id: I758179182265da5160c06bb95f4c6258dc0edcd6
(cherry picked from commit b323f8a16035549d84cdec4718380bde3d23d6c3)
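An illustrative consistency check (not necessarily the exact one
implemented): swift-recon can report whether the ring md5sums match on
every node:
  swift-recon --md5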
|
|
|
|
We want to run puppet on each role which has the
disable_upgrade_deployment flag set to true. It will run after the
upgrade of the role and before running the whole converge step.
Change-Id: Ia85be688d070dfb5b8337e8ef3c4bc439fb6052e
|
|
We do not need the upgrade scripts used to migrate Ceph from
hammer to jewel. This submission removes them, along with the legacy
upgrade scripts used for the BlockStorage role.
Change-Id: I2674216dd9b5b849de6a2624ee1115420a254182
|
|
This delivers a /root/tripleo_upgrade_node.sh to those nodes
that have the disable_upgrade_deployment flag set to true.
They will later be upgraded manually by the operator, who will
invoke the script delivered here using upgrade-non-controller.sh.
We can also deliver any service-specific upgrade configuration,
such as configuring nova-compute to use the placement API, as this
is required in order for placement to be configured and installed
during the subsequent upgrade steps for controller services.
This removes the compute and swift specific upgrade scripts as
they are now merged into the common tripleo_upgrade_node.sh,
removing any hard-coded reference to a particular role name
(compute/objectstorage) and relying only on the
disable_upgrade_deployment flag in roles_data.yaml.
Change-Id: I4531a4038b78087ef4a1a62c35f1328822427817
Co-Authored-By: Mathieu Bultel <mbultel@redhat.com>
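An illustrative operator invocation (the node name is an example):
  upgrade-non-controller.sh --upgrade overcloud-novacompute-0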
|
|
|
|
|
|
|
|
We only need to know if the pacemaker service is in the active state.
Change-Id: Id5e16f2bbbe51b8a0c250eb5d35e89e61a7b3383
Resolves: rhbz#1414779
Closes-Bug: #1656980
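A minimal sketch of such a check:
  # succeed only when pacemaker reports "active"; no output parsing needed
  if systemctl --quiet is-active pacemaker; then
    echo "pacemaker is active"
  fi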
|
|
Glance registry is not required for v2 of the API and there are
plans to deprecate it in the glance community.
Let's remove v1 support since it has been deprecated for a while in
Glance.
Depends-On: I77db1e1789fba0fb8ac014d6d1f8f5a8ae98ae84
Co-Authored: Flavio Percoco <flaper87@gmail.com>
Change-Id: I0cd722e8c5a43fd19336e23a7fada71c257a8e2d
|
|
Heat now supports release name aliases, so we can replace
the inconsistent mix of date-related versions with one consistent
version that aligns with the supported version of heat for this
t-h-t branch.
This should also help new users who sometimes copy/paste old templates
and discover intrinsic functions in the t-h-t docs don't work because
their template version is too old.
Change-Id: Ib415e7290fea27447460baa280291492df197e54
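For example, on this branch a date version would be rewritten to the
release alias (shown here as a sketch; 'ocata' is assumed to be the
matching alias):
  # replace e.g. 'heat_template_version: 2016-10-14' with the release alias
  sed -i 's/^heat_template_version: .*/heat_template_version: ocata/' example-template.yaml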
|
|
|
|
For now, don't run anything in yum_update.sh when it is run from
inside the heat-agents container. A mechanism for doing a yum update
on the host can be worked out later, but for now a yum update should
never be run inside a container.
Change-Id: I73d37578f8b2dc9c3029b968b1ef74ef4894100a
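A minimal sketch of the early exit, assuming the container is detected
via the /.dockerenv marker (the real detection may differ):
  if [ -f /.dockerenv ]; then
    echo "Skipping yum update inside the heat-agents container"
    exit 0
  fi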
|
|
In I9b1f0eaa0d36a28e20b507bec6a4e9b3af1781ae and
I11fcf688982ceda5eef7afc8904afae44300c2d9 we added a manual step
for upgrading openvswitch in order to specify the --nopostun
as discussed in the bug below.
This change adds a minor update to make this workaround more
robust. It removes any existing rpms that may be around from
an earlier run, and also checks that the rpms installed are
at least newer than the version we are on.
This also refactors the code into a common definition in the
pacemaker_common_functions.sh which is included even for the
heredocs generating upgrade scripts during init. Thanks
Sofer Athlan-Guyot and Jirka Stransky for help with that.
Change-Id: Idc863de7b5a8c116c990ee8c1472cfe377836d37
Related-Bug: 1635205
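A minimal sketch of the workaround (package and version handling
simplified):
  rm -f ./openvswitch-*.rpm                       # drop rpms left over from an earlier run
  yumdownloader --resolve openvswitch             # fetch the newer package locally
  rpm -U --replacepkgs --nopostun ./openvswitch-*.rpm   # upgrade without running %postun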
|
|
There are scenarios in which findmnt will return a list of all
mounted filesystems, which causes the upgrade script to fail to
recognize whether the Ceph OSD is backed by ext4.
Change-Id: Iadebdc32b523c05216202b782ceb54bec4389413
Closes-Bug: #1649407
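A minimal sketch of the intended check, with the OSD data path used
here only for illustration:
  fstype=$(findmnt -n -o FSTYPE --target /var/lib/ceph/osd)
  if [ "$fstype" = "ext4" ]; then
    echo "Ceph OSD is backed by ext4"
  fi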
|
|
neutron.conf is found by virtue of the oslo.config auto-discovery
mechanism, and plugin.ini is no longer needed since Juno because the
schema no longer depends on the plugin used.
While at it, switched head -> heads to reflect recent changes in neutron
with multiple alembic branches. The old format still works, but 'heads'
is slightly more encouraged.
Change-Id: I614a6d43087fa231f0d582bab10a82480aaefda5
Related: Icc4de9824ef95781a1d060534973c2bbf8e03059
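For example, the invocation shrinks to (the old form shown for contrast):
  # before: neutron-db-manage --config-file /etc/neutron/neutron.conf \
  #           --config-file /etc/neutron/plugin.ini upgrade head
  neutron-db-manage upgrade heads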
|
|
|
|
Running os-net-config before restarting the cluster prevents changes
to the interface files, caused by changes to the implementation, from
bouncing network interfaces after the cluster has restarted.
Closes-Bug: #1644138
Change-Id: I65fb104465ff3d37ddc791634302994334136014
|
|
During the ceilometer pre-upgrade, the rabbit host config gets
overridden in ceilometer.conf as it is reset to defaults. This
explicitly sets the host info in the standalone manifest.
Closes-Bug: #1644278
Change-Id: I862ea7165c5d42ba1f9a19111a8be8934c0ef883
|
|
In I9b1f0eaa0d36a28e20b507bec6a4e9b3af1781ae and
I11fcf688982ceda5eef7afc8904afae44300c2d9 we landed a workaround
for the openvswitch 2.4 to 2.5 upgrade discussed in the bug below.
Unfortunately testing has revealed a problem with the minor update
case, specifically for non-controllers. It seems we would exit
before the ovs workaround has had a chance to execute. This moves
the block up a few lines to avoid this condition. As with the
other two reviews noted here, this will need to go into newton
and then mitaka too.
Change-Id: If905de82d96302334ebe02de9c43f00faed9b72b
Related-Bug: 1635205
|
|
|
|
|
|
https://review.openstack.org/#/c/388688/ has removed ceilometer-dbsync,
so ceilometer-upgrade must be used instead.
Additionally, ceilometer-dbsync enabled the --skip-gnocchi-resource-types
option and ceilometer-upgrade doesn't, so it is set by default here to
ensure backwards compatibility.
Note this is based on the corresponding fix to puppet-ceilometer ref
https://review.openstack.org/#/c/396570
Change-Id: Ic0a15c75d1cd3e3f70eeafd9ba09d50c58cc1293
Closes-Bug: #1641076
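The replacement call with the flag passed explicitly would look like:
  ceilometer-upgrade --skip-gnocchi-resource-types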
|
|
Deployments using an external LB will fail like this:
deploy_stderr: |
+ RESTART_FOLDER=/var/lib/tripleo/pacemaker-restarts
+ [[ -d /var/lib/tripleo/pacemaker-restarts ]]
++ systemctl is-active haproxy
+ haproxy_status=unknown
deploy_status_code: 3
openstack software deployment show 4f339ca4-7600-4ca0-b0ef-f798bc47b6cf
The reason is that via https://review.openstack.org/#/c/393644/ we
introduced the haproxy restart like this:
haproxy_status=$(systemctl is-active haproxy)
if [ "$haproxy_status" = "active" ]; then
systemctl reload haproxy
fi
The problem is that if haproxy is not running/installed systemctl
is-active can fail and the script will terminate with an error return
code. Let's just move the call inside the if so the script does not fail
in case haproxy is not there.
The snippet before the change (on a system without haproxy installed):
[root@mrg-09 tmp]# ./test.sh
++ systemctl is-active haproxy
+ haproxy_status=unknown
[root@mrg-09 tmp]# echo $?
3
After this change:
[root@mrg-09 tmp]# ./test.sh
++ systemctl is-active haproxy
+ '[' unknown = active ']'
[root@mrg-09 tmp]# echo $?
0
Change-Id: I837c63a9dbcde8c922f843c442974fa79cf1eede
Closes-Bug: #1641904
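One way to express the move (a sketch; the actual patch may phrase the
test differently):
  if systemctl --quiet is-active haproxy; then
      systemctl reload haproxy
  fi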
|
|
In ocata we changed the ha policy to "ha-exactly" via the following changes:
- tht: Iace6daf27a76cb8ef1050ada0de7ff1f530916c6
- puppet-tripleo: Ib62001c03e1e08f58cf0c6e0ba07a8879a584084
We initially also took care of changing this policy (which is set in the
pacemaker resource agent) for the M/N upgrade path:
I2468a096b5d7042bc801a742a7a85fb1521c1c02
In the end we decided against changing the policy in Newton as well (it
was only for ocata) as it was too close to the release date and we took
the safer path.
This patch does two things:
1) It renames the upgrade function to "newton_ocata" since that is the
only upgrade path we need to take care of
2) It reinstates the actual upgrade function which was mistakenly
removed via an unrelated change in the ceilometer upgrade path:
If9d6987cd0a8fc5d3f9de518ba422d97d5149732
Closes-Bug: #1628998
Change-Id: I3a97505d2ae1ae27f3080ffe74c33fdabffd2420
|
|
|
|
|
|
Currently when we call the major-upgrade step we do the following:
"""
...
if [[ -n $(is_bootstrap_node) ]]; then
check_clean_cluster
fi
...
if [[ -n $(is_bootstrap_node) ]]; then
migrate_full_to_ng_ha
fi
...
for service in $(services_to_migrate); do
manage_systemd_service stop "${service%%-clone}"
...
done
"""
The problem with the above code is that it is open to the following race
condition:
1. Code gets run first on a non-bootstrap controller node so we start
stopping a bunch of services
2. Pacemaker will notice that the services are down and will mark
them as stopped
3. Code gets run on the bootstrap node (controller-0) and the
check_clean_cluster function will fail and exit
4. Eventually the script on the non-bootstrap controller node will
also time out and exit because the cluster never shut down (it never
actually started the shutdown because we failed at 3)
Let's make sure we first call only the HA NG migration step, as a
separate heat step. Only afterwards do we start shutting down the
systemd services on all nodes.
We also need to move the STONITH_STATE variable into a file because it
is being used across two different scripts (1 and 2) and we need to
store that state.
Co-Authored-By: Athlan-Guyot Sofer <sathlang@redhat.com>
Closes-Bug: #1640407
Change-Id: Ifb9b9e633fcc77604cca2590071656f4b2275c60
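A minimal sketch of persisting that state across the two scripts (the
file path is illustrative):
  STONITH_STATE=$(pcs property show stonith-enabled | grep stonith-enabled | awk '{print $2}')
  echo "$STONITH_STATE" > /var/tmp/stonith-state   # written by script 1
  STONITH_STATE=$(cat /var/tmp/stonith-state)      # read back by script 2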
|
|
After compute nodes are upgraded, the ceilometer compute agent
doesn't poll and throws warnings. Restarting the compute agent
at this step gets the service back to its normal state.
Closes-Bug: #1640177
Change-Id: I7392de43e933b1d16002e12e407748ae289d5e99
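An illustrative restart (assuming the RDO service name):
  systemctl restart openstack-ceilometer-compute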
|
|
After deploying a freshly installed overcloud or updating the stack,
the haproxy configuration is updated correctly but no change in the
HAProxy stats happens.
This submission adds the missing resources to run pre and post
puppet tasks.
Closes-bug: 1640175
Change-Id: I2f08704daeee502c618256695a30ce244a1d7ba5
|