The ceph-osd package is only required on nodes hosting the CephOSD
service, but the package's presence on other nodes may interfere with
software updates. That's because some distros distribute Ceph software
in different channels, and not all nodes have access to the ceph-osd
channel.
There are two parts to the fix, and the first is an enhancement to the
yum update process. The process detects when the ceph-osd package is not
required, and removes the package from the node.
The second part takes ceph-osd out of the default list of packages
needed by puppet-ceph. The ceph-osd package is listed only on the nodes
hosting the CephOSD service.
Closes-Bug: #1713292
Change-Id: I7a581518ed25cf5f264abfaabfcf2041363a065b
(cherry picked from commit 5a89ea21f2add98119a10464b020a98999d31c41)
|
|
For bug 1708115 and the O..P upgrade, we now generate ansible playbooks
from the collected service upgrade_tasks for the upgrade of
'non-controllers'. These playbooks are executed instead of the legacy
tripleo_upgrade_node.sh.
To clarify, by 'non-controllers' it is meant any node for which
the corresponding roles_data.yaml role has the
disable_upgrade_deployment flag set True.
As a first pass, I am removing the workarounds from the script but
keeping its delivery mechanism for now in case it is still needed.
We can either update here to remove it or keep it until next cycle.
The most important part for now is that we no longer 'manually'
run puppet here. Instead the post_deploy_steps are also collected
into a playbook and will be executed after the upgrade_tasks
(see the bug for discussion of the mechanism and related reviews).
Change-Id: Ib017b0ab435ca9558cf8659d434489cdf01df955
Related-Bug: 1708115
(cherry picked from commit 4c5b9c5c967105536106fa4a7e1ec2352b14b08c)
|
|
This change stops and disables the openstack-nova-compute service
on the compute nodes during the upgrade to the containers architecture.
Closes-bug: 1708371
Change-Id: I9ca909d4e91d0a0e4de15572f727f959d9185c64
|
|
Adds this into the tripleo_upgrade_node.sh executed by the
operator for the major upgrade; see the bug for more info.
Change-Id: Ic54b48b149594e8ea08e95152111bcdaf7b252b7
Closes-Bug: 1707926
|
|
To be consistent with all other SoftwareDeployments in
tripleo-heat-templates, this sets the name property on
the deployments where it was missing.
Change-Id: I8bc062d2af93acead240bd5e473ea385b2bf6cf2
|
|
Checks for an existing /var/run/yum.pid and, if one is found, exits 1
with an error message saying why.
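A minimal sketch of such a guard, assuming a hypothetical helper name
(check_yum_not_running) and an optional path argument added only so the
check can be exercised against a temporary file. The wording of the error
message is illustrative, not the one used by the actual script:

```shell
# Hypothetical helper: name, argument and message are illustrative only.
# Returns 1 (with an explanation on stderr) when a yum pid file exists.
check_yum_not_running() {
    pid_file="${1:-/var/run/yum.pid}"
    if [ -e "$pid_file" ]; then
        echo "ERROR: found existing $pid_file; another yum process may be running, refusing to continue" >&2
        return 1
    fi
    return 0
}

# A caller that wants the behaviour described above would abort on failure:
#   check_yum_not_running || exit 1
```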
Change-Id: I374eeb4164a8007ae67fea2796eac109fffdef97
Closes-Bug: 1704131
|
|
To work around a yum bug with libnss we need to make the yum cache
before running the update. In fact we should have done this
regardless of the bug.
Change-Id: I5b2355fb8abe3c8d4b9ce9c62b9ffdba8c1e8d9d
Resolves: rhbz#1458841
Closes-Bug: #1703830
|
|
There is a Heat patch posted (via Depends-On) that resolves the issue
that caused this to be reverted. This reverts the revert, and we need to
make sure all the upgrade jobs pass before we merge this patch.
This reverts commit 69936229f4def703cd44ab164d8d1989c9fa37cb.
Closes-Bug: #1699463
implements blueprint disable-deployments
Change-Id: Iedf680fddfbfc020d301bec8837a0cb98d481eb5
|
|
This reverts commit d6c0979eb3de79b8c3a79ea5798498f0241eb32d.
This seems to be causing issues in Heat in upgrades.
Change-Id: I379fb2133358ba9c3c989c98a2dd399ad064f706
Related-Bug: #1699463
|
|
Commit I46941e54a476c7cc8645cd1aff391c9c6c5434de added support for
blacklisting servers from triggered Heat deployments.
This commit adds that functionality to the remaining Deployments in
tripleo-heat-templates for the ExtraConfig interfaces.
Since we can not (should not) change the interface to ExtraConfig, Heat
conditions are used on the actual <role>ExtraConfigPre and
NodeExtraConfig resources instead of using the actions approach on
Deployments.
Change-Id: I38fdb50d1d966a6c3651980c52298317fa3bece4
|
|
The bootstrap_nodeid can have capital letters while the hostname may
not. In puppet we use downcase for this comparison, so let's follow a
similar pattern for scripts from THT.
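A small sketch of a case-insensitive comparison in shell, assuming a
hypothetical helper name (is_bootstrap_node) and that both values are
already available as strings. It lowercases both sides with tr rather than
relying on any bash-specific expansion:

```shell
# Hypothetical helper: compares a bootstrap node id with a hostname,
# ignoring case. Both names are passed in as plain strings.
is_bootstrap_node() {
    node_id=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    host=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    [ "$node_id" = "$host" ]
}

# Example call; BOOTSTRAP_NODEID is an assumed variable, not a name
# taken from the templates:
#   if is_bootstrap_node "$BOOTSTRAP_NODEID" "$(hostname -s)"; then ...; fi
```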
Change-Id: I8a0bec4a6f3ed0b4f2289cbe7023344fb284edf7
Closes-Bug: #16998201
|
|
We need to ensure that the pacemaker cluster restarts
at the end of the deployment.
Because of the resource renaming, the postconfig resource
was not added at the end of the deployment, where
*postpuppet used to run.
Closes-bug: 1695904
Change-Id: Ic6978fcff591635223b354831cd6cbe0802316cf
|
|
Master is now the development branch for Pike, so this
changes the release alias name.
Change-Id: I938e4a983e361aefcaa0bd9a4226c296c5823127
|
|
In change I2aae4e2fdfec526c835f8967b54e1db3757bca17 we did the
following:
-pacemaker_status=$(systemctl is-active pacemaker || :)
+pacemaker_status=""
+if hiera -c /etc/puppet/hiera.yaml service_names | grep -q pacemaker; then
+ pacemaker_status=$(systemctl is-active pacemaker)
+fi
We did that because of LP#1668266: we did not want systemctl is-active to
fail on non-pacemaker nodes. The problem with the above hiera check is
that it will match on pacemaker_remote nodes as well.
We cannot piggyback on the pacemaker_enabled hiera key because that is true
on all nodes. So let's make the test check only for the pacemaker service
without matching pacemaker_remote. Tested with:
1) Test on a controller node with pacemaker service enabled
[root@overcloud-controller-0 ~]# hiera -c /etc/puppet/hiera.yaml -a service_names |grep '\bpacemaker\b'
"pacemaker",
[root@overcloud-controller-0 ~]# echo $?
0
2) Test on a compute node without pacemaker:
[root@overcloud-novacompute-0 puppet]# hiera -c /etc/puppet/hiera.yaml service_names |grep '\bpacemaker\b'
[root@overcloud-novacompute-0 puppet]# echo $?
1
3) Test on a node with pacemaker_remote in the service_names key:
[root@overcloud-novacompute-0 puppet]# hiera -c /etc/puppet/hiera.yaml service_names |grep '\bpacemaker\b'
[root@overcloud-novacompute-0 puppet]# echo $?
1
[root@overcloud-novacompute-0 puppet]# hiera -c /etc/puppet/hiera.yaml service_names |grep '\bpacemaker_remote\b'
"pacemaker_remote"]
[root@overcloud-novacompute-0 puppet]# echo $?
0
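The three cases above can be reproduced without a deployed node by feeding
sample service_names output to a whole-word match. This is only a sketch:
the helper name (has_pacemaker_service) is hypothetical, and it uses
grep -w instead of the '\bpacemaker\b' pattern shown above, which has the
same effect here because the underscore counts as a word character:

```shell
# Hypothetical helper: reads a service_names listing on stdin and succeeds
# only if "pacemaker" appears as a whole word (so not "pacemaker_remote").
has_pacemaker_service() {
    grep -qw 'pacemaker'
}

# Example call against the real hiera data (command taken from the tests above):
#   hiera -c /etc/puppet/hiera.yaml service_names | has_pacemaker_service
```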
Change-Id: I54c5756ba6dea791aef89a79bc0b538ba02ae48a
Closes-Bug: #1688214
|
|
This adds openstack-nova-migration on the compute nodes during the upgrade.
Closes-Bug: #1687081
Depends-on: Iab022bdfb655e3c52fecebf416e75c9e981072ab
Depends-on: I02dc8934521340f42ac44a7d16889f6d79620c33
Change-Id: I3db2a3188e538eeaef61769d38f0166545444cfe
|
|
To test this change we deployed a stock master with IPv6, which created a
number of IPv6 VIPs with a /64 netmask:
[root@overcloud-controller-0 ~]# pcs resource show ip-fd00.fd00.fd00.2000..18
Resource: ip-fd00.fd00.fd00.2000..18 (class=ocf provider=heartbeat type=IPaddr2)
Attributes: ip=fd00:fd00:fd00:2000::18 cidr_netmask=64
Operations: start interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-start-interval-0s)
stop interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-stop-interval-0s)
monitor interval=10s timeout=20s (ip-fd00.fd00.fd00.2000..18-monitor-interval-10s)
Then we update the THT folder with this patch and upload the new scripts on the undercloud via:
openstack overcloud deploy --update-plan-only ....
Then we kick off the minor update workflow:
openstack overcloud update stack -i overcloud
Once the controller-0 node (bootstrap node for pacemaker) is completed we have the
correct VIP configuration:
[root@overcloud-controller-0 heat-config-script]# pcs resource show ip-fd00.fd00.fd00.2000..18
Resource: ip-fd00.fd00.fd00.2000..18 (class=ocf provider=heartbeat type=IPaddr2)
Attributes: ip=fd00:fd00:fd00:2000::18 cidr_netmask=128 nic=vlan20 lvs_ipv6_addrlabel=true lvs_ipv6_addrlabel_value=99
Operations: start interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-start-interval-0s)
stop interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-stop-interval-0s)
monitor interval=10s timeout=20s (ip-fd00.fd00.fd00.2000..18-monitor-interval-10s)
Also verified that running the script a second time does not alter the
(already fixed) VIPs.
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Change-Id: I765cd5c9b57134dff61f67ce726bf88af90f8090
|
|
Closes-Bug: 1686619
Change-Id: I7c32ca39a456de9833d30c31d41fcb727d2b0a34
|
|
We fixed pcs resources start/stop timeouts via
I587136d8d045d213875c657ea5a405074f80c8ad in Nov 2015 for Mitaka,
and there we stated:
This can be removed once updates from deployments made prior to
I6fc18f1ad876c5a25723710a3b20d8ec9519dcba are no longer supported.
We can now safely remove these updates as they are useless and cost time
anyway.
Change-Id: Ibad2b3eed0d08560d52d5ebe700746b61e5b8f51
|
|
The [Pre|Post]Puppet resources were renamed in
https://review.openstack.org/#/c/365763.
The intent was to give the pre/post deployment
steps a technology-agnostic name instead of
tying them to a specific tool.
The renaming was unintentionally reverted in
https://review.openstack.org/#/c/393644/ and
https://review.openstack.org/#/c/434451.
This submission merges both resources into one
and removes the old pre|post hooks.
Closes-bug: #1669756
Change-Id: Ic9d97f172efd2db74255363679b60f1d2dc4e064
|
|
In two places during the upgrade we manually trigger puppet.
There can be a problem when new puppet modules are added and their
corresponding symlinks in /etc/puppet/modules are not created during
the installation, as they are installed in
/usr/share/openstack-puppet/modules. To prevent the issue, TripleO sets
modulepath in the templates.
We must use the same modulepath to make sure that we don't fail
because of a missing module in the manual puppet run.
This particularly happens when you upgrade from M->N->O, as the base
image in Mitaka doesn't have the proper symlinks and they are not
created during the installation of the package.
Closes-Bug: #1684587
Change-Id: I79df6ea33f1c58e13309176a6de41b7572541fd6
|
|
Fetch the host public keys from each node, combine them all and write to the
system-wide ssh known hosts. The alternative of disabling host key
verification is vulnerable to a MITM attack.
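The "combine them all" step could look roughly like the sketch below. It
assumes the per-node public keys have already been fetched into separate
files in known_hosts line format; how they are fetched is not covered here,
and the helper name (combine_host_keys) is hypothetical:

```shell
# Hypothetical helper: merges several files of host key lines into one
# output file, dropping blank lines and exact duplicates.
# Usage: combine_host_keys OUTPUT_FILE INPUT_FILE...
combine_host_keys() {
    out_file="$1"
    shift
    cat "$@" | sed '/^[[:space:]]*$/d' | sort -u > "$out_file"
}

# The result could then be installed as the system-wide OpenSSH file,
# /etc/ssh/ssh_known_hosts (shown as an example, not taken from the patch):
#   combine_host_keys /etc/ssh/ssh_known_hosts /tmp/hostkeys/*
```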
Change-Id: Ib572b5910720b1991812256e68c975f7fbe2239c
|
|
This reverts commit b323f8a16035549d84cdec4718380bde3d23d6c3 and uses
the new logic in puppet-tripleo (see
Ifd6fa5b398d98e8998630ea0c9a2ce9867ceba2b), which basically does the same.
Closes-Bug: 1665641
Change-Id: Ib5cb0578be2993af0a0b8675005d838640bdb139
|
|
The current check tends to produce a false positive causing unnecessary
service restarts. yum check-update will exit with return code 100 if
updated packages are available.
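As a sketch of how the return code can be interpreted, the hypothetical
helper below (updates_available) takes the exit code of yum check-update as
an argument: 100 means updates are available, 0 means there is nothing to
update, and anything else is treated as an error. The helper name and its
own return values are illustrative, not taken from the script:

```shell
# Hypothetical helper: classify the exit code of "yum check-update".
#   returns 0 when updates are available (yum exit code 100)
#   returns 1 when there is nothing to update (yum exit code 0)
#   returns 2 on any other exit code, which is treated as a failure
updates_available() {
    case "$1" in
        100) return 0 ;;
        0)   return 1 ;;
        *)   echo "yum check-update failed with exit code $1" >&2
             return 2 ;;
    esac
}

# Example call that stays safe under "set -e":
#   rc=0; yum -q check-update >/dev/null || rc=$?
#   if updates_available "$rc"; then echo "packages need updating"; fi
```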
Change-Id: I8bd89f2b24bafc6c991382b9eb484cfa9a2f8968
|
|
In [1] we removed the previously used special case upgrade code.
However we have since discovered that for openvswitch 2.5.0-14
the special case is still required with an extra flag to prevent
the restart. This adds the upgrade code back into the minor
update and 'manual upgrade' scripts for compute/swift. The
review at If998704b3c4199bbae8a1d068c31a71763f5c8a2 is adding
this logic for the ansible upgrade steps.
Related-Bug: 1669714
[1] https://review.openstack.org/#/q/59e5f9597eb37f69045e470eb457b878728477d7
Change-Id: I3e5899e2d831b89745b2f37e61ff69dbf83ff595
|
|
Attempt to check galera's cluster status fails when galera service
is not running on the same node.
Change-Id: I27fb0841d85cd0dc86e92ac2e21eedf5f8f863ab
|
|
Removes some of the no longer used scripts and templates used by
the upgrades workflow in previous versions.
Change-Id: I7831d20eae6ab9668a919b451301fe669e2b1346
|
|
The UpdateDeployment already depends on NetworkDeployment.
We should not run os-net-config unconditionally before update.
Closes-Bug: #1666227
Change-Id: I48cbf5de00d47c6fdad71ff24c00e9db05cec5d5
|
|
Removed from the tripleo_upgrade_node.sh (major upgrade) & yum_update.sh
(minor update). The workaround is no longer needed and in fact has the
opposite effect, killing connectivity to the node. The 'normal' yum
update on nodes delivers the latest openvswitch 2.6.1 with no drama.
Also adds a 'complete' message and some extra debug echo for logs,
and removes the python-zaqarclient install, which is no longer needed.
Closes-Bug: 1669714
Change-Id: Icd1517bcade36781fa0da21d045ffd9ec68efc38
|
|
Package update fails on compute nodes when yum_update checks the
pacemaker status via the systemctl command. This happens because the
exit on error (-e) option was enabled recently. The fix is to execute
the command only on nodes where pacemaker is enabled.
Closes-Bug: #1668266
Change-Id: I2aae4e2fdfec526c835f8967b54e1db3757bca17
|
|
During the upgrade from M to N I encountered an error in a step
requiring the upgrade of the mysql version. The variable backup_flags
is undefined at that point.
Change-Id: Ic6681c40934b27a03d00a75007d7f12d6d540de3
Closes-Bug: #1667731
|
|
Adds two checks, one for the CephMon and one for the CephOSD upgrade
tasks borrowed from ceph-ansible.
Change-Id: I0a0e60d277240130c6bd76a74ccc13354b87a30a
Co-Authored-By: Sebastien Han <seb@redhat.com>
|
|
And change the conditional to use hiera instead.
Change-Id: Icf91dd91c0ab04e7919172fcfd130183bfd427b4
|