Age | Commit message (Collapse) | Author | Files | Lines |
|
The ceph-osd package is only required on nodes hosting the CephOSD
service, but the package's presence on other nodes may interfere with
software updates. That's because some distros distribute Ceph software
in different channels, and not all nodes have access to the ceph-osd
channel.
There are two parts to the fix, and the first is an enhancement to the
yum update process. The process detects when the ceph-osd package is not
required, and removes the package from the node.
The second part takes ceph-osd out of the default list of packages
needed by puppet-ceph. The ceph-osd package is listed only on the nodes
hosting the CephOSD service.
Closes-Bug: #1713292
Change-Id: I7a581518ed25cf5f264abfaabfcf2041363a065b
(cherry picked from commit 5a89ea21f2add98119a10464b020a98999d31c41)
|
|
Checks for an existing /var/run/yum.pid and exit 1 with an error
message saying why.
Change-Id: I374eeb4164a8007ae67fea2796eac109fffdef97
Closes-Bug: 1704131
|
|
The bootstrap_nodeid can have capital letters while the hostname may
not. In puppet we use downcase for this comparison, so let's follow a
similar pattern for scripts from THT.
Change-Id: I8a0bec4a6f3ed0b4f2289cbe7023344fb284edf7
Closes-Bug: #16998201
|
|
To test this change we deployed a stock master with ipv6 which created a bunch
of ipv6 with /64 netmask:
[root@overcloud-controller-0 ~]# pcs resource show ip-fd00.fd00.fd00.2000..18
Resource: ip-fd00.fd00.fd00.2000..18 (class=ocf provider=heartbeat type=IPaddr2)
Attributes: ip=fd00:fd00:fd00:2000::18 cidr_netmask=64
Operations: start interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-start-interval-0s)
stop interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-stop-interval-0s)
monitor interval=10s timeout=20s (ip-fd00.fd00.fd00.2000..18-monitor-interval-10s)
Then we update the THT folder with this patch and upload the new scripts on the undercloud via:
openstack overcloud deploy --update-plan-only ....
Then we kick off the minor update workflow:
openstack overcloud update stack -i overcloud
Once the controller-0 node (bootstrap node for pacemaker) is completed we have the
correct VIP configuration:
[root@overcloud-controller-0 heat-config-script]# pcs resource show ip-fd00.fd00.fd00.2000..18
Resource: ip-fd00.fd00.fd00.2000..18 (class=ocf provider=heartbeat type=IPaddr2)
Attributes: ip=fd00:fd00:fd00:2000::18 cidr_netmask=128 nic=vlan20 lvs_ipv6_addrlabel=true lvs_ipv6_addrlabel_value=99
Operations: start interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-start-interval-0s)
stop interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-stop-interval-0s)
monitor interval=10s timeout=20s (ip-fd00.fd00.fd00.2000..18-monitor-interval-10s)
Also verified that running the script a second time does not alter the
(already fixed) VIPs.
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Change-Id: I765cd5c9b57134dff61f67ce726bf88af90f8090
|
|
In [1] we removed the previously used special case upgrade code.
However we have since discovered that for openvswitch 2.5.0-14
the special case is still required with an extra flag to prevent
the restart. This adds the upgrade code back into the minor
update and 'manual upgrade' scripts for compute/swift. The
review at If998704b3c4199bbae8a1d068c31a71763f5c8a2 is adding
this logic for the ansible upgrade steps.
Related-Bug: 1669714
[1] https://review.openstack.org/#/q/59e5f9597eb37f69045e470eb457b878728477d7
Change-Id: I3e5899e2d831b89745b2f37e61ff69dbf83ff595
|
|
In I9b1f0eaa0d36a28e20b507bec6a4e9b3af1781ae and
I11fcf688982ceda5eef7afc8904afae44300c2d9 we added a manual step
for upgrading openvswitch in order to specify the --nopostun
as discussed in the bug below.
This change adds a minor update to make this workaround more
robust. It removes any existing rpms that may be around from
an earlier run, and also checks that the rpms installed are
at least newer than the version we are on.
This also refactors the code into a common definition in the
pacemaker_common_functions.sh which is included even for the
heredocs generating upgrade scripts during init. Thanks
Sofer Athlan-Guyot and Jirka Stransky for help with that.
Change-Id: Idc863de7b5a8c116c990ee8c1472cfe377836d37
Related-Bug: 1635205
|
|
Seems the conditional has changed and we should pickup the
tripleo::profile::base::swift::storage::enable_swift_storage
hiera data.
After controller nodes are upgraded the swift services were down
even though there was no stand-alone swift node (the current
conditional was failing as that hiera isn't set any more)
Closes-Bug: 1638821
Change-Id: Id1383c1e54f9cae13fd375e90da525230e5d23eb
|
|
For N we cannot assume services are managed by pacemaker.
This adds functions to check if a service is systemd or
pcmk managed and start/stops it accordingly. For pcmk,
only stop/disable on bootstrap node for example, whereas
systemd should stop/start on all controllers.
There is also an equivalent change to the check_resource
which has been reworked to allow both pcmk and systemd.
Implements: blueprint overcloud-upgrades-workflow-mitaka-to-newton
Change-Id: Ic8252736781dc906b3aef8fc756eb8b2f3bb1f02
|
|
While having extra customizations inside a TripleO deployed
Pacemaker environment, say you have instance HA with
pacemaker_remoted or you need to configure an external arbitrator
for something, then the status of the resources for remote nodes
is "Stopped".
This leads to failures while, for example, scaling up.
This fixes the way status is checked, filtering just local nodes.
Co-Authored-By: Giulio Fidente <gfidente@redhat.com>
Change-Id: I8dc25f5d7031c265858afd5a266fda5315ae37a0
|
|
During the controller upgrade in
major_upgrade_controller_pacemaker_1.sh we use systemctl to stop
all swift services and then start them again in _pacemaker_2.sh
In the case of stand-alone swift nodes the deployer may have
used the ControllerEnableSwiftStorage: false so that only the
swift-proxy service is left on controllers (wrt swift). The
systemctl_swift function used during upgrades is changed to factor
this in.
Change-Id: Ib22005123429f250324df389855d0dccd2343feb
|
|
Since swift isn't managed by pacemaker we need to manually (systemctl)
stop and start the swift services. This moves the duplicate blocks for
start/stop into a common function (we already include that
pacemaker_common_functions.sh here so may as well)
Change-Id: Ic4f23212594c1bf9edc39143bf60c7f6d648fd1d
|
|
Also split out echo_error function to DRY the error output code and
allow changing the way we report errors in a single place.
Change-Id: I448bf0eb49390f03155335736bb4ab4e979db128
Co-Authored-By: Jiri Stransky <jistr@redhat.com>
|