path: root/extraconfig
Age | Commit message | Author | Files | Lines
2016-04-08 | Replace extraconfig/tasks/noop.yaml w/ Heat::None | Dan Prince | 1 | -26/+0
Removes the old noop nested stack template for extraconfig tasks and instead uses OS::Heat::None. This should avoid a few extra resource checks on create and update. Change-Id: I5a42fc78ece2553e86385236e214aa1e3c91cd85
2016-04-07 | Add removal of the /etc/resolv.conf.save file for +bug/1567004 | marios | 1 | -0/+3
The change at https://review.openstack.org/#/c/302352/ should stop the ifup/ifdown scripts from making changes to resolv.conf, as discussed in that review and the related bug below. However, during upgrades we are moving from a version of the ifcfg-vlanXX files that doesn't have the PEERDNS=no added by /#/c/302352, so the ifup script will restore /etc/resolv.conf.save to /etc/resolv.conf and overwrite it. This removes the .save file during the upgrade init command, which gets delivered to all nodes as the first stage of a major upgrade. Change-Id: I91dd139f43be4912c20d8661691bee2b662964d4 Related-Bug: 1567004
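A minimal sketch of the cleanup step described above, written as a plain bash function. The function name `remove_resolv_conf_save` and its optional path argument are illustrative assumptions, not taken from the change itself; the argument exists only so the sketch can be tried against a scratch file.

```bash
#!/bin/bash
# Hypothetical helper: delete the saved copy of resolv.conf so that the ifup
# script has nothing to restore over /etc/resolv.conf.
remove_resolv_conf_save() {
    # Default to the real path; accept an override for dry runs.
    local save_file="${1:-/etc/resolv.conf.save}"
    # -f makes the call succeed even if the file is already gone.
    rm -f "$save_file"
}
```

Called with no arguments on a node, this would remove /etc/resolv.conf.save; because of `-f`, running it a second time is harmless.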
2016-04-04 | Filter for local nodes in check_resource function | Raoul Scarazzini | 1 | -1/+2
With extra customizations inside a TripleO-deployed Pacemaker environment (for example, instance HA with pacemaker_remoted, or an external arbitrator), the status of the resources for remote nodes is "Stopped". This leads to failures when, for example, scaling up. This fixes the way status is checked by filtering for local nodes only. Co-Authored-By: Giulio Fidente <gfidente@redhat.com> Change-Id: I8dc25f5d7031c265858afd5a266fda5315ae37a0
2016-04-01 | Restart haproxy after configuring SSL certs | Ben Nemec | 2 | -7/+21
If a certificate expires, the user will need to update it. However, because we only restart services at the end of a stack-update, the new certificate doesn't take effect until after puppet has run. This is a problem because puppet makes OpenStack calls, which will fail if the certificate is expired. In that case we never get to the service restart, so the stack is wedged until the user manually restarts haproxy. This patch addresses the problem by reloading haproxy before puppet runs. This is done in a pre-puppet script for pacemaker after pacemaker is in maintenance mode, because we need to make sure it happens after all of the certs have been installed on the controllers, but before puppet runs. For non-pacemaker, haproxy is simply reloaded. Change-Id: Id5ed05b3a20d06af8ae7a3d6f859b03399b0d77d
2016-03-29 | change the default satellite tools rpm repo. | Mike Burns | 3 | -1/+6
Change-Id: I60ab36b04b8932e4dbee58e21998dc984178b41c Bugzilla: https://bugzilla.redhat.com/1275281
2016-03-24 | Set UpdateIdentifier for upgrade converge, to prevent services down | Mathieu Bultel | 1 | -6/+0
We'd like to let the post-puppet pacemaker controller services restart happen for the convergence step, so set the UpdateIdentifier. However, also set the PackageUpdate to noop so that yum_update.sh doesn't run. Since a full haproxy restart is expected, we no longer need the systemctl reload added at Iae3bad745ecdf952a7a0314fe1375d07eb47c454, so remove that too. Some more context at https://bugzilla.redhat.com/show_bug.cgi?id=1321036 Co-Authored-By: marios <marios@redhat.com> Change-Id: I31c2d97d68c97b435f63863fae2c89f18f99681d
2016-03-24 | Merge "Fix satellite registration for http or https" | Jenkins | 1 | -4/+5
2016-03-24 | Merge "Add systemctl reload haproxy to the pacemaker_resource_restart.sh" | Jenkins | 1 | -0/+6
2016-03-24 | Merge "Deploy Aodh services, replacing Ceilometer Alarm" | Jenkins | 2 | -2/+31
2016-03-23 | Add systemctl reload haproxy to the pacemaker_resource_restart.sh | marios | 1 | -0/+6
As discussed in the related bug below, after upgrading your environment to latest liberty the haproxy config isn't picked up. This adds a systemctl reload haproxy in the pacemaker resource restart we run as part of the post-puppet-pacemaker. Related-Bug: 1561012 Change-Id: Iae3bad745ecdf952a7a0314fe1375d07eb47c454
2016-03-23 | Fix satellite registration for http or https | James Slagle | 1 | -4/+5
If the satellite registration url was specified with https, the curl command to detect the satellite version would not work as expected since -L was not passed and you get redirected to https when testing the ping api. To additionally handle the case where https is specified, also use curl directly with -k to download the configuration rpm instead of using rpm with a url. Fixes another bug with a missing $ in the reference to the $satellite_version variable. Change-Id: I984fdfc415eeeed4ef29cc8d0812e1b67545d6b1
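The last fix mentioned above, a missing `$` in a variable reference, is an easy bug to overlook in shell. The sketch below illustrates that class of bug only; the comparison and the value `5` are assumptions for demonstration and are not copied from the registration script.

```bash
#!/bin/bash
# Illustrative value; the real script detects the version at run time.
satellite_version=5

# Buggy form: without the leading $, the literal string "satellite_version"
# is compared, so the branch can never match.
if [ "satellite_version" = "5" ]; then buggy="match"; else buggy="no match"; fi

# Fixed form: the $ makes the shell expand the variable's value.
if [ "$satellite_version" = "5" ]; then fixed="match"; else fixed="no match"; fi

echo "buggy: $buggy, fixed: $fixed"
```

The shell reports no error for the buggy form, which is why this kind of mistake tends to surface only as a branch that silently never runs.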
2016-03-20 | Deploy Aodh services, replacing Ceilometer Alarm | Pradeep Kilambi | 2 | -2/+31
Ceilometer Alarm is deprecated in Liberty by Aodh. This patch:
* manage Aodh Keystone resources
* deploy Aodh API under WSGI, Notifier, Listener and Evaluator
* manage new parameters to customize Aodh deployment
* uses ceilometer DB for the upgrade path
* pacemaker config
* Add migration logic to remove pcs resources
Depends-On: I5333faa72e52d2aa2a622ac2d4b60825aadc52b5 Depends-On: Ib6c9c4c35da3fb55e0ca8e2d5a58ebaf4204d792 Co-Authored-By: Emilien Macchi <emilien@redhat.com> Change-Id: Ib47a22884afb032ebc1655e1a4a06bfe70249134
2016-03-11 | Merge "Upgrades: quiet yum upgrade on cinder nodes" | Jenkins | 1 | -1/+1
2016-03-10 | Merge "Upgrades: initialization command/snippet" | Jenkins | 1 | -0/+52
2016-03-10 | Merge "Add a ceph-storage node upgrade script for the upgrade workflow" | Jenkins | 2 | -4/+50
2016-03-10 | Merge "Upgrades: object storage node upgrade fix" | Jenkins | 1 | -3/+4
2016-03-10 | Upgrades: quiet yum upgrade on cinder nodes | Jiri Stransky | 1 | -1/+1
Yum update on cinder nodes should be quiet, as it is on controllers, because results of these updates are sent to Heat. I mistakenly left this out in the first patch because I used one of the standalone node upgrade scripts as a copy/paste base for the cinder node upgrade script. Change-Id: Id13190dc4d242317829c7994088183f52d21461d
2016-03-10 | Merge "Upgrade of Cinder block storage nodes" | Jenkins | 2 | -1/+23
2016-03-09 | Upgrades: object storage node upgrade fix | Jiri Stransky | 1 | -3/+4
The variables in the heredoc should be escaped because they should evaluate only when the inner script runs, not when the outer "writer" script runs. Python-zaqarclient is installed for os-collect-config to work, as we do on the other node types. Swift-proxy is removed from list of services to stop/start, as swift-proxy isn't supposed to run on the swift storage nodes. Change-Id: I8426b859d11378ebdc3da94dcc090133dab0c628
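The heredoc escaping issue can be shown with a small, self-contained sketch. The file location and the variable names `WRITER_VALUE` and `RUNTIME_VALUE` are made up for illustration; only the escaping behaviour matters. In an unquoted heredoc, `\$VAR` is written out literally and expands when the inner script runs, while `$VAR` is substituted immediately by the outer "writer" script.

```bash
#!/bin/bash
# Outer "writer" script: generates an inner script into a scratch directory.
out_dir=$(mktemp -d)
inner_script="$out_dir/inner.sh"
WRITER_VALUE="set-by-writer"

# Unquoted delimiter (EOF): expansion happens now unless the $ is escaped.
cat > "$inner_script" <<EOF
#!/bin/bash
echo "runtime=\$RUNTIME_VALUE"
echo "writer=$WRITER_VALUE"
EOF
chmod +x "$inner_script"

# Show what was actually written to the inner script.
cat "$inner_script"
```

The generated file still contains the text `$RUNTIME_VALUE`, so its value is taken from the environment of whoever runs the inner script later, whereas `$WRITER_VALUE` has already been replaced by `set-by-writer`.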
2016-03-09 | Fixup systemctl_swift stop/start during the controller upgrade | marios | 1 | -4/+17
During the controller upgrade in major_upgrade_controller_pacemaker_1.sh we use systemctl to stop all swift services and then start them again in _pacemaker_2.sh. In the case of stand-alone swift nodes the deployer may have set ControllerEnableSwiftStorage: false so that only the swift-proxy service is left on controllers (wrt swift). The systemctl_swift function used during upgrades is changed to factor this in. Change-Id: Ib22005123429f250324df389855d0dccd2343feb
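One way to factor the storage setting into the stop/start logic is sketched below. This is not the code from the change: the function names, the "true"/"false" argument, and the exact list of units are assumptions chosen for illustration (the unit names follow the usual RDO packaging, openstack-swift-proxy and friends).

```bash
#!/bin/bash
# Hypothetical helper: print the swift units a controller is expected to run.
# $1 is "true" when swift storage is enabled on controllers, "false" otherwise.
swift_services_for_controller() {
    local storage_enabled="$1"
    # The proxy stays on controllers in both cases.
    local services="openstack-swift-proxy"
    if [ "$storage_enabled" = "true" ]; then
        services="$services openstack-swift-account openstack-swift-container openstack-swift-object"
    fi
    echo "$services"
}

# Hypothetical wrapper: apply a systemctl action (stop or start) to that list.
systemctl_swift_sketch() {
    local action="$1"
    local storage_enabled="$2"
    local svc
    for svc in $(swift_services_for_controller "$storage_enabled"); do
        systemctl "$action" "$svc"
    done
}
```

With this shape, `systemctl_swift_sketch stop false` would touch only the proxy unit, so the upgrade script does not try to manage storage services that are not deployed on the controller.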
2016-03-09 | Upgrades: initialization command/snippet | Jiri Stransky | 1 | -0/+52
This allows running a command or a script snippet on all overcloud nodes at the beginning of the upgrade. The intended use is to switch to a new set of repositories on the overcloud. This is done differently in different contexts (e.g. upstream vs. downstream), but generally it should be simple enough to not warrant creation of a switchable "UpgradeInit" resource in the resource registry, and a string command/snippet parameter should suffice. Change-Id: I72271170d3f53a5179b3212ec9bae9a6204e29e6
2016-03-08 | Add a ceph-storage node upgrade script for the upgrade workflow | marios | 2 | -4/+50
This adds delivery of an upgrade script to any ceph-storage nodes during the script delivery that comes first during the upgrade workflow. The controllers have the ceph-mon whilst the ceph-osds are on the ceph-storage nodes. The ceph-mons will be updated first as part of the heat-driven controller upgrade, and ceph-osds on ceph nodes are upgraded with the upgrade-non-controller.sh tripleo-common script as with compute and swift nodes. Also slight rename for the ObjectStorageConfig/Deployment here for consistency. Change-Id: I12abad5548dcb019ade9273da06fe66fd97f54cc
2016-03-08 | Merge "Add an environment to use a swap partition" | Jenkins | 1 | -0/+90
2016-03-07 | Merge "Revert "Deploy Aodh services, replacing Ceilometer Alarm"" | Jenkins | 1 | -7/+2
2016-03-07 | Merge "Function library for major upgrades" | Jenkins | 2 | -0/+16
2016-03-07 | Merge "Introduce a UpgradeScriptDeliveryWorfklow as part of tripleo upgrades" | Jenkins | 3 | -25/+103
2016-03-06 | Add an environment to use a swap partition | James Slagle | 1 | -0/+90
This environment can be used with AllNodesExtraConfig to enable swap on a device with the given label as specified by the swap_partition_label parameter. If Ironic is used to create the swap partition, the partition will have a label of swap1, so that's a reasonable default for the parameter. The partition is also written to /etc/fstab as a swap mount so that it will be enabled on reboot. Change-Id: I5cd68c13dbfe53eecf6c6ad93151eadc980a902d
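The /etc/fstab step can be sketched as follows. The function name, the use of the /dev/disk/by-label path, and the fstab options are assumptions for illustration rather than the contents of the environment; the fstab path is a parameter so the sketch can be tried on a scratch file instead of the real one.

```bash
#!/bin/bash
# Hypothetical helper: record a labelled swap partition in an fstab file.
add_swap_fstab_entry() {
    # swap1 is the label the commit message gives for Ironic-created swap.
    local label="${1:-swap1}"
    local fstab="${2:-/etc/fstab}"
    local device="/dev/disk/by-label/$label"
    # Append only when no entry for this device exists, so re-runs are safe.
    if ! grep -q "^$device[[:space:]]" "$fstab"; then
        echo "$device swap swap defaults 0 0" >> "$fstab"
    fi
}
```

On a real node the entry would typically be paired with `swapon` on the same device (as root) so that swap is active immediately and not only after the next reboot.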
2016-03-04 | Revert "Deploy Aodh services, replacing Ceilometer Alarm" | James Slagle | 1 | -7/+2
This is just a revert to see if reverting this gets us back to a normal CI run time. This reverts commit f72aed85594f223b6f888e6d0af3c880ea581a66. Change-Id: I04a0893f6cf69f547a4db26261005e580e1fc90b
2016-03-03 | Merge "Deploy Aodh services, replacing Ceilometer Alarm" | Jenkins | 1 | -2/+7
2016-03-03 | Deploy Aodh services, replacing Ceilometer Alarm | Emilien Macchi | 1 | -2/+7
Ceilometer Alarm is deprecated in Liberty by Aodh. This patch:
* manage Aodh Keystone resources
* deploy Aodh API under WSGI, Notifier, Listener and Evaluator
* manage new parameters to customize Aodh deployment
* uses ceilometer DB for the upgrade path
* pacemaker config
Depends-On: I9e34485285829884d9c954b804e3bdd5d6e31635 Depends-On: I891985da9248a88c6ce2df1dd186881f582605ee Depends-On: Ied8ba5985f43a5c5b3be5b35a091aef6ed86572f Co-Authored-By: Pradeep Kilambi <pkilambi@redhat.com> Change-Id: I58d419173e80d2462accf7324c987c71420fd5f6
2016-03-03 | Function library for major upgrades | Jiri Stransky | 2 | -0/+16
This commit introduces a bash file to be sourced into major upgrade scripts. Into this file we can put specific pieces of migration logic in the form of bash functions, which can then be called from the upgrade scripts. Change-Id: Ibf7aa84d3880e9218c488dec9d707300e1784744
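The pattern of a sourced function library can be sketched like this. The file name `upgrade_functions.sh` and the function `migrate_example` are hypothetical placeholders, and the library is written to a scratch directory only so that the example is self-contained.

```bash
#!/bin/bash
# Create a stand-in library file; in practice this file ships with the scripts.
lib_dir=$(mktemp -d)
cat > "$lib_dir/upgrade_functions.sh" <<'EOF'
# Hypothetical piece of migration logic kept in the shared library.
migrate_example() {
    echo "migrating $1"
}
EOF

# In an upgrade script: source the library, then call its functions.
. "$lib_dir/upgrade_functions.sh"
result=$(migrate_example "demo-service")
echo "$result"
```

Keeping migration steps as functions in one file means each upgrade script only needs a single source line to reuse them, instead of carrying its own copy of the logic.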
2016-03-03 | Merge "Moves the swift start/stop into the common_functions.sh file" | Jenkins | 3 | -10/+11
2016-03-03 | Merge "Add Satellite 5 support" | Jenkins | 2 | -7/+36
2016-03-03 | Introduce a UpgradeScriptDeliveryWorfklow as part of tripleo upgrades | marios | 3 | -25/+103
This splits the upgrade script delivery out of the UpgradeWorkflow and into a new task which delivers the upgrade script for compute and object-storage nodes. This is intended to be the first part of the upgrades process, since we need to upgrade swift nodes before the controllers and then only one at a time. So this will deliver the upgrade script which can be invoked by the operator using the existing script in tripleo-common 'upgrade-non-controller.sh'. This can be invoked by passing the -e environments/major-upgrade-script-delivery.yaml (added here) to the openstack overcloud deploy command. Change-Id: I20a0d4978e907111404f8108c502ab53b69a3296
2016-03-03 | Upgrade of Cinder block storage nodes | Jiri Stransky | 2 | -1/+23
This introduces upgrades for Cinder block storage nodes. Currently Cinder doesn't support upgrade level pinning and cannot safely deal with version skew. This means that we have to upgrade Cinder storage nodes in sync with controller nodes (after they were taken down for upgrade, before they are brought back up) to ensure that Cinder services perform AMQP communication only within the same major version of Cinder. According to our current knowledge, Cinder block storage nodes are the only node type that will have to be upgraded in sync with controllers. Change-Id: Icec913c015eff744b0f31b513176b4b657df43af
2016-03-02 | Moves the swift start/stop into the common_functions.sh file | marios | 3 | -10/+11
Since swift isn't managed by pacemaker we need to manually (systemctl) stop and start the swift services. This moves the duplicate blocks for start/stop into a common function (we already include pacemaker_common_functions.sh here, so we may as well). Change-Id: Ic4f23212594c1bf9edc39143bf60c7f6d648fd1d
2016-03-02 | Merge "Upgrades: install zaqarclient" | Jenkins | 2 | -0/+3
2016-03-02 | Merge "Upgrades: quiet yum update" | Jenkins | 1 | -1/+1
2016-03-02 | Upgrades: install zaqarclient | Jiri Stransky | 2 | -0/+3
Old overcloud images don't have python-zaqarclient installed, and new overclouds' os-collect-config is configured with Zaqar support. Together this means that on upgrade we need to install python-zaqarclient; otherwise os-collect-config will be restarted during yum update and crash trying to import a missing Python module from zaqarclient. Change-Id: I3e875e14cb60b1b78aec0d9ddc412ccf865abd01
2016-03-02 | Upgrades: quiet yum update | Jiri Stransky | 1 | -1/+1
Quiet down yum during major upgrades to reduce the output size. This is consistent with what was introduced into minor updates in change I517271e8465885421a78b73c5af756816c37a977. Change-Id: Ie6b470e383fdf42870ac6f60ca43e44b4c446ebe
2016-03-01 | Support adding a swap file to overcloud nodes | James Slagle | 1 | -0/+108
Create a new SoftwareDeployment that can be used to add a swap file to all nodes. The amount of swap and the location of the swap file can be customized via parameter_defaults and the swap_size_megabytes/swap_path parameters. Change-Id: I1fb14c0fab2255410fceb26c3a7d5cfe0ba57b3b
2016-02-29 | Add Satellite 5 support | James Slagle | 2 | -7/+36
Add Satellite 5 support to the RHEL registration environment and resources. The registration script is updated to support both satellite versions in place given the similarity of the options for both scenarios. The satellite version is detected based on $REG_SAT_URL, and that determines whether subscription-manager or rhnreg_ks is used. Change-Id: Ic261c8a16a7d6d3978f8bfc6e53f75dbe1b716db
2016-02-29 | Merge "Write the compute upgrade script for tripleo major upgrade workflow" | Jenkins | 2 | -0/+50
2016-02-26 | Merge "Add meta notify=true to rabbitmq resource" | Jenkins | 1 | -0/+3
2016-02-26 | Write the compute upgrade script for tripleo major upgrade workflow | marios | 2 | -0/+50
As part of the major upgrade workflow non-controller nodes are to be updated by the operator, out-of-band and only after an initial heat stack-update that invokes the upgrade of the controller nodes. This review adds a ComputeDeliverUpgradeConfigDeployment_Step3 SoftwareDeploymentGroup to be applied only to compute nodes, and that depends on the controllers having been upgraded after ControllerPacemakerUpgradeConfig_Step2. Its purpose is to deliver but not invoke the upgrade script on compute nodes to /root/tripleo_upgrade_node.sh . The non-controller nodes will then be upgraded later by an operator that will run the script provided for that purpose, like at https://review.openstack.org/#/c/284722/1 for example. Change-Id: Ic6115fc8cf5320abfcf500112ff563bde8b88661
2016-02-23 | Add UpgradeLevelNovaCompute parameter | Jiri Stransky | 2 | -2/+17
This parameter can be used for pinning (and later unpinning) the Nova Compute RPC version. Change-Id: I2f181f3b01f0b8059566d01db0152a12bbbd1c3e
2016-02-23 | Introduce update/upgrade workflow | Jiri Stransky | 5 | -6/+59
Change-Id: I7226070aa87416e79f25625647f8e3076c9e2c9a
2016-02-23 | Add resources for major upgrade in Pacemaker scenario | Derek Higgins | 3 | -0/+174
Add Heat software deployments to be used to upgrade major versions of OpenStack on the controller nodes. All controller services are taken down while the upgrade is in progress. The new updated yum repositories should be configured by another process e.g. the deployment artifacts transfer via Swift. Change-Id: Ia0a04e4a11d67e7a5acc53c1f8a8f01ed5ca8675 Co-Authored-By: Giulio Fidente <gfidente@redhat.com> Co-Authored-By: Jiri Stransky <jistr@redhat.com>
2016-02-23 | Add meta notify=true to rabbitmq resource | Michele Baldessari | 1 | -0/+3
See RHBZ 1311005 and 1247303. In short: sometimes when a controller node gets fenced, rabbitmq is unable to rejoin the cluster. To fix this we need two steps: 1) The fix for the RA in BZ 1247303 2) Add notify=true to the meta parameters of the rabbitmq resource on fresh installs and updates. Note that if this change is applied on systems that do not have the fix for the rabbitmq resource agent, no action is taken. Once the resource agent is updated, the notify operation will start to work as soon as the first monitor action takes place. Fixes RH Bug #1311005 Change-Id: I513daf6d45e1a13d43d3c404cfd6e49d64e51d5a
2016-02-16 | Merge "Split pacemaker common check_service function out of _restart.sh" | Jenkins | 3 | -34/+44