From 676ec6ea6d7995266b8162c8caa89f72693eb929 Mon Sep 17 00:00:00 2001
From: Emilien Macchi
Date: Thu, 10 Dec 2015 16:23:50 -0500
Subject: pacemaker: run neutron-server-start-wait-stop only at step 4

neutron-server-start-wait-stop is a dangerous Exec that is exposed to
race conditions, because it does not have "onlyif" or "unless"
statements. That means that during a deployment, this Exec can be run
in the wrong order during Step 5 and/or 6, while it was supposed to be
run at Step 4 only.

If that happens, the Exec will fail because Puppet tries to start
neutron-server while Pacemaker has already started the resource. In
that case, systemd would return 1 to Puppet, which would return 6 to
the overcloud deployment, and the deployment would fail to finish
correctly.

This patch aims to prevent this scenario by making sure we run the
Exec only during Step 4. Also, in order to secure it a bit more, we add
an 'unless' statement to this Exec, so that the Puppet run is
idempotent and the Exec runs successfully only once.

https://bugzilla.redhat.com/show_bug.cgi?id=1290582
Change-Id: I42813c5cff6c525c15c9c24baad4e355f88af672
---
 puppet/manifests/overcloud_controller_pacemaker.pp | 35 ++++++++++++++++------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/puppet/manifests/overcloud_controller_pacemaker.pp b/puppet/manifests/overcloud_controller_pacemaker.pp
index 6c8530ff..f4f7a4ea 100644
--- a/puppet/manifests/overcloud_controller_pacemaker.pp
+++ b/puppet/manifests/overcloud_controller_pacemaker.pp
@@ -1060,15 +1060,32 @@ if hiera('step') >= 4 {
                   Pacemaker::Resource::Service[$::glance::params::api_service_name]],
     }
 
-    # Neutron
-    # NOTE(gfidente): Neutron will try to populate the database with some data
-    # as soon as neutron-server is started; to avoid races we want to make this
-    # happen only on one node, before normal Pacemaker initialization
-    # https://bugzilla.redhat.com/show_bug.cgi?id=1233061
-    exec { '/usr/bin/systemctl start neutron-server && /usr/bin/sleep 5' : } ->
-    pacemaker::resource::service { $::neutron::params::server_service:
-      clone_params => 'interleave=true',
-      require      => Pacemaker::Resource::Service[$::keystone::params::service_name],
+    if hiera('step') == 4 {
+      # Neutron
+      # NOTE(gfidente): Neutron will try to populate the database with some data
+      # as soon as neutron-server is started; to avoid races we want to make this
+      # happen only on one node, before normal Pacemaker initialization
+      # https://bugzilla.redhat.com/show_bug.cgi?id=1233061
+      # NOTE(emilien): we need to run this Exec only at Step 4 otherwise this exec
+      # will try to start the service while it's already started by Pacemaker
+      # It would result to a deployment failure since systemd would return 1 to Puppet
+      # and the overcloud would fail to deploy (6 would be returned).
+      # This conditional prevents from a race condition during the deployment.
+      # https://bugzilla.redhat.com/show_bug.cgi?id=1290582
+      exec { 'neutron-server-systemd-start-sleep' :
+        command => 'systemctl start neutron-server && /usr/bin/sleep 5',
+        path    => '/usr/bin',
+        unless  => '/sbin/pcs resource show neutron-server',
+      } ->
+      pacemaker::resource::service { $::neutron::params::server_service:
+        clone_params => 'interleave=true',
+        require      => Pacemaker::Resource::Service[$::keystone::params::service_name]
+      }
+    } else {
+      pacemaker::resource::service { $::neutron::params::server_service:
+        clone_params => 'interleave=true',
+        require      => Pacemaker::Resource::Service[$::keystone::params::service_name]
+      }
     }
     if hiera('neutron::enable_l3_agent', true) {
       pacemaker::resource::service { $::neutron::params::l3_agent_service:
-- 
cgit 1.2.3-korg