From 72392a373ca3c0f2b4ea6887fde19cf886288dc4 Mon Sep 17 00:00:00 2001 From: Michele Baldessari Date: Sat, 6 May 2017 17:40:24 +0200 Subject: Use verify_on_create when creating pacemaker remote resources We currently create remote resources without waiting for their creation. This leads to the following potential race (spotted by Marian Mkrcmari): - On Step1 pacemaker bootstrap node creates the resource but the remote resource is not yet created - Step1 completes and Step2 starts - On Step2 the remote node sets a property (or calls pcs cib) but the remote is not yet set up so 'pcs cluster cib' will fail there with: (err): Could not evaluate: backup_cib: Running: /usr/sbin/pcs cluster cib /var/lib/pacemaker/cib/puppet-cib-backup20170506-15994-1swnk1i failed with code: 1 -> Note that when verify_on_create is set to true we are not using the cib dump/push mechanism. That is fine because we create the remotes on step1 and the dump/push mechanism is only needed starting from step2 when multiple nodes set cluster properties at the same time. Tested by Marian Mkrcmari successfully as well. Closes-Bug: #1689028 Change-Id: I764526b3f3c06591d477cc92779d83a19802368e Depends-On: I1db31dcc92b8695ab0522bba91df729b37f34e0f (cherry picked from commit b6d02fd5001153b53b3061d63d2cb686b0646f18) --- manifests/profile/base/pacemaker.pp | 1 + 1 file changed, 1 insertion(+) diff --git a/manifests/profile/base/pacemaker.pp b/manifests/profile/base/pacemaker.pp index c1d745a..811b911 100644 --- a/manifests/profile/base/pacemaker.pp +++ b/manifests/profile/base/pacemaker.pp @@ -136,6 +136,7 @@ class tripleo::profile::base::pacemaker ( remote_address => $remotes_hash[$title], reconnect_interval => $remote_reconnect_interval, op_params => "monitor interval=${remote_monitor_interval}", + verify_on_create => true, tries => $remote_tries, try_sleep => $remote_try_sleep, } -- cgit 1.2.3-korg