Age | Commit message (Collapse) | Author | Files | Lines |
|
Where applicable, use list_concat instead of yaql to build new lists: it
should be more resilient to errors, easier to debug, and less expensive.
Change-Id: I6d3dbc7ee8eac50f46023a35af4ec7f2d378fd87
Related-Bug: #1714005
(cherry picked from commit 8008089de24437757d3ba10299bb1041b4aa627a)
|
|
For bug 1708115 and the O..P upgrade, and for the upgrade of
'non-controlers' we are now generating ansible playbooks from
collected service upgrade_tasks and these are executed instead
of the legacy tripleo_upgrade_node.sh.
To clarify, by 'non-controllers' it is meant any node for which
the corresponding roles_data.yaml role has the
disable_upgrade_deployment flag set True.
As a first pass, I am removing the workarounds from the script but
keeping its delivery mechanism for now in case it is needed still.
We can either update here to remove it or keep it until next cycle
The most important part for now is that we no longer 'manually'
run puppet here. Instead the post_deploy_steps are also collected
into a playbook and will be executed after the upgrade_tasks
(see the bug for discussion of the mechanism and related reviews)
Change-Id: Ib017b0ab435ca9558cf8659d434489cdf01df955
Related-Bug: 1708115
(cherry picked from commit 4c5b9c5c967105536106fa4a7e1ec2352b14b08c)
|
|
docker-puppet.py is very aggressive about running concurrently.
It uses python multiprocessing to run multiple config generating
containers at once. This seems to work well in general, but
in some cases... perhaps when the registry is slow or under
heavy load can cause timeouts to occur. Lately I'm seeing
several 'container did not start before the specified timeout'
errors that always seem to occur when config files are generated
(docker-puppet.py is initially executed.
A couple of things:
-when config files are generated this is the first time
most of the containers are pulled to each host machine
during deployment
-docker-puppet.py runs many of these processes at once. Some
of them run faster, other not.
-docker daemon's pull limit defaults to 3. This would throttle
the above a bit perhaps contributing the the likelyhood of a timeout.
One solution that seems to work for me is to set the PROCESS_COUNT
in docker-puppet.py to 3. As this matches docker daemon's default
it is probably safer at the cost of being slightly slower in some
cases.
Change-Id: I17feb3abd9d36fe7c95865a064502ce9902a074e
Closes-bug: #1713188
(cherry picked from commit 949d367ddeb42eff913cdbed733ccf6239b4864b)
|
|
Force the count start to 0 to ensure that the
update step loop will start to 0 and execute the
update step0
Closes-Bug: #1712498
Change-Id: I71be55c1f56e53e5c565bec281795d63e5845ff6
|
|
To get this to work upgrade_tasks need to be rewritten with 'when'
statements like the update tasks (in parent review from shardy).
So that we don't break the existing upgrades workflow, we add these
as part of the config download see the depends on
Related-Bug: 1708115
Depends-On: Ief593dc758a2ffe33c1cbcbda9289393fcf023e4
Change-Id: Ib01b96a2c26721747d81d98e3d57c4c388663004
|
|
This enables either deploying without configuring any services, or
temporarily disabling the deploy steps such as will be required
for minor updates where we want to re-run the rolling update outside
of heat.
To deploy directly via ansible-playbook you can do e.g:
openstack overcloud config download --config-dir tmpconfig
cd tmpconfig/tripleo-6b02U7-config
ansible-playbook -vvv -b -i /usr/bin/tripleo-ansible-inventory deploy_steps_playbook.yaml
Which will run the same ansible steps as we normally run via heat.
Change-Id: I59947b67523dfcc43d454d4ac7d82b06804cf71d
|
|
These work the same way as upgrade_tasks *but* they use a step variable
instead of tags, so we can iterate over a count/sequence which isn't
possibly via a wrapper playbook with tags (we may want to align upgrade
tasks with the same approach if this works out well).
Note the tasks can be run via ansible-playbook on the undercloud, like:
openstack overcloud config download --config-dir tmpconfig
cd tmpconfig/tripleo-HCrDA6-config
ansible-playbook -b -i /usr/bin/tripleo-ansible-inventory update_steps_playbook.yaml --limit controller
The above will do a rolling update for the Controller role (note the inconsistent
capitalization, we probably need to fix the group naming in tripleo-ansible-inventory)
because we specify serial: 1 in the playbook.
You can also trigger an update explicitly on one node like this, which is useful for debugging:
ansible-playbook -vvv -b -i /usr/bin/tripleo-ansible-inventory update_steps_playbook.yaml --limit overcloud-controller-0
Change-Id: I20bb3e26ab9d9cadf1a31fd304de8a014a901aa9
|
|
This exposes the deploy workflow for all roles from deploy-steps
via overcloud.j2.yaml - which means we can write it via the new
openstack overcloud config download command and/or run the workflow
outside of heat via mistral
With https://review.openstack.org/#/c/485732/ applied to
tripleoclient it becomes possible to do:
openstack overcloud config download --config-dir tmpconfig
cd tmpconfig/tripleo-EvEZk0-config
ansible-playbook -b -i /usr/bin/tripleo-ansible-inventory deploy_steps_playbook.yaml
This runs the deploy steps, exactly the same as normally run via heat
via ansible-playbook for all overcloud nodes (--limit can be used to restrict
to specific nodes/roles).
Change-Id: I96ec09bc788836584c4b39dcce5bf9b80e914c71
|
|
This isn't set unless the playbook is run via heat, so default it to false
to enable easier use via ansible-playbook combined with tripleo-ansible-inventory
Change-Id: I9705e4533831a019dd0051e5522d4b7958682506
|
|
So that we can more easily iterate over an include in an output
Change-Id: Idd5bb47589e5c37123caafcded1afbff8881aa33
|
|
If we consolidate these we can focus on one implementation (the new ansible
based one used for docker-steps)
Change-Id: Iec0ad2278d62040bf03613fc9556b199c6a80546
Depends-On: Ifa2afa915e0fee368fb2506c02de75bf5efe82d5
|
|
The key_name default is ignored because the parameter is used in
some mutually exclusive environments where the default doesn't
need to be the same.
Change-Id: I77c1a1159fae38d03b0e59b80ae6bee491d734d7
Partial-Bug: 1700664
|
|
This makes the RolesData output more accurate, and we can rework
things so docker-puppet only gets run when there is a non-empty
file calculated (e.g there are tasks to run).
Change-Id: I8cdab3c857977c80fe2e359ab9e05740a838d66b
|
|
This stores the result of the yaql queries etc for easier debugging, and
also so there's no risk we constantly re-evaluate the expensive query
which can happen with some heat versions and configurations.
This also gives a nicer error when things go wrong as when a query fails
you know which resource had an error, and also the validation on resources
is currently stricter due to bug #1599114. We also get some additional
type validation from each OS::Heat::Value resource, e.g it checks if the
calculated value is a valid map or list.
The final advantage (and the original motivation for doing this) is that
we can easily filter null values for any outputs where this isn't already
done, which makes the config data written via openstack overcloud config
download cleaner.
Change-Id: Ia6697cf2e47f3f7b727d620536e0873a985c98c4
|
|
Moving these means we get a more accurate output from the overcloud
RoleData output, which more closely reflects what is actually
deployed.
Change-Id: I154f36c1597cf4abe29ca0bfe15a54f507433fb1
|
|
Makes it possible to resolve network subnets within a service
template; the data is transported into a new property ServiceData
wired into every service which hopefully is generic enough to
be extended in the future and transport more data.
Data can be consumed in service templates to set config values
which need to know what is the subnet where a deamon operates (for
example the Ceph Public vs Cluster network).
Change-Id: I28e21c46f1ef609517175f7e7ee19e28d1c0cba2
|
|
This new directory has now been added to the RDO packaging so we
can move things common to both puppet/container architecture here,
starting with the recently combined services.yaml
Change-Id: If2ce27188c4c15002b3ad830e8d6eb9504d2f3d2
|
|
Move to one common services.yaml not only reduces the duplication, but it
should improve performance for the docker/services.yaml case, because we were
creating two ResourceChains with $many services which we know can be really
slow (especially since we seem to be missing concurrent: true on one)
Change-Id: I76f188438bfc6449b152c2861d99738e6eb3c61b
|