Overcloud Container Design/Architecture
This document describes the changes done to implement container deployments in
* OOO container architecture
* Upstream vs Downstream deployment
* Apex container deployment overview
OOO container architecture
Typically in OOO each OpenStack service is represented by a TripleO Heat
Template stored under the puppet/services directory in the THT code base. For
containers, there are new templates created in the docker/services directory
which include templates for most of the previously defined puppet services.
These docker templates in almost all cases inherit their puppet template
counterpart and then build off of that to provide OOO docker specific
The containers configuration in OOO is still done via puppet, and config files
are then copied into a host directory to be later mounted in the service
container during deployment. The docker template contains docker specific
settings to the service, including what files to mount into the container,
along with which puppet resources to execute, etc. Note, the puppet code is
still stored locally on the host, while the service python code is stored in
the container image.
RDO has its own registry which stores the Docker images per service to use in
deployments. The container image is usually just a CentOS 7 container with the
relevant service RPM installed.
In addition, Ceph no longer uses puppet to deploy. puppet-ceph was previously
used to configure Ceph on the overcloud, but has been replaced with
Ceph-Ansible. During container deployment, the undercloud calls a mistral
workflow to initiate a Ceph-Ansible playbook that will download the Ceph Daemon
container image to the overcloud and configure it.
Upstream vs. Downstream deployment
In Apex we typically build artifacts and then deploy from them. This works in
the past as we usually modify disk images (qcow2s) with files or patches and
distribute them as RPMs. However, with containers space becomes an issue. The
size of each container image ranges from 800 MB to over 2GB. This makes it
unfeasible to download all of the possible images and store them into a disk
image for distribution.
Therefore for container deployments the only option is to deploy using
upstream. This means that only upstream undercloud/overcloud images are pulled
at deploy time, and the required containers are docker pulled during deployment
into the undercloud. For upstream deployments the modified time of the
RDO images are checked and cached locally, to refrain from unnecessary
downloading of artifacts. Also, the optional '--no-fetch' argument may be
provided at deploy time, to ignore pulling any new images, as long as previous
artifacts are cached locally.
Apex container deployment
For deploying containers with Apex, a new deploy setting is available,
'containers'. When this flag is used, along with '--upstream' the following
1. The upstream RDO images for undercloud/overcloud are checked and
downloaded if necessary.
2. The undercloud VM is installed and configured as a normal deployment.
3. The overcloud prep image method is called which is modified now for
patches and containers. The method will now return a set of container
images which are going to be patched. These can be either due to a change
in OpenDaylight version for example, or patches included in the deploy
settings for the overcloud that include a python path.
4. During the overcloud image prep, a new directory in the Apex tmp dir is
created called 'containers' which then includes sub-directories for each
docker image which is being patched (for example, 'containers/nova-api').
5. A Dockerfile is created inside of the directory created in step 4, which
holds Dockerfile operations to rebuild the container with patches or any
required changes. Several container images could be used for different
services inside of an OS project. For example, there are different images
for each nova service (nova-api, nova-conductor, nova-compute). Therefore
a lookup is done to figure out all of the container images that a
hypothetically provided nova patch would apply to. Then a directory and
Dockerfile is created for each image. All of this is tar'ed and
compressed into an archive which will be copied to the undercloud.
6. Next, the deployment is checked to see if a Ceph devices was provided in
Apex settings. If it is not, then a persistent loop device is created
in the overcloud image to serve as storage backend for Ceph OSDs. Apex
previously used a directory '/srv/data' to serve as the backend to the
OSDs, but that is no longer supported with Ceph-Ansible.
7. The deployment command is then created, as usual, but with minor changes
to add docker.yaml and docker-ha.yaml files which are required to deploy
containers with OOO.
8. Next a new playbook is executed, 'prepare_overcloud_containers.yaml',
which includes several steps:
a. The previously archived docker image patches are copied and unpacked
b. 'overcloud_containers' and 'sdn_containers' image files are then
prepared which are basically just yaml files which indicate which
docker images to pull and where to store them. Which in our case is a
local docker registry.
c. The docker images are then pulled and stored into the local registry.
The reason for using a local registry is to then have a static source
of images that do not change every time a user deploys. This allows
for more control and predictability in deployments.
d. Next, the images in the local registry are cross-checked against
the images that were previously collected as requiring patches. Any
image which then exists in the local registry and also requires changes
is then rebuilt by the docker build command, tagged with 'apex' and
then pushed into the local registry. This helps the user distinguish
which containers have been modified by Apex, in case any debugging is
needed in comparing upstream docker images with Apex modifications.
e. Then new OOO image files are created, to indicate to OOO that the
docker images to use for deployment are the ones in the local registry.
Also, the ones modified by Apex are modified with the 'apex' tag.
f. The relevant Ceph Daemon Docker image is pulled and pushed into the
local registry for deployment.
9. At this point the OOO deployment command is initiated as in regular
Apex deployments. Each container will be started on the overcloud and
puppet executed in it to gather the configuration files in Step 1. This
leads to Step 1 taking longer than it used to in non-containerized
deployments. Following this step, the containers are then brought up in
their regular step order, while mounting the previously generated