=======================================
Overcloud Container Design/Architecture
=======================================

This document describes the changes made to implement container deployments in
Apex.

 * OOO container architecture
 * Upstream vs. Downstream deployment
 * Apex container deployment

OOO container architecture
--------------------------

Typically in OOO (TripleO), each OpenStack service is represented by a TripleO
Heat Template stored under the puppet/services directory in the THT
(tripleo-heat-templates) code base.  For containers, new templates were created
in the docker/services directory, covering most of the previously defined
puppet services.  In almost all cases these docker templates inherit from their
puppet template counterpart and then build on it to provide OOO docker-specific
configuration.

Container configuration in OOO is still done via puppet, and the resulting
config files are copied into a host directory that is later mounted into the
service container during deployment.  The docker template contains the
docker-specific settings for the service, including which files to mount into
the container and which puppet resources to execute.  Note that the puppet code
is still stored locally on the host, while the service python code is stored in
the container image.

RDO has its own registry which stores the Docker images per service to use in
deployments.  The container image is usually just a CentOS 7 container with the
relevant service RPM installed.

In addition, Ceph is no longer deployed with puppet.  puppet-ceph was
previously used to configure Ceph on the overcloud, but has been replaced with
Ceph-Ansible.  During container deployment, the undercloud calls a Mistral
workflow to initiate a Ceph-Ansible playbook that downloads the Ceph Daemon
container image to the overcloud and configures it.

Upstream vs. Downstream deployment
----------------------------------

In Apex we typically build artifacts and then deploy from them.  This worked in
the past because we usually modified disk images (qcow2s) with files or patches
and distributed them as RPMs.  However, with containers, space becomes an
issue.  The size of each container image ranges from 800 MB to over 2 GB.  This
makes it infeasible to download all of the possible images and store them in a
disk image for distribution.

Therefore, for container deployments the only option is to deploy using
upstream.  This means that only upstream undercloud/overcloud images are pulled
at deploy time, and the required containers are pulled with docker into the
undercloud during deployment.  For upstream deployments the modified time of
the RDO images is checked and the images are cached locally, to avoid
unnecessary downloading of artifacts.  Also, the optional '--no-fetch' argument
may be provided at deploy time to skip pulling any new images, as long as
previous artifacts are cached locally.
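
As a rough illustration of the caching behaviour described above, the sketch
below decides whether an artifact should be downloaded again.  It is not the
Apex implementation: the function name ``needs_download`` is hypothetical, and
it assumes the remote modified time is available as an HTTP ``Last-Modified``
header value.

.. code-block:: python

    import email.utils
    import os


    def needs_download(local_path, remote_last_modified, no_fetch=False):
        """Return True if the artifact at local_path should be downloaded.

        remote_last_modified is assumed to be an HTTP Last-Modified header
        string, for example 'Wed, 21 Oct 2015 07:28:00 GMT'.
        """
        cached = os.path.exists(local_path)
        if no_fetch:
            # Mirrors '--no-fetch': never pull, but require a cached copy.
            if not cached:
                raise FileNotFoundError(
                    'no cached artifact at {}'.format(local_path))
            return False
        if not cached:
            return True
        # Convert the header to a UTC epoch timestamp and compare it with
        # the modification time of the cached file.
        remote_ts = email.utils.mktime_tz(
            email.utils.parsedate_tz(remote_last_modified))
        return remote_ts > os.path.getmtime(local_path)

A caller would only fetch the image when this returns ``True``.  Comparing
timestamps keeps the check cheap, since no image data has to be transferred to
decide whether the cache is still current.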

Apex container deployment
-------------------------

For deploying containers with Apex, a new deploy setting is available,
'containers'.  When this setting is enabled and the '--upstream' argument is
used, the following workflow occurs:

  1. The upstream RDO images for undercloud/overcloud are checked and
     downloaded if necessary.
  2. The undercloud VM is installed and configured as in a normal deployment.
  3. The overcloud prep image method is called, which has been modified to
     handle patches and containers.  The method now returns the set of
     container images that are going to be patched.  An image may need patching
     because of, for example, a change in OpenDaylight version, or because
     patches included in the deploy settings for the overcloud include a python
     path.
  4. During the overcloud image prep, a new directory in the Apex tmp dir is
     created called 'containers' which then includes sub-directories for each
     docker image which is being patched (for example, 'containers/nova-api').
  5. A Dockerfile is created inside of the directory created in step 4, which
     holds the Dockerfile operations to rebuild the container with patches or
     any required changes.  Several container images may be used for different
     services inside of an OpenStack project.  For example, there are different
     images for each nova service (nova-api, nova-conductor, nova-compute).
     Therefore a lookup is done to find all of the container images that a
     provided nova patch would apply to.  A directory and Dockerfile are then
     created for each image.  All of this is archived into a compressed
     tarball, which is copied to the undercloud.
  6. Next, the deployment is checked to see if a Ceph device was provided in
     Apex settings.  If it was not, then a persistent loop device is created
     in the overcloud image to serve as the storage backend for Ceph OSDs.
     Apex previously used a directory, '/srv/data', to serve as the backend to
     the OSDs, but that is no longer supported with Ceph-Ansible.
  7. The deployment command is then created, as usual, but with minor changes
     to add docker.yaml and docker-ha.yaml files which are required to deploy
     containers with OOO.
  8. Next a new playbook is executed, 'prepare_overcloud_containers.yaml',
     which includes several steps:

     a. The previously archived docker image patches are copied and unpacked
        into /home/stack.
     b. 'overcloud_containers' and 'sdn_containers' image files are then
        prepared.  These are yaml files which indicate which docker images to
        pull and where to store them, which in our case is a local docker
        registry.
     c. The docker images are then pulled and stored into the local registry.
        The reason for using a local registry is to then have a static source
        of images that do not change every time a user deploys.  This allows
        for more control and predictability in deployments.
     d. Next, the images in the local registry are cross-checked against
        the images that were previously collected as requiring patches.  Any
        image which then exists in the local registry and also requires changes
        is then rebuilt by the docker build command, tagged with 'apex' and
        then pushed into the local registry.  This helps the user distinguish
        which containers have been modified by Apex, in case any debugging is
        needed in comparing upstream docker images with Apex modifications.
     e. Then new OOO image files are created, to indicate to OOO that the
        docker images to use for deployment are the ones in the local registry.
        Also, the images modified by Apex are referenced with the 'apex' tag.
     f. The relevant Ceph Daemon Docker image is pulled and pushed into the
        local registry for deployment.

  9. At this point the OOO deployment command is initiated as in regular
     Apex deployments.  In OOO deployment Step 1, each container is started on
     the overcloud and puppet is executed in it to generate the configuration
     files.  This leads to Step 1 taking longer than it does in
     non-containerized deployments.  Following this step, the containers are
     brought up in their regular step order, with the previously generated
     configuration files mounted.
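
The patch preparation in steps 4, 5 and 8d can be sketched roughly as follows.
This is a minimal illustration under stated assumptions, not the Apex code:
the ``PROJECT_IMAGES`` mapping, all function names, the archive name and the
Dockerfile contents (including the use of the ``patch`` command inside the
image) are hypothetical.

.. code-block:: python

    import os
    import tarfile

    # Hypothetical lookup table from an OpenStack project to the container
    # images built from it.  The real lookup used by Apex is not shown here.
    PROJECT_IMAGES = {
        'nova': ['nova-api', 'nova-conductor', 'nova-compute'],
    }


    def images_for_project(project):
        """Return every container image a patch to project would apply to."""
        return PROJECT_IMAGES.get(project, [])


    def write_build_context(tmp_dir, image, base_image, patches, python_path):
        """Create <tmp_dir>/containers/<image>/Dockerfile and return its path.

        The patch files are assumed to be placed in the same directory so
        that COPY can find them at build time.
        """
        context = os.path.join(tmp_dir, 'containers', image)
        os.makedirs(context, exist_ok=True)
        lines = ['FROM {}'.format(base_image), 'USER root']
        for patch in patches:
            lines.append('COPY {} {}/'.format(patch, python_path))
            lines.append(
                'RUN cd {} && patch -p1 < {}'.format(python_path, patch))
        dockerfile = os.path.join(context, 'Dockerfile')
        with open(dockerfile, 'w') as fh:
            fh.write('\n'.join(lines) + '\n')
        return dockerfile


    def archive_contexts(tmp_dir, name='docker_patches.tar.gz'):
        """Tar and gzip the whole 'containers' directory for the undercloud."""
        archive = os.path.join(tmp_dir, name)
        with tarfile.open(archive, 'w:gz') as tar:
            tar.add(os.path.join(tmp_dir, 'containers'), arcname='containers')
        return archive


    def images_to_rebuild(registry_images, patched_images):
        """Cross-check: rebuild only images that are in the local registry
        and also require patches."""
        return sorted(set(registry_images) & set(patched_images))

For a nova patch, a caller would loop over ``images_for_project('nova')``,
write one build context per image, and archive the result once.  Each image
returned by ``images_to_rebuild`` would then be rebuilt with docker build,
tagged with 'apex' and pushed into the local registry, as described in step 8d.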