Diffstat (limited to 'docs/contributor/upstream-overcloud-container-design.rst')
-rw-r--r-- docs/contributor/upstream-overcloud-container-design.rst | 126
1 files changed, 126 insertions, 0 deletions
diff --git a/docs/contributor/upstream-overcloud-container-design.rst b/docs/contributor/upstream-overcloud-container-design.rst
new file mode 100644
index 0000000..4b368c2
--- /dev/null
+++ b/docs/contributor/upstream-overcloud-container-design.rst
@@ -0,0 +1,126 @@
+=======================================
+Overcloud Container Design/Architecture
+=======================================
+
+This document describes the changes made to implement container deployments in
+Apex.
+
+ * OOO container architecture
+ * Upstream vs. Downstream deployment
+ * Apex container deployment
+
+OOO container architecture
+--------------------------
+
+Typically in TripleO (OOO), each OpenStack service is represented by a TripleO
+Heat Template stored under the puppet/services directory in the TripleO Heat
+Templates (THT) code base. For containers, new templates are created in the
+docker/services directory, covering most of the previously defined puppet
+services. In almost all cases, these docker templates inherit their puppet
+template counterpart and build on it to provide OOO docker-specific
+configuration.
+
+Container configuration in OOO is still done via puppet, and the config files
+are then copied into a host directory to be mounted later in the service
+container during deployment. The docker template contains docker-specific
+settings for the service, including which files to mount into the container
+and which puppet resources to execute. Note that the puppet code is
+still stored locally on the host, while the service python code is stored in
+the container image.
+
+RDO has its own registry which stores the Docker images per service to use in
+deployments. The container image is usually just a CentOS 7 container with the
+relevant service RPM installed.
+
+In addition, Ceph no longer uses puppet to deploy. puppet-ceph was previously
+used to configure Ceph on the overcloud, but has been replaced with
+Ceph-Ansible. During container deployment, the undercloud calls a mistral
+workflow to initiate a Ceph-Ansible playbook that will download the Ceph Daemon
+container image to the overcloud and configure it.
+
+Upstream vs. Downstream deployment
+----------------------------------
+
+In Apex we typically build artifacts and then deploy from them. This worked in
+the past because we usually modified disk images (qcow2s) with files or patches
+and distributed them as RPMs. However, with containers, space becomes an issue.
+The size of each container image ranges from 800 MB to over 2 GB. This makes it
+infeasible to download all of the possible images and store them in a disk
+image for distribution.
+
+Therefore, for container deployments the only option is to deploy using
+upstream. This means that only upstream undercloud/overcloud images are pulled
+at deploy time, and the required containers are pulled with docker into the
+undercloud during deployment. For upstream deployments, the modification time
+of the RDO images is checked and the images are cached locally, to avoid
+unnecessary downloading of artifacts. Also, the optional '--no-fetch' argument
+may be provided at deploy time to skip pulling any new images, as long as
+previous artifacts are cached locally.
+
+Apex container deployment
+-------------------------
+
+For deploying containers with Apex, a new deploy setting is available,
+'containers'. When this setting is used along with '--upstream', the following
+workflow occurs:
+
+ 1. The upstream RDO images for undercloud/overcloud are checked and
+ downloaded if necessary.
+ 2. The undercloud VM is installed and configured as in a normal deployment.
+ 3. The overcloud image prep method is called, which is now modified to handle
+ patches and containers. The method now returns the set of container
+ images that are going to be patched. Patching can be required, for example,
+ by a change in OpenDaylight version, or by patches included in the deploy
+ settings for the overcloud that include a python path.
+ 4. During the overcloud image prep, a new directory in the Apex tmp dir is
+ created called 'containers' which then includes sub-directories for each
+ docker image which is being patched (for example, 'containers/nova-api').
+ 5. A Dockerfile is created inside of the directory created in step 4, which
+ holds Dockerfile operations to rebuild the container with patches or any
+ required changes. Several container images could be used for different
+ services inside of an OS project. For example, there are different images
+ for each nova service (nova-api, nova-conductor, nova-compute). Therefore
+ a lookup is done to figure out all of the container images that a
+ provided nova patch, for instance, would apply to. Then a directory and
+ Dockerfile are created for each image. All of this is tar'ed and
+ compressed into an archive which will be copied to the undercloud.
+ 6. Next, the deployment is checked to see if a Ceph device was provided in
+ Apex settings. If it was not, then a persistent loop device is created
+ in the overcloud image to serve as the storage backend for Ceph OSDs. Apex
+ previously used a directory '/srv/data' to serve as the backend to the
+ OSDs, but that is no longer supported with Ceph-Ansible.
+ 7. The deployment command is then created, as usual, but with minor changes
+ to add docker.yaml and docker-ha.yaml files which are required to deploy
+ containers with OOO.
+ 8. Next a new playbook is executed, 'prepare_overcloud_containers.yaml',
+ which includes several steps:
+
+ a. The previously archived docker image patches are copied and unpacked
+ into /home/stack.
+ b. 'overcloud_containers' and 'sdn_containers' image files are then
+ prepared. These are simply yaml files that indicate which docker
+ images to pull and where to store them, which in our case is a
+ local docker registry.
+ c. The docker images are then pulled and stored into the local registry.
+ The reason for using a local registry is to then have a static source
+ of images that do not change every time a user deploys. This allows
+ for more control and predictability in deployments.
+ d. Next, the images in the local registry are cross-checked against
+ the images that were previously collected as requiring patches. Any
+ image which then exists in the local registry and also requires changes
+ is then rebuilt by the docker build command, tagged with 'apex' and
+ then pushed into the local registry. This helps the user distinguish
+ which containers have been modified by Apex, in case any debugging is
+ needed in comparing upstream docker images with Apex modifications.
+ e. Then new OOO image files are created, to indicate to OOO that the
+ docker images to use for deployment are the ones in the local registry.
+ Also, the images modified by Apex are referenced with the 'apex' tag.
+ f. The relevant Ceph Daemon Docker image is pulled and pushed into the
+ local registry for deployment.
+ 9. At this point the OOO deployment command is initiated as in regular
+ Apex deployments. Each container will be started on the overcloud and
+ puppet executed in it to generate the configuration files in OOO deployment
+ Step 1. This leads to Step 1 taking longer than it did in non-containerized
+ deployments. Following this step, the containers are then brought up in
+ their regular step order, while mounting the previously generated
+ configuration files.
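
Steps 4 and 5 of the workflow above (look up the images a patch applies to, create one directory and Dockerfile per image, then archive the result) can be sketched roughly as follows. This is an illustrative sketch only, not the Apex implementation: the `PROJECT_IMAGES` table, the `build_patch_archive` function, the archive name, the registry address, and the Dockerfile instructions (including the assumption that a `patch` binary exists in the image) are all hypothetical.

```python
import tarfile
import tempfile
from pathlib import Path

# Hypothetical lookup table: which container images are built from which
# OpenStack project. The nova entries come from the example in the text;
# the real Apex lookup is not shown in this document.
PROJECT_IMAGES = {
    "nova": ["nova-api", "nova-conductor", "nova-compute"],
}


def build_patch_archive(tmp_dir, project, patch_file, python_path, registry):
    """Create containers/<image>/Dockerfile for every image that a patch to
    `project` applies to, then tar and compress the whole tree."""
    containers_dir = Path(tmp_dir) / "containers"
    containers_dir.mkdir(parents=True, exist_ok=True)

    for image in PROJECT_IMAGES.get(project, []):
        image_dir = containers_dir / image
        image_dir.mkdir(exist_ok=True)
        # Assumed Dockerfile contents: start from the unmodified image and
        # apply the patch under the given python path.
        lines = [
            f"FROM {registry}/{image}",
            f"COPY {patch_file} /tmp/{patch_file}",
            f"RUN patch -p1 -d {python_path} < /tmp/{patch_file}",
        ]
        (image_dir / "Dockerfile").write_text("\n".join(lines) + "\n")

    # Hypothetical archive name; this is the file that would be copied to
    # the undercloud.
    archive = Path(tmp_dir) / "docker_patches.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(containers_dir, arcname="containers")
    return archive


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = build_patch_archive(
            tmp,
            project="nova",
            patch_file="example.patch",  # hypothetical patch name
            python_path="/usr/lib/python2.7/site-packages",  # assumed path
            registry="192.0.2.1:8787/example",  # placeholder registry
        )
        with tarfile.open(path) as tar:
            for name in tar.getnames():
                print(name)
```

Keeping one directory per image means each Dockerfile can later be rebuilt and tagged independently, which matches the per-image rebuild described in step 8d.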