From f6dbb3929d904b4d5a9ee01f8270051e29ac1ec3 Mon Sep 17 00:00:00 2001
From: Tim Rozet
Date: Mon, 4 Dec 2017 11:20:23 -0500
Subject: Enables containerized overcloud deployments

Changes Include:
  - For upstream deployments, Docker local registry will be updated with
    latest current RDO containers, regular deployments will use latest
    stable
  - Upstream container images will then be patched/modified and then
    re-uploaded into local docker registry with 'apex' tag
  - Deployment command modified to deploy with containers
  - Adds a --no-fetch deployment argument to disable pulling latest
    from upstream, and instead using what already exists in cache
  - Moves Undercloud NAT setup to just after undercloud is installed.
    This provides internet during overcloud install which is now
    required for upstream container deployments.
  - Creates loop device for Ceph deployment when no device is
    provided in deploy settings (for container deployment only)
  - Updates NIC J2 template to use the new format in OOO since the
    os-apply-config method is now deprecated in > Queens

JIRA: APEX-566
JIRA: APEX-549

Change-Id: I0652c194c059b915a942ac7401936e8f5c69d1fa
Signed-off-by: Tim Rozet
---
 .../upstream-overcloud-container-design.rst | 126 +++++++++++++++++++++
 1 file changed, 126 insertions(+)
 create mode 100644 docs/contributor/upstream-overcloud-container-design.rst

(limited to 'docs')

diff --git a/docs/contributor/upstream-overcloud-container-design.rst b/docs/contributor/upstream-overcloud-container-design.rst
new file mode 100644
index 00000000..4b368c2e
--- /dev/null
+++ b/docs/contributor/upstream-overcloud-container-design.rst
@@ -0,0 +1,126 @@
+=======================================
+Overcloud Container Design/Architecture
+=======================================
+
+This document describes the changes done to implement container deployments in
+Apex.
+
+ * OOO container architecture
+ * Upstream vs. Downstream deployment
+ * Apex container deployment overview
+
+OOO container architecture
+--------------------------
+
+Typically in OOO each OpenStack service is represented by a TripleO Heat
+Template stored under the puppet/services directory in the THT code base. For
+containers, there are new templates created in the docker/services directory
+which include templates for most of the previously defined puppet services.
+These docker templates in almost all cases inherit from their puppet template
+counterpart and then build on it to provide OOO docker-specific
+configuration.
+
+Container configuration in OOO is still done via puppet, and config files
+are then copied into a host directory to be later mounted in the service
+container during deployment. The docker template contains docker-specific
+settings for the service, including what files to mount into the container,
+along with which puppet resources to execute, etc. Note that the puppet code
+is still stored locally on the host, while the service python code is stored
+in the container image.
+
+RDO has its own registry which stores the Docker images per service to use in
+deployments. The container image is usually just a CentOS 7 container with the
+relevant service RPM installed.
+
+In addition, Ceph no longer uses puppet to deploy. puppet-ceph was previously
+used to configure Ceph on the overcloud, but has been replaced with
+Ceph-Ansible. During container deployment, the undercloud calls a mistral
+workflow to initiate a Ceph-Ansible playbook that will download the Ceph Daemon
+container image to the overcloud and configure it.
+
+Upstream vs. Downstream deployment
+----------------------------------
+
+In Apex we typically build artifacts and then deploy from them. This worked in
+the past as we usually modify disk images (qcow2s) with files or patches and
+distribute them as RPMs. However, with containers, space becomes an issue.
+The size of each container image ranges from 800 MB to over 2 GB. This makes
+it infeasible to download all of the possible images and store them in a disk
+image for distribution.
+
+Therefore for container deployments the only option is to deploy using
+upstream. This means that only upstream undercloud/overcloud images are pulled
+at deploy time, and the required containers are docker pulled during deployment
+into the undercloud. For upstream deployments the modified time of the RDO
+images is checked and the images are cached locally, to avoid unnecessary
+downloading of artifacts. Also, the optional '--no-fetch' argument may be
+provided at deploy time to skip pulling any new images, as long as previous
+artifacts are cached locally.
+
+Apex container deployment
+-------------------------
+
+For deploying containers with Apex, a new deploy setting is available,
+'containers'. When this flag is used along with '--upstream', the following
+workflow occurs:
+
+ 1. The upstream RDO images for undercloud/overcloud are checked and
+    downloaded if necessary.
+ 2. The undercloud VM is installed and configured as in a normal deployment.
+ 3. The overcloud prep image method is called, which is now modified for
+    patches and containers. The method now returns the set of container
+    images which are going to be patched. Patching can be required by a
+    change in OpenDaylight version, for example, or by patches included in
+    the deploy settings for the overcloud that include a python path.
+ 4. During the overcloud image prep, a new directory called 'containers' is
+    created in the Apex tmp dir, with a sub-directory for each docker image
+    which is being patched (for example, 'containers/nova-api').
+ 5. A Dockerfile is created inside of the directory created in step 4, which
+    holds Dockerfile operations to rebuild the container with patches or any
+    required changes. Several container images could be used for different
+    services inside of an OS project.
+    For example, there are different images for each nova service (nova-api,
+    nova-conductor, nova-compute). Therefore a lookup is done to find all of
+    the container images that a hypothetical nova patch would apply to. Then a
+    directory and Dockerfile are created for each image. All of this is tar'ed
+    and compressed into an archive which will be copied to the undercloud.
+ 6. Next, the deployment is checked to see if a Ceph device was provided in
+    Apex settings. If it was not, then a persistent loop device is created
+    in the overcloud image to serve as storage backend for Ceph OSDs. Apex
+    previously used a directory '/srv/data' to serve as the backend to the
+    OSDs, but that is no longer supported with Ceph-Ansible.
+ 7. The deployment command is then created, as usual, but with minor changes
+    to add docker.yaml and docker-ha.yaml files which are required to deploy
+    containers with OOO.
+ 8. Next a new playbook is executed, 'prepare_overcloud_containers.yaml',
+    which includes several steps:
+
+    a. The previously archived docker image patches are copied and unpacked
+       into /home/stack.
+    b. 'overcloud_containers' and 'sdn_containers' image files are then
+       prepared. These are basically just yaml files which indicate which
+       docker images to pull and where to store them, which in our case is
+       a local docker registry.
+    c. The docker images are then pulled and stored into the local registry.
+       The reason for using a local registry is to then have a static source
+       of images that do not change every time a user deploys. This allows
+       for more control and predictability in deployments.
+    d. Next, the images in the local registry are cross-checked against
+       the images that were previously collected as requiring patches. Any
+       image which exists in the local registry and also requires changes
+       is then rebuilt by the docker build command, tagged with 'apex' and
+       then pushed into the local registry.
+       This lets the user see which containers Apex modified, which helps
+       when debugging differences between upstream and Apex docker images.
+    e. Then new OOO image files are created, to indicate to OOO that the
+       docker images to use for deployment are the ones in the local registry.
+       Also, the images modified by Apex are listed with the 'apex' tag.
+    f. The relevant Ceph Daemon Docker image is pulled and pushed into the
+       local registry for deployment.
+ 9. At this point the OOO deployment command is initiated as in regular
+    Apex deployments. In OOO deployment Step 1, each container is started on
+    the overcloud and puppet is executed in it to generate the configuration
+    files. This makes Step 1 take longer than it does in non-containerized
+    deployments. Following this step, the containers are brought up in
+    their regular step order, while mounting the previously generated
+    configuration files.
-- 
cgit 1.2.3-korg
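
Steps 4 and 5 of the workflow described in this patch (one sub-directory and Dockerfile per affected image, then a compressed archive for the undercloud) could be sketched in Python roughly as follows. This is a minimal illustration and not the Apex implementation: the `PROJECT_IMAGES` lookup table, the function name, the `image_prefix` value, the archive name, and the Dockerfile contents are all assumptions made for the example.

```python
import os
import tarfile
import tempfile

# Hypothetical lookup from an OpenStack project to the container images
# built from it. Only the nova entry is taken from the design document.
PROJECT_IMAGES = {
    "nova": ["nova-api", "nova-conductor", "nova-compute"],
}

# Assumed Dockerfile layout: start from the upstream image, copy the patch
# in, and apply it under the python path given in the deploy settings.
DOCKERFILE_TEMPLATE = (
    "FROM {base_image}\n"
    "COPY {patch} /tmp/{patch}\n"
    "RUN cd {python_path} && patch -p1 < /tmp/{patch}\n"
)


def prepare_container_patches(tmp_dir, project, patch, python_path,
                              image_prefix="registry.example/centos-binary-"):
    """Create containers/<image>/Dockerfile per affected image and archive them."""
    containers_dir = os.path.join(tmp_dir, "containers")
    for image in PROJECT_IMAGES.get(project, []):
        image_dir = os.path.join(containers_dir, image)
        os.makedirs(image_dir, exist_ok=True)
        # The patch file itself would also have to be placed in image_dir so
        # that COPY can find it in the build context; that is omitted here.
        with open(os.path.join(image_dir, "Dockerfile"), "w") as fh:
            fh.write(DOCKERFILE_TEMPLATE.format(
                base_image=image_prefix + image,
                patch=patch,
                python_path=python_path,
            ))
    # Archive name is an assumption; the document only says the result is
    # tar'ed, compressed, and copied to the undercloud.
    archive = os.path.join(tmp_dir, "docker_patches.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(containers_dir, arcname="containers")
    return archive


if __name__ == "__main__":
    work_dir = tempfile.mkdtemp()
    path = prepare_container_patches(
        work_dir, "nova", "fix.patch", "/usr/lib/python2.7/site-packages")
    with tarfile.open(path) as tar:
        for name in tar.getnames():
            print(name)
```

Keeping one directory per image means each directory can later be handed to `docker build` as its own build context, which matches the rebuild step (8d) in the document.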