path: root/docker
Age | Commit message | Author | Files | Lines
2017-10-30 | Update CephPools format in the docker templates to fit ceph-ansible | Giulio Fidente | 1 | -16/+4
    The format which ceph-ansible uses to describe the list of pools to be created in the cluster is different from the one which puppet-ceph uses; this commit updates the description and the docker templates accordingly.
    Change-Id: I1e5b2c3cbf6ae02c19a2275ca119fed6e173319d
    Closes-Bug: #1720373
    (cherry picked from commit c10aa7a0439fb7d8e8e964e75d73f3cbb54aa9ec)
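The kind of format difference described above can be sketched as follows. This is only an illustration under stated assumptions: it assumes the puppet-ceph style is a mapping keyed by pool name and the ceph-ansible style is a list of mappings that each carry a `name` key. The helper `pools_to_list` and the sample pool settings are hypothetical and are not taken from the templates.

```python
def pools_to_list(pools_by_name):
    """Convert {pool_name: settings} into a list of settings with a 'name' key.

    Hypothetical helper: the input and output shapes are assumptions made
    for illustration, not the exact schema used by either tool.
    """
    return [
        dict(settings, name=name)
        for name, settings in sorted(pools_by_name.items())
    ]


# Illustrative data only.
old_style = {"volumes": {"pg_num": 128}, "images": {"pg_num": 64}}
new_style = pools_to_list(old_style)
```

Sorting by pool name keeps the resulting list stable between runs, which avoids spurious changes when the output is rendered into a template.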
2017-10-23 | Merge "Also match config volumes for /var/lib/config-data/puppet-generated/" into stable/pike | Zuul | 1 | -5/+7
2017-10-23 | Merge "Disable xinetd class when creating swift-storage puppet configuration" into stable/pike | Zuul | 1 | -1/+4
2017-10-19 | Disable xinetd class when creating swift-storage puppet configuration | Michele Baldessari | 1 | -1/+4
    Because puppet was invoked without --detailed-exitcodes, we ignored a large number of puppet errors during deploy. Swift storage fails during the puppet_config step with the following error:
        Debug: /Stage[main]/Swift::Storage::Object/Swift::Storage::Generic[object]/Package[swift-object]: Not tagged with file, file_line, concat, augeas, cron, swift_proxy_config, swift_config, swift_container_config, swift_container_sync_realms_config, swift_account_config, swift_object_config, swift_object_expirer_config, rsync::server
        Debug: /Stage[main]/Swift::Storage::Object/Swift::Storage::Generic[object]/Package[swift-object]: Resource is being skipped, unscheduling all events
        Debug: Executing: '/usr/bin/systemctl is-active xinetd'
        Debug: Executing: '/usr/bin/systemctl is-enabled xinetd'
        Debug: Executing: '/usr/bin/systemctl unmask xinetd'
        Debug: Executing: '/usr/bin/systemctl start xinetd'
        Debug: Runing journalctl command to get logs for systemd start failure: journalctl -n 50 --since '5 minutes ago' -u xinetd --no-pager
        Debug: Executing: 'journalctl -n 50 --since '5 minutes ago' -u xinetd --no-pager'
        Error: Systemd start for xinetd failed!
    The problem is that by using the rsync::server tag we end up including the xinetd class automatically, which will try to start a service inside a container. By nooping the xinetd class, we're able to avoid systemctl calls and have a successful deployment.
    The resulting swift_rsync container seems to work correctly:
        [root@overcloud-controller-0 ~]# docker exec -it swift_rsync /bin/bash -c "ps -axuwf"
        USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        root 10 0.0 0.0 47444 1624 pts/1 Rs+ 18:16 0:00 ps -axuwf
        root 1 0.0 0.0 188 4 ? Ss 17:27 0:00 /usr/local/bin/dumb-init /bin/bash /usr/local/bin/kolla_start
        root 6 0.0 0.0 11036 924 ? Ss 17:27 0:00 /usr/bin/rsync --daemon --no-detach --config=/etc/rsyncd.conf
        [root@overcloud-controller-0 ~]# docker logs swift_rsync 2>&1|tail -n4
        INFO:__main__:Deleting /etc/rsyncd.conf
        INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/rsyncd.conf to /etc/rsyncd.conf
        INFO:__main__:Writing out command to execute
        Running command: '/usr/bin/rsync --daemon --no-detach --config=/etc/rsyncd.conf'
    Change-Id: I5e43e8fd61e002d2acc56a7de52e6aae64ab60be
    Closes-Bug: #1723463
    (cherry picked from commit b5eeeab73e12efecc86ea7deebc105eee0739510)
2017-10-18 | Also match config volumes for /var/lib/config-data/puppet-generated/ | Steven Hardy | 1 | -5/+7
    Some services only mount this directory, not /var/lib/config-data/$service, so handle this case in the docker-puppet code that maps the mounted volumes to the services when adding the config hash to the container environment.
    Change-Id: I3bdb7609f322458584ac9597ffbfefb057b84646
    Closes-Bug: #1720208
    (cherry picked from commit 3a932b056914d148fa460b8890fc0e631c817a40)
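The volume matching described above could look roughly like the sketch below. It is not the actual docker-puppet code: the function name `config_volume_for`, the prefix tuple, and the sample mount strings are hypothetical, and the sketch assumes each volume is given as a "host_path:container_path[:options]" string.

```python
# Prefixes under which generated configuration may be mounted.
# The longer, more specific prefix is listed first so it wins.
CONFIG_PREFIXES = (
    "/var/lib/config-data/puppet-generated/",
    "/var/lib/config-data/",
)


def config_volume_for(volume_spec):
    """Return the service directory name for a volume spec, or None.

    Hypothetical helper for illustration only.
    """
    host_path = volume_spec.split(":", 1)[0]
    for prefix in CONFIG_PREFIXES:
        if host_path.startswith(prefix):
            remainder = host_path[len(prefix):]
            # The first path component after the prefix names the service.
            name = remainder.split("/", 1)[0]
            return name or None
    return None
```

Checking the longer prefix first matters: otherwise a path under puppet-generated/ would match the shorter prefix and yield "puppet-generated" instead of the service name.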
2017-10-14 | Remove monitor_interface from ceph-ansible parameters | Giulio Fidente | 2 | -2/+0
    We should not pass any hardcoded value for monitor_interface and should rely on monitor_address_block only. Also removes journal_collocation, which is not consumed by newer (and stable) builds of ceph-ansible.
    Change-Id: Idf213a1f43a66506f76d07102f122839b5096948
    Closes-Bug: #1715246
    (cherry picked from commit 3e90ae3df5a7c5491672254733ceac163b34a395)
2017-10-10 | Merge "Remove package if service stopped and disabled" into stable/pike | Jenkins | 30 | -3/+282
2017-10-10 | Merge "Adds pacemaker update_tasks for Pike minor update workflow" into stable/pike | Jenkins | 10 | -3/+239
2017-10-10 | Merge "Make containerized galera use mysql_network everywhere" into stable/pike | Jenkins | 1 | -0/+6
2017-10-10 | Merge "Create mysql user for non-ha deployments" into stable/pike | Jenkins | 1 | -5/+21
2017-10-10 | Merge "List all unhealthy containers" into stable/pike | Jenkins | 1 | -1/+5
2017-10-09 | Remove package if service stopped and disabled | marios | 30 | -3/+282
    Adds an UpgradeRemoveUnusedPackages param to use in the ansible 'when' conditional for the removal.
    Adds package removal to step2, right after a service is stopped and disabled in step2. Package updates happen in step3, so ideally packages are removed before that. The package removal task has ignore_errors set to true, so dependencies or other issues removing packages will not fail the upgrade workflow.
    Also adds the param to the upgrade environment files for visibility, defaulting to false.
    Change-Id: Ie4e4a2d41f7752c5a13507a7c15c6f68e203cfca
    Related-Bug: 1701501
    (cherry picked from commit ce0ef2fa207698c1ae61c1620fe3c5e8d1c7bfca)
2017-10-09 | Adds pacemaker update_tasks for Pike minor update workflow | marios | 10 | -3/+239
    Adds update_tasks for the minor update workflow. These will be collected into playbooks during an initial 'update init' heat stack update and then invoked later by the operator as ansible playbooks.
    Current understanding/workflow:
        Step=1: Stop the cluster on the updated node
        Step=2: Pull the latest image and retag it pcmklatest
        Step=3: yum upgrade happens on the host
        Step=4: Restart the cluster on the node
        Step=5: Verification: test that pacemaker services are running
    https://etherpad.openstack.org/p/tripleo-pike-updates-upgrades
    Related-Bug: 1715557
    Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
    Co-Authored-By: Sofer Athlan-Guyot <sathlang@redhat.com>
    Change-Id: I101e0f5d221045fbf94fb9dc11a2f30706843806
    (cherry picked from commit a953bda0ae615dc44d3e8a70aa7ab0160e26f3af)
2017-10-09 | Merge "docker: add logging(source & groups)" into stable/pike | Jenkins | 82 | -7/+165
2017-10-09 | List all unhealthy containers | Martin Mágr | 1 | -1/+5
    Currently the default Sensu check defined in docker/services/sensu-client.yaml reports only the first unhealthy container. This patch changes the check output to contain a list of all unhealthy containers.
    Change-Id: I0a934367ef22984d9091d160ec7105092edc8149
    Closes-Bug: #1720972
    (cherry picked from commit 9b016c9f3fbe9552497737974b9928d1dff4d299)
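The behaviour change can be illustrated with a small sketch. It does not reproduce the check defined in sensu-client.yaml: the function `check_unhealthy`, its input mapping, and the message wording are hypothetical. The only convention relied on is that a Sensu check signals OK with exit status 0 and critical with exit status 2.

```python
def check_unhealthy(health_by_container):
    """Return (exit_code, message) naming every unhealthy container.

    health_by_container maps a container name to its health status string.
    Hypothetical helper for illustration only.
    """
    unhealthy = sorted(
        name
        for name, status in health_by_container.items()
        if status == "unhealthy"
    )
    if unhealthy:
        # Report all offenders, not just the first one found.
        return 2, "CRITICAL: unhealthy containers: " + ", ".join(unhealthy)
    return 0, "OK: no unhealthy containers"
```

Listing every container in one message saves the operator from fixing one container only to have the check fail again on the next.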
2017-10-09 | Create mysql user for non-ha deployments | Martin Mágr | 1 | -5/+21
    Currently the health check for the mysql container reports the container as unhealthy because no 'mysql' user is created. This patch creates the user during mysql_bootstrap without any permissions, just to allow the health check to connect to the DB and run 'select 1'.
    Change-Id: Iab26da0d30939b219189d4e7beb2a61d456ab7c3
    Closes-Bug: #1718944
    (cherry picked from commit 3a9cfaa992e92423461d64f84d701336322bdd10)
2017-10-09 | docker: add logging(source & groups) | Juan Badia Payno | 82 | -7/+165
    The services that the docker templates depend on have logging_source and logging_groups, but those are not set in the docker outputs, so they are not used when containers are deployed.
    Added logging_source & logging_groups as optional docker parameters in tools/yaml-validate.py.
    Closes-Bug: #1718110
    Change-Id: I8795eaf4bd06051e9b94aa50450dee0d8761e526
    (cherry picked from commit 5dbe1121e98a794ec6a6387ff56ee34314177567)
2017-10-09 | Containerized Fluentd client | Juan Badia Payno | 1 | -0/+121
    Change-Id: Ia350e4899aa499cf27efffd9d2243e7e95fa1d65
    Depends-On: I60796063fa9ebe0d98030fb982d22dabe2593ea0
    Depends-On: I585b6877074353b5de62e5efaabfbe62432c473d
    (cherry picked from commit f37fe4f903f429b43d22b485c29547f576ec7269)
2017-10-07 | Make containerized galera use mysql_network everywhere | Damien Ciabrini | 1 | -0/+6
    The containerized galera service generates a galera.cnf which uses the short hostname to identify itself rather than the fqdn from the mysql_network (e.g. overcloud-x.internalapi.cloudname). This breaks when internal TLS is in use, because the mysql certificate does not reference this short hostname.
    Fix the appropriate hiera parameter to make it behave like the non-containerized galera service.
    Change-Id: I904cde38f2baeddab5178e8ad48d34a0c73629af
    Closes-Bug: #1719599
    (cherry picked from commit e10aa591dc9155a2746df01279c4ba4f2133fd17)
2017-10-07 | Support for Ocata-Pike live-migration over ssh | Oliver Walsh | 4 | -6/+104
    In Ocata all live-migration over ssh is performed on the default ssh port (22). In Pike the containerized live-migration over ssh is on port 2022, as the docker host's sshd is using port 22.
    To allow live migration during upgrade we need to temporarily pin the Pike computes to port 22; in the final converge we can switch over to port 2022.
    This also changes the default port to 2022 for baremetal computes in Pike to enable live-migration between baremetal and containerized computes.
    Change-Id: Icb9bfdd9a99dc1dce28eb95c50a9a36bffa621b1
    Depends-On: I0b80b81711f683be539939e7d084365ff63546d3
    Closes-Bug: 1714171
    (cherry picked from commit 17fd16b9f266e1aa67bf03ebdf309e89d668ada2)
2017-09-28 | Make CephConfigOverrides append to ceph.conf[global] | Giulio Fidente | 1 | -4/+4
    Previously it was mistakenly replacing the contents because we do not do a deep merge.
    Change-Id: I145feb0208f135da7c71694ebcecd937244d66b1
    Closes-Bug: #1719919
    (cherry picked from commit 17416dcfc56c5148ccc9ab40297f99adfdcd085b)
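The difference between replacing and appending can be shown with a generic sketch. This is not how the templates implement the fix; `deep_merge` is a hypothetical helper, and the option names and values are illustrative only.

```python
import copy


def deep_merge(base, overrides):
    """Recursively merge overrides into a copy of base.

    Nested dicts are merged key by key; any other value in overrides
    replaces the value in base. Hypothetical helper for illustration.
    """
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Illustrative settings only.
managed = {"global": {"osd_pool_default_size": 3}}
user = {"global": {"max_open_files": 131072}}

# A shallow merge replaces the whole "global" section with the user's one.
shallow = dict(managed, **user)

# A deep merge keeps the managed keys and appends the user's keys.
merged = deep_merge(managed, user)
```

With the shallow merge, `osd_pool_default_size` disappears from the `global` section, which is the kind of replacement the commit message describes; the deep merge keeps both keys.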
2017-09-25 | Merge "Rename service_workflow_tasks into workflow_tasks" into stable/pike | Jenkins | 7 | -7/+7
2017-09-22 | Set Ceph pgp_num after pg_num | Giulio Fidente | 1 | -1/+2
    We failed to set the pgp_num default in ceph.conf, causing WARNING messages like:
        pool default.rgw.buckets.data pg_num 32 > pgp_num 8
    Also increases the default pg_num to 128, which is the recommended value for fewer than 5 OSDs [1].
    [1] http://docs.ceph.com/docs/master/rados/operations/placement-groups/
    Change-Id: Ibd9fb23e04576e95e24af58f856663397886a947
    Closes-Bug: #1718173
    (cherry picked from commit 58e6f6533a04eddd2dc897d890737bbccde4ea7b)
2017-09-21 | Merge "Use haproxy-systemd-wrapper as pid1 in containerized Haproxy" into stable/pike | Jenkins | 2 | -6/+4
2017-09-21 | Merge "Disable all uses of wsrep-provider in mysql_bootstrap container" into stable/pike | Jenkins | 1 | -2/+4
2017-09-20 | Use haproxy-systemd-wrapper as pid1 in containerized Haproxy | Damien Ciabrini | 2 | -6/+4
    This wrapper binary spawns the HAProxy daemon and implements a coordinated HAProxy restart on SIGHUP. From a service's perspective, this allows reloading the HAProxy configuration with minimal service disruption, i.e. without stopping and restarting the HAProxy container.
    Closes-Bug: #1717521
    Change-Id: Ib3ef0c0bcf1a8151e179ff4d7509cf0d6b3ac5a1
    (cherry picked from commit 91cd44cd7266c15ce07fafbee9d2e33f226096ba)
2017-09-20 | Disable all uses of wsrep-provider in mysql_bootstrap container | Damien Ciabrini | 1 | -2/+4
    During the bootstrap of the mariadb database, galera replication must be disabled while the user credentials are being set up. This is done by setting wsrep-provider=none when starting mysqld_safe.
    Icf67fd2fbf520e8a62405b4d49e8d5169ff3925b already disabled it when the clustercheck credentials are being set up, but Kolla also starts a temporary server for setting up the root password.
    Disable the setting directly at the end of the mysql.cnf in the running container. That way, the default setting from galera.cnf will be overridden, all mysqld_safe calls will disable WSREP, and the setting will stay ephemeral.
    Change-Id: If14e22992b46a35a05a16a9db5ecb360ea13df8f
    Closes-Bug: #1717250
    (cherry picked from commit b0f50db80b10e9cd6263c4d6b3ca8dd818b658ba)
2017-09-19 | Run gnocchi statsd and metrcd at step 5 | Dan Prince | 2 | -2/+2
    Running these daemons at step 5 should avoid the error messages seen in the gnocchi-statsd log files when it starts at step 4.
    Change-Id: Idb82f864a2e1c623dab7a2a87054443036670453
    Closes-bug: #1713182
    (cherry picked from commit 9d8e496f3e8a825d48d9eba9aab540001bb780ea)
2017-09-15 | One time delete pacemaker resources during upgrade to containers | Marius Cornea | 4 | -8/+40
    This change allows running the major upgrade composable docker steps multiple times by not trying to delete the pacemaker resources if they're not reported as started or in master state.
    Closes-bug: 1716031
    Depends-On: I8da03f5c4a6d442617b81be5793a9724cc8842bf
    Change-Id: Ifcf9de8c82550a90a9fb118052d43fdbcdc6ca7e
    (cherry picked from commit 64d7be1e3d4552e06cbc53f788572e530cc5c3bb)
2017-09-14 | Rename service_workflow_tasks into workflow_tasks | Giulio Fidente | 7 | -7/+7
    Using the service_ prefix seems inconsistent with its use in service_config_settings (vs config_settings).
    Change-Id: Ia39f181415bee0071409dabddfa0c5c312915e1f
    (cherry picked from commit 09137304b98a02ed024c0288da907cfe35ca5fe1)
2017-09-14 | Retry if the pacemaker_resource commands failed | Mathieu Bultel | 6 | -0/+36
    Add a retry when the pacemaker_resource command wasn't applied correctly. More info here: https://bugzilla.redhat.com/show_bug.cgi?id=1482116
    This is the same approach puppet-pacemaker uses, and it provides eventual consistency when multiple nodes change the cluster CIB concurrently.
    This change depends on https://review.gerrithub.io/375982, since the return code is not available in the current ansible-pacemaker package.
    Change-Id: I8da03f5c4a6d442617b81be5793a9724cc8842bf
    (cherry picked from commit e92430d8d03fc2ce2d0ce192b96209f2c5c04169)
2017-09-13 | Merge "Enable redis TLS proxy in HA deployments" into stable/pike | Jenkins | 1 | -26/+67
2017-09-13 | Merge "Add CephConfigOverrides to allow arbitrary configs in ceph.conf" into stable/pike | Jenkins | 2 | -11/+19
2017-09-13 | Merge "Enable selinux in containers" into stable/pike | Jenkins | 1 | -0/+1
2017-09-13 | Merge "Add verbose output to containerized cell_v2 host discovery" into stable/pike | Jenkins | 1 | -1/+1
2017-09-12 | Add CephConfigOverrides to allow arbitrary configs in ceph.conf | Giulio Fidente | 2 | -11/+19
    We need to reuse the ceph_conf_overrides structure provided by ceph-ansible for both user-provided configs and TripleO-managed configs. This change merges the special user-facing parameter with the TripleO-generated configs.
    Also adds osd_scenario and osd_objectstore params for compatibility with newer ceph-ansible versions.
    Change-Id: I29c689c6c689590da5b6a3f581fdbec98a52e207
    Closes-Bug: #1715321
    (cherry picked from commit 32bc2abf14af4ca1449e18b848e2be3cff013987)
2017-09-12 | Merge "Add panko config to ceilometer notification agent container" into stable/pike | Jenkins | 1 | -0/+9
2017-09-12 | Merge "Add a docker pull retry to docker-puppet.py" into stable/pike | Jenkins | 1 | -4/+18
2017-09-12 | Merge "Persist containerized services httpd logs" into stable/pike | Jenkins | 18 | -22/+118
2017-09-11 | Enable selinux in containers | Oliver Walsh | 1 | -0/+1
    We cannot use the --selinux-enabled docker daemon option on CentOS/RHEL 7.3. It will fail if security_inode_copy_up is not found in the kernel symbols:
    https://github.com/projectatomic/docker/blob/docker-1.12.6/daemon/daemon_unix.go#L661
    NB this has been reduced to a warning upstream:
    https://github.com/moby/moby/commit/885b29df096db1d6746ece4b3a298a1ffe85716d
    Instead this just bind mounts /sys/fs/selinux in containers-common.yaml.
    Everything appears to work at first glance. Pingtest succeeds, and live-migration between baremetal and containerized computes works.
    Change-Id: I018221bf7ae9ab9ece193b55f1ce31eb1591046c
    Closes-bug: #1715171
    (cherry picked from commit 520f889a31f1ea6ee2bad86d1dbb3c0435604d10)
2017-09-11 | Add verbose output to containerized cell_v2 host discovery | Oliver Walsh | 1 | -1/+1
    Required to debug issues.
    Change-Id: I4d86c8d9ecc353a916475977eb6f2d842c812556
    (cherry picked from commit dc64a1108e7bc23f92d77e75001fb42549731e3b)
2017-09-11 | Add panko config to ceilometer notification agent container | Pradeep Kilambi | 1 | -0/+9
    Without this, the ceilometer notification agent can't find panko and skips posting events to it.
    Change-Id: Ibfeef5c557d1ceb11a999aa947597014ca94ec34
    (cherry picked from commit 5437086ee744469b9daf8cd9edd600f7aa98dde6)
2017-09-11 | Enable redis TLS proxy in HA deployments | Martin André | 1 | -26/+67
    Redis does not have TLS out of the box. Let's use a proxy container for TLS termination.
    This commit enables the redis TLS proxy for the HA deployment.
    bp tls-via-certmonger
    Change-Id: I45e539872a03878337def33c681c4577c1a5629e
    (cherry picked from commit c6d8df01d7aa8b44af9ac152b3bb08f07e2e02b7)
2017-09-11 | Merge "Add defaults for ceilometer-agent-compute upgrade tasks" into stable/pike | Jenkins | 1 | -3/+3
2017-09-11 | Merge "Enable Ceilometer agent logging for containers" into stable/pike | Jenkins | 3 | -3/+20
2017-09-11 | Merge "Add Neutron SR-IOV agent container" into stable/pike | Jenkins | 1 | -0/+108
2017-09-11 | Persist containerized services httpd logs | Bogdan Dobrelya | 18 | -22/+118
    Store the httpd logs under dedicated /var/log/containers/httpd/ paths. Additionally, add release notes describing upgrade impact for containerized services logs.
    Closes-bug: #1700045
    Change-Id: I8120c56f2315700862bd0f708b8baa8910275b09
    Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
    (cherry picked from commit 287e84585ca9170570ce8d06eebd7f9a3ec3345c)
2017-09-11 | Add a docker pull retry to docker-puppet.py | Dan Prince | 1 | -4/+18
    Co-Authored-By: Ian Main <imain@redhat.com>
    Change-Id: Iad6d38690340f4a064a4527c58ed439d91fa5188
    Closes-bug: #1715136
    (cherry picked from commit d3b3361a76c2e8b188fa8e586d9fb7f3c60bb66f)
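A pull retry of the kind named in the commit subject might be sketched as below. This is not the code from docker-puppet.py: the function name `pull_image`, the default attempt count, and the delay are hypothetical. The `run` and `sleep` parameters are injection points added here so the loop can be exercised without a Docker daemon.

```python
import subprocess
import time


def pull_image(image, attempts=3, delay=3,
               run=subprocess.call, sleep=time.sleep):
    """Try 'docker pull' up to `attempts` times; return True on success.

    Hypothetical helper; the defaults are illustrative, not taken from
    the real script.
    """
    for attempt in range(1, attempts + 1):
        # subprocess.call returns the command's exit status (0 on success).
        if run(["docker", "pull", image]) == 0:
            return True
        if attempt < attempts:
            # Wait before retrying, but not after the final failure.
            sleep(delay)
    return False
```

Returning a boolean instead of raising leaves it to the caller to decide whether a failed pull should abort the whole run.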
2017-09-11 | Enable Ceilometer agent logging for containers | Pradeep Kilambi | 3 | -3/+20
    Change-Id: Ibeb28d7c497b02253d00a74257989cefba2b0cc4
    (cherry picked from commit fc44ee6ff3553754c618349df3be7544b17e9c5f)
2017-09-11 | Add defaults for ceilometer-agent-compute upgrade tasks | Marius Cornea | 1 | -3/+3
    This change allows the upgrade non controller script, which loops through all steps, to complete by adding default values to be evaluated in the steps where the vars are not registered.
    Closes-Bug: 1715574
    Change-Id: Ic056fc556240d1acc9f28a75f63c7628cc64da03
    (cherry picked from commit d109c1d7a7d2f6302c39369de8a601bc0b2f6704)