aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2017-10-23Merge "Disable xinetd class when creating swift-storage puppet ↵Zuul1-1/+4
configuration" into stable/pike
2017-10-19Merge "Remove Heat Cloudwatch API during upgrade and disable by default" ↵Zuul4-1/+66
into stable/pike
2017-10-19Merge "Fix some missed hard-coded network references" into stable/pikeZuul2-60/+16
2017-10-19Merge "Remove monitor_interface from ceph-ansible parameters" into stable/pikeZuul3-3/+0
2017-10-19Disable xinetd class when creating swift-storage puppet configurationMichele Baldessari1-1/+4
Due to missing puppet invocation with --detailed-exitcodes we ignored a large amount of puppet errors during deploy. Swift storage fails during the puppet_config step with the following error: Debug: /Stage[main]/Swift::Storage::Object/Swift::Storage::Generic[object]/Package[swift-object]: Not tagged with file, file_line, concat, augeas, cron, swif t_proxy_config, swift_config, swift_container_config, swift_container_sync_realms_config, swift_account_config, swift_object_config, swift_object_expirer_con fig, rsync::server Debug: /Stage[main]/Swift::Storage::Object/Swift::Storage::Generic[object]/Package[swift-object]: Resource is being skipped, unscheduling all events Debug: Executing: '/usr/bin/systemctl is-active xinetd' Debug: Executing: '/usr/bin/systemctl is-enabled xinetd' Debug: Executing: '/usr/bin/systemctl unmask xinetd' Debug: Executing: '/usr/bin/systemctl start xinetd' Debug: Runing journalctl command to get logs for systemd start failure: journalctl -n 50 --since '5 minutes ago' -u xinetd --no-pager Debug: Executing: 'journalctl -n 50 --since '5 minutes ago' -u xinetd --no-pager' Error: Systemd start for xinetd failed! The problem is that by using the rsync::server tag we end up including the xinetd class automatically which will try to start a service inside a container. By nooping the xinetd class, we're able avoid systemctl calls and have a successfuly deployment. The resulting swift_rsync container seems to work correctly: [root@overcloud-controller-0 ~]# docker exec -it swift_rsync /bin/bash -c "ps -axuwf" USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 10 0.0 0.0 47444 1624 pts/1 Rs+ 18:16 0:00 ps -axuwf root 1 0.0 0.0 188 4 ? Ss 17:27 0:00 /usr/local/bin/dumb-init /bin/bash /usr/local/bin/kolla_start root 6 0.0 0.0 11036 924 ? Ss 17:27 0:00 /usr/bin/rsync --daemon --no-detach --config=/etc/rsyncd.conf [root@overcloud-controller-0 ~]# docker logs swift_rsync 2>&1|tail -n4 INFO:__main__:Deleting /etc/rsyncd.conf INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/rsyncd.conf to /etc/rsyncd.conf INFO:__main__:Writing out command to execute Running command: '/usr/bin/rsync --daemon --no-detach --config=/etc/rsyncd.conf' Change-Id: I5e43e8fd61e002d2acc56a7de52e6aae64ab60be Closes-Bug: #1723463 (cherry picked from commit b5eeeab73e12efecc86ea7deebc105eee0739510)
2017-10-18Sync deployed-server-roles-data and roles-dataEmilien Macchi1-81/+208
deployed-server-roles-data was out of sync and missing some parameters introduced in Pike cycle: This patch syncs the roles_data between 2 files. Change-Id: If4a8388634fb1dcbb47beeabbd3db005abc80d4e Closes-Bug: #1723177 (cherry picked from commit 0e6c86dc123e9f558c4d3d594ff50e85dd00171f)
2017-10-17Remove Heat Cloudwatch API during upgrade and disable by defaultmarios4-1/+66
This adds a heat-api-cloudwatch-disabled.yaml and wires it up in the resource registry. During the Ocata to Pike upgrade this service will thus be stopped and disabled by default. If you wish to keep the Heat Cloudwatch API then you should instead use the provided heat-api-cloudwatch.yaml environment file. Change-Id: I3f90a9799b90ca365f675f593371c1d3701fede6 Related-Bug: 1713531 (cherry picked from commit 4d21451666f2dd7a8935da3a7166a9afc2ccd6bd)
2017-10-17Merge "Fix ConfigDebug for puppet host runs" into stable/pikeZuul2-1/+11
2017-10-16Merge "Fixes dynamic networks falling back to ctlplane" into stable/pikeZuul2-1/+10
2017-10-16Fix ConfigDebug for puppet host runsMichele Baldessari2-1/+11
Before pike we used to be able to add -e environments/config-debug.yaml and that would give us debug logs for puppet. With the move to ansible running puppet we lost this feature. Let's make sure that the old ConfigDebug variable still works with the ansible playbook-based deploy steps. With this patch and ConfigDebug set to true, we correctly get the puppet debug logs: TASK [debug] ******************************************************************* ok: [localhost] => { "(outputs.stderr|default('')).split('\n')|union(outputs.stdout_lines|default([]))": [ "Warning: Undefined variable 'deploy_config_name'; ", " (file & line not available)", "Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::Bool. There is further documentation for validate_legacy function in the README. at [\"/etc/puppet/modules/ntp/manifests/init.pp\", 54]:[\"/etc/puppet/modules/tripleo/manifests/profile/base/time/ntp.pp\", 29]", " (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')", "Debug: Runtime environment: puppet_version=4.8.2, ruby_version=2.0.0, run_mode=user, default_encoding=UTF-8", "Debug: Loading external facts from /etc/puppet/modules/openstacklib/facts.d", "Debug: Loading external facts from /var/lib/puppet/facts.d", .... Change-Id: Ia726fb8ca4a6f7bbbd7a1284d76ff42df6825d01 Closes-Bug: #1722752 (cherry picked from commit ecc6ce340aea59faaee4c2a49cd6d6fb90d8ed35)
2017-10-14Merge "Hardcode tag-stable-3.0-jewel-centos-7 in scenario001-containers" ↵Jenkins1-1/+1
into stable/pike
2017-10-14Remove monitor_interface from ceph-ansible parametersGiulio Fidente3-3/+0
We should not pass any hardcoded value for monitor_interface and rely on monitor_address_block only instead. Also removes journal_collocation which is not consumed by newer (and stable) builds of ceph-ansible. Change-Id: Idf213a1f43a66506f76d07102f122839b5096948 Closes-Bug: #1715246 (cherry picked from commit 3e90ae3df5a7c5491672254733ceac163b34a395)
2017-10-14Merge "Revert "Fixes heat resource name for Internal API Network"" into ↵Jenkins3-8/+5
stable/pike
2017-10-12Revert "Fixes heat resource name for Internal API Network"Tim Rozet3-8/+5
This reverts commit 520be6bb4056ead8e6fad08ad96e99f7da5b341e. This introduced a bug: https://bugzilla.redhat.com/show_bug.cgi?id=1501515 where during upgrade, the previous heat resource would for the InternalApi network would have the incorrect name "Internal" and the upgrade would try to delete the resource in order to create "InternalApi". This needs to be reverted and a proper fix will be submitted that accounts for this upgrade scenario. Related-Bug: #1718764 Change-Id: Id906fac421db317ce48d5cecfcd43397a0f4ab3d
2017-10-12Hardcode tag-stable-3.0-jewel-centos-7 in scenario001-containersJohn Fulton1-1/+1
Change-Id: I88f622c0b7a92ab75c2523fdc0d4d9ac1a2a2560 Closes-Bug: #1722908 (cherry picked from commit 06331a830e8923a9dc2ef8c15f2f1bf9d1d58ba1)
2017-10-11Fix some missed hard-coded network referencesSteven Hardy2-60/+16
These got missed in the refactoring to support composable networks. Change-Id: I5c97df08ae84e9c383175687428fb00143d171ff Closes-Bug: #1720849 (cherry picked from commit ef1768e40c3a6c58a22381a4546772f571bee5cc)
2017-10-11Fixes dynamic networks falling back to ctlplaneTim Rozet2-1/+10
Currently when a network in network_data is disabled it no port definitions for that network will be created per role. This results in no fallback to the ctlplane IP because overriding a type in network-isolation to noop.yaml does nothing when the port does not exist for the role. This patch changes the IPs when a network is disabled to be the same IPs as ctlplane and fixes the issue, along with removing the need to use noop.yaml override for ports (non-vip). Closes-Bug: 1721542 Change-Id: I301370fbf47a71291614dd60e4c64adc7b5ebb42 Signed-off-by: Tim Rozet <trozet@redhat.com> (cherry picked from commit 9285cb5fc99331ca63ff09df59f26b6018bc781b)
2017-10-10Merge "Add IronicPxe to the default controller" into stable/pikeJenkins5-0/+5
2017-10-10Merge "Remove package if service stopped and disabled" into stable/pikeJenkins33-3/+296
2017-10-10Merge "Adds pacemaker update_tasks for Pike minor update workflow" into ↵Jenkins13-5/+261
stable/pike
2017-10-10Add IronicPxe to the default controllerDerek Higgins5-0/+5
It doesn't exist in the non containerized openstack so leave it stubbed out by default. Closes-Bug: #1721212 Change-Id: I5fcb1f0b9958ac90f034a12f1ee733dae6571f9c (cherry picked from commit a850d8059fbc1c36efb18773e40bb600e5da5005)
2017-10-10Merge "Make containerized galera use mysql_network everywhere" into stable/pikeJenkins1-0/+6
2017-10-10Merge "Fix cold/live migration network config" into stable/pikeJenkins3-4/+10
2017-10-10Merge "Create mysql user for non-ha deployments" into stable/pikeJenkins1-5/+21
2017-10-10Merge "List all unhealthy containers" into stable/pikeJenkins1-1/+5
2017-10-10Merge "Special treatment for os-net-config upgrade." into stable/pikeJenkins1-0/+9
2017-10-09Remove package if service stopped and disabledmarios33-3/+296
Adds a UpgradeRemoveUnusedPackages param to use in the ansible when conditional for the removal Adds package removal to step2 right after a service is stopped and disabled on step2. Package updates happen in step3 so ideally remove before that. The package removal task has ignore_errors true so dependencies or other issue removing packages will not fail the upgrade workflow. Also adds this to the upgrade environment files for visibility and defaulting false Change-Id: Ie4e4a2d41f7752c5a13507a7c15c6f68e203cfca Related-Bug: 1701501 (cherry picked from commit ce0ef2fa207698c1ae61c1620fe3c5e8d1c7bfca)
2017-10-09Adds pacemaker update_tasks for Pike minor update workflowmarios13-5/+261
Adds update_tasks for the minor update workflow. These will be collected into playbooks during an initial 'update init' heat stack update and then invoked later by the operator as ansible playbooks. Current understanding/workflow: Step=1: stop the cluster on the updated node Step=2: Pull the latest image and retag the it pcmklatest Step=3: yum upgrade happens on the host Step=4: Restart the cluster on the node Step=5: Verification: test pacemaker services are running. https://etherpad.openstack.org/p/tripleo-pike-updates-upgrades Related-Bug: 1715557 Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com> Co-Authored-By: Sofer Athlan-Guyot <sathlang@redhat.com> Change-Id: I101e0f5d221045fbf94fb9dc11a2f30706843806 (cherry picked from commit a953bda0ae615dc44d3e8a70aa7ab0160e26f3af)
2017-10-09Merge "docker: add logging(source & groups)" into stable/pikeJenkins83-8/+166
2017-10-09Special treatment for os-net-config upgrade.Sofer Athlan-Guyot1-0/+9
We make sure to run upgrade and run os-net-config on its own. Running os-net-config with the no-activate option will - prevent the restart of the interface - adjust the network files to the expected configuration so that next run won't restart the network. Eventually at next reboot the change will be taken into account. Currently we have no change that are required to be taken live during the upgrade so it safe to ignore the new parameters. Closes-Bug: #1721073 Change-Id: I51464274d5dff8a267992ae303ac3517b78d08fb (cherry picked from commit 5aab25bb68f62b0d7e4ffdc20d4f4da1d82a76db)
2017-10-09List all unhealthy containersMartin Mágr1-1/+5
Currently the default Sensu check defined in docker/services/sensu-client.yaml reports only first unhealthy container. This patch changes the check output to contain list of all unhealthy containers. Change-Id: I0a934367ef22984d9091d160ec7105092edc8149 Closes-Bug: #1720972 (cherry picked from commit 9b016c9f3fbe9552497737974b9928d1dff4d299)
2017-10-09Create mysql user for non-ha deploymentsMartin Mágr1-5/+21
Currently health check for mysql container reports unhealthy container because there is no 'mysql' user created. This patch creates the user during mysql_bootstrap without any permission, just to allow health check to connect to DB and run 'select 1'. Change-Id: Iab26da0d30939b219189d4e7beb2a61d456ab7c3 Closes-Bug: #1718944 (cherry picked from commit 3a9cfaa992e92423461d64f84d701336322bdd10)
2017-10-09Fix cold/live migration network configOliver Walsh3-4/+10
Cold migration network is determined by the value of my_ip in nova.conf. If this isn't set then the network with the default gateway will be used. This patch sets my_ip and the whitelisted IP for cold migation over SSH to the NovaApiNetwork. Until https://bugs.launchpad.net/nova/+bug/1671288 is fixed we cannot control the network used for live migration over SSH. It is determined by hostname resolution. This patch sets the whitelisted IP for live migration over SSH to the hostname resolution network for the role - which is typically the same as NovaApiNetwork. (NB The puppet manifest will remove duplicates). Live migration over TLS is not affected. It can control the network used so it configurable via NovaLibvirtNetwork. Change-Id: Ica3f79d6d0cfae446e276172146f3a9407f2971f Depends-On: Id22a6c990f424b9f3ca6159088540ea207460ffd (cherry picked from commit 23331889a577b82b625610a80ecd44e164fe6cf1)
2017-10-09docker: add logging(source & groups)Juan Badia Payno83-8/+166
The services that docker depends on, have logging_sources and logging_groups; but those are not set on the docker outputs so they are not used when dockers are deployed. Added logging_source & logging_groups as docker optional parameters in tools/yaml-validate.py Closes-Bug: #1718110 Change-Id: I8795eaf4bd06051e9b94aa50450dee0d8761e526 (cherry picked from commit 5dbe1121e98a794ec6a6387ff56ee34314177567)
2017-10-09Containerized Fluentd clientJuan Badia Payno3-1/+126
Change-Id: Ia350e4899aa499cf27efffd9d2243e7e95fa1d65 Depends-On: I60796063fa9ebe0d98030fb982d22dabe2593ea0 Depends-On: I585b6877074353b5de62e5efaabfbe62432c473d (cherry picked from commit f37fe4f903f429b43d22b485c29547f576ec7269)
2017-10-07Make containerized galera use mysql_network everywhereDamien Ciabrini1-0/+6
The containerized galera service generates a galera.cnf which uses short hostname to identify itself rather than the fqdn from the mysql_network (e.g. overcloud-x.internalapi.cloudname). This breaks when internal TLS is in use, because the mysql certificate does not reference this short hostname. Fix the appropriate hiera parameter to make it behave like the non-containerized galera service. Change-Id: I904cde38f2baeddab5178e8ad48d34a0c73629af Closes-Bug: #1719599 (cherry picked from commit e10aa591dc9155a2746df01279c4ba4f2133fd17)
2017-10-07Merge "Remove extra noop.yaml ports from network-isolation files." into ↵Jenkins2-6/+4
stable/pike
2017-10-07Merge "Default Ceph pg_num and pgp_num to 128" into stable/pikeJenkins6-2/+17
2017-10-07Merge "Support for Ocata-Pike live-migration over ssh" into stable/pikeJenkins14-12/+145
2017-10-07Merge "Fixes missing type for heat param TenantNetName" into stable/pikeJenkins2-24/+5
2017-10-07Merge "Use sub_nodes_private instead of node_private" into stable/pikeJenkins1-2/+2
2017-10-07Merge "Update panko port in env ssl yaml files to correct one" into stable/pikeJenkins4-18/+18
2017-10-07Merge "Bump fs.inotify.max_user_instances for scale" into stable/pikeJenkins1-0/+9
2017-10-07Merge "Drop extraconfig for nova-nuage" into stable/pikeJenkins4-94/+45
2017-10-07Merge "Fixes heat resource name for Internal API Network" into stable/pikeJenkins3-5/+8
2017-10-07Remove extra noop.yaml ports from network-isolation files.Dan Sneddon2-6/+4
The environments/network-isolation[-v6].yaml files have an unneeded reference to network/ports/noop.yaml for unused networks. This introduces a regression where environment files that define the networks and ports on a per-role basis can cancel out other environment files. See bug # 1717322. The overcloud-resource-registry.j2.yaml already uses noop.yaml for every network on every role (whether or not the networks are enabled, or whether the particular network is supposed to be on a role. So having noop.yaml specified for every role in network-isolation[-v6].yaml is not needed and can cause issues with upgrades if the environments are not included in a specific order. Change-Id: If06407e5235587af090ede44674bf9c7e08e340e Closes-bug: 1717322 (cherry picked from commit 9b08df3733257ac0fbc150a4071aec051e073ef7)
2017-10-07Support for Ocata-Pike live-migration over sshOliver Walsh14-12/+145
In Ocata all live-migration over ssh is performed on the default ssh port (22). In Pike the containerized live-migration over ssh is on port 2022 as the docker host's sshd is using port 22. To allow live migration during upgrade we need to temporarily pin the Pike computes to port 22 and in the final converge we can switch over to port 2022. This also changes the default port to 2022 for baremetal computes in Pike to enable live-migration between baremetal and containerized computes. Change-Id: Icb9bfdd9a99dc1dce28eb95c50a9a36bffa621b1 Depends-On: I0b80b81711f683be539939e7d084365ff63546d3 Closes-Bug: 1714171 (cherry picked from commit 17fd16b9f266e1aa67bf03ebdf309e89d668ada2)
2017-10-07Default Ceph pg_num and pgp_num to 128Giulio Fidente6-2/+17
As per Ceph docs [1] we should default pg_num and pgp_num to 128 when using less than 5 OSDs. This same change was applied to the ceph-ansible profiles with [2]. Also updates the CI environment files to continue using 32 where we deploy a single OSD. 1. http://docs.ceph.com/docs/master/rados/operations/placement-groups/ 2. Ibd9fb23e04576e95e24af58f856663397886a947 Change-Id: I1920bc8f5251f362af38ad3bd6f46dda42c6ee93 Closes-Bug: #1718756 (cherry picked from commit e17ae7620e03790da0d29092ab42e8089b2e8d11)
2017-10-07Use sub_nodes_private instead of node_privateSagi Shnaidman1-2/+2
node_private file doesn't exist anymore, use sub_nodes_private instead Change-Id: Ifb3af18733c0e1fd6895c270bb39199acaa98968
2017-10-07Fixes missing type for heat param TenantNetNameTim Rozet2-24/+5
Closes-Bug: 1720823 Change-Id: I239cc9f827fe99a553f9c18b80336bc6ce0b1d14 Signed-off-by: Tim Rozet <trozet@redhat.com> (cherry picked from commit ba5436099d37898e418406f8b4376923e14f4c89)