Age | Commit message | Author | Files | Lines
2017-06-16 | Merge "Add ignore_projects to filter gnocchi events" into stable/ocata | Jenkins | 1 | -0/+8
2017-06-16 | Merge "Dell SC: Add exclude_domain_ip option" into stable/ocata | Jenkins | 2 | -0/+5
2017-06-15 | Merge "Add fqdn_external" into stable/ocata | Jenkins | 6 | -0/+6
2017-06-15 | Add ignore_projects to filter gnocchi events | Pradeep Kilambi | 1 | -0/+8
Without this, ceilometer db gets hammered with gnocchi swift events. Keystone creds are required so middleware can query for id.
Related change: I5c0f4f1a2c7fe7eb39ea6441970e9ac0946a4ec1
Change-Id: I9a7a80252703e470a69dc10352e7ece45ab23150
(cherry picked from commit 37447494de7380409f4461835a2b1882ead37985)
2017-06-15 | Dell SC: Add exclude_domain_ip option | rajinir | 2 | -0/+5
This option allows users to exclude some fault domains. Otherwise all domains are returned.
Change-Id: Iefd1a44c8fe217aee5845bba35def571317bb123
Closes-Bug: #1681490
Depends-On: I6eb2bcc7db003a5eebd3924e3e4eb44e35f60483
(cherry picked from commit e0bc8d6813d7cd0ecbef1dfe17d9d3cfec4225d7)
2017-06-14 | Merge "Dell SC: Add secondary DSM support" into stable/ocata | Jenkins | 2 | -3/+23
2017-06-14 | Add fqdn_external | Alex Schultz | 6 | -0/+6
In newton, we used to construct the fqdn_$NETWORK in puppet-tripleo for external, internal_api, storage, storage_mgmt, tenant, management, and ctrlplane. When this was moved into THT, we accidentally dropped external, which leads to deployment failures if a service is moved to the external network and the configuration consumes the fqdn_external hiera key. Specifically, this is reproduced if the MysqlNetwork is switched to external: the deployment then fails because the bind address, which is set to use fqdn_external, is blank.
Change-Id: I01ad0c14cb3dc38aad7528345c928b86628433c1
Closes-Bug: #1697722
(cherry picked from commit 426de202880c890360bd446907aca44ca1e73a03)
2017-06-13 | Moving *postconfig where it was *postpuppet | Carlos Camacho | 3 | -26/+35
We need to ensure that the pacemaker cluster restarts at the end of the deployment. Due to the resource renaming, the postconfig resource was not added at the end of the deployment, where *postpuppet used to be.
Closes-bug: 1695904
Change-Id: Ic6978fcff591635223b354831cd6cbe0802316cf
2017-06-07 | Expose metric delay processing metric | Pradeep Kilambi | 2 | -0/+8
For performance reasons we might want to tweak this param, so let's expose it via tripleo. The puppet changes were added in patch I5de5283d1b14e0bba63d6d9a440611914ba86ca4.
Change-Id: I72f1fe3a47060fe37602a70b8a74fba72209127c
(cherry picked from commit e33e76684c9b60b9ce50ad7996529ed49dddd9d9)
2017-06-06 | Fix the constraints for THT params NeutronDpdkCoreList and HostCpusList | Karthik S | 1 | -2/+2
This fix needs to be backported to ocata.
Conflicts: puppet/services/neutron-ovs-dpdk-agent.yaml
Signed-off-by: Karthik S <ksundara@redhat.com>
Closes-Bug: #1694703
Change-Id: I5938761efa4f56e576f41929e0bc12df246ac81a
(cherry picked from commit 61480182f8a6f27ab7e1e73b9dd79e17a4927f0f)
2017-06-05 | Merge "Restrict nova migration ssh tunnel" into stable/ocata | Jenkins | 2 | -0/+6
2017-06-05 | Merge "Handle upgrading cinder-volume under pacemaker" into stable/ocata | Jenkins | 1 | -0/+15
2017-06-02 | Merge "Updated from global requirements" into stable/ocata | Jenkins | 1 | -1/+1
2017-06-02 | Handle upgrading cinder-volume under pacemaker | Alan Bishop | 1 | -0/+15
Add upgrade tasks for cinder-volume when it's controlled by pacemaker:
o Stop the service before the entire pacemaker cluster is stopped. This ensures the service is stopped before infrastructure services (e.g. rabbitmq) go away.
o Migrate the cinder DB prior to restarting the service. This covers the situation when puppet-cinder (who otherwise would handle the db sync) isn't managing the service.
o Start the service after the rest of the pacemaker cluster has been started.
Closes-Bug: #1691851
Change-Id: I5874ab862964fadb68320d5c4de39b20f53dc25c
(cherry picked from commit c4e3bbe039135f32f0e198365e704b3dbfd00290)
2017-05-31 | Restrict nova migration ssh tunnel | Oliver Walsh | 2 | -0/+6
Specify the allowed networks for migration ssh tunneling.
bp tripleo-cold-migration
Change-Id: Iab022bdfb655e3c52fecebf416e75c9e981072ab
Depends-on: Idb56acd1e1ecb5a5fd4d942969be428cc9cbe293
(cherry picked from commit 3d8af2fcf8e2d41600fa10584120a8117e7ef40c)
2017-05-30 | Updated from global requirements | OpenStack Proposal Bot | 1 | -1/+1
Change-Id: Ife3a3ee576b940f1f8a06d26a0cb99d69423cf9f
2017-05-30 | Enable arp_accept for all interfaces | Ihar Hrachyshka | 2 | -0/+11
OpenStack heavily relies on gratuitous ARP updates when moving floating IP addresses between devices. When a floating IP moves, Neutron L3 agent issues a burst of gratuitous ARP packets that should update any existing ARP table entries on all nodes that belong to the same network segment.
Due to locktime kernel behavior, some gratuitous ARP packets may be ignored [1], rendering ARP table entries broken for some time. Due to a kernel bug [2], the time may be as long as hours, depending on other traffic flowing to the node.
With the current EL7 kernel, the only way to make sure that nodes honor all sent gratuitous ARP updates is to set arp_accept to 1; this will disable locktime mechanism for the packets sent by Neutron L3 agent, and will make sure ARP tables are always updated.
[1] https://patchwork.ozlabs.org/patch/762732/
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1450203
Conflicts: puppet/services/kernel.yaml
Related-Bug: #1690165
Change-Id: I863b240e0ab4c4d5bb844f91b607fd0937d5cedf
(cherry picked from commit 804fd3427eeb31a2846ee096dbdac924ec39bcbc)
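To check the setting this commit describes on a node, the value can be read back from procfs. This is a minimal sketch: the helper name `arp_accept_value` and the `PROC_ROOT` override are hypothetical and are not part of the patch, which configures the value through the kernel service template.

```shell
#!/bin/bash
# Hypothetical helper: print the arp_accept value for an interface entry
# ("all" by default). PROC_ROOT is an illustrative override so the function
# can be pointed at a fake tree when testing.
arp_accept_value() {
    local iface="${1:-all}"
    cat "${PROC_ROOT:-/proc}/sys/net/ipv4/conf/${iface}/arp_accept"
}

# Setting the value by hand at runtime requires root (not executed here):
#   sysctl -w net.ipv4.conf.all.arp_accept=1
```

A value of 1 printed by `arp_accept_value all` would indicate the setting is active.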
2017-05-29 | Add heat environment for disabling all telemetry services | John Trowbridge | 1 | -0/+20
This will be used in our HA OVB CI job where we currently are failing due to running out of memory. Telemetry will still be tested via scenarios, but this will free up a large chunk of memory in the most memory intensive job.
Closes-Bug: 1693174
Change-Id: Idefe9f0de47c5b0f29b7326642d697ed179e2eb8
(cherry picked from commit 0751d69e3b6560ef87ed43859df92fdcc08f9cd1)
2017-05-23 | Add $STACK_NAME input var | James Slagle | 2 | -3/+9
The stack name can now be overridden in the get-occ-config.sh script for deployed-servers by setting the $STACK_NAME variable in the environment.
Change-Id: Iecba21499b80e463b4c629be53c309996d39472d
Closes-Bug: #1686719
(cherry picked from commit e17590c69e599a3eb6b4a18d2d6dbef9dede9ea8)
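The override described here follows the common shell pattern of preferring an environment variable and falling back to a default. A minimal sketch of that pattern; the function name `resolve_stack_name` and the default value `overcloud` are assumptions for illustration and are not taken from get-occ-config.sh.

```shell
#!/bin/bash
# Hypothetical helper: print the stack name, preferring $STACK_NAME from the
# caller's environment and falling back to an assumed default of "overcloud".
resolve_stack_name() {
    echo "${STACK_NAME:-overcloud}"
}
```

With this pattern, a caller would run the script as `STACK_NAME=mystack ./get-occ-config.sh` to target a differently named stack.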
2017-05-22 | Dell SC: Add secondary DSM support | rajinir | 2 | -3/+23
Adds support for a secondary DSM in case the primary becomes unavailable.
Change-Id: I0887e15a7e1c90a4f333bef6cdbb5d43ba0cd838
Closes-Bug: #1681492
Depends-On: I331466e4f254b2b8ff7891b796e78cd30c2c87f7
(cherry picked from commit 69be0c2ae7131af20385b4f11a8190ed9fba32c7)
2017-05-22 | Merge "Timeout early on pcs cluster status check0 during upgrade." into stable/ocata | Jenkins | 1 | -0/+2
2017-05-20 | Merge "Addition of firewall rules for Nuage" into stable/ocata | Jenkins | 3 | -7/+11
2017-05-20 | Merge "Disable Manila CephFS snapshots by default" into stable/ocata | Jenkins | 4 | -2/+8
2017-05-18 | Add NodeCreateBatchSize parameter | Steven Hardy | 1 | -0/+8
This uses the heat resource group batched create feature to ensure we don't create more than 30 nodes at a time, which has been reported as the maximum supported by the default ironic ipxe/TFTP configuration.
Closes-Bug: #1688550
Change-Id: If3651e4c465d8d7bd4c8f2b48d45b1272ff2d272
Depends-On: I3551456664daf89d01f98bde85d7fb22a01d4a03
(cherry picked from commit 129881f2c600217ff06b4570950b4e60ff9a63b5)
2017-05-17 | Timeout early on pcs cluster status check0 during upgrade. | Sofer Athlan-Guyot | 1 | -0/+2
There is a window in which the pcs cluster status can hang forever [1]. We add a timeout during check0 to avoid this situation. 2 minutes should be more than enough for all the pcsd nodes to reply.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1292858
Closes-Bug: #1680477
Change-Id: Icb3dc76e031a3d4f26294f37d169f2f61d30973e
(cherry picked from commit 0ea21f51a8128e536404ffd87f741443c9287593)
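One way to bound a status check that may hang is the coreutils `timeout` command. The sketch below is an illustration of that technique, not the code from this patch; the wrapper name `run_with_timeout` is hypothetical, and the 120-second figure simply mirrors the "2 minutes" mentioned above.

```shell
#!/bin/bash
# Run a command with an upper bound on its runtime.
# GNU coreutils `timeout` exits with status 124 when the limit is reached.
run_with_timeout() {
    local limit="$1"
    shift
    timeout "$limit" "$@"
}

# Hypothetical use in an upgrade pre-check (not executed here):
#   run_with_timeout 120 pcs status || echo "cluster status check failed or timed out"
```

Checking for exit status 124 lets a caller tell a hang apart from an ordinary failure of the wrapped command.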
2017-05-16 | Merge "Disable ComputeNeutron* for cisco-nexus-ucsm" into stable/ocata | Jenkins | 1 | -0/+2
2017-05-15 | Fix SshHostPubKeyDeployment on containerized nova-compute. | Oliver Walsh | 1 | -1/+2
This is failing since https://review.openstack.org/458672 merged because the ssh host keys are not mapped to the container.
Change-Id: Ie868654f13bee04da642337cc344871903f40473
Closes-bug: #1690911
2017-05-15 | Disable ComputeNeutron* for cisco-nexus-ucsm | Steven Hardy | 1 | -0/+2
It seems this wasn't adjusted when https://review.openstack.org/#/c/338315/ landed, which added interfaces for compute specific neutron configuration, which is disabled for most vendor backends.
Change-Id: I4c98008107568b3b65decd7640e25c7d2b1ea9ff
Related-Bug: #1687597
(cherry picked from commit 95fbda4d0254edb12bfec1ccd41d3b5f6204fe8f)
2017-05-13 | Merge "Fix for the resource ControllerPostPuppetMaintenanceModeDeployment" into stable/ocata | Jenkins | 4 | -11/+16
2017-05-12 | Merge "Merge pre|post puppet resources into pre|post config." into stable/ocata | Jenkins | 12 | -42/+26
2017-05-08 | Fix for the resource ControllerPostPuppetMaintenanceModeDeployment | Carlos Camacho | 4 | -11/+16
Depends-On: If88f403c85b79bd896a24c7816486709bd67706f
Closes-Bug: 1686619
Change-Id: I7c32ca39a456de9833d30c31d41fcb727d2b0a34
(cherry picked from commit 77b4bd53dae1882ae3094597e674218b7773eda9)
2017-05-08 | Merge pre|post puppet resources into pre|post config. | Jenkins | 12 | -42/+26
The [Pre|Post]Puppet resources were renamed in https://review.openstack.org/#/c/365763. The intent was to give the pre/post deployment steps a technology-agnostic name. The renaming was unintentionally reverted in https://review.openstack.org/#/c/393644/ and https://review.openstack.org/#/c/434451. This submission merges both resources into one and removes the old pre|post hooks.
Change-Id: Ic9d97f172efd2db74255363679b60f1d2dc4e064
Closes-bug: #1669756
(cherry picked from commit 258c6ce52d0c8467f34693722a883d96345802b2)
2017-05-08 | Fix up pacemaker_status test in yum_update.sh | Michele Baldessari | 1 | -2/+2
In change I2aae4e2fdfec526c835f8967b54e1db3757bca17 we did the following:
    -pacemaker_status=$(systemctl is-active pacemaker || :)
    +pacemaker_status=""
    +if hiera -c /etc/puppet/hiera.yaml service_names | grep -q pacemaker; then
    +  pacemaker_status=$(systemctl is-active pacemaker)
    +fi
We did that due to LP#1668266: we did not want systemctl is-active to fail on non-pacemaker nodes. The problem with the above hiera check is that it will match on pacemaker_remote nodes as well. We cannot piggyback on the pacemaker_enabled hiera key because that is true on all nodes. So let's make the test check only for the pacemaker service without matching pacemaker_remote.
Tested with:
1) Test on a controller node with pacemaker service enabled
    [root@overcloud-controller-0 ~]# hiera -c /etc/puppet/hiera.yaml -a service_names |grep '\bpacemaker\b'
    "pacemaker",
    [root@overcloud-controller-0 ~]# echo $?
    0
2) Test on a compute node without pacemaker:
    [root@overcloud-novacompute-0 puppet]# hiera -c /etc/puppet/hiera.yaml service_names |grep '\bpacemaker\b'
    [root@overcloud-novacompute-0 puppet]# echo $?
    1
3) Test on a node with pacemaker_remote in the service_names key:
    [root@overcloud-novacompute-0 puppet]# hiera -c /etc/puppet/hiera.yaml service_names |grep '\bpacemaker\b'
    [root@overcloud-novacompute-0 puppet]# echo $?
    1
    [root@overcloud-novacompute-0 puppet]# hiera -c /etc/puppet/hiera.yaml service_names |grep '\bpacemaker_remote\b'
    "pacemaker_remote"]
    [root@overcloud-novacompute-0 puppet]# echo $?
    0
NB: cherry-pick was not 100% clean due to unrelated lines being cleaned up in master.
Change-Id: I54c5756ba6dea791aef89a79bc0b538ba02ae48a
Closes-Bug: #1688214
(cherry picked from commit 2244290424ffa7781fb5b64688908c218cd10ecd)
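The word-boundary match used in the tests above can be exercised without a deployed node by feeding sample service lists on stdin. A minimal sketch, assuming GNU grep (which supports `\b`); the function name `has_pacemaker_service` is hypothetical and does not appear in yum_update.sh.

```shell
#!/bin/bash
# Succeed only if the service list on stdin names "pacemaker" as a whole
# word. "pacemaker_remote" does not match because "_" counts as a word
# character, so there is no word boundary right after "pacemaker".
has_pacemaker_service() {
    grep -q '\bpacemaker\b'
}
```

Piping the output of `hiera -c /etc/puppet/hiera.yaml service_names` into this function would reproduce the three test cases listed in the commit message.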
2017-05-04 | Initial VIP ipv6 minor update code | Michele Baldessari | 2 | -5/+74
To test this change we deployed a stock master with ipv6, which created a bunch of ipv6 VIPs with a /64 netmask:
    [root@overcloud-controller-0 ~]# pcs resource show ip-fd00.fd00.fd00.2000..18
     Resource: ip-fd00.fd00.fd00.2000..18 (class=ocf provider=heartbeat type=IPaddr2)
      Attributes: ip=fd00:fd00:fd00:2000::18 cidr_netmask=64
      Operations: start interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-start-interval-0s)
                  stop interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-stop-interval-0s)
                  monitor interval=10s timeout=20s (ip-fd00.fd00.fd00.2000..18-monitor-interval-10s)
Then we update the THT folder with this patch and upload the new scripts on the undercloud via:
    openstack overcloud deploy --update-plan-only ....
Then we kick off the minor update workflow:
    openstack overcloud update stack -i overcloud
Once the controller-0 node (bootstrap node for pacemaker) is completed, we have the correct VIP configuration:
    [root@overcloud-controller-0 heat-config-script]# pcs resource show ip-fd00.fd00.fd00.2000..18
     Resource: ip-fd00.fd00.fd00.2000..18 (class=ocf provider=heartbeat type=IPaddr2)
      Attributes: ip=fd00:fd00:fd00:2000::18 cidr_netmask=128 nic=vlan20 lvs_ipv6_addrlabel=true lvs_ipv6_addrlabel_value=99
      Operations: start interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-start-interval-0s)
                  stop interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-stop-interval-0s)
                  monitor interval=10s timeout=20s (ip-fd00.fd00.fd00.2000..18-monitor-interval-10s)
Also verified that running the script a second time does not alter the (already fixed) VIPs.
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Change-Id: I765cd5c9b57134dff61f67ce726bf88af90f8090
(cherry picked from commit 4923f5c4991bd539888b4175fae20025d6ef3957)
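The before/after comparison in this message comes down to the cidr_netmask attribute of the VIP resource. A minimal sketch of pulling that value out of saved `pcs resource show` output; the helper name `vip_netmask` is hypothetical and is not part of the patch.

```shell
#!/bin/bash
# Print the first cidr_netmask value found in `pcs resource show` output
# supplied on stdin (prints nothing if the attribute is absent).
vip_netmask() {
    grep -o 'cidr_netmask=[0-9]*' | head -n 1 | cut -d= -f2
}
```

A check of this kind could be used to skip VIPs that already report 128, which is consistent with the observation that a second run leaves fixed VIPs untouched.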
2017-05-03 | Addition of firewall rules for Nuage | lokesh-jain | 3 | -7/+11
Added VxLAN and metadata agent firewall rules to neutron-compute-plugin for Nuage. Removed a deprecated parameter 'OSControllerIp' as well.
Change-Id: If10c300db48c66b9ebeaf74b5f5fee9132e75366
(cherry picked from commit d5309c9443cbfe50ba5e7c15f025393a58b0804c)
2017-05-02 | Ensure AllNodesExtraConfig runs before AllNodesDeploySteps | Steven Hardy | 1 | -0/+1
When implementing custom roles, we lost an implicit dependency that ensured AllNodesExtraConfig is applied before AllNodesDeploySteps, which causes problems if you need to write hieradata via the AllNodesExtraConfig hook (some cisco integrations we have in tree do this, and are now broken because the ordering is no longer ensured).
Change-Id: Ie78ecbb4e135ab7f196867ef9d8d271049a9cd10
Closes-Bug: #1687597
(cherry picked from commit 4efc067a7e2965fc7a07eb05b019d0e3e8160606)
2017-04-28 | Unset the UpgradeInitCommand on converge | marios | 1 | -0/+1
In the converge envs we unset the UpgradeInitCommon since we used that for the N..O upgrades workflow. However, an operator may have also overridden the UpgradeInitCommand so we should unset that too.
Closes-Bug: 1686918
Change-Id: I3b316d04b78a4ab1e3f9f69948e42e6fb0ad6632
(cherry picked from commit 7d87b8225bd640fee4b55fd66e793391526f6d54)
2017-04-28 | Merge "Change the default for rabbitmq back to ha-mode: all" into stable/ocata | Jenkins | 3 | -33/+15
2017-04-28 | Merge "upgrades: deploy mod_ssl when upgrading apache" into stable/ocata | Jenkins | 9 | -67/+116
2017-04-27 | Merge "Prepare 6.1.0 (ocata)" into stable/ocata | Jenkins | 1 | -2/+2
2017-04-27 | Merge "Cinder-api upgrade: use httpd instead of apachectl" into stable/ocata | Jenkins | 1 | -1/+1
2017-04-27 | Merge "Align hyperconverged-ceph.yaml environment and adds some validation" into stable/ocata | Jenkins | 1 | -0/+18
2017-04-27 | Prepare 6.1.0 (ocata) | Emilien Macchi | 1 | -2/+2
Change-Id: Idb0423f9cf76234b9f44cacf32dd34cd9ae4e655
2017-04-27 | upgrades: deploy mod_ssl when upgrading apache | Sofer Athlan-Guyot | 9 | -67/+116
1) When Apache is upgraded, install mod_ssl rpm. See https://bugs.launchpad.net/tripleo/+bug/1682448 to understand why we need mod_ssl.
2) All services that run Apache for API will use the snippet from Apache service to deploy mod_ssl, so we don't duplicate the code in all services. It's using the same mechanism as ovs upgrade to compile upgrade_tasks between both services.
Change-Id: Ia2f6fea45c2c09790c49baab19b1efcab25e9a84
Closes-Bug: #1686503
(cherry picked from commit a6041608ca68aad4298ed9e8febafc442a250a55)
2017-04-26 | Cinder-api upgrade: use httpd instead of apachectl | Sofer Athlan-Guyot | 1 | -1/+1
It doesn't work downstream, so the httpd command was recommended.
Change-Id: I4807333b80dad10f16e5deb56cbfdda656cd1e50
(cherry picked from commit 0b05d7fd9b0e8811755499642647919eaf64cc39)
2017-04-26 | Change the default for rabbitmq back to ha-mode: all | Michele Baldessari | 3 | -33/+15
In change Ib62001c03e1e08f58cf0c6e0ba07a8879a584084 we switched the rabbitmq queues HA mode from ha-all to ha-exactly. While this gives us a nice performance boost with rabbitmq, it makes rabbit less resilient to network glitches as we painfully found out via https://bugzilla.redhat.com/show_bug.cgi?id=1441635.
This is the THT part of the change that changes the default to ha-mode: all.
NB: not clean cherry-pick due to the added metadata_settings line in master
Closes-Bug: #1686337
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Co-Authored-By: John Eckersberg <jeckersb@redhat.com>
Change-Id: I7afcf2b3c8deb13fc2134e4cae9c06a44e775384
Depends-On: I9a90e71094b8d8d58b5be0a45a2979701b0ac21c
(cherry picked from commit 90fc4b2e27ef6f612a82dfc5e08884629d0fe0bf)
2017-04-26 | Increase documentation about parameters | Juan Badia Payno | 2 | -3/+33
CollectdServer, CollectdServerPort, CollectdSecurityLevel, CollectdUsername, CollectdPassword
Change-Id: I43a0aca6f620f2570bdfd88531e70611867337b0
(cherry picked from commit f209f0aa48d277ecb8300ef33225f6ce6e24a4ae)
2017-04-25 | Merge "SSHD Service extensions" into stable/ocata | Jenkins | 10 | -5/+46
2017-04-25 | Merge "sensu: fix upgrade case when service is added" into stable/ocata | Jenkins | 1 | -1/+1
2017-04-25 | Merge "Deploy ceilometer_auth_enabled to node containing keystone" into stable/ocata | Jenkins | 1 | -1/+1