aboutsummaryrefslogtreecommitdiffstats
path: root/mcp/config/states
AgeCommit message (Collapse)AuthorFilesLines
2018-02-17[MaaS] Override failed testing by defaultAlexandru Avadanii1-1/+8
Some nodes fail automatic testing done by MaaS during commissioning, although running the testing suites one more time manually works. For now, just override all 'failed testing' nodes unconditionally. [stable/euphrates cherry-pick additions] Note: Since our salt formulas are pinned to 2017.12 repos, we need to backport one salt-formula-maas patch merged upstream, which adds support for translating status code '22' to 'Failed testing' [1]. JIRA: FUEL-333 [1] https://github.com/salt-formulas/salt-formula-maas/commit/08ffc3ff Change-Id: I13d3ee3d82550524480aa53aa8752ab90aa940cd Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 81561126307f15d4f65a743ed2431ea8c713a921)
2018-01-31[VCP] Catch 'no response' when adding ssh auth keyAlexandru Avadanii1-1/+1
On rare occassions, one or more minions might fail to respond in due time, so catch 'no reponse' using `wait_for`. Change-Id: I8e6b0dc44a39e79c2874ff9a657e152620ba3f13 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit b254aef5a34c7fc96db5c0a330c757fe9d51be76)
2018-01-29[ovs/dpdk] Force up the float-to-ex link onlyMichael Polenchuk1-1/+1
Force up br-ex to br-floating link instead of all networking restart to avoid issues with existing interface routes. Change-Id: I6b8204db6767e1fde964eb1913f885ecb06d0c28 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-01-22[patch] system.repo: Add keyserver proxy supportAlexandru Avadanii2-2/+0
Instead of defining a http proxy for all salt-minion traffic, which also includes some Openstack API accesses we can't filter (no_proxy is not yet supported), add & leverage support for proxy configuration during APT keyserver access / key download. JIRA: FUEL-331 Change-Id: I9470807633596c610cfafb141b139ddda2ff096b Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 6ab6935577900e598ca60aaed14d2e73f7b1633f)
2018-01-19Retry cinder volume stateMichael Polenchuk2-2/+5
The service of cinder-volume restarts too quickly after package installation with default/incorrect configuration and goes over restart threshold, so systemd stops attempt to restart any further causing state faulure. To fix it properly the RestartSec (i.e. restart delay) param should be added into cinder-volume.service unit. Change-Id: Ic8591e8ef52a3d439122f276d275e56bd2442ce6 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry picked from commit 1ea49591080442d8db86fff60031d3dc41142274)
2018-01-12[baremetal] Disable dhcp offered routesMichael Polenchuk1-0/+5
Prevent dhcp client from setting an unwanted default routes on compute nodes. Conflicts: mcp/reclass/classes/system [stable/euphrates cherry-pick] Drop reclass system submodule bump, only applicable to master. Change-Id: I2529491bbc977647e5f457d5f1ba88b0cc4372ee Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry picked from commit 658418ea84e633f5f97a706a075d7e2f24127999)
2018-01-08lib.sh: Extend wait_for function to catch no respAlexandru Avadanii4-13/+8
wait_for function should be able to also check for minions that did not return or not respond, in addition to the return code. To keep it backwards compatible, condition the new check on the max attempt number being specified in decimal format (e.g. '10.0' unlike old '10'). Change-Id: If2512cf9121cdd795638efe7362ef0485d4e8d91 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 3f559299c232bbb7639d02243c95d6256cdf94d4)
2018-01-06Revert "[baremetal] Retire mas01 NAT"Alexandru Avadanii1-0/+1
Although deploy works now without direct internet access on the cluster nodes, testing suites seem to require it. This reverts commit ed209426e895c7c323d253afd6276bb74df64da0. Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> Change-Id: I35489e18fdd6a4ee6a270e42a3542e5a370bf819
2018-01-03[baremetal] Retire mas01 NATAlexandru Avadanii1-1/+0
Isolate networks by retiring NAT on mas01; also cutting direct internet access from cluster nodes that are not facing the public network (prx, cmp). NOTE: Since we are removing mas01 NAT, VCP VMs (except prx which have public IPs) and kvm nodes (cmp also have public IPs) will no longer have direct internet connectivity. Cluster deployment and operations will work without it, but if it is required for different reasons, the MaaS proxy could be enabled by uncommenting the /etc/enviroment section in: - cluster.baremetal-mcp-pike-common-ha.include.proxy.yml JIRA: FUEL-317 Change-Id: I5ed8b420296b27df34a54ec1ebd7b7cf58041425 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 9a6e655e0b851ff6e449027c01ac1a66188b0064)
2018-01-03[baremetal] MaaS: Enable HTTP proxyAlexandru Avadanii2-1/+13
Instead of using NAT on the mas01 node for all cluster node outgoing traffic, use the MaaS built-in proxy for APT traffic to leverage its caching capabilities too. Also enable the proxy for salt minions, so they can access public keyservers et al. Cleanup public DNS from kvm nodes, interferes with MaaS proxy. Add example config for global env proxy, but don't enable it: - default environment settings - /etc/environment (via reclass); The MaaS proxy will not be used (at least for now) on nodes: - cfg01; - mas01; NOTE: We can't yet drop the maas.pxe_nat state completely, as certain Openstack services are still accessed via public addresses from ctl nodes. JIRA: FUEL-317 JIRA: FUEL-318 Change-Id: I6c5f6872bb94afb838580571080e808bc262fc68 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 90c0b369c01a2185fe86651f8ad9e0a172d6941d)
2017-12-31[vcp] Catch 'no response' of salt minion as wellMichael Polenchuk1-1/+1
Salt minion could return 'no response' and cause an unconfigured state of the vcp node(s), so catch this output after linux state as well. Also clean up excess route on proxy nodes. Change-Id: I3183fa09ff41a8f027ee789869bdae0c3962ab8f Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry picked from commmit a183db4b3404bd12073b5691eb5d4fbd8135b44b)
2017-12-31[baremetal] Move salt master IP to PXE/adminAlexandru Avadanii1-2/+0
Use PXE/admin network for salt traffic from/to all minions except cfg01, mas01. This allows us to drop the route to admin net from cfg01. Change-Id: Ic2526f1ff77afe5d92ced900971f4c8f78d2d8a2 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit d4ab072aeab143ce72e4b81122d4580915a4ad1a)
2017-12-31states: Rename openstack, add baremetal_initAlexandru Avadanii3-22/+35
To align with new cluster naming convention, rename 'openstack' state file to 'openstack_noha'. While at it, factor out baremetal setup from 'virtual_control_plane' into a new state that will be reused in upcoming scenarios, remove useless sync_all (automatically done after node reboot). FUEL-310 Change-Id: I6d7e5db8f09305f2fd8eeca0199a2e85b08d2202 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commmit 52e37b795bb975b1cb3bf1f684b009848c50a2d6)
2017-12-31[maas] Adjust deployment order/timeoutsMichael Polenchuk1-3/+7
Change-Id: I9dbb51ce2387450e4ae19f8b3444f5e52cfdc71d Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry-picked from commit 52bd5a8f6c5b27ec3070625a51aea8ff85f5a8db)
2017-12-31[baremetal] MaaS: Reduce timeout valuesAlexandru Avadanii1-9/+8
`maas_fixup` is already re-entrant, so we can execute it more than once during a commissioning/deploy cycle. Reduce the timeout waiting for all nodes to reach a stable state, so nodes stuck in 'Ready' state instead of reaching 'Deploying' get dealt with sooner (~5 min vs old 30 min). While at it, let `maas_fixup` handle machine deploy as well, so we can catch nodes stuck in 'Ready' state and re-trigger the deploy. Change-Id: Id24cc97b17489835c5846288639a9a6032bd320a Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 8da73521d3b9347a982ea6e77114bba0d0f0adeb)
2017-12-30ci/deploy.sh: maas: cleanup_uefi on env eraseAlexandru Avadanii1-8/+6
Running `ci/deploy.sh -EE` should also perform an UEFI boot option cleanup, otherwise we risk booting the previously installed OS. While at it, reduce delay between nodes removal and fix a rare failure for `-EE` when no nodes are defined in MaaS. Change-Id: I789ffd3e22545921216f7d5ee3509c76354542eb Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 15173a83dba08729e62da277b9165677323675d8)
2017-12-13[ovs/dpdk] Split out networking restart actionMichael Polenchuk2-1/+12
In common openstack_ha state the networking service restart has no expected effect, so split it out into the detached post-deployment state. Change-Id: Iaaae0cd048474667895b7abf2a77196ee3dee14b Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2017-12-11states: maas: Stop using maas-stable PPAAlexandru Avadanii1-2/+0
Currently, Xenial repos provide MaaS 2.2.x, while the PPA bumped it to 2.3.x. Since we switched to 2.3, we observed a rare wrongful state transition from 'Deploying' back to 'Ready'. Drop the PPA, falling back to 2.2 from mainline distro repos. JIRA: FUEL-312 Change-Id: I3daa118059f37cbeca076da685661c28f3a28a97 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 9da33bc85d681950a09452f28ca39df2108b0b56)
2017-12-04[baremetal] Restart gateway networking serviceMichael Polenchuk1-0/+1
Make sure all missing interfaces/links are up & running (e.g. br-ex <-> float-to-ex <-> br-floating). Fix (for https://github.com/saltstack/salt/issues/40262) into linux formula brought in a weird behaviour with network/interfaces.u/ items. Change-Id: Ic13f0ed2063455ae191bbc99920f97c5ecaa61fd Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2017-11-29[baremetal] Fix prx stale route via MaaS DHCPAlexandru Avadanii1-0/+3
Although we add default routes via public network and disable DHCP client from setting new routes, until we reboot the prx* nodes we still have the stale route originally set at initial boot. Change-Id: Ib8e5fb67c7da00684e0ac21984fc4661d3820d83 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 7daf7f128714021711970557129a23a86cce2a72)
2017-11-27[baremetal] Retry cinder.controller on failureAlexandru Avadanii1-1/+1
Occasionally, cinderng.volume_type_present errors with: ClientException: Service Unavailable (HTTP 503) Instead of retrying the whole state file, use `wait_for` macro to retry only this high state up to 5 times. Change-Id: Ib9ef017aca737e53c853007c13107d56d856c016 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 92fb2b5e303b5e097a21d43612d5c8132f23152b)
2017-11-27Revert "Apply apache state on proxy nodes"Alexandru Avadanii1-1/+0
Upstream fixed the salt-formula-horizon in commit 95387ec, by defining 8078 (and only that) port in Apache's ports.conf. This fixes the port 80 overlap, so running the `apache` high state after the `horizon` high state not only is unnecessary now, but also would lead to new breakage, since `apache` state would overwrite the ports.conf (removing 8078 and adding 80), i.e. creating a new port conflict and breaking Horizon port completely. This reverts commit eb4645206d6d74992fca3b8726ee2eebca97205f. Conflicts: mcp/config/states/openstack_ha mcp/reclass/classes/cluster/baremetal-mcp-ocata-common/openstack_proxy.yml Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> Change-Id: Iea8f0bd90ee8d12f399aad16247dda274d6a907a (cherry picked from commit 0c71112ec06bd73a3ddc42ba0aacd666e9a00553)
2017-11-24Merge "Switch nofeature-ha compute nodes to UCA repo" into stable/euphratesAlexandru Avadanii1-3/+1
2017-11-24Merge "ci/deploy.sh: Add new `-E` arg for env erase" into stable/euphratesAlexandru Avadanii2-0/+27
2017-11-24Switch nofeature-ha compute nodes to UCA repoMichael Polenchuk1-3/+1
Employ UCA repo on computes nodes for nosdn-nofeature-ha scenario as well to prevent a regression (creation of ports failed for 1+n instances) of neutron ovs agent from mcp/openstack repos. Change-Id: Ie65ae122096c0d3a93c09d46191787a934bd7d4f Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry picked from commit 8ba3a1a4ed0ce41a76fa6d712778904bb56b60ac)
2017-11-23Merge "[virtual] Apply nova controller state twice" into stable/euphratesMichael Polenchuk1-0/+4
2017-11-22ci/deploy.sh: Add new `-E` arg for env eraseAlexandru Avadanii2-0/+27
NOTE: In order to undefine VCP VMs with NVRAM (e.g. AArch64 VMs using AAVMF), an additional parameter should be passed to libvirt by Salt virt core module (equivalent to `virsh undefine --nvram`). While at it, pass CI_DEBUG, ERASE_ENV enviroment variables to state execution, and stop force-applying patches. Also refactor the rsync between foundation node and Salt master, so the whole git repo is copied as </root/opnfv>, and <root/fuel> becomes a link to it; useful for Armband, where 'fuel' is a git submodule. Fix .git paths after rsync, so git submodules work as expected in cfg01 repos. JIRA: FUEL-307 Change-Id: Ic62f03e786581c019168c50ccc50107238021d7f Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 77942178b3aff6adc83b5f83645acfff467fa76a)
2017-11-22[virtual] Apply nova controller state twiceMichael Polenchuk1-0/+4
In order to complete broken database sync run nova state on controller one more time. Change-Id: I761f26667ebb531b848a62e096f3d79f588d9f24 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry picked from commit 246928006daf99de2317dc8d171c2b0735a3c605)
2017-11-21[baremetal] public gateway setup on prx nodesGuillermo Herrero1-1/+4
- prx: add route for public traffic to public interface - prx: add route towards salt master through maas - remove dashboard class from proxy node (already implements horizon) - remove dashboard (and benchmark) class definitions (no longer used) - (temporary) backport Pharos change for adapter template JIRA: FUEL-305 Change-Id: Ia14a18ac0123c1134d8d99dc43da9a1f770001d0 Signed-off-by: Guillermo Herrero <guillermo.herrero@enea.com> (cherry picked from commit 07f4e0238646fcb77072769feb8a0b68df52caca)
2017-11-17[baremetal] MaaS: Remove curtin netconfig via SaltAlexandru Avadanii1-2/+0
JIRA: FUEL-301 Change-Id: Id6b2b423b8045c581fa5c02133cf91702d9915c9 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 4010ea45c703d82e2fb95dcc869ff72bbca088b7)
2017-11-17Merge "[baremetal] Retry keystone.client state on failure" into stable/euphratesAlexandru Avadanii1-1/+1
2017-11-16[baremetal] Retry keystone.client state on failureAlexandru Avadanii1-1/+1
JIRA: FUEL-306 Change-Id: I648545890c1f7dc59176beac1a0593aed54079cb Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> Signed-off-by: Delia Popescu <delia.popescu@enea.com> (cherry picked from commit dcbc90f89292bf5070e8e0b54a760755b8206346)
2017-11-15[baremetal] SaltStack Deployment DocumentationAlexandru Avadanii1-1/+1
Generate documentation automatically using `reclass-doc`. nginx is already configured to serve said documentation on proxy's public VIP on port 8090. Change-Id: If2aef646a0ec44d5cc7e9d425e565e5c0aa581b3 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit f3a355d5644a7271d9df0a48febc3a93cceddb8e)
2017-11-10Apply apache state on proxy nodesMichael Polenchuk1-0/+1
Apache module will take care of ports.conf file to prevent bind socket conflict between apache & nginx services. Change-Id: Ia76ec356002e1db0dabd20d8f355a1b16fc07b30 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry picked from commit eb4645206d6d74992fca3b8726ee2eebca97205f)
2017-11-09Handle vlan package to avoid downgradeMichael Polenchuk1-2/+9
Change-Id: Ic81507f3f7b3fec593b507e0c534434e8489b01b Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry picked from commit ceedb354822eb672fdde6d63d49cbe2f05f29cdb)
2017-11-08MaaS: Fix conflicting curtin network configAlexandru Avadanii1-0/+1
JIRA: FUEL-301 Change-Id: I9de98fb961fd1d480b45a774de61ad6a93e9addf Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 3803f9ff798b5c186e605cb8366b5153ab4e19fc)
2017-11-08Merge "salt modules: debian_ip: Accept uppercase ifaces" into stable/euphratesAlexandru Avadanii1-0/+3
2017-11-08Merge "Enable glance v1 api for orchestra tests" into stable/euphratesAlexandru Avadanii1-0/+4
2017-11-07[maas] Conform regex to machines status outputMichael Polenchuk1-3/+3
Change-Id: Icc30d27951abb1e231c9269c6293782a39e08fb6 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry picked from commit f31a33c3f576733728118bbd181707f4db55f903)
2017-11-03Enable glance v1 api for orchestra testsMichael Polenchuk1-0/+4
Change-Id: Ia896c3f9fcd96dd498eef6d1f83d46e29df0cd6b Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry picked from commit c2925b6d13a20468845f8af1b54665cbac8b9bef)
2017-11-03salt modules: debian_ip: Accept uppercase ifacesAlexandru Avadanii1-0/+3
Since VMs are not affected by this limitation, only apply the fixup to baremetal nodes. JIRA: FUEL-299 Change-Id: Ib94c481627538d900295df03b8c8fdc7b61cd718 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 8f39b4895fa66223ef6293630556457f8fb9a348)
2017-10-26Run aodh state one by oneMichael Polenchuk1-1/+1
Apply aodh state in consecutive order to avoid a race condition with database synchronization. Change-Id: I4684fbeaaba2c9780084e0a64fe6453bccfb67e0 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry picked from commit 9cfa75272ba2fd9abab416db1f22df5989c9959e)
2017-10-21Catch expected failuresopnfv-5.0.1Michael Polenchuk2-2/+2
* neutron on computes (dpdk case: void state) * mongodb server (incomplete initialization) Change-Id: I3dd3266b5c2d1b155981f725e15742cd38ed899d Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry picked from commit 24d9cdd384635d8c1a037d6341d63a9c9be039b1)
2017-10-19[vcp] Increase timeout till VCP VMs onlineAlexandru Avadanii1-1/+1
Change-Id: I95c284cbf374194694360bffbeaf6770db6111bf Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 4b63bd0ea961d06723b277b874168c2aaddb96c5)
2017-10-19[baremetal] Remove infinite loops from node checksAlexandru Avadanii2-5/+10
Change-Id: I7a21c30d49aecca948f45535fec164c2f643450e Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 9cfa3c11bbd71ce4ec24dba9dbd9a2289b76a4a3)
2017-10-19[baremetal] cmp: run linux.network before rebootAlexandru Avadanii1-0/+1
The recent addition of `linux.system`, combined with `system.reboot` for the baremetal compute nodes leaves compute nodes unconfigured after reboot. Run `system.network` too, but expect a failure (only for DPDK, which requires hugepages to be already active, hence a prior reboot). Fixes: 64920b8 Change-Id: I8c73b24ae15e1f87dee64ae2aba7af86db1e942f Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 595119281c50edb86b987f5fdd6eac25e28147ae)
2017-10-19[baremetal] maas state: Wait for all nodes onlineAlexandru Avadanii1-0/+13
After MaaS reports baremetal provisioning finished successfully, check that all nodes are online before attempting a `sync_all`. Change-Id: I6ba4b3e4ba5b5258ace4da8c39e0fc77354885e3 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit b9918f1f8df52c52cd2ab76eec3b540b37789e55)
2017-10-18[baremetal] maas state: Retry sync_all on failureAlexandru Avadanii1-1/+1
Change-Id: Ib4aa3f2cb4fc7129d502b4332cd7fedd83a0e1fe Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com> (cherry picked from commit 51f374b055999fbc121b624424c21ee45d061538)
2017-10-18Return back glusterfs client stateMichael Polenchuk1-1/+1
In order to set properly keystone fernet keys, apply glusterfs client state before second keystone server state. Also leave out user/group settings for glusterfs volume of nova instances as it will be set later by nova compute packages themselves. Change-Id: I069e37c67f08c51ed29f45cf6f92d4a00a1ac97b Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> (cherry picked from commit 0224929b3a87d0e0ec011311c46872e6142497cf)
2017-10-18Merge "states: Break on error, retry states up to 5 times" into stable/euphratesAlexandru Avadanii8-10/+26