path: root/mcp/config/states
Age | Commit message | Author | Files | Lines
2018-05-17 | [noha] Bring in gnocchi/panko services | Michael Polenchuk | 1 | -1/+3
JIRA: FUEL-372 Change-Id: I4e322a4a2c84843e9350fe9b3b849cd0c5244a12 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-04-26 | Run galera state on slave nodes one by one | Michael Polenchuk | 1 | -1/+1
Apply galera state in consecutive order to avoid a race condition with database initialization. Change-Id: I877bad38777d8469c03cee3b7e96acc875a3a72a Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-04-25 | [states] Catch more transient 'no response' resp | Alexandru Avadanii | 2 | -8/+8
Change-Id: Ie8e60a648fa28e59daa6e00f357df52b5821e833 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-04-24 | Mend OVN scenario | Michael Polenchuk | 1 | -15/+0
* setup HWE kernel to get suitable conntrack module
* clean out outdated state with ovn ctl options
* point SB remote source to local mgmt network
Change-Id: I8986c227ce0a9a3b7ab3faf382760ec32e6e7c00
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-04-18 | [baremetal] cmp linux.system: catch 'no response' | Alexandru Avadanii | 1 | -1/+1
Catch & retry transient errors / timeouts while applying the `linux.system` state on cmp nodes. Change-Id: Id314b5a29673e0bcaa78611fc787491056830952 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-04-06 | Update opendaylight version to oxygen | Michael Polenchuk | 1 | -4/+0
JIRA: FUEL-362 Change-Id: Ib2621bca72d1ba376af5d369edcf5fcf37e9788b Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-04-03 | Remove opendaylight service mask | Michael Polenchuk | 1 | -3/+1
Nitrogen SR2 introduced odd behaviour in the netvirt feature configuration, causing tunnels between client nodes (e.g. gateway, computes) to malfunction. To work properly, the opendaylight service requires an explicit restart or reload by means of the salt formula.
Change-Id: I277da5ad2787f1005647e500b64c7ffa6051443b
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-03-30 | Re-order opendaylight state | Michael Polenchuk | 1 | -0/+3
* bring the opendaylight state back after neutron setup
* sleep for a while to let the neutron api reconnect to the ODL controller and the agents register on the server
Change-Id: Ife0c7d3cc20574b0733e8e3064843c680379cc84
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-03-28 | [odl] Setup manager target after ovs host config | Michael Polenchuk | 1 | -1/+1
Change-Id: Ia517b7cf1723a5afaf43cb0709716f3a67a29e9f Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-03-23 | Apply opendaylight state after ovs host config | Michael Polenchuk | 1 | -1/+11
* employ GA kernel for baremetal computes as well
* setup/start opendaylight server after ovs host config
Change-Id: Ic772aed544b17be02e6ca9ccd175f2288b2128a8
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-03-13 | [vcp] Catch 'no response' for salt state | Alexandru Avadanii | 1 | -1/+1
JIRA: FUEL-358 Change-Id: I8dc89676aa777068d1a13168bf7b7d7156903c03 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-03-12 | [virtual/odl] Apply missing neutron.compute state | Michael Polenchuk | 2 | -12/+1
Change-Id: I078e11219fb8dea4505c46e7f75c295c5a72c59b Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-03-08 | Revert "[baremetal] Retire mas01 NAT" | Alexandru Avadanii | 1 | -0/+1
Bring back public internet access to all cluster nodes via NAT on mas01 node, required for NTP syncing. NOTE: Both mcpcontrol and PXE/admin networks are currently hard wired to using /24 netmask, so we leverage that in pxe_nat.sls. JIRA: FUEL-348 This reverts commit 9a6e655e0b851ff6e449027c01ac1a66188b0064. Change-Id: I7bab385f95f8c6d92cadc4e2149c2cd56e10c506 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-03-07 | [ha] Add route_wrapper to prx, kvm | Alexandru Avadanii | 2 | -1/+2
Similar to cmp, when route already exists, networking service fails to start on 'nginx:server' slaves ('kvm' in novcp case). JIRA: FUEL-349 Change-Id: I2dc83ea78528533e92c9b9125e78b6e4387bdfe2 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-03-05 | Align opendaylight settings with upstream | Michael Polenchuk | 1 | -4/+0
Change-Id: If7d51555bc13dbcaa63f93ab1993f3655e2ce643 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-02-25 | [ovs/dpdk] Add opnfv.route_wrapper sls | Alexandru Avadanii | 2 | -13/+4
- fix `route-br-ex` if-up.d script failing when route already exists by adding a wrapper around distro's '/sbin/route' binary in '/usr/local/sbin/route', exploiting default order in Ubuntu PATH;
- fix 'br-prv' duplicate entry in 'interfaces.d/ifcfg-br-prv' and 'interfaces' caused by upstream bug [1];
- add barrier waiting for all baremetal nodes online before attempting reboot, trying to catch rare failures which are undetectable in logs as both a successful reboot and a disconnected minion report 'n/c';
With the above in place, networking service should no longer fail to start on cmp nodes w/ DPDK.
[1] https://github.com/saltstack/salt/issues/40262
Change-Id: I6d4895376ce323c14c997e6c9af2ea3eeeee0184
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
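The wrapper idea from the first item above can be sketched as follows. This is only an illustration under stated assumptions, not the content of the actual opnfv.route_wrapper state: the function name `route_wrapper` and the `REAL_ROUTE` override are hypothetical, and the sketch assumes the real binary reports an existing route with a 'File exists' message on stderr (as net-tools `route` does with `SIOCADDRT: File exists`).

```bash
#!/bin/bash
# Hypothetical sketch of a wrapper meant to live at /usr/local/sbin/route,
# which precedes /sbin in the default Ubuntu PATH.
route_wrapper() {
  # REAL_ROUTE is an assumed override hook (handy for testing); default is the distro binary.
  local real_route=${REAL_ROUTE:-/sbin/route}
  local err rc=0
  # Capture stderr in $err while stdout passes through untouched via fd 3.
  { err=$("${real_route}" "$@" 2>&1 1>&3 3>&-); } 3>&1 || rc=$?
  # Treat "route already exists" as success so if-up.d scripts do not fail.
  if [ "${rc}" -ne 0 ] && grep -q 'File exists' <<< "${err}"; then
    return 0
  fi
  # Any other error is reported unchanged.
  if [ -n "${err}" ]; then
    echo "${err}" >&2
  fi
  return "${rc}"
}
# An installed wrapper script would end with: route_wrapper "$@"
```

Only the duplicate-route case is swallowed; every other failure keeps its original message and exit code, so genuine misconfiguration still surfaces.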
2018-02-19 | Merge "[Horizon] Fix 'mcp' version check pattern" | Alexandru Avadanii | 1 | -1/+1
2018-02-18 | [Horizon] Fix 'mcp' version check pattern | Alexandru Avadanii | 1 | -1/+1
Previous commit used a pattern that is too generic and always matches the substring 'mcp' vs the node hostname, not only pkg version. Fixes: 4658acf Change-Id: Ia4dcbbf7cdfa68574c86459217101d83d61add01 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-02-17 | [MaaS] Add maas.machines.set_storage_layout sls | Alexandru Avadanii | 1 | -0/+10
On cmp nodes, allocate only 30GB (fixed for now) for / partition. The rest of the disk(s) can later be allocated via salt-formula-linux. JIRA: FUEL-330 Change-Id: Ie11c78791e60801719cd33475ff91fc003df5ffa Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-02-17 | [MaaS] Override failed testing by default | Alexandru Avadanii | 1 | -1/+8
Some nodes fail automatic testing done by MaaS during commissioning, although running the testing suites one more time manually works. For now, just override all 'failed testing' nodes unconditionally. JIRA: FUEL-333 Change-Id: I13d3ee3d82550524480aa53aa8752ab90aa940cd Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-02-15 | Mask opendaylight service | Michael Polenchuk | 1 | -1/+4
In order to avoid using cache data with initial/outdated configuration, mask opendaylight service before package installation. JIRA: FUEL-344 Change-Id: I71eb0b0a5af93d6d21698e76587b32098aba96b4 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-02-07 | [states] Fix broken online check for bm, vcp nodes | Alexandru Avadanii | 2 | -5/+5
Previous commit replacing explicit loops with `wait_for` failed to properly escape a nested variable, leading to deploy failure. Also, the logic was flawed, not breaking for offline nodes, rendering the whole barrier check useless. Fixes: 1a0e8e7e Change-Id: I038dbf90fb53c6b61da2e5c9b6867e31d78867af Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-02-07 | Merge "[states] maas, vcp: Use `wait_for` in online check" | Alexandru Avadanii | 2 | -25/+8
2018-02-07 | Switch off broken sphinx state | Michael Polenchuk | 1 | -1/+1
Deactivate the optional documentation-related state until it gets fixed upstream.
Change-Id: I5242ed307548c4f37f81d271a1f4f6bee9903f4e
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-02-06 | [states] maas, vcp: Use `wait_for` in online check | Alexandru Avadanii | 2 | -25/+8
Change-Id: I7b583c354843f0116a65b3a31f3be4589087b8a5 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-02-06 | [HA] Use cluster_public_host for SSL cert fetch | Alexandru Avadanii | 1 | -5/+7
For VCP-enabled scenarios, `cluster_public_host` and `cluster_vip_address` both point to the public VIP of the cluster. However, for upcoming NOVCP scenarios, `cluster_vip_address` resides inside the management segment, so use `cluster_public_host` instead. JIRA: FUEL-310 Change-Id: I13ef482e2c3116c991dfe91be81d0964f140f8e9 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-02-06 | [Horizon] Limit css fixup to Ubuntu package | Alexandru Avadanii | 1 | -9/+11
Horizon package from Mirantis mcp-repos does not require the fixup, so limit its application to non-mcp packages. Required for upcoming NOVCP scenarios, where we also have mcp-repos APT source on the proxy nodes. JIRA: FUEL-324 JIRA: FUEL-310 Change-Id: I4399af803c0a17e0aa8f3d7a7330e501a5eedf55 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-02-05 | [FN VM] Reboot VMs on jump, wait for all online | Alexandru Avadanii | 2 | -4/+1
- apply `linux` state on cfg01 first, so PXE/admin IP is added and FN VM minions are available;
- add barrier and wait for all FN VMs to register with cfg01;
- use batch-mode execution while applying `linux.network` on FN VMs;
- retry all states executed via <salt.sh> on FN VMs;
JIRA: FUEL-310
Change-Id: I72e1c565370072500df1d486fe76e6315f583c75
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-01-31 | Turn off Retpoline and KPTI protection | Michael Polenchuk | 1 | -1/+1
Based on Canonical research (https://goo.gl/QJykMa), the risk of attack is low for private cloud environments, so turn off the related kernel patches and regain performance.
Change-Id: I661fa127241e327b07d21a29d58d584997607123
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-01-31 | [VCP] Catch 'no response' when adding ssh auth key | Alexandru Avadanii | 1 | -1/+1
On rare occasions, one or more minions might fail to respond in due time, so catch 'no response' using `wait_for`.
Change-Id: I8e6b0dc44a39e79c2874ff9a657e152620ba3f13
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-01-26 | [ovs/dpdk] Configure vxlan for baremetal scenario | Michael Polenchuk | 1 | -0/+12
* switch ovs/dpdk scenario from vlan to vxlan mode
* force br-ex interface to mitigate race with incorrect state
* remove dpdk packages list (already in upstream)
Change-Id: Ib827cef2d67879fd2a86d286ca2118b22493274d
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-01-25 | Merge "Add support for different public network netmask" | Alexandru Avadanii | 1 | -12/+15
2018-01-25 | Merge "Horizon: Fix and reload missing css in Pike" | Alexandru Avadanii | 2 | -0/+20
2018-01-23 | Horizon: Fix and reload missing css in Pike | ting wu | 2 | -0/+20
Horizon in the Pike release is broken due to missing static content. The workaround is to:
- create a missing symbolic link (the link is defined as an alias in the apache configuration)
- collect and compress static assets
- add single "Default" theme as AVAILABLE_THEMES
- restart apache2 service
- apply the workaround to Salt states 'openstack_ha' and 'openstack_noha'
JIRA: FUEL-324
Change-Id: Idd70165f1be8d31967a3ab518323e6f3e8406624
Signed-off-by: ting wu <ting.wu@enea.com>
2018-01-22 | Add support for different public network netmask | Guillermo Herrero | 1 | -12/+15
- Remove hardcoded /24 mask
- Use PDF as source for public network, with reclass params: opnfv_net_public, _mask, _gw, _pool_start, _pool_end
JIRA: FUEL-315
Change-Id: Idf3a4ed8f63f58fa90d9c1dcb7751ef3b1c9bd36
Signed-off-by: Guillermo Herrero <guillermo.herrero@enea.com>
2018-01-22 | [patch] system.repo: Add keyserver proxy support | Alexandru Avadanii | 2 | -2/+0
Instead of defining a http proxy for all salt-minion traffic, which also includes some Openstack API accesses we can't filter (no_proxy is not yet supported), add & leverage support for proxy configuration during APT keyserver access / key download. JIRA: FUEL-331 Change-Id: I9470807633596c610cfafb141b139ddda2ff096b Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-01-15 | Setup mongodb master primarily | Michael Polenchuk | 2 | -4/+4
Initialize the mongodb master first to avoid a race condition with an unwanted master election, which causes cluster setup failure.
Change-Id: I6d2f75f3f002849cac3a5f52a7dcfb4646b7822a
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-01-12 | Retry cinder volume state | Michael Polenchuk | 2 | -2/+5
The cinder-volume service restarts too quickly after package installation with default/incorrect configuration and exceeds the restart threshold, so systemd stops attempting further restarts, causing state failure. To fix it properly, the RestartSec (i.e. restart delay) param should be added to the cinder-volume.service unit.
Change-Id: Ic8591e8ef52a3d439122f276d275e56bd2442ce6
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
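The "proper" fix mentioned above (adding RestartSec to the unit) could look roughly like the following systemd drop-in, written here by a small shell helper. This is a sketch under assumptions, not something the commit itself ships: the helper name `write_restart_dropin`, the drop-in file name `restart-delay.conf` and the 5 second delay are all hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: write a systemd drop-in that adds a restart delay to a unit.
# Arguments: unit name, delay in seconds, optional root dir (defaults to /etc/systemd/system).
write_restart_dropin() {
  local unit=$1 delay=$2 root=${3:-/etc/systemd/system}
  local dir="${root}/${unit}.d"
  mkdir -p "${dir}"
  # RestartSec makes systemd wait before each restart attempt, which keeps
  # a crash-looping service from hitting the start rate limit so quickly.
  printf '[Service]\nRestartSec=%s\n' "${delay}" > "${dir}/restart-delay.conf"
}
# Example (assumed value): write_restart_dropin cinder-volume.service 5
# followed by `systemctl daemon-reload` so systemd picks up the drop-in.
```

A drop-in keeps the packaged unit file untouched, so a package upgrade does not silently discard the delay.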
2018-01-11 | [baremetal] Disable dhcp offered routes | Michael Polenchuk | 1 | -0/+5
Prevent the dhcp client from setting unwanted default routes on compute nodes.
Change-Id: I2529491bbc977647e5f457d5f1ba88b0cc4372ee
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2018-01-07 | lib.sh: Extend wait_for function to catch no resp | Alexandru Avadanii | 4 | -13/+8
wait_for function should be able to also check for minions that did not return or not respond, in addition to the return code. To keep it backwards compatible, condition the new check on the max attempt number being specified in decimal format (e.g. '10.0' unlike old '10'). Change-Id: If2512cf9121cdd795638efe7362ef0485d4e8d91 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
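The backwards-compatible switch described above can be sketched as below. This is a hypothetical re-creation for illustration, not the actual lib.sh code: the `WAIT_FOR_SLEEP` knob and the exact failure pattern are assumptions (the pattern is modelled on Salt's "Minion did not return. [No response]" output).

```bash
#!/bin/bash
# Hypothetical sketch of a retry helper; usage: wait_for <attempts> <command string>
wait_for() {
  local total_attempts=$1; shift
  local cmdstr=$*
  local sleep_time=${WAIT_FOR_SLEEP:-10}  # assumed knob: seconds between attempts
  local fail_pattern=''
  # A decimal attempt count (e.g. '10.0') opts in to the extra output check;
  # a plain integer (e.g. '10') keeps the old "exit code only" behaviour.
  if [[ "${total_attempts}" == *.* ]]; then
    fail_pattern='Minion did not return|No response'
    total_attempts=${total_attempts%%.*}
  fi
  local attempt output
  for attempt in $(seq "${total_attempts}"); do
    if output=$(eval "${cmdstr}" 2>&1); then
      # Exit code was fine; in decimal mode also require clean output.
      if [ -z "${fail_pattern}" ] || ! grep -Eq "${fail_pattern}" <<< "${output}"; then
        echo "${output}"
        return 0
      fi
    fi
    if [ "${attempt}" -lt "${total_attempts}" ]; then
      sleep "${sleep_time}"
    fi
  done
  return 1
}
```

Encoding the opt-in in the number format means existing callers that pass '10' need no changes, while new callers pass '10.0' to also retry on unresponsive minions.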
2018-01-03 | [baremetal] Retire mas01 NAT | Alexandru Avadanii | 1 | -1/+0
Isolate networks by retiring NAT on mas01; also cutting direct internet access from cluster nodes that are not facing the public network (prx, cmp).
NOTE: Since we are removing mas01 NAT, VCP VMs (except prx which have public IPs) and kvm nodes (cmp also have public IPs) will no longer have direct internet connectivity. Cluster deployment and operations will work without it, but if it is required for different reasons, the MaaS proxy could be enabled by uncommenting the /etc/environment section in:
- cluster.baremetal-mcp-pike-common-ha.include.proxy.yml
JIRA: FUEL-317
Change-Id: I5ed8b420296b27df34a54ec1ebd7b7cf58041425
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-01-01 | [baremetal] MaaS: Enable HTTP proxy | Alexandru Avadanii | 2 | -1/+13
Instead of using NAT on the mas01 node for all cluster node outgoing traffic, use the MaaS built-in proxy for APT traffic to leverage its caching capabilities too. Also enable the proxy for salt minions, so they can access public keyservers et al. Clean up public DNS from kvm nodes, as it interferes with the MaaS proxy.
Add example config for global env proxy, but don't enable it:
- default environment settings
- /etc/environment (via reclass);
The MaaS proxy will not be used (at least for now) on nodes:
- cfg01;
- mas01;
NOTE: We can't yet drop the maas.pxe_nat state completely, as certain Openstack services are still accessed via public addresses from ctl nodes.
JIRA: FUEL-317
JIRA: FUEL-318
Change-Id: I6c5f6872bb94afb838580571080e808bc262fc68
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2017-12-27 | [ovn] Inject ovn central options | Michael Polenchuk | 1 | -0/+15
Change-Id: Ib9021ee3ca15c05cc137ae42c263383acb4393bd Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2017-12-22 | [vcp] Catch 'no response' of salt minion as well | Michael Polenchuk | 1 | -1/+1
Salt minion could return 'no response' and cause an unconfigured state of the vcp node(s), so catch this output after linux state as well. Also clean up excess route on proxy nodes. Change-Id: I3183fa09ff41a8f027ee789869bdae0c3962ab8f Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2017-12-21 | Bring in ovn based scenario | Michael Polenchuk | 2 | -1/+12
OVN based scenario doesn't require conventional gateway node since connectivity to external networks and routing occurs on compute nodes. Change-Id: I81e0d497170d5ffb067adf13b0e46290525f26a6 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2017-12-20 | [maas] Adjust deployment order/timeouts | Michael Polenchuk | 1 | -3/+7
Change-Id: I9dbb51ce2387450e4ae19f8b3444f5e52cfdc71d Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2017-12-19 | [dpdk] Remove user/group setting for ovs rundir | Michael Polenchuk | 1 | -3/+0
The proper patches have been merged into upstream (nova/neutron formulas, system reclass) to use a separate dir for vhost_user sockets. Change-Id: Iba8d8a9a05c5ab681b5b5ffbea786dca92704c82 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2017-12-19 | [baremetal] MaaS: Reduce timeout values | Alexandru Avadanii | 1 | -9/+8
`maas_fixup` is already re-entrant, so we can execute it more than once during a commissioning/deploy cycle. Reduce the timeout waiting for all nodes to reach a stable state, so nodes stuck in 'Ready' state instead of reaching 'Deploying' get dealt with sooner (~5 min vs old 30 min). While at it, let `maas_fixup` handle machine deploy as well, so we can catch nodes stuck in 'Ready' state and re-trigger the deploy. Change-Id: Id24cc97b17489835c5846288639a9a6032bd320a Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2017-12-18 | Merge "states: networks: Use role-based addressing" | Alexandru Avadanii | 1 | -5/+5
2017-12-18 | Merge "[baremetal] Move salt master IP to PXE/admin" | Alexandru Avadanii | 1 | -2/+0