summaryrefslogtreecommitdiffstats
path: root/mcp/salt-formulas/maas
AgeCommit message (Collapse)AuthorFilesLines
2018-04-18[MaaS] destructive storage test fio on failureAlexandru Avadanii1-1/+1
Perform fio storage destructive test operation (usually takes just a few minutes) to completely destroy any previous storage metadata that might cause issues with cleanup in cloud-init/curtin during deploy. Only resort to fio when a node fails to deploy, which allows us to reuse the `maas.machines.mark_broken_state` state. JIRA: FUEL-365 Change-Id: Ief327e6b4fefa83a8a3c131acfdf9f5fd605689d Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-03-08Revert "[baremetal] Retire mas01 NAT"Alexandru Avadanii1-0/+37
Bring back public internet access to all cluster nodes via NAT on mas01 node, required for NTP syncing. NOTE: Both mcpcontrol and PXE/admin networks are currently hard wired to using /24 netmask, so we leverage that in pxe_nat.sls. JIRA: FUEL-348 This reverts commit 9a6e655e0b851ff6e449027c01ac1a66188b0064. Change-Id: I7bab385f95f8c6d92cadc4e2149c2cd56e10c506 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-02-17[MaaS] Add maas.machines.set_storage_layout slsAlexandru Avadanii1-0/+20
On cmp nodes, allocate only 30GB (fixed for now) for / partition. The rest of the disk(s) can later be allocated via salt-formula-linux. JIRA: FUEL-330 Change-Id: Ie11c78791e60801719cd33475ff91fc003df5ffa Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-02-17[MaaS] Override failed testing by defaultAlexandru Avadanii3-0/+22
Some nodes fail automatic testing done by MaaS during commissioning, although running the testing suites one more time manually works. For now, just override all 'failed testing' nodes unconditionally. JIRA: FUEL-333 Change-Id: I13d3ee3d82550524480aa53aa8752ab90aa940cd Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2018-01-03[baremetal] Retire mas01 NATAlexandru Avadanii1-37/+0
Isolate networks by retiring NAT on mas01; also cutting direct internet access from cluster nodes that are not facing the public network (prx, cmp). NOTE: Since we are removing mas01 NAT, VCP VMs (except prx which have public IPs) and kvm nodes (cmp also have public IPs) will no longer have direct internet connectivity. Cluster deployment and operations will work without it, but if it is required for different reasons, the MaaS proxy could be enabled by uncommenting the /etc/enviroment section in: - cluster.baremetal-mcp-pike-common-ha.include.proxy.yml JIRA: FUEL-317 Change-Id: I5ed8b420296b27df34a54ec1ebd7b7cf58041425 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2017-12-19[baremetal] MaaS: Reduce timeout valuesAlexandru Avadanii1-1/+1
`maas_fixup` is already re-entrant, so we can execute it more than once during a commissioning/deploy cycle. Reduce the timeout waiting for all nodes to reach a stable state, so nodes stuck in 'Ready' state instead of reaching 'Deploying' get dealt with sooner (~5 min vs old 30 min). While at it, let `maas_fixup` handle machine deploy as well, so we can catch nodes stuck in 'Ready' state and re-trigger the deploy. Change-Id: Id24cc97b17489835c5846288639a9a6032bd320a Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2017-12-18[baremetal] Move salt master IP to PXE/adminAlexandru Avadanii1-15/+0
Use PXE/admin network for salt traffic from/to all minions except cfg01, mas01. This allows us to drop the route to admin net from cfg01. Change-Id: Ic2526f1ff77afe5d92ced900971f4c8f78d2d8a2 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2017-12-11[baremetal] Move all MaaS PXE net config to PDFAlexandru Avadanii1-1/+1
- s/opnfv_maas_pxe_/opnfv_infra_maas_pxe_/g to align with other vars; - patches: pharos: Add MaaS PXE network to installer adapter; - runtime.yml{,.template}: move to installer adapter, update pod_config.yml example; - drop MAAS_PXE_NETWORK global env var, now read strictly from PDF; JIRA: FUEL-313 Change-Id: I46d7510bd53fba7890c411d36bc28fd6ff6f3648 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2017-10-14Add license headers where missingAlexandru Avadanii4-0/+28
While at it, compact 'set' into bash shebang where possible and add `make patches-copyright` target to simplify adding patch license headers. Change-Id: I0c841de72e5709e5eef915a52c5ec4a7fc0f7c37 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2017-10-03Identify jump host bridges based on IDF / PDF netsAlexandru Avadanii1-1/+1
- minor refactor of runtime templates parsing to allow var expansion; - parse <pod_config.yml> into shell vars, match dynamically networks from PDF to IP addresses on bridges of current jumphost; - keep old '-B' parameter in <ci/deploy.sh>, use it for providing fallback values in case there's no bridge name specified via IDF and no IP on the jumphost for one or more of the PDF networks; - re-enable dry-run to ease testing of the above; - add sample 'idf-pod1.yaml' to <mcp/config/labs/local>; The new behavior will try to determine the jump host bridge names: 1. Based on IDF mapping, if available 2. Based on PDF network matching with IP addrs on jumphost; 3. Fallback to values passed via '-B'; 4. Fallback to default values hardcoded in the deploy script; Later, we will drop MaaS network env vars in favor of PDF vars, once the PDF template is generating them. Change-Id: If9cd65d310c02965b2e2bfa06a0d7e0f97f1dd48 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2017-09-23MaaS: Reduce C/D timeouts, minor fixesAlexandru Avadanii2-3/+3
- add new patch for maas.region, extending it poorly with a timeout override mechanism; the new comissioning/deploying timeout defaults (10/15min) will be used instead of MaaS defaults (20/40min), unless reclass params are defined with different values; - add 30s delay between 'machine mark-broken' and 'machine mark-fixed' MaaS cli commands (fixes a rare race condition); - fix forgotten replace in 'maas.pxe_route': s/opnfv_fuel_/opnfv_/g; Change-Id: I71c562b80031bac2793dd470d52928c2d62e5300 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2017-09-12reclass, states: Parametrize runtime configurationAlexandru Avadanii1-0/+8
mcpcontrol virsh network, as well as MaaS PXE network are installer specific, and not POD specific. Therefore, these should be easily parametrized without the PDF, using only installer inputs (e.g. env vars passed via Jenkins). - add new <all-mcp-ocata-common.opnfv.runtime> reclass class; - parametrize at runtime new reclass class based on global vars; - factor out MaaS deploy address / config using new mechanism; - parametrize at runtime virsh network definitions based on template; - add new "maas.pxe_route" sls for configuring routing on cfg01; - replace env vars with the new sls in "maas" state; NOTE: baremetal parametrization will be handled later. Change-Id: Ifd61143d818fb088b3f4395388ba769bbc49156e Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2017-08-23MaaS: commissioning/deployment retryAlexandru Avadanii2-0/+24
While at it, parametrize max attempt number in maas state's "wait_for", and reduce retries count for certain simpler tasks. Change-Id: I3ac2877719cdd32613bcf41186ebbb9f3f3aee93 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2017-08-17Bring in baremetal supportAlexandru Avadanii1-0/+30
- ci/deploy.sh: fail if default scenario file is missing; - start by copying reclass/classes/cluster/virtual-mcp-ocata-ovs as classes/cluster/baremetal-mcp-ocata-ovs; - add new state (maas) that will handle MaaS configuration; - Split PXE network in two for baremetal: * rename old "pxe" virtual network to "mcpcontrol", make it non-configurable and identical for baremetal/virtual deploys; * new "pxebr" bridge is dedicated for MaaS fabric network, which comes with its own DHCP, TFTP etc.; - Drop hardcoded PXE gateway & static IP for MaaS node, since "mcpcontrol" remains a NAT-ed virtual network, with its own DHCP; - Keep internet access available on first interfaces for cfg01/mas01; - Align MaaS IP addrs (all x.y.z.3), add public IP for easy debug via MaaS dashboard; - Add static IP in new network segment (192.168.11.3/24) on MaaS node's PXE interface; - Set MaaS PXE interface MTU 1500 (weird network errors with jumbo); - MaaS node: Add NAT iptables traffic forward from "mcpcontrol" to "pxebr" interfaces; - MaaS: Add harcoded lf-pod2 machine info (fixed identation in v6); - Switch our targeted scenario to HA; * scenario: s/os-nosdn-nofeature-noha/os-nosdn-nofeature-ha/ - maas region: Use mcp.rsa.pub from ~ubuntu/.ssh/authorized_keys; - add route for 192.168.11.0/24 via mas01 on cfg01; - fix race condition on kvm nodes network setup: * add "noifupdown" support in salt formula for linux.network; * keep primary eth/br-mgmt unconfigured till reboot; TODO: - Read all this info from PDF (Pod Descriptor File) later; - investigate leftover references to eno2, eth3; - add public network interfaces config, IPs; - improve wait conditions for MaaS commision/deploy; - report upstream breakage in system.single; Change-Id: Ie8dd584b140991d2bd992acdfe47f5644bf51409 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com> Signed-off-by: Guillermo Herrero <Guillermo.Herrero@enea.com> Signed-off-by: Charalampos Kominos <Charalampos.Kominos@enea.com> Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>