Age | Commit message (Collapse) | Author | Files | Lines |
|
Salt relies on a limiting libvirt_domain j2 template to generate the
XML it passes to libvirt for salt.control managed virtual machines.
For AArch64, we need to set up 3 XML nodes in a non-default way:
1. UEFI firmware (AAVMF) should be enabled by passing a pflash loader;
2. CPU mode should be 'host-passthrough';
3. QEMU machine type should be 'virt';
To allow configuring the above using pillar data:
- virtng module: implement functionality similar to upstream changes:
* 219b84a512 virt module: Allow NVRAM unlinking on DOM undefine
in develop, not in 2018.2;
* 9cace9adb9 Add support to virt for libvirt loader
in develop, not in 2018.2;
- virtng module: extend it with:
* pass virt machine type to vm;
* pass cpu_mode to vm;
JIRA: ARMBAND-404
Change-Id: Ib2123e7170991b3dfbdb42bd1a2baa5a4360b200
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
AArch64 specific formula, mostly tweaking nova conf / installing
virtualization layer prerequisites:
- install qemu-efi;
- install vgabios;
- fix missing link for vgabios binary blob;
- nova conf: cpu_model=cortex-a57 (only for virtual deploys);
- nova conf: virt_type=qemu (only for virtual deploys);
- nova compute conf: virt_type=qemu (only for virtual deploys);
- nova conf: pointer_model=ps2mouse since AArch64 has no USB tablet;
[1] https://github.com/openstack/nova/commit/f0f0953
Change-Id: I40515bdbd941850b103a86d51b347cc8610f5741
Signed-off-by: Guillermo Herrero <Guillermo.Herrero@enea.com>
Signed-off-by: Charalampos Kominos <Charalampos.Kominos@enea.com>
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
* Refactor OPNFV salt-formulas mechanism to resemble upstream git
structure:
- git submodules: add new submodule for each formula we patch;
- create salt-formula-x directories for OPNFV formulas;
- move mcp/metadata/service contents to their each formula subdir;
- use `make patches-import` for patches previously handled by
patch.sh;
- retire patch.sh
* states: add virtual_init:
- mostly based on old salt.sh, which is now obsolete;
- exclude salt-master service restart (it would kill the container);
* scenarios: cleanup (rm cfg01 virtual node def), adopt virtual_init;
* reclass: align our model with prebuilt container's Salt config:
- drop linux:network pillar data (handled by Docker);
- stop applying linux.system state on cfg01;
- align salt user homedir;
- drop salt-formula packages (preprovisioned);
* minor plumbing in deploy.sh and lib.sh;
JIRA: FUEL-383
Change-Id: I28708a9b399d3f19012212c71966ebda9d6fc0ac
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: I84a4789ff2155d7c14f9ffd9bfe54c5bca7a0d4f
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: I192cda7412151b2974e1bcd79a51f7593acace5d
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Align opendaylight/server sls with the latest salt-formula-linux
version of keyserver requests support behind proxy.
Change-Id: I55f9010eec8b74932d4a28215ecf9647deb3d82c
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
- noha: 'accept_policy: open_mode' to align with ha scenarios;
- s/cmp01/cmp001/g to align all scenarios and allow code reuse;
- rename network params: s/dhcp/mcpcontrol/g, cleanup;
- computes XDF data: drop 'opnfv_*' layer of params, cleanup;
- local vPDF: add comments with default roles by node index;
- parameterize all netmasks;
- drop unused address/netmask for 'proto: manual' interfaces;
- virsh_net: cleanup definitions, remove hardcodes, align IP on
jumpserver and DHCP range with MaaS for pxebr;
- maas: parameterize hardcoded '/24' cidr for PXE/admin, refactor
maas.region.machines parameterization;
- merge <all-mcp-arch-common/infra/config_*pdf.yaml.j2> templates;
- move reclass.storage definitions of compute nodes to common dir;
- drop 'openstack_compute_*' reclass params in favor of expanding
them via j2 directly in reclass.storage params;
- adopt `nm.cluster.has_*_nodes` where possible;
- obsolete `runtime.yml` from reclass model;
- refactor arch-specific reclass param selection;
- remove unused defaults in favor of mandatory IDF properties;
- noha: prepare for baremetal node support in cinder_lvm_devices;
- interfaces: add interface_mtu and 'noifupdown: true' everywhere;
- interfaces: use j2 macros to generate eth/vlan config;
- states cleanup: remove DHCP route disable workaround on prx/cmp;
- allow configuring NTP servers via:
`idf.fuel.network.ntp_strata_host{1,2}`;
- ovs_bridge: Allow setting gateway, dns-nameservers
- apache: Adjust module list for novcp class inheritance;
- glusterfs PPA: pin with same prio of MCP repos for novcp scenario;
JIRA: FUEL-319
JIRA: FUEL-326
JIRA: FUEL-337
Change-Id: Ia6ad64ba8cade85a75fb22c9a2505decc3834360
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: Id024ed22dd1760f41ae18aeb8e680c2f07a5dc63
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
|
|
Install tacker from git repository since there is no ubuntu package yet.
Change-Id: Ibe4b6486050213df1a545c5c79c43a635bbf6c08
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
After salt update to version 2017.7.0 the indefinite mask
has to be removed before attempting to start the service.
Change-Id: I21616929f06f8ebd8a2d70e8c33f92c7b808a9c5
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
ODL requires native leveldbjni support on architectures like AArch64,
provided as a Debian package in ODL Team Nitrogen PPA.
Only systemd is supported (unlikely to change).
JIRA: ARMBAND-387
Change-Id: Ie7f2955c6574ab4584ed0c207b42ed7ab7261561
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Replace MAAS CLI set_disk_layout with the
new maas.machines.storage state
JIRA: FUEL-364
Change-Id: I4d8cd9f473c5386ee7b32ad378ca1e02989233ca
Signed-off-by: ting wu <ting.wu@enea.com>
|
|
Perform fio storage destructive test operation (usually takes just a
few minutes) to completely destroy any previous storage metadata that
might cause issues with cleanup in cloud-init/curtin during deploy.
Only resort to fio when a node fails to deploy, which allows us to
reuse the `maas.machines.mark_broken_state` state.
JIRA: FUEL-365
Change-Id: Ief327e6b4fefa83a8a3c131acfdf9f5fd605689d
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Oxygen has an issue with broken config/data cache caused
by service restart in the middle of initial boot.
Change-Id: Ia30c76b67566ab8a2fb9045d0e10ca788f1a06a6
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
JIRA: FUEL-362
Change-Id: Ib2621bca72d1ba376af5d369edcf5fcf37e9788b
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
Bring back public internet access to all cluster nodes via NAT
on mas01 node, required for NTP syncing.
NOTE: Both mcpcontrol and PXE/admin networks are currently
hard wired to using /24 netmask, so we leverage that in pxe_nat.sls.
JIRA: FUEL-348
This reverts commit 9a6e655e0b851ff6e449027c01ac1a66188b0064.
Change-Id: I7bab385f95f8c6d92cadc4e2149c2cd56e10c506
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
- fix `route-br-ex` if-up.d script failing when route already exists
by adding a wrapper around distro's '/sbin/route' binary in
'/usr/local/sbin/route', exploiting default order in Ubuntu PATH;
- fix 'br-prv' duplicate entry in 'interfaces.d/ifcfg-br-prv' and
'interfaces' caused by upstream bug [1];
- add barrier waiting for all baremetal nodes online before attempting
reboot, trying to catch rare failures which are undetectable in logs
as both a succesful reboot and a disconneted minion report 'n/c';
With the above in place, networking service should no longer fail
to start on cmp nodes w/ DPDK.
[1] https://github.com/saltstack/salt/issues/40262
Change-Id: I6d4895376ce323c14c997e6c9af2ea3eeeee0184
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
On cmp nodes, allocate only 30GB (fixed for now) for / partition.
The rest of the disk(s) can later be allocated via salt-formula-linux.
JIRA: FUEL-330
Change-Id: Ie11c78791e60801719cd33475ff91fc003df5ffa
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Some nodes fail automatic testing done by MaaS during commissioning,
although running the testing suites one more time manually works.
For now, just override all 'failed testing' nodes unconditionally.
JIRA: FUEL-333
Change-Id: I13d3ee3d82550524480aa53aa8752ab90aa940cd
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
- add new virsh managed network 'pxebr' (to mimic baremetal behavior
on virtual PODs, this will be the equivalent of PXE/admin network);
- connect 'pxebr' to 3rd interface for cfg01, mas01 for all deploys
(used to be baremetal-specific), replacing 'internal';
- keep 'mcpcontrol' connected only to 'cfg01' (+ 'mas01' if present)
for initial infrastructure bring-up (1st interface);
- switch all virtual cluster nodes to 'pxebr' (1st interface);
- use 'pxebr' for all Salt cluster nodes traffic, 'mcpcontrol' only
for mas01<=>cfg01 Salt traffic;
- convert <user-data.template> to jinja2 and expand it based on PDF
instead of using `envsubst`;
- split <user-data.sh.j2> into two versions, one for each network
used for Salt traffic;
- ci/deploy.sh: Read scenario data before template parsing for
cluster domain variable, needed in virsh network def;
- leave docs diagram refresh to later after all possible deploy types
have settled;
- limit keyserver proxy usage to nodes where the configured http proxy
matches the first nameserver (true for all MaaS-provisioned nodes),
so we can re-use the same pillar for FN VMs and baremetal nodes;
- add PXE/admin IP on cfg01's 3rd interface and switch other vnodes
`salt_master_host` to point to it;
JIRA: FUEL-322
Change-Id: Ie4f7aedddf2ef81046f1127b377d88dce79f0fda
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: Iaa917be9f8f86c328ce4d503923a0d7cca680434
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
Instead of defining a http proxy for all salt-minion traffic, which
also includes some Openstack API accesses we can't filter (no_proxy
is not yet supported), add & leverage support for proxy configuration
during APT keyserver access / key download.
JIRA: FUEL-331
Change-Id: I9470807633596c610cfafb141b139ddda2ff096b
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Isolate networks by retiring NAT on mas01; also cutting direct
internet access from cluster nodes that are not facing the public
network (prx, cmp).
NOTE: Since we are removing mas01 NAT, VCP VMs (except prx which have
public IPs) and kvm nodes (cmp also have public IPs) will no longer
have direct internet connectivity.
Cluster deployment and operations will work without it, but if it is
required for different reasons, the MaaS proxy could be enabled by
uncommenting the /etc/enviroment section in:
- cluster.baremetal-mcp-pike-common-ha.include.proxy.yml
JIRA: FUEL-317
Change-Id: I5ed8b420296b27df34a54ec1ebd7b7cf58041425
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Instead of using NAT on the mas01 node for all cluster node outgoing
traffic, use the MaaS built-in proxy for APT traffic to leverage its
caching capabilities too.
Also enable the proxy for salt minions, so they can access public
keyservers et al.
Cleanup public DNS from kvm nodes, interferes with MaaS proxy.
Add example config for global env proxy, but don't enable it:
- default environment settings - /etc/environment (via reclass);
The MaaS proxy will not be used (at least for now) on nodes:
- cfg01;
- mas01;
NOTE: We can't yet drop the maas.pxe_nat state completely, as certain
Openstack services are still accessed via public addresses from ctl
nodes.
JIRA: FUEL-317
JIRA: FUEL-318
Change-Id: I6c5f6872bb94afb838580571080e808bc262fc68
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
`maas_fixup` is already re-entrant, so we can execute it more than
once during a commissioning/deploy cycle. Reduce the timeout waiting
for all nodes to reach a stable state, so nodes stuck in 'Ready'
state instead of reaching 'Deploying' get dealt with sooner (~5 min
vs old 30 min).
While at it, let `maas_fixup` handle machine deploy as well, so we
can catch nodes stuck in 'Ready' state and re-trigger the deploy.
Change-Id: Id24cc97b17489835c5846288639a9a6032bd320a
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Use PXE/admin network for salt traffic from/to all minions
except cfg01, mas01.
This allows us to drop the route to admin net from cfg01.
Change-Id: Ic2526f1ff77afe5d92ced900971f4c8f78d2d8a2
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
- s/opnfv_maas_pxe_/opnfv_infra_maas_pxe_/g to align with other vars;
- patches: pharos: Add MaaS PXE network to installer adapter;
- runtime.yml{,.template}: move to installer adapter, update
pod_config.yml example;
- drop MAAS_PXE_NETWORK global env var, now read strictly from PDF;
JIRA: FUEL-313
Change-Id: I46d7510bd53fba7890c411d36bc28fd6ff6f3648
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
While at it, compact 'set' into bash shebang where possible and
add `make patches-copyright` target to simplify adding patch
license headers.
Change-Id: I0c841de72e5709e5eef915a52c5ec4a7fc0f7c37
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
While at it, fix some shellcheck warnings, and s/fgrep/grep -F/g.
Change-Id: I093b7b4c196731b1ecc0c27a4111955b2e412762
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
* use pseudo agentdb port binding controller instead of
the deprecated network topology one
* disable superfluous l2population mechanism driver
* tidy up the duplicated haproxy neutron listen opts
* straighten karaf features list
* update jetty config
Change-Id: Ifacf8de11eb56ab72df13a312151a510b280dea2
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
- minor refactor of runtime templates parsing to allow var expansion;
- parse <pod_config.yml> into shell vars, match dynamically networks
from PDF to IP addresses on bridges of current jumphost;
- keep old '-B' parameter in <ci/deploy.sh>, use it for providing
fallback values in case there's no bridge name specified via IDF
and no IP on the jumphost for one or more of the PDF networks;
- re-enable dry-run to ease testing of the above;
- add sample 'idf-pod1.yaml' to <mcp/config/labs/local>;
The new behavior will try to determine the jump host bridge names:
1. Based on IDF mapping, if available
2. Based on PDF network matching with IP addrs on jumphost;
3. Fallback to values passed via '-B';
4. Fallback to default values hardcoded in the deploy script;
Later, we will drop MaaS network env vars in favor of PDF vars,
once the PDF template is generating them.
Change-Id: If9cd65d310c02965b2e2bfa06a0d7e0f97f1dd48
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
- add new patch for maas.region, extending it poorly with a timeout
override mechanism; the new comissioning/deploying timeout defaults
(10/15min) will be used instead of MaaS defaults (20/40min), unless
reclass params are defined with different values;
- add 30s delay between 'machine mark-broken' and 'machine mark-fixed'
MaaS cli commands (fixes a rare race condition);
- fix forgotten replace in 'maas.pxe_route': s/opnfv_fuel_/opnfv_/g;
Change-Id: I71c562b80031bac2793dd470d52928c2d62e5300
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
mcpcontrol virsh network, as well as MaaS PXE network are installer
specific, and not POD specific.
Therefore, these should be easily parametrized without the PDF,
using only installer inputs (e.g. env vars passed via Jenkins).
- add new <all-mcp-ocata-common.opnfv.runtime> reclass class;
- parametrize at runtime new reclass class based on global vars;
- factor out MaaS deploy address / config using new mechanism;
- parametrize at runtime virsh network definitions based on template;
- add new "maas.pxe_route" sls for configuring routing on cfg01;
- replace env vars with the new sls in "maas" state;
NOTE: baremetal parametrization will be handled later.
Change-Id: Ifd61143d818fb088b3f4395388ba769bbc49156e
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
While at it, parametrize max attempt number in maas state's "wait_for",
and reduce retries count for certain simpler tasks.
Change-Id: I3ac2877719cdd32613bcf41186ebbb9f3f3aee93
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
- ci/deploy.sh: fail if default scenario file is missing;
- start by copying reclass/classes/cluster/virtual-mcp-ocata-ovs as
classes/cluster/baremetal-mcp-ocata-ovs;
- add new state (maas) that will handle MaaS configuration;
- Split PXE network in two for baremetal:
* rename old "pxe" virtual network to "mcpcontrol", make it
non-configurable and identical for baremetal/virtual deploys;
* new "pxebr" bridge is dedicated for MaaS fabric network, which
comes with its own DHCP, TFTP etc.;
- Drop hardcoded PXE gateway & static IP for MaaS node, since
"mcpcontrol" remains a NAT-ed virtual network, with its own DHCP;
- Keep internet access available on first interfaces for cfg01/mas01;
- Align MaaS IP addrs (all x.y.z.3), add public IP for easy debug
via MaaS dashboard;
- Add static IP in new network segment (192.168.11.3/24) on MaaS
node's PXE interface;
- Set MaaS PXE interface MTU 1500 (weird network errors with jumbo);
- MaaS node: Add NAT iptables traffic forward from "mcpcontrol" to
"pxebr" interfaces;
- MaaS: Add harcoded lf-pod2 machine info (fixed identation in v6);
- Switch our targeted scenario to HA;
* scenario: s/os-nosdn-nofeature-noha/os-nosdn-nofeature-ha/
- maas region: Use mcp.rsa.pub from ~ubuntu/.ssh/authorized_keys;
- add route for 192.168.11.0/24 via mas01 on cfg01;
- fix race condition on kvm nodes network setup:
* add "noifupdown" support in salt formula for linux.network;
* keep primary eth/br-mgmt unconfigured till reboot;
TODO:
- Read all this info from PDF (Pod Descriptor File) later;
- investigate leftover references to eno2, eth3;
- add public network interfaces config, IPs;
- improve wait conditions for MaaS commision/deploy;
- report upstream breakage in system.single;
Change-Id: Ie8dd584b140991d2bd992acdfe47f5644bf51409
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
Signed-off-by: Guillermo Herrero <Guillermo.Herrero@enea.com>
Signed-off-by: Charalampos Kominos <Charalampos.Kominos@enea.com>
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: I8a3be1764de136e2ecf81f964233483be5d6655a
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
* fix formula & reclass cluster model
* bring in running states
Change-Id: I8e66e69045f5c745f9aa6f59f7ce6d66b5bf1c95
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
Change-Id: I2eed0cf19907f257be1cb4aee96528cc41f4843a
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|