|
By default, vnet devices have an MTU of 1500 on the host side, causing
issues with larger packets traversing the bridges between guest VMs
when guest VMs have jumbo frames enabled.
JIRA: FUEL-336
JIRA: FUEL-367
JIRA: FUEL-382
[1] http://linuxaleph.blogspot.com/2013/01/how-to-network-jumbo-frames-to-kvm-guest.html
[2] https://packetpushers.net/udev/
Change-Id: I941ac9cf764e3b3fa2d6463be5363b5459775f29
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
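For illustration only, a udev rule along these lines could raise the MTU of host-side vnet devices as they appear. The helper name `print_vnet_mtu_rule`, the rule file path and the value 9000 are assumptions, not taken from this change.

```shell
# Sketch: emit a udev rule that sets the MTU on newly added vnet* devices.
# print_vnet_mtu_rule and the default of 9000 are hypothetical.
print_vnet_mtu_rule() {
  mtu=${1:-9000}
  printf 'SUBSYSTEM=="net", ACTION=="add", KERNEL=="vnet*", ATTR{mtu}="%s"\n' "${mtu}"
}

# Possible use (as root, path is an assumption):
#   print_vnet_mtu_rule 9000 > /etc/udev/rules.d/99-vnet-mtu.rules
```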
|
|
Split scenario yaml definitions for virtual.nodes based on node
role ('infra', 'control' or 'compute'), to be leveraged later to
construct node lists based on said role.
This moves the responsibility of filtering node names in scenario
files (based on 'virtual' or 'baremetal' type) to xdf_data.sh.j2,
simplifying scenario templates.
By keeping all nodes (both virtual and baremetal) in scenario files,
we can later determine the role (and implicitly the hostname) for a
MaaS-managed node based on its index in the virtual.nodes.control
structure.
JIRA: FUEL-382
Change-Id: I1f83a307631f4166ee1c57ef598c44876b962f97
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
For hybrid PODs (e.g. x86_64 jumpserver + control nodes, aarch64
baremetal compute nodes), the virtual nodes rely on MaaS DHCP to be
up when the OS boots, so issue a `virsh reset` accordingly.
Instead of checking for online nodes using `test.ping`, use
`saltutil.sync_all` to also sync Salt state modules to the virtual
nodes (usually handled by baremetal_init state in HA deploys).
JIRA: FUEL-338
Change-Id: If689d057dc4438102c3a7428a97b9638e21bfdc5
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
`virsh undefine` argument `--nvram` is only supported by newer
versions of libvirt.
Although this is mandatory for AArch64, for x86_64 this is not a
blocker (since we don't enable OVMF for the VMs on the jumpserver).
Change-Id: I3a82bc54b36228980a41d77a463a7558a685c03d
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
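One way to tolerate older libvirt versions is to try the `--nvram` form first and fall back to a plain `undefine`. This is a sketch under that assumption; `undefine_vm` is a hypothetical helper and the actual change may handle the flag differently.

```shell
# Sketch: prefer `virsh undefine --nvram`, fall back to plain `undefine`
# when the installed libvirt rejects the flag. undefine_vm is hypothetical.
undefine_vm() {
  vm=$1
  virsh undefine "${vm}" --nvram 2>/dev/null || virsh undefine "${vm}"
}
```

On AArch64, where the NVRAM file must be removed, the fallback would still fail, which matches the note that the flag is mandatory there.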
|
|
While at it, skip waiting for Salt master when deleting UEFI stale
entries if it doesn't respond to ping.
Also, use https for fetching Armband GPG key to bypass yet another
hkp issue behind proxies/firewalls that block the hkp port.
Change-Id: I400cbe3257094b62c96b302a3c81c5ffd1ba4755
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: I1c5e3d7a0dbac14bf242730d6ac8d2b1d0817907
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
j2/python makes strings easier to read and manipulate, although it
does need some special care around undefined dict keys.
With this in place, deploy.sh only contains the higher level logic for
the deployment process.
- merge arch-specific default configuration files into a single
file with arch name as main dict key of old config (also avoids
creating duplicate 'virtual' YAML keys in $LOCAL_PDF);
- move template handling to separate <lib_template.sh>;
- decouple tight bash ordering of scenario expansion -> parse_yaml ->
variable export (e.g. CLUSTER_DOMAIN) -> re-use in cluster j2s;
however we can't parse *all* j2s in one go, as scenario j2s might
expand to YAMLs needed while expanding cluster j2;
- split `do_templates` into separate functions for each stage, with
no coupling between them other than call order;
Change-Id: I4b5e804094c00e5e918caf769fd85fa52181ad76
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: I687b73b256aca78c9d41d4bcd49bfbde51278b51
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
- bump Pharos git submodule for j2 'do' extension + batch mode;
- adopt j2 'do' in our templates;
- use int filter for 'native' vlan check;
- lib.sh: adopt `-i` to remove `ln` hack for net_map.j2;
- lib.sh: adopt `-b` to speed up template parsing;
NOTE: Bumping Pharos will also bring in the latest changes in
pod_config.yml.j2, which include massive IP shifts and updates.
JIRA: FUEL-335
Change-Id: I7d3a997b3d8659d5f09f867870fb3a148c1ec6df
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Run the Pharos YAML schema validation for configuration files
before expanding them.
JIRA: FUEL-341
Change-Id: Ia1d69f53265876683a1b6674665a9594ba7dae16
Signed-off-by: Guillermo Herrero <guillermo.herrero@enea.com>
|
|
Replace loop device LVM-backed cinder volume with a dedicated
/dev/vdb drive.
This is another small step towards bringing noHA to baremetal.
Change-Id: I80f9c2bee42e933a36ab7a8f9b4c5247d1652b42
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
- MaaS requires PXE/admin to be a Linux bridge;
- if virtual nodes are present, they should be hooked to a proper
Linux bridge for the Public network, but only throw a warning if
not (and create a mock public virsh network instead);
- if both virtual and baremetal nodes are present, Public bridge is
indirectly mandatory (we can't mock it);
JIRA: FUEL-339
Change-Id: Idfe99d66c49eadc56cb3d94ca4db3467fb76d388
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
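The three rules above can be sketched as a small check. Linux bridges expose a `bridge` directory under their sysfs entry, which is what the sketch relies on; `SYSFS_NET`, `is_linux_bridge`, `check_public_bridge` and the printed messages are hypothetical names, not the ones used by the deploy scripts.

```shell
# Sketch: classify a jumpserver interface as a Linux bridge via sysfs.
# SYSFS_NET, is_linux_bridge and check_public_bridge are hypothetical names.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

is_linux_bridge() {
  # Linux bridges expose a 'bridge' directory under their sysfs entry.
  [ -d "${SYSFS_NET}/$1/bridge" ]
}

# Args: <public bridge name> <has virtual nodes: 0|1> <has baremetal nodes: 0|1>
check_public_bridge() {
  if is_linux_bridge "$1"; then
    echo "ok"
  elif [ "$2" = 1 ] && [ "$3" = 1 ]; then
    # Hybrid POD: the Public network cannot be mocked.
    echo "error: Public bridge is mandatory for hybrid PODs"
    return 1
  elif [ "$2" = 1 ]; then
    # Virtual-only POD: warn and fall back to a mock virsh network.
    echo "warning: using a mock public virsh network"
  else
    # Assumption: without virtual nodes this check does not apply.
    echo "skipped: no virtual nodes"
  fi
}
```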
|
|
Add a new class of scenarios, based on existing baremetal HA
scenarios, but instead of having a virtualized control plane (VCP),
all OpenStack controller services will run directly on the cluster
nodes.
This change adds the common scaffolding, as well as the OVS scenario.
The new scenario(s) can be used on full-baremetal clusters, soon on
full-virtual clusters and later on hybrid (virt + bare) clusters.
This change defines old (current) style scenario definitions for
both baremetal and virtual, both named:
- os-nosdn-nofeature-novcp-ha;
Prerequisites:
1. Merge-able by name reclass.storage.node definitions
Each cluster (e.g. database, telemetry) adds its own set of
reclass storage node definitions, which for novcp scenarios should
be merged into a single node (kvm) based on the 'name' property.
This is not currently supported by upstream reclass 'node.sls'
high state, so add support for it via an early patch (required
before salt-master-init.sh tries to handle reclass.storage).
2. common reclass classes for novcp
Some of the classes in `baremetal-...-common-ha` are not fit for
novcp as they define VCP-specific config/inheritance, so add new
versions of said classes with novcp in mind or adapt old classes:
- parameterize ctl hostname in `openstack_compute.yml`;
- new `openstack_control_novcp.yml`;
- new `openstack_init_novcp.yml`;
3. Handle hard set names in state files for baremetal nodes
Some of our state files (e.g. maas) hardcode baremetal node names
to 'kvm', 'cmp', so we need to align the names in novcp scenario
with these values to re-use the maas state. As a future improvement
we should parameterize these names in all state files.
As a consequence, our baremetal controller nodes will also use
'kvm*' hostnames (instead of 'ctl*').
4. Add 'noifupdown' to all interfaces on kvm nodes to prevent duplicate
IPs/routes created at *any* ifup due to /etc/network/route-br-ex.
Patch salt-formula-linux to skip network restart on 'noifupdown',
also when routes are present on that interface.
JIRA: FUEL-310
Change-Id: Ic67778f63e5ee0334dbfe9547c7109ec1a938d61
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
- add missing network definitions for ODL node's 1st interface;
- add missing comments for `notify` global functions;
- fix or silence shellcheck issues;
JIRA: FUEL-322
Change-Id: Ie3341d29ab12ddf432db603ad865259afb54714e
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Some sysadmins or distro defaults might blacklist br_netfilter, or
it might not be loaded at deploy start; account for these corner
cases too.
JIRA: FUEL-334
Change-Id: I3ca6cb3848df8d2af1625ff4e3816efe8b320886
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
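A minimal sketch of such a guard, assuming the module is loaded explicitly by name when it is not already present. `SYSFS_MODULE` and `ensure_module` are hypothetical names, not the ones used in the deploy scripts.

```shell
# Sketch: make sure a kernel module is loaded before the deploy relies on it.
# SYSFS_MODULE and ensure_module are hypothetical names.
SYSFS_MODULE=${SYSFS_MODULE:-/sys/module}

ensure_module() {
  mod=$1
  # Loaded modules show up as a directory under /sys/module.
  [ -d "${SYSFS_MODULE}/${mod}" ] && return 0
  # Explicit load by name; a 'blacklist' entry only affects alias-based loading.
  modprobe "${mod}"
}

# Possible use (as root): ensure_module br_netfilter
```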
|
|
Decouple virtual cluster nodes (ctl, gtw etc.) from opnfv_fn_* vars
in favor of parsing PDF/IDF.
This is the first step towards unifying baremetal and virtual network
definition templates, as well as allowing virtual nodes to run on a
remote hypervisor (and eventually with a different arch).
opnfv_fn_* vars will still be used for infra VMs spawned on FN (cfg01
and optionally mas01).
Adopt new 'net_map.j2' from Pharos submodule for new templates (virt),
as well as old ones (baremetal).
JIRA: FUEL-322
Change-Id: I150c2416566bbe42ea11cd00f12a8a7bf96776c2
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
- add new virsh managed network 'pxebr' (to mimic baremetal behavior
on virtual PODs, this will be the equivalent of PXE/admin network);
- connect 'pxebr' to 3rd interface for cfg01, mas01 for all deploys
(used to be baremetal-specific), replacing 'internal';
- keep 'mcpcontrol' connected only to 'cfg01' (+ 'mas01' if present)
for initial infrastructure bring-up (1st interface);
- switch all virtual cluster nodes to 'pxebr' (1st interface);
- use 'pxebr' for all Salt cluster nodes traffic, 'mcpcontrol' only
for mas01<=>cfg01 Salt traffic;
- convert <user-data.template> to jinja2 and expand it based on PDF
instead of using `envsubst`;
- split <user-data.sh.j2> into two versions, one for each network
used for Salt traffic;
- ci/deploy.sh: Read scenario data before template parsing for
cluster domain variable, needed in virsh network def;
- leave docs diagram refresh to later after all possible deploy types
have settled;
- limit keyserver proxy usage to nodes where the configured http proxy
matches the first nameserver (true for all MaaS-provisioned nodes),
so we can re-use the same pillar for FN VMs and baremetal nodes;
- add PXE/admin IP on cfg01's 3rd interface and switch other vnodes
`salt_master_host` to point to it;
JIRA: FUEL-322
Change-Id: Ie4f7aedddf2ef81046f1127b377d88dce79f0fda
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
- move bash template handling (previously expanded via `envsubst`)
to lib.sh;
- move j2 template handling to lib.sh;
- move virsh network templates to 'mcp/scripts/virsh_net' subdir;
- switch virsh network templates from `envsubst` expansion to j2 and
leverage generate_config.py, similar to PDF Fuel installer adapter;
- add relevant runtime env vars (e.g. SALT_MASTER, MAAS_IP) on the fly
to PDF, to consume them in templates like params coming from PDF;
- parameterize virsh network definitions based on PDF (mgmt, public);
JIRA: FUEL-322
Change-Id: Ib94e78fc4f25797b9354a0552e884104da5d0003
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
RHEL family virtualization tools reserve 02:00 PCI slot for VGA, even
if 'nographics' is specified when creating the VM (in case the user
wants to later hook a video card, which usually *requires* PCI slot 2).
Debian systems do not follow this rule (tested with libvirt 1.x, 2.x,
3.x), hence 1st NIC lands on PCI slot 2 (and gets eth name 'ens2').
To align the behavior across all possible jumpserver distros, bring
back the virtio video.
This reverts commit 738f6c3b68d1179de1ff790f9e72c25f10874da4.
Change-Id: Ifd855c12e04aec1ff0ab047b13f8081365741889
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
JIRA: FUEL-334
Change-Id: I6d2499053dcfb7f99593fcd5c948b569bdcb9c9b
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Since VCP VMs (created via salt formula) do not have a video
controller defined in their domain XMLs, network devices end on
different PCI slots and hence have different names assigned
(ens2+ vs foundation node VMs, which start with ens3).
To align network interface names for VMs on jumpserver vs kvm nodes,
and reduce confusion, remove the video controller from FN VMs.
This allows some cleanup:
- drop extra AArch64 args from virt-install;
- unify 'opnfv_vcp_vm_*' and 'opnfv_fn_vm_*' variables;
Change-Id: I0d108b00914b3eaaa03b67c652174f8ed4573118
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Downloading the base image (usually a few hundred MB) outputs a lot
of useless dots to show progress. Switch to 1M per dot (from 1K).
Change-Id: I8c525cad0b46e8ba3a7f6da4dd7f8277a49df91f
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
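Assuming the download is done with GNU wget (the message does not name the tool), the 1M-per-dot behavior corresponds to the `giga` dot style. `fetch_image` is a hypothetical helper name.

```shell
# Sketch: fetch a large image with one progress dot per 1M instead of per 1K.
# Assumes GNU wget; fetch_image is a hypothetical helper name.
fetch_image() {
  url=$1
  dest=$2
  # 'dot:giga' is the wget dot style in which each dot stands for 1M retrieved.
  wget --progress=dot:giga -O "${dest}" "${url}"
}
```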
|
|
- Remove hardcoded /24 mask
- Use PDF as source for public network, with reclass params:
opnfv_net_public, _mask, _gw, _pool_start, _pool_end
JIRA: FUEL-315
Change-Id: Idf3a4ed8f63f58fa90d9c1dcb7751ef3b1c9bd36
Signed-off-by: Guillermo Herrero <guillermo.herrero@enea.com>
|
|
In case the previous deploy attempt already copied the base image
as the VCP image in order to perform offline operations and failed,
leaving an incomplete image in place, current code might try to use
it instead of building it from scratch.
Use the hash-agnostic link names as checkpoints for successful image
handling.
Change-Id: I1e99e515e18ba1dec534c520811c127b2b528afe
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
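The checkpoint idea can be sketched as follows: the stable, hash-agnostic link is created only after the image step succeeds, so a leftover file without its link is treated as incomplete. `prepare_image`, its arguments and the file names are hypothetical.

```shell
# Sketch: treat a hash-agnostic symlink as the "image is complete" marker.
# prepare_image and its argument layout are hypothetical.
# Args: <dir> <hashed image name> <stable link name> <build command...>
prepare_image() {
  dir=$1; hashed=$2; link=$3
  shift 3
  # Reuse only if the stable link exists and points at this exact image.
  if [ -L "${dir}/${link}" ] && [ -f "${dir}/${hashed}" ] &&
     [ "$(readlink "${dir}/${link}")" = "${hashed}" ]; then
    echo "reused"
    return 0
  fi
  # Drop any incomplete leftovers from a previous failed attempt.
  rm -f "${dir}/${link}" "${dir}/${hashed}"
  # Build step receives the target path as its last argument; it may fail midway.
  "$@" "${dir}/${hashed}" || return 1
  # Checkpoint: create the link only after the build succeeded.
  ln -s "${hashed}" "${dir}/${link}"
  echo "built"
}
```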
|
|
For some reason, `modprobe -f` for a clean nbd module (from vanilla
Ubuntu) fails with exec format error randomly, while a simple
`modprobe` works.
Change-Id: I79785e510cab757e2482baf442054be984c24019
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: Ida693b6dd328db283d6992ac33500f4dd1a73eb8
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
The wait_for function should also be able to check for minions that
did not return or did not respond, in addition to the return code.
To keep it backwards compatible, condition the new check on the max
attempt number being specified in decimal format (e.g. '10.0' unlike
old '10').
Change-Id: If2512cf9121cdd795638efe7362ef0485d4e8d91
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
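A sketch of how the attempt-count format could select the behavior. `wait_mode` is a hypothetical helper that only classifies its argument; it does not reproduce the real `wait_for` logic.

```shell
# Sketch: decide which wait_for behaviour to use from the attempt-count format.
# wait_mode is a hypothetical helper; it only classifies the argument.
wait_mode() {
  case "$1" in
    *.*) echo "strict ${1%%.*}" ;;  # e.g. '10.0': also check minion replies
    *)   echo "legacy $1" ;;        # e.g. '10': return code only
  esac
}
```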
|
|
salt-minion is now pre-provisioned inside the image using qemu-nbd.
Revert "lib.sh: Limit envsubst to certain variables"
This reverts commit 3a76d07dbd409b781abdb8520f55a1b20edf07db.
Change-Id: Icceb8bcf439e28ab01c7731c3602c1113290454d
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Fingerprint and re-use base image artifacts.
Change-Id: Ic7a73c04e27d25addd50e4e9880619a0028956d3
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
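Fingerprinting could look roughly like this: hash the inputs that shape the image and embed the digest in the artifact name, so an unchanged configuration maps to an already-built file. `image_name_for` and the naming scheme are hypothetical, and coreutils `sha256sum` is assumed to be available.

```shell
# Sketch: derive an image file name from a fingerprint of its inputs.
# image_name_for and the naming scheme are hypothetical; assumes sha256sum.
image_name_for() {
  base=$1
  shift
  # Hash the remaining arguments (e.g. repos, keys, package lists), one per line.
  digest=$(printf '%s\n' "$@" | sha256sum | cut -c1-12)
  # Insert the short digest between the base name and its extension.
  echo "${base%.*}_${digest}.${base##*.}"
}
```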
|
|
Change-Id: Ia514418d2aae1b4f7e752d4610fa6c9829c67e51
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
RHEL distros do not maintain nbd, so add a best-effort function
to build it on the fly.
Change-Id: Ie0419f0fed8a0b12f6b878b3093d6ca34f72d140
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
On rare occasions, mapper bindings created by kpartx take longer
to show up, leading to errors when we try to mount them.
Bring back the hardcoded delay to bypass such issues.
Change-Id: Ib386c04fc55cd85235a2156dba08fda378e4cdfd
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Use PXE/admin network for salt traffic from/to all minions
except cfg01, mas01.
This allows us to drop the route to admin net from cfg01.
Change-Id: Ic2526f1ff77afe5d92ced900971f4c8f78d2d8a2
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Running `ci/deploy.sh -EE` should also perform a UEFI boot option
cleanup, otherwise we risk booting the previously installed OS.
While at it, reduce the delay between node removals and fix a rare
failure for `-EE` when no nodes are defined in MaaS.
Change-Id: I789ffd3e22545921216f7d5ee3509c76354542eb
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
qemu-nbd currently available in CentOS 7 does not add partition
mappings automatically for NBD devices, so add explicit `kpartx`
calls.
Change-Id: Ifa79c89b82024602b782c449dbf4de10899e03b5
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
cfg01, mas01 DHCP leases in mcpcontrol virtual network should be
persistent (if cfg01 IP changes, minions can't find Salt Master).
Change-Id: I497207ebe1537af94fd92de12491664d17ad3144
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
While at it, rename apt repo in foundation node user-data template
from "salt" to "saltstack", to align with reclass model naming.
Change-Id: I5b216492349ae187b568884b1ab4046c52b1c6b2
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Extend <lib.sh> and its invocation from <ci/deploy.sh> with
support for modifying foundation node VMs base image prior to
using it with:
- additional APT GPG keys;
- additional APT repos;
- packages to pre-install;
- packages to pre-remove;
- (non-configurable) cloud init datasource via NoCloud only,
so VCP VMs won't wait for metadata service;
While at it, re-use the resulting image as a base for another
round of pre-patching (same operations as above are supported)
to provide a base image for VCP VMs.
Add AArch64-specific configuration based on new mechanisms:
- pre-install linux-image-generic-hwe-16.04-edge (and headers)
for foundation node and VCP (common) image (also requires new
repo and its key);
- pre-install cloud-init for VCP image (it should already be
installed, but script needs non-empty config for VCP to create
the VCP image and transfer it over to Salt Master);
NOTE: cloud-init is required on VCP VMs for DHCP on 1st iface.
JIRA: FUEL-309
Change-Id: I7dcaf0ffd9c57009133c6d339496ec831ab14375
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Some UEFI firmwares insist on scanning removable drives, even when
boot entries were deleted from UEFI boot list (board flash).
To work around this, remove contents of </boot/efi/*>, so scanning
won't identify any valid EFI binaries.
Another option would be erasing partition tables, but identifying
the underlying disk(s) is more complicated, especially when using
LVM/RAID etc.
Change-Id: I9949b99b139b1642e3bd8f04de3bd5ef74d1ecc5
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
On EFI-enabled systems, grub-install from grub-efi-* package
installs a boot entry named "ubuntu".
MaaS relies on IPMI to set boot order to PXE first; however
on systems with buggy firmware or without full IPMI support,
that fails, leading to booting Ubuntu from hard disk instead.
Work around this by clearing any previous Ubuntu boot entry
from board flash, before starting a new baremetal deploy.
NOTE: This only runs against nodes that are online from a
previous deploy.
Closes: ARMBAND-47
Change-Id: I1c4ece09e42845ce2a1b7119ec69e46e5ca12376
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
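As a rough sketch of the cleanup, the boot entry numbers labelled "ubuntu" can be extracted from `efibootmgr` output and then deleted one by one. `ubuntu_boot_entries` is a hypothetical helper that only parses text on stdin; how the real change runs this on the nodes is not shown here.

```shell
# Sketch: list UEFI boot entry numbers labelled "ubuntu" from efibootmgr output.
# ubuntu_boot_entries is a hypothetical helper; it only parses text on stdin.
ubuntu_boot_entries() {
  # Entries look like 'Boot0001* ubuntu' (active) or 'Boot0001  ubuntu' (inactive).
  sed -n 's/^Boot\([0-9A-Fa-f]\{4\}\)[* ] ubuntu.*$/\1/p'
}

# Possible use on a node (as root):
#   efibootmgr | ubuntu_boot_entries | while read -r num; do
#     efibootmgr -b "${num}" -B   # delete that boot entry
#   done
```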
|
|
JIRA: FUEL-296
Change-Id: Ide9f9333fe9b44ff6b78678064f8e67f05aabd42
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Drop vgabios dependency by switching video from VGA to virtio for
all VMs spawned on the jumpserver.
NOTE: This requires virtualization packages on the jumpserver to be
up to date (e.g. libvirt, QEMU).
JIRA: ARMBAND-306
Change-Id: I73913e1ae8584f4e73b92994f78f7ec363cba3ec
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
'wait_for' bash function is nested in another 'wait_for' call in some
places, which leads to inner calls interfering with outer calls by
overriding the locally scoped variables, including the 'attempt'
internal counter. In some cases, the outer 'wait_for' would exit
after a single attempt.
Fix that by running all contents of `wait_for` inside a subshell,
which inherits the outer call's variables, but does not override them
when the inner call is finished.
Change-Id: I450eda3d023af2380c61ee930071fbfc393a5645
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
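The subshell isolation can be illustrated with a simplified retry loop. `wait_for_sketch` and `WAIT_SLEEP` are hypothetical, and the command is passed as separate arguments for brevity; this is not the project's actual `wait_for`.

```shell
# Sketch: run the whole retry loop inside a subshell so nested calls cannot
# clobber the caller's variables. wait_for_sketch and WAIT_SLEEP are hypothetical.
wait_for_sketch() {
  # The parentheses start a subshell: it sees the caller's variables, but
  # anything assigned inside is discarded when the subshell exits.
  (
    max_attempts=$1
    shift
    attempt=1
    while [ "${attempt}" -le "${max_attempts}" ]; do
      "$@" && exit 0
      sleep "${WAIT_SLEEP:-1}"
      attempt=$((attempt + 1))
    done
    exit 1
  )
}
```

Because `attempt` is only modified inside the subshell, a nested call such as `wait_for_sketch 5 wait_for_sketch 2 some_check` leaves the outer counter untouched.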
|
|
While applying scenario states, break on error, and retry failed
state up to 5 times. Apply the same behavior for `salt.sh`.
Add new deploy parameter, '-D', backed by 'CI_DEBUG' env var,
which gates deploy sh scripts logging (set -x).
Also extend '-f' deploy parameter, allowing it to be specified
more than once; the first occurrence will skip infra VM creation,
but still sync reclass & other config from local repo, while a
second occurrence will also disable config sync.
To prevent glusterfs client state from failing due to non-existent
nova user/group, move it after nova:compute's nova state is applied.
Change-Id: I234e126e16be0e133d878957bd88fed946955de8
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
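Counting repeated flags is straightforward with `getopts`. The sketch below is not the actual `ci/deploy.sh` parsing: `parse_deploy_flags`, the printed summary and the 0/1 encoding of `CI_DEBUG` are assumptions.

```shell
# Sketch: count repeated '-f' flags and honour '-D' / CI_DEBUG.
# parse_deploy_flags, the summary format and the 0/1 encoding are hypothetical.
parse_deploy_flags() {
  skip_level=0
  debug=${CI_DEBUG:-0}
  OPTIND=1
  while getopts "fD" opt; do
    case "${opt}" in
      f) skip_level=$((skip_level + 1)) ;;  # 1: skip infra VM creation, 2: also skip config sync
      D) debug=1 ;;                         # would gate 'set -x' in the deploy scripts
      *) return 1 ;;
    esac
  done
  echo "skip_level=${skip_level} debug=${debug}"
}
```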
|
|
While at it, compact 'set' into bash shebang where possible and
add `make patches-copyright` target to simplify adding patch
license headers.
Change-Id: I0c841de72e5709e5eef915a52c5ec4a7fc0f7c37
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|