path: root/mcp
Age | Commit message | Author | Files | Lines
2020-01-29 | Merge "aarch64: Add kpti=off similar to x86_64 nopti" | Alexandru Avadanii | 7 | -10/+18
2020-01-29 | aarch64: Add kpti=off similar to x86_64 nopti | Alexandru Avadanii | 7 | -10/+18
arm64 kernels use a different kernel option (kpti=off vs nopti) to disable PTI, so sync the two platform configurations. Conveniently, this also bypasses kernel 4.15 issues described in [1], so apply the kernel option customisation via MaaS too, to allow aarch64 deployments to bootstrap using 4.15 kernel (with the downside of these args being duplicated by Salt later in HA scenarios). PTI is now disabled for baremetal nodes (via MaaS, no matter the scenario) and/or for kvm/cmp hosts (in HA scenarios only). While at it, install missing thin provisioning tools in aarch64 bootstrap image for MaaS deploy stage to succeed. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857074 Change-Id: Ibd1f57f24abc690b0f13b6298f25d7e8a1af1567 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2020-01-29 | maas: Avoid race condition in node fixups | Alexandru Avadanii | 2 | -7/+17
When more than one node enters a failure state during a deploy attempt, we recover the first one and issue another deploy request; avoid raising an exception for the second node (which is not in 'Ready' state either), allowing the retries to continue. Change-Id: I4a3e037e78b5c48aebf6e700115c0bbf848c7cd5 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
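The behaviour described above can be sketched roughly as follows. This is an illustration only, not the code from the change: the function name `redeploy_ready_nodes`, the `nodes` mapping and the `deploy` callback are all hypothetical, and the status strings other than 'Ready' are made up for the example.

```python
def redeploy_ready_nodes(nodes, deploy):
    """Issue a deploy request only for nodes currently in 'Ready' state.

    `nodes` maps a node name to its status string (hypothetical shape).
    Nodes in any other state are returned as deferred instead of raising,
    so the caller's retry loop can handle them on a later pass.
    """
    deployed, deferred = [], []
    for name, status in nodes.items():
        if status == "Ready":
            deploy(name)
            deployed.append(name)
        else:
            # Not 'Ready' (e.g. a second node that also failed): do not raise.
            deferred.append(name)
    return deployed, deferred
```

The point of returning the deferred nodes rather than raising is that a second failed node no longer aborts the retry sequence started for the first one.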
2020-01-29 | aarch64: docker: Add missing setuptools dep | Alexandru Avadanii | 1 | -4/+5
Change-Id: I4fd461c0ea861d541ab001431c9e2f21cfaea1b4 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2020-01-28 | Merge "cfg01, mas01: Switch to Ubuntu Bionic" | Alexandru Avadanii | 5 | -14/+54
2020-01-28 | maas: curtin: Fix generic kernel dep purge | Alexandru Avadanii | 1 | -4/+4
When installing a custom kernel, purge the generic linux-image/headers packages too to avoid dependency conflicts. Change-Id: I4108350643fb97845decf48b9a281c471dad2a82 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2020-01-28 | cfg01, mas01: Switch to Ubuntu Bionic | Alexandru Avadanii | 5 | -14/+54
Pin salt-formula-nfs to a commit before 'mount.opts' was introduced. Adapt salt-formula-maas bits for MaaS 2.4 (shipped by default in Bionic) compatibility. Change-Id: I42f436203d3fbdb777d6b3eff9ac185240088742 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2020-01-28 | maas: Switch back to ga-18.04 kernel during deploy | Alexandru Avadanii | 1 | -2/+1
hwe-18.04, currently based on 5.3 kernel in Bionic, has issues on both x86_64 and aarch64 nodes, so use ga-18.04, currently based on 4.15. If MCP_KERNEL_VER is set (currently pinned to 5.0), the ga-18.04 kernel is replaced by the specified version after MaaS commissioning and the initial MaaS deployment. Change-Id: Ibe8e27217025290c1263f8dca9496b2cde24368c Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2020-01-26 | docker build, deploy: Switch tooling to python3 | Alexandru Avadanii | 5 | -17/+39
Python 2.7 is deprecated and packages are starting to enforce py3 usage (e.g. dockermake recently started supporting only 3.6). Switch pipenv to python3, but allow python3.5 by pinning dockermake to v0.8 since Ubuntu Xenial does not have python3.6 easily available. While at it, switch deploy tooling (PDF/IDF configuration parsing) from python2 to python3 too and fix some jumphost package requirements. Change-Id: Id66d08d0f51a1bc35c1d78c1956df832a5536bde Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2020-01-21 | all: Pin Ubuntu kernel to 5.0.0-37 for Bionic | Alexandru Avadanii | 6 | -11/+53
Ubuntu kernel meta packages are all broken on at least one platform architecture, so pin the kernel version to 5.0.0-37, which is known to be stable. Make the kernel version configurable via a new environment variable, MCP_KERNEL_VER in globals.sh. If not defined, the ga-18.04 kernel is left unchanged (based on upstream kernel 4.15), except for baremetal nodes provisioned by MaaS which currently use the HWE kernel (based on 5.3 in Bionic). Change-Id: I648d09b22f6080efd2bce26b6a06fecc3f6b4599 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
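As a rough illustration of the "override only if set" behaviour, here is a minimal Python sketch. The actual logic lives in shell (globals.sh) according to the message above; the function name `kernel_override` and the `env` parameter are hypothetical and exist only for this example.

```python
import os


def kernel_override(env=None):
    """Return the pinned kernel version, or None to keep the stock kernel.

    Reads MCP_KERNEL_VER from `env` (defaults to the process environment).
    An unset or empty variable means "leave the ga-18.04 kernel unchanged".
    """
    env = os.environ if env is None else env
    version = env.get("MCP_KERNEL_VER", "").strip()
    return version or None
```

Passing the environment in as a mapping keeps the sketch easy to exercise without touching the real process environment.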
2020-01-17 | Merge "odl-ovs noha: Support VLAN tagged public" | Alexandru Avadanii | 1 | -5/+8
2020-01-16 | all: Actually honor public DNS set in IDF | Alexandru Avadanii | 3 | -6/+15
We currently do not configure linux:network:resolv:dns via reclass pillar data, so we don't actually enforce the public DNS set in the IDF file, but instead leave it to the OS to figure it out, which most of the time works fine, but it's not completely reliable. Change that behavior to instead enforce it via linux.network.resolv state across all cluster nodes. Change-Id: I4f82315a473fcbdc8573380cfcac1e30b44c3dd4 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
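For orientation, the pillar path `linux:network:resolv:dns` named above corresponds to a nested mapping like the one built below. This is a sketch under assumptions: the helper name `dns_pillar` is hypothetical, the addresses in the usage note are placeholders, and the real change sets this through reclass templates rather than Python.

```python
def dns_pillar(dns_servers):
    """Build the nested pillar fragment for linux:network:resolv:dns.

    `dns_servers` is the list of public DNS addresses taken from the IDF
    (how they are read from the IDF is outside this sketch).
    """
    return {"linux": {"network": {"resolv": {"dns": list(dns_servers)}}}}
```

For example, `dns_pillar(["8.8.8.8", "8.8.4.4"])` yields a mapping whose innermost `dns` key holds both addresses in order.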
2020-01-16 | odl-ovs noha: Support VLAN tagged public | Alexandru Avadanii | 1 | -5/+8
Some baremetal servers might have VLAN tagged public interfaces configured via PDF/IDF; adjust our compute networking j2 handling to accommodate that. Change-Id: I97c07f9742a09cd01e7aecf118ada270a682280e Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2020-01-16 | odl: Make odl_hostconfig patching idempotent | Alexandru Avadanii | 1 | -1/+3
ODL hostconfig patching for py3 compatibility occasionally fails silently, leading to fatal errors in later deploy stages. Skip the patch if it is already applied, and fail if it cannot be applied. Change-Id: I1addf17f61fa01055c0db83056870a7e7b8d3a42 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
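The "skip if already applied, fail if it cannot be applied" rule can be sketched on plain text as below. This is not how the change applies its patch; it is a simplified model using string replacement, and the name `apply_once` and its parameters are hypothetical.

```python
def apply_once(text, old, new):
    """Replace `old` with `new` once, idempotently.

    - `new` already present: the patch was applied earlier, return as-is;
    - `old` present: apply the replacement;
    - neither present: raise, so the failure is loud rather than silent.
    """
    if new in text:
        return text
    if old not in text:
        raise ValueError("patch does not apply")
    return text.replace(old, new, 1)
```

Running the function twice on the same input gives the same result as running it once, which is the property the commit title refers to.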
2020-01-15 | Merge "fdio noha: Workaround tap MAC generation issues" | Alexandru Avadanii | 1 | -0/+10
2020-01-14 | iec: Use 4.x kernel for K8s compatibility | Alexandru Avadanii | 1 | -0/+4
Change-Id: Ic720a1d35d7396aad94dbe0e63aa089fa5c23508 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2020-01-14 | fdio noha: Workaround tap MAC generation issues | Alexandru Avadanii | 1 | -0/+10
systemd 230..241 has issues generating persistent MAC addresses for bridge/tap/etc network devices, causing trouble for VPP agent hooking tap devices to the bridges it creates on the fly. Work around this by disabling the faulty policy, as suggested in [1]. [1] https://github.com/systemd/systemd/issues/3374 Change-Id: I8d568bc0a859256d1493bf9f8261d60943fa60e0 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2020-01-14 | fdio virtual: Bump cmp/gtw RAM to avoid OOM | Alexandru Avadanii | 2 | -8/+8
Some PODs (e.g. ericsson-virtual*) use more than 5000 x 2M hugepages, together with 3G+ per-socket dpdk memory. Adjust our FDIO scenario definitions to accommodate such configurations without triggering the OOM. Change-Id: Ibce2316f158bde98ad8e54f3eec75a827982d417 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2020-01-09 | baremetal, virtual: Bump kernel to hwe-18.04 (5.0) | Alexandru Avadanii | 5 | -5/+32
On some aarch64 platforms (e.g. ThunderX 1), lvcreate manifests some spurious timing issues resulting in incomplete/corrupted LVM thin creation and eventually to transaction ID mismatch between userspace and kernel space. This eventually leads to cinder-volume issues, either when creating the thin storage pool (vgroot-pool) and/or when creating the LVs inside said pool. The issue manifests spuriously on Ubuntu Bionic + UCA, so until a working combination of userspace/kernel is found, work around this by bumping the kernel package to hwe-18.04 (kernel 5.0), effectively bypassing the timing issues during volume creation. This affects all cluster machines (both HA and NOHA scenarios, baremetal and virtual, x86_64 and aarch64, baremetal and virtualized nodes). Note: Ubuntu Bionic cloud image partition handling requires e2fsprogs 1.43, not currently available on Ubuntu Xenial / CentOS 7. Change-Id: I839e03080104c391fe18185b9544c9df43c114e6 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2020-01-08 | ha, noha: Fix Horizon stale cache after install | Alexandru Avadanii | 8 | -22/+8
Partially revert more of commit 63b712d: it turns out static files were not always up to date after the package install, so force a refresh. While at it, fold some common libvirt pillar configuration. Fixes: af1a4adf Change-Id: I1b4c20cfa9ae08d1cd7b0b774b544b76fc73a715 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2020-01-03 | aarch64: Workaround broken lshw CPU detection | Alexandru Avadanii | 1 | -1/+1
On some aarch64 platforms (e.g. ThunderX), the DMI tables parsed by lshw lead to wrong CPU capabilities detection, breaking our MaaS tag filtering (which used to rely solely on CPU having asimd caps). Extend the tag filtering condition to also include nodes that report `cp15_barrier` platform capability. Note that not all aarch64 systems include this cap explicitly (especially since it's been deprecated in ARM v8), but it is currently reported by the platforms where asimd is not properly detected. This is merely a workaround for the broken lshw version in Ubuntu Bionic (B.02.18). Change-Id: I4a5c0d6af4d863d2ca094d6926a65ee90dee0e07 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
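The extended filter condition amounts to "asimd or cp15_barrier". A minimal sketch of that predicate follows. It assumes the node's capabilities are available as a list of strings; the function name `matches_aarch64_tag` is hypothetical, and the real tag definition is evaluated by MaaS against lshw output rather than by Python code.

```python
def matches_aarch64_tag(capabilities):
    """True if a node reports either capability accepted by the tag filter.

    `asimd` is the original criterion; `cp15_barrier` is the extra one that
    catches platforms where lshw fails to detect asimd.
    """
    accepted = {"asimd", "cp15_barrier"}
    return bool(accepted & set(capabilities))
```

A node reporting only `cp15_barrier` now matches, while a node reporting neither capability is still filtered out.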
2019-12-30 | noha: Re-enable Horizon dashboard, fix CSS | Alexandru Avadanii | 6 | -1/+65
- ha, noha: Fix misaligned python 3 requirement for Horizon:
  * python3-pylibmc
- ha, noha: Partially revert commit 63b712d: "[Horizon] Drop the obsolete Horizon workaround"
  Since we switched back from MCP Horizon package to UCA, fix misaligned expected static resources location.
- noha: Enable nginx proxy on ctl01 node for serving the Horizon dashboard at http://<cluster public VIP>:80 (http only, no SSL).
Change-Id: I5f930a5826a818791183d3910aa0e5607924e8f3 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-12-30 | aarch64: Pin qemu-efi from Armband repos | Alexandru Avadanii | 2 | -4/+18
Upstream (UCA) qemu-efi (AAVMF) package is incompatible with most cloud images, e.g. Cirros used by Functest, resulting in kernel boot issues and/or missing serial console output. Work around this by pinning the qemu-efi Debian package from the old Armband repositories. This should fix singlevm1 functest testcase. Change-Id: Ibbe2218d99881f6fec89846497c2cc248aab5031 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-12-19 | [fdio] Bump VPP to 19.08.1-release | Alexandru Avadanii | 3 | -11/+73
- refresh formula patches with new package names where necessary; - switch to packagecloud.io repositories; Change-Id: I1178a387891d34117c162380d8247eb7a4212359 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-10-30 | [ha] [odl] Patch hostconfig for py3 compat | Alexandru Avadanii | 1 | -0/+17
Change-Id: Id6754dec226e75b9ee1e8c19ac04531b9f277e0f Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-10-25 | [baremetal] Stein, Bionic, py3 support | Alexandru Avadanii | 33 | -77/+445
Change-Id: If3f8cb6bfeedeb766a050d5a271b21c90bb3ba1c Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-09-16 | docker-compose: Align hosts with hostname | Alexandru Avadanii | 3 | -0/+7
When using Docker CE 19.x, `hostname -d` fails to properly resolve the domainname due to changes in the way Docker sets it inside the container. Work around this issue by aligning the contents of `/etc/hostname` with `/etc/hosts`, so `hostname -d` can properly determine the domain name. This also requires calling `hostname -b` via cfg01 entrypoint.sh. Change-Id: I697b5d9882e3d6641712a00bca10012800ee1898 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-08-08 | Conform ovsdb listen port to os-vif defaults | Michael Polenchuk | 2 | -0/+8
Nova (by means of the os-vif lib) uses port 6640 by default to connect to a remote ovsdb over tcp/ssl. Change-Id: I1372d8a3170b00243a5756b15a140aafe03dc268 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2019-08-02 | [k8s] Adjust scenario for bionic | Michael Polenchuk | 4 | -6/+6
Change-Id: I5c7a1e827446189b98b924ffd4272acf1a794697 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2019-07-30 | [dpdk] Remove invalid vhost options | Michael Polenchuk | 1 | -0/+25
With DPDK 18.11 the vhost owner/perm options have to be removed since libvirt creates the server side of the socket and OVS connects to it using DPDK as a client. Change-Id: Ic33de66dcc0830cd31fc54880c524f850e2c4ea1 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2019-07-30 | Merge "[deploy] Explicitly set NS for resolvconf in VMs" | Michael Polenchuk | 2 | -3/+5
2019-07-29 | [deploy] Explicitly set NS for resolvconf in VMs | Alexandru Avadanii | 2 | -3/+5
With newer Ubuntu distros using netplan and systemd-resolve, we can't rely on /etc/resolv.conf found on the Jumphost being usable inside the guest VMs, so explicitly use the public network DNS servers configured in PDF/IDF. This will enable support for Jumpserver operating systems like Ubuntu 18.04. Change-Id: I0c7e02d5c1b822f809ce818e739c19d0344f39f5 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-07-29 | Merge "Update OpenDaylight version to Neon" | Michael Polenchuk | 8 | -10/+32
2019-07-24 | Merge "[iec] centos: Preinstall git into cloud image" | Alexandru Avadanii | 2 | -2/+5
2019-07-22 | [iec] centos: Preinstall git into cloud image | Alexandru Avadanii | 2 | -2/+5
While at it, fix CentOS selinux preconfiguration on x86_64, which was previously limited (incorrectly) to AArch64. Change-Id: I2d6604d3eea2bfc11fdd5dd3aeb4e2c0c3ede4a2 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-07-12 | Update OpenDaylight version to Neon | Michael Polenchuk | 8 | -10/+32
Change-Id: I6cbbceb9b4a88f527d8dd800b0650f31a3dc1364 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2019-07-12 | Align python3 packages with stein requirements | Michael Polenchuk | 3 | -7/+117
Change-Id: Ib2b1525957929c39e4b602ad1b7f4fbfd16a375c Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2019-07-12 | Merge "Add extra bionic repo" | Michael Polenchuk | 3 | -0/+13
2019-07-11 | [iec] Copy private RSA key to K8s master | Alexandru Avadanii | 1 | -3/+8
Certain validation testing suites require the SSH RSA private key to be available on the K8s master node. Change-Id: Ib496ac6b33642d86bfd0e0f72ee847a2f31ea952 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-07-11 | Add extra bionic repo | Michael Polenchuk | 3 | -0/+13
Change-Id: I06577fa93e895a7c5940dac41b4f9c24b455f455 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2019-07-10 | [virtual] Update OpenStack version to Stein | Michael Polenchuk | 41 | -259/+168
Change-Id: I9c1e97144ffd46040d32a0edf8253fc393b73c89 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2019-07-03 | [AArch64] Fix renamed repo key in defaults section | Alexandru Avadanii | 1 | -2/+2
The `apt` key has been renamed to `repo` in a previous change, but we missed renaming some occurrences in defaults.yml.j2 for AArch64. Change-Id: Icf930371e9bc5253ea27e053933e1c012361f66e Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-07-01 | [lib] Limit cloud img partition resize to Xenial | Alexandru Avadanii | 1 | -1/+2
All cloud images except Ubuntu Xenial (CentOS 7, Ubuntu 18.04) already have enough free space on the predefined partitions, so skip the resize to avoid dealing with the newer e2fsprogs required by Ubuntu 18.04. Change-Id: I184590e631c76910e7c3169dc7bee3c5902ebaf1 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-06-29 | [virtual] Add Ubuntu 18.04 (Bionic) basic support | Alexandru Avadanii | 3 | -0/+63
Support Ubuntu 18.04 for virtual deployments (and implicitly for VCP VMs). Note that MaaS-provisioned systems will require the same changes being applied via curtin templates. Change-Id: I7cbd7e7c4421f6b970ce6ef97c10d269fec5fca3 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-06-28 | [iec] Add basic CentOS support (virtual only) | Alexandru Avadanii | 11 | -79/+240
- reclass: iec: CentOS compatibility changes:
  * drop `proto: static` in favor of letting the linux formula set the appropriate default based on target OS;
  * replace `proto: manual` with `proto: none` on RHEL systems;
  * system.file: Avoid using non-existing `shadow` group for system files;
  * load br_netfilter kernel module to avoid `linux.network` state failures;
  * disable `at`, `cron` due to incomplete defaults in salt-formula-linux (since we don't use them on iec nodes anyway);
- jumpserver/VCP VMs: centos: enable predictable interface names:
  * CentOS cloud image defaults to old 'eth' naming scheme;
  * add necessary kernel boot options via linux state;
  * cleanup auto-generated udev rules for old eth interface names;
- salt-formula-linux: network: RHEL: Set bridge for member interfaces
  * Find the bridge containing the interface being currently configured (if any) and pass it to the `network.managed` Salt call;
- deploy.sh: Add new deploy argument `-o` for specifying the operating system to preinstall on jumpserver and/or VCP VMs;
  * defaults to 'ubuntu1604';
  * only iec scenarios will also support 'centos' for now;
- user-data: minor tweaks for CentOS compatibility:
  * use `systemctl` instead of `service` utility;
  * explicitly enable `salt-minion` service, since it defaults to disabled on RHEL systems;
  * explicitly call `ldconfig` to work around stale cache on RHEL, preventing `salt-minion` from using OpenSSL library;
- states: virtual_init: Skip non-existing sysctl options on CentOS:
  * CentOS currently uses a 3.x kernel which lacks certain sysctl options that were only introduced in 4.x kernels, so skip them;
- state: akraino_iec: Add centos support:
  * move iec repo to `/var/lib/akraino/iec` on both Salt Master and cluster nodes;
- scenario defaults: Add CentOS configuration:
  * OS-dependent configuration split;
  * CentOS base image, default packages etc.;
- AArch64 deploy requirements: Add `xz` dependency
  * CentOS AArch64 cloud image is archived using xz, install xz tools for decompression;
- xdf_data: Make yaml parsing OS agnostic:
  * rename `apt` to `repo` where appropriate;
  * OS-dependent configuration parsing;
- lib_jump_deploy: CentOS handling changes:
  * skip filesystem resize of cloud image for CentOS;
  * add repo handling, package installation/removal handling for CentOS;
  * unxz base image if necessary (CentOS AArch64 cloud image);
Change-Id: Ic3538bacd53198701ff4ef77db62218eabc662e7 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-06-10 | [ha] Disable apache's status module | Michael Polenchuk | 2 | -33/+2
To avoid a port conflict between nginx and apache, disable apache's unused status module, which is bound to port 80 by default. Also remove the patch with duplicate locations content (the formula already has such configuration). JIRA: FUEL-408 Change-Id: Ib06dac8abe36299cf77747bdb3fc0fe7216b6096 Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
2019-06-06 | Merge "[ha] Re-enable nginx proxy for Horizon" | Alexandru Avadanii | 3 | -0/+33
2019-06-05 | Merge "[lib] Add uninstall/cleanup option" | Alexandru Avadanii | 1 | -0/+21
2019-06-05 | [ha] Re-enable nginx proxy for Horizon | Alexandru Avadanii | 3 | -0/+33
Starting with MCP 2019.2, Horizon was moved under haproxy in Active/Active mode by default via upstream changes:
- Adding haproxy class for horizon [1];
- Cleanup nginx horizon sites by default [2];
This change re-enables the old behavior where Horizon is served by nginx instead of haproxy. While at it, fix missing support in salt-formula-apache for wsgi `locations`, so Horizon dashboard can access '/static' resources (e.g. CSS/images). JIRA: FUEL-408
[1] https://github.com/Mirantis/reclass-system-salt-model/commit/81c4c21a
[2] https://github.com/Mirantis/reclass-system-salt-model/commit/a3b38f46
Change-Id: I9b35d5d0ce4e0b53dae808c2620a31ca80290b55 Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
2019-06-04 | Merge "Revert "Disable block migration explicitly"" | Michael Polenchuk | 2 | -2/+0