Age | Commit message (Collapse) | Author | Files | Lines |
|
Recent virsh/Docker network rework changed mcpcontrol (previously
a virsh-managed network) into a Docker-controlled network using
the 'bridge' driver.
As a consequence, Docker now isolates traffic from 'mcpcontrol'
network from the default Docker bridge network ('docker0') using
iptables rules that check input/output interfaces.
Yardstick (and any other Docker container hooked via 'docker0')
will not be able to ssh into Salt master due to this isolation.
One possible workaround would be to explicitly ACCEPT traffic
from 'docker0' going to Salt master. However, this is only
properly supported starting with Docker 17.06, while most CI hosts
and end users are still using 17.05 or older.
In older Docker releases, DOCKER-USER iptables table was not
avaiable, so injecting custom iptables and making them persistent
is not only complicated, it's also prone to subtle errors.
Another way to bypass the iptables rules is to route the packets
coming from our new Docker network via another bridge before
letting them find their way into 'docker0'.
This change adds a new route for the Salt master host (note that
MaaS container will not benefit from this) via the PXE bridge on
the jumphost (which can be either a real Linux bridge for baremetal
deployments or a virsh-managed network); adding one extra network
hop for each packet going between our 'mcpcontrol' Docker network
and 'docker0', effectively bypassing the Docker-enforced iptables
DROP.
Change-Id: Id8ac7a638c778887b361c9b64c320664c88f59fd
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
|
|
|
|
Change-Id: I791436f512dea6c6bc61133c4122ac872950af8e
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
Upstream change [1] switched from old qemu-nbd preseeding of VCP VMs
to using a cloud-init + configuration drive. This breaks on AArch64
with "IDE controllers are unsupported for this QEMU binary or machine
type", so switch back to using qemu-nbd.
[1] https://github.com/Mirantis/reclass-system-salt-model/commit/c0e4807
Change-Id: I0dfeb638d408343c76a73fafa503048a79ce1f6e
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
There is no enough memory (default 4k pages) for services
like libvirt, which cannot fork child processes.
Change-Id: I44d8efd7cafb52a7c823c02738c1d321017aa7a3
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
Required only for Rally validation in cinder scenarios,
there is no useful functionaly in terms of cluster.
Change-Id: Idc4d62cbbc9974972e9d492b5a419342077e3d9a
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
Sometimes instance doesn't get ip address from dhcp server, which
resides only on gateway node, so run additional dhcp/metadata agents
on compute nodes to handle tenant networks in place.
Change-Id: If1d74af665cf8db64b09f846fac7192f76abdb25
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
The per port memory model provides a more transparent memory usage model
and avoids pool exhaustion due to competing memory requirements for
interfaces. (http://docs.openvswitch.org/en/latest/topics/dpdk/memory/)
Change-Id: I5add0f49cdcdf2fc3d24affee10a275abe3ca46a
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
- bump Pharos git submodule to allow PODs with fewer nodes;
- add `k8-calico-iec-noha` scenario definition for Akraino
IEC basic configuration;
- add `k8-calico-iec-vcp-noha` scenario definition for Akraino
IEC nested (virtualized control plane) configuration;
- add `akraino_iec` state, which will leverage the Akraino IEC
bootstrap scripts from [1];
- replace system.reboot salt call with cmd.run 'reboot' as it's more
reliable;
- use kernel 4.15 for AArch64 K8 IEC scenarios;
NOTE: These scenarios will not be released in OPNFV since don't rely
on Salt formulas but instead of Akraino IEC scripts to install K8s.
[1] https://gerrit.akraino.org/r/#/q/project:iec
Change-Id: I4e538e0563d724cd3fd5c4d462ddc22d0c739402
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: I2b41ce2e275bb053fa2590654ea7fa432b0c857f
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
* add opendaylight password (removed from system level)
* get updated ovn system class w/o mysql settings
* enable ceilometer user back (removed along with outdated service/endpoints)
* adjsut check interval of haproxy for noha scenarios since there is
only one backend for services, i.e. failover ain't expected
Change-Id: Iedee290e1cfcf838998bd44dc09a729d143974ac
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
|
|
After Rocky support was added upstream to salt-formula-neutron, our
FDIO patch continued to be applied only for Queens, so refresh the
patch by switching to Rocky.
Change-Id: If0bbb9c4ec674d386ceade00ef8fe936482fb49c
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: I745a838b1f2f294b6c455700509ddf4b0264446f
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
This reverts commit ac56d7b14f46b05f497b3dca4b6a4b0bfedd83e2.
The original patch has been merged (https://review.openstack.org/643011)
Change-Id: I3a7cd825f371e375d36256143b4b8c91f90ee26e
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
Certain kernels (e.g. 4.4.0-101+ in Ubuntu) no longer automatically
ack the partition table update after `kpartx -a /dev/nbdX`, see [1].
To avoid another dependency on `parted` packages, use `partx` from
`util-linux`, which is already installed as a dependency of e2fsprogs.
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1743026
Change-Id: Ibd993fe210c1a11814e89a66759568d4d117d613
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
* update gnocchi to 4.3
* remove outdated ceilometer api
Change-Id: I7adaf3ddc76d93531b6b0997b684672b80f2992f
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
Create 2 systemd services on the jumphost that will handle veth
pairs creation, respectively adding them to virsh/real bridges.
This allows us to set docker containers restart policy to 'always',
enabling persistent Salt Master/MaaS containers across jumphost
reboots.
NOTE: libvirt creates virtual networks async, hence the need for
retrying hooking veths to them.
JIRA: FUEL-406
Change-Id: I1ca033cb5eb854b577b57bb2387a58bd9605a5bb
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: Id75ffe4db808a4ec250ba8b86c5d49f1206c3784
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
Also re-align resources for virtual scenarios.
Change-Id: Id0d55407fd5b1720a24e30c364219f8b08e89d06
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
Bug: https://bugs.launchpad.net/nova/+bug/1809123
Change-Id: I14622c21826aeeddac6ea7bf7f9d116cd3e68cfb
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
|
|
As reported in [1], kernel 4.4 seems to break nested virtualization,
add a fatal check against it.
[1] https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1797332
Change-Id: I0aef8a7340dd82bfeb2e58c9642623b9ec13dca5
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Some PODs are fast enough to get past installing, syncing and using
MaaS to provision the OS on the baremetal nodes before the 1h mine
refresh.
Since mine.update operation is fast enough to go unnoticed and we
only collect IP addresses, grains and pem entries, schedule it every
15 minutes.
Due to reclass class inheritance, we can't easily override this via
pillar data, so handle it via entrypoint.sh.
Change-Id: I0d8ed2da838ad09c94e9327d0131d3e239de4f08
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: Ifc4fff90551344c69295990b220f0778967887a4
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
|
|
|
|
Previously, Salt Master CA mine was only sent once, during
salt.minion.ca state execution at cfg01 bringup / bootstrap.
This causes possible issues with:
- Salt Master container restart (mine data is lost);
- UNH Lab deployment (uknown rootcause, might be related to XFS and
overlay2 being used with Docker on CentOS);
To bypass this issue, make x509.get_pem_entries module send mine data
at the default mine interval (60 minutes).
Change-Id: I5f6334ae18f5af6cbe0a164791603b67f0a3668f
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
- replace mas01 VM with a Docker container;
- drop `mcpcontrol` virsh-managed network, including special handling
previously required for it across all scripts;
- drop infrastructure VMs handling from scripts, the only VMs we still
handle are cluster VMs for virtual and/or hybrid deployments;
- drop SSH server from mas01;
- stop running linux state on mas01, as all prerequisites are properly
handled durin Docker build or via entrypoint.sh - for completeness,
we still keep pillar data in sync with the actual contents of mas01
configuration, so running the state manually would still work;
- make port 5240 available on the jumpserver for MaaS dashboard access;
- docs: update diagrams and text to reflect the new changes;
Change-Id: I6d9424995e9a90c530fd7577edf401d552bab929
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
While the minions are working their jobs the CLI is waiting for the
first initial timeout period (timeout) to start. When that hits,
the CLI sends sends the first "find_job" query. This kicks off the
gather_job_timeout timer. Sometimes a minion doesn't respond to the request
within the gather_job_timeout time period (default is 10s), so rise up
this value to give a chance for a minion to report actual status.
Change-Id: Ic3756b82fdeb17718870ab30e9578263d25309f7
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
|
|
|
|
Change-Id: I3bbe3e4be520ccac198654bb4a7d493aa8450023
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
Change-Id: I7709c9ca9e701b656447154919eb084a710f49af
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
The PaxOsgi logging has a performance impact
(i.e. makes pressure to the Java GC).
Change-Id: Ic0bc2c0d1cfac195a04d1cfa90fa7fa47fc37612
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
|
|
|
|
|
|
Previously, Ubuntu ignored the VPP pinning with:
N: Ignoring file 'fdio.ubuntu' in directory '/etc/apt/preferences.d/'
as it has an invalid filename extension
Change-Id: I5ee60c1715bea3b4180b55125dc72962a70c2754
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: I7486569568207f7652f8bdfcf1060ce51a9dbb0e
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Change-Id: Ia7f8845017333e54db110bca5b3715702948b76b
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
In order to mitigate live migration procedure make VIF plugging
event non-fatal for nova-compute. Also align max value of memory
for instance of ODL controller.
Change-Id: I0d00cc97c652eef3bd3404fac4715e2e7f2f02c7
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
|
|
|
|
|
|
Extend one of the existing deployment arguments to allow the
installation of only the operating system and infrastructure networks,
skipping cloud setup.
Change-Id: Ibc5d0f324ed15b66f809839cfce49a0324b6fe4d
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
|
|
VPP 18.10 has a weird bug triggered by certain packets, e.g. from
inside a guest VM on a compute node, these behave differently:
$ udhcpc -x hostname:1234567890123456789012 # works
$ udhcpc -x hostname:12345678901234567890123 # confuses VPP on gtw01
To avoid this bug, pin VPP to the previous release, which does not
exhibit the issue.
Change-Id: I8c1e085731909d4b9296e8b09608887a4b5bfdd6
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Baremetal clusters might benefit from having a little more time
to plug in the VIFs.
Change-Id: I9406a0ef24de2177827b3acd27b7c60b293a4572
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|
|
Fix broken systemd service unit dependecies:
- OVS should start before networking service;
- OVS ports & bridges should not be automatically ifup-ed by
networking service to avoid races, so drop 'auto' for both
(OVS ports are automatically handled when part of an OVS bridge);
- explicitly ifup OVS bridges as part of networking service, but
after all Linux interfaces have been handled;
- use 'allow-ovs br-prv' to let OVS handle br-prv and avoid another
race condition;
While at it, fix some other related issues:
- make OVS service start after DPDK service (if present);
- bump OVS-DPDK compute VMs RAM since since switching from MTU 1500
to jumbo frames for virtual PODs a while ago failed to do so [1];
- avoid creating conflicting reclass linux.network.interfaces entries
for OVS ports by using their name (drop 'ovs_port_' prefix):
* for untagged networks they will override existing common defs;
* for tagged networks, they will create separate entries;
- DPDK scenarios: make gtw01 br-prv members OVS ports to avoid race
conditions after node reboot by letting OVS handle them;
[1] https://developers.redhat.com/blog/2018/03/16/\
ovs-dpdk-hugepage-memory/
Change-Id: I0266ba67f3849b6f7e331a758146b331730bae55
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
|