|
There is a new bug when deploying master/rocky where the OS of the
undercloud/overcloud is now upgraded to CentOS 7.6. When the undercloud
install runs it fails to configure the hostname using hostnamectl. This
is because systemd-hostnamed is not running and fails to start. Simply
reloading dbus seems to fix the issue. In the dbus logs there are odd
error messages like:
dbus-daemon[3230]: Unknown username "root" in message bus configuration
file
Disabling selinux seems to fix this. This patch also moves to use
podman instead of docker for container management and invokes a script
in Ansible which updates NIC templates as new variables are added
upstream. Furthermore, with the new patches for routed networks in OOO,
the MTU must now be set in network-data and the External network must be
added to the Compute role, because the External network is now removed
from the Compute role by default.
Change-Id: Ie8b86f6f28d69bda11b1f7a430df882970ac3cb9
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
Issues still persist where sometimes instances fail to start due to a
failure with os.utime to read the file path. This could be some bad race
condition between qemu/nova while copying images on the NFS. This patch
adds more ports to open in firewall, and changes initial directory owner
to nfsnobody.
Also includes a patch to fix an apparent race condition when nova sends
a remote call to the privsep helper daemon to modify the time of the
base file owned by qemu:
https://review.openstack.org/#/c/625741/
Includes another fix for patching container images where the docker
image was not being detected correctly because the full gerrit project
name including 'openstack/' prefix was being used to search tripleo
docker images. Additionally, there were more bugs around patching
openstack python containers where the patch was not being applied
correctly.
JIRA: APEX-654
Change-Id: I1d011035486298d5906038922e69d478c383c3f7
Signed-off-by: Tim Rozet <trozet@redhat.com>
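The image-detection fix described in the commit above can be illustrated with a minimal Python sketch. It assumes the fix amounts to dropping the 'openstack/' prefix from the gerrit project name before matching it against image names. The function names and the sample image names are hypothetical and are not taken from the Apex code.

```python
def image_search_name(gerrit_project):
    """Return the short project name used to search for an image.

    Hypothetical helper: strips the 'openstack/' prefix mentioned in the
    commit message so that only the bare project name is matched.
    """
    prefix = "openstack/"
    if gerrit_project.startswith(prefix):
        return gerrit_project[len(prefix):]
    return gerrit_project


def find_images(gerrit_project, image_names):
    """Return the image names that contain the short project name."""
    short = image_search_name(gerrit_project)
    return [name for name in image_names if short in name]


# Illustrative image names only.
images = ["centos-binary-nova-compute", "centos-binary-neutron-server"]
print(find_images("openstack/nova", images))
```

Searching with the full name 'openstack/nova' would match nothing in this list, which is the failure mode the commit describes.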
|
|
There are problems with Nova launching instances due to permissions
with nova being able to read/write certain directories on the NFS. The
permissions are right on the NFS and the folders the NFS mounts to, but
there still seem to be issues. The cause may be using a directory under
/root as the NFS mount. This patch moves the NFS mounts to be individual
folders under /. The patch also restarts the nova_compute docker container,
as NFS problems still persist unless this is done.
JIRA: APEX-654
Change-Id: I25eee98c1a6516dfa44c686c2e614f6dc7000d98
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
RDO builds packages which are aarch64 compatible but some configuration
is needed to successfully deploy.
This change:
- Prepares the aarch64 docker.io repo as the source for Kolla Containers
- Configures VM sizing for aarch64 undercloud.
- Configures VM sizing for aarch64 virtual deploy targets.
VMs need to be larger on aarch64 than on x86 to avoid
resource starvation (MySQL).
- Configures vda2 as the location of the Linux kernel on aarch64 in
a UEFI system
- Configures the vNICs to be on the pci-bus instead of the virtio-mmio
bus. This will enable the NICs to come up in the same order as the
x86 ones, so the extra configuration in ansible is not needed
- Configures apex to use a stable version of the ceph:daemon container
- Configures apex for containerized undercloud in Rocky
- Adds an extra ansible.cfg file for aarch64 which increases waiting
times in ansible for aarch64
- Provides helper scripts for DIB to create aarch64 UEFI images
Known limitations:
- SELinux is interfering with DHCP requests in ironic and ssh,
so it must be disabled before the deploy command is run.
- The aarch64 containers are frozen at this commit:
https://trunk.rdoproject.org/centos7-rocky/f3/18/f3180de6439333a2813119ad4b00ef897fcd596f_70883030
- The 600s timeout defined in:
https://bugs.launchpad.net/tripleo/+bug/1789680 is not enough for
aarch64. A value of 1200s is recommended
JIRA: APEX-619
Change-Id: Ia3f067821e12bba44939bbf8c0e4676f2da70239
Signed-off-by: Charalampos Kominos <Charalampos.Kominos@enea.com>
Signed-off-by: ting wu <ting.wu@enea.com>
|
|
OOO team is removing the undercloud disk image as it is no longer needed
for containerized undercloud deployments. Instead, we can just use the
overcloud image as the undercloud image.
Additionally, OOO team has recommended we use current-tripleo instead of
current-tripleo-rdo. current-tripleo-rdo was previously thought to be
more stable with more promotion checks, but now it seems that it is
older and current-tripleo now has the same stability/checks.
This patch also bumps the undercloud RAM from 8GB to 10GB. With the new
containerized undercloud there is more RAM consumption during
deployment.
Change-Id: I9e6bb2260dbe9f8796ee54d20527c0aad96476ec
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
|
|
The API has changed to create/upload the docker container images to be
used for deployment. In the past the prepare commands would read the THT
environment files passed, to determine which docker images to render
into an environment file. The new behavior uses a new
"containers-prepare-parameter.yaml" format (included in this patch),
which Apex will now configure for deployment. By default docker images
will be rendered for all TripleO services identified in the
roles_data.yaml file. Therefore we must use several exclude patterns to
only pull the docker images needed for a default deployment.
JIRA: APEX-642
Change-Id: Iab00fcb874554bb98540dc9a4c3051e58ea68a3b
Signed-off-by: Tim Rozet <trozet@redhat.com>
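The effect of the exclude patterns mentioned in the commit above can be sketched in Python. This is only an illustration of pattern-based filtering, assuming the patterns are regular expressions matched against image names. The function name, the sample patterns, and the image names are hypothetical and do not come from the containers-prepare-parameter.yaml in the patch.

```python
import re


def select_images(image_names, excludes):
    """Keep only the images that match none of the exclude patterns.

    Hypothetical helper: each entry in `excludes` is treated as a regular
    expression searched for anywhere in the image name.
    """
    patterns = [re.compile(pattern) for pattern in excludes]
    return [
        name for name in image_names
        if not any(pattern.search(name) for pattern in patterns)
    ]


# Illustrative names and patterns only.
all_images = ["nova-compute", "neutron-server", "sahara-api", "barbican-api"]
print(select_images(all_images, ["sahara", "barbican"]))
```

Without any exclude patterns every image rendered from roles_data.yaml would be kept, which is why a default deployment needs the list trimmed.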
|
|
Functest needs these values to be set in overcloudrc to know which ports
to query ODL on.
JIRA: APEX-621
Change-Id: I41e34efccedc26edd98c6dd3f45e553ea76db195
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
It looks like the docker_container ansible module will recreate the
container if it fails to restart it. This is undesired behavior, so this
patch moves to using shell to restart the containers.
Also fixes the mistral executor container not properly mounting the
ceph-ansible playbook. Additionally fixes an issue with ceph-ansible by
downgrading the package. Related rhbz:
https://bugzilla.redhat.com/show_bug.cgi?id=1644713
Change-Id: I3199b4af11a4170d19419f70cb53f7d74def273c
Signed-off-by: Tim Rozet <trozet@redhat.com>
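The plain restart described in the commit above might look like the following when expressed in Python rather than as an Ansible shell task. This is a sketch under that assumption: the `restart_container` helper and its `runner` parameter are hypothetical, and only the `docker restart` CLI command itself is real. Because the command only restarts, a failure surfaces as an error instead of triggering a recreate.

```python
import subprocess


def restart_container(name, runner=subprocess.run):
    """Restart an existing container without ever recreating it.

    Hypothetical helper: `runner` defaults to subprocess.run and can be
    swapped out in tests. check=True raises if the restart fails.
    """
    cmd = ["docker", "restart", name]
    runner(cmd, check=True)
    return cmd
```

A call such as `restart_container("nova_compute")` would run `docker restart nova_compute` on the host where it executes.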
|
|
|
|
Master code only supports containerized undercloud now, so this
migration is needed.
- Containerized services in undercloud
We can still apply patches to THT and other non-docker services, but
we will need to add support for patching openstack services on
undercloud.
Change-Id: I1ca4c6108f144efef7b5889503af265ef0fff8b2
Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
As of Queens only HA OVN deployments are supported.
Change-Id: I184c5a096fec9cbc3cf2ec06218700138ea3ed57
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
Change-Id: I9e01f1164fc72915b92dfb1c0aad7414c484567e
Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
|
|
Change-Id: Ibfbd08dc2fa5fca95668fd0590707cfebd92099f
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
New arguments are added to allow snapshot deployment:
--snapshot, --snap-cache
The previous tripleo-quickstart code has been removed/replaced
with the snapshot option.
Snapshot deployments are supported on CentOS and Fedora, and snapshot
artifacts use a similar caching system as the standard deployment.
Snapshots are produced daily by Apex, and include latest as well as n-1
OpenStack versions. The os-odl-nofeature scenario is used for the
snapshots. Additionally multiple topology versions of Snapshots are
available. The Snapshot pulled at deploy time depends on the
deploy-settings and number of virtual-computes used at deploy time.
Since there is only one network used with snapshot deployments (admin),
there is no reason to pass in network settings for snapshot deployments.
That argument is now optional. Previously we required even in Standard
virtual deployments that the network settings be provided. However that
is also unnecessary, as we can default to the virtual network settings.
Includes minor fix to the tox.ini to allow specifying test cases
to run (useful for developers writing tests). Default behavior of
tox is unchanged.
JIRA: APEX-548
Change-Id: I1e08c4e54eac5aae99921f61ab7f69693ed12b47
Signed-off-by: Tim Rozet <trozet@redhat.com>
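The argument handling described in the commit above can be sketched with argparse. The `--snapshot` and `--snap-cache` flags come from the commit message. The `--network-settings` flag name, the default paths, and the `resolve_network_settings` helper are assumptions made for illustration and may differ from the real Apex CLI.

```python
import argparse


def build_parser():
    """Build a parser with the snapshot options named in the commit."""
    parser = argparse.ArgumentParser(prog="deploy-sketch")
    parser.add_argument("--snapshot", action="store_true",
                        help="deploy from a snapshot")
    # Default cache path is a placeholder, not the real Apex default.
    parser.add_argument("--snap-cache", default="/tmp/snap-cache",
                        help="directory used to cache snapshot artifacts")
    # Hypothetical flag name: network settings are now optional.
    parser.add_argument("--network-settings", default=None,
                        help="optional network settings file")
    return parser


def resolve_network_settings(args, virtual_default="network_settings.yaml"):
    """Pick the network settings file to use, if any.

    Snapshot deployments use only the admin network, so no file is needed.
    Otherwise fall back to a (placeholder) virtual default when none is given.
    """
    if args.snapshot:
        return None
    return args.network_settings or virtual_default


args = build_parser().parse_args(["--snapshot"])
print(resolve_network_settings(args))
```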
|
|
- This patch will install OVS 2.9.2 including
its kernel module which allows native NSH
headers.
- Fix Custom OVS due to bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1544892
- Tacker is disabled for the time being; tacker-conductor
needs to be enabled.
JIRA: APEX-630
Change-Id: Ia410309fd7053602ce78eae919839d0f57c9742a
Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
|
|
- Injection of Quagga tarball via overcloud builder.
- Extraction and installation of all related packages.
- It uses SDNVPN artifact repository to download Quagga
tarball, so there is only one source to test.
- Modifies bgpvpn scenario files to use OS master branch,
ODL master branch and containers.
JIRA: APEX-627
Change-Id: Icdbc2853d9531048e23fd6d5e444bd68208d18fc
Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
|
|
There is an issue where loss of external network connectivity
prevents cloud-init from working on instances. This becomes a big problem
with snapshots where there is no external network connectivity. Cloud
init fails because each request takes over 30 seconds to get a response.
This is because in the background neutron metadata agent is proxying the
request to nova metadata agent with an HTTP GET using the FQDN. For
whatever reason, a DNS lookup happens even though the entry exists for
the FQDN in /etc/hosts and waits 30 seconds until timing out. After this
timeout, a 200 OK is sent and metadata works.
This patch modifies the config file for metadata to use nova metadata
server's internal IP rather than FQDN as there is no option in OOO to
use IPs instead of FQDN.
Change-Id: I6960181a227d0002c99aeae5112f59807dc41d7a
Signed-off-by: Tim Rozet <trozet@redhat.com>
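The config change described in the commit above can be sketched in Python with configparser. The option name `nova_metadata_host`, the sample FQDN, and the IP address are assumptions for illustration; the commit message does not name the option or the file it edits.

```python
import configparser
import io


def use_metadata_ip(ini_text, ip):
    """Return the INI text with the metadata host replaced by an IP.

    Hypothetical helper: `nova_metadata_host` is an assumed option name.
    Interpolation is disabled so values containing '%' are left untouched.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    parser["DEFAULT"]["nova_metadata_host"] = ip
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Placeholder FQDN and address, for illustration only.
original = "[DEFAULT]\nnova_metadata_host = controller.internalapi.example\n"
print(use_metadata_ip(original, "192.0.2.10"))
```

With an IP in place, the proxied HTTP GET no longer needs a name lookup, which avoids the 30 second timeout described above.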
|
|
Upstream has moved to install undercloud as containers which breaks our
post undercloud configuration for some services. Disable it for now and
then move to it gracefully in a future patch.
Change-Id: I0cd1a8ddac4ba92734750265d2a16d3ef008f236
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
|
|
This patch removes the logic to use a specific tag for
Ceph containers. We will use whatever docker image TripleO
upstream uses. For aarch64, an ansible task will replace
the tag to pull the proper container image.
This patch also refactors the preparation of the local
registry. In Queens, there is no need to execute the
overcloud container image prepare command twice.
JIRA: APEX-622
Change-Id: I947d931609e58505675bb460a59d08c1d10d1d0b
Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
|
|
By default 8101 (karaf shell) is blocked on controllers. In Apex we
advertise in our user guide (and tools scripts) the ability to connect
to karaf shell. It is also required to run CSIT. This patch opens the
port when ODL is deployed.
Change-Id: Ib3ece41f19607bafc329d9de390cf774766a26cd
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
When deploying snapshots with a new ODL, we currently bring down the
docker container and bring up the tar.gz distro of ODL on the Overcloud
host itself (not rebuilding/using container). Therefore we need java
installed so that ODL can run on the host. In the future this may
change, but it works well and keeps things simple for now.
Additionally, there was a change upstream to make the opendaylight
container docker restart policy "unless-stopped" which means it will
no longer restart automatically when docker is stopped/started.
Therefore on first snapshot bring up (without the previously mentioned
ODL reinstallation) the container does not start, and snapshot
deployment fails. This patch includes a change to the restart policy to
always restart it.
Change-Id: Icc712ba147e578a28e371313154ae3190676f0dc
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
Recent changes upstream have removed the default 'admin' ODL password
and the password is now randomly generated:
https://review.openstack.org/#/c/578505/
So in OPNFV we now store the password in overcloudrc as
SDN_CONTROLLER_PASSWORD variable.
Also includes minor fixes to unittests.
Change-Id: Iabe7e4f902442c80af99ba1603a3927cf13d0393
Signed-off-by: Tim Rozet <trozet@redhat.com>
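A consumer of overcloudrc could read the stored password as in the Python sketch below. It assumes the file uses `export KEY=value` lines. The `read_overcloudrc` helper and the sample contents are hypothetical; only the SDN_CONTROLLER_PASSWORD variable name comes from the commit message.

```python
import shlex


def read_overcloudrc(text):
    """Parse 'export KEY=value' lines into a dictionary.

    Hypothetical helper: lines that are not simple exports are ignored,
    and shell quoting around the value is removed with shlex.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("export "):
            continue
        key, sep, value = line[len("export "):].partition("=")
        if sep:
            parts = shlex.split(value)
            values[key.strip()] = parts[0] if parts else ""
    return values


# Placeholder contents, for illustration only.
sample = "export OS_USERNAME=admin\nexport SDN_CONTROLLER_PASSWORD='s3cret'\n"
print(read_overcloudrc(sample)["SDN_CONTROLLER_PASSWORD"])
```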
|
|
This patch adds the capability to deploy a Kubernetes cluster instead of OpenStack.
Kubernetes will be deployed using kubespray, which is run after TripleO bootstraps
the overcloud nodes.
JIRA: APEX-574
Change-Id: If9c171620c933a052b719e7112a50e22bbab667f
Signed-off-by: Feng Pan <fpan@redhat.com>
Signed-off-by: Zenghui Shi <zshi@redhat.com>
|
|
In latest THT upstream, environment file neutron-opendaylight-sriov.yaml was
moved to services folder. Updating references in Apex to avoid deploy failure.
Change-Id: I7065e0d8e13c9add9ead282db2244a27c177e5a4
Signed-off-by: Feng Pan <fpan@redhat.com>
|
|
|
|
Baremetal deployments were failing because the ceph PG size was
exceeding the max allowed. Virtual was still working because we lower
the number of pools and pg/osd. This patch changes the values to a
number which should work for both virtual and baremetal. Also includes a
fix which adds the controllers back as OSDs and a few other cleanup
issues.
JIRA: APEX-614
JIRA: APEX-569
Change-Id: I2ad65727ecdcaa0454eb53d25e32b7f1a53cd3a4
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
/var/lib/mistral path contains logs for when ansible is invoked by
TripleO for Ceph configuration as well as config download. This patch
now archives and fetches that directory. Logs in previous releases
like Queens store the Ceph logs in /var/log/mistral.
Change-Id: I50c43e55efaa5dbcf8b7fb00b0e11cd3288fdd05
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
We currently only enable NAT on undercloud for virtual deployments.
However, there could be a case where a baremetal deployment also needs
NAT as it is not using an interface on the overcloud nodes with external
access. Therefore this patch changes the behavior to configure NAT when
the gateway of either the external or admin (when external is disabled)
network matches an IP assigned to the undercloud.
JIRA: APEX-605
Change-Id: I9c79af371913e6e5f0d39b433f68205bc7e106c5
Signed-off-by: Tim Rozet <trozet@redhat.com>
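The NAT decision described in the commit above can be sketched as a small Python function. The function name and parameters are hypothetical, and the addresses are placeholders. It assumes the external gateway is passed as None when the external network is disabled.

```python
import ipaddress


def needs_nat(undercloud_ips, admin_gateway, external_gateway=None):
    """Return True when the relevant gateway is an undercloud address.

    Hypothetical helper: the external gateway is used when the external
    network is enabled, otherwise the admin gateway is checked.
    """
    gateway = external_gateway if external_gateway is not None else admin_gateway
    local = {ipaddress.ip_address(ip) for ip in undercloud_ips}
    return ipaddress.ip_address(gateway) in local


# Placeholder addresses, for illustration only.
print(needs_nat(["192.0.2.1", "198.51.100.1"], "192.0.2.1"))
```

Comparing parsed addresses rather than strings keeps the check independent of whether the deployment is virtual or baremetal.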
|
|
There was a compatibility issue with CentOS 7.4/7.5 between the host
pacemaker version and the container. Now that containers have moved to 7.5
we should not need this workaround anymore.
Change-Id: I9632c65e87687d4f36130719c6df9af2e913eed8
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
We now move master to deploy from upstream. That means we do not need
to build undercloud/overcloud images anymore.
Changes-Include:
- Remove bash build scripts as we do not need to build anything
other than the python package anymore
- Remove building images or iso from build.py
- Remove building of images and iso from Makefile
- Rename/refactor deploy settings files for nosdn and odl. The new
convention is that the typical scenario names we use will deploy
master. We also support n-1 OS, so in that case we use the branch
name for the "feature" in the scenario name: os-odl-queens-noha.
- Tacker/Congress are disabled in settings files until we fix that with
upstream. Containers are now enabled by default.
- Disable TLS for undercloud (was changed upstream to default enabled)
- Fix environments docker directory for master THT (was changed upstream)
- Includes fix for LP#1768901
- Includes workaround for LP#1770692
- Moves to docker.io for container images as it is more stable and
should contain the same images
- Removes the term 'common' from apex packaging for referencing the
Python Apex package
Change-Id: If6b433860b3ff882686c78d0f24a2f0c52b9b57a
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
After deploying with nosdn, it looks like there is some out of state
issue between the services. First guess looks like something is going
on with the services and timing of registering to each other through
rabbit. Simply restarting the services seems to sync them back up
correctly.
Change-Id: I417911067c841725ee12eb9354e5759054724e01
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
Usage:
opnfv-pyutil --fetch-logs
python3 utils.py --fetch-logs --lib-dir ../lib
Eventually all utils.sh functions will be migrated here.
Note there is no support here for containers. Will be
added later.
Change-Id: I223b8592ad09e0370e287ee2801072db31f9aa12
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
Changes Include:
- For upstream deployments, Docker local registry will be updated with
latest current RDO containers, regular deployments will use latest
stable
- Upstream container images will then be patched/modified and then
re-uploaded into local docker registry with 'apex' tag
- Deployment command modified to deploy with containers
- Adds a --no-fetch deployment argument to disable pulling latest
from upstream, and instead using what already exists in cache
- Moves Undercloud NAT setup to just after undercloud is installed.
This provides internet during overcloud install which is now
required for upstream container deployments.
- Creates loop device for Ceph deployment when no device is
provided in deploy settings (for container deployment only)
- Updates NIC J2 template to use the new format in OOO since
the os-apply-config method is now deprecated in > Queens
JIRA: APEX-566
JIRA: APEX-549
Change-Id: I0652c194c059b915a942ac7401936e8f5c69d1fa
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
|
|
This scenario should enable SRIOV interfaces to be used
by Neutron. It will only be supported in baremetal deployments
with SRIOV capable NICs. The name of the interface must
be known in advance and the physnet of the SRIOV network
is set as nfv_sriov.
Change-Id: Ie4295413e0be2197bd9ada4f887f6b47cd486765
Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
|
|
Although this is not required to be able to access overcloud, it is
required by some tests in Functest.
JIRA: APEX-570
Change-Id: I45deaa8061f1be44ce80eed4810537eaf6841803
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
|
|
JIRA: APEX-512
Change-Id: I875bd99203b425e448e7a3f64eb9a8f99d03ddaf
Signed-off-by: Dan Radez <dradez@redhat.com>
|
|
We configure host iptables to open different ports for VBMC. This may
fail if the iptables filters are not loaded.
JIRA: APEX-521
Change-Id: Ia33032c29aba3555551e39b4f819087aeafe05d9
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
Needed to virt-customize the images
Change-Id: Ide3fff2c6b850047add6eeed4082c518c36e6e74
Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
|
|
JIRA: APEX-535
Change-Id: I52d17e962fc4a504db1ddbc20df0ac56a208f34b
Signed-off-by: Dan Radez <dradez@redhat.com>
|
|
Change-Id: Ib5f4715d851dc91be6a57fcb5d18a0557a7b0c7f
Signed-off-by: Dan Radez <dradez@redhat.com>
|
|
- exposes new option to end users to skip introspection
- moves the logic to decide to introspect or not into python
JIRA: APEX-536
Change-Id: Ieaff11362ff8f906daa98d301d3d473ad549d08f
Signed-off-by: Dan Radez <dradez@redhat.com>
|
|
|
|
- Fix ansible kvm_intel kernel module reload when trying to enable
nested kvm
- Add "--libvirt-type qemu" to deploy command when nested kvm is
not enabled.
JIRA: APEX-514
Change-Id: I0e659b1c99b5732854d723e1cb049845cb60ef37
Signed-off-by: Feng Pan <fpan@redhat.com>
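The conditional deploy argument described in the commit above can be sketched in Python. The `--libvirt-type qemu` argument comes from the commit message. The function name and the base command are hypothetical placeholders.

```python
def libvirt_args(nested_kvm_enabled):
    """Return the extra deploy arguments for the libvirt type.

    Hypothetical helper: nothing is added when nested kvm is enabled,
    otherwise qemu emulation is requested explicitly.
    """
    if nested_kvm_enabled:
        return []
    return ["--libvirt-type", "qemu"]


# Placeholder base command, for illustration only.
deploy_cmd = ["deploy-command"] + libvirt_args(nested_kvm_enabled=False)
print(" ".join(deploy_cmd))
```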
|
|
The lineinfile was not actually inserting the CephOSD line because it
was already present later in the file (regardless of 'insertbefore').
Therefore we can fix it by simply removing the CephOSD line before we
try to set it, since we do not need it in the storage role anyway.
Change-Id: I8c8d9b6baccfc77ea582fab6ad438b02293f96fb
Signed-off-by: Tim Rozet <trozet@redhat.com>
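The remove-then-insert approach from the commit above can be sketched in Python on a list of lines. This models the idea only, not Ansible's lineinfile module. The function name, the marker used as the insertion point, and the sample file contents are hypothetical.

```python
def insert_service_before(lines, service_line, before_marker):
    """Remove any existing copy of a line, then insert it before a marker.

    Hypothetical helper: dropping the existing copy first guarantees the
    insert takes effect, mirroring the fix described in the commit.
    """
    cleaned = [line for line in lines if line.strip() != service_line.strip()]
    for index, line in enumerate(cleaned):
        if before_marker in line:
            return cleaned[:index] + [service_line] + cleaned[index:]
    # Marker not found: append so the line is still present.
    return cleaned + [service_line]


# Illustrative roles file contents only.
roles = [
    "- name: Compute",
    "  ServicesDefault:",
    "    - OS::TripleO::Services::ComputeNeutronOvsAgent",
    "- name: CephStorage",
    "  ServicesDefault:",
    "    - OS::TripleO::Services::CephOSD",
]
ceph_osd = "    - OS::TripleO::Services::CephOSD"
for line in insert_service_before(roles, ceph_osd, "ComputeNeutronOvsAgent"):
    print(line)
```

In this sample the copy under the storage role is dropped and a single CephOSD entry ends up in the Compute role.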
|
|
There was an issue with patching the overcloud where the patch binary was
missing, making it impossible to apply patches. This change now installs
patch on the image.
Also, although deployments were successful, storage was not working.
This is because by default upstream does not apply Ceph OSDs to compute
nodes for hyperconverged Ceph, but we use this as our standard
deployment in Apex. This patch inserts CephOSD into the default Compute
role. Note: we normally override role's services in regular Apex
deployments so we do not hit this issue there.
Change-Id: I5bddda4784dc00148395863ae0990343a4159602
Signed-off-by: Tim Rozet <trozet@redhat.com>
|
|
We don't need to use https with rdo-release and it can
cause issues for ansible.
Change-Id: I081228a05d68f987fa02480bcd1bf216573550f1
Signed-off-by: Tim Rozet <trozet@redhat.com>
|