path: root/lib
Age | Commit message | Author | Files | Lines
2019-03-20 | Fixes deployment on CentOS 7.6 | Tim Rozet | 1 | -0/+4
Ceph-ansible install is moved from image builder to post undercloud install to ensure the right repo exists and is used. OVS building is now skipped as the build fails with CentOS 7.6. JIRA: APEX-658 Change-Id: I6ec253d5a88eb3cdfa38cf177b6e4b16ac5a16ed Signed-off-by: Tim Rozet <trozet@redhat.com>
2019-01-22 | Fixes broken compute role update | Tim Rozet | 2 | -3/+20
We now insert the External network into the compute role after it was removed upstream. However, the format of the network specification has changed: it no longer uses an Array, but instead uses a Dict. This patch accounts for that case. It also adds the new required arg --role-name to the NIC template merge tool. Additionally, the undercloud is now missing an iptables rule to allow ssh after undercloud install. This patch adds it via ansible. Change-Id: Id3e4ecdfb1633ec4c58435c294f544a9625a106e Signed-off-by: Tim Rozet <trozet@redhat.com>
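The Array-versus-Dict handling described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function name `add_network`, the role structure, and the `<name>_subnet` default are assumptions made for the example.

```python
def add_network(role, name, subnet=None):
    """Add network `name` to a role, whether 'networks' is a list or a dict.

    Hypothetical helper: the role layout and the default subnet naming
    are assumptions for illustration only.
    """
    networks = role.setdefault("networks", [])
    if isinstance(networks, dict):
        # Newer dict form: network name -> per-network options.
        networks.setdefault(name, {"subnet": subnet or name.lower() + "_subnet"})
    elif name not in networks:
        # Older array form: a plain list of network names.
        networks.append(name)
    return role


old_role = {"name": "Compute", "networks": ["InternalApi"]}
new_role = {"name": "Compute",
            "networks": {"InternalApi": {"subnet": "internal_api_subnet"}}}
add_network(old_role, "External")
add_network(new_role, "External")
```

Checking the container type at runtime keeps one code path working for both the n-1 release and master.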
2019-01-11 | Fixes undercloud install failure with setting hostname | Tim Rozet | 5 | -15/+29
There is a new bug when deploying master/rocky where the OS of the undercloud/overcloud is now upgraded to CentOS 7.6. When the undercloud install runs it fails to configure the hostname using hostnamectl. This is because systemd-hostnamed is not running and fails to start. Simply reloading dbus seems to fix the issue. In the dbus logs there are odd error messages like:
    dbus-daemon[3230]: Unknown username "root" in message bus configuration file
Disabling selinux seems to fix this. This patch also moves to use podman instead of docker for container management and invokes a script in Ansible which updates NIC templates as new variables are added upstream. Furthermore, with the new patches for routed networks in OOO, it is now required that the MTU is set in network-data, as well as adding the External network to the Compute role. Now the External network is removed by default from the Compute role. Change-Id: Ie8b86f6f28d69bda11b1f7a430df882970ac3cb9 Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-12-19 | Attempting to fix NFS issues | Tim Rozet | 2 | -6/+18
Issues still persist where sometimes instances fail to start due to a failure with os.utime to read the file path. This could be some bad race condition between qemu/nova while copying images on the NFS. This patch adds more ports to open in firewall, and changes initial directory owner to nfsnobody. Also, includes a patch to fix an apparent race condition when nova sends a remote call to the privsep helper daemon to modify the time of the base file owned by qemu: https://review.openstack.org/#/c/625741/ Includes another fix for patching container images where the docker image was not being detected correctly because the full gerrit project name including 'openstack/' prefix was being used to search tripleo docker images. Additionally, there were more bugs around patching openstack python containers where the patch was not being applied correctly. JIRA: APEX-654 Change-Id: I1d011035486298d5906038922e69d478c383c3f7 Signed-off-by: Tim Rozet <trozet@redhat.com>
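The image-detection fix mentioned above (searching with the full gerrit project name, including the 'openstack/' prefix) can be illustrated with a small sketch. The function names and the sample image names are hypothetical and are not taken from the patch.

```python
def project_short_name(project):
    """Strip any namespace prefix such as 'openstack/' from a gerrit project."""
    return project.split("/", 1)[-1]


def find_images(project, images):
    """Return the images whose name contains the short project name.

    Hypothetical helper: real matching may be more involved than a
    substring check.
    """
    name = project_short_name(project)
    return [image for image in images if name in image]


images = ["centos-binary-nova-compute", "centos-binary-neutron-server"]
matches = find_images("openstack/nova", images)
```

Searching for the literal string "openstack/nova" would match nothing in this list, which is the kind of miss the commit describes.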
2018-12-14 | Fix NFS issues with Nova | Tim Rozet | 1 | -5/+3
There are problems with Nova launching instances due to permissions with nova being able to read/write certain directories on the NFS. The permissions are right on the NFS and the folders the NFS mounts to, but there still seem to be issues. The cause may be using a directory under /root as the NFS mount. This patch moves the NFS mounts to be individual folders under /. The patch also restarts the nova_compute docker container, as NFS problems still persist unless this is done. JIRA: APEX-654 Change-Id: I25eee98c1a6516dfa44c686c2e614f6dc7000d98 Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-11-18 | Bring in aarch64 support in apex | Charalampos Kominos | 3 | -38/+30
RDO builds packages which are aarch64 compatible, but some configuration is needed to successfully deploy. This change:
- Prepares the aarch64 docker.io repo as the source for Kolla Containers
- Configures VM sizing for aarch64 undercloud
- Configures VM sizing for aarch64 virtual deploy targets. VMs need to be larger on aarch64 compared to x86 to avoid starvation of resources (MYSQL)
- Configures vda2 as the location of the Linux Kernel in aarch64 in a UEFI system
- Configures the vNICs to be on the pci-bus instead of the virtio-mmio bus. This will enable the NICs to come up in the same order as the x86 ones, so the extra configuration in ansible is not needed
- Configures apex to use a stable version of the ceph:daemon container
- Configures apex for containerized undercloud in Rocky
- Adds an extra ansible.cfg file for aarch64 which increases waiting times in ansible for aarch64
- Provides helper scripts for DIB to create aarch64 UEFI images
Known limitations:
- Selinux is interfering with DHCP requests in ironic and ssh, so it must be disabled before the deploy command is run.
- The aarch64 containers are frozen at this commit: https://trunk.rdoproject.org/centos7-rocky/f3/18/f3180de6439333a2813119ad4b00ef897fcd596f_70883030
- The 600s timeout defined in https://bugs.launchpad.net/tripleo/+bug/1789680 is not enough for aarch64. A value of 1200s is recommended.
JIRA: APEX-619
Change-Id: Ia3f067821e12bba44939bbf8c0e4676f2da70239
Signed-off-by: Charalampos Kominos <Charalampos.Kominos@enea.com>
Signed-off-by: ting wu <ting.wu@enea.com>
2018-11-13 | Remove downloading undercloud.qcow2 | Tim Rozet | 1 | -0/+4
OOO team is removing the undercloud disk image as it is no longer needed for containerized undercloud deployments. Instead, we can just use the overcloud image as the undercloud image. Additionally, OOO team has recommended we use current-tripleo instead of current-tripleo-rdo. current-tripleo-rdo was previously thought to be more stable with more promotion checks, but now it seems that it is older and current-tripleo now has the same stability/checks. This patch also bumps the undercloud RAM from 8GB to 10GB. With the new containerized undercloud there is more RAM consumption during deployment. Change-Id: I9e6bb2260dbe9f8796ee54d20527c0aad96476ec Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-11-01 | Merge "Fixes Docker image upload for master/rocky" | Feng Pan | 2 | -46/+12
2018-11-01 | Fixes Docker image upload for master/rocky | Tim Rozet | 2 | -46/+12
The API to create/upload the docker container images used for deployment has changed. In the past the prepare commands would read the THT environment files passed, to determine which docker images to render into an environment file. The new behavior uses a new "containers-prepare-parameter.yaml" format (included in this patch), which Apex will now configure for deployment. By default docker images will be rendered for all TripleO services identified in the roles_data.yaml file. Therefore we must use several exclude patterns to only pull the docker images needed for a default deployment. JIRA: APEX-642 Change-Id: Iab00fcb874554bb98540dc9a4c3051e58ea68a3b Signed-off-by: Tim Rozet <trozet@redhat.com>
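To illustrate how exclude patterns narrow down the rendered image list, here is a rough sketch. It assumes the patterns are regular expressions matched against image names and that the prepare parameter is a list of entries under `ContainerImagePrepare`; the namespace, tag, image names, and patterns below are placeholders, not values from the patch.

```python
import re


def build_prepare_parameter(excludes, namespace, tag):
    """Build a dict shaped like a containers-prepare-parameter file.

    The exact keys are an assumption about the upstream format; check the
    TripleO documentation before relying on them.
    """
    return {
        "parameter_defaults": {
            "ContainerImagePrepare": [{
                "push_destination": True,
                "excludes": list(excludes),
                "set": {"namespace": namespace, "tag": tag},
            }]
        }
    }


def filter_images(images, excludes):
    """Drop every image whose name matches any exclude pattern."""
    return [image for image in images
            if not any(re.search(pattern, image) for pattern in excludes)]


excludes = ["sahara", "manila"]  # placeholder patterns
kept = filter_images(["nova-compute", "sahara-api", "manila-share"], excludes)
param = build_prepare_parameter(excludes, "docker.io/example", "example-tag")
```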
2018-10-31 | Adds SDN Port variables to overcloudrc | Tim Rozet | 1 | -0/+11
Functest needs these values to be set in overcloudrc to know which ports to query ODL on. JIRA: APEX-621 Change-Id: I41e34efccedc26edd98c6dd3f45e553ea76db195 Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-10-31 | Fixes failure to restart containers post undercloud install | Tim Rozet | 2 | -12/+11
It looks like the docker_container ansible module will recreate the container if it fails to restart it. This is undesired behavior so moving to use shell to restart the containers. Also, fixes mistral executor container not properly mounting the ceph-ansible playbook. Additionally fixes an issue with ceph-ansible by downgrading the package. Related rhbz: https://bugzilla.redhat.com/show_bug.cgi?id=1644713 Change-Id: I3199b4af11a4170d19419f70cb53f7d74def273c Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-10-08 | Merge "Adding support for containerized undercloud" | Tim Rozet | 2 | -47/+40
2018-10-08 | Adding support for containerized undercloud | Ricardo Noriega | 2 | -47/+40
Master code only supports containerized undercloud now, so this migration is needed.
- Containerized services in undercloud
We can still apply patches to THT and other non-docker services, but we will need to add support for patching openstack services on undercloud. Change-Id: I1ca4c6108f144efef7b5889503af265ef0fff8b2 Signed-off-by: Ricardo Noriega <rnoriega@redhat.com> Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-09-27 | Enable OVN scenarios | Tim Rozet | 1 | -1/+4
As of Queens only HA OVN deployments are supported. Change-Id: I184c5a096fec9cbc3cf2ec06218700138ea3ed57 Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-09-17 | Fix per-network routes to NIC templates dependency | Ricardo Noriega | 1 | -0/+4
Change-Id: I9e01f1164fc72915b92dfb1c0aad7414c484567e Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
2018-09-06 | Updates Calipso deploy settings | Tim Rozet | 1 | -1/+7
Change-Id: Ibfbd08dc2fa5fca95668fd0590707cfebd92099f Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-08-23 | Adds deployment via snapshot | Tim Rozet | 1 | -0/+27
New arguments are added to allow snapshot deployment: --snapshot, --snap-cache. The previous tripleo-quickstart code has been removed/replaced with the snapshot option. Snapshot deployments are supported on CentOS and Fedora, and snapshot artifacts use a caching system similar to that of the standard deployment. Snapshots are produced daily by Apex, and include latest as well as n-1 OpenStack versions. The os-odl-nofeature scenario is used for the snapshots. Additionally, multiple topology versions of Snapshots are available. The Snapshot pulled at deploy time depends on the deploy-settings and number of virtual-computes used at deploy time. Since there is only one network used with snapshot deployments (admin), there is no reason to pass in network settings for snapshot deployments. That argument is now optional. Previously we required that the network settings be provided even in Standard virtual deployments. However, that is also unnecessary, as we can default to the virtual network settings. Includes a minor fix to the tox.ini to allow specifying test cases to run (useful for developers writing tests). Default behavior of tox is unchanged. JIRA: APEX-548 Change-Id: I1e08c4e54eac5aae99921f61ab7f69693ed12b47 Signed-off-by: Tim Rozet <trozet@redhat.com>
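The argument handling described above might look roughly like the sketch below. Only `--snapshot` and `--snap-cache` come from the commit message; the `--network-settings` and `--virtual` flag names, the default paths, and the helper `resolve_network_settings` are assumptions for illustration.

```python
import argparse

# Hypothetical default path, used only for this example.
DEFAULT_VIRTUAL_NETWORK_SETTINGS = "/etc/example/network_settings.yaml"


def build_parser():
    parser = argparse.ArgumentParser(description="Deployment argument sketch")
    parser.add_argument("--snapshot", action="store_true",
                        help="deploy from a snapshot")
    parser.add_argument("--snap-cache", dest="snap_cache",
                        default="/tmp/snap_cache",  # hypothetical default
                        help="directory used to cache snapshot artifacts")
    parser.add_argument("--network-settings", dest="network_settings",
                        default=None, help="optional network settings file")
    parser.add_argument("--virtual", action="store_true",
                        help="virtual deployment")
    return parser


def resolve_network_settings(args):
    """Pick the network settings file, or None when none is needed."""
    if args.snapshot:
        return None  # snapshots only use the admin network
    if args.network_settings:
        return args.network_settings
    if args.virtual:
        return DEFAULT_VIRTUAL_NETWORK_SETTINGS
    raise ValueError("network settings are required for this deployment")


parser = build_parser()
snapshot_args = parser.parse_args(["--snapshot"])
virtual_args = parser.parse_args(["--virtual"])
```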
2018-08-22 | Enable SFC scenarios for Gambia | Ricardo Noriega | 1 | -9/+0
- This patch will install OVS 2.9.2 including its kernel module, which allows native NSH headers.
- Fix Custom OVS due to bug: https://bugzilla.redhat.com/show_bug.cgi?id=1544892
- Tacker is disabled for the time being; tacker-conductor needs to be enabled.
JIRA: APEX-630 Change-Id: Ia410309fd7053602ce78eae919839d0f57c9742a Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
2018-08-02 | Enable BGPVPN for master deployments | Ricardo Noriega | 1 | -1/+1
- Injection of Quagga tarball via overcloud builder.
- Extraction and installation of all related packages.
- It uses SDNVPN artifact repository to download Quagga tarball, so there is only one source to test.
- Modifies bgpvpn scenario files to use OS master branch, ODL master branch and containers.
JIRA: APEX-627 Change-Id: Icdbc2853d9531048e23fd6d5e444bd68208d18fc Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
2018-07-31 | Use metadata IP instead of FQDN | Tim Rozet | 1 | -0/+11
There is an issue where loss of external network connectivity prevents cloud-init from working on instances. This becomes a big problem with snapshots, where there is no external network connectivity. Cloud-init fails because each request takes over 30 seconds to get a response. This is because in the background the neutron metadata agent is proxying the request to the nova metadata agent with an HTTP GET using the FQDN. For whatever reason, a DNS lookup happens even though the entry exists for the FQDN in /etc/hosts, and it waits 30 seconds until timing out. After this timeout, a 200 OK is sent and metadata works. This patch modifies the config file for metadata to use the nova metadata server's internal IP rather than the FQDN, as there is no option in OOO to use IPs instead of FQDN. Change-Id: I6960181a227d0002c99aeae5112f59807dc41d7a Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-07-26 | Disable undercloud as containers for now | Tim Rozet | 2 | -2/+2
Upstream has moved to install undercloud as containers which breaks our post undercloud configuration for some services. Disable it for now and then move to it gracefully in a future patch. Change-Id: I0cd1a8ddac4ba92734750265d2a16d3ef008f236 Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-07-26 | Merge "Remove obsolete Ceph tags" | Tim Rozet | 1 | -29/+16
2018-07-26 | Remove obsolete Ceph tags | Ricardo Noriega | 1 | -29/+16
This patch removes the logic to use a specific tag for Ceph containers. We will use whatever docker image TripleO upstream uses. For aarch64, an ansible task will replace the tag to pull the proper container image. This patch also refactors the preparation of the local registry. In Queens, there is no need to execute the overcloud container image prepare command twice. JIRA: APEX-622 Change-Id: I947d931609e58505675bb460a59d08c1d10d1d0b Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
2018-07-24 | Open port 8101 on controllers for karaf | Tim Rozet | 1 | -0/+15
By default 8101 (karaf shell) is blocked on controllers. In Apex we advertise in our user guide (and tools scripts) the ability to connect to karaf shell. It is also required to run CSIT. This patch opens the port when ODL is deployed. Change-Id: Ib3ece41f19607bafc329d9de390cf774766a26cd Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-07-20 | Fixes for snapshots | Tim Rozet | 1 | -0/+6
When deploying snapshots with a new ODL, we currently bring down the docker container and bring up the tar.gz distro of ODL on the Overcloud host itself (not rebuilding/using the container). Therefore we need java installed so that ODL can run on the host. In the future this may change, but it works well and keeps things simple for now. Additionally, there was a change upstream to make the opendaylight container docker restart policy "unless-stopped", which means it will no longer restart automatically when docker is stopped/started. Therefore on first snapshot bring up (without the previously mentioned ODL reinstallation) the container does not start, and snapshot deployment fails. This patch includes a change to the restart policy to always restart it. Change-Id: Icc712ba147e578a28e371313154ae3190676f0dc Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-07-18 | Add param for ODL password into overcloudrc | Tim Rozet | 1 | -0/+14
Recent changes upstream have removed the default 'admin' ODL password, and the password is now randomly generated: https://review.openstack.org/#/c/578505/ So in OPNFV we now store the password in overcloudrc as the SDN_CONTROLLER_PASSWORD variable. Also includes minor fixes to unittests. Change-Id: Iabe7e4f902442c80af99ba1603a3927cf13d0393 Signed-off-by: Tim Rozet <trozet@redhat.com>
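Storing a value such as SDN_CONTROLLER_PASSWORD in overcloudrc amounts to appending an `export` line to a shell rc file. A minimal sketch, assuming the file is a plain list of `export NAME=value` lines; the helper name and the sample values are hypothetical.

```python
import shlex


def append_exports(rc_text, variables):
    """Return rc_text with one 'export NAME=value' line added per variable.

    Values are shell-quoted so that randomly generated passwords with
    special characters stay intact when the file is sourced.
    """
    lines = rc_text.rstrip("\n").split("\n") if rc_text else []
    for name, value in variables.items():
        lines.append("export {}={}".format(name, shlex.quote(str(value))))
    return "\n".join(lines) + "\n"


updated = append_exports("export OS_USERNAME=admin\n",
                         {"SDN_CONTROLLER_PASSWORD": "s3cr3t"})
```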
2018-07-06 | Add support for kubernetes deployment | Zenghui Shi | 4 | -3/+16
This patch adds the capability to deploy a kubernetes cluster instead of openstack. Kubernetes will be deployed using kubespray, which is run after TripleO bootstraps the overcloud nodes. JIRA: APEX-574 Change-Id: If9c171620c933a052b719e7112a50e22bbab667f Signed-off-by: Feng Pan <fpan@redhat.com> Signed-off-by: Zenghui Shi <zshi@redhat.com>
2018-07-06 | Fix neutron-opendaylight-sriov.yaml path | Feng Pan | 2 | -2/+2
In latest THT upstream, environment file neutron-opendaylight-sriov.yaml was moved to services folder. Updating references in Apex to avoid deploy failure. Change-Id: I7065e0d8e13c9add9ead282db2244a27c177e5a4 Signed-off-by: Feng Pan <fpan@redhat.com>
2018-06-19 | Merge "Fixes Ceph PG calculation" | Feng Pan | 1 | -0/+8
2018-06-18 | Fixes Ceph PG calculation | Tim Rozet | 1 | -0/+8
Baremetal deployments were failing because the ceph PG size was exceeding the max allowed. Virtual was still working because we lower the number of pools and pg/osd. This patch changes the values to a number which should work for both virtual and baremetal. Also includes a fix which adds the controllers back as OSDs and a few other cleanup issues. JIRA: APEX-614 JIRA: APEX-569 Change-Id: I2ad65727ecdcaa0454eb53d25e32b7f1a53cd3a4 Signed-off-by: Tim Rozet <trozet@redhat.com>
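As a rough illustration of why a pool configuration can exceed the allowed placement groups, the sketch below estimates PG replicas per OSD as pools × PGs per pool × replica size ÷ OSD count and compares it with a limit. The limit of 200 and all the sample numbers are assumptions for the example; they are not the values used in the patch.

```python
def pgs_per_osd(pool_count, pg_num, replica_size, osd_count):
    """Estimate how many PG replicas each OSD ends up hosting."""
    return pool_count * pg_num * replica_size / osd_count


def fits(pool_count, pg_num, replica_size, osd_count, max_pg_per_osd=200):
    """True when the estimate stays within the per-OSD limit.

    max_pg_per_osd=200 is an assumed limit; check the Ceph release in use.
    """
    return pgs_per_osd(pool_count, pg_num, replica_size, osd_count) <= max_pg_per_osd


too_many = pgs_per_osd(5, 128, 3, 3)   # 5 pools, 128 PGs each, 3 replicas, 3 OSDs
acceptable = pgs_per_osd(5, 32, 3, 3)  # same layout with 32 PGs per pool
```

Lowering the PGs per pool, reducing the pool count, or adding OSDs (for example by adding controllers back as OSDs) all bring the estimate down.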
2018-06-18 | Fetch mistral logs from undercloud | Tim Rozet | 1 | -3/+18
/var/lib/mistral path contains logs for when ansible is invoked by TripleO for Ceph configuration as well as config download. This patch now archives and fetches that directory. Logs in previous releases like Queens store the Ceph logs in /var/log/mistral. Change-Id: I50c43e55efaa5dbcf8b7fb00b0e11cd3288fdd05 Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-05-30 | Configure NAT with baremetal when necessary | Tim Rozet | 1 | -1/+1
We currently only enable NAT on undercloud for virtual deployments. However, there could be a case where a baremetal deployment also needs NAT as it is not using an interface on the overcloud nodes with external access. Therefore this patch changes the behavior to configure NAT when the gateway of either the external or admin (when external is disabled) network matches an IP assigned to the undercloud. JIRA: APEX-605 Change-Id: I9c79af371913e6e5f0d39b433f68205bc7e106c5 Signed-off-by: Tim Rozet <trozet@redhat.com>
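The NAT decision described above can be sketched as a small predicate: use the external gateway when the external network is enabled, otherwise the admin gateway, and enable NAT when that gateway is one of the undercloud's own addresses. The function name and parameters are hypothetical.

```python
import ipaddress


def needs_nat(undercloud_ips, external_gateway=None, admin_gateway=None,
              external_enabled=True):
    """Return True when the relevant gateway is an IP on the undercloud."""
    gateway = external_gateway if external_enabled else admin_gateway
    if gateway is None:
        return False
    gateway_ip = ipaddress.ip_address(gateway)
    # Compare parsed addresses so that equivalent spellings still match.
    return any(ipaddress.ip_address(ip) == gateway_ip for ip in undercloud_ips)


undercloud_ips = ["192.0.2.1", "198.51.100.1"]  # documentation-range examples
nat_external = needs_nat(undercloud_ips, external_gateway="198.51.100.1")
nat_admin = needs_nat(undercloud_ips, admin_gateway="192.0.2.1",
                      external_enabled=False)
```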
2018-05-25 | Remove pacemaker workaround | Tim Rozet | 1 | -27/+0
There was a compatibility issue with the centos 7.4/7.5 between the host pacemaker version and container. Now that containers have moved to 7.5 we should not need this workaround anymore. Change-Id: I9632c65e87687d4f36130719c6df9af2e913eed8 Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-05-21 | Migrates master to use direct upstream | Tim Rozet | 4 | -6/+51
We now move master to deploy from upstream. That means we do not need to build undercloud/overcloud images anymore. Changes-Include:
- Remove bash build scripts as we do not need to build anything other than the python package anymore
- Remove building images or iso from build.py
- Remove building of images and iso from Makefile
- Rename/refactor deploy settings files for nosdn and odl. The new convention is that the typical scenario names we use will deploy master. We also support n-1 OS, so in that case we use the branch name for the "feature" in the scenario name: os-odl-queens-noha.
- Tacker/Congress are disabled in settings files until we fix that with upstream. Containers are now enabled by default.
- Disable TLS for undercloud (was changed upstream to default enabled)
- Fix environments docker directory for master THT (was changed upstream)
- Includes fix for LP#1768901
- Includes workaround for LP#1770692
- Moves to docker.io for container images as it is more stable and should contain the same images
- Removes the term 'common' from apex packaging for referencing the Python Apex package
Change-Id: If6b433860b3ff882686c78d0f24a2f0c52b9b57a Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-04-09 | Fix functional issues after nosdn deployment | Tim Rozet | 1 | -0/+17
After deploying with nosdn, it looks like there is some out of state issue between the services. First guess looks like something is going on with the services and timing of registering to each other through rabbit. Simply restarting the services seems to sync them back up correctly. Change-Id: I417911067c841725ee12eb9354e5759054724e01 Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-04-04 | Adds the ability to fetch logs from deployment | Tim Rozet | 2 | -0/+38
Usage:
    opnfv-pyutil --fetch-logs
    python3 utils.py --fetch-logs --lib-dir ../lib
Eventually all utils.sh functions will be migrated here. Note there is no support here for containers. Will be added later. Change-Id: I223b8592ad09e0370e287ee2801072db31f9aa12 Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-03-16 | Enables containerized overcloud deployments | Tim Rozet | 3 | -56/+140
Changes Include:
- For upstream deployments, Docker local registry will be updated with latest current RDO containers, regular deployments will use latest stable
- Upstream container images will then be patched/modified and then re-uploaded into local docker registry with 'apex' tag
- Deployment command modified to deploy with containers
- Adds a --no-fetch deployment argument to disable pulling latest from upstream, and instead using what already exists in cache
- Moves Undercloud NAT setup to just after undercloud is installed. This provides internet during overcloud install which is now required for upstream container deployments.
- Creates loop device for Ceph deployment when no device is provided in deploy settings (for container deployment only)
- Updates NIC J2 template to use the new format in OOO since the os-apply-config method is now deprecated in > Queens
JIRA: APEX-566 JIRA: APEX-549 Change-Id: I0652c194c059b915a942ac7401936e8f5c69d1fa Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-03-16 | Merge "Adding SRIOV scenario" | Tim Rozet | 2 | -0/+11
2018-03-15 | Adding SRIOV scenario | Ricardo Noriega | 2 | -0/+11
This scenario should enable SRIOV interfaces to be used by Neutron. It will only be supported in baremetal deployments with SRIOV capable NICs. The name of the interface must be known in advance, and the physnet of the SRIOV network is set as nfv_sriov. Change-Id: Ie4295413e0be2197bd9ada4f887f6b47cd486765 Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
2018-03-12 | Adds OS_REGION_NAME into overcloudrc files | Tim Rozet | 1 | -0/+15
Although this is not required to be able to access overcloud, it is required by some tests in Functest. JIRA: APEX-570 Change-Id: I45deaa8061f1be44ce80eed4810537eaf6841803 Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-03-07 | Merge "Adding libguestfs-tools as dependency" | Feng Pan | 1 | -0/+1
2018-02-16 | Add http(s)_proxy handling to apex | Dan Radez | 1 | -0/+12
JIRA: APEX-512 Change-Id: I875bd99203b425e448e7a3f64eb9a8f99d03ddaf Signed-off-by: Dan Radez <dradez@redhat.com>
2018-02-15 | Ensures v4/v6 iptables filters are loaded | Tim Rozet | 1 | -0/+6
We configure host iptables to open different ports for VBMC. This may fail if the iptables filters are not loaded. JIRA: APEX-521 Change-Id: Ia33032c29aba3555551e39b4f819087aeafe05d9 Signed-off-by: Tim Rozet <trozet@redhat.com>
2018-02-15 | Adding libguestfs-tools as dependency | Ricardo Noriega | 1 | -0/+1
Needed to virt-customize the images. Change-Id: Ide3fff2c6b850047add6eeed4082c518c36e6e74 Signed-off-by: Ricardo Noriega <rnoriega@redhat.com>
2018-02-07 | Allow disabling ipxe for provisioning | Dan Radez | 1 | -3/+0
JIRA: APEX-535 Change-Id: I52d17e962fc4a504db1ddbc20df0ac56a208f34b Signed-off-by: Dan Radez <dradez@redhat.com>
2018-01-31 | wrapping up deploy items for aarch | Dan Radez | 1 | -0/+2
Change-Id: Ib5f4715d851dc91be6a57fcb5d18a0557a7b0c7f Signed-off-by: Dan Radez <dradez@redhat.com>
2017-12-05 | Make introspection optional | Dan Radez | 1 | -5/+3
- exposes new option to end users to skip introspection
- moves the logic to decide to introspect or not into python
JIRA: APEX-536 Change-Id: Ieaff11362ff8f906daa98d301d3d473ad549d08f Signed-off-by: Dan Radez <dradez@redhat.com>
2017-12-02 | Merge "Fix nested kvm detection and enablement" | Tim Rozet | 1 | -11/+17
2017-11-30 | Fix nested kvm detection and enablement | Feng Pan | 1 | -11/+17
- Fix ansible kvm_intel kernel module reload when trying to enable nested kvm
- Add "--libvirt-type qemu" to deploy command when nested kvm is not enabled.
JIRA: APEX-514 Change-Id: I0e659b1c99b5732854d723e1cb049845cb60ef37 Signed-off-by: Feng Pan <fpan@redhat.com>
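A minimal sketch of the detection and fallback described above, assuming nested KVM on Intel hosts is reported through the kvm_intel module parameter file in sysfs. The function names are hypothetical, and how the real code detects nesting may differ.

```python
def nested_kvm_enabled(path="/sys/module/kvm_intel/parameters/nested"):
    """Return True when the kvm_intel 'nested' parameter reports Y or 1."""
    try:
        with open(path) as param_file:
            return param_file.read().strip() in ("Y", "1")
    except OSError:
        # Module not loaded or file not readable: treat as not enabled.
        return False


def libvirt_type_args(nested):
    """Extra deploy arguments: fall back to qemu when nesting is unavailable."""
    return [] if nested else ["--libvirt-type", "qemu"]


deploy_args = libvirt_type_args(nested_kvm_enabled("/nonexistent/nested"))
```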
2017-11-30 | Fixes inserting ceph OSD into compute role | Tim Rozet | 1 | -1/+9
The lineinfile was not actually inserting the CephOSD line because it was already present later in the file (regardless of 'insertbefore'). Therefore we can fix it by simply removing the CephOSD line before we try to set it, since we do not need it in the storage role anyway. Change-Id: I8c8d9b6baccfc77ea582fab6ad438b02293f96fb Signed-off-by: Tim Rozet <trozet@redhat.com>
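The remove-then-insert approach described above can be sketched on a plain list of lines. This is an illustration in Python rather than the Ansible tasks from the patch; the helper name and the sample role file content are assumptions.

```python
def move_line_before(lines, line, anchor):
    """Remove every existing copy of `line`, then insert it before `anchor`.

    Removing first avoids the 'already present, nothing to do' outcome
    that the commit message describes.
    """
    remaining = [entry for entry in lines if entry.strip() != line.strip()]
    for index, entry in enumerate(remaining):
        if anchor in entry:
            return remaining[:index] + [line] + remaining[index:]
    # Anchor not found: append at the end as a fallback.
    return remaining + [line]


roles = [
    "- name: Compute",
    "  ServicesDefault:",
    "    - OS::TripleO::Services::NovaCompute",
    "- name: CephStorage",
    "  ServicesDefault:",
    "    - OS::TripleO::Services::CephOSD",
]
result = move_line_before(roles, "    - OS::TripleO::Services::CephOSD",
                          "NovaCompute")
```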