diff options
Diffstat (limited to 'docs/testing/user/userguide/troubleshooting.rst')
-rw-r--r-- | docs/testing/user/userguide/troubleshooting.rst | 387 |
1 files changed, 387 insertions, 0 deletions
diff --git a/docs/testing/user/userguide/troubleshooting.rst b/docs/testing/user/userguide/troubleshooting.rst new file mode 100644 index 000000000..845501916 --- /dev/null +++ b/docs/testing/user/userguide/troubleshooting.rst @@ -0,0 +1,387 @@ +.. This work is licensed under a Creative Commons Attribution 4.0 International License. +.. http://creativecommons.org/licenses/by/4.0 + +Troubleshooting +=============== + +This section gives some guidelines about how to troubleshoot the test cases +owned by Functest. + +**IMPORTANT**: As in the previous section, the steps defined below must be +executed inside the Functest Docker container and after sourcing the OpenStack +credentials:: + + . $creds + +or:: + + source /home/opnfv/functest/conf/openstack.creds + +VIM +--- + +This section covers the test cases related to the VIM (healthcheck, vping_ssh, +vping_userdata, tempest_smoke_serial, tempest_full_parallel, rally_sanity, +rally_full). + +vPing common +^^^^^^^^^^^^ +For both vPing test cases (**vPing_ssh**, and **vPing_userdata**), the first steps are +similar: + + * Create Glance image + * Create Network + * Create Security Group + * Create Instances + +After these actions, the test cases differ and will be explained in their +respective section. + +These test cases can be run inside the container, using new Functest CLI as follows:: + + $ functest testcase run vping_ssh + $ functest testcase run vping_userdata + +The Functest CLI is designed to route a call to the corresponding internal +python scripts, located in paths: +*$REPOS_DIR/functest/functest/opnfv_tests/vPing/CI/libraries/vPing_ssh.py* and +*$REPOS_DIR/functest/functest/opnfv_tests/vPing/CI/libraries/vPing_userdata.py* + +Notes: + + #. There is one difference, between the Functest CLI based test case + execution compared to the earlier used Bash shell script, which is + relevant to point out in troubleshooting scenarios: + + The Functest CLI does **not yet** support the option to suppress + clean-up of the generated OpenStack resources, following the execution + of a test case. + + Explanation: After finishing the test execution, the corresponding + script will remove, by default, all created resources in OpenStack + (image, instances, network and security group). When troubleshooting, + it is advisable sometimes to keep those resources in case the test + fails and a manual testing is needed. + + It is actually still possible to invoke test execution, with suppression + of OpenStack resource cleanup, however this requires invocation of a + **specific Python script:** '/home/opnfv/repos/functest/ci/run_test.py'. + The `OPNFV Functest Developer Guide`_ provides guidance on the use of that + Python script in such troubleshooting cases. + +Some of the common errors that can appear in this test case are:: + + vPing_ssh- ERROR - There has been a problem when creating the neutron network.... + +This means that there has been some problems with Neutron, even before creating the +instances. Try to create manually a Neutron network and a Subnet to see if that works. +The debug messages will also help to see when it failed (subnet and router creation). +Example of Neutron commands (using 10.6.0.0/24 range for example):: + + neutron net-create net-test + neutron subnet-create --name subnet-test --allocation-pool start=10.6.0.2,end=10.6.0.100 \ + --gateway 10.6.0.254 net-test 10.6.0.0/24 + neutron router-create test_router + neutron router-interface-add <ROUTER_ID> test_subnet + neutron router-gateway-set <ROUTER_ID> <EXT_NET_NAME> + +Another related error can occur while creating the Security Groups for the instances:: + + vPing_ssh- ERROR - Failed to create the security group... + +In this case, proceed to create it manually. These are some hints:: + + neutron security-group-create sg-test + neutron security-group-rule-create sg-test --direction ingress --protocol icmp \ + --remote-ip-prefix 0.0.0.0/0 + neutron security-group-rule-create sg-test --direction ingress --ethertype IPv4 \ + --protocol tcp --port-range-min 80 --port-range-max 80 --remote-ip-prefix 0.0.0.0/0 + neutron security-group-rule-create sg-test --direction egress --ethertype IPv4 \ + --protocol tcp --port-range-min 80 --port-range-max 80 --remote-ip-prefix 0.0.0.0/0 + +The next step is to create the instances. The image used is located in +*/home/opnfv/functest/data/cirros-0.3.5-x86_64-disk.img* and a Glance image is created +with the name **functest-vping**. If booting the instances fails (i.e. the status +is not **ACTIVE**), you can check why it failed by doing:: + + nova list + nova show <INSTANCE_ID> + +It might show some messages about the booting failure. To try that manually:: + + nova boot --flavor m1.small --image functest-vping --nic net-id=<NET_ID> nova-test + +This will spawn a VM using the network created previously manually. +In all the OPNFV tested scenarios from CI, it never has been a problem with the +previous actions. Further possible problems are explained in the following sections. + + +vPing_SSH +^^^^^^^^^ +This test case creates a floating IP on the external network and assigns it to +the second instance **opnfv-vping-2**. The purpose of this is to establish +a SSH connection to that instance and SCP a script that will ping the first +instance. This script is located in the repository under +*$REPOS_DIR/functest/functest/opnfv_tests/OpenStack/vPing/ping.sh* and takes an IP as +a parameter. When the SCP is completed, the test will do an SSH call to that script +inside the second instance. Some problems can happen here:: + + vPing_ssh- ERROR - Cannot establish connection to IP xxx.xxx.xxx.xxx. Aborting + +If this is displayed, stop the test or wait for it to finish, if you have used +the special method of test invocation with specific supression of OpenStack +resource clean-up, as explained earler. It means that the Container can not +reach the Public/External IP assigned to the instance **opnfv-vping-2**. There +are many possible reasons, and they really depend on the chosen scenario. For +most of the ODL-L3 and ONOS scenarios this has been noticed and it is a known +limitation. + +First, make sure that the instance **opnfv-vping-2** succeeded to get an IP +from the DHCP agent. It can be checked by doing:: + + nova console-log opnfv-vping-2 + +If the message *Sending discover* and *No lease, failing* is shown, it probably +means that the Neutron dhcp-agent failed to assign an IP or even that it was not +responding. At this point it does not make sense to try to ping the floating IP. + +If the instance got an IP properly, try to ping manually the VM from the container:: + + nova list + <grab the public IP> + ping <public IP> + +If the ping does not return anything, try to ping from the Host where the Docker +container is running. If that solves the problem, check the iptable rules because +there might be some rules rejecting ICMP or TCP traffic coming/going from/to the +container. + +At this point, if the ping does not work either, try to reproduce the test +manually with the steps described above in the vPing common section with the +addition:: + + neutron floatingip-create <EXT_NET_NAME> + nova floating-ip-associate nova-test <FLOATING_IP> + + +Further troubleshooting is out of scope of this document, as it might be due to +problems with the SDN controller. Contact the installer team members or send an +email to the corresponding OPNFV mailing list for more information. + + + +vPing_userdata +^^^^^^^^^^^^^^ +This test case does not create any floating IP neither establishes an SSH +connection. Instead, it uses nova-metadata service when creating an instance +to pass the same script as before (ping.sh) but as 1-line text. This script +will be executed automatically when the second instance **opnfv-vping-2** is booted. + +The only known problem here for this test to fail is mainly the lack of support +of cloud-init (nova-metadata service). Check the console of the instance:: + + nova console-log opnfv-vping-2 + +If this text or similar is shown:: + + checking http://169.254.169.254/2009-04-04/instance-id + failed 1/20: up 1.13. request failed + failed 2/20: up 13.18. request failed + failed 3/20: up 25.20. request failed + failed 4/20: up 37.23. request failed + failed 5/20: up 49.25. request failed + failed 6/20: up 61.27. request failed + failed 7/20: up 73.29. request failed + failed 8/20: up 85.32. request failed + failed 9/20: up 97.34. request failed + failed 10/20: up 109.36. request failed + failed 11/20: up 121.38. request failed + failed 12/20: up 133.40. request failed + failed 13/20: up 145.43. request failed + failed 14/20: up 157.45. request failed + failed 15/20: up 169.48. request failed + failed 16/20: up 181.50. request failed + failed 17/20: up 193.52. request failed + failed 18/20: up 205.54. request failed + failed 19/20: up 217.56. request failed + failed 20/20: up 229.58. request failed + failed to read iid from metadata. tried 20 + +it means that the instance failed to read from the metadata service. Contact +the Functest or installer teams for more information. + +NOTE: Cloud-init in not supported on scenarios dealing with ONOS and the tests +have been excluded from CI in those scenarios. + + +Tempest +^^^^^^^ + +In the upstream OpenStack CI all the Tempest test cases are supposed to pass. +If some test cases fail in an OPNFV deployment, the reason is very probably one +of the following + ++-----------------------------+-----------------------------------------------------+ +| Error | Details | ++=============================+=====================================================+ +| Resources required for test | Such resources could be e.g. an external network | +| case execution are missing | and access to the management subnet (adminURL) from | +| | the Functest docker container. | ++-----------------------------+-----------------------------------------------------+ +| OpenStack components or | Check running services in the controller and compute| +| services are missing or not | nodes (e.g. with "systemctl" or "service" commands).| +| configured properly | Configuration parameters can be verified from the | +| | related .conf files located under '/etc/<component>'| +| | directories. | ++-----------------------------+-----------------------------------------------------+ +| Some resources required for | The tempest.conf file, automatically generated by | +| execution test cases are | Rally in Functest, does not contain all the needed | +| missing | parameters or some parameters are not set properly. | +| | The tempest.conf file is located in directory | +| | '/home/opnfv/.rally/tempest/for-deployment-<UUID>' | +| | in the Functest Docker container. Use the "rally | +| | deployment list" command in order to check the UUID | +| | the UUID of the current deployment. | ++-----------------------------+-----------------------------------------------------+ + + +When some Tempest test case fails, captured traceback and possibly also the +related REST API requests/responses are output to the console. More detailed debug +information can be found from tempest.log file stored into related Rally deployment +folder. + + +Rally +^^^^^ + +The same error causes which were mentioned above for Tempest test cases, may also +lead to errors in Rally as well. + +It is possible to run only one Rally scenario, instead of the whole suite. +To do that, call the alternative python script as follows:: + + python $REPOS_DIR/functest/functest/opnfv_tests/OpenStack/rally/run_rally-cert.py -h + usage: run_rally-cert.py [-h] [-d] [-r] [-s] [-v] [-n] test_name + + positional arguments: + test_name Module name to be tested. Possible values are : [ + authenticate | glance | cinder | heat | keystone | neutron | + nova | quotas | requests | vm | all ] The 'all' value + performs all possible test scenarios + + optional arguments: + -h, --help show this help message and exit + -d, --debug Debug mode + -r, --report Create json result file + -s, --smoke Smoke test mode + -v, --verbose Print verbose info about the progress + -n, --noclean Don't clean the created resources for this test. + +For example, to run the Glance scenario with debug information:: + + python $REPOS_DIR/functest/functest/opnfv_tests/OpenStack/rally/run_rally-cert.py -d glance + +Possible scenarios are: + * authenticate + * glance + * cinder + * heat + * keystone + * neutron + * nova + * quotas + * requests + * vm + +To know more about what those scenarios are doing, they are defined in directory: +*$REPOS_DIR/functest/functest/opnfv_tests/OpenStack/rally/scenario* +For more info about Rally scenario definition please refer to the Rally official +documentation. `[3]`_ + +If the flag *all* is specified, it will run all the scenarios one by one. Please +note that this might take some time (~1,5hr), taking around 1 hour alone to +complete the Nova scenario. + +To check any possible problems with Rally, the logs are stored under +*/home/opnfv/functest/results/rally/* in the Functest Docker container. + + +Controllers +----------- + +Opendaylight +^^^^^^^^^^^^ + +If the Basic Restconf test suite fails, check that the ODL controller is +reachable and its Restconf module has been installed. + +If the Neutron Reachability test fails, verify that the modules +implementing Neutron requirements have been properly installed. + +If any of the other test cases fails, check that Neutron and ODL have +been correctly configured to work together. Check Neutron configuration +files, accounts, IP addresses etc.). + + +ONOS +^^^^ + +Please refer to the ONOS documentation. `ONOSFW User Guide`_ . + + +Features +-------- + +Please refer to the dedicated feature user guides for details. + + +security_scan +^^^^^^^^^^^^^ + +See OpenSCAP web site: https://www.open-scap.org/ + + + +NFV +--- + +cloudify_ims +^^^^^^^^^^^^ +vIMS deployment may fail for several reasons, the most frequent ones are +described in the following table: + ++-----------------------------------+------------------------------------+ +| Error | Comments | ++===================================+====================================+ +| Keystone admin API not reachable | Impossible to create vIMS user and | +| | tenant | ++-----------------------------------+------------------------------------+ +| Impossible to retrieve admin role | Impossible to create vIMS user and | +| id | tenant | ++-----------------------------------+------------------------------------+ +| Error when uploading image from | impossible to deploy VNF | +| OpenStack to glance | | ++-----------------------------------+------------------------------------+ +| Cinder quota cannot be updated | Default quotas not sufficient, they| +| | are adapted in the script | ++-----------------------------------+------------------------------------+ +| Impossible to create a volume | VNF cannot be deployed | ++-----------------------------------+------------------------------------+ +| SSH connection issue between the | if vPing test fails, vIMS test will| +| Test Docker container and the VM | fail... | ++-----------------------------------+------------------------------------+ +| No Internet access from the VM | the VMs of the VNF must have an | +| | external access to Internet | ++-----------------------------------+------------------------------------+ +| No access to OpenStack API from | Orchestrator can be installed but | +| the VM | the vIMS VNF installation fails | ++-----------------------------------+------------------------------------+ + + +parser +^^^^^^ + +For now log info is the only way to do trouble shooting + + +.. _`OPNFV Functest Developer Guide`: http://artifacts.opnfv.org/functest/docs/devguide/# |