From 72a1f8c92f1692f1ea8dcb5bc706ec9939c30e0a Mon Sep 17 00:00:00 2001 From: Tomi Juvonen Date: Tue, 13 Oct 2020 16:37:57 +0300 Subject: Documents up-to-date According to document guidelines Release notes ETSI FEAT03 support and other minor enhancements JIRA: DOCTOR-143 Signed-off-by: Tomi Juvonen Change-Id: Iefa74004dfada376d1ab05c0149029a26f822275 --- docs/development/index.rst | 14 +- .../development/manuals/get-valid-server-state.rst | 125 ---------- docs/development/manuals/index.rst | 13 -- docs/development/manuals/mark-host-down_manual.rst | 122 ---------- docs/development/manuals/monitors.rst | 37 --- .../doctor-scenario-in-functest.rst | 255 --------------------- .../images/Fault-management-design.png | Bin 237110 -> 0 bytes .../overview/functest_scenario/images/LICENSE | 14 -- .../images/Maintenance-design.png | Bin 316640 -> 0 bytes .../images/Maintenance-workflow.png | Bin 81286 -> 0 bytes docs/development/overview/index.rst | 7 +- docs/development/overview/overview.rst | 52 +++++ docs/development/overview/testing.rst | 99 -------- docs/development/requirements/index.rst | 6 +- 14 files changed, 66 insertions(+), 678 deletions(-) delete mode 100644 docs/development/manuals/get-valid-server-state.rst delete mode 100644 docs/development/manuals/index.rst delete mode 100644 docs/development/manuals/mark-host-down_manual.rst delete mode 100644 docs/development/manuals/monitors.rst delete mode 100644 docs/development/overview/functest_scenario/doctor-scenario-in-functest.rst delete mode 100644 docs/development/overview/functest_scenario/images/Fault-management-design.png delete mode 100644 docs/development/overview/functest_scenario/images/LICENSE delete mode 100644 docs/development/overview/functest_scenario/images/Maintenance-design.png delete mode 100644 docs/development/overview/functest_scenario/images/Maintenance-workflow.png create mode 100644 docs/development/overview/overview.rst delete mode 100644 docs/development/overview/testing.rst (limited to 'docs/development') diff --git a/docs/development/index.rst b/docs/development/index.rst index 2dc16a82..a7d2817b 100644 --- a/docs/development/index.rst +++ b/docs/development/index.rst @@ -2,18 +2,18 @@ .. http://creativecommons.org/licenses/by/4.0 .. (c) 2016 OPNFV. +.. _development: -====== -Doctor -====== +=========== +Development +=========== .. toctree:: :maxdepth: 2 - ./design/index.rst - ./requirements/index.rst - ./manuals/index.rst - ./overview/functest_scenario/index.rst + ./design/index + ./overview/index + ./requirements/index Indices ======= diff --git a/docs/development/manuals/get-valid-server-state.rst b/docs/development/manuals/get-valid-server-state.rst deleted file mode 100644 index 824ea3c2..00000000 --- a/docs/development/manuals/get-valid-server-state.rst +++ /dev/null @@ -1,125 +0,0 @@ -.. This work is licensed under a Creative Commons Attribution 4.0 International License. -.. http://creativecommons.org/licenses/by/4.0 - -====================== -Get valid server state -====================== - -Related Blueprints: -=================== - -https://blueprints.launchpad.net/nova/+spec/get-valid-server-state - -Problem description -=================== - -Previously when the owner of a VM has queried his VMs, he has not received -enough state information, states have not changed fast enough in the VIM and -they have not been accurate in some scenarios. With this change this gap is now -closed. - -A typical case is that, in case of a fault of a host, the user of a high -availability service running on top of that host, needs to make an immediate -switch over from the faulty host to an active standby host. Now, if the compute -host is forced down [1] as a result of that fault, the user has to be notified -about this state change such that the user can react accordingly. Similarly, -a change of the host state to "maintenance" should also be notified to the -users. - -What is changed -=============== - -A new ``host_status`` parameter is added to the ``/servers/{server_id}`` and -``/servers/detail`` endpoints in microversion 2.16. By this new parameter -user can get additional state information about the host. - -``host_status`` possible values where next value in list can override the -previous: - -- ``UP`` if nova-compute is up. -- ``UNKNOWN`` if nova-compute status was not reported by servicegroup driver - within configured time period. Default is within 60 seconds, - but can be changed with ``service_down_time`` in nova.conf. -- ``DOWN`` if nova-compute was forced down. -- ``MAINTENANCE`` if nova-compute was disabled. MAINTENANCE in API directly - means nova-compute service is disabled. Different wording is used to avoid - the impression that the whole host is down, as only scheduling of new VMs - is disabled. -- Empty string indicates there is no host for server. - -``host_status`` is returned in the response in case the policy permits. By -default the policy is for admin only in Nova policy.json:: - - "os_compute_api:servers:show:host_status": "rule:admin_api" - -For an NFV use case this has to also be enabled for the owner of the VM:: - - "os_compute_api:servers:show:host_status": "rule:admin_or_owner" - -REST API examples: -================== - -Case where nova-compute is enabled and reporting normally:: - - GET /v2.1/{tenant_id}/servers/{server_id} - - 200 OK - { - "server": { - "host_status": "UP", - ... - } - } - -Case where nova-compute is enabled, but not reporting normally:: - - GET /v2.1/{tenant_id}/servers/{server_id} - - 200 OK - { - "server": { - "host_status": "UNKNOWN", - ... - } - } - -Case where nova-compute is enabled, but forced_down:: - - GET /v2.1/{tenant_id}/servers/{server_id} - - 200 OK - { - "server": { - "host_status": "DOWN", - ... - } - } - -Case where nova-compute is disabled:: - - GET /v2.1/{tenant_id}/servers/{server_id} - - 200 OK - { - "server": { - "host_status": "MAINTENANCE", - ... - } - } - -Host Status is also visible in python-novaclient:: - - +-------+------+--------+------------+-------------+----------+-------------+ - | ID | Name | Status | Task State | Power State | Networks | Host Status | - +-------+------+--------+------------+-------------+----------+-------------+ - | 9a... | vm1 | ACTIVE | - | RUNNING | xnet=... | UP | - +-------+------+--------+------------+-------------+----------+-------------+ - -Links: -====== - -[1] Manual for OpenStack NOVA API for marking host down -http://artifacts.opnfv.org/doctor/docs/manuals/mark-host-down_manual.html - -[2] OpenStack compute manual page -http://developer.openstack.org/api-ref-compute-v2.1.html#compute-v2.1 diff --git a/docs/development/manuals/index.rst b/docs/development/manuals/index.rst deleted file mode 100644 index f705f94a..00000000 --- a/docs/development/manuals/index.rst +++ /dev/null @@ -1,13 +0,0 @@ -.. This work is licensed under a Creative Commons Attribution 4.0 International License. -.. http://creativecommons.org/licenses/by/4.0 - -.. _doctor-manuals: - -******* -Manuals -******* - -.. toctree:: - -.. include:: mark-host-down_manual.rst -.. include:: get-valid-server-state.rst diff --git a/docs/development/manuals/mark-host-down_manual.rst b/docs/development/manuals/mark-host-down_manual.rst deleted file mode 100644 index 3815205d..00000000 --- a/docs/development/manuals/mark-host-down_manual.rst +++ /dev/null @@ -1,122 +0,0 @@ -.. This work is licensed under a Creative Commons Attribution 4.0 International License. -.. http://creativecommons.org/licenses/by/4.0 - -========================================= -OpenStack NOVA API for marking host down. -========================================= - -Related Blueprints: -=================== - - https://blueprints.launchpad.net/nova/+spec/mark-host-down - https://blueprints.launchpad.net/python-novaclient/+spec/support-force-down-service - -What the API is for -=================== - - This API will give external fault monitoring system a possibility of telling - OpenStack Nova fast that compute host is down. This will immediately enable - calling of evacuation of any VM on host and further enabling faster HA - actions. - -What this API does -================== - - In OpenStack the nova-compute service state can represent the compute host - state and this new API is used to force this service down. It is assumed - that the one calling this API has made sure the host is also fenced or - powered down. This is important, so there is no chance same VM instance will - appear twice in case evacuated to new compute host. When host is recovered - by any means, the external system is responsible of calling the API again to - disable forced_down flag and let the host nova-compute service report again - host being up. If network fenced host come up again it should not boot VMs - it had if figuring out they are evacuated to other compute host. The - decision of deleting or booting VMs there used to be on host should be - enhanced later to be more reliable by Nova blueprint: - https://blueprints.launchpad.net/nova/+spec/robustify-evacuate - -REST API for forcing down: -========================== - - Parameter explanations: - tenant_id: Identifier of the tenant. - binary: Compute service binary name. - host: Compute host name. - forced_down: Compute service forced down flag. - token: Token received after successful authentication. - service_host_ip: Serving controller node ip. - - request: - PUT /v2.1/{tenant_id}/os-services/force-down - { - "binary": "nova-compute", - "host": "compute1", - "forced_down": true - } - - response: - 200 OK - { - "service": { - "host": "compute1", - "binary": "nova-compute", - "forced_down": true - } - } - - Example: - curl -g -i -X PUT http://{service_host_ip}:8774/v2.1/{tenant_id}/os-services - /force-down -H "Content-Type: application/json" -H "Accept: application/json - " -H "X-OpenStack-Nova-API-Version: 2.11" -H "X-Auth-Token: {token}" -d '{"b - inary": "nova-compute", "host": "compute1", "forced_down": true}' - -CLI for forcing down: -===================== - - nova service-force-down nova-compute - - Example: - nova service-force-down compute1 nova-compute - -REST API for disabling forced down: -=================================== - - Parameter explanations: - tenant_id: Identifier of the tenant. - binary: Compute service binary name. - host: Compute host name. - forced_down: Compute service forced down flag. - token: Token received after successful authentication. - service_host_ip: Serving controller node ip. - - request: - PUT /v2.1/{tenant_id}/os-services/force-down - { - "binary": "nova-compute", - "host": "compute1", - "forced_down": false - } - - response: - 200 OK - { - "service": { - "host": "compute1", - "binary": "nova-compute", - "forced_down": false - } - } - - Example: - curl -g -i -X PUT http://{service_host_ip}:8774/v2.1/{tenant_id}/os-services - /force-down -H "Content-Type: application/json" -H "Accept: application/json - " -H "X-OpenStack-Nova-API-Version: 2.11" -H "X-Auth-Token: {token}" -d '{"b - inary": "nova-compute", "host": "compute1", "forced_down": false}' - -CLI for disabling forced down: -============================== - - nova service-force-down --unset nova-compute - - Example: - nova service-force-down --unset compute1 nova-compute diff --git a/docs/development/manuals/monitors.rst b/docs/development/manuals/monitors.rst deleted file mode 100644 index eeb5e226..00000000 --- a/docs/development/manuals/monitors.rst +++ /dev/null @@ -1,37 +0,0 @@ -.. This work is licensed under a Creative Commons Attribution 4.0 International License. -.. http://creativecommons.org/licenses/by/4.0 - -Monitor Types and Limitations -============================= - -Currently there are two monitor types supported: sample and collectd - -Sample Monitor --------------- - -Sample monitor type pings the compute host from the control host and calculates the -notification time after the ping timeout. -Also if inspector type is sample, the compute node needs to communicate with the control -node on port 12345. This port needs to be opened for incomming traffic on control node. - -Collectd Monitor ----------------- - -Collectd monitor type uses collectd daemon running ovs_events plugin. Collectd runs on -compute to send instant notification to the control node. The notification time is -calculated by using the difference of time at which compute node sends notification to -control node and the time at which consumer is notified. The time on control and compute -node has to be synchronized for this reason. For further details on setting up collectd -on the compute node, use the following link: -:doc:`` - - -Collectd monitors an interface managed by OVS. If the interface is not be assigned -an IP, the user has to provide the name of interface to be monitored. The command to -launch the doctor test in that case is: -MONITOR_TYPE=collectd INSPECTOR_TYPE=sample INTERFACE_NAME=example_iface ./run.sh - -If the interface name or IP is not provided, the collectd monitor type will monitor the -default management interface. This may result in the failure of doctor run.sh test case. -The test case sets the monitored interface down and if the inspector (sample or congress) -is running on the same subnet, collectd monitor will not be able to communicate with it. diff --git a/docs/development/overview/functest_scenario/doctor-scenario-in-functest.rst b/docs/development/overview/functest_scenario/doctor-scenario-in-functest.rst deleted file mode 100644 index 4505dd8f..00000000 --- a/docs/development/overview/functest_scenario/doctor-scenario-in-functest.rst +++ /dev/null @@ -1,255 +0,0 @@ -.. This work is licensed under a Creative Commons Attribution 4.0 International License. -.. http://creativecommons.org/licenses/by/4.0 - - - -Platform overview -""""""""""""""""" - -Doctor platform provides these features since `Danube Release `_: - -* Immediate Notification -* Consistent resource state awareness for compute host down -* Valid compute host status given to VM owner - -These features enable high availability of Network Services on top of -the virtualized infrastructure. Immediate notification allows VNF managers -(VNFM) to process recovery actions promptly once a failure has occurred. -Same framework can also be utilized to have VNFM awareness about -infrastructure maintenance. - -Consistency of resource state is necessary to execute recovery actions -properly in the VIM. - -Ability to query host status gives VM owner the possibility to get -consistent state information through an API in case of a compute host -fault. - -The Doctor platform consists of the following components: - -* OpenStack Compute (Nova) -* OpenStack Networking (Neutron) -* OpenStack Telemetry (Ceilometer) -* OpenStack Alarming (AODH) -* Doctor Sample Inspector, OpenStack Congress or OpenStack Vitrage -* Doctor Sample Monitor or any monitor supported by Congress or Vitrage - -.. note:: - Doctor Sample Monitor is used in Doctor testing. However in real - implementation like Vitrage, there are several other monitors supported. - -You can see an overview of the Doctor platform and how components interact in -:numref:`figure-p1`. - -.. figure:: ./images/Fault-management-design.png - :name: figure-p1 - :width: 100% - - Doctor platform and typical sequence - -Detailed information on the Doctor architecture can be found in the Doctor -requirements documentation: -http://artifacts.opnfv.org/doctor/docs/requirements/05-implementation.html - -Running test cases -"""""""""""""""""" - -Functest will call the "doctor_tests/main.py" in Doctor to run the test job. -Doctor testing can also be triggered by tox on OPNFV installer jumphost. Tox -is normally used for functional, module and coding style testing in Python -project. - -Currently, 'Apex', 'MCP' and 'local' installer are supported. - - -Fault management use case -""""""""""""""""""""""""" - -* A consumer of the NFVI wants to receive immediate notifications about faults - in the NFVI affecting the proper functioning of the virtual resources. - Therefore, such faults have to be detected as quickly as possible, and, when - a critical error is observed, the affected consumer is immediately informed - about the fault and can switch over to the STBY configuration. - -The faults to be monitored (and at which detection rate) will be configured by -the consumer. Once a fault is detected, the Inspector in the Doctor -architecture will check the resource map maintained by the Controller, to find -out which virtual resources are affected and then update the resources state. -The Notifier will receive the failure event requests sent from the Controller, -and notify the consumer(s) of the affected resources according to the alarm -configuration. - -Detailed workflow information is as follows: - -* Consumer(VNFM): (step 0) creates resources (network, server/instance) and an - event alarm on state down notification of that server/instance or Neutron - port. - -* Monitor: (step 1) periodically checks nodes, such as ping from/to each - dplane nic to/from gw of node, (step 2) once it fails to send out event - with "raw" fault event information to Inspector - -* Inspector: when it receives an event, it will (step 3) mark the host down - ("mark-host-down"), (step 4) map the PM to VM, and change the VM status to - down. In network failure case, also Neutron port is changed to down. - -* Controller: (step 5) sends out instance update event to Ceilometer. In network - failure case, also Neutron port is changed to down and corresponding event is - sent to Ceilometer. - -* Notifier: (step 6) Ceilometer transforms and passes the events to AODH, - (step 7) AODH will evaluate events with the registered alarm definitions, - then (step 8) it will fire the alarm to the "consumer" who owns the - instance - -* Consumer(VNFM): (step 9) receives the event and (step 10) recreates a new - instance - -Fault management test case -"""""""""""""""""""""""""" - -Functest will call the 'doctor-test' command in Doctor to run the test job. - -The following steps are executed: - -Firstly, get the installer ip according to the installer type. Then ssh to -the installer node to get the private key for accessing to the cloud. As -'fuel' installer, ssh to the controller node to modify nova and ceilometer -configurations. - -Secondly, prepare image for booting VM, then create a test project and test -user (both default to doctor) for the Doctor tests. - -Thirdly, boot a VM under the doctor project and check the VM status to verify -that the VM is launched completely. Then get the compute host info where the VM -is launched to verify connectivity to the target compute host. Get the consumer -ip according to the route to compute ip and create an alarm event in Ceilometer -using the consumer ip. - -Fourthly, the Doctor components are started, and, based on the above preparation, -a failure is injected to the system, i.e. the network of compute host is -disabled for 3 minutes. To ensure the host is down, the status of the host -will be checked. - -Finally, the notification time, i.e. the time between the execution of step 2 -(Monitor detects failure) and step 9 (Consumer receives failure notification) -is calculated. - -According to the Doctor requirements, the Doctor test is successful if the -notification time is below 1 second. - -Maintenance use case -"""""""""""""""""""" - -* A consumer of the NFVI wants to interact with NFVI maintenance, upgrade, - scaling and to have graceful retirement. Receiving notifications over these - NFVI events and responding to those within given time window, consumer can - guarantee zero downtime to his service. - -The maintenance use case adds the Doctor platform an `admin tool` and an -`app manager` component. Overview of maintenance components can be seen in -:numref:`figure-p2`. - -.. figure:: ./images/Maintenance-design.png - :name: figure-p2 - :width: 100% - - Doctor platform components in maintenance use case - -In maintenance use case, `app manager` (VNFM) will subscribe to maintenance -notifications triggered by project specific alarms through AODH. This is the way -it gets to know different NFVI maintenance, upgrade and scaling operations that -effect to its instances. The `app manager` can do actions depicted in `green -color` or tell `admin tool` to do admin actions depicted in `orange color` - -Any infrastructure component like `Inspector` can subscribe to maintenance -notifications triggered by host specific alarms through AODH. Subscribing to the -notifications needs admin privileges and can tell when a host is out of use as -in maintenance and when it is taken back to production. - -Maintenance test case -""""""""""""""""""""" - -Maintenance test case is currently running in our Apex CI and executed by tox. -This is because the special limitation mentioned below and also the fact we -currently have only sample implementation as a proof of concept and we also -support unofficial OpenStack project Fenix. Environment variable -TEST_CASE='maintenance' needs to be used when executing "doctor_tests/main.py" -and ADMIN_TOOL_TYPE='fenix' if want to test with Fenix instead of sample -implementation. Test case workflow can be seen in :numref:`figure-p3`. - -.. figure:: ./images/Maintenance-workflow.png - :name: figure-p3 - :width: 100% - - Maintenance test case workflow - -In test case all compute capacity will be consumed with project (VNF) instances. -For redundant services on instances and an empty compute needed for maintenance, -test case will need at least 3 compute nodes in system. There will be 2 -instances on each compute, so minimum number of VCPUs is also 2. Depending on -how many compute nodes there is application will always have 2 redundant -instances (ACT-STDBY) on different compute nodes and rest of the compute -capacity will be filled with non-redundant instances. - -For each project specific maintenance message there is a time window for -`app manager` to make any needed action. This will guarantee zero -down time for his service. All replies back are done by calling `admin tool` API -given in the message. - -The following steps are executed: - -Infrastructure admin will call `admin tool` API to trigger maintenance for -compute hosts having instances belonging to a VNF. - -Project specific `MAINTENANCE` notification is triggered to tell `app manager` -that his instances are going to hit by infrastructure maintenance at a specific -point in time. `app manager` will call `admin tool` API to answer back -`ACK_MAINTENANCE`. - -When the time comes to start the actual maintenance workflow in `admin tool`, -a `DOWN_SCALE` notification is triggered as there is no empty compute node for -maintenance (or compute upgrade). Project receives corresponding alarm and scales -down instances and call `admin tool` API to answer back `ACK_DOWN_SCALE`. - -As it might happen instances are not scaled down (removed) from a single -compute node, `admin tool` might need to figure out what compute node should be -made empty first and send `PREPARE_MAINTENANCE` to project telling which instance -needs to be migrated to have the needed empty compute. `app manager` makes sure -he is ready to migrate instance and call `admin tool` API to answer back -`ACK_PREPARE_MAINTENANCE`. `admin tool` will make the migration and answer -`ADMIN_ACTION_DONE`, so `app manager` knows instance can be again used. - -:numref:`figure-p3` has next a light blue section of actions to be done for each -compute. However as we now have one empty compute, we will maintain/upgrade that -first. So on first round, we can straight put compute in maintenance and send -admin level host specific `IN_MAINTENANCE` message. This is caught by `Inspector` -to know host is down for maintenance. `Inspector` can now disable any automatic -fault management actions for the host as it can be down for a purpose. After -`admin tool` has completed maintenance/upgrade `MAINTENANCE_COMPLETE` message -is sent to tell host is back in production. - -Next rounds we always have instances on compute, so we need to have -`PLANNED_MAINTANANCE` message to tell that those instances are now going to hit -by maintenance. When `app manager` now receives this message, he knows instances -to be moved away from compute will now move to already maintained/upgraded host. -In test case no upgrade is done on application side to upgrade instances -according to new infrastructure capabilities, but this could be done here as -this information is also passed in the message. This might be just upgrading -some RPMs, but also totally re-instantiating instance with a new flavor. Now if -application runs an active side of a redundant instance on this compute, -a switch over will be done. After `app manager` is ready he will call -`admin tool` API to answer back `ACK_PLANNED_MAINTENANCE`. In test case the -answer is `migrate`, so `admin tool` will migrate instances and reply -`ADMIN_ACTION_DONE` and then `app manager` knows instances can be again used. -Then we are ready to make the actual maintenance as previously trough -`IN_MAINTENANCE` and `MAINTENANCE_COMPLETE` steps. - -After all computes are maintained, `admin tool` can send `MAINTENANCE_COMPLETE` -to tell maintenance/upgrade is now complete. For `app manager` this means he -can scale back to full capacity. - -This is the current sample implementation and test case. Real life -implementation is started in OpenStack Fenix project and there we should -eventually address requirements more deeply and update the test case with Fenix -implementation. diff --git a/docs/development/overview/functest_scenario/images/Fault-management-design.png b/docs/development/overview/functest_scenario/images/Fault-management-design.png deleted file mode 100644 index 6d98cdec..00000000 Binary files a/docs/development/overview/functest_scenario/images/Fault-management-design.png and /dev/null differ diff --git a/docs/development/overview/functest_scenario/images/LICENSE b/docs/development/overview/functest_scenario/images/LICENSE deleted file mode 100644 index 21a2d03d..00000000 --- a/docs/development/overview/functest_scenario/images/LICENSE +++ /dev/null @@ -1,14 +0,0 @@ -Copyright 2017 Open Platform for NFV Project, Inc. and its contributors - -Open Platform for NFV Project Documentation License -=================================================== -Any documentation developed by the "Open Platform for NFV Project" -is licensed under a Creative Commons Attribution 4.0 International License. -You should have received a copy of the license along with this. If not, -see . - -Unless required by applicable law or agreed to in writing, documentation -distributed under the License is distributed on an "AS IS" BASIS, -WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -See the License for the specific language governing permissions and -limitations under the License. diff --git a/docs/development/overview/functest_scenario/images/Maintenance-design.png b/docs/development/overview/functest_scenario/images/Maintenance-design.png deleted file mode 100644 index 8f21db6a..00000000 Binary files a/docs/development/overview/functest_scenario/images/Maintenance-design.png and /dev/null differ diff --git a/docs/development/overview/functest_scenario/images/Maintenance-workflow.png b/docs/development/overview/functest_scenario/images/Maintenance-workflow.png deleted file mode 100644 index 9b65fd59..00000000 Binary files a/docs/development/overview/functest_scenario/images/Maintenance-workflow.png and /dev/null differ diff --git a/docs/development/overview/index.rst b/docs/development/overview/index.rst index 956e73e3..f6d78d57 100644 --- a/docs/development/overview/index.rst +++ b/docs/development/overview/index.rst @@ -3,11 +3,12 @@ .. _doctor-overview: -************************ -Doctor Development Guide -************************ +******** +Overview +******** .. toctree:: :maxdepth: 2 + overview.rst testing.rst diff --git a/docs/development/overview/overview.rst b/docs/development/overview/overview.rst new file mode 100644 index 00000000..21f5439e --- /dev/null +++ b/docs/development/overview/overview.rst @@ -0,0 +1,52 @@ +.. This work is licensed under a Creative Commons Attribution 4.0 International License. +.. http://creativecommons.org/licenses/by/4.0 + +Platform overview +""""""""""""""""" + +Doctor platform provides these features since `Danube Release `_: + +* Immediate Notification +* Consistent resource state awareness for compute host down +* Valid compute host status given to VM owner + +These features enable high availability of Network Services on top of +the virtualized infrastructure. Immediate notification allows VNF managers +(VNFM) to process recovery actions promptly once a failure has occurred. +Same framework can also be utilized to have VNFM awareness about +infrastructure maintenance. + +Consistency of resource state is necessary to execute recovery actions +properly in the VIM. + +Ability to query host status gives VM owner the possibility to get +consistent state information through an API in case of a compute host +fault. + +The Doctor platform consists of the following components: + +* OpenStack Compute (Nova) +* OpenStack Networking (Neutron) +* OpenStack Telemetry (Ceilometer) +* OpenStack Alarming (AODH) +* Doctor Sample Inspector, OpenStack Congress or OpenStack Vitrage +* Doctor Sample Monitor or any monitor supported by Congress or Vitrage + +.. note:: + Doctor Sample Monitor is used in Doctor testing. However in real + implementation like Vitrage, there are several other monitors supported. + +You can see an overview of the Doctor platform and how components interact in +:numref:`figure-p1`. + + +Maintenance use case provides these features since `Iruya Release `_: + +* Infrastructure maintenance and upgrade workflow +* Interaction between VNFM and infrastructe workflow + +Since `Jerma Release `_ maintenance +use case also supports 'ETSI FEAT03' implementation to have the infrastructure +maintenance and upgrade fully optimized while keeping zero impact on VNF +service. + diff --git a/docs/development/overview/testing.rst b/docs/development/overview/testing.rst deleted file mode 100644 index 663d4c3f..00000000 --- a/docs/development/overview/testing.rst +++ /dev/null @@ -1,99 +0,0 @@ -.. This work is licensed under a Creative Commons Attribution 4.0 International License. -.. http://creativecommons.org/licenses/by/4.0 - -============== -Testing Doctor -============== - -You have two options to test Doctor functions with the script developed -for doctor CI. - -You need to install OpenStack and other OPNFV components except Doctor Sample -Inspector, Sample Monitor and Sample Consumer, as these will be launched in -this script. You are encouraged to use OPNFV official installers, but you can -also deploy all components with other installers such as devstack or manual -operation. In those cases, the versions of all components shall be matched with -the versions of them in OPNFV specific release. - -Run Test Script -=============== - -Doctor project has own testing script under `doctor/doctor_tests`_. This test script -can be used for functional testing agained an OPNFV deployment. - -.. _doctor/doctor_tests: https://git.opnfv.org/doctor/tree/doctor_tests - -Before running this script, make sure OpenStack env parameters are set properly -(See e.g. `OpenStackClient Configuration`_), so that Doctor Inspector can operate -OpenStack services. - -.. _OpenStackClient Configuration: https://docs.openstack.org/python-openstackclient/latest/configuration/index.html - -Doctor now supports different test cases and for that you might want to -export TEST_CASE with different values: - -.. code-block:: bash - - #Fault management (default) - export TEST_CASE='fault_management' - #Maintenance (requires 3 compute nodes) - export TEST_CASE='maintenance' - #Use Fenix in maintenance testing instead of sample admin_tool - export ADMIN_TOOL_TYPE='fenix' - #Run both tests cases - export TEST_CASE='all' - -Run Python Test Script -~~~~~~~~~~~~~~~~~~~~~~ - -You can run the python script as follows: - -.. code-block:: bash - - git clone https://gerrit.opnfv.org/gerrit/doctor - cd doctor && tox - -You can see all the configurations with default values in sample configuration -file `doctor.sample.conf`_. And you can also modify the file to meet your -environment and then run the test. - -.. _doctor.sample.conf: https://git.opnfv.org/doctor/tree/etc/doctor.sample.conf - -In OPNFV Apex jumphost you can run Doctor testing as follows using tox: - -.. code-block:: bash - - source overcloudrc - export INSTALLER_IP=${INSTALLER_IP} - export INSTALLER_TYPE=${INSTALLER_TYPE} - git clone https://gerrit.opnfv.org/gerrit/doctor - cd doctor - sudo -E tox - -Run Functest Suite -================== - -Functest supports Doctor testing by triggering the test script above in a -Functest container. You can run the Doctor test with the following steps: - -.. code-block:: bash - - DOCKER_TAG=latest - docker pull docker.io/opnfv/functest-features:${DOCKER_TAG} - docker run --privileged=true -id \ - -e INSTALLER_TYPE=${INSTALLER_TYPE} \ - -e INSTALLER_IP=${INSTALLER_IP} \ - -e INSPECTOR_TYPE=sample \ - docker.io/opnfv/functest-features:${DOCKER_TAG} /bin/bash - docker exec functest testcase run doctor-notification - -See `Functest Userguide`_ for more information. - -.. _Functest Userguide: :doc:`` - - -For testing with stable version, change DOCKER_TAG to 'stable' or other release -tag identifier. - -Tips -==== diff --git a/docs/development/requirements/index.rst b/docs/development/requirements/index.rst index fceaebf0..ccc35cb8 100644 --- a/docs/development/requirements/index.rst +++ b/docs/development/requirements/index.rst @@ -3,9 +3,9 @@ .. _doctor-requirements: -**************************************** -Doctor: Fault Management and Maintenance -**************************************** +********************************************** +Requirements: Fault Management and Maintenance +********************************************** :Project: Doctor, https://wiki.opnfv.org/doctor :Editors: Ashiq Khan (NTT DOCOMO), Gerald Kunzmann (NTT DOCOMO) -- cgit 1.2.3-korg