Update docs structure according to new guidelines in https://wiki.opnfv.org/display/DOC

Change-Id: I1c8c20cf85aa46269c5bc369f17ab0020862ddc5
Signed-off-by: Gerald Kunzmann <kunzmann@docomolab-euro.com>
author: Gerald Kunzmann <kunzmann@docomolab-euro.com> 2017-02-14 15:38:29 +0000
committer: Gerald Kunzmann <kunzmann@docomolab-euro.com> 2017-02-16 14:41:46 +0000
commit: d0b22e1d856cf8f78e152dfb6c150e001e03dd52 (patch)
tree: 0c3b7af828967d5014c2272675560410fceb6e4d /docs/requirements/05-implementation.rst
parent: e171b396ce87322f2dc5ef0719419144774e43d7 (diff)
1 files changed, 0 insertions, 1050 deletions
diff --git a/docs/requirements/05-implementation.rst b/docs/requirements/05-implementation.rst
deleted file mode 100644
index 84979772..00000000
--- a/docs/requirements/05-implementation.rst
+++ /dev/null
@@ -1,1050 +0,0 @@
-.. This work is licensed under a Creative Commons Attribution 4.0 International License.
-.. http://creativecommons.org/licenses/by/4.0
-
-Detailed architecture and interface specification
-=================================================
-
-This section describes a detailed implementation plan, which is based on the
-high level architecture introduced in Section 3. Section 5.1 describes the
-functional blocks of the Doctor architecture, which is followed by a high level
-message flow in Section 5.2. Section 5.3 provides a mapping of selected existing
-open source components to the building blocks of the Doctor architecture.
-Thereby, the selection of components is based on their maturity and the gap
-analysis executed in Section 4. Sections 5.4 and 5.5 detail the specification of
-the related northbound interface and the related information elements. Finally,
-Section 5.6 provides a first set of blueprints to address selected gaps required
-for the realization of the functionalities of the Doctor project.
-
-.. _impl_fb:
-
-Functional Blocks
------------------
-
-This section introduces the functional blocks that form the VIM. OpenStack was
-selected as the candidate for implementation. Inside the VIM, four different
-building blocks are defined (see :numref:`figure6`).
-
-.. figure:: images/figure6.png
-   :name: figure6
-   :width: 100%
-
-   Functional blocks
-
-Monitor
-^^^^^^^
-
-The Monitor module has the responsibility for monitoring the virtualized
-infrastructure. There are already many existing tools and services (e.g. Zabbix)
-to monitor different aspects of hardware and software resources which can be
-used for this purpose.
-
-Inspector
-^^^^^^^^^
-
-The Inspector module has the ability a) to receive various failure notifications
-regarding physical resource(s) from Monitor module(s), b) to find the affected
-virtual resource(s) by querying the resource map in the Controller, and c) to
-update the state of the virtual resource (and physical resource).
-
-The Inspector has drivers for different types of events and resources to
-integrate any type of Monitor and Controller modules. It also uses a failure
-policy database to decide on the failure selection and aggregation from raw
-events. This failure policy database is configured by the Administrator.
-
-The reason for separation of the Inspector and Controller modules is to make the
-Controller focus on simple operations by avoiding a tight integration of various
-health check mechanisms into the Controller.
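-
The three Inspector responsibilities a) to c) described above can be sketched as follows. This is a minimal illustration, not the Doctor implementation: the ``ResourceMap`` class, the ``inspect`` function, the event keys, and the severity-threshold "failure policy" are all hypothetical names and simplifications introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceMap:
    """Hypothetical stand-in for the resource map kept by the Controller."""
    hosts: dict[str, list[str]] = field(default_factory=dict)  # physical ID -> virtual IDs
    states: dict[str, str] = field(default_factory=dict)       # resource ID -> state

    def virtual_resources_on(self, physical_id: str) -> list[str]:
        return list(self.hosts.get(physical_id, []))

    def set_state(self, resource_id: str, state: str) -> None:
        self.states[resource_id] = state


def inspect(raw_event: dict, resource_map: ResourceMap, min_severity: int = 1) -> list[str]:
    """Receive a raw fault, find the affected virtual resources, update their state."""
    # Toy failure policy (assumption): ignore events below a severity threshold.
    if raw_event["severity"] < min_severity:
        return []
    physical_id = raw_event["physical_resource_id"]
    # b) query the resource map for the virtual resources on the failed host
    affected = resource_map.virtual_resources_on(physical_id)
    # c) update the state of the physical and the virtual resources
    resource_map.set_state(physical_id, "down")
    for virtual_id in affected:
        resource_map.set_state(virtual_id, "error")
    return affected


rmap = ResourceMap(hosts={"host-1": ["vm-a", "vm-b"], "host-2": ["vm-c"]})
affected = inspect({"physical_resource_id": "host-1", "severity": 3}, rmap)
```

In this sketch only the resources hosted on ``host-1`` change state; ``vm-c`` on ``host-2`` is left untouched, which mirrors the lookup in the resource map.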
-
-Controller
-^^^^^^^^^^
-
-The Controller is responsible for maintaining the resource map (i.e. the mapping
-from physical resources to virtual resources), accepting update requests for the
-resource state(s) (exposing as provider API), and sending all failure events
-regarding virtual resources to the Notifier. Optionally, the Controller has the
-ability to force the state of a given physical resource to down in the resource
-mapping when it receives failure notifications from the Inspector for that
-given physical resource.
-The Controller also re-calculates the capacity of the NFVI when receiving a
-failure notification for a physical resource.
-
-In a real-world deployment, the VIM may have several controllers, one for each
-resource type, such as Nova, Neutron and Cinder in OpenStack. Each controller
-maintains a database of virtual and physical resources which shall be the master
-source for resource information inside the VIM.
-
-Notifier
-^^^^^^^^
-
-The focus of the Notifier is on selecting and aggregating failure events
-received from the controller based on policies mandated by the Consumer.
-Therefore, it allows the Consumer to subscribe for alarms regarding virtual
-resources using a method such as an API endpoint. After receiving a fault
-event from a Controller, it notifies the Consumer of the fault according
-to the alarm configuration which was defined by the Consumer earlier on.
-
-To reduce complexity of the Controller, it is a good approach for the
-Controllers to emit all notifications without any filtering mechanism and have
-another service (i.e. Notifier) handle those notifications properly. This is the
-general philosophy of notifications in OpenStack. Note that a fault message
-consumed by the Notifier is different from the fault message received by the
-Inspector; the former message is related to virtual resources which are visible
-to users with relevant ownership, whereas the latter is related to raw devices
-or small entities which should be handled with an administrator privilege.
-
-The northbound interface between the Notifier and the Consumer/Administrator is
-specified in :ref:`impl_nbi`.
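-
The division of labour described above (Controllers emit every event, the Notifier filters per Consumer) can be sketched as below. The ``Notifier`` and ``Subscription`` classes, their method names, and the event keys are assumptions made for illustration; they do not correspond to Ceilometer or any other existing API. The severity check follows the MinSeverity semantics of this document: only faults with a severity strictly higher than the threshold are delivered.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Subscription:
    """Hypothetical alarm configuration defined by a Consumer."""
    callback: Callable[[dict], None]
    virtual_resource_ids: set[str] = field(default_factory=set)  # empty set = any resource
    min_severity: Optional[int] = None


class Notifier:
    def __init__(self) -> None:
        self._subscriptions: dict[int, Subscription] = {}
        self._next_id = 1

    def subscribe(self, subscription: Subscription) -> int:
        """Register an alarm configuration and return a subscription ID."""
        subscription_id = self._next_id
        self._next_id += 1
        self._subscriptions[subscription_id] = subscription
        return subscription_id

    def on_fault_event(self, event: dict) -> int:
        """Forward an unfiltered Controller event to every matching Consumer."""
        delivered = 0
        for sub in self._subscriptions.values():
            if sub.virtual_resource_ids and event["virtual_resource_id"] not in sub.virtual_resource_ids:
                continue
            if sub.min_severity is not None and event["severity"] <= sub.min_severity:
                continue
            sub.callback(event)
            delivered += 1
        return delivered


received: list[dict] = []
notifier = Notifier()
notifier.subscribe(Subscription(received.append, {"vm-a"}, min_severity=2))
notifier.on_fault_event({"virtual_resource_id": "vm-a", "severity": 3})  # delivered
notifier.on_fault_event({"virtual_resource_id": "vm-a", "severity": 2})  # not above threshold
notifier.on_fault_event({"virtual_resource_id": "vm-b", "severity": 5})  # resource not subscribed
```

Keeping the filter in the Notifier means the Controller side needs no knowledge of individual Consumers, which is the design goal stated above.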
-
-Sequence
---------
-
-Fault Management
-^^^^^^^^^^^^^^^^
-
-The detailed work flow for fault management is as follows (see also :numref:`figure7`):
-
-1. Request to subscribe to monitor specific virtual resources. A query filter
-   can be used to narrow down the alarms the Consumer wants to be informed
-   about.
-2. Each subscription request is acknowledged with a subscribe response message.
-   The response message contains information about the subscribed virtual
-   resources, in particular if a subscribed virtual resource is in "alarm"
-   state.
-3. The NFVI sends monitoring events for resources the VIM has been subscribed
-   to. Note: this subscription message exchange between the VIM and NFVI is not
-   shown in this message flow.
-4. Event correlation, fault detection and aggregation in VIM.
-5. Database lookup to find the virtual resources affected by the detected fault.
-6. Fault notification to Consumer.
-7. The Consumer switches to standby configuration (STBY).
-8. Instructions to VIM requesting certain actions to be performed on the
-   affected resources, for example migrate/update/terminate specific
-   resource(s). After reception of such instructions, the VIM executes the
-   requested action, e.g. it will migrate or terminate a virtual resource.
-a. Query request from Consumer to VIM to get information about the current
-   status of a resource.
-b. Response to the query request with information about the current status of
-   the queried resource. In case the resource is in "fault" state, information
-   about the related fault(s) is returned.
-
-In order to allow for quick reaction to failures, the time interval between
-fault detection in step 3 and the corresponding recovery actions in steps 7 and 8
-shall be less than 1 second.
-
-.. figure:: images/figure7.png
-   :name: figure7
-   :width: 100%
-
-   Fault management work flow
-
-.. figure:: images/figure8.png
-   :name: figure8
-   :width: 100%
-
-   Fault management scenario
-
-:numref:`figure8` shows a more detailed message flow (Steps 4 to 6) between
-the 4 building blocks introduced in :ref:`impl_fb`.
-
-4. The Monitor observes a fault in the NFVI and reports the raw fault to the
-   Inspector.
-   The Inspector filters and aggregates the faults using pre-configured
-   failure policies.
-
-5.
-   a) The Inspector queries the Resource Map to find the virtual resources
-   affected by the raw fault in the NFVI.
-   b) The Inspector updates the state of the affected virtual resources in the
-   Resource Map.
-   c) The Controller observes a change of the virtual resource state and informs
-   the Notifier about the state change and the related alarm(s).
-   Alternatively, the Inspector may directly inform the Notifier about it.
-
-6. The Notifier performs another filtering and aggregation of the changes
-   and alarms based on the pre-configured alarm configuration. Finally, a fault
-   notification is sent northbound to the Consumer.
-
-NFVI Maintenance
-^^^^^^^^^^^^^^^^
-
-.. figure:: images/figure9.png
-   :name: figure9
-   :width: 100%
-
-   NFVI maintenance work flow
-
-The detailed work flow for NFVI maintenance is shown in :numref:`figure9`
-and has the following steps. Note that steps 1, 2, and 5 to 8a in the NFVI
-maintenance work flow are very similar to the steps in the fault management work
-flow and share a similar implementation plan in Release 1.
-
-1. Subscribe to fault/maintenance notifications.
-2. Response to subscribe request.
-3. Maintenance trigger received from administrator.
-4. VIM switches NFVI resources to "maintenance" state. This means, e.g., that
-   they should not be used for further allocation/migration requests.
-5. Database lookup to find the virtual resources affected by the detected
-   maintenance operation.
-6. Maintenance notification to Consumer.
-7. The Consumer switches to standby configuration (STBY).
-8. Instructions from Consumer to VIM requesting certain recovery actions to be
-   performed (step 8a). After reception of such instructions, the VIM
-   executes the requested action in order to empty the physical resources (step
-   8b).
-9. Maintenance response from VIM to inform the Administrator that the physical
-   machines have been emptied (or the operation resulted in an error state).
-10. The Administrator coordinates and executes the maintenance operation/work
-    on the NFVI.
-a) Query request from Administrator to VIM to get information about the
-   current state of a resource.
-b) Response to the query request with information about the current state of
-   the queried resource(s). In case the resource is in "maintenance" state,
-   information about the related maintenance operation is returned.
-
-.. figure:: images/figure10.png
-   :name: figure10
-   :width: 100%
-
-   NFVI Maintenance implementation plan
-
-:numref:`figure10` shows a more detailed message flow (Steps 3 to 6 and 9)
-between the 4 building blocks introduced in Section 5.1.
-
-3. The Administrator sends a StateChange request to the Controller residing
-   in the VIM.
-4. The Controller queries the Resource Map to find the virtual resources
-   affected by the planned maintenance operation.
-5.
-
-  a) The Controller updates the state of the affected virtual resources in the
-  Resource Map database.
-
-  b) The Controller informs the Notifier about the virtual resources that will
-  be affected by the maintenance operation.
-
-6. A maintenance notification is sent northbound to the Consumer.
-
-...
-
-9. The Controller informs the Administrator after the physical resources have
-   been freed.
-
-
-
-Implementation plan for OPNFV Release 1
----------------------------------------
-
-Fault management
-^^^^^^^^^^^^^^^^
-
-:numref:`figure11` shows the implementation plan based on OpenStack and
-related components as planned for Release 1. Hereby, the Monitor can be realized
-by Zabbix. The Controller is realized by OpenStack Nova [NOVA]_, Neutron
-[NEUT]_, and Cinder [CIND]_ for compute, network, and storage,
-respectively. The Inspector can be realized by Monasca [MONA]_ or a simple
-script querying Nova in order to map between physical and virtual resources. The
-Notifier will be realized by Ceilometer [CEIL]_ receiving failure events
-on its notification bus.
-
-:numref:`figure12` shows the inner workings of Ceilometer. After receiving
-an "event" on its notification bus, first a notification agent will grab the
-event and send a "notification" to the Collector. The collector writes the
-notifications received to the Ceilometer databases.
-
-In the existing Ceilometer implementation, an alarm evaluator periodically
-polls those databases through the APIs provided. If it finds new alarms, it
-will evaluate them based on the pre-defined alarm configuration, and depending
-on the configuration, it will hand a message to the Alarm Notifier, which in
-turn will send the alarm message northbound to the Consumer. :numref:`figure12`
-also shows an optimized work flow for Ceilometer with the goal to
-reduce the delay for fault notifications to the Consumer. The approach is to
-implement a new notification agent (called "publisher" in Ceilometer
-terminology) which directly sends the alarm through the "Notification Bus"
-to a new "Notification-driven Alarm Evaluator (NAE)" (see Sections 5.6.2 and
-5.6.3), thereby bypassing the Collector and avoiding the additional delay of the
-existing polling-based alarm evaluator. The NAE is similar to the OpenStack
-"Alarm Evaluator", but is triggered by incoming notifications instead of
-periodically polling the OpenStack "Alarms" database for new alarms. The
-Ceilometer "Alarms" database can hold three states: "normal", "insufficient
-data", and "fired". It represents a persistent alarm database. In order to
-realize the Doctor requirements, we need to define new "meters" in the database
-(see Section 5.6.1).
-
-.. figure:: images/figure11.png
-   :name: figure11
-   :width: 100%
-
-   Implementation plan in OpenStack (OPNFV Release 1 "Arno")
-
-
-.. figure:: images/figure12.png
-   :name: figure12
-   :width: 100%
-
-   Implementation plan in Ceilometer architecture
-
-
-NFVI Maintenance
-^^^^^^^^^^^^^^^^
-
-For NFVI Maintenance, a quite similar implementation plan exists. Instead of a
-raw fault being observed by the Monitor, the Administrator sends a
-Maintenance Request through the northbound interface towards the Controller
-residing in the VIM. Similar to the Fault Management use case, the Controller
-(in our case OpenStack Nova) will send a maintenance event to the Notifier (i.e.
-Ceilometer in our implementation). Within Ceilometer, the same workflow as
-described in the previous section applies. In addition, the Controller(s) will
-take appropriate actions to evacuate the physical machines in order to prepare
-them for the planned maintenance operation. After the physical machines are
-emptied, the Controller will inform the Administrator that it can initiate the
-maintenance. Alternatively, the VMs can simply be shut down and booted up again
-on the same host after the maintenance is over. A policy is needed so that the
-Administrator knows the plan for the VMs under maintenance.
-
-Information elements
---------------------
-
-This section introduces all attributes and information elements used in the
-messages exchanged on the northbound interfaces between the VIM and the NFVO and
-VNFM.
-
-Note: The information elements will be aligned with current work in ETSI NFV IFA
-working group.
-
-
-Simple information elements:
-
-* SubscriptionID (Identifier): identifies a subscription to receive fault or maintenance
-  notifications.
-* NotificationID (Identifier): identifies a fault or maintenance notification.
-* VirtualResourceID (Identifier): identifies a virtual resource affected by a
-  fault or a maintenance action of the underlying physical resource.
-* PhysicalResourceID (Identifier): identifies a physical resource affected by a
-  fault or maintenance action.
-* VirtualResourceState (String): state of a virtual resource, e.g. "normal",
-  "maintenance", "down", "error".
-* PhysicalResourceState (String): state of a physical resource, e.g. "normal",
-  "maintenance", "down", "error".
-* VirtualResourceType (String): type of the virtual resource, e.g. "virtual
-  machine", "virtual memory", "virtual storage", "virtual CPU", or "virtual
-  NIC".
-* FaultID (Identifier): identifies the related fault in the underlying physical
-  resource. This can be used to correlate different fault notifications caused
-  by the same fault in the physical resource.
-* FaultType (String): Type of the fault. The allowed values for this parameter
-  depend on the type of the related physical resource. For example, a resource
-  of type "compute hardware" may have faults of type "CPU failure", "memory
-  failure", "network card failure", etc.
-* Severity (Integer): value expressing the severity of the fault. The higher the
-  value, the more severe the fault.
-* MinSeverity (Integer): value used in filter information elements. Only faults
-  with a severity higher than the MinSeverity value will be notified to the
-  Consumer.
-* EventTime (Datetime): Time when the fault was observed.
-* EventStartTime and EventEndTime (Datetime): Datetime range that can be used in
-  a FaultQueryFilter to narrow down the faults to be queried.
-* ProbableCause (String): information about the probable cause of the fault.
-* CorrelatedFaultID (Identifier): list of other faults correlated to this fault.
-* isRootCause (Boolean): Parameter indicating if this fault is the root for
-  other correlated faults. If TRUE, then the faults listed in the parameter
-  CorrelatedFaultID are caused by this fault.
-* FaultDetails (Key-value pair): provides additional information about the
-  fault, e.g. information about the threshold, monitored attributes, indication
-  of the trend of the monitored parameter.
-* FirmwareVersion (String): current version of the firmware of a physical
-  resource.
-* HypervisorVersion (String): current version of a hypervisor.
-* ZoneID (Identifier): Identifier of the resource zone. A resource zone is the
-  logical separation of physical and software resources in an NFVI deployment
-  for physical isolation, redundancy, or administrative designation.
-* Metadata (Key-value pair): provides additional information of a physical
-  resource in maintenance/error state.
-
-Complex information elements (see also UML diagrams in :numref:`figure13`
-and :numref:`figure14`):
-
-* VirtualResourceInfoClass:
-
-  + VirtualResourceID [1] (Identifier)
-  + VirtualResourceState [1] (String)
-  + Faults [0..*] (FaultClass): For each resource, all faults
-    including detailed information about the faults are provided.
-
-* FaultClass: The parameters of the FaultClass are partially based on ETSI TS
-  132 111-2 (V12.1.0) [*]_, which is specifying fault management in 3GPP, in
-  particular describing the information elements used for alarm notifications.
-
-  - FaultID [1] (Identifier)
-  - FaultType [1] (String)
-  - Severity [1] (Integer)
-  - EventTime [1] (Datetime)
-  - ProbableCause [1] (String)
-  - CorrelatedFaultID [0..*] (Identifier)
-  - FaultDetails [0..*] (Key-value pair)
-
-.. [*] http://www.etsi.org/deliver/etsi_ts/132100_132199/13211102/12.01.00_60/ts_13211102v120100p.pdf
-
-* SubscribeFilterClass
-
-  - VirtualResourceType [0..*] (String)
-  - VirtualResourceID [0..*] (Identifier)
-  - FaultType [0..*] (String)
-  - MinSeverity [0..1] (Integer)
-
-* FaultQueryFilterClass: narrows down the FaultQueryRequest, for example it
-  limits the query to certain physical resources, a certain zone, a given fault
-  type/severity/cause, or a specific FaultID.
-
-  - VirtualResourceType [0..*] (String)
-  - VirtualResourceID [0..*] (Identifier)
-  - FaultType [0..*] (String)
-  - MinSeverity [0..1] (Integer)
-  - EventStartTime [0..1] (Datetime)
-  - EventEndTime [0..1] (Datetime)
-
-* PhysicalResourceStateClass:
-
-  - PhysicalResourceID [1] (Identifier)
-  - PhysicalResourceState [1] (String): mandates the new state of the physical
-    resource.
-  - Metadata [0..*] (Key-value pair)
-
-* PhysicalResourceInfoClass:
-
-  - PhysicalResourceID [1] (Identifier)
-  - PhysicalResourceState [1] (String)
-  - FirmwareVersion [0..1] (String)
-  - HypervisorVersion [0..1] (String)
-  - ZoneID [0..1] (Identifier)
-  - Metadata [0..*] (Key-value pair)
-
-* StateQueryFilterClass: narrows down a StateQueryRequest, for example it limits
-  the query to certain physical resources, a certain zone, or a given resource
-  state (e.g., only resources in "maintenance" state).
-
-  - PhysicalResourceID [1] (Identifier)
-  - PhysicalResourceState [1] (String)
-  - ZoneID [0..1] (Identifier)
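-
To make the cardinalities above concrete, a few of the complex information elements can be written as Python data classes. This is only one possible mapping and is an assumption of this sketch, not part of the specification: ``[1]`` becomes a required field, ``[0..1]`` an ``Optional`` field, ``[0..*]`` a list, Identifiers are modelled as strings, and ``[0..*]`` key-value pairs are modelled as a dictionary. The snake_case attribute names are likewise chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fault:
    """FaultClass."""
    fault_id: str                                                  # FaultID [1]
    fault_type: str                                                # FaultType [1]
    severity: int                                                  # Severity [1]
    event_time: datetime                                           # EventTime [1]
    probable_cause: str                                            # ProbableCause [1]
    correlated_fault_ids: list[str] = field(default_factory=list)  # CorrelatedFaultID [0..*]
    fault_details: dict[str, str] = field(default_factory=dict)    # FaultDetails [0..*]


@dataclass
class VirtualResourceInfo:
    """VirtualResourceInfoClass."""
    virtual_resource_id: str                                       # VirtualResourceID [1]
    virtual_resource_state: str                                    # VirtualResourceState [1]
    faults: list[Fault] = field(default_factory=list)              # Faults [0..*]


@dataclass
class PhysicalResourceInfo:
    """PhysicalResourceInfoClass."""
    physical_resource_id: str                                      # PhysicalResourceID [1]
    physical_resource_state: str                                   # PhysicalResourceState [1]
    firmware_version: Optional[str] = None                         # FirmwareVersion [0..1]
    hypervisor_version: Optional[str] = None                       # HypervisorVersion [0..1]
    zone_id: Optional[str] = None                                  # ZoneID [0..1]
    metadata: dict[str, str] = field(default_factory=dict)         # Metadata [0..*]


fault = Fault("fault-1", "CPU failure", 4,
              datetime(2017, 2, 14, tzinfo=timezone.utc), "overheating")
vm = VirtualResourceInfo("vm-a", "error", [fault])
host = PhysicalResourceInfo("host-1", "maintenance", zone_id="zone-1")
```

The example values ("fault-1", "overheating", "zone-1") are invented placeholders.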
-
-.. _impl_nbi:
-
-Detailed northbound interface specification
--------------------------------------------
-
-This section specifies the northbound interfaces for fault management and
-NFVI maintenance between the VIM on the one end and the Consumer and the
-Administrator on the other end. For each interface, all messages and related
-information elements are provided.
-
-Note: The interface definition will be aligned with current work in ETSI NFV IFA
-working group.
-
-All of the interfaces described below are produced by the VIM and consumed by
-the Consumer or Administrator.
-
-Fault management interface
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-This interface allows the VIM to notify the Consumer about a virtual resource
-that is affected by a fault, either within the virtual resource itself or by the
-underlying virtualization infrastructure. The messages on this interface are
-shown in :numref:`figure13` and explained in detail in the following
-subsections.
-
-Note: The information elements used in this section are described in detail in
-Section 5.4.
-
-.. figure:: images/figure13.png
-   :name: figure13
-   :width: 100%
-
-   Fault management NB I/F messages
-
-
-SubscribeRequest (Consumer -> VIM)
-__________________________________
-
-Subscription from Consumer to VIM to be notified about faults of specific
-resources. The faults to be notified about can be narrowed down using a
-subscribe filter.
-
-Parameters:
-
-- SubscribeFilter [1] (SubscribeFilterClass): Optional information to narrow
-  down the faults that shall be notified to the Consumer, for example limit to
-  specific VirtualResourceID(s), severity, or cause of the alarm.
-
-SubscribeResponse (VIM -> Consumer)
-___________________________________
-
-Response to a subscribe request message including information about the
-subscribed resources, in particular if they are in "fault/error" state.
-
-Parameters:
-
-* SubscriptionID [1] (Identifier): Unique identifier for the subscription. It
-  can be used to delete or update the subscription.
-* VirtualResourceInfo [0..*] (VirtualResourceInfoClass): Provides additional
-  information about the subscribed resources, i.e., a list of the related
-  resources, the current state of the resources, etc.
-
-FaultNotification (VIM -> Consumer)
-___________________________________
-
-Notification about a virtual resource that is affected by a fault, either within
-the virtual resource itself or by the underlying virtualization infrastructure.
-After reception of this notification, the Consumer will decide on the optimal
-action to resolve the fault. This includes actions like switching to a hot
-standby virtual resource, migration of the faulty virtual resource to another
-physical machine, termination of the faulty virtual resource and instantiation
-of a new virtual resource in order to provide a new hot standby resource. In
-some use cases the Consumer can leave virtual resources on the failed host to be
-booted up again after the fault is recovered. Existing resource management
-interfaces and messages between the Consumer and the VIM can be used for those
-actions, and there is no need to define additional actions on the Fault
-Management Interface.
-
-Parameters:
-
-* NotificationID [1] (Identifier): Unique identifier for the notification.
-* VirtualResourceInfo [1..*] (VirtualResourceInfoClass): List of faulty
-  resources with detailed information about the faults.
-
-FaultQueryRequest (Consumer -> VIM)
-___________________________________
-
-Request to find out about active alarms at the VIM. A FaultQueryFilter can be
-used to narrow down the alarms returned in the response message.
-
-Parameters:
-
-* FaultQueryFilter [1] (FaultQueryFilterClass): narrows down the
-  FaultQueryRequest, for example it limits the query to certain physical
-  resources, a certain zone, a given fault type/severity/cause, or a specific
-  FaultID.
-
-FaultQueryResponse (VIM -> Consumer)
-____________________________________
-
-List of active alarms at the VIM matching the FaultQueryFilter specified in the
-FaultQueryRequest.
-
-Parameters:
-
-* VirtualResourceInfo [0..*] (VirtualResourceInfoClass): List of faulty
-  resources. For each resource all faults including detailed information about
-  the faults are provided.
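-
The FaultQueryRequest/FaultQueryResponse exchange amounts to applying a FaultQueryFilter to the list of active faults. A minimal sketch under stated assumptions: faults are plain dictionaries with invented keys, ``fault_query`` and ``matches`` are hypothetical helper names, an unset filter field means "no restriction", and MinSeverity is treated as a strict lower bound, as defined for that information element.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class FaultQueryFilter:
    virtual_resource_ids: list[str] = field(default_factory=list)
    fault_types: list[str] = field(default_factory=list)
    min_severity: Optional[int] = None
    event_start_time: Optional[datetime] = None
    event_end_time: Optional[datetime] = None


def matches(fault: dict, flt: FaultQueryFilter) -> bool:
    """Return True if the fault passes every restriction set in the filter."""
    if flt.virtual_resource_ids and fault["virtual_resource_id"] not in flt.virtual_resource_ids:
        return False
    if flt.fault_types and fault["fault_type"] not in flt.fault_types:
        return False
    if flt.min_severity is not None and fault["severity"] <= flt.min_severity:
        return False
    if flt.event_start_time is not None and fault["event_time"] < flt.event_start_time:
        return False
    if flt.event_end_time is not None and fault["event_time"] > flt.event_end_time:
        return False
    return True


def fault_query(active_faults: list[dict], flt: FaultQueryFilter) -> list[dict]:
    """Build the content of a FaultQueryResponse for the given filter."""
    return [fault for fault in active_faults if matches(fault, flt)]


active = [
    {"fault_id": "f1", "virtual_resource_id": "vm-a", "fault_type": "CPU failure",
     "severity": 4, "event_time": datetime(2017, 2, 14, 12, 0)},
    {"fault_id": "f2", "virtual_resource_id": "vm-b", "fault_type": "memory failure",
     "severity": 2, "event_time": datetime(2017, 2, 14, 13, 0)},
    {"fault_id": "f3", "virtual_resource_id": "vm-a", "fault_type": "CPU failure",
     "severity": 5, "event_time": datetime(2017, 2, 15, 9, 0)},
]
flt = FaultQueryFilter(virtual_resource_ids=["vm-a"], min_severity=3,
                       event_end_time=datetime(2017, 2, 14, 23, 59))
result = fault_query(active, flt)
```

An empty filter returns every active fault, so a Consumer can use the same call to resynchronize its full alarm list.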
-
-NFVI maintenance
-^^^^^^^^^^^^^^^^
-
-The NFVI maintenance interface Consumer-VIM allows the Consumer to subscribe to
-maintenance notifications provided by the VIM. The related maintenance interface
-Administrator-VIM allows the Administrator to issue maintenance requests to the
-VIM, i.e. requesting the VIM to take appropriate actions to empty physical
-machine(s) in order to execute maintenance operations on them. The interface
-also allows the Administrator to query the state of physical machines, e.g., in
-order to get details on the current status of the maintenance operation like a
-firmware update.
-
-The messages defined in these northbound interfaces are shown in :numref:`figure14`
-and described in detail in the following subsections.
-
-.. figure:: images/figure14.png
-   :name: figure14
-   :width: 100%
-
-   NFVI maintenance NB I/F messages
-
-SubscribeRequest (Consumer -> VIM)
-__________________________________
-
-Subscription from Consumer to VIM to be notified about maintenance operations
-for specific virtual resources. The resources to be informed about can be
-narrowed down using a subscribe filter.
-
-Parameters:
-
-* SubscribeFilter [1] (SubscribeFilterClass): Information to narrow down the
-  faults that shall be notified to the Consumer, for example limit to specific
-  virtual resource type(s).
-
-SubscribeResponse (VIM -> Consumer)
-___________________________________
-
-Response to a subscribe request message, including information about the
-subscribed virtual resources, in particular if they are in "maintenance" state.
-
-Parameters:
-
-* SubscriptionID [1] (Identifier): Unique identifier for the subscription. It
-  can be used to delete or update the subscription.
-* VirtualResourceInfo [0..*] (VirtualResourceInfoClass): Provides additional
-  information about the subscribed virtual resource(s), e.g., the ID, type and
-  current state of the resource(s).
-
-MaintenanceNotification (VIM -> Consumer)
-_________________________________________
-
-Notification about a physical resource switched to "maintenance" state. After
-reception of this notification, the Consumer will decide on the optimal action to
-address it, e.g., to switch to the standby (STBY) configuration.
-
-Parameters:
-
-* VirtualResourceInfo [1..*] (VirtualResourceInfoClass): List of virtual
-  resources where the state has been changed to maintenance.
-
-StateChangeRequest (Administrator -> VIM)
-_________________________________________
-
-Request to change the state of a list of physical resources, e.g. to
-"maintenance" state, in order to prepare them for a planned maintenance
-operation.
-
-Parameters:
-
-* PhysicalResourceState [1..*] (PhysicalResourceStateClass)
-
-StateChangeResponse (VIM -> Administrator)
-__________________________________________
-
-Response message to inform the Administrator that the requested resources are
-now in maintenance state (or the operation resulted in an error) and the
-maintenance operation(s) can be executed.
-
-Parameters:
-
-* PhysicalResourceInfo [1..*] (PhysicalResourceInfoClass)
-
-StateQueryRequest (Administrator -> VIM)
-________________________________________
-
-In this procedure, the Administrator would like to get the information about
-physical machine(s), e.g. their state ("normal", "maintenance"), firmware
-version, hypervisor version, update status of firmware and hypervisor, etc. It
-can be used to check the progress during firmware update and the confirmation
-after update. A filter can be used to narrow down the resources returned in the
-response message.
-
-Parameters:
-
-* StateQueryFilter [1] (StateQueryFilterClass): narrows down the
-  StateQueryRequest, for example it limits the query to certain physical
-  resources, a certain zone, or a given resource state.
-
-StateQueryResponse (VIM -> Administrator)
-_________________________________________
-
-List of physical resources matching the filter specified in the
-StateQueryRequest.
-
-Parameters:
-
-* PhysicalResourceInfo [0..*] (PhysicalResourceInfoClass): List of physical
-  resources. For each resource, information about the current state, the
-  firmware version, etc. is provided.
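-
The Administrator-facing messages (StateChangeRequest/Response and StateQueryRequest/Response) can be illustrated with a toy in-memory VIM. Everything here is an assumption made for the sketch: the ``MaintenanceVim`` class, its method names, and the choice to treat the filter fields as optional restrictions. The step of emptying the physical machines before answering a StateChangeRequest is deliberately omitted.

```python
from typing import Optional


class MaintenanceVim:
    """Hypothetical VIM that only tracks the state and zone of physical resources."""

    def __init__(self, zones: dict[str, str]) -> None:
        self._zones = dict(zones)                            # PhysicalResourceID -> ZoneID
        self._states = {host: "normal" for host in zones}    # PhysicalResourceID -> state

    def _info(self, host: str) -> dict:
        return {
            "PhysicalResourceID": host,
            "PhysicalResourceState": self._states[host],
            "ZoneID": self._zones[host],
        }

    def state_change(self, requests: list[tuple[str, str]]) -> list[dict]:
        """StateChangeRequest: set the mandated state, answer with resource info."""
        response = []
        for host, new_state in requests:
            self._states[host] = new_state
            response.append(self._info(host))
        return response

    def state_query(self, state: Optional[str] = None,
                    zone_id: Optional[str] = None) -> list[dict]:
        """StateQueryRequest: list the resources matching the given restrictions."""
        return [
            self._info(host)
            for host in self._states
            if (state is None or self._states[host] == state)
            and (zone_id is None or self._zones[host] == zone_id)
        ]


vim = MaintenanceVim({"host-1": "zone-1", "host-2": "zone-1", "host-3": "zone-2"})
response = vim.state_change([("host-1", "maintenance")])
in_maintenance = vim.state_query(state="maintenance")
normal_in_zone_1 = vim.state_query(state="normal", zone_id="zone-1")
```

Querying by state lets the Administrator check which machines are ready for the maintenance work before starting it.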
-
-NFV IFA, OPNFV Doctor and AODH alarms
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-This section compares the alarm interfaces of ETSI NFV IFA with the specifications
-of this document and the alarm class of AODH.
-
-ETSI NFV specifies an interface for alarms from virtualised resources in ETSI GS
-NFV-IFA 005 [ENFV]_. The interface specifies an Alarm class and two notifications plus
-operations to query alarm instances and to subscribe to the alarm notifications.
-
-The specification in this document has a structure that is very similar to the
-ETSI NFV specifications. The notifications differ in that an alarm notification
-in the NFV interface defines a single fault for a single resource while the
-notification specified in this document can contain multiple faults for
-multiple resources. The Doctor specification is lacking the detailed time stamps
-of the NFV specification, which are essential for synchronization of the alarm list
-using the query operation. The detailed time stamps are also of value in the event
-and alarm history DBs.
-
-AODH defines a base class for alarms, not the notifications. This means that
-some of the dynamic attributes of the ETSI NFV alarm type, like alarmRaisedTime,
-are not applicable to the AODH alarm class but are attributes of the actual
-notifications. (Description of these attributes will be added later.) The AODH alarm
-class is lacking some attributes present in the NFV specification, namely fault details
-and correlated alarms. Instead, the AODH alarm class has attributes for actions,
-rules, and user and project id.
-
-
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| ETSI NFV Alarm Type    | OPNFV Doctor           | AODH Event Alarm    | Description / Comment                       | Recommendations                       |
-|                        | Requirement Specs      | Notification        |                                             |                                       |
-+========================+========================+=====================+=============================================+=======================================+
-| alarmId                | FaultId                | alarm_id            | Identifier of an alarm.                     | \-                                    |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| \-                     | \-                     | alarm_name          | Human readable alarm name.                  | May be added in ETSI NFV Stage 3.     |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| managedObjectId        | VirtualResourceId      | (reason)            | Identifier of the affected virtual resource | \-                                    |
-|                        |                        |                     | is part of the AODH reason parameter.       |                                       |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| \-                     | \-                     | user_id, project_id | User and project identifiers.               | May be added in ETSI NFV Stage 3.     |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| alarmRaisedTime        | \-                     | \-                  | Timestamp when alarm was raised.            | To be added to Doctor and AODH. May   |
-|                        |                        |                     |                                             | be derived (e.g. in a shimlayer) from |
-|                        |                        |                     |                                             | the AODH alarm history.               |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| alarmChangedTime       | \-                     | \-                  | Timestamp when alarm was changed/updated.   | see above                             |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| alarmClearedTime       | \-                     | \-                  | Timestamp when alarm was cleared.           | see above                             |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| eventTime              | \-                     | \-                  | Timestamp when alarm was first observed by  | see above                             |
-|                        |                        |                     | the Monitor.                                |                                       |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| \-                     | EventTime              | generated           | Timestamp of the Notification.              | Update parameter name in Doctor spec. |
-|                        |                        |                     |                                             | May be added in ETSI NFV Stage 3.     |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| state:                 | VirtualResourceState:  | current: ok, alarm, | ETSI NFV IFA 005/006 lists example alarm    | Maintenance state is missing in AODH. |
-| E.g. Fired, Updated,   | E.g. normal, down,     | insufficient_data   | states.                                     | List of alarm states will be          |
-| Cleared                | maintenance, error     |                     |                                             | specified in ETSI NFV Stage 3.        |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| perceivedSeverity:     | Severity (Integer)     | Severity:           | ETSI NFV IFA 005/006 lists example          | List of alarm states will be          |
-| E.g. Critical, Major,  |                        | low (default),      | perceived severity values.                  | specified in ETSI NFV Stage 3.        |
-| Minor, Warning,        |                        | moderate, critical  |                                             |                                       |
-| Indeterminate, Cleared |                        |                     |                                             | **OPNFV: Severity (Integer)**:        |
-|                        |                        |                     |                                             |   * update OPNFV Doctor specification |
-|                        |                        |                     |                                             |     to *Enum*                         |
-|                        |                        |                     |                                             |                                       |
-|                        |                        |                     |                                             | **perceivedSeverity=Indeterminate**:  |
-|                        |                        |                     |                                             |   * remove value *Indeterminate* in   |
-|                        |                        |                     |                                             |     IFA and map undefined values to   |
-|                        |                        |                     |                                             |     “minor” severity, or              |
-|                        |                        |                     |                                             |   * add value *indeterminate* in AODH |
-|                        |                        |                     |                                             |     and make it the default value.    |
-|                        |                        |                     |                                             |                                       |
-|                        |                        |                     |                                             | **perceivedSeverity=Cleared**:        |
-|                        |                        |                     |                                             |   * remove value *Cleared* in IFA as  |
-|                        |                        |                     |                                             |     the information about a cleared   |
-|                        |                        |                     |                                             |     alarm can be derived from         |
-|                        |                        |                     |                                             |     the alarm state parameter, or     |
-|                        |                        |                     |                                             |   * add value *cleared* in AODH and   |
-|                        |                        |                     |                                             |     set a rule that the severity is   |
-|                        |                        |                     |                                             |     “cleared” when the state is *ok*. |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| faultType              | FaultType              | event_type in       | Type of the fault, e.g. “CPU failure” of a  | OpenStack Alarming (Aodh) can use a   |
-|                        |                        | reason_data         | compute resource, in machine interpretable  | fuzzy matching with wildcard string,  |
-|                        |                        |                     | format.                                     | "compute.cpu.failure".                |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| N/A                    | N/A                    | type = "event"      | Type of the notification. For fault         | \-                                    |
-|                        |                        |                     | notifications the type in AODH is “event”.  |                                       |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| probableCause          | ProbableCause          | \-                  | Probable cause of the alarm.                | May be provided (e.g. in a shimlayer) |
-|                        |                        |                     |                                             | based on Vitrage topology awareness / |
-|                        |                        |                     |                                             | root-cause-analysis.                  |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| isRootCause            | IsRootCause            | \-                  | Boolean indicating whether the fault is the | see above                             |
-|                        |                        |                     | root cause of other faults.                 |                                       |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| correlatedAlarmId      | CorrelatedFaultId      | \-                  | List of IDs of correlated faults.           | see above                             |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| faultDetails           | FaultDetails           | \-                  | Additional details about the fault/alarm.   | FaultDetails information element will |
-|                        |                        |                     |                                             | be specified in ETSI NFV Stage 3.     |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-| \-                     | \-                     | action, previous    | Additional AODH alarm related parameters.   | \-                                    |
-+------------------------+------------------------+---------------------+---------------------------------------------+---------------------------------------+
-
-Table: Comparison of alarm attributes
-
-The primary area of improvement should be alignment of the perceived severity.
-This is important for a quick and accurate evaluation of the alarm. AODH should
-therefore also support the X.733 values Critical, Major, Minor, Warning and
-Indeterminate.
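Until the severity values are aligned, a shim layer could translate between the two value sets. The following is a minimal sketch of that idea only. The mapping table and the function name `to_perceived_severity` are assumptions made for illustration; they are not defined by ETSI NFV, Doctor, or AODH. The sketch follows two options listed in the table: a state of *ok* is reported as Cleared, and unknown values fall back to Indeterminate.

```python
from enum import Enum


class PerceivedSeverity(Enum):
    """Perceived severity values listed for ETSI NFV in the comparison table."""
    CRITICAL = "Critical"
    MAJOR = "Major"
    MINOR = "Minor"
    WARNING = "Warning"
    INDETERMINATE = "Indeterminate"
    CLEARED = "Cleared"


# Hypothetical mapping from the three AODH severities named in the table.
# The pairing itself is an assumption, not part of any specification.
AODH_TO_PERCEIVED = {
    "critical": PerceivedSeverity.CRITICAL,
    "moderate": PerceivedSeverity.MAJOR,
    "low": PerceivedSeverity.MINOR,
}


def to_perceived_severity(aodh_severity, aodh_state):
    """Translate an AODH severity/state pair into a perceived severity."""
    if aodh_state == "ok":
        # One option from the table: severity is "cleared" when state is ok.
        return PerceivedSeverity.CLEARED
    # Unknown severities fall back to Indeterminate.
    return AODH_TO_PERCEIVED.get(aodh_severity, PerceivedSeverity.INDETERMINATE)
```

A reverse mapping would be lossy, because five X.733 severities would have to collapse into three AODH values. That is one reason the text recommends extending AODH instead.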
-
-The detailed time stamps (raised, changed, cleared), which are essential for
-synchronizing the alarm list using a query operation, should be added to the
-Doctor specification.
-
-Another area that needs alignment is the so-called alarm state in NFV. Here,
-however, we must consider what can be an attribute of the notification and what
-should be a property of the alarm instance. This will be analyzed later.
-
-.. _southbound:
-
-Detailed southbound interface specification
--------------------------------------------
-
-This section specifies the southbound interfaces for fault management between
-the Monitors and the Inspector.
-Although southbound interfaces should be flexible enough to handle various events
-from different types of Monitors, we define a unified event API in order to
-improve interoperability between the Monitors and the Inspector.
-This does not limit the implementation of Monitors and the Inspector, as these
-could be extended to support failures reported by intelligent inspection such as
-prediction.
-
-Note: The interface definition will be aligned with current work in ETSI NFV IFA
-working group.
-
-Fault event interface
-^^^^^^^^^^^^^^^^^^^^^
-
-This interface allows the Monitors to notify the Inspector about an event which
-was captured by the Monitor and may affect resources managed in the VIM.
-
-EventNotification
-_________________
-
-
-Event notification including fault description.
-The subject of this notification is an event, not specifically a fault or error.
-This allows us to use a generic event format or framework built outside of the
-Doctor project.
-The parameters below shall be mandatory, but keys in 'Details' can be optional.
-
-Parameters:
-
-* Time [1]: Datetime when the fault was observed in the Monitor.
-* Type [1]: Type of event that will be used to process correlation in the
-  Inspector.
-* Details [0..1]: Additional information in key-value pair style.
-  Keys shall be defined depending on the Type of the event.
-
-E.g.:
-
-.. code-block:: python
-
-    {
-        'event': {
-            'time': '2016-04-12T08:00:00',
-            'type': 'compute.host.down',
-            'details': {
-                'hostname': 'compute-1',
-                'source': 'sample_monitor',
-                'cause': 'link-down',
-                'severity': 'critical',
-                'status': 'down',
-                'monitor_id': 'monitor-1',
-                'monitor_event_id': '123',
-            }
-        }
-    }
-
-Optional parameters in 'Details':
-
-* Hostname: the hostname on which the event occurred.
-* Source: the display name of the reporter of this event. This is not limited
-  to monitors; other entities such as 'KVM' can be specified.
-* Cause: description of the cause of this event, which could be different from
-  the type of this event.
-* Severity: the severity of this event set by the monitor.
-* Status: the status of the target object on which the error occurred.
-* MonitorID: the ID of the monitor sending this event.
-* MonitorEventID: the ID of the event in the monitor. Operators can use it when
-  tracing the monitor log.
-* RelatedTo: an array of IDs related to this event.
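The parameter rules above can be sketched as a small validation routine on the Inspector side. This is an illustrative sketch only. The function name `validate_event` is hypothetical, the lowercase key names are taken from the example above, and treating `details` as optional follows the stated cardinality [0..1].

```python
from datetime import datetime


def validate_event(notification):
    """Check one EventNotification and return its normalized event body."""
    event = notification.get("event")
    if not isinstance(event, dict):
        raise ValueError("notification must contain an 'event' object")

    # Time [1] and Type [1] are mandatory.
    for key in ("time", "type"):
        if key not in event:
            raise ValueError(f"missing mandatory parameter: {key}")

    # Raises ValueError if the timestamp is not ISO 8601 formatted,
    # e.g. '2016-04-12T08:00:00' as in the example.
    datetime.fromisoformat(event["time"])

    # Details [0..1]: optional, but must hold key-value pairs when present.
    details = event.get("details", {})
    if not isinstance(details, dict):
        raise ValueError("'details' must hold key-value pairs")

    return {"time": event["time"], "type": event["type"], "details": details}
```

Keys inside `details` are deliberately not checked here, since the text states that they depend on the Type of the event.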
-
-In addition, a bulk API can receive multiple events in a single HTTP POST
-message by using the 'events' wrapper as follows:
-
-.. code-block:: python
-
-    {
-        'events': [
-            {
-                'event': {
-                    'time': '2016-04-12T08:00:00',
-                    'type': 'compute.host.down',
-                    'details': {},
-                }
-            },
-            {
-                'event': {
-                    'time': '2016-04-12T08:00:00',
-                    'type': 'compute.host.nic.error',
-                    'details': {},
-                }
-            }
-        ]
-    }
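A receiver could accept both the single and the bulk form through one code path. The sketch below is an assumption-laden illustration: the function name `unpack_events` is hypothetical, and it assumes that every element of the 'events' list is an object with a single 'event' key.

```python
def unpack_events(body):
    """Return the list of event bodies from a single or bulk notification."""
    if "events" in body:
        wrapped = body["events"]   # bulk form: list of {'event': {...}}
    elif "event" in body:
        wrapped = [body]           # single form: {'event': {...}}
    else:
        raise ValueError("body must contain 'event' or 'events'")
    return [item["event"] for item in wrapped]
```

Normalizing both forms to a plain list lets the Inspector apply the same correlation logic per event, regardless of how the Monitor chose to send it.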
-
-
-
-
-Blueprints
-----------
-
-This section lists a first set of blueprints that have been proposed by the
-Doctor project to the open source community. Further blueprints addressing other
-gaps identified in Section 4 will be submitted at a later stage of the OPNFV
-project. In this section the following definitions are used:
-
-* "Event" is a message emitted by other OpenStack services such as Nova and
-  Neutron and is consumed by the "Notification Agents" in Ceilometer.
-* "Notification" is a message generated by a "Notification Agent" in Ceilometer
-  based on an "event" and is delivered to the "Collectors" in Ceilometer that
-  store those notifications (as "sample") to the Ceilometer "Databases".
-
-Instance State Notification  (Ceilometer) [*]_
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The Doctor project is planning to handle "events" and "notifications" regarding
-resource status: instance state, port state, host state, etc. Currently,
-Ceilometer already receives "events" to identify the state of those resources,
-but it does not handle and store them yet. This is why we also need a new event
-definition to capture those resource states from "events" created by other
-services.
-
-This BP proposes to add a new compute notification state to handle events from
-an instance (server) in Nova. It also creates a new meter "instance.state" in
-OpenStack.
-
-.. [*] https://etherpad.opnfv.org/p/doctor_bps
-
-Event Publisher for Alarm  (Ceilometer) [*]_
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-**Problem statement:**
-
-  The existing "Alarm Evaluator" in OpenStack Ceilometer periodically
-  queries/polls the databases in order to check all alarms independently from
-  other processes. This adds delay to the fault notification sent to the
-  Consumer, whereas one requirement of Doctor is to react to faults as fast as
-  possible.
-
-  The existing message flow is shown in :numref:`figure12`: after receiving
-  an "event", a "notification agent" (i.e. "event publisher") sends a
-  "notification" to a "Collector". The "collector" collects the notifications
-  and updates the Ceilometer "Meter" database, which stores information about
-  the "sample" captured from the original "event". The "Alarm Evaluator"
-  periodically polls this database, querying the "Meter" database according to
-  each alarm configuration.
-
-  In the current Ceilometer implementation, there is no possibility to directly
-  trigger the "Alarm Evaluator" when a new "event" is received. Instead, the
-  "Alarm Evaluator" only finds out that it needs to fire a new notification to
-  the Consumer when it polls the database.
-
-**Change/feature request:**
-
-  This BP proposes to add a new "event publisher for alarm", which bypasses
-  several steps in Ceilometer in order to avoid the polling-based approach of
-  the existing Alarm Evaluator, which delays notifications to users.
-
-  After receiving an "(alarm) event" by listening on the Ceilometer message
-  queue ("notification bus"), the new "event publisher for alarm" immediately
-  hands a "notification" about this event to a new Ceilometer component
-  "Notification-driven alarm evaluator" proposed in the other BP (see Section
-  5.6.3).
-
-  Note, the term "publisher" refers to an entity in the Ceilometer architecture
-  (it is a "notification agent"). It offers the capability to provide
-  notifications to other services outside of Ceilometer, but it is also used to
-  deliver notifications to other Ceilometer components (e.g. the "Collectors")
-  via the Ceilometer "notification bus".
-
-**Implementation detail**
-
-  * "Event publisher for alarm" is part of Ceilometer
-  * The standard AMQP message queue is used with a new topic string.
-  * No new interfaces have to be added to Ceilometer.
-  * "Event publisher for Alarm" can be configured by the Administrator of
-    Ceilometer to be used as "Notification Agent" in addition to the existing
-    "Notifier"
-  * Existing alarm mechanisms of Ceilometer can be used allowing users to
-    configure how to distribute the "notifications" transformed from "events",
-    e.g. there is an option whether an ongoing alarm is re-issued or not
-    ("repeat_actions").
-
-.. [*] https://etherpad.opnfv.org/p/doctor_bps
-
-Notification-driven alarm evaluator (Ceilometer) [*]_
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-**Problem statement:**
-
-The existing "Alarm Evaluator" in OpenStack Ceilometer periodically
-queries/polls the databases in order to check all alarms independently from
-other processes. This adds delay to the fault notification sent to the Consumer,
-whereas one requirement of Doctor is to react to faults as fast as possible.
-
-**Change/feature request:**
-
-This BP proposes to add an alternative "Notification-driven Alarm Evaluator"
-for Ceilometer that receives "notifications" sent by the "Event Publisher for
-Alarm" described in the other BP. Once this new "Notification-driven Alarm
-Evaluator" receives a "notification", it finds the "alarm" configurations which
-may relate to the "notification" by querying the "alarm" database with some
-keys, e.g. the resource ID. It then evaluates each alarm with the information in
-that "notification".
-
-After the alarm evaluation, it behaves in the same way as the existing "alarm
-evaluator" when firing alarm notifications to the Consumer. Similar to the
-existing Alarm Evaluator, this new "Notification-driven Alarm Evaluator"
-aggregates and correlates different alarms, which are then provided northbound
-to the Consumer via the OpenStack "Alarm Notifier". The user/administrator can
-register the alarm configuration via the existing Ceilometer API [*]_ and can
-thereby configure whether to set an alarm and where to send the alarms.
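The difference from the polling approach can be illustrated with a small in-memory sketch. This is not Ceilometer code. The `Alarm` fields, the class `NotificationDrivenEvaluator`, and the notification keys `resource_id` and `event_type` are all hypothetical names chosen for this illustration, and a dictionary stands in for the "alarm" database.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    """Simplified alarm configuration (hypothetical fields)."""
    alarm_id: str
    resource_id: str
    event_type: str
    state: str = "ok"


class NotificationDrivenEvaluator:
    """Evaluates alarms when a notification arrives instead of polling."""

    def __init__(self, alarms, notifier):
        # Index alarm configurations by resource ID; this stands in for
        # querying the "alarm" database with a key.
        self._by_resource = {}
        for alarm in alarms:
            self._by_resource.setdefault(alarm.resource_id, []).append(alarm)
        self._notifier = notifier

    def on_notification(self, notification):
        """Evaluate only the alarms related to this notification."""
        fired = []
        for alarm in self._by_resource.get(notification["resource_id"], []):
            matches = alarm.event_type == notification["event_type"]
            if matches and alarm.state != "alarm":
                alarm.state = "alarm"
                self._notifier(alarm, notification)  # northbound delivery
                fired.append(alarm.alarm_id)
        return fired
```

In this sketch the evaluation cost depends on the number of alarms registered for the affected resource, not on the total number of alarms, and no polling interval adds to the notification delay.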
-
-**Implementation detail**
-
-* The new "Notification-driven Alarm Evaluator" is part of Ceilometer.
-* Most of the existing source code of the "Alarm Evaluator" can be re-used to
-  implement this BP
-* No additional application logic is needed
-* It will access the Ceilometer Databases just like the existing "Alarm
-  evaluator"
-* Only the polling-based approach will be replaced by a listener for
-  "notifications" provided by the "Event Publisher for Alarm" on the Ceilometer
-  "notification bus".
-* No new interfaces have to be added to Ceilometer.
-
-
-.. [*] https://etherpad.opnfv.org/p/doctor_bps
-.. [*] https://wiki.openstack.org/wiki/Ceilometer/Alerting
-
-Report host fault to update server state immediately (Nova) [*]_
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-**Problem statement:**
-
-* The Nova state change for a failed or unreachable host is slow and does not
-  reliably state whether the host is down. This might cause the same server
-  instance to run twice if action is taken to evacuate the instance to another
-  host.
-* The Nova state of the server(s) on a failed host does not change, but remains
-  active and running. This gives the user false information about the server
-  state.
-* The VIM northbound interface notification of host faults towards the VNFM and
-  NFVO should be in line with the OpenStack state. This fault notification is a
-  Telco requirement defined in ETSI and will be implemented by the OPNFV Doctor
-  project.
-* An OpenStack user cannot take HA actions quickly and reliably by trusting the
-  server state and host state.
-
-**Proposed change:**
-
-There needs to be a new API that allows the Admin to state that a host is down.
-This API is used to mark the services running on that host as down, in order to
-reflect the real situation.
-
-An example for a compute node:
-
-* When the compute node is up and running::
-
-    vm_state: active and power_state: running
-    nova-compute state: up status: enabled
-
-* When the compute node goes down and the new API is called to state that the
-  host is down::
-
-    vm_state: stopped power_state: shutdown
-    nova-compute state: down status: enabled
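The intended state change can be modeled as a small, self-contained sketch. It does not call Nova. The function name `mark_host_down` and the dictionary layout are hypothetical and only mirror the states listed in the example.

```python
def mark_host_down(host):
    """Return a copy of the host record with the states from the example."""
    # Servers on the failed host are reported as stopped and shut down.
    servers = [
        dict(server, vm_state="stopped", power_state="shutdown")
        for server in host["servers"]
    ]
    # The compute service state becomes "down"; its status stays unchanged.
    return dict(host, service_state="down", servers=servers)
```

Returning a new record rather than mutating the input keeps the previous state available, for example for logging the transition.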
-
-**Alternatives:**
-
-There is no attractive alternative to having an external tool that detects the
-different host faults. For this kind of tool to exist, there needs to be a new
-API in Nova to report a fault. Currently, some kind of workaround must be
-implemented, because the states cannot be trusted or obtained from OpenStack
-fast enough.
-
-.. [*] https://blueprints.launchpad.net/nova/+spec/update-server-state-immediately
-
-Other related BPs
-^^^^^^^^^^^^^^^^^
-
-This section lists some BPs related to Doctor, but proposed by drafters outside
-the OPNFV community.
-
-pacemaker-servicegroup-driver [*]_
-__________________________________
-
-This BP will detect and report a host being down to OpenStack quite fast.
-However, this might not work properly, for example, when the management network
-has a problem and the host is reported as faulty while a VM is still running
-there. This might lead to launching the same VM instance twice, causing
-problems. Also, the northbound interface (NB IF) message needs a fault reason,
-and for that the source needs to be a tool that detects different kinds of
-faults, as Doctor will be doing. This BP might also need enhancement to change
-server and service states correctly.
-
-.. [*] https://blueprints.launchpad.net/nova/+spec/pacemaker-servicegroup-driver