22 files changed, 2472 insertions, 293 deletions
diff --git a/docs/development/overview/HA_Analysis-Gambia.rst b/docs/development/overview/HA_Analysis-Gambia.rst
new file mode 100644
index 0000000..c5505b1
--- /dev/null
+++ b/docs/development/overview/HA_Analysis-Gambia.rst
@@ -0,0 +1,667 @@
+.. image:: opnfv-logo.png
+  :height: 40
+  :width: 200
+  :alt: OPNFV
+  :align: left
+
+
+******************
+Introduction
+******************
+This High Availability Requirement Analysis document elicits the High Availability (HA)
+requirements of OPNFV. It refines high-level HA goals into detailed HA mechanism designs and
+relates those mechanisms to potential failures on the different layers of OPNFV. It can also
+serve as a reference for designing HA test scenarios.
+
+The analysis uses KAOS, a goal-oriented requirements engineering model.
+
+*************************
+Terminologies and Symbols
+*************************
+The following concepts in KAOS will be used in the diagrams of this document.
+
+- **Goal**: The objective to be met by the target system.
+
+- **Obstacle**: Condition whose satisfaction may prevent some goals from being achieved.
+
+- **Agent**: Active Object performing operations to achieve goals.
+
+- **Requirement**: Goal assigned to an agent of the software being studied.
+
+- **Domain Property**: Descriptive assertion about objects in the environment of the software.
+
+- **Refinement**: Relationship linking a goal to other goals that are called its subgoals.
+  Each subgoal contributes to the satisfaction of the goal it refines. There are two types of
+  refinement: an AND refinement means the goal is achieved by satisfying all of its subgoals,
+  and an OR refinement means it is achieved by satisfying any one of them.
+
+- **Conflict**: Relationship linking an obstacle to a goal if the obstacle obstructs the goal
+  from being satisfied.
+
+- **Resolution**: Relationship linking a goal to an obstacle if the goal can resolve the
+  obstacle.
+
+- **Responsibility**: Relationship between an agent and a requirement. Holds when an agent is
+  assigned the responsibility of achieving the linked requirement.
+
+Figure 1 shows how these concepts are displayed in a KAOS diagram.
+
+.. figure:: images/fig1_KAOS_Sample.png
+    :alt: KAOS Sample
+    :figclass: align-center
+
+    Fig 1. A KAOS Sample Diagram
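+
+The AND/OR refinement semantics defined above can be sketched in a few lines of Python. This
+is only an illustration: the ``Goal`` class, its field names and the example goal tree are
+hypothetical and are not part of KAOS tooling or of OPNFV.
+
```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Goal:
    """A goal node; `refinement` ("AND" or "OR") only matters when subgoals exist."""
    name: str
    satisfied: bool = False  # consulted for leaf goals only
    refinement: str = "AND"
    subgoals: List["Goal"] = field(default_factory=list)

    def is_satisfied(self) -> bool:
        if not self.subgoals:
            return self.satisfied
        results = [g.is_satisfied() for g in self.subgoals]
        # AND refinement: every subgoal must hold; OR refinement: any one suffices.
        return all(results) if self.refinement == "AND" else any(results)


# Hypothetical example tree, chosen only to exercise both refinement types.
instance_ha = Goal("Virtual instance HA", refinement="OR", subgoals=[
    Goal("Container HA", satisfied=True),
    Goal("VM HA", satisfied=False),
])
nfvi_ha = Goal("NFVI HA", refinement="AND", subgoals=[
    instance_ha,
    Goal("Virtual storage HA", satisfied=True),
    Goal("Virtual network HA", satisfied=False),
])
```
+
+Here the OR-refined goal holds because one subgoal holds, while the AND-refined parent does not
+because one of its subgoals is unsatisfied.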
+
+********************************
+High Availability Goals of OPNFV
+********************************
+
+Overall Goals
+>>>>>>>>>>>>>>>>>>
+
+The final goal of OPNFV High Availability is to provide highly available VNF services. The
+following objectives must be met:
+
+- There should be no single point of failure in the NFV framework.
+
+- All resiliency mechanisms shall be designed for a multi-vendor environment, where for example
+  the NFVI, NFV-MANO, and VNFs may be supplied by different vendors.
+
+- Resiliency related information shall always be explicitly specified and communicated using
+  the reference interfaces (including policies/templates) of the NFV framework.
+
+
+
+Service Level Agreements of OPNFV HA
+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Service Level Agreements of OPNFV HA focus mainly on time constraints for service outage,
+failure detection and failure recovery. Table 1 shows the time constraints for the Service
+Availability Levels (SALs) described in ETSI GS NFV-REL 001 V1.1.1 (2015-01). In this
+document, SAL1 is the default benchmark that must be met.
+
+*Table 1. Time Constraints for Different Service Availability Levels*
+
++--------------------------------+----------------------------+------------------------+
+| Service Availability Level     | Failure Detection Time     | Failure Recovery Time  |
++================================+============================+========================+
+| SAL1                           | <1s                        | 5-6s                   |
++--------------------------------+----------------------------+------------------------+
+| SAL2                           | <5s                        | 10-15s                 |
++--------------------------------+----------------------------+------------------------+
+| SAL3                           | <10s                       | 20-25s                 |
++--------------------------------+----------------------------+------------------------+
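+
+As an illustration of how Table 1 could be applied to test results, the following sketch checks
+measured detection and recovery times against a level. The function names are hypothetical, and
+treating the upper end of each recovery range as the limit is an assumption of this sketch, not
+something stated in the table.
+
```python
# Limits in seconds, transcribed from Table 1. Using the upper end of each
# recovery-time range as the limit is an assumption of this sketch.
SAL_LIMITS = {
    "SAL1": {"detection": 1.0, "recovery": 6.0},
    "SAL2": {"detection": 5.0, "recovery": 15.0},
    "SAL3": {"detection": 10.0, "recovery": 25.0},
}


def meets_sal(level: str, detection_s: float, recovery_s: float) -> bool:
    """Return True if the measured times satisfy the given availability level."""
    limits = SAL_LIMITS[level]
    return detection_s < limits["detection"] and recovery_s <= limits["recovery"]


def best_sal(detection_s: float, recovery_s: float):
    """Return the strictest level that the measurements satisfy, or None."""
    for level in ("SAL1", "SAL2", "SAL3"):
        if meets_sal(level, detection_s, recovery_s):
            return level
    return None
```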
+
+
+******************
+Overall Analysis
+******************
+Figure 2 shows the overall decomposition of the high availability goals. The high availability
+of VNF services can be refined into the high availability of VNFs, MANO and the NFVI where the
+VNFs are deployed. The high availability of the NFVI service can be refined into the high
+availability of virtual compute instances, virtual storage and virtual network services. The
+high availability of a virtual compute instance is either the high availability of containers
+or the high availability of VMs. These goals can be further decomposed according to how the NFV
+environment is deployed.
+
+.. figure:: images/fig2_Total_Framework.png
+    :alt: Overall HA Analysis of OPNFV
+    :figclass: align-center
+
+    Fig 2. Overall HA Analysis of OPNFV
+
+Thus the high availability requirement of VNF services can be classified into high availability
+requirements on different layers in OPNFV. The following layers are mainly discussed in this
+document:
+
+- VNF HA
+
+- MANO HA
+
+- Virtual Infrastructure HA (container HA or VM HA)
+
+- VIM HA
+
+- SDN HA
+
+- Hypervisor HA
+
+- Host OS HA
+
+- Hardware HA
+
+The next section gives a detailed analysis of the HA requirements on these layers.
+
+******************
+Detailed Analysis
+******************
+
+VNF HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+MANO HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+Virtual Infrastructure HA
+>>>>>>>>>>>>>>>>>>
+
+The Virtual Infrastructure HA in OPNFV includes container HA and VM HA.
+
+VM HA
+::::::::::::::::::::::::::::::::::::::
+
+This part describes a set of new optional capabilities in which the OpenStack cloud sends
+messages into the Guest VMs in order to improve the availability of the hosted VMs.
+
+Table 2 shows the potential faults of VMs and the corresponding initial solution capabilities.
+
+*Table 2. Potential Faults of VMs and the initial solution capabilities*
+
++---------------------------+------------------------------------+--------------------------------------------+
+| Fault                     | Description                        | Solution Capabilities                      |
++===========================+====================================+============================================+
+| VM faults                 | General internal VM faults         | VM Heartbeating and Health Checking        |
++---------------------------+------------------------------------+--------------------------------------------+
+| VM Server Group faults    | Faults such as split brain         | VM Peer State Notification and Messaging   |
++---------------------------+------------------------------------+--------------------------------------------+
+
+
+.. figure:: images/fig3_VM_HA_Analysis.png
+    :alt: VM HA
+    :figclass: align-center
+
+    Fig 3. VM HA Analysis
+
+NOTE: A Server Group here is the OpenStack Nova Server Group concept, where VMs
+are grouped together for purposes of scheduling. For example, a specific Server Group
+instance can specify whether the VMs within the group should be scheduled to
+run on the same compute host or on different compute hosts. A 'peer' VM in the
+context of this section refers to a VM within the same Nova Server Group.
+
+The initial set of new capabilities includes enabling the detection of and recovery from
+internal VM faults, and providing a simple out-of-band messaging service to prevent scenarios
+such as split brain.
+
+A more detailed description is located in R5_HA_API/OPNFV_HA_Guest_APIs-Overview_HLD.rst in
+this project.
+
+The Host-to-Guest messaging APIs used by the services discussed
+in this Virtual Infrastructure HA part use a JSON-formatted application messaging layer
+on top of a virtio serial device between QEMU on the OpenStack Host
+and the Guest VM. Use of the virtio serial device provides a
+simple, direct communication channel between host and guest which is
+independent of the Guest's L2/L3 networking.
+
+The upper-layer JSON messaging format is structured hierarchically, with a Base JSON Message
+Layer and an Application JSON Message Layer:
+
+- The Base Layer provides the ability to multiplex different groups of message types on top of
+  a single virtio serial device, e.g.
+
+  + heartbeating and health checks,
+  + server group messaging.
+
+- The Application Layer provides the specific message types and fields of a particular group
+  of message types.
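+
+The two-layer structure can be sketched as follows. The field names (``group``, ``data``,
+``type``) and the newline framing are assumptions made for illustration; the actual message
+format is defined in the Guest API documents of this project, not here.
+
```python
import json


def encode(group: str, app_message: dict) -> bytes:
    """Wrap an application-layer message in a (hypothetical) base-layer envelope."""
    base = {"group": group, "data": app_message}
    # One JSON document per line, so messages can be framed on a serial stream.
    return (json.dumps(base) + "\n").encode("utf-8")


def dispatch(raw: bytes, handlers: dict):
    """Decode the base layer and pass the application payload to its group handler."""
    base = json.loads(raw.decode("utf-8"))
    handler = handlers.get(base["group"])
    if handler is None:
        return None  # unknown message group: ignored in this sketch
    return handler(base["data"])


# Each message group sharing the single channel gets its own handler.
handlers = {
    "heartbeat": lambda msg: ("heartbeat", msg["type"]),
    "server_group": lambda msg: ("server_group", msg["type"]),
}
```
+
+The base layer only routes on the group name, so a new group of message types can be added
+without touching the existing handlers.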
+
+
+A) VM Heartbeating and Health Checking
+
+
+.. figure:: images/fig4_Heartbeating_and_Healthchecks.png
+    :alt: Heartbeating and Healthchecks
+    :figclass: align-center
+
+    Fig 4. Heartbeating and Healthchecks
+    
+VM Heartbeating and Health Checking provides a heartbeat service to enhance
+the monitoring of the health of guest application(s) within a VM running
+under the OpenStack Cloud. Loss of heartbeat or a failed health check status
+will result in a fault event being reported to OPNFV's DOCTOR infrastructure
+for alarm identification, impact analysis and reporting. This would then enable
+VNF Managers (VNFMs) listening to OPNFV's DOCTOR External Alarm Reporting through
+Telemetry's AODH, to initiate any required fault recovery actions.
+
+Guest heartbeating works on a challenge-response model. The OpenStack Guest Heartbeat
+Service on the compute node challenges the registered Guest VM daemon with a
+message each interval. The registered Guest VM daemon must respond prior to the
+next interval with a message indicating good health. If the OpenStack Host does
+not receive a valid response, or if the response specifies that the VM is in ill
+health, then a fault event for the Guest VM is reported to the OpenStack Guest
+Heartbeat Service on the controller node, which reports the event to OPNFV's
+DOCTOR (i.e. through the Doctor SouthBound (SB) APIs).
+
+In summary, the Guest Heartbeating Messaging Specification is quite simple and
+includes the following PDUs: Init, Init-Ack, Challenge-Request,
+Challenge-Response and Exit. The Challenge-Response returns a healthy /
+not-healthy boolean.
+
+The registered Guest VM daemon's response to the challenge can be as simple
+as just immediately responding with OK.  This alone allows for detection of
+a failed or hung QEMU/KVM instance, or a failure of the OS within the VM to
+schedule the registered Guest VM's daemon or failure to route basic IO within
+the Guest VM.
+
+However, the registered Guest VM daemon's response to the challenge can be more
+complex, running anything from a quick, simple sanity check of the health of
+applications running in the Guest VM to a more thorough audit of the
+application state and data. In either case, returning the status of the
+health check enables the OpenStack host to detect and report the event in order
+to initiate recovery from application-level errors or failures within the Guest VM.
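+
+One interval of the challenge-response exchange described above might look like the following
+sketch. The ``token`` field used to match a response to its challenge, the dictionary layout
+and all names are assumptions for illustration; only the Challenge-Request / Challenge-Response
+roles and the healthy boolean come from the text.
+
```python
import secrets


class GuestDaemon:
    """Stand-in for the registered Guest VM daemon (hypothetical)."""

    def __init__(self, health_check):
        self.health_check = health_check  # callable returning True when healthy

    def on_challenge(self, request: dict) -> dict:
        # Echo the token and report the result of the application health check.
        return {"type": "challenge-response",
                "token": request["token"],
                "healthy": bool(self.health_check())}


def heartbeat_round(send_challenge):
    """Run one interval on the host side.

    `send_challenge` delivers a request and returns the response, or None if
    nothing arrived before the next interval. Returns None when the guest is
    healthy, otherwise a string describing the fault to report.
    """
    token = secrets.token_hex(8)
    response = send_challenge({"type": "challenge-request", "token": token})
    if response is None:
        return "no response before next interval"
    if response.get("token") != token:
        return "invalid response"
    if not response.get("healthy", False):
        return "guest reported ill health"
    return None
```
+
+A daemon that answers immediately with a healthy status corresponds to the simple case above;
+passing a richer ``health_check`` callable corresponds to the application-level audit.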
+
+
+B) VM Peer State Notification and Messaging
+
+
+.. figure:: images/fig5_VM_Peer_State_Notification_and_Messaging.png
+    :alt: VM Peer State Notification and Messaging
+    :figclass: align-center
+
+    Fig 5. VM Peer State Notification and Messaging
+    
+Server Group State Notification and Messaging is a service to provide
+simple low-bandwidth datagram messaging and notifications for servers that
+are part of the same server group.  This messaging channel is available
+regardless of whether IP networking is functional within the server, and
+it requires no knowledge within the server about the other members of the group.
+
+This Server Group Messaging service provides three types of messaging:
+
+- Broadcast: this allows a server to send a datagram (size of up to 3050 bytes)
+  to all other servers within the server group.
+- Notification: this provides servers with information about changes to the
+  (Nova) state of other servers within the server group.
+- Status: this allows a server to query the current (Nova) state of all servers within
+  the server group (including itself).
+
+A Server Group Messaging entity on both the controller node and the compute nodes manages
+the routing of VM-to-VM messages through the platform, leveraging Nova to determine
+Server Group membership and the compute node locations of VMs. The Server Group Messaging
+entity on the controller also listens to Nova VM state change notifications and queries
+VM state data from Nova, in order to provide the VM query and notification functionality
+of this service.
+
+This service is not intended for high bandwidth or low-latency operations. It is best-effort, 
+not reliable. Applications should do end-to-end acks and retries if they care about reliability.
+      
+This service provides building-block capabilities for the Guest VMs that
+contribute to higher availability of the VMs in the Guest VM Server Group. Notifications
+of VM status changes potentially provide a faster and more accurate notification
+of failed peer VMs than traditional peer VM monitoring over Tenant Networks. The
+Broadcast Messaging mechanism, in turn, provides an out-of-band way to
+monitor and control a peer VM under fault conditions, e.g. to
+avoid potential split brain scenarios between 1:1 VMs when faults in Tenant
+Networking occur.
+
+Container HA
+::::::::::::::::::::::::::::
+
+Container HA in OPNFV focuses mainly on the Kubernetes (K8s) platform. Because the Pod is
+the smallest unit of management, creation and scheduling in K8s, container HA in K8s
+effectively means the high availability of running Pods.
+
+Table 3 shows the potential faults of running pods in K8s. When such a fault happens, the
+ReplicationController or ReplicaSet can prevent the services provided by the pod from becoming
+unavailable, as shown in Figure 6.
+
+*Table 3. Potential Faults of Running Pods in K8s*
+
++------------+--------------+----------------------------------------------------+----------------+
+| Service    | Fault        | Description                                        | Severity       |
++============+==============+====================================================+================+
+|            |              | All Containers in the Pod have terminated, and     |                |
+| Running by | Pod failure  | at least one Container has terminated in failure.  | Critical       |
+| pods       |              | That is, the Container either exited with non-zero |                |
+|            |              | status or was terminated by the system.            |                |
++------------+--------------+----------------------------------------------------+----------------+
+
+.. figure:: images/fig6_Container_HA_analysis_in_K8s.png
+    :alt: Container HA Analysis in K8s
+    :figclass: align-center
+
+    Fig 6. Container HA analysis in K8s
+    
+    
+The ReplicationController or ReplicaSet (ReplicaSet is the next-generation ReplicationController)
+is one of the controllers run by the K8s Master Components. It ensures that a specified number
+of pod replicas are running at any one time.
+
+The following requirements are elicited for Pod HA:
+
+**[Req 5.3.1]** A pod or a homogeneous set of pods should always be up and available until it
+is properly terminated.
+
+**[Req 5.3.2]** The ReplicationController or ReplicaSet should terminate the extra pods if there
+are more pods than the specified number.
+
+**[Req 5.3.3]** The ReplicationController or ReplicaSet should start more pods if there are
+fewer pods than the specified number.
+
+**[Req 5.3.4]** When a failure of the host or container is detected, the replacement Pod should
+be scheduled to another Node.
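+
+The behaviour required by Req 5.3.2 to Req 5.3.4 amounts to a reconciliation step that compares
+the desired replica count with what is actually running. The sketch below is a simplified,
+hypothetical model of such a step; it does not use the Kubernetes API, and the tuple layout and
+round-robin node choice are assumptions of the example.
+
```python
def reconcile(desired: int, pods: list, failed_nodes: set, healthy_nodes: list):
    """Return (pods_to_delete, nodes_for_new_pods) for one reconciliation pass.

    `pods` is a list of (pod_name, node_name, phase) tuples.
    """
    # A pod counts as running only if its phase is Running and its node is alive.
    running = [p for p in pods
               if p[2] == "Running" and p[1] not in failed_nodes]
    lost = [p for p in pods if p not in running]

    to_delete = [name for name, _, _ in lost]
    # Req 5.3.2: terminate extra pods beyond the desired count.
    to_delete += [name for name, _, _ in running[desired:]]

    # Req 5.3.3 / 5.3.4: start replacements, placed only on healthy nodes.
    missing = max(0, desired - len(running))
    if healthy_nodes:
        targets = [healthy_nodes[i % len(healthy_nodes)] for i in range(missing)]
    else:
        targets = []
    return to_delete, targets
```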
+
+
+  
+VIM HA
+>>>>>>>>>>>>>>>>>>
+
+
+OpenStack High Availability
+::::::::::::::::::::::::::::
+
+The VIM in the NFV reference architecture contains different components of OpenStack, SDN
+controllers and other virtual resource controllers. VIM components can be classified into three
+types:
+
+- **Entry Point Components**: Components that give VIM service interfaces to users, like
+  nova-api and neutron-server.
+
+- **Middlewares**: Components that provide load balancer services, messaging queues, cluster
+  management services, etc.
+
+- **Subcomponents**: Components that implement VIM functions, which are called by Entry Point
+  Components but not by users directly.
+
+Table 4 shows the potential faults that may happen on the VIM layer. Currently the main focus
+of VIM HA is the service crash of VIM components, which may occur on all types of VIM
+components. To prevent VIM services from becoming unavailable, Active/Active Redundancy,
+Active/Passive Redundancy and Message Queue are used for different types of VIM components, as
+shown in Figure 7.
+
+*Table 4. Potential Faults at the VIM Level*
+
++------------+------------------+-------------------------------------------------+----------------+
+| Service    | Fault            | Description                                     | Severity       |
++============+==================+=================================================+================+
+| General    | Service Crash    | The processes of a service crashed abnormally.  | Critical       |
++------------+------------------+-------------------------------------------------+----------------+
+
+.. figure:: images/fig7_VIM_Analysis.png
+    :alt: VIM HA Analysis
+    :figclass: align-center
+
+    Fig 7. VIM HA Analysis
+
+
+A) Active/Active Redundancy
+
+Active/Active Redundancy manages both the main and redundant systems concurrently. If a
+failure happens on a component, the backups are already online and users are unlikely to
+notice that the failed VIM component is being repaired. A typical Active/Active Redundancy
+setup has redundant instances that are load balanced via a virtual IP address and a
+load balancer such as HAProxy.
+
+When one of the redundant VIM components fails, the load balancer should become aware of the
+instance failure and then isolate the failed instance from being called until it is recovered.
+The requirement decomposition of Active/Active Redundancy is shown in Figure 8.
+
+.. figure:: images/fig8_Active_Active_Redundancy.png
+    :alt: Active/Active Redundancy Requirement Decomposition
+    :figclass: align-center
+
+    Fig 8. Active/Active Redundancy Requirement Decomposition
+
+The following requirements are elicited for VIM Active/Active Redundancy:
+
+**[Req 5.4.1]** Redundant VIM components should be load balanced by a load balancer.
+
+**[Req 5.4.2]** The load balancer should check the health status of VIM component instances.
+
+**[Req 5.4.3]** The load balancer should isolate the failed VIM component instance until it is
+recovered.
+
+**[Req 5.4.4]** The alarm information of VIM component failure should be reported.
+
+**[Req 5.4.5]** Failed VIM component instances should be recovered by a cluster manager.
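+
+Req 5.4.1 to Req 5.4.4 can be illustrated with a small in-memory model of a load balancer that
+probes its backends, isolates failed ones and reports an alarm once per failure. This is a
+sketch under simplifying assumptions (round-robin selection, a caller-supplied ``probe``
+function, hypothetical backend names); it is not how HAProxy is configured or implemented.
+
```python
import itertools


class LoadBalancer:
    """Round-robin balancer with health checks (illustrative only)."""

    def __init__(self, backends, probe):
        self.backends = list(backends)
        self.probe = probe  # callable(backend) -> True when the backend is healthy
        self.isolated = set()
        self._rr = itertools.cycle(self.backends)

    def run_health_checks(self):
        """Probe every backend; return the backends that newly failed (alarms)."""
        alarms = []
        for backend in self.backends:
            if self.probe(backend):
                self.isolated.discard(backend)  # recovered: back into rotation
            elif backend not in self.isolated:
                self.isolated.add(backend)      # Req 5.4.3: isolate until recovered
                alarms.append(backend)          # Req 5.4.4: report the failure
        return alarms

    def pick(self):
        """Return the next healthy backend (Req 5.4.1)."""
        for _ in range(len(self.backends)):
            backend = next(self._rr)
            if backend not in self.isolated:
                return backend
        raise RuntimeError("no healthy backend available")
```
+
+Recovery of the failed instance itself (Req 5.4.5) is left to the cluster manager and is
+represented here only by the probe starting to succeed again.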
+
+Table 5 shows the current VIM components using Active/Active Redundancy and the corresponding
+HA test cases to verify them.
+
+*Table 5. VIM Components using Active/Active Redundancy*
+
++-------------------+-------------------------------------------------------+----------------------+
+| Component         | Description                                           | Related HA Test Case |
++===================+=======================================================+======================+
+| nova-api          | endpoint component of Openstack Compute Service Nova  | yardstick_tc019      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-novncproxy   | server daemon that serves the Nova noVNC Websocket    |                      |
+|                   | Proxy service, which provides a websocket proxy that  |                      |
+|                   | is compatible with OpenStack Nova noVNC consoles.     |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| neutron-server    | endpoint component of OpenStack Networking Service    | yardstick_tc045      |
+|                   | Neutron                                               |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| keystone          | component of OpenStack Identity Service               | yardstick_tc046      |
+|                   | Keystone                                              |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| glance-api        | endpoint component of Openstack Image Service Glance  | yardstick_tc047      |
++-------------------+-------------------------------------------------------+----------------------+
+| glance-registry   | server daemon that serves image metadata through a    |                      |
+|                   | REST-like API.                                        |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| cinder-api        | endpoint component of OpenStack Block Storage Service | yardstick_tc048      |
+|                   | Cinder                                                |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| swift-proxy       | endpoint component of Openstack Object Storage        | yardstick_tc049      |
+|                   | Swift                                                 |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| horizon           | component of Openstack Dashboard Service Horizon      |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| heat-api          | endpoint component of Openstack Stack Service Heat    | yardstick_tc091      |
++-------------------+-------------------------------------------------------+----------------------+
+| mysqld            | database service of VIM components                    | yardstick_tc090      |
++-------------------+-------------------------------------------------------+----------------------+
+
+B) Active/Passive Redundancy
+
+
+Active/Passive Redundancy maintains a redundant instance that can be brought online when the
+active service fails. A typical Active/Passive Redundancy maintains replacement resources that
+can be brought online when required. Requests are handled using a virtual IP address (VIP) that
+facilitates returning to service with minimal reconfiguration. A cluster manager (such as
+Pacemaker or Corosync) monitors these components, bringing the backup online as necessary.
+
+When the main instance of a VIM component fails, the cluster manager should become aware of the
+failure and switch the backup instance online. The failed instance should then be recovered so
+that it can serve as a new backup instance. The requirement decomposition of Active/Passive
+Redundancy is shown in Figure 9.
+
+.. figure:: images/fig9_Active_Passive_Redundancy.png
+    :alt: Active/Passive Redundancy Requirement Decomposition
+    :figclass: align-center
+
+    Fig 9. Active/Passive Redundancy Requirement Decomposition
+
+The following requirements are elicited for VIM Active/Passive Redundancy:
+
+**[Req 5.4.6]** The cluster manager should replace the failed main VIM component instance with
+a backup instance.
+
+**[Req 5.4.7]** The cluster manager should check the health status of VIM component instances.
+
+**[Req 5.4.8]** Failed VIM component instances should be recovered by the cluster manager.
+
+**[Req 5.4.9]** The alarm information of VIM component failure should be reported.
+
+
+Table 6 shows the current VIM components using Active/Passive Redundancy and the corresponding
+HA test cases to verify them.
+
+*Table 6. VIM Components using Active/Passive Redundancy*
+
++-------------------+-------------------------------------------------------+----------------------+
+| Component         | Description                                           | Related HA Test Case |
++===================+=======================================================+======================+
+| haproxy           | load balancer component of VIM components             | yardstick_tc053      |
++-------------------+-------------------------------------------------------+----------------------+
+| rabbitmq-server   | messaging queue service of VIM components             | yardstick_tc056      |
++-------------------+-------------------------------------------------------+----------------------+
+| corosync          | cluster management component of VIM components        | yardstick_tc057      |
++-------------------+-------------------------------------------------------+----------------------+
+
+C) Message Queue
+
+Message Queue provides an asynchronous communication protocol. In OpenStack, some projects
+(like Nova and Cinder) use Message Queue to call their subcomponents. Although Message Queue
+itself is not an HA mechanism, the way it works ensures high availability when redundant
+components subscribe to the Message Queue. When a VIM subcomponent fails, requests can still be
+processed because other redundant components are subscribed to the Message Queue. Fault
+isolation is also achieved, since failed components do not actively fetch requests. The
+recovery of failed components is still required. Figure 10 shows the requirement
+decomposition of Message Queue.
+
+.. figure:: images/fig10_Message_Queue.png
+    :alt: Message Queue Requirement Decomposition
+    :figclass: align-center
+
+    Fig 10. Message Queue Redundancy Requirement Decomposition
+
+The following requirements are elicited for Message Queue:
+
+**[Req 5.4.10]** Redundant component instances should subscribe to the Message Queue, which is
+implemented by the installer.
+
+**[Req 5.4.11]** Failed VIM component instances should be recovered by the cluster manager.
+
+**[Req 5.4.12]** The alarm information of VIM component failure should be reported.
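+
+The fault isolation property described for Message Queue follows from the pull model: a crashed
+consumer simply stops fetching. The sketch below models this with an in-memory queue and
+hypothetical worker names; it does not use RabbitMQ or oslo.messaging and is meant only to show
+the idea.
+
```python
from collections import deque


class Worker:
    """A redundant subcomponent that pulls requests from a shared queue."""

    def __init__(self, name: str):
        self.name = name
        self.alive = True
        self.handled = []

    def poll(self, requests: deque) -> None:
        # A crashed worker never polls, so it receives no requests.
        if self.alive and requests:
            self.handled.append(requests.popleft())


def drain(requests: deque, workers: list) -> None:
    """Let the workers take turns polling until the queue is empty.

    Stops early if no worker is alive, leaving requests queued for later.
    """
    while requests and any(w.alive for w in workers):
        for worker in workers:
            worker.poll(requests)
```
+
+With one of two workers marked as crashed, every request is still handled by the surviving
+worker, and the queued requests are not lost if all workers are down temporarily.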
+
+Table 7 shows the current VIM components using Message Queue and the corresponding HA test cases
+to verify them.
+
+*Table 7. VIM Components using Message Queue*
+
++-------------------+-------------------------------------------------------+----------------------+
+| Component         | Description                                           | Related HA Test Case |
++===================+=======================================================+======================+
+| nova-scheduler    | Openstack compute component determines how to         | yardstick_tc088      |
+|                   | dispatch compute requests                             |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-cert         | Openstack compute component that serves the Nova Cert |                      |
+|                   | service for X509 certificates. Used to generate       |                      |
+|                   | certificates for euca-bundle-image.                   |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-conductor    | server daemon that serves the Nova Conductor service, | yardstick_tc089      |
+|                   | which provides coordination and database query        |                      |
+|                   | support for Nova.                                     |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-compute      | Handles all processes relating to instances (guest    |                      |
+|                   | vms). nova-compute is responsible for building a disk |                      |
+|                   | image, launching it via the underlying virtualization |                      |
+|                   | driver, responding to calls to check its state,       |                      |
+|                   | attaching persistent storage, and terminating it.     |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-consoleauth  | Openstack compute component for Authentication of     |                      |
+|                   | nova consoles.                                        |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| cinder-scheduler  | Openstack volume storage component decides on         |                      |
+|                   | placement for newly created volumes and forwards the  |                      |
+|                   | request to cinder-volume.                             |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| cinder-volume     | Openstack volume storage component receives volume    |                      |
+|                   | management requests from cinder-api and               |                      |
+|                   | cinder-scheduler, and routes them to storage backends |                      |
+|                   | using vendor-supplied drivers.                        |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| heat-engine       | Openstack Heat project server with an internal RPC    |                      |
+|                   | api called by the heat-api server.                    |                      |
++-------------------+-------------------------------------------------------+----------------------+
+
+
+VIM HA in K8s
+::::::::::::::::::::::::::::
+
+The VIM HA in K8s can be generally analyzed from the following two concepts:
+
+- **Master Components HA**: the HA of the K8s components on the Master (for example,
+  Kube-apiserver, Kube-scheduler and Kube-controller-manager).
+
+- **Data Storage HA**: the HA of the etcd cluster. etcd is a master component used as
+  Kubernetes' backing store for all cluster data. Because etcd is the only stateful service
+  in K8s and its HA policy can be deployed independently of K8s, the HA of etcd is discussed
+  separately.
+
+Table 8 shows the potential faults that may happen in K8s.
+
+*Table 8. Potential Faults in K8s*
+
++--------------------+------------------+----------------------------------------+----------------+
+| Service            | Fault            | Description                            | Severity       |
++====================+==================+========================================+================+
+| Provided by Master | Master           | A Master component crashed and can't   | Critical       |
+| Components         | Component crash  | provide normal service.                |                |
++--------------------+------------------+----------------------------------------+----------------+
+| Data storage       | Etcd Crash       | The Etcd cluster crashed abnormally.   | Critical       |
++--------------------+------------------+----------------------------------------+----------------+
+
+
+.. figure:: images/fig11_VIM_HA_analysis_in_K8s.png
+    :alt: VIM HA Analysis in K8s
+    :figclass: align-center
+
+    Fig 11. VIM HA analysis in K8s
+    
+Master components can be run on any machine in the cluster. However, for simplicity, all master
+components are typically started on the same machine, and user containers are not run on this
+machine. In this case, K8s is based on a single Master and only has container HA on the
+application layer, realized by the ReplicationController or ReplicaSet as mentioned in the
+Container HA part above.
+
+The HA of the Master and its components in K8s depends on a multi-master setup.
+
+Data Storage HA can be realized either by using an existing Etcd HA cluster or by running Etcd
+as a master component in a multi-master deployment.
+
+.. figure:: images/fig12_VIM_HA_analysis_in_K8s_2.png
+    :alt: VIM HA analysis in K8s (2)
+    :figclass: align-center
+
+    Fig 12. VIM HA analysis in K8s(2)
+
+In multi-master K8s, Master Components HA is mainly based on the leader election function of the
+etcd cluster, and a load balancer is used to realize the HA of the kube-apiserver master component.
+
+The following requirements are elicited for Master components HA:
+
+**[Req 5.4.13]** The Load Balancer should always forward requests to an available kube-apiserver
+instance.
+
+**[Req 5.4.14]** The Master Component in the Leader state should regularly confirm its Leader
+state to all follower Components through heartbeats.
+
+**[Req 5.4.15]** When a Master Component in the Leader state crashes, an available Master
+Component should be elected as Leader.
+
+Hypervisor HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+Host OS HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+Hardware HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+
+******************
+References
+******************
+
+- A KAOS Tutorial: http://www.objectiver.com/fileadmin/download/documents/KaosTutorial.pdf
+
+- ETSI GS NFV-REL 001 V1.1.1(2015-01):
+  http://www.etsi.org/deliver/etsi_gs/NFV-REL/001_099/001/01.01.01_60/gs_NFV-REL001v010101p.pdf
+
+- OpenStack High Availability Guide: https://docs.openstack.org/ha-guide/
+
+- Highly Available (Mirrored) Queues: https://www.rabbitmq.com/ha.html
diff --git a/docs/development/overview/HA_Analysis-Gambia.rst.bak b/docs/development/overview/HA_Analysis-Gambia.rst.bak
new file mode 100644
index 0000000..40b3f35
--- /dev/null
+++ b/docs/development/overview/HA_Analysis-Gambia.rst.bak
@@ -0,0 +1,667 @@
+.. image:: opnfv-logo.png
+  :height: 40
+  :width: 200
+  :alt: OPNFV
+  :align: left
+
+
+******************
+Introduction
+******************
+This High Availability Requirement Analysis Document is used to elicit the High Availability
+requirements of OPNFV. The document refines high-level High Availability goals into a
+detailed HA mechanism design, and relates the HA mechanisms to the potential failures on
+different layers in OPNFV. This document can also be used as a reference for the design of
+HA testing scenarios.
+The requirements engineering model KAOS is used in this document.
+
+*************************
+Terminologies and Symbols
+*************************
+The following concepts in KAOS will be used in the diagrams of this document.
+
+- **Goal**: The objective to be met by the target system.
+
+- **Obstacle**: Condition whose satisfaction may prevent some goals from being achieved.
+
+- **Agent**: Active Object performing operations to achieve goals.
+
+- **Requirement**: Goal assigned to an agent of the software being studied.
+
+- **Domain Property**: Descriptive assertion about objects in the environment of the software.
+
+- **Refinement**: Relationship linking a goal to other goals that are called its subgoals.
+  Each subgoal contributes to the satisfaction of the goal it refines. There are two types of
+  refinement: AND refinement, where the goal is achieved by satisfying all of its subgoals, and
+  OR refinement, where the goal is achieved by satisfying any one of its subgoals.
+
+- **Conflict**: Relationship linking an obstacle to a goal if the obstacle obstructs the goal
+  from being satisfied.
+
+- **Resolution**: Relationship linking a goal to an obstacle if the goal can resolve the
+  obstacle.
+
+- **Responsibility**: Relationship between an agent and a requirement. Holds when an agent is
+  assigned the responsibility of achieving the linked requirement.
+
+Figure 1 shows how these concepts are displayed in a KAOS diagram.
+
+.. figure:: images/fig1_KAOS_Sample.png
+    :alt: KAOS Sample
+    :figclass: align-center
+
+    Fig 1. A KAOS Sample Diagram
+
+********************************
+High Availability Goals of OPNFV
+********************************
+
+Overall Goals
+>>>>>>>>>>>>>>>>>>
+
+The final goal of OPNFV High Availability is to provide highly available VNF services. The
+following objectives must be met:
+
+- There should be no single point of failure in the NFV framework.
+
+- All resiliency mechanisms shall be designed for a multi-vendor environment, where for example
+  the NFVI, NFV-MANO, and VNFs may be supplied by different vendors.
+
+- Resiliency related information shall always be explicitly specified and communicated using
+  the reference interfaces (including policies/templates) of the NFV framework.
+
+
+
+Service Level Agreements of OPNFV HA
+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Service Level Agreements of OPNFV HA mainly focus on time constraints for service outage,
+failure detection, and failure recovery. Table 1 shows the time constraints of the
+different Service Availability Levels (SALs) described in
+ETSI GS NFV-REL 001 V1.1.1 (2015-01). In this document, SAL1 is the default benchmark
+that is required to be met.
+
+*Table 1. Time Constraints for Different Service Availability Levels*
+
++--------------------------------+----------------------------+------------------------+
+| Service Availability Level     | Failure Detection Time     | Failure Recovery Time  |
++================================+============================+========================+
+| SAL1                           | <1s                        | 5-6s                   |
++--------------------------------+----------------------------+------------------------+
+| SAL2                           | <5s                        | 10-15s                 |
++--------------------------------+----------------------------+------------------------+
+| SAL3                           | <10s                       | 20-25s                 |
++--------------------------------+----------------------------+------------------------+
+
+
+******************
+Overall Analysis
+******************
+Figure 2 shows the overall decomposition of high availability goals. The high availability of
+VNF Services can be refined to high availability of VNFs, MANO, and the NFVI where VNFs are
+deployed; the high availability of NFVI Service can be refined to high availability of Virtual
+Compute Instances, Virtual Storage and Virtual Network Services; the high availability of
+virtual instance is either the high availability of containers or the high availability of VMs,
+and these high availability goals can be further decomposed by how the NFV environment is
+deployed.
+
+.. figure:: images/fig2_Total_Framework.png
+    :alt: Overall HA Analysis of OPNFV
+    :figclass: align-center
+
+    Fig 2. Overall HA Analysis of OPNFV
+
+Thus the high availability requirement of VNF services can be classified into high availability
+requirements on different layers in OPNFV. The following layers are mainly discussed in this
+document:
+
+- VNF HA
+
+- MANO HA
+
+- Virtual Infrastructure HA (container HA or VM HA)
+
+- VIM HA
+
+- SDN HA
+
+- Hypervisor HA
+
+- Host OS HA
+
+- Hardware HA
+
+The next section gives a detailed analysis of the HA requirements on these layers.
+
+******************
+Detailed Analysis
+******************
+
+VNF HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+MANO HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+Virtual Infrastructure HA
+>>>>>>>>>>>>>>>>>>
+
+The Virtual Infrastructure HA in OPNFV includes container HA and VM HA.
+
+VM HA
+::::::::::::::::::::::::::::::::::::::
+
+This part describes a set of new optional capabilities in which the OpenStack cloud sends messages
+into the guest VMs in order to improve the availability of the hosted VMs.
+
+Table 2 shows the potential faults of VMs and corresponding initial solution capabilities or methods. 
+
+*Table 2. Potential Faults of VMs and the initial solution capabilities*
+
++---------------------------+------------------------------------+--------------------------------------------+
+| Fault                     | Description                        | solution capabilities                      |
++===========================+====================================+============================================+
+| VM faults                 | General internal VM faults         | VM Heartbeating and Health Checking        |
++---------------------------+------------------------------------+--------------------------------------------+
+| VM Server Group faults    | such as split brain                | VM Peer State Notification and Messaging   |
++---------------------------+------------------------------------+--------------------------------------------+
+
+
+.. figure:: images/fig3_VM_HA_Analysis.png
+    :alt: VM HA
+    :figclass: align-center
+
+    Fig 3. VM HA Analysis
+
+NOTE: A Server Group here is the OpenStack Nova Server Group concept where VMs
+are grouped together for purposes of scheduling.  E.g. A specific Server Group
+instance can specify whether the VMs within the group should be scheduled to
+run on the same compute host or different compute hosts.  A 'peer' VM in the
+context of this section refers to a VM within the same Nova Server Group.
+
+The initial set of new capabilities includes enabling the
+detection of and recovery from internal VM faults, and providing
+a simple out-of-band messaging service to prevent scenarios such
+as split brain.
+
+A more detailed description is located in R5_HA_API/OPNFV_HA_Guest_APIs-Overview_HLD.rst in this project.
+
+The Host-to-Guest messaging APIs used by the services discussed
+in this Virtual Infrastructure HA part use a JSON-formatted application messaging layer
+on top of a virtio serial device between QEMU on the OpenStack Host
+and the Guest VM. Use of the virtio serial device provides a
+simple, direct communication channel between host and guest which is
+independent of the Guest's L2/L3 networking.
+
+The upper layer JSON messaging format is actually structured as a
+hierarchical JSON format containing a Base JSON Message Layer and an
+Application JSON Message Layer:
+
+- the Base Layer provides the ability to multiplex different groups of message types on top of a
+  single virtio serial device, for example:
+
+  + heartbeating and healthchecks,
+  + server group messaging,
+
+- the Application Layer provides the specific message types and fields of a particular group of
+  message types.
+
+
+
+A) VM Heartbeating and Health Checking
+
+
+.. figure:: images/fig4_Heartbeating_and_Healthchecks.png
+    :alt: Heartbeating and Healthchecks
+    :figclass: align-center
+
+    Fig 4. Heartbeating and Healthchecks
+    
+VM Heartbeating and Health Checking provides a heartbeat service to enhance
+the monitoring of the health of guest application(s) within a VM running
+under the OpenStack Cloud. Loss of heartbeat or a failed health check status
+will result in a fault event being reported to OPNFV's DOCTOR infrastructure
+for alarm identification, impact analysis and reporting. This would then enable
+VNF Managers (VNFMs) listening to OPNFV's DOCTOR External Alarm Reporting through
+Telemetry's AODH, to initiate any required fault recovery actions.
+
+Guest heartbeat works on a challenge-response model. The OpenStack Guest Heartbeat
+Service on the compute node challenges the registered Guest VM daemon with a
+message each interval. The registered Guest VM daemon must respond prior to the
+next interval with a message indicating good health. If the OpenStack Host does
+not receive a valid response, or if the response specifies that the VM is in ill
+health, then a fault event for the Guest VM is reported to the OpenStack Guest
+Heartbeat Service on the controller node, which reports the event to OPNFV's
+DOCTOR (i.e. through the Doctor SouthBound (SB) APIs).
+
+In summary, the Guest Heartbeating Messaging Specification is quite simple,
+including the following PDUs: Init, Init-Ack, Challenge-Request,
+Challenge-Response, Exit.  The Challenge-Response returns a healthy /
+not-healthy boolean.
+
+The registered Guest VM daemon's response to the challenge can be as simple
+as just immediately responding with OK.  This alone allows for detection of
+a failed or hung QEMU/KVM instance, or a failure of the OS within the VM to
+schedule the registered Guest VM's daemon or failure to route basic IO within
+the Guest VM.
+
+However, the registered Guest VM daemon's response to the challenge can be more
+complex, running anything from a quick simple sanity check of the health of
+applications running in the Guest VM, to a more thorough audit of the
+application state and data.  In either case returning the status of the
+health check enables the OpenStack host to detect and report the event in order
+to initiate recovery from application level errors or failures within the Guest VM.
+
+
+B) VM Peer State Notification and Messaging
+
+
+.. figure:: images/fig5_VM_Peer_State_Notification_and_Messaging.png
+    :alt: VM Peer State Notification and Messaging
+    :figclass: align-center
+
+    Fig 5. VM Peer State Notification and Messaging
+    
+Server Group State Notification and Messaging is a service to provide
+simple low-bandwidth datagram messaging and notifications for servers that
+are part of the same server group.  This messaging channel is available
+regardless of whether IP networking is functional within the server, and
+it requires no knowledge within the server about the other members of the group.
+
+This Server Group Messaging service provides three types of messaging:
+
+- Broadcast: this allows a server to send a datagram (size of up to 3050 bytes)
+  to all other servers within the server group.
+- Notification: this provides servers with information about changes to the
+  (Nova) state of other servers within the server group.
+- Status: this allows a server to query the current (Nova) state of all servers within
+  the server group (including itself).
+
+A Server Group Messaging entity on both the controller node and the compute nodes manages
+the routing of VM-to-VM messages through the platform, leveraging Nova to determine
+Server Group membership and compute node locations of VMs. The Server Group Messaging
+entity on the controller also listens to Nova VM state change notifications and queries
+VM state data from Nova, in order to provide the VM query and notification functionality
+of this service.
+
+This service is not intended for high bandwidth or low-latency operations. It is best-effort, 
+not reliable. Applications should do end-to-end acks and retries if they care about reliability.
+      
+This service provides building block type capabilities for the Guest VMs that
+contribute to higher availability of the VMs in the Guest VM Server Group.  Notifications
+of VM Status changes potentially provide a faster and more accurate notification
+of failed peer VMs than traditional peer VM monitoring over Tenant Networks.  The
+Broadcast Messaging mechanism, in turn, provides an out-of-band messaging mechanism to
+monitor and control a peer VM under fault conditions, e.g. providing the ability to
+avoid potential split brain scenarios between 1:1 VMs when faults in Tenant
+Networking occur.
+
+Container HA
+::::::::::::::::::::::::::::
+
+The container HA in OPNFV mainly focuses on the Kubernetes (K8s) platform. Because the Pod is
+the smallest unit of management, creation, and planning in K8s, container HA in K8s actually
+means the High Availability of running Pods.
+
+Table 3 shows the potential faults of running pods in K8s. When such a fault happens, the
+ReplicationController or ReplicaSet can prevent the services provided by the pod from becoming
+unavailable, as is shown in Figure 6.
+
+*Table 3. Potential Faults of Running Pods in K8s*
+
++------------+--------------+----------------------------------------------------+----------------+
+| Service    | Fault        | Description                                        | Severity       |
++============+==============+====================================================+================+
+|            |              | All Containers in the Pod have terminated, and     |                |
+| Running by | Pod failure  | at least one Container has terminated in failure.  | Critical       |
+| pods       |              | That is, the Container either exited with non-zero |                |
+|            |              | status or was terminated by the system.            |                |
++------------+--------------+----------------------------------------------------+----------------+
+
+.. figure:: images/fig6_Container_HA_analysis_in_K8s.png
+    :alt: Container HA analysis in K8s
+    :figclass: align-center
+
+    Fig 6. Container HA analysis in K8s
+    
+    
+The Replication Controller or ReplicaSet (ReplicaSet is the next-generation Replication Controller)
+is a kind of K8s Master Component, which ensures that a specified number of pod replicas are running
+at any one time.
+
+The following requirements are elicited for Pod HA:
+
+**[Req 5.3.1]** A pod or a homogeneous set of pods is always up and available until terminated properly.
+
+**[Req 5.3.2]** The ReplicationController or ReplicaSet should terminate the extra pods if there are
+more pods than the specified number.
+
+**[Req 5.3.3]** The ReplicationController or ReplicaSet should start more pods if there are fewer pods
+than the specified number.
+
+**[Req 5.3.4]** When a failure of the host or container is detected, the new Pod should be scheduled
+to another Node.
+
+
+  
+VIM HA
+>>>>>>>>>>>>>>>>>>
+
+
+OpenStack High Availability
+::::::::::::::::::::::::::::
+
+The VIM in the NFV reference architecture contains different components of OpenStack, SDN
+controllers and other virtual resource controllers. VIM components can be classified into three
+types:
+
+- **Entry Point Components**: Components that give VIM service interfaces to users, like
+  nova-api and neutron-server.
+
+- **Middlewares**: Components that provide load balancer services, messaging queues, cluster
+  management services, etc.
+
+- **Subcomponents**: Components that implement VIM functions, which are called by Entry Point
+  Components but not by users directly.
+
+Table 4 shows the potential faults that may happen on VIM layer. Currently the main focus of
+VIM HA is the service crash of VIM components, which may occur on all types of VIM components.
+To prevent VIM services from being unavailable, Active/Active Redundancy, Active/Passive
+Redundancy and Message Queue are used for different types of VIM components, as is shown in
+figure 7.
+
+*Table 4. Potential Faults in VIM level*
+
++------------+------------------+-------------------------------------------------+----------------+
+| Service    | Fault            | Description                                     | Severity       |
++============+==================+=================================================+================+
+| General    | Service Crash    | The processes of a service crashed abnormally.  | Critical       |
++------------+------------------+-------------------------------------------------+----------------+
+
+.. figure:: images/VIM_Analysis.png
+    :alt: VIM HA Analysis
+    :figclass: align-center
+
+    Fig 7. VIM HA Analysis
+
+
+A) Active/Active Redundancy
+
+Active/Active Redundancy manages both the main and redundant systems concurrently. If a
+failure happens on a component, the backups are already online and users are unlikely to
+notice that the failed VIM component is being repaired. A typical Active/Active Redundancy
+has redundant instances, and these instances are load balanced via a virtual IP address and a
+load balancer such as HAProxy.
+
+When one of the redundant VIM components fails, the load balancer should be aware of the
+instance failure, and then isolate the failed instance from being called until it is recovered.
+The requirement decomposition of Active/Active Redundancy is shown in Figure 8.
+
+.. figure:: images/fig8_Active_Active_Redundancy.png
+    :alt: Active/Active Redundancy Requirement Decomposition
+    :figclass: align-center
+
+    Fig 8. Active/Active Redundancy Requirement Decomposition
+
+The following requirements are elicited for VIM Active/Active Redundancy:
+
+**[Req 5.4.1]** Redundant VIM components should be load balanced by a load balancer.
+
+**[Req 5.4.2]** The load balancer should check the health status of VIM component instances.
+
+**[Req 5.4.3]** The load balancer should isolate the failed VIM component instance until it is
+recovered.
+
+**[Req 5.4.4]** The alarm information of VIM component failure should be reported.
+
+**[Req 5.4.5]** Failed VIM component instances should be recovered by a cluster manager.
+
+Table 5 shows the current VIM components using Active/Active Redundancy and the corresponding
+HA test cases to verify them.
+
+*Table 5. VIM Components using Active/Active Redundancy*
+
++-------------------+-------------------------------------------------------+----------------------+
+| Component         | Description                                           | Related HA Test Case |
++===================+=======================================================+======================+
+| nova-api          | endpoint component of Openstack Compute Service Nova  | yardstick_tc019      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-novncproxy   | server daemon that serves the Nova noVNC Websocket    |                      |
+|                   | Proxy service, which provides a websocket proxy that  |                      |
+|                   | is compatible with OpenStack Nova noVNC consoles.     |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| neutron-server    | endpoint component of OpenStack Networking Service    | yardstick_tc045      |
+|                   | Neutron                                               |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| keystone          | component of OpenStack Identity Service               | yardstick_tc046      |
+|                   | Keystone                                              |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| glance-api        | endpoint component of Openstack Image Service Glance  | yardstick_tc047      |
++-------------------+-------------------------------------------------------+----------------------+
+| glance-registry   | server daemon that serves image metadata through a    |                      |
+|                   | REST-like API.                                        |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| cinder-api        | endpoint component of Openstack Block Storage Service | yardstick_tc048      |
+|                   | Cinder                                                |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| swift-proxy       | endpoint component of Openstack Object Storage        | yardstick_tc049      |
+|                   | Swift                                                 |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| horizon           | component of Openstack Dashboard Service Horizon      |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| heat-api          | endpoint component of Openstack Stack Service Heat    | yardstick_tc091      |
++-------------------+-------------------------------------------------------+----------------------+
+| mysqld            | database service of VIM components                    | yardstick_tc090      |
++-------------------+-------------------------------------------------------+----------------------+
+
+B) Active/Passive Redundancy
+
+
+Active/Passive Redundancy maintains a redundant instance that can be brought online when the
+active service fails. A typical Active/Passive Redundancy maintains replacement resources that
+can be brought online when required. Requests are handled using a virtual IP address (VIP) that
+facilitates returning to service with minimal reconfiguration. A cluster manager (such as
+Pacemaker or Corosync) monitors these components, bringing the backup online as necessary.
+
+When the main instance of a VIM component fails, the cluster manager should be aware of the
+failure and bring the backup instance online. The failed instance should then be recovered
+to serve as another backup instance. The requirement decomposition of Active/Passive Redundancy
+is shown in Figure 9.
+
+.. figure:: images/fig9_Active_Passive_Redundancy.png
+    :alt: Active/Passive Redundancy Requirement Decomposition
+    :figclass: align-center
+
+    Fig 9. Active/Passive Redundancy Requirement Decomposition
+
+The following requirements are elicited for VIM Active/Passive Redundancy:
+
+**[Req 5.4.6]** The cluster manager should replace the failed main VIM component instance with
+a backup instance.
+
+**[Req 5.4.7]** The cluster manager should check the health status of VIM component instances.
+
+**[Req 5.4.8]** Failed VIM component instances should be recovered by the cluster manager.
+
+**[Req 5.4.9]** The alarm information of VIM component failure should be reported.
+
+
+Table 6 shows the current VIM components using Active/Passive Redundancy and the corresponding
+HA test cases to verify them.
+
+*Table 6. VIM Components using Active/Passive Redundancy*
+
++-------------------+-------------------------------------------------------+----------------------+
+| Component         | Description                                           | Related HA Test Case |
++===================+=======================================================+======================+
+| haproxy           | load balancer component of VIM components             | yardstick_tc053      |
++-------------------+-------------------------------------------------------+----------------------+
+| rabbitmq-server   | messaging queue service of VIM components             | yardstick_tc056      |
++-------------------+-------------------------------------------------------+----------------------+
+| corosync          | cluster management component of VIM components        | yardstick_tc057      |
++-------------------+-------------------------------------------------------+----------------------+
+
+C) Message Queue
+
+Message Queue provides an asynchronous communication protocol. In OpenStack, some projects
+(like Nova and Cinder) use Message Queue to call their sub components. Although Message Queue
+itself is not an HA mechanism, the way it works ensures high availability when redundant
+components subscribe to the Message Queue. When a VIM sub component fails, requests can still
+be processed because other redundant components are subscribed to the Message Queue.
+Fault isolation is also achieved, since failed components do not actively fetch requests.
+The recovery of failed components is still required. Figure 10 shows the requirement
+decomposition of Message Queue.
+
+.. figure:: images/fig10_Message_Queue.png
+    :alt: Message Queue Requirement Decomposition
+    :figclass: align-center
+
+    Fig 10. Message Queue Redundancy Requirement Decomposition
+
+The following requirements are elicited for Message Queue:
+
+**[Req 5.4.10]** Redundant component instances should subscribe to the Message Queue, which is
+implemented by the installer.
+
+**[Req 5.4.11]** Failed VIM component instances should be recovered by the cluster manager.
+
+**[Req 5.4.12]** The alarm information of VIM component failure should be reported.
+
+Table 7 shows the current VIM components using Message Queue and the corresponding HA test cases
+to verify them.
+
+*Table 7. VIM Components using Message Queue*
+
++-------------------+-------------------------------------------------------+----------------------+
+| Component         | Description                                           | Related HA Test Case |
++===================+=======================================================+======================+
+| nova-scheduler    | Openstack compute component determines how to         | yardstick_tc088      |
+|                   | dispatch compute requests                             |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-cert         | Openstack compute component that serves the Nova Cert |                      |
+|                   | service for X509 certificates. Used to generate       |                      |
+|                   | certificates for euca-bundle-image.                   |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-conductor    | server daemon that serves the Nova Conductor service, | yardstick_tc089      |
+|                   | which provides coordination and database query        |                      |
+|                   | support for Nova.                                     |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-compute      | Handles all processes relating to instances (guest    |                      |
+|                   | vms). nova-compute is responsible for building a disk |                      |
+|                   | image, launching it via the underlying virtualization |                      |
+|                   | driver, responding to calls to check its state,       |                      |
+|                   | attaching persistent storage, and terminating it.     |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-consoleauth  | Openstack compute component for Authentication of     |                      |
+|                   | nova consoles.                                        |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| cinder-scheduler  | Openstack volume storage component decides on         |                      |
+|                   | placement for newly created volumes and forwards the  |                      |
+|                   | request to cinder-volume.                             |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| cinder-volume     | Openstack volume storage component receives volume    |                      |
+|                   | management requests from cinder-api and               |                      |
+|                   | cinder-scheduler, and routes them to storage backends |                      |
+|                   | using vendor-supplied drivers.                        |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| heat-engine       | Openstack Heat project server with an internal RPC    |                      |
+|                   | api called by the heat-api server.                    |                      |
++-------------------+-------------------------------------------------------+----------------------+
+
+
+VIM HA in K8s
+::::::::::::::::::::::::::::
+
+The VIM HA in K8s can be generally analyzed from the following two concepts:
+
+- **Master Components HA**: the HA of the K8s components running on the Master (for example,
+  Kube-apiserver, Kube-scheduler, Kube-controller-manager).
+
+- **Data Storage HA**: the HA of the etcd cluster. etcd is itself a master component, used as
+  Kubernetes' backing store for all cluster data. Because etcd is the only stateful service
+  in K8s and its HA policy can be deployed independently of K8s, the HA of etcd is discussed
+  separately.
+
+Table 8 shows the potential faults that may happen in K8s.
+
+*Table 8. Potential Faults in K8s*
+
++--------------------+------------------+----------------------------------------+----------------+
+| Service            | Fault            | Description                            | Severity       |
++====================+==================+========================================+================+
+| Provided by Master | Master           | A Master component crashed and can't   | Critical       |
+| Components         | Component crash  | provide normal service.                |                |
++--------------------+------------------+----------------------------------------+----------------+
+| Data storage       | Etcd Crash       | The Etcd cluster crashed abnormally.   | Critical       |
++--------------------+------------------+----------------------------------------+----------------+
+
+
+.. figure:: images/fig11_VIM_HA_analysis_in_K8s.png
+    :alt: VIM HA analysis in K8s
+    :figclass: align-center
+
+    Fig 11. VIM HA analysis in K8s
+    
+Master components can be run on any machine in the cluster. However, for simplicity, all master
+components are typically started on the same machine, and no user containers run on this machine.
+In this case, K8s is based on a single Master, and the only HA available is container HA at the
+application layer, realized by the ReplicationController or ReplicaSet as described in the
+Container HA part above.
+
+The HA of the Master and its components in K8s therefore depends on a multi-master setup.
+
+The Data Storage HA can be realized either by using an existing external Etcd HA cluster, or by
+running Etcd as a master component in a multi-master deployment.
+
+.. figure:: images/fig12_VIM_HA_analysis_in_K8s_2.png
+    :alt: VIM HA analysis in K8s (2)
+    :figclass: align-center
+
+    Fig 12. VIM HA analysis in K8s(2)
+
+In Multi-Master K8s, the Master Components HA is mainly based on the Leader Election function of the
+Etcd cluster, and a load balancer is used to realize the HA of the Kube-apiserver Master component.
+
+The following requirements are elicited for Master components HA:
+
+**[Req 5.4.13]** The Load Balancer should always forward the request to an available Kube-apiserver 
+instance. 
+
+**[Req 5.4.14]** The Master Component in the Leader state should confirm its Leader state to all
+follower Components regularly through heartbeats.
+
+**[Req 5.4.15]** When a Master Component in the Leader state crashes, an available Master Component
+should be elected as Leader.
+
+Hypervisor HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+Host OS HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+Hardware HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+
+******************
+References
+******************
+
+- A KAOS Tutorial: http://www.objectiver.com/fileadmin/download/documents/KaosTutorial.pdf
+
+- ETSI GS NFV-REL 001 V1.1.1(2015-01):
+  http://www.etsi.org/deliver/etsi_gs/NFV-REL/001_099/001/01.01.01_60/gs_NFV-REL001v010101p.pdf
+
+- Openstack High Availability Guide: https://docs.openstack.org/ha-guide/
+
+- Highly Available (Mirrored) Queues: https://www.rabbitmq.com/ha.html
diff --git a/docs/development/overview/HA_Analysis-H.rst b/docs/development/overview/HA_Analysis-H.rst
new file mode 100644
index 0000000..a27a8bd
--- /dev/null
+++ b/docs/development/overview/HA_Analysis-H.rst
@@ -0,0 +1,713 @@
+.. image:: opnfv-logo.png
+  :height: 40
+  :width: 200
+  :alt: OPNFV
+  :align: left
+
+
+******************
+Introduction
+******************
+This High Availability Requirement Analysis Document is used for eliciting the High Availability
+requirements of OPNFV. The document refines high-level High Availability goals into a
+detailed HA mechanism design. These HA mechanisms are related to potential failures on
+different layers of OPNFV. Moreover, this document can be used as a reference for designing HA
+testing scenarios.
+KAOS, a requirements engineering model, is used in this document.
+
+******************
+Terminologies and Symbols
+******************
+The following concepts in KAOS will be used in the diagrams of this document.
+
+- **Goal**: The objective to be met by the target system.
+
+- **Obstacle**: Condition whose satisfaction may prevent some goals from being achieved.
+
+- **Agent**: Active Object performing operations to achieve goals.
+
+- **Requirement**: Goal assigned to an agent of the software being studied.
+
+- **Domain Property**: Descriptive assertion about objects in the environment of the software.
+
+- **Refinement**: Relationship linking a goal to other goals that are called its subgoals.
+  Each subgoal contributes to the satisfaction of the goal it refines. There are two types of
+  refinements: AND refinement and OR refinement, which indicate whether the goal is achieved by
+  satisfying all of its subgoals or any one of its subgoals.
+
+- **Conflict**: Relationship linking an obstacle to a goal if the obstacle obstructs the goal
+  from being satisfied.
+
+- **Resolution**: Relationship linking a goal to an obstacle if the goal can resolve the
+  obstacle.
+
+- **Responsibility**: Relationship between an agent and a requirement. Holds when an agent is
+  assigned the responsibility of achieving the linked requirement.
+
+Figure 1 shows how these concepts are displayed in a KAOS diagram.
+
+.. figure:: images/fig1_KAOS_Sample.png
+    :alt: KAOS Sample
+    :figclass: align-center
+
+    Fig 1. A KAOS Sample Diagram
+
+******************
+High Availability Goals of OPNFV
+******************
+
+Overall Goals
+>>>>>>>>>>>>>>>>>>
+
+The final goal of OPNFV High Availability is to provide highly available VNF services. The
+following objectives must be met:
+
+- There should be no single point of failure in the NFV framework.
+
+- All resiliency mechanisms shall be designed for a multi-vendor environment, where for example
+  the NFVI, NFV-MANO, and VNFs may be supplied by different vendors.
+
+- Resiliency related information shall always be explicitly specified and communicated using
+  the reference interfaces (including policies/templates) of the NFV framework.
+
+
+
+Service Level Agreements of OPNFV HA
+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Service Level Agreements of OPNFV HA mainly focus on time constraints for service outage,
+failure detection, and failure recovery. Table 1 outlines the SLA metrics, expressed as time
+constraints, for the different Service Availability Levels (SALs) described in
+ETSI GS NFV-REL 001 V1.1.1 (2015-01). In this document, SAL1 is the default benchmark level
+that must be met.
+
+*Table 1. Time Constraints for Different Service Availability Levels*
+
++--------------------------------+----------------------------+------------------------+
+| Service Availability Level     | Failure Detection Time     | Failure Recovery Time  |
++================================+============================+========================+
+| SAL1                           | <1s                        | 5-6s                   |
++--------------------------------+----------------------------+------------------------+
+| SAL2                           | <5s                        | 10-15s                 |
++--------------------------------+----------------------------+------------------------+
+| SAL3                           | <10s                       | 20-25s                 |
++--------------------------------+----------------------------+------------------------+
+
+
+******************
+Overall Analysis
+******************
+Figure 2 shows the overall decomposition of high availability goals. The high availability of
+VNF Services can be refined to high availability of VNFs, MANO, and the NFVI where VNFs are
+deployed; the high availability of NFVI Service can be refined to high availability of Virtual
+Compute Instances, Virtual Storage and Virtual Network Services; the high availability of
+virtual instance is either the high availability of containers or the high availability of VMs,
+and these high availability goals can be further decomposed by how the NFV environment is
+deployed.
+
+.. figure:: images/fig2_Total_Framework.png
+    :alt: Overall HA Analysis of OPNFV
+    :figclass: align-center
+
+    Fig 2. Overall HA Analysis of OPNFV
+
+Thus the high availability requirement of VNF services can be classified into high availability
+requirements on different layers in OPNFV. The following layers are mainly discussed in this
+document:
+
+- VNF HA
+
+- MANO HA
+
+- Virtual Infrastructure HA (container HA or VM HA)
+
+- VIM HA
+
+- SDN HA
+
+- Hypervisor HA
+
+- Host OS HA
+
+- Hardware HA
+
+The next section will illustrate detailed analysis of HA requirements on these layers.
+
+******************
+Detailed Analysis
+******************
+
+VNF HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+MANO HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+Virtual Infrastructure HA
+>>>>>>>>>>>>>>>>>>
+
+The Virtual Infrastructure HA in OPNFV includes container HA and VM HA.
+
+VM HA
+::::::::::::::::::::::::::::::::::::::
+
+This part describes a set of new optional capabilities in which the OpenStack Cloud exchanges messages
+with the Guest VMs in order to improve the availability of the hosted VMs.
+
+Table 2 shows the potential faults of VMs and corresponding initial solution capabilities or methods. 
+
+*Table 2. Potential Faults of VMs and the initial solution capabilities*
+
++---------------------------+------------------------------------+--------------------------------------------+
+| Fault                     | Description                        | Solution capabilities                      |
++===========================+====================================+============================================+
+| VM faults                 | General internal VM faults         | VM Heartbeating and Health Checking        |
++---------------------------+------------------------------------+--------------------------------------------+
+| VM Server Group faults    | Such as split brain                | VM Peer State Notification and Messaging   |
++---------------------------+------------------------------------+--------------------------------------------+
+
+
+.. figure:: images/fig3_VM_HA_Analysis.png
+    :alt: VM HA
+    :figclass: align-center
+
+    Fig 3. VM HA Analysis
+
+NOTE: A Server Group here is the OpenStack Nova Server Group concept where VMs
+are grouped together for purposes of scheduling.  E.g. A specific Server Group
+instance can specify whether the VMs within the group should be scheduled to
+run on the same compute host or different compute hosts.  A 'peer' VM in the
+context of this section refers to a VM within the same Nova Server Group.
+
+The initial set of new capabilities includes: enabling the
+detection of and recovery from internal VM faults and providing
+a simple out-of-band messaging service to prevent scenarios such
+as split brain.
+
+A more detailed description is located in R5_HA_API/OPNFV_HA_Guest_APIs-Overview_HLD.rst in this project.
+
+The Host-to-Guest messaging APIs used by the services discussed
+in this Virtual Infrastructure HA part use a JSON-formatted application messaging layer
+on top of a virtio serial device between QEMU on the OpenStack Host
+and the Guest VM. Use of the virtio serial device provides a
+simple, direct communication channel between host and guest which is
+independent of the Guest's L2/L3 networking.
+
+The upper layer JSON messaging format is actually structured as a
+hierarchical JSON format containing a Base JSON Message Layer and an
+Application JSON Message Layer:
+
+- the Base Layer provides the ability to multiplex different groups of message types on top of
+  a single virtio serial device, e.g.
+
+  + heartbeating and healthchecks,
+  + server group messaging,
+
+  and
+
+- the Application Layer provides the specific message types and fields of a particular group of message types.
+
+
+A) VM Heartbeating and Health Checking
+
+
+.. figure:: images/fig4_Heartbeating_and_Healthchecks.png
+    :alt: Heartbeating and Healthchecks
+    :figclass: align-center
+
+    Fig 4. Heartbeating and Healthchecks
+    
+VM Heartbeating and Health Checking provides a heartbeat service to enhance
+the monitoring of the health of guest application(s) within a VM running
+under the OpenStack Cloud. Loss of heartbeat or a failed health check status
+will result in a fault event being reported to OPNFV's DOCTOR infrastructure
+for alarm identification, impact analysis and reporting. This would then enable
+VNF Managers (VNFMs) listening to OPNFV's DOCTOR External Alarm Reporting through
+Telemetry's AODH, to initiate any required fault recovery actions.
+
+Guest heartbeat works on a challenge response model. The OpenStack Guest Heartbeat 
+Service on the compute node will challenge the registered Guest VM daemon with a 
+message each interval. The registered Guest VM daemon must respond prior to the 
+next interval with a message indicating good health. If the OpenStack Host does 
+not receive a valid response, or if the response specifies that the VM is in ill 
+health, then a fault event for the Guest VM is reported to the OpenStack Guest 
+Heartbeat Service on the controller node which will report the event to OPNFV's 
+DOCTOR (i.e. through the Doctor SouthBound (SB) APIs).
+
+In summary, the Guest Heartbeating Messaging Specification is quite simple,
+including the following PDUs: Init, Init-Ack, Challenge-Request,
+Challenge-Response, Exit.  The Challenge-Response returns a healthy /
+not-healthy boolean.
+
+The registered Guest VM daemon's response to the challenge can be as simple
+as just immediately responding with OK.  This alone allows for detection of
+a failed or hung QEMU/KVM instance, or a failure of the OS within the VM to
+schedule the registered Guest VM's daemon or failure to route basic IO within
+the Guest VM.
+
+However, the registered Guest VM daemon's response to the challenge can be more
+complex, running anything from a quick simple sanity check of the health of
+applications running in the Guest VM, to a more thorough audit of the
+application state and data.  In either case returning the status of the
+health check enables the OpenStack host to detect and report the event in order
+to initiate recovery from application level errors or failures within the Guest VM.
+
+
+B) VM Peer State Notification and Messaging
+
+
+.. figure:: images/fig5_VM_Peer_State_Notification_and_Messaging.png
+    :alt: VM Peer State Notification and Messaging
+    :figclass: align-center
+
+    Fig 5. VM Peer State Notification and Messaging
+    
+Server Group State Notification and Messaging is a service to provide
+simple low-bandwidth datagram messaging and notifications for servers that
+are part of the same server group.  This messaging channel is available
+regardless of whether IP networking is functional within the server, and
+it requires no knowledge within the server about the other members of the group.
+
+This Server Group Messaging service provides three types of messaging:
+
+- Broadcast: this allows a server to send a datagram (size of up to 3050 bytes)
+  to all other servers within the server group.
+- Notification: this provides servers with information about changes to the
+  (Nova) state of other servers within the server group.
+- Status: this allows a server to query the current (Nova) state of all servers within
+  the server group (including itself).
+
+A Server Group Messaging entity on both the controller node and the compute nodes manages
+the routing of VM-to-VM messages through the platform, leveraging Nova to determine
+Server Group membership and compute node locations of VMs. The Server Group Messaging
+entity on the controller also listens to Nova VM state change notifications and queries
+VM state data from Nova, in order to provide the VM query and notification functionality
+of this service.
+
+This service is not intended for high bandwidth or low-latency operations. It is best-effort, 
+not reliable. Applications should do end-to-end acks and retries if they care about reliability.
+      
+This service provides building block type capabilities for the Guest VMs that
+contribute to higher availability of the VMs in the Guest VM Server Group.  Notifications
+of VM Status changes potentially provide a faster and more accurate notification
+of failed peer VMs than traditional peer VM monitoring over Tenant Networks.  The
+Broadcast Messaging mechanism, in turn, provides an out-of-band messaging mechanism to
+monitor and control a peer VM under fault conditions, e.g. providing the ability to
+avoid potential split brain scenarios between 1:1 VMs when faults in Tenant
+Networking occur.
+
+Container HA
+::::::::::::::::::::::::::::
+
+Container HA in OPNFV mainly focuses on the Kubernetes (K8s) platform. Because the Pod is
+the smallest unit of management, creation, and planning in K8s, container HA in K8s actually means
+the High Availability of running Pods.
+
+Table 3 shows the potential faults of running pods in K8s. When such a fault happens, the
+ReplicationController or ReplicaSet can prevent the services provided by the pod from becoming
+unavailable, as is shown in Figure 6.
+
+*Table 3. Potential Faults of Running Pods in K8s*
+
++------------+--------------+----------------------------------------------------+----------------+
+| Service    | Fault        | Description                                        | Severity       |
++============+==============+====================================================+================+
+|            |              | All Containers in the Pod have terminated, and     |                |
+| Running by | Pod failure  | at least one Container has terminated in failure.  | Critical       |
+| pods       |              | That is, the Container either exited with non-zero |                |
+|            |              | status or was terminated by the system.            |                |
++------------+--------------+----------------------------------------------------+----------------+
+
+.. figure:: images/fig6_Container_HA_analysis_in_K8s.png
+    :alt: Container HA analysis in K8s
+    :figclass: align-center
+
+    Fig 6. Container HA analysis in K8s
+    
+    
+The Replication Controller or ReplicaSet (ReplicaSet is the next-generation Replication Controller)
+is a K8s Master Component that ensures that a specified number of pod replicas are running
+at any one time.
+
+The following requirements are elicited for Pod HA:
+
+**[Req 5.3.1]** A pod or a homogeneous set of pods is always up and available until terminated properly.
+
+**[Req 5.3.2]** The ReplicationController or ReplicaSet should terminate the extra pods if there are
+more pods than the specified number.
+
+**[Req 5.3.3]** The ReplicationController or ReplicaSet should start more pods if there are fewer pods
+than the specified number.
+
+**[Req 5.3.4]** A new Pod should be scheduled to another Node if a failure of the
+host or container is detected.
+
+
+  
+VIM HA
+>>>>>>>>>>>>>>>>>>
+
+
+OpenStack High Availability
+::::::::::::::::::::::::::::
+
+The VIM in the NFV reference architecture contains different components of Openstack, SDN
+controllers and other virtual resource controllers. VIM components can be classified into three
+types:
+
+- **Entry Point Components**: Components that give VIM service interfaces to users, like
+  nova-api, neutron-server.
+
+- **Middlewares**: Components that provide load balancer services, messaging queues, cluster
+  management services, etc.
+
+- **Subcomponents**: Components that implement VIM functions, which are called by Entry Point
+  Components but not by users directly.
+
+Table 4 shows the potential faults that may happen on VIM layer. Currently the main focus of
+VIM HA is the service crash of VIM components, which may occur on all types of VIM components.
+To prevent VIM services from being unavailable, Active/Active Redundancy, Active/Passive
+Redundancy and Message Queue are used for different types of VIM components, as is shown in
+figure 7.
+
+*Table 4. Potential Faults in VIM level*
+
++------------+------------------+-------------------------------------------------+----------------+
+| Service    | Fault            | Description                                     | Severity       |
++============+==================+=================================================+================+
+| General    | Service Crash    | The processes of a service crashed abnormally.  | Critical       |
++------------+------------------+-------------------------------------------------+----------------+
+
+.. figure:: images/fig7_VIM_Analysis.png
+    :alt: VIM HA Analysis
+    :figclass: align-center
+
+    Fig 7. VIM HA Analysis
+
+
+A) Active/Active Redundancy
+
+Active/Active Redundancy manages both the main and redundant systems concurrently. If a
+failure happens on a component, the backups are already online and users are unlikely to
+notice that the failed VIM component is being repaired. A typical Active/Active Redundancy setup
+has redundant instances, and these instances are load balanced via a virtual IP address and a
+load balancer such as HAProxy.
+
+When one of the redundant VIM component instances fails, the load balancer should be aware of the
+instance failure, and then isolate the failed instance from being called until it is recovered.
+The requirement decomposition of Active/Active Redundancy is shown in Figure 8.
+
+.. figure:: images/fig8_Active_Active_Redundancy.png
+    :alt: Active/Active Redundancy Requirement Decomposition
+    :figclass: align-center
+
+    Fig 8. Active/Active Redundancy Requirement Decomposition
+
+The following requirements are elicited for VIM Active/Active Redundancy:
+
+**[Req 5.4.1]** Redundant VIM components should be load balanced by a load balancer.
+
+**[Req 5.4.2]** The load balancer should check the health status of VIM component instances.
+
+**[Req 5.4.3]** The load balancer should isolate the failed VIM component instance until it is
+recovered.
+
+**[Req 5.4.4]** The alarm information of VIM component failure should be reported.
+
+**[Req 5.4.5]** Failed VIM component instances should be recovered by a cluster manager.
+
+Table 5 shows the current VIM components using Active/Active Redundancy and the corresponding
+HA test cases to verify them.
+
+*Table 5. VIM Components using Active/Active Redundancy*
+
++-------------------+-------------------------------------------------------+----------------------+
+| Component         | Description                                           | Related HA Test Case |
++===================+=======================================================+======================+
+| nova-api          | endpoint component of Openstack Compute Service Nova  | yardstick_tc019      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-novncproxy   | server daemon that serves the Nova noVNC Websocket    |                      |
+|                   | Proxy service, which provides a websocket proxy that  |                      |
+|                   | is compatible with OpenStack Nova noVNC consoles.     |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| neutron-server    | endpoint component of Openstack Networking Service    | yardstick_tc045      |
+|                   | Neutron                                               |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| keystone          | component of Openstack Identity Service               | yardstick_tc046      |
+|                   | Keystone                                              |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| glance-api        | endpoint component of Openstack Image Service Glance  | yardstick_tc047      |
++-------------------+-------------------------------------------------------+----------------------+
+| glance-registry   | server daemon that serves image metadata through a    |                      |
+|                   | REST-like API.                                        |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| cinder-api        | endpoint component of Openstack Block Storage Service | yardstick_tc048      |
+|                   | Cinder                                                |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| swift-proxy       | endpoint component of Openstack Object Storage        | yardstick_tc049      |
+|                   | Swift                                                 |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| horizon           | component of Openstack Dashboard Service Horizon      |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| heat-api          | endpoint component of Openstack Stack Service Heat    | yardstick_tc091      |
++-------------------+-------------------------------------------------------+----------------------+
+| mysqld            | database service of VIM components                    | yardstick_tc090      |
++-------------------+-------------------------------------------------------+----------------------+
+
+B) Active/Passive Redundancy
+
+
+Active/Passive Redundancy maintains a redundant instance that can be brought online when the
+active service fails. A typical Active/Passive Redundancy maintains replacement resources that
+can be brought online when required. Requests are handled using a virtual IP address (VIP) that
+facilitates returning to service with minimal reconfiguration. A cluster manager (such as
+Pacemaker or Corosync) monitors these components, bringing the backup online as necessary.
+
+When the main instance of a VIM component fails, the cluster manager should be aware of the
+failure and switch the backup instance online. The failed instance should then be recovered
+to serve as another backup instance. The requirement decomposition of Active/Passive Redundancy
+is shown in Figure 9.
+
+.. figure:: images/fig9_Active_Passive_Redundancy.png
+    :alt: Active/Passive Redundancy Requirement Decomposition
+    :figclass: align-center
+
+    Fig 9. Active/Passive Redundancy Requirement Decomposition
+
+The following requirements are elicited for VIM Active/Passive Redundancy:
+
+**[Req 5.4.6]** The cluster manager should replace the failed main VIM component instance with
+a backup instance.
+
+**[Req 5.4.7]** The cluster manager should check the health status of VIM component instances.
+
+**[Req 5.4.8]** Failed VIM component instances should be recovered by the cluster manager.
+
+**[Req 5.4.9]** The alarm information of VIM component failure should be reported.
+
+
+Table 6 shows the current VIM components using Active/Passive Redundancy and the corresponding
+HA test cases to verify them.
+
+*Table 6. VIM Components using Active/Passive Redundancy*
+
++-------------------+-------------------------------------------------------+----------------------+
+| Component         | Description                                           | Related HA Test Case |
++===================+=======================================================+======================+
+| haproxy           | load balancer component of VIM components             | yardstick_tc053      |
++-------------------+-------------------------------------------------------+----------------------+
+| rabbitmq-server   | messaging queue service of VIM components             | yardstick_tc056      |
++-------------------+-------------------------------------------------------+----------------------+
+| corosync          | cluster management component of VIM components        | yardstick_tc057      |
++-------------------+-------------------------------------------------------+----------------------+
+
+C) Message Queue
+
+Message Queue provides an asynchronous communication protocol. In OpenStack, some projects
+(like Nova and Cinder) use Message Queue to call their subcomponents. Although Message Queue
+itself is not an HA mechanism, the way it works ensures high availability when redundant
+components subscribe to the Message Queue. When a VIM subcomponent fails, requests can still be
+processed because other redundant components are subscribed to the Message Queue.
+Fault isolation is also achieved, since failed components won't fetch requests actively.
+In addition, the recovery of failed components is required. Figure 10 shows the requirement
+decomposition of Message Queue.
+
+.. figure:: images/fig10_Message_Queue.png
+    :alt: Message Queue Requirement Decomposition
+    :figclass: align-center
+
+    Fig 10. Message Queue Redundancy Requirement Decomposition
+
+The following requirements are elicited for Message Queue:
+
+**[Req 5.4.10]** Redundant component instances should subscribe to the Message Queue, which is
+implemented by the installer.
+
+**[Req 5.4.11]** Failed VIM component instances should be recovered by the cluster manager.
+
+**[Req 5.4.12]** The alarm information of VIM component failure should be reported.
+
+Table 7 shows the current VIM components using Message Queue and the corresponding HA test cases
+to verify them.
+
+*Table 7. VIM Components using Message Queue*
+
++-------------------+-------------------------------------------------------+----------------------+
+| Component         | Description                                           | Related HA Test Case |
++===================+=======================================================+======================+
+| nova-scheduler    | Openstack compute component determines how to         | yardstick_tc088      |
+|                   | dispatch compute requests                             |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-cert         | Openstack compute component that serves the Nova Cert |                      |
+|                   | service for X509 certificates. Used to generate       |                      |
+|                   | certificates for euca-bundle-image.                   |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-conductor    | server daemon that serves the Nova Conductor service, | yardstick_tc089      |
+|                   | which provides coordination and database query        |                      |
+|                   | support for Nova.                                     |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-compute      | Handles all processes relating to instances (guest    |                      |
+|                   | vms). nova-compute is responsible for building a disk |                      |
+|                   | image, launching it via the underlying virtualization |                      |
+|                   | driver, responding to calls to check its state,       |                      |
+|                   | attaching persistent storage, and terminating it.     |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-consoleauth  | Openstack compute component for Authentication of     |                      |
+|                   | nova consoles.                                        |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| cinder-scheduler  | Openstack volume storage component decides on         |                      |
+|                   | placement for newly created volumes and forwards the  |                      |
+|                   | request to cinder-volume.                             |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| cinder-volume     | Openstack volume storage component receives volume    |                      |
+|                   | management requests from cinder-api and               |                      |
+|                   | cinder-scheduler, and routes them to storage backends |                      |
+|                   | using vendor-supplied drivers.                        |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| heat-engine       | Openstack Heat project server with an internal RPC    |                      |
+|                   | api called by the heat-api server.                    |                      |
++-------------------+-------------------------------------------------------+----------------------+
+
+
+VIM HA in K8s
+::::::::::::::::::::::::::::
+
+The VIM HA in K8s can be generally analyzed from the following two concepts:
+
+- **Master Components HA**: the HA of the K8s components running on the Master (for example,
+  kube-apiserver, kube-scheduler, kube-controller-manager).
+
+- **Data Storage HA**: the HA of the etcd cluster. etcd is itself a master component, used as
+  Kubernetes' backing store for all cluster data. Because etcd is the only stateful service
+  in K8s and its HA policy can be deployed independently of K8s, the HA of etcd is discussed
+  separately.
+
+Table 8 shows the potential faults that may happen in K8s.
+
+*Table 8. Potential Faults in K8s*
+
++--------------------+------------------+----------------------------------------+----------------+
+| Service            | Fault            | Description                            | Severity       |
++====================+==================+========================================+================+
+| Provided by Master | Master           | A Master component crashed and can't   | Critical       |
+| Components         | Component crash  | provide normal service.                |                |
++--------------------+------------------+----------------------------------------+----------------+
+| Data storage       | Etcd Crash       | The Etcd cluster crashed abnormally.   | Critical       |
++--------------------+------------------+----------------------------------------+----------------+
+
+
+.. figure:: images/fig11_VIM_HA_analysis_in_K8s.png
+    :alt: VIM HA analysis in K8s
+    :figclass: align-center
+
+    Fig 11. VIM HA analysis in K8s
+    
+Master components can run on any machine in the cluster. However, for simplicity, all master
+components are typically started on the same machine, and user containers are not run on that
+machine. In this case, K8s is based on a single Master and only provides container HA at the
+application layer, realized by ReplicationController or ReplicaSet as mentioned in the container
+HA part above.
+
+The HA of the Master and its components in K8s depends on a multi-master setup.
+
+The Data Storage HA can be realized either by using an existing etcd HA cluster, or by running
+etcd as a master component in a multi-master deployment.
+
+.. figure:: images/fig12_VIM_HA_analysis_in_K8s_2.png
+    :alt: VIM HA analysis in K8s (2)
+    :figclass: align-center
+
+    Fig 12. VIM HA analysis in K8s(2)
+
+In Multi-Master K8s, the Master Components HA is mainly based on the leader election function of
+the etcd cluster, and a load balancer is used to realize the HA of the kube-apiserver component.
+
+The following requirements are elicited for Master components HA:
+
+**[Req 5.4.13]** The Load Balancer should always forward requests to an available kube-apiserver
+instance.
+
+**[Req 5.4.14]** The Master Component in the Leader state should regularly confirm its Leader
+state to all follower Components through heartbeat.
+
+**[Req 5.4.15]** When a Master Component in the Leader state crashes, an available Master
+Component should be elected as Leader.
+
+Hypervisor HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+Host OS HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+Hardware HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+********************************
+Large scale deployment HA
+********************************
+
+In a large scale deployment, it is difficult to predict the behavior when hundreds or even
+thousands of VMs/networks are created or deleted. Therefore, a high availability scheme for large
+scale deployment should also be considered.
+
+This section lists some aspects that should be considered for large scale HA.
+
+Large scale VM create/delete test
+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+1. Create a certain number of VMs (e.g. about 50) in parallel, and trigger creation of the next
+   group of VMs as soon as the previous group is finished. Count the number of VMs eventually
+   created when a given deadline T is reached. Check whether all the VMs can be connected through
+   the subnet after creation.
+
+2. Measure the VM creation speed, i.e. how fast the system can create a certain number of VMs.
+
+3. Similar tests can be designed for deleting, moving, and rebooting VMs.
+
+Compute performance test
+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+1. Based on test 1 of the VM create/delete test above, create as many VMs as the system can bear and check the CPU utilization.
+
+2. Based on test 1 of the VM create/delete test above, create as many VMs as the system can bear and check the IO bandwidth.
+
+Network performance test
+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+1. Create networks at large scale and check whether all the VMs on the networks can be connected.
+
+2. Network performance test under service flow (north-south flow, with L3 or L4-L7).
+
+3. Network performance test under service flow (east-west flow, with L3 or L4-L7).
+
+Large scale high availability test
+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+1. Large scale VM failure and evacuation test.
+
+2. Long duration system test.
+
+
+
+******************
+References
+******************
+
+- A KAOS Tutorial: http://www.objectiver.com/fileadmin/download/documents/KaosTutorial.pdf
+
+- ETSI GS NFV-REL 001 V1.1.1(2015-01):
+  http://www.etsi.org/deliver/etsi_gs/NFV-REL/001_099/001/01.01.01_60/gs_NFV-REL001v010101p.pdf
+
+- Openstack High Availability Guide: https://docs.openstack.org/ha-guide/
+
+- Highly Available (Mirrored) Queues: https://www.rabbitmq.com/ha.html
diff --git a/docs/development/overview/HA_Analysis.rst b/docs/development/overview/HA_Analysis.rst
new file mode 100644
index 0000000..06c0487
--- /dev/null
+++ b/docs/development/overview/HA_Analysis.rst
@@ -0,0 +1,406 @@
+.. image:: opnfv-logo.png
+  :height: 40
+  :width: 200
+  :alt: OPNFV
+  :align: left
+
+===============================================
+High Availability Requirement Analysis in OPNFV
+===============================================
+
+******************
+1 Introduction
+******************
+This High Availability Requirement Analysis Document is used for eliciting High Availability
+requirements of OPNFV. The document refines high-level High Availability goals into detailed
+HA mechanism designs, and relates these HA mechanisms to potential failures on different
+layers in OPNFV. Moreover, this document can be used as a reference for designing HA testing
+scenarios.
+KAOS, a requirements engineering model, is used in this document.
+
+***************************
+2 Terminologies and Symbols
+***************************
+The following concepts in KAOS will be used in the diagrams of this document.
+
+- **Goal**: The objective to be met by the target system.
+
+- **Obstacle**: Condition whose satisfaction may prevent some goals from being achieved.
+
+- **Agent**: Active Object performing operations to achieve goals.
+
+- **Requirement**: Goal assigned to an agent of the software being studied.
+
+- **Domain Property**: Descriptive assertion about objects in the environment of the software.
+
+- **Refinement**: Relationship linking a goal to other goals that are called its subgoals.
+  Each subgoal contributes to the satisfaction of the goal it refines. There are two types of
+  refinements: AND refinement, where the goal is achieved by satisfying all of its subgoals,
+  and OR refinement, where the goal is achieved by satisfying any one of its subgoals.
+
+- **Conflict**: Relationship linking an obstacle to a goal if the obstacle obstructs the goal
+  from being satisfied.
+
+- **Resolution**: Relationship linking a goal to an obstacle if the goal can resolve the
+  obstacle.
+
+- **Responsibility**: Relationship between an agent and a requirement. Holds when an agent is
+  assigned the responsibility of achieving the linked requirement.
+
+Figure 1 shows how these concepts are displayed in a KAOS diagram.
+
+.. figure:: images/KAOS_Sample.png
+    :alt: KAOS Sample
+    :figclass: align-center
+
+    Fig 1. A KAOS Sample Diagram
+
+**********************************
+3 High Availability Goals of OPNFV
+**********************************
+
+3.1 Overall Goals
+>>>>>>>>>>>>>>>>>>
+
+The final goal of OPNFV High Availability is to provide highly available VNF services. The
+following objectives must be met:
+
+- There should be no single point of failure in the NFV framework.
+
+- All resiliency mechanisms shall be designed for a multi-vendor environment, where for example
+  the NFVI, NFV-MANO, and VNFs may be supplied by different vendors.
+
+- Resiliency related information shall always be explicitly specified and communicated using
+  the reference interfaces (including policies/templates) of the NFV framework.
+
+
+
+3.2 Service Level Agreements of OPNFV HA
+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+Service Level Agreements of OPNFV HA mainly focus on time constraints for service outage,
+failure detection, and failure recovery. Table 1 shows the time constraints of the different
+Service Availability Levels (SAL) described in ETSI GS NFV-REL 001 V1.1.1 (2015-01), covering
+failure detection time and failure recovery time. In this document, SAL1 is the default
+benchmark that must be met.
+
+*Table 1. Time Constraints for Different Service Availability Levels*
+
++--------------------------------+----------------------------+------------------------+
+| Service Availability Level     | Failure Detection Time     | Failure Recovery Time  |
++================================+============================+========================+
+| SAL1                           | <1s                        | 5-6s                   |
++--------------------------------+----------------------------+------------------------+
+| SAL2                           | <5s                        | 10-15s                 |
++--------------------------------+----------------------------+------------------------+
+| SAL3                           | <10s                       | 20-25s                 |
++--------------------------------+----------------------------+------------------------+
+
+
+******************
+4 Overall Analysis
+******************
+Figure 2 shows the overall decomposition of high availability goals. The high availability of
+VNF Services can be refined to high availability of VNFs, MANO, and the NFVI where VNFs are
+deployed; the high availability of NFVI Service can be refined to high availability of Virtual
+Compute Instances, Virtual Storage and Virtual Network Services; the high availability of
+virtual instance is either the high availability of containers or the high availability of VMs,
+and these high availability goals can be further decomposed by how the NFV environment is
+deployed.
+
+.. figure:: images/Total_Framework.png
+    :alt: Overall HA Analysis of OPNFV
+    :figclass: align-center
+
+    Fig 2. Overall HA Analysis of OPNFV
+
+Thus the high availability requirement of VNF services can be classified into high availability
+requirements on different layers in OPNFV. The following layers are mainly discussed in this
+document:
+
+- VNF HA
+
+- MANO HA
+
+- Virtual Infrastructure HA (container HA or VM HA)
+
+- VIM HA
+
+- SDN HA
+
+- Hypervisor HA
+
+- Host OS HA
+
+- Hardware HA
+
+The next section will illustrate detailed analysis of HA requirements on these layers.
+
+*******************
+5 Detailed Analysis
+*******************
+
+5.1 VNF HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+5.2 MANO HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+5.3 Virtual Infrastructure HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+5.4 VIM HA
+>>>>>>>>>>>>>>>>>>
+
+The VIM in the NFV reference architecture contains different components of Openstack, SDN
+controllers and other virtual resource controllers. VIM components can be classified into three
+types:
+
+- **Entry Point Components**: Components that give VIM service interfaces to users, like
+  nova-api and neutron-server.
+
+- **Middlewares**: Components that provide load balancer services, messaging queues, cluster
+  management services, etc.
+
+- **Subcomponents**: Components that implement VIM functions, which are called by Entry Point
+  Components but not by users directly.
+
+Table 2 shows the potential faults that may happen on the VIM layer. Currently the main focus
+of VIM HA is the service crash of VIM components, which may occur on all types of VIM
+components. To prevent VIM services from becoming unavailable, Active/Active Redundancy,
+Active/Passive Redundancy and Message Queue are used for different types of VIM components, as
+shown in Figure 3.
+
+*Table 2. Potential Faults in VIM level*
+
++------------+------------------+-------------------------------------------------+----------------+
+| Service    | Fault            | Description                                     | Severity       |
++============+==================+=================================================+================+
+| General    | Service Crash    | The processes of a service crashed abnormally.  | Critical       |
++------------+------------------+-------------------------------------------------+----------------+
+
+.. figure:: images/VIM_Analysis.png
+    :alt: VIM HA Analysis
+    :figclass: align-center
+
+    Fig 3. VIM HA Analysis
+
+
+Active/Active Redundancy
+::::::::::::::::::::::::::::
+Active/Active Redundancy manages both the main and redundant systems concurrently. If a
+failure happens on a component, the backups are already online and users are unlikely to
+notice that the failed VIM component is being repaired. A typical Active/Active Redundancy
+has redundant instances that are load balanced via a virtual IP address and a load balancer
+such as HAProxy.
+
+When one of the redundant VIM component instances fails, the load balancer should detect the
+instance failure and isolate the failed instance from being called until it is recovered.
+The requirement decomposition of Active/Active Redundancy is shown in Figure 4.
+
+.. figure:: images/Active_Active_Redundancy.png
+    :alt: Active/Active Redundancy Requirement Decomposition
+    :figclass: align-center
+
+    Fig 4. Active/Active Redundancy Requirement Decomposition
+
+The following requirements are elicited for VIM Active/Active Redundancy:
+
+**[Req 5.4.1]** Redundant VIM components should be load balanced by a load balancer.
+
+**[Req 5.4.2]** The load balancer should check the health status of VIM component instances.
+
+**[Req 5.4.3]** The load balancer should isolate the failed VIM component instance until it is
+recovered.
+
+**[Req 5.4.4]** The alarm information of VIM component failure should be reported.
+
+**[Req 5.4.5]** Failed VIM component instances should be recovered by a cluster manager.
+
+Table 3 shows the current VIM components using Active/Active Redundancy and the corresponding
+HA test cases to verify them.
+
+*Table 3. VIM Components using Active/Active Redundancy*
+
++-------------------+-------------------------------------------------------+----------------------+
+| Component         | Description                                           | Related HA Test Case |
++===================+=======================================================+======================+
+| nova-api          | endpoint component of Openstack Compute Service Nova  | yardstick_tc019      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-novncproxy   | server daemon that serves the Nova noVNC Websocket    |                      |
+|                   | Proxy service, which provides a websocket proxy that  |                      |
+|                   | is compatible with OpenStack Nova noVNC consoles.     |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| neutron-server    | endpoint component of Openstack Networking Service    | yardstick_tc045      |
+|                   | Neutron                                               |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| keystone          | component of Openstack Identity Service               | yardstick_tc046      |
+|                   | Keystone                                              |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| glance-api        | endpoint component of Openstack Image Service Glance  | yardstick_tc047      |
++-------------------+-------------------------------------------------------+----------------------+
+| glance-registry   | server daemon that serves image metadata through a    |                      |
+|                   | REST-like API.                                        |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| cinder-api        | endpoint component of Openstack Block Storage Service | yardstick_tc048      |
+|                   | Cinder                                                |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| swift-proxy       | endpoint component of Openstack Object Storage        | yardstick_tc049      |
+|                   | Swift                                                 |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| horizon           | component of Openstack Dashboard Service Horizon      |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| heat-api          | endpoint component of Openstack Stack Service Heat    |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| mysqld            | database service of VIM components                    |                      |
++-------------------+-------------------------------------------------------+----------------------+
+
+Active/Passive Redundancy
+::::::::::::::::::::::::::::
+
+Active/Passive Redundancy maintains a redundant instance that can be brought online when the
+active service fails. A typical Active/Passive Redundancy maintains replacement resources that
+can be brought online when required. Requests are handled using a virtual IP address (VIP) that
+facilitates returning to service with minimal reconfiguration. A cluster manager (such as
+Pacemaker or Corosync) monitors these components, bringing the backup online as necessary.
+
+When the main instance of a VIM component fails, the cluster manager should detect the
+failure and bring the backup instance online. The failed instance should then be recovered
+so that it can serve as a new backup instance. The requirement decomposition of
+Active/Passive Redundancy is shown in Figure 5.
+
+.. figure:: images/Active_Passive_Redundancy.png
+    :alt: Active/Passive Redundancy Requirement Decomposition
+    :figclass: align-center
+
+    Fig 5. Active/Passive Redundancy Requirement Decomposition
+
+The following requirements are elicited for VIM Active/Passive Redundancy:
+
+**[Req 5.4.6]** The cluster manager should replace the failed main VIM component instance with
+a backup instance.
+
+**[Req 5.4.7]** The cluster manager should check the health status of VIM component instances.
+
+**[Req 5.4.8]** Failed VIM component instances should be recovered by the cluster manager.
+
+**[Req 5.4.9]** The alarm information of VIM component failure should be reported.
+
+
+Table 4 shows the current VIM components using Active/Passive Redundancy and the corresponding
+HA test cases to verify them.
+
+*Table 4. VIM Components using Active/Passive Redundancy*
+
++-------------------+-------------------------------------------------------+----------------------+
+| Component         | Description                                           | Related HA Test Case |
++===================+=======================================================+======================+
+| haproxy           | load balancer component of VIM components             | yardstick_tc053      |
++-------------------+-------------------------------------------------------+----------------------+
+| rabbitmq-server   | messaging queue service of VIM components             | yardstick_tc056      |
++-------------------+-------------------------------------------------------+----------------------+
+| corosync          | cluster management component of VIM components        | yardstick_tc057      |
++-------------------+-------------------------------------------------------+----------------------+
+
+Message Queue
+::::::::::::::::::::::::::::
+Message Queue provides an asynchronous communication protocol. In Openstack, some projects
+(like Nova and Cinder) use Message Queue to call their sub components. Although Message Queue
+itself is not an HA mechanism, the way it works ensures high availability when redundant
+components subscribe to the Message Queue. When a VIM sub component fails, requests can still
+be processed because other redundant components are subscribed to the Message Queue. Fault
+isolation is also achieved, since failed components do not actively fetch requests. In
+addition, failed components need to be recovered. Figure 6 shows the requirement decomposition
+of Message Queue.
+
+.. figure:: images/Message_Queue.png
+    :alt: Message Queue Requirement Decomposition
+    :figclass: align-center
+
+    Fig 6. Message Queue Redundancy Requirement Decomposition
+
+The following requirements are elicited for Message Queue:
+
+**[Req 5.4.10]** Redundant component instances should subscribe to the Message Queue, which is
+implemented by the installer.
+
+**[Req 5.4.11]** Failed VIM component instances should be recovered by the cluster manager.
+
+**[Req 5.4.12]** The alarm information of VIM component failure should be reported.
+
+Table 5 shows the current VIM components using Message Queue and the corresponding HA test cases
+to verify them.
+
+*Table 5. VIM Components using Message Queue*
+
++-------------------+-------------------------------------------------------+----------------------+
+| Component         | Description                                           | Related HA Test Case |
++===================+=======================================================+======================+
+| nova-scheduler    | Openstack compute component determines how to         |                      |
+|                   | dispatch compute requests                             |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-cert         | Openstack compute component that serves the Nova Cert |                      |
+|                   | service for X509 certificates. Used to generate       |                      |
+|                   | certificates for euca-bundle-image.                   |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-conductor    | server daemon that serves the Nova Conductor service, |                      |
+|                   | which provides coordination and database query        |                      |
+|                   | support for Nova.                                     |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-compute      | Handles all processes relating to instances (guest    |                      |
+|                   | vms). nova-compute is responsible for building a disk |                      |
+|                   | image, launching it via the underlying virtualization |                      |
+|                   | driver, responding to calls to check its state,       |                      |
+|                   | attaching persistent storage, and terminating it.     |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| nova-consoleauth  | Openstack compute component for Authentication of     |                      |
+|                   | nova consoles.                                        |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| cinder-scheduler  | Openstack volume storage component decides on         |                      |
+|                   | placement for newly created volumes and forwards the  |                      |
+|                   | request to cinder-volume.                             |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| cinder-volume     | Openstack volume storage component receives volume    |                      |
+|                   | management requests from cinder-api and               |                      |
+|                   | cinder-scheduler, and routes them to storage backends |                      |
+|                   | using vendor-supplied drivers.                        |                      |
++-------------------+-------------------------------------------------------+----------------------+
+| heat-engine       | Openstack Heat project server with an internal RPC    |                      |
+|                   | api called by the heat-api server.                    |                      |
++-------------------+-------------------------------------------------------+----------------------+
+
+
+5.5 Hypervisor HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+5.6 Host OS HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+5.7 Hardware HA
+>>>>>>>>>>>>>>>>>>
+
+.. TBD
+
+
+******************
+6 References
+******************
+
+- A KAOS Tutorial: http://www.objectiver.com/fileadmin/download/documents/KaosTutorial.pdf
+
+- ETSI GS NFV-REL 001 V1.1.1(2015-01):
+  http://www.etsi.org/deliver/etsi_gs/NFV-REL/001_099/001/01.01.01_60/gs_NFV-REL001v010101p.pdf
+
+- Openstack High Availability Guide: https://docs.openstack.org/ha-guide/
+
+- Highly Available (Mirrored) Queues: https://www.rabbitmq.com/ha.html
+\ No newline at end of file
diff --git a/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD-Guest_Heartbeat-FIGURE-1.png b/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD-Guest_Heartbeat-FIGURE-1.png
deleted file mode 100644
index c394763..0000000
--- a/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD-Guest_Heartbeat-FIGURE-1.png
+++ /dev/null
diff --git a/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD-Guest_Heartbeat-FIGURE-1b.png b/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD-Guest_Heartbeat-FIGURE-1b.png
deleted file mode 100644
index 3f2491a..0000000
--- a/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD-Guest_Heartbeat-FIGURE-1b.png
+++ /dev/null
diff --git a/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD-Peer_Messaging-FIGURE-2.png b/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD-Peer_Messaging-FIGURE-2.png
deleted file mode 100644
index 7147445..0000000
--- a/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD-Peer_Messaging-FIGURE-2.png
+++ /dev/null
diff --git a/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD.rst b/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD.rst
deleted file mode 100644
index b634a6b..0000000
--- a/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD.rst
+++ /dev/null
@@ -1,289 +0,0 @@
-Overview
-=====================================================================
-
-:abstract: This document describes a set of new optional
-   capabilities where the OpenStack Cloud messages into the Guest
-   VMs in order to provide improved Availability of the hosted VMs.
-   The initial set of new capabilities include: enabling the
-   detection of and recovery from internal VM faults and providing
-   a simple out-of-band messaging service to prevent scenarios such
-   as split brain.
-
-
-.. sectnum::
-
-.. contents:: Table of Contents
-
-
-
-Introduction
-=====================================================================
-
-   This document provides an overview and rationale for a
-   set of new capabilities where the OpenStack Cloud messages
-   into the Guest VMs in order to provide improved Availability
-   of the hosted VMs.
-
-   The initial set of new capabilities specifically include:
-
-        - VM Heartbeating and Health Checking
-        - VM Peer State Notification and Messaging
-
-   All of these capabilities leverage Host-to-Guest Messaging
-   Interfaces / APIs which are built on a messaging service between the
-   OpenStack Host and the Guest VM that uses a simple low-bandwidth
-   datagram messaging capability in the hypervisor and therefore has no
-   requirements on OpenStack Networking, and is available very early
-   after spawning the VM.
-
-   For each capability, the document outlines the interaction with
-   the Guest VM, any key technologies involved, the integration into
-   the larger OpenStack and OPNFV Architectures (e.g. interactions
-   with VNFM), specific OPNFV HA Team deliverables, and the use cases
-   for how availability of the hosted VM is improved.
-
-
-
-
-Messaging Layer
-========================================================================
-
-   The Host-to-Guest messaging APIs used by the services discussed
-   in this document use a JSON-formatted application messaging layer
-   on top of a ‘virtio serial device’ between QEMU on the OpenStack Host
-   and the Guest VM.  JSON formatting provides a simple, humanly readable
-   messaging format which can be easily parsed and formatted using any
-   high level programming language being used in the Guest VM (e.g. C/C++,
-   Python, Java, etc.).  Use of the ‘virtio serial device�?provides a
-   simple, direct communication channel between host and guest which is
-   independent of the Guest’s L2/L3 networking.
-
-   The upper layer JSON messaging format is actually structured as a
-   hierarchical JSON format containing a Base JSON Message Layer and an
-   Application JSON Message Layer:
-
-        - the Base Layer provides the ability to multiplex different groups
-          of message types on top of a single ‘virtio serial device’,
-          e.g.
-
-           + heartbeating and healthchecks,
-           + server group messaging,
-
-          and
-
-        - the Application Layer provides the specific message types and
-          fields of a particular group of message types.
-
-
-
-VM Heartbeating and Health Checking
-============================================================================
-
-   Normally OpenStack monitoring of the health of a Guest VM is limited
-   to a black-box approach of simply monitoring the presence of the
-   QEMU/KVM PID containing the VM, and/or by enabling libvirt's emulated
-   hardware watchdog.
-
-   VM Heartbeating and Health Checking provides a heartbeat service to enhance
-   the monitoring of the health of guest application(s) within a VM running
-   under the OpenStack Cloud.  Loss of heartbeat or a failed health check status
-   will result in a fault event being reported to OPNFV's DOCTOR infrastructure
-   for alarm identification, impact analysis and reporting.  This would then enable
-   VNF Managers (VNFMs) listening to OPNFV's DOCTOR External Alarm Reporting through
-   Telemetry's AODH, to initiate any required fault recovery actions.
-
-   .. image:: OPNFV_HA_Guest_APIs-Overview_HLD-Guest_Heartbeat-FIGURE-1.png
-
-   Or, in the context of the OPNFV DOCTOR's Fault Management Architecture:
-
-   .. image:: OPNFV_HA_Guest_APIs-Overview_HLD-Guest_Heartbeat-FIGURE-1b.png
-
-   The VM Heartbeating and Health Checking functionality is enabled on
-   a VM through a new flavor extraspec indicating that the VM supports
-   and wants to enable Guest Heartbeating.  An extension to Nova Compute uses
-   this extraspec to set up the required 'virtio serial device' for Host-to-Guest
-   messaging, on the QEMU/KVM instance created for the VM.
-
-   A daemon within the Guest VM will register with the OpenStack Guest
-   Heartbeat Service on the compute node to initiate the heartbeating on itself
-   (i.e. the Guest VM).  The OpenStack Compute Node will start heartbeating the
-   Guest VM, and if the heartbeat fails, the OpenStack Compute Node will report
-   the VM Fault thru DOCTOR and ultimately VNFM will see this thru NOVA VM
-   State Change Notifications thru AODH.  I.e. VNFM would see the VM Heartbeat
-   Failure events in the same way it sees all other VM Faults, thru DOCTOR
-   initiated VM state changes.
-
-   Part of the Guest VM's registration process is the specification of the
-   heartbeat interval in milliseconds; that is, the registering Guest VM
-   chooses the heartbeating interval.
-
-   Guest heartbeat works on a challenge-response model.  The OpenStack
-   Guest Heartbeat Service on the compute node will challenge the registered
-   Guest VM daemon with a message each interval.  The registered Guest VM daemon
-   must respond prior to the next interval with a message indicating good health.
-   If the OpenStack Host does not receive a valid response, or if the response
-   specifies that the VM is in ill health, then a fault event for the Guest VM
-   is reported to the OpenStack Guest Heartbeat Service on the controller node which
-   will report the event to OPNFV's DOCTOR (i.e. thru the Doctor SouthBound (SB)
-   APIs).
-
-   In summary, the Guest Heartbeating Messaging Specification is quite simple,
-   including the following PDUs: Init, Init-Ack, Challenge-Request,
-   Challenge-Response, Exit.  The Challenge-Response returns a healthy /
-   not-healthy boolean.
-
-   The registered Guest VM daemon's response to the challenge can be as simple
-   as just immediately responding with OK.  This alone allows for detection of
-   a failed or hung QEMU/KVM instance, or a failure of the OS within the VM to
-   schedule the registered Guest VM's daemon or failure to route basic IO within
-   the Guest VM.
-
-   However, the registered Guest VM daemon's response to the challenge can be more
-   complex, running anything from a quick simple sanity check of the health of
-   applications running in the Guest VM, to a more thorough audit of the
-   application state and data.  In either case returning the status of the
-   health check enables the OpenStack host to detect and report the event in order
-   to initiate recovery from application level errors or failures within the Guest VM.
-
-   In summary, the deliverables of this activity would be:
-
-   - Host Deliverables:    (OpenStack and OPNFV blueprints and implementation)
-
-   + an OpenStack Nova or libvirt extension to interpret the new flavor extraspec and,
-     if present, set up the required 'virtio serial device' for Host-to-Guest
-     heartbeat / health-check messaging, on the QEMU/KVM instance created
-     for the VM,
-   + an OPNFV Base Host-to-Guest Messaging Layer Agent for multiplexing of Application
-     Layer messaging over the 'virtio serial device' to the VM,
-   + an OPNFV Heartbeat / Health-Check Compute Agent for local heartbeating of VM
-     and reporting of failures to the OpenStack Controller,
-   + an OPNFV Heartbeat / Health-check Server on the OpenStack Controller for
-     receiving VM failure notifications and reporting these to Vitrage thru
-     Vitrage's Data Source API,
-
-   - Guest Deliverables:
-
-   + a Heartbeat / Health-Check Message Specification covering
-
-      - Heartbeat / Health-Check Application Layer JSON Protocol,
-      - Base Host-to-Guest JSON Protocol,
-      - Details on the use of the underlying 'virtio serial device',
-
-   + a Reference Implementation of the Guest-side support of
-     Heartbeat / Health-check containing the peer protocol layers
-     within the Guest.
-
-      - will provide code and compile instructions,
-      - Guest will compile based on its specific OS.
-
-   NOTE that the described VM Heartbeating and Healthchecking functionality provides
-   enhanced monitoring over and above libvirt's emulated hardware watchdog.  VM
-   Heartbeating and Healthchecking can detect a wider range of issues than simply
-   lack of CPU time scheduling for a lower priority process feeding the hardware
-   watchdog.  VM Heartbeating and Healthchecking can ensure that specific key processes
-   within the application are not blocked, kernel resources for basic IO within
-   the Guest VM are available, and/or ensure the application-specific health of the VM
-   is good.
-
-   This proposal has been reviewed with both the OPNFV's Doctor and Management
-   and Orchestration teams, and general agreement was that the proposal integrated
-   / inter-worked correctly with the OPNFV DOCTOR's Vitrage, Congress and the overall
-   OPNFV fault reporting architecture.
-
-
-
-VM Peer State Notification and Messaging
-===================================================================================
-
-   Server Group State Notification and Messaging is a service to provide
-   simple low-bandwidth datagram messaging and notifications for servers that
-   are part of the same server group.  This messaging channel is available
-   regardless of whether IP networking is functional within the server, and
-   it requires no knowledge within the server about the other members of the group.
-
-   NOTE: A Server Group here is the OpenStack Nova Server Group concept where VMs
-   are grouped together for purposes of scheduling.  E.g. A specific Server Group
-   instance can specify whether the VMs within the group should be scheduled to
-   run on the same compute host or different compute hosts.  A 'peer' VM in the
-   context of this section refers to a VM within the same Nova Server Group.
-
-   This Server Group Messaging service provides three types of messaging:
-
-        - Broadcast: this allows a server to send a datagram (size of up to 3050 bytes)
-          to all other servers within the server group.
-        - Notification: this provides servers with information about changes to the
-          (Nova) state of other servers within the server group.
-        - Status: this allows a server to query the current (Nova) state of all servers within
-          the server group (including itself).
-
-   A Server Group Messaging entity on both the controller node and the compute nodes
-   manages the routing of VM-to-VM messages through the platform, leveraging Nova
-   to determine Server Group membership and compute node locations of VMs.  The Server
-   Group Messaging entity on the controller also listens to Nova VM state change notifications
-   and queries VM state data from Nova, in order to provide the VM query and notification
-   functionality of this service.
-
-   .. image:: OPNFV_HA_Guest_APIs-Overview_HLD-Peer_Messaging-FIGURE-2.png
-
-   This service is not intended for high bandwidth or low-latency operations.  It
-   is best-effort, not reliable.  Applications should do end-to-end acks and
-   retries if they care about reliability.
-
-   This service provides building block type capabilities for the Guest VMs that
-   contribute to higher availability of the VMs in the Guest VM Server Group.  Notifications
-   of VM Status changes potentially provide a faster and more accurate notification
-   of failed peer VMs than traditional peer VM monitoring over Tenant Networks.  The
-   Broadcast Messaging mechanism, in turn, provides an out-of-band channel to
-   monitor and control a peer VM under fault conditions, e.g. providing the ability to
-   avoid potential split brain scenarios between 1:1 VMs when faults in Tenant
-   Networking occur.
-
-   In summary, the deliverables for Server Group Messaging would be:
-
-   - Host Deliverables:
-
-   + a Nova or libvirt extension to interpret the new flavor extraspec and,
-     if present, set up the required 'virtio serial device' for Host-to-Guest
-     Server Group Messaging, on the QEMU/KVM instance created
-     for the VM,
-   + [ leveraging the Base Host-to-Guest Messaging Layer Agent from previous section ],
-   + a Server Group Messaging Compute Agent for implementing the Application Layer
-     Server Group Messaging JSON Protocol with the VM, and forwarding the
-     messages to/from the Server Group Messaging Server on the Controller,
-   + a Server Group Messaging Server on the Controller for routing broadcast
-     messages to the proper Computes and VMs, as well as listening for Nova
-     VM State Change Notifications and forwarding these to applicable Computes
-     and VMs,
-
-   - Guest Deliverables:
-
-   + a Server Group Messaging Message Specification covering
-
-      - Server Group Messaging Application Layer JSON Protocol,
-      - [ leveraging Base Host-to-Guest JSON Protocol from previous section ],
-      - [ leveraging Details on the use of the underlying 'virtio serial device' from previous section ],
-
-   + a Reference Implementation of the Guest-side support of
-     Server Group Messaging containing the peer protocol layers
-     and Guest Application hooks within the Guest.
-
-   This proposal has been reviewed with both the OPNFV's Doctor and Management
-   and Orchestration teams, and general agreement was that the proposal did not
-   conflict with the OPNFV Doctor Architecture, and provided, at the very least,
-   an alternative messaging and state-change-notification mechanism for hosted
-   VMs in various HA use cases.
-
-
-
-Conclusion
-======================================================================================
-
-   The Reach-thru Guest Monitoring and Services described in this document
-   leverage Host-to-Guest messaging to provide a number of extended capabilities
-   that improve the Availability of the hosted VMs.  These new capabilities
-   enable detection of and recovery from internal VM faults and provide a simple
-   out-of-band messaging service to prevent scenarios such as split brain.
-
-   The next steps in progressing this proposal will be to submit blueprints to
-   the appropriate OpenStack working groups: Vitrage for VM Heartbeating and
-   Healthchecking, and Nova for VM Server Group Messaging.
diff --git a/docs/development/overview/images/fig10_Message_Queue.png b/docs/development/overview/images/fig10_Message_Queue.png
new file mode 100644
index 0000000..3ebedbe
--- /dev/null
+++ b/docs/development/overview/images/fig10_Message_Queue.png
diff --git a/docs/development/overview/images/fig11_VIM_HA_analysis_in_K8s.png b/docs/development/overview/images/fig11_VIM_HA_analysis_in_K8s.png
new file mode 100644
index 0000000..70252a7
--- /dev/null
+++ b/docs/development/overview/images/fig11_VIM_HA_analysis_in_K8s.png
diff --git a/docs/development/overview/images/fig12_VIM_HA_analysis_in_K8s_2.png b/docs/development/overview/images/fig12_VIM_HA_analysis_in_K8s_2.png
new file mode 100644
index 0000000..205cd2e
--- /dev/null
+++ b/docs/development/overview/images/fig12_VIM_HA_analysis_in_K8s_2.png
diff --git a/docs/development/overview/images/fig1_KAOS_Sample.png b/docs/development/overview/images/fig1_KAOS_Sample.png
new file mode 100644
index 0000000..0d35cd7
--- /dev/null
+++ b/docs/development/overview/images/fig1_KAOS_Sample.png
diff --git a/docs/development/overview/images/fig2_Total_Framework.png b/docs/development/overview/images/fig2_Total_Framework.png
new file mode 100644
index 0000000..c900908
--- /dev/null
+++ b/docs/development/overview/images/fig2_Total_Framework.png
diff --git a/docs/development/overview/images/fig3_VM_HA_Analysis.png b/docs/development/overview/images/fig3_VM_HA_Analysis.png
new file mode 100644
index 0000000..e263e60
--- /dev/null
+++ b/docs/development/overview/images/fig3_VM_HA_Analysis.png
diff --git a/docs/development/overview/images/fig4_Heartbeating_and_Healthchecks.png b/docs/development/overview/images/fig4_Heartbeating_and_Healthchecks.png
new file mode 100644
index 0000000..cd7a551
--- /dev/null
+++ b/docs/development/overview/images/fig4_Heartbeating_and_Healthchecks.png
diff --git a/docs/development/overview/images/fig5_VM_Peer_State_Notification_and_Messaging.png b/docs/development/overview/images/fig5_VM_Peer_State_Notification_and_Messaging.png
new file mode 100644
index 0000000..7614e19
--- /dev/null
+++ b/docs/development/overview/images/fig5_VM_Peer_State_Notification_and_Messaging.png
diff --git a/docs/development/overview/images/fig6_Container_HA_analysis_in_K8s.png b/docs/development/overview/images/fig6_Container_HA_analysis_in_K8s.png
new file mode 100644
index 0000000..5fdb8a7
--- /dev/null
+++ b/docs/development/overview/images/fig6_Container_HA_analysis_in_K8s.png
diff --git a/docs/development/overview/images/fig7_VIM_Analysis.png b/docs/development/overview/images/fig7_VIM_Analysis.png
new file mode 100644
index 0000000..5b0f579
--- /dev/null
+++ b/docs/development/overview/images/fig7_VIM_Analysis.png
diff --git a/docs/development/overview/images/fig8_Active_Active_Redundancy.png b/docs/development/overview/images/fig8_Active_Active_Redundancy.png
new file mode 100644
index 0000000..1863de1
--- /dev/null
+++ b/docs/development/overview/images/fig8_Active_Active_Redundancy.png
diff --git a/docs/development/overview/images/fig9_Active_Passive_Redundancy.png b/docs/development/overview/images/fig9_Active_Passive_Redundancy.png
new file mode 100644
index 0000000..e94c82f
--- /dev/null
+++ b/docs/development/overview/images/fig9_Active_Passive_Redundancy.png
diff --git a/docs/development/overview/index.rst b/docs/development/overview/index.rst
index 1114613..46dc173 100644
--- a/docs/development/overview/index.rst
+++ b/docs/development/overview/index.rst
@@ -5,11 +5,11 @@
 .. (c) <optionally add copywriters name>
 
 
-*********************************************************************************
-Reach-thru Guest Monitoring and Services for High Availability
-*********************************************************************************
+=======================================================================
+High Availability Requirement Analysis in OPNFV
+=======================================================================
 
 .. toctree::
    :maxdepth: 4
 
-   OPNFV_HA_Guest_APIs-Overview_HLD.rst
+   HA_Analysis-H.rst
diff --git a/docs/development/overview/index.rst.bak b/docs/development/overview/index.rst.bak
new file mode 100644
index 0000000..3e69259
--- /dev/null
+++ b/docs/development/overview/index.rst.bak
@@ -0,0 +1,15 @@
+.. _availability-overview:
+
+.. This work is licensed under a Creative Commons Attribution 4.0 International License.
+.. SPDX-License-Identifier: CC-BY-4.0
+.. (c) <optionally add copywriters name>
+
+
+*********************************************************************************
+High Availability Requirement Analysis in OPNFV
+*********************************************************************************
+
+.. toctree::
+   :maxdepth: 4
+
+   HA_Analysis-Gambia.rst