author    fuqiao <fuqiao@chinamobile.com>    2017-02-13 10:44:55 +0800
committer fuqiao <fuqiao@chinamobile.com>    2017-02-13 10:44:55 +0800
commit    4260b10e3a7c7780385a61d6d708a9480d208a1a (patch)
tree      a61dd18b0273932bb1f6d347022c2829399be021 /docs/scenarios
parent    593ef4f91ee0dac6cde4101b75174d88d0a01b28 (diff)

Danube MS6 adding document directives

add overview doc and design docs

JIRA: HA-28
Change-Id: I890d2056a0fe61ca6aa1297210a1028d38d1514d
Signed-off-by: fuqiao@chinamobile.com
Diffstat (limited to 'docs/scenarios')
-rw-r--r--  docs/scenarios/GAP_Analysis_Colorado.rst      278
-rw-r--r--  docs/scenarios/GAP_Analysis_Colorado.rst.bak  278
-rw-r--r--  docs/scenarios/index.rst                       13
3 files changed, 0 insertions, 569 deletions
diff --git a/docs/scenarios/GAP_Analysis_Colorado.rst b/docs/scenarios/GAP_Analysis_Colorado.rst
deleted file mode 100644
index 4fefc09..0000000
--- a/docs/scenarios/GAP_Analysis_Colorado.rst
+++ /dev/null
@@ -1,278 +0,0 @@
-Introduction:
-^^^^^^^^^^^^^
-
-During the Colorado release the OPNFV availability team reviewed a number of gaps
-in support for high availability in various areas of OPNFV. The goal was to find
-gaps and work with the relevant open source communities (OpenStack, for example)
-to develop solutions and blueprints, enhancing the overall system availability
-and reliability of OPNFV going forward. We also worked with the OPNFV Doctor
-team to ensure our activities were coordinated. In upcoming OPNFV releases the
-availability team will update the status of open gaps and continue to look for
-additional ones.
-
-Summary of findings:
-^^^^^^^^^^^^^^^^^^^^
-
-1. Publish health status of compute node - this gap is now closed through an
-OpenStack blueprint merged in Mitaka.
-
-2. Health status of compute node - some good work is underway in OpenStack and
-with the Doctor team; we will continue to monitor this work.
-
-3. Store consoleauth tokens to the database - this gap can be addressed by
-changing OpenStack configurations.
-
-4. Active/Active HA of cinder-volume - active work is underway in Newton; we
-will monitor it closely.
-
-5. Cinder volume multi-attachment - this work is still in progress in
-OpenStack; we will continue to monitor it (see the detailed findings below).
-
-6. Add HA tests into Fuel - the Availability team has been working with the
-Yardstick team to create additional test cases for the Colorado release. Some
-of these test cases would be good additions to installers like Fuel.
-
-Detailed explanation of the gaps and findings:
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-GAP 1: Publish the health status of compute node
-================================================
-
-* Type: 'reliability'
-* Description:
-
- Currently, compute node status is kept only within nova. However, the NFVO
- and VNFM may also need this information. For example, the NFVO may trigger
- scale up/down based on the status, and the VNFM may trigger evacuation. In
- high availability scenarios the VNFM may also need host status info from the
- VIM so that it can figure out where exactly a failure is located. Therefore,
- this information needs to be published outside to the NFVO and VNFM.
-
- + Desired state
-
- - Be able to have the health status of compute nodes published.
-
- + Current behaviour
-
- - Nova queries the ServiceGroup API to get the node liveness information.
-
- + Gap
-
- - Currently the ServiceGroup API keeps the health status of compute nodes
- internal to nova; these statuses could instead be published to the NFV MANO plane.
-
-Findings:
-
-A blueprint from the OPNFV Doctor team covers this gap by adding a notification
-for service status changes.
-
-Status: Merged (Mitaka release)
-
- + Owner: Balazs
-
- + BP: https://blueprints.launchpad.net/nova/+spec/service-status-notification
-
- + Spec: https://review.openstack.org/182350
-
- + Code: https://review.openstack.org/#/c/245678/
-
- + Merged Jan 2016 - Mitaka
-
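-To illustrate how a MANO-side consumer might receive such notifications, here
-is a minimal listener sketch, assuming oslo.messaging, the 'service.update'
-versioned event type from the blueprint, and an illustrative broker URL:
-
-.. code-block:: python
-
-    import oslo_messaging
-    from oslo_config import cfg
-
-    class ServiceStatusEndpoint(object):
-        # Only react to the service status notifications added by the bp.
-        filter_rule = oslo_messaging.NotificationFilter(
-            event_type='service.update')
-
-        def info(self, ctxt, publisher_id, event_type, payload, metadata):
-            # The payload carries the nova-compute service state (host,
-            # disabled, forced_down, ...); forward it to the NFVO/VNFM here.
-            print(publisher_id, payload)
-
-    # Nova emits versioned notifications on the 'versioned_notifications'
-    # topic by default; the rabbit URL below is a placeholder.
-    transport = oslo_messaging.get_notification_transport(
-        cfg.CONF, url='rabbit://guest:guest@controller:5672/')
-    targets = [oslo_messaging.Target(topic='versioned_notifications')]
-    listener = oslo_messaging.get_notification_listener(
-        transport, targets, [ServiceStatusEndpoint()], executor='threading')
-    listener.start()
-    listener.wait()
-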
-GAP 2: Health status of compute node
-====================================
-
-* Type: 'reliability'
-* Description:
-
- + Desired state:
-
- - Provide the health status of compute nodes.
-
- + Current Behaviour
-
- - Currently, while performing some actions like evacuation, Nova checks
- the compute service. If the service is down, the host is assumed to be
- down. This is not exactly true, since it is possible for only the compute
- service to be down while all VMs running on the host are actually up.
- There is no way to distinguish between two really different things: the
- status of the host and the status of the nova-compute service deployed
- on it.
- - Also, the host information provided by the API and commands is service
- centric, i.e. "nova host-list" is just another wrapper for "nova
- service-list" with a different format (in fact "service-list" is a
- superset of "host-list").
-
-
- + Gap
-
- - Not all the health information of compute nodes can be provided. Nova
- seems to treat the term *host* as equivalent to *compute-host*, which can
- be misleading and error prone in cases where host evacuation needs to be
- performed.
-
-
-Related BP:
-
-Pacemaker and Corosync can provide info about the host. Therefore, there is a
-requirement for nova to support a pacemaker servicegroup driver. Another option
-would be to add a tooz servicegroup driver to nova and then have tooz support a
-corosync driver.
-
- + https://blueprints.launchpad.net/nova/+spec/tooz-for-service-groups
-
-The Doctor team is not working on this blueprint.
-
-NOTE: This bp is active. A suggestion is to adopt this bp and add a corosync
-driver to tooz, which could be a solution.
-
-We should keep following this bp and, when it is finished, see whether adding a
-corosync driver to tooz would close this gap.
-
-The currently supported drivers in tooz are listed at
-https://github.com/openstack/tooz/blob/master/doc/source/drivers.rst. Meanwhile,
-we should also look into the Doctor project and see if this could be solved there.
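-
-As a rough illustration, a tooz servicegroup driver would build on tooz's group
-membership API, sketched below with the existing memcached backend (a corosync
-driver does not exist yet; the endpoint, member, and group names are
-illustrative):
-
-.. code-block:: python
-
-    from tooz import coordination
-
-    # Each compute host joins a membership group and heartbeats into it.
-    coordinator = coordination.get_coordinator(
-        'memcached://127.0.0.1:11211', b'compute-1')
-    coordinator.start(start_heart=True)
-    try:
-        coordinator.create_group(b'nova-compute').get()
-    except coordination.GroupAlreadyExist:
-        pass
-    coordinator.join_group(b'nova-compute').get()
-
-    # A monitor derives node liveness from current group membership.
-    print(coordinator.get_members(b'nova-compute').get())
-    coordinator.stop()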
-
-This work is still underway, but it doesn't directly map to the gap identified
-above. The Doctor team is looking to get faster updates on node status and
-failure status - these are covered by other blueprints. These are good problems
-to solve.
-
-GAP 3: Store consoleauth tokens to the database
-===============================================
-
-* Type: 'performance'
-* Description:
-
-+ Desired state
-
- - Change the consoleauth service to store the tokens in the database and,
- optionally, cache them in memory as it does now for fast access.
-
-+ Current State
-
- - Currently the consoleauth service stores the tokens and the connection
- data only in memory. This behavior makes it impossible to have multiple
- instances of this service in a cluster, as there is no way for one of the
- instances to know the tokens issued by the others.
-
- - The consoleauth service can use a memcached server to store those tokens,
- but again, if we want to share them among different instances we would be
- relying on one memcached server, which makes this solution unsuitable for a
- highly available architecture where we should be able to replicate all of
- the services in our cluster.
-
-+ Gap
-
- - The consoleauth service stores the tokens and the connection data only in
- memory. This makes it impossible to have multiple instances of this service
- in a cluster, as there is no way for one instance to know the tokens issued
- by the others.
-
-* Related BP
-
- + https://blueprints.launchpad.net/nova/+spec/consoleauth-tokens-in-db
-
- The advice in the blueprint is to use memcached as a backend. Looking at the
- documentation, memcached is not able to replicate data, so this is not a
- complete solution. But redis (http://redis.io/) may be a suitable backend
- for storing tokens so that they survive node failures. This blueprint is not
- directly needed for this gap.
-
-Findings:
-
-This bp has been rejected, since the community feedback is that A/A can be
-supported by memcached. The use case for this bp is not quite clear, since when
-the consoleauth service is down and the token is lost, the service can retrieve
-the token again after it recovers. This can be accomplished through a different
-OpenStack configuration, so it is not considered a gap. The recommendation of
-the team is to verify the redis approach. A configuration sketch follows.
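-
-A hedged sketch of the kind of nova.conf change involved - the option name and
-server addresses are illustrative and vary across OpenStack releases:
-
-.. code-block:: ini
-
-    [DEFAULT]
-    # Point every consoleauth instance at the same shared memcached pool so
-    # tokens issued by one instance are visible to the others.
-    memcached_servers = ctrl1:11211,ctrl2:11211,ctrl3:11211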
-
-
-GAP 4: Active/Active HA of cinder-volume
-========================================
-
-* Type: 'reliability/scalability'
-
-* Description:
-
- + Desired State:
-
- - Cinder-volume can run in an active/active configuration.
-
- + Current State:
-
- - Only one cinder-volume instance can be active. Failover has to be handled
- by an external mechanism such as pacemaker/corosync.
-
- + Gap
-
- - Cinder-volume doesn't support an active/active configuration.
-
-* Related BP
-
- + https://blueprints.launchpad.net/cinder/+spec/cinder-volume-active-active-support
-
-* Findings:
-
- + This blueprint is underway for Newton - as of July 6, 2016 great progress
- had been made, and we will continue to monitor it.
-
-GAP 5: Cinder volume multi-attachment
-=====================================
-
-* Type: 'reliability'
-* Description:
-
- + Desired State
-
- - Cinder volumes can be attached to multiple VMs at the same time, so that
- active/standby stateful VNFs can share the same Cinder volume.
-
- + Current State
-
- - Cinder volumes can only be attached to one VM at a time.
-
- + Gap
-
- - Nova and cinder do not allow for multiple simultaneous attachments.
-
-* Related BP
-
- + https://blueprints.launchpad.net/openstack/?searchtext=multi-attach-volume
-
-* Findings
-
- + Multi-attach volume is still WIP in OpenStack. There is coordination work
- required with Nova; the intended usage is sketched after this list.
- + At risk for Newton
- + Recommend adding a Yardstick test case.
-
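-A rough sketch of the intended usage once the feature lands, using the python
-cinderclient and novaclient APIs - the multiattach flag, endpoint, credentials,
-and VM UUIDs are illustrative assumptions while the work is in progress:
-
-.. code-block:: python
-
-    from cinderclient import client as cinder_client
-    from keystoneauth1 import session
-    from keystoneauth1.identity import v3
-    from novaclient import client as nova_client
-
-    auth = v3.Password(auth_url='http://controller:5000/v3',
-                       username='admin', password='secret',
-                       project_name='admin', user_domain_id='default',
-                       project_domain_id='default')
-    sess = session.Session(auth=auth)
-    cinder = cinder_client.Client('2', session=sess)
-    nova = nova_client.Client('2', session=sess)
-
-    # Create a volume flagged for multi-attach (flag subject to change
-    # while the feature is WIP).
-    volume = cinder.volumes.create(size=1, multiattach=True)
-
-    # Attach the same volume to the active and the standby VNF VM.
-    nova.volumes.create_server_volume('ACTIVE_VM_UUID', volume.id)
-    nova.volumes.create_server_volume('STANDBY_VM_UUID', volume.id)
-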
-As a general comment for the next release, remote volume replication is another
-important project for storage HA. The HA team will monitor this multi-blueprint
-activity, which will span multiple OpenStack releases. The blueprints aren't
-approved yet, and there are dependencies on generic-volume-group.
-
-
-
-GAP 6: HA tests improvements in Fuel
-====================================
-
-* Type: 'robustness'
-* Description:
-
- + Desired State
-
- - Increased test coverage for HA during install
-
- + Current State
-
- - A few test cases are available
-
-* Related BP
-
- - https://blueprints.launchpad.net/fuel/+spec/ha-test-improvements
- - Tie in with the test plans we have discussed previously.
- - Look at Yardstick tests that could be proposed back to OpenStack.
- - Discussions are planned with the Yardstick team to engage with the
- OpenStack community to enhance Fuel or Tempest as appropriate.
-
-
-Next Steps:
-^^^^^^^^^^^
-
-The six gaps above demonstrate that ongoing progress is being made in various
-OPNFV and OpenStack communities. The OPNFV-HA team will work to suggest
-blueprints for the next OpenStack Summit to help continue the progress of high
-availability in the community.
diff --git a/docs/scenarios/GAP_Analysis_Colorado.rst.bak b/docs/scenarios/GAP_Analysis_Colorado.rst.bak
deleted file mode 100644
index b6b7313..0000000
--- a/docs/scenarios/GAP_Analysis_Colorado.rst.bak
+++ /dev/null
diff --git a/docs/scenarios/index.rst b/docs/scenarios/index.rst
deleted file mode 100644
index dec46c7..0000000
--- a/docs/scenarios/index.rst
+++ /dev/null
@@ -1,13 +0,0 @@
-.. This work is licensed under a Creative Commons Attribution 4.0 International License.
-.. http://creativecommons.org/licenses/by/4.0
-.. (c) <optionally add copyright holder's name>
-
-*********************************
-Gap Analysis of High Availability
-*********************************
-
-.. toctree::
- :numbered:
- :maxdepth: 4
-
- GAP_Analysis_Colorado.rst