author | Yujun Zhang <zhang.yujunz@zte.com.cn> | 2017-09-06 14:05:36 +0800 |
---|---|---|
committer | Ryota Mibu <r-mibu@cq.jp.nec.com> | 2017-09-27 05:27:53 +0000 |
commit | b20f69d5423280d6b41c591129cbc0555a040868 (patch) | |
tree | 89b040c3a80b83f606c8f9c3cb99bab3c471a865 | |
parent | 54b4b3418aae6d649411b531f0aa8bc4d9fd6b2c (diff) |
Add parallel execution and shortcut notification to inspector design guideline
JIRA: DOCTOR-73
Change-Id: Ic412b0c5e966f4391bc0f9e5e71d64e23e2eba68
Signed-off-by: Yujun Zhang <zhang.yujunz@zte.com.cn>
(cherry picked from commit 1f00955295c2461a181aa1fa5d8587f12832bf4d)
-rw-r--r-- | docs/development/design/images/conservative-notification.png | bin | 0 -> 63926 bytes |
-rw-r--r-- | docs/development/design/images/notification-time.png | bin | 0 -> 34847 bytes |
-rw-r--r-- | docs/development/design/images/shortcut-notification.png | bin | 0 -> 66098 bytes |
-rw-r--r-- | docs/development/design/index.rst | 1 |
-rw-r--r-- | docs/development/design/inspector-design-guideline.rst | 48 |
5 files changed, 48 insertions, 1 deletions
diff --git a/docs/development/design/images/conservative-notification.png b/docs/development/design/images/conservative-notification.png
Binary files differ
new file mode 100644
index 00000000..b2645720
--- /dev/null
+++ b/docs/development/design/images/conservative-notification.png
diff --git a/docs/development/design/images/notification-time.png b/docs/development/design/images/notification-time.png
Binary files differ
new file mode 100644
index 00000000..8e140172
--- /dev/null
+++ b/docs/development/design/images/notification-time.png
diff --git a/docs/development/design/images/shortcut-notification.png b/docs/development/design/images/shortcut-notification.png
Binary files differ
new file mode 100644
index 00000000..54a3ce28
--- /dev/null
+++ b/docs/development/design/images/shortcut-notification.png
diff --git a/docs/development/design/index.rst b/docs/development/design/index.rst
index e50c1704..713bb9b4 100644
--- a/docs/development/design/index.rst
+++ b/docs/development/design/index.rst
@@ -27,3 +27,4 @@ See also https://wiki.opnfv.org/requirements_projects .
     inspector-design-guideline.rst
     performance-profiler.rst
     maintenance-design-guideline.rst
+    inspector-design-guideline.rst
diff --git a/docs/development/design/inspector-design-guideline.rst b/docs/development/design/inspector-design-guideline.rst
index faa5e424..5396f883 100644
--- a/docs/development/design/inspector-design-guideline.rst
+++ b/docs/development/design/inspector-design-guideline.rst
@@ -53,7 +53,51 @@ This guideline can be summarized as following:
 Parallel execution
 ------------------
 
-TBD, see `discussion in mailing list`_.
+In doctor's architecture, the inspector is responsible to set error state for the affected VMs in order to notify the
+consumers of such failure. This is done by calling the nova `reset-state`_ API. However, this action is a synchronous
+request with many underlying steps and cost typically hundreds of milliseconds. According to the
+`discussion in mailing list`_, this time cost will grow linearly if the requests are sent one by one. It will become
+a critical issue in large scale system.
+
+It is recommended to introduce **parallel execution** for actions like ``reset-state`` that takes a list of targets.
+
+Shortcut notification
+---------------------
+
+An alternative way to improve notification performance is to take a shortcut from inspector to notifier instead of
+triggering it from controller. The difference between the two workflow is shown below:
+
+.. figure:: images/conservative-notification.png
+   :alt: conservative notification
+
+   Conservative Notification
+
+.. figure:: images/shortcut-notification.png
+   :alt: shortcut notification
+
+   Shortcut Notification
+
+It worth noting that the shortcut notification has a side effect that cloud resource states could still be out-of-sync
+by the time consumer processes the alarm notification. This is out of scope of inspector design but need to be taken
+consideration in system level.
+
+Also the call of "reset servers state to error" is not necessary in the alternative notification case where the "host
+forced down" is still called. "get-valid-server-state" was implemented to have valid server state while earlier one
+couldn't get it unless calling "reset servers state to error". When not having "reset servers state to error", states
+are more unlikely to be out of sync while notification and force down host would be parallel.
+
+Appendix
+========
+
+A study has been made to evaluate the effect of parallel execution and shortcut notification on OPNFV Beijing Summit
+2017.
+
+.. figure:: images/notification-time.png
+   :alt: notification time
+
+   Notification Time
+
+Download the `full presentation slides`_ here.
 
 .. _DOCTOR-73: https://jira.opnfv.org/browse/DOCTOR-73
 .. _OPNFV Doctor project: https://wiki.opnfv.org/doctor
@@ -61,3 +105,5 @@ TBD, see `discussion in mailing list`_.
 .. _patch set for caching the list: https://gerrit.opnfv.org/gerrit/#/c/20877/
 .. _DOCTOR-76: https://jira.opnfv.org/browse/DOCTOR-76
 .. _discussion in mailing list: https://lists.opnfv.org/pipermail/opnfv-tech-discuss/2016-October/013036.html
+.. _reset-state: https://developer.openstack.org/api-ref/compute/#reset-server-state-os-resetstate-action
+.. _full presentation slides: https://wiki.opnfv.org/download/attachments/5046291/doctor_qtip_faster_higher_stronger.pdf
\ No newline at end of file
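
Illustration (not part of the commit): the "Parallel execution" section added by this patch recommends sending ``reset-state`` requests for a list of targets concurrently instead of one by one. A minimal sketch of that idea follows. It assumes a thread pool is an acceptable way to overlap the synchronous requests; `reset_state`, `reset_states_serial`, `reset_states_parallel`, the `delay` parameter and the `max_workers` value are all hypothetical names chosen for this example. The real nova call is replaced by a sleep, so the sketch shows only the timing behaviour, not the Doctor inspector's actual code.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def reset_state(server_id, delay=0.05):
    """Hypothetical stand-in for one synchronous reset-state request."""
    time.sleep(delay)  # simulated per-request latency, not a real API call
    return server_id, "error"


def reset_states_serial(server_ids, delay=0.05):
    """Send the requests one by one; total time grows linearly with the list."""
    return [reset_state(server_id, delay) for server_id in server_ids]


def reset_states_parallel(server_ids, delay=0.05, max_workers=8):
    """Send the requests from a thread pool so their latencies overlap."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as server_ids
        return list(pool.map(lambda s: reset_state(s, delay), server_ids))


if __name__ == "__main__":
    servers = ["vm-%d" % i for i in range(8)]
    for func in (reset_states_serial, reset_states_parallel):
        start = time.perf_counter()
        func(servers)
        print(func.__name__, "took %.2fs" % (time.perf_counter() - start))
```

Threads suit this case because each request spends its time waiting on I/O rather than computing. With a real client, the pool size would need to respect whatever rate the API endpoint can sustain.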