author | Ryota MIBU <r-mibu@cq.jp.nec.com> | 2015-11-26 23:48:06 +0900 |
---|---|---|
committer | Ryota MIBU <r-mibu@cq.jp.nec.com> | 2015-12-02 00:14:07 +0900 |
commit | 4b620af0a7c1b34f42241195661627304e993236 (patch) | |
tree | 24dc64ad4ca0548d697d40f7ebd6db627f9448a4 /requirements/07-annex.rst | |
parent | c2f8523fe12c93813b8e459d093f0111c9dc1f31 (diff) |
change dirs to use new opnfv doc build script
Change-Id: Icfc17b1370fc111e0e9919f2f1c1d9ea8aee2702
Signed-off-by: Ryota MIBU <r-mibu@cq.jp.nec.com>
Diffstat (limited to 'requirements/07-annex.rst')
-rw-r--r-- | requirements/07-annex.rst | 64 |
1 files changed, 0 insertions, 64 deletions
diff --git a/requirements/07-annex.rst b/requirements/07-annex.rst
deleted file mode 100644
index dbe41bd1..00000000
--- a/requirements/07-annex.rst
+++ /dev/null
@@ -1,64 +0,0 @@
-Annex: NFVI Faults
-=================================================
-
-Faults in the listed elements need to be immediately notified to the Consumer in
-order to perform an immediate action like live migration or switch to a hot
-standby entity. In addition, the Administrator of the host should trigger a
-maintenance action to, e.g., reboot the server or replace a defective hardware
-element.
-
-Faults can be of different severity, i.e., critical, warning, or
-info. Critical faults require immediate action as a severe degradation of the
-system has happened or is expected. Warnings indicate that the system
-performance is going down: related actions include closer (e.g. more frequent)
-monitoring of that part of the system or preparation for a cold migration to a
-backup VM. Info messages do not require any action. We also consider a type
-"maintenance", which is no real fault, but may trigger maintenance actions
-like a re-boot of the server or replacement of a faulty, but redundant HW.
-
-Faults can be gathered by, e.g., enabling SNMP and installing some open source
-tools to catch and poll SNMP. When using for example Zabbix one can also put an
-agent running on the hosts to catch any other fault. In any case of failure, the
-Administrator should be notified. The following table provides a list of high
-level faults that are considered within the scope of the Doctor project
-requiring immediate action by the Consumer.
-
-
-
-+------------------+---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| Service | Fault | Severity | How to detect? | Comment | Action to recover |
-+------------------+---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| Compute Hardware | Processor/CPU failure, CPU condition not ok | Critical | Zabbix | | Switch to hot standby |
-+ +---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| | Memory failure/Memory condition not ok | Critical | Zabbix (IPMI) | | Switch to hot standby |
-+ +---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| | Network card failure, e.g. network adapter connectivity lost | Critical | Zabbix/Ceilometer | | Switch to hot standby |
-+ +---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| | Disk crash | Info | RAID monitoring | Network storage is very redundant (e.g. RAID system) and can guarantee high availability | Inform OAM |
-+ +---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| | Storage controller | Critical | Zabbix (IPMI) | | Live migration if storage is still accessible; otherwise hot standby |
-+ +---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| | PDU/power failure, power off, server reset | Critical | Zabbix/Ceilometer | | Switch to hot standby |
-+ +---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| | Power degradation, power redundancy lost, power threshold exceeded | Warning | SNMP | | Live migration |
-+ +---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| | Chassis problem (.e.g fan degraded/failed, chassis power degraded), CPU fan problem, temperature/thermal condition not ok | Warning | SNMP | | Live migration |
-+ +---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| | Mainboard failure | Critical | Zabbix (IPMI) | | Switch to hot standby |
-+ +---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| | OS crash (e.g. kernel panic) | Critical | Zabbix | | Switch to hot standby |
-+------------------+---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| Hypervisor | System has restarted | Critical | Zabbix | | Switch to hot standby |
-+ +---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| | Hypervisor failure | Warning/Critical | Zabbix/Ceilometer | | Evacuation/switch to hot standby |
-+ +---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| | Zabbix/Ceilometer is unreachable | Warning | ? | | Live migration |
-+------------------+---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| Network | SDN/OpenFlow switch, controller degraded/failed | Critical | ? | | Switch to hot standby or reconfigure virtual network topology |
-+ +---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-| | Hardware failure of physical switch/router | Warning | SNMP | Redundancy of physical infrastructure is reduced or no longer available | Live migration if possible, otherwise evacuation |
-+------------------+---------------------------------------------------------------------------------------------------------------------------+------------------+-------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------+
-
-..
- vim: set tabstop=4 expandtab textwidth=80:
-
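The severity rules and the fault table in the removed annex can be read as a small lookup structure. The following is only an illustrative sketch, not code from the Doctor project: the names `Severity`, `Fault`, `FAULTS`, `needs_immediate_action`, and `faults_by_severity` are hypothetical, and only four of the table rows are transcribed.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    """Severity levels named in the annex text."""
    CRITICAL = "critical"
    WARNING = "warning"
    INFO = "info"


@dataclass(frozen=True)
class Fault:
    """One row of the annex table (hypothetical representation)."""
    service: str
    description: str
    severity: Severity
    recovery: str


# A few rows transcribed from the annex table; the list is not complete.
FAULTS = [
    Fault("Compute Hardware", "Processor/CPU failure, CPU condition not ok",
          Severity.CRITICAL, "Switch to hot standby"),
    Fault("Compute Hardware", "Disk crash",
          Severity.INFO, "Inform OAM"),
    Fault("Compute Hardware",
          "Power degradation, power redundancy lost, power threshold exceeded",
          Severity.WARNING, "Live migration"),
    Fault("Network", "Hardware failure of physical switch/router",
          Severity.WARNING, "Live migration if possible, otherwise evacuation"),
]


def needs_immediate_action(fault: Fault) -> bool:
    """Per the annex, only critical faults require immediate action."""
    return fault.severity is Severity.CRITICAL


def faults_by_severity(severity: Severity) -> list:
    """Return the transcribed faults that carry the given severity."""
    return [f for f in FAULTS if f.severity is severity]
```

For example, `faults_by_severity(Severity.WARNING)` selects the power degradation and physical switch/router rows, both of which the table maps to live migration rather than a switch to hot standby.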