.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. http://creativecommons.org/licenses/by/4.0

Doctor Configuration
====================

OPNFV installers deploy most components of the Doctor framework, including
OpenStack Nova, Neutron and Cinder (Doctor Controller) and OpenStack
Ceilometer and Aodh (Doctor Notifier), but not the Doctor Monitor.

After the major components of OPNFV are deployed, you can set up the Doctor
functions by following the instructions in this section. Detailed steps for
all supported installers are available under `doctor/doctor_tests/installer`_.

.. _doctor/doctor_tests/installer: https://git.opnfv.org/doctor/tree/doctor_tests/installer

Doctor Inspector
----------------

You need to configure one of the Doctor Inspectors below. Detailed steps for
all supported Inspectors are available under `doctor/doctor_tests/inspector`_.

.. _doctor/doctor_tests/inspector: https://git.opnfv.org/doctor/tree/doctor_tests/inspector

**Sample Inspector**

The Sample Inspector is intended to show the minimum functions of a Doctor
Inspector.

The Sample Inspector should preferably be placed on one of the controller
nodes, but it can run on any host from which it can reach and access the
OpenStack Controllers (e.g. Nova, Neutron).

Make sure the OpenStack environment parameters are set properly, so that the
Sample Inspector can issue admin actions such as forcing down a compute host
and updating the state of a VM.

Then, you can configure the Sample Inspector as follows:

.. code-block:: bash

    git clone https://gerrit.opnfv.org/gerrit/doctor
    cd doctor/doctor_tests/inspector
    INSPECTOR_PORT=12345
    # Run the Sample Inspector in the background and capture its output.
    python sample.py "$INSPECTOR_PORT" > inspector.log 2>&1 &

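
Before starting the Sample Inspector, it may help to confirm that the OpenStack
credentials are actually present in the shell. The following is a minimal
sketch and not part of Doctor: ``check_openstack_env`` is a hypothetical
helper, and the variable list is an assumption based on the variables used in
the Congress example in this section, so adjust it to match your credentials
file.

.. code-block:: bash

    # Hypothetical helper (not part of Doctor): report OpenStack environment
    # variables that are unset or empty. The variable list is an assumption.
    # The function body runs in a subshell, so its variables do not leak.
    check_openstack_env() (
        missing=0
        for var in OS_AUTH_URL OS_USERNAME OS_PASSWORD OS_TENANT_NAME; do
            # Look up the value of the variable whose name is stored in $var.
            eval "value=\${$var:-}"
            if [ -z "$value" ]; then
                echo "missing: $var" >&2
                missing=1
            fi
        done
        # Succeed only if every variable had a non-empty value.
        [ "$missing" -eq 0 ]
    )

Run ``check_openstack_env`` after sourcing your credentials file. A non-zero
exit status means at least one variable is missing, and the names are printed
to standard error.
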
**Congress**

OpenStack `Congress`_ is a Governance as a Service project (previously Policy
as a Service). Congress implements the Doctor Inspector, as it can inspect a
fault situation and propagate errors onto other entities.

.. _Congress: https://governance.openstack.org/tc/reference/projects/congress.html

Congress is deployed by the OPNFV Apex installer. You need to enable the doctor
datasource driver and set policy rules. With the example configuration below,
Congress forces down the nova-compute service when it receives a fault event
for that compute host. Congress also sets the state of all VMs running on
that host from ACTIVE to ERROR.

.. code-block:: bash

    # Register the Doctor datasource, which receives the fault events.
    openstack congress datasource create doctor "doctor"

    # Register the Nova datasource. NOVA_MICRO_VERSION and the OS_* variables
    # must be set beforehand.
    openstack congress datasource create --config api_version=$NOVA_MICRO_VERSION \
        --config username=$OS_USERNAME --config tenant_name=$OS_TENANT_NAME \
        --config password=$OS_PASSWORD --config auth_url=$OS_AUTH_URL \
        nova "nova21"

    # A host is down when a matching Doctor event has been received.
    openstack congress policy rule create \
        --name host_down classification \
        'host_down(host) :-
            doctor:events(hostname=host, type="compute.host.down", status="down")'

    # Map each ACTIVE instance to the host it runs on.
    openstack congress policy rule create \
        --name active_instance_in_host classification \
        'active_instance_in_host(vmid, host) :-
            nova:servers(id=vmid, host_name=host, status="ACTIVE")'

    # Force down the nova-compute service on a host that is down.
    openstack congress policy rule create \
        --name host_force_down classification \
        'execute[nova:services.force_down(host, "nova-compute", "True")] :-
            host_down(host)'

    # Set every ACTIVE instance on a down host to the error state.
    openstack congress policy rule create \
        --name error_vm_states classification \
        'execute[nova:servers.reset_state(vmid, "error")] :-
            host_down(host),
            active_instance_in_host(vmid, host)'

**Vitrage**

OpenStack `Vitrage`_ is an RCA (Root Cause Analysis) service for organizing,
analyzing and expanding OpenStack alarms and events. Vitrage implements the
Doctor Inspector: it receives a notification that a host is down and calls the
Nova force-down API. In addition, it raises alarms on the instances running on
that host.

.. _Vitrage: https://wiki.openstack.org/wiki/Vitrage

Vitrage is not deployed by OPNFV installers yet. It can be installed either on
top of a devstack environment or on top of a real OpenStack environment. See
`Vitrage Installation`_.

.. _`Vitrage Installation`: https://docs.openstack.org/developer/vitrage/installation-and-configuration.html

The Doctor SB API and a Doctor datasource were implemented in Vitrage in the
Ocata release. The Doctor datasource is enabled by default.

After Vitrage is installed and configured, you need to configure it to
support the Doctor use case. This takes a few steps:

1. Make sure that 'aodh' and 'doctor' are included in the list of datasource
   types in /etc/vitrage/vitrage.conf:

   .. code-block:: ini

       [datasources]
       types = aodh,doctor,nova.host,nova.instance,nova.zone,static,cinder.volume,neutron.network,neutron.port,heat.stack

2. Enable the Vitrage Nova notifier by setting the following line in
   /etc/vitrage/vitrage.conf:

   .. code-block:: ini

       [DEFAULT]
       notifiers = nova

3. Add a template that calls Nova force-down when Vitrage receives a
   'compute.host.down' alarm. Copy the `template`_ and place it under
   /etc/vitrage/templates.

   .. _template: https://github.com/openstack/vitrage/blob/master/etc/vitrage/templates.sample/host_down_scenarios.yaml

4. Restart the vitrage-graph and vitrage-notifier services.

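
The two settings from steps 1 and 2 can be checked with a short script before
the services are restarted. The following is only a sketch under stated
assumptions: ``ini_get``, ``list_has`` and ``check_vitrage_conf`` are
hypothetical helpers, not Vitrage tools, and the parser assumes plain
``key = value`` lines without continuation lines.

.. code-block:: bash

    # Hypothetical helper: print the value of a key inside one INI section.
    # Arguments: $1 = file, $2 = section name, $3 = key.
    ini_get() {
        awk -v section="$2" -v key="$3" '
            /^[ \t]*\[.*\][ \t]*$/ {
                # Section header: remember the name without the brackets.
                current = $0
                gsub(/^[ \t]*\[|\][ \t]*$/, "", current)
                next
            }
            current == section {
                eq = index($0, "=")
                if (eq == 0) next
                name = substr($0, 1, eq - 1)
                value = substr($0, eq + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", name)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                if (name == key) print value
            }
        ' "$1"
    }

    # Succeed if the comma-separated list in $1 contains the item in $2.
    list_has() {
        case ",$(printf '%s' "$1" | tr -d ' ')," in
            *",$2,"*) return 0 ;;
            *) return 1 ;;
        esac
    }

    # Succeed if the file has the datasource types and notifier from steps 1-2.
    # The body runs in a subshell, so its variables do not leak.
    check_vitrage_conf() (
        conf=${1:-/etc/vitrage/vitrage.conf}
        types=$(ini_get "$conf" datasources types)
        notifiers=$(ini_get "$conf" DEFAULT notifiers)
        list_has "$types" aodh && list_has "$types" doctor &&
            list_has "$notifiers" nova
    )

Calling ``check_vitrage_conf`` without arguments reads
/etc/vitrage/vitrage.conf. It returns a non-zero exit status if 'aodh' or
'doctor' is missing from the datasource types, or if the 'nova' notifier is
not enabled.
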
Doctor Monitors
---------------

Doctor Monitors should preferably be placed on one of the controller nodes,
but they can run on any host that can reach the target compute host and is
accessible by the Doctor Inspector.

You need to configure a Monitor for each compute host, one by one. Detailed
steps for all supported monitors are available under `doctor/doctor_tests/monitor`_.

.. _doctor/doctor_tests/monitor: https://git.opnfv.org/doctor/tree/doctor_tests/monitor

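
Because every compute host needs its own Monitor, starting them by hand can
get repetitive. The sketch below shows one possible way to start a Sample
Monitor (described next) for each host. It is not part of Doctor.
``start_monitors`` and the ``DRY_RUN`` switch are hypothetical names, and the
input format (one ``hostname ip`` pair per line on standard input) is an
assumption. The function expects to be run from ``doctor/doctor_tests/monitor``
with the Inspector listening on the local host, and falls back to port 12345
when ``INSPECTOR_PORT`` is unset.

.. code-block:: bash

    # Hypothetical helper (not part of Doctor): start one Sample Monitor per
    # "hostname ip" line read from standard input.
    # Set DRY_RUN=1 to print the commands instead of running them.
    start_monitors() {
        local host ip inspector_url
        inspector_url="http://127.0.0.1:${INSPECTOR_PORT:-12345}/events"
        while read -r host ip; do
            # Skip blank lines and comment lines.
            case "$host" in
                ''|'#'*) continue ;;
            esac
            if [ "${DRY_RUN:-0}" = 1 ]; then
                printf '%s\n' "sudo python sample.py $host $ip $inspector_url > monitor-$host.log 2>&1 &"
            else
                # Detach stdin so the monitor cannot consume the host list.
                sudo python sample.py "$host" "$ip" "$inspector_url" \
                    < /dev/null > "monitor-$host.log" 2>&1 &
            fi
        done
    }

With ``DRY_RUN=1`` the function only prints the commands it would run, which
is a cheap way to review the host list before anything is started with
``sudo``. Each monitor writes to its own ``monitor-<hostname>.log`` file.
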
**Sample Monitor**

You can configure the Sample Monitor as follows (example for an Apex deployment):

.. code-block:: bash

    git clone https://gerrit.opnfv.org/gerrit/doctor
    cd doctor/doctor_tests/monitor
    INSPECTOR_PORT=12345
    COMPUTE_HOST='overcloud-novacompute-1.localdomain.com'
    COMPUTE_IP=192.30.9.5
    # Run the Sample Monitor in the background, pointing it at the
    # Inspector's event URL.
    sudo python sample.py "$COMPUTE_HOST" "$COMPUTE_IP" \
        "http://127.0.0.1:$INSPECTOR_PORT/events" > monitor.log 2>&1 &

**Collectd Monitor**