New test specs for ha database/controller_restart

New test specifications have been created for the Dovetail project.
Test descriptions are related to the following test cases:
- dovetail.ha.database
- dovetail.ha.controller_restart

JIRA: DOVETAIL-680
JIRA: DOVETAIL-681

Change-Id: I632cb69f9166a46e76f38a467f078fe5f31b63b3
Signed-off-by: Panagiotis Karalis <pkaralis@intracom-telecom.com>
author: Panagiotis Karalis <pkaralis@intracom-telecom.com> 2018-07-04 18:24:17 +0300
committer: Panagiotis Karalis <pkaralis@intracom-telecom.com> 2018-07-09 18:02:50 +0300
commit: 9d688995687a701ac1b5572e74b6c028885c92ea (patch)
tree: 727ba270ea55a9bb357bb76fbdeb3f8b85102975 /docs/testing/user/testspecification/highavailability
parent: cd122a0884564060743677548a4e522a9d8199c3 (diff)
1 files changed, 150 insertions, 0 deletions
diff --git a/docs/testing/user/testspecification/highavailability/index.rst b/docs/testing/user/testspecification/highavailability/index.rst
index 280a241e..443abd0e 100644
--- a/docs/testing/user/testspecification/highavailability/index.rst
+++ b/docs/testing/user/testspecification/highavailability/index.rst
@@ -749,5 +749,155 @@ Post conditions
 
 Restart the processes of "haproxy" if they are not running.
 
+----------------------------------------------------------------
+Test Case 9 - Controller node OpenStack service down - Database
+----------------------------------------------------------------
 
+Short name
+----------
+
+dovetail.ha.database
+
+Use case specification
+----------------------
+
+This test case verifies that the high availability of the database instances
+used by OpenStack (mysql) on a controller node is working properly.
+Specifically, this test case kills the database service processes on a
+selected controller node, then checks whether requests made with the related
+OpenStack commands still succeed and whether the killed processes recover.
+
+Test preconditions
+------------------
+
+In this test case, an attacker called "kill-process" is needed.
+This attacker includes three parameters: fault_type, process_name and host.
+
+The purpose of this attacker is to kill any process with a specific process
+name that is running on the host node. If multiple processes share the
+same name on the host node, this attacker kills all of them.
+
+Basic test flow execution description and pass/fail criteria
+------------------------------------------------------------
+
+Methodology for verifying service continuity and recovery
+'''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+To verify this service, two different monitors are used.
+
+The first monitor uses an OpenStack command and acts as a watcher for the
+database connection of different OpenStack components.
+
+The second monitor is a process monitor whose main purpose is to watch
+whether the database processes on the host node are killed properly.
+
+Therefore, in this test case, there are two metrics:
+
+- service_outage_time, which indicates the maximum outage time (seconds)
+  of the specified OpenStack command request
+- process_recover_time, which indicates the maximum time (seconds) from the
+  process being killed to being recovered
+
+Test execution
+''''''''''''''
+* Test action 1: Connect to Node1 through SSH, and check that "database"
+  processes are running on Node1.
+* Test action 2: Start two monitors: one for the "database" processes on the
+  host node and the other for connections to the database from OpenStack
+  components, verifying the results of openstack image list, openstack router
+  list, openstack stack list and openstack volume list.
+  Each monitor runs as an independent process.
+* Test action 3: Connect to Node1 through SSH, and then kill the "mysql"
+  process(es).
+* Test action 4: Stop the monitors after a period of time specified by
+  "waiting_time". The monitor info is then aggregated.
+* Test action 5: Verify the SLA and set the verdict of the test case to pass or
+  fail.
+
+
+Pass / fail criteria
+''''''''''''''''''''
+
+Check whether the SLA is passed:
+
+- The process outage time is less than 30s.
+- The service outage time is less than 5s.
+
+The database operations are carried out in the above order and no errors occur.
+
+A negative result will be generated if any of the above is not fully met.
+
+Post conditions
+---------------
+
+The database service is up and running again.
+If the database service did not recover successfully by itself,
+the test explicitly restarts the database service.
+
+---------------------------------------------------------------------------
+Test Case 10 - Controller node OpenStack service down - Controller Restart
+---------------------------------------------------------------------------
+
+Short name
+----------
+
+dovetail.ha.controller_restart
+
+Use case specification
+----------------------
+
+This test case verifies that the high availability of the controller node is
+working properly.
+Specifically, this test case shuts down a specified controller node via IPMI,
+then uses monitor tools to check whether all services provided by the
+controller node are still OK.
+
+Test preconditions
+------------------
+
+In this test case, an attacker called "host-shutdown" is needed.
+This attacker includes two parameters: fault_type and host.
+
+The purpose of this attacker is to shut down a controller and check whether the
+services handled by this controller are still working normally.
+
+Basic test flow execution description and pass/fail criteria
+------------------------------------------------------------
+
+Methodology for verifying service continuity and recovery
+'''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+To verify this service, one monitor is used.
+
+This monitor runs an OpenStack command for the respective OpenStack component
+to verify that the corresponding service is still running normally.
+
+In this test case, there is one metric: service_outage_time, which indicates
+the maximum outage time (seconds) of the specified OpenStack command request.
+
+Test execution
+''''''''''''''
+* Test action 1: Connect to Node1 through SSH, and check that the controller
+  services are running normally.
+* Test action 2: Start the monitors: each monitor runs as an independent
+  process, monitoring the image list, router list, stack list and volume list
+  respectively. The monitor info is collected.
+* Test action 3: Using the IPMI component, shut down Node1 remotely.
+* Test action 4: Stop the monitors after a period of time specified by
+  "waiting_time". The monitor info is then aggregated.
+* Test action 5: Verify the SLA and set the verdict of the test case to pass or
+  fail.
+
+
+Pass / fail criteria
+''''''''''''''''''''
+
+Check whether the SLA is passed:
+
+- The process outage time is less than 30s.
+- The service outage time is less than 5s.
+
+The controller operations are carried out in the above order and no errors
+occur.
+
+A negative result will be generated if any of the above is not fully met.
+
+Post conditions
+---------------
 
+The controller has been restarted.
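
The metrics and SLA thresholds described in both test cases can be illustrated with a minimal Python sketch. It is not part of this change and is not how the test framework measures outages. The function names, the `(timestamp, succeeded)` sample format, and the rule that an outage runs from the first failed probe to the next successful one are all assumptions made for illustration. Only the 5s and 30s limits come from the specification text.

```python
from typing import List, Optional, Tuple

# Thresholds taken from the "Pass / fail criteria" sections of the spec.
SERVICE_OUTAGE_LIMIT_S = 5.0
PROCESS_RECOVER_LIMIT_S = 30.0

# Hypothetical sample format: (timestamp in seconds, probe succeeded?)
Sample = Tuple[float, bool]


def max_outage_time(samples: List[Sample]) -> float:
    """Return the longest contiguous failure window, in seconds.

    Assumption: an outage starts at the first failed probe and ends at the
    next successful probe. An outage that never recovers is measured up to
    the last sample.
    """
    longest = 0.0
    outage_start: Optional[float] = None
    last_ts: Optional[float] = None
    for ts, ok in samples:
        if not ok and outage_start is None:
            outage_start = ts  # outage begins
        elif ok and outage_start is not None:
            longest = max(longest, ts - outage_start)  # outage ends
            outage_start = None
        last_ts = ts
    if outage_start is not None and last_ts is not None:
        longest = max(longest, last_ts - outage_start)
    return longest


def sla_passed(service_outage_time: float,
               process_recover_time: Optional[float] = None) -> bool:
    """Apply the "less than" limits from the spec to the measured metrics.

    process_recover_time is optional because the controller restart test
    case defines only the service_outage_time metric.
    """
    if service_outage_time >= SERVICE_OUTAGE_LIMIT_S:
        return False
    if (process_recover_time is not None
            and process_recover_time >= PROCESS_RECOVER_LIMIT_S):
        return False
    return True


if __name__ == "__main__":
    probes = [(0.0, True), (1.0, False), (2.0, False), (3.5, True)]
    outage = max_outage_time(probes)
    print(outage, sla_passed(outage, process_recover_time=12.0))
```

Both limits are treated as strict bounds, so a service outage of exactly 5s or a recovery time of exactly 30s fails, which follows the "less than" wording of the criteria.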