From 6d63f786a25688ff2033cb53f1932ec357b1c5df Mon Sep 17 00:00:00 2001 From: fuqiao Date: Fri, 6 Nov 2015 16:57:34 +0800 Subject: Scenario Analysis doc - multisite scenario Scenario Analysis doc - multisite Scenario JIRA:HA-18 Change-Id: I6fb47f4bf4441cf3e06955548f2f5f3da5cf8b5a Signed-off-by: fuqiao --- Scenario/scenario_analysis_multi_site.rst | 70 +++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) create mode 100644 Scenario/scenario_analysis_multi_site.rst diff --git a/Scenario/scenario_analysis_multi_site.rst b/Scenario/scenario_analysis_multi_site.rst new file mode 100644 index 0000000..016fe58 --- /dev/null +++ b/Scenario/scenario_analysis_multi_site.rst @@ -0,0 +1,70 @@ +5, Multisite Scenario +==================================================== + +The Multisite scenario refers to the cases in which VNFs are deployed on multiple VIMs. +There are two typical usecases for such a scenario. + +One is that in one DC, multiple openstack clouds are deployed. Taking into consideration that the +number of compute nodes in one openstack cloud is quite limited (nearly 100) for +both opensource and commercial products of openstack, multiple openstack clouds will +have to be deployed in the DC to manage thousands of servers. It should be possible for VNFs in such a DC to +be deployed across openstack clouds. +..[MT] Do we anticipate HA VNFs that require more than 100 VMs so that they need to +be deployed across DCs? Or is the goal to provide higher availability by deploying +across DCs? +..[fq] Here I just try to explain what the multisite scenario means. I don't think HA should +be discussed in this scenario since as you said, we cannot have more than 100 VMs deployed +to be HA. + +The other typical usecase is geographic redundancy. GR deployment is to deal with more +catastrophic failures (flood, earthquake, propagating software faults, etc.) of a single site.
+In the Geographic redundancy usecase, VNFs are deployed in two sites, which are +geographically separated and are managed by separate VIMs. When such a catastrophic +failure happens, the VNFs at the failed site can fail over to the redundant one so as to +continue the service. +..[MT] I agree and this scenario is definitely not limited to HA VNFs. Thus there could +be different mechanisms for the state replication between the sites and from an HA +perspective in this case it is important that the replication mechanism does not degrade +the performance during normal operation. + +The multisite scenario is also captured by the Multisite project, in which specific +requirements of openstack are also proposed for different usecases. However, +the multisite project mainly focuses on the requirements of these multisite +usecases on openstack. HA requirements are not necessarily requirements +for the approaches discussed in multisite, while the HA project tries to +capture the HA requirements in these usecases. +https://gerrit.opnfv.org/gerrit/#/c/2123/ +https://gerrit.opnfv.org/gerrit/#/c/1438/. + + +An architecture of a stateful VNF with redundancy in the multisite scenario can be as +follows. Architectures for the other cases can be worked out accordingly. +https://wiki.opnfv.org/_detail/stateful_vnf_in_multisite_scenario.png?id=scenario_analysis_of_high_availability_in_nfv +..[MT] What is the relation of the VMs of a single site e.g. on the left hand side? +Do they collaborate? Do they protect each other? What makes the two VIMs independent +if they need to support that VNF and its VNFM? Could they be logically the same +VIM and wouldn't that be a better solution for the VNF? +..[fq] This is the kind of architecture captured from the multisite project's work. +One VM on the left site is acting as the active VNFC, and the other VM at the right +site is acting as the standby. I assume the two VIMs cooperate with each other +under the control of the orchestrator.
I am also wondering whether the two VMs controlled +by one VIM would be a better solution. But apparently that is not the scenario for +multisite, because they consider that multisite means you have multiple openstacks. + + +Listed below are the additional labor and extra requirements of multisite compared with +the basic usecases. + +1, specific network support for the active/standby or active/active VNFs across VIMs. + +In the multisite scenario, instances constructing the VNFs can be placed across VIMs. +This will introduce extra network support requirements. For example, heartbeat between +active/standby VMs placed across VIMs requires an overlay L2 network. The IP address used +for the VNF to connect with other VNFs should be able to float across VIMs as well. + +2, in the multisite scenario, a logical instance of the VNFM should be put on multiple +VIMs to manage the instances of VNFs placed across the VIMs. + +3, in the VM failure scenarios, recovery of a failed VM requires an interface between +the VNFM and the VIM. In the multisite scenario, the VNFM should have knowledge of +which VIM it should communicate with so as to recover the failed VNF.
\ No newline at end of file -- cgit 1.2.3-korg From 197313b46e5e1b21f71ff5e264f43a284ac20dbe Mon Sep 17 00:00:00 2001 From: fuqiao Date: Wed, 25 Nov 2015 10:11:59 +0800 Subject: Scenario analysis doc - general issues for VNF HA Scenario Analysis doc - general issues for VNF HA JIRA:HA 15 Change-Id: I8dff0d1120ac4f5046f678667204cb6bc80d761e --- Scenario/scenario_analysis_multi_site.rst | 70 ---------------- .../scenario_analysis_VNF_external_interface.rst | 97 ++++++++++++++++++++++ 2 files changed, 97 insertions(+), 70 deletions(-) delete mode 100644 Scenario/scenario_analysis_multi_site.rst create mode 100644 Scenario_1/scenario_analysis_VNF_external_interface.rst diff --git a/Scenario/scenario_analysis_multi_site.rst b/Scenario/scenario_analysis_multi_site.rst deleted file mode 100644 index 016fe58..0000000 --- a/Scenario/scenario_analysis_multi_site.rst +++ /dev/null @@ -1,70 +0,0 @@ -5, Multisite Scenario -==================================================== - -The Multisite scenario refers to the cases when VNFs are deployed on multiple VIMs. -There could be two typical usecases for such scenario. - -One is in one DC, multiple openstack cloud are deployed. Taking consideration that the -number of compute nodes in one openstack cloud are quite limited (nearly 100) for -both opensource and commercial product of openstack, multiple openstack cloud will -have to be deployed in the DC to manage thousands of servers. VNFs in such DC should -be possible to be deployed accross openstack cloud. -..[MT] Do we anticipate HA VNFs that require more than 100 VMs so that they need to -be deployed across DCs? Or the goal is to provide higher availability by deploying -across DCs? -..[fq] Here I just try to explain what multisite scenario means. I don't think HA should -be discussed in this scenario since as you said, we can not have 100 more VMs deployed -to be HA. - -The other typical usecase is geographic redundancy. 
GR deployment is to deal with more -catastrophic failures (flood, earthquake, propagating software fault, and etc.) for one site. -In the Geographic redundancy usecase, VNFs are deployed in two sites, which are -geographically seperated and are managed by seperate VIM. When such a catastrophic -failure happens, the VNFs at the failed site can failover to the redundant one so as to -proceed the service. -..[MT] I agree and this scenario is definitely not limited to HA VNFs. Thus there could -be different mechanisms for the state replication between the sites and from an HA -perspective in this case it is important that the replication mechanism does not degrade -the performance at normal behaviour. - -The multisite scenario is also captured by the Multisite project, in which specific -requirements of openstack are also proposed for different usecases. However, -the multisite project mainly focuses on the requirement of these multisite -usecases on openstack. HA requirements are not necessarily the requirement -for the approaches discussed in multisite. While the HA project tries to -capture the HA requirements in these usecases. -https://gerrit.opnfv.org/gerrit/#/c/2123/ -https://gerrit.opnfv.org/gerrit/#/c/1438/. - - -An architecure of stateful VNF with redundancy in the multisite scenario can be as -follows. Architecture for the other cases can be worked out accordingly. -https://wiki.opnfv.org/_detail/stateful_vnf_in_multisite_scenario.png?id=scenario_analysis_of_high_availability_in_nfv -..[MT] What is the relation of the VMs of a single site e.g. on the left hand side? -Do they collaborate? Do they protect each other? What makes the two VIMs independent -if they need to support that VNF and its VNFM? Could they be logically the same -VIM and wouldn't that be a better solution for the VNF? -..[fq] This is kind of architecture captureed from the multisite project's work. 
-One VM on the left site is acting as the active VNFC, and the other VM at the right -site is acting as the standby. I assume the two VIM are cooperate with each other -under the control of the orchestrator. I am also thinking that if the two VMs contrled -by one VIM would be a better solution. But apparently that is not the scenario for -multisite, cause they are thinking multisite means you have multi openstack. - - -Below listed the additinal labor and extra requirements of multisite comparing with -the basic usecases. - -1, specific network support for the active/standby or active/active VNFs across VIM. - -In the multisite scenario, instances constructing the VNFs can be placed across VIM. -This will introduce extra network support requirement. For example, heartbeat between -active/standby VMs placed across VIM requires overlay L2 network. The IP address used -for VNF to connect with other VNFs should be able to be floating across VIM as well. - -2, in the multisite scenario, a logical instance of VNFM should be put on multiple -VIM to manage the instances of VNFs placed across the VIM. - -3, in the VM failure scenarios, recovery of failed VM requires interface between -VNFM and the VIM. In the multisite scenario, the VNFM should have knowledge of -which VIM it should communicate with so as to recover the failed VNF. \ No newline at end of file diff --git a/Scenario_1/scenario_analysis_VNF_external_interface.rst b/Scenario_1/scenario_analysis_VNF_external_interface.rst new file mode 100644 index 0000000..7667993 --- /dev/null +++ b/Scenario_1/scenario_analysis_VNF_external_interface.rst @@ -0,0 +1,97 @@ +2. Discussion for the General Issues for VNF HA schemes +=========================================================== + +This section is intended to talk about some general issues in the VNF HA schemes. +In sections 1, the usecases of both stateful and stateless VNFs are discussed. 
While in this section, we would like to discuss some specific issues +which are quite general for all the usecases proposed in the previous sections. + +1.1. VNF External Interfaces + +Regardless of whether the VNF is stateful or stateless, all the VNFCs should act as +a union from the perspective of the outside world. That means all the VNFCs should +share a common interface through which the outside modules (e.g., the other VNFs) can +access the service. There could be multiple solutions for this sharing of the IP +interface. However, all of this sharing and switching of the IP address should be +transparent to the outside modules. + +There are several approaches for the VNFs to share the interfaces. A few of them +are listed as follows and will be discussed in detail. + +1) IP address of active/stand-by VM. + +2) Load balancers for active/active use cases + +Note that a combination of these two approaches is also feasible. + +For active/standby VNFCs, the HA manager will manage a common IP address +to the active and standby VMs, so that they look as one instance from outside. +(The HA manager may not be aware of this, i.e. the address may be configured +and the active/standby state management is linked to the possession of the IP +address, i.e. the active VNFC claims it as part of becoming active.) Only the +active one possesses the IP address. And when failover happens, the standby +is set to be active and can take possession of the IP address to continue traffic +processing. + +..[MT] In general I would rather say that the IP address is managed by the HA +manager and not provided. But as a concrete use case "provide" works fine. +So it depends how you want to use this text. +..[fq] Agree, Thank you! + +For active/active VNFCs, LB(Load Balancer) could be used. In such scenario, there +could be two cases for the deployment and usage of the LB. + +Case 1: LB used before a cluster of VNFCs to distribute traffic flow.
In such a case, the LB is deployed in front of a cluster of multiple VNFCs. Such a +cluster can be managed by a separate cluster manager, or can be managed just +by the LB, which uses heartbeat to monitor each VNFC. When one of the VNFCs fails, +the cluster manager should recover the failed one, and should also exclude the +failed VNFC from the cluster so that the LB will re-route the traffic +to the other VNFCs. In the case when the LB is acting as the cluster manager, it is +the LB's responsibility to inform the VNFM to recover the failed VNFC if possible. + + +Case 2: LB used before a cluster of VMs to distribute traffic flow. + +In this case, there exists a cluster manager (e.g., Pacemaker) to monitor and manage +the VMs in the cluster. The LB sits in front of the VM cluster so as to distribute +the traffic. When one of the VMs fails, the cluster manager will detect that and will +be in charge of the recovery. The cluster manager will also exclude the failed VM +from the cluster, so that the LB won't re-route traffic to the failed one. + +In both cases, the HA of the LB should also be considered. + +..[MT] I think this use case also needs to show how the LB learns about the new VNFC. +Also we should distinguish VNFC and VM failures as a VNFC failure wouldn't be detected +in the NFVI e.g. LB, so we need a resolution, an applicability comment at least. +..[fq] I think I have made a mistake here by saying the VNFC. Actually if the failure +only happens in the VNFC, the VNFC should reboot itself rather than have a new VNFC taking +its place. So in this case, I think I should modify VNFC into VMs. And as you mentioned, +the NFVI level can hardly detect a VNFC level failure. + +..[MT] There could also be a combined case for the N+M redundancy, when there are N +actives but also M standbys at the VNF level. +..[fq] It could be. But I actually haven't seen such a deployed case. So I am not sure +if I can describe the schemes correctly:) + +1.2.
Intra-VNF Communication + +For stateful VNFs, data synchronization is necessary between the active and standby VMs. +The HA manager is responsible for handling VNFC failover, and for the assignment of the +active/standby states between the VNFCs of the VNF. Data synchronization can be handled +either by the HA manager or by the VNFC itself. + +The state synchronization can happen as: + +- direct communication between the active and the standby VNFCs + +- based on the information received from the HA manager on channel or messages using a common queue, + +..[MT] I don't understand the yellow inserted text +..[fq] Neither do I, actually. I think it is added by someone else and I can't make +out what it means either:) + +- it could be through a shared storage assigned to the whole VNF + +- through in-memory database (checkpointing), when the database (checkpoint service) takes care of the data replication. -- cgit 1.2.3-korg From eeafc6f9a240099d4c190c47b637e34296e5770f Mon Sep 17 00:00:00 2001 From: fuqiao Date: Mon, 21 Dec 2015 16:45:01 +0800 Subject: Scenario Analysis doc - multisite scenario Scenario Analysis doc - multisite Scenario JIRA:HA-18 Change-Id: I74b51e91fcfed3a689945e8a86f6f5648aac00ba --- Scenario_2/scenario_analysis_multi_site.rst | 51 +++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) create mode 100644 Scenario_2/scenario_analysis_multi_site.rst diff --git a/Scenario_2/scenario_analysis_multi_site.rst b/Scenario_2/scenario_analysis_multi_site.rst new file mode 100644 index 0000000..b9df8d0 --- /dev/null +++ b/Scenario_2/scenario_analysis_multi_site.rst @@ -0,0 +1,51 @@ +5, Multisite Scenario +==================================================== + +The Multisite scenario refers to the cases in which VNFs are deployed on multiple VIMs. +There are three typical usecases for such a scenario. + +One is that in one DC, multiple openstack clouds are deployed.
Taking into consideration that the +number of compute nodes in one openstack cloud is quite limited (nearly 100) for +both opensource and commercial products of openstack, multiple openstack clouds will +have to be deployed in the DC to manage thousands of servers. In such a DC, it should +be possible to deploy VNFs across openstack clouds. +..(MT) Do we anticipate HA VNFs that require more than 100 VMs so that they need to +be deployed across DCs? Or is the goal to provide higher availability by deploying +across DCs? +..(fq) Here I just try to explain what the multisite scenario means. I don't think HA should +be discussed in this scenario since as you said, we cannot have more than 100 VMs deployed +to be HA. + +Another typical usecase is Geographic Redundancy (GR). GR deployment is to deal with more +catastrophic failures (flood, earthquake, propagating software faults, etc.) of a single site. +In the Geographic redundancy usecase, VNFs are deployed in two sites, which are +geographically separated and are deployed on NFVI managed by separate VIMs. When +such a catastrophic failure happens, the VNFs at the failed site can fail over to +the redundant one so as to continue the service. Different VNFs may have specific +requirements for such failover. Some VNFs may need stateful failover, while others +may just need their VMs restarted on the redundant site in their initial state. +The former would create the overhead of state replication. The latter may still +have state replication through the storage. Accordingly for storage we don't want +to lose any data, and for networking the NFs should be connected the same way as +they were in the original site. We probably also want to have the same number of +VMs on the redundant site coming up for the VNFs. +..(MT) I agree and this scenario is definitely not limited to HA VNFs.
Thus there could +be different mechanisms for the state replication between the sites and from an HA +perspective in this case it is important that the replication mechanism does not degrade +the performance during normal operation. + +The other usecase is maintenance. When one site is planning for maintenance, +it should first replicate the service to another site before it stops the service. Such +replication should not disturb the service, nor should it cause any data loss. In +such a case, the multisite schemes may be used. + +The multisite scenario is also captured by the Multisite project, in which specific +requirements of openstack are also proposed for different usecases. However, +the multisite project mainly focuses on the requirements of these multisite +usecases on openstack. HA requirements are not necessarily requirements +for the approaches discussed in multisite, while the HA project tries to +capture the HA requirements in these usecases. +https://gerrit.opnfv.org/gerrit/#/c/2123/ +https://gerrit.opnfv.org/gerrit/#/c/1438/. + + -- cgit 1.2.3-korg From 092032680a564291e627239d464d4ecf45a8fb00 Mon Sep 17 00:00:00 2001 From: fuqiao Date: Mon, 21 Dec 2015 16:57:39 +0800 Subject: Scenario analysis doc - general issues for VNF HA Scenario Analysis doc - general issues for VNF HA JIRA:HA 15 Change-Id: Iba2a6e467005b3bf332a04444564e69b92bc23dc --- .../scenario_analysis_VNF_external_interface.rst | 36 ++++++++++++---------- 1 file changed, 19 insertions(+), 17 deletions(-) diff --git a/Scenario_1/scenario_analysis_VNF_external_interface.rst b/Scenario_1/scenario_analysis_VNF_external_interface.rst index 7667993..c634c20 100644 --- a/Scenario_1/scenario_analysis_VNF_external_interface.rst +++ b/Scenario_1/scenario_analysis_VNF_external_interface.rst @@ -1,12 +1,13 @@ -2. Discussion for the General Issues for VNF HA schemes +3.
Communication Interfaces for VNF HA schemes =========================================================== -This section is intended to talk about some general issues in the VNF HA schemes. -In sections 1, the usecases of both stateful and stateless VNFs are discussed. -While in this section, we would like to discuss some specific issues -which are quite general for all the usecases proposed in the previous sections. +This section will discuss some general issues about communication interfaces +in the VNF HA schemes. In section 2, the usecases of both stateful and +stateless VNFs are discussed. In this section, we would like to discuss +some specific issues which are quite general for all the usecases proposed +in the previous sections. -1.1. VNF External Interfacece +3.1. VNF External Interfaces Regardless of whether the VNF is stateful or stateless, all the VNFCs should act as a union from the perspective of the outside world. That means all the VNFCs should share a common interface through which the outside modules (e.g., the other VNFs) can access the service. There could be multiple solutions for this sharing of the IP interface. However, all of this sharing and switching of the IP address should be transparent to the outside modules. There are several approaches for the VNFs to share the interfaces. A few of them are listed as follows and will be discussed in detail. -1) IP address of active/stand-by VM. +1) IP address of VMs for the active/stand-by case. 2) Load balancers for active/active use cases Note that a combination of these two approaches is also feasible. -For active/standby VNFCs, the HA manager will manage a common IP address -to the active and standby VMs, so that they look as one instance from outside. +For active/standby VNFCs, there is a common IP address shared by the VMs hosting +the active and standby VNFCs, so that they look like one instance from outside. +The HA manager will manage the assignment of the IP address to the VMs. (The HA manager may not be aware of this, i.e. the address may be configured and the active/standby state management is linked to the possession of the IP address, i.e. the active VNFC claims it as part of becoming active.) Only the
Only the @@ -38,27 +40,27 @@ manager and not provided. But as a concrete use case "provide" works fine. So it depends how you want to use this text. ..[fq] Agree, Thank you! -For active/active VNFCs, LB(Load Balancer) could be used. In such scenario, there +For active/active VNFCs, a LB(Load Balancer) could be used. In such scenario, there could be two cases for the deployment and usage of LB. -Case 1: LB used before a cluster of VNFCs to distribute traffic flow. +Case 1: LB used in front of a cluster of VNFCs to distribute the traffic flow. In such case, the LB is deployed in front of a cluster of multiple VNFCs. Such -cluster can be managered by a seperate cluster manager, or can be managed just -by the LB, which is using heartbeat to monitor each VNFC. When one of VNFCs fails, +cluster can be managed by a seperate cluster manager, or can be managed just +by the LB, which uses heartbeat to monitor each VNFC. When one of VNFCs fails, the cluster manager should recover the failed one, and should also exclude the failed VNFC from the cluster so that the LB will re-route the traffic to to the other VNFCs. In the case when the LB is acting as the cluster manager, it is the LB's responsibility to inform the VNFM to recover the failed VNFC if possible. -Case 2: LB used before a cluster of VMs to distribute traffic flow. +Case 2: LB used in front of a cluster of VMs to distribute traffic flow. In this case, there exists a cluster manager(e.g. Pacemaker) to monitor and manage the VMs in the cluster. The LB sits in front of the VM cluster so as to distribute -the traffic. When one of the VM fails, the cluster manager will ditect that and will +the traffic. When one of the VM fails, the cluster manager will detect that and will be in charge of the recovery. The cluster manager will also exclude the failed VM -out of the cluster, so that the LB won't re-route traffic to the failed one. +out of the cluster, so that the LB won't route traffic to the failed one. 
In both cases, the HA of the LB should also be considered. @@ -75,7 +77,7 @@ actives but also M standbys at the VNF level. ..[fq] It could be. But I actually haven't seen such a deployed case. So I am not sure if I can describe the schemes correctly:) -1.2. Intra-VNF Communication +3.2. Intra-VNF Communication For stateful VNFs, data synchronization is necessary between the active and standby VMs. The HA manager is responsible for handling VNFC failover, and for the assignment of the -- cgit 1.2.3-korg From 3ae95a32c1ad82473e1658dd9753cae15b634d5a Mon Sep 17 00:00:00 2001 From: fuqiao Date: Mon, 18 Jan 2016 16:27:52 +0800 Subject: Scenario Analysis doc - multisite scenario Scenario Analysis doc - multisite Scenario JIRA:HA-18 : Change-Id: I3df017ec31325afab8dfde7d56bbb013d460acbb --- Scenario_2/scenario_analysis_multi_site.rst | 22 ++++++++-------------- 1 file changed, 8 insertions(+), 14 deletions(-) diff --git a/Scenario_2/scenario_analysis_multi_site.rst b/Scenario_2/scenario_analysis_multi_site.rst index b9df8d0..2e43471 100644 --- a/Scenario_2/scenario_analysis_multi_site.rst +++ b/Scenario_2/scenario_analysis_multi_site.rst @@ -1,4 +1,4 @@ -5, Multisite Scenario +6, Multisite Scenario ==================================================== The Multisite scenario refers to the cases in which VNFs are deployed on multiple VIMs. @@ -9,12 +9,7 @@ number of compute nodes in one openstack cloud are quite limited (nearly 100) fo both opensource and commercial products of openstack, multiple openstack clouds will have to be deployed in the DC to manage thousands of servers. In such a DC, it should be possible to deploy VNFs across openstack clouds. -..(MT) Do we anticipate HA VNFs that require more than 100 VMs so that they need to -be deployed across DCs? Or the goal is to provide higher availability by deploying -across DCs? -..(fq) Here I just try to explain what multisite scenario means.
I don't think HA should -be discussed in this scenario since as you said, we can not have 100 more VMs deployed -to be HA. + Another typical usecase is Geographic Redundancy (GR). GR deployment is to deal with more catastrophic failures (flood, earthquake, propagating software faults, etc.) of a single site. @@ -29,22 +24,21 @@ have state replication through the storage. Accordingly for storage we don't wan to lose any data, and for networking the NFs should be connected the same way as they were in the original site. We probably also want to have the same number of VMs on the redundant site coming up for the VNFs. -..(MT) I agree and this scenario is definitely not limited to HA VNFs. Thus there could -be different mechanisms for the state replication between the sites and from an HA -perspective in this case it is important that the replication mechanism does not degrade -the performance at normal behaviour. + The other usecase is maintenance. When one site is planning for maintenance, it should first replicate the service to another site before it stops the service. Such -replication should not disturb the service, nor should it cause any data loss. In -such case, the multisite schemes may be used. +replication should not disturb the service, nor should it cause any data loss. The +service at the second site should be executing before the first site is stopped and +begins maintenance. In such a case, the multisite schemes may be used. The multisite scenario is also captured by the Multisite project, in which specific requirements of openstack are also proposed for different usecases. However, the multisite project mainly focuses on the requirements of these multisite usecases on openstack. HA requirements are not necessarily requirements for the approaches discussed in multisite, while the HA project tries to -capture the HA requirements in these usecases. +capture the HA requirements in these usecases.
The following links are the scenarios +and usecases discussed in the Multisite project. https://gerrit.opnfv.org/gerrit/#/c/2123/ https://gerrit.opnfv.org/gerrit/#/c/1438/. -- cgit 1.2.3-korg From b9ac620a89e6b3e9cffa077535353e53807741ed Mon Sep 17 00:00:00 2001 From: fuqiao Date: Fri, 22 Jan 2016 14:43:29 +0800 Subject: Scenario Analysis Doc - Section 3 - Communication Interfaces HA scheme Section 3 - communication interfaces of VNF HA schemes JIRA: HA 15 Change-Id: I1fd9a83fd3fdd985cf6d23685502c6a5df3877ed --- .../Scenario_Analysis_Communication_Interfaces.rst | 80 ++++++++++++++++++++++ 1 file changed, 80 insertions(+) create mode 100644 Scenario_1/Scenario_Analysis_Communication_Interfaces.rst diff --git a/Scenario_1/Scenario_Analysis_Communication_Interfaces.rst b/Scenario_1/Scenario_Analysis_Communication_Interfaces.rst new file mode 100644 index 0000000..c97776b --- /dev/null +++ b/Scenario_1/Scenario_Analysis_Communication_Interfaces.rst @@ -0,0 +1,80 @@ +3. Communication Interfaces for VNF HA schemes +=========================================================== + +This section will discuss some general issues about communication interfaces +in the VNF HA schemes. In section 2, the usecases of both stateful and +stateless VNFs are discussed. In this section, we would like to discuss +some specific issues which are quite general for all the usecases proposed +in the previous sections. + +3.1. VNF External Interfaces + +Regardless of whether the VNF is stateful or stateless, all the VNFCs should act as +a union from the perspective of the outside world. That means all the VNFCs should +share a common interface through which the outside modules (e.g., the other VNFs) can +access the service. There could be multiple solutions for this sharing of the IP +interface. However, all of this sharing and switching of the IP address should be +transparent to the outside modules. + +There are several approaches for the VNFs to share the interfaces.
A few of them +are listed as follows and will be discussed in detail. + +1) IP address of VMs for the active/stand-by case. + +2) Load balancers for active/active use cases + +Note that a combination of these two approaches is also feasible. + +For active/standby VNFCs, there is a common IP address shared by the VMs hosting +the active and standby VNFCs, so that they look like one instance from outside. +The HA manager will manage the assignment of the IP address to the VMs. +(The HA manager may not be aware of this, i.e. the address may be configured +and the active/standby state management is linked to the possession of the IP +address, i.e. the active VNFC claims it as part of becoming active.) Only the +active one possesses the IP address. And when failover happens, the standby +is set to be active and can take possession of the IP address to continue traffic +processing. + + +For active/active VNFCs, an LB (Load Balancer) could be used. In such a scenario, there +could be two cases for the deployment and usage of the LB. + +Case 1: LB used in front of a cluster of VNFCs to distribute the traffic flow. + +In such a case, the LB is deployed in front of a cluster of multiple VNFCs. Such a +cluster can be managed by a separate cluster manager, or can be managed just +by the LB, which uses heartbeat to monitor each VNFC. When one of the VNFCs fails, +the cluster manager should first exclude the failed VNFC from the cluster so that +the LB will re-route the traffic to the other VNFCs, and then the failed one should +be recovered. In the case when the LB is acting as the cluster manager, it is +the LB's responsibility to inform the VNFM to recover the failed VNFC if possible. + + +Case 2: LB used in front of a cluster of VMs to distribute traffic flow. + +In this case, there exists a cluster manager (e.g., Pacemaker) to monitor and manage +the VMs in the cluster. The LB sits in front of the VM cluster so as to distribute +the traffic.
When one of the VMs fails, the cluster manager will detect that and will +be in charge of the recovery. The cluster manager will also exclude the failed VM +from the cluster, so that the LB won't route traffic to the failed one. + +In both cases, the HA of the LB should also be considered. + + +3.2. Intra-VNF Communication + +For stateful VNFs, data synchronization is necessary between the active and standby VMs. +The HA manager is responsible for handling VNFC failover, and for the assignment of the +active/standby states between the VNFCs of the VNF. Data synchronization can be handled +either by the HA manager or by the VNFC itself. + +The state synchronization can happen as: + +- direct communication between the active and the standby VNFCs + +- based on information received from the HA manager, over a channel or via messages using a common queue + +- through a shared storage assigned to the whole VNF + +- through the checkpointing of state information via underlying memory and/or +database checkpointing services to a separate VM and storage repository. -- cgit 1.2.3-korg
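The active/standby shared-IP scheme described in section 3.1 above can be sketched as a small, self-contained model. This is only an illustration of the control logic, assuming a hypothetical HA manager that counts missed heartbeats and reassigns the shared IP address; the names (Vnfc, HAManager, heartbeat_tick) are invented for this sketch and do not correspond to any real VNFM, OpenStack, or Pacemaker API.

```python
class Vnfc:
    """Illustrative VNFC instance; names are invented for this sketch."""
    def __init__(self, name):
        self.name = name
        self.state = "standby"   # "active", "standby", or "failed"
        self.alive = True        # whether heartbeats are still arriving

class HAManager:
    """Assigns one shared service IP to whichever VNFC is active."""
    def __init__(self, shared_ip, vnfcs, miss_limit=3):
        self.shared_ip = shared_ip
        self.vnfcs = vnfcs
        self.miss_limit = miss_limit           # heartbeats missed before failover
        self.missed = {v.name: 0 for v in vnfcs}
        vnfcs[0].state = "active"              # first instance starts as active
        self.ip_owner = vnfcs[0]               # only the active one holds the IP

    def heartbeat_tick(self):
        """One monitoring round: update miss counters, fail over if needed."""
        for v in self.vnfcs:
            self.missed[v.name] = 0 if v.alive else self.missed[v.name] + 1
        if self.missed[self.ip_owner.name] >= self.miss_limit:
            standby = next(v for v in self.vnfcs
                           if v is not self.ip_owner and v.alive)
            self.ip_owner.state = "failed"
            standby.state = "active"           # standby claims the shared IP
            self.ip_owner = standby            # outside world sees the same IP

# Simulate a failover: the active VNFC stops heartbeating.
a, b = Vnfc("vnfc-a"), Vnfc("vnfc-b")
mgr = HAManager("192.0.2.10", [a, b])
a.alive = False                                # active instance fails
for _ in range(3):                             # three missed heartbeats
    mgr.heartbeat_tick()
print(mgr.ip_owner.name, mgr.ip_owner.state)   # prints: vnfc-b active
```

In a real deployment, taking possession of the IP would be realized by e.g. a gratuitous ARP or a floating-IP reassignment rather than a variable update; the sketch only captures the monitor-and-claim logic that keeps the address switch invisible to the outside modules.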