diff options
author | joehuang <joehuang@huawei.com> | 2016-09-20 05:20:31 -0400 |
---|---|---|
committer | joehuang <joehuang@huawei.com> | 2016-09-20 05:27:44 -0400 |
commit | 3f7394289faa2945e8f36d008cfb7dacf06279d6 (patch) | |
tree | de4c58563b7db1ebb33122ae1b4367b640e801eb /docs/userguide/multisite.admin.usage.rst | |
parent | b50a7a8c4d2f4a1e63b7dce077f9d3f726661a15 (diff) |
Cherry Pick the update from the master branch to stable/colorado branch
These patches should be cherry picked to colorado branch:
https://gerrit.opnfv.org/gerrit/#/c/20077/
https://gerrit.opnfv.org/gerrit/#/c/20635/
https://gerrit.opnfv.org/gerrit/#/c/19833/
https://gerrit.opnfv.org/gerrit/#/c/19587/
But because the structure of the folder was changed in
https://gerrit.opnfv.org/gerrit/#/c/19833/
cherry pick using gerrit will lead to merge conflict,
has to update the colorado branch through this patch.
Change-Id: I971eb28520b47b14de52a761269518b959c6921c
Signed-off-by: joehuang <joehuang@huawei.com>
Diffstat (limited to 'docs/userguide/multisite.admin.usage.rst')
-rw-r--r-- | docs/userguide/multisite.admin.usage.rst | 390 |
1 files changed, 390 insertions, 0 deletions
diff --git a/docs/userguide/multisite.admin.usage.rst b/docs/userguide/multisite.admin.usage.rst new file mode 100644 index 0000000..41f23c0 --- /dev/null +++ b/docs/userguide/multisite.admin.usage.rst @@ -0,0 +1,390 @@ +.. This work is licensed under a Creative Commons Attribution 4.0 International License. +.. http://creativecommons.org/licenses/by/4.0 + +========================== +Multisite admin user guide +========================== + +Multisite identity service management +===================================== + +Goal +---- + +A user should, using a single authentication point be able to manage virtual +resources spread over multiple OpenStack regions. + +Token Format +------------ + +There are 3 types of token format supported by OpenStack KeyStone + + * **UUID** + * **PKI/PKIZ** + * **FERNET** + +It's very important to understand these token format before we begin the +mutltisite identity service management. Please refer to the OpenStack +official site for the identity management. +http://docs.openstack.org/admin-guide-cloud/identity_management.html + +Key consideration in multisite scenario +--------------------------------------- + +A user is provided with a single authentication URL to the Identity (Keystone) +service. Using that URL, the user authenticates with Keystone by +requesting a token typically using username/password credentials. Keystone +server validates the credentials, possibly with an external LDAP/AD server and +returns a token to the user. The user sends a request to a service in a +selected region including the token. Now the service in the region, say Nova +needs to validate the token. The service uses its configured keystone endpoint +and service credentials to request token validation from Keystone. After the +token is validated by KeyStone, the user is authorized to use the service. + +The key considerations for token validation in multisite scenario are: + * Site level failure: impact on authN and authZ shoulde be as minimal as + possible + * Scalable: as more and more sites added, no bottleneck in token validation + * Amount of inter region traffic: should be kept as little as possible + +Hence, Keystone token validation should preferably be done in the same +region as the service itself. + +The challenge to distribute KeyStone service into each region is the KeyStone +backend. Different token format has different data persisted in the backend. + +* UUID: UUID tokens have a fixed size. Tokens are persistently stored and + create a lot of database traffic, the persistence of token is for the revoke + purpose. UUID tokens are validated online by Keystone, call to service will + request keystone for token validation. Keystone can become a + bottleneck in a large system. Due to this, UUID token type is not suitable + for use in multi region clouds, no matter the Keystone database + replicates or not. + +* PKI: Tokens are non persistent cryptographic based tokens and validated + offline (not by the Keystone service) by Keystone middleware which is part + of other services such as Nova. Since PKI tokens include endpoint for all + services in all regions, the token size can become big. There are + several ways to reduce the token size such as no catalog policy, endpoint + filter to make a project binding with limited endpoints, and compressed PKI + token - PKIZ, but the size of token is still unpredictable, making it difficult + to manage. If catalog is not applied, that means the user can access all + regions, in some scenario, it's not allowed to do like this. Centralized + Keystone with PKI token to reduce inter region backend synchronization traffic. + PKI tokens do produce Keystone traffic for revocation lists. + +* Fernet: Tokens are non persistent cryptographic based tokens and validated + online by the Keystone service. Fernet tokens are more lightweight + than PKI tokens and have a fixed size. Fernet tokens require Keystone + deployed in a distributed manner, again to avoid inter region traffic. The + data synchronization cost for the Keystone backend is smaller due to the non- + persisted token. + +Cryptographic tokens bring new (compared to UUID tokens) issues/use-cases +like key rotation, certificate revocation. Key management is out of scope for +this use case. + +Database deployment as the backend for KeyStone service +------------------------------------------------------ + +Database replication: + - Master/slave asynchronous: supported by the database server itself + (mysql/mariadb etc), works over WAN, it's more scalable. But only master will + provide write functionality, domain/project/role provisioning. + - Multi master synchronous: Galera(others like percona), not so scalable, + for multi-master writing, and need more parameter tunning for WAN latency.It + can provide the capability for limited multi-sites multi-write + function for distributed KeyStone service. + - Symmetrical/asymmetrical: data replicated to all regions or a subset, + in the latter case it means some regions needs to access Keystone in another + region. + +Database server sharing: +In an OpenStack controller, normally many databases from different +services are provided from the same database server instance. For HA reasons, +the database server is usually synchronously replicated to a few other nodes +(controllers) to form a cluster. Note that _all_ database are replicated in +this case, for example when Galera sync repl is used. + +Only the Keystone database can be replicated to other sites. Replicating +databases for other services will cause those services to get of out sync and +malfunction. + +Since only the Keystone database is to be sync or replicated to another +region/site, it's better to deploy Keystone database into its own +database server with extra networking requirement, cluster or replication +configuration. How to support this by installer is out of scope. + +The database server can be shared when async master/slave replication is +used, if global transaction identifiers GTID is enabled. + +Deployment options +------------------ + +**Distributed KeyStone service with PKI token** + +Deploy KeyStone service in two sites with database replication. If site +level failure impact is not considered, then KeyStone service can only be +deployed into one site. + +The PKI token has one great advantage is that the token validation can be +done locally, without sending token validation request to KeyStone server. +The drawback of PKI token is +the endpoint list size in the token. If a project will be only spread in +very limited site number(region number), then we can use the endpoint +filter to reduce the token size, make it workable even a lot of sites +in the cloud. +KeyStone middleware(which is co-located in the service like +Nova-API/xxx-API) will have to send the request to the KeyStone server +frequently for the revoke-list, in order to reject some malicious API +request, for example, a user has to be deactivated, but use an old token +to access OpenStack service. + +For this option, needs to leverage database replication to provide +KeyStone Active-Active mode across sites to reduce the impact of site failure. +And the revoke-list request is very frequently asked, so the performance of the +KeyStone server needs also to be taken care. + +Site level keystone load balance is required to provide site level +redundancy, otherwise the KeyStone middleware will not switch request to the +healthy KeyStone server in time. + +And also the cert distribution/revoke to each site / API server for token +validation is required. + +This option can be used for some scenario where there are very limited +sites, especially if each project only spreads into limited sites ( regions ). + +**Distributed KeyStone service with Fernet token** + +Fernet token is a very new format, and just introduced recently,the biggest +gain for this token format is :1) lightweight, size is small to be carried in +the API request, not like PKI token( as the sites increased, the endpoint-list +will grows and the token size is too long to carry in the API request) 2) no +token persistence, this also make the DB not changed too much and with light +weight data size (just project, Role, domain, endpoint etc). The drawback for +the Fernet token is that token has to be validated by KeyStone for each API +request. + +This makes that the DB of KeyStone can work as a cluster in multisite (for +example, using MySQL galera cluster). That means install KeyStone API server in +each site, but share the same the backend DB cluster.Because the DB cluster +will synchronize data in real time to multisite, all KeyStone server can see +the same data. + +Because each site with KeyStone installed, and all data kept same, +therefore all token validation could be done locally in the same site. + +The challenge for this solution is how many sites the DB cluster can +support. Question is aksed to MySQL galera developers, their answer is that no +number/distance/network latency limitation in the code. But in the practice, +they have seen a case to use MySQL cluster in 5 data centers, each data centers +with 3 nodes. + +This solution will be very good for limited sites which the DB cluster can +cover very well. + +**Distributed KeyStone service with Fernet token + Async replication (star-mode)** + +One master KeyStone cluster with Fernet token in two sites (for site level +high availability purpose), other sites will be installed with at least 2 slave +nodes where the node is configured with DB async replication from the master +cluster members, and one slave’s mater node in site1, another slave’s master +node in site 2. + +Only the master cluster nodes are allowed to write, other slave nodes +waiting for replication from the master cluster member( very little delay). + +Pros: + * Deploy database cluster in the master sites is to provide more master + nodes, in order to provide more slaves could be done with async. replication + in parallel. Two sites for the master cluster is to provide higher + reliability (site level) for writing request, but reduce the maintaince + challenge at the same time by limiting the cluster spreading over too many + sites. + * Multi-slaves in other sites is because of the slave has no knowledge of + other slaves, so easy to manage multi-slaves in one site than a cluster, and + multi-slaves work independently but provide multi-instance redundancy(like a + cluster, but independent). + +Cons: + * Need to be aware of the chanllenge of key distribution and rotation + for Fernet token. + +Note: PKI token will be deprecated soon, so Fernet token is encouraged. + +Multisite VNF Geo site disaster recovery +======================================== + +Goal +---- + +A VNF (telecom application) should, be able to restore in another site for +catastrophic failures happened. + +Key consideration in multisite scenario +--------------------------------------- + +Geo site disaster recovery is to deal with more catastrophic failures +(flood, earthquake, propagating software fault), and that loss of calls, or +even temporary loss of service, is acceptable. It is also seems more common +to accept/expect manual / administrator intervene into drive the process, not +least because you don’t want to trigger the transfer by mistake. + +In terms of coordination/replication or backup/restore between geographic +sites, discussion often (but not always) seems to focus on limited application +level data/config replication, as opposed to replication backup/restore between +of cloud infrastructure between different sites. + +And finally, the lack of a requirement to do fast media transfer (without +resignalling) generally removes the need for special networking behavior, with +slower DNS-style redirection being acceptable. + +Here is more concerns about cloud infrastructure level capability to +support VNF geo site disaster recovery + +Option1, Consistency application backup +--------------------------------------- + +The disater recovery process will work like this: + +1) DR(Geo site disaster recovery )software get the volumes for each VM + in the VNF from Nova +2) DR software call Nova quiesce API to quarantee quiecing VMs in desired order +3) DR software takes snapshots of these volumes in Cinder (NOTE: Because + storage often provides fast snapshot, so the duration between quiece and + unquiece is a short interval) +4) DR software call Nova unquiece API to unquiece VMs of the VNF in reverse order +5) DR software create volumes from the snapshots just taken in Cinder +6) DR software create backup (incremental) for these volumes to remote + backup storage ( swift or ceph, or.. ) in Cinder +7) If this site failed, + 1) DR software restore these backup volumes in remote Cinder in the backup site. + 2) DR software boot VMs from bootable volumes from the remote Cinder in + the backup site and attach the regarding data volumes. + +Note: Quiesce/Unquiesce spec was approved in Mitaka, but code not get merged in +time, https://blueprints.launchpad.net/nova/+spec/expose-quiesce-unquiesce-api +The spec was rejected in Newton when it was reproposed: +https://review.openstack.org/#/c/295595/. So this option will not work any more. + +Option2, Vitrual Machine Snapshot +--------------------------------- +1) DR software create VM snapshot in Nova +2) Nova quiece the VM internally + (NOTE: The upper level application or DR software should take care of + avoiding infra level outage induced VNF outage) +3) Nova create image in Glance +4) Nova create a snapshot of the VM, including volumes +5) If the VM is volume backed VM, then create volume snapshot in Cinder +5) No image uploaded to glance, but add the snapshot in the meta data of the + image in Glance +6) DR software to get the snapshot information from the Glance +7) DR software create volumes from these snapshots +9) DR software create backup (incremental) for these volumes to backup storage + ( swift or ceph, or.. ) in Cinder +10) If this site failed, + 1) DR software restore these backup volumes to Cinder in the backup site. + 2) DR software boot vm from bootable volume from Cinder in the backup site + and attach the data volumes. + +This option only provides single VM level consistency disaster recovery. + +This feature is already available in current OPNFV release. + +Option3, Consistency volume replication +--------------------------------------- +1) DR software creates datastore (Block/Cinder, Object/Swift, App Custom + storage) with replication enabled at the relevant scope, for use to + selectively backup/replicate desire data to GR backup site +2) DR software get the reference of storage in the remote site storage +3) If primary site failed, + 1) DR software managing recovery in backup site gets references to relevant + storage and passes to new software instances + 2) Software attaches (or has attached) replicated storage, in the case of + volumes promoting to writable. + +Pros: + * Replication will be done in the storage level automatically, no need to + create backup regularly, for example, daily. + * Application selection of limited amount of data to replicate reduces + risk of replicating failed state and generates less overhear. + * Type of replication and model (active/backup, active/active, etc) can + be tailored to application needs + +Cons: + * Applications need to be designed with support in mind, including both + selection of data to be replicated and consideration of consistency + * "Standard" support in Openstack for Disaster Recovery currently fairly + limited, though active work in this area. + +Note: Volume replication v2.1 support project level replication. + + +VNF high availability across VIM +================================ + +Goal +---- + +A VNF (telecom application) should, be able to realize high availability +deloyment across OpenStack instances. + +Key consideration in multisite scenario +--------------------------------------- + +Most of telecom applications have already been designed as +Active-Standby/Active-Active/N-Way to achieve high availability +(99.999%, corresponds to 5.26 minutes of unplanned downtime in a year), +typically state replication or heart beat between +Active-Active/Active-Active/N-Way (directly or via replicated database +services, or via private designed message format) are required. + +We have to accept the currently limited availability ( 99.99%) of a +given OpenStack instance, and intend to provide the availability of the +telecom application by spreading its function across multiple OpenStack +instances.To help with this, many people appear willing to provide multiple +“independent” OpenStack instances in a single geographic site, with special +networking (L2/L3) between clouds in that physical site. + +The telecom application often has different networking plane for different +purpose: + +1) external network plane: using for communication with other telecom + application. + +2) components inter-communication plane: one VNF often consisted of several + components, this plane is designed for components inter-communication with + each other + +3) backup plane: this plane is used for the heart beat or state replication + between the component's active/standby or active/active or N-way cluster. + +4) management plane: this plane is mainly for the management purpose, like + configuration + +Generally these planes are separated with each other. And for legacy telecom +application, each internal plane will have its fixed or flexible IP addressing +plane. There are some interesting/hard requirements on the networking (L2/L3) +between OpenStack instances, at lease the backup plane across different +OpenStack instances: + +1) Overlay L2 networking is prefered as the backup plane for heartbeat or state + replication, the reason is: + + a) Support legacy compatibility: Some telecom app with built-in internal L2 + network, for easy to move these app to virtualized telecom application, it + would be better to provide L2 network. + + b) Support IP overlapping: multiple telecom applications may have + overlapping IP address for cross OpenStack instance networking. + Therefore over L2 networking across Neutron feature is required + in OpenStack. + +2) L3 networking cross OpenStack instance for heartbeat or state replication. + Can leverage FIP or vRouter inter-connected with overlay L2 network to + establish overlay L3 networking. + +Note: L2 border gateway spec was merged in L2GW project: +https://review.openstack.org/#/c/270786/. Code will be availabe in later +release. |