author Maria Toeroe <Maria.Toeroe@ericsson.com> 2015-10-22 10:04:31 -0400
committer Maria Toeroe <Maria.Toeroe@ericsson.com> 2015-10-22 10:04:31 -0400
commit cdc8a81fe94348282c5a4f975163f8c231976fa8
tree 258ce11a16769dc970b8443808b46df3641eebdf
parent cc3d1dbe9e06bb5986896a1286d30cd1d08f8e1e
Incorporate software dimensions and other comments
Add definitions of rollback, downgrade and restore
Add rollforward and apply other comments

JIRA: ESCALATOR-24

Change-Id: I4a576c8fe1a7751ee934693ed8f948617a5542a0
Signed-off-by: Maria Toeroe <Maria.Toeroe@ericsson.com>
-rw-r--r-- doc/02-Background_and_Terminologies.rst 267
1 files changed, 163 insertions, 104 deletions
diff --git a/doc/02-Background_and_Terminologies.rst b/doc/02-Background_and_Terminologies.rst
index afb392f..36a81f2 100644
--- a/doc/02-Background_and_Terminologies.rst
+++ b/doc/02-Background_and_Terminologies.rst
@@ -1,4 +1,4 @@
-General Requirements Background and Terminology
+General Requirements Background and Terminology
-----------------------------------------------
Terminologies and definitions
@@ -12,7 +12,7 @@ NFVI
VIM
The term is an abbreviation for Virtual Infrastructure Management;
sometimes it is also referred as control plane in this document.
-
+
Operator
The term refers to network service providers and Virtual Network
Function (VNF) providers.
@@ -25,12 +25,12 @@ Network Service
End-users using a set of (virtualized) Network Functions
Infrastructure Services
- The term refers to services provided by the NFV Infrastructure and the
- the Management & Orchestration functions to the VNFs. I.e.
+ The term refers to services provided by the NFV Infrastructure and
+ the Management & Orchestration functions to the VNFs. I.e.
these are the virtual resources as perceived by the VNFs.
Smooth Upgrade
- The term refers to an upgrade that results in no service outage
+ The term refers to an upgrade that results in no service outage
for the end-users.
Rolling Upgrade
@@ -38,7 +38,7 @@ Rolling Upgrade
a subset of nodes in a wave style rolling through the data centre. It
is a popular upgrade strategy to maintain service availability.
-Parallel Universe
+Parallel Universe Upgrade
The term refers to an upgrade strategy that creates and deploys
a new universe - a system with the new configuration - while the old
system continues running. The state of the old system is transferred
@@ -59,22 +59,57 @@ Virtual Resource
they are the resources on which VNF entities are deployed, e.g.
the VMs, virtual switches, virtual routers, virtual disks etc.
-.. <MT> I don't think the VNF is the virtual resource. Virtual
- resources are the VMs, virtual switches, virtual routers, virtual
- disks etc. The VNF uses them, but I don't think they are equal. The
- VIM doesn't manage the VNF, but it does manage virtual resources.
-
Visualization Facility
- The term refers to a resource that enables the creation
- of virtual environments on top of the physical resources, e.g.
- hypervisor, OpenStack, etc.
+ The term refers to a resource that enables the creation
+ of virtual environments on top of the physical resources, e.g.
+ hypervisor, OpenStack, etc.
+
+Upgrade Campaign
+ The term refers to a choreography that describes how the upgrade should
+ be performed in terms of its targets (i.e. upgrade objects), the
+ steps/actions required for upgrading each, and the coordination of these
+ steps so that service availability can be maintained. It is an input to an
+ upgrade tool (Escalator) to carry out the upgrade.
+
+Upgrade Duration
+ The duration of an upgrade is characterized by the time elapsed between its
+ initiation and its completion, e.g. from the moment the execution of an
+ upgrade campaign has started until it has been committed. Depending on
+ the upgrade method and its target, some parts of the system may be in a more
+ vulnerable state during this time.
+
+Outage
+ The period of time during which a given service is not provided is referred
+ to as the outage of that given service. If a subsystem or the entire system
+ does not provide any service, it is the outage of the given subsystem or the
+ system. Smooth upgrade means upgrade with no outage for the user plane, i.e.
+ no VNF should experience service outage.
+
+Rollback
+ The term refers to a failure handling strategy that reverts the changes
+ made by a potentially failed upgrade execution one by one in reverse order,
+ i.e. it undoes the changes made by the upgrade.
+
+Restore
+ The term refers to a failure handling strategy that reverts the changes
+ made by an upgrade by restoring the system from some backup data. This
+ results in the loss of any data persisted since the backup was taken.
+
+Rollforward
+ The term refers to a failure handling strategy applied after a restore
+ (from a backup) operation to recover the data persisted between
+ the time the backup was taken and the moment it is restored. Rollforward
+ requires that data that needs to survive the restore operation is logged at
+ a location not impacted by the restore so that it can be re-applied to the
+ system after its restoration from the backup.
+
+Downgrade
+ The term refers to an upgrade in which an earlier version of the software
+ is restored through the upgrade procedure. A system can be downgraded to any
+ earlier version and the compatibility of the versions will determine the
+ applicable upgrade strategies and whether service outage can be avoided.
+ In particular any data conversion needs special attention.
-Upgrade Plan (or Campaign?)
- The term refers to a choreography that describes how the upgrade should
- be performed in terms of its targets (i.e. upgrade objects), the
- steps/actions required of upgrading each, and the coordination of these
- steps so that service availability can be maintained. It is an input to an
- upgrade tool (Escalator) to carry out the upgrade
Upgrade Objects
@@ -83,29 +118,33 @@ Upgrade Objects
Physical Resource
^^^^^^^^^^^^^^^^^
-Most of cloud infrastructures support dynamic addition/removal of
-hardware. A hardware upgrade could be done by adding the new
-hardware node and removing the old one. From the persepctive of smooth
+Most cloud infrastructures support the dynamic addition/removal of
+hardware. Accordingly, a hardware upgrade could be done by adding the new
+piece of hardware and removing the old one. From the perspective of smooth
upgrade the orchestration/scheduling of this actions is the primary concern.
-Upgrading a physical resource,
-like upgrading its firmware and/or modify its configuration data, may
-also be considered in the future.
+Upgrading a physical resource may also involve upgrading its firmware
+and/or modifying its configuration data. This may require the restart of the
+hardware.
+
Virtual Resources
^^^^^^^^^^^^^^^^^
-Virtual resource upgrade mainly done by users. OPNFV may facilitate
-the activity, but suggest to have it in long term roadmap instead of
-initiate release.
+Addition and removal of virtual resources may be initiated by the users or be
+a result of an elasticity action. Users may also request the upgrade of their
+virtual resources using a new VM image.
+
+.. Needs to be moved to requirement section: Escalator should facilitate such an
+option and allow for a smooth upgrade.
-.. <MT> same comment here: I don't think the VNF is the virtual
- resource. Virtual resources are the VMs, virtual switches, virtual
- routers, virtual disks etc. The VNF uses them, but I don't think they
- are equal. For example if by some reason the hypervisor is changed and
- the current VMs cannot be migrated to the new hypervisor, they are
- incompatible, then the VMs need to be upgraded too. This is not
- something the NFVI user (i.e. VNFs ) would even know about.
+On the other hand, changes in the infrastructure, namely in the hardware and/or
+the virtualization facility resources, may result in the upgrade of the virtual
+resources. For example, if for some reason the hypervisor is changed and
+the current VMs cannot be migrated to the new hypervisor because they are
+incompatible, then the VMs need to be upgraded too. This is not
+something the NFVI user (i.e. the VNFs) would know about. In such cases
+smooth upgrade is essential.
Virtualization Facility Resources
@@ -189,95 +228,115 @@ result in the same.
Updating 3 might lead to control plane services interruption if not an
HA deployment.
-Upgrade Span
-~~~~~~~~~~~~
-**Major Upgrade**
-Upgrades between major releases may introducing significant changes in
-function, configuration and data, such as the upgrade of OPNFV from
-Arno to Brahmaputra.
-**Minor Upgrade**
-
-Upgrades inside one major releases which would not leads to changing
-the structure of the platform and may not infect the schema of the
-system data.
Upgrade Granularity
~~~~~~~~~~~~~~~~~~~
-Physical/Hardware Dimension
-^^^^^^^^^^^^^^^^^^^^^^^^^^^
+The granularity of an upgrade can be characterized from two perspectives:
+
+- the physical dimension and
+- the software dimension
+
+Physical Dimension
+^^^^^^^^^^^^^^^^^^
+
+The physical dimension characterizes the number of similar upgrade objects
+targeted by the upgrade, i.e. whether it is a full or partial upgrade of a
+data centre, cluster or zone.
+Because of its size, the upgrade of a data centre or a zone may be divided into
+several batches. Thus there is a need for efficiency in the execution of
+upgrades of a potentially huge number of upgrade objects while still maintaining
+availability to fulfill the requirement of smooth upgrade.
-Support full / partial upgrade for data centre, cluster, zone. Because
-of the upgrade of a data centre or a zone, it may be divided into
-several batches. The upgrade of a cloud environment (cluster) may also
+The upgrade of a cloud environment (cluster) may also
be partial. For example, in one cloud environment running a number of
-VNFs, we may just try one of them to check the stability and
+VNFs, we may just try to upgrade one of them to check the stability and
performance, before we upgrade all of them.
+Thus there is a need for proper organization of the artifacts associated with
+the different upgrade objects. Also the different versions should be able
+to coexist beyond the upgrade period.
+
+From this perspective special attention may be needed when upgrading
+objects that are collaborating in a redundancy schema as in this case
+different versions not only need to coexist but also collaborate. This
+primarily puts requirements on the upgrade objects. If this is not possible,
+the upgrade campaign should be designed in such a way that the proper
+isolation is ensured.
Software Dimension
^^^^^^^^^^^^^^^^^^
-- The upgrade of host OS or kernel may need a 'hot migration'
-- The upgrade of OpenStack’s components
+The software dimension of the upgrade characterizes the upgrade object
+type targeted and the combination in which they are upgraded together.
- i.the one-shot upgrade of all components
-
- ii.the partial upgrade (or bugfix patch) which only affects some
- components (e.g., computing, storage, network, database, message
- queue, etc.)
+Even though the upgrade may
+initially target only one type of upgrade object, e.g. the hypervisor,
+the dependency of other upgrade objects on this initial target object may
+require their upgrade as well, i.e. the upgrades need to be combined. From this
+perspective the main concern is the compatibility of the dependent and
+sponsor objects. To take these dependencies into consideration,
+they need to be described together with the version compatibility information.
+Breaking dependencies is the major cause of outages during upgrades.
-.. <MT> this section seems to overlap with 2.1.
- I can see the following dimensions for the software.
+In other cases it is more efficient to upgrade a combination of upgrade
+objects than to do it one by one. One aspect of the combination is how
+the upgrade packages can be combined: whether a new image can be created for
+them beforehand, or the different packages can be installed during the upgrade
+independently but activated together.
-.. <MT> different software packages
+The combination of upgrade objects may span across
+layers (e.g. software stack in the host and the VM of the VNF).
+Thus, it may require additional coordination between the management layers.
-.. <MT> different functions - Considering that the target versions of all
- software are compatible the upgrade needs to ensure that any
- dependencies between SW and therefore packages are taken into account
- in the upgrade plan, i.e. no version mismatch occurs during the
- upgrade therefore dependencies are not broken
-
-.. <MT> same function - This is an upgrade specific question if different
- versions can coexist in the system when a SW is being upgraded from
- one version to another. This is particularly important for stateful
- functions e.g. storage, networking, control services. The upgrade
- method must consider the compatibility of the redundant entities.
+With respect to each upgrade object type and even stacks we can
+distinguish major and minor upgrades:
-.. <MT> different versions of the same software package
+**Major Upgrade**
+
+Upgrades between major releases may introduce significant changes in
+function, configuration and data, such as the upgrade of OPNFV from
+Arno to Brahmaputra.
+
+**Minor Upgrade**
+
+Upgrades within one major release do not change
+the structure of the platform and may not affect the schema of the
+system data.
+
+Scope of Impact
+~~~~~~~~~~~~~~~
+
+Considering availability and therefore smooth upgrade, one of the major
+concerns is the predictability and control of the outcome of the different
+upgrade operations. Ideally an upgrade can be performed without impacting any
+entity in the system, which means none of the operations change or potentially
+change the behaviour of any entity in the system in an uncontrolled manner.
+Accordingly the operations of such an upgrade can be performed any time while
+the system is running, while all the entities are online. No entity needs to be
+taken offline to avoid such adverse effects. Hence such upgrade operations
+are referred to as online operations. The effects of the upgrade might be activated
+the next time the upgraded entity is used, or may require a special activation action such as a
+restart. Note that the activation action provides more control and predictability.
+
+If an entity's behavior in the system may change due to the upgrade it may
+be better to take it offline for the time of the relevant upgrade operations.
+The main question, however, is which hosted entities are impacted, considering
+the hosting relation of an upgrade object. Accordingly we can identify a scope
+which is impacted by taking the given upgrade object offline. The entities
+that are in the scope of impact may need to be taken offline or moved out of
+this scope i.e. migrated.
+
+If the impacted entity is in a different layer managed by another manager,
+this may require coordination, because the infrastructure resources taken out
+of service for the time of their upgrade support virtual
+resources used by VNFs that should not experience outages. The hosted VNFs
+may or may not allow for the hot migration of their VMs. In case of migration
+the placement policy of the VMs should be considered.
-.. <MT> major version changes - they may introduce incompatibilities. Even
- when there are backward compatibility requirements changes may cause
- issues at graceful roll-back
-
-.. <MT> minor version changes - they must not introduce incompatibility
- between versions, these should be primarily bug fixes, so live
- patches should be possible
-
-.. <MT> different installations of the same software package
-.. <MT> using different installation options - they may reflect different
- users with different needs so redundancy issues are less likely
- between installations of different options; but they could be the
- reflection of the heterogeneous system in which case they may provide
- redundancy for higher availability, i.e. deeper inspection is needed
-
-.. <MT> using the same installation options - they often reflect that the are
- used by redundant entities across space
-
-.. <MT> different distribution possibilities in space - same or different
- availability zones, multi-site, geo-redundancy
-
-.. <MT> different entities running from the same installation of a software
- package
-
-.. <MT> using different start-up options - they may reflect different users so
- redundancy may not be an issues between them
-
-.. <MT> using same start-up options - they often reflect redundant
- entities
Upgrade duration
~~~~~~~~~~~~~~~~