summaryrefslogtreecommitdiffstats
path: root/networking-odl/doc/source/specs/journal-recovery.rst
diff options
context:
space:
mode:
Diffstat (limited to 'networking-odl/doc/source/specs/journal-recovery.rst')
-rw-r--r--networking-odl/doc/source/specs/journal-recovery.rst152
1 files changed, 0 insertions, 152 deletions
diff --git a/networking-odl/doc/source/specs/journal-recovery.rst b/networking-odl/doc/source/specs/journal-recovery.rst
deleted file mode 100644
index 1132485..0000000
--- a/networking-odl/doc/source/specs/journal-recovery.rst
+++ /dev/null
@@ -1,152 +0,0 @@
-..
- This work is licensed under a Creative Commons Attribution 3.0 Unported
- License.
-
- http://creativecommons.org/licenses/by/3.0/legalcode
-
-================
-Journal Recovery
-================
-
-https://blueprints.launchpad.net/networking-odl/+spec/journal-recovery
-
-Journal entries in the failed state need to be handled somehow. This spec will
-try to address the issue and propose a solution.
-
-Problem Description
-===================
-
-Currently there is no handling for Journal entries that reach the failed state.
-A journal entry can reach the failed state for several reasons, some of which
-are:
-
-* Reached maximum failed attempts for retrying the operation.
-
-* Inconsistency between ODL and the Neutron DB.
-
- * For example: An update fails because the resource doesn't exist in ODL.
-
-* Bugs that can lead to failure to sync up.
-
-These entries will be left in the journal table forever which is a bit wasteful
-since they take up some space on the DB storage and also affect the performance
-of the journal table.
-Albeit each entry has a negligble effect on it's own, the impact of a large
-number of such entries can become quite significant.
-
-Proposed Change
-===============
-
-A "journal recovery" routine will run as part of the current journal
-maintenance process.
-This routine will scan the journal table for rows in the "failed" state and
-will try to sync the resource for that entry.
-
-The procedure can be best described by the following flow chart:
-
-asciiflow::
-
- +-----------------+
- | For each entry |
- | in failed state |
- +-------+---------+
- |
- +-------v--------+
- | Query resource |
- | on ODL (REST) |
- +-----+-----+----+
- | | +-----------+
- Resource | | Determine |
- exists +--Resource doesn't exist--> operation |
- | | type |
- +-----v-----+ +-----+-----+
- | Determine | |
- | operation | |
- | type | |
- +-----+-----+ |
- | +------------+ |
- +--Create------> Mark entry <--Delete--+
- | | completed | |
- | +----------^-+ Create/
- | | Update
- | | |
- | +------------+ | +-----v-----+
- +--Delete--> Mark entry | | | Determine |
- | | pending | | | parent |
- | +---------^--+ | | relation |
- | | | +-----+-----+
- +-----v------+ | | |
- | Compare to +--Different--+ | |
- | resource | | |
- | in DB +--Same------------+ |
- +------------+ |
- |
- +-------------------+ |
- | Create entry for <-----Has no parent------+
- | resource creation | |
- +--------^----------+ Has a parent
- | |
- | +---------v-----+
- +------Parent exists------+ Query parent |
- | on ODL (REST) |
- +---------+-----+
- +------------------+ |
- | Create entry for <---Parent doesn't exist--+
- | parent creation |
- +------------------+
-
-For every error during the process the entry will remain in failed state but
-the error shouldn't stop processing of further entries.
-
-
-The implementation could be done in two phases where the parent handling is
-done in a a second pahse.
-For the first phase if we detect an entry that is in failed for a create/update
-operation and the resource doesn't exist on ODL we create a new "create
-resource" journal entry for the resource.
-
-This propsal utilises the journal mechanism for it's operation while the only
-part that deviates from the standard mode of operation is when it queries ODL
-directly. This direct query has to be done to get ODL's representation of the
-resource.
-
-Performance Impact
-------------------
-
-The maintenance thread will have another task to handle. This can lead to
-longer processing time and even cause the thread to skip an iteration.
-This is not an issue since the maintenance thread runs in parallel and doesn't
-directly impact the responsiveness of the system.
-
-Since most operations here involve I/O then CPU probably won't be impacted.
-
-Network traffic would be impacted slightly since we will attempt to fetch the
-resource each time from ODL and we might attempt to fetch it's parent.
-This is however negligble as we do this only for failed entries, which are
-expected to appear rarely.
-
-
-Alternatives
-------------
-
-The partial sync process could make this process obsolete (along with full
-sync), but it's a far more complicated and problematic process.
-It's better to start with this process which is more lightweight and doable
-and consider partial sync in the future.
-
-
-Assignee(s)
-===========
-
-Primary assignee:
- mkolesni <mkolesni@redhat.com>
-
-Other contributors:
- None
-
-
-References
-==========
-
-https://goo.gl/IOMpzJ
-