summaryrefslogtreecommitdiffstats
path: root/src/ceph/doc/dev/mds_internals/exports.rst
diff options
context:
space:
mode:
authorQiaowei Ren <qiaowei.ren@intel.com>2018-01-04 13:43:33 +0800
committerQiaowei Ren <qiaowei.ren@intel.com>2018-01-05 11:59:39 +0800
commit812ff6ca9fcd3e629e49d4328905f33eee8ca3f5 (patch)
tree04ece7b4da00d9d2f98093774594f4057ae561d4 /src/ceph/doc/dev/mds_internals/exports.rst
parent15280273faafb77777eab341909a3f495cf248d9 (diff)
initial code repo
This patch creates initial code repo. For ceph, luminous stable release will be used for base code, and next changes and optimization for ceph will be added to it. For opensds, currently any changes can be upstreamed into original opensds repo (https://github.com/opensds/opensds), and so stor4nfv will directly clone opensds code to deploy stor4nfv environment. And the scripts for deployment based on ceph and opensds will be put into 'ci' directory. Change-Id: I46a32218884c75dda2936337604ff03c554648e4 Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Diffstat (limited to 'src/ceph/doc/dev/mds_internals/exports.rst')
-rw-r--r--src/ceph/doc/dev/mds_internals/exports.rst76
1 files changed, 76 insertions, 0 deletions
diff --git a/src/ceph/doc/dev/mds_internals/exports.rst b/src/ceph/doc/dev/mds_internals/exports.rst
new file mode 100644
index 0000000..c5b0e39
--- /dev/null
+++ b/src/ceph/doc/dev/mds_internals/exports.rst
@@ -0,0 +1,76 @@
+
+===============
+Subtree exports
+===============
+
+Normal Migration
+----------------
+
+The exporter begins by doing some checks in export_dir() to verify
+that it is permissible to export the subtree at this time. In
+particular, the cluster must not be degraded, the subtree root may not
+be freezing or frozen (\ie already exporting, or nested beneath
+something that is exporting), and the path must be pinned (\ie not
+conflicted with a rename). If these conditions are met, the subtree
+freeze is initiated, and the exporter is committed to the subtree
+migration, barring an intervening failure of the importer or itself.
+
+The MExportDirDiscover serves simply to ensure that the base directory
+being exported is open on the destination node. It is pinned by the
+importer to prevent it from being trimmed. This occurs before the
+exporter completes the freeze of the subtree to ensure that the
+importer is able to replicate the necessary metadata. When the
+exporter receives the MExportDirDiscoverAck, it allows the freeze to proceed.
+
+The MExportDirPrep message then follows to populate a spanning tree that
+includes all dirs, inodes, and dentries necessary to reach any nested
+exports within the exported region. This replicates metadata as well,
+but it is pushed out by the exporter, avoiding deadlock with the
+regular discover and replication process. The importer is responsible
+for opening the bounding directories from any third parties before
+acknowledging. This ensures that the importer has correct dir_auth
+information about where authority is delegated for all points nested
+within the subtree being migrated. While processing the MExportDirPrep,
+the importer freezes the entire subtree region to prevent any new
+replication or cache expiration.
+
+The warning stage occurs only if the base subtree directory is open by
+nodes other than the importer and exporter. If so, then a
+MExportDirNotify message informs any bystanders that the authority for
+the region is temporarily ambiguous. In particular, bystanders who
+are trimming items from their cache must send MCacheExpire messages to
+both the old and new authorities. This is necessary to ensure that
+the surviving authority reliably receives all expirations even if the
+importer or exporter fails. While the subtree is frozen (on both the
+importer and exporter), expirations will not be immediately processed;
+instead, they will be queued until the region is unfrozen and it can
+be determined that the node is or is not authoritative for the region.
+
+The MExportDir message sends the actual subtree metadata to the importer.
+Upon receipt, the importer inserts the data into its cache, logs a
+copy in the EImportStart, and replies with an MExportDirAck. The exporter
+can now log an EExport, which ultimately specifies that
+the export was a success. In the presence of failures, it is the
+existence of the EExport that disambiguates authority during recovery.
+
+Once logged, the exporter will send an MExportDirNotify to any
+bystanders, informing them that the authority is no longer ambiguous
+and cache expirations should be sent only to the new authority (the
+importer). Once these are acknowledged, implicitly flushing the
+bystander to exporter message streams of any stray expiration notices,
+the exporter unfreezes the subtree, cleans up its state, and sends a
+final MExportDirFinish to the importer. Upon receipt, the importer logs
+an EImportFinish(true), unfreezes its subtree, and cleans up its
+state.
+
+
+PARTIAL FAILURE RECOVERY
+
+
+
+RECOVERY FROM JOURNAL
+
+
+
+
+