authorQiaowei Ren <qiaowei.ren@intel.com>2018-01-04 13:43:33 +0800
committerQiaowei Ren <qiaowei.ren@intel.com>2018-01-05 11:59:39 +0800
commit812ff6ca9fcd3e629e49d4328905f33eee8ca3f5 (patch)
tree04ece7b4da00d9d2f98093774594f4057ae561d4 /src/ceph/doc/rbd
parent15280273faafb77777eab341909a3f495cf248d9 (diff)
initial code repo
This patch creates initial code repo. For ceph, luminous stable release will be used for base code, and next changes and optimization for ceph will be added to it. For opensds, currently any changes can be upstreamed into original opensds repo (https://github.com/opensds/opensds), and so stor4nfv will directly clone opensds code to deploy stor4nfv environment. And the scripts for deployment based on ceph and opensds will be put into 'ci' directory. Change-Id: I46a32218884c75dda2936337604ff03c554648e4 Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Diffstat (limited to 'src/ceph/doc/rbd')
-rw-r--r--src/ceph/doc/rbd/api/index.rst8
-rw-r--r--src/ceph/doc/rbd/api/librbdpy.rst82
-rw-r--r--src/ceph/doc/rbd/disk.conf8
-rw-r--r--src/ceph/doc/rbd/index.rst72
-rw-r--r--src/ceph/doc/rbd/iscsi-initiator-esx.rst36
-rw-r--r--src/ceph/doc/rbd/iscsi-initiator-rhel.rst90
-rw-r--r--src/ceph/doc/rbd/iscsi-initiator-win.rst100
-rw-r--r--src/ceph/doc/rbd/iscsi-initiators.rst10
-rw-r--r--src/ceph/doc/rbd/iscsi-monitoring.rst103
-rw-r--r--src/ceph/doc/rbd/iscsi-overview.rst50
-rw-r--r--src/ceph/doc/rbd/iscsi-requirements.rst49
-rw-r--r--src/ceph/doc/rbd/iscsi-target-ansible.rst343
-rw-r--r--src/ceph/doc/rbd/iscsi-target-cli.rst163
-rw-r--r--src/ceph/doc/rbd/iscsi-targets.rst27
-rw-r--r--src/ceph/doc/rbd/libvirt.rst319
-rw-r--r--src/ceph/doc/rbd/man/index.rst16
-rw-r--r--src/ceph/doc/rbd/qemu-rbd.rst218
-rw-r--r--src/ceph/doc/rbd/rados-rbd-cmds.rst223
-rw-r--r--src/ceph/doc/rbd/rbd-cloudstack.rst135
-rw-r--r--src/ceph/doc/rbd/rbd-config-ref.rst136
-rw-r--r--src/ceph/doc/rbd/rbd-ko.rst59
-rw-r--r--src/ceph/doc/rbd/rbd-mirroring.rst318
-rw-r--r--src/ceph/doc/rbd/rbd-openstack.rst512
-rw-r--r--src/ceph/doc/rbd/rbd-replay.rst42
-rw-r--r--src/ceph/doc/rbd/rbd-snapshot.rst308
25 files changed, 3427 insertions, 0 deletions
diff --git a/src/ceph/doc/rbd/api/index.rst b/src/ceph/doc/rbd/api/index.rst
new file mode 100644
index 0000000..71f6809
--- /dev/null
+++ b/src/ceph/doc/rbd/api/index.rst
@@ -0,0 +1,8 @@
+========================
+ Ceph Block Device APIs
+========================
+
+.. toctree::
+ :maxdepth: 2
+
+ librbd (Python) <librbdpy>
diff --git a/src/ceph/doc/rbd/api/librbdpy.rst b/src/ceph/doc/rbd/api/librbdpy.rst
new file mode 100644
index 0000000..fa90331
--- /dev/null
+++ b/src/ceph/doc/rbd/api/librbdpy.rst
@@ -0,0 +1,82 @@
+================
+ Librbd (Python)
+================
+
+.. highlight:: python
+
+The `rbd` Python module provides file-like access to RBD images.
+
+
+Example: Creating and writing to an image
+=========================================
+
+To use `rbd`, you must first connect to RADOS and open an IO
+context::
+
+ cluster = rados.Rados(conffile='my_ceph.conf')
+ cluster.connect()
+ ioctx = cluster.open_ioctx('mypool')
+
+Then you instantiate an :class:`rbd.RBD` object, which you use to create the
+image::
+
+ rbd_inst = rbd.RBD()
+ size = 4 * 1024**3 # 4 GiB
+ rbd_inst.create(ioctx, 'myimage', size)
+
+To perform I/O on the image, you instantiate an :class:`rbd.Image` object::
+
+ image = rbd.Image(ioctx, 'myimage')
+ data = 'foo' * 200
+ image.write(data, 0)
+
+This writes 'foo' to the first 600 bytes of the image. Note that data
+cannot be :type:`unicode` - `Librbd` does not know how to deal with
+characters wider than a :c:type:`char`.
+
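+Data can be read back in the same way. A minimal sketch, assuming the
+``image`` object opened above (note that with the Python 3 bindings the data
+passed to ``write`` must be a byte string, for example ``b'foo' * 200``)::
+
+    # Read the 600 bytes just written, starting at offset 0.
+    buf = image.read(0, 600)    # returns a byte string
+    print(buf[:12])
+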
+In the end, you will want to close the image, the IO context and the connection to RADOS::
+
+ image.close()
+ ioctx.close()
+ cluster.shutdown()
+
+To be safe, each of these calls would need to be in a separate ``finally``
+block::
+
+ cluster = rados.Rados(conffile='my_ceph.conf')
+ try:
+ ioctx = cluster.open_ioctx('mypool')
+ try:
+ rbd_inst = rbd.RBD()
+ size = 4 * 1024**3 # 4 GiB
+ rbd_inst.create(ioctx, 'myimage', size)
+ image = rbd.Image(ioctx, 'myimage')
+ try:
+ data = 'foo' * 200
+ image.write(data, 0)
+ finally:
+ image.close()
+ finally:
+ ioctx.close()
+ finally:
+ cluster.shutdown()
+
+This can be cumbersome, so the :class:`Rados`, :class:`Ioctx`, and
+:class:`Image` classes can be used as context managers that close/shutdown
+automatically (see :pep:`343`). Using them as context managers, the
+above example becomes::
+
+ with rados.Rados(conffile='my_ceph.conf') as cluster:
+ with cluster.open_ioctx('mypool') as ioctx:
+ rbd_inst = rbd.RBD()
+ size = 4 * 1024**3 # 4 GiB
+ rbd_inst.create(ioctx, 'myimage', size)
+ with rbd.Image(ioctx, 'myimage') as image:
+ data = 'foo' * 200
+ image.write(data, 0)
+
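+Beyond creating images, the :class:`rbd.RBD` object also exposes pool-level
+operations. As a minimal sketch (reusing the pool and image names from the
+examples above), you can list the images in a pool and remove the test image
+when it is no longer needed::
+
+    with rados.Rados(conffile='my_ceph.conf') as cluster:
+        with cluster.open_ioctx('mypool') as ioctx:
+            rbd_inst = rbd.RBD()
+            print(rbd_inst.list(ioctx))        # e.g. ['myimage']
+            rbd_inst.remove(ioctx, 'myimage')  # delete the image
+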
+API Reference
+=============
+
+.. automodule:: rbd
+ :members: RBD, Image, SnapIterator
diff --git a/src/ceph/doc/rbd/disk.conf b/src/ceph/doc/rbd/disk.conf
new file mode 100644
index 0000000..3db9b8a
--- /dev/null
+++ b/src/ceph/doc/rbd/disk.conf
@@ -0,0 +1,8 @@
+<disk type='network' device='disk'>
+ <source protocol='rbd' name='poolname/imagename'>
+ <host name='{fqdn}' port='6789'/>
+ <host name='{fqdn}' port='6790'/>
+ <host name='{fqdn}' port='6791'/>
+ </source>
+ <target dev='vda' bus='virtio'/>
+</disk>
diff --git a/src/ceph/doc/rbd/index.rst b/src/ceph/doc/rbd/index.rst
new file mode 100644
index 0000000..c297d0d
--- /dev/null
+++ b/src/ceph/doc/rbd/index.rst
@@ -0,0 +1,72 @@
+===================
+ Ceph Block Device
+===================
+
+.. index:: Ceph Block Device; introduction
+
+A block is a sequence of bytes (for example, a 512-byte block of data).
+Block-based storage interfaces are the most common way to store data with
+rotating media such as hard disks, CDs, floppy disks, and even traditional
+9-track tape. The ubiquity of block device interfaces makes a virtual block
+device an ideal candidate to interact with a mass data storage system like Ceph.
+
+Ceph block devices are thin-provisioned, resizable and store data striped over
+multiple OSDs in a Ceph cluster. Ceph block devices leverage
+:abbr:`RADOS (Reliable Autonomic Distributed Object Store)` capabilities
+such as snapshotting, replication and consistency. Ceph's
+:abbr:`RADOS (Reliable Autonomic Distributed Object Store)` Block Devices (RBD)
+interact with OSDs using kernel modules or the ``librbd`` library.
+
+.. ditaa:: +------------------------+ +------------------------+
+ | Kernel Module | | librbd |
+ +------------------------+-+------------------------+
+ | RADOS Protocol |
+ +------------------------+-+------------------------+
+ | OSDs | | Monitors |
+ +------------------------+ +------------------------+
+
+.. note:: Kernel modules can use Linux page caching. For ``librbd``-based
+ applications, Ceph supports `RBD Caching`_.
+
+Ceph's block devices deliver high performance and massive scalability to
+`kernel modules`_, to :abbr:`KVMs (kernel virtual machines)` such as `QEMU`_, and to
+cloud-based computing systems like `OpenStack`_ and `CloudStack`_ that rely on
+libvirt and QEMU to integrate with Ceph block devices. You can use the same cluster
+to operate the `Ceph RADOS Gateway`_, the `Ceph FS filesystem`_, and Ceph block
+devices simultaneously.
+
+.. important:: To use Ceph Block Devices, you must have access to a running
+ Ceph cluster.
+
+.. toctree::
+ :maxdepth: 1
+
+ Commands <rados-rbd-cmds>
+ Kernel Modules <rbd-ko>
+ Snapshots <rbd-snapshot>
+ Mirroring <rbd-mirroring>
+ iSCSI Gateway <iscsi-overview>
+ QEMU <qemu-rbd>
+ libvirt <libvirt>
+ Cache Settings <rbd-config-ref>
+ OpenStack <rbd-openstack>
+ CloudStack <rbd-cloudstack>
+ RBD Replay <rbd-replay>
+
+.. toctree::
+ :maxdepth: 2
+
+ Manpages <man/index>
+
+.. toctree::
+ :maxdepth: 2
+
+ APIs <api/index>
+
+.. _RBD Caching: ../rbd-config-ref/
+.. _kernel modules: ../rbd-ko/
+.. _QEMU: ../qemu-rbd/
+.. _OpenStack: ../rbd-openstack
+.. _CloudStack: ../rbd-cloudstack
+.. _Ceph RADOS Gateway: ../../radosgw/
+.. _Ceph FS filesystem: ../../cephfs/
diff --git a/src/ceph/doc/rbd/iscsi-initiator-esx.rst b/src/ceph/doc/rbd/iscsi-initiator-esx.rst
new file mode 100644
index 0000000..18dd583
--- /dev/null
+++ b/src/ceph/doc/rbd/iscsi-initiator-esx.rst
@@ -0,0 +1,36 @@
+----------------------------------
+The iSCSI Initiator for VMware ESX
+----------------------------------
+
+**Prerequisite:**
+
+- VMware ESX 6.0 or later
+
+**iSCSI Discovery and Multipath Device Setup:**
+
+#. From vSphere, open Storage Adapters on the Configuration tab. Right-click
+ on the iSCSI Software Adapter and select Properties.
+
+#. In the General tab click the "Advanced" button and in the "Advanced Settings"
+ set RecoveryTimeout to 25.
+
+#. If CHAP was set up on the iSCSI gateway, in the General tab click the "CHAP..."
+ button. If CHAP is not being used, skip to step 4.
+
+#. On the CHAP Credentials window, select “Do not use CHAP unless required by target”,
+ and enter the "Name" and "Secret" values used on the initial setup for the iSCSI
+ gateway, then click on the "OK" button.
+
+#. On the Dynamic Discovery tab, click the "Add..." button, and enter the IP address
+ and port of one of the iSCSI target portals. Click on the "OK" button.
+
+#. Close the iSCSI Initiator Properties window. A prompt will ask to rescan the
+ iSCSI software adapter. Select Yes.
+
+#. In the Details pane, the LUN on the iSCSI target will be displayed. Right click
+ on a device and select "Manage Paths".
+
+#. On the Manage Paths window, select “Most Recently Used (VMware)” for the path
+ selection policy. Close and repeat for the other disks.
+
+Now the disks can be used for datastores.
diff --git a/src/ceph/doc/rbd/iscsi-initiator-rhel.rst b/src/ceph/doc/rbd/iscsi-initiator-rhel.rst
new file mode 100644
index 0000000..51248e4
--- /dev/null
+++ b/src/ceph/doc/rbd/iscsi-initiator-rhel.rst
@@ -0,0 +1,90 @@
+------------------------------------------------
+The iSCSI Initiator for Red Hat Enterprise Linux
+------------------------------------------------
+
+**Prerequisite:**
+
+- Package ``iscsi-initiator-utils-6.2.0.873-35`` or newer must be
+ installed
+
+- Package ``device-mapper-multipath-0.4.9-99`` or newer must be
+ installed
+
+**Installing:**
+
+Install the iSCSI initiator and multipath tools:
+
+ ::
+
+ # yum install iscsi-initiator-utils
+ # yum install device-mapper-multipath
+
+**Configuring:**
+
+#. Create the default ``/etc/multipath.conf`` file and enable the
+ ``multipathd`` service:
+
+ ::
+
+ # mpathconf --enable --with_multipathd y
+
+#. Add the following to the ``/etc/multipath.conf`` file:
+
+ ::
+
+ devices {
+ device {
+ vendor "LIO-ORG"
+ hardware_handler "1 alua"
+ path_grouping_policy "failover"
+ path_selector "queue-length 0"
+ failback 60
+ path_checker tur
+ prio alua
+ prio_args exclusive_pref_bit
+ fast_io_fail_tmo 25
+ no_path_retry queue
+ }
+ }
+
+#. Reload the ``multipathd`` service:
+
+ ::
+
+ # systemctl reload multipathd
+
+**iSCSI Discovery and Setup:**
+
+#. Discover the target portals:
+
+ ::
+
+ # iscsiadm -m discovery -t st -p 192.168.56.101
+ 192.168.56.101:3260,1 iqn.2003-01.org.linux-iscsi.rheln1
+ 192.168.56.102:3260,2 iqn.2003-01.org.linux-iscsi.rheln1
+
+#. Login to target:
+
+ ::
+
+ # iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.rheln1 -l
+
+**Multipath IO Setup:**
+
+The multipath daemon (``multipathd``) will set up devices automatically
+based on the ``multipath.conf`` settings. Running the ``multipath``
+command shows the devices set up in a failover configuration with a priority
+group for each path.
+
+::
+
+ # multipath -ll
+ mpathbt (360014059ca317516a69465c883a29603) dm-1 LIO-ORG ,IBLOCK
+ size=1.0G features='0' hwhandler='1 alua' wp=rw
+ |-+- policy='queue-length 0' prio=50 status=active
+ | `- 28:0:0:1 sde 8:64 active ready running
+ `-+- policy='queue-length 0' prio=10 status=enabled
+ `- 29:0:0:1 sdc 8:32 active ready running
+
+You should now be able to use the RBD image like you would a normal
+multipath’d iSCSI disk.
diff --git a/src/ceph/doc/rbd/iscsi-initiator-win.rst b/src/ceph/doc/rbd/iscsi-initiator-win.rst
new file mode 100644
index 0000000..08a1cfb
--- /dev/null
+++ b/src/ceph/doc/rbd/iscsi-initiator-win.rst
@@ -0,0 +1,100 @@
+-----------------------------------------
+The iSCSI Initiator for Microsoft Windows
+-----------------------------------------
+
+**Prerequisite:**
+
+- Microsoft Windows 2016
+
+**iSCSI Initiator, Discovery and Setup:**
+
+#. Install the iSCSI initiator driver and MPIO tools.
+
+#. Launch the MPIO program, click on the “Discover Multi-Paths” tab, and select “Add
+ support for iSCSI devices”.
+
+#. On the iSCSI Initiator Properties window, on the "Discovery" tab, add a target
+ portal. Enter the IP address or DNS name and Port of the Ceph iSCSI gateway.
+
+#. On the “Targets” tab, select the target and click on “Connect”.
+
+#. On the “Connect To Target” window, select the “Enable multi-path” option, and
+ click the “Advanced” button.
+
+#. Under the "Connet using" section, select a “Target portal IP” . Select the
+ “Enable CHAP login on” and enter the "Name" and "Target secret" values from the
+ Ceph iSCSI Ansible client credentials section, and click OK.
+
+#. Repeat steps 5 and 6 for each target portal defined when setting up
+ the iSCSI gateway.
+
+**Multipath IO Setup:**
+
+The MPIO load balancing policy, timeout, and retry options are configured
+using PowerShell with the ``mpclaim`` command. The rest of the setup is
+done in the MPIO tool.
+
+.. note::
+ It is recommended to increase the ``PDORemovePeriod`` option to 120
+ seconds from PowerShell. This value might need to be adjusted based
+ on the application. When all paths are down, and 120 seconds
+ expires, the operating system will start failing IO requests.
+
+::
+
+ Set-MPIOSetting -NewPDORemovePeriod 120
+
+::
+
+ mpclaim.exe -l -m 1
+
+::
+
+ mpclaim -s -m
+ MSDSM-wide Load Balance Policy: Fail Over Only
+
+#. Using the MPIO tool, from the “Targets” tab, click on the
+ “Devices...” button.
+
+#. From the Devices window, select a disk and click the
+ “MPIO...” button.
+
+#. On the "Device Details" window the paths to each target portal is
+ displayed. If using the ``ceph-ansible`` setup method, the
+ iSCSI gateway will use ALUA to tell the iSCSI initiator which path
+ and iSCSI gateway should be used as the primary path. The Load
+ Balancing Policy “Fail Over Only” must be selected.
+
+::
+
+ mpclaim -s -d $MPIO_DISK_ID
+
+.. note::
+ For the ``ceph-ansible`` setup method, there will be one
+ Active/Optimized path which is the path to the iSCSI gateway node
+ that owns the LUN, and there will be an Active/Unoptimized path for
+ each other iSCSI gateway node.
+
+**Tuning:**
+
+Consider using the following registry settings (a scripted sketch for applying them follows this list):
+
+- Windows Disk Timeout
+
+ ::
+
+ HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk
+
+ ::
+
+ TimeOutValue = 65
+
+- Microsoft iSCSI Initiator Driver
+
+ ::
+
+ HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\<Instance_Number>\Parameters
+
+ ::
+
+ LinkDownTime = 25
+ SRBTimeoutDelta = 15
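+
+The same values can also be applied from a script. A minimal sketch using only
+Python's standard ``winreg`` module, run from an elevated (Administrator)
+prompt; it sets the Disk ``TimeOutValue`` shown above, and the iSCSI initiator
+driver values can be handled the same way once the correct
+``<Instance_Number>`` key has been identified::
+
+    import winreg
+
+    # Open the Disk service key and set the 65 second timeout as a REG_DWORD.
+    disk_key = r"SYSTEM\CurrentControlSet\Services\Disk"
+    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, disk_key, 0,
+                        winreg.KEY_SET_VALUE) as key:
+        winreg.SetValueEx(key, "TimeOutValue", 0, winreg.REG_DWORD, 65)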
diff --git a/src/ceph/doc/rbd/iscsi-initiators.rst b/src/ceph/doc/rbd/iscsi-initiators.rst
new file mode 100644
index 0000000..d3ad633
--- /dev/null
+++ b/src/ceph/doc/rbd/iscsi-initiators.rst
@@ -0,0 +1,10 @@
+--------------------------------
+Configuring the iSCSI Initiators
+--------------------------------
+
+.. toctree::
+ :maxdepth: 1
+
+ The iSCSI Initiator for Red Hat Enterprise Linux <iscsi-initiator-rhel>
+ The iSCSI Initiator for Microsoft Windows <iscsi-initiator-win>
+ The iSCSI Initiator for VMware ESX <iscsi-initiator-esx>
diff --git a/src/ceph/doc/rbd/iscsi-monitoring.rst b/src/ceph/doc/rbd/iscsi-monitoring.rst
new file mode 100644
index 0000000..d425232
--- /dev/null
+++ b/src/ceph/doc/rbd/iscsi-monitoring.rst
@@ -0,0 +1,103 @@
+-----------------------------
+Monitoring the iSCSI gateways
+-----------------------------
+
+Ceph provides an additional tool for iSCSI gateway environments
+to monitor performance of exported RADOS Block Device (RBD) images.
+
+The ``gwtop`` tool is a ``top``-like tool that displays aggregated
+performance metrics of RBD images that are exported to clients over
+iSCSI. The metrics are sourced from a Performance Metrics Domain Agent
+(PMDA). Information from the Linux-IO target (LIO) PMDA is used to list
+each exported RBD image with the connected client and its associated I/O
+metrics.
+
+**Requirements:**
+
+- A running Ceph iSCSI gateway
+
+**Installing:**
+
+#. As ``root``, install the ``ceph-iscsi-tools`` package on each iSCSI
+ gateway node:
+
+ ::
+
+ # yum install ceph-iscsi-tools
+
+#. As ``root``, install the performance co-pilot package on each iSCSI
+ gateway node:
+
+ ::
+
+ # yum install pcp
+
+#. As ``root``, install the LIO PMDA package on each iSCSI gateway node:
+
+ ::
+
+ # yum install pcp-pmda-lio
+
+#. As ``root``, enable and start the performance co-pilot service on
+ each iSCSI gateway node:
+
+ ::
+
+ # systemctl enable pmcd
+ # systemctl start pmcd
+
+#. As ``root``, register the ``pcp-pmda-lio`` agent:
+
+ ::
+
+ cd /var/lib/pcp/pmdas/lio
+ ./Install
+
+By default, ``gwtop`` assumes the iSCSI gateway configuration object is
+stored in a RADOS object called ``gateway.conf`` in the ``rbd`` pool.
+This configuration defines the iSCSI gateways to contact for gathering
+the performance statistics. This can be overridden by using either the
+``-g`` or ``-c`` flags. See ``gwtop --help`` for more details.
+
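+For reference, the configuration object can be inspected directly from any
+node with Ceph client credentials by using the ``rados`` Python binding; a
+minimal sketch, assuming the default pool (``rbd``) and object name::
+
+    import rados
+
+    with rados.Rados(conffile='/etc/ceph/ceph.conf') as cluster:
+        with cluster.open_ioctx('rbd') as ioctx:
+            # Print the raw contents of the iSCSI gateway configuration object.
+            print(ioctx.read('gateway.conf', length=1048576))
+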
+The LIO configuration determines which type of performance statistics to
+extract from performance co-pilot. When ``gwtop`` starts it looks at the
+LIO configuration, and if it finds user-space disks, then ``gwtop``
+selects the LIO collector automatically.
+
+**Example gwtop outputs**
+
+For kernel RBD-based devices:
+
+::
+
+ gwtop 2/2 Gateways CPU% MIN: 4 MAX: 5 Network Total In: 2M Out: 3M 10:20:09
+ Capacity: 8G Disks: 8 IOPS: 500 Clients: 1 Ceph: HEALTH_OK OSDs: 3
+ Pool.Image Src Device Size r/s w/s rMB/s wMB/s await r_await w_await Client
+ iscsi.t1703 rbd0 500M 0 0 0.00 0.00 0.00 0.00 0.00
+ iscsi.testme1 rbd5 500M 0 0 0.00 0.00 0.00 0.00 0.00
+ iscsi.testme2 rbd2 500M 0 0 0.00 0.00 0.00 0.00 0.00
+ iscsi.testme3 rbd3 500M 0 0 0.00 0.00 0.00 0.00 0.00
+ iscsi.testme5 rbd1 500M 0 0 0.00 0.00 0.00 0.00 0.00
+ rbd.myhost_1 T rbd4 4G 500 0 1.95 0.00 2.37 2.37 0.00 rh460p(CON)
+ rbd.test_2 rbd6 1G 0 0 0.00 0.00 0.00 0.00 0.00
+ rbd.testme rbd7 500M 0 0 0.00 0.00 0.00 0.00 0.00
+
+For user backed storage (TCMU) devices:
+
+::
+
+ gwtop 2/2 Gateways CPU% MIN: 4 MAX: 5 Network Total In: 2M Out: 3M 10:20:00
+ Capacity: 8G Disks: 8 IOPS: 503 Clients: 1 Ceph: HEALTH_OK OSDs: 3
+ Pool.Image Src Size iops rMB/s wMB/s Client
+ iscsi.t1703 500M 0 0.00 0.00
+ iscsi.testme1 500M 0 0.00 0.00
+ iscsi.testme2 500M 0 0.00 0.00
+ iscsi.testme3 500M 0 0.00 0.00
+ iscsi.testme5 500M 0 0.00 0.00
+ rbd.myhost_1 T 4G 504 1.95 0.00 rh460p(CON)
+ rbd.test_2 1G 0 0.00 0.00
+ rbd.testme 500M 0 0.00 0.00
+
+In the *Client* column, ``(CON)`` means the iSCSI initiator (client) is
+currently logged into the iSCSI gateway. If ``-multi-`` is displayed,
+then multiple clients are mapped to the single RBD image.
diff --git a/src/ceph/doc/rbd/iscsi-overview.rst b/src/ceph/doc/rbd/iscsi-overview.rst
new file mode 100644
index 0000000..a8c64e2
--- /dev/null
+++ b/src/ceph/doc/rbd/iscsi-overview.rst
@@ -0,0 +1,50 @@
+==================
+Ceph iSCSI Gateway
+==================
+
+The iSCSI gateway integrates Ceph Storage with the iSCSI standard to provide
+a Highly Available (HA) iSCSI target that exports RADOS Block Device (RBD) images
+as SCSI disks. The iSCSI protocol allows clients (initiators) to send SCSI commands
+to SCSI storage devices (targets) over a TCP/IP network. This allows for heterogeneous
+clients, such as Microsoft Windows, to access the Ceph Storage cluster.
+
+Each iSCSI gateway runs the Linux IO target kernel subsystem (LIO) to provide the
+iSCSI protocol support. LIO utilizes a userspace passthrough (TCMU) to interact
+with Ceph's librbd library and expose RBD images to iSCSI clients. With Ceph’s
+iSCSI gateway you can effectively run a fully integrated block-storage
+infrastructure with all the features and benefits of a conventional Storage Area
+Network (SAN).
+
+.. ditaa::
+ Cluster Network
+ +-------------------------------------------+
+ | | | |
+ +-------+ +-------+ +-------+ +-------+
+ | | | | | | | |
+ | OSD 1 | | OSD 2 | | OSD 3 | | OSD N |
+ | {s}| | {s}| | {s}| | {s}|
+ +-------+ +-------+ +-------+ +-------+
+ | | | |
+ +--------->| | +---------+ | |<---------+
+ : | | | RBD | | | :
+ | +----------------| Image |----------------+ |
+ | Public Network | {d} | |
+ | +---------+ |
+ | |
+ | +-------------------+ |
+ | +--------------+ | iSCSI Initators | +--------------+ |
+ | | iSCSI GW | | +-----------+ | | iSCSI GW | |
+ +-->| RBD Module |<--+ | Various | +-->| RBD Module |<--+
+ | | | | Operating | | | |
+ +--------------+ | | Systems | | +--------------+
+ | +-----------+ |
+ +-------------------+
+
+
+.. toctree::
+ :maxdepth: 1
+
+ Requirements <iscsi-requirements>
+ Configuring the iSCSI Target <iscsi-targets>
+ Configuring the iSCSI Initiator <iscsi-initiators>
+ Monitoring the iSCSI Gateways <iscsi-monitoring>
diff --git a/src/ceph/doc/rbd/iscsi-requirements.rst b/src/ceph/doc/rbd/iscsi-requirements.rst
new file mode 100644
index 0000000..1ae19e0
--- /dev/null
+++ b/src/ceph/doc/rbd/iscsi-requirements.rst
@@ -0,0 +1,49 @@
+==========================
+iSCSI Gateway Requirements
+==========================
+
+To implement the Ceph iSCSI gateway there are a few requirements. It is recommended
+to use two to four iSCSI gateway nodes for a highly available Ceph iSCSI gateway
+solution.
+
+For hardware recommendations, see the `Hardware Recommendation page <http://docs.ceph.com/docs/master/start/hardware-recommendations/>`_
+for more details.
+
+.. note::
+ On the iSCSI gateway nodes, the memory footprint of the RBD images
+ can grow to a large size. Plan memory requirements accordingly based
+ on the number of RBD images mapped.
+
+There are no specific iSCSI gateway options for the Ceph Monitors or
+OSDs, but it is important to lower the default timers for detecting
+down OSDs to reduce the possibility of initiator timeouts. The following
+configuration options are suggested for each OSD node in the storage
+cluster::
+
+ [osd]
+ osd heartbeat grace = 20
+ osd heartbeat interval = 5
+
+- Online Updating Using the Ceph Monitor
+
+ ::
+
+ ceph tell <daemon_type>.<id> injectargs '--<parameter_name> <new_value>'
+
+ ::
+
+ ceph tell osd.0 injectargs '--osd_heartbeat_grace 20'
+ ceph tell osd.0 injectargs '--osd_heartbeat_interval 5'
+
+- Online Updating on the OSD Node
+
+ ::
+
+ ceph daemon <daemon_type>.<id> config set <parameter_name> <new_value>
+
+ ::
+
+ ceph daemon osd.0 config set osd_heartbeat_grace 20
+ ceph daemon osd.0 config set osd_heartbeat_interval 5
+
+For more details on setting Ceph's configuration options, see the `Configuration page <http://docs.ceph.com/docs/master/rados/configuration/>`_.
diff --git a/src/ceph/doc/rbd/iscsi-target-ansible.rst b/src/ceph/doc/rbd/iscsi-target-ansible.rst
new file mode 100644
index 0000000..4169a9f
--- /dev/null
+++ b/src/ceph/doc/rbd/iscsi-target-ansible.rst
@@ -0,0 +1,343 @@
+==========================================
+Configuring the iSCSI Target using Ansible
+==========================================
+
+The Ceph iSCSI gateway is the iSCSI target node and also a Ceph client
+node. The Ceph iSCSI gateway can be a standalone node or be colocated on
+a Ceph Object Store Disk (OSD) node. Completing the following steps will
+install and configure the Ceph iSCSI gateway for basic operation.
+
+**Requirements:**
+
+- A running Ceph Luminous (12.2.x) cluster or newer
+
+- RHEL/CentOS 7.4; or Linux kernel v4.14 or newer
+
+- The ``ceph-iscsi-config`` package installed on all the iSCSI gateway nodes
+
+**Installing:**
+
+#. On the Ansible installer node, which could be either the administration node
+ or a dedicated deployment node, perform the following steps:
+
+ #. As ``root``, install the ``ceph-ansible`` package:
+
+ ::
+
+ # yum install ceph-ansible
+
+ #. Add an entry in ``/etc/ansible/hosts`` file for the gateway group:
+
+ ::
+
+ [ceph-iscsi-gw]
+ ceph-igw-1
+ ceph-igw-2
+
+.. note::
+ If co-locating the iSCSI gateway with an OSD node, then add the OSD node to the
+ ``[ceph-iscsi-gw]`` section.
+
+**Configuring:**
+
+The ``ceph-ansible`` package places a file in the ``/usr/share/ceph-ansible/group_vars/``
+directory called ``ceph-iscsi-gw.sample``. Create a copy of this sample file named
+``ceph-iscsi-gw.yml``. Review the following Ansible variables and descriptions,
+and update accordingly.
+
++--------------------------------------+--------------------------------------+
+| Variable | Meaning/Purpose |
++======================================+======================================+
+| ``seed_monitor`` | Each gateway needs access to the |
+| | ceph cluster for rados and rbd |
+| | calls. This means the iSCSI gateway |
+| | must have an appropriate |
+| | ``/etc/ceph/`` directory defined. |
+| | The ``seed_monitor`` host is used to |
+| | populate the iSCSI gateway’s |
+| | ``/etc/ceph/`` directory. |
++--------------------------------------+--------------------------------------+
+| ``cluster_name`` | Define a custom storage cluster |
+| | name. |
++--------------------------------------+--------------------------------------+
+| ``gateway_keyring`` | Define a custom keyring name. |
++--------------------------------------+--------------------------------------+
+| ``deploy_settings`` | If set to ``true``, then deploy the |
+| | settings when the playbook is ran. |
++--------------------------------------+--------------------------------------+
+| ``perform_system_checks`` | This is a boolean value that checks |
+| | for multipath and lvm configuration |
+| | settings on each gateway. It must be |
+| | set to true for at least the first |
+| | run to ensure multipathd and lvm are |
+| | configured properly. |
++--------------------------------------+--------------------------------------+
+| ``gateway_iqn`` | This is the iSCSI IQN that all the |
+| | gateways will expose to clients. |
+| | This means each client will see the |
+| | gateway group as a single subsystem. |
++--------------------------------------+--------------------------------------+
+| ``gateway_ip_list`` | The ip list defines the IP addresses |
+| | that will be used on the front end |
+| | network for iSCSI traffic. This IP |
+| | will be bound to the active target |
+| | portal group on each node, and is |
+| | the access point for iSCSI traffic. |
+| | Each IP should correspond to an IP |
+| | available on the hosts defined in |
+| | the ``ceph-iscsi-gw`` host group in |
+| | ``/etc/ansible/hosts``. |
++--------------------------------------+--------------------------------------+
+| ``rbd_devices`` | This section defines the RBD images |
+| | that will be controlled and managed |
+| | within the iSCSI gateway |
+| | configuration. Parameters like |
+| | ``pool`` and ``image`` are self |
+| | explanatory. Here are the other |
+| | parameters: ``size`` = This defines |
+| | the size of the RBD. You may |
+| | increase the size later, by simply |
+| | changing this value, but shrinking |
+| | the size of an RBD is not supported |
+| | and is ignored. ``host`` = This is |
+| | the iSCSI gateway host name that |
+| | will be responsible for the rbd |
+| | allocation/resize. Every defined |
+| | ``rbd_device`` entry must have a |
+| | host assigned. ``state`` = This is |
+| | typical Ansible syntax for whether |
+| | the resource should be defined or |
+| | removed. A request with a state of |
+| | absent will first be checked to |
+| | ensure the rbd is not mapped to any |
+| | client. If the RBD is unallocated, |
+| | it will be removed from the iSCSI |
+| | gateway and deleted from the |
+| | configuration. |
++--------------------------------------+--------------------------------------+
+| ``client_connections`` | This section defines the iSCSI |
+| | client connection details together |
+| | with the LUN (RBD image) masking. |
+| | Currently only CHAP is supported as |
+| | an authentication mechanism. Each |
+| | connection defines an ``image_list`` |
+| | which is a comma separated list of |
+| | the form |
+| | ``pool.rbd_image[,pool.rbd_image]``. |
+| | RBD images can be added and removed |
+| | from this list, to change the client |
+| | masking. Note that there are no |
+| | checks done to limit RBD sharing |
+| | across client connections. |
++--------------------------------------+--------------------------------------+
+
+.. note::
+ When using the ``gateway_iqn`` variable, and for Red Hat Enterprise Linux
+ clients, installing the ``iscsi-initiator-utils`` package is required for
+ retrieving the gateway’s IQN name. The iSCSI initiator name is located in the
+ ``/etc/iscsi/initiatorname.iscsi`` file.
+
+**Deploying:**
+
+On the Ansible installer node, perform the following steps.
+
+#. As ``root``, execute the Ansible playbook:
+
+ ::
+
+ # cd /usr/share/ceph-ansible
+ # ansible-playbook ceph-iscsi-gw.yml
+
+ .. note::
+ The Ansible playbook will handle RPM dependencies, RBD creation
+ and Linux IO configuration.
+
+#. Verify the configuration from an iSCSI gateway node:
+
+ ::
+
+ # gwcli ls
+
+ .. note::
+ For more information on using the ``gwcli`` command to install and configure
+ a Ceph iSCSI gateway, see the `Configuring the iSCSI Target using the Command Line Interface`_
+ section.
+
+ .. important::
+ Attempting to use the ``targetcli`` tool to change the configuration will
+ result in issues such as ALUA misconfiguration and path failover
+ problems. There is also the potential to corrupt data, to have mismatched
+ configuration across iSCSI gateways, and to have mismatched WWN information,
+ which will lead to client multipath problems.
+
+**Service Management:**
+
+The ``ceph-iscsi-config`` package installs the configuration management
+logic and a Systemd service called ``rbd-target-gw``. When the Systemd
+service is enabled, the ``rbd-target-gw`` will start at boot time and
+will restore the Linux IO state. The Ansible playbook disables the
+target service during the deployment. Below are the outcomes of
+interacting with the ``rbd-target-gw`` Systemd service.
+
+::
+
+ # systemctl <start|stop|restart|reload> rbd-target-gw
+
+- ``reload``
+
+ A reload request will force ``rbd-target-gw`` to reread the
+ configuration and apply it to the current running environment. This
+ is normally not required, since changes are deployed in parallel from
+ Ansible to all iSCSI gateway nodes.
+
+- ``stop``
+
+ A stop request will close the gateway’s portal interfaces, dropping
+ connections to clients and wipe the current LIO configuration from
+ the kernel. This returns the iSCSI gateway to a clean state. When
+ clients are disconnected, active I/O is rescheduled to the other
+ iSCSI gateways by the client side multipathing layer.
+
+**Administration:**
+
+Within the ``/usr/share/ceph-ansible/group_vars/ceph-iscsi-gw`` file
+there are a number of operational workflows that the Ansible playbook
+supports.
+
+.. warning::
+ Before removing RBD images from the iSCSI gateway configuration,
+ follow the standard procedures for removing a storage device from
+ the operating system.
+
++--------------------------------------+--------------------------------------+
+| I want to…​ | Update the ``ceph-iscsi-gw`` file |
+| | by…​ |
++======================================+======================================+
+| Add more RBD images | Adding another entry to the |
+| | ``rbd_devices`` section with the new |
+| | image. |
++--------------------------------------+--------------------------------------+
+| Resize an existing RBD image | Updating the size parameter within |
+| | the ``rbd_devices`` section. Client |
+| | side actions are required to pick up |
+| | the new size of the disk. |
++--------------------------------------+--------------------------------------+
+| Add a client | Adding an entry to the |
+| | ``client_connections`` section. |
++--------------------------------------+--------------------------------------+
+| Add another RBD to a client | Adding the relevant RBD |
+| | ``pool.image`` name to the |
+| | ``image_list`` variable for the |
+| | client. |
++--------------------------------------+--------------------------------------+
+| Remove an RBD from a client | Removing the RBD ``pool.image`` name |
+| | from the clients ``image_list`` |
+| | variable. |
++--------------------------------------+--------------------------------------+
+| Remove an RBD from the system | Changing the RBD entry state |
+| | variable to ``absent``. The RBD |
+| | image must be unallocated from the |
+| | operating system first for this to |
+| | succeed. |
++--------------------------------------+--------------------------------------+
+| Change the clients CHAP credentials | Updating the relevant CHAP details |
+| | in ``client_connections``. This will |
+| | need to be coordinated with the |
+| | clients. For example, the client |
+| | issues an iSCSI logout, the |
+| | credentials are changed by the |
+| | Ansible playbook, the credentials |
+| | are changed at the client, then the |
+| | client performs an iSCSI login. |
++--------------------------------------+--------------------------------------+
+| Remove a client | Updating the relevant |
+| | ``client_connections`` item with a |
+| | state of ``absent``. Once the |
+| | Ansible playbook is ran, the client |
+| | will be purged from the system, but |
+| | the disks will remain defined to |
+| | Linux IO for potential reuse. |
++--------------------------------------+--------------------------------------+
+
+Once a change has been made, rerun the Ansible playbook to apply the
+change across the iSCSI gateway nodes.
+
+::
+
+ # ansible-playbook ceph-iscsi-gw.yml
+
+**Removing the Configuration:**
+
+The ``ceph-ansible`` package provides an Ansible playbook to
+remove the iSCSI gateway configuration and related RBD images. The
+Ansible playbook is ``/usr/share/ceph-ansible/purge_gateways.yml``. When
+this Ansible playbook is run, it prompts for the type of purge to
+perform:
+
+*lio* :
+
+In this mode the LIO configuration is purged on all iSCSI gateways that
+are defined. Disks that were created are left untouched within the Ceph
+storage cluster.
+
+*all* :
+
+When ``all`` is chosen, the LIO configuration is removed together with
+**all** RBD images that were defined within the iSCSI gateway
+environment; other unrelated RBD images will not be removed. Ensure the
+correct mode is chosen, because this operation will delete data.
+
+.. warning::
+ A purge operation is a destructive action against your iSCSI gateway
+ environment.
+
+.. warning::
+ A purge operation will fail if RBD images have snapshots or clones
+ and are exported through the Ceph iSCSI gateway.
+
+::
+
+ [root@rh7-iscsi-client ceph-ansible]# ansible-playbook purge_gateways.yml
+ Which configuration elements should be purged? (all, lio or abort) [abort]: all
+
+
+ PLAY [Confirm removal of the iSCSI gateway configuration] *********************
+
+
+ GATHERING FACTS ***************************************************************
+ ok: [localhost]
+
+
+ TASK: [Exit playbook if user aborted the purge] *******************************
+ skipping: [localhost]
+
+
+ TASK: [set_fact ] *************************************************************
+ ok: [localhost]
+
+
+ PLAY [Removing the gateway configuration] *************************************
+
+
+ GATHERING FACTS ***************************************************************
+ ok: [ceph-igw-1]
+ ok: [ceph-igw-2]
+
+
+ TASK: [igw_purge | purging the gateway configuration] *************************
+ changed: [ceph-igw-1]
+ changed: [ceph-igw-2]
+
+
+ TASK: [igw_purge | deleting configured rbd devices] ***************************
+ changed: [ceph-igw-1]
+ changed: [ceph-igw-2]
+
+
+ PLAY RECAP ********************************************************************
+ ceph-igw-1 : ok=3 changed=2 unreachable=0 failed=0
+ ceph-igw-2 : ok=3 changed=2 unreachable=0 failed=0
+ localhost : ok=2 changed=0 unreachable=0 failed=0
+
+
+.. _Configuring the iSCSI Target using the Command Line Interface: ../iscsi-target-cli
diff --git a/src/ceph/doc/rbd/iscsi-target-cli.rst b/src/ceph/doc/rbd/iscsi-target-cli.rst
new file mode 100644
index 0000000..6da6f10
--- /dev/null
+++ b/src/ceph/doc/rbd/iscsi-target-cli.rst
@@ -0,0 +1,163 @@
+=============================================================
+Configuring the iSCSI Target using the Command Line Interface
+=============================================================
+
+The Ceph iSCSI gateway is the iSCSI target node and also a Ceph client
+node. The Ceph iSCSI gateway can be a standalone node or be colocated on
+a Ceph Object Store Disk (OSD) node. Completing the following steps will
+install and configure the Ceph iSCSI gateway for basic operation.
+
+**Requirements:**
+
+- A running Ceph Luminous or later storage cluster
+
+- RHEL/CentOS 7.4; or Linux kernel v4.14 or newer
+
+- The following packages must be installed from your Linux distribution's software repository:
+
+ - ``targetcli-2.1.fb47`` or newer package
+
+ - ``python-rtslib-2.1.fb64`` or newer package
+
+ - ``tcmu-runner-1.3.0`` or newer package
+
+ - ``ceph-iscsi-config-2.3`` or newer package
+
+ - ``ceph-iscsi-cli-2.5`` or newer package
+
+ .. important::
+ If previous versions of these packages exist, then they must
+ be removed before installing the newer versions.
+
+Do the following steps on the Ceph iSCSI gateway node before proceeding
+to the *Installing* section:
+
+#. If the Ceph iSCSI gateway is not colocated on an OSD node, then copy
+ the Ceph configuration files, located in ``/etc/ceph/``, from a
+ running Ceph node in the storage cluster to the iSCSI Gateway node.
+ The Ceph configuration files must exist on the iSCSI gateway node
+ under ``/etc/ceph/``.
+
+#. Install and configure the `Ceph Command-line
+ Interface <http://docs.ceph.com/docs/master/start/quick-rbd/#install-ceph>`_
+
+#. If needed, open TCP ports 3260 and 5000 on the firewall.
+
+#. Create a new RADOS Block Device (RBD) image, or use an existing one (a sketch for creating one with the Python binding follows this list).
+
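+A minimal sketch for creating such an image with the ``rbd`` Python binding
+(the pool name ``rbd``, image name ``disk_1``, and 10 GiB size are only
+placeholders)::
+
+    import rados
+    import rbd
+
+    with rados.Rados(conffile='/etc/ceph/ceph.conf') as cluster:
+        with cluster.open_ioctx('rbd') as ioctx:
+            # Create a 10 GiB image that gwcli can later export as a LUN.
+            rbd.RBD().create(ioctx, 'disk_1', 10 * 1024**3)
+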
+**Installing:**
+
+#. As ``root``, on all iSCSI gateway nodes, install the
+ ``ceph-iscsi-cli`` package:
+
+ ::
+
+ # yum install ceph-iscsi-cli
+
+#. As ``root``, on all iSCSI gateway nodes, install the ``tcmu-runner``
+ package:
+
+ ::
+
+ # yum install tcmu-runner
+
+#. As ``root``, on an iSCSI gateway node, create a file named
+ ``iscsi-gateway.cfg`` in the ``/etc/ceph/`` directory:
+
+ ::
+
+ # touch /etc/ceph/iscsi-gateway.cfg
+
+ #. Edit the ``iscsi-gateway.cfg`` file and add the following lines:
+
+ ::
+
+ [config]
+ # Name of the Ceph storage cluster. A suitable Ceph configuration file allowing
+ # access to the Ceph storage cluster from the gateway node is required, if not
+ # colocated on an OSD node.
+ cluster_name = ceph
+
+ # Place a copy of the ceph cluster's admin keyring in the gateway's /etc/ceph
+ # directory and reference the filename here
+ gateway_keyring = ceph.client.admin.keyring
+
+
+ # API settings.
+ # The API supports a number of options that allow you to tailor it to your
+ # local environment. If you want to run the API under https, you will need to
+ # create cert/key files that are compatible with each iSCSI gateway node, that is,
+ # not locked to a specific node. SSL cert and key files *must* be called
+ # 'iscsi-gateway.crt' and 'iscsi-gateway.key' and placed in the '/etc/ceph/' directory
+ # on *each* gateway node. With the SSL files in place, you can use 'api_secure = true'
+ # to switch to https mode.
+
+ # To support the API, the bare minimum settings are:
+ api_secure = false
+
+ # Additional API configuration options are as follows, defaults shown.
+ # api_user = admin
+ # api_password = admin
+ # api_port = 5001
+ # trusted_ip_list = 192.168.0.10,192.168.0.11
+
+ .. important::
+ The ``iscsi-gateway.cfg`` file must be identical on all iSCSI gateway nodes.
+
+ #. As ``root``, copy the ``iscsi-gateway.cfg`` file to all iSCSI
+ gateway nodes.
+
+#. As ``root``, on all iSCSI gateway nodes, enable and start the API
+ service:
+
+ ::
+
+ # systemctl enable rbd-target-api
+ # systemctl start rbd-target-api
+
+**Configuring:**
+
+#. As ``root``, on an iSCSI gateway node, start the iSCSI gateway
+ command-line interface:
+
+ ::
+
+ # gwcli
+
+#. Creating the iSCSI gateways:
+
+ ::
+
+ >/iscsi-target create iqn.2003-01.com.redhat.iscsi-gw:<target_name>
+ > goto gateways
+ > create <iscsi_gw_name> <IP_addr_of_gw>
+ > create <iscsi_gw_name> <IP_addr_of_gw>
+
+#. Adding a RADOS Block Device (RBD):
+
+ ::
+
+ > cd /iscsi-target/iqn.2003-01.com.redhat.iscsi-gw:<target_name>/disks/
+ >/disks/ create pool=<pool_name> image=<image_name> size=<image_size>m|g|t
+
+#. Creating a client:
+
+ ::
+
+ > goto hosts
+ > create iqn.1994-05.com.redhat:<client_name>
+ > auth chap=<user_name>/<password> | nochap
+
+
+ .. warning::
+ CHAP must always be configured. Without CHAP, the target will
+ reject any login requests.
+
+#. Adding disks to a client:
+
+ ::
+
+ >/iscsi-target..eph-igw/hosts> cd iqn.1994-05.com.redhat:<client_name>
+ > disk add <pool_name>.<image_name>
+
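+Before configuring the initiators, you may want to confirm that the iSCSI
+portal (TCP 3260) and the gateway API port opened earlier (TCP 5000) are
+reachable on every gateway. A minimal sketch using only the Python standard
+library; the host names are placeholders for your gateway nodes::
+
+    import socket
+
+    for host in ('ceph-igw-1', 'ceph-igw-2'):   # placeholder gateway names
+        for port in (3260, 5000):               # iSCSI portal, gateway API
+            with socket.create_connection((host, port), timeout=5):
+                print('{0}:{1} is reachable'.format(host, port))
+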
+The next step is to configure the iSCSI initiators.
diff --git a/src/ceph/doc/rbd/iscsi-targets.rst b/src/ceph/doc/rbd/iscsi-targets.rst
new file mode 100644
index 0000000..b7dcac7
--- /dev/null
+++ b/src/ceph/doc/rbd/iscsi-targets.rst
@@ -0,0 +1,27 @@
+=============
+iSCSI Targets
+=============
+
+Traditionally, block-level access to a Ceph storage cluster has been
+limited to QEMU and ``librbd``, which is a key enabler for adoption
+within OpenStack environments. Starting with the Ceph Luminous release,
+block-level access is expanding to offer standard iSCSI support allowing
+wider platform usage, and potentially opening new use cases.
+
+- RHEL/CentOS 7.4; or Linux kernel v4.14 or newer
+
+- A working Ceph Storage cluster, deployed with ``ceph-ansible`` or using the command-line interface
+
+- iSCSI gateways nodes, which can either be colocated with OSD nodes or on dedicated nodes
+
+- Separate network subnets for iSCSI front-end traffic and Ceph back-end traffic
+
+Ansible and the command-line interface are the two available deployment
+methods for installing and configuring the Ceph iSCSI gateway:
+
+.. toctree::
+ :maxdepth: 1
+
+ Using Ansible <iscsi-target-ansible>
+ Using the Command Line Interface <iscsi-target-cli>
diff --git a/src/ceph/doc/rbd/libvirt.rst b/src/ceph/doc/rbd/libvirt.rst
new file mode 100644
index 0000000..f953b1f
--- /dev/null
+++ b/src/ceph/doc/rbd/libvirt.rst
@@ -0,0 +1,319 @@
+=================================
+ Using libvirt with Ceph RBD
+=================================
+
+.. index:: Ceph Block Device; libvirt
+
+The ``libvirt`` library creates a virtual machine abstraction layer between
+hypervisor interfaces and the software applications that use them. With
+``libvirt``, developers and system administrators can focus on a common
+management framework, common API, and common shell interface (i.e., ``virsh``)
+to many different hypervisors, including:
+
+- QEMU/KVM
+- XEN
+- LXC
+- VirtualBox
+- etc.
+
+Ceph block devices support QEMU/KVM. You can use Ceph block devices with
+software that interfaces with ``libvirt``. The following stack diagram
+illustrates how ``libvirt`` and QEMU use Ceph block devices via ``librbd``.
+
+
+.. ditaa:: +---------------------------------------------------+
+ | libvirt |
+ +------------------------+--------------------------+
+ |
+ | configures
+ v
+ +---------------------------------------------------+
+ | QEMU |
+ +---------------------------------------------------+
+ | librbd |
+ +------------------------+-+------------------------+
+ | OSDs | | Monitors |
+ +------------------------+ +------------------------+
+
+
+The most common ``libvirt`` use case involves providing Ceph block devices to
+cloud solutions like OpenStack or CloudStack. The cloud solution uses
+``libvirt`` to interact with QEMU/KVM, and QEMU/KVM interacts with Ceph block
+devices via ``librbd``. See `Block Devices and OpenStack`_ and `Block Devices
+and CloudStack`_ for details. See `Installation`_ for installation details.
+
+You can also use Ceph block devices with ``libvirt``, ``virsh`` and the
+``libvirt`` API. See `libvirt Virtualization API`_ for details.
+
+
+To create VMs that use Ceph block devices, use the procedures in the following
+sections. In the following examples, we use ``libvirt-pool`` for the pool
+name, ``client.libvirt`` for the user name, and ``new-libvirt-image`` for the
+image name. You may use any value you like, but ensure you replace those values
+when executing commands in the subsequent procedures.
+
+
+Configuring Ceph
+================
+
+To configure Ceph for use with ``libvirt``, perform the following steps:
+
+#. `Create a pool`_. The following example uses the
+ pool name ``libvirt-pool`` with 128 placement groups. ::
+
+ ceph osd pool create libvirt-pool 128 128
+
+ Verify the pool exists. ::
+
+ ceph osd lspools
+
+#. Use the ``rbd`` tool to initialize the pool for use by RBD::
+
+ rbd pool init <pool-name>
+
+#. `Create a Ceph User`_ (or use ``client.admin`` for version 0.9.7 and
+ earlier). The following example uses the Ceph user name ``client.libvirt``
+ and references ``libvirt-pool``. ::
+
+ ceph auth get-or-create client.libvirt mon 'profile rbd' osd 'profile rbd pool=libvirt-pool'
+
+ Verify the name exists. ::
+
+ ceph auth ls
+
+ **NOTE**: ``libvirt`` will access Ceph using the ID ``libvirt``,
+ not the Ceph name ``client.libvirt``. See `User Management - User`_ and
+ `User Management - CLI`_ for a detailed explanation of the difference
+ between ID and name.
+
+#. Use QEMU to `create an image`_ in your RBD pool.
+ The following example uses the image name ``new-libvirt-image``
+ and references ``libvirt-pool``. ::
+
+ qemu-img create -f raw rbd:libvirt-pool/new-libvirt-image 2G
+
+ Verify the image exists. ::
+
+ rbd -p libvirt-pool ls
+
+ **NOTE:** You can also use `rbd create`_ to create an image, but we
+ recommend ensuring that QEMU is working properly.
+
+.. tip:: Optionally, if you wish to enable debug logs and the admin socket for
+ this client, you can add the following section to ``/etc/ceph/ceph.conf``::
+
+ [client.libvirt]
+ log file = /var/log/ceph/qemu-guest-$pid.log
+ admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
+
+ The ``client.libvirt`` section name should match the cephx user you created
+ above. If SELinux or AppArmor is enabled, note that this could prevent the
+ client process (qemu via libvirt) from writing the logs or admin socket to
+ the destination locations (``/var/log/ceph`` or ``/var/run/ceph``).
+
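+If you want to confirm that the new user can reach the pool before wiring it
+into ``libvirt``, a minimal sketch with the Python bindings (it assumes the
+``client.libvirt`` key is stored in a keyring readable by the client; note
+that, like ``libvirt``, the binding is given the ID rather than the full
+name)::
+
+    import rados
+    import rbd
+
+    # rados_id takes the ID ('libvirt'), not the full name ('client.libvirt').
+    with rados.Rados(conffile='/etc/ceph/ceph.conf', rados_id='libvirt') as cluster:
+        with cluster.open_ioctx('libvirt-pool') as ioctx:
+            print(rbd.RBD().list(ioctx))   # should include 'new-libvirt-image'
+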
+
+
+Preparing the VM Manager
+========================
+
+You may use ``libvirt`` without a VM manager, but you may find it simpler to
+create your first domain with ``virt-manager``.
+
+#. Install a virtual machine manager. See `KVM/VirtManager`_ for details. ::
+
+ sudo apt-get install virt-manager
+
+#. Download an OS image (if necessary).
+
+#. Launch the virtual machine manager. ::
+
+ sudo virt-manager
+
+
+
+Creating a VM
+=============
+
+To create a VM with ``virt-manager``, perform the following steps:
+
+#. Press the **Create New Virtual Machine** button.
+
+#. Name the new virtual machine domain. In this example, we
+ use the name ``libvirt-virtual-machine``. You may use any name you wish,
+ but ensure you replace ``libvirt-virtual-machine`` with the name you
+ choose in subsequent commandline and configuration examples. ::
+
+ libvirt-virtual-machine
+
+#. Import the image. ::
+
+ /path/to/image/recent-linux.img
+
+ **NOTE:** Import a recent image. Some older images may not rescan for
+ virtual devices properly.
+
+#. Configure and start the VM.
+
+#. You may use ``virsh list`` to verify the VM domain exists. ::
+
+ sudo virsh list
+
+#. Log in to the VM (root/root).
+
+#. Stop the VM before configuring it for use with Ceph.
+
+
+Configuring the VM
+==================
+
+When configuring the VM for use with Ceph, it is important to use ``virsh``
+where appropriate. Additionally, ``virsh`` commands often require root
+privileges (i.e., ``sudo``) and will not return appropriate results or notify
+you that root privileges are required. For a reference of ``virsh``
+commands, refer to `Virsh Command Reference`_.
+
+
+#. Open the configuration file with ``virsh edit``. ::
+
+ sudo virsh edit {vm-domain-name}
+
+ Under ``<devices>`` there should be a ``<disk>`` entry. ::
+
+ <devices>
+ <emulator>/usr/bin/kvm</emulator>
+ <disk type='file' device='disk'>
+ <driver name='qemu' type='raw'/>
+ <source file='/path/to/image/recent-linux.img'/>
+ <target dev='vda' bus='virtio'/>
+ <address type='drive' controller='0' bus='0' unit='0'/>
+ </disk>
+
+
+ Replace ``/path/to/image/recent-linux.img`` with the path to the OS image.
+ The minimum kernel for using the faster ``virtio`` bus is 2.6.25. See
+ `Virtio`_ for details.
+
+ **IMPORTANT:** Use ``sudo virsh edit`` instead of a text editor. If you edit
+ the configuration file under ``/etc/libvirt/qemu`` with a text editor,
+ ``libvirt`` may not recognize the change. If there is a discrepancy between
+ the contents of the XML file under ``/etc/libvirt/qemu`` and the result of
+ ``sudo virsh dumpxml {vm-domain-name}``, then your VM may not work
+ properly.
+
+
+#. Add the Ceph RBD image you created as a ``<disk>`` entry. ::
+
+ <disk type='network' device='disk'>
+ <source protocol='rbd' name='libvirt-pool/new-libvirt-image'>
+ <host name='{monitor-host}' port='6789'/>
+ </source>
+ <target dev='vda' bus='virtio'/>
+ </disk>
+
+ Replace ``{monitor-host}`` with the name of your host, and replace the
+ pool and/or image name as necessary. You may add multiple ``<host>``
+ entries for your Ceph monitors. The ``dev`` attribute is the logical
+ device name that will appear under the ``/dev`` directory of your
+ VM. The optional ``bus`` attribute indicates the type of disk device to
+ emulate. The valid settings are driver specific (e.g., "ide", "scsi",
+ "virtio", "xen", "usb" or "sata").
+
+ See `Disks`_ for details of the ``<disk>`` element, and its child elements
+ and attributes.
+
+#. Save the file.
+
+#. If your Ceph Storage Cluster has `Ceph Authentication`_ enabled (it does by
+ default), you must generate a secret. ::
+
+ cat > secret.xml <<EOF
+ <secret ephemeral='no' private='no'>
+ <usage type='ceph'>
+ <name>client.libvirt secret</name>
+ </usage>
+ </secret>
+ EOF
+
+#. Define the secret. ::
+
+ sudo virsh secret-define --file secret.xml
+ <uuid of secret is output here>
+
+#. Get the ``client.libvirt`` key and save the key string to a file. ::
+
+ ceph auth get-key client.libvirt | sudo tee client.libvirt.key
+
+#. Set the UUID of the secret. ::
+
+ sudo virsh secret-set-value --secret {uuid of secret} --base64 $(cat client.libvirt.key) && rm client.libvirt.key secret.xml
+
+ You must also set the secret manually by adding the following ``<auth>``
+ entry to the ``<disk>`` element you entered earlier (replacing the
+ ``uuid`` value with the result from the command line example above). ::
+
+ sudo virsh edit {vm-domain-name}
+
+ Then, add the ``<auth>`` element to the domain configuration file::
+
+ ...
+ </source>
+ <auth username='libvirt'>
+ <secret type='ceph' uuid='9ec59067-fdbc-a6c0-03ff-df165c0587b8'/>
+ </auth>
+ <target ...
+
+
+ **NOTE:** The ID used here is ``libvirt``, not the Ceph name
+ ``client.libvirt`` as generated at step 2 of `Configuring Ceph`_. Ensure
+ you use the ID component of the Ceph name you generated. If for some reason
+ you need to regenerate the secret, you will have to execute
+ ``sudo virsh secret-undefine {uuid}`` before executing
+ ``sudo virsh secret-set-value`` again.
+
+
+Summary
+=======
+
+Once you have configured the VM for use with Ceph, you can start the VM.
+To verify that the VM and Ceph are communicating, you may perform the
+following procedures.
+
+
+#. Check to see if Ceph is running::
+
+ ceph health
+
+#. Check to see if the VM is running. ::
+
+ sudo virsh list
+
+#. Check to see if the VM is communicating with Ceph. Replace
+ ``{vm-domain-name}`` with the name of your VM domain::
+
+ sudo virsh qemu-monitor-command --hmp {vm-domain-name} 'info block'
+
+#. Check to see if the device from ``<target dev='vda' bus='virtio'/>`` appears
+ under ``/dev`` or under ``/proc/partitions``. ::
+
+ ls /dev
+ cat /proc/partitions
+
+If everything looks okay, you may begin using the Ceph block device
+within your VM.
+
+
+.. _Installation: ../../install
+.. _libvirt Virtualization API: http://www.libvirt.org
+.. _Block Devices and OpenStack: ../rbd-openstack
+.. _Block Devices and CloudStack: ../rbd-cloudstack
+.. _Create a pool: ../../rados/operations/pools#create-a-pool
+.. _Create a Ceph User: ../../rados/operations/user-management#add-a-user
+.. _create an image: ../qemu-rbd#creating-images-with-qemu
+.. _Virsh Command Reference: http://www.libvirt.org/virshcmdref.html
+.. _KVM/VirtManager: https://help.ubuntu.com/community/KVM/VirtManager
+.. _Ceph Authentication: ../../rados/configuration/auth-config-ref
+.. _Disks: http://www.libvirt.org/formatdomain.html#elementsDisks
+.. _rbd create: ../rados-rbd-cmds#creating-a-block-device-image
+.. _User Management - User: ../../rados/operations/user-management#user
+.. _User Management - CLI: ../../rados/operations/user-management#command-line-usage
+.. _Virtio: http://www.linux-kvm.org/page/Virtio
diff --git a/src/ceph/doc/rbd/man/index.rst b/src/ceph/doc/rbd/man/index.rst
new file mode 100644
index 0000000..33a192a
--- /dev/null
+++ b/src/ceph/doc/rbd/man/index.rst
@@ -0,0 +1,16 @@
+============================
+ Ceph Block Device Manpages
+============================
+
+.. toctree::
+ :maxdepth: 1
+
+ rbd <../../man/8/rbd>
+ rbd-fuse <../../man/8/rbd-fuse>
+ rbd-nbd <../../man/8/rbd-nbd>
+ rbd-ggate <../../man/8/rbd-ggate>
+ ceph-rbdnamer <../../man/8/ceph-rbdnamer>
+ rbd-replay-prep <../../man/8/rbd-replay-prep>
+ rbd-replay <../../man/8/rbd-replay>
+ rbd-replay-many <../../man/8/rbd-replay-many>
+ rbdmap <../../man/8/rbdmap>
diff --git a/src/ceph/doc/rbd/qemu-rbd.rst b/src/ceph/doc/rbd/qemu-rbd.rst
new file mode 100644
index 0000000..80c5dcc
--- /dev/null
+++ b/src/ceph/doc/rbd/qemu-rbd.rst
@@ -0,0 +1,218 @@
+========================
+ QEMU and Block Devices
+========================
+
+.. index:: Ceph Block Device; QEMU KVM
+
+The most frequent Ceph Block Device use case involves providing block device
+images to virtual machines. For example, a user may create a "golden" image
+with an OS and any relevant software in an ideal configuration. Then, the user
+takes a snapshot of the image. Finally, the user clones the snapshot (usually
+many times). See `Snapshots`_ for details. The ability to make copy-on-write
+clones of a snapshot means that Ceph can provision block device images to
+virtual machines quickly, because the client doesn't have to download an entire
+image each time it spins up a new virtual machine.
+
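+As a minimal sketch of that snapshot-and-clone workflow with the ``rbd``
+Python binding (the pool ``data`` and the image names are placeholders;
+cloning requires the layering feature and a protected snapshot)::
+
+    import rados
+    import rbd
+
+    with rados.Rados(conffile='/etc/ceph/ceph.conf') as cluster:
+        with cluster.open_ioctx('data') as ioctx:
+            with rbd.Image(ioctx, 'golden') as image:
+                image.create_snap('base')    # snapshot the golden image
+                image.protect_snap('base')   # clones require a protected snapshot
+            # Copy-on-write clone used as the disk of a new virtual machine.
+            rbd.RBD().clone(ioctx, 'golden', 'base', ioctx, 'vm-disk-1',
+                            features=rbd.RBD_FEATURE_LAYERING)
+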
+
+.. ditaa:: +---------------------------------------------------+
+ | QEMU |
+ +---------------------------------------------------+
+ | librbd |
+ +---------------------------------------------------+
+ | librados |
+ +------------------------+-+------------------------+
+ | OSDs | | Monitors |
+ +------------------------+ +------------------------+
+
+
+Ceph Block Devices can integrate with the QEMU virtual machine. For details on
+QEMU, see `QEMU Open Source Processor Emulator`_. For QEMU documentation, see
+`QEMU Manual`_. For installation details, see `Installation`_.
+
+.. important:: To use Ceph Block Devices with QEMU, you must have access to a
+ running Ceph cluster.
+
+
+Usage
+=====
+
+The QEMU command line expects you to specify the pool name and image name. You
+may also specify a snapshot name.
+
+QEMU will assume that the Ceph configuration file resides in the default
+location (e.g., ``/etc/ceph/$cluster.conf``) and that you are executing
+commands as the default ``client.admin`` user unless you expressly specify
+another Ceph configuration file path or another user. When specifying a user,
+QEMU uses the ``ID`` rather than the full ``TYPE:ID``. See `User Management -
+User`_ for details. Do not prepend the client type (i.e., ``client.``) to the
+beginning of the user ``ID``, or you will receive an authentication error. You
+should have the key for the ``admin`` user, or the key of another user you
+specify with the ``:id={user}`` option, in a keyring file stored in a default
+path (i.e., ``/etc/ceph`` or the local directory) with appropriate file
+ownership and permissions. Usage takes the following form::
+
+ qemu-img {command} [options] rbd:{pool-name}/{image-name}[@snapshot-name][:option1=value1][:option2=value2...]
+
+For example, specifying the ``id`` and ``conf`` options might look like the following::
+
+ qemu-img {command} [options] rbd:glance-pool/maipo:id=glance:conf=/etc/ceph/ceph.conf
+
+.. tip:: Configuration values containing ``:``, ``@``, or ``=`` can be escaped with a
+ leading ``\`` character.
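+
+For example, a monitor address passed via the ``mon_host`` option contains a
+colon that must be escaped (a sketch, assuming a hypothetical monitor
+listening at ``192.168.0.1:6789``)::
+
+    qemu-img info "rbd:data/foo:id=admin:mon_host=192.168.0.1\:6789"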
+
+
+Creating Images with QEMU
+=========================
+
+You can create a block device image from QEMU. You must specify ``rbd``, the
+pool name, and the name of the image you wish to create. You must also specify
+the size of the image. ::
+
+ qemu-img create -f raw rbd:{pool-name}/{image-name} {size}
+
+For example::
+
+ qemu-img create -f raw rbd:data/foo 10G
+
+.. important:: The ``raw`` data format is really the only sensible
+ ``format`` option to use with RBD. Technically, you could use other
+ QEMU-supported formats (such as ``qcow2`` or ``vmdk``), but doing
+ so would add additional overhead, and would also render the volume
+ unsafe for virtual machine live migration when caching (see below)
+ is enabled.
+
+
+Resizing Images with QEMU
+=========================
+
+You can resize a block device image from QEMU. You must specify ``rbd``,
+the pool name, and the name of the image you wish to resize. You must also
+specify the new size of the image. ::
+
+ qemu-img resize rbd:{pool-name}/{image-name} {size}
+
+For example::
+
+ qemu-img resize rbd:data/foo 10G
+
+
+Retrieving Image Info with QEMU
+===============================
+
+You can retrieve block device image information from QEMU. You must
+specify ``rbd``, the pool name, and the name of the image. ::
+
+ qemu-img info rbd:{pool-name}/{image-name}
+
+For example::
+
+ qemu-img info rbd:data/foo
+
+
+Running QEMU with RBD
+=====================
+
+QEMU can pass a block device from the host on to a guest, but since
+QEMU 0.15, there's no need to map an image as a block device on
+the host. Instead, QEMU can access an image as a virtual block
+device directly via ``librbd``. This performs better because it avoids
+an additional context switch, and can take advantage of `RBD caching`_.
+
+You can use ``qemu-img`` to convert existing virtual machine images to Ceph
+block device images. For example, if you have a qcow2 image, you could run::
+
+ qemu-img convert -f qcow2 -O raw debian_squeeze.qcow2 rbd:data/squeeze
+
+To run a virtual machine booting from that image, you could run::
+
+ qemu -m 1024 -drive format=raw,file=rbd:data/squeeze
+
+`RBD caching`_ can significantly improve performance.
+Since QEMU 1.2, QEMU's cache options control ``librbd`` caching::
+
+ qemu -m 1024 -drive format=raw,file=rbd:data/squeeze,cache=writeback
+
+If you have an older version of QEMU, you can set the ``librbd`` cache
+configuration (like any Ceph configuration option) as part of the
+'file' parameter::
+
+ qemu -m 1024 -drive format=raw,file=rbd:data/squeeze:rbd_cache=true,cache=writeback
+
+.. important:: If you set ``rbd_cache=true``, you must set ``cache=writeback``
+   or risk data loss. Without ``cache=writeback``, QEMU will not send
+   flush requests to ``librbd``. If QEMU exits uncleanly in this
+   configuration, filesystems on top of RBD can be corrupted.
+
+.. _RBD caching: ../rbd-config-ref/#rbd-cache-config-settings
+
+
+.. index:: Ceph Block Device; discard trim and libvirt
+
+Enabling Discard/TRIM
+=====================
+
+Since Ceph version 0.46 and QEMU version 1.1, Ceph Block Devices support the
+discard operation. This means that a guest can send TRIM requests to let a Ceph
+block device reclaim unused space. This can be enabled in the guest by mounting
+``ext4`` or ``XFS`` with the ``discard`` option.
+
+For this to be available to the guest, it must be explicitly enabled
+for the block device. To do this, you must specify a
+``discard_granularity`` associated with the drive::
+
+ qemu -m 1024 -drive format=raw,file=rbd:data/squeeze,id=drive1,if=none \
+ -device driver=ide-hd,drive=drive1,discard_granularity=512
+
+Note that this uses the IDE driver. The virtio driver does not
+support discard.
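+
+If your QEMU build provides the ``virtio-scsi`` controller, discard can also
+be passed through a ``scsi-hd`` device attached to it (a sketch; device
+availability and names depend on your QEMU version)::
+
+    qemu -m 1024 -drive format=raw,file=rbd:data/squeeze,id=drive1,if=none \
+         -device virtio-scsi-pci,id=scsi0 \
+         -device scsi-hd,drive=drive1,bus=scsi0.0,discard_granularity=512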
+
+If using libvirt, edit your libvirt domain's configuration file using ``virsh
+edit`` to include the ``xmlns:qemu`` value. Then, add a ``qemu:commandline``
+block as a child of that domain. The following example shows how to set two
+devices (identified by their QEMU ``id``) to different ``discard_granularity``
+values.
+
+.. code-block:: xml
+
+ <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
+ <qemu:commandline>
+ <qemu:arg value='-set'/>
+ <qemu:arg value='block.scsi0-0-0.discard_granularity=4096'/>
+ <qemu:arg value='-set'/>
+ <qemu:arg value='block.scsi0-0-1.discard_granularity=65536'/>
+ </qemu:commandline>
+ </domain>
+
+
+.. index:: Ceph Block Device; cache options
+
+QEMU Cache Options
+==================
+
+QEMU's cache options correspond to the following Ceph `RBD Cache`_ settings.
+
+Writeback::
+
+ rbd_cache = true
+
+Writethrough::
+
+ rbd_cache = true
+ rbd_cache_max_dirty = 0
+
+None::
+
+ rbd_cache = false
+
+QEMU's cache settings override Ceph's cache settings (including settings that
+are explicitly set in the Ceph configuration file).
+
+.. note:: Prior to QEMU v2.4.0, if you explicitly set `RBD Cache`_ settings
+ in the Ceph configuration file, your Ceph settings override the QEMU cache
+ settings.
+
+.. _QEMU Open Source Processor Emulator: http://wiki.qemu.org/Main_Page
+.. _QEMU Manual: http://wiki.qemu.org/Manual
+.. _RBD Cache: ../rbd-config-ref/
+.. _Snapshots: ../rbd-snapshot/
+.. _Installation: ../../install
+.. _User Management - User: ../../rados/operations/user-management#user
diff --git a/src/ceph/doc/rbd/rados-rbd-cmds.rst b/src/ceph/doc/rbd/rados-rbd-cmds.rst
new file mode 100644
index 0000000..65f7737
--- /dev/null
+++ b/src/ceph/doc/rbd/rados-rbd-cmds.rst
@@ -0,0 +1,223 @@
+=======================
+ Block Device Commands
+=======================
+
+.. index:: Ceph Block Device; image management
+
+The ``rbd`` command enables you to create, list, introspect and remove block
+device images. You can also use it to clone images, create snapshots,
+roll back an image to a snapshot, view a snapshot, etc. See `RBD – Manage
+RADOS Block Device (RBD) Images`_ for details on using the ``rbd`` command.
+
+.. important:: To use Ceph Block Device commands, you must have access to
+ a running Ceph cluster.
+
+Create a Block Device Pool
+==========================
+
+#. On the admin node, use the ``ceph`` tool to `create a pool`_.
+
+#. On the admin node, use the ``rbd`` tool to initialize the pool for use by RBD::
+
+ rbd pool init <pool-name>
+
+.. note:: The ``rbd`` tool assumes a default pool name of 'rbd' when not
+ provided.
+
+Create a Block Device User
+==========================
+
+Unless specified, the ``rbd`` command will access the Ceph cluster using the ID
+``admin``. This ID allows full administrative access to the cluster. It is
+recommended that you utilize a more restricted user wherever possible.
+
+To `create a Ceph user`_, with ``ceph`` specify the ``auth get-or-create``
+command, user name, monitor caps, and OSD caps::
+
+ ceph auth get-or-create client.{ID} mon 'profile rbd' osd 'profile {profile name} [pool={pool-name}][, profile ...]'
+
+For example, to create a user ID named ``qemu`` with read-write access to the
+pool ``vms`` and read-only access to the pool ``images``, execute the
+following::
+
+ ceph auth get-or-create client.qemu mon 'profile rbd' osd 'profile rbd pool=vms, profile rbd-read-only pool=images'
+
+The output from the ``ceph auth get-or-create`` command will be the keyring for
+the specified user, which can be written to ``/etc/ceph/ceph.client.{ID}.keyring``.
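+
+For example, the keyring for the ``qemu`` user created above could be captured
+directly into its keyring file (a sketch)::
+
+    ceph auth get-or-create client.qemu | sudo tee /etc/ceph/ceph.client.qemu.keyring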
+
+.. note:: The user ID can be specified when using the ``rbd`` command by
+ providing the ``--id {id}`` optional argument.
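+
+For example, to list images in the ``vms`` pool as the ``qemu`` user created
+above (a brief usage sketch)::
+
+    rbd --id qemu ls vms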
+
+Creating a Block Device Image
+=============================
+
+Before you can add a block device to a node, you must create an image for it in
+the :term:`Ceph Storage Cluster` first. To create a block device image, execute
+the following::
+
+ rbd create --size {megabytes} {pool-name}/{image-name}
+
+For example, to create a 1GB image named ``bar`` that stores information in a
+pool named ``swimmingpool``, execute the following::
+
+ rbd create --size 1024 swimmingpool/bar
+
+If you don't specify a pool when creating an image, it will be stored in the
+default pool ``rbd``. For example, to create a 1GB image named ``foo`` stored in
+the default pool ``rbd``, execute the following::
+
+ rbd create --size 1024 foo
+
+.. note:: You must create a pool before you can specify it as a
+ source. See `Storage Pools`_ for details.
+
+Listing Block Device Images
+===========================
+
+To list block devices in the ``rbd`` pool, execute the following
+(i.e., ``rbd`` is the default pool name)::
+
+ rbd ls
+
+To list block devices in a particular pool, execute the following,
+but replace ``{poolname}`` with the name of the pool::
+
+ rbd ls {poolname}
+
+For example::
+
+ rbd ls swimmingpool
+
+To list deferred delete block devices in the ``rbd`` pool, execute the
+following::
+
+ rbd trash ls
+
+To list deferred delete block devices in a particular pool, execute the
+following, but replace ``{poolname}`` with the name of the pool::
+
+ rbd trash ls {poolname}
+
+For example::
+
+ rbd trash ls swimmingpool
+
+Retrieving Image Information
+============================
+
+To retrieve information from a particular image, execute the following,
+but replace ``{image-name}`` with the name for the image::
+
+ rbd info {image-name}
+
+For example::
+
+ rbd info foo
+
+To retrieve information from an image within a pool, execute the following,
+but replace ``{image-name}`` with the name of the image and replace ``{pool-name}``
+with the name of the pool::
+
+ rbd info {pool-name}/{image-name}
+
+For example::
+
+ rbd info swimmingpool/bar
+
+Resizing a Block Device Image
+=============================
+
+:term:`Ceph Block Device` images are thin provisioned. They don't actually use
+any physical storage until you begin saving data to them. However, they do have
+a maximum capacity that you set with the ``--size`` option. If you want to
+increase (or decrease) the maximum size of a Ceph Block Device image, execute
+the following::
+
+ rbd resize --size 2048 foo (to increase)
+ rbd resize --size 2048 foo --allow-shrink (to decrease)
+
+
+Removing a Block Device Image
+=============================
+
+To remove a block device, execute the following, but replace ``{image-name}``
+with the name of the image you want to remove::
+
+ rbd rm {image-name}
+
+For example::
+
+ rbd rm foo
+
+To remove a block device from a pool, execute the following, but replace
+``{image-name}`` with the name of the image to remove and replace
+``{pool-name}`` with the name of the pool::
+
+ rbd rm {pool-name}/{image-name}
+
+For example::
+
+ rbd rm swimmingpool/bar
+
+To defer delete a block device from a pool, execute the following, but
+replace ``{image-name}`` with the name of the image to move and replace
+``{pool-name}`` with the name of the pool::
+
+ rbd trash mv {pool-name}/{image-name}
+
+For example::
+
+ rbd trash mv swimmingpool/bar
+
+To remove a deferred block device from a pool, execute the following, but
+replace ``{image-id}`` with the id of the image to remove and replace
+``{pool-name}`` with the name of the pool::
+
+ rbd trash rm {pool-name}/{image-id}
+
+For example::
+
+ rbd trash rm swimmingpool/2bf4474b0dc51
+
+.. note::
+
+   * You can move an image to the trash even if it has snapshot(s) or is
+     actively in use by clones, but it cannot be removed from the trash in
+     that state.
+
+   * You can use *--delay* to set the deferment time (default is 0). If the
+     deferment time has not yet expired, the image cannot be removed unless
+     you use force.
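+
+For example, to defer deletion and protect the image from removal for two
+minutes (a sketch, assuming the delay is given in seconds)::
+
+    rbd trash mv swimmingpool/bar --delay 120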
+
+Restoring a Block Device Image
+==============================
+
+To restore a deferred delete block device in the rbd pool, execute the
+following, but replace ``{image-id}`` with the id of the image::
+
+ rbd trash restore {image-id}
+
+For example::
+
+ rbd trash restore 2bf4474b0dc51
+
+To restore a deferred delete block device in a particular pool, execute
+the following, but replace ``{image-id}`` with the id of the image and
+replace ``{pool-name}`` with the name of the pool::
+
+ rbd trash restore {pool-name}/{image-id}
+
+For example::
+
+ rbd trash restore swimmingpool/2bf4474b0dc51
+
+You can also use *--image* to rename the image when restoring it, for
+example::
+
+ rbd trash restore swimmingpool/2bf4474b0dc51 --image new-name
+
+
+.. _create a pool: ../../rados/operations/pools/#create-a-pool
+.. _Storage Pools: ../../rados/operations/pools
+.. _RBD – Manage RADOS Block Device (RBD) Images: ../../man/8/rbd/
+.. _create a Ceph user: ../../rados/operations/user-management#add-a-user
diff --git a/src/ceph/doc/rbd/rbd-cloudstack.rst b/src/ceph/doc/rbd/rbd-cloudstack.rst
new file mode 100644
index 0000000..f66d6d4
--- /dev/null
+++ b/src/ceph/doc/rbd/rbd-cloudstack.rst
@@ -0,0 +1,135 @@
+=============================
+ Block Devices and CloudStack
+=============================
+
+You may use Ceph Block Device images with CloudStack 4.0 and higher through
+``libvirt``, which configures the QEMU interface to ``librbd``. Ceph stripes
+block device images as objects across the cluster, which means that large Ceph
+Block Device images have better performance than a standalone server!
+
+To use Ceph Block Devices with CloudStack 4.0 and higher, you must install QEMU,
+``libvirt``, and CloudStack first. We recommend using a separate physical host
+for your CloudStack installation. CloudStack recommends a minimum of 4GB of RAM
+and a dual-core processor, but more CPU and RAM will perform better. The
+following diagram depicts the CloudStack/Ceph technology stack.
+
+
+.. ditaa:: +---------------------------------------------------+
+ | CloudStack |
+ +---------------------------------------------------+
+ | libvirt |
+ +------------------------+--------------------------+
+ |
+ | configures
+ v
+ +---------------------------------------------------+
+ | QEMU |
+ +---------------------------------------------------+
+ | librbd |
+ +---------------------------------------------------+
+ | librados |
+ +------------------------+-+------------------------+
+ | OSDs | | Monitors |
+ +------------------------+ +------------------------+
+
+.. important:: To use Ceph Block Devices with CloudStack, you must have
+ access to a running Ceph Storage Cluster.
+
+CloudStack integrates with Ceph's block devices to provide CloudStack with a
+back end for CloudStack's Primary Storage. The instructions below detail the
+setup for CloudStack Primary Storage.
+
+.. note:: We recommend installing with Ubuntu 14.04 or later so that
+ you can use package installation instead of having to compile
+ libvirt from source.
+
+Installing and configuring QEMU for use with CloudStack doesn't require any
+special handling. Ensure that you have a running Ceph Storage Cluster. Install
+QEMU and configure it for use with Ceph; then, install ``libvirt`` version
+0.9.13 or higher (you may need to compile from source) and ensure it is running
+with Ceph.
+
+
+.. note:: Ubuntu 14.04 and CentOS 7.2 will have ``libvirt`` with RBD storage
+ pool support enabled by default.
+
+.. index:: pools; CloudStack
+
+Create a Pool
+=============
+
+By default, Ceph block devices use the ``rbd`` pool. Create a pool for
+CloudStack Primary Storage. Ensure your Ceph cluster is running, then create
+the pool. ::
+
+ ceph osd pool create cloudstack
+
+See `Create a Pool`_ for details on specifying the number of placement groups
+for your pools, and `Placement Groups`_ for details on the number of placement
+groups you should set for your pools.
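+
+For example, you might create the pool with 128 placement groups (a sketch;
+choose a value appropriate for your cluster)::
+
+    ceph osd pool create cloudstack 128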
+
+A newly created pool must be initialized prior to use. Use the ``rbd`` tool
+to initialize the pool::
+
+ rbd pool init cloudstack
+
+Create a Ceph User
+==================
+
+To access the Ceph cluster, we need a Ceph user with the correct credentials
+for the ``cloudstack`` pool we just created. Although we could use
+``client.admin`` for this, it's recommended to create a user with access only
+to the ``cloudstack`` pool. ::
+
+ ceph auth get-or-create client.cloudstack mon 'profile rbd' osd 'profile rbd pool=cloudstack'
+
+Use the information returned by the command in the next step when adding the
+Primary Storage.
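+
+If you need to display the key again later (for example, while filling in the
+CloudStack UI), it can be retrieved as follows (a usage sketch)::
+
+    ceph auth get-key client.cloudstack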
+
+See `User Management`_ for additional details.
+
+Add Primary Storage
+===================
+
+To add primary storage, refer to `Add Primary Storage (4.2.0)`_. The steps to
+add a Ceph block device include:
+
+#. Log in to the CloudStack UI.
+#. Click **Infrastructure** on the left side navigation bar.
+#. Select the Zone you want to use for Primary Storage.
+#. Click the **Compute** tab.
+#. Select **View All** on the `Primary Storage` node in the diagram.
+#. Click **Add Primary Storage**.
+#. Follow the CloudStack instructions.
+
+ - For **Protocol**, select ``RBD``.
+ - Add cluster information (cephx is supported). Note: Do not include the ``client.`` part of the user.
+ - Add ``rbd`` as a tag.
+
+
+Create a Disk Offering
+======================
+
+To create a new disk offering, refer to `Create a New Disk Offering (4.2.0)`_.
+Create a disk offering so that it matches the ``rbd`` tag.
+The ``StoragePoolAllocator`` will choose the ``rbd``
+pool when searching for a suitable storage pool. If the disk offering doesn't
+match the ``rbd`` tag, the ``StoragePoolAllocator`` may select the pool you
+created (e.g., ``cloudstack``).
+
+
+Limitations
+===========
+
+- CloudStack will only bind to one monitor (You can however create a Round Robin DNS record over multiple monitors)
+
+
+
+.. _Create a Pool: ../../rados/operations/pools#createpool
+.. _Placement Groups: ../../rados/operations/placement-groups
+.. _Install and Configure QEMU: ../qemu-rbd
+.. _Install and Configure libvirt: ../libvirt
+.. _KVM Hypervisor Host Installation: http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.2.0/html/Installation_Guide/hypervisor-kvm-install-flow.html
+.. _Add Primary Storage (4.2.0): http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.2.0/html/Admin_Guide/primary-storage-add.html
+.. _Create a New Disk Offering (4.2.0): http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.2.0/html/Admin_Guide/compute-disk-service-offerings.html#creating-disk-offerings
+.. _User Management: ../../rados/operations/user-management
diff --git a/src/ceph/doc/rbd/rbd-config-ref.rst b/src/ceph/doc/rbd/rbd-config-ref.rst
new file mode 100644
index 0000000..db942f8
--- /dev/null
+++ b/src/ceph/doc/rbd/rbd-config-ref.rst
@@ -0,0 +1,136 @@
+=======================
+ librbd Settings
+=======================
+
+See `Block Device`_ for additional details.
+
+Cache Settings
+=======================
+
+.. sidebar:: Kernel Caching
+
+ The kernel driver for Ceph block devices can use the Linux page cache to
+ improve performance.
+
+The user space implementation of the Ceph block device (i.e., ``librbd``) cannot
+take advantage of the Linux page cache, so it includes its own in-memory
+caching, called "RBD caching." RBD caching behaves just like well-behaved hard
+disk caching. When the OS sends a barrier or a flush request, all dirty data is
+written to the OSDs. This means that using write-back caching is just as safe as
+using a well-behaved physical hard disk with a VM that properly sends flushes
+(i.e. Linux kernel >= 2.6.32). The cache uses a Least Recently Used (LRU)
+algorithm, and in write-back mode it can coalesce contiguous requests for
+better throughput.
+
+.. versionadded:: 0.46
+
+Ceph supports write-back caching for RBD. To enable it, add ``rbd cache =
+true`` to the ``[client]`` section of your ``ceph.conf`` file. By default
+``librbd`` does not perform any caching. Writes and reads go directly to the
+storage cluster, and writes return only when the data is on disk on all
+replicas. With caching enabled, writes return immediately, unless there are more
+than ``rbd cache max dirty`` unflushed bytes. In this case, the write triggers
+writeback and blocks until enough bytes are flushed.
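+
+A minimal ``ceph.conf`` fragment enabling write-back caching might look like
+this (a sketch; the other settings listed below keep their defaults)::
+
+    [client]
+        rbd cache = true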
+
+.. versionadded:: 0.47
+
+Ceph supports write-through caching for RBD. You can set the size of
+the cache, and you can set targets and limits to switch from
+write-back caching to write-through caching. To enable write-through
+mode, set ``rbd cache max dirty`` to 0. This means writes return only
+when the data is on disk on all replicas, but reads may come from the
+cache. The cache is in memory on the client, and each RBD image has
+its own. Since the cache is local to the client, there's no coherency
+if there are others accessing the image. Running GFS or OCFS on top of
+RBD will not work with caching enabled.
+
+The ``ceph.conf`` file settings for RBD should be set in the ``[client]``
+section of your configuration file. The settings include:
+
+
+``rbd cache``
+
+:Description: Enable caching for RADOS Block Device (RBD).
+:Type: Boolean
+:Required: No
+:Default: ``true``
+
+
+``rbd cache size``
+
+:Description: The RBD cache size in bytes.
+:Type: 64-bit Integer
+:Required: No
+:Default: ``32 MiB``
+
+
+``rbd cache max dirty``
+
+:Description: The ``dirty`` limit in bytes at which the cache triggers write-back. If ``0``, uses write-through caching.
+:Type: 64-bit Integer
+:Required: No
+:Constraint: Must be less than ``rbd cache size``.
+:Default: ``24 MiB``
+
+
+``rbd cache target dirty``
+
+:Description: The ``dirty target`` before the cache begins writing data to the data storage. Does not block writes to the cache.
+:Type: 64-bit Integer
+:Required: No
+:Constraint: Must be less than ``rbd cache max dirty``.
+:Default: ``16 MiB``
+
+
+``rbd cache max dirty age``
+
+:Description: The number of seconds dirty data is in the cache before writeback starts.
+:Type: Float
+:Required: No
+:Default: ``1.0``
+
+.. versionadded:: 0.60
+
+``rbd cache writethrough until flush``
+
+:Description: Start out in write-through mode, and switch to write-back after the first flush request is received. Enabling this is a conservative but safe setting in case VMs running on rbd are too old to send flushes, like the virtio driver in Linux before 2.6.32.
+:Type: Boolean
+:Required: No
+:Default: ``true``
+
+.. _Block Device: ../../rbd
+
+
+Read-ahead Settings
+=======================
+
+.. versionadded:: 0.86
+
+RBD supports read-ahead/prefetching to optimize small, sequential reads.
+This should normally be handled by the guest OS in the case of a VM,
+but boot loaders may not issue efficient reads.
+Read-ahead is automatically disabled if caching is disabled.
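+
+The read-ahead options below are likewise set in the ``[client]`` section of
+``ceph.conf``; for example, to raise the maximum read-ahead request size
+(a sketch using an arbitrary 4 MiB value)::
+
+    [client]
+        rbd readahead max bytes = 4194304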
+
+
+``rbd readahead trigger requests``
+
+:Description: Number of sequential read requests necessary to trigger read-ahead.
+:Type: Integer
+:Required: No
+:Default: ``10``
+
+
+``rbd readahead max bytes``
+
+:Description: Maximum size of a read-ahead request. If zero, read-ahead is disabled.
+:Type: 64-bit Integer
+:Required: No
+:Default: ``512 KiB``
+
+
+``rbd readahead disable after bytes``
+
+:Description: After this many bytes have been read from an RBD image, read-ahead is disabled for that image until it is closed. This allows the guest OS to take over read-ahead once it is booted. If zero, read-ahead stays enabled.
+:Type: 64-bit Integer
+:Required: No
+:Default: ``50 MiB``
diff --git a/src/ceph/doc/rbd/rbd-ko.rst b/src/ceph/doc/rbd/rbd-ko.rst
new file mode 100644
index 0000000..951757c
--- /dev/null
+++ b/src/ceph/doc/rbd/rbd-ko.rst
@@ -0,0 +1,59 @@
+==========================
+ Kernel Module Operations
+==========================
+
+.. index:: Ceph Block Device; kernel module
+
+.. important:: To use kernel module operations, you must have a running Ceph cluster.
+
+Get a List of Images
+====================
+
+To mount a block device image, first return a list of the images. ::
+
+ rbd list
+
+Map a Block Device
+==================
+
+Use ``rbd`` to map an image name to a kernel module. You must specify the
+image name, the pool name, and the user name. ``rbd`` will load the RBD kernel
+module on your behalf if it's not already loaded. ::
+
+ sudo rbd map {pool-name}/{image-name} --id {user-name}
+
+For example::
+
+ sudo rbd map rbd/myimage --id admin
+
+If you use `cephx`_ authentication, you must also specify a secret. It may come
+from a keyring or a file containing the secret. ::
+
+ sudo rbd map rbd/myimage --id admin --keyring /path/to/keyring
+ sudo rbd map rbd/myimage --id admin --keyfile /path/to/file
+
+
+Show Mapped Block Devices
+=========================
+
+To show block device images mapped to kernel modules with the ``rbd`` command,
+specify the ``showmapped`` option. ::
+
+ rbd showmapped
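+
+The output lists one mapped image per line, for example (an illustration;
+your IDs, image names and device paths will differ)::
+
+    id pool image   snap device
+    0  rbd  myimage -    /dev/rbd0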
+
+
+Unmapping a Block Device
+========================
+
+To unmap a block device image with the ``rbd`` command, specify the ``unmap``
+option and the device name (i.e., by convention the same as the block device
+image name). ::
+
+ sudo rbd unmap /dev/rbd/{poolname}/{imagename}
+
+For example::
+
+ sudo rbd unmap /dev/rbd/rbd/foo
+
+
+.. _cephx: ../../rados/operations/user-management/
diff --git a/src/ceph/doc/rbd/rbd-mirroring.rst b/src/ceph/doc/rbd/rbd-mirroring.rst
new file mode 100644
index 0000000..989f1fc
--- /dev/null
+++ b/src/ceph/doc/rbd/rbd-mirroring.rst
@@ -0,0 +1,318 @@
+===============
+ RBD Mirroring
+===============
+
+.. index:: Ceph Block Device; mirroring
+
+RBD images can be asynchronously mirrored between two Ceph clusters. This
+capability uses the RBD journaling image feature to ensure crash-consistent
+replication between clusters. Mirroring is configured on a per-pool basis
+within peer clusters and can be configured to automatically mirror all
+images within a pool or only a specific subset of images. Mirroring is
+configured using the ``rbd`` command. The ``rbd-mirror`` daemon is responsible
+for pulling image updates from the remote, peer cluster and applying them to
+the image within the local cluster.
+
+.. note:: RBD mirroring requires the Ceph Jewel release or later.
+
+.. important:: To use RBD mirroring, you must have two Ceph clusters, each
+ running the ``rbd-mirror`` daemon.
+
+Pool Configuration
+==================
+
+The following procedures demonstrate how to perform the basic administrative
+tasks to configure mirroring using the ``rbd`` command. Mirroring is
+configured on a per-pool basis within the Ceph clusters.
+
+The pool configuration steps should be performed on both peer clusters. These
+procedures assume two clusters, named "local" and "remote", are accessible from
+a single host for clarity.
+
+See the `rbd`_ manpage for additional details of how to connect to different
+Ceph clusters.
+
+.. note:: The cluster name in the following examples corresponds to a Ceph
+ configuration file of the same name (e.g. /etc/ceph/remote.conf). See the
+ `ceph-conf`_ documentation for how to configure multiple clusters.
+
+.. note:: Images in a given pool will be mirrored to a pool with the same name
+ on the remote cluster. Images using a separate data-pool will use a data-pool
+ with the same name on the remote cluster. E.g., if an image being mirrored is
+ in the ``rbd`` pool on the local cluster and using a data-pool called
+ ``rbd-ec``, pools called ``rbd`` and ``rbd-ec`` must exist on the remote
+ cluster and will be used for mirroring the image.
+
+Enable Mirroring
+----------------
+
+To enable mirroring on a pool with ``rbd``, specify the ``mirror pool enable``
+command, the pool name, and the mirroring mode::
+
+ rbd mirror pool enable {pool-name} {mode}
+
+The mirroring mode can either be ``pool`` or ``image``:
+
+* **pool**: When configured in ``pool`` mode, all images in the pool with the
+ journaling feature enabled are mirrored.
+* **image**: When configured in ``image`` mode, mirroring needs to be
+ `explicitly enabled`_ on each image.
+
+For example::
+
+ rbd --cluster local mirror pool enable image-pool pool
+ rbd --cluster remote mirror pool enable image-pool pool
+
+Disable Mirroring
+-----------------
+
+To disable mirroring on a pool with ``rbd``, specify the ``mirror pool disable``
+command and the pool name::
+
+ rbd mirror pool disable {pool-name}
+
+When mirroring is disabled on a pool in this way, mirroring will also be
+disabled on any images (within the pool) for which mirroring was enabled
+explicitly.
+
+For example::
+
+ rbd --cluster local mirror pool disable image-pool
+ rbd --cluster remote mirror pool disable image-pool
+
+Add Cluster Peer
+----------------
+
+In order for the ``rbd-mirror`` daemon to discover its peer cluster, the peer
+needs to be registered to the pool. To add a mirroring peer Ceph cluster with
+``rbd``, specify the ``mirror pool peer add`` command, the pool name, and a
+cluster specification::
+
+ rbd mirror pool peer add {pool-name} {client-name}@{cluster-name}
+
+For example::
+
+ rbd --cluster local mirror pool peer add image-pool client.remote@remote
+ rbd --cluster remote mirror pool peer add image-pool client.local@local
+
+Remove Cluster Peer
+-------------------
+
+To remove a mirroring peer Ceph cluster with ``rbd``, specify the
+``mirror pool peer remove`` command, the pool name, and the peer UUID
+(available from the ``rbd mirror pool info`` command)::
+
+ rbd mirror pool peer remove {pool-name} {peer-uuid}
+
+For example::
+
+ rbd --cluster local mirror pool peer remove image-pool 55672766-c02b-4729-8567-f13a66893445
+ rbd --cluster remote mirror pool peer remove image-pool 60c0e299-b38f-4234-91f6-eed0a367be08
+
+Image Configuration
+===================
+
+Unlike pool configuration, image configuration only needs to be performed against
+a single mirroring peer Ceph cluster.
+
+Mirrored RBD images are designated as either primary or non-primary. This is a
+property of the image and not the pool. Images that are designated as
+non-primary cannot be modified.
+
+Images are automatically promoted to primary when mirroring is first enabled on
+an image (either implicitly if the pool mirror mode was **pool** and the image
+has the journaling image feature enabled, or `explicitly enabled`_ by the
+``rbd`` command).
+
+Enable Image Journaling Support
+-------------------------------
+
+RBD mirroring uses the RBD journaling feature to ensure that the replicated
+image always remains crash-consistent. Before an image can be mirrored to
+a peer cluster, the journaling feature must be enabled. The feature can be
+enabled at image creation time by providing the
+``--image-feature exclusive-lock,journaling`` option to the ``rbd`` command.
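+
+For example, an image could be created with journaling already enabled
+(a sketch)::
+
+    rbd --cluster local create --size 1024 --image-feature exclusive-lock,journaling image-pool/image-1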
+
+Alternatively, the journaling feature can be dynamically enabled on
+pre-existing RBD images. To enable journaling with ``rbd``, specify
+the ``feature enable`` command, the pool and image name, and the feature name::
+
+ rbd feature enable {pool-name}/{image-name} {feature-name}
+
+For example::
+
+ rbd --cluster local feature enable image-pool/image-1 journaling
+
+.. note:: The journaling feature is dependent on the exclusive-lock feature. If
+ the exclusive-lock feature is not already enabled, it should be enabled prior
+ to enabling the journaling feature.
+
+.. tip:: You can enable journaling on all new images by default by adding
+ ``rbd default features = 125`` to your Ceph configuration file.
+
+Enable Image Mirroring
+----------------------
+
+If the mirroring is configured in ``image`` mode for the image's pool, then it
+is necessary to explicitly enable mirroring for each image within the pool.
+To enable mirroring for a specific image with ``rbd``, specify the
+``mirror image enable`` command along with the pool and image name::
+
+ rbd mirror image enable {pool-name}/{image-name}
+
+For example::
+
+ rbd --cluster local mirror image enable image-pool/image-1
+
+Disable Image Mirroring
+-----------------------
+
+To disable mirroring for a specific image with ``rbd``, specify the
+``mirror image disable`` command along with the pool and image name::
+
+ rbd mirror image disable {pool-name}/{image-name}
+
+For example::
+
+ rbd --cluster local mirror image disable image-pool/image-1
+
+Image Promotion and Demotion
+----------------------------
+
+In a failover scenario where the primary designation needs to be moved to the
+image in the peer Ceph cluster, stop access to the primary image (e.g. power
+down the VM or remove the associated drive from a VM), demote the current
+primary image, promote the new primary image, and resume access to the image
+on the alternate cluster.
+
+.. note:: RBD only provides the necessary tools to facilitate an orderly
+ failover of an image. An external mechanism is required to coordinate the
+ full failover process (e.g. closing the image before demotion).
+
+To demote a specific image to non-primary with ``rbd``, specify the
+``mirror image demote`` command along with the pool and image name::
+
+ rbd mirror image demote {pool-name}/{image-name}
+
+For example::
+
+ rbd --cluster local mirror image demote image-pool/image-1
+
+To demote all primary images within a pool to non-primary with ``rbd``, specify
+the ``mirror pool demote`` command along with the pool name::
+
+ rbd mirror pool demote {pool-name}
+
+For example::
+
+ rbd --cluster local mirror pool demote image-pool
+
+To promote a specific image to primary with ``rbd``, specify the
+``mirror image promote`` command along with the pool and image name::
+
+ rbd mirror image promote [--force] {pool-name}/{image-name}
+
+For example::
+
+ rbd --cluster remote mirror image promote image-pool/image-1
+
+To promote all non-primary images within a pool to primary with ``rbd``, specify
+the ``mirror pool promote`` command along with the pool name::
+
+ rbd mirror pool promote [--force] {pool-name}
+
+For example::
+
+ rbd --cluster local mirror pool promote image-pool
+
+.. tip:: Since the primary / non-primary status is per-image, it is possible to
+ have two clusters split the IO load and stage failover / failback.
+
+.. note:: Promotion can be forced using the ``--force`` option. Forced
+ promotion is needed when the demotion cannot be propagated to the peer
+ Ceph cluster (e.g. Ceph cluster failure, communication outage). This will
+ result in a split-brain scenario between the two peers and the image will no
+ longer be in-sync until a `force resync command`_ is issued.
+
+Force Image Resync
+------------------
+
+If a split-brain event is detected by the ``rbd-mirror`` daemon, it will not
+attempt to mirror the affected image until corrected. To resume mirroring for an
+image, first `demote the image`_ determined to be out-of-date and then request a
+resync to the primary image. To request an image resync with ``rbd``, specify the
+``mirror image resync`` command along with the pool and image name::
+
+ rbd mirror image resync {pool-name}/{image-name}
+
+For example::
+
+ rbd mirror image resync image-pool/image-1
+
+.. note:: The ``rbd`` command only flags the image as requiring a resync. The
+ local cluster's ``rbd-mirror`` daemon process is responsible for performing
+ the resync asynchronously.
+
+Mirror Status
+=============
+
+The peer cluster replication status is stored for every primary mirrored image.
+This status can be retrieved using the ``mirror image status`` and
+``mirror pool status`` commands.
+
+To request the mirror image status with ``rbd``, specify the
+``mirror image status`` command along with the pool and image name::
+
+ rbd mirror image status {pool-name}/{image-name}
+
+For example::
+
+ rbd mirror image status image-pool/image-1
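+
+For a healthy non-primary image the output resembles the following (an
+illustration only; identifiers, positions and timestamps will differ)::
+
+    image-1:
+      global_id:   <uuid>
+      state:       up+replaying
+      description: replaying, entries_behind_master=0
+      last_update: <timestamp>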
+
+To request the mirror pool summary status with ``rbd``, specify the
+``mirror pool status`` command along with the pool name::
+
+ rbd mirror pool status {pool-name}
+
+For example::
+
+ rbd mirror pool status image-pool
+
+.. note:: Adding ``--verbose`` option to the ``mirror pool status`` command will
+ additionally output status details for every mirroring image in the pool.
+
+rbd-mirror Daemon
+=================
+
+The two ``rbd-mirror`` daemons are responsible for watching image journals on the
+remote, peer cluster and replaying the journal events against the local
+cluster. The RBD image journaling feature records all modifications to the
+image in the order they occur. This ensures that a crash-consistent mirror of
+the remote image is available locally.
+
+The ``rbd-mirror`` daemon is available within the optional ``rbd-mirror``
+distribution package.
+
+.. important:: Each ``rbd-mirror`` daemon requires the ability to connect
+   to both clusters simultaneously.
+
+.. warning:: Pre-Luminous releases: only run a single ``rbd-mirror`` daemon per
+   Ceph cluster.
+
+Each ``rbd-mirror`` daemon should use a unique Ceph user ID. To
+`create a Ceph user`_, with ``ceph`` specify the ``auth get-or-create``
+command, user name, monitor caps, and OSD caps::
+
+ ceph auth get-or-create client.rbd-mirror.{unique id} mon 'profile rbd' osd 'profile rbd'
+
+The ``rbd-mirror`` daemon can be managed by ``systemd`` by specifying the user
+ID as the daemon instance::
+
+ systemctl enable ceph-rbd-mirror@rbd-mirror.{unique id}
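+
+After enabling the unit, it can be started the same way (assuming a
+systemd-managed installation)::
+
+    systemctl start ceph-rbd-mirror@rbd-mirror.{unique id}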
+
+.. _rbd: ../../man/8/rbd
+.. _ceph-conf: ../../rados/configuration/ceph-conf/#running-multiple-clusters
+.. _explicitly enabled: #enable-image-mirroring
+.. _force resync command: #force-image-resync
+.. _demote the image: #image-promotion-and-demotion
+.. _create a Ceph user: ../../rados/operations/user-management#add-a-user
+
diff --git a/src/ceph/doc/rbd/rbd-openstack.rst b/src/ceph/doc/rbd/rbd-openstack.rst
new file mode 100644
index 0000000..db52028
--- /dev/null
+++ b/src/ceph/doc/rbd/rbd-openstack.rst
@@ -0,0 +1,512 @@
+=============================
+ Block Devices and OpenStack
+=============================
+
+.. index:: Ceph Block Device; OpenStack
+
+You may use Ceph Block Device images with OpenStack through ``libvirt``, which
+configures the QEMU interface to ``librbd``. Ceph stripes block device images as
+objects across the cluster, which means that large Ceph Block Device images have
+better performance than a standalone server!
+
+To use Ceph Block Devices with OpenStack, you must install QEMU, ``libvirt``,
+and OpenStack first. We recommend using a separate physical node for your
+OpenStack installation. OpenStack recommends a minimum of 8GB of RAM and a
+quad-core processor. The following diagram depicts the OpenStack/Ceph
+technology stack.
+
+
+.. ditaa:: +---------------------------------------------------+
+ | OpenStack |
+ +---------------------------------------------------+
+ | libvirt |
+ +------------------------+--------------------------+
+ |
+ | configures
+ v
+ +---------------------------------------------------+
+ | QEMU |
+ +---------------------------------------------------+
+ | librbd |
+ +---------------------------------------------------+
+ | librados |
+ +------------------------+-+------------------------+
+ | OSDs | | Monitors |
+ +------------------------+ +------------------------+
+
+.. important:: To use Ceph Block Devices with OpenStack, you must have
+ access to a running Ceph Storage Cluster.
+
+Three parts of OpenStack integrate with Ceph's block devices:
+
+- **Images**: OpenStack Glance manages images for VMs. Images are immutable.
+ OpenStack treats images as binary blobs and downloads them accordingly.
+
+- **Volumes**: Volumes are block devices. OpenStack uses volumes to boot VMs,
+ or to attach volumes to running VMs. OpenStack manages volumes using
+ Cinder services.
+
+- **Guest Disks**: Guest disks are guest operating system disks. By default,
+ when you boot a virtual machine, its disk appears as a file on the filesystem
+ of the hypervisor (usually under ``/var/lib/nova/instances/<uuid>/``). Prior
+ to OpenStack Havana, the only way to boot a VM in Ceph was to use the
+ boot-from-volume functionality of Cinder. However, now it is possible to boot
+ every virtual machine inside Ceph directly without using Cinder, which is
+ advantageous because it allows you to perform maintenance operations easily
+ with the live-migration process. Additionally, if your hypervisor dies it is
+ also convenient to trigger ``nova evacuate`` and run the virtual machine
+ elsewhere almost seamlessly.
+
+You can use OpenStack Glance to store images in a Ceph Block Device, and you
+can use Cinder to boot a VM using a copy-on-write clone of an image.
+
+The instructions below detail the setup for Glance, Cinder and Nova, although
+they do not have to be used together. You may store images in Ceph block devices
+while running VMs using a local disk, or vice versa.
+
+.. important:: Ceph doesn’t support QCOW2 for hosting a virtual machine disk.
+ Thus if you want to boot virtual machines in Ceph (ephemeral backend or boot
+ from volume), the Glance image format must be ``RAW``.
+
+.. tip:: This document describes using Ceph Block Devices with OpenStack Havana.
+ For earlier versions of OpenStack see
+ `Block Devices and OpenStack (Dumpling)`_.
+
+.. index:: pools; OpenStack
+
+Create a Pool
+=============
+
+By default, Ceph block devices use the ``rbd`` pool. You may use any available
+pool. We recommend creating a pool for Cinder and a pool for Glance. Ensure
+your Ceph cluster is running, then create the pools. ::
+
+ ceph osd pool create volumes 128
+ ceph osd pool create images 128
+ ceph osd pool create backups 128
+ ceph osd pool create vms 128
+
+See `Create a Pool`_ for detail on specifying the number of placement groups for
+your pools, and `Placement Groups`_ for details on the number of placement
+groups you should set for your pools.
+
+Newly created pools must be initialized prior to use. Use the ``rbd`` tool
+to initialize the pools::
+
+ rbd pool init volumes
+ rbd pool init images
+ rbd pool init backups
+ rbd pool init vms
+
+.. _Create a Pool: ../../rados/operations/pools#createpool
+.. _Placement Groups: ../../rados/operations/placement-groups
+
+
+Configure OpenStack Ceph Clients
+================================
+
+The nodes running ``glance-api``, ``cinder-volume``, ``nova-compute`` and
+``cinder-backup`` act as Ceph clients. Each requires the ``ceph.conf`` file::
+
+ ssh {your-openstack-server} sudo tee /etc/ceph/ceph.conf </etc/ceph/ceph.conf
+
+
+Install Ceph client packages
+----------------------------
+
+On the ``glance-api`` node, you will need the Python bindings for ``librbd``::
+
+ sudo apt-get install python-rbd
+ sudo yum install python-rbd
+
+On the ``nova-compute``, ``cinder-backup`` and ``cinder-volume`` nodes,
+install both the Python bindings and the client command line tools::
+
+ sudo apt-get install ceph-common
+ sudo yum install ceph-common
+
+
+Setup Ceph Client Authentication
+--------------------------------
+
+If you have `cephx authentication`_ enabled, create a new user for Nova/Cinder
+and Glance. Execute the following::
+
+ ceph auth get-or-create client.glance mon 'profile rbd' osd 'profile rbd pool=images'
+ ceph auth get-or-create client.cinder mon 'profile rbd' osd 'profile rbd pool=volumes, profile rbd pool=vms, profile rbd pool=images'
+ ceph auth get-or-create client.cinder-backup mon 'profile rbd' osd 'profile rbd pool=backups'
+
+Add the keyrings for ``client.cinder``, ``client.glance``, and
+``client.cinder-backup`` to the appropriate nodes and change their ownership::
+
+ ceph auth get-or-create client.glance | ssh {your-glance-api-server} sudo tee /etc/ceph/ceph.client.glance.keyring
+ ssh {your-glance-api-server} sudo chown glance:glance /etc/ceph/ceph.client.glance.keyring
+ ceph auth get-or-create client.cinder | ssh {your-cinder-volume-server} sudo tee /etc/ceph/ceph.client.cinder.keyring
+ ssh {your-cinder-volume-server} sudo chown cinder:cinder /etc/ceph/ceph.client.cinder.keyring
+ ceph auth get-or-create client.cinder-backup | ssh {your-cinder-backup-server} sudo tee /etc/ceph/ceph.client.cinder-backup.keyring
+ ssh {your-cinder-backup-server} sudo chown cinder:cinder /etc/ceph/ceph.client.cinder-backup.keyring
+
+Nodes running ``nova-compute`` need the keyring file for the ``nova-compute``
+process::
+
+ ceph auth get-or-create client.cinder | ssh {your-nova-compute-server} sudo tee /etc/ceph/ceph.client.cinder.keyring
+
+They also need to store the secret key of the ``client.cinder`` user in
+``libvirt``. The libvirt process needs it to access the cluster while attaching
+a block device from Cinder.
+
+Create a temporary copy of the secret key on the nodes running
+``nova-compute``::
+
+ ceph auth get-key client.cinder | ssh {your-compute-node} tee client.cinder.key
+
+Then, on the compute nodes, add the secret key to ``libvirt`` and remove the
+temporary copy of the key::
+
+ uuidgen
+ 457eb676-33da-42ec-9a8c-9293d545c337
+
+ cat > secret.xml <<EOF
+ <secret ephemeral='no' private='no'>
+ <uuid>457eb676-33da-42ec-9a8c-9293d545c337</uuid>
+ <usage type='ceph'>
+ <name>client.cinder secret</name>
+ </usage>
+ </secret>
+ EOF
+ sudo virsh secret-define --file secret.xml
+ Secret 457eb676-33da-42ec-9a8c-9293d545c337 created
+ sudo virsh secret-set-value --secret 457eb676-33da-42ec-9a8c-9293d545c337 --base64 $(cat client.cinder.key) && rm client.cinder.key secret.xml
+
+Save the uuid of the secret for configuring ``nova-compute`` later.
+
+.. important:: You don't necessarily need the UUID on all the compute nodes.
+ However from a platform consistency perspective, it's better to keep the
+ same UUID.
+
+.. _cephx authentication: ../../rados/configuration/auth-config-ref/#enabling-disabling-cephx
+
+
+Configure OpenStack to use Ceph
+===============================
+
+Configuring Glance
+------------------
+
+Glance can use multiple back ends to store images. To use Ceph block devices by
+default, configure Glance as follows.
+
+Prior to Juno
+~~~~~~~~~~~~~~
+
+Edit ``/etc/glance/glance-api.conf`` and add under the ``[DEFAULT]`` section::
+
+ default_store = rbd
+ rbd_store_user = glance
+ rbd_store_pool = images
+ rbd_store_chunk_size = 8
+
+
+Juno
+~~~~
+
+Edit ``/etc/glance/glance-api.conf`` and add under the ``[glance_store]`` section::
+
+ [DEFAULT]
+ ...
+ default_store = rbd
+ ...
+ [glance_store]
+ stores = rbd
+ rbd_store_pool = images
+ rbd_store_user = glance
+ rbd_store_ceph_conf = /etc/ceph/ceph.conf
+ rbd_store_chunk_size = 8
+
+.. important:: Glance has not completely moved to 'store' yet.
+ So we still need to configure the store in the DEFAULT section until Kilo.
+
+Kilo and after
+~~~~~~~~~~~~~~
+
+Edit ``/etc/glance/glance-api.conf`` and add under the ``[glance_store]`` section::
+
+ [glance_store]
+ stores = rbd
+ default_store = rbd
+ rbd_store_pool = images
+ rbd_store_user = glance
+ rbd_store_ceph_conf = /etc/ceph/ceph.conf
+ rbd_store_chunk_size = 8
+
+For more information about the configuration options available in Glance please refer to the OpenStack Configuration Reference: http://docs.openstack.org/.
+
+Enable copy-on-write cloning of images
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Note that this exposes the back end location via Glance's API, so the endpoint
+with this option enabled should not be publicly accessible.
+
+Any OpenStack version except Mitaka
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If you want to enable copy-on-write cloning of images, also add under the ``[DEFAULT]`` section::
+
+ show_image_direct_url = True
+
+For Mitaka only
+^^^^^^^^^^^^^^^
+
+To enable image locations and take advantage of copy-on-write cloning for images, add under the ``[DEFAULT]`` section::
+
+ show_multiple_locations = True
+ show_image_direct_url = True
+
+Disable cache management (any OpenStack version)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Disable the Glance cache management to avoid images getting cached under ``/var/lib/glance/image-cache/``,
+assuming your configuration file has ``flavor = keystone+cachemanagement``::
+
+ [paste_deploy]
+ flavor = keystone
+
+Image properties
+~~~~~~~~~~~~~~~~
+
+We recommend using the following properties for your images (a sample command
+for setting them follows this list):
+
+- ``hw_scsi_model=virtio-scsi``: add the virtio-scsi controller and get better performance and support for discard operation
+- ``hw_disk_bus=scsi``: connect every Cinder block device to that controller
+- ``hw_qemu_guest_agent=yes``: enable the QEMU guest agent
+- ``os_require_quiesce=yes``: send fs-freeze/thaw calls through the QEMU guest agent
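+
+These properties can be applied with the ``openstack`` client, for example
+(a sketch, assuming the unified ``openstack`` CLI and a hypothetical image
+named ``golden-image``)::
+
+    openstack image set --property hw_scsi_model=virtio-scsi \
+        --property hw_disk_bus=scsi \
+        --property hw_qemu_guest_agent=yes \
+        --property os_require_quiesce=yes \
+        golden-image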
+
+
+Configuring Cinder
+------------------
+
+OpenStack requires a driver to interact with Ceph block devices. You must also
+specify the pool name for the block device. On your OpenStack node, edit
+``/etc/cinder/cinder.conf`` by adding::
+
+ [DEFAULT]
+ ...
+ enabled_backends = ceph
+ ...
+ [ceph]
+ volume_driver = cinder.volume.drivers.rbd.RBDDriver
+ volume_backend_name = ceph
+ rbd_pool = volumes
+ rbd_ceph_conf = /etc/ceph/ceph.conf
+ rbd_flatten_volume_from_snapshot = false
+ rbd_max_clone_depth = 5
+ rbd_store_chunk_size = 4
+ rados_connect_timeout = -1
+ glance_api_version = 2
+
+If you are using `cephx authentication`_, also configure the user and uuid of
+the secret you added to ``libvirt`` as documented earlier::
+
+ [ceph]
+ ...
+ rbd_user = cinder
+ rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337
+
+Note that if you are configuring multiple cinder back ends,
+``glance_api_version = 2`` must be in the ``[DEFAULT]`` section.
+
+
+Configuring Cinder Backup
+-------------------------
+
+OpenStack Cinder Backup requires a specific daemon, so don't forget to install it.
+On your Cinder Backup node, edit ``/etc/cinder/cinder.conf`` and add::
+
+ backup_driver = cinder.backup.drivers.ceph
+ backup_ceph_conf = /etc/ceph/ceph.conf
+ backup_ceph_user = cinder-backup
+ backup_ceph_chunk_size = 134217728
+ backup_ceph_pool = backups
+ backup_ceph_stripe_unit = 0
+ backup_ceph_stripe_count = 0
+ restore_discard_excess_bytes = true
+
+
+Configuring Nova to attach Ceph RBD block device
+------------------------------------------------
+
+In order to attach Cinder devices (either normal block or by issuing a boot
+from volume), you must tell Nova (and libvirt) which user and UUID to refer to
+when attaching the device. libvirt will refer to this user when connecting and
+authenticating with the Ceph cluster. ::
+
+ [libvirt]
+ ...
+ rbd_user = cinder
+ rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337
+
+These two flags are also used by the Nova ephemeral backend.
+
+
+Configuring Nova
+----------------
+
+In order to boot all the virtual machines directly into Ceph, you must
+configure the ephemeral backend for Nova.
+
+It is recommended to enable the RBD cache in your Ceph configuration file
+(enabled by default since Giant). Moreover, enabling the admin socket
+brings a lot of benefits while troubleshooting. Having one socket per virtual
+machine using a Ceph block device helps when investigating performance and/or
+misbehavior.
+
+This socket can be accessed like this::
+
+ ceph daemon /var/run/ceph/ceph-client.cinder.19195.32310016.asok help
+
+Now on every compute node, edit your Ceph configuration file::
+
+ [client]
+ rbd cache = true
+ rbd cache writethrough until flush = true
+ admin socket = /var/run/ceph/guests/$cluster-$type.$id.$pid.$cctid.asok
+ log file = /var/log/qemu/qemu-guest-$pid.log
+ rbd concurrent management ops = 20
+
+Configure the permissions of these paths::
+
+ mkdir -p /var/run/ceph/guests/ /var/log/qemu/
+ chown qemu:libvirtd /var/run/ceph/guests /var/log/qemu/
+
+Note that the user ``qemu`` and the group ``libvirtd`` can vary depending on
+your system. The provided example works for Red Hat based systems.
+
+.. tip:: If your virtual machine is already running, you can simply restart it
+   to get the socket.
+
+
+Havana and Icehouse
+~~~~~~~~~~~~~~~~~~~
+
+Havana and Icehouse require patches to implement copy-on-write cloning and fix
+bugs with image size and live migration of ephemeral disks on rbd. These are
+available in branches based on upstream Nova `stable/havana`_ and
+`stable/icehouse`_. Using them is not mandatory but **highly recommended** in
+order to take advantage of the copy-on-write clone functionality.
+
+On every Compute node, edit ``/etc/nova/nova.conf`` and add::
+
+ libvirt_images_type = rbd
+ libvirt_images_rbd_pool = vms
+ libvirt_images_rbd_ceph_conf = /etc/ceph/ceph.conf
+ disk_cachemodes="network=writeback"
+ rbd_user = cinder
+ rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337
+
+It is also a good practice to disable file injection. While booting an
+instance, Nova usually attempts to open the rootfs of the virtual machine.
+Then, Nova injects values such as password, ssh keys etc. directly into the
+filesystem. However, it is better to rely on the metadata service and
+``cloud-init``.
+
+On every Compute node, edit ``/etc/nova/nova.conf`` and add::
+
+ libvirt_inject_password = false
+ libvirt_inject_key = false
+ libvirt_inject_partition = -2
+
+To ensure a proper live-migration, use the following flags::
+
+ libvirt_live_migration_flag="VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST,VIR_MIGRATE_TUNNELLED"
+
+Juno
+~~~~
+
+In Juno, Ceph block device was moved under the ``[libvirt]`` section.
+On every Compute node, edit ``/etc/nova/nova.conf`` under the ``[libvirt]``
+section and add::
+
+ [libvirt]
+ images_type = rbd
+ images_rbd_pool = vms
+ images_rbd_ceph_conf = /etc/ceph/ceph.conf
+ rbd_user = cinder
+ rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337
+ disk_cachemodes="network=writeback"
+
+
+It is also a good practice to disable file injection. While booting an
+instance, Nova usually attempts to open the rootfs of the virtual machine.
+Then, Nova injects values such as password, ssh keys etc. directly into the
+filesystem. However, it is better to rely on the metadata service and
+``cloud-init``.
+
+On every Compute node, edit ``/etc/nova/nova.conf`` and add the following
+under the ``[libvirt]`` section::
+
+ inject_password = false
+ inject_key = false
+ inject_partition = -2
+
+To ensure a proper live-migration, use the following flags (under the ``[libvirt]`` section)::
+
+ live_migration_flag="VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST,VIR_MIGRATE_TUNNELLED"
+
+Kilo
+~~~~
+
+Enable discard support for the virtual machine's ephemeral root disk::
+
+ [libvirt]
+ ...
+ ...
+ hw_disk_discard = unmap # enable discard support (be careful of performance)
+
+
+Restart OpenStack
+=================
+
+To activate the Ceph block device driver and load the block device pool name
+into the configuration, you must restart OpenStack. Thus, for Debian based
+systems execute these commands on the appropriate nodes::
+
+ sudo glance-control api restart
+ sudo service nova-compute restart
+ sudo service cinder-volume restart
+ sudo service cinder-backup restart
+
+For Red Hat based systems execute::
+
+ sudo service openstack-glance-api restart
+ sudo service openstack-nova-compute restart
+ sudo service openstack-cinder-volume restart
+ sudo service openstack-cinder-backup restart
+
+Once OpenStack is up and running, you should be able to create a volume
+and boot from it.
+
+
+Booting from a Block Device
+===========================
+
+You can create a volume from an image using the Cinder command line tool::
+
+ cinder create --image-id {id of image} --display-name {name of volume} {size of volume}
+
+Note that the image must be in RAW format. You can use `qemu-img`_ to convert
+from one format to another. For example::
+
+ qemu-img convert -f {source-format} -O {output-format} {source-filename} {output-filename}
+ qemu-img convert -f qcow2 -O raw precise-cloudimg.img precise-cloudimg.raw
+
+When Glance and Cinder are both using Ceph block devices, the image is a
+copy-on-write clone, so it can create a new volume quickly. In the OpenStack
+dashboard, you can boot from that volume by performing the following steps:
+
+#. Launch a new instance.
+#. Choose the image associated to the copy-on-write clone.
+#. Select 'boot from volume'.
+#. Select the volume you created.
+
+.. _qemu-img: ../qemu-rbd/#running-qemu-with-rbd
+.. _Block Devices and OpenStack (Dumpling): http://docs.ceph.com/docs/dumpling/rbd/rbd-openstack
+.. _stable/havana: https://github.com/jdurgin/nova/tree/havana-ephemeral-rbd
+.. _stable/icehouse: https://github.com/angdraug/nova/tree/rbd-ephemeral-clone-stable-icehouse
diff --git a/src/ceph/doc/rbd/rbd-replay.rst b/src/ceph/doc/rbd/rbd-replay.rst
new file mode 100644
index 0000000..e1c96b2
--- /dev/null
+++ b/src/ceph/doc/rbd/rbd-replay.rst
@@ -0,0 +1,42 @@
+===================
+ RBD Replay
+===================
+
+.. index:: Ceph Block Device; RBD Replay
+
+RBD Replay is a set of tools for capturing and replaying RADOS Block Device
+(RBD) workloads. To capture an RBD workload, ``lttng-tools`` must be installed
+on the client, and ``librbd`` on the client must be the v0.87 (Giant) release
+or later. To replay an RBD workload, ``librbd`` on the client must be the Giant
+release or later.
+
+Capturing and replaying a workload takes three steps:
+
+#. Capture the trace. Make sure to capture ``pthread_id`` context::
+
+ mkdir -p traces
+ lttng create -o traces librbd
+ lttng enable-event -u 'librbd:*'
+ lttng add-context -u -t pthread_id
+ lttng start
+ # run RBD workload here
+ lttng stop
+
+#. Process the trace with `rbd-replay-prep`_::
+
+ rbd-replay-prep traces/ust/uid/*/* replay.bin
+
+#. Replay the trace with `rbd-replay`_. Use the ``--read-only`` option until
+   you know it's doing what you want::
+
+ rbd-replay --read-only replay.bin
+
+.. important:: ``rbd-replay`` will destroy data by default. Do not use it against
+   an image you wish to keep, unless you use the ``--read-only`` option.
+
+The replayed workload does not have to be against the same RBD image or even the
+same cluster as the captured workload. To account for differences, you may need
+to use the ``--pool`` and ``--map-image`` options of ``rbd-replay``.
+
+.. _rbd-replay: ../../man/8/rbd-replay
+.. _rbd-replay-prep: ../../man/8/rbd-replay-prep
diff --git a/src/ceph/doc/rbd/rbd-snapshot.rst b/src/ceph/doc/rbd/rbd-snapshot.rst
new file mode 100644
index 0000000..2e5af9f
--- /dev/null
+++ b/src/ceph/doc/rbd/rbd-snapshot.rst
@@ -0,0 +1,308 @@
+===========
+ Snapshots
+===========
+
+.. index:: Ceph Block Device; snapshots
+
+A snapshot is a read-only copy of the state of an image at a particular point in
+time. One of the advanced features of Ceph block devices is that you can create
+snapshots of the images to retain a history of an image's state. Ceph also
+supports snapshot layering, which allows you to clone images (e.g., a VM image)
+quickly and easily. Ceph supports block device snapshots using the ``rbd``
+command and many higher level interfaces, including `QEMU`_, `libvirt`_,
+`OpenStack`_ and `CloudStack`_.
+
+.. important:: To use RBD snapshots, you must have a running Ceph cluster.
+
+.. note:: If a snapshot is taken while `I/O` is still in progress in an image, the
+   snapshot might not capture the exact or latest data of the image, and the snapshot
+   may have to be cloned to a new image to be mountable. We therefore recommend stopping
+   `I/O` before taking a snapshot of an image. If the image contains a filesystem,
+   the filesystem must be in a consistent state before taking a snapshot. To stop
+   `I/O` you can use the `fsfreeze` command. See the `fsfreeze(8)` man page for more details.
+   For virtual machines, `qemu-guest-agent` can be used to automatically freeze
+   filesystems when creating a snapshot.
+
+.. ditaa:: +------------+ +-------------+
+ | {s} | | {s} c999 |
+ | Active |<-------*| Snapshot |
+ | Image | | of Image |
+ | (stop i/o) | | (read only) |
+ +------------+ +-------------+
+
+
+Cephx Notes
+===========
+
+When `cephx`_ is enabled (it is by default), you must specify a user name or ID
+and a path to the keyring containing the corresponding key for the user. See
+`User Management`_ for details. You may also set the ``CEPH_ARGS`` environment
+variable to avoid re-entering the following parameters. ::
+
+ rbd --id {user-ID} --keyring=/path/to/secret [commands]
+ rbd --name {username} --keyring=/path/to/secret [commands]
+
+For example::
+
+ rbd --id admin --keyring=/etc/ceph/ceph.keyring [commands]
+ rbd --name client.admin --keyring=/etc/ceph/ceph.keyring [commands]
+
+.. tip:: Add the user and secret to the ``CEPH_ARGS`` environment
+ variable so that you don't need to enter them each time.
+
+
+Snapshot Basics
+===============
+
+The following procedures demonstrate how to create, list, and remove
+snapshots using the ``rbd`` command on the command line.
+
+Create Snapshot
+---------------
+
+To create a snapshot with ``rbd``, specify the ``snap create`` option, the pool
+name and the image name. ::
+
+ rbd snap create {pool-name}/{image-name}@{snap-name}
+
+For example::
+
+ rbd snap create rbd/foo@snapname
+
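+The same snapshot can also be created through the `rbd` Python module. The
+following is a minimal sketch, assuming the ``rbd`` pool and ``foo`` image from
+the example above and a cluster configuration file at ``/etc/ceph/ceph.conf``::
+
+    import rados
+    import rbd
+
+    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
+    cluster.connect()
+    ioctx = cluster.open_ioctx('rbd')      # pool name
+
+    image = rbd.Image(ioctx, 'foo')        # image name
+    image.create_snap('snapname')          # same as 'rbd snap create rbd/foo@snapname'
+    image.close()
+
+    ioctx.close()
+    cluster.shutdown()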
+
+List Snapshots
+--------------
+
+To list snapshots of an image, specify the pool name and the image name. ::
+
+ rbd snap ls {pool-name}/{image-name}
+
+For example::
+
+ rbd snap ls rbd/foo
+
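+A minimal sketch of the same listing with the `rbd` Python module, continuing
+from the connection and ``ioctx`` opened in the sketch above::
+
+    image = rbd.Image(ioctx, 'foo')
+    for snap in image.list_snaps():        # each entry is a dict
+        print(snap['name'], snap['size'])
+    image.close()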
+
+Rollback Snapshot
+-----------------
+
+To roll back to a snapshot with ``rbd``, specify the ``snap rollback`` option, the
+pool name, the image name and the snap name. ::
+
+ rbd snap rollback {pool-name}/{image-name}@{snap-name}
+
+For example::
+
+ rbd snap rollback rbd/foo@snapname
+
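+A minimal sketch of the same rollback with the `rbd` Python module, continuing
+from the connection and ``ioctx`` opened in the earlier sketch::
+
+    image = rbd.Image(ioctx, 'foo')
+    image.rollback_to_snap('snapname')     # same as 'rbd snap rollback rbd/foo@snapname'
+    image.close()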
+
+.. note:: Rolling back an image to a snapshot means overwriting
+   the current version of the image with data from a snapshot. The
+   time it takes to execute a rollback increases with the size of the
+   image. It is **faster to clone** from a snapshot **than to roll back**
+   an image to a snapshot, and cloning is the preferred method of returning
+   to a pre-existing state.
+
+
+Delete a Snapshot
+-----------------
+
+To delete a snapshot with ``rbd``, specify the ``snap rm`` option, the pool
+name, the image name and the snap name. ::
+
+ rbd snap rm {pool-name}/{image-name}@{snap-name}
+
+For example::
+
+ rbd snap rm rbd/foo@snapname
+
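+A minimal sketch of the same deletion with the `rbd` Python module, continuing
+from the connection and ``ioctx`` opened in the earlier sketch::
+
+    image = rbd.Image(ioctx, 'foo')
+    image.remove_snap('snapname')          # same as 'rbd snap rm rbd/foo@snapname'
+    image.close()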
+
+.. note:: Ceph OSDs delete data asynchronously, so deleting a snapshot
+ doesn't free up the disk space immediately.
+
+Purge Snapshots
+---------------
+
+To delete all snapshots for an image with ``rbd``, specify the ``snap purge``
+option, the pool name and the image name. ::
+
+ rbd snap purge {pool-name}/{image-name}
+
+For example::
+
+ rbd snap purge rbd/foo
+
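+If you are scripting this with the `rbd` Python module, there may be no direct
+purge binding depending on your release; the same effect can be approximated by
+removing each snapshot in turn. A minimal sketch, continuing from the earlier
+connection (this only works for unprotected snapshots)::
+
+    image = rbd.Image(ioctx, 'foo')
+    for snap in image.list_snaps():
+        image.remove_snap(snap['name'])
+    image.close()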
+
+.. index:: Ceph Block Device; snapshot layering
+
+Layering
+========
+
+Ceph supports the ability to create many copy-on-write (COW) clones of a block
+device snapshot. Snapshot layering enables Ceph block device clients to create
+images very quickly. For example, you might create a block device image with a
+Linux VM written to it; then, snapshot the image, protect the snapshot, and
+create as many copy-on-write clones as you like. A snapshot is read-only,
+so cloning a snapshot simplifies semantics, making it possible to create
+clones rapidly.
+
+
+.. ditaa:: +-------------+ +-------------+
+ | {s} c999 | | {s} |
+ | Snapshot | Child refers | COW Clone |
+ | of Image |<------------*| of Snapshot |
+ | | to Parent | |
+ | (read only) | | (writable) |
+ +-------------+ +-------------+
+
+ Parent Child
+
+.. note:: The terms "parent" and "child" mean a Ceph block device snapshot (parent),
+ and the corresponding image cloned from the snapshot (child). These terms are
+ important for the command line usage below.
+
+Each cloned image (child) stores a reference to its parent image, which enables
+the cloned image to open the parent snapshot and read it.
+
+A COW clone of a snapshot behaves exactly like any other Ceph block device
+image. You can read from, write to, clone, and resize cloned images. There are
+no special restrictions with cloned images. However, the copy-on-write clone of
+a snapshot refers to the snapshot, so you **MUST** protect the snapshot before
+you clone it. The following diagram depicts the process.
+
+.. note:: Ceph only supports cloning for format 2 images (i.e., created with
+   ``rbd create --image-format 2``). The kernel client has supported cloned
+   images since kernel 3.10.
+
+Getting Started with Layering
+-----------------------------
+
+Ceph block device layering is a simple process. You must have an image, create
+a snapshot of the image, and protect the snapshot. Once you have performed
+these steps, you can begin cloning the snapshot.
+
+.. ditaa:: +----------------------------+ +-----------------------------+
+ | | | |
+ | Create Block Device Image |------->| Create a Snapshot |
+ | | | |
+ +----------------------------+ +-----------------------------+
+ |
+ +--------------------------------------+
+ |
+ v
+ +----------------------------+ +-----------------------------+
+ | | | |
+ | Protect the Snapshot |------->| Clone the Snapshot |
+ | | | |
+ +----------------------------+ +-----------------------------+
+
+
+The cloned image has a reference to the parent snapshot, and includes the pool
+ID, image ID and snapshot ID. The inclusion of the pool ID means that you may
+clone snapshots from one pool to images in another pool.
+
+
+#. **Image Template:** A common use case for block device layering is to create a
+   master image and a snapshot that serves as a template for clones. For example,
+ a user may create an image for a Linux distribution (e.g., Ubuntu 12.04), and
+ create a snapshot for it. Periodically, the user may update the image and create
+ a new snapshot (e.g., ``sudo apt-get update``, ``sudo apt-get upgrade``,
+ ``sudo apt-get dist-upgrade`` followed by ``rbd snap create``). As the image
+ matures, the user can clone any one of the snapshots.
+
+#. **Extended Template:** A more advanced use case includes extending a template
+ image that provides more information than a base image. For example, a user may
+ clone an image (e.g., a VM template) and install other software (e.g., a database,
+ a content management system, an analytics system, etc.) and then snapshot the
+ extended image, which itself may be updated just like the base image.
+
+#. **Template Pool:** One way to use block device layering is to create a
+ pool that contains master images that act as templates, and snapshots of those
+ templates. You may then extend read-only privileges to users so that they
+ may clone the snapshots without the ability to write or execute within the pool.
+
+#. **Image Migration/Recovery:** One way to use block device layering is to migrate
+ or recover data from one pool into another pool.
+
+Protecting a Snapshot
+---------------------
+
+Clones access the parent snapshot. All clones would break if a user inadvertently
+deleted the parent snapshot. To prevent data loss, you **MUST** protect the
+snapshot before you clone it. ::
+
+ rbd snap protect {pool-name}/{image-name}@{snapshot-name}
+
+For example::
+
+ rbd snap protect rbd/my-image@my-snapshot
+
+.. note:: You cannot delete a protected snapshot.
+
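+A minimal sketch of protecting the snapshot with the `rbd` Python module,
+assuming the ``rbd`` pool, ``my-image`` image and ``my-snapshot`` snapshot used
+above, and an ``ioctx`` opened as in the earlier sketches::
+
+    image = rbd.Image(ioctx, 'my-image')
+    image.protect_snap('my-snapshot')                 # same as 'rbd snap protect'
+    print(image.is_protected_snap('my-snapshot'))     # True once protected
+    image.close()
+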
+Cloning a Snapshot
+------------------
+
+To clone a snapshot, you need to specify the parent pool, image and snapshot,
+as well as the child pool and image name. You must protect the snapshot
+before you can clone it. ::
+
+ rbd clone {pool-name}/{parent-image}@{snap-name} {pool-name}/{child-image-name}
+
+For example::
+
+ rbd clone rbd/my-image@my-snapshot rbd/new-image
+
+.. note:: You may clone a snapshot from one pool to an image in another pool. For example,
+ you may maintain read-only images and snapshots as templates in one pool, and writeable
+ clones in another pool.
+
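+A minimal sketch of the same clone with the `rbd` Python module, assuming both
+parent and child live in the ``rbd`` pool and an ``ioctx`` opened as in the
+earlier sketches::
+
+    rbd_inst = rbd.RBD()
+    rbd_inst.clone(ioctx, 'my-image', 'my-snapshot',  # parent ioctx, image, snapshot
+                   ioctx, 'new-image',                # child ioctx, image name
+                   features=rbd.RBD_FEATURE_LAYERING) # layering is required for cloning
+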
+Unprotecting a Snapshot
+-----------------------
+
+Before you can delete a snapshot, you must first unprotect it. Additionally,
+you may *NOT* delete snapshots that are referenced by clones. You must
+flatten each clone of a snapshot before you can delete the snapshot. ::
+
+ rbd snap unprotect {pool-name}/{image-name}@{snapshot-name}
+
+For example::
+
+ rbd snap unprotect rbd/my-image@my-snapshot
+
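+A minimal sketch of the same operation with the `rbd` Python module, continuing
+from the earlier connection (the call fails while the snapshot still has clones)::
+
+    image = rbd.Image(ioctx, 'my-image')
+    image.unprotect_snap('my-snapshot')    # same as 'rbd snap unprotect'
+    image.close()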
+
+Listing Children of a Snapshot
+------------------------------
+
+To list the children of a snapshot, execute the following::
+
+ rbd children {pool-name}/{image-name}@{snapshot-name}
+
+For example::
+
+ rbd children rbd/my-image@my-snapshot
+
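+A minimal sketch of the same listing with the `rbd` Python module, opening the
+image at the snapshot and continuing from the earlier connection::
+
+    image = rbd.Image(ioctx, 'my-image', snapshot='my-snapshot')
+    print(image.list_children())           # list of (pool name, image name) tuples
+    image.close()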
+
+Flattening a Cloned Image
+-------------------------
+
+Cloned images retain a reference to the parent snapshot. When you remove the
+reference from the child clone to the parent snapshot, you effectively "flatten"
+the image by copying the information from the snapshot to the clone. The time
+it takes to flatten a clone increases with the size of the snapshot. To delete
+a snapshot, you must flatten the child images first. ::
+
+ rbd flatten {pool-name}/{image-name}
+
+For example::
+
+ rbd flatten rbd/my-image
+
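+A minimal sketch of the same operation with the `rbd` Python module, continuing
+from the earlier connection and assuming ``new-image`` is the clone created in
+the cloning sketch above (flattening copies all data from the parent, so it can
+take a while for large images)::
+
+    image = rbd.Image(ioctx, 'new-image')  # the cloned (child) image
+    image.flatten()                        # same as 'rbd flatten rbd/new-image'
+    image.close()
+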
+.. note:: Since a flattened image contains all the information from the snapshot,
+ a flattened image will take up more storage space than a layered clone.
+
+
+.. _cephx: ../../rados/configuration/auth-config-ref/
+.. _User Management: ../../operations/user-management
+.. _QEMU: ../qemu-rbd/
+.. _OpenStack: ../rbd-openstack/
+.. _CloudStack: ../rbd-cloudstack/
+.. _libvirt: ../libvirt/