Diffstat (limited to 'src/ceph/doc/ceph-volume')
-rw-r--r--  src/ceph/doc/ceph-volume/index.rst            64
-rw-r--r--  src/ceph/doc/ceph-volume/intro.rst            19
-rw-r--r--  src/ceph/doc/ceph-volume/lvm/activate.rst     88
-rw-r--r--  src/ceph/doc/ceph-volume/lvm/create.rst       24
-rw-r--r--  src/ceph/doc/ceph-volume/lvm/index.rst        28
-rw-r--r--  src/ceph/doc/ceph-volume/lvm/list.rst        173
-rw-r--r--  src/ceph/doc/ceph-volume/lvm/prepare.rst     241
-rw-r--r--  src/ceph/doc/ceph-volume/lvm/scan.rst          9
-rw-r--r--  src/ceph/doc/ceph-volume/lvm/systemd.rst      28
-rw-r--r--  src/ceph/doc/ceph-volume/lvm/zap.rst          19
-rw-r--r--  src/ceph/doc/ceph-volume/simple/activate.rst  80
-rw-r--r--  src/ceph/doc/ceph-volume/simple/index.rst     19
-rw-r--r--  src/ceph/doc/ceph-volume/simple/scan.rst     158
-rw-r--r--  src/ceph/doc/ceph-volume/simple/systemd.rst   28
-rw-r--r--  src/ceph/doc/ceph-volume/systemd.rst          49
15 files changed, 1027 insertions, 0 deletions
diff --git a/src/ceph/doc/ceph-volume/index.rst b/src/ceph/doc/ceph-volume/index.rst
new file mode 100644
index 0000000..5cf4778
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/index.rst
@@ -0,0 +1,64 @@
+.. _ceph-volume:
+
+ceph-volume
+===========
+Deploys OSDs with different device technologies, like lvm or physical disks,
+using pluggable tools (:doc:`lvm/index` itself is treated like a plugin), and
+follows a predictable, robust way of preparing, activating, and starting OSDs.
+
+:ref:`Overview <ceph-volume-overview>` |
+:ref:`Plugin Guide <ceph-volume-plugins>`
+
+
+**Command Line Subcommands**
+There is currently support for ``lvm``, and for plain disks (with GPT
+partitions) that may have been deployed with ``ceph-disk``.
+
+* :ref:`ceph-volume-lvm`
+* :ref:`ceph-volume-simple`
+
+
+Migrating
+---------
+Starting with Ceph version 12.2.2, ``ceph-disk`` is deprecated. Deprecation
+warnings that link to this page will be displayed. It is strongly suggested
+that users start using ``ceph-volume``.
+
+New deployments
+^^^^^^^^^^^^^^^
+For new deployments, :ref:`ceph-volume-lvm` is recommended: it can use any
+logical volume as input for data OSDs, or it can set up a minimal/naive
+logical volume from a device.
+
+Existing OSDs
+^^^^^^^^^^^^^
+If the cluster has OSDs that were provisioned with ``ceph-disk``, then
+``ceph-volume`` can take over the management of these with
+:ref:`ceph-volume-simple`. A scan is done on the data device or OSD directory,
+and ``ceph-disk`` is fully disabled.
+
+Encrypted OSDs
+^^^^^^^^^^^^^^
+If using encryption with OSDs, there is currently no support in ``ceph-volume``
+for this scenario (although support for this is coming soon). In this case, it
+is OK to continue to use ``ceph-disk`` until ``ceph-volume`` fully supports it.
+This page will be updated when that happens.
+
+.. toctree::
+ :hidden:
+ :maxdepth: 3
+ :caption: Contents:
+
+ intro
+ systemd
+ lvm/index
+ lvm/activate
+ lvm/prepare
+ lvm/create
+ lvm/scan
+ lvm/systemd
+ lvm/list
+ lvm/zap
+ simple/index
+ simple/activate
+ simple/scan
+ simple/systemd
diff --git a/src/ceph/doc/ceph-volume/intro.rst b/src/ceph/doc/ceph-volume/intro.rst
new file mode 100644
index 0000000..3869148
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/intro.rst
@@ -0,0 +1,19 @@
+.. _ceph-volume-overview:
+
+Overview
+--------
+The ``ceph-volume`` tool aims to be a single purpose command line tool to deploy
+logical volumes as OSDs, trying to maintain a similar API to ``ceph-disk`` when
+preparing, activating, and creating OSDs.
+
+It deviates from ``ceph-disk`` by not interacting with or relying on the udev
+rules that come installed for Ceph. These rules allow automatic detection of
+previously set up devices, which are in turn fed into ``ceph-disk`` to
+activate them.
+
+
+``ceph-volume lvm``
+-------------------
+By making use of :term:`LVM tags`, the :ref:`ceph-volume-lvm` sub-command is
+able to store and later re-discover and query devices associated with OSDs so
+that they can later be activated.
diff --git a/src/ceph/doc/ceph-volume/lvm/activate.rst b/src/ceph/doc/ceph-volume/lvm/activate.rst
new file mode 100644
index 0000000..956a62a
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/lvm/activate.rst
@@ -0,0 +1,88 @@
+.. _ceph-volume-lvm-activate:
+
+``activate``
+============
+Once :ref:`ceph-volume-lvm-prepare` is completed, and all the various steps
+that entails are done, the volume is ready to get "activated".
+
+This activation process enables a systemd unit that persists the OSD ID and its
+UUID (also called ``fsid`` in Ceph CLI tools), so that at boot time it can
+understand what OSD is enabled and needs to be mounted.
+
+.. note:: The execution of this call is fully idempotent, and there are no
+ side-effects when running it multiple times.
+
+New OSDs
+--------
+To activate newly prepared OSDs both the :term:`OSD id` and :term:`OSD uuid`
+need to be supplied. For example::
+
+ ceph-volume lvm activate --bluestore 0 0263644D-0BF1-4D6D-BC34-28BD98AE3BC8
+
+.. note:: The UUID is stored in the ``osd_fsid`` file in the OSD path, which is
+ generated when :ref:`ceph-volume-lvm-prepare` is used.
+
+requiring uuids
+^^^^^^^^^^^^^^^
+The :term:`OSD uuid` is required as an extra step to ensure that the right
+OSD is being activated. It is entirely possible that a previous OSD with the
+same id exists and would otherwise end up being activated incorrectly.
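+
+For example, if the data directory from :ref:`ceph-volume-lvm-prepare` is
+still mounted, the stored UUID could be read back before activating (a
+sketch; the path assumes a default ``ceph`` cluster name and an OSD id of
+``0``)::
+
+ cat /var/lib/ceph/osd/ceph-0/osd_fsid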
+
+
+Discovery
+---------
+With either existing OSDs or new ones being activated, a *discovery* process is
+performed using :term:`LVM tags` to enable the systemd units.
+
+The systemd unit will capture the :term:`OSD id` and :term:`OSD uuid` and
+persist it. Internally, the activation will enable it like::
+
+ systemctl enable ceph-volume@$id-$uuid-lvm
+
+For example::
+
+ systemctl enable ceph-volume@0-8715BEB4-15C5-49DE-BA6F-401086EC7B41-lvm
+
+Would start the discovery process for the OSD with an id of ``0`` and a UUID of
+``8715BEB4-15C5-49DE-BA6F-401086EC7B41``.
+
+.. note:: for more details on the systemd workflow see :ref:`ceph-volume-lvm-systemd`
+
+The systemd unit will look for the matching OSD device, and by looking at its
+:term:`LVM tags` will proceed to:
+
+#. Mount the device in the corresponding location (by convention this is
+ ``/var/lib/ceph/osd/<cluster name>-<osd id>/``)
+
+#. Ensure that all required devices are ready for that OSD
+
+#. Start the ``ceph-osd@0`` systemd unit
+
+.. note:: The system infers the objectstore type (filestore or bluestore) by
+ inspecting the LVM tags applied to the OSD devices
+
+Existing OSDs
+-------------
+For existing OSDs that have been deployed with different tooling, the only way
+to port them over to the new mechanism is to prepare them again (losing data).
+See :ref:`ceph-volume-lvm-existing-osds` for details on how to proceed.
+
+Summary
+-------
+To recap the ``activate`` process for :term:`bluestore`:
+
+#. Require both the :term:`OSD id` and :term:`OSD uuid`
+#. Enable the systemd unit with the matching id and uuid
+#. Create the ``tmpfs`` mount at the OSD directory in
+ ``/var/lib/ceph/osd/$cluster-$id/``
+#. Recreate all the files needed with ``ceph-bluestore-tool prime-osd-dir``
+ by pointing it to the OSD ``block`` device (see the sketch after this list)
+#. The systemd unit will ensure all devices are ready and linked
+#. The matching ``ceph-osd`` systemd unit will get started
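+
+As a minimal sketch of what steps 3 and 4 amount to internally (the ``block``
+logical volume path here is hypothetical)::
+
+ mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
+ ceph-bluestore-tool prime-osd-dir --dev /dev/ceph-vg/osd-block-lv \
+ --path /var/lib/ceph/osd/ceph-0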
+
+And for :term:`filestore`:
+
+#. Require both the :term:`OSD id` and :term:`OSD uuid`
+#. Enable the systemd unit with the matching id and uuid
+#. The systemd unit will ensure all devices are ready and mounted (if needed)
+#. The matching ``ceph-osd`` systemd unit will get started
diff --git a/src/ceph/doc/ceph-volume/lvm/create.rst b/src/ceph/doc/ceph-volume/lvm/create.rst
new file mode 100644
index 0000000..c90d1f6
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/lvm/create.rst
@@ -0,0 +1,24 @@
+.. _ceph-volume-lvm-create:
+
+``create``
+===========
+This subcommand wraps the two-step process to provision a new OSD (calling
+``prepare`` first and then ``activate``) into a single one. The reason to
+prefer ``prepare`` and then ``activate`` is to gradually introduce new OSDs
+into a cluster, avoiding large amounts of data from being rebalanced.
+
+The single-call process unifies exactly what :ref:`ceph-volume-lvm-prepare` and
+:ref:`ceph-volume-lvm-activate` do, with the convenience of doing it all at
+once.
+
+There is nothing different to the process except that the OSD will become
+``up`` and ``in`` immediately after completion.
+
+The backing objectstore can be specified with:
+
+* :ref:`--filestore <ceph-volume-lvm-prepare_filestore>`
+* :ref:`--bluestore <ceph-volume-lvm-prepare_bluestore>`
+
+All command line flags and options are the same as ``ceph-volume lvm prepare``.
+Please refer to :ref:`ceph-volume-lvm-prepare` for details.
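+
+Since the flags mirror ``prepare``, a minimal single-call bluestore deployment
+could look like this (volume group and logical volume names hypothetical)::
+
+ ceph-volume lvm create --bluestore --data ceph-vg/block-lv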
diff --git a/src/ceph/doc/ceph-volume/lvm/index.rst b/src/ceph/doc/ceph-volume/lvm/index.rst
new file mode 100644
index 0000000..9a2191f
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/lvm/index.rst
@@ -0,0 +1,28 @@
+.. _ceph-volume-lvm:
+
+``lvm``
+=======
+Implements the functionality needed to deploy OSDs from the ``lvm`` subcommand:
+``ceph-volume lvm``
+
+**Command Line Subcommands**
+
+* :ref:`ceph-volume-lvm-prepare`
+
+* :ref:`ceph-volume-lvm-activate`
+
+* :ref:`ceph-volume-lvm-create`
+
+* :ref:`ceph-volume-lvm-list`
+
+.. not yet implemented
+.. * :ref:`ceph-volume-lvm-scan`
+
+**Internal functionality**
+
+There are other aspects of the ``lvm`` subcommand that are internal and not
+exposed to the user. These sections explain how those pieces work together,
+clarifying the workflows of the tool.
+
+:ref:`Systemd Units <ceph-volume-lvm-systemd>` |
+:ref:`lvm <ceph-volume-lvm-api>`
diff --git a/src/ceph/doc/ceph-volume/lvm/list.rst b/src/ceph/doc/ceph-volume/lvm/list.rst
new file mode 100644
index 0000000..19e0600
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/lvm/list.rst
@@ -0,0 +1,173 @@
+.. _ceph-volume-lvm-list:
+
+``list``
+========
+This subcommand will list any devices (logical and physical) that may be
+associated with a Ceph cluster, as long as they contain enough metadata to
+allow for that discovery.
+
+Output is grouped by the OSD ID associated with the devices, and unlike
+``ceph-disk`` it does not provide any information for devices that aren't
+associated with Ceph.
+
+Command line options:
+
+* ``--format`` Allows a ``json`` or ``pretty`` value. Defaults to ``pretty``,
+ which will group the device information in a human-readable format.
+
+Full Reporting
+--------------
+When no positional arguments are used, a full report will be presented. This
+means that all devices and logical volumes found in the system will be
+displayed.
+
+Full ``pretty`` reporting for two OSDs, one with an lv as a journal and
+another one with a physical device, may look similar to::
+
+ # ceph-volume lvm list
+
+
+ ====== osd.1 =======
+
+ [journal] /dev/journals/journal1
+
+ journal uuid C65n7d-B1gy-cqX3-vZKY-ZoE0-IEYM-HnIJzs
+ osd id 1
+ cluster fsid ce454d91-d748-4751-a318-ff7f7aa18ffd
+ type journal
+ osd fsid 661b24f8-e062-482b-8110-826ffe7f13fa
+ data uuid SlEgHe-jX1H-QBQk-Sce0-RUls-8KlY-g8HgcZ
+ journal device /dev/journals/journal1
+ data device /dev/test_group/data-lv2
+
+ [data] /dev/test_group/data-lv2
+
+ journal uuid C65n7d-B1gy-cqX3-vZKY-ZoE0-IEYM-HnIJzs
+ osd id 1
+ cluster fsid ce454d91-d748-4751-a318-ff7f7aa18ffd
+ type data
+ osd fsid 661b24f8-e062-482b-8110-826ffe7f13fa
+ data uuid SlEgHe-jX1H-QBQk-Sce0-RUls-8KlY-g8HgcZ
+ journal device /dev/journals/journal1
+ data device /dev/test_group/data-lv2
+
+ ====== osd.0 =======
+
+ [data] /dev/test_group/data-lv1
+
+ journal uuid cd72bd28-002a-48da-bdf6-d5b993e84f3f
+ osd id 0
+ cluster fsid ce454d91-d748-4751-a318-ff7f7aa18ffd
+ type data
+ osd fsid 943949f0-ce37-47ca-a33c-3413d46ee9ec
+ data uuid TUpfel-Q5ZT-eFph-bdGW-SiNW-l0ag-f5kh00
+ journal device /dev/sdd1
+ data device /dev/test_group/data-lv1
+
+ [journal] /dev/sdd1
+
+ PARTUUID cd72bd28-002a-48da-bdf6-d5b993e84f3f
+
+.. note:: Tags are displayed in a readable format. The ``osd id`` key is stored
+ as a ``ceph.osd_id`` tag. For more information on lvm tag conventions
+ see :ref:`ceph-volume-lvm-tag-api`
+
+Single Reporting
+----------------
+Single reporting can consume both devices and logical volumes as input
+(positional parameters). For logical volumes, it is required to use the group
+name as well as the logical volume name.
+
+For example, the ``data-lv2`` logical volume in the ``test_group`` volume
+group can be listed in the following way::
+
+ # ceph-volume lvm list test_group/data-lv2
+
+
+ ====== osd.1 =======
+
+ [data] /dev/test_group/data-lv2
+
+ journal uuid C65n7d-B1gy-cqX3-vZKY-ZoE0-IEYM-HnIJzs
+ osd id 1
+ cluster fsid ce454d91-d748-4751-a318-ff7f7aa18ffd
+ type data
+ osd fsid 661b24f8-e062-482b-8110-826ffe7f13fa
+ data uuid SlEgHe-jX1H-QBQk-Sce0-RUls-8KlY-g8HgcZ
+ journal device /dev/journals/journal1
+ data device /dev/test_group/data-lv2
+
+
+.. note:: Tags are displayed in a readable format. The ``osd id`` key is stored
+ as a ``ceph.osd_id`` tag. For more information on lvm tag conventions
+ see :ref:`ceph-volume-lvm-tag-api`
+
+
+For plain disks, the full path to the device is required. For example, for
+a device like ``/dev/sdd1`` it can look like::
+
+
+ # ceph-volume lvm list /dev/sdd1
+
+
+ ====== osd.0 =======
+
+ [journal] /dev/sdd1
+
+ PARTUUID cd72bd28-002a-48da-bdf6-d5b993e84f3f
+
+
+
+``json`` output
+---------------
+All output using ``--format=json`` will show everything the system has stored
+as metadata for the devices, including tags.
+
+No changes for readability are done with ``json`` reporting, and all
+information is presented as-is. Full output as well as single devices can be
+listed.
+
+For brevity, this is how a single logical volume would look with ``json``
+output (note how tags aren't modified)::
+
+ # ceph-volume lvm list --format=json test_group/data-lv1
+ {
+ "0": [
+ {
+ "lv_name": "data-lv1",
+ "lv_path": "/dev/test_group/data-lv1",
+ "lv_tags": "ceph.cluster_fsid=ce454d91-d748-4751-a318-ff7f7aa18ffd,ceph.data_device=/dev/test_group/data-lv1,ceph.data_uuid=TUpfel-Q5ZT-eFph-bdGW-SiNW-l0ag-f5kh00,ceph.journal_device=/dev/sdd1,ceph.journal_uuid=cd72bd28-002a-48da-bdf6-d5b993e84f3f,ceph.osd_fsid=943949f0-ce37-47ca-a33c-3413d46ee9ec,ceph.osd_id=0,ceph.type=data",
+ "lv_uuid": "TUpfel-Q5ZT-eFph-bdGW-SiNW-l0ag-f5kh00",
+ "name": "data-lv1",
+ "path": "/dev/test_group/data-lv1",
+ "tags": {
+ "ceph.cluster_fsid": "ce454d91-d748-4751-a318-ff7f7aa18ffd",
+ "ceph.data_device": "/dev/test_group/data-lv1",
+ "ceph.data_uuid": "TUpfel-Q5ZT-eFph-bdGW-SiNW-l0ag-f5kh00",
+ "ceph.journal_device": "/dev/sdd1",
+ "ceph.journal_uuid": "cd72bd28-002a-48da-bdf6-d5b993e84f3f",
+ "ceph.osd_fsid": "943949f0-ce37-47ca-a33c-3413d46ee9ec",
+ "ceph.osd_id": "0",
+ "ceph.type": "data"
+ },
+ "type": "data",
+ "vg_name": "test_group"
+ }
+ ]
+ }
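+
+Because the output is plain ``json``, it is easy to consume from scripts. For
+example, listing just the OSD IDs found on the system could look like this
+(a sketch, assuming the ``jq`` utility is installed)::
+
+ ceph-volume lvm list --format=json | jq 'keys'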
+
+
+Synchronized information
+------------------------
+Before any listing type, the lvm API is queried to ensure that physical
+devices that may be in use haven't changed names. It is possible that
+non-persistent devices like ``/dev/sda1`` could change to ``/dev/sdb1``.
+
+The detection is possible because the ``PARTUUID`` is stored as part of the
+metadata in the logical volume for the data lv. Even in the case of a journal
+that is a physical device, this information is still stored on the data logical
+volume associated with it.
+
+If the name is no longer the same (as reported by ``blkid`` when using the
+``PARTUUID``), the tag will get updated and the report will use the newly
+refreshed information.
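+
+That lookup can be reproduced manually with ``blkid`` (a sketch; the
+``PARTUUID`` value here is the one from the earlier listing)::
+
+ blkid -t PARTUUID="cd72bd28-002a-48da-bdf6-d5b993e84f3f" -o device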
diff --git a/src/ceph/doc/ceph-volume/lvm/prepare.rst b/src/ceph/doc/ceph-volume/lvm/prepare.rst
new file mode 100644
index 0000000..27ebb55
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/lvm/prepare.rst
@@ -0,0 +1,241 @@
+.. _ceph-volume-lvm-prepare:
+
+``prepare``
+===========
+This subcommand allows a :term:`filestore` or :term:`bluestore` setup. It is
+recommended to pre-provision a logical volume before using it with
+``ceph-volume lvm``.
+
+Logical volumes are not altered except for adding extra metadata.
+
+.. note:: This is part of a two step process to deploy an OSD. If looking for
+ a single-call way, please see :ref:`ceph-volume-lvm-create`
+
+As part of preparing a volume (or volumes) to work with Ceph, the tool will
+assign a few pieces of metadata using :term:`LVM tags` to help identify them
+later.
+
+:term:`LVM tags` make volumes easy to discover later, and help identify them
+as part of a Ceph system, and what role they have (journal, filestore,
+bluestore, etc...)
+
+Although initially only :term:`filestore` is supported (and it is the
+default), the backend can be specified with:
+
+
+* :ref:`--filestore <ceph-volume-lvm-prepare_filestore>`
+* :ref:`--bluestore <ceph-volume-lvm-prepare_bluestore>`
+
+.. _ceph-volume-lvm-prepare_filestore:
+
+``filestore``
+-------------
+This is the OSD backend that allows preparation of logical volumes for
+a :term:`filestore` objectstore OSD.
+
+It can use a logical volume for the OSD data and a partitioned physical device
+or logical volume for the journal. No special preparation is needed for these
+volumes other than following the minimum size requirements for data and
+journal.
+
+The API call looks like::
+
+ ceph-volume lvm prepare --filestore --data data --journal journal
+
+There is flexibility to use a raw device or partition as well for ``--data``
+that will be converted to a logical volume. This is not ideal in all situations
+since ``ceph-volume`` is just going to create a unique volume group and
+a logical volume from that device.
+
+When using logical volumes for ``--data``, the value *must* be a volume group
+name and a logical volume name separated by a ``/``. Since logical volume names
+are not enforced for uniqueness, this prevents using the wrong volume. The
+``--journal`` can be either a logical volume *or* a partition.
+
+When using a partition, it *must* contain a ``PARTUUID`` discoverable by
+``blkid``, so that it can later be identified correctly regardless of the
+device name (or path).
+
+When using a partition, this is how it would look for ``/dev/sdc1``::
+
+ ceph-volume lvm prepare --filestore --data volume_group/lv_name --journal /dev/sdc1
+
+For a logical volume, just like for ``--data``, a volume group and logical
+volume name are required::
+
+ ceph-volume lvm prepare --filestore --data volume_group/lv_name --journal volume_group/journal_lv
+
+A generated UUID is used to ask the cluster for a new OSD. These two pieces
+of information (the OSD id and UUID) are crucial for identifying an OSD and
+will later be used throughout the :ref:`ceph-volume-lvm-activate` process.
+
+The OSD data directory is created using the following convention::
+
+ /var/lib/ceph/osd/<cluster name>-<osd id>
+
+At this point the data volume is mounted at this location, and the journal
+volume is linked::
+
+ ln -s /path/to/journal /var/lib/ceph/osd/<cluster_name>-<osd-id>/journal
+
+The monmap is fetched using the bootstrap key from the OSD::
+
+ /usr/bin/ceph --cluster ceph --name client.bootstrap-osd \
+ --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring \
+ mon getmap -o /var/lib/ceph/osd/<cluster name>-<osd id>/activate.monmap
+
+``ceph-osd`` will be called to populate the OSD directory, which is already
+mounted, re-using all the pieces of information from the initial steps::
+
+ ceph-osd --cluster ceph --mkfs --mkkey -i <osd id> \
+ --monmap /var/lib/ceph/osd/<cluster name>-<osd id>/activate.monmap --osd-data \
+ /var/lib/ceph/osd/<cluster name>-<osd id> --osd-journal /var/lib/ceph/osd/<cluster name>-<osd id>/journal \
+ --osd-uuid <osd uuid> --keyring /var/lib/ceph/osd/<cluster name>-<osd id>/keyring \
+ --setuser ceph --setgroup ceph
+
+.. _ceph-volume-lvm-existing-osds:
+
+Existing OSDs
+-------------
+For existing clusters that want to use this new system and have OSDs that are
+already running, there are a few things to take into account:
+
+.. warning:: this process will forcefully format the data device, destroying
+ existing data, if any.
+
+* OSD paths should follow this convention::
+
+ /var/lib/ceph/osd/<cluster name>-<osd id>
+
+* Preferably, no other mechanisms to mount the volume should exist, and any
+ that do should be removed (like fstab mount points)
+* There is currently no support for encrypted volumes
+
+The one-time process for an existing OSD, with an ID of 0 and
+using a ``"ceph"`` cluster name, would look like::
+
+ ceph-volume lvm prepare --filestore --osd-id 0 --osd-fsid E3D291C1-E7BF-4984-9794-B60D9FA139CB
+
+The command line tool will not contact the monitor to generate an OSD ID and
+will format the LVM device in addition to storing the metadata on it so that
+it can later be started (for a detailed metadata description see
+:ref:`ceph-volume-lvm-tags`).
+
+
+.. _ceph-volume-lvm-prepare_bluestore:
+
+``bluestore``
+-------------
+The :term:`bluestore` objectstore is the default for new OSDs. It offers a bit
+more flexibility for devices. Bluestore supports the following configurations:
+
+* A block device, a block.wal, and a block.db device
+* A block device and a block.wal device
+* A block device and a block.db device
+* A single block device
+
+It can accept a whole device (or partition), or a logical volume for
+``block``. If a physical device is provided it will then be turned into
+a logical volume. This allows a simpler approach to using LVM, but at the
+cost of flexibility: there are no options or configurations to change how
+the LV is created.
+
+The ``block`` is specified with the ``--data`` flag, and in its simplest use
+case it looks like::
+
+ ceph-volume lvm prepare --bluestore --data vg/lv
+
+A raw device can be specified in the same way::
+
+ ceph-volume lvm prepare --bluestore --data /path/to/device
+
+
+If a ``block.db`` or a ``block.wal`` is needed (they are optional for
+bluestore) they can be specified with ``--block.db`` and ``--block.wal``
+accordingly. These can be a physical device (in which case it **must** be
+a partition) or a logical volume.
+
+For both ``block.db`` and ``block.wal``, partitions aren't turned into
+logical volumes because they can be used as-is. Logical volumes are also
+allowed, as shown in the sketch below.
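+
+For example, preparing a bluestore OSD with a separate ``block.db`` partition
+and a ``block.wal`` logical volume might look like this (device and volume
+names hypothetical)::
+
+ ceph-volume lvm prepare --bluestore --data ceph-vg/block-lv \
+ --block.db /dev/sdb1 --block.wal ceph-vg/wal-lv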
+
+While creating the OSD directory, the process will use a ``tmpfs`` mount to
+place all the files needed for the OSD. These files are initially created by
+``ceph-osd --mkfs`` and are fully ephemeral.
+
+A symlink is always created for the ``block`` device, and optionally for
+``block.db`` and ``block.wal``. For a cluster with a default name, and an OSD
+id of 0, the directory could look like::
+
+ # ls -l /var/lib/ceph/osd/ceph-0
+ lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block -> /dev/ceph-be2b6fbd-bcf2-4c51-b35d-a35a162a02f0/osd-block-25cf0a05-2bc6-44ef-9137-79d65bd7ad62
+ lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block.db -> /dev/sda1
+ lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block.wal -> /dev/ceph/osd-wal-0
+ -rw-------. 1 ceph ceph 37 Oct 20 13:05 ceph_fsid
+ -rw-------. 1 ceph ceph 37 Oct 20 13:05 fsid
+ -rw-------. 1 ceph ceph 55 Oct 20 13:05 keyring
+ -rw-------. 1 ceph ceph 6 Oct 20 13:05 ready
+ -rw-------. 1 ceph ceph 10 Oct 20 13:05 type
+ -rw-------. 1 ceph ceph 2 Oct 20 13:05 whoami
+
+In the above case, a device was used for ``block``, so ``ceph-volume`` created
+a volume group and a logical volume using the following convention:
+
+* volume group name: ``ceph-{cluster fsid}``, or if the vg already exists,
+ ``ceph-{random uuid}``
+
+* logical volume name: ``osd-block-{osd_fsid}``
+
+
+Storing metadata
+----------------
+The following tags will get applied as part of the preparation process
+regardless of the type of volume (journal or data) or OSD objectstore:
+
+* ``cluster_fsid``
+* ``encrypted``
+* ``osd_fsid``
+* ``osd_id``
+
+For :term:`filestore` these tags will be added:
+
+* ``journal_device``
+* ``journal_uuid``
+
+For :term:`bluestore` these tags will be added:
+
+* ``block_device``
+* ``block_uuid``
+* ``db_device``
+* ``db_uuid``
+* ``wal_device``
+* ``wal_uuid``
+
+.. note:: For the complete lvm tag conventions see :ref:`ceph-volume-lvm-tag-api`
+
+
+Summary
+-------
+To recap the ``prepare`` process for :term:`bluestore`:
+
+#. Accept a logical volume for ``block``, or a raw device (that will get
+ converted to an lv)
+#. Accept partitions or logical volumes for ``block.wal`` or ``block.db``
+#. Generate a UUID for the OSD
+#. Ask the monitor for an OSD ID, reusing the generated UUID
+#. OSD data directory is created on a ``tmpfs`` mount
+#. ``block``, ``block.wal``, and ``block.db`` are symlinked if defined
+#. monmap is fetched for activation
+#. Data directory is populated by ``ceph-osd``
+#. Logical volumes are assigned all the Ceph metadata using lvm tags
+
+
+And the ``prepare`` process for :term:`filestore`:
+
+#. Accept only logical volumes for data and journal (both required)
+#. Generate a UUID for the OSD
+#. Ask the monitor for an OSD ID, reusing the generated UUID
+#. OSD data directory is created and the data volume mounted
+#. Journal is symlinked from the data volume to the journal location
+#. monmap is fetched for activation
+#. Devices are mounted and the data directory is populated by ``ceph-osd``
+#. Data and journal volumes are assigned all the Ceph metadata using lvm tags
diff --git a/src/ceph/doc/ceph-volume/lvm/scan.rst b/src/ceph/doc/ceph-volume/lvm/scan.rst
new file mode 100644
index 0000000..96d2719
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/lvm/scan.rst
@@ -0,0 +1,9 @@
+scan
+====
+This sub-command allows discovering Ceph volumes previously set up by the
+tool by looking at the system's logical volumes and their tags.
+
+As part of the :ref:`ceph-volume-lvm-prepare` process, the logical volumes
+are assigned a few tags with important pieces of information.
+
+.. note:: This sub-command is not yet implemented
diff --git a/src/ceph/doc/ceph-volume/lvm/systemd.rst b/src/ceph/doc/ceph-volume/lvm/systemd.rst
new file mode 100644
index 0000000..30260de
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/lvm/systemd.rst
@@ -0,0 +1,28 @@
+.. _ceph-volume-lvm-systemd:
+
+systemd
+=======
+Upon startup, the systemd unit will identify the logical volume using
+:term:`LVM tags`, finding a matching ID and later ensuring it is the right
+one with the :term:`OSD uuid`.
+
+After identifying the correct volume it will then proceed to mount it by using
+the OSD destination conventions, that is::
+
+ /var/lib/ceph/osd/<cluster name>-<osd id>
+
+For our example OSD with an id of ``0``, that means the identified device will
+be mounted at::
+
+
+ /var/lib/ceph/osd/ceph-0
+
+
+Once that process is complete, a call will be made to start the OSD::
+
+ systemctl start ceph-osd@0
+
+The systemd portion of this process is handled by the ``ceph-volume lvm
+trigger`` sub-command, which is only in charge of parsing metadata coming
+from systemd at startup, and then dispatching to ``ceph-volume lvm
+activate``, which proceeds with activation.
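+
+For our example OSD with an id of ``0``, the dispatched call would look
+similar to this sketch (the argument mirrors the ``{osd id}-{osd uuid}``
+portion of the unit's instance name; the UUID here is hypothetical)::
+
+ ceph-volume lvm trigger 0-8715BEB4-15C5-49DE-BA6F-401086EC7B41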
diff --git a/src/ceph/doc/ceph-volume/lvm/zap.rst b/src/ceph/doc/ceph-volume/lvm/zap.rst
new file mode 100644
index 0000000..8d42a90
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/lvm/zap.rst
@@ -0,0 +1,19 @@
+.. _ceph-volume-lvm-zap:
+
+``zap``
+=======
+
+This subcommand is used to zap lvs or partitions that have been used by Ceph
+OSDs so that they may be reused. If given a path to a logical volume, it must
+be in the format of ``vg/lv``. Any filesystems present on the given lv or
+partition will be removed and all data will be purged.
+
+.. note:: The lv or partition will be kept intact.
+
+Zapping a logical volume::
+
+ ceph-volume lvm zap {vg name/lv name}
+
+Zapping a partition::
+
+ ceph-volume lvm zap /dev/sdc1
diff --git a/src/ceph/doc/ceph-volume/simple/activate.rst b/src/ceph/doc/ceph-volume/simple/activate.rst
new file mode 100644
index 0000000..edbb1e3
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/simple/activate.rst
@@ -0,0 +1,80 @@
+.. _ceph-volume-simple-activate:
+
+``activate``
+============
+Once :ref:`ceph-volume-simple-scan` has been completed, and all the metadata
+captured for an OSD has been persisted to ``/etc/ceph/osd/{id}-{uuid}.json``,
+the OSD is ready to get "activated".
+
+This activation process **disables** all ``ceph-disk`` systemd units by
+masking them, to prevent the UDEV/ceph-disk interaction that would attempt
+to start them up at boot time.
+
+The disabling of ``ceph-disk`` units is done only when calling ``ceph-volume
+simple activate`` directly, but it is avoided when being called by systemd as
+the system boots up.
+
+The activation process requires using both the :term:`OSD id` and :term:`OSD
+uuid`. To activate parsed OSDs::
+
+ ceph-volume simple activate 0 6cc43680-4f6e-4feb-92ff-9c7ba204120e
+
+The above command will assume that a JSON configuration will be found in::
+
+ /etc/ceph/osd/0-6cc43680-4f6e-4feb-92ff-9c7ba204120e.json
+
+Alternatively, using a path to a JSON file directly is also possible::
+
+ ceph-volume simple activate --file /etc/ceph/osd/0-6cc43680-4f6e-4feb-92ff-9c7ba204120e.json
+
+requiring uuids
+^^^^^^^^^^^^^^^
+The :term:`OSD uuid` is required as an extra step to ensure that the right
+OSD is being activated. It is entirely possible that a previous OSD with the
+same id exists and would otherwise end up being activated incorrectly.
+
+
+Discovery
+---------
+With OSDs previously scanned by ``ceph-volume``, a *discovery* process is
+performed using ``blkid`` and ``lvm``. There is currently support only for
+devices with GPT partitions and LVM logical volumes.
+
+The GPT partitions will have a ``PARTUUID`` that can be queried by calling out
+to ``blkid``, and the logical volumes will have a ``lv_uuid`` that can be
+queried against ``lvs`` (the LVM tool to list logical volumes).
+
+This discovery process ensures that devices can be correctly detected even if
+they are repurposed into another system or if their name changes (as in the
+case of non-persistent names like ``/dev/sda1``).
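+
+For example, the underlying queries could look similar to this sketch (the
+``PARTUUID`` and ``vg/lv`` names are hypothetical)::
+
+ blkid -t PARTUUID="6cc43680-4f6e-4feb-92ff-9c7ba204120e" -o device
+ lvs --noheadings -o lv_uuid test_group/data-lv2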
+
+The JSON configuration file used to map what devices go to what OSD will then
+coordinate the mounting and symlinking as part of activation.
+
+To ensure that the symlinks are always correct, if they exist in the OSD
+directory, the symlinks will be re-done.
+
+A systemd unit will capture the :term:`OSD id` and :term:`OSD uuid` and
+persist it. Internally, the activation will enable it like::
+
+ systemctl enable ceph-volume@simple-$id-$uuid
+
+For example::
+
+ systemctl enable ceph-volume@simple-0-8715BEB4-15C5-49DE-BA6F-401086EC7B41
+
+Would start the discovery process for the OSD with an id of ``0`` and a UUID of
+``8715BEB4-15C5-49DE-BA6F-401086EC7B41``.
+
+
+The systemd process will call out to ``activate``, passing the information
+needed to identify the OSD and its devices, and it will proceed to:
+
+#. Mount the device in the corresponding location (by convention this is
+ ``/var/lib/ceph/osd/<cluster name>-<osd id>/``)
+
+#. Ensure that all required devices are ready for that OSD and properly
+ linked, regardless of objectstore used (filestore or bluestore). The
+ symbolic link will **always** be re-done to ensure that the correct device
+ is linked.
+
+#. Start the ``ceph-osd@0`` systemd unit
diff --git a/src/ceph/doc/ceph-volume/simple/index.rst b/src/ceph/doc/ceph-volume/simple/index.rst
new file mode 100644
index 0000000..6f2534a
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/simple/index.rst
@@ -0,0 +1,19 @@
+.. _ceph-volume-simple:
+
+``simple``
+==========
+Implements the functionality needed to manage OSDs from the ``simple`` subcommand:
+``ceph-volume simple``
+
+**Command Line Subcommands**
+
+* :ref:`ceph-volume-simple-scan`
+
+* :ref:`ceph-volume-simple-activate`
+
+* :ref:`ceph-volume-simple-systemd`
+
+
+By *taking over* management, it disables all ``ceph-disk`` systemd units used
+to trigger devices at startup, relying on basic (customizable) JSON
+configuration and systemd for starting up OSDs.
diff --git a/src/ceph/doc/ceph-volume/simple/scan.rst b/src/ceph/doc/ceph-volume/simple/scan.rst
new file mode 100644
index 0000000..afeddab
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/simple/scan.rst
@@ -0,0 +1,158 @@
+.. _ceph-volume-simple-scan:
+
+``scan``
+========
+Scanning allows capturing any important details from an already-deployed OSD
+so that ``ceph-volume`` can manage it without the need of any other startup
+workflows or tools (like ``udev`` or ``ceph-disk``).
+
+The command can inspect a running OSD, by inspecting the directory where the
+OSD data is stored, or by consuming the data partition.
+
+Once scanned, the metadata will (by default) be persisted as JSON in a file
+in ``/etc/ceph/osd``. This ``JSON`` file uses the naming convention
+``{OSD ID}-{OSD FSID}.json``. For an OSD with an id of 1, and an FSID like
+``86ebd829-1405-43d3-8fd6-4cbc9b6ecf96``, the absolute path of the file
+would be::
+
+ /etc/ceph/osd/1-86ebd829-1405-43d3-8fd6-4cbc9b6ecf96.json
+
+The ``scan`` subcommand will refuse to write to this file if it already exists.
+If overwriting the contents is needed, the ``--force`` flag must be used::
+
+ ceph-volume simple scan --force {path}
+
+If there is no need to persist the ``JSON`` metadata, there is support to send
+the contents to ``stdout`` (no file will be written)::
+
+ ceph-volume simple scan --stdout {path}
+
+
+.. _ceph-volume-simple-scan-directory:
+
+Directory scan
+--------------
+The directory scan will capture OSD file contents from interesting files. There
+are a few files that must exist in order to have a successful scan:
+
+* ``ceph_fsid``
+* ``fsid``
+* ``keyring``
+* ``ready``
+* ``type``
+* ``whoami``
+
+In the case of any other file, as long as it is not a binary or a directory, it
+will also get captured and persisted as part of the JSON object.
+
+The convention for the keys in the JSON object is that any file name will be
+a key, and its contents will be its value. If the contents are a single line
+(as in the case of ``whoami``) the contents are trimmed, and the newline is
+dropped. For example, with an OSD with an id of 1, this is how the JSON entry
+would look::
+
+ "whoami": "1",
+
+For files that may have more than one line, the contents are left as-is, for
+example, a ``keyring`` could look like this::
+
+ "keyring": "[osd.1]\n\tkey = AQBBJ/dZp57NIBAAtnuQS9WOS0hnLVe0rZnE6Q==\n",
+
+For a directory like ``/var/lib/ceph/osd/ceph-1``, the command could look
+like::
+
+ ceph-volume simple scan /var/lib/ceph/osd/ceph-1
+
+
+.. note:: There is no support for encrypted OSDs
+
+
+.. _ceph-volume-simple-scan-device:
+
+Device scan
+-----------
+When an OSD directory is not available (OSD is not running, or device is not
+mounted) the ``scan`` command is able to introspect the device to capture
+required data. Just like :ref:`ceph-volume-simple-scan-directory`, it would
+still require a few files present. This means that the device to be scanned
+**must be** the data partition of the OSD.
+
+As long as the data partition of the OSD is being passed in as an argument, the
+sub-command can scan its contents.
+
+In the case where the device is already mounted, the tool can detect this
+scenario and capture file contents from that directory.
+
+If the device is not mounted, a temporary directory will be created, and the
+device will be mounted temporarily just for scanning the contents. Once
+contents are scanned, the device will be unmounted.
+
+For a device like ``/dev/sda1`` which **must** be a data partition, the command
+could look like::
+
+ ceph-volume simple scan /dev/sda1
+
+
+.. note:: There is no support for encrypted OSDs
+
+
+.. _ceph-volume-simple-scan-json:
+
+``JSON`` contents
+-----------------
+The contents of the JSON object are very simple. The scan will not only
+persist information from the special OSD files and their contents, but will
+also validate paths and device UUIDs. Unlike ``ceph-disk``, which would store
+them in ``{device type}_uuid`` files, the tool will persist them as part of
+the device type key.
+
+For example, a ``block.db`` device would look something like::
+
+ "block.db": {
+ "path": "/dev/disk/by-partuuid/6cc43680-4f6e-4feb-92ff-9c7ba204120e",
+ "uuid": "6cc43680-4f6e-4feb-92ff-9c7ba204120e"
+ },
+
+But it will also persist the special file generated by ``ceph-disk``, like
+so::
+
+ "block.db_uuid": "6cc43680-4f6e-4feb-92ff-9c7ba204120e",
+
+This duplication is in place because the tool is trying to ensure the
+following:
+
+#. Support OSDs that may not have ceph-disk special files
+#. Check the most up-to-date information on the device, by querying against
+ LVM and ``blkid``
+#. Support both logical volumes and GPT devices
+
+This is sample ``JSON`` metadata from an OSD that is using ``bluestore``::
+
+ {
+ "active": "ok",
+ "block": {
+ "path": "/dev/disk/by-partuuid/40fd0a64-caa5-43a3-9717-1836ac661a12",
+ "uuid": "40fd0a64-caa5-43a3-9717-1836ac661a12"
+ },
+ "block.db": {
+ "path": "/dev/disk/by-partuuid/6cc43680-4f6e-4feb-92ff-9c7ba204120e",
+ "uuid": "6cc43680-4f6e-4feb-92ff-9c7ba204120e"
+ },
+ "block.db_uuid": "6cc43680-4f6e-4feb-92ff-9c7ba204120e",
+ "block_uuid": "40fd0a64-caa5-43a3-9717-1836ac661a12",
+ "bluefs": "1",
+ "ceph_fsid": "c92fc9eb-0610-4363-aafc-81ddf70aaf1b",
+ "cluster_name": "ceph",
+ "data": {
+ "path": "/dev/sdr1",
+ "uuid": "86ebd829-1405-43d3-8fd6-4cbc9b6ecf96"
+ },
+ "fsid": "86ebd829-1405-43d3-8fd6-4cbc9b6ecf96",
+ "keyring": "[osd.3]\n\tkey = AQBBJ/dZp57NIBAAtnuQS9WOS0hnLVe0rZnE6Q==\n",
+ "kv_backend": "rocksdb",
+ "magic": "ceph osd volume v026",
+ "mkfs_done": "yes",
+ "ready": "ready",
+ "systemd": "",
+ "type": "bluestore",
+ "whoami": "3"
+ }
diff --git a/src/ceph/doc/ceph-volume/simple/systemd.rst b/src/ceph/doc/ceph-volume/simple/systemd.rst
new file mode 100644
index 0000000..aa5bebf
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/simple/systemd.rst
@@ -0,0 +1,28 @@
+.. _ceph-volume-simple-systemd:
+
+systemd
+=======
+Upon startup, the systemd unit will identify the OSD by loading the JSON file
+in ``/etc/ceph/osd/{id}-{uuid}.json`` corresponding to the instance name of
+the systemd unit.
+
+After identifying the correct volume it will then proceed to mount it by using
+the OSD destination conventions, that is::
+
+ /var/lib/ceph/osd/{cluster name}-{osd id}
+
+For our example OSD with an id of ``0``, that means the identified device will
+be mounted at::
+
+
+ /var/lib/ceph/osd/ceph-0
+
+
+Once that process is complete, a call will be made to start the OSD::
+
+ systemctl start ceph-osd@0
+
+The systemd portion of this process is handled by the ``ceph-volume simple
+trigger`` sub-command, which is only in charge of parsing metadata coming
+from systemd at startup, and then dispatching to ``ceph-volume simple
+activate``, which proceeds with activation.
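+
+For our example OSD with an id of ``0``, the dispatched call would look
+similar to this sketch (the UUID here is hypothetical)::
+
+ ceph-volume simple trigger 0-8715BEB4-15C5-49DE-BA6F-401086EC7B41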
diff --git a/src/ceph/doc/ceph-volume/systemd.rst b/src/ceph/doc/ceph-volume/systemd.rst
new file mode 100644
index 0000000..6cbc112
--- /dev/null
+++ b/src/ceph/doc/ceph-volume/systemd.rst
@@ -0,0 +1,49 @@
+.. _ceph-volume-systemd:
+
+systemd
+=======
+As part of the activation process (either with :ref:`ceph-volume-lvm-activate`
+or :ref:`ceph-volume-simple-activate`), systemd units will get enabled that
+will use the OSD id and uuid as part of their name. These units will be run
+when the system boots, and will proceed to activate their corresponding
+volumes via their sub-command implementation.
+
+The API for activation is a bit loose: it only requires two parts, the
+subcommand to use and any extra meta information separated by a dash. This
+convention makes the units look like::
+
+ ceph-volume@{command}-{extra metadata}
+
+The *extra metadata* can be anything the subcommand implementing the
+processing might need. In the case of :ref:`ceph-volume-lvm` and
+:ref:`ceph-volume-simple`, both consume the :term:`OSD id` and :term:`OSD
+uuid`, but this is not a hard requirement; it is just how the sub-commands
+are implemented.
+
+Both the command and extra metadata get persisted by systemd as part of the
+*"instance name"* of the unit. For example, an OSD with an ID of 0, for the
+``lvm`` sub-command, would look like::
+
+ systemctl enable ceph-volume@lvm-0-0A3E1ED2-DA8A-4F0E-AA95-61DEC71768D6
+
+The enabled unit is a :term:`systemd oneshot` service, meant to start at boot
+after the local filesystem is ready to be used.
+
+
+Failure and Retries
+-------------------
+It is common to have failures when a system is coming up online. The devices
+are sometimes not fully available and this unpredictable behavior may cause an
+OSD to not be ready to be used.
+
+There are two configurable environment variables used to set the retry
+behavior:
+
+* ``CEPH_VOLUME_SYSTEMD_TRIES``: Defaults to 30
+* ``CEPH_VOLUME_SYSTEMD_INTERVAL``: Defaults to 5
+
+The *"tries"* is a number that sets the maximum amount of times the unit will
+attempt to activate an OSD before giving up.
+
+The *"interval"* is a value in seconds that determines the waiting time before
+initiating another try at activating the OSD.
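+
+One way these could be overridden is with a systemd drop-in for the template
+unit (a sketch; the drop-in file name is hypothetical)::
+
+ # /etc/systemd/system/ceph-volume@.service.d/retries.conf
+ [Service]
+ Environment=CEPH_VOLUME_SYSTEMD_TRIES=10
+ Environment=CEPH_VOLUME_SYSTEMD_INTERVAL=2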