Cache pool
==========

Purpose
-------

Use a pool of fast storage devices (typically SSDs) as a cache for an
existing, slower and larger pool.

Use a replicated pool as a front end to service most I/O, and destage
cold data to a separate erasure-coded pool that does not currently
(and cannot efficiently) handle the workload directly.

We should be able to create and add a cache pool to an existing data
pool, and later remove it, without disrupting service or migrating
data around.

Use cases
---------

Read-write pool, writeback
~~~~~~~~~~~~~~~~~~~~~~~~~~

We have an existing data pool and put a fast cache pool "in front" of
it.  Writes go to the cache pool and are acknowledged immediately.  We
flush them back to the data pool based on the defined policy.
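
A quick way to observe this (a sketch, assuming the ``foo``/``foo-hot``
writeback setup shown in the Interface section below)::

  # payload.bin can be any local file
  rados -p foo put obj1 ./payload.bin   # write is serviced by the cache tier
  rados -p foo-hot ls                   # obj1 now appears in foo-hot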

Read-only pool, weak consistency
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We have an existing data pool and add one or more read-only cache
pools.  We copy data to the cache pool(s) on read.  Writes are
forwarded to the original data pool.  Stale data is expired from the
cache pools based on the defined policy.

This is likely only useful for specific applications with specific
data access patterns.  It may be a match for rgw, for example.


Interface
---------

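The examples below assume the base pool ``foo`` and the cache pool
``foo-hot`` already exist; a minimal sketch for creating them (the
pool names and PG counts are illustrative only)::

  ceph osd pool create foo 128
  ceph osd pool create foo-hot 128
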
Set up a read/write cache pool foo-hot for pool foo::

  ceph osd tier add foo foo-hot
  ceph osd tier cache-mode foo-hot writeback

Direct all traffic for foo to foo-hot::

  ceph osd tier set-overlay foo foo-hot
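
Whether the tiering relationship took effect can be checked in the OSD
map (a sketch; the exact output format varies by release)::

  ceph osd dump | grep foo   # pool lines show tiers, read_tier and write_tier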

Set the target size and enable the tiering agent for foo-hot::

  ceph osd pool set foo-hot hit_set_type bloom
  ceph osd pool set foo-hot hit_set_count 1
  ceph osd pool set foo-hot hit_set_period 3600   # 1 hour
  ceph osd pool set foo-hot target_max_bytes 1000000000000   # 1 TB
  ceph osd pool set foo-hot min_read_recency_for_promote 1
  ceph osd pool set foo-hot min_write_recency_for_promote 1

Drain the cache in preparation for turning it off::

  ceph osd tier cache-mode foo-hot forward
  rados -p foo-hot cache-flush-evict-all

When the cache pool is finally empty, disable it::

  ceph osd tier remove-overlay foo
  ceph osd tier remove foo foo-hot
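
Whether the cache has actually drained can be verified before removing
the overlay (a simple sketch)::

  rados -p foo-hot ls   # prints nothing once the cache pool is empty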

Read-only pools with weak (lazy) consistency::

  ceph osd tier add foo foo-east
  ceph osd tier cache-mode foo-east readonly
  ceph osd tier add foo foo-west
  ceph osd tier cache-mode foo-west readonly


Tiering agent
-------------

The tiering policy is defined as properties on the cache pool itself.

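The current values of these properties can be listed with ``ceph osd
pool get``, for example::

  ceph osd pool get foo-hot all
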
HitSet metadata
~~~~~~~~~~~~~~~

First, the agent requires HitSet information to be tracked on the
cache pool in order to determine which objects in the pool are being
accessed.  This is enabled with::

  ceph osd pool set foo-hot hit_set_type bloom
  ceph osd pool set foo-hot hit_set_count 1
  ceph osd pool set foo-hot hit_set_period 3600   # 1 hour

The supported HitSet types include 'bloom' (a bloom filter, the
default), 'explicit_hash', and 'explicit_object'.  The latter two
explicitly enumerate accessed objects and are less memory efficient.
They exist primarily for debugging and to demonstrate pluggability of
the infrastructure.  For the bloom filter type, you can additionally
define the false positive probability for the bloom filter (default is
0.05)::

  ceph osd pool set foo-hot hit_set_fpp 0.15

``hit_set_period`` defines how much time each HitSet should cover, and
``hit_set_count`` defines how many such HitSets to store.  Binning
accesses over time allows Ceph to independently determine whether an
object was accessed at least once and whether it was accessed more
than once over some time period ("age" vs "temperature").
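
For example, to bin accesses into four 15-minute HitSets covering the
last hour (a sketch; the right values are workload dependent)::

  ceph osd pool set foo-hot hit_set_count 4
  ceph osd pool set foo-hot hit_set_period 900   # 4 x 15 minutes = 1 hour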

``min_read_recency_for_promote`` defines how many HitSets to check for
the existence of an object when handling a read operation.  The result
is used to decide whether to promote the object asynchronously.  Its
value should be between 0 and ``hit_set_count``.  If it is set to 0,
the object is always promoted.  If it is set to 1, only the current
HitSet is checked, and the object is promoted only if it is found
there.  For greater values, that number of the most recent HitSets is
checked, and the object is promoted if it is found in any of them.

A similar parameter, ``min_write_recency_for_promote``, can be set for
write operations. ::

  ceph osd pool set {cachepool} min_read_recency_for_promote 1
  ceph osd pool set {cachepool} min_write_recency_for_promote 1

Note that the longer the ``hit_set_period`` and the higher the
``min_read_recency_for_promote``/``min_write_recency_for_promote``, the
more RAM the ceph-osd process will consume.  In particular, when the
agent is active to flush or evict cache objects, all ``hit_set_count``
HitSets are loaded into RAM.

Cache mode
~~~~~~~~~~

The most important policy is the cache mode::

  ceph osd tier cache-mode foo-hot writeback

The supported modes are 'none', 'writeback', 'forward', and
'readonly'.  Most installations want 'writeback', which will write
into the cache tier and only later flush updates back to the base
tier.  Similarly, any object that is read will be promoted into the
cache tier.

The 'forward' mode is intended for when the cache is being disabled
and needs to be drained.  No new objects will be promoted or written
to the cache pool unless they are already present.  A background
operation can then do something like::

  rados -p foo-hot cache-try-flush-evict-all
  rados -p foo-hot cache-flush-evict-all

to force all data to be flushed back to the base tier.

The 'readonly' mode is intended for read-only workloads that do not
require consistency to be enforced by the storage system.  Writes will
be forwarded to the base tier, but objects that are read will get
promoted to the cache.  No attempt is made by Ceph to ensure that the
contents of the cache tier(s) are consistent in the presence of object
updates.

Cache sizing
~~~~~~~~~~~~

The agent performs two basic functions: flushing (writing 'dirty'
cache objects back to the base tier) and evicting (removing cold and
clean objects from the cache).

The thresholds at which Ceph will flush or evict objects are specified
relative to a 'target size' of the pool.  For example::

  ceph osd pool set foo-hot cache_target_dirty_ratio .4
  ceph osd pool set foo-hot cache_target_dirty_high_ratio .6
  ceph osd pool set foo-hot cache_target_full_ratio .8

will begin flushing dirty objects when 40% of the pool is dirty, flush
dirty objects more aggressively once 60% is dirty, and begin evicting
clean objects when the pool reaches 80% of the target size.

The target size can be specified either in terms of objects or bytes::

  ceph osd pool set foo-hot target_max_bytes 1000000000000   # 1 TB
  ceph osd pool set foo-hot target_max_objects 1000000       # 1 million objects

Note that if both limits are specified, Ceph will begin flushing or
evicting when either threshold is triggered.
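
Putting the knobs together, a worked example using the values above::

  # target_max_bytes = 1 TB, cache_target_dirty_ratio = .4
  #   => flushing starts at ~400 GB of dirty data
  # cache_target_dirty_high_ratio = .6
  #   => flushing becomes more aggressive at ~600 GB of dirty data
  # cache_target_full_ratio = .8
  #   => eviction starts once the cache holds ~800 GB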

Other tunables
~~~~~~~~~~~~~~

You can specify a minimum object age before a recently updated object
is flushed to the base tier::

  ceph osd pool set foo-hot cache_min_flush_age 600   # 10 minutes

You can specify the minimum age of an object before it will be evicted
from the cache tier::

  ceph osd pool set foo-hot cache_min_evict_age 1800   # 30 minutes