summaryrefslogtreecommitdiffstats
path: root/src/ceph/doc/dev/osd_internals/osd_throttles.rst
diff options
context:
space:
mode:
authorQiaowei Ren <qiaowei.ren@intel.com>2018-01-04 13:43:33 +0800
committerQiaowei Ren <qiaowei.ren@intel.com>2018-01-05 11:59:39 +0800
commit812ff6ca9fcd3e629e49d4328905f33eee8ca3f5 (patch)
tree04ece7b4da00d9d2f98093774594f4057ae561d4 /src/ceph/doc/dev/osd_internals/osd_throttles.rst
parent15280273faafb77777eab341909a3f495cf248d9 (diff)
initial code repo
This patch creates initial code repo. For ceph, luminous stable release will be used for base code, and next changes and optimization for ceph will be added to it. For opensds, currently any changes can be upstreamed into original opensds repo (https://github.com/opensds/opensds), and so stor4nfv will directly clone opensds code to deploy stor4nfv environment. And the scripts for deployment based on ceph and opensds will be put into 'ci' directory. Change-Id: I46a32218884c75dda2936337604ff03c554648e4 Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Diffstat (limited to 'src/ceph/doc/dev/osd_internals/osd_throttles.rst')
-rw-r--r--src/ceph/doc/dev/osd_internals/osd_throttles.rst93
1 files changed, 93 insertions, 0 deletions
diff --git a/src/ceph/doc/dev/osd_internals/osd_throttles.rst b/src/ceph/doc/dev/osd_internals/osd_throttles.rst
new file mode 100644
index 0000000..6739bd9
--- /dev/null
+++ b/src/ceph/doc/dev/osd_internals/osd_throttles.rst
@@ -0,0 +1,93 @@
+=============
+OSD Throttles
+=============
+
+There are three significant throttles in the filestore: wbthrottle,
+op_queue_throttle, and a throttle based on journal usage.
+
+WBThrottle
+----------
+The WBThrottle is defined in src/os/filestore/WBThrottle.[h,cc] and
+included in FileStore as FileStore::wbthrottle. The intention is to
+bound the amount of outstanding IO we need to do to flush the journal.
+At the same time, we don't want to necessarily do it inline in case we
+might be able to combine several IOs on the same object close together
+in time. Thus, in FileStore::_write, we queue the fd for asyncronous
+flushing and block in FileStore::_do_op if we have exceeded any hard
+limits until the background flusher catches up.
+
+The relevant config options are filestore_wbthrottle*. There are
+different defaults for xfs and btrfs. Each set has hard and soft
+limits on bytes (total dirty bytes), ios (total dirty ios), and
+inodes (total dirty fds). The WBThrottle will begin flushing
+when any of these hits the soft limit and will block in throttle()
+while any has exceeded the hard limit.
+
+Tighter soft limits will cause writeback to happen more quickly,
+but may cause the OSD to miss oportunities for write coalescing.
+Tighter hard limits may cause a reduction in latency variance by
+reducing time spent flushing the journal, but may reduce writeback
+parallelism.
+
+op_queue_throttle
+-----------------
+The op queue throttle is intended to bound the amount of queued but
+uncompleted work in the filestore by delaying threads calling
+queue_transactions more and more based on how many ops and bytes are
+currently queued. The throttle is taken in queue_transactions and
+released when the op is applied to the filesystem. This period
+includes time spent in the journal queue, time spent writing to the
+journal, time spent in the actual op queue, time spent waiting for the
+wbthrottle to open up (thus, the wbthrottle can push back indirectly
+on the queue_transactions caller), and time spent actually applying
+the op to the filesystem. A BackoffThrottle is used to gradually
+delay the queueing thread after each throttle becomes more than
+filestore_queue_low_threshhold full (a ratio of
+filestore_queue_max_(bytes|ops)). The throttles will block once the
+max value is reached (filestore_queue_max_(bytes|ops)).
+
+The significant config options are:
+filestore_queue_low_threshhold
+filestore_queue_high_threshhold
+filestore_expected_throughput_ops
+filestore_expected_throughput_bytes
+filestore_queue_high_delay_multiple
+filestore_queue_max_delay_multiple
+
+While each throttle is at less than low_threshhold of the max,
+no delay happens. Between low and high, the throttle will
+inject a per-op delay (per op or byte) ramping from 0 at low to
+high_delay_multiple/expected_throughput at high. From high to
+1, the delay will ramp from high_delay_multiple/expected_throughput
+to max_delay_multiple/expected_throughput.
+
+filestore_queue_high_delay_multiple and
+filestore_queue_max_delay_multiple probably do not need to be
+changed.
+
+Setting these properly should help to smooth out op latencies by
+mostly avoiding the hard limit.
+
+See FileStore::throttle_ops and FileSTore::thottle_bytes.
+
+journal usage throttle
+----------------------
+See src/os/filestore/JournalThrottle.h/cc
+
+The intention of the journal usage throttle is to gradually slow
+down queue_transactions callers as the journal fills up in order
+to smooth out hiccup during filestore syncs. JournalThrottle
+wraps a BackoffThrottle and tracks journaled but not flushed
+journal entries so that the throttle can be released when the
+journal is flushed. The configs work very similarly to the
+op_queue_throttle.
+
+The significant config options are:
+journal_throttle_low_threshhold
+journal_throttle_high_threshhold
+filestore_expected_throughput_ops
+filestore_expected_throughput_bytes
+journal_throttle_high_multiple
+journal_throttle_max_multiple
+
+.. literalinclude:: osd_throttles.txt