diff --git a/src/ceph/doc/mgr/prometheus.rst b/src/ceph/doc/mgr/prometheus.rst
new file mode 100644
index 0000000..5bae6a9
--- /dev/null
+++ b/src/ceph/doc/mgr/prometheus.rst
@@ -0,0 +1,219 @@
+=================
+Prometheus plugin
+=================
+
+Provides a Prometheus exporter to pass on Ceph performance counters
+from the collection point in ceph-mgr. Ceph-mgr receives MMgrReport
+messages from all MgrClient processes (mons and OSDs, for instance)
+with performance counter schema data and actual counter data, and keeps
+a circular buffer of the last N samples. This plugin creates an HTTP
+endpoint (like all Prometheus exporters) and retrieves the latest sample
+of every counter when polled (or "scraped" in Prometheus terminology).
+The HTTP path and query parameters are ignored; all extant counters
+for all reporting entities are returned in text exposition format.
+(See the Prometheus `documentation <https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details>`_.)
+
+Enabling prometheus output
+==========================
+
+The *prometheus* module is enabled with::
+
+ ceph mgr module enable prometheus
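+
+Once the module is enabled, it should be listed as enabled in the output of
+``ceph mgr module ls``. Depending on your Ceph version, ``ceph mgr services``
+should also report the HTTP endpoint the module is serving on::
+
+    ceph mgr module ls
+    ceph mgr services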
+
+Configuration
+-------------
+
+By default the module will accept HTTP requests on port ``9283`` on all
+IPv4 and IPv6 addresses on the host. The port and listen address are both
+configurable with ``ceph config-key set``, with keys
+``mgr/prometheus/server_addr`` and ``mgr/prometheus/server_port``.
+This port is registered with Prometheus's `registry <https://github.com/prometheus/prometheus/wiki/Default-port-allocations>`_.
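+
+For example, to bind the module to a specific address and set the port
+explicitly (the address shown is only illustrative)::
+
+    ceph config-key set mgr/prometheus/server_addr 192.168.0.10
+    ceph config-key set mgr/prometheus/server_port 9283
+
+Since the HTTP path is ignored, any HTTP client can be used for a quick
+check of the output, e.g.::
+
+    curl http://192.168.0.10:9283/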
+
+Statistic names and labels
+==========================
+
+The names of the stats are exactly as Ceph names them, with
+illegal characters ``.``, ``-`` and ``::`` translated to ``_``,
+and ``ceph_`` prefixed to all names.
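+
+For example, a counter reported internally under a hypothetical name like
+``osd.recovery-ops`` would be exported as::
+
+    ceph_osd_recovery_ops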
+
+
+All *daemon* statistics have a ``ceph_daemon`` label such as "osd.123"
+that identifies the type and ID of the daemon they come from. Some
+statistics can come from more than one type of daemon, so when querying
+e.g. an OSD's RocksDB stats, you would probably want to filter
+on ``ceph_daemon`` values starting with "osd" to avoid mixing in the
+monitor RocksDB stats.
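+
+For example, the following query keeps only the OSD series of a RocksDB
+counter (the metric name here is hypothetical, shown purely to illustrate
+the label matcher)::
+
+    ceph_rocksdb_submit_transaction{ceph_daemon=~"osd.*"}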
+
+
+The *cluster* statistics (i.e. those global to the Ceph cluster)
+have labels appropriate to what they report on. For example,
+metrics relating to pools have a ``pool_id`` label.
+
+Pool and OSD metadata series
+----------------------------
+
+Special series are output to enable displaying and querying on
+certain metadata fields.
+
+Pools have a ``ceph_pool_metadata`` series like this:
+
+::
+
+ ceph_pool_metadata{pool_id="2",name="cephfs_metadata_a"} 0.0
+
+OSDs have a ``ceph_osd_metadata`` series like this:
+
+::
+
+ ceph_osd_metadata{cluster_addr="172.21.9.34:6802/19096",device_class="ssd",id="0",public_addr="172.21.9.34:6801/19096",weight="1.0"} 0.0
+
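+These metadata series can be joined onto other series to attach
+human-readable names. For example, assuming a pool statistic named
+``ceph_pool_objects`` exists (the exact metric names depend on your Ceph
+version), a query along these lines attaches the pool ``name`` label;
+because the metadata series has a value of ``0.0``, adding it leaves the
+original value unchanged::
+
+    ceph_pool_objects + on (pool_id) group_left(name) ceph_pool_metadata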
+
+Correlating drive statistics with node_exporter
+-----------------------------------------------
+
+The Prometheus output from Ceph is designed to be used in conjunction
+with the generic host monitoring from the Prometheus node_exporter.
+
+To enable correlation of Ceph OSD statistics with node_exporter's
+drive statistics, special series are output like this:
+
+::
+
+ ceph_disk_occupation{ceph_daemon="osd.0",device="sdd",instance="myhost",job="ceph"}
+
+To use this to get disk statistics by OSD ID, use the ``and on`` syntax
+in your Prometheus query, like this:
+
+::
+
+ rate(node_disk_bytes_written[30s]) and on (device,instance) ceph_disk_occupation{ceph_daemon="osd.0"}
+
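+For example, to get the total write throughput across all devices that
+are backing OSDs (a sketch, assuming the node_exporter series shown
+above)::
+
+    sum(rate(node_disk_bytes_written[30s]) and on (device, instance) ceph_disk_occupation)
+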
+See the Prometheus documentation for more information about constructing
+queries.
+
+Note that for this mechanism to work, Ceph and node_exporter must agree
+about the values of the ``instance`` label. See the following section
+for guidance on how to set up Prometheus in a way that sets
+``instance`` properly.
+
+Configuring Prometheus server
+=============================
+
+See the Prometheus documentation for full details of how to add
+scrape endpoints. The notes in this section are tips on how to
+configure Prometheus to capture the Ceph statistics in the most
+usefully labelled form.
+
+This configuration is necessary because Ceph reports metrics from many
+hosts and services via a single endpoint, and because some metrics
+(such as pool statistics) relate to no physical host.
+
+honor_labels
+------------
+
+To enable Ceph to output properly labelled data relating to any host,
+use the ``honor_labels`` setting when adding the ceph-mgr endpoints
+to your Prometheus configuration.
+
+Without this setting, any ``instance`` labels that Ceph outputs, such
+as those in ``ceph_disk_occupation`` series, will be overridden
+by Prometheus.
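+
+A minimal sketch of such a scrape job, using a placeholder hostname and
+the default port (the full example configuration below uses file-based
+service discovery instead), might look like this::
+
+    scrape_configs:
+      - job_name: 'ceph'
+        honor_labels: true
+        static_configs:
+          - targets: ['mgr-host.example.com:9283']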
+
+Ceph instance label
+-------------------
+
+By default, Prometheus applies an ``instance`` label that includes
+the hostname and port of the endpoint that the series came from. Because
+Ceph clusters have multiple manager daemons, this results in an ``instance``
+label that changes spuriously when the active manager daemon changes.
+
+Set a custom ``instance`` label in your Prometheus target configuration:
+you might wish to set it to the hostname of your first monitor, or something
+completely arbitrary like "ceph_cluster".
+
+node_exporter instance labels
+-----------------------------
+
+Set your ``instance`` labels to match what appears in Ceph's OSD metadata
+in the ``hostname`` field. This is generally the short hostname of the node.
+
+This is only necessary if you want to correlate Ceph stats with host stats,
+but you may find it useful to set the labels this way in all cases, so
+that the correlation is available if you want it in the future.
+
+Example configuration
+---------------------
+
+This example shows a single node configuration running ceph-mgr and
+node_exporter on a server called ``senta04``.
+
+This is just an example: there are other ways to configure Prometheus
+scrape targets and label rewrite rules.
+
+prometheus.yml
+~~~~~~~~~~~~~~
+
+::
+
+    global:
+      scrape_interval: 15s
+      evaluation_interval: 15s
+
+    scrape_configs:
+      - job_name: 'node'
+        file_sd_configs:
+          - files:
+              - node_targets.yml
+      - job_name: 'ceph'
+        honor_labels: true
+        file_sd_configs:
+          - files:
+              - ceph_targets.yml
+
+
+ceph_targets.yml
+~~~~~~~~~~~~~~~~
+
+
+::
+
+    [
+        {
+            "targets": [ "senta04.mydomain.com:9283" ],
+            "labels": {
+                "instance": "ceph_cluster"
+            }
+        }
+    ]
+
+
+node_targets.yml
+~~~~~~~~~~~~~~~~
+
+::
+
+    [
+        {
+            "targets": [ "senta04.mydomain.com:9100" ],
+            "labels": {
+                "instance": "senta04"
+            }
+        }
+    ]
+
+
+Notes
+=====
+
+Counters and gauges are exported; currently histograms and long-running
+averages are not. It's possible that Ceph's 2-D histograms could be
+reduced to two separate 1-D histograms, and that long-running averages
+could be exported as Prometheus' Summary type.
+
+Timestamps, as with many Prometheus exporters, are established by
+the server's scrape time (Prometheus expects that it is polling the
+actual counter process synchronously). It is possible to supply a
+timestamp along with the stat report, but the Prometheus team strongly
+advises against this. This means that timestamps will be delayed by
+an unpredictable amount; it's not clear if this will be problematic,
+but it's worth knowing about.