diff options
Diffstat (limited to 'src/ceph/doc/mgr/prometheus.rst')
-rw-r--r-- | src/ceph/doc/mgr/prometheus.rst | 219 |
1 files changed, 219 insertions, 0 deletions
diff --git a/src/ceph/doc/mgr/prometheus.rst b/src/ceph/doc/mgr/prometheus.rst new file mode 100644 index 0000000..5bae6a9 --- /dev/null +++ b/src/ceph/doc/mgr/prometheus.rst @@ -0,0 +1,219 @@ +================= +Prometheus plugin +================= + +Provides a Prometheus exporter to pass on Ceph performance counters +from the collection point in ceph-mgr. Ceph-mgr receives MMgrReport +messages from all MgrClient processes (mons and OSDs, for instance) +with performance counter schema data and actual counter data, and keeps +a circular buffer of the last N samples. This plugin creates an HTTP +endpoint (like all Prometheus exporters) and retrieves the latest sample +of every counter when polled (or "scraped" in Prometheus terminology). +The HTTP path and query parameters are ignored; all extant counters +for all reporting entities are returned in text exposition format. +(See the Prometheus `documentation <https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details>`_.) + +Enabling prometheus output +========================== + +The *prometheus* module is enabled with:: + + ceph mgr module enable prometheus + +Configuration +------------- + +By default the module will accept HTTP requests on port ``9283`` on all +IPv4 and IPv6 addresses on the host. The port and listen address are both +configurable with ``ceph config-key set``, with keys +``mgr/prometheus/server_addr`` and ``mgr/prometheus/server_port``. +This port is registered with Prometheus's `registry <https://github.com/prometheus/prometheus/wiki/Default-port-allocations>`_. + +Statistic names and labels +========================== + +The names of the stats are exactly as Ceph names them, with +illegal characters ``.``, ``-`` and ``::`` translated to ``_``, +and ``ceph_`` prefixed to all names. + + +All *daemon* statistics have a ``ceph_daemon`` label such as "osd.123" +that identifies the type and ID of the daemon they come from. Some +statistics can come from different types of daemon, so when querying +e.g. an OSD's RocksDB stats, you would probably want to filter +on ceph_daemon starting with "osd" to avoid mixing in the monitor +rocksdb stats. + + +The *cluster* statistics (i.e. those global to the Ceph cluster) +have labels appropriate to what they report on. For example, +metrics relating to pools have a ``pool_id`` label. + +Pool and OSD metadata series +---------------------------- + +Special series are output to enable displaying and querying on +certain metadata fields. + +Pools have a ``ceph_pool_metadata`` field like this: + +:: + + ceph_pool_metadata{pool_id="2",name="cephfs_metadata_a"} 0.0 + +OSDs have a ``ceph_osd_metadata`` field like this: + +:: + + ceph_osd_metadata{cluster_addr="172.21.9.34:6802/19096",device_class="ssd",id="0",public_addr="172.21.9.34:6801/19096",weight="1.0"} 0.0 + + +Correlating drive statistics with node_exporter +----------------------------------------------- + +The prometheus output from Ceph is designed to be used in conjunction +with the generic host monitoring from the Prometheus node_exporter. + +To enable correlation of Ceph OSD statistics with node_exporter's +drive statistics, special series are output like this: + +:: + + ceph_disk_occupation{ceph_daemon="osd.0",device="sdd",instance="myhost",job="ceph"} + +To use this to get disk statistics by OSD ID, use the ``and on`` syntax +in your prometheus query like this: + +:: + + rate(node_disk_bytes_written[30s]) and on (device,instance) ceph_disk_occupation{ceph_daemon="osd.0"} + +See the prometheus documentation for more information about constructing +queries. + +Note that for this mechanism to work, Ceph and node_exporter must agree +about the values of the ``instance`` label. See the following section +for guidance about to to set up Prometheus in a way that sets +``instance`` properly. + +Configuring Prometheus server +============================= + +See the prometheus documentation for full details of how to add +scrape endpoints: the notes +in this section are tips on how to configure Prometheus to capture +the Ceph statistics in the most usefully-labelled form. + +This configuration is necessary because Ceph is reporting metrics +from many hosts and services via a single endpoint, and some +metrics that relate to no physical host (such as pool statistics). + +honor_labels +------------ + +To enable Ceph to output properly-labelled data relating to any host, +use the ``honor_labels`` setting when adding the ceph-mgr endpoints +to your prometheus configuration. + +Without this setting, any ``instance`` labels that Ceph outputs, such +as those in ``ceph_disk_occupation`` series, will be overridden +by Prometheus. + +Ceph instance label +------------------- + +By default, Prometheus applies an ``instance`` label that includes +the hostname and port of the endpoint that the series game from. Because +Ceph clusters have multiple manager daemons, this results in an ``instance`` +label that changes spuriously when the active manager daemon changes. + +Set a custom ``instance`` label in your Prometheus target configuration: +you might wish to set it to the hostname of your first monitor, or something +completely arbitrary like "ceph_cluster". + +node_exporter instance labels +----------------------------- + +Set your ``instance`` labels to match what appears in Ceph's OSD metadata +in the ``hostname`` field. This is generally the short hostname of the node. + +This is only necessary if you want to correlate Ceph stats with host stats, +but you may find it useful to do it in all cases in case you want to do +the correlation in the future. + +Example configuration +--------------------- + +This example shows a single node configuration running ceph-mgr and +node_exporter on a server called ``senta04``. + +This is just an example: there are other ways to configure prometheus +scrape targets and label rewrite rules. + +prometheus.yml +~~~~~~~~~~~~~~ + +:: + + global: + scrape_interval: 15s + evaluation_interval: 15s + + scrape_configs: + - job_name: 'node' + file_sd_configs: + - files: + - node_targets.yml + - job_name: 'ceph' + honor_labels: true + file_sd_configs: + - files: + - ceph_targets.yml + + +ceph_targets.yml +~~~~~~~~~~~~~~~~ + + +:: + + [ + { + "targets": [ "senta04.mydomain.com:9283" ], + "labels": { + "instance": "ceph_cluster" + } + } + ] + + +node_targets.yml +~~~~~~~~~~~~~~~~ + +:: + + [ + { + "targets": [ "senta04.mydomain.com:9100" ], + "labels": { + "instance": "senta04" + } + } + ] + + +Notes +===== + +Counters and gauges are exported; currently histograms and long-running +averages are not. It's possible that Ceph's 2-D histograms could be +reduced to two separate 1-D histograms, and that long-running averages +could be exported as Prometheus' Summary type. + +Timestamps, as with many Prometheus exporters, are established by +the server's scrape time (Prometheus expects that it is polling the +actual counter process synchronously). It is possible to supply a +timestamp along with the stat report, but the Prometheus team strongly +advises against this. This means that timestamps will be delayed by +an unpredictable amount; it's not clear if this will be problematic, +but it's worth knowing about. |