=============
Influx Plugin 
=============

The influx plugin continuously collects and sends time series data to an
InfluxDB database.

The influx plugin was introduced in the 13.x *Mimic* release.

--------
Enabling 
--------

To enable the module, use the following command:

::

    ceph mgr module enable influx

If you wish to subsequently disable the module, you can use the equivalent
*disable* command:

::

    ceph mgr module disable influx

-------------
Configuration 
-------------

For the influx module to send statistics to an InfluxDB server, you must
configure the server's address and authentication credentials.

Set configuration values using the following command:

::

    ceph config-key set mgr/influx/<key> <value>


The most important settings are ``hostname``, ``username`` and ``password``.  
For example, a typical configuration might look like this:

::

    ceph config-key set mgr/influx/hostname influx.mydomain.com
    ceph config-key set mgr/influx/username admin123
    ceph config-key set mgr/influx/password p4ssw0rd
    
Additional optional configuration settings are:

:interval: Time between reports to InfluxDB.  Default 5 seconds.
:database: InfluxDB database name.  Default "ceph".  Either create this database and grant write privileges to the configured username, or give that username admin privileges so that it can create the database.
:port: InfluxDB server port.  Default 8086.
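
As an illustration of how the settings above map onto ``ceph config-key set``
commands, here is a minimal Python sketch. The helper name
``build_config_commands`` and the ``settings`` dictionary are hypothetical and
not part of Ceph; the values are the example values used in this section. The
sketch only builds the command strings and does not run them.

```python
import shlex


def build_config_commands(settings):
    """Return one `ceph config-key set` command line per setting.

    Hypothetical helper for illustration only; it is not part of Ceph.
    """
    commands = []
    for key, value in settings.items():
        argv = ["ceph", "config-key", "set", f"mgr/influx/{key}", str(value)]
        # Quote each argument so unusual characters survive the shell.
        commands.append(" ".join(shlex.quote(part) for part in argv))
    return commands


# Example values taken from this section; replace them with your own.
settings = {
    "hostname": "influx.mydomain.com",
    "username": "admin123",
    "password": "p4ssw0rd",
    "interval": 5,
    "database": "ceph",
    "port": 8086,
}

if __name__ == "__main__":
    for line in build_config_commands(settings):
        print(line)
```

Review the printed commands before running them on a node that has access to
the cluster.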
    

---------
Debugging 
---------

By default, a few debugging statements and error statements are written to the
log files. Users can add more if necessary.
To make use of the debugging option in the module:

- Add the following to the ceph.conf file::

    [mgr]
        debug_mgr = 20

- Run the self-test command: ``ceph tell mgr.<mymonitor> influx self-test``.
- Check the log files. Users may find it easier to filter the log files using *mgr[influx]*.
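
The filtering step above can be sketched in Python as follows. This is a
minimal example under stated assumptions: the function name
``filter_influx_lines`` is hypothetical, and the sample lines are invented
placeholders, not real ceph-mgr log output.

```python
def filter_influx_lines(lines, marker="mgr[influx]"):
    """Keep only the lines that contain the module marker.

    Hypothetical helper; `lines` can be any iterable of strings,
    including an open log file.
    """
    return [line.rstrip("\n") for line in lines if marker in line]


# Invented placeholder lines; real log entries will look different.
SAMPLE_LOG = [
    "debug mgr[influx] sending report\n",
    "debug mgr[dashboard] serving request\n",
    "error mgr[influx] connection failed\n",
]

if __name__ == "__main__":
    for entry in filter_influx_lines(SAMPLE_LOG):
        print(entry)
```

To use it on a real log, pass an open file object for the manager log; the
location of that file depends on the deployment.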

--------------------
Interesting counters
--------------------

The following tables describe a subset of the values output by
this module.

^^^^^
Pools
^^^^^

+---------------+-----------------------------------------------------+
|Counter        | Description                                         |
+===============+=====================================================+
|bytes_used     | Bytes used in the pool not including copies         |
+---------------+-----------------------------------------------------+
|max_avail      | Max available number of bytes in the pool           |
+---------------+-----------------------------------------------------+
|objects        | Number of objects in the pool                       |
+---------------+-----------------------------------------------------+
|wr_bytes       | Number of bytes written in the pool                 |
+---------------+-----------------------------------------------------+
|dirty          | Number of bytes dirty in the pool                   |
+---------------+-----------------------------------------------------+
|rd_bytes       | Number of bytes read in the pool                    |
+---------------+-----------------------------------------------------+
|raw_bytes_used | Bytes used in pool including copies made            |
+---------------+-----------------------------------------------------+

^^^^
OSDs
^^^^

+------------+------------------------------------+
|Counter     | Description                        |
+============+====================================+
|op_w        | Client write operations            |
+------------+------------------------------------+
|op_in_bytes | Client operations total write size |
+------------+------------------------------------+
|op_r        | Client read operations             |
+------------+------------------------------------+
|op_out_bytes| Client operations total read size  |
+------------+------------------------------------+


+------------------------+--------------------------------------------------------------------------+
|Counter                 | Description                                                              |
+========================+==========================================================================+
|op_wip                  | Replication operations currently being processed (primary)               |
+------------------------+--------------------------------------------------------------------------+
|op_latency              | Latency of client operations (including queue time)                      |
+------------------------+--------------------------------------------------------------------------+
|op_process_latency      | Latency of client operations (excluding queue time)                      |
+------------------------+--------------------------------------------------------------------------+
|op_prepare_latency      | Latency of client operations (excluding queue time and wait for finished)|
+------------------------+--------------------------------------------------------------------------+
|op_r_latency            | Latency of read operation (including queue time)                         |
+------------------------+--------------------------------------------------------------------------+
|op_r_process_latency    | Latency of read operation (excluding queue time)                         |
+------------------------+--------------------------------------------------------------------------+
|op_w_in_bytes           | Client data written                                                      |
+------------------------+--------------------------------------------------------------------------+
|op_w_latency            | Latency of write operation (including queue time)                        |
+------------------------+--------------------------------------------------------------------------+
|op_w_process_latency    | Latency of write operation (excluding queue time)                        |
+------------------------+--------------------------------------------------------------------------+
|op_w_prepare_latency    | Latency of write operations (excluding queue time and wait for finished) |
+------------------------+--------------------------------------------------------------------------+
|op_rw                   | Client read-modify-write operations                                      |
+------------------------+--------------------------------------------------------------------------+
|op_rw_in_bytes          | Client read-modify-write operations write in                             |
+------------------------+--------------------------------------------------------------------------+
|op_rw_out_bytes         | Client read-modify-write operations read out                             |
+------------------------+--------------------------------------------------------------------------+
|op_rw_latency           | Latency of read-modify-write operation (including queue time)            |
+------------------------+--------------------------------------------------------------------------+
|op_rw_process_latency   | Latency of read-modify-write operation (excluding queue time)            |
+------------------------+--------------------------------------------------------------------------+
|op_rw_prepare_latency   | Latency of read-modify-write operations (excluding queue time            |
|                        | and wait for finished)                                                   |
+------------------------+--------------------------------------------------------------------------+
|op_before_queue_op_lat  | Latency of IO before calling queue (before really queue into ShardedOpWq)|
+------------------------+--------------------------------------------------------------------------+
|op_before_dequeue_op_lat| Latency of IO before calling dequeue_op(already dequeued and get PG lock)|
+------------------------+--------------------------------------------------------------------------+

Latency counters are measured in microseconds unless otherwise specified in the description.
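
When graphing these counters, it can help to convert them to a larger unit.
The following is a small sketch that assumes the value is in microseconds, as
stated above; the function names are hypothetical.

```python
def us_to_ms(value_us):
    """Convert a latency from microseconds to milliseconds."""
    return value_us / 1000.0


def us_to_s(value_us):
    """Convert a latency from microseconds to seconds."""
    return value_us / 1_000_000.0


if __name__ == "__main__":
    # 1500 microseconds expressed in milliseconds and in seconds.
    print(us_to_ms(1500), us_to_s(1500))
```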