summaryrefslogtreecommitdiffstats
path: root/src/ceph/doc/rados/troubleshooting/log-and-debug.rst
blob: c91f27218dd5c98a5ee1f53041c96b43036780d6 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
=======================
 Logging and Debugging
=======================

Typically, when you add debugging to your Ceph configuration, you do so at
runtime. You can also add Ceph debug logging to your Ceph configuration file if
you are encountering issues when starting your cluster. You may view Ceph log
files under ``/var/log/ceph`` (the default location).

.. tip:: When debug output slows down your system, the latency can hide 
   race conditions.

Logging is resource intensive. If you are encountering a problem in a specific
area of your cluster, enable logging for that area of the cluster. For example,
if your OSDs are running fine, but your metadata servers are not, you should
start by enabling debug logging for the specific metadata server instance(s)
giving you trouble. Enable logging for each subsystem as needed.

.. important:: Verbose logging can generate over 1GB of data per hour. If your 
   OS disk reaches its capacity, the node will stop working.
   
If you enable or increase the rate of Ceph logging, ensure that you have
sufficient disk space on your OS disk.  See `Accelerating Log Rotation`_ for
details on rotating log files. When your system is running well, remove
unnecessary debugging settings to ensure your cluster runs optimally. Logging
debug output messages is relatively slow, and a waste of resources when
operating your cluster.

See `Subsystem, Log and Debug Settings`_ for details on available settings.

Runtime
=======

If you would like to see the configuration settings at runtime, you must log
in to a host with a running daemon and execute the following:: 

	ceph daemon {daemon-name} config show | less

For example,::

  ceph daemon osd.0 config show | less

To activate Ceph's debugging output (*i.e.*, ``dout()``) at runtime,  use the
``ceph tell`` command to inject arguments into the runtime configuration:: 

	ceph tell {daemon-type}.{daemon id or *} injectargs --{name} {value} [--{name} {value}]
	
Replace ``{daemon-type}`` with one of ``osd``, ``mon`` or ``mds``. You may apply
the runtime setting to all daemons of a particular type with ``*``, or specify
a specific daemon's ID. For example, to increase
debug logging for a ``ceph-osd`` daemon named ``osd.0``, execute the following:: 

	ceph tell osd.0 injectargs --debug-osd 0/5

The ``ceph tell`` command goes through the monitors. If you cannot bind to the
monitor, you can still make the change by logging into the host of the daemon
whose configuration you'd like to change using ``ceph daemon``.
For example:: 

	sudo ceph daemon osd.0 config set debug_osd 0/5

See `Subsystem, Log and Debug Settings`_ for details on available settings.


Boot Time
=========

To activate Ceph's debugging output (*i.e.*, ``dout()``) at boot time, you must
add settings to your Ceph configuration file. Subsystems common to each daemon
may be set under ``[global]`` in your configuration file. Subsystems for
particular daemons are set under the daemon section in your configuration file
(*e.g.*, ``[mon]``, ``[osd]``, ``[mds]``). For example::

	[global]
		debug ms = 1/5
		
	[mon]
		debug mon = 20
		debug paxos = 1/5
		debug auth = 2
		 
 	[osd]
 		debug osd = 1/5
 		debug filestore = 1/5
 		debug journal = 1
 		debug monc = 5/20
 		
	[mds]
		debug mds = 1
		debug mds balancer = 1


See `Subsystem, Log and Debug Settings`_ for details.


Accelerating Log Rotation
=========================

If your OS disk is relatively full, you can accelerate log rotation by modifying
the Ceph log rotation file at ``/etc/logrotate.d/ceph``. Add  a size setting
after the rotation frequency to accelerate log rotation (via cronjob) if your
logs exceed the size setting. For example, the  default setting looks like
this::
   
	rotate 7
  	weekly
  	compress
  	sharedscripts
   	
Modify it by adding a ``size`` setting. ::
   
  	rotate 7
  	weekly
  	size 500M
  	compress
  	sharedscripts

Then, start the crontab editor for your user space. ::
   
  	crontab -e
	
Finally, add an entry to check the ``etc/logrotate.d/ceph`` file. ::
   
  	30 * * * * /usr/sbin/logrotate /etc/logrotate.d/ceph >/dev/null 2>&1

The preceding example checks the ``etc/logrotate.d/ceph`` file every 30 minutes.


Valgrind
========

Debugging may also require you to track down memory and threading issues. 
You can run a single daemon, a type of daemon, or the whole cluster with 
Valgrind. You should only use Valgrind when developing or debugging Ceph. 
Valgrind is computationally expensive, and will slow down your system otherwise. 
Valgrind messages are logged to ``stderr``. 


Subsystem, Log and Debug Settings
=================================

In most cases, you will enable debug logging output via subsystems. 

Ceph Subsystems
---------------

Each subsystem has a logging level for its output logs, and for its logs
in-memory. You may set different values for each of these subsystems by setting
a log file level and a memory level for debug logging. Ceph's logging levels
operate on a scale of ``1`` to ``20``, where ``1`` is terse and ``20`` is
verbose [#]_ . In general, the logs in-memory are not sent to the output log unless:

- a fatal signal is raised or
- an ``assert`` in source code is triggered or
- upon requested. Please consult `document on admin socket <http://docs.ceph.com/docs/master/man/8/ceph/#daemon>`_ for more details.

A debug logging setting can take a single value for the log level and the
memory level, which sets them both as the same value. For example, if you
specify ``debug ms = 5``, Ceph will treat it as a log level and a memory level
of ``5``. You may also specify them separately. The first setting is the log
level, and the second setting is the memory level.  You must separate them with
a forward slash (/). For example, if you want to set the ``ms`` subsystem's
debug logging level to ``1`` and its memory level to ``5``, you would specify it
as ``debug ms = 1/5``. For example:



.. code-block:: ini 

	debug {subsystem} = {log-level}/{memory-level}
	#for example
	debug mds balancer = 1/20


The following table provides a list of Ceph subsystems and their default log and
memory levels. Once you complete your logging efforts, restore the subsystems
to their default level or to a level suitable for normal operations.


+--------------------+-----------+--------------+
| Subsystem          | Log Level | Memory Level |
+====================+===========+==============+
| ``default``        |     0     |      5       |
+--------------------+-----------+--------------+
| ``lockdep``        |     0     |      5       |
+--------------------+-----------+--------------+
| ``context``        |     0     |      5       |
+--------------------+-----------+--------------+
| ``crush``          |     1     |      5       |
+--------------------+-----------+--------------+
| ``mds``            |     1     |      5       |
+--------------------+-----------+--------------+
| ``mds balancer``   |     1     |      5       |
+--------------------+-----------+--------------+
| ``mds locker``     |     1     |      5       |
+--------------------+-----------+--------------+
| ``mds log``        |     1     |      5       |
+--------------------+-----------+--------------+
| ``mds log expire`` |     1     |      5       |
+--------------------+-----------+--------------+
| ``mds migrator``   |     1     |      5       |
+--------------------+-----------+--------------+
| ``buffer``         |     0     |      0       |
+--------------------+-----------+--------------+
| ``timer``          |     0     |      5       |
+--------------------+-----------+--------------+
| ``filer``          |     0     |      5       |
+--------------------+-----------+--------------+
| ``objecter``       |     0     |      0       |
+--------------------+-----------+--------------+
| ``rados``          |     0     |      5       |
+--------------------+-----------+--------------+
| ``rbd``            |     0     |      5       |
+--------------------+-----------+--------------+
| ``journaler``      |     0     |      5       |
+--------------------+-----------+--------------+
| ``objectcacher``   |     0     |      5       |
+--------------------+-----------+--------------+
| ``client``         |     0     |      5       |
+--------------------+-----------+--------------+
| ``osd``            |     0     |      5       |
+--------------------+-----------+--------------+
| ``optracker``      |     0     |      5       |
+--------------------+-----------+--------------+
| ``objclass``       |     0     |      5       |
+--------------------+-----------+--------------+
| ``filestore``      |     1     |      5       |
+--------------------+-----------+--------------+
| ``journal``        |     1     |      5       |
+--------------------+-----------+--------------+
| ``ms``             |     0     |      5       |
+--------------------+-----------+--------------+
| ``mon``            |     1     |      5       |
+--------------------+-----------+--------------+
| ``monc``           |     0     |      5       |
+--------------------+-----------+--------------+
| ``paxos``          |     0     |      5       |
+--------------------+-----------+--------------+
| ``tp``             |     0     |      5       |
+--------------------+-----------+--------------+
| ``auth``           |     1     |      5       |
+--------------------+-----------+--------------+
| ``finisher``       |     1     |      5       |
+--------------------+-----------+--------------+
| ``heartbeatmap``   |     1     |      5       |
+--------------------+-----------+--------------+
| ``perfcounter``    |     1     |      5       |
+--------------------+-----------+--------------+
| ``rgw``            |     1     |      5       |
+--------------------+-----------+--------------+
| ``javaclient``     |     1     |      5       |
+--------------------+-----------+--------------+
| ``asok``           |     1     |      5       |
+--------------------+-----------+--------------+
| ``throttle``       |     1     |      5       |
+--------------------+-----------+--------------+


Logging Settings
----------------

Logging and debugging settings are not required in a Ceph configuration file,
but you may override default settings as needed. Ceph supports the following
settings:


``log file``

:Description: The location of the logging file for your cluster.
:Type: String
:Required: No
:Default: ``/var/log/ceph/$cluster-$name.log``


``log max new``

:Description: The maximum number of new log files.
:Type: Integer
:Required: No
:Default: ``1000``


``log max recent``

:Description: The maximum number of recent events to include in a log file.
:Type: Integer
:Required:  No
:Default: ``1000000``


``log to stderr``

:Description: Determines if logging messages should appear in ``stderr``.
:Type: Boolean
:Required: No
:Default: ``true``


``err to stderr``

:Description: Determines if error messages should appear in ``stderr``.
:Type: Boolean
:Required: No
:Default: ``true``


``log to syslog``

:Description: Determines if logging messages should appear in ``syslog``.
:Type: Boolean
:Required: No
:Default: ``false``


``err to syslog``

:Description: Determines if error messages should appear in ``syslog``.
:Type: Boolean
:Required: No
:Default: ``false``


``log flush on exit``

:Description: Determines if Ceph should flush the log files after exit.
:Type: Boolean
:Required: No
:Default: ``true``


``clog to monitors``

:Description: Determines if ``clog`` messages should be sent to monitors.
:Type: Boolean
:Required: No
:Default: ``true``


``clog to syslog``

:Description: Determines if ``clog`` messages should be sent to syslog.
:Type: Boolean
:Required: No
:Default: ``false``


``mon cluster log to syslog``

:Description: Determines if the cluster log should be output to the syslog.
:Type: Boolean
:Required: No
:Default: ``false``


``mon cluster log file``

:Description: The location of the cluster's log file. 
:Type: String
:Required: No
:Default: ``/var/log/ceph/$cluster.log``



OSD
---


``osd debug drop ping probability``

:Description: ?
:Type: Double
:Required: No
:Default: 0


``osd debug drop ping duration``

:Description: 
:Type: Integer
:Required: No
:Default: 0

``osd debug drop pg create probability``

:Description: 
:Type: Integer
:Required: No
:Default: 0

``osd debug drop pg create duration``

:Description: ?
:Type: Double
:Required: No
:Default: 1


``osd tmapput sets uses tmap``

:Description: Uses ``tmap``. For debug only.
:Type: Boolean
:Required: No
:Default: ``false``


``osd min pg log entries``

:Description: The minimum number of log entries for placement groups. 
:Type: 32-bit Unsigned Integer
:Required: No
:Default: 1000


``osd op log threshold``

:Description: How many op log messages to show up in one pass. 
:Type: Integer
:Required: No
:Default: 5



Filestore
---------

``filestore debug omap check``

:Description: Debugging check on synchronization. This is an expensive operation.
:Type: Boolean
:Required: No
:Default: 0


MDS
---


``mds debug scatterstat``

:Description: Ceph will assert that various recursive stat invariants are true 
              (for developers only).

:Type: Boolean
:Required: No
:Default: ``false``


``mds debug frag``

:Description: Ceph will verify directory fragmentation invariants when 
              convenient (developers only).

:Type: Boolean
:Required: No
:Default: ``false``


``mds debug auth pins``

:Description: The debug auth pin invariants (for developers only).
:Type: Boolean
:Required: No
:Default: ``false``


``mds debug subtrees``

:Description: The debug subtree invariants (for developers only).
:Type: Boolean
:Required: No
:Default: ``false``



RADOS Gateway
-------------


``rgw log nonexistent bucket``

:Description: Should we log a non-existent buckets?
:Type: Boolean
:Required: No
:Default: ``false``


``rgw log object name``

:Description: Should an object's name be logged. // man date to see codes (a subset are supported)
:Type: String
:Required: No
:Default: ``%Y-%m-%d-%H-%i-%n``


``rgw log object name utc``

:Description: Object log name contains UTC?
:Type: Boolean
:Required: No
:Default: ``false``


``rgw enable ops log``

:Description: Enables logging of every RGW operation.
:Type: Boolean
:Required: No
:Default: ``true``


``rgw enable usage log``

:Description: Enable logging of RGW's bandwidth usage.
:Type: Boolean
:Required: No
:Default: ``true``


``rgw usage log flush threshold``

:Description: Threshold to flush pending log data.
:Type: Integer
:Required: No
:Default: ``1024``


``rgw usage log tick interval``

:Description: Flush pending log data every ``s`` seconds.
:Type: Integer
:Required: No
:Default: 30


``rgw intent log object name``

:Description: 
:Type: String
:Required: No
:Default: ``%Y-%m-%d-%i-%n``


``rgw intent log object name utc``

:Description: Include a UTC timestamp in the intent log object name.
:Type: Boolean
:Required: No
:Default: ``false``

.. [#] there are levels >20 in some rare cases and that they are extremely verbose.