========================
 QEMU and Block Devices
========================

.. index:: Ceph Block Device; QEMU KVM

The most frequent Ceph Block Device use case involves providing block device
images to virtual machines. For example, a user may create a "golden" image
with an OS and any relevant software in an ideal configuration. Then, the user
takes a snapshot of the image. Finally, the user clones the snapshot (usually
many times). See `Snapshots`_ for details. The ability to make copy-on-write
clones of a snapshot means that Ceph can provision block device images to
virtual machines quickly, because the client doesn't have to download an entire
image each time it spins up a new virtual machine.
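
For example, a golden-image workflow with the ``rbd`` command line tool might
look like the following (the pool and image names are placeholders, and the
clone requires a protected snapshot of an image created with a format that
supports layering)::

    rbd snap create data/golden@base            # snapshot the golden image
    rbd snap protect data/golden@base           # protect the snapshot so it can be cloned
    rbd clone data/golden@base data/vm-disk-1   # copy-on-write clone for a new VM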


.. ditaa::  +---------------------------------------------------+
            |                       QEMU                        |
            +---------------------------------------------------+
            |                      librbd                       |
            +---------------------------------------------------+
            |                     librados                      |
            +------------------------+-+------------------------+
            |          OSDs          | |        Monitors        |
            +------------------------+ +------------------------+


Ceph Block Devices integrate with QEMU virtual machines. For details on QEMU,
see `QEMU Open Source Processor Emulator`_. For QEMU documentation, see
`QEMU Manual`_. For installation details, see `Installation`_.

.. important:: To use Ceph Block Devices with QEMU, you must have access to a 
   running Ceph cluster.
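
A quick way to confirm that a host can reach the cluster is to run the
standard Ceph status command (this assumes the ``ceph`` CLI is installed and a
valid configuration file and keyring are in place)::

    ceph -s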


Usage
=====

The QEMU command line expects you to specify the pool name and image name. You
may also specify a snapshot name. 

QEMU will assume that the Ceph configuration file resides in the default
location (e.g., ``/etc/ceph/$cluster.conf``) and that you are executing
commands as the default ``client.admin`` user unless you expressly specify
another Ceph configuration file path or another user. When specifying a user,
QEMU uses the ``ID`` rather than the full ``TYPE:ID``. See `User Management -
User`_ for details. Do not prepend the client type (i.e., ``client.``) to the
user ``ID``, or you will receive an authentication error. The key for the
``admin`` user, or for any other user you specify with the ``:id={user}``
option, should be stored in a keyring file in a default path (i.e.,
``/etc/ceph``) or in the local directory, with appropriate file ownership and
permissions. Usage takes the following form::

	qemu-img {command} [options] rbd:{pool-name}/{image-name}[@snapshot-name][:option1=value1][:option2=value2...]

For example, specifying the ``id`` and ``conf`` options might look like the following:: 

	qemu-img {command} [options] rbd:glance-pool/maipo:id=glance:conf=/etc/ceph/ceph.conf

.. tip:: Configuration values containing ``:``, ``@``, or ``=`` can be escaped with a
         leading ``\`` character.
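
For example, passing a ``mon_host`` value that contains colons might look like
this (the monitor address is a placeholder)::

    qemu-img info 'rbd:data/foo:id=admin:mon_host=192.168.0.1\:6789'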


Creating Images with QEMU
=========================

You can create a block device image with ``qemu-img``. You must specify
``rbd``, the pool name, and the name of the image you wish to create. You must
also specify the size of the image. ::

	qemu-img create -f raw rbd:{pool-name}/{image-name} {size}

For example::

	qemu-img create -f raw rbd:data/foo 10G

.. important:: The ``raw`` data format is really the only sensible
   ``format`` option to use with RBD. Technically, you could use other
   QEMU-supported formats (such as ``qcow2`` or ``vmdk``), but doing
   so would add additional overhead, and would also render the volume
   unsafe for virtual machine live migration when caching (see below)
   is enabled.
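
If the ``rbd`` command line tool is available on the same host, you can verify
the result, for example::

    rbd ls data        # list images in the pool
    rbd info data/foo  # show the size and format of the new image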


Resizing Images with QEMU
=========================

You can resize a block device image with ``qemu-img``. You must specify
``rbd``, the pool name, and the name of the image you wish to resize. You must
also specify the new size of the image. ::

	qemu-img resize rbd:{pool-name}/{image-name} {size}

For example::

	qemu-img resize rbd:data/foo 10G
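
Note that resizing the block device does not resize the filesystem inside it.
After growing the image, you would also grow the filesystem from within the
guest; for example (assuming the image is attached as ``/dev/vdb`` and the
filesystem is mounted at ``/mnt``)::

    resize2fs /dev/vdb   # ext4
    xfs_growfs /mnt      # XFS (operates on the mount point)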


Retrieving Image Info with QEMU
===============================

You can retrieve block device image information with ``qemu-img``. You must
specify ``rbd``, the pool name, and the name of the image. ::

	qemu-img info rbd:{pool-name}/{image-name}

For example::

	qemu-img info rbd:data/foo


Running QEMU with RBD
=====================

QEMU can pass a block device from the host on to a guest, but since
QEMU 0.15, there's no need to map an image as a block device on
the host. Instead, QEMU can access an image as a virtual block
device directly via ``librbd``. This performs better because it avoids
an additional context switch, and can take advantage of `RBD caching`_.

You can use ``qemu-img`` to convert existing virtual machine images to Ceph
block device images. For example, if you have a qcow2 image, you could run::

    qemu-img convert -f qcow2 -O raw debian_squeeze.qcow2 rbd:data/squeeze

To run a virtual machine booting from that image, you could run::

    qemu -m 1024 -drive format=raw,file=rbd:data/squeeze

`RBD caching`_ can significantly improve performance.
Since QEMU 1.2, QEMU's cache options control ``librbd`` caching::

    qemu -m 1024 -drive format=raw,file=rbd:data/squeeze,cache=writeback

If you have an older version of QEMU, you can set the ``librbd`` cache
configuration (like any Ceph configuration option) as part of the
'file' parameter::

    qemu -m 1024 -drive format=raw,file=rbd:data/squeeze:rbd_cache=true,cache=writeback

.. important:: If you set ``rbd_cache=true``, you must set ``cache=writeback``
   or risk data loss. Without ``cache=writeback``, QEMU will not send
   flush requests to ``librbd``. If QEMU exits uncleanly in this
   configuration, filesystems on top of RBD can be corrupted.

.. _RBD caching: ../rbd-config-ref/#rbd-cache-config-settings


.. index:: Ceph Block Device; discard trim and libvirt

Enabling Discard/TRIM
=====================

Since Ceph version 0.46 and QEMU version 1.1, Ceph Block Devices support the
discard operation. This means that a guest can send TRIM requests to let a Ceph
block device reclaim unused space. This can be enabled in the guest by mounting
``ext4`` or ``XFS`` with the ``discard`` option.
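
For example, inside the guest you might mount the filesystem with online
discard enabled, or issue trims on demand instead (the device and mount point
are placeholders)::

    mount -o discard /dev/vdb /mnt   # send TRIM as blocks are freed
    fstrim -v /mnt                   # or trim on demand (e.g., from a timer)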

For this to be available to the guest, it must be explicitly enabled
for the block device. To do this, you must specify a
``discard_granularity`` associated with the drive::

    qemu -m 1024 -drive format=raw,file=rbd:data/squeeze,id=drive1,if=none \
         -device driver=ide-hd,drive=drive1,discard_granularity=512

Note that this uses the IDE driver. The virtio driver does not
support discard.

If using libvirt, edit your libvirt domain's configuration file using ``virsh
edit`` to include the ``xmlns:qemu`` value. Then, add a ``qemu:commandline``
block as a child of that domain. The following example shows how to set
``discard_granularity`` to different values for two devices identified by
their QEMU block IDs.

.. code-block:: xml

	<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
		<qemu:commandline>
			<qemu:arg value='-set'/>
			<qemu:arg value='block.scsi0-0-0.discard_granularity=4096'/>
			<qemu:arg value='-set'/>
			<qemu:arg value='block.scsi0-0-1.discard_granularity=65536'/>
		</qemu:commandline>
	</domain>


.. index:: Ceph Block Device; cache options

QEMU Cache Options
==================

QEMU's cache options correspond to the following Ceph `RBD Cache`_ settings.

Writeback::

   rbd_cache = true

Writethrough::

   rbd_cache = true
   rbd_cache_max_dirty = 0

None:: 

   rbd_cache = false
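
On the QEMU command line, these modes are selected with the ``cache=`` drive
option, for example::

    qemu -m 1024 -drive format=raw,file=rbd:data/squeeze,cache=writeback
    qemu -m 1024 -drive format=raw,file=rbd:data/squeeze,cache=writethrough
    qemu -m 1024 -drive format=raw,file=rbd:data/squeeze,cache=none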

QEMU's cache settings override Ceph's cache settings (including settings that
are explicitly set in the Ceph configuration file).

.. note:: Prior to QEMU v2.4.0, if you explicitly set `RBD Cache`_ settings
   in the Ceph configuration file, your Ceph settings override the QEMU cache
   settings.

.. _QEMU Open Source Processor Emulator: http://wiki.qemu.org/Main_Page
.. _QEMU Manual: http://wiki.qemu.org/Manual
.. _RBD Cache: ../rbd-config-ref/
.. _Snapshots: ../rbd-snapshot/
.. _Installation: ../../install
.. _User Management - User: ../../rados/operations/user-management#user