diff options
Diffstat (limited to 'qemu/docs/xbzrle.txt')
-rw-r--r-- | qemu/docs/xbzrle.txt | 136 |
1 files changed, 136 insertions, 0 deletions
diff --git a/qemu/docs/xbzrle.txt b/qemu/docs/xbzrle.txt new file mode 100644 index 000000000..52c8511a4 --- /dev/null +++ b/qemu/docs/xbzrle.txt @@ -0,0 +1,136 @@ +XBZRLE (Xor Based Zero Run Length Encoding) +=========================================== + +Using XBZRLE (Xor Based Zero Run Length Encoding) allows for the reduction +of VM downtime and the total live-migration time of Virtual machines. +It is particularly useful for virtual machines running memory write intensive +workloads that are typical of large enterprise applications such as SAP ERP +Systems, and generally speaking for any application that uses a sparse memory +update pattern. + +Instead of sending the changed guest memory page this solution will send a +compressed version of the updates, thus reducing the amount of data sent during +live migration. +In order to be able to calculate the update, the previous memory pages need to +be stored on the source. Those pages are stored in a dedicated cache +(hash table) and are accessed by their address. +The larger the cache size the better the chances are that the page has already +been stored in the cache. +A small cache size will result in high cache miss rate. +Cache size can be changed before and during migration. + +Format +======= + +The compression format performs a XOR between the previous and current content +of the page, where zero represents an unchanged value. +The page data delta is represented by zero and non zero runs. +A zero run is represented by its length (in bytes). +A non zero run is represented by its length (in bytes) and the new data. +The run length is encoded using ULEB128 (http://en.wikipedia.org/wiki/LEB128) + +There can be more than one valid encoding, the sender may send a longer encoding +for the benefit of reducing computation cost. + +page = zrun nzrun + | zrun nzrun page + +zrun = length + +nzrun = length byte... + +length = uleb128 encoded integer + +On the sender side XBZRLE is used as a compact delta encoding of page updates, +retrieving the old page content from the cache (default size of 512 MB). The +receiving side uses the existing page's content and XBZRLE to decode the new +page's content. + +This work was originally based on research results published +VEE 2011: Evaluation of Delta Compression Techniques for Efficient Live +Migration of Large Virtual Machines by Benoit, Svard, Tordsson and Elmroth. +Additionally the delta encoder XBRLE was improved further using the XBZRLE +instead. + +XBZRLE has a sustained bandwidth of 2-2.5 GB/s for typical workloads making it +ideal for in-line, real-time encoding such as is needed for live-migration. + +Example +old buffer: +1001 zeros +05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 68 00 00 6b 00 6d +3074 zeros + +new buffer: +1001 zeros +01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 68 00 00 67 00 69 +3074 zeros + +encoded buffer: + +encoded length 24 +e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69 + +Cache update strategy +===================== +Keeping the hot pages in the cache is effective for decreased cache +misses. XBZRLE uses a counter as the age of each page. The counter will +increase after each ram dirty bitmap sync. When a cache conflict is +detected, XBZRLE will only evict pages in the cache that are older than +a threshold. + +Usage +====================== +1. Verify the destination QEMU version is able to decode the new format. + {qemu} info migrate_capabilities + {qemu} xbzrle: off , ... + +2. Activate xbzrle on both source and destination: + {qemu} migrate_set_capability xbzrle on + +3. Set the XBZRLE cache size - the cache size is in MBytes and should be a +power of 2. The cache default value is 64MBytes. (on source only) + {qemu} migrate_set_cache_size 256m + +4. Start outgoing migration + {qemu} migrate -d tcp:destination.host:4444 + {qemu} info migrate + capabilities: xbzrle: on + Migration status: active + transferred ram: A kbytes + remaining ram: B kbytes + total ram: C kbytes + total time: D milliseconds + duplicate: E pages + normal: F pages + normal bytes: G kbytes + cache size: H bytes + xbzrle transferred: I kbytes + xbzrle pages: J pages + xbzrle cache miss: K + xbzrle overflow : L + +xbzrle cache-miss: the number of cache misses to date - high cache-miss rate +indicates that the cache size is set too low. +xbzrle overflow: the number of overflows in the decoding which where the delta +could not be compressed. This can happen if the changes in the pages are too +large or there are many short changes; for example, changing every second byte +(half a page). + +Testing: Testing indicated that live migration with XBZRLE was completed in 110 +seconds, whereas without it would not be able to complete. + +A simple synthetic memory r/w load generator: +.. include <stdlib.h> +.. include <stdio.h> +.. int main() +.. { +.. char *buf = (char *) calloc(4096, 4096); +.. while (1) { +.. int i; +.. for (i = 0; i < 4096 * 4; i++) { +.. buf[i * 4096 / 4]++; +.. } +.. printf("."); +.. } +.. } |