summaryrefslogtreecommitdiffstats
path: root/kernel/Documentation/device-mapper/dm-raid.txt
diff options
context:
space:
mode:
authorYunhong Jiang <yunhong.jiang@intel.com>2015-08-04 12:17:53 -0700
committerYunhong Jiang <yunhong.jiang@intel.com>2015-08-04 15:44:42 -0700
commit9ca8dbcc65cfc63d6f5ef3312a33184e1d726e00 (patch)
tree1c9cafbcd35f783a87880a10f85d1a060db1a563 /kernel/Documentation/device-mapper/dm-raid.txt
parent98260f3884f4a202f9ca5eabed40b1354c489b29 (diff)
Add the rt linux 4.1.3-rt3 as base
Import the rt linux 4.1.3-rt3 as OPNFV kvm base. It's from git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git linux-4.1.y-rt and the base is: commit 0917f823c59692d751951bf5ea699a2d1e2f26a2 Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Date: Sat Jul 25 12:13:34 2015 +0200 Prepare v4.1.3-rt3 Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> We lose all the git history this way and it's not good. We should apply another opnfv project repo in future. Change-Id: I87543d81c9df70d99c5001fbdf646b202c19f423 Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
Diffstat (limited to 'kernel/Documentation/device-mapper/dm-raid.txt')
-rw-r--r--kernel/Documentation/device-mapper/dm-raid.txt226
1 files changed, 226 insertions, 0 deletions
diff --git a/kernel/Documentation/device-mapper/dm-raid.txt b/kernel/Documentation/device-mapper/dm-raid.txt
new file mode 100644
index 000000000..ef8ba9fa5
--- /dev/null
+++ b/kernel/Documentation/device-mapper/dm-raid.txt
@@ -0,0 +1,226 @@
+dm-raid
+=======
+
+The device-mapper RAID (dm-raid) target provides a bridge from DM to MD.
+It allows the MD RAID drivers to be accessed using a device-mapper
+interface.
+
+
+Mapping Table Interface
+-----------------------
+The target is named "raid" and it accepts the following parameters:
+
+ <raid_type> <#raid_params> <raid_params> \
+ <#raid_devs> <metadata_dev0> <dev0> [.. <metadata_devN> <devN>]
+
+<raid_type>:
+ raid1 RAID1 mirroring
+ raid4 RAID4 dedicated parity disk
+ raid5_la RAID5 left asymmetric
+ - rotating parity 0 with data continuation
+ raid5_ra RAID5 right asymmetric
+ - rotating parity N with data continuation
+ raid5_ls RAID5 left symmetric
+ - rotating parity 0 with data restart
+ raid5_rs RAID5 right symmetric
+ - rotating parity N with data restart
+ raid6_zr RAID6 zero restart
+ - rotating parity zero (left-to-right) with data restart
+ raid6_nr RAID6 N restart
+ - rotating parity N (right-to-left) with data restart
+ raid6_nc RAID6 N continue
+ - rotating parity N (right-to-left) with data continuation
+ raid10 Various RAID10 inspired algorithms chosen by additional params
+ - RAID10: Striped Mirrors (aka 'Striping on top of mirrors')
+ - RAID1E: Integrated Adjacent Stripe Mirroring
+ - RAID1E: Integrated Offset Stripe Mirroring
+ - and other similar RAID10 variants
+
+ Reference: Chapter 4 of
+ http://www.snia.org/sites/default/files/SNIA_DDF_Technical_Position_v2.0.pdf
+
+<#raid_params>: The number of parameters that follow.
+
+<raid_params> consists of
+ Mandatory parameters:
+ <chunk_size>: Chunk size in sectors. This parameter is often known as
+ "stripe size". It is the only mandatory parameter and
+ is placed first.
+
+ followed by optional parameters (in any order):
+ [sync|nosync] Force or prevent RAID initialization.
+
+ [rebuild <idx>] Rebuild drive number 'idx' (first drive is 0).
+
+ [daemon_sleep <ms>]
+ Interval between runs of the bitmap daemon that
+ clear bits. A longer interval means less bitmap I/O but
+ resyncing after a failure is likely to take longer.
+
+ [min_recovery_rate <kB/sec/disk>] Throttle RAID initialization
+ [max_recovery_rate <kB/sec/disk>] Throttle RAID initialization
+ [write_mostly <idx>] Mark drive index 'idx' write-mostly.
+ [max_write_behind <sectors>] See '--write-behind=' (man mdadm)
+ [stripe_cache <sectors>] Stripe cache size (RAID 4/5/6 only)
+ [region_size <sectors>]
+ The region_size multiplied by the number of regions is the
+ logical size of the array. The bitmap records the device
+ synchronisation state for each region.
+
+ [raid10_copies <# copies>]
+ [raid10_format <near|far|offset>]
+ These two options are used to alter the default layout of
+ a RAID10 configuration. The number of copies is can be
+ specified, but the default is 2. There are also three
+ variations to how the copies are laid down - the default
+ is "near". Near copies are what most people think of with
+ respect to mirroring. If these options are left unspecified,
+ or 'raid10_copies 2' and/or 'raid10_format near' are given,
+ then the layouts for 2, 3 and 4 devices are:
+ 2 drives 3 drives 4 drives
+ -------- ---------- --------------
+ A1 A1 A1 A1 A2 A1 A1 A2 A2
+ A2 A2 A2 A3 A3 A3 A3 A4 A4
+ A3 A3 A4 A4 A5 A5 A5 A6 A6
+ A4 A4 A5 A6 A6 A7 A7 A8 A8
+ .. .. .. .. .. .. .. .. ..
+ The 2-device layout is equivalent 2-way RAID1. The 4-device
+ layout is what a traditional RAID10 would look like. The
+ 3-device layout is what might be called a 'RAID1E - Integrated
+ Adjacent Stripe Mirroring'.
+
+ If 'raid10_copies 2' and 'raid10_format far', then the layouts
+ for 2, 3 and 4 devices are:
+ 2 drives 3 drives 4 drives
+ -------- -------------- --------------------
+ A1 A2 A1 A2 A3 A1 A2 A3 A4
+ A3 A4 A4 A5 A6 A5 A6 A7 A8
+ A5 A6 A7 A8 A9 A9 A10 A11 A12
+ .. .. .. .. .. .. .. .. ..
+ A2 A1 A3 A1 A2 A2 A1 A4 A3
+ A4 A3 A6 A4 A5 A6 A5 A8 A7
+ A6 A5 A9 A7 A8 A10 A9 A12 A11
+ .. .. .. .. .. .. .. .. ..
+
+ If 'raid10_copies 2' and 'raid10_format offset', then the
+ layouts for 2, 3 and 4 devices are:
+ 2 drives 3 drives 4 drives
+ -------- ------------ -----------------
+ A1 A2 A1 A2 A3 A1 A2 A3 A4
+ A2 A1 A3 A1 A2 A2 A1 A4 A3
+ A3 A4 A4 A5 A6 A5 A6 A7 A8
+ A4 A3 A6 A4 A5 A6 A5 A8 A7
+ A5 A6 A7 A8 A9 A9 A10 A11 A12
+ A6 A5 A9 A7 A8 A10 A9 A12 A11
+ .. .. .. .. .. .. .. .. ..
+ Here we see layouts closely akin to 'RAID1E - Integrated
+ Offset Stripe Mirroring'.
+
+<#raid_devs>: The number of devices composing the array.
+ Each device consists of two entries. The first is the device
+ containing the metadata (if any); the second is the one containing the
+ data.
+
+ If a drive has failed or is missing at creation time, a '-' can be
+ given for both the metadata and data drives for a given position.
+
+
+Example Tables
+--------------
+# RAID4 - 4 data drives, 1 parity (no metadata devices)
+# No metadata devices specified to hold superblock/bitmap info
+# Chunk size of 1MiB
+# (Lines separated for easy reading)
+
+0 1960893648 raid \
+ raid4 1 2048 \
+ 5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81
+
+# RAID4 - 4 data drives, 1 parity (with metadata devices)
+# Chunk size of 1MiB, force RAID initialization,
+# min recovery rate at 20 kiB/sec/disk
+
+0 1960893648 raid \
+ raid4 4 2048 sync min_recovery_rate 20 \
+ 5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82
+
+
+Status Output
+-------------
+'dmsetup table' displays the table used to construct the mapping.
+The optional parameters are always printed in the order listed
+above with "sync" or "nosync" always output ahead of the other
+arguments, regardless of the order used when originally loading the table.
+Arguments that can be repeated are ordered by value.
+
+
+'dmsetup status' yields information on the state and health of the array.
+The output is as follows (normally a single line, but expanded here for
+clarity):
+1: <s> <l> raid \
+2: <raid_type> <#devices> <health_chars> \
+3: <sync_ratio> <sync_action> <mismatch_cnt>
+
+Line 1 is the standard output produced by device-mapper.
+Line 2 & 3 are produced by the raid target and are best explained by example:
+ 0 1960893648 raid raid4 5 AAAAA 2/490221568 init 0
+Here we can see the RAID type is raid4, there are 5 devices - all of
+which are 'A'live, and the array is 2/490221568 complete with its initial
+recovery. Here is a fuller description of the individual fields:
+ <raid_type> Same as the <raid_type> used to create the array.
+ <health_chars> One char for each device, indicating: 'A' = alive and
+ in-sync, 'a' = alive but not in-sync, 'D' = dead/failed.
+ <sync_ratio> The ratio indicating how much of the array has undergone
+ the process described by 'sync_action'. If the
+ 'sync_action' is "check" or "repair", then the process
+ of "resync" or "recover" can be considered complete.
+ <sync_action> One of the following possible states:
+ idle - No synchronization action is being performed.
+ frozen - The current action has been halted.
+ resync - Array is undergoing its initial synchronization
+ or is resynchronizing after an unclean shutdown
+ (possibly aided by a bitmap).
+ recover - A device in the array is being rebuilt or
+ replaced.
+ check - A user-initiated full check of the array is
+ being performed. All blocks are read and
+ checked for consistency. The number of
+ discrepancies found are recorded in
+ <mismatch_cnt>. No changes are made to the
+ array by this action.
+ repair - The same as "check", but discrepancies are
+ corrected.
+ reshape - The array is undergoing a reshape.
+ <mismatch_cnt> The number of discrepancies found between mirror copies
+ in RAID1/10 or wrong parity values found in RAID4/5/6.
+ This value is valid only after a "check" of the array
+ is performed. A healthy array has a 'mismatch_cnt' of 0.
+
+Message Interface
+-----------------
+The dm-raid target will accept certain actions through the 'message' interface.
+('man dmsetup' for more information on the message interface.) These actions
+include:
+ "idle" - Halt the current sync action.
+ "frozen" - Freeze the current sync action.
+ "resync" - Initiate/continue a resync.
+ "recover"- Initiate/continue a recover process.
+ "check" - Initiate a check (i.e. a "scrub") of the array.
+ "repair" - Initiate a repair of the array.
+ "reshape"- Currently unsupported (-EINVAL).
+
+Version History
+---------------
+1.0.0 Initial version. Support for RAID 4/5/6
+1.1.0 Added support for RAID 1
+1.2.0 Handle creation of arrays that contain failed devices.
+1.3.0 Added support for RAID 10
+1.3.1 Allow device replacement/rebuild for RAID 10
+1.3.2 Fix/improve redundancy checking for RAID10
+1.4.0 Non-functional change. Removes arg from mapping function.
+1.4.1 RAID10 fix redundancy validation checks (commit 55ebbb5).
+1.4.2 Add RAID10 "far" and "offset" algorithm support.
+1.5.0 Add message interface to allow manipulation of the sync_action.
+ New status (STATUSTYPE_INFO) fields: sync_action and mismatch_cnt.
+1.5.1 Add ability to restore transiently failed devices on resume.
+1.5.2 'mismatch_cnt' is zero unless [last_]sync_action is "check".