dm-raid
=======

The device-mapper RAID (dm-raid) target provides a bridge from DM to MD.
It allows the MD RAID drivers to be accessed using a device-mapper
interface.


Mapping Table Interface
-----------------------
The target is named "raid" and it accepts the following parameters:

  <raid_type> <#raid_params> <raid_params> \
    <#raid_devs> <metadata_dev0> <dev0> [.. <metadata_devN> <devN>]

<raid_type>:
  raid1      RAID1 mirroring
  raid4      RAID4 dedicated parity disk
  raid5_la   RAID5 left asymmetric
             - rotating parity 0 with data continuation
  raid5_ra   RAID5 right asymmetric
             - rotating parity N with data continuation
  raid5_ls   RAID5 left symmetric
             - rotating parity 0 with data restart
  raid5_rs   RAID5 right symmetric
             - rotating parity N with data restart
  raid6_zr   RAID6 zero restart
             - rotating parity zero (left-to-right) with data restart
  raid6_nr   RAID6 N restart
             - rotating parity N (right-to-left) with data restart
  raid6_nc   RAID6 N continue
             - rotating parity N (right-to-left) with data continuation
  raid10     Various RAID10 inspired algorithms chosen by additional params
             - RAID10: Striped Mirrors (aka 'Striping on top of mirrors')
             - RAID1E: Integrated Adjacent Stripe Mirroring
             - RAID1E: Integrated Offset Stripe Mirroring
             - and other similar RAID10 variants

  Reference: Chapter 4 of
  http://www.snia.org/sites/default/files/SNIA_DDF_Technical_Position_v2.0.pdf

<#raid_params>: The number of parameters that follow.

<raid_params> consists of
    Mandatory parameters:
        <chunk_size>: Chunk size in sectors.  This parameter is often known as
                      "stripe size".  It is the only mandatory parameter and
                      is placed first.

    followed by optional parameters (in any order):
        [sync|nosync]   Force or prevent RAID initialization.

        [rebuild <idx>] Rebuild drive number 'idx' (first drive is 0).

        [daemon_sleep <ms>]
                Interval between runs of the bitmap daemon that
                clear bits.
                A longer interval means less bitmap I/O but
                resyncing after a failure is likely to take longer.

        [min_recovery_rate <kB/sec/disk>]  Throttle RAID initialization
        [max_recovery_rate <kB/sec/disk>]  Throttle RAID initialization
        [write_mostly <idx>]               Mark drive index 'idx' write-mostly.
        [max_write_behind <sectors>]       See '--write-behind=' (man mdadm)
        [stripe_cache <sectors>]           Stripe cache size (RAID 4/5/6 only)
        [region_size <sectors>]
                The region_size multiplied by the number of regions is the
                logical size of the array.  The bitmap records the device
                synchronisation state for each region.

        [raid10_copies <# copies>]
        [raid10_format <near|far|offset>]
                These two options are used to alter the default layout of
                a RAID10 configuration.  The number of copies can be
                specified, but the default is 2.  There are also three
                variations to how the copies are laid down - the default
                is "near".  Near copies are what most people think of with
                respect to mirroring.  If these options are left unspecified,
                or 'raid10_copies 2' and/or 'raid10_format near' are given,
                then the layouts for 2, 3 and 4 devices are:
                2 drives         3 drives          4 drives
                --------         ----------        --------------
                A1  A1           A1  A1  A2        A1  A1  A2  A2
                A2  A2           A2  A3  A3        A3  A3  A4  A4
                A3  A3           A4  A4  A5        A5  A5  A6  A6
                A4  A4           A5  A6  A6        A7  A7  A8  A8
                ..  ..           ..  ..  ..        ..  ..  ..  ..
                The 2-device layout is equivalent to 2-way RAID1.  The
                4-device layout is what a traditional RAID10 would look like.
                The 3-device layout is what might be called a 'RAID1E -
                Integrated Adjacent Stripe Mirroring'.

                If 'raid10_copies 2' and 'raid10_format far', then the layouts
                for 2, 3 and 4 devices are:
                2 drives         3 drives             4 drives
                --------         --------------       --------------------
                A1  A2           A1   A2   A3         A1   A2   A3   A4
                A3  A4           A4   A5   A6         A5   A6   A7   A8
                A5  A6           A7   A8   A9         A9   A10  A11  A12
                ..  ..           ..   ..   ..         ..   ..   ..   ..
                A2  A1           A3   A1   A2         A2   A1   A4   A3
                A4  A3           A6   A4   A5         A6   A5   A8   A7
                A6  A5           A9   A7   A8         A10  A9   A12  A11
                ..  ..           ..   ..   ..         ..   ..   ..   ..

                If 'raid10_copies 2' and 'raid10_format offset', then the
                layouts for 2, 3 and 4 devices are:
                2 drives         3 drives           4 drives
                --------         ------------       -----------------
                A1  A2           A1  A2  A3         A1   A2   A3   A4
                A2  A1           A3  A1  A2         A2   A1   A4   A3
                A3  A4           A4  A5  A6         A5   A6   A7   A8
                A4  A3           A6  A4  A5         A6   A5   A8   A7
                A5  A6           A7  A8  A9         A9   A10  A11  A12
                A6  A5           A9  A7  A8         A10  A9   A12  A11
                ..  ..           ..  ..  ..         ..   ..   ..   ..
                Here we see layouts closely akin to 'RAID1E - Integrated
                Offset Stripe Mirroring'.

<#raid_devs>: The number of devices composing the array.
        Each device consists of two entries.  The first is the device
        containing the metadata (if any); the second is the one containing the
        data.

        If a drive has failed or is missing at creation time, a '-' can be
        given for both the metadata and data drives for a given position.


Example Tables
--------------
# RAID4 - 4 data drives, 1 parity (no metadata devices)
# No metadata devices specified to hold superblock/bitmap info
# Chunk size of 1MiB
# (Lines separated for easy reading)

0 1960893648 raid \
        raid4 1 2048 \
        5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81

# RAID4 - 4 data drives, 1 parity (with metadata devices)
# Chunk size of 1MiB, force RAID initialization,
# min recovery rate at 20 kB/sec/disk

0 1960893648 raid \
        raid4 4 2048 sync min_recovery_rate 20 \
        5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82


Status Output
-------------
'dmsetup table' displays the table used to construct the mapping.
The optional parameters are always printed in the order listed
above with "sync" or "nosync" always output ahead of the other
arguments, regardless of the order used when originally loading the table.
Arguments that can be repeated are ordered by value.


'dmsetup status' yields information on the state and health of the array.
The output is as follows (normally a single line, but expanded here for
clarity):
1: <s> <l> raid \
2:      <raid_type> <#devices> <health_chars> \
3:      <sync_ratio> <sync_action> <mismatch_cnt>

Line 1 is the standard output produced by device-mapper.
Lines 2 & 3 are produced by the raid target and are best explained by example:
        0 1960893648 raid raid4 5 AAAAA 2/490221568 init 0
Here we can see the RAID type is raid4, there are 5 devices - all of
which are 'A'live, and the array is 2/490221568 complete with its initial
recovery.  Here is a fuller description of the individual fields:
        <raid_type>     Same as the <raid_type> used to create the array.
        <health_chars>  One char for each device, indicating: 'A' = alive and
                        in-sync, 'a' = alive but not in-sync, 'D' =
                        dead/failed.
        <sync_ratio>    The ratio indicating how much of the array has
                        undergone the process described by 'sync_action'.
                        If the 'sync_action' is "check" or "repair", then the
                        process of "resync" or "recover" can be considered
                        complete.
        <sync_action>   One of the following possible states:
                        idle    - No synchronization action is being performed.
                        frozen  - The current action has been halted.
                        resync  - Array is undergoing its initial
                                  synchronization or is resynchronizing after
                                  an unclean shutdown (possibly aided by a
                                  bitmap).
                        recover - A device in the array is being rebuilt or
                                  replaced.
                        check   - A user-initiated full check of the array is
                                  being performed.  All blocks are read and
                                  checked for consistency.  The number of
                                  discrepancies found is recorded in
                                  <mismatch_cnt>.  No changes are made to the
                                  array by this action.
                        repair  - The same as "check", but discrepancies are
                                  corrected.
                        reshape - The array is undergoing a reshape.
        <mismatch_cnt>  The number of discrepancies found between mirror
                        copies in RAID1/10 or wrong parity values found in
                        RAID4/5/6.  This value is valid only after a "check"
                        of the array is performed.  A healthy array has a
                        'mismatch_cnt' of 0.

Message Interface
-----------------
The dm-raid target will accept certain actions through the 'message' interface.
('man dmsetup' for more information on the message interface.)  These actions
include:
        "idle"    - Halt the current sync action.
        "frozen"  - Freeze the current sync action.
        "resync"  - Initiate/continue a resync.
        "recover" - Initiate/continue a recover process.
        "check"   - Initiate a check (i.e. a "scrub") of the array.
        "repair"  - Initiate a repair of the array.
        "reshape" - Currently unsupported (-EINVAL).

Version History
---------------
1.0.0   Initial version.  Support for RAID 4/5/6
1.1.0   Added support for RAID 1
1.2.0   Handle creation of arrays that contain failed devices.
1.3.0   Added support for RAID 10
1.3.1   Allow device replacement/rebuild for RAID 10
1.3.2   Fix/improve redundancy checking for RAID10
1.4.0   Non-functional change.  Removes arg from mapping function.
1.4.1   RAID10 fix redundancy validation checks (commit 55ebbb5).
1.4.2   Add RAID10 "far" and "offset" algorithm support.
1.5.0   Add message interface to allow manipulation of the sync_action.
        New status (STATUSTYPE_INFO) fields: sync_action and mismatch_cnt.
1.5.1   Add ability to restore transiently failed devices on resume.
1.5.2   'mismatch_cnt' is zero unless [last_]sync_action is "check".
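
Putting the pieces above together, the full cycle - build a table line,
load it, drive a scrub through the message interface, and read back the
status fields - can be sketched in shell.  This is an illustrative
sketch only: the target name 'my_raid' and the 8:NN device numbers are
hypothetical, and the dmsetup invocations (which need root and real
block devices) are left as comments.  The string handling runs as-is
and mirrors the second example table and the example status line above.

```shell
#!/bin/sh
# Assemble the raid4-with-metadata example table.  Counting the words in
# PARAMS yields <#raid_params>, which must match the parameter list.
PARAMS="2048 sync min_recovery_rate 20"
set -- $PARAMS
TABLE="0 1960893648 raid raid4 $# $PARAMS 5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82"
echo "$TABLE"
# On a real system (root required):
#   dmsetup create my_raid --table "$TABLE"
#   dmsetup message my_raid 0 check     # start a scrub
#   dmsetup status my_raid              # poll its progress

# Split a status line of the shape described in 'Status Output'.
# This is the example line from that section.
STATUS="0 1960893648 raid raid4 5 AAAAA 2/490221568 init 0"
set -- $STATUS
RAID_TYPE=$4 NDEVS=$5 HEALTH=$6 SYNC_RATIO=$7 SYNC_ACTION=$8 MISMATCH=$9
echo "type=$RAID_TYPE devs=$NDEVS health=$HEALTH action=$SYNC_ACTION mismatch=$MISMATCH"
```

On a live array, reading <mismatch_cnt> after a "check" message has run
to completion is how a scrub's findings are retrieved; per the notes
above, the count is only meaningful after such a check.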