Fault injection capabilities infrastructure
===========================================

See also drivers/md/faulty.c and "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

o failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

o fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

o fail_futex

  injects futex deadlock and uaddr fault errors.

o fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (generic_make_request())

o fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request

Configure fault-injection capabilities behavior
-----------------------------------------------

o debugfs entries

The fault-inject-debugfs kernel module provides debugfs entries for runtime
configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

	likelihood of failure injection, in percent.
	Format: <percent>

	Note that one-failure-per-hundred is a very high error rate
	for some testcases.  Consider setting probability=100 and configuring
	/sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

	specifies the interval between failures, for calls to
	should_fail() that pass all the other tests.

	Note that if you enable this, by setting interval>1, you will
	probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

	specifies how many times failures may happen at most.
	A value of -1 means "no limit".

- /sys/kernel/debug/fail*/space:

	specifies an initial resource "budget", decremented by "size"
	on each call to should_fail(,size).  Failure injection is
	suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose:

	Format: { 0 | 1 | 2 }
	specifies the verbosity of the messages when failure is
	injected.  '0' means no messages; '1' will print only a single
	log line per failure; '2' will print a call trace too -- useful
	to debug the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

	Format: { 'Y' | 'N' }
	A value of 'N' disables filtering by process (default).
	A value of 'Y' limits failures to only the processes for which
	/proc/<pid>/make-it-fail is set to 1.

- /sys/kernel/debug/fail*/require-start:
- /sys/kernel/debug/fail*/require-end:
- /sys/kernel/debug/fail*/reject-start:
- /sys/kernel/debug/fail*/reject-end:

	specify the ranges of virtual addresses tested during
	stacktrace walking.  Failure is injected only if some caller
	in the walked stacktrace lies within the required range, and
	none lies within the rejected range.
	Default required range is [0,ULONG_MAX) (whole of virtual address space).
	Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

	specifies the maximum stacktrace depth walked during search
	for a caller within [require-start,require-end) OR
	[reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

	Format: { 'Y' | 'N' }
	The default is 'N'.  Setting it to 'Y' suppresses failure
	injection into highmem/user allocations.

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

	Format: { 'Y' | 'N' }
	The default is 'N'.  Setting it to 'Y' will inject failures
	only into non-sleep allocations (GFP_ATOMIC allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

	specifies the minimum page allocation order for which failures
	are injected.

- /sys/kernel/debug/fail_futex/ignore-private:

	Format: { 'Y' | 'N' }
	The default is 'N'.  Setting it to 'Y' will disable failure
	injection when dealing with private (address space) futexes.

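The debugfs entries above are plain files, so configuring a capability
amounts to writing values into them.  A minimal sketch, assuming debugfs is
mounted at /sys/kernel/debug; fault_set and FAULT_DEBUGFS are hypothetical
names introduced here for illustration and are not part of the kernel tree:

```shell
#!/bin/bash
# Sketch only: fault_set and FAULT_DEBUGFS are hypothetical names.
# FAULT_DEBUGFS defaults to the debugfs mount point used in this document.
FAULT_DEBUGFS=${FAULT_DEBUGFS:-/sys/kernel/debug}

# fault_set <capability> <attr>=<value> ...
# Writes each value to $FAULT_DEBUGFS/<capability>/<attr>.
fault_set()
{
	local dir="$FAULT_DEBUGFS/$1"
	local pair
	shift

	if [ ! -d "$dir" ]
	then
		echo "fault_set: $dir not found" >&2
		return 1
	fi

	for pair in "$@"
	do
		# attr is the text before the first '=', value the text after it
		echo "${pair#*=}" > "$dir/${pair%%=*}" || return 1
	done
}
```

For example, "fault_set failslab probability=100 interval=10 times=-1" would,
under these assumptions, request a failure on every tenth eligible slab
allocation with no limit on the number of failures.
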
o Boot option

In order to inject faults while debugfs is not available (early boot time),
use the boot options:

	failslab=
	fail_page_alloc=
	fail_make_request=
	fail_futex=
	mmc_core.fail_request=<interval>,<probability>,<space>,<times>

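The value format can be easy to get wrong because the field order differs
from the order of the debugfs entries.  A small sketch that assembles such an
option string, assuming the <interval>,<probability>,<space>,<times> format
shown for the last option applies to each option in the list; fault_bootarg
is a hypothetical helper, not something shipped with the kernel:

```shell
#!/bin/bash
# Sketch only: fault_bootarg is a hypothetical helper.
# fault_bootarg <option> <interval> <probability> <space> <times>
# Prints a string suitable for appending to the kernel command line.
fault_bootarg()
{
	if [ $# -ne 5 ]
	then
		echo "Usage: fault_bootarg <option> <interval> <probability> <space> <times>" >&2
		return 1
	fi

	printf '%s=%s,%s,%s,%s\n' "$1" "$2" "$3" "$4" "$5"
}

fault_bootarg failslab 1 10 0 -1			# prints failslab=1,10,0,-1
fault_bootarg mmc_core.fail_request 100 100 0 5	# prints mmc_core.fail_request=100,100,0,5
```
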
How to add new fault injection capability
-----------------------------------------

o #include <linux/fault-inject.h>

o define the fault attributes

  DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

o provide a way to configure fault attributes

- boot option

  If you need to enable the fault injection capability from boot time, you can
  provide a boot option to configure it. There is a helper function for it:

	setup_fault_attr(attr, str);

- debugfs entries

  failslab, fail_page_alloc, and fail_make_request are configured this way.
  Helper function:

	fault_create_debugfs_attr(name, parent, attr);

- module parameters

  If the scope of the fault injection capability is limited to a
  single kernel module, it is better to provide module parameters to
  configure the fault attributes.

o add a hook to insert failures

  When should_fail() returns true, client code should inject a failure.

	should_fail(attr, size);

Application Examples
--------------------

o Inject slab allocation failures into module init/exit code

#!/bin/bash

FAILTYPE=failslab
# Only fail tasks that have /proc/<pid>/make-it-fail set to 1.
echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

# Run a command with fault injection enabled for that process.
faulty_system()
{
	bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
}

if [ $# -eq 0 ]
then
	echo "Usage: $0 modulename [ modulename ... ]"
	exit 1
fi

for m in "$@"
do
	echo "inserting $m..."
	faulty_system modprobe "$m"

	echo "removing $m..."
	faulty_system modprobe -r "$m"
done

------------------------------------------------------------------------------

o Inject page allocation failures only for a specific module

#!/bin/bash

FAILTYPE=fail_page_alloc
module=$1

if [ -z "$module" ]
then
	echo "Usage: $0 <modulename>"
	exit 1
fi

modprobe "$module"

if [ ! -d "/sys/module/$module/sections" ]
then
	echo "Module $module is not loaded"
	exit 1
fi

# Require a caller inside the module: from the start of .text up to .data.
cat "/sys/module/$module/sections/.text" > /sys/kernel/debug/$FAILTYPE/require-start
cat "/sys/module/$module/sections/.data" > /sys/kernel/debug/$FAILTYPE/require-end

echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

# Stop injecting when the script exits or is interrupted.
trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

echo "Injecting errors into the module $module... (interrupt to stop)"
sleep 1000000

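When a run behaves unexpectedly, it can help to read back what a script such
as the ones above actually wrote.  A minimal sketch, assuming debugfs is
mounted at /sys/kernel/debug; fault_show and FAULT_DEBUGFS are hypothetical
names used only for this illustration:

```shell
#!/bin/bash
# Sketch only: fault_show and FAULT_DEBUGFS are hypothetical names.
FAULT_DEBUGFS=${FAULT_DEBUGFS:-/sys/kernel/debug}

# fault_show <capability>
# Prints one "<attr>=<value>" line per readable attribute file.
fault_show()
{
	local dir="$FAULT_DEBUGFS/$1"
	local f

	if [ ! -d "$dir" ]
	then
		echo "fault_show: $dir not found" >&2
		return 1
	fi

	for f in "$dir"/*
	do
		# Skip anything that is not a readable regular file.
		[ -f "$f" ] && [ -r "$f" ] || continue
		echo "${f##*/}=$(cat "$f")"
	done
}
```

Calling "fault_show fail_page_alloc" after the script above would be expected
to list entries such as probability, interval and stacktrace-depth together
with their current values.
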
Tool to run command with failslab or fail_page_alloc
----------------------------------------------------

To make the tasks described above easier, use
tools/testing/fault-injection/failcmd.sh.  Run
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Examples:

Run the command "make -C tools/testing/selftests/ run_tests" while injecting
slab allocation failures.

	# ./tools/testing/fault-injection/failcmd.sh \
		-- make -C tools/testing/selftests/ run_tests

Same as above, but allow at most 100 failures instead of the default of at
most one.

	# ./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Same as above, but inject page allocation failures instead of slab
allocation failures.

	# env FAILCMD_TYPE=fail_page_alloc \
		./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests