	             Using the Linux Kernel Tracepoints

			    Mathieu Desnoyers


This document introduces Linux Kernel Tracepoints and their use. It
provides examples of how to insert tracepoints in the kernel and
connect probe functions to them and provides some examples of probe
functions.


* Purpose of tracepoints

A tracepoint placed in code provides a hook to call a function (probe)
that you can provide at runtime. A tracepoint can be "on" (a probe is
connected to it) or "off" (no probe is attached). When a tracepoint is
"off" it has no effect, except for a tiny time penalty (checking a
condition for a branch) and a space penalty (a few bytes for the
function call at the end of the instrumented function, plus a data
structure in a separate section). When a tracepoint is "on", the
function you provide is called each time the tracepoint is executed,
in the execution context of the caller. When the provided function
finishes, it returns to the caller (continuing from the tracepoint
site).
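As a rough illustration of the on/off behaviour described above, the
following userspace sketch models a tracepoint as a function pointer
that is checked before being called. It is not the kernel
implementation: trace_demo(), demo_probe, demo_count_probe() and
demo_work() are hypothetical names made up for this example, and the
real mechanism is generated by the macros in linux/tracepoint.h.

```c
/* Userspace model of a tracepoint; NOT the kernel implementation. */
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe type: receives the tracepoint's single argument. */
typedef void (*demo_probe_t)(int value);

/* NULL means the model tracepoint is "off". */
static demo_probe_t demo_probe = NULL;

/* Bookkeeping for the example probe: call count and last value seen. */
static int demo_hits;
static int demo_last;

static void demo_count_probe(int value)
{
	demo_hits++;
	demo_last = value;
}

/* The tracepoint site: one branch when "off", an indirect call when "on". */
static inline void trace_demo(int value)
{
	demo_probe_t probe = demo_probe;

	if (probe)
		probe(value);	/* runs in the caller's context, then returns here */
}

/* Instrumented function: its result does not depend on the probe. */
static int demo_work(int x)
{
	trace_demo(x);
	return x * 2;
}
```

Assigning demo_count_probe to demo_probe turns the model tracepoint
"on"; resetting it to NULL turns it "off" again. demo_work() returns
the same result in both cases.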

You can put tracepoints at important locations in the code. They are
lightweight hooks that can pass an arbitrary number of parameters,
whose prototypes are described in a tracepoint declaration placed in a
header file.

They can be used for tracing and performance accounting.


* Usage

Two elements are required for tracepoints:

- A tracepoint definition, placed in a header file.
- The tracepoint statement, in C code.

In order to use tracepoints, you should include linux/tracepoint.h.

In include/trace/events/subsys.h:

#undef TRACE_SYSTEM
#define TRACE_SYSTEM subsys

#if !defined(_TRACE_SUBSYS_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_SUBSYS_H

#include <linux/tracepoint.h>

DECLARE_TRACE(subsys_eventname,
	TP_PROTO(int firstarg, struct task_struct *p),
	TP_ARGS(firstarg, p));

#endif /* _TRACE_SUBSYS_H */

/* This part must be outside protection */
#include <trace/define_trace.h>

In subsys/file.c (where the tracing statement must be added):

#include <trace/events/subsys.h>

#define CREATE_TRACE_POINTS
DEFINE_TRACE(subsys_eventname);

void somefct(void)
{
	...
	trace_subsys_eventname(arg, task);
	...
}

Where:
- subsys_eventname is an identifier unique to your event
    - subsys is the name of your subsystem.
    - eventname is the name of the event to trace.

- TP_PROTO(int firstarg, struct task_struct *p) is the prototype of the
  function called by this tracepoint.

- TP_ARGS(firstarg, p) are the parameter names, the same as those found
  in the prototype.

- If you use the header in multiple source files, #define CREATE_TRACE_POINTS
  should appear in only one source file.

Connecting a function (probe) to a tracepoint is done by providing a
probe (function to call) for the specific tracepoint through
register_trace_subsys_eventname(). Removing a probe is done through
unregister_trace_subsys_eventname().

tracepoint_synchronize_unregister() must be called before the end of
the module exit function to make sure there is no caller left using
the probe. This, and the fact that preemption is disabled around the
probe call, make sure that probe removal and module unload are safe.

The tracepoint mechanism supports inserting multiple instances of the
same tracepoint, but a given tracepoint name must have a single
definition across the whole kernel to make sure no type conflict will
occur. Name mangling of the tracepoints is done using the prototypes
to make sure typing is correct. Verification of probe type correctness
is done at the registration site by the compiler. Tracepoints can be
put in inline functions, inlined static functions, and unrolled loops
as well as regular functions.
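The compile-time type check can be illustrated with a small userspace
model. The sketch below assumes a hypothetical DEMO_TRACE() macro (not
a kernel macro) that generates trace_<name>(), register_trace_<name>()
and unregister_trace_<name>() from a single prototype, so that passing
a probe with a different signature is diagnosed by the compiler at the
registration call. The kernel's real macros differ in detail; for
instance, they support several probes per tracepoint, whereas this
model holds only one.

```c
/* Userspace model only; the kernel's DECLARE_TRACE() expands differently. */
#include <assert.h>
#include <stddef.h>

#define DEMO_PROTO(...)	__VA_ARGS__
#define DEMO_ARGS(...)	__VA_ARGS__

/*
 * Hypothetical DEMO_TRACE(): generates a typed probe slot plus
 * trace_<name>(), register_trace_<name>() and unregister_trace_<name>().
 */
#define DEMO_TRACE(name, proto, args)					\
	static void (*demo_slot_##name)(proto);				\
	static inline void trace_##name(proto)				\
	{								\
		if (demo_slot_##name)					\
			demo_slot_##name(args);				\
	}								\
	static inline int register_trace_##name(void (*probe)(proto))	\
	{								\
		if (demo_slot_##name)					\
			return -1;	/* model: one probe only */	\
		demo_slot_##name = probe;				\
		return 0;						\
	}								\
	static inline void unregister_trace_##name(void (*probe)(proto)) \
	{								\
		if (demo_slot_##name == probe)				\
			demo_slot_##name = NULL;			\
	}

/* One definition of the event; every call site shares this prototype. */
DEMO_TRACE(demo_event,
	DEMO_PROTO(int firstarg, const char *label),
	DEMO_ARGS(firstarg, label))

/* State recorded by the example probe. */
static int demo_seen_arg;
static const char *demo_seen_label;

/* Signature must match DEMO_PROTO() above, or registration is diagnosed. */
static void demo_event_probe(int firstarg, const char *label)
{
	demo_seen_arg = firstarg;
	demo_seen_label = label;
}
```

A caller would register the probe with
register_trace_demo_event(demo_event_probe), fire the event with
trace_demo_event(42, "on"), and detach the probe again with
unregister_trace_demo_event(demo_event_probe).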

The naming scheme "subsys_event" is suggested here as a convention
intended to limit collisions. Tracepoint names are global to the
kernel: they are considered as being the same whether they are in the
core kernel image or in modules.

If the tracepoint has to be used in kernel modules, an
EXPORT_TRACEPOINT_SYMBOL_GPL() or EXPORT_TRACEPOINT_SYMBOL() can be
used to export the defined tracepoints.

If you need to do a bit of work for a tracepoint parameter, and
that work is only used for the tracepoint, that work can be encapsulated
within an if statement with the following:

	if (trace_foo_bar_enabled()) {
		int i;
		int tot = 0;

		for (i = 0; i < count; i++)
			tot += calculate_nuggets();

		trace_foo_bar(tot);
	}

All trace_<tracepoint>() calls have a matching trace_<tracepoint>_enabled()
function defined that returns true if the tracepoint is enabled and
false otherwise. The trace_<tracepoint>() call should always be within
the block of the if (trace_<tracepoint>_enabled()) to prevent races
between the tracepoint being enabled and the check being seen.

The advantage of using trace_<tracepoint>_enabled() is that it uses
the static_key of the tracepoint to allow the if statement to be implemented
with jump labels and avoid conditional branches.
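The effect of the guard can be seen in a userspace model. In this
sketch the static key is replaced by a plain boolean, so it shows only
the control flow and not the jump-label patching. foo_bar_on,
nugget_calls, traced_total and do_work() are hypothetical names added
for the example, and the bodies of trace_foo_bar(),
trace_foo_bar_enabled() and calculate_nuggets() are stand-ins.

```c
/* Userspace model of the trace_<tracepoint>_enabled() guard. */
#include <assert.h>
#include <stdbool.h>

static bool foo_bar_on;		/* stands in for the tracepoint's static key */
static int nugget_calls;	/* how many times the costly helper ran */
static int traced_total;	/* last value handed to the tracepoint */

static bool trace_foo_bar_enabled(void)
{
	return foo_bar_on;
}

/* Model tracepoint: records its argument only while enabled. */
static void trace_foo_bar(int tot)
{
	if (foo_bar_on)
		traced_total = tot;
}

/* Stand-in for work that is only needed to build the trace argument. */
static int calculate_nuggets(void)
{
	nugget_calls++;
	return 3;
}

static void do_work(int count)
{
	/* The loop is skipped entirely while the tracepoint is off. */
	if (trace_foo_bar_enabled()) {
		int i;
		int tot = 0;

		for (i = 0; i < count; i++)
			tot += calculate_nuggets();

		trace_foo_bar(tot);
	}
}
```

With foo_bar_on left false, do_work() never calls calculate_nuggets();
once it is set to true, the helper runs count times and the sum is
passed to trace_foo_bar().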

Note: The convenience macro TRACE_EVENT provides an alternative way to
      define tracepoints. Check http://lwn.net/Articles/379903,
      http://lwn.net/Articles/381064 and http://lwn.net/Articles/383362
      for a series of articles with more details.
