<html><head><meta http-equiv="Content-Type" content="text/html; charset=ANSI_X3.4-1968"><title>Avoiding Locks: Read Copy Update</title><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"><link rel="home" href="index.html" title="Unreliable Guide To Locking"><link rel="up" href="Efficiency.html" title="Chapter 9. Locking Speed"><link rel="prev" href="Efficiency.html" title="Chapter 9. Locking Speed"><link rel="next" href="per-cpu.html" title="Per-CPU Data"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Avoiding Locks: Read Copy Update</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="Efficiency.html">Prev</a> </td><th width="60%" align="center">Chapter 9. Locking Speed</th><td width="20%" align="right"> <a accesskey="n" href="per-cpu.html">Next</a></td></tr></table><hr></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="efficiency-read-copy-update"></a>Avoiding Locks: Read Copy Update</h2></div></div></div><p>
There is a special method of read/write locking called Read Copy
Update. Using RCU, the readers can avoid taking a lock
altogether: as we expect our cache to be read more often than
updated (otherwise the cache is a waste of time), it is a
candidate for this optimization.
</p><p>
How do we get rid of read locks? Getting rid of read locks
means that writers may be changing the list underneath the
readers. That is actually quite simple: we can read a linked
list while an element is being added if the writer adds the
element very carefully. 
For example, adding
<span class="symbol">new</span> to a singly-linked list called
<span class="symbol">list</span>:
</p><pre class="programlisting">
new->next = list->next;
wmb();
list->next = new;
</pre><p>
The <code class="function">wmb()</code> is a write memory barrier. It
ensures that the first operation (setting the new element's
<span class="symbol">next</span> pointer) is complete and will be seen by
all CPUs before the second operation (putting the new
element into the list) is performed. This is important, since modern
compilers and modern CPUs can both reorder instructions unless
told otherwise: we want a reader either not to see the new
element at all, or to see the new element with the
<span class="symbol">next</span> pointer correctly pointing at the rest of
the list.
</p><p>
Fortunately, there is a function to do this for standard
<span class="structname">struct list_head</span> lists:
<code class="function">list_add_rcu()</code>
(<code class="filename">include/linux/list.h</code>).
</p><p>
Removing an element from the list is even simpler: we replace
the pointer to the old element with a pointer to its successor,
and readers will either see it, or skip over it.
</p><pre class="programlisting">
list->next = old->next;
</pre><p>
There is <code class="function">list_del_rcu()</code>
(<code class="filename">include/linux/list.h</code>) which does this (the
normal version poisons the old object, which we don't want).
</p><p>
The reader must also be careful: some CPUs can look through the
<span class="symbol">next</span> pointer to start reading the contents of
the next element early, but don't realize that the pre-fetched
contents are wrong when the <span class="symbol">next</span> pointer changes
underneath them. 
Once again, there is a
<code class="function">list_for_each_entry_rcu()</code>
(<code class="filename">include/linux/list.h</code>) to help you. Of
course, writers can just use
<code class="function">list_for_each_entry()</code>, since there cannot
be two simultaneous writers.
</p><p>
Our final dilemma is this: when can we actually destroy the
removed element? Remember, a reader might be stepping through
this element in the list right now: if we free this element and
the <span class="symbol">next</span> pointer changes, the reader will jump
off into garbage and crash. We need to wait until we know that
all the readers who were traversing the list when we deleted the
element are finished. We use <code class="function">call_rcu()</code> to
register a callback which will actually destroy the object once
all pre-existing readers are finished. Alternatively,
<code class="function">synchronize_rcu()</code> may be used to block until
all pre-existing readers are finished.
</p><p>
But how does Read Copy Update know when the readers are
finished? The method is this: firstly, the readers always
traverse the list inside
<code class="function">rcu_read_lock()</code>/<code class="function">rcu_read_unlock()</code>
pairs: these simply disable preemption, so the reader won't go to
sleep while reading the list.
</p><p>
RCU then waits until every other CPU has slept at least once:
since readers cannot sleep, we know that any readers which were
traversing the list during the deletion are finished, and the
callback is triggered. The real Read Copy Update code is a
little more optimized than this, but this is the fundamental
idea. 
</p><pre class="programlisting">
--- cache.c.perobjectlock	2003-12-11 17:15:03.000000000 +1100
+++ cache.c.rcupdate	2003-12-11 17:55:14.000000000 +1100
@@ -1,15 +1,18 @@
 #include <linux/list.h>
 #include <linux/slab.h>
 #include <linux/string.h>
+#include <linux/rcupdate.h>
 #include <linux/mutex.h>
 #include <asm/errno.h>
 
 struct object
 {
-        /* These two protected by cache_lock. */
+        /* This is protected by RCU */
         struct list_head list;
         int popularity;
 
+        struct rcu_head rcu;
+
         atomic_t refcnt;
 
         /* Doesn't change once created. */
@@ -40,7 +43,7 @@
 {
         struct object *i;
 
-        list_for_each_entry(i, &cache, list) {
+        list_for_each_entry_rcu(i, &cache, list) {
                 if (i->id == id) {
                         i->popularity++;
                         return i;
@@ -49,19 +52,25 @@
         return NULL;
 }
 
+/* Final discard done once we know no readers are looking. */
+static void cache_delete_rcu(struct rcu_head *rcu)
+{
+        object_put(container_of(rcu, struct object, rcu));
+}
+
 /* Must be holding cache_lock */
 static void __cache_delete(struct object *obj)
 {
         BUG_ON(!obj);
-        list_del(&obj->list);
-        object_put(obj);
+        list_del_rcu(&obj->list);
         cache_num--;
+        call_rcu(&obj->rcu, cache_delete_rcu);
 }
 
 /* Must be holding cache_lock */
 static void __cache_add(struct object *obj)
 {
-        list_add(&obj->list, &cache);
+        list_add_rcu(&obj->list, &cache);
         if (++cache_num > MAX_CACHE_SIZE) {
                 struct object *i, *outcast = NULL;
                 list_for_each_entry(i, &cache, list) {
@@ -104,12 +114,11 @@
 struct object *cache_find(int id)
 {
         struct object *obj;
-        unsigned long flags;
 
-        spin_lock_irqsave(&cache_lock, flags);
+        rcu_read_lock();
         obj = __cache_find(id);
         if (obj)
                 object_get(obj);
-        spin_unlock_irqrestore(&cache_lock, flags);
+        rcu_read_unlock();
         return obj;
 }
</pre><p>
Note that the reader will alter the
<em 
class="structfield"><code>popularity</code></em> member in
<code class="function">__cache_find()</code>, and now it doesn't hold a lock.
One solution would be to make it an <span class="type">atomic_t</span>, but for
this usage, we don't really care about races: an approximate result is
good enough, so I didn't change it.
</p><p>
The result is that <code class="function">cache_find()</code> requires no
synchronization with any other functions, so it is almost as fast on SMP
as it would be on UP.
</p><p>
There is a further optimization possible here: remember our original
cache code, where there were no reference counts and the caller simply
held the lock whenever using the object? This is still possible: if
you hold the lock, no one can delete the object, so you don't need to
get and put the reference count.
</p><p>
Now, because the 'read lock' in RCU is simply disabling preemption, a
caller which always has preemption disabled between calling
<code class="function">cache_find()</code> and
<code class="function">object_put()</code> does not need to actually get and
put the reference count: we could expose
<code class="function">__cache_find()</code> by making it non-static, and
such callers could simply call that.
</p><p>
The benefit here is that the reference count is not written to: the
object is not altered in any way, which is much faster on SMP
machines due to caching.
</p></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="Efficiency.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="Efficiency.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="per-cpu.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Chapter 9. 
Locking Speed </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> Per-CPU Data</td></tr></table></div></body></html>