1<html><head><meta http-equiv="Content-Type" content="text/html; charset=ANSI_X3.4-1968"><title>Chapter 9. Locking Speed</title><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"><link rel="home" href="index.html" title="Unreliable Guide To Locking"><link rel="up" href="index.html" title="Unreliable Guide To Locking"><link rel="prev" href="racing-timers.html" title="Racing Timers: A Kernel Pastime"><link rel="next" href="efficiency-read-copy-update.html" title="Avoiding Locks: Read Copy Update"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Chapter 9. Locking Speed</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="racing-timers.html">Prev</a> </td><th width="60%" align="center"> </th><td width="20%" align="right"> <a accesskey="n" href="efficiency-read-copy-update.html">Next</a></td></tr></table><hr></div><div class="chapter"><div class="titlepage"><div><div><h1 class="title"><a name="Efficiency"></a>Chapter 9. Locking Speed</h1></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl class="toc"><dt><span class="sect1"><a href="Efficiency.html#efficiency-rwlocks">Read/Write Lock Variants</a></span></dt><dt><span class="sect1"><a href="efficiency-read-copy-update.html">Avoiding Locks: Read Copy Update</a></span></dt><dt><span class="sect1"><a href="per-cpu.html">Per-CPU Data</a></span></dt><dt><span class="sect1"><a href="mostly-hardirq.html">Data Which Mostly Used By An IRQ Handler</a></span></dt></dl></div><p> 2There are three main things to worry about when considering speed of 3some code which does locking. First is concurrency: how many things 4are going to be waiting while someone else is holding a lock. Second 5is the time taken to actually acquire and release an uncontended lock. 6Third is using fewer, or smarter locks. I'm assuming that the lock is 7used fairly often: otherwise, you wouldn't be concerned about 8efficiency. 9</p><p> 10Concurrency depends on how long the lock is usually held: you should 11hold the lock for as long as needed, but no longer. In the cache 12example, we always create the object without the lock held, and then 13grab the lock only when we are ready to insert it in the list. 14</p><p> 15Acquisition times depend on how much damage the lock operations do to 16the pipeline (pipeline stalls) and how likely it is that this CPU was 17the last one to grab the lock (ie. is the lock cache-hot for this 18CPU): on a machine with more CPUs, this likelihood drops fast. 19Consider a 700MHz Intel Pentium III: an instruction takes about 0.7ns, 20an atomic increment takes about 58ns, a lock which is cache-hot on 21this CPU takes 160ns, and a cacheline transfer from another CPU takes 22an additional 170 to 360ns. (These figures from Paul McKenney's 23<a class="ulink" href="http://www.linuxjournal.com/article.php?sid=6993" target="_top"> Linux 24Journal RCU article</a>). 25</p><p> 26These two aims conflict: holding a lock for a short time might be done 27by splitting locks into parts (such as in our final per-object-lock 28example), but this increases the number of lock acquisitions, and the 29results are often slower than having a single lock. This is another 30reason to advocate locking simplicity. 31</p><p> 32The third concern is addressed below: there are some methods to reduce 33the amount of locking which needs to be done. 34</p><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="efficiency-rwlocks"></a>Read/Write Lock Variants</h2></div></div></div><p> 35 Both spinlocks and mutexes have read/write variants: 36 <span class="type">rwlock_t</span> and <span class="structname">struct rw_semaphore</span>. 37 These divide users into two classes: the readers and the writers. If 38 you are only reading the data, you can get a read lock, but to write to 39 the data you need the write lock. Many people can hold a read lock, 40 but a writer must be sole holder. 41 </p><p> 42 If your code divides neatly along reader/writer lines (as our 43 cache code does), and the lock is held by readers for 44 significant lengths of time, using these locks can help. They 45 are slightly slower than the normal locks though, so in practice 46 <span class="type">rwlock_t</span> is not usually worthwhile. 47 </p></div></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="racing-timers.html">Prev</a> </td><td width="20%" align="center"> </td><td width="40%" align="right"> <a accesskey="n" href="efficiency-read-copy-update.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Racing Timers: A Kernel Pastime </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> Avoiding Locks: Read Copy Update</td></tr></table></div></body></html> 48