
This document tries to describe the design of the rtmutex.c implementation.

The goal of this document is to help others understand the priority
inheritance (PI) algorithm that is used, as well as the reasons behind the
design decisions that were made.

Priority inversion happens when a lower priority process executes while a
higher priority process wants to run. This can occur for several reasons, and
most of the time it can't be helped. Anytime a high priority process wants
to use a resource that a lower priority process holds (a mutex, for example),
the high priority process must wait. What we want to prevent is unbounded
priority inversion: when the high priority process is kept from running for
an undetermined amount of time.
The classic example of unbounded priority inversion is where you have three
processes, call them A, B, and C, where A is the highest priority process,
C is the lowest, and B is in between. A tries to grab a lock that C owns, so
A must sleep and let C run until it releases the lock. But in the
meantime, B executes, and since B is of a higher priority than C, it preempts C,
and by doing so is in effect preempting A, the highest priority process.
Now there's no way of knowing how long A will be sleeping waiting for C
to release the lock, because for all we know, B is a CPU hog and will never
give C a chance to run. This is unbounded priority inversion.
There are several ways to solve this issue, but other ways are out of scope
for this document. Here's a quick description of PI:

PI is where a process inherits the priority of another process if the other
process blocks on a lock owned by the current process. Using the example
above, when A blocks on the lock owned by C, C would inherit the priority
of A. So now if B becomes runnable, it would not preempt C, since C now has
the high priority of A. As soon as C releases the lock, it loses its
inherited priority, and A can then continue with the resource that C held.
Here is some terminology used in this document to describe the design:

PI chain - The PI chain is an ordered series of locks and processes that cause
           processes to inherit priorities from a previous process that is
           blocked on one of its locks. This is described in more detail
           later in this document.

mutex    - In this document, to differentiate from the locks that protect the
           PI code itself, the sleeping, PI-aware locks are called mutexes.

lock     - In this document, the term lock is used when
           referring to spin locks that are used to protect parts of the PI
           algorithm.

waiter   - A waiter is a struct that is stored on the stack of a blocked
           process. Since the scope of the waiter is within the code for
           a process being blocked on a mutex, it is fine to allocate it on
           the process's stack. The waiter holds the plist nodes used to
           place the task in the waiter_list of a mutex as well as the
           pi_list of a mutex owner task (described below).

waiters  - A list of processes that are blocked on a mutex.

top waiter - The highest priority process waiting on a specific mutex.

top pi waiter - The highest priority process waiting on one of the mutexes
                that a specific process owns.
PI chain
--------

The PI chain is a list of processes and mutexes that may cause priority
inheritance to take place. Multiple chains may converge, but a chain never
diverges, since a process can't be blocked on more than one mutex at a time.

For PI to work, the processes at the right end of these chains (or we may
also call it the Top of the chain) must be equal to or higher in priority
than the processes to the left or below in the chain.
plist
-----

The implementation of plist is out of scope for this document, but it is
important for understanding the design of the PI implementation. A plist is
a priority-sorted linked list: the nodes are kept sorted such that the
priorities of the plist are ordered, and it takes O(1) to retrieve the
highest priority item in the list.

Another difference, which matters for the implementation, is that, unlike a
list, the head of the list is a different element than the nodes of a list.
So the head of the list is declared as struct plist_head and nodes that will
be added to the list are declared as struct plist_node.
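The plist idea can be illustrated with a minimal user-space sketch (the
names and layout here are illustrative, not the kernel's plist API): insert
keeps the list sorted, so the highest priority entry is always first.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of a priority-sorted list: sorted insert buys O(1) retrieval
 * of the highest priority node. Lower number = higher priority, as in
 * the kernel's "prio" convention. */
struct pnode {
    int prio;
    struct pnode *next;
};

struct phead {
    struct pnode *first;    /* the head is a distinct type from the nodes */
};

/* Insert in sorted position (O(n) on insert). */
void pnode_insert(struct phead *h, struct pnode *n)
{
    struct pnode **pp = &h->first;

    while (*pp && (*pp)->prio <= n->prio)
        pp = &(*pp)->next;
    n->next = *pp;
    *pp = n;
}

/* O(1): the top priority node is always at the front. */
struct pnode *pnode_top(struct phead *h)
{
    return h->first;
}
```

The trade-off shown here is exactly the one the PI code wants: finding the
top waiter must be cheap, because it happens with spin locks held.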
Mutex Waiter List
-----------------

Every mutex keeps track of all the waiters that are blocked on itself. The mutex
has a plist to store these waiters by priority. This list is protected by
a spin lock that is located in the struct of the mutex. This lock is called
wait_lock. Since the modification of the waiter list is never done in
interrupt context, the wait_lock can be taken without disabling interrupts.
Task PI List
------------

To keep track of the PI chains, each process has its own PI list. This is
a list of all top waiters of the mutexes that are owned by the process.
Note that it holds only the top waiters, not every process blocked on a
mutex owned by the task.

The top of the task's PI list is always the highest priority task that
is waiting on a mutex owned by the task. So if the task has
inherited a priority, it will always be the priority of the task that is
at the top of this list.

This list is stored in the task structure of a process as a plist called
pi_list, and is protected by a spin lock, also in the task structure,
called pi_lock.
Depth of the PI Chain
---------------------

The maximum depth of the PI chain is not dynamic, and could actually be
defined. But it is very complex to figure out, since it depends on all
the nesting of mutexes. Let's look at the example where we have 3 mutexes,
L1, L2, and L3, and four separate functions, func1 through func4.
The following shows a locking order of L1->L2->L3, but may not actually
be directly nested that way.
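Four such functions might look like the following sketch (user-space pthread
mutexes stand in for the kernel mutexes, and the bodies are assumptions; the
point is only the pairwise nesting that yields the overall L1->L2->L3 order):

```c
#include <assert.h>
#include <pthread.h>

/* Illustrative nesting: no single function locks more than two mutexes,
 * yet together they define a locking order L1->L2->L3. */
pthread_mutex_t L1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t L2 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t L3 = PTHREAD_MUTEX_INITIALIZER;

int work_done;          /* touched only while holding a lock */

void func1(void)
{
    pthread_mutex_lock(&L1);
    work_done++;                        /* do anything */
    pthread_mutex_unlock(&L1);
}

void func2(void)
{
    pthread_mutex_lock(&L1);
    pthread_mutex_lock(&L2);
    work_done++;                        /* do something */
    pthread_mutex_unlock(&L2);
    pthread_mutex_unlock(&L1);
}

void func3(void)
{
    pthread_mutex_lock(&L2);
    pthread_mutex_lock(&L3);
    work_done++;                        /* do something else */
    pthread_mutex_unlock(&L3);
    pthread_mutex_unlock(&L2);
}

void func4(void)
{
    pthread_mutex_lock(&L3);
    work_done++;                        /* do something else */
    pthread_mutex_unlock(&L3);
}
```

Note that each function, viewed alone, has a locking depth of at most two;
the deeper chain only appears when separate processes run them concurrently.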
Now we add 4 processes that run each of these functions separately:
A, B, C, and D, with A the highest priority and D the lowest. If D is
preempted while holding L3 in func4, C blocks on L3 while holding L2 in
func3, B blocks on L2 while holding L1 in func2, and A blocks on L1 in
func1, we get the chain A->L1->B->L2->C->L3->D.

This gives us a PI depth of 4 (four processes), but looking at any of the
functions individually, each only has at most a locking
depth of two. So, although the locking depth is defined at compile time,
it still is very difficult to find the possibilities of that depth.

Now, since mutexes can be defined by user-land applications, we don't want a DOS
type of application that nests large amounts of mutexes to create a large
PI chain, and have the code holding spin locks while looking at a large
amount of data. So to prevent this, the implementation not only enforces
a maximum lock depth, but also holds at most two different locks at a time
as it walks the PI chain. More about this below.
Mutex owner and flags
---------------------

The mutex structure contains a pointer to the owner of the mutex. If the
mutex is not owned, this owner is set to NULL. Since task structures are
aligned to at least four bytes, the two least significant bits of the owner
pointer are free to be used as flags: "Pending Owner" and "Has Waiters",
both described later.

cmpxchg Tricks
--------------

Some architectures implement an atomic cmpxchg (Compare and Exchange). This
is used (when applicable) to keep the fast path of grabbing and releasing
mutexes short. cmpxchg atomically compares the value at a location A with an
expected value B, and only if they match stores a new value C there. You
know the exchange succeeded if
the return value (the old value of A) is equal to B.

The use of rt_mutex_cmpxchg with the flags in the owner field helps optimize
the system for architectures that support it. This will also be explained
later in this document.
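The fast path can be sketched in user space with C11 atomics standing in for
the architecture's cmpxchg (rt_mutex_cmpxchg itself is a kernel helper; the
names below are illustrative): since the lock is free exactly when the owner
word is NULL, one compare-and-exchange both checks and claims it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Sketch of the cmpxchg-based fast path. Flag bits are ignored here;
 * a mutex with waiters would have a flag set in the owner word, making
 * these exchanges fail and forcing the slow path. */
struct fake_mutex {
    _Atomic(void *) owner;      /* NULL when unlocked */
};

/* Returns 1 if "me" grabbed the lock via the fast path, 0 otherwise. */
int fast_trylock(struct fake_mutex *m, void *me)
{
    void *expected = NULL;

    /* Succeeds only if owner was NULL; old value lands in "expected". */
    return atomic_compare_exchange_strong(&m->owner, &expected, me);
}

/* Fast unlock: only succeeds when we own the word with no flag bits set. */
int fast_unlock(struct fake_mutex *m, void *me)
{
    void *expected = me;

    return atomic_compare_exchange_strong(&m->owner, &expected, NULL);
}
```

The nice property is that a failed exchange tells you exactly why you must
take the slow path: the owner word was not what you expected.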
Priority adjustments
--------------------

The implementation of the PI code in rtmutex.c has several places where a
process must adjust its priority. With the help of the pi_list of a
process, it is rather easy to know what needs to be adjusted.

rt_mutex_getprio returns the priority that the task should have: either the
task's own normal priority, or, if a process of a higher priority is waiting on
a mutex owned by the task, that higher priority.
Since the pi_list of a task holds an ordered-by-priority list of all the top
waiters of all the mutexes that the task owns, rt_mutex_getprio simply needs
to compare the top pi waiter to the task's own normal priority, and return
the higher of the two.

(Note: if looking at the code, you will notice that the lower number of
       the two is returned. This is because the "prio" field in the task
       structure
       is an inverse order of the actual priority. So a "prio" of 5 is
       of higher priority than a "prio" of 10.)

__rt_mutex_adjust_prio examines the result of rt_mutex_getprio, and if the
result does not equal the task's current priority, then rt_mutex_setprio
is called to adjust the priority of the task to the new priority.

Note that __rt_mutex_adjust_prio can either increase
or decrease the priority of the task. In the case that a higher priority
process has just blocked on a mutex owned by the task, it
would increase/boost the task's priority. But if a higher priority waiter
were for some reason to leave the mutex (timeout or signal), this same function
would decrease/unboost the priority of the task. That is because the pi_list
always contains the highest priority task that is waiting on a mutex owned
by the task, so we only need to compare the priority of that top pi waiter
to the normal priority of the given task.
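The comparison rt_mutex_getprio performs can be sketched as follows,
assuming the kernel's inverse "prio" numbering (lower number = higher
priority); the structures are simplified stand-ins, not the real
task_struct:

```c
#include <assert.h>
#include <stddef.h>

struct waiter {
    int prio;                       /* inverse scale: 5 outranks 10 */
};

struct task {
    int normal_prio;                /* priority without any boosting */
    struct waiter *top_pi_waiter;   /* top waiter across all owned
                                     * mutexes, or NULL if none */
};

/* Return the priority the task should run at: its own normal priority,
 * or the top pi waiter's priority if that one is higher (numerically
 * lower). */
int task_getprio(struct task *t)
{
    if (t->top_pi_waiter && t->top_pi_waiter->prio < t->normal_prio)
        return t->top_pi_waiter->prio;
    return t->normal_prio;
}
```

Because the pi_list is priority sorted, this single comparison against its
top element is all that boosting and deboosting ever need.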
High level overview of the PI chain walk
----------------------------------------

The PI chain walk is implemented by rt_mutex_adjust_prio_chain, which can be
used either to boost or to lower process priorities. It is called with a
task to be checked for PI
(de)boosting (the owner of a mutex that a process is blocking on), a flag to
check for deadlocking, the mutex in question, and the waiter that is blocked
on that mutex.

When this function is called, no locks are held. That also means
that the state of the owner and lock can change when entered into this function.

Before this function is called, the task's own priority has already been
adjusted. This means that the task is set to the priority that it
should be at, but the plist nodes of the task's waiter have not been updated
with the new priority, so the task may not be in the proper locations in the
pi_lists and wait_lists it participates in. This function fixes that up.

A loop is entered, where "task" is the owner to be checked for PI changes that
was passed by parameter (for the first iteration). The pi_lock of this task is
taken to prevent any more changes to the pi_list of the task. This also
prevents new tasks from blocking on mutexes owned by the task.

If the task is not blocked on a mutex, the loop is exited; we are at
the top of the PI chain.
A check is now done to see if the original waiter (the process that is blocked
on the current mutex) is the top pi waiter of the task. That is, is this
waiter on the top of the task's pi_list? If it is not, it either means that
there is another process higher in priority that is blocked on one of the
mutexes that the task owns, or that the waiter has left the chain via a
signal or timeout. In either case, the loop is exited, since
we don't need to do any more changes to the priority of the current task, or any
task further along the chain.

The next check sees if the task's waiter plist node has a priority equal to
the priority the task is currently set at. If they are equal, we are done with
the loop. Remember that the function started with the priority of the
task already adjusted; it is only the plist nodes referring to the task that
may still need updating.
Next, we look at the mutex that the task is blocked on. Its wait_lock
is taken. This is done by a spin_trylock, because the locking order of the
pi_lock and wait_lock goes in the opposite direction; if the trylock fails,
the pi_lock is released and the loop restarts.

Now that we have both the pi_lock of the task as well as the wait_lock of
the mutex the task is blocked on, we update the task's waiter plist node
on the mutex's wait_list.

Now we release the pi_lock of the task.

Next the owner of the mutex has its pi_lock taken, so we can update the
task's entry in the owner's pi_list. A check is made whether this waiter
(the one blocked
on the current mutex) is the top waiter of the mutex. If it is, the previous
top waiter is removed from the owner's pi_list and replaced with the task.
It is possible that the task was already the top waiter before the update,
in which case the task is not yet on the pi_list of the owner, and it is
simply added.

If the task was not the top waiter of the mutex, but it was before we
did the priority updates, that means we are deboosting/lowering the
task. In this case, the task is removed from the pi_list of the owner,
and the new top waiter is added.

Lastly, we unlock both the pi_lock of the task, as well as the mutex's
wait_lock, and continue the loop again. On the next iteration of the
loop, the previous owner of the mutex will be the task that will be
processed.
Note: One might think that the owner of this mutex might have changed
      between the checks and the grabbing of the wait_lock, and one could
      be right. The important thing is that the owner could not have become
      the task currently being processed in the chain, since
      we have taken that task's pi_lock at the beginning of the loop.
      So as long as there is an owner of this mutex that is not the same
      as the task being processed, we are OK.

Looking closely at the code, one might be confused. The check for the
end of the PI chain is protected only by the task's pi_lock, but the unlock
code that would end the chain modifies state under
the protection of the mutex's wait_lock, which was not taken yet.
Isn't that a race?

The answer is No! The trick is the spin_trylock of the mutex's
wait_lock. If we fail that lock, we release the pi_lock of the
task and continue the loop, doing the end of PI chain check again.

In the code to release the lock, the wait_lock of the mutex is held
the entire time, and it is not let go when we grab the pi_lock of the
new owner of the mutex. So if the switch to a new owner were to happen
after the check for the end of the PI chain and before the grabbing of the
wait_lock, the chain walk is guaranteed to fail its spin_trylock of the
wait_lock and start over, seeing the new state.

If you don't quite understand the above, that's OK. You don't have to,
unless you really want to make a proof out of it ;)
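Stripped of all locking, retry logic, and plist bookkeeping, the boosting
direction of the chain walk reduces to following blocked_on -> owner links.
A greatly simplified, single-threaded model (all names are illustrative):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of PI boosting: follow the chain from a blocked task and
 * raise each owner so it is at least as high priority as the waiter.
 * Lower prio number = higher priority, as in the kernel. Deboosting,
 * locking, and the two-locks-at-a-time dance are all omitted. */
struct mutex_m;

struct task_m {
    int prio;                   /* current (possibly boosted) priority */
    struct mutex_m *blocked_on; /* mutex this task waits on, or NULL */
};

struct mutex_m {
    struct task_m *owner;
};

void adjust_prio_chain(struct task_m *waiter)
{
    int prio = waiter->prio;
    struct mutex_m *m = waiter->blocked_on;

    while (m && m->owner) {
        struct task_m *owner = m->owner;

        if (prio >= owner->prio)
            break;              /* owner already high enough: stop */
        owner->prio = prio;     /* boost the owner */
        m = owner->blocked_on;  /* keep walking if the owner is blocked */
    }
}
```

The early break mirrors the real walk's exit conditions: once a task in the
chain is unaffected by the new waiter, nothing further up can be affected
either.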
Pending Owners and Lock stealing
--------------------------------

One of the flags in the owner field of the mutex structure is "Pending Owner".
It means that an owner was chosen by the process releasing the mutex, but
that owner has yet to wake up and actually take the mutex.

Why does this matter? The PI code exists to let the highest priority process
run with as few latencies and delays as possible. If a high priority process
releases a mutex to a lower priority waiter and then immediately tries to
take it again, it would block behind that waiter and suffer the full
latency of that critical section (since the low priority process just entered
it). There's no reason a high priority process that gives up a mutex should be
penalized if it tries to take that mutex again. If the new owner of the
mutex has not woken up yet, there's no reason the mutex can't be stolen by
the higher priority process.
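Because task structures are at least four-byte aligned, the "Pending Owner"
and "Has Waiters" flags can live in the two low bits of the owner pointer.
A user-space sketch of that encoding (the masks and helper names are
illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Flag bits packed into the low two bits of the owner pointer; valid
 * only because the pointed-to structures are >= 4-byte aligned. */
#define OWNER_PENDING   0x1UL   /* "Pending Owner" */
#define OWNER_WAITERS   0x2UL   /* "Has Waiters"   */
#define OWNER_FLAG_MASK 0x3UL

struct fake_task {
    long pad;                   /* ensures pointer alignment */
};

uintptr_t pack_owner(struct fake_task *t, uintptr_t flags)
{
    return (uintptr_t)t | flags;
}

struct fake_task *owner_task(uintptr_t word)
{
    return (struct fake_task *)(word & ~OWNER_FLAG_MASK);
}

int owner_pending(uintptr_t word)
{
    return (word & OWNER_PENDING) != 0;
}

int owner_has_waiters(uintptr_t word)
{
    return (word & OWNER_WAITERS) != 0;
}
```

Keeping the flags in the same word as the pointer is what lets a single
cmpxchg on the owner field observe "unowned, no waiters" atomically.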
Taking of a mutex (The walk through)
------------------------------------

OK, now let's take a look at the detailed walk through of what happens when
taking a mutex.

The first thing that is tried is the fast taking of the mutex. This is
only possible with CMPXCHG enabled (otherwise the fast path automatically
fails). Only when the owner field of the mutex is NULL can the lock be
taken this way, and then nothing else needs to be done.

If the fast path fails, the slow path is entered, and a waiter structure is
created on the stack, since the waiter is only needed within the
scope of this function. The waiter structure holds the nodes to store
the task on the wait_list of the mutex, and if need be, the pi_list of
the owner.
The wait_lock of the mutex is taken, since the slow path of unlocking the
mutex also takes this lock.

Next, an attempt is made to take the mutex. This is also where an
architecture without CMPXCHG first enters the
slow path. The first thing that is done here is an atomic setting of
the "Has Waiters" flag of the mutex's owner field. By setting this flag,
the owner of the mutex can't release the mutex without going into the
slow unlock path, where it would have to take the wait_lock this code
currently holds; the two paths are thereby synchronized.

If the owner is only a pending owner and the current task has a higher
priority, the lock can be stolen. To do this, we need
to take the top waiter, remove it from the pi_list of the pending owner, and
update the accounting. Note that, having been chosen, the pending owner
is no longer on the list of waiters. This is fine, since the pending owner,
on waking and finding the lock stolen, simply blocks on the mutex again.
On success, we set the owner field
of the mutex to current, and set the flag of "Has Waiters" if the current
mutex still has other waiters.

If a timeout was requested, a timer is set up so that we stop trying to
take ownership
of this mutex if we failed to get it after a specified amount of time.
Now we enter a loop that will continue to try to take ownership of the mutex, or
fail from a timeout or signal.

The accounting of a mutex and process is done with the waiter structure of
the process: the task field is set to the current process, and the lock
field to the mutex being taken.

Since the wait_lock was taken at the entry of the slow lock, we can safely
add the waiter to the wait_list of the mutex. If the current process is the
highest priority process waiting on this mutex, then we remove the
previous top waiter process (if it exists) from the pi_list of the owner,
and add the current process to that list. Since the pi_list of the owner
has changed, the owner's priority may need to be adjusted.

If the owner is itself blocked on another mutex
(or deadlock checking is on), we unlock the wait_lock of the mutex and go ahead
and run rt_mutex_adjust_prio_chain on the owner, as described earlier.
The task then blocks until one of the following happens:

1) we were given pending ownership of the mutex;
2) we received a signal (if the lock attempt is interruptible);
3) the timeout expired (if a timeout was requested).

In any of these cases, we continue the loop and once again try to grab the
ownership of the mutex. If we succeed, we exit the loop; on a signal or
timeout, the loop is exited with a failure.
Note: For various reasons, because of timeout and signals, the steal mutex
      logic must cope with a task that is no longer, or conversely is
      still, on the wait_list. And because of dynamic changing of priorities,
      the top waiter can change while the task sleeps.

If the lock was not acquired, the waiter's task field is set back to
NULL and the task needs to be taken off the wait_list of the mutex and perhaps
the pi_list of the owner. If this process was a high priority process, then
the owner may need to be deboosted via another PI chain walk.
Unlocking the Mutex
-------------------

The unlocking of a mutex also has a fast path for those architectures with
CMPXCHG. Since the taking of a mutex on contention always sets the
"Has Waiters" flag of the mutex's owner, we use this to know if we need to
take the slow path when unlocking the mutex. If the mutex doesn't have any
waiters, the owner field of the mutex would equal the current process, and
the mutex can be unlocked by just replacing the owner field with NULL.

The first thing done in the slow unlock path is to take the wait_lock of the
mutex. This synchronizes the locking and unlocking of the mutex.
A check is made to see if the mutex has waiters. On architectures that
do not have CMPXCHG, this is the location where the owner of the mutex
determines whether a waiter needs to be awoken. On architectures that do
have CMPXCHG, that check happens in the fast path, but it is still needed
in the slow path too. If a waiter of a mutex woke up because of a signal
or timeout between the time the owner failed the fast path CMPXCHG check and
the grabbing of the wait_lock, the mutex may not have any waiters, thus the
owner still needs to make this check. If there are no waiters, the owner
field is simply set to NULL and the wait_lock is released.
In the wake up code, the pi_lock of the current owner is taken. The top
waiter of the lock is found and removed from the wait_list of the mutex
as well as the pi_list of the current owner. The task field of the new
pending owner's waiter structure is set to NULL, and the owner field of the
mutex is set to the new owner with the "Pending Owner" bit set, as well as
the "Has Waiters" bit if there are still other processes blocked on the
mutex.

The pi_lock of the previous owner is released, and the new pending owner's
pi_lock is taken.

We now clear the "pi_blocked_on" field of the new pending owner, and if
there are still waiters on the mutex, the new top waiter is added to the
pi_list
of the pending owner.

Finally we unlock the pi_lock of the pending owner and wake it up.