
MCA/INIT are completely asynchronous.  They can occur at any time, when
the OS is in any state, including when one of the cpus is already
holding a spinlock.  Trying to take any lock from MCA/INIT state is
asking for deadlock.  Also the state of structures that are protected
by locks may be half way through an update when the event arrives.
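
To make the "no locks" rule concrete, here is a minimal sketch (not
code from the kernel's MCA/INIT path; some_lock and
broken_mca_callback are invented names) of the kind of thing an
MCA/INIT-time callback must never do:

    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(some_lock);

    /* If the interrupted cpu was holding some_lock when the MCA/INIT
     * event was delivered, this spins forever: the lock owner cannot
     * run again until the handler returns. */
    static void broken_mca_callback(void)
    {
            spin_lock(&some_lock);
            /* ... inspect some structure protected by some_lock ... */
            spin_unlock(&some_lock);
    }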

* MCA occurs on one cpu, usually due to a double bit memory error.
  This is the monarch cpu.

* SAL sends an MCA rendezvous interrupt (which is a normal interrupt)
  to all the other cpus, the slaves.

* Slave cpus that receive the MCA interrupt call down into SAL; they
  end up spinning disabled while the MCA is being serviced.

* If any slave cpu was already spinning disabled when the MCA occurred
  then it cannot service the MCA interrupt.  SAL waits ~20 seconds,
  then sends an unmaskable INIT event to the slave cpus that have not
  already rendezvoused.

* Because MCA/INIT can be delivered at any time, including when the cpu
  is down in PAL in physical mode, the registers at the time of the
  event are _completely_ undefined.  In particular the MCA/INIT
  handlers cannot rely on the thread pointer; PAL physical mode can
  (and does) change it, and is only required to restore it before
  returning to the OS.

* If an MCA/INIT event occurs while the kernel is running (not user
  space) and the kernel has called PAL, then the MCA/INIT handler
  cannot assume that the kernel stack is in a fit state to be used,
  mainly because PAL may or may not maintain the stack pointer
  internally.  Because the MCA/INIT handlers cannot trust the kernel
  stack, they have to use their own dedicated per-cpu stacks.  The
  MCA/INIT stacks are preformatted with just enough task state to let
  the relevant handlers do their job.

* Unlike most other architectures, the ia64 struct task is embedded in
  the kernel stack[1].  So switching to a new kernel stack means that
  we switch to a new task as well.  Because various bits of the kernel
  assume that current points into the struct task, switching to a new
  stack also means a new value for current (see the sketch following
  this list).

* Once all slaves have rendezvoused and are spinning disabled, the
  monarch is entered.  The monarch now tries to diagnose the problem
  and decide whether it can recover or not.

* Part of the monarch's job is to look at the state of all the other
  tasks.  The only way to do that on ia64 is to call the unwinder.

* The starting point for the unwind depends on whether a task is
  blocked or running on a cpu.  Blocked tasks can be unwound from their
  saved state.  But (and it's a big but), the cpus that received the
  MCA rendezvous interrupt never blocked; they are spinning disabled
  inside SAL, still on their normal kernel stacks, so their tasks still
  count as running.

* To distinguish between these two cases, the monarch must know which
  tasks are on a cpu and which are not.  Hence each cpu that switches
  to an MCA/INIT stack registers its new "task" using set_curr_task(),
  so the monarch can tell that the _original_ task is no longer running
  on that cpu.  That gives us a decent chance of getting a valid
  backtrace of the _original_ task.

* MCA/INIT can be nested, to a depth of 2 on any cpu.  In the case of a
  nested error, we want diagnostics on the MCA/INIT handler that
  failed, not on the task that was originally running.  Again this
  requires set_curr_task() so the MCA/INIT handlers can register their
  own "task" as current, to get a meaningful process stack trace of the
  failing handler's "task".

[1] The original design called for ia64 to separate the struct task
    and the kernel stacks.  Then the MCA/INIT data would be kept on
    dedicated interrupt-style stacks without needing their own tasks.
    But that required radical surgery on the rest of ia64, plus extra
    hard wired TLB entries with the associated performance cost, so
    the idea was dropped.  Which is why separate kernel stacks meant
    separate "tasks" for the MCA/INIT handlers.
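
The sketch below shows the idea behind the per-cpu MCA/INIT stacks and
the set_curr_task() registration described above.  It is illustrative
only: handler_stack, mca_handler_stack, register_handler_task and
HANDLER_STACK_SIZE are invented names, and the real per-cpu areas are
set up by the ia64 MCA code in arch/ia64/kernel/mca.c.

    #include <linux/sched.h>
    #include <linux/percpu.h>

    #define HANDLER_STACK_SIZE      (32UL * 1024)   /* illustrative size only */

    /* Each handler stack is laid out like a normal ia64 kernel stack:
     * the struct task (and thread_info) sits at the bottom, the RSE
     * backing store grows up from above it and the memory stack grows
     * down from the top.  Switching onto this stack is therefore also
     * switching to a new task. */
    union handler_stack {
            struct task_struct task;
            unsigned long space[HANDLER_STACK_SIZE / sizeof(unsigned long)];
    };

    static DEFINE_PER_CPU(union handler_stack, mca_handler_stack);

    /* Once a cpu is running on its handler stack, tell the scheduler,
     * so that curr_task(cpu) no longer returns the interrupted task
     * and the monarch treats the _original_ task as off-cpu. */
    static void register_handler_task(int cpu)
    {
            set_curr_task(cpu, &per_cpu(mca_handler_stack, cpu).task);
    }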

INIT is less complicated than MCA.  Pressing the nmi button or using
the equivalent command on the management console sends INIT to all
cpus.  SAL picks one of the cpus as the monarch and the rest are
slaves.  All the OS INIT handlers are entered at approximately the same
time.  The OS monarch prints the state of all tasks and returns, after
which the slaves return and the system resumes.

Unfortunately there are broken and buggy versions of SAL out there.
Some drive all the cpus as monarchs.  Some wait for only one cpu to
return from the OS, then drive the rest as slaves.  Some versions of
SAL cannot even cope with returning from the OS at all; they spin
inside SAL instead.  The OS INIT code works around some of these
broken SAL symptoms, but some simply cannot be fixed from the OS side.

Getting MCA/INIT handling right is a difficult problem because of the
asynchronous nature of these events.  For comparison, here is how ia64
MCA/INIT differs from x86 NMI:

* x86 has a separate struct task which points at one of multiple kernel
  stacks.  ia64 has the struct task embedded in the single kernel
  stack, so switching to a new stack also means switching to a new
  task.

* x86 does not call the BIOS so the NMI handler does not have to worry
  about any registers having changed.  MCA/INIT can occur while the cpu
  is down in PAL in physical mode, with completely undefined registers.

The user mode registers are stored in the RSE area of the MCA/INIT
stack on entry to the OS and are restored from there on return to SAL,
so user mode registers are preserved across a recoverable MCA/INIT.
Since the OS has no idea what unwind data is available for the user
space stack, MCA/INIT never tries to backtrace user space.  Which means
that the OS does not bother making the user space process look like a
blocked task, i.e. the OS does not copy pt_regs and switch_stack to the
user space stack.  Also the OS has no idea how big the user space RSE
and memory stacks are, which makes it too risky to copy the saved state
to a user mode stack.

How do we get a backtrace on the tasks that were running when MCA/INIT
was delivered?

ia64_mca_modify_original_stack() in mca.c does the work.  It identifies
and verifies the original kernel stack, copies the dirty registers from
the MCA/INIT stack's RSE to the original stack's RSE, copies the
skeleton struct pt_regs and switch_stack to the original stack, fills
in the skeleton structures from the PAL minstate area and updates the
original stack's thread.ksp.  That makes the original stack look like
that of a blocked task, i.e. one that appears to be sleeping.  To get a
backtrace, just start with thread.ksp for the original task and unwind
it like any other blocked task.
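
As an illustration of that last step, here is a minimal sketch (not
kernel code; show_original_task_trace() is an invented name) that
walks such a converted task using the ia64 unwinder interface from
<asm/unwind.h>:

    #include <linux/kernel.h>
    #include <linux/sched.h>
    #include <asm/unwind.h>

    /* Backtrace a task that has been converted to look blocked, by
     * seeding the unwinder from its saved thread.ksp/switch_stack. */
    static void show_original_task_trace(struct task_struct *prev)
    {
            struct unw_frame_info info;
            unsigned long ip;

            unw_init_from_blocked_task(&info, prev);
            do {
                    unw_get_ip(&info, &ip);
                    if (ip == 0)
                            break;
                    printk("  [<%016lx>]\n", ip);
            } while (unw_unwind(&info) >= 0);
    }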

How do we identify the tasks that were running when MCA/INIT was
delivered?

If the previous task has been verified and converted to a blocked
state, then sos->prev_task on the MCA/INIT stack is updated to point to
the previous task.  You can look at that field in dumps or debuggers.
To help distinguish between the handler and the original tasks, some
extra identification is left on the handler task and its stack, as
described below.

The sos data is always in the MCA/INIT handler stack, at a fixed, well
known offset from the base of that stack.
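
For example, code or a dump tool that knows the base address of a
cpu's MCA/INIT handler stack could resolve the original task roughly
as follows.  This is a sketch only: original_task() is an invented
name, the offset is passed in rather than hard coded, and the sos type
is assumed to be struct ia64_sal_os_state from <asm/mca.h>.

    #include <linux/sched.h>
    #include <asm/mca.h>

    /* prev_task is only meaningful once the original stack has been
     * verified and converted to a blocked state, as described above. */
    static struct task_struct *original_task(unsigned long handler_stack_base,
                                             unsigned long sos_offset)
    {
            struct ia64_sal_os_state *sos;

            sos = (struct ia64_sal_os_state *)(handler_stack_base + sos_offset);
            return sos->prev_task;
    }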

Also the comm field of the MCA/INIT task is modified to include the pid
of the original task, for humans to use.  For example, a comm field of
'MCA 12159' means that pid 12159 was running when the MCA was
delivered.
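
A minimal sketch of that naming convention (illustrative only;
tag_handler_comm() and previous_current are names used here for the
helper and for the recovered original task, not necessarily what the
kernel uses):

    #include <linux/kernel.h>
    #include <linux/sched.h>

    /* Give the handler "task" a comm like "MCA 12159", where 12159 is
     * the pid of the task that was interrupted by the MCA. */
    static void tag_handler_comm(struct task_struct *handler,
                                 struct task_struct *previous_current)
    {
            snprintf(handler->comm, sizeof(handler->comm), "MCA %d",
                     previous_current->pid);
    }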