Scaling in the Linux Networking Stack


Introduction
============

This document describes a set of complementary techniques in the Linux
networking stack to increase parallelism and improve performance for
multi-processor systems.

The following technologies are described:

  RSS: Receive Side Scaling
  RPS: Receive Packet Steering
  RFS: Receive Flow Steering
  Accelerated Receive Flow Steering
  XPS: Transmit Packet Steering
RSS: Receive Side Scaling
=========================

Contemporary NICs support multiple receive and transmit descriptor queues
(multi-queue). On reception, a NIC can send different packets to different
queues to distribute processing among CPUs. The NIC distributes packets by
applying a filter to each packet that assigns it to one of a small number
of logical flows. Packets for each flow are steered to a separate receive
queue, which in turn can be processed by a separate CPU. This mechanism is
generally known as "Receive-side Scaling" (RSS).
The filter used in RSS is typically a hash function over the network
and/or transport layer headers: for example, a 4-tuple hash over the
IP addresses and TCP ports of a packet. The most common hardware
implementation of RSS uses a 128-entry indirection table where each entry
stores a queue number. The receive queue for a packet is determined
by masking out the low-order seven bits of the computed hash for the
packet (usually a Toeplitz hash), taking this number as a key into the
indirection table and reading the corresponding value.
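As a concrete illustration, here is a minimal user-space sketch of that
lookup in C. The Toeplitz hash below follows the standard definition
(for every set input bit, XOR in the 32-bit window of the secret key
starting at that bit position); the key contents and indirection table
values are illustrative, not taken from any particular NIC.

    #include <stdint.h>
    #include <stddef.h>

    /*
     * Standard Toeplitz hash. The key must be at least len + 4 bytes
     * long (e.g. a 40-byte RSS key for a 12-byte IPv4 4-tuple).
     */
    static uint32_t toeplitz_hash(const uint8_t *key, const uint8_t *in,
                                  size_t len)
    {
        uint32_t hash = 0;
        /* window initially holds key bits [0, 32) */
        uint32_t window = ((uint32_t)key[0] << 24) |
                          ((uint32_t)key[1] << 16) |
                          ((uint32_t)key[2] << 8) |
                           (uint32_t)key[3];

        for (size_t i = 0; i < len; i++) {
            for (int b = 7; b >= 0; b--) {
                if (in[i] & (1u << b))
                    hash ^= window;
                /* slide the window one key bit to the right */
                window = (window << 1) | ((key[i + 4] >> b) & 1);
            }
        }
        return hash;
    }

    /* The low-order seven bits index the 128-entry indirection table. */
    static uint8_t rss_queue(uint32_t hash, const uint8_t indir[128])
    {
        return indir[hash & 0x7f];
    }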
==== RSS Configuration

The driver for a multi-queue capable NIC typically provides a kernel
module parameter specifying the number of hardware queues to configure.
A typical RSS configuration would be to have one receive queue for each
CPU if the device supports enough queues, or otherwise at least
one for each memory domain, where a memory domain is a set of CPUs that
share a particular memory level (L1, L2, NUMA node, etc.).
The indirection table of an RSS device, which resolves a queue by masked
hash, is usually programmed by the driver at initialization. The default
is to distribute the queues evenly in the table, but the table can be
retrieved and modified at runtime using ethtool commands
(--show-rxfh-indir and --set-rxfh-indir), for instance to give different
queues different relative weights.
==== RSS IRQ Configuration

Each receive queue has a separate IRQ associated with it. The NIC triggers
this to notify a CPU when new packets arrive on the given queue. The
signaling path for PCIe devices uses message signaled interrupts (MSI-X),
which can route each interrupt to a particular CPU. The active mapping
of queues to IRQs can be determined from /proc/interrupts. By default,
an IRQ may be handled on any CPU. Because a non-negligible part of packet
processing takes place in receive interrupt handling, it is advantageous
to spread receive interrupts between CPUs. Note that some systems
will be running irqbalance, a daemon that dynamically optimizes IRQ
assignments and as a result may override any manual settings.
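IRQ affinity can be adjusted manually by writing a CPU bitmap to
/proc/irq/<n>/smp_affinity. The sketch below shows the idea in C; the
IRQ number 42 and the mask are illustrative only, and irqbalance may
later overwrite the setting.

    #include <stdio.h>

    int main(void)
    {
        /* Pin IRQ 42 (illustrative) to CPUs 0 and 1: bitmap 0x3. */
        FILE *f = fopen("/proc/irq/42/smp_affinity", "w");

        if (!f) {
            perror("smp_affinity");
            return 1;
        }
        fprintf(f, "3\n");
        return fclose(f) ? 1 : 0;
    }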
== Suggested Configuration

RSS should be enabled when latency is a concern or whenever receive
interrupt processing forms a bottleneck. Spreading load between CPUs
decreases queue length. The most efficient high-rate configuration
is likely the one with the smallest number of receive queues where no
receive queue overflows due to a saturated CPU, because in default
mode with interrupt coalescing enabled, the total number of interrupts
(and thus work) grows with each additional queue.

Per-CPU load can be observed using the mpstat utility, but note that on
processors with hyperthreading (HT), each hyperthread is represented as
a separate CPU. For interrupt handling, HT has shown no benefit in
initial tests, so limit the number of queues to the number of CPU cores
in the system.
RPS: Receive Packet Steering
============================

Receive Packet Steering (RPS) is logically a software implementation of
RSS. Being in software, it is necessarily called later in the datapath:
whereas RSS selects the queue, and hence the CPU that runs the hardware
interrupt handler, RPS selects the CPU that performs protocol processing
above the interrupt handler.

RPS is called during the bottom half of the receive interrupt handler,
when a driver sends a packet up the network stack with netif_rx() or
netif_receive_skb(). These call the get_rps_cpu() function, which
selects the queue that should process a packet.
The first step in determining the target CPU for RPS is to calculate a
flow hash over the packet's addresses or ports (a 2-tuple or 4-tuple hash
depending on the protocol). This serves as a consistent hash of the
associated flow of the packet. The hash is either provided by capable
hardware (typically the same hash used for RSS) or is computed in the
stack. It is saved in skb->rx_hash and can be used elsewhere in the
stack as a hash of the packet's flow.
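Each receive queue then has an associated set of CPUs to which RPS may
enqueue packets, and the target is chosen by reducing the flow hash onto
that set. The following is a minimal sketch of this selection; the
structure and names are hypothetical simplifications, not the kernel's
actual get_rps_cpu() code.

    #include <stdint.h>

    /* Hypothetical per-queue RPS state: the eligible CPU ids. */
    struct rps_map {
        unsigned int len;       /* number of CPUs in the map */
        uint16_t cpus[64];      /* CPU ids allowed for this rx queue */
    };

    /* Pick the backlog CPU for a packet from its flow hash. */
    static int rps_target_cpu(const struct rps_map *map, uint32_t flow_hash)
    {
        if (map->len == 0)
            return -1;          /* RPS disabled: process locally */
        /* consistent mapping: same flow hash -> same CPU */
        return map->cpus[flow_hash % map->len];
    }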
==== RPS Configuration

RPS requires a kernel compiled with the CONFIG_RPS kconfig symbol (on
by default for SMP). Even when compiled in, RPS remains disabled until
explicitly configured. The list of CPUs to which RPS may forward traffic
can be configured for each receive queue using a sysfs file entry:

 /sys/class/net/<dev>/queues/rx-<n>/rps_cpus

This file implements a bitmap of CPUs. RPS is disabled when it is zero
(the default), in which case packets are processed on the interrupting
CPU.
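For example, a small C program (equivalent to echoing a mask into the
file) could enable RPS on queue rx-0 of eth0 for CPUs 0-3; the device
name and mask here are illustrative.

    #include <stdio.h>

    int main(void)
    {
        /* Bitmap "f" = CPUs 0-3 may process packets from this queue. */
        FILE *f = fopen("/sys/class/net/eth0/queues/rx-0/rps_cpus", "w");

        if (!f) {
            perror("rps_cpus");
            return 1;
        }
        fprintf(f, "f\n");
        return fclose(f) ? 1 : 0;
    }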
== Suggested Configuration

For a single queue device, a typical RPS configuration would be to set
rps_cpus to the CPUs in the same memory domain as the interrupting CPU.
At high interrupt rates, it might be wise to exclude the interrupting
CPU from the map, since it already performs much work.

For a multi-queue system, if RSS is configured so that a hardware
receive queue is mapped to each CPU, then RPS is probably redundant
and unnecessary. If there are fewer hardware queues than CPUs, then
RPS might be beneficial if the rps_cpus for each queue are the CPUs that
share the same memory domain as that queue's interrupting CPU.
==== RPS Flow Limit

RPS scales kernel receive processing across CPUs without introducing
reordering. The trade-off of sending all packets of the same flow to the
same CPU is CPU load imbalance if flows vary in packet rate. In the
extreme case a single flow dominates traffic. Especially on common
server workloads with many concurrent connections, such behavior
indicates a problem such as a misconfiguration or a spoofed-source
denial of service attack.

Flow Limit is an optional RPS feature that prioritizes small flows
during CPU contention by dropping packets from large flows slightly
ahead of those from small flows. It is active only when an RPS or RFS
destination CPU approaches saturation. Once a CPU's input packet
queue exceeds half the maximum queue length (as set by sysctl
net.core.netdev_max_backlog), the kernel starts a per-flow packet
count over the last 256 packets. If a flow exceeds a set ratio (by
default, half) of these packets when a new packet arrives, then the
new packet is dropped. Packets from other flows are still only dropped
once the input packet queue reaches netdev_max_backlog.
Per-flow rate is calculated by hashing each packet into a hashtable
bucket and incrementing a per-bucket counter. The hash function is
the same one that selects a CPU in RPS, but as the number of buckets can
be much larger than the number of CPUs, flow limit identifies large
flows at a finer granularity and with fewer false positives. The default
table has 4096 buckets; this can be changed through the sysctl
net.core.flow_limit_table_len. The value is only consulted when a new
table is allocated, so modifying it does not update active tables.
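The counting scheme can be sketched as a sliding window over the last
256 enqueued packets. The structure below is an illustration of the
idea, not the kernel's actual per-CPU flow limit layout.

    #include <stdbool.h>
    #include <stdint.h>

    #define HISTORY_LEN     256     /* window: last 256 packets */
    #define NUM_BUCKETS     4096    /* default flow_limit_table_len */

    struct flow_limit {
        unsigned int head;              /* ring index into history[] */
        uint16_t history[HISTORY_LEN];  /* bucket of each recent packet */
        uint16_t count[NUM_BUCKETS];    /* packets per bucket in window */
    };

    /* Called per packet once the backlog passes the threshold; returns
     * true if the packet belongs to a dominating flow and should be
     * dropped. */
    static bool flow_limit_drop(struct flow_limit *fl, uint32_t flow_hash)
    {
        unsigned int bucket = flow_hash % NUM_BUCKETS;
        unsigned int oldest = fl->history[fl->head];

        /* Slide the window: evict the oldest packet, record the new. */
        if (fl->count[oldest])
            fl->count[oldest]--;
        fl->history[fl->head] = bucket;
        fl->head = (fl->head + 1) % HISTORY_LEN;
        fl->count[bucket]++;

        /* Drop if this flow exceeds half of the recent window. */
        return fl->count[bucket] > HISTORY_LEN / 2;
    }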
== Suggested Configuration

Flow limit is useful on systems with many concurrent connections,
where a single connection taking up 50% of a CPU indicates a problem.
In such environments, enable the feature on all CPUs that handle
network rx interrupts.
RFS: Receive Flow Steering
==========================

While RPS steers packets solely based on hash, and thus generally
provides good load distribution, it does not take application locality
into account. Receive Flow Steering (RFS) addresses this by steering
kernel processing of packets to the CPU where the application thread
consuming each flow is running. In RFS, packets are not forwarded
directly by the value of their hash;
instead, the hash is used as an index into a flow lookup table. This
table maps flows to the CPUs where those flows are being processed.
If an entry does not hold a valid CPU, then packets mapped to that entry
are steered using plain RPS. Multiple table entries may point to the
same CPU: with many flows and few CPUs, it is very likely that
a single application thread handles flows with many different flow hashes.
rps_sock_flow_table is a global flow table that contains the *desired* CPU
for each flow: the CPU that is currently processing the flow in userspace.
Each table value is a CPU index that is updated during calls to recvmsg
and sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage()
and tcp_splice_read()).
When the scheduler moves a thread to a new CPU while it has outstanding
receive packets on the old CPU, packets may arrive out of order. To
avoid this, RFS uses a second flow table to track outstanding packets
for each flow: rps_dev_flow_table is a table specific to each hardware
receive queue of each device. Each table value stores a CPU index and a
counter. The CPU index here represents the *current* CPU onto which
packets of the flow are enqueued for kernel processing. Ideally, kernel
and userspace processing occur on the same CPU, so the CPU index in both
tables is identical. This may not hold if the scheduler has
recently migrated a userspace thread while the kernel still has packets
enqueued for processing on the old CPU.
The counter in each rps_dev_flow_table value records the length of the
current CPU's backlog when a packet of this flow was last enqueued. Each
backlog queue has a head counter that is incremented on dequeue; a tail
counter is then the head counter plus the queue length. When selecting
the CPU in get_rps_cpu(), the desired CPU from rps_sock_flow_table is
compared with the current CPU from rps_dev_flow_table. If they differ,
the flow moves to the desired CPU only if the current CPU's head counter
has advanced past the recorded tail counter (meaning no packets of the
flow are still queued there), or if the current CPU is unset or offline.
After this check, the packet is sent to the (possibly updated) current
CPU. These rules aim to ensure that a flow only moves to a new CPU when
there are no packets outstanding on the old one, since the outstanding
packets could otherwise arrive later than those about to be processed
on the new CPU.
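As a sketch, the migration test can be written as follows. The field
and function names are hypothetical simplifications of the kernel's
get_rps_cpu() logic; the stubs stand in for the real per-CPU state.

    #include <stdbool.h>
    #include <stdint.h>

    #define CPU_UNSET ((uint16_t)~0)

    struct dev_flow {
        uint16_t cpu;           /* current CPU for this flow */
        uint32_t last_qtail;    /* backlog tail when last enqueued */
    };

    /* Stubs: a real implementation consults the per-CPU backlog head
     * counter and the online-CPU mask. */
    static uint32_t backlog_head(uint16_t cpu) { (void)cpu; return 0; }
    static bool cpu_online(uint16_t cpu) { (void)cpu; return true; }

    /* Decide which CPU should process the next packet of this flow. */
    static uint16_t rfs_select_cpu(struct dev_flow *df, uint16_t desired)
    {
        if (df->cpu != desired &&
            (df->cpu == CPU_UNSET ||
             !cpu_online(df->cpu) ||
             /* everything enqueued on the old CPU has been dequeued */
             (int32_t)(backlog_head(df->cpu) - df->last_qtail) >= 0))
            df->cpu = desired;  /* safe to migrate the flow */

        return df->cpu;
    }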
==== RFS Configuration

The number of entries in the global flow table is set through
/proc/sys/net/core/rps_sock_flow_entries, and the number of entries in
each per-queue flow table through
/sys/class/net/<dev>/queues/rx-<n>/rps_flow_cnt.

== Suggested Configuration

Both of these need to be set before RFS is enabled for a receive queue;
values for both are rounded up to the nearest power of two. The
suggested flow count depends on the expected number of active
connections at any given time, which may be significantly less than the
number of open connections. We have found that a value of 32768 for
rps_sock_flow_entries works fairly well on a moderately loaded server.
For a single queue device, the rps_flow_cnt value for the single queue
would normally be configured to the same value as rps_sock_flow_entries.
For a multi-queue device, the rps_flow_cnt for each queue might be
configured as rps_sock_flow_entries / N, where N is the number of
queues: for instance, with rps_sock_flow_entries set to 32768 and 16
configured receive queues, rps_flow_cnt for each queue would be 2048.
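Putting this together, a small C helper (equivalent to echoing the
values by hand) might configure a 16-queue device as above; the device
name and queue count are illustrative.

    #include <stdio.h>

    /* Write a single integer value to a procfs/sysfs file. */
    static int write_val(const char *path, unsigned int val)
    {
        FILE *f = fopen(path, "w");

        if (!f)
            return -1;
        fprintf(f, "%u\n", val);
        return fclose(f);
    }

    int main(void)
    {
        char path[128];
        unsigned int nqueues = 16, entries = 32768;

        write_val("/proc/sys/net/core/rps_sock_flow_entries", entries);
        for (unsigned int q = 0; q < nqueues; q++) {
            snprintf(path, sizeof(path),
                     "/sys/class/net/eth0/queues/rx-%u/rps_flow_cnt", q);
            write_val(path, entries / nqueues);  /* 2048 per queue */
        }
        return 0;
    }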
Accelerated RFS
===============

Accelerated RFS is to RFS what RSS is to RPS: a hardware-accelerated load
balancing mechanism that uses soft state to steer flows based on where
the application thread consuming each flow is running. It should perform
better than RFS, since packets are sent by the hardware
directly to a CPU local to the thread consuming the data. The target CPU
will either be the same CPU where the application runs, or at least a CPU
that is close to the application thread's CPU in the cache hierarchy.
To enable accelerated RFS, the networking stack calls the
ndo_rx_flow_steer driver function to communicate the desired hardware
queue for packets matching a particular flow. The network stack
automatically calls this function every time a flow entry in
rps_dev_flow_table is updated. The driver in turn uses a device-specific
method to program the NIC to steer the packets.
The hardware queue for a flow is derived from the CPU recorded in
rps_dev_flow_table. The stack consults a CPU-to-hardware-queue map which
is maintained by the NIC driver. This is an auto-generated reverse map of
the IRQ affinities, which drivers can populate using the cpu_rmap ("CPU
affinity reverse map") kernel library. For each CPU, the corresponding
queue in the map is the one whose processing CPU is closest in cache
locality.
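For reference, the driver hook has the following shape in struct
net_device_ops, as declared in include/linux/netdevice.h when
CONFIG_RFS_ACCEL is enabled; check your kernel version for the exact
declaration.

    struct net_device_ops {
        /* ... */
        int (*ndo_rx_flow_steer)(struct net_device *dev,
                                 const struct sk_buff *skb,
                                 u16 rxq_index,
                                 u32 flow_id);
        /* ... */
    };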
XPS: Transmit Packet Steering
=============================

Transmit Packet Steering is a mechanism for intelligently selecting
which transmit queue to use when transmitting a packet on a multi-queue
device. To accomplish this, a mapping from CPU to hardware queue(s) is
recorded. The goal of this mapping is usually to assign queues
exclusively to a subset of CPUs, where the transmit completions for
these queues are processed on a CPU within this set. This choice
reduces contention on the device queue lock, since fewer CPUs contend
for the same queue, and it reduces the cache miss rate on transmit
completion.
XPS is configured per transmit queue by setting a bitmap of CPUs that
may use that queue to transmit; the reverse mapping, from CPUs to
transmit queues, is computed and maintained for each network device.
When transmitting the first packet in a flow, the function
get_xps_queue() is called to select a queue. This function uses the ID
of the running CPU as a key into the CPU-to-queue lookup table. If the
ID matches a single queue, that queue is used for transmission. If
multiple queues match, one is selected by using the flow hash to compute
an index into the set.
The queue chosen for transmitting a particular flow is saved in the
corresponding socket structure for the flow (e.g. a TCP connection) and
used for all subsequent packets of the flow. This prevents out-of-order
(ooo) packets and amortizes the cost of the queue lookup. To avoid
ooo packets, the queue for a flow can subsequently only be changed if
skb->ooo_okay is set for a packet in the flow. This flag indicates that
there are no outstanding packets in the flow, so the transmit queue can
change without the risk of generating out-of-order packets. The
transport layer is responsible for setting ooo_okay appropriately: TCP,
for instance, sets the flag when all data for a connection has been
acknowledged.
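A minimal sketch of this selection policy follows; the types and names
are hypothetical, not the kernel's actual get_xps_queue() code.

    #include <stdint.h>

    /* Hypothetical per-CPU list of tx queues eligible under XPS. */
    struct xps_map {
        unsigned int len;       /* number of queues mapped to this CPU */
        uint16_t queues[8];     /* eligible tx queue indices */
    };

    static int xps_select_queue(const struct xps_map *maps,
                                unsigned int cpu, uint32_t flow_hash)
    {
        const struct xps_map *m = &maps[cpu];

        if (m->len == 0)
            return -1;              /* no XPS mapping: fall back */
        if (m->len == 1)
            return m->queues[0];    /* single match: use it */
        /* Multiple queues: the flow hash keeps a flow on one queue. */
        return m->queues[flow_hash % m->len];
    }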
==== XPS Configuration

XPS is only available if the kconfig symbol CONFIG_XPS is enabled (on by
default for SMP), and it remains disabled until explicitly
configured. To enable XPS, the bitmap of CPUs that may use a transmit
queue is set using the sysfs file entry:

/sys/class/net/<dev>/queues/tx-<n>/xps_cpus
== Suggested Configuration

For a network device with a single transmit queue, XPS configuration
has no effect, since there is no choice in this case. In a multi-queue
system, XPS is preferably configured so that each CPU maps onto one
queue. If there are as many queues as CPUs, each queue can also map onto
one CPU, giving exclusive pairings with no contention. If there are
fewer queues than CPUs, then the
best CPUs to share a given queue are probably those that share the cache
with the CPU that processes transmit completions for that queue
(transmit interrupts).
Per TX Queue Rate Limitation
============================

These are rate-limitation mechanisms implemented by hardware. Currently
a max-rate attribute is supported, by setting a Mbps value to:

/sys/class/net/<dev>/queues/tx-<n>/tx_maxrate

A value of zero means the limit is disabled, and this is the default.
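For example, a sketch that caps tx-0 of eth0 at 1000 Mbps; the device
name and value are illustrative.

    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/sys/class/net/eth0/queues/tx-0/tx_maxrate", "w");

        if (!f) {
            perror("tx_maxrate");
            return 1;
        }
        fprintf(f, "1000\n");   /* limit this queue to 1000 Mbps */
        return fclose(f) ? 1 : 0;
    }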