                  HOWTO for the linux packet generator
                  ------------------------------------

Enable CONFIG_NET_PKTGEN to compile and build pktgen either in-kernel
or as a module. A module is preferred; modprobe pktgen if needed. Once
running, pktgen creates a thread for each CPU with affinity to that CPU.
Monitoring and controlling is done via /proc. It is easiest to select a
suitable sample script and configure that.

On a dual CPU:

ps aux | grep pkt
root       129  0.3  0.0     0    0 ?        SW    2003 523:20 [pktgen/0]
root       130  0.3  0.0     0    0 ?        SW    2003 509:50 [pktgen/1]


For monitoring and control pktgen creates:
        /proc/net/pktgen/pgctrl
        /proc/net/pktgen/kpktgend_X
        /proc/net/pktgen/ethX


Tuning NIC for max performance
==============================

The default NIC settings are (likely) not tuned for pktgen's artificial
overload type of benchmarking, as this could hurt the normal use-case.

Specifically increasing the TX ring buffer in the NIC:
 # ethtool -G ethX tx 1024

A larger TX ring can improve pktgen's performance, while it can hurt
in the general case, 1) because the TX ring buffer might get larger
than the CPU's L1/L2 cache, 2) because it allows more queueing in the
NIC HW layer (which is bad for bufferbloat).

One should hesitate to conclude that packets/descriptors in the HW
TX ring cause delay. Drivers usually delay cleaning up the
ring-buffers for various performance reasons, and packets stalling
the TX ring might just be waiting for cleanup.

This cleanup issue is specifically the case for the driver ixgbe
(Intel 82599 chip). This driver (ixgbe) combines TX+RX ring cleanups,
and the cleanup interval is affected by the ethtool --coalesce setting
of parameter "rx-usecs".

For ixgbe use e.g. "30" resulting in approx 33K interrupts/sec (1/30*10^6):
 # ethtool -C ethX rx-usecs 30


Kernel threads
==============
Pktgen creates a thread for each CPU with affinity to that CPU.
Each thread is controlled through the procfile /proc/net/pktgen/kpktgend_X.

Example: /proc/net/pktgen/kpktgend_0

 Running:
 Stopped: eth4@0
 Result: OK: add_device=eth4@0

Most important are the devices assigned to the thread.

The two basic thread commands are:
 * add_device DEVICE@NAME -- adds a single device
 * rem_device_all         -- remove all associated devices

When adding a device to a thread, a corresponding procfile is created
which is used for configuring this device. Thus, device names need to
be unique.

To support adding the same device to multiple threads, which is useful
with multi queue NICs, the device naming scheme is extended with "@":
 device@something

The part after "@" can be anything, but it is customary to use the thread
number.

Viewing devices
===============

The Params section holds configured information. The Current section
holds running statistics. The Result is printed after a run or after
interruption. Example:

/proc/net/pktgen/eth4@0

 Params: count 100000  min_pkt_size: 60  max_pkt_size: 60
     frags: 0  delay: 0  clone_skb: 64  ifname: eth4@0
     flows: 0 flowlen: 0
     queue_map_min: 0  queue_map_max: 0
     dst_min: 192.168.81.2  dst_max:
     src_min:   src_max:
     src_mac: 90:e2:ba:0a:56:b4 dst_mac: 00:1b:21:3c:9d:f8
     udp_src_min: 9  udp_src_max: 109  udp_dst_min: 9  udp_dst_max: 9
     src_mac_count: 0  dst_mac_count: 0
     Flags: UDPSRC_RND  NO_TIMESTAMP  QUEUE_MAP_CPU
 Current:
     pkts-sofar: 100000  errors: 0
     started: 623913381008us  stopped: 623913396439us idle: 25us
     seq_num: 100001  cur_dst_mac_offset: 0  cur_src_mac_offset: 0
     cur_saddr: 192.168.8.3  cur_daddr: 192.168.81.2
     cur_udp_dst: 9  cur_udp_src: 42
     cur_queue_map: 0
     flows: 0
 Result: OK: 15430(c15405+d25) usec, 100000 (60byte,0frags)
  6480562pps 3110Mb/sec (3110669760bps) errors: 0


Configuring devices
===================
This is done via the /proc interface, and most easily done via
pgset as defined in the sample scripts.

Examples:

 pgset "clone_skb 1"     sets the number of copies of the same packet
 pgset "clone_skb 0"     use single SKB for all transmits
 pgset "burst 8"         uses xmit_more API to queue 8 copies of the same
                         packet and update HW tx queue tail pointer once.
                         "burst 1" is the default
 pgset "pkt_size 9014"   sets packet size to 9014
 pgset "frags 5"         packet will consist of 5 fragments
 pgset "count 200000"    sets number of packets to send, set to zero
                         for continuous sends until explicitly stopped.

 pgset "delay 5000"      adds delay to hard_start_xmit(), in nanoseconds

 pgset "dst 10.0.0.1"    sets IP destination address
                         (BEWARE! This generator is very aggressive!)

 pgset "dst_min 10.0.0.1"            Same as dst
 pgset "dst_max 10.0.0.254"          Set the maximum destination IP.
 pgset "src_min 10.0.0.1"            Set the minimum (or only) source IP.
 pgset "src_max 10.0.0.254"          Set the maximum source IP.
 pgset "dst6 fec0::1"                IPV6 destination address
 pgset "src6 fec0::2"                IPV6 source address
 pgset "dst_mac 00:00:00:00:00:00"   sets MAC destination address
 pgset "src_mac 00:00:00:00:00:00"   sets MAC source address

 pgset "queue_map_min 0" Sets the min value of tx queue interval
 pgset "queue_map_max 7" Sets the max value of tx queue interval, for multiqueue devices
                         To select queue 1 of a given device,
                         use queue_map_min=1 and queue_map_max=1

 pgset "src_mac_count 1" Sets the number of MACs we'll range through.
                         The 'minimum' MAC is what you set with src_mac.

 pgset "dst_mac_count 1" Sets the number of MACs we'll range through.
                         The 'minimum' MAC is what you set with dst_mac.

 pgset "flag [name]"     Set a flag to determine behaviour.
                         Current flags are:
                         IPSRC_RND # IP source is random (between min/max)
                         IPDST_RND # IP destination is random
                         UDPSRC_RND, UDPDST_RND,
                         MACSRC_RND, MACDST_RND
                         TXSIZE_RND, IPV6,
                         MPLS_RND, VID_RND, SVID_RND
                         FLOW_SEQ,
                         QUEUE_MAP_RND # queue map random
                         QUEUE_MAP_CPU # queue map mirrors smp_processor_id()
                         UDPCSUM,
                         IPSEC # IPsec encapsulation (needs CONFIG_XFRM)
                         NODE_ALLOC # node specific memory allocation
                         NO_TIMESTAMP # disable timestamping

 pgset spi SPI_VALUE     Set specific SA used to transform packet.

 pgset "udp_src_min 9"   set UDP source port min. If < udp_src_max, then
                         cycle through the port range.

 pgset "udp_src_max 9"   set UDP source port max.
 pgset "udp_dst_min 9"   set UDP destination port min. If < udp_dst_max, then
                         cycle through the port range.
 pgset "udp_dst_max 9"   set UDP destination port max.

 pgset "mpls 0001000a,0002000a,0000000a" set MPLS labels (in this example
                                         outer label=16,middle label=32,
                                         inner label=0 (IPv4 NULL)) Note that
                                         there must be no spaces between the
                                         arguments. Leading zeros are required.
                                         Do not set the bottom of stack bit,
                                         that's done automatically. If you do
                                         set the bottom of stack bit, that
                                         indicates that you want to randomly
                                         generate that address and the flag
                                         MPLS_RND will be turned on. You
                                         can have any mix of random and fixed
                                         labels in the label stack.

 pgset "mpls 0"          turn off mpls (or any invalid argument works too!)

 pgset "vlan_id 77"       set VLAN ID 0-4095
 pgset "vlan_p 3"         set priority bit 0-7 (default 0)
 pgset "vlan_cfi 0"       set canonical format identifier 0-1 (default 0)

 pgset "svlan_id 22"      set SVLAN ID 0-4095
 pgset "svlan_p 3"        set priority bit 0-7 (default 0)
 pgset "svlan_cfi 0"      set canonical format identifier 0-1 (default 0)

 pgset "vlan_id 9999"     > 4095 remove vlan and svlan tags
 pgset "svlan_id 9999"    > 4095 remove svlan tag


 pgset "tos XX"           set former IPv4 TOS field (e.g. "tos 28" for AF11 no ECN, default 00)
 pgset "traffic_class XX" set former IPv6 TRAFFIC CLASS (e.g. "traffic_class B8" for EF no ECN, default 00)

 pgset stop               aborts injection. Also, ^C aborts generator.

 pgset "rate 300M"        set rate to 300 Mb/s
 pgset "ratep 1000000"    set rate to 1Mpps

 pgset "xmit_mode netif_receive"  RX inject into stack netif_receive_skb()
                                  Works with "burst" but not with "clone_skb".
                                  Default xmit_mode is "start_xmit".

Sample scripts
==============

A collection of tutorial scripts and helpers for pktgen is in the
samples/pktgen directory. The helper parameters.sh file supports easy
and consistent parameter parsing across the sample scripts.

Usage example and help:
 ./pktgen_sample01_simple.sh -i eth4 -m 00:1B:21:3C:9D:F8 -d 192.168.8.2

Usage: ./pktgen_sample01_simple.sh [-vx] -i ethX
  -i : ($DEV)       output interface/device (required)
  -s : ($PKT_SIZE)  packet size
  -d : ($DEST_IP)   destination IP
  -m : ($DST_MAC)   destination MAC-addr
  -t : ($THREADS)   threads to start
  -c : ($SKB_CLONE) SKB clones send before alloc new SKB
  -b : ($BURST)     HW level bursting of SKBs
  -v : ($VERBOSE)   verbose
  -x : ($DEBUG)     debug

The global variables being set are also listed. E.g. the required
interface/device parameter "-i" sets variable $DEV. Copy the
pktgen_sampleXX scripts and modify them to fit your own needs.
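
As a rough illustration of what such a script does under the hood, the
sketch below writes command strings to the pktgen procfiles in the order
described in this document: clear the thread, add a device, configure
the device, then start via pgctrl. The names PROC_DIR, pg_write and
pg_simple_run are made up for this example and are not taken from the
sample scripts. It assumes the pktgen module is loaded and that the
caller is root; treat it as a starting point rather than a reference
implementation.

```shell
# Sketch only: PROC_DIR, pg_write and pg_simple_run are hypothetical names.
# PROC_DIR can be pointed at a scratch directory for a dry run.
PROC_DIR="${PROC_DIR:-/proc/net/pktgen}"

# Write one command string to a pktgen procfile
# (pgctrl, kpktgend_X, or a device file such as eth4@0).
pg_write() {
    pg_file="$1"
    pg_cmd="$2"
    echo "$pg_cmd" > "$PROC_DIR/$pg_file"
}

# Configure a single device on a single thread and start the run.
# Arguments: device, thread number, destination IP, destination MAC.
pg_simple_run() {
    run_dev="$1"
    run_thread="$2"
    run_dst_ip="$3"
    run_dst_mac="$4"
    run_name="${run_dev}@${run_thread}"

    # Thread commands: drop old devices, then attach ours as DEVICE@NAME.
    pg_write "kpktgend_${run_thread}" "rem_device_all"
    pg_write "kpktgend_${run_thread}" "add_device ${run_name}"

    # Device commands, all taken from the list in this document.
    pg_write "$run_name" "count 100000"
    pg_write "$run_name" "pkt_size 60"
    pg_write "$run_name" "dst ${run_dst_ip}"
    pg_write "$run_name" "dst_mac ${run_dst_mac}"

    # Pgcontrol command: start transmitting on all configured threads.
    pg_write "pgctrl" "start"
}
```

A call matching the usage example above would be:
 pg_simple_run eth4 0 192.168.8.2 00:1B:21:3C:9D:F8
after which the result can be read from /proc/net/pktgen/eth4@0.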

The old scripts:

pktgen.conf-1-2                  # 1 CPU 2 dev
pktgen.conf-1-1-rdos             # 1 CPU 1 dev w. route DoS
pktgen.conf-1-1-ip6              # 1 CPU 1 dev ipv6
pktgen.conf-1-1-ip6-rdos         # 1 CPU 1 dev ipv6 w. route DoS
pktgen.conf-1-1-flows            # 1 CPU 1 dev multiple flows.


Interrupt affinity
==================
Note that when adding devices to a specific CPU it is a good idea to
also assign /proc/irq/XX/smp_affinity so that the TX interrupts are bound
to the same CPU. This reduces cache bouncing when freeing skbs.

In addition, use the device flag QUEUE_MAP_CPU, which maps the SKB's TX
queue to the running thread's CPU (directly from smp_processor_id()).

Enable IPsec
============
Default IPsec transformation with ESP encapsulation plus transport mode
can be enabled by simply setting:

pgset "flag IPSEC"
pgset "flows 1"

To avoid breaking existing testbed scripts that use AH type and tunnel mode,
you can use "pgset spi SPI_VALUE" to specify which transformation mode
to employ.
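
For the interrupt affinity advice given earlier, the value written to
/proc/irq/XX/smp_affinity is a hexadecimal CPU bitmask. The helper below
is a small sketch of how the mask for a single CPU could be computed;
the name cpu_to_affinity_mask is invented for this example, and the
helper is deliberately limited to CPUs 0-31 so that the result fits in
one 32-bit mask word.

```shell
# Sketch only: cpu_to_affinity_mask is a hypothetical helper name.
# Prints the hexadecimal bitmask that selects exactly one CPU (0-31).
cpu_to_affinity_mask() {
    mask_cpu="$1"

    # Reject empty or non-numeric input before doing arithmetic on it.
    case "$mask_cpu" in
        ''|*[!0-9]*)
            echo "cpu_to_affinity_mask: CPU must be a number" >&2
            return 1
            ;;
    esac

    # Keep the result within a single 32-bit mask word.
    if [ "$mask_cpu" -gt 31 ]; then
        echo "cpu_to_affinity_mask: CPU must be in 0-31" >&2
        return 1
    fi

    # Bit N set means CPU N may service the interrupt.
    printf '%x\n' $((1 << $mask_cpu))
}
```

For example, to bind an interrupt to CPU 2, where XX stands for the IRQ
number of the relevant NIC queue:
 # echo "$(cpu_to_affinity_mask 2)" > /proc/irq/XX/smp_affinity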


Current commands and configuration options
==========================================

** Pgcontrol commands:

start
stop
reset

** Thread commands:

add_device
rem_device_all


** Device commands:

count
clone_skb
burst
debug

frags
delay

src_mac_count
dst_mac_count

pkt_size
min_pkt_size
max_pkt_size

queue_map_min
queue_map_max
skb_priority

tos           (ipv4)
traffic_class (ipv6)

mpls

udp_src_min
udp_src_max

udp_dst_min
udp_dst_max

node

flag
  IPSRC_RND
  IPDST_RND
  UDPSRC_RND
  UDPDST_RND
  MACSRC_RND
  MACDST_RND
  TXSIZE_RND
  IPV6
  MPLS_RND
  VID_RND
  SVID_RND
  FLOW_SEQ
  QUEUE_MAP_RND
  QUEUE_MAP_CPU
  UDPCSUM
  IPSEC
  NODE_ALLOC
  NO_TIMESTAMP

spi (ipsec)

dst_min
dst_max

src_min
src_max

dst_mac
src_mac

clear_counters

src6
dst6
dst6_max
dst6_min

flows
flowlen

rate
ratep

xmit_mode <start_xmit|netif_receive>

vlan_cfi
vlan_id
vlan_p

svlan_cfi
svlan_id
svlan_p


References:
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/

Paper from Linux-Kongress in Erlangen 2004.
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/pktgen_paper.pdf

Thanks to:
Grant Grundler for testing on IA-64 and parisc, Harald Welte, Lennert Buytenhek
Stephen Hemminger, Andi Kleen, Dave Miller and many others.


Good luck with the linux net-development.