Virtual Routing and Forwarding (VRF)
====================================
The VRF device combined with ip rules provides the ability to create virtual
routing and forwarding domains (aka VRFs, VRF-lite to be specific) in the
Linux network stack. One use case is the multi-tenancy problem where each
tenant has their own unique routing tables and, at the very least, needs
different default gateways.

Processes can be "VRF aware" by binding a socket to the VRF device. Packets
through the socket then use the routing table associated with the VRF
device. An important feature of the VRF device implementation is that it
impacts only Layer 3 and above so L2 tools (e.g., LLDP) are not affected
(i.e., they do not need to be run in each VRF). The design also allows
the use of higher priority ip rules (Policy Based Routing, PBR) to take
precedence over the VRF device rules directing specific traffic as desired.

In addition, VRF devices allow VRFs to be nested within namespaces. For
example, network namespaces provide separation of network interfaces at L1
(Layer 1 separation), VLANs on the interfaces within a namespace provide
L2 separation and then VRF devices provide L3 separation.

Design
------
A VRF device is created with an associated route table. Network interfaces
are then enslaved to a VRF device:

    +-----------------------------+
    |           vrf-blue          |  ===> route table 10
    +-----------------------------+
       |        |            |
    +------+ +------+     +-------------+
    | eth1 | | eth2 | ... |    bond1    |
    +------+ +------+     +-------------+
                            |        |
                         +------+ +------+
                         | eth8 | | eth9 |
                         +------+ +------+

Packets received on an enslaved device are switched to the VRF device
using an rx_handler which gives the impression that packets flow through
the VRF device. Similarly, on egress, routing rules are used to send packets
to the VRF device driver before getting sent out the actual interface.
This allows tcpdump on a VRF device to capture all packets into and out of the
VRF as a whole.[1] Similarly, netfilter [2] and tc rules can be applied
using the VRF device to specify rules that apply to the VRF domain as a whole.

[1] Packets in the forwarded state do not flow through the device, so those
    packets are not seen by tcpdump. Will revisit this limitation in a
    future release.

[2] Iptables on ingress is limited to NF_INET_PRE_ROUTING only with skb->dev
    set to real ingress device and egress is limited to NF_INET_POST_ROUTING.
    Will revisit this limitation in a future release.


Setup
-----
1. VRF device is created with an association to a FIB table.
   e.g., ip link add vrf-blue type vrf table 10
         ip link set dev vrf-blue up

2. Rules are added that send lookups to the associated FIB table when the
   iif or oif is the VRF device. e.g.,
       ip ru add oif vrf-blue table 10
       ip ru add iif vrf-blue table 10

   Set the default route for the table (and hence default route for the VRF).
   e.g., ip route add table 10 prohibit default

3. Enslave L3 interfaces to a VRF device.
   e.g., ip link set dev eth1 master vrf-blue

   Local and connected routes for enslaved devices are automatically moved to
   the table associated with VRF device. Any additional routes depending on
   the enslaved device will need to be reinserted following the enslavement.

4. Additional VRF routes are added to associated table.
   e.g., ip route add table 10 ...


Applications
------------
Applications that are to work within a VRF need to bind their socket to the
VRF device:

    setsockopt(sd, SOL_SOCKET, SO_BINDTODEVICE, dev, strlen(dev)+1);

or to specify the output device using cmsg and IP_PKTINFO.


Limitations
-----------
The index of the original ingress interface is not available via cmsg. This
will be addressed soon.
95 96################################################################################ 97 98Using iproute2 for VRFs 99======================= 100VRF devices do *not* have to start with 'vrf-'. That is a convention used here 101for emphasis of the device type, similar to use of 'br' in bridge names. 102 1031. Create a VRF 104 105 To instantiate a VRF device and associate it with a table: 106 $ ip link add dev NAME type vrf table ID 107 108 Remember to add the ip rules as well: 109 $ ip ru add oif NAME table 10 110 $ ip ru add iif NAME table 10 111 $ ip -6 ru add oif NAME table 10 112 $ ip -6 ru add iif NAME table 10 113 114 Without the rules route lookups are not directed to the table. 115 116 For example: 117 $ ip link add dev vrf-blue type vrf table 10 118 $ ip ru add pref 200 oif vrf-blue table 10 119 $ ip ru add pref 200 iif vrf-blue table 10 120 $ ip -6 ru add pref 200 oif vrf-blue table 10 121 $ ip -6 ru add pref 200 iif vrf-blue table 10 122 123 1242. List VRFs 125 126 To list VRFs that have been created: 127 $ ip [-d] link show type vrf 128 NOTE: The -d option is needed to show the table id 129 130 For example: 131 $ ip -d link show type vrf 132 11: vrf-mgmt: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000 133 link/ether 72:b3:ba:91:e2:24 brd ff:ff:ff:ff:ff:ff promiscuity 0 134 vrf table 1 addrgenmode eui64 135 12: vrf-red: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000 136 link/ether b6:6f:6e:f6:da:73 brd ff:ff:ff:ff:ff:ff promiscuity 0 137 vrf table 10 addrgenmode eui64 138 13: vrf-blue: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000 139 link/ether 36:62:e8:7d:bb:8c brd ff:ff:ff:ff:ff:ff promiscuity 0 140 vrf table 66 addrgenmode eui64 141 14: vrf-green: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000 142 link/ether e6:28:b8:63:70:bb brd ff:ff:ff:ff:ff:ff 
promiscuity 0 143 vrf table 81 addrgenmode eui64 144 145 146 Or in brief output: 147 148 $ ip -br link show type vrf 149 vrf-mgmt UP 72:b3:ba:91:e2:24 <NOARP,MASTER,UP,LOWER_UP> 150 vrf-red UP b6:6f:6e:f6:da:73 <NOARP,MASTER,UP,LOWER_UP> 151 vrf-blue UP 36:62:e8:7d:bb:8c <NOARP,MASTER,UP,LOWER_UP> 152 vrf-green UP e6:28:b8:63:70:bb <NOARP,MASTER,UP,LOWER_UP> 153 154 1553. Assign a Network Interface to a VRF 156 157 Network interfaces are assigned to a VRF by enslaving the netdevice to a 158 VRF device: 159 $ ip link set dev NAME master VRF-NAME 160 161 On enslavement connected and local routes are automatically moved to the 162 table associated with the VRF device. 163 164 For example: 165 $ ip link set dev eth0 master vrf-mgmt 166 167 1684. Show Devices Assigned to a VRF 169 170 To show devices that have been assigned to a specific VRF add the master 171 option to the ip command: 172 $ ip link show master VRF-NAME 173 174 For example: 175 $ ip link show master vrf-red 176 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vrf-red state UP mode DEFAULT group default qlen 1000 177 link/ether 02:00:00:00:02:02 brd ff:ff:ff:ff:ff:ff 178 4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vrf-red state UP mode DEFAULT group default qlen 1000 179 link/ether 02:00:00:00:02:03 brd ff:ff:ff:ff:ff:ff 180 7: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master vrf-red state DOWN mode DEFAULT group default qlen 1000 181 link/ether 02:00:00:00:02:06 brd ff:ff:ff:ff:ff:ff 182 183 184 Or using the brief output: 185 $ ip -br link show master vrf-red 186 eth1 UP 02:00:00:00:02:02 <BROADCAST,MULTICAST,UP,LOWER_UP> 187 eth2 UP 02:00:00:00:02:03 <BROADCAST,MULTICAST,UP,LOWER_UP> 188 eth5 DOWN 02:00:00:00:02:06 <BROADCAST,MULTICAST> 189 190 1915. 
Show Neighbor Entries for a VRF 192 193 To list neighbor entries associated with devices enslaved to a VRF device 194 add the master option to the ip command: 195 $ ip [-6] neigh show master VRF-NAME 196 197 For example: 198 $ ip neigh show master vrf-red 199 10.2.1.254 dev eth1 lladdr a6:d9:c7:4f:06:23 REACHABLE 200 10.2.2.254 dev eth2 lladdr 5e:54:01:6a:ee:80 REACHABLE 201 202 $ ip -6 neigh show master vrf-red 203 2002:1::64 dev eth1 lladdr a6:d9:c7:4f:06:23 REACHABLE 204 205 2066. Show Addresses for a VRF 207 208 To show addresses for interfaces associated with a VRF add the master 209 option to the ip command: 210 $ ip addr show master VRF-NAME 211 212 For example: 213 $ ip addr show master vrf-red 214 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vrf-red state UP group default qlen 1000 215 link/ether 02:00:00:00:02:02 brd ff:ff:ff:ff:ff:ff 216 inet 10.2.1.2/24 brd 10.2.1.255 scope global eth1 217 valid_lft forever preferred_lft forever 218 inet6 2002:1::2/120 scope global 219 valid_lft forever preferred_lft forever 220 inet6 fe80::ff:fe00:202/64 scope link 221 valid_lft forever preferred_lft forever 222 4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vrf-red state UP group default qlen 1000 223 link/ether 02:00:00:00:02:03 brd ff:ff:ff:ff:ff:ff 224 inet 10.2.2.2/24 brd 10.2.2.255 scope global eth2 225 valid_lft forever preferred_lft forever 226 inet6 2002:2::2/120 scope global 227 valid_lft forever preferred_lft forever 228 inet6 fe80::ff:fe00:203/64 scope link 229 valid_lft forever preferred_lft forever 230 7: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master vrf-red state DOWN group default qlen 1000 231 link/ether 02:00:00:00:02:06 brd ff:ff:ff:ff:ff:ff 232 233 Or in brief format: 234 $ ip -br addr show master vrf-red 235 eth1 UP 10.2.1.2/24 2002:1::2/120 fe80::ff:fe00:202/64 236 eth2 UP 10.2.2.2/24 2002:2::2/120 fe80::ff:fe00:203/64 237 eth5 DOWN 238 239 2407. 
Show Routes for a VRF 241 242 To show routes for a VRF use the ip command to display the table associated 243 with the VRF device: 244 $ ip [-6] route show table ID 245 246 For example: 247 $ ip route show table vrf-red 248 prohibit default 249 broadcast 10.2.1.0 dev eth1 proto kernel scope link src 10.2.1.2 250 10.2.1.0/24 dev eth1 proto kernel scope link src 10.2.1.2 251 local 10.2.1.2 dev eth1 proto kernel scope host src 10.2.1.2 252 broadcast 10.2.1.255 dev eth1 proto kernel scope link src 10.2.1.2 253 broadcast 10.2.2.0 dev eth2 proto kernel scope link src 10.2.2.2 254 10.2.2.0/24 dev eth2 proto kernel scope link src 10.2.2.2 255 local 10.2.2.2 dev eth2 proto kernel scope host src 10.2.2.2 256 broadcast 10.2.2.255 dev eth2 proto kernel scope link src 10.2.2.2 257 258 $ ip -6 route show table vrf-red 259 local 2002:1:: dev lo proto none metric 0 pref medium 260 local 2002:1::2 dev lo proto none metric 0 pref medium 261 2002:1::/120 dev eth1 proto kernel metric 256 pref medium 262 local 2002:2:: dev lo proto none metric 0 pref medium 263 local 2002:2::2 dev lo proto none metric 0 pref medium 264 2002:2::/120 dev eth2 proto kernel metric 256 pref medium 265 local fe80:: dev lo proto none metric 0 pref medium 266 local fe80:: dev lo proto none metric 0 pref medium 267 local fe80::ff:fe00:202 dev lo proto none metric 0 pref medium 268 local fe80::ff:fe00:203 dev lo proto none metric 0 pref medium 269 fe80::/64 dev eth1 proto kernel metric 256 pref medium 270 fe80::/64 dev eth2 proto kernel metric 256 pref medium 271 ff00::/8 dev vrf-red metric 256 pref medium 272 ff00::/8 dev eth1 metric 256 pref medium 273 ff00::/8 dev eth2 metric 256 pref medium 274 275 2768. 
Route Lookup for a VRF 277 278 A test route lookup can be done for a VRF by adding the oif option to ip: 279 $ ip [-6] route get oif VRF-NAME ADDRESS 280 281 For example: 282 $ ip route get 10.2.1.40 oif vrf-red 283 10.2.1.40 dev eth1 table vrf-red src 10.2.1.2 284 cache 285 286 $ ip -6 route get 2002:1::32 oif vrf-red 287 2002:1::32 from :: dev eth1 table vrf-red proto kernel src 2002:1::2 metric 256 pref medium 288 289 2909. Removing Network Interface from a VRF 291 292 Network interfaces are removed from a VRF by breaking the enslavement to 293 the VRF device: 294 $ ip link set dev NAME nomaster 295 296 Connected routes are moved back to the default table and local entries are 297 moved to the local table. 298 299 For example: 300 $ ip link set dev eth0 nomaster 301 302-------------------------------------------------------------------------------- 303 304Commands used in this example: 305 306cat >> /etc/iproute2/rt_tables <<EOF 3071 vrf-mgmt 30810 vrf-red 30966 vrf-blue 31081 vrf-green 311EOF 312 313function vrf_create 314{ 315 VRF=$1 316 TBID=$2 317 # create VRF device 318 ip link add vrf-${VRF} type vrf table ${TBID} 319 320 # add rules that direct lookups to vrf table 321 ip ru add pref 200 oif vrf-${VRF} table ${TBID} 322 ip ru add pref 200 iif vrf-${VRF} table ${TBID} 323 ip -6 ru add pref 200 oif vrf-${VRF} table ${TBID} 324 ip -6 ru add pref 200 iif vrf-${VRF} table ${TBID} 325 326 if [ "${VRF}" != "mgmt" ]; then 327 ip route add table ${TBID} prohibit default 328 fi 329 ip link set dev vrf-${VRF} up 330 ip link set dev vrf-${VRF} state up 331} 332 333vrf_create mgmt 1 334ip link set dev eth0 master vrf-mgmt 335 336vrf_create red 10 337ip link set dev eth1 master vrf-red 338ip link set dev eth2 master vrf-red 339ip link set dev eth5 master vrf-red 340 341vrf_create blue 66 342ip link set dev eth3 master vrf-blue 343 344vrf_create green 81 345ip link set dev eth4 master vrf-green 346 347 348Interface addresses from /etc/network/interfaces: 349auto eth0 
iface eth0 inet static
      address 10.0.0.2
      netmask 255.255.255.0
      gateway 10.0.0.254

iface eth0 inet6 static
      address 2000:1::2
      netmask 120

auto eth1
iface eth1 inet static
      address 10.2.1.2
      netmask 255.255.255.0

iface eth1 inet6 static
      address 2002:1::2
      netmask 120

auto eth2
iface eth2 inet static
      address 10.2.2.2
      netmask 255.255.255.0

iface eth2 inet6 static
      address 2002:2::2
      netmask 120

auto eth3
iface eth3 inet static
      address 10.2.3.2
      netmask 255.255.255.0

iface eth3 inet6 static
      address 2002:3::2
      netmask 120

auto eth4
iface eth4 inet static
      address 10.2.4.2
      netmask 255.255.255.0

iface eth4 inet6 static
      address 2002:4::2
      netmask 120