Lines Matching refs:the
23 The behavior of the bonded interfaces depends upon the mode; generally
29 the original tools from extreme-linux and beowulf sites will not work
30 with this version of the driver.
32 For new versions of the driver, updated userspace tools, and
33 who to ask for help, please follow the links at the end of this file.
107 Most popular distro kernels ship with the bonding driver
111 the following steps:
113 1.1 Configure and build the kernel with bonding
116 The current version of the bonding driver is available in the
117 drivers/net/bonding subdirectory of the most recent kernel source
119 own" will want to use the most recent kernel from kernel.org.
122 "make config"), then select "Bonding driver support" in the "Network
123 device support" section. It is recommended that you configure the
124 driver as a module, since this is currently the only way to pass parameters
125 to the driver or configure more than one bonding device.
127 Build and install the new kernel and modules.
133 or sysfs, the old ifenslave control utility is obsolete.
138 Options for the bonding driver are supplied as parameters to the
141 Module options may be given as command line arguments to the
142 insmod or modprobe command, but are usually specified in either the
144 configuration file (some of which are detailed in the next section).
146 Details on bonding support for sysfs are provided in the
150 parameter is not specified the default value is used. When initially
154 It is critical that either the miimon or arp_interval and
159 Options with textual values will accept either the text name
160 or, for backwards compatibility, the option value. E.g.,
161 "mode=802.3ad" and "mode=4" set the same mode.
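The equivalence between text names and numeric values can be sketched as a small lookup. This is only an illustration: the helper name normalize_mode is hypothetical, and the table assumes the standard seven bonding modes numbered 0 through 6.

```python
# Sketch only: map bonding mode text names to their numeric values.
# The table assumes the standard seven bonding modes (0-6).
BONDING_MODES = {
    "balance-rr": 0,
    "active-backup": 1,
    "balance-xor": 2,
    "broadcast": 3,
    "802.3ad": 4,
    "balance-tlb": 5,
    "balance-alb": 6,
}


def normalize_mode(value: str) -> int:
    """Return the numeric mode for a text name or a numeric string.

    Hypothetical helper, not part of the bonding driver or its tools.
    """
    value = value.strip()
    if value in BONDING_MODES:
        return BONDING_MODES[value]
    number = int(value)  # raises ValueError for unknown text names
    if number not in BONDING_MODES.values():
        raise ValueError(f"unknown bonding mode: {value!r}")
    return number
```

With this table, normalize_mode("802.3ad") and normalize_mode("4") both yield 4.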
167 Specifies the new active slave for modes that support it
169 are the name of any currently enslaved interface, or an empty
170 string. If a name is given, the slave and its link must be up in order
171 to be selected as the new active slave. If an empty string is
172 specified, the current active slave is cleared, and a new active
175 Note that this is only available through the sysfs interface. No module
178 The normal value of this option is the name of the currently
179 active slave, or the empty string if there is no active slave or
180 the current mode does not use an active slave.
184 In an AD system, this specifies the system priority. The allowed range
185 is 1 - 65535. If the value is not specified, it takes 65535 as the
193 In an AD system, this specifies the mac-address for the actor in
195 multicast. It is preferred to have the local-admin bit set for this
196 MAC, but the driver does not enforce it. If the value is not given, the
197 system defaults to using the master's MAC address as the actor's system
205 Specifies the 802.3ad aggregation selection logic to use. The
213 Reselection of the active aggregator occurs only when all
214 slaves of the active aggregator are down or the active
217 This is the default value.
224 - A slave is added to or removed from the bond
234 The active aggregator is chosen by the largest number of
235 ports (slaves). Reselection occurs as described under the
239 802.3ad aggregations when partial failure of the active aggregator
240 occurs. This keeps the aggregator with the highest availability
247 In an AD system, the port-key has three parts as shown below -
254 This defines the upper 10 bits of the port key. The values can be
255 from 0 - 1023. If not given, the system defaults to 0.
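As a rough sketch of how a user-defined value could occupy the upper 10 bits of a port key: the code below assumes a 16-bit key whose lower 6 bits are filled in by the driver and left untouched here. The function name apply_user_port_key is hypothetical.

```python
# Sketch only: place a user-defined value in the upper 10 bits of a
# 16-bit port key, preserving the lower 6 bits (assumed driver-owned).
USER_KEY_SHIFT = 6          # upper 10 bits of a 16-bit key start at bit 6
LOWER_BITS_MASK = 0x003F    # the 6 bits below the user-defined part


def apply_user_port_key(port_key: int, user_key: int) -> int:
    """Return port_key with its upper 10 bits replaced by user_key."""
    if not 0 <= user_key <= 1023:
        raise ValueError("user port key must be in the range 0 - 1023")
    return (port_key & LOWER_BITS_MASK) | (user_key << USER_KEY_SHIFT)
```

For example, applying the maximum user key 1023 to a key whose lower bits are 0x05 gives 0xFFC5, while the default of 0 leaves only the lower bits set.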
274 Specifies the ARP link monitoring frequency in milliseconds.
276 The ARP monitor works by periodically checking the slave
278 traffic recently (the precise criteria depend upon the
279 bonding mode, and the state of the slave). Regular traffic is
280 generated via ARP probes issued for the addresses specified by
281 the arp_ip_target option.
283 This behavior can be modified by the arp_validate option,
287 (modes 0 and 2), the switch should be configured in a mode
288 that evenly distributes packets across all links. If the
289 switch is configured to distribute the packets in an XOR
290 fashion, all replies from the ARP targets will be received on
291 the same link which could cause the other team members to
298 Specifies the IP addresses to use as ARP monitoring peers when
299 arp_interval is > 0. These are the targets of the ARP request
300 sent to determine the health of the link to the targets.
322 Validation is performed only for the active slave.
340 only for the active slave.
349 Enabling validation causes the ARP monitor to examine the incoming
351 is receiving the appropriate ARP traffic.
353 For an active slave, the validation checks ARP replies to confirm
355 do not typically receive these replies, the validation performed
356 for backup slaves is on the broadcast ARP request sent out via the
358 configurations may result in situations wherein the backup slaves
359 do not receive the ARP requests; in such a situation, validation
364 the active slave failure, it doesn't really guarantee that the
365 backup slave will work if it's selected as the next active slave.
369 beyond a common switch. Should the link between the switch and
370 target fail (but not the switch itself), the probe traffic
371 generated by the multiple bonding instances will fool the standard
372 ARP monitor into considering the links as still up. Use of
373 validation can resolve this, as the ARP monitor will only consider
379 Enabling filtering causes the ARP monitor to only use incoming ARP
384 Filtering operates by only considering the reception of ARP
390 levels of third party broadcast traffic would fool the standard
391 ARP monitor into considering the links as still up. Use of
399 Specifies the quantity of arp_ip_targets that must be reachable
400 in order for the ARP monitor to consider a slave as being up.
408 consider the slave up only when any of the arp_ip_targets
413 consider the slave up only when all of the arp_ip_targets
418 Specifies the time, in milliseconds, to wait before disabling
420 is only valid for the miimon link monitor. The downdelay
421 value should be a multiple of the miimon value; if not, it
422 will be rounded down to the nearest multiple. The default
428 the same MAC address at enslavement (the traditional
429 behavior), or, when enabled, perform special handling of the
430 bond's MAC address in accordance with the selected policy.
438 the same MAC address at enslavement time. This is the
443 The "active" fail_over_mac policy indicates that the
444 MAC address of the bond should always be the MAC
445 address of the currently active slave. The MAC
446 address of the slaves is not changed; instead, the MAC
447 address of the bond changes during a failover.
452 interferes with the ARP monitor).
455 the network must be updated via gratuitous ARP,
458 traffic, if the switch snoops incoming traffic to
459 update its tables) for the traditional method. If the
463 When this policy is used in conjunction with the mii
466 susceptible to loss of the gratuitous ARP, and an
471 The "follow" fail_over_mac policy causes the MAC
472 address of the bond to be selected normally (normally
473 the MAC address of the first slave added to the bond).
474 However, the second and subsequent slaves are not set
476 slave is programmed with the bond's MAC address at
477 failover time (and the formerly active slave receives
478 the newly active slave's MAC address).
482 when multiple ports are programmed with the same MAC
486 The default policy is none, unless the first slave cannot
487 change its MAC address, in which case the active policy is
491 present in the bond.
498 Option specifying the rate at which we'll ask our link partner
512 Specifies the number of bonding devices to create for this
513 instance of the bonding driver. E.g., if max_bonds is 3, and
514 the bonding driver is not already loaded, then bond0, bond1
520 Specifies the MII link monitoring frequency in milliseconds.
521 This determines how often the link state of each slave is
524 The use_carrier option, below, affects how the link state is
525 determined. See the High Availability section for additional
530 Specifies the minimum number of links that must be active before
531 asserting carrier. It is similar to the Cisco EtherChannel min-links
532 feature. This allows setting the minimum number of member ports that
533 must be up (link-up state) before marking the bond device as up
540 802.3ad mode) whenever there is an active aggregator, regardless of the
543 setting this option to 0 or to 1 has the exact same effect.
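The reason 0 and 1 behave identically can be shown with a minimal model of the carrier decision. This is a simplification under the assumption that an aggregator needs at least one available link to be active; carrier_asserted is a hypothetical name.

```python
# Sketch only: simplified model of the min_links carrier decision.
def carrier_asserted(active_links: int, min_links: int = 0) -> bool:
    """Return True when the bond would assert carrier in this model.

    An aggregator cannot be active without at least one available link,
    so the effective threshold is never below 1.
    """
    return active_links >= max(min_links, 1)
```

In this model, carrier_asserted(1, 0) and carrier_asserted(1, 1) are both true, while carrier_asserted(1, 2) is false.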
547 Specifies one of the bonding policies. The default is
553 order from the first available slave through the
559 Active-backup policy: Only one slave in the bond is
561 if, the active slave fails. The bond's MAC address is
563 to avoid confusing the switch.
567 or more gratuitous ARPs on the newly active slave.
568 One gratuitous ARP is issued for the bonding master
570 it, provided that the interface has at least one IP
572 interfaces are tagged with the appropriate VLAN id.
575 option, documented below, affects the behavior of this
580 XOR policy: Transmit based on the selected transmit
584 policies may be selected via the xmit_hash_policy option,
597 aggregation groups that share the same speed and
598 duplex settings. Utilizes all slaves in the active
599 aggregator according to the 802.3ad specification.
602 to the transmit hash policy, which may be changed from
603 the default simple XOR policy via the xmit_hash_policy
606 regards to the packet mis-ordering requirements of
607 section 43.2.4 of the 802.3ad standard. Differing
613 1. Ethtool support in the base drivers for retrieving
614 the speed and duplex of each slave.
627 In tlb_dynamic_lb=1 mode, the outgoing traffic is
628 distributed according to the current load (computed
629 relative to the speed) on each slave.
631 In tlb_dynamic_lb=0 mode, the load balancing based on
632 current load is disabled and the load is distributed
633 only using the hash distribution.
635 Incoming traffic is received by the current slave.
636 If the receiving slave fails, another slave takes over
637 the MAC address of the failed receiving slave.
641 Ethtool support in the base drivers for retrieving the
650 The bonding driver intercepts the ARP Replies sent by
651 the local system on their way out and overwrites the
652 source hardware address with the unique hardware
653 address of one of the slaves in the bond such that
655 the server.
657 Receive traffic from connections created by the server
658 is also balanced. When the local system sends an ARP
659 Request the bonding driver copies and saves the peer's
660 IP information from the ARP packet. When the ARP
661 Reply arrives from the peer, its hardware address is
662 retrieved and the bonding driver initiates an ARP
663 reply to this peer assigning it to one of the slaves
664 in the bond. A problematic outcome of using ARP
666 ARP request is broadcast it uses the hardware address
667 of the bond. Hence, peers learn the hardware address
668 of the bond and the balancing of receive traffic
669 collapses to the current slave. This is handled by
670 sending updates (ARP Replies) to all the peers with
672 the traffic is redistributed. Receive traffic is also
673 redistributed when a new slave is added to the bond
676 among the group of highest speed slaves in the bond.
678 When a link is reconnected or a new slave joins the
679 bond the receive traffic is redistributed among all
680 active slaves in the bond by initiating ARP Replies
681 with the selected MAC address to each of the
683 be set to a value equal or greater than the switch's
684 forwarding delay so that the ARP Replies sent to the
685 peers will not be blocked by the switch.
689 1. Ethtool support in the base drivers for retrieving
690 the speed of each slave.
692 2. Base driver support for setting the hardware
694 required so that there will always be one slave in the
695 team using the bond hardware address (the
697 address for each slave in the bond. If the
699 swapped with the new curr_active_slave that was
705 Specify the number of peer notifications (gratuitous ARPs and
707 failover event. As soon as the link is up on the new slave
708 (possibly immediately) a peer notification is sent on the
711 is active) if the number is greater than 1.
713 The valid range is 0 - 255; the default value is 1. These options
714 affect only the active-backup mode. These options were added for
718 are generated by the ipv4 and ipv6 code and the numbers of
723 Specify the number of packets to transmit through a slave before
724 moving to the next one. When set to 0, a slave is chosen at
727 The valid range is 0 - 65535; the default value is 1. This option
732 A string (eth0, eth2, etc) specifying which slave is the
733 primary device. The specified device will always be the
734 active slave while it is available. Only when the primary is
744 Specifies the reselection policy for the primary slave. This
745 affects how the primary slave is chosen to become the active slave
746 when failure of the active slave or recovery of the primary slave
748 the primary slave and other slaves. Possible values are:
752 The primary slave becomes the active slave whenever it
757 The primary slave becomes the active slave when it comes
758 back up, if the speed and duplex of the primary slave are
759 better than the speed and duplex of the current active
764 The primary slave becomes the active slave only if the
765 current active slave fails and the primary slave is up.
769 If no slaves are active, the first slave to recover is
770 made the active slave.
772 When initially enslaved, the primary slave is always made
773 the active slave.
775 Changing the primary_reselect policy via sysfs will cause an
776 immediate selection of the best active slave according to the new
777 policy. This may or may not result in a change of the active
778 slave, depending upon the circumstances.
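The reselection policies can be summarized as a decision made when the primary slave comes back up while another slave is active. The sketch below assumes the policy names "always", "better" and "failure", and assumes that "better" means higher speed, or equal speed with better duplex. The function name is hypothetical.

```python
# Sketch only: should a recovered primary slave replace the current
# active slave? Policy names and the "better" ordering are assumptions.
def primary_takes_over(policy: str,
                       primary_speed: int, primary_full_duplex: bool,
                       active_speed: int, active_full_duplex: bool) -> bool:
    if policy == "always":
        # The primary becomes active whenever it comes back up.
        return True
    if policy == "better":
        # Compare speed first, then duplex as a tie-breaker.
        return ((primary_speed, primary_full_duplex)
                > (active_speed, active_full_duplex))
    if policy == "failure":
        # The primary only takes over when the current active slave fails.
        return False
    raise ValueError(f"unknown primary_reselect policy: {policy!r}")
```

This covers only the recovery case; the rules for initial enslavement and for the situation where no slave is active apply regardless of policy.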
788 slaves based on the load in that interval. This gives nice lb
791 load balancing provided solely by the hash distribution.
792 xmit-hash-policy can be used to select the appropriate hashing for
793 the setup.
795 The sysfs entry can be used to change the setting per bond device
796 and the initial value is derived from the module parameter. The
797 sysfs entry is allowed to be changed only if the bond device is
806 Specifies the time, in milliseconds, to wait before enabling a
808 only valid for the miimon link monitor. The updelay value
809 should be a multiple of the miimon value; if not, it will be
810 rounded down to the nearest multiple. The default value is 0.
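The rounding rule can be illustrated with integer arithmetic. The helper below is a hypothetical sketch, not driver code; it applies equally to the updelay and downdelay values described in this file.

```python
# Sketch only: round a delay down to the nearest multiple of miimon.
def effective_delay(delay_ms: int, miimon_ms: int) -> int:
    """Return delay_ms rounded down to a multiple of miimon_ms."""
    if miimon_ms <= 0:
        raise ValueError("miimon must be greater than 0")
    if delay_ms < 0:
        raise ValueError("delay must not be negative")
    return (delay_ms // miimon_ms) * miimon_ms
```

For instance, with miimon=100 a requested delay of 250 becomes 200, while 300 is used unchanged.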
815 ioctls vs. netif_carrier_ok() to determine the link
817 utilize a deprecated calling sequence within the kernel. The
818 netif_carrier_ok() relies on the device driver to maintain its
822 If bonding insists that the link is up when it should not be,
826 it will appear as if the link is always up. In this case,
827 setting use_carrier to 0 will cause bonding to revert to the
828 MII / ETHTOOL ioctl method to determine the link state.
830 A value of 1 enables the use of netif_carrier_ok(), a value of
831 0 will use the deprecated MII / ETHTOOL ioctls. The default
836 Selects the transmit hash policy to use for slave selection in
842 field to generate the hash. The formula is
848 network peer on the same slave.
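A minimal sketch of a layer2-style hash follows. It assumes the formula is the XOR of the source MAC, the destination MAC and the packet type ID, taken modulo the slave count; it illustrates the shape of the calculation, not the driver's exact byte-level implementation. The function names are hypothetical.

```python
# Sketch only: layer2-style transmit hash under the assumed formula
#   hash = source MAC XOR destination MAC XOR packet type ID
#   slave number = hash modulo slave count
def mac_to_int(mac: str) -> int:
    """Convert a colon-separated MAC address to an integer."""
    return int(mac.replace(":", ""), 16)


def layer2_slave_index(src_mac: str, dst_mac: str,
                       packet_type: int, slave_count: int) -> int:
    """Pick a slave index for a frame from its layer 2 fields."""
    if slave_count <= 0:
        raise ValueError("slave_count must be positive")
    value = mac_to_int(src_mac) ^ mac_to_int(dst_mac) ^ packet_type
    return value % slave_count
```

Because every input is fixed for a given peer and packet type, repeated calls with the same arguments always select the same slave index.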
855 protocol information to generate the hash.
858 generate the hash. The formula is
866 If the protocol is IPv6 then the source and destination
870 network peer on the same slave. For non-IP traffic,
871 the formula is the same as for the layer2 transmit
884 when available, to generate the hash. This allows for
891 hash = source port, destination port (as in the header)
897 If the protocol is IPv6 then the source and destination
901 IPv6 protocol traffic, the source and destination port
902 information is omitted. For non-IP traffic, the
903 formula is the same as for the layer2 transmit hash
918 This policy uses the same formula as layer2+3 but it
919 relies on skb_flow_dissect to obtain the header fields
920 which might result in the use of inner headers if an
922 improve the performance for tunnel users because the
923 packets will be distributed according to the encapsulated
928 This policy uses the same formula as layer3+4 but it
929 relies on skb_flow_dissect to obtain the header fields
930 which might result in the use of inner headers if an
932 improve the performance for tunnel users because the
933 packets will be distributed according to the encapsulated
938 does not exist, and the layer2 policy is the only policy. The
943 Specifies the number of IGMP membership reports to be issued after
945 the failover, subsequent packets are sent in each 200ms interval.
947 The valid range is 0 - 255; the default value is 1. A value of 0
948 prevents the IGMP membership report from being issued in response
949 to the failover event.
953 switch the IGMP traffic from one slave to another. Therefore a fresh
954 IGMP report must be issued to cause the switch to forward the incoming
955 IGMP traffic over the newly selected slave.
961 Specifies the number of seconds between instances where the bonding
964 The valid range is 1 - 0x7fffffff; the default value is 1. This option
971 initialization scripts, or manually using either iproute2 or the
972 sysfs interface. Distros generally use one of three packages for the
977 We will first describe the options for configuring bonding for
980 bonding without support from the network initialization scripts (i.e.,
991 Else, issue the command:
996 "initscripts" or "sysconfig," followed by some numbers. This is the
1000 issue the command:
1014 bonding, however, at this writing, the YaST system configuration
1018 First, if they have not already been configured, configure the
1019 slave devices. On SLES 9, this is most easily done by running the
1022 this is to configure the devices for DHCP (this is only to get the
1024 name of the configuration file for each device will be of the form:
1028 Where the "xx" portion will be replaced with the digits from
1029 the device's permanent MAC address.
1031 Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been
1032 created, it is necessary to edit the configuration files for the slave
1033 devices (the MAC addresses correspond to those of the slave devices).
1034 Before editing, the file will contain multiple lines, and will look
1043 Change the BOOTPROTO and STARTMODE lines to the following:
1048 Do not alter the UNIQUE or _nm_name lines. Remove any other
1051 Once the ifcfg-id-xx:xx:xx:xx:xx:xx files have been modified,
1052 it's time to create the configuration file for the bonding device
1053 itself. This file is named ifcfg-bondX, where X is the number of the
1055 ifcfg-bond0, the second is ifcfg-bond1, and so on. The sysconfig
1059 The contents of the ifcfg-bondX file are as follows:
1073 Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK
1074 values with the appropriate values for your network.
1076 The STARTMODE specifies when the device is brought online.
1092 The line BONDING_MASTER='yes' indicates that the device is a
1095 The contents of BONDING_MODULE_OPTS are supplied to the
1096 instance of the bonding module for this device. Specify the options
1097 for the bonding mode, link monitoring, and so on here. Do not include
1098 the max_bonds bonding parameter; this will confuse the configuration
1104 specifier for the network device. The interface name is easier to
1105 find, but the ethN names are subject to change at boot time if, e.g.,
1106 a device early in the sequence has failed. The device specifiers
1107 (bus-pci-0000:06:08.1 in the example above) specify the physical
1108 network device, and will not change unless the device's bus location
1111 configurations will choose one or the other for all slave devices.
1114 networking must be restarted for the configuration changes to take
1115 effect. This can be accomplished via the following:
1119 Note that the network control script (/sbin/ifdown) will
1120 remove the bonding module as part of the network shutdown processing,
1121 so it is not necessary to remove the module by hand if, e.g., the
1126 devices). It is necessary to edit the configuration file by hand to
1127 change the bonding configuration.
1129 Additional general options and details of the ifcfg file
1134 Note that the template does not document the various BONDING_
1135 settings described above, but does describe many of the other options.
1142 writing, this does not function for bonding devices; the scripts
1143 attempt to obtain the device address from DHCP prior to adding any of
1144 the slave devices. Without active slaves, the DHCP requests are not
1145 sent to the network.
1153 (as described above). Do not specify the "max_bonds" parameter to any
1158 Because the sysconfig scripts supply the bonding module
1159 options in the ifcfg-bondX file, it is not necessary to add them to
1160 the system /etc/modules.d/*.conf configuration files.
1167 version 3 or later, Fedora, etc. On these systems, the network
1169 control bonding devices. Note that older versions of the initscripts
1173 These distros will not automatically load the network adapter
1174 driver unless the ethX device is configured with an IP address.
1177 a bondX link. Network script files are located in the directory:
1182 with the adapter's physical adapter number. For example, the script
1184 Place the following text in the file:
1194 must correspond with the name of the file, i.e., ifcfg-eth1 must have
1195 a device line of DEVICE=eth1. The setting of the MASTER= line will
1196 also depend on the final bonding interface name chosen for your bond.
1198 one for each device, i.e., the first bonding instance is bond0, the
1203 the number of the bond. For bond0 the file is named "ifcfg-bond0",
1205 place the following text:
1216 Be sure to change the networking specific lines (IPADDR,
1221 and, indeed, preferable, to specify the bonding options in the ifcfg-bond0
1222 file, e.g. a line of the format:
1226 will configure the bond with the specified options. The options
1227 specified in BONDING_OPTS are identical to the bonding module parameters
1228 except for the arp_ip_target field when using versions of initscripts older
1231 should be preceded by a '+' to indicate it should be added to the list of
1236 is the proper syntax to specify multiple targets. When specifying
1241 your distro) to load the bonding module with your desired options when the
1243 will load the bonding module, and select its options:
1248 Replace the sample parameters with the appropriate set of
1252 will restart the networking subsystem and your bond link should be now
1258 Recent versions of initscripts (the versions supplied with Fedora
1264 above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp"
1265 and add a line consisting of "TYPE=Bonding". Note that the TYPE value
1273 specifying the appropriate BONDING_OPTS= in ifcfg-bondX where X is the
1274 number of the bond. This support requires sysfs support in the kernel,
1277 those instances, see the "Configuring Multiple Bonds Manually" section,
1284 scripts (the sysconfig or initscripts package) do not have specific
1288 The general method for these systems is to place the bonding
1290 appropriate for the installed distro), then add modprobe and/or
1291 `ip link` commands to the system's global init script. The name of
1292 the global init script differs; for sysconfig, it is
1297 reboots, edit the appropriate file (/etc/init.d/boot.local or
1298 /etc/rc.d/rc.local), and add the following:
1306 Replace the example bonding module parameters and bond0
1307 network configuration (IP address, netmask, etc) with the appropriate
1310 Unfortunately, this method will not provide support for the
1311 ifup and ifdown scripts on the bond devices. To reload the bonding
1312 configuration, it is necessary to run the initialization script, e.g.,
1321 which only initializes the bonding configuration, then call that
1323 enabled without re-running the entire global init script.
1325 To shut down the bonding devices, it is necessary to first
1326 mark the bonding device itself as being down, then remove the
1328 the following:
1345 If you require multiple bonding devices, but all with the same
1346 options, you may wish to use the "max_bonds" module parameter,
1350 preferable to use bonding parameters exported by sysfs, documented in the
1353 For versions of bonding without sysfs support, the only means to
1355 the bonding driver multiple times. Note that current versions of the
1357 your distro uses these scripts, no special action is needed. See the
1361 To load multiple instances of the module, it is necessary to
1362 specify a different name for each instance (the module loading system
1363 requires that every loaded module, even multiple instances of the same
1373 will load the bonding module two times. The first instance is
1374 named "bond0" and creates the bond0 device in balance-rr mode with an
1375 miimon of 100. The second instance is named "bond1" and creates the
1379 the above does not work, and the second bonding instance never sees
1380 its options. In that case, the second options line can be substituted
1390 to rename modules at load time (the "-o bond1" part). Attempts to pass
1401 via the sysfs interface. This interface allows dynamic configuration
1402 of all bonds in the system without unloading the module. It also
1406 Use of the sysfs interface allows you to use multiple bonds
1407 with different configurations without having to reload the module.
1409 bonding is compiled into the kernel.
1411 You must have the sysfs filesystem mounted to configure
1413 are using the standard mount point for sysfs, e.g. /sys. If your
1414 sysfs filesystem is mounted elsewhere, you will need to adjust the
1434 Interfaces may be enslaved to a bond using the file
1436 are the same as for the bonding_masters file.
1445 When an interface is enslaved to a bond, symlinks between the
1446 two are created in the sysfs filesystem. In this case, you would get
1451 interface is enslaved by looking for the master symlink. Thus:
1454 the name of the bond interface.
1458 Each bond may be configured individually by manipulating the
1461 The names of these files correspond directly with the command-
1462 line parameters described elsewhere in this file, and, with the
1463 exception of arp_ip_target, they accept the same values. To see the
1464 current setting, simply cat the appropriate file.
1467 guidelines for each parameter, see the appropriate section in this
1475 NOTE: The bond interface must be down before the mode can be
1491 To configure the interval between learning packet transmits:
1493 NOTE: the lp_interval is the number of seconds between instances where
1494 the bonding driver sends learning packets to each slave's peer switch. The
1499 We begin with the same example that is shown in section 3.3,
1503 and eth1), and have it persist across reboots, edit the appropriate
1504 file (/etc/init.d/boot.local or /etc/rc.d/rc.local), and add the
1516 active-backup mode, using ARP monitoring, add the following lines to
1536 the box. The ifenslave-2.6 package should be installed to provide bonding
1540 Note that ifenslave-2.6 package will load the bonding module and use
1541 the ifenslave command when appropriate.
1546 In /etc/network/interfaces, the following stanza will configure bond0, in
1556 If the above configuration doesn't work, you might have a system using
1559 produce the same result on those systems.
1578 more advanced examples tailored to your particular distro, see the files in
1584 When using the bonding driver, the physical port which transmits a frame is
1585 typically selected by the bonding driver, and is not relevant to the user or
1586 system administrator. The output port is simply selected using the policies of
1587 the selected bonding mode. On occasion however, it is helpful to direct certain
1591 connects via a public network, it may be desirable to bias the bond to send said
1594 using the traffic control utilities inherent in Linux.
1596 By default the bonding driver is multiqueue aware and 16 queues are created
1597 when the driver initializes (see Documentation/networking/multiqueue.txt
1598 for details). If more or fewer queues are desired, the module parameter
1600 available as the allocation is done at module init time.
1602 The output of the file /proc/net/bonding/bondX has changed so the output Queue
1625 The queue_id for a slave can be set using the command:
1630 like the one above until proper priorities are set for all interfaces. On
1634 These queue id's can be used in conjunction with the tc utility to configure
1636 slave devices. For instance, say we wanted, in the above configuration to
1637 force all traffic bound to 192.168.1.100 to use eth1 in the bond as its output
1645 These commands tell the kernel to attach a multiqueue queue discipline to the
1648 This value is then passed into the driver, causing the normal output path
1651 Note that qid values begin at 1. Qid 0 is reserved to indicate to the driver
1653 leaving the qid for a slave to 0 is the multiqueue awareness in the bonding
1655 slave devices as well as bond devices and the bonding driver will simply act as
1656 a pass-through for selecting output queues on the slave device rather than
1665 When using 802.3ad bonding mode, the Actor (host) and Partner (switch)
1668 supposed to forward). However, most of the values are easily predictable
1669 or are simply the machine's MAC address (which is trivially known to all
1670 other hosts in the same L2). This implies that other machines in the L2
1671 domain can spoof LACPDU packets from other hosts to the switch and potentially
1672 cause mayhem by joining (from the point of view of the switch) another
1681 Also it's preferable to set the local-admin bit. The following shell code
1693 (b) ad_actor_sys_prio : Randomize the system priority. The default value
1694 is 65535, but the system can take a value from 1 - 65535. The following shell
1700 (c) ad_user_port_key : Use the user portion of the port-key. The default
1701 keeps this empty. These are the upper 10 bits of the port-key and value
1715 Each bonding device has a read-only file residing in the
1717 about the bonding configuration, options and state of each slave.
1719 For example, the contents of /proc/net/bonding/bond0 after the
1739 The precise format and contents will change depending upon the
1740 bonding configuration, state, and version of the bonding driver.
1745 The network configuration can be inspected using the ifconfig
1746 command. Bonding devices will have the MASTER flag set; Bonding slave
1747 devices will have the SLAVE flag set. The ifconfig output does not
1750 In the example below, the bond0 interface is the master
1752 bond0 have the same MAC address (HWaddr) as bond0 for all modes except
1780 For this section, "switch" refers to whatever system the
1781 bonded devices are directly connected to (i.e., where the other end of
1782 the cable plugs into). This may be an actual dedicated switch device,
1787 require any specific configuration of the switch.
1789 The 802.3ad mode requires that the switch have the appropriate
1792 Cisco 3550 series switch requires that the appropriate ports first be
1798 require that the switch have the appropriate ports grouped together.
1800 called an "etherchannel" (as in the Cisco example, above), a "trunk
1802 will also have its own configuration options for the switch's transmit
1803 policy to the bond. Typical choices include XOR of either the MAC or
1804 IP addresses. The transmit policy of the two peers does not need to
1805 match. For these three modes, the bonding mode really selects a
1814 using the 8021q driver. However, only packets coming from the 8021q
1817 packets generated by either ALB mode or the ARP monitor mechanism, are
1819 "learn" the VLAN IDs configured above it, and use those IDs to tag
1822 For reasons of simplicity, and to support the use of adapters
1823 that can do VLAN hardware acceleration offloading, the bonding
1825 the add_vid/kill_vid notifications to gather the necessary
1826 information, and it propagates those actions to the slaves. In case
1829 "un-accelerated" by the bonding driver so the VLAN tag sits in the
1834 hardware address of 00:00:00:00:00:00 until the first slave is added.
1835 If the VLAN interface is created prior to the first enslavement, it
1836 would pick up the all-zeroes hardware address. Once the first slave
1837 is attached to the bond, the bond device itself will pick up the
1838 slave's hardware address, which is then available for the VLAN device.
1842 top of it. When a new slave is added, the bonding interface will
1843 obtain its hardware address from the first slave, which might not
1844 match the hardware address of the VLAN interfaces (which was
1847 There are two methods to ensure that the VLAN device operates
1848 with the correct hardware address if all slaves are removed from a
1853 2. Set the bonding interface's hardware address so that it
1854 matches the hardware address of the VLAN interfaces.
1856 Note that changing a VLAN interface's HW address would set the
1857 underlying device -- i.e. the bonding interface -- to promiscuous
1865 monitoring a slave device's link state: the ARP monitor and the MII
1868 At the present time, due to implementation restrictions in the
1876 queries to one or more designated peer systems on the network, and
1877 uses the response as an indication that the link is operating. This
1879 or more peers on the local network.
1881 The ARP monitor relies on the device driver itself to verify
1882 that traffic is flowing. In particular, the driver must keep up to
1883 date the last receive time, dev->last_rx, and transmit start time,
1884 dev->trans_start. If these are not updated by the driver, then the
1887 shows the ARP requests and replies on the network, then it may be that
1895 monitor. In the case of just one target, the target itself may go
1897 an additional target (or several) increases the reliability of the ARP
1906 For just a single target the options would resemble:
1916 The MII monitor monitors only the carrier state of the local
1918 depending upon the device driver to maintain its carrier state, by
1919 querying the device's MII registers, or by making an ethtool query to
1920 the device.
1922 If the use_carrier module parameter is 1 (the default value),
1923 then the MII monitor will rely on the driver for carrier state
1924 information (via the netif_carrier subsystem). As explained in the
1925 use_carrier parameter information, above, if the MII monitor fails to
1926 detect carrier loss on the device (e.g., when the cable is physically
1927 disconnected), it may be that the driver does not support
1930 If use_carrier is 0, then the MII monitor will first query the
1931 device's MII registers (via ioctl) and check the link state. If that
1932 request fails (not just that it returns carrier down), then the MII
1934 the same information. If both methods fail (i.e., the driver either
1935 does not support or had some error in processing both the MII register
1936 and ethtool requests), then the MII monitor will assume the link is
1945 When bonding is configured, it is important that the slave
1946 devices not have routes that supersede routes of the master (or,
1947 generally, not have routes at all). For example, suppose the bonding
1948 device bond0 has two slaves, eth0 and eth1, and the routing table is
1958 This routing configuration will likely still update the
1959 receive/transmit times in the driver (needed by the ARP monitor), but
1960 may bypass the bonding driver (because outgoing traffic to, in this
1964 configuration, because ARP requests (generated by the ARP monitor)
1965 will be sent on one interface (bond0), but the corresponding reply
1969 by the state of the routing table.
1973 not supersede routes of their master. This should generally be the
1982 that the same physical device always has the same "ethX" name), it may
1986 For example, given a modules.conf containing the following:
1995 If neither eth0 nor eth1 is a slave to bond0, then when the
1996 bond0 interface comes up, the devices may end up reordered. This
1999 when the e1000 driver loads, it will receive eth0 and eth1 for its
2000 devices, but the bonding configuration tries to enslave eth2 and eth3
2001 (which may later be assigned to the tg3 devices).
2003 Adding the following:
2008 bonding is loaded. This command is fully documented in the
2012 In this case, the following can be added to config files in
2017 This will load tg3 and e1000 modules before loading the bonding one.
2018 Full documentation on this can be found in the modprobe.d and modprobe
2024 By default, bonding enables the use_carrier option, which
2025 instructs bonding to trust the driver to maintain carrier state.
2027 As discussed in the options section, above, some drivers do
2028 not support the netif_carrier_on/_off link state tracking system.
2033 not maintain it in real time, e.g., only polling the link state at
2037 use_carrier=0 to see if that improves the failure detection time. If
2038 it does, then it may be that the driver checks the carrier state at a
2039 fixed interval, but does not cache the MII register values (so the
2040 use_carrier=0 method of querying the registers directly works). If
2041 use_carrier=0 does not improve the failover, then the driver may cache
2042 the registers, or the problem may be elsewhere.
2044 Also, remember that miimon only checks for the device's
2045 carrier state. It has no way to determine the state of devices on or
2052 If running SNMP agents, the bonding driver should be loaded
2054 is due to the interface index (ipAdEntIfIndex) being associated with
2055 the first interface found with a given IP address. That is, there is
2057 eth1 are slaves of bond0 and the driver for eth0 is loaded before the
2058 bonding driver, the interface for the IP address will be associated
2059 with the eth0 interface. This configuration is shown below; the IP
2061 in the ifDescr table (ifDescr.2).
2074 This problem is avoided by loading the bonding driver before
2076 loading the bonding driver first, the IP address 192.168.1.1 is
2090 While some distributions may not report the interface name in
2091 ifDescr, the association between the IP address and IfIndex remains
2099 common to enable promiscuous mode on the device, so that all traffic
2100 is seen (instead of seeing only traffic destined for the local host).
2101 The bonding driver handles promiscuous mode changes to the bonding
2102 master device (e.g., bond0), and propagates the setting to the slave
2105 For the balance-rr, balance-xor, broadcast, and 802.3ad modes,
2106 the promiscuous mode setting is propagated to all slaves.
2108 For the active-backup, balance-tlb and balance-alb modes, the
2109 promiscuous mode setting is propagated only to the active slave.
2111 For balance-tlb mode, the active slave is the slave currently
2114 For balance-alb mode, the active slave is the slave used as a
2116 sending to peers that are unassigned or if the load is unbalanced.
2118 For the active-backup, balance-tlb and balance-alb modes, when
2119 the active slave changes (e.g., due to a link failure), the
2120 promiscuous setting will be propagated to the new active slave.
2127 links or switches between the host and the rest of the world. The
2128 goal is to provide the maximum availability of network connectivity
2129 (i.e., the network always works), even though other configurations
2139 access to fail over to. Additionally, the bonding load balance modes
2141 the load will be rebalanced across the remaining devices.
2149 With multiple switches, the configuration of bonding and the
2153 Below is a sample network, configured to maximize the
2154 availability of the network:
2168 In this configuration, there is a link between the two
2170 the outside world ("port3" on each switch). There is no technical
2176 In a topology such as the example above, the active-backup and
2177 broadcast modes are the only useful bonding modes when optimizing for
2178 availability; the other modes require all links to terminate on the
2181 active-backup: This is generally the preferred mode, particularly if
2182 the switches have an ISL and play together well. If the
2185 then the primary option can be used to ensure that the
2189 only for very specific needs. For example, if the two
2190 switches are not connected (no ISL), and the networks beyond
2193 independent networks, then the broadcast mode may be suitable.
2199 switch. If the switch can reliably fail ports in response to other
2200 failures, then either the MII or ARP monitors should work. For
2201 example, in the above example, if the "port3" link fails at the remote
2202 end, the MII monitor has no direct means to detect this. The ARP
2203 monitor could be configured with a target at the remote end of port3,
2206 In general, however, in a multiple switch topology, the ARP
2208 end connectivity failures (which may be caused by the failure of any
2210 the ARP monitor should be configured with multiple targets (at least
2211 one for each switch in the network). This will ensure that,
2212 regardless of which switch is active, the ARP monitor has a suitable
2216 generally referred to as "trunk failover." This is a feature of the
2217 switch that causes the link state of a particular switch port to be set
2218 down (or up) when the state of another switch port goes down (or up).
2220 to the logically "interior" ports that bonding is able to monitor via
2222 switch, but this can be a viable alternative to the ARP monitor when using
2231 In a single switch configuration, the best method to maximize
2232 throughput depends upon the application and network environment. The
2236 For this discussion, we will break down the topologies into
2237 two categories. Depending upon the destination of most traffic, we
2240 In a gatewayed configuration, the "switch" is acting primarily
2241 as a router, and the majority of traffic passes through this router to
2242 other networks. An example would be the following:
2253 acting as a gateway. For our discussion, the important point is that
2254 the majority of traffic from Host A will pass through the router to
2259 and received via one other peer on the local network, the router.
2261 Note that the case of two systems connected directly via
2262 multiple physical links is, for purposes of configuring bonding, the
2264 traffic is destined for the "gateway" itself, not some other network
2265 beyond the gateway.
2267 In a local configuration, the "switch" is acting primarily as
2268 a switch, and the majority of traffic passes through this switch to
2269 reach other stations on the same network. An example would be the
2280 Again, the switch may be a dedicated switch device, or another
2281 host acting as a gateway. For our discussion, the important point is
2282 that the majority of traffic from Host A is destined for other hosts
2283 on the same local network (Hosts B and C in the above example).
2286 the bonded device will be to the same MAC level peer on the network
2287 (the gateway itself, i.e., the router), regardless of its final
2289 from the final destinations; thus, each destination (Host B, Host C)
2293 configuration is important because many of the load balancing modes
2294 available use the MAC addresses of the local network source and
2302 This configuration is the easiest to set up and to understand,
2306 balance-rr: This mode is the only mode that will permit a single
2308 interfaces. It is therefore the only mode that will allow a
2310 worth of throughput. This comes at a cost, however: the
2316 altering the net.ipv4.tcp_reordering sysctl parameter. The
2320 Note that the fraction of packets that will be delivered out of
2322 of reordering depends upon a variety of factors, including the
2323 networking interfaces, the switch, and the topology of the
2332 through the switch to a balance-rr bond will not utilize greater
2339 to the bond.
2341 This mode requires the switch to have the appropriate ports
2345 the active-backup mode, as the inactive backup devices are all
2346 connected to the same peer as the primary. In this case, a
2347 load balancing mode (with link monitoring) will provide the
2349 available bandwidth. On the plus side, active-backup mode
2350 does not require any configuration of the switch, so it may
2351 have value if the hardware available does not support any of
2352 the load balance modes.
2355 for specific peers will always be sent over the same
2356 interface. Since the destination is determined by the MAC
2359 the same local network. This mode is likely to be suboptimal
2363 As with balance-rr, the switch ports need to be configured for
2372 protocol includes automatic configuration of the aggregates,
2373 so minimal manual configuration of the switch is needed
2378 packets. The 802.3ad mode does have some drawbacks: the
2379 standard mandates that all devices in the aggregate operate at
2380 the same speed and duplex. Also, as with all bonding load
2385 Additionally, the Linux bonding 802.3ad implementation
2388 outgoing traffic will generally use the same device. Incoming
2390 dependent upon the balancing policy of the peer's 802.3ad
2392 distributed across the devices in the bond.
2394 Finally, the 802.3ad mode mandates the use of the MII monitor;
2395 therefore, the ARP monitor is not available in this mode.
2398 Since the balancing is done according to MAC address, in a
2405 XOR to the same value) will not all "bunch up" on a single
2409 special switch configuration is required. On the down side,
2411 interface, this mode requires certain ethtool support in the
2412 network device driver of the slave interfaces, and the ARP
2416 It has all of the features (and restrictions) of balance-tlb,
2418 peers (as described in the Bonding Module Options section,
2421 The only additional down side to this mode is that the network
2422 device driver must support changing the hardware address while
2423 the device is open.
2430 support the use of the ARP monitor, and are thus restricted to using
2431 the MII monitor (which does not provide as high a level of end to end
2432 assurance as the ARP monitor).
2457 In this configuration, the switches are isolated from one
2465 If access beyond the network is required, an individual host
2472 In actual practice, the bonding mode typically employed in
2474 network configuration, the usual caveats about out of order packet
2475 delivery are mitigated by the use of network adapters that do not do
2476 any kind of packet coalescing (via the use of NAPI, or because the
2478 packets has arrived). When employed in this fashion, the balance-rr
2485 Again, in actual practice, the MII monitor is most often used
2488 advantages over the MII monitor are mitigated by the volume of probes
2489 needed as the number of systems involved grows (remember that each
2490 host in the network is configured with bonding).
2498 Some switches exhibit undesirable behavior with regard to the
2499 timing of link up and down reporting by the switch.
2502 the link is up (carrier available), but not pass traffic over the
2507 value to the updelay bonding module option to delay the use of the
2510 Second, some switches may "bounce" the link state one or more
2512 the switch is initializing. Again, an appropriate updelay value may
2515 Note that when a bonding interface has no active links, the
2516 driver will immediately reuse the first link that goes up, even if the
2517 updelay parameter has been specified (the updelay is ignored in this
2518 case). If there are slave interfaces waiting for the updelay timeout
2519 to expire, the interface that first went into that state will be
2520 immediately reused. This reduces down time of the network if the
2523 ignoring the updelay.
2525 In addition to the concerns about switch timings, if your
2528 Failover may be delayed via the downdelay bonding module option.
2533 NOTE: Starting with version 3.0.2, the bonding driver has logic to
2538 traffic when the bonding device is first used, or after it has been
2540 a "ping" to some other host on the network, and noticing that the
2544 all connected to one switch, the output may appear as follows:
2557 This is not due to an error in the bonding driver; rather, it
2559 tables. Initially, the switch does not associate the MAC address in
2560 the packet with a particular switch port, and so it may send the
2562 the interfaces attached to the bond may occupy multiple ports on a
2563 single switch, when the switch (temporarily) floods the traffic to all
2564 ports, the bond device receives multiple copies of the same packet
2569 behavior, it can be induced by clearing the MAC forwarding table (on
2570 most Cisco switches, the privileged command "clear mac address-table
2583 This applies to the JS20 and similar systems.
2585 On the JS20 blades, the bonding driver supports only
2587 largely due to the network topology inside the BladeCenter, detailed
2594 integrated on the planar (that's "motherboard" in IBM-speak). In the
2595 BladeCenter chassis, the eth0 port of all JS20 blades is hard wired to
2620 modules 1 and 2. In this configuration, the eth0 and eth1 ports of a
2621 JS20 will be connected to different internal switches (in the
2625 passthrough module) connects the I/O module directly to an external
2626 switch. By using PMs in I/O module #1 and #2, the eth0 and eth1
2627 interfaces of a JS20 can be redirected to the outside world and
2630 Depending upon the mix of ESMs and PMs, the network will
2634 much like the example in "High Availability in a Multiple Switch
2640 The balance-rr mode requires the use of passthrough modules
2641 for devices in the bond, all connected to a common external switch.
2642 That switch must be configured for "etherchannel" or "trunking" on the
2648 must be able to reach all destinations for traffic sent over the
2649 bonding device (i.e., the network must converge at some point outside
2650 the BladeCenter).
2657 When an Ethernet Switch Module is in place, only the ARP
2659 nothing unusual, but examination of the BladeCenter cabinet would
2660 suggest that the "external" network ports are the ethernet ports for
2661 the system, when in fact there is a switch between these "external"
2662 ports and the devices on the JS20 system itself. The MII monitor is
2663 only able to detect link failures between the ESM and the JS20 system.
2665 When a passthrough module is in place, the MII monitor does
2666 detect failures to the "external" port, which is then directly
2667 connected to the JS20 system.
2672 The Serial Over LAN (SoL) link is established over the primary
2675 network traffic, as the SoL system is beyond the control of the
2678 It may be desirable to disable spanning tree on the switch
2679 (either the internal Ethernet Switch Module, or an external switch) to
2689 The new driver was designed to be SMP safe from the start.
2695 devices need not be of the same speed.
2706 This is limited only by the number of network interfaces Linux
2707 supports and/or the number of network cards you can place in your
2712 If link monitoring is enabled, then the failing device will be
2714 other modes will ignore the failed link. The link will continue to be
2715 monitored, and should it recover, it will rejoin the bond (in whatever
2716 manner is appropriate for the mode). See the sections on High
2717 Availability and the documentation for each mode for additional
2720 Link monitoring can be enabled via either the miimon or
2721 arp_interval parameters (described in the module parameters section,
2722 above). In general, miimon monitors the carrier state as sensed by
2723 the underlying network device, and the arp monitor (arp_interval)
2724 monitors connectivity to another host on the local network.
2726 If no link monitoring is configured, the bonding driver will
2730 depends upon the bonding mode and network configuration.
2734 Yes. See the section on High Availability for details.
2738 The full answer to this depends upon the desired mode.
2740 In the basic balance modes (balance-rr and balance-xor), it
2747 support specific features (described in the appropriate section under
2759 the fail_over_mac option is enabled, the bonding device's MAC address is
2760 the MAC address of the active slave.
2763 ifconfig or ip link), the MAC address of the bonding device is taken from
2765 slaves and remains persistent (even if the first slave is removed) until
2766 the bonding device is brought down or reconfigured.
2768 If you wish to change the MAC address, you can set it with
2775 The MAC address can also be changed by bringing down/up the
2782 This method will automatically take the address from the next
2786 from the bond (`ifenslave -d bond0 eth0'). The bonding driver will
2787 then restore the MAC addresses that the slaves had before they were
2793 The latest version of the bonding driver can be found in the latest
2794 version of the Linux kernel, found on http://kernel.org
2796 The latest version of this document can be found in the latest kernel
2799 Discussions regarding the usage of the bonding driver take place on the
2801 problems, post them to the list. The list address is:
2810 Discussions regarding the development of the bonding driver take place
2811 on the main Linux network mailing list, hosted at vger.kernel.org. The list