Network Working Group                                        Dave Thaler
Internet-Draft                                         Christian Huitema
Expires: January 2002                                          Microsoft
                                                            19 July 2001

                   Multi-link Subnet Support in IPv6

Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Copyright Notice

Copyright (C) The Internet Society (2001). All Rights Reserved.

Expires January 2002                                            [Page 1]
Draft                      Multilink Subnets                   July 2001

Abstract

Bridging multiple links into a single entity has several operational
advantages. A single subnet prefix is sufficient to support multiple
physical links. There is no need to allocate subnet numbers to the
different networks, simplifying management. This document introduces
the concept of a "multilink subnet", defined as a collection of
independent links, connected by routers, but sharing a common subnet
prefix. It then specifies the behavior of multilink subnet routers so
that no changes to host behavior are needed.

1. Introduction

Bridging multiple links into a single entity has several operational
advantages. A single subnet prefix is sufficient to support multiple
physical links. There is no need to allocate subnet numbers to the
different networks, simplifying management.

However, not all link-layer media can be easily bridged.
Classic IEEE 802 bridging technology fails when the media does not
naturally support IEEE 802 addressing. Furthermore, the operation
becomes problematic when the different links don't support the same
MTU size. Finally, bridging cannot be easily implemented when the
network interface cannot be easily placed in "promiscuous" mode.

This document introduces the concept of a "multilink subnet", defined
as a collection of independent links, connected by routers, but
sharing a common subnet prefix. Herein we discuss many of the problems
and possible solutions surrounding this concept.

The initial version of this draft will not specify behavior, but
merely discuss the tradeoffs. A later version will narrow the solution
space to a recommended approach.

2. Terminology

multilink subnet: a collection of independent links, connected by
   routers, but sharing a common subnet prefix.

subnet scope: multicast SCOP value 3, as specified in [ADDRARCH],
   which covers a (potentially multilink) subnet. This is the next
   larger multicast scope above link scope.

multilink-subnet router (MSR): a router which has interfaces attached
   to different links in a multilink subnet, and which implements the
   rules in this document.

Class 1 multilink subnet: a multilink subnet with only one MSR.

Class 2 multilink subnet: a multilink subnet composed of multiple MSRs
   and links in a tree topology. That is, there is only one possible
   path within the subnet between any pair of nodes in the subnet.

Class 3 multilink subnet: a multilink subnet composed of multiple MSRs
   and links connected together in an arbitrary topology.

Class 1 MSR: an MSR which only works in a Class 1 multilink subnet.

Class 2 MSR: an MSR which only works in a Class 1 or 2 multilink
   subnet.

Class 3 MSR: an MSR which works in all types of multilink subnets.

3. Design Goals

Multilink subnets are designed with the following goals in mind:

o  Existing IPv6 end hosts should continue to work when connected to a
   multilink subnet, without requiring any change to their behavior.
   For example, the host behavior parts of Router Discovery, Neighbor
   Discovery [ND], and Multicast Listener Discovery [MLD], must be
   supported.

o  Leave link-local address behavior unchanged. Link-local behavior
   continues to function only within a link, not across a multilink
   subnet. That is, sending and receiving unicast, anycast, and
   multicast traffic within the link should be supported in the normal
   fashion.

o  Also support sending and receiving unicast and anycast traffic at
   the site and global scopes.

o  Also support sending and receiving multicast traffic at the subnet
   scope and above.

o  Prevent routing loops.

o  Support nodes moving between links within the subnet, with a
   reasonably fast convergence time (on the same order as Neighbor
   Unreachability Detection).

o  In a Class 3 multilink subnet, exploit richer connectivity than
   just using a spanning tree.

4. Overview

This section gives an overview of multilink subnets. We describe the
behavior of hosts (which is normal IPv6 host behavior with no
changes), and the resulting requirements for routers.

4.1. Router Discovery

Router Discovery continues to work on a per-link basis, as specified
in [ND].

When sending Router Advertisements (RAs) with a Prefix Information
Option, there are two possibilities for how an MSR can influence the
Neighbor Discovery procedure used.

4.1.1. Making hosts not use ND

If the MSR sets the A (autonomous address-configuration) flag on, and
the L (on-link) flag off, then hosts on the link will attempt
stateless address configuration [ADDRCONF] in the given prefix, but
will not treat the prefix as being on-link.
As a result, neighbor discovery is effectively disabled and packets to
new destinations always go to the router first, which will then either
forward them if the destination is off-link, or redirect them if the
destination is on-link.

In the remainder of this document, we will refer to this model as the
"off-link" model, since hosts initially treat all addresses in the
subnet as being off-link.

4.1.2. Making hosts use ND

If the MSR sets both the A and the L flags, then hosts on the link
will perform stateless address configuration and neighbor discovery as
usual. However, since Neighbor Solicitations (NSs) from existing hosts
are sent to a link-scoped solicited-node multicast address, they will
never reach nodes on other links within the subnet. Instead, MSRs must
either know the location of the destination a priori, or else be able
to relay such NS's to other links, either using link-scoped NS's
relayed link-by-link, or using a subnet-scoped NS.

In the remainder of this document, we will refer to this model as the
"on-link" model, since hosts treat all addresses in the subnet as
being on-link.

4.1.3. Effects on Duplicate Address Detection

In either approach above, existing nodes will still do Duplicate
Address Detection using the link-scoped solicited-node multicast
address. Two important issues arise that must be addressed:

1) If two nodes on different links simultaneously attempt DAD for the
   same address, care must be taken so that the collision is detected
   correctly.

2) If a node moves from one link to another link in the same multilink
   subnet, and performs DAD in its new location, care must be taken so
   that MSRs can distinguish between such a move, and a legitimate
   duplicate, so that after the move, the node can retain its address.

Because of these issues, routers cannot use cached information to
respond on behalf of off-link nodes.
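
As an illustration of why such solicitations stay on one link, the
following Python sketch derives the solicited-node multicast address
for a unicast address, using the ff02::1:ff00:0/104 mapping from
[ADDRARCH], and reads its scope field. The function names and the
2001:db8:: example address are assumptions made for illustration; they
are not part of this draft.

```python
import ipaddress


def solicited_node_multicast(addr: str) -> ipaddress.IPv6Address:
    """Map a unicast/anycast address to its solicited-node group.

    The group is ff02::1:ff00:0/104 with the low-order 24 bits of the
    unicast address appended.
    """
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def multicast_scope(group: ipaddress.IPv6Address) -> int:
    """Return the 4-bit scope value of a multicast address."""
    return group.packed[1] & 0x0F


group = solicited_node_multicast("2001:db8::1:800:200e:8c6c")
print(group)                   # ff02::1:ff0e:8c6c
print(multicast_scope(group))  # 2
```

Scope value 2 is link-local scope, so a DAD probe or NS sent to this
group is not forwarded by routers and is seen only by nodes, including
MSRs, attached to the originating link.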

Another problem arises from the statement in [ADDRCONF] that: "the
link-local address MUST be tested for uniqueness, and if no duplicate
address is detected, an implementation MAY choose to skip Duplicate
Address Detection for additional addresses derived from the same
interface identifier".

Collisions would result if the interface identifier were unique on the
link, but not across the entire multilink subnet. To avoid this, MSRs
must get involved in duplicate address detection even for link-local
addresses, to ensure that all addresses are unique across a multilink
subnet.

4.2. Neighbor Discovery

Neighbor Discovery is used differently, depending on whether the
on-link or off-link model is used, as described in the previous
section.

Off-link model
   If the subnet is treated as being off-link, all packets are sent to
   a default router. It is then the default router's responsibility to
   figure out the next-hop of the packets. If the next-hop is on-link,
   it sends a Redirect to the source.

On-link model
   If the subnet is treated as being on-link, nodes will send NS's to
   the solicited node multicast address. (If a node has interfaces
   attached to multiple links in the subnet, NS's MAY be sent on each
   link.) If the next-hop is off-link, a router will respond with a
   proxy Neighbor Advertisement (NA) containing its own link-layer
   address.

In either case, it is the router's responsibility to determine whether
a destination in the subnet is on-link.

In this version of this draft, we will describe the rules for both of
the above models. A future version of this draft may choose only one
of them.

5. Basic (Class 1) Behavior

In a Class 1 multilink subnet, only one router exists. This might be
the case, for example, in a home network where a router connects a
wired and a wireless link together to form a single subnet.

5.1. Basic Unicast

In this section, we step through an example of basic unicast
communication, assuming that address configuration has already
completed, and the router's routing table and neighbor cache already
have any required information.

In the simple scenario depicted in Figure 1 below, two links, (1) and
(2) on a common subnet with global prefix G, are connected by an MSR
B. Node A has link-layer address a on link 1, and has acquired global
IPv6 address Ga. Similarly, MSR B has on link 1, link-layer address
b1, and IPv6 address Gb1, and on link 2, link-layer address b2 and
IPv6 address Gb2. Node C has link-layer address c on link 2, and IPv6
address Gc. Node D has link-layer address d on link 1, and IPv6
address Gd.

   +---+        +---+
   | A |        | D |
   +-+-+        +-+-+
     |            |
   --+------------+-------------+--------------(1)--
                                |
                              +-+-+
                              | B |
                              +-+-+
                                |
   ---------------+-------------+--------------(2)--
                  |
                +-+-+
                | C |
                +---+

                 Figure 1: Class 1 Scenario

Off-link model
   When A wants to start communication with Gc, it finds that the
   destination address matches no on-link prefix, and so sends the
   packet directly to its default router B. B first applies its usual
   packet validation rules (including decrementing the Hop Limit in
   the IPv6 header). B knows that C is on-link to link 2, with
   link-layer address c, and so it forwards the packet to C.

   When A wants to communicate with Gd, it again finds that the
   destination address matches no on-link prefix, and so sends the
   packet directly to its default router B. B knows that D is on-link
   to the same link as A, and so responds with a Redirect.

On-link model
   When A wants to start communication with Gc, it finds that the
   destination address matches an on-link prefix, and so sends an NS
   to the solicited-node multicast address Sc constructed from Gc. The
   NS message is received by the MSR B, which listens on all multicast
   groups.
   B knows that C is on-link to link 2, and responds to A with an NA
   containing its own link-layer address b1 as the Target Link-Layer
   Address. After this, A can send packets to the address Gc. The
   packets will be sent to the link address b1; they will be received
   by B, which will apply its usual validation rules (including
   decrementing the Hop Limit in the IPv6 header), and forward them to
   the address c on link 2.

   When A wants to communicate with Gd, it again finds that the
   destination address matches an on-link prefix, and so sends an NS
   to its solicited-node multicast address. D receives the NS and
   responds. B also receives the NS, but knows that D is on the same
   link as A, and so does not respond.

Note that we did not assume that the links had to use IEEE 802
addresses, or in fact any form of consistent addressing. B can also
handle MTU discovery procedures, returning an ICMP message if either A
or C sends a packet that is too long.

5.2. Router Configuration

The previous section assumed that the router's routing table and
neighbor cache already had any required information. We now describe
how this can be done.

Like any other router, an MSR can acquire routes (including the subnet
prefix) by using manual configuration or a routing protocol. An MSR
with all interfaces in the same subnet MAY acquire its information
solely based on RAs received from another router (which is not an
MSR), in the same way a host would. It can then advertise the same
prefix/route information on other links in the subnet, using either
the on-link or off-link model.

When it needs to resolve a target address to a next-hop (when a host
performs ND or DAD), the MSR sends a Neighbor Solicitation on each
attached link in the subnet, except that in the on-link model one is
not sent back on the link from which an NS was just received.
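
The rule for choosing the links on which to send a relayed NS can be
sketched as follows. This is a minimal Python illustration under
stated assumptions, not a specified algorithm: the function name, the
link labels, and the use of None to mean "resolution was not triggered
by a received NS" are all hypothetical.

```python
from typing import List, Optional


def relay_links(subnet_links: List[str],
                on_link_model: bool,
                ns_received_on: Optional[str] = None) -> List[str]:
    """Return the attached subnet links on which to send an NS.

    In the on-link model, the link on which the triggering NS arrived
    is skipped. Otherwise every attached link in the subnet is used.
    """
    if on_link_model and ns_received_on is not None:
        return [link for link in subnet_links if link != ns_received_on]
    return list(subnet_links)


# MSR B in Figure 1 is attached to links 1 and 2.
print(relay_links(["link1", "link2"], True, "link1"))  # ['link2']
print(relay_links(["link1", "link2"], False))          # ['link1', 'link2']
```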

After sending an NS, the router suppresses sending of any other NS's
for the same target address for a short interval (which must be less
than ND's RetransTimer). While it is resolving a next-hop, the router
also remembers each node sending an NS for the same target address.

A Neighbor Advertisement would be sent in response to an NS only by
(a) the actual node with the target address, or (b) an MSR which has
received an NA in response to a relayed NS it sent as a result of
receiving the first NS. Specifically, an NA is not sent just because
the MSR has a neighbor cache entry for the target.

When an MSR receives an NA, it sends an NA to all nodes from which it
received NSs above. As specified in [ND], proxy Neighbor
Advertisements sent by MSR's on behalf of remote targets always have
the Override bit clear.

5.3. Multicast

Most current multicast routing protocols are based on a "Reverse-Path
Forwarding" check. That is, they drop a packet if the packet does not
arrive on the link towards a given address (e.g., the source address,
or a Rendezvous Point address associated with the group address).
Thus, multicast will work as long as a router can tell which link is
towards any address within the subnet. Note that in particular, simply
using the subnet route is not sufficient in a multilink subnet.

If an MSR's longest-match RPF lookup matches the subnet route for the
multilink subnet, it means the source is in the subnet, and the
neighbor cache is consulted (as for unicast) to find the link towards
the source.

5.4. Disabling Class 1 MSRs

The above rules assume that the MSR is the only MSR in the subnet.
Consequently, a Class 1 MSR MUST disable itself if it detects that
another MSR is present. This can be done by assigning a flag (say, bit
0x4) in the RA that is set by all MSRs.
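
A hedged sketch of the check suggested above: the bit value 0x04 is
the example from the text, and the sketch assumes the flag would live
in the same RA octet that carries the M (0x80) and O (0x40) flags
defined in [ND]. No such bit is actually assigned, and the function
names are hypothetical.

```python
from typing import Iterable

RA_FLAG_M = 0x80    # Managed address configuration flag, per [ND]
RA_FLAG_O = 0x40    # Other stateful configuration flag, per [ND]
RA_FLAG_MSR = 0x04  # hypothetical "sent by an MSR" bit (not assigned)


def ra_sent_by_msr(ra_flags: int) -> bool:
    """True if the RA flags octet has the hypothetical MSR bit set."""
    return bool(ra_flags & RA_FLAG_MSR)


def class1_msr_must_disable(received_ra_flags: Iterable[int]) -> bool:
    """True if any RA received from another router carries the MSR bit.

    received_ra_flags holds the flags octet of each RA heard from
    other routers on any attached link in the subnet.
    """
    return any(ra_sent_by_msr(flags) for flags in received_ra_flags)


print(class1_msr_must_disable([RA_FLAG_M, RA_FLAG_O]))     # False
print(class1_msr_must_disable([RA_FLAG_M | RA_FLAG_MSR]))  # True
```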

TBD: If an MSR only sends RAs on links other than the one from which
it got an RA from the "real" (non-MSR) router, then it seems that
there can safely be multiple MSR's in parallel on that same link, and
the rule above will not disable them either. This is really a Class 2
subnet.

TBD: Do the rules above work in transit multilink subnets or not? If
not, it also needs to disable itself if any RAs are seen on multiple
links in the subnet.

6. Class 2 Behavior

   [Figure 2 diagram: router R with an uplink "to Internet", MSRs A
   and B, hosts C, D, E, F, and G, and links (1) through (4).]

                 Figure 2: Class 2 Scenario

Figure 2 shows a sample tree topology with MSRs A and B connecting
four links into a single subnet with hosts C, D, E, F, and G. R is a
normal router that provides connectivity to an internet, and sends RAs
on link 1.

TBD: Fill this in. Is there actually any difference between Class 1
and Class 2 behavior, if a single specific link is configured as the
"upstream" (towards outside of the subnet) link? We may be able to
collapse Class 1 and 2 into the same thing.

TBD: What about transit subnets with exit points on different links?

7. Security Considerations

TBD.

8. Appendix A: Class 3 Behavior

In the network depicted in Figure 3, we have now three links, and also
three multilink-subnet routers (MSRs), B, E, and F.

   +---+        +---+
   | A |        | D |
   +-+-+        +-+-+
     |            |
   --+------------+-------------+----------+---(1)--
                                |          |
                              +-+-+      +-+-+
                              | B |      | E |
                              +-+-+      +-+-+
                                |          |
       -----------+-------------+----------+---(2)--
                  |
                +-+-+
                | F |
                +-+-+
                  |
            ------+----------+--------------(3)--
                             |
                           +-+-+
                           | C |
                           +---+

                 Figure 3: Class 3 Scenario

The network is sufficiently complex to expose several problems:

o  If A sends an NS packet, that packet is received by both B and E.
   Depending on the inter-router communication mechanism, this could
   lead to duplicate transmissions on link 2, and possibly to random
   behaviors, or to loops.

o  If A sends a multicast packet, and that packet is relayed by both B
   and E, it would lead to duplicate traffic, or even potential loops.
   It may not be relayed at all, if neither B nor E realize there is a
   group member hidden behind F.

There are multiple possible approaches to solving the above problems
which might meet our design goals. We discuss each approach in turn
below, with examples using Figure 3 when no previous state is known.

8.1. Method A: Flooding Neighbor Solicitations

Neighbor Solicitations and Advertisements are proxied as described
earlier, with the following additional rules.

Since multiple paths may exist, to assist in loop prevention and
provide shortest paths, a new "Local Distance" option in NA's can be
defined:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     |   Reserved    |   Hop Count   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The option contains five fields, encoded in 8 octets.

The Hop Count field contains an 8-bit unsigned integer being the
number of hops between the advertising station and the source or the
target address. It is used to assist in loop prevention and provide
shortest paths.
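
The 8-octet layout in the diagram above can be sketched in Python as
follows. This is an illustration only: the draft assigns no Type
value, so the value 200 used here is a placeholder, and the Length
value of 1 assumes the [ND] convention that option lengths are
expressed in units of 8 octets. The function names are hypothetical.

```python
import struct

# Type (8 bits), Length (8), Reserved (8), Hop Count (8), Timestamp (32),
# all in network byte order.
_LOCAL_DISTANCE_FORMAT = "!BBBBI"
PLACEHOLDER_TYPE = 200  # hypothetical; no Type value is assigned


def pack_local_distance(hop_count: int, timestamp: int,
                        opt_type: int = PLACEHOLDER_TYPE) -> bytes:
    """Encode a Local Distance option as 8 octets."""
    length_in_8_octet_units = 1  # assumed [ND] option-length convention
    reserved = 0
    return struct.pack(_LOCAL_DISTANCE_FORMAT, opt_type,
                       length_in_8_octet_units, reserved,
                       hop_count, timestamp)


def unpack_local_distance(data: bytes) -> tuple:
    """Decode 8 octets into (type, hop_count, timestamp)."""
    opt_type, length, _reserved, hop_count, timestamp = struct.unpack(
        _LOCAL_DISTANCE_FORMAT, data[:8])
    if length != 1:
        raise ValueError("unexpected option length")
    return opt_type, hop_count, timestamp


option = pack_local_distance(hop_count=2, timestamp=1000)
print(len(option))                    # 8
print(unpack_local_distance(option))  # (200, 2, 1000)
```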

The Timestamp is a 32-bit integer (in seconds) that describes the time
at which the source or target address was last advertised by the
actual node with that address. It is used to ensure that neighbor
discovery messages do not loop forever if the propagation delay across
the subnet is significant. (Authors' note: is there a way to make this
work without synchronized clocks? Is a Timestamp really required?)

If this option is used, it is expected that an MSR's neighbor cache
entries would also contain the Hop Count and Timestamp information
associated with the link-layer address used. The absence of this
option implies a Hop Count value of 0.

When proxying an NA, an MSR would include the Local Distance option
with an incremented value. Legacy nodes will ignore the option, but
MSRs (and new nodes if they wish) can use the option to prefer
link-layer addresses with a lower Local Distance.

To route actual packets, an MSR's route lookup would determine that
the longest matching route is on-link to multiple links. The router
would consult its (conceptual) neighbor cache, and use the next-hop
with the lowest Local Distance.

The same procedure would apply to multicast packets as well, when the
router would look up the RPF address.

8.1.1. On-link model example

In Figure 3, when A wishes to communicate with Gc, both B and E will
receive an NS from A. Each will originate an NS for Gc on link 2. B,
E, and F will receive the NS's on link 2. B and E will ignore each
other's NS since they have just sent an NS for the same address. F
will receive the NS's and the first one will cause it to create a
neighbor cache entry in the INCOMPLETE state, and originate its own NS
on link 3. When C receives this NS, it will respond with an NA. When F
receives the NA from C, it will respond to B and E with an NA with its
own link-layer address f2 as the Target Link Layer Address, and a
"Local Distance = 1" option.
B and E will then respond to A with NAs containing b1 and e1,
respectively, as the Target Link Layer Address, and a "Local Distance
= 2" option.

8.1.2. Off-link model example

In Figure 3, when A wishes to communicate with Gc, it will send
packets to a default router, say, B. B will send an NS on link 1,
which will be received by E, and on link 2 which will be received by E
and F. Depending on timing, E may send an NS on link 1 or link 2 or
neither. (If a short delay were inserted before sending, both could be
suppressed.) F will send an NS on link 3, to which C will reply with
an NA.

Upon receiving the NA, F sends an NA to all nodes from which it has
seen an NS for Gc, namely B and possibly E. B (and possibly E as well)
will then send an NA on link 1, after which A can communicate with C.

8.2. Method B: Proactively populate host routes

The basic idea here is that MSR's would inject host routes into a
routing protocol used within (at least) the subnet upon detecting a
new node on a directly-connected link (e.g., when DAD completes on an
Ethernet, or when IPv6CP completes on a PPP link).

Once host routes exist, either the off-link or the on-link model could
be used. In addition, multicast works with no changes, since host
routes would be used for RPF checks.

Another advantage is that since all resolution is done by MSR's "a
priori", no additional delay is incurred when A wants to communicate
with C. If the on-link model is used, no neighbor discovery delay
exists at all. Packets are immediately forwarded along the correct
path. This approach avoids all bursty-source problems.

Since host routes are cached state, they cannot, however, be used for
duplicate address detection, due to the issues described in Section
4.1.3. That is, the presence of a host route does not imply a
duplicate, since the node may have just moved.
The lack of a host route does not imply uniqueness, since another node
may be simultaneously choosing the same address. As a result, DAD
requires additional mechanisms, such as flooding neighbor discovery
messages as in Method A, or provided by a specialized routing
protocol.

Work in progress in the Mobile Ad-hoc Networks (manet) WG may provide
solutions to the above problems in the future.

9. Acknowledgements

Steve Deering, Brian Zill, Hesham Soliman, and Karim ElMalki
participated in discussions that led to this draft. The term
"multilink subnet" was coined by Steve Deering.

10. Authors' Addresses

Dave Thaler
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052-6399
Phone: +1 425 703 8835
EMail: dthaler@microsoft.com

Christian Huitema
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052-6399
EMail: huitema@microsoft.com

11. References

[ADDRARCH] Hinden, R., and S. Deering, "IP Version 6 Addressing
           Architecture", RFC 2373, July 1998.

[ADDRCONF] Thomson, S., and T. Narten, "IPv6 Stateless Address
           Autoconfiguration", RFC 2462, December 1998.

[MLD]      Deering, S., Fenner, W., and B. Haberman, "Multicast
           Listener Discovery (MLD) for IPv6", RFC 2710, October 1999.

[ND]       Narten, T., Nordmark, E., and W. Simpson, "Neighbor
           Discovery for IP Version 6 (IPv6)", RFC 2461, December
           1998.

12. Full Copyright Statement

Copyright (C) The Internet Society (2001). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are
included on all such copies and derivative works.
However, this document itself may not be modified in any way, such as
by removing the copyright notice or references to the Internet Society
or other Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be followed,
or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT
NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN
WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.