Network Working Group S. Brim Internet-Draft D. Farinacci Intended status: Experimental D. Meyer Expires: May 12, 2008 Cisco Systems, Inc. J. Curran ServerVault November 9, 2007 EID Mappings Multicast Across Cooperating Systems for LISP draft-curran-lisp-emacs-00 Status of this Memo By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on May 12, 2008. Copyright Notice Copyright (C) The IETF Trust (2007). Abstract One of the potential problems with the "map-and-encapsulate" approaches to routing architecture is that there is a significant chance of packets being dropped while a mapping is being retrieved. Some approaches pre-load ingress tunnel routers with at least part of the mapping database. Some approaches try to solve this by providing Brim, et al. Expires May 12, 2008 [Page 1] Internet-Draft EMACS-LISP November 2007 intermediate "default" routers which have a great deal more knowledge than a typical ingress tunnel router. This document proposes a scheme which does not drop packets yet does not require a great deal of knowledge in any router. However, there are still some issues that need to be worked out. Requirements Language The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 3 3. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 4. Detailed Discussion . . . . . . . . . . . . . . . . . . . . . 5 4.1. Assignment of rendezvous point addresses to groups . . . . 5 4.2. Determination of the Right Group to Join . . . . . . . . . 6 4.3. Sending a Packet . . . . . . . . . . . . . . . . . . . . . 6 4.4. Path Stretch . . . . . . . . . . . . . . . . . . . . . . . 7 4.5. Requirement for Multicast Deployment . . . . . . . . . . . 7 4.6. Protection against Snoopers . . . . . . . . . . . . . . . 7 4.7. ETR Initial Packet Forwarding . . . . . . . . . . . . . . 7 4.8. Responding with a Map-Reply . . . . . . . . . . . . . . . 8 4.9. Authentication of the Map-Reply . . . . . . . . . . . . . 8 4.10. Transition Scenarios . . . . . . . . . . . . . . . . . . . 8 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 6. Security Considerations . . . . . . . . . . . . . . . . . . . 8 7. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 9 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 9 8.1. Normative References . . . . . . . . . . . . . . . . . . . 9 8.2. Informative References . . . . . . . . . . . . . . . . . . 9 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 10 Intellectual Property and Copyright Statements . . . . . . . . . . 11 Brim, et al. Expires May 12, 2008 [Page 2] Internet-Draft EMACS-LISP November 2007 1. Introduction One of the potential problems with the "map-and-encapsulate" approaches to routing architecture is that there is a significant chance of packets being dropped while a mapping is being retrieved. Some approaches pre-load ingress tunnel routers (ITRs) with at least part of the mapping database. Some approaches try to solve this by providing intermediate "default" routers which have a great deal more knowledge than a typical ingress tunnel router. This document proposes a scheme which does not drop packets yet does not require a great deal of knowledge in any router. However, there are still some issues that need to be worked out. 2. Problem Statement LISP [I-D.farinacci-lisp] assumes a mechanism for obtaining mappings from EID to RLOC exists, but does not require or assume any specific mapping mechanism. Among those proposed for use with LISP are LISP- ALT [I-D.fuller-lisp-alt],NERD [I-D.lear-lisp-nerd], CONS [I-D.meyer-lisp-cons], and APT [I-D.jen-apt]. Others have also been considered. These mechanisms attempt in various ways to balance database size and churn with the delay of looking up a mapping for the first packet between two sites. If complete mapping information is pushed all the way to the ITRs, then there is no delay in looking up the first mapping, but each ITR must hold a large amount of information and be able to keep it up to date. If mapping information is not pushed at all (as in CONS), then an ITR need only hold the information it decides to cache, but there may be significant delays in retrieving a mapping for the first packets sent between two sites, and those packets may be dropped. Hybrid schemes, where mapping information is pushed partway to the ITRs, have been proposed, but the tradeoff between database size/churn and lookup delay is still not solved satisfactorily. "Default forwarders" have been proposed in CONS, APT, and CRIO [CRIO]. These are intermediate forwarding points. The intent is that if an ITR does not have a mapping for a packet, it will forward the packet to the default forwarder. The assumption is that the default forwarder serves an aggregate of endpoints, and will thus have better knowledge of how to reach the destination. This eliminates mapping lookup delay and the possibility of dropped packets, at the cost of possibly having the first packets sent between two sites take a longer path. There are two kinds of default forwarders, those that represent multiple sources and those that represent multiple destinations. Brim, et al. Expires May 12, 2008 [Page 3] Internet-Draft EMACS-LISP November 2007 o Source-side default forwarders (SSDFs) serve a group of sources, for example a site or all customers of an IP service provider. The belief is that a source-side default forwarder can have more mapping entries than the usual ITR, either because more can be pushed to it or because it will have more entries cached, since it is serving more queries. If it does not have a mapping for the destination it will use one of the mapping mechanisms on behalf of the source. Source-side default forwarders do not actually change the problem, they simply move the problem from the ITR to an intermediary. The same tradeoff, of having high rate/state versus dropping packets versus delay, is still there. The advantage is that they concentrate the problem so that costs can be concentrated as well, but in most cases an SSDF would have performance requirements at the level of a high end router. Valid packets can still be dropped if the SSDF does not itself have a mapping. Another disadvantage is that since they offer a general "default" route, bogus packets will get forwarded to and possibly through them instead of being dropped (for no route). o Destination-side default forwarders (DSDFs) serve a group of destinations. For example, if a source sends a packet to 192.168.100.1 and its ITR does not have a mapping entry for that packet, the ITR might forward the packet to a default forwarder responsible for all of 192.168.0.0/16. A destination-side forwarder has a mapping database which is complete, but only for a subset of the Internet, so it does not have the high performance requirements of a mainstream source-side default forwarder. Most bogus packets are not forwarded because ITRs will only have routes to DSDFs for valid EID prefixes. A potential downside to DSDFs is that since they represent an aggregate of destinations, the path to the destination through the DSDF may see some stretch. Destination-side default forwarders look like a good idea if some issues can be dealt with. They can eliminate the possibility of dropped packets. Delay for first packets exchanged between sites has a possibility of being long for some sites, depending on how DSDFs are organized. They still hold part of the mapping database, and need to maintain its accuracy. Also a mechanism is needed for sources, and the routers near sources, to determine which SSDF handles which prefixes. Finally, the mechanism by which the forwarding path is moved toward optimality needs to be secure. This draft proposes a mechanism using destination-side default forwarders that has low "rate*state" overhead, has easy DSDF location, and controls path stretch. It is called "EID-mappings Multicast Across Cooperating Systems for LISP", or EMACS-LISP. Brim, et al. Expires May 12, 2008 [Page 4] Internet-Draft EMACS-LISP November 2007 3. Overview The mechanism by which DSDFs forward packets to the appropriate ETRs is bidirectional PIM [RFC5015] multicast trees. Briefly: o In order to keep the number of multicast trees reasonable, each tree handles a subset of the entire EID address space. In IPv4 this might be a /16. o An ETR responsible for an EID prefix, for example 192.168.100.0/24, joins an appropriate multicast group for an including prefix, for example 192.168.0.0/16. An ETR responsible for an EID prefix larger than an including prefix, or for multiple EID prefixes in different including prefixes, will need to join multiple groups. Multiple ETRs for a site might join the group. o The bidirectional PIM tree rendezvous point addresses, and the groups they are rendezvous points for, are advertised in multiprocotol eBGP. This instance of eBGP runs in an overlay GRE infrastructure, distinct from the eBGP instance which will be used for normal RLOC routing in the Internet core. o When an ITR needs to forward a packet and does not have a LISP EID->RLOC mapping, it uses an algorithm to find the correct multicast group, and sends the packet to that group, on the overlay GRE infrastructure. The outer destination RLOC is the multicast address. The outer source RLOC is the ITR's. o The packets travel to all registered recipients. Most of them examine the packet and realize they are not responsible for the destination, so they throw it away. Among the ETRs which are responsible for the EID prefix, one or more will send a LISP Map- Reply back to the originating ITR, providing a specific mapping, so that the ITR can send all further packets directly. Thus the first one or two packets sent between two sites will experience more delay than following packets, but no packets are dropped unless they should be. Details are added, and issues discussed, in the following sections. 4. Detailed Discussion 4.1. Assignment of rendezvous point addresses to groups Rendezous point addresses (RPAs) are not necessarily related to anything physical. Their determination does not need to be covered Brim, et al. Expires May 12, 2008 [Page 5] Internet-Draft EMACS-LISP November 2007 in this document, as long as they can be advertised in the overlay eBGP instance. 4.2. Determination of the Right Group to Join An ETR must join one or more multicast groups in order to receive packets to the EID prefixes it is responsible for. There must be agreement among the ETRs for a prefix and the ITRs that want to send to that prefix on how the correct multicast group is determined. To avoid mapping retrieval delay, the multicast group must be determinable without querying a server. The following mechanism for mapping from destination EID to multicast group address MUST be supported for 32-bit EIDs: o There is a /16 in IPv4 multicast address space allocated for use by this protocol, for example 238.1.0.0/16. There are 64k groups, one for each of 64k "including" prefixes. o The RP addresses of all valid groups are advertised in eBGP, along with group addresses. The next_hop for a group address is the RP's address. o An ETR responsible for an EID prefix will mask out the higher order 16 bits of that EID prefix, and OR those bits into the lower order 16 bits of 238.1.0.0 to get the group to join. For example, for the EID prefix 192.168.100.0/24, the ETR will join multicast group 238.1.192.168. o If an ETR handles traffic to an EID prefix shorter than a /16, it will join all groups necessary to cover it. o The ETR joins those groups, on the GRE overlay. A similar mechanism can be defined for IPv6. Only valid groups, known to contain EID prefixes participating in LISP, will be advertised in the eBGP instance. Therefore it is all right to define the IPv6 mechanism in a way that allows for a large number of groups. 4.3. Sending a Packet ITRs participate in the eBGP instance running on the GRE overlay, so that they receive information on available groups and rendezvous point addresses. Packets for which the ITR does not have a direct lisp EID->RLOC mapping cached, does not have a route to an appropriate multicast group, and does not have a direct route for, and are dropped. To send a packet on the multicast overlay, the ITR encapsulates the packet with a LISP header. The destination address is the group determined by the same algorithm as above (Section 4.2). Brim, et al. Expires May 12, 2008 [Page 6] Internet-Draft EMACS-LISP November 2007 The source address is an RLOC for the ITR, or at least an RLOC for the source site. 4.4. Path Stretch The first packets sent between two sites will be multicast. Depending on how the multicast tree is assembled this may not be a direct path. How sub-optimal these paths would be is for further study. 4.5. Requirement for Multicast Deployment A potential concern with this approach is that it will require multicast support in all of the routers in the Internet core. However, the only nodes interested in the multicast routes are xTRs. They will participate in an overlay tunneled infrastructure, for example over GRE. eBGP, bidirectional PIM, and the multicast packets themselves would all travel over this tunneled infrastructure. The only nodes that need know or care about multicast are the ones that want to use it. The overhead of constructing and maintaining the GRE overlay is for further study. 4.6. Protection against Snoopers Without any join filters, it is possible for anyone to join any group. A node could join all groups in order to find out which sites are talking to which other sites. This is sometimes not acceptable. Therefore group join filters are required. At the points where a particular ETR will join, "join" messages to the groups for EID prefixes which that ETR handles MUST be allowed, but more SHOULD not be. This configuration is limited, simple, related to configuration that will already be done, and scalable. 4.7. ETR Initial Packet Forwarding When a packet is multicast it is distributed to all potentially interested ETRs. ETRs that are not responsible for an EID prefix containing the packet's destination address will discard the packet. ETRs which do handle traffic for the destination EID SHOULD decide among themselves who will forward the packet, since duplicate packets can sometimes be a problem. They do so by position in the LISP RLOC- set (see LISP [I-D.farinacci-lisp]). The ETR which is active and has the lowest ordinal position in the RLOC-set will forward the packet. Note: there can be a serious problem if a small site has an EID prefix in the same /16 including prefix as a large site. In that case, the small site's ETRs would get all of the initial traffic to the large site and have to throw it all away. This could overwhelm a Brim, et al. Expires May 12, 2008 [Page 7] Internet-Draft EMACS-LISP November 2007 small site's ETR. This problem, and possibly how to focus some multicast groups, is for further study. 4.8. Responding with a Map-Reply At least one receiving ETR SHOULD unicast a Map-Reply directly back to the originating ITR, so that future packets will be sent directly to the ETR, not via the multicast infrastructure. As with packet delivery (Section 4.7), the active ETR with the lowest ordinal position in the LISP RLOC-set will be the one to respond. 4.9. Authentication of the Map-Reply Because an initial packet can go to multiple sites, an ITR SHOULD authenticate any received Map-Reply messages. Otherwise it may misroute future packets. Nonces are not an adequate measure. Some kind of signature is required on the content of the Map-Reply. The trade-offs here are for further study. 4.10. Transition Scenarios To be filled in. Possibilities include using LISP-ALT as an intermediate approach, using alternative DNS resources, and sending initial packets via multiple paths. 5. IANA Considerations This section will be filled in later. A new SAFI will be needed to carry both rendezvous point addresses and group addresses. A set of at least 64k multicast groups is needed, and it would be better if a /8 were allocated, to be sure. Allocation of 238.0.0.0/8 is requested. A similar request for IPv6 address space will be made after further study. Note to RFC Editor: this section may be removed on publication as an RFC. 6. Security Considerations To be filled in. Known issues are Brim, et al. Expires May 12, 2008 [Page 8] Internet-Draft EMACS-LISP November 2007 o Revealing first packets to destinations which are not the source's intended destination. o Inviting map-reply responses off the path between source and destination. o Denial of service attacks on the multicast infrastructure. o Potentially overloading ETRs with unwanted traffic. 7. Contributors The authors are grateful for the help of those who offered comments, notably Vince Fuller, Eliot Lear, Darrel Lewis, and David Oran. 8. References 8.1. Normative References [I-D.farinacci-lisp] Farinacci, D., "Locator/ID Separation Protocol (LISP)", draft-farinacci-lisp-04 (work in progress), August 2007. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC5015] Handley, M., Kouvelas, I., Speakman, T., and L. Vicisano, "Bidirectional Protocol Independent Multicast (BIDIR- PIM)", RFC 5015, October 2007. 8.2. Informative References [CRIO] Zhang, X., Francis, P., Wang, J., and K. Yoshida, "Scaling IP Routing with the Core Router-Integrated Overlay", November 2006. [I-D.fuller-lisp-alt] Fuller, V., "LISP-ALT", Internet-Draft not yet published. [I-D.jen-apt] Jen, D., "APT: A Practical Transit Mapping Service", draft-jen-apt-00 (work in progress), July 2007. [I-D.lear-lisp-nerd] Lear, E., "NERD: A Not-so-novel EID to RLOC Database", draft-lear-lisp-nerd-02 (work in progress), Brim, et al. Expires May 12, 2008 [Page 9] Internet-Draft EMACS-LISP November 2007 September 2007. [I-D.meyer-lisp-cons] Brim, S., "LISP-CONS: A Content distribution Overlay Network Service for LISP", draft-meyer-lisp-cons-02 (work in progress), September 2007. Authors' Addresses Scott Brim Cisco Systems, Inc. Email: swb@employees.org Dino Farinacci Cisco Systems, Inc. Email: dino@cisco.com David Meyer Cisco Systems, Inc. Email: dmm@1-4-5.net John Curran ServerVault Email: jcurran@istaff.org Brim, et al. Expires May 12, 2008 [Page 10] Internet-Draft EMACS-LISP November 2007 Full Copyright Statement Copyright (C) The IETF Trust (2007). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Intellectual Property The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79. Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org. Acknowledgment Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA). Brim, et al. Expires May 12, 2008 [Page 11]