Network Working Group                                       D. Farinacci
Internet-Draft                                                 V. Fuller
Intended status: Experimental                                    D. Oran
Expires: June 22, 2009                                          D. Meyer
                                                                 S. Brim
                                                           cisco Systems
                                                       December 19, 2008


                 Locator/ID Separation Protocol (LISP)
                      draft-farinacci-lisp-11.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on June 22, 2009.

Copyright Notice

   Copyright (c) 2008 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Abstract

   This draft describes a simple, incremental, network-based protocol to
   implement separation of Internet addresses into Endpoint Identifiers
   (EIDs) and Routing Locators (RLOCs). This mechanism requires no
   changes to host stacks and no major changes to existing database
   infrastructures. The proposed protocol can be implemented in a
   relatively small number of routers.

   This proposal was stimulated by the problem statement effort at the
   Amsterdam IAB Routing and Addressing Workshop (RAWS), which took
   place in October 2006.

Table of Contents

   1. Requirements Notation
   2. Introduction
   3. Definition of Terms
   4. Basic Overview
     4.1. Packet Flow Sequence
   5. Tunneling Details
     5.1. LISP IPv4-in-IPv4 Header Format
     5.2. LISP IPv6-in-IPv6 Header Format
     5.3. Tunnel Header Field Descriptions
     5.4. Dealing with Large Encapsulated Packets
       5.4.1. A Stateless Solution to MTU Handling
       5.4.2. A Stateful Solution to MTU Handling
   6. EID-to-RLOC Mapping
     6.1. LISP IPv4 and IPv6 Control Plane Packet Formats
       6.1.1. LISP Packet Type Allocations
       6.1.2. Map-Request Message Format
       6.1.3. EID-to-RLOC UDP Map-Request Message
       6.1.4. Map-Reply Message Format
       6.1.5. EID-to-RLOC UDP Map-Reply Message
     6.2. Routing Locator Selection
     6.3. Routing Locator Reachability
     6.4. Routing Locator Hashing
     6.5. Changing the Contents of EID-to-RLOC Mappings
       6.5.1. Clock Sweep
       6.5.2. Solicit-Map-Request (SMR)
   7. Router Performance Considerations
   8. Deployment Scenarios
     8.1. First-hop/Last-hop Tunnel Routers
     8.2. Border/Edge Tunnel Routers
     8.3. ISP Provider-Edge (PE) Tunnel Routers
   9. Traceroute Considerations
     9.1. IPv6 Traceroute
     9.2. IPv4 Traceroute
     9.3. Traceroute using Mixed Locators
   10. Mobility Considerations
     10.1. Site Mobility
     10.2. Slow Endpoint Mobility
     10.3. Fast Endpoint Mobility
     10.4. Fast Network Mobility
   11. Multicast Considerations
   12. Security Considerations
   13. Prototype Plans and Status
   14. References
     14.1. Normative References
     14.2. Informative References
   Appendix A. Acknowledgments
   Authors' Addresses

1. Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Introduction

   Many years of discussion about the current IP routing and addressing
   architecture have noted that its use of a single numbering space (the
   "IP address") for both host transport session identification and
   network routing creates scaling issues (see [CHIAPPA] and [RFC1498]).
   A number of scaling benefits would be realized by separating the
   current IP address into separate spaces for Endpoint Identifiers
   (EIDs) and Routing Locators (RLOCs); among them are:

   1. Reduction of routing table size in the "default-free zone" (DFZ).
       Use of a separate numbering space for RLOCs will allow them to be
       assigned topologically (in today's Internet, RLOCs would be
       assigned by providers at client network attachment points),
       greatly improving aggregation and reducing the number of
       globally-visible, routable prefixes.

   2. More cost-effective multihoming for sites that connect to
       different service providers where they can control their own
       policies for packet flow into the site without using extra
       routing table resources of core routers.

   3. Easing of renumbering burden when clients change providers.
       Because host EIDs are numbered from a separate,
       non-provider-assigned and non-topologically-bound space, they do
       not need to be renumbered when a client site changes its
       attachment points to the network.

   4. Traffic engineering capabilities that can be performed by network
       elements and do not depend on injecting additional state into the
       routing system. This will fall out of the mechanism that is used
       to implement the EID/RLOC split (see Section 4).

   5. Mobility without address changing. Existing mobility mechanisms
       will be able to work in a locator/ID separation scenario.
       It will be possible for a host (or a collection of hosts) to move
       to a different point in the network topology either retaining its
       home-based address or acquiring a new address based on the new
       network location. A new network location could be a physically
       different point in the network topology or the same physical
       point of the topology with a different provider.

   This draft describes protocol mechanisms to achieve the desired
   functional separation. For flexibility, the mechanism used for
   forwarding packets is decoupled from that used to determine EID to
   RLOC mappings. This document covers the former. For the latter, see
   [CONS], [ALT], [RPMD], and [NERD]. This work is in response to and
   intended to address the problem statement that came out of the RAWS
   effort [RFC4984].

   The Routing and Addressing problem statement can be found in [RADIR].

   This draft focuses on a router-based solution. Building the solution
   into the network will facilitate incremental deployment of the
   technology on the Internet. Note that while the detailed protocol
   specification and examples in this document assume IP version 4
   (IPv4), there is nothing in the design that precludes use of the same
   techniques and mechanisms for IPv6. It should be possible for IPv4
   packets to use IPv6 RLOCs and for IPv6 EIDs to be mapped to IPv4
   RLOCs.

   Related work on host-based solutions is described in Shim6 [SHIM6]
   and HIP [RFC4423]. Related work on a router-based solution is
   described in [GSE]. This draft attempts to not compete or overlap
   with such solutions and the proposed protocol changes are expected to
   complement a host-based mechanism when Traffic Engineering
   functionality is desired.

   Some of the design goals of this proposal include:

   1. Require no hardware or software changes to end-systems (hosts).

   2. Minimize required changes to Internet infrastructure.

   3. Be incrementally deployable.

   4. Require no router hardware changes.

   5. Minimize the number of routers which have to be modified. In
       particular, most customer site routers and no core routers
       require changes.

   6. Minimize router software changes in those routers which are
       affected.

   7. Avoid or minimize packet loss when EID-to-RLOC mappings need to
       be performed.

   There are 4 variants of LISP, which differ along a spectrum of strong
   to weak dependence on the topological nature and possible need for
   routability of EIDs. The variants are:

   LISP 1: uses EIDs that are routable through the RLOC topology for
      bootstrapping EID-to-RLOC mappings [LISP1]. This was intended as
      a prototyping mechanism for early protocol implementation. It is
      now deprecated and should not be deployed.

   LISP 1.5: uses EIDs that are routable for bootstrapping EID-to-RLOC
      mappings; such routing is via a separate topology.

   LISP 2: uses EIDs that are not routable and EID-to-RLOC mappings are
      implemented within the DNS [LISP2].

   LISP 3: uses non-routable EIDs that are used as lookup keys for a
      new EID-to-RLOC mapping database. Use of Distributed Hash Tables
      [DHTs] [LISPDHT] to implement such a database would be an area to
      explore. Other examples of new mapping database services are
      [CONS], [ALT], [RPMD], [NERD], and [APT].

   This document focuses on the LISP 1.5 and LISP 3 variants, both of
   which rely on a router-based distributed cache and database for
   EID-to-RLOC mappings. The LISP 1.0 mechanism works but does not allow
   reduction of routing information in the default-free-zone of the
   Internet. The LISP 2 mechanisms are put on hold and may never come to
   fruition since it is not architecturally pure to have routing depend
   on directory and directory depend on routing. The LISP 3 mechanisms
   will be documented elsewhere but may use the control-plane options
   specified in this specification.
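
   The router-based cache of EID-to-RLOC mappings mentioned above can be
   pictured with a small Python sketch. It is illustrative only and not
   part of the protocol: the MapCache class, its add() and lookup()
   methods, and the choice of longest-prefix matching are assumptions
   made for this example, and all addresses are taken from the
   documentation ranges.

```python
import ipaddress


class MapCache:
    """Toy EID-to-RLOC cache keyed by EID-prefix (hypothetical helper)."""

    def __init__(self):
        # Maps an ip_network (the EID-prefix) to a list of RLOC strings.
        self._entries = {}

    def add(self, eid_prefix, rlocs):
        """Store (or overwrite) the RLOC set for an EID-prefix."""
        self._entries[ipaddress.ip_network(eid_prefix)] = list(rlocs)

    def lookup(self, eid):
        """Return the RLOC list of the longest matching EID-prefix.

        Returns None when no stored prefix covers the EID (a cache miss).
        Longest-prefix matching is an assumption of this sketch.
        """
        addr = ipaddress.ip_address(eid)
        best = None
        for prefix in self._entries:
            # Only compare prefixes of the same address family.
            if prefix.version == addr.version and addr in prefix:
                if best is None or prefix.prefixlen > best.prefixlen:
                    best = prefix
        return None if best is None else self._entries[best]
```

   With this sketch, a cache holding 192.0.2.0/24 mapped to two RLOCs
   returns both RLOCs for the EID 192.0.2.5, while an EID covered by no
   stored prefix returns None, which an ITR would treat as a cache miss.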

3. Definition of Terms

   Provider Independent (PI) Addresses: an address block assigned from
      a pool where blocks are not associated with any particular
      location in the network (e.g. from a particular service provider),
      and is therefore not topologically aggregatable in the routing
      system.

   Provider Assigned (PA) Addresses: a block of IP addresses that are
      assigned to a site by each service provider to which a site
      connects. Typically, each block is a sub-block of a service
      provider CIDR block and is aggregated into the larger block before
      being advertised into the global Internet. Traditionally, IP
      multihoming has been implemented by each multi-homed site
      acquiring its own, globally-visible prefix. LISP uses only
      topologically-assigned and aggregatable address blocks for RLOCs,
      eliminating this demonstrably non-scalable practice.

   Routing Locator (RLOC): the IPv4 or IPv6 address of an egress
      tunnel router (ETR). It is the output of an EID-to-RLOC mapping
      lookup. An EID maps to one or more RLOCs. Typically, RLOCs are
      numbered from topologically-aggregatable blocks that are assigned
      to a site at each point to which it attaches to the global
      Internet; where the topology is defined by the connectivity of
      provider networks, RLOCs can be thought of as PA addresses.
      Multiple RLOCs can be assigned to the same ETR device or to
      multiple ETR devices at a site.

   Endpoint ID (EID): a 32-bit (for IPv4) or 128-bit (for IPv6) value
      used in the source and destination address fields of the first
      (innermost) LISP header of a packet. The host obtains a
      destination EID the same way it obtains a destination address
      today, for example through a DNS lookup or SIP exchange. The
      source EID is obtained via existing mechanisms used to set a
      host's "local" IP address.
      An EID is allocated to a host from an EID-prefix block associated
      with the site where the host is located. An EID can be used by a
      host to refer to other hosts. EIDs MUST NOT be used as LISP RLOCs.
      Note that EID blocks may be assigned in a hierarchical manner,
      independent of the network topology, to facilitate scaling of the
      mapping database. In addition, an EID block assigned to a site may
      have site-local structure (subnetting) for routing within the
      site; this structure is not visible to the global routing system.

   EID-prefix: A power-of-2 block of EIDs which are allocated to a
      site by an address allocation authority. EID-prefixes are
      associated with a set of RLOC addresses which make up a "database
      mapping". EID-prefix allocations can be broken up into smaller
      blocks when an RLOC set is to be associated with the smaller
      EID-prefix. A globally routed address block (whether PI or PA) is
      not an EID-prefix. However, a globally routed address block may be
      removed from global routing and reused as an EID-prefix. A site
      that receives an explicitly allocated EID-prefix may not use that
      EID-prefix as a globally routed prefix assigned to RLOCs.

   End-system: is an IPv4 or IPv6 device that originates packets with
      a single IPv4 or IPv6 header. The end-system supplies an EID
      value for the destination address field of the IP header when
      communicating globally (i.e. outside of its routing domain). An
      end-system can be a host computer, a switch or router device, or
      any network appliance.

   Ingress Tunnel Router (ITR): a router which accepts an IP packet
      with a single IP header (more precisely, an IP packet that does
      not contain a LISP header). The router treats this "inner" IP
      destination address as an EID and performs an EID-to-RLOC mapping
      lookup.
      The router then prepends an "outer" IP header with one of its
      globally-routable RLOCs in the source address field and the
      result of the mapping lookup in the destination address field.
      Note that this destination RLOC may be an intermediate, proxy
      device that has better knowledge of the EID-to-RLOC mapping closer
      to the destination EID. In general, an ITR receives IP packets
      from site end-systems on one side and sends LISP-encapsulated IP
      packets toward the Internet on the other side.

      Specifically, when a service provider prepends a LISP header for
      Traffic Engineering purposes, the router that does this is also
      regarded as an ITR. The outer RLOC the ISP ITR uses can be based
      on the outer destination address (the originating ITR's supplied
      RLOC) or the inner destination address (the originating host's
      supplied EID).

   TE-ITR: is an ITR that is deployed in a service provider network
      that prepends an additional LISP header for Traffic Engineering
      purposes.

   Egress Tunnel Router (ETR): a router that accepts an IP packet
      where the destination address in the "outer" IP header is one of
      its own RLOCs. The router strips the "outer" header and forwards
      the packet based on the next IP header found. In general, an ETR
      receives LISP-encapsulated IP packets from the Internet on one
      side and sends decapsulated IP packets to site end-systems on the
      other side. ETR functionality does not have to be limited to a
      router device. A server host can be the endpoint of a LISP tunnel
      as well.

   TE-ETR: is an ETR that is deployed in a service provider network
      that strips an outer LISP header for Traffic Engineering purposes.

   xTR: is a reference to an ITR or ETR when direction of data flow is
      not part of the context description. xTR refers to the router that
      is the tunnel endpoint. Used synonymously with the term "Tunnel
      Router".
      For example, "An xTR can be located at the Customer Edge (CE)
      router", meaning both ITR and ETR functionality is at the CE
      router.

   EID-to-RLOC Cache: a short-lived, on-demand table in an ITR that
      stores, tracks, and is responsible for timing-out and otherwise
      validating EID-to-RLOC mappings. This cache is distinct from the
      full "database" of EID-to-RLOC mappings: the cache is dynamic,
      local to the ITR(s), and relatively small, while the database is
      distributed, relatively static, and much more global in scope.

   EID-to-RLOC Database: a global distributed database that contains
      all known EID-prefix to RLOC mappings. Each potential ETR
      typically contains a small piece of the database: the EID-to-RLOC
      mappings for the EID prefixes "behind" the router. These map to
      one of the router's own, globally-visible, IP addresses.

   Recursive Tunneling: when a packet has more than one LISP IP
      header. Additional layers of tunneling may be employed to
      implement traffic engineering or other re-routing as needed. When
      this is done, an additional "outer" LISP header is added and the
      original RLOCs are preserved in the "inner" header. Any reference
      to tunnels in this specification refers to dynamic encapsulating
      tunnels; they are never statically configured.

   Reencapsulating Tunnels: when a packet has no more than one LISP IP
      header (two IP headers total) and when it needs to be diverted to
      a new RLOC, an ETR can decapsulate the packet (remove the LISP
      header) and prepend a new tunnel header, with a new RLOC, on to
      the packet. Doing this allows a packet to be re-routed by the
      re-encapsulating router without adding the overhead of additional
      tunnel headers. Any reference to tunnels in this specification
      refers to dynamic encapsulating tunnels; they are never statically
      configured.

   LISP Header: a term used in this document to refer to the outer
      IPv4 or IPv6 header, a UDP header, and a LISP header that an ITR
      prepends or an ETR strips.

   Address Family Indicator (AFI): a term used to describe an address
      encoding in a packet. An address family currently pertains to an
      IPv4 or IPv6 address. See [AFI] for details.

   Negative Mapping Entry: also known as a negative cache entry, is an
      EID-to-RLOC entry where an EID-prefix is advertised or stored with
      no RLOCs. That is, the locator-set for the EID-to-RLOC entry is
      empty or has an encoded locator count of 0. This type of entry
      could be used to describe a prefix from a non-LISP site, which is
      explicitly not in the mapping database.

   Data Probe: a LISP-encapsulated data packet where the inner header
      destination address equals the outer header destination address,
      used to trigger a Map-Reply by a decapsulating ETR. In addition,
      the original packet is decapsulated and delivered to the
      destination host. A Data Probe is used in some of the mapping
      database designs to "probe" or request a Map-Reply from an ETR; in
      other cases, Map-Requests are used. See each mapping database
      design for details.

4. Basic Overview

   One key concept of LISP is that end-systems (hosts) operate the same
   way they do today. The IP addresses that hosts use for tracking
   sockets, connections, and for sending and receiving packets do not
   change. In LISP terminology, these IP addresses are called Endpoint
   Identifiers (EIDs).

   Routers continue to forward packets based on IP destination
   addresses. When a packet is LISP encapsulated, these addresses are
   referred to as Routing Locators (RLOCs). Most routers along a path
   between two hosts will not change; they continue to perform
   routing/forwarding lookups on the destination addresses.
   For routers between the source host and the ITR as well as routers
   from the ETR to the destination host, the destination address is an
   EID. For the routers between the ITR and the ETR, the destination
   address is an RLOC.

   This design introduces "Tunnel Routers", which prepend LISP headers
   on host-originated packets and strip them prior to final delivery to
   their destination. The IP addresses in this "outer header" are
   RLOCs. During end-to-end packet exchange between two Internet hosts,
   an ITR prepends a new LISP header to each packet and an egress tunnel
   router strips the new header. The ITR performs EID-to-RLOC lookups
   to determine the routing path to the ETR, which has the RLOC as one
   of its IP addresses.

   Some basic rules governing LISP are:

   o End-systems (hosts) only send to addresses which are EIDs. They
      do not know whether addresses are EIDs or RLOCs but assume that
      packets reach LISP routers, which in turn deliver packets to the
      destination the end-system has specified.

   o EIDs are always IP addresses assigned to hosts.

   o LISP routers mostly deal with Routing Locator addresses. See
      details later in Section 4.1 to clarify what is meant by "mostly".

   o RLOCs are always IP addresses assigned to routers; preferably,
      topologically-oriented addresses from provider CIDR blocks.

   o When a router originates packets it may use as a source address
      either an EID or RLOC. When acting as a host (e.g. when
      terminating a transport session such as SSH, TELNET, or SNMP), it
      may use an EID that is explicitly assigned for that purpose. An
      EID that identifies the router as a host MUST NOT be used as an
      RLOC; an EID is only routable within the scope of a site.
      A typical BGP configuration might demonstrate this "hybrid"
      EID/RLOC usage where a router could use its "host-like" EID to
      terminate iBGP sessions to other routers in a site while at the
      same time using RLOCs to terminate eBGP sessions to routers
      outside the site.

   o EIDs are not expected to be usable for global end-to-end
      communication in the absence of an EID-to-RLOC mapping operation.
      They are expected to be used locally for intra-site communication.

   o EID prefixes are likely to be hierarchically assigned in a manner
      which is optimized for administrative convenience and to
      facilitate scaling of the EID-to-RLOC mapping database. The
      hierarchy is based on an address allocation hierarchy which is not
      dependent on the network topology.

   o EIDs may also be structured (subnetted) in a manner suitable for
      local routing within an autonomous system.

   An additional LISP header may be prepended to packets by a transit
   router (i.e. TE-ITR) when re-routing of the path for a packet is
   desired. An obvious instance of this would be an ISP router that
   needs to perform traffic engineering for packets in flow through its
   network. In such a situation, termed Recursive Tunneling, an ISP
   transit router acts as an additional ingress tunnel router, and the
   RLOC it uses for the new prepended header would be either a TE-ETR
   within the ISP (along an intra-ISP traffic engineered path) or a
   TE-ETR within another ISP (an inter-ISP traffic engineered path,
   where an agreement to build such a path exists).

   This specification mandates that no more than two LISP headers get
   prepended to a packet. This avoids excessive packet overhead as well
   as possible encapsulation loops.
   It is believed that two headers are sufficient, where the first
   prepended header is used at a site for Location/Identity separation
   and the second prepended header is used inside a service provider
   for Traffic Engineering purposes.

   Tunnel Routers can be placed fairly flexibly in a multi-AS topology.
   For example, the ITR for a particular end-to-end packet exchange
   might be the first-hop or default router within a site for the source
   host. Similarly, the egress tunnel router might be the last-hop
   router directly-connected to the destination host. In another
   example, perhaps for a VPN service outsourced to an ISP by a site,
   the ITR could be the site's border router at the service provider
   attachment point. Mixing and matching of site-operated,
   ISP-operated, and other tunnel routers is allowed for maximum
   flexibility. See Section 8 for more details.

4.1. Packet Flow Sequence

   This section provides an example of the unicast packet flow with the
   following conditions:

   o Source host "host1.abc.com" is sending a packet to
      "host2.xyz.com", exactly as host1 would if the site were not
      using LISP.

   o Each site is multi-homed, so each tunnel router has an address
      (RLOC) assigned from the service provider address block for each
      provider to which that particular tunnel router is attached.

   o The ITR(s) and ETR(s) are directly connected to the source and
      destination, respectively.

   o Data Probes, rather than Map-Requests, are used to solicit
      Map-Replies. The Data Probes are sent on the underlying topology
      (the LISP 1.0 variant) but could also be sent over an alternative
      topology (the LISP 1.5 variant), as they would be in [ALT].

   Client host1.abc.com wants to communicate with server host2.xyz.com:

   1. host1.abc.com wants to open a TCP connection to host2.xyz.com.
       It does a DNS lookup on host2.xyz.com. An A/AAAA record is
       returned.
       This address is used as the destination EID and the
       locally-assigned address of host1.abc.com is used as the source
       EID. An IPv4 or IPv6 packet is built using the EIDs in the IPv4
       or IPv6 header and sent to the default router.

   2. The default router is configured as an ITR. The ITR must be able
       to map the EID destination to an RLOC of the ETR at the
       destination site. The ITR prepends a LISP header to the packet,
       with one of its RLOCs as the source IPv4 or IPv6 address. The
       destination EID from the original packet header is used as the
       destination IPv4 or IPv6 address in the prepended LISP header.
       Subsequent packets, where the outer destination address is the
       destination EID, will be sent until the EID-to-RLOC mapping is
       learned.

   3. In LISP 1, the packet is routed through the Internet as it is
       today. In LISP 1.5, the packet is routed on a different topology
       which may have EID prefixes distributed and advertised in an
       aggregatable fashion. In either case, the packet arrives at the
       ETR. The router is configured to "punt" the packet to the
       router's processor. See Section 7 for more details. For LISP
       2.0 and 3.0, the behavior is not fully defined yet.

   4. The LISP header is stripped so that the packet can be forwarded
       by the router control plane. The router looks up the destination
       EID in the router's EID-to-RLOC database (not the cache, but the
       configured data structure of RLOCs). An EID-to-RLOC Map-Reply
       message is originated by the ETR and is addressed to the source
       RLOC in the LISP header of the original packet (this is the ITR).
       The source RLOC of the Map-Reply is one of the ETR's RLOCs.

   5. The ITR receives the Map-Reply message, parses the message (to
       check for format validity) and stores the mapping information
       from the packet.
       This information is put in the ITR's EID-to-RLOC mapping cache
       (this is the on-demand cache, the cache where entries time out
       due to inactivity).

   6.  Subsequent packets from host1.abc.com to host2.xyz.com will have
       a LISP header prepended by the ITR using the appropriate RLOC as
       the LISP header destination address learned from the ETR.  Note,
       the packet may be sent to a different ETR than the one which
       returned the Map-Reply due to the source site's hashing policy or
       the destination site's locator-set policy.

   7.  The ETR receives these packets directly (since the destination
       address is one of its assigned IP addresses), strips the LISP
       header and forwards the packets to the attached destination host.

   In order to eliminate the need for a mapping lookup in the reverse
   direction, an ETR MAY create a cache entry that maps the source EID
   (inner header source IP address) to the source RLOC (outer header
   source IP address) in a received LISP packet.  Such a cache entry is
   termed a "gleaned" mapping and only contains a single RLOC for the
   EID in question.  More complete information about additional RLOCs
   SHOULD be verified by sending a LISP Map-Request for that EID.  Both
   the ITR and the ETR may also influence the decision the other makes
   in selecting an RLOC.  See Section 6 for more details.

5.  Tunneling Details

   This section describes the LISP Data Message, which defines the
   tunneling header used to encapsulate IPv4 and IPv6 packets which
   contain EID addresses.  Even though the following formats illustrate
   IPv4-in-IPv4 and IPv6-in-IPv6 encapsulations, the other 2
   combinations are supported as well.

   Since additional tunnel headers are prepended, the packet becomes
   larger and in theory can exceed the MTU of any link traversed from
   the ITR to the ETR.  It is recommended that, in IPv4, packets not be
   fragmented as they are encapsulated by the ITR.
   Instead, the packet is dropped and an ICMP Too Big message is
   returned to the source.

   Based on informal surveys of large ISP traffic patterns, it appears
   that most transit paths can accommodate a path MTU of at least 4470
   bytes.  The exceptions, in terms of data rate, number of hosts
   affected, or any other metric, are expected to be vanishingly small.

   To address MTU concerns, mainly raised on the RRG mailing list, the
   LISP deployment process will include collecting data during its pilot
   phase to either verify or refute the assumption about minimum
   available MTU.  If the assumption proves true and transit networks
   with links limited to 1500 byte MTUs are corner cases, it would seem
   more cost-effective to either upgrade or modify the equipment in
   those transit networks to support larger MTUs or to use existing
   mechanisms for accommodating packets that are too large.

   For this reason, there is currently no plan for LISP to add any new,
   complex mechanism for implementing fragmentation and reassembly in
   the face of limited-MTU transit links.  If analysis during LISP pilot
   deployment reveals that the assumption of essentially ubiquitous
   4470+ byte transit path MTUs is incorrect, then LISP can be modified
   prior to protocol standardization to add support for one of the
   proposed fragmentation and reassembly schemes.  Note that two simple
   existing schemes are detailed in Section 5.4.

5.1.
LISP IPv4-in-IPv4 Header Format 615 0 1 2 3 616 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 617 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 618 / |Version| IHL |Type of Service| Total Length | 619 / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 620 | | Identification |Flags| Fragment Offset | 621 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 622 OH | Time to Live | Protocol = 17 | Header Checksum | 623 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 624 | | Source Routing Locator | 625 \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 626 \ | Destination Routing Locator | 627 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 628 / | Source Port = xxxx | Dest Port = 4341 | 629 UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 630 \ | UDP Length | UDP Checksum | 631 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 632 L / |S| Locator Reach Bits | 633 I +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 634 S \ | Nonce | 635 P +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 636 / |Version| IHL |Type of Service| Total Length | 637 / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 638 | | Identification |Flags| Fragment Offset | 639 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 640 IH | Time to Live | Protocol | Header Checksum | 641 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 642 | | Source EID | 643 \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 644 \ | Destination EID | 645 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 647 5.2. 
LISP IPv6-in-IPv6 Header Format 649 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 650 / |Version| Traffic Class | Flow Label | 651 / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 652 | | Payload Length | Next Header=17| Hop Limit | 653 v +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 654 | | 655 O + + 656 u | | 657 t + Source Routing Locator + 658 e | | 659 r + + 660 | | 661 H +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 662 d | | 663 r + + 664 | | 665 ^ + Destination Routing Locator + 666 | | | 667 \ + + 668 \ | | 669 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 670 / | Source Port = xxxx | Dest Port = 4341 | 671 UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 672 \ | UDP Length | UDP Checksum | 673 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 674 L / |S| Locator Reach Bits | 675 I +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 676 S \ | Nonce | 677 P +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 678 / |Version| Traffic Class | Flow Label | 679 / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 680 / | Payload Length | Next Header | Hop Limit | 681 v +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 682 | | 683 I + + 684 n | | 685 n + Source EID + 686 e | | 687 r + + 688 | | 689 H +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 690 d | | 691 r + + 692 | | 694 ^ + Destination EID + 695 \ | | 696 \ + + 697 \ | | 698 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 700 5.3. Tunnel Header Field Descriptions 702 IH Header: is the inner header, preserved from the datagram received 703 from the originating host. The source and destination IP 704 addresses are EIDs. 706 OH Header: is the outer header prepended by an ITR. The address 707 fields contain RLOCs obtained from the ingress router's EID-to- 708 RLOC cache. 
      The IP protocol number is "UDP (17)" from [RFC0768].

   UDP Header:  contains an ITR-selected source port when encapsulating
      a packet.  See Section 6.4 for details on the hash algorithm used
      to select a source port based on the 5-tuple of the inner header.
      The destination port MUST be set to the well-known IANA assigned
      port value 4341.

   UDP Checksum:  this field MUST be transmitted as 0 and ignored on
      receipt by the ETR.  Note, even when the UDP checksum is
      transmitted as 0, an intervening NAT device can recalculate the
      checksum and rewrite the UDP checksum field to non-zero.  For
      performance reasons, the ETR MUST ignore the checksum and MUST NOT
      do a checksum computation.

   UDP Length:  for an IPv4 encapsulated packet, the inner header Total
      Length plus the UDP and LISP header lengths are used.  For an IPv6
      encapsulated packet, the inner header Payload Length plus the size
      of the IPv6 header (40 bytes) plus the size of the UDP and LISP
      headers are used.  The UDP header length is 8 bytes.  The LISP
      header length is 8 bytes when no loc-reach-bit header extensions
      are used.

   S:  this is the Solicit-Map-Request (SMR) bit.  See Section 6.5.2
      for details.

   LISP Locator Reach Bits:  in the LISP header are set by an ITR to
      indicate to an ETR the reachability of the Locators in the source
      site.  Each RLOC in a Map-Reply is assigned an ordinal value from
      0 to n-1 (when there are n RLOCs in a mapping entry).  The Locator
      Reach Bits are numbered from 0 to n-1 starting from the least
      significant bit of the 31-bit field.  When a bit is set to 1, the
      ITR is indicating to the ETR that the RLOC associated with the bit
      ordinal is reachable.  See Section 6.3 for details on how an ITR
      can determine that other ITRs at the site are reachable.
      When a site has multiple EID-prefixes which result in multiple
      mappings (where each could have a different locator-set), the
      Locator Reach Bits setting in an encapsulated packet MUST reflect
      the mapping for the EID-prefix that the inner-header source EID
      address matches.

   LISP Nonce:  is a 32-bit value that is randomly generated by an ITR.
      It is used to test route-returnability when xTRs exchange
      encapsulated data packets with the SMR bit set, Data-Probe,
      Map-Request, or Map-Reply messages.

   When doing Recursive Tunneling:

   o  The OH header Time to Live field (or Hop Limit field, in the case
      of IPv6) MUST be copied from the IH header Time to Live field.

   o  The OH header Type of Service field (or the Traffic Class field,
      in the case of IPv6) SHOULD be copied from the IH header Type of
      Service field.

   When doing Re-encapsulated Tunneling:

   o  The new OH header Time to Live field SHOULD be copied from the
      stripped OH header Time to Live field.

   o  The new OH header Type of Service field SHOULD be copied from the
      stripped OH header Type of Service field.

   Copying the TTL serves two purposes: first, it preserves the distance
   the host intended the packet to travel; second, and more importantly,
   it provides for suppression of looping packets in the event there is
   a loop of concatenated tunnels due to misconfiguration.

5.4.  Dealing with Large Encapsulated Packets

   In the event that the MTU issues mentioned above prove to be more
   serious than expected, this section proposes 2 simple mechanisms to
   deal with large packets.  One is stateless using IP fragmentation and
   the other is stateful using Path MTU Discovery [RFC1191].

   It is left to the implementor to decide if the stateless or stateful
   mechanism should be implemented.  An implementation may also support
   both or neither, since how to deal with MTU issues is a local
   decision in the ITR.
   Sites can interoperate with differing mechanisms.

5.4.1.  A Stateless Solution to MTU Handling

   An ITR stateless solution to handle MTU issues is described as
   follows:

   1.  Define an architectural constant S for the maximum size of a
       packet, in bytes, an ITR would receive from a source inside of
       its site.

   2.  Define L to be the maximum size, in bytes, a packet of size S
       would be after the ITR prepends the LISP header, UDP header, and
       outer network layer header of size H.

   3.  Calculate: S + H = L.

   When an ITR receives a packet from a site-facing interface and adds H
   bytes worth of encapsulation to yield a packet size greater than L
   bytes, it resolves the MTU issue by first splitting the original
   packet into 2 equal-sized fragments.  A LISP header is then prepended
   to each fragment.  This will ensure that the new, encapsulated
   packets are of size (S/2 + H), which is always below the effective
   tunnel MTU.

   When an ETR receives encapsulated fragments, it treats them as two
   individually encapsulated packets.  It strips the LISP headers and
   then forwards each fragment to the destination host of the
   destination site.  The two fragments are reassembled at the
   destination host into the single IP datagram that was originated by
   the source host.

   This behavior is performed by the ITR when the source host originates
   a packet with the DF field of the IP header set to 0.  When the DF
   field of the IP header is set to 1, or the packet is an IPv6 packet
   originated by the source host, the ITR will drop the packet when the
   size is greater than L, and send an ICMP Too Big message to the
   source with a value of S, where S is (L - H).

   This specification recommends that L be defined as 1500.

5.4.2.  A Stateful Solution to MTU Handling

   An ITR stateful solution to handle MTU issues is described as follows
   and was first introduced in [OPENLISP]:

   1.
       The ITR will keep state of the effective MTU for each locator per
       mapping cache entry.  The effective MTU is what the core network
       can deliver along the path between ITR and ETR.

   2.  When an encapsulated packet exceeds what the core network can
       deliver, one of the intermediate routers on the path will send an
       ICMP Too Big message to the ITR.  The ITR will parse the ICMP
       message to determine which locator is affected by the effective
       MTU change and then record the new effective MTU value in the
       mapping cache entry.

   3.  When a packet is received by the ITR from a source inside of the
       site and the size of the packet is greater than the effective MTU
       stored with the mapping cache entry associated with the
       destination EID the packet is for, the ITR will send an ICMP Too
       Big message back to the source.  The packet size advertised by
       the ITR in the ICMP Too Big message is the effective MTU minus
       the LISP encapsulation length.

   Even though this mechanism is stateful, it has advantages over the
   stateless IP fragmentation mechanism, by not involving the
   destination host with reassembly of ITR fragmented packets.

6.  EID-to-RLOC Mapping

6.1.
LISP IPv4 and IPv6 Control Plane Packet Formats 856 The following new UDP packet types are used to retrieve EID-to-RLOC 857 mappings: 859 0 1 2 3 860 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 861 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 862 |Version| IHL |Type of Service| Total Length | 863 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 864 | Identification |Flags| Fragment Offset | 865 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 866 | Time to Live | Protocol = 17 | Header Checksum | 867 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 868 | Source Routing Locator | 869 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 870 | Destination Routing Locator | 871 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 872 / | Source Port | Dest Port | 873 UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 874 \ | UDP Length | UDP Checksum | 875 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 876 | | 877 | LISP Message | 878 | | 879 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 881 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 882 |Version| Traffic Class | Flow Label | 883 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 884 | Payload Length | Next Header=17| Hop Limit | 885 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 886 | | 887 + + 888 | | 889 + Source Routing Locator + 890 | | 891 + + 892 | | 893 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 894 | | 895 + + 896 | | 897 + Destination Routing Locator + 898 | | 899 + + 900 | | 901 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 902 / | Source Port | Dest Port | 903 UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 904 \ | UDP Length | UDP Checksum | 905 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 906 
| | 907 | LISP Message | 908 | | 909 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 911 The LISP UDP-based messages are the Map-Request and Map-Reply 912 messages. When a UDP Map-Request is sent, the UDP source port is 913 chosen by the sender and the destination UDP port number is set to 914 4342. When a UDP Map-Reply is sent, the source UDP port number is 915 set to 4342 and the destination UDP port number is copied from the 916 source port of either the Map-Request or the invoking data packet. 918 The UDP Length field will reflect the length of the UDP header and 919 the LISP Message payload. 921 The UDP Checksum is computed and set to non-zero for Map-Request and 922 Map-Reply messages. It MUST be checked on receipt and if the 923 checksum fails, the packet MUST be dropped. 925 LISP-CONS [CONS] use TCP to send LISP control messages. The format 926 of control messages includes the UDP header so the checksum and 927 length fields can be used to protect and delimit message boundaries. 929 This main LISP specification is the authoritative source for message 930 format definitions for the Map-Request and Map-Reply messages. 932 6.1.1. LISP Packet Type Allocations 934 This section will be the authoritative source for allocating LISP 935 Type values. Current allocations are: 937 Reserved: 0 b'0000' 938 LISP Map-Request: 1 b'0001' 939 LISP Map-Reply: 2 b'0010' 940 LISP-CONS Open Message: 8 b'1000' 941 LISP-CONS Push-Add Message: 9 b'1001' 942 LISP-CONS Push-Delete Message: 10 b'1010' 943 LISP-CONS Unreachable Message 11 b'1011' 945 6.1.2. 
Map-Request Message Format 947 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 948 |S| Locator Reach Bits | 949 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 950 | Nonce | 951 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 952 |Type=1 |A|R| Reserved | Record Count | 953 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 954 | Source-EID-AFI | ITR-AFI | 955 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 956 | Source EID Address ... | 957 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 958 | Originating ITR RLOC Address ... | 959 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 960 / | Reserved | EID mask-len | EID-prefix-AFI | 961 Rec +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 962 \ | EID-prefix ... | 963 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 964 | Map-Reply Record ... | 965 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 966 | Mapping Protocol Data | 967 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 969 Packet field descriptions: 971 S: This is the SMR bit. See Section 6.5.2 for details. 973 Locator Reach Bits: These bits MUST be set to 0 on transmission and 974 ignored on receipt. They cannot be used for indicating 975 reachability because the Map-Request does not have the EID-prefix 976 for the sending site so the receiver of the Map-Request cannot 977 know what mapping entry to associate the reachability with. 978 However, when Mapping Data is provided in the Map-Reply Record 979 field, and the receiver of the Map-Request is configured to accept 980 the mapping data, the R-bit per locator entry in the EID-prefix 981 record is used to denote reachability. 983 Nonce: A 4-byte random value created by the sender of the Map- 984 Request. 
   Type:  1 (Map-Request)

   A:  This is an authoritative bit, which is set to 0 for UDP-based
      Map-Requests sent by an ITR.  See other control-specific documents
      [CONS] for TCP-based Map-Requests.

   R:  When set, it indicates a Map-Reply Record segment is included in
      the Map-Request.

   Reserved:  Set to 0 on transmission and ignored on receipt.

   Record Count:  The number of records in this request message.  A
      record is comprised of the portion of the packet that is labeled
      'Rec' above and occurs the number of times equal to Record Count.

   Source-EID-AFI:  Address family of the "Source EID Address" field.

   ITR-AFI:  Address family of the "Originating ITR RLOC Address" field.

   Source EID Address:  This is the EID of the source host which
      originated the packet which is invoking this Map-Request.

   Originating ITR RLOC Address:  Used to give the ETR the option of
      returning a Map-Reply in the address-family of this locator.

   EID mask-len:  Mask length for EID prefix.

   EID-prefix-AFI:  Address family of EID-prefix according to [RFC2434].

   EID-prefix:  4 bytes if an IPv4 address-family, 16 bytes if an IPv6
      address-family.  When a Map-Request is sent by an ITR because a
      data packet is received for a destination where there is no
      mapping entry, the EID-prefix is set to the destination IP address
      of the data packet, and the 'EID mask-len' is set to 32 or 128
      for IPv4 or IPv6, respectively.  When an xTR wants to query a site
      about the status of a mapping it already has cached, the
      EID-prefix used in the Map-Request has the same mask-length as the
      EID-prefix returned from the site when it sent a Map-Reply
      message.

   Map-Reply Record:  When the R bit is set, this field is the size of
      the "Record" field in the Map-Reply format.  This Map-Reply record
      contains the EID-to-RLOC mapping entry associated with the Source
      EID.
This allows the ETR which will receive this Map-Request to 1030 cache the data if it chooses to do so. 1032 Mapping Protocol Data: See [CONS] or [ALT] for details. This field 1033 is optional and present when the UDP length indicates there is 1034 enough space in the packet to include it. 1036 6.1.3. EID-to-RLOC UDP Map-Request Message 1038 A Map-Request is sent from an ITR when it needs a mapping for an EID, 1039 wants to test an RLOC for reachability, or wants to refresh a mapping 1040 before TTL expiration. This is performed by using the RLOC as the 1041 destination address for Map-Request message with a randomly allocated 1042 source UDP port number and the well-known destination port number 1043 4342. A successful Map-Reply updates the cached set of RLOCs 1044 associated with the EID prefix range. 1046 Map-Requests MUST be rate-limited. It is recommended that a Map- 1047 Request for the same EID-prefix be sent no more than once per second. 1049 6.1.4. Map-Reply Message Format 1051 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1052 |x| Locator Reach Bits | 1053 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1054 | Nonce | 1055 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1056 |Type=2 | Reserved | Record Count | 1057 +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1058 | | Record TTL | 1059 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1060 R | Locator Count | EID mask-len |A| Reserved | 1061 e +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1062 c | Reserved | EID-AFI | 1063 o +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1064 r | EID-prefix | 1065 d +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1066 | /| Priority | Weight | M Priority | M Weight | 1067 | L +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1068 | o | Unused Flags |R| Loc-AFI | 1069 | c 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1070 | \| Locator | 1071 +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1072 | Mapping Protocol Data | 1073 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1075 Packet field descriptions: 1077 x: Set to 0 on transmission and ignored on receipt. 1079 Locator Reach Bits: Refer to Section 5.3. This field MUST be set to 1080 0 on transmission and ignored on receipt. The locator 1081 reachability is encoded as the R-bit in each locator entry of each 1082 EID-prefix record. 1084 Nonce: A 4-byte value set in a Data-Probe packet or a Map-Request 1085 that is echoed here in the Map-Reply. 1087 Type: 2 (Map-Reply) 1089 Reserved: Set to 0 on transmission and ignored on receipt. 1091 Record Count: The number of records in this reply message. A record 1092 is comprised of that portion of the packet labeled 'Record' above 1093 and occurs the number of times equal to Record count. 1095 Record TTL: The time in minutes the recipient of the Map-Reply will 1096 store the mapping. If the TTL is 0, the entry should be removed 1097 from the cache immediately. If the value is 0xffffffff, the 1098 recipient can decide locally how long to store the mapping. 1100 Locator Count: The number of Locator entries. A locator entry 1101 comprises what is labeled above as 'Loc'. The locator count can 1102 be 0 indicating there are no locators for the EID-prefix. 1104 EID mask-len: Mask length for EID prefix. 1106 A: The Authoritative bit, when sent by a UDP-based message is always 1107 set by the ETR. See [CONS] for TCP-based Map-Replies. 1109 EID-AFI: Address family of EID-prefix according to [RFC2434]. 1111 EID-prefix: 4 bytes if an IPv4 address-family, 16 bytes if an IPv6 1112 address-family. 1114 Priority: each RLOC is assigned a unicast priority. Lower values 1115 are more preferable. When multiple RLOCs have the same priority, 1116 they may be used in a load-split fashion. 
      A value of 255 means the RLOC MUST NOT be used for unicast
      forwarding.

   Weight:  when priorities are the same for multiple RLOCs, the weight
      indicates how to balance unicast traffic between them.  Weight is
      encoded as a percentage of total unicast packets that match the
      mapping entry.  If a non-zero weight value is used for any RLOC,
      then all RLOCs must use a non-zero weight value and then the sum
      of all weight values MUST equal 100.  If a zero value is used for
      any RLOC weight, then all weights MUST be zero and the receiver of
      the Map-Reply will decide how to load-split traffic.  See
      Section 6.4 for a suggested hash algorithm to distribute load
      across locators with same priority and equal weight values.  When
      a single RLOC exists in a mapping entry, the weight value MUST be
      set to 100 and ignored on receipt.

   M Priority:  each RLOC is assigned a multicast priority used by an
      ETR in a receiver multicast site to select an ITR in a source
      multicast site for building multicast distribution trees.  A value
      of 255 means the RLOC MUST NOT be used for joining a multicast
      distribution tree.

   M Weight:  when priorities are the same for multiple RLOCs, the
      weight indicates how to balance building multicast distribution
      trees across multiple ITRs.  The weight is encoded as a percentage
      of the total number of trees built to the source site identified
      by the EID-prefix.  If a non-zero weight value is used for any
      RLOC, then all RLOCs must use a non-zero weight value and then the
      sum of all weight values MUST equal 100.  If a zero value is used
      for any RLOC weight, then all weights MUST be zero and the
      receiver of the Map-Reply will decide how to distribute multicast
      state across ITRs.

   Unused Flags:  set to 0 when sending and ignored on receipt.

   R:  when this bit is set, the locator is known to be reachable from
      the Map-Reply sender's perspective.
      When there is a single mapping record in the message, the R-bit
      for each locator must have a consistent setting with the bitfield
      setting of the 'Loc Reach Bits' field in the early part of the
      header.  When there are multiple mapping records in the message,
      the 'Loc Reach Bits' field is set to 0.

   Locator:  an IPv4 or IPv6 address (as encoded by the 'Loc-AFI' field)
      assigned to an ETR or router acting as a proxy replier for the
      EID-prefix.  Note that the destination RLOC address MAY be an
      anycast address.  A source RLOC can be an anycast address as well.
      The source or destination RLOC MUST NOT be the broadcast address
      (255.255.255.255 or any subnet broadcast address known to the
      router), and MUST NOT be a link-local multicast address.  The
      source RLOC MUST NOT be a multicast address.  The destination RLOC
      SHOULD be a multicast address if it is being mapped from a
      multicast destination EID.

   Mapping Protocol Data:  See [CONS] or [ALT] for details.  This field
      is optional and present when the UDP length indicates there is
      enough space in the packet to include it.

6.1.5.  EID-to-RLOC UDP Map-Reply Message

   When a Data Probe packet or a Map-Request triggers a Map-Reply to be
   sent, the RLOCs associated with the EID-prefix matched by the EID in
   the original packet destination IP address field will be returned.
   The RLOCs in the Map-Reply are the globally-routable IP addresses of
   the ETR but are not necessarily reachable; separate testing of
   reachability is required.

   Note that a Map-Reply may contain a different EID-prefix granularity
   (prefix + length) than the Map-Request which triggers it.  This might
   occur if a Map-Request were for a prefix that had been returned by an
   earlier Map-Reply.  In such a case, the requester updates its cache
   with the new prefix information and granularity.
   For example, a requester with two cached EID-prefixes that are
   covered by a Map-Reply containing one less-specific prefix replaces
   the entries with the less-specific EID-prefix.  Note that the
   reverse, replacement of one less-specific prefix with multiple
   more-specific prefixes, can also occur, not by removing the
   less-specific prefix but by adding the more-specific prefixes, which
   during a lookup will override the less-specific prefix.

   Replies SHOULD be sent for an EID-prefix no more often than once per
   second to the same requesting router.  For scalability, it is
   expected that aggregation of EID addresses into EID-prefixes will
   allow one Map-Reply to satisfy a mapping for the EID addresses in the
   prefix range thereby reducing the number of Map-Request messages.

   The source and destination addresses of an encapsulated data packet
   or Map-Request message are swapped and used for sending the
   Map-Reply.  The UDP source and destination ports are swapped as well.
   That is, the source port in the UDP header for the Map-Reply is set
   to the well-known UDP port number 4342.

6.2.  Routing Locator Selection

   Both client-side and server-side may need control over the selection
   of RLOCs for conversations between them.  This control is achieved by
   manipulating the Priority and Weight fields in EID-to-RLOC Map-Reply
   messages.  Alternatively, RLOC information may be gleaned from
   received tunneled packets or EID-to-RLOC Map-Request messages.

   The following enumerates different scenarios for choosing RLOCs and
   the controls that are available:

   o  Server-side returns one RLOC.  Client-side can only use one RLOC.
      Server-side has complete control of the selection.

   o  Server-side returns a list of RLOCs where a subset of the list has
      the same best priority.  Client can only use the subset list
      according to the weighting assigned by the server-side.
      In this case, the server-side controls both the subset list and
      load-splitting across its members.  The client-side can use RLOCs
      outside of the subset list if it determines that the subset list
      is unreachable (unless RLOCs are set to a Priority of 255).  Some
      sharing of control exists: the server-side determines the
      destination RLOC list and load distribution while the client-side
      has the option of using alternatives to this list if RLOCs in the
      list are unreachable.

   o  Server-side sets weight of 0 for the RLOC subset list.  In this
      case, the client-side can choose how the traffic load is spread
      across the subset list.  Control is shared by the server-side
      determining the list and the client determining load distribution.
      Again, the client can use alternative RLOCs if the server-provided
      list of RLOCs is unreachable.

   o  Either side (more likely the server-side ETR) decides not to
      send a Map-Request.  For example, if the server-side ETR does not
      send Map-Requests, it gleans RLOCs from the client-side ITR,
      giving the client-side ITR responsibility for bidirectional RLOC
      reachability and preferability.  Server-side ETR gleaning of the
      client-side ITR RLOC is done by caching the inner header source
      EID and the outer header source RLOC of received packets.  The
      client-side ITR controls how traffic is returned and can alternate
      using an outer header source RLOC, which then can be added to the
      list the server-side ETR uses to return traffic.  Since no
      Priority or Weights are provided using this method, the
      server-side ETR must assume each client-side ITR RLOC uses the
      same best Priority with a Weight of zero.  In addition, since
      EID-prefix encoding cannot be conveyed in data packets, the
      EID-to-RLOC cache on tunnel routers can grow to be very large.
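
   The selection scenarios above can be illustrated with a minimal
   sketch.  It is not part of the specification: the Locator class, the
   function names, and the use of CRC-32 over an opaque flow key are
   assumptions made for illustration only (the suggested hash algorithm
   is the one referred to in Section 6.4).  The sketch assumes that a
   Priority of 255 excludes a locator, that lower Priority values are
   preferred, and that all-zero Weights leave the split to the receiver,
   which here falls back to an even split.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Locator:
    """Hypothetical view of one locator entry from a Map-Reply record."""
    rloc: str               # locator address in text form
    priority: int           # lower is preferred; 255 means "do not use"
    weight: int             # percentage share; 0 means "receiver decides"
    reachable: bool = True  # local reachability state for this locator


def usable_subset(locators: List[Locator]) -> List[Locator]:
    """Return the reachable locators sharing the best usable priority.

    Filtering on reachability first means the client falls back to a
    less preferred priority when the best subset is unreachable.
    """
    candidates = [l for l in locators if l.priority != 255 and l.reachable]
    if not candidates:
        return []
    best = min(l.priority for l in candidates)
    return [l for l in candidates if l.priority == best]


def select_rloc(locators: List[Locator], flow_key: bytes) -> Optional[Locator]:
    """Pick one locator for a flow, honoring the weights when non-zero."""
    subset = usable_subset(locators)
    if not subset:
        return None
    weights = [l.weight for l in subset]
    if sum(weights) == 0:
        # Weight of 0: the receiver of the mapping chooses the split.
        weights = [1] * len(subset)
    # Assumption: any stable hash of the flow identifier will do here.
    bucket = zlib.crc32(flow_key) % sum(weights)
    for loc, w in zip(subset, weights):
        if bucket < w:
            return loc
        bucket -= w
    return subset[-1]
```

   Hashing on a per-flow key keeps all packets of one flow on the same
   locator while spreading different flows roughly in proportion to the
   weights.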

RLOCs that appear in EID-to-RLOC Map-Reply messages are considered reachable. The Map-Reply and the database mapping service do not provide any reachability status for Locators. This is done outside of the mapping service. See the next section for details.

6.3.  Routing Locator Reachability

There are six methods for determining when a Locator is either reachable or has become unreachable:

1.  Locator reachability is determined by an ETR by examining the Loc-Reach-Bits from the LISP header of an encapsulated data packet, which is provided by an ITR when the ITR encapsulates data.

2.  Locator unreachability is determined by an ITR by receiving ICMP Network or Host Unreachable messages.

3.  Locator unreachability can also be determined by a BGP-enabled ITR when there is no prefix matching a Locator address from the BGP RIB.

4.  Locator unreachability is determined when a host sends an ICMP Port Unreachable message. This occurs when an ITR does not use any method of interworking, one of which is described in [INTERWORK], and the encapsulated data packet is received by a host at the destination non-LISP site.

5.  Locator reachability is determined by receiving a Map-Reply message from an ETR's Locator address in response to a previously sent Map-Request.

6.  Locator reachability can also be determined by receiving packets encapsulated by the ITR assigned to the locator address.

When determining Locator reachability by examining the Loc-Reach-Bits from the LISP encapsulated data packet, an ETR will receive up-to-date status from the ITR closest to the Locators at the source site. The ITRs at the source site can determine reachability when running their IGP at the site. When the ITRs are deployed on CE routers, typically a default route is injected into the site's IGP from each of the ITRs.
If an ITR goes down, the CE-PE link goes down, or the PE router goes down, the CE router withdraws the default route. This allows the other ITRs at the site to determine that one of the Locators has become unreachable.

The Locators listed in a Map-Reply are numbered with ordinals 0 to n-1. The Loc-Reach-Bits in a LISP Data Message are numbered from 0 to n-1 starting with the least significant bit numbered as 0. So, for example, if the ITR with the locator listed in the 3rd Locator position in the Map-Reply goes down, all other ITRs at the site will clear the 3rd bit from the right (the bit that corresponds to ordinal 2).

When an ETR decapsulates a packet, it will look for a change in the Loc-Reach-Bits value. When a bit goes from 1 to 0, the ETR will refrain from encapsulating packets to the Locator that has just gone unreachable. It can start using the Locator again when the bit that corresponds to the Locator goes from 0 to 1. Loc-Reach-Bits are associated with a locator-set per EID-prefix. Therefore, when a locator becomes unreachable, the loc-reach-bit that corresponds to that locator's position in the list returned by the last Map-Reply will be set to zero for that particular EID-prefix.

When ITRs at the site are not deployed in CE routers, the IGP can still be used to determine the reachability of Locators provided they are injected as stub links into the IGP. This is typically done when a /32 address is configured on a loopback interface.

When ITRs receive ICMP Network or Host Unreachable messages as a method to determine unreachability, they will refrain from using Locators which are described in Locator lists of Map-Replies. However, using this approach is unreliable because many network operators turn off generation of ICMP Unreachable messages.
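Returning to the Loc-Reach-Bits numbering described earlier in this section, the mapping between bit positions and Locator ordinals can be sketched in Python as follows. This is an illustration only. The function names `reachable_ordinals` and `diff_reachability` are hypothetical, and the sketch assumes the Loc-Reach-Bits field has already been extracted from the LISP header as an integer.

```python
from typing import List, Tuple


def reachable_ordinals(loc_reach_bits: int, num_locators: int) -> List[int]:
    """List the Locator ordinals whose bit is set (bit 0 = least significant)."""
    return [i for i in range(num_locators) if (loc_reach_bits >> i) & 1]


def diff_reachability(old_bits: int, new_bits: int,
                      num_locators: int) -> Tuple[List[int], List[int]]:
    """Compare two Loc-Reach-Bits values for one EID-prefix.

    Returns (went_down, came_up): ordinals whose bit changed 1 -> 0 and
    ordinals whose bit changed 0 -> 1. Bits at positions >= num_locators
    are not examined.
    """
    went_down, came_up = [], []
    for i in range(num_locators):
        old = (old_bits >> i) & 1
        new = (new_bits >> i) & 1
        if old and not new:
            went_down.append(i)
        elif new and not old:
            came_up.append(i)
    return went_down, came_up
```

For a site with four Locators, a value of binary 1011 corresponds to the example in the text: the Locator at ordinal 2 (the 3rd position) is down and the others are up.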

If an ITR does receive an ICMP Network or Host Unreachable message, it MAY originate its own ICMP Unreachable message destined for the host that originated the data packet the ITR encapsulated.

Also, BGP-enabled ITRs can unilaterally examine the BGP RIB to see if a locator address from a locator-set in a mapping entry matches a prefix. If it does not find one and BGP is running in the Default Free Zone (DFZ), it can decide to not use the locator even though the Loc-Reach-Bits indicate the locator is up. In this case, the path from the ITR to the ETR that is assigned the locator is not available. More details are in [LOC-ID-ARCH].

Optionally, an ITR can send a Map-Request to a Locator and if a Map-Reply is returned, reachability of the Locator has been determined. Obviously, sending such probes increases the number of control messages originated by tunnel routers for active flows, so Locators are assumed to be reachable when they are advertised.

This assumption does create a dependency: Locator unreachability is detected by the receipt of ICMP Host Unreachable messages. When a Locator has been determined to be unreachable, it is not used for active traffic; this is the same as if it were listed in a Map-Reply with priority 255.

The ITR can test the reachability of the unreachable Locator by sending periodic Requests. Both Requests and Replies MUST be rate-limited. Locator reachability testing is never done with data packets since that increases the risk of packet loss for end-to-end sessions.

When an ETR is decapsulating packets, it can be sure that the path from the encapsulating ITR is available. The ETR can assume the path from the ETR to the ITR is also reachable.
Even if there is asymmetric routing in the core, the first-hop and last-hop ASes will be the same for both directions of traffic since the locator addresses are out of the PA blocks of each. However, the assumption may not always be valid, so this mechanism should be used as a best-effort indication that a working path exists between the sites. In the event of unidirectional traffic from an ITR to an ETR, an ITR should not conclude that a locator is unreachable because it is not receiving packets, but should use the alternate mechanisms described above to determine reachability.

6.4.  Routing Locator Hashing

When an ETR provides an EID-to-RLOC mapping in a Map-Reply message to a requesting ITR, the locator-set for the EID-prefix may contain different priority values for each locator address. When more than one best priority locator exists, the ITR can decide how to load share traffic against the corresponding locators.

The following hash algorithm may be used by an ITR to select a locator for a packet destined to an EID for the EID-to-RLOC mapping:

1.  Either a source and destination address hash can be used or the traditional 5-tuple hash, which includes the source and destination addresses, the source and destination TCP, UDP, or SCTP port numbers, and the IP protocol number field or IPv6 next-protocol field of a packet that a host originates from within a LISP site. When a packet is not a TCP, UDP, or SCTP packet, only the source and destination addresses from the header are used to compute the hash.

2.  Take the hash value and divide it by the number of locators stored in the locator-set for the EID-to-RLOC mapping.

3.  The remainder will yield a value from 0 to "number of locators minus 1". Use the remainder to select the locator in the locator-set.
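The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a required implementation. The text does not name a hash function, so CRC-32 from the standard `zlib` module is an arbitrary stand-in. The function names `flow_hash` and `select_locator` and the addresses used in the usage note are hypothetical.

```python
import ipaddress
import struct
import zlib
from typing import Optional, Sequence

# IP protocol numbers for TCP, UDP, and SCTP.
PORT_PROTOCOLS = {6, 17, 132}


def flow_hash(src: str, dst: str, protocol: int,
              src_port: Optional[int] = None,
              dst_port: Optional[int] = None) -> int:
    """Step 1: hash the 5-tuple, or only the addresses when ports do not apply."""
    key = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    if (protocol in PORT_PROTOCOLS
            and src_port is not None and dst_port is not None):
        key += struct.pack("!HHB", src_port, dst_port, protocol)
    # CRC-32 is an assumption made for this sketch; any stable hash would do.
    return zlib.crc32(key)


def select_locator(locator_set: Sequence[str], hash_value: int) -> str:
    """Steps 2 and 3: the remainder of hash / number-of-locators is the index."""
    return locator_set[hash_value % len(locator_set)]
```

For example, `select_locator(["198.51.100.1", "198.51.100.2", "198.51.100.3"], flow_hash("10.0.0.1", "10.0.1.1", 6, 12345, 80))` always returns the same locator for that TCP flow, so packets of one flow are not spread across different locators.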

Note that when a packet is LISP encapsulated, the source port number in the outer UDP header needs to be set. Selecting a random value allows core routers which are attached to Link Aggregation Groups (LAGs) to load-split the encapsulated packets across member links of such LAGs. Otherwise, core routers would see a single flow, since packets have a source address of the ITR, for packets which are originated by different EIDs at the source site. A suggested setting for the source port number computed by an ITR is a 5-tuple hash function on the inner header, as described above.

6.5.  Changing the Contents of EID-to-RLOC Mappings

Since the LISP architecture uses a caching scheme to retrieve and store EID-to-RLOC mappings, the only way an ITR can get a more up-to-date mapping is to re-request the mapping. However, the ITRs do not know when the mappings change and the ETRs do not keep track of who requested their mappings. For scalability reasons, we want to maintain this approach but need to provide a way for ETRs to change their mappings and inform the sites that are currently communicating with the ETR site using such mappings.

When a locator record is added to the end of a locator-set, it is easy to update mappings. We assume new mappings will maintain the same locator ordering as the old mapping but just have new locators appended to the end of the list. So some ITRs can have a new mapping while other ITRs have only an old mapping that is used until it times out. When an ITR has only an old mapping but detects bits set in the loc-reach-bits that correspond to locators beyond the list it has cached, it simply ignores them.

When a locator record is removed from a locator-set, ITRs that have the mapping cached will not use the removed locator because the xTRs will set the loc-reach-bit to 0.
So even if the locator is in the list, it will not be used. For new mapping requests, the xTRs can set the locator address to 0 as well as setting the corresponding loc-reach-bit to 0. This forces ITRs with old or new mappings to avoid using the removed locator.

If many changes occur to a mapping over a long period of time, one will find empty record slots in the middle of the locator-set and new records appended to the locator-set. At some point, it would be useful to compact the locator-set so the loc-reach-bit settings can be efficiently packed.

We propose here two approaches for locator-set compaction, one operational and the other a protocol mechanism. The operational approach uses a clock sweep method. The protocol approach uses the concept of Solicit-Map-Requests.

6.5.1.  Clock Sweep

The clock sweep approach uses planning in advance and the use of count-down TTLs to time out mappings that have already been cached. The default setting for an EID-to-RLOC mapping TTL is 24 hours. So there is a 24 hour window to time out old mappings. The following clock sweep procedure is used:

1.  24 hours before a mapping change is to take effect, a network administrator configures the ETRs at a site to start the clock sweep window.

2.  During the clock sweep window, ETRs continue to send Map-Reply messages with the current (unchanged) mapping records. The TTL for these mappings is set to 1 hour.

3.  24 hours later, all previous cache entries will have timed out, and any active cache entries will time out within 1 hour. During this 1 hour window the ETRs continue to send Map-Reply messages with the current (unchanged) mapping records with the TTL set to 1 minute.

4.  At the end of the 1 hour window, the ETRs will send Map-Reply messages with the new (changed) mapping records.
So any active caches can get the new mapping contents right away if not cached, or in 1 minute if they had the mapping cached.

6.5.2.  Solicit-Map-Request (SMR)

Soliciting a Map-Request is a selective way for xTRs, at the site where mappings change, to control the rate at which they receive requests for Map-Reply messages. SMRs are also used to tell remote ITRs to update the mappings they have cached.

Since the xTRs don't keep track of remote ITRs that have cached their mappings, they cannot tell exactly who needs the new mapping entries. So an xTR will solicit Map-Requests from sites it is currently sending encapsulated data to, and only from those sites. The xTRs can locally decide the algorithm for how often and to how many sites they send SMR messages.

An SMR message is simply a bit set in an encapsulated data packet (and a Map-Request message). When an ETR at a remote site decapsulates a data packet that has the SMR bit set, it can tell that a new Map-Request message is being solicited. Both the xTR that sends the SMR message and the site that acts on the SMR message MUST be rate-limited.

The following procedure shows how an SMR exchange occurs when a site is doing locator-set compaction for an EID-to-RLOC mapping:

1.  When the database mappings in an ETR change, the ITRs at the site begin to set the SMR bit in packets they encapsulate to the sites they communicate with.

2.  A remote xTR which decapsulates a packet with the SMR bit set will schedule sending a Map-Request message to the source locator address of the encapsulated packet. The nonce in the Map-Request is copied from the nonce in the encapsulated data packet that has the SMR bit set.

3.  The remote xTR retransmits the Map-Request slowly until it gets a Map-Reply while continuing to use the cached mapping.

4.  The ETRs at the site with the changed mapping will reply to the Map-Request with a Map-Reply message provided the Map-Request nonce matches the nonce from the SMR. The Map-Reply messages SHOULD be rate limited. This is important to avoid Map-Reply implosion.

5.  The ETRs at the site with the changed mapping record the fact that the site that sent the Map-Request has received the new mapping data in the mapping cache entry for the remote site, so the loc-reach-bits are reflective of the new mapping for packets going to the remote site. The ETR then stops sending packets with the SMR-bit set.

7.  Router Performance Considerations

LISP is designed to be friendly to hardware-based forwarding. By doing tunnel header prepending [RFC1955] and stripping instead of rewriting addresses, existing hardware can support the forwarding model with little or no modification. Where modifications are required, they should be limited to re-programming existing hardware rather than requiring expensive design changes to hard-coded algorithms in silicon.

A few implementation techniques can be used to incrementally implement LISP:

o  When a tunnel encapsulated packet is received by an ETR, the outer destination address may not be the address of the router. This makes it challenging for the control plane to get packets from the hardware. This may be mitigated by creating special FIB entries for the EID-prefixes of EIDs served by the ETR (those for which the router provides an RLOC translation). These FIB entries are marked with a flag indicating that control plane processing should be performed. Forwarding logic that tests for a particular IP protocol number value is not necessary. No changes to existing, deployed hardware should be needed to support this.

o  On an ITR, prepending a new IP header is as simple as adding more bytes to a MAC rewrite string and prepending the string as part of the outgoing encapsulation procedure. Many routers that support GRE tunneling [RFC2784] or 6to4 tunneling [RFC3056] can already support this action.

o  When a received packet's outer destination address contains an EID which is not intended to be forwarded on the routable topology (i.e. LISP 1.5), the source address of a data packet or the router interface with which the source is associated (the interface from which it was received) can be associated with a VRF (Virtual Routing/Forwarding), in which a different (i.e. non-congruent) topology can be used to find EID-to-RLOC mappings.

8.  Deployment Scenarios

This section will explore how and where ITRs and ETRs can be deployed and will discuss the pros and cons of each deployment scenario. There are two basic deployment trade-offs to consider: centralized versus distributed caches and flat, recursive, or re-encapsulating tunneling.

When deciding on centralized versus distributed caching, the following issues should be considered:

o  Are the tunnel routers spread out so that the caches are spread across all the memories of each router?

o  Should management "touch points" be minimized by choosing few tunnel routers, just enough for redundancy?

o  In general, using more ITRs doesn't increase management load, since caches are built and stored dynamically. On the other hand, more ETRs do require more management since EID-prefix-to-RLOC mappings need to be explicitly configured.

When deciding on flat, recursive, or re-encapsulation tunneling, the following issues should be considered:

o  Flat tunneling implements a single tunnel between source site and destination site.
This generally offers better paths between sources and destinations with a single tunnel path.

o  Recursive tunneling is when tunneled traffic is again further encapsulated in another tunnel, either to implement VPNs or to perform Traffic Engineering. When doing VPN-based tunneling, the site has some control since the site is prepending a new tunnel header. In the case of TE-based tunneling, the site may have control if it is prepending a new tunnel header, but if the site's ISP is doing the TE, then the site has no control. Recursive tunneling generally will result in suboptimal paths but with the benefit of steering traffic to parts of the network that have resources available.

o  The technique of re-encapsulation ensures that packets only require one tunnel header. So if a packet needs to be rerouted, it is first decapsulated by the ETR and then re-encapsulated with a new tunnel header using a new RLOC.

The next sub-sections will describe where tunnel routers can reside in the network.

8.1.  First-hop/Last-hop Tunnel Routers

By locating tunnel routers close to hosts, the EID-prefix set is at the granularity of an IP subnet. So at the expense of more EID-prefix-to-RLOC sets for the site, the caches in each tunnel router can remain relatively small. But caches always depend on the number of non-aggregated EID destination flows active through these tunnel routers.

With more tunnel routers doing encapsulation, control traffic grows as well: since the EID-granularity is greater, more Map-Requests and Map-Replies are traveling between more routers.

The advantage of placing the caches and databases at these stub routers is that the products deployed in this part of the network have better price-memory ratios than their core router counterparts.
Memory is typically less expensive in these devices and fewer routes are stored (only IGP routes). These devices tend to have excess capacity, both for forwarding and routing state.

LISP functionality can also be deployed in edge switches. These devices generally have layer-2 ports facing hosts and layer-3 ports facing the Internet. Spare capacity is often available in these devices as well.

8.2.  Border/Edge Tunnel Routers

Using customer-edge (CE) routers for tunnel endpoints allows the EID space associated with a site to be reachable via a small set of RLOCs assigned to the CE routers for that site.

This offers the opposite benefit of the first-hop/last-hop tunnel router scenario: the number of mapping entries and network management touch points are reduced, allowing better scaling.

One disadvantage is that less of the network's resources are used to reach host endpoints, thereby centralizing the point-of-failure domain and creating network choke points at the CE router.

Note that more than one CE router at a site can be configured with the same IP address. In this case an RLOC is an anycast address. This allows resilience between the CE routers. That is, if a CE router fails, traffic is automatically routed to the other routers using the same anycast address. However, this comes with the disadvantage that the site cannot control the entrance point when the anycast route is advertised out from all border routers.

8.3.  ISP Provider-Edge (PE) Tunnel Routers

Use of ISP PE routers as tunnel endpoint routers gives an ISP control over the location of the egress tunnel endpoints. That is, the ISP can decide if the tunnel endpoints are in the destination site (in either CE routers or last-hop routers within a site) or at other PE edges. The advantage of this case is that two or more tunnel headers can be avoided.
By having the PE be the first router on the path to encapsulate, it can choose a TE path first, and the ETR can decapsulate and re-encapsulate for a tunnel to the destination end site.

An obvious disadvantage is that the end site has no control over where its packets flow or the RLOCs used.

As mentioned in earlier sections, a combination of these scenarios is possible at the expense of extra packet header overhead. If both site and provider want control, then recursive or re-encapsulating tunnels are used.

9.  Traceroute Considerations

When a source host in a LISP site initiates a traceroute to a destination host in another LISP site, it is highly desirable for it to see the entire path. Since packets are encapsulated from ITR to ETR, the hop across the tunnel could be viewed as a single hop. However, LISP traceroute will provide the entire path so the user can see 3 distinct segments of the path from a source LISP host to a destination LISP host:

   Segment 1 (in source LISP site based on EIDs):

      source-host ---> first-hop ... next-hop ---> ITR

   Segment 2 (in the core network based on RLOCs):

      ITR ---> next-hop ... next-hop ---> ETR

   Segment 3 (in the destination LISP site based on EIDs):

      ETR ---> next-hop ... last-hop ---> destination-host

For segment 1 of the path, ICMP Time Exceeded messages are returned in the normal manner as they are today. The ITR performs a TTL decrement and test for 0 before encapsulating. So the ITR hop is seen by the traceroute source as an EID address (the address of the site-facing interface).

For segment 2 of the path, ICMP Time Exceeded messages are returned to the ITR because the TTL decrement to 0 is done on the outer header, so the destination of the ICMP messages is the ITR RLOC address, the source RLOC address of the encapsulated traceroute packet.
The ITR looks inside of the ICMP payload to inspect the traceroute source so it can return the ICMP message to the address of the traceroute client as well as retaining the core router IP address in the ICMP message. This is so the traceroute client can display the core router address (the RLOC address) in the traceroute output. The ETR returns its RLOC address and responds to the TTL decrement to 0 like the previous core routers did.

For segment 3, the next-hop router downstream from the ETR will be decrementing the TTL for the packet that was encapsulated, sent into the core, decapsulated by the ETR, and forwarded because it isn't the final destination. If the TTL is decremented to 0, any router on the path to the destination of the traceroute, including the next-hop router or destination, will send an ICMP Time Exceeded message to the source EID of the traceroute client. The ICMP message will be encapsulated by the local ITR and sent back to the ETR in the originated traceroute source site, where the packet will be delivered to the host.

9.1.  IPv6 Traceroute

IPv6 traceroute follows the procedure described above since the entire traceroute data packet is included in ICMP Time Exceeded message payload. Therefore, only the ITR needs to pay special attention for forwarding ICMP messages back to the traceroute source.

9.2.  IPv4 Traceroute

For IPv4 traceroute, we cannot follow the above procedure since IPv4 ICMP Time Exceeded messages only include the invoking IP header and 8 bytes that follow the IP header. Therefore, when a core router sends an IPv4 Time Exceeded message to an ITR, all the ITR has in the ICMP payload is the encapsulated header it prepended followed by a UDP header. The original invoking IP header, and therefore the identity of the traceroute source is lost.

The solution we propose to solve this problem is to cache traceroute IPv4 headers in the ITR and to match them up with corresponding IPv4 Time Exceeded messages received from core routers and the ETR. The ITR will use a circular buffer for caching the IPv4 and UDP headers of traceroute packets. It will select a 16-bit number as a key to find them later when the IPv4 Time Exceeded messages are received. When an ITR encapsulates an IPv4 traceroute packet, it will use the 16-bit number as the UDP source port in the encapsulating header. When the ICMP Time Exceeded message is returned to the ITR, the UDP header of the encapsulating header is present in the ICMP payload, thereby allowing the ITR to find the cached headers for the traceroute source. The ITR puts the cached headers in the payload and sends the ICMP Time Exceeded message to the traceroute source, retaining the source address of the original ICMP Time Exceeded message (a core router or the ETR of the site of the traceroute destination).

9.3.  Traceroute using Mixed Locators

When either an IPv4 traceroute or IPv6 traceroute is originated and the ITR encapsulates it in the other address family header, you cannot get all 3 segments of the traceroute. Segment 2 of the traceroute cannot be conveyed to the traceroute source since it is expecting addresses from intermediate hops in the same address format for the type of traceroute it originated. Therefore, in this case, segment 2 will make the tunnel look like one hop. All the ITR has to do to make this work is to not copy the inner TTL to the outer, encapsulating header's TTL when a traceroute packet is encapsulated using an RLOC from a different address family. This will cause no TTL decrement to 0 to occur in core routers between the ITR and ETR.

10.  Mobility Considerations

There are several kinds of mobility of which only some might be of concern to LISP. Essentially they are as follows.

10.1.  Site Mobility

A site wishes to change its attachment points to the Internet, and its LISP Tunnel Routers will have new RLOCs when it changes upstream providers. Changes in EID-RLOC mappings for sites are expected to be handled by configuration, outside of the LISP protocol.

10.2.  Slow Endpoint Mobility

An individual endpoint wishes to move, but is not concerned about maintaining session continuity. Renumbering is involved. LISP can help with the issues surrounding renumbering [RFC4192] [LISA96] by decoupling the address space used by a site from the address spaces used by its ISPs [RFC4984].

10.3.  Fast Endpoint Mobility

Fast endpoint mobility occurs when an endpoint moves relatively rapidly, changing its IP layer network attachment point. Maintenance of session continuity is a goal. This is where the Mobile IPv4 [RFC3344bis] and Mobile IPv6 [RFC3775] [RFC4866] mechanisms are used, and primarily where interactions with LISP need to be explored.

The problem is that as an endpoint moves, it may require changes to the mapping between its EID and a set of RLOCs for its new network location. When this is added to the overhead of mobile IP binding updates, some packets might be delayed or dropped.

In IPv4 mobility, when an endpoint is away from home, packets to it are encapsulated and forwarded via a home agent which resides in the home area the endpoint's address belongs to. The home agent will encapsulate and forward packets either directly to the endpoint or to a foreign agent which resides where the endpoint has moved to.
Packets from the endpoint may be sent directly to the correspondent node, may be sent via the foreign agent, or may be reverse-tunneled back to the home agent for delivery to the mobile node. As the mobile node's EID or available RLOC changes, LISP EID-to-RLOC mappings are required for communication between the mobile node and the home agent, whether via foreign agent or not. As a mobile endpoint changes networks, up to three LISP mapping changes may be required:

o  The mobile node moves from an old location to a new visited network location and notifies its home agent that it has done so. The Mobile IPv4 control packets the mobile node sends pass through one of the new visited network's ITRs, which needs an EID-RLOC mapping for the home agent.

o  The home agent might not have the EID-RLOC mappings for the mobile node's "care-of" address or its foreign agent in the new visited network, in which case it will need to acquire them.

o  When packets are sent directly to the correspondent node, it may be that no traffic has been sent from the new visited network to the correspondent node's network, and the new visited network's ITR will need to obtain an EID-RLOC mapping for the correspondent node's site.

In addition, if the IPv4 endpoint is sending packets from the new visited network using its original EID, then LISP will need to perform a route-returnability check on the new EID-RLOC mapping for that EID.

In IPv6 mobility, packets can flow directly between the mobile node and the correspondent node in either direction. The mobile node uses its "care-of" address (EID).
In this case, the route-returnability check would not be needed but one more LISP mapping lookup may be required instead:

o  As above, three mapping changes may be needed for the mobile node to communicate with its home agent and to send packets to the correspondent node.

o  In addition, another mapping will be needed in the correspondent node's ITR, in order for the correspondent node to send packets to the mobile node's "care-of" address (EID) at the new network location.

When both endpoints are mobile the number of potential mapping lookups increases accordingly.

As a mobile node moves there are not only mobility state changes in the mobile node, correspondent node, and home agent, but also state changes in the ITRs and ETRs for at least some EID-prefixes.

The goal is to support rapid adaptation, with little delay or packet loss for the entire system. Heuristics can be added to LISP to reduce the number of mapping changes required and to reduce the delay per mapping change. Also IP mobility can be modified to require fewer mapping changes. In order to increase overall system performance, there may be a need to reduce the optimization of one area in order to place fewer demands on another.

In LISP, one possibility is to "glean" information. When a packet arrives, the ETR could examine the EID-RLOC mapping and use that mapping for all outgoing traffic to that EID. It can do this after performing a route-returnability check, to ensure that the new network location does have an internal route to that endpoint. However, this does not cover the case where an ITR (the node assigned the RLOC) at the mobile-node location has been compromised.

Mobile IP packet exchange is designed for an environment in which all routing information is disseminated before packets can be forwarded.
   In order to allow the Internet to grow to support expected future
   use, we are moving to an environment where some information may have
   to be obtained after packets are in flight.  Modifications to IP
   mobility should be considered in order to optimize the behavior of
   the overall system.  Anything which decreases the number of new
   EID-RLOC mappings needed when a node moves, or maintains the validity
   of an EID-RLOC mapping for a longer time, is useful.

10.4.  Fast Network Mobility

   In addition to endpoints, a network can be mobile, possibly changing
   xTRs.  A "network" can be as small as a single router and as large as
   a whole site.  This is different from site mobility in that it is
   fast and possibly short-lived, but different from endpoint mobility
   in that a whole prefix is changing RLOCs.  However, the mechanisms
   are the same and there is no new overhead in LISP.  A Map-Request for
   any endpoint will return a binding for the entire mobile prefix.

   If mobile networks become a more common occurrence, it may be useful
   to revisit the design of the mapping service and allow for dynamic
   updates of the database.

   The issue of interactions between mobility and LISP needs to be
   explored further.  Specific improvements to the entire system will
   depend on the details of mapping mechanisms.  Mapping mechanisms
   should be evaluated on how well they support session continuity for
   mobile nodes.

11.  Multicast Considerations

   A multicast group address, as defined in the original Internet
   architecture, is an identifier of a grouping of topologically
   independent receiver host locations.  The address encoding itself
   does not determine the location of the receiver(s).  The multicast
   routing protocol, and the network-based state the protocol creates,
   determine where the receivers are located.

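
   As an illustrative sketch only (it is not part of this
   specification), the following Python fragment shows the distinction
   drawn above: the encoding of an address reveals that it is a group
   address, while the location of receivers comes solely from state
   created by the multicast routing protocol.  The table mroute_state,
   the interface names, the example addresses, and the function
   receiver_interfaces are hypothetical and introduced here for
   illustration.

```python
import ipaddress

# Hypothetical multicast routing state, of the kind a protocol such as
# PIM would build: group address -> interfaces leading toward receivers.
# The addresses and interface names are made up for this example.
mroute_state = {
    ipaddress.ip_address("239.1.1.1"): {"eth1", "eth3"},
}


def receiver_interfaces(dst):
    """Return the interfaces toward receivers of group address dst."""
    addr = ipaddress.ip_address(dst)
    # The encoding only says "this is a group"; it carries no location.
    if not addr.is_multicast:
        raise ValueError(f"{dst} is not a multicast group address")
    # Location comes from protocol-created state, not from the address.
    return mroute_state.get(addr, set())
```

   A group with no entry in the table yields an empty set: nothing in
   the address itself indicates where its receivers might be.
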
   In the context of LISP, a multicast group address is both an EID and
   a Routing Locator.  Therefore, no specific semantic or action needs
   to be taken for a destination address, as it would appear in an IP
   header.  Thus, a group address that appears in an inner IP
   header built by a source host will be used as the destination EID.
   The outer IP header (the destination Routing Locator address),
   prepended by a LISP router, will use the same group address as the
   destination Routing Locator.

   Having said that, only the source EID and source Routing Locator
   need to be dealt with.  Therefore, an ITR merely needs to put its
   own IP address in the source Routing Locator field when prepending
   the outer IP header.  This source Routing Locator address, like any
   other Routing Locator address, MUST be globally routable.

   Therefore, an EID-to-RLOC mapping does not need to be performed by an
   ITR when a received data packet is a multicast data packet or when
   processing a source-specific Join (either by IGMPv3 or PIM).  But the
   source Routing Locator is decided by the multicast routing protocol
   in a receiver site.  That is, an EID to Routing Locator translation
   is done at control-time.

   Another approach is to have the ITR not encapsulate a multicast
   packet and allow the host-built packet to flow into the core even
   if the source address is allocated out of the EID namespace.  If the
   RPF-Vector TLV [RPFV] is used by PIM in the core, then core routers
   can RPF to the ITR (the Locator address which is injected into core
   routing) rather than the host source address (the EID address which
   is not injected into core routing).

   To avoid any EID-based multicast state in the network core, the first
   approach is chosen for LISP-Multicast.  Details for LISP-Multicast
   and Interworking with non-LISP sites are described in specification
   [MLISP].

12.  Security Considerations

   It is believed that most of the security mechanisms will be part of
   the mapping database service when using control plane procedures for
   obtaining EID-to-RLOC mappings.  For data plane triggered mappings,
   as described in this specification, protection is provided against
   ETR spoofing by using Return-Routability mechanisms evidenced by the
   use of a 4-byte Nonce field in the LISP encapsulation header.  The
   nonce, coupled with the ITR accepting only solicited Map-Replies,
   goes a long way toward providing decent authentication.

   LISP does not rely on a PKI or a more heavyweight authentication
   system.  These systems challenge the scalability of LISP, which was
   a primary design goal.

   DoS attack prevention will depend on implementations rate-limiting
   Map-Requests and Map-Replies to the control plane as well as
   rate-limiting the number of data-triggered Map-Replies.

13.  Prototype Plans and Status

   The operator community has requested that the IETF take a practical
   approach to solving the scaling problems associated with global
   routing state growth.  This document offers a simple solution which
   is intended for use in a pilot program to gain experience in working
   on this problem.

   The authors hope that publishing this specification will allow the
   rapid implementation of multiple vendor prototypes and deployment on
   a small scale.  Doing this will help the community:

   o  Decide whether a new EID-to-RLOC mapping database infrastructure
      is needed or if a simple, UDP-based, data-triggered approach is
      flexible and robust enough.

   o  Experiment with provider-independent assignment of EIDs while at
      the same time decreasing the size of DFZ routing tables through
      the use of topologically-aligned, provider-based RLOCs.

   o  Determine whether multiple levels of tunneling can be used by ISPs
      to achieve their Traffic Engineering goals while simultaneously
      removing the more specific routes currently injected into the
      global routing system for this purpose.

   o  Experiment with mobility to determine if both acceptable
      convergence and session continuity properties can be scalably
      implemented to support both individual device roaming and site
      service provider changes.

   Here is a rough set of milestones:

   1.  This draft will be the draft for interoperable implementations to
       code against.  Interoperable implementations will be ready at
       the beginning of 2009.

   2.  Continue pilot deployment using LISP-ALT as the database mapping
       mechanism.

   3.  Continue prototyping and studying other database lookup schemes,
       be it DNS, DHTs, CONS, ALT, NERD, or other mechanisms.

   4.  Implement the LISP Multicast draft [MLISP].

   5.  Research more on how policy affects what gets returned in a
       Map-Reply from an ETR.

   6.  Continue to experiment with mixed locator-sets to understand how
       LISP can help the IPv4 to IPv6 transition.

   7.  Add more robustness to locator reachability between LISP sites.

   As of this writing, the following has been accomplished:

   1.  A unit- and system-tested software switching implementation has
       been completed on cisco NX-OS for this draft for both IPv4 and
       IPv6 EIDs using a mixed locator-set of IPv4 and IPv6 locators.

   2.  A unit- and system-tested software switching implementation on
       cisco NX-OS has been completed for draft [ALT].

   3.  A unit- and system-tested software switching implementation on
       cisco NX-OS has been completed for draft [INTERWORK].  Support
       for IPv4 translation is provided and PTR support for IPv4 and
       IPv6 is provided.

   4.  The cisco NX-OS implementation supports an experimental mechanism
       for slow mobility.

   5.  Dave Meyer, Vince Fuller, Darrel Lewis, Greg Shepherd, and Andrew
       Partan continue to test all the features described above on a
       dual-stack infrastructure.

   6.  Darrel Lewis and Dave Meyer have deployed both LISP translation
       and LISP PTR support in the pilot network.  Point your browser to
       http://www.lisp4.net to see translation happening in action so
       your non-LISP site can access a web server in a LISP site.

   7.  Soon http://www.lisp6.net will work where your IPv6 LISP site can
       talk to an IPv6 web server in a LISP site by using mixed
       address-family based locators.

   8.  A public domain implementation of LISP is underway.  See
       [OPENLISP] for details.

   9.  A cisco IOS implementation is underway which currently supports
       IPv4 encapsulation and decapsulation features.

   If you are interested in writing a LISP implementation, testing any
   of the LISP implementations, or being part of the LISP pilot program,
   please contact lisp@ietf.org.

14.  References

14.1.  Normative References

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              August 1980.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC1498]  Saltzer, J., "On the Naming and Binding of Network
              Destinations", RFC 1498, August 1993.

   [RFC1955]  Hinden, R., "New Scheme for Internet Routing and
              Addressing (ENCAPS) for IPNG", RFC 1955, June 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2434]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 2434,
              October 1998.

   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
              March 2000.

   [RFC3056]  Carpenter, B. and K. Moore, "Connection of IPv6 Domains
              via IPv4 Clouds", RFC 3056, February 2001.

   [RFC3775]  Johnson, D., Perkins, C., and J. Arkko, "Mobility Support
              in IPv6", RFC 3775, June 2004.

   [RFC4423]  Moskowitz, R. and P. Nikander, "Host Identity Protocol
              (HIP) Architecture", RFC 4423, May 2006.

   [RFC4866]  Arkko, J., Vogt, C., and W. Haddad, "Enhanced Route
              Optimization for Mobile IPv6", RFC 4866, May 2007.

   [RFC4984]  Meyer, D., Zhang, L., and K. Fall, "Report from the IAB
              Workshop on Routing and Addressing", RFC 4984,
              September 2007.

14.2.  Informative References

   [AFI]      IANA, "Address Family Indicators (AFIs)", ADDRESS FAMILY
              NUMBERS http://www.iana.org/numbers.html, February 2007.

   [ALT]      Farinacci, D., Fuller, V., and D. Meyer, "LISP Alternative
              Topology (LISP-ALT)", draft-fuller-lisp-alt-03.txt (work
              in progress), October 2008.

   [APT]      Jen, D., Meisel, M., Massey, D., Wang, L., Zhang, B., and
              L. Zhang, "APT: A Practical Transit Mapping Service",
              draft-jen-apt-00.txt (work in progress), July 2007.

   [CHIAPPA]  Chiappa, J., "Endpoints and Endpoint names: A Proposed
              Enhancement to the Internet Architecture",
              Internet-Draft
              http://www.chiappa.net/~jnc/tech/endpoints.txt, 1999.

   [CONS]     Farinacci, D., Fuller, V., and D. Meyer, "LISP-CONS: A
              Content distribution Overlay Network Service for LISP",
              draft-meyer-lisp-cons-03.txt (work in progress),
              November 2007.

   [DHTs]     Ratnasamy, S., Shenker, S., and I. Stoica, "Routing
              Algorithms for DHTs: Some Open Questions", PDF
              file http://www.cs.rice.edu/Conferences/IPTPS02/174.pdf.

   [GSE]      "GSE - An Alternate Addressing Architecture for IPv6",
              draft-ietf-ipngwg-gseaddr-00.txt (work in progress), 1997.

   [INTERWORK]
              Lewis, D., Meyer, D., and D. Farinacci, "Interworking LISP
              with IPv4 and IPv6", draft-lewis-lisp-interworking-01.txt
              (work in progress), July 2008.

   [LISA96]   Lear, E., Katinsky, J., Coffin, J., and D. Tharp,
              "Renumbering: Threat or Menace?", Usenix, September 1996.

   [LISP1]    Farinacci, D., Oran, D., Fuller, V., and J. Schiller,
              "Locator/ID Separation Protocol (LISP1) [Routable ID
              Version]",
              Slide-set http://www.dinof.net/~dino/ietf/lisp1.ppt,
              October 2006.

   [LISP2]    Farinacci, D., Oran, D., Fuller, V., and J. Schiller,
              "Locator/ID Separation Protocol (LISP2) [DNS-based
              Version]",
              Slide-set http://www.dinof.net/~dino/ietf/lisp2.ppt,
              November 2006.

   [LISPDHT]  Mathy, L., Iannone, L., and O. Bonaventure, "LISP-DHT:
              Towards a DHT to map identifiers onto locators",
              draft-mathy-lisp-dht-00.txt (work in progress),
              February 2008.

   [LOC-ID-ARCH]
              Meyer, D. and D. Lewis, "Architectural Implications of
              Locator/ID Separation",
              draft-meyer-loc-id-implications-00.txt (work in progress),
              December 2008.

   [MLISP]    Farinacci, D., Meyer, D., Zwiebel, J., and S. Venaas,
              "LISP for Multicast Environments",
              draft-farinacci-lisp-multicast-01.txt (work in progress),
              November 2008.

   [NERD]     Lear, E., "NERD: A Not-so-novel EID to RLOC Database",
              draft-lear-lisp-nerd-02.txt (work in progress),
              January 2008.

   [OPENLISP]
              Iannone, L. and O. Bonaventure, "OpenLISP Implementation
              Report", draft-iannone-openlisp-implementation-01.txt
              (work in progress), July 2008.

   [RADIR]    Narten, T., "Routing and Addressing Problem Statement",
              draft-narten-radir-problem-statement-00.txt (work in
              progress), July 2007.

   [RFC3344bis]
              Perkins, C., "IP Mobility Support for IPv4, revised",
              draft-ietf-mip4-rfc3344bis-05 (work in progress),
              July 2007.

   [RFC4192]  Baker, F., Lear, E., and R. Droms, "Procedures for
              Renumbering an IPv6 Network without a Flag Day", RFC 4192,
              September 2005.

   [RPFV]     Wijnands, IJ., Boers, A., and E. Rosen, "The RPF Vector
              TLV", draft-ietf-pim-rpf-vector-03.txt (work in progress),
              October 2006.

   [RPMD]     Handley, M., Huici, F., and A. Greenhalgh, "RPMD: Protocol
              for Routing Protocol Meta-data Dissemination",
              draft-handley-p2ppush-unpublished-2007726.txt (work in
              progress), July 2007.

   [SHIM6]    Nordmark, E. and M. Bagnulo, "Level 3 multihoming shim
              protocol", draft-ietf-shim6-proto-06.txt (work in
              progress), October 2006.

Appendix A.  Acknowledgments

   A special and appreciative thank you goes to Noel Chiappa for
   providing architectural impetus over the past decades on separation
   of location and identity, as well as detailed review of the LISP
   architecture and documents, coupled with enthusiasm for making LISP a
   practical and incremental transition for the Internet.

   The authors would like to gratefully acknowledge many people who have
   contributed discussion and ideas to the making of this proposal.
   They include Darrel Lewis, Andrew Partan, John Zwiebel, Jason
   Schiller, Lixia Zhang, Dorian Kim, Peter Schoenmaker, Vijay Gill,
   Geoff Huston, David Conrad, Mark Handley, Ron Bonica, Ted Seely, Mark
   Townsley, Chris Morrow, Brian Weis, Dave McGrew, Peter Lothberg, Dave
   Thaler, Eliot Lear, Shane Amante, Ved Kafle, Olivier Bonaventure,
   Luigi Iannone, Robin Whittle, Brian Carpenter, Joel Halpern, Roger
   Jorgensen, Ran Atkinson, Stig Venaas, Iljitsch van Beijnum, Roland
   Bless, Dana Blair, Bill Lynch, Marc Woolward, Damien Saucez, and
   Damian Lezama.

   In particular, we would like to thank Dave Meyer for his clever
   suggestion for the name "LISP". ;-)

Authors' Addresses

   Dino Farinacci
   cisco Systems
   Tasman Drive
   San Jose, CA  95134
   USA

   Email: dino@cisco.com


   Vince Fuller
   cisco Systems
   Tasman Drive
   San Jose, CA  95134
   USA

   Email: vaf@cisco.com


   Dave Oran
   cisco Systems
   7 Ladyslipper Lane
   Acton, MA
   USA

   Email: oran@cisco.com


   Dave Meyer
   cisco Systems
   170 Tasman Drive
   San Jose, CA
   USA

   Email: dmm@cisco.com


   Scott Brim
   cisco Systems
   170 Tasman Drive
   San Jose, CA
   USA

   Email: sbrim@cisco.com