Network Working Group                                       D. Farinacci
Internet-Draft                                                 V. Fuller
Intended status: Experimental                                   D. Meyer
Expires: November 27, 2009                                      D. Lewis
                                                           cisco Systems
                                                            May 26, 2009


                 Locator/ID Separation Protocol (LISP)
                         draft-ietf-lisp-00.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on November 27, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This draft describes a simple, incremental, network-based protocol to
   implement separation of Internet addresses into Endpoint Identifiers
   (EIDs) and Routing Locators (RLOCs).  This mechanism requires no
   changes to host stacks and no major changes to existing database
   infrastructures.  The proposed protocol can be implemented in a
   relatively small number of routers.

   This proposal was stimulated by the problem statement effort at the
   Amsterdam IAB Routing and Addressing Workshop (RAWS), which took
   place in October 2006.

Table of Contents

   1.  Requirements Notation
   2.  Introduction
   3.  Definition of Terms
   4.  Basic Overview
     4.1.  Packet Flow Sequence
   5.  Tunneling Details
     5.1.  LISP IPv4-in-IPv4 Header Format
     5.2.  LISP IPv6-in-IPv6 Header Format
     5.3.  Tunnel Header Field Descriptions
     5.4.  Dealing with Large Encapsulated Packets
       5.4.1.  A Stateless Solution to MTU Handling
       5.4.2.  A Stateful Solution to MTU Handling
   6.  EID-to-RLOC Mapping
     6.1.  LISP IPv4 and IPv6 Control Plane Packet Formats
       6.1.1.  LISP Packet Type Allocations
       6.1.2.  Map-Request Message Format
       6.1.3.  EID-to-RLOC UDP Map-Request Message
       6.1.4.  Map-Reply Message Format
       6.1.5.  EID-to-RLOC UDP Map-Reply Message
       6.1.6.  Map-Register Message Format
     6.2.  Routing Locator Selection
     6.3.  Routing Locator Reachability
     6.4.  Routing Locator Hashing
     6.5.  Changing the Contents of EID-to-RLOC Mappings
       6.5.1.  Clock Sweep
       6.5.2.  Solicit-Map-Request (SMR)
   7.  Router Performance Considerations
   8.  Deployment Scenarios
     8.1.  First-hop/Last-hop Tunnel Routers
     8.2.  Border/Edge Tunnel Routers
     8.3.  ISP Provider-Edge (PE) Tunnel Routers
   9.  Traceroute Considerations
     9.1.  IPv6 Traceroute
     9.2.  IPv4 Traceroute
     9.3.  Traceroute using Mixed Locators
   10. Mobility Considerations
     10.1.  Site Mobility
     10.2.  Slow Endpoint Mobility
     10.3.  Fast Endpoint Mobility
     10.4.  Fast Network Mobility
   11. Multicast Considerations
   12. Security Considerations
   13. Prototype Plans and Status
   14. References
     14.1.  Normative References
     14.2.  Informative References
   Appendix A.  Acknowledgments
   Authors' Addresses

1.  Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Introduction

   Many years of discussion about the current IP routing and addressing
   architecture have noted that its use of a single numbering space (the
   "IP address") for both host transport session identification and
   network routing creates scaling issues (see [CHIAPPA] and [RFC1498]).
   A number of scaling benefits would be realized by separating the
   current IP address into separate spaces for Endpoint Identifiers
   (EIDs) and Routing Locators (RLOCs); among them are:

   1.  Reduction of routing table size in the "default-free zone" (DFZ).
       Use of a separate numbering space for RLOCs will allow them to be
       assigned topologically (in today's Internet, RLOCs would be
       assigned by providers at client network attachment points),
       greatly improving aggregation and reducing the number of
       globally-visible, routable prefixes.

   2.  More cost-effective multihoming for sites that connect to
       different service providers where they can control their own
       policies for packet flow into the site without using extra
       routing table resources of core routers.

   3.  Easing of renumbering burden when clients change providers.
       Because host EIDs are numbered from a separate, non-provider-
       assigned and non-topologically-bound space, they do not need to
       be renumbered when a client site changes its attachment points to
       the network.

   4.  Traffic engineering capabilities that can be performed by network
       elements and do not depend on injecting additional state into the
       routing system.  This will fall out of the mechanism that is used
       to implement the EID/RLOC split (see Section 4).

   5.  Mobility without address changing.  Existing mobility mechanisms
       will be able to work in a locator/ID separation scenario.
       It will be possible for a host (or a collection of hosts) to move
       to a different point in the network topology either retaining its
       home-based address or acquiring a new address based on the new
       network location.  A new network location could be a physically
       different point in the network topology or the same physical
       point of the topology with a different provider.

   This draft describes protocol mechanisms to achieve the desired
   functional separation.  For flexibility, the mechanism used for
   forwarding packets is decoupled from that used to determine EID to
   RLOC mappings.  This document covers the former.  For the latter, see
   [CONS], [ALT], [RPMD], and [NERD].  This work is in response to and
   intended to address the problem statement that came out of the RAWS
   effort [RFC4984].

   The Routing and Addressing problem statement can be found in [RADIR].

   This draft focuses on a router-based solution.  Building the solution
   into the network will facilitate incremental deployment of the
   technology on the Internet.  Note that while the detailed protocol
   specification and examples in this document assume IP version 4
   (IPv4), there is nothing in the design that precludes use of the same
   techniques and mechanisms for IPv6.  It should be possible for IPv4
   packets to use IPv6 RLOCs and for IPv6 EIDs to be mapped to IPv4
   RLOCs.

   Related work on host-based solutions is described in Shim6 [SHIM6]
   and HIP [RFC4423].  Related work on a router-based solution is
   described in [GSE].  This draft attempts to not compete or overlap
   with such solutions and the proposed protocol changes are expected to
   complement a host-based mechanism when Traffic Engineering
   functionality is desired.

   Some of the design goals of this proposal include:

   1.  Require no hardware or software changes to end-systems (hosts).

   2.  Minimize required changes to Internet infrastructure.

   3.  Be incrementally deployable.

   4.  Require no router hardware changes.

   5.  Minimize the number of routers which have to be modified.  In
       particular, most customer site routers and no core routers
       require changes.

   6.  Minimize router software changes in those routers which are
       affected.

   7.  Avoid or minimize packet loss when EID-to-RLOC mappings need to
       be performed.

   There are 4 variants of LISP, which differ along a spectrum of strong
   to weak dependence on the topological nature and possible need for
   routability of EIDs.  The variants are:

   LISP 1:  uses EIDs that are routable through the RLOC topology for
      bootstrapping EID-to-RLOC mappings.  [LISP1] This was intended as
      a prototyping mechanism for early protocol implementation.  It is
      now deprecated and should not be deployed.

   LISP 1.5:  uses EIDs that are routable for bootstrapping EID-to-RLOC
      mappings; such routing is via a separate topology.

   LISP 2:  uses EIDs that are not routable and EID-to-RLOC mappings are
      implemented within the DNS.  [LISP2]

   LISP 3:  uses non-routable EIDs that are used as lookup keys for a
      new EID-to-RLOC mapping database.  Use of Distributed Hash Tables
      [DHTs] [LISPDHT] to implement such a database would be an area to
      explore.  Other examples of new mapping database services are
      [CONS], [ALT], [RPMD], [NERD], and [APT].

   This document focuses on the LISP 1.5 and LISP 3 variants, both of
   which rely on a router-based distributed cache and database for
   EID-to-RLOC mappings.  The LISP 1.0 mechanism works but does not
   allow reduction of routing information in the default-free-zone of
   the Internet.  The LISP 2 mechanisms are put on hold and may never
   come to fruition since it is not architecturally pure to have routing
   depend on directory and directory depend on routing.  The LISP 3
   mechanisms will be documented elsewhere but may use the control-plane
   options specified in this specification.
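
   As an illustration only, the EID-to-RLOC mapping that the LISP 1.5
   and LISP 3 variants rely on can be pictured as a table keyed by
   EID-prefix.  The Python sketch below assumes a longest-prefix-match
   lookup.  The class name MappingCache and the example addresses
   (taken from documentation address ranges) are hypothetical and are
   not defined by this specification.

```python
import ipaddress


class MappingCache:
    """Toy EID-to-RLOC table keyed by EID-prefix (hypothetical helper)."""

    def __init__(self):
        # Maps an ip_network (the EID-prefix) to a list of RLOC strings.
        self._entries = {}

    def add(self, eid_prefix, rlocs):
        self._entries[ipaddress.ip_network(eid_prefix)] = list(rlocs)

    def lookup(self, eid):
        """Return the RLOCs of the longest matching EID-prefix, or None.

        Longest-prefix match is an assumption made for this sketch.
        """
        addr = ipaddress.ip_address(eid)
        best = None
        for prefix in self._entries:
            # Skip prefixes of the other address family or non-matches.
            if prefix.version != addr.version or addr not in prefix:
                continue
            if best is None or prefix.prefixlen > best.prefixlen:
                best = prefix
        return None if best is None else self._entries[best]


cache = MappingCache()
cache.add("192.0.2.0/24", ["198.51.100.1", "203.0.113.1"])
cache.add("192.0.2.128/25", ["203.0.113.9"])

print(cache.lookup("192.0.2.10"))   # RLOCs of the /24 entry
print(cache.lookup("192.0.2.200"))  # RLOCs of the more specific /25
print(cache.lookup("192.0.3.1"))    # None: no mapping is known
```

   A miss (None) in this sketch stands for the case where no mapping
   has been learned yet; how a mapping is then obtained is outside the
   scope of the example.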

3.  Definition of Terms

   Provider Independent (PI) Addresses:  an address block assigned from
      a pool where blocks are not associated with any particular
      location in the network (e.g. from a particular service provider),
      and is therefore not topologically aggregatable in the routing
      system.

   Provider Assigned (PA) Addresses:  a block of IP addresses that are
      assigned to a site by each service provider to which a site
      connects.  Typically, each block is a sub-block of a service
      provider CIDR block and is aggregated into the larger block before
      being advertised into the global Internet.  Traditionally, IP
      multihoming has been implemented by each multi-homed site
      acquiring its own, globally-visible prefix.  LISP uses only
      topologically-assigned and aggregatable address blocks for RLOCs,
      eliminating this demonstrably non-scalable practice.

   Routing Locator (RLOC):  the IPv4 or IPv6 address of an egress
      tunnel router (ETR).  It is the output of an EID-to-RLOC mapping
      lookup.  An EID maps to one or more RLOCs.  Typically, RLOCs are
      numbered from topologically-aggregatable blocks that are assigned
      to a site at each point to which it attaches to the global
      Internet; where the topology is defined by the connectivity of
      provider networks, RLOCs can be thought of as PA addresses.
      Multiple RLOCs can be assigned to the same ETR device or to
      multiple ETR devices at a site.

   Endpoint ID (EID):  a 32-bit (for IPv4) or 128-bit (for IPv6) value
      used in the source and destination address fields of the first
      (most inner) LISP header of a packet.  The host obtains a
      destination EID the same way it obtains a destination address
      today, for example through a DNS lookup or SIP exchange.  The
      source EID is obtained via existing mechanisms used to set a
      host's "local" IP address.
      An EID is allocated to a host from an EID-prefix block associated
      with the site where the host is located.  An EID can be used by a
      host to refer to other hosts.  EIDs MUST NOT be used as LISP
      RLOCs.  Note that EID blocks may be assigned in a hierarchical
      manner, independent of the network topology, to facilitate scaling
      of the mapping database.  In addition, an EID block assigned to a
      site may have site-local structure (subnetting) for routing within
      the site; this structure is not visible to the global routing
      system.

   EID-prefix:  A power-of-2 block of EIDs which are allocated to a
      site by an address allocation authority.  EID-prefixes are
      associated with a set of RLOC addresses which make up a "database
      mapping".  EID-prefix allocations can be broken up into smaller
      blocks when an RLOC set is to be associated with the smaller EID-
      prefix.  A globally routed address block (whether PI or PA) is not
      an EID-prefix.  However, a globally routed address block may be
      removed from global routing and reused as an EID-prefix.  A site
      that receives an explicitly allocated EID-prefix may not use that
      EID-prefix as a globally routed prefix assigned to RLOCs.

   End-system:  is an IPv4 or IPv6 device that originates packets with
      a single IPv4 or IPv6 header.  The end-system supplies an EID
      value for the destination address field of the IP header when
      communicating globally (i.e. outside of its routing domain).  An
      end-system can be a host computer, a switch or router device, or
      any network appliance.

   Ingress Tunnel Router (ITR):  a router which accepts an IP packet
      with a single IP header (more precisely, an IP packet that does
      not contain a LISP header).  The router treats this "inner" IP
      destination address as an EID and performs an EID-to-RLOC mapping
      lookup.
      The router then prepends an "outer" IP header with one of its
      globally-routable RLOCs in the source address field and the
      result of the mapping lookup in the destination address field.
      Note that this destination RLOC may be an intermediate, proxy
      device that has better knowledge of the EID-to-RLOC mapping closer
      to the destination EID.  In general, an ITR receives IP packets
      from site end-systems on one side and sends LISP-encapsulated IP
      packets toward the Internet on the other side.

      Specifically, when a service provider prepends a LISP header for
      Traffic Engineering purposes, the router that does this is also
      regarded as an ITR.  The outer RLOC the ISP ITR uses can be based
      on the outer destination address (the originating ITR's supplied
      RLOC) or the inner destination address (the originating host's
      supplied EID).

   TE-ITR:  is an ITR that is deployed in a service provider network
      that prepends an additional LISP header for Traffic Engineering
      purposes.

   Egress Tunnel Router (ETR):  a router that accepts an IP packet
      where the destination address in the "outer" IP header is one of
      its own RLOCs.  The router strips the "outer" header and forwards
      the packet based on the next IP header found.  In general, an ETR
      receives LISP-encapsulated IP packets from the Internet on one
      side and sends decapsulated IP packets to site end-systems on the
      other side.  ETR functionality does not have to be limited to a
      router device.  A server host can be the endpoint of a LISP tunnel
      as well.

   TE-ETR:  is an ETR that is deployed in a service provider network
      that strips an outer LISP header for Traffic Engineering purposes.

   xTR:  is a reference to an ITR or ETR when direction of data flow is
      not part of the context description. xTR refers to the router that
      is the tunnel endpoint.  Used synonymously with the term "Tunnel
      Router".
      For example, "An xTR can be located at the Customer Edge (CE)
      router", meaning both ITR and ETR functionality is at the CE
      router.

   EID-to-RLOC Cache:  a short-lived, on-demand table in an ITR that
      stores, tracks, and is responsible for timing-out and otherwise
      validating EID-to-RLOC mappings.  This cache is distinct from the
      full "database" of EID-to-RLOC mappings; it is dynamic, local to
      the ITR(s), and relatively small, while the database is
      distributed, relatively static, and much more global in scope.

   EID-to-RLOC Database:  a global distributed database that contains
      all known EID-prefix to RLOC mappings.  Each potential ETR
      typically contains a small piece of the database: the EID-to-RLOC
      mappings for the EID prefixes "behind" the router.  These map to
      one of the router's own, globally-visible, IP addresses.

   Recursive Tunneling:  when a packet has more than one LISP IP
      header.  Additional layers of tunneling may be employed to
      implement traffic engineering or other re-routing as needed.  When
      this is done, an additional "outer" LISP header is added and the
      original RLOCs are preserved in the "inner" header.  Any
      references to tunnels in this specification refer to dynamic
      encapsulating tunnels; they are never statically configured.

   Reencapsulating Tunnels:  when a packet has no more than one LISP IP
      header (two IP headers total) and when it needs to be diverted to
      a new RLOC, an ETR can decapsulate the packet (remove the LISP
      header) and prepend a new tunnel header, with a new RLOC, on to
      the packet.  Doing this allows a packet to be re-routed by the re-
      encapsulating router without adding the overhead of additional
      tunnel headers.  Any references to tunnels in this specification
      refer to dynamic encapsulating tunnels; they are never statically
      configured.

   LISP Header:  a term used in this document to refer to the outer
      IPv4 or IPv6 header, a UDP header, and a LISP header that an ITR
      prepends or an ETR strips.

   Address Family Indicator (AFI):  a term used to describe an address
      encoding in a packet.  An address family currently pertains to an
      IPv4 or IPv6 address.  See [AFI] for details.

   Negative Mapping Entry:  also known as a negative cache entry, is an
      EID-to-RLOC entry where an EID-prefix is advertised or stored with
      no RLOCs.  That is, the locator-set for the EID-to-RLOC entry is
      empty or has an encoded locator count of 0.  This type of entry
      could be used to describe a prefix from a non-LISP site, which is
      explicitly not in the mapping database.

   Data Probe:  a LISP-encapsulated data packet where the inner header
      destination address equals the outer header destination address,
      used to trigger a Map-Reply by a decapsulating ETR.  In addition,
      the original packet is decapsulated and delivered to the
      destination host.  A Data Probe is used in some of the mapping
      database designs to "probe" or request a Map-Reply from an ETR; in
      other cases, Map-Requests are used.  See each mapping database
      design for details.

4.  Basic Overview

   One key concept of LISP is that end-systems (hosts) operate the same
   way they do today.  The IP addresses that hosts use for tracking
   sockets, connections, and for sending and receiving packets do not
   change.  In LISP terminology, these IP addresses are called Endpoint
   Identifiers (EIDs).

   Routers continue to forward packets based on IP destination
   addresses.  When a packet is LISP encapsulated, these addresses are
   referred to as Routing Locators (RLOCs).  Most routers along a path
   between two hosts will not change; they continue to perform routing/
   forwarding lookups on the destination addresses.
   For routers between the source host and the ITR as well as routers
   from the ETR to the destination host, the destination address is an
   EID.  For the routers between the ITR and the ETR, the destination
   address is an RLOC.

   This design introduces "Tunnel Routers", which prepend LISP headers
   on host-originated packets and strip them prior to final delivery to
   their destination.  The IP addresses in this "outer header" are
   RLOCs.  During end-to-end packet exchange between two Internet hosts,
   an ITR prepends a new LISP header to each packet and an egress tunnel
   router strips the new header.  The ITR performs EID-to-RLOC lookups
   to determine the routing path to the ETR, which has the RLOC as one
   of its IP addresses.

   Some basic rules governing LISP are:

   o  End-systems (hosts) only send to addresses which are EIDs.  They
      don't know addresses are EIDs versus RLOCs but assume packets get
      to LISP routers, which in turn, deliver packets to the destination
      the end-system has specified.

   o  EIDs are always IP addresses assigned to hosts.

   o  LISP routers mostly deal with Routing Locator addresses.  See
      details later in Section 4.1 to clarify what is meant by "mostly".

   o  RLOCs are always IP addresses assigned to routers; preferably,
      topologically-oriented addresses from provider CIDR blocks.

   o  When a router originates packets it may use as a source address
      either an EID or RLOC.  When acting as a host (e.g. when
      terminating a transport session such as SSH, TELNET, or SNMP), it
      may use an EID that is explicitly assigned for that purpose.  An
      EID that identifies the router as a host MUST NOT be used as an
      RLOC; an EID is only routable within the scope of a site.
      A typical BGP configuration might demonstrate this "hybrid"
      EID/RLOC usage where a router could use its "host-like" EID to
      terminate iBGP sessions to other routers in a site while at the
      same time using RLOCs to terminate eBGP sessions to routers
      outside the site.

   o  EIDs are not expected to be usable for global end-to-end
      communication in the absence of an EID-to-RLOC mapping operation.
      They are expected to be used locally for intra-site communication.

   o  EID prefixes are likely to be hierarchically assigned in a manner
      which is optimized for administrative convenience and to
      facilitate scaling of the EID-to-RLOC mapping database.  The
      hierarchy is based on an address allocation hierarchy which is not
      dependent on the network topology.

   o  EIDs may also be structured (subnetted) in a manner suitable for
      local routing within an autonomous system.

   An additional LISP header may be prepended to packets by a transit
   router (i.e. TE-ITR) when re-routing of the path for a packet is
   desired.  An obvious instance of this would be an ISP router that
   needs to perform traffic engineering for packets in flow through its
   network.  In such a situation, termed Recursive Tunneling, an ISP
   transit acts as an additional ingress tunnel router and the RLOC it
   uses for the new prepended header would be either a TE-ETR within
   the ISP (along an intra-ISP traffic engineered path) or a TE-ETR
   within another ISP (an inter-ISP traffic engineered path, where an
   agreement to build such a path exists).

   This specification mandates that no more than two LISP headers get
   prepended to a packet.  This avoids excessive packet overhead as well
   as possible encapsulation loops.
   It is believed two headers are sufficient, where the first prepended
   header is used at a site for Location/Identity separation and the
   second prepended header is used inside a service provider for Traffic
   Engineering purposes.

   Tunnel Routers can be placed fairly flexibly in a multi-AS topology.
   For example, the ITR for a particular end-to-end packet exchange
   might be the first-hop or default router within a site for the source
   host.  Similarly, the egress tunnel router might be the last-hop
   router directly-connected to the destination host.  In another
   example, perhaps for a VPN service out-sourced to an ISP by a site,
   the ITR could be the site's border router at the service provider
   attachment point.  Mixing and matching of site-operated,
   ISP-operated, and other tunnel routers is allowed for maximum
   flexibility.  See Section 8 for more details.

4.1.  Packet Flow Sequence

   This section provides an example of the unicast packet flow with the
   following conditions:

   o  Source host "host1.abc.com" is sending a packet to
      "host2.xyz.com", exactly what host1 would do if the site was not
      using LISP.

   o  Each site is multi-homed, so each tunnel router has an address
      (RLOC) assigned from the service provider address block for each
      provider to which that particular tunnel router is attached.

   o  The ITR(s) and ETR(s) are directly connected to the source and
      destination, respectively.

   o  Data Probes are used to solicit Map-Replies versus using
      Map-Requests.  The Data Probes are sent on the underlying topology
      (the LISP 1.0 variant) but could also be sent over an alternative
      topology (the LISP 1.5 variant) as they would be in [ALT].

   Client host1.abc.com wants to communicate with server host2.xyz.com:

   1.  host1.abc.com wants to open a TCP connection to host2.xyz.com.
       It does a DNS lookup on host2.xyz.com.  An A/AAAA record is
       returned.
       This address is used as the destination EID and the
       locally-assigned address of host1.abc.com is used as the source
       EID.  An IPv4 or IPv6 packet is built using the EIDs in the IPv4
       or IPv6 header and sent to the default router.

   2.  The default router is configured as an ITR.  The ITR must be able
       to map the EID destination to an RLOC of the ETR at the
       destination site.  The ITR prepends a LISP header to the packet,
       with one of its RLOCs as the source IPv4 or IPv6 address.  The
       destination EID from the original packet header is used as the
       destination IPv4 or IPv6 address in the prepended LISP header.
       Subsequent packets, where the outer destination address is the
       destination EID, will be sent until the EID-to-RLOC mapping is
       learned.

   3.  In LISP 1, the packet is routed through the Internet as it is
       today.  In LISP 1.5, the packet is routed on a different topology
       which may have EID prefixes distributed and advertised in an
       aggregatable fashion.  In either case, the packet arrives at the
       ETR.  The router is configured to "punt" the packet to the
       router's processor.  See Section 7 for more details.  For LISP
       2.0 and 3.0, the behavior is not fully defined yet.

   4.  The LISP header is stripped so that the packet can be forwarded
       by the router control plane.  The router looks up the destination
       EID in the router's EID-to-RLOC database (not the cache, but the
       configured data structure of RLOCs).  An EID-to-RLOC Map-Reply
       message is originated by the ETR and is addressed to the source
       RLOC in the LISP header of the original packet (this is the ITR).
       The source RLOC of the Map-Reply is one of the ETR's RLOCs.

   5.  The ITR receives the Map-Reply message, parses the message (to
       check for format validity) and stores the mapping information
       from the packet.
       This information is put in the ITR's EID-to-RLOC
       mapping cache (this is the on-demand cache, the cache where
       entries time out due to inactivity).

   6.  Subsequent packets from host1.abc.com to host2.xyz.com will have
       a LISP header prepended by the ITR using the appropriate RLOC as
       the LISP header destination address learned from the ETR.  Note
       that the packet may be sent to a different ETR than the one which
       returned the Map-Reply due to the source site's hashing policy or
       the destination site's locator-set policy.

   7.  The ETR receives these packets directly (since the destination
       address is one of its assigned IP addresses), strips the LISP
       header and forwards the packets to the attached destination host.

   In order to eliminate the need for a mapping lookup in the reverse
   direction, an ETR MAY create a cache entry that maps the source EID
   (inner header source IP address) to the source RLOC (outer header
   source IP address) in a received LISP packet.  Such a cache entry is
   termed a "gleaned" mapping and only contains a single RLOC for the
   EID in question.  More complete information about additional RLOCs
   SHOULD be verified by sending a LISP Map-Request for that EID.  Both
   the ITR and the ETR may also influence the decision the other makes
   in selecting an RLOC.  See Section 6 for more details.

5.  Tunneling Details

   This section describes the LISP Data Message which defines the
   tunneling header used to encapsulate IPv4 and IPv6 packets which
   contain EID addresses.  Even though the following formats illustrate
   IPv4-in-IPv4 and IPv6-in-IPv6 encapsulations, the other two
   combinations (IPv4-in-IPv6 and IPv6-in-IPv4) are supported as well.

   Since additional tunnel headers are prepended, the packet becomes
   larger and in theory can exceed the MTU of any link traversed from
   the ITR to the ETR.  It is recommended that, in IPv4, packets not be
   fragmented as they are encapsulated by the ITR.
   Instead, the packet is dropped and an ICMP Too Big message is
   returned to the source.

   Based on informal surveys of large ISP traffic patterns, it appears
   that most transit paths can accommodate a path MTU of at least 4470
   bytes.  The exceptions, in terms of data rate, number of hosts
   affected, or any other metric, are expected to be vanishingly small.

   To address MTU concerns, mainly raised on the RRG mailing list, the
   LISP deployment process will include collecting data during its pilot
   phase to either verify or refute the assumption about minimum
   available MTU.  If the assumption proves true and transit networks
   with links limited to 1500-byte MTUs are corner cases, it would seem
   more cost-effective to either upgrade or modify the equipment in
   those transit networks to support larger MTUs or to use existing
   mechanisms for accommodating packets that are too large.

   For this reason, there is currently no plan for LISP to add any new,
   complex mechanism for implementing fragmentation and reassembly in
   the face of limited-MTU transit links.  If analysis during LISP pilot
   deployment reveals that the assumption of essentially ubiquitous
   4470+ byte transit path MTUs is incorrect, then LISP can be modified
   prior to protocol standardization to add support for one of the
   proposed fragmentation and reassembly schemes.  Note that two simple
   existing schemes are detailed in Section 5.4.

5.1.  LISP IPv4-in-IPv4 Header Format

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |Version|  IHL  |Type of Service|          Total Length         |
    /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |         Identification        |Flags|      Fragment Offset    |
   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   OH  |  Time to Live | Protocol = 17 |         Header Checksum       |
   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |                    Source Routing Locator                     |
    \  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |                  Destination Routing Locator                  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |       Source Port = xxxx      |       Dest Port = 4341        |
   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |           UDP Length          |        UDP Checksum           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   L / |S|                     Locator Reach Bits                      |
   I   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   S \ |                            Nonce                              |
   P   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |Version|  IHL  |Type of Service|          Total Length         |
    /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |         Identification        |Flags|      Fragment Offset    |
   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   IH  |  Time to Live |    Protocol   |         Header Checksum       |
   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |                           Source EID                          |
    \  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |                         Destination EID                       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

5.2.  LISP IPv6-in-IPv6 Header Format

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |Version| Traffic Class |           Flow Label                  |
    /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |         Payload Length        | Next Header=17|   Hop Limit   |
   v   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |                                                               |
   O   +                                                               +
   u   |                                                               |
   t   +                    Source Routing Locator                     +
   e   |                                                               |
   r   +                                                               +
   |   |                                                               |
   H   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   d   |                                                               |
   r   +                                                               +
   |   |                                                               |
   ^   +                  Destination Routing Locator                  +
   |   |                                                               |
    \  +                                                               +
     \ |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |       Source Port = xxxx      |       Dest Port = 4341        |
   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |           UDP Length          |        UDP Checksum           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   L / |S|                     Locator Reach Bits                      |
   I   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   S \ |                            Nonce                              |
   P   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |Version| Traffic Class |           Flow Label                  |
    /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /   |         Payload Length        |  Next Header  |   Hop Limit   |
   v   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |                                                               |
   I   +                                                               +
   n   |                                                               |
   n   +                           Source EID                          +
   e   |                                                               |
   r   +                                                               +
   |   |                                                               |
   H   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   d   |                                                               |
   r   +                                                               +
   |   |                                                               |
   ^   +                         Destination EID                       +
    \  |                                                               |
     \ +                                                               +
      \|                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

5.3.  Tunnel Header Field Descriptions

   IH Header:  is the inner header, preserved from the datagram received
      from the originating host.  The source and destination IP
      addresses are EIDs.

   OH Header:  is the outer header prepended by an ITR.
      The address fields contain RLOCs obtained from the ingress
      router's EID-to-RLOC cache.  The IP protocol number is "UDP (17)"
      from [RFC0768].

   UDP Header:  contains an ITR-selected source port when encapsulating
      a packet.  See Section 6.4 for details on the hash algorithm used
      to select a source port based on the 5-tuple of the inner header.
      The destination port MUST be set to the well-known IANA-assigned
      port value 4341.

   UDP Checksum:  this field MUST be transmitted as 0 and ignored on
      receipt by the ETR.  Note that even when the UDP checksum is
      transmitted as 0, an intervening NAT device can recalculate the
      checksum and rewrite the UDP checksum field to non-zero.  For
      performance reasons, the ETR MUST ignore the checksum and MUST NOT
      do a checksum computation.

   UDP Length:  for an IPv4 encapsulated packet, the inner header Total
      Length plus the UDP and LISP header lengths are used.  For an IPv6
      encapsulated packet, the inner header Payload Length plus the size
      of the IPv6 header (40 bytes) plus the size of the UDP and LISP
      headers are used.  The UDP header length is 8 bytes.  The LISP
      header length is 8 bytes when no loc-reach-bit header extensions
      are used.

   S:  this is the Solicit-Map-Request (SMR) bit.  See Section 6.5.2 for
      details.

   LISP Locator Reach Bits:  in the LISP header are set by an ITR to
      indicate to an ETR the reachability of the Locators in the source
      site.  Each RLOC in a Map-Reply is assigned an ordinal value from
      0 to n-1 (when there are n RLOCs in a mapping entry).  The Locator
      Reach Bits are numbered from 0 to n-1, starting from the least
      significant (rightmost) bit of the 31-bit field.  When a bit is
      set to 1, the ITR is indicating to the ETR that the RLOC
      associated with the bit ordinal is reachable.  See Section 6.3 for
      details on how an ITR can determine whether other ITRs at the site
      are reachable.
      When a site has
      multiple EID-prefixes which result in multiple mappings (where
      each could have a different locator-set), the Locator Reach Bits
      setting in an encapsulated packet MUST reflect the mapping for the
      EID-prefix that the inner-header source EID address matches.

   LISP Nonce:  is a 32-bit value that is randomly generated by an ITR.
      It is used to test route-returnability when xTRs exchange
      encapsulated data packets with the SMR bit set, Data-Probe, Map-
      Request, or Map-Reply messages.

   When doing Recursive Tunneling:

   o  The OH header Time to Live field (or Hop Limit field, in the case
      of IPv6) MUST be copied from the IH header Time to Live field.

   o  The OH header Type of Service field (or the Traffic Class field,
      in the case of IPv6) SHOULD be copied from the IH header Type of
      Service field.

   When doing Re-encapsulated Tunneling:

   o  The new OH header Time to Live field SHOULD be copied from the
      stripped OH header Time to Live field.

   o  The new OH header Type of Service field SHOULD be copied from the
      stripped OH header Type of Service field.

   Copying the TTL serves two purposes: first, it preserves the distance
   the host intended the packet to travel; second, and more importantly,
   it provides for suppression of looping packets in the event there is
   a loop of concatenated tunnels due to misconfiguration.

5.4.  Dealing with Large Encapsulated Packets

   In the event that the MTU issues mentioned above prove to be more
   serious than expected, this section proposes two simple mechanisms to
   deal with large packets.  One is stateless using IP fragmentation and
   the other is stateful using Path MTU Discovery [RFC1191].

   It is left to the implementor to decide if the stateless or stateful
   mechanism should be implemented.  An implementation may also support
   both or neither, since how to deal with MTU issues is a local
   decision in the ITR.
   Sites can interoperate with differing mechanisms.

5.4.1.  A Stateless Solution to MTU Handling

   An ITR stateless solution to handle MTU issues is described as
   follows:

   1.  Define an architectural constant S for the maximum size of a
       packet, in bytes, an ITR would receive from a source inside of
       its site.

   2.  Define L to be the maximum size, in bytes, a packet of size S
       would be after the ITR prepends the LISP header, UDP header, and
       outer network layer header of size H.

   3.  Calculate: S + H = L.

   When an ITR receives a packet from a site-facing interface and adds H
   bytes worth of encapsulation to yield a packet size of L bytes, it
   resolves the MTU issue by first splitting the original packet into
   two equal-sized fragments.  A LISP header is then prepended to each
   fragment.  This will ensure that the new, encapsulated packets are of
   size (S/2 + H), which is always below the effective tunnel MTU.

   When an ETR receives encapsulated fragments, it treats them as two
   individually encapsulated packets.  It strips the LISP headers, then
   forwards each fragment to the destination host of the destination
   site.  The two fragments are reassembled at the destination host into
   the single IP datagram that was originated by the source host.

   This behavior is performed by the ITR when the source host originates
   a packet with the DF field of the IP header set to 0.  When the DF
   field of the IP header is set to 1, or the packet is an IPv6 packet
   originated by the source host, the ITR will drop the packet when the
   size is greater than L, and send an ICMP Too Big message to the
   source with a value of S, where S is (L - H).

   This specification recommends that L be defined as 1500.

5.4.2.  A Stateful Solution to MTU Handling

   An ITR stateful solution to handle MTU issues is described as follows
   and was first introduced in [OPENLISP]:

   1.  The ITR will keep state of the effective MTU for each locator per
       mapping cache entry.  The effective MTU is what the core network
       can deliver along the path between ITR and ETR.

   2.  When an encapsulated packet exceeds what the core network can
       deliver, one of the intermediate routers on the path will send an
       ICMP Too Big message to the ITR.  The ITR will parse the ICMP
       message to determine which locator is affected by the effective
       MTU change and then record the new effective MTU value in the
       mapping cache entry.

   3.  When a packet is received by the ITR from a source inside of the
       site and the size of the packet is greater than the effective MTU
       stored with the mapping cache entry associated with the
       destination EID the packet is for, the ITR will send an ICMP Too
       Big message back to the source.  The packet size advertised by
       the ITR in the ICMP Too Big message is the effective MTU minus
       the LISP encapsulation length.

   Even though this mechanism is stateful, it has advantages over the
   stateless IP fragmentation mechanism, by not involving the
   destination host with reassembly of ITR fragmented packets.

6.  EID-to-RLOC Mapping

6.1.  LISP IPv4 and IPv6 Control Plane Packet Formats

   The following new UDP packet types are used to retrieve EID-to-RLOC
   mappings:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |Version|  IHL  |Type of Service|          Total Length         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |         Identification        |Flags|      Fragment Offset    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |  Time to Live | Protocol = 17 |         Header Checksum       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                    Source Routing Locator                     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                  Destination Routing Locator                  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |           Source Port         |         Dest Port             |
   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |           UDP Length          |        UDP Checksum           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                         LISP Message                          |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |Version| Traffic Class |           Flow Label                  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |         Payload Length        | Next Header=17|   Hop Limit   |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       +                                                               +
       |                                                               |
       +                    Source Routing Locator                     +
       |                                                               |
       +                                                               +
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       +                                                               +
       |                                                               |
       +                  Destination Routing Locator                  +
       |                                                               |
       +                                                               +
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |           Source Port         |         Dest Port             |
   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |           UDP Length          |        UDP Checksum           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                         LISP Message                          |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The LISP UDP-based messages are the Map-Request and Map-Reply
   messages.  When a UDP Map-Request is sent, the UDP source port is
   chosen by the sender and the destination UDP port number is set to
   4342.  When a UDP Map-Reply is sent, the source UDP port number is
   set to 4342 and the destination UDP port number is copied from the
   source port of either the Map-Request or the invoking data packet.

   The UDP Length field will reflect the length of the UDP header and
   the LISP Message payload.

   The UDP Checksum is computed and set to non-zero for Map-Request and
   Map-Reply messages.  It MUST be checked on receipt and if the
   checksum fails, the packet MUST be dropped.

   LISP-CONS [CONS] uses TCP to send LISP control messages.  The format
   of control messages includes the UDP header so the checksum and
   length fields can be used to protect and delimit message boundaries.

   This main LISP specification is the authoritative source for message
   format definitions for the Map-Request and Map-Reply messages.

6.1.1.  LISP Packet Type Allocations

   This section will be the authoritative source for allocating LISP
   Type values.  Current allocations are:

      Reserved:                          0    b'0000'
      LISP Map-Request:                  1    b'0001'
      LISP Map-Reply:                    2    b'0010'
      LISP Map-Register:                 3    b'0011'
      LISP-CONS Open Message:            8    b'1000'
      LISP-CONS Push-Add Message:        9    b'1001'
      LISP-CONS Push-Delete Message:    10    b'1010'
      LISP-CONS Unreachable Message:    11    b'1011'

6.1.2.  Map-Request Message Format

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |S|                     Locator Reach Bits                      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                            Nonce                              |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |Type=1 |A|R|            Reserved               | Record Count  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |         Source-EID-AFI        |            ITR-AFI            |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                   Source EID Address  ...                     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                Originating ITR RLOC Address ...               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |   Reserved    | EID mask-len  |        EID-prefix-AFI         |
   Rec +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |                       EID-prefix  ...                         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                   Map-Reply Record  ...                       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                     Mapping Protocol Data                     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Packet field descriptions:

   S:  This is the SMR bit.  See Section 6.5.2 for details.

   Locator Reach Bits:  These bits MUST be set to 0 on transmission and
      ignored on receipt.  They cannot be used for indicating
      reachability because the Map-Request does not have the EID-prefix
      for the sending site so the receiver of the Map-Request cannot
      know what mapping entry to associate the reachability with.
      However, when Mapping Data is provided in the Map-Reply Record
      field, and the receiver of the Map-Request is configured to accept
      the mapping data, the R-bit per locator entry in the EID-prefix
      record is used to denote reachability.

   Nonce:  A 4-byte random value created by the sender of the Map-
      Request.

   Type:  1 (Map-Request)

   A:  This is an authoritative bit, which is set to 0 for UDP-based
      Map-Requests sent by an ITR.  See other control-specific documents
      [CONS] for TCP-based Map-Requests.

   R:  When set, it indicates a Map-Reply Record segment is included in
      the Map-Request.

   Reserved:  Set to 0 on transmission and ignored on receipt.

   Record Count:  The number of records in this request message.  A
      record is comprised of the portion of the packet that is labeled
      'Rec' above and occurs the number of times equal to Record Count.

   Source-EID-AFI:  Address family of the "Source EID Address" field.

   ITR-AFI:  Address family of the "Originating ITR RLOC Address" field.

   Source EID Address:  This is the EID of the source host which
      originated the packet which is invoking this Map-Request.

   Originating ITR RLOC Address:  Used to give the ETR the option of
      returning a Map-Reply in the address-family of this locator.

   EID mask-len:  Mask length for EID prefix.

   EID-prefix-AFI:  Address family of EID-prefix according to [RFC2434].

   EID-prefix:  4 bytes if an IPv4 address-family, 16 bytes if an IPv6
      address-family.  When a Map-Request is sent by an ITR because a
      data packet is received for a destination where there is no
      mapping entry, the EID-prefix is set to the destination IP address
      of the data packet, and the 'EID mask-len' is set to 32 or 128
      for IPv4 or IPv6, respectively.  When an xTR wants to query a site
      about the status of a mapping it already has cached, the EID-
      prefix used in the Map-Request has the same mask-length as the
      EID-prefix returned from the site when it sent a Map-Reply
      message.

   Map-Reply Record:  When the R bit is set, this field is the size of
      the "Record" field in the Map-Reply format.  This Map-Reply record
      contains the EID-to-RLOC mapping entry associated with the Source
      EID.
      This allows the ETR which will receive this Map-Request to
      cache the data if it chooses to do so.

   Mapping Protocol Data:  See [CONS] or [ALT] for details.  This field
      is optional and present when the UDP length indicates there is
      enough space in the packet to include it.

6.1.3.  EID-to-RLOC UDP Map-Request Message

   A Map-Request is sent from an ITR when it needs a mapping for an EID,
   wants to test an RLOC for reachability, or wants to refresh a mapping
   before TTL expiration.  For the initial case, the destination IP
   address used for the Map-Request is the destination-EID from the
   packet which had a mapping cache lookup failure.  For the latter two
   cases, the destination IP address used for the Map-Request is one of
   the RLOC addresses from the locator-set of the map cache entry.  In
   all cases, the UDP source port number for the Map-Request message is
   a randomly allocated 16-bit value and the UDP destination port number
   is set to the well-known destination port number 4342.  A successful
   Map-Reply updates the cached set of RLOCs associated with the EID
   prefix range.

   Map-Requests can also be LISP encapsulated using UDP destination port
   4341 when sent from an ITR to a Map-Resolver.  Details on
   encapsulated Map-Requests and Map-Resolvers can be found in
   [LISP-MS].

   Map-Requests MUST be rate-limited.  It is recommended that a Map-
   Request for the same EID-prefix be sent no more than once per second.

6.1.4.  Map-Reply Message Format

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |x|                     Locator Reach Bits                      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                            Nonce                              |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |Type=2 |               Reserved                | Record Count  |
   +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |                          Record  TTL                          |
   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   R   | Locator Count | EID mask-len  |A|           Reserved          |
   e   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   c   |           Reserved            |            EID-AFI            |
   o   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   r   |                          EID-prefix                           |
   d   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  /|    Priority   |    Weight     |  M Priority   |   M Weight    |
   | L +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | o |        Unused Flags         |R|           Loc-AFI             |
   | c +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  \|                             Locator                           |
   +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                     Mapping Protocol Data                     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Packet field descriptions:

   x:  Set to 0 on transmission and ignored on receipt.

   Locator Reach Bits:  Refer to Section 5.3.  This field MUST be set to
      0 on transmission and ignored on receipt.  The locator
      reachability is encoded as the R-bit in each locator entry of each
      EID-prefix record.

   Nonce:  A 4-byte value set in a Data-Probe packet or a Map-Request
      that is echoed here in the Map-Reply.

   Type:  2 (Map-Reply)

   Reserved:  Set to 0 on transmission and ignored on receipt.

   Record Count:  The number of records in this reply message.
      A record
      is comprised of that portion of the packet labeled 'Record' above
      and occurs the number of times equal to Record Count.

   Record TTL:  The time in minutes the recipient of the Map-Reply will
      store the mapping.  If the TTL is 0, the entry should be removed
      from the cache immediately.  If the value is 0xffffffff, the
      recipient can decide locally how long to store the mapping.

   Locator Count:  The number of Locator entries.  A locator entry
      comprises what is labeled above as 'Loc'.  The locator count can
      be 0, indicating there are no locators for the EID-prefix.

   EID mask-len:  Mask length for EID prefix.

   A:  The Authoritative bit, when sent by a UDP-based message, is
      always set by the ETR.  See [CONS] for TCP-based Map-Replies.

   EID-AFI:  Address family of EID-prefix according to [RFC2434].

   EID-prefix:  4 bytes if an IPv4 address-family, 16 bytes if an IPv6
      address-family.

   Priority:  each RLOC is assigned a unicast priority.  Lower values
      are preferred.  When multiple RLOCs have the same priority,
      they may be used in a load-split fashion.  A value of 255 means
      the RLOC MUST NOT be used for unicast forwarding.

   Weight:  when priorities are the same for multiple RLOCs, the weight
      indicates how to balance unicast traffic between them.  Weight is
      encoded as a percentage of total unicast packets that match the
      mapping entry.  If a non-zero weight value is used for any RLOC,
      then all RLOCs must use a non-zero weight value and then the sum
      of all weight values MUST equal 100.  If a zero value is used for
      any RLOC weight, then all weights MUST be zero and the receiver of
      the Map-Reply will decide how to load-split traffic.  See
      Section 6.4 for a suggested hash algorithm to distribute load
      across locators with the same priority and equal weight values.
      When
      a single RLOC exists in a mapping entry, the weight value MUST be
      set to 100 and ignored on receipt.

   M Priority:  each RLOC is assigned a multicast priority used by an
      ETR in a receiver multicast site to select an ITR in a source
      multicast site for building multicast distribution trees.  A value
      of 255 means the RLOC MUST NOT be used for joining a multicast
      distribution tree.

   M Weight:  when priorities are the same for multiple RLOCs, the
      weight indicates how to balance building multicast distribution
      trees across multiple ITRs.  The weight is encoded as a percentage
      of the total number of trees built to the source site identified
      by the EID-prefix.  If a non-zero weight value is used for any
      RLOC, then all RLOCs must use a non-zero weight value and then the
      sum of all weight values MUST equal 100.  If a zero value is used
      for any RLOC weight, then all weights MUST be zero and the
      receiver of the Map-Reply will decide how to distribute multicast
      state across ITRs.

   Unused Flags:  set to 0 when sending and ignored on receipt.

   R:  when this bit is set, the locator is known to be reachable from
      the Map-Reply sender's perspective.  When there is a single
      mapping record in the message, the R-bit for each locator must
      have a consistent setting with the bitfield setting of the
      'Locator Reach Bits' field in the early part of the header.  When
      there are multiple mapping records in the message, the 'Locator
      Reach Bits' field is set to 0.

   Locator:  an IPv4 or IPv6 address (as encoded by the 'Loc-AFI' field)
      assigned to an ETR or router acting as a proxy replier for the
      EID-prefix.  Note that the destination RLOC address MAY be an
      anycast address.  A source RLOC can be an anycast address as well.
      The source or destination RLOC MUST NOT be the broadcast address
      (255.255.255.255 or any subnet broadcast address known to the
      router), and MUST NOT be a link-local multicast address.  The
      source RLOC MUST NOT be a multicast address.  The destination RLOC
      SHOULD be a multicast address if it is being mapped from a
      multicast destination EID.

   Mapping Protocol Data:  See [CONS] or [ALT] for details.  This field
      is optional and present when the UDP length indicates there is
      enough space in the packet to include it.

6.1.5.  EID-to-RLOC UDP Map-Reply Message

   When a Data Probe packet or a Map-Request triggers a Map-Reply to be
   sent, the RLOCs associated with the EID-prefix matched by the EID in
   the original packet destination IP address field will be returned.
   The RLOCs in the Map-Reply are the globally-routable IP addresses of
   the ETR but are not necessarily reachable; separate testing of
   reachability is required.

   Note that a Map-Reply may contain different EID-prefix granularity
   (prefix + length) than the Map-Request which triggers it.  This might
   occur if a Map-Request were for a prefix that had been returned by an
   earlier Map-Reply.  In such a case, the requester updates its cache
   with the new prefix information and granularity.  For example, a
   requester with two cached EID-prefixes that are covered by a Map-
   Reply containing one less-specific prefix replaces those entries with
   the less-specific EID-prefix.  Note that the reverse, replacement of
   one less-specific prefix with multiple more-specific prefixes, can
   also occur, not by removing the less-specific prefix but rather by
   adding the more-specific prefixes, which during a lookup will
   override the less-specific prefix.

   Replies SHOULD be sent for an EID-prefix no more often than once per
   second to the same requesting router.
   For scalability, it is
   expected that aggregation of EID addresses into EID-prefixes will
   allow one Map-Reply to satisfy a mapping for the EID addresses in the
   prefix range, thereby reducing the number of Map-Request messages.

   The addresses for an encapsulated data packet or Map-Request message
   are swapped and used for sending the Map-Reply.  The UDP source and
   destination ports are swapped as well.  That is, the source port in
   the UDP header for the Map-Reply is set to the well-known UDP port
   number 4342.

6.1.6.  Map-Register Message Format

   The usage details of the Map-Register message can be found in
   specification [LISP-MS].  This section solely defines the message
   format.

   The message is sent in a UDP datagram with destination UDP port 4342
   and a randomly selected UDP source port number.  Before an IPv4 or
   IPv6 network layer header is prepended, an AH header is prepended to
   carry authentication information.  The format conforms to the IPsec
   specification [RFC2402].  The Map-Register message will use transport
   mode by setting the IP protocol number field or the IPv6 next-header
   field to 51.

   The AH header from [RFC2402] is:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |  Next Header  |  Payload Len  |          RESERVED             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                Security Parameters Index (SPI)                |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                     Sequence Number Field                     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       +                Authentication Data (variable)                 |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Next Header field is set to UDP.  The SPI field is set to 0
   (since no Security Association or Key Exchange protocol is being
   used).
   The Sequence Number is a value chosen randomly by the sender.
   The Authentication Data is 16 bytes and holds an MD5 HMAC.

   The Map-Register message format is:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |x|                     Locator Reach Bits                      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                             Nonce                             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |Type=3 |               Reserved                | Record Count  |
   +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |                          Record TTL                           |
   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   R   | Locator Count | EID mask-len  |A|          Reserved           |
   e   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   c   |           Reserved            |            EID-AFI            |
   o   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   r   |                          EID-prefix                           |
   d   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  /|    Priority   |    Weight     |  M Priority   |   M Weight    |
   | L +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | o |        Unused Flags         |R|            Loc-AFI            |
   | c +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  \|                            Locator                            |
   +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The definition of each field of the Map-Register can be found in the
   Map-Reply section.

6.2.  Routing Locator Selection

   Both client-side and server-side may need control over the selection
   of RLOCs for conversations between them. This control is achieved by
   manipulating the Priority and Weight fields in EID-to-RLOC Map-Reply
   messages. Alternatively, RLOC information may be gleaned from
   received tunneled packets or EID-to-RLOC Map-Request messages.
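   As a rough illustration of how a client-side ITR might act on the
   Priority and Weight fields, the following Python sketch picks a
   locator from a locator-set. It is not taken from the specification:
   the names Locator, usable_set, and pick are hypothetical, and the
   sketch assumes that a numerically lower Priority is preferred, that
   Priority 255 means "never use", and that an all-zero Weight leaves
   the load split to the client, in line with the scenarios enumerated
   in this section.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Locator:
    """Hypothetical in-memory form of one locator record."""
    rloc: str
    priority: int           # assumption: lower value is preferred, 255 = never use
    weight: int             # share of load within one priority level, 0 = ITR decides
    reachable: bool = True  # local reachability state kept by the ITR


def usable_set(locators):
    """Return the reachable locators that share the best priority.

    If every locator at the best priority is unreachable, the next
    priority level is used, except that Priority 255 is never used.
    """
    candidates = [l for l in locators if l.reachable and l.priority != 255]
    if not candidates:
        return []
    best = min(l.priority for l in candidates)
    return [l for l in candidates if l.priority == best]


def pick(locators, rng=random):
    """Pick one locator, honouring server-side weights when they are set."""
    subset = usable_set(locators)
    if not subset:
        return None
    weights = [l.weight for l in subset]
    if sum(weights) == 0:
        # Weight 0: the client chooses the load split (uniform here).
        return rng.choice(subset)
    return rng.choices(subset, weights=weights, k=1)[0]
```

   With two Priority 1 locators weighted 75 and 25, pick() returns the
   first for roughly three quarters of calls. A real ITR would more
   likely map a per-flow hash onto the weights so that one flow keeps
   using one locator; random choice is used here only to keep the
   sketch short.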
   The following enumerates different scenarios for choosing RLOCs and
   the controls that are available:

   o  Server-side returns one RLOC. Client-side can only use one RLOC.
      Server-side has complete control of the selection.

   o  Server-side returns a list of RLOCs where a subset of the list has
      the same best priority. Client can only use the subset list
      according to the weighting assigned by the server-side. In this
      case, the server-side controls both the subset list and
      load-splitting across its members. The client-side can use RLOCs
      outside of the subset list if it determines that the subset list
      is unreachable (unless RLOCs are set to a Priority of 255). Some
      sharing of control exists: the server-side determines the
      destination RLOC list and load distribution while the client-side
      has the option of using alternatives to this list if RLOCs in the
      list are unreachable.

   o  Server-side sets weight of 0 for the RLOC subset list. In this
      case, the client-side can choose how the traffic load is spread
      across the subset list. Control is shared by the server-side
      determining the list and the client determining load distribution.
      Again, the client can use alternative RLOCs if the server-provided
      list of RLOCs is unreachable.

   o  Either side (more likely on the server-side ETR) decides not to
      send a Map-Request. For example, if the server-side ETR does not
      send Map-Requests, it gleans RLOCs from the client-side ITR,
      giving the client-side ITR responsibility for bidirectional RLOC
      reachability and preferability. Server-side ETR gleaning of the
      client-side ITR RLOC is done by caching the inner header source
      EID and the outer header source RLOC of received packets.
      The client-side ITR controls how traffic is returned and can
      alternate using an outer header source RLOC, which then can be
      added to the list the server-side ETR uses to return traffic.
      Since no Priority or Weights are provided using this method, the
      server-side ETR must assume each client-side ITR RLOC uses the
      same best Priority with a Weight of zero. In addition, since
      EID-prefix encoding cannot be conveyed in data packets, the
      EID-to-RLOC cache on tunnel routers can grow to be very large.

   RLOCs that appear in EID-to-RLOC Map-Reply messages are considered
   reachable. The Map-Reply and the database mapping service do not
   provide any reachability status for Locators. This is done outside
   of the mapping service. See next section for details.

6.3.  Routing Locator Reachability

   There are 6 methods for determining when a Locator is either
   reachable or has become unreachable:

   1.  Locator reachability is determined by an ETR by examining the
       Loc-Reach-Bits from a LISP header of an encapsulated data packet
       which is provided by an ITR when an ITR encapsulates data.

   2.  Locator unreachability is determined by an ITR by receiving ICMP
       Network or Host Unreachable messages.

   3.  Locator unreachability can also be determined by a BGP-enabled
       ITR when there is no prefix matching a Locator address from the
       BGP RIB.

   4.  Locator unreachability is determined when a host sends an ICMP
       Port Unreachable message. This occurs when an ITR does not use
       any method of interworking (one of which is described in
       [INTERWORK]) and the encapsulated data packet is received by a
       host at the destination non-LISP site.

   5.  Locator reachability is determined by receiving a Map-Reply
       message from an ETR's Locator address in response to a previously
       sent Map-Request.

   6.  Locator reachability can also be determined by receiving packets
       encapsulated by the ITR assigned to the locator address.

   When determining Locator reachability by examining the Loc-Reach-Bits
   from the LISP-encapsulated data packet, an ETR will receive
   up-to-date status from the ITR closest to the Locators at the source
   site. The ITRs at the source site can determine reachability when
   running their IGP at the site. When the ITRs are deployed on CE
   routers, typically a default route is injected into the site's IGP
   from each of the ITRs. If an ITR goes down, the CE-PE link goes down,
   or the PE router goes down, the CE router withdraws the default
   route. This allows the other ITRs at the site to determine that one
   of the Locators has gone unreachable.

   The Locators listed in a Map-Reply are numbered with ordinals 0 to
   n-1. The Loc-Reach-Bits in a LISP Data Message are numbered from 0
   to n-1 starting with the least significant bit numbered as 0. So,
   for example, if the ITR with locator listed as the 3rd Locator
   position in the Map-Reply goes down, all other ITRs at the site will
   clear the 3rd bit from the right (the bit that corresponds to
   ordinal 2).

   When an ETR decapsulates a packet, it will look for a change in the
   Loc-Reach-Bits value. When a bit goes from 1 to 0, the ETR will
   refrain from encapsulating packets to the Locator that has just gone
   unreachable. It can start using the Locator again when the bit that
   corresponds to the Locator goes from 0 to 1. Loc-Reach-Bits are
   associated with a locator-set per EID-prefix. Therefore, when a
   locator becomes unreachable, the loc-reach-bit that corresponds to
   that locator's position in the list returned by the last Map-Reply
   will be set to zero for that particular EID-prefix.
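   The bit numbering described above can be sketched in a few lines of
   Python. This is only an illustration: the function names
   locator_is_up and diff_reach_bits are hypothetical, and the
   Loc-Reach-Bits value is assumed to have been extracted from the LISP
   header already and to be held as a plain integer.

```python
def locator_is_up(loc_reach_bits: int, ordinal: int) -> bool:
    """True if the bit for the locator at this ordinal is set.

    Ordinal 0 is the least significant bit, matching the order in
    which locators were listed in the Map-Reply.
    """
    return (loc_reach_bits >> ordinal) & 1 == 1


def diff_reach_bits(old: int, new: int, locator_count: int):
    """Compare two Loc-Reach-Bits values for one cached locator-set.

    Returns (went_down, came_up): the ordinals whose bit changed from
    1 to 0 and from 0 to 1. Only the first locator_count bits are
    examined, so bits beyond the cached locator list are not reported.
    """
    changed = old ^ new
    went_down, came_up = [], []
    for ordinal in range(locator_count):
        if (changed >> ordinal) & 1:
            if (new >> ordinal) & 1:
                came_up.append(ordinal)
            else:
                went_down.append(ordinal)
    return went_down, came_up
```

   For a site with three locators that were all up (binary 111), the
   loss of the 3rd locator shows up as binary 011, and
   diff_reach_bits(0b111, 0b011, 3) reports ordinal 2 as having gone
   down.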
   When ITRs at the site are not deployed in CE routers, the IGP can
   still be used to determine the reachability of Locators provided they
   are injected as stub links into the IGP. This is typically done when
   a /32 address is configured on a loopback interface.

   When ITRs receive ICMP Network or Host Unreachable messages as a
   method to determine unreachability, they will refrain from using
   Locators which are described in Locator lists of Map-Replies.
   However, using this approach is unreliable because many network
   operators turn off generation of ICMP Unreachable messages.

   If an ITR does receive an ICMP Network or Host Unreachable message,
   it MAY originate its own ICMP Unreachable message destined for the
   host that originated the data packet the ITR encapsulated.

   Also, BGP-enabled ITRs can unilaterally examine the BGP RIB to see if
   a locator address from a locator-set in a mapping entry matches a
   prefix. If it does not find one and BGP is running in the Default
   Free Zone (DFZ), it can decide to not use the locator even though the
   Loc-Reach-Bits indicate the locator is up. In this case, the path
   from the ITR to the ETR that is assigned the locator is not
   available. More details are in [LOC-ID-ARCH].

   Optionally, an ITR can send a Map-Request to a Locator and if a
   Map-Reply is returned, reachability of the Locator has been
   determined. Obviously, sending such probes increases the number of
   control messages originated by tunnel routers for active flows, so
   Locators are assumed to be reachable when they are advertised.

   This assumption does create a dependency: Locator unreachability is
   detected by the receipt of ICMP Host Unreachable messages. When a
   Locator has been determined to be unreachable, it is not used for
   active traffic; this is the same as if it were listed in a Map-Reply
   with priority 255.
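   One way to picture how these signals combine is the Python sketch
   below, which folds the Loc-Reach-Bit, a received ICMP Unreachable
   indication, and the BGP RIB check into an "effective priority" of
   255 for a locator that should not carry active traffic. This is an
   illustration under assumptions, not a specified algorithm: the
   function names and parameters are hypothetical, and the RIB is
   modelled as a simple list of prefix strings.

```python
import ipaddress

UNUSABLE = 255  # a locator treated as if it had been advertised with priority 255


def rib_covers(locator: str, rib_prefixes) -> bool:
    """True if any prefix in the (hypothetical) RIB list covers the locator."""
    addr = ipaddress.ip_address(locator)
    for prefix in rib_prefixes:
        net = ipaddress.ip_network(prefix)
        if net.version == addr.version and addr in net:
            return True
    return False


def effective_priority(priority, locator, rib_prefixes, in_dfz,
                       reach_bit_set, icmp_unreachable=False):
    """Return the advertised priority, or 255 if the locator looks unreachable.

    priority          priority from the cached Map-Reply
    rib_prefixes      prefixes currently in the BGP RIB, as strings
    in_dfz            True only if BGP is running in the Default Free Zone
    reach_bit_set     the locator's bit in the last seen Loc-Reach-Bits
    icmp_unreachable  True if an ICMP Network/Host Unreachable was received
    """
    if icmp_unreachable or not reach_bit_set:
        return UNUSABLE
    # The RIB check overrides a set Loc-Reach-Bit, but only in the DFZ,
    # where the absence of a covering prefix means there is no path.
    if in_dfz and not rib_covers(locator, rib_prefixes):
        return UNUSABLE
    return priority
```

   The in_dfz flag matters because a router that holds a default route
   would always find a covering prefix, which would make the RIB check
   meaningless.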
   The ITR can test the reachability of the unreachable Locator by
   sending periodic Requests. Both Requests and Replies MUST be
   rate-limited. Locator reachability testing is never done with data
   packets since that increases the risk of packet loss for end-to-end
   sessions.

   When an ETR is decapsulating packets, it can be sure that the path
   from the encapsulating ITR is available. The ETR can assume the path
   from the ETR to the ITR is also reachable. Even if there is
   asymmetric routing in the core, the first-hop and last-hop ASes will
   be the same for both directions of traffic since the locator
   addresses are out of the PA blocks of each. However, the assumption
   may not always be valid, so this mechanism should be used as a
   best-effort indication that a working path exists between the sites.
   In the event of unidirectional traffic from an ITR to an ETR, an ITR
   should not conclude that a locator is unreachable since it is not
   receiving packets, but use alternate mechanisms described above to
   determine reachability.

6.4.  Routing Locator Hashing

   When an ETR provides an EID-to-RLOC mapping in a Map-Reply message to
   a requesting ITR, the locator-set for the EID-prefix may contain
   different priority values for each locator address. When more than
   one best priority locator exists, the ITR can decide how to load
   share traffic against the corresponding locators.

   The following hash algorithm may be used by an ITR to select a
   locator for a packet destined to an EID for the EID-to-RLOC mapping:

   1.  Either a source and destination address hash can be used or the
       traditional 5-tuple hash which includes the source and
       destination addresses, source and destination TCP, UDP, or SCTP
       port numbers and the IP protocol number field or IPv6
       next-protocol fields of a packet a host originates from within a
       LISP site.
       When a packet is not a TCP, UDP, or SCTP packet, only the
       source and destination addresses from the header are used to
       compute the hash.

   2.  Take the hash value and divide it by the number of locators
       stored in the locator-set for the EID-to-RLOC mapping.

   3.  The remainder will yield a value of 0 to "number of locators
       minus 1". Use the remainder to select the locator in the
       locator-set.

   Note that when a packet is LISP encapsulated, the source port number
   in the outer UDP header needs to be set. Selecting a random value
   allows core routers which are attached to Link Aggregation Groups
   (LAGs) to load-split the encapsulated packets across member links of
   such LAGs. Otherwise, core routers would see a single flow, since
   packets have a source address of the ITR, for packets which are
   originated by different EIDs at the source site. A suggested setting
   for the source port number computed by an ITR is a 5-tuple hash
   function on the inner header, as described above.

6.5.  Changing the Contents of EID-to-RLOC Mappings

   Since the LISP architecture uses a caching scheme to retrieve and
   store EID-to-RLOC mappings, the only way an ITR can get a more
   up-to-date mapping is to re-request the mapping. However, the ITRs
   do not know when the mappings change and the ETRs do not keep track
   of who requested their mappings. For scalability reasons, we want to
   maintain this approach but need to provide a way for ETRs to change
   their mappings and inform the sites that are currently communicating
   with the ETR site using such mappings.

   When a locator record is added to the end of a locator-set, it is
   easy to update mappings. We assume new mappings will maintain the
   same locator ordering as the old mapping but just have new locators
   appended to the end of the list.
   So some ITRs can have a new mapping
   while other ITRs have only an old mapping that is used until they
   time out. When an ITR has only an old mapping but detects bits set
   in the loc-reach-bits that correspond to locators beyond the list it
   has cached, it simply ignores them.

   When a locator record is removed from a locator-set, ITRs that have
   the mapping cached will not use the removed locator because the xTRs
   will set the loc-reach-bit to 0. So even if the locator is in the
   list, it will not be used. For new mapping requests, the xTRs can
   set the locator address to 0 as well as setting the corresponding
   loc-reach-bit to 0. This forces ITRs with old or new mappings to
   avoid using the removed locator.

   If many changes occur to a mapping over a long period of time, one
   will find empty record slots in the middle of the locator-set and new
   records appended to the locator-set. At some point, it would be
   useful to compact the locator-set so the loc-reach-bit settings can
   be efficiently packed.

   We propose here two approaches for locator-set compaction, one
   operational and the other a protocol mechanism. The operational
   approach uses a clock sweep method. The protocol approach uses the
   concept of Solicit-Map-Requests.

6.5.1.  Clock Sweep

   The clock sweep approach uses planning in advance and the use of
   count-down TTLs to time out mappings that have already been cached.
   The default setting for an EID-to-RLOC mapping TTL is 24 hours. So
   there is a 24 hour window to time out old mappings. The following
   clock sweep procedure is used:

   1.  24 hours before a mapping change is to take effect, a network
       administrator configures the ETRs at a site to start the clock
       sweep window.

   2.  During the clock sweep window, ETRs continue to send Map-Reply
       messages with the current (unchanged) mapping records.
       The TTL for these mappings is set to 1 hour.

   3.  24 hours later, all previous cache entries will have timed out,
       and any active cache entries will time out within 1 hour. During
       this 1 hour window the ETRs continue to send Map-Reply messages
       with the current (unchanged) mapping records with the TTL set to
       1 minute.

   4.  At the end of the 1 hour window, the ETRs will send Map-Reply
       messages with the new (changed) mapping records. So any active
       caches can get the new mapping contents right away if not cached,
       or in 1 minute if they had the mapping cached.

6.5.2.  Solicit-Map-Request (SMR)

   Soliciting a Map-Request is a selective way for xTRs, at the site
   where mappings change, to control the rate they receive requests for
   Map-Reply messages. SMRs are also used to tell remote ITRs to update
   the mappings they have cached.

   Since the xTRs don't keep track of remote ITRs that have cached their
   mappings, they can not tell exactly who needs the new mapping
   entries. So an xTR will solicit Map-Requests from sites it is
   currently sending encapsulated data to, and only from those sites.
   The xTRs can locally decide the algorithm for how often and to how
   many sites they send SMR messages.

   An SMR message is simply a bit set in an encapsulated data packet
   (and a Map-Request message). When an ETR at a remote site
   decapsulates a data packet that has the SMR bit set, it can tell that
   a new Map-Request message is being solicited. Both the xTR that
   sends the SMR message and the site that acts on the SMR message MUST
   be rate-limited.

   The following procedure shows how an SMR exchange occurs when a site
   is doing locator-set compaction for an EID-to-RLOC mapping:

   1.  When the database mappings in an ETR change, the ITRs at the site
       begin to set the SMR bit in packets they encapsulate to the sites
       they communicate with.

   2.  A remote xTR which decapsulates a packet with the SMR bit set
       will schedule sending a Map-Request message to the source locator
       address of the encapsulated packet. The nonce in the Map-Request
       is copied from the nonce in the encapsulated data packet that has
       the SMR bit set.

   3.  The remote xTR retransmits the Map-Request slowly until it gets a
       Map-Reply while continuing to use the cached mapping.

   4.  The ETRs at the site with the changed mapping will reply to the
       Map-Request with a Map-Reply message provided the Map-Request
       nonce matches the nonce from the SMR. The Map-Reply messages
       SHOULD be rate limited. This is important to avoid Map-Reply
       implosion.

   5.  The ETRs at the site with the changed mapping record the fact
       that the site that sent the Map-Request has received the new
       mapping data in the mapping cache entry for the remote site so
       the loc-reach-bits are reflective of the new mapping for packets
       going to the remote site. The ETR then stops sending packets
       with the SMR-bit set.

   For security reasons an ITR MUST NOT process unsolicited Map-Replies.
   The nonce MUST be carried from the SMR packet into the resultant
   Map-Request, and then into the Map-Reply, to reduce spoofing attacks.

7.  Router Performance Considerations

   LISP is designed to be friendly to hardware-based forwarding. By
   doing tunnel header prepending [RFC1955] and stripping instead of
   re-writing addresses, existing hardware can support the forwarding
   model with little or no modification. Where modifications are
   required, they should be limited to re-programming existing hardware
   rather than requiring expensive design changes to hard-coded
   algorithms in silicon.
   A few implementation techniques can be used to incrementally
   implement LISP:

   o  When a tunnel encapsulated packet is received by an ETR, the outer
      destination address may not be the address of the router. This
      makes it challenging for the control plane to get packets from the
      hardware. This may be mitigated by creating special FIB entries
      for the EID-prefixes of EIDs served by the ETR (those for which
      the router provides an RLOC translation). These FIB entries are
      marked with a flag indicating that control plane processing should
      be performed. The forwarding logic of testing for a particular IP
      protocol number value is not necessary. No changes to existing,
      deployed hardware should be needed to support this.

   o  On an ITR, prepending a new IP header is as simple as adding more
      bytes to a MAC rewrite string and prepending the string as part of
      the outgoing encapsulation procedure. Many routers that support
      GRE tunneling [RFC2784] or 6to4 tunneling [RFC3056] can already
      support this action.

   o  When a received packet's outer destination address contains an EID
      which is not intended to be forwarded on the routable topology
      (i.e. LISP 1.5), the source address of a data packet or the
      router interface with which the source is associated (the
      interface from which it was received) can be associated with a VRF
      (Virtual Routing/Forwarding), in which a different (i.e.
      non-congruent) topology can be used to find EID-to-RLOC mappings.

8.  Deployment Scenarios

   This section will explore how and where ITRs and ETRs can be deployed
   and will discuss the pros and cons of each deployment scenario.
   There are two basic deployment trade-offs to consider: centralized
   versus distributed caches and flat, recursive, or re-encapsulating
   tunneling.
   When deciding on centralized versus distributed caching, the
   following issues should be considered:

   o  Are the tunnel routers spread out so that the caches are spread
      across all the memories of each router?

   o  Should management "touch points" be minimized by choosing few
      tunnel routers, just enough for redundancy?

   o  In general, using more ITRs doesn't increase management load,
      since caches are built and stored dynamically. On the other hand,
      more ETRs do require more management since EID-prefix-to-RLOC
      mappings need to be explicitly configured.

   When deciding on flat, recursive, or re-encapsulation tunneling, the
   following issues should be considered:

   o  Flat tunneling implements a single tunnel between source site and
      destination site. This generally offers better paths between
      sources and destinations with a single tunnel path.

   o  Recursive tunneling is when tunneled traffic is again further
      encapsulated in another tunnel, either to implement VPNs or to
      perform Traffic Engineering. When doing VPN-based tunneling, the
      site has some control since the site is prepending a new tunnel
      header. In the case of TE-based tunneling, the site may have
      control if it is prepending a new tunnel header, but if the site's
      ISP is doing the TE, then the site has no control. Recursive
      tunneling generally will result in suboptimal paths but with the
      benefit of steering traffic to parts of the network that have
      resources available.

   o  The technique of re-encapsulation ensures that packets only
      require one tunnel header. So if a packet needs to be rerouted,
      it is first decapsulated by the ETR and then re-encapsulated with
      a new tunnel header using a new RLOC.

   The next sub-sections will describe where tunnel routers can reside
   in the network.

8.1.  First-hop/Last-hop Tunnel Routers

   By locating tunnel routers close to hosts, the EID-prefix set is at
   the granularity of an IP subnet. So at the expense of more
   EID-prefix-to-RLOC sets for the site, the caches in each tunnel
   router can remain relatively small. But caches always depend on the
   number of non-aggregated EID destination flows active through these
   tunnel routers.

   With more tunnel routers doing encapsulation, the increase in control
   traffic grows as well: since the EID-granularity is greater, more
   Map-Requests and Map-Replies are traveling between more routers.

   The advantage of placing the caches and databases at these stub
   routers is that the products deployed in this part of the network
   have better price-memory ratios than their core router counterparts.
   Memory is typically less expensive in these devices and fewer routes
   are stored (only IGP routes). These devices tend to have excess
   capacity, both for forwarding and routing state.

   LISP functionality can also be deployed in edge switches. These
   devices generally have layer-2 ports facing hosts and layer-3 ports
   facing the Internet. Spare capacity is often available in these
   devices as well.

8.2.  Border/Edge Tunnel Routers

   Using customer-edge (CE) routers for tunnel endpoints allows the EID
   space associated with a site to be reachable via a small set of RLOCs
   assigned to the CE routers for that site.

   This offers the opposite benefit of the first-hop/last-hop tunnel
   router scenario: the number of mapping entries and network management
   touch points are reduced, allowing better scaling.

   One disadvantage is that less of the network's resources are used to
   reach host endpoints thereby centralizing the point-of-failure domain
   and creating network choke points at the CE router.
   Note that more than one CE router at a site can be configured with
   the same IP address. In this case an RLOC is an anycast address.
   This allows resilience between the CE routers. That is, if a CE
   router fails, traffic is automatically routed to the other routers
   using the same anycast address. However, this comes with the
   disadvantage that the site cannot control the entrance point when
   the anycast route is advertised out from all border routers.

8.3.  ISP Provider-Edge (PE) Tunnel Routers

   Use of ISP PE routers as tunnel endpoint routers gives an ISP control
   over the location of the egress tunnel endpoints. That is, the ISP
   can decide if the tunnel endpoints are in the destination site (in
   either CE routers or last-hop routers within a site) or at other PE
   edges. The advantage of this case is that two or more tunnel headers
   can be avoided. By having the PE be the first router on the path to
   encapsulate, it can choose a TE path first, and the ETR can
   decapsulate and re-encapsulate for a tunnel to the destination end
   site.

   An obvious disadvantage is that the end site has no control over
   where its packets flow or the RLOCs used.

   As mentioned in earlier sections, a combination of these scenarios is
   possible at the expense of extra packet header overhead: if both site
   and provider want control, then recursive or re-encapsulating tunnels
   are used.

9.  Traceroute Considerations

   When a source host in a LISP site initiates a traceroute to a
   destination host in another LISP site, it is highly desirable for it
   to see the entire path. Since packets are encapsulated from ITR to
   ETR, the hop across the tunnel could be viewed as a single hop.
   However, LISP traceroute will provide the entire path so the user can
   see 3 distinct segments of the path from a source LISP host to a
   destination LISP host:

      Segment 1 (in source LISP site based on EIDs):

          source-host ---> first-hop ... next-hop ---> ITR

      Segment 2 (in the core network based on RLOCs):

          ITR ---> next-hop ... next-hop ---> ETR

      Segment 3 (in the destination LISP site based on EIDs):

          ETR ---> next-hop ... last-hop ---> destination-host

   For segment 1 of the path, ICMP Time Exceeded messages are returned
   in the normal manner as they are today. The ITR performs a TTL
   decrement and test for 0 before encapsulating. So the ITR hop is
   seen by the traceroute source as having an EID address (the address
   of the site-facing interface).

   For segment 2 of the path, ICMP Time Exceeded messages are returned
   to the ITR because the TTL decrement to 0 is done on the outer
   header, so the ICMP messages are destined to the ITR RLOC address,
   the source RLOC address of the encapsulated traceroute packet. The
   ITR looks inside of the ICMP payload to inspect the traceroute source
   so it can return the ICMP message to the address of the traceroute
   client as well as retaining the core router IP address in the ICMP
   message. This is so the traceroute client can display the core
   router address (the RLOC address) in the traceroute output. The ETR
   returns its RLOC address and responds to the TTL decrement to 0 like
   the previous core routers did.

   For segment 3, the next-hop router downstream from the ETR will be
   decrementing the TTL for the packet that was encapsulated, sent into
   the core, decapsulated by the ETR, and forwarded because it isn't the
   final destination.
   If the TTL is decremented to 0, any router on the
   path to the destination of the traceroute, including the next-hop
   router or destination, will send an ICMP Time Exceeded message to the
   source EID of the traceroute client. The ICMP message will be
   encapsulated by the local ITR and sent back to the ETR in the
   originated traceroute source site, where the packet will be delivered
   to the host.

9.1.  IPv6 Traceroute

   IPv6 traceroute follows the procedure described above since the
   entire traceroute data packet is included in the ICMP Time Exceeded
   message payload. Therefore, only the ITR needs to pay special
   attention for forwarding ICMP messages back to the traceroute source.

9.2.  IPv4 Traceroute

   For IPv4 traceroute, we cannot follow the above procedure since IPv4
   ICMP Time Exceeded messages only include the invoking IP header and 8
   bytes that follow the IP header. Therefore, when a core router sends
   an IPv4 Time Exceeded message to an ITR, all the ITR has in the ICMP
   payload is the encapsulated header it prepended followed by a UDP
   header. The original invoking IP header, and therefore the identity
   of the traceroute source, is lost.

   The solution we propose to solve this problem is to cache traceroute
   IPv4 headers in the ITR and to match them up with corresponding IPv4
   Time Exceeded messages received from core routers and the ETR. The
   ITR will use a circular buffer for caching the IPv4 and UDP headers
   of traceroute packets. It will select a 16-bit number as a key to
   find them later when the IPv4 Time Exceeded messages are received.
   When an ITR encapsulates an IPv4 traceroute packet, it will use the
   16-bit number as the UDP source port in the encapsulating header.
   When the ICMP Time Exceeded message is returned to the ITR, the UDP
   header of the encapsulating header is present in the ICMP payload
   thereby allowing the ITR to find the cached headers for the
   traceroute source. The ITR puts the cached headers in the payload
   and sends the ICMP Time Exceeded message to the traceroute source
   retaining the source address of the original ICMP Time Exceeded
   message (a core router or the ETR of the site of the traceroute
   destination).

9.3.  Traceroute using Mixed Locators

   When either an IPv4 traceroute or IPv6 traceroute is originated and
   the ITR encapsulates it in the other address family header, you
   cannot get all 3 segments of the traceroute. Segment 2 of the
   traceroute can not be conveyed to the traceroute source since it is
   expecting addresses from intermediate hops in the same address format
   for the type of traceroute it originated. Therefore, in this case,
   segment 2 will make the tunnel look like one hop. All the ITR has to
   do to make this work is to not copy the inner TTL to the outer,
   encapsulating header's TTL when a traceroute packet is encapsulated
   using an RLOC from a different address family. This will cause no
   TTL decrement to 0 to occur in core routers between the ITR and ETR.

10.  Mobility Considerations

   There are several kinds of mobility of which only some might be of
   concern to LISP. Essentially they are as follows.

10.1.  Site Mobility

   A site wishes to change its attachment points to the Internet, and
   its LISP Tunnel Routers will have new RLOCs when it changes upstream
   providers. Changes in EID-RLOC mappings for sites are expected to be
   handled by configuration, outside of the LISP protocol.

10.2.  Slow Endpoint Mobility

   An individual endpoint wishes to move, but is not concerned about
   maintaining session continuity. Renumbering is involved.
LISP can help with the issues surrounding renumbering [RFC4192] [LISA96] by decoupling the address space used by a site from the address spaces used by its ISPs [RFC4984].

10.3.  Fast Endpoint Mobility

Fast endpoint mobility occurs when an endpoint moves relatively rapidly, changing its IP layer network attachment point. Maintenance of session continuity is a goal. This is where the Mobile IPv4 [RFC3344bis] and Mobile IPv6 [RFC3775] [RFC4866] mechanisms are used, and primarily where interactions with LISP need to be explored.

The problem is that as an endpoint moves, it may require changes to the mapping between its EID and a set of RLOCs for its new network location. When this is added to the overhead of mobile IP binding updates, some packets might be delayed or dropped.

In IPv4 mobility, when an endpoint is away from home, packets to it are encapsulated and forwarded via a home agent which resides in the home area the endpoint's address belongs to. The home agent will encapsulate and forward packets either directly to the endpoint or to a foreign agent which resides where the endpoint has moved to. Packets from the endpoint may be sent directly to the correspondent node, may be sent via the foreign agent, or may be reverse-tunneled back to the home agent for delivery to the mobile node. As the mobile node's EID or available RLOC changes, LISP EID-to-RLOC mappings are required for communication between the mobile node and the home agent, whether via a foreign agent or not. As a mobile endpoint changes networks, up to three LISP mapping changes may be required:

o  The mobile node moves from an old location to a new visited network location and notifies its home agent that it has done so.
   The Mobile IPv4 control packets the mobile node sends pass through one of the new visited network's ITRs, which needs an EID-RLOC mapping for the home agent.

o  The home agent might not have the EID-RLOC mappings for the mobile node's "care-of" address or its foreign agent in the new visited network, in which case it will need to acquire them.

o  When packets are sent directly to the correspondent node, it may be that no traffic has been sent from the new visited network to the correspondent node's network, and the new visited network's ITR will need to obtain an EID-RLOC mapping for the correspondent node's site.

In addition, if the IPv4 endpoint is sending packets from the new visited network using its original EID, then LISP will need to perform a route-returnability check on the new EID-RLOC mapping for that EID.

In IPv6 mobility, packets can flow directly between the mobile node and the correspondent node in either direction. The mobile node uses its "care-of" address (EID). In this case, the route-returnability check would not be needed, but one more LISP mapping lookup may be required instead:

o  As above, three mapping changes may be needed for the mobile node to communicate with its home agent and to send packets to the correspondent node.

o  In addition, another mapping will be needed in the correspondent node's ITR, in order for the correspondent node to send packets to the mobile node's "care-of" address (EID) at the new network location.

When both endpoints are mobile, the number of potential mapping lookups increases accordingly.

As a mobile node moves, there are not only mobility state changes in the mobile node, correspondent node, and home agent, but also state changes in the ITRs and ETRs for at least some EID-prefixes.
The goal is to support rapid adaptation, with little delay or packet loss for the entire system. Heuristics can be added to LISP to reduce the number of mapping changes required and to reduce the delay per mapping change. IP mobility can also be modified to require fewer mapping changes. In order to increase overall system performance, there may be a need to reduce the optimization of one area in order to place fewer demands on another.

In LISP, one possibility is to "glean" information. When a packet arrives, the ETR could examine the EID-RLOC mapping and use that mapping for all outgoing traffic to that EID. It can do this after performing a route-returnability check, to ensure that the new network location does have an internal route to that endpoint. However, this does not cover the case where an ITR (the node assigned the RLOC) at the mobile-node location has been compromised.

Mobile IP packet exchange is designed for an environment in which all routing information is disseminated before packets can be forwarded. In order to allow the Internet to grow to support expected future use, we are moving to an environment where some information may have to be obtained after packets are in flight. Modifications to IP mobility should be considered in order to optimize the behavior of the overall system. Anything which decreases the number of new EID-RLOC mappings needed when a node moves, or maintains the validity of an EID-RLOC mapping for a longer time, is useful.

10.4.  Fast Network Mobility

In addition to endpoints, a network can be mobile, possibly changing xTRs. A "network" can be as small as a single router and as large as a whole site. This is different from site mobility in that it is fast and possibly short-lived, but different from endpoint mobility in that a whole prefix is changing RLOCs.
However, the mechanisms are the same and there is no new overhead in LISP. A Map-Request for any endpoint will return a binding for the entire mobile prefix.

If mobile networks become a more common occurrence, it may be useful to revisit the design of the mapping service and allow for dynamic updates of the database.

The issue of interactions between mobility and LISP needs to be explored further. Specific improvements to the entire system will depend on the details of mapping mechanisms. Mapping mechanisms should be evaluated on how well they support session continuity for mobile nodes.

11.  Multicast Considerations

A multicast group address, as defined in the original Internet architecture, is an identifier of a grouping of topologically independent receiver host locations. The address encoding itself does not determine the location of the receiver(s). The multicast routing protocol, and the network-based state the protocol creates, determines where the receivers are located.

In the context of LISP, a multicast group address is both an EID and a Routing Locator. Therefore, no specific semantic or action needs to be taken for a destination address, as it would appear in an IP header. A group address that appears in an inner IP header built by a source host will be used as the destination EID. The outer IP header (the destination Routing Locator address), prepended by a LISP router, will use the same group address as the destination Routing Locator.

Having said that, only the source EID and source Routing Locator need to be dealt with. Therefore, an ITR merely needs to put its own IP address in the source Routing Locator field when prepending the outer IP header. This source Routing Locator address, like any other Routing Locator address, MUST be globally routable.
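The header handling described above can be sketched as follows. This is an illustration only, not part of the specification: the IPHeader type, the function name, and the example addresses are hypothetical, and only the address fields are modeled rather than real packet encodings.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class IPHeader:
    src: object   # ipaddress.IPv4Address or ipaddress.IPv6Address
    dst: object


def encapsulate_multicast(inner, itr_rloc):
    """Build the outer header an ITR prepends to a multicast packet."""
    if not inner.dst.is_multicast:
        raise ValueError("inner destination is not a multicast group address")
    # No EID-to-RLOC lookup: the group address is reused as the destination
    # Routing Locator, and the ITR supplies its own address as the source.
    # (This sketch does not verify that itr_rloc is globally routable.)
    return IPHeader(src=itr_rloc, dst=inner.dst)


inner = IPHeader(src=ipaddress.ip_address("10.1.1.1"),    # hypothetical source EID
                 dst=ipaddress.ip_address("239.1.1.1"))   # hypothetical group
outer = encapsulate_multicast(inner, ipaddress.ip_address("192.0.2.1"))
```

Note that the sketch never consults a mapping database: the outer destination is copied from the inner header and the outer source is the ITR's own address.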
Therefore, an EID-to-RLOC mapping does not need to be performed by an ITR when a received data packet is a multicast data packet or when processing a source-specific Join (either by IGMPv3 or PIM). But the source Routing Locator is decided by the multicast routing protocol in a receiver site. That is, an EID to Routing Locator translation is done at control-time.

Another approach is to have the ITR not encapsulate a multicast packet and allow the host-built packet to flow into the core even if the source address is allocated out of the EID namespace. If the RPF-Vector TLV [RPFV] is used by PIM in the core, then core routers can RPF to the ITR (the Locator address which is injected into core routing) rather than the host source address (the EID address which is not injected into core routing).

To avoid any EID-based multicast state in the network core, the first approach is chosen for LISP-Multicast. Details for LISP-Multicast and Interworking with non-LISP sites are described in specification [MLISP].

12.  Security Considerations

It is believed that most of the security mechanisms will be part of the mapping database service when using control plane procedures for obtaining EID-to-RLOC mappings. For data plane triggered mappings, as described in this specification, protection is provided against ETR spoofing by using Return-Routability mechanisms evidenced by the use of a 4-byte Nonce field in the LISP encapsulation header. The nonce, coupled with the ITR accepting only solicited Map-Replies, goes a long way toward providing decent authentication.

LISP does not rely on a PKI or a more heavyweight authentication system. These systems challenge the scalability of LISP, which was a primary design goal.
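The nonce check described earlier in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the specified procedure: the class and method names are hypothetical, message encoding and timeouts are omitted, and the sketch assumes a reply is matched on the nonce alone and that each nonce is accepted at most once.

```python
import secrets


class MapRequestTracker:
    """Tracks outstanding requests so that only solicited replies are accepted."""

    def __init__(self):
        self.pending = {}   # 32-bit nonce -> EID the request was sent for

    def send_map_request(self, eid):
        """Record a request for `eid` and return the 4-byte nonce it carries."""
        nonce = secrets.randbits(32)
        while nonce in self.pending:   # avoid reusing an outstanding nonce
            nonce = secrets.randbits(32)
        self.pending[nonce] = eid
        return nonce

    def accept_map_reply(self, nonce):
        """Accept a reply only if its nonce matches an outstanding request."""
        # Assumption: a nonce is single-use, so it is removed once matched.
        return self.pending.pop(nonce, None) is not None


tracker = MapRequestTracker()
nonce = tracker.send_map_request("10.1.0.0/16")   # hypothetical EID-prefix
solicited = tracker.accept_map_reply(nonce)       # matches the pending request
replayed = tracker.accept_map_reply(nonce)        # same nonce a second time
```

An off-path sender that cannot observe the request has to guess a 32-bit value, which is why the nonce provides some protection without a heavier authentication system.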
DoS attack prevention will depend on implementations rate-limiting Map-Requests and Map-Replies to the control plane as well as rate-limiting the number of data-triggered Map-Replies.

To deal with map-cache exhaustion attempts in an ITR/PTR, the implementation should consider putting a cap on the number of entries stored, with a reserve list for special or frequently accessed sites. This should be a configuration policy control set by the network administrator who manages ITRs and PTRs.

13.  Prototype Plans and Status

The operator community has requested that the IETF take a practical approach to solving the scaling problems associated with global routing state growth. This document offers a simple solution which is intended for use in a pilot program to gain experience in working on this problem.

The authors hope that publishing this specification will allow the rapid implementation of multiple vendor prototypes and deployment on a small scale. Doing this will help the community:

o  Decide whether a new EID-to-RLOC mapping database infrastructure is needed or if a simple, UDP-based, data-triggered approach is flexible and robust enough.

o  Experiment with provider-independent assignment of EIDs while at the same time decreasing the size of DFZ routing tables through the use of topologically-aligned, provider-based RLOCs.

o  Determine whether multiple levels of tunneling can be used by ISPs to achieve their Traffic Engineering goals while simultaneously removing the more specific routes currently injected into the global routing system for this purpose.

o  Experiment with mobility to determine if both acceptable convergence and session continuity properties can be scalably implemented to support both individual device roaming and site service provider changes.

Here is a rough set of milestones:

1.  This draft will be the draft for interoperable implementations to code against. Interoperable implementations will be ready at the beginning of 2009.

2.  Continue pilot deployment using LISP-ALT as the database mapping mechanism.

3.  Continue prototyping and studying other database lookup schemes, be it DNS, DHTs, CONS, ALT, NERD, or other mechanisms.

4.  Implement the LISP Multicast draft [MLISP].

5.  Research more on how policy affects what gets returned in a Map-Reply from an ETR.

6.  Continue to experiment with mixed locator-sets to understand how LISP can help the IPv4 to IPv6 transition.

7.  Add more robustness to locator reachability between LISP sites.

As of this writing, the following has been accomplished:

1.  A unit- and system-tested software switching implementation has been completed on cisco NX-OS for this draft for both IPv4 and IPv6 EIDs using a mixed locator-set of IPv4 and IPv6 locators.

2.  A unit- and system-tested software switching implementation on cisco NX-OS has been completed for draft [ALT].

3.  A unit- and system-tested software switching implementation on cisco NX-OS has been completed for draft [INTERWORK]. Support for IPv4 translation is provided and PTR support for IPv4 and IPv6 is provided.

4.  The cisco NX-OS implementation supports an experimental mechanism for slow mobility.

5.  Dave Meyer, Vince Fuller, Darrel Lewis, Greg Shepherd, and Andrew Partan continue to test all the features described above on a dual-stack infrastructure.

6.  Darrel Lewis and Dave Meyer have deployed both LISP translation and LISP PTR support in the pilot network. Point your browser to http://www.lisp4.net to see translation in action, which lets your non-LISP site access a web server in a LISP site.

7.  Soon http://www.lisp6.net will work, allowing your IPv6 LISP site to talk to an IPv6 web server in a LISP site by using mixed address-family based locators.

8.  A public domain implementation of LISP is underway. See [OPENLISP] for details.

9.  We have started deploying Map-Resolvers and Map-Servers on the pilot network to gather experience with [LISP-MS].

10. A cisco IOS implementation is underway which currently supports IPv4 encapsulation and decapsulation features.

If you are interested in writing a LISP implementation, testing any of the LISP implementations, or taking part in the LISP pilot program, please contact lisp@ietf.org.

14.  References

14.1.  Normative References

[RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768, August 1980.

[RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, November 1990.

[RFC1498]  Saltzer, J., "On the Naming and Binding of Network Destinations", RFC 1498, August 1993.

[RFC1955]  Hinden, R., "New Scheme for Internet Routing and Addressing (ENCAPS) for IPNG", RFC 1955, June 1996.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2402]  Kent, S. and R. Atkinson, "IP Authentication Header", RFC 2402, November 1998.

[RFC2434]  Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 2434, October 1998.

[RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, March 2000.

[RFC3056]  Carpenter, B. and K. Moore, "Connection of IPv6 Domains via IPv4 Clouds", RFC 3056, February 2001.

[RFC3775]  Johnson, D., Perkins, C., and J. Arkko, "Mobility Support in IPv6", RFC 3775, June 2004.

[RFC4423]  Moskowitz, R. and P.
           Nikander, "Host Identity Protocol (HIP) Architecture", RFC 4423, May 2006.

[RFC4866]  Arkko, J., Vogt, C., and W. Haddad, "Enhanced Route Optimization for Mobile IPv6", RFC 4866, May 2007.

[RFC4984]  Meyer, D., Zhang, L., and K. Fall, "Report from the IAB Workshop on Routing and Addressing", RFC 4984, September 2007.

14.2.  Informative References

[AFI]      IANA, "Address Family Indicators (AFIs)", ADDRESS FAMILY NUMBERS http://www.iana.org/numbers.html, February 2007.

[ALT]      Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "LISP Alternative Topology (LISP-ALT)", draft-fuller-lisp-alt-03.txt (work in progress), February 2009.

[APT]      Jen, D., Meisel, M., Massey, D., Wang, L., Zhang, B., and L. Zhang, "APT: A Practical Transit Mapping Service", draft-jen-apt-01.txt (work in progress), November 2007.

[CHIAPPA]  Chiappa, J., "Endpoints and Endpoint names: A Proposed Enhancement to the Internet Architecture", Internet-Draft http://www.chiappa.net/~jnc/tech/endpoints.txt, 1999.

[CONS]     Farinacci, D., Fuller, V., and D. Meyer, "LISP-CONS: A Content distribution Overlay Network Service for LISP", draft-meyer-lisp-cons-03.txt (work in progress), November 2007.

[DHTs]     Ratnasamy, S., Shenker, S., and I. Stoica, "Routing Algorithms for DHTs: Some Open Questions", PDF file http://www.cs.rice.edu/Conferences/IPTPS02/174.pdf.

[GSE]      "GSE - An Alternate Addressing Architecture for IPv6", draft-ietf-ipngwg-gseaddr-00.txt (work in progress), 1997.

[INTERWORK]
           Lewis, D., Meyer, D., Farinacci, D., and V. Fuller, "Interworking LISP with IPv4 and IPv6", draft-lewis-lisp-interworking-01.txt (work in progress), January 2009.

[LISA96]   Lear, E., Katinsky, J., Coffin, J., and D. Tharp, "Renumbering: Threat or Menace?", Usenix, September 1996.

[LISP-MAIN]
           Farinacci, D., Fuller, V., Meyer, D., and D.
           Lewis, "Locator/ID Separation Protocol (LISP)", draft-farinacci-lisp-12.txt (work in progress), March 2009.

[LISP-MS]  Farinacci, D. and V. Fuller, "LISP Map Server", draft-fuller-lisp-ms-00.txt (work in progress), March 2009.

[LISP1]    Farinacci, D., Oran, D., Fuller, V., and J. Schiller, "Locator/ID Separation Protocol (LISP1) [Routable ID Version]", Slide-set http://www.dinof.net/~dino/ietf/lisp1.ppt, October 2006.

[LISP2]    Farinacci, D., Oran, D., Fuller, V., and J. Schiller, "Locator/ID Separation Protocol (LISP2) [DNS-based Version]", Slide-set http://www.dinof.net/~dino/ietf/lisp2.ppt, November 2006.

[LISPDHT]  Mathy, L., Iannone, L., and O. Bonaventure, "LISP-DHT: Towards a DHT to map identifiers onto locators", draft-mathy-lisp-dht-00.txt (work in progress), February 2008.

[LOC-ID-ARCH]
           Meyer, D. and D. Lewis, "Architectural Implications of Locator/ID Separation", draft-meyer-loc-id-implications-01.txt (work in progress), January 2009.

[MLISP]    Farinacci, D., Meyer, D., Zwiebel, J., and S. Venaas, "LISP for Multicast Environments", draft-ietf-lisp-multicast-00.txt (work in progress), May 2009.

[NERD]     Lear, E., "NERD: A Not-so-novel EID to RLOC Database", draft-lear-lisp-nerd-04.txt (work in progress), April 2008.

[OPENLISP]
           Iannone, L. and O. Bonaventure, "OpenLISP Implementation Report", draft-iannone-openlisp-implementation-01.txt (work in progress), July 2008.

[RADIR]    Narten, T., "Routing and Addressing Problem Statement", draft-narten-radir-problem-statement-00.txt (work in progress), July 2007.

[RFC3344bis]
           Perkins, C., "IP Mobility Support for IPv4, revised", draft-ietf-mip4-rfc3344bis-05 (work in progress), July 2007.

[RFC4192]  Baker, F., Lear, E., and R.
           Droms, "Procedures for Renumbering an IPv6 Network without a Flag Day", RFC 4192, September 2005.

[RPFV]     Wijnands, IJ., Boers, A., and E. Rosen, "The RPF Vector TLV", draft-ietf-pim-rpf-vector-08.txt (work in progress), January 2009.

[RPMD]     Handley, M., Huici, F., and A. Greenhalgh, "RPMD: Protocol for Routing Protocol Meta-data Dissemination", draft-handley-p2ppush-unpublished-2007726.txt (work in progress), July 2007.

[SHIM6]    Nordmark, E. and M. Bagnulo, "Level 3 multihoming shim protocol", draft-ietf-shim6-proto-06.txt (work in progress), October 2006.

Appendix A.  Acknowledgments

An initial thank you goes to Dave Oran for planting the seeds for the initial ideas for LISP. His consultation continues to provide value to the LISP authors.

A special and appreciative thank you goes to Noel Chiappa for providing architectural impetus over the past decades on separation of location and identity, as well as detailed review of the LISP architecture and documents, coupled with enthusiasm for making LISP a practical and incremental transition for the Internet.

The authors would like to gratefully acknowledge many people who have contributed discussion and ideas to the making of this proposal. They include Scott Brim, Andrew Partan, John Zwiebel, Jason Schiller, Lixia Zhang, Dorian Kim, Peter Schoenmaker, Vijay Gill, Geoff Huston, David Conrad, Mark Handley, Ron Bonica, Ted Seely, Mark Townsley, Chris Morrow, Brian Weis, Dave McGrew, Peter Lothberg, Dave Thaler, Eliot Lear, Shane Amante, Ved Kafle, Olivier Bonaventure, Luigi Iannone, Robin Whittle, Brian Carpenter, Joel Halpern, Roger Jorgensen, Ran Atkinson, Stig Venaas, Iljitsch van Beijnum, Roland Bless, Dana Blair, Bill Lynch, Marc Woolward, Damien Saucez, Damian Lezama, Attilla De Groot, and Parantap Lahiri.
In particular, we would like to thank Dave Meyer for his clever suggestion for the name "LISP". ;-)

This work originated in the Routing Research Group (RRG) of the IRTF. The individual submission [LISP-MAIN] was converted into this IETF LISP working group draft.

Authors' Addresses

Dino Farinacci
cisco Systems
Tasman Drive
San Jose, CA 95134
USA

Email: dino@cisco.com

Vince Fuller
cisco Systems
Tasman Drive
San Jose, CA 95134
USA

Email: vaf@cisco.com

Dave Meyer
cisco Systems
170 Tasman Drive
San Jose, CA
USA

Email: dmm@cisco.com

Darrel Lewis
cisco Systems
170 Tasman Drive
San Jose, CA
USA

Email: darlewis@cisco.com