idnits 2.17.1

draft-ietf-lisp-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Sep 2009 rather than the newer Notice from 28 Dec 2009. (See
     https://trustee.ietf.org/license-info/)

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 5 instances of lines with non-RFC2606-compliant FQDNs in the
     document.

  == There are 10 instances of lines with private range IPv4 addresses in the
     document. If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (April 25, 2010) is 5114 days in the past. Is this
     intentional?

  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 1700 (Obsoleted by RFC 3232)
  ** Obsolete normative reference: RFC 3775 (Obsoleted by RFC 6275)
  ** Obsolete normative reference: RFC 4423 (Obsoleted by RFC 9063)
  ** Obsolete normative reference: RFC 4634 (Obsoleted by RFC 6234)
  ** Obsolete normative reference: RFC 5226 (Obsoleted by RFC 8126)

  == Outdated reference: A later version (-01) exists of
     draft-eubanks-chimento-6man-00
  == Outdated reference: A later version (-10) exists of
     draft-ietf-lisp-alt-04
  == Outdated reference: A later version (-04) exists of
     draft-meyer-lisp-cons-03
  == Outdated reference: A later version (-06) exists of
     draft-ietf-lisp-interworking-01
  == Outdated reference: A later version (-10) exists of
     draft-farinacci-lisp-lcaf-01
  == Outdated reference: A later version (-06) exists of
     draft-ietf-lisp-lig-00
  == Outdated reference: A later version (-16) exists of
     draft-meyer-lisp-mn-00
  == Outdated reference: A later version (-16) exists of
     draft-ietf-lisp-ms-05
  -- No information found for draft-mathy-lisp-dht - is the name correct?
  == Outdated reference: A later version (-14) exists of
     draft-ietf-lisp-multicast-03
  == Outdated reference: A later version (-09) exists of
     draft-lear-lisp-nerd-08
  == Outdated reference: A later version (-05) exists of
     draft-narten-radir-problem-statement-00
  == Outdated reference: A later version (-10) exists of
     draft-ietf-mip4-rfc3344bis-05
  -- No information found for draft-handley-p2ppush-unpublished-2007726 - is
     the name correct?
  == Outdated reference: A later version (-02) exists of
     draft-iannone-lisp-mapping-versioning-01

     Summary: 6 errors (**), 0 flaws (~~), 16 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                       D. Farinacci
Internet-Draft                                                 V. Fuller
Intended status: Experimental                                   D. Meyer
Expires: October 27, 2010                                       D. Lewis
                                                           cisco Systems
                                                          April 25, 2010

                 Locator/ID Separation Protocol (LISP)
                           draft-ietf-lisp-07

Abstract

   This draft describes a simple, incremental, network-based protocol to
   implement separation of Internet addresses into Endpoint Identifiers
   (EIDs) and Routing Locators (RLOCs). This mechanism requires no
   changes to host stacks and no major changes to existing database
   infrastructures. The proposed protocol can be implemented in a
   relatively small number of routers.

   This proposal was stimulated by the problem statement effort at the
   Amsterdam IAB Routing and Addressing Workshop (RAWS), which took
   place in October 2006.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on October 27, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Table of Contents

   1.  Requirements Notation
   2.  Introduction
   3.  Definition of Terms
   4.  Basic Overview
     4.1.  Packet Flow Sequence
   5.  Tunneling Details
     5.1.  LISP IPv4-in-IPv4 Header Format
     5.2.  LISP IPv6-in-IPv6 Header Format
     5.3.  Tunnel Header Field Descriptions
     5.4.  Dealing with Large Encapsulated Packets
       5.4.1.  A Stateless Solution to MTU Handling
       5.4.2.  A Stateful Solution to MTU Handling
     5.5.  Using Virtualization and Segmentation with LISP
   6.  EID-to-RLOC Mapping
     6.1.  LISP IPv4 and IPv6 Control Plane Packet Formats
       6.1.1.  LISP Packet Type Allocations
       6.1.2.  Map-Request Message Format
       6.1.3.  EID-to-RLOC UDP Map-Request Message
       6.1.4.  Map-Reply Message Format
       6.1.5.  EID-to-RLOC UDP Map-Reply Message
       6.1.6.  Map-Register Message Format
       6.1.7.  Encapsulated Control Message Format
     6.2.  Routing Locator Selection
     6.3.  Routing Locator Reachability
       6.3.1.  Echo Nonce Algorithm
       6.3.2.  RLOC Probing Algorithm
     6.4.  Routing Locator Hashing
     6.5.  Changing the Contents of EID-to-RLOC Mappings
       6.5.1.  Clock Sweep
       6.5.2.  Solicit-Map-Request (SMR)
       6.5.3.  Database Map Versioning
   7.  Router Performance Considerations
   8.  Deployment Scenarios
     8.1.  First-hop/Last-hop Tunnel Routers
     8.2.  Border/Edge Tunnel Routers
     8.3.  ISP Provider-Edge (PE) Tunnel Routers
     8.4.  LISP Functionality with Conventional NATs
   9.  Traceroute Considerations
     9.1.  IPv6 Traceroute
     9.2.  IPv4 Traceroute
     9.3.  Traceroute using Mixed Locators
   10. Mobility Considerations
     10.1. Site Mobility
     10.2. Slow Endpoint Mobility
     10.3. Fast Endpoint Mobility
     10.4. Fast Network Mobility
     10.5. LISP Mobile Node Mobility
   11. Multicast Considerations
   12. Security Considerations
   13. IANA Considerations
   14. Prototype Plans and Status
   15. References
     15.1. Normative References
     15.2. Informative References
   Appendix A.  Acknowledgments
   Appendix B.  Document Change Log
     B.1.  Changes to draft-ietf-lisp-07.txt
     B.2.  Changes to draft-ietf-lisp-06.txt
     B.3.  Changes to draft-ietf-lisp-05.txt
     B.4.  Changes to draft-ietf-lisp-04.txt
     B.5.  Changes to draft-ietf-lisp-03.txt
     B.6.  Changes to draft-ietf-lisp-02.txt
     B.7.  Changes to draft-ietf-lisp-01.txt
     B.8.  Changes to draft-ietf-lisp-00.txt
   Authors' Addresses

1.  Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Introduction

   Many years of discussion about the current IP routing and addressing
   architecture have noted that its use of a single numbering space (the
   "IP address") for both host transport session identification and
   network routing creates scaling issues (see [CHIAPPA] and [RFC1498]).
   A number of scaling benefits would be realized by separating the
   current IP address into separate spaces for Endpoint Identifiers
   (EIDs) and Routing Locators (RLOCs); among them are:

   1.  Reduction of routing table size in the "default-free zone" (DFZ).
       Use of a separate numbering space for RLOCs will allow them to be
       assigned topologically (in today's Internet, RLOCs would be
       assigned by providers at client network attachment points),
       greatly improving aggregation and reducing the number of
       globally-visible, routable prefixes.

   2.  More cost-effective multihoming for sites that connect to
       different service providers where they can control their own
       policies for packet flow into the site without using extra
       routing table resources of core routers.

   3.  Easing of renumbering burden when clients change providers.
       Because host EIDs are numbered from a separate, non-provider-
       assigned and non-topologically-bound space, they do not need to
       be renumbered when a client site changes its attachment points to
       the network.

   4.  Traffic engineering capabilities that can be performed by network
       elements and do not depend on injecting additional state into the
       routing system. This will fall out of the mechanism that is used
       to implement the EID/RLOC split (see Section 4).

   5.  Mobility without address changing. Existing mobility mechanisms
       will be able to work in a locator/ID separation scenario. It
       will be possible for a host (or a collection of hosts) to move to
       a different point in the network topology either retaining its
       home-based address or acquiring a new address based on the new
       network location. A new network location could be a physically
       different point in the network topology or the same physical
       point of the topology with a different provider.

   This draft describes protocol mechanisms to achieve the desired
   functional separation. For flexibility, the mechanism used for
   forwarding packets is decoupled from that used to determine EID to
   RLOC mappings. This document covers the former. For the latter, see
   [CONS], [ALT], [EMACS], [RPMD], and [NERD]. This work is in response
   to and intended to address the problem statement that came out of the
   RAWS effort [RFC4984].

   The Routing and Addressing problem statement can be found in [RADIR].

   This draft focuses on a router-based solution. Building the solution
   into the network will facilitate incremental deployment of the
   technology on the Internet. Note that while the detailed protocol
   specification and examples in this document assume IP version 4
   (IPv4), there is nothing in the design that precludes use of the same
   techniques and mechanisms for IPv6. It should be possible for IPv4
   packets to use IPv6 RLOCs and for IPv6 EIDs to be mapped to IPv4
   RLOCs.

   Related work on host-based solutions is described in Shim6 [RFC5533]
   and HIP [RFC4423]. Related work on a router-based solution is
   described in [GSE]. This draft attempts to not compete or overlap
   with such solutions and the proposed protocol changes are expected to
   complement a host-based mechanism when Traffic Engineering
   functionality is desired.

   Some of the design goals of this proposal include:

   1.  Require no hardware or software changes to end-systems (hosts).

   2.  Minimize required changes to Internet infrastructure.

   3.  Be incrementally deployable.

   4.  Require no router hardware changes.

   5.  Minimize the number of routers which have to be modified. In
       particular, most customer site routers and no core routers
       require changes.

   6.  Minimize router software changes in those routers which are
       affected.

   7.  Avoid or minimize packet loss when EID-to-RLOC mappings need to
       be performed.

   There are 4 variants of LISP, which differ along a spectrum of strong
   to weak dependence on the topological nature and possible need for
   routability of EIDs. The variants are:

   LISP 1:  uses EIDs that are routable through the RLOC topology for
      bootstrapping EID-to-RLOC mappings.
      [LISP1] This was intended as
      a prototyping mechanism for early protocol implementation. It is
      now deprecated and SHOULD NOT be deployed.

   LISP 1.5:  uses EIDs that are routable for bootstrapping EID-to-RLOC
      mappings; such routing is via a separate topology.

   LISP 2:  uses EIDs that are not routable and EID-to-RLOC mappings are
      implemented within the DNS. [LISP2]

   LISP 3:  uses non-routable EIDs that are used as lookup keys for a
      new EID-to-RLOC mapping database. Use of Distributed Hash Tables
      [DHTs] [LISPDHT] to implement such a database would be an area to
      explore. Other examples of new mapping database services are
      [CONS], [ALT], [RPMD], [NERD], and [APT].

   This document focuses on the LISP 1.5 and LISP 3 variants, both of
   which rely on a router-based distributed cache and database for
   EID-to-RLOC mappings. The LISP 1.0 mechanism works but does not
   allow reduction of routing information in the default-free-zone of
   the Internet. The LISP 2 mechanisms are put on hold and may never
   come to fruition since it is not architecturally pure to have routing
   depend on directory and directory depend on routing. The LISP 3
   mechanisms will be documented elsewhere but may use the control-plane
   options specified in this specification.

3.  Definition of Terms

   Provider Independent (PI) Addresses:  an address block assigned from
      a pool where blocks are not associated with any particular
      location in the network (e.g. from a particular service provider),
      and is therefore not topologically aggregatable in the routing
      system.

   Provider Assigned (PA) Addresses:  a block of IP addresses that are
      assigned to a site by each service provider to which a site
      connects. Typically, each block is a sub-block of a service
      provider CIDR block and is aggregated into the larger block before
      being advertised into the global Internet. Traditionally, IP
      multihoming has been implemented by each multi-homed site
      acquiring its own, globally-visible prefix. LISP uses only
      topologically-assigned and aggregatable address blocks for RLOCs,
      eliminating this demonstrably non-scalable practice.

   Routing Locator (RLOC):  the IPv4 or IPv6 address of an egress
      tunnel router (ETR). It is the output of an EID-to-RLOC mapping
      lookup. An EID maps to one or more RLOCs. Typically, RLOCs are
      numbered from topologically-aggregatable blocks that are assigned
      to a site at each point to which it attaches to the global
      Internet; where the topology is defined by the connectivity of
      provider networks, RLOCs can be thought of as PA addresses.
      Multiple RLOCs can be assigned to the same ETR device or to
      multiple ETR devices at a site.

   Endpoint ID (EID):  a 32-bit (for IPv4) or 128-bit (for IPv6) value
      used in the source and destination address fields of the first
      (most inner) LISP header of a packet. The host obtains a
      destination EID the same way it obtains a destination address
      today, for example through a DNS lookup or SIP exchange. The
      source EID is obtained via existing mechanisms used to set a
      host's "local" IP address. An EID is allocated to a host from an
      EID-prefix block associated with the site where the host is
      located. An EID can be used by a host to refer to other hosts.
      EIDs MUST NOT be used as LISP RLOCs. Note that EID blocks may be
      assigned in a hierarchical manner, independent of the network
      topology, to facilitate scaling of the mapping database. In
      addition, an EID block assigned to a site may have site-local
      structure (subnetting) for routing within the site; this structure
      is not visible to the global routing system. When used in
      discussions with other Locator/ID separation proposals, a LISP EID
      will be called a "LEID". Throughout this document, any reference
      to "EID" refers to an LEID.

   EID-prefix:  A power-of-2 block of EIDs which are allocated to a
      site by an address allocation authority. EID-prefixes are
      associated with a set of RLOC addresses which make up a "database
      mapping". EID-prefix allocations can be broken up into smaller
      blocks when an RLOC set is to be associated with the smaller EID-
      prefix. A globally routed address block (whether PI or PA) is not
      an EID-prefix. However, a globally routed address block may be
      removed from global routing and reused as an EID-prefix. A site
      that receives an explicitly allocated EID-prefix may not use that
      EID-prefix as a globally routed prefix assigned to RLOCs.

   End-system:  is an IPv4 or IPv6 device that originates packets with
      a single IPv4 or IPv6 header. The end-system supplies an EID
      value for the destination address field of the IP header when
      communicating globally (i.e. outside of its routing domain). An
      end-system can be a host computer, a switch or router device, or
      any network appliance.

   Ingress Tunnel Router (ITR):  a router which accepts an IP packet
      with a single IP header (more precisely, an IP packet that does
      not contain a LISP header). The router treats this "inner" IP
      destination address as an EID and performs an EID-to-RLOC mapping
      lookup. The router then prepends an "outer" IP header with one of
      its globally-routable RLOCs in the source address field and the
      result of the mapping lookup in the destination address field.
      Note that this destination RLOC may be an intermediate, proxy
      device that has better knowledge of the EID-to-RLOC mapping closer
      to the destination EID. In general, an ITR receives IP packets
      from site end-systems on one side and sends LISP-encapsulated IP
      packets toward the Internet on the other side.

      Specifically, when a service provider prepends a LISP header for
      Traffic Engineering purposes, the router that does this is also
      regarded as an ITR. The outer RLOC the ISP ITR uses can be based
      on the outer destination address (the originating ITR's supplied
      RLOC) or the inner destination address (the originating host's
      supplied EID).

   TE-ITR:  is an ITR that is deployed in a service provider network
      that prepends an additional LISP header for Traffic Engineering
      purposes.

   Egress Tunnel Router (ETR):  a router that accepts an IP packet
      where the destination address in the "outer" IP header is one of
      its own RLOCs. The router strips the "outer" header and forwards
      the packet based on the next IP header found. In general, an ETR
      receives LISP-encapsulated IP packets from the Internet on one
      side and sends decapsulated IP packets to site end-systems on the
      other side. ETR functionality does not have to be limited to a
      router device. A server host can be the endpoint of a LISP tunnel
      as well.

   TE-ETR:  is an ETR that is deployed in a service provider network
      that strips an outer LISP header for Traffic Engineering purposes.

   xTR:  is a reference to an ITR or ETR when direction of data flow is
      not part of the context description. xTR refers to the router that
      is the tunnel endpoint. Used synonymously with the term "Tunnel
      Router". For example, "An xTR can be located at the Customer Edge
      (CE) router", meaning both ITR and ETR functionality is at the CE
      router.

   EID-to-RLOC Cache:  a short-lived, on-demand table in an ITR that
      stores, tracks, and is responsible for timing-out and otherwise
      validating EID-to-RLOC mappings. This cache is distinct from the
      full "database" of EID-to-RLOC mappings: the cache is dynamic,
      local to the ITR(s), and relatively small, while the database is
      distributed, relatively static, and much more global in scope.
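
      As a rough illustration of the EID-to-RLOC Cache just defined,
      the following Python sketch models an on-demand table keyed by
      EID-prefix, with longest-prefix lookup and removal of entries
      that have been idle too long. It is not part of this
      specification: the class and method names, the 300-second idle
      timeout, and the use of caller-supplied timestamps are all
      assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class CacheEntry:
    rlocs: list        # RLOC addresses learned from a Map-Reply (may be empty)
    last_used: float   # time the entry was installed or last matched


class MapCache:
    """Hypothetical ITR-side EID-to-RLOC cache with an inactivity timeout."""

    def __init__(self, idle_timeout=300.0):   # 300 s is an assumed value
        self.idle_timeout = idle_timeout
        self.entries = {}                     # ip_network -> CacheEntry

    def install(self, eid_prefix, rlocs, now):
        """Store the mapping carried in a Map-Reply."""
        key = ipaddress.ip_network(eid_prefix)
        self.entries[key] = CacheEntry(list(rlocs), now)

    def lookup(self, eid, now):
        """Return RLOCs for the longest matching EID-prefix, or None on a miss."""
        addr = ipaddress.ip_address(eid)
        # Time out entries that have been idle for longer than the limit.
        self.entries = {p: e for p, e in self.entries.items()
                        if now - e.last_used <= self.idle_timeout}
        matches = [p for p in self.entries
                   if p.version == addr.version and addr in p]
        if not matches:
            return None    # a miss is where an ITR would send a Map-Request
        best = max(matches, key=lambda p: p.prefixlen)
        self.entries[best].last_used = now
        return self.entries[best].rlocs
```

      A caller would invoke install() when a Map-Reply arrives and
      lookup() for each packet to be encapsulated; a linear scan is
      used here only to keep the sketch short.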

   EID-to-RLOC Database:  a global distributed database that contains
      all known EID-prefix to RLOC mappings. Each potential ETR
      typically contains a small piece of the database: the EID-to-RLOC
      mappings for the EID prefixes "behind" the router. These map to
      one of the router's own, globally-visible, IP addresses. The same
      database mapping entries MUST be configured on all ETRs for a
      given site. That is, the EID-prefixes for the site and locator-
      set for each EID-prefix MUST be the same on all ETRs so they
      consistently send Map-Reply messages with the same database
      mapping contents.

   Recursive Tunneling:  when a packet has more than one LISP IP
      header. Additional layers of tunneling may be employed to
      implement traffic engineering or other re-routing as needed. When
      this is done, an additional "outer" LISP header is added and the
      original RLOCs are preserved in the "inner" header. Any
      reference to tunnels in this specification refers to dynamic
      encapsulating tunnels; they are never statically configured.

   Reencapsulating Tunnels:  when a packet has no more than one LISP IP
      header (two IP headers total) and when it needs to be diverted to
      a new RLOC, an ETR can decapsulate the packet (remove the LISP
      header) and prepend a new tunnel header, with the new RLOC, on to
      the packet. Doing this allows a packet to be re-routed by the re-
      encapsulating router without adding the overhead of additional
      tunnel headers. Any reference to tunnels in this specification
      refers to dynamic encapsulating tunnels; they are never
      statically configured.

   LISP Header:  a term used in this document to refer to the outer
      IPv4 or IPv6 header, a UDP header, and a LISP-specific 8-byte
      header that follows the UDP header, which an ITR prepends or an
      ETR strips.

   Address Family Identifier (AFI):  a term used to describe an address
      encoding in a packet. An address family currently pertains to an
      IPv4 or IPv6 address. See [AFI] and [RFC1700] for details. An
      AFI value of 0 used in this specification indicates an unspecified
      encoded address where the length of the address is 0 bytes
      following the 16-bit AFI value of 0.

   Negative Mapping Entry:  also known as a negative cache entry, is an
      EID-to-RLOC entry where an EID-prefix is advertised or stored with
      no RLOCs. That is, the locator-set for the EID-to-RLOC entry is
      empty or has an encoded locator count of 0. This type of entry
      could be used to describe a prefix from a non-LISP site, which is
      explicitly not in the mapping database. There is a set of well-
      defined actions that are encoded in a Negative Map-Reply.

   Data Probe:  a LISP-encapsulated data packet where the inner header
      destination address equals the outer header destination address
      used to trigger a Map-Reply by a decapsulating ETR. In addition,
      the original packet is decapsulated and delivered to the
      destination host. A Data Probe is used in some of the mapping
      database designs to "probe" or request a Map-Reply from an ETR; in
      other cases, Map-Requests are used. See each mapping database
      design for details.

   Proxy ITR (PITR):  also known as a PTR, is defined and described in
      [INTERWORK]. A PITR acts like an ITR but does so on behalf of non-
      LISP sites which send packets to destinations at LISP sites.

   Proxy ETR (PETR):  is defined and described in [INTERWORK]. A PETR
      acts like an ETR but does so on behalf of LISP sites which send
      packets to destinations at non-LISP sites.

   Route-returnability:  is an assumption that the underlying routing
      system will deliver packets to the destination. When combined
      with a nonce that is provided by a sender and returned by a
      receiver, this limits off-path data insertion.
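
      The nonce check that accompanies route-returnability can be
      pictured with the short Python sketch below. It is only an
      illustration under stated assumptions: the NonceTracker class,
      its method names, and the 64-bit nonce width are hypothetical
      and are not taken from the message formats in this document.

```python
import secrets


class NonceTracker:
    """Hypothetical sender-side check that a reply echoes a nonce we sent."""

    def __init__(self):
        self.pending = {}    # nonce -> destination the request was sent to

    def new_request(self, destination):
        """Pick an unpredictable nonce and remember it until a reply arrives."""
        nonce = secrets.randbits(64)    # nonce width is an assumption
        self.pending[nonce] = destination
        return nonce

    def accept_reply(self, nonce):
        """Accept a reply once, and only if its nonce is still outstanding."""
        return self.pending.pop(nonce, None) is not None
```

      An off-path sender that never saw the request would have to
      guess the nonce, which is what limits data insertion.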

   LISP site:  is a set of routers in an edge network that are under a
      single technical administration. LISP routers which reside in the
      edge network are the demarcation points to separate the edge
      network from the core network.

4.  Basic Overview

   One key concept of LISP is that end-systems (hosts) operate the same
   way they do today. The IP addresses that hosts use for tracking
   sockets, connections, and for sending and receiving packets do not
   change. In LISP terminology, these IP addresses are called Endpoint
   Identifiers (EIDs).

   Routers continue to forward packets based on IP destination
   addresses. When a packet is LISP encapsulated, these addresses are
   referred to as Routing Locators (RLOCs). Most routers along a path
   between two hosts will not change; they continue to perform routing/
   forwarding lookups on the destination addresses. For routers between
   the source host and the ITR as well as routers from the ETR to the
   destination host, the destination address is an EID. For the routers
   between the ITR and the ETR, the destination address is an RLOC.

   This design introduces "Tunnel Routers", which prepend LISP headers
   on host-originated packets and strip them prior to final delivery to
   their destination. The IP addresses in this "outer header" are
   RLOCs. During end-to-end packet exchange between two Internet hosts,
   an ITR prepends a new LISP header to each packet and an egress tunnel
   router strips the new header. The ITR performs EID-to-RLOC lookups
   to determine the routing path to the ETR, which has the RLOC as
   one of its IP addresses.

   Some basic rules governing LISP are:

   o  End-systems (hosts) only send to addresses which are EIDs. They
      don't know whether addresses are EIDs or RLOCs but assume packets
      get to LISP routers, which in turn deliver packets to the
      destination the end-system has specified.

   o  EIDs are always IP addresses assigned to hosts.

   o  LISP routers mostly deal with Routing Locator addresses. See
      details later in Section 4.1 to clarify what is meant by "mostly".

   o  RLOCs are always IP addresses assigned to routers; preferably,
      topologically-oriented addresses from provider CIDR blocks.

   o  When a router originates packets it may use as a source address
      either an EID or RLOC. When acting as a host (e.g. when
      terminating a transport session such as SSH, TELNET, or SNMP), it
      may use an EID that is explicitly assigned for that purpose. An
      EID that identifies the router as a host MUST NOT be used as an
      RLOC; an EID is only routable within the scope of a site. A
      typical BGP configuration might demonstrate this "hybrid" EID/RLOC
      usage where a router could use its "host-like" EID to terminate
      iBGP sessions to other routers in a site while at the same time
      using RLOCs to terminate eBGP sessions to routers outside the
      site.

   o  EIDs are not expected to be usable for global end-to-end
      communication in the absence of an EID-to-RLOC mapping operation.
      They are expected to be used locally for intra-site communication.

   o  EID prefixes are likely to be hierarchically assigned in a manner
      which is optimized for administrative convenience and to
      facilitate scaling of the EID-to-RLOC mapping database. The
      hierarchy is based on an address allocation hierarchy which is not
      dependent on the network topology.

   o  EIDs may also be structured (subnetted) in a manner suitable for
      local routing within an autonomous system.

   An additional LISP header may be prepended to packets by a transit
   router (i.e. TE-ITR) when re-routing of the path for a packet is
   desired. An obvious instance of this would be an ISP router that
   needs to perform traffic engineering for packets flowing through its
   network. In such a situation, termed Recursive Tunneling, an ISP
   transit acts as an additional ingress tunnel router and the RLOC it
   uses for the new prepended header would be either a TE-ETR within the
   ISP (along an intra-ISP traffic engineered path) or a TE-ETR within
   another ISP (an inter-ISP traffic engineered path, where an agreement
   to build such a path exists).

   This specification mandates that no more than two LISP headers get
   prepended to a packet. This avoids excessive packet overhead as well
   as possible encapsulation loops. It is believed two headers are
   sufficient, where the first prepended header is used at a site for
   Location/Identity separation and the second prepended header is used
   inside a service provider for Traffic Engineering purposes.

   Tunnel Routers can be placed fairly flexibly in a multi-AS topology.
   For example, the ITR for a particular end-to-end packet exchange
   might be the first-hop or default router within a site for the source
   host. Similarly, the egress tunnel router might be the last-hop
   router directly-connected to the destination host. In another
   example, perhaps for a VPN service out-sourced to an ISP by a site,
   the ITR could be the site's border router at the service provider
   attachment point. Mixing and matching of site-operated, ISP-operated,
   and other tunnel routers is allowed for maximum flexibility. See
   Section 8 for more details.

4.1.  Packet Flow Sequence

   This section provides an example of the unicast packet flow with the
   following conditions:

   o  Source host "host1.abc.com" is sending a packet to
      "host2.xyz.com", exactly what host1 would do if the site were not
      using LISP.

   o  Each site is multi-homed, so each tunnel router has an address
      (RLOC) assigned from the service provider address block for each
      provider to which that particular tunnel router is attached.
555     o  The ITR(s) and ETR(s) are directly connected to the source and
556        destination, respectively, but the source and destination can be
557        located anywhere in the LISP site.

559     o  Map-Requests can be sent on the underlying routing system topology
560        (LISP 1.0) or over an alternative topology [ALT].

562     o  Map-Replies are sent on the underlying routing system topology.

564     Client host1.abc.com wants to communicate with server host2.xyz.com:

566     1.  host1.abc.com wants to open a TCP connection to host2.xyz.com.
567         It does a DNS lookup on host2.xyz.com. An A/AAAA record is
568         returned. This address is the destination EID. The locally-
569         assigned address of host1.abc.com is used as the source EID. An
570         IPv4 or IPv6 packet is built and forwarded through the LISP site
571         as a normal IP packet until it reaches a LISP ITR.

573     2.  The LISP ITR must be able to map the EID destination to an RLOC
574         of one of the ETRs at the destination site. The specific method
575         used to do this is not described in this example. See [ALT] or
576         [CONS] for possible solutions.

578     3.  The ITR will send a LISP Map-Request. Map-Requests SHOULD be
579         rate-limited.

581     4.  In LISP 1.0, the Map-Request packet is routed through the
582         underlying routing system. In LISP 1.5, the Map-Request packet
583         is routed on an alternate logical topology. In either case, when
584         the Map-Request arrives at one of the ETRs at the destination
585         site, that ETR will process the packet as a control message.

587     5.  The ETR looks at the destination EID of the Map-Request and
588         matches it against the prefixes in the ETR's configured EID-to-
589         RLOC mapping database. This is the list of EID-prefixes the ETR
590         is supporting for the site it resides in. If there is no match,
591         the Map-Request is dropped. Otherwise, a LISP Map-Reply is
592         returned to the ITR.

594     6.  The ITR receives the Map-Reply message, parses the message (to
595         check for format validity) and stores the mapping information
596         from the packet. This information is put in the ITR's EID-to-
597         RLOC mapping cache (this is the on-demand cache, the cache where
598         entries time out due to inactivity).

600     7.  Subsequent packets from host1.abc.com to host2.xyz.com will have
601         a LISP header prepended by the ITR using the appropriate RLOC as
602         the LISP header destination address learned from the ETR. Note,
603         the packet may be sent to a different ETR than the one which
604         returned the Map-Reply due to the source site's hashing policy or
605         the destination site's locator-set policy.

607     8.  The ETR receives these packets directly (since the destination
608         address is one of its assigned IP addresses), strips the LISP
609         header and forwards the packets to the attached destination host.

611     In order to eliminate the need for a mapping lookup in the reverse
612     direction, an ETR MAY create a cache entry that maps the source EID
613     (inner header source IP address) to the source RLOC (outer header
614     source IP address) in a received LISP packet. Such a cache entry is
615     termed a "gleaned" mapping and only contains a single RLOC for the
616     EID in question. More complete information about additional RLOCs
617     SHOULD be verified by sending a LISP Map-Request for that EID. Both
618     the ITR and the ETR may also influence the decision the other makes in
619     selecting an RLOC. See Section 6 for more details.

621  5. Tunneling Details

623     This section describes the LISP Data Message which defines the
624     tunneling header used to encapsulate IPv4 and IPv6 packets which
625     contain EID addresses. Even though the following formats illustrate
626     IPv4-in-IPv4 and IPv6-in-IPv6 encapsulations, the other 2
627     combinations are supported as well.
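As a rough illustration of the four encapsulation combinations, the sketch below computes how many bytes an ITR prepends for each outer address family. It assumes an outer IPv4 header without options (20 bytes), the fixed IPv6 header without extension headers (40 bytes), the 8-byte UDP header, and an 8-byte LISP header (two 32-bit words). The constant and function names are hypothetical and are not defined by this specification.

```python
# Illustrative sketch only; names are hypothetical, not defined by LISP.
IPV4_HEADER = 20   # outer IPv4 header, assuming no IP options
IPV6_HEADER = 40   # fixed IPv6 header, assuming no extension headers
UDP_HEADER = 8
LISP_HEADER = 8    # two 32-bit words


def encap_overhead(outer_family: str) -> int:
    """Return the bytes an ITR prepends for the given outer address family."""
    outer = {"ipv4": IPV4_HEADER, "ipv6": IPV6_HEADER}[outer_family]
    return outer + UDP_HEADER + LISP_HEADER


def encapsulated_size(inner_packet_len: int, outer_family: str) -> int:
    """Size of an inner packet (of either family) after encapsulation."""
    return inner_packet_len + encap_overhead(outer_family)


if __name__ == "__main__":
    # The overhead depends only on the outer family, not on the inner one.
    for inner in ("ipv4", "ipv6"):
        for outer in ("ipv4", "ipv6"):
            print(f"{inner}-in-{outer}: +{encap_overhead(outer)} bytes")
```

Under these assumptions the inner address family does not change the overhead, which is why all four combinations can share one encapsulation path.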
629     Since additional tunnel headers are prepended, the packet becomes
630     larger and in theory can exceed the MTU of any link traversed from
631     the ITR to the ETR. It is recommended in IPv4 that packets do not
632     get fragmented as they are encapsulated by the ITR. Instead, the
633     packet is dropped and an ICMP Too Big message is returned to the
634     source.

636     Based on informal surveys of large ISP traffic patterns, it appears
637     that most transit paths can accommodate a path MTU of at least 4470
638     bytes. The exceptions, in terms of data rate, number of hosts
639     affected, or any other metric, are expected to be vanishingly small.

641     To address MTU concerns, mainly raised on the RRG mailing list, the
642     LISP deployment process will include collecting data during its pilot
643     phase to either verify or refute the assumption about minimum
644     available MTU. If the assumption proves true and transit networks
645     with links limited to 1500 byte MTUs are corner cases, it would seem
646     more cost-effective to either upgrade or modify the equipment in
647     those transit networks to support larger MTUs or to use existing
648     mechanisms for accommodating packets that are too large.

650     For this reason, there is currently no plan for LISP to add any new,
651     complex mechanism for implementing fragmentation and
652     reassembly in the face of limited-MTU transit links. If analysis
653     during LISP pilot deployment reveals that the assumption of
654     essentially ubiquitous 4470+ byte transit path MTUs is incorrect,
655     then LISP can be modified prior to protocol standardization to add
656     support for one of the proposed fragmentation and reassembly schemes.
657     Note that two simple existing schemes are detailed in Section 5.4.

659  5.1. LISP IPv4-in-IPv4 Header Format

661          0                   1                   2                   3
662          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
663         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
664       / |Version|  IHL  |Type of Service|          Total Length         |
665      /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
666     |   |         Identification        |Flags|      Fragment Offset    |
667     |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
668     OH  |  Time to Live | Protocol = 17 |         Header Checksum       |
669     |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
670     |   |                    Source Routing Locator                     |
671      \  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
672       \ |                 Destination Routing Locator                   |
673         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
674       / |       Source Port = xxxx      |       Dest Port = 4341        |
675     UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
676       \ |           UDP Length          |        UDP Checksum           |
677         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
678     L   |N|L|E|V|I|flags|            Nonce/Map-Version                  |
679     I \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
680     S / |                 Instance ID/Locator Status Bits               |
681     P   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
682       / |Version|  IHL  |Type of Service|          Total Length         |
683      /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
684     |   |         Identification        |Flags|      Fragment Offset    |
685     |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
686     IH  |  Time to Live |    Protocol   |         Header Checksum       |
687     |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
688     |   |                           Source EID                          |
689      \  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
690       \ |                         Destination EID                       |
691         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

693  5.2. LISP IPv6-in-IPv6 Header Format

695          0                   1                   2                   3
696          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
697         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
698       / |Version| Traffic Class |           Flow Label                  |
699      /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
700     |   |         Payload Length        | Next Header=17|   Hop Limit   |
701     v   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
702         |                                                               |
703     O   +                                                               +
704     u   |                                                               |
705     t   +                     Source Routing Locator                    +
706     e   |                                                               |
707     r   +                                                               +
708         |                                                               |
709     H   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
710     d   |                                                               |
711     r   +                                                               +
712         |                                                               |
713     ^   +                  Destination Routing Locator                  +
714     |   |                                                               |
715      \  +                                                               +
716       \ |                                                               |
717         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
718       / |       Source Port = xxxx      |       Dest Port = 4341        |
719     UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
720       \ |           UDP Length          |        UDP Checksum           |
721         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
722     L   |N|L|E|V|I|flags|            Nonce/Map-Version                  |
723     I \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
724     S / |                 Instance ID/Locator Status Bits               |
725     P   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
726       / |Version| Traffic Class |           Flow Label                  |
727      /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
728     /   |         Payload Length        |  Next Header  |   Hop Limit   |
729     v   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
730         |                                                               |
731     I   +                                                               +
732     n   |                                                               |
733     n   +                          Source EID                           +
734     e   |                                                               |
735     r   +                                                               +
736         |                                                               |
737     H   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
738     d   |                                                               |
739     r   +                                                               +
740         |                                                               |
741     ^   +                        Destination EID                        +
742     \   |                                                               |
743      \  +                                                               +
744       \ |                                                               |
745         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

747  5.3. Tunnel Header Field Descriptions

749     Inner Header: is the inner header, preserved from the datagram
750        received from the originating host. The source and destination IP
751        addresses are EIDs.
753     Outer Header: is the outer header prepended by an ITR. The address
754        fields contain RLOCs obtained from the ingress router's EID-to-
755        RLOC cache. The IP protocol number is "UDP (17)" from [RFC0768].
756        The DF bit of the Flags field is set to 0 when the method in
757        Section 5.4.1 is used and set to 1 when the method in
758        Section 5.4.2 is used.

760     UDP Header: contains an ITR-selected source port when encapsulating a
761        packet. See Section 6.4 for details on the hash algorithm used to
762        select a source port based on the 5-tuple of the inner header.
763        The destination port MUST be set to the well-known IANA assigned
764        port value 4341.

766     UDP Checksum: this field SHOULD be transmitted as zero by an ITR for
767        either IPv4 [RFC0768] or IPv6 encapsulation [UDP-TUNNELS]. When a
768        packet with a zero UDP checksum is received by an ETR, the ETR
769        MUST accept the packet for decapsulation. When an ITR transmits a
770        non-zero value for the UDP checksum, it MUST send a correctly
771        computed value in this field. When an ETR receives a packet with
772        a non-zero UDP checksum, it MAY choose to verify the checksum
773        value. If it chooses to perform such verification, and the
774        verification fails, the packet MUST be silently dropped. If the
775        ETR chooses not to perform the verification, or performs the
776        verification successfully, the packet MUST be accepted for
777        decapsulation. The handling of UDP checksums for all tunneling
778        protocols, including LISP, is under active discussion within the
779        IETF. When that discussion concludes, any necessary changes will
780        be made to align LISP with the outcome of the broader discussion.

782     UDP Length: for an IPv4 encapsulated packet, the inner header Total
783        Length plus the UDP and LISP header lengths are used. For an IPv6
784        encapsulated packet, the inner header Payload Length plus the size
785        of the IPv6 header (40 bytes) plus the size of the UDP and LISP
786        headers are used. The UDP header length is 8 bytes.

788     N: this is the nonce-present bit. When this bit is set to 1, the
789        low-order 24-bits of the first 32-bits of the LISP header contain
790        a Nonce. See Section 6.3.1 for details. Both N and V bits MUST
791        NOT be set in the same packet. If they are, a decapsulating ETR
792        MUST treat the "Nonce/Map-Version" field as having a Nonce value
793        present.

795     L: this is the Locator-Status-Bits field enabled bit. When this bit
796        is set to 1, the Locator-Status-Bits in the second 32-bits of the
797        LISP header are in use.

799         x 1 x x 0 x x x
800        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
801        |N|L|E|V|I|flags|            Nonce/Map-Version                  |
802        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
803        |                      Locator Status Bits                      |
804        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

806     E: this is the echo-nonce-request bit. When this bit is set to 1,
807        the N bit MUST be 1. This bit SHOULD be ignored and has no
808        meaning when the N bit is set to 0. See Section 6.3.1 for
809        details.

811     V: this is the Map-Version present bit. When this bit is set to 1,
812        the N bit MUST be 0. Refer to Section 6.5.3 for more details.
813        This bit indicates that the first 4 bytes of the LISP header are
814        encoded as:

816         0 x 0 1 x x x x
817        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
818        |N|L|E|V|I|flags|  Source Map-Version   |   Dest Map-Version    |
819        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
820        |                 Instance ID/Locator Status Bits               |
821        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

823     I: this is the Instance ID bit. See Section 5.5 for more details.
824        When this bit is set to 1, the Locator Status Bits field is
825        reduced to 8-bits and the high-order 24-bits are used as an
826        Instance ID. If the L-bit is set to 0, then the low-order 8 bits
827        are transmitted as zero and ignored on receipt. The format of the
828        last 4 bytes of the LISP header would look like:

830         x x x x 1 x x x
831        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
832        |N|L|E|V|I|flags|            Nonce/Map-Version                  |
833        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
834        |                 Instance ID                   |     LSBs      |
835        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

837     flags: this 3-bit field is reserved for future flag use. It is set
838        to 0 on transmit and ignored on receipt.

840     LISP Nonce: is a 24-bit value that is randomly generated by an ITR
841        when the N-bit is set to 1. The nonce is also used when the E-bit
842        is set to request the nonce value to be echoed by the other side
843        when packets are returned. When the E-bit is clear but the N-bit
844        is set, a remote ITR is either echoing a previously requested
845        echo-nonce or providing a random nonce. See Section 6.3.1 for
846        more details.

848     LISP Locator Status Bits: in the LISP header are set by an ITR to
849        indicate to an ETR the up/down status of the Locators in the
850        source site. Each RLOC in a Map-Reply is assigned an ordinal
851        value from 0 to n-1 (when there are n RLOCs in a mapping entry).
852        The Locator Status Bits are numbered from 0 to n-1 from the least
853        significant bit of the field. The field is 32-bits when the I-bit is
854        set to 0 and is 8 bits when the I-bit is set to 1. When a Locator
855        Status Bit is set to 1, the ITR is indicating to the ETR the RLOC
856        associated with the bit ordinal has up status. See Section 6.3
857        for details on how an ITR can determine the status of other ITRs
858        at the same site. When a site has multiple EID-prefixes which
859        result in multiple mappings (where each could have a different
860        locator-set), the Locator Status Bits setting in an encapsulated
861        packet MUST reflect the mapping for the EID-prefix that the inner-
862        header source EID address matches.
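The field descriptions above can be sketched as a small encoder and decoder for the 8-byte LISP header. This is a minimal illustration rather than a reference implementation: the class and function names (LispHeader, pack_header, unpack_header, locator_is_up, map_versions) are hypothetical, and the choice to raise an error when a sender sets the E bit without the N bit is an assumption of this sketch.

```python
import struct
from dataclasses import dataclass

# Flag positions in the first octet of the LISP header (N is the MSB).
N_BIT, L_BIT, E_BIT, V_BIT, I_BIT = 0x80, 0x40, 0x20, 0x10, 0x08


@dataclass
class LispHeader:
    n: bool = False              # nonce-present
    l: bool = False              # Locator-Status-Bits enabled
    e: bool = False              # echo-nonce-request
    v: bool = False              # Map-Version present
    i: bool = False              # Instance ID present
    nonce_or_versions: int = 0   # 24-bit Nonce/Map-Version field
    instance_id: int = 0         # 24 bits, only carried when i is set
    lsbs: int = 0                # 8 bits when i is set, else 32 bits


def pack_header(h: LispHeader) -> bytes:
    """Encode the 8-byte header; the 3 reserved flag bits are sent as 0."""
    if h.n and h.v:
        raise ValueError("N and V bits MUST NOT both be set")
    if h.e and not h.n:
        raise ValueError("E bit requires the N bit")
    flags = ((N_BIT if h.n else 0) | (L_BIT if h.l else 0) |
             (E_BIT if h.e else 0) | (V_BIT if h.v else 0) |
             (I_BIT if h.i else 0))
    word1 = (flags << 24) | (h.nonce_or_versions & 0xFFFFFF)
    lsbs = h.lsbs if h.l else 0          # sent as zero when L is 0
    if h.i:
        word2 = ((h.instance_id & 0xFFFFFF) << 8) | (lsbs & 0xFF)
    else:
        word2 = lsbs & 0xFFFFFFFF
    return struct.pack("!II", word1, word2)


def unpack_header(data: bytes) -> LispHeader:
    """Decode the first 8 bytes of a LISP-encapsulated UDP payload."""
    word1, word2 = struct.unpack("!II", data[:8])
    flags = word1 >> 24                  # reserved flag bits are ignored
    n = bool(flags & N_BIT)
    l = bool(flags & L_BIT)
    i = bool(flags & I_BIT)
    v = bool(flags & V_BIT) and not n    # N and V both set: treat as Nonce
    e = bool(flags & E_BIT) and n        # E has no meaning when N is 0
    instance_id = word2 >> 8 if i else 0
    lsbs = (word2 & 0xFF if i else word2) if l else 0
    return LispHeader(n, l, e, v, i, word1 & 0xFFFFFF, instance_id, lsbs)


def locator_is_up(h: LispHeader, ordinal: int) -> bool:
    """Locator Status Bits are numbered from the least significant bit."""
    return bool((h.lsbs >> ordinal) & 1)


def map_versions(h: LispHeader) -> tuple:
    """Split the 24-bit field into (source, dest) 12-bit Map-Versions."""
    return (h.nonce_or_versions >> 12) & 0xFFF, h.nonce_or_versions & 0xFFF
```

For example, a header with N, L, and I set, nonce 0xABCDEF, Instance ID 0x123, and locators 0 and 2 marked up encodes to the bytes c8 ab cd ef 00 01 23 05.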
864     When doing Recursive Tunneling or ITR/PTR encapsulation:

866     o  The outer header Time to Live field (or Hop Limit field, in case
867        of IPv6) SHOULD be copied from the inner header Time to Live
868        field.

870     o  The outer header Type of Service field (or the Traffic Class
871        field, in the case of IPv6) SHOULD be copied from the inner header
872        Type of Service field (with one caveat, see below).

874     When doing Re-encapsulated Tunneling:

876     o  The new outer header Time to Live field SHOULD be copied from the
877        stripped outer header Time to Live field.

879     o  The new outer header Type of Service field SHOULD be copied from
880        the stripped outer header Type of Service field (with one caveat, see
881        below).

883     Copying the TTL serves two purposes: first, it preserves the distance
884     the host intended the packet to travel; second, and more importantly,
885     it provides for suppression of looping packets in the event there is
886     a loop of concatenated tunnels due to misconfiguration.

888     The ECN field occupies bits 6 and 7 of both the IPv4 Type of Service
889     field and the IPv6 Traffic Class field [RFC3168]. The ECN field
890     requires special treatment in order to avoid discarding indications
891     of congestion [RFC3168]. ITR encapsulation MUST copy the 2-bit ECN
892     field from the inner header to the outer header. Re-encapsulation
893     MUST copy the 2-bit ECN field from the stripped outer header to the
894     new outer header. If the ECN field contains a congestion indication
895     codepoint (the value is '11', the Congestion Experienced (CE)
896     codepoint), then ETR decapsulation MUST copy the 2-bit ECN field from
897     the stripped outer header to the surviving inner header that is used
898     to forward the packet beyond the ETR. These requirements preserve
899     Congestion Experienced (CE) indications when a packet that uses ECN
900     traverses a LISP tunnel and becomes marked with a CE indication due
901     to congestion between the tunnel endpoints.

903  5.4. Dealing with Large Encapsulated Packets

905     In the event that the MTU issues mentioned above prove to be more
906     serious than expected, this section proposes 2 simple mechanisms to
907     deal with large packets. One is stateless using IP fragmentation and
908     the other is stateful using Path MTU Discovery [RFC1191].

910     It is left to the implementor to decide if the stateless or stateful
911     mechanism SHOULD be implemented. Both or neither can be decided as
912     well since it is a local decision in the ITR regarding how to deal
913     with MTU issues. Sites can interoperate with differing mechanisms.

915     Both stateless and stateful mechanisms also apply to Reencapsulating
916     and Recursive Tunneling. So any actions below referring to an ITR
917     also apply to a TE-ITR.

919  5.4.1. A Stateless Solution to MTU Handling

921     An ITR stateless solution to handle MTU issues is described as
922     follows:

924     1.  Define an architectural constant S for the maximum size of a
925         packet, in bytes, an ITR would receive from a source inside of
926         its site.

928     2.  Define L to be the maximum size, in bytes, a packet of size S
929         would be after the ITR prepends the LISP header, UDP header, and
930         outer network layer header of size H.

932     3.  Calculate: S + H = L.

934     When an ITR receives a packet from a site-facing interface and adds H
935     bytes worth of encapsulation to yield a packet size greater than L
936     bytes, it resolves the MTU issue by first splitting the original
937     packet into 2 equal-sized fragments. A LISP header is then prepended
938     to each fragment. This will ensure that the new, encapsulated
939     packets are of size (S/2 + H), which is always below the effective
940     tunnel MTU.

942     When an ETR receives encapsulated fragments, it treats them as two
943     individually encapsulated packets. It strips the LISP headers then
944     forwards each fragment to the destination host of the destination
945     site. The two fragments are reassembled at the destination host into
946     the single IP datagram that was originated by the source host.

948     This behavior is performed by the ITR when the source host originates
949     a packet with the DF field of the IP header set to 0. When the DF
950     field of the IP header is set to 1, or the packet is an IPv6 packet
951     originated by the source host, the ITR will drop the packet when the
952     size is greater than L, and send an ICMP Too Big message to the
953     source with a value of S, where S is (L - H).

955     When the outer header encapsulation uses an IPv4 header, an
956     implementation SHOULD set the DF bit to 1 so ETR fragment reassembly
957     can be avoided. An implementation MAY set the DF bit in such headers
958     to 0 if it has good reason to believe there are unresolvable path MTU
959     issues between the sending ITR and the receiving ETR.

961     This specification recommends that L be defined as 1500.

963  5.4.2. A Stateful Solution to MTU Handling

965     An ITR stateful solution to handle MTU issues is described as follows
966     and was first introduced in [OPENLISP]:

968     1.  The ITR will keep state of the effective MTU for each locator per
969         mapping cache entry. The effective MTU is what the core network
970         can deliver along the path between ITR and ETR.

972     2.  When an IPv6 encapsulated packet or an IPv4 encapsulated packet
973         with DF bit set to 1, exceeds what the core network can deliver,
974         one of the intermediate routers on the path will send an ICMP Too
975         Big message to the ITR. The ITR will parse the ICMP message to
976         determine which locator is affected by the effective MTU change
977         and then record the new effective MTU value in the mapping cache
978         entry.

980     3.  When a packet is received by the ITR from a source inside of the
981         site and the size of the packet is greater than the effective MTU
982         stored with the mapping cache entry associated with the
983         destination EID the packet is for, the ITR will send an ICMP Too
984         Big message back to the source. The packet size advertised by
985         the ITR in the ICMP Too Big message is the effective MTU minus
986         the LISP encapsulation length.

988     Even though this mechanism is stateful, it has advantages over the
989     stateless IP fragmentation mechanism, by not involving the
990     destination host with reassembly of ITR fragmented packets.

992  5.5. Using Virtualization and Segmentation with LISP

994     When multiple organizations inside of a LISP site are using private
995     addresses [RFC1918] as EID-prefixes, their address spaces MUST remain
996     segregated due to possible address duplication. An Instance ID in
997     the address encoding can aid in making the entire AFI based address
998     unique. See [LCAF] for details for a possible address encoding. The
999     LCAF encoding is an area for further study.

1001     An Instance ID can be carried in a LISP encapsulated packet. An ITR
1002     that prepends a LISP header will copy a 24-bit value used by the
1003     LISP router to uniquely identify the address space. The value is
1004     copied to the Instance ID field of the LISP header and the I-bit is
1005     set to 1.

1007     When an ETR decapsulates a packet, the Instance ID from the LISP
1008     header is used as a table identifier to locate the forwarding table
1009     to use for the inner destination EID lookup.

1011     Some examples of the 24-bit value to copy or map into the Instance ID
1012     field could be an 802.1Q VLAN tag or a configured VRF-ID value.

1014  6. EID-to-RLOC Mapping

1016  6.1.
LISP IPv4 and IPv6 Control Plane Packet Formats

1018     The following new UDP packet types are used to retrieve EID-to-RLOC
1019     mappings:

1021          0                   1                   2                   3
1022          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1023         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1024         |Version|  IHL  |Type of Service|          Total Length         |
1025         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1026         |         Identification        |Flags|      Fragment Offset    |
1027         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1028         |  Time to Live | Protocol = 17 |         Header Checksum       |
1029         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1030         |                    Source Routing Locator                     |
1031         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1032         |                 Destination Routing Locator                   |
1033         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1034       / |           Source Port         |         Dest Port             |
1035     UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1036       \ |           UDP Length          |        UDP Checksum           |
1037         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1038         |                                                               |
1039         |                         LISP Message                          |
1040         |                                                               |
1041         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1043          0                   1                   2                   3
1044          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1045         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1046         |Version| Traffic Class |           Flow Label                  |
1047         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1048         |         Payload Length        | Next Header=17|   Hop Limit   |
1049         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1050         |                                                               |
1051         +                                                               +
1052         |                                                               |
1053         +                     Source Routing Locator                    +
1054         |                                                               |
1055         +                                                               +
1056         |                                                               |
1057         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1058         |                                                               |
1059         +                                                               +
1060         |                                                               |
1061         +                  Destination Routing Locator                  +
1062         |                                                               |
1063         +                                                               +
1064         |                                                               |
1065         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1066       / |           Source Port         |         Dest Port             |
1067     UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1068       \ |           UDP Length          |        UDP Checksum           |
1069         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1070         |                                                               |
1071         |                         LISP Message                          |
1072         |                                                               |
1073         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1075     The LISP UDP-based messages are the Map-Request and Map-Reply
1076     messages. When a UDP Map-Request is sent, the UDP source port is
1077     chosen by the sender and the destination UDP port number is set to
1078     4342. When a UDP Map-Reply is sent, the source UDP port number is
1079     set to 4342 and the destination UDP port number is copied from the
1080     source port of either the Map-Request or the invoking data packet.

1082     The UDP Length field will reflect the length of the UDP header and
1083     the LISP Message payload.

1085     The UDP Checksum is computed and set to non-zero for Map-Request,
1086     Map-Reply, Map-Register and ECM control messages. It MUST be checked
1087     on receipt and if the checksum fails, the packet MUST be dropped.

1089     LISP-CONS [CONS] uses TCP to send LISP control messages. The format
1090     of control messages includes the UDP header so the checksum and
1091     length fields can be used to protect and delimit message boundaries.

1093     This main LISP specification is the authoritative source for message
1094     format definitions for the Map-Request and Map-Reply messages.

1096  6.1.1. LISP Packet Type Allocations

1098     This section will be the authoritative source for allocating LISP
1099     Type values. Current allocations are:

1101        Reserved:                          0    b'0000'
1102        LISP Map-Request:                  1    b'0001'
1103        LISP Map-Reply:                    2    b'0010'
1104        LISP Map-Register:                 3    b'0011'
1105        LISP Encapsulated Control Message: 8    b'1000'

1107  6.1.2. Map-Request Message Format

1109          0                   1                   2                   3
1110          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1111         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1112         |Type=1 |A|M|P|S|      Reserved       |   IRC   | Record Count  |
1113         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1114         |                         Nonce . . .                           |
1115         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1116         |                         . . . Nonce                           |
1117         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1118         |         Source-EID-AFI        |   Source EID Address  ...     |
1119         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1120         |         ITR-RLOC-AFI 1        |    ITR-RLOC Address 1  ...    |
1121         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1122         |                              ...                              |
1123         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1124         |         ITR-RLOC-AFI n        |    ITR-RLOC Address n  ...    |
1125         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1126       / |   Reserved    | EID mask-len  |        EID-prefix-AFI         |
1127     Rec +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1128       \ |                       EID-prefix  ...                         |
1129         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1130         |                   Map-Reply Record  ...                       |
1131         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1132         |                     Mapping Protocol Data                     |
1133         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1135     Packet field descriptions:

1137     Type: 1 (Map-Request)

1139     A: This is an authoritative bit, which is set to 0 for UDP-based Map-
1140        Requests sent by an ITR.

1142     M: When set, it indicates a Map-Reply Record segment is included in
1143        the Map-Request.

1145     P: This is the probe-bit which indicates that a Map-Request SHOULD be
1146        treated as a locator reachability probe. The receiver SHOULD
1147        respond with a Map-Reply with the probe-bit set, indicating the
1148        Map-Reply is a locator reachability probe reply, with the nonce
1149        copied from the Map-Request. See Section 6.3.2 for more details.

1151     S: This is the SMR bit. See Section 6.5.2 for details.

1153     Reserved: Set to 0 on transmission and ignored on receipt.

1155     IRC: This 5-bit field is the ITR-RLOC Count which encodes the number
1156        of (ITR-RLOC-AFI, ITR-RLOC Address) fields present in this
1157        message. Multiple ITR-RLOC Address fields are used so a Map-
1158        Replier can select which destination address to use for a Map-
1159        Reply. The IRC value ranges from 0 to 31, where IRC value 0 means
1160        an ITR-RLOC address count of 1, an IRC value of 1 means an ITR-
1161        RLOC address count of 2, and so on up to an IRC value of 31, which
1162        means an ITR-RLOC address count of 32.

1164     Record Count: The number of records in this Map-Request message. A
1165        record is comprised of the portion of the packet that is labeled
1166        'Rec' above and occurs the number of times equal to Record Count.
1167        For this version of the protocol, a receiver MUST accept and
1168        process Map-Requests that contain one or more records, but a
1169        sender MUST only send Map-Requests containing one record. Support
1170        for requesting multiple EIDs in a single Map-Request message will
1171        be specified in a future version of the protocol.

1173     Nonce: An 8-byte random value created by the sender of the Map-
1174        Request. This nonce will be returned in the Map-Reply. The
1175        security of the LISP mapping protocol depends critically on the
1176        strength of the nonce in the Map-Request message. The nonce
1177        SHOULD be generated by a properly seeded pseudo-random (or strong
1178        random) source. See [RFC4086] for advice on generating security-
1179        sensitive random data.

1181     Source-EID-AFI: Address family of the "Source EID Address" field.

1183     Source EID Address: This is the EID of the source host which
1184        originated the packet which is invoking this Map-Request. When
1185        Map-Requests are used for refreshing a map-cache entry or for
1186        RLOC-probing, the value 0 is used.
1188 ITR-RLOC-AFI: Address family of the "ITR-RLOC Address" field that 1189 follows this field. 1191 ITR-RLOC Address: Used to give the ETR the option of selecting the 1192 destination address from any address family for the Map-Reply 1193 message. This address MUST be a routable RLOC address. 1195 EID mask-len: Mask length for EID prefix. 1197 EID-prefix-AFI: Address family of EID-prefix according to [RFC5226] 1199 EID-prefix: 4 bytes if an IPv4 address-family, 16 bytes if an IPv6 1200 address-family. When a Map-Request is sent by an ITR because a 1201 data packet is received for a destination where there is no 1202 mapping entry, the EID-prefix is set to the destination IP address 1203 of the data packet. And the 'EID mask-len' is set to 32 or 128 1204 for IPv4 or IPv6, respectively. When an xTR wants to query a site 1205 about the status of a mapping it already has cached, the EID- 1206 prefix used in the Map-Request has the same mask-length as the 1207 EID-prefix returned from the site when it sent a Map-Reply 1208 message. 1210 Map-Reply Record: When the M bit is set, this field is the size of 1211 the "Record" field in the Map-Reply format. This Map-Reply record 1212 contains the EID-to-RLOC mapping entry associated with the Source 1213 EID. This allows the ETR which will receive this Map-Request to 1214 cache the data if it chooses to do so. 1216 Mapping Protocol Data: See [CONS] or [ALT] for details. This field 1217 is optional and present when the UDP length indicates there is 1218 enough space in the packet to include it. 1220 6.1.3. EID-to-RLOC UDP Map-Request Message 1222 A Map-Request is sent from an ITR when it needs a mapping for an EID, 1223 wants to test an RLOC for reachability, or wants to refresh a mapping 1224 before TTL expiration. For the initial case, the destination IP 1225 address used for the Map-Request is the destination-EID from the 1226 packet which had a mapping cache lookup failure. 
For the latter 2 1227 cases, the destination IP address used for the Map-Request is one of 1228 the RLOC addresses from the locator-set of the map cache entry. The 1229 source address is either an IPv4 or IPv6 RLOC address depending if 1230 the Map-Request is using an IPv4 versus IPv6 header, respectively. 1232 In all cases, the UDP source port number for the Map-Request message 1233 is a randomly allocated 16-bit value and the UDP destination port 1234 number is set to the well-known destination port number 4342. A 1235 successful Map-Reply updates the cached set of RLOCs associated with 1236 the EID prefix range. 1238 One or more Map-Request (ITR-RLOC-AFI, ITR-RLOC-Address) fields MUST 1239 be filled in by the ITR. The number of fields (minus 1) encoded MUST 1240 be placed in the IRC field. The ITR MAY include all locally 1241 configured locators in this list or just provide one locator address 1242 from each address family it supports. If the ITR erroneously 1243 provides no ITR-RLOC addresses, the Map-Replier MUST drop the Map- 1244 Request. 1246 Map-Requests can also be LISP encapsulated using UDP destination port 1247 4342 with a LISP type value set to "Encapsulated Control Message", 1248 when sent from an ITR to a Map-Resolver. Likewise, Map-Requests are 1249 LISP encapsulated the same way from a Map-Server to an ETR. Details 1250 on encapsulated Map-Requests and Map-Resolvers can be found in 1251 [LISP-MS]. 1253 Map-Requests MUST be rate-limited. It is recommended that a Map- 1254 Request for the same EID-prefix be sent no more than once per second. 1256 An ITR that is configured with mapping database information (i.e. it 1257 is also an ETR) may optionally include those mappings in a Map- 1258 Request. When an ETR configured to accept and verify such 1259 "piggybacked" mapping data receives such a Map-Request and it does 1260 not have this mapping in the map-cache, it may originate a "verifying 1261 Map-Request", addressed to the map-requesting ITR. 
If the ETR has a 1262 map-cache entry that matches the "piggybacked" EID and the RLOC is in 1263 the locator-set for the entry, then it may send the "verifying Map- 1264 Request" directly to the originating Map-Request source. If the RLOC 1265 is not in the locator-set, then the ETR MUST send the "verifying Map- 1266 Request" to the "piggybacked" EID. Doing this forces the "verifying 1267 Map-Request" to go through the mapping database system to reach the 1268 authoritative source of information about that EID, guarding against 1269 RLOC-spoofing in the "piggybacked" mapping data. 1271 6.1.4. Map-Reply Message Format 1273 0 1 2 3 1274 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1275 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1276 |Type=2 |P|E| Reserved | Record Count | 1277 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1278 | Nonce . . . | 1279 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1280 | . . .
Nonce | 1281 +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1282 | | Record TTL | 1283 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1284 R | Locator Count | EID mask-len | ACT |A| Reserved | 1285 e +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1286 c | Rsvd | Map-Version Number | EID-AFI | 1287 o +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1288 r | EID-prefix | 1289 d +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1290 | /| Priority | Weight | M Priority | M Weight | 1291 | L +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1292 | o | Unused Flags |L|p|R| Loc-AFI | 1293 | c +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1294 | \| Locator | 1295 +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1296 | Mapping Protocol Data | 1297 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1299 Packet field descriptions: 1301 Type: 2 (Map-Reply) 1303 P: This is the probe-bit which indicates that the Map-Reply is in 1304 response to a locator reachability probe Map-Request. The nonce 1305 field MUST contain a copy of the nonce value from the original 1306 Map-Request. See Section 6.3.2 for more details. 1308 E: Indicates that the ETR which sends this Map-Reply message is 1309 advertising that the site is enabled for the Echo-Nonce locator 1310 reachability algorithm. See Section 6.3.1 for more details. 1312 Reserved: Set to 0 on transmission and ignored on receipt. 1314 Record Count: The number of records in this reply message. A record 1315 is comprised of that portion of the packet labeled 'Record' above 1316 and occurs the number of times equal to Record count. 1318 Nonce: A 24-bit value set in a Data-Probe packet or a 64-bit value 1319 from the Map-Request is echoed in this Nonce field of the Map- 1320 Reply. 
1322 Record TTL: The time in minutes the recipient of the Map-Reply will 1323 store the mapping. If the TTL is 0, the entry SHOULD be removed 1324 from the cache immediately. If the value is 0xffffffff, the 1325 recipient can decide locally how long to store the mapping. 1327 Locator Count: The number of Locator entries. A locator entry 1328 comprises what is labeled above as 'Loc'. The locator count can 1329 be 0 indicating there are no locators for the EID-prefix. 1331 EID mask-len: Mask length for EID prefix. 1333 ACT: This 3-bit field describes negative Map-Reply actions. These 1334 bits are used only when the 'Locator Count' field is set to 0. 1335 The action bits are encoded only in Map-Reply messages. The 1336 actions defined are used by an ITR or PTR when a destination EID 1337 matches a negative mapping cache entry. Unassigned values should 1338 cause a map-cache entry to be created and, when packets match this 1339 negative cache entry, they will be dropped. The current assigned 1340 values are: 1342 (0) Drop: The packet is dropped silently. 1344 (1) Natively-Forward: The packet is not encapsulated or dropped 1345 but natively forwarded. 1347 (2) Send-Map-Request: The packet invokes sending a Map-Request. 1349 A: The Authoritative bit, when sent by a UDP-based message is always 1350 set to 1 by an ETR. See [CONS] for TCP-based Map-Replies. When a 1351 Map-Server is proxy Map-Replying [LISP-MS] for a LISP site, the 1352 Authoritative bit is set to 0. This indicates to requesting ITRs 1353 that the Map-Reply was not originated by a LISP node managed at 1354 the site that owns the EID-prefix. 1356 Map-Version Number: When this 12-bit value is non-zero the Map-Reply 1357 sender is informing the ITR what the version number is for the 1358 EID-record contained in the Map-Reply. The ETR can allocate this 1359 number internally but MUST coordinate this value with other ETRs 1360 for the site. When this value is 0, there is no versioning 1361 information conveyed. 
The Map-Version Number can be included in 1362 Map-Request and Map-Register messages. See Section 6.5.3 for more 1363 details. 1365 EID-AFI: Address family of EID-prefix according to [RFC5226]. 1367 EID-prefix: 4 bytes if an IPv4 address-family, 16 bytes if an IPv6 1368 address-family. 1370 Priority: each RLOC is assigned a unicast priority. Lower values 1371 are preferred. When multiple RLOCs have the same priority, 1372 they may be used in a load-split fashion. A value of 255 means 1373 the RLOC MUST NOT be used for unicast forwarding. 1375 Weight: when priorities are the same for multiple RLOCs, the weight 1376 indicates how to balance unicast traffic between them. Weight is 1377 encoded as a percentage of total unicast packets that match the 1378 mapping entry. If a non-zero weight value is used for any RLOC, 1379 then all RLOCs MUST use a non-zero weight value and the sum 1380 of all weight values MUST equal 100. If a zero value is used for 1381 any RLOC weight, then all weights MUST be zero and the receiver of 1382 the Map-Reply will decide how to load-split traffic. See 1383 Section 6.4 for a suggested hash algorithm to distribute load 1384 across locators with the same priority and equal weight values. When 1385 a single RLOC exists in a mapping entry, the weight value MUST be 1386 set to 100 and ignored on receipt. 1388 M Priority: each RLOC is assigned a multicast priority used by an 1389 ETR in a receiver multicast site to select an ITR in a source 1390 multicast site for building multicast distribution trees. A value 1391 of 255 means the RLOC MUST NOT be used for joining a multicast 1392 distribution tree. 1394 M Weight: when priorities are the same for multiple RLOCs, the 1395 weight indicates how to balance building multicast distribution 1396 trees across multiple ITRs. The weight is encoded as a percentage 1397 of the total number of trees built to the source site identified by 1398 the EID-prefix.
If a non-zero weight value is used for any RLOC, 1399 then all RLOCs MUST use a non-zero weight value and then the sum 1400 of all weight values MUST equal 100. If a zero value is used for 1401 any RLOC weight, then all weights MUST be zero and the receiver of 1402 the Map-Reply will decide how to distribute multicast state across 1403 ITRs. 1405 Unused Flags: set to 0 when sending and ignored on receipt. 1407 L: when this bit is set, the locator is flagged as a local locator to 1408 the ETR that is sending the Map-Reply. When a Map-Server is doing 1409 proxy Map-Replying [LISP-MS] for a LISP site, the L bit is set to 1410 0 for all locators in this locator-set. 1412 p: when this bit is set, an ETR informs the RLOC-probing ITR that the 1413 locator address, for which this bit is set, is the one being RLOC- 1414 probed and may be different from the source address of the Map- 1415 Reply. An ITR that RLOC-probes a particular locator, MUST use 1416 this locator for retrieving the data structure used to store the 1417 fact that the locator is reachable. The "p" bit is set for a 1418 single locator in the same locator set. If an implementation sets 1419 more than one "p" bit erroneously, the receiver of the Map-Reply 1420 MUST select the first locator. The "p" bit MUST NOT be set for 1421 locator-set records sent in Map-Request and Map-Register messages. 1423 R: when this bit is set, the locator is known to be reachable from 1424 the Map-Reply sender's perspective. 1426 Locator: an IPv4 or IPv6 address (as encoded by the 'Loc-AFI' field) 1427 assigned to an ETR or router acting as a proxy replier for the 1428 EID-prefix. Note that the destination RLOC address MAY be an 1429 anycast address. A source RLOC can be an anycast address as well. 1430 The source or destination RLOC MUST NOT be the broadcast address 1431 (255.255.255.255 or any subnet broadcast address known to the 1432 router), and MUST NOT be a link-local multicast address. 
The 1433 source RLOC MUST NOT be a multicast address. The destination RLOC 1434 SHOULD be a multicast address if it is being mapped from a 1435 multicast destination EID. 1437 Mapping Protocol Data: See [CONS] or [ALT] for details. This field 1438 is optional and present when the UDP length indicates there is 1439 enough space in the packet to include it. 1441 6.1.5. EID-to-RLOC UDP Map-Reply Message 1443 When a Data Probe packet or a Map-Request triggers a Map-Reply to be 1444 sent, the RLOCs associated with the EID-prefix matched by the EID in 1445 the original packet destination IP address field will be returned. 1446 The RLOCs in the Map-Reply are the globally-routable IP addresses of 1447 the ETR but are not necessarily reachable; separate testing of 1448 reachability is required. 1450 Note that a Map-Reply may contain different EID-prefix granularity 1451 (prefix + length) than the Map-Request which triggers it. This might 1452 occur if a Map-Request were for a prefix that had been returned by an 1453 earlier Map-Reply. In such a case, the requester updates its cache 1454 with the new prefix information and granularity. For example, a 1455 requester with two cached EID-prefixes that are covered by a Map- 1456 Reply containing one, less-specific prefix, replaces the entry with 1457 the less-specific EID-prefix. Note that the reverse, replacement of 1458 one less-specific prefix with multiple more-specific prefixes, can 1459 also occur but not by removing the less-specific prefix rather by 1460 adding the more-specific prefixes which during a lookup will override 1461 the less-specific prefix. 1463 When an ETR is configured with overlapping EID-prefixes, a Map- 1464 Request with an EID that longest matches any EID-prefix MUST be 1465 returned in a single Map-Reply message. 
For instance, if an ETR had 1466 database mapping entries for EID-prefixes: 1468 10.0.0.0/8 1469 10.1.0.0/16 1470 10.1.1.0/24 1471 10.1.2.0/24 1473 A Map-Request for EID 10.1.1.1 would cause a Map-Reply with a record 1474 count of 1 to be returned with a mapping record EID-prefix of 1475 10.1.1.0/24. 1477 A Map-Request for EID 10.1.5.5 would cause a Map-Reply with a record 1478 count of 3 to be returned with mapping records for EID-prefixes 1479 10.1.0.0/16, 10.1.1.0/24, and 10.1.2.0/24. 1481 Note that not all overlapping EID-prefixes need to be returned, only 1482 the more specifics (note in the second example above 10.0.0.0/8 was 1483 not returned for requesting EID 10.1.5.5) entries for the matching 1484 EID-prefix of the requesting EID. When more than one EID-prefix is 1485 returned, all SHOULD use the same Time-to-Live value so they can all 1486 time out at the same time. When a more specific EID-prefix is 1487 received later, its Time-to-Live value in the Map-Reply record can be 1488 stored even when other less specifics exist. When a less specific 1489 EID-prefix is received later, its map-cache expiration time SHOULD be 1490 set to the minimum expiration time of any more specific EID-prefix in 1491 the map-cache. 1493 Map-Replies SHOULD be sent for an EID-prefix no more often than once 1494 per second to the same requesting router. For scalability, it is 1495 expected that aggregation of EID addresses into EID-prefixes will 1496 allow one Map-Reply to satisfy a mapping for the EID addresses in the 1497 prefix range thereby reducing the number of Map-Request messages. 1499 Map-Reply records can have an empty locator-set. This type of Map- 1500 Reply is called a Negative Map-Reply. Negative Map-Replies convey 1501 special actions by the sender to the ITR or PTR which has solicited 1502 the Map-Reply. There are two primary applications for Negative Map- 1503 Replies.
The first is for a Map-Resolver to instruct an ITR or PTR 1504 when a destination is for a LISP site versus a non-LISP site. The 1505 other is to source quench Map-Requests which are sent for non- 1506 allocated EIDs. 1508 For each Map-Reply record, the list of locators in a locator-set MUST 1509 appear in the same order for each ETR that originates a Map-Reply 1510 message. The locator-set MUST be sorted in order of ascending IP 1511 address where an IPv4 locator address is considered numerically 'less 1512 than' an IPv6 locator address. 1514 When sending a Map-Reply message, the destination address is copied 1515 from one of the ITR-RLOC fields from the Map-Request. The ETR 1516 can choose a locator address from one of the address families it 1517 supports. For Data-Probes, the destination address of the Map-Reply 1518 is copied from the source address of the Data-Probe message which is 1519 invoking the reply. The source address of the Map-Reply is one of 1520 the local locator addresses listed in the locator-set of any mapping 1521 record in the message and SHOULD be chosen to allow uRPF checks to 1522 succeed in the upstream service provider. The destination port of a 1523 Map-Reply message is copied from the source port of the Map-Request 1524 or Data-Probe and the source port of the Map-Reply message is set to 1525 the well-known UDP port 4342. 1527 6.1.5.1. Traffic Redirection with Coarse EID-Prefixes 1529 When an ETR is misconfigured or compromised, it could return coarse 1530 EID-prefixes in Map-Reply messages it sends. The EID-prefix could 1531 cover EID-prefixes which are allocated to other sites, redirecting 1532 their traffic to the locators of the compromised site. 1534 To solve this problem, there are two basic solutions that could be 1535 used. The first is to have Map-Servers proxy-map-reply on behalf of 1536 ETRs so their registered EID-prefixes are the ones returned in Map- 1537 Replies.
Since the interaction between an ETR and Map-Server is 1538 secured with shared-keys, it is more difficult for an ETR to 1539 misbehave. The second solution is to have ITRs and PTRs cache EID- 1540 prefixes with mask-lengths that are greater than or equal to a 1541 configured prefix length. This limits the damage to a specific width 1542 of any EID-prefix advertised, but needs to be coordinated with the 1543 allocation of site prefixes. These solutions can be used 1544 independently or at the same time. 1546 At the time of this writing, other approaches are being considered 1547 and researched. 1549 6.1.6. Map-Register Message Format 1551 The usage details of the Map-Register message can be found in 1552 specification [LISP-MS]. This section solely defines the message 1553 format. 1555 The message is sent in UDP with a destination UDP port of 4342 and a 1556 randomly selected UDP source port number. 1558 The Map-Register message format is: 1560 0 1 2 3 1561 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1562 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1563 |Type=3 |P| Reserved | Record Count | 1564 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1565 | Nonce . . . | 1566 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1567 | . . . 
Nonce | 1568 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1569 | Key ID | Authentication Data Length | 1570 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1571 ~ Authentication Data ~ 1572 +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1573 | | Record TTL | 1574 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1575 R | Locator Count | EID mask-len | ACT |A| Reserved | 1576 e +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1577 c | Rsvd | Map-Version Number | EID-AFI | 1578 o +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1579 r | EID-prefix | 1580 d +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1581 | /| Priority | Weight | M Priority | M Weight | 1582 | L +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1583 | o | Unused Flags |L|p|R| Loc-AFI | 1584 | c +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1585 | \| Locator | 1586 +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1588 Packet field descriptions: 1590 Type: 3 (Map-Register) 1592 P: This is the proxy-map-reply bit, when set to 1 an ETR sends a Map- 1593 Register message requesting for the Map-Server to proxy Map-Reply. 1594 The Map-Server will send non-authoritative Map-Replies on behalf 1595 of the ETR. Details on this usage will be provided in a future 1596 version of this draft. 1598 Reserved: Set to 0 on transmission and ignored on receipt. 1600 Record Count: The number of records in this Map-Register message. A 1601 record is comprised of that portion of the packet labeled 'Record' 1602 above and occurs the number of times equal to Record count. 1604 Nonce: This 8-byte Nonce field is set to 0 in Map-Register messages. 1606 Key ID: A configured ID to find the configured Message 1607 Authentication Code (MAC) algorithm and key value used for the 1608 authentication function. 
1610 Authentication Data Length: The length in bytes of the 1611 Authentication Data field that follows this field. The length of 1612 the Authentication Data field is dependent on the Message 1613 Authentication Code (MAC) algorithm used. The length field allows 1614 a device that doesn't know the MAC algorithm to correctly parse 1615 the packet. 1617 Authentication Data: The message digest produced by the 1618 Message Authentication Code (MAC) algorithm. The entire Map- 1619 Register payload is authenticated with this field preset to 0. 1620 After the MAC is computed, it is placed in this field. 1621 Implementations of this specification MUST include support for 1622 HMAC-SHA-1-96 [RFC2404] and support for HMAC-SHA-128-256 [RFC4634] 1623 is recommended. 1625 The definition of the rest of the Map-Register can be found in the 1626 Map-Reply section. However, the record TTL field is not used and is set 1627 to 0. 1629 6.1.7. Encapsulated Control Message Format 1631 An Encapsulated Control Message is used to encapsulate control 1632 packets sent between xTRs and the mapping database system described 1633 in [LISP-MS].
1635 0 1 2 3 1636 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1637 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1638 / | IPv4 or IPv6 Header | 1639 OH | (uses RLOC addresses) | 1640 \ | | 1641 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1642 / | Source Port = xxxx | Dest Port = 4342 | 1643 UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1644 \ | UDP Length | UDP Checksum | 1645 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1646 LH |Type=8 | Reserved | 1647 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1648 / | IPv4 or IPv6 Header | 1649 IH | (uses RLOC or EID addresses) | 1650 \ | | 1651 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1652 / | Source Port = xxxx | Dest Port = yyyy | 1653 UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1654 \ | UDP Length | UDP Checksum | 1655 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1656 LCM | LISP Control Message | 1657 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1659 Packet header descriptions: 1661 OH: The outer IPv4 or IPv6 header which uses RLOC addresses in the 1662 source and destination header address fields. 1664 UDP: The outer UDP header with destination port 4342. The source 1665 port is randomly allocated. The checksum field MUST be non-zero. 1667 LH: Type 8 is defined to be a "LISP Encapsulated Control Message" 1668 and what follows is either an IPv4 or IPv6 header as encoded by 1669 the first 4 bits after the reserved field. 1671 IH: The inner IPv4 or IPv6 header which can use either RLOC or EID 1672 addresses in the header address fields. When a Map-Request is 1673 encapsulated in this packet format the destination address in this 1674 header is an EID. 1676 UDP: The inner UDP header where the port assignments depends on the 1677 control packet being encapsulated. 
When the control packet is a 1678 Map-Request or Map-Register, the source port is randomly assigned 1679 and the destination port is 4342. When the control packet is a 1680 Map-Reply, the source port is 4342 and the destination port is 1681 assigned from the source port of the invoking Map-Request. Port 1682 number 4341 MUST NOT be assigned to either port. The checksum 1683 field MUST be non-zero. 1685 LCM: The format is one of the control message formats described in 1686 this section. At this time, only Map-Request messages and PIM 1687 Join-Prune messages [MLISP] are allowed to be encapsulated. 1688 Encapsulating other types of LISP control messages is for further 1689 study. When Map-Requests are sent for RLOC-probing purposes (i.e., 1690 the probe-bit is set), they MUST NOT be sent inside Encapsulated 1691 Control Messages. 1693 6.2. Routing Locator Selection 1695 Both the client-side and the server-side may need control over the selection 1696 of RLOCs for conversations between them. This control is achieved by 1697 manipulating the Priority and Weight fields in EID-to-RLOC Map-Reply 1698 messages. Alternatively, RLOC information may be gleaned from 1699 received tunneled packets or EID-to-RLOC Map-Request messages. 1701 The following enumerates different scenarios for choosing RLOCs and 1702 the controls that are available: 1704 o Server-side returns one RLOC. Client-side can only use one RLOC. 1705 Server-side has complete control of the selection. 1707 o Server-side returns a list of RLOCs where a subset of the list has 1708 the same best priority. Client can only use the subset list 1709 according to the weighting assigned by the server-side. In this 1710 case, the server-side controls both the subset list and load- 1711 splitting across its members. The client-side can use RLOCs 1712 outside of the subset list if it determines that the subset list 1713 is unreachable (unless RLOCs are set to a Priority of 255).
Some 1714 sharing of control exists: the server-side determines the 1715 destination RLOC list and load distribution while the client-side 1716 has the option of using alternatives to this list if RLOCs in the 1717 list are unreachable. 1719 o Server-side sets a weight of 0 for the RLOC subset list. In this 1720 case, the client-side can choose how the traffic load is spread 1721 across the subset list. Control is shared by the server-side 1722 determining the list and the client determining load distribution. 1723 Again, the client can use alternative RLOCs if the server-provided 1724 list of RLOCs is unreachable. 1726 o Either side (more likely the server-side ETR) decides not to 1727 send a Map-Request. For example, if the server-side ETR does not 1728 send Map-Requests, it gleans RLOCs from the client-side ITR, 1729 giving the client-side ITR responsibility for bidirectional RLOC 1730 reachability and preferability. Server-side ETR gleaning of the 1731 client-side ITR RLOC is done by caching the inner header source 1732 EID and the outer header source RLOC of received packets. The 1733 client-side ITR controls how traffic is returned and can alternate 1734 using an outer header source RLOC, which then can be added to the 1735 list the server-side ETR uses to return traffic. Since no 1736 Priority or Weights are provided using this method, the server- 1737 side ETR MUST assume each client-side ITR RLOC uses the same best 1738 Priority with a Weight of zero. In addition, since EID-prefix 1739 encoding cannot be conveyed in data packets, the EID-to-RLOC cache 1740 on tunnel routers can grow to be very large. 1742 o A "gleaned" map-cache entry, one learned from the source RLOC of a 1743 received encapsulated packet, is only stored and used for a few 1744 seconds, pending verification. Verification is performed by 1745 sending a Map-Request to the source EID (the inner header IP 1746 source address) of the received encapsulated packet.
A reply to 1747 this "verifying Map-Request" is used to fully populate the map- 1748 cache entry for the "gleaned" EID and is stored and used for the 1749 time indicated by the TTL field of a received Map-Reply. When a 1750 verified map-cache entry is stored, data gleaning no longer occurs 1751 for subsequent packets which have a source EID that matches the 1752 EID-prefix of the verified entry. 1754 RLOCs that appear in EID-to-RLOC Map-Reply messages are assumed to be 1755 reachable when the R-bit for the locator record is set to 1. Neither 1756 the information contained in a Map-Reply nor that stored in the 1757 mapping database system provides reachability information for RLOCs. 1758 Such reachability needs to be determined separately, using one or 1759 more of the Routing Locator Reachability Algorithms described in the 1760 next section. 1762 6.3. Routing Locator Reachability 1764 Several mechanisms for determining RLOC reachability are currently 1765 defined: 1767 1. An ETR may examine the Loc-Status-Bits in the LISP header of an 1768 encapsulated data packet received from an ITR. If the ETR is 1769 also acting as an ITR and has traffic to return to the original 1770 ITR site, it can use this status information to help select an 1771 RLOC. 1773 2. An ITR may receive an ICMP Network or ICMP Host Unreachable 1774 message for an RLOC it is using. This indicates that the RLOC is 1775 likely down. 1777 3. An ITR which participates in the global routing system can 1778 determine that an RLOC is down if no BGP RIB route exists that 1779 matches the RLOC IP address. 1781 4. An ITR may receive an ICMP Port Unreachable message from a 1782 destination host. This occurs if an ITR attempts to use 1783 interworking [INTERWORK] and LISP-encapsulated data is sent to a 1784 non-LISP-capable site. 1786 5. An ITR may receive a Map-Reply from an ETR in response to a 1787 previously sent Map-Request.
The RLOC source of the Map-Reply is 1788 likely up since the ETR was able to send the Map-Reply to the 1789 ITR. 1791 6. When an ETR receives an encapsulated packet from an ITR, the 1792 source RLOC from the outer header of the packet is likely up. 1794 7. An ITR/ETR pair can use the Locator Reachability Algorithms 1795 described in this section, namely Echo-Noncing or RLOC-Probing. 1797 When determining Locator up/down reachability by examining the Loc- 1798 Status-Bits from the LISP encapsulated data packet, an ETR will 1799 receive up to date status from an encapsulating ITR about 1800 reachability for all ETRs at the site. CE-based ITRs at the source 1801 site can determine reachability relative to each other using the site 1802 IGP as follows: 1804 o Under normal circumstances, each ITR will advertise a default 1805 route into the site IGP. 1807 o If an ITR fails or if the upstream link to its PE fails, its 1808 default route will either time-out or be withdrawn. 1810 Each ITR can thus observe the presence or lack of a default route 1811 originated by the others to determine the Locator Status Bits it sets 1812 for them. 1814 RLOCs listed in a Map-Reply are numbered with ordinals 0 to n-1. The 1815 Loc-Status-Bits in a LISP encapsulated packet are numbered from 0 to 1816 n-1 starting with the least significant bit. For example, if an RLOC 1817 listed in the 3rd position of the Map-Reply goes down (ordinal value 1818 2), then all ITRs at the site will clear the 3rd least significant 1819 bit (xxxx x0xx) of the Loc-Status-Bits field for the packets they 1820 encapsulate. 1822 When an ETR decapsulates a packet, it will check for any change in 1823 the Loc-Status-Bits field. When a bit goes from 1 to 0, the ETR will 1824 refrain from encapsulating packets to an RLOC that is indicated as 1825 down. It will only resume using that RLOC if the corresponding Loc- 1826 Status-Bit returns to a value of 1. 
Loc-Status-Bits are associated 1827 with a locator-set per EID-prefix. Therefore, when a locator becomes 1828 unreachable, the Loc-Status-Bit that corresponds to that locator's 1829 position in the list returned by the last Map-Reply will be set to 1830 zero for that particular EID-prefix. 1832 When ITRs at the site are not deployed in CE routers, the IGP can 1833 still be used to determine the reachability of Locators provided they 1834 are injected into the IGP. This is typically done when a /32 address 1835 is configured on a loopback interface. 1837 When ITRs receive ICMP Network or Host Unreachable messages as a 1838 method to determine unreachability, they will refrain from using 1839 Locators which are described in Locator lists of Map-Replies. 1840 However, using this approach is unreliable because many network 1841 operators turn off generation of ICMP Unreachable messages. 1843 If an ITR does receive an ICMP Network or Host Unreachable message, 1844 it MAY originate its own ICMP Unreachable message destined for the 1845 host that originated the data packet the ITR encapsulated. 1847 Also, BGP-enabled ITRs can unilaterally examine the BGP RIB to see if 1848 a locator address from a locator-set in a mapping entry matches a 1849 prefix. If it does not find one and BGP is running in the Default 1850 Free Zone (DFZ), it can decide to not use the locator even though the 1851 Loc-Status-Bits indicate the locator is up. In this case, the path 1852 from the ITR to the ETR that is assigned the locator is not 1853 available. More details are in [LOC-ID-ARCH]. 1855 Optionally, an ITR can send a Map-Request to a Locator and if a Map- 1856 Reply is returned, reachability of the Locator has been determined. 1857 Obviously, sending such probes increases the number of control 1858 messages originated by tunnel routers for active flows, so Locators 1859 are assumed to be reachable when they are advertised. 
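The Loc-Status-Bits bookkeeping described in this section can be illustrated with a minimal Python sketch. It assumes the field is held as a plain integer and that locators are kept in the order returned by the last Map-Reply. The function names are hypothetical and are not defined by this specification.

```python
def clear_locator_bit(lsb: int, ordinal: int) -> int:
    """Mark the locator at the given 0-based ordinal as down."""
    return lsb & ~(1 << ordinal)


def set_locator_bit(lsb: int, ordinal: int) -> int:
    """Mark the locator at the given 0-based ordinal as up."""
    return lsb | (1 << ordinal)


def usable_locators(lsb: int, locators: list) -> list:
    """Return the locators whose Loc-Status-Bit is 1.

    Bit 0 is the least significant bit and maps to the first locator
    (ordinal 0) of the locator-set.
    """
    return [loc for i, loc in enumerate(locators) if (lsb >> i) & 1]


# Example: three locators, all up, then the 3rd one (ordinal 2) goes down.
# The addresses are documentation-range placeholders.
locators = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
lsb = clear_locator_bit(0b111, 2)
```

Clearing ordinal 2 of an all-ones field yields the xxxx x0xx pattern used in the example earlier in this section. Whichever mechanism updates these bits, the default stated above still applies: a Locator is assumed to be reachable when it is advertised.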
   The assumption that advertised Locators are reachable does create a
   dependency: Locator unreachability is detected by the receipt of
   ICMP Host Unreachable messages. When a Locator has been determined
   to be unreachable, it is not used for active traffic; this is the
   same as if it were listed in a Map-Reply with priority 255.

   The ITR can test the reachability of the unreachable Locator by
   sending periodic Requests. Both Requests and Replies MUST be
   rate-limited. Locator reachability testing is never done with data
   packets since that increases the risk of packet loss for end-to-end
   sessions.

   When an ETR decapsulates a packet, it knows that it is reachable
   from the encapsulating ITR because that is how the packet arrived.
   In most cases, the ETR can also reach the ITR but cannot assume
   this to be true due to the possibility of path asymmetry. In the
   presence of unidirectional traffic flow from an ITR to an ETR, the
   ITR SHOULD NOT use the lack of return traffic as an indication that
   the ETR is unreachable. Instead, it MUST use an alternate mechanism
   to determine reachability.

6.3.1. Echo Nonce Algorithm

   When data flows bidirectionally between locators from different
   sites, a simple mechanism called "nonce echoing" can be used to
   determine reachability between an ITR and ETR. When an ITR wants to
   solicit a nonce echo, it sets the N and E bits and places a 24-bit
   nonce in the LISP header of the next encapsulated data packet.

   When this packet is received by the ETR, the encapsulated packet is
   forwarded as normal. When the ETR next sends a data packet to the
   ITR, it includes the nonce received earlier with the N bit set and
   E bit cleared. The ITR sees this "echoed nonce" and knows the path
   to and from the ETR is up.

   The ITR will set the E-bit and N-bit for every packet it sends
   while in echo-nonce-request state.
   The time the ITR waits to process the echoed nonce before it
   determines the path is unreachable is variable and a choice left
   for the implementation.

   If the ITR is receiving packets from the ETR but does not see the
   nonce echoed while being in echo-nonce-request state, then the path
   to the ETR is unreachable. This decision may be overridden by other
   locator reachability algorithms. Once the ITR determines the path
   to the ETR is down it can switch to another locator for that
   EID-prefix.

   Note that "ITR" and "ETR" are relative terms here. Both devices
   MUST be implementing both ITR and ETR functionality for the echo
   nonce mechanism to operate.

   The ITR and ETR may both go into echo-nonce-request state at the
   same time. The number of packets sent or the time during which echo
   nonce requests are sent is an implementation-specific setting.
   However, when an ITR is in echo-nonce-request state, it can echo
   the ETR's nonce in the next set of packets that it encapsulates and
   then subsequently continue sending echo-nonce-request packets.

   This mechanism does not completely solve the forward path
   reachability problem as traffic may be unidirectional. That is, the
   ETR receiving traffic at a site may not be the same device as an
   ITR which transmits traffic from that site, or the site-to-site
   traffic is unidirectional so there is no ITR returning traffic.

   The echo-nonce algorithm is bilateral. That is, if one side sets
   the E-bit and the other side is not enabled for echo-noncing, then
   the echoing of the nonce does not occur and the requesting side may
   erroneously regard the locator as unreachable. An ITR SHOULD only
   set the E-bit in an encapsulated data packet when it knows the ETR
   is enabled for echo-noncing. This is conveyed by the E-bit in the
   Map-Reply message.

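   The request/echo exchange described in this section can be
   sketched in Python as follows. This is a minimal illustration
   under stated assumptions, not a definitive implementation: the
   class and field names (LispFlags, EchoNonceRequester,
   EchoNonceResponder) are hypothetical, the header is reduced to the
   N bit, E bit, and nonce, and the choice to echo a nonce in only
   one return packet is an assumption the text leaves to the
   implementation.

```python
import secrets
from dataclasses import dataclass
from typing import Optional

NONCE_BITS = 24  # nonce width stated in the text


@dataclass
class LispFlags:
    """Hypothetical, simplified view of the nonce-related header fields."""
    n_bit: bool = False
    e_bit: bool = False
    nonce: int = 0


class EchoNonceRequester:
    """ITR role: solicits an echo and checks returning packets."""

    def __init__(self) -> None:
        self.pending: Optional[int] = None  # nonce awaiting an echo

    def start_request(self) -> None:
        """Enter echo-nonce-request state with a fresh random nonce."""
        self.pending = secrets.randbits(NONCE_BITS)

    def outgoing_flags(self) -> LispFlags:
        """While in request state, every packet carries N=1, E=1."""
        if self.pending is None:
            return LispFlags()
        return LispFlags(n_bit=True, e_bit=True, nonce=self.pending)

    def on_return_packet(self, flags: LispFlags) -> bool:
        """Return True when the packet echoes the pending nonce."""
        echoed = (
            self.pending is not None
            and flags.n_bit
            and not flags.e_bit
            and flags.nonce == self.pending
        )
        if echoed:
            self.pending = None  # leave echo-nonce-request state
        return echoed


class EchoNonceResponder:
    """ETR role: remembers a requested nonce and echoes it back."""

    def __init__(self) -> None:
        self.to_echo: Optional[int] = None

    def on_data_packet(self, flags: LispFlags) -> None:
        if flags.n_bit and flags.e_bit:
            self.to_echo = flags.nonce

    def outgoing_flags(self) -> LispFlags:
        """Echo with N=1, E=0 in the next packet (assumed: echo once)."""
        if self.to_echo is None:
            return LispFlags()
        flags = LispFlags(n_bit=True, e_bit=False, nonce=self.to_echo)
        self.to_echo = None
        return flags


itr, etr = EchoNonceRequester(), EchoNonceResponder()
itr.start_request()
request = itr.outgoing_flags()
etr.on_data_packet(request)
echo = etr.outgoing_flags()
path_is_up = itr.on_return_packet(echo)
```

   The timeout after which a missing echo is treated as
   unreachability is deliberately omitted, since the text leaves that
   choice to the implementation.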
   Note that other locator reachability mechanisms are being
   researched and can be used to complement or even override the Echo
   Nonce Algorithm. See the next section for an example of
   control-plane probing.

6.3.2. RLOC Probing Algorithm

   RLOC Probing is a method that an ITR or PTR can use to determine
   the reachability status of one or more locators that it has cached
   in a map-cache entry. The probe-bit of the Map-Request and
   Map-Reply messages is used for RLOC Probing.

   RLOC probing is done in the control-plane on a timer basis where an
   ITR or PTR will originate a Map-Request destined to a locator
   address from one of its own locator addresses. A Map-Request used
   as an RLOC-probe is NOT encapsulated and NOT sent to a Map-Server
   or on the ALT like one would when soliciting mapping data. The EID
   record encoded in the Map-Request is the EID-prefix of the
   map-cache entry cached by the ITR or PTR. The ITR or PTR may
   include a mapping data record for its own database mapping
   information.

   When an ETR receives a Map-Request message with the probe-bit set,
   it returns a Map-Reply with the probe-bit set. The source address
   of the Map-Reply is set from the destination address of the
   Map-Request and the destination address of the Map-Reply is set
   from the source address of the Map-Request. The Map-Reply SHOULD
   contain mapping data for the EID-prefix contained in the
   Map-Request. This provides the opportunity for the ITR or PTR that
   sent the RLOC-probe to get mapping updates if there were changes to
   the ETR's database mapping entries.

   There are advantages and disadvantages of RLOC Probing.
   The greatest benefit of RLOC Probing is that it can handle many
   failure scenarios, allowing the ITR to determine when the path to a
   specific locator is reachable or has become unreachable, thus
   providing a robust mechanism for switching to using another locator
   from the cached locator. RLOC Probing can also provide rough RTT
   estimates between a pair of locators, which can be useful for
   network management purposes as well as for selecting low delay
   paths. The major disadvantage of RLOC Probing is in the number of
   control messages required and the amount of bandwidth used to
   obtain those benefits, especially if the required failure detection
   times are very small.

   Continued research and testing will attempt to characterize the
   tradeoffs of failure detection times versus message overhead.

6.4. Routing Locator Hashing

   When an ETR provides an EID-to-RLOC mapping in a Map-Reply message
   to a requesting ITR, the locator-set for the EID-prefix may contain
   different priority values for each locator address. When more than
   one best priority locator exists, the ITR can decide how to load
   share traffic against the corresponding locators.

   The following hash algorithm may be used by an ITR to select a
   locator for a packet destined to an EID for the EID-to-RLOC
   mapping:

   1.  Either a source and destination address hash can be used or the
       traditional 5-tuple hash which includes the source and
       destination addresses, source and destination TCP, UDP, or SCTP
       port numbers and the IP protocol number field or IPv6
       next-protocol fields of a packet a host originates from within
       a LISP site. When a packet is not a TCP, UDP, or SCTP packet,
       only the source and destination addresses from the header are
       used to compute the hash.

   2.  Take the hash value and divide it by the number of locators
       stored in the locator-set for the EID-to-RLOC mapping.

   3.  The remainder will yield a value of 0 to "number of locators
       minus 1". Use the remainder to select the locator in the
       locator-set.

   Note that when a packet is LISP encapsulated, the source port
   number in the outer UDP header needs to be set. Selecting a random
   value allows core routers which are attached to Link Aggregation
   Groups (LAGs) to load-split the encapsulated packets across member
   links of such LAGs. Otherwise, core routers would see a single
   flow, since packets have a source address of the ITR, for packets
   which are originated by different EIDs at the source site. A
   suggested setting for the source port number computed by an ITR is
   a 5-tuple hash function on the inner header, as described above.

   Many core router implementations use a 5-tuple hash to decide how
   to balance packet load across members of a LAG. The 5-tuple hash
   includes the source and destination addresses of the packet and the
   source and destination ports when the protocol number in the packet
   is TCP or UDP. For this reason, UDP encoding is used for LISP
   encapsulation.

6.5. Changing the Contents of EID-to-RLOC Mappings

   Since the LISP architecture uses a caching scheme to retrieve and
   store EID-to-RLOC mappings, the only way an ITR can get a more
   up-to-date mapping is to re-request the mapping. However, the ITRs
   do not know when the mappings change and the ETRs do not keep track
   of who requested their mappings. For scalability reasons, we want
   to maintain this approach but need to provide a way for ETRs to
   change their mappings and inform the sites that are currently
   communicating with the ETR site using such mappings.

   When a locator record is added to the end of a locator-set, it is
   easy to update mappings.
   We assume new mappings will maintain the same locator ordering as
   the old mapping but just have new locators appended to the end of
   the list. So some ITRs can have a new mapping while other ITRs have
   only an old mapping that is used until they time out. When an ITR
   has only an old mapping but detects bits set in the loc-status-bits
   that correspond to locators beyond the list it has cached, it
   simply ignores them. However, this can only happen for locator
   addresses that are lexicographically greater than the locator
   addresses in the existing locator-set.

   When a locator record is removed from a locator-set, ITRs that have
   the mapping cached will not use the removed locator because the
   xTRs will set the loc-status-bit to 0. So even if the locator is in
   the list, it will not be used. For new mapping requests, the xTRs
   can set the locator AFI to 0 (indicating an unspecified address),
   as well as setting the corresponding loc-status-bit to 0. This
   forces ITRs with old or new mappings to avoid using the removed
   locator.

   If many changes occur to a mapping over a long period of time, one
   will find empty record slots in the middle of the locator-set and
   new records appended to the locator-set. At some point, it would be
   useful to compact the locator-set so the loc-status-bit settings
   can be efficiently packed.

   We propose here three approaches for locator-set compaction, one
   operational and two protocol mechanisms. The operational approach
   uses a clock sweep method. The protocol approaches use the concept
   of Solicit-Map-Requests and Map-Versioning.

6.5.1. Clock Sweep

   The clock sweep approach uses planning in advance and the use of
   count-down TTLs to time out mappings that have already been cached.
   The default setting for an EID-to-RLOC mapping TTL is 24 hours. So
   there is a 24 hour window to time out old mappings.
   The following clock sweep procedure is used:

   1.  24 hours before a mapping change is to take effect, a network
       administrator configures the ETRs at a site to start the clock
       sweep window.

   2.  During the clock sweep window, ETRs continue to send Map-Reply
       messages with the current (unchanged) mapping records. The TTL
       for these mappings is set to 1 hour.

   3.  24 hours later, all previous cache entries will have timed out,
       and any active cache entries will time out within 1 hour.
       During this 1 hour window the ETRs continue to send Map-Reply
       messages with the current (unchanged) mapping records with the
       TTL set to 1 minute.

   4.  At the end of the 1 hour window, the ETRs will send Map-Reply
       messages with the new (changed) mapping records. So any active
       caches can get the new mapping contents right away if not
       cached, or in 1 minute if they had the mapping cached.

6.5.2. Solicit-Map-Request (SMR)

   Soliciting a Map-Request is a selective way for xTRs, at the site
   where mappings change, to control the rate they receive requests
   for Map-Reply messages. SMRs are also used to tell remote ITRs to
   update the mappings they have cached.

   Since the xTRs don't keep track of remote ITRs that have cached
   their mappings, they cannot tell exactly who needs the new mapping
   entries. So an xTR will solicit Map-Requests from sites it is
   currently sending encapsulated data to, and only from those sites.
   The xTRs can locally decide the algorithm for how often and to how
   many sites they send SMR messages.

   An SMR message is simply a bit set in a Map-Request message. An ITR
   or PTR will send a Map-Request when it receives an SMR message.
   Both the SMR sender and the Map-Request responder MUST rate-limit
   these messages.

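   The text requires rate limiting of SMR and Map-Request messages
   but does not say how. One simple possibility, shown here only as
   an illustration, is a per-peer minimum-interval limiter. The class
   name PerPeerRateLimiter, the one-second interval, and the example
   peer address are all assumptions, not values taken from the
   specification.

```python
from typing import Dict


class PerPeerRateLimiter:
    """Allow at most one message per peer every min_interval seconds."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._last_sent: Dict[str, float] = {}

    def allow(self, peer: str, now: float) -> bool:
        """Return True and record the send time if the peer is not throttled."""
        last = self._last_sent.get(peer)
        if last is not None and now - last < self.min_interval:
            return False  # too soon since the last message to this peer
        self._last_sent[peer] = now
        return True


# Hypothetical usage: timestamps are passed in explicitly so the
# behaviour is deterministic; a real sender might use time.monotonic().
limiter = PerPeerRateLimiter(min_interval=1.0)
decisions = [limiter.allow("198.51.100.7", t) for t in (0.0, 0.4, 1.0, 1.5)]
print(decisions)
```

   Keying the limiter on the remote locator keeps one busy peer from
   suppressing messages to others; a token bucket would be an
   equally reasonable choice when short bursts should be allowed.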
   The following procedure shows how an SMR exchange occurs when a
   site is doing locator-set compaction for an EID-to-RLOC mapping:

   1.  When the database mappings in an ETR change, the ETRs at the
       site begin to send Map-Requests with the SMR bit set for each
       locator in each map-cache entry the ETR caches.

   2.  A remote xTR which receives the SMR message will schedule
       sending a Map-Request message to the source locator address of
       the SMR message. A newly allocated random nonce is selected and
       the EID-prefix used is the one copied from the SMR message.

   3.  The remote xTR MUST rate-limit the Map-Request until it gets a
       Map-Reply while continuing to use the cached mapping.

   4.  The ETRs at the site with the changed mapping will reply to the
       Map-Request with a Map-Reply message provided the Map-Request
       nonce matches the nonce from the SMR. The Map-Reply messages
       SHOULD be rate-limited. This is important to avoid Map-Reply
       implosion.

   5.  The ETRs, at the site with the changed mapping, record the fact
       that the site that sent the Map-Request has received the new
       mapping data in the mapping cache entry for the remote site so
       the loc-status-bits are reflective of the new mapping for
       packets going to the remote site. The ETR then stops sending
       SMR messages.

   For security reasons an ITR MUST NOT process unsolicited
   Map-Replies. The nonce MUST be carried from the SMR packet into the
   resultant Map-Request, and then into the Map-Reply, to reduce
   spoofing attacks.

   To avoid map-cache entry corruption by a third party, a sender of
   an SMR-based Map-Request MUST be verified. If an ITR receives an
   SMR-based Map-Request and the source is not in the locator-set for
   the stored map-cache entry, then the responding Map-Request MUST be
   sent with an EID destination to the mapping database system.
   Since the mapping database system is a more secure way to reach an
   authoritative ETR, it will deliver the Map-Request to the
   authoritative source of the mapping data.

6.5.3. Database Map Versioning

   When there is unidirectional packet flow between an ITR and ETR,
   and the EID-to-RLOC mappings change on the ETR, it needs to inform
   the ITR so encapsulation can stop to a removed locator and start to
   a new locator in the locator-set.

   An ETR, when it sends Map-Reply messages, conveys its own
   Map-Version number. This is known as the Destination Map-Version
   Number. ITRs include the Destination Map-Version Number in packets
   they encapsulate to the site. When an ETR decapsulates a packet and
   detects the Destination Map-Version Number is less than the current
   version for its mapping, the SMR procedure described in
   Section 6.5.2 occurs.

   An ITR, when it encapsulates packets to ETRs, can convey its own
   Map-Version number. This is known as the Source Map-Version Number.
   When an ETR decapsulates a packet and detects the Source
   Map-Version Number is greater than the last Map-Version Number sent
   in a Map-Reply from the ITR's site, the ETR will send a Map-Request
   to one of the ETRs for the source site.

   A Map-Version Number is used as a sequence number per EID-prefix.
   So values that are greater are considered to be more recent. A
   value of 0 for the Source Map-Version Number or the Destination
   Map-Version Number conveys no versioning information and an xTR
   does no comparison with previously received Map-Version Numbers.

   A Map-Version Number can be included in Map-Register messages as
   well. This is a good way for the Map-Server to assure that all ETRs
   for a site registering to it will be Map-Version number
   synchronized.

   See [VERSIONING] for a more detailed analysis and description of
   Database Map Versioning.

7. Router Performance Considerations

   LISP is designed to be very hardware-based forwarding friendly. By
   doing tunnel header prepending [RFC1955] and stripping instead of
   re-writing addresses, existing hardware can support the forwarding
   model with little or no modification. Where modifications are
   required, they should be limited to re-programming existing
   hardware rather than requiring expensive design changes to
   hard-coded algorithms in silicon.

   A few implementation techniques can be used to incrementally
   implement LISP:

   o  When a tunnel encapsulated packet is received by an ETR, the
      outer destination address may not be the address of the router.
      This makes it challenging for the control plane to get packets
      from the hardware. This may be mitigated by creating special FIB
      entries for the EID-prefixes of EIDs served by the ETR (those
      for which the router provides an RLOC translation). These FIB
      entries are marked with a flag indicating that control plane
      processing should be performed. The forwarding logic of testing
      for a particular IP protocol number value is not necessary. No
      changes to existing, deployed hardware should be needed to
      support this.

   o  On an ITR, prepending a new IP header is as simple as adding
      more bytes to a MAC rewrite string and prepending the string as
      part of the outgoing encapsulation procedure. Many routers that
      support GRE tunneling [RFC2784] or 6to4 tunneling [RFC3056] can
      already support this action.

   o  When a received packet's outer destination address contains an
      EID which is not intended to be forwarded on the routable
      topology (i.e. LISP 1.5), the source address of a data packet or
      the router interface with which the source is associated (the
      interface from which it was received) can be associated with a
      VRF (Virtual Routing/Forwarding), in which a different (i.e.
      non-congruent) topology can be used to find EID-to-RLOC
      mappings.

8. Deployment Scenarios

   This section will explore how and where ITRs and ETRs can be
   deployed and will discuss the pros and cons of each deployment
   scenario. There are two basic deployment trade-offs to consider:
   centralized versus distributed caches and flat, recursive, or
   re-encapsulating tunneling.

   When deciding on centralized versus distributed caching, the
   following issues should be considered:

   o  Are the tunnel routers spread out so that the caches are spread
      across all the memories of each router?

   o  Should management "touch points" be minimized by choosing few
      tunnel routers, just enough for redundancy?

   o  In general, using more ITRs doesn't increase management load,
      since caches are built and stored dynamically. On the other
      hand, more ETRs do require more management since
      EID-prefix-to-RLOC mappings need to be explicitly configured.

   When deciding on flat, recursive, or re-encapsulation tunneling,
   the following issues should be considered:

   o  Flat tunneling implements a single tunnel between source site
      and destination site. This generally offers better paths between
      sources and destinations with a single tunnel path.

   o  Recursive tunneling is when tunneled traffic is again further
      encapsulated in another tunnel, either to implement VPNs or to
      perform Traffic Engineering. When doing VPN-based tunneling, the
      site has some control since the site is prepending a new tunnel
      header. In the case of TE-based tunneling, the site may have
      control if it is prepending a new tunnel header, but if the
      site's ISP is doing the TE, then the site has no control.
      Recursive tunneling generally will result in suboptimal paths
      but with the benefit of steering traffic to parts of the network
      that have resources available.

   o  The technique of re-encapsulation ensures that packets only
      require one tunnel header. So if a packet needs to be rerouted,
      it is first decapsulated by the ETR and then re-encapsulated
      with a new tunnel header using a new RLOC.

   The next sub-sections will describe where tunnel routers can reside
   in the network.

8.1. First-hop/Last-hop Tunnel Routers

   By locating tunnel routers close to hosts, the EID-prefix set is at
   the granularity of an IP subnet. So at the expense of more
   EID-prefix-to-RLOC sets for the site, the caches in each tunnel
   router can remain relatively small. But caches always depend on the
   number of non-aggregated EID destination flows active through these
   tunnel routers.

   With more tunnel routers doing encapsulation, control traffic grows
   as well: since the EID-granularity is greater, more Map-Requests
   and Map-Replies are traveling between more routers.

   The advantage of placing the caches and databases at these stub
   routers is that the products deployed in this part of the network
   have better price-memory ratios than their core router
   counterparts. Memory is typically less expensive in these devices
   and fewer routes are stored (only IGP routes). These devices tend
   to have excess capacity, both for forwarding and routing state.

   LISP functionality can also be deployed in edge switches. These
   devices generally have layer-2 ports facing hosts and layer-3 ports
   facing the Internet. Spare capacity is often available in these
   devices as well.

8.2. Border/Edge Tunnel Routers

   Using customer-edge (CE) routers for tunnel endpoints allows the
   EID space associated with a site to be reachable via a small set of
   RLOCs assigned to the CE routers for that site.

   This offers the opposite benefit of the first-hop/last-hop tunnel
   router scenario: the number of mapping entries and network
   management touch points are reduced, allowing better scaling.

   One disadvantage is that less of the network's resources are used
   to reach host endpoints, thereby centralizing the point-of-failure
   domain and creating network choke points at the CE router.

   Note that more than one CE router at a site can be configured with
   the same IP address. In this case an RLOC is an anycast address.
   This allows resilience between the CE routers. That is, if a CE
   router fails, traffic is automatically routed to the other routers
   using the same anycast address. However, this comes with the
   disadvantage that the site cannot control the entrance point when
   the anycast route is advertised out from all border routers.

8.3. ISP Provider-Edge (PE) Tunnel Routers

   Use of ISP PE routers as tunnel endpoint routers gives an ISP
   control over the location of the egress tunnel endpoints. That is,
   the ISP can decide if the tunnel endpoints are in the destination
   site (in either CE routers or last-hop routers within a site) or at
   other PE edges. The advantage of this case is that two or more
   tunnel headers can be avoided. By having the PE be the first router
   on the path to encapsulate, it can choose a TE path first, and the
   ETR can decapsulate and re-encapsulate for a tunnel to the
   destination end site.

   An obvious disadvantage is that the end site has no control over
   where its packets flow or the RLOCs used.

   As mentioned in earlier sections, a combination of these scenarios
   is possible at the expense of extra packet header overhead; if both
   site and provider want control, then recursive or re-encapsulating
   tunnels are used.

8.4. LISP Functionality with Conventional NATs

   LISP routers can be deployed behind Network Address Translator
   (NAT) devices to provide the same set of packet services hosts have
   today when they are addressed out of private address space.

   It is important to note that a locator address in any LISP control
   message MUST be a globally routable address and therefore SHOULD
   NOT contain [RFC1918] addresses. If a LISP router is configured
   with private addresses, they MUST be used only in the outer IP
   header so the NAT device can translate properly. Otherwise, EID
   addresses MUST be translated before encapsulation is performed.
   Both NAT translation and LISP encapsulation functions could be
   co-located in the same device.

   More details on LISP address translation can be found in
   [INTERWORK].

9. Traceroute Considerations

   When a source host in a LISP site initiates a traceroute to a
   destination host in another LISP site, it is highly desirable for
   it to see the entire path. Since packets are encapsulated from ITR
   to ETR, the hop across the tunnel could be viewed as a single hop.
   However, LISP traceroute will provide the entire path so the user
   can see 3 distinct segments of the path from a source LISP host to
   a destination LISP host:

   Segment 1 (in source LISP site based on EIDs):

       source-host ---> first-hop ... next-hop ---> ITR

   Segment 2 (in the core network based on RLOCs):

       ITR ---> next-hop ... next-hop ---> ETR

   Segment 3 (in the destination LISP site based on EIDs):

       ETR ---> next-hop ... last-hop ---> destination-host

   For segment 1 of the path, ICMP Time Exceeded messages are returned
   in the normal manner as they are today. The ITR performs a TTL
   decrement and test for 0 before encapsulating. So the ITR hop is
   seen by the traceroute source as an EID address (the address of the
   site-facing interface).

   For segment 2 of the path, ICMP Time Exceeded messages are returned
   to the ITR because the TTL decrement to 0 is done on the outer
   header, so the ICMP messages are destined to the ITR RLOC address,
   the source RLOC address of the encapsulated traceroute packet. The
   ITR looks inside of the ICMP payload to inspect the traceroute
   source so it can return the ICMP message to the address of the
   traceroute client as well as retaining the core router IP address
   in the ICMP message. This is so the traceroute client can display
   the core router address (the RLOC address) in the traceroute
   output. The ETR returns its RLOC address and responds to the TTL
   decrement to 0 like the previous core routers did.

   For segment 3, the next-hop router downstream from the ETR will be
   decrementing the TTL for the packet that was encapsulated, sent
   into the core, decapsulated by the ETR, and forwarded because it
   isn't the final destination. If the TTL is decremented to 0, any
   router on the path to the destination of the traceroute, including
   the next-hop router or destination, will send an ICMP Time Exceeded
   message to the source EID of the traceroute client. The ICMP
   message will be encapsulated by the local ITR and sent back to the
   ETR in the originated traceroute source site, where the packet will
   be delivered to the host.

9.1. IPv6 Traceroute

   IPv6 traceroute follows the procedure described above since the
   entire traceroute data packet is included in the ICMP Time Exceeded
   message payload. Therefore, only the ITR needs to pay special
   attention to forwarding ICMP messages back to the traceroute
   source.

9.2. IPv4 Traceroute

   For IPv4 traceroute, we cannot follow the above procedure since
   IPv4 ICMP Time Exceeded messages only include the invoking IP
   header and 8 bytes that follow the IP header.
   Therefore, when a core router sends an IPv4 Time Exceeded message
   to an ITR, all the ITR has in the ICMP payload is the encapsulated
   header it prepended followed by a UDP header. The original invoking
   IP header, and therefore the identity of the traceroute source, is
   lost.

   The solution we propose to solve this problem is to cache
   traceroute IPv4 headers in the ITR and to match them up with
   corresponding IPv4 Time Exceeded messages received from core
   routers and the ETR. The ITR will use a circular buffer for caching
   the IPv4 and UDP headers of traceroute packets. It will select a
   16-bit number as a key to find them later when the IPv4 Time
   Exceeded messages are received. When an ITR encapsulates an IPv4
   traceroute packet, it will use the 16-bit number as the UDP source
   port in the encapsulating header. When the ICMP Time Exceeded
   message is returned to the ITR, the UDP header of the encapsulating
   header is present in the ICMP payload, thereby allowing the ITR to
   find the cached headers for the traceroute source. The ITR puts the
   cached headers in the payload and sends the ICMP Time Exceeded
   message to the traceroute source, retaining the source address of
   the original ICMP Time Exceeded message (a core router or the ETR
   of the site of the traceroute destination).

   The signature of a traceroute packet comes in two forms. The first
   form is encoded as a UDP message where the destination port is
   inspected for a range of values. The second form is encoded as an
   ICMP message where the IP identification field is inspected for a
   well-known value.

9.3. Traceroute using Mixed Locators

   When either an IPv4 traceroute or IPv6 traceroute is originated and
   the ITR encapsulates it in the other address family header, you
   cannot get all 3 segments of the traceroute.
   Segment 2 of the traceroute cannot be conveyed to the traceroute
   source since it is expecting addresses from intermediate hops in
   the same address format for the type of traceroute it originated.
   Therefore, in this case, segment 2 will make the tunnel look like
   one hop. All the ITR has to do to make this work is to not copy the
   inner TTL to the outer, encapsulating header's TTL when a
   traceroute packet is encapsulated using an RLOC from a different
   address family. This will cause no TTL decrement to 0 to occur in
   core routers between the ITR and ETR.

10. Mobility Considerations

   There are several kinds of mobility of which only some might be of
   concern to LISP. Essentially they are as follows.

10.1. Site Mobility

   A site wishes to change its attachment points to the Internet, and
   its LISP Tunnel Routers will have new RLOCs when it changes
   upstream providers. Changes in EID-RLOC mappings for sites are
   expected to be handled by configuration, outside of the LISP
   protocol.

10.2. Slow Endpoint Mobility

   An individual endpoint wishes to move, but is not concerned about
   maintaining session continuity. Renumbering is involved. LISP can
   help with the issues surrounding renumbering [RFC4192] [LISA96] by
   decoupling the address space used by a site from the address spaces
   used by its ISPs [RFC4984].

10.3. Fast Endpoint Mobility

   Fast endpoint mobility occurs when an endpoint moves relatively
   rapidly, changing its IP layer network attachment point.
   Maintenance of session continuity is a goal. This is where the
   Mobile IPv4 [RFC3344bis] and Mobile IPv6 [RFC3775] [RFC4866]
   mechanisms are used, and primarily where interactions with LISP
   need to be explored.

   The problem is that as an endpoint moves, it may require changes to
   the mapping between its EID and a set of RLOCs for its new network
   location.
   When this is added to the overhead of mobile IP binding updates, some packets might be delayed or dropped.

   In IPv4 mobility, when an endpoint is away from home, packets to it are encapsulated and forwarded via a home agent which resides in the home area the endpoint's address belongs to. The home agent will encapsulate and forward packets either directly to the endpoint or to a foreign agent which resides where the endpoint has moved to. Packets from the endpoint may be sent directly to the correspondent node, may be sent via the foreign agent, or may be reverse-tunneled back to the home agent for delivery to the mobile node. As the mobile node's EID or available RLOC changes, LISP EID-to-RLOC mappings are required for communication between the mobile node and the home agent, whether via foreign agent or not. As a mobile endpoint changes networks, up to three LISP mapping changes may be required:

   o  The mobile node moves from an old location to a new visited network location and notifies its home agent that it has done so. The Mobile IPv4 control packets the mobile node sends pass through one of the new visited network's ITRs, which needs an EID-RLOC mapping for the home agent.

   o  The home agent might not have the EID-RLOC mappings for the mobile node's "care-of" address or its foreign agent in the new visited network, in which case it will need to acquire them.

   o  When packets are sent directly to the correspondent node, it may be that no traffic has been sent from the new visited network to the correspondent node's network, and the new visited network's ITR will need to obtain an EID-RLOC mapping for the correspondent node's site.

   In addition, if the IPv4 endpoint is sending packets from the new visited network using its original EID, then LISP will need to perform a route-returnability check on the new EID-RLOC mapping for that EID.

   In IPv6 mobility, packets can flow directly between the mobile node and the correspondent node in either direction. The mobile node uses its "care-of" address (EID). In this case, the route-returnability check would not be needed but one more LISP mapping lookup may be required instead:

   o  As above, three mapping changes may be needed for the mobile node to communicate with its home agent and to send packets to the correspondent node.

   o  In addition, another mapping will be needed in the correspondent node's ITR, in order for the correspondent node to send packets to the mobile node's "care-of" address (EID) at the new network location.

   When both endpoints are mobile, the number of potential mapping lookups increases accordingly.

   As a mobile node moves, there are not only mobility state changes in the mobile node, correspondent node, and home agent, but also state changes in the ITRs and ETRs for at least some EID-prefixes.

   The goal is to support rapid adaptation, with little delay or packet loss for the entire system. Also, IP mobility can be modified to require fewer mapping changes. In order to increase overall system performance, there may be a need to reduce the optimization of one area in order to place fewer demands on another.

   In LISP, one possibility is to "glean" information. When a packet arrives, the ETR could examine the EID-RLOC mapping and use that mapping for all outgoing traffic to that EID. It can do this after performing a route-returnability check, to ensure that the new network location does have an internal route to that endpoint.
   However, this does not cover the case where an ITR (the node assigned the RLOC) at the mobile-node location has been compromised.

   Mobile IP packet exchange is designed for an environment in which all routing information is disseminated before packets can be forwarded. In order to allow the Internet to grow to support expected future use, we are moving to an environment where some information may have to be obtained after packets are in flight. Modifications to IP mobility should be considered in order to optimize the behavior of the overall system. Anything which decreases the number of new EID-RLOC mappings needed when a node moves, or maintains the validity of an EID-RLOC mapping for a longer time, is useful.

10.4.  Fast Network Mobility

   In addition to endpoints, a network can be mobile, possibly changing xTRs. A "network" can be as small as a single router and as large as a whole site. This is different from site mobility in that it is fast and possibly short-lived, but different from endpoint mobility in that a whole prefix is changing RLOCs. However, the mechanisms are the same and there is no new overhead in LISP. A Map-Request for any endpoint will return a binding for the entire mobile prefix.

   If mobile networks become a more common occurrence, it may be useful to revisit the design of the mapping service and allow for dynamic updates of the database.

   The issue of interactions between mobility and LISP needs to be explored further. Specific improvements to the entire system will depend on the details of mapping mechanisms. Mapping mechanisms should be evaluated on how well they support session continuity for mobile nodes.

10.5.  LISP Mobile Node Mobility

   A mobile device can use the LISP infrastructure to achieve mobility by implementing the LISP encapsulation and decapsulation functions and acting as a simple ITR/ETR. By doing this, such a "LISP mobile node" can use topologically-independent EID IP addresses that are not advertised into and do not impose a cost on the global routing system. These EIDs are maintained at the edges of the mapping system (in LISP Map-Servers and Map-Resolvers) and are provided on demand to only the correspondents of the LISP mobile node.

   Refer to the LISP Mobility Architecture specification [LISP-MN] for more details.

11.  Multicast Considerations

   A multicast group address, as defined in the original Internet architecture, is an identifier of a grouping of topologically independent receiver host locations. The address encoding itself does not determine the location of the receiver(s). The multicast routing protocol, and the network-based state the protocol creates, determines where the receivers are located.

   In the context of LISP, a multicast group address is both an EID and a Routing Locator. Therefore, no specific semantic or action needs to be taken for a destination address, as it would appear in an IP header. Therefore, a group address that appears in an inner IP header built by a source host will be used as the destination EID. The outer IP header (the destination Routing Locator address), prepended by a LISP router, will use the same group address as the destination Routing Locator.

   Having said that, only the source EID and source Routing Locator need to be dealt with. Therefore, an ITR merely needs to put its own IP address in the source Routing Locator field when prepending the outer IP header. This source Routing Locator address, like any other Routing Locator address, MUST be globally routable.
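
   As an illustration only, the address selection described above for the outer header of a multicast packet can be sketched in Python. The function and class names are hypothetical and not part of this specification. The sketch assumes addresses are given as text strings, and it cannot verify global routability of the source Routing Locator, so it only rejects addresses that are obviously unusable.

```python
import ipaddress
from dataclasses import dataclass
from typing import Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


@dataclass(frozen=True)
class OuterAddresses:
    """Addresses an ITR would place in the outer (encapsulating) IP header."""
    source_rloc: IPAddress
    destination_rloc: IPAddress


def multicast_outer_addresses(inner_destination: str, itr_rloc: str) -> OuterAddresses:
    """Hypothetical helper: choose outer header addresses for a multicast packet.

    The group address from the inner header is reused as the destination
    Routing Locator, so no EID-to-RLOC lookup is made for the destination.
    The ITR's own address becomes the source Routing Locator.
    """
    group = ipaddress.ip_address(inner_destination)
    source = ipaddress.ip_address(itr_rloc)
    if not group.is_multicast:
        raise ValueError("inner destination is not a multicast group address")
    if source.version != group.version:
        raise ValueError("source RLOC and group address families differ")
    # Global routability cannot be checked locally; reject only clearly
    # unusable source addresses (an assumption of this sketch).
    if source.is_multicast or source.is_unspecified or source.is_loopback:
        raise ValueError("source RLOC is not a usable unicast address")
    return OuterAddresses(source_rloc=source, destination_rloc=group)
```

   For example, calling multicast_outer_addresses("233.252.0.1", "192.0.2.1") yields an outer destination equal to the inner group address and an outer source equal to the ITR's address.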

   Therefore, an EID-to-RLOC mapping does not need to be performed by an ITR when a received data packet is a multicast data packet or when processing a source-specific Join (either by IGMPv3 or PIM). But the source Routing Locator is decided by the multicast routing protocol in a receiver site. That is, an EID to Routing Locator translation is done at control-time.

   Another approach is to have the ITR not encapsulate a multicast packet and allow the host-built packet to flow into the core even if the source address is allocated out of the EID namespace. If the RPF-Vector TLV [RFC5496] is used by PIM in the core, then core routers can RPF to the ITR (the Locator address which is injected into core routing) rather than the host source address (the EID address which is not injected into core routing).

   To avoid any EID-based multicast state in the network core, the first approach is chosen for LISP-Multicast. Details for LISP-Multicast and Interworking with non-LISP sites are described in [MLISP].

12.  Security Considerations

   It is believed that most of the security mechanisms will be part of the mapping database service when using control plane procedures for obtaining EID-to-RLOC mappings. For data plane triggered mappings, as described in this specification, protection is provided against ETR spoofing by using Return-Routability mechanisms evidenced by the use of a 24-bit Nonce field in the LISP encapsulation header and a 64-bit Nonce field in the LISP control message. The nonce, coupled with the ITR accepting only solicited Map-Replies, goes a long way toward providing decent authentication.

   LISP does not rely on a PKI or a more heavyweight authentication system. These systems challenge the scalability of LISP, which was a primary design goal.
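
   The combination of a 64-bit nonce and acceptance of only solicited Map-Replies, as described in this section, can be pictured with the following Python sketch. It is not a definitive implementation. The class and method names are hypothetical, the EID is carried as an opaque string, and timeouts and retransmission are left out.

```python
import secrets
from typing import Dict, Optional


class SolicitedReplyTracker:
    """Hypothetical ITR-side table of outstanding Map-Requests keyed by nonce."""

    def __init__(self) -> None:
        self._pending: Dict[int, str] = {}  # nonce -> EID that was requested

    def new_request(self, eid: str) -> int:
        """Record a Map-Request for `eid` and return its fresh 64-bit nonce."""
        nonce = secrets.randbits(64)
        while nonce in self._pending:  # avoid reusing a nonce still in flight
            nonce = secrets.randbits(64)
        self._pending[nonce] = eid
        return nonce

    def accept_reply(self, nonce: int) -> Optional[str]:
        """Return the requested EID if the reply was solicited, else None.

        A nonce is consumed on first use, so a replayed Map-Reply carrying
        the same nonce is treated as unsolicited.
        """
        return self._pending.pop(nonce, None)
```

   In this sketch a Map-Reply whose nonce does not match an outstanding Map-Request is simply ignored, which is the behavior the text above relies on.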

   DoS attack prevention will depend on implementations rate-limiting Map-Requests and Map-Replies to the control plane as well as rate-limiting the number of data-triggered Map-Replies.

   To deal with map-cache exhaustion attempts in an ITR/PTR, the implementation should consider putting a maximum cap on the number of entries stored with a reserve list for special or frequently accessed sites. This should be a configuration policy control set by the network administrator who manages ITRs and PTRs.

13.  IANA Considerations

   UDP port numbers 4341 and 4342 have already been assigned from the IANA registry for this specification.

14.  Prototype Plans and Status

   The operator community has requested that the IETF take a practical approach to solving the scaling problems associated with global routing state growth. This document offers a simple solution which is intended for use in a pilot program to gain experience in working on this problem.

   The authors hope that publishing this specification will allow the rapid implementation of multiple vendor prototypes and deployment on a small scale. Doing this will help the community:

   o  Decide whether a new EID-to-RLOC mapping database infrastructure is needed or if a simple, UDP-based, data-triggered approach is flexible and robust enough.

   o  Experiment with provider-independent assignment of EIDs while at the same time decreasing the size of DFZ routing tables through the use of topologically-aligned, provider-based RLOCs.

   o  Determine whether multiple levels of tunneling can be used by ISPs to achieve their Traffic Engineering goals while simultaneously removing the more specific routes currently injected into the global routing system for this purpose.

   o  Experiment with mobility to determine if both acceptable convergence and session continuity properties can be scalably implemented to support both individual device roaming and site service provider changes.

   Here is a rough set of milestones:

   1.  Interoperable implementations have been available since the beginning of 2009.

   2.  Continue pilot deployment using LISP-ALT as the database mapping mechanism.

   3.  Continue prototyping and studying other database lookup schemes, be it DNS, DHTs, CONS, ALT, NERD, or other mechanisms.

   4.  Implement the LISP Multicast draft [MLISP].

   5.  Implement the LISP Mobile Node draft [LISP-MN].

   6.  Research more on how policy affects what gets returned in a Map-Reply from an ETR.

   7.  Continue to experiment with mixed locator-sets to understand how LISP can help the IPv4 to IPv6 transition.

   8.  Add more robustness to locator reachability between LISP sites.

   9.  Continue the deployment of Proxy-ETRs (PETRs) for uses like uRPF avoidance, IPv6 connectivity, and LISP-MN.

   As of this writing the following accomplishments have been achieved:

   1.   A unit- and system-tested software switching implementation has been completed on cisco NX-OS for this draft for both IPv4 and IPv6 EIDs using a mixed locator-set of IPv4 and IPv6 locators.

   2.   A unit- and system-tested software switching implementation on cisco NX-OS has been completed for draft [ALT].

   3.   A unit- and system-tested software switching implementation on cisco NX-OS has been completed for draft [INTERWORK]. Support for IPv4 translation is provided and PTR support for IPv4 and IPv6 is provided.

   4.   The cisco NX-OS implementation supports an experimental mechanism for slow mobility.

   5.   There are 5 LISP implementations that exist and the first 4 below have gone through interoperability testing at IETF Hiroshima, based on the draft-ietf-lisp-05.txt spec:

        1.  cisco NX-OS

        2.  OpenLISP

        3.  LISP-Click

        4.  ZLisp

        5.  cisco IOS

   6.   Dave Meyer, Vince Fuller, Darrel Lewis, Gregg Schudel, Andrew Partan and the rest of the lisp-beta team continue to test all the features described above on a dual-stack infrastructure.

   7.   Darrel Lewis and Dave Meyer have deployed both LISP translation and LISP PTR support in the pilot network. Point your browser to http://www.lisp4.net to see translation happening in action so your non-LISP site can access a web server in a LISP site.

   8.   Soon http://www.lisp6.net will work where your IPv6 LISP site can talk to an IPv6 web server in a LISP site by using mixed address-family based locators.

   9.   A public domain implementation of LISP is available. See [OPENLISP] for details.

   10.  We have deployed Map-Resolvers and Map-Servers on the LISP pilot network to gather experience with [LISP-MS]. The first layer of the architecture consists of the xTRs, which use Map-Servers for EID-prefix registration and Map-Resolvers for EID-to-RLOC mapping resolution. The second layer consists of the Map-Resolvers and Map-Servers, which connect to the ALT BGP peering infrastructure. The third layer consists of ALT-routers, which aggregate EID-prefixes and forward Map-Requests.

   11.  A cisco IOS implementation is available which currently supports IPv4 and IPv6 encapsulation and decapsulation features.

   12.  A LISP router based LIG implementation is supported, deployed, and used daily to debug and test the LISP pilot network. See [LIG] for details.

   13.  A Linux implementation of LIG has been made available and supported by Dave Meyer.
        It can be run on any Linux system which resides in either a LISP site or non-LISP site. See [LIG] for details. Public domain code can be downloaded from http://github.com/davidmeyer/lig/tree/master.

   14.  An experimental implementation has been written for three locator reachability algorithms. Two are the Echo-Noncing and RLOC-Probing algorithms which are documented in this specification. The third is called TCP-counts which will be documented in future drafts.

   15.  The LISP pilot network has been converted from using MD5 HMAC authentication for Map-Register messages to SHA-1 HMAC authentication. ETRs send with SHA-1 but Map-Servers can receive either for compatibility purposes.

   16.  The LISP pilot network is in its 3rd generation. Current experiments are being performed to test EID-prefix aggregation at multiple service boundaries as well as deploying models for the existence of multiple Mapping Service Providers (MSPs).

   If you are interested in writing a LISP implementation, testing any of the LISP implementations, or being part of the LISP pilot program, please contact lisp@ietf.org.

15.  References

15.1.  Normative References

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768, August 1980.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, November 1990.

   [RFC1498]  Saltzer, J., "On the Naming and Binding of Network Destinations", RFC 1498, August 1993.

   [RFC1700]  Reynolds, J. and J. Postel, "Assigned Numbers", RFC 1700, October 1994.

   [RFC1918]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and E. Lear, "Address Allocation for Private Internets", BCP 5, RFC 1918, February 1996.

   [RFC1955]  Hinden, R., "New Scheme for Internet Routing and Addressing (ENCAPS) for IPNG", RFC 1955, June 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2404]  Madson, C. and R. Glenn, "The Use of HMAC-SHA-1-96 within ESP and AH", RFC 2404, November 1998.

   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, March 2000.

   [RFC3056]  Carpenter, B. and K. Moore, "Connection of IPv6 Domains via IPv4 Clouds", RFC 3056, February 2001.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001.

   [RFC3775]  Johnson, D., Perkins, C., and J. Arkko, "Mobility Support in IPv6", RFC 3775, June 2004.

   [RFC4086]  Eastlake, D., Schiller, J., and S. Crocker, "Randomness Requirements for Security", BCP 106, RFC 4086, June 2005.

   [RFC4423]  Moskowitz, R. and P. Nikander, "Host Identity Protocol (HIP) Architecture", RFC 4423, May 2006.

   [RFC4634]  Eastlake, D. and T. Hansen, "US Secure Hash Algorithms (SHA and HMAC-SHA)", RFC 4634, July 2006.

   [RFC4866]  Arkko, J., Vogt, C., and W. Haddad, "Enhanced Route Optimization for Mobile IPv6", RFC 4866, May 2007.

   [RFC4984]  Meyer, D., Zhang, L., and K. Fall, "Report from the IAB Workshop on Routing and Addressing", RFC 4984, September 2007.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.

   [RFC5496]  Wijnands, IJ., Boers, A., and E. Rosen, "The Reverse Path Forwarding (RPF) Vector TLV", RFC 5496, March 2009.

   [RFC5533]  Nordmark, E. and M. Bagnulo, "Shim6: Level 3 Multihoming Shim Protocol for IPv6", RFC 5533, June 2009.

   [UDP-TUNNELS]  Eubanks, M. and P. Chimento, "UDP Checksums for Tunneled Packets", draft-eubanks-chimento-6man-00.txt (work in progress), February 2009.

15.2.  Informative References

   [AFI]  IANA, "Address Family Indicators (AFIs)", ADDRESS FAMILY NUMBERS http://www.iana.org/numbers.html, February 2007.

   [ALT]  Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "LISP Alternative Topology (LISP-ALT)", draft-ietf-lisp-alt-04.txt (work in progress), April 2010.

   [APT]  Jen, D., Meisel, M., Massey, D., Wang, L., Zhang, B., and L. Zhang, "APT: A Practical Transit Mapping Service", draft-jen-apt-01.txt (work in progress), November 2007.

   [CHIAPPA]  Chiappa, J., "Endpoints and Endpoint names: A Proposed Enhancement to the Internet Architecture", Internet-Draft http://www.chiappa.net/~jnc/tech/endpoints.txt, 1999.

   [CONS]  Farinacci, D., Fuller, V., and D. Meyer, "LISP-CONS: A Content distribution Overlay Network Service for LISP", draft-meyer-lisp-cons-03.txt (work in progress), November 2007.

   [DHTs]  Ratnasamy, S., Shenker, S., and I. Stoica, "Routing Algorithms for DHTs: Some Open Questions", PDF file http://www.cs.rice.edu/Conferences/IPTPS02/174.pdf.

   [EMACS]  Brim, S., Farinacci, D., Meyer, D., and J. Curran, "EID Mappings Multicast Across Cooperating Systems for LISP", draft-curran-lisp-emacs-00.txt (work in progress), November 2007.

   [GSE]  "GSE - An Alternate Addressing Architecture for IPv6", draft-ietf-ipngwg-gseaddr-00.txt (work in progress), 1997.

   [INTERWORK]  Lewis, D., Meyer, D., Farinacci, D., and V. Fuller, "Interworking LISP with IPv4 and IPv6", draft-ietf-lisp-interworking-01.txt (work in progress), March 2010.

   [LCAF]  Farinacci, D., Meyer, D., and J. Snijders, "LISP Canonical Address Format", draft-farinacci-lisp-lcaf-01.txt (work in progress), April 2010.

   [LIG]  Farinacci, D. and D. Meyer, "LISP Internet Groper (LIG)", draft-ietf-lisp-lig-00.txt (work in progress), April 2010.

   [LISA96]  Lear, E., Katinsky, J., Coffin, J., and D. Tharp, "Renumbering: Threat or Menace?", Usenix, September 1996.

   [LISP-MAIN]  Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "Locator/ID Separation Protocol (LISP)", draft-farinacci-lisp-12.txt (work in progress), March 2009.

   [LISP-MN]  Farinacci, D., Fuller, V., Lewis, D., and D. Meyer, "LISP Mobility Architecture", draft-meyer-lisp-mn-00.txt (work in progress), July 2009.

   [LISP-MS]  Farinacci, D. and V. Fuller, "LISP Map Server", draft-ietf-lisp-ms-05.txt (work in progress), April 2010.

   [LISP1]  Farinacci, D., Oran, D., Fuller, V., and J. Schiller, "Locator/ID Separation Protocol (LISP1) [Routable ID Version]", Slide-set http://www.dinof.net/~dino/ietf/lisp1.ppt, October 2006.

   [LISP2]  Farinacci, D., Oran, D., Fuller, V., and J. Schiller, "Locator/ID Separation Protocol (LISP2) [DNS-based Version]", Slide-set http://www.dinof.net/~dino/ietf/lisp2.ppt, November 2006.

   [LISPDHT]  Mathy, L., Iannone, L., and O. Bonaventure, "LISP-DHT: Towards a DHT to map identifiers onto locators", draft-mathy-lisp-dht-00.txt (work in progress), February 2008.

   [LOC-ID-ARCH]  Meyer, D. and D. Lewis, "Architectural Implications of Locator/ID Separation", draft-meyer-loc-id-implications-01.txt (work in progress), January 2009.

   [MLISP]  Farinacci, D., Meyer, D., Zwiebel, J., and S. Venaas, "LISP for Multicast Environments", draft-ietf-lisp-multicast-03.txt (work in progress), April 2010.

   [NERD]  Lear, E., "NERD: A Not-so-novel EID to RLOC Database", draft-lear-lisp-nerd-08.txt (work in progress), March 2010.

   [OPENLISP]  Iannone, L. and O. Bonaventure, "OpenLISP Implementation Report", draft-iannone-openlisp-implementation-01.txt (work in progress), July 2008.

   [RADIR]  Narten, T., "Routing and Addressing Problem Statement", draft-narten-radir-problem-statement-00.txt (work in progress), July 2007.

   [RFC3344bis]  Perkins, C., "IP Mobility Support for IPv4, revised", draft-ietf-mip4-rfc3344bis-05 (work in progress), July 2007.

   [RFC4192]  Baker, F., Lear, E., and R. Droms, "Procedures for Renumbering an IPv6 Network without a Flag Day", RFC 4192, September 2005.

   [RPMD]  Handley, M., Huici, F., and A. Greenhalgh, "RPMD: Protocol for Routing Protocol Meta-data Dissemination", draft-handley-p2ppush-unpublished-2007726.txt (work in progress), July 2007.

   [VERSIONING]  Iannone, L., Saucez, D., and O. Bonaventure, "LISP Mapping Versioning", draft-iannone-lisp-mapping-versioning-01.txt (work in progress), March 2010.

Appendix A.  Acknowledgments

   An initial thank you goes to Dave Oran for planting the seeds for the initial ideas for LISP. His consultation continues to provide value to the LISP authors.

   A special and appreciative thank you goes to Noel Chiappa for providing architectural impetus over the past decades on separation of location and identity, as well as detailed review of the LISP architecture and documents, coupled with enthusiasm for making LISP a practical and incremental transition for the Internet.

   The authors would like to gratefully acknowledge many people who have contributed discussion and ideas to the making of this proposal.
   They include Scott Brim, Andrew Partan, John Zwiebel, Jason Schiller, Lixia Zhang, Dorian Kim, Peter Schoenmaker, Vijay Gill, Geoff Huston, David Conrad, Mark Handley, Ron Bonica, Ted Seely, Mark Townsley, Chris Morrow, Brian Weis, Dave McGrew, Peter Lothberg, Dave Thaler, Eliot Lear, Shane Amante, Ved Kafle, Olivier Bonaventure, Luigi Iannone, Robin Whittle, Brian Carpenter, Joel Halpern, Roger Jorgensen, Ran Atkinson, Stig Venaas, Iljitsch van Beijnum, Roland Bless, Dana Blair, Bill Lynch, Marc Woolward, Damien Saucez, Damian Lezama, Attilla De Groot, Parantap Lahiri, David Black, Roque Gagliano, Isidor Kouvelas, Jesper Skriver, Fred Templin, Margaret Wasserman, Sam Hartman, Michael Hofling, Pedro Marques, Jari Arkko, Gregg Schudel, Srinivas Subramanian, Amit Jain, Xu Xiaohu, Dhirendra Trivedi, Yakov Rekhter, John Scudder, John Drake, and Dimitri Papadimitriou.

   In particular, we would like to thank Dave Meyer for his clever suggestion for the name "LISP". ;-)

   This work originated in the Routing Research Group (RRG) of the IRTF. The individual submission [LISP-MAIN] was converted into this IETF LISP working group draft.

Appendix B.  Document Change Log

B.1.  Changes to draft-ietf-lisp-07.txt

   o  Posted April 2010.

   o  Added I-bit to data header so LSB field can also be used as an Instance ID field. When this occurs, the LSB field is reduced to 8-bits (from 32-bits).

   o  Added V-bit to the data header so the 24-bit nonce field can also be used for source and destination version numbers.

   o  Added Map-Version 12-bit value to the EID-record to be used in all of Map-Request, Map-Reply, and Map-Register messages.

   o  Added multiple ITR-RLOC fields to the Map-Request packet so an ETR can decide what address to select for the destination of a Map-Reply.

   o  Added L-bit (Local RLOC bit) and P-bit (Probe-Reply RLOC bit) to the Locator-Set record of an EID-record for a Map-Reply message. The L-bit indicates which RLOCs in the locator-set are local to the sender of the message. The P-bit indicates which RLOC is the source of a RLOC-probe Reply (Map-Reply) message.

   o  Add reference to the LISP Canonical Address Format [LCAF] draft.

   o  Made editorial and clarification changes based on comments from Dhirendra Trivedi.

   o  Added wordsmithing comments from Joel Halpern on DF=1 setting.

   o  Add John Zwiebel clarification to Echo Nonce Algorithm section 6.3.1.

   o  Add John Zwiebel comment about expanding on proxy-map-reply bit for Map-Register messages.

   o  Add NAT section per Ron Bonica comments.

   o  Fix IDnits issues per Ron Bonica.

   o  Added section on Virtualization and Segmentation to explain the use of the Instance ID field in the data header.

   o  There are too many P-bits, keep their scope to the packet format description and refer to them by name everywhere else in the spec.

   o  Scanned all occurrences of "should", "should not", "must" and "must not" and uppercased them.

   o  John Zwiebel offered text for section 4.1 to modernize the example. Thanks Z!

   o  Make it more clear in the definition of "EID-to-RLOC Database" that all ETRs need to have the same database mapping. This reflects a comment from John Scudder.

   o  Add a definition "Route-returnability" to the Definition of Terms section.

   o  In section 9.2, add text to describe what the signature of traceroute packets can look like.

   o  Removed references to Data Probe for introductory example. Data-probes are still part of the LISP design but not encouraged.

   o  Added the definition for "LISP site" to the "Definition of Terms" section.

B.2.  Changes to draft-ietf-lisp-06.txt

   Editorial based changes:

   o  Posted December 2009.

   o  Fix typo for flags in LISP data header. Changed from "4" to "5".

   o  Add text to indicate that Map-Register messages must contain a computed UDP checksum.

   o  Add definitions for PITR and PETR.

   o  Indicate an AFI value of 0 is an unspecified address.

   o  Indicate that the TTL field of a Map-Register is not used and set to 0 by the sender. This change makes this spec consistent with [LISP-MS].

   o  Change "... yield a packet size of L bytes" to "... yield a packet size greater than L bytes".

   o  Clarify section 6.1.5 on what addresses and ports are used in Map-Reply messages.

   o  Clarify that LSBs that go beyond the number of locators do not need to be SMRed when the locator addresses are greater lexicographically than the locator in the existing locator-set.

   o  Add Gregg, Srini, and Amit to acknowledgment section.

   o  Clarify in the definition of a LISP header what is following the UDP header.

   o  Clarify "verifying Map-Request" text in section 6.1.3.

   o  Add Xu Xiaohu to the acknowledgment section for introducing the problem of overlapping EID-prefixes among multiple sites in an RRG email message.

   Design based changes:

   o  Use stronger language to have the outer IPv4 header set DF=1 so we can avoid fragment reassembly in an ETR or PETR. This will also make IPv4 and IPv6 encapsulation have consistent behavior.

   o  Map-Requests should not be sent in ECM when the Probe bit is set. These types of Map-Requests are used as RLOC-probes and are sent directly to locator addresses in the underlying network.

   o  Add text in section 6.1.5 about returning all EID-prefixes in a Map-Reply sent by an ETR when there are overlapping EID-prefixes configured.

   o  Add text in a new subsection of section 6.1.5 about dealing with Map-Replies with coarse EID-prefixes.

B.3.  Changes to draft-ietf-lisp-05.txt

   o  Posted September 2009.

   o  Added this Document Change Log appendix.

   o  Added section indicating that encapsulated Map-Requests must use destination UDP port 4342.

   o  Don't use AH in Map-Registers. Put key-id, auth-length, and auth-data in Map-Register payload.

   o  Added Jari to acknowledgment section.

   o  State the source-EID is set to 0 when using Map-Requests to refresh or RLOC-probe.

   o  Make more clear what source-RLOC should be for a Map-Request.

   o  The LISP-CONS authors thought that the Type definitions for CONS should be removed from this specification.

   o  Removed nonce from Map-Register message, it wasn't used so no need for it.

   o  Clarify what to do for unspecified Action bits for negative Map-Replies. Since No Action is a drop, make value 0 Drop.

B.4.  Changes to draft-ietf-lisp-04.txt

   o  Posted September 2009.

   o  How to deal with record count greater than 1 for a Map-Request. Damien and Joel comment. Joel suggests: 1) Specify that senders compliant with the current document will always set the count to 1, and note that the count is included for future extensibility. 2) Specify what a receiver compliant with the draft should do if it receives a request with a count greater than 1. Presumably, it should send some error back?

   o  Add Fred Templin in acknowledgment section.

   o  Add Margaret and Sam to the acknowledgment section for their great comments.

   o  Say more about LAGs in the UDP section per Sam Hartman's comment.

   o  Sam wants to use MAY instead of SHOULD for ignoring checksums on ETR. From the mailing list: "You'd need to word it as an ITR MAY send a zero checksum, an ETR MUST accept a 0 checksum and MAY ignore the checksum completely.
      And of course we'd need to confirm that can actually be
      implemented.  In particular, hardware that verifies UDP checksums
      on receive needs to be checked to make sure it permits 0
      checksums."

   o  Margaret wants a reference to
      http://www.ietf.org/id/draft-eubanks-chimento-6man-00.txt.

   o  Fix description in Map-Request section.  Where we describe
      Map-Reply Record, change "R-bit" to "M-bit".

   o  Add the mobility bit to Map-Replies.  So PTRs don't probe so often
      for MNs but often enough to get mapping updates.

   o  Indicate SHA1 can be used as well for Map-Registers.

   o  More Fred comments on MTU handling.

   o  Isidor comment about spec'ing better periodic Map-Registers.  Will
      be fixed in draft-ietf-lisp-ms-02.txt.

   o  Margaret's comment on gleaning: "The current specification does
      not make it clear how long gleaned map entries should be retained
      in the cache, nor does it make it clear how/when they will be
      validated.  The LISP spec should, at the very least, include a
      (short) default lifetime for gleaned entries, require that they be
      validated within a short period of time, and state that a new
      gleaned entry should never overwrite an entry that was obtained
      from the mapping system.  The security implications of storing
      "gleaned" entries should also be explored in detail."

   o  Add section on RLOC-probing per working group feedback.

   o  Change "loc-reach-bits" to "loc-status-bits" per comment from
      Noel.

   o  Remove SMR-bit from data-plane.  Dino prefers to have it in the
      control plane only.

   o  Change LISP header to allow a "Research Bit" so the Nonce and LSB
      fields can be turned off and used for another future purpose.  For
      Luigi et al versioning convergence.

   o  Add an N-bit to the data header suggested by Noel.  Then the nonce
      field could be used when N is not 1.

   o  Clarify that when E-bit is 0, the nonce field can be an echoed
      nonce or a random nonce.  Comment from Jesper.

   o  Indicate when doing data-gleaning that a verifying Map-Request is
      sent to the source-EID of the gleaned data packet so we can avoid
      map-cache corruption by a 3rd party.  Comment from Pedro.

   o  Indicate that a verifying Map-Request, for accepting mapping data,
      should be sent over the ALT (or to the EID).

   o  Reference IPsec RFC 4302.  Comment from Sam and Brian Weis.

   o  Put E-bit in Map-Reply to tell ITRs that the ETR supports
      echo-noncing.  Comment by Pedro and Dino.

   o  Jesper made a comment to loosen the language about requiring the
      copy of inner TTL to outer TTL since the text to get mixed-AF
      traceroute to work would violate the "MUST" clause.  Changed from
      MUST to SHOULD in section 5.3.

B.5.  Changes to draft-ietf-lisp-03.txt

   o  Posted July 2009.

   o  Removed loc-reach-bits longword from control packets per Damien
      comment.

   o  Clarifications in MTU text from Roque.

   o  Added text to indicate that the locator-set be sorted by locator
      address from Isidor.

   o  Clarification text from John Zwiebel in Echo-Nonce section.

B.6.  Changes to draft-ietf-lisp-02.txt

   o  Posted July 2009.

   o  Encapsulation packet format change to add E-bit and make
      loc-reach-bits 32-bits in length.

   o  Added Echo-Nonce Algorithm section.

   o  Clarification of how ECN bits are copied.

   o  Moved S-bit in Map-Request.

   o  Added P-bit in Map-Request and Map-Reply messages to anticipate
      RLOC-Probe Algorithm.

   o  Added to Mobility section to reference draft-meyer-lisp-mn-00.txt.

B.7.  Changes to draft-ietf-lisp-01.txt

   o  Posted 2 days after draft-ietf-lisp-00.txt in May 2009.

   o  Defined LEID to be a "LISP EID".

   o  Indicate encapsulation uses IPv4 DF=0.

   o  Added negative Map-Reply messages with drop, native-forward, and
      send-map-request actions.

   o  Added Proxy-Map-Reply bit to Map-Register.

B.8.  Changes to draft-ietf-lisp-00.txt

   o  Posted May 2009.

   o  Rename of draft-farinacci-lisp-12.txt.

   o  Acknowledgment to RRG.

Authors' Addresses

   Dino Farinacci
   cisco Systems
   Tasman Drive
   San Jose, CA  95134
   USA

   Email: dino@cisco.com

   Vince Fuller
   cisco Systems
   Tasman Drive
   San Jose, CA  95134
   USA

   Email: vaf@cisco.com

   Dave Meyer
   cisco Systems
   170 Tasman Drive
   San Jose, CA
   USA

   Email: dmm@cisco.com

   Darrel Lewis
   cisco Systems
   170 Tasman Drive
   San Jose, CA
   USA

   Email: darlewis@cisco.com