Network Working Group                                       D. Farinacci
Internet-Draft                                                 V. Fuller
Intended status: Experimental                                   D. Meyer
Expires: April 2, 2010                                          D. Lewis
                                                           cisco Systems
                                                      September 29, 2009


                 Locator/ID Separation Protocol (LISP)
                         draft-ietf-lisp-05.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 2, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This draft describes a simple, incremental, network-based protocol to
   implement separation of Internet addresses into Endpoint Identifiers
   (EIDs) and Routing Locators (RLOCs). This mechanism requires no
   changes to host stacks and no major changes to existing database
   infrastructures. The proposed protocol can be implemented in a
   relatively small number of routers.

   This proposal was stimulated by the problem statement effort at the
   Amsterdam IAB Routing and Addressing Workshop (RAWS), which took
   place in October 2006.

Table of Contents

   1.  Requirements Notation
   2.  Introduction
   3.  Definition of Terms
   4.  Basic Overview
     4.1.  Packet Flow Sequence
   5.  Tunneling Details
     5.1.  LISP IPv4-in-IPv4 Header Format
     5.2.  LISP IPv6-in-IPv6 Header Format
     5.3.  Tunnel Header Field Descriptions
     5.4.  Dealing with Large Encapsulated Packets
       5.4.1.  A Stateless Solution to MTU Handling
       5.4.2.  A Stateful Solution to MTU Handling
   6.  EID-to-RLOC Mapping
     6.1.  LISP IPv4 and IPv6 Control Plane Packet Formats
       6.1.1.  LISP Packet Type Allocations
       6.1.2.  Map-Request Message Format
       6.1.3.  EID-to-RLOC UDP Map-Request Message
       6.1.4.  Map-Reply Message Format
       6.1.5.  EID-to-RLOC UDP Map-Reply Message
       6.1.6.  Map-Register Message Format
       6.1.7.  Encapsulated Control Message Format
     6.2.  Routing Locator Selection
     6.3.  Routing Locator Reachability
       6.3.1.  Echo Nonce Algorithm
       6.3.2.  RLOC Probing Algorithm
     6.4.  Routing Locator Hashing
     6.5.  Changing the Contents of EID-to-RLOC Mappings
       6.5.1.  Clock Sweep
       6.5.2.  Solicit-Map-Request (SMR)
   7.  Router Performance Considerations
   8.  Deployment Scenarios
     8.1.  First-hop/Last-hop Tunnel Routers
     8.2.  Border/Edge Tunnel Routers
     8.3.  ISP Provider-Edge (PE) Tunnel Routers
   9.  Traceroute Considerations
     9.1.  IPv6 Traceroute
     9.2.  IPv4 Traceroute
     9.3.  Traceroute using Mixed Locators
   10. Mobility Considerations
     10.1. Site Mobility
     10.2. Slow Endpoint Mobility
     10.3. Fast Endpoint Mobility
     10.4. Fast Network Mobility
     10.5. LISP Mobile Node Mobility
   11. Multicast Considerations
   12. Security Considerations
   13. Prototype Plans and Status
   14. References
     14.1. Normative References
     14.2. Informative References
   Appendix A.  Acknowledgments
   Appendix B.  Document Change Log
     B.1.  Changes to draft-ietf-lisp-05.txt
     B.2.  Changes to draft-ietf-lisp-04.txt
     B.3.  Changes to draft-ietf-lisp-03.txt
     B.4.  Changes to draft-ietf-lisp-02.txt
     B.5.  Changes to draft-ietf-lisp-01.txt
     B.6.  Changes to draft-ietf-lisp-00.txt
   Authors' Addresses

1. Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Introduction

   Many years of discussion about the current IP routing and addressing
   architecture have noted that its use of a single numbering space (the
   "IP address") for both host transport session identification and
   network routing creates scaling issues (see [CHIAPPA] and [RFC1498]).
   A number of scaling benefits would be realized by separating the
   current IP address into separate spaces for Endpoint Identifiers
   (EIDs) and Routing Locators (RLOCs); among them are:

   1. Reduction of routing table size in the "default-free zone" (DFZ).
      Use of a separate numbering space for RLOCs will allow them to be
      assigned topologically (in today's Internet, RLOCs would be
      assigned by providers at client network attachment points),
      greatly improving aggregation and reducing the number of
      globally-visible, routable prefixes.

   2. More cost-effective multihoming for sites that connect to
      different service providers where they can control their own
      policies for packet flow into the site without using extra
      routing table resources of core routers.

   3. Easing of renumbering burden when clients change providers.
      Because host EIDs are numbered from a separate,
      non-provider-assigned and non-topologically-bound space, they do
      not need to be renumbered when a client site changes its
      attachment points to the network.

   4. Traffic engineering capabilities that can be performed by network
      elements and do not depend on injecting additional state into the
      routing system. This will fall out of the mechanism that is used
      to implement the EID/RLOC split (see Section 4).

   5. Mobility without address changing. Existing mobility mechanisms
      will be able to work in a locator/ID separation scenario. It
      will be possible for a host (or a collection of hosts) to move to
      a different point in the network topology either retaining its
      home-based address or acquiring a new address based on the new
      network location. A new network location could be a physically
      different point in the network topology or the same physical
      point of the topology with a different provider.

   This draft describes protocol mechanisms to achieve the desired
   functional separation. For flexibility, the mechanism used for
   forwarding packets is decoupled from that used to determine EID to
   RLOC mappings. This document covers the former. For the latter, see
   [CONS], [ALT], [EMACS], [RPMD], and [NERD]. This work is in response
   to and intended to address the problem statement that came out of the
   RAWS effort [RFC4984].

   The Routing and Addressing problem statement can be found in [RADIR].

   This draft focuses on a router-based solution. Building the solution
   into the network will facilitate incremental deployment of the
   technology on the Internet. Note that while the detailed protocol
   specification and examples in this document assume IP version 4
   (IPv4), there is nothing in the design that precludes use of the same
   techniques and mechanisms for IPv6. It should be possible for IPv4
   packets to use IPv6 RLOCs and for IPv6 EIDs to be mapped to IPv4
   RLOCs.

   Related work on host-based solutions is described in Shim6 [SHIM6]
   and HIP [RFC4423]. Related work on a router-based solution is
   described in [GSE].
   This draft attempts not to compete or overlap with such solutions,
   and the proposed protocol changes are expected to complement a
   host-based mechanism when Traffic Engineering functionality is
   desired.

   Some of the design goals of this proposal include:

   1. Require no hardware or software changes to end-systems (hosts).

   2. Minimize required changes to Internet infrastructure.

   3. Be incrementally deployable.

   4. Require no router hardware changes.

   5. Minimize the number of routers which have to be modified. In
      particular, most customer site routers and no core routers
      require changes.

   6. Minimize router software changes in those routers which are
      affected.

   7. Avoid or minimize packet loss when EID-to-RLOC mappings need to
      be performed.

   There are 4 variants of LISP, which differ along a spectrum of strong
   to weak dependence on the topological nature and possible need for
   routability of EIDs. The variants are:

   LISP 1: uses EIDs that are routable through the RLOC topology for
      bootstrapping EID-to-RLOC mappings. [LISP1] This was intended as
      a prototyping mechanism for early protocol implementation. It is
      now deprecated and should not be deployed.

   LISP 1.5: uses EIDs that are routable for bootstrapping EID-to-RLOC
      mappings; such routing is via a separate topology.

   LISP 2: uses EIDs that are not routable and EID-to-RLOC mappings are
      implemented within the DNS. [LISP2]

   LISP 3: uses non-routable EIDs that are used as lookup keys for a
      new EID-to-RLOC mapping database. Use of Distributed Hash Tables
      [DHTs] [LISPDHT] to implement such a database would be an area to
      explore. Other examples of new mapping database services are
      [CONS], [ALT], [RPMD], [NERD], and [APT].

   This document focuses on the LISP 1.5 and LISP 3 variants, both of
   which rely on a router-based distributed cache and database for
   EID-to-RLOC mappings.
   The LISP 1.0 mechanism works but does not allow reduction
   of routing information in the default-free-zone of the Internet. The
   LISP 2 mechanisms are put on hold and may never come to fruition
   since it is not architecturally pure to have routing depend on
   directory and directory depend on routing. The LISP 3 mechanisms
   will be documented elsewhere but may use the control-plane options
   specified in this specification.

3. Definition of Terms

   Provider Independent (PI) Addresses: an address block assigned from
      a pool where blocks are not associated with any particular
      location in the network (e.g. from a particular service provider),
      and is therefore not topologically aggregatable in the routing
      system.

   Provider Assigned (PA) Addresses: a block of IP addresses that are
      assigned to a site by each service provider to which a site
      connects. Typically, each block is a sub-block of a service
      provider CIDR block and is aggregated into the larger block before
      being advertised into the global Internet. Traditionally, IP
      multihoming has been implemented by each multi-homed site
      acquiring its own, globally-visible prefix. LISP uses only
      topologically-assigned and aggregatable address blocks for RLOCs,
      eliminating this demonstrably non-scalable practice.

   Routing Locator (RLOC): the IPv4 or IPv6 address of an egress
      tunnel router (ETR). It is the output of an EID-to-RLOC mapping
      lookup. An EID maps to one or more RLOCs. Typically, RLOCs are
      numbered from topologically-aggregatable blocks that are assigned
      to a site at each point to which it attaches to the global
      Internet; where the topology is defined by the connectivity of
      provider networks, RLOCs can be thought of as PA addresses.
      Multiple RLOCs can be assigned to the same ETR device or to
      multiple ETR devices at a site.

   Endpoint ID (EID): a 32-bit (for IPv4) or 128-bit (for IPv6) value
      used in the source and destination address fields of the first
      (most inner) LISP header of a packet. The host obtains a
      destination EID the same way it obtains a destination address
      today, for example through a DNS lookup or SIP exchange. The
      source EID is obtained via existing mechanisms used to set a
      host's "local" IP address. An EID is allocated to a host from an
      EID-prefix block associated with the site where the host is
      located. An EID can be used by a host to refer to other hosts.
      EIDs MUST NOT be used as LISP RLOCs. Note that EID blocks may be
      assigned in a hierarchical manner, independent of the network
      topology, to facilitate scaling of the mapping database. In
      addition, an EID block assigned to a site may have site-local
      structure (subnetting) for routing within the site; this structure
      is not visible to the global routing system. When used in
      discussions with other Locator/ID separation proposals, a LISP EID
      will be called a "LEID". Throughout this document, any reference
      to "EID" refers to an LEID.

   EID-prefix: A power-of-2 block of EIDs which are allocated to a
      site by an address allocation authority. EID-prefixes are
      associated with a set of RLOC addresses which make up a "database
      mapping". EID-prefix allocations can be broken up into smaller
      blocks when an RLOC set is to be associated with the smaller
      EID-prefix. A globally routed address block (whether PI or PA) is
      not an EID-prefix. However, a globally routed address block may be
      removed from global routing and reused as an EID-prefix. A site
      that receives an explicitly allocated EID-prefix may not use that
      EID-prefix as a globally routed prefix assigned to RLOCs.

   End-system: is an IPv4 or IPv6 device that originates packets with
      a single IPv4 or IPv6 header.
      The end-system supplies an EID
      value for the destination address field of the IP header when
      communicating globally (i.e. outside of its routing domain). An
      end-system can be a host computer, a switch or router device, or
      any network appliance.

   Ingress Tunnel Router (ITR): a router which accepts an IP packet
      with a single IP header (more precisely, an IP packet that does
      not contain a LISP header). The router treats this "inner" IP
      destination address as an EID and performs an EID-to-RLOC mapping
      lookup. The router then prepends an "outer" IP header with one of
      its globally-routable RLOCs in the source address field and the
      result of the mapping lookup in the destination address field.
      Note that this destination RLOC may be an intermediate, proxy
      device that has better knowledge of the EID-to-RLOC mapping closer
      to the destination EID. In general, an ITR receives IP packets
      from site end-systems on one side and sends LISP-encapsulated IP
      packets toward the Internet on the other side.

      Specifically, when a service provider prepends a LISP header for
      Traffic Engineering purposes, the router that does this is also
      regarded as an ITR. The outer RLOC the ISP ITR uses can be based
      on the outer destination address (the originating ITR's supplied
      RLOC) or the inner destination address (the originating host's
      supplied EID).

   TE-ITR: is an ITR that is deployed in a service provider network
      that prepends an additional LISP header for Traffic Engineering
      purposes.

   Egress Tunnel Router (ETR): a router that accepts an IP packet
      where the destination address in the "outer" IP header is one of
      its own RLOCs. The router strips the "outer" header and forwards
      the packet based on the next IP header found.
      In general, an ETR
      receives LISP-encapsulated IP packets from the Internet on one
      side and sends decapsulated IP packets to site end-systems on the
      other side. ETR functionality does not have to be limited to a
      router device. A server host can be the endpoint of a LISP tunnel
      as well.

   TE-ETR: is an ETR that is deployed in a service provider network
      that strips an outer LISP header for Traffic Engineering purposes.

   xTR: is a reference to an ITR or ETR when direction of data flow is
      not part of the context description. xTR refers to the router that
      is the tunnel endpoint. Used synonymously with the term "Tunnel
      Router". For example, "An xTR can be located at the Customer Edge
      (CE) router", meaning both ITR and ETR functionality is at the CE
      router.

   EID-to-RLOC Cache: a short-lived, on-demand table in an ITR that
      stores, tracks, and is responsible for timing-out and otherwise
      validating EID-to-RLOC mappings. This cache is distinct from the
      full "database" of EID-to-RLOC mappings: the cache is dynamic,
      local to the ITR(s), and relatively small, while the database is
      distributed, relatively static, and much more global in scope.

   EID-to-RLOC Database: a global distributed database that contains
      all known EID-prefix to RLOC mappings. Each potential ETR
      typically contains a small piece of the database: the EID-to-RLOC
      mappings for the EID prefixes "behind" the router. These map to
      one of the router's own, globally-visible, IP addresses.

   Recursive Tunneling: when a packet has more than one LISP IP
      header. Additional layers of tunneling may be employed to
      implement traffic engineering or other re-routing as needed. When
      this is done, an additional "outer" LISP header is added and the
      original RLOCs are preserved in the "inner" header.
      Any reference to tunnels in this specification refers to dynamic
      encapsulating tunnels; they are never statically configured.

   Reencapsulating Tunnels: when a packet has no more than one LISP IP
      header (two IP headers total) and when it needs to be diverted to
      a new RLOC, an ETR can decapsulate the packet (remove the LISP
      header) and prepend a new tunnel header, with the new RLOC, on to
      the packet. Doing this allows a packet to be re-routed by the
      re-encapsulating router without adding the overhead of additional
      tunnel headers. Any reference to tunnels in this specification
      refers to dynamic encapsulating tunnels; they are never
      statically configured.

   LISP Header: a term used in this document to refer to the outer
      IPv4 or IPv6 header, a UDP header, and a LISP header that an ITR
      prepends or an ETR strips.

   Address Family Indicator (AFI): a term used to describe an address
      encoding in a packet. An address family currently pertains to an
      IPv4 or IPv6 address. See [AFI] for details.

   Negative Mapping Entry: also known as a negative cache entry, is an
      EID-to-RLOC entry where an EID-prefix is advertised or stored with
      no RLOCs. That is, the locator-set for the EID-to-RLOC entry is
      empty or has an encoded locator count of 0. This type of entry
      could be used to describe a prefix from a non-LISP site, which is
      explicitly not in the mapping database. There is a set of
      well-defined actions that are encoded in a Negative Map-Reply.

   Data Probe: a LISP-encapsulated data packet where the inner header
      destination address equals the outer header destination address
      used to trigger a Map-Reply by a decapsulating ETR. In addition,
      the original packet is decapsulated and delivered to the
      destination host. A Data Probe is used in some of the mapping
      database designs to "probe" or request a Map-Reply from an ETR; in
      other cases, Map-Requests are used.
      See each mapping database design for details.

4. Basic Overview

   One key concept of LISP is that end-systems (hosts) operate the same
   way they do today. The IP addresses that hosts use for tracking
   sockets, connections, and for sending and receiving packets do not
   change. In LISP terminology, these IP addresses are called Endpoint
   Identifiers (EIDs).

   Routers continue to forward packets based on IP destination
   addresses. When a packet is LISP encapsulated, these addresses are
   referred to as Routing Locators (RLOCs). Most routers along a path
   between two hosts will not change; they continue to perform routing/
   forwarding lookups on the destination addresses. For routers between
   the source host and the ITR as well as routers from the ETR to the
   destination host, the destination address is an EID. For the routers
   between the ITR and the ETR, the destination address is an RLOC.

   This design introduces "Tunnel Routers", which prepend LISP headers
   on host-originated packets and strip them prior to final delivery to
   their destination. The IP addresses in this "outer header" are
   RLOCs. During end-to-end packet exchange between two Internet hosts,
   an ITR prepends a new LISP header to each packet and an egress tunnel
   router strips the new header. The ITR performs EID-to-RLOC lookups
   to determine the routing path to the ETR, which has the RLOC as
   one of its IP addresses.

   Some basic rules governing LISP are:

   o  End-systems (hosts) only send to addresses which are EIDs. They
      don't know addresses are EIDs versus RLOCs but assume packets get
      to LISP routers, which in turn, deliver packets to the destination
      the end-system has specified.

   o  EIDs are always IP addresses assigned to hosts.

   o  LISP routers mostly deal with Routing Locator addresses. See
      details later in Section 4.1 to clarify what is meant by "mostly".

   o  RLOCs are always IP addresses assigned to routers; preferably,
      topologically-oriented addresses from provider CIDR blocks.

   o  When a router originates packets it may use as a source address
      either an EID or RLOC. When acting as a host (e.g. when
      terminating a transport session such as SSH, TELNET, or SNMP), it
      may use an EID that is explicitly assigned for that purpose. An
      EID that identifies the router as a host MUST NOT be used as an
      RLOC; an EID is only routable within the scope of a site. A
      typical BGP configuration might demonstrate this "hybrid" EID/RLOC
      usage where a router could use its "host-like" EID to terminate
      iBGP sessions to other routers in a site while at the same time
      using RLOCs to terminate eBGP sessions to routers outside the
      site.

   o  EIDs are not expected to be usable for global end-to-end
      communication in the absence of an EID-to-RLOC mapping operation.
      They are expected to be used locally for intra-site communication.

   o  EID prefixes are likely to be hierarchically assigned in a manner
      which is optimized for administrative convenience and to
      facilitate scaling of the EID-to-RLOC mapping database. The
      hierarchy is based on an address allocation hierarchy which is not
      dependent on the network topology.

   o  EIDs may also be structured (subnetted) in a manner suitable for
      local routing within an autonomous system.

   An additional LISP header may be prepended to packets by a transit
   router (i.e. TE-ITR) when re-routing of the path for a packet is
   desired. An obvious instance of this would be an ISP router that
   needs to perform traffic engineering for packets flowing through its
   network.
   In such a situation, termed Recursive Tunneling, an ISP
   transit router acts as an additional ingress tunnel router and the
   RLOC it uses for the new prepended header would be either a TE-ETR
   within the ISP (along an intra-ISP traffic engineered path) or a
   TE-ETR within another ISP (an inter-ISP traffic engineered path,
   where an agreement to build such a path exists).

   This specification mandates that no more than two LISP headers get
   prepended to a packet. This avoids excessive packet overhead as well
   as possible encapsulation loops. It is believed that two headers are
   sufficient, where the first prepended header is used at a site for
   Location/Identity separation and the second prepended header is used
   inside a service provider for Traffic Engineering purposes.

   Tunnel Routers can be placed fairly flexibly in a multi-AS topology.
   For example, the ITR for a particular end-to-end packet exchange
   might be the first-hop or default router within a site for the source
   host. Similarly, the egress tunnel router might be the last-hop
   router directly-connected to the destination host. As another
   example, perhaps for a VPN service out-sourced to an ISP by a site,
   the ITR could be the site's border router at the service provider
   attachment point. Mixing and matching of site-operated,
   ISP-operated, and other tunnel routers is allowed for maximum
   flexibility. See Section 8 for more details.

4.1. Packet Flow Sequence

   This section provides an example of the unicast packet flow with the
   following conditions:

   o  Source host "host1.abc.com" is sending a packet to
      "host2.xyz.com", exactly what host1 would do if the site was not
      using LISP.

   o  Each site is multi-homed, so each tunnel router has an address
      (RLOC) assigned from the service provider address block for each
      provider to which that particular tunnel router is attached.

   o  The ITR(s) and ETR(s) are directly connected to the source and
      destination, respectively.

   o  Data Probes, rather than Map-Requests, are used to solicit
      Map-Replies. The Data Probes are sent on the underlying topology
      (the LISP 1.0 variant) but could also be sent over an alternative
      topology (the LISP 1.5 variant) as they would in [ALT].

   Client host1.abc.com wants to communicate with server host2.xyz.com:

   1. host1.abc.com wants to open a TCP connection to host2.xyz.com.
      It does a DNS lookup on host2.xyz.com. An A/AAAA record is
      returned. This address is used as the destination EID and the
      locally-assigned address of host1.abc.com is used as the source
      EID. An IPv4 or IPv6 packet is built using the EIDs in the IPv4
      or IPv6 header and sent to the default router.

   2. The default router is configured as an ITR. The ITR must be able
      to map the EID destination to an RLOC of the ETR at the
      destination site. The ITR prepends a LISP header to the packet,
      with one of its RLOCs as the source IPv4 or IPv6 address. The
      destination EID from the original packet header is used as the
      destination IPv4 or IPv6 address in the prepended LISP header.
      Subsequent packets, where the outer destination address is the
      destination EID, will be sent until the EID-to-RLOC mapping is
      learned.

   3. In LISP 1, the packet is routed through the Internet as it is
      today. In LISP 1.5, the packet is routed on a different topology
      which may have EID prefixes distributed and advertised in an
      aggregatable fashion. In either case, the packet arrives at the
      ETR. The router is configured to "punt" the packet to the
      router's processor. See Section 7 for more details. For LISP
      2.0 and 3.0, the behavior is not fully defined yet.

   4. The LISP header is stripped so that the packet can be forwarded
      by the router control plane.
The router looks up the destination 556 EID in the router's EID-to-RLOC database (not the cache, but the 557 configured data structure of RLOCs). An EID-to-RLOC Map-Reply 558 message is originated by the ETR and is addressed to the source 559 RLOC in the LISP header of the original packet (this is the ITR). 560 The source RLOC of the Map-Reply is one of the ETR's RLOCs. 562 5. The ITR receives the Map-Reply message, parses the message (to 563 check for format validity) and stores the mapping information 564 from the packet. This information is put in the ITR's EID-to- 565 RLOC mapping cache (this is the on-demand cache, the cache where 566 entries time out due to inactivity). 568 6. Subsequent packets from host1.abc.com to host2.xyz.com will have 569 a LISP header prepended by the ITR using the appropriate RLOC as 570 the LISP header destination address learned from the ETR. Note, 571 the packet may be sent to a different ETR than the one which 572 returned the Map-Reply due to the source site's hashing policy or 573 the destination site's locator-set policy. 575 7. The ETR receives these packets directly (since the destination 576 address is one of its assigned IP addresses), strips the LISP 577 header and forwards the packets to the attached destination host. 579 In order to eliminate the need for a mapping lookup in the reverse 580 direction, an ETR MAY create a cache entry that maps the source EID 581 (inner header source IP address) to the source RLOC (outer header 582 source IP address) in a received LISP packet. Such a cache entry is 583 termed a "gleaned" mapping and only contains a single RLOC for the 584 EID in question. More complete information about additional RLOCs 585 SHOULD be verified by sending a LISP Map-Request for that EID. Both 586 ITR and the ETR may also influence the decision the other makes in 587 selecting an RLOC. See Section 6 for more details. 589 5. 
Tunneling Details 591 This section describes the LISP Data Message, which defines the 592 tunneling header used to encapsulate IPv4 and IPv6 packets that 593 contain EID addresses. Even though the following formats illustrate 594 IPv4-in-IPv4 and IPv6-in-IPv6 encapsulations, the other two 595 combinations are supported as well. 597 Since additional tunnel headers are prepended, the packet becomes 598 larger and in theory can exceed the MTU of any link traversed from 599 the ITR to the ETR. It is recommended that, in IPv4, packets not be 600 fragmented as they are encapsulated by the ITR. Instead, the 601 packet is dropped and an ICMP Too Big message is returned to the 602 source. 604 Based on informal surveys of large ISP traffic patterns, it appears 605 that most transit paths can accommodate a path MTU of at least 4470 606 bytes. The exceptions, in terms of data rate, number of hosts 607 affected, or any other metric, are expected to be vanishingly small. 609 To address MTU concerns, mainly raised on the RRG mailing list, the 610 LISP deployment process will include collecting data during its pilot 611 phase to either verify or refute the assumption about minimum 612 available MTU. If the assumption proves true and transit networks 613 with links limited to 1500 byte MTUs are corner cases, it would seem 614 more cost-effective to either upgrade or modify the equipment in 615 those transit networks to support larger MTUs or to use existing 616 mechanisms for accommodating packets that are too large. 618 For this reason, there is currently no plan for LISP to add any new, 619 complex mechanism for implementing fragmentation and 620 reassembly in the face of limited-MTU transit links.
If analysis 621 during LISP pilot deployment reveals that the assumption of 622 essentially ubiquitous, 4470+ byte transit path MTUs, is incorrect, 623 then LISP can be modified prior to protocol standardization to add 624 support for one of the proposed fragmentation and reassembly schemes. 625 Note that two simple existing schemes are detailed in Section 5.4. 627 5.1. LISP IPv4-in-IPv4 Header Format 629 0 1 2 3 630 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 631 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 632 / |Version| IHL |Type of Service| Total Length | 633 / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 634 | | Identification |Flags| Fragment Offset | 635 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 636 OH | Time to Live | Protocol = 17 | Header Checksum | 637 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 638 | | Source Routing Locator | 639 \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 640 \ | Destination Routing Locator | 641 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 642 / | Source Port = xxxx | Dest Port = 4341 | 643 UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 644 \ | UDP Length | UDP Checksum | 645 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 646 L |N|L|E| rflags | Nonce | 647 I \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 648 S / | Locator Status Bits | 649 P +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 650 / |Version| IHL |Type of Service| Total Length | 651 / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 652 | | Identification |Flags| Fragment Offset | 653 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 654 IH | Time to Live | Protocol | Header Checksum | 655 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 656 | | Source EID | 657 \ 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 658 \ | Destination EID | 659 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 661 5.2. LISP IPv6-in-IPv6 Header Format 663 0 1 2 3 664 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 665 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 666 / |Version| Traffic Class | Flow Label | 667 / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 668 | | Payload Length | Next Header=17| Hop Limit | 669 v +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 670 | | 671 O + + 672 u | | 673 t + Source Routing Locator + 674 e | | 675 r + + 676 | | 677 H +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 678 d | | 679 r + + 680 | | 681 ^ + Destination Routing Locator + 682 | | | 683 \ + + 684 \ | | 685 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 686 / | Source Port = xxxx | Dest Port = 4341 | 687 UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 688 \ | UDP Length | UDP Checksum | 689 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 690 L |N|L|E| rflags | Nonce | 691 I \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 692 S / | Locator Status Bits | 693 P +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 694 / |Version| Traffic Class | Flow Label | 695 / +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 696 / | Payload Length | Next Header | Hop Limit | 697 v +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 698 | | 699 I + + 700 n | | 701 n + Source EID + 702 e | | 703 r + + 704 | | 705 H +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 706 d | | 707 r + + 708 | | 709 ^ + Destination EID + 710 \ | | 711 \ + + 712 \ | | 713 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 715 5.3. 
Tunnel Header Field Descriptions 717 Inner Header: is the inner header, preserved from the datagram 718 received from the originating host. The source and destination IP 719 addresses are EIDs. 721 Outer Header: is the outer header prepended by an ITR. The address 722 fields contain RLOCs obtained from the ingress router's EID-to- 723 RLOC cache. The IP protocol number is "UDP (17)" from [RFC0768]. 724 The DF bit of the Flags field is set to 0 when the method in 725 Section 5.4.1 is used and set to 1 when the method in 726 Section 5.4.2 is used. 728 UDP Header: contains an ITR-selected source port when encapsulating a 729 packet. See Section 6.4 for details on the hash algorithm used to 730 select a source port based on the 5-tuple of the inner header. 731 The destination port MUST be set to the well-known IANA-assigned 732 port value 4341. 734 UDP Checksum: this field SHOULD be transmitted as zero by an ITR for 735 either IPv4 [RFC0768] or IPv6 encapsulation [UDP-TUNNELS]. When a 736 packet with a zero UDP checksum is received by an ETR, the ETR 737 MUST accept the packet for decapsulation. When an ITR transmits a 738 non-zero value for the UDP checksum, it MUST send a correctly 739 computed value in this field. When an ETR receives a packet with 740 a non-zero UDP checksum, it MAY choose to verify the checksum 741 value. If it chooses to perform such verification, and the 742 verification fails, the packet MUST be silently dropped. If the 743 ETR chooses not to perform the verification, or performs the 744 verification successfully, the packet MUST be accepted for 745 decapsulation. The handling of UDP checksums for all tunneling 746 protocols, including LISP, is under active discussion within the 747 IETF. When that discussion concludes, any necessary changes will 748 be made to align LISP with the outcome of the broader discussion. 750 UDP Length: for an IPv4 encapsulated packet, the inner header Total 751 Length plus the UDP and LISP header lengths are used.
For an IPv6 752 encapsulated packet, the inner header Payload Length plus the size 753 of the IPv6 header (40 bytes) plus the size of the UDP and LISP 754 headers are used. The UDP header length is 8 bytes. 756 N: this is the nonce-present bit. When this bit is set to 1, the 757 low-order 24 bits of the first 32 bits of the LISP header contain 758 a Nonce. See Section 6.3.1 for details. 760 L: this is the Locator-Status-Bits field enabled bit. When this bit 761 is set to 1, the Locator-Status-Bits in the second 32 bits of the 762 LISP header are in use. 764 E: this is the echo-nonce-request bit. When this bit is set to 1, 765 the N bit must be 1. This bit should be ignored and has no 766 meaning when the N bit is set to 0. See Section 6.3.1 for 767 details. 769 rflags: this 4-bit field is reserved for future flag use. It is set 770 to 0 on transmit and ignored on receipt. 772 LISP Nonce: is a 24-bit value that is randomly generated by an ITR 773 when the N-bit is set to 1. The nonce is also used when the E-bit 774 is set to request the nonce value to be echoed by the other side 775 when packets are returned. When the E-bit is clear but the N-bit 776 is set, an ITR is either echoing a previously requested echo-nonce 777 or providing a random nonce. See Section 6.3.1 for more 778 details. 780 LISP Locator Status Bits: in the LISP header are set by an ITR to 781 indicate to an ETR the up/down status of the Locators in the 782 source site. Each RLOC in a Map-Reply is assigned an ordinal 783 value from 0 to n-1 (when there are n RLOCs in a mapping entry). 784 The Locator Status Bits are numbered from 0 to n-1 from the least 785 significant bit of the 32-bit field. When a bit is set to 1, the 786 ITR is indicating to the ETR that the RLOC associated with the bit 787 ordinal is up. See Section 6.3 for details on how an ITR 788 can determine whether other ITRs at the site are reachable.
When a site 789 has multiple EID-prefixes that result in multiple mappings (where 790 each could have a different locator-set), the Locator Status Bits 791 setting in an encapsulated packet MUST reflect the mapping for the 792 EID-prefix that the inner-header source EID address matches. 794 When doing Recursive Tunneling or ITR/PTR encapsulation: 796 o The outer header Time to Live field (or Hop Limit field, in the case 797 of IPv6) SHOULD be copied from the inner header Time to Live 798 field. 800 o The outer header Type of Service field (or the Traffic Class 801 field, in the case of IPv6) SHOULD be copied from the inner header 802 Type of Service field (with one caveat, see below). 804 When doing Re-encapsulated Tunneling: 806 o The new outer header Time to Live field SHOULD be copied from the 807 stripped outer header Time to Live field. 809 o The new outer header Type of Service field SHOULD be copied from 810 the stripped outer header Type of Service field (with one caveat, see 811 below). 813 Copying the TTL serves two purposes: first, it preserves the distance 814 the host intended the packet to travel; second, and more importantly, 815 it provides for suppression of looping packets in the event there is 816 a loop of concatenated tunnels due to misconfiguration. 818 The ECN field occupies bits 6 and 7 of both the IPv4 Type of Service 819 field and the IPv6 Traffic Class field [RFC3168]. The ECN field 820 requires special treatment in order to avoid discarding indications 821 of congestion [RFC3168]. ITR encapsulation MUST copy the 2-bit ECN 822 field from the inner header to the outer header. Re-encapsulation 823 MUST copy the 2-bit ECN field from the stripped outer header to the 824 new outer header.
If the ECN field contains a congestion indication 825 codepoint (the value is '11', the Congestion Experienced (CE) 826 codepoint), then ETR decapsulation MUST copy the 2-bit ECN field from 827 the stripped outer header to the surviving inner header that is used 828 to forward the packet beyond the ETR. These requirements preserve 829 Congestion Experienced (CE) indications when a packet that uses ECN 830 traverses a LISP tunnel and becomes marked with a CE indication due 831 to congestion between the tunnel endpoints. 833 5.4. Dealing with Large Encapsulated Packets 835 In the event that the MTU issues mentioned above prove to be more 836 serious than expected, this section proposes two simple mechanisms to 837 deal with large packets. One is stateless, using IP fragmentation, and 838 the other is stateful, using Path MTU Discovery [RFC1191]. 840 It is left to the implementor to decide whether the stateless or the stateful 841 mechanism should be implemented. An implementation may also support both or neither, 842 since how to deal with MTU issues is a local decision in the ITR. 843 Sites can interoperate with differing mechanisms. 845 Both stateless and stateful mechanisms also apply to Reencapsulating 846 and Recursive Tunneling. Any actions described below for an ITR 847 therefore also apply to a TE-ITR. 849 5.4.1. A Stateless Solution to MTU Handling 851 An ITR stateless solution to handle MTU issues is described as 852 follows: 854 1. Define an architectural constant S for the maximum size of a 855 packet, in bytes, an ITR would receive from a source inside of 856 its site. 858 2. Define L to be the maximum size, in bytes, a packet of size S 859 would be after the ITR prepends the LISP header, UDP header, and 860 outer network layer header of size H. 862 3. Calculate: S + H = L.
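As a worked instance of the calculation above, the following sketch derives S from L for both outer header families. It assumes an outer IPv4 header without options (20 bytes) or a fixed IPv6 header (40 bytes), the 8-byte UDP header, and the 8-byte LISP header drawn in Sections 5.1 and 5.2. The value 1500 is the one this section recommends for L. All function and constant names are illustrative and are not defined by this specification.

```python
# Sketch only: names are illustrative, not defined by this specification.
OUTER_IPV4_HEADER = 20  # assumption: outer IPv4 header carries no options
OUTER_IPV6_HEADER = 40  # fixed IPv6 header
UDP_HEADER = 8
LISP_HEADER = 8         # two 32-bit words, per the diagrams in 5.1 and 5.2


def encap_overhead(outer_is_ipv6: bool) -> int:
    """Return H, the number of bytes the ITR prepends."""
    outer = OUTER_IPV6_HEADER if outer_is_ipv6 else OUTER_IPV4_HEADER
    return outer + UDP_HEADER + LISP_HEADER


def max_site_packet(l_bytes: int, outer_is_ipv6: bool) -> int:
    """Return S = L - H, the largest site packet that fits after encapsulation."""
    return l_bytes - encap_overhead(outer_is_ipv6)


print(max_site_packet(1500, outer_is_ipv6=False))  # 1464
print(max_site_packet(1500, outer_is_ipv6=True))   # 1444
```

Under these assumptions H is 36 bytes for an IPv4 outer header and 56 bytes for an IPv6 outer header.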
864 When an ITR receives a packet from a site-facing interface and adds H 865 bytes worth of encapsulation to yield a packet size of L bytes, it 866 resolves the MTU issue by first splitting the original packet into 2 867 equal-sized fragments. A LISP header is then prepended to each 868 fragment. This will ensure that the new, encapsulated packets are of 869 size (S/2 + H), which is always below the effective tunnel MTU. 871 When an ETR receives encapsulated fragments, it treats them as two 872 individually encapsulated packets. It strips the LISP headers and then 873 forwards each fragment to the destination host of the destination 874 site. The two fragments are reassembled at the destination host into 875 the single IP datagram that was originated by the source host. 877 This behavior is performed by the ITR when the source host originates 878 a packet with the DF field of the IP header set to 0. When the DF 879 field of the IP header is set to 1, or the packet is an IPv6 packet 880 originated by the source host, the ITR drops the packet when the 881 size is greater than L, and sends an ICMP Too Big message to the 882 source with a value of S, where S is (L - H). 884 When the outer header encapsulation uses an IPv4 header, the DF bit is 885 always set to 0. 887 This specification recommends that L be defined as 1500. 889 5.4.2. A Stateful Solution to MTU Handling 891 An ITR stateful solution to handle MTU issues is described as follows 892 and was first introduced in [OPENLISP]: 894 1. The ITR will keep state of the effective MTU for each locator per 895 mapping cache entry. The effective MTU is what the core network 896 can deliver along the path between ITR and ETR. 898 2. When an IPv6 encapsulated packet, or an IPv4 encapsulated packet 899 with the DF bit set to 1, exceeds what the core network can deliver, 900 one of the intermediate routers on the path will send an ICMP Too 901 Big message to the ITR.
The ITR will parse the ICMP message to 902 determine which locator is affected by the effective MTU change 903 and then record the new effective MTU value in the mapping cache 904 entry. 906 3. When a packet is received by the ITR from a source inside of the 907 site and the size of the packet is greater than the effective MTU 908 stored with the mapping cache entry associated with the 909 destination EID the packet is for, the ITR will send an ICMP Too 910 Big message back to the source. The packet size advertised by 911 the ITR in the ICMP Too Big message is the effective MTU minus 912 the LISP encapsulation length. 914 Even though this mechanism is stateful, it has advantages over the 915 stateless IP fragmentation mechanism, by not involving the 916 destination host with reassembly of ITR fragmented packets. 918 6. EID-to-RLOC Mapping 920 6.1. LISP IPv4 and IPv6 Control Plane Packet Formats 922 The following new UDP packet types are used to retrieve EID-to-RLOC 923 mappings: 925 0 1 2 3 926 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 927 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 928 |Version| IHL |Type of Service| Total Length | 929 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 930 | Identification |Flags| Fragment Offset | 931 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 932 | Time to Live | Protocol = 17 | Header Checksum | 933 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 934 | Source Routing Locator | 935 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 936 | Destination Routing Locator | 937 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 938 / | Source Port | Dest Port | 939 UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 940 \ | UDP Length | UDP Checksum | 941 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 942 | | 943 | LISP Message | 944 | | 945 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 947 0 1 2 3 948 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 949 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 950 |Version| Traffic Class | Flow Label | 951 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 952 | Payload Length | Next Header=17| Hop Limit | 953 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 954 | | 955 + + 956 | | 957 + Source Routing Locator + 958 | | 959 + + 960 | | 961 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 962 | | 963 + + 964 | | 965 + Destination Routing Locator + 966 | | 967 + + 968 | | 969 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 970 / | Source Port | Dest Port | 971 UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 972 \ | UDP Length | UDP Checksum | 973 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 974 | | 975 | LISP Message | 976 | | 977 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 979 The LISP UDP-based messages are the Map-Request and Map-Reply 980 messages. When a UDP Map-Request is sent, the UDP source port is 981 chosen by the sender and the destination UDP port number is set to 982 4342. When a UDP Map-Reply is sent, the source UDP port number is 983 set to 4342 and the destination UDP port number is copied from the 984 source port of either the Map-Request or the invoking data packet. 986 The UDP Length field will reflect the length of the UDP header and 987 the LISP Message payload. 989 The UDP Checksum is computed and set to non-zero for Map-Request and 990 Map-Reply messages. It MUST be checked on receipt and if the 991 checksum fails, the packet MUST be dropped. 993 LISP-CONS [CONS] use TCP to send LISP control messages. The format 994 of control messages includes the UDP header so the checksum and 995 length fields can be used to protect and delimit message boundaries. 
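The port and checksum rules above can be sketched as follows for the IPv4 case. This is an illustration under stated assumptions rather than part of the specification: the function names are hypothetical, the addresses in the usage note are documentation addresses, and the checksum is the standard UDP one's-complement sum over the IPv4 pseudo-header described in [RFC0768].

```python
import socket
import struct

# Destination port of a Map-Request and source port of a Map-Reply.
LISP_CONTROL_PORT = 4342
UDP_PROTOCOL = 17


def ones_complement_sum(data: bytes) -> int:
    """Folded 16-bit one's-complement sum, zero-padding odd-length input."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, udp_len: int) -> bytes:
    """IPv4 pseudo-header that is covered by the UDP checksum."""
    return (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!BBH", 0, UDP_PROTOCOL, udp_len))


def build_udp(src: str, dst: str, sport: int, dport: int, lisp_msg: bytes) -> bytes:
    """UDP header plus LISP message; UDP Length covers both."""
    udp_len = 8 + len(lisp_msg)
    header = struct.pack("!HHHH", sport, dport, udp_len, 0)
    covered = pseudo_header(src, dst, udp_len) + header + lisp_msg
    csum = ~ones_complement_sum(covered) & 0xFFFF
    if csum == 0:
        csum = 0xFFFF  # a computed zero is transmitted as all ones
    return struct.pack("!HHHH", sport, dport, udp_len, csum) + lisp_msg


def checksum_ok(src: str, dst: str, datagram: bytes) -> bool:
    """Receiver check: a zero or failing checksum means the packet is dropped."""
    if datagram[6:8] == b"\x00\x00":
        return False
    covered = pseudo_header(src, dst, len(datagram)) + datagram
    return ones_complement_sum(covered) == 0xFFFF


def map_reply_ports(request_sport: int):
    """(source, destination) ports of the Map-Reply answering a request."""
    return LISP_CONTROL_PORT, request_sport
```

For example, build_udp("192.0.2.1", "198.51.100.2", 50000, LISP_CONTROL_PORT, msg) produces a datagram that checksum_ok() accepts, and flipping any single bit of that datagram causes checksum_ok() to reject it.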
997 This main LISP specification is the authoritative source for message 998 format definitions for the Map-Request and Map-Reply messages. 1000 6.1.1. LISP Packet Type Allocations 1002 This section will be the authoritative source for allocating LISP 1003 Type values. Current allocations are: 1005 Reserved: 0 b'0000' 1006 LISP Map-Request: 1 b'0001' 1007 LISP Map-Reply: 2 b'0010' 1008 LISP Map-Register: 3 b'0011' 1009 LISP Encapsulated Control Message: 8 b'1000' 1011 6.1.2. Map-Request Message Format 1013 0 1 2 3 1014 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1015 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1016 |Type=1 |A|M|P|S| Reserved | Record Count | 1017 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1018 | Nonce . . . | 1019 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1020 | . . . Nonce | 1021 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1022 | Source-EID-AFI | ITR-AFI | 1023 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1024 | Source EID Address ... | 1025 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1026 | Originating ITR RLOC Address ... | 1027 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1028 / | Reserved | EID mask-len | EID-prefix-AFI | 1029 Rec +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1030 \ | EID-prefix ... | 1031 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1032 | Map-Reply Record ... | 1033 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1034 | Mapping Protocol Data | 1035 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1037 Packet field descriptions: 1039 Type: 1 (Map-Request) 1041 A: This is an authoritative bit, which is set to 0 for UDP-based Map- 1042 Requests sent by an ITR. 1044 M: When set, it indicates a Map-Reply Record segment is included in 1045 the Map-Request. 
1047 P: Indicates that a Map-Request should be treated as a "piggyback" 1048 locator reachability probe. The receiver should respond with a 1049 Map-Reply with the P bit set and the nonce copied from the Map- 1050 Request. See Section 6.3.2 for more details. 1052 S: This is the SMR bit. See Section 6.5.2 for details. 1054 Reserved: Set to 0 on transmission and ignored on receipt. 1056 Record Count: The number of records in this Map-Request message. A 1057 record comprises the portion of the packet that is labeled 1058 'Rec' above and occurs the number of times equal to Record Count. 1059 For this version of the protocol, a receiver MUST accept and 1060 process Map-Requests that contain one or more records, but a 1061 sender MUST send only Map-Requests containing one record. Support 1062 for requesting multiple EIDs in a single Map-Request message will 1063 be specified in a future version of the protocol. 1065 Nonce: An 8-byte random value created by the sender of the Map- 1066 Request. This nonce will be returned in the Map-Reply. The 1067 security of the LISP mapping protocol depends critically on the 1068 strength of the nonce in the Map-Request message. The nonce 1069 SHOULD be generated by a properly seeded pseudo-random (or strong 1070 random) source. See [RFC4086] for advice on generating security- 1071 sensitive random data. 1073 Source-EID-AFI: Address family of the "Source EID Address" field. 1075 ITR-AFI: Address family of the "Originating ITR RLOC Address" field. 1077 Source EID Address: This is the EID of the source host that 1078 originated the packet that invoked this Map-Request. When 1079 Map-Requests are used for refreshing a map-cache entry or for 1080 RLOC-probing, the value 0 is used. 1082 Originating ITR RLOC Address: Used to give the ETR the option of 1083 returning a Map-Reply in the address-family of this locator. 1085 EID mask-len: Mask length for EID prefix.
1087 EID-prefix-AFI: Address family of EID-prefix according to [RFC2434]. 1089 EID-prefix: 4 bytes if an IPv4 address-family, 16 bytes if an IPv6 1090 address-family. When a Map-Request is sent by an ITR because a 1091 data packet is received for a destination where there is no 1092 mapping entry, the EID-prefix is set to the destination IP address 1093 of the data packet, and the 'EID mask-len' is set to 32 or 128 1094 for IPv4 or IPv6, respectively. When an xTR wants to query a site 1095 about the status of a mapping it already has cached, the EID- 1096 prefix used in the Map-Request has the same mask-length as the 1097 EID-prefix returned from the site when it sent a Map-Reply 1098 message. 1100 Map-Reply Record: When the M bit is set, this field is the size of 1101 the "Record" field in the Map-Reply format. This Map-Reply record 1102 contains the EID-to-RLOC mapping entry associated with the Source 1103 EID. This allows the ETR that will receive this Map-Request to 1104 cache the data if it chooses to do so. 1106 Mapping Protocol Data: See [CONS] or [ALT] for details. This field 1107 is optional and present when the UDP length indicates there is 1108 enough space in the packet to include it. 1110 6.1.3. EID-to-RLOC UDP Map-Request Message 1112 A Map-Request is sent from an ITR when it needs a mapping for an EID, 1113 wants to test an RLOC for reachability, or wants to refresh a mapping 1114 before TTL expiration. For the initial case, the destination IP 1115 address used for the Map-Request is the destination-EID from the 1116 packet which had a mapping cache lookup failure. For the latter two 1117 cases, the destination IP address used for the Map-Request is one of 1118 the RLOC addresses from the locator-set of the map cache entry. The 1119 source address is either an IPv4 or IPv6 RLOC address depending on whether 1120 the Map-Request is using an IPv4 versus IPv6 header, respectively.
1121 In all cases, the UDP source port number for the Map-Request message 1122 is a randomly allocated 16-bit value and the UDP destination port 1123 number is set to the well-known destination port number 4342. A 1124 successful Map-Reply updates the cached set of RLOCs associated with 1125 the EID prefix range. 1127 Map-Requests can also be LISP encapsulated using UDP destination port 1128 4342 with a LISP type value set to "Encapsulated Control Message", 1129 when sent from an ITR to a Map-Resolver. Likewise, Map-Requests are 1130 LISP encapsulated the same way from a Map-Server to an ETR. Details 1131 on encapsulated Map-Requests and Map-Resolvers can be found in 1132 [LISP-MS]. 1134 Map-Requests MUST be rate-limited. It is recommended that a Map- 1135 Request for the same EID-prefix be sent no more than once per second. 1137 An ITR that is configured with mapping database information (i.e., it 1138 is also an ETR) may include those mappings in a Map- 1139 Request. When an ETR configured to accept and verify such 1140 "piggybacked" mapping data receives such a Map-Request, it may 1141 originate a "verifying Map-Request", addressed to the original ITR. 1142 If the ETR has a map-cache entry that matches the "piggybacked" EID 1143 and the RLOC is in the locator-set for the entry, then it may send 1144 the "verifying Map-Request" to the original Map-Request source. If 1145 not, then it MUST send it to the "piggybacked" EID. Doing this 1146 forces the "verifying Map-Request" to go through the mapping database 1147 system to reach the authoritative source of information about that 1148 EID, guarding against RLOC-spoofing in the "piggybacked" mapping 1149 data. 1151 6.1.4.
Map-Reply Message Format 1153 0 1 2 3 1154 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1155 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1156 |Type=2 |P|E| Reserved | Record Count | 1157 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1158 | Nonce . . . | 1159 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1160 | . . . Nonce | 1161 +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1162 | | Record TTL | 1163 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1164 R | Locator Count | EID mask-len | ACT |A| Reserved | 1165 e +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1166 c | Reserved | EID-AFI | 1167 o +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1168 r | EID-prefix | 1169 d +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1170 | /| Priority | Weight | M Priority | M Weight | 1171 | L +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1172 | o | Unused Flags |R| Loc-AFI | 1173 | c +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1174 | \| Locator | 1175 +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1176 | Mapping Protocol Data | 1177 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1179 Packet field descriptions: 1181 Type: 2 (Map-Reply) 1183 P: Indicates that the Map-Reply is in response to a "piggyback" 1184 locator reachability Map-Request. The nonce field should contain 1185 a copy of the nonce value from the original Map-Request. See 1186 section Section 6.3.2 for more details. 1188 E: Indicates that the ETR which sends this Map-Reply message is 1189 advertising that the site is enabled for the Echo-Nonce locator 1190 reachability algorithm. See Section 6.3.1 for more details. 1192 Reserved: Set to 0 on transmission and ignored on receipt. 1194 Record Count: The number of records in this reply message. 
A record 1195 comprises that portion of the packet labeled 'Record' above 1196 and occurs the number of times equal to Record Count. 1198 Nonce: A 24-bit value set in a Data-Probe packet or a 64-bit value 1199 from the Map-Request is echoed in this Nonce field of the Map- 1200 Reply. 1202 Record TTL: The time in minutes the recipient of the Map-Reply will 1203 store the mapping. If the TTL is 0, the entry should be removed 1204 from the cache immediately. If the value is 0xffffffff, the 1205 recipient can decide locally how long to store the mapping. 1207 Locator Count: The number of Locator entries. A locator entry 1208 comprises what is labeled above as 'Loc'. The locator count can 1209 be 0, indicating there are no locators for the EID-prefix. 1211 EID mask-len: Mask length for EID prefix. 1213 ACT: This 3-bit field describes negative Map-Reply actions. These 1214 bits are used only when the 'Locator Count' field is set to 0. 1215 The action bits are encoded only in Map-Reply messages. The 1216 actions defined are used by an ITR or PTR when a destination EID 1217 matches a negative mapping cache entry. Unassigned values should 1218 cause a map-cache entry to be created and, when packets match this 1219 negative cache entry, they will be dropped. The currently assigned 1220 values are: 1222 (0) Drop: The packet is dropped silently. 1224 (1) Natively-Forward: The packet is not encapsulated or dropped 1225 but natively forwarded. 1227 (2) Send-Map-Request: The packet invokes sending a Map-Request. 1229 A: The Authoritative bit, when sent by a UDP-based message, is always 1230 set by the ETR. See [CONS] for TCP-based Map-Replies. 1232 EID-AFI: Address family of EID-prefix according to [RFC2434]. 1234 EID-prefix: 4 bytes if an IPv4 address-family, 16 bytes if an IPv6 1235 address-family. 1237 Priority: each RLOC is assigned a unicast priority. Lower values 1238 are preferred.
      When multiple RLOCs have the same priority, they may be used in a
      load-split fashion. A value of 255 means the RLOC MUST NOT be
      used for unicast forwarding.

   Weight: when priorities are the same for multiple RLOCs, the weight
      indicates how to balance unicast traffic between them. Weight is
      encoded as a percentage of total unicast packets that match the
      mapping entry. If a non-zero weight value is used for any RLOC,
      then all RLOCs must use a non-zero weight value and then the sum
      of all weight values MUST equal 100. If a zero value is used for
      any RLOC weight, then all weights MUST be zero and the receiver
      of the Map-Reply will decide how to load-split traffic. See
      Section 6.4 for a suggested hash algorithm to distribute load
      across locators with the same priority and equal weight values.
      When a single RLOC exists in a mapping entry, the weight value
      MUST be set to 100 and ignored on receipt.

   M Priority: each RLOC is assigned a multicast priority used by an
      ETR in a receiver multicast site to select an ITR in a source
      multicast site for building multicast distribution trees. A
      value of 255 means the RLOC MUST NOT be used for joining a
      multicast distribution tree.

   M Weight: when priorities are the same for multiple RLOCs, the
      weight indicates how to balance building multicast distribution
      trees across multiple ITRs. The weight is encoded as a
      percentage of the total number of trees built to the source site
      identified by the EID-prefix. If a non-zero weight value is used
      for any RLOC, then all RLOCs must use a non-zero weight value and
      then the sum of all weight values MUST equal 100. If a zero
      value is used for any RLOC weight, then all weights MUST be zero
      and the receiver of the Map-Reply will decide how to distribute
      multicast state across ITRs.

   Unused Flags: set to 0 when sending and ignored on receipt.

   R: when this bit is set, the locator is known to be reachable from
      the Map-Reply sender's perspective.

   Locator: an IPv4 or IPv6 address (as encoded by the 'Loc-AFI' field)
      assigned to an ETR or router acting as a proxy replier for the
      EID-prefix. Note that the destination RLOC address MAY be an
      anycast address. A source RLOC can be an anycast address as
      well. The source or destination RLOC MUST NOT be the broadcast
      address (255.255.255.255 or any subnet broadcast address known
      to the router), and MUST NOT be a link-local multicast address.
      The source RLOC MUST NOT be a multicast address. The destination
      RLOC SHOULD be a multicast address if it is being mapped from a
      multicast destination EID.

   Mapping Protocol Data: See [CONS] or [ALT] for details. This field
      is optional and present when the UDP length indicates there is
      enough space in the packet to include it.

6.1.5. EID-to-RLOC UDP Map-Reply Message

   When a Data Probe packet or a Map-Request triggers a Map-Reply to be
   sent, the RLOCs associated with the EID-prefix matched by the EID in
   the original packet destination IP address field will be returned.
   The RLOCs in the Map-Reply are the globally-routable IP addresses of
   the ETR but are not necessarily reachable; separate testing of
   reachability is required.

   Note that a Map-Reply may contain different EID-prefix granularity
   (prefix + length) than the Map-Request which triggers it. This
   might occur if a Map-Request were for a prefix that had been
   returned by an earlier Map-Reply. In such a case, the requester
   updates its cache with the new prefix information and granularity.
   For example, a requester with two cached EID-prefixes that are
   covered by a Map-Reply containing one less-specific prefix replaces
   those entries with the less-specific EID-prefix.
   Note that the reverse, replacement of one less-specific prefix with
   multiple more-specific prefixes, can also occur, not by removing the
   less-specific prefix but by adding the more-specific prefixes, which
   during a lookup will override the less-specific prefix.

   Replies SHOULD be sent for an EID-prefix no more often than once per
   second to the same requesting router. For scalability, it is
   expected that aggregation of EID addresses into EID-prefixes will
   allow one Map-Reply to satisfy a mapping for the EID addresses in
   the prefix range, thereby reducing the number of Map-Request
   messages.

   The addresses from an encapsulated data packet or Map-Request
   message are swapped and used for sending the Map-Reply. The UDP
   source and destination ports are swapped as well. That is, the
   source port in the UDP header for the Map-Reply is set to the
   well-known UDP port number 4342.

   Map-Reply records can have an empty locator-set. This type of
   Map-Reply is called a Negative Map-Reply. Negative Map-Replies
   convey special actions by the sender to the ITR or PTR which has
   solicited the Map-Reply. There are two primary applications for
   Negative Map-Replies. The first is for a Map-Resolver to instruct
   an ITR or PTR when a destination is for a LISP site versus a
   non-LISP site. The other is to source quench Map-Requests which
   are sent for non-allocated EIDs.

   For each Map-Reply record, the list of locators in a locator-set
   MUST appear in the same order for each ETR that originates a
   Map-Reply message. The locator-set MUST be sorted in order of
   ascending IP address where an IPv4 locator address is considered
   numerically 'less than' an IPv6 locator address.

6.1.6. Map-Register Message Format

   The usage details of the Map-Register message can be found in
   specification [LISP-MS]. This section solely defines the message
   format.

   The message is sent in UDP with a destination UDP port of 4342 and a
   randomly selected UDP source port number.

   The Map-Register message format is:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |Type=3 |P|              Reserved               | Record Count  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                         Nonce . . .                           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                         . . . Nonce                           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |            Key ID             |  Authentication Data Length   |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       ~                     Authentication Data                       ~
   +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |                          Record  TTL                          |
   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   R   | Locator Count | EID mask-len  | ACT |A|       Reserved        |
   e   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   c   |           Reserved            |            EID-AFI            |
   o   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   r   |                          EID-prefix                           |
   d   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  /|    Priority   |    Weight     |  M Priority   |   M Weight    |
   | L +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | o |        Unused Flags         |R|           Loc-AFI             |
   | c +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  \|                            Locator                            |
   +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Packet field descriptions:

   Type: 3 (Map-Register)

   P: Set to 1 by an ETR which sends a Map-Register message requesting
      that the Map-Server proxy Map-Replies. The Map-Server will send
      non-authoritative Map-Replies on behalf of the ETR. Details on
      this usage will be provided in a future version of this draft.

   Reserved: Set to 0 on transmission and ignored on receipt.
   Record Count: The number of records in this Map-Register message.
      A record comprises that portion of the packet labeled 'Record'
      above and occurs the number of times equal to Record Count.

   Nonce: This 8-byte Nonce field is set to 0 in Map-Register messages.

   Key ID: A configured ID to find the configured Message
      Authentication Code (MAC) algorithm and key value used for the
      authentication function.

   Authentication Data Length: The length in bytes of the
      Authentication Data field that follows this field. The length of
      the Authentication Data field is dependent on the Message
      Authentication Code (MAC) algorithm used. The length field
      allows a device that doesn't know the MAC algorithm to correctly
      parse the packet.

   Authentication Data: The message digest produced by the Message
      Authentication Code (MAC) algorithm. The entire Map-Register
      payload is authenticated with this field preset to 0. After the
      MAC is computed, it is placed in this field. Implementations of
      this specification MUST include support for HMAC-SHA-1-96
      [RFC2404] and support for HMAC-SHA-256-128 [RFC4634] is
      recommended.

   The definition of the rest of the Map-Register can be found in the
   Map-Reply section.

6.1.7. Encapsulated Control Message Format

   An Encapsulated Control Message is used to encapsulate control
   packets sent between xTRs and the mapping database system described
   in [LISP-MS].

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |                      IPv4 or IPv6 Header                      |
   OH  |                     (uses RLOC addresses)                     |
     \ |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |       Source Port = xxxx      |       Dest Port = 4342        |
   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |           UDP Length          |        UDP Checksum           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   LH  |Type=8 |                   Reserved                            |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |                      IPv4 or IPv6 Header                      |
   IH  |                 (uses RLOC or EID addresses)                  |
     \ |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |       Source Port = xxxx      |       Dest Port = yyyy        |
   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |           UDP Length          |        UDP Checksum           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   LCM |                     LISP Control Message                      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Packet header descriptions:

   OH: The outer IPv4 or IPv6 header which uses RLOC addresses in the
      source and destination header address fields.

   UDP: The outer UDP header with destination port 4342. The source
      port is randomly allocated. The checksum field MUST be non-zero.

   LH: Type 8 is defined to be a "LISP Encapsulated Control Message"
      and what follows is either an IPv4 or IPv6 header as encoded by
      the first 4 bits after the reserved field.

   IH: The inner IPv4 or IPv6 header which can use either RLOC or EID
      addresses in the header address fields. When a Map-Request is
      encapsulated in this packet format, the destination address in
      this header is an EID.

   UDP: The inner UDP header where the port assignments depend on the
      control packet being encapsulated.
      When the control packet is a Map-Request or Map-Register, the
      source port is randomly assigned and the destination port is
      4342. When the control packet is a Map-Reply, the source port is
      4342 and the destination port is assigned from the source port of
      the invoking Map-Request. Port number 4341 MUST NOT be assigned
      to either port. The checksum field MUST be non-zero.

   LCM: The format is one of the control message formats described in
      this section. At this time, only Map-Request messages and PIM
      Join-Prune messages [MLISP] are allowed to be encapsulated.
      Encapsulating other types of LISP control messages is for further
      study.

6.2. Routing Locator Selection

   Both client-side and server-side may need control over the selection
   of RLOCs for conversations between them. This control is achieved
   by manipulating the Priority and Weight fields in EID-to-RLOC
   Map-Reply messages. Alternatively, RLOC information may be gleaned
   from received tunneled packets or EID-to-RLOC Map-Request messages.

   The following enumerates different scenarios for choosing RLOCs and
   the controls that are available:

   o  Server-side returns one RLOC. Client-side can only use one RLOC.
      Server-side has complete control of the selection.

   o  Server-side returns a list of RLOCs where a subset of the list
      has the same best priority. Client can only use the subset list
      according to the weighting assigned by the server-side. In this
      case, the server-side controls both the subset list and
      load-splitting across its members. The client-side can use RLOCs
      outside of the subset list if it determines that the subset list
      is unreachable (unless RLOCs are set to a Priority of 255).
      Some sharing of control exists: the server-side determines the
      destination RLOC list and load distribution while the client-side
      has the option of using alternatives to this list if RLOCs in the
      list are unreachable.

   o  Server-side sets weight of 0 for the RLOC subset list. In this
      case, the client-side can choose how the traffic load is spread
      across the subset list. Control is shared by the server-side
      determining the list and the client determining load
      distribution. Again, the client can use alternative RLOCs if the
      server-provided list of RLOCs is unreachable.

   o  Either side (more likely on the server-side ETR) decides not to
      send a Map-Request. For example, if the server-side ETR does not
      send Map-Requests, it gleans RLOCs from the client-side ITR,
      giving the client-side ITR responsibility for bidirectional RLOC
      reachability and preferability. Server-side ETR gleaning of the
      client-side ITR RLOC is done by caching the inner header source
      EID and the outer header source RLOC of received packets. The
      client-side ITR controls how traffic is returned and can
      alternate using an outer header source RLOC, which then can be
      added to the list the server-side ETR uses to return traffic.
      Since no Priority or Weights are provided using this method, the
      server-side ETR must assume each client-side ITR RLOC uses the
      same best Priority with a Weight of zero. In addition, since
      EID-prefix encoding cannot be conveyed in data packets, the
      EID-to-RLOC cache on tunnel routers can grow to be very large.

   o  A "gleaned" map-cache entry, one learned from the source RLOC of
      a received encapsulated packet, is only stored and used for a few
      seconds, pending verification. Verification is performed by
      sending a Map-Request to the source EID (the inner header IP
      source address) of the received encapsulated packet.
      A reply to this "verifying Map-Request" is used to fully populate
      the map-cache entry for the "gleaned" EID and is stored and used
      for the time indicated in the TTL field of a received Map-Reply.
      When a verified map-cache entry is stored, data gleaning no
      longer occurs for subsequent packets which have a source EID that
      matches the EID-prefix of the verified entry.

   RLOCs that appear in EID-to-RLOC Map-Reply messages are assumed to
   be reachable when the R-bit for the locator record is set to 1.
   Neither the information contained in a Map-Reply nor that stored in
   the mapping database system provides reachability information for
   RLOCs. Such reachability needs to be determined separately, using
   one or more of the Routing Locator Reachability Algorithms described
   in the next section.

6.3. Routing Locator Reachability

   Several mechanisms for determining RLOC reachability are currently
   defined:

   1.  An ETR may examine the Loc-Status-Bits in the LISP header of an
       encapsulated data packet received from an ITR. If the ETR is
       also acting as an ITR and has traffic to return to the original
       ITR site, it can use this status information to help select an
       RLOC.

   2.  An ITR may receive an ICMP Network or ICMP Host Unreachable
       message for an RLOC it is using. This indicates that the RLOC
       is likely down.

   3.  An ITR which participates in the global routing system can
       determine that an RLOC is down if no BGP RIB route exists that
       matches the RLOC IP address.

   4.  An ITR may receive an ICMP Port Unreachable message from a
       destination host. This occurs if an ITR attempts to use
       interworking [INTERWORK] and LISP-encapsulated data is sent to a
       non-LISP-capable site.

   5.  An ITR may receive a Map-Reply from an ETR in response to a
       previously sent Map-Request.
       The RLOC source of the Map-Reply is likely up since the ETR was
       able to send the Map-Reply to the ITR.

   6.  When an ETR receives an encapsulated packet from an ITR, the
       source RLOC from the outer header of the packet is likely up.

   7.  An ITR/ETR pair can use the Locator Reachability Algorithms
       described in this section, namely Echo-Noncing or RLOC-Probing.

   When determining Locator up/down reachability by examining the
   Loc-Status-Bits from the LISP encapsulated data packet, an ETR will
   receive up-to-date status from an encapsulating ITR about
   reachability for all ETRs at the site. CE-based ITRs at the source
   site can determine reachability relative to each other using the
   site IGP as follows:

   o  Under normal circumstances, each ITR will advertise a default
      route into the site IGP.

   o  If an ITR fails or if the upstream link to its PE fails, its
      default route will either time-out or be withdrawn.

   Each ITR can thus observe the presence or lack of a default route
   originated by the others to determine the Locator Status Bits it
   sets for them.

   RLOCs listed in a Map-Reply are numbered with ordinals 0 to n-1.
   The Loc-Status-Bits in a LISP encapsulated packet are numbered from
   0 to n-1 starting with the least significant bit. For example, if
   an RLOC listed in the 3rd position of the Map-Reply goes down
   (ordinal value 2), then all ITRs at the site will clear the 3rd
   least significant bit (xxxx x0xx) of the Loc-Status-Bits field for
   the packets they encapsulate.

   When an ETR decapsulates a packet, it will check for any change in
   the Loc-Status-Bits field. When a bit goes from 1 to 0, the ETR
   will refrain from encapsulating packets to an RLOC that is indicated
   as down. It will only resume using that RLOC if the corresponding
   Loc-Status-Bit returns to a value of 1.
   Loc-Status-Bits are associated with a locator-set per EID-prefix.
   Therefore, when a locator becomes unreachable, the Loc-Status-Bit
   that corresponds to that locator's position in the list returned by
   the last Map-Reply will be set to zero for that particular
   EID-prefix.

   When ITRs at the site are not deployed in CE routers, the IGP can
   still be used to determine the reachability of Locators provided
   they are injected into the IGP. This is typically done when a /32
   address is configured on a loopback interface.

   When ITRs receive ICMP Network or Host Unreachable messages as a
   method to determine unreachability, they will refrain from using
   Locators which are described in Locator lists of Map-Replies.
   However, using this approach is unreliable because many network
   operators turn off generation of ICMP Unreachable messages.

   If an ITR does receive an ICMP Network or Host Unreachable message,
   it MAY originate its own ICMP Unreachable message destined for the
   host that originated the data packet the ITR encapsulated.

   Also, BGP-enabled ITRs can unilaterally examine the BGP RIB to see
   if a locator address from a locator-set in a mapping entry matches a
   prefix. If it does not find one and BGP is running in the Default
   Free Zone (DFZ), it can decide to not use the locator even though
   the Loc-Status-Bits indicate the locator is up. In this case, the
   path from the ITR to the ETR that is assigned the locator is not
   available. More details are in [LOC-ID-ARCH].

   Optionally, an ITR can send a Map-Request to a Locator and if a
   Map-Reply is returned, reachability of the Locator has been
   determined. Obviously, sending such probes increases the number of
   control messages originated by tunnel routers for active flows, so
   Locators are assumed to be reachable when they are advertised.

   This assumption does create a dependency: Locator unreachability is
   detected by the receipt of ICMP Host Unreachable messages. When a
   Locator has been determined to be unreachable, it is not used for
   active traffic; this is the same as if it were listed in a Map-Reply
   with priority 255.

   The ITR can test the reachability of the unreachable Locator by
   sending periodic Requests. Both Requests and Replies MUST be
   rate-limited. Locator reachability testing is never done with data
   packets since that increases the risk of packet loss for end-to-end
   sessions.

   When an ETR decapsulates a packet, it knows that it is reachable
   from the encapsulating ITR because that is how the packet arrived.
   In most cases, the ETR can also reach the ITR but cannot assume this
   to be true due to the possibility of path asymmetry. In the
   presence of unidirectional traffic flow from an ITR to an ETR, the
   ITR should not use the lack of return traffic as an indication that
   the ETR is unreachable. Instead, it must use an alternate
   mechanism to determine reachability.

6.3.1. Echo Nonce Algorithm

   When there is bidirectional data flow between a pair of locators, a
   simple mechanism called "nonce echoing" can be used to determine
   reachability between an ITR and ETR. When an ITR wants to solicit a
   nonce echo, it sets the N and E bits and places a 24-bit nonce in
   the LISP header of the next encapsulated data packet.

   When this packet is received by the ETR, the encapsulated packet is
   forwarded as normal. When the ETR next sends a data packet to the
   ITR, it includes the nonce received earlier with the N bit set and E
   bit cleared. The ITR sees this "echoed nonce" and knows the path to
   and from the ETR is up.

   The ITR will set the E-bit and N-bit for every packet it sends while
   in echo-nonce-request state.
   The time the ITR waits to process the echoed nonce before it
   determines the path is unreachable is variable and a choice left for
   the implementation.

   If the ITR is receiving packets from the ETR but does not see the
   nonce echoed while being in echo-nonce-request state, then the path
   to the ETR is unreachable. This decision may be overridden by other
   locator reachability algorithms. Once the ITR determines the path
   to the ETR is down, it can switch to another locator for that
   EID-prefix.

   Note that "ITR" and "ETR" are relative terms here. Both devices
   must be implementing both ITR and ETR functionality for the echo
   nonce mechanism to operate.

   The ITR and ETR may both go into echo-nonce-request state at the
   same time. The number of packets sent or the time during which echo
   nonce requests are sent is an implementation-specific setting.
   However, when an ITR is in echo-nonce-request state, it can echo the
   ETR's nonce in the next set of packets that it encapsulates and then
   subsequently continue sending echo-nonce-request packets.

   This mechanism does not completely solve the forward path
   reachability problem as traffic may be unidirectional. That is, the
   ETR receiving traffic at a site may not be the same device as an ITR
   which transmits traffic from that site, or the site-to-site traffic
   is unidirectional so there is no ITR returning traffic.

   The echo-nonce algorithm is bilateral. That is, if one side sets
   the E-bit and the other side is not enabled for echo-noncing, then
   the echoing of the nonce does not occur and the requesting side may
   erroneously regard the locator as unreachable. An ITR should only
   set the E-bit in an encapsulated data packet when it knows the ETR
   is enabled for echo-noncing. This is conveyed by the E-bit in the
   Map-Reply message.
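
   As an illustration only, the nonce-echoing exchange described in
   this section can be sketched as a small per-peer state machine. The
   names LispHeader, EchoNoncePeer, and their methods are hypothetical
   and not defined by this specification; timers, data payloads, and
   the remaining LISP header fields are deliberately omitted.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class LispHeader:
    """Hypothetical model holding only the fields used by nonce echoing."""
    n_bit: bool = False
    e_bit: bool = False
    nonce: int = 0  # 24-bit value, meaningful only when n_bit is set


class EchoNoncePeer:
    """One xTR acting as both ITR and ETR toward a single remote locator."""

    def __init__(self) -> None:
        self.requested_nonce: Optional[int] = None  # nonce we asked the peer to echo
        self.nonce_to_echo: Optional[int] = None    # nonce the peer asked us to echo
        self.path_verified = False

    def start_echo_request(self) -> None:
        """Enter echo-nonce-request state with a fresh 24-bit nonce."""
        self.requested_nonce = secrets.randbits(24)
        self.path_verified = False

    def encapsulate(self) -> LispHeader:
        """Build the LISP header for the next data packet sent to the peer."""
        if self.nonce_to_echo is not None:
            # Echo the peer's nonce: N bit set, E bit cleared.
            header = LispHeader(n_bit=True, e_bit=False, nonce=self.nonce_to_echo)
            self.nonce_to_echo = None
            return header
        if self.requested_nonce is not None:
            # Still in echo-nonce-request state: N and E bits set.
            return LispHeader(n_bit=True, e_bit=True, nonce=self.requested_nonce)
        return LispHeader()

    def decapsulate(self, header: LispHeader) -> None:
        """Process the LISP header of a data packet received from the peer."""
        if not header.n_bit:
            return
        if header.e_bit:
            # The peer solicits an echo; remember its nonce for our next packet.
            self.nonce_to_echo = header.nonce
        elif header.nonce == self.requested_nonce:
            # Our own nonce came back: the path to and from the peer is up.
            self.path_verified = True
            self.requested_nonce = None
```

   In this sketch an echo owed to the peer takes precedence over the
   local request, after which the xTR resumes sending its own
   echo-nonce-request packets. How long to wait before declaring the
   path down is left out, since the text leaves that choice to the
   implementation.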

   Note that other locator reachability mechanisms are being researched
   and can be used to complement or even override the Echo Nonce
   Algorithm. See the next section for an example of control-plane
   probing.

6.3.2. RLOC Probing Algorithm

   RLOC Probing is a method that an ITR or PTR can use to determine the
   reachability status of one or more locators that it has cached in a
   map-cache entry. The P-bit (Probe Bit) of the Map-Request and
   Map-Reply messages is used for RLOC Probing.

   RLOC probing is done in the control-plane on a timer basis where an
   ITR or PTR will originate a Map-Request destined to a locator
   address from one of its own locator addresses. A Map-Request used
   as an RLOC-probe is NOT encapsulated and NOT sent to a Map-Server or
   on the ALT as it would be when soliciting mapping data. The EID
   record encoded in the Map-Request is the EID-prefix of the map-cache
   entry cached by the ITR or PTR. The ITR or PTR may include a
   mapping data record for its own database mapping information.

   When an ETR receives a Map-Request message with the P-bit set, it
   returns a Map-Reply with the P-bit set. The source address of the
   Map-Reply is set from the destination address of the Map-Request and
   the destination address of the Map-Reply is set from the source
   address of the Map-Request. The Map-Reply should contain mapping
   data for the EID-prefix contained in the Map-Request. This provides
   the opportunity for the ITR or PTR which sent the RLOC-probe to get
   mapping updates if there were changes to the ETR's database mapping
   entries.

   There are advantages and disadvantages of RLOC Probing.
   The greatest benefit of RLOC Probing is that it can handle many
   failure scenarios, allowing the ITR to determine when the path to a
   specific locator is reachable or has become unreachable, thus
   providing a robust mechanism for switching to using another locator
   from the cached locator. RLOC Probing can also provide RTT
   estimates between a pair of locators which can be useful for network
   management purposes as well as for selecting low delay paths. The
   major disadvantage of RLOC Probing is in the number of control
   messages required and the amount of bandwidth used to obtain those
   benefits, especially if the required failure detection times are
   very small.

   Continued research and testing will attempt to characterize the
   tradeoffs of failure detection times versus message overhead.

6.4. Routing Locator Hashing

   When an ETR provides an EID-to-RLOC mapping in a Map-Reply message
   to a requesting ITR, the locator-set for the EID-prefix may contain
   different priority values for each locator address. When more than
   one best priority locator exists, the ITR can decide how to load
   share traffic against the corresponding locators.

   The following hash algorithm may be used by an ITR to select a
   locator for a packet destined to an EID for the EID-to-RLOC mapping:

   1.  Either a source and destination address hash can be used or the
       traditional 5-tuple hash which includes the source and
       destination addresses, source and destination TCP, UDP, or SCTP
       port numbers and the IP protocol number field or IPv6
       next-protocol fields of a packet a host originates from within a
       LISP site. When a packet is not a TCP, UDP, or SCTP packet,
       only the source and destination addresses from the header are
       used to compute the hash.

   2.  Take the hash value and divide it by the number of locators
       stored in the locator-set for the EID-to-RLOC mapping.

   3.  The remainder will yield a value of 0 to "number of locators
       minus 1". Use the remainder to select the locator in the
       locator-set.

   Note that when a packet is LISP encapsulated, the source port number
   in the outer UDP header needs to be set. Selecting a random value
   allows core routers which are attached to Link Aggregation Groups
   (LAGs) to load-split the encapsulated packets across member links of
   such LAGs. Otherwise, core routers would see a single flow, since
   packets have a source address of the ITR, for packets which are
   originated by different EIDs at the source site. A suggested
   setting for the source port number computed by an ITR is a 5-tuple
   hash function on the inner header, as described above.

   Many core router implementations use a 5-tuple hash to decide how to
   balance packet load across members of a LAG. The 5-tuple hash
   includes the source and destination addresses of the packet and the
   source and destination ports when the protocol number in the packet
   is TCP or UDP. For this reason, UDP encoding is used for LISP
   encapsulation.

6.5. Changing the Contents of EID-to-RLOC Mappings

   Since the LISP architecture uses a caching scheme to retrieve and
   store EID-to-RLOC mappings, the only way an ITR can get a more
   up-to-date mapping is to re-request the mapping. However, the ITRs
   do not know when the mappings change and the ETRs do not keep track
   of who requested their mappings. For scalability reasons, we want
   to maintain this approach but need to provide a way for ETRs to
   change their mappings and inform the sites that are currently
   communicating with the ETR site using such mappings.

   When a locator record is added to the end of a locator-set, it is
   easy to update mappings.
We assume new mappings will maintain the same locator ordering as the old mapping but just have new locators appended to the end of the list. So some ITRs can have a new mapping while other ITRs have only an old mapping that is used until it times out. When an ITR has only an old mapping but detects bits set in the loc-status-bits that correspond to locators beyond the list it has cached, it simply ignores them.

When a locator record is removed from a locator-set, ITRs that have the mapping cached will not use the removed locator because the xTRs will set the loc-status-bit to 0. So even if the locator is in the list, it will not be used. For new mapping requests, the xTRs can set the locator address to 0 as well as setting the corresponding loc-status-bit to 0. This forces ITRs with old or new mappings to avoid using the removed locator.

If many changes occur to a mapping over a long period of time, one will find empty record slots in the middle of the locator-set and new records appended to the locator-set. At some point, it would be useful to compact the locator-set so the loc-status-bit settings can be efficiently packed.

We propose here two approaches for locator-set compaction, one operational and the other a protocol mechanism. The operational approach uses a clock sweep method. The protocol approach uses the concept of Solicit-Map-Requests.

6.5.1.  Clock Sweep

The clock sweep approach uses planning in advance and the use of count-down TTLs to time out mappings that have already been cached. The default setting for an EID-to-RLOC mapping TTL is 24 hours. So there is a 24 hour window to time out old mappings. The following clock sweep procedure is used:

1.  24 hours before a mapping change is to take effect, a network administrator configures the ETRs at a site to start the clock sweep window.

2.  During the clock sweep window, ETRs continue to send Map-Reply messages with the current (unchanged) mapping records. The TTL for these mappings is set to 1 hour.

3.  24 hours later, all previous cache entries will have timed out, and any active cache entries will time out within 1 hour. During this 1 hour window the ETRs continue to send Map-Reply messages with the current (unchanged) mapping records with the TTL set to 1 minute.

4.  At the end of the 1 hour window, the ETRs will send Map-Reply messages with the new (changed) mapping records. So any active caches can get the new mapping contents right away if not cached, or in 1 minute if they had the mapping cached.

6.5.2.  Solicit-Map-Request (SMR)

Soliciting a Map-Request is a selective way for xTRs, at the site where mappings change, to control the rate at which they receive requests for Map-Reply messages. SMRs are also used to tell remote ITRs to update the mappings they have cached.

Since the xTRs don't keep track of remote ITRs that have cached their mappings, they cannot tell exactly who needs the new mapping entries. So an xTR will solicit Map-Requests from sites it is currently sending encapsulated data to, and only from those sites. The xTRs can locally decide the algorithm for how often and to how many sites they send SMR messages.

An SMR message is simply a bit set in a Map-Request message. An ITR or PTR will send a Map-Request when it receives an SMR message. Both the SMR sender and the Map-Request responder must rate-limit these messages.

The following procedure shows how an SMR exchange occurs when a site is doing locator-set compaction for an EID-to-RLOC mapping:

1.  When the database mappings in an ETR change, the ETRs at the site begin to send Map-Requests with the SMR bit set for each locator in each map-cache entry the ETR caches.

2.  A remote xTR which receives the SMR message will schedule sending a Map-Request message to the source locator address of the SMR message. A newly allocated random nonce is selected and the EID-prefix used is the one copied from the SMR message.

3.  The remote xTR retransmits the Map-Request slowly until it gets a Map-Reply while continuing to use the cached mapping.

4.  The ETRs at the site with the changed mapping will reply to the Map-Request with a Map-Reply message provided the Map-Request nonce matches the nonce from the SMR. The Map-Reply messages SHOULD be rate limited. This is important to avoid Map-Reply implosion.

5.  The ETRs, at the site with the changed mapping, record the fact that the site that sent the Map-Request has received the new mapping data in the mapping cache entry for the remote site so the loc-status-bits are reflective of the new mapping for packets going to the remote site. The ETR then stops sending SMR messages.

For security reasons an ITR MUST NOT process unsolicited Map-Replies. The nonce MUST be carried from the SMR packet, into the resultant Map-Request, and then into the Map-Reply to reduce spoofing attacks.

To avoid map-cache entry corruption by a third party, a sender of an SMR-based Map-Request must be verified. If an ITR receives an SMR-based Map-Request and the source is not in the locator-set for the stored map-cache entry, then the responding Map-Request MUST be sent with an EID destination to the mapping database system. Since the mapping database system is a more secure way to reach an authoritative ETR, it will deliver the Map-Request to the authoritative source of the mapping data.

7.  Router Performance Considerations

LISP is designed to be very hardware-based forwarding friendly.
By doing tunnel header prepending [RFC1955] and stripping instead of re-writing addresses, existing hardware can support the forwarding model with little or no modification. Where modifications are required, they should be limited to re-programming existing hardware rather than requiring expensive design changes to hard-coded algorithms in silicon.

A few implementation techniques can be used to incrementally implement LISP:

o  When a tunnel encapsulated packet is received by an ETR, the outer destination address may not be the address of the router. This makes it challenging for the control plane to get packets from the hardware. This may be mitigated by creating special FIB entries for the EID-prefixes of EIDs served by the ETR (those for which the router provides an RLOC translation). These FIB entries are marked with a flag indicating that control plane processing should be performed. Forwarding logic that tests for a particular IP protocol number value is not necessary. No changes to existing, deployed hardware should be needed to support this.

o  On an ITR, prepending a new IP header is as simple as adding more bytes to a MAC rewrite string and prepending the string as part of the outgoing encapsulation procedure. Many routers that support GRE tunneling [RFC2784] or 6to4 tunneling [RFC3056] can already support this action.

o  When a received packet's outer destination address contains an EID which is not intended to be forwarded on the routable topology (i.e. LISP 1.5), the source address of a data packet or the router interface with which the source is associated (the interface from which it was received) can be associated with a VRF (Virtual Routing/Forwarding), in which a different (i.e. non-congruent) topology can be used to find EID-to-RLOC mappings.

8.  Deployment Scenarios

This section will explore how and where ITRs and ETRs can be deployed and will discuss the pros and cons of each deployment scenario. There are two basic deployment trade-offs to consider: centralized versus distributed caches and flat, recursive, or re-encapsulating tunneling.

When deciding on centralized versus distributed caching, the following issues should be considered:

o  Are the tunnel routers spread out so that the caches are spread across all the memories of each router?

o  Should management "touch points" be minimized by choosing few tunnel routers, just enough for redundancy?

o  In general, using more ITRs doesn't increase management load, since caches are built and stored dynamically. On the other hand, more ETRs do require more management since EID-prefix-to-RLOC mappings need to be explicitly configured.

When deciding on flat, recursive, or re-encapsulation tunneling, the following issues should be considered:

o  Flat tunneling implements a single tunnel between source site and destination site. This generally offers better paths between sources and destinations with a single tunnel path.

o  Recursive tunneling is when tunneled traffic is again further encapsulated in another tunnel, either to implement VPNs or to perform Traffic Engineering. When doing VPN-based tunneling, the site has some control since the site is prepending a new tunnel header. In the case of TE-based tunneling, the site may have control if it is prepending a new tunnel header, but if the site's ISP is doing the TE, then the site has no control. Recursive tunneling generally will result in suboptimal paths but with the benefit of steering traffic to parts of the network that have resources available.

o  The technique of re-encapsulation ensures that packets only require one tunnel header.
So if a packet needs to be rerouted, it is first decapsulated by the ETR and then re-encapsulated with a new tunnel header using a new RLOC.

The next sub-sections will describe where tunnel routers can reside in the network.

8.1.  First-hop/Last-hop Tunnel Routers

By locating tunnel routers close to hosts, the EID-prefix set is at the granularity of an IP subnet. So at the expense of more EID-prefix-to-RLOC sets for the site, the caches in each tunnel router can remain relatively small. But caches always depend on the number of non-aggregated EID destination flows active through these tunnel routers.

With more tunnel routers doing encapsulation, control traffic grows as well: since the EID-granularity is greater, more Map-Requests and Map-Replies are traveling between more routers.

The advantage of placing the caches and databases at these stub routers is that the products deployed in this part of the network have better price-memory ratios than their core router counterparts. Memory is typically less expensive in these devices and fewer routes are stored (only IGP routes). These devices tend to have excess capacity, both for forwarding and routing state.

LISP functionality can also be deployed in edge switches. These devices generally have layer-2 ports facing hosts and layer-3 ports facing the Internet. Spare capacity is often available in these devices as well.

8.2.  Border/Edge Tunnel Routers

Using customer-edge (CE) routers for tunnel endpoints allows the EID space associated with a site to be reachable via a small set of RLOCs assigned to the CE routers for that site.

This offers the opposite benefit of the first-hop/last-hop tunnel router scenario: the number of mapping entries and network management touch points are reduced, allowing better scaling.
One disadvantage is that less of the network's resources are used to reach host endpoints, thereby centralizing the point-of-failure domain and creating network choke points at the CE router.

Note that more than one CE router at a site can be configured with the same IP address. In this case an RLOC is an anycast address. This allows resilience between the CE routers. That is, if a CE router fails, traffic is automatically routed to the other routers using the same anycast address. However, this comes with the disadvantage that the site cannot control the entrance point when the anycast route is advertised out from all border routers.

8.3.  ISP Provider-Edge (PE) Tunnel Routers

Use of ISP PE routers as tunnel endpoint routers gives an ISP control over the location of the egress tunnel endpoints. That is, the ISP can decide if the tunnel endpoints are in the destination site (in either CE routers or last-hop routers within a site) or at other PE edges. The advantage of this case is that two or more tunnel headers can be avoided. By having the PE be the first router on the path to encapsulate, it can choose a TE path first, and the ETR can decapsulate and re-encapsulate for a tunnel to the destination end site.

An obvious disadvantage is that the end site has no control over where its packets flow or the RLOCs used.

As mentioned in earlier sections, a combination of these scenarios is possible at the expense of extra packet header overhead. If both site and provider want control, then recursive or re-encapsulating tunnels are used.

9.  Traceroute Considerations

When a source host in a LISP site initiates a traceroute to a destination host in another LISP site, it is highly desirable for it to see the entire path. Since packets are encapsulated from ITR to ETR, the hop across the tunnel could be viewed as a single hop.
However, LISP traceroute will provide the entire path so the user can see 3 distinct segments of the path from a source LISP host to a destination LISP host:

Segment 1 (in source LISP site based on EIDs):

   source-host ---> first-hop ... next-hop ---> ITR

Segment 2 (in the core network based on RLOCs):

   ITR ---> next-hop ... next-hop ---> ETR

Segment 3 (in the destination LISP site based on EIDs):

   ETR ---> next-hop ... last-hop ---> destination-host

For segment 1 of the path, ICMP Time Exceeded messages are returned in the normal manner as they are today. The ITR performs a TTL decrement and test for 0 before encapsulating. So the ITR hop is seen by the traceroute source as having an EID address (the address of the site-facing interface).

For segment 2 of the path, ICMP Time Exceeded messages are returned to the ITR because the TTL decrement to 0 is done on the outer header, so the destination of the ICMP messages is the ITR RLOC address, the source RLOC address of the encapsulated traceroute packet. The ITR looks inside of the ICMP payload to inspect the traceroute source so it can return the ICMP message to the address of the traceroute client as well as retaining the core router IP address in the ICMP message. This is so the traceroute client can display the core router address (the RLOC address) in the traceroute output. The ETR returns its RLOC address and responds to the TTL decrement to 0 like the previous core routers did.

For segment 3, the next-hop router downstream from the ETR will be decrementing the TTL for the packet that was encapsulated, sent into the core, decapsulated by the ETR, and forwarded because it isn't the final destination. If the TTL is decremented to 0, any router on the path to the destination of the traceroute, including the next-hop router or destination, will send an ICMP Time Exceeded message to the source EID of the traceroute client. The ICMP message will be encapsulated by the local ITR and sent back to the ETR in the site that originated the traceroute, where the packet will be delivered to the host.

9.1.  IPv6 Traceroute

IPv6 traceroute follows the procedure described above since the entire traceroute data packet is included in the ICMP Time Exceeded message payload. Therefore, only the ITR needs to pay special attention to forwarding ICMP messages back to the traceroute source.

9.2.  IPv4 Traceroute

For IPv4 traceroute, we cannot follow the above procedure since IPv4 ICMP Time Exceeded messages only include the invoking IP header and 8 bytes that follow the IP header. Therefore, when a core router sends an IPv4 Time Exceeded message to an ITR, all the ITR has in the ICMP payload is the encapsulated header it prepended followed by a UDP header. The original invoking IP header, and therefore the identity of the traceroute source, is lost.

The solution we propose to solve this problem is to cache traceroute IPv4 headers in the ITR and to match them up with corresponding IPv4 Time Exceeded messages received from core routers and the ETR. The ITR will use a circular buffer for caching the IPv4 and UDP headers of traceroute packets. It will select a 16-bit number as a key to find them later when the IPv4 Time Exceeded messages are received. When an ITR encapsulates an IPv4 traceroute packet, it will use the 16-bit number as the UDP source port in the encapsulating header.
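
The header cache can be sketched as follows. This is only an illustration under stated assumptions, not part of the specification: the names TracerouteHeaderCache, store, and lookup are hypothetical, the buffer size is arbitrary, and the 16-bit key is taken from a simple wrapping counter because this document does not say how the key is selected.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CachedHeaders:
    key: int           # 16-bit value, also used as the outer UDP source port
    inner_ipv4: bytes  # IPv4 header of the original traceroute packet
    inner_udp: bytes   # UDP header of the original traceroute packet


class TracerouteHeaderCache:
    """Circular buffer of inner headers, found again by a 16-bit key."""

    def __init__(self, size: int = 256) -> None:
        # Buffer size is an assumption; the text does not specify one.
        self._slots: List[Optional[CachedHeaders]] = [None] * size
        self._next_key = 0

    def store(self, inner_ipv4: bytes, inner_udp: bytes) -> int:
        """Cache the headers; return the key to use as outer UDP source port."""
        key = self._next_key
        self._next_key = (key + 1) & 0xFFFF  # keep the key within 16 bits
        # The oldest entry is overwritten once the buffer wraps around.
        self._slots[key % len(self._slots)] = CachedHeaders(key, inner_ipv4, inner_udp)
        return key

    def lookup(self, outer_udp_source_port: int) -> Optional[CachedHeaders]:
        """Find the headers for a source port echoed in a Time Exceeded payload."""
        entry = self._slots[outer_udp_source_port % len(self._slots)]
        if entry is not None and entry.key == outer_udp_source_port:
            return entry
        return None  # never cached, or already overwritten


# Usage: cache on encapsulation, look up when the ICMP message comes back.
cache = TracerouteHeaderCache(size=4)
port = cache.store(b"inner-ipv4-header", b"inner-udp-header")
hit = cache.lookup(port)
```

Because the buffer is circular, a Time Exceeded message that arrives after its slot has been reused simply fails the lookup; in this sketch the ITR would then have no cached headers to restore for that message.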
When the ICMP Time Exceeded message is returned to the ITR, the UDP header of the encapsulating header is present in the ICMP payload thereby allowing the ITR to find the cached headers for the traceroute source. The ITR puts the cached headers in the payload and sends the ICMP Time Exceeded message to the traceroute source retaining the source address of the original ICMP Time Exceeded message (a core router or the ETR of the site of the traceroute destination).

9.3.  Traceroute using Mixed Locators

When either an IPv4 traceroute or IPv6 traceroute is originated and the ITR encapsulates it in the other address family header, you cannot get all 3 segments of the traceroute. Segment 2 of the traceroute cannot be conveyed to the traceroute source since it is expecting addresses from intermediate hops in the same address format for the type of traceroute it originated. Therefore, in this case, segment 2 will make the tunnel look like one hop. All the ITR has to do to make this work is to not copy the inner TTL to the outer, encapsulating header's TTL when a traceroute packet is encapsulated using an RLOC from a different address family. This will cause no TTL decrement to 0 to occur in core routers between the ITR and ETR.

10.  Mobility Considerations

There are several kinds of mobility of which only some might be of concern to LISP. Essentially they are as follows.

10.1.  Site Mobility

A site wishes to change its attachment points to the Internet, and its LISP Tunnel Routers will have new RLOCs when it changes upstream providers. Changes in EID-RLOC mappings for sites are expected to be handled by configuration, outside of the LISP protocol.

10.2.  Slow Endpoint Mobility

An individual endpoint wishes to move, but is not concerned about maintaining session continuity. Renumbering is involved.
LISP can help with the issues surrounding renumbering [RFC4192] [LISA96] by decoupling the address space used by a site from the address spaces used by its ISPs [RFC4984].

10.3.  Fast Endpoint Mobility

Fast endpoint mobility occurs when an endpoint moves relatively rapidly, changing its IP layer network attachment point. Maintenance of session continuity is a goal. This is where the Mobile IPv4 [RFC3344bis] and Mobile IPv6 [RFC3775] [RFC4866] mechanisms are used, and primarily where interactions with LISP need to be explored.

The problem is that as an endpoint moves, it may require changes to the mapping between its EID and a set of RLOCs for its new network location. When this is added to the overhead of mobile IP binding updates, some packets might be delayed or dropped.

In IPv4 mobility, when an endpoint is away from home, packets to it are encapsulated and forwarded via a home agent which resides in the home area the endpoint's address belongs to. The home agent will encapsulate and forward packets either directly to the endpoint or to a foreign agent which resides where the endpoint has moved to. Packets from the endpoint may be sent directly to the correspondent node, may be sent via the foreign agent, or may be reverse-tunneled back to the home agent for delivery to the mobile node. As the mobile node's EID or available RLOC changes, LISP EID-to-RLOC mappings are required for communication between the mobile node and the home agent, whether via foreign agent or not. As a mobile endpoint changes networks, up to three LISP mapping changes may be required:

o  The mobile node moves from an old location to a new visited network location and notifies its home agent that it has done so.
The Mobile IPv4 control packets the mobile node sends pass through one of the new visited network's ITRs, which needs an EID-RLOC mapping for the home agent.

o  The home agent might not have the EID-RLOC mappings for the mobile node's "care-of" address or its foreign agent in the new visited network, in which case it will need to acquire them.

o  When packets are sent directly to the correspondent node, it may be that no traffic has been sent from the new visited network to the correspondent node's network, and the new visited network's ITR will need to obtain an EID-RLOC mapping for the correspondent node's site.

In addition, if the IPv4 endpoint is sending packets from the new visited network using its original EID, then LISP will need to perform a route-returnability check on the new EID-RLOC mapping for that EID.

In IPv6 mobility, packets can flow directly between the mobile node and the correspondent node in either direction. The mobile node uses its "care-of" address (EID). In this case, the route-returnability check would not be needed but one more LISP mapping lookup may be required instead:

o  As above, three mapping changes may be needed for the mobile node to communicate with its home agent and to send packets to the correspondent node.

o  In addition, another mapping will be needed in the correspondent node's ITR, in order for the correspondent node to send packets to the mobile node's "care-of" address (EID) at the new network location.

When both endpoints are mobile the number of potential mapping lookups increases accordingly.

As a mobile node moves there are not only mobility state changes in the mobile node, correspondent node, and home agent, but also state changes in the ITRs and ETRs for at least some EID-prefixes.
The goal is to support rapid adaptation, with little delay or packet loss for the entire system. Heuristics can be added to LISP to reduce the number of mapping changes required and to reduce the delay per mapping change. Also, IP mobility can be modified to require fewer mapping changes. In order to increase overall system performance, there may be a need to reduce the optimization of one area in order to place fewer demands on another.

In LISP, one possibility is to "glean" information. When a packet arrives, the ETR could examine the EID-RLOC mapping and use that mapping for all outgoing traffic to that EID. It can do this after performing a route-returnability check, to ensure that the new network location does have an internal route to that endpoint. However, this does not cover the case where an ITR (the node assigned the RLOC) at the mobile-node location has been compromised.

Mobile IP packet exchange is designed for an environment in which all routing information is disseminated before packets can be forwarded. In order to allow the Internet to grow to support expected future use, we are moving to an environment where some information may have to be obtained after packets are in flight. Modifications to IP mobility should be considered in order to optimize the behavior of the overall system. Anything which decreases the number of new EID-RLOC mappings needed when a node moves, or maintains the validity of an EID-RLOC mapping for a longer time, is useful.

10.4.  Fast Network Mobility

In addition to endpoints, a network can be mobile, possibly changing xTRs. A "network" can be as small as a single router and as large as a whole site. This is different from site mobility in that it is fast and possibly short-lived, but different from endpoint mobility in that a whole prefix is changing RLOCs.
However, the mechanisms are the same and there is no new overhead in LISP. A map request for any endpoint will return a binding for the entire mobile prefix.

If mobile networks become a more common occurrence, it may be useful to revisit the design of the mapping service and allow for dynamic updates of the database.

The issue of interactions between mobility and LISP needs to be explored further. Specific improvements to the entire system will depend on the details of mapping mechanisms. Mapping mechanisms should be evaluated on how well they support session continuity for mobile nodes.

10.5.  LISP Mobile Node Mobility

A mobile device can use the LISP infrastructure to achieve mobility by implementing the LISP encapsulation and decapsulation functions and acting as a simple ITR/ETR. By doing this, such a "LISP mobile node" can use topologically-independent EID IP addresses that are not advertised into and do not impose a cost on the global routing system. These EIDs are maintained at the edges of the mapping system (in LISP Map-Servers and Map-Resolvers) and are provided on demand to only the correspondents of the LISP mobile node.

Refer to the LISP Mobility Architecture specification [LISP-MN] for more details.

11.  Multicast Considerations

A multicast group address, as defined in the original Internet architecture, is an identifier of a grouping of topologically independent receiver host locations. The address encoding itself does not determine the location of the receiver(s). The multicast routing protocol, and the network-based state the protocol creates, determines where the receivers are located.

In the context of LISP, a multicast group address is both an EID and a Routing Locator. Therefore, no specific semantic or action needs to be taken for a destination address, as it would appear in an IP header.
Therefore, a group address that appears in an inner IP header built by a source host will be used as the destination EID. The outer IP header (the destination Routing Locator address), prepended by a LISP router, will use the same group address as the destination Routing Locator.

Having said that, only the source EID and source Routing Locator need to be dealt with. Therefore, an ITR merely needs to put its own IP address in the source Routing Locator field when prepending the outer IP header. This source Routing Locator address, like any other Routing Locator address, MUST be globally routable.

Therefore, an EID-to-RLOC mapping does not need to be performed by an ITR when a received data packet is a multicast data packet or when processing a source-specific Join (either by IGMPv3 or PIM). But the source Routing Locator is decided by the multicast routing protocol in a receiver site. That is, an EID to Routing Locator translation is done at control-time.

Another approach is to have the ITR not encapsulate a multicast packet and allow the host built packet to flow into the core even if the source address is allocated out of the EID namespace. If the RPF-Vector TLV [RPFV] is used by PIM in the core, then core routers can RPF to the ITR (the Locator address which is injected into core routing) rather than the host source address (the EID address which is not injected into core routing).

To avoid any EID-based multicast state in the network core, the first approach is chosen for LISP-Multicast. Details for LISP-Multicast and Interworking with non-LISP sites are described in specification [MLISP].

12.  Security Considerations

It is believed that most of the security mechanisms will be part of the mapping database service when using control plane procedures for obtaining EID-to-RLOC mappings.
For data plane triggered mappings, as described in this specification, protection is provided against ETR spoofing by using Return-Routability mechanisms evidenced by the use of a 24-bit Nonce field in the LISP encapsulation header and a 64-bit Nonce field in the LISP control message. The nonce, coupled with the ITR accepting only solicited Map-Replies, goes a long way toward providing decent authentication.

LISP does not rely on a PKI or a more heavyweight authentication system. These systems challenge the scalability of LISP, which was a primary design goal.

DoS attack prevention will depend on implementations rate-limiting Map-Requests and Map-Replies to the control plane as well as rate-limiting the number of data-triggered Map-Replies.

To deal with map-cache exhaustion attempts in an ITR/PTR, the implementation should consider putting a maximum cap on the number of entries stored with a reserve list for special or frequently accessed sites. This should be a configuration policy control set by the network administrator who manages ITRs and PTRs.

13.  Prototype Plans and Status

The operator community has requested that the IETF take a practical approach to solving the scaling problems associated with global routing state growth. This document offers a simple solution which is intended for use in a pilot program to gain experience in working on this problem.

The authors hope that publishing this specification will allow the rapid implementation of multiple vendor prototypes and deployment on a small scale. Doing this will help the community:

o  Decide whether a new EID-to-RLOC mapping database infrastructure is needed or if a simple, UDP-based, data-triggered approach is flexible and robust enough.
o  Experiment with provider-independent assignment of EIDs while at the same time decreasing the size of DFZ routing tables through the use of topologically-aligned, provider-based RLOCs.

o  Determine whether multiple levels of tunneling can be used by ISPs to achieve their Traffic Engineering goals while simultaneously removing the more specific routes currently injected into the global routing system for this purpose.

o  Experiment with mobility to determine if both acceptable convergence and session continuity properties can be scalably implemented to support both individual device roaming and site service provider changes.

Here is a rough set of milestones:

1.  Interoperable implementations have been available since the beginning of 2009. We are trying to converge on a packet format so implementations can converge on the -04 and later drafts.

2.  Continue pilot deployment using LISP-ALT as the database mapping mechanism.

3.  Continue prototyping and studying other database lookup schemes, be it DNS, DHTs, CONS, ALT, NERD, or other mechanisms.

4.  Implement the LISP Multicast draft [MLISP].

5.  Implement the LISP Mobile Node draft [LISP-MN].

6.  Research more on how policy affects what gets returned in a Map-Reply from an ETR.

7.  Continue to experiment with mixed locator-sets to understand how LISP can help the IPv4 to IPv6 transition.

8.  Add more robustness to locator reachability between LISP sites.

As of this writing the following accomplishments have been achieved:

1.  A unit- and system-tested software switching implementation has been completed on cisco NX-OS for this draft for both IPv4 and IPv6 EIDs using a mixed locator-set of IPv4 and IPv6 locators.

2.  A unit- and system-tested software switching implementation on cisco NX-OS has been completed for draft [ALT].

3.  A unit- and system-tested software switching implementation on cisco NX-OS has been completed for draft [INTERWORK]. Support for IPv4 translation is provided and PTR support for IPv4 and IPv6 is provided.

4.  The cisco NX-OS implementation supports an experimental mechanism for slow mobility.

5.  Dave Meyer, Vince Fuller, Darrel Lewis, Greg Shepherd, and Andrew Partan continue to test all the features described above on a dual-stack infrastructure.

6.  Darrel Lewis and Dave Meyer have deployed both LISP translation and LISP PTR support in the pilot network. Point your browser to http://www.lisp4.net to see translation happening in action so your non-LISP site can access a web server in a LISP site.

7.  Soon http://www.lisp6.net will work where your IPv6 LISP site can talk to an IPv6 web server in a LISP site by using mixed address-family based locators.

8.  A public domain implementation of LISP is underway. See [OPENLISP] for details.

9.  We have deployed Map-Resolvers and Map-Servers on the LISP pilot network to gather experience with [LISP-MS]. The first layer of the architecture consists of the xTRs, which use Map-Servers for EID-prefix registration and Map-Resolvers for EID-to-RLOC mapping resolution. The second layer consists of the Map-Resolvers and Map-Servers, which connect to the ALT BGP peering infrastructure. The third layer consists of the ALT-routers, which aggregate EID-prefixes and forward Map-Requests.

10. A cisco IOS implementation is underway which currently supports IPv4 encapsulation and decapsulation features.

11. A LISP router based LIG implementation is supported, deployed, and used daily to debug and test the LISP pilot network. See [LIG] for details.

12. A Linux implementation of LIG has been made available and is supported by Dave Meyer.
It can be run on any Linux system 2512 which resides in either a LISP site or non-LISP site. See [LIG] 2513 for details. Public domain code can be downloaded from 2514 http://github.com/davidmeyer/lig/tree/master. 2516 13. An experimental implementation has been written for three 2517 locator reachability algorithms. Two are the Echo-Noncing and 2518 RLOC-Probing algorithms which are documented in this 2519 specification. The third is called TCP-counts which will be 2520 documented in future drafts. 2522 14. The LISP pilot network has been converted from using MD5 HMAC 2523 authentication for Map-Register messages to SHA-1 HMAC 2524 authentication. ETRs send with SHA-1, but Map-Servers can 2525 accept either algorithm for compatibility purposes. 2527 If you are interested in writing a LISP implementation, testing any of the 2528 LISP implementations, or joining the LISP pilot program, 2529 please contact lisp@ietf.org. 2531 14. References 2533 14.1. Normative References 2535 [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, 2536 August 1980. 2538 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 2539 November 1990. 2541 [RFC1498] Saltzer, J., "On the Naming and Binding of Network 2542 Destinations", RFC 1498, August 1993. 2544 [RFC1955] Hinden, R., "New Scheme for Internet Routing and 2545 Addressing (ENCAPS) for IPNG", RFC 1955, June 1996. 2547 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 2548 Requirement Levels", BCP 14, RFC 2119, March 1997. 2550 [RFC2404] Madson, C. and R. Glenn, "The Use of HMAC-SHA-1-96 within 2551 ESP and AH", RFC 2404, November 1998. 2553 [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an 2554 IANA Considerations Section in RFCs", BCP 26, RFC 2434, 2555 October 1998. 2557 [RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. 2558 Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, 2559 March 2000. 2561 [RFC3056] Carpenter, B. and K.
Moore, "Connection of IPv6 Domains 2562 via IPv4 Clouds", RFC 3056, February 2001. 2564 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 2565 of Explicit Congestion Notification (ECN) to IP", 2566 RFC 3168, September 2001. 2568 [RFC3775] Johnson, D., Perkins, C., and J. Arkko, "Mobility Support 2569 in IPv6", RFC 3775, June 2004. 2571 [RFC4086] Eastlake, D., Schiller, J., and S. Crocker, "Randomness 2572 Requirements for Security", BCP 106, RFC 4086, June 2005. 2574 [RFC4423] Moskowitz, R. and P. Nikander, "Host Identity Protocol 2575 (HIP) Architecture", RFC 4423, May 2006. 2577 [RFC4634] Eastlake, D. and T. Hansen, "US Secure Hash Algorithms 2578 (SHA and HMAC-SHA)", RFC 4634, July 2006. 2580 [RFC4866] Arkko, J., Vogt, C., and W. Haddad, "Enhanced Route 2581 Optimization for Mobile IPv6", RFC 4866, May 2007. 2583 [RFC4984] Meyer, D., Zhang, L., and K. Fall, "Report from the IAB 2584 Workshop on Routing and Addressing", RFC 4984, 2585 September 2007. 2587 [UDP-TUNNELS] 2588 Eubanks, M. and P. Chimento, "UDP Checksums for Tunneled 2589 Packets", draft-eubanks-chimento-6man-00.txt (work in 2590 progress), February 2009. 2592 14.2. Informative References 2594 [AFI] IANA, "Address Family Indicators (AFIs)", ADDRESS FAMILY 2595 NUMBERS http://www.iana.org/numbers.html, February 2007. 2597 [ALT] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "LISP 2598 Alternative Topology (LISP-ALT)", 2599 draft-ietf-lisp-alt-01.txt (work in progress), May 2009. 2601 [APT] Jen, D., Meisel, M., Massey, D., Wang, L., Zhang, B., and 2602 L. Zhang, "APT: A Practical Transit Mapping Service", 2603 draft-jen-apt-01.txt (work in progress), November 2007. 2605 [CHIAPPA] Chiappa, J., "Endpoints and Endpoint names: A Proposed 2606 Enhancement to the Internet Architecture", Internet- 2607 Draft http://www.chiappa.net/~jnc/tech/endpoints.txt, 2608 1999. 2610 [CONS] Farinacci, D., Fuller, V., and D.
Meyer, "LISP-CONS: A 2611 Content distribution Overlay Network Service for LISP", 2612 draft-meyer-lisp-cons-03.txt (work in progress), 2613 November 2007. 2615 [DHTs] Ratnasamy, S., Shenker, S., and I. Stoica, "Routing 2616 Algorithms for DHTs: Some Open Questions", PDF 2617 file http://www.cs.rice.edu/Conferences/IPTPS02/174.pdf. 2619 [EMACS] Brim, S., Farinacci, D., Meyer, D., and J. Curran, "EID 2620 Mappings Multicast Across Cooperating Systems for LISP", 2621 draft-curran-lisp-emacs-00.txt (work in progress), 2622 November 2007. 2624 [GSE] "GSE - An Alternate Addressing Architecture for IPv6", 2625 draft-ietf-ipngwg-gseaddr-00.txt (work in progress), 1997. 2627 [INTERWORK] 2628 Lewis, D., Meyer, D., Farinacci, D., and V. Fuller, 2629 "Interworking LISP with IPv4 and IPv6", 2630 draft-ietf-lisp-interworking-00.txt (work in progress), 2631 January 2009. 2633 [LIG] Farinacci, D. and D. Meyer, "LISP Internet Groper (LIG)", 2634 draft-farinacci-lisp-lig-01.txt (work in progress), 2635 May 2009. 2637 [LISA96] Lear, E., Katinsky, J., Coffin, J., and D. Tharp, 2638 "Renumbering: Threat or Menace?", Usenix , September 1996. 2640 [LISP-MAIN] 2641 Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, 2642 "Locator/ID Separation Protocol (LISP)", 2643 draft-farinacci-lisp-12.txt (work in progress), 2644 March 2009. 2646 [LISP-MN] Farinacci, D., Fuller, V., Lewis, D., and D. Meyer, "LISP 2647 Mobility Architecture", draft-meyer-lisp-mn-00.txt (work 2648 in progress), July 2009. 2650 [LISP-MS] Farinacci, D. and V. Fuller, "LISP Map Server", 2651 draft-ietf-lisp-ms-03.txt (work in progress), 2652 September 2009. 2654 [LISP1] Farinacci, D., Oran, D., Fuller, V., and J. Schiller, 2655 "Locator/ID Separation Protocol (LISP1) [Routable ID 2656 Version]", 2657 Slide-set http://www.dinof.net/~dino/ietf/lisp1.ppt, 2658 October 2006. 2660 [LISP2] Farinacci, D., Oran, D., Fuller, V., and J. 
Schiller, 2661 "Locator/ID Separation Protocol (LISP2) [DNS-based 2662 Version]", 2663 Slide-set http://www.dinof.net/~dino/ietf/lisp2.ppt, 2664 November 2006. 2666 [LISPDHT] Mathy, L., Iannone, L., and O. Bonaventure, "LISP-DHT: 2667 Towards a DHT to map identifiers onto locators", 2668 draft-mathy-lisp-dht-00.txt (work in progress), 2669 February 2008. 2671 [LOC-ID-ARCH] 2672 Meyer, D. and D. Lewis, "Architectural Implications of 2673 Locator/ID Separation", 2674 draft-meyer-loc-id-implications-01.txt (work in progress), 2675 January 2009. 2677 [MLISP] Farinacci, D., Meyer, D., Zwiebel, J., and S. Venaas, 2678 "LISP for Multicast Environments", 2679 draft-ietf-lisp-multicast-02.txt (work in progress), 2680 September 2009. 2682 [NERD] Lear, E., "NERD: A Not-so-novel EID to RLOC Database", 2683 draft-lear-lisp-nerd-04.txt (work in progress), 2684 April 2008. 2686 [OPENLISP] 2687 Iannone, L. and O. Bonaventure, "OpenLISP Implementation 2688 Report", draft-iannone-openlisp-implementation-01.txt 2689 (work in progress), July 2008. 2691 [RADIR] Narten, T., "Routing and Addressing Problem Statement", 2692 draft-narten-radir-problem-statement-00.txt (work in 2693 progress), July 2007. 2695 [RFC3344bis] 2696 Perkins, C., "IP Mobility Support for IPv4, revised", 2697 draft-ietf-mip4-rfc3344bis-05 (work in progress), 2698 July 2007. 2700 [RFC4192] Baker, F., Lear, E., and R. Droms, "Procedures for 2701 Renumbering an IPv6 Network without a Flag Day", RFC 4192, 2702 September 2005. 2704 [RPFV] Wijnands, IJ., Boers, A., and E. Rosen, "The RPF Vector 2705 TLV", draft-ietf-pim-rpf-vector-08.txt (work in progress), 2706 January 2009. 2708 [RPMD] Handley, M., Huici, F., and A. Greenhalgh, "RPMD: Protocol 2709 for Routing Protocol Meta-data Dissemination", 2710 draft-handley-p2ppush-unpublished-2007726.txt (work in 2711 progress), July 2007. 2713 [SHIM6] Nordmark, E. and M.
Bagnulo, "Level 3 multihoming shim 2714 protocol", draft-ietf-shim6-proto-06.txt (work in 2715 progress), October 2006. 2717 Appendix A. Acknowledgments 2719 An initial thank you goes to Dave Oran for planting the seeds for the 2720 initial ideas for LISP. His consultation continues to provide value 2721 to the LISP authors. 2723 A special and appreciative thank you goes to Noel Chiappa for 2724 providing architectural impetus over the past decades on separation 2725 of location and identity, as well as detailed review of the LISP 2726 architecture and documents, coupled with enthusiasm for making LISP a 2727 practical and incremental transition for the Internet. 2729 The authors would like to gratefully acknowledge many people who have 2730 contributed discussion and ideas to the making of this proposal. 2731 They include Scott Brim, Andrew Partan, John Zwiebel, Jason Schiller, 2732 Lixia Zhang, Dorian Kim, Peter Schoenmaker, Vijay Gill, Geoff Huston, 2733 David Conrad, Mark Handley, Ron Bonica, Ted Seely, Mark Townsley, 2734 Chris Morrow, Brian Weis, Dave McGrew, Peter Lothberg, Dave Thaler, 2735 Eliot Lear, Shane Amante, Ved Kafle, Olivier Bonaventure, Luigi 2736 Iannone, Robin Whittle, Brian Carpenter, Joel Halpern, Roger 2737 Jorgensen, Ran Atkinson, Stig Venaas, Iljitsch van Beijnum, Roland 2738 Bless, Dana Blair, Bill Lynch, Marc Woolward, Damien Saucez, Damian 2739 Lezama, Attilla De Groot, Parantap Lahiri, David Black, Roque 2740 Gagliano, Isidor Kouvelas, Jesper Skriver, Fred Templin, Margaret 2741 Wasserman, Sam Hartman, Michael Hofling, Pedro Marques, and Jari 2742 Arkko. 2744 In particular, we would like to thank Dave Meyer for his clever 2745 suggestion for the name "LISP". ;-) 2747 This work originated in the Routing Research Group (RRG) of the IRTF. 2748 The individual submission [LISP-MAIN] was converted into this IETF 2749 LISP working group draft. 2751 Appendix B. Document Change Log 2753 B.1. 
Changes to draft-ietf-lisp-05.txt 2755 o Posted September 2009. 2757 o Added this Document Change Log appendix. 2759 o Added section indicating that encapsulated Map-Requests must use 2760 destination UDP port 4342. 2762 o Don't use AH in Map-Registers. Put key-id, auth-length, and auth- 2763 data in Map-Register payload. 2765 o Added Jari to acknowledgment section. 2767 o State the source-EID is set to 0 when using Map-Requests to 2768 refresh or RLOC-probe. 2770 o Make more clear what source-RLOC should be for a Map-Request. 2772 o The LISP-CONS authors thought that the Type definitions for CONS 2773 should be removed from this specification. 2775 o Removed nonce from Map-Register message; it wasn't used, so there is no need 2776 for it. 2778 o Clarify what to do for unspecified Action bits for negative Map- 2779 Replies. Since No Action is a drop, make value 0 Drop. 2781 B.2. Changes to draft-ietf-lisp-04.txt 2783 o Posted September 2009. 2785 o How to deal with record count greater than 1 for a Map-Request. 2786 Damien and Joel comment. Joel suggests: 1) Specify that senders 2787 compliant with the current document will always set the count to 2788 1, and note that the count is included for future extensibility. 2789 2) Specify what a receiver compliant with the draft should do if 2790 it receives a request with a count greater than 1. Presumably, it 2791 should send some error back? 2793 o Add Fred Templin in ack section. 2795 o Add Margaret and Sam to the ack section for their great comments. 2797 o Say more about LAGs in the UDP section per Sam Hartman's comment. 2799 o Sam wants to use MAY instead of SHOULD for ignoring checksums on 2800 ETR. From the mailing list: "You'd need to word it as an ITR MAY 2801 send a zero checksum, an ETR MUST accept a 0 checksum and MAY 2802 ignore the checksum completely. And of course we'd need to 2803 confirm that can actually be implemented.
In particular, hardware 2804 that verifies UDP checksums on receive needs to be checked to make 2805 sure it permits 0 checksums." 2807 o Margaret wants a reference to 2808 http://www.ietf.org/id/draft-eubanks-chimento-6man-00.txt. 2810 o Fix description in Map-Request section. Where we describe Map- 2811 Reply Record, change "R-bit" to "M-bit". 2813 o Add the mobility bit to Map-Replies. So PTRs don't probe so often 2814 for MNs but often enough to get mapping updates. 2816 o Indicate SHA1 can be used as well for Map-Registers. 2818 o More Fred comments on MTU handling. 2820 o Isidor comment about specifying periodic Map-Registers better. Will 2821 be fixed in draft-ietf-lisp-ms-02.txt. 2823 o Margaret's comment on gleaning: "The current specification does 2824 not make it clear how long gleaned map entries should be retained 2825 in the cache, nor does it make it clear how/ when they will be 2826 validated. The LISP spec should, at the very least, include a 2827 (short) default lifetime for gleaned entries, require that they be 2828 validated within a short period of time, and state that a new 2829 gleaned entry should never overwrite an entry that was obtained 2830 from the mapping system. The security implications of storing 2831 "gleaned" entries should also be explored in detail." 2833 o Add section on RLOC-probing per working group feedback. 2835 o Change "loc-reach-bits" to "loc-status-bits" per comment from 2836 Noel. 2838 o Remove SMR-bit from data-plane. Dino prefers to have it in the 2839 control plane only. 2841 o Change LISP header to allow a "Research Bit" so the Nonce and LSB 2842 fields can be turned off and used for another future purpose. For 2843 Luigi et al. versioning convergence. 2845 o Add an N-bit to the data header suggested by Noel. Then the nonce 2846 field could be used when N is not 1. 2848 o Clarify that when E-bit is 0, the nonce field can be an echoed 2849 nonce or a random nonce. Comment from Jesper.
2851 o Indicate when doing data-gleaning that a verifying Map-Request is 2852 sent to the source-EID of the gleaned data packet so we can avoid 2853 map-cache corruption by a 3rd party. Comment from Pedro. 2855 o Indicate that a verifying Map-Request, for accepting mapping data, 2856 should be sent over the ALT (or to the EID). 2858 o Reference IPsec RFC 4302. Comment from Sam and Brian Weis. 2860 o Put E-bit in Map-Reply to tell ITRs that the ETR supports echo- 2861 noncing. Comment by Pedro and Dino. 2863 o Jesper made a comment to loosen the language about requiring the 2864 copy of inner TTL to outer TTL since the text to get mixed-AF 2865 traceroute to work would violate the "MUST" clause. Changed from 2866 MUST to SHOULD in section 5.3. 2868 B.3. Changes to draft-ietf-lisp-03.txt 2870 o Posted July 2009. 2872 o Removed loc-reach-bits longword from control packets per Damien 2873 comment. 2875 o Clarifications in MTU text from Roque. 2877 o Added text to indicate that the locator-set be sorted by locator 2878 address from Isidor. 2880 o Clarification text from JohnZ in Echo-Nonce section. 2882 B.4. Changes to draft-ietf-lisp-02.txt 2884 o Posted July 2009. 2886 o Encapsulation packet format change to add E-bit and make loc- 2887 reach-bits 32-bits in length. 2889 o Added Echo-Nonce Algorithm section. 2891 o Clarification of how ECN bits are copied. 2893 o Moved S-bit in Map-Request. 2895 o Added P-bit in Map-Request and Map-Reply messages to anticipate 2896 RLOC-Probe Algorithm. 2898 o Added to Mobility section to reference draft-meyer-lisp-mn-00.txt. 2900 B.5. Changes to draft-ietf-lisp-01.txt 2902 o Posted 2 days after draft-ietf-lisp-00.txt in May 2009. 2904 o Defined LEID to be a "LISP EID". 2906 o Indicate encapsulation uses IPv4 DF=0. 2908 o Added negative Map-Reply messages with drop, native-forward, and 2909 send-map-request actions. 2911 o Added Proxy-Map-Reply bit to Map-Register. 2913 B.6. Changes to draft-ietf-lisp-00.txt 2915 o Posted May 2009.
2917 o Rename of draft-farinacci-lisp-12.txt. 2919 o Acknowledgment to RRG. 2921 Authors' Addresses 2923 Dino Farinacci 2924 cisco Systems 2925 Tasman Drive 2926 San Jose, CA 95134 2927 USA 2929 Email: dino@cisco.com 2931 Vince Fuller 2932 cisco Systems 2933 Tasman Drive 2934 San Jose, CA 95134 2935 USA 2937 Email: vaf@cisco.com 2939 Dave Meyer 2940 cisco Systems 2941 170 Tasman Drive 2942 San Jose, CA 2943 USA 2945 Email: dmm@cisco.com 2947 Darrel Lewis 2948 cisco Systems 2949 170 Tasman Drive 2950 San Jose, CA 2951 USA 2953 Email: darlewis@cisco.com