Network Working Group                                       D. Farinacci
Internet-Draft                                                 V. Fuller
Intended status: Experimental                                   D. Meyer
Expires: April 14, 2011                                         D. Lewis
                                                           cisco Systems
                                                        October 11, 2010


                 Locator/ID Separation Protocol (LISP)
                           draft-ietf-lisp-09

Abstract

   This draft describes a network-based protocol that enables separation
   of IP addresses into two new numbering spaces: Endpoint Identifiers
   (EIDs) and Routing Locators (RLOCs). No changes are required to
   either host protocol stacks or to the "core" of the Internet
   infrastructure. LISP can be incrementally deployed, without a "flag
   day", and offers traffic engineering, multi-homing, and mobility
   benefits even to early adopters, when there are relatively few
   LISP-capable sites.

   Design and development of LISP was largely motivated by the problem
   statement produced by the October, 2006 IAB Routing and Addressing
   Workshop.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 14, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document. Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the BSD
   License.

Table of Contents

   1. Requirements Notation
   2. Introduction
   3. Definition of Terms
   4. Basic Overview
     4.1. Packet Flow Sequence
   5. LISP Encapsulation Details
     5.1. LISP IPv4-in-IPv4 Header Format
     5.2. LISP IPv6-in-IPv6 Header Format
     5.3. Tunnel Header Field Descriptions
     5.4. Dealing with Large Encapsulated Packets
       5.4.1. A Stateless Solution to MTU Handling
       5.4.2. A Stateful Solution to MTU Handling
     5.5. Using Virtualization and Segmentation with LISP
   6. EID-to-RLOC Mapping
     6.1. LISP IPv4 and IPv6 Control Plane Packet Formats
       6.1.1. LISP Packet Type Allocations
       6.1.2. Map-Request Message Format
       6.1.3. EID-to-RLOC UDP Map-Request Message
       6.1.4. Map-Reply Message Format
       6.1.5. EID-to-RLOC UDP Map-Reply Message
       6.1.6. Map-Register Message Format
       6.1.7. Encapsulated Control Message Format
     6.2. Routing Locator Selection
     6.3. Routing Locator Reachability
       6.3.1. Echo Nonce Algorithm
       6.3.2. RLOC Probing Algorithm
     6.4. EID Reachability within a LISP Site
     6.5. Routing Locator Hashing
     6.6. Changing the Contents of EID-to-RLOC Mappings
       6.6.1. Clock Sweep
       6.6.2. Solicit-Map-Request (SMR)
       6.6.3. Database Map Versioning
   7. Router Performance Considerations
   8. Deployment Scenarios
     8.1. First-hop/Last-hop Tunnel Routers
     8.2. Border/Edge Tunnel Routers
     8.3. ISP Provider-Edge (PE) Tunnel Routers
     8.4. LISP Functionality with Conventional NATs
   9. Traceroute Considerations
     9.1. IPv6 Traceroute
     9.2. IPv4 Traceroute
     9.3. Traceroute using Mixed Locators
   10. Mobility Considerations
     10.1. Site Mobility
     10.2. Slow Endpoint Mobility
     10.3. Fast Endpoint Mobility
     10.4. Fast Network Mobility
     10.5. LISP Mobile Node Mobility
   11. Multicast Considerations
   12. Security Considerations
   13. IANA Considerations
     13.1. LISP Address Type Codes
     13.2. LISP UDP Port Numbers
   14. References
     14.1. Normative References
     14.2. Informative References
   Appendix A. Acknowledgments
   Appendix B. Document Change Log
     B.1. Changes to draft-ietf-lisp-09.txt
     B.2. Changes to draft-ietf-lisp-08.txt
     B.3. Changes to draft-ietf-lisp-07.txt
     B.4. Changes to draft-ietf-lisp-06.txt
     B.5. Changes to draft-ietf-lisp-05.txt
     B.6. Changes to draft-ietf-lisp-04.txt
     B.7. Changes to draft-ietf-lisp-03.txt
     B.8. Changes to draft-ietf-lisp-02.txt
     B.9. Changes to draft-ietf-lisp-01.txt
     B.10. Changes to draft-ietf-lisp-00.txt
   Authors' Addresses

1. Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Introduction

   This document describes the Locator/Identifier Separation Protocol
   (LISP), which provides a set of functions for routers to exchange
   information used to map from non-routable Endpoint Identifiers
   (EIDs) to routable Routing Locators (RLOCs). It also defines a
   mechanism for these LISP routers to encapsulate IP packets addressed
   with EIDs for transmission across an Internet that uses RLOCs for
   routing and forwarding.
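
   As an illustration only, the two functions named above (mapping an
   EID to RLOCs, then encapsulating) can be sketched in Python. The
   sketch is not part of the protocol definition. The map-cache
   contents, the names MAP_CACHE, lookup_rlocs and encapsulate, the
   dict-based packet representation, the longest-prefix-match rule, and
   the choice of the first RLOC in the set are all assumptions made for
   the example.

```python
import ipaddress

# Hypothetical map-cache: EID-prefix -> list of RLOCs. The values are
# illustrative documentation addresses, not taken from this document.
MAP_CACHE = {
    ipaddress.ip_network("198.51.100.0/24"): ["192.0.2.1", "192.0.2.2"],
    ipaddress.ip_network("198.51.100.128/25"): ["203.0.113.1"],
}


def lookup_rlocs(dest_eid):
    """Return the RLOC list of the longest EID-prefix covering dest_eid.

    Returns None when no cached EID-prefix covers the address.
    """
    addr = ipaddress.ip_address(dest_eid)
    matches = [prefix for prefix in MAP_CACHE if addr in prefix]
    if not matches:
        return None
    best = max(matches, key=lambda prefix: prefix.prefixlen)
    return MAP_CACHE[best]


def encapsulate(inner_packet, itr_rloc):
    """Wrap inner_packet (a dict with 'src' and 'dst' EIDs) in an outer header.

    The outer source is the ITR's own RLOC; the outer destination is the
    first RLOC of the matching mapping (an assumption of this sketch).
    """
    rlocs = lookup_rlocs(inner_packet["dst"])
    if rlocs is None:
        # No mapping is cached; an ITR would send a Map-Request here.
        return None
    return {"outer_src": itr_rloc, "outer_dst": rlocs[0], "inner": inner_packet}
```

   With these assumed values, a packet addressed to EID 198.51.100.200
   matches both prefixes, the /25 is preferred, and the outer
   destination becomes 203.0.113.1.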

   Creation of LISP was initially motivated by discussions during the
   IAB-sponsored Routing and Addressing Workshop held in Amsterdam in
   October, 2006 (see [RFC4984]). A key conclusion of the workshop was
   that the Internet routing and addressing system was not scaling well
   in the face of the explosive growth of new sites; one reason for this
   poor scaling is the increasing number of multi-homed and other sites
   that cannot be addressed as part of topologically- or provider-based
   aggregated prefixes. Additional work that more completely described
   the problem statement may be found in [RADIR].

   A basic observation, made many years ago in early networking research
   such as that documented in [CHIAPPA] and [RFC4984], is that using a
   single address field for both identifying a device and for
   determining where it is topologically located in the network requires
   optimization along two conflicting axes: for routing to be efficient,
   the address must be assigned topologically; for collections of
   devices to be easily and effectively managed, without the need for
   renumbering in response to topological change (such as that caused by
   adding or removing attachment points to the network or by mobility
   events), the address must explicitly not be tied to the topology.

   The approach that LISP takes to solving the routing scalability
   problem is to replace IP addresses with two new types of numbers:
   Routing Locators (RLOCs), which are topologically assigned to network
   attachment points (and are therefore amenable to aggregation) and
   used for routing and forwarding of packets through the network; and
   Endpoint Identifiers (EIDs), which are assigned independently from
   the network topology, are used for numbering devices, and are
   aggregated along administrative boundaries.
   LISP then defines functions for mapping between the two numbering
   spaces and for encapsulating traffic originated by devices using
   non-routable EIDs for transport across a network infrastructure that
   routes and forwards using RLOCs. Both RLOCs and EIDs are
   syntactically identical to IP addresses; it is the semantics of how
   they are used that differs.

   This document describes the protocol that implements these functions.
   The database which stores the mappings between EIDs and RLOCs is
   explicitly a separate "module" to facilitate experimentation with a
   variety of approaches. One database design that is being developed
   and prototyped as part of the LISP working group work is [ALT].
   Others that have been described but not implemented include [CONS],
   [EMACS], [RPMD], and [NERD]. Finally, [LISP-MS] documents a
   general-purpose service interface for accessing a mapping database;
   this interface is intended to make the mapping database modular so
   that different approaches can be tried without the need to modify
   installed xTRs.

3. Definition of Terms

   Provider Independent (PI) Addresses: PI addresses are an address
      block assigned from a pool where blocks are not associated with
      any particular location in the network (e.g. from a particular
      service provider), and are therefore not topologically
      aggregatable in the routing system.

   Provider Assigned (PA) Addresses: PA addresses are an address
      block assigned to a site by each service provider to which a site
      connects. Typically, each block is a sub-block of a service
      provider Classless Inter-Domain Routing (CIDR) [RFC4632] block and
      is aggregated into the larger block before being advertised into
      the global Internet. Traditionally, IP multihoming has been
      implemented by each multi-homed site acquiring its own,
      globally-visible prefix.
      LISP uses only topologically-assigned and aggregatable address
      blocks for RLOCs, eliminating this demonstrably non-scalable
      practice.

   Routing Locator (RLOC): An RLOC is an IPv4 or IPv6 address of an
      egress tunnel router (ETR). An RLOC is the output of an
      EID-to-RLOC mapping lookup. An EID maps to one or more RLOCs.
      Typically, RLOCs are numbered from topologically-aggregatable
      blocks that are assigned to a site at each point to which it
      attaches to the global Internet; where the topology is defined by
      the connectivity of provider networks, RLOCs can be thought of as
      PA addresses. Multiple RLOCs can be assigned to the same ETR
      device or to multiple ETR devices at a site.

   Endpoint ID (EID): An EID is a 32-bit (for IPv4) or 128-bit (for
      IPv6) value used in the source and destination address fields of
      the first (innermost) LISP header of a packet. The host obtains
      a destination EID the same way it obtains a destination address
      today, for example through a Domain Name System (DNS) [RFC1034]
      lookup or Session Initiation Protocol (SIP) [RFC3261] exchange.
      The source EID is obtained via existing mechanisms used to set a
      host's "local" IP address. An EID is allocated to a host from an
      EID-prefix block associated with the site where the host is
      located. An EID can be used by a host to refer to other hosts.
      EIDs MUST NOT be used as LISP RLOCs. Note that EID blocks may be
      assigned in a hierarchical manner, independent of the network
      topology, to facilitate scaling of the mapping database. In
      addition, an EID block assigned to a site may have site-local
      structure (subnetting) for routing within the site; this structure
      is not visible to the global routing system. When used in
      discussions with other Locator/ID separation proposals, a LISP EID
      will be called a "LEID". Throughout this document, any reference
      to "EID" refers to an LEID.

   EID-prefix: An EID-prefix is a power-of-two block of EIDs that is
      allocated to a site by an address allocation authority.
      EID-prefixes are associated with a set of RLOC addresses which
      make up a "database mapping". EID-prefix allocations can be
      broken up into smaller blocks when an RLOC set is to be associated
      with the smaller EID-prefix. A globally routed address block
      (whether PI or PA) is not an EID-prefix. However, a globally
      routed address block may be removed from global routing and reused
      as an EID-prefix. A site that receives an explicitly allocated
      EID-prefix may not use that EID-prefix as a globally routed prefix
      assigned to RLOCs.

   End-system: An end-system is an IPv4 or IPv6 device that originates
      packets with a single IPv4 or IPv6 header. The end-system
      supplies an EID value for the destination address field of the IP
      header when communicating globally (i.e. outside of its routing
      domain). An end-system can be a host computer, a switch or router
      device, or any network appliance.

   Ingress Tunnel Router (ITR): An ITR is a router which accepts an IP
      packet with a single IP header (more precisely, an IP packet that
      does not contain a LISP header). The router treats this "inner"
      IP destination address as an EID and performs an EID-to-RLOC
      mapping lookup. The router then prepends an "outer" IP header
      with one of its globally-routable RLOCs in the source address
      field and the result of the mapping lookup in the destination
      address field. Note that this destination RLOC may be an
      intermediate, proxy device that has better knowledge of the
      EID-to-RLOC mapping closer to the destination EID. In general, an
      ITR receives IP packets from site end-systems on one side and
      sends LISP-encapsulated IP packets toward the Internet on the
      other side.

      Specifically, when a service provider prepends a LISP header for
      Traffic Engineering purposes, the router that does this is also
      regarded as an ITR. The outer RLOC the ISP ITR uses can be based
      on the outer destination address (the originating ITR's supplied
      RLOC) or the inner destination address (the originating host's
      supplied EID).

   TE-ITR: A TE-ITR is an ITR that is deployed in a service provider
      network that prepends an additional LISP header for Traffic
      Engineering purposes.

   Egress Tunnel Router (ETR): An ETR is a router that accepts an IP
      packet where the destination address in the "outer" IP header is
      one of its own RLOCs. The router strips the "outer" header and
      forwards the packet based on the next IP header found. In
      general, an ETR receives LISP-encapsulated IP packets from the
      Internet on one side and sends decapsulated IP packets to site
      end-systems on the other side. ETR functionality does not have to
      be limited to a router device. A server host can be the endpoint
      of a LISP tunnel as well.

   TE-ETR: A TE-ETR is an ETR that is deployed in a service provider
      network that strips an outer LISP header for Traffic Engineering
      purposes.

   xTR: An xTR is a reference to an ITR or ETR when direction of data
      flow is not part of the context description. xTR refers to the
      router that is the tunnel endpoint. It is used synonymously with
      the term "Tunnel Router". For example, "An xTR can be located at
      the Customer Edge (CE) router", meaning both ITR and ETR
      functionality is at the CE router.

   EID-to-RLOC Cache: The EID-to-RLOC cache is a short-lived,
      on-demand table in an ITR that stores, tracks, and is responsible
      for timing-out and otherwise validating EID-to-RLOC mappings.
      This cache is distinct from the full "database" of EID-to-RLOC
      mappings: the cache is dynamic, local to the ITR(s), and
      relatively small, while the database is distributed, relatively
      static, and much more global in scope.

   EID-to-RLOC Database: The EID-to-RLOC database is a global
      distributed database that contains all known EID-prefix to RLOC
      mappings. Each potential ETR typically contains a small piece of
      the database: the EID-to-RLOC mappings for the EID prefixes
      "behind" the router. These map to one of the router's own,
      globally-visible, IP addresses. The same database mapping entries
      MUST be configured on all ETRs for a given site. That is, the
      EID-prefixes for the site and locator-set for each EID-prefix MUST
      be the same on all ETRs so they consistently send Map-Reply
      messages with the same database mapping contents.

   Recursive Tunneling: Recursive tunneling occurs when a packet has
      more than one LISP IP header. Additional layers of tunneling may
      be employed to implement traffic engineering or other re-routing
      as needed. When this is done, an additional "outer" LISP header
      is added and the original RLOCs are preserved in the "inner"
      header. Any reference to tunnels in this specification refers to
      dynamic encapsulating tunnels; they are never statically
      configured.

   Reencapsulating Tunnels: Reencapsulating tunneling occurs when a
      packet has no more than one LISP IP header (two IP headers total)
      and needs to be diverted to a new RLOC. In that case, an ETR can
      decapsulate the packet (remove the LISP header) and prepend a new
      tunnel header, with the new RLOC, to the packet. Doing this
      allows a packet to be re-routed by the re-encapsulating router
      without adding the overhead of additional tunnel headers. Any
      reference to tunnels in this specification refers to dynamic
      encapsulating tunnels; they are never statically configured.

   LISP Header: a term used in this document to refer to the outer
      IPv4 or IPv6 header, a UDP header, and a LISP-specific 8-byte
      header that follows the UDP header, which an ITR prepends or an
      ETR strips.

   Address Family Identifier (AFI): a term used to describe an address
      encoding in a packet. An address family currently pertains to an
      IPv4 or IPv6 address. See [AFI] and [RFC1700] for details. An
      AFI value of 0 used in this specification indicates an unspecified
      encoded address where the length of the address is 0 bytes
      following the 16-bit AFI value of 0.

   Negative Mapping Entry: A negative mapping entry, also known as a
      negative cache entry, is an EID-to-RLOC entry where an EID-prefix
      is advertised or stored with no RLOCs. That is, the locator-set
      for the EID-to-RLOC entry is empty or has an encoded locator count
      of 0. This type of entry could be used to describe a prefix from
      a non-LISP site, which is explicitly not in the mapping database.
      There is a set of well-defined actions that are encoded in a
      Negative Map-Reply.

   Data Probe: A Data Probe is a LISP-encapsulated data packet where
      the inner header destination address equals the outer header
      destination address, used to trigger a Map-Reply by a
      decapsulating ETR. In addition, the original packet is
      decapsulated and delivered to the destination host. A Data Probe
      is used in some of the mapping database designs to "probe" or
      request a Map-Reply from an ETR; in other cases, Map-Requests are
      used. See each mapping database design for details.

   Proxy ITR (PITR): A PITR, also known as a PTR, is defined and
      described in [INTERWORK]. A PITR acts like an ITR but does so on
      behalf of non-LISP sites which send packets to destinations at
      LISP sites.

   Proxy ETR (PETR): A PETR is defined and described in [INTERWORK].
      A PETR acts like an ETR but does so on behalf of LISP sites which
      send packets to destinations at non-LISP sites.

   Route-returnability: an assumption that the underlying routing
      system will deliver packets to the destination. When combined
      with a nonce that is provided by a sender and returned by a
      receiver, this limits off-path data insertion.

   LISP site: a set of routers in an edge network that are under a
      single technical administration. LISP routers which reside in the
      edge network are the demarcation points to separate the edge
      network from the core network.

   Client-side: a term used in this document to indicate a connection
      initiation attempt by an EID. The ITR(s) at the LISP site are the
      first to get involved in obtaining database map cache entries by
      sending Map-Request messages.

   Server-side: a term used in this document to indicate that a
      connection initiation attempt is being accepted for a destination
      EID. The ETR(s) at the destination LISP site are the first to
      send Map-Replies to the source site initiating the connection.
      The ETR(s) at this destination site can obtain mappings by
      gleaning information from Map-Requests, Data-Probes, or
      encapsulated packets.

4. Basic Overview

   One key concept of LISP is that end-systems (hosts) operate the same
   way they do today. The IP addresses that hosts use for tracking
   sockets, connections, and for sending and receiving packets do not
   change. In LISP terminology, these IP addresses are called Endpoint
   Identifiers (EIDs).

   Routers continue to forward packets based on IP destination
   addresses. When a packet is LISP encapsulated, these addresses are
   referred to as Routing Locators (RLOCs). Most routers along a path
   between two hosts will not change; they continue to perform
   routing/forwarding lookups on the destination addresses.
   For routers between the source host and the ITR as well as routers
   from the ETR to the destination host, the destination address is an
   EID. For the routers between the ITR and the ETR, the destination
   address is an RLOC.

   Another key LISP concept is the "Tunnel Router". A tunnel router
   prepends LISP headers on host-originated packets and strips them
   prior to final delivery to their destination. The IP addresses in
   this "outer header" are RLOCs. During end-to-end packet exchange
   between two Internet hosts, an ITR prepends a new LISP header to each
   packet and an egress tunnel router strips the new header. The ITR
   performs EID-to-RLOC lookups to determine the routing path to the
   ETR, which has the RLOC as one of its IP addresses.

   Some basic rules governing LISP are:

   o  End-systems (hosts) only send to addresses which are EIDs. They
      do not know whether addresses are EIDs or RLOCs but assume packets
      get to LISP routers, which in turn deliver packets to the
      destination the end-system has specified.

   o  EIDs are always IP addresses assigned to hosts.

   o  LISP routers mostly deal with Routing Locator addresses. See
      details later in Section 4.1 to clarify what is meant by "mostly".

   o  RLOCs are always IP addresses assigned to routers; preferably,
      topologically-oriented addresses from provider CIDR blocks.

   o  When a router originates packets it may use as a source address
      either an EID or RLOC. When acting as a host (e.g. when
      terminating a transport session such as SSH, TELNET, or SNMP), it
      may use an EID that is explicitly assigned for that purpose. An
      EID that identifies the router as a host MUST NOT be used as an
      RLOC; an EID is only routable within the scope of a site.
      A typical BGP configuration might demonstrate this "hybrid"
      EID/RLOC usage where a router could use its "host-like" EID to
      terminate iBGP sessions to other routers in a site while at the
      same time using RLOCs to terminate eBGP sessions to routers
      outside the site.

   o  EIDs are not expected to be usable for global end-to-end
      communication in the absence of an EID-to-RLOC mapping operation.
      They are expected to be used locally for intra-site communication.

   o  EID prefixes are likely to be hierarchically assigned in a manner
      which is optimized for administrative convenience and to
      facilitate scaling of the EID-to-RLOC mapping database. The
      hierarchy is based on an address allocation hierarchy which is
      independent of the network topology.

   o  EIDs may also be structured (subnetted) in a manner suitable for
      local routing within an autonomous system.

   An additional LISP header may be prepended to packets by a TE-ITR
   when re-routing of the path for a packet is desired. An obvious
   instance of this would be an ISP router that needs to perform traffic
   engineering for packets flowing through its network. In such a
   situation, termed Recursive Tunneling, an ISP transit acts as an
   additional ingress tunnel router and the RLOC it uses for the new
   prepended header would be either a TE-ETR within the ISP (along an
   intra-ISP traffic engineered path) or a TE-ETR within another ISP (an
   inter-ISP traffic engineered path, where an agreement to build such a
   path exists).

   In order to avoid excessive packet overhead as well as possible
   encapsulation loops, this document mandates that a maximum of two
   LISP headers can be prepended to a packet. It is believed two
   headers are sufficient, where the first prepended header is used at a
   site for Location/Identity separation and the second prepended header
   is used inside a service provider for Traffic Engineering purposes.
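
   The two-header limit can be illustrated with a minimal Python
   sketch. It is an illustration under stated assumptions, not a
   description of any implementation: the dict-based packet model, the
   names MAX_LISP_HEADERS, TooManyLispHeaders and prepend_lisp_header,
   and the decision to raise an exception (rather than, say, drop or
   forward the packet unchanged) are all hypothetical, since the text
   above states only the limit itself.

```python
MAX_LISP_HEADERS = 2  # the limit stated in the text above


class TooManyLispHeaders(Exception):
    """Raised when prepending would exceed the two-header limit."""


def prepend_lisp_header(packet, src_rloc, dst_rloc):
    """Return a new packet carrying one more outer LISP header.

    `packet` is a hypothetical dict of the form
    {"lisp_headers": [...], "payload": ...}, outermost header first.
    The input packet is not modified.
    """
    headers = packet["lisp_headers"]
    if len(headers) >= MAX_LISP_HEADERS:
        # Refusing here is an assumption of this sketch.
        raise TooManyLispHeaders("packet already carries two LISP headers")
    new_header = {"src_rloc": src_rloc, "dst_rloc": dst_rloc}
    return {"lisp_headers": [new_header] + headers,
            "payload": packet["payload"]}
```

   In this model, a site ITR adds the first header and a TE-ITR may add
   a second; a third attempt is rejected.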
   Tunnel Routers can be placed fairly flexibly in a multi-AS topology.
   For example, the ITR for a particular end-to-end packet exchange
   might be the first-hop or default router within a site for the source
   host.  Similarly, the egress tunnel router might be the last-hop
   router directly connected to the destination host.  As another
   example, perhaps for a VPN service outsourced to an ISP by a site,
   the ITR could be the site's border router at the service provider
   attachment point.  Mixing and matching of site-operated,
   ISP-operated, and other tunnel routers is allowed for maximum
   flexibility.  See Section 8 for more details.

4.1.  Packet Flow Sequence

   This section provides an example of the unicast packet flow with the
   following conditions:

   o  Source host "host1.abc.com" is sending a packet to
      "host2.xyz.com", exactly what host1 would do if the site were not
      using LISP.

   o  Each site is multi-homed, so each tunnel router has an address
      (RLOC) assigned from the service provider address block for each
      provider to which that particular tunnel router is attached.

   o  The ITR(s) and ETR(s) are directly connected to the source and
      destination, respectively, but the source and destination can be
      located anywhere in the LISP site.

   o  Map-Requests can be sent on the underlying routing system topology
      or over an alternative topology [ALT].

   o  Map-Replies are sent on the underlying routing system topology.

   Client host1.abc.com wants to communicate with server host2.xyz.com:

   1.  host1.abc.com wants to open a TCP connection to host2.xyz.com.
       It does a DNS lookup on host2.xyz.com.  An A/AAAA record is
       returned.  This address is the destination EID.  The locally
       assigned address of host1.abc.com is used as the source EID.  An
       IPv4 or IPv6 packet is built and forwarded through the LISP site
       as a normal IP packet until it reaches a LISP ITR.

   2.  The LISP ITR must be able to map the EID destination to an RLOC
       of one of the ETRs at the destination site.  The specific method
       used to do this is not described in this example.  See [ALT] or
       [CONS] for possible solutions.

   3.  The ITR will send a LISP Map-Request.  Map-Requests SHOULD be
       rate-limited.

   4.  When an alternate mapping system is not in use, the Map-Request
       packet is routed through the underlying routing system.
       Otherwise, the Map-Request packet is routed on an alternate
       logical topology.  In either case, when the Map-Request arrives
       at one of the ETRs at the destination site, it will process the
       packet as a control message.

   5.  The ETR looks at the destination EID of the Map-Request and
       matches it against the prefixes in the ETR's configured
       EID-to-RLOC mapping database.  This is the list of EID-prefixes
       the ETR is supporting for the site it resides in.  If there is no
       match, the Map-Request is dropped.  Otherwise, a LISP Map-Reply
       is returned to the ITR.

   6.  The ITR receives the Map-Reply message, parses the message (to
       check for format validity) and stores the mapping information
       from the packet.  This information is stored in the ITR's
       EID-to-RLOC mapping cache.  Note that the map cache is an
       on-demand cache.  An ITR will manage its map cache in such a way
       that optimizes for its resource constraints.

   7.  Subsequent packets from host1.abc.com to host2.xyz.com will have
       a LISP header prepended by the ITR using the appropriate RLOC as
       the LISP header destination address learned from the ETR.  Note
       the packet may be sent to a different ETR than the one which
       returned the Map-Reply due to the source site's hashing policy or
       the destination site's locator-set policy.

   8.  The ETR receives these packets directly (since the destination
       address is one of its assigned IP addresses), strips the LISP
       header and forwards the packets to the attached destination host.

   In order to eliminate the need for a mapping lookup in the reverse
   direction, an ETR MAY create a cache entry that maps the source EID
   (inner header source IP address) to the source RLOC (outer header
   source IP address) in a received LISP packet.  Such a cache entry is
   termed a "gleaned" mapping and only contains a single RLOC for the
   EID in question.  More complete information about additional RLOCs
   SHOULD be verified by sending a LISP Map-Request for that EID.  Both
   the ITR and the ETR may also influence the decision the other makes
   in selecting an RLOC.  See Section 6 for more details.

5.  LISP Encapsulation Details

   Since additional tunnel headers are prepended, the packet becomes
   larger and can exceed the MTU of any link traversed from the ITR to
   the ETR.  It is recommended in IPv4 that packets do not get
   fragmented as they are encapsulated by the ITR.  Instead, the packet
   is dropped and an ICMP Too Big message is returned to the source.

   This specification recommends that implementations provide support
   for one of the proposed fragmentation and reassembly schemes.  These
   two simple existing schemes are detailed in Section 5.4.

5.1.  LISP IPv4-in-IPv4 Header Format

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |Version|  IHL  |Type of Service|          Total Length         |
    /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |         Identification        |Flags|      Fragment Offset    |
   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   OH  |  Time to Live | Protocol = 17 |         Header Checksum       |
   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |                    Source Routing Locator                     |
    \  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |                 Destination Routing Locator                   |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |       Source Port = xxxx      |       Dest Port = 4341        |
   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |           UDP Length          |        UDP Checksum           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   L   |N|L|E|V|I|flags|            Nonce/Map-Version                  |
   I \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   S / |                 Instance ID/Locator Status Bits               |
   P   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |Version|  IHL  |Type of Service|          Total Length         |
    /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |         Identification        |Flags|      Fragment Offset    |
   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   IH  |  Time to Live |    Protocol   |         Header Checksum       |
   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |                           Source EID                          |
    \  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |                         Destination EID                       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

5.2.  LISP IPv6-in-IPv6 Header Format

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |Version| Traffic Class |           Flow Label                  |
    /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |         Payload Length        | Next Header=17|   Hop Limit   |
   v   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
   O   +                                                               +
   u   |                                                               |
   t   +                     Source Routing Locator                    +
   e   |                                                               |
   r   +                                                               +
       |                                                               |
   H   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   d   |                                                               |
   r   +                                                               +
       |                                                               |
   ^   +                  Destination Routing Locator                  +
   |   |                                                               |
    \  +                                                               +
     \ |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |       Source Port = xxxx      |       Dest Port = 4341        |
   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |           UDP Length          |        UDP Checksum           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   L   |N|L|E|V|I|flags|            Nonce/Map-Version                  |
   I \ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   S / |                 Instance ID/Locator Status Bits               |
   P   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |Version| Traffic Class |           Flow Label                  |
    /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /   |         Payload Length        |  Next Header  |   Hop Limit   |
   v   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
   I   +                                                               +
   n   |                                                               |
   n   +                          Source EID                           +
   e   |                                                               |
   r   +                                                               +
       |                                                               |
   H   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   d   |                                                               |
   r   +                                                               +
       |                                                               |
   ^   +                        Destination EID                        +
   \   |                                                               |
    \  +                                                               +
     \ |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

5.3.  Tunnel Header Field Descriptions

   Inner Header:  The inner header is the header on the datagram
      received from the originating host.  The source and destination IP
      addresses are EIDs.

   Outer Header:  The outer header is a new header prepended by an ITR.
      The address fields contain RLOCs obtained from the ingress
      router's EID-to-RLOC cache.  The IP protocol number is "UDP (17)"
      from [RFC0768].  The DF bit of the Flags field is set to 0 when
      the method in Section 5.4.1 is used and set to 1 when the method
      in Section 5.4.2 is used.

   UDP Header:  The UDP header contains an ITR-selected source port when
      encapsulating a packet.  See Section 6.5 for details on the hash
      algorithm used to select a source port based on the 5-tuple of the
      inner header.  The destination port MUST be set to the well-known
      IANA assigned port value 4341.

   UDP Checksum:  The UDP checksum field SHOULD be transmitted as zero
      by an ITR for either IPv4 [RFC0768] or IPv6 encapsulation
      [UDP-TUNNELS].  When a packet with a zero UDP checksum is received
      by an ETR, the ETR MUST accept the packet for decapsulation.  When
      an ITR transmits a non-zero value for the UDP checksum, it MUST
      send a correctly computed value in this field.  When an ETR
      receives a packet with a non-zero UDP checksum, it MAY choose to
      verify the checksum value.  If it chooses to perform such
      verification, and the verification fails, the packet MUST be
      silently dropped.  If the ETR chooses not to perform the
      verification, or performs the verification successfully, the
      packet MUST be accepted for decapsulation.  The handling of UDP
      checksums for all tunneling protocols, including LISP, is under
      active discussion within the IETF.  When that discussion
      concludes, any necessary changes will be made to align LISP with
      the outcome of the broader discussion.

   UDP Length:  For an IPv4 encapsulated packet, the UDP length field is
      the inner header Total Length plus the UDP and LISP header
      lengths.  For an IPv6 encapsulated packet, it is the inner header
      Payload Length plus the size of the IPv6 header (40 bytes) plus
      the size of the UDP and LISP headers.  The UDP header length is 8
      bytes.

   N: The N bit is the nonce-present bit.  When this bit is set to 1,
      the low-order 24 bits of the first 32 bits of the LISP header
      contain a Nonce.  See Section 6.3.1 for details.  Both N and V
      bits MUST NOT be set in the same packet.  If they are, a
      decapsulating ETR MUST treat the "Nonce/Map-Version" field as
      having a Nonce value present.

   L: The L bit is the Locator-Status-Bits field enabled bit.  When this
      bit is set to 1, the Locator-Status-Bits in the second 32 bits of
      the LISP header are in use.

     x 1 x x 0 x x x
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |N|L|E|V|I|flags|            Nonce/Map-Version                  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      Locator Status Bits                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   E: The E bit is the echo-nonce-request bit.  When this bit is set to
      1, the N bit MUST be 1.  This bit SHOULD be ignored and has no
      meaning when the N bit is set to 0.  See Section 6.3.1 for
      details.

   V: The V bit is the Map-Version present bit.  When this bit is set to
      1, the N bit MUST be 0.  Refer to Section 6.6.3 for more details.
      This bit indicates that the first 4 bytes of the LISP header are
      encoded as:

     0 x 0 1 x x x x
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |N|L|E|V|I|flags|  Source Map-Version   |   Dest Map-Version    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                 Instance ID/Locator Status Bits               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   I: The I bit is the Instance ID bit.  See Section 5.5 for more
      details.  When this bit is set to 1, the Locator Status Bits field
      is reduced to 8 bits and the high-order 24 bits are used as an
      Instance ID.  If the L-bit is set to 0, then the low-order 8 bits
      are transmitted as zero and ignored on receipt.  The format of the
      LISP header would look like:

     x x x x 1 x x x
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |N|L|E|V|I|flags|            Nonce/Map-Version                  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                 Instance ID                   |     LSBs      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   flags:  The flags field is a 3-bit field reserved for future flag
      use.  It is set to 0 on transmit and ignored on receipt.

   LISP Nonce:  The LISP nonce field is a 24-bit value that is randomly
      generated by an ITR when the N-bit is set to 1.  The nonce is also
      used when the E-bit is set to request the nonce value to be echoed
      by the other side when packets are returned.  When the E-bit is
      clear but the N-bit is set, a remote ITR is either echoing a
      previously requested echo-nonce or providing a random nonce.  See
      Section 6.3.1 for more details.

   LISP Locator Status Bits:  The locator status bits field in the LISP
      header is set by an ITR to indicate to an ETR the up/down status
      of the Locators in the source site.  Each RLOC in a Map-Reply is
      assigned an ordinal value from 0 to n-1 (when there are n RLOCs in
      a mapping entry).  The Locator Status Bits are numbered from 0 to
      n-1 from the least significant bit of the field.  The field is 32
      bits when the I-bit is set to 0 and is 8 bits when the I-bit is
      set to 1.  When a Locator Status Bit is set to 1, the ITR is
      indicating to the ETR that the RLOC associated with the bit
      ordinal has up status.  See Section 6.3 for details on how an ITR
      can determine the status of other ITRs at the same site.  When a
      site has multiple EID-prefixes which result in multiple mappings
      (where each could have a different locator-set), the Locator
      Status Bits setting in an encapsulated packet MUST reflect the
      mapping for the EID-prefix that the inner-header source EID
      address matches.

   When doing ITR/PITR encapsulation:

   o  The outer header Time to Live field (or Hop Limit field, in case
      of IPv6) SHOULD be copied from the inner header Time to Live
      field.

   o  The outer header Type of Service field (or the Traffic Class
      field, in the case of IPv6) SHOULD be copied from the inner header
      Type of Service field (with one caveat, see below).

   When doing ETR/PETR decapsulation:

   o  The inner header Time to Live field (or Hop Limit field, in case
      of IPv6) SHOULD be copied from the outer header Time to Live
      field, when the Time to Live field of the outer header is less
      than the Time to Live of the inner header.  Failing to perform
      this check can cause the Time to Live of the inner header to
      increment across encapsulation/decapsulation cycles.  This check
      is also performed when doing initial encapsulation when a packet
      comes to an ITR or PITR destined for a LISP site.

   o  The inner header Type of Service field (or the Traffic Class
      field, in the case of IPv6) SHOULD be copied from the outer header
      Type of Service field (with one caveat, see below).

   Note if an ETR/PETR is also an ITR/PITR and chooses to reencapsulate
   after decapsulating, the net effect of this is that the new outer
   header will carry the same Time to Live as the old outer header.

   Copying the TTL serves two purposes: first, it preserves the distance
   the host intended the packet to travel; second, and more importantly,
   it provides for suppression of looping packets in the event there is
   a loop of concatenated tunnels due to misconfiguration.  See
   Section 9.3 for TTL exception handling for traceroute packets.
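
   The TTL copying rules above can be sketched as two small functions.
   This is an illustration under stated assumptions rather than a
   normative algorithm: the function names are hypothetical, the TTL
   (or Hop Limit) values are taken as plain integers, and the
   traceroute exception of Section 9.3 is not modeled.

```python
def outer_ttl_on_encap(inner_ttl: int) -> int:
    """ITR/PITR: the outer TTL (or Hop Limit) is copied from the inner header."""
    return inner_ttl


def inner_ttl_on_decap(inner_ttl: int, outer_ttl: int) -> int:
    """ETR/PETR: copy the outer TTL inward only when it is the smaller value.

    Skipping this comparison could let the inner TTL grow across
    encapsulation/decapsulation cycles.
    """
    if outer_ttl < inner_ttl:
        return outer_ttl
    return inner_ttl
```

   With these definitions, a packet encapsulated with an inner TTL of 64
   that arrives at the ETR with an outer TTL of 60 is forwarded with an
   inner TTL of 60, so hops taken inside the tunnel are still counted.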

   The ECN field occupies bits 6 and 7 of both the IPv4 Type of Service
   field and the IPv6 Traffic Class field [RFC3168].  The ECN field
   requires special treatment in order to avoid discarding indications
   of congestion [RFC3168].  ITR encapsulation MUST copy the 2-bit ECN
   field from the inner header to the outer header.  Re-encapsulation
   MUST copy the 2-bit ECN field from the stripped outer header to the
   new outer header.  If the ECN field contains a congestion indication
   codepoint (the value is '11', the Congestion Experienced (CE)
   codepoint), then ETR decapsulation MUST copy the 2-bit ECN field from
   the stripped outer header to the surviving inner header that is used
   to forward the packet beyond the ETR.  These requirements preserve
   Congestion Experienced (CE) indications when a packet that uses ECN
   traverses a LISP tunnel and becomes marked with a CE indication due
   to congestion between the tunnel endpoints.

5.4.  Dealing with Large Encapsulated Packets

   This section proposes two simple mechanisms to deal with packets that
   exceed the path MTU between the ITR and ETR.

   It is left to the implementor to decide if the stateless or stateful
   mechanism should be implemented.  Both or neither can be used since
   it is a local decision in the ITR regarding how to deal with MTU
   issues, and sites can interoperate with differing mechanisms.

   Both stateless and stateful mechanisms also apply to Reencapsulating
   and Recursive Tunneling.  So any actions below referring to an ITR
   also apply to a TE-ITR.

5.4.1.  A Stateless Solution to MTU Handling

   An ITR stateless solution to handle MTU issues is described as
   follows:

   1.  Define an architectural constant S for the maximum size of a
       packet, in bytes, an ITR would like to receive from a source
       inside of its site.

   2.  Define L to be the maximum size, in bytes, a packet of size S
       would be after the ITR prepends the LISP header, UDP header, and
       outer network layer header of size H.

   3.  Calculate: S + H = L.

   When an ITR receives a packet from a site-facing interface and adds H
   bytes worth of encapsulation to yield a packet size greater than L
   bytes, it resolves the MTU issue by first splitting the original
   packet into 2 equal-sized fragments.  A LISP header is then prepended
   to each fragment.  The size of the encapsulated fragments is then
   (S/2 + H), which is less than the ITR's estimate of the path MTU
   between the ITR and its correspondent ETR.

   When an ETR receives encapsulated fragments, it treats them as two
   individually encapsulated packets.  It strips the LISP headers then
   forwards each fragment to the destination host of the destination
   site.  The two fragments are reassembled at the destination host into
   the single IP datagram that was originated by the source host.

   This behavior is performed by the ITR when the source host originates
   a packet with the DF field of the IP header set to 0.  When the DF
   field of the IP header is set to 1, or the packet is an IPv6 packet
   originated by the source host, the ITR will drop the packet when the
   size is greater than L, and send an ICMP Too Big message to the
   source with a value of S, where S is (L - H).

   When the outer header encapsulation uses an IPv4 header, an
   implementation SHOULD set the DF bit to 1 so ETR fragment reassembly
   can be avoided.  An implementation MAY set the DF bit in such headers
   to 0 if it has good reason to believe there are unresolvable path MTU
   issues between the sending ITR and the receiving ETR.

   This specification recommends that L be defined as 1500.

5.4.2.  A Stateful Solution to MTU Handling

   An ITR stateful solution to handle MTU issues is described as follows
   and was first introduced in [OPENLISP]:

   1.  The ITR will keep state of the effective MTU for each locator per
       mapping cache entry.  The effective MTU is what the core network
       can deliver along the path between ITR and ETR.

   2.  When an IPv6 encapsulated packet, or an IPv4 encapsulated packet
       with the DF bit set to 1, exceeds what the core network can
       deliver, one of the intermediate routers on the path will send an
       ICMP Too Big message to the ITR.  The ITR will parse the ICMP
       message to determine which locator is affected by the effective
       MTU change and then record the new effective MTU value in the
       mapping cache entry.

   3.  When a packet is received by the ITR from a source inside of the
       site and the size of the packet is greater than the effective MTU
       stored with the mapping cache entry associated with the
       destination EID the packet is for, the ITR will send an ICMP Too
       Big message back to the source.  The packet size advertised by
       the ITR in the ICMP Too Big message is the effective MTU minus
       the LISP encapsulation length.

   Even though this mechanism is stateful, it has advantages over the
   stateless IP fragmentation mechanism, by not involving the
   destination host with reassembly of ITR fragmented packets.

5.5.  Using Virtualization and Segmentation with LISP

   When multiple organizations inside of a LISP site are using private
   addresses [RFC1918] as EID-prefixes, their address spaces MUST remain
   segregated due to possible address duplication.  An Instance ID in
   the address encoding can aid in making the entire AFI based address
   unique.  See IANA Considerations Section 13.1 for details on
   possible address encodings.

   An Instance ID can be carried in a LISP encapsulated packet.  An ITR
   that prepends a LISP header will copy a 24-bit value, used by the
   LISP router to uniquely identify the address space, to the Instance
   ID field of the LISP header and set the I-bit to 1.

   When an ETR decapsulates a packet, the Instance ID from the LISP
   header is used as a table identifier to locate the forwarding table
   to use for the inner destination EID lookup.

   For example, an 802.1Q VLAN tag or VPN identifier could be used as a
   24-bit Instance ID.

6.  EID-to-RLOC Mapping

6.1.  LISP IPv4 and IPv6 Control Plane Packet Formats

   The following new UDP packet types are used to retrieve EID-to-RLOC
   mappings:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |Version|  IHL  |Type of Service|          Total Length         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |         Identification        |Flags|      Fragment Offset    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |  Time to Live | Protocol = 17 |         Header Checksum       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                    Source Routing Locator                     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                 Destination Routing Locator                   |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |          Source Port          |         Dest Port             |
   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |           UDP Length          |        UDP Checksum           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                         LISP Message                          |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |Version| Traffic Class |           Flow Label                  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |         Payload Length        | Next Header=17|   Hop Limit   |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       +                                                               +
       |                                                               |
       +                     Source Routing Locator                    +
       |                                                               |
       +                                                               +
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       +                                                               +
       |                                                               |
       +                  Destination Routing Locator                  +
       |                                                               |
       +                                                               +
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |          Source Port          |         Dest Port             |
   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |           UDP Length          |        UDP Checksum           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                         LISP Message                          |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The LISP UDP-based messages are the Map-Request and Map-Reply
   messages.  When a UDP Map-Request is sent, the UDP source port is
   chosen by the sender and the destination UDP port number is set to
   4342.  When a UDP Map-Reply is sent, the source UDP port number is
   set to 4342 and the destination UDP port number is copied from the
   source port of either the Map-Request or the invoking data packet.
   Implementations MUST be prepared to accept packets when either the
   source port or destination UDP port is set to 4342 due to NATs
   changing port number values.

   The UDP Length field will reflect the length of the UDP header and
   the LISP Message payload.

   The UDP Checksum is computed and set to non-zero for Map-Request,
   Map-Reply, Map-Register and ECM control messages.  It MUST be checked
   on receipt and if the checksum fails, the packet MUST be dropped.

   LISP-CONS [CONS] uses TCP to send LISP control messages.  The format
   of control messages includes the UDP header so the checksum and
   length fields can be used to protect and delimit message boundaries.
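
   The port rules for UDP-based control messages described above can
   be sketched as follows.  The function names are hypothetical and the
   sketch covers only port selection and acceptance, not checksum
   verification or message parsing.

```python
LISP_CONTROL_PORT = 4342  # UDP port for Map-Request/Map-Reply, per the text above


def is_lisp_control(src_port: int, dst_port: int) -> bool:
    """Accept a packet when either UDP port is 4342 (a NAT may rewrite one)."""
    return LISP_CONTROL_PORT in (src_port, dst_port)


def map_reply_ports(request_src_port: int) -> tuple[int, int]:
    """Return (source, destination) UDP ports for a Map-Reply.

    The destination is copied from the source port of the Map-Request
    (or of the invoking data packet).
    """
    return LISP_CONTROL_PORT, request_src_port
```

   For instance, a Map-Request sent from source port 50000 would be
   answered by a Map-Reply with source port 4342 and destination port
   50000.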

   This main LISP specification is the authoritative source for message
   format definitions for the Map-Request and Map-Reply messages.

6.1.1.  LISP Packet Type Allocations

   This section will be the authoritative source for allocating LISP
   Type values.  Current allocations are:

    Reserved:                          0    b'0000'
    LISP Map-Request:                  1    b'0001'
    LISP Map-Reply:                    2    b'0010'
    LISP Map-Register:                 3    b'0011'
    LISP Encapsulated Control Message: 8    b'1000'

6.1.2.  Map-Request Message Format

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |Type=1 |A|M|P|S|      Reserved       |   IRC   | Record Count  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                         Nonce . . .                           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                         . . . Nonce                           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |         Source-EID-AFI        |   Source EID Address  ...     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |         ITR-RLOC-AFI 1        |    ITR-RLOC Address 1  ...    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                              ...                              |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |         ITR-RLOC-AFI n        |    ITR-RLOC Address n  ...    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     / |   Reserved    | EID mask-len  |        EID-prefix-AFI         |
   Rec +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     \ |                       EID-prefix  ...                         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                   Map-Reply Record  ...                       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                     Mapping Protocol Data                     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Packet field descriptions:

   Type:   1 (Map-Request)

   A: This is an authoritative bit, which is set to 0 for UDP-based
      Map-Requests sent by an ITR.

   M: When set, it indicates a Map-Reply Record segment is included in
      the Map-Request.

   P: This is the probe-bit which indicates that a Map-Request SHOULD be
      treated as a locator reachability probe.  The receiver SHOULD
      respond with a Map-Reply with the probe-bit set, indicating the
      Map-Reply is a locator reachability probe reply, with the nonce
      copied from the Map-Request.  See Section 6.3.2 for more details.

   S: This is the SMR bit.  See Section 6.6.2 for details.

   Reserved:  Set to 0 on transmission and ignored on receipt.

   IRC:  This 5-bit field is the ITR-RLOC Count which encodes the
      additional number of (ITR-RLOC-AFI, ITR-RLOC Address) fields
      present in this message.  At least one (ITR-RLOC-AFI,
      ITR-RLOC-Address) pair must always be encoded.  Multiple ITR-RLOC
      Address fields are used so a Map-Replier can select which
      destination address to use for a Map-Reply.  The IRC value ranges
      from 0 to 31: a value of 0 means one ITR-RLOC address is encoded,
      a value of 1 means two are encoded, and so on up to 31, which
      encodes a total of 32 ITR-RLOC addresses.

   Record Count:  The number of records in this Map-Request message.  A
      record is comprised of the portion of the packet that is labeled
      'Rec' above and occurs the number of times equal to Record Count.
      For this version of the protocol, a receiver MUST accept and
      process Map-Requests that contain one or more records, but a
      sender MUST only send Map-Requests containing one record.  Support
      for requesting multiple EIDs in a single Map-Request message will
      be specified in a future version of the protocol.

   Nonce:  An 8-byte random value created by the sender of the
      Map-Request.  This nonce will be returned in the Map-Reply.  The
      security of the LISP mapping protocol depends critically on the
      strength of the nonce in the Map-Request message.  The nonce
      SHOULD be generated by a properly seeded pseudo-random (or strong
      random) source.  See [RFC4086] for advice on generating
      security-sensitive random data.

   Source-EID-AFI:  Address family of the "Source EID Address" field.

   Source EID Address:  This is the EID of the source host which
      originated the packet which is invoking this Map-Request.  When
      Map-Requests are used for refreshing a map-cache entry or for
      RLOC-probing, an AFI value 0 is used and this field is of zero
      length.

   ITR-RLOC-AFI:  Address family of the "ITR-RLOC Address" field that
      follows this field.

   ITR-RLOC Address:  Used to give the ETR the option of selecting the
      destination address from any address family for the Map-Reply
      message.  This address MUST be a routable RLOC address of the
      sender of the Map-Request message.

   EID mask-len:  Mask length for EID prefix.

   EID-prefix-AFI:  Address family of EID-prefix according to [RFC5226].

   EID-prefix:  4 bytes if an IPv4 address-family, 16 bytes if an IPv6
      address-family.  When a Map-Request is sent by an ITR because a
      data packet is received for a destination where there is no
      mapping entry, the EID-prefix is set to the destination IP address
      of the data packet, and the 'EID mask-len' is set to 32 or 128
      for IPv4 or IPv6, respectively.  When an xTR wants to query a site
      about the status of a mapping it already has cached, the
      EID-prefix used in the Map-Request has the same mask-length as the
      EID-prefix returned from the site when it sent a Map-Reply
      message.

   Map-Reply Record:  When the M bit is set, this field is the size of a
      single "Record" in the Map-Reply format.  This Map-Reply record
      contains the EID-to-RLOC mapping entry associated with the Source
      EID.  This allows the ETR which will receive this Map-Request to
      cache the data if it chooses to do so.

   Mapping Protocol Data:  See [CONS] for details.  This field is
      optional and present when the UDP length indicates there is enough
      space in the packet to include it.

6.1.3.  EID-to-RLOC UDP Map-Request Message

   A Map-Request is sent from an ITR when it needs a mapping for an EID,
   wants to test an RLOC for reachability, or wants to refresh a mapping
   before TTL expiration.  For the initial case, the destination IP
   address used for the Map-Request is the destination-EID from the
   packet which had a mapping cache lookup failure.  For the latter two
   cases, the destination IP address used for the Map-Request is one of
   the RLOC addresses from the locator-set of the map cache entry.  The
   source address is either an IPv4 or IPv6 RLOC address depending on
   whether the Map-Request is using an IPv4 or IPv6 header,
   respectively.  In all cases, the UDP source port number for the
   Map-Request message is an ITR/PITR selected 16-bit value and the UDP
   destination port number is set to the well-known destination port
   number 4342.  A successful Map-Reply updates the cached set of RLOCs
   associated with the EID prefix range.

   One or more Map-Request (ITR-RLOC-AFI, ITR-RLOC-Address) fields MUST
   be filled in by the ITR.  The number of fields (minus 1) encoded MUST
   be placed in the IRC field.  The ITR MAY include all locally
   configured locators in this list or just provide one locator address
   from each address family it supports.  If the ITR erroneously
   provides no ITR-RLOC addresses, the Map-Replier MUST drop the
   Map-Request.
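
   The "count minus 1" encoding of the IRC field, and its position in
   the first 32-bit word of the Map-Request, can be sketched as
   follows.  This is a minimal illustration assuming the bit layout
   shown in Section 6.1.2 (4-bit Type, A/M/P/S bits, 11 reserved bits,
   5-bit IRC, 8-bit Record Count); the function names are hypothetical
   and the remainder of the message is not built.

```python
import struct

MAP_REQUEST_TYPE = 1


def encode_irc(itr_rlocs: list) -> int:
    """Return the IRC value: the number of ITR-RLOC entries minus 1."""
    count = len(itr_rlocs)
    if not 1 <= count <= 32:
        raise ValueError("a Map-Request carries 1 to 32 ITR-RLOC entries")
    return count - 1


def pack_first_word(irc: int, record_count: int = 1,
                    a: int = 0, m: int = 0, p: int = 0, s: int = 0) -> bytes:
    """Pack Type, A/M/P/S, Reserved (zero), IRC and Record Count."""
    word = (
        (MAP_REQUEST_TYPE << 28)
        | (a << 27) | (m << 26) | (p << 25) | (s << 24)
        | ((irc & 0x1F) << 8)      # 5-bit IRC sits above the Record Count
        | (record_count & 0xFF)    # low-order 8 bits
    )
    return struct.pack("!I", word)  # network byte order
```

   For example, an ITR listing one IPv4 and one IPv6 RLOC would encode
   an IRC of 1, and a receiver would read IRC + 1 = 2 ITR-RLOC entries.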

   Map-Requests can also be LISP encapsulated using UDP destination port
   4342 with a LISP type value set to "Encapsulated Control Message",
   when sent from an ITR to a Map-Resolver.  Likewise, Map-Requests are
   LISP encapsulated the same way from a Map-Server to an ETR.  Details
   on encapsulated Map-Requests and Map-Resolvers can be found in
   [LISP-MS].

   Map-Requests MUST be rate-limited.  It is recommended that a
   Map-Request for the same EID-prefix be sent no more than once per
   second.

   An ITR that is configured with mapping database information (i.e. it
   is also an ETR) may optionally include those mappings in a
   Map-Request.  When an ETR configured to accept and verify such
   "piggybacked" mapping data receives such a Map-Request and it does
   not have this mapping in the map-cache, it may originate a "verifying
   Map-Request", addressed to the map-requesting ITR.  If the ETR has a
   map-cache entry that matches the "piggybacked" EID and the RLOC is in
   the locator-set for the entry, then it may send the "verifying
   Map-Request" directly to the originating Map-Request source.  If the
   RLOC is not in the locator-set, then the ETR MUST send the "verifying
   Map-Request" to the "piggybacked" EID.  Doing this forces the
   "verifying Map-Request" to go through the mapping database system to
   reach the authoritative source of information about that EID,
   guarding against RLOC-spoofing in the "piggybacked" mapping data.

6.1.4.  Map-Reply Message Format

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |Type=2 |P|E|            Reserved               | Record Count  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                         Nonce . . .                           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                         . . . Nonce                           |
   +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   |                          Record TTL                           |
   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   R   | Locator Count | EID mask-len  | ACT |A|       Reserved        |
   e   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   c   | Rsvd  |  Map-Version Number   |            EID-AFI            |
   o   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   r   |                          EID-prefix                           |
   d   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  /|    Priority   |    Weight     |  M Priority   |   M Weight    |
   | L +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | o |        Unused Flags     |L|p|R|           Loc-AFI             |
   | c +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  \|                             Locator                           |
   +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                     Mapping Protocol Data                     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Packet field descriptions:

   Type:   2 (Map-Reply)

   P: This is the probe-bit which indicates that the Map-Reply is in
      response to a locator reachability probe Map-Request.  The nonce
      field MUST contain a copy of the nonce value from the original
      Map-Request.  See Section 6.3.2 for more details.

   E: Indicates that the ETR which sends this Map-Reply message is
      advertising that the site is enabled for the Echo-Nonce locator
      reachability algorithm.  See Section 6.3.1 for more details.

   Reserved:  Set to 0 on transmission and ignored on receipt.

   Record Count:  The number of records in this reply message.  A record
      is comprised of that portion of the packet labeled 'Record' above
      and occurs the number of times equal to Record Count.

   Nonce:  A 24-bit value set in a Data-Probe packet or a 64-bit value
      from the Map-Request is echoed in this Nonce field of the
      Map-Reply.

   Record TTL:  The time in minutes the recipient of the Map-Reply will
      store the mapping.  If the TTL is 0, the entry SHOULD be removed
      from the cache immediately.  If the value is 0xffffffff, the
      recipient can decide locally how long to store the mapping.

   Locator Count:  The number of Locator entries.  A locator entry
      comprises what is labeled above as 'Loc'.  The locator count can
      be 0 indicating there are no locators for the EID-prefix.

   EID mask-len:  Mask length for EID prefix.

   ACT:  This 3-bit field describes negative Map-Reply actions.  These
      bits are used only when the 'Locator Count' field is set to 0.
      The action bits are encoded only in Map-Reply messages.  The
      actions defined are used by an ITR or PITR when a destination EID
      matches a negative mapping cache entry.  Unassigned values should
      cause a map-cache entry to be created and, when packets match this
      negative cache entry, they will be dropped.  The current assigned
      values are:

      (0) No-Action:  The map-cache is kept alive and packet
         encapsulation occurs.

      (1) Natively-Forward:  The packet is not encapsulated or dropped
         but natively forwarded.

      (2) Send-Map-Request:  The packet invokes sending a Map-Request.

      (3) Drop:  A packet that matches this map-cache entry is dropped.

   A: The Authoritative bit, when sent in a UDP-based message, is always
      set to 1 by an ETR.  See [CONS] for TCP-based Map-Replies.  When a
      Map-Server is proxy Map-Replying [LISP-MS] for a LISP site, the
      Authoritative bit is set to 0.  This indicates to requesting ITRs
      that the Map-Reply was not originated by a LISP node managed at
      the site that owns the EID-prefix.

   Map-Version Number:  When this 12-bit value is non-zero the Map-Reply
      sender is informing the ITR what the version number is for the
      EID-record contained in the Map-Reply.
The ETR can allocate this 1322 number internally but MUST coordinate this value with other ETRs 1323 for the site. When this value is 0, there is no versioning 1324 information conveyed. The Map-Version Number can be included in 1325 Map-Request and Map-Register messages. See Section 6.6.3 for more 1326 details. 1328 EID-AFI: Address family of EID-prefix according to [RFC5226]. 1330 EID-prefix: 4 bytes if an IPv4 address-family, 16 bytes if an IPv6 1331 address-family. 1333 Priority: each RLOC is assigned a unicast priority. Lower values 1334 are more preferable. When multiple RLOCs have the same priority, 1335 they may be used in a load-split fashion. A value of 255 means 1336 the RLOC MUST NOT be used for unicast forwarding. 1338 Weight: when priorities are the same for multiple RLOCs, the weight 1339 indicates how to balance unicast traffic between them. Weight is 1340 encoded as a relative weight of total unicast packets that match 1341 the mapping entry. If a non-zero weight value is used for any 1342 RLOC, then all RLOCs MUST use a non-zero weight value and then the 1343 sum of all weight values MUST equal 100. If a zero value is used 1344 for any RLOC weight, then all weights MUST be zero and the 1345 receiver of the Map-Reply will decide how to load-split traffic. 1346 See Section 6.5 for a suggested hash algorithm to distribute load 1347 across locators with same priority and equal weight values. 1349 M Priority: each RLOC is assigned a multicast priority used by an 1350 ETR in a receiver multicast site to select an ITR in a source 1351 multicast site for building multicast distribution trees. A value 1352 of 255 means the RLOC MUST NOT be used for joining a multicast 1353 distribution tree. 1355 M Weight: when priorities are the same for multiple RLOCs, the 1356 weight indicates how to balance building multicast distribution 1357 trees across multiple ITRs. 
The weight is encoded as a relative 1358 weight of the total number of trees built to the source site 1359 identified by the EID-prefix. If a non-zero weight value is used 1360 for any RLOC, then all RLOCs MUST use a non-zero weight value and 1361 then the sum of all weight values MUST equal 100. If a zero value 1362 is used for any RLOC weight, then all weights MUST be zero and the 1363 receiver of the Map-Reply will decide how to distribute multicast 1364 state across ITRs. 1366 Unused Flags: set to 0 when sending and ignored on receipt. 1368 L: when this bit is set, the locator is flagged as a local locator to 1369 the ETR that is sending the Map-Reply. When a Map-Server is doing 1370 proxy Map-Replying [LISP-MS] for a LISP site, the L bit is set to 1371 0 for all locators in this locator-set. 1373 p: when this bit is set, an ETR informs the RLOC-probing ITR that the 1374 locator address, for which this bit is set, is the one being RLOC- 1375 probed and may be different from the source address of the Map- 1376 Reply. An ITR that RLOC-probes a particular locator MUST use 1377 this locator for retrieving the data structure used to store the 1378 fact that the locator is reachable. The "p" bit is set for a 1379 single locator in the same locator set. If an implementation sets 1380 more than one "p" bit erroneously, the receiver of the Map-Reply 1381 MUST select the first locator. The "p" bit MUST NOT be set for 1382 locator-set records sent in Map-Request and Map-Register messages. 1384 R: set when the sender of a Map-Reply has a route to the locator in 1385 the locator data record. The receiver may find this useful to 1386 know when determining if the locator is reachable from the 1387 receiver. See also Section 6.4 for another way the R-bit may be 1388 used. 1390 Locator: an IPv4 or IPv6 address (as encoded by the 'Loc-AFI' field) 1391 assigned to an ETR. Note that the destination RLOC address MAY be 1392 an anycast address. 
A source RLOC can be an anycast address as 1393 well. The source or destination RLOC MUST NOT be the broadcast 1394 address (255.255.255.255 or any subnet broadcast address known to 1395 the router), and MUST NOT be a link-local multicast address. The 1396 source RLOC MUST NOT be a multicast address. The destination RLOC 1397 SHOULD be a multicast address if it is being mapped from a 1398 multicast destination EID. 1400 Mapping Protocol Data: See [CONS] or [ALT] for details. This field 1401 is optional and present when the UDP length indicates there is 1402 enough space in the packet to include it. 1404 6.1.5. EID-to-RLOC UDP Map-Reply Message 1406 A Map-Reply returns an EID-prefix with a prefix length that is less 1407 than or equal to the EID being requested. The EID being requested is 1408 either from the destination field of an IP header of a Data-Probe or 1409 the EID record of a Map-Request. The RLOCs in the Map-Reply are 1410 globally-routable IP addresses of all ETRs for the LISP site. Each 1411 RLOC conveys status reachability but does not convey path 1412 reachability from a requester's perspective. Separate testing of path 1413 reachability is required; see Section 6.3 for details. 1415 Note that a Map-Reply may contain a different EID-prefix granularity 1416 (prefix + length) than the Map-Request which triggers it. This might 1417 occur if a Map-Request were for a prefix that had been returned by an 1418 earlier Map-Reply. In such a case, the requester updates its cache 1419 with the new prefix information and granularity. For example, a 1420 requester with two cached EID-prefixes that are covered by a Map- 1421 Reply containing one less-specific prefix replaces both entries with 1422 the less-specific EID-prefix. 
Note that the reverse, replacement of 1423 one less-specific prefix with multiple more-specific prefixes, can 1424 also occur, not by removing the less-specific prefix but by 1425 adding the more-specific prefixes, which during a lookup will override 1426 the less-specific prefix. 1428 When an ETR is configured with overlapping EID-prefixes, the mapping 1429 records that answer a Map-Request, namely the EID-prefix that is the longest match for the requested EID together with its more-specific EID-prefixes, MUST be 1430 returned in a single Map-Reply message. For instance, if an ETR had 1431 database mapping entries for EID-prefixes: 1433 10.0.0.0/8 1434 10.1.0.0/16 1435 10.1.1.0/24 1436 10.1.2.0/24 1438 A Map-Request for EID 10.1.1.1 would cause a Map-Reply with a record 1439 count of 1 to be returned with a mapping record EID-prefix of 1440 10.1.1.0/24. 1442 A Map-Request for EID 10.1.5.5 would cause a Map-Reply with a record 1443 count of 3 to be returned with mapping records for EID-prefixes 1444 10.1.0.0/16, 10.1.1.0/24, and 10.1.2.0/24. 1446 Note that not all overlapping EID-prefixes need to be returned, only 1447 the more-specific entries for the matching EID-prefix of the requested EID (note that in the second example above, 10.0.0.0/8 was 1448 not returned for requested EID 10.1.5.5). 1449 When more than one EID-prefix is 1450 returned, all SHOULD use the same Time-to-Live value so they can all 1451 time out at the same time. When a more specific EID-prefix is 1452 received later, its Time-to-Live value in the Map-Reply record can be 1453 stored even when other less specifics exist. When a less specific 1454 EID-prefix is received later, its map-cache expiration time SHOULD be 1455 set to the minimum expiration time of any more specific EID-prefix in 1456 the map-cache. 1458 Map-Replies SHOULD be sent for an EID-prefix no more often than once 1459 per second to the same requesting router. 
For scalability, it is 1460 expected that aggregation of EID addresses into EID-prefixes will 1461 allow one Map-Reply to satisfy a mapping for the EID addresses in the 1462 prefix range, thereby reducing the number of Map-Request messages. 1464 Map-Reply records can have an empty locator-set. A negative Map- 1465 Reply is a Map-Reply with an empty locator-set. Negative Map-Replies 1466 convey special actions by the sender to the ITR or PTR that 1467 solicited the Map-Reply. There are two primary applications for 1468 Negative Map-Replies. The first is for a Map-Resolver to instruct an 1469 ITR or PTR when a destination is for a LISP site versus a non-LISP 1470 site. The second is to source-quench Map-Requests which are sent 1471 for non-allocated EIDs. 1473 For each Map-Reply record, the list of locators in a locator-set MUST 1474 appear in the same order for each ETR that originates a Map-Reply 1475 message. The locator-set MUST be sorted in order of ascending IP 1476 address where an IPv4 locator address is considered numerically 'less 1477 than' an IPv6 locator address. 1479 When sending a Map-Reply message, the destination address is copied 1480 from one of the ITR-RLOC fields from the Map-Request. The ETR 1481 can choose a locator address from one of the address families it 1482 supports. For Data-Probes, the destination address of the Map-Reply 1483 is copied from the source address of the Data-Probe message which is 1484 invoking the reply. The source address of the Map-Reply is one of 1485 the local locator addresses listed in the locator-set of any mapping 1486 record in the message and SHOULD be chosen to allow uRPF checks to 1487 succeed in the upstream service provider. The destination port of a 1488 Map-Reply message is copied from the source port of the Map-Request 1489 or Data-Probe and the source port of the Map-Reply message is set to 1490 the well-known UDP port 4342. 1492 6.1.5.1. 
Traffic Redirection with Coarse EID-Prefixes 1494 When an ETR is misconfigured or compromised, it could return coarse 1495 EID-prefixes in Map-Reply messages it sends. The EID-prefix could 1496 cover EID-prefixes which are allocated to other sites redirecting 1497 their traffic to the locators of the compromised site. 1499 To solve this problem, there are two basic solutions that could be 1500 used. The first is to have Map-Servers proxy-map-reply on behalf of 1501 ETRs so their registered EID-prefixes are the ones returned in Map- 1502 Replies. Since the interaction between an ETR and Map-Server is 1503 secured with shared-keys, it is more difficult for an ETR to 1504 misbehave. The second solution is to have ITRs and PTRs cache EID- 1505 prefixes with mask-lengths that are greater than or equal to a 1506 configured prefix length. This limits the damage to a specific width 1507 of any EID-prefix advertised, but needs to be coordinated with the 1508 allocation of site prefixes. These solutions can be used 1509 independently or at the same time. 1511 At the time of this writing, other approaches are being considered 1512 and researched. 1514 6.1.6. Map-Register Message Format 1516 The usage details of the Map-Register message can be found in 1517 specification [LISP-MS]. This section solely defines the message 1518 format. 1520 The message is sent in UDP with a destination UDP port of 4342 and a 1521 randomly selected UDP source port number. 1523 The Map-Register message format is: 1525 0 1 2 3 1526 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1527 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1528 |Type=3 |P| Reserved | Record Count | 1529 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1530 | Nonce . . . | 1531 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1532 | . . . 
Nonce | 1533 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1534 | Key ID | Authentication Data Length | 1535 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1536 ~ Authentication Data ~ 1537 +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1538 | | Record TTL | 1539 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1540 R | Locator Count | EID mask-len | ACT |A| Reserved | 1541 e +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1542 c | Rsvd | Map-Version Number | EID-AFI | 1543 o +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1544 r | EID-prefix | 1545 d +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1546 | /| Priority | Weight | M Priority | M Weight | 1547 | L +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1548 | o | Unused Flags |L|p|R| Loc-AFI | 1549 | c +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1550 | \| Locator | 1551 +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1553 Packet field descriptions: 1555 Type: 3 (Map-Register) 1557 P: This is the proxy-map-reply bit. When set to 1, an ETR sends a Map- 1558 Register message requesting that the Map-Server proxy Map-Reply. 1559 The Map-Server will send non-authoritative Map-Replies on behalf 1560 of the ETR. Details on this usage will be provided in a future 1561 version of this draft. 1563 Reserved: Set to 0 on transmission and ignored on receipt. 1565 Record Count: The number of records in this Map-Register message. A 1566 record is comprised of that portion of the packet labeled 'Record' 1567 above and occurs the number of times equal to Record count. 1569 Nonce: This 8-byte Nonce field is set to 0 in Map-Register messages. 1571 Key ID: A configured ID to find the configured Message 1572 Authentication Code (MAC) algorithm and key value used for the 1573 authentication function. 
1575 Authentication Data Length: The length in bytes of the 1576 Authentication Data field that follows this field. The length of 1577 the Authentication Data field is dependent on the Message 1578 Authentication Code (MAC) algorithm used. The length field allows 1579 a device that doesn't know the MAC algorithm to correctly parse 1580 the packet. 1582 Authentication Data: The message digest output by the 1583 Message Authentication Code (MAC) algorithm. The entire Map- 1584 Register payload is authenticated with this field preset to 0. 1585 After the MAC is computed, it is placed in this field. 1586 Implementations of this specification MUST include support for 1587 HMAC-SHA-1-96 [RFC2404]; support for HMAC-SHA-256-128 [RFC4634] 1588 is recommended. 1590 The definition of the rest of the Map-Register can be found in the 1591 Map-Reply section. 1593 6.1.7. Encapsulated Control Message Format 1595 An Encapsulated Control Message is used to encapsulate control 1596 packets sent between xTRs and the mapping database system described 1597 in [LISP-MS]. 
1599 0 1 2 3 1600 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1601 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1602 / | IPv4 or IPv6 Header | 1603 OH | (uses RLOC addresses) | 1604 \ | | 1605 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1606 / | Source Port = xxxx | Dest Port = 4342 | 1607 UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1608 \ | UDP Length | UDP Checksum | 1609 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1610 LH |Type=8 | Reserved | 1611 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1612 / | IPv4 or IPv6 Header | 1613 IH | (uses RLOC or EID addresses) | 1614 \ | | 1615 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1616 / | Source Port = xxxx | Dest Port = yyyy | 1617 UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1618 \ | UDP Length | UDP Checksum | 1619 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1620 LCM | LISP Control Message | 1621 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1623 Packet header descriptions: 1625 OH: The outer IPv4 or IPv6 header which uses RLOC addresses in the 1626 source and destination header address fields. 1628 UDP: The outer UDP header with destination port 4342. The source 1629 port is randomly allocated. The checksum field MUST be non-zero. 1631 LH: Type 8 is defined to be a "LISP Encapsulated Control Message" 1632 and what follows is either an IPv4 or IPv6 header as encoded by 1633 the first 4 bits after the reserved field. 1635 IH: The inner IPv4 or IPv6 header which can use either RLOC or EID 1636 addresses in the header address fields. When a Map-Request is 1637 encapsulated in this packet format, the destination address in this 1638 header is an EID. 1640 UDP: The inner UDP header where the port assignments depend on the 1641 control packet being encapsulated. 
When the control packet is a 1642 Map-Request or Map-Register, the source port is ITR/PITR selected 1643 and the destination port is 4342. When the control packet is a 1644 Map-Reply, the source port is 4342 and the destination port is 1645 assigned from the source port of the invoking Map-Request. Port 1646 number 4341 MUST NOT be assigned to either port. The checksum 1647 field MUST be non-zero. 1649 LCM: The format is one of the control message formats described in 1650 this section. At this time, only Map-Request messages and PIM 1651 Join-Prune messages [MLISP] are allowed to be encapsulated. 1652 Encapsulating other types of LISP control messages is for further 1653 study. When Map-Requests are sent for RLOC-probing purposes (i.e., 1654 the probe-bit is set), they MUST NOT be sent inside Encapsulated 1655 Control Messages. 1657 6.2. Routing Locator Selection 1659 Both the client-side and the server-side may need control over the selection 1660 of RLOCs for conversations between them. This control is achieved by 1661 manipulating the Priority and Weight fields in EID-to-RLOC Map-Reply 1662 messages. Alternatively, RLOC information may be gleaned from 1663 received tunneled packets or EID-to-RLOC Map-Request messages. 1665 The following enumerates different scenarios for choosing RLOCs and 1666 the controls that are available: 1668 o Server-side returns one RLOC. Client-side can only use one RLOC. 1669 Server-side has complete control of the selection. 1671 o Server-side returns a list of RLOCs where a subset of the list has 1672 the same best priority. Client can only use the subset list 1673 according to the weighting assigned by the server-side. In this 1674 case, the server-side controls both the subset list and load- 1675 splitting across its members. The client-side can use RLOCs 1676 outside of the subset list if it determines that the subset list 1677 is unreachable (unless RLOCs are set to a Priority of 255). 
Some 1678 sharing of control exists: the server-side determines the 1679 destination RLOC list and load distribution while the client-side 1680 has the option of using alternatives to this list if RLOCs in the 1681 list are unreachable. 1683 o Server-side sets a weight of 0 for the RLOC subset list. In this 1684 case, the client-side can choose how the traffic load is spread 1685 across the subset list. Control is shared by the server-side 1686 determining the list and the client determining load distribution. 1687 Again, the client can use alternative RLOCs if the server-provided 1688 list of RLOCs is unreachable. 1690 o Either side (more likely the server-side ETR) decides not to 1691 send a Map-Request. For example, if the server-side ETR does not 1692 send Map-Requests, it gleans RLOCs from the client-side ITR, 1693 giving the client-side ITR responsibility for bidirectional RLOC 1694 reachability and preferability. Server-side ETR gleaning of the 1695 client-side ITR RLOC is done by caching the inner header source 1696 EID and the outer header source RLOC of received packets. The 1697 client-side ITR controls how traffic is returned and can alternate 1698 using an outer header source RLOC, which then can be added to the 1699 list the server-side ETR uses to return traffic. Since no 1700 Priorities or Weights are provided using this method, the server- 1701 side ETR MUST assume each client-side ITR RLOC uses the same best 1702 Priority with a Weight of zero. In addition, since EID-prefix 1703 encoding cannot be conveyed in data packets, the EID-to-RLOC cache 1704 on tunnel routers can grow to be very large. 1706 o A "gleaned" map-cache entry, one learned from the source RLOC of a 1707 received encapsulated packet, is only stored and used for a few 1708 seconds, pending verification. Verification is performed by 1709 sending a Map-Request to the source EID (the inner header IP 1710 source address) of the received encapsulated packet. 
A reply to 1711 this "verifying Map-Request" is used to fully populate the map- 1712 cache entry for the "gleaned" EID and is stored and used for the 1713 time indicated by the TTL field of a received Map-Reply. When a 1714 verified map-cache entry is stored, data gleaning no longer occurs 1715 for subsequent packets which have a source EID that matches the 1716 EID-prefix of the verified entry. 1718 RLOCs that appear in EID-to-RLOC Map-Reply messages are assumed to be 1719 reachable when the R-bit for the locator record is set to 1. Neither 1720 the information contained in a Map-Reply nor that stored in the 1721 mapping database system provides reachability information for RLOCs. 1722 Note that reachability is not part of the mapping system and is 1723 determined using one or more of the Routing Locator Reachability 1724 Algorithms described in the next section. 1726 6.3. Routing Locator Reachability 1728 Several mechanisms for determining RLOC reachability are currently 1729 defined: 1731 1. An ETR may examine the Loc-Status-Bits in the LISP header of an 1732 encapsulated data packet received from an ITR. If the ETR is 1733 also acting as an ITR and has traffic to return to the original 1734 ITR site, it can use this status information to help select an 1735 RLOC. 1737 2. An ITR may receive an ICMP Network or ICMP Host Unreachable 1738 message for an RLOC it is using. This indicates that the RLOC is 1739 likely down. 1741 3. An ITR which participates in the global routing system can 1742 determine that an RLOC is down if no BGP RIB route exists that 1743 matches the RLOC IP address. 1745 4. An ITR may receive an ICMP Port Unreachable message from a 1746 destination host. This occurs if an ITR attempts to use 1747 interworking [INTERWORK] and LISP-encapsulated data is sent to a 1748 non-LISP-capable site. 1750 5. An ITR may receive a Map-Reply from an ETR in response to a 1751 previously sent Map-Request. 
The RLOC source of the Map-Reply is 1752 likely up since the ETR was able to send the Map-Reply to the 1753 ITR. 1755 6. When an ETR receives an encapsulated packet from an ITR, the 1756 source RLOC from the outer header of the packet is likely up. 1758 7. An ITR/ETR pair can use the Locator Reachability Algorithms 1759 described in this section, namely Echo-Noncing or RLOC-Probing. 1761 When determining Locator up/down reachability by examining the Loc- 1762 Status-Bits from the LISP encapsulated data packet, an ETR will 1763 receive up to date status from an encapsulating ITR about 1764 reachability for all ETRs at the site. CE-based ITRs at the source 1765 site can determine reachability relative to each other using the site 1766 IGP as follows: 1768 o Under normal circumstances, each ITR will advertise a default 1769 route into the site IGP. 1771 o If an ITR fails or if the upstream link to its PE fails, its 1772 default route will either time-out or be withdrawn. 1774 Each ITR can thus observe the presence or lack of a default route 1775 originated by the others to determine the Locator Status Bits it sets 1776 for them. 1778 RLOCs listed in a Map-Reply are numbered with ordinals 0 to n-1. The 1779 Loc-Status-Bits in a LISP encapsulated packet are numbered from 0 to 1780 n-1 starting with the least significant bit. For example, if an RLOC 1781 listed in the 3rd position of the Map-Reply goes down (ordinal value 1782 2), then all ITRs at the site will clear the 3rd least significant 1783 bit (xxxx x0xx) of the Loc-Status-Bits field for the packets they 1784 encapsulate. 1786 When an ETR decapsulates a packet, it will check for any change in 1787 the Loc-Status-Bits field. When a bit goes from 1 to 0, the ETR will 1788 refrain from encapsulating packets to an RLOC that is indicated as 1789 down. It will only resume using that RLOC if the corresponding Loc- 1790 Status-Bit returns to a value of 1. 
Loc-Status-Bits are associated 1791 with a locator-set per EID-prefix. Therefore, when a locator becomes 1792 unreachable, the Loc-Status-Bit that corresponds to that locator's 1793 position in the list returned by the last Map-Reply will be set to 1794 zero for that particular EID-prefix. 1796 When ITRs at the site are not deployed in CE routers, the IGP can 1797 still be used to determine the reachability of Locators provided they 1798 are injected into the IGP. This is typically done when a /32 address 1799 is configured on a loopback interface. 1801 When ITRs receive ICMP Network or Host Unreachable messages as a 1802 method to determine unreachability, they will refrain from using 1803 Locators which are described in Locator lists of Map-Replies. 1804 However, using this approach is unreliable because many network 1805 operators turn off generation of ICMP Unreachable messages. 1807 If an ITR does receive an ICMP Network or Host Unreachable message, 1808 it MAY originate its own ICMP Unreachable message destined for the 1809 host that originated the data packet the ITR encapsulated. 1811 Also, BGP-enabled ITRs can unilaterally examine the BGP RIB to see if 1812 a locator address from a locator-set in a mapping entry matches a 1813 prefix. If it does not find one and BGP is running in the Default 1814 Free Zone (DFZ), it can decide to not use the locator even though the 1815 Loc-Status-Bits indicate the locator is up. In this case, the path 1816 from the ITR to the ETR that is assigned the locator is not 1817 available. More details are in [LOC-ID-ARCH]. 1819 Optionally, an ITR can send a Map-Request to a Locator and if a Map- 1820 Reply is returned, reachability of the Locator has been determined. 1821 Obviously, sending such probes increases the number of control 1822 messages originated by tunnel routers for active flows, so Locators 1823 are assumed to be reachable when they are advertised. 
1825 This assumption does create a dependency: Locator unreachability is 1826 detected by the receipt of ICMP Host Unreachable messages. When a 1827 Locator has been determined to be unreachable, it is not used for 1828 active traffic; this is the same as if it were listed in a Map-Reply 1829 with priority 255. 1831 The ITR can test the reachability of the unreachable Locator by 1832 sending periodic Map-Requests. Both Map-Requests and Map-Replies MUST be rate- 1833 limited. Locator reachability testing is never done with data 1834 packets since that increases the risk of packet loss for end-to-end 1835 sessions. 1837 When an ETR decapsulates a packet, it knows that it is reachable from 1838 the encapsulating ITR because that is how the packet arrived. In 1839 most cases, the ETR can also reach the ITR but cannot assume this to 1840 be true due to the possibility of path asymmetry. In the presence of 1841 unidirectional traffic flow from an ITR to an ETR, the ITR SHOULD NOT 1842 use the lack of return traffic as an indication that the ETR is 1843 unreachable. Instead, it MUST use an alternate mechanism to 1844 determine reachability. 1846 6.3.1. Echo Nonce Algorithm 1848 When data flows bidirectionally between locators from different 1849 sites, a simple mechanism called "nonce echoing" can be used to 1850 determine reachability between an ITR and ETR. When an ITR wants to 1851 solicit a nonce echo, it sets the N and E bits and places a 24-bit 1852 nonce in the LISP header of the next encapsulated data packet. 1854 When this packet is received by the ETR, the encapsulated packet is 1855 forwarded as normal. When the ETR next sends a data packet to the 1856 ITR, it includes the nonce received earlier with the N bit set and E 1857 bit cleared. The ITR sees this "echoed nonce" and knows the path to 1858 and from the ETR is up. 1860 The ITR will set the E-bit and N-bit for every packet it sends while 1861 in echo-nonce-request state. 
The time the ITR waits to process the 1862 echoed nonce before it determines the path is unreachable is variable 1863 and a choice left for the implementation. 1865 If the ITR is receiving packets from the ETR but does not see the 1866 nonce echoed while in echo-nonce-request state, then the path 1867 to the ETR is unreachable. This decision may be overridden by other 1868 locator reachability algorithms. Once the ITR determines the path to 1869 the ETR is down, it can switch to another locator for that EID-prefix. 1871 Note that "ITR" and "ETR" are relative terms here. Both devices MUST 1872 implement both ITR and ETR functionality for the echo nonce 1873 mechanism to operate. 1875 The ITR and ETR may both go into echo-nonce-request state at the same 1876 time. The number of packets sent or the time during which echo nonce 1877 requests are sent is an implementation-specific setting. However, 1878 when an ITR is in echo-nonce-request state, it can echo the ETR's 1879 nonce in the next set of packets that it encapsulates and then 1880 subsequently continue sending echo-nonce-request packets. 1882 This mechanism does not completely solve the forward path 1883 reachability problem as traffic may be unidirectional. That is, the 1884 ETR receiving traffic at a site may not be the same device as an ITR 1885 which transmits traffic from that site, or the site-to-site traffic is 1886 unidirectional so there is no ITR returning traffic. 1888 The echo-nonce algorithm is bilateral. That is, if one side sets the 1889 E-bit and the other side is not enabled for echo-noncing, then the 1890 echoing of the nonce does not occur and the requesting side may 1891 erroneously regard the locator as unreachable. An ITR SHOULD only set 1892 the E-bit in an encapsulated data packet when it knows the ETR is 1893 enabled for echo-noncing. This is conveyed by the E-bit in the Map- 1894 Reply message.
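The echo-nonce exchange described in this section can be sketched as a small per-locator state object. This is an illustrative sketch under stated assumptions, not a definitive implementation: the names DataHeader and EchoNonceState are hypothetical, the LISP data header is reduced to the N bit, E bit and 24-bit nonce, and the timer that declares the path unreachable is omitted because the text leaves that choice to the implementation.

```python
import secrets
from dataclasses import dataclass


@dataclass
class DataHeader:
    """Only the LISP data header fields the echo-nonce exchange needs."""
    n_bit: bool = False
    e_bit: bool = False
    nonce: int = 0  # 24-bit value, meaningful only when n_bit is set


class EchoNonceState:
    """Per-locator echo-nonce state kept by a router acting as ITR and ETR."""

    def __init__(self, peer_echo_enabled):
        # Assumed to be learned from the E-bit of the peer's Map-Reply.
        self.peer_echo_enabled = peer_echo_enabled
        self.request_nonce = None   # set while in echo-nonce-request state
        self.pending_echo = None    # peer's nonce waiting to be echoed
        self.path_confirmed = False

    def start_request(self):
        """Enter echo-nonce-request state if the peer supports echoing."""
        if not self.peer_echo_enabled:
            return False
        self.request_nonce = secrets.randbits(24)
        self.path_confirmed = False
        return True

    def next_header(self):
        """Header for the next encapsulated data packet sent to the peer."""
        if self.pending_echo is not None:
            # Echo the peer's nonce first: N bit set, E bit cleared.
            hdr = DataHeader(True, False, self.pending_echo)
            self.pending_echo = None
            return hdr
        if self.request_nonce is not None:
            # Still requesting: N and E bits set on every packet.
            return DataHeader(True, True, self.request_nonce)
        return DataHeader()

    def on_receive(self, hdr):
        """Process the header of a decapsulated data packet from the peer."""
        if not hdr.n_bit:
            return
        if hdr.e_bit:
            # The peer is soliciting an echo of its nonce.
            self.pending_echo = hdr.nonce
        elif hdr.nonce == self.request_nonce:
            # Our nonce came back: the path to and from the peer is up.
            self.path_confirmed = True
            self.request_nonce = None


itr, etr = EchoNonceState(True), EchoNonceState(True)
itr.start_request()
etr.on_receive(itr.next_header())   # request travels ITR -> ETR
itr.on_receive(etr.next_header())   # echo travels ETR -> ITR
print(itr.path_confirmed)
```

In this sketch a pending echo takes precedence over the router's own request, which mirrors the case where both sides are in echo-nonce-request state at the same time. Refusing to start a request when the peer has not advertised echo-nonce support avoids the erroneous unreachability verdict noted above.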

Note that other locator reachability mechanisms are being researched and can be used to complement or even override the Echo Nonce Algorithm. See next section for an example of control-plane probing.

6.3.2.  RLOC Probing Algorithm

RLOC Probing is a method that an ITR or PTR can use to determine the reachability status of one or more locators that it has cached in a map-cache entry. The probe-bit of the Map-Request and Map-Reply messages is used for RLOC Probing.

RLOC probing is done in the control-plane on a timer basis where an ITR or PTR will originate a Map-Request destined to a locator address from one of its own locator addresses. A Map-Request used as an RLOC-probe is NOT encapsulated and NOT sent to a Map-Server or on the ALT like one would when soliciting mapping data. The EID record encoded in the Map-Request is the EID-prefix of the map-cache entry cached by the ITR or PTR. The ITR may include a mapping data record for its own database mapping information which contains the local EID-prefixes and RLOCs for its site.

When an ETR receives a Map-Request message with the probe-bit set, it returns a Map-Reply with the probe-bit set. The source address of the Map-Reply is set from the destination address of the Map-Request and the destination address of the Map-Reply is set from the source address of the Map-Request. The Map-Reply SHOULD contain mapping data for the EID-prefix contained in the Map-Request. This provides the opportunity for the ITR or PTR that sent the RLOC-probe to get mapping updates if there were changes to the ETR's database mapping entries.

There are advantages and disadvantages of RLOC Probing.
The greatest benefit of RLOC Probing is that it can handle many failure scenarios allowing the ITR to determine when the path to a specific locator is reachable or has become unreachable, thus providing a robust mechanism for switching to using another locator from the cached locator. RLOC Probing can also provide rough RTT estimates between a pair of locators which can be useful for network management purposes as well as for selecting low delay paths. The major disadvantage of RLOC Probing is in the number of control messages required and the amount of bandwidth used to obtain those benefits, especially if the required failure detection times are very small.

Continued research and testing will attempt to characterize the tradeoffs of failure detection times versus message overhead.

6.4.  EID Reachability within a LISP Site

A site may be multihomed using two or more ETRs. The hosts and infrastructure within a site will be addressed using one or more EID prefixes that are mapped to the RLOCs of the relevant ETRs in the mapping system. One possible failure mode is for an ETR to lose reachability to one or more of the EID prefixes within its own site. When this occurs and the ETR sends Map-Replies, it can clear the R-bit associated with its own locator. And when the ETR is also an ITR, it can clear its locator-status-bit in the encapsulation data header.

6.5.  Routing Locator Hashing

When an ETR provides an EID-to-RLOC mapping in a Map-Reply message to a requesting ITR, the locator-set for the EID-prefix may contain different priority values for each locator address. When more than one best priority locator exists, the ITR can decide how to load share traffic against the corresponding locators.

The following hash algorithm may be used by an ITR to select a locator for a packet destined to an EID for the EID-to-RLOC mapping:

1.  Either a source and destination address hash can be used or the traditional 5-tuple hash, which includes the source and destination addresses, source and destination TCP, UDP, or SCTP port numbers, and the IP protocol number field or IPv6 next-protocol fields of a packet that a host originates from within a LISP site. When a packet is not a TCP, UDP, or SCTP packet, only the source and destination addresses from the header are used to compute the hash.

2.  Take the hash value and divide it by the number of locators stored in the locator-set for the EID-to-RLOC mapping.

3.  The remainder will yield a value of 0 to "number of locators minus 1". Use the remainder to select the locator in the locator-set.

Note that when a packet is LISP encapsulated, the source port number in the outer UDP header needs to be set. Selecting a hashed value allows core routers which are attached to Link Aggregation Groups (LAGs) to load-split the encapsulated packets across member links of such LAGs. Otherwise, core routers would see a single flow, since packets have a source address of the ITR, for packets which are originated by different EIDs at the source site. A suggested setting for the source port number computed by an ITR is a 5-tuple hash function on the inner header, as described above.

Many core router implementations use a 5-tuple hash to decide how to balance packet load across members of a LAG. The 5-tuple hash includes the source and destination addresses of the packet and the source and destination ports when the protocol number in the packet is TCP or UDP. For this reason, UDP encoding is used for LISP encapsulation.

6.6.  Changing the Contents of EID-to-RLOC Mappings

Since the LISP architecture uses a caching scheme to retrieve and store EID-to-RLOC mappings, the only way an ITR can get a more up-to-date mapping is to re-request the mapping. However, the ITRs do not know when the mappings change and the ETRs do not keep track of which ITRs requested their mappings. For scalability reasons, we want to maintain this approach but need to provide a way for ETRs to change their mappings and inform the sites that are currently communicating with the ETR site using such mappings.

When a locator record is added to the end of a locator-set, it is easy to update mappings. We assume new mappings will maintain the same locator ordering as the old mapping but just have new locators appended to the end of the list. So some ITRs can have a new mapping while other ITRs have only an old mapping that is used until they time out. When an ITR has only an old mapping but detects bits set in the loc-status-bits that correspond to locators beyond the list it has cached, it simply ignores them. However, this can only happen for locator addresses that are lexicographically greater than the locator addresses in the existing locator-set.

When a locator record is removed from a locator-set, ITRs that have the mapping cached will not use the removed locator because the xTRs will set the loc-status-bit to 0. So even if the locator is in the list, it will not be used. For new mapping requests, the xTRs can set the locator AFI to 0 (indicating an unspecified address), as well as setting the corresponding loc-status-bit to 0. This forces ITRs with old or new mappings to avoid using the removed locator.

If many changes occur to a mapping over a long period of time, one will find empty record slots in the middle of the locator-set and new records appended to the locator-set.
At some point, it would be useful to compact the locator-set so the loc-status-bit settings can be efficiently packed.

We propose here three approaches for locator-set compaction, one operational and two protocol mechanisms. The operational approach uses a clock sweep method. The protocol approaches use the concept of Solicit-Map-Requests and Map-Versioning.

6.6.1.  Clock Sweep

The clock sweep approach uses planning in advance and the use of count-down TTLs to time out mappings that have already been cached. The default setting for an EID-to-RLOC mapping TTL is 24 hours. So there is a 24-hour window to time out old mappings. The following clock sweep procedure is used:

1.  24 hours before a mapping change is to take effect, a network administrator configures the ETRs at a site to start the clock sweep window.

2.  During the clock sweep window, ETRs continue to send Map-Reply messages with the current (unchanged) mapping records. The TTL for these mappings is set to 1 hour.

3.  24 hours later, all previous cache entries will have timed out, and any active cache entries will time out within 1 hour. During this 1 hour window the ETRs continue to send Map-Reply messages with the current (unchanged) mapping records with the TTL set to 1 minute.

4.  At the end of the 1 hour window, the ETRs will send Map-Reply messages with the new (changed) mapping records. So any active caches can get the new mapping contents right away if not cached, or in 1 minute if they had the mapping cached. The new mappings are cached with a time to live equal to the TTL in the Map-Reply.

6.6.2.  Solicit-Map-Request (SMR)

Soliciting a Map-Request is a selective way for ETRs, at the site where mappings change, to control the rate they receive requests for Map-Reply messages.
SMRs are also used to tell remote ITRs to update the mappings they have cached.

Since the ETRs don't keep track of remote ITRs that have cached their mappings, they do not know which ITRs need to have their mappings updated. As a result, an ETR will solicit Map-Requests (called an SMR message) from those sites to which it has been sending encapsulated data for the last minute. In particular, an ETR will send an SMR to an ITR to which it has recently sent encapsulated data.

An SMR message is simply a bit set in a Map-Request message. An ITR or PTR will send a Map-Request when it receives an SMR message. Both the SMR sender and the Map-Request responder MUST rate-limit these messages. Rate-limiting can be implemented as a global rate-limiter or one rate-limiter per SMR destination.

The following procedure shows how an SMR exchange occurs when a site is doing locator-set compaction for an EID-to-RLOC mapping:

1.  When the database mappings in an ETR change, the ETRs at the site begin to send Map-Requests with the SMR bit set for each locator in each map-cache entry the ETR caches.

2.  A remote ITR which receives the SMR message will schedule sending a Map-Request message to the source locator address of the SMR message or to the mapping database system. A newly allocated random nonce is selected and the EID-prefix used is the one copied from the SMR message. If the source locator is the only locator in the cached locator-set, the remote ITR SHOULD send a Map-Request to the database mapping system just in case the single locator has changed and may no longer be reachable to accept the Map-Request.

3.  The remote ITR MUST rate-limit the Map-Request until it gets a Map-Reply while continuing to use the cached mapping.
When Map Versioning is used, described in Section 6.6.3, an SMR sender can detect if an ITR is using the most up-to-date database mapping.

4.  The ETRs at the site with the changed mapping will reply to the Map-Request with a Map-Reply message that has a nonce from the SMR-invoked Map-Request. The Map-Reply messages SHOULD be rate-limited. This is important to avoid Map-Reply implosion.

5.  The ETRs, at the site with the changed mapping, record the fact that the site that sent the Map-Request has received the new mapping data in the mapping cache entry for the remote site so the loc-status-bits are reflective of the new mapping for packets going to the remote site. The ETR then stops sending SMR messages.

For security reasons an ITR MUST NOT process unsolicited Map-Replies. To avoid map-cache entry corruption by a third party, a sender of an SMR-based Map-Request MUST be verified. If an ITR receives an SMR-based Map-Request and the source is not in the locator-set for the stored map-cache entry, then the responding Map-Request MUST be sent with an EID destination to the mapping database system. Since the mapping database system is a more secure way to reach an authoritative ETR, it will deliver the Map-Request to the authoritative source of the mapping data.

When an ITR receives an SMR-based Map-Request for which it does not have a cached mapping for the EID in the SMR message, it MAY choose not to send an SMR-invoked Map-Request. This scenario can occur when an ETR sends SMR messages to all locators in the locator-set it has stored in its map-cache but the remote ITRs that receive the SMR may not be sending packets to the site. There is no point in updating the ITRs until they need to send, in which case they will send Map-Requests to obtain a map-cache entry.

6.6.3.  Database Map Versioning

When there is unidirectional packet flow between an ITR and ETR, and the EID-to-RLOC mappings change on the ETR, it needs to inform the ITR so encapsulation can stop to a removed locator and start to a new locator in the locator-set.

An ETR, when it sends Map-Reply messages, conveys its own Map-Version number. This is known as the Destination Map-Version Number. ITRs include the Destination Map-Version Number in packets they encapsulate to the site. When an ETR decapsulates a packet and detects the Destination Map-Version Number is less than the current version for its mapping, the SMR procedure described in Section 6.6.2 occurs.

An ITR, when it encapsulates packets to ETRs, can convey its own Map-Version number. This is known as the Source Map-Version Number. When an ETR decapsulates a packet and detects the Source Map-Version Number is greater than the last Map-Version Number sent in a Map-Reply from the ITR's site, the ETR will send a Map-Request to one of the ETRs for the source site.

A Map-Version Number is used as a sequence number per EID-prefix. So values that are greater are considered to be more recent. A value of 0 for the Source Map-Version Number or the Destination Map-Version Number conveys no versioning information and an ITR does no comparison with previously received Map-Version Numbers.

A Map-Version Number can be included in Map-Register messages as well. This is a good way for the Map-Server to assure that all ETRs for a site registering to it will be Map-Version number synchronized.

See [VERSIONING] for a more detailed analysis and description of Database Map Versioning.

7.  Router Performance Considerations

LISP is designed to be friendly to hardware-based forwarding.
A few implementation techniques can be used to incrementally implement LISP:

o  When a tunnel encapsulated packet is received by an ETR, the outer destination address may not be the address of the router. This makes it challenging for the control plane to get packets from the hardware. This may be mitigated by creating special FIB entries for the EID-prefixes of EIDs served by the ETR (those for which the router provides an RLOC translation). These FIB entries are marked with a flag indicating that control plane processing should be performed. The forwarding logic of testing for a particular IP protocol number value is not necessary. No changes to existing, deployed hardware should be needed to support this.

o  On an ITR, prepending a new IP header consists of adding more bytes to a MAC rewrite string and prepending the string as part of the outgoing encapsulation procedure. Routers that support GRE tunneling [RFC2784] or 6to4 tunneling [RFC3056] may already support this action.

o  A packet's source address or the interface the packet was received on can be used to select a VRF (Virtual Routing/Forwarding). The VRF's routing table can be used to find EID-to-RLOC mappings.

8.  Deployment Scenarios

This section will explore how and where ITRs and ETRs can be deployed and will discuss the pros and cons of each deployment scenario. There are two basic deployment trade-offs to consider: centralized versus distributed caches and flat, recursive, or re-encapsulating tunneling.

When deciding on centralized versus distributed caching, the following issues should be considered:

o  Are the tunnel routers spread out so that the caches are spread across all the memories of each router?

o  Should management "touch points" be minimized by choosing few tunnel routers, just enough for redundancy?

o  In general, using more ITRs doesn't increase management load, since caches are built and stored dynamically. On the other hand, more ETRs do require more management since EID-prefix-to-RLOC mappings need to be explicitly configured.

When deciding on flat, recursive, or re-encapsulation tunneling, the following issues should be considered:

o  Flat tunneling implements a single tunnel between source site and destination site. This generally offers better paths between sources and destinations with a single tunnel path.

o  Recursive tunneling is when tunneled traffic is again further encapsulated in another tunnel, either to implement VPNs or to perform Traffic Engineering. When doing VPN-based tunneling, the site has some control since the site is prepending a new tunnel header. In the case of TE-based tunneling, the site may have control if it is prepending a new tunnel header, but if the site's ISP is doing the TE, then the site has no control. Recursive tunneling generally will result in suboptimal paths but with the benefit of steering traffic to parts of the network that have resources available.

o  The technique of re-encapsulation ensures that packets only require one tunnel header. So if a packet needs to be rerouted, it is first decapsulated by the ETR and then re-encapsulated with a new tunnel header using a new RLOC.

The next sub-sections will survey where tunnel routers can reside in the network.

8.1.  First-hop/Last-hop Tunnel Routers

By locating tunnel routers close to hosts, the EID-prefix set is at the granularity of an IP subnet. So at the expense of more EID-prefix-to-RLOC sets for the site, the caches in each tunnel router can remain relatively small. But caches always depend on the number of non-aggregated EID destination flows active through these tunnel routers.

With more tunnel routers doing encapsulation, control traffic grows as well: since the EID-granularity is greater, more Map-Requests and Map-Replies are traveling between more routers.

The advantage of placing the caches and databases at these stub routers is that the products deployed in this part of the network have better price-memory ratios than their core router counterparts. Memory is typically less expensive in these devices and fewer routes are stored (only IGP routes). These devices tend to have excess capacity, both for forwarding and routing state.

LISP functionality can also be deployed in edge switches. These devices generally have layer-2 ports facing hosts and layer-3 ports facing the Internet. Spare capacity is also often available in these devices.

8.2.  Border/Edge Tunnel Routers

Using customer-edge (CE) routers for tunnel endpoints allows the EID space associated with a site to be reachable via a small set of RLOCs assigned to the CE routers for that site. This is the default behavior envisioned in the rest of this specification.

This offers the opposite benefit of the first-hop/last-hop tunnel router scenario: the number of mapping entries and network management touch points are reduced, allowing better scaling.

One disadvantage is that less of the network's resources are used to reach host endpoints, thereby centralizing the point-of-failure domain and creating network choke points at the CE router.

Note that more than one CE router at a site can be configured with the same IP address. In this case an RLOC is an anycast address. This allows resilience between the CE routers. That is, if a CE router fails, traffic is automatically routed to the other routers using the same anycast address.
However, this comes with the disadvantage that the site cannot control the entrance point when the anycast route is advertised out from all border routers. Another disadvantage of using anycast locators is the limited advertisement scope of /32 (or /128 for IPv6) routes.

8.3.  ISP Provider-Edge (PE) Tunnel Routers

Use of ISP PE routers as tunnel endpoint routers is not the typical deployment scenario envisioned in the specification. This section attempts to capture some of the reasoning behind this preference of implementing LISP on CE routers.

Use of ISP PE routers as tunnel endpoint routers gives an ISP, rather than a site, control over the location of the egress tunnel endpoints. That is, the ISP can decide if the tunnel endpoints are in the destination site (in either CE routers or last-hop routers within a site) or at other PE edges. The advantage of this case is that two tunnel headers can be avoided. By having the PE be the first router on the path to encapsulate, it can choose a TE path first, and the ETR can decapsulate and re-encapsulate for a tunnel to the destination end site.

An obvious disadvantage is that the end site has no control over where its packets flow or the RLOCs used. Other disadvantages include the difficulty in synchronizing path liveness updates between CE and PE routers.

As mentioned in earlier sections, a combination of these scenarios is possible at the expense of extra packet header overhead; if both site and provider want control, then recursive or re-encapsulating tunnels are used.

8.4.  LISP Functionality with Conventional NATs

LISP routers can be deployed behind Network Address Translator (NAT) devices to provide the same set of packet services hosts have today when they are addressed out of private address space.

It is important to note that a locator address in any LISP control message MUST be a globally routable address and therefore SHOULD NOT contain [RFC1918] addresses. If a LISP router is configured with private addresses, they MUST be used only in the outer IP header so the NAT device can translate properly. Otherwise, EID addresses MUST be translated before encapsulation is performed. Both NAT translation and LISP encapsulation functions could be co-located in the same device.

More details on LISP address translation can be found in [INTERWORK].

9.  Traceroute Considerations

When a source host in a LISP site initiates a traceroute to a destination host in another LISP site, it is highly desirable for it to see the entire path. Since packets are encapsulated from ITR to ETR, the hop across the tunnel could be viewed as a single hop. However, LISP traceroute will provide the entire path so the user can see 3 distinct segments of the path from a source LISP host to a destination LISP host:

   Segment 1 (in source LISP site based on EIDs):

       source-host ---> first-hop ... next-hop ---> ITR

   Segment 2 (in the core network based on RLOCs):

       ITR ---> next-hop ... next-hop ---> ETR

   Segment 3 (in the destination LISP site based on EIDs):

       ETR ---> next-hop ... last-hop ---> destination-host

For segment 1 of the path, ICMP Time Exceeded messages are returned in the normal manner as they are today. The ITR performs a TTL decrement and tests for 0 before encapsulating. So the ITR hop is seen by the traceroute source as having an EID address (the address of the site-facing interface).

For segment 2 of the path, ICMP Time Exceeded messages are returned to the ITR because the TTL decrement to 0 is done on the outer header, so the destination of the ICMP messages is the ITR RLOC address, the source RLOC address of the encapsulated traceroute packet. The ITR looks inside of the ICMP payload to inspect the traceroute source so it can return the ICMP message to the address of the traceroute client as well as retaining the core router IP address in the ICMP message. This is so the traceroute client can display the core router address (the RLOC address) in the traceroute output. The ETR returns its RLOC address and responds to the TTL decrement to 0 like the previous core routers did.

For segment 3, the next-hop router downstream from the ETR will be decrementing the TTL for the packet that was encapsulated, sent into the core, decapsulated by the ETR, and forwarded because it isn't the final destination. If the TTL is decremented to 0, any router on the path to the destination of the traceroute, including the next-hop router or destination, will send an ICMP Time Exceeded message to the source EID of the traceroute client. The ICMP message will be encapsulated by the local ITR and sent back to the ETR in the originated traceroute source site, where the packet will be delivered to the host.

9.1.  IPv6 Traceroute

IPv6 traceroute follows the procedure described above since the entire traceroute data packet is included in the ICMP Time Exceeded message payload. Therefore, only the ITR needs to pay special attention for forwarding ICMP messages back to the traceroute source.

9.2.  IPv4 Traceroute

For IPv4 traceroute, we cannot follow the above procedure since IPv4 ICMP Time Exceeded messages only include the invoking IP header and 8 bytes that follow the IP header.
Therefore, when a core router sends an IPv4 Time Exceeded message to an ITR, all the ITR has in the ICMP payload is the encapsulated header it prepended followed by a UDP header. The original invoking IP header, and therefore the identity of the traceroute source, is lost.

The solution we propose to solve this problem is to cache traceroute IPv4 headers in the ITR and to match them up with corresponding IPv4 Time Exceeded messages received from core routers and the ETR. The ITR will use a circular buffer for caching the IPv4 and UDP headers of traceroute packets. It will select a 16-bit number as a key to find them later when the IPv4 Time Exceeded messages are received. When an ITR encapsulates an IPv4 traceroute packet, it will use the 16-bit number as the UDP source port in the encapsulating header. When the ICMP Time Exceeded message is returned to the ITR, the UDP header of the encapsulating header is present in the ICMP payload, thereby allowing the ITR to find the cached headers for the traceroute source. The ITR puts the cached headers in the payload and sends the ICMP Time Exceeded message to the traceroute source retaining the source address of the original ICMP Time Exceeded message (a core router or the ETR of the site of the traceroute destination).

The signature of a traceroute packet comes in two forms. The first form is encoded as a UDP message where the destination port is inspected for a range of values. The second form is encoded as an ICMP message where the IP identification field is inspected for a well-known value.

9.3.  Traceroute using Mixed Locators

When either an IPv4 traceroute or IPv6 traceroute is originated and the ITR encapsulates it in the other address family header, you cannot get all 3 segments of the traceroute.
Segment 2 of the traceroute cannot be conveyed to the traceroute source since it is expecting addresses from intermediate hops in the same address format for the type of traceroute it originated. Therefore, in this case, segment 2 will make the tunnel look like one hop. All the ITR has to do to make this work is to not copy the inner TTL to the outer, encapsulating header's TTL when a traceroute packet is encapsulated using an RLOC from a different address family. This will cause no TTL decrement to 0 to occur in core routers between the ITR and ETR.

10.  Mobility Considerations

There are several kinds of mobility of which only some might be of concern to LISP. Essentially they are as follows.

10.1.  Site Mobility

A site wishes to change its attachment points to the Internet, and its LISP Tunnel Routers will have new RLOCs when it changes upstream providers. Changes in EID-RLOC mappings for sites are expected to be handled by configuration, outside of the LISP protocol.

10.2.  Slow Endpoint Mobility

An individual endpoint wishes to move, but is not concerned about maintaining session continuity. Renumbering is involved. LISP can help with the issues surrounding renumbering [RFC4192] [LISA96] by decoupling the address space used by a site from the address spaces used by its ISPs [RFC4984].

10.3.  Fast Endpoint Mobility

Fast endpoint mobility occurs when an endpoint moves relatively rapidly, changing its IP layer network attachment point. Maintenance of session continuity is a goal. This is where the Mobile IPv4 [RFC3344bis] and Mobile IPv6 [RFC3775] [RFC4866] mechanisms are used, and primarily where interactions with LISP need to be explored.

The problem is that as an endpoint moves, it may require changes to the mapping between its EID and a set of RLOCs for its new network location.
When this is added to the overhead of mobile IP binding updates, some packets might be delayed or dropped.

In IPv4 mobility, when an endpoint is away from home, packets to it are encapsulated and forwarded via a home agent which resides in the home area the endpoint's address belongs to. The home agent will encapsulate and forward packets either directly to the endpoint or to a foreign agent which resides where the endpoint has moved to. Packets from the endpoint may be sent directly to the correspondent node, may be sent via the foreign agent, or may be reverse-tunneled back to the home agent for delivery to the mobile node. As the mobile node's EID or available RLOC changes, LISP EID-to-RLOC mappings are required for communication between the mobile node and the home agent, whether via foreign agent or not. As a mobile endpoint changes networks, up to three LISP mapping changes may be required:

o  The mobile node moves from an old location to a new visited network location and notifies its home agent that it has done so. The Mobile IPv4 control packets the mobile node sends pass through one of the new visited network's ITRs, which needs an EID-RLOC mapping for the home agent.

o  The home agent might not have the EID-RLOC mappings for the mobile node's "care-of" address or its foreign agent in the new visited network, in which case it will need to acquire them.

o  When packets are sent directly to the correspondent node, it may be that no traffic has been sent from the new visited network to the correspondent node's network, and the new visited network's ITR will need to obtain an EID-RLOC mapping for the correspondent node's site.

   In addition, if the IPv4 endpoint is sending packets from the new
   visited network using its original EID, then LISP will need to
   perform a route-returnability check on the new EID-RLOC mapping for
   that EID.

   In IPv6 mobility, packets can flow directly between the mobile node
   and the correspondent node in either direction. The mobile node uses
   its "care-of" address (EID). In this case, the route-returnability
   check would not be needed but one more LISP mapping lookup may be
   required instead:

   o  As above, three mapping changes may be needed for the mobile node
      to communicate with its home agent and to send packets to the
      correspondent node.

   o  In addition, another mapping will be needed in the correspondent
      node's ITR, in order for the correspondent node to send packets to
      the mobile node's "care-of" address (EID) at the new network
      location.

   When both endpoints are mobile, the number of potential mapping
   lookups increases accordingly.

   As a mobile node moves, there are not only mobility state changes in
   the mobile node, correspondent node, and home agent, but also state
   changes in the ITRs and ETRs for at least some EID-prefixes.

   The goal is to support rapid adaptation, with little delay or packet
   loss for the entire system. Also, IP mobility can be modified to
   require fewer mapping changes. In order to increase overall system
   performance, there may be a need to reduce the optimization of one
   area in order to place fewer demands on another.

   In LISP, one possibility is to "glean" information. When a packet
   arrives, the ETR could examine the EID-RLOC mapping and use that
   mapping for all outgoing traffic to that EID. It can do this after
   performing a route-returnability check, to ensure that the new
   network location does have an internal route to that endpoint.
   However, this does not cover the case where an ITR (the node assigned
   the RLOC) at the mobile-node location has been compromised.

   Mobile IP packet exchange is designed for an environment in which all
   routing information is disseminated before packets can be forwarded.
   In order to allow the Internet to grow to support expected future
   use, we are moving to an environment where some information may have
   to be obtained after packets are in flight. Modifications to IP
   mobility should be considered in order to optimize the behavior of
   the overall system. Anything which decreases the number of new
   EID-RLOC mappings needed when a node moves, or maintains the validity
   of an EID-RLOC mapping for a longer time, is useful.

10.4.  Fast Network Mobility

   In addition to endpoints, a network can be mobile, possibly changing
   xTRs. A "network" can be as small as a single router and as large as
   a whole site. This is different from site mobility in that it is
   fast and possibly short-lived, but different from endpoint mobility
   in that a whole prefix is changing RLOCs. However, the mechanisms
   are the same and there is no new overhead in LISP. A map request for
   any endpoint will return a binding for the entire mobile prefix.

   If mobile networks become a more common occurrence, it may be useful
   to revisit the design of the mapping service and allow for dynamic
   updates of the database.

   The issue of interactions between mobility and LISP needs to be
   explored further. Specific improvements to the entire system will
   depend on the details of mapping mechanisms. Mapping mechanisms
   should be evaluated on how well they support session continuity for
   mobile nodes.

10.5.  LISP Mobile Node Mobility

   A mobile device can use the LISP infrastructure to achieve mobility
   by implementing the LISP encapsulation and decapsulation functions
   and acting as a simple ITR/ETR. By doing this, such a "LISP mobile
   node" can use topologically-independent EID IP addresses that are not
   advertised into and do not impose a cost on the global routing
   system. These EIDs are maintained at the edges of the mapping system
   (in LISP Map-Servers and Map-Resolvers) and are provided on demand to
   only the correspondents of the LISP mobile node.

   Refer to the LISP Mobility Architecture specification [LISP-MN] for
   more details.

11.  Multicast Considerations

   A multicast group address, as defined in the original Internet
   architecture, is an identifier of a grouping of topologically
   independent receiver host locations. The address encoding itself
   does not determine the location of the receiver(s). The multicast
   routing protocol, and the network-based state the protocol creates,
   determines where the receivers are located.

   In the context of LISP, a multicast group address is both an EID and
   a Routing Locator. Therefore, no specific semantic or action needs
   to be taken for a destination address, as it would appear in an IP
   header. Thus, a group address that appears in an inner IP
   header built by a source host will be used as the destination EID.
   The outer IP header (the destination Routing Locator address),
   prepended by a LISP router, will use the same group address as the
   destination Routing Locator.

   Having said that, only the source EID and source Routing Locator
   need to be dealt with. Therefore, an ITR merely needs to put its
   own IP address in the source Routing Locator field when prepending
   the outer IP header. This source Routing Locator address, like any
   other Routing Locator address, MUST be globally routable.
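
   The address handling described above can be sketched as follows.
   This is a minimal illustration, not an implementation: the
   OuterHeader type and the function name are hypothetical, only the
   outer addresses and the data-plane UDP port (4341, see Section 13.2)
   are modeled, and the sketch does not verify that the source RLOC is
   globally routable, since that is an operational property.

```python
import ipaddress
from dataclasses import dataclass

LISP_DATA_PORT = 4341  # UDP port for LISP data-plane operation


@dataclass(frozen=True)
class OuterHeader:
    """Hypothetical model of the outer header fields an ITR fills in."""
    src_rloc: str
    dst_rloc: str
    udp_dst_port: int


def multicast_outer_header(inner_dst: str, itr_rloc: str) -> OuterHeader:
    """Build outer addresses for a multicast packet; no mapping lookup."""
    group = ipaddress.ip_address(inner_dst)
    if not group.is_multicast:
        raise ValueError("inner destination is not a multicast group address")

    source = ipaddress.ip_address(itr_rloc)
    if source.is_multicast or source.is_unspecified:
        raise ValueError("source RLOC must be a unicast address of the ITR")
    if source.version != group.version:
        raise ValueError("source RLOC and group must share an address family")

    # The group address is reused unchanged as the destination Routing
    # Locator; the ITR's own address becomes the source Routing Locator.
    return OuterHeader(str(source), str(group), LISP_DATA_PORT)
```

   For example, multicast_outer_header("232.1.1.1", "192.0.2.1") yields
   an outer header whose destination is the same group address
   232.1.1.1 and whose source is the ITR address 192.0.2.1.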

   Therefore, an EID-to-RLOC mapping does not need to be performed by an
   ITR when a received data packet is a multicast data packet or when
   processing a source-specific Join (either by IGMPv3 or PIM). But the
   source Routing Locator is decided by the multicast routing protocol
   in a receiver site. That is, an EID to Routing Locator translation
   is done at control-time.

   Another approach is to have the ITR not encapsulate a multicast
   packet and allow the host-built packet to flow into the core even if
   the source address is allocated out of the EID namespace. If the
   RPF-Vector TLV [RFC5496] is used by PIM in the core, then core
   routers can RPF to the ITR (the Locator address which is injected
   into core routing) rather than the host source address (the EID
   address which is not injected into core routing).

   To avoid any EID-based multicast state in the network core, the first
   approach is chosen for LISP-Multicast. Details for LISP-Multicast
   and Interworking with non-LISP sites are described in the
   specification [MLISP].

12.  Security Considerations

   It is believed that most of the security mechanisms will be part of
   the mapping database service when using control plane procedures for
   obtaining EID-to-RLOC mappings. For data plane triggered mappings,
   as described in this specification, protection is provided against
   ETR spoofing by using Return-Routability mechanisms evidenced by the
   use of a 24-bit Nonce field in the LISP encapsulation header and a
   64-bit Nonce field in the LISP control message. The nonce, coupled
   with the ITR accepting only solicited Map-Replies, goes a long way
   toward providing decent authentication.

   LISP does not rely on a PKI or a more heavyweight authentication
   system. These systems challenge the scalability of LISP which was a
   primary design goal.
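
   The nonce check described in this section, in which an ITR accepts
   only Map-Replies it has solicited, might be sketched as below. The
   class and method names are hypothetical, the table of outstanding
   requests is a plain dictionary, and timeouts and retransmissions are
   deliberately omitted; only the 64-bit nonce width is taken from the
   text above.

```python
import secrets


class SolicitedReplyFilter:
    """Track outstanding Map-Request nonces (illustrative sketch only)."""

    def __init__(self) -> None:
        # nonce -> EID the outstanding Map-Request asked about
        self._pending: dict = {}

    def new_request(self, eid: str) -> int:
        """Record a Map-Request and return its fresh 64-bit nonce."""
        nonce = secrets.randbits(64)
        while nonce in self._pending:  # avoid reusing a live nonce
            nonce = secrets.randbits(64)
        self._pending[nonce] = eid
        return nonce

    def accept_reply(self, nonce: int) -> bool:
        """Accept a Map-Reply only if its nonce matches a pending request.

        The nonce is consumed, so a replayed reply is rejected.
        """
        return self._pending.pop(nonce, None) is not None
```

   An unsolicited Map-Reply, or a replay of one already accepted,
   carries a nonce that is not in the table and is therefore dropped.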

   DoS attack prevention will depend on implementations rate-limiting
   Map-Requests and Map-Replies to the control plane as well as
   rate-limiting the number of data-triggered Map-Replies.

   To deal with map-cache exhaustion attempts in an ITR/PTR, the
   implementation should consider putting a maximum cap on the number of
   entries stored with a reserve list for special or frequently accessed
   sites. This should be a configuration policy control set by the
   network administrator who manages ITRs and PTRs.

13.  IANA Considerations

   This section provides guidance to the Internet Assigned Numbers
   Authority (IANA) regarding registration of values related to the LISP
   specification, in accordance with BCP 26 and RFC 5226 [RFC5226].

   There are two name spaces in LISP that require registration:

   o  LISP IANA registry allocations should not be made for purposes
      unrelated to LISP routing or transport protocols.

   o  The following policies are used here with the meanings defined in
      BCP 26: "Specification Required", "IETF Consensus", "Experimental
      Use", "First Come First Served".

13.1.  LISP Address Type Codes

   Instance ID type codes have a range from 0 to 15, of which 0 and 1
   have been allocated [LCAF]. New Type Codes MUST be allocated
   starting at 2. Type Codes 2 - 10 are to be assigned by IETF Review.
   Type Codes 11 - 15 are available on a First Come First Served policy.

   The following codes have been allocated:

   Type 0: Null Body Type

   Type 1: AFI List Type

   See [LCAF] for details for other possible unapproved address
   encodings. The unapproved LCAF encodings are an area for further
   study and experimentation.

13.2.  LISP UDP Port Numbers

   The IANA registry has allocated UDP port numbers 4341 and 4342 for
   LISP data-plane and control-plane operation, respectively.

14.  References

14.1.  Normative References

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              August 1980.

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
              STD 13, RFC 1034, November 1987.

   [RFC1700]  Reynolds, J. and J. Postel, "Assigned Numbers", RFC 1700,
              October 1994.

   [RFC1918]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and
              E. Lear, "Address Allocation for Private Internets",
              BCP 5, RFC 1918, February 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2404]  Madson, C. and R. Glenn, "The Use of HMAC-SHA-1-96 within
              ESP and AH", RFC 2404, November 1998.

   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
              March 2000.

   [RFC3056]  Carpenter, B. and K. Moore, "Connection of IPv6 Domains
              via IPv4 Clouds", RFC 3056, February 2001.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, September 2001.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              June 2002.

   [RFC3775]  Johnson, D., Perkins, C., and J. Arkko, "Mobility Support
              in IPv6", RFC 3775, June 2004.

   [RFC4086]  Eastlake, D., Schiller, J., and S. Crocker, "Randomness
              Requirements for Security", BCP 106, RFC 4086, June 2005.

   [RFC4632]  Fuller, V. and T. Li, "Classless Inter-domain Routing
              (CIDR): The Internet Address Assignment and Aggregation
              Plan", BCP 122, RFC 4632, August 2006.

   [RFC4634]  Eastlake, D. and T. Hansen, "US Secure Hash Algorithms
              (SHA and HMAC-SHA)", RFC 4634, July 2006.

   [RFC4866]  Arkko, J., Vogt, C., and W. Haddad, "Enhanced Route
              Optimization for Mobile IPv6", RFC 4866, May 2007.

   [RFC4984]  Meyer, D., Zhang, L., and K. Fall, "Report from the IAB
              Workshop on Routing and Addressing", RFC 4984,
              September 2007.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

   [RFC5496]  Wijnands, IJ., Boers, A., and E. Rosen, "The Reverse Path
              Forwarding (RPF) Vector TLV", RFC 5496, March 2009.

   [UDP-TUNNELS]
              Eubanks, M. and P. Chimento, "UDP Checksums for Tunneled
              Packets", draft-eubanks-chimento-6man-00.txt (work in
              progress), February 2009.

14.2.  Informative References

   [AFI]      IANA, "Address Family Indicators (AFIs)", ADDRESS FAMILY
              NUMBERS http://www.iana.org/numbers.html, February 2007.

   [ALT]      Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "LISP
              Alternative Topology (LISP-ALT)",
              draft-ietf-lisp-alt-04.txt (work in progress), April 2010.

   [CHIAPPA]  Chiappa, J., "Endpoints and Endpoint names: A Proposed
              Enhancement to the Internet Architecture",
              Internet-Draft http://www.chiappa.net/~jnc/tech/endpoints.txt,
              1999.

   [CONS]     Farinacci, D., Fuller, V., and D. Meyer, "LISP-CONS: A
              Content distribution Overlay Network Service for LISP",
              draft-meyer-lisp-cons-03.txt (work in progress),
              November 2007.

   [EMACS]    Brim, S., Farinacci, D., Meyer, D., and J. Curran, "EID
              Mappings Multicast Across Cooperating Systems for LISP",
              draft-curran-lisp-emacs-00.txt (work in progress),
              November 2007.

   [INTERWORK]
              Lewis, D., Meyer, D., Farinacci, D., and V. Fuller,
              "Interworking LISP with IPv4 and IPv6",
              draft-ietf-lisp-interworking-01.txt (work in progress),
              March 2010.

   [LCAF]     Farinacci, D., Meyer, D., and J. Snijders, "LISP Canonical
              Address Format", draft-farinacci-lisp-lcaf-02.txt (work in
              progress), October 2010.

   [LISA96]   Lear, E., Katinsky, J., Coffin, J., and D. Tharp,
              "Renumbering: Threat or Menace?", Usenix, September 1996.

   [LISP-MAIN]
              Farinacci, D., Fuller, V., Meyer, D., and D. Lewis,
              "Locator/ID Separation Protocol (LISP)",
              draft-farinacci-lisp-12.txt (work in progress),
              March 2009.

   [LISP-MN]  Farinacci, D., Fuller, V., Lewis, D., and D. Meyer, "LISP
              Mobility Architecture", draft-meyer-lisp-mn-05.txt (work
              in progress), October 2010.

   [LISP-MS]  Farinacci, D. and V. Fuller, "LISP Map Server",
              draft-ietf-lisp-ms-05.txt (work in progress), April 2010.

   [LOC-ID-ARCH]
              Meyer, D. and D. Lewis, "Architectural Implications of
              Locator/ID Separation",
              draft-meyer-loc-id-implications-01.txt (work in progress),
              January 2009.

   [MLISP]    Farinacci, D., Meyer, D., Zwiebel, J., and S. Venaas,
              "LISP for Multicast Environments",
              draft-ietf-lisp-multicast-04.txt (work in progress),
              October 2010.

   [NERD]     Lear, E., "NERD: A Not-so-novel EID to RLOC Database",
              draft-lear-lisp-nerd-08.txt (work in progress),
              March 2010.

   [OPENLISP]
              Iannone, L. and O. Bonaventure, "OpenLISP Implementation
              Report", draft-iannone-openlisp-implementation-01.txt
              (work in progress), July 2008.

   [RADIR]    Narten, T., "Routing and Addressing Problem Statement",
              draft-narten-radir-problem-statement-00.txt (work in
              progress), July 2007.

   [RFC3344bis]
              Perkins, C., "IP Mobility Support for IPv4, revised",
              draft-ietf-mip4-rfc3344bis-05 (work in progress),
              July 2007.

   [RFC4192]  Baker, F., Lear, E., and R. Droms, "Procedures for
              Renumbering an IPv6 Network without a Flag Day", RFC 4192,
              September 2005.

   [RPMD]     Handley, M., Huici, F., and A. Greenhalgh, "RPMD: Protocol
              for Routing Protocol Meta-data Dissemination",
              draft-handley-p2ppush-unpublished-2007726.txt (work in
              progress), July 2007.

   [VERSIONING]
              Iannone, L., Saucez, D., and O. Bonaventure, "LISP Mapping
              Versioning", draft-iannone-lisp-mapping-versioning-02.txt
              (work in progress), July 2010.

Appendix A.  Acknowledgments

   An initial thank you goes to Dave Oran for planting the seeds for the
   initial ideas for LISP. His consultation continues to provide value
   to the LISP authors.

   A special and appreciative thank you goes to Noel Chiappa for
   providing architectural impetus over the past decades on separation
   of location and identity, as well as detailed review of the LISP
   architecture and documents, coupled with enthusiasm for making LISP a
   practical and incremental transition for the Internet.

   The authors would like to gratefully acknowledge many people who have
   contributed discussion and ideas to the making of this proposal.
   They include Scott Brim, Andrew Partan, John Zwiebel, Jason Schiller,
   Lixia Zhang, Dorian Kim, Peter Schoenmaker, Vijay Gill, Geoff Huston,
   David Conrad, Mark Handley, Ron Bonica, Ted Seely, Mark Townsley,
   Chris Morrow, Brian Weis, Dave McGrew, Peter Lothberg, Dave Thaler,
   Eliot Lear, Shane Amante, Ved Kafle, Olivier Bonaventure, Luigi
   Iannone, Robin Whittle, Brian Carpenter, Joel Halpern, Roger
   Jorgensen, Ran Atkinson, Stig Venaas, Iljitsch van Beijnum, Roland
   Bless, Dana Blair, Bill Lynch, Marc Woolward, Damien Saucez, Damian
   Lezama, Attilla De Groot, Parantap Lahiri, David Black, Roque
   Gagliano, Isidor Kouvelas, Jesper Skriver, Fred Templin, Margaret
   Wasserman, Sam Hartman, Michael Hofling, Pedro Marques, Jari Arkko,
   Gregg Schudel, Srinivas Subramanian, Amit Jain, Xu Xiaohu, Dhirendra
   Trivedi, Yakov Rekhter, John Scudder, John Drake, Dimitri
   Papadimitriou, Ross Callon, Selina Heimlich, Job Snijders, and Vina
   Ermagan.

   This work originated in the Routing Research Group (RRG) of the IRTF.
   The individual submission [LISP-MAIN] was converted into this IETF
   LISP working group draft.

Appendix B.  Document Change Log

B.1.  Changes to draft-ietf-lisp-09.txt

   o  Posted October 2010.

   o  Add to IANA Consideration section about the use of LCAF Type
      values that are accepted and maintained by the IANA registry and
      not the LCAF specification.

   o  Indicate that implementations should be able to receive LISP
      control messages when either UDP port is 4342, so they can be
      robust in the face of intervening NAT boxes.

   o  Add paragraph to SMR section to indicate that an ITR does not need
      to respond to an SMR-based Map-Request when it has no map-cache
      entry for the SMR source's EID-prefix.

B.2.  Changes to draft-ietf-lisp-08.txt

   o  Posted August 2010.

   o  In section 6.1.6, remove statement about setting TTL to 0 in
      Map-Register messages.

   o  Clarify language in section 6.1.5 about Map-Replying to
      Data-Probes or Map-Requests.

   o  Indicate that outer TTL should only be copied to inner TTL when it
      is less than inner TTL.

   o  Indicate a source-EID for RLOC-probes is encoded with an AFI
      value of 0.

   o  Indicate that SMRs can have a global or per SMR destination
      rate-limiter.

   o  Add clarifications to the SMR procedures.

   o  Add definitions for "client-side" and "server-side" terms used in
      this specification.

   o  Clear up language in section 6.4, last paragraph.

   o  Change ACT of value 0 to "no-action". This is so we can
      RLOC-probe a PETR and have it return a Map-Reply with a
      locator-set of size 0. The way it is spec'ed the map-cache entry
      has action "dropped". Drop-action is set to 3.

   o  Add statement about normalizing locator weights.

   o  Clarify R-bit definition in the Map-Reply locator record.

   o  Add section on EID Reachability within a LISP site.

   o  Clarify another disadvantage of using anycast locators.

   o  Reworded Abstract.

   o  Change section 2.0 Introduction to remove obsolete information
      such as the LISP variant definitions.

   o  Change section 5 title from "Tunneling Details" to "LISP
      Encapsulation Details".

   o  Changes to section 5 to include results of network deployment
      experience with MTU. Recommend that implementations use either
      the stateful or stateless handling.

   o  Make clarification wordsmithing to Section 7 and 8.

   o  Identify that if there is one locator in the locator-set of a
      map-cache entry, that an SMR from that locator should be responded
      to by sending the SMR-invoked Map-Request to the database mapping
      system rather than to the RLOC itself (which may be unreachable).

   o  When describing Unicast and Multicast Weights indicate that the
      values are relative weights rather than percentages. So it
      doesn't imply the sum of all locator weights in the locator-set
      need to be 100.

   o  Do some wordsmithing on copying TTL and TOS fields.

   o  Numerous wordsmithing changes from Dave Meyer. He fine toothed
      combed the spec.

   o  Removed Section 14 "Prototype Plans and Status". We felt this
      type of section is no longer appropriate for a protocol
      specification.

   o  Add clarification text for the IRC description per Damien's
      commentary.

   o  Remove text on copying nonce from SMR to SMR-invoked Map-Request
      per Vina's comment about a possible DoS vector.

   o  Clarify (S/2 + H) in the stateless MTU section.

   o  Add text to reflect Damien's comment about the description of the
      "ITR-RLOC Address" field in the Map-Request: that the list of RLOC
      addresses are local addresses of the Map-Requester.

B.3.  Changes to draft-ietf-lisp-07.txt

   o  Posted April 2010.

   o  Added I-bit to data header so LSB field can also be used as an
      Instance ID field. When this occurs, the LSB field is reduced to
      8-bits (from 32-bits).

   o  Added V-bit to the data header so the 24-bit nonce field can also
      be used for source and destination version numbers.

   o  Added Map-Version 12-bit value to the EID-record to be used in all
      of Map-Request, Map-Reply, and Map-Register messages.

   o  Added multiple ITR-RLOC fields to the Map-Request packet so an ETR
      can decide what address to select for the destination of a
      Map-Reply.

   o  Added L-bit (Local RLOC bit) and P-bit (Probe-Reply RLOC bit) to
      the Locator-Set record of an EID-record for a Map-Reply message.
      The L-bit indicates which RLOCs in the locator-set are local to
      the sender of the message. The P-bit indicates which RLOC is the
      source of a RLOC-probe Reply (Map-Reply) message.

   o  Add reference to the LISP Canonical Address Format [LCAF] draft.

   o  Made editorial and clarification changes based on comments from
      Dhirendra Trivedi.

   o  Added wordsmithing comments from Joel Halpern on DF=1 setting.

   o  Add John Zwiebel clarification to Echo Nonce Algorithm section
      6.3.1.

   o  Add John Zwiebel comment about expanding on proxy-map-reply bit
      for Map-Register messages.

   o  Add NAT section per Ron Bonica comments.

   o  Fix IDnits issues per Ron Bonica.

   o  Added section on Virtualization and Segmentation to explain the
      use of the Instance ID field in the data header.

   o  There are too many P-bits, keep their scope to the packet format
      description and refer to them by name everywhere else in the
      spec.

   o  Scanned all occurrences of "should", "should not", "must" and
      "must not" and uppercased them.

   o  John Zwiebel offered text for section 4.1 to modernize the
      example. Thanks Z!

   o  Make it more clear in the definition of "EID-to-RLOC Database"
      that all ETRs need to have the same database mapping. This
      reflects a comment from John Scudder.

   o  Add a definition "Route-returnability" to the Definition of Terms
      section.

   o  In section 9.2, add text to describe what the signature of
      traceroute packets can look like.

   o  Removed references to Data Probe for introductory example.
      Data-probes are still part of the LISP design but not encouraged.

   o  Added the definition for "LISP site" to the Definition of Terms
      section.

B.4.  Changes to draft-ietf-lisp-06.txt

   Editorial based changes:

   o  Posted December 2009.

   o  Fix typo for flags in LISP data header. Changed from "4" to "5".

   o  Add text to indicate that Map-Register messages must contain a
      computed UDP checksum.

   o  Add definitions for PITR and PETR.

   o  Indicate an AFI value of 0 is an unspecified address.

   o  Indicate that the TTL field of a Map-Register is not used and set
      to 0 by the sender. This change makes this spec consistent with
      [LISP-MS].

   o  Change "... yield a packet size of L bytes" to "... yield a packet
      size greater than L bytes".

   o  Clarify section 6.1.5 on what addresses and ports are used in
      Map-Reply messages.

   o  Clarify that LSBs that go beyond the number of locators do not
      need to be SMRed when the locator addresses are greater
      lexicographically than the locator in the existing locator-set.

   o  Add Gregg, Srini, and Amit to acknowledgment section.

   o  Clarify in the definition of a LISP header what is following the
      UDP header.

   o  Clarify "verifying Map-Request" text in section 6.1.3.

   o  Add Xu Xiaohu to the acknowledgment section for introducing the
      problem of overlapping EID-prefixes among multiple sites in an RRG
      email message.

   Design based changes:

   o  Use stronger language to have the outer IPv4 header set DF=1 so we
      can avoid fragment reassembly in an ETR or PETR. This will also
      make IPv4 and IPv6 encapsulation have consistent behavior.

   o  Map-Requests should not be sent in ECM when the Probe bit is set.
      These types of Map-Requests are used as RLOC-probes and are sent
      directly to locator addresses in the underlying network.

   o  Add text in section 6.1.5 about returning all EID-prefixes in a
      Map-Reply sent by an ETR when there are overlapping EID-prefixes
      configured.

   o  Add text in a new subsection of section 6.1.5 about dealing with
      Map-Replies with coarse EID-prefixes.

B.5.  Changes to draft-ietf-lisp-05.txt

   o  Posted September 2009.

   o  Added this Document Change Log appendix.

   o  Added section indicating that encapsulated Map-Requests must use
      destination UDP port 4342.

   o  Don't use AH in Map-Registers. Put key-id, auth-length, and
      auth-data in Map-Register payload.

   o  Added Jari to acknowledgment section.

   o  State the source-EID is set to 0 when using Map-Requests to
      refresh or RLOC-probe.

   o  Make more clear what source-RLOC should be for a Map-Request.

   o  The LISP-CONS authors thought that the Type definitions for CONS
      should be removed from this specification.

   o  Removed nonce from Map-Register message, it wasn't used so no need
      for it.

   o  Clarify what to do for unspecified Action bits for negative
      Map-Replies. Since No Action is a drop, make value 0 Drop.

B.6.  Changes to draft-ietf-lisp-04.txt

   o  Posted September 2009.

   o  How to deal with record count greater than 1 for a Map-Request.
      Damien and Joel comment. Joel suggests: 1) Specify that senders
      compliant with the current document will always set the count to
      1, and note that the count is included for future extensibility.
      2) Specify what a receiver compliant with the draft should do if
      it receives a request with a count greater than 1. Presumably, it
      should send some error back?

   o  Add Fred Templin in acknowledgment section.

   o  Add Margaret and Sam to the acknowledgment section for their great
      comments.

   o  Say more about LAGs in the UDP section per Sam Hartman's comment.

   o  Sam wants to use MAY instead of SHOULD for ignoring checksums on
      ETR. From the mailing list: "You'd need to word it as an ITR MAY
      send a zero checksum, an ETR MUST accept a 0 checksum and MAY
      ignore the checksum completely. And of course we'd need to
      confirm that can actually be implemented. In particular, hardware
      that verifies UDP checksums on receive needs to be checked to make
      sure it permits 0 checksums."

   o  Margaret wants a reference to
      http://www.ietf.org/id/draft-eubanks-chimento-6man-00.txt.

   o  Fix description in Map-Request section. Where we describe
      Map-Reply Record, change "R-bit" to "M-bit".

   o  Add the mobility bit to Map-Replies. So PTRs don't probe so often
      for MNs but often enough to get mapping updates.

   o  Indicate SHA1 can be used as well for Map-Registers.

   o  More Fred comments on MTU handling.

   o  Isidor comment about spec'ing better periodic Map-Registers. Will
      be fixed in draft-ietf-lisp-ms-02.txt.

   o  Margaret's comment on gleaning: "The current specification does
      not make it clear how long gleaned map entries should be retained
      in the cache, nor does it make it clear how/when they will be
      validated. The LISP spec should, at the very least, include a
      (short) default lifetime for gleaned entries, require that they be
      validated within a short period of time, and state that a new
      gleaned entry should never overwrite an entry that was obtained
      from the mapping system. The security implications of storing
      "gleaned" entries should also be explored in detail."

   o  Add section on RLOC-probing per working group feedback.

   o  Change "loc-reach-bits" to "loc-status-bits" per comment from
      Noel.

   o  Remove SMR-bit from data-plane. Dino prefers to have it in the
      control plane only.

   o  Change LISP header to allow a "Research Bit" so the Nonce and LSB
      fields can be turned off and used for another future purpose. For
      Luigi et al versioning convergence.

   o  Add a N-bit to the data header suggested by Noel. Then the nonce
      field could be used when N is not 1.

   o  Clarify that when E-bit is 0, the nonce field can be an echoed
      nonce or a random nonce. Comment from Jesper.

   o  Indicate when doing data-gleaning that a verifying Map-Request is
      sent to the source-EID of the gleaned data packet so we can avoid
      map-cache corruption by a 3rd party. Comment from Pedro.

   o  Indicate that a verifying Map-Request, for accepting mapping data,
      should be sent over the ALT (or to the EID).

   o  Reference IPsec RFC 4302. Comment from Sam and Brian Weis.

   o  Put E-bit in Map-Reply to tell ITRs that the ETR supports
      echo-noncing. Comment by Pedro and Dino.

   o  Jesper made a comment to loosen the language about requiring the
      copy of inner TTL to outer TTL since the text to get mixed-AF
      traceroute to work would violate the "MUST" clause. Changed from
      MUST to SHOULD in section 5.3.

B.7.  Changes to draft-ietf-lisp-03.txt

   o  Posted July 2009.

   o  Removed loc-reach-bits longword from control packets per Damien
      comment.

   o  Clarifications in MTU text from Roque.

   o  Added text to indicate that the locator-set be sorted by locator
      address from Isidor.

   o  Clarification text from John Zwiebel in Echo-Nonce section.

B.8.  Changes to draft-ietf-lisp-02.txt

   o  Posted July 2009.

   o  Encapsulation packet format change to add E-bit and make
      loc-reach-bits 32-bits in length.

   o  Added Echo-Nonce Algorithm section.

   o  Clarification how ECN bits are copied.

   o  Moved S-bit in Map-Request.

   o  Added P-bit in Map-Request and Map-Reply messages to anticipate
      RLOC-Probe Algorithm.

   o  Added to Mobility section to reference [LISP-MN].

B.9.  Changes to draft-ietf-lisp-01.txt

   o  Posted 2 days after draft-ietf-lisp-00.txt in May 2009.

   o  Defined LEID to be a "LISP EID".

   o  Indicate encapsulation uses IPv4 DF=0.

   o  Added negative Map-Reply messages with drop, native-forward, and
      send-map-request actions.

   o  Added Proxy-Map-Reply bit to Map-Register.

B.10.  Changes to draft-ietf-lisp-00.txt

   o  Posted May 2009.

   o  Rename of draft-farinacci-lisp-12.txt.

   o  Acknowledgment to RRG.

Authors' Addresses

   Dino Farinacci
   cisco Systems
   Tasman Drive
   San Jose, CA 95134
   USA

   Email: dino@cisco.com

   Vince Fuller
   cisco Systems
   Tasman Drive
   San Jose, CA 95134
   USA

   Email: vaf@cisco.com

   Dave Meyer
   cisco Systems
   170 Tasman Drive
   San Jose, CA
   USA

   Email: dmm@cisco.com

   Darrel Lewis
   cisco Systems
   170 Tasman Drive
   San Jose, CA
   USA

   Email: darlewis@cisco.com