idnits 2.17.1 

draft-farinacci-lisp-11.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  == There are 5 instances of lines with non-RFC2606-compliant FQDNs in the
     document.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     UDP Checksum:  this field field MUST be transmitted as 0 and
     ignored on receipt by the ETR.  Note, even when the UDP checksum is
     transmitted as 0 an intervening NAT device can recalculate the checksum
     and rewrite the UDP checksum field to non-zero.  For performance reasons,
     the ETR MUST ignore the checksum and MUST not do a checksum computation.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (December 19, 2008) is 5600 days in the past.  Is this
     intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226)

  ** Obsolete normative reference: RFC 3775 (Obsoleted by RFC 6275)

  ** Obsolete normative reference: RFC 4423 (Obsoleted by RFC 9063)

  == Outdated reference: A later version (-05) exists of
     draft-fuller-lisp-alt-03

  == Outdated reference: A later version (-01) exists of draft-jen-apt-00

  == Outdated reference: A later version (-04) exists of
     draft-meyer-lisp-cons-03

  == Outdated reference: A later version (-02) exists of
     draft-lewis-lisp-interworking-01

  -- No information found for draft-mathy-lisp-dht - is the name correct?

  == Outdated reference: A later version (-01) exists of
     draft-meyer-loc-id-implications-00

  == Outdated reference: A later version (-09) exists of
     draft-lear-lisp-nerd-02

  == Outdated reference: A later version (-05) exists of
     draft-narten-radir-problem-statement-00

  == Outdated reference: A later version (-10) exists of
     draft-ietf-mip4-rfc3344bis-05

  == Outdated reference: A later version (-08) exists of
     draft-ietf-pim-rpf-vector-03

  -- No information found for draft-handley-p2ppush-unpublished-2007726 - is
     the name correct?

  == Outdated reference: A later version (-12) exists of
     draft-ietf-shim6-proto-06


     Summary: 4 errors (**), 0 flaws (~~), 13 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                       D. Farinacci
3	Internet-Draft                                                 V. Fuller
4	Intended status: Experimental                                    D. Oran
5	Expires: June 22, 2009                                          D. Meyer
6	                                                                 S. Brim
7	                                                           cisco Systems
8	                                                       December 19, 2008

10	                 Locator/ID Separation Protocol (LISP)
11	                      draft-farinacci-lisp-11.txt

13	Status of this Memo

15	   This Internet-Draft is submitted to IETF in full conformance with the
16	   provisions of BCP 78 and BCP 79.

18	   Internet-Drafts are working documents of the Internet Engineering
19	   Task Force (IETF), its areas, and its working groups.  Note that
20	   other groups may also distribute working documents as Internet-
21	   Drafts.

23	   Internet-Drafts are draft documents valid for a maximum of six months
24	   and may be updated, replaced, or obsoleted by other documents at any
25	   time.  It is inappropriate to use Internet-Drafts as reference
26	   material or to cite them other than as "work in progress."

28	   The list of current Internet-Drafts can be accessed at
29	   http://www.ietf.org/ietf/1id-abstracts.txt.

31	   The list of Internet-Draft Shadow Directories can be accessed at
32	   http://www.ietf.org/shadow.html.

34	   This Internet-Draft will expire on June 22, 2009.

36	Copyright Notice

38	   Copyright (c) 2008 IETF Trust and the persons identified as the
39	   document authors.  All rights reserved.

41	   This document is subject to BCP 78 and the IETF Trust's Legal
42	   Provisions Relating to IETF Documents
43	   (http://trustee.ietf.org/license-info) in effect on the date of
44	   publication of this document.  Please review these documents
45	   carefully, as they describe your rights and restrictions with respect
46	   to this document.

48	Abstract

50	   This draft describes a simple, incremental, network-based protocol to
51	   implement separation of Internet addresses into Endpoint Identifiers
52	   (EIDs) and Routing Locators (RLOCs).  This mechanism requires no
53	   changes to host stacks and no major changes to existing database
54	   infrastructures.  The proposed protocol can be implemented in a
55	   relatively small number of routers.

57	   This proposal was stimulated by the problem statement effort at the
58	   Amsterdam IAB Routing and Addressing Workshop (RAWS), which took
59	   place in October 2006.

61	Table of Contents

63	   1.  Requirements Notation  . . . . . . . . . . . . . . . . . . . .  4
64	   2.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  5
65	   3.  Definition of Terms  . . . . . . . . . . . . . . . . . . . . .  8
66	   4.  Basic Overview . . . . . . . . . . . . . . . . . . . . . . . . 12
67	     4.1.  Packet Flow Sequence . . . . . . . . . . . . . . . . . . . 14
68	   5.  Tunneling Details  . . . . . . . . . . . . . . . . . . . . . . 16
69	     5.1.  LISP IPv4-in-IPv4 Header Format  . . . . . . . . . . . . . 17
70	     5.2.  LISP IPv6-in-IPv6 Header Format  . . . . . . . . . . . . . 18
71	     5.3.  Tunnel Header Field Descriptions . . . . . . . . . . . . . 19
72	     5.4.  Dealing with Large Encapsulated Packets  . . . . . . . . . 20
73	       5.4.1.  A Stateless Solution to MTU Handling . . . . . . . . . 21
74	       5.4.2.  A Stateful Solution to MTU Handling  . . . . . . . . . 21
75	   6.  EID-to-RLOC Mapping  . . . . . . . . . . . . . . . . . . . . . 23
76	     6.1.  LISP IPv4 and IPv6 Control Plane Packet Formats  . . . . . 23
77	       6.1.1.  LISP Packet Type Allocations . . . . . . . . . . . . . 25
78	       6.1.2.  Map-Request Message Format . . . . . . . . . . . . . . 25
79	       6.1.3.  EID-to-RLOC UDP Map-Request Message  . . . . . . . . . 27
80	       6.1.4.  Map-Reply Message Format . . . . . . . . . . . . . . . 28
81	       6.1.5.  EID-to-RLOC UDP Map-Reply Message  . . . . . . . . . . 30
82	     6.2.  Routing Locator Selection  . . . . . . . . . . . . . . . . 31
83	     6.3.  Routing Locator Reachability . . . . . . . . . . . . . . . 32
84	     6.4.  Routing Locator Hashing  . . . . . . . . . . . . . . . . . 34
85	     6.5.  Changing the Contents of EID-to-RLOC Mappings  . . . . . . 35
86	       6.5.1.  Clock Sweep  . . . . . . . . . . . . . . . . . . . . . 36
87	       6.5.2.  Solicit-Map-Request (SMR)  . . . . . . . . . . . . . . 37
88	   7.  Router Performance Considerations  . . . . . . . . . . . . . . 39
89	   8.  Deployment Scenarios . . . . . . . . . . . . . . . . . . . . . 40
90	     8.1.  First-hop/Last-hop Tunnel Routers  . . . . . . . . . . . . 41
91	     8.2.  Border/Edge Tunnel Routers . . . . . . . . . . . . . . . . 41
92	     8.3.  ISP Provider-Edge (PE) Tunnel Routers  . . . . . . . . . . 42
93	   9.  Traceroute Considerations  . . . . . . . . . . . . . . . . . . 43
94	     9.1.  IPv6 Traceroute  . . . . . . . . . . . . . . . . . . . . . 44
95	     9.2.  IPv4 Traceroute  . . . . . . . . . . . . . . . . . . . . . 44
96	     9.3.  Traceroute using Mixed Locators  . . . . . . . . . . . . . 44
97	   10. Mobility Considerations  . . . . . . . . . . . . . . . . . . . 46
98	     10.1. Site Mobility  . . . . . . . . . . . . . . . . . . . . . . 46
99	     10.2. Slow Endpoint Mobility . . . . . . . . . . . . . . . . . . 46
100	     10.3. Fast Endpoint Mobility . . . . . . . . . . . . . . . . . . 46
101	     10.4. Fast Network Mobility  . . . . . . . . . . . . . . . . . . 48
102	   11. Multicast Considerations . . . . . . . . . . . . . . . . . . . 49
103	   12. Security Considerations  . . . . . . . . . . . . . . . . . . . 50
104	   13. Prototype Plans and Status . . . . . . . . . . . . . . . . . . 51
105	   14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 53
106	     14.1. Normative References . . . . . . . . . . . . . . . . . . . 53
107	     14.2. Informative References . . . . . . . . . . . . . . . . . . 53
108	   Appendix A.  Acknowledgments . . . . . . . . . . . . . . . . . . . 56
109	   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 57

111	1.  Requirements Notation

113	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
114	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
115	   document are to be interpreted as described in [RFC2119].

117	2.  Introduction

119	   Many years of discussion about the current IP routing and addressing
120	   architecture have noted that its use of a single numbering space (the
121	   "IP address") for both host transport session identification and
122	   network routing creates scaling issues (see [CHIAPPA] and [RFC1498]).
123	   A number of scaling benefits would be realized by separating the
124	   current IP address into separate spaces for Endpoint Identifiers
125	   (EIDs) and Routing Locators (RLOCs); among them are:

127	   1.  Reduction of routing table size in the "default-free zone" (DFZ).
128	       Use of a separate numbering space for RLOCs will allow them to be
129	       assigned topologically (in today's Internet, RLOCs would be
130	       assigned by providers at client network attachment points),
131	       greatly improving aggregation and reducing the number of
132	       globally-visible, routable prefixes.

134	   2.  More cost-effective multihoming for sites that connect to
135	       different service providers where they can control their own
136	       policies for packet flow into the site without using extra
137	       routing table resources of core routers.

139	   3.  Easing of renumbering burden when clients change providers.
140	       Because host EIDs are numbered from a separate, non-provider-
141	       assigned and non-topologically-bound space, they do not need to
142	       be renumbered when a client site changes its attachment points to
143	       the network.

145	   4.  Traffic engineering capabilities that can be performed by network
146	       elements and do not depend on injecting additional state into the
147	       routing system.  This will fall out of the mechanism that is used
148	       to implement the EID/RLOC split (see Section 4).

150	   5.  Mobility without address changing.  Existing mobility mechanisms
151	       will be able to work in a locator/ID separation scenario.  It
152	       will be possible for a host (or a collection of hosts) to move to
153	       a different point in the network topology either retaining its
154	       home-based address or acquiring a new address based on the new
155	       network location.  A new network location could be a physically
156	       different point in the network topology or the same physical
157	       point of the topology with a different provider.

159	   This draft describes protocol mechanisms to achieve the desired
160	   functional separation.  For flexibility, the mechanism used for
161	   forwarding packets is decoupled from that used to determine EID to
162	   RLOC mappings.  This document covers the former.  For the latter, see
163	   [CONS], [ALT], [RPMD], and [NERD].  This work is in response to and
164	   intended to address the problem statement that came out of the RAWS
165	   effort [RFC4984].

167	   The Routing and Addressing problem statement can be found in [RADIR].

169	   This draft focuses on a router-based solution.  Building the solution
170	   into the network will facilitate incremental deployment of the
171	   technology on the Internet.  Note that while the detailed protocol
172	   specification and examples in this document assume IP version 4
173	   (IPv4), there is nothing in the design that precludes use of the same
174	   techniques and mechanisms for IPv6.  It should be possible for IPv4
175	   packets to use IPv6 RLOCs and for IPv6 EIDs to be mapped to IPv4
176	   RLOCs.

178	   Related work on host-based solutions is described in Shim6 [SHIM6]
179	   and HIP [RFC4423].  Related work on a router-based solution is
180	   described in [GSE].  This draft attempts to not compete or overlap
181	   with such solutions and the proposed protocol changes are expected to
182	   complement a host-based mechanism when Traffic Engineering
183	   functionality is desired.

185	   Some of the design goals of this proposal include:

187	   1.  Require no hardware or software changes to end-systems (hosts).

189	   2.  Minimize required changes to Internet infrastructure.

191	   3.  Be incrementally deployable.

193	   4.  Require no router hardware changes.

195	   5.  Minimize the number of routers which have to be modified.  In
196	       particular, most customer site routers and no core routers
197	       require changes.

199	   6.  Minimize router software changes in those routers which are
200	       affected.

202	   7.  Avoid or minimize packet loss when EID-to-RLOC mappings need to
203	       be performed.

205	   There are 4 variants of LISP, which differ along a spectrum of strong
206	   to weak dependence on the topological nature and possible need for
207	   routability of EIDs.  The variants are:

209	   LISP 1:  uses EIDs that are routable through the RLOC topology for
210	      bootstrapping EID-to-RLOC mappings.  [LISP1] This was intended as
211	      a prototyping mechanism for early protocol implementation.  It is
212	      now deprecated and should not be deployed.

214	   LISP 1.5:  uses EIDs that are routable for bootstrapping EID-to-RLOC
215	      mappings; such routing is via a separate topology.

217	   LISP 2:  uses EIDs that are not routable and EID-to-RLOC mappings are
218	      implemented within the DNS.  [LISP2]

220	   LISP 3:  uses non-routable EIDs that are used as lookup keys for a
221	      new EID-to-RLOC mapping database.  Use of Distributed Hash Tables
222	      [DHTs] [LISPDHT] to implement such a database would be an area to
223	      explore.  Other examples of new mapping database services are
224	      [CONS], [ALT], [RPMD], [NERD], and [APT].

226	   This document covers the LISP 1.5 and LISP 3 variants, both of which
227	   rely on a router-based distributed cache and database for EID-to-RLOC
228	   mappings.  The LISP 1.0 mechanism works but does not allow reduction
229	   of routing information in the default-free-zone of the Internet.  The
230	   LISP 2 mechanisms are put on hold and may never come to fruition
231	   since it is not architecturally pure to have routing depend on
232	   directory and directory depend on routing.  The LISP 3 mechanisms
233	   will be documented elsewhere but may use the control-plane options
234	   specified in this specification.

236	3.  Definition of Terms

238	   Provider Independent (PI) Addresses:   an address block assigned from
239	      a pool where blocks are not associated with any particular
240	      location in the network (e.g. from a particular service provider),
241	      and is therefore not topologically aggregatable in the routing
242	      system.

244	   Provider Assigned (PA) Addresses:   a block of IP addresses that are
245	      assigned to a site by each service provider to which a site
246	      connects.  Typically, each block is a sub-block of a service
247	      provider CIDR block and is aggregated into the larger block before
248	      being advertised into the global Internet.  Traditionally, IP
249	      multihoming has been implemented by each multi-homed site
250	      acquiring its own, globally-visible prefix.  LISP uses only
251	      topologically-assigned and aggregatable address blocks for RLOCs,
252	      eliminating this demonstrably non-scalable practice.

254	   Routing Locator (RLOC):   the IPv4 or IPv6 address of an egress
255	      tunnel router (ETR).  It is the output of an EID-to-RLOC mapping
256	      lookup.  An EID maps to one or more RLOCs.  Typically, RLOCs are
257	      numbered from topologically-aggregatable blocks that are assigned
258	      to a site at each point to which it attaches to the global
259	      Internet; where the topology is defined by the connectivity of
260	      provider networks, RLOCs can be thought of as PA addresses.
261	      Multiple RLOCs can be assigned to the same ETR device or to
262	      multiple ETR devices at a site.

264	   Endpoint ID (EID):   a 32-bit (for IPv4) or 128-bit (for IPv6) value
265	      used in the source and destination address fields of the first
266	      (most inner) LISP header of a packet.  The host obtains a
267	      destination EID the same way it obtains a destination address
268	      today, for example through a DNS lookup or SIP exchange.  The
269	      source EID is obtained via existing mechanisms used to set a
270	      host's "local" IP address.  An EID is allocated to a host from an
271	      EID-prefix block associated with the site where the host is
272	      located.  An EID can be used by a host to refer to other hosts.
273	      EIDs MUST NOT be used as LISP RLOCs.  Note that EID blocks may be
274	      assigned in a hierarchical manner, independent of the network
275	      topology, to facilitate scaling of the mapping database.  In
276	      addition, an EID block assigned to a site may have site-local
277	      structure (subnetting) for routing within the site; this structure
278	      is not visible to the global routing system.

280	   EID-prefix:   A power-of-2 block of EIDs which are allocated to a
281	      site by an address allocation authority.  EID-prefixes are
282	      associated with a set of RLOC addresses which make up a "database
283	      mapping".  EID-prefix allocations can be broken up into smaller
284	      blocks when an RLOC set is to be associated with the smaller EID-
285	      prefix.  A globally routed address block (whether PI or PA) is not
286	      an EID-prefix.  However, a globally routed address block may be
287	      removed from global routing and reused as an EID-prefix.  A site
288	      that receives an explicitly allocated EID-prefix may not use that
289	      EID-prefix as a globally routed prefix assigned to RLOCs.

291	   End-system:   is an IPv4 or IPv6 device that originates packets with
292	      a single IPv4 or IPv6 header.  The end-system supplies an EID
293	      value for the destination address field of the IP header when
294	      communicating globally (i.e. outside of its routing domain).  An
295	      end-system can be a host computer, a switch or router device, or
296	      any network appliance.

298	   Ingress Tunnel Router (ITR):   a router which accepts an IP packet
299	      with a single IP header (more precisely, an IP packet that does
300	      not contain a LISP header).  The router treats this "inner" IP
301	      destination address as an EID and performs an EID-to-RLOC mapping
302	      lookup.  The router then prepends an "outer" IP header with one of
303	      its globally-routable RLOCs in the source address field and the
304	      result of the mapping lookup in the destination address field.
305	      Note that this destination RLOC may be an intermediate, proxy
306	      device that has better knowledge of the EID-to-RLOC mapping closer
307	      to the destination EID.  In general, an ITR receives IP packets
308	      from site end-systems on one side and sends LISP-encapsulated IP
309	      packets toward the Internet on the other side.

311	      Specifically, when a service provider prepends a LISP header for
312	      Traffic Engineering purposes, the router that does this is also
313	      regarded as an ITR.  The outer RLOC the ISP ITR uses can be based
314	      on the outer destination address (the originating ITR's supplied
315	      RLOC) or the inner destination address (the originating host's
316	      supplied EID).

318	   TE-ITR:   is an ITR that is deployed in a service provider network
319	      that prepends an additional LISP header for Traffic Engineering
320	      purposes.

322	   Egress Tunnel Router (ETR):   a router that accepts an IP packet
323	      where the destination address in the "outer" IP header is one of
324	      its own RLOCs.  The router strips the "outer" header and forwards
325	      the packet based on the next IP header found.  In general, an ETR
326	      receives LISP-encapsulated IP packets from the Internet on one
327	      side and sends decapsulated IP packets to site end-systems on the
328	      other side.  ETR functionality does not have to be limited to a
329	      router device.  A server host can be the endpoint of a LISP tunnel
330	      as well.

332	   TE-ETR:   is an ETR that is deployed in a service provider network
333	      that strips an outer LISP header for Traffic Engineering purposes.

335	   xTR:   is a reference to an ITR or ETR when direction of data flow is
336	      not part of the context description. xTR refers to the router that
337	      is the tunnel endpoint.  Used synonymously with the term "Tunnel
338	      Router".  For example, "An xTR can be located at the Customer Edge
339	      (CE) router", meaning both ITR and ETR functionality is at the CE
340	      router.

342	   EID-to-RLOC Cache:   a short-lived, on-demand table in an ITR that
343	      stores, tracks, and is responsible for timing-out and otherwise
344	      validating EID-to-RLOC mappings.  This cache is distinct from the
345	      full "database" of EID-to-RLOC mappings; it is dynamic, local to
346	      the ITR(s), and relatively small, while the database is
347	      distributed, relatively static, and much more global in scope.

349	   EID-to-RLOC Database:   a global distributed database that contains
350	      all known EID-prefix to RLOC mappings.  Each potential ETR
351	      typically contains a small piece of the database: the EID-to-RLOC
352	      mappings for the EID prefixes "behind" the router.  These map to
353	      one of the router's own, globally-visible, IP addresses.

355	   Recursive Tunneling:   when a packet has more than one LISP IP
356	      header.  Additional layers of tunneling may be employed to
357	      implement traffic engineering or other re-routing as needed.  When
358	      this is done, an additional "outer" LISP header is added and the
359	      original RLOCs are preserved in the "inner" header.  Any
360	      references to tunnels in this specification refer to dynamic
361	      encapsulating tunnels; they are never statically configured.

363	   Reencapsulating Tunnels:   when a packet has no more than one LISP IP
364	      header (two IP headers total) and when it needs to be diverted to
365	      a new RLOC, an ETR can decapsulate the packet (remove the LISP
366	      header) and prepend a new tunnel header, with a new RLOC, onto the
367	      packet.  Doing this allows a packet to be re-routed by the re-
368	      encapsulating router without adding the overhead of additional
369	      tunnel headers.  Any references to tunnels in this specification
370	      refer to dynamic encapsulating tunnels; they are never
371	      statically configured.

373	   LISP Header:   a term used in this document to refer to the outer
374	      IPv4 or IPv6 header, a UDP header, and a LISP header that an ITR
375	      prepends or an ETR strips.

377	   Address Family Indicator (AFI):   a term used to describe an address
378	      encoding in a packet.  An address family currently pertains to an
379	      IPv4 or IPv6 address.  See [AFI] for details.

381	   Negative Mapping Entry:   also known as a negative cache entry, is an
382	      EID-to-RLOC entry where an EID-prefix is advertised or stored with
383	      no RLOCs.  That is, the locator-set for the EID-to-RLOC entry is
384	      empty or has an encoded locator count of 0.  This type of entry
385	      could be used to describe a prefix from a non-LISP site, which is
386	      explicitly not in the mapping database.

388	   Data Probe:   a LISP-encapsulated data packet where the inner header
389	      destination address equals the outer header destination address
390	      used to trigger a Map-Reply by a decapsulating ETR.  In addition,
391	      the original packet is decapsulated and delivered to the
392	      destination host.  A Data Probe is used in some of the mapping
393	      database designs to "probe" or request a Map-Reply from an ETR; in
394	      other cases, Map-Requests are used.  See each mapping database
395	      design for details.
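
   The EID-to-RLOC Cache and Negative Mapping Entry terms defined above
   can be illustrated with a small sketch.  It is not part of this
   specification.  The class names (CacheEntry, EidToRlocCache), the
   per-entry TTL, and the use of longest-prefix matching are assumptions
   made for illustration only, and the addresses are documentation and
   benchmarking prefixes rather than real allocations.

```python
import ipaddress
import time
from dataclasses import dataclass


@dataclass
class CacheEntry:
    eid_prefix: object   # ipaddress.IPv4Network or IPv6Network
    rlocs: list          # empty locator-set => negative mapping entry
    expires_at: float    # absolute expiry time (timeout policy is assumed)

    @property
    def is_negative(self):
        return len(self.rlocs) == 0


class EidToRlocCache:
    """Toy on-demand cache kept by an ITR (hypothetical structure)."""

    def __init__(self):
        self._entries = {}

    def add(self, prefix, rlocs, ttl, now=None):
        now = time.monotonic() if now is None else now
        net = ipaddress.ip_network(prefix)
        self._entries[net] = CacheEntry(net, list(rlocs), now + ttl)

    def lookup(self, eid, now=None):
        """Return the longest-matching unexpired entry, or None on a miss."""
        now = time.monotonic() if now is None else now
        addr = ipaddress.ip_address(eid)
        best = None
        for net, entry in list(self._entries.items()):
            if entry.expires_at <= now:
                del self._entries[net]   # time out stale mappings
                continue
            if addr in net and (
                best is None or net.prefixlen > best.eid_prefix.prefixlen
            ):
                best = entry
        return best


cache = EidToRlocCache()
cache.add("192.0.2.0/24", ["198.51.100.1", "198.51.100.2"], ttl=60, now=0)
cache.add("192.0.2.128/25", ["203.0.113.9"], ttl=60, now=0)
cache.add("198.18.0.0/15", [], ttl=60, now=0)  # negative entry, no RLOCs
```

   In this sketch a lookup that returns None is a cache miss and would
   trigger a mapping request, while a returned entry whose locator-set
   is empty tells the ITR that the prefix is explicitly not in the
   mapping database.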

397	4.  Basic Overview

399	   One key concept of LISP is that end-systems (hosts) operate the same
400	   way they do today.  The IP addresses that hosts use for tracking
401	   sockets, connections, and for sending and receiving packets do not
402	   change.  In LISP terminology, these IP addresses are called Endpoint
403	   Identifiers (EIDs).

405	   Routers continue to forward packets based on IP destination
406	   addresses.  When a packet is LISP encapsulated, these addresses are
407	   referred to as Routing Locators (RLOCs).  Most routers along a path
408	   between two hosts will not change; they continue to perform routing/
409	   forwarding lookups on the destination addresses.  For routers between
410	   the source host and the ITR as well as routers from the ETR to the
411	   destination host, the destination address is an EID.  For the routers
412	   between the ITR and the ETR, the destination address is an RLOC.

414	   This design introduces "Tunnel Routers", which prepend LISP headers
415	   on host-originated packets and strip them prior to final delivery to
416	   their destination.  The IP addresses in this "outer header" are
417	   RLOCs.  During end-to-end packet exchange between two Internet hosts,
418	   an ITR prepends a new LISP header to each packet and an egress tunnel
419	   router strips the new header.  The ITR performs EID-to-RLOC lookups
420	   to determine the routing path to the ETR, which has the RLOC as
421	   one of its IP addresses.
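
   The prepend and strip behavior described above can be modeled in a
   few lines.  This is a simplified sketch under stated assumptions:
   the IPHeader and Packet types and the function names are
   hypothetical, the mapping is a plain exact-match dictionary, and the
   UDP and LISP portions of the prepended header are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IPHeader:
    src: str
    dst: str


@dataclass(frozen=True)
class Packet:
    inner: IPHeader                    # EIDs, as written by the host
    outer: Optional[IPHeader] = None   # RLOCs, present only ITR -> ETR
    payload: bytes = b""


def itr_encapsulate(pkt, itr_rloc, mapping):
    """Prepend an outer header; `mapping` is an assumed EID -> RLOC dict."""
    dst_rloc = mapping[pkt.inner.dst]
    return Packet(pkt.inner, IPHeader(itr_rloc, dst_rloc), pkt.payload)


def etr_decapsulate(pkt, own_rlocs):
    """Strip the outer header if it targets one of this ETR's RLOCs."""
    if pkt.outer is None or pkt.outer.dst not in own_rlocs:
        raise ValueError("not a LISP-encapsulated packet for this ETR")
    return Packet(pkt.inner, None, pkt.payload)


host_pkt = Packet(IPHeader("192.0.2.10", "198.51.100.20"), payload=b"hello")
tunneled = itr_encapsulate(
    host_pkt, "203.0.113.1", {"198.51.100.20": "203.0.113.77"}
)
delivered = etr_decapsulate(tunneled, {"203.0.113.77"})
```

   The inner header is never modified in this model, which reflects the
   point that hosts at both ends see only EIDs.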

423	   Some basic rules governing LISP are:

425	   o  End-systems (hosts) only send to addresses which are EIDs.  They
426	      don't know addresses are EIDs versus RLOCs but assume packets get
427	      to LISP routers, which in turn deliver packets to the destination
428	      the end-system has specified.

430	   o  EIDs are always IP addresses assigned to hosts.

432	   o  LISP routers mostly deal with Routing Locator addresses.  See
433	      details later in Section 4.1 to clarify what is meant by "mostly".

435	   o  RLOCs are always IP addresses assigned to routers; preferably,
436	      topologically-oriented addresses from provider CIDR blocks.

438	   o  When a router originates packets it may use as a source address
439	      either an EID or RLOC.  When acting as a host (e.g. when
440	      terminating a transport session such as SSH, TELNET, or SNMP), it
441	      may use an EID that is explicitly assigned for that purpose.  An
442	      EID that identifies the router as a host MUST NOT be used as an
443	      RLOC; an EID is only routable within the scope of a site.  A
444	      typical BGP configuration might demonstrate this "hybrid" EID/RLOC
445	      usage where a router could use its "host-like" EID to terminate
446	      iBGP sessions to other routers in a site while at the same time
447	      using RLOCs to terminate eBGP sessions to routers outside the
448	      site.

450	   o  EIDs are not expected to be usable for global end-to-end
451	      communication in the absence of an EID-to-RLOC mapping operation.
452	      They are expected to be used locally for intra-site communication.

454	   o  EID prefixes are likely to be hierarchically assigned in a manner
455	      which is optimized for administrative convenience and to
456	      facilitate scaling of the EID-to-RLOC mapping database.  The
457	      hierarchy is based on an address allocation hierarchy which is not
458	      dependent on the network topology.

460	   o  EIDs may also be structured (subnetted) in a manner suitable for
461	      local routing within an autonomous system.

463	   An additional LISP header may be prepended to packets by a transit
464	   router (i.e.  TE-ITR) when re-routing of the path for a packet is
465	   desired.  An obvious instance of this would be an ISP router that
466	   needs to perform traffic engineering for packets in flow through its
467	   network.  In such a situation, termed Recursive Tunneling, an ISP
468	   transit router acts as an additional ITR, and the RLOC it uses for
469	   the new prepended header would be either a TE-ETR within the ISP
470	   (along an intra-ISP traffic engineered path) or a TE-ETR within
471	   another ISP (an inter-ISP traffic engineered path, where an
472	   agreement to build such a path exists).

474	   This specification mandates that no more than two LISP headers get
475	   prepended to a packet.  This avoids excessive packet overhead as well
476	   as possible encapsulation loops.  It is believed two headers are
477	   sufficient, where the first prepended header is used at a site for
478	   Location/Identity separation and the second prepended header is used
479	   inside a service provider for Traffic Engineering purposes.
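
   One way an implementation might enforce the two-header limit is
   sketched below.  The header stack representation (a list of
   source/destination RLOC pairs, outermost first), the exception type,
   and the function name are assumptions for illustration; this
   document does not prescribe how the check is performed.

```python
MAX_LISP_HEADERS = 2  # limit stated in the text above


class TooManyLispHeaders(Exception):
    """Raised when a third LISP header would be prepended."""


def prepend_lisp_header(header_stack, src_rloc, dst_rloc):
    """Return a new stack with one more outer header, outermost first."""
    if len(header_stack) >= MAX_LISP_HEADERS:
        raise TooManyLispHeaders(
            "packet already carries %d LISP headers" % len(header_stack)
        )
    return [(src_rloc, dst_rloc)] + list(header_stack)


# First header: added at the site for Location/Identity separation.
site_stack = prepend_lisp_header([], "203.0.113.1", "203.0.113.77")

# Second header: added by a TE-ITR inside a service provider.
te_stack = prepend_lisp_header(site_stack, "198.51.100.1", "198.51.100.9")
```

   A further call on te_stack raises TooManyLispHeaders, so a
   misconfigured chain of tunnel routers cannot keep growing the packet.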

481	   Tunnel Routers can be placed fairly flexibly in a multi-AS topology.
482	   For example, the ITR for a particular end-to-end packet exchange
483	   might be the first-hop or default router within a site for the source
484	   host.  Similarly, the egress tunnel router might be the last-hop
485	   router directly-connected to the destination host.  Another example,
486	   perhaps for a VPN service out-sourced to an ISP by a site, the ITR
487	   could be the site's border router at the service provider attachment
488	   point.  Mixing and matching of site-operated, ISP-operated, and other
489	   tunnel routers is allowed for maximum flexibility.  See Section 8 for
490	   more details.

492	4.1.  Packet Flow Sequence

494	   This section provides an example of the unicast packet flow with the
495	   following conditions:

497	   o  Source host "host1.abc.com" is sending a packet to
498	      "host2.xyz.com", exactly what host1 would do if the site were not
499	      using LISP.

501	   o  Each site is multi-homed, so each tunnel router has an address
502	      (RLOC) assigned from the service provider address block for each
503	      provider to which that particular tunnel router is attached.

505	   o  The ITR(s) and ETR(s) are directly connected to the source and
506	      destination, respectively.

508	   o  Data Probes, rather than Map-Requests, are used to solicit
509	      Map-Replies.  The Data Probes are sent on the underlying topology
510	      (the LISP 1.0 variant) but could also be sent over an alternative
511	      topology (the LISP 1.5 variant) as they would in [ALT].

513	   Client host1.abc.com wants to communicate with server host2.xyz.com:

515	   1.  host1.abc.com wants to open a TCP connection to host2.xyz.com.
516	       It does a DNS lookup on host2.xyz.com.  An A/AAAA record is
517	       returned.  This address is used as the destination EID and the
518	       locally-assigned address of host1.abc.com is used as the source
519	       EID.  An IPv4 or IPv6 packet is built using the EIDs in the IPv4
520	       or IPv6 header and sent to the default router.

522	   2.  The default router is configured as an ITR.  The ITR must be able
523	       to map the EID destination to an RLOC of the ETR at the
524	       destination site.  The ITR prepends a LISP header to the packet,
525	       with one of its RLOCs as the source IPv4 or IPv6 address.  The
526	       destination EID from the original packet header is used as the
527	       destination IPv4 or IPv6 in the prepended LISP header.
528	       Subsequent packets, whose outer destination address is the
529	       destination EID, will be sent until the EID-to-RLOC mapping is
530	       learned.

532	   3.  In LISP 1.0, the packet is routed through the Internet as it is
533	       today.  In LISP 1.5, the packet is routed on a different topology
534	       which may have EID prefixes distributed and advertised in an
535	       aggregatable fashion.  In either case, the packet arrives at the
536	       ETR.  The router is configured to "punt" the packet to the
537	       router's processor.  See Section 7 for more details.  For LISP
538	       2.0 and 3.0, the behavior is not fully defined yet.

540	   4.  The LISP header is stripped so that the packet can be forwarded
541	       by the router control plane.  The router looks up the destination
542	       EID in the router's EID-to-RLOC database (not the cache, but the
543	       configured data structure of RLOCs).  An EID-to-RLOC Map-Reply
544	       message is originated by the ETR and is addressed to the source
545	       RLOC in the LISP header of the original packet (this is the ITR).
546	       The source RLOC of the Map-Reply is one of the ETR's RLOCs.

548	   5.  The ITR receives the Map-Reply message, parses the message (to
549	       check for format validity) and stores the mapping information
550	       from the packet.  This information is put in the ITR's EID-to-
551	       RLOC mapping cache (this is the on-demand cache, the cache where
552	       entries time out due to inactivity).

554	   6.  Subsequent packets from host1.abc.com to host2.xyz.com will have
555	       a LISP header prepended by the ITR using the appropriate RLOC as
556	       the LISP header destination address learned from the ETR.  Note,
557	       the packet may be sent to a different ETR than the one which
558	       returned the Map-Reply due to the source site's hashing policy or
559	       the destination site's locator-set policy.

561	   7.  The ETR receives these packets directly (since the destination
562	       address is one of its assigned IP addresses), strips the LISP
563	       header and forwards the packets to the attached destination host.
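
   The ITR side of steps 2, 5, and 6 above can be sketched in Python as
   follows.  This is a non-normative illustration only: the class and
   function names are hypothetical, and always picking the first RLOC
   stands in for the priority and hashing policies described elsewhere
   in this specification.

```python
import ipaddress


class ItrMapCache:
    """Hypothetical on-demand EID-to-RLOC mapping cache kept by an ITR."""

    def __init__(self):
        self._entries = {}  # EID-prefix (ip_network) -> list of RLOC strings

    def store(self, eid_prefix, rlocs):
        # Step 5: store the mapping learned from a Map-Reply.
        self._entries[ipaddress.ip_network(eid_prefix)] = list(rlocs)

    def lookup(self, eid):
        # Longest-prefix match of the destination EID against cached prefixes.
        addr = ipaddress.ip_address(eid)
        matches = [p for p in self._entries
                   if p.version == addr.version and addr in p]
        if not matches:
            return None
        return self._entries[max(matches, key=lambda p: p.prefixlen)]


def outer_destination(cache, dest_eid):
    """Return (outer destination address, is_data_probe) for a packet."""
    rlocs = cache.lookup(dest_eid)
    if rlocs is None:
        # Step 2: no mapping yet, so the destination EID itself is used as
        # the outer destination address (the packet acts as a Data Probe).
        return dest_eid, True
    # Step 6: a mapping is cached; choosing rlocs[0] is a simplification.
    return rlocs[0], False
```

   Before any Map-Reply is stored, outer_destination() returns the EID
   itself; once store() has been called for a covering EID-prefix, it
   returns one of the learned RLOCs.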

565	   In order to eliminate the need for a mapping lookup in the reverse
566	   direction, an ETR MAY create a cache entry that maps the source EID
567	   (inner header source IP address) to the source RLOC (outer header
568	   source IP address) in a received LISP packet.  Such a cache entry is
569	   termed a "gleaned" mapping and only contains a single RLOC for the
570	   EID in question.  More complete information about additional RLOCs
571	   SHOULD be verified by sending a LISP Map-Request for that EID.  Both
572	   the ITR and the ETR may also influence the decision the other makes
573	   in selecting an RLOC.  See Section 6 for more details.
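
   A gleaned mapping as described above might be recorded as in the
   following non-normative Python sketch.  The dictionary layout and the
   "verified" flag are assumptions made for illustration; they are not
   defined by this specification.

```python
def glean(glean_cache, inner_src_eid, outer_src_rloc):
    """Record a gleaned mapping from a received LISP packet.

    The entry holds only the single RLOC seen in the outer header and is
    marked unverified until a Map-Request/Map-Reply exchange completes.
    """
    entry = {"rlocs": [outer_src_rloc], "verified": False}
    glean_cache[inner_src_eid] = entry
    return entry


def needs_map_request(entry):
    # Gleaned entries SHOULD be verified by sending a Map-Request for the EID.
    return not entry["verified"]
```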

575	5.  Tunneling Details

577	   This section describes the LISP Data Message which defines the
578	   tunneling header used to encapsulate IPv4 and IPv6 packets which
579	   contain EID addresses.  Even though the following formats illustrate
580	   IPv4-in-IPv4 and IPv6-in-IPv6 encapsulations, the other 2
581	   combinations are supported as well.

583	   Since additional tunnel headers are prepended, the packet becomes
584	   larger and in theory can exceed the MTU of any link traversed from
585	   the ITR to the ETR.  It is recommended that, in IPv4, packets not be
586	   fragmented as they are encapsulated by the ITR.  Instead, the
587	   packet is dropped and an ICMP Too Big message is returned to the
588	   source.

590	   Based on informal surveys of large ISP traffic patterns, it appears
591	   that most transit paths can accommodate a path MTU of at least 4470
592	   bytes.  The exceptions, in terms of data rate, number of hosts
593	   affected, or any other metric, are expected to be vanishingly small.

595	   To address MTU concerns, mainly raised on the RRG mailing list, the
596	   LISP deployment process will include collecting data during its pilot
597	   phase to either verify or refute the assumption about minimum
598	   available MTU.  If the assumption proves true and transit networks
599	   with links limited to 1500 byte MTUs are corner cases, it would seem
600	   more cost-effective to either upgrade or modify the equipment in
601	   those transit networks to support larger MTUs or to use existing
602	   mechanisms for accommodating packets that are too large.

604	   For this reason, there is currently no plan for LISP to add any
605	   additional, complex mechanism for implementing fragmentation and
606	   reassembly in the face of limited-MTU transit links.  If analysis
607	   during LISP pilot deployment reveals that the assumption of
608	   essentially ubiquitous 4470+ byte transit path MTUs is incorrect,
609	   then LISP can be modified prior to protocol standardization to add
610	   support for one of the proposed fragmentation and reassembly schemes.
611	   Note that two simple existing schemes are detailed in Section 5.4.

613	5.1.  LISP IPv4-in-IPv4 Header Format

615	        0                   1                   2                   3
616	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
617	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
618	     / |Version|  IHL  |Type of Service|          Total Length         |
619	    /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
620	   |   |         Identification        |Flags|      Fragment Offset    |
621	   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
622	   OH  |  Time to Live | Protocol = 17 |         Header Checksum       |
623	   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
624	   |   |                    Source Routing Locator                     |
625	    \  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
626	     \ |                 Destination Routing Locator                   |
627	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
628	     / |       Source Port = xxxx      |       Dest Port = 4341        |
629	   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
630	     \ |           UDP Length          |        UDP Checksum           |
631	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
632	   L / |S|                     Locator Reach Bits                      |
633	   I   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
634	   S \ |                             Nonce                             |
635	   P  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
636	     / |Version|  IHL  |Type of Service|          Total Length         |
637	    /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
638	   |   |         Identification        |Flags|      Fragment Offset    |
639	   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
640	   IH  |  Time to Live |    Protocol   |         Header Checksum       |
641	   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
642	   |   |                           Source EID                          |
643	    \  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
644	     \ |                         Destination EID                       |
645	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

647	5.2.  LISP IPv6-in-IPv6 Header Format

649	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
650	     / |Version| Traffic Class |           Flow Label                  |
651	    /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
652	   |   |         Payload Length        | Next Header=17|   Hop Limit   |
653	   v   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
654	       |                                                               |
655	   O   +                                                               +
656	   u   |                                                               |
657	   t   +                     Source Routing Locator                    +
658	   e   |                                                               |
659	   r   +                                                               +
660	       |                                                               |
661	   H   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
662	   d   |                                                               |
663	   r   +                                                               +
664	       |                                                               |
665	   ^   +                  Destination Routing Locator                  +
666	   |   |                                                               |
667	    \  +                                                               +
668	     \ |                                                               |
669	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
670	     / |       Source Port = xxxx      |       Dest Port = 4341        |
671	   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
672	     \ |           UDP Length          |        UDP Checksum           |
673	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
674	   L / |S|                     Locator Reach Bits                      |
675	   I   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
676	   S \ |                             Nonce                             |
677	   P   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
678	     / |Version| Traffic Class |           Flow Label                  |
679	    /  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
680	   /   |         Payload Length        |  Next Header  |   Hop Limit   |
681	   v   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
682	       |                                                               |
683	   I   +                                                               +
684	   n   |                                                               |
685	   n   +                          Source EID                           +
686	   e   |                                                               |
687	   r   +                                                               +
688	       |                                                               |
689	   H   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
690	   d   |                                                               |
691	   r   +                                                               +
692	       |                                                               |
694	   ^   +                        Destination EID                        +
695	   \   |                                                               |
696	    \  +                                                               +
697	     \ |                                                               |
698	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

700	5.3.  Tunnel Header Field Descriptions

702	   IH Header:  is the inner header, preserved from the datagram received
703	      from the originating host.  The source and destination IP
704	      addresses are EIDs.

706	   OH Header:  is the outer header prepended by an ITR.  The address
707	      fields contain RLOCs obtained from the ingress router's EID-to-
708	      RLOC cache.  The IP protocol number is "UDP (17)" from [RFC0768].

710	   UDP Header:  contains an ITR-selected source port when encapsulating
711	      a packet.  See Section 6.4 for details on the hash algorithm used
712	      to select a source port based on the 5-tuple of the inner header.
713	      The destination port MUST be set to the well-known IANA assigned
714	      port value 4341.

716	   UDP Checksum:  this field MUST be transmitted as 0 and ignored on
717	      receipt by the ETR.  Note, even when the UDP checksum is
718	      transmitted as 0, an intervening NAT device can recalculate the
719	      checksum and rewrite the UDP checksum field to non-zero.  For
720	      performance reasons, the ETR MUST ignore the checksum and MUST NOT
721	      do a checksum computation.

723	   UDP Length:  for an IPv4 encapsulated packet, the inner header Total
724	      Length plus the UDP and LISP header lengths are used.  For an IPv6
725	      encapsulated packet, the inner header Payload Length plus the size
726	      of the IPv6 header (40 bytes) plus the size of the UDP and LISP
727	      headers are used.  The UDP header length is 8 bytes.  The LISP
728	      header length is 8 bytes when no loc-reach-bit header extensions
729	      are used.

731	   S: this is the Solicit-Map-Request (SMR) bit.  See Section 6.5.2 for
732	      details.

734	   LISP Locator Reach Bits:  in the LISP header are set by an ITR to
735	      indicate to an ETR the reachability of the Locators in the source
736	      site.  Each RLOC in a Map-Reply is assigned an ordinal value from
737	      0 to n-1 (when there are n RLOCs in a mapping entry).  The Locator
738	      Reach Bits are numbered from 0 to n-1, starting from the least
739	      significant bit of the 31-bit field.  A bit set to 1 indicates to
740	      the ETR that the RLOC associated with the bit ordinal is
741	      reachable.  See Section 6.3 for how an ITR can determine whether
742	      other ITRs at the site are reachable.  When a site has
743	      multiple EID-prefixes which result in multiple mappings (where
744	      each could have a different locator-set), the Locator Reach Bits
745	      setting in an encapsulated packet MUST reflect the mapping for the
746	      EID-prefix that the inner-header source EID address matches.

748	   LISP Nonce:  is a 32-bit value that is randomly generated by an ITR.
749	      It is used to test route-returnability when xTRs exchange
750	      encapsulated data packets with the SMR bit set, Data-Probe, Map-
751	      Request, or Map-Reply messages.
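
   The 8-byte LISP header and the UDP Length rule described above can be
   sketched in Python as follows.  This is a non-normative illustration:
   the function names are hypothetical, and the sketch covers only the
   base header (no loc-reach-bit header extensions).

```python
import secrets
import struct

SMR_BIT = 0x80000000      # "S" bit: most significant bit of the first word
REACH_MASK = 0x7FFFFFFF   # 31-bit Locator Reach Bits field


def reach_bits(reachable_ordinals):
    """Set bit i (counting from the least significant bit) per reachable RLOC."""
    bits = 0
    for ordinal in reachable_ordinals:
        if not 0 <= ordinal < 31:
            raise ValueError("ordinal does not fit in the 31-bit field")
        bits |= 1 << ordinal
    return bits


def pack_lisp_header(reachable_ordinals, smr=False, nonce=None):
    """Build the 8-byte LISP header: S bit + Locator Reach Bits, then Nonce."""
    if nonce is None:
        nonce = secrets.randbits(32)  # 32-bit random value chosen by the ITR
    first_word = reach_bits(reachable_ordinals) | (SMR_BIT if smr else 0)
    return struct.pack("!II", first_word, nonce)


def unpack_lisp_header(data):
    """Return (smr, locator_reach_bits, nonce) from the first 8 bytes."""
    first_word, nonce = struct.unpack("!II", data[:8])
    return bool(first_word & SMR_BIT), first_word & REACH_MASK, nonce


def udp_length(inner_length_field, inner_is_ipv6, lisp_header_len=8):
    """UDP Length for an encapsulated packet.

    inner_length_field is the inner IPv4 Total Length or, for IPv6, the
    inner Payload Length (to which the 40-byte IPv6 header is added).
    """
    inner_size = inner_length_field + (40 if inner_is_ipv6 else 0)
    return 8 + lisp_header_len + inner_size
```

   For example, a site with three RLOCs of which ordinals 0 and 2 are
   reachable would send Locator Reach Bits with value 5.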

753	   When doing Recursive Tunneling:

755	   o  The OH header Time to Live field (or Hop Limit field, in the case
756	      of IPv6) MUST be copied from the IH header Time to Live field.

758	   o  The OH header Type of Service field (or the Traffic Class field,
759	      in the case of IPv6) SHOULD be copied from the IH header Type of
760	      Service field.

762	   When doing Re-encapsulated Tunneling:

764	   o  The new OH header Time to Live field SHOULD be copied from the
765	      stripped OH header Time to Live field.

767	   o  The new OH header Type of Service field SHOULD be copied from the
768	      stripped OH header Type of Service field.

770	   Copying the TTL serves two purposes: first, it preserves the distance
771	   the host intended the packet to travel; second, and more importantly,
772	   it provides for suppression of looping packets in the event there is
773	   a loop of concatenated tunnels due to misconfiguration.

775	5.4.  Dealing with Large Encapsulated Packets

777	   In the event that the MTU issues mentioned above prove to be more
778	   serious than expected, this section proposes 2 simple mechanisms to
779	   deal with large packets.  One is stateless using IP fragmentation and
780	   the other is stateful using Path MTU Discovery [RFC1191].

782	   It is left to the implementor to decide if the stateless or stateful
783	   mechanism should be implemented.  Both or neither may also be chosen,
784	   since how an ITR deals with MTU issues is a local decision.  Sites
785	   can interoperate with differing mechanisms.

787	5.4.1.  A Stateless Solution to MTU Handling

789	   An ITR stateless solution to handle MTU issues is described as
790	   follows:

792	   1.  Define an architectural constant S for the maximum size of a
793	       packet, in bytes, an ITR would receive from a source inside of
794	       its site.

796	   2.  Define L to be the maximum size, in bytes, a packet of size S
797	       would be after the ITR prepends the LISP header, UDP header, and
798	       outer network layer header of size H.

800	   3.  Calculate: S + H = L.

802	   When an ITR receives a packet from a site-facing interface and adds H
803	   bytes worth of encapsulation to yield a packet size of L bytes, it
804	   resolves the MTU issue by first splitting the original packet into 2
805	   equal-sized fragments.  A LISP header is then prepended to each
806	   fragment.  This will ensure that the new, encapsulated packets are of
807	   size (S/2 + H), which is always below the effective tunnel MTU.

809	   When an ETR receives encapsulated fragments, it treats them as two
810	   individually encapsulated packets.  It strips the LISP headers then
811	   forwards each fragment to the destination host of the destination
812	   site.  The two fragments are reassembled at the destination host into
813	   the single IP datagram that was originated by the source host.

815	   This behavior is performed by the ITR when the source host originates
816	   a packet with the DF field of the IP header set to 0.  When the DF
817	   field of the IP header is set to 1, or the packet is an IPv6 packet
818	   originated by the source host, the ITR will drop the packet when the
819	   size is greater than L, and send an ICMP Too Big message to the
820	   source with a value of S, where S is (L - H).

822	   This specification recommends that L be defined as 1500.
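
   The stateless decision described in this section can be sketched in
   Python as follows.  This is a non-normative illustration with
   hypothetical names.  It assumes an outer IPv4 header without options
   (so H is 20 + 8 + 8 = 36 bytes), and it interprets the size test as
   "the encapsulated packet would exceed L", that is, the original
   packet is larger than S = L - H.

```python
from enum import Enum

# Assumed encapsulation overhead: outer IPv4 header (no options) + UDP + LISP.
OUTER_IPV4_ENCAP = 20 + 8 + 8


class MtuAction(Enum):
    ENCAPSULATE = 1
    FRAGMENT_IN_TWO = 2
    DROP_AND_SEND_TOO_BIG = 3


def stateless_mtu_action(packet_size, df_set, is_ipv6=False,
                         H=OUTER_IPV4_ENCAP, L=1500):
    """Return (action, mtu_to_advertise) for a packet from a site interface.

    mtu_to_advertise is S = L - H when an ICMP Too Big is sent, else None.
    """
    S = L - H
    if packet_size <= S:
        # Fits after encapsulation; no special handling is needed.
        return MtuAction.ENCAPSULATE, None
    if df_set or is_ipv6:
        # DF=1 or an IPv6 packet: drop and report S to the source.
        return MtuAction.DROP_AND_SEND_TOO_BIG, S
    # DF=0 IPv4 packet: split into two fragments, then encapsulate each.
    return MtuAction.FRAGMENT_IN_TWO, None
```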

824	5.4.2.  A Stateful Solution to MTU Handling

826	   An ITR stateful solution to handle MTU issues is described as
827	   follows and was first introduced in [OPENLISP]:

829	   1.  The ITR will keep state of the effective MTU for each locator per
830	       mapping cache entry.  The effective MTU is what the core network
831	       can deliver along the path between ITR and ETR.

833	   2.  When an encapsulated packet exceeds what the core network can
834	       deliver, one of the intermediate routers on the path will send an
835	       ICMP Too Big message to the ITR.  The ITR will parse the ICMP
836	       message to determine which locator is affected by the effective
837	       MTU change and then record the new effective MTU value in the
838	       mapping cache entry.

840	   3.  When a packet is received by the ITR from a source inside of the
841	       site and the size of the packet is greater than the effective MTU
842	       stored with the mapping cache entry associated with the
843	       destination EID the packet is for, the ITR will send an ICMP Too
844	       Big message back to the source.  The packet size advertised by
845	       the ITR in the ICMP Too Big message is the effective MTU minus
846	       the LISP encapsulation length.

848	   Even though this mechanism is stateful, it has advantages over the
849	   stateless IP fragmentation mechanism, by not involving the
850	   destination host with reassembly of ITR fragmented packets.
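
   The per-locator state kept by the stateful mechanism can be sketched
   in Python as follows.  This is a non-normative illustration: the
   class name, the keying by (EID-prefix, RLOC), and the default MTU of
   1500 are assumptions.  The sketch also assumes that a packet is "too
   big" when it would exceed the effective MTU once the encapsulation
   length is added, which matches the value advertised in step 3.

```python
class LocatorMtuState:
    """Hypothetical effective-MTU state, per locator per map-cache entry."""

    def __init__(self, default_mtu=1500):
        self._default_mtu = default_mtu  # assumed value until ICMP says otherwise
        self._effective = {}             # (eid_prefix, rloc) -> effective MTU

    def effective_mtu(self, eid_prefix, rloc):
        return self._effective.get((eid_prefix, rloc), self._default_mtu)

    def on_icmp_too_big(self, eid_prefix, rloc, reported_mtu):
        # Step 2: record the new effective MTU for the affected locator.
        self._effective[(eid_prefix, rloc)] = reported_mtu

    def check(self, eid_prefix, rloc, packet_size, encap_len):
        """Step 3: return None if the packet can be sent, otherwise the
        packet size to advertise in an ICMP Too Big sent to the source."""
        advertised = self.effective_mtu(eid_prefix, rloc) - encap_len
        if packet_size > advertised:
            return advertised
        return None
```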

852	6.  EID-to-RLOC Mapping

854	6.1.  LISP IPv4 and IPv6 Control Plane Packet Formats

856	   The following new UDP packet types are used to retrieve EID-to-RLOC
857	   mappings:

859	       0                   1                   2                   3
860	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
861	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
862	       |Version|  IHL  |Type of Service|          Total Length         |
863	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
864	       |         Identification        |Flags|      Fragment Offset    |
865	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
866	       |  Time to Live | Protocol = 17 |         Header Checksum       |
867	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
868	       |                    Source Routing Locator                     |
869	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
870	       |                 Destination Routing Locator                   |
871	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
872	     / |           Source Port         |         Dest Port             |
873	   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
874	     \ |           UDP Length          |        UDP Checksum           |
875	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
876	       |                                                               |
877	       |                         LISP Message                          |
878	       |                                                               |
879	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

881	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
882	       |Version| Traffic Class |           Flow Label                  |
883	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
884	       |         Payload Length        | Next Header=17|   Hop Limit   |
885	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
886	       |                                                               |
887	       +                                                               +
888	       |                                                               |
889	       +                     Source Routing Locator                    +
890	       |                                                               |
891	       +                                                               +
892	       |                                                               |
893	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
894	       |                                                               |
895	       +                                                               +
896	       |                                                               |
897	       +                  Destination Routing Locator                  +
898	       |                                                               |
899	       +                                                               +
900	       |                                                               |
901	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
902	     / |           Source Port         |         Dest Port             |
903	   UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
904	     \ |           UDP Length          |        UDP Checksum           |
905	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
906	       |                                                               |
907	       |                         LISP Message                          |
908	       |                                                               |
909	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

911	   The LISP UDP-based messages are the Map-Request and Map-Reply
912	   messages.  When a UDP Map-Request is sent, the UDP source port is
913	   chosen by the sender and the destination UDP port number is set to
914	   4342.  When a UDP Map-Reply is sent, the source UDP port number is
915	   set to 4342 and the destination UDP port number is copied from the
916	   source port of either the Map-Request or the invoking data packet.

918	   The UDP Length field will reflect the length of the UDP header and
919	   the LISP Message payload.

921	   The UDP Checksum is computed and set to non-zero for Map-Request and
922	   Map-Reply messages.  It MUST be checked on receipt and if the
923	   checksum fails, the packet MUST be dropped.
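
   The checksum rule for Map-Request and Map-Reply messages can be
   sketched in Python as follows for the IPv4 case.  This is a
   non-normative illustration with hypothetical function names; it uses
   the standard UDP pseudo-header checksum and treats a zero checksum
   field as a failure, since control messages carry a non-zero checksum.

```python
import socket
import struct


def ones_complement_sum16(data):
    """16-bit one's-complement sum of data (zero-padded to even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def _pseudo_header(src_ip, dst_ip, udp_len):
    # IPv4 pseudo-header: source, destination, zero, protocol 17, UDP length.
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_len))


def build_udp(src_ip, dst_ip, src_port, dst_port, payload):
    """Build a UDP segment with a computed, non-zero checksum."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    total = ones_complement_sum16(
        _pseudo_header(src_ip, dst_ip, length) + header + payload)
    checksum = (~total & 0xFFFF) or 0xFFFF  # a computed 0 is sent as all ones
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def control_checksum_ok(src_ip, dst_ip, udp_segment):
    """True if the segment carries a non-zero checksum that verifies."""
    if udp_segment[6:8] == b"\x00\x00":
        return False
    total = ones_complement_sum16(
        _pseudo_header(src_ip, dst_ip, len(udp_segment)) + udp_segment)
    return total == 0xFFFF
```

   A receiver would drop any Map-Request or Map-Reply for which
   control_checksum_ok() returns False.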

925	   LISP-CONS [CONS] uses TCP to send LISP control messages.  The format
926	   of control messages includes the UDP header so the checksum and
927	   length fields can be used to protect and delimit message boundaries.

929	   This main LISP specification is the authoritative source for message
930	   format definitions for the Map-Request and Map-Reply messages.

932	6.1.1.  LISP Packet Type Allocations

934	   This section will be the authoritative source for allocating LISP
935	   Type values.  Current allocations are:

937	       Reserved:                        0    b'0000'
938	       LISP Map-Request:                1    b'0001'
939	       LISP Map-Reply:                  2    b'0010'
940	       LISP-CONS Open Message:          8    b'1000'
941	       LISP-CONS Push-Add Message:      9    b'1001'
942	       LISP-CONS Push-Delete Message:   10   b'1010'
943	       LISP-CONS Unreachable Message:   11   b'1011'
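
   The allocations above could be represented as in the following
   non-normative Python sketch.  The enumeration and function names are
   hypothetical.  Reading the Type from the top 4 bits of the ninth
   byte is an assumption based on the Map-Request and Map-Reply layouts
   in this document; it may not hold for other message types.

```python
from enum import IntEnum


class LispType(IntEnum):
    RESERVED = 0
    MAP_REQUEST = 1
    MAP_REPLY = 2
    CONS_OPEN = 8
    CONS_PUSH_ADD = 9
    CONS_PUSH_DELETE = 10
    CONS_UNREACHABLE = 11


def message_type(lisp_message):
    """Return the LispType of a Map-Request or Map-Reply style message.

    The Type field is the high-order 4 bits of the word that follows the
    Locator Reach Bits word and the Nonce (byte offset 8).
    """
    return LispType(lisp_message[8] >> 4)
```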

945	6.1.2.  Map-Request Message Format

947	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
948	       |S|                     Locator Reach Bits                      |
949	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
950	       |                             Nonce                             |
951	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
952	       |Type=1 |A|R|            Reserved               | Record Count  |
953	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
954	       |         Source-EID-AFI        |            ITR-AFI            |
955	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
956	       |                   Source EID Address  ...                     |
957	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
958	       |                Originating ITR RLOC Address ...               |
959	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
960	     / |   Reserved    | EID mask-len  |        EID-prefix-AFI         |
961	   Rec +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
962	     \ |                       EID-prefix  ...                         |
963	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
964	       |                   Map-Reply Record  ...                       |
965	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
966	       |                     Mapping Protocol Data                     |
967	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

969	   Packet field descriptions:

971	   S: This is the SMR bit.  See Section 6.5.2 for details.

973	   Locator Reach Bits:  These bits MUST be set to 0 on transmission and
974	      ignored on receipt.  They cannot be used for indicating
975	      reachability because the Map-Request does not have the EID-prefix
976	      for the sending site so the receiver of the Map-Request cannot
977	      know what mapping entry to associate the reachability with.
978	      However, when Mapping Data is provided in the Map-Reply Record
979	      field, and the receiver of the Map-Request is configured to accept
980	      the mapping data, the R-bit per locator entry in the EID-prefix
981	      record is used to denote reachability.

983	   Nonce:  A 4-byte random value created by the sender of the Map-
984	      Request.

986	   Type:   1 (Map-Request)

988	   A: This is an authoritative bit, which is set to 0 for UDP-based Map-
989	      Requests sent by an ITR.  See other control-specific documents
990	      [CONS] for TCP-based Map-Requests.

992	   R: When set, it indicates a Map-Reply Record segment is included in
993	      the Map-Request.

995	   Reserved:  Set to 0 on transmission and ignored on receipt.

997	   Record Count:  The number of records in this request message.  A
998	      record comprises the portion of the packet labeled 'Rec' above
999	      and occurs the number of times given by Record Count.

1001	   Source-EID-AFI:  Address family of the "Source EID Address" field.

1003	   ITR-AFI:  Address family of the "Originating ITR RLOC Address" field.

1005	   Source EID Address:  This is the EID of the source host which
1006	      originated the packet which is invoking this Map-Request.

1008	   Originating ITR RLOC Address:  Used to give the ETR the option of
1009	      returning a Map-Reply in the address-family of this locator.

1011	   EID mask-len:  Mask length for EID prefix.

1013	   EID-prefix-AFI:  Address family of EID-prefix according to [RFC2434].

1015	   EID-prefix:  4 bytes if an IPv4 address-family, 16 bytes if an IPv6
1016	      address-family.  When a Map-Request is sent by an ITR because a
1017	      data packet is received for a destination where there is no
1018	      mapping entry, the EID-prefix is set to the destination IP address
1019	      of the data packet, and the 'EID mask-len' is set to 32 or 128
1020	      for IPv4 or IPv6, respectively.  When an xTR wants to query a site
1021	      about the status of a mapping it already has cached, the EID-
1022	      prefix used in the Map-Request has the same mask-length as the
1023	      EID-prefix returned from the site when it sent a Map-Reply
1024	      message.

1026	   Map-Reply Record:  When the R bit is set, this field is the size of
1027	      the "Record" field in the Map-Reply format.  This Map-Reply record
1028	      contains the EID-to-RLOC mapping entry associated with the Source
1029	      EID.  This allows the ETR which will receive this Map-Request to
1030	      cache the data if it chooses to do so.

1032	   Mapping Protocol Data:  See [CONS] or [ALT] for details.  This field
1033	      is optional and present when the UDP length indicates there is
1034	      enough space in the packet to include it.
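
   A Map-Request built from the fields described above can be sketched
   in Python as follows.  This is a non-normative illustration with a
   hypothetical function name.  It builds a single-record request for a
   destination with no cached mapping (A=0, R=0, no Map-Reply Record,
   no Mapping Protocol Data), and uses the IANA address family numbers
   1 (IPv4) and 2 (IPv6) for the AFI fields.

```python
import ipaddress
import struct

AFI = {4: 1, 6: 2}  # IP version -> IANA address family number


def build_map_request(source_eid, itr_rloc, dest_eid, nonce, smr=False):
    """Return the LISP Message bytes of a one-record Map-Request."""
    src = ipaddress.ip_address(source_eid)
    itr = ipaddress.ip_address(itr_rloc)
    dst = ipaddress.ip_address(dest_eid)

    # Word 1: S bit, then Locator Reach Bits, which are sent as 0 here.
    first_word = 0x80000000 if smr else 0
    # Word 3: Type=1 in the top 4 bits, A=0, R=0, Reserved=0, Record Count=1.
    type_word = (1 << 28) | 1

    msg = struct.pack("!III", first_word, nonce, type_word)
    msg += struct.pack("!HH", AFI[src.version], AFI[itr.version])
    msg += src.packed + itr.packed
    # One 'Rec': Reserved, EID mask-len (32 or 128), EID-prefix-AFI, EID-prefix.
    msg += struct.pack("!BBH", 0, dst.max_prefixlen, AFI[dst.version])
    msg += dst.packed
    return msg
```

   With three IPv4 addresses as input, the resulting message is 32 bytes
   long, and the record carries a mask length of 32.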

1036	6.1.3.  EID-to-RLOC UDP Map-Request Message

1038	   A Map-Request is sent from an ITR when it needs a mapping for an EID,
1039	   wants to test an RLOC for reachability, or wants to refresh a mapping
1040	   before TTL expiration.  This is performed by using the RLOC as the
1041	   destination address for the Map-Request message with a randomly
1042	   allocated source UDP port number and the well-known destination port
1043	   number 4342.  A successful Map-Reply updates the cached set of RLOCs
1044	   associated with the EID prefix range.

1046	   Map-Requests MUST be rate-limited.  It is recommended that a Map-
1047	   Request for the same EID-prefix be sent no more than once per second.
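
	   The rate limit above can be sketched as follows.  This is a
	   minimal illustration, not part of the protocol: the class name
	   MapRequestRateLimiter, its methods, and the injectable clock are
	   hypothetical, and the 1-second default simply mirrors the
	   recommendation in the text.

```python
import time


class MapRequestRateLimiter:
    """Allow at most one Map-Request per EID-prefix per interval."""

    def __init__(self, min_interval=1.0, clock=time.monotonic):
        self.min_interval = min_interval  # seconds between requests
        self.clock = clock                # injectable so tests can fake time
        self._last_sent = {}              # EID-prefix -> time of last request

    def allow(self, eid_prefix):
        """Return True if a Map-Request for eid_prefix may be sent now."""
        now = self.clock()
        last = self._last_sent.get(eid_prefix)
        if last is not None and now - last < self.min_interval:
            return False
        self._last_sent[eid_prefix] = now
        return True
```

	   An ITR would call allow() before each Map-Request and drop or
	   defer the request when it returns False.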

1049	6.1.4.  Map-Reply Message Format

1051	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1052	       |x|                     Locator Reach Bits                      |
1053	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1054	       |                             Nonce                             |
1055	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1056	       |Type=2 |              Reserved                 | Record Count  |
1057	   +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1058	   |   |                          Record  TTL                          |
1059	   |   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1060	   R   | Locator Count | EID mask-len  |A|        Reserved             |
1061	   e   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1062	   c   |           Reserved            |            EID-AFI            |
1063	   o   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1064	   r   |                          EID-prefix                           |
1065	   d   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1066	   |  /|    Priority   |    Weight     |  M Priority   |   M Weight    |
1067	   | L +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1068	   | o |           Unused Flags      |R|           Loc-AFI             |
1069	   | c +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1070	   |  \|                             Locator                           |
1071	   +-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1072	       |                     Mapping Protocol Data                     |
1073	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1075	   Packet field descriptions:

1077	   x: Set to 0 on transmission and ignored on receipt.

1079	   Locator Reach Bits:  Refer to Section 5.3.  This field MUST be set to
1080	      0 on transmission and ignored on receipt.  The locator
1081	      reachability is encoded as the R-bit in each locator entry of each
1082	      EID-prefix record.

1084	   Nonce:  A 4-byte value set in a Data-Probe packet or a Map-Request
1085	      that is echoed here in the Map-Reply.

1087	   Type:   2 (Map-Reply)

1089	   Reserved:  Set to 0 on transmission and ignored on receipt.

1091	   Record Count:  The number of records in this reply message.  A record
1092	      is comprised of that portion of the packet labeled 'Record' above
1093	      and occurs the number of times given by Record Count.

1095	   Record TTL:  The time in minutes the recipient of the Map-Reply will
1096	      store the mapping.  If the TTL is 0, the entry should be removed
1097	      from the cache immediately.  If the value is 0xffffffff, the
1098	      recipient can decide locally how long to store the mapping.

1100	   Locator Count:  The number of Locator entries.  A locator entry
1101	      comprises what is labeled above as 'Loc'.  The locator count can
1102	      be 0 indicating there are no locators for the EID-prefix.

1104	   EID mask-len:  Mask length for EID prefix.

1106	   A: The Authoritative bit.  When sent in a UDP-based message, it is
1107	      always set by the ETR.  See [CONS] for TCP-based Map-Replies.

1109	   EID-AFI:  Address family of EID-prefix according to [RFC2434].

1111	   EID-prefix:  4 bytes if an IPv4 address-family, 16 bytes if an IPv6
1112	      address-family.

1114	   Priority:  each RLOC is assigned a unicast priority.  Lower values
1115	      are more preferable.  When multiple RLOCs have the same priority,
1116	      they may be used in a load-split fashion.  A value of 255 means
1117	      the RLOC MUST NOT be used for unicast forwarding.

1119	   Weight:  when priorities are the same for multiple RLOCs, the weight
1120	      indicates how to balance unicast traffic between them.  Weight is
1121	      encoded as a percentage of total unicast packets that match the
1122	      mapping entry.  If a non-zero weight value is used for any RLOC,
1123	      then all RLOCs must use a non-zero weight value and then the sum
1124	      of all weight values MUST equal 100.  If a zero value is used for
1125	      any RLOC weight, then all weights MUST be zero and the receiver of
1126	      the Map-Reply will decide how to load-split traffic.  See
1127	      Section 6.4 for a suggested hash algorithm to distribute load
1128	      across locators with same priority and equal weight values.  When
1129	      a single RLOC exists in a mapping entry, the weight value MUST be
1130	      set to 100 and ignored on receipt.

1132	   M Priority:  each RLOC is assigned a multicast priority used by an
1133	      ETR in a receiver multicast site to select an ITR in a source
1134	      multicast site for building multicast distribution trees.  A value
1135	      of 255 means the RLOC MUST NOT be used for joining a multicast
1136	      distribution tree.

1138	   M Weight:  when priorities are the same for multiple RLOCs, the
1139	      weight indicates how to balance building multicast distribution
1140	      trees across multiple ITRs.  The weight is encoded as a percentage
1141	      of the total number of trees built to the source site identified by
1142	      the EID-prefix.  If a non-zero weight value is used for any RLOC,
1143	      then all RLOCs must use a non-zero weight value and then the sum
1144	      of all weight values MUST equal 100.  If a zero value is used for
1145	      any RLOC weight, then all weights MUST be zero and the receiver of
1146	      the Map-Reply will decide how to distribute multicast state across
1147	      ITRs.

1149	   Unused Flags:  set to 0 when sending and ignored on receipt.

1151	   R: when this bit is set, the locator is known to be reachable from
1152	      the Map-Reply sender's perspective.  When there is a single
1153	      mapping record in the message, the R-bit for each locator must be
1154	      consistent with the corresponding bit of the 'Locator Reach Bits'
1155	      field at the start of the header.  When there are multiple
1156	      mapping records in the message, the 'Locator Reach Bits' field is
1157	      set to 0.

1159	   Locator:  an IPv4 or IPv6 address (as encoded by the 'Loc-AFI' field)
1160	      assigned to an ETR or router acting as a proxy replier for the
1161	      EID-prefix.  Note that the destination RLOC address MAY be an
1162	      anycast address.  A source RLOC can be an anycast address as well.
1163	      The source or destination RLOC MUST NOT be the broadcast address
1164	      (255.255.255.255 or any subnet broadcast address known to the
1165	      router), and MUST NOT be a link-local multicast address.  The
1166	      source RLOC MUST NOT be a multicast address.  The destination RLOC
1167	      SHOULD be a multicast address if it is being mapped from a
1168	      multicast destination EID.

1170	   Mapping Protocol Data:  See [CONS] or [ALT] for details.  This field
1171	      is optional and present when the UDP length indicates there is
1172	      enough space in the packet to include it.
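
	   The field layout described above can be decoded roughly as shown
	   here.  This is a sketch under assumptions: the function and
	   dictionary key names are hypothetical, AFI values 1 and 2 are the
	   address family numbers for IPv4 and IPv6, trailing Mapping Protocol
	   Data is ignored, and no validation beyond the Type check is done.

```python
import ipaddress
import struct

AFI_BYTES = {1: 4, 2: 16}  # address family numbers: 1 = IPv4, 2 = IPv6


def _read_address(buf, offset, afi):
    """Read one address of the given AFI; return (address, next offset)."""
    size = AFI_BYTES[afi]
    raw = bytes(buf[offset:offset + size])
    return ipaddress.ip_address(raw), offset + size


def parse_map_reply(buf):
    """Decode the header and every record of a Map-Reply message."""
    _reach_bits, nonce, word3 = struct.unpack_from("!III", buf, 0)
    if word3 >> 28 != 2:                      # Type is the top 4 bits
        raise ValueError("not a Map-Reply")
    record_count = word3 & 0xFF               # Record Count is the low 8 bits
    offset, records = 12, []
    for _ in range(record_count):
        ttl, loc_count, mask_len, a_word, _rsvd, eid_afi = struct.unpack_from(
            "!IBBHHH", buf, offset)
        offset += 12
        eid, offset = _read_address(buf, offset, eid_afi)
        locators = []
        for _ in range(loc_count):
            prio, weight, m_prio, m_weight, flags, loc_afi = struct.unpack_from(
                "!BBBBHH", buf, offset)
            offset += 8
            addr, offset = _read_address(buf, offset, loc_afi)
            locators.append({
                "address": addr, "priority": prio, "weight": weight,
                "m_priority": m_prio, "m_weight": m_weight,
                "reachable": bool(flags & 1),     # R is the last flag bit
            })
        records.append({
            "ttl_minutes": ttl,
            "eid_prefix": ipaddress.ip_network(f"{eid}/{mask_len}", strict=False),
            "authoritative": bool(a_word >> 15),  # A is the first bit
            "locators": locators,
        })
    return {"nonce": nonce, "records": records}
```

	   The nonce returned by the parser would be compared against the
	   nonce of the outstanding Map-Request before the records are cached.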

1174	6.1.5.  EID-to-RLOC UDP Map-Reply Message

1176	   When a Data Probe packet or a Map-Request triggers a Map-Reply to be
1177	   sent, the RLOCs associated with the EID-prefix matched by the EID in
1178	   the original packet destination IP address field will be returned.
1179	   The RLOCs in the Map-Reply are the globally-routable IP addresses of
1180	   the ETR but are not necessarily reachable; separate testing of
1181	   reachability is required.

1183	   Note that a Map-Reply may contain different EID-prefix granularity
1184	   (prefix + length) than the Map-Request which triggers it.  This might
1185	   occur if a Map-Request were for a prefix that had been returned by an
1186	   earlier Map-Reply.  In such a case, the requester updates its cache
1187	   with the new prefix information and granularity.  For example, a
1188	   requester with two cached EID-prefixes that are covered by a Map-
1189	   Reply containing one less-specific prefix replaces both entries with
1190	   the less-specific EID-prefix.  The reverse, replacement of one
1191	   less-specific prefix with multiple more-specific prefixes, can also
1192	   occur.  In that case the less-specific prefix is not removed;
1193	   instead, the more-specific prefixes are added and override the
1194	   less-specific prefix during a lookup.
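
	   A minimal sketch of this cache behavior, assuming the map-cache is
	   a dictionary keyed by prefix.  The names install_mapping and lookup
	   are hypothetical, and a real implementation would use a trie rather
	   than a linear scan.

```python
import ipaddress


def install_mapping(cache, prefix, rlocs):
    """Install a Map-Reply prefix and drop cached more-specifics it covers."""
    new = ipaddress.ip_network(prefix)
    for cached in list(cache):
        if cached != new and cached.version == new.version and cached.subnet_of(new):
            del cache[cached]
    # A cached less-specific prefix is left in place; lookup() prefers
    # the longer match, so the new entry overrides it.
    cache[new] = rlocs


def lookup(cache, eid):
    """Return the longest cached prefix that covers eid, or None."""
    addr = ipaddress.ip_address(eid)
    matches = [p for p in cache if p.version == addr.version and addr in p]
    return max(matches, key=lambda p: p.prefixlen, default=None)
```

	   Installing 10.1.0.0/16 after 10.1.1.0/24 and 10.1.2.0/24 leaves
	   only the /16 in the cache, while installing a /24 after the /16
	   keeps both entries.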

1196	   Replies SHOULD be sent for an EID-prefix no more often than once per
1197	   second to the same requesting router.  For scalability, it is
1198	   expected that aggregation of EID addresses into EID-prefixes will
1199	   allow one Map-Reply to satisfy a mapping for the EID addresses in the
1200	   prefix range thereby reducing the number of Map-Request messages.

1202	   The addresses of an encapsulated data packet or Map-Request message
1203	   are swapped and used for sending the Map-Reply.  The UDP source and
1204	   destination ports are swapped as well.  That is, the source port in
1205	   the UDP header for the Map-Reply is set to the well-known UDP port
1206	   number 4342.

1208	6.2.  Routing Locator Selection

1210	   Both client-side and server-side may need control over the selection
1211	   of RLOCs for conversations between them.  This control is achieved by
1212	   manipulating the Priority and Weight fields in EID-to-RLOC Map-Reply
1213	   messages.  Alternatively, RLOC information may be gleaned from
1214	   received tunneled packets or EID-to-RLOC Map-Request messages.

1216	   The following enumerates different scenarios for choosing RLOCs and
1217	   the controls that are available:

1219	   o  Server-side returns one RLOC.  Client-side can only use one RLOC.
1220	      Server-side has complete control of the selection.

1222	   o  Server-side returns a list of RLOCs where a subset of the list has
1223	      the same best priority.  Client can only use the subset list
1224	      according to the weighting assigned by the server-side.  In this
1225	      case, the server-side controls both the subset list and load-
1226	      splitting across its members.  The client-side can use RLOCs
1227	      outside of the subset list if it determines that the subset list
1228	      is unreachable (unless RLOCs are set to a Priority of 255).  Some
1229	      sharing of control exists: the server-side determines the
1230	      destination RLOC list and load distribution while the client-side
1231	      has the option of using alternatives to this list if RLOCs in the
1232	      list are unreachable.

1234	   o  Server-side sets weight of 0 for the RLOC subset list.  In this
1235	      case, the client-side can choose how the traffic load is spread
1236	      across the subset list.  Control is shared by the server-side
1237	      determining the list and the client determining load distribution.
1238	      Again, the client can use alternative RLOCs if the RLOCs in the
1239	      server-provided list are unreachable.

1241	   o  Either side (more likely on the server-side ETR) decides not to
1242	      send a Map-Request.  For example, if the server-side ETR does not
1243	      send Map-Requests, it gleans RLOCs from the client-side ITR,
1244	      giving the client-side ITR responsibility for bidirectional RLOC
1245	      reachability and preferability.  Server-side ETR gleaning of the
1246	      client-side ITR RLOC is done by caching the inner header source
1247	      EID and the outer header source RLOC of received packets.  The
1248	      client-side ITR controls how traffic is returned and can alternate
1249	      using an outer header source RLOC, which then can be added to the
1250	      list the server-side ETR uses to return traffic.  Since no
1251	      Priority or Weights are provided using this method, the server-
1252	      side ETR must assume each client-side ITR RLOC uses the same best
1253	      Priority with a Weight of zero.  In addition, since EID-prefix
1254	      encoding cannot be conveyed in data packets, the EID-to-RLOC cache
1255	      on tunnel routers can grow to be very large.

1257	   RLOCs that appear in EID-to-RLOC Map-Reply messages are considered
1258	   reachable.  The Map-Reply and the database mapping service do not
1259	   provide any reachability status for Locators.  This is done outside
1260	   of the mapping service.  See next section for details.
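
	   The client-side selection rules in the scenarios above might be
	   sketched as follows.  The tuple layout, the function names, and the
	   reachable callback are hypothetical.  The per-call random draw is
	   only for illustration; an ITR would more likely pick per flow.

```python
import random


def candidate_rlocs(locators, reachable=lambda rloc: True):
    """Return the usable RLOCs that share the best (lowest) priority.

    locators: list of (rloc, priority, weight) tuples from a Map-Reply.
    RLOCs with priority 255 are never used.  If every RLOC of the best
    priority is unreachable, the next priority level is used instead.
    """
    usable = [l for l in locators if l[1] != 255 and reachable(l[0])]
    if not usable:
        return []
    best = min(l[1] for l in usable)
    return [l for l in usable if l[1] == best]


def pick_rloc(locators, reachable=lambda rloc: True, rng=random):
    """Pick one RLOC, honoring server-assigned weights when present."""
    subset = candidate_rlocs(locators, reachable)
    if not subset:
        return None
    weights = [l[2] for l in subset]
    if all(w == 0 for w in weights):
        # Weight 0: the client decides the split; uniform is assumed here.
        return rng.choice(subset)[0]
    return rng.choices(subset, weights=weights, k=1)[0][0]
```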

1262	6.3.  Routing Locator Reachability

1264	   There are 6 methods for determining when a Locator is either
1265	   reachable or has become unreachable:

1267	   1.  Locator reachability is determined by an ETR by examining the
1268	       Loc-Reach-Bits from the LISP header of an encapsulated data
1269	       packet, which is provided by an ITR when it encapsulates data.

1271	   2.  Locator unreachability is determined by an ITR by receiving ICMP
1272	       Network or Host Unreachable messages.

1274	   3.  Locator unreachability can also be determined by a BGP-enabled
1275	       ITR when there is no prefix matching a Locator address from the
1276	       BGP RIB.

1278	   4.  Locator unreachability is determined when a host sends an ICMP
1279	       Port Unreachable message.  This occurs when an ITR does not use
1280	       any method of interworking, one of which is described in
1281	       [INTERWORK], and the encapsulated data packet is received by a
1282	       host at the destination non-LISP site.

1284	   5.  Locator reachability is determined by receiving a Map-Reply
1285	       message from an ETR's Locator address in response to a previously
1286	       sent Map-Request.

1288	   6.  Locator reachability can also be determined by receiving packets
1289	       encapsulated by the ITR assigned to the locator address.

1291	   When determining Locator reachability by examining the Loc-Reach-Bits
1292	   from the LISP encapsulated data packet, an ETR will receive up-to-date
1293	   status from the ITR closest to the Locators at the source site.  The
1294	   ITRs at the source site can determine reachability when running their
1295	   IGP at the site.  When the ITRs are deployed on CE routers, typically
1296	   a default route is injected into the site's IGP from each of the
1297	   ITRs.  If an ITR goes down, the CE-PE link goes down, or the PE
1298	   router goes down, the CE router withdraws the default route.  This
1299	   allows the other ITRs at the site to determine one of the Locators
1300	   has gone unreachable.

1302	   The Locators listed in a Map-Reply are numbered with ordinals 0 to
1303	   n-1.  The Loc-Reach-Bits in a LISP Data Message are numbered from 0
1304	   to n-1 starting with the least significant bit numbered as 0.  So,
1305	   for example, if the ITR with locator listed as the 3rd Locator
1306	   position in the Map-Reply goes down, all other ITRs at the site will
1307	   have the 3rd bit from the right cleared (the bit that corresponds to
1308	   ordinal 2).

1310	   When an ETR decapsulates a packet, it will look for a change in the
1311	   Loc-Reach-Bits value.  When a bit goes from 1 to 0, the ETR will
1312	   refrain from encapsulating packets to the Locator that has just gone
1313	   unreachable.  It can start using the Locator again when the bit that
1314	   corresponds to the Locator goes from 0 to 1.  Loc-Reach-Bits are
1315	   associated with a locator-set per EID-prefix.  Therefore, when a
1316	   locator becomes unreachable, the loc-reach-bit that corresponds to
1317	   that locator's position in the list returned by the last Map-Reply
1318	   will be set to zero for that particular EID-prefix.
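
	   The bit numbering and the transition check described above can be
	   illustrated as follows.  This is a sketch; the function names are
	   hypothetical, and locator_count is the number of Locators in the
	   last Map-Reply for the EID-prefix.

```python
def locator_is_reachable(reach_bits, ordinal):
    """Bit 0 is the least significant bit and maps to Locator ordinal 0."""
    return ((reach_bits >> ordinal) & 1) == 1


def reach_bit_changes(old_bits, new_bits, locator_count):
    """Return (went_down, came_up) lists of Locator ordinals."""
    went_down, came_up = [], []
    for ordinal in range(locator_count):
        was = locator_is_reachable(old_bits, ordinal)
        now = locator_is_reachable(new_bits, ordinal)
        if was and not now:
            went_down.append(ordinal)   # stop encapsulating to this Locator
        elif now and not was:
            came_up.append(ordinal)     # the Locator may be used again
    return went_down, came_up
```

	   For the example in the text, the 3rd Locator is ordinal 2, so its
	   state is the third bit from the right.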

1320	   When ITRs at the site are not deployed in CE routers, the IGP can
1321	   still be used to determine the reachability of Locators provided they
1322	   are injected as stub links into the IGP.  This is typically done when
1323	   a /32 address is configured on a loopback interface.

1325	   When ITRs receive ICMP Network or Host Unreachable messages as a
1326	   method to determine unreachability, they will refrain from using
1327	   Locators which are described in Locator lists of Map-Replies.
1328	   However, using this approach is unreliable because many network
1329	   operators turn off generation of ICMP Unreachable messages.

1331	   If an ITR does receive an ICMP Network or Host Unreachable message,
1332	   it MAY originate its own ICMP Unreachable message destined for the
1333	   host that originated the data packet the ITR encapsulated.

1335	   Also, BGP-enabled ITRs can unilaterally examine the BGP RIB to see if
1336	   a locator address from a locator-set in a mapping entry matches a
1337	   prefix.  If it does not find one and BGP is running in the Default
1338	   Free Zone (DFZ), it can decide to not use the locator even though the
1339	   Loc-Reach-Bits indicate the locator is up.  In this case, the path
1340	   from the ITR to the ETR that is assigned the locator is not
1341	   available.  More details are in [LOC-ID-ARCH].
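
	   The RIB check can be sketched as a simple coverage test.  The
	   representation of the RIB as a list of prefix strings and both
	   function names are assumptions made for illustration.  A real
	   router would query its BGP table directly.

```python
import ipaddress


def locator_has_route(rib_prefixes, locator):
    """Return True if any RIB prefix covers the locator address."""
    addr = ipaddress.ip_address(locator)
    nets = (ipaddress.ip_network(p) for p in rib_prefixes)
    return any(n.version == addr.version and addr in n for n in nets)


def may_use_locator(rib_prefixes, locator, in_dfz):
    """Apply the RIB check only when BGP is running in the DFZ."""
    if not in_dfz:
        return True   # no conclusion is drawn outside the DFZ
    return locator_has_route(rib_prefixes, locator)
```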

1343	   Optionally, an ITR can send a Map-Request to a Locator and if a Map-
1344	   Reply is returned, reachability of the Locator has been determined.
1345	   Obviously, sending such probes increases the number of control
1346	   messages originated by tunnel routers for active flows, so Locators
1347	   are assumed to be reachable when they are advertised.

1349	   This assumption does create a dependency: Locator unreachability is
1350	   detected by the receipt of ICMP Host Unreachable messages.  When a
1351	   Locator has been determined to be unreachable, it is not used for
1352	   active traffic; this is the same as if it were listed in a Map-Reply
1353	   with priority 255.

1355	   The ITR can test the reachability of the unreachable Locator by
1356	   sending periodic Requests.  Both Requests and Replies MUST be rate-
1357	   limited.  Locator reachability testing is never done with data
1358	   packets since that increases the risk of packet loss for end-to-end
1359	   sessions.

1361	   When an ETR is decapsulating packets, it can be sure that the path
1362	   from the encapsulating ITR is available.  The ETR can assume the path
1363	   from the ETR to the ITR is also reachable.  Even if there is
1364	   asymmetric routing in the core, the first-hop and last-hop ASes will
1365	   be the same for both directions of traffic since the locator
1366	   addresses are out of the PA blocks of each.  However, the assumption
1367	   may not always be valid, so this mechanism should be used as a best-
1368	   effort indication that a working path exists between the sites.  In
1369	   the event of unidirectional traffic from an ITR to an ETR, an ITR
1370	   should not conclude that a locator is unreachable since it is not
1371	   receiving packets, but use alternate mechanisms described above to
1372	   determine reachability.

1374	6.4.  Routing Locator Hashing

1376	   When an ETR provides an EID-to-RLOC mapping in a Map-Reply message to
1377	   a requesting ITR, the locator-set for the EID-prefix may contain
1378	   different priority values for each locator address.  When more than
1379	   one best priority locator exists, the ITR can decide how to load
1380	   share traffic against the corresponding locators.

1382	   The following hash algorithm may be used by an ITR to select a
1383	   locator for a packet destined to an EID for the EID-to-RLOC mapping:

1385	   1.  Either a source and destination address hash can be used or the
1386	       traditional 5-tuple hash which includes the source and
1387	       destination addresses, source and destination TCP, UDP, or SCTP
1388	       port numbers and the IP protocol number field or IPv6 next-
1389	       protocol fields of a packet a host originates from within a LISP
1390	       site.  When a packet is not a TCP, UDP, or SCTP packet, the
1391	       source and destination addresses only from the header are used to
1392	       compute the hash.

1394	   2.  Take the hash value and divide it by the number of locators
1395	       stored in the locator-set for the EID-to-RLOC mapping.

1397	   3.  The remainder will yield a value of 0 to "number of locators
1398	       minus 1".  Use the remainder to select the locator in the
1399	       locator-set.
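
	   The three steps above can be sketched as follows.  The choice of
	   hash (truncated SHA-256 over a text key) is an assumption, since
	   the text does not name a hash function, and the function names are
	   hypothetical.  Protocol numbers 6, 17, and 132 are TCP, UDP, and
	   SCTP.

```python
import hashlib


def flow_hash(src, dst, proto=None, sport=None, dport=None):
    """Hash the 5-tuple for TCP/UDP/SCTP, else only the two addresses."""
    if proto in (6, 17, 132) and sport is not None and dport is not None:
        key = f"{src}|{dst}|{proto}|{sport}|{dport}"
    else:
        key = f"{src}|{dst}"
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big")


def select_locator(locator_set, src, dst, proto=None, sport=None, dport=None):
    """Use the remainder of hash / locator count as the locator index."""
    index = flow_hash(src, dst, proto, sport, dport) % len(locator_set)
    return locator_set[index]
```

	   The locator_set passed in would normally hold only the locators
	   that share the best priority, so that packets of one flow always
	   map to the same locator.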

1401	   Note that when a packet is LISP encapsulated, the source port number
1402	   in the outer UDP header needs to be set.  Selecting a random value
1403	   allows core routers which are attached to Link Aggregation Groups
1404	   (LAGs) to load-split the encapsulated packets across member links of
1405	   such LAGs.  Otherwise, core routers would see a single flow, since
1406	   packets have a source address of the ITR, for packets which are
1407	   originated by different EIDs at the source site.  A suggested setting
1408	   for the source port number computed by an ITR is a 5-tuple hash
1409	   function on the inner header, as described above.
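
	   One way to derive the outer source port from the inner 5-tuple is
	   shown here.  This is a sketch: the use of CRC-32 and the port range
	   49152 to 65535 are assumptions, not requirements of this document,
	   and the function name is hypothetical.

```python
import zlib


def outer_source_port(src, dst, proto, sport, dport, low=49152, high=65535):
    """Map the inner 5-tuple to a stable port in the range [low, high]."""
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
    return low + zlib.crc32(key) % (high - low + 1)
```

	   Because the port depends only on the inner header, all packets of
	   one flow take the same LAG member link while different flows are
	   spread across links.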

1411	6.5.  Changing the Contents of EID-to-RLOC Mappings

1413	   Since the LISP architecture uses a caching scheme to retrieve and
1414	   store EID-to-RLOC mappings, the only way an ITR can get a more up-to-
1415	   date mapping is to re-request the mapping.  However, the ITRs do not
1416	   know when the mappings change and the ETRs do not keep track of who
1417	   requested its mappings.  For scalability reasons, we want to maintain
1418	   this approach but need to provide a way for ETRs to change their
1419	   mappings and inform the sites that are currently communicating with
1420	   the ETR site using such mappings.

1422	   When a locator record is added to the end of a locator-set, it is
1423	   easy to update mappings.  We assume new mappings will maintain the
1424	   same locator ordering as the old mapping but just have new locators
1425	   appended to the end of the list.  So some ITRs can have a new mapping
1426	   while other ITRs have only an old mapping that is used until they
1427	   time out.  When an ITR has only an old mapping but detects bits set
1428	   in the loc-reach-bits that correspond to locators beyond the list it
1429	   has cached, it simply ignores them.

1431	   When a locator record is removed from a locator-set, ITRs that have
1432	   the mapping cached will not use the removed locator because the xTRs
1433	   will set the loc-reach-bit to 0.  So even if the locator is in the
1434	   list, it will not be used.  For new mapping requests, the xTRs can
1435	   set the locator address to 0 as well as setting the corresponding
1436	   loc-reach-bit to 0.  This forces ITRs with old or new mappings to
1437	   avoid using the removed locator.
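
	   The handling of appended and removed locators might look like the
	   sketch below.  The function name is hypothetical, and representing
	   a locator address of 0 as "0.0.0.0" or "::" is an assumption about
	   how an implementation would store it.

```python
REMOVED = {"0.0.0.0", "::"}   # assumed encoding of a zeroed locator address


def usable_locators(cached_locators, reach_bits):
    """Return the cached locators whose loc-reach-bit is set.

    cached_locators lists addresses in Map-Reply order (ordinal 0 first).
    Bits at positions beyond the cached list belong to locators appended
    after this mapping was cached and are ignored.  A zeroed address marks
    a removed locator and is skipped even if its bit is set.
    """
    return [
        addr
        for ordinal, addr in enumerate(cached_locators)
        if (reach_bits >> ordinal) & 1 and addr not in REMOVED
    ]
```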

1439	   If many changes occur to a mapping over a long period of time, one
1440	   will find empty record slots in the middle of the locator-set and new
1441	   records appended to the locator-set.  At some point, it would be
1442	   useful to compact the locator-set so the loc-reach-bit settings can
1443	   be efficiently packed.

1445	   We propose here two approaches for locator-set compaction, one
1446	   operational and the other a protocol mechanism.  The operational
1447	   approach uses a clock sweep method.  The protocol approach uses the
1448	   concept of Solicit-Map-Requests.

1450	6.5.1.  Clock Sweep

1452	   The clock sweep approach uses planning in advance and the use of
1453	   count-down TTLs to time out mappings that have already been cached.
1454	   The default setting for an EID-to-RLOC mapping TTL is 24 hours.  So
1455	   there is a 24 hour window to time out old mappings.  The following
1456	   clock sweep procedure is used:

1458	   1.  24 hours before a mapping change is to take effect, a network
1459	       administrator configures the ETRs at a site to start the clock
1460	       sweep window.

1462	   2.  During the clock sweep window, ETRs continue to send Map-Reply
1463	       messages with the current (unchanged) mapping records.  The TTL
1464	       for these mappings is set to 1 hour.

1466	   3.  24 hours later, all previous cache entries will have timed out,
1467	       and any active cache entries will time out within 1 hour.  During
1468	       this 1 hour window the ETRs continue to send Map-Reply messages
1469	       with the current (unchanged) mapping records with the TTL set to
1470	       1 minute.

1472	   4.  At the end of the 1 hour window, the ETRs will send Map-Reply
1473	       messages with the new (changed) mapping records.  So any active
1474	       caches can get the new mapping contents right away if not cached,
1475	       or in 1 minute if they had the mapping cached.
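
	   The schedule in the procedure above can be written as a function of
	   elapsed time.  This is a sketch: the function name and return
	   format are hypothetical, and the TTL used for the new mapping after
	   the sweep is assumed to be the 24 hour default.

```python
DEFAULT_TTL_MIN = 24 * 60   # default mapping TTL of 24 hours, in minutes


def clock_sweep_reply(minutes_since_start):
    """Return (mapping, ttl_minutes) for a Map-Reply sent at this time.

    minutes_since_start counts from the moment the administrator starts
    the clock sweep window and is negative before the window starts.
    """
    if minutes_since_start < 0:
        return ("current", DEFAULT_TTL_MIN)   # no sweep in progress
    if minutes_since_start < 24 * 60:
        return ("current", 60)                # step 2: TTL of 1 hour
    if minutes_since_start < 25 * 60:
        return ("current", 1)                 # step 3: TTL of 1 minute
    return ("new", DEFAULT_TTL_MIN)           # step 4: TTL is an assumption
```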

1477	6.5.2.  Solicit-Map-Request (SMR)

1479	   Soliciting a Map-Request is a selective way for xTRs, at the site
1480	   where mappings change, to control the rate they receive requests for
1481	   Map-Reply messages.  SMRs are also used to tell remote ITRs to update
1482	   the mappings they have cached.

1484	   Since the xTRs don't keep track of remote ITRs that have cached their
1485	   mappings, they cannot tell exactly who needs the new mapping
1486	   entries.  So an xTR will solicit Map-Requests from sites it is
1487	   currently sending encapsulated data to, and only from those sites.
1488	   The xTRs can locally decide the algorithm for how often and to how
1489	   many sites they send SMR messages.

1491	   An SMR message is simply a bit set in an encapsulated data packet
1492	   (and a Map-Request message).  When an ETR at a remote site
1493	   decapsulates a data packet that has the SMR bit set, it can tell that
1494	   a new Map-Request message is being solicited.  Both the xTR that
1495	   sends the SMR message and the site that acts on the SMR message MUST
1496	   be rate-limited.

1498	   The following procedure shows how an SMR exchange occurs when a site
1499	   is doing locator-set compaction for an EID-to-RLOC mapping:

1501	   1.  When the database mappings in an ETR change, the ITRs at the site
1502	       begin to set the SMR bit in packets they encapsulate to the sites
1503	       they communicate with.

1505	   2.  A remote xTR which decapsulates a packet with the SMR bit set
1506	       will schedule sending a Map-Request message to the source locator
1507	       address of the encapsulated packet.  The nonce in the Map-Request
1508	       is copied from the nonce in the encapsulated data packet that has
1509	       the SMR bit set.

1511	   3.  The remote xTR retransmits the Map-Request slowly until it gets a
1512	       Map-Reply while continuing to use the cached mapping.

1514	   4.  The ETRs at the site with the changed mapping will reply to the
1515	       Map-Request with a Map-Reply message provided the Map-Request
1516	       nonce matches the nonce from the SMR.  The Map-Reply messages
1517	       SHOULD be rate limited.  This is important to avoid Map-Reply
1518	       implosion.

1520	   5.  The ETRs at the site with the changed mapping record, in the
1521	       mapping cache entry for the remote site, that the site that sent
1522	       the Map-Request has received the new mapping data, so the
1523	       loc-reach-bits reflect the new mapping for packets going to the
1524	       remote site.  The ETRs then stop sending packets with the
1525	       SMR-bit set.
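
	   The exchange above can be modeled in a few lines.  This is a
	   simplified sketch, not a definitive implementation: the class and
	   method names are hypothetical, one nonce per remote site is
	   assumed, and rate limiting and retransmission (step 3) are left
	   out.

```python
import secrets


class ChangedSite:
    """xTRs at the site whose mapping changed (hypothetical model)."""

    def __init__(self, new_mapping, remote_rlocs):
        self.new_mapping = new_mapping
        # Step 1: each remote site in use gets the SMR bit and a nonce.
        self.pending = {rloc: secrets.randbits(32) for rloc in remote_rlocs}

    def encapsulate(self, remote_rloc):
        """Return (smr_bit, nonce) for a data packet sent to remote_rloc."""
        nonce = self.pending.get(remote_rloc)
        return (nonce is not None, nonce)

    def on_map_request(self, remote_rloc, nonce):
        """Steps 4 and 5: reply on a nonce match, then stop soliciting."""
        expected = self.pending.get(remote_rloc)
        if expected is None or expected != nonce:
            return None
        del self.pending[remote_rloc]
        return self.new_mapping


class RemoteXtr:
    """Remote xTR that keeps its cached mapping until a reply arrives."""

    def __init__(self, cached_mapping):
        self.cached_mapping = cached_mapping
        self.scheduled_nonce = None

    def on_data_packet(self, smr_bit, nonce):
        """Step 2: schedule a Map-Request that copies the SMR nonce."""
        if smr_bit:
            self.scheduled_nonce = nonce

    def on_map_reply(self, mapping):
        self.cached_mapping = mapping
        self.scheduled_nonce = None
```

	   A Map-Request whose nonce does not match the SMR nonce is ignored,
	   and the remote site continues to receive the SMR bit until a
	   matching request arrives.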

1527	7.  Router Performance Considerations

1529	   LISP is designed to be friendly to hardware-based forwarding.  By
1530	   doing tunnel header prepending [RFC1955] and stripping instead of re-
1531	   writing addresses, existing hardware can support the forwarding model
1532	   with little or no modification.  Where modifications are required,
1533	   they should be limited to re-programming existing hardware rather
1534	   than requiring expensive design changes to hard-coded algorithms in
1535	   silicon.

1537	   A few implementation techniques can be used to incrementally
1538	   implement LISP:

1540	   o  When a tunnel encapsulated packet is received by an ETR, the outer
1541	      destination address may not be the address of the router.  This
1542	      makes it challenging for the control plane to get packets from the
1543	      hardware.  This may be mitigated by creating special FIB entries
1544	      for the EID-prefixes of EIDs served by the ETR (those for which
1545	      the router provides an RLOC translation).  These FIB entries are
1546	      marked with a flag indicating that control plane processing should
1547	      be performed.  Forwarding logic that tests for a particular IP
1548	      protocol number value is not necessary.  No changes to existing,
1549	      deployed hardware should be needed to support this.

1551	   o  On an ITR, prepending a new IP header is as simple as adding more
1552	      bytes to a MAC rewrite string and prepending the string as part of
1553	      the outgoing encapsulation procedure.  Many routers that support
1554	      GRE tunneling [RFC2784] or 6to4 tunneling [RFC3056] can already
1555	      support this action.

1557	   o  When a received packet's outer destination address contains an EID
1558	      which is not intended to be forwarded on the routable topology
1559	      (i.e.  LISP 1.5), the source address of a data packet or the
1560	      router interface with which the source is associated (the
1561	      interface from which it was received) can be associated with a VRF
1562	      (Virtual Routing/Forwarding), in which a different (i.e. non-
1563	      congruent) topology can be used to find EID-to-RLOC mappings.

1565	8.  Deployment Scenarios

1567	   This section will explore how and where ITRs and ETRs can be deployed
1568	   and will discuss the pros and cons of each deployment scenario.
1569	   There are two basic deployment trade-offs to consider: centralized
1570	   versus distributed caches and flat, recursive, or re-encapsulating
1571	   tunneling.

1573	   When deciding on centralized versus distributed caching, the
1574	   following issues should be considered:

1576	   o  Are the tunnel routers spread out so that the caches are spread
1577	      across all the memories of each router?

1579	   o  Should management "touch points" be minimized by choosing few
1580	      tunnel routers, just enough for redundancy?

1582	   o  In general, using more ITRs doesn't increase management load,
1583	      since caches are built and stored dynamically.  On the other hand,
1584	      more ETRs do require more management since EID-prefix-to-RLOC
1585	      mappings need to be explicitly configured.

1587	   When deciding on flat, recursive, or re-encapsulation tunneling, the
1588	   following issues should be considered:

1590	   o  Flat tunneling implements a single tunnel between source site and
1591	      destination site.  This generally offers better paths between
1592	      sources and destinations with a single tunnel path.

1594	   o  Recursive tunneling is when tunneled traffic is again further
1595	      encapsulated in another tunnel, either to implement VPNs or to
1596	      perform Traffic Engineering.  When doing VPN-based tunneling, the
1597	      site has some control since the site is prepending a new tunnel
1598	      header.  In the case of TE-based tunneling, the site may have
1599	      control if it is prepending a new tunnel header, but if the site's
1600	      ISP is doing the TE, then the site has no control.  Recursive
1601	      tunneling generally will result in suboptimal paths but with the
1602	      benefit of steering traffic to parts of the network that have
1603	      available resources.

1605	   o  The technique of re-encapsulation ensures that packets only
1606	      require one tunnel header.  So if a packet needs to be rerouted,
1607	      it is first decapsulated by the ETR and then re-encapsulated with
1608	      a new tunnel header using a new RLOC.
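
   As a rough illustration of the three tunneling styles listed above,
   the following sketch models a packet as a list of headers, outermost
   first.  The function names and the string labels used for EIDs and
   RLOCs are hypothetical and are not defined by this specification.

```python
def encapsulate(packet, src_rloc, dst_rloc):
    """Prepend one outer tunnel header (flat tunneling uses exactly one)."""
    return [("outer", src_rloc, dst_rloc)] + packet


def decapsulate(packet):
    """Strip the outermost tunnel header."""
    kind, _, _ = packet[0]
    if kind != "outer":
        raise ValueError("packet is not encapsulated")
    return packet[1:]


def recursive_tunnel(packet, src_rloc, dst_rloc):
    """Recursive tunneling: add another header on top of the existing one."""
    return encapsulate(packet, src_rloc, dst_rloc)


def reencapsulate(packet, src_rloc, new_dst_rloc):
    """Re-encapsulation: decapsulate first, then add a single new header."""
    return encapsulate(decapsulate(packet), src_rloc, new_dst_rloc)


inner = [("inner", "eid-src", "eid-dst")]
flat = encapsulate(inner, "rloc-itr", "rloc-etr")            # one header
recursive = recursive_tunnel(flat, "rloc-te1", "rloc-te2")   # two headers
rerouted = reencapsulate(flat, "rloc-etr", "rloc-etr2")      # still one
```

   Counting the "outer" entries in each list shows the header overhead:
   recursive tunneling stacks headers, while re-encapsulation keeps a
   single tunnel header at the cost of an extra decapsulation step.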

1610	   The next sub-sections will describe where tunnel routers can reside
1611	   in the network.

1613	8.1.  First-hop/Last-hop Tunnel Routers

1615	   By locating tunnel routers close to hosts, the EID-prefix set is at
1616	   the granularity of an IP subnet.  So at the expense of more EID-
1617	   prefix-to-RLOC sets for the site, the caches in each tunnel router
1618	   can remain relatively small.  But caches always depend on the number
1619	   of non-aggregated EID destination flows active through these tunnel
1620	   routers.

1622	   With more tunnel routers doing encapsulation, control traffic
1623	   grows as well: since the EID-granularity is greater, more
1624	   Map-Requests and Map-Replies are traveling between more routers.

1626	   The advantage of placing the caches and databases at these stub
1627	   routers is that the products deployed in this part of the network
1628	   have better price-memory ratios than their core router counterparts.
1629	   Memory is typically less expensive in these devices and fewer routes
1630	   are stored (only IGP routes).  These devices tend to have excess
1631	   capacity, both for forwarding and routing state.

1633	   LISP functionality can also be deployed in edge switches.  These
1634	   devices generally have layer-2 ports facing hosts and layer-3 ports
1635	   facing the Internet.  Spare capacity is often available in these
1636	   devices as well.

1638	8.2.  Border/Edge Tunnel Routers

1640	   Using customer-edge (CE) routers for tunnel endpoints allows the EID
1641	   space associated with a site to be reachable via a small set of RLOCs
1642	   assigned to the CE routers for that site.

1644	   This offers the opposite benefit of the first-hop/last-hop tunnel
1645	   router scenario: the number of mapping entries and network management
1646	   touch points are reduced, allowing better scaling.

1648	   One disadvantage is that less of the network's resources are used to
1649	   reach host endpoints thereby centralizing the point-of-failure domain
1650	   and creating network choke points at the CE router.

1652	   Note that more than one CE router at a site can be configured with
1653	   the same IP address.  In this case an RLOC is an anycast address.
1654	   This allows resilience between the CE routers.  That is, if a CE
1655	   router fails, traffic is automatically routed to the other routers
1656	   using the same anycast address.  However, this comes with the
1657	   disadvantage that the site cannot control the entrance point when
1658	   the anycast route is advertised out from all border routers.

1660	8.3.  ISP Provider-Edge (PE) Tunnel Routers

1662	   Use of ISP PE routers as tunnel endpoint routers gives an ISP control
1663	   over the location of the egress tunnel endpoints.  That is, the ISP
1664	   can decide if the tunnel endpoints are in the destination site (in
1665	   either CE routers or last-hop routers within a site) or at other PE
1666	   edges.  The advantage of this case is that two or more tunnel headers
1667	   can be avoided.  By having the PE be the first router on the path to
1668	   encapsulate, it can choose a TE path first, and the ETR can
1669	   decapsulate and re-encapsulate for a tunnel to the destination end
1670	   site.

1672	   An obvious disadvantage is that the end site has no control over
1673	   where its packets flow or the RLOCs used.

1675	   As mentioned in earlier sections, a combination of these scenarios is
1676	   possible at the expense of extra packet header overhead.  If both site
1677	   and provider want control, then recursive or re-encapsulating tunnels
1678	   are used.

1680	9.  Traceroute Considerations

1682	   When a source host in a LISP site initiates a traceroute to a
1683	   destination host in another LISP site, it is highly desirable for it
1684	   to see the entire path.  Since packets are encapsulated from ITR to
1685	   ETR, the hop across the tunnel could be viewed as a single hop.
1686	   However, LISP traceroute will provide the entire path so the user can
1687	   see 3 distinct segments of the path from a source LISP host to a
1688	   destination LISP host:

1690	      Segment 1 (in source LISP site based on EIDs):

1692	          source-host ---> first-hop ... next-hop ---> ITR

1694	      Segment 2 (in the core network based on RLOCs):

1696	          ITR ---> next-hop ... next-hop ---> ETR

1698	      Segment 3 (in the destination LISP site based on EIDs):

1700	          ETR ---> next-hop ... last-hop ---> destination-host

1702	   For segment 1 of the path, ICMP Time Exceeded messages are returned
1703	   in the normal manner as they are today.  The ITR performs a TTL
1704	   decrement and test for 0 before encapsulating.  So the ITR hop is
1705	   seen by the traceroute source as having an EID address (the address
1706	   of the site-facing interface).

1708	   For segment 2 of the path, ICMP Time Exceeded messages are returned
1709	   to the ITR because the TTL decrement to 0 is done on the outer
1710	   header, so the destination of the ICMP messages is the ITR RLOC
1711	   address, the source RLOC address of the encapsulated
1712	   traceroute packet.  The ITR looks inside of the ICMP payload to
1713	   inspect the traceroute source so it can return the ICMP message to
1714	   the address of the traceroute client as well as retaining the core
1715	   router IP address in the ICMP message.  This is so the traceroute
1716	   client can display the core router address (the RLOC address) in the
1717	   traceroute output.  The ETR returns its RLOC address and responds to
1718	   the TTL decrement to 0 like the previous core routers did.

1720	   For segment 3, the next-hop router downstream from the ETR will be
1721	   decrementing the TTL for the packet that was encapsulated, sent into
1722	   the core, decapsulated by the ETR, and forwarded because it isn't the
1723	   final destination.  If the TTL is decremented to 0, any router on the
1724	   path to the destination of the traceroute, including the next-hop
1725	   router or destination, will send an ICMP Time Exceeded message to the
1726	   source EID of the traceroute client.  The ICMP message will be
1727	   encapsulated by the local ITR and sent back to the ETR in the
1728	   originated traceroute source site, where the packet will be delivered
1729	   to the host.

1731	9.1.  IPv6 Traceroute

1733	   IPv6 traceroute follows the procedure described above since the
1734	   entire traceroute data packet is included in ICMP Time Exceeded
1735	   message payload.  Therefore, only the ITR needs to pay special
1736	   attention for forwarding ICMP messages back to the traceroute source.

1738	9.2.  IPv4 Traceroute

1740	   For IPv4 traceroute, we cannot follow the above procedure since IPv4
1741	   ICMP Time Exceeded messages only include the invoking IP header and 8
1742	   bytes that follow the IP header.  Therefore, when a core router sends
1743	   an IPv4 Time Exceeded message to an ITR, all the ITR has in the ICMP
1744	   payload is the encapsulation header it prepended followed by a UDP
1745	   header.  The original invoking IP header, and therefore the identity
1746	   of the traceroute source, is lost.

1748	   The solution we propose to solve this problem is to cache traceroute
1749	   IPv4 headers in the ITR and to match them up with corresponding IPv4
1750	   Time Exceeded messages received from core routers and the ETR.  The
1751	   ITR will use a circular buffer for caching the IPv4 and UDP headers
1752	   of traceroute packets.  It will select a 16-bit number as a key to
1753	   find them later when the IPv4 Time Exceeded messages are received.
1754	   When an ITR encapsulates an IPv4 traceroute packet, it will use the
1755	   16-bit number as the UDP source port in the encapsulating header.
1756	   When the ICMP Time Exceeded message is returned to the ITR, the UDP
1757	   header of the encapsulating header is present in the ICMP payload
1758	   thereby allowing the ITR to find the cached headers for the
1759	   traceroute source.  The ITR puts the cached headers in the payload
1760	   and sends the ICMP Time Exceeded message to the traceroute source
1761	   retaining the source address of the original ICMP Time Exceeded
1762	   message (a core router or the ETR of the site of the traceroute
1763	   destination).
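
   The caching scheme described above can be sketched as follows.  This
   is only a minimal illustration under stated assumptions: the class
   name, the buffer size, and the choice of key range (first_key) are
   hypothetical, and the draft does not specify how the 16-bit number is
   selected.  Here the key simply encodes the buffer slot.

```python
import struct


class TracerouteCache:
    """Circular buffer of inner IPv4+UDP headers, keyed by a 16-bit number."""

    def __init__(self, size=256, first_key=0xF000):
        if first_key + size > 0x10000:
            raise ValueError("keys must fit in 16 bits")
        self.size = size
        self.first_key = first_key
        self.slots = [None] * size
        self.next_slot = 0

    def store(self, inner_headers):
        """Cache headers; return the key to use as the outer UDP source port."""
        slot = self.next_slot
        self.next_slot = (slot + 1) % self.size  # oldest entry is overwritten
        self.slots[slot] = inner_headers
        return self.first_key + slot

    def lookup(self, outer_udp_source_port):
        """Return the cached headers for a returned key, or None."""
        slot = outer_udp_source_port - self.first_key
        if not 0 <= slot < self.size:
            return None
        return self.slots[slot]


def outer_source_port(icmp_payload):
    """Extract the UDP source port from an IPv4 Time Exceeded payload.

    The payload holds the invoking (outer) IPv4 header followed by the
    first 8 bytes after it, which is the outer UDP header.
    """
    ihl = (icmp_payload[0] & 0x0F) * 4          # IPv4 header length in bytes
    return struct.unpack_from("!H", icmp_payload, ihl)[0]
```

   Because a slot is reused once the buffer wraps, a late Time Exceeded
   message could match a newer entry.  A real implementation would need
   to size the buffer, or age entries, with that in mind.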

1765	9.3.  Traceroute using Mixed Locators

1767	   When either an IPv4 traceroute or IPv6 traceroute is originated and
1768	   the ITR encapsulates it in the other address family header, you
1769	   cannot get all 3 segments of the traceroute.  Segment 2 of the
1770	   traceroute cannot be conveyed to the traceroute source since it is
1771	   expecting addresses from intermediate hops in the same address format
1772	   for the type of traceroute it originated.  Therefore, in this case,
1773	   segment 2 will make the tunnel look like one hop.  All the ITR has to
1774	   do to make this work is to not copy the inner TTL to the outer,
1775	   encapsulating header's TTL when a traceroute packet is encapsulated
1776	   using an RLOC from a different address family.  This will cause no
1777	   TTL decrement to 0 to occur in core routers between the ITR and ETR.
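
   The TTL handling at the ITR, covering both the same-family case of
   Section 9 and the mixed-locator case above, might look like the sketch
   below.  The function name, the returned tuple, and the value of
   DEFAULT_OUTER_TTL are assumptions made for illustration; this text
   does not specify the outer TTL used when the inner TTL is not copied.

```python
DEFAULT_OUTER_TTL = 255  # assumption: the text does not specify this value


def itr_handle(inner_ttl, inner_family, rloc_family):
    """Return (action, inner_ttl, outer_ttl) for a packet arriving at the ITR."""
    inner_ttl -= 1                      # decrement before encapsulating
    if inner_ttl <= 0:
        # Segment 1: the ITR itself answers with ICMP Time Exceeded.
        return ("time-exceeded", inner_ttl, None)
    if inner_family == rloc_family:
        outer_ttl = inner_ttl           # core hops can expire the packet
    else:
        outer_ttl = DEFAULT_OUTER_TTL   # tunnel looks like a single hop
    return ("encapsulate", inner_ttl, outer_ttl)
```

   With the inner TTL copied, core routers decrement the outer TTL to 0
   and segment 2 becomes visible.  Without the copy, no core router
   between the ITR and ETR sees the TTL reach 0.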

1779	10.  Mobility Considerations

1781	   There are several kinds of mobility of which only some might be of
1782	   concern to LISP.  Essentially they are as follows.

1784	10.1.  Site Mobility

1786	   A site wishes to change its attachment points to the Internet, and
1787	   its LISP Tunnel Routers will have new RLOCs when it changes upstream
1788	   providers.  Changes in EID-RLOC mappings for sites are expected to be
1789	   handled by configuration, outside of the LISP protocol.

1791	10.2.  Slow Endpoint Mobility

1793	   An individual endpoint wishes to move, but is not concerned about
1794	   maintaining session continuity.  Renumbering is involved.  LISP can
1795	   help with the issues surrounding renumbering [RFC4192] [LISA96] by
1796	   decoupling the address space used by a site from the address spaces
1797	   used by its ISPs.  [RFC4984]

1799	10.3.  Fast Endpoint Mobility

1801	   Fast endpoint mobility occurs when an endpoint moves relatively
1802	   rapidly, changing its IP layer network attachment point.  Maintenance
1803	   of session continuity is a goal.  This is where the Mobile IPv4
1804	   [RFC3344bis] and Mobile IPv6 [RFC3775] [RFC4866] mechanisms are used,
1805	   and primarily where interactions with LISP need to be explored.

1807	   The problem is that as an endpoint moves, it may require changes to
1808	   the mapping between its EID and a set of RLOCs for its new network
1809	   location.  When this is added to the overhead of mobile IP binding
1810	   updates, some packets might be delayed or dropped.

1812	   In IPv4 mobility, when an endpoint is away from home, packets to it
1813	   are encapsulated and forwarded via a home agent which resides in the
1814	   home area the endpoint's address belongs to.  The home agent will
1815	   encapsulate and forward packets either directly to the endpoint or to
1816	   a foreign agent which resides where the endpoint has moved to.
1817	   Packets from the endpoint may be sent directly to the correspondent
1818	   node, may be sent via the foreign agent, or may be reverse-tunneled
1819	   back to the home agent for delivery to the mobile node.  As the
1820	   mobile node's EID or available RLOC changes, LISP EID-to-RLOC
1821	   mappings are required for communication between the mobile node and
1822	   the home agent, whether via foreign agent or not.  As a mobile
1823	   endpoint changes networks, up to three LISP mapping changes may be
1824	   required:

1826	   o  The mobile node moves from an old location to a new visited
1827	      network location and notifies its home agent that it has done so.
1828	      The Mobile IPv4 control packets the mobile node sends pass through
1829	      one of the new visited network's ITRs, which needs an EID-RLOC
1830	      mapping for the home agent.

1832	   o  The home agent might not have the EID-RLOC mappings for the mobile
1833	      node's "care-of" address or its foreign agent in the new visited
1834	      network, in which case it will need to acquire them.

1836	   o  When packets are sent directly to the correspondent node, it may
1837	      be that no traffic has been sent from the new visited network to
1838	      the correspondent node's network, and the new visited network's
1839	      ITR will need to obtain an EID-RLOC mapping for the correspondent
1840	      node's site.

1842	   In addition, if the IPv4 endpoint is sending packets from the new
1843	   visited network using its original EID, then LISP will need to
1844	   perform a route-returnability check on the new EID-RLOC mapping for
1845	   that EID.

1847	   In IPv6 mobility, packets can flow directly between the mobile node
1848	   and the correspondent node in either direction.  The mobile node uses
1849	   its "care-of" address (EID).  In this case, the route-returnability
1850	   check would not be needed but one more LISP mapping lookup may be
1851	   required instead:

1853	   o  As above, three mapping changes may be needed for the mobile node
1854	      to communicate with its home agent and to send packets to the
1855	      correspondent node.

1857	   o  In addition, another mapping will be needed in the correspondent
1858	      node's ITR, in order for the correspondent node to send packets to
1859	      the mobile node's "care-of" address (EID) at the new network
1860	      location.

1862	   When both endpoints are mobile the number of potential mapping
1863	   lookups increases accordingly.

1865	   As a mobile node moves there are not only mobility state changes in
1866	   the mobile node, correspondent node, and home agent, but also state
1867	   changes in the ITRs and ETRs for at least some EID-prefixes.

1869	   The goal is to support rapid adaptation, with little delay or packet
1870	   loss for the entire system.  Heuristics can be added to LISP to
1871	   reduce the number of mapping changes required and to reduce the delay
1872	   per mapping change.  Also IP mobility can be modified to require
1873	   fewer mapping changes.  In order to increase overall system
1874	   performance, there may be a need to reduce the optimization of one
1875	   area in order to place fewer demands on another.

1877	   In LISP, one possibility is to "glean" information.  When a packet
1878	   arrives, the ETR could examine the EID-RLOC mapping and use that
1879	   mapping for all outgoing traffic to that EID.  It can do this after
1880	   performing a route-returnability check, to ensure that the new
1881	   network location does have an internal route to that endpoint.
1882	   However, this does not cover the case where an ITR (the node assigned
1883	   the RLOC) at the mobile-node location has been compromised.
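
   A possible shape for gleaning is sketched below.  It assumes that the
   gleaned mapping pairs the source EID of the inner header with the
   source RLOC of the outer header, and that the route-returnability
   check is supplied as a callable.  The class name, the method name, and
   the verify callback are all hypothetical.

```python
class GleanedCache:
    """Learn EID-to-RLOC mappings from arriving encapsulated packets."""

    def __init__(self, verify):
        # verify(eid, rloc) -> bool stands in for the route-returnability
        # check; how that check is performed is outside this sketch.
        self.verify = verify
        self.map_cache = {}

    def on_packet(self, inner_src_eid, outer_src_rloc):
        """Return True if a new or changed mapping was installed."""
        if self.map_cache.get(inner_src_eid) == outer_src_rloc:
            return False                     # nothing new to learn
        if not self.verify(inner_src_eid, outer_src_rloc):
            return False                     # keep the existing mapping
        self.map_cache[inner_src_eid] = outer_src_rloc
        return True
```

   As the paragraph above notes, a check of this kind does not help when
   the node that owns the RLOC has itself been compromised.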

1885	   Mobile IP packet exchange is designed for an environment in which all
1886	   routing information is disseminated before packets can be forwarded.
1887	   In order to allow the Internet to grow to support expected future
1888	   use, we are moving to an environment where some information may have
1889	   to be obtained after packets are in flight.  Modifications to IP
1890	   mobility should be considered in order to optimize the behavior of
1891	   the overall system.  Anything which decreases the number of new EID-
1892	   RLOC mappings needed when a node moves, or maintains the validity of
1893	   an EID-RLOC mapping for a longer time, is useful.

1895	10.4.  Fast Network Mobility

1897	   In addition to endpoints, a network can be mobile, possibly changing
1898	   xTRs.  A "network" can be as small as a single router and as large as
1899	   a whole site.  This is different from site mobility in that it is
1900	   fast and possibly short-lived, but different from endpoint mobility
1901	   in that a whole prefix is changing RLOCs.  However, the mechanisms
1902	   are the same and there is no new overhead in LISP.  A map request for
1903	   any endpoint will return a binding for the entire mobile prefix.
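
   The statement that a request for any endpoint returns a binding for
   the whole mobile prefix can be illustrated with a longest-match lookup
   over a small mapping table.  The table layout, the function name, and
   the documentation-range addresses below are assumptions for this
   sketch only.

```python
import ipaddress


def lookup_mapping(database, eid):
    """Return (EID-prefix, RLOC list) for the longest matching prefix."""
    addr = ipaddress.ip_address(eid)
    best = None
    for prefix, rlocs in database.items():
        net = ipaddress.ip_network(prefix)
        if addr.version != net.version or addr not in net:
            continue
        if best is None or net.prefixlen > best[0].prefixlen:
            best = (net, rlocs)
    return best


database = {"192.0.2.0/24": ["198.51.100.1"]}
# The mobile network moves: one update covers every endpoint in the prefix.
database["192.0.2.0/24"] = ["203.0.113.7"]
```

   After the single update, a lookup for any address inside 192.0.2.0/24
   returns the new RLOC, so no per-endpoint mapping changes are needed.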

1905	   If mobile networks become a more common occurrence, it may be useful
1906	   to revisit the design of the mapping service and allow for dynamic
1907	   updates of the database.

1909	   The issue of interactions between mobility and LISP needs to be
1910	   explored further.  Specific improvements to the entire system will
1911	   depend on the details of mapping mechanisms.  Mapping mechanisms
1912	   should be evaluated on how well they support session continuity for
1913	   mobile nodes.

1915	11.  Multicast Considerations

1917	   A multicast group address, as defined in the original Internet
1918	   architecture is an identifier of a grouping of topologically
1919	   independent receiver host locations.  The address encoding itself
1920	   does not determine the location of the receiver(s).  The multicast
1921	   routing protocol, and the network-based state the protocol creates,
1922	   determines where the receivers are located.

1924	   In the context of LISP, a multicast group address is both an EID and
1925	   a Routing Locator.  Therefore, no specific semantic or action needs
1926	   to be taken for a destination address, as it would appear in an IP
1927	   header.  Therefore, a group address that appears in an inner IP
1928	   header built by a source host will be used as the destination EID.
1929	   The outer IP header (the destination Routing Locator address),
1930	   prepended by a LISP router, will use the same group address as the
1931	   destination Routing Locator.

1933	   Having said that, only the source EID and source Routing Locator
1934	   need to be dealt with.  Therefore, an ITR merely needs to put its
1935	   own IP address in the source Routing Locator field when prepending
1936	   the outer IP header.  This source Routing Locator address, like any
1937	   other Routing Locator address MUST be globally routable.
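
   A minimal sketch of how an ITR might fill in the outer header for a
   multicast packet follows.  The function name and the dictionary used
   to represent the outer header are hypothetical, and the addresses in
   any usage are examples only.

```python
import ipaddress


def build_multicast_outer_header(inner_dst_group, itr_rloc):
    """Build outer-header addresses for a multicast packet at the ITR."""
    group = ipaddress.ip_address(inner_dst_group)
    if not group.is_multicast:
        raise ValueError("destination is not a multicast group address")
    # No EID-to-RLOC lookup is made: the group address is reused as the
    # outer destination, and the ITR's own globally routable address is
    # used as the source Routing Locator.
    return {"src_rloc": itr_rloc, "dst_rloc": str(group)}
```

   For example, calling the function with group 233.252.0.1 and an ITR
   address of 198.51.100.1 yields an outer header whose destination is
   the same group address as in the inner header.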

1939	   Therefore, an EID-to-RLOC mapping does not need to be performed by an
1940	   ITR when a received data packet is a multicast data packet or when
1941	   processing a source-specific Join (either by IGMPv3 or PIM).  But the
1942	   source Routing Locator is decided by the multicast routing protocol
1943	   in a receiver site.  That is, an EID to Routing Locator translation
1944	   is done at control-time.

1946	   Another approach is to have the ITR not encapsulate a multicast
1947	   packet and allow the host-built packet to flow into the core even
1948	   if the source address is allocated out of the EID namespace.  If the
1949	   RPF-Vector TLV [RPFV] is used by PIM in the core, then core routers
1950	   can RPF to the ITR (the Locator address which is injected into core
1951	   routing) rather than the host source address (the EID address which
1952	   is not injected into core routing).

1954	   To avoid any EID-based multicast state in the network core, the first
1955	   approach is chosen for LISP-Multicast.  Details for LISP-Multicast
1956	   and Interworking with non-LISP sites are described in specification
1957	   [MLISP].

1959	12.  Security Considerations

1961	   It is believed that most of the security mechanisms will be part of
1962	   the mapping database service when using control plane procedures for
1963	   obtaining EID-to-RLOC mappings.  For data plane triggered mappings,
1964	   as described in this specification, protection is provided against
1965	   ETR spoofing by using Return-Routability mechanisms evidenced by the
1966	   use of a 4-byte Nonce field in the LISP encapsulation header.  The
1967	   nonce, coupled with the ITR accepting only solicited Map-Replies, goes
1968	   a long way toward providing decent authentication.
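
   One way an ITR could track nonces so that only solicited Map-Replies
   are accepted is sketched below.  The class and method names are
   hypothetical, and message formats and timeouts are left out.

```python
import secrets


class MapRequestTracker:
    """Remember outstanding nonces so only solicited replies are accepted."""

    def __init__(self):
        self.pending = {}  # nonce -> EID being resolved

    def send_request(self, eid):
        """Pick a fresh 4-byte nonce for a Map-Request and remember it."""
        nonce = secrets.randbits(32)
        while nonce in self.pending:        # avoid reusing a live nonce
            nonce = secrets.randbits(32)
        self.pending[nonce] = eid
        return nonce

    def accept_reply(self, nonce):
        """Return the EID for a solicited reply, or None to drop it."""
        return self.pending.pop(nonce, None)
```

   A reply is accepted at most once per request, and a reply carrying a
   nonce the ITR never sent is dropped.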

1970	   LISP does not rely on a PKI or a more heavyweight
1971	   authentication system.  These systems challenge the scalability of
1972	   LISP, which was a primary design goal.

1974	   DoS attack prevention will depend on implementations rate-limiting
1975	   Map-Requests and Map-Replies to the control plane as well as rate-
1976	   limiting the number of data-triggered Map-Replies.
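
   This text does not name a rate-limiting algorithm.  As one common
   choice, and purely as an assumption for illustration, a token bucket
   could be applied to Map-Requests and Map-Replies.  The class name and
   parameters are hypothetical; the caller passes in the current time so
   the behavior is easy to test.

```python
class TokenBucket:
    """Token-bucket limiter: 'rate_per_sec' sustained, 'burst' at most."""

    def __init__(self, rate_per_sec, burst):
        self.rate = float(rate_per_sec)
        self.burst = float(burst)
        self.tokens = float(burst)   # start with a full bucket
        self.last = 0.0

    def allow(self, now):
        """Return True if a message may be processed at time 'now'."""
        elapsed = max(0.0, now - self.last)
        self.last = max(self.last, now)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

   A separate bucket per message type, or per source, would let an
   implementation limit data-triggered Map-Replies independently of
   other control-plane traffic.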

1978	13.  Prototype Plans and Status

1980	   The operator community has requested that the IETF take a practical
1981	   approach to solving the scaling problems associated with global
1982	   routing state growth.  This document offers a simple solution which
1983	   is intended for use in a pilot program to gain experience in working
1984	   on this problem.

1986	   The authors hope that publishing this specification will allow the
1987	   rapid implementation of multiple vendor prototypes and deployment on
1988	   a small scale.  Doing this will help the community:

1990	   o  Decide whether a new EID-to-RLOC mapping database infrastructure
1991	      is needed or if a simple, UDP-based, data-triggered approach is
1992	      flexible and robust enough.

1994	   o  Experiment with provider-independent assignment of EIDs while at
1995	      the same time decreasing the size of DFZ routing tables through
1996	      the use of topologically-aligned, provider-based RLOCs.

1998	   o  Determine whether multiple levels of tunneling can be used by ISPs
1999	      to achieve their Traffic Engineering goals while simultaneously
2000	      removing the more specific routes currently injected into the
2001	      global routing system for this purpose.

2003	   o  Experiment with mobility to determine if both acceptable
2004	      convergence and session continuity properties can be scalably
2005	      implemented to support both individual device roaming and site
2006	      service provider changes.

2008	   Here is a rough set of milestones:

2010	   1.  This draft will be the draft for interoperable implementations to
2011	       code against.  Interoperable implementations will be ready
2012	       beginning of 2009.

2014	   2.  Continue pilot deployment using LISP-ALT as the database mapping
2015	       mechanism.

2017	   3.  Continue prototyping and studying other database lookup schemes,
2018	       be it DNS, DHTs, CONS, ALT, NERD, or other mechanisms.

2020	   4.  Implement the LISP Multicast draft [MLISP].

2022	   5.  Research more on how policy affects what gets returned in a Map-
2023	       Reply from an ETR.

2025	   6.  Continue to experiment with mixed locator-sets to understand how
2026	       LISP can help the IPv4 to IPv6 transition.

2028	   7.  Add more robustness to locator reachability between LISP sites.

2030	   As of this writing the following accomplishments have been achieved:

2032	   1.  A unit- and system-tested software switching implementation has
2033	       been completed on cisco NX-OS for this draft for both IPv4 and
2034	       IPv6 EIDs using a mixed locator-set of IPv4 and IPv6 locators.

2036	   2.  A unit- and system-tested software switching implementation on
2037	       cisco NX-OS has been completed for draft [ALT].

2039	   3.  A unit- and system-tested software switching implementation on
2040	       cisco NX-OS has been completed for draft [INTERWORK].  Support
2041	       for IPv4 translation is provided and PTR support for IPv4 and
2042	       IPv6 is provided.

2044	   4.  The cisco NX-OS implementation supports an experimental mechanism
2045	       for slow mobility.

2047	   5.  Dave Meyer, Vince Fuller, Darrel Lewis, Greg Shepherd, and Andrew
2048	       Partan continue to test all the features described above on a
2049	       dual-stack infrastructure.

2051	   6.  Darrel Lewis and Dave Meyer have deployed both LISP translation
2052	       and LISP PTR support in the pilot network.  Point your browser to
2053	       http://www.lisp4.net to see translation happening in action so
2054	       your non-LISP site can access a web server in a LISP site.

2056	   7.  Soon http://www.lisp6.net will work where your IPv6 LISP site can
2057	       talk to an IPv6 web server in a LISP site by using mixed address-
2058	       family based locators.

2060	   8.  A public domain implementation of LISP is underway.  See
2061	       [OPENLISP] for details.

2063	   9.  A cisco IOS implementation is underway which currently supports
2064	       IPv4 encapsulation and decapsulation features.

2066	   If you are interested in writing a LISP implementation, testing any
2067	   of the LISP implementations, or being part of the LISP pilot program,
2068	   please contact lisp@ietf.org.

2070	14.  References

2072	14.1.  Normative References

2074	   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
2075	              August 1980.

2077	   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
2078	              November 1990.

2080	   [RFC1498]  Saltzer, J., "On the Naming and Binding of Network
2081	              Destinations", RFC 1498, August 1993.

2083	   [RFC1955]  Hinden, R., "New Scheme for Internet Routing and
2084	              Addressing (ENCAPS) for IPNG", RFC 1955, June 1996.

2086	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
2087	              Requirement Levels", BCP 14, RFC 2119, March 1997.

2089	   [RFC2434]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
2090	              IANA Considerations Section in RFCs", BCP 26, RFC 2434,
2091	              October 1998.

2093	   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
2094	              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
2095	              March 2000.

2097	   [RFC3056]  Carpenter, B. and K. Moore, "Connection of IPv6 Domains
2098	              via IPv4 Clouds", RFC 3056, February 2001.

2100	   [RFC3775]  Johnson, D., Perkins, C., and J. Arkko, "Mobility Support
2101	              in IPv6", RFC 3775, June 2004.

2103	   [RFC4423]  Moskowitz, R. and P. Nikander, "Host Identity Protocol
2104	              (HIP) Architecture", RFC 4423, May 2006.

2106	   [RFC4866]  Arkko, J., Vogt, C., and W. Haddad, "Enhanced Route
2107	              Optimization for Mobile IPv6", RFC 4866, May 2007.

2109	   [RFC4984]  Meyer, D., Zhang, L., and K. Fall, "Report from the IAB
2110	              Workshop on Routing and Addressing", RFC 4984,
2111	              September 2007.

2113	14.2.  Informative References

2115	   [AFI]      IANA, "Address Family Indicators (AFIs)", ADDRESS FAMILY
2116	              NUMBERS http://www.iana.org/numbers.html, February 2007.

2118	   [ALT]      Farinacci, D., Fuller, V., and D. Meyer, "LISP Alternative
2119	              Topology (LISP-ALT)", draft-fuller-lisp-alt-03.txt (work
2120	              in progress), October 2008.

2122	   [APT]      Jen, D., Meisel, M., Massey, D., Wang, L., Zhang, B., and
2123	              L. Zhang, "APT: A Practical Transit Mapping Service",
2124	              draft-jen-apt-00.txt (work in progress), July 2007.

2126	   [CHIAPPA]  Chiappa, J., "Endpoints and Endpoint names: A Proposed
2127	              Enhancement to the Internet Architecture", Internet-
2128	              Draft http://www.chiappa.net/~jnc/tech/endpoints.txt,
2129	              1999.

2131	   [CONS]     Farinacci, D., Fuller, V., and D. Meyer, "LISP-CONS: A
2132	              Content distribution Overlay Network  Service for LISP",
2133	              draft-meyer-lisp-cons-03.txt (work in progress),
2134	              November 2007.

2136	   [DHTs]     Ratnasamy, S., Shenker, S., and I. Stoica, "Routing
2137	              Algorithms for DHTs: Some Open Questions", PDF
2138	              file http://www.cs.rice.edu/Conferences/IPTPS02/174.pdf.

2140	   [GSE]      "GSE - An Alternate Addressing Architecture for  IPv6",
2141	              draft-ietf-ipngwg-gseaddr-00.txt (work in progress), 1997.

2143	   [INTERWORK]
2144	              Lewis, D., Meyer, D., and D. Farinacci, "Interworking LISP
2145	              with IPv4 and IPv6", draft-lewis-lisp-interworking-01.txt
2146	              (work in progress), July 2008.

2148	   [LISA96]   Lear, E., Katinsky, J., Coffin, J., and D. Tharp,
2149	              "Renumbering: Threat or Menace?", Usenix , September 1996.

2151	   [LISP1]    Farinacci, D., Oran, D., Fuller, V., and J. Schiller,
2152	              "Locator/ID Separation Protocol (LISP1) [Routable  ID
2153	              Version]",
2154	              Slide-set http://www.dinof.net/~dino/ietf/lisp1.ppt,
2155	              October 2006.

2157	   [LISP2]    Farinacci, D., Oran, D., Fuller, V., and J. Schiller,
2158	              "Locator/ID Separation Protocol (LISP2) [DNS-based
2159	              Version]",
2160	              Slide-set http://www.dinof.net/~dino/ietf/lisp2.ppt,
2161	              November 2006.

2163	   [LISPDHT]  Mathy, L., Iannone, L., and O. Bonaventure, "LISP-DHT:
2164	              Towards a DHT to map identifiers onto locators",
2165	              draft-mathy-lisp-dht-00.txt (work in progress),
2166	              February 2008.

2168	   [LOC-ID-ARCH]
2169	              Meyer, D. and D. Lewis, "Architectural Implications of
2170	              Locator/ID  Separation",
2171	              draft-meyer-loc-id-implications-00.txt (work in progress),
2172	              December 2008.

2174	   [MLISP]    Farinacci, D., Meyer, D., Zwiebel, J., and S. Venaas,
2175	              "LISP for Multicast Environments",
2176	              draft-farinacci-lisp-multicast-01.txt (work in progress),
2177	              November 2008.

2179	   [NERD]     Lear, E., "NERD: A Not-so-novel EID to RLOC Database",
2180	              draft-lear-lisp-nerd-02.txt (work in progress),
2181	              January 2008.

2183	   [OPENLISP]
2184	              Iannone, L. and O. Bonaventure, "OpenLISP Implementation
2185	              Report", draft-iannone-openlisp-implementation-01.txt
2186	              (work in progress), July 2008.

2188	   [RADIR]    Narten, T., "Routing and Addressing Problem Statement",
2189	              draft-narten-radir-problem-statement-00.txt (work in
2190	              progress), July 2007.

2192	   [RFC3344bis]
2193	              Perkins, C., "IP Mobility Support for IPv4, revised",
2194	              draft-ietf-mip4-rfc3344bis-05 (work in progress),
2195	              July 2007.

2197	   [RFC4192]  Baker, F., Lear, E., and R. Droms, "Procedures for
2198	              Renumbering an IPv6 Network without a Flag Day", RFC 4192,
2199	              September 2005.

2201	   [RPFV]     Wijnands, IJ., Boers, A., and E. Rosen, "The RPF Vector
2202	              TLV", draft-ietf-pim-rpf-vector-03.txt (work in progress),
2203	              October 2006.

2205	   [RPMD]     Handley, M., Huici, F., and A. Greenhalgh, "RPMD: Protocol
2206	              for Routing Protocol Meta-data  Dissemination",
2207	              draft-handley-p2ppush-unpublished-2007726.txt (work in
2208	              progress), July 2007.

2210	   [SHIM6]    Nordmark, E. and M. Bagnulo, "Level 3 multihoming shim
2211	              protocol", draft-ietf-shim6-proto-06.txt (work in
2212	              progress), October 2006.

2214	Appendix A.  Acknowledgments

2216	   A special and appreciative thank you goes to Noel Chiappa for
2217	   providing architectural impetus over the past decades on separation
2218	   of location and identity, as well as detailed review of the LISP
2219	   architecture and documents, coupled with enthusiasm for making LISP a
2220	   practical and incremental transition for the Internet.

2222	   The authors would like to gratefully acknowledge many people who have
2223	   contributed discussion and ideas to the making of this proposal.
2224	   They include Darrel Lewis, Andrew Partan, John Zwiebel, Jason
2225	   Schiller, Lixia Zhang, Dorian Kim, Peter Schoenmaker, Vijay Gill,
2226	   Geoff Huston, David Conrad, Mark Handley, Ron Bonica, Ted Seely, Mark
2227	   Townsley, Chris Morrow, Brian Weis, Dave McGrew, Peter Lothberg, Dave
2228	   Thaler, Eliot Lear, Shane Amante, Ved Kafle, Olivier Bonaventure,
2229	   Luigi Iannone, Robin Whittle, Brian Carpenter, Joel Halpern, Roger
2230	   Jorgensen, Ran Atkinson, Stig Venaas, Iljitsch van Beijnum, Roland
2231	   Bless, Dana Blair, Bill Lynch, Marc Woolward, Damien Saucez, and
2232	   Damian Lezama.

2234	   In particular, we would like to thank Dave Meyer for his clever
2235	   suggestion for the name "LISP". ;-)

2237	Authors' Addresses

2239	   Dino Farinacci
2240	   cisco Systems
2241	   Tasman Drive
2242	   San Jose, CA  95134
2243	   USA

2245	   Email: dino@cisco.com

2247	   Vince Fuller
2248	   cisco Systems
2249	   Tasman Drive
2250	   San Jose, CA  95134
2251	   USA

2253	   Email: vaf@cisco.com

2255	   Dave Oran
2256	   cisco Systems
2257	   7 Ladyslipper Lane
2258	   Acton, MA
2259	   USA

2261	   Email: oran@cisco.com

2263	   Dave Meyer
2264	   cisco Systems
2265	   170 Tasman Drive
2266	   San Jose, CA
2267	   USA

2269	   Email: dmm@cisco.com

2271	   Scott Brim
2272	   cisco Systems
2273	   170 Tasman Drive
2274	   San Jose, CA
2275	   USA

2277	   Email: sbrim@cisco.com