LISP Working Group                                         J. N. Chiappa
Internet-Draft                              Yorktown Museum of Asian Art
Intended status: Informational                             July 15, 2013
Expires: January 16, 2014

An Architectural Introduction to the LISP Location-Identity Separation System
draft-ietf-lisp-introduction-01

Abstract

LISP is an upgrade to the architecture of the IPvN internetworking system, one which separates location and identity (currently intermingled in IPvN addresses). This is a change which has been identified by the IRTF as a critically necessary evolutionary architectural step for the Internet. In LISP, nodes have both a 'locator' (a name which says _where_ in the network's connectivity structure the node is) and an 'identifier' (a name which serves only to provide a persistent handle for the node). A node may have more than one locator, or its locator may change over time (e.g. if the node is mobile), but it keeps the same identifier.

One of the chief novelties of LISP, compared to other proposals for the separation of location and identity, is its approach to deploying this upgrade.
(In general, it is comparatively easy to conceive of new network designs, but much harder to devise approaches which will actually get deployed throughout the global network.) LISP aims to achieve the near-ubiquitous deployment necessary for maximum exploitation of an architectural upgrade by i) minimizing the amount of change needed (existing hosts and routers can operate unmodified); and ii) providing significant benefits to early adopters.

This document is an introduction to the entire LISP system, for those who are unfamiliar with it.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. This document may not be modified, and derivative works of it may not be created, except to format it for publication as an RFC or to translate it into languages other than English.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on January 16, 2014.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Prefatory Note
2. Background
3. Deployment Philosophy
   3.1. Economics
   3.2. Maximize Re-use of Existing Mechanism
   3.3. 'Self-Deployment'
4. LISP Overview
   4.1. Basic Approach
   4.2. Basic Functionality
   4.3. Mapping from EIDs to RLOCs
   4.4. Interworking With Non-LISP-Capable Endpoints
   4.5. Security in LISP
5. Initial Applications
   5.1. Provider Independence
   5.2. Multi-Homing
   5.3. Traffic Engineering
   5.4. Routing
   5.5. Mobility
   5.6. IP Version Reciprocal Traversal
   5.7. Local Uses
6. Major Functional Subsystems
   6.1. xTRs
      6.1.1. Mapping Cache Performance
   6.2. Mapping System
      6.2.1. Mapping System Organization
      6.2.2. Interface to the Mapping System
      6.2.3. Indexing Sub-system
7. Examples of Operation
   7.1. An Ordinary Packet's Processing
   7.2. A Mapping Cache Miss
8. Design Approach
9. xTRs
   9.1. When to Encapsulate
   9.2. UDP Encapsulation Details
   9.3. Header Control Channel
      9.3.1. Mapping Versioning
      9.3.2. Echo Nonces
      9.3.3. Instances
   9.4. Probing
   9.5. Mapping Lifetimes and Timeouts
   9.6. Security of Mapping Lookups
   9.7. Mapping Gleaning in ETRs
   9.8. Fragmentation
10. The Mapping System
   10.1. The Mapping System Interface
      10.1.1. Map-Request Messages
      10.1.2. Map-Reply Messages
      10.1.3. Map-Register and Map-Notify Messages
   10.2. The DDT Indexing Sub-system
      10.2.1. Map-Referral Messages
   10.3. Reliability via Replication
   10.4. Security of the DDT Indexing Sub-system
   10.5. Extended Tools
   10.6. Performance of the Mapping System
11. Deployment Mechanisms
   11.1. LISP Deployment Needs
   11.2. Internetworking Mechanism
   11.3. Proxy Devices
      11.3.1. PITRs
      11.3.2. PETRs
   11.4. LISP-NAT
   11.5. Use Through NAT Devices
      11.5.1. First-Phase NAT Support
      11.5.2. Second-Phase NAT Support
   11.6. LISP and DFZ Routing
      11.6.1. Long-term Possibilities
12. Fault Discovery/Handling
   12.1. Handling Missing Mappings
   12.2. Outdated Mappings
      12.2.1. Outdated Mappings - Updated Mapping
      12.2.2. Outdated Mappings - Wrong ETR
      12.2.3. Outdated Mappings - No Longer an ETR
   12.3. Erroneous Mappings
   12.4. Neighbour Liveness
   12.5. Neighbour Reachability
13. Current Improvements
   13.1. Improved NAT Support
   13.2. Mobile Device Support
   13.3. Multicast Support
   13.4. {{Any others?}}
14. Acknowledgments
15. IANA Considerations
16. Security Considerations
17. References
   17.1. Normative References
   17.2. Informative References
Appendix A. Glossary/Definition of Terms
Appendix B. Other Appendices
   B.1. Old LISP 'Models'
   B.2. Possible Other Appendices

1. Prefatory Note

This document is the first of a pair which, together, form what one would think of as the 'architecture document' for LISP (the 'Location-Identity Separation Protocol'). Much of what would normally be in an architecture document (e.g. the architectural design principles used in LISP, and the design considerations behind various components and aspects of the LISP system) is in the second document, the 'Architectural Perspective on LISP' document.

This 'Architectural Introduction' document is primarily intended for those who don't know anything about LISP, and want to start learning about it. It is intended both to be easy to follow, and to give the reader a choice as to how much they wish to know about LISP.
Reading only the first part(s) of the document will give a good high-level view of the system; reading the complete document should provide a fairly detailed understanding of the entire system.

This goal explains why the document has a somewhat unusual structure. It is not a reference document, where all the content on a particular topic is grouped in one place. (That role is filled by the various protocol specifications.) It starts with a very high-level view of the entire system, to provide readers with a mental framework to help understand the more detailed material which follows. A second pass over the whole system then goes into more detail; finally, individual sub-systems are covered in still deeper detail.

The intent is two-fold: first, the multiple passes over the entire system, each one going into more detail, are intended to ease understanding; second, people can simply stop reading when they have a detailed-enough understanding for their purposes. People who just want to get an idea of how LISP works might only read the first part(s), whereas people who are going to go on and read all the protocol specifications (perhaps to implement LISP) would need/want to read the entire document.

Note: This document is a descriptive document, not a protocol specification. Should it differ in any detail from any of the LISP protocol specification documents, they take precedence for the actual operation of the protocol.

2. Background

It has gradually been realized in the networking community that networks (especially large networks) should deal quite separately with the identity and location of a node (basically, 'who' a node is, and 'where' it is). At the moment, in both IPv4 and IPv6, addresses both indicate where the named device is and identify it for purposes of end-end communication.

The distinction was more than a little hazy at first: the early Internet [RFC791], like the ARPANET before it [Heart] [NIC8246], co-mingled the two, although there was recognition in the early Internet work that there were two different things going on. [IEN19]

This likely resulted not just from lack of insight, but also from the fact that extra mechanism is needed to support this separation (and in the early days there were no resources to spare), as well as the lack of need for it in the smaller networks of the time. (It is a truism of system design that small systems can get away with doing two things with one mechanism, in a way that usually will not work when the system gets much larger.)

The ISO protocol architecture took steps in this direction [NSAP], but to the Internet community the necessity of a clear separation was definitively shown by Saltzer. [RFC1498] Later work expanded on Saltzer's, and tied his separation concepts into the fate-sharing concepts of Clark. [Clark], [Chiappa]

The separation of location and identity is a step which has recently been identified by the IRTF as a critically necessary evolutionary architectural step for the Internet. However, it has taken some time for this requirement to be generally accepted by the Internet engineering community at large, although it seems that this may finally be happening. [RFC6115]

The LISP system for separation of location and identity resulted from the discussions of this topic at the Amsterdam IAB Routing and Addressing Workshop, which took place in October 2006. [RFC4984]

A small group of like-minded personnel from various scattered locations within Cisco spontaneously formed immediately after that workshop, to work on an idea that came out of informal discussions at the workshop.
The first Internet-Draft on LISP appeared in January 2007, along with a LISP mailing list at the IETF. [LISP0]

Trial implementations started at that time, with initial trial deployments underway since June 2007; the results of early experience have been fed back into the design in a continuous, ongoing process over several years. LISP at this point represents a moderately mature system, having undergone a long organic series of changes and updates.

LISP transitioned from an IRTF activity to an IETF WG in March 2009, and after numerous revisions, the basic specifications moved to becoming RFCs in 2012 (although work to expand and improve it continues, and undoubtedly will for a long time to come).

3. Deployment Philosophy

It may seem odd to cover 'deployment philosophy' at this point in such a document. However, the deployment philosophy was a major driver for much of the design (to some degree the architecture, and to a very large measure, the engineering). So, as such an important motivator, it is very desirable for readers to have this material in hand as they examine the design, so that design choices that may seem questionable at first glance can be better understood.

Experience over the last several decades has shown that having a viable 'deployment model' for a new design is absolutely key to the success of that design. A new design may be fantastic - but if it cannot or will not be successfully deployed (for whatever reasons), it is useless. This absolute primacy of a viable deployment model is what has led to some painful compromises in the design.

The extreme focus on a viable deployment scheme is one of the novelties of LISP.

3.1. Economics

The key factor in successful adoption, as shown by recent experience in the Internet - and little appreciated to begin with, some decades back - is economics: does the new design have benefits which outweigh its costs?

More importantly, this balance needs to hold for early adopters - because if they do not receive benefits from their adoption, the sphere of earliest adopters will not expand, and it will never get to widespread deployment. One might have the world's best 'clean-slate' design, but if it does not have a deployment plan which is economically feasible, it's not good for much.

This is particularly true of architectural enhancements, which are far less likely to be an addition which one can 'bolt onto the side' of existing mechanisms, and often offer their greatest benefits only when widely (or ubiquitously) deployed.

Improving the cost-benefit ratio obviously has two aspects. The first is on the cost side: making the design as inexpensive as possible, which means in part making the deployment as easy as possible. The second is on the benefit side: providing many new capabilities, which is best done not by loading the design up with lots of features or options (which adds complexity), but by making the addition powerful through deeper flexibility. We believe LISP has met both of these goals.

3.2. Maximize Re-use of Existing Mechanism

One key part of reducing the cost of a new design is to absolutely minimize the amount of change _required_ to existing, deployed, devices: the fewer devices need to be changed, and the smaller the change to those that do, the lower the pain (and thus the greater the likelihood) of deployment.

Designs which absolutely require 'forklift upgrades' to large amounts of existing gear are far less likely to succeed - because they have to have extremely large benefits to make their very substantial costs worthwhile.

It is for this reason that LISP, in most cases, initially requires no changes to almost all existing devices in the Internet (both hosts and routers); LISP functionality is needed in only a few places (see Section 11.1 for more).

LISP also initially reuses, wherever possible, existing protocols (IPv4 [RFC791] and IPv6 [RFC2460]). The 'initially' must be stressed - careful attention has also long been paid to the long-term future (see [Future]), and larger changes become feasible as deployment increases.

3.3. 'Self-Deployment'

LISP has deliberately employed a rather different deployment model, which we might call 'self-deployment' (for want of a better term). It does not require a huge push to get it deployed; rather, it is hoped that once people see it and realize they can easily make good use of it _on their own_ (i.e. without requiring adoption by others), it will 'deploy itself' (hence the name of the approach).

One can liken the problem of deploying new systems in this way to rolling a snowball down a hill: unless one starts with a big enough snowball, and finds a hill of the right steepness (i.e. the right path for it to travel), one's snowball is not going to go anywhere on its own. However, if one has picked one's spot correctly, once started, little additional work is needed.

4. LISP Overview

LISP is an incrementally deployable architectural upgrade to the existing Internet infrastructure, one which provides separation of location and identity.
The separation is usually not perfect, for reasons which are driven by the deployment philosophy (above), and explored in a little more detail elsewhere (in [Perspective], Section "Namespaces-EIDs-Residual").

LISP separates the functions of location and identity of nodes (a nebulous term, deliberately chosen for use in this document precisely because its definition is not fixed - you will not go far wrong if you think of a node as being something like a host), which are currently intermingled in IPvN addresses. (This document uses the meaning for 'address' proposed in [Atkinson], i.e. a name with mixed location and identity semantics.)

4.1. Basic Approach

In LISP, nodes have both a 'locator' (a name which says _where_ in the network's connectivity structure the node is), called an 'RLOC' (short for 'routing locator'), and an 'identifier' (a name which serves only to provide a persistent handle for the node), called an 'EID' (short for 'endpoint identifier').

A node may have more than one RLOC, or its RLOC may change over time (e.g. if the node is mobile), but it would normally always keep the same EID.

Technically, one should probably say that ideally, the EID names the node (or rather, its end-end communication stack, if one wants to be as forward-looking as possible), and the RLOC(s) name interface(s). (At the moment, in reality, the situation is somewhat more complex, as will be explained elsewhere (in [Perspective], Section "Namespaces-EIDs-Residual").)

This second distinction, of _what_ is named by the two classes of name, is both necessary to enable some of the capabilities that LISP provides (e.g. the ability to seamlessly support multiple interfaces, to different networks), and a further enhancement to the architecture.
Failing to clearly recognize both interfaces and communication stacks as distinctly separate classes of things is another failing of the existing Internet architecture (again, one inherited from the previous generation of networking).

A novelty in LISP is that it uses existing IPvN addresses (initially, at least) for both of these kinds of names, thereby minimizing the deployment cost, as well as providing the ability to easily interact with unmodified hosts and routers.

4.2. Basic Functionality

The basic operation of LISP, as it currently stands, is that LISP-augmented packet switches near the source and destination of packets intercept traffic, and 'enhance' the packets.

The LISP device near the source looks up additional information about the destination, and then wraps the packet in an outer header, one which contains some of that additional information. The LISP device near the destination removes that header, leaving the original, unmodified, packet to be processed by the destination node.

The LISP device near the original source (the Ingress Tunnel Router, or 'ITR') uses the information originally in the packet about the identity of its ultimate destination, i.e. the destination address, which in LISP is the EID of the ultimate destination. It uses the destination EID to look up the current location (the RLOC) of that EID.

The lookup is performed through a 'mapping system', which is the heart of LISP: it is a distributed directory of mappings from EIDs to RLOCs. The destination RLOC will be (initially at least) the address of the LISP device near the ultimate destination (the Egress Tunnel Router, or 'ETR').

{{Is it worth distinguishing between 'mapping' and 'binding'? Should the document pick one term, and stick with it?}}

The ITR then generates a new outer header for the original packet, with that header containing the ultimate destination's RLOC as the wrapped packet's destination, and the ITR's own address (i.e. the RLOC of the original source) as the wrapped packet's source, and sends it off.

When the packet gets to the ETR, that outer header is stripped off, and the original packet is forwarded to the original ultimate destination for normal processing.

Return traffic is handled similarly, often (depending on the network's configuration) with the original ITR and ETR switching roles. The ETR and ITR functionality is usually co-located in a single device; these are normally denominated as 'xTRs'.

4.3. Mapping from EIDs to RLOCs

The mappings from EIDs to RLOCs are provided by a distributed (and potentially replicated) database, the mapping database, which is the heart of LISP.

Mappings are requested on need, not (generally) pre-loaded; in other words, mappings are distributed via a 'pull' mechanism. Once obtained by an ITR, they are cached by the ITR, to limit the amount of control traffic to a practicable level. (The mapping system will be discussed in more detail below, in Section 6.2 and Section 10.)

Extensive studies, including large-scale simulations driven by lengthy recordings of actual traffic at several major sites, have been performed to verify that this 'pull and cache' approach is viable, in practical engineering terms. (This subject will be discussed in more detail in Section 6.1.1, below.)

4.4. Interworking With Non-LISP-Capable Endpoints

The capability for 'easy' interoperation between nodes using LISP, and existing non-LISP-using hosts (often called 'legacy' hosts) or sites (where 'site' is usually taken to mean a collection of hosts, routers and networks under a single administrative control), is clearly crucial.

To allow such interoperation, a number of mechanisms have been designed. This multiplicity is in part because different mechanisms have different advantages and disadvantages (so that no single mechanism is optimal for all cases), but also because with limited field experience, it is not clear which (if any) approach will be preferable.

One approach uses proxy LISP devices, called PITRs (proxy ITRs) and PETRs (proxy ETRs), to provide LISP functionality during interaction with legacy hosts. Another approach uses a device with combined LISP and NAT ([RFC1631]) functionality, named a LISP-NAT.

4.5. Security in LISP

LISP has a subtle security philosophy; see [Perspective], Section "Security", where it is laid out in some detail.

To provide a brief overview, it is definitely understood that LISP needs to be highly _securable_, especially in the long term; over time, the attacks mounted by 'bad guys' are becoming more and more sophisticated. So LISP, like DNS, needs to be _capable_ of providing 'the very best' security there is.

At the same time, there is a conflicting goal: it must be deployable. That means two things: First, with the limited manpower currently available, we cannot expect to create the complete security apparatus that we might see in the long term (which requires not just design, but also implementation, etc). Second, security needs to be flexible, so that we don't overload the users with more security than they need at any point.

To accomplish these divergent goals, the approach taken is to thoroughly analyze what LISP needs for security, and then design, in detail, a scheme for providing that security. Then, steps can be taken to ensure that the appropriate 'hooks' (such as packet fields) are included at an early stage, when doing so is still easy. Later on, the design can be fully specified, implemented, and deployed.

5. Initial Applications

As previously mentioned, it is felt that LISP will provide even the earliest adopters with some useful capabilities, and that these capabilities will drive early LISP deployment.

It is very important to note that even when used only for interoperation with existing unmodified hosts, use of LISP can still provide benefits for communications with the site which has deployed it - and, perhaps even more importantly, can do so _to both sides_. This characteristic acts to further enhance the utility for early adopters of deploying LISP, thereby improving the cost/benefit ratio needed to drive deployment, and increasing the 'self-deployment' aspect of LISP.

Note also that this section only lists likely _early_ applications and benefits - if and once deployment becomes more widespread, other aspects will come into play (as described in [Perspective], in the "Goals of LISP" section).

5.1. Provider Independence

Provider independence (i.e. the ability to easily change one's Internet Service Provider) was probably the first place where the Internet engineering community finally really felt the utility of separating location and identity.

The problem is simple: for the global routing to scale, addresses need to be aggregated (i.e. things which are close in the overall network's connectivity need to have closely related addresses), the so-called "provider aggregated" addresses.
[RFC4116] However, if this principle is followed, it means that when an entity switches providers (i.e. it moves to a different 'place' in the network), it has to renumber, a painful undertaking. [RFC5887]

In theory, it ought to be possible to update the DNS entries, and have everyone switch to the new addresses, but in practice, addresses are embedded in many places, such as firewall configurations at other sites.

Having separate namespaces for location and identity greatly reduces the problems involved with renumbering; an organization which moves retains its EIDs (which are how most other parties refer to its nodes), but is allocated new RLOCs, and the mapping system can quickly provide the updated mapping from the EIDs to the new RLOCs.

5.2. Multi-Homing

Multi-homing is another place where the value of separation of location and identity became apparent. There are several different sub-flavours of the multi-homing problem - e.g. depending on whether one wants open connections to keep working, etc - and other axes as well (e.g. site multi-homing versus host multi-homing).

In particular, for the 'keep open connections up' case, without separation of location and identity, the only currently feasible approach is to use provider-independent addresses - which moves the problem into the global routing system, with attendant costs. This approach is also not really feasible for host multi-homing.

Multi-homing was once somewhat esoteric, but a number of trends are driving an increased desirability, e.g. the wish to have multiple ISP links to a site for robustness; the desire to have mobile handsets connect up to multiple wireless systems; etc.

Again, separation of location and identity, and the existence of a binding layer which can be updated fairly quickly, as provided by LISP, is a very useful tool for all variants of this issue.

5.3. Traffic Engineering

Traffic engineering (TE) [RFC3272], desirable though this capability is in a global network, is currently somewhat problematic to provide in the Internet. The problem, fundamentally, is that this capability was not foreseen when the Internet was designed, so the support for it via 'hacks' is neither clean, nor flexible.

TE is, fundamentally, a routing issue. However, the current Internet routing architecture, which is basically the Baran design of fifty years ago [Baran] (a single large, distributed computation), is ill-suited to provide TE. The Internet seems a long way from adopting a more-advanced routing architecture, although the basic concepts for such have been known for some time. [RFC1992]

Although the identity-location binding layer is thus a poor place, architecturally, to provide TE capabilities, it is still an improvement over the current routing tools available for this purpose (e.g. injection of more-specific routes into the global routing table). In addition, instead of the entire network incurring the costs (through the routing system overhead), when using a binding layer to provide TE, the overhead is limited to those who are actually communicating with that particular destination.

LISP includes a number of features in the mapping system to support TE. (Described in Section 6.2 below.)

A number of academic papers have explored how LISP can be used to do TE, and how effective it can be. See the online LISP Bibliography ([Bibliography]) for information about them.

5.4. Routing

Multi-homing and Traffic Engineering are both, in some sense, uses of LISP for routing, but there are many other routing-related uses for LISP.

One of the major original motivations for the separation of location and identity in general, and thus LISP, was to reduce the growth of the routing tables in the so-called 'Default-Free-Zone' (DFZ) - the core of the Internet, the part where routes to _all_ ultimate destinations must be available. LISP is expected to help with this; for more detail, see Section 11.6, below.

LISP may also have more local applications in which it can help with routing; see, for instance, [CorasBGP].

5.5. Mobility

Mobility is yet another place where separation of location and identity is obviously a key part of a clean, efficient and high-functionality solution. Considerable experimentation has been completed on doing mobility with LISP.

5.6. IP Version Reciprocal Traversal

Note that LISP 'automagically' allows intermixing of various IP versions for packet carriage; IPv4 packets might well be carried in IPv6, or vice versa, depending on the network's configuration. This would allow an 'island' of operation of one type to be 'automatically' tunneled over a stretch of infrastructure which only supports the other type.

While the machinery of LISP may seem too heavyweight to be good for such a mundane use, this is not intended as a 'sole use' case for deployment of LISP. Rather, it is something which, if LISP is being deployed anyway (for its other advantages), is an added benefit that one gets 'for free'.

5.7. Local Uses

LISP has a number of use cases which are within purely local contexts, i.e. not in the larger Internet. These fall into two categories: uses seen on the Internet (above), but here in a private (and usually small-scale) setting; and applications which do not have a direct analog in the larger Internet, and which apply only to local deployments.
Among the former are multi-homing, IP version traversal, and support
of VPNs for segmentation and multi-tenancy (i.e. a spatially
separated private VPN whose components are joined together using the
public Internet as a backbone).

Among the latter class, non-Internet applications which have no
analog on the Internet, are the following example applications:
virtual machine mobility in data centers; other non-IP EID types such
as local network MAC addresses, or application specific data.

6.  Major Functional Subsystems

LISP has only two major functional sub-systems - the collection of
LISP packet switches (the xTRs), and the mapping system, which
manages the mapping database.  The purpose and operation of each is
described at a high level below, and then, later on, in a fair amount
of detail, in separate sections on each (Sections 9 and 10,
respectively).

6.1.  xTRs

xTRs are fairly normal packet switches, enhanced with a little extra
functionality in both the data and control planes, to perform LISP
data and control functionality.

The data plane functions in ITRs include deciding which packets need
to be given LISP processing (since packets to non-LISP hosts may be
sent 'vanilla'), i.e. looking up the mapping; encapsulating the
packet; and sending it to the ETR.  This encapsulation is done using
UDP [RFC768] (for reasons to be explained below, in Section 9.2),
along with an additional IPvN header (to hold the source and
destination RLOCs).  To the extent that traffic engineering features
are in use for a particular EID, the ITRs implement them as well.

In the ETR, the data plane simply unwraps the packets, and forwards
the now-normal packets to the ultimate destination.

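The wrapping and unwrapping just described can be illustrated with a
minimal sketch.  It models packets as plain records rather than wire
formats; the class and function names are hypothetical, and the LISP
header itself is omitted.  The only value taken from the
specification is the UDP destination port for LISP data packets
(4341, per [LISP]).

```python
from dataclasses import dataclass

LISP_DATA_PORT = 4341  # destination UDP port for LISP data packets, per [LISP]


@dataclass(frozen=True)
class UserPacket:
    """An ordinary packet as sent by a host (hypothetical model)."""
    src_eid: str
    dst_eid: str
    payload: bytes


@dataclass(frozen=True)
class LispPacket:
    """Outer IPvN + UDP wrapper around a user packet (LISP header omitted)."""
    outer_src_rloc: str   # RLOC of the encapsulating ITR
    outer_dst_rloc: str   # RLOC of the destination ETR
    udp_src_port: int     # chosen by the ITR (see Section 9.2)
    udp_dst_port: int
    inner: UserPacket     # the original packet, carried unchanged


def itr_encapsulate(pkt, itr_rloc, etr_rloc, udp_src_port):
    """ITR data plane: wrap the user packet for transport to the ETR."""
    return LispPacket(itr_rloc, etr_rloc, udp_src_port, LISP_DATA_PORT, pkt)


def etr_decapsulate(lisp_pkt):
    """ETR data plane: recover the now-normal user packet."""
    return lisp_pkt.inner
```

The inner packet is never modified, which is why the ultimate
destination needs to know nothing about LISP.
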
Control plane functions in ITRs include: asking for {EID->RLOC}
mappings via Map-Request control messages; handling the returning
Map-Replies which contain the requested information; managing the
local cache of mappings; checking for the reachability and liveness
of their neighbour ETRs; and checking for outdated mappings and
requesting updates.

In the ETR, control plane functions include participating in the
neighbour reachability and liveness function (see Section 12.4);
interacting with the mapping sub-system (next section); and answering
requests for mappings (ditto).

6.1.1.  Mapping Cache Performance

As mentioned, studies have been performed to verify that caching
mappings in ITRs is viable, in practical engineering terms.  These
studies not only verified that such caching is feasible, but also
provided some insight for designing ITR mapping caches.

Obviously, these studies are all snapshots of a particular point in
time, and as the Internet continues its life-cycle they will
increasingly become out-dated.  However, they are useful because they
provide an insight into how well LISP can be expected to perform, and
scale, over time.

The first, [Iannone], was performed in the very early stages of the
LISP effort, to verify that the approach was feasible.  First,
packet traces of all traffic over the external connection of a large
university (around 10,000 users) over a week-long period were
collected.  Simulations driven by these recordings were then
performed; a variety of control settings on the cache were used, to
study the effects of varying the settings.  The simulations set no
limit on the total cache size, but used a range of cache retention
times (i.e. an entry that remained unused longer than a fixed
retention time was discarded), from 3 minutes, up to 300 minutes.

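A trace-driven simulation of this kind of cache (unbounded size,
entries discarded after a fixed idle time) can be sketched as follows.
This is an illustration of the cache policy only, not the code used in
[Iannone]; the function name and the trace format (time-ordered pairs
of timestamp in seconds and destination) are assumptions.

```python
def simulate_timeout_cache(trace, retention):
    """Replay a time-ordered trace of (timestamp, destination) pairs.

    An entry is discarded once it has been unused for longer than
    `retention` seconds.  Returns (hits, misses, peak_size).
    """
    last_used = {}          # destination -> time of most recent use
    hits = misses = peak = 0
    for now, dest in trace:
        # Discard entries that have been idle longer than the retention
        # time.  (A linear scan is fine for a sketch; a real cache would
        # use a timer wheel or similar.)
        for stale in [d for d, t in last_used.items() if now - t > retention]:
            del last_used[stale]
        if dest in last_used:
            hits += 1
        else:
            misses += 1     # in LISP, each miss costs a mapping lookup
        last_used[dest] = now
        peak = max(peak, len(last_used))
    return hits, misses, peak
```

Running the same trace with several retention values shows the
trade-off the studies measured: longer retention yields fewer misses
at the cost of a larger cache.
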
First, the simulation gave the cache sizes that would result from
such a cache design.  It showed that the resulting cache sizes ranged
from 7,500 entries (at night, with the shortest retention time) up to
about 100,000.  Using some estimations as to i) how many RLOCs the
average mapping would have (since this will affect its size), and ii)
how much memory it would take to store a mapping, this indicated
cache sizes of between roughly 100 Kbytes and a few Mbytes.

Of more interest, in a way, were the results regarding two important
measurements of the effectiveness of the cache: i) the hit ratio
(i.e. the share of references which could be satisfied by the
cache), and ii) the miss _rate_ (since control traffic overhead is
one of the chief concerns when using a cache).  These results were
also encouraging: miss (and hence lookup) rates ranged (again,
depending on the time of day, cache settings, etc) from 30 per
minute, up to 3,000 per minute (i.e. 150 per second; with the
shortest timeout, and thus the smallest cache).  Significantly, this
was substantially lower than the amount of observed DNS traffic,
which ranged from 1,800 packets per minute up to 15,000 per minute.

The second, [Kim], was in general terms similar, except that it used
data from a large ISP (taken over two days, at different times of the
year), one with about three times as many users as the previous
study.  It used the same cache design philosophy (the cache size was
not fixed), but slightly different, lower, retention time values: 60
seconds, 180 seconds, and 1,800 seconds (30 minutes), since the
previous study had indicated that extremely long times (hours) had
little additional benefit.

The results were similar: cache sizes ranged from 20,000 entries with
the shortest timeout, to roughly 60,000 with the longest; the miss
rate ranged from very roughly 400 per minute (with the longest
timeout) to very roughly 7,000 per minute (with the shortest),
similar to the previous results.

Finally, a third study, [CorasCache], examined the effect of using a
fixed size cache, and a purely Least Recently Used (LRU) cache
eviction algorithm (i.e. no timeouts).  It also tried to verify that
models of the performance of such a cache (using previous theoretical
work on caches) produced results that conformed with actual empirical
measurements.

It used yet another set of packet traces (some from an earlier study,
[Jakab]).  Using a cache size of around 50,000 entries produced a
miss rate of around 1x10^-4; again, definitely viable, and in line
with the results of the other studies.

6.2.  Mapping System

The mapping database is a distributed, and potentially replicated,
database which holds mappings between EIDs (identity) and RLOCs
(location).  To be exact, it contains mappings between EID blocks and
RLOCs (the block size is given explicitly, as part of the syntax).

Support for blocks is both for minimizing the administrative
configuration overhead, as well as for operational efficiency; e.g.
when a group of EIDs are behind a single xTR.

However, the block may be (and often is) as small as a single EID.
Since mappings are only loaded upon demand, if smaller blocks become
predominant, then the increased size of the overall database is far
less problematic than if the routing table came to be dominated by
such small entries.

A particular node may have more than one RLOC, or may change its
RLOC(s), while keeping its singular identity.

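The idea of mapping EID blocks (rather than individual EIDs) to sets
of RLOCs can be sketched with a small in-memory table.  This is an
illustration only: the class and method names are hypothetical, EIDs
are assumed to be IPvN addresses, and choosing the most specific
covering block when several match is an assumption of the sketch, not
something stated above.

```python
import ipaddress


class MappingTable:
    """Toy {EID block -> RLOCs} table (hypothetical, single-process)."""

    def __init__(self):
        self._blocks = {}   # ip_network -> list of RLOC strings

    def register(self, eid_block, rlocs):
        # The block size is explicit in the prefix length; "/32" (or
        # "/128") gives a block that holds a single EID.
        self._blocks[ipaddress.ip_network(eid_block)] = list(rlocs)

    def lookup(self, eid):
        """Return (block, rlocs) for the most specific covering block."""
        addr = ipaddress.ip_address(eid)
        covering = [b for b in self._blocks
                    if b.version == addr.version and addr in b]
        if not covering:
            return None
        best = max(covering, key=lambda b: b.prefixlen)
        return best, self._blocks[best]
```

A node with several RLOCs is simply a block whose list has more than
one element; changing its RLOC(s) replaces the list while the EID
block stays the same.
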
The mapping contains not just the RLOC(s), but also (for each RLOC
for any given EID) priority and weight (to allow allocation of load
between several RLOCs at a given priority); this allows a certain
amount of traffic engineering to be accomplished with LISP.

6.2.1.  Mapping System Organization

The mapping system is actually split into what are effectively three
major functional sub-systems (although the latter two are closely
integrated, and appear to most entities in the LISP system as a
single sub-system).

The first covers the actual mappings themselves; they are held by the
ETRs, and an ITR which needs a mapping gets it (effectively) directly
from the ETR.  This co-location of the authoritative version of the
mappings, and the forwarding functionality which it describes, is an
instance of fate-sharing.  [Clark]

To find the appropriate ETR(s) to query for the mapping, the second
two sub-systems form an 'indexing system', itself also a distributed,
potentially replicated database.  It provides information on which
ETR(s) are authoritative sources for the various {EID -> RLOC}
mappings which are available.  The two sub-systems which form it are
the client-interface sub-system, and the indexing sub-system (which
holds and provides the actual information).

6.2.2.  Interface to the Mapping System

The client interface to the indexing system from an ITR's point of
view is not with the indexing sub-system directly; rather, it is
through the client-interface sub-system, which is provided by devices
called Map Resolvers (MRs).

ITRs send request control messages (Map-Request packets) to an MR.
(This interface is probably the most important standardized interface
in LISP - it is the key to the entire system.)

The MR then uses the indexing sub-system to allow it to forward the
Map-Request to the appropriate ETR.
The ETR formulates reply control
messages (Map-Reply packets), which are sent to the ITR.  The details
of the indexing system are thus hidden from the ITRs.

Similarly, the client interface to the indexing system from an ETR's
point of view is through devices called Map Servers (MSs - admittedly
a poorly chosen term, since their primary function is not to respond
to queries, but it's too late to change it now).

ETRs send registration control messages (Map-Register packets) to an
MS, which makes the information about the mappings which the ETR
indicates it is authoritative for available to the indexing system.
The MS formulates a reply control message (the Map-Notify packet),
which confirms the registration, and is returned to the ETR.  The
details of the indexing system are thus likewise hidden from the
'ordinary' ETRs.

6.2.3.  Indexing Sub-system

The current indexing sub-system is the Delegated Database Tree (DDT),
which is very similar to DNS.  [DDT], [RFC1034]  However, unlike DNS,
the actual mappings are not handled by DDT; DDT (as part of the
indexing system) merely identifies the ETRs which hold the actual
mappings.

DDT replaced an earlier indexing sub-system, ALT ([Perspective],
section "Appendices-ALT"); this swap validated the concept of having
a separate client-interface sub-system, which would allow the actual
indexing sub-system to be replaced without needing to modify the
clients.

6.2.3.1.  DDT Overview

Conceptually, DDT is fairly simple: like DNS, in DDT the delegation
of the EID namespace ([Perspective], Section "Namespaces-XEIDs") is
instantiated as a tree of DDT 'nodes', starting with the 'root' DDT
node.  Each node is responsible (authoritative?) for one or more
blocks of the EID namespace.

The 'root' node is responsible for the entire namespace; any DDT node
can 'delegate' part(s) of its block(s) of the namespace to child DDT
node(s).  The child node(s) can in turn further delegate (necessarily
smaller) blocks of namespace to their children, through as many
levels as are needed (for operational, administrative, etc, needs).

Just as with DNS, for reasons of performance, reliability and
robustness, any particular node in the DDT delegation tree may be
instantiated on more than one redundant physical server machine.
Obviously, all the servers which instantiate a particular node in the
tree have to have identical data about that node.

Also, although the delegation hierarchy is a strict tree {{check - do
all servers for the delegation of block X have to return the same
list of servers for that block?}}, a single DDT server could be
responsible (authoritative?) for more than one block of the EID
namespace.

Eventually, leaf nodes in the DDT tree assign ({{delegate? - it's all
static configured, nothing is dynamic}}) EID namespace blocks to
MS's, which are DDT terminal nodes; i.e. a leaf of the tree is
reached when the delegation points to an MS instead of to another DDT
node.

The MS is in direct communication with the ETR(s) which both i) are
authoritative for the mappings for that block, and ii) handle traffic
to that block of EID namespace.

6.2.3.2.  Use of DDT by MRs

An MR which wants to find a mapping for a particular EID first
interacts with the nodes of the DDT tree, discovering (by querying
DDT nodes) the chain of delegations which cover that EID.  Eventually
it is directed to an MS, and then to an ETR which is responsible
{{authoritative?}} for that EID.

Also, again like DNS, MRs cache information about the delegations in
the DDT tree.
This means that once an MR has been in operation for a
while, it will usually have much of the delegation information cached
locally (especially the top levels of the delegation tree).  This
allows it, when passed a request for a mapping by an ITR, to
usually forward the mapping request to the appropriate MS without
having to do a complete tree-walk of the DDT tree to find any
particular mapping.

Thus, a typical resolution cycle would usually involve looking at
some locally cached delegation information, perhaps loading some
missing delegation entries into the delegation cache, and finally
sending the Map-Request to the appropriate MS.

The big advantage of DDT over the ALT, in performance terms, is that
it allows MRs to interact _directly_ with distant DDT nodes (as
opposed to the ALT, which _always_ required mediation through
intermediate nodes); caching of information about those distant nodes
allows DDT to make extremely effective use of this capability.

7.  Examples of Operation

To aid in comprehension, a few examples are given of user packets
traversing the LISP system.  The first shows the processing of a
typical user packet, i.e. what the vast majority of user packets will
see.  The second shows what happens when the first packet to a
previously-unseen ultimate destination (at a particular ITR) is to be
processed by LISP.

7.1.  An Ordinary Packet's Processing

This case follows the processing of a typical user packet (for
instance, a normal TCP data or acknowledgment packet associated with
an already-open TCP connection) as it makes its way from the original
source host to the ultimate destination.

When the packet has made its way through the local site to an ITR
(which is also a border router for the site), the border router looks
up the destination address (an EID) in its local mapping cache.
It
finds a mapping, which instructs it to wrap the packet in an outer
header (an IP packet, containing a UDP packet which contains a LISP
header, and then the user's original packet).  The destination
address in the outer header is set by the ITR to the RLOC of the
destination ETR.

The packet is then sent off through the Internet, using normal
Internet routing tables, etc.

On arrival at the destination ETR, the ETR will notice that it is
listed as the destination in the outer header.  It will examine the
packet, detect that it is a LISP packet, and unwrap it.  It will then
examine the header of the user's original packet, and forward it
internally, through the local site, to the ultimate destination.

At the ultimate destination, the packet will be processed, and may
produce a return packet, which follows the exact same process in
reverse - with the exception that the roles of the ITR and ETR are
swapped.

7.2.  A Mapping Cache Miss

If a host sends a packet, and it gets to the ITR, and the ITR both i)
determines that it needs to perform LISP processing on the user data
packet, but ii) does not yet have a mapping cache entry which covers
that destination EID, then more complex processing ensues.

The ITR sends a Map-Request packet, giving the destination EID it
needs a mapping for, to its MR.  The MR will look in its cache of
delegation information to see if it has the RLOC for the ETR for that
destination EID.  If not, it will query the DDT system to find the
RLOC of the ETR.  When it has the RLOC, it will send the Map-Request
on to the ETR.

The ETR sends a Map-Reply to the ITR which needs the mapping; from
then on, processing of user packets through that ITR to that ultimate
destination proceeds as above.  (Typically, like many ARP
implementations, the original user packet will have been discarded,
not cached waiting for the mapping to be found.
When the host
retransmits the packet, the mapping will be there, and the packet
will be forwarded.)

8.  Design Approach

Before describing LISP's components in more detail below, it is worth
pointing out that what may seem, in some cases, like odd (or poor)
design approaches do in fact result from the application of a
thought-through, and consistent, design philosophy used in creating
them.

This design philosophy is covered in detail in [Perspective]
(Section "Design"), and readers who are interested in the 'why' of
various mechanisms should consult that; reading it may make clearer
the reasons for some engineering choices in the mechanisms given
here.

9.  xTRs

As mentioned above (in Section 6.1), xTRs are the basic data-handling
devices in LISP.  This section explores some advanced topics related
to xTRs.

Careful rules have been specified for both TTL and ECN [RFC3168] to
ensure that passage through xTRs does not interfere with the
operation of these mechanisms.  In addition, care has been taken to
ensure that 'traceroute' works when xTRs are involved.

9.1.  When to Encapsulate

An ITR knows that an ultimate destination is 'running' LISP (remember
that the destination machine itself probably knows nothing about
LISP), and thus that it should perform LISP processing on a packet
(including potential encapsulation) if it has an entry in its local
mapping cache that covers the destination EID.

Conversely, if the cache contains a 'negative' entry (indicating that
the ITR has previously attempted to find a mapping that covers this
EID, and it has been informed by the mapping system that no such
mapping exists), it knows the ultimate destination is not running
LISP, and the packet can be forwarded normally.

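The three possible cache states (a mapping, a 'negative' entry, or no
entry at all) lead to three different actions, which can be sketched
as below.  The names are hypothetical, and for brevity the cache is
keyed by exact EID rather than by EID block; the third case, no entry
at all, is the cache miss of Section 7.2.

```python
from enum import Enum


class Action(Enum):
    ENCAPSULATE = "encapsulate and send to the ETR"
    FORWARD_NORMALLY = "forward as an ordinary packet"
    REQUEST_MAPPING = "send a Map-Request"


# Sentinel stored in the cache when the mapping system has reported that
# no mapping covers this EID (a 'negative' entry).
NEGATIVE = object()


def itr_action(dst_eid, mapping_cache):
    """Decide what an ITR does with a packet for `dst_eid`.

    `mapping_cache` maps an EID to a list of RLOCs, or to NEGATIVE.
    """
    entry = mapping_cache.get(dst_eid)
    if entry is None:
        return Action.REQUEST_MAPPING    # nothing known yet
    if entry is NEGATIVE:
        return Action.FORWARD_NORMALLY   # destination is not running LISP
    return Action.ENCAPSULATE            # destination is behind an ETR
```

Keeping negative entries in the same cache means that traffic to
non-LISP destinations costs at most one mapping lookup per entry
lifetime, rather than one per packet.
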
Note that the ITR cannot simply depend on the appearance, or
non-appearance, of the destination in the routing tables in the DFZ,
as a way to tell if an ultimate destination is a LISP node or not,
because mechanisms to allow interoperation of LISP sites and 'legacy'
sites necessarily involve advertising LISP sites' EIDs into the DFZ.

9.2.  UDP Encapsulation Details

The UDP encapsulation used by LISP for carrying traffic from ITR to
ETR, and many of the details of how it works, were all chosen for
very practical reasons.

Use of UDP (instead of, say, a LISP-specific protocol number) was
driven by the fact that many devices filter out 'unknown' protocols,
so adopting a non-UDP encapsulation would have made the initial
deployment of LISP harder - and our goal (see Section 3.1) was to
make the deployment as easy as possible.

The UDP source port in the encapsulated packet is a hash of the
original source and ultimate destination; this is because many ISPs
use multiple parallel paths (so-called 'Equal Cost Multi-Path'), and
load-share across them.  Using such a hash in the source-port in the
outer header both allows LISP traffic to be load-shared, and also
ensures that packets from individual connections are delivered in
order (since most ISPs try to ensure that packets for a particular
{source, source port, destination, destination port} tuple flow along
a single path, and do not become disordered).

The UDP checksum is zero because the inner packet usually already has
an end-end checksum, and the outer checksum adds no value.  [Saltzer]
In most existing hardware, computing such a checksum (and checking it
at the other end) would also present an intolerable load, for no
benefit.

9.3.  Header Control Channel

LISP provides a multiplexed channel in the encapsulation header.
It
is mostly (but not entirely) used for control purposes.  (See
[Perspective], Section "Architecture-Piggyback" for a longer
discussion of the architectural implications of performing control
functions with data traffic.)

The general concept is that the header starts with an 8-bit 'flags'
field, and it also includes two data fields (one 24 bits, one 32),
the contents and meaning of which vary, depending on which flags are
set.  This allows these fields to be 'multiplexed' among a number of
different low-duty-cycle functions, while minimizing the space
overhead of the LISP encapsulation header.

9.3.1.  Mapping Versioning

One important use of the multiplexed control channel is mapping
versioning; i.e. the discovery of when the mapping cached in an ITR
is outdated.  To allow an ITR to discover this, identifying sequence
numbers are applied to different versions of a mapping.
[Versioning]  This allows an ITR to easily discover when a cached
mapping has been updated by a more recent variant.

Version numbers are available in control messages (Map-Replies), but
the initial concept is that to limit control message overhead, the
versioning mechanism should primarily use the multiplexed control
channel in the user data header.

Versioning can operate in both directions: an ITR can advise an ETR
what version of a mapping it is currently using (so the ETR can
notify it if there is a more recent version), and ETRs can let ITRs
know what the current mapping version is (so the ITRs can request an
update, if their copy is outdated).

At the moment version numbers are manually assigned, and ordered.
Some felt that this was non-optimal, and that a better approach would
have been to have 'fingerprints' which were computed from the current
mapping data (i.e. a hash).
It is not clear that the ordering buys
much (if anything), and the potential for mishaps with manually
configured version numbers is self-evident.

9.3.2.  Echo Nonces

Another important use of the header control channel is for a
mechanism known as the Nonce Echo, which is used as an efficient
method for ITRs to check the reachability of correspondent ETRs.

Basically, an ITR which wishes to ensure that an ETR is up, and
reachable, sends a nonce to that ETR, carried in the encapsulation
header; when that ETR (acting as an ITR) sends some other user data
packet back to the ITR (acting in turn as an ETR), that nonce is
carried in the header of that packet, allowing the original ITR to
confirm that its packets are reaching that ETR.

Note that lack of a response is not necessarily _proof_ that
something has gone wrong - but it strongly suggests that something
has, so other actions (e.g. a switch to an alternative ETR, if one is
listed; a direct probe; etc) are advised.

(See Section 12.5 for more about Echo Nonces.)

9.3.3.  Instances

Another use of these header fields is for 'Instances' - basically,
support for VPNs across backbones.  [RFC4026]  Since there is only
one destination UDP port used for carriage of user data packets, and
the source port is used for multiplexing (above), there is no other
way to differentiate among different destination address namespaces
(which are often overlapped in VPNs).

9.4.  Probing

RLOC-Probing (see [LISP], Section 6.3.2. "RLOC-Probing Algorithm"
for details) is a mechanism that an ITR can use to determine
with certainty that an ETR is up and reachable from the ITR.  As a
side-benefit, it gives a rough RTT estimate.

It is quite a simple mechanism - an ITR simply sends a specially
marked Map-Request directly to the ETR it wishes information about;
that ETR sends back a specially marked Map-Reply.  A Map-Request and
Map-Reply are used, rather than a special probing control-message
pair, because as a side-benefit the ITR can discover if the mapping
has been updated since it cached it.

The probing mechanism is rather heavy-weight and expensive (compared
to mechanisms like the Echo-Nonce), since it costs a control message
from each side, so it should only be used sparingly.  However, it has
the advantages of providing information quickly (a single RTT), and
being a simple, direct, robust way of doing so.

9.5.  Mapping Lifetimes and Timeouts

Mappings come with a Time-To-Live, which indicates how long the
creator of the mapping expects them to be useful for.  The TTL may
also indicate that the mapping should not be cached at all, or it can
indicate that it has no particular lifetime, and the recipient can
choose how long to store it.

Mappings might also be discarded before the TTL expires, depending on
what strategies the ITR is using to maintain its cache; if the
maximum cache size is fixed, or the ITR needs to reclaim memory,
mappings which have not been used 'recently' may be discarded.
(After all, there is no harm in so doing; a future reference will
merely cause that mapping to be reloaded.)

9.6.  Security of Mapping Lookups

LISP provides an optional mechanism to secure the obtaining of
mappings by an ITR.  [LISP-SEC]  It provides protection against
attackers generating spurious Map-Reply messages (including replaying
old Map-Replies), and also against 'over-claiming' attacks (where a
malicious ETR claims EID-prefixes which are larger than what has
actually been delegated to it).

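To make the notion of 'over-claiming' concrete, the containment test
that any defence ultimately relies on can be sketched as follows.
This shows only the prefix comparison, under the assumption that EID
prefixes are IPvN prefixes; it is not the LISP-SEC procedure itself,
and the function name is hypothetical.

```python
import ipaddress


def is_over_claim(claimed_prefix, delegated_prefix):
    """True if `claimed_prefix` is not contained in `delegated_prefix`.

    A claim is acceptable only when it is equal to, or a sub-block of,
    the block actually delegated.
    """
    claimed = ipaddress.ip_network(claimed_prefix)
    delegated = ipaddress.ip_network(delegated_prefix)
    if claimed.version != delegated.version:
        return True     # different address families can never nest
    return not claimed.subnet_of(delegated)
```

For example, an ETR delegated 10.1.0.0/16 may legitimately return a
mapping for 10.1.2.0/24, but a Map-Reply from it covering 10.0.0.0/8
would be an over-claim.
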
Very briefly, the ITR provides a One-Time Key with its query; this
key is used by both the MS (to verify the EID block that it has
delegated to the ETR), and indirectly by the ETR (to verify the
mapping that it is returning to the ITR).

The specification for LISP-SEC suggests that the ITR-MR stage be
cryptographically protected, and indicates that the existing
mechanisms for securing the ETR-MS stage are used to protect
Map-Requests also.  It does assume that the channel from the MR to the
MS is secure (otherwise an attacker could obtain the OTK from the
Map-Request and use it to forge a reply).

9.7.  Mapping Gleaning in ETRs

As an optimization to the mapping acquisition process, ETRs are
allowed to 'glean' mappings from incoming user data packets, and also
from incoming Map-Request control messages.  {{Is this still there?
Check the latest version of the spec.}}  This is not secure, and so
any such mapping must be 'verified' by sending a Map-Request to get
an authoritative mapping.  (See further discussion of the security
implications of this in [Perspective], Section "Security-xTRs".)

The value of gleaning is that most communications are two-way, and so
if host A is sending packets to host B (therefore needing B's
EID->RLOC mapping), very likely B will soon be sending packets back
to A (and thus needing A's EID->RLOC mapping).  Without gleaning,
this would sometimes result in a delay, and the dropping of the first
return packet; this is felt to be very undesirable.

9.8.  Fragmentation

Several mechanisms have been proposed for dealing with packets which
are too large to transit the path from a particular ITR to a given
ETR.

One, called the 'stateful' approach, keeps a per-ETR record of the
maximum size allowed, and sends an ICMP Too Big message to the
original source host when a packet which is too large is seen.

In the other, referred to as the 'stateless' approach, for IPv4
packets without the 'DF' bit set, too-large packets are fragmented,
and then the fragments are forwarded; all other packets are
discarded, and an ICMP Too Big message returned.

It is not clear at this point which approach is preferable.

10.  The Mapping System

RFC 1034 ("DNS Concepts and Facilities") has this to say about the
DNS name to IP address mapping system:

   "The sheer size of the database and frequency of updates suggest
   that it must be maintained in a distributed manner, with local
   caching to improve performance.  Approaches that attempt to
   collect a consistent copy of the entire database will become more
   and more expensive and difficult, and hence should be avoided."

and this observation applies equally to the LISP mapping system.

To recap, the mapping system is split into an indexing sub-system,
which keeps track of where all the mappings are kept, and the
mappings themselves, the authoritative copies of which are always
held by ETRs.

10.1.  The Mapping System Interface

As mentioned in Section 6.2.2, both of the interfaces to the mapping
system (from ITRs, and ETRs) are standardized, so that the more
numerous xTRs do not have to be modified when the mapping indexing
sub-system is changed.

(This precaution has already allowed the mapping system to be
upgraded during LISP's evolution, when ALT was replaced by DDT.)

This section describes the interfaces in a little more detail; for
the details, see [MapInterface].

10.1.1.  Map-Request Messages

The Map-Request message contains a number of fields, the two most
important of which are the requested EID block identifier (remember
that individual mappings may cover a block of EIDs, not just a single
EID), and the Address Family Identifier (AFI) for that EID block.
[AFI]  The inclusion of the AFI allows the mapping system interface
(as embodied in these control packets) a great deal of flexibility.
(See [Perspective], Section "Namespaces" for more on this.)

Other important fields are the source EID (and its AFI), and one or
more RLOCs for the source EID, along with their AFIs.  Multiple RLOCs
are included to ensure that at least one is in a form which will
allow the reply to be returned to the requesting ITR, and the source
EID is used for a variety of functions, including 'gleaning' (see
Section 9.7).

Finally, the message includes a long nonce, for simple, efficient
protection against off-path attackers (see [Perspective], Section
"Security-xTRs" for more), and a variety of other fields and control
flag bits.

10.1.2.  Map-Reply Messages

The Map-Reply message looks similar, except it includes the mapping
entry for the requested EID(s), which contains one or more RLOCs and
their associated data.  (Note that the reply may cover a larger block
of the EID namespace than the request; most requests will be for a
single EID, the one which prompted the query.)

For each RLOC in the entry, there is the RLOC, its AFI (of course),
priority and weight fields (see Section 6.2), and multicast priority
and weight fields.

10.1.2.1.  Solicit-Map-Request Messages

"Solicit-Map-Request" (SMR) messages are actually not another message
type, but a sub-type of Map-Reply messages.  They include a special
flag which indicates to the recipient that it _should_ send a new
Map-Request message, to refresh its mapping, because the ETR has
detected that the one it is using is out-dated.

SMRs, like most other control traffic, are rate-limited.  {{Need to
say more about rate limiting, probably in security section?  Ref to
that from here.}}

10.1.3.
Map-Register and Map-Notify Messages 1302 The Map-Register message contains authentication information, and a 1303 number of mapping records, each with an individual Time-To-Live 1304 (TTL). Each of the records contains an EID (potentially, a block of 1305 EIDs) and its AFI, a version number for this mapping (see 1306 Section 9.3.1), and a number of RLOCs and their AFIs. 1308 Each RLOC entry also includes the same data as in the Map-Replies 1309 (i.e. priority and weight); this is because in some circumstances it 1310 is advantageous to allow the MS to proxy reply on the ETR's behalf to 1311 Map-Request messages. [Mobility] 1313 Map-Notify messages have the exact same contents as Map-Register 1314 messages; they are purely acknowledgements. 1316 10.2. The DDT Indexing Sub-system 1318 As previously mentioned Section 6.2.3, the indexing sub-system in 1319 LISP is currently the DDT system. 1321 The overall operation is fairly simple; an MR which needs a mapping 1322 starts at a server for the root DDT node (there will normally be more 1323 than one such server available, for both performance and robustness 1324 reasons), and through a combination of cached delegation information, 1325 and repetitive querying of a sequence of DDT servers, works its way 1326 down the delegation tree until it arrives at an MS which is 1327 authoritative (responsible?) for the block of EID namespace which 1328 holds the destination EID in question. 1330 The interaction between MRs and DDT servers is not complex; the MR 1331 sends the DDT server a Map-Request control message (which looks 1332 almost exactly like the Map-Request which an ITR sends to an MR). 1333 The DDT server uses its data (which is configured, and static) to see 1334 whether it is directly peered to an MS which can answer the request, 1335 or if it has a child (or children, if replicated) which is 1336 responsible for that portion of the EID namespace. 
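As an illustration only, the walk down the delegation tree can be
sketched in Python. The node names, prefixes, and the functions
query_ddt and resolve are hypothetical, and the in-memory table
merely stands in for the statically configured data of real DDT
servers; the actual message formats and referral semantics are those
of [DDT], not of this sketch.

```python
import ipaddress

# Hypothetical static delegation data: for each DDT node, an EID prefix
# maps to ("ddt", child DDT nodes) or ("ms", terminal Map-Servers).
DDT_NODES = {
    "root": {"10.0.0.0/8": ("ddt", ["ddt-a"])},
    "ddt-a": {"10.1.0.0/16": ("ms", ["ms-1"])},
}

def query_ddt(node, eid):
    """Return (prefix, kind, targets) for the longest delegation at
    `node` that covers `eid`, or None if the node has no such delegation."""
    addr = ipaddress.ip_address(eid)
    best = None
    for prefix, (kind, targets) in DDT_NODES[node].items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, kind, targets)
    return best

def resolve(eid, cache):
    """Walk the delegation tree for `eid`, starting from the most specific
    cached delegation (or the root). Returns (map_server, queries_sent)."""
    addr = ipaddress.ip_address(eid)
    queries = 0
    covering = [net for net in cache if addr in net]
    if covering:
        kind, targets = cache[max(covering, key=lambda n: n.prefixlen)]
    else:
        kind, targets = "ddt", ["root"]
    while kind == "ddt":
        answer = query_ddt(targets[0], eid)  # always asks the first replica
        queries += 1
        if answer is None:
            return None, queries             # no delegation covers this EID
        net, kind, targets = answer
        cache[net] = (kind, targets)         # remember the delegation
    return targets[0], queries

if __name__ == "__main__":
    cache = {}
    print(resolve("10.1.2.3", cache))  # queries the root, then ddt-a
    print(resolve("10.1.2.3", cache))  # answered from cached delegations
```

In this toy setup the first lookup costs two queries and the repeat
lookup costs none, which is the effect of delegation caching that the
sketch is meant to show; replica selection, timeouts and security are
all omitted.
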
If it has children which are responsible, it will reply to the MR
with another kind of LISP control message, a Map-Referral message,
which provides information about the delegation of the block
containing the requested EID. The Map-Referral also gives the RLOCs
of all the machines which are DDT servers for that block, and the MR
can then send Map-Requests to any one (or all) of them.

Control flags in the Map-Referral indicate to the querying MR whether
the referral is to another DDT node, an MS, or an ETR. If it is to
another DDT node, the MR then sends the Map-Request to the child DDT
node, repeating the process.

If it is to an MS, the MR then interacts with that MS, and usually the
block's ETR(s) as well, to cause a mapping to be sent to the ITR
which queried the MR for it. (Recall that some MSs provide
Map-Replies on behalf of an associated ETR, so in such cases the
Map-Reply will come from the MS, not the ETR. {{I think this case has
been mentioned already; check.}})

Delegations are cached in the MRs, so that once an MR has received
information about a delegation, it will not need to look that up
again. Once it has been in operation for a short while, it will only
need to ask for delegation information which it has not yet asked
about - probably only the last stage in a delegation to a 'leaf' MS.

As described below (Section 10.6), significant amounts of modeling and
performance measurement have been performed, to verify that DDT has
(and will continue to have) acceptable performance.

10.2.1. Map-Referral Messages

Map-Referral messages look almost identical to Map-Reply messages
(which is felt to be an advantage by some people, although having a
more generic record-based format would probably be better in the long
run, as ample experience with DNS has shown), except that the RLOCs
potentially name either i) other DDT nodes (children in the
delegation tree), or ii) terminal MSs.

10.3. Reliability via Replication

Everywhere throughout the mapping system, robustness to operational
failures is obtained by replicating data in multiple instances of any
particular node (of whatever type). Map-Resolvers, Map-Servers, DDT
nodes, ETRs - all of them can be replicated, and the protocol
supports this replication.

The deployed DDT system actually uses anycast [RFC4786], along with
replicated servers, to improve both performance and robustness.

There are generally no mechanisms specified yet to ensure coherence
between multiple copies of any particular data item, etc - this is
currently a manual responsibility. If and when LISP protocol
adoption proceeds, an automated layer to perform this functionality
can 'easily' be layered on top of the existing mechanisms.

10.4. Security of the DDT Indexing Sub-system

LISP provides an advanced model for securing the mapping indexing
system, in line with the overall LISP security philosophy.

Briefly, securing the mapping indexing system is broken into two
parts: the interface between the clients of the system (MRs) and the
mapping indexing system itself, and the interaction between the DDT
nodes/servers which make it up.

The client interface provides only a single model, using the
'canonical' public-private key system (starting from a trust anchor),
in which the child's public key is provided by the parent, along with
the delegation.
This requires very little configuration in the
clients, and is fairly secure.

The interface between the DDT nodes/servers allows for choices
between a number of different options, allowing the operators to
trade off among configuration complexity, security level, etc. This
is based on experience with DNSSEC ([RFC4033]), where configuration
complexity in the servers has been a major stumbling block to
deployment.

See [Perspective], Section "Security-Mappings" for more.

10.5. Extended Tools

In addition to the priority and weight data items in mappings, LISP
offers other tools to enhance functionality, particularly in the
traffic engineering area.

One is 'source-specific mappings', i.e. the ETR may return different
mappings to the enquiring ITR, depending on the identity of the ITR.
This allows very fine-tuned traffic engineering, far more powerful
than routing-based TE.

10.6. Performance of the Mapping System

Prior to the creation of DDT, a large study of the performance of the
previous mapping system, ALT ([ALT]), along with a proposed new
design called TREE (which used DNS to hold delegation information)
provided considerable insight into the likely performance of the
mapping systems at larger scale. [Jakab] The basic structure and
concepts of DDT are identical to those of TREE, so the performance
simulation work done for that design applies equally to DDT.

In that study, as with earlier LISP performance analyses, extensive
large-scale simulations were driven by lengthy recordings of actual
traffic at several major sites; one was the site in the first study
([Iannone]), and the other was an even larger university, with roughly
35,000 users.

The results showed that a system like DDT, which caches information
about delegations, and allows the MR to communicate directly with the
lower nodes on the delegation hierarchy based on cached delegation
information, would have good performance, with average resolution
times on the order of the MR to MS RTT. This verified the
effectiveness of this particular type of indexing system.

A more recent study, [Saucez], has measured actual resolution times
in the deployed LISP network; it took measurements from a variety of
locations in the Internet, with respect to a number of different
target EIDs. Average measured resolution delays ranged from roughly
175 msec to 225 msec, depending on the location.

11. Deployment Mechanisms

This section discusses several deployment issues in more detail.
With LISP's heavy emphasis on practicality, much work has gone into
making sure it works well in the real-world environments most people
have to deal with.

11.1. LISP Deployment Needs

As mentioned earlier (Section 3.2), LISP requires no change to almost
all existing hosts and routers. Obviously, however, one must deploy
_something_ to run LISP! Exactly what that has to be will depend
greatly on the details of the site's existing networking gear.

The primary requirement is for one or more xTRs. These may be
existing routers, just with new software loads, or new devices may
have to be deployed.

LISP also requires a small amount of LISP-specific support
infrastructure, such as MRs, MSs, the DDT hierarchy, etc, but much of
this will either i) already be deployed, and if the new site can make
arrangements to use it, it need do nothing else, or ii) those
functions it must provide may be co-located in other LISP devices
(again, either new devices, or new software on existing ones).

11.2. Internetworking Mechanism

One aspect which has received a lot of attention is the set of
mechanisms previously referred to (in Section 4.4) to allow
interoperation of LISP sites with so-called 'legacy' sites which are
not running LISP (yet).

To briefly refresh what was said there, there are two main approaches
to such interworking: proxy nodes (PITRs and PETRs), and an
alternative mechanism using a device with combined NAT and LISP
functionality; these are described in more detail here.

11.3. Proxy Devices

PITRs (proxy ITRs) serve as ITRs for traffic _from_ legacy hosts to
nodes using LISP. PETRs (proxy ETRs) serve as ETRs for LISP traffic
_to_ legacy hosts (for cases where a LISP device cannot send packets
directly to such hosts, without encapsulation).

Note that return traffic _to_ a legacy host from a LISP-using node
does not necessarily have to pass through an ITR/PETR pair - the
original packets can usually just be sent directly to the ultimate
destination. However, for some kinds of LISP operation (e.g. mobile
nodes), this is not possible; in these situations, the PETR is
needed.

11.3.1. PITRs

PITRs (proxy ITRs) serve as ITRs for traffic _from_ legacy hosts to
nodes using LISP. To do that, they have to advertise into the
existing legacy backbone Internet routing the availability of
whatever ranges of EIDs (i.e. of nodes using LISP) they are proxying
for, so that legacy hosts will know where to send traffic to those
LISP nodes.

As mentioned previously (Section 9.1), an ITR at another LISP site
can avoid using a PITR (i.e. it can detect that a given ultimate
destination is not a legacy host, if a PITR is advertising it into
the DFZ) by checking to see if a LISP mapping exists for that
ultimate destination.

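A minimal sketch of that check, in Python, assuming a hypothetical
in-memory table MAPPINGS in place of a real map-cache and mapping
system query. The function names and addresses are illustrative
only, and RLOC selection by priority and weight is deliberately
ignored.

```python
import ipaddress

# Hypothetical stand-in for the mapping system: EID prefix -> list of RLOCs.
MAPPINGS = {
    ipaddress.ip_network("10.1.0.0/16"): ["192.0.2.1", "198.51.100.1"],
}

def lookup_mapping(dest):
    """Return the RLOCs of the longest EID prefix covering dest, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [net for net in MAPPINGS if addr in net]
    if not matches:
        return None
    return MAPPINGS[max(matches, key=lambda net: net.prefixlen)]

def forwarding_decision(dest):
    """Decide how an ITR handles a packet addressed to dest."""
    rlocs = lookup_mapping(dest)
    if rlocs:
        # A mapping exists, so the destination is in a LISP site:
        # encapsulate straight to one of its ETRs, bypassing any PITR.
        return ("encapsulate", rlocs[0])
    # No mapping: treat the destination as a legacy host, forward natively.
    return ("native", dest)
```

With this table, a destination inside 10.1.0.0/16 is encapsulated to
the first listed RLOC, while any other destination is forwarded
unencapsulated.
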
Advertising EID ranges into the DFZ in this way obviously has an
impact on the routing table in the DFZ, but it is not clear yet
exactly what that impact will be; it is very dependent on the
collected details of many individual deployment decisions.

A PITR may cover a group of EID blocks with a single EID
advertisement, in order to reduce the number of routing table entries
added. (In fact, at the moment, aggressive aggregation of EID
announcements is performed, precisely to minimize the number of
new announced routes added by this technique.)

At the same time, if a site does traffic engineering with LISP
instead of fine-grained BGP announcement, that will help keep table
sizes down (and this is true even in the early stages of LISP
deployment). The same is true for multi-homing.

11.3.2. PETRs

PETRs (proxy ETRs) serve as ETRs for LISP traffic _to_ legacy hosts,
for cases where a LISP device cannot send packets to such hosts
without encapsulation. That typically happens for one of two
reasons.

First, it will happen in places where some device is implementing
Unicast Reverse Path Forwarding (uRPF), to prevent a variety of
negative behaviour; originating packets with the original source's
EID in the source address field will result in them being filtered
out and discarded.

Second, it will happen when a LISP site wishes to send packets to a
non-LISP site, and the path in between does not support the
particular IP protocol version used by the original source along its
entire length. Use of a PETR on the other side of the 'gap' will
allow the LISP site's packet to 'hop over' the gap, by utilizing
LISP's built-in support for mixed protocol encapsulation.

PETRs are generally paired with specific ITRs, which have the
location of their PETRs configured into them.
In other words, unlike
normal ETRs, PETRs do not have to register themselves in the mapping
database, on behalf of any legacy sites they serve.

Also, allowing an ITR to always send traffic leaving a site to a PETR
does avoid having to choose whether or not to encapsulate packets; it
can just always encapsulate packets, sending them to the PETR if it
has no specific mapping for the ultimate destination. However, this
is not advised: as mentioned, it is easy to tell if something is a
legacy destination.

11.4. LISP-NAT

A LISP-NAT device, as previously mentioned, combines LISP and NAT
functionality, in order to allow a LISP site which is internally
using addresses which cannot be globally routed to communicate with
non-LISP sites elsewhere in the Internet. (In other words, the
technique used by the PITR approach simply cannot be used in this
case.)

To do this, a LISP-NAT performs the usual NAT functionality: it
translates a host's source address(es) in packets passing through it
from an 'inner' value to an 'outer' value, and stores that
translation in a table, which it can use to similarly process
subsequent packets (both outgoing and incoming). [Interworking]

There are two main cases where this might apply:
- Sites using non-routable global addresses
- Sites using private addresses [RFC1918]

11.5. Use Through NAT Devices

Like them or not (and NAT devices have many egregious issues - some
inherent in the nature of the process of mapping addresses; others,
such as the brittleness due to non-replicated critical state, caused
by the way NATs were introduced, as stand-alone 'invisible' boxes),
NATs are both ubiquitous, and here to stay for a long time to come.

Thus, in the actual Internet of today, having any new mechanisms
function well in the presence of NATs (i.e.
with LISP xTRs behind a
NAT device) is absolutely necessary. LISP has produced a variety of
mechanisms to do this.

11.5.1. First-Phase NAT Support

The first mechanism used by LISP to operate through a NAT device only
worked with some NATs, those which were configurable to allow inbound
packet traffic to reach a configured host.

A pair of new LISP control messages, LISP Echo-Request and
Echo-Reply, allowed the ETR to discover its temporary global address;
the Echo-Request was sent to the configured Map-Server, and it replied
with an Echo-Reply which included the source address from which the
Echo-Request was received (i.e. the public global address assigned to
the ETR by the NAT). The ETR could then insert that address in any
Map-Reply control messages which it sent to correspondent ITRs.

The fact that this mechanism did not support all NATs, and also
required manual configuration of the NAT, meant that this was not a
good solution; in addition, since LISP expects all incoming data
traffic to be on a specific port, it was not possible to have
multiple ETRs behind a single NAT (which normally would have only one
global address to share, meaning port mapping would have to be used,
except that...)

11.5.2. Second-Phase NAT Support

For a more comprehensive approach to support of LISP xTR deployment
behind NAT devices, a fairly extensive supplement to LISP, LISP NAT
Traversal, has been designed. [LISP-NAT]

A new class of LISP device, the LISP Re-encapsulating Tunnel Router
(RTR), passes traffic through the NAT, both to and from the xTR.
(Inbound traffic has to go through the RTR as well, since otherwise
multiple xTRs could not operate behind a single NAT, for the
'specified port' reason in the section above.)

(Had the Map-Reply included a port number, this could have been
avoided - although of course it would be possible to define a new
RLOC type which included protocol and port, to allow other
encapsulation techniques.)

Two new LISP control messages (Info-Request and Info-Reply) allow an
xTR to detect if it is behind a NAT device, and also discover the
global IP address and UDP port assigned by the NAT to the xTR. A
modification to LISP Map-Register control messages allows the xTR to
initialize mapping state in the NAT, in order to use the RTR.

This mechanism addresses cases where the xTR is behind a NAT, but the
xTR's associated MS is on the public side of the NAT; this
limitation, that MSs must be in the 'public' part of the Internet,
seems reasonable.

11.6. LISP and DFZ Routing

One of LISP's original motivations was to try and control the growth
of the size of the so-called 'Default-Free-Zone' (DFZ), the core of
the Internet, the part where routes to _all_ destinations must be
available. As LISP becomes more widely deployed, it can help with
this issue, in a variety of ways.

In covering this topic, one must recognize that conditions in various
stages of LISP deployment (in terms of ubiquity) will have a large
influence. [Deployment] introduced useful terminology for this
progression, in addition to some coverage of the topic (see Section
5, "Migration to LISP"):

   The loosely defined terms of "early transition phase", "late
   transition phase", and "LISP Internet phase" refer to time periods
   when LISP sites are a minority, a majority, or represent all edge
   networks respectively.

In the early phases of deployment, two primary effects will allow
LISP to have a positive impact on the routing table growth:

- Using LISP for traffic engineering instead of BGP
- Aggregation of smaller PI sites into a single PITR advertisement

The first is fairly obvious (doing TE with BGP requires injecting
more-specific routes into the DFZ routing tables, something doing TE
with LISP avoids); the second is not guaranteed to happen (since it
requires coordination among a number of different parties), and only
time will tell if it does happen.

11.6.1. Long-term Possibilities

At a later stage of the deployment, a more aggressive approach
becomes available: taking part of the DFZ, one for which all 'stub'
sites connected to it have deployed LISP, and removing all 'EID
routes' (used for backwards compatibility with 'legacy' sites); only
RLOC routes would remain in the routing table in that part of the
Internet backbone.

Obviously there would be a boundary between the two parts of the DFZ,
and the routers on the border would have to (effectively) become
PITRs, and inject routes to all of the LISP sites 'behind' them into
the 'legacy' DFZ (to coin a name for the part of the DFZ which, for
reasons of interoperability with legacy sites, still carries EID
routes).

Note that it is likely not feasible to have the 'RLOC only' part of
the DFZ in the 'middle' of the DFZ; that would require (effectively)
EID routes to be removed from BGP on crossing the boundary _into_ the
RLOC DFZ, but re-created on crossing the boundary _out_ of the RLOC
DFZ. This is likely to be impractical, leading to the suggestion of
a simpler boundary between the RLOC-only part of the DFZ, and the
'legacy' DFZ.

The mechanism for detecting which routes are 'EID routes' and which
are 'RLOC routes' (required for the boundary routers to be able to
filter out the 'EID routes') would also need to be worked out; the
most likely candidate appears to be something involving BGP
attributes.

12. Fault Discovery/Handling

LISP is, in terms of its functionality, a fairly simple system: the
list of failure modes is thus not extensive.

12.1. Handling Missing Mappings

Handling of missing mappings is fairly simple: the ITR calls for the
mapping, and in the meantime can either discard traffic to that
ultimate destination (as many ARP implementations do) [RFC826], or,
if dropping the traffic is deemed undesirable, it can forward it
via a 'default PITR'.

A number of PITRs advertise all EID blocks into the backbone routing,
so that any ITRs which are temporarily missing a mapping can forward
the traffic to these default PITRs via normal transmission methods,
where it is encapsulated and passed on.

12.2. Outdated Mappings

If a mapping changes once an ITR has retrieved it, that may result in
traffic to the EIDs covered by that mapping failing. There are three
cases to consider:

- When the ETR to which traffic is being sent is still a valid ETR
  for that EID, but the mapping has been updated (e.g. to change the
  priority of various ETRs)
- When the ETR to which traffic is being sent is still an ETR, but
  no longer a valid ETR for that EID
- When the ETR to which traffic is being sent is no longer an ETR

12.2.1. Outdated Mappings - Updated Mapping

A 'mapping versioning' system, whereby mappings have version numbers,
and ITRs are notified when their mapping is out of date, has been
added to detect this, and the ITR responds by refreshing the mapping.
[Versioning]

12.2.2. Outdated Mappings - Wrong ETR

If an ITR is holding a seriously outdated cached mapping, it may send
packets to an ETR which is no longer an ETR for that EID.

It might be argued that if the ETR is properly managing the lifetimes
on its mapping entries, this 'cannot happen', but it is a wise design
methodology to assume that 'cannot happen' events will in fact happen
(as they do, due to software errors, or, on rare occasions, hardware
faults), and ensure that the system will handle them properly (if
perhaps not in the most expeditious, or 'clean' way - they are, after
all, very unlikely to happen).

ETRs can easily detect cases where this happens, after they have
unwrapped a user data packet; in response, they send a
Solicit-Map-Request to the source ITR to cause it to refresh its
mapping.

12.2.3. Outdated Mappings - No Longer an ETR

In another case of what can happen if an ITR uses an outdated
mapping, the destination of traffic from an ITR might no longer be a
LISP device at all. In such cases, one might get an ICMP Destination
Unreachable error message. However, one cannot depend on that - and
in any event, that would provide an attack vector, so it should be
used with care. (See [LISP], Section 6.3, "Routing Locator
Reachability" for more about this.)

The following mechanism will work, though. Since the destination is
not an ETR, the echoing reachability detection mechanism (see
Section 9.3.2) will detect a problem. At that point, the backstop
mechanism, Probing, will kick in. Since the destination is still not
an ETR, that will fail, too.

At that point, traffic will be switched to a different ETR, or, if
none are available, a reload of the mapping may be initiated.

12.3. Erroneous Mappings

Again, this 'should not happen', but a good system should deal with
it.
However, in practice, should this happen, it will produce one of
the prior two cases (the wrong ETR, or something that is not an ETR),
and will be handled as described there.

12.4. Neighbour Liveness

The ITR, like all packet switches, needs to detect, and react, when
its next-hop neighbour ceases operation. As LISP traffic is
effectively always unidirectional (from ITR to ETR), this could be
somewhat problematic.

Solving a related problem, neighbour reachability (below) subsumes
handling this fault mode, however.

Note that the two terms (liveness and reachability) are _not_
synonymous (although a lot of LISP documentation confuses them).
Liveness is a property of a node - it is either up and functioning,
or it is not. Reachability is only a property of a particular _pair_
of nodes.

If packets sent from a first node to a second are successfully
received at the second, it is 'reachable' from the first. However,
the second node may at the very same time _not_ be reachable from
some other node. Reachability is _always_ an ordered pairwise
property, and of a specified ordered pair.

12.5. Neighbour Reachability

A more significant issue than whether a particular ETR E is up or not
is, as mentioned above, that although ETR E may be up, attached to
the network, etc, an issue in the network between a source ITR I and
E may prevent traffic from I from getting to E. (Perhaps a routing
problem, or perhaps some sort of access control setting.)

The one-way nature of LISP traffic makes this situation hard to
detect in a way which is economic, robust and fast. Two out of the
three are usually not too hard, but all three at the same time - as is
highly desirable for this particular issue - are harder.

In line with the LISP design philosophy ([Perspective], Section
"Design-Theoretical"), this problem is attacked not with a single
mechanism (which would have a hard time meeting all those three goals
simultaneously), but with a collection of simpler, cheaper
mechanisms, which collectively will usually meet all three.

They are reliance on the underlying routing system (which can of
course only reliably provide a negative reachability indication, not a
positive one), the echo nonce (which depends on some return traffic
from the destination xTR back to the source xTR), and finally direct
'pinging', in the case where no positive echo is returned.

(The last is not the first choice, as due to the large fan-out
expected of LISP devices, reliance on it as a sole mechanism would
produce a fair amount of overhead.)

13. Current Improvements

In line with the philosophies laid out in Section 8, LISP is
something of a moving target. This section discusses some of the
contemporaneous improvements being made to LISP.

13.1. Improved NAT Support

13.2. Mobile Device Support

Mobility is an obvious capability to provide with LISP. Doing so is
relatively simple, if the mobile host is prepared to act as its own
ETR. It obtains a local 'temporary use' address, and registers that
address as its RLOC. Packets to the mobile host are sent to its
temporary address, wherever that may be, and the mobile host first
unwraps them (acting as an ETR), and then processes them normally
(acting as a host).

(Doing mobility without having the mobile host act as its ETR is
difficult, even if ETRs are quite common. The reason is that if the
ETR and mobile host are not integrated, during the step from the ETR
to the mobile host, the packets must contain the mobile host's EID,
and this may not be workable.
If there is a local router between the
ETR and mobile host, for instance, it is unlikely to know how to get
the packets to the mobile host.)

If the mobile host migrates to a site which is itself a LISP site,
things get a little more complicated. The 'temporary address' it
gets is itself an EID, requiring mapping, and wrapping for transit
across the rest of the Internet. A 'double encapsulation' is thus
required at the other end; the packets are first encapsulated with
the mobile node's temporary address as their RLOC, and then this has
to be looked up in a second lookup cycle (see Section 9.1), and then
wrapped again, with the site's RLOC as their destination.

This results in a slight loss in maximum packet size, due to the
duplicated headers, but on the whole it is considerably simpler than
the alternative, which would be to re-wrap the packet at the site's
ETR, when it is discovered that the ultimate destination's EID was
not 'native' to the site. This would require that the mobile node's
EID effectively have two different mappings, depending on whether the
lookup was being performed outside the LISP site, or inside.

{{Also probably need to mention briefly how the other end is notified
when mappings are updated, and about proxy-Map-Replies.}} [Mobility]

13.3. Multicast Support

Multicast may seem an odd thing to support with LISP, since LISP is
all about separating identity from location, and although a multicast
group in some sense has an identity, it certainly does not have _a_
location.

However, multicast is important to some users of the network, for a
number of reasons: doing multiple unicast streams is inefficient; it
is easy to use up all the upstream bandwidth, and without multicast a
server can also be saturated fairly easily in doing the unicast
replication.
So it is important for LISP to 'play nicely' with
multicast; work on multicast support in LISP is fairly advanced,
although not far-ranging.

Briefly, destination group addresses are not mapped; only the source
address (when the original source is inside a LISP site) needs to be
mapped, both during distribution tree setup and during actual
traffic delivery. In other words, LISP's mapping capability is used:
it is just applied to the source, not the destination (as with most
LISP activity); the inner source is the EID, and the outer source is
the EID's RLOC.

Note that this does mean that if the group is using separate
source-specific trees for distribution, there isn't a separate
distribution tree outside the LISP site for each different source of
traffic to the group from inside the LISP site; they are all lumped
together under a single source, the RLOC.

The approach currently used by LISP requires no packet format changes
to existing multicast protocols. See [Multicast] for more;
additional LISP multicast issues are discussed in [LISP], Section 12.

13.4. {{Any others?}}

14. Acknowledgments

The author would like to start by thanking all the members of the
core LISP group for their willingness to allow him to add himself to
their effort, and for their enthusiasm for whatever assistance he has
been able to provide.

He would also like to thank (in alphabetical order) Vina Ermagan,
Vince Fuller and Vasileios Lakafosis for their careful review of, and
helpful suggestions for, this document. (If I have missed anyone in
this list, I apologize most profusely.) A very special thank you
goes to Joel Halpern, who, when asked, promptly returned comments on
intermediate versions of this document.
Grateful thanks go also to 1938 Darrel Lewis for his help with material on non-Internet uses of LISP, 1939 and to Vince Fuller and Dino Farinacci for answering detailed 1940 questions about some obscure LISP topics. 1942 A final thanks is due to John Wrocklawski for the author's 1943 organizational affiliation, and to Vince Fuller for help with XML. 1944 This memo was created using the xml2rfc tool. 1946 I would like to dedicate this document to the memory of my parents, 1947 who gave me so much, and whom I can no longer thank in person, as I 1948 would have so much liked to be able to. 1950 15. IANA Considerations 1952 This document makes no request of the IANA. 1954 16. Security Considerations 1956 This memo does not define any protocol and therefore creates no new 1957 security issues. 1959 17. References 1961 17.1. Normative References 1963 [RFC768] J. Postel, "User Datagram Protocol", RFC 768, 1964 August 1980. 1966 [RFC791] J. Postel, "Internet Protocol", RFC 791, 1967 September 1981. 1969 [RFC1498] J. H. Saltzer, "On the Naming and Binding of Network 1970 Destinations", RFC 1498, (Originally published in: 1971 "Local Computer Networks", edited by P. Ravasio et 1972 al., North-Holland Publishing Company, Amsterdam, 1973 1982, pp. 311-317.), August 1993. 1975 [RFC2460] S. Deering and R. Hinden, "Internet Protocol, Version 1976 6 (IPv6) Specification", RFC 2460, December 1998. 1978 [AFI] IANA, "Address Family Indicators (AFIs)", Address 1979 Family Numbers, January 2011, . 1982 [LISP] D. Farinacci, V. Fuller, D. Meyer, and D. Lewis, "The 1983 Locator/ID Separation Protocol (LISP)", RFC 6830, 1984 January 2013. 1986 [MapInterface] V. Fuller and D. Farinacci, "Locator/ID Separation 1987 Protocol (LISP) Map-Server Interface", RFC 6833, 1988 January 2013. 1990 [Versioning] L. Iannone, D. Saucez, and O. Bonaventure, 1991 "Locator/ID Separation Protocol (LISP) Map- 1992 Versioning", RFC 6834, January 2013. 1994 [Interworking] D. Lewis, D. Meyer, D. Farinacci, and V. 
                  Fuller, "Interworking between Locator/ID Separation
                  Protocol (LISP) and Non-LISP Sites", RFC 6832,
                  January 2013.

   [DDT]          V. Fuller, D. Lewis, and D. Farinacci, "LISP
                  Delegated Database Tree", draft-ietf-lisp-ddt-00
                  (work in progress), October 2012.

   [Perspective]  J. N. Chiappa, "An Architectural Perspective on the
                  LISP Location-Identity Separation System",
                  draft-ietf-lisp-perspective-00 (work in progress),
                  February 2013.

   [Future]       J. N. Chiappa, "Potential Long-Term Developments With
                  the LISP System", draft-chiappa-lisp-evolution-00
                  (work in progress), October 2012.

   [LISP-SEC]     F. Maino, V. Ermagan, A. Cabellos-Aparicio,
                  D. Saucez, and O. Bonaventure, "LISP-Security
                  (LISP-SEC)", draft-ietf-lisp-sec-04 (work in
                  progress), October 2012.

   [LISP-NAT]     V. Ermagan, D. Farinacci, D. Lewis, J. Skriver,
                  F. Maino, and C. White, "NAT traversal for LISP",
                  draft-ermagan-lisp-nat-traversal-03 (work in
                  progress), March 2013.

   [Mobility]     D. Farinacci, V. Fuller, D. Lewis, and D. Meyer,
                  "LISP Mobility Architecture", draft-meyer-lisp-mn-07
                  (work in progress), April 2012.

   [Multicast]    D. Farinacci, D. Meyer, J. Zwiebel, and S. Venaas,
                  "The Locator/ID Separation Protocol (LISP) for
                  Multicast Environments", RFC 6831, January 2013.

   [Deployment]   L. Jakab, A. Cabellos-Aparicio, F. Coras,
                  J. Domingo-Pascual, and D. Lewis, "LISP Network
                  Element Deployment Considerations",
                  draft-ietf-lisp-deployment-08 (work in progress),
                  June 2013.

17.2.  Informative References

   [NIC8246]      A. McKenzie and J. Postel, "Host-to-Host Protocol for
                  the ARPANET", NIC 8246, Network Information Center,
                  SRI International, Menlo Park, CA, October 1977.

   [IEN19]        J. F. Shoch, "Inter-Network Naming, Addressing, and
                  Routing", IEN (Internet Experiment Note) 19,
                  January 1978.

   [RFC826]       D.
                  Plummer, "Ethernet Address Resolution Protocol",
                  RFC 826, November 1982.

   [RFC1034]      P. V. Mockapetris, "Domain Names - Concepts and
                  Facilities", RFC 1034, November 1987.

   [RFC1631]      K. Egevang and P. Francis, "The IP Network Address
                  Translator (NAT)", RFC 1631, May 1994.

   [RFC1918]      Y. Rekhter, R. Moskowitz, D. Karrenberg,
                  G. J. de Groot, and E. Lear, "Address Allocation for
                  Private Internets", RFC 1918, February 1996.

   [RFC1992]      I. Castineyra, J. N. Chiappa, and M. Steenstrup, "The
                  Nimrod Routing Architecture", RFC 1992, August 1996.

   [RFC3168]      K. Ramakrishnan, S. Floyd, and D. Black, "The
                  Addition of Explicit Congestion Notification (ECN) to
                  IP", RFC 3168, September 2001.

   [RFC3272]      D. Awduche, A. Chiu, A. Elwalid, I. Widjaja, and
                  X. Xiao, "Overview and Principles of Internet Traffic
                  Engineering", RFC 3272, May 2002.

   [RFC4026]      L. Andersson and T. Madsen, "Provider Provisioned
                  Virtual Private Network (VPN) Terminology", RFC 4026,
                  March 2005.

   [RFC4033]      R. Arends, R. Austein, M. Larson, D. Massey, and
                  S. Rose, "DNS Security Introduction and
                  Requirements", RFC 4033, March 2005.

   [RFC4116]      J. Abley, K. Lindqvist, E. Davies, B. Black, and
                  V. Gill, "IPv4 Multihoming Practices and
                  Limitations", RFC 4116, July 2005.

   [RFC4786]      J. Abley and K. Lindqvist, "Operation of Anycast
                  Services", RFC 4786, December 2006.

   [RFC4984]      D. Meyer, L. Zhang, and K. Fall, "Report from the IAB
                  Workshop on Routing and Addressing", RFC 4984,
                  September 2007.

   [RFC5887]      B. Carpenter, R. Atkinson, and H. Flinck,
                  "Renumbering Still Needs Work", RFC 5887, May 2010.

   [RFC6115]      T. Li, Ed., "Recommendation for a Routing
                  Architecture", RFC 6115, February 2011.

                  Perhaps the most ill-named RFC of all time; it
                  contains nothing that could truly be called a
                  'routing architecture'.

   [LISP0]        D. Farinacci, V. Fuller, and D.
                  Oran, "Locator/ID Separation Protocol (LISP)",
                  draft-farinacci-lisp-00 (work in progress), January
                  2007.

   [ALT]          V. Fuller, D. Farinacci, D. Meyer, and D. Lewis,
                  "Locator/ID Separation Protocol Alternative Logical
                  Topology (LISP+ALT)", RFC 6836, January 2013.

   [NSAP]         International Organization for Standardization,
                  "Information Processing Systems - Open Systems
                  Interconnection - Basic Reference Model", ISO
                  Standard 7498.1984, 1984.

   [Atkinson]     R. Atkinson, "Revised draft proposed definitions",
                  RRG list message, Message-Id:
                  808E6500-97B4-4107-8A2F-36BC913BE196@extremenetworks.com,
                  11 June 2007, .

   [Baran]        P. Baran, "On Distributed Communications Networks",
                  IEEE Transactions on Communications Systems Vol.
                  CS-12 No. 1, pp. 1-9, March 1964.

   [Chiappa]      J. N. Chiappa, "Endpoints and Endpoint Names: A
                  Proposed Enhancement to the Internet Architecture",
                  Personal draft (work in progress), 1999, .

   [Clark]        D. D. Clark, "The Design Philosophy of the DARPA
                  Internet Protocols", in 'Proceedings of the Symposium
                  on Communications Architectures and Protocols SIGCOMM
                  '88', pp. 106-114, 1988.

   [Heart]        F. E. Heart, R. E. Kahn, S. M. Ornstein,
                  W. R. Crowther, and D. C. Walden, "The Interface
                  Message Processor for the ARPA Computer Network",
                  Proceedings AFIPS 1970 SJCC, Vol. 36, pp. 551-567.

   [Bibliography] J. N. Chiappa (editor), "LISP (Location/Identity
                  Separation Protocol) Bibliography", Personal
                  site (work in progress), July 2013, .

   [Iannone]      L. Iannone and O. Bonaventure, "On the Cost of
                  Caching Locator/ID Mappings", in 'Proceedings of the
                  3rd International Conference on emerging Networking
                  EXperiments and Technologies (CoNEXT'07)', ACM, pp.
                  1-12, December 2007.

   [Kim]          J. Kim, L. Iannone, and A.
                  Feldmann, "A Deep Dive Into the LISP Cache and What
                  ISPs Should Know About It", in 'Proceedings of the
                  10th International IFIP TC 6 Conference on Networking
                  - Volume Part I (NETWORKING '11)', IFIP, pp. 367-378,
                  May 2011.

   [CorasCache]   F. Coras, A. Cabellos-Aparicio, and
                  J. Domingo-Pascual, "An Analytical Model for the LISP
                  Cache Size", in 'Proceedings of the 11th International
                  IFIP TC 6 Networking Conference: Part I', IFIP,
                  pp. 409-420, May 2012.

   [Jakab]        L. Jakab, A. Cabellos-Aparicio, F. Coras, D. Saucez,
                  and O. Bonaventure, "LISP-TREE: A DNS Hierarchy to
                  Support the LISP Mapping System", in 'IEEE Journal on
                  Selected Areas in Communications', Vol. 28, No. 8,
                  pp. 1332-1343, October 2010.

   [Saucez]       D. Saucez, L. Iannone, and B. Donnet, "A First
                  Measurement Look at the Deployment and Evolution of
                  the Locator/ID Separation Protocol", in 'ACM SIGCOMM
                  Computer Communication Review', Vol. 43 No. 2, pp.
                  37-43, April 2013.

   [CorasBGP]     F. Coras, D. Saucez, L. Jakab, A. Cabellos-Aparicio,
                  and J. Domingo-Pascual, "Implementing a BGP-free ISP
                  Core with LISP", in 'Proceedings of the Global
                  Communications Conference (GlobeCom)', IEEE, pp.
                  2772-2778, December 2012.

   [Saltzer]      J. H. Saltzer, D. P. Reed, and D. D. Clark,
                  "End-To-End Arguments in System Design", ACM TOCS,
                  Vol 2, No. 4, pp 277-288, November 1984.

Appendix A.  Glossary/Definition of Terms

   - Address
   - Locator
   - EID
   - RLOC
   - ITR
   - ETR
   - xTR
   - PITR
   - PETR
   - MR
   - MS
   - DFZ

Appendix B.  Other Appendices

B.1.  Old LISP 'Models'

   LISP, as initially conceived, had a number of potential operating
   modes, named 'models'.  Although they are now obsolete, one
   occasionally sees mention of them, so they are briefly described
   here.

   - LISP 1: EIDs all appear in the normal routing and forwarding
     tables of the network (i.e. they are 'routable'); this property
     is used to 'bootstrap' operation, by using it to load EID->RLOC
     mappings.  Packets were sent with the EID as the destination in
     the outer wrapper; when an ETR saw such a packet, it would send a
     Map-Reply to the source ITR, giving the full mapping.
   - LISP 1.5: Similar to LISP 1, but the routability of EIDs happens
     on a separate network.
   - LISP 2: EIDs are not routable; EID->RLOC mappings are available
     from the DNS.
   - LISP 3: EIDs are not routable, and have to be looked up in a new
     EID->RLOC mapping database (in the initial concept, a system
     using Distributed Hash Tables).  Two variants were possible: a
     'push' system, in which all mappings were distributed to all
     ITRs, and a 'pull' system, in which ITRs load the mappings they
     need, as needed.

B.2.  Possible Other Appendices

   -- Location/Identity Separation Brief History
   -- LISP History

Author's Address

   J. Noel Chiappa
   Yorktown Museum of Asian Art
   Yorktown, Virginia
   USA

   EMail: jnc@mit.edu