LISP Working Group                                            N. Chiappa
Internet-Draft                              Yorktown Museum of Asian Art
Intended status: Informational                              July 9, 2012
Expires: January 10, 2013

     An Introduction to the LISP Location-Identity Separation System
                 draft-chiappa-lisp-introduction-00.txt

Abstract

   LISP is an upgrade to the architecture of the IPvN internetworking
   system, one which separates location and identity (currently
   intermingled in IPvN addresses). This is a change which has been
   identified by the IRTF as a critically necessary evolutionary
   architectural step for the Internet. In LISP, nodes have both a
   'locator' (a name which says _where_ in the network's connectivity
   structure the node is) and an 'identifier' (a name which serves only
   to provide a persistent handle for the node). A node may have more
   than one locator, or its locator may change over time (e.g. if the
   node is mobile), but it keeps the same identifier.

   One of the chief novelties of LISP, compared to other proposals for
   the separation of location and identity, is its approach to deploying
   this upgrade. (In general, it is comparatively easy to conceive of
   new network designs, but much harder to devise approaches which will
   actually get deployed throughout the global network.) LISP aims to
   achieve the near-ubiquitous deployment necessary for maximum
   exploitation of an architectural upgrade by i) minimizing the amount
   of change needed (existing hosts and routers can operate unmodified);
   and ii) by providing significant benefits to early adopters.

   This document is an introduction to the entire LISP system, for those
   who are unfamiliar with it. It is intended to be both easy to
   follow, and also give a fairly detailed understanding of the entire
   system.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.
   This document may not be modified,
   and derivative works of it may not be created, except to format it
   for publication as an RFC or to translate it into languages other
   than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 10, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Background
   2.  Deployment Philosophy
     2.1.  Economics
     2.2.  Maximize Re-use of Existing Mechanism
     2.3.  Self-Deployment
   3.  LISP Overview
     3.1.  Basic Approach
     3.2.  Basic Functionality
     3.3.  Mapping from EIDs to RLOCs
     3.4.  Interworking With Non-LISP-Capable Endpoints
   4.  Initial Applications
     4.1.  Provider Independence
     4.2.  Multi-Homing
     4.3.  Traffic Engineering
     4.4.  Mobility
     4.5.  IP Version Reciprocal Traversal
     4.6.  Local Uses
   5.  Goals of LISP
     5.1.  Reduce DFZ Routing Table Size
     5.2.  Deployment of New Namespaces
     5.3.  {{What else?}}
   6.  Major Functional Subsystems
     6.1.  xTRs
     6.2.  Mapping System
       6.2.1.  Mapping System Organization
       6.2.2.  Interface to the Mapping System
       6.2.3.  Indexing Subsystem
   7.  Examples of Operation
     7.1.  An Ordinary Packet's Processing
     7.2.  A Mapping Cache Miss
   8.  Design Approach
     8.1.  Quick Implement-Test Loop
       8.1.1.  No Desk Fixes
       8.1.2.  Code Before Documentation
     8.2.  Only Fix Real Problems
     8.3.  No Theoretical Perfection
       8.3.1.  No Ocean Boiling
     8.4.  Just Enough Security
   9.  xTRs
     9.1.  When to Encapsulate
     9.2.  UDP Encapsulation Details
     9.3.  Header Control Channel
       9.3.1.  Echo Nonces
       9.3.2.  Instances
     9.4.  Fragmentation
     9.5.  Mapping Gleaning in ETRs
   10.  The Mapping System
     10.1.  The Indexing Subsystem
     10.2.  The Mapping System Interface
       10.2.1.  Map-Request Messages
       10.2.2.  Map-Reply Messages
       10.2.3.  Map-Register and Map-Notify Messages
       10.2.4.  Map-Referral Messages
     10.3.  Reliability via Replication
     10.4.  Extended Tools
     10.5.  Expected Performance
   11.  Deployment Mechanisms
     11.1.  Internetworking Mechanism
     11.2.  Proxy Devices
       11.2.1.  PITRs
       11.2.2.  PETRs
     11.3.  LISP-NAT
     11.4.  LISP and DFZ Routing
     11.5.  Use Through NAT Devices
       11.5.1.  First-Phase NAT Support
       11.5.2.  Second-Phase NAT Support
   12.  Current Improvements
     12.1.  Mapping Versioning
     12.2.  Replacement of ALT with DDT
       12.2.1.  Why Not Use DNS
     12.3.  {{Any others?}}
   13.  Fault Discovery/Handling
     13.1.  Handling Missing Mappings
     13.2.  Outdated Mappings
       13.2.1.  Outdated Mappings - Updated Mapping
       13.2.2.  Outdated Mappings - Wrong ETR
       13.2.3.  Outdated Mappings - No Longer an ETR
     13.3.  Erroneous mappings
     13.4.  Neighbour Liveness
     13.5.  Neighbour Reachability
   14.  Acknowledgments
   15.  IANA Considerations
   16.  Security Considerations
   17.  References
     17.1.  Normative References
     17.2.  Informative References
   Appendix A.  RefComment
   Appendix B.  Glossary/Definition of Terms
   Appendix C.  Other Appendices

1.  Background

   It has gradually been realized in the networking community that
   networks (especially large networks) should deal quite separately
   with the identity and location of a node (basically, 'who' a node is,
   and 'where' it is). At the moment, in both IPv4 and IPv6, addresses
   both indicate where the named device is and identify it for
   purposes of end-end communication.

   The distinction was more than a little hazy at first: the early
   Internet [RFC791], like the ARPANET before it [Heart] [NIC8246],
   co-mingled the two, although there was recognition in the early
   Internet work that there were two different things going on. [IEN19]

   This likely resulted not just from lack of insight, but also from the
   fact that extra mechanism is needed to support this separation (and
   in the early days there were no resources to spare), as well as the
   lack of need for it in the smaller networks of the time. (It is a
   truism of system design that small systems can get away with doing
   two things with one mechanism, in a way that usually will not work
   when the system gets much larger.)

   The ISO protocol architecture took steps in this direction [NSAP],
   but to the Internet community the necessity of a clear separation was
   definitively shown by Saltzer. [RFC1498] Later work expanded on
   Saltzer's, and tied his separation concepts into the fate-sharing
   concepts of Clark. [Clark], [Chiappa]

   The separation of location and identity is a step which has recently
   been identified by the IRTF as a critically necessary evolutionary
   architectural step for the Internet.
   However, it has taken some time
   for this requirement to be generally accepted by the Internet
   engineering community at large, although it seems that this may
   finally be happening.

   The LISP system for separation of location and identity resulted from
   the discussions of this topic at the Amsterdam IAB Routing and
   Addressing Workshop, which took place in October 2006. [RFC4984]

   A small group of like-minded personnel from various scattered
   locations within Cisco spontaneously formed immediately after that
   workshop, to work on an idea that came out of informal discussions at
   the workshop. The first Internet-Draft on LISP appeared in January
   2007, along with a LISP mailing list at the IETF. [LISP]

   Trial implementations started at that time, with initial trial
   deployments underway since June 2007; the results of early experience
   have been fed back into the design in a continuous, ongoing process
   over several years. LISP at this point represents a moderately
   mature system, having undergone a long organic series of changes and
   updates.

   LISP transitioned from an IRTF activity to an IETF WG in March 2009,
   and after numerous revisions, the basic specifications moved to
   becoming RFCs in 2012 (although work to expand and improve it
   continues, and undoubtedly will for a long time to come).

2.  Deployment Philosophy

   It may seem odd to cover 'deployment philosophy' at this point in
   such a document. However, the deployment philosophy was a major
   driver for much of the design (to some degree the architecture, and
   to a very large measure, the engineering). So, as such an important
   motivator, it is very desirable for readers to have this material in
   hand as they examine the design, so that design choices that may seem
   questionable at first glance can be better understood.

   Experience over the last several decades has shown that having a
   viable 'deployment model' for a new design is absolutely key to the
   success of that design. A new design may be fantastic - but if it
   can not or will not be successfully deployed (for whatever reason),
   it is useless. This absolute primacy of a viable deployment model is
   what has led to some painful compromises in the design.

   The extreme focus on a viable deployment scheme is one of the
   novelties of LISP.

2.1.  Economics

   The key factor in successful adoption, as shown by recent experience
   in the Internet - and little appreciated to begin with, some decades
   back - is economics: does the new design have benefits which outweigh
   its costs?

   More importantly, this balance needs to hold for early adopters -
   because if they do not receive benefits from their adoption, the
   sphere of earliest adopters will not expand, and it will never get to
   widespread deployment. One might have the world's best clean-slate
   design, but if it does not have a deployment plan which is
   economically feasible, it's just a mildly interesting piece of paper.

   This is particularly true of architectural enhancements, which are
   far less likely to be an addition which one can 'bolt onto the side'
   of existing mechanisms, and often offer their greatest benefits only
   when widely (or ubiquitously) deployed.

   Maximizing the benefit-to-cost ratio obviously has two aspects.
   First, on the cost side, by making the design as inexpensive as
   possible, which means in part making the deployment as easy as
   possible. Second, on the benefit side, by providing many new
   capabilities, which is best done not by loading the design up with
   lots of features or options (which adds complexity), but by making
   the addition powerful through deeper flexibility. We believe LISP
   has met both of these goals.

2.2.  Maximize Re-use of Existing Mechanism

   One key part of reducing the cost of a new design is to absolutely
   minimize the amount of change _required_ to existing, deployed,
   devices: the fewer devices that need to be changed, and the smaller
   the change to those that do, the lower the pain (and thus the greater
   the likelihood) of deployment.

   Designs which absolutely require 'forklift upgrades' to large amounts
   of existing gear are far less likely to succeed - because they have
   to have extremely large benefits to make their very substantial costs
   worthwhile.

   It is for this reason that LISP, in most cases, initially requires no
   changes to devices in the Internet (both hosts and routers), and also
   initially reuses, wherever possible, existing protocols (IPv4
   [RFC791] and IPv6 [RFC2460]). The 'initially' must be stressed -
   careful attention has also been paid to the long-term future (see
   [probably separate RFC to be]), and larger changes become feasible as
   deployment succeeds.

2.3.  Self-Deployment

   LISP has deliberately employed a rather different deployment model,
   which we might call 'self-deployment'; it does not require a huge
   push to get it deployed, rather, it is hoped that once people see it
   and realize they can easily make good use of it _on their own_ (i.e.
   without requiring adoption by others), it will 'deploy itself' (hence
   the name of the approach).

   One can liken the problem of deploying new systems in this way to
   rolling a snowball down a hill: unless one starts with a big enough
   initial snowball, and finds a hill of the right steepness (i.e. the
   right path for it to travel, once it starts moving), one's snowball
   is not going to go anywhere on its own. However, if one has picked
   one's spot correctly, little additional work is needed - just stand
   back and watch it go.

3.  LISP Overview

   LISP is an incrementally deployable architectural upgrade to the
   existing Internet infrastructure, one which provides separation of
   location and identity. The separation is usually not perfect, for
   reasons which are driven by the deployment philosophy (above), and
   explored in a little more detail elsewhere (in [Architecture],
   Section "Namespaces-EIDs-Residual").

   LISP separates the functions of location and identity, currently
   intermingled in IPvN addresses. (This document uses the meaning for
   'address' proposed in [Atkinson], i.e. a name with mixed location and
   identity semantics.)

3.1.  Basic Approach

   In LISP, nodes have both a 'locator' (a name which says _where_ in
   the network's connectivity structure the node is), called an 'RLOC',
   and an 'identifier' (a name which serves only to provide a persistent
   handle for the node), called an 'EID'. A node may have more than one
   RLOC, or its RLOC may change over time (e.g. if the node is mobile),
   but it keeps the same EID.

   Technically, one should probably say that ideally, the EID names the
   node (or rather, its end-end communication stack, if one wants to be
   as forward-looking as possible), and the RLOC(s) name interface(s).
   (At the moment, in reality, the situation is somewhat more complex,
   as will be explained elsewhere (in [Architecture], Section
   "Namespaces-EIDs-Residual").)

   This second distinction, of _what_ is named by the two classes of
   name, both enables some of the capabilities that LISP provides (e.g.
   the ability to seamlessly support multiple interfaces, to different
   networks), and is a further enhancement to the architecture. Failing
   to clearly recognize both interfaces and communication stacks as
   distinctly separate classes of things is another failing of the
   existing Internet architecture (again, one inherited from the
   previous generation of networking).

   A novelty in LISP is that it uses existing IPvN addresses (initially,
   at least) for both of these kinds of names, thereby minimizing the
   deployment cost, as well as providing the ability to easily interact
   with unmodified hosts and routers.

3.2.  Basic Functionality

   The basic operation of LISP, as it currently stands, is that
   LISP-augmented packet switches near the source and destination of
   packets intercept traffic, and 'enhance' the packets.

   The LISP device near the source looks up additional information about
   the destination, and then wraps the packet in an outer header, one
   which contains some of that additional information. The LISP device
   near the destination removes that header, leaving the original,
   unmodified, packet to be processed by the destination node.

   The LISP device near the source (the Ingress Tunnel Router, or 'ITR')
   uses the information originally in the packet about the identity of
   its ultimate destination, i.e. the destination address, which one can
   view as the EID of the ultimate destination. It uses the destination
   EID to look up the current location (the RLOC) of that EID.

   The lookup is performed through a 'mapping system', which is the
   heart of LISP: it is a distributed directory of bindings from EIDs to
   RLOCs. The destination RLOC will be (initially at least) the address
   of the LISP device near the destination (the Egress Tunnel Router, or
   'ETR').

   The ITR then generates a new outer header for the original packet,
   with that header containing the destination's RLOC as the wrapped
   packet's destination, and the ITR's own address (i.e. the RLOC of the
   original source) as the wrapped packet's source, and sends it off.

   When the packet gets to the ETR, that outer header is stripped off,
   and the original packet is forwarded to the original ultimate
   destination for normal processing.
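
   The wrap-and-unwrap steps just described can be illustrated with a
   minimal sketch. It is not taken from any LISP implementation: the
   class and function names, the flat dictionary standing in for the
   mapping system, and the addresses (drawn from documentation ranges)
   are all assumptions made for illustration, and the real UDP-based
   encapsulation is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Packet:
    """An ordinary host packet; its addresses are treated as EIDs."""
    src: str
    dst: str
    payload: bytes


@dataclass(frozen=True)
class Encapsulated:
    """Outer header carrying RLOCs, wrapped around the untouched inner packet."""
    outer_src: str  # RLOC of the ITR
    outer_dst: str  # RLOC of the ETR
    inner: Packet


def itr_encapsulate(pkt: Packet, itr_rloc: str,
                    mappings: Dict[str, str]) -> Optional[Encapsulated]:
    """Look up the destination EID and wrap the packet; None if no mapping.

    `mappings` is a hypothetical stand-in for the mapping system.
    """
    etr_rloc = mappings.get(pkt.dst)
    if etr_rloc is None:
        return None
    return Encapsulated(outer_src=itr_rloc, outer_dst=etr_rloc, inner=pkt)


def etr_decapsulate(enc: Encapsulated) -> Packet:
    """Strip the outer header and hand back the original, unmodified packet."""
    return enc.inner


# Hypothetical addresses, for illustration only.
mappings = {"192.0.2.10": "203.0.113.1"}
original = Packet(src="198.51.100.5", dst="192.0.2.10", payload=b"hello")
wrapped = itr_encapsulate(original, itr_rloc="203.0.113.99", mappings=mappings)
delivered = etr_decapsulate(wrapped)
```

   The inner packet is carried unchanged, which is why the destination
   host needs no modification in this model.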

   Return traffic is handled similarly, often (depending on the
   network's configuration) with the original ITR and ETR switching
   roles. The ETR and ITR functionality is usually co-located in a
   single device; these are normally denominated as 'xTRs'.

3.3.  Mapping from EIDs to RLOCs

   The mappings from EIDs to RLOCs are provided by a distributed (and
   potentially replicated) database, the mapping database, which is the
   heart of LISP.

   Mappings are requested on need, not (generally) pre-loaded; in other
   words, mappings are distributed via a 'pull' mechanism. Once
   obtained by an ITR, they are cached, to limit the amount of control
   traffic to a practicable level. (The mapping system will be
   discussed in more detail below, in Section 6.2 and Section 10.)

   Extensive studies, including large-scale simulations driven by
   lengthy recordings of actual traffic at several major sites, have
   been performed to verify that this 'pull and cache' approach is
   viable, in practical engineering terms. [Iannone] (This subject will
   be discussed in more detail in Section 10.5, below.)

3.4.  Interworking With Non-LISP-Capable Endpoints

   The capability for 'easy' interoperation between nodes using LISP,
   and existing non-LISP-using hosts or sites (often called 'legacy'
   hosts), is clearly crucial.

   To allow such interoperation, a number of mechanisms have been
   designed. This multiplicity is in part because different mechanisms
   have different advantages and disadvantages (so that no single
   mechanism is optimal for all cases), but also because with limited
   field experience, it is not clear which (if any) approach will be
   preferable.

   One approach uses proxy LISP devices, called PITRs (proxy ITRs) and
   PETRs (proxy ETRs), to provide LISP functionality during interaction
   with legacy sites. Another approach uses a device with combined LISP
   and NAT [RFC2993] functionality, named a LISP-NAT.
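
   Returning briefly to the 'pull and cache' behaviour of Section 3.3,
   the idea can be sketched as follows. This is an illustrative model
   only, under stated assumptions: the MapCache class, the resolver
   callback standing in for the mapping system, the longest-prefix
   choice among cached EID blocks, and the example addresses are all
   hypothetical. Cache expiry and mapping updates are deliberately left
   out.

```python
import ipaddress
from typing import Callable, Dict, List, Optional, Tuple

# A resolver takes an EID and returns (EID block, list of RLOCs), or None.
# It is a hypothetical stand-in for a query to the mapping system.
Resolver = Callable[[str], Optional[Tuple[str, List[str]]]]


class MapCache:
    """ITR-side cache of EID-block -> RLOC bindings, filled only on demand."""

    def __init__(self, resolver: Resolver) -> None:
        self._resolver = resolver
        self._entries: Dict[object, List[str]] = {}
        self.requests_sent = 0  # number of lookups that left the ITR

    def _cached(self, eid: str) -> Optional[List[str]]:
        addr = ipaddress.ip_address(eid)
        hits = [net for net in self._entries if addr in net]
        if not hits:
            return None
        # Prefer the most specific (longest-prefix) cached block.
        best = max(hits, key=lambda net: net.prefixlen)
        return self._entries[best]

    def lookup(self, eid: str) -> Optional[List[str]]:
        rlocs = self._cached(eid)
        if rlocs is not None:
            return rlocs            # cache hit: no control traffic
        self.requests_sent += 1     # cache miss: 'pull' the mapping
        answer = self._resolver(eid)
        if answer is None:
            return None
        block, rlocs = answer
        self._entries[ipaddress.ip_network(block)] = rlocs
        return rlocs


def toy_mapping_system(eid: str) -> Optional[Tuple[str, List[str]]]:
    """Hypothetical mapping database with a single EID block."""
    table = {"192.0.2.0/24": ["203.0.113.1"]}
    for block, rlocs in table.items():
        if ipaddress.ip_address(eid) in ipaddress.ip_network(block):
            return block, rlocs
    return None


cache = MapCache(toy_mapping_system)
first = cache.lookup("192.0.2.10")   # miss: one request is sent
second = cache.lookup("192.0.2.77")  # same block: served from the cache
```

   Because a mapping covers a whole EID block, the second lookup needs
   no further request even though its EID has not been seen before.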

4.  Initial Applications

   As previously mentioned, it is felt that LISP will provide even the
   earliest adopters with some useful capabilities, and that these
   capabilities will drive early LISP deployment.

   It is very important to note that even when used only for
   interoperation with existing unmodified hosts, use of LISP can still
   provide benefits for communications with the site which has deployed
   it - and, perhaps even more importantly, can do so _to both sides_.
   This characteristic acts to further enhance the utility for early
   adopters of deploying LISP, thereby improving the benefit-to-cost
   ratio needed to drive deployment, and increasing the
   'self-deployment' aspect of LISP.

   Note also that this section only lists likely _early_ applications
   and benefits - if and when deployment becomes more widespread, other
   aspects will come into play (as described in Section 5, "Goals of
   LISP", below).

4.1.  Provider Independence

   Provider independence (i.e. the ability to easily change one's
   Internet Service Provider) was probably the first place where the
   Internet engineering community finally really felt the utility of
   separating location and identity.

   The problem is simple: for the global routing to scale, addresses
   need to be aggregated (i.e. things which are close in the overall
   network's connectivity need to have closely related addresses), the
   so-called "provider aggregated" addresses. [RFC4116] However, if
   this principle is followed, it means that when an entity switches
   providers (i.e. it moves to a different 'place' in the network), it
   has to renumber, a painful undertaking. [RFC5887]

   In theory, it ought to be possible to update the DNS entries, and
   have everyone switch to the new addresses, but in practice, addresses
   are embedded in many places, such as firewall configurations at other
   sites.

   Having separate namespaces for location and identity greatly reduces
   the problems involved with renumbering; an organization which moves
   retains its EIDs (which are how most other parties refer to its
   nodes), but is allocated new RLOCs, and the mapping system can
   quickly provide the updated binding from the EIDs to the new RLOCs.

4.2.  Multi-Homing

   Multi-homing is another place where the value of separation of
   location and identity became apparent. There are several different
   sub-flavours of the multi-homing problem - e.g. depending on whether
   one wants open connections to keep working, etc - and other axes as
   well (e.g. site multi-homing versus host multi-homing).

   In particular, for the 'keep open connections up' case, without
   separation of location and identity, the only currently feasible
   approach is to use provider-independent addresses - which moves the
   problem into the global routing system, with attendant costs. This
   approach is also not really feasible for host multi-homing.

   Multi-homing was once somewhat esoteric, but a number of trends are
   driving an increased desirability, e.g. the wish to have multiple ISP
   links to a site for robustness; the desire to have mobile handsets
   connect up to multiple wireless systems; etc.

   Again, separation of location and identity, and the existence of a
   binding layer which can be updated fairly quickly, as provided by
   LISP, is a very useful tool for all variants of this issue.

4.3.  Traffic Engineering

   Traffic engineering (TE) [RFC3272], desirable though this capability
   is in a global network, is currently somewhat problematic to provide
   in the Internet. The problem, fundamentally, is that this capability
   was not visualized when the Internet was designed, so support for it
   is somewhat in the 'when the only tool you have is a hammer,
   everything looks like a nail' category.

   TE is, fundamentally, a routing issue. However, the current Internet
   routing architecture, which is basically the Baran design of fifty
   years ago [Baran] (a single large, distributed computation), is
   ill-suited to provide TE. The Internet seems a long way from
   adopting a more-advanced routing architecture, although the basic
   concepts for such have been known for some time. [RFC1992]

   Although the identity-location binding layer is thus a poor place,
   architecturally, to provide TE capabilities, it is still an
   improvement over the current routing tools available for this purpose
   (e.g. injection of more-specific routes into the global routing
   table). In addition, instead of the entire network incurring the
   costs (through the routing system overhead), when using a binding
   layer to provide TE, the overhead is limited to those who are
   actually communicating with that particular destination.

   LISP includes a number of features in the mapping system to support
   TE. (These are described in Section 6.2 below.)

4.4.  Mobility

   Mobility is yet another place where separation of location and
   identity is obviously a key part of a clean, efficient and
   high-functionality solution. Considerable experimentation has been
   completed on doing mobility with LISP.

4.5.  IP Version Reciprocal Traversal

   Note that LISP 'automagically' allows intermixing of various IP
   versions for packet carriage; IPv4 packets might well be carried in
   IPv6, or vice versa, depending on the network's configuration. This
   would allow an 'island' of operation of one type to be
   'automatically' tunneled over a stretch of infrastructure which only
   supports the other type.

   While the machinery of LISP may seem too heavyweight to be good for
   such a mundane use, this is not intended as a 'sole use' case for
   deployment of LISP.
Rather, it is something which, if LISP is being 542 deployed anyway (for its other advantages), is an added benefit that 543 one gets 'for free'. 545 4.6. Local Uses 547 LISP has a number of use cases which are within purely local 548 contexts, i.e. not in the larger Internet. These fall into two 549 categories: uses seen on the Internet (above), but here on a private 550 (and usually small scale) setting; and applications which do not have 551 a direct analog in the larger Internet, and which apply only to local 552 deployments. 554 Among the former are multi-homing, IP version traversal, and support 555 of VPN's for segmentation and multi-tenancy (i.e. a spatially 556 separated private VPN whose components are joined together using the 557 public Internet as a backbone). 559 Among the latter class, non-Internet applications which have no 560 analog on the Internet, are the following example applications: 561 virtual machine mobility in data centers; other non-IP EID types such 562 as local network MAC addresses, or application specific data. 564 5. Goals of LISP 566 As previously stated, broadly, the goal of LISP is to be an 567 practically deployable architectural upgrade to IPvN, to allow 568 separation of location and identity. But what is the value of that? 569 What will it allow us to do? 571 The answer to that obviously starts with the things mentioned in 572 "Initial Applications" (above), but there are other goals as well. 574 5.1. Reduce DFZ Routing Table Size 576 One of the main design drivers for LISP, as well as other location- 577 identity separation proposals, is to decrease the overhead of running 578 global routing system. In fact, it was this aspect that led the IRTF 579 Routing RG to conclude that separation of location and identity was a 580 key architectural underpinning needed to control the growth of the 581 global routing system. 
[RFC6115] 583 As noted above, many of the practical needs of Internet users are 584 today met with techniques that increase the load on the global 585 routing system (Provider Independent addresses for the provision of 586 provider independence, multihoming, etc; more-specific routes for TE; 587 etc.) Provision of these capabilities by a mechanism which does not 588 involve extra load on the global routing system is therefore very 589 desirable. 591 A number of factors, including the use of these techniques, has led 592 to a great increase in the fragmentation of the address space, at 593 least in terms of routing table entries. In particular, the growth 594 in demand for multi-homing has been forseen as driving a large 595 increase in the size of the global routing tables. 597 In addition, as the IPv4 address space becomes fuller and fuller, 598 there will be an inevitable tendency to find use in smaller and 599 smaller 'chunks' of that space. [RFC6127] This too would tend to 600 increase the size of the global routing table. 602 LISP, if successful and widely deployed, offers an opportunity to use 603 separation of location and identity to control the growth of the size 604 of the global routing table. (A full examination of this topic is 605 beyond the scope of this document - see YYYY.) 607 5.2. Deployment of New Namespaces 609 Once the mapping system is widely deployed and available, it should 610 make deployment of new namespaces (in the sense of new syntax, if not 611 new semantics) easier. E.g. if someone wishes in the future to 612 devise a system which uses native MPLS [RFC3031] for a data carriage 613 system joining together a large number of xTRs, it would easy enough 614 to arrange to have the mappings for destinations attached to those 615 xTRs abe some sort of MPLS-specific name. 617 5.3. {{What else?}} 619 6. 
Major Functional Subsystems 621 LISP has only two major functional subsystems - the collection of 622 LISP packet switches (the xTRs), and the mapping system, which 623 manages the mapping database. The purpose and operation of each is 624 described at a high level below, and then, later on, in a fair amount 625 of detail, in separate sections on each (Sections Section 9 and 626 Section 10, respectively). 628 6.1. xTRs 630 xTRs are fairly normal packet switches, enhanced with a little extra 631 functionality in both the data and control planes, to perform LISP 632 data and control functionality. 634 The data plane functions in ITRs include deciding which packets need 635 to be given LISP processing (since packets to non-LISP sites may be 636 sent 'vanilla'); looking up the mapping; encapsulating the packet; 637 and sending it to the ETR. This encapsulation is done using UDP 638 [RFC768] (for reasons to be explained below, in Section 9.2), along 639 with an additional IPvN header (to hold the asource and destination 640 RLOCs). To the extent that traffic engineering features are in use 641 for a particular EID, the ITRs implement them as well. 643 In the ETR, the data plane simply unwraps the packets, and forwards 644 the 'vanilla' packets to the ultimate destination. 646 Control plane functions in ITRs include: asking for {EID->RLOC} 647 mappings via Map-Request control messages; handling the returning 648 Map-Replies which contain the requested information; managing the 649 local cache of mappings; checking for the reachability and liveness 650 of their neighbour ETRs; and checking for outdated mappings and 651 requesting updates. 653 In the ETR, control plane functions include participating in the 654 neighbour reachability and liveness function (see Section 13.4); 655 interacting with the mapping indexing system (next section); and 656 answering requests for mappings (ditto). 658 6.2. 
Mapping System

The mapping database is a distributed, and potentially replicated,
database which holds bindings between EIDs (identity) and RLOCs
(location). To be exact, it contains bindings between EID blocks and
RLOCs (the block size is given explicitly, as part of the syntax).

Support for blocks is both for minimizing the administrative
configuration overhead, and for operational efficiency; e.g. when a
group of EIDs is behind a single xTR.

However, the block may be (and often is) as small as a single EID.
Since mappings are only loaded upon demand, if smaller blocks become
predominant, then the increased size of the overall database is far
less problematic than if the routing table came to be dominated by
such small entries.

A particular node may have more than one RLOC, or may change its
RLOC(s), while keeping its singular identity.

The binding contains not just the RLOC(s), but also (for each RLOC
for any given EID) priority and weight (to allow allocation of load
between several RLOCs at a given priority); this allows a certain
amount of traffic engineering to be accomplished with LISP.

6.2.1. Mapping System Organization

The mapping system is actually split into two major functional
subsystems. The actual bindings themselves are held by the ETRs, and
an ITR which needs a binding effectively gets it from the ETR.

This co-location of the authoritative version of the mappings, and
the forwarding functionality which it describes, is an instance of
fate-sharing. [Clark]

To find the appropriate ETR(s) to query for the mapping, the second
subsystem, an 'indexing system', itself also a distributed,
potentially replicated database, provides information on which ETR(s)
are authoritative sources of information about the bindings which are
available.

6.2.2.
Interface to the Mapping System

The client interface to the mapping system from an ITR's point of
view is not with the indexing system directly; rather, it is through
devices called Map Resolvers (MRs).

ITRs send request control messages (Map-Request packets) to an MR.
(This interface is probably the most important standardized interface
in LISP - it is the key to the entire system.) The MR uses the
indexing system to eventually forward the Map-Request to the
appropriate ETR. The ETR formulates a reply control message (a
Map-Reply packet), which is conveyed to the ITR. The details of the
indexing system, etc, are thus hidden from the 'ordinary' ITRs.

Similarly, the client interface to the indexing system from an ETR's
point of view is through devices called Map Servers (MSs - admittedly
a poorly chosen term, but it's too late to change it now).

ETRs send registration control messages (Map-Register packets) to an
MS, which makes the information about the mappings which the ETR
indicates it is authoritative for available to the indexing system.
The MS formulates a reply control message (the Map-Notify packet),
which confirms the registration, and is returned to the ETR. The
details of the indexing system are thus likewise hidden from the
'ordinary' ETRs.

6.2.3. Indexing Subsystem

The current indexing system is called the Delegated Database Tree
(DDT), which is very similar in operation to DNS. [DDT], [RFC1034]
However, unlike DNS, the actual mappings are not handled by DDT; DDT
merely identifies the ETRs which hold the mappings.

Again, extensive large-scale simulations, driven by lengthy
recordings of actual traffic at several major sites, have been
performed to verify the effectiveness of this particular indexing
system. [Jakab]

7. Examples of Operation

To aid in comprehension, a few examples are given of user packets
traversing the LISP system.
The first shows the processing of a typical user packet, i.e. what
the vast majority of user packets will see. The second shows what
happens when the first packet to a previously-unseen destination (at
a particular ITR) is to be processed by LISP.

7.1. An Ordinary Packet's Processing

This case follows the processing of a typical user packet (for
instance, a normal TCP data packet associated with an open HTTP
connection) as it makes its way from the source host to the
destination, along with the TCP acknowledgement packet which is
instigated by this packet.

7.2. A Mapping Cache Miss

8. Design Approach

Before describing LISP's mechanisms in more detail, it may be worth
saying a few words about the design philosophy used in creating them
- this may make clearer some of the engineering choices described
below.

8.1. Quick Implement-Test Loop

LISP uses a philosophy similar to that used in the early days of the
Internet, which is to just build it, then try it and see what
happens, and move forward from there based on what actually happens.
The concept has been to get something up and running, and then modify
it based on testing and experience.

8.1.1. No Desk Fixes

Don't try to foresee all issues from desk analysis. (Which is not to
say that one should not spend _some_ time on trying to foresee
problems, but be aware that it is a 'diminishing returns' process.)
The performance of very large, complex, physically distributed
systems is hard to predict, so rather than try (which would
necessarily be an incomplete exercise anyway, since testing would
inevitably be required eventually), at a certain point it's better
just to get on with it - and you will learn a host of other lessons
in the process, too.

8.1.2. Code Before Documentation

This is often a corollary to the kind of style described above.
While it probably would not have been possible in a large,
inhomogeneous group, the small, close nature of the LISP
implementation group did allow this approach.

8.2. Only Fix Real Problems

Don't worry about anything unless experience shows it's a real
problem. For instance, in the early stages, much was made out of the
problem of 'what does an ITR do if it gets a packet, but does not
(yet) have a mapping for the destination?'

In practice, simply dropping such packets has just not proved to be a
problem; the higher level protocol will retransmit them after a
timeout, and the mapping is usually in place by then. So spending a
lot of time (and its companion, energy) and mechanism (and _its_
extremely undesirable companion, complexity) on solving this
'problem' would not have been the most efficient approach, overall.

8.3. No Theoretical Perfection

Attack hard problems with a number of cheap and simple mechanisms
that co-operate and overlap. Trying to find a single mechanism that
is all of:

- Robust
- Efficient
- Fast

is often (usually?) a fool's errand. (The analogy to the aphorism
'Fast, Cheap, Good - Pick Any Two' should be obvious.) However, a
collection of simple and cheap mechanisms may effectively be able to
meet all of these goals (see, for example, ETR Liveness/Reachability,
Section 13.4).

Yes, this results in a system which is not provably correct in all
circumstances. The world, however, is full of such systems - and in
the real world, effective robustness is more likely to result from
having multiple, overlapping mechanisms than one single high-powered
(and inevitably complex) one. In the world of civil engineering,
redundancy is now accepted as a key design principle; the same should
be true of information systems. [Salvadori]

8.3.1. No Ocean Boiling

Don't boil the ocean to kill a single fish.
This is a combination of Sections 8.2 (Only Fix Real Problems) and
8.3 (No Theoretical Perfection); it just means that spending a lot of
complexity and/or overhead to deal with a problem that's not really a
problem is not good engineering.

8.4. Just Enough Security

How much security to have is a complex issue. It's relatively easy
for designers to add good security, but much harder to get the users
to jump through all the hoops necessary to use it. LISP has
therefore adopted a position where we add 'just enough' security.

The overall approach to security in LISP is fairly subtle, though,
and is covered in more detail elsewhere (in [Architecture], Section
"Security").

9. xTRs

As mentioned above (in Section 6.1), xTRs are the basic data-handling
devices in LISP. This section explores some advanced topics related
to xTRs.

Careful rules have been specified for both TTL and ECN [RFC3168] to
ensure that passage through xTRs does not interfere with the
operation of these mechanisms. In addition, care has been taken to
ensure that 'traceroute' works when xTRs are involved.

9.1. When to Encapsulate

An ITR knows that a destination is running LISP, and thus that it
should perform LISP processing on a packet (including potential
encapsulation) if it has an entry in its local mapping cache that
covers the destination EID.

Conversely, if the cache contains a 'negative' entry (indicating that
the ITR has previously attempted to find a mapping that covers this
EID, and it has been informed by the mapping system that no such
mapping exists), it knows the destination is not running LISP, and
the packet can be forwarded normally.
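
The cache-driven decision described above can be sketched roughly as
follows. This is an illustration only, not taken from any LISP
specification or implementation: the names MapCache and itr_decision
are hypothetical, an empty RLOC list is assumed to stand for a
'negative' entry, and longest-prefix matching over the cache is an
assumption of this sketch.

```python
import ipaddress


class MapCache:
    """Toy ITR mapping cache keyed by EID prefix (hypothetical structure)."""

    def __init__(self):
        # prefix -> list of RLOC strings; an empty list marks a negative entry
        self._entries = {}

    def add_mapping(self, eid_prefix, rlocs):
        self._entries[ipaddress.ip_network(eid_prefix)] = list(rlocs)

    def add_negative(self, eid_prefix):
        self._entries[ipaddress.ip_network(eid_prefix)] = []

    def lookup(self, dst):
        """Return the entry of the longest prefix covering dst, or None."""
        addr = ipaddress.ip_address(dst)
        best = None
        for prefix in self._entries:
            if addr.version == prefix.version and addr in prefix:
                if best is None or prefix.prefixlen > best.prefixlen:
                    best = prefix
        return None if best is None else self._entries[best]


def itr_decision(cache, dst):
    """Classify a packet as 'encapsulate', 'native', or 'miss'."""
    rlocs = cache.lookup(dst)
    if rlocs is None:
        return "miss"         # nothing cached yet: a Map-Request is needed
    if not rlocs:
        return "native"       # negative entry: destination is not a LISP site
    return "encapsulate"      # positive entry: tunnel towards one of the RLOCs
```

What an ITR does with the packet on a 'miss' (drop it, or hand it to
a default PITR) is a separate choice, discussed in Section 13.1.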
(The ITR cannot simply depend on the appearance, or non-appearance,
of the destination in the DFZ routing tables, as a way to tell if a
destination is a LISP site or not, because mechanisms to allow
interoperation of LISP sites and 'legacy' sites necessarily involve
advertising LISP sites' EIDs into the DFZ.)

9.2. UDP Encapsulation Details

The UDP encapsulation used by LISP for carrying traffic from ITR to
ETR, and many of the details of how it works, were all chosen for
very practical reasons.

Use of UDP (instead of, say, a LISP-specific protocol number) was
driven by the fact that many devices filter out 'unknown' protocols,
so adopting a non-UDP encapsulation would have made the initial
deployment of LISP harder - and our goal (see Section 2.1) was to
make the deployment as easy as possible.

The UDP source port in the encapsulated packet is a hash of the
original source and destination; this is because many ISPs use
multiple parallel paths (so-called 'Equal Cost Multi-Path'), and
load-share across them. Using such a hash in the source-port in the
outer header both allows LISP traffic to be load-shared, and also
ensures that packets from individual connections are delivered in
order (since most ISPs try to ensure that packets for a particular
{source, source port, destination, destination port} tuple flow along
a single path, and do not become disordered).

The UDP checksum is zero because the inner packet usually already has
an end-end checksum, and the outer checksum adds no value. [Saltzer]
In most existing hardware, computing such a checksum (and checking it
at the other end) would also present an intolerable load, for no
benefit.

9.3. Header Control Channel

LISP provides a multiplexed channel in the encapsulation header. It
is mostly (but not entirely) used for control purposes.
(See [Architecture], Section "Architecture-Piggyback" for a longer
discussion of the architectural implications of this.)

The general concept is that the header starts with an 8-bit 'flags'
field, and it also includes two data fields (one 24 bits, one 32),
the contents and meaning of which vary, depending on which flags are
set. This allows these fields to be 'multiplexed' among a number of
different low-duty-cycle functions, while minimizing the space
overhead of the LISP encapsulation header.

9.3.1. Echo Nonces

One important use is for a mechanism known as the Nonce Echo, which
is used as an efficient method for ITRs to check the reachability of
correspondent ETRs.

Basically, an ITR which wishes to ensure that an ETR is up, and
reachable, sends a nonce to that ETR, carried in the encapsulation
header; when that ETR (acting as an ITR) sends some other user data
packet back to the ITR (acting in turn as an ETR), that nonce is
carried in the header of that packet, allowing the original ITR to
confirm that its packets are reaching that ETR.

Note that lack of a response is not necessarily _proof_ that
something has gone wrong - but it strongly suggests that something
has, so other actions (e.g. a switch to an alternative ETR, if one is
listed; a direct probe; etc) are advised.

(See Section 13.5 for more about Echo Nonces.)

9.3.2. Instances

Another use of these header fields is for 'Instances' - basically,
support for VPNs across backbones. [RFC4026] Since there is only
one destination UDP port used for carriage of user data packets, and
the source port is used for multiplexing (above), there is no other
way to differentiate among different destination address namespaces
(which are often overlapped in VPNs).

9.4.
Fragmentation

Several mechanisms have been proposed for dealing with packets which
are too large to transit the path from a particular ITR to a given
ETR.

One, called the 'stateful' approach, keeps a per-ETR record of the
maximum size allowed, and sends an ICMP Too Big message to the
original source host when a packet which is too large is seen.

In the other, referred to as the 'stateless' approach, for IPv4
packets without the 'DF' bit set, too-large packets are fragmented,
and then the fragments are forwarded; all other packets are
discarded, and an ICMP Too Big message returned.

It is not clear at this point which approach is preferable.

9.5. Mapping Gleaning in ETRs

As an optimization to the mapping acquisition process, ETRs are
allowed to 'glean' mappings from incoming user data packets, and also
from incoming Map-Request control messages. This is not secure, and
so any such mapping must be 'verified' by sending a Map-Request to
get an authoritative mapping. (See further discussion of the
security implications of this in [Architecture], Section
"Security-xTRs".)

The value of gleaning is that most communications are two-way, and so
if host A is sending packets to host B (therefore needing B's
EID->RLOC mapping), very likely B will soon be sending packets back
to A (and thus needing A's EID->RLOC mapping). Without gleaning,
this would sometimes result in a delay, and the dropping of the first
return packet; this is felt to be very undesirable.

10. The Mapping System

RFC 1034 ("DNS Concepts and Facilities") has this to say about the
DNS name to IP address mapping system:

"The sheer size of the database and frequency of updates suggest
that it must be maintained in a distributed manner, with local
caching to improve performance.
Approaches that attempt to collect a consistent copy of the entire
database will become more and more expensive and difficult, and hence
should be avoided."

and this observation applies equally to the LISP mapping system.

As previously mentioned, the mapping system is split into an indexing
subsystem, which keeps track of where all the mappings are kept, and
the mappings themselves, the authoritative copies of which are always
held by ETRs.

10.1. The Indexing Subsystem

The indexing system in LISP is currently implemented by the DDT
system. LISP initially used (for ease of getting something
operational without having to write a lot of code) an indexing system
called ALT, which used BGP running over virtual tunnels. [ALT] This
proved to have a number of issues, and has now been superseded by
DDT.

In DDT, the EID namespace(s) are instantiated as a tree of DDT nodes.
Starting with the root node(s), which have 'responsibility' for the
entire namespace, portions of the namespace are delegated to child
nodes, in a recursive process extending through as many levels as are
needed. Eventually, leaf nodes in the DDT tree delegate namespace
blocks to ETRs.

MRs obtain information about delegations by interrogating DDT nodes,
and caching the results. This allows them, when passed a request for
a mapping by an ITR, to forward the mapping request to the
appropriate ETR (perhaps after loading some missing delegation
entries into their delegation cache).

10.2. The Mapping System Interface

As mentioned in Section 6.2.2, both of the interfaces to the mapping
system (from ITRs, and ETRs) are standardized, so that the more
numerous xTRs do not have to be modified when the mapping indexing
system is changed. This precaution has already allowed the mapping
system to be upgraded during LISP's evolution, when ALT was replaced
by DDT.
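
To make concrete what sits behind that interface, the delegation
walk of Section 10.1 might be sketched as below. This is a toy model
under stated assumptions, not the DDT protocol: DDTNode and find_etr
are hypothetical names, a leaf delegation is represented simply by an
ETR name string, the most specific matching delegation is assumed to
win, and the MR's delegation cache is omitted.

```python
import ipaddress


class DDTNode:
    """Toy DDT node holding delegations of EID blocks (hypothetical)."""

    def __init__(self, name):
        self.name = name
        # EID prefix -> child DDTNode, or an ETR name (str) at a leaf
        self.delegations = {}

    def delegate(self, prefix, child):
        self.delegations[ipaddress.ip_network(prefix)] = child

    def refer(self, eid):
        """Return the most specific delegation covering eid, or None."""
        addr = ipaddress.ip_address(eid)
        matches = [p for p in self.delegations if addr in p]
        if not matches:
            return None
        return self.delegations[max(matches, key=lambda p: p.prefixlen)]


def find_etr(root, eid):
    """Walk from the root towards the ETR for eid.

    Returns (etr_name_or_None, names_visited_in_order).
    """
    path = [root.name]
    current = root
    while True:
        nxt = current.refer(eid)
        if nxt is None:
            return None, path          # no delegation covers this EID
        if isinstance(nxt, DDTNode):
            path.append(nxt.name)      # follow the referral one level down
            current = nxt
        else:
            path.append(nxt)           # reached an ETR delegation
            return nxt, path
```

Because an ITR only ever talks to an MR, a walk like this can be
replaced by a different indexing scheme without the ITR noticing.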
This section describes the interfaces in a little more detail.

10.2.1. Map-Request Messages

The Map-Request message contains a number of fields, the two most
important of which are the requested EID block identifier (remember
that individual mappings may cover a block of EIDs, not just a single
EID), and the Address Family Identifier (AFI) for that EID block.
[AFI] The inclusion of the AFI allows the mapping system interface
(as embodied in these control packets) a great deal of flexibility.
(See [Architecture], Section "Namespaces" for more on this.)

Other important fields are the source EID (and its AFI), and one or
more RLOCs for the source EID, along with their AFIs. Multiple RLOCs
are included to ensure that at least one is in a form which will
allow the reply to be returned to the requesting ITR, and the source
EID is used for a variety of functions, including 'gleaning' (see
Section 9.5).

Finally, the message includes a long nonce, for simple, efficient
protection against off-path attackers (see [Architecture], Section
"Security-xTRs" for more), and a variety of other fields and control
flag bits.

10.2.2. Map-Reply Messages

The Map-Reply message looks similar, except it includes the mapping
entry for the requested EID(s), which contains one or more RLOCs and
their associated data. (Note that the reply may cover a larger block
of the EID namespace than the request; most requests will be for a
single EID, the one which prompted the query.)

For each RLOC in the entry, there is the RLOC, its AFI (of course),
priority and weight fields (see Section 6.2), and multicast priority
and weight fields.

10.2.3. Map-Register and Map-Notify Messages

The Map-Register message contains authentication information, and a
number of mapping records, each with an individual Time-To-Live
(TTL).
Each of the records contains an EID (potentially, a block of EIDs)
and its AFI, a version number for this mapping (see Section 12.1),
and a number of RLOCs and their AFIs.

Each RLOC entry also includes the same data as in the Map-Replies
(i.e. priority and weight); this is because in some circumstances it
is advantageous to allow the MS to proxy reply on the ETR's behalf to
Map-Request messages. [Mobility]

Map-Notify messages have the exact same contents as Map-Register
messages; they are purely acknowledgements.

10.2.4. Map-Referral Messages

Map-Referral messages look almost identical to Map-Reply messages
(which is felt to be an advantage by some people, although having a
more generic record-based format would probably be better in the long
run, as ample experience with DNS has shown), except that the RLOCs
potentially name either i) other DDT nodes (children in the
delegation tree), or ii) terminal MSs.

There are also optional authentication fields; see [Architecture],
Section "Security-Mappings" for more.

10.3. Reliability via Replication

Everywhere throughout the mapping system, robustness to operational
failures is obtained by replicating data in multiple instances of any
particular node (of whatever type). Map-Resolvers, Map-Servers, DDT
nodes, ETRs - all of them can be replicated, and the protocol
supports this replication.

There are generally no mechanisms specified yet to ensure coherence
between multiple copies of any particular data item, etc - this is
currently a manual responsibility. If and when LISP protocol
adoption proceeds, an automated layer to perform this functionality
can 'easily' be layered on top of the existing mechanisms.

10.4.
Extended Tools

In addition to the priority and weight data items in mappings, LISP
offers other tools to enhance functionality, particularly in the
traffic engineering area. One is 'source-specific mappings', i.e.
the ETR may return different mappings to the enquiring ITR, depending
on the identity of the ITR. This allows very fine-tuned traffic
engineering, far more powerful than routing-based TE.

10.5. Expected Performance

11. Deployment Mechanisms

This section discusses several deployment issues in more detail.
With LISP's heavy emphasis on practicality, much work has gone into
making sure it works well in the real-world environments most people
have to deal with.

11.1. Internetworking Mechanism

One aspect which has received a lot of attention is the set of
mechanisms previously referred to (in Section 3.4) to allow
interoperation of LISP sites with so-called 'legacy' sites which are
not running LISP (yet).

To briefly refresh what was said there, there are two main approaches
to such interworking: proxy nodes (PITRs and PETRs), and an
alternative mechanism using a device with combined NAT and LISP
functionality; these are described in more detail here.

11.2. Proxy Devices

PITRs (proxy ITRs) serve as ITRs for traffic _from_ legacy hosts to
nodes using LISP. PETRs (proxy ETRs) serve as ETRs for LISP traffic
_to_ legacy hosts (for cases where a LISP device cannot send packets
directly to such sites, without encapsulation).

Note that return traffic _to_ a legacy site from a LISP-using node
does not necessarily have to pass through an ITR/PETR pair - the
original packets can usually just be sent directly to the
destination. However, for some kinds of LISP operation (e.g. mobile
nodes), this is not possible; in these situations, the PETR is
needed.

11.2.1.
PITRs

PITRs (proxy ITRs) serve as ITRs for traffic _from_ legacy hosts to
nodes using LISP. To do that, they have to advertise into the
existing legacy backbone Internet routing the availability of
whatever ranges of EIDs (i.e. of nodes using LISP) they are proxying
for, so that legacy hosts will know where to send traffic to those
LISP nodes.

As mentioned previously (Section 9.1), an ITR at another LISP site
can avoid using a PITR (i.e. it can detect that a given destination
is not a legacy site, even if a PITR is advertising it into the DFZ)
by checking to see if a LISP mapping exists for that destination.

"Aggressive aggregation is performed to minimize the number of new
announced routes."

--- Impact on routing table
---- Expands, but a PITR may cover a bunch of RLOC routes
     with a single EID advertisement
---- You can do TE with LISP instead of BGP, that will help
     keep table sizes down, even in early stages

11.2.2. PETRs

PETRs (proxy ETRs) serve as ETRs for LISP traffic _to_ legacy hosts,
for cases where a LISP device cannot send packets to sites without
encapsulation. That typically happens for one of two reasons.

First, it will happen in places where some device is implementing
Unicast Reverse Path Forwarding (uRPF), to prevent a variety of
negative behaviour; originating packets with the source's EID in the
source address field will result in them being filtered out and
discarded.

Second, it will happen when a LISP site wishes to send packets to a
non-LISP site, and the path in between does not support the
particular IP protocol version used by the source along its entire
length. Use of a PETR on the other side of the 'gap' will allow the
LISP site's packet to 'hop over' the gap, by utilizing LISP's
built-in support for mixed protocol encapsulation.
PETRs are generally paired with specific ITRs, which have the
location of their PETRs configured into them. In other words, unlike
normal ETRs, PETRs do not have to register themselves in the mapping
database, on behalf of any legacy sites they serve.

Also, allowing an ITR to always send traffic leaving a site to a PETR
does avoid having to choose whether or not to encapsulate packets; it
can just always encapsulate packets, sending them to the PETR if it
has no specific mapping for the destination. However, this is not
advised: as mentioned, it is easy to tell if something is a legacy
destination.

11.3. LISP-NAT

A LISP-NAT device, as previously mentioned, combines LISP and NAT
functionality, in order to allow a LISP site which is internally
using addresses which cannot be globally routed to communicate with
non-LISP sites elsewhere in the Internet. (In other words, the
technique used by the PITR approach simply cannot be used in this
case.)

To do this, a LISP-NAT performs the usual NAT functionality: it
translates a host's source address(es) in packets passing through it
from an 'inner' value to an 'outer' value, and stores that
translation in a table, which it can use to similarly process
subsequent packets (both outgoing and incoming). [Interworking]

There are two main cases where this might apply:

- Sites using non-routable global addresses
- Sites using private addresses [RFC1918]

11.4. LISP and DFZ Routing

11.5. Use Through NAT Devices

Like them or not (and NAT devices have many egregious issues - some
inherent in the nature of the process of mapping addresses; others,
such as the brittleness due to non-replicated critical state, caused
by the way NATs were introduced, as stand-alone 'invisible' boxes),
NATs are both ubiquitous, and here to stay for a long time to come.
Thus, in the actual Internet of today, having any new mechanisms
function well in the presence of NATs (i.e. with LISP xTRs behind a
NAT device) is absolutely necessary. LISP has produced a variety of
mechanisms to do this.

11.5.1. First-Phase NAT Support

The first mechanism used by LISP to operate through a NAT device only
worked with some NATs, those which were configurable to allow inbound
packet traffic to reach a configured host.

A pair of new LISP control messages, LISP Echo-Request and
Echo-Reply, allowed the ETR to discover its temporary global address;
the Echo-Request was sent to the configured Map-Server, and it
replied with an Echo-Reply which included the source address from
which the Echo-Request was received (i.e. the public global address
assigned to the ETR by the NAT). The ETR could then insert that
address in any Map-Reply control messages which it sent to
correspondent ITRs.

The fact that this mechanism did not support all NATs, and also
required manual configuration of the NAT, meant that this was not a
good solution; in addition, since LISP expects all incoming data
traffic to be on a specific port, it was not possible to have
multiple ETRs behind a single NAT (which normally would have only one
global address to share, meaning port mapping would have to be used,
except that... )

11.5.2. Second-Phase NAT Support

For a more comprehensive approach to support of LISP xTR deployment
behind NAT devices, a fairly extensive supplement to LISP, LISP NAT
Traversal, has been designed. [NAT]

A new class of LISP device, the LISP Re-encapsulating Tunnel Router
(RTR), passes traffic through the NAT, both to and from the xTR.
(Inbound traffic has to go through the RTR as well, since otherwise
multiple xTRs could not operate behind a single NAT, for the
'specified port' reason in the section above.)
(Had the Map-Reply included a port number, this could have been
avoided - although of course it would be possible to define a new
RLOC type which included protocol and port, to allow other
encapsulation techniques.)

Two new LISP control messages (Info-Request and Info-Reply) allow an
xTR to detect if it is behind a NAT device, and also discover the
global IP address and UDP port assigned by the NAT to the xTR. A
modification to LISP Map-Register control messages allows the xTR to
initialize mapping state in the NAT, in order to use the RTR.

This mechanism addresses cases where the xTR is behind a NAT, but the
xTR's associated MS is on the public side of the NAT; this
limitation, that MSs must be in the 'public' part of the Internet,
seems reasonable.

12. Current Improvements

In line with the philosophies laid out in Section 8, LISP is
something of a moving target. This section discusses some of the
contemporaneous improvements being made to LISP.

12.1. Mapping Versioning

As mentioned, LISP has been under development for a considerable
time. One early addition to LISP (it is already part of the base
specification) is mapping versioning; i.e. the application of
identifying sequence numbers to different versions of a mapping.
[Versioning] This allows an ITR to easily discover when a cached
mapping has been updated by a more recent variant.

Version numbers are available in control messages (Map-Replies), but
the initial concept is that to limit control message overhead, the
versioning mechanism should primarily use the multiplexed user data
header control channel (see Section 9.3).
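
As a rough illustration of how an ITR might act on a version number
seen in the data header, consider the sketch below. It is
deliberately simplified and is not the mechanism defined in
[Versioning]: versions are assumed to be plain integers where larger
means newer (any wrap-around handling is ignored), and CachedMapping,
needs_refresh and on_data_packet are hypothetical names.

```python
class CachedMapping:
    """Toy cache entry held by an ITR (hypothetical structure)."""

    def __init__(self, eid_prefix, version, rlocs):
        self.eid_prefix = eid_prefix
        self.version = version
        self.rlocs = list(rlocs)


def needs_refresh(cached_version, seen_version):
    """True when the version seen on the wire is newer than the cached one.

    Assumption: plain integer comparison, with no wrap-around handling.
    """
    return seen_version > cached_version


def on_data_packet(entry, header_version, send_map_request):
    """React to the (optional) mapping version carried in a data header.

    send_map_request is a caller-supplied callback taking an EID prefix.
    Returns True if a refresh was requested.
    """
    if header_version is None:
        return False                      # header carried no version this time
    if needs_refresh(entry.version, header_version):
        send_map_request(entry.eid_prefix)
        return True
    return False
```

The cost on the common path is a single comparison per packet; a
control message is only sent when the cached copy looks stale.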
Versioning can operate in both directions: an ITR can advise an ETR
what version of a mapping it is currently using (so the ETR can
notify it if there is a more recent version), and ETRs can let ITRs
know what the current mapping version is (so the ITRs can request an
update, if their copy is outdated).

At the moment version numbers are manually assigned, and ordered.
Some felt that this was non-optimal, and that a better approach would
have been to have 'fingerprints' which were computed from the current
mapping data (i.e. a hash). It is not clear that the ordering buys
much (if anything), and the potential for mishaps with manually
configured version numbers is self-evident.

12.2. Replacement of ALT with DDT

As mentioned in Section 10.2, an interface is provided to allow
replacement of the indexing subsystem. LISP initially used an
indexing system called ALT. [ALT] ALT was relatively easy to
construct from existing tools (GRE, BGP, etc), but it had a number of
issues that made it unsuitable for large-scale use. ALT is now being
superseded by DDT.

As indicated previously (Section 10.5), the basic structure and
contents of DDT are identical to those of TREE, so the extensive
simulation work done for TREE applies equally to DDT, as do the
conclusions drawn about TREE's superiority to ALT. [Jakab]

{{Briefly synopsize results}}

12.2.1. Why Not Use DNS

One obvious question is 'Since DDT is so similar to DNS, why not
simply use DNS?' In particular, people are familiar with the DNS,
how to configure it, etc - would it not thus be preferable to use it?
To completely answer this would take more space than is available
here, but, briefly, there were two main reasons, and one lesser one.

First, the syntax of DNS names did not lend itself to looking up
names in other syntaxes (e.g. bit fields).
   This is a problem which has been previously encountered, e.g. in
   reverse address lookups.  [RFC5855]

   Second, since DNS is an existing system, the interfaces between it
   and the rest of LISP (had it been used as LISP's indexing subsystem)
   would not be 'tuneable' to be optimal for LISP.  For instance, if it
   were desired to have the leaf node in an indexing lookup directly
   contact the ETR on behalf of the node doing the lookup (thereby
   avoiding a round-trip delay), that would not be easy without
   modifications to the DNS code.  Obviously, with a 'custom' system,
   this issue does not arise.

   Finally, DNS security, while robust, is fairly complex.  Doing DDT
   offered an opportunity to provide a more nuanced security model.
   (See [Architecture], Section "Security-Phil" for more about this.)

12.3.  {{Any others?}}

13.  Fault Discovery/Handling

   LISP is, in terms of its functionality, a fairly simple system: the
   list of failure modes is thus not extensive.

13.1.  Handling Missing Mappings

   Handling of missing mappings is fairly simple: the ITR requests the
   mapping, and in the meantime can either discard traffic to the
   destination (as many ARP implementations do) [RFC826], or, if
   dropping the traffic is deemed undesirable, it can forward it via a
   'default PITR'.

   A number of PITRs advertise all EID blocks into the backbone routing,
   so that any ITRs which are temporarily missing a mapping can forward
   the traffic to these default PITRs via normal transmission methods,
   where it is encapsulated and passed on.

13.2.  Outdated Mappings

   If a mapping changes once an ITR has retrieved it, that may result in
   traffic to the EIDs covered by that mapping failing.  There are three
   cases to consider:

   -  when the ETR to which traffic is being sent is still a valid ETR
      for that EID, but the mapping has been updated (e.g.
      to change the priority of various ETRs)
   -  when the ETR to which traffic is being sent is still an ETR, but
      no longer a valid ETR for that EID
   -  when the ETR to which traffic is being sent is no longer an ETR

13.2.1.  Outdated Mappings - Updated Mapping

   A 'mapping versioning' system, whereby mappings have version numbers,
   and ITRs are notified when their mapping is out of date, has been
   added to detect this, and the ITR responds by refreshing the mapping.
   [Versioning]

13.2.2.  Outdated Mappings - Wrong ETR

13.2.3.  Outdated Mappings - No Longer an ETR

   If the destination of traffic from an ITR is no longer an ETR, one
   might get an ICMP Destination Unreachable error message.  However,
   one cannot depend on that.  The following mechanism will work,
   though.

   Since the destination is not an ETR, the echoing reachability
   detection mechanism (see Section 9.3.1) will detect a problem.  At
   that point, the backstop mechanism, Probing, will kick in.  Since the
   destination is still not an ETR, that will fail, too.

   At that point, traffic will be switched to a different ETR, or, if
   none are available, a re-map may be requested.

13.3.  Erroneous Mappings

13.4.  Neighbour Liveness

   The ITR, like all packet switches, needs to detect, and react, when
   its next-hop neighbour ceases operation.  As LISP traffic is
   effectively always unidirectional (from ITR to ETR), this could be
   somewhat problematic.

   However, solving a related problem, neighbour reachability (see
   below), subsumes handling this fault mode.

   Note that the two terms (liveness and reachability) are _not_
   synonymous (although a lot of LISP documentation confuses them).
   Liveness is a property of a node - it is either up and functioning,
   or it is not.  Reachability is only a property of a particular _pair_
   of nodes.
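
   The distinction just drawn can be made concrete with a small Python
   sketch.  It is purely illustrative: the node names and the functions
   is_live and is_reachable are hypothetical, and a real xTR learns
   these facts from the network rather than from static tables.
   Liveness is modelled as a set of nodes, reachability as a set of
   ordered (source, destination) pairs.

```python
# Liveness is a per-node property: a node is either up or it is not.
live_nodes = {"ITR-A", "ITR-B", "ETR-E"}

# Reachability is a property of an ordered (source, destination) pair.
# Here ITR-A can get packets to ETR-E, but ITR-B cannot (e.g. because of
# a routing problem or an access control setting in between).
reachable_pairs = {("ITR-A", "ETR-E"), ("ETR-E", "ITR-A")}


def is_live(node: str) -> bool:
    """True if the node itself is up and functioning."""
    return node in live_nodes


def is_reachable(src: str, dst: str) -> bool:
    """True if packets sent from src are currently received at dst."""
    return is_live(dst) and (src, dst) in reachable_pairs


print(is_live("ETR-E"))                # the node is up
print(is_reachable("ITR-A", "ETR-E"))  # and reachable from ITR-A
print(is_reachable("ITR-B", "ETR-E"))  # but not from ITR-B
```

   Note that a single liveness answer for ETR-E coexists with different
   reachability answers, depending on which source is asking.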

   If packets sent from a first node to a second are successfully
   received at the second, it is 'reachable' from the first.  However,
   the second node may at the very same time _not_ be reachable from
   some other node.  Reachability is _always_ an ordered pairwise
   property, i.e. a property of a specified ordered pair.

13.5.  Neighbour Reachability

   A more significant issue than whether a particular ETR E is up or not
   is, as mentioned above, that although ETR E may be up, attached to
   the network, etc, an issue in the network between a source ITR I and
   E may prevent traffic from I from getting to E.  (Perhaps a routing
   problem, or perhaps some sort of access control setting.)

   The one-way nature of LISP traffic makes this situation hard to
   detect in a way which is economical, robust and fast.  Two out of the
   three are usually not too hard, but all three at the same time - as
   is highly desirable for this particular issue - are harder.

   In line with the LISP design philosophy (Section 8.3), this problem
   is attacked not with a single mechanism (which would have a hard time
   meeting all those three goals simultaneously), but with a collection
   of simpler, cheaper mechanisms, which collectively will usually meet
   all three.

   These are: reliance on the underlying routing system (which can of
   course only reliably provide a negative reachability indication, not
   a positive one), the echo nonce (which depends on some return traffic
   from the destination xTR back to the source), and finally direct
   'pinging', in the case where no positive echo is returned.

   (The last is not the first choice, as due to the large fan-out
   expected of LISP devices, reliance on it as a sole mechanism would
   produce a fair amount of overhead.)

14.  Acknowledgments

   The author would like to thank the core LISP group for their
   willingness to allow him to add himself to their effort, and for
   their enthusiasm for whatever assistance he has been able to provide.
   He would also like to thank (in alphabetical order) Vina Ermagan,
   Vince Fuller, and Joel Halpern for their careful review of, and
   helpful suggestions for, this document.  Grateful thanks also to
   Darrel Lewis for his help with material on non-Internet uses of LISP,
   and to Vince Fuller for help with XML.  A final thanks is due to John
   Wroclawski for the author's organizational affiliation.

15.  IANA Considerations

   This document makes no request of the IANA.

16.  Security Considerations

17.  References

17.1.  Normative References

   [RFC768]        J. Postel, "User Datagram Protocol", RFC 768,
                   August 1980.

   [RFC791]        J. Postel, "Internet Protocol", RFC 791,
                   September 1981.

   [RFC1498]       J. H. Saltzer, "On the Naming and Binding of Network
                   Destinations", RFC 1498, (Originally published in:
                   "Local Computer Networks", edited by P. Ravasio et
                   al., North-Holland Publishing Company, Amsterdam,
                   1982, pp. 311-317.), August 1993.

   [RFC2460]       S. Deering and R. Hinden, "Internet Protocol, Version
                   6 (IPv6) Specification", RFC 2460, December 1998.

   [Architecture]  J.N. Chiappa, "The Architecture of the LISP Location-
                   Identity Separation System",
                   draft-chiappa-lisp-architecture-00.txt (work in
                   progress), July 2012.

   [DDT]           V. Fuller, D. Lewis, and D. Farinacci, "LISP
                   Delegated Database Tree",
                   draft-fuller-lisp-ddt-01.txt (work in progress),
                   March 2012.

   [Interworking]  D. Lewis, D. Meyer, D. Farinacci, and V. Fuller,
                   "Interworking LISP with IPv4 and IPv6",
                   draft-ietf-lisp-interworking-06.txt (work in
                   progress), March 2012.

   [LISP]          D. Farinacci, V. Fuller, D. Meyer, and D.
                   Lewis, "Locator/ID Separation Protocol (LISP)",
                   draft-ietf-lisp-23.txt (work in progress), May 2012.

   [Mobility]      D. Farinacci, V. Fuller, D. Lewis, and D. Meyer,
                   "LISP Mobility Architecture",
                   draft-meyer-lisp-mn-07.txt (work in progress),
                   April 2012.

   [NAT]           V. Ermagan, D. Farinacci, D. Lewis, J. Skriver,
                   F. Maino, and C. White, "NAT traversal for LISP",
                   draft-ermagan-lisp-nat-traversal-01.txt (work in
                   progress), March 2012.

   [Versioning]    L. Iannone, D. Saucez, and O. Bonaventure, "LISP
                   Mapping Versioning",
                   draft-ietf-lisp-map-versioning-09.txt (work in
                   progress), March 2012.

   [AFI]           IANA, "Address Family Indicators (AFIs)", Address
                   Family Numbers, January 2011.

17.2.  Informative References

   [NIC8246]       A. McKenzie and J. Postel, "Host-to-Host Protocol for
                   the ARPANET", NIC 8246, Network Information Center,
                   SRI International, Menlo Park, CA, October 1977.

   [IEN19]         J. F. Shoch, "Inter-Network Naming, Addressing, and
                   Routing", IEN (Internet Experiment Note) 19,
                   January 1978.

   [RFC826]        D. Plummer, "Ethernet Address Resolution Protocol",
                   RFC 826, November 1982.

   [RFC1034]       P. V. Mockapetris, "Domain Names - Concepts and
                   Facilities", RFC 1034, November 1987.

   [RFC1918]       Y. Rekhter, R. Moskowitz, D. Karrenberg,
                   G. J. de Groot, and E. Lear, "Address Allocation for
                   Private Internets", RFC 1918, February 1996.

   [RFC1992]       I. Castineyra, J. N. Chiappa, and M. Steenstrup, "The
                   Nimrod Routing Architecture", RFC 1992, August 1996.

   [RFC2993]       T. Hain, "Architectural Implications of NAT",
                   RFC 2993, November 2000.

   [RFC3031]       E. Rosen, A. Viswanathan, and R. Callon,
                   "Multiprotocol Label Switching Architecture",
                   RFC 3031, January 2001.

   [RFC3168]       K. Ramakrishnan, S. Floyd, and D. Black, "The
                   Addition of Explicit Congestion Notification (ECN) to
                   IP", RFC 3168, September 2001.

   [RFC3272]       D. Awduche, A.
                   Chiu, A. Elwalid, I. Widjaja, and X. Xiao, "Overview
                   and Principles of Internet Traffic Engineering",
                   RFC 3272, May 2002.

   [RFC4026]       L. Andersson and T. Madsen, "Provider Provisioned
                   Virtual Private Network (VPN) Terminology", RFC 4026,
                   March 2005.

   [RFC4116]       J. Abley, K. Lindqvist, E. Davies, B. Black, and
                   V. Gill, "IPv4 Multihoming Practices and
                   Limitations", RFC 4116, July 2005.

   [RFC4984]       D. Meyer, L. Zhang, and K. Fall, "Report from the IAB
                   Workshop on Routing and Addressing", RFC 4984,
                   September 2007.

   [RFC5855]       J. Abley and T. Manderson, "Nameservers for IPv4 and
                   IPv6 Reverse Zones", RFC 5855, May 2010.

   [RFC5887]       B. Carpenter, R. Atkinson, and H. Flinck,
                   "Renumbering Still Needs Work", RFC 5887, May 2010.

   [RFC6115]       T. Li, Ed., "Recommendation for a Routing
                   Architecture", RFC 6115, February 2011.

                   Perhaps the most ill-named RFC of all time; it
                   contains nothing that could truly be called a
                   'routing architecture'.

   [RFC6127]       J. Arkko and M. Townsley, "IPv4 Run-Out and IPv4-IPv6
                   Co-Existence Scenarios", RFC 6127, May 2011.

   [ALT]           D. Farinacci, V. Fuller, D. Meyer, and D. Lewis,
                   "LISP Alternative Topology (LISP-ALT)",
                   draft-ietf-lisp-alt-10.txt (work in progress),
                   December 2011.

   [NSAP]          International Organization for Standardization,
                   "Information Processing Systems - Open Systems
                   Interconnection - Basic Reference Model", ISO
                   Standard 7498:1984, 1984.

   [Atkinson]      R. Atkinson, "Revised draft proposed definitions",
                   RRG list message, Message-Id: 808E6500-97B4-4107-
                   8A2F-36BC913BE196@extremenetworks.com, 11 June 2007.

   [Baran]         P. Baran, "On Distributed Communications Networks",
                   IEEE Transactions on Communications Systems Vol.
                   CS-12 No. 1, pp. 1-9, March 1964.

   [Chiappa]       J. N.
                   Chiappa, "Endpoints and Endpoint Names: A Proposed
                   Enhancement to the Internet Architecture", Personal
                   draft (work in progress), 1999.

   [Clark]         D. D. Clark, "The Design Philosophy of the DARPA
                   Internet Protocols", in 'Proceedings of the Symposium
                   on Communications Architectures and Protocols SIGCOMM
                   '88', pp. 106-114, 1988.

   [Heart]         F. E. Heart, R. E. Kahn, S. M. Ornstein,
                   W. R. Crowther, and D. C. Walden, "The Interface
                   Message Processor for the ARPA Computer Network",
                   Proceedings AFIPS 1970 SJCC, Vol. 36, pp. 551-567.

   [Jakab]         L. Jakab, A. Cabellos-Aparicio, F. Coras, D. Saucez,
                   and O. Bonaventure, "LISP-TREE: A DNS Hierarchy to
                   Support the LISP Mapping System", in 'IEEE Journal on
                   Selected Areas in Communications', Vol. 28, No. 8,
                   pp. 1332-1343, October 2010.

   [Iannone]       L. Iannone and O. Bonaventure, "On the Cost of
                   Caching Locator/ID Mappings", in 'Proceedings of the
                   3rd International Conference on emerging Networking
                   EXperiments and Technologies (CoNEXT'07)', ACM, pp.
                   1-12, December 2007.

   [Saltzer]       J. H. Saltzer, D. P. Reed, and D. D. Clark, "End-To-
                   End Arguments in System Design", ACM TOCS, Vol 2, No.
                   4, pp 277-288, November 1984.

   [Salvadori]     M. Salvadori and M. Levy, "Why Buildings Fall Down",
                   W. W. Norton, New York, pg. 81, 1992.

Appendix A.  RefComment

Appendix B.  Glossary/Definition of Terms

   -  Address
   -  Locator
   -  EID
   -  RLOC
   -  ITR
   -  ETR
   -  xTR
   -  PITR
   -  PETR
   -  MR
   -  MS
   -  DFZ

Appendix C.  Other Appendices

   -- Location/Identity Separation Brief History
   -- LISP History
   -- Old models (LISP 1, LISP 1.5, etc)
   -- Different mapping distribution models (e.g. LISP-NERD)
   -- Different mapping indexing models (LISP-ALT forwarding/overlay
      model, LISP-TREE DNS-based, LISP-CONS)

Author's Address

   J. Noel Chiappa
   Yorktown Museum of Asian Art
   Yorktown, Virginia
   USA

   EMail: jnc@mit.edu