LISP Working Group                                         J. N. Chiappa
Internet-Draft                              Yorktown Museum of Asian Art
Intended status: Informational                             July 16, 2012
Expires: January 17, 2013

     An Introduction to the LISP Location-Identity Separation System
                   draft-chiappa-lisp-introduction-01

Abstract

   LISP is an upgrade to the architecture of the IPvN internetworking
   system, one which separates location and identity (currently
   intermingled in IPvN addresses). This is a change which has been
   identified by the IRTF as a critically necessary evolutionary
   architectural step for the Internet. In LISP, nodes have both a
   'locator' (a name which says _where_ in the network's connectivity
   structure the node is) and an 'identifier' (a name which serves only
   to provide a persistent handle for the node). A node may have more
   than one locator, or its locator may change over time (e.g. if the
   node is mobile), but it keeps the same identifier.

   One of the chief novelties of LISP, compared to other proposals for
   the separation of location and identity, is its approach to deploying
   this upgrade. (In general, it is comparatively easy to conceive of
   new network designs, but much harder to devise approaches which will
   actually get deployed throughout the global network.)
   LISP aims to achieve the near-ubiquitous deployment necessary for
   maximum exploitation of an architectural upgrade by i) minimizing
   the amount of change needed (existing hosts and routers can operate
   unmodified); and ii) providing significant benefits to early
   adopters.

   This document is an introduction to the entire LISP system, for those
   who are unfamiliar with it. It is intended both to be easy to follow
   and to give a fairly detailed understanding of the entire system.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79. This document may not be modified,
   and derivative works of it may not be created, except to format it
   for publication as an RFC or to translate it into languages other
   than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 17, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the Trust
   Legal Provisions and are provided without warranty as described in
   the Simplified BSD License.

Table of Contents

   1. Background
   2. Deployment Philosophy
     2.1. Economics
     2.2. Maximize Re-use of Existing Mechanism
     2.3. Self-Deployment
   3. LISP Overview
     3.1. Basic Approach
     3.2. Basic Functionality
     3.3. Mapping from EIDs to RLOCs
     3.4. Interworking With Non-LISP-Capable Endpoints
   4. Initial Applications
     4.1. Provider Independence
     4.2. Multi-Homing
     4.3. Traffic Engineering
     4.4. Mobility
     4.5. IP Version Reciprocal Traversal
     4.6. Local Uses
   5. Major Functional Subsystems
     5.1. xTRs
     5.2. Mapping System
       5.2.1. Mapping System Organization
       5.2.2. Interface to the Mapping System
       5.2.3. Indexing Subsystem
   6. Examples of Operation
     6.1. An Ordinary Packet's Processing
     6.2. A Mapping Cache Miss
   7. Design Approach
     7.1. Quick Implement-Test Loop
       7.1.1. No Desk Fixes
       7.1.2. Code Before Documentation
     7.2. Only Fix Real Problems
     7.3. No Theoretical Perfection
       7.3.1. No Ocean Boiling
     7.4. Just Enough Security
   8. xTRs
     8.1. When to Encapsulate
     8.2. UDP Encapsulation Details
     8.3. Header Control Channel
       8.3.1. Echo Nonces
       8.3.2. Instances
     8.4. Fragmentation
     8.5. Mapping Gleaning in ETRs
   9. The Mapping System
     9.1. The Indexing Subsystem
     9.2. The Mapping System Interface
       9.2.1. Map-Request Messages
       9.2.2. Map-Reply Messages
       9.2.3. Map-Register and Map-Notify Messages
       9.2.4. Map-Referral Messages
     9.3. Reliability via Replication
     9.4. Extended Tools
     9.5. Expected Performance
   10. Deployment Mechanisms
     10.1. Internetworking Mechanism
     10.2. Proxy Devices
       10.2.1. PITRs
       10.2.2. PETRs
     10.3. LISP-NAT
     10.4. LISP and DFZ Routing
     10.5. Use Through NAT Devices
       10.5.1. First-Phase NAT Support
       10.5.2. Second-Phase NAT Support
   11. Current Improvements
     11.1. Mapping Versioning
     11.2. Replacement of ALT with DDT
       11.2.1. Why Not Use DNS
     11.3. Mobile Device Support
     11.4. Multicast Support
     11.5. {{Any others?}}
   12. Fault Discovery/Handling
     12.1. Handling Missing Mappings
     12.2. Outdated Mappings
       12.2.1. Outdated Mappings - Updated Mapping
       12.2.2. Outdated Mappings - Wrong ETR
       12.2.3. Outdated Mappings - No Longer an ETR
     12.3. Erroneous mappings
     12.4. Neighbour Liveness
     12.5. Neighbour Reachability
   13. Acknowledgments
   14. IANA Considerations
   15. Security Considerations
   16. References
     16.1. Normative References
     16.2. Informative References
   Appendix A. Glossary/Definition of Terms
   Appendix B. Other Appendices

1. Background

   It has gradually been realized in the networking community that
   networks (especially large networks) should deal quite separately
   with the identity and location of a node (basically, 'who' a node is,
   and 'where' it is). At the moment, in both IPv4 and IPv6, addresses
   both indicate where the named device is and identify it for purposes
   of end-end communication.

   The distinction was more than a little hazy at first: the early
   Internet [RFC791], like the ARPANET before it [Heart] [NIC8246], co-
   mingled the two, although there was recognition in the early Internet
   work that there were two different things going on. [IEN19]

   This likely resulted not just from lack of insight, but also from the
   fact that extra mechanism is needed to support this separation (and
   in the early days there were no resources to spare), as well as the
   lack of need for it in the smaller networks of the time. (It is a
   truism of
(It is a truism of 180 system design that small systems can get away with doing two things 181 with one mechanism, in a way that usually will not work when the 182 system gets much larger.) 184 The ISO protocol architecture took steps in this direction [NSAP], 185 but to the Internet community the necessity of a clear separation was 186 definitively shown by Saltzer. [RFC1498] Later work expanded on 187 Saltzer's, and tied his separation concepts into the fate-sharing 188 concepts of Clark. [Clark], [Chiappa] 190 The separation of location and identity is a step which has recently 191 been identified by the IRTF as a critically necessary evolutionary 192 architectural step for the Internet. However, it has taken some time 193 for this requirement to be generally accepted by the Internet 194 engineering community at large, although it seems that this may 195 finally be happening. 197 The LISP system for separation of location and identity resulted from 198 the discussions of this topic at the Amsterdam IAB Routing and 199 Addressing Workshop, which took place in October 2006. [RFC4984] 201 A small group of like-minded personnel from various scattered 202 locations within Cisco, spontaneously formed immediately after that 203 workshop, to work on an idea that came out of informal discussions at 204 the workshop. The first Internet-Draft on LISP appeared in January, 205 2007, along with a LISP mailing list at the IETF. [LISP] 207 Trial implementations started at that time, with initial trial 208 deployments underway since June 2007; the results of early experience 209 have been fed back into the design in a continuous, ongoing process 210 over several years. LISP at this point represents a moderately 211 mature system, having undergone a long organic series of changes and 212 updates. 

   LISP transitioned from an IRTF activity to an IETF WG in March 2009,
   and after numerous revisions, the basic specifications moved to
   becoming RFCs in 2012 (although work to expand and improve it
   continues, and undoubtedly will for a long time to come).

2. Deployment Philosophy

   It may seem odd to cover 'deployment philosophy' at this point in
   such a document. However, the deployment philosophy was a major
   driver for much of the design (to some degree the architecture, and
   to a very large measure, the engineering). So, as such an important
   motivator, it is very desirable for readers to have this material in
   hand as they examine the design, so that design choices that may seem
   questionable at first glance can be better understood.

   Experience over the last several decades has shown that having a
   viable 'deployment model' for a new design is absolutely key to the
   success of that design. A new design may be fantastic - but if it
   cannot or will not be successfully deployed (for whatever reason),
   it is useless. This absolute primacy of a viable deployment model is
   what has led to some painful compromises in the design.

   The extreme focus on a viable deployment scheme is one of the
   novelties of LISP.

2.1. Economics

   The key factor in successful adoption, as shown by recent experience
   in the Internet - and little appreciated to begin with, some decades
   back - is economics: does the new design have benefits which outweigh
   its costs?

   More importantly, this balance needs to hold for early adopters -
   because if they do not receive benefits from their adoption, the
   sphere of earliest adopters will not expand, and the design will
   never get to widespread deployment. One might have the world's best
   clean-slate design, but if it does not have a deployment plan which
   is economically feasible, it's just a mildly interesting piece of
   paper.

   This is particularly true of architectural enhancements, which are
   far less likely to be an addition which one can 'bolt onto the side'
   of existing mechanisms, and often offer their greatest benefits only
   when widely (or ubiquitously) deployed.

   Improving the balance of benefit against cost obviously has two
   aspects. The first, on the cost side, is making the design as
   inexpensive as possible, which means in part making the deployment as
   easy as possible. The second, on the benefit side, is providing many
   new capabilities, which is best done not by loading the design up
   with lots of features or options (which adds complexity), but by
   making the addition powerful through deeper flexibility. We believe
   LISP has met both of these goals.

2.2. Maximize Re-use of Existing Mechanism

   One key part of reducing the cost of a new design is to absolutely
   minimize the amount of change _required_ to existing, deployed,
   devices: the fewer devices that need to be changed, and the smaller
   the change to those that do, the lower the pain (and thus the greater
   the likelihood) of deployment.

   Designs which absolutely require 'forklift upgrades' to large amounts
   of existing gear are far less likely to succeed - because they have
   to have extremely large benefits to make their very substantial costs
   worthwhile.

   It is for this reason that LISP, in most cases, initially requires no
   changes to devices in the Internet (both hosts and routers), and also
   initially reuses, wherever possible, existing protocols (IPv4
   [RFC791] and IPv6 [RFC2460]). The 'initially' must be stressed -
   careful attention has also long been paid to the long-term future
   (see [Future]), and larger changes become feasible as deployment
   succeeds.

2.3. Self-Deployment

   LISP has deliberately employed a rather different deployment model,
   which we might call 'self-deployment'; it does not require a huge
   push to get it deployed; rather, it is hoped that once people see it
   and realize they can easily make good use of it _on their own_ (i.e.
   without requiring adoption by others), it will 'deploy itself' (hence
   the name of the approach).

   One can liken the problem of deploying new systems in this way to
   rolling a snowball down a hill: unless one starts with a big enough
   initial snowball, and finds a hill of the right steepness (i.e. the
   right path for it to travel, once it starts moving), one's snowball
   is not going to go anywhere on its own. However, if one has picked
   one's spot correctly, little additional work is needed - just stand
   back and watch it go.

3. LISP Overview

   LISP is an incrementally deployable architectural upgrade to the
   existing Internet infrastructure, one which provides separation of
   location and identity. The separation is usually not perfect, for
   reasons which are driven by the deployment philosophy (above), and
   explored in a little more detail elsewhere (in [Architecture],
   Section "Namespaces-EIDs-Residual").

   LISP separates the functions of location and identity, currently
   intermingled in IPvN addresses. (This document uses the meaning for
   'address' proposed in [Atkinson], i.e. a name with mixed location and
   identity semantics.)

3.1. Basic Approach

   In LISP, nodes have both a 'locator' (a name which says _where_ in
   the network's connectivity structure the node is), called an 'RLOC',
   and an 'identifier' (a name which serves only to provide a persistent
   handle for the node), called an 'EID'. A node may have more than one
   RLOC, or its RLOC may change over time (e.g. if the node is mobile),
   but it keeps the same EID.
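
   The relationship just described can be sketched in a few lines of
   Python. This is only a minimal illustration, not anything defined
   by LISP itself: the Node class and its attach and detach methods are
   hypothetical names, the addresses are taken from the documentation
   ranges, and the model ignores the imperfections of the separation
   noted above.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address


@dataclass
class Node:
    """Hypothetical model: one persistent EID, a changeable set of RLOCs."""
    eid: object                      # the identifier; never changes
    rlocs: set = field(default_factory=set)

    def attach(self, rloc: str) -> None:
        """The node becomes reachable at an additional location."""
        self.rlocs.add(ip_address(rloc))

    def detach(self, rloc: str) -> None:
        """The node is no longer reachable at this location."""
        self.rlocs.discard(ip_address(rloc))


node = Node(eid=ip_address("192.0.2.10"))
node.attach("198.51.100.1")   # first point of attachment
node.attach("2001:db8::1")    # a second RLOC, e.g. multi-homing
node.detach("198.51.100.1")   # the node moves; one RLOC goes away
```

   Throughout these changes the value of node.eid stays the same; only
   the set of RLOCs varies.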

   Technically, one should probably say that ideally, the EID names the
   node (or rather, its end-end communication stack, if one wants to be
   as forward-looking as possible), and the RLOC(s) name interface(s).
   (At the moment, in reality, the situation is somewhat more complex,
   as will be explained elsewhere (in [Architecture], Section
   "Namespaces-EIDs-Residual").)

   This second distinction, of _what_ is named by the two classes of
   name, is both necessary to enable some of the capabilities that LISP
   provides (e.g. the ability to seamlessly support multiple interfaces,
   to different networks), and a further enhancement to the
   architecture. Failing to clearly recognize both interfaces and
   communication stacks as distinctly separate classes of things is
   another failing of the existing Internet architecture (again, one
   inherited from the previous generation of networking).

   A novelty in LISP is that it uses existing IPvN addresses (initially,
   at least) for both of these kinds of names, thereby minimizing the
   deployment cost, as well as providing the ability to easily interact
   with unmodified hosts and routers.

3.2. Basic Functionality

   The basic operation of LISP, as it currently stands, is that LISP-
   augmented packet switches near the source and destination of packets
   intercept traffic, and 'enhance' the packets.

   The LISP device near the source looks up additional information about
   the destination, and then wraps the packet in an outer header, one
   which contains some of that additional information. The LISP device
   near the destination removes that header, leaving the original,
   unmodified, packet to be processed by the destination node.

   The LISP device near the source (the Ingress Tunnel Router, or 'ITR')
   uses the information originally in the packet about the identity of
   its ultimate destination, i.e.
   the destination address, which one can view as the EID of the
   ultimate destination. It uses the destination EID to look up the
   current location (the RLOC) of that EID.

   The lookup is performed through a 'mapping system', which is the
   heart of LISP: it is a distributed directory of bindings from EIDs to
   RLOCs. The destination RLOC will be (initially at least) the address
   of the LISP device near the destination (the Egress Tunnel Router, or
   'ETR').

   The ITR then generates a new outer header for the original packet,
   with that header containing the destination's RLOC as the wrapped
   packet's destination, and the ITR's own address (i.e. the RLOC of the
   original source) as the wrapped packet's source, and sends it off.

   When the packet gets to the ETR, that outer header is stripped off,
   and the original packet is forwarded to the original ultimate
   destination for normal processing.

   Return traffic is handled similarly, often (depending on the
   network's configuration) with the original ITR and ETR switching
   roles. The ETR and ITR functionality is usually co-located in a
   single device; such devices are normally called 'xTRs'.

3.3. Mapping from EIDs to RLOCs

   The mappings from EIDs to RLOCs are provided by a distributed (and
   potentially replicated) database, the mapping database, which is the
   heart of LISP.

   Mappings are requested on need, not (generally) pre-loaded; in other
   words, mappings are distributed via a 'pull' mechanism. Once
   obtained by an ITR, they are cached, to limit the amount of control
   traffic to a practicable level. (The mapping system will be
   discussed in more detail below, in Section 5.2 and Section 9.)

   Extensive studies, including large-scale simulations driven by
   lengthy recordings of actual traffic at several major sites, have
   been performed to verify that this 'pull and cache' approach is
   viable, in practical engineering terms.
   [Iannone] (This subject will be discussed in more detail in Section
   9.5, below.)

3.4. Interworking With Non-LISP-Capable Endpoints

   The capability for 'easy' interoperation between nodes using LISP,
   and existing non-LISP-using hosts or sites (often called 'legacy'
   hosts), is clearly crucial.

   To allow such interoperation, a number of mechanisms have been
   designed. This multiplicity is in part because different mechanisms
   have different advantages and disadvantages (so that no single
   mechanism is optimal for all cases), but also because with limited
   field experience, it is not clear which (if any) approach will be
   preferable.

   One approach uses proxy LISP devices, called PITRs (proxy ITRs) and
   PETRs (proxy ETRs), to provide LISP functionality during interaction
   with legacy sites. Another approach uses a device with combined LISP
   and NAT ([RFC1631]) functionality, named a LISP-NAT.

4. Initial Applications

   As previously mentioned, it is felt that LISP will provide even the
   earliest adopters with some useful capabilities, and that these
   capabilities will drive early LISP deployment.

   It is very important to note that even when used only for
   interoperation with existing unmodified hosts, use of LISP can still
   provide benefits for communications with the site which has deployed
   it - and, perhaps even more importantly, can do so _to both sides_.
   This characteristic acts to further enhance the utility for early
   adopters of deploying LISP, thereby improving the cost/benefit ratio
   needed to drive deployment, and increasing the 'self-deployment'
   aspect of LISP.

   Note also that this section only lists likely _early_ applications
   and benefits - if and when deployment becomes more widespread, other
   aspects will come into play (as described in [Architecture], in the
   "Goals of LISP" section).

4.1. Provider Independence

   Provider independence (i.e.
   the ability to easily change one's Internet Service Provider) was
   probably the first place where the Internet engineering community
   finally really felt the utility of separating location and identity.

   The problem is simple: for the global routing to scale, addresses
   need to be aggregated (i.e. things which are close in the overall
   network's connectivity need to have closely related addresses), the
   so-called "provider aggregated" addresses. [RFC4116] However, if
   this principle is followed, it means that when an entity switches
   providers (i.e. it moves to a different 'place' in the network), it
   has to renumber, a painful undertaking. [RFC5887]

   In theory, it ought to be possible to update the DNS entries, and
   have everyone switch to the new addresses, but in practice, addresses
   are embedded in many places, such as firewall configurations at other
   sites.

   Having separate namespaces for location and identity greatly reduces
   the problems involved with renumbering; an organization which moves
   retains its EIDs (which are how most other parties refer to its
   nodes), but is allocated new RLOCs, and the mapping system can
   quickly provide the updated binding from the EIDs to the new RLOCs.

4.2. Multi-Homing

   Multi-homing is another place where the value of separation of
   location and identity became apparent. There are several different
   sub-flavours of the multi-homing problem - e.g. depending on whether
   one wants open connections to keep working, etc - and other axes as
   well (e.g. site multi-homing versus host multi-homing).

   In particular, for the 'keep open connections up' case, without
   separation of location and identity, the only currently feasible
   approach is to use provider-independent addresses - which moves the
   problem into the global routing system, with attendant costs. This
   approach is also not really feasible for host multi-homing.
483 Multi-homing was once somewhat esoteric, but a number of trends are 484 driving an increased desirability, e.g. the wish to have multiple ISP 485 links to a site for robustness; the desire to have mobile handsets 486 connect up to multiple wireless systems; etc. 488 Again, separation of location and identity, and the existince of a 489 binding layer which can be updated fairly quickly, as provided by 490 LISP, is a very useful tool for all variants of this issue. 492 4.3. Traffic Engineering 494 Traffic engineering (TE) [RFC3272], desirable though this capability 495 is in a global network, is currently somewhat problematic to provide 496 in the Internet. The problem, fundamentally, is that this capability 497 was not visualized when the Internet was designed, so support for it 498 is somewhat in the 'when the only tool you have is a hammer, 499 everything looks like nail' category. 501 TE is, fundamentally, a routing issue. However, the current Internet 502 routing architecture, which is basically the Baran design of fifty 503 years ago [Baran] (a single large, distributed computationa), is ill- 504 suited to provide TE. The Internet seems a long way from adopting a 505 more-advanced routing architecture, although the basic concepts for 506 such have been known for some time. [RFC1992] 508 Although the identity-location binding layer is thus a poor place, 509 architecturally, to provide TE capabilities, it is still an 510 improvement over the current routing tools available for this purpose 511 (e.g. injection of more-specific routes into the global routing 512 table). In addition, instead of the entire network incurring the 513 costs (through the routing system overhead), when using a binding 514 layer to provide TE, the overhead is limited to those who are 515 actually communicating with that particular destination. 517 LISP includes a number of features in the mapping system to support 518 TE. (Described in Section 5.2 below.) 520 4.4. 
Mobility 522 Mobility is yet another place where separation of location and 523 identity is obviously a key part of a clean, efficient and high- 524 functionality solution. Considerable experimentation has been 525 completed on doing mobility with LISP. 527 4.5. IP Version Reciprocal Traversal 529 Note that LISP 'automagically' allows intermixing of various IP 530 versions for packet carriage; IPv4 packets might well be carried in 531 IPv6, or vice versa, depending on the network's configuration. This 532 would allow an 'island' of operation of one type to be 533 'automatically' tunneled over a stretch of infrastucture which only 534 supports the other type. 536 While the machinery of LISP may seem too heavyweight to be good for 537 such a mundane use, this is not intended as a 'sole use' case for 538 deployment of LISP. Rather, it is something which, if LISP is being 539 deployed anyway (for its other advantages), is an added benefit that 540 one gets 'for free'. 542 4.6. Local Uses 544 LISP has a number of use cases which are within purely local 545 contexts, i.e. not in the larger Internet. These fall into two 546 categories: uses seen on the Internet (above), but here on a private 547 (and usually small scale) setting; and applications which do not have 548 a direct analog in the larger Internet, and which apply only to local 549 deployments. 551 Among the former are multi-homing, IP version traversal, and support 552 of VPN's for segmentation and multi-tenancy (i.e. a spatially 553 separated private VPN whose components are joined together using the 554 public Internet as a backbone). 556 Among the latter class, non-Internet applications which have no 557 analog on the Internet, are the following example applications: 558 virtual machine mobility in data centers; other non-IP EID types such 559 as local network MAC addresses, or application specific data. 561 5. 
Major Functional Subsystems 563 LISP has only two major functional subsystems - the collection of 564 LISP packet switches (the xTRs), and the mapping system, which 565 manages the mapping database. The purpose and operation of each is 566 described at a high level below, and then, later on, in a fair amount 567 of detail, in separate sections on each (Sections Section 8 and 568 Section 9, respectively). 570 5.1. xTRs 572 xTRs are fairly normal packet switches, enhanced with a little extra 573 functionality in both the data and control planes, to perform LISP 574 data and control functionality. 576 The data plane functions in ITRs include deciding which packets need 577 to be given LISP processing (since packets to non-LISP sites may be 578 sent 'vanilla'); looking up the mapping; encapsulating the packet; 579 and sending it to the ETR. This encapsulation is done using UDP 580 [RFC768] (for reasons to be explained below, in Section 8.2), along 581 with an additional IPvN header (to hold the asource and destination 582 RLOCs). To the extent that traffic engineering features are in use 583 for a particular EID, the ITRs implement them as well. 585 In the ETR, the data plane simply unwraps the packets, and forwards 586 the 'vanilla' packets to the ultimate destination. 588 Control plane functions in ITRs include: asking for {EID->RLOC} 589 mappings via Map-Request control messages; handling the returning 590 Map-Replies which contain the requested information; managing the 591 local cache of mappings; checking for the reachability and liveness 592 of their neighbour ETRs; and checking for outdated mappings and 593 requesting updates. 595 In the ETR, control plane functions include participating in the 596 neighbour reachability and liveness function (see Section 12.4); 597 interacting with the mapping indexing system (next section); and 598 answering requests for mappings (ditto). 600 5.2. 
Mapping System 602 The mapping database is a distributed, and potentially replicated, 603 database which holds bindings between EIDs (identity) and RLOCs 604 (location). To be exact, it contains bindings between EID blocks and 605 RLOCs (the block size is given explicitly, as part of the syntax). 607 Support for blocks is both for minimizing the administrative 608 configuration overhead, as well as for operational efficiency; e.g. 609 when a group of EIDs are behind a single xTR. 611 However, the block may be (and often is) as small as a single EID. 612 Since mappings are only loaded upon demand, if smaller blocks become 613 predominant, then the increased size of the overall database is far 614 less problematic than if the routing table came to be dominated by 615 such small entries. 617 A particular node may have more than one RLOC, or may change its 618 RLOC(s), while keeping its singlar identity. 620 The binding contains not just the RLOC(s), but also (for each RLOC 621 for any given EID) priority and weight (to allow allocation of load 622 between several RLOCs at a given priority); this allows a certain 623 amount of traffic engineering to be accomplished with LISP. 625 5.2.1. Mapping System Organization 627 The mapping system is actually split into two major functional sub- 628 systems. The actual bindings themselves are held by the ETRs, and an 629 ITR which needs a binding effectively gets it from the ETR. 631 This co-location of the authoritative version of the mappings, and 632 the forwarding functionality which it describes, is an instance of 633 fate-sharing. [Clark] 635 To find the appropriate ETR(s) to query for the mapping, the second 636 subsystem, an 'indexing system', itself also a distributed, 637 potentally replicated database, provides information on which ETR(s) 638 are authoritative sources of information about the bindings which are 639 available. 641 5.2.2. 
Interface to the Mapping System

The client interface to the mapping system from an ITR's point of
view is not with the indexing system directly; rather, it is through
devices called Map Resolvers (MRs).

ITRs send request control messages (Map-Request packets) to an MR.
(This interface is probably the most important standardized interface
in LISP - it is the key to the entire system.) The MR uses the
indexing system to eventually forward the Map-Request to the
appropriate ETR. The ETR formulates a reply control message (a
Map-Reply packet), which is conveyed to the ITR. The details of the
indexing system, etc, are thus hidden from the 'ordinary' ITRs.

Similarly, the client interface to the indexing system from an ETR's
point of view is through devices called Map Servers (MSs - admittedly
a poorly chosen term, but it's too late to change it now).

ETRs send registration control messages (Map-Register packets) to an
MS, which makes the information about the mappings which the ETR
indicates it is authoritative for available to the indexing system.
The MS formulates a reply control message (the Map-Notify packet),
which confirms the registration, and is returned to the ETR. The
details of the indexing system are thus likewise hidden from the
'ordinary' ETRs.

5.2.3. Indexing Subsystem

The current indexing system is called the Delegated Database Tree
(DDT), which is very similar in operation to DNS. [DDT], [RFC1034]
However, unlike DNS, the actual mappings are not handled by DDT; DDT
merely identifies the ETRs which hold the mappings.

Again, extensive large-scale simulations driven by lengthy recordings
of actual traffic at several major sites have been performed to
verify the effectiveness of this particular indexing system. [Jakab]

6. Examples of Operation

To aid in comprehension, a few examples are given of user packets
traversing the LISP system.
The first shows the processing of a
typical user packet, i.e. what the vast majority of user packets will
see. The second shows what happens when the first packet to a
previously-unseen destination (at a particular ITR) is to be
processed by LISP.

6.1. An Ordinary Packet's Processing

This case follows the processing of a typical user packet (for
instance, a normal TCP data or acknowledgment packet associated with
an open HTTP connection) as it makes its way from the source host to
the destination.

{{Rest to be written.}}

6.2. A Mapping Cache Miss

If a host sends a packet, and it gets to the ITR, and the ITR both i)
determines that it needs to perform LISP processing on the user data
packet, but ii) does not yet have a mapping cache entry which covers
that destination EID, then more complex processing ensues.

{{Rest to be written.}}

7. Design Approach

Before describing LISP's components in more detail below, it may be
worth saying a few words about the design philosophy used in creating
them - this may make clearer the reasons for some engineering choices
in the mechanisms given there.

7.1. Quick Implement-Test Loop

LISP uses a philosophy similar to that used in the early days of the
Internet, which is to just build it, then try it and see what
happens, and move forward from there based on what actually happens.
The concept has been to get something up and running, and then modify
it based on testing and experience.

7.1.1. No Desk Fixes

Don't try to foresee all issues from desk analysis. (Which is not to
say that one should not spend _some_ time on trying to foresee
problems, but be aware that it is a 'diminishing returns' process.)
The performance of very large, complex, physically distributed
systems is hard to predict, so rather than try (which would
necessarily be an incomplete exercise anyway; testing would
inevitably be required eventually), at a certain point it's better
just to get on with it - and you will learn a host of other lessons
in the process, too.

7.1.2. Code Before Documentation

This is often a corollary to the kind of style described above.
While it probably would not have been possible in a large,
inhomogeneous group, the small, close nature of the LISP
implementation group did allow this approach.

7.2. Only Fix Real Problems

Don't worry about anything unless experience shows it's a real
problem. For instance, in the early stages, much was made out of the
problem of 'what does an ITR do if it gets a packet, but does not
(yet) have a mapping for the destination?'

In practice, simply dropping such packets has just not proved to be a
problem; the higher level protocol will retransmit them after a
timeout, and the mapping is usually in place by then. So spending a
lot of time (and its companion, energy) and mechanism (and _its_
extremely undesirable companion, complexity) on solving this
'problem' would not have been the most efficient approach, overall.

7.3. No Theoretical Perfection

Attack hard problems with a number of cheap and simple mechanisms
that co-operate and overlap. Trying to find a single mechanism that
is all of:

- Robust
- Efficient
- Fast

is often (usually?) a fool's errand. (The analogy to the aphorism
'Fast, Cheap, Good - Pick Any Two' should be obvious.) However, a
collection of simple and cheap mechanisms may effectively be able to
meet all of these goals (see, for example, ETR Liveness/Reachability,
Section 12.4).

Yes, this results in a system which is not provably correct in all
circumstances.
The world, however, is full of such systems - and in
the real world, effective robustness is more likely to result from
having multiple, overlapping mechanisms than one single high-powered
(and inevitably complex) one. In the world of civil engineering,
redundancy is now accepted as a key design principle; the same should
be true of information systems. [Salvadori]

7.3.1. No Ocean Boiling

Don't boil the ocean to kill a single fish. This is a combination of
7.2 (Only Fix Real Problems) and 7.3 (No Theoretical Perfection); it
just means that spending a lot of complexity and/or overhead to deal
with a problem that's not really a problem is not good engineering.

7.4. Just Enough Security

How much security to have is a complex issue. It's relatively easy
for designers to add good security, but much harder to get the users
to jump over all the hoops necessary to use it. LISP has therefore
adopted a position where we add 'just enough' security.

The overall approach to security in LISP is fairly subtle, though,
and is covered in more detail elsewhere (in [Architecture], Section
"Security").

8. xTRs

As mentioned above (in Section 5.1), xTRs are the basic data-handling
devices in LISP. This section explores some advanced topics related
to xTRs.

Careful rules have been specified for both TTL and ECN [RFC3168] to
ensure that passage through xTRs does not interfere with the
operation of these mechanisms. In addition, care has been taken to
ensure that 'traceroute' works when xTRs are involved.

8.1. When to Encapsulate

An ITR knows that a destination is running LISP, and thus that it
should perform LISP processing on a packet (including potential
encapsulation) if it has an entry in its local mapping cache that
covers the destination EID.
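The cache check just described can be sketched in a few lines. This is
only an illustration, not the behaviour of any particular
implementation: the MappingCache class, its method names, and the
choice of the most specific covering block when several entries match
are all assumptions made for the example, and the addresses are
documentation prefixes.

```python
import ipaddress
from typing import Optional


class MappingCache:
    """Toy ITR mapping cache keyed by EID block (hypothetical structure)."""

    def __init__(self) -> None:
        # Maps an EID block (a network object) to its list of RLOCs.
        self._entries = {}

    def add(self, eid_block: str, rlocs: list) -> None:
        self._entries[ipaddress.ip_network(eid_block)] = list(rlocs)

    def lookup(self, dest_eid: str) -> Optional[list]:
        """Return the RLOCs of the most specific covering entry, or None."""
        addr = ipaddress.ip_address(dest_eid)
        covering = [net for net in self._entries if addr in net]
        if not covering:
            return None
        # Assumption: prefer the longest (most specific) matching block.
        best = max(covering, key=lambda net: net.prefixlen)
        return self._entries[best]


def should_encapsulate(cache: MappingCache, dest_eid: str) -> bool:
    """True when a cached mapping covers the destination EID."""
    return cache.lookup(dest_eid) is not None


cache = MappingCache()
cache.add("192.0.2.0/24", ["198.51.100.1"])
cache.add("192.0.2.128/25", ["203.0.113.7"])
print(cache.lookup("192.0.2.200"))
print(should_encapsulate(cache, "10.1.1.1"))
```

A None result here means only that no covering entry is currently
cached; the sketch does not model what the ITR does next in that case.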
Conversely, if the cache contains a 'negative' entry (indicating that
the ITR has previously attempted to find a mapping that covers this
EID, and it has been informed by the mapping system that no such
mapping exists), it knows the destination is not running LISP, and
the packet can be forwarded normally.

(The ITR cannot simply depend on the appearance, or non-appearance,
of the destination in the DFZ routing tables, as a way to tell if a
destination is a LISP site or not, because mechanisms to allow
interoperation of LISP sites and 'legacy' sites necessarily involve
advertising LISP sites' EIDs into the DFZ.)

8.2. UDP Encapsulation Details

The UDP encapsulation used by LISP for carrying traffic from ITR to
ETR, and many of the details of how it works, were all chosen for
very practical reasons.

Use of UDP (instead of, say, a LISP-specific protocol number) was
driven by the fact that many devices filter out 'unknown' protocols,
so adopting a non-UDP encapsulation would have made the initial
deployment of LISP harder - and our goal (see Section 2.1) was to
make the deployment as easy as possible.

The UDP source port in the encapsulated packet is a hash of the
original source and destination; this is because many ISPs use
multiple parallel paths (so-called 'Equal Cost Multi-Path'), and
load-share across them. Using such a hash in the source-port in the
outer header both allows LISP traffic to be load-shared, and also
ensures that packets from individual connections are delivered in
order (since most ISPs try to ensure that packets for a particular
{source, source port, destination, destination port} tuple flow along
a single path, and do not become disordered).

The UDP checksum is zero because the inner packet usually already has
an end-end checksum, and the outer checksum adds no value.
[Saltzer]
In most existing hardware, computing such a checksum (and checking it
at the other end) would also present an intolerable load, for no
benefit.

8.3. Header Control Channel

LISP provides a multiplexed channel in the encapsulation header. It
is mostly (but not entirely) used for control purposes. (See
[Architecture], Section "Architecture-Piggyback" for a longer
discussion of the architectural implications of this.)

The general concept is that the header starts with an 8-bit 'flags'
field, and it also includes two data fields (one 24 bits, one 32),
the contents and meaning of which vary, depending on which flags are
set. This allows these fields to be 'multiplexed' among a number of
different low-duty-cycle functions, while minimizing the space
overhead of the LISP encapsulation header.

8.3.1. Echo Nonces

One important use is for a mechanism known as the Nonce Echo, which
is used as an efficient method for ITRs to check the reachability of
correspondent ETRs.

Basically, an ITR which wishes to ensure that an ETR is up, and
reachable, sends a nonce to that ETR, carried in the encapsulation
header; when that ETR (acting as an ITR) sends some other user data
packet back to the ITR (acting in turn as an ETR), that nonce is
carried in the header of that packet, allowing the original ITR to
confirm that its packets are reaching that ETR.

Note that lack of a response is not necessarily _proof_ that
something has gone wrong - but it strongly suggests that something
has, so other actions (e.g. a switch to an alternative ETR, if one is
listed; a direct probe; etc) are advised.

(See Section 12.5 for more about Echo Nonces.)

8.3.2. Instances

Another use of these header fields is for 'Instances' - basically,
support for VPNs across backbones.
[RFC4026] Since there is only
one destination UDP port used for carriage of user data packets, and
the source port is used for multiplexing (above), there is no other
way to differentiate among different destination address namespaces
(which are often overlapped in VPNs).

8.4. Fragmentation

Several mechanisms have been proposed for dealing with packets which
are too large to transit the path from a particular ITR to a given
ETR.

One, called the 'stateful' approach, keeps a per-ETR record of the
maximum size allowed, and sends an ICMP Too Big message to the
original source host when a packet which is too large is seen.

In the other, referred to as the 'stateless' approach, for IPv4
packets without the 'DF' bit set, too-large packets are fragmented,
and then the fragments are forwarded; all other packets are
discarded, and an ICMP Too Big message returned.

It is not clear at this point which approach is preferable.

8.5. Mapping Gleaning in ETRs

As an optimization to the mapping acquisition process, ETRs are
allowed to 'glean' mappings from incoming user data packets, and also
from incoming Map-Request control messages. This is not secure, and
so any such mapping must be 'verified' by sending a Map-Request to
get an authoritative mapping. (See further discussion of the
security implications of this in [Architecture], Section
"Security-xTRs".)

The value of gleaning is that most communications are two-way, and so
if host A is sending packets to host B (therefore needing B's
EID->RLOC mapping), very likely B will soon be sending packets back
to A (and thus needing A's EID->RLOC mapping). Without gleaning,
this would sometimes result in a delay, and the dropping of the first
return packet; this is felt to be very undesirable.

9.
The Mapping System

RFC 1034 ("DNS Concepts and Facilities") has this to say about the
DNS name to IP address mapping system:

   "The sheer size of the database and frequency of updates suggest
   that it must be maintained in a distributed manner, with local
   caching to improve performance. Approaches that attempt to
   collect a consistent copy of the entire database will become more
   and more expensive and difficult, and hence should be avoided."

and this observation applies equally to the LISP mapping system.

As previously mentioned, the mapping system is split into an indexing
subsystem, which keeps track of where all the mappings are kept, and
the mappings themselves, the authoritative copies of which are always
held by ETRs.

9.1. The Indexing Subsystem

The indexing system in LISP is currently implemented by the DDT
system. LISP initially used (for ease of getting something
operational without having to write a lot of code) an indexing system
called ALT, which used BGP running over virtual tunnels. [ALT] This
proved to have a number of issues, and has now been superseded by
DDT.

In DDT, the EID namespace(s) are instantiated as a tree of DDT nodes.
Starting with the root node(s), which have 'responsibility' for the
entire namespace, portions of the namespace are delegated to child
nodes, in a recursive process extending through as many levels as are
needed. Eventually, leaf nodes in the DDT tree delegate namespace
blocks to ETRs.

MRs obtain information about delegations by interrogating DDT nodes,
and caching the results. This allows them, when passed a request for
a mapping by an ITR, to forward the mapping request to the
appropriate ETR (perhaps after loading some missing delegation
entries into their delegation cache).

9.2.
The Mapping System Interface

As mentioned in Section 5.2.2, both of the interfaces to the mapping
system (from ITRs, and ETRs) are standardized, so that the more
numerous xTRs do not have to be modified when the mapping indexing
system is changed. This precaution has already allowed the mapping
system to be upgraded during LISP's evolution, when ALT was replaced
by DDT.

This section describes the interfaces in a little more detail.

9.2.1. Map-Request Messages

The Map-Request message contains a number of fields, the two most
important of which are the requested EID block identifier (remember
that individual mappings may cover a block of EIDs, not just a single
EID), and the Address Family Identifier (AFI) for that EID block.
[AFI] The inclusion of the AFI allows the mapping system interface
(as embodied in these control packets) a great deal of flexibility.
(See [Architecture], Section "Namespaces" for more on this.)

Other important fields are the source EID (and its AFI), and one or
more RLOCs for the source EID, along with their AFIs. Multiple RLOCs
are included to ensure that at least one is in a form which will
allow the reply to be returned to the requesting ITR, and the source
EID is used for a variety of functions, including 'gleaning' (see
Section 8.5).

Finally, the message includes a long nonce, for simple, efficient
protection against off-path attackers (see [Architecture], Section
"Security-xTRs" for more), and a variety of other fields and control
flag bits.

9.2.2. Map-Reply Messages

The Map-Reply message looks similar, except it includes the mapping
entry for the requested EID(s), which contains one or more RLOCs and
their associated data. (Note that the reply may cover a larger block
of the EID namespace than the request; most requests will be for a
single EID, the one which prompted the query.)
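As an illustration of how an ITR might use such a reply, the sketch
below models a mapping entry as an EID block plus a list of RLOCs,
each carrying the priority and weight introduced in Section 5.2. It
is a simplified model, not the wire format: the class and field names
are hypothetical, the rule that a lower priority value is preferred is
taken from the base LISP specification rather than from this document,
and using a per-flow hash to spread load by weight is just one
possible choice.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rloc:
    address: str
    priority: int  # assumption: lower value is preferred
    weight: int    # relative share of load among equal-priority RLOCs


@dataclass(frozen=True)
class MapReply:
    eid_block: str  # may be larger than the single EID that was requested
    rlocs: tuple    # tuple of Rloc; assumed non-empty in this sketch

    def covers(self, eid: str) -> bool:
        """True if this reply's EID block contains the given EID."""
        return ipaddress.ip_address(eid) in ipaddress.ip_network(self.eid_block)


def select_rloc(reply: MapReply, flow_hash: int) -> Rloc:
    """Pick an RLOC: best priority first, then split flows by weight."""
    best = min(r.priority for r in reply.rlocs)
    candidates = [r for r in reply.rlocs if r.priority == best]
    total = sum(r.weight for r in candidates)
    if total == 0:
        # No weights given: spread flows evenly over the candidates.
        return candidates[flow_hash % len(candidates)]
    point = flow_hash % total
    for r in candidates:
        if point < r.weight:
            return r
        point -= r.weight
    raise AssertionError("unreachable: point is always below total weight")


reply = MapReply(
    eid_block="192.0.2.0/24",
    rlocs=(
        Rloc("198.51.100.1", priority=1, weight=75),
        Rloc("198.51.100.2", priority=1, weight=25),
        Rloc("203.0.113.9", priority=2, weight=100),
    ),
)
print(reply.covers("192.0.2.77"), select_rloc(reply, flow_hash=80).address)
```

Keeping the choice a pure function of the flow hash means all packets
of one flow go to the same RLOC, while different flows are shared
roughly 75/25 between the two priority-1 RLOCs; the priority-2 RLOC is
never chosen while a priority-1 RLOC is listed.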
For each RLOC in the entry, there is the RLOC, its AFI (of course),
priority and weight fields (see Section 5.2), and multicast priority
and weight fields.

9.2.3. Map-Register and Map-Notify Messages

The Map-Register message contains authentication information, and a
number of mapping records, each with an individual Time-To-Live
(TTL). Each of the records contains an EID (potentially, a block of
EIDs) and its AFI, a version number for this mapping (see
Section 11.1), and a number of RLOCs and their AFIs.

Each RLOC entry also includes the same data as in the Map-Replies
(i.e. priority and weight); this is because in some circumstances it
is advantageous to allow the MS to proxy reply on the ETR's behalf to
Map-Request messages. [Mobility]

Map-Notify messages have the exact same contents as Map-Register
messages; they are purely acknowledgements.

9.2.4. Map-Referral Messages

Map-Referral messages look almost identical to Map-Reply messages
(which is felt to be an advantage by some people, although having a
more generic record-based format would probably be better in the long
run, as ample experience with DNS has shown), except that the RLOCs
potentially name either i) other DDT nodes (children in the
delegation tree), or ii) terminal MSs.

There are also optional authentication fields; see [Architecture],
Section "Security-Mappings" for more.

9.3. Reliability via Replication

Everywhere throughout the mapping system, robustness to operational
failures is obtained by replicating data in multiple instances of any
particular node (of whatever type). Map-Resolvers, Map-Servers, DDT
nodes, ETRs - all of them can be replicated, and the protocol
supports this replication.
There are generally no mechanisms specified yet to ensure coherence
between multiple copies of any particular data item, etc - this is
currently a manual responsibility. If and when LISP protocol
adoption proceeds, an automated layer to perform this functionality
can 'easily' be layered on top of the existing mechanisms.

9.4. Extended Tools

In addition to the priority and weight data items in mappings, LISP
offers other tools to enhance functionality, particularly in the
traffic engineering area. One is 'source-specific mappings', i.e.
the ETR may return different mappings to the enquiring ITR, depending
on the identity of the ITR. This allows very fine-tuned traffic
engineering, far more powerful than routing-based TE.

9.5. Expected Performance

{{To be written.}}

10. Deployment Mechanisms

This section discusses several deployment issues in more detail.
With LISP's heavy emphasis on practicality, much work has gone into
making sure it works well in the real-world environments most people
have to deal with.

10.1. Internetworking Mechanism

One aspect which has received a lot of attention is the set of
mechanisms previously referred to (in Section 3.4) to allow
interoperation of LISP sites with so-called 'legacy' sites which are
not running LISP (yet).

To briefly refresh what was said there, there are two main approaches
to such interworking: proxy nodes (PITRs and PETRs), and an
alternative mechanism using a device with combined NAT and LISP
functionality; these are described in more detail here.

10.2. Proxy Devices

PITRs (proxy ITRs) serve as ITRs for traffic _from_ legacy hosts to
nodes using LISP. PETRs (proxy ETRs) serve as ETRs for LISP traffic
_to_ legacy hosts (for cases where a LISP device cannot send packets
directly to such sites, without encapsulation).
Note that return traffic _to_ a legacy site from a LISP-using node
does not necessarily have to pass through an ITR/PETR pair - the
original packets can usually just be sent directly to the
destination. However, for some kinds of LISP operation (e.g. mobile
nodes), this is not possible; in these situations, the PETR is
needed.

10.2.1. PITRs

PITRs (proxy ITRs) serve as ITRs for traffic _from_ legacy hosts to
nodes using LISP. To do that, they have to advertise into the
existing legacy backbone Internet routing the availability of
whatever ranges of EIDs (i.e. of nodes using LISP) they are proxying
for, so that legacy hosts will know where to send traffic to those
LISP nodes.

As mentioned previously (Section 8.1), an ITR at another LISP site
can avoid using a PITR (i.e. it can detect that a given destination
is not a legacy site, even though a PITR is advertising it into the
DFZ) by checking to see if a LISP mapping exists for that
destination.

This technique obviously has an impact on the routing table in the
DFZ, but it is not clear yet exactly what that impact will be; it is
very dependent on the collected details of many individual deployment
decisions.

A PITR may cover a group of EID blocks with a single EID
advertisement, in order to reduce the number of routing table entries
added. (In fact, at the moment, aggressive aggregation of EID
announcements is performed, precisely to minimize the number of
new announced routes added by this technique.)

At the same time, if a site does traffic engineering with LISP
instead of fine-grained BGP announcement, that will help keep table
sizes down (and this is true even in the early stages of LISP
deployment). The same is true for multi-homing.

10.2.2.
PETRs

PETRs (proxy ETRs) serve as ETRs for LISP traffic _to_ legacy hosts,
for cases where a LISP device cannot send packets to sites without
encapsulation. That typically happens for one of two reasons.

First, it will happen in places where some device is implementing
Unicast Reverse Path Forwarding (uRPF), to prevent a variety of
negative behaviour; originating packets with the source's EID in the
source address field will result in them being filtered out and
discarded.

Second, it will happen when a LISP site wishes to send packets to a
non-LISP site, and the path in between does not support the
particular IP protocol version used by the source along its entire
length. Use of a PETR on the other side of the 'gap' will allow the
LISP site's packet to 'hop over' the gap, by utilizing LISP's
built-in support for mixed protocol encapsulation.

PETRs are generally paired with specific ITRs, which have the
location of their PETRs configured into them. In other words, unlike
normal ETRs, PETRs do not have to register themselves in the mapping
database, on behalf of any legacy sites they serve.

Also, allowing an ITR to always send traffic leaving a site to a PETR
does avoid having to choose whether or not to encapsulate packets; it
can just always encapsulate packets, sending them to the PETR if it
has no specific mapping for the destination. However, this is not
advised: as mentioned, it is easy to tell if something is a legacy
destination.

10.3. LISP-NAT

A LISP-NAT device, as previously mentioned, combines LISP and NAT
functionality, in order to allow a LISP site which is internally
using addresses which cannot be globally routed to communicate with
non-LISP sites elsewhere in the Internet. (In other words, the
technique used by the PITR approach simply cannot be used in this
case.)
To do this, a LISP-NAT performs the usual NAT functionality: it
translates a host's source address(es) in packets passing through it
from an 'inner' value to an 'outer' value, and stores that
translation in a table, which it can use to similarly process
subsequent packets (both outgoing and incoming). [Interworking]

There are two main cases where this might apply:

- Sites using non-routable global addresses
- Sites using private addresses [RFC1918]

10.4. LISP and DFZ Routing

{{To be written.}}

10.5. Use Through NAT Devices

Like them or not (and NAT devices have many egregious issues - some
inherent in the nature of the process of mapping addresses; others,
such as the brittleness due to non-replicated critical state, caused
by the way NATs were introduced, as stand-alone 'invisible' boxes),
NATs are both ubiquitous, and here to stay for a long time to come.

Thus, in the actual Internet of today, having any new mechanisms
function well in the presence of NATs (i.e. with LISP xTRs behind a
NAT device) is absolutely necessary. LISP has produced a variety of
mechanisms to do this.

10.5.1. First-Phase NAT Support

The first mechanism used by LISP to operate through a NAT device only
worked with some NATs, those which were configurable to allow inbound
packet traffic to reach a configured host.

A pair of new LISP control messages, LISP Echo-Request and
Echo-Reply, allowed the ETR to discover its temporary global address;
the Echo-Request was sent to the configured Map-Server, and it
replied with an Echo-Reply which included the source address from
which the Echo-Request was received (i.e. the public global address
assigned to the ETR by the NAT). The ETR could then insert that
address in any Map-Reply control messages which it sent to
correspondent ITRs.
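The Echo-Request/Echo-Reply exchange above amounts to the Map-Server
reflecting back the source address it observed. The following toy
simulation shows the idea only; the Packet and Nat classes, the
payload keys, and the function names are all hypothetical and do not
represent the actual message formats.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src: str
    dst: str
    payload: dict


class Nat:
    """Toy NAT that rewrites one private source address to a public one."""

    def __init__(self, private_addr: str, public_addr: str) -> None:
        self.private_addr = private_addr
        self.public_addr = public_addr

    def outbound(self, pkt: Packet) -> Packet:
        if pkt.src == self.private_addr:
            return Packet(self.public_addr, pkt.dst, pkt.payload)
        return pkt


def map_server_handle_echo(request: Packet) -> Packet:
    """Build an Echo-Reply carrying the source address the request came from."""
    return Packet(
        src=request.dst,
        dst=request.src,
        payload={"type": "Echo-Reply", "observed_source": request.src},
    )


def discover_global_address(etr_private: str, nat: Nat, map_server: str) -> str:
    """ETR side: send an Echo-Request through the NAT and read the reply."""
    request = Packet(etr_private, map_server, {"type": "Echo-Request"})
    reply = map_server_handle_echo(nat.outbound(request))
    return reply.payload["observed_source"]


nat = Nat(private_addr="10.0.0.2", public_addr="203.0.113.10")
rloc = discover_global_address("10.0.0.2", nat, map_server="198.51.100.53")
print(rloc)  # the address the ETR would then place in its Map-Replies
```

The sketch leaves out the return path through the NAT, which is where
the manual NAT configuration mentioned above comes in.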
The fact that this mechanism did not support all NATs, and also
required manual configuration of the NAT, meant that this was not a
good solution; in addition, since LISP expects all incoming data
traffic to be on a specific port, it was not possible to have
multiple ETRs behind a single NAT (which normally would have only one
global address to share, meaning port mapping would have to be used,
except that...)

10.5.2. Second-Phase NAT Support

For a more comprehensive approach to support of LISP xTR deployment
behind NAT devices, a fairly extensive supplement to LISP, LISP NAT
Traversal, has been designed. [NAT]

A new class of LISP device, the LISP Re-encapsulating Tunnel Router
(RTR), passes traffic through the NAT, both to and from the xTR.
(Inbound traffic has to go through the RTR as well, since otherwise
multiple xTRs could not operate behind a single NAT, for the
'specified port' reason in the section above.)

(Had the Map-Reply included a port number, this could have been
avoided - although of course it would be possible to define a new
RLOC type which included protocol and port, to allow other
encapsulation techniques.)

Two new LISP control messages (Info-Request and Info-Reply) allow an
xTR to detect if it is behind a NAT device, and also discover the
global IP address and UDP port assigned by the NAT to the xTR. A
modification to LISP Map-Register control messages allows the xTR to
initialize mapping state in the NAT, in order to use the RTR.

This mechanism addresses cases where the xTR is behind a NAT, but the
xTR's associated MS is on the public side of the NAT; this
limitation, that MSs must be in the 'public' part of the Internet,
seems reasonable.

11. Current Improvements

In line with the philosophies laid out in Section 7, LISP is
something of a moving target.
This section discusses some of the
contemporaneous improvements being made to LISP.

11.1. Mapping Versioning

As mentioned, LISP has been under development for a considerable
time. One early addition to LISP (it is already part of the base
specification) is mapping versioning; i.e. the application of
identifying sequence numbers to different versions of a mapping.
[Versioning] This allows an ITR to easily discover when a cached
mapping has been updated by a more recent variant.

Version numbers are available in control messages (Map-Replies), but
the initial concept is that to limit control message overhead, the
versioning mechanism should primarily use the multiplexed user data
header control channel (see Section 8.3).

Versioning can operate in both directions: an ITR can advise an ETR
what version of a mapping it is currently using (so the ETR can
notify it if there is a more recent version), and ETRs can let ITRs
know what the current mapping version is (so the ITRs can request an
update, if their copy is outdated).

At the moment version numbers are manually assigned, and ordered.
Some felt that this was non-optimal, and that a better approach would
have been to have 'fingerprints' which were computed from the current
mapping data (i.e. a hash). It is not clear that the ordering buys
much (if anything), and the potential for mishaps with manually
configured version numbers is self-evident.

11.2. Replacement of ALT with DDT

As mentioned in Section 9.2, an interface is provided to allow
replacement of the indexing subsystem. LISP initially used an
indexing system called ALT. [ALT] ALT was relatively easy to
construct from existing tools (GRE, BGP, etc), but it had a number of
issues that made it unsuitable for large-scale use. ALT is now being
superseded by DDT.
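To make the DDT side of this replacement more concrete, the
delegation walk outlined in Section 9.1 can be sketched as a simple
tree descent. This is a local, in-memory model only: the DdtNode
class and find_etrs function are hypothetical names, and a real MR
would query DDT nodes over the network and cache the delegations
rather than hold the whole tree.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class DdtNode:
    prefix: str                                    # EID block this node is responsible for
    children: list = field(default_factory=list)   # child DdtNode delegations
    etrs: list = field(default_factory=list)       # set where a block is delegated to ETRs


def find_etrs(root: DdtNode, eid: str) -> list:
    """Follow delegations from the root; return the ETRs for eid, or []."""
    addr = ipaddress.ip_address(eid)
    if addr not in ipaddress.ip_network(root.prefix):
        return []
    node = root
    while True:
        if node.etrs:
            return node.etrs
        # Descend into the child whose delegated block contains the EID.
        child = next(
            (c for c in node.children
             if addr in ipaddress.ip_network(c.prefix)),
            None,
        )
        if child is None:
            return []  # no delegation covers this EID
        node = child


root = DdtNode("0.0.0.0/0", children=[
    DdtNode("192.0.2.0/24", children=[
        DdtNode("192.0.2.0/25", etrs=["198.51.100.1"]),
    ]),
])
print(find_etrs(root, "192.0.2.5"))
```

Note that the walk yields only the identity of the ETRs; the mapping
itself is still obtained from the ETR, as described in Section 5.2.3.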
As indicated previously (Section 9.5), the basic structure and
operation of DDT is identical to that of TREE, so the extensive
simulation work done for TREE applies equally to DDT, as do the
conclusions drawn about TREE's superiority to ALT. [Jakab]

{{Briefly synopsize results}}

11.2.1. Why Not Use DNS

One obvious question is 'Since DDT is so similar to DNS, why not
simply use DNS?' In particular, people are familiar with the DNS,
how to configure it, etc - would it not thus be preferable to use it?
To completely answer this would take more space than is available
here, but, briefly, there were two main reasons, and one lesser one.

First, the syntax of DNS names did not lend itself to looking up
names in other syntaxes (e.g. bit fields). This is a problem which
has been previously encountered, e.g. in reverse address lookups.
[RFC5855]

Second, as DNS is an existing system, its interfaces (had it been
used as an indexing subsystem for LISP) would not be 'tuneable' to be
optimal for LISP. For instance, if it were desired to have the leaf
node in an indexing lookup directly contact the ETR on behalf of the
node doing the lookup (thereby avoiding a round-trip delay), that
would not be easy without modifications to the DNS code. Obviously,
with a 'custom' system, this issue does not arise.

Finally, DNS security, while robust, is fairly complex. Doing DDT
offered an opportunity to provide a more nuanced security model.
(See [Architecture], Section "Security" for more about this.)

11.3. Mobile Device Support

Mobility is an obvious capability to provide with LISP. Doing so is
relatively simple, if the mobile host is prepared to act as its own
ETR. It obtains a local 'temporary use' address, and registers that
address as its RLOC.
   Packets to the mobile host are sent to its
   temporary address, wherever that may be, and the mobile host first
   unwraps them (acting as an ETR), and then processes them normally
   (acting as a host).

   (Doing mobility without having the mobile host act as its own ETR is
   difficult, even if ETRs are quite common. The reason is that if the
   ETR and mobile host are not integrated, during the step from the ETR
   to the mobile host, the packets must contain the mobile host's EID,
   and this may not be workable. If there is a local router between the
   ETR and mobile host, for instance, it is unlikely to know how to get
   the packets to the mobile host.)

   If the mobile host migrates to a site which is itself a LISP site,
   things get a little more complicated. The 'temporary address' it
   gets is itself an EID, requiring mapping, and wrapping for transit
   across the rest of the Internet. A 'double encapsulation' is thus
   required at the other end; the packets are first encapsulated with
   the mobile node's temporary address as their RLOC, and then this has
   to be looked up in a second lookup cycle (see Section 8.1), and then
   wrapped again, with the site's RLOC as their destination.

   This results in a slight loss in maximum packet size, due to the
   duplicated headers, but on the whole it is considerably simpler than
   the alternative, which would be to re-wrap the packet at the site's
   ETR, when it is discovered that the destination's EID was not
   'native' to the site. This would require that the mobile node's EID
   effectively have two different mappings, depending on whether the
   lookup was being performed outside the LISP site, or inside.

   {{Also probably need to mention briefly how the other end is notified
   when mappings are updated, and about proxy-Map-Replies.}} [Mobility]

11.4.
Multicast Support

   Multicast may seem an odd thing to support with LISP, since LISP is
   all about separating identity from location, and, although a
   multicast group in some sense has an identity, it certainly does not
   have _a_ location.

   However, multicast is important to some users of the network, for a
   number of reasons: doing multiple unicast streams is inefficient; it
   is easy to use up all the upstream bandwidth, and without multicast a
   server can also be saturated fairly easily in doing the unicast
   replication. So it is important for LISP to 'play nicely' with
   multicast; work on multicast support in LISP is fairly advanced,
   although not far-ranging.

   Briefly, destination group addresses are not mapped; only the source
   address (when the source is inside a LISP site) needs to be mapped,
   both during distribution tree setup, as well as actual traffic
   delivery. In other words, LISP's mapping capability is used: it is
   just applied to the source, not the destination (as with most LISP
   activity); the inner source is the EID, and the outer source is the
   EID's RLOC.

   Note that this does mean that if the group is using separate
   source-specific trees for distribution, there isn't a separate
   distribution tree outside the LISP site for each different source of
   traffic to the group from inside the LISP site; they are all lumped
   together under a single source, the RLOC.

   The approach currently used by LISP requires no packet format changes
   to existing multicast protocols. See [Multicast] for more;
   additional LISP multicast issues are discussed in [LISP], Section 12.

11.5.  {{Any others?}}

12.  Fault Discovery/Handling

   LISP is, in terms of its functionality, a fairly simple system: the
   list of failure modes is thus not extensive.

12.1.
Handling Missing Mappings

   Handling of missing mappings is fairly simple: the ITR calls for the
   mapping, and in the meantime can either discard traffic to the
   destination (as many ARP implementations do) [RFC826], or, if
   dropping the traffic is deemed undesirable, it can forward it via a
   'default PITR'.

   A number of PITRs advertise all EID blocks into the backbone routing,
   so that any ITRs which are temporarily missing a mapping can forward
   the traffic to these default PITRs via normal transmission methods,
   where it is encapsulated and passed on.

12.2.  Outdated Mappings

   If a mapping changes once an ITR has retrieved it, that may result in
   traffic to the EIDs covered by that mapping failing. There are three
   cases to consider:

   - When the ETR to which traffic is being sent is still a valid ETR
     for that EID, but the mapping has been updated (e.g. to change the
     priority of various ETRs)
   - When the ETR to which traffic is being sent is still an ETR, but
     no longer a valid ETR for that EID
   - When the ETR to which traffic is being sent is no longer an ETR

12.2.1.  Outdated Mappings - Updated Mapping

   A 'mapping versioning' system, whereby mappings have version numbers,
   and ITRs are notified when their mapping is out of date, has been
   added to detect this, and the ITR responds by refreshing the mapping
   [Versioning].

12.2.2.  Outdated Mappings - Wrong ETR

   {{To be written.}}

12.2.3.  Outdated Mappings - No Longer an ETR

   If the destination of traffic from an ITR is no longer an ETR, one
   might get an ICMP Destination Unreachable error message. However,
   one cannot depend on that. The following mechanism will work,
   though.

   Since the destination is not an ETR, the echoing reachability
   detection mechanism (see Section 8.3.1) will detect a problem. At
   that point, the backstop mechanism, Probing, will kick in.
   Since the
   destination is still not an ETR, that will fail, too.

   At that point, traffic will be switched to a different ETR, or, if
   none are available, a re-map may be requested.

12.3.  Erroneous mappings

   {{To be written.}}

12.4.  Neighbour Liveness

   The ITR, like all packet switches, needs to detect, and react, when
   its next-hop neighbour ceases operation. As LISP traffic is
   effectively always unidirectional (from ITR to ETR), this could be
   somewhat problematic.

   However, solving a related problem, neighbour reachability (below),
   subsumes handling this fault mode.

   Note that the two terms (liveness and reachability) are _not_
   synonymous (although a lot of LISP documentation confuses them).
   Liveness is a property of a node - it is either up and functioning,
   or it is not. Reachability is only a property of a particular _pair_
   of nodes.

   If packets sent from a first node to a second are successfully
   received at the second, it is 'reachable' from the first. However,
   the second node may at the very same time _not_ be reachable from
   some other node. Reachability is _always_ an ordered pairwise
   property: it holds (or not) for a specific ordered pair of nodes.

12.5.  Neighbour Reachability

   A more significant issue than whether a particular ETR E is up or not
   is, as mentioned above, that although ETR E may be up, attached to
   the network, etc, an issue in the network between a source ITR I and
   E may prevent traffic from I from getting to E. (Perhaps a routing
   problem, or perhaps some sort of access control setting.)

   The one-way nature of LISP traffic makes this situation hard to
   detect in a way which is economical, robust and fast. Two out of the
   three are usually not too hard, but all three at the same time - as
   is highly desirable for this particular issue - are harder.
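
   As a rough illustration of the distinction drawn in Section 12.4,
   the following Python sketch models liveness as a property of a single
   node and reachability as a property of an ordered pair of nodes. The
   Network class and the node names are hypothetical, invented for this
   sketch; nothing here is taken from any LISP specification.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Toy model (hypothetical): liveness per node, reachability per ordered pair."""
    # Nodes that are currently up and functioning.
    live: set = field(default_factory=set)
    # Each entry (src, dst) means packets sent by src currently arrive at dst.
    paths: set = field(default_factory=set)

    def is_live(self, node):
        # Liveness depends only on the node itself.
        return node in self.live

    def is_reachable(self, src, dst):
        # A node that is down is reachable from nowhere; a node that is up
        # is reachable only over the ordered pairs that currently work.
        return dst in self.live and (src, dst) in self.paths


# ETR-E is up, but only the path from ITR-J to ETR-E is working.
net = Network(
    live={"ITR-I", "ITR-J", "ETR-E"},
    paths={("ITR-J", "ETR-E")},
)
```

   In this toy network ETR-E is live, and reachable from ITR-J, yet it
   is not reachable from ITR-I; the reverse pair (ETR-E, ITR-J) is a
   separate question with its own answer.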

   In line with the LISP design philosophy (Section 7.3), this problem
   is attacked not with a single mechanism (which would have a hard time
   meeting all those three goals simultaneously), but with a collection
   of simpler, cheaper mechanisms, which collectively will usually meet
   all three.

   They are: reliance on the underlying routing system (which can of
   course only reliably provide a negative reachability indication, not
   a positive one), the echo nonce (which depends on some return traffic
   from the destination xTR back to the source), and finally direct
   'pinging', in the case where no positive echo is returned.

   (The last is not the first choice, as due to the large fan-out
   expected of LISP devices, reliance on it as a sole mechanism would
   produce a fair amount of overhead.)

13.  Acknowledgments

   The author would like to thank all the members of the core LISP group
   for their willingness to allow him to add himself to their effort,
   and for their enthusiasm for whatever assistance he has been able to
   provide. He would also like to thank (in alphabetical order) Vina
   Ermagan, Vince Fuller, and especially Joel Halpern for their careful
   review of, and helpful suggestions for, this document. Grateful
   thanks also to Darrel Lewis for his help with material on
   non-Internet uses of LISP, and to Vince Fuller for help with XML.

   A final thanks is due to John Wroclawski for the author's
   organizational affiliation. This memo was created using the xml2rfc
   tool.

14.  IANA Considerations

   This document makes no request of the IANA.

15.  Security Considerations

   This memo does not define any protocol and therefore creates no new
   security issues.

16.  References

16.1.  Normative References

   [RFC768]        J. Postel, "User Datagram Protocol", RFC 768,
                   August 1980.

   [RFC791]        J. Postel, "Internet Protocol", RFC 791,
                   September 1981.

   [RFC1498]       J. H. Saltzer, "On the Naming and Binding of Network
                   Destinations", RFC 1498, (Originally published in:
                   "Local Computer Networks", edited by P. Ravasio et
                   al., North-Holland Publishing Company, Amsterdam,
                   1982, pp. 311-317.), August 1993.

   [RFC2460]       S. Deering and R. Hinden, "Internet Protocol, Version
                   6 (IPv6) Specification", RFC 2460, December 1998.

   [Architecture]  J. N. Chiappa, "The Architecture of the LISP
                   Location-Identity Separation System",
                   draft-chiappa-lisp-architecture-00 (work in
                   progress), July 2012.

   [DDT]           V. Fuller, D. Lewis, and D. Farinacci, "LISP
                   Delegated Database Tree", draft-fuller-lisp-ddt-01
                   (work in progress), March 2012.

   [Future]        J. N. Chiappa, "Potential Long-Term Developments With
                   the LISP System", draft-chiappa-lisp-evolution-00
                   (work in progress), July 2012.

   [Interworking]  D. Lewis, D. Meyer, D. Farinacci, and V. Fuller,
                   "Interworking LISP with IPv4 and IPv6",
                   draft-ietf-lisp-interworking-06 (work in progress),
                   March 2012.

   [LISP]          D. Farinacci, V. Fuller, D. Meyer, and D. Lewis,
                   "Locator/ID Separation Protocol (LISP)",
                   draft-ietf-lisp-23 (work in progress), May 2012.

   [Mobility]      D. Farinacci, V. Fuller, D. Lewis, and D. Meyer,
                   "LISP Mobility Architecture", draft-meyer-lisp-mn-07
                   (work in progress), April 2012.

   [Multicast]     D. Farinacci, D. Meyer, J. Zwiebel, and S. Venaas,
                   "LISP for Multicast Environments",
                   draft-ietf-lisp-multicast-14 (work in progress),
                   February 2012.

   [NAT]           V. Ermagan, D. Farinacci, D. Lewis, J. Skriver,
                   F. Maino, and C. White, "NAT traversal for LISP",
                   draft-ermagan-lisp-nat-traversal-01 (work in
                   progress), March 2012.

   [Versioning]    L. Iannone, D. Saucez, and O. Bonaventure, "LISP
                   Mapping Versioning",
                   draft-ietf-lisp-map-versioning-09 (work in progress),
                   March 2012.

   [AFI]           IANA, "Address Family Indicators (AFIs)", Address
                   Family Numbers, January 2011.

16.2.  Informative References

   [NIC8246]       A. McKenzie and J. Postel, "Host-to-Host Protocol for
                   the ARPANET", NIC 8246, Network Information Center,
                   SRI International, Menlo Park, CA, October 1977.

   [IEN19]         J. F. Shoch, "Inter-Network Naming, Addressing, and
                   Routing", IEN (Internet Experiment Note) 19,
                   January 1978.

   [RFC826]        D. Plummer, "Ethernet Address Resolution Protocol",
                   RFC 826, November 1982.

   [RFC1034]       P. V. Mockapetris, "Domain Names - Concepts and
                   Facilities", RFC 1034, November 1987.

   [RFC1631]       K. Egevang and P. Francis, "The IP Network Address
                   Translator (NAT)", RFC 1631, May 1994.

   [RFC1918]       Y. Rekhter, R. Moskowitz, D. Karrenberg,
                   G. J. de Groot, and E. Lear, "Address Allocation for
                   Private Internets", RFC 1918, February 1996.

   [RFC1992]       I. Castineyra, J. N. Chiappa, and M. Steenstrup, "The
                   Nimrod Routing Architecture", RFC 1992, August 1996.

   [RFC3168]       K. Ramakrishnan, S. Floyd, and D. Black, "The
                   Addition of Explicit Congestion Notification (ECN) to
                   IP", RFC 3168, September 2001.

   [RFC3272]       D. Awduche, A. Chiu, A. Elwalid, I. Widjaja, and
                   X. Xiao, "Overview and Principles of Internet Traffic
                   Engineering", RFC 3272, May 2002.

   [RFC4026]       L. Andersson and T. Madsen, "Provider Provisioned
                   Virtual Private Network (VPN) Terminology", RFC 4026,
                   March 2005.

   [RFC4116]       J. Abley, K. Lindqvist, E. Davies, B. Black, and
                   V. Gill, "IPv4 Multihoming Practices and
                   Limitations", RFC 4116, July 2005.

   [RFC4984]       D. Meyer, L. Zhang, and K. Fall, "Report from the IAB
                   Workshop on Routing and Addressing", RFC 4984,
                   September 2007.

   [RFC5855]       J. Abley and T. Manderson, "Nameservers for IPv4 and
                   IPv6 Reverse Zones", RFC 5855, May 2010.

   [RFC5887]       B. Carpenter, R. Atkinson, and H.
                   Flinck, "Renumbering Still Needs Work", RFC 5887,
                   May 2010.

   [ALT]           D. Farinacci, V. Fuller, D. Meyer, and D. Lewis,
                   "LISP Alternative Topology (LISP-ALT)",
                   draft-ietf-lisp-alt-10 (work in progress),
                   December 2011.

   [NSAP]          International Organization for Standardization,
                   "Information Processing Systems - Open Systems
                   Interconnection - Basic Reference Model", ISO
                   Standard 7498.1984, 1984.

   [Atkinson]      R. Atkinson, "Revised draft proposed definitions",
                   RRG list message, Message-Id:
                   808E6500-97B4-4107-8A2F-36BC913BE196@extremenetworks.com,
                   11 June 2007.

   [Baran]         P. Baran, "On Distributed Communications Networks",
                   IEEE Transactions on Communications Systems Vol.
                   CS-12 No. 1, pp. 1-9, March 1964.

   [Chiappa]       J. N. Chiappa, "Endpoints and Endpoint Names: A
                   Proposed Enhancement to the Internet Architecture",
                   Personal draft (work in progress), 1999.

   [Clark]         D. D. Clark, "The Design Philosophy of the DARPA
                   Internet Protocols", in 'Proceedings of the Symposium
                   on Communications Architectures and Protocols SIGCOMM
                   '88', pp. 106-114, 1988.

   [Heart]         F. E. Heart, R. E. Kahn, S. M. Ornstein,
                   W. R. Crowther, and D. C. Walden, "The Interface
                   Message Processor for the ARPA Computer Network",
                   Proceedings AFIPS 1970 SJCC, Vol. 36, pp. 551-567.

   [Jakab]         L. Jakab, A. Cabellos-Aparicio, F. Coras, D. Saucez,
                   and O. Bonaventure, "LISP-TREE: A DNS Hierarchy to
                   Support the LISP Mapping System", in 'IEEE Journal on
                   Selected Areas in Communications', Vol. 28, No. 8,
                   pp. 1332-1343, October 2010.

   [Iannone]       L. Iannone and O. Bonaventure, "On the Cost of
                   Caching Locator/ID Mappings", in 'Proceedings of the
                   3rd International Conference on emerging Networking
                   EXperiments and Technologies (CoNEXT'07)', ACM, pp.
                   1-12, December 2007.

   [Saltzer]       J. H. Saltzer, D. P. Reed, and D. D.
                   Clark, "End-To-End Arguments in System Design", ACM
                   TOCS, Vol 2, No. 4, pp 277-288, November 1984.

   [Salvadori]     M. Salvadori and M. Levy, "Why Buildings Fall Down",
                   W. W. Norton, New York, pg. 81, 1992.

Appendix A.  Glossary/Definition of Terms

   - Address
   - Locator
   - EID
   - RLOC
   - ITR
   - ETR
   - xTR
   - PITR
   - PETR
   - MR
   - MS
   - DFZ

Appendix B.  Other Appendices

   Possible appendices:

   -- Location/Identity Separation Brief History
   -- LISP History
   -- Old models (LISP 1, LISP 1.5, etc)

Author's Address

   J. Noel Chiappa
   Yorktown Museum of Asian Art
   Yorktown, Virginia
   USA

   EMail: jnc@mit.edu