idnits 2.17.1

draft-irtf-rrg-recommendation-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Sep 2009 rather than the newer Notice from 28 Dec 2009.  (See
     https://trustee.ietf.org/license-info/)

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (December 26, 2009) is 5207 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC4423' is mentioned on line 1421, but not defined

  ** Obsolete undefined reference: RFC 4423 (Obsoleted by RFC 9063)

  == Missing Reference: 'RFC5214' is mentioned on line 1424, but not defined

  == Missing Reference: 'I-D.zhang-evolution' is mentioned on line 1537, but
     not defined

  == Missing Reference: 'I-D.farinacci-lisp-lig' is mentioned on line 1371,
     but not defined

  == Missing Reference: 'I-D.ietf-lisp' is mentioned on line 1375, but not
     defined

  == Missing Reference: 'I-D.ietf-lisp-alt' is mentioned on line 1380, but not
     defined

  == Missing Reference: 'I-D.ietf-lisp-interworking' is mentioned on line
     1385, but not defined

  == Missing Reference: 'I-D.ietf-lisp-ms' is mentioned on line 1391, but not
     defined

  == Missing Reference: 'I-D.meyer-lisp-mn' is mentioned on line 1395, but not
     defined

  == Missing Reference: 'I-D.meyer-loc-id-implications' is mentioned on line
     1400, but not defined

  == Missing Reference: 'I-D.xu-rangi' is mentioned on line 1407, but not
     defined

  == Missing Reference: 'I-D.xu-rangi-proxy' is mentioned on line 1412, but
     not defined

  == Missing Reference: 'RANGI' is mentioned on line 1417, but not defined

  == Missing Reference: 'I-D.whittle-ivip-db-fast-push' is mentioned on line
     1430, but not defined

  == Missing Reference: 'I-D.whittle-ivip4-etr-addr-forw' is mentioned on line
     1435, but not defined

  == Missing Reference: 'Ivip PMTUD' is mentioned on line 1452, but not
     defined

  == Missing Reference: 'Ivip6' is mentioned on line 1463, but not defined

  == Missing Reference: 'I-D.frejborg-hipv4' is mentioned on line 1469, but
     not defined

  == Missing Reference: 'LMS' is mentioned on line 1475, but not defined

  == Missing Reference: 'GLI' is mentioned on line 1488, but not defined

  == Missing Reference: 'I-D.adan-idr-tidr' is mentioned on line 1495, but not
     defined

  == Unused Reference: 'RFC1887' is defined on line
     1359, but no explicit reference was found in the text

  == Unused Reference: 'I-D.carpenter-renum-needs-work' is defined on line
     1364, but no explicit reference was found in the text

  == Outdated reference: A later version (-06) exists of
     draft-irtf-rrg-design-goals-01

  == Outdated reference: A later version (-05) exists of
     draft-narten-radir-problem-statement-04

  == Outdated reference: A later version (-05) exists of
     draft-carpenter-renum-needs-work-04


     Summary: 2 errors (**), 0 flaws (~~), 27 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2  Internet Research Task Force                                  T. Li, Ed.
3  Internet-Draft                                                  Ericsson
4  Intended status: Informational                         December 26, 2009
5  Expires: June 29, 2010

7                 Recommendation for a Routing Architecture
8                     draft-irtf-rrg-recommendation-03

10  Abstract

12     It is commonly recognized that the Internet routing and addressing
13     architecture is facing challenges in scalability, multi-homing, and
14     inter-domain traffic engineering.  This document reports the Routing
15     Research Group's preliminary findings from its efforts towards
16     developing a recommendation for a scalable routing architecture.

18     This document is a work in progress.

20  Status of this Memo

22     This Internet-Draft is submitted to IETF in full conformance with the
23     provisions of BCP 78 and BCP 79.

25     Internet-Drafts are working documents of the Internet Engineering
26     Task Force (IETF), its areas, and its working groups.  Note that
27     other groups may also distribute working documents as Internet-
28     Drafts.

30     Internet-Drafts are draft documents valid for a maximum of six months
31     and may be updated, replaced, or obsoleted by other documents at any
32     time.  It is inappropriate to use Internet-Drafts as reference
33     material or to cite them other than as "work in progress."

35     The list of current Internet-Drafts can be accessed at
36     http://www.ietf.org/ietf/1id-abstracts.txt.

38     The list of Internet-Draft Shadow Directories can be accessed at
39     http://www.ietf.org/shadow.html.

41     This Internet-Draft will expire on June 29, 2010.

43  Copyright Notice

45     Copyright (c) 2009 IETF Trust and the persons identified as the
46     document authors.  All rights reserved.

48     This document is subject to BCP 78 and the IETF Trust's Legal
49     Provisions Relating to IETF Documents
50     (http://trustee.ietf.org/license-info) in effect on the date of
51     publication of this document.  Please review these documents
52     carefully, as they describe your rights and restrictions with respect
53     to this document.  Code Components extracted from this document must
54     include Simplified BSD License text as described in Section 4.e of
55     the Trust Legal Provisions and are provided without warranty as
56     described in the BSD License.

58  Table of Contents

60     1.  Introduction
61       1.1.  Structure of This Document
62     2.  Locator Identifier Separation Protocol (LISP)
63       2.1.  Key Idea
64       2.2.  Gains
65       2.3.  Costs
66     3.  Routing Architecture for the Next Generation Internet
67         (RANGI)
68       3.1.  Key Idea
69       3.2.  Gains
70       3.3.  Costs
71     4.  Internet Vastly Improved Plumbing (Ivip)
72       4.1.  Key Ideas
73       4.2.  Extensions
74         4.2.1.  TTR Mobility
75         4.2.2.
Modified Header Forwarding
76       4.3.  Gains
77       4.4.  Costs
78     5.  hIPv4
79       5.1.  Key Idea
80       5.2.  Gains
81       5.3.  Costs And Issues
82     6.  Name overlay (NOL) service for scalable Internet routing
83       6.1.  Key Idea
84       6.2.  Gains
85       6.3.  Costs
86     7.  Compact routing in locator identifier mapping system
87       7.1.  Key Idea
88       7.2.  Gains
89       7.3.  Costs
90     8.  Layered mapping system (LMS)
91       8.1.  Key Ideas
92       8.2.  Gains
93       8.3.  Costs
94     9.  2-phased mapping
95       9.1.  Considerations
96       9.2.  My contribution: a 2-phased mapping
97       9.3.  Gains
98       9.4.  Summary
99     10. Global Locator, Local Locator, and Identifier Split
100         (GLI-Split)
101       10.1.  Key Idea
102       10.2.  Gains
103       10.3.  Costs
104     11.
Tunneled Inter-domain Routing (TIDR)
105       11.1.  Key Idea
106       11.2.  Gains
107       11.3.  Costs
108     12. Identifier-Locator Network Protocol (ILNP)
109       12.1.  Key Ideas
110       12.2.  Benefits
111       12.3.  Costs
112     13. Enhanced Efficiency of Mapping Distribution Protocols in
113         Map-and-Encap Schemes
114       13.1.  Introduction
115       13.2.  Management of Mapping Distribution of Subprefixes
116              Spread Across Multiple ETRs
117       13.3.  Management of Mapping Distribution for Scenarios with
118              Hierarchy of ETRs and Multi-Homing
119     14. Evolution
120       14.1.  Need for Evolution
121       14.2.  Relation to Other RRG Proposals
122       14.3.  Aggregation with Increasing Scopes
123     15. Name-Based Sockets
124     16. Recommendation
125     17. Acknowledgements
126     18. IANA Considerations
127     19. Security Considerations
128     20. References
129       20.1.  Normative References
130       20.2.  Informative References
131       20.3.  LISP References
132       20.4.  RANGI References
133       20.5.  Ivip References
134       20.6.  hIPv4 References
135       20.7.  Layered Mapping System References
136       20.8.  GLI References
137       20.9.  TIDR References
138       20.10. ILNP References
139       20.11. EEMDP References
140       20.12. Evolution References
141       20.13. Name Based Sockets References
142     Author's Address

144  1.  Introduction

146     It is commonly recognized that the Internet routing and addressing
147     architecture is facing challenges in scalability, multi-homing, and
148     inter-domain traffic engineering.  The problem being addressed has
149     been documented in [I-D.narten-radir-problem-statement], and the
150     design goals that we have agreed to can be found in
151     [I-D.irtf-rrg-design-goals].  This document reports the Routing
152     Research Group's (RRG's) results from its efforts towards developing
153     a recommendation for a scalable routing architecture.

155     This document is a work in progress.

157  1.1.  Structure of This Document

159     This document describes a number of the different possible approaches
160     that could be taken in a new routing architecture, as well as a
161     summary of the current thinking of the overall group regarding each
162     approach.

164  2.  Locator Identifier Separation Protocol (LISP)

166  2.1.  Key Idea

168     Implements a locator-identifier separation mechanism using
169     encapsulation between routers at the "edge" of the Internet.  Such a
170     separation allows topological aggregation of the routable addresses
171     (locators) while providing stable and portable numbering of end
172     systems (identifiers).

174  2.2.
Gains

176     o  topological aggregation of numbering space (RLOCs) used for
177        routing, which greatly reduces both the overall size and the
178        "churn rate" of the information needed to operate the Internet
179        global routing system

181     o  separate numbering space (EIDs) for end-systems, effectively
182        allowing "PI for all" (no renumbering cost for connectivity
183        changes) without adding state to the global routing system

185     o  improved traffic engineering capabilities that explicitly do not
186        add state to the global routing system and whose deployment will
187        allow active removal of more-specific state currently used

189     o  no changes required to end systems

190     o  no changes to Internet "core" routers

192     o  minimal and straightforward changes to "edge" routers

194     o  day-one advantages for early adopters

196     o  defined router-to-router protocol

198     o  defined database mapping system

200     o  defined deployment plan

202     o  defined interoperability/interworking mechanisms

204     o  defined scalable end-host mobility mechanisms

206     o  prototype implementation already exists and is undergoing testing

208     o  production implementations in progress

210  2.3.  Costs

212     o  mapping system infrastructure (map servers, map resolvers, ALT
213        routers) (new potential business opportunity)

215     o  interworking infrastructure (proxy ITRs) (new potential business
216        opportunity)

218     o  overhead for determining/maintaining locator/path liveness (common
219        issue for all id/loc separation proposals)

221  3.  Routing Architecture for the Next Generation Internet (RANGI)

223  3.1.  Key Idea

225     Similar to HIP [RFC4423], RANGI introduces a host identifier layer
226     between the network layer and the transport layer, and the transport-
227     layer associations (i.e., TCP connections) are no longer bound to IP
228     addresses, but to host identifiers.
The major difference from
229     HIP is that the host identifier in RANGI is a 128-bit hierarchical
230     and cryptographic identifier which has organizational structure.  As
231     a result, the corresponding ID->locator mapping system for such
232     identifiers has a reasonable business model and clear trust boundaries.
233     In addition, RANGI uses IPv4-embedded IPv6 addresses as locators.  The
234     LD ID (i.e., the leftmost 96 bits) of this locator is a provider-
235     assigned /96 IPv6 prefix, while the last four octets of this locator
236     are a local IPv4 address (either public or private).  This special
237     locator could be used to realize 6over4 automatic tunneling
238     (borrowing ideas from ISATAP [RFC5214]), which will reduce the
239     deployment cost of this new routing architecture.  Within RANGI, the
240     mappings from FQDNs to host identifiers are stored in the DNS system,
241     while the mappings from host identifiers to locators are stored in a
242     distributed id/locator mapping system (e.g., a hierarchical
243     Distributed Hash Table (DHT) system, or a reverse DNS system).

245  3.2.  Gains

247     RANGI achieves almost all of the goals set by the RRG, as follows:

249     1.  Routing Scalability: Scalability is achieved by decoupling
250         identifiers from locators.

252     2.  Traffic Engineering: Hosts located in a multi-homed site can
253         suggest the upstream ISP for outbound and inbound traffic, while
254         the first-hop LDBR (i.e., site border router) makes the final
255         decision on upstream ISP selection.

257     3.  Mobility and Multi-homing: Sessions will not be interrupted due
258         to locator change in cases of mobility or multi-homing.

260     4.  Simplified Renumbering: When changing providers, the local IPv4
261         addresses of the site do not need to change.  Hence the internal
262         routers within the site don't need renumbering.

264     5.  Decoupling Location and Identifier: Obvious.

266     6.
Routing Stability: Since the locators are topologically
267         aggregatable and the internal topology within an LD will not be
268         disclosed outside, routing stability could be improved
269         greatly.

271     7.  Routing Security: RANGI reuses the current routing system and
272         does not introduce any new security risk into the routing system.

274     8.  Incremental Deployability: RANGI allows easy transition from IPv4
275         networks to IPv6 networks.  In addition, a RANGI proxy allows RANGI-
276         aware hosts to communicate with legacy IPv4 or IPv6 hosts, and vice
277         versa.

279  3.3.  Costs

281     1.  Host changes are required.

283     2.  A first-hop LDBR change is required to support site-controlled
284         traffic-engineering capability.

286     3.  The ID->Locator mapping system is a new infrastructure to be
287         deployed.

289     4.  A proxy needs to be deployed for communication between RANGI-aware
290         hosts and legacy hosts.

292  4.  Internet Vastly Improved Plumbing (Ivip)

294  4.1.  Key Ideas

296     Ivip (pr. eye-vip, est. 2007-06-15) is a core-edge separation scheme
297     for IPv4 and IPv6.  It provides multihoming, portability of address
298     space and inbound traffic engineering for end-user networks of all
299     sizes and types, including those of corporations, SOHO and mobile
300     devices.

302     Ivip meets all the constraints imposed by the need for widespread
303     voluntary adoption [Ivip Constraints].

305     Ivip's global fast-push mapping distribution network is structured
306     like a cross-linked multicast tree.  This pushes all mapping changes
307     to full database query servers (QSDs) within ISPs and end-user
308     networks which have ITRs.  Each mapping change is sent to all QSDs
309     within a few seconds.

311     ITRs gain mapping information from these local QSDs within a few tens
312     of milliseconds.  QSDs notify ITRs of changed mappings with similarly
313     low latency.  ITRs tunnel all traffic packets to the correct ETR
314     without significant delay.

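The push, query and notify flow described above can be sketched roughly as follows. This is an illustrative model only and is not taken from the Ivip drafts: the class names `QSD` and `ITR`, the method names, and the representation of a mapped range as an inclusive (first, last) pair of integer addresses in a single address family are all assumptions made for the example.

```python
import ipaddress


def _to_int(addr):
    """Convert a textual IP address to an integer for range comparison."""
    return int(ipaddress.ip_address(addr))


class QSD:
    """Hypothetical full-database query server: stores every mapping."""

    def __init__(self):
        self.mappings = {}     # (first, last) inclusive range -> ETR address
        self.subscribers = {}  # range -> ITRs that have cached this range

    def push_update(self, first, last, etr):
        """Apply one mapping change received from the fast-push network."""
        key = (_to_int(first), _to_int(last))
        self.mappings[key] = etr
        # Notify every ITR that holds a cached copy of this range.
        for itr in self.subscribers.get(key, ()):
            itr.cache[key] = etr

    def query(self, itr, dst):
        """Answer a map request and remember the ITR for later notifications."""
        addr = _to_int(dst)
        for key, etr in self.mappings.items():
            if key[0] <= addr <= key[1]:
                self.subscribers.setdefault(key, set()).add(itr)
                return key, etr
        return None, None


class ITR:
    """Hypothetical ingress tunnel router with a local mapping cache."""

    def __init__(self, qsd):
        self.qsd = qsd
        self.cache = {}

    def etr_for(self, dst):
        """Return the ETR to tunnel to, or None if dst is not mapped."""
        addr = _to_int(dst)
        for (first, last), etr in self.cache.items():
            if first <= addr <= last:
                return etr
        key, etr = self.qsd.query(self, dst)
        if key is not None:
            self.cache[key] = etr
        return etr


if __name__ == "__main__":
    qsd = QSD()
    qsd.push_update("192.0.2.0", "192.0.2.9", "203.0.113.1")
    itr = ITR(qsd)
    print(itr.etr_for("192.0.2.5"))
    # A later mapping change reaches the ITR through the QSD's notification.
    qsd.push_update("192.0.2.0", "192.0.2.9", "198.51.100.7")
    print(itr.etr_for("192.0.2.5"))
```

The linear scans keep the sketch short; a real query server would presumably use an indexed structure, and the caching query servers (QSCs) are left out entirely.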
316     Ivip's mapping consists of a single ETR address for each range of
317     mapped address space.  Ivip ITRs do not need to test reachability to
318     ETRs because the mapping is changed in real-time to that of the
319     desired ETR.

321     End-user networks control the mapping, typically by contracting a
322     specialized company to monitor the reachability of their ETRs and
323     change the mapping to achieve multihoming and/or TE.  So the
324     mechanisms which control ITR tunneling are controlled by the end-user
325     networks in real-time and are completely separate from the core-edge
326     separation scheme itself.

328     ITRs can be implemented in dedicated servers or hardware-based
329     routers.  The ITR function can also be integrated into sending hosts.
330     ETRs are relatively simple and only communicate with ITRs rarely -
331     for Path MTU management with longer packets.

333     Ivip-mapped ranges of end-user address space need not be subnets.
334     They can be of any length, in units of IPv4 addresses or IPv6 /64s.

336     Compared to conventional unscalable BGP techniques, and to the use of
337     core-edge separation architectures with non-real-time mapping
338     systems, end-user networks will be able to achieve more flexible and
339     responsive inbound TE.  If inbound traffic is split into several
340     streams, each to addresses in different mapped ranges, then real-time
341     mapping changes can be used to steer the streams between multiple
342     ETRs at multiple ISPs.

344     Open ITRs in the DFZ (OITRDs, similar to LISP's PTRs) tunnel packets
345     sent by hosts in networks which lack ITRs.  So multihoming,
346     portability and TE benefits apply to all traffic.

348     ITRs request mapping either directly from a local QSD or via one or
349     more layers of caching query servers (QSCs) which in turn request it
350     from a local QSD.  QSCs are optional but generally desirable since
351     they reduce the query load on QSDs.

353     ETRs may be in ISP or end-user networks.
IP-in-IP encapsulation is
354     used, so there is no UDP or other additional header.  PMTUD (Path MTU
355     Discovery) management with minimal complexity and overhead will
356     handle the problems caused by encapsulation, and adapt smoothly to
357     jumbo-frame paths becoming available in the DFZ.  The outer header's
358     source address is that of the sending host - which enables existing
359     ISP BR filtering of source addresses to be extended to encapsulated
360     traffic packets by the simple mechanism of the ETR dropping packets
361     whose inner and outer source addresses do not match.

363  4.2.  Extensions

365  4.2.1.  TTR Mobility

367     The TTR approach to mobility [Ivip Mobility] is applicable to all
368     core-edge separation techniques and provides scalable IPv4 and IPv6
369     mobility in which the MN keeps its own mapped IP address(es) no
370     matter how or where it is physically connected, including behind one
371     or more layers of NAT.

373     Path lengths are typically optimal or close to optimal and the MN
374     communicates normally with all other non-mobile hosts (no stack or
375     app changes), and of course other MNs.  Mapping changes are only
376     needed when the MN uses a new TTR, which would typically be if the MN
377     moved more than 1000 km.  Mapping changes are not required when the MN
378     changes its physical address(es).

380  4.2.2.  Modified Header Forwarding

382     Separate schemes for IPv4 and IPv6 enable tunneling from ITR to ETR
383     without encapsulation.  This will remove the encapsulation overhead
384     and PMTUD problems.  Both approaches involve modifying all routers
385     between the ITR and ETR to accept a modified form of the IP header.
386     These schemes require new FIB/RIB functionality in DFZ and some other
387     routers but do not alter the BGP functions of DFZ routers.

389  4.3.  Gains

391     Amenable to widespread voluntary adoption due to no need for host
392     changes, complete support for packets sent from non-upgraded networks
393     and no significant degradation in performance.

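The ETR-side check described in Section 4.1 (drop any encapsulated packet whose inner and outer source addresses differ) is simple enough to sketch. The following is a minimal IPv4-only illustration and not an Ivip implementation: the function names `ipv4_header` and `etr_decapsulate` are hypothetical, header checksums are not computed or verified, and the only protocol fact relied on is that IANA protocol number 4 denotes IPv4-in-IPv4 encapsulation.

```python
import socket
import struct

IPPROTO_IPIP = 4  # IANA protocol number for IPv4-in-IPv4 encapsulation


def ipv4_header(src, dst, proto, payload_len):
    """Build a minimal 20-byte IPv4 header (checksum left at zero)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | 5,        # version 4, IHL 5 (no options)
        0,                   # DSCP/ECN
        20 + payload_len,    # total length
        0, 0,                # identification, flags/fragment offset
        64,                  # TTL
        proto,               # protocol of the payload
        0,                   # header checksum (not computed in this sketch)
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )


def etr_decapsulate(packet):
    """Return the inner packet, or None if the ETR should drop the packet."""
    if len(packet) < 20 or packet[0] >> 4 != 4 or packet[9] != IPPROTO_IPIP:
        return None
    outer_len = (packet[0] & 0x0F) * 4
    if outer_len < 20:
        return None
    inner = packet[outer_len:]
    if len(inner) < 20:
        return None
    # Source address occupies bytes 12..15 of each IPv4 header.
    outer_src, inner_src = packet[12:16], inner[12:16]
    # Drop packets whose inner and outer source addresses do not match.
    return inner if outer_src == inner_src else None


if __name__ == "__main__":
    inner = ipv4_header("192.0.2.5", "198.51.100.9", 6, 0)
    good = ipv4_header("192.0.2.5", "203.0.113.1", IPPROTO_IPIP, len(inner)) + inner
    spoofed = ipv4_header("192.0.2.77", "203.0.113.1", IPPROTO_IPIP, len(inner)) + inner
    print(etr_decapsulate(good) == inner)
    print(etr_decapsulate(spoofed) is None)
```

Because the outer source address has already passed whatever source filtering the ISP's border routers apply, comparing it with the inner source address lets the ETR extend that filtering to the decapsulated packet with a single comparison.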
395     Modular separation of the control of ITR tunneling behavior from the
396     ITRs and the core-edge separation scheme itself: end-user networks
397     control mapping in any way they like, in real-time.

399     A small fee per mapping change deters frivolous changes and helps pay
400     for pushing the mapping data to all QSDs.  End-user networks that make
401     frequent mapping changes for inbound TE should find these fees
402     attractive considering how such changes improve their ability to
403     utilize the bandwidth of multiple ISP links.

405     End-user networks will typically pay the cost of OITRD forwarding to
406     their networks.  This provides a business model for OITRD deployment
407     and avoids unfair distribution of costs.

409     Existing source address filtering arrangements at BRs of ISPs and
410     end-user networks are prohibitively expensive to implement directly
411     in ETRs, but with the outer header's source address being the same as
412     the sending host's address, Ivip ETRs inexpensively enforce BR
413     filtering on decapsulated packets.

415  4.4.  Costs

417     QSDs receive all mapping changes and store a complete copy of the
418     mapping database.  However, a worst case scenario is 10 billion IPv6
419     mappings, each of 32 bytes (320 GB in total), which fits on a consumer
420     hard drive today and should fit in server DRAM by the time such
        adoption is reached.

422     The maximum number of non-mobile networks requiring multihoming etc.
423     is likely to be ~10M, so most of the 10B mappings would be for mobile
424     devices.  However, TTR mobility does not involve frequent mapping
425     changes since most MNs only rarely move more than 1000 km.

427  5.  hIPv4

429  5.1.  Key Idea

431     The hierarchical IPv4 framework adds scalability to the routing
432     architecture by introducing hierarchy in the IPv4 address space.  The
433     hIPv4 addressing scheme is divided into two parts, the Area Locator
434     (ALOC) address space, which is globally unique, and the Endpoint
435     Locator (ELOC) address space, which is only regionally unique.
The
436     ALOC and ELOC prefixes are added as an IP option to the IPv4 header
437     as described in RFC 1385.  Instead of creating a tunneling (i.e.,
438     overlay) solution, a new routing element is needed in every ALOC
439     realm, a Locator Swap Router.  The current IPv4 forwarding plane
440     remains intact, and no new routing protocols or mapping systems are
441     required.  The control plane of the ALOC realm routers needs some
442     modification in order for ICMP to be compatible with the hIPv4
443     framework.  When an area (one or several ASes) of an ISP has become an
444     ALOC realm, only ALOC prefixes are exchanged with other ALOC realms.
445     Directly attached ELOC prefixes are only inserted into the RIB of the
446     local ALOC realm; ELOC prefixes are not distributed in the DFZ.
447     Multi-homing can be achieved in two ways: either the enterprise
448     requests an ALOC prefix from the RIR (this is not recommended) or the
449     enterprise receives the ALOC prefixes from its upstream ISPs.  ELOC
450     prefixes are PI addresses and remain intact when an upstream ISP is
451     changed; only the ALOC prefixes are replaced.  When the RIB of the DFZ
452     is compressed, an ingress router no longer knows whether the destination
453     prefix is available or not; only attachment points (ALOC prefixes) of
454     the destination prefix are advertised in the DFZ.  Thus the endpoints
455     must take more responsibility for their sessions.  This can be
456     achieved by using multipath-enabled transport protocols, such as SCTP
457     and MPTCP, at the endpoints.  The multipath transport protocols also
458     provide a session identifier, i.e., verification tag/token; thus the
459     location and identifier split is carried out, and site mobility,
460     endpoint mobility and mobile site mobility are achieved.  DNS needs to
461     be upgraded: to resolve the location of an endpoint, it must have one
462     ELOC value (current A-record) and at least one ALOC value (in multi-
463     homing solutions there will be several ALOC values for an endpoint).
464     The hIPv4 framework can also be integrated into a map-and-encapsulate
465     solution; the ITR/ETR needs to incorporate the hIPv4 stack and might
466     use a multipath-enabled transport protocol to serve the hIPv4/
467     multipath transport protocol enabled endpoints.

469  5.2.  Gains

471     1.  Improved routing scalability: Adding hierarchy in the address
472         space enables a hierarchy in the routing architecture.  Early
473         adopters of an ALOC realm will no longer carry the RIB of the DFZ
474         - only ELOC prefixes of directly attached networks and ALOC
475         prefixes from other service providers that have migrated.

477     2.  Scalable support for traffic engineering: Multipath-enabled
478         transport protocols are recommended to achieve dynamic load-
479         balancing of a session.  Support for Valiant Load-balancing
480         schemes has been added to the framework; more research work is
481         required around VLB switching.

483     3.  Scalable support for multi-homing: Only attachment points of a
484         multi-homed site are advertised in the DFZ; DNS will inform the
485         requester how many attachment points the destination endpoint
486         has.  It is the initiating endpoint's choice/responsibility which
487         attachment point is used; endpoints using multipath-enabled
488         transport protocols can make use of several attachment points for
489         a session.

491     4.  Simplified Renumbering: When changing provider, the local ELOC
492         prefixes remain intact; only the ALOC prefix is changed on the
493         endpoints.

495     5.  Decoupling Location and Identifier: The verification tag (SCTP)
496         and token (MPTCP) can be considered to have the characteristics
497         of a session identifier and thus a session layer is created
498         between the transport and application layer in the TCP/IP model.

500     6.  Routing quality: The hIPv4 framework introduces no tunneling
501         mechanisms; only a swap of the IPv4 header and locator header at
502         the destination ALOC realm is required, thus current routing
503         algorithms are preserved as such.
Valiant Load-balancing might
504         be used as a new forwarding mechanism.

506     7.  Routing Security: Similar to today's DFZ, except that ELOC
507         prefixes cannot be hijacked (by injecting a longest-match
508         prefix) outside an ALOC realm (improved security).

510     8.  Deployability: The hIPv4 framework is an evolution of the current
511         IPv4 framework and is backwards compatible with the current IPv4
512         framework.  Sessions in a local network and inside an ALOC realm
513         might in the future still use the current IPv4 framework.

515  5.3.  Costs And Issues

517     1.  Upgrade of the stack at an endpoint, or the endpoint should make
518         use of an ITR/XTR

520     2.  In a multi-homing solution the border routers should be able to
521         apply policy based routing upon the ALOC value in the locator
522         header

524     3.  New policies must be set by the RIRs

526     4.  Short timeframe before the expected depletion of the IPv4 address
527         space occurs

529     5.  Will enterprises give up their global allocation of the current
530         IPv4 address block they have gained?

532     6.  Co-ordination with MPTCP is highly desirable

534  6.  Name overlay (NOL) service for scalable Internet routing

536  6.1.  Key Idea

538     The basic idea is to add a name overlay (NOL) on the existing TCP/IP
539     stack.

541     Its functions include:

543     1.  host name configuration, registration and authentication;

545     2.  initiating and managing transport connection channels (i.e.,
546         TCP/IP connections) by name;

548     3.  keeping application data transport continuous under mobility.

550     At the edge network, we introduce a new type of gateway, the NTR (Name
551     Transfer Relay), which blocks the PI addresses of edge networks from
552     entering upstream transit networks.  NTRs perform address and/or port
553     translation between blocked PI addresses and globally routable
554     addresses, much like today's widely used NAT/NAPT devices.
555     Both legacy and NOL applications behind an NTR can access the outside
556     as usual.
To access the hosts behind an NTR from outside, we need to
557     use NOL to traverse the NTR by name and initiate connections to the
558     hosts behind it.

560     Unlike proposed host-based ID/Locator split solutions, such
561     as HIP, Shim6, and name-oriented stack, NOL doesn't need to change
562     the existing TCP/IP stack, sockets and their packet formats.  NOL can
563     co-exist with the legacy infrastructure and with core-edge separation
564     solutions (e.g., APT, LISP, Six/one, Ivip, etc.).

566  6.2.  Gains

568     1.   Reduce routing table size: Prevents edge network PI addresses
569          from entering transit networks by deploying gateway NTRs.

571     2.   Traffic Engineering: For sessions initiated by legacy and NOL
572          applications, the incoming traffic can be directed to a specific
573          NTR by the DNS answer for names.  In addition, for NOL
574          applications, the initial session can be redirected from one NTR
575          to other appropriate NTRs.  These mechanisms provide some support
576          for traffic engineering.

578     3.   Multi-homing: When a PI address network connects to the Internet
579          by multi-homing with several providers, it can deploy NTRs to
580          block the PI addresses from entering provider networks.

582     4.   The NTRs can also be allocated PA addresses from the upstream
583          providers and store them in the NTRs' address pool.  By DNS query
584          or NOL session, any session that wants to access the hosts behind
585          the NTR can be delegated to a specific PA address in the NTR
586          address pool.

588     5.   Mobility: The NOL layer manages the traditional TCP/IP transport
589          connections, and keeps application data transport continuous by
590          setting breakpoints and sequence numbers in the data stream.

592     6.   No need to change TCP/IP stack, sockets and DNS system.

594     7.   No need for extra mapping system.

596     8.   NTR can be deployed unilaterally, just like NATs.

598     9.   NOL applications can communicate with legacy applications.

600     10.  NOL can be compatible with existing solutions, such as APT,
601          LISP, Ivip, etc.

603     11.
End user controlled multi-path indirect routing based on 604 distributed NTRs. This will benefit performance- 605 aware applications, such as MSN, video streaming, etc. 607 6.3. Costs 609 1. Legacy applications have trouble initiating access to the 610 servers behind an NTR. Such trouble can be resolved by deploying 611 a NOL proxy for legacy hosts, or delegating globally routable PA 612 addresses in the NTR address pool for these servers, or deploying 613 a server proxy outside the NTR. 615 2. It may increase the number of DNS entries, but not drastically, 616 because it only increases DNS entries at domain granularity, not per 617 host. A name used in NOL resembles an email 618 address, for example hostname@domain.net. The needed DNS entries and queries are 619 just for "domain.net", and the NTR knows the "hostnames". Not only will the number of DNS 620 entries increase, but they might also change more 621 frequently. However, the scalability and performance of DNS 622 are guaranteed by its name hierarchy and cache mechanism. 624 3. Address translating/rewriting costs on NTRs. 626 7. Compact routing in locator identifier mapping system 628 7.1. Key Idea 630 Builds a highly scalable locator identity mapping system using 631 compact routing principles. Provides means for dynamic topology 632 adaption to facilitate efficient aggregation. Map servers are 633 assigned as cluster heads or landmarks based on their capability to 634 aggregate EID announcements. 636 7.2. Gains 638 Minimizes the routing table sizes at the system level (= map 639 servers). Provides clear upper bounds for routing stretch that 640 defines the packet delivery delay of the map request/first packet. 642 Organizes the mapping system based on the EID numbering space and minimizes the 643 administrative overhead of managing EID space. No need for 644 administratively planned hierarchical address allocation as the 645 system will converge to a set of EID allocations.
647 Availability and robustness of the overall routing system (including 648 xTRs and map servers) are improved because of the potential to use multiple 649 map servers and direct routes without involvement of map servers. 651 7.3. Costs 653 The scalability gains will materialize only in large deployments. If 654 the stretch is required to be bounded as in compact routing 655 (worst case stretch less than or equal to 3, on average 1+epsilon) then 656 xTRs need to have memory/cache for the mappings of their cluster. 658 8. Layered mapping system (LMS) 660 8.1. Key Ideas 662 Build a hierarchical mapping system to support scalability, analyze 663 the design constraints and present an explicit system structure; 664 design a two-cache mechanism on the ingress tunneling router (ITR) to 665 gain low request delay and facilitate data validation. Tunneling and 666 mapping are done at the core and no change is needed on edge networks. 667 The mapping system is run by interest groups independent of ISPs, which 668 conforms to the economic model and can be voluntarily adopted by 669 various networks. The mapping system can also be constructed stepwise, 670 especially in the IPv6 scenario. 672 8.2. Gains 674 1. Scalability 676 1. Distributed storage of mapping data avoids central storage of 677 massive data and restricts updates to local areas; 679 2. The cache mechanism in the ITR reduces request loads on the mapping 680 system reasonably. 682 2. Deployability 684 1. No change on edge networks; only tunneling in core routers; new 685 devices in core networks; 687 2. The mapping system can be constructed stepwise: a mapping node 688 needn't be constructed if none of its responsible ELOCs is 689 allocated. This makes sense especially for IPv6. 691 3. Conforms to the economic model: the mapping system can profit from 692 its services; core routers and edge networks are willing to 693 join the circle, either to avoid router upgrades or to realize 694 traffic engineering.
Benefits from joining are independent 695 of the scheme's implementation scale. 697 3. Low request delay: The small number of layers in the mapping structure and 698 the two-stage cache achieve low request delay. 700 4. Data consistency: The two-stage cache enables the ITR to update data in 701 the map cache conveniently. 703 5. Traffic engineering support: Edge networks inform the mapping system 704 of their mappings with all upstream routers, with different priorities, 705 and thus control their ingress flows. 707 8.3. Costs 709 1. Deployment of LMS needs to be further discussed. 711 2. The structure of the mapping system needs to be refined according to 712 practical circumstances. 714 9. 2-phased mapping 716 9.1. Considerations 718 1. Mapping from prefixes to ETRs is an M:M mapping. Any change of a 719 (prefix, ETR) pair should be updated in a timely manner, which can be a heavy 720 burden to any mapping system if the relation changes frequently. 722 2. A prefix<->ETR mapping system cannot be deployed efficiently if it 723 is overwhelmed by the worldwide dynamics. Therefore the mapping 724 itself is not scalable with this direct mapping scheme. 726 9.2. My contribution: a 2-phased mapping 728 1. Introduce the AS number in the middle of the mapping: phase I mapping 729 is prefix<->AS#, phase II mapping is AS#<->ETRs. We now have an M:1:M 730 mapping model. 732 2. My assumption is that all ASes know their local prefixes 733 (in the IGP) better than others do, and most likely local prefixes can be 734 aggregated when mapping them to the AS#, which will make mapping 735 entry reduction possible. ASes also clearly know their ETRs on 736 their border between core and edge. So all mapping information can 737 be collected locally. 739 3. A registry system will take care of the phase I mapping 740 information. Each AS should have a register agent to notify the 741 registry of the local range of IP address space.
This system can 742 be organized as a hierarchical infrastructure like DNS, or 743 alternatively as a centralized registry like "whois" in each RIR. 744 Phase II mapping information can be distributed between XTRs as a 745 BGP extension. 747 4. A basic forwarding procedure is that the ITR first gets the 748 destination AS# from the phase I mapper (or from cache) when the 749 packet is entering the "core". Then it will check the closest 750 ETR of the destination AS#, since phase II mapping information has 751 been "pushed" to it through BGP updates. Finally, the ITR encapsulates 752 the packet and tunnels it to a corresponding ETR. 754 9.3. Gains 756 1. Any prefix reconfiguration (aggregation/deaggregation) within 757 an AS need not be notified to the mapping system. 759 2. Possible highly efficient aggregation of the local prefixes (in 760 the form of an IP space range). 762 3. Both phase I and phase II mapping can be stable. 764 4. A stable mapping system will reduce the update overhead 765 introduced by topology change/routing policy dynamics. 767 9.4. Summary 769 1. The 2-phased mapping scheme introduces AS# between the mapping 770 prefixes and ETRs. 772 2. The decoupling of direct mapping makes highly dynamic updates 773 stable; therefore it can be more scalable than any direct mapping 774 design. 776 3. The 2-phased mapping scheme is adaptable to any core/edge split 777 based proposal. 779 10. Global Locator, Local Locator, and Identifier Split (GLI-Split) 781 10.1. Key Idea 783 GLI-Split implements a separation between global routing (in the 784 global Internet outside edge networks) and local routing (inside edge 785 networks) using global and local locators (GLs, LLs). In 786 addition, a separate static identifier (ID) is used to identify 787 communication endpoints (e.g. nodes or services) independently of any 788 routing information. Locators and IDs are encoded in IPv6 addresses 789 to enable backwards-compatibility with the IPv6 Internet.
The higher 790 order bits store either a GL or a LL while the lower order bits 791 contain the ID. A local mapping system maps IDs to LLs and a global 792 mapping system maps IDs to GLs. The full GLI-mode requires nodes 793 with upgraded networking stacks and special GLI-gateways. The GLI- 794 gateways perform stateless locator rewriting in IPv6 addresses with 795 the help of the local and global mapping system. Non-upgraded IPv6 796 nodes can also be accommodated in GLI-domains since an enhanced DHCP 797 service and GLI-gateways compensate their missing GLI-functionality. 798 This is an important feature for incremental deployability. 800 10.2. Gains 802 The benefits of GLI-Split are 804 o Hierarchical aggregation of routing information in the global 805 Internet through separation of edge and core routing 807 o Provider changes not visible to nodes inside GLI-domains 808 (renumbering not needed) 810 o Rearrangement of subnetworks within edge networks not visible to 811 the outside world (better support of large edge networks) 813 o Transport connections survive both types of changes 815 o Multihoming 817 o Improved traffic engineering for incoming and outgoing traffic 819 o Multipath routing and load balancing for hosts 821 o Improved resilience 823 o Improved mobility support without home agents and triangle routing 825 o Interworking with the classic Internet 827 * without triangle routing over proxy routers 829 * without stateful NAT 831 These benefits are available for upgraded GLI-nodes, but non-upgraded 832 nodes in GLI-domains partially benefit from these advanced features, 833 too. This offers multiple incentives for early adopters and they 834 have the option to migrate their nodes gradually from non-GLI stacks 835 to GLI-stacks. 837 10.3. Costs 839 o Local and global mapping system 841 o Modified DHCP or similar mechanism 843 o GLI-gateways with stateless locator rewriting in IPv6 addresses 845 o Upgraded stacks (only for full GLI-mode) 847 11. 
Tunneled Inter-domain Routing (TIDR) 849 11.1. Key Idea 851 Provides a method for locator-identifier separation using tunnels 852 between routers at the edge of the Internet transit infrastructure. 853 It enriches the BGP protocol for distributing the identifier-to-locator 854 mapping. Using new BGP attributes, "identifier prefixes" are assigned 855 interdomain routing locators so that they will not be installed in 856 the RIB and will be moved to a new table called Tunnel Information 857 Base (TIB). Afterwards, when routing a packet to the "identifier 858 prefix", the TIB will be searched first to perform tunnel imposition, 859 and secondly the RIB for actual routing. After the edge router 860 performs tunnel imposition, all routers in the middle will route this 861 packet until it reaches the router at the tail-end of the tunnel. 863 11.2. Gains 865 o Smooth deployment 867 o Size Reduction of the Global RIB Table 869 o Deterministic Customer Traffic Engineering for Incoming Traffic 871 o Numerous Forwarding Decisions for a Particular Address Prefix 873 o TIDR Stops AS Number Space Depletion 875 o Improved BGP Convergence 877 o Protection of the Inter-domain Routing Infrastructure 879 o Easy Separation of Control Traffic and Transit Traffic 881 o Different Layer-2 Protocol-IDs for Transit and Non-Transit Traffic 883 o Multihoming Resilience 885 o New Address Families and Tunneling Techniques 887 o TIDR for IPv4 or IPv6, and Migration to IPv6 889 o Scalability, Stability and Reliability 891 o Faster Inter-domain Routing 893 11.3. Costs 895 o Routers at the edge of the interdomain infrastructure will need to 896 be upgraded to hold the mapping database (i.e. the TIB) 898 o "Mapping updates" will need to be treated differently from usual 899 BGP "routing updates" 901 12. Identifier-Locator Network Protocol (ILNP) 903 12.1. Key Ideas 905 o Provide crisp separation of Identifiers from Locators. 907 o Identifiers name nodes, not interfaces.
909 o Locators name subnetworks, rather than interfaces, so they are 910 equivalent to an IP routing prefix. 912 o Identifiers are never used for network-layer routing, whilst 913 Locators are never used for Node Identity. 915 o Transport-layer sessions (e.g. TCP session state) use only 916 Identifiers, never Locators, meaning that changes in location have 917 no adverse impact on an IP session. 919 12.2. Benefits 921 o The underlying protocol mechanisms support fully scalable site 922 multi-homing, node multi-homing, site mobility, and node mobility. 924 o ILNP enables topological aggregation of location information while 925 providing stable and topology-independent identities for nodes. 927 o In turn, this topological aggregation reduces both the routing 928 prefix "churn" rate and the overall size of the Internet's global 929 routing table, by eliminating the value and need for more-specific 930 routing state currently carried throughout the global (default- 931 free) zone of the routing system. 933 o ILNP enables improved Traffic Engineering capabilities without 934 adding any state to the global routing system. TE capabilities 935 include both provider-driven TE and also end-site-controlled TE. 937 o ILNP's mobility approach: 939 * eliminates the need for special-purpose routers (e.g. Home 940 Agent and/or Foreign Agent now required by Mobile IP & NEMO). 942 * eliminates "triangle routing" in all cases. 944 * supports both "make before break" and "break before make" 945 layer-3 handoffs. 947 o ILNP improves resilience and network availability while reducing 948 the global routing state (as compared with the currently deployed 949 Internet). 951 o ILNP is Incrementally Deployable: 953 * No changes are required to existing IPv6 (or IPv4) routers. 955 * Upgraded nodes gain benefits immediately ("day one"); those 956 benefits gain in value as more nodes are upgraded (this follows 957 Metcalfe's Law). 959 * Incremental Deployment approach is documented. 
961 o ILNP is Backwards Compatible: 963 * ILNPv6 is fully backwards compatible with IPv6 (ILNPv4 is fully 964 backwards compatible with IPv4). 966 * Reuses existing known-to-scale DNS mechanisms to provide 967 identifier/locator mapping. 969 * Existing DNS Security mechanisms are reused without change. 971 * Existing IP Security mechanisms are reused with one minor 972 change (IPsec Security Associations replace current use of IP 973 Addresses with new use of Locator values). NB: IPsec is also 974 backwards compatible. 976 * Backwards Compatibility approach is documented. 978 o No new or additional overhead is required to determine or to 979 maintain locator/path liveness. 981 o ILNP does not require locator rewriting (NAT); ILNP permits and 982 tolerates NAT should that be desirable in some deployment(s). 984 o Changes to upstream network providers do not require node or 985 subnetwork renumbering within end-sites. 987 o ILNP is compatible with, and can facilitate transition from, current 988 single-path TCP to multi-path TCP. 990 o ILNP can be implemented such that existing applications (e.g. 991 applications using the BSD Sockets API) do NOT need any changes or 992 modifications to use ILNP. 994 12.3. Costs 996 o End systems need to be enhanced incrementally to support ILNP in 997 addition to IPv6 (or IPv4 or both). 999 o DNS servers supporting upgraded end systems also should be 1000 upgraded to support new DNS resource records for ILNP. (DNS 1001 protocol & DNS security do not need any changes.) 1003 13. Enhanced Efficiency of Mapping Distribution Protocols in Map-and- 1004 Encap Schemes 1006 13.1. Introduction 1008 We present some architectural principles pertaining to the mapping 1009 distribution protocols, especially applicable to map-and-encap (e.g., 1010 LISP) type of protocols.
These principles enhance the efficiency of 1011 the map-and-encap protocols in terms of (1) better utilization of 1012 resources (e.g., processing and memory) at Ingress Tunnel Routers 1013 (ITRs) and mapping servers, and consequently, (2) reduction of 1014 response time (e.g., first packet delay). We consider how Egress 1015 Tunnel Routers (ETRs) can perform aggregation of end-point ID (EID) 1016 address space belonging to their downstream delivery networks, in 1017 spite of migration/re-homing of some subprefixes to other ETRs. This 1018 aggregation may be useful for reducing the processing load and memory 1019 consumption associated with map messages, especially at some 1020 resource-constrained ITRs and subsystems of the mapping distribution 1021 system. We also consider another architectural concept where the 1022 ETRs are organized in a hierarchical manner for the potential benefit 1023 of aggregation of their EID address spaces. The two key 1024 architectural ideas are discussed in some more detail below. A more 1025 complete description can be found in a document [EEMDP 1026 Considerations] that was presented at the RRG meeting in Dublin 1027 [EEMDP Presentation]. 1029 It will be helpful to refer to Figures 1, 2, and 3 in the document 1030 noted above for some of the discussions that follow here below. 1032 13.2. Management of Mapping Distribution of Subprefixes Spread Across 1033 Multiple ETRs 1035 To assist in this discussion, we start with the high level 1036 architecture of a map-and-encap approach (it would be helpful to see 1037 Fig. 1 in the document mentioned above). In this architecture we 1038 have the usual ITRs, ETRs, delivery networks, etc. In addition, we 1039 have the ID-Locator Mapping (ILM) servers which are repositories for 1040 complete mapping information, while the ILM-Regional (ILM-R) servers 1041 can contain partial and/or regionally relevant mapping information. 
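As a rough illustration of the division of labor between the ILM and ILM-R servers described above, the following Python sketch models an ILM as a complete EID-prefix-to-ETR table and an ILM-R as a partial, regional subset of it. It is not part of the proposal: the names MapStore and resolve, the example prefixes and ETR labels, and in particular the behavior that a miss at the ILM-R falls back to the full ILM are assumptions made only for illustration.

```python
import ipaddress


class MapStore:
    """Hypothetical EID-prefix -> ETR table with longest-prefix lookup."""

    def __init__(self, entries):
        # entries: dict of prefix string -> ETR name (both illustrative)
        self.entries = {ipaddress.ip_network(p): etr for p, etr in entries.items()}

    def lookup(self, addr):
        ip = ipaddress.ip_address(addr)
        matches = [net for net in self.entries if ip in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)  # longest match wins
        return str(best), self.entries[best]


def resolve(addr, ilm_r, ilm):
    """Ask the regional server first; on a miss, consult the complete ILM.

    The fallback order is an assumption, not something the draft specifies.
    """
    hit = ilm_r.lookup(addr)
    if hit is not None:
        return hit + ("ILM-R",)
    hit = ilm.lookup(addr)
    return None if hit is None else hit + ("ILM",)


# Complete mapping at the ILM; the ILM-R holds only a regional subset.
ilm = MapStore({"10.1.0.0/20": "ETR1", "10.2.0.0/16": "ETR9"})
ilm_r = MapStore({"10.1.0.0/20": "ETR1"})

print(resolve("10.1.7.1", ilm_r, ilm))   # answered from the regional subset
print(resolve("10.2.3.4", ilm_r, ilm))   # needs the complete ILM
```

In this sketch the regional server answers whatever it can from its partial table, so only queries for prefixes outside its region reach the complete repository.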
1043 While a large endpoint address space contained in a prefix may be 1044 mostly associated with the delivery networks served by one ETR, some 1045 fragments (subprefixes) of that address space may be located 1046 elsewhere at other ETRs. Let a/20 denote a prefix that is 1047 conceptually viewed as composed of 16 subnets of /24 size that are 1048 denoted as a1/24, a2/24, :::, a16/24. For example, a/20 is mostly at 1049 ETR1, while only two of its subprefixes a8/24 and a15/24 are 1050 elsewhere at ETR3 and ETR2, respectively (see Fig. 2 in the 1051 document). From the point of view of efficiency of the mapping 1052 distribution protocol, it may be beneficial for ETR1 to announce a 1053 map for the entire space a/20 (rather than fragment it into a 1054 multitude of more-specific prefixes), and provide the necessary 1055 exceptions in the map information. Thus the map message could be in 1056 the form of Map:(a/20, ETR1; Exceptions: a8/24, a15/24). In 1057 addition, ETR2 and ETR3 announce the maps for a15/24 and a8/24, 1058 respectively, and so the ILMs know where the exception EID addresses 1059 are located. Now consider a host associated with ITR1 initiating a 1060 packet destined for an address a7(1), which is in a7/24 that is not 1061 in the exception portion of a/20. Now a question arises as to which 1062 of the following approaches would be the best choice: 1064 1. ILM-R provides the complete mapping information for a/20 to ITR1 1065 including all maps for relevant exception subprefixes. 1067 2. ILM-R provides only the directly relevant map to ITR1 which in 1068 this case is (a/20, ETR1). 1070 In the first approach, the advantage is that ITR1 would have the 1071 complete mapping for a/20 (including exception subnets), and it would 1072 not have to generate queries for subsequent first packets that are 1073 destined to any address in a/20, including a8/24 and a15/24. 
1074 However, the disadvantage is that if there is a significant number of 1075 exception subprefixes, then the very first packet destined for a/20 1076 will experience a long delay, and also the processors at ITR1 and 1077 ILM-R can experience overload. In addition, the memory usage at ITR1 1078 can be very inefficient as well. The advantage of the second 1079 approach above is that the ILM-R does not overload resources at ITR1 1080 both in terms of processing and memory usage, but it needs an enhanced 1081 map response of the form Map:(a/20, ETR1, MS=1), where the MS (more 1082 specific) indicator is set to 1 to indicate to ITR1 that not all 1083 subnets in a/20 map to ETR1. The key idea is that aggregation is 1084 beneficial and subnet exceptions must be handled with additional 1085 messages or indicators in the maps. 1087 13.3. Management of Mapping Distribution for Scenarios with Hierarchy 1088 of ETRs and Multi-Homing 1090 Now we highlight another architectural concept related to mapping 1091 management (helpful here to refer to Fig. 3 in the document). Here 1092 we consider the possibility that ETRs may be organized in a 1093 hierarchical manner. For instance, ETR7 is higher in the hierarchy 1094 relative to ETR1, ETR2, and ETR3, and likewise ETR8 is higher 1095 relative to ETR4, ETR5, and ETR6. In this case, ETRs 1 through 3 can 1096 delegate the locator role to ETR7 for their EID address space. In 1097 essence, they can allow ETR7 to act as the locator for the delivery 1098 networks in their purview. ETR7 keeps a local mapping table for 1099 mapping the appropriate EID address space to specific ETRs that are 1100 hierarchically associated with it in the level below. In this 1101 situation, ETR7 can perform EID address space aggregation across ETRs 1102 1 through 3 and can also include its own immediate EID address space 1103 for the purpose of that aggregation.
The many details related to 1104 this approach and special circumstances involving multi-homing of 1105 subnets are discussed in detail in the detailed document noted 1106 earlier. The hierarchical organization of ETRs and delivery networks 1107 should help in the future growth and scalability of ETRs and mapping 1108 distribution networks. This is essentially recursive map-and-encap, 1109 and some of the mapping distribution and management functionality 1110 will remain local to topologically neighboring delivery networks 1111 which are hierarchically underneath ETRs. 1113 14. Evolution 1115 As the Internet continues its rapid growth, router memory size and 1116 CPU cycle requirements are outpacing feasible hardware upgrade 1117 schedules. We propose to solve this problem by applying aggregation 1118 with increasing scopes to gradually evolve the routing system towards 1119 a scalable structure. At each evolutionary step, our solution is 1120 able to interoperate with the existing system and provide immediate 1121 benefits to adopters to enable deployment. This document summarizes 1122 the need for an evolutionary design, the relationship between our 1123 proposal and other revolutionary proposals and the steps of 1124 aggregation with increasing scopes. Our detailed proposal can be 1125 found in [I-D.zhang-evolution]. 1127 14.1. Need for Evolution 1129 Multiple different views exist regarding the routing scalability 1130 problem. Networks differ vastly in goals, behavior, and resources, 1131 giving each a different view of the severity and imminence of the 1132 scalability problem. Therefore we believe that, for any solution to 1133 be adopted, it will start with one or a few early adopters, and may 1134 not ever reach the entire Internet. The evolutionary approach 1135 recognizes that changes to the Internet can only be a gradual process 1136 with multiple stages. At each stage, adopters are driven by and 1137 rewarded with solving an immediate problem. 
Each solution must be 1138 deployable by individual networks that deem it necessary at a time 1139 they deem it necessary, without requiring coordination from other 1140 networks, and the solution has to bring immediate relief to a single 1141 first-mover. 1143 14.2. Relation to Other RRG Proposals 1145 Most proposals take a revolutionary approach that expects the entire 1146 Internet to eventually move to some new design whose main benefits 1147 would not materialize until the vast majority of the system has been 1148 upgraded; their incremental deployment plan simply ensures 1149 interoperation between upgraded and legacy parts of the system. In 1150 contrast, the evolutionary approach depicts a picture where changes 1151 may happen here and there as needed, but there is no dependency on 1152 the system as a whole making a change. Whoever takes a step forward 1153 gains the benefit by solving its own problem, without depending on 1154 others to take action. Thus, deployability includes not only 1155 interoperability, but also the alignment of costs and gains. 1157 The main differences between our approach and more revolutionary map- 1158 encap proposals are: (a) we do not start with a pre-defined boundary 1159 between edge and core; and (b) each step brings immediate benefits to 1160 individual first-movers. Note that our proposal neither interferes with 1161 nor prevents any revolutionary host-based solutions such as ILNP from 1162 being rolled out. However, host-based solutions do not bring useful 1163 impact until a large portion of hosts have been upgraded. Thus even 1164 if a host-based solution is rolled out in the long run, an 1165 evolutionary solution is still needed for the near term. 1167 14.3. Aggregation with Increasing Scopes 1169 Aggregating many routing entries into fewer entries is a basic 1170 approach to improving routing scalability. Aggregation can take 1171 different forms and be done within different scopes.
In our design, 1172 the aggregation scope starts from a single router, then expands to a 1173 single network, and then to neighbor networks. The order of the following 1174 steps is not fixed but merely a suggestion; it is at each 1175 individual network's discretion which steps it chooses to take based 1176 on its evaluation of the severity of the problems and the 1177 affordability of the solutions. 1179 1. FIB Aggregation (FA) in a single router. A router 1180 algorithmically aggregates its FIB entries without changing its 1181 RIB or its routing announcements. No coordination among routers 1182 is needed, nor any change to existing protocols. This brings 1183 scalability relief to individual routers with only a software 1184 upgrade. 1186 2. Enabling 'best external' on PEs, ASBRs, and RRs, and turning on 1187 next-hop-self on RRs. For hierarchical networks, the RRs in each 1188 PoP can serve as a default gateway for nodes in the PoP, thus 1189 allowing the non-RR nodes in each PoP to maintain smaller routing 1190 tables that only include paths that egress out of that PoP. This 1191 is known as 'topology-based mode' Virtual Aggregation, and can be 1192 done with existing hardware and configuration changes only. 1193 Please see [Evolution Grow Presenatation] for details. 1195 3. Virtual Aggregation (VA) in a single network. Within an AS, some 1196 fraction of existing routers are designated as Aggregation Point 1197 Routers (APRs). These routers either individually or 1198 collectively maintain the full FIB table. Other routers may 1199 suppress entries from their FIBs, instead forwarding packets to 1200 APRs, which will then tunnel the packets to the correct egress 1201 routers. VA can be viewed as an intra-domain map-encap system that 1202 provides the operators a control mechanism for the FIB size in 1203 their routers. 1205 4. VA across neighbor networks.
When adjacent networks have VA 1206 deployed, they can go one step further by piggybacking egress 1207 router information on existing BGP announcements, so that packets 1208 can be tunneled directly to a neighbor network's egress router. 1209 This improves packet delivery performance by performing the 1210 encapsulation/decapsulation only once across these neighbor 1211 networks, as well as reducing the stretch of the path. 1213 5. Reducing RIB Size by separating control plane from the data 1214 plane. Although a router's FIB can be reduced by FA or VA, it 1215 usually still needs to maintain the full RIB in order for routing 1216 announcements to its neighbors. To reduce the RIB size, a 1217 network can set up special boxes, which we call controllers, to 1218 take over the eBGP sessions from border routers. The controllers 1219 receive eBGP announcements, make routing decisions, and then 1220 inform other routers in the same network of how to forward 1221 packets, while the regular routers just focus on the job of 1222 forwarding packets. The controllers, not being part of the data 1223 path, can be scaled using commodity hardware. 1225 6. Insulating forwarding routers from routing churns. For routers 1226 with a smaller RIB, the rate of routing churns is naturally 1227 reduced. Further reduction can be achieved by not announcing 1228 failures of customer prefixes into the core, but handling these 1229 failures in a data-driven fashion, e.g., a link failure to an 1230 edge network is not reported unless and until there are data 1231 packets that are heading towards the failed link. 1233 15. Name-Based Sockets 1235 Name-based sockets are an evolution of the existing address-based 1236 sockets, enabling applications to initiate and receive communication 1237 sessions by use of domain names in lieu of IP addresses. 
Name-based 1238 sockets move the existing indirection from domain names to IP 1239 addresses from its current position in applications down to the IP 1240 layer. As a result, applications communicate exclusively based on 1241 domain names, while the discovery, selection, and potentially in- 1242 session re-selection of IP addresses are centrally performed by the 1243 operating system. 1245 Name-based sockets help mitigate the Internet routing scalability 1246 problem by separating naming and addressing more consistently than 1247 is possible with the existing address-based sockets. This 1248 supports IP address aggregation because it simplifies the use of IP 1249 addresses with high topological significance, as well as the dynamic 1250 replacement of IP addresses during network-topological and host- 1251 attachment changes. 1253 A particularly positive effect of name-based sockets on Internet 1254 routing scalability is new incentives for edge network operators to 1255 use provider-assigned IP addresses, which are more easily aggregated 1256 than the typically preferred provider-independent IP addresses. Even 1257 though provider-independent IP addresses are harder to get and more 1258 expensive than provider-assigned IP addresses, many operators desire 1259 provider-independent addresses due to the high indirect cost of 1260 provider-assigned IP addresses. This indirect cost comprises both 1261 difficulties in multi-homing and tedious, largely manual 1262 renumbering upon provider changes. 1264 Name-based sockets reduce the indirect cost of provider-assigned IP 1265 addresses in three ways, and hence make the use of provider-assigned 1266 IP addresses more acceptable: (1) They enable fine-granular and 1267 responsive multi-homing. (2) They simplify renumbering by offering an 1268 easy means to replace IP addresses in referrals with domain names.
1269 This helps avoid updates to application and operating system 1270 configurations, scripts, and databases during renumbering. (3) They 1271 facilitate low-cost solutions that eliminate renumbering altogether. 1272 One such low-cost solution is IP address translation, which in 1273 combination with name-based sockets loses its adverse impact on 1274 applications. 1276 A prerequisite for a positive effect of name-based sockets on Internet 1277 routing scalability is their adoption in operating systems and 1278 applications. Operating systems should be augmented to offer name- 1279 based sockets as a new alternative to the existing address-based 1280 sockets, and applications should use name-based sockets for their 1281 communications. Neither an instantaneous nor an eventually complete 1282 transition to name-based sockets is required, yet the positive effect 1283 on Internet routing scalability will grow with the extent of this 1284 transition. 1286 Name-based sockets were hence designed with a focus on deployment 1287 incentives, comprising both immediate deployment benefits and 1288 low deployment costs. Name-based sockets provide a benefit to 1289 application developers because relieving applications of 1290 IP address management responsibilities simplifies and expedites 1291 application development. This benefit is immediate owing to the 1292 backwards compatibility of name-based sockets with legacy 1293 applications and legacy peers. The appeal to application developers, 1294 in turn, is an immediate benefit for operating system vendors who 1295 adopt name-based sockets. 1297 Name-based sockets furthermore minimize deployment costs: Alternative 1298 techniques to separate naming and addressing provide applications 1299 with "surrogate IP addresses" that dynamically map onto regular IP 1300 addresses.
A surrogate IP address is indistinguishable from a 1301 regular IP address for applications, but does not have the 1302 topological significance of a regular IP address. Mobile IP and the 1303 Host Identity Protocol are examples of such separation techniques. 1304 Mobile IP uses "home IP addresses" as surrogate IP addresses with 1305 reduced topological significance. The Host Identity Protocol uses 1306 "host identifiers" as surrogate IP addresses without topological 1307 significance. A disadvantage of surrogate IP addresses is their 1308 incurred cost in terms of extra administrative overhead and, for some 1309 techniques, extra infrastructure. Since surrogate IP addresses must 1310 be resolvable to the corresponding regular IP addresses, they must be 1311 provisioned in the DNS or similar infrastructure. Mobile IP uses a 1312 new infrastructure of home agents for this purpose, while the Host 1313 Identity Protocol populates DNS servers with host identities. Name- 1314 based sockets avoid this cost because they function without surrogate 1315 IP addresses, and hence without the provisioning and infrastructure 1316 requirements that accompany those. 1318 Certainly, some edge networks will continue to use provider- 1319 independent addresses despite name-based sockets, perhaps simply due 1320 to inertia. But name-based sockets will help reduce the number of 1321 those networks, and thus have a positive impact on Internet routing 1322 scalability. 1324 A more comprehensive description of name-based sockets can be found 1325 in [Name Based Sockets]. 1327 16. Recommendation 1329 17. Acknowledgements 1331 This document represents a small portion of the overall work product 1332 of the Routing Research Group, who have developed all of these 1333 architectural approaches and many specific proposals within this 1334 solution space. 1336 18. IANA Considerations 1338 This memo includes no requests to IANA. 1340 19. 
Security Considerations 1342 All solutions are required to provide security that is at least as 1343 strong as the existing Internet routing and addressing architecture. 1345 20. References 1347 20.1. Normative References 1349 [I-D.irtf-rrg-design-goals] 1350 Li, T., "Design Goals for Scalable Internet Routing", 1351 draft-irtf-rrg-design-goals-01 (work in progress), 1352 July 2007. 1354 [I-D.narten-radir-problem-statement] 1355 Narten, T., "Routing and Addressing Problem Statement", 1356 draft-narten-radir-problem-statement-04 (work in 1357 progress), December 2009. 1359 [RFC1887] Rekhter, Y. and T. Li, "An Architecture for IPv6 Unicast 1360 Address Allocation", RFC 1887, December 1995. 1362 20.2. Informative References 1364 [I-D.carpenter-renum-needs-work] 1365 Carpenter, B., Atkinson, R., and H. Flinck, "Renumbering 1366 still needs work", draft-carpenter-renum-needs-work-04 1367 (work in progress), October 2009. 1369 20.3. LISP References 1371 [I-D.farinacci-lisp-lig] 1372 Farinacci, D. and D. Meyer, "LISP Internet Groper (LIG)", 1373 draft-farinacci-lisp-lig-01 (work in progress), May 2009. 1375 [I-D.ietf-lisp] 1376 Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, 1377 "Locator/ID Separation Protocol (LISP)", 1378 draft-ietf-lisp-05 (work in progress), September 2009. 1380 [I-D.ietf-lisp-alt] 1381 Fuller, V., Farinacci, D., Meyer, D., and D. Lewis, "LISP 1382 Alternative Topology (LISP+ALT)", draft-ietf-lisp-alt-01 1383 (work in progress), May 2009. 1385 [I-D.ietf-lisp-interworking] 1386 Lewis, D., Meyer, D., Farinacci, D., and V. Fuller, 1387 "Interworking LISP with IPv4 and IPv6", 1388 draft-ietf-lisp-interworking-00 (work in progress), 1389 May 2009. 1391 [I-D.ietf-lisp-ms] 1392 Fuller, V. and D. Farinacci, "LISP Map Server", 1393 draft-ietf-lisp-ms-04 (work in progress), October 2009. 1395 [I-D.meyer-lisp-mn] 1396 Farinacci, D., Fuller, V., Lewis, D., and D. 
Meyer, "LISP 1397 Mobility Architecture", draft-meyer-lisp-mn-00 (work in 1398 progress), July 2009. 1400 [I-D.meyer-loc-id-implications] 1401 Meyer, D. and D. Lewis, "Architectural Implications of 1402 Locator/ID Separation", draft-meyer-loc-id-implications-01 1403 (work in progress), January 2009. 1405 20.4. RANGI References 1407 [I-D.xu-rangi] 1408 Xu, X., "Routing Architecture for the Next Generation 1409 Internet (RANGI)", draft-xu-rangi-01 (work in progress), 1410 July 2009. 1412 [I-D.xu-rangi-proxy] 1413 Xu, X., "Transition Mechanisms for Routing Architecture 1414 for the Next Generation Internet (RANGI)", 1415 draft-xu-rangi-proxy-01 (work in progress), July 2009. 1417 [RANGI] Xu, X., "Routing Architecture for the Next-Generation 1418 Internet (RANGI)", 1419 . 1421 [RFC4423] Moskowitz, R. and P. Nikander, "Host Identity Protocol 1422 (HIP) Architecture", RFC 4423, May 2006. 1424 [RFC5214] Templin, F., Gleeson, T., and D. Thaler, "Intra-Site 1425 Automatic Tunnel Addressing Protocol (ISATAP)", RFC 5214, 1426 March 2008. 1428 20.5. Ivip References 1430 [I-D.whittle-ivip-db-fast-push] 1431 Whittle, R., "Ivip Mapping Database Fast Push", 1432 draft-whittle-ivip-db-fast-push-01 (work in progress), 1433 August 2008. 1435 [I-D.whittle-ivip4-etr-addr-forw] 1436 Whittle, R., "Ivip4 ETR Address Forwarding", 1437 draft-whittle-ivip4-etr-addr-forw-01 (work in progress), 1438 August 2008. 1440 [Ivip Constraints] 1441 Whittle, R., "List of constraints on a successful scalable 1442 routing solution which result from the need for widespread 1443 voluntary adoption", 1444 . 1446 [Ivip Mobility] 1447 Whittle, R., "TTR Mobility Extensions for Core-Edge 1448 Separation Solutions to the Internet's Routing Scaling 1449 Problem", 1450 . 1452 [Ivip PMTUD] 1453 Whittle, R., "IPTM - Ivip's approach to solving the 1454 problems with encapsulation overhead, MTU, fragmentation 1455 and Path MTU Discovery", 1456 . 

1458     [Ivip Summary]
1459                Whittle, R., "Ivip (Internet Vastly Improved Plumbing)
1460                Conceptual Summary and Analysis",
1461                .

1463     [Ivip6]    Whittle, R., "Ivip6 - instead of map-encap, use the 20 bit
1464                Flow Label as a Forwarding Label",
1465                .

1467  20.6. hIPv4 References

1469     [I-D.frejborg-hipv4]
1470                Frejborg, P., "Hierarchical IPv4 Framework",
1471                draft-frejborg-hipv4-04 (work in progress), November 2009.

1473  20.7. Layered Mapping System References

1475     [LMS]      Letong, S., Xia, Y., ZhiLiang, W., and W. Jianping, "A
1476                Layered Mapping System For Scalable Routing", .

1481     [LMS Summary]
1482                Sun, C., "A Layered Mapping System (Summary)", .

1486  20.8. GLI References

1488     [GLI]      Menth, M., Hartmann, M., and D. Klein, "Global Locator,
1489                Local Locator, and Identifier Split (GLI-Split)", .

1493  20.9. TIDR References

1495     [I-D.adan-idr-tidr]
1496                Adan, J., "Tunneled Inter-domain Routing (TIDR)",
1497                draft-adan-idr-tidr-01 (work in progress), December 2006.

1499     [TIDR AS forwarding]
1500                Adan, J., "yetAnotherProposal: AS-number forwarding",
1501                .

1503     [TIDR and LISP]
1504                Adan, J., "LISP etc architecture",
1505                .

1507     [TIDR identifiers]
1508                Adan, J., "TIDR using the IDENTIFIERS attribute", .

1511  20.10. ILNP References

1513     [ILNP Site]
1514                Atkinson, R., Bhatti, S., Hailes, S., Rehunathan, D., and
1515                M. Lad, "ILNP - Identifier/Locator Network Protocol",
1516                .

1518  20.11. EEMDP References

1520     [EEMDP Considerations]
1521                Sriram, K., Kim, Y., and D. Montgomery, "Architectural
1522                Considerations for Mapping Distribution Protocols",
1523                .

1525     [EEMDP Presentation]
1526                Sriram, K., Kim, Y., and D. Montgomery, "Architectural
1527                Considerations for Mapping Distribution Protocols", .

1530  20.12. Evolution References

1532     [Evolution Grow Presenatation]
1533                Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and
1534                L. Zhang, "Virtual Aggregation (VA)",
1535                .

1537     [I-D.zhang-evolution]
1538                Zhang, B. and L. Zhang, "Evolution Towards Global Routing
1539                Scalability", draft-zhang-evolution-02 (work in progress),
1540                October 2009.

1542  20.13. Name Based Sockets References

1544     [Name Based Sockets]
1545                Vogt, C., "Simplifying Internet Applications Development
1546                With A Name-Based Sockets Interface", .

1550  Author's Address

1552     Tony Li (editor)
1553     Ericsson
1554     300 Holger Way
1555     San Jose, CA 95134
1556     USA

1558     Phone: +1 408 750 5160
1559     Email: tony.li@tony.li