idnits 2.17.1

draft-templin-iron-16.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (December 23, 2010) is 4866 days in the past. Is this
     intentional?

  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Missing Reference: 'B' is mentioned on line 1273, but not defined

  == Unused Reference: 'RFC3849' is defined on line 1547, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC5737' is defined on line 1574, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  == Outdated reference: A later version (-06) exists of
     draft-ietf-grow-va-03

  == Outdated reference: A later version (-68) exists of
     draft-templin-intarea-seal-25

  == Outdated reference: A later version (-40) exists of
     draft-templin-intarea-vet-19

  -- Obsolete informational reference (is this intentional?): RFC 3068
     (Obsoleted by RFC 7526)

  Summary: 1 error (**), 0 flaws (~~), 7 warnings (==), 2 comments (--).

  Run idnits with the --verbose option for more detailed information about
  the items above.

--------------------------------------------------------------------------------

2 Internet Research Task Force F. Templin, Ed.
3 (IRTF) Boeing Research & Technology 4 Internet-Draft December 23, 2010 5 Intended status: Experimental 6 Expires: June 26, 2011 8 The Internet Routing Overlay Network (IRON) 9 draft-templin-iron-16.txt 11 Abstract 13 Since the Internet must continue to support escalating growth due to 14 increasing demand, it is clear that current routing architectures and 15 operational practices must be updated. This document proposes an 16 Internet Routing Overlay Network (IRON) that supports sustainable 17 growth through Provider Independent addressing while requiring no 18 changes to end systems and no changes to the existing routing system. 19 IRON further addresses other important issues including routing 20 scaling, mobility management, multihoming, traffic engineering and 21 NAT traversal. While business considerations are an important 22 determining factor for widespread adoption, they are out of scope for 23 this document. This document is a product of the IRTF Routing 24 Research Group. 26 Status of this Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current Internet- 34 Drafts is at http://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on June 26, 2011. 43 Copyright Notice 45 Copyright (c) 2010 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 
48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (http://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document. Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License.

58 Table of Contents

60 1. Introduction
61 2. Terminology
62 3. The Internet Routing Overlay Network
63    3.1. IRON Client Router
64    3.2. IRON Serving Router
65    3.3. IRON Relay Router
66 4. IRON Organizational Principles
67 5. IRON Initialization
68    5.1. IRON Relay Router Initialization
69    5.2. IRON Serving Router Initialization
70    5.3. IRON Client Router Initialization
71 6. IRON Operation
72    6.1. IRON Client Router Operation
73    6.2. IRON Serving Router Operation
74    6.3. IRON Relay Router Operation
75    6.4. IRON Reference Operating Scenarios
76       6.4.1. Both Hosts Within IRON EUNs
77       6.4.2. Mixed IRON and Non-IRON Hosts
78    6.5. Mobility, Multihoming and Traffic Engineering
79         Considerations
80       6.5.1. Mobility Management
81       6.5.2. Multihoming
82       6.5.3. Inbound Traffic Engineering
83       6.5.4. Outbound Traffic Engineering
84    6.6. Renumbering Considerations
85    6.7. NAT Traversal Considerations
86    6.8. Nested EUN Considerations
87       6.8.1. Host A Sends Packets to Host Z
88       6.8.2. Host Z Sends Packets to Host A
89 7. Implications for the Internet
90 8. Additional Considerations
91 9. Related Initiatives
92 10. IANA Considerations
93 11. Security Considerations
94 12. Acknowledgements
95 13. References
96    13.1. Normative References
97    13.2. Informative References
98 Appendix A. IRON VPs Over Internetworks with Different
99             Address Families
100 Appendix B. Scaling Considerations
101 Author's Address

103 1. Introduction 105 Growth in the number of entries instantiated in the Internet routing 106 system has led to concerns for unsustainable routing scaling 107 [I-D.narten-radir-problem-statement]. Operational practices such as 108 increased use of multihoming with IPv4 Provider-Independent (PI) 109 addressing are resulting in more and more fine-grained prefixes 110 injected into the routing system from more and more end-user 111 networks.
Furthermore, the forthcoming depletion of the public IPv4 112 address space has raised concerns for both increased address space 113 fragmentation (leading to yet further routing table entries) and an 114 impending address space run-out scenario. At the same time, the IPv6 115 routing system is beginning to see growth in IPv6 Provider-Aggregated 116 (PA) prefixes [BGPMON] which must be managed in order to avoid the 117 same routing scaling issues the IPv4 Internet now faces. Since the 118 Internet must continue to scale to accommodate increasing demand, it 119 is clear that new routing methodologies and operational practices are 120 needed. 122 Several related works have investigated routing scaling issues. 123 Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in 124 Increasing Scopes (AIS) [I-D.zhang-evolution] are global routing 125 proposals that introduce routing overlays with Virtual Prefixes (VPs) 126 to reduce the number of entries required in each router's Forwarding 127 Information Base (FIB) and Routing Information Base (RIB). Routing 128 and Addressing in Networks with Global Enterprise Recursion (RANGER) 129 [RFC5720] examines recursive arrangements of enterprise networks that 130 can apply to a very broad set of use case scenarios 131 [I-D.russert-rangers]. In particular, RANGER supports encapsulation 132 and secure redirection by treating each layer in the recursive 133 hierarchy as a virtual non-broadcast, multiple access (NBMA) "link". 134 RANGER is an architectural framework that includes Virtual Enterprise 135 Traversal (VET) [I-D.templin-intarea-vet] and the Subnetwork 136 Encapsulation and Adaptation Layer (SEAL) [I-D.templin-intarea-seal] 137 as its functional building blocks. 139 This document proposes an Internet Routing Overlay Network (IRON) 140 with goals of supporting sustainable growth while requiring no 141 changes to the existing routing system. 
IRON borrows concepts from 142 VA, AIS and RANGER, and further borrows concepts from the Internet 143 Vastly Improved Plumbing (Ivip) [I-D.whittle-ivip-arch] architecture 144 proposal along with its associated Translating Tunnel Router (TTR) 145 mobility extensions [TTRMOB]. Indeed, the TTR model to a great 146 degree inspired the IRON mobility architecture design discussed in 147 this document. The Network Address Translator (NAT) traversal 148 techniques adapted for IRON were inspired by the Simple Address 149 Mapping for Premises Legacy Equipment (SAMPLE) proposal 150 [I-D.carpenter-softwire-sample]. 152 IRON specifically seeks to provide scalable PI addressing without 153 changing the current BGP [RFC4271] routing system. IRON observes the 154 Internet Protocol standards [RFC0791][RFC2460]. Other network layer 155 protocols that can be encapsulated within IP packets (e.g., OSI/CLNP 156 [RFC1070], etc.) are also within scope. 158 The IRON is a global routing system comprising virtual overlay 159 networks managed by Virtual Prefix Companies (VPCs) that own and 160 manage Virtual Prefixes (VPs) from which End User Network (EUN) PI 161 prefixes (EPs) are delegated to customer sites. The IRON is 162 motivated by a growing customer demand for multihoming, mobility 163 management and traffic engineering while using stable PI addressing 164 to avoid network renumbering [RFC4192][RFC5887]. The IRON uses the 165 existing IPv4 and IPv6 global Internet routing systems as virtual 166 links for tunneling inner network protocol packets within outer IPv4 167 or IPv6 headers (see: Section 3). The IRON requires deployment of a 168 small number of new BGP core routers and supporting servers, as well 169 as IRON-aware routers/servers in customer EUNs. No modifications to 170 hosts, and no modifications to most routers are required. 172 While the IRON architecture addresses network mobility, host mobility 173 considerations are outside the scope of this document. 
IP multicast 174 considerations are also out of scope. 176 Note: This document is offered in compliance with Internet Research 177 Task Force (IRTF) document stream procedures [RFC5743]; it is not an 178 IETF product and is not a standard. The views in this document were 179 considered controversial by the IRTF Routing Research Group (RRG) but 180 the RG reached a consensus that the document should still be 181 published. The document will undergo a period of review within the 182 RRG and through selected expert reviewers prior to publication. The 183 following sections discuss details of the IRON architecture. 185 2. Terminology 187 This document makes use of the following terms: 189 End User Network (EUN) 190 an edge network that connects an organization's devices (e.g., 191 computers, routers, printers, etc.) to the Internet. 193 End User Network PI Prefix (EP) 194 a more-specific Provider-Independent (PI) prefix derived from a 195 Virtual Prefix (VP) (e.g., an IPv4 /28, an IPv6 /56, etc.) and 196 delegated to an EUN by a Virtual Prefix Company (VPC). 198 End User Network PI Address (EPA) 199 a network layer address belonging to an EP and assigned to the 200 interface of an end system in an EUN. 202 Forwarding Information Base (FIB) 203 a data structure containing network prefix to next-hop mappings; 204 usually maintained in a router's fast-path processing lookup 205 tables. 207 Internet Routing Overlay Network (IRON) 208 a composite virtual overlay network that comprises the union of 209 all VPC overlay networks configured over a common Internetwork. 210 The IRON supports routing through encapsulation of inner packets 211 with EPA addresses within outer headers that use locator 212 addresses. 214 IRON Client Router ("Client") 215 a customer's router (or host with embedded gateway function) that 216 logically connects the customer's EUNs and their associated EPs to 217 the IRON via tunnels. 
219 IRON Serving Router ("Server") 220 a VPC's overlay network router that provides forwarding and 221 mapping services for the EPs owned by customer Client routers. 223 IRON Relay Router ("Relay") 224 a VPC's overlay network router that acts as a relay between the 225 IRON and the native Internet. 227 IRON Router (IR) 228 generically refers to any of an IRON Client/Server/Relay router. 230 Internet Service Provider (ISP) 231 a service provider which connects customer EUNs to the underlying 232 Internetwork. In other words, an ISP is responsible for providing 233 basic Internet connectivity for customer EUNs. 235 Locator 236 an IP address assigned to the interface of a router or end system 237 within a public or private network. Locators taken from public IP 238 prefixes are routable on a global basis, while locators taken from 239 private IP prefixes are made public via Network Address 240 Translation (NAT). 242 Provider Aggregated (PA) address or prefix 243 a network layer address or prefix delegated to an EUN by an ISP. 245 Provider Independent (PI) address or prefix 246 a network layer address or prefix delegated to an EUN by a third 247 party independently of the EUN's ISP arrangements. 249 Routing and Addressing in Networks with Global Enterprise Recursion 250 (RANGER) 251 an architectural examination of virtual overlay networks applied 252 to enterprise network scenarios, with implications for a wider 253 variety of use cases. 255 Subnetwork Encapsulation and Adaptation Layer (SEAL) 256 an encapsulation sublayer that provides extended packet 257 identification and a control message protocol to ensure 258 deterministic network-layer feedback. 260 Virtual Enterprise Traversal (VET) 261 a method for discovering border routers and forming dynamic point- 262 to-(multi)point tunnels over enterprise networks (or sites) with 263 varying properties. 265 Virtual Prefix (VP) 266 a PI prefix block (e.g., an IPv4 /16, an IPv6 /20, an OSI NSAP 267 prefix, etc.) 
that is owned and managed by a Virtual Prefix 268 Company (VPC). 270 Virtual Prefix Company (VPC) 271 a company that owns and manages a set of VPs from which it 272 delegates EPs to EUNs. 274 VPC Overlay Network 275 a specialized set of routers deployed by a VPC to service customer 276 EUNs through a virtual overlay network configured over an 277 underlying Internetwork (e.g., the global Internet). 279 3. The Internet Routing Overlay Network 281 The Internet Routing Overlay Network (IRON) is a system of virtual 282 overlay networks configured over a common Internetwork. While the 283 principles presented in this document are discussed within the 284 context of the public global Internet, they can also be applied to 285 any autonomous Internetwork. The rest of this document therefore 286 uses the terms "Internet" and "Internetwork" interchangeably 287 except in cases where specific distinctions must be made. 289 The IRON consists of IRON Routers (IRs) that automatically tunnel the 290 packets of end-to-end communication sessions within encapsulating 291 headers used for Internet routing. IRs use Virtual Enterprise 292 Traversal (VET) [I-D.templin-intarea-vet] in conjunction with the 293 Subnetwork Encapsulation and Adaptation Layer (SEAL) 294 [I-D.templin-intarea-seal] to encapsulate inner network layer packets 295 within outer headers as shown in Figure 1:

297                                   +-------------------------+
298                                   |    Outer headers with   |
299                                   ~    locator addresses    ~
300                                   |     (IPv4 or IPv6)      |
301                                   +-------------------------+
302                                   |       SEAL Header       |
303 +-------------------------+       +-------------------------+
304 |   Inner Packet Header   |  -->  |   Inner Packet Header   |
305 ~    with EP addresses    ~  -->  ~    with EP addresses    ~
306 | (IPv4, IPv6, OSI, etc.) |  -->  | (IPv4, IPv6, OSI, etc.) |
307 +-------------------------+       +-------------------------+
308 |                         |  -->  |                         |
309 ~    Inner Packet Body    ~  -->  ~    Inner Packet Body    ~
310 |                         |  -->  |                         |
311 +-------------------------+       +-------------------------+

313     Inner packet before               Outer packet after
314        encapsulation                     encapsulation

316 Figure 1: Encapsulation of Inner Packets Within Outer IP Headers

318 VET specifies the automatic tunneling mechanisms used for 319 encapsulation, while SEAL specifies the format and usage of the SEAL 320 header as well as a set of control messages. Most notably, IRs use 321 the SEAL Control Message Protocol (SCMP) to deterministically 322 exchange and authenticate control messages such as route 323 redirections, indications of Path Maximum Transmission Unit (PMTU) 324 limitations, destination unreachables, etc. 326 The IRON is the union of all virtual overlay networks that are 327 configured over a common underlying Internet and are owned and 328 managed by Virtual Prefix Companies (VPCs). Each such virtual overlay 329 network comprises a set of IRs distributed throughout the Internet to 330 serve highly-aggregated Virtual Prefixes (VPs). VPCs delegate sub- 331 prefixes from their VPs which they lease to customers as End User 332 Network PI prefixes (EPs). The customers in turn assign the EPs to 333 their customer edge IRs which connect their End User Networks (EUNs) 334 to the IRON. 336 VPCs may have no affiliation with the ISP networks from which 337 customers obtain their basic Internet connectivity. Therefore, a 338 customer could procure its summary network services either through a 339 common broker or through separate entities. In that case, the VPC 340 can open for business and begin serving its customers immediately 341 without the need to coordinate its activities with ISPs or with other 342 VPCs. Further details on business considerations are out of scope 343 for this document. 
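The delegation of EPs from a VP described in this section can be illustrated with a short sketch. The Python fragment below is only an illustration under stated assumptions: the VP value 10.1.0.0/16 and the helper name delegate_eps are hypothetical, and the /28 EP length is borrowed from the example in the Terminology section. It is not part of the IRON specification.

```python
import ipaddress
from itertools import islice


def delegate_eps(vp: str, ep_prefixlen: int, count: int):
    """Return the first `count` EPs of length `ep_prefixlen` carved from VP `vp`.

    Hypothetical helper for illustration only; a real VPC would also
    track which EPs are already leased to which customer.
    """
    network = ipaddress.ip_network(vp)
    # subnets() lazily enumerates every more-specific prefix of the given length.
    return list(islice(network.subnets(new_prefix=ep_prefixlen), count))


# Assumed example values: an IPv4 /16 VP from which /28 EPs are delegated.
eps = delegate_eps("10.1.0.0/16", 28, 3)
for ep in eps:
    print(ep)
```

Each returned EP is a more-specific prefix of the VP, so a Relay that advertises only the VP still attracts traffic for every delegated EP.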
345 The IRON requires no changes to end systems and no changes to most 346 routers in the Internet. Instead, the IRON comprises IRs that are 347 deployed either as new platforms or as modifications to existing 348 platforms. IRs may be deployed incrementally without disturbing the 349 existing Internet routing system, and act as waypoints (or "cairns") 350 for navigating the IRON. The functional roles for IRs are described 351 in the following sections. 353 3.1. IRON Client Router 355 An IRON Client Router (or, simply, "Client") is a customer's router 356 (or host with embedded gateway function) that logically connects the 357 customer's EUNs and their associated EPs to the IRON via tunnels as 358 shown in Figure 2. Clients obtain EPs from VPCs and use them to 359 number subnets and interfaces within their EUNs. A Client can be 360 deployed on the same physical platform that also connects the 361 customer's EUNs to its ISPs, but it may also be a separate router or 362 even a standalone server system located within the EUN. (This model 363 applies even if the EUN connects to the ISP via a Network Address 364 Translator (NAT) - see Section 6.7).

365                        .-.
366                     ,-(  _)-.
367     +--------+   .-(_    (_  )-.
368     | Client |--(_     ISP      )
369     +---+----+     `-(______)-'
370         |   <= T           \     .-.
371        .-.       u          \ ,-(  _)-.
372     ,-(  _)-.      n       .-(_    (-  )-.
373  .-(_    (_  )-.     n    (_   Internet   )
374 (_     EUN      )      e     `-(______)-
375    `-(______)-'          l           ___
376         |                  s =>     (:::)-.
377    +----+---+                     .-(::::::::)
378    |  Host  |                  .-(::::::::::::)-.
379    +--------+                 (:::: The IRON ::::)
380                                `-(::::::::::::)-'
381                                   `-(::::::)-'

383 Figure 2: IRON Client Router Connecting EUN to the IRON

385 3.2. IRON Serving Router 387 An IRON Serving Router (or, simply, "Server") is a VPC's overlay 388 network router that provides forwarding and mapping services for the 389 EPs owned by customer Client routers. 
In typical deployments, a VPC 390 will deploy many Servers around the IRON in a globally-distributed 391 fashion (e.g., as depicted in Figure 3) so that Clients can discover 392 those that are nearby.

394          +--------+    +--------+
395          | Boston |    | Tokyo  |
396          | Server |    | Server |
397          +--+-----+    ++-------+
398  +--------+  \         /
399  | Seattle|   \   ___ /
400  | Server |    \ (:::)-.       +--------+
401  +------+-+  .-(::::::::)------+ Paris  |
402          \.-(::::::::::::)-.   | Server |
403          (:::: The IRON ::::)  +--------+
404           `-(::::::::::::)-'
405  +--------+/ `-(::::::)-' \     +--------+
406  | Moscow +       |        \--- + Sydney |
407  | Server |  +----+---+         | Server |
408  +--------+  | Cairo  |         +--------+
409              | Server |
410              +--------+

412 Figure 3: IRON Serving Router Global Distribution Example

414 Each Server acts as a tunnel-endpoint router that forms a 415 bidirectional tunnel with each of its Client customers. Each Server 416 also associates with a set of Relays that can forward packets from 417 the IRON out to the native Internet and vice-versa as discussed in 418 the next section. 420 3.3. IRON Relay Router 422 An IRON Relay Router (or, simply, "Relay") is a VPC's overlay network 423 router that acts as a relay between the IRON and the native Internet. 424 It therefore also serves as an Autonomous System Border Router (ASBR) 425 that is owned and managed by the VPC. 427 Each VPC configures one or more Relays which advertise the company's 428 VPs into the IPv4 and IPv6 global Internet BGP routing systems. Each 429 Relay associates with all of the VPC's overlay network Servers, e.g., 430 via tunnels over the IRON, via a direct interconnect such as an 431 Ethernet cable, etc. The Relay role (as well as its relationship 432 with overlay network Servers) is depicted in Figure 4:

434                      .-.
435                   ,-(  _)-.
436                .-(_    (_  )-.
437               (_   Internet   )
438                 `-(______)-'   |  +--------+
439                       |        |--| Server |
440                  +----+---+    |  +--------+
441                  | Relay  |----|  +--------+
442                  +--------+    |--| Server |
443                     _||        |  +--------+
444                    (:::)-.  (Ethernet)
445                 .-(::::::::)
446  +--------+  .-(::::::::::::)-.  +--------+
447  | Server |=(:::: The IRON ::::)=| Server |
448  +--------+  `-(::::::::::::)-'  +--------+
449                 `-(::::::)-'
450                      || (Tunnels)
451                  +--------+
452                  | Server |
453                  +--------+

455 Figure 4: IRON Relay Router Connecting IRON to Native Internet

457 4. IRON Organizational Principles 459 The IRON consists of the union of all VPC overlay networks configured 460 over a common Internetwork (e.g., the public Internet). Each such 461 overlay network represents a distinct "patch" on the Internet 462 "quilt", where the patches are stitched together by tunnels over the 463 links, routers, bridges, etc., that connect the underlying Internetwork. When a 464 new VPC overlay network is deployed, it becomes yet another patch on 465 the quilt. The IRON is therefore a composite overlay network 466 consisting of multiple individual patches, where each patch 467 coordinates its activities independently of all others (with the 468 exception that the Servers of each patch must be aware of all VPs in 469 the IRON). In order to ensure mutual cooperation between all VPC 470 overlay networks, sufficient address space portions of the inner 471 network layer protocol (e.g., IPv4, IPv6, etc.) should be set aside 472 and designated as VP space. 474 Each VPC overlay network in the IRON maintains a set of Relays and 475 Servers that provide services to their Client customers. In order to 476 ensure adequate customer service levels, the VPC should conduct a 477 traffic scaling analysis and distribute sufficient Relays and Servers 478 for the overlay network globally throughout the Internet. Figure 5 479 depicts the logical arrangement of Relays, Servers and Clients in an 480 IRON virtual overlay network:

482                           .-.
483                        ,-(  _)-.
484                     .-(_    (_  )-.
485                    (__ Internet   _)
486                       `-(______)-'

488        <------------     Relays      ------------>
489                  ________________________
490                 (::::::::::::::::::::::::)-.
491             .-(:::::::::::::::::::::::::::::)
492          .-(:::::::::::::::::::::::::::::::::)-.
493         (::::::::::: The IRON :::::::::::::::)
494          `-(:::::::::::::::::::::::::::::::::)-'
495             `-(::::::::::::::::::::::::::::)-'

497        <------------     Servers     ------------>
498        .-.                .-.                     .-.
499     ,-(  _)-.          ,-(  _)-.               ,-(  _)-.
500  .-(_    (_  )-.    .-(_    (_  )-.         .-(_    (_  )-.
501 (__   ISP A    _)  (__   ISP B    _)  ...  (__   ISP x    _)
502    `-(______)-'       `-(______)-'            `-(______)-'
503        <-----------       NATs       ------------>

505        <----------- Clients and EUNs ----------->

507 Figure 5: Virtual Overlay Network Organization

509 Each Relay in the VPC overlay network connects the overlay directly 510 to the underlying IPv4 and IPv6 Internets. It also advertises the 511 VPC overlay network's IPv4 VPs into the IPv4 BGP routing system and 512 advertises the overlay network's IPv6 VPs into the IPv6 BGP routing 513 system. Relays will therefore receive packets with EPA destination 514 addresses sent by end systems in the Internet and direct them toward 515 EPA-addressed end systems connected to the VPC overlay network. 517 Each VPC overlay network also manages a set of Servers that connect 518 their Clients and associated EUNs to the IRON and to the IPv6 and 519 IPv4 Internets via their associations with Relays. IRON Servers 520 therefore need not be BGP routers themselves and can be simple 521 commodity hardware platforms. Moreover, the Server and Relay 522 functions can be deployed together on the same physical platform as a 523 unified gateway or they may be deployed on separate platforms (e.g., 524 for load balancing purposes). 526 Each Server maintains a working set of Clients for which it caches 527 EP-to-Client mappings in its Forwarding Information Base (FIB). 
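A Server's cache of EP-to-Client mappings can be pictured as a small prefix table. The following Python sketch is an illustration only: the class name ServerFib, its methods, and the use of longest-prefix matching for lookups are assumptions made for this example, and Client locators are represented as plain strings drawn from documentation address ranges. The document itself does not prescribe a data structure.

```python
import ipaddress


class ServerFib:
    """Toy EP-to-Client mapping table (hypothetical, for illustration)."""

    def __init__(self):
        self._entries = {}  # EP (ip_network) -> Client locator (str)

    def register(self, ep: str, client_locator: str) -> None:
        """Add or refresh the mapping for a Client's EP."""
        self._entries[ipaddress.ip_network(ep)] = client_locator

    def unregister(self, ep: str) -> None:
        """Remove the mapping when the Client withdraws its EP."""
        self._entries.pop(ipaddress.ip_network(ep), None)

    def lookup(self, epa: str):
        """Return the Client locator for the longest EP covering `epa`, or None."""
        addr = ipaddress.ip_address(epa)
        matches = [ep for ep in self._entries
                   if ep.version == addr.version and addr in ep]
        if not matches:
            return None
        return self._entries[max(matches, key=lambda ep: ep.prefixlen)]


fib = ServerFib()
fib.register("10.1.0.0/28", "198.51.100.7")
fib.register("2001:db8:1::/56", "203.0.113.9")
print(fib.lookup("10.1.0.5"))
print(fib.lookup("10.1.0.20"))
```

A linear scan is adequate for a sketch because each Server tracks only its own working set of Clients; a production FIB would use a trie or similar structure.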
Each 528 Server also in turn propagates the list of EPs in its working set to 529 each of the Relays in the VPC overlay network via a dynamic routing 530 protocol (e.g., an overlay network internal BGP instance that carries 531 only the EP-to-Server mappings and does not interact with the 532 external BGP routing system). Each Server therefore only needs to 533 track the EPs for its current working set of Clients, while each 534 Relay will maintain a full EP-to-Server mapping table that represents 535 reachability information for all EPs in the VPC overlay network. 537 Customers establish Clients that obtain their basic Internet 538 connectivity from ISPs and connect to Servers to attach their EUNs to 539 the IRON. Each EUN can connect to the IRON via one or multiple 540 Clients as long as the Clients coordinate with one another, e.g., to 541 mitigate EUN partitions. Unlike Relays and Servers, Clients may use 542 private addresses behind one or several layers of NATs. Each Client 543 initially discovers a list of nearby Servers through an anycast 544 discovery process (described below). It then selects one of these 545 nearby Servers and forms a bidirectional tunnel through an initial 546 exchange followed by periodic keepalives. 548 After the Client selects a Server, it forwards initial outbound 549 packets from its EUNs by tunneling them to the Server which in turn 550 forwards them to the nearest Relay within the IRON that serves the 551 final destination. The Client will subsequently receive redirect 552 messages informing it of a more direct route through a Server that 553 serves the final destination EUN. 555 The IRON can also be used to support VPs of network layer address 556 families that cannot be routed natively in the underlying 557 Internetwork (e.g., OSI/CLNP over the public Internet, IPv6 over 558 IPv4-only Internetworks, IPv4 over IPv6-only Internetworks, etc.). 
559 Further details for support of IRON VPs of one address family over 560 Internetworks based on other address families are discussed in 561 Appendix A. 563 5. IRON Initialization 565 IRON initialization entails the startup actions of IRs within the VPC 566 overlay network and customer EUNs. The following sections discuss 567 these startup procedures. 569 5.1. IRON Relay Router Initialization 571 Before its first operational use, each Relay in a VPC overlay network 572 is provisioned with the list of VPs that it will serve as well as the 573 locators for all Servers that belong to the same overlay network. 574 The Relay is also provisioned with external BGP interconnections the 575 same as for any BGP router. 577 Upon startup, the Relay engages in BGP routing exchanges with its 578 peers in the IPv4 and IPv6 Internets the same as for any BGP router. 579 It then connects to all of the Servers in the overlay network (e.g., 580 via a TCP connection over a bidirectional tunnel, via an iBGP route 581 reflector, etc.) for the purpose of discovering EP->Server mappings. 582 After the Relay has fully populated its EP->Server mapping 583 information database, it is said to be "synchronized" with respect to its VPs. 585 After this initial synchronization procedure, the Relay then 586 advertises the overlay network's VPs externally. In particular, the 587 Relay advertises the IPv6 VPs into the IPv6 BGP routing system and 588 advertises the IPv4 VPs into the IPv4 BGP routing system. The Relay 589 additionally advertises an IPv4 /24 companion prefix (e.g., 590 192.0.2.0/24) into the IPv4 routing system and an IPv6 ::/64 591 companion prefix (e.g., 2001:DB8::/64) into the IPv6 routing system 592 (note that these may also be sub-prefixes taken from a VP). 
The 593 Relay then configures the host number '1' in the IPv4 companion 594 prefix (e.g., as 192.0.2.1) and the interface identifier '0' in the 595 IPv6 companion prefix (e.g., as 2001:DB8::0) and assigns the 596 resulting addresses as subnet router anycast addresses 597 [RFC3068][RFC2526] for the VPC overlay network. (See Appendix A for 598 more information on the discovery and use of companion prefixes.) 599 The Relay then engages in ordinary packet forwarding operations. 601 5.2. IRON Serving Router Initialization 603 Before its first operational use, each Server in a VPC overlay 604 network is provisioned with the locators for all Relays that 605 aggregate the overlay network's VPs. In order to support route 606 optimization, the Server must also be provisioned with the list of 607 all VPs in the IRON (i.e., not just the VPs of its own overlay 608 network) so that it can distinguish EPA from non-EPA addresses. (The 609 Server could therefore be greatly simplified if the list of VPs could 610 be covered within a small number of very short prefixes, e.g., one or 611 a few IPv6 ::/20's). The Server must also discover the VP companion 612 prefix relationships discussed in Section 5.1, e.g., via a global 613 database such as discussed in Appendix A. 615 Upon startup, each Server must connect to all of the Relays within 616 its overlay network (e.g., via a TCP connection over a bidirectional 617 tunnel, via an iBGP route reflector, etc.) for the purpose of 618 reporting its EP->Server mappings. The Server then actively listens 619 for Client customers which register their EP prefixes as part of 620 establishing a bidirectional tunnel. When a new Client registers its 621 EP prefixes, the Server announces the new EP additions to all Relays; 622 when an existing Client unregisters its EP prefixes, the Server 623 withdraws its announcements. 625 5.3. 
IRON Client Router Initialization 627 Before its first operational use, each Client must obtain one or more 628 EPs from its VPC as well as the companion prefixes associated with 629 the VPC overlay network (see Section 5.1). The Client must also 630 obtain a certificate and a public/private key pair from the VPC that 631 it can later use to prove ownership of its EPs. This implies that 632 each VPC must run its own public key infrastructure to be used only 633 for the purpose of verifying its customers' claimed right to use an 634 EP. Hence, the VPC need not coordinate its public key infrastructure 635 with any other organization. 637 Upon startup, the Client sends an SCMP Router Solicitation (SRS) 638 message to the VPC overlay network subnet router anycast address to 639 discover the nearest Relay. The Relay will return an SCMP Router 640 Advertisement (SRA) message that lists the locator addresses of one 641 or more nearby Servers. (This list is analogous to the ISATAP 642 Potential Router List (PRL) [RFC5214].) 644 After the Client receives an SRA message from the nearby Relay 645 listing the locator addresses of nearby Servers, it sends SRS test 646 messages to one or more of the locator addresses to elicit SRA 647 messages. The Server that configures the locator will include the 648 header of the soliciting SRS message in its SRA message so that the 649 Client can determine the number of hops along the forward path. The 650 Server also includes a metric in its SRA messages indicating its 651 service availability so that the Client can avoid selecting Servers 652 that are overloaded. The Server also includes a challenge/response 653 puzzle that the Client must answer if it wishes to connect to this 654 Server. 656 When the Client receives these SRA messages, it can measure the round 657 trip time between sending the SRS and receiving the SRA as an 658 indication of round-trip delay. 
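
The measurements gathered from the SRA replies can be combined into a
simple selection rule. The following is a minimal sketch and not part
of the IRON specification: the SraSample record, its field names, the
max_load threshold, and the rule "skip overloaded Servers, then prefer
the lowest round-trip delay with hop count as a tie-breaker" are all
assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SraSample:
    """One SRA reply as seen by the Client (hypothetical record layout)."""
    server_locator: str  # locator address the SRS test message was sent to
    rtt_ms: float        # time between sending the SRS and receiving the SRA
    hops: int            # forward-path hop count from the echoed SRS header
    load_metric: int     # availability metric; higher means busier (assumed)


def pick_server(samples: Iterable[SraSample],
                max_load: int = 80) -> Optional[SraSample]:
    """Return the preferred Server, or None if every candidate is overloaded."""
    usable = [s for s in samples if s.load_metric <= max_load]
    if not usable:
        return None
    # Lowest round-trip delay wins; hop count breaks ties.
    return min(usable, key=lambda s: (s.rtt_ms, s.hops))
```

A Client would run such a rule once per discovery round and then
continue with the challenge/response exchange only for the Server it
returns.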
If the Client wishes to enlist the services of a specific Server
(e.g., based on the measured performance), it then calculates the
answer to the puzzle using its keying information and sends the
answer back to the Server in a new SRS message that also contains all
of the Client's EP prefixes for which it claims ownership. If the
Client solved the puzzle correctly, the Server will send back a new
SRA message that includes a non-zero default router lifetime and that
signifies the establishment of a bidirectional tunnel. (A zero
default router lifetime on the other hand signifies that the Server
is currently unable to establish a bidirectional tunnel, e.g., due to
heavy load, due to challenge/response failure, etc.)

Note that it is essential that the Client select one and only one
Server. This is to allow the VPC overlay network mapping system to
have one and only one active EP-to-Server mapping at any point in
time which shares fate with the Server itself. If this Server fails,
the Client can select a new one which will automatically update the
VPC overlay network mapping system with a new EP-to-Server mapping.

6. IRON Operation

Following the IRON initialization detailed in Section 5, IRs engage
in the steady-state process of receiving and forwarding packets. All
IRs forward encapsulated packets over the IRON using the mechanisms
of VET [I-D.templin-intarea-vet] and SEAL [I-D.templin-intarea-seal],
while Relays (and in some cases Servers) additionally forward packets
to and from the native IPv6 and IPv4 Internets. IRs also use SCMP to
coordinate with other IRs, including the process of sending and
receiving redirect messages, error messages, etc. (Note however that
an IR must not send an SCMP message in response to an SCMP error
message.) Each IR operates as specified in the following
subsections.

6.1. IRON Client Router Operation

After selecting its Server as specified in Section 5.3, the Client
should register each of its ISP connections with the Server in order
to establish multiple bidirectional tunnels for multihoming purposes.
To do so, it sends periodic SRS messages to its Server via each of
its ISPs to establish additional bidirectional tunnels and to keep
each tunnel alive. These messages need not include
challenge/response mechanisms since prefix proof of ownership was
already established in the initial exchange and a nonce in the SEAL
header can be used to confirm that the SRS message was sent by the
correct Client. This implies that a single nonce is used to
represent the set of all bidirectional tunnels between the Client and
the Server. Therefore, there are multiple bidirectional tunnels, and
the nonce names this "bundle" of tunnels. (The Client and Server may
conceptually represent this "bundle" as a single tunnel with multiple
locator addresses; however, each such locator address must be tested
independently in case there are NATs on the path.)

If the Client ceases to receive SRA messages from its Server via a
specific ISP connection, it marks the Server as unreachable from that
address and therefore over that ISP connection. (The Client should
also inform its Server of this outage via one of its working ISP
connections.) If the Client ceases to receive SRA messages from its
Server via multiple ISP connections, it marks the Server as unusable
and quickly attempts to establish a bidirectional tunnel with a new
Server. The act of establishing the tunnel with a new Server will
automatically purge the stale mapping state associated with the old
Server, since dynamic routing will propagate the new client/server
relationship to the VPC overlay network relay routers.
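
The per-ISP reachability bookkeeping described above can be sketched
as follows. This is an illustrative model only: the class name, the
30-second timeout, and the reading of "multiple ISP connections" as
"two or more (or all of them, if the Client has fewer than two)" are
assumptions, not values taken from this document.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServerReachability:
    """Tracks when the last SRA arrived over each ISP connection."""
    timeout_s: float = 30.0       # assumed keepalive timeout
    failover_threshold: int = 2   # assumed meaning of "multiple"
    last_sra: Dict[str, float] = field(default_factory=dict)

    def register_isp(self, isp: str, now: float) -> None:
        """Start tracking an ISP connection once its tunnel is established."""
        self.last_sra[isp] = now

    def on_sra(self, isp: str, now: float) -> None:
        """Record an SRA received from the Server via this ISP connection."""
        self.last_sra[isp] = now

    def unreachable_isps(self, now: float) -> List[str]:
        """ISP connections over which the Server is marked unreachable."""
        return sorted(isp for isp, seen in self.last_sra.items()
                      if now - seen > self.timeout_s)

    def server_unusable(self, now: float) -> bool:
        """True when enough connections have failed to warrant a new Server."""
        failed = len(self.unreachable_isps(now))
        needed = min(self.failover_threshold, len(self.last_sra))
        return needed > 0 and failed >= needed
```

When server_unusable() becomes true, the Client would restart the
Server selection procedure of Section 5.3; while only some
connections are listed by unreachable_isps(), it would instead report
the outage to its Server over a working connection.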

When an end system in an EUN sends a flow of packets to a
correspondent, the packets are forwarded through the EUN via normal
routing until they reach the Client, which then tunnels the initial
packets to its Server as the next hop. In particular, the Client
encapsulates each packet in an outer header with its locator as the
source address and the locator of its Server as the destination
address. Note that after sending the initial packets of a flow, the
Client may receive important SCMP messages such as indications of
PMTU limitations, redirects that point to a better next hop, etc. It
is therefore essential that the Client send the initial packets
through its Server to avoid loss of SCMP messages that cannot
traverse a NAT in the reverse direction. (The Server also provides a
control point for inbound traffic engineering and a mobility anchor
point and hence cannot be bypassed in the inbound direction.)

The Client uses the mechanisms specified in VET and SEAL to
encapsulate each forwarded packet. The Client further uses the SCMP
protocol to coordinate with Servers, including accepting redirects
and other SCMP messages. When the Client receives an SCMP message,
it checks the nonce field of the encapsulated packet-in-error to
verify that the message corresponds to the tunnel to its Server and
accepts the message if the nonce matches. (Note however that the
outer source and destination addresses of the packet-in-error may be
different from those in the original packet due to possible Server
and/or Relay address rewritings.)

6.2. IRON Serving Router Operation

After the Server is initialized, it responds to SRSs from Clients by
sending SRAs as described in Section 6.1. When the Server receives
an SRS message from a new Client, it sends back an SRA message with a
challenge/response puzzle.
The Client in turn sends an SRS message with an answer to the puzzle.
If this authentication fails, the Server discards the message.
Otherwise, it creates tunnel state for this new Client, records the
Client's EPs (see Section 5.3) in its FIB, and records the locator
address from the SCMP message as the link-layer address of the next
hop. The Server next sends an SRA message back to the Client to
complete the tunnel establishment.

When the Server receives a SEAL-encapsulated packet from one of its
Client tunnel endpoints, it examines the inner destination address.
If the inner destination address is not an EPA, the Server
decapsulates the packet and forwards it unencapsulated into the
Internet if it is able to do so without loss due to ingress
filtering. Otherwise, the Server re-encapsulates the packet (i.e.,
it removes the outer header and replaces it with a new outer header
of the same address family) and sets the outer destination address to
the locator address of a Relay within its VPC overlay network. It
then forwards the re-encapsulated packet to the Relay, which will in
turn decapsulate it and forward it into the Internet.

If the inner destination address is an EPA, however, the Server
rewrites the outer source address to one of its own locator addresses
and rewrites the outer destination address to the subnet router
anycast address taken from the companion prefix associated with the
inner destination address (where the companion prefix of the same
address family as the outer IP protocol is used). The Server then
forwards the revised encapsulated packet into the Internet via a
default or more-specific route, where it will be directed to the
closest Relay within the destination VPC overlay network.
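
The Server's choice among these three outcomes can be sketched with
Python's standard ipaddress module. The table layout, the function
names, the can_send_native flag, and the Relay locator below are
hypothetical; the prefixes are documentation ranges and the companion
prefixes reuse the examples from Section 5.1.

```python
import ipaddress
from typing import Tuple

# Hypothetical provisioning data: each VP in the IRON mapped to its
# companion prefixes, keyed by address family (4 or 6).
VP_TABLE = {
    ipaddress.ip_network("2001:db8:1::/48"): {
        4: ipaddress.ip_network("192.0.2.0/24"),
        6: ipaddress.ip_network("2001:db8::/64"),
    },
}
RELAY_LOCATOR = "198.51.100.1"  # assumed locator of a Relay in this VPC


def anycast_address(companion) -> str:
    """Host number 1 for an IPv4 companion prefix, interface ID 0 for IPv6."""
    if companion.version == 4:
        return str(companion.network_address + 1)
    return str(companion.network_address)


def server_next_hop(inner_dst: str, outer_family: int,
                    can_send_native: bool) -> Tuple[str, str]:
    """Return (action, target) for a packet received from a Client tunnel."""
    dst = ipaddress.ip_address(inner_dst)
    for vp, companions in VP_TABLE.items():
        if dst.version == vp.version and dst in vp:
            # Destination is an EPA: aim at the closest Relay via anycast.
            return ("rewrite-to-anycast",
                    anycast_address(companions[outer_family]))
    if can_send_native:
        # Non-EPA destination and no ingress-filtering concern.
        return ("decapsulate-and-forward", inner_dst)
    return ("re-encapsulate-to-relay", RELAY_LOCATOR)
```

Keeping the VP list in one table mirrors the observation in Section
5.2 that the Server is simpler when the VPs are covered by a small
number of short prefixes.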
After sending the packet, the Server may then receive an SCMP error
or redirect message from a Relay/Server within the destination VPC
overlay network. In that case, the Server verifies that the nonce in
the message matches the tunnel corresponding to the Client that sent
the original inner packet and discards the message if the nonce does
not match. Otherwise, the Server re-encapsulates the SCMP message in
a new outer header that uses the source address, destination address
and nonce parameters associated with the tunnel to the Client; it
then forwards the message to the Client. This arrangement is
necessary to allow SCMP messages to flow through any NATs on the
path.

When a Server ('A') receives a SEAL-encapsulated packet from a Relay
or from the Internet, if the inner destination address matches an EP
in its FIB, 'A' re-encapsulates the packet in a new outer header that
uses the source address, destination address and nonce parameters
associated with the tunnel and forwards it to a Client ('B'), which
in turn decapsulates the packet and forwards it to the correct end
system in the EUN. If 'B' has left notice with 'A' that it has moved
to a new Server ('C'), however, 'A' will instead forward the packet
to 'C' and also send an SCMP redirect message back to the source of
the packet. In this way, 'B' can leave behind forwarding information
when changing between Servers 'A' and 'C' (e.g., due to mobility
events) without exposing packets to loss.

6.3. IRON Relay Router Operation

After each Relay has synchronized its VPs (see Section 5.1), it
advertises the full set of the VPC's VPs and companion prefixes into
the IPv4 and IPv6 Internet BGP routing systems.
These prefixes will be represented as ordinary routing information in
the BGP, and any packets originating from the IPv4 or IPv6 Internet
destined to an address covered by one of the prefixes will be
forwarded to one of the VPC overlay network's Relays.

When a Relay receives a packet from the Internet destined to an EPA
covered by one of its VPs, it behaves as an ordinary IP router. In
particular, the Relay looks in its FIB to discover a locator of the
Server that serves the EP that covers the destination address. The
Relay then simply encapsulates the packet with its own locator as the
outer source address and the locator of the Server as the outer
destination address and forwards the packet to the Server.

When a Relay receives a packet from the Internet destined to one of
its subnet router anycast addresses, it discards the packet if it is
not SEAL-encapsulated. If the packet is an SCMP SRS message, the
Relay instead sends an SRA message back to the source listing the
locator addresses of nearby Servers, then discards the SRS message.
The Relay discards all other SCMP messages.

If the packet is an ordinary SEAL packet (i.e., one that encapsulates
an inner packet), the Relay sends an SCMP redirect message of the
same address family back to the source with the locator of the Server
that serves the EPA destination in the inner packet as the redirected
target. The source and destination addresses of the SCMP redirect
message use the outer destination and source addresses of the
original packet, respectively. After sending the redirect message,
the Relay then rewrites the outer destination address of the
SEAL-encapsulated packet to the locator of the Server and forwards
the revised packet to the Server.
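
The Relay's handling of packets addressed to its subnet router
anycast address amounts to a small dispatch routine. The sketch below
models it as a function that returns a list of actions; the
AnycastPacket record, the action tuples, and the fib_lookup callable
(here assumed to map an inner destination directly to a Server
locator) are hypothetical, and the case where the FIB has no matching
entry is not covered.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class AnycastPacket:
    """Simplified view of a packet sent to the anycast address."""
    seal_encapsulated: bool
    scmp_type: Optional[str]  # "SRS", another SCMP type, or None for data
    outer_src: str
    outer_dst: str
    inner_dst: Optional[str] = None


def relay_handle_anycast(pkt: AnycastPacket,
                         nearby_servers: Sequence[str],
                         fib_lookup: Callable[[str], str]) -> List[tuple]:
    """Return the actions the Relay takes, in order."""
    if not pkt.seal_encapsulated:
        return [("discard",)]
    if pkt.scmp_type == "SRS":
        # Answer the solicitation, then drop the SRS itself.
        return [("send_sra", pkt.outer_src, tuple(nearby_servers)),
                ("discard",)]
    if pkt.scmp_type is not None:
        return [("discard",)]
    server = fib_lookup(pkt.inner_dst)
    return [
        # Redirect source/destination are the original outer
        # destination/source, with the Server as the redirected target.
        ("send_redirect", pkt.outer_dst, pkt.outer_src, server),
        # Outer destination rewritten to the Server's locator.
        ("forward", server),
    ]
```

Returning actions instead of performing I/O keeps the sketch
testable; a real Relay would emit the SCMP messages and forward the
rewritten packet directly.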
Note that in this arrangement any errors that occur on the path
between the Relay and the Server will be delivered to the original
source but with a different destination address due to this Relay
address rewriting.

6.4. IRON Reference Operating Scenarios

The IRON supports communications when one or both hosts are located
within EP-addressed EUNs regardless of whether the EPs are
provisioned by the same VPC or by different VPCs. When both hosts
are within IRON EUNs, route redirections that eliminate unnecessary
Servers and Relays from the path are possible. When only one host is
within an IRON EUN, however, route optimization cannot be used. The
following sections discuss the two scenarios.

6.4.1. Both Hosts Within IRON EUNs

When both hosts are within IRON EUNs, it is sufficient to consider
the scenario in a unidirectional fashion, i.e., by tracing packet
flows only in the forward direction from the source host to
destination host. The reverse direction can be considered
separately, and incurs the same considerations as for the forward
direction.

In this scenario, the initial packets of a flow produced by a source
host within an EUN connected to the IRON by a Client must flow
through both the Server of the source host and a Relay of the
destination host, but route optimization can eliminate these elements
from the path for subsequent packets in the flow. Figure 6 shows the
flow of initial packets from host A to host B within two IRON EUNs
(the same scenario applies whether the two EUNs are within the same
VPC overlay network or different overlay networks):

(Diagram: Host A in IRON EUN A sends through Client(A) and ISP A to
Server(A), across the Internet to Relay(B), then to Server(B), ISP B
and Client(B), which delivers to Host B in IRON EUN B. Relay(B)
returns a redirect toward Client(A). The IRON is overlaid on the
native Internet.)

Figure 6: Initial Packet Flow Before Redirects

With reference to Figure 6, host A sends packets destined to host B
via its network interface connected to EUN A. Routing within EUN A
will direct the packets to Client(A) as a default router for the EUN
which then uses VET and SEAL to encapsulate them in outer headers
with its locator address as the outer source address and the locator
address of Server(A) as the outer destination address. Client(A)
then simply forwards the encapsulated packets into its ISP network
connection that provided its locator. The ISP will forward the
encapsulated packets into the Internet without filtering since the
(outer) source address is topologically correct. Once the packets
have been forwarded into the Internet, routing will direct them to
Server(A).

Server(A) receives the encapsulated packets from Client(A) then
rewrites the outer source address to one of its own locator
addresses, and rewrites the outer destination address to the subnet
router anycast address of the appropriate address family associated
with the inner destination address.
Server(A) then forwards the revised encapsulated packets into the
Internet where routing will direct them to Relay(B) which services
the VPC overlay network associated with host B.

Relay(B) will intercept the encapsulated packets from Server(A) then
check its FIB to discover an entry that covers inner destination
address B with Server(B) as the next hop. Relay(B) then returns SCMP
redirect messages to Server(A) (*), rewrites the outer destination
address of the encapsulated packets to the locator address of
Server(B), and forwards these revised packets to Server(B).

Server(B) will receive the encapsulated packets from Relay(B) then
check its FIB to discover an entry that covers destination address B
with Client(B) as the next hop. Server(B) then re-encapsulates the
packets in a new outer header that uses the source address,
destination address and nonce parameters associated with the tunnel
to Client(B). Server(B) then forwards these re-encapsulated packets
into the Internet, where routing will direct them to Client(B).
Client(B) will in turn decapsulate the packets and forward the inner
packets to host B via EUN B.

(*) Note that after the initial flow of packets, Server(A) will have
received one or more SCMP redirect messages from Relay(B) listing
Server(B) as a better next hop. Server(A) will in turn forward the
redirects to Client(A), which will thereafter forward its
encapsulated packets directly to the locator address of Server(B)
without involving either Server(A) or Relay(B) as shown in Figure 7:

(Diagram: Client(A) sends through ISP A and the Internet directly to
Server(B), which forwards through ISP B to Client(B); Server(A) and
Relay(B) no longer appear on the path between Host A in IRON EUN A
and Host B in IRON EUN B.)

Figure 7: Sustained Packet Flow After Redirects

6.4.2. Mixed IRON and Non-IRON Hosts

When one host is within an IRON EUN and the other is in a non-IRON
EUN (i.e., one that connects to the native Internet instead of the
IRON), the IR elements involved depend on the packet flow directions.
The cases are described in the following sections.

6.4.2.1. From IRON Host A to Non-IRON Host B

Figure 8 depicts the IRON reference operating scenario for packets
flowing from Host A in an IRON EUN to Host B in a non-IRON EUN:

(Diagram: Client(A) in IRON EUN A tunnels through ISP A to Server(A);
the co-located Relay(A) forwards across the native Internet to ISP B,
Router B and Host B in non-IRON EUN B.)

Figure 8: From IRON Host A to Non-IRON Host B

In this scenario, host A sends packets destined to host B via its
network interface connected to IRON EUN A. Routing within EUN A will
direct the packets to Client(A) as a default router for the EUN which
then uses VET and SEAL to encapsulate them in outer headers with its
locator address as the outer source address and the locator address
of Server(A) as the outer destination address. The ISP will pass the
packets without filtering since the (outer) source address is
topologically correct. Once the packets have been released into the
native Internet, routing will direct them to Server(A).

Server(A) receives the encapsulated packets from Client(A) then
re-encapsulates and forwards them to Relay(A), which simply
decapsulates them and forwards the unencapsulated packets into the
Internet. Once the packets are released into the Internet, routing
will direct them to the final destination B. (Note that Server(A)
and Relay(A) are depicted in Figure 8 as two halves of a unified
gateway. In that case, the "forwarding" between Server(A) and
Relay(A) is a zero-instruction imaginary operation within the
gateway.)

This scenario always involves a Server and Relay owned by the VPC
that provides service to IRON EUN A. It therefore imparts a cost that
would need to be borne by either the VPC or its customers.

6.4.2.2. From Non-IRON Host B to IRON Host A

Figure 9 depicts the IRON reference operating scenario for packets
flowing from Host B in a non-IRON EUN to Host A in an IRON EUN:

(Diagram: Host B in non-IRON EUN B sends through Router B and ISP B
across the native Internet to Relay(A); the co-located Server(A)
tunnels through ISP A to Client(A), which delivers to Host A in IRON
EUN A.)

Figure 9: From Non-IRON Host B to IRON Host A

In this scenario, host B sends packets destined to host A via its
network interface connected to non-IRON EUN B. Routing will direct
the packets to Relay(A) which then forwards them to Server(A) using
encapsulation if necessary.

Server(A) will then check its FIB to discover an entry that covers
destination address A with Client(A) as the next hop. Server(A) then
(re-)encapsulates the packets in an outer header that uses the source
address, destination address and nonce parameters associated with the
tunnel to Client(A). Server(A) next forwards these (re-)encapsulated
packets into the Internet, where routing will direct them to
Client(A). Client(A) will in turn decapsulate the packets and
forward the inner packets to host A via its network interface
connected to IRON EUN A.

This scenario always involves a Server and Relay owned by the VPC
that provides service to IRON EUN A. It therefore imparts a cost that
would need to be borne by either the VPC or its customers.

6.5. Mobility, Multihoming and Traffic Engineering Considerations

While IRON Servers and Relays can be considered as fixed
infrastructure, Clients may need to move between different network
points of attachment, connect to multiple ISPs, or explicitly manage
their traffic flows. The following sections discuss mobility,
multihoming and traffic engineering considerations for IRON client
routers.

6.5.1. Mobility Management

When a Client changes its network point of attachment (e.g., due to a
mobility event), it configures one or more new locators. If the
Client has not moved far away from its previous network point of
attachment, it simply informs its Server of any locator additions or
deletions. This operation is performance-sensitive, and should be
conducted immediately to avoid packet loss.

If the Client has moved far away from its previous network point of
attachment, however, it re-issues the anycast discovery procedure
described in Section 5.3 to discover whether its candidate set of
Servers has changed. If the Client's current Server is also included
in the new list received from the VPC, this provides indication that
the Client has not moved far enough to warrant changing to a new
Server. Otherwise, the Client may wish to move to a new Server in
order to maintain optimal routing. This operation is not
performance-critical, and therefore can be conducted over a matter of
seconds/minutes instead of milliseconds/microseconds.

To move to a new Server, the Client first engages in the EP
registration process with the new Server and maintains the
registrations through periodic SRS/SRA exchanges the same as
described in Section 6.1. The Client then informs its former Server
that it has moved by providing it with the locator address of the new
Server.
The Client then discontinues the SRS/SRA keepalive process with the
former Server, which will garbage-collect the stale FIB entries when
their lifetime expires. This will allow the former Server to
redirect existing correspondents to the new Server so that no packets
are lost.

6.5.2. Multihoming

A Client may register multiple locators with its Server. It can
assign metrics with its registrations to inform the Server of
preferred locators, and can select outgoing locators according to its
local preferences. Multihoming is therefore naturally supported.

6.5.3. Inbound Traffic Engineering

A Client can dynamically adjust the priorities of its prefix
registrations with its Server in order to influence inbound traffic
flows. It can also change between Servers when multiple Servers are
available, but should strive for stability in its Server selection in
order to limit VPC network routing churn.

6.5.4. Outbound Traffic Engineering

A Client can select outgoing locators, e.g., based on current QoS
considerations such as minimizing one-way delay or one-way delay
variance.

6.6. Renumbering Considerations

As new link layer technologies and/or service models emerge,
customers will be motivated to select their service providers through
healthy competition between ISPs. If a customer's EUN addresses are
tied to a specific ISP, however, the customer may be forced to
undergo a painstaking EUN renumbering process if it wishes to change
to a different ISP [RFC4192][RFC5887].

When a customer obtains EP prefixes from a VPC, it can change between
ISPs seamlessly and without need to renumber. If the VPC itself
applies unreasonable costing structures for use of the EPs, however,
the customer may be compelled to seek a different VPC and would again
be required to confront a renumbering scenario.
The IRON approach to renumbering avoidance therefore depends on VPCs
conducting ethical business practices and offering reasonable rates.

6.7. NAT Traversal Considerations

The Internet today consists of a global public IPv4 routing and
addressing system with non-IRON EUNs that use either public or
private IPv4 addressing. The latter class of EUNs connect to the
public Internet via Network Address Translators (NATs). When a
Client is located behind a NAT, it selects Servers using the same
procedures as for Clients with public addresses, i.e., it will send
SRS messages to Servers in order to get SRA messages in return. The
only requirement is that the Client must configure its SEAL
encapsulation to use a transport protocol that supports NAT
traversal, namely UDP.

Since the Server maintains state about its Client customers, it can
discover locator information for each Client by examining the UDP
port number and IP address in the outer headers of SRS messages.
When there is a NAT in the path, the UDP port number and IP address
in the SRS message will correspond to state in the NAT box and might
not correspond to the actual values assigned to the Client. The
Server can then encapsulate packets destined to hosts in the Client's
EUN within outer headers that use this IP address and UDP port
number. The NAT box will receive the packets, translate the values
in the outer headers, then forward the packets to the Client. In
this sense, the Server's "locator" for the Client consists of the
concatenation of the IP address and UDP port number.

IRON does not introduce any new issues beyond the complications
already raised for NAT traversal or for applications embedding
address referrals in their payload.

6.8. Nested EUN Considerations

Each Client configures a locator that may be taken from an ordinary
non-EPA address assigned by an ISP or from an EPA address taken from
an EP assigned to another Client. In the latter case, the Client is
said to be "nested" within the EUN of another Client, and recursive
nestings of multiple layers of encapsulations may be necessary.

For example, in the network scenario depicted in Figure 10 Client(A)
configures a locator EPA(B) taken from the EP assigned to EUN(B).
Client(B) in turn configures a locator EPA(C) taken from the EP
assigned to EUN(C). Finally, Client(C) configures a locator ISP(D)
taken from a non-EPA address delegated by an ordinary ISP(D). Using
this example, the "nested-IRON" case must be examined in which a host
A which configures the address EPA(A) within EUN(A) exchanges packets
with host Z located elsewhere in the Internet.

(Diagram: Host A, with address EPA(A), is in EUN(A) behind Client(A),
whose locator EPA(B) lies in EUN(B). Client(B) has locator EPA(C) in
EUN(C). Client(C) has locator ISP(D) and connects through ISP(D) to
the Internet. Tunnels run from the Clients to Server(A), Server(B)
and Server(C) in the IRON, which also contains Relay(Z), Server(Z)
and Client(Z) serving Host Z.)

Figure 10: Nested EUN Example

The two cases of host A sending packets to host Z, and host Z sending
packets to host A, must be considered separately as described below.

6.8.1. Host A Sends Packets to Host Z

Host A first forwards a packet with source address EPA(A) and
destination address Z into EUN(A). Routing within EUN(A) will direct
the packet to Client(A), which encapsulates it in an outer header
with EPA(B) as the outer source address and Server(A) as the outer
destination address then forwards the once-encapsulated packet into
EUN(B). Routing within EUN(B) will direct the packet to Client(B),
which encapsulates it in an outer header with EPA(C) as the outer
source address and Server(B) as the outer destination address then
forwards the twice-encapsulated packet into EUN(C). Routing within
EUN(C) will direct the packet to Client(C), which encapsulates it in
an outer header with ISP(D) as the outer source address and Server(C)
as the outer destination address. Client(C) then sends this
triple-encapsulated packet into the ISP(D) network, where it will be
routed into the Internet to Server(C).

When Server(C) receives the triple-encapsulated packet, it removes
the outer layer of encapsulation and forwards the resulting
twice-encapsulated packet into the Internet to Server(B). Next,
Server(B) removes the outer layer of encapsulation and forwards the
resulting once-encapsulated packet into the Internet to Server(A).
Next, Server(A) checks the address type of the inner address 'Z'. If
Z is a non-EPA address, Server(A) simply decapsulates the packet and
forwards it into the Internet. Otherwise, Server(A) rewrites the
outer source and destination addresses of the once-encapsulated
packet and forwards it to Relay(Z).
   Relay(Z) in turn rewrites the outer destination address of the
   packet to the locator for Server(Z), then forwards the packet and
   sends a redirect to Server(A) (which forwards the redirect to
   Client(A)). Server(Z) then re-encapsulates the packet and forwards
   it to Client(Z), which decapsulates it and forwards the inner packet
   to host Z. Subsequent packets from Client(A) will then use Server(Z)
   as the next hop toward host Z, which eliminates Server(A) and
   Relay(Z) from the path.

6.8.2.  Host Z Sends Packets to Host A

   Whether or not host Z configures an EPA address, its packets destined
   to host A will eventually reach Server(A). Server(A) will have a
   mapping that lists Client(A) as the next hop toward EPA(A).
   Server(A) will then encapsulate the packet with EPA(B) as the outer
   destination address and forward the packet into the Internet.
   Internet routing will convey this once-encapsulated packet to
   Server(B), which will have a mapping that lists Client(B) as the next
   hop toward EPA(B). Server(B) will then encapsulate the packet with
   EPA(C) as the outer destination address and forward the packet into
   the Internet. Internet routing will then convey this
   twice-encapsulated packet to Server(C), which will have a mapping
   that lists Client(C) as the next hop toward EPA(C). Server(C) will
   then encapsulate the packet with ISP(D) as the outer destination
   address and forward the packet into the Internet. Internet routing
   will then convey this triple-encapsulated packet to Client(C).

   When the triple-encapsulated packet arrives at Client(C), Client(C)
   strips the outer layer of encapsulation and forwards the
   twice-encapsulated packet to EPA(C), which is the locator address of
   Client(B).
   When Client(B) receives the twice-encapsulated packet, it strips the
   outer layer of encapsulation and forwards the once-encapsulated
   packet to EPA(B), which is the locator address of Client(A). When
   Client(A) receives the once-encapsulated packet, it strips the outer
   layer of encapsulation and forwards the unencapsulated packet to
   EPA(A), which is the host address of host A.

7.  Implications for the Internet

   The IRON architecture envisions a hybrid routing/mapping system that
   benefits from both the shortest-path routing afforded by pure dynamic
   routing systems and the routing scaling suppression afforded by pure
   mapping systems. IRON therefore targets the elusive "sweet spot"
   that pure routing and pure mapping systems alone cannot satisfy.

   The IRON system requires a deployment of new routers/servers
   throughout the Internet and/or provider networks to maintain
   well-balanced virtual overlay networks. These routers/servers can be
   deployed incrementally without disruption to existing Internet
   infrastructure and appropriately managed to provide acceptable
   service levels to customers.

   End-to-end traffic that traverses an IRON virtual overlay network may
   experience delay variance between the initial packets and subsequent
   packets of a flow. This is due to the IRON system allowing longer
   path stretch for initial packets followed by timely route
   optimizations to utilize better next hop routers/servers for
   subsequent packets.

   IRON virtual overlay networks also work seamlessly with existing and
   emerging services within the native Internet. In particular,
   customers serviced by IRON virtual overlay networks will receive the
   same service enjoyed by customers serviced by non-IRON service
   providers.
   Internet services already deployed within the native Internet also
   need not make any changes to accommodate IRON virtual overlay network
   customers.

   The IRON system operates between routers within provider networks and
   end user networks. Within these networks, the underlying paths
   traversed by the virtual overlay networks may comprise links that
   accommodate varying MTUs. While the IRON system imposes an
   additional per-packet overhead that may cause the size of packets to
   become slightly larger than the underlying path can accommodate, IRON
   routers have a method for naturally detecting and tuning out all
   instances of path MTU underruns. In some cases, these MTU underruns
   may need to be reported back to the original hosts; however, the
   system will also allow for MTUs much larger than those typically
   available in current Internet paths to be discovered and utilized as
   more links with larger MTUs are deployed.

   Finally, and perhaps most importantly, the IRON system provides an
   in-built mobility management and multihoming capability that allows
   end user devices and networks to move about freely while both
   imparting minimal oscillations in the routing system and maintaining
   generally shortest-path routes. This mobility management is afforded
   through the very nature of the IRON customer/provider relationship,
   and therefore requires no adjunct mechanisms. The mobility
   management and multihoming capabilities are further supported by
   forward-path reachability detection that provides "hints of forward
   progress" in the same spirit as for IPv6 ND.

8.  Additional Considerations

   Considerations for the scalability of Internet Routing due to
   multihoming, traffic engineering and provider-independent addressing
   are discussed in [I-D.narten-radir-problem-statement]. Other scaling
   considerations specific to IRON are discussed in Appendix B.

   Route optimization considerations for mobile networks are found in
   [RFC5522].

9.  Related Initiatives

   IRON builds upon the concepts of the RANGER architecture [RFC5720],
   and therefore inherits the same set of related initiatives. The
   Internet Research Task Force (IRTF) Routing Research Group (RRG)
   mentions IRON in its recommendation for a routing architecture
   [I-D.irtf-rrg-recommendation].

   Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in
   Increasing Scopes (AIS) [I-D.zhang-evolution] provide the basis for
   the Virtual Prefix concepts.

   Internet vastly improved plumbing (Ivip) [I-D.whittle-ivip-arch] has
   contributed valuable insights, including the use of real-time
   mapping. The use of Servers as mobility anchor points is directly
   influenced by Ivip's associated TTR mobility extensions [TTRMOB].

   [I-D.bernardos-mext-nemo-ro-cr] discussed a route optimization
   approach using a Correspondent Router (CR) model. The IRON Server
   construct is similar to the CR concept described in that work;
   however, the manner in which customer EUNs coordinate with Servers is
   different and is based on the redirection model associated with NBMA
   links.

   Numerous publications have proposed NAT traversal techniques. The
   NAT traversal techniques adapted for IRON were inspired by the Simple
   Address Mapping for Premises Legacy Equipment (SAMPLE) proposal
   [I-D.carpenter-softwire-sample].

10.  IANA Considerations

   There are no IANA considerations for this document.

11.  Security Considerations

   Security considerations that apply to tunneling in general are
   discussed in [I-D.ietf-v6ops-tunnel-security-concerns]. Additional
   considerations that apply also to IRON are discussed in RANGER
   [RFC5720], VET [I-D.templin-intarea-vet] and SEAL
   [I-D.templin-intarea-seal].

   The IRON system further depends on mutual authentication of IRON
   Clients to Servers and Servers to Relays. This is accomplished
   through initial authentication exchanges followed by per-packet
   nonces that can be used to detect off-path attacks. As for all
   Internet communications, the IRON system also depends on Relays
   acting with integrity and not injecting false advertisements into the
   BGP (e.g., to mount traffic siphoning attacks).

   Each VPC overlay network requires a means for assuring the integrity
   of the interior routing system so that all Relays and Servers in the
   overlay have a consistent view of Client<->Server bindings. Finally,
   DoS attacks on IRON Relays and Servers can occur when packets with
   spoofed source addresses arrive at high data rates. This issue is no
   different than for any border router in the public Internet today,
   however.

12.  Acknowledgements

   The ideas behind this work have benefited greatly from discussions
   with colleagues, some of whom appear on the RRG and other IRTF/IETF
   mailing lists. Robin Whittle and Steve Russert co-authored the TTR
   mobility architecture, which strongly influenced IRON. Eric
   Fleischman pointed out the opportunity to leverage anycast for
   discovering topologically-close Servers. Thomas Henderson
   recommended a quantitative analysis of scaling properties.

   The following individuals provided essential review input: Jari
   Arkko, Mohamed Boucadair, Stewart Bryant, John Buford, Ralph Droms,
   Wesley Eddy, Adrian Farrel, Dae Young Kim and Robin Whittle.

13.  References

13.1.  Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              September 1981.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

13.2.  Informative References

   [BGPMON]   net, B., "BGPmon.net - Monitoring Your Prefixes,
              http://bgpmon.net/stat.php", June 2010.

   [I-D.bernardos-mext-nemo-ro-cr]
              Bernardos, C., Calderon, M., and I. Soto, "Correspondent
              Router based Route Optimisation for NEMO (CRON)",
              draft-bernardos-mext-nemo-ro-cr-00 (work in progress),
              July 2008.

   [I-D.carpenter-softwire-sample]
              Carpenter, B. and S. Jiang, "Legacy NAT Traversal for
              IPv6: Simple Address Mapping for Premises Legacy Equipment
              (SAMPLE)", draft-carpenter-softwire-sample-00 (work in
              progress), June 2010.

   [I-D.ietf-grow-va]
              Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and
              L. Zhang, "FIB Suppression with Virtual Aggregation",
              draft-ietf-grow-va-03 (work in progress), August 2010.

   [I-D.ietf-v6ops-tunnel-security-concerns]
              Krishnan, S., Thaler, D., and J. Hoagland, "Security
              Concerns With IP Tunneling",
              draft-ietf-v6ops-tunnel-security-concerns-04 (work in
              progress), October 2010.

   [I-D.irtf-rrg-recommendation]
              Li, T., "Recommendation for a Routing Architecture",
              draft-irtf-rrg-recommendation-16 (work in progress),
              November 2010.

   [I-D.narten-radir-problem-statement]
              Narten, T., "On the Scalability of Internet Routing",
              draft-narten-radir-problem-statement-05 (work in
              progress), February 2010.

   [I-D.russert-rangers]
              Russert, S., Fleischman, E., and F. Templin, "RANGER
              Scenarios", draft-russert-rangers-05 (work in progress),
              July 2010.

   [I-D.templin-intarea-seal]
              Templin, F., "The Subnetwork Encapsulation and Adaptation
              Layer (SEAL)", draft-templin-intarea-seal-25 (work in
              progress), December 2010.

   [I-D.templin-intarea-vet]
              Templin, F., "Virtual Enterprise Traversal (VET)",
              draft-templin-intarea-vet-19 (work in progress),
              December 2010.

   [I-D.whittle-ivip-arch]
              Whittle, R., "Ivip (Internet Vastly Improved Plumbing)
              Architecture", draft-whittle-ivip-arch-04 (work in
              progress), March 2010.

   [I-D.zhang-evolution]
              Zhang, B. and L. Zhang, "Evolution Towards Global Routing
              Scalability", draft-zhang-evolution-02 (work in progress),
              October 2009.

   [RFC1070]  Hagens, R., Hall, N., and M. Rose, "Use of the Internet as
              a subnetwork for experimentation with the OSI network
              layer", RFC 1070, February 1989.

   [RFC2526]  Johnson, D. and S. Deering, "Reserved IPv6 Subnet Anycast
              Addresses", RFC 2526, March 1999.

   [RFC3068]  Huitema, C., "An Anycast Prefix for 6to4 Relay Routers",
              RFC 3068, June 2001.

   [RFC3849]  Huston, G., Lord, A., and P. Smith, "IPv6 Address Prefix
              Reserved for Documentation", RFC 3849, July 2004.

   [RFC4192]  Baker, F., Lear, E., and R. Droms, "Procedures for
              Renumbering an IPv6 Network without a Flag Day", RFC 4192,
              September 2005.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4548]  Gray, E., Rutemiller, J., and G. Swallow, "Internet Code
              Point (ICP) Assignments for NSAP Addresses", RFC 4548,
              May 2006.

   [RFC5214]  Templin, F., Gleeson, T., and D. Thaler, "Intra-Site
              Automatic Tunnel Addressing Protocol (ISATAP)", RFC 5214,
              March 2008.

   [RFC5522]  Eddy, W., Ivancic, W., and T. Davis, "Network Mobility
              Route Optimization Requirements for Operational Use in
              Aeronautics and Space Exploration Mobile Networks",
              RFC 5522, October 2009.

   [RFC5720]  Templin, F., "Routing and Addressing in Networks with
              Global Enterprise Recursion (RANGER)", RFC 5720,
              February 2010.

   [RFC5737]  Arkko, J., Cotton, M., and L. Vegoda, "IPv4 Address Blocks
              Reserved for Documentation", RFC 5737, January 2010.

   [RFC5743]  Falk, A., "Definition of an Internet Research Task Force
              (IRTF) Document Stream", RFC 5743, December 2009.

   [RFC5887]  Carpenter, B., Atkinson, R., and H. Flinck, "Renumbering
              Still Needs Work", RFC 5887, May 2010.

   [TTRMOB]   Whittle, R. and S. Russert, "TTR Mobility Extensions for
              Core-Edge Separation Solutions to the Internet's Routing
              Scaling Problem,
              http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf",
              August 2008.

Appendix A.  IRON VPs Over Internetworks with Different Address Families

   The IRON architecture leverages the routing system by providing
   generally shortest-path routing for packets with EPA addresses from
   VPs that match the address family of the underlying Internetwork.
   When the VPs are of an address family that is not routable within the
   underlying Internetwork, however (e.g., when OSI/NSAP [RFC4548] VPs
   are used within an IPv4 Internetwork), a global mapping database is
   required to allow Servers to map VPs to companion prefixes taken from
   address families that are routable within the Internetwork. For
   example, an IPv6 VP (e.g., 2001:DB8::/32) could be paired with a
   companion IPv4 prefix (e.g., 192.0.2.0/24) so that encapsulated IPv6
   packets can be forwarded over IPv4-only Internetworks.

   Every VP in the IRON must therefore be represented in a globally
   distributed Master VP database (MVPd) that maintains VP-to-companion
   prefix mappings for all VPs in the IRON. The MVPd is maintained by a
   globally-managed assigned numbers authority in the same manner as the
   Internet Assigned Numbers Authority (IANA) currently maintains the
   master list of all top-level IPv4 and IPv6 delegations. The database
   can be replicated across multiple servers for load balancing, much in
   the same way that FTP mirror sites are used to manage software
   distributions.
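
   As a rough illustration only, the VP-to-companion prefix mappings
   held in the MVPd can be pictured as a table keyed by VP and searched
   by longest prefix match. The sketch below is not part of the IRON
   specification: the names MVPD and companion_prefix are hypothetical,
   the table is a plain in-memory dictionary, and its single entry is
   the example pairing given above.

```python
import ipaddress

# Hypothetical in-memory copy of the MVPd (an assumption, not a format
# defined by IRON): each Virtual Prefix (VP) maps to a companion prefix
# from an address family that the underlying Internetwork can route.
MVPD = {
    ipaddress.ip_network("2001:db8::/32"): ipaddress.ip_network("192.0.2.0/24"),
}


def companion_prefix(epa, mvpd=MVPD):
    """Return the companion prefix of the longest VP covering epa, or None."""
    addr = ipaddress.ip_address(epa)
    best = None
    for vp in mvpd:
        # Only compare prefixes of the same address family as the EPA.
        if vp.version == addr.version and addr in vp:
            if best is None or vp.prefixlen > best.prefixlen:
                best = vp
    return mvpd[best] if best is not None else None
```

   With this table, companion_prefix("2001:db8:1::1") yields the
   network 192.0.2.0/24, while an address outside every VP yields None.
   A linear scan is used only for brevity; a real Server would more
   plausibly use a trie or similar structure.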

   Upon startup, each Server discovers the full set of VPs for the IRON
   by reading the MVPd. The Server reads the MVPd from a nearby server
   and periodically checks the server for deltas since the database was
   last read. After reading the MVPd, the Server has a full list of VP
   to companion prefix mappings.

   The Server can then forward packets toward EPAs covered by a VP by
   encapsulating them in an outer header of the VP's companion prefix
   address family and using any address taken from the companion prefix
   as the outer destination address. The companion prefix therefore
   serves as an anycast prefix.

   Possible encapsulations in this model include IPv6-in-IPv4,
   IPv4-in-IPv6, OSI/CLNP-in-IPv6, OSI/CLNP-in-IPv4, etc.

Appendix B.  Scaling Considerations

   Scaling aspects of the IRON architecture have strong implications for
   its applicability in practical deployments. Scaling must be
   considered along multiple vectors including interdomain core routing
   scaling, scaling to accommodate large numbers of customer EUNs,
   traffic scaling, state requirements, etc.

   In terms of routing scaling, each VPC will advertise one or more VPs
   into the global Internet routing system from which EPs are delegated
   to customer EUNs. The routing scaling impact will therefore be
   minimized when each VP covers many EPs. For example, the IPv6 prefix
   2001:DB8::/32 contains 2^24 ::/56 EP prefixes for assignment to EUNs.
   The IRON could therefore accommodate 2^32 ::/56 EPs with only 2^8
   ::/32 VPs advertised in the interdomain routing core. (When even
   longer EP prefixes are used, e.g., /64s assigned to individual
   handsets in a cellular provider network, considerable numbers of EUNs
   can be represented within only a single VP.) Each VP also has an
   associated anycast companion prefix; hence, there will be one anycast
   prefix advertised into the global routing system for each VP.
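
   The prefix arithmetic in the preceding paragraph can be checked with
   a short calculation. This is a minimal sketch under the assumption
   that EPs are equal-length prefixes packed into equal-length VPs; the
   function names eps_per_vp and vps_needed are illustrative and do not
   come from the IRON documents.

```python
def eps_per_vp(vp_len, ep_len):
    """Number of EPs of length ep_len that fit in one VP of length vp_len."""
    if ep_len < vp_len:
        raise ValueError("an EP cannot be shorter than its covering VP")
    # Each extra prefix bit doubles the number of sub-prefixes.
    return 2 ** (ep_len - vp_len)


def vps_needed(total_eps, vp_len, ep_len):
    """Smallest number of VPs that can cover total_eps EPs."""
    per_vp = eps_per_vp(vp_len, ep_len)
    # Ceiling division using integer arithmetic only.
    return -(-total_eps // per_vp)
```

   For the figures quoted above, eps_per_vp(32, 56) gives 2^24 and
   vps_needed(2**32, 32, 56) gives 2^8, matching the text. With /64 EPs
   a single /32 VP covers 2^32 EPs, which is the "considerable numbers
   of EUNs" case mentioned for cellular handsets.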

   In terms of traffic scaling for Relays, each Relay represents an ASBR
   of a "shell" enterprise network that simply directs arriving traffic
   packets with EPA destination addresses towards Servers that service
   customer EUNs. Moreover, the Relay sheds traffic destined to EPAs
   through redirection, which removes it from the path for the vast
   majority of traffic packets. On the other hand, each Relay must
   handle all traffic packets forwarded between its customer EUNs and
   the non-IRON Internet. The scaling concerns for this latter class of
   traffic are no different than for ASBRs that connect large enterprise
   networks to the Internet. In terms of traffic scaling for Servers,
   each Server services a set of the VPC overlay network's customer
   EUNs. The Server services all traffic packets destined to its EUNs
   but only services the initial packets of flows initiated from the
   EUNs and destined to EPAs. Therefore, traffic scaling for
   EPA-addressed traffic is an asymmetric consideration and is
   proportional to the number of EUNs each Server serves.

   In terms of state requirements for Relays, each Relay maintains a
   list of all Servers in the VPC overlay network as well as FIB entries
   for all customer EUNs that each Server serves. This state is
   therefore dominated by the number of EUNs in the VPC overlay network.
   Sizing the Relay to accommodate state information for all EUNs is
   therefore required during VPC overlay network planning. In terms of
   state requirements for Servers, each Server maintains tunnel state
   for each of the customer EUNs it serves but need not keep state for
   all EUNs in the VPC overlay network. Finally, neither Relays nor
   Servers need keep state for final destinations of outbound traffic.

   Clients source and sink all traffic packets originating from or
   destined to the customer EUN.
   Therefore, traffic scaling considerations for Clients are the same as
   for any site border router. Clients also retain state for the
   Servers that serve the final destinations of outbound traffic flows.
   This can be managed as soft state, since stale entries purged from
   the cache will be refreshed when new traffic packets are sent.

Author's Address

   Fred L. Templin (editor)
   Boeing Research & Technology
   P.O. Box 3707 MC 7L-49
   Seattle, WA  98124
   USA

   Email: fltemplin@acm.org