Internet Research Task Force                             F. Templin, Ed.
(IRTF)                                      Boeing Research & Technology
Internet-Draft                                         December 22, 2010
Intended status: Experimental
Expires: June 25, 2011

              The Internet Routing Overlay Network (IRON)
                       draft-templin-iron-15.txt

Abstract

   Since the Internet must continue to support escalating growth due to
   increasing demand, it is clear that current routing architectures and
   operational practices must be updated. This document proposes an
   Internet Routing Overlay Network (IRON) that supports sustainable
   growth through Provider Independent addressing while requiring no
   changes to end systems and no changes to the existing routing system.
   IRON further addresses other important issues including routing
   scaling, mobility management, multihoming, traffic engineering and
   NAT traversal. While business considerations are an important
   determining factor for widespread adoption, they are out of scope for
   this document. This document is a product of the IRTF Routing
   Research Group.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 25, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  The Internet Routing Overlay Network
     3.1.  IRON Client Router
     3.2.  IRON Serving Router
     3.3.  IRON Relay Router
   4.  IRON Organizational Principles
   5.  IRON Initialization
     5.1.  IRON Relay Router Initialization
     5.2.  IRON Serving Router Initialization
     5.3.  IRON Client Router Initialization
   6.  IRON Operation
     6.1.  IRON Client Router Operation
     6.2.  IRON Serving Router Operation
     6.3.  IRON Relay Router Operation
     6.4.  IRON Reference Operating Scenarios
       6.4.1.  Both Hosts Within IRON EUNs
       6.4.2.  Mixed IRON and Non-IRON Hosts
     6.5.  Mobility, Multihoming and Traffic Engineering
           Considerations
       6.5.1.  Mobility Management
       6.5.2.  Multihoming
       6.5.3.  Inbound Traffic Engineering
       6.5.4.  Outbound Traffic Engineering
     6.6.  Renumbering Considerations
     6.7.  NAT Traversal Considerations
     6.8.  Nested EUN Considerations
       6.8.1.  Host A Sends Packets to Host Z
       6.8.2.  Host Z Sends Packets to Host A
   7.  Implications for the Internet
   8.  Additional Considerations
   9.  Related Initiatives
   10. IANA Considerations
   11. Security Considerations
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Appendix A.  IRON VPs Over Internetworks with Different
                Address Families
   Appendix B.  Scaling Considerations
   Author's Address

1.  Introduction

   Growth in the number of entries instantiated in the Internet routing
   system has led to concerns for unsustainable routing scaling
   [I-D.narten-radir-problem-statement]. Operational practices such as
   increased use of multihoming with IPv4 Provider-Independent (PI)
   addressing are resulting in more and more fine-grained prefixes
   injected into the routing system from more and more end-user
   networks.
   Furthermore, the forthcoming depletion of the public IPv4
   address space has raised concerns for both increased address space
   fragmentation (leading to yet further routing table entries) and an
   impending address space run-out scenario. At the same time, the IPv6
   routing system is beginning to see growth in IPv6 Provider-Aggregated
   (PA) prefixes [BGPMON] which must be managed in order to avoid the
   same routing scaling issues the IPv4 Internet now faces. Since the
   Internet must continue to scale to accommodate increasing demand, it
   is clear that new routing methodologies and operational practices are
   needed.

   Several related works have investigated routing scaling issues.
   Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in
   Increasing Scopes (AIS) [I-D.zhang-evolution] are global routing
   proposals that introduce routing overlays with Virtual Prefixes (VPs)
   to reduce the number of entries required in each router's Forwarding
   Information Base (FIB) and Routing Information Base (RIB). Routing
   and Addressing in Networks with Global Enterprise Recursion (RANGER)
   [RFC5720] examines recursive arrangements of enterprise networks that
   can apply to a very broad set of use case scenarios
   [I-D.russert-rangers]. In particular, RANGER supports encapsulation
   and secure redirection by treating each layer in the recursive
   hierarchy as a virtual non-broadcast, multiple access (NBMA) "link".
   RANGER is an architectural framework that includes Virtual Enterprise
   Traversal (VET) [I-D.templin-intarea-vet] and the Subnetwork
   Adaptation and Encapsulation Layer (SEAL) [I-D.templin-intarea-seal]
   as its functional building blocks.

   This document proposes an Internet Routing Overlay Network (IRON)
   with goals of supporting sustainable growth while requiring no
   changes to the existing routing system.
   IRON borrows concepts from
   VA, AIS and RANGER, and further borrows concepts from the Internet
   Vastly Improved Plumbing (Ivip) [I-D.whittle-ivip-arch] architecture
   proposal along with its associated Translating Tunnel Router (TTR)
   mobility extensions [TTRMOB]. Indeed, the TTR model to a great
   degree inspired the IRON mobility architecture design discussed in
   this document. The Network Address Translator (NAT) traversal
   techniques adapted for IRON were inspired by the Simple Address
   Mapping for Premises Legacy Equipment (SAMPLE) proposal
   [I-D.carpenter-softwire-sample].

   IRON specifically seeks to provide scalable PI addressing without
   changing the current BGP [RFC4271] routing system. IRON observes the
   Internet Protocol standards [RFC0791][RFC2460]. Other network layer
   protocols that can be encapsulated within IP packets (e.g., OSI/CLNP
   [RFC1070], etc.) are also within scope.

   The IRON is a global routing system comprising virtual overlay
   networks managed by Virtual Prefix Companies (VPCs) that own and
   manage Virtual Prefixes (VPs) from which End User Network (EUN) PI
   prefixes (EPs) are delegated to customer sites. The IRON is
   motivated by a growing customer demand for multihoming, mobility
   management and traffic engineering while using stable PI addressing
   to avoid network renumbering [RFC4192][RFC5887]. The IRON uses the
   existing IPv4 and IPv6 global Internet routing systems as virtual
   links for tunneling inner network protocol packets within outer IPv4
   or IPv6 headers (see: Section 3). The IRON requires deployment of a
   small number of new BGP core routers and supporting servers, as well
   as IRON-aware routers/servers in customer EUNs. No modifications to
   hosts, and no modifications to most routers are required.

   While the IRON architecture addresses network mobility, host mobility
   considerations are outside the scope of this document.
   IP multicast
   considerations are also out of scope.

   Note: This document is offered in compliance with Internet Research
   Task Force (IRTF) document stream procedures [RFC5743]; it is not an
   IETF product and is not a standard. The views in this document were
   considered controversial by the IRTF Routing Research Group (RRG) but
   the RG reached a consensus that the document should still be
   published. The document will undergo a period of review within the
   RRG and through selected expert reviewers prior to publication. The
   following sections discuss details of the IRON architecture.

2.  Terminology

   This document makes use of the following terms:

   End User Network (EUN)
      an edge network that connects an organization's devices (e.g.,
      computers, routers, printers, etc.) to the Internet.

   End User Network PI Prefix (EP)
      a more-specific Provider-Independent (PI) prefix derived from a
      Virtual Prefix (VP) (e.g., an IPv4 /28, an IPv6 /56, etc.) and
      delegated to an EUN by a Virtual Prefix Company (VPC).

   End User Network PI Address (EPA)
      a network layer address belonging to an EP and assigned to the
      interface of an end system in an EUN.

   Forwarding Information Base (FIB)
      a data structure containing network prefix to next-hop mappings;
      usually maintained in a router's fast-path processing lookup
      tables.

   Internet Routing Overlay Network (IRON)
      a composite virtual overlay network that comprises the union of
      all VPC overlay networks configured over a common Internetwork.
      The IRON supports routing through encapsulation of inner packets
      with EPA addresses within outer headers that use locator
      addresses.

   IRON Client Router ("Client")
      a customer's router (or host with embedded gateway function) that
      logically connects the customer's EUNs and their associated EPs to
      the IRON via tunnels.

   IRON Serving Router ("Server")
      a VPC's overlay network router that provides forwarding and
      mapping services for the EPs owned by customer Client routers.

   IRON Relay Router ("Relay")
      a VPC's overlay network router that acts as a relay between the
      IRON and the native Internet.

   IRON Router (IR)
      generically refers to any of an IRON Client/Server/Relay router.

   Internet Service Provider (ISP)
      a service provider which connects customer EUNs to the underlying
      Internetwork. In other words, an ISP is responsible for providing
      basic Internet connectivity for customer EUNs.

   Locator
      an IP address assigned to the interface of a router or end system
      within a public or private network. Locators taken from public IP
      prefixes are routable on a global basis, while locators taken from
      private IP prefixes are made public via Network Address
      Translation (NAT).

   Provider Aggregated (PA) address or prefix
      a network layer address or prefix delegated to an EUN by an ISP.

   Provider Independent (PI) address or prefix
      a network layer address or prefix delegated to an EUN by a third
      party independently of the EUN's ISP arrangements.

   Routing and Addressing in Networks with Global Enterprise Recursion
   (RANGER)
      an architectural examination of virtual overlay networks applied
      to enterprise network scenarios, with implications for a wider
      variety of use cases.

   Subnetwork Encapsulation and Adaptation Layer (SEAL)
      an encapsulation sublayer that provides extended packet
      identification and a control message protocol to ensure
      deterministic network-layer feedback.

   Virtual Enterprise Traversal (VET)
      a method for discovering border routers and forming dynamic
      point-to-(multi)point tunnels over enterprise networks (or sites)
      with varying properties.

   Virtual Prefix (VP)
      a PI prefix block (e.g., an IPv4 /16, an IPv6 /20, an OSI NSAP
      prefix, etc.)
      that is owned and managed by a Virtual Prefix
      Company (VPC).

   Virtual Prefix Company (VPC)
      a company that owns and manages a set of VPs from which it
      delegates EPs to EUNs.

   VPC Overlay Network
      a specialized set of routers deployed by a VPC to service customer
      EUNs through a virtual overlay network configured over an
      underlying Internetwork (e.g., the global Internet).

3.  The Internet Routing Overlay Network

   The Internet Routing Overlay Network (IRON) is a system of virtual
   overlay networks configured over a common Internetwork. While the
   principles presented in this document are discussed within the
   context of the public global Internet, they can also be applied to
   any autonomous Internetwork. The rest of this document therefore
   refers to the terms "Internet" and "Internetwork" interchangeably
   except in cases where specific distinctions must be made.

   The IRON consists of IRON Routers (IRs) that automatically tunnel the
   packets of end-to-end communication sessions within encapsulating
   headers used for Internet routing. IRs use Virtual Enterprise
   Traversal (VET) [I-D.templin-intarea-vet] in conjunction with the
   Subnetwork Encapsulation and Adaptation Layer (SEAL)
   [I-D.templin-intarea-seal] to encapsulate inner network layer packets
   within outer headers as shown in Figure 1:

                                         +-------------------------+
                                         |    Outer headers with   |
                                         ~     locator addresses   ~
                                         |     (IPv4 or IPv6)      |
                                         +-------------------------+
                                         |       SEAL Header       |
       +-------------------------+       +-------------------------+
       |   Inner Packet Header   |  -->  |   Inner Packet Header   |
       ~    with EP addresses    ~  -->  ~    with EP addresses    ~
       | (IPv4, IPv6, OSI, etc.) |  -->  | (IPv4, IPv6, OSI, etc.) |
       +-------------------------+       +-------------------------+
       |                         |  -->  |                         |
       ~    Inner Packet Body    ~  -->  ~    Inner Packet Body    ~
       |                         |  -->  |                         |
       +-------------------------+       +-------------------------+

          Inner packet before                Outer packet after
             encapsulation                     encapsulation

     Figure 1: Encapsulation of Inner Packets Within Outer IP Headers

   VET specifies the automatic tunneling mechanisms used for
   encapsulation, while SEAL specifies the format and usage of the SEAL
   header as well as a set of control messages. Most notably, IRs use
   the SEAL Control Message Protocol (SCMP) to deterministically
   exchange and authenticate control messages such as route
   redirections, indications of Path Maximum Transmission Unit (PMTU)
   limitations, destination unreachables, etc.

   The IRON is the union of all virtual overlay networks that are
   configured over a common underlying Internet and are owned and
   managed by Virtual Prefix Companies (VPCs). Each such virtual overlay
   network comprises a set of IRs distributed throughout the Internet to
   serve highly-aggregated Virtual Prefixes (VPs). VPCs delegate
   sub-prefixes from their VPs which they lease to customers as End User
   Network PI prefixes (EPs). The customers in turn assign the EPs to
   their customer edge IRs which connect their End User Networks (EUNs)
   to the IRON.

   VPCs may have no affiliation with the ISP networks from which
   customers obtain their basic Internet connectivity. Therefore, a
   customer could procure its summary network services either through a
   common broker or through separate entities. In that case, the VPC
   can open for business and begin serving its customers immediately
   without the need to coordinate its activities with ISPs or with other
   VPCs. Further details on business considerations are out of scope
   for this document.
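
   As a purely illustrative aid to the delegation model described
   earlier in this section (a VPC leases sub-prefixes of its VPs to
   customers as EPs), the following minimal Python sketch carves EPs out
   of a VP. The function name delegate_eps, the private-use prefix
   10.1.0.0/16 standing in for a VP, and the sequential allocation order
   are all assumptions made for this example; this document does not
   specify how a VPC chooses or tracks its delegations.

```python
import ipaddress
from itertools import islice


def delegate_eps(vp, ep_prefixlen, count):
    """Return the first `count` EPs of length `ep_prefixlen` within VP `vp`.

    Hypothetical helper: a real VPC would track leases persistently and
    need not allocate sequentially.
    """
    vp_net = ipaddress.ip_network(vp)
    # subnets() lazily yields every sub-prefix of the requested length.
    return list(islice(vp_net.subnets(new_prefix=ep_prefixlen), count))


# Stand-in VP (an IPv4 /16) from which three /28 EPs are delegated.
eps = delegate_eps("10.1.0.0/16", 28, 3)
for ep in eps:
    print(ep)
```

   The same helper works unchanged for IPv6 (for example, /56 EPs taken
   from a /20 VP), since the ipaddress module handles both families.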

   The IRON requires no changes to end systems and no changes to most
   routers in the Internet. Instead, the IRON comprises IRs that are
   deployed either as new platforms or as modifications to existing
   platforms. IRs may be deployed incrementally without disturbing the
   existing Internet routing system, and act as waypoints (or "cairns")
   for navigating the IRON. The functional roles for IRs are described
   in the following sections.

3.1.  IRON Client Router

   An IRON client router (or, simply, "Client") is a customer's router
   (or host with embedded gateway function) that logically connects the
   customer's EUNs and their associated EPs to the IRON via tunnels as
   shown in Figure 2. Clients obtain EPs from VPCs and use them to
   number subnets and interfaces within their EUNs. A Client can be
   deployed on the same physical platform that also connects the
   customer's EUNs to its ISPs, but it may also be a separate router or
   even a standalone server system located within the EUN. (This model
   applies even if the EUN connects to the ISP via a Network Address
   Translator (NAT) - see Section 6.7).

                           .-.
                        ,-(  _)-.
        +--------+   .-(_    (_  )-.
        | Client |--(_     ISP      )
        +---+----+     `-(______)-'
            |   <= T         \     .-.
           .-.       u        \ ,-(  _)-.
        ,-(  _)-.       n     .-(_    (-  )-.
     .-(_    (_  )-.      n  (_   Internet   )
    (_     EUN      )       e   `-(______)-
       `-(______)-'           l          ___
            |                   s =>    (:::)-.
       +----+---+                   .-(::::::::)
       |  Host  |                .-(::::::::::::)-.
       +--------+               (:::: The IRON ::::)
                                 `-(::::::::::::)-'
                                    `-(::::::)-'

          Figure 2: IRON Client Router Connecting EUN to the IRON

3.2.  IRON Serving Router

   An IRON serving router (or, simply, "Server") is a VPC's overlay
   network router that provides forwarding and mapping services for the
   EPs owned by customer Client routers.
   In typical deployments, a VPC
   will deploy many Servers around the IRON in a globally-distributed
   fashion (e.g., as depicted in Figure 3) so that Clients can discover
   those that are nearby.

             +--------+    +--------+
             | Boston |    | Tokyo  |
             | Server |    | Server |
             +--+-----+    ++-------+
     +--------+  \          /
     | Seattle|   \   ___  /
     | Server |    \ (:::)-.       +--------+
     +------+-+  .-(::::::::)------+ Paris  |
             \.-(::::::::::::)-.   | Server |
            (:::: The IRON ::::)   +--------+
             `-(::::::::::::)-'
     +--------+ /  `-(::::::)-'  \     +--------+
     | Moscow +        |          \--- + Sydney |
     | Server |   +----+---+           | Server |
     +--------+   | Cairo  |           +--------+
                  | Server |
                  +--------+

        Figure 3: IRON Serving Router Global Distribution Example

   Each Server acts as a tunnel-endpoint router that forms a
   bidirectional tunnel with each of its Client customers. Each Server
   also associates with a set of Relays that can forward packets from
   the IRON out to the native Internet and vice-versa as discussed in
   the next section.

3.3.  IRON Relay Router

   An IRON Relay Router (or, simply, "Relay") is a VPC's overlay network
   router that acts as a relay between the IRON and the native Internet.
   It therefore also serves as an Autonomous System Border Router (ASBR)
   that is owned and managed by the VPC.

   Each VPC configures one or more Relays which advertise the company's
   VPs into the IPv4 and IPv6 global Internet BGP routing systems. Each
   Relay associates with all of the VPC's overlay network Servers, e.g.,
   via tunnels over the IRON, via a direct interconnect such as an
   Ethernet cable, etc. The Relay role (as well as its relationship
   with overlay network Servers) is depicted in Figure 4:

                      .-.
                   ,-(  _)-.
                .-(_    (_  )-.
               (_   Internet   )
                  `-(______)-'   |  +--------+
                        |        |--| Server |
                   +----+---+    |  +--------+
                   | Relay  |----|  +--------+
                   +--------+    |--| Server |
                       _||       |  +--------+
                      (:::)-.  (Ethernet)
                  .-(::::::::)
   +--------+  .-(::::::::::::)-.  +--------+
   | Server |=(:::: The IRON ::::)=| Server |
   +--------+  `-(::::::::::::)-'  +--------+
                  `-(::::::)-'
                       || (Tunnels)
                   +--------+
                   | Server |
                   +--------+

      Figure 4: IRON Relay Router Connecting IRON to Native Internet

4.  IRON Organizational Principles

   The IRON consists of the union of all VPC overlay networks configured
   over a common Internetwork (e.g., the public Internet). Each such
   overlay network represents a distinct "patch" on the Internet
   "quilt", where the patches are stitched together by tunnels over the
   links, routers, bridges, etc., that connect the underlying
   Internetwork. When a new VPC overlay network is deployed, it becomes
   yet another patch on the quilt. The IRON is therefore a composite
   overlay network consisting of multiple individual patches, where each
   patch coordinates its activities independently of all others (with
   the exception that the Servers of each patch must be aware of all VPs
   in the IRON). In order to ensure mutual cooperation between all VPC
   overlay networks, sufficient address space portions of the inner
   network layer protocol (e.g., IPv4, IPv6, etc.) should be set aside
   and designated as VP space.

   Each VPC overlay network in the IRON maintains a set of Relays and
   Servers that provide services to their Client customers. In order to
   ensure adequate customer service levels, the VPC should conduct a
   traffic scaling analysis and distribute sufficient Relays and Servers
   for the overlay network globally throughout the Internet. Figure 5
   depicts the logical arrangement of Relays, Servers and Clients in an
   IRON virtual overlay network:

                              .-.
                           ,-(  _)-.
                        .-(_    (_  )-.
                       (__ Internet   _)
                          `-(______)-'

        <------------     Relays      ------------>
                  ________________________
                 (::::::::::::::::::::::::)-.
             .-(:::::::::::::::::::::::::::::)
          .-(:::::::::::::::::::::::::::::::::)-.
         (::::::::::: The IRON :::::::::::::::)
          `-(:::::::::::::::::::::::::::::::::)-'
             `-(::::::::::::::::::::::::::::)-'

        <------------    Servers      ------------>
        .-.                .-.                     .-.
     ,-(  _)-.          ,-(  _)-.               ,-(  _)-.
  .-(_    (_  )-.    .-(_    (_  )-.         .-(_    (_  )-.
 (__   ISP A    _)  (__   ISP B    _)  ...  (__   ISP x    _)
    `-(______)-'       `-(______)-'            `-(______)-'
         <-----------      NATs        ------------>

         <----------- Clients and EUNs ----------->

              Figure 5: Virtual Overlay Network Organization

   Each Relay in the VPC overlay network connects the overlay directly
   to the underlying IPv4 and IPv6 Internets. It also advertises the
   VPC overlay network's IPv4 VPs into the IPv4 BGP routing system and
   advertises the overlay network's IPv6 VPs into the IPv6 BGP routing
   system. Relays will therefore receive packets with EPA destination
   addresses sent by end systems in the Internet and direct them toward
   EPA-addressed end systems connected to the VPC overlay network.

   Each VPC overlay network also manages a set of Servers that connect
   their Clients and associated EUNs to the IRON and to the IPv6 and
   IPv4 Internets via their associations with Relays. IRON Servers
   therefore need not be BGP routers themselves and can be simple
   commodity hardware platforms. Moreover, the Server and Relay
   functions can be deployed together on the same physical platform as a
   unified gateway or they may be deployed on separate platforms (e.g.,
   for load balancing purposes).

   Each Server maintains a working set of Clients for which it caches
   EP-to-Client mappings in its Forwarding Information Base (FIB).
   Each
   Server also in turn propagates the list of EPs in its working set to
   each of the Relays in the VPC overlay network via a dynamic routing
   protocol (e.g., an overlay network internal BGP instance that carries
   only the EP-to-Server mappings and does not interact with the
   external BGP routing system). Each Server therefore only needs to
   track the EPs for its current working set of Clients, while each
   Relay will maintain a full EP-to-Server mapping table that represents
   reachability information for all EPs in the VPC overlay network.

   Customers establish Clients that obtain their basic Internet
   connectivity from ISPs and connect to Servers to attach their EUNs to
   the IRON. Each EUN can connect to the IRON via one or multiple
   Clients as long as the Clients coordinate with one another, e.g., to
   mitigate EUN partitions. Unlike Relays and Servers, Clients may use
   private addresses behind one or several layers of NATs. Each Client
   initially discovers a list of nearby Servers through an anycast
   discovery process (described below). It then selects one of these
   nearby Servers and forms a bidirectional tunnel through an initial
   exchange followed by periodic keepalives.

   After the Client selects a Server, it forwards initial outbound
   packets from its EUNs by tunneling them to the Server which in turn
   forwards them to the nearest Relay within the IRON that serves the
   final destination. The Client will subsequently receive redirect
   messages informing it of a more direct route through a Server that
   serves the final destination EUN.

   The IRON can also be used to support VPs of network layer address
   families that cannot be routed natively in the underlying
   Internetwork (e.g., OSI/CLNP over the public Internet, IPv6 over
   IPv4-only Internetworks, IPv4 over IPv6-only Internetworks, etc.).
   Further details for support of IRON VPs of one address family over
   Internetworks based on other address families are discussed in
   Appendix A.
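
   The two mapping tables described in this section (the EP-to-Client
   mappings that each Server keeps for its working set, and the full
   EP-to-Server table that each Relay keeps) can be pictured with the
   following minimal Python sketch. It is an illustration under stated
   assumptions, not the mechanism of any implementation: the class name
   MappingTable, the linear longest-prefix search, and the example
   prefixes and locators (taken from private-use and documentation
   address ranges) are all hypothetical. A real FIB would use a trie or
   similar structure, and Servers would announce EPs to Relays through a
   dynamic routing protocol rather than a direct method call.

```python
import ipaddress


class MappingTable:
    """Prefix-to-next-hop table with longest-prefix-match lookup."""

    def __init__(self):
        self._entries = {}  # ip_network -> next-hop locator (str)

    def add(self, prefix, next_hop):
        self._entries[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        self._entries.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the next hop for the longest matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self._entries
                   if net.version == addr.version and addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self._entries[best]


# Hypothetical locators: one Server and one Client behind some ISP.
SERVER_LOCATOR = "198.51.100.5"
CLIENT_LOCATOR = "203.0.113.20"

server_fib = MappingTable()   # EP -> Client, working set only
relay_table = MappingTable()  # EP -> Server, all EPs in the overlay

# A Client registers its EP with the Server, which tells the Relay.
server_fib.add("10.1.0.16/28", CLIENT_LOCATOR)
relay_table.add("10.1.0.16/28", SERVER_LOCATOR)

# A packet for EPA 10.1.0.17 arriving at the Relay goes to the Server,
# which in turn tunnels it to the Client.
print(relay_table.lookup("10.1.0.17"), server_fib.lookup("10.1.0.17"))
```

   Keeping the Server table limited to its own working set while the
   Relay holds every EP mirrors the division of labor described above.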
559 Further details for support of IRON VPs of one address family over 560 Internetworks based on other address families are discussed in 561 Appendix A. 563 5. IRON Initialization 565 IRON initialization entails the startup actions of IRs within the VPC 566 overlay network and customer EUNs. The following sections discuss 567 these startups procedures. 569 5.1. IRON Relay Router Initialization 571 Before its first operational use, each Relay in a VPC overlay network 572 is provisioned with the list of VPs that it will serve as well as the 573 locators for all Servers that belong to the same overlay network. 574 The Relay is also provisioned with external BGP interconnections the 575 same as for any BGP router. 577 Upon startup, the Relay engages in BGP routing exchanges with its 578 peers in the IPv4 and IPv6 Internets the same as for any BGP router. 579 It then connects to all of the Servers in the overlay network (e.g., 580 via a TCP connection over a bidirectional tunnel, via an iBGP route 581 reflector, etc.) for the purpose of discovering EP->Server mappings. 582 After the Relay has fully populated its EP->Server mapping 583 information database, it is said to be "synchronized" wrt its VPs. 585 After this initial synchronization procedure, the Relay then 586 advertises the overlay network's VPs externally. In particular, the 587 Relay advertises the IPv6 VPs into the IPv6 BGP routing system and 588 advertises the IPv4 VPs into the IPv4 BGP routing system. The Relay 589 additionally advertises an IPv4 /24 companion prefix (e.g., 590 192.0.2.0/24) into the IPv4 routing system and an IPv6 ::/64 591 companion prefix (e.g., 2001:DB8::/64) into the IPv6 routing system 592 (note that these may also be sub-prefixes taken from a VP). 
The 593 Relay then configures the host number '1' in the IPv4 companion 594 prefix (e.g., as 192.0.2.1) and the interface identifier '0' in the 595 IPv6 companion prefix (e.g., as 2001:DB8::0) and assigns the 596 resulting addresses as subnet router anycast addresses 597 [RFC3068][RFC2526] for the VPC overlay network. (See Appendix A for 598 more information on the discovery and use of companion prefixes.) 599 The Relay then engages in ordinary packet forwarding operations. 601 5.2. IRON Serving Router Initialization 603 Before its first operational use, each Server in a VPC overlay 604 network is provisioned with the locators for all Relays that 605 aggregate the overlay network's VPs. In order to support route 606 optimization, the Server must also be provisioned with the list of 607 all VPs in the IRON (i.e., and not just the VPs of its own overlay 608 network) so that it can discern EPA and non-EPA addresses. (The 609 Server could therefore be greatly simplified if the list of VPs could 610 be covered within a small number of very short prefixes, e.g., one or 611 a few IPv6 ::/20's). The Server must also discover the VP companion 612 prefix relationships discussed in Section 5.1, e.g., via a global 613 database such as discussed in Appendix A. 615 Upon startup, each Server must connect to all of the Relays within 616 its overlay network (e.g., via a TCP connection over a bidirectional 617 tunnel, via an iBGP route reflector, etc.) for the purpose of 618 reporting its EP->Server mappings. The Server then actively listens 619 for Client customers which register their EP prefixes as part of 620 establishing a bidirectional tunnel. When a new Client registers its 621 EP prefixes, the Server announces the new EP additions to all Relays; 622 when an existing Client unregisters its EP prefixes, the Server 623 withdraws its announcements. 625 5.3. 
IRON Client Router Initialization 627 Before its first operational use, each Client must obtain one or more 628 EPs from its VPC as well as the companion prefixes associated with 629 the VPC overlay network (see Section 5.1). The Client must also 630 obtain a certificate and a public/private key pair from the VPC that 631 it can later use to prove ownership of its EPs. This implies that 632 each VPC must run its own public key infrastructure to be used only 633 for the purpose of verifying its customers' claimed right to use an 634 EP. Hence, the VPC need not coordinate its public key infrastructure 635 with any other organization. 637 Upon startup, the Client sends an SCMP Router Solicitation (SRS) 638 message to the VPC overlay network subnet router anycast address to 639 discover the nearest Relay. The Relay will return an SCMP Router 640 Advertisement (SRA) message that lists the locator addresses of one 641 or more nearby Servers. (This list is analogous to the ISATAP 642 Potential Router List (PRL) [RFC5214].) 644 After the Client receives an SRA message from the nearby Relay 645 listing the locator addresses of nearby Servers, it initiates a short 646 transaction with one of the Servers, carried by a reliable transport 647 protocol such as TCP, in order to establish a bidirectional tunnel. 648 The protocol details of the transaction are specific to the VPC, and 649 hence out of scope for this document. 651 Note that it is essential that the Client select one and only one 652 Server. This is to allow the VPC overlay network mapping system to 653 have one and only one active EP-to-Server mapping at any point in 654 time, which shares fate with the Server itself. If this Server fails, 655 the Client can select a new one, which will automatically update the 656 VPC overlay network mapping system with a new EP-to-Server mapping. 658 6.
IRON Operation 660 Following the IRON initialization detailed in Section 5, IRs engage 661 in the steady-state process of receiving and forwarding packets. All 662 IRs forward encapsulated packets over the IRON using the mechanisms 663 of VET [I-D.templin-intarea-vet] and SEAL [I-D.templin-intarea-seal], 664 while Relays (and in some cases Servers) additionally forward packets 665 to and from the native IPv6 and IPv4 Internets. IRs also use SCMP to 666 coordinate with other IRs, including the process of sending and 667 receiving redirect messages, error messages, etc. (Note however that 668 an IR must not send an SCMP message in response to an SCMP error 669 message.) Each IR operates as specified in the following sub- 670 sections. 672 6.1. IRON Client Router Operation 674 After selecting its Server as specified in Section 5.3, the Client 675 should register each of its ISP connections with the Server in order 676 to establish multiple bidirectional tunnels for multihoming purposes. 677 To do so, it sends periodic SRS messages to its Server via each of 678 its ISPs to establish additional bidirectional tunnels and to keep 679 each tunnel alive. These messages need not include challenge/ 680 response mechanisms since prefix proof of ownership was already 681 established in the initial exchange and a nonce in the SEAL header 682 can be used to confirm that the SRS message was sent by the correct 683 Client. This implies that a single nonce is used to represent the 684 set of all bidirectional tunnels between the Client and the Server. 685 Therefore, there are multiple bidirectional tunnels, and the nonce 686 names this "bundle" of tunnels. (The Client and Server may 687 conceptually represent this "bundle" as a single tunnel with multiple 688 locator addresses, however each such locator address must be tested 689 independently in case there are NATs on the path.) 
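As an illustration only, the "bundle" of tunnels described above can be sketched as a small data structure in which one nonce names the bundle while each locator keeps its own liveness state. The class name, field names and the 30-second keepalive lifetime are assumptions made for this sketch; they are not taken from VET, SEAL or this document.

```python
from dataclasses import dataclass, field


@dataclass
class TunnelBundle:
    """All bidirectional tunnels between one Client and its Server.

    A single nonce names the whole bundle; each locator (one per ISP
    connection) has its own liveness state because each one must be
    tested independently (e.g., there may be NATs on some paths).
    """
    nonce: int
    keepalive_lifetime: float = 30.0               # hypothetical, in seconds
    last_seen: dict = field(default_factory=dict)  # locator -> timestamp

    def on_srs(self, nonce: int, locator: str, now: float) -> bool:
        """Handle a periodic SRS; return True if an SRA should be sent."""
        if nonce != self.nonce:
            return False                   # not sent by the correct Client
        self.last_seen[locator] = now      # create or refresh this tunnel
        return True

    def live_locators(self, now: float) -> list:
        """Locators whose keepalive has not yet expired."""
        return sorted(loc for loc, seen in self.last_seen.items()
                      if now - seen <= self.keepalive_lifetime)


bundle = TunnelBundle(nonce=0x1234)
bundle.on_srs(0x1234, "192.0.2.10", now=0.0)     # SRS via the first ISP
bundle.on_srs(0x1234, "198.51.100.7", now=5.0)   # SRS via the second ISP
bundle.on_srs(0x9999, "203.0.113.9", now=6.0)    # wrong nonce: ignored
```

In this sketch a locator that stops refreshing simply drops out of live_locators(), while the remaining locators of the same bundle stay usable.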
691 If the Client ceases to receive SRA messages from its Server via a 692 specific ISP connection, it marks the Server as unreachable from that 693 address and therefore over that ISP connection. (The Client should 694 also inform its Server of this outage via one of its working ISP 695 connections.) If the Client ceases to receive SRA messages from its 696 Server via multiple ISP connections, it marks the Server as unusable 697 and quickly attempts to establish a bidirectional tunnel with a new 698 Server. The act of establishing the tunnel with a new Server will 699 automatically purge the stale mapping state associated with the old 700 Server, since dynamic routing will propagate the new client/server 701 relationship to the VPC overlay network relay routers. 703 When an end system in an EUN sends a flow of packets to a 704 correspondent, the packets are forwarded through the EUN via normal 705 routing until they reach the Client, which then tunnels the initial 706 packets to its Server as the next hop. In particular, the Client 707 encapsulates each packet in an outer header with its locator as the 708 source address and the locator of its Server as the destination 709 address. Note that after sending the initial packets of a flow, the 710 Client may receive important SCMP messages such as indications of 711 PMTU limitations, redirects that point to a better next hop, etc. 713 The Client uses the mechanisms specified in VET and SEAL to 714 encapsulate each forwarded packet. The Client further uses the SCMP 715 protocol to coordinate with Servers, including accepting redirects 716 and other SCMP messages. When the Client receives an SCMP message, 717 it checks the nonce field of the encapsulated packet-in-error to 718 verify that the message corresponds to the tunnel to its Server and 719 accepts the message if the nonce matches. 
(Note however that the 720 outer source and destination addresses of the packet-in-error may be 721 different than those in the original packet due to possible Server 722 and/or Relay address rewritings.) 724 6.2. IRON Serving Router Operation 726 After the Server is initialized, it responds to SRSs from Clients by 727 sending SRAs as described in Section 6.1. When the Server receives a 728 SEAL-encapsulated packet from one of its Client tunnel endpoints, it 729 examines the inner destination address. If the inner destination 730 address is not an EPA, the Server decapsulates the packet and 731 forwards it unencapsulated into the Internet if it is able to do so 732 without loss due to ingress filtering. Otherwise, the Server re- 733 encapsulates the packet (i.e., it removes the outer header and 734 replaces it with a new outer header of the same address family) and 735 sets the outer destination address to the locator address of a Relay 736 within its VPC overlay network. It then forwards the re-encapsulated 737 packet to the Relay, which will in turn decapsulate it and forward it 738 into the Internet. 740 If the inner destination address is an EPA, however, the Server 741 rewrites the outer source address to one of its own locator addresses 742 and rewrites the outer destination address to the subnet router 743 anycast address taken from the companion prefix associated with the 744 inner destination address (where the companion prefix of the same 745 address family as the outer IP protocol is used). The Server then 746 forwards the revised encapsulated packet into the Internet via a 747 default or more-specific route, where it will be directed to the 748 closest Relay within the destination VPC overlay network. After 749 sending the packet, the Server may then receive an SCMP error or 750 redirect message from a Relay/Server within the destination VPC 751 overlay network.
In that case, the Server verifies that the nonce in 752 the message matches the tunnel corresponding to the Client that sent 753 the original inner packet and discards the message if the nonce does 754 not match. Otherwise, the Server re-encapsulates the SCMP message in 755 a new outer header that uses the source address, destination address 756 and nonce parameters associated with the tunnel to the Client; it 757 then forwards the message to the Client. This arrangement is 758 necessary to allow SCMP messages to flow through any NATs on the 759 path. 761 When a Server ('A') receives a SEAL-encapsulated packet from a Relay 762 or from the Internet, if the inner destination address matches an EP 763 in its FIB, 'A' re-encapsulates the packet in a new outer header that 764 uses the source address, destination address and nonce parameters 765 associated with the tunnel and forwards it to a Client ('B'), which in 766 turn decapsulates the packet and forwards it to the correct end 767 system in the EUN. If 'B' has left notice with 'A' that it has moved 768 to a new Server ('C'), however, 'A' will instead forward the packet 769 to 'C' and also send an SCMP redirect message back to the source of 770 the packet. In this way, 'B' can leave behind forwarding information 771 when changing between Servers 'A' and 'C' (e.g., due to mobility 772 events) without exposing packets to loss. 774 6.3. IRON Relay Router Operation 776 After each Relay has synchronized its VPs (see Section 5.1), it 777 advertises the full set of the VPC's VPs and companion prefixes 778 into the IPv4 and IPv6 Internet BGP routing systems. These prefixes 779 will be represented as ordinary routing information in the BGP, and 780 any packets originating from the IPv4 or IPv6 Internet destined to an 781 address covered by one of the prefixes will be forwarded to one of 782 the VPC overlay network's Relays.
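To make "an address covered by one of the prefixes" concrete, the following minimal sketch checks a destination against a set of advertised prefixes and returns the longest match. The companion prefixes are the example values from Section 5.1; the two VP values and the function name are assumptions chosen from documentation address ranges purely for illustration.

```python
import ipaddress

# Hypothetical prefixes advertised by the Relays of one VPC.
ADVERTISED = [ipaddress.ip_network(p) for p in (
    "198.51.100.0/24",      # an IPv4 VP (assumed value)
    "192.0.2.0/24",         # IPv4 companion prefix example from Section 5.1
    "2001:db8:1000::/36",   # an IPv6 VP (assumed value)
    "2001:db8::/64",        # IPv6 companion prefix example from Section 5.1
)]


def covering_prefix(destination: str):
    """Return the longest advertised prefix covering destination, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in ADVERTISED
               if net.version == addr.version and addr in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)
```

A destination for which covering_prefix() returns None is not attracted to this VPC's Relays and follows ordinary Internet routing instead.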
784 When a Relay receives a packet from the Internet destined to an EPA 785 covered by one of its VPs, it behaves as an ordinary IP router. In 786 particular, the Relay looks in its FIB to discover a locator of the 787 Server that serves the EP that covers the destination address. The 788 Relay then simply encapsulates the packet with its own locator as the 789 outer source address and the locator of the Server as the outer 790 destination address and forwards the packet to the Server. 792 When a Relay receives a packet from the Internet destined to one of 793 its subnet router anycast addresses, it discards the packet if it is 794 not SEAL-encapsulated. If the packet is an SCMP SRS message, the 795 Relay instead sends an SRA message back to the source listing the 796 locator addresses of nearby Servers then discards the message. The 797 Relay otherwise discards all other SCMP messages. 799 If the packet is an ordinary SEAL packet (i.e., one that encapsulates 800 an inner packet) the Relay sends an SCMP redirect message of the same 801 address family back to the source with the locator of the Server that 802 serves the EPA destination in the inner packet as the redirected 803 target. The source and destination addresses of the SCMP redirect 804 message use the outer destination and source addresses of the 805 original packet, respectively. After sending the redirect message, 806 the Relay then rewrites the outer destination address of the SEAL- 807 encapsulated packet to the locator of the Server and forwards the 808 revised packet to the Server. Note that in this arrangement any 809 errors that occur on the path between the Relay and the Server will 810 be delivered to the original source but with a different destination 811 address due to this Relay address rewriting. 813 6.4. 
IRON Reference Operating Scenarios 815 The IRON supports communications when one or both hosts are located 816 within EP-addressed EUNs regardless of whether the EPs are 817 provisioned by the same VPC or by different VPCs. When both hosts 818 are within IRON EUNs, route redirections that eliminate unnecessary 819 Servers and Relays from the path are possible. When only one host is 820 within an IRON EUN, however, route optimization cannot be used. The 821 following sections discuss the two scenarios. 823 6.4.1. Both Hosts Within IRON EUNs 825 When both hosts are within IRON EUNs, it is sufficient to consider 826 the scenario in a unidirectional fashion, i.e., by tracing packet 827 flows only in the forward direction from the source host to 828 destination host. The reverse direction can be considered 829 separately, and incurs the same considerations as for the forward 830 direction. 832 In this scenario, the initial packets of a flow produced by a source 833 host within an EUN connected to the IRON by a Client must flow 834 through both the Server of the source host and a Relay of the 835 destination host, but route optimization can eliminate these elements 836 from the path for subsequent packets in the flow. Figure 6 shows the 837 flow of initial packets from host A to host B within two IRON EUNs 838 (the same scenario applies whether the two EUNs are within the same 839 VPC overlay network or different overlay networks): 841 ________________________________________ 842 .-( .-. )-. 843 .-( ,-( _)-. )-. 844 .-( +========+(_ (_ +=====+ )-. 845 .( || (_|| Internet ||_) || ). 846 .( || ||-(______)-|| vv ). 847 .( +--------++--+ || || +------------+ ). 848 ( +==>| Server(A) | vv || | Server(B) |====+ ) 849 ( // +---------|\-+ +--++----++--+ +------------+ \\ ) 850 ( // .-. | \ | Relay(B) | .-. \\ ) 851 ( //,-( _)-. | \ +-v----------+ ,-( _)-\\ ) 852 ( .||_ (_ )-. | \____| .-(_ (_ ||. ) 853 ( _|| ISP A .) 
| (__ ISP B ||_)) 854 ( ||-(______)-' | (redirect) `-(______)|| ) 855 ( || | | | vv ) 856 ( +-----+-----+ | +-----+-----+ ) 857 | Client(A) | <--+ | Client(B) | 858 +-----+-----+ The IRON +-----+-----+ 859 | ( (Overlaid on the native Internet) ) | 860 .-. .-( .-) .-. 861 ,-( _)-. .-(________________________)-. ,-( _)-. 862 .-(_ (_ )-. .-(_ (_ )-. 863 (_ IRON EUN A ) (_ IRON EUN B ) 864 `-(______)-' `-(______)-' 865 | | 866 +---+----+ +---+----+ 867 | Host A | | Host B | 868 +--------+ +--------+ 870 Figure 6: Initial Packet Flow Before Redirects 872 With reference to Figure 6, host A sends packets destined to host B 873 via its network interface connected to EUN A. Routing within EUN A 874 will direct the packets to Client(A) as a default router for the EUN 875 which then uses VET and SEAL to encapsulate them in outer headers 876 with its locator address as the outer source address and the locator 877 address of Server(A) as the outer destination address. Client(A) 878 then simply forwards the encapsulated packets into its ISP network 879 connection that provided its locator. The ISP will forward the 880 encapsulated packets into the Internet without filtering since the 881 (outer) source address is topologically correct. Once the packets 882 have been forwarded into the Internet, routing will direct them to 883 Server(A). 885 Server(A) receives the encapsulated packets from Client(A) then 886 rewrites the outer source address to one of its own locator 887 addresses, and rewrites the outer destination address to the subnet 888 router anycast address of the appropriate address family associated 889 with the inner destination address. Server(A) then forwards the 890 revised encapsulated packets into the Internet where routing will 891 direct them to Relay(B) which services the VPC overlay network 892 associated with host B. 
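The Server(A) rewriting step above can be sketched as follows, under stated assumptions. The companion table, its single VP entry and the function names are hypothetical; the anycast derivation (host number '1' in the IPv4 companion prefix, interface identifier '0' in the IPv6 companion prefix) follows Section 5.1, and the companion prefix is chosen by the address family of the outer header.

```python
import ipaddress

# Hypothetical VP -> companion prefix table, keyed by outer IP version.
# The companion values are the Section 5.1 examples; the VP is assumed.
COMPANIONS = {
    ipaddress.ip_network("2001:db8:1000::/36"): {
        4: ipaddress.ip_network("192.0.2.0/24"),
        6: ipaddress.ip_network("2001:db8::/64"),
    },
}


def anycast_address(companion):
    """Host number 1 in an IPv4 companion prefix, interface id 0 in IPv6."""
    offset = 1 if companion.version == 4 else 0
    return companion.network_address + offset


def rewrite_outer(outer_version, inner_dst, server_locator):
    """Return the (source, destination) pair for the revised outer header,
    or None if the inner destination is not covered by a known VP."""
    dst = ipaddress.ip_address(inner_dst)
    for vp, companions in COMPANIONS.items():
        if vp.version == dst.version and dst in vp:
            return (ipaddress.ip_address(server_locator),
                    anycast_address(companions[outer_version]))
    return None
```

For example, rewrite_outer(4, "2001:db8:1abc::1", "203.0.113.1") models an IPv6 inner packet carried in an IPv4 outer header, so the IPv4 companion prefix supplies the anycast destination.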
894 Relay(B) will intercept the encapsulated packets from Server(A) then 895 check its FIB to discover an entry that covers inner destination 896 address B with Server(B) as the next hop. Relay(B) then returns SCMP 897 redirect messages to Server(A) (*), rewrites the outer destination 898 address of the encapsulated packets to the locator address of 899 Server(B), and forwards these revised packets to Server(B). 901 Server(B) will receive the encapsulated packets from Relay(B) then 902 check its FIB to discover an entry that covers destination address B 903 with Client(B) as the next hop. Server(B) then re-encapsulates the 904 packets in a new outer header that uses the source address, 905 destination address and nonce parameters associated with the tunnel 906 to Client(B). Server(B) then forwards these re-encapsulated packets 907 into the Internet, where routing will direct them to Client(B). 908 Client(B) will in turn decapsulate the packets and forward the inner 909 packets to host B via EUN B. 911 (*) Note that after the initial flow of packets, Server(A) will have 912 received one or more SCMP redirect messages from Relay(B) listing 913 Server(B) as a better next hop. Server(A) will in turn forward the 914 redirects to Client(A), which will thereafter forward its 915 encapsulated packets directly to the locator address of Server(B) 916 without involving either Server(A) or Relay(B) as shown in Figure 7: 918 ________________________________________ 919 .-( .-. )-. 920 .-( ,-( _)-. )-. 921 .-( +=============> .-(_ (_ )-.======+ )-. 922 .( // (__ Internet _) || ). 923 .( // `-(______)-' vv ). 924 .( // +------------+ ). 925 ( // | Server(B) |====+ ) 926 ( // +------------+ \\ ) 927 ( // .-. .-. \\ ) 928 ( //,-( _)-. ,-( _)-\\ ) 929 ( .||_ (_ )-. .-(_ (_ ||. ) 930 ( _|| ISP A .) 
(__ ISP B ||_)) 931 ( ||-(______)-' `-(______)|| ) 932 ( || | | vv ) 933 ( +-----+-----+ The IRON +-----+-----+ ) 934 | Client(A) | (Overlaid on the native Internet) | Client(B) | 935 +-----+-----+ +-----+-----+ 936 | ( ) | 937 .-. .-( .-) .-. 938 ,-( _)-. .-(________________________)-. ,-( _)-. 939 .-(_ (_ )-. .-(_ (_ )-. 940 (_ IRON EUN A ) (_ IRON EUN B ) 941 `-(______)-' `-(______)-' 942 | | 943 +---+----+ +---+----+ 944 | Host A | | Host B | 945 +--------+ +--------+ 947 Figure 7: Sustained Packet Flow After Redirects 949 6.4.2. Mixed IRON and Non-IRON Hosts 951 When one host is within an IRON EUN and the other is in a non-IRON 952 EUN (i.e., one that connects to the native Internet instead of the 953 IRON), the IR elements involved depend on the packet flow directions. 954 The cases are described in the following sections. 956 6.4.2.1. From IRON Host A to Non-IRON Host B 958 Figure 8 depicts the IRON reference operating scenario for packets 959 flowing from Host A in an IRON EUN to Host B in a non-IRON EUN: 961 _________________________________________ 962 .-( )-. )-. 963 .-( +-------)----+ )-. 964 .-( | Relay(A) |--------------+ )-. 965 .( +------------+ \ ). 966 .( +=======>| Server(A) | \ ). 967 .( // +--------)---+ \ ). 968 ( // ) \ ) 969 ( // The IRON ) \ ) 970 ( // .-. ) \ .-. ) 971 ( //,-( _)-. ) \ ,-( _)-. ) 972 ( .||_ (_ )-. ) The Native Internet .-|_ (_ )-. ) 973 ( _|| ISP A ) ) (_ | ISP B )) 974 ( ||-(______)-' ) |-(______)-' ) 975 ( || | )-. v | ) 976 ( +-----+ ----+ )-. +-----+-----+ ) 977 | Client(A) |)-. | Router B | 978 +-----+-----+ +-----+-----+ 979 | ( ) | 980 .-. .-(____________________________________)-. .-. 981 ,-( _)-. ,-( _)-. 982 .-(_ (_ )-. .-(_ (_ )-. 
983 (_ IRON EUN A ) (_non-IRON EUN B) 984 `-(______)-' `-(______)-' 985 | | 986 +---+----+ +---+----+ 987 | Host A | | Host B | 988 +--------+ +--------+ 990 Figure 8: From IRON Host A to Non-IRON Host B 992 In this scenario, host A sends packets destined to host B via its 993 network interface connected to IRON EUN A. Routing within EUN A will 994 direct the packets to Client(A) as a default router for the EUN which 995 then uses VET and SEAL to encapsulate them in outer headers with its 996 locator address as the outer source address and the locator address 997 of Server(A) as the outer destination address. The ISP will pass the 998 packets without filtering since the (outer) source address is 999 topologically correct. Once the packets have been released into the 1000 native Internet, routing will direct them to Server(A). 1002 Server(A) receives the encapsulated packets from Client(A) then re- 1003 encapsulates and forwards them to Relay(A), which simply decapsulates 1004 them and forwards the unencapsulated packets into the Internet. Once 1005 the packets are released into the Internet, routing will direct them 1006 to the final destination B. (Note that Server(A) and Relay(A) are 1007 depicted in Figure 8 as two halves of a unified gateway. In that 1008 case, the "forwarding" between Server(A) and Relay(A) is a zero- 1009 instruction imaginary operation within the gateway.) 1011 This scenario always involves a Server and Relay owned by the VPC 1012 that provides service to IRON EUN A. It therefore imparts a cost that 1013 would need to be borne by either the VPC or its customers. 1015 6.4.2.2. From Non-IRON Host B to IRON Host A 1017 Figure 9 depicts the IRON reference operating scenario for packets 1018 flowing from Host B in a non-IRON EUN to Host A in an IRON EUN: 1020 _______________________________________ 1021 .-( )-. )-. 1022 .-( +-------)----+ )-. 1023 .-( | Relay(A) |<-------------+ )-. 1024 .( +------------+ \ ). 1025 .( +========| Server(A) | \ ).
1026 .( // +--------)---+ \ ). 1027 ( // ) \ ) 1028 ( // The IRON ) \ ) 1029 ( // .-. ) \ .-. ) 1030 ( //,-( _)-. ) \ ,-( _)-. ) 1031 ( .||_ (_ )-. ) The Native Internet .-|_ (_ )-. ) 1032 ( _|| ISP A ) ) (_ | ISP B )) 1033 ( ||-(______)-' ) |-(______)-' ) 1034 ( vv | )-. | | ) 1035 ( +-----+ ----+ )-. +-----+-----+ ) 1036 | Client(A) |)-. | Router B | 1037 +-----+-----+ +-----+-----+ 1038 | ( ) | 1039 .-. .-(____________________________________)-. .-. 1040 ,-( _)-. ,-( _)-. 1041 .-(_ (_ )-. .-(_ (_ )-. 1042 (_ IRON EUN A ) (_non-IRON EUN B) 1043 `-(______)-' `-(_______)-' 1044 | | 1045 +---+----+ +---+----+ 1046 | Host A | | Host B | 1047 +--------+ +--------+ 1049 Figure 9: From Non-IRON Host B to IRON Host A 1051 In this scenario, host B sends packets destined to host A via its 1052 network interface connected to non-IRON EUN B. Routing will direct 1053 the packets to Relay(A) which then forwards them to Server(A) using 1054 encapsulation if necessary. 1056 Server(A) will then check its FIB to discover an entry that covers 1057 destination address A with Client(A) as the next hop. Server(A) then 1058 (re-)encapsulates the packets in an outer header that uses the source 1059 address, destination address and nonce parameters associated with the 1060 tunnel to Client(A). Server(A) next forwards these (re-)encapsulated 1061 packets into the Internet, where routing will direct them to 1062 Client(A). Client(A) will in turn decapsulate the packets and 1063 forward the inner packets to host A via its network interface 1064 connected to IRON EUN A. 1066 This scenario always involves a Server and Relay owned by the VPC 1067 that provides service to IRON EUN A. It therefore imparts a cost that 1068 would need to be borne by either the VPC or its customers. 1070 6.5. 
Mobility, Multihoming and Traffic Engineering Considerations 1072 While IRON Servers and Relays can be considered as fixed 1073 infrastructure, Clients may need to move between different network 1074 points of attachment, connect to multiple ISPs, or explicitly manage 1075 their traffic flows. The following sections discuss mobility, multi- 1076 homing and traffic engineering considerations for IRON client 1077 routers. 1079 6.5.1. Mobility Management 1081 When a Client changes its network point of attachment (e.g., due to a 1082 mobility event), it configures one or more new locators. If the 1083 Client has not moved far away from its previous network point of 1084 attachment, it simply informs its Server of any locator additions or 1085 deletions. This operation is performance-sensitive, and should be 1086 conducted immediately to avoid packet loss. 1088 If the Client has moved far away from its previous network point of 1089 attachment, however, it re-issues the anycast discovery procedure 1090 described in Section 5.3 to discover whether its candidate set of 1091 Servers has changed. If the Client's current Server is also included 1092 in the new list received from the VPC, this indicates that 1093 the Client has not moved far enough to warrant changing to a new 1094 Server. Otherwise, the Client may wish to move to a new Server in 1095 order to maintain optimal routing. This operation is not 1096 performance-critical, and therefore can be conducted over a matter of 1097 seconds/minutes instead of milliseconds/microseconds. 1099 To move to a new Server, the Client first engages in the EP 1100 registration process with the new Server and maintains the 1101 registrations through periodic SRS/SRA exchanges the same as 1102 described in Section 6.1. The Client then informs its former Server 1103 that it has moved by providing it with the locator address of the new 1104 Server.
The Client then discontinues the SRS/SRA keepalive process 1105 with the former Server, which will garbage-collect the stale FIB 1106 entries when their lifetime expires. This will allow the former 1107 Server to redirect existing correspondents to the new Server so that 1108 no packets are lost. 1110 6.5.2. Multihoming 1112 A Client may register multiple locators with its Server. It can 1113 assign metrics with its registrations to inform the Server of 1114 preferred locators, and can select outgoing locators according to its 1115 local preferences. Multihoming is therefore naturally supported. 1117 6.5.3. Inbound Traffic Engineering 1119 A Client can dynamically adjust the priorities of its prefix 1120 registrations with its Server in order to influence inbound traffic 1121 flows. It can also change between Servers when multiple Servers are 1122 available, but should strive for stability in its Server selection in 1123 order to limit VPC network routing churn. 1125 6.5.4. Outbound Traffic Engineering 1127 A Client can select outgoing locators, e.g., based on current QoS 1128 considerations such as minimizing one-way delay or one-way delay 1129 variance. 1131 6.6. Renumbering Considerations 1133 As new link layer technologies and/or service models emerge, 1134 customers will be motivated to select their service providers through 1135 healthy competition between ISPs. If a customer's EUN addresses are 1136 tied to a specific ISP, however, the customer may be forced to 1137 undergo a painstaking EUN renumbering process if it wishes to change 1138 to a different ISP [RFC4192][RFC5887]. 1140 When a customer obtains EP prefixes from a VPC, it can change between 1141 ISPs seamlessly and without need to renumber. If the VPC itself 1142 applies unreasonable costing structures for use of the EPs, however, 1143 the customer may be compelled to seek a different VPC and would again 1144 be required to confront a renumbering scenario. 
The IRON approach to 1145 renumbering avoidance therefore depends on VPCs conducting ethical 1146 business practices and offering reasonable rates. 1148 6.7. NAT Traversal Considerations 1150 The Internet today consists of a global public IPv4 routing and 1151 addressing system with non-IRON EUNs that use either public or 1152 private IPv4 addressing. The latter class of EUNs connect to the 1153 public Internet via Network Address Translators (NATs). When a 1154 Client is located behind a NAT, it selects Servers using the same 1155 procedures as for Clients with public addresses, i.e., it will send 1156 SRS messages to Servers in order to get SRA messages in return. The 1157 only requirement is that the Client must configure its SEAL 1158 encapsulation to use a transport protocol that supports NAT 1159 traversal, namely UDP. 1161 Since the Server maintains state about its Client customers, it can 1162 discover locator information for each Client by examining the UDP 1163 port number and IP address in the outer headers of SRS messages. 1164 When there is a NAT in the path, the UDP port number and IP address 1165 in the SRS message will correspond to state in the NAT box and might 1166 not correspond to the actual values assigned to the Client. The 1167 Server can then encapsulate packets destined to hosts in the Client's 1168 EUN within outer headers that use this IP address and UDP port 1169 number. The NAT box will receive the packets, translate the values 1170 in the outer headers, then forward the packets to the Client. In 1171 this sense, the Server's "locator" for the Client consists of the 1172 concatenation of the IP address and UDP port number. 1174 IRON does not add to the complications already raised by 1175 NAT traversal or by applications that embed address referrals in 1176 their payload. 1178 6.8.
Nested EUN Considerations 1180 Each Client configures a locator that may be taken from an ordinary 1181 non-EPA address assigned by an ISP or from an EPA address taken from 1182 an EP assigned to another Client. In that case, the Client is said 1183 to be "nested" within the EUN of another Client, and recursive 1184 nestings of multiple layers of encapsulations may be necessary. 1186 For example, in the network scenario depicted in Figure 10 Client(A) 1187 configures a locator EPA(B) taken from the EP assigned to EUN(B). 1188 Client(B) in turn configures a locator EPA(C) taken from the EP 1189 assigned to EUN(C). Finally, Client(C) configures a locator ISP(D) 1190 taken from a non-EPA address delegated by an ordinary ISP(D). Using 1191 this example, the "nested-IRON" case must be examined in which a host 1192 A which configures the address EPA(A) within EUN(A) exchanges packets 1193 with host Z located elsewhere in the Internet. 1195 .-. 1196 ISP(D) ,-( _)-. 1197 +-----------+ .-(_ (_ )-. 1198 | Client(C) |--(_ ISP(D) ) 1199 +-----+-----+ `-(______)-' 1200 | <= T \ .-. 1201 .-. u \ ,-( _)-. 1202 ,-( _)-. n .-(_ (- )-. 1203 .-(_ (_ )-. n (_ Internet ) 1204 (_ EUN(C) ) e `-(______)-' 1205 `-(______)-' l ___ 1206 | EPA(C) s => (:::)-. 1207 +-----+-----+ .-(::::::::) 1208 | Client(B) | .-(::::::::::::)-. +-----------+ 1209 +-----+-----+ (:::: The IRON ::::) | Relay(Z) | 1210 | `-(::::::::::::)-' +-----------+ 1211 .-. `-(::::::)-' +-----------+ 1212 ,-( _)-. | Server(Z) | 1213 .-(_ (_ )-. +-----------+ +-----------+ 1214 (_ EUN(B) ) | Server(C) | +-----------+ 1215 `-(______)-' +-----------+ | Client(Z) | 1216 | EPA(B) +-----------+ +-----------+ 1217 +-----+-----+ | Server(B) | +--------+ 1218 | Client(A) | +-----------+ | Host Z | 1219 +-----------+ +-----------+ +--------+ 1220 | | Server(A) | 1221 .-. +-----------+ 1222 ,-( _)-. EPA(A) 1223 .-(_ (_ )-. 
+--------+ 1224 (_ EUN(A) )---| Host A | 1225 `-(______)-' +--------+ 1227 Figure 10: Nested EUN Example 1229 The two cases of host A sending packets to host Z, and host Z sending 1230 packets to host A, must be considered separately as described below. 1232 6.8.1. Host A Sends Packets to Host Z 1234 Host A first forwards a packet with source address EPA(A) and 1235 destination address Z into EUN(A). Routing within EUN(A) will direct 1236 the packet to Client(A), which encapsulates it in an outer header 1237 with EPA(B) as the outer source address and Server(A) as the outer 1238 destination address then forwards the once-encapsulated packet into 1239 EUN(B). Routing within EUN(B) will direct the packet to Client(B), 1240 which encapsulates it in an outer header with EPA(C) as the outer 1241 source address and Server(B) as the outer destination address then 1242 forwards the twice-encapsulated packet into EUN(C). Routing within 1243 EUN(C) will direct the packet to Client(C), which encapsulates it in 1244 an outer header with ISP(D) as the outer source address and Server(C) 1245 as the outer destination address. Client(C) then sends this triple- 1246 encapsulated packet into the ISP(D) network, where it will be routed 1247 into the Internet to Server(C). 1249 When Server(C) receives the triple-encapsulated packet, it removes 1250 the outer layer of encapsulation and forwards the resulting twice- 1251 encapsulated packet into the Internet to Server(B). Next, Server(B) 1252 removes the outer layer of encapsulation and forwards the resulting 1253 once-encapsulated packet into the Internet to Server(A). Next, 1254 Server(A) checks the address type of the inner address 'Z'. If Z is 1255 a non-EPA address, Server(A) simply decapsulates the packet and 1256 forwards it into the Internet. Otherwise, Server(A) rewrites the 1257 outer source and destination addresses of the once-encapsulated 1258 packet and forwards it to Relay(Z).
Relay(Z) in turn rewrites the 1259 outer destination address of the packet to the locator for Server(Z), 1260 then forwards the packet and sends a redirect to Server(A) (which 1261 forwards the redirect to Client(A)). Server(Z) then re-encapsulates 1262 the packet and forwards it to Client(Z), which decapsulates it and 1263 forwards the inner packet to host Z. Subsequent packets from 1264 Client(A) will then use Server(Z) as the next hop toward host Z, 1265 which eliminates Server(A) and Relay(Z) from the path. 1267 6.8.2. Host Z Sends Packets to Host A 1269 Whether or not host Z configures an EPA address, its packets destined 1270 to Host A will eventually reach Server(A). Server(A) will have a 1271 mapping that lists Client(A) as the next hop toward EPA(A). 1272 Server(A) will then encapsulate the packet with EPA(B) as the outer 1273 destination address and forward the packet into the Internet. 1274 Internet routing will convey this once-encapsulated packet to 1275 Server(B) which will have a mapping that lists Client(B) as the next 1276 hop toward EPA(B). Server(B) will then encapsulate the packet with 1277 EPA(C) as the outer destination address and forward the packet into 1278 the Internet. Internet routing will then convey this twice- 1279 encapsulated packet to Server(C) which will have a mapping that lists 1280 Client(C) as the next hop toward EPA(C). Server(C) will then 1281 encapsulate the packet with ISP(D) as the outer destination address 1282 and forward the packet into the Internet. Internet routing will then 1283 convey this triple-encapsulated packet to Client(C). 1285 When the triple-encapsulated packet arrives at Client(C), it strips 1286 the outer layer of encapsulation and forwards the twice-encapsulated 1287 packet to EPA(C) which is the locator address of Client(B). 
When 1288 Client(B) receives the twice-encapsulated packet, it strips the outer 1289 layer of encapsulation and forwards the once-encapsulated packet to 1290 EPA(B) which is the locator address of Client(A). When Client(A) 1291 receives the once-encapsulated packet, it strips the outer layer of 1292 encapsulation and forwards the unencapsulated packet to EPA(A) which 1293 is the host address of host A. 1295 7. Implications for the Internet 1297 The IRON architecture envisions a hybrid routing/mapping system that 1298 benefits from both the shortest-path routing afforded by pure dynamic 1299 routing systems and the routing scaling suppression afforded by pure 1300 mapping systems. IRON therefore targets the elusive "sweet spot" 1301 that pure routing and pure mapping systems alone cannot satisfy. 1303 The IRON system requires a deployment of new routers/servers 1304 throughout the Internet and/or provider networks to maintain well- 1305 balanced virtual overlay networks. These routers/servers can be 1306 deployed incrementally without disruption to existing Internet 1307 infrastructure and appropriately managed to provide acceptable 1308 service levels to customers. 1310 End-to-end traffic that traverses an IRON virtual overlay network may 1311 experience delay variance between the initial packets and subsequent 1312 packets of a flow. This is due to the IRON system allowing longer 1313 path stretch for initial packets followed by timely route 1314 optimizations to utilize better next hop routers/servers for 1315 subsequent packets. 1317 IRON virtual overlay networks also work seamlessly with existing and 1318 emerging services within the native Internet. In particular, 1319 customers serviced by IRON virtual overlay networks will receive the 1320 same service enjoyed by customers serviced by non-IRON service 1321 providers. 
Internet services already deployed within the native 1322 Internet also need not make any changes to accommodate IRON virtual 1323 overlay network customers. 1325 The IRON system operates between routers within provider networks and 1326 end user networks. Within these networks, the underlying paths 1327 traversed by the virtual overlay networks may comprise links that 1328 accommodate varying MTUs. While the IRON system imposes an 1329 additional per-packet overhead that may cause the size of packets to 1330 become slightly larger than the underlying path can accommodate, IRON 1331 routers have a method for naturally detecting and tuning out all 1332 instances of path MTU underruns. In some cases, these MTU underruns 1333 may need to be reported back to the original hosts; however, the 1334 system will also allow for MTUs much larger than those typically 1335 available in current Internet paths to be discovered and utilized as 1336 more links with larger MTUs are deployed. 1338 Finally, and perhaps most importantly, the IRON system provides an 1339 in-built mobility management and multihoming capability that allows 1340 end user devices and networks to move about freely while both 1341 imparting minimal oscillations in the routing system and maintaining 1342 generally shortest-path routes. This mobility management is afforded 1343 through the very nature of the IRON customer/provider relationship, 1344 and therefore requires no adjunct mechanisms. The mobility 1345 management and multihoming capabilities are further supported by 1346 forward-path reachability detection that provides "hints of forward 1347 progress" in the same spirit as for IPv6 ND. 1349 8. Additional Considerations 1351 Considerations for the scalability of Internet Routing due to 1352 multihoming, traffic engineering and provider-independent addressing 1353 are discussed in [I-D.narten-radir-problem-statement]. Other scaling 1354 considerations specific to IRON are discussed in Appendix B. 
1356 Route optimization considerations for mobile networks are found in 1357 [RFC5522]. 1359 9. Related Initiatives 1361 IRON builds upon the concepts of the RANGER architecture [RFC5720], and 1362 therefore inherits the same set of related initiatives. The Internet 1363 Research Task Force (IRTF) Routing Research Group (RRG) mentions IRON 1364 in its recommendation for a routing architecture 1365 [I-D.irtf-rrg-recommendation]. 1367 Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in 1368 Increasing Scopes (AIS) [I-D.zhang-evolution] provide the basis for 1369 the Virtual Prefix concepts. 1371 Internet Vastly Improved Plumbing (Ivip) [I-D.whittle-ivip-arch] has 1372 contributed valuable insights, including the use of real-time 1373 mapping. The use of Servers as mobility anchor points is directly 1374 influenced by Ivip's associated TTR mobility extensions [TTRMOB]. 1376 [I-D.bernardos-mext-nemo-ro-cr] discussed a route optimization 1377 approach using a Correspondent Router (CR) model. The IRON Server 1378 construct is similar to the CR concept described in that work; 1379 however, the manner in which customer EUNs coordinate with Servers is 1380 different and is based on the redirection model associated with NBMA 1381 links. 1383 Numerous publications have proposed NAT traversal techniques. The 1384 NAT traversal techniques adapted for IRON were inspired by the Simple 1385 Address Mapping for Premises Legacy Equipment (SAMPLE) proposal 1386 [I-D.carpenter-softwire-sample]. 1388 10. IANA Considerations 1390 There are no IANA considerations for this document. 1392 11. Security Considerations 1394 Security considerations that apply to tunneling in general are 1395 discussed in [I-D.ietf-v6ops-tunnel-security-concerns]. Additional 1396 considerations that apply also to IRON are discussed in RANGER 1397 [RFC5720], VET [I-D.templin-intarea-vet] and SEAL 1398 [I-D.templin-intarea-seal]. 
1400 The IRON system further depends on mutual authentication of IRON 1401 Clients to Servers and Servers to Relays. This is accomplished 1402 through initial authentication exchanges followed by per-packet 1403 nonces that can be used to detect off-path attacks. As for all 1404 Internet communications, the IRON system also depends on Relays 1405 acting with integrity and not injecting false advertisements into the 1406 BGP (e.g., to mount traffic siphoning attacks). 1408 Each VPC overlay network requires a means for assuring the integrity 1409 of the interior routing system so that all Relays and Servers in the 1410 overlay have a consistent view of Client<->Server bindings. Finally, 1411 DoS attacks on IRON Relays and Servers can occur when packets with 1412 spoofed source addresses arrive at high data rates. This issue is no 1413 different than for any border router in the public Internet today, 1414 however. 1416 12. Acknowledgements 1418 The ideas behind this work have benefited greatly from discussions 1419 with colleagues, some of which appear on the RRG and other IRTF/IETF 1420 mailing lists. Robin Whittle and Steve Russert co-authored the TTR 1421 mobility architecture, which strongly influenced IRON. Eric 1422 Fleischman pointed out the opportunity to leverage anycast for 1423 discovering topologically-close Servers. Thomas Henderson 1424 recommended a quantitative analysis of scaling properties. 1426 The following individuals provided essential review input: Jari 1427 Arkko, Mohamed Boucadair, Stewart Bryant, John Buford, Ralph Droms, 1428 Wesley Eddy, Adrian Farrel, Dae Young Kim and Robin Whittle. 1430 13. References 1432 13.1. Normative References 1434 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1435 September 1981. 1437 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1438 (IPv6) Specification", RFC 2460, December 1998. 1440 13.2. 
Informative References 1442 [BGPMON] net, B., "BGPmon.net - Monitoring Your Prefixes, 1443 http://bgpmon.net/stat.php", June 2010. 1445 [I-D.bernardos-mext-nemo-ro-cr] 1446 Bernardos, C., Calderon, M., and I. Soto, "Correspondent 1447 Router based Route Optimisation for NEMO (CRON)", 1448 draft-bernardos-mext-nemo-ro-cr-00 (work in progress), 1449 July 2008. 1451 [I-D.carpenter-softwire-sample] 1452 Carpenter, B. and S. Jiang, "Legacy NAT Traversal for 1453 IPv6: Simple Address Mapping for Premises Legacy Equipment 1454 (SAMPLE)", draft-carpenter-softwire-sample-00 (work in 1455 progress), June 2010. 1457 [I-D.ietf-grow-va] 1458 Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and 1459 L. Zhang, "FIB Suppression with Virtual Aggregation", 1460 draft-ietf-grow-va-03 (work in progress), August 2010. 1462 [I-D.ietf-v6ops-tunnel-security-concerns] 1463 Krishnan, S., Thaler, D., and J. Hoagland, "Security 1464 Concerns With IP Tunneling", 1465 draft-ietf-v6ops-tunnel-security-concerns-04 (work in 1466 progress), October 2010. 1468 [I-D.irtf-rrg-recommendation] 1469 Li, T., "Recommendation for a Routing Architecture", 1470 draft-irtf-rrg-recommendation-16 (work in progress), 1471 November 2010. 1473 [I-D.narten-radir-problem-statement] 1474 Narten, T., "On the Scalability of Internet Routing", 1475 draft-narten-radir-problem-statement-05 (work in 1476 progress), February 2010. 1478 [I-D.russert-rangers] 1479 Russert, S., Fleischman, E., and F. Templin, "RANGER 1480 Scenarios", draft-russert-rangers-05 (work in progress), 1481 July 2010. 1483 [I-D.templin-intarea-seal] 1484 Templin, F., "The Subnetwork Encapsulation and Adaptation 1485 Layer (SEAL)", draft-templin-intarea-seal-25 (work in 1486 progress), December 2010. 1488 [I-D.templin-intarea-vet] 1489 Templin, F., "Virtual Enterprise Traversal (VET)", 1490 draft-templin-intarea-vet-19 (work in progress), 1491 December 2010. 
1493 [I-D.whittle-ivip-arch] 1494 Whittle, R., "Ivip (Internet Vastly Improved Plumbing) 1495 Architecture", draft-whittle-ivip-arch-04 (work in 1496 progress), March 2010. 1498 [I-D.zhang-evolution] 1499 Zhang, B. and L. Zhang, "Evolution Towards Global Routing 1500 Scalability", draft-zhang-evolution-02 (work in progress), 1501 October 2009. 1503 [RFC1070] Hagens, R., Hall, N., and M. Rose, "Use of the Internet as 1504 a subnetwork for experimentation with the OSI network 1505 layer", RFC 1070, February 1989. 1507 [RFC2526] Johnson, D. and S. Deering, "Reserved IPv6 Subnet Anycast 1508 Addresses", RFC 2526, March 1999. 1510 [RFC3068] Huitema, C., "An Anycast Prefix for 6to4 Relay Routers", 1511 RFC 3068, June 2001. 1513 [RFC3849] Huston, G., Lord, A., and P. Smith, "IPv6 Address Prefix 1514 Reserved for Documentation", RFC 3849, July 2004. 1516 [RFC4192] Baker, F., Lear, E., and R. Droms, "Procedures for 1517 Renumbering an IPv6 Network without a Flag Day", RFC 4192, 1518 September 2005. 1520 [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway 1521 Protocol 4 (BGP-4)", RFC 4271, January 2006. 1523 [RFC4548] Gray, E., Rutemiller, J., and G. Swallow, "Internet Code 1524 Point (ICP) Assignments for NSAP Addresses", RFC 4548, 1525 May 2006. 1527 [RFC5214] Templin, F., Gleeson, T., and D. Thaler, "Intra-Site 1528 Automatic Tunnel Addressing Protocol (ISATAP)", RFC 5214, 1529 March 2008. 1531 [RFC5522] Eddy, W., Ivancic, W., and T. Davis, "Network Mobility 1532 Route Optimization Requirements for Operational Use in 1533 Aeronautics and Space Exploration Mobile Networks", 1534 RFC 5522, October 2009. 1536 [RFC5720] Templin, F., "Routing and Addressing in Networks with 1537 Global Enterprise Recursion (RANGER)", RFC 5720, 1538 February 2010. 1540 [RFC5737] Arkko, J., Cotton, M., and L. Vegoda, "IPv4 Address Blocks 1541 Reserved for Documentation", RFC 5737, January 2010. 
1543 [RFC5743] Falk, A., "Definition of an Internet Research Task Force 1544 (IRTF) Document Stream", RFC 5743, December 2009. 1546 [RFC5887] Carpenter, B., Atkinson, R., and H. Flinck, "Renumbering 1547 Still Needs Work", RFC 5887, May 2010. 1549 [TTRMOB] Whittle, R. and S. Russert, "TTR Mobility Extensions for 1550 Core-Edge Separation Solutions to the Internet's Routing 1551 Scaling Problem, 1552 http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf", 1553 August 2008. 1555 Appendix A. IRON VPs Over Internetworks with Different Address Families 1557 The IRON architecture leverages the routing system by providing 1558 generally shortest-path routing for packets with EPA addresses from 1559 VPs that match the address family of the underlying Internetwork. 1560 When the VPs are of an address family that is not routable within the 1561 underlying Internetwork, however, (e.g., when OSI/NSAP [RFC4548] VPs 1562 are used within an IPv4 Internetwork) a global mapping database is 1563 required to allow Servers to map VPs to companion prefixes taken from 1564 address families that are routable within the Internetwork. For 1565 example, an IPv6 VP (e.g., 2001:DB8::/32) could be paired with a 1566 companion IPv4 prefix (e.g., 192.0.2.0/24) so that encapsulated IPv6 1567 packets can be forwarded over IPv4-only Internetworks. 1569 Every VP in the IRON must therefore be represented in a globally 1570 distributed Master VP database (MVPd) that maintains VP-to-companion 1571 prefix mappings for all VPs in the IRON. The MVPd is maintained by a 1572 globally-managed assigned numbers authority in the same manner as the 1573 Internet Assigned Numbers Authority (IANA) currently maintains the 1574 master list of all top-level IPv4 and IPv6 delegations. The database 1575 can be replicated across multiple servers for load balancing much in 1576 the same way that FTP mirror sites are used to manage software 1577 distributions. 
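As an illustration of the VP-to-companion prefix mappings described above, the following minimal Python sketch models a local copy of the MVPd as a dictionary and finds the companion prefix for a given EPA. The names MVPD and companion_prefix_for, the in-memory dictionary representation, and the longest-match rule used when several VPs cover an address are assumptions made for this example only; this document does not specify a database format or a lookup procedure. The single entry reuses the example prefixes given above.

```python
import ipaddress

# Hypothetical in-memory copy of the Master VP database (MVPd). Each
# Virtual Prefix (VP) maps to a companion prefix that is routable in
# the underlying Internetwork. The entry mirrors the example prefixes
# used in the text; a real MVPd format is not defined by this sketch.
MVPD = {
    ipaddress.ip_network("2001:db8::/32"): ipaddress.ip_network("192.0.2.0/24"),
}


def companion_prefix_for(epa, mvpd=MVPD):
    """Return (vp, companion_prefix) for the longest VP covering epa.

    Returns None when no VP in the database covers the address.
    """
    address = ipaddress.ip_address(epa)
    matches = [
        (vp, companion)
        for vp, companion in mvpd.items()
        # Only compare prefixes of the same address family as the EPA.
        if vp.version == address.version and address in vp
    ]
    if not matches:
        return None
    # Assumption: prefer the most specific (longest) covering VP.
    return max(matches, key=lambda pair: pair[0].prefixlen)
```

With this table, companion_prefix_for("2001:db8:1::1") yields the 2001:db8::/32 VP paired with 192.0.2.0/24, while an address that no VP covers yields None.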
1579 Upon startup, each Server discovers the full set of VPs for the IRON 1580 by reading the MVPd. The Server reads the MVPd from a nearby server 1581 and periodically checks the server for deltas since the database was 1582 last read. After reading the MVPd, the Server has a full list of VP 1583 to companion prefix mappings. 1585 The Server can then forward packets toward EPAs covered by a VP by 1586 encapsulating them in an outer header of the VP's companion prefix 1587 address family and using any address taken from the companion prefix 1588 as the outer destination address. The companion prefix therefore 1589 serves as an anycast prefix. 1591 Possible encapsulations in this model include IPv6-in-IPv4, IPv4-in- 1592 IPv6, OSI/CLNP-in-IPv6, OSI/CLNP-in-IPv4, etc. 1594 Appendix B. Scaling Considerations 1596 Scaling aspects of the IRON architecture have strong implications for 1597 its applicability in practical deployments. Scaling must be 1598 considered along multiple vectors including Interdomain core routing 1599 scaling, scaling to accommodate large numbers of customer EUNs, 1600 traffic scaling, state requirements, etc. 1602 In terms of routing scaling, each VPC will advertise one or more VPs 1603 into the global Internet routing system from which EPs are delegated 1604 to customer EUNs. Routing scaling will therefore be minimized when 1605 each VP covers many EPs. For example, the IPv6 prefix 2001:DB8::/32 1606 contains 2^24 ::/56 EP prefixes for assignment to EUNs. The IRON 1607 could therefore accommodate 2^32 ::/56 EPs with only 2^8 ::/32 VPs 1608 advertised in the interdomain routing core. (When even longer EP 1609 prefixes are used, e.g., /64s assigned to individual handsets in a 1610 cellular provider network, considerable numbers of EUNs can be 1611 represented within only a single VP.) Each VP also has an associated 1612 anycast companion prefix; hence, there will be one anycast prefix 1613 advertised into the global routing system for each VP. 
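The prefix arithmetic in the routing scaling paragraph above can be checked with a short Python sketch. The function names eps_per_vp and vps_required are hypothetical helpers introduced for illustration; they only count how many EPs of a given length fit inside a VP of a given length, assuming all VPs are the same size.

```python
import ipaddress


def eps_per_vp(vp, ep_length):
    """Count the EP prefixes of length ep_length that fit inside one VP."""
    network = ipaddress.ip_network(vp)
    if not network.prefixlen <= ep_length <= network.max_prefixlen:
        raise ValueError("EP length must lie between the VP length and the address size")
    # Each additional prefix bit doubles the number of sub-prefixes.
    return 2 ** (ep_length - network.prefixlen)


def vps_required(total_eps, vp, ep_length):
    """Return how many equal-sized VPs are needed to cover total_eps EPs."""
    per_vp = eps_per_vp(vp, ep_length)
    return -(-total_eps // per_vp)  # ceiling division
```

Under these assumptions, eps_per_vp("2001:db8::/32", 56) evaluates to 2^24 and vps_required(2**32, "2001:db8::/32", 56) evaluates to 2^8, which agrees with the figures quoted above.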
1615 In terms of traffic scaling for Relays, each Relay represents an ASBR 1616 of a "shell" enterprise network that simply directs arriving traffic 1617 packets with EPA destination addresses towards Servers that service 1618 customer EUNs. Moreover, the Relay sheds traffic destined to EPAs 1619 through redirection which removes it from the path for the vast 1620 majority of traffic packets. On the other hand, each Relay must 1621 handle all traffic packets forwarded between its customer EUNs and 1622 the non-IRON Internet. The scaling concerns for this latter class of 1623 traffic are no different than for ASBR routers that connect large 1624 enterprise networks to the Internet. In terms of traffic scaling for 1625 Servers, each Server services a set of the VPC overlay network's 1626 customer EUNs. The Server services all traffic packets destined to 1627 its EUNs but only services the initial packets of flows initiated 1628 from the EUNs and destined to EPAs. Therefore, traffic scaling for 1629 EPA-addressed traffic is an asymmetric consideration and is 1630 proportional to the number of EUNs each Server serves. 1632 In terms of state requirements for Relays, each Relay maintains a 1633 list of all Servers in the VPC overlay network as well as FIB entries 1634 for all customer EUNs that each Server serves. This state is 1635 therefore dominated by the number of EUNs in the VPC overlay network. 1636 Sizing the Relay to accommodate state information for all EUNs is 1637 therefore required during VPC overlay network planning. In terms of 1638 state requirements for Servers, each Server maintains tunnel state 1639 for each of the customer EUNs it serves but need not keep state for 1640 all EUNs in the VPC overlay network. Finally, neither Relays nor 1641 Servers need keep state for final destinations of outbound traffic. 1643 Clients source and sink all traffic packets originating from or 1644 destined to the customer EUN. 
Therefore traffic scaling 1645 considerations for Clients are the same as for any site border 1646 router. Clients also retain state for the Servers for final 1647 destinations of outbound traffic flows. This can be managed as soft 1648 state, since stale entries purged from the cache will be refreshed 1649 when new traffic packets are sent. 1651 Author's Address 1653 Fred L. Templin (editor) 1654 Boeing Research & Technology 1655 P.O. Box 3707 MC 7L-49 1656 Seattle, WA 98124 1657 USA 1659 Email: fltemplin@acm.org