Internet Research Task Force                             F. Templin, Ed.
(IRTF)                                      Boeing Research & Technology
Internet-Draft                                           October 8, 2010
Intended status: Experimental
Expires: April 11, 2011

              The Internet Routing Overlay Network (IRON)
                       draft-templin-iron-13.txt

Abstract

   Since the Internet must continue to support escalating growth due to
   increasing demand, it is clear that current routing architectures and
   operational practices must be updated. This document proposes an
   Internet Routing Overlay Network (IRON) that supports sustainable
   growth through Provider Independent addressing while requiring no
   changes to end systems and no changes to the existing routing system.
   IRON further addresses other important issues including routing
   scaling, mobility management, multihoming, traffic engineering and
   NAT traversal. While business considerations are an important
   determining factor for widespread adoption, they are out of scope for
   this document. This document is a product of the IRTF Routing
   Research Group.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 11, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  The Internet Routing Overlay Network
     3.1.  IRON Client Router
     3.2.  IRON Serving Router
     3.3.  IRON Relay Router
   4.  IRON Organizational Principles
   5.  IRON Initialization
     5.1.  IRON Relay Router Initialization
     5.2.  IRON Serving Router Initialization
     5.3.  IRON Client Router Initialization
   6.  IRON Operation
     6.1.  IRON Client Router Operation
     6.2.  IRON Serving Router Operation
     6.3.  IRON Relay Router Operation
     6.4.  IRON Reference Operating Scenarios
       6.4.1.  Both Hosts Within IRON EUNs
       6.4.2.  Mixed IRON and Non-IRON Hosts
     6.5.  Mobility, Multihoming and Traffic Engineering
           Considerations
       6.5.1.  Mobility Management
       6.5.2.  Multihoming
       6.5.3.  Inbound Traffic Engineering
       6.5.4.  Outbound Traffic Engineering
     6.6.  Renumbering Considerations
     6.7.  NAT Traversal Considerations
     6.8.  Nested EUN Considerations
       6.8.1.  Host A Sends Packets to Host Z
       6.8.2.  Host Z Sends Packets to Host A
   7.  Additional Considerations
   8.  Related Initiatives
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Appendix A.  IRON VPs Over Internetworks with Different
                Address Families
   Appendix B.  Scaling Considerations
   Author's Address

1.  Introduction

   Growth in the number of entries instantiated in the Internet routing
   system has led to concerns for unsustainable routing scaling
   [I-D.narten-radir-problem-statement]. Operational practices such as
   increased use of multihoming with IPv4 Provider-Independent (PI)
   addressing are resulting in more and more fine-grained prefixes
   injected into the routing system from more and more end-user
   networks.
   Furthermore, the forthcoming depletion of the public IPv4
   address space has raised concerns for both increased address space
   fragmentation (leading to yet further routing table entries) and an
   impending address space run-out scenario. At the same time, the IPv6
   routing system is beginning to see growth in IPv6 Provider-Aggregated
   (PA) prefixes [BGPMON] which must be managed in order to avoid the
   same routing scaling issues the IPv4 Internet now faces. Since the
   Internet must continue to scale to accommodate increasing demand, it
   is clear that new routing methodologies and operational practices are
   needed.

   Several related works have investigated routing scaling issues.
   Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in
   Increasing Scopes (AIS) [I-D.zhang-evolution] are global routing
   proposals that introduce routing overlays with Virtual Prefixes (VPs)
   to reduce the number of entries required in each router's Forwarding
   Information Base (FIB) and Routing Information Base (RIB). Routing
   and Addressing in Networks with Global Enterprise Recursion (RANGER)
   [RFC5720] examines recursive arrangements of enterprise networks that
   can apply to a very broad set of use case scenarios
   [I-D.russert-rangers]. In particular, RANGER supports encapsulation
   and secure redirection by treating each layer in the recursive
   hierarchy as a virtual non-broadcast, multiple access (NBMA) "link".
   RANGER is an architectural framework that includes Virtual Enterprise
   Traversal (VET) [I-D.templin-intarea-vet] and the Subnetwork
   Encapsulation and Adaptation Layer (SEAL) [I-D.templin-intarea-seal]
   as its functional building blocks.

   This document proposes an Internet Routing Overlay Network (IRON)
   with goals of supporting sustainable growth while requiring no
   changes to the existing routing system.
   IRON borrows concepts from
   VA, AIS and RANGER, and further borrows concepts from the Internet
   Vastly Improved Plumbing (Ivip) [I-D.whittle-ivip-arch] architecture
   proposal along with its associated Translating Tunnel Router (TTR)
   mobility extensions [TTRMOB]. Indeed, the TTR model to a great
   degree inspired the IRON mobility architecture design discussed in
   this document. The Network Address Translator (NAT) traversal
   techniques adapted for IRON were inspired by the Simple Address
   Mapping for Premises Legacy Equipment (SAMPLE) proposal
   [I-D.carpenter-softwire-sample].

   IRON specifically seeks to provide scalable PI addressing without
   changing the current BGP [RFC4271] routing system. IRON observes the
   Internet Protocol standards [RFC0791][RFC2460]. Other network layer
   protocols that can be encapsulated within IP packets (e.g., OSI/CLNP
   [RFC1070], etc.) are also within scope.

   The IRON is a global routing system comprising virtual overlay
   networks managed by Virtual Prefix Companies (VPCs) that own and
   manage Virtual Prefixes (VPs) from which End User Network (EUN) PI
   prefixes (EPs) are delegated to customer sites. The IRON is
   motivated by a growing customer demand for multihoming, mobility
   management and traffic engineering while using stable PI addressing
   to avoid network renumbering [RFC4192][RFC5887]. The IRON uses the
   existing IPv4 and IPv6 global Internet routing systems as virtual
   links for tunneling inner network protocol packets within outer IPv4
   or IPv6 headers (see Section 3). The IRON requires deployment of a
   small number of new BGP core routers and supporting servers, as well
   as IRON-aware routers/servers in customer EUNs. No modifications to
   hosts, and no modifications to most routers are required.

   While the IRON architecture addresses network mobility, host mobility
   considerations are outside the scope of this document.
   IP multicast
   considerations are also out of scope.

   Note: This document is offered in compliance with Internet Research
   Task Force (IRTF) document stream procedures [RFC5743]; it is not an
   IETF product and is not a standard. The views in this document were
   considered controversial by the IRTF Routing Research Group (RRG) but
   the RG reached a consensus that the document should still be
   published. The document will undergo a period of review within the
   RRG and through selected expert reviewers prior to publication. The
   following sections discuss details of the IRON architecture.

2.  Terminology

   This document makes use of the following terms:

   End User Network (EUN)
      an edge network that connects an organization's devices (e.g.,
      computers, routers, printers, etc.) to the Internet.

   End User Network PI Prefix (EP)
      a more-specific Provider-Independent (PI) prefix derived from a
      Virtual Prefix (VP) (e.g., an IPv4 /28, an IPv6 /56, etc.) and
      delegated to an EUN by a Virtual Prefix Company (VPC).

   End User Network PI Address (EPA)
      a network layer address belonging to an EP and assigned to the
      interface of an end system in an EUN.

   Forwarding Information Base (FIB)
      a data structure containing network prefix to next-hop mappings;
      usually maintained in a router's fast-path processing lookup
      tables.

   Internet Routing Overlay Network (IRON)
      a composite virtual overlay network that comprises the union of
      all VPC overlay networks configured over a common Internetwork.
      The IRON supports routing through encapsulation of inner packets
      with EPA addresses within outer headers that use locator
      addresses.

   IRON Client Router ("Client")
      a customer's router (or host with embedded gateway function) that
      logically connects the customer's EUNs and their associated EPs to
      the IRON via tunnels.

   IRON Serving Router ("Server")
      a VPC's overlay network router that provides forwarding and
      mapping services for the EPs owned by customer Client routers.

   IRON Relay Router ("Relay")
      a VPC's overlay network router that acts as a relay between the
      IRON and the native Internet.

   IRON Router (IR)
      generically refers to any of an IRON Client/Server/Relay router.

   Internet Service Provider (ISP)
      a service provider which connects customer EUNs to the underlying
      Internetwork. In other words, an ISP is responsible for providing
      basic Internet connectivity for customer EUNs.

   Locator
      an IP address assigned to the interface of a router or end system
      within a public or private network. Locators taken from public IP
      prefixes are routable on a global basis, while locators taken from
      private IP prefixes are made public via Network Address
      Translation (NAT).

   Provider Aggregated (PA) address or prefix
      a network layer address or prefix delegated to an EUN by an ISP.

   Provider Independent (PI) address or prefix
      a network layer address or prefix delegated to an EUN by a third
      party independently of the EUN's ISP arrangements.

   Routing and Addressing in Networks with Global Enterprise Recursion
   (RANGER)
      an architectural examination of virtual overlay networks applied
      to enterprise network scenarios, with implications for a wider
      variety of use cases.

   Subnetwork Encapsulation and Adaptation Layer (SEAL)
      an encapsulation sublayer that provides extended packet
      identification and a control message protocol to ensure
      deterministic network-layer feedback.

   Virtual Enterprise Traversal (VET)
      a method for discovering border routers and forming dynamic
      point-to-(multi)point tunnels over enterprise networks (or sites)
      with varying properties.

   Virtual Prefix (VP)
      a PI prefix block (e.g., an IPv4 /16, an IPv6 /20, an OSI NSAP
      prefix, etc.)
      that is owned and managed by a Virtual Prefix
      Company (VPC).

   Virtual Prefix Company (VPC)
      a company that owns and manages a set of VPs from which it
      delegates EPs to EUNs.

   VPC Overlay Network
      a specialized set of routers deployed by a VPC to service customer
      EUNs through a virtual overlay network configured over an
      underlying Internetwork (e.g., the global Internet).

3.  The Internet Routing Overlay Network

   The Internet Routing Overlay Network (IRON) is a system of virtual
   overlay networks configured over a common Internetwork. While the
   principles presented in this document are discussed within the
   context of the public global Internet, they can also be applied to
   any autonomous Internetwork. The rest of this document therefore
   uses the terms "Internet" and "Internetwork" interchangeably
   except in cases where specific distinctions must be made.

   The IRON consists of IRON Routers (IRs) that automatically tunnel the
   packets of end-to-end communication sessions within encapsulating
   headers used for Internet routing. IRs use Virtual Enterprise
   Traversal (VET) [I-D.templin-intarea-vet] in conjunction with the
   Subnetwork Encapsulation and Adaptation Layer (SEAL)
   [I-D.templin-intarea-seal] to encapsulate inner network layer packets
   within outer headers as shown in Figure 1:

                                       +-------------------------+
                                       |   Outer headers with    |
                                       ~    locator addresses    ~
                                       |     (IPv4 or IPv6)      |
                                       +-------------------------+
                                       |       SEAL Header       |
     +-------------------------+       +-------------------------+
     |   Inner Packet Header   |  -->  |   Inner Packet Header   |
     ~    with EP addresses    ~  -->  ~    with EP addresses    ~
     | (IPv4, IPv6, OSI, etc.) |  -->  | (IPv4, IPv6, OSI, etc.) |
     +-------------------------+       +-------------------------+
     |                         |  -->  |                         |
     ~    Inner Packet Body    ~  -->  ~    Inner Packet Body    ~
     |                         |  -->  |                         |
     +-------------------------+       +-------------------------+

         Inner packet before               Outer packet after
            encapsulation                     encapsulation

     Figure 1: Encapsulation of Inner Packets Within Outer IP Headers

   VET specifies the automatic tunneling mechanisms used for
   encapsulation, while SEAL specifies the format and usage of the SEAL
   header as well as a set of control messages. Most notably, IRs use
   the SEAL Control Message Protocol (SCMP) to deterministically
   exchange and authenticate control messages such as route
   redirections, indications of Path Maximum Transmission Unit (PMTU)
   limitations, destination unreachables, etc.

   The IRON is the union of all virtual overlay networks that are
   configured over a common underlying Internet and are owned and
   managed by Virtual Prefix Companies (VPCs). Each such virtual overlay
   network comprises a set of IRs distributed throughout the Internet to
   serve highly-aggregated Virtual Prefixes (VPs). VPCs delegate
   sub-prefixes from their VPs which they lease to customers as End User
   Network PI prefixes (EPs). The customers in turn assign the EPs to
   their customer edge IRs which connect their End User Networks (EUNs)
   to the IRON.

   VPCs may have no affiliation with the ISP networks from which
   customers obtain their basic Internet connectivity. Therefore, a
   customer could procure its summary network services either through a
   common broker or through separate entities. In either case, the VPC
   can open for business and begin serving its customers immediately
   without the need to coordinate its activities with ISPs or with other
   VPCs. Further details on business considerations are out of scope
   for this document.
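
   The delegation of EPs from a VP described in this section can be
   illustrated with a minimal Python sketch. It is not part of the IRON
   specification. The documentation prefix 2001:db8::/32 stands in for
   a real VP, the /56 EP length follows the example given in Section 2,
   and the function name delegate_eps is hypothetical.

```python
import ipaddress
from itertools import islice

def delegate_eps(vp, ep_prefixlen, count):
    """Return the first `count` EPs of length `ep_prefixlen` carved from `vp`.

    Hypothetical helper: a real VPC would track which EPs are already leased.
    """
    vp_net = ipaddress.ip_network(vp)
    # subnets() lazily enumerates every more-specific prefix of the VP.
    return list(islice(vp_net.subnets(new_prefix=ep_prefixlen), count))

# Assumed VP (documentation prefix) and EP length, for illustration only.
eps = delegate_eps("2001:db8::/32", 56, 3)
for ep in eps:
    print(ep)
```

   Each returned prefix is a more-specific sub-prefix of the VP, so a
   Relay that advertises only the VP still attracts traffic for every
   delegated EP.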

   The IRON requires no changes to end systems and no changes to most
   routers in the Internet. Instead, the IRON comprises IRs that are
   deployed either as new platforms or as modifications to existing
   platforms. IRs may be deployed incrementally without disturbing the
   existing Internet routing system, and act as waypoints (or "cairns")
   for navigating the IRON. The functional roles for IRs are described
   in the following sections.

3.1.  IRON Client Router

   An IRON Client Router (or, simply, "Client") is a customer's router
   (or host with embedded gateway function) that logically connects the
   customer's EUNs and their associated EPs to the IRON via tunnels as
   shown in Figure 2. Clients obtain EPs from VPCs and use them to
   number subnets and interfaces within their EUNs. A Client can be
   deployed on the same physical platform that also connects the
   customer's EUNs to its ISPs, but it may also be a separate router or
   even a standalone server system located within the EUN. (This model
   applies even if the EUN connects to the ISP via a Network Address
   Translator (NAT); see Section 6.7.)

                          .-.
                       ,-(  _)-.
       +--------+   .-(_    (_  )-.
       | Client |--(_     ISP      )
       +---+----+   `-(______)-'
           |   <= T         \     .-.
          .-.       u        \ ,-(  _)-.
       ,-(  _)-.       n     .-(_    (_  )-.
    .-(_    (_  )-.      n  (_   Internet   )
   (_     EUN      )       e   `-(______)-'
      `-(______)-'           l          ___
           |                   s =>    (:::)-.
      +----+---+                   .-(::::::::)
      |  Host  |                .-(::::::::::::)-.
      +--------+               (:::: The IRON ::::)
                                `-(::::::::::::)-'
                                   `-(::::::)-'

        Figure 2: IRON Client Router Connecting EUN to the IRON

3.2.  IRON Serving Router

   An IRON Serving Router (or, simply, "Server") is a VPC's overlay
   network router that provides forwarding and mapping services for the
   EPs owned by customer Client routers.
   In typical deployments, a VPC
   will deploy many Servers around the IRON in a globally-distributed
   fashion (e.g., as depicted in Figure 3) so that Clients can discover
   those that are nearby.

             +--------+    +--------+
             | Boston |    | Tokyo  |
             | Server |    | Server |
             +--+-----+    ++-------+
     +--------+  \          /
     | Seattle|   \   ___  /
     | Server |    \ (:::)-.       +--------+
     +------+-+  .-(::::::::)------+ Paris  |
             \.-(::::::::::::)-.   | Server |
             (:::: The IRON ::::)  +--------+
              `-(::::::::::::)-'
     +--------+ /  `-(::::::)-'  \     +--------+
     | Moscow + |                 \--- + Sydney |
     | Server | +----+---+             | Server |
     +--------+ | Cairo  |             +--------+
                | Server |
                +--------+

       Figure 3: IRON Serving Router Global Distribution Example

   Each Server acts as a tunnel-endpoint router that forms a
   bidirectional tunnel with each of its Client customers. Each Server
   also associates with a set of Relays that can forward packets from
   the IRON out to the native Internet and vice-versa as discussed in
   the next section.

3.3.  IRON Relay Router

   An IRON Relay Router (or, simply, "Relay") is a VPC's overlay network
   router that acts as a relay between the IRON and the native Internet.
   It therefore also serves as an Autonomous System Border Router (ASBR)
   that is owned and managed by the VPC.

   Each VPC configures one or more Relays which advertise the company's
   VPs into the IPv4 and IPv6 global Internet BGP routing systems. Each
   Relay associates with all of the VPC's overlay network Servers, e.g.,
   via tunnels over the IRON, via a direct interconnect such as an
   Ethernet cable, etc. The Relay role (as well as its relationship
   with overlay network Servers) is depicted in Figure 4:

                       .-.
                    ,-(  _)-.
                 .-(_    (_  )-.
                (_   Internet   )
                   `-(______)-'     |  +--------+
                        |           |--| Server |
                   +----+---+       |  +--------+
                   | Relay  |-------|  +--------+
                   +--------+       |--| Server |
                       _||          |  +--------+
                      (:::)-.  (Ethernet)
                  .-(::::::::)
    +--------+ .-(::::::::::::)-. +--------+
    | Server |=(:::: The IRON ::::)=| Server |
    +--------+ `-(::::::::::::)-' +--------+
                  `-(::::::)-'
                       || (Tunnels)
                   +--------+
                   | Server |
                   +--------+

     Figure 4: IRON Relay Router Connecting IRON to Native Internet

4.  IRON Organizational Principles

   The IRON consists of the union of all VPC overlay networks configured
   over a common Internetwork (e.g., the public Internet). Each such
   overlay network represents a distinct "patch" on the Internet
   "quilt", where the patches are stitched together by tunnels over the
   links, routers, bridges, etc., that connect the underlying
   Internetwork. When a new VPC overlay network is deployed, it becomes
   yet another patch on the quilt. The IRON is therefore a composite
   overlay network consisting of multiple individual patches, where each
   patch coordinates its activities independently of all others (with
   the exception that the Servers of each patch must be aware of all VPs
   in the IRON). In order to ensure mutual cooperation between all VPC
   overlay networks, sufficient address space portions of the inner
   network layer protocol (e.g., IPv4, IPv6, etc.) should be set aside
   and designated as VP space.

   Each VPC overlay network in the IRON maintains a set of Relays and
   Servers that provide services to their Client customers. In order to
   ensure adequate customer service levels, the VPC should conduct a
   traffic scaling analysis and distribute sufficient Relays and Servers
   for the overlay network globally throughout the Internet. Figure 5
   depicts the logical arrangement of Relays, Servers and Clients in an
   IRON virtual overlay network:

                               .-.
                            ,-(  _)-.
                         .-(_    (_  )-.
                        (__ Internet   _)
                           `-(______)-'

         <------------     Relays      ------------>
                   ________________________
                  (::::::::::::::::::::::::)-.
              .-(:::::::::::::::::::::::::::::)
           .-(:::::::::::::::::::::::::::::::::)-.
          (::::::::::: The IRON :::::::::::::::)
           `-(:::::::::::::::::::::::::::::::::)-'
              `-(::::::::::::::::::::::::::::)-'

         <------------    Servers      ------------>
         .-.                .-.                     .-.
      ,-(  _)-.          ,-(  _)-.               ,-(  _)-.
   .-(_    (_  )-.    .-(_    (_  )-.         .-(_    (_  )-.
  (__   ISP A    _)  (__   ISP B    _)  ...  (__   ISP x    _)
     `-(______)-'       `-(______)-'            `-(______)-'
        <-----------          NATs        ------------>

        <----------- Clients and EUNs ----------->

             Figure 5: Virtual Overlay Network Organization

   Each Relay in the VPC overlay network connects the overlay directly
   to the underlying IPv4 and IPv6 Internets. It also advertises the
   VPC overlay network's IPv4 VPs into the IPv4 BGP routing system and
   advertises the overlay network's IPv6 VPs into the IPv6 BGP routing
   system. Relays will therefore receive packets with EPA destination
   addresses sent by end systems in the Internet and direct them toward
   EPA-addressed end systems connected to the VPC overlay network.

   Each VPC overlay network also manages a set of Servers that connect
   their Clients and associated EUNs to the IRON and to the IPv6 and
   IPv4 Internets via their associations with Relays. IRON Servers
   therefore need not be BGP routers themselves and can be simple
   commodity hardware platforms. Moreover, the Server and Relay
   functions can be deployed together on the same physical platform as a
   unified gateway or they may be deployed on separate platforms (e.g.,
   for load balancing purposes).

   Each Server maintains a working set of Clients for which it caches
   EP-to-Client mappings in its Forwarding Information Base (FIB).
   Each Server also in turn propagates the list of EPs in its working
   set to each of the Relays in the VPC overlay network via a dynamic
   routing protocol (e.g., an overlay network internal BGP instance that
   carries only the EP-to-Server mappings and does not interact with the
   external BGP routing system). Each Server therefore only needs to
   track the EPs for its current working set of Clients, while each
   Relay will maintain a full EP-to-Server mapping table that represents
   reachability information for all EPs in the VPC overlay network.

   Customers establish Clients that obtain their basic Internet
   connectivity from ISPs and connect to Servers to attach their EUNs to
   the IRON. Each EUN can connect to the IRON via one or multiple
   Clients as long as the Clients coordinate with one another, e.g., to
   mitigate EUN partitions. Unlike Relays and Servers, Clients may use
   private addresses behind one or several layers of NATs. Each Client
   initially discovers a list of nearby Servers through an anycast
   discovery process (described below). It then selects one of these
   nearby Servers and forms a bidirectional tunnel through an initial
   exchange followed by periodic keepalives.

   After the Client selects a Server, it forwards initial outbound
   packets from its EUNs by tunneling them to the Server which in turn
   forwards them to the nearest Relay within the IRON that serves the
   final destination. The Client will subsequently receive redirect
   messages informing it of a more direct route through a Server that
   serves the final destination EUN.

   The IRON can also be used to support VPs of network layer address
   families that cannot be routed natively in the underlying
   Internetwork (e.g., OSI/CLNP over the public Internet, IPv6 over
   IPv4-only Internetworks, IPv4 over IPv6-only Internetworks, etc.).
   Further details for support of IRON VPs of one address family over
   Internetworks based on other address families are discussed in
   Appendix A.

5.  IRON Initialization

   IRON initialization entails the startup actions of IRs within the VPC
   overlay network and customer EUNs. The following sections discuss
   these startup procedures.

5.1.  IRON Relay Router Initialization

   Before its first operational use, each Relay in a VPC overlay network
   is provisioned with the list of VPs that it will serve as well as the
   locators for all Servers that belong to the same overlay network.
   The Relay is also provisioned with external BGP interconnections the
   same as for any BGP router.

   Upon startup, the Relay engages in BGP routing exchanges with its
   peers in the IPv4 and IPv6 Internets the same as for any BGP router.
   It then connects to all of the Servers in the overlay network (e.g.,
   via a TCP connection over a bidirectional tunnel, via an iBGP route
   reflector, etc.) for the purpose of discovering EP->Server mappings.
   After the Relay has fully populated its EP->Server mapping
   information database, it is said to be "synchronized" with respect to
   its VPs.

   After this initial synchronization procedure, the Relay then
   advertises the overlay network's VPs externally. In particular, the
   Relay advertises the IPv6 VPs into the IPv6 BGP routing system and
   advertises the IPv4 VPs into the IPv4 BGP routing system. The Relay
   additionally advertises an IPv4 /24 companion prefix (e.g.,
   192.0.2.0/24) into the IPv4 routing system and an IPv6 /64
   companion prefix (e.g., 2001:DB8::/64) into the IPv6 routing system
   (note that these may also be sub-prefixes taken from a VP).
The 592 Relay then configures the host number '1' in the IPv4 companion 593 prefix (e.g., as 192.0.2.1) and the interface identifier '0' in the 594 IPv6 companion prefix (e.g., as 2001:DB8::0) and assigns the 595 resulting addresses as subnet router anycast addresses 596 [RFC3068][RFC2526] for the VPC overlay network. (See Appendix A for 597 more information on the discovery and use of companion prefixes.) 598 The Relay then engages in ordinary packet forwarding operations. 600 5.2. IRON Serving Router Initialization 602 Before its first operational use, each Server in a VPC overlay 603 network is provisioned with the locators for all Relays that 604 aggregate the overlay network's VPs. In order to support route 605 optimization, the Server must also be provisioned with the list of 606 all VPs in the IRON (i.e., and not just the VPs of its own overlay 607 network) so that it can discern EPA and non-EPA addresses. (The 608 Server could therefore be greatly simplified if the list of VPs could 609 be covered within a small number of very short prefixes, e.g., one or 610 a few IPv6 ::/20's). The Server must also discover the VP companion 611 prefix relationships discussed in Section 5.1, e.g., via a global 612 database such as discussed in Appendix A. 614 Upon startup, each Server must connect to all of the Relays within 615 its overlay network (e.g., via a TCP connection over a bidirectional 616 tunnel, via an iBGP route reflector, etc.) for the purpose of 617 reporting its EP->Server mappings. The Server then actively listens 618 for Client customers which register their EP prefixes as part of 619 establishing a bidirectional tunnel. When a new Client registers its 620 EP prefixes, the Server announces the new EP additions to all Relays; 621 when an existing Client unregisters its EP prefixes, the Server 622 withdraws its announcements. 624 5.3. 
IRON Client Router Initialization 626 Before its first operational use, each Client must obtain one or more 627 EPs from its VPC as well as the companion prefixes associated with 628 the VPC overlay network (see Section 5.1). The Client must also 629 obtain a certificate and a public/private key pair from the VPC that 630 it can later use to prove ownership of its EPs. This implies that 631 each VPC must run its own public key infrastructure to be used only 632 for the purpose of verifying its customers' claimed right to use an 633 EP. Hence, the VPC need not coordinate its public key infrastructure 634 with any other organization. 636 Upon startup, the Client sends an SCMP Router Solicitation (SRS) 637 message to the VPC overlay network subnet router anycast address to 638 discover the nearest Relay. The Relay will return an SCMP Router 639 Advertisement (SRA) message that lists the locator addresses of one or more 640 nearby Servers. (This list is analogous to the ISATAP Potential 641 Router List (PRL) [RFC5214].) 643 After the Client receives an SRA message from the nearby Relay 644 listing the locator addresses of nearby Servers, it sends SRS test 645 messages to one or more of the locator addresses to elicit SRA 646 messages. The Server that configures the locator will include the 647 header of the soliciting SRS message in its SRA message so that the 648 Client can determine the number of hops along the forward path. The 649 Server also includes a metric in its SRA messages indicating its 650 service availability so that the Client can avoid selecting Servers 651 that are overloaded. The Server also includes a challenge/response 652 puzzle that the Client must answer if it wishes to connect to this 653 Server. 655 When the Client receives these SRA messages, it can measure the round 656 trip time between sending the SRS and receiving the SRA as an 657 indication of round-trip delay. 
If the Client wishes to enlist the 658 services of a specific Server (e.g., based on the measured 659 performance), it then calculates the answer to the puzzle using its 660 keying information and sends the answer back to the Server in a new 661 SRS message that also contains all of the Client's EP prefixes for 662 which it claims ownership. If the Client solved the puzzle 663 correctly, the Server will send back a new SRA message that includes 664 a non-zero default router lifetime and that signifies the 665 establishment of a bidirectional tunnel. (A zero default router 666 lifetime on the other hand signifies that the Server is currently 667 unable to establish a bidirectional tunnel, e.g., due to heavy load, 668 due to challenge/response failure, etc.) 670 Note that in the above procedure it is essential that the Client 671 select one and only one Server. This is to allow the VPC overlay 672 network mapping system to have one and only one active EP-to-Server 673 mapping at any point in time which shares fate with the Server 674 itself. If this Server fails, the Client will quickly select a new 675 one which will automatically update the VPC overlay network mapping 676 system with a new EP-to-Server mapping. 678 6. IRON Operation 680 Following the IRON initialization detailed in Section 5, IRs engage 681 in the steady-state process of receiving and forwarding packets. All 682 IRs forward encapsulated packets over the IRON using the mechanisms 683 of VET [I-D.templin-intarea-vet] and SEAL [I-D.templin-intarea-seal], 684 while Relays (and in some cases Servers) additionally forward packets 685 to and from the native IPv6 and IPv4 Internets. IRs also use SCMP to 686 coordinate with other IRs, including the process of sending and 687 receiving redirect messages, error messages, etc. (Note however that 688 an IR must not send an SCMP message in response to an SCMP error 689 message.) Each IR operates as specified in the following sub- 690 sections. 692 6.1. 
IRON Client Router Operation 694 After selecting its Server as specified in Section 5.3, the Client 695 should register each of its ISP connections with the Server in order 696 to establish multiple bidirectional tunnels for multihoming purposes. 697 To do so, it sends periodic SRS messages to its Server via each of 698 its ISPs to establish additional bidirectional tunnels and to keep 699 each tunnel alive. These messages need not include challenge/ 700 response mechanisms since prefix proof of ownership was already 701 established in the initial exchange and a nonce in the SEAL header 702 can be used to confirm that the SRS message was sent by the correct 703 Client. This implies that a single nonce is used to represent the 704 set of all bidirectional tunnels between the Client and the Server. 705 Therefore, there are multiple bidirectional tunnels, and the nonce 706 names this "bundle" of tunnels. (The Client and Server may 707 conceptually represent this "bundle" as a single tunnel with multiple 708 locator addresses, however each such locator address must be tested 709 independently in case there are NATs on the path.) 711 If the Client ceases to receive SRA messages from its Server via a 712 specific ISP connection, it marks the Server as unreachable from that 713 address and therefore over that ISP connection. (The Client should 714 also inform its Server of this outage via one of its working ISP 715 connections.) If the Client ceases to receive SRA messages from its 716 Server via multiple ISP connections, it marks the Server as unusable 717 and quickly attempts to establish a bidirectional tunnel with a new 718 Server. The act of establishing the tunnel with a new Server will 719 automatically purge the stale mapping state associated with the old 720 Server. 
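The per-ISP liveness tracking described above can be illustrated with a minimal sketch. It is not part of the IRON specification: the class name ServerLiveness, the 30-second timeout, and the reading of "multiple ISP connections" as "two or more silent connections, or the only one" are all assumptions made here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ServerLiveness:
    """Hypothetical tracker of SRA arrivals from one Server, per ISP connection."""
    timeout: float = 30.0            # assumed value; the draft specifies no timer
    unusable_threshold: int = 2      # assumed reading of "multiple ISP connections"
    last_sra: dict = field(default_factory=dict)  # ISP name -> time of last SRA

    def sra_received(self, isp, now):
        # Record that an SRA arrived over this ISP connection at time `now`.
        self.last_sra[isp] = now

    def unreachable_isps(self, now):
        # ISP connections over which no SRA has arrived within the timeout.
        return sorted(isp for isp, t in self.last_sra.items()
                      if now - t > self.timeout)

    def server_unusable(self, now):
        # Unusable once enough connections are silent; with a single
        # registered connection, losing that one connection is enough.
        total = len(self.last_sra)
        down = len(self.unreachable_isps(now))
        return total > 0 and down >= min(self.unusable_threshold, total)
```

When server_unusable() becomes true, the Client would restart Server selection as described in Section 5.3. While only unreachable_isps() is non-empty, it would keep the Server and report the outage over a working connection.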
722 When an end system in an EUN sends a flow of packets to a 723 correspondent, the packets are forwarded through the EUN via normal 724 routing until they reach the Client, which then tunnels the initial 725 packets to its Server as the next hop. In particular, the Client 726 encapsulates each packet in an outer header with its locator as the 727 source address and the locator of its Server as the destination 728 address. Note that after sending the initial packets of a flow, the 729 Client may receive important SCMP messages such as indications of 730 PMTU limitations, redirects that point to a better next hop, etc. It 731 is therefore essential that the Client send the initial packets 732 through its Server to avoid loss of SCMP messages that cannot 733 traverse a NAT in the reverse direction. (The Server also provides a 734 control point for inbound traffic engineering and a mobility anchor 735 point and hence cannot be bypassed in the inbound direction.) 737 The Client uses the mechanisms specified in VET and SEAL to 738 encapsulate each forwarded packet. The Client further uses the SCMP 739 protocol to coordinate with other IRs, including accepting redirects 740 and other SCMP messages. When the Client receives an SCMP message, 741 it checks the nonce field of the encapsulated packet-in-error to 742 verify that the message corresponds to the tunnel to its Server and 743 accepts the message if the nonce matches. (Note however that the 744 outer source and destination addresses of the packet-in-error may be 745 different from those in the original packet due to possible Server 746 and/or Relay address rewritings.) 748 6.2. IRON Serving Router Operation 750 After the Server is initialized, it responds to SRSs from Clients by 751 sending SRAs as described in Section 6.1. When the Server receives 752 an SRS message from a new Client, it sends back an SRA message with a 753 challenge/response puzzle. 
The Client in turn sends an SRS message 754 with an answer to the puzzle. If this authentication fails, the 755 Server discards the message. Otherwise, it creates tunnel state for 756 this new Client, records the Client's EPs (see Section 5.3) in its 757 FIB, and records the locator address from the SCMP message as the 758 link-layer address of the next hop. The Server next sends an SRA 759 message back to the Client to complete the tunnel establishment. 761 When the Server receives a SEAL-encapsulated packet from one of its 762 Client tunnel endpoints, it examines the inner destination address. 763 If the inner destination address is not an EPA, the Server 764 decapsulates the packet and forwards it unencapsulated into the 765 Internet if it is able to do so without loss due to ingress 766 filtering. Otherwise, the Server re-encapsulates the packet (i.e., 767 it removes the outer header and replaces it with a new outer header 768 of the same address family) and sets the outer destination address to 769 the locator address of a Relay within its VPC overlay network. It 770 then forwards the re-encapsulated packet to the Relay, which will in 771 turn decapsulate it and forward it into the Internet. 773 If the inner destination address is an EPA, however, the Server 774 rewrites the outer source address to one of its own locator addresses 775 and rewrites the outer destination address to the subnet router 776 anycast address taken from the companion prefix associated with the 777 inner destination address (where the companion prefix of the same 778 address family as the outer IP protocol is used). The Server then 779 forwards the revised packet into the Internet via a default or more- 780 specific route, where it will be directed to the closest Relay within 781 the destination VPC overlay network. After sending the packet, the 782 Server may then receive an SCMP error or redirect message from a 783 Relay/Server within the destination VPC overlay network. 
In that 784 case, the Server verifies that the nonce in the message matches the 785 tunnel corresponding to the Client that sent the original inner 786 packet and discards the message if the nonce does not match. 787 Otherwise, the Server re-encapsulates the SCMP message in a new outer 788 header that uses the source address, destination address and nonce 789 parameters associated with the tunnel to the Client; it then forwards 790 the message to the Client. This arrangement is necessary to allow 791 SCMP messages to flow through any NATs on the path. 793 When a Server ('A') receives a SEAL-encapsulated packet from a Relay 794 or from the Internet, if the inner destination address matches an EP 795 in its FIB, 'A' re-encapsulates the packet in a new outer header that 796 uses the source address, destination address and nonce parameters 797 associated with the tunnel and forwards it to a Client ('B'), which in 798 turn decapsulates the packet and forwards it to the correct end 799 system in the EUN. If 'B' has left notice with 'A' that it has moved 800 to a new Server ('C'), however, 'A' will instead forward the packet 801 to 'C' and also send an SCMP redirect message back to the source of 802 the packet. In this way, 'B' can leave behind forwarding information 803 when changing between Servers 'A' and 'C' (e.g., due to mobility 804 events) without exposing packets to loss. 806 6.3. IRON Relay Router Operation 808 After each Relay has synchronized its VPs (see Section 5.1), it 809 advertises the full set of the company's VPs and companion prefixes 810 into the IPv4 and IPv6 Internet BGP routing systems. These prefixes 811 will be represented as ordinary routing information in the BGP, and 812 any packets originating from the IPv4 or IPv6 Internet destined to an 813 address covered by one of the prefixes will be forwarded to one of 814 the VPC overlay network's Relays. 
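The prefix checks that Sections 6.2 and 6.3 rely on (whether an address is covered by a VP, and hence whether an inner destination is an EPA) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the action labels, and the example prefixes are hypothetical and do not appear in the IRON specification. The sketch uses Python's standard ipaddress module.

```python
import ipaddress


def covering_prefix(addr, prefixes):
    """Return the longest prefix in `prefixes` that covers `addr`, or None."""
    ip = ipaddress.ip_address(addr)
    best = None
    for p in prefixes:
        net = ipaddress.ip_network(p)
        # Only compare within the same address family.
        if net.version == ip.version and ip in net:
            if best is None or net.prefixlen > best.prefixlen:
                best = net
    return best


def server_outbound_action(inner_dst, iron_vps, native_forwarding_ok):
    """Hypothetical labels for the three Server cases in Section 6.2."""
    if covering_prefix(inner_dst, iron_vps) is not None:
        # Inner destination is an EPA: rewrite the outer addresses and send
        # toward the subnet router anycast address of the companion prefix.
        return "rewrite-to-anycast"
    if native_forwarding_ok:
        # Non-EPA destination and no ingress filtering concern.
        return "decapsulate-and-forward"
    # Non-EPA destination, but the packet must leave via a Relay.
    return "reencapsulate-to-relay"
```

For example, with the documentation prefixes 192.0.2.0/24 and 2001:db8::/32 standing in for VPs, a packet whose inner destination is 192.0.2.9 would take the anycast path, while one destined to 198.51.100.7 would be decapsulated or handed to a Relay. A Relay's EP->Server lookup could reuse covering_prefix() over its synchronized EP list.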
816 When a Relay receives a packet from the Internet destined to an EPA 817 covered by one of its VPs, it behaves as an ordinary IP router. In 818 particular, the Relay looks in its FIB to discover a locator of the 819 Server that serves the EP that covers the destination address. The 820 Relay then simply encapsulates the packet with its own locator as the 821 outer source address and the locator of the Server as the outer 822 destination address and forwards the packet to the Server. 824 When a Relay receives a packet from the Internet destined to one of 825 its subnet router anycast addresses, it discards the packet if it is 826 not SEAL-encapsulated. If the packet is an SCMP SRS message, the 827 Relay instead sends an SRA message back to the source listing the 828 locator addresses of nearby Servers then discards the message. The 829 Relay otherwise discards all other SCMP messages. 831 If the packet is an ordinary SEAL packet (i.e., one that encapsulates 832 an inner packet) the Relay sends an SCMP redirect message of the same 833 address family back to the source with the locator of the Server that 834 serves the EPA destination in the inner packet as the redirected 835 target. The source and destination addresses of the SCMP redirect 836 message use the outer destination and source addresses of the 837 original packet, respectively. After sending the redirect message, 838 the Relay then rewrites the outer destination address of the SEAL- 839 encapsulated packet to the locator of the Server and forwards the 840 revised packet to the Server. Note that in this arrangement any 841 errors that occur on the path between the Relay and the Server will 842 be delivered to the original source but with a different destination 843 address due to this Relay address rewriting. 845 6.4. 
IRON Reference Operating Scenarios 847 The IRON supports communications when one or both hosts are located 848 within EP-addressed EUNs regardless of whether the EPs are 849 provisioned by the same VPC or by different VPCs. When both hosts 850 are within IRON EUNs, route redirections that eliminate unnecessary 851 Servers and Relays from the path are possible. When only one host is 852 within an IRON EUN, however, route optimization cannot be used. The 853 following sections discuss the two scenarios. 855 6.4.1. Both Hosts Within IRON EUNs 857 When both hosts are within IRON EUNs, it is sufficient to consider 858 the scenario in a unidirectional fashion, i.e., by tracing packet 859 flows only in the forward direction from the source host to 860 destination host. The reverse direction can be considered 861 separately, and incurs the same considerations as for the forward 862 direction. 864 In this scenario, the initial packets of a flow produced by a source 865 host within an EUN connected to the IRON by a Client must flow 866 through both the Server of the source host and a Relay of the 867 destination host, but route optimization can eliminate these elements 868 from the path for subsequent packets in the flow. Figure 6 shows the 869 flow of initial packets from host A to host B within two IRON EUNs 870 (the same scenario applies whether the two EUNs are within the same 871 VPC overlay network or different overlay networks): 873 ________________________________________ 874 .-( .-. )-. 875 .-( ,-( _)-. )-. 876 .-( +========+(_ (_ +=====+ )-. 877 .( || (_|| Internet ||_) || ). 878 .( || ||-(______)-|| vv ). 879 .( +--------++--+ || || +------------+ ). 880 ( +==>| Server(A) | vv || | Server(B) |====+ ) 881 ( // +---------|\-+ +--++----++--+ +------------+ \\ ) 882 ( // .-. | \ | Relay(B) | .-. \\ ) 883 ( //,-( _)-. | \ +-v----------+ ,-( _)-\\ ) 884 ( .||_ (_ )-. | \____| .-(_ (_ ||. ) 885 ( _|| ISP A .) 
| (__ ISP B ||_)) 886 ( ||-(______)-' | (redirect) `-(______)|| ) 887 ( || | | | vv ) 888 ( +-----+-----+ | +-----+-----+ ) 889 | Client(A) | <--+ | Client(B) | 890 +-----+-----+ The IRON +-----+-----+ 891 | ( (Overlaid on the native Internet) ) | 892 .-. .-( .-) .-. 893 ,-( _)-. .-(________________________)-. ,-( _)-. 894 .-(_ (_ )-. .-(_ (_ )-. 895 (_ IRON EUN A ) (_ IRON EUN B ) 896 `-(______)-' `-(______)-' 897 | | 898 +---+----+ +---+----+ 899 | Host A | | Host B | 900 +--------+ +--------+ 902 Figure 6: Initial Packet Flow Before Redirects 904 With reference to Figure 6, host A sends packets destined to host B 905 via its network interface connected to EUN A. Routing within EUN A 906 will direct the packets to Client(A) as a default router for the EUN 907 which then uses VET and SEAL to encapsulate them in outer headers 908 with its locator address as the outer source address and the locator 909 address of Server(A) as the outer destination address. Client(A) 910 then simply releases the encapsulated packets into its ISP network 911 connection that provided its locator. The ISP will release the 912 packets into the Internet without filtering since the (outer) source 913 address is topologically correct. Once the packets have been 914 released into the Internet, routing will direct them to Server(A). 916 Server(A) receives the encapsulated packets from Client(A) then 917 rewrites the outer source address to one of its own locator 918 addresses, and rewrites the outer destination address to the subnet 919 router anycast address of the appropriate address family associated 920 with the inner destination address. Server(A) then releases the 921 revised packets into the Internet where routing will direct them to 922 Relay(B). 924 Relay(B) will intercept the encapsulated packets from Server(A) then 925 check its FIB to discover an entry that covers inner destination 926 address B with Server(B) as the next hop. 
Relay(B) then returns SCMP 927 redirect messages to Server(A) (*), rewrites the outer destination 928 address of the encapsulated packets to the locator address of 929 Server(B), and forwards these revised packets to Server(B). 931 Server(B) will receive the encapsulated packets from Relay(B) then 932 check its FIB to discover an entry that covers destination address B 933 with Client(B) as the next hop. Server(B) then re-encapsulates the 934 packets in a new outer header that uses the source address, 935 destination address and nonce parameters associated with the tunnel 936 to Client(B). Server(B) then releases these re-encapsulated packets 937 into the Internet, where routing will direct them to Client(B). 938 Client(B) will in turn decapsulate the packets and forward the inner 939 packets to host B via EUN B. 941 (*) Note that after the initial flow of packets, Server(A) will have 942 received one or more SCMP redirect messages from Relay(B) listing 943 Server(B) as a better next hop. Server(A) will in turn forward the 944 redirects to Client(A), which will thereafter forward its 945 encapsulated packets directly to the locator address of Server(B) 946 without involving either Server(A) or Relay(B) as shown in Figure 7: 948 ________________________________________ 949 .-( .-. )-. 950 .-( ,-( _)-. )-. 951 .-( +=============> .-(_ (_ )-.======+ )-. 952 .( // (__ Internet _) || ). 953 .( // `-(______)-' vv ). 954 .( // +------------+ ). 955 ( // | Server(B) |====+ ) 956 ( // +------------+ \\ ) 957 ( // .-. .-. \\ ) 958 ( //,-( _)-. ,-( _)-\\ ) 959 ( .||_ (_ )-. .-(_ (_ ||. ) 960 ( _|| ISP A .) (__ ISP B ||_)) 961 ( ||-(______)-' `-(______)|| ) 962 ( || | | vv ) 963 ( +-----+-----+ The IRON +-----+-----+ ) 964 | Client(A) | (Overlaid on the native Internet) | Client(B) | 965 +-----+-----+ +-----+-----+ 966 | ( ) | 967 .-. .-( .-) .-. 968 ,-( _)-. .-(________________________)-. ,-( _)-. 969 .-(_ (_ )-. .-(_ (_ )-. 
970 (_ IRON EUN A ) (_ IRON EUN B ) 971 `-(______)-' `-(______)-' 972 | | 973 +---+----+ +---+----+ 974 | Host A | | Host B | 975 +--------+ +--------+ 977 Figure 7: Sustained Packet Flow After Redirects 979 6.4.2. Mixed IRON and Non-IRON Hosts 981 When one host is within an IRON EUN and the other is in a non-IRON 982 EUN (i.e., one that connects to the native Internet instead of the 983 IRON), the IR elements involved depend on the packet flow directions. 984 The cases are described in the following sections. 986 6.4.2.1. From IRON Host A to Non-IRON Host B 988 Figure 8 depicts the IRON reference operating scenario for packets 989 flowing from Host A in an IRON EUN to Host B in a non-IRON EUN: 991 _________________________________________ 992 .-( )-. )-. 993 .-( +-------)----+ )-. 994 .-( | Relay(A) |--------------+ )-. 995 .( +------------+ \ ). 996 .( +=======>| Server(A) | \ ). 997 .( // +--------)---+ \ ). 998 ( // ) \ ) 999 ( // The IRON ) \ ) 1000 ( // .-. ) \ .-. ) 1001 ( //,-( _)-. ) \ ,-( _)-. ) 1002 ( .||_ (_ )-. ) The Native Internet .-|_ (_ )-. ) 1003 ( _|| ISP A ) ) (_ | ISP B )) 1004 ( ||-(______)-' ) |-(______)-' ) 1005 ( || | )-. v | ) 1006 ( +-----+ ----+ )-. +-----+-----+ ) 1007 | Client(A) |)-. | Router B | 1008 +-----+-----+ +-----+-----+ 1009 | ( ) | 1010 .-. .-(____________________________________)-. .-. 1011 ,-( _)-. ,-( _)-. 1012 .-(_ (_ )-. .-(_ (_ )-. 1013 (_ IRON EUN A ) (_non-IRON EUN B) 1014 `-(______)-' `-(______)-' 1015 | | 1016 +---+----+ +---+----+ 1017 | Host A | | Host B | 1018 +--------+ +--------+ 1020 Figure 8: From IRON Host A to Non-IRON Host B 1022 In this scenario, host A sends packets destined to host B via its 1023 network interface connected to IRON EUN A. 
Routing within EUN A will 1024 direct the packets to Client(A) as a default router for the EUN which 1025 then uses VET and SEAL to encapsulate them in outer headers with its 1026 locator address as the outer source address and the locator address 1027 of Server(A) as the outer destination address. The ISP will pass the 1028 packets without filtering since the (outer) source address is 1029 topologically correct. Once the packets have been released into the 1030 native Internet, routing will direct them to Server(A). 1032 Server(A) receives the encapsulated packets from Client(A) then re- 1033 encapsulates and forwards them to Relay(A), which simply decapsulates 1034 them and releases the unencapsulated packets into the Internet. Once 1035 the packets are released into the Internet, routing will direct them 1036 to the final destination B. (Note that Server(A) and Relay(A) are 1037 depicted in Figure 8 as two halves of a unified gateway. In that 1038 case, the "forwarding" between Server(A) and Relay(A) is a zero- 1039 instruction imaginary operation within the gateway.) 1041 This scenario always involves a Server and Relay owned by the VPC 1042 that provides service to IRON EUN A. It therefore imparts a cost that 1043 would need to be borne by either the VPC or its customers. 1045 6.4.2.2. From Non-IRON Host B to IRON Host A 1047 Figure 9 depicts the IRON reference operating scenario for packets 1048 flowing from Host B in a non-IRON EUN to Host A in an IRON EUN: 1050 _______________________________________ 1051 .-( )-. )-. 1052 .-( +-------)----+ )-. 1053 .-( | Relay(A) |<-------------+ )-. 1054 .( +------------+ \ ). 1055 .( +========| Server(A) | \ ). 1056 .( // +--------)---+ \ ). 1057 ( // ) \ ) 1058 ( // The IRON ) \ ) 1059 ( // .-. ) \ .-. ) 1060 ( //,-( _)-. ) \ ,-( _)-. ) 1061 ( .||_ (_ )-. ) The Native Internet .-|_ (_ )-. ) 1062 ( _|| ISP A ) ) (_ | ISP B )) 1063 ( ||-(______)-' ) |-(______)-' ) 1064 ( vv | )-. | | ) 1065 ( +-----+ ----+ )-. 
+-----+-----+ ) 1066 | Client(A) |)-. | Router B | 1067 +-----+-----+ +-----+-----+ 1068 | ( ) | 1069 .-. .-(____________________________________)-. .-. 1070 ,-( _)-. ,-( _)-. 1071 .-(_ (_ )-. .-(_ (_ )-. 1072 (_ IRON EUN A ) (_non-IRON EUN B) 1073 `-(______)-' `-(_______)-' 1074 | | 1075 +---+----+ +---+----+ 1076 | Host A | | Host B | 1077 +--------+ +--------+ 1079 Figure 9: From Non-IRON Host B to IRON Host A 1081 In this scenario, host B sends packets destined to host A via its 1082 network interface connected to non-IRON EUN B. Routing will direct 1083 the packets to Relay(A) which then forwards them to Server(A) using 1084 encapsulation if necessary. 1086 Server(A) will then check its FIB to discover an entry that covers 1087 destination address A with Client(A) as the next hop. Server(A) then 1088 (re-)encapsulates the packets in an outer header that uses the source 1089 address, destination address and nonce parameters associated with the 1090 tunnel to Client(A). Server(A) next releases these (re-)encapsulated 1091 packets into the Internet, where routing will direct them to 1092 Client(A). Client(A) will in turn decapsulate the packets and 1093 forward the inner packets to host A via its network interface 1094 connected to IRON EUN A. 1096 This scenario always involves a Server and Relay owned by the VPC 1097 that provides service to IRON EUN A. It therefore imparts a cost that 1098 would need to be borne by either the VPC or its customers. 1100 6.5. Mobility, Multihoming and Traffic Engineering Considerations 1102 While IRON Servers and Relays can be considered as fixed 1103 infrastructure, Clients may need to move between different network 1104 points of attachment, connect to multiple ISPs, or explicitly manage 1105 their traffic flows. The following sections discuss mobility, multi- 1106 homing and traffic engineering considerations for IRON client 1107 routers. 1109 6.5.1. 
Mobility Management 1111 When a Client changes its network point of attachment (e.g., due to a 1112 mobility event), it configures one or more new locators. If the 1113 Client has not moved far away from its previous network point of 1114 attachment, it simply informs its Server of any locator additions or 1115 deletions. This operation is performance-sensitive, and should be 1116 conducted immediately to avoid packet loss. 1118 If the Client has moved far away from its previous network point of 1119 attachment, however, it re-issues the anycast discovery procedure 1120 described in Section 6.1 to discover whether its candidate set of 1121 Servers has changed. If the Client's current Server is also included 1122 in the new list received from the VPC, this provides indication that 1123 the Client has not moved far enough to warrant changing to a new 1124 Server. Otherwise, the Client may wish to move to a new Server in 1125 order to maintain optimal routing. This operation is not 1126 performance-critical, and therefore can be conducted over a matter of 1127 seconds/minutes instead of milliseconds/microseconds. 1129 To move to a new Server, the Client first engages in the EP 1130 registration process with the new Server and maintains the 1131 registrations through periodic SRS/SRA exchanges the same as 1132 described in Section 6.1. The Client then informs its former Server 1133 that it has moved by providing it with the locator address of the new 1134 Server. The Client then discontinues the SRS/SRA keepalive process 1135 with the former Server, which will garbage-collect the stale FIB 1136 entries when their lifetime expires. This will allow the former 1137 Server to redirect existing correspondents to the new Server so that 1138 no packets are lost. 1140 Note that IRON addresses only network mobility and not host mobility. 1141 Mobility considerations for hosts within IRON EUNs are out of scope. 1143 6.5.2. 
Multihoming 1145 A Client may register multiple locators with its Server. It can 1146 assign metrics with its registrations to inform the Server of 1147 preferred locators, and can select outgoing locators according to its 1148 local preferences. Multihoming is therefore naturally supported. 1150 6.5.3. Inbound Traffic Engineering 1152 A Client can dynamically adjust the priorities of its prefix 1153 registrations with its Server in order to influence inbound traffic 1154 flows. It can also change between Servers when multiple Servers are 1155 available, but should strive for stability in its Server selection in 1156 order to limit VPC network routing churn. 1158 6.5.4. Outbound Traffic Engineering 1160 A Client can select outgoing locators, e.g., based on current QoS 1161 considerations such as minimizing one-way delay or one-way delay 1162 variance. 1164 6.6. Renumbering Considerations 1166 As new link layer technologies and/or service models emerge, 1167 customers will be motivated to select their service providers through 1168 healthy competition between ISPs. If a customer's EUN addresses are 1169 tied to a specific ISP, however, the customer may be forced to 1170 undergo a painstaking EUN renumbering process if it wishes to change 1171 to a different ISP [RFC4192][RFC5887]. 1173 When a customer obtains EP prefixes from a VPC, it can change between 1174 ISPs seamlessly and without need to renumber. If the VPC itself 1175 applies unreasonable costing structures for use of the EPs, however, 1176 the customer may be compelled to seek a different VPC and would again 1177 be required to confront a renumbering scenario. The IRON approach to 1178 renumbering avoidance therefore depends on VPCs conducting ethical 1179 business practices and offering reasonable rates. 1181 6.7. 
NAT Traversal Considerations 1183 The Internet today consists of a global public IPv4 routing and 1184 addressing system with non-IRON EUNs that use either public or 1185 private IPv4 addressing. The latter class of EUNs connect to the 1186 public Internet via Network Address Translators (NATs). When a 1187 Client is located behind a NAT, it selects Servers using the same 1188 procedures as for Clients with public addresses, i.e., it will send 1189 SRS messages to Servers in order to get SRA messages in return. The 1190 only requirement is that the Client must configure its SEAL 1191 encapsulation to use a transport protocol that supports NAT 1192 traversal, namely UDP. 1194 Since the Server maintains state about its Client customers, it can 1195 discover locator information for each Client by examining the UDP 1196 port number and IP address in the outer headers of SRS messages. 1197 When there is a NAT in the path, the UDP port number and IP address 1198 in the SRS message will correspond to state in the NAT box and might 1199 not correspond to the actual values assigned to the Client. The 1200 Server can then encapsulate packets destined to hosts in the Client's 1201 EUN within outer headers that use this IP address and UDP port 1202 number. The NAT box will receive the packets, translate the values 1203 in the outer headers, then forward the packets to the Client. In 1204 this sense, the Server's "locator" for the Client consists of the 1205 concatenation of the IP address and UDP port number. 1207 IRON does not add any new issues to the complications already raised by 1208 NAT traversal or by applications that embed address referrals in 1209 their payload. 1211 6.8. Nested EUN Considerations 1213 Each Client configures a locator that may be taken from an ordinary 1214 non-EPA address assigned by an ISP or from an EPA address taken from 1215 an EP assigned to another Client. 
In that case, the Client is said 1216 to be "nested" within the EUN of another Client, and recursive 1217 nestings of multiple layers of encapsulations may be necessary. 1219 For example, in the network scenario depicted in Figure 10 Client(A) 1220 configures a locator EPA(B) taken from the EP assigned to EUN(B). 1221 Client(B) in turn configures a locator EPA(C) taken from the EP 1222 assigned to EUN(C). Finally, Client(C) configures a locator ISP(D) 1223 taken from a non-EPA address delegated by an ordinary ISP(D). Using 1224 this example, the "nested-IRON" case must be examined in which a host 1225 A which configures the address EPA(A) within EUN(A) exchanges packets 1226 with host Z located elsewhere in the Internet. 1228 .-. 1229 ISP(D) ,-( _)-. 1230 +-----------+ .-(_ (_ )-. 1231 | Client(C) |--(_ ISP(D) ) 1232 +-----+-----+ `-(______)-' 1233 | <= T \ .-. 1234 .-. u \ ,-( _)-. 1235 ,-( _)-. n .-(_ (- )-. 1236 .-(_ (_ )-. n (_ Internet ) 1237 (_ EUN(C) ) e `-(______)-' 1238 `-(______)-' l ___ 1239 | EPA(C) s => (:::)-. 1240 +-----+-----+ .-(::::::::) 1241 | Client(B) | .-(::::::::::::)-. +-----------+ 1242 +-----+-----+ (:::: The IRON ::::) | Relay(Z) | 1243 | `-(::::::::::::)-' +-----------+ 1244 .-. `-(::::::)-' +-----------+ 1245 ,-( _)-. | Server(Z) | 1246 .-(_ (_ )-. +-----------+ +-----------+ 1247 (_ EUN(B) ) | Server(C) | +-----------+ 1248 `-(______)-' +-----------+ | Client(Z) | 1249 | EPA(B) +-----------+ +-----------+ 1250 +-----+-----+ | Server(B) | +--------+ 1251 | Client(A) | +-----------+ | Host Z | 1252 +-----------+ +-----------+ +--------+ 1253 | | Server(A) | 1254 .-. +-----------+ 1255 ,-( _)-. EPA(A) 1256 .-(_ (_ )-. +--------+ 1257 (_ EUN(A) )---| Host A | 1258 `-(______)-' +--------+ 1260 Figure 10: Nested EUN Example 1262 The two cases of host A sending packets to host Z, and host Z sending 1263 packets to host A, must be considered separately as described below. 1265 6.8.1. 
Host A Sends Packets to Host Z

   Host A first forwards a packet with source address EPA(A) and
   destination address Z into EUN(A). Routing within EUN(A) will direct
   the packet to Client(A), which encapsulates it in an outer header
   with EPA(B) as the outer source address and Server(A) as the outer
   destination address, then forwards the once-encapsulated packet into
   EUN(B). Routing within EUN(B) will direct the packet to Client(B),
   which encapsulates it in an outer header with EPA(C) as the outer
   source address and Server(B) as the outer destination address, then
   forwards the twice-encapsulated packet into EUN(C). Routing within
   EUN(C) will direct the packet to Client(C), which encapsulates it in
   an outer header with ISP(D) as the outer source address and Server(C)
   as the outer destination address. Client(C) then sends this
   triple-encapsulated packet into the ISP(D) network, where it will be
   routed into the Internet to Server(C).

   When Server(C) receives the triple-encapsulated packet, it removes
   the outer layer of encapsulation and forwards the resulting
   twice-encapsulated packet into the Internet to Server(B). Next,
   Server(B) removes the outer layer of encapsulation and forwards the
   resulting once-encapsulated packet into the Internet to Server(A).
   Next, Server(A) checks the address type of the inner address 'Z'. If
   Z is a non-EPA address, Server(A) simply decapsulates the packet and
   forwards it into the Internet. Otherwise, Server(A) rewrites the
   outer source and destination addresses of the once-encapsulated
   packet and forwards it to Relay(Z). Relay(Z) in turn rewrites the
   outer destination address of the packet to the locator for Server(Z),
   then forwards the packet and sends a redirect to Server(A) (which
   forwards the redirect to Client(A)).
   Server(Z) then re-encapsulates the packet and forwards it to
   Client(Z), which decapsulates it and forwards the inner packet to
   host Z. Subsequent packets from Client(A) will then use Server(Z) as
   the next hop toward host Z, which eliminates Server(A) and Relay(Z)
   from the path.

6.8.2.  Host Z Sends Packets to Host A

   Whether or not host Z configures an EPA address, its packets destined
   to host A will eventually reach Server(A). Server(A) will have a
   mapping that lists Client(A) as the next hop toward EPA(A).
   Server(A) will then encapsulate the packet with EPA(B) as the outer
   destination address and forward the packet into the Internet.
   Internet routing will convey this once-encapsulated packet to
   Server(B), which will have a mapping that lists Client(B) as the next
   hop toward EPA(B). Server(B) will then encapsulate the packet with
   EPA(C) as the outer destination address and forward the packet into
   the Internet. Internet routing will then convey this
   twice-encapsulated packet to Server(C), which will have a mapping
   that lists Client(C) as the next hop toward EPA(C). Server(C) will
   then encapsulate the packet with ISP(D) as the outer destination
   address and forward the packet into the Internet. Internet routing
   will then convey this triple-encapsulated packet to Client(C).

   When the triple-encapsulated packet arrives at Client(C), it strips
   the outer layer of encapsulation and forwards the twice-encapsulated
   packet to EPA(C), which is the locator address of Client(B). When
   Client(B) receives the twice-encapsulated packet, it strips the outer
   layer of encapsulation and forwards the once-encapsulated packet to
   EPA(B), which is the locator address of Client(A).
   When Client(A) receives the once-encapsulated packet, it strips the
   outer layer of encapsulation and forwards the unencapsulated packet
   to EPA(A), which is the host address of host A.

7.  Additional Considerations

   Considerations for the scalability of Internet Routing due to
   multihoming, traffic engineering and provider-independent addressing
   are discussed in [I-D.narten-radir-problem-statement]. Other scaling
   considerations specific to IRON are discussed in Appendix B.

   Route optimization considerations for mobile networks are found in
   [RFC5522].

8.  Related Initiatives

   IRON builds upon the concepts of the RANGER architecture [RFC5720],
   and therefore inherits the same set of related initiatives.

   Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in
   Increasing Scopes (AIS) [I-D.zhang-evolution] provide the basis for
   the Virtual Prefix concepts.

   Internet vastly improved plumbing (Ivip) [I-D.whittle-ivip-arch] has
   contributed valuable insights, including the use of real-time
   mapping. The use of Servers as mobility anchor points is directly
   influenced by Ivip's associated TTR mobility extensions [TTRMOB].

   [I-D.bernardos-mext-nemo-ro-cr] discussed a route optimization
   approach using a Correspondent Router (CR) model. The IRON Server
   construct is similar to the CR concept described in that work;
   however, the manner in which customer EUNs coordinate with Servers is
   different and is based on the redirection model associated with NBMA
   links.

   Numerous publications have proposed NAT traversal techniques. The
   NAT traversal techniques adapted for IRON were inspired by the Simple
   Address Mapping for Premises Legacy Equipment (SAMPLE) proposal
   [I-D.carpenter-softwire-sample].

9.  IANA Considerations

   There are no IANA considerations for this document.

10.
Security Considerations

   Security considerations that apply to tunneling in general are
   discussed in [I-D.ietf-v6ops-tunnel-security-concerns]. Additional
   considerations that apply also to IRON are discussed in RANGER
   [RFC5720], VET [I-D.templin-intarea-vet] and SEAL
   [I-D.templin-intarea-seal].

   The IRON system further depends on mutual authentication of IRON
   Clients to Servers and Servers to Relays. This is accomplished
   through initial authentication exchanges followed by per-packet
   nonces that can be used to detect off-path attacks. As for all
   Internet communications, the IRON system also depends on Relays
   acting with integrity and not injecting false advertisements into the
   BGP (e.g., to mount traffic siphoning attacks).

   Each VPC overlay network requires a means for assuring the integrity
   of the interior routing system so that all Relays and Servers in the
   overlay have a consistent view of Client<->Server bindings. Finally,
   DOS attacks on IRON Relays and Servers can occur when packets with
   spoofed source addresses arrive at high data rates. This issue is,
   however, no different than for any border router in the public
   Internet today.

11.  Acknowledgements

   The ideas behind this work have benefited greatly from discussions
   with colleagues, some of whom appear on the RRG and other IRTF/IETF
   mailing lists. Robin Whittle and Steve Russert co-authored the TTR
   mobility architecture, which strongly influenced IRON. Eric
   Fleischman pointed out the opportunity to leverage anycast for
   discovering topologically-close Servers. Thomas Henderson
   recommended a quantitative analysis of scaling properties.

   The following individuals provided essential review input: Mohamed
   Boucadair, John Buford, Wesley Eddy, Dae Young Kim and Robin Whittle.

12.  References

12.1.
Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              September 1981.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

12.2.  Informative References

   [BGPMON]   net, B., "BGPmon.net - Monitoring Your Prefixes,
              http://bgpmon.net/stat.php", June 2010.

   [I-D.bernardos-mext-nemo-ro-cr]
              Bernardos, C., Calderon, M., and I. Soto, "Correspondent
              Router based Route Optimisation for NEMO (CRON)",
              draft-bernardos-mext-nemo-ro-cr-00 (work in progress),
              July 2008.

   [I-D.carpenter-softwire-sample]
              Carpenter, B. and S. Jiang, "Legacy NAT Traversal for
              IPv6: Simple Address Mapping for Premises Legacy Equipment
              (SAMPLE)", draft-carpenter-softwire-sample-00 (work in
              progress), June 2010.

   [I-D.ietf-grow-va]
              Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and
              L. Zhang, "FIB Suppression with Virtual Aggregation",
              draft-ietf-grow-va-03 (work in progress), August 2010.

   [I-D.ietf-v6ops-tunnel-security-concerns]
              Hoagland, J., Krishnan, S., and D. Thaler, "Security
              Concerns With IP Tunneling",
              draft-ietf-v6ops-tunnel-security-concerns-02 (work in
              progress), August 2010.

   [I-D.narten-radir-problem-statement]
              Narten, T., "On the Scalability of Internet Routing",
              draft-narten-radir-problem-statement-05 (work in
              progress), February 2010.

   [I-D.russert-rangers]
              Russert, S., Fleischman, E., and F. Templin, "RANGER
              Scenarios", draft-russert-rangers-05 (work in progress),
              July 2010.

   [I-D.templin-intarea-seal]
              Templin, F., "The Subnetwork Encapsulation and Adaptation
              Layer (SEAL)", draft-templin-intarea-seal-20 (work in
              progress), September 2010.

   [I-D.templin-intarea-vet]
              Templin, F., "Virtual Enterprise Traversal (VET)",
              draft-templin-intarea-vet-16 (work in progress),
              July 2010.

   [I-D.whittle-ivip-arch]
              Whittle, R., "Ivip (Internet Vastly Improved Plumbing)
              Architecture", draft-whittle-ivip-arch-04 (work in
              progress), March 2010.

   [I-D.zhang-evolution]
              Zhang, B. and L. Zhang, "Evolution Towards Global Routing
              Scalability", draft-zhang-evolution-02 (work in progress),
              October 2009.

   [RFC1070]  Hagens, R., Hall, N., and M. Rose, "Use of the Internet as
              a subnetwork for experimentation with the OSI network
              layer", RFC 1070, February 1989.

   [RFC2526]  Johnson, D. and S. Deering, "Reserved IPv6 Subnet Anycast
              Addresses", RFC 2526, March 1999.

   [RFC3068]  Huitema, C., "An Anycast Prefix for 6to4 Relay Routers",
              RFC 3068, June 2001.

   [RFC3849]  Huston, G., Lord, A., and P. Smith, "IPv6 Address Prefix
              Reserved for Documentation", RFC 3849, July 2004.

   [RFC4192]  Baker, F., Lear, E., and R. Droms, "Procedures for
              Renumbering an IPv6 Network without a Flag Day", RFC 4192,
              September 2005.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4548]  Gray, E., Rutemiller, J., and G. Swallow, "Internet Code
              Point (ICP) Assignments for NSAP Addresses", RFC 4548,
              May 2006.

   [RFC5214]  Templin, F., Gleeson, T., and D. Thaler, "Intra-Site
              Automatic Tunnel Addressing Protocol (ISATAP)", RFC 5214,
              March 2008.

   [RFC5522]  Eddy, W., Ivancic, W., and T. Davis, "Network Mobility
              Route Optimization Requirements for Operational Use in
              Aeronautics and Space Exploration Mobile Networks",
              RFC 5522, October 2009.

   [RFC5720]  Templin, F., "Routing and Addressing in Networks with
              Global Enterprise Recursion (RANGER)", RFC 5720,
              February 2010.

   [RFC5737]  Arkko, J., Cotton, M., and L. Vegoda, "IPv4 Address Blocks
              Reserved for Documentation", RFC 5737, January 2010.

   [RFC5743]  Falk, A., "Definition of an Internet Research Task Force
              (IRTF) Document Stream", RFC 5743, December 2009.

   [RFC5887]  Carpenter, B., Atkinson, R., and H. Flinck, "Renumbering
              Still Needs Work", RFC 5887, May 2010.

   [TTRMOB]   Whittle, R. and S. Russert, "TTR Mobility Extensions for
              Core-Edge Separation Solutions to the Internet's Routing
              Scaling Problem,
              http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf",
              August 2008.

Appendix A.  IRON VPs Over Internetworks with Different Address Families

   The IRON architecture leverages the routing system by providing
   generally shortest-path routing for packets with EPA addresses from
   VPs that match the address family of the underlying Internetwork.
   When the VPs are of an address family that is not routable within the
   underlying Internetwork, however, (e.g., when OSI/NSAP [RFC4548] VPs
   are used within an IPv4 Internetwork) a global mapping database is
   required to allow Servers to map VPs to companion prefixes taken from
   address families that are routable within the Internetwork. For
   example, an IPv6 VP (e.g., 2001:DB8::/32) could be paired with a
   companion IPv4 prefix (e.g., 192.0.2.0/24) so that encapsulated IPv6
   packets can be forwarded over IPv4-only Internetworks.

   Every VP in the IRON must therefore be represented in a globally
   distributed Master VP database (MVPd) that maintains VP-to-companion
   prefix mappings for all VPs in the IRON. The MVPd is maintained by a
   globally-managed assigned numbers authority in the same manner as the
   Internet Assigned Numbers Authority (IANA) currently maintains the
   master list of all top-level IPv4 and IPv6 delegations. The database
   can be replicated across multiple servers for load balancing much in
   the same way that FTP mirror sites are used to manage software
   distributions.
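
   As an informal illustration only, the VP-to-companion prefix mapping
   described above might be modeled as a small lookup table. The sketch
   below is not part of the IRON specification: the names MVPD and
   companion_for are hypothetical, the table holds only the example
   pairing given in this appendix, and the choice of the longest
   covering VP when several match is an assumption made for the sake of
   the example.

```python
import ipaddress

# Hypothetical in-memory copy of the MVPd: VP -> companion prefix.
# The single entry mirrors the example pairing used in this appendix.
MVPD = {
    ipaddress.ip_network("2001:db8::/32"): ipaddress.ip_network("192.0.2.0/24"),
}


def companion_for(epa, mvpd=MVPD):
    """Return (VP, companion prefix) for the VP covering `epa`, or None.

    If several VPs cover the address, the longest one is chosen; that
    tie-break is an assumption of this sketch, not a stated IRON rule.
    """
    addr = ipaddress.ip_address(epa)
    # Only VPs of the same address family as the EPA can cover it.
    matches = [vp for vp in mvpd if vp.version == addr.version and addr in vp]
    if not matches:
        return None
    vp = max(matches, key=lambda net: net.prefixlen)
    return vp, mvpd[vp]


if __name__ == "__main__":
    print(companion_for("2001:db8:1::1"))
    print(companion_for("2001:db9::1"))
```

   An address outside every VP yields None, which in this sketch stands
   for "not an EPA known to the MVPd".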

   Upon startup, each Server discovers the full set of VPs for the IRON
   by reading the MVPd. The Server reads the MVPd from a nearby server
   and periodically checks the server for deltas since the database was
   last read. After reading the MVPd, the Server has a full list of VP
   to companion prefix mappings.

   The Server can then forward packets toward EPAs covered by a VP by
   encapsulating them in an outer header of the VP's companion prefix
   address family and using any address taken from the companion prefix
   as the outer destination address. The companion prefix therefore
   serves as an anycast prefix.

   Possible encapsulations in this model include IPv6-in-IPv4,
   IPv4-in-IPv6, OSI/CLNP-in-IPv6, OSI/CLNP-in-IPv4, etc.

Appendix B.  Scaling Considerations

   Scaling aspects of the IRON architecture have strong implications for
   its applicability in practical deployments. Scaling must be
   considered along multiple vectors including interdomain core routing
   scaling, scaling to accommodate large numbers of customer EUNs,
   traffic scaling, state requirements, etc.

   In terms of routing scaling, each VPC will advertise one or more VPs
   from which EPs are delegated to customer EUNs. Routing state in the
   interdomain core is therefore minimized when each VP covers many EPs.
   For example, the IPv6 prefix 2001:DB8::/32 contains 2^24 ::/56 EP
   prefixes for assignment to EUNs. The IRON could therefore
   accommodate 2^32 ::/56 EPs (2^8 VPs x 2^24 EPs per VP) with only 2^8
   ::/32 VPs advertised in the interdomain routing core.

   In terms of traffic scaling for Relays, each Relay represents an ASBR
   of a "shell" enterprise network that simply directs arriving traffic
   packets with EPA destination addresses towards Servers that service
   customer EUNs. Moreover, the Relay sheds traffic destined to EPAs
   through redirection, which removes it from the path for the vast
   majority of traffic packets.
   On the other hand, each Relay must handle all traffic packets
   forwarded between its customer EUNs and the non-IRON Internet. The
   scaling concerns for this latter class of traffic are no different
   than for ASBR routers that connect large enterprise networks to the
   Internet. In terms of traffic scaling for Servers, each Server
   services a set of the VPC overlay network's customer EUNs. The
   Server services all traffic packets destined to its EUNs but only
   services the initial packets of flows initiated from the EUNs and
   destined to EPAs. Therefore, traffic scaling for EPA-addressed
   traffic is an asymmetric consideration and is proportional to the
   number of EUNs each Server serves.

   In terms of state requirements for Relays, each Relay maintains a
   list of all Servers in the VPC overlay network as well as FIB entries
   for all customer EUNs that each Server serves. This state is
   therefore dominated by the number of EUNs in the VPC overlay network.
   Sizing the Relay to accommodate state information for all EUNs is
   therefore required during VPC overlay network planning. In terms of
   state requirements for Servers, each Server maintains tunnel state
   for each of the customer EUNs it serves but need not keep state for
   all EUNs in the VPC overlay network. Finally, neither Relays nor
   Servers need keep state for final destinations of outbound traffic.

   Clients source and sink all traffic packets originating from or
   destined to the customer EUN. Therefore, traffic scaling
   considerations for Clients are the same as for any site border
   router. Clients also retain state for the Servers that serve the
   final destinations of outbound traffic flows. This can be managed as
   soft state, since stale entries purged from the cache will be
   refreshed when new traffic packets are sent.

Author's Address

   Fred L. Templin (editor)
   Boeing Research & Technology
   P.O.
   Box 3707 MC 7L-49
   Seattle, WA 98124
   USA

   Email: fltemplin@acm.org