Internet Research Task Force                             F. Templin, Ed.
(IRTF)                                      Boeing Research & Technology
Internet-Draft                                          January 05, 2011
Intended status: Experimental
Expires: July 9, 2011

              The Internet Routing Overlay Network (IRON)
                       draft-templin-iron-17.txt

Abstract

   Since the Internet must continue to support escalating growth due to
   increasing demand, it is clear that current routing architectures and
   operational practices must be updated.  This document proposes an
   Internet Routing Overlay Network (IRON) that supports sustainable
   growth while requiring no changes to end systems and no changes to
   the existing routing system.  IRON further addresses other important
   issues including routing scaling, mobility management, multihoming,
   traffic engineering and NAT traversal.  While business considerations
   are an important determining factor for widespread adoption, they are
   out of scope for this document.  This document is a product of the
   IRTF Routing Research Group.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 9, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  The Internet Routing Overlay Network
     3.1.  IRON Client
     3.2.  IRON Serving Router
     3.3.  IRON Relay Router
   4.  IRON Organizational Principles
   5.  IRON Initialization
     5.1.  IRON Relay Router Initialization
     5.2.  IRON Serving Router Initialization
     5.3.  IRON Client Initialization
   6.  IRON Operation
     6.1.  IRON Client Operation
     6.2.  IRON Serving Router Operation
     6.3.  IRON Relay Router Operation
     6.4.  IRON Reference Operating Scenarios
       6.4.1.  Both Hosts Within IRON EUNs
       6.4.2.  Mixed IRON and Non-IRON Hosts
     6.5.  Mobility, Multihoming and Traffic Engineering
           Considerations
       6.5.1.  Mobility Management
       6.5.2.  Multihoming
       6.5.3.  Inbound Traffic Engineering
       6.5.4.  Outbound Traffic Engineering
     6.6.  Renumbering Considerations
     6.7.  NAT Traversal Considerations
     6.8.  Multicast Considerations
     6.9.  Nested EUN Considerations
       6.9.1.  Host A Sends Packets to Host Z
       6.9.2.  Host Z Sends Packets to Host A
   7.  Implications for the Internet
   8.  Additional Considerations
   9.  Related Initiatives
   10. IANA Considerations
   11. Security Considerations
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Appendix A.  IRON VPs Over Internetworks with Different
                Address Families
   Appendix B.  Scaling Considerations
   Author's Address

1.  Introduction

   Growth in the number of entries instantiated in the Internet routing
   system has led to concerns for unsustainable routing scaling
   [I-D.narten-radir-problem-statement].  Operational practices such as
   increased use of multihoming with Provider-Independent (PI)
   addressing are resulting in more and more fine-grained prefixes
   injected into the routing system from more and more end-user
   networks.
   Furthermore, the forthcoming depletion of the public IPv4
   address space has raised concerns for both increased address space
   fragmentation (leading to yet further routing table entries) and an
   impending address space run-out scenario.  At the same time, the IPv6
   routing system is beginning to see growth [BGPMON] which must be
   managed in order to avoid the same routing scaling issues the IPv4
   Internet now faces.  Since the Internet must continue to scale to
   accommodate increasing demand, it is clear that new routing
   methodologies and operational practices are needed.

   Several related works have investigated routing scaling issues.
   Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in
   Increasing Scopes (AIS) [I-D.zhang-evolution] are global routing
   proposals that introduce routing overlays with Virtual Prefixes (VPs)
   to reduce the number of entries required in each router's Forwarding
   Information Base (FIB) and Routing Information Base (RIB).  Routing
   and Addressing in Networks with Global Enterprise Recursion (RANGER)
   [RFC5720] examines recursive arrangements of enterprise networks that
   can apply to a very broad set of use case scenarios
   [I-D.russert-rangers].  IRON specifically adopts the RANGER
   non-broadcast, multiple access (NBMA) tunnel virtual interface model,
   and uses Virtual Enterprise Traversal (VET) [I-D.templin-intarea-vet]
   and the Subnetwork Encapsulation and Adaptation Layer (SEAL)
   [I-D.templin-intarea-seal] as its functional building blocks.

   This document proposes an Internet Routing Overlay Network (IRON)
   with goals of supporting sustainable growth while requiring no
   changes to the existing routing system.
   IRON borrows concepts from
   VA and AIS, and further borrows concepts from the Internet Vastly
   Improved Plumbing (Ivip) [I-D.whittle-ivip-arch] architecture
   proposal along with its associated Translating Tunnel Router (TTR)
   mobility extensions [TTRMOB].  Indeed, the TTR model to a great
   degree inspired the IRON mobility architecture design discussed in
   this document.  The Network Address Translator (NAT) traversal
   techniques adapted for IRON were inspired by the Simple Address
   Mapping for Premises Legacy Equipment (SAMPLE) proposal
   [I-D.carpenter-softwire-sample].

   IRON supports scalable addressing without changing the current BGP
   [RFC4271] routing system.  IRON observes the Internet Protocol
   standards [RFC0791][RFC2460].  Other network layer protocols that can
   be encapsulated within IP packets (e.g., OSI/CLNP [RFC1070]) are also
   within scope.

   The IRON is a global routing system comprising virtual overlay
   networks managed by Virtual Prefix Companies (VPCs) that own and
   manage Virtual Prefixes (VPs) from which End User Network (EUN)
   prefixes (EPs) are delegated to customer sites.  The IRON is
   motivated by a growing customer demand for multihoming, mobility
   management and traffic engineering while using stable addressing to
   minimize dependence on network renumbering [RFC4192][RFC5887].  The
   IRON uses the existing IPv4 and IPv6 global Internet routing systems
   as virtual NBMA links for tunneling inner network protocol packets
   within outer IPv4 or IPv6 headers (see Section 3).  The IRON
   requires deployment of a small number of new BGP core routers and
   supporting servers, as well as IRON-aware routers/servers in customer
   EUNs.  No modifications to hosts and no modifications to most
   routers are required.
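
   As an informal illustration of the VP/EP relationship described
   above, the following Python sketch carves EPs out of a VP and finds
   the VP that covers a given address.  It is not part of the IRON
   specification: the function names delegate_eps and covering_vp are
   hypothetical, and the documentation prefixes 2001:db8::/32 and
   192.0.2.0/24 merely stand in for VPs that a VPC might hold.

```python
import ipaddress
from itertools import islice


def delegate_eps(vp, ep_prefix_len, count):
    """Return the first `count` EPs of length `ep_prefix_len` from `vp`.

    Hypothetical helper: a real VPC would apply its own allocation
    policy; this sketch simply takes sub-prefixes in numeric order.
    """
    vp_net = ipaddress.ip_network(vp)
    return list(islice(vp_net.subnets(new_prefix=ep_prefix_len), count))


def covering_vp(address, vps):
    """Return the first VP in `vps` that contains `address`, or None."""
    addr = ipaddress.ip_address(address)
    for vp in vps:
        net = ipaddress.ip_network(vp)
        # Only compare addresses and prefixes of the same IP version.
        if addr.version == net.version and addr in net:
            return net
    return None


if __name__ == "__main__":
    # Documentation prefixes used as stand-in VPs (an assumption).
    for ep in delegate_eps("2001:db8::/32", 56, 3):
        print("delegated EP:", ep)
    print(covering_vp("2001:db8:0:100::1",
                      ["192.0.2.0/24", "2001:db8::/32"]))
```

   An address that falls within no VP yields None, which corresponds to
   a destination that is reachable only through the native Internet
   rather than through the IRON.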

   Note: This document is offered in compliance with Internet Research
   Task Force (IRTF) document stream procedures [RFC5743]; it is not an
   IETF product and is not a standard.  The views in this document were
   considered controversial by the IRTF Routing Research Group (RRG) but
   the RG reached a consensus that the document should still be
   published.  The document will undergo a period of review within the
   RRG and through selected expert reviewers prior to publication.  The
   following sections discuss details of the IRON architecture.

2.  Terminology

   This document makes use of the following terms:

   End User Network (EUN)
      an edge network that connects an organization's devices (e.g.,
      computers, routers, printers, etc.) to the Internet.

   End User Network Prefix (EP)
      a more-specific inner network-layer prefix derived from a Virtual
      Prefix (VP) (e.g., an IPv4 /28, an IPv6 /56, etc.) and delegated
      to an EUN by a Virtual Prefix Company (VPC).

   End User Network Prefix Address (EPA)
      a network layer address belonging to an EP and assigned to the
      interface of an end system in an EUN.

   Forwarding Information Base (FIB)
      a data structure containing network prefix to next-hop mappings;
      usually maintained in a router's fast-path processing lookup
      tables.

   Internet Routing Overlay Network (IRON)
      a composite virtual overlay network that comprises the union of
      all VPC overlay networks configured over a common Internetwork.
      The IRON supports routing through encapsulation of inner packets
      with EPA addresses within outer headers that use locator
      addresses.

   IRON Client Router/Host ("Client")
      a customer's router or host that logically connects the customer's
      EUNs and their associated EPs to the IRON via an NBMA tunnel
      virtual interface.

   IRON Serving Router ("Server")
      a VPC's overlay network router that provides forwarding and
      mapping services for the EPs owned by customer Clients.

   IRON Relay Router ("Relay")
      a VPC's overlay network router that acts as a relay between the
      IRON and the native Internet.

   IRON Agent (IA)
      generically refers to any of an IRON Client/Server/Relay.

   Internet Service Provider (ISP)
      a service provider that connects customer EUNs to the underlying
      Internetwork.  In other words, an ISP is responsible for providing
      basic Internet connectivity for customer EUNs.

   Locator
      an IP address assigned to the interface of a router or end system
      within a public or private network.  Locators taken from public IP
      prefixes are routable on a global basis, while locators taken from
      private IP prefixes are made public via Network Address
      Translation (NAT).

   Routing and Addressing in Networks with Global Enterprise Recursion
   (RANGER)
      an architectural examination of virtual overlay networks applied
      to enterprise network scenarios, with implications for a wider
      variety of use cases.

   Subnetwork Encapsulation and Adaptation Layer (SEAL)
      an encapsulation sublayer that provides extended packet
      identification and a control message protocol to ensure
      deterministic network-layer feedback.

   Virtual Enterprise Traversal (VET)
      a method for discovering border routers and forming dynamic tunnel
      neighbor relationships over enterprise networks (or sites) with
      varying properties.

   Virtual Prefix (VP)
      a prefix block (e.g., an IPv4 /16, an IPv6 /20, an OSI NSAP
      prefix, etc.) that is owned and managed by a Virtual Prefix
      Company (VPC).

   Virtual Prefix Company (VPC)
      a company that owns and manages a set of VPs from which it
      delegates EPs to EUNs.

   VPC Overlay Network
      a specialized set of routers deployed by a VPC to service customer
      EUNs through a virtual overlay network configured over an
      underlying Internetwork (e.g., the global Internet).

3.  The Internet Routing Overlay Network

   The Internet Routing Overlay Network (IRON) is a system of virtual
   overlay networks configured over a common Internetwork.  While the
   principles presented in this document are discussed within the
   context of the public global Internet, they can also be applied to
   any autonomous Internetwork.  The rest of this document therefore
   uses the terms "Internet" and "Internetwork" interchangeably
   except in cases where specific distinctions must be made.

   The IRON consists of IRON Agents (IAs) that automatically tunnel the
   packets of end-to-end communication sessions within encapsulating
   headers used for Internet routing.  IAs use the Virtual Enterprise
   Traversal (VET) [I-D.templin-intarea-vet] virtual NBMA link model in
   conjunction with the Subnetwork Encapsulation and Adaptation Layer
   (SEAL) [I-D.templin-intarea-seal] to encapsulate inner network layer
   packets within outer headers as shown in Figure 1:

                                      +-------------------------+
                                      |    Outer headers with   |
                                      ~     locator addresses   ~
                                      |     (IPv4 or IPv6)      |
                                      +-------------------------+
                                      |       SEAL Header       |
   +-------------------------+        +-------------------------+
   |   Inner Packet Header   |  -->   |   Inner Packet Header   |
   ~    with EP addresses    ~  -->   ~    with EP addresses    ~
   | (IPv4, IPv6, OSI, etc.) |  -->   | (IPv4, IPv6, OSI, etc.) |
   +-------------------------+        +-------------------------+
   |                         |  -->   |                         |
   ~    Inner Packet Body    ~  -->   ~    Inner Packet Body    ~
   |                         |  -->   |                         |
   +-------------------------+        +-------------------------+

       Inner packet before                Outer packet after
          encapsulation                      encapsulation

     Figure 1: Encapsulation of Inner Packets Within Outer IP Headers

   VET specifies the automatic tunneling mechanisms used for
   encapsulation, while SEAL specifies the format and usage of the SEAL
   header as well as a set of control messages.  Most notably, IAs use
   the SEAL Control Message Protocol (SCMP) to deterministically
   exchange and authenticate control messages such as route
   redirections, indications of Path Maximum Transmission Unit (PMTU)
   limitations, destination unreachables, etc.  IAs appear as neighbors
   on an NBMA virtual link, and form bidirectional and/or unidirectional
   tunnel neighbor relationships.

   The IRON is the union of all virtual overlay networks that are
   configured over a common underlying Internet and are owned and
   managed by Virtual Prefix Companies (VPCs).  Each such virtual
   overlay network comprises a set of IAs distributed throughout the
   Internet to serve highly-aggregated Virtual Prefixes (VPs).  VPCs
   delegate sub-prefixes from their VPs which they lease to customers as
   End User Network Prefixes (EPs).  The customers in turn assign the
   EPs to their customer edge IAs which connect their End User Networks
   (EUNs) to the IRON.

   VPCs may have no affiliation with the ISP networks from which
   customers obtain their basic Internet connectivity.  Therefore, a
   customer could procure its summary network services either through a
   common broker or through separate entities.  In that case, the VPC
   can open for business and begin serving its customers immediately
   without the need to coordinate its activities with ISPs or with other
   VPCs.
   Further details on business considerations are out of scope
   for this document.

   The IRON requires no changes to end systems and no changes to most
   routers in the Internet.  Instead, the IRON comprises IAs that are
   deployed either as new platforms or as modifications to existing
   platforms.  IAs may be deployed incrementally without disturbing the
   existing Internet routing system, and act as waypoints (or "cairns")
   for navigating the IRON.  The functional roles for IAs are described
   in the following sections.

3.1.  IRON Client

   An IRON Client (or, simply, "Client") is a customer's router or host
   that logically connects the customer's EUNs and their associated EPs
   to the IRON via tunnels as shown in Figure 2.  Client routers obtain
   EPs from VPCs and use them to number subnets and interfaces within
   their EUNs.  A Client can be deployed on the same physical platform
   that also connects the customer's EUNs to its ISPs, but it may also
   be a separate router or even a standalone server system located
   within the EUN.  (This model applies even if the EUN connects to the
   ISP via a Network Address Translator (NAT); see Section 6.7.)
   Finally, a Client may also be a simple end system that connects a
   singleton EUN and exhibits the outward appearance of a host.

                           .-.
                        ,-(  _)-.
        +--------+   .-(_    (_  )-.
        | Client |--(_     ISP      )
        +---+----+     `-(______)-'
            |   <= T         \     .-.
           .-.       u        \ ,-(  _)-.
        ,-(  _)-.       n     .-(_    (_  )-.
     .-(_    (_  )-.      n  (_   Internet   )
    (_     EUN      )       e   `-(______)-'
       `-(______)-'           l          ___
            |                   s =>   (:::)-.
       +----+---+                  .-(::::::::)
       |  Host  |               .-(::::::::::::)-.
       +--------+              (:::: The IRON ::::)
                                `-(::::::::::::)-'
                                   `-(::::::)-'

         Figure 2: IRON Client Router Connecting EUN to the IRON

3.2.  IRON Serving Router

   An IRON Serving Router (or, simply, "Server") is a VPC's overlay
   network router that provides forwarding and mapping services for the
   EPs owned by customer Client routers.  In typical deployments, a VPC
   will deploy many Servers around the IRON in a globally-distributed
   fashion (e.g., as depicted in Figure 3) so that Clients can discover
   those that are nearby.

             +--------+    +--------+
             | Boston |    | Tokyo  |
             | Server |    | Server |
             +--+-----+    ++-------+
     +--------+  \          /
     | Seattle|   \   ___  /
     | Server |    \ (:::)-.       +--------+
     +------+-+  .-(::::::::)------+ Paris  |
             \.-(::::::::::::)-.   | Server |
            (:::: The IRON ::::)   +--------+
             `-(::::::::::::)-'
      +--------+ /  `-(::::::)-'  \     +--------+
      | Moscow +          |        \--- + Sydney |
      | Server |     +----+---+         | Server |
      +--------+     | Cairo  |         +--------+
                     | Server |
                     +--------+

        Figure 3: IRON Serving Router Global Distribution Example

   Each Server acts as a tunnel-endpoint router that forms a
   bidirectional tunnel neighbor relationship with each of its Client
   customers.  Each Server also associates with a set of Relays that can
   forward packets from the IRON out to the native Internet and
   vice-versa as discussed in the next section.

3.3.  IRON Relay Router

   An IRON Relay Router (or, simply, "Relay") is a VPC's overlay network
   router that acts as a relay between the IRON and the native Internet.
   It therefore also serves as an Autonomous System Border Router (ASBR)
   that is owned and managed by the VPC.

   Each VPC configures one or more Relays which advertise the company's
   VPs into the IPv4 and IPv6 global Internet BGP routing systems.  Each
   Relay associates with all of the VPC's overlay network Servers, e.g.,
   via tunnels over the IRON, via a direct interconnect such as an
   Ethernet cable, etc.  The Relay role (as well as its relationship
   with overlay network Servers) is depicted in Figure 4:

                      .-.
                   ,-(  _)-.
                .-(_    (_  )-.
               (_   Internet   )
                  `-(______)-'   |  +--------+
                        |        |--| Server |
                   +----+---+    |  +--------+
                   | Relay  |----|  +--------+
                   +--------+    |--| Server |
                       _||       |  +--------+
                      (:::)-.  (Ethernet)
                  .-(::::::::)
   +--------+  .-(::::::::::::)-.  +--------+
   | Server |=(:::: The IRON ::::)=| Server |
   +--------+  `-(::::::::::::)-'  +--------+
                  `-(::::::)-'
                       || (Tunnels)
                   +--------+
                   | Server |
                   +--------+

      Figure 4: IRON Relay Router Connecting IRON to Native Internet

4.  IRON Organizational Principles

   The IRON consists of the union of all VPC overlay networks configured
   over a common Internetwork (e.g., the public Internet).  Each such
   overlay network represents a distinct "patch" on the Internet
   "quilt", where the patches are stitched together by tunnels over the
   links, routers, bridges, etc., that connect the underlying
   Internetwork.  When a new VPC overlay network is deployed, it becomes
   yet another patch on the quilt.  The IRON is therefore a composite
   overlay network consisting of multiple individual patches, where each
   patch coordinates its activities independently of all others (with
   the exception that the Servers of each patch must be aware of all VPs
   in the IRON).  In order to ensure mutual cooperation between all VPC
   overlay networks, sufficient address space portions of the inner
   network layer protocol (e.g., IPv4, IPv6, etc.) should be set aside
   and designated as VP space.

   Each VPC overlay network in the IRON maintains a set of Relays and
   Servers that provide services to their Client customers.  In order to
   ensure adequate customer service levels, the VPC should conduct a
   traffic scaling analysis and distribute sufficient Relays and Servers
   for the overlay network globally throughout the Internet.  Figure 5
   depicts the logical arrangement of Relays, Servers, and Clients in an
   IRON virtual overlay network:

                               .-.
                            ,-(  _)-.
                         .-(_    (_  )-.
                        (__  Internet  _)
                           `-(______)-'

         <------------     Relays      ------------>
                   ________________________
                  (::::::::::::::::::::::::)-.
              .-(:::::::::::::::::::::::::::::)
           .-(:::::::::::::::::::::::::::::::::)-.
          (::::::::::: The IRON :::::::::::::::)
           `-(:::::::::::::::::::::::::::::::::)-'
              `-(::::::::::::::::::::::::::::)-'

         <------------    Servers      ------------>
         .-.                .-.                     .-.
      ,-(  _)-.          ,-(  _)-.               ,-(  _)-.
   .-(_    (_  )-.    .-(_    (_  )-.         .-(_    (_  )-.
  (__   ISP A    _)  (__   ISP B    _)  ...  (__   ISP x    _)
     `-(______)-'       `-(______)-'            `-(______)-'
        <-----------      NATs        ------------>

        <----------- Clients and EUNs ----------->

             Figure 5: Virtual Overlay Network Organization

   Each Relay in the VPC overlay network connects the overlay directly
   to the underlying IPv4 and IPv6 Internets.  It also advertises the
   VPC overlay network's IPv4 VPs into the IPv4 BGP routing system and
   advertises the overlay network's IPv6 VPs into the IPv6 BGP routing
   system.  Relays will therefore receive packets with EPA destination
   addresses sent by end systems in the Internet and direct them toward
   EPA-addressed end systems connected to the VPC overlay network.

   Each VPC overlay network also manages a set of Servers that connect
   their Clients and associated EUNs to the IRON and to the IPv6 and
   IPv4 Internets via their associations with Relays.  IRON Servers
   therefore need not be BGP routers themselves and can be simple
   commodity hardware platforms.  Moreover, the Server and Relay
   functions can be deployed together on the same physical platform as a
   unified gateway or they may be deployed on separate platforms (e.g.,
   for load balancing purposes).

   Each Server maintains a working set of Clients for which it caches
   EP-to-Client mappings in its Forwarding Information Base (FIB).
   Each Server also in turn propagates the list of EPs in its working
   set to each of the Relays in the VPC overlay network via a dynamic
   routing protocol (e.g., an overlay network internal BGP instance that
   carries only the EP-to-Server mappings and does not interact with the
   external BGP routing system).  Each Server therefore only needs to
   track the EPs for its current working set of Clients, while each
   Relay will maintain a full EP-to-Server mapping table that represents
   reachability information for all EPs in the VPC overlay network.

   Customers establish Clients that obtain their basic Internet
   connectivity from ISPs and connect to Servers to attach their EUNs to
   the IRON.  Each EUN can connect to the IRON via one or multiple
   Clients as long as the Clients coordinate with one another, e.g., to
   mitigate EUN partitions.  Unlike Relays and Servers, Clients may use
   private addresses behind one or several layers of NATs.  Each Client
   initially discovers a list of nearby Servers through an anycast
   discovery process (described below).  It then selects one of these
   nearby Servers and forms a bidirectional tunnel neighbor relationship
   with the Server through an initial exchange followed by periodic
   keepalives.

   After the Client selects a Server, it forwards initial outbound
   packets from its EUNs by tunneling them to the Server, which in turn
   forwards them to the nearest Relay within the IRON that serves the
   final destination.  The Client will subsequently receive redirect
   messages informing it of a more direct route through a Server that
   serves the final destination EUN.

   The IRON can also be used to support VPs of network layer address
   families that cannot be routed natively in the underlying
   Internetwork (e.g., OSI/CLNP over the public Internet, IPv6 over
   IPv4-only Internetworks, IPv4 over IPv6-only Internetworks, etc.).
   Further details for support of IRON VPs of one address family over
   Internetworks based on other address families are discussed in
   Appendix A.

5.  IRON Initialization

   IRON initialization entails the startup actions of IAs within the VPC
   overlay network and customer EUNs.  The following sections discuss
   these startup procedures.

5.1.  IRON Relay Router Initialization

   Before its first operational use, each Relay in a VPC overlay network
   is provisioned with the list of VPs that it will serve as well as the
   locators for all Servers that belong to the same overlay network.
   The Relay is also provisioned with external BGP interconnections the
   same as for any BGP router.

   Upon startup, the Relay engages in BGP routing exchanges with its
   peers in the IPv4 and IPv6 Internets the same as for any BGP router.
   It then connects to all of the Servers in the overlay network (e.g.,
   via a TCP connection over a bidirectional tunnel, via an iBGP route
   reflector, etc.) for the purpose of discovering EP-to-Server
   mappings.  After the Relay has fully populated its EP-to-Server
   mapping information database, it is said to be "synchronized" with
   respect to its VPs.

   After this initial synchronization procedure, the Relay then
   advertises the overlay network's VPs externally.  In particular, the
   Relay advertises the IPv6 VPs into the IPv6 BGP routing system and
   advertises the IPv4 VPs into the IPv4 BGP routing system.  The Relay
   additionally advertises an IPv4 /24 companion prefix (e.g.,
   192.0.2.0/24) into the IPv4 routing system and an IPv6 /64
   companion prefix (e.g., 2001:DB8::/64) into the IPv6 routing system
   (note that these may also be sub-prefixes taken from a VP).
   The Relay then configures the host number '1' in the IPv4 companion
   prefix (e.g., as 192.0.2.1) and the interface identifier '0' in the
   IPv6 companion prefix (e.g., as 2001:DB8::0) and assigns the
   resulting addresses as subnet router anycast addresses
   [RFC3068][RFC2526] for the VPC overlay network.  (See Appendix A for
   more information on the discovery and use of companion prefixes.)
   The Relay then engages in ordinary packet forwarding operations.

5.2.  IRON Serving Router Initialization

   Before its first operational use, each Server in a VPC overlay
   network is provisioned with the locators for all Relays that
   aggregate the overlay network's VPs.  In order to support route
   optimization, the Server must also be provisioned with the list of
   all VPs in the IRON (i.e., not just the VPs of its own overlay
   network) so that it can distinguish EPA from non-EPA addresses.  (The
   Server could therefore be greatly simplified if the list of VPs could
   be covered within a small number of very short prefixes, e.g., one or
   a few IPv6 /20s.)  The Server must also discover the VP companion
   prefix relationships discussed in Section 5.1, e.g., via a global
   database such as discussed in Appendix A.

   Upon startup, each Server must connect to all of the Relays within
   its overlay network (e.g., via a TCP connection, via an iBGP route
   reflector, etc.) for the purpose of reporting its EP-to-Server
   mappings.  The Server then actively listens for Client customers
   which register their EPs as part of establishing a
   bidirectional tunnel neighbor relationship.  When a new Client
   registers its EPs, the Server announces the new EP additions
   to all Relays; when an existing Client unregisters its EPs,
   the Server withdraws its announcements.

5.3.  IRON Client Initialization

   Before its first operational use, each Client must obtain one or more
   EPs from its VPC as well as the companion prefixes associated with
   the VPC overlay network (see Section 5.1).  The Client must also
   obtain a certificate and a public/private key pair from the VPC that
   it can later use to prove ownership of its EPs.  This implies that
   each VPC must run its own public key infrastructure to be used only
   for the purpose of verifying its customers' claimed right to use an
   EP.  Hence, the VPC need not coordinate its public key infrastructure
   with any other organization.

   Upon startup, the Client sends an SCMP Router Solicitation (SRS)
   message to the VPC overlay network subnet router anycast address to
   discover the nearest Relay.  The Relay will return an SCMP Router
   Advertisement (SRA) message that lists the locator addresses of one
   or more nearby Servers.  (This list is analogous to the ISATAP
   Potential Router List (PRL) [RFC5214].)

   After the Client receives an SRA message from the nearby Relay
   listing the locator addresses of nearby Servers, it initiates a short
   transaction with one of the Servers carried by a reliable transport
   protocol such as TCP in order to establish a bidirectional tunnel
   neighbor relationship.  The protocol details of the transaction are
   specific to the VPC, and hence out of scope for this document.

   Note that it is essential that the Client select one and only one
   Server.  This is to allow the VPC overlay network mapping system to
   have one and only one active EP-to-Server mapping at any point in
   time which shares fate with the Server itself.  If this Server fails,
   the Client can select a new one which will automatically update the
   VPC overlay network mapping system with a new EP-to-Server mapping.

6.  IRON Operation

   Following the IRON initialization detailed in Section 5, IAs engage
   in the steady-state process of receiving and forwarding packets.  All
   IAs forward encapsulated packets over the IRON using the mechanisms
   of VET [I-D.templin-intarea-vet] and SEAL [I-D.templin-intarea-seal],
   while Relays (and in some cases Servers) additionally forward packets
   to and from the native IPv6 and IPv4 Internets.  IAs also use SCMP to
   coordinate with other IAs, including the process of sending and
   receiving redirect messages, error messages, etc.  (Note however that
   an IA must not send an SCMP message in response to an SCMP error
   message.)  Each IA operates as specified in the following
   sub-sections.

6.1.  IRON Client Operation

   After selecting its Server as specified in Section 5.3, the Client
   should register each of its ISP connections with the Server for
   multihoming purposes.  To do so, it sends periodic beacons (e.g., SRS
   messages) to its Server via each of its ISPs to establish additional
   tunnel neighbor state.  This implies that a single tunnel neighbor
   identifier (i.e., a "nonce") is used to represent the set of all ISP
   paths between the Client and the Server.  Therefore, the nonce names
   this "bundle" of ISP paths.

   If the Client ceases to receive acknowledgements from its Server via
   a specific ISP connection, it marks the Server as unreachable from
   that address and therefore over that ISP connection.  (The Client
   should also inform its Server of this outage via one of its working
   ISP connections.)  If the Client ceases to receive acknowledgements
   from its Server via multiple ISP connections, it marks the Server as
   unusable and quickly attempts to register with a new Server.
The act 681 of registering with a new Server will automatically purge the stale 682 mapping state associated with the old Server, since dynamic routing 683 will propagate the new client/server relationship to the VPC overlay 684 network relay routers. 686 When an end system in an EUN sends a flow of packets to a 687 correspondent, the packets are forwarded through the EUN via normal 688 routing until they reach the Client, which then tunnels the initial 689 packets to its Server as the next hop. In particular, the Client 690 encapsulates each packet in an outer header with its locator as the 691 source address and the locator of its Server as the destination 692 address. Note that after sending the initial packets of a flow, the 693 Client may receive important SCMP messages such as indications of 694 PMTU limitations, redirects that point to a better next hop, etc. 696 The Client uses the mechanisms specified in VET and SEAL to 697 encapsulate each forwarded packet. The Client further uses the SCMP 698 protocol to coordinate with Servers, including accepting redirects 699 and other SCMP messages. When the Client receives an SCMP message, 700 it checks the nonce field of the encapsulated packet-in-error to 701 verify that the message corresponds to the tunnel neighbor state for 702 its Server and accepts the message if the nonce matches. (Note 703 however that the outer source and destination addresses of the 704 packet-in-error may be different than those in the original packet 705 due to possible Server and/or Relay address rewritings.) 707 6.2. IRON Serving Router Operation 709 After the Server is initialized, it responds to SRSs from Clients by 710 sending SRAs. When the Server receives a SEAL-encapsulated packet 711 from one of its Client tunnel neighbors, it examines the inner 712 destination address. 
If the inner destination address is not an EPA, 713 the Server decapsulates the packet and forwards it unencapsulated 714 into the Internet if it is able to do so without loss due to ingress 715 filtering. Otherwise, the Server re-encapsulates the packet (i.e., 716 it removes the outer header and replaces it with a new outer header 717 of the same address family) and sets the outer destination address to 718 the locator address of a Relay within its VPC overlay network. It 719 then forwards the re-encapsulated packet to the Relay, which will in 720 turn decapsulate it and forward it into the Internet. 722 If the inner destination address is an EPA, however, the Server 723 rewrites the outer source address to one of its own locator addresses 724 and rewrites the outer destination address to the subnet router 725 anycast address taken from the companion prefix associated with the 726 inner destination address (where the companion prefix of the same 727 address family as the outer IP protocol is used). The Server then 728 forwards the revised encapsulated packet into the Internet via a 729 default or more-specific route, where it will be directed to the 730 closest Relay within the destination VPC overlay network. After 731 sending the packet, the Server may then receive an SCMP error or 732 redirect message from a Relay/Server within the destination VPC 733 overlay network. In that case, the Server verifies that the nonce in 734 the message matches the Client that sent the original inner packet 735 and discards the message if the nonce does not match. Otherwise, the 736 Server re-encapsulates the SCMP message in a new outer header that 737 uses the source address, destination address and nonce parameters 738 associated with the Client's tunnel neighbor state; it then forwards 739 the message to the Client. This arrangement is necessary to allow 740 SCMP messages to flow through any NATs on the path. 
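The Server forwarding decision described above can be illustrated with a short Python sketch. This is only a sketch under stated assumptions, not part of the IRON specification: the table VP_COMPANIONS, the prefixes in it, the locator strings and the function names are all hypothetical, and the outer source address used when re-encapsulating toward a Relay is an assumption (the text above specifies only the outer destination in that case).

```python
import ipaddress
from typing import NamedTuple, Optional


class Companion(NamedTuple):
    """Companion prefixes for one VP (hypothetical provisioning data)."""
    v4: ipaddress.IPv4Network
    v6: ipaddress.IPv6Network


# Hypothetical VP -> companion prefix table, e.g. learned per Section 5.2.
VP_COMPANIONS = {
    ipaddress.ip_network("2001:db8:100::/40"): Companion(
        ipaddress.ip_network("192.0.2.0/24"),
        ipaddress.ip_network("2001:db8::/64"),
    ),
}


class Decision(NamedTuple):
    action: str
    outer_src: Optional[str]
    outer_dst: Optional[str]


def covering_vp(addr):
    """Return the longest VP covering addr, or None if addr is not an EPA."""
    matches = [vp for vp in VP_COMPANIONS
               if vp.version == addr.version and addr in vp]
    return max(matches, key=lambda vp: vp.prefixlen, default=None)


def anycast_address(companion, family):
    """Host number 1 for IPv4, interface identifier 0 for IPv6 (Section 5.1)."""
    net = companion.v4 if family == 4 else companion.v6
    return net.network_address + (1 if family == 4 else 0)


def server_forward(inner_dst, outer_family, server_locator,
                   relay_locator, native_ok):
    """Decide how a Server handles a packet received from one of its Clients."""
    vp = covering_vp(ipaddress.ip_address(inner_dst))
    if vp is None:
        if native_ok:
            # Non-EPA destination and no ingress filtering loss expected.
            return Decision("decapsulate-and-forward", None, None)
        # Assumption: the Server's own locator is used as the new outer source.
        return Decision("re-encapsulate-to-relay", server_locator, relay_locator)
    # EPA destination: rewrite outer addresses toward the anycast address of
    # the companion prefix in the same address family as the outer header.
    anycast = anycast_address(VP_COMPANIONS[vp], outer_family)
    return Decision("rewrite-and-forward", server_locator, str(anycast))
```

For example, with an IPv4 outer header a packet whose inner destination is 2001:db8:100::5 would be rewritten toward 192.0.2.1, while a packet to a non-EPA address such as 198.51.100.7 would simply be decapsulated when native forwarding is possible.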
742 When a Server ('A') receives a SEAL-encapsulated packet from a Relay 743 or from the Internet, if the inner destination address matches an EP 744 in its FIB, 'A' re-encapsulates the packet in a new outer header and 745 forwards it to a Client ('B'), which in turn decapsulates the packet 746 and forwards it to the correct end system in the EUN. If 'B' has 747 left notice with 'A' that it has moved to a new Server ('C'), 748 however, 'A' will instead forward the packet to 'C' and also send an 749 SCMP redirect message back to the source of the packet. In this way, 750 'B' can leave behind forwarding information when changing between 751 Servers 'A' and 'C' (e.g., due to mobility events) without exposing 752 packets to loss. 754 6.3. IRON Relay Router Operation 756 After each Relay has synchronized its VPs (see Section 5.1), it 757 advertises the full set of the company's VPs and companion prefixes 758 into the IPv4 and IPv6 Internet BGP routing systems. These prefixes 759 will be represented as ordinary routing information in the BGP, and 760 any packets originating from the IPv4 or IPv6 Internet destined to an 761 address covered by one of the prefixes will be forwarded to one of 762 the VPC overlay network's Relays. 764 When a Relay receives a packet from the Internet destined to an EPA 765 covered by one of its VPs, it behaves as an ordinary IP router. In 766 particular, the Relay looks in its FIB to discover a locator of the 767 Server that serves the EP that covers the destination address. The 768 Relay then simply encapsulates the packet with its own locator as the 769 outer source address and the locator of the Server as the outer 770 destination address and forwards the packet to the Server. 772 When a Relay receives a packet from the Internet destined to one of 773 its subnet router anycast addresses, it discards the packet if it is 774 not SEAL-encapsulated. 
If the packet is an SCMP SRS message, the 775 Relay instead sends an SRA message back to the source listing the 776 locator addresses of nearby Servers then discards the message. The 777 Relay otherwise discards all other SCMP messages. 779 If the packet is an ordinary SEAL packet (i.e., one that encapsulates 780 an inner packet) the Relay sends an SCMP redirect message of the same 781 address family back to the source with the locator of the Server that 782 serves the EPA destination in the inner packet as the redirected 783 target. The source and destination addresses of the SCMP redirect 784 message use the outer destination and source addresses of the 785 original packet, respectively. After sending the redirect message, 786 the Relay then rewrites the outer destination address of the SEAL- 787 encapsulated packet to the locator of the Server and forwards the 788 revised packet to the Server. Note that in this arrangement any 789 errors that occur on the path between the Relay and the Server will 790 be delivered to the original source but with a different destination 791 address due to this Relay address rewriting. 793 6.4. IRON Reference Operating Scenarios 795 The IRON supports communications when one or both hosts are located 796 within EP-addressed EUNs regardless of whether the EPs are 797 provisioned by the same VPC or by different VPCs. When both hosts 798 are within IRON EUNs, route redirections that eliminate unnecessary 799 Servers and Relays from the path are possible. When only one host is 800 within an IRON EUN, however, route optimization cannot be used. The 801 following sections discuss the two scenarios. 803 6.4.1. Both Hosts Within IRON EUNs 805 When both hosts are within IRON EUNs, it is sufficient to consider 806 the scenario in a unidirectional fashion, i.e., by tracing packet 807 flows only in the forward direction from the source host to 808 destination host. 
The reverse direction can be considered 809 separately, and incurs the same considerations as for the forward 810 direction. 812 In this scenario, the initial packets of a flow produced by a source 813 host within an EUN connected to the IRON by a Client must flow 814 through both the Server of the source host and a Relay of the 815 destination host, but route optimization can eliminate these elements 816 from the path for subsequent packets in the flow. Figure 6 shows the 817 flow of initial packets from host A to host B within two IRON EUNs 818 (the same scenario applies whether the two EUNs are within the same 819 VPC overlay network or different overlay networks): 821 (Diagram of the IRON overlaid on the native Internet. Elements shown: Host A, IRON EUN A, Client(A), ISP A, Server(A), the Internet, Relay(B), Server(B), ISP B, Client(B), IRON EUN B and Host B, with a redirect marked toward Client(A).) 850 Figure 6: Initial Packet Flow Before Redirects 852 With reference to Figure 6, host A sends packets destined to host B 853 via its network interface connected to EUN A. 
Routing within EUN A 854 will direct the packets to Client(A) as a default router for the EUN 855 which then uses VET and SEAL to encapsulate them in outer headers 856 with its locator address as the outer source address and the locator 857 address of Server(A) as the outer destination address. Client(A) 858 then simply forwards the encapsulated packets into its ISP network 859 connection that provided its locator. The ISP will forward the 860 encapsulated packets into the Internet without filtering since the 861 (outer) source address is topologically correct. Once the packets 862 have been forwarded into the Internet, routing will direct them to 863 Server(A). 865 Server(A) receives the encapsulated packets from Client(A) then 866 rewrites the outer source address to one of its own locator 867 addresses, and rewrites the outer destination address to the subnet 868 router anycast address of the appropriate address family associated 869 with the inner destination address. Server(A) then forwards the 870 revised encapsulated packets into the Internet where routing will 871 direct them to Relay(B) which services the VPC overlay network 872 associated with host B. 874 Relay(B) will intercept the encapsulated packets from Server(A) then 875 check its FIB to discover an entry that covers inner destination 876 address B with Server(B) as the next hop. Relay(B) then returns SCMP 877 redirect messages to Server(A) (*), rewrites the outer destination 878 address of the encapsulated packets to the locator address of 879 Server(B), and forwards these revised packets to Server(B). 881 Server(B) will receive the encapsulated packets from Relay(B) then 882 check its FIB to discover an entry that covers destination address B 883 with Client(B) as the next hop. Server(B) then re-encapsulates the 884 packets in a new outer header that uses the source address, 885 destination address and nonce parameters associated with the tunnel 886 neighbor state for Client(B). 
Server(B) then forwards these re- 887 encapsulated packets into the Internet, where routing will direct 888 them to Client(B). Client(B) will in turn decapsulate the packets 889 and forward the inner packets to host B via EUN B. 891 (*) Note that after the initial flow of packets, Server(A) will have 892 received one or more SCMP redirect messages from Relay(B) listing 893 Server(B) as a better next hop. Server(A) will in turn forward the 894 redirects to Client(A), which will establish unidirectional tunnel 895 neighbor state and thereafter forward its encapsulated packets 896 directly to the locator address of Server(B) without involving either 897 Server(A) or Relay(B) as shown in Figure 7: 899 (Diagram of the IRON overlaid on the native Internet. Elements shown: Host A, IRON EUN A, Client(A), ISP A, the Internet, Server(B), ISP B, Client(B), IRON EUN B and Host B.) 928 Figure 7: Sustained Packet Flow After Redirects 930 6.4.2. Mixed IRON and Non-IRON Hosts 932 When one host is within an IRON EUN and the other is in a non-IRON 933 EUN (i.e., one that connects to the native Internet instead of the 934 IRON), the IA elements involved depend on the packet flow directions. 935 The cases are described in the following sections. 937 6.4.2.1. 
From IRON Host A to Non-IRON Host B 939 Figure 8 depicts the IRON reference operating scenario for packets 940 flowing from Host A in an IRON EUN to Host B in a non-IRON EUN: 942 (Diagram of the IRON alongside the native Internet. Elements shown: Host A, IRON EUN A, Client(A), ISP A, Server(A) and Relay(A) drawn as one unit, ISP B, Router B, non-IRON EUN B and Host B.) 971 Figure 8: From IRON Host A to Non-IRON Host B 973 In this scenario, host A sends packets destined to host B via its 974 network interface connected to IRON EUN A. Routing within EUN A will 975 direct the packets to Client(A) as a default router for the EUN which 976 then uses VET and SEAL to encapsulate them in outer headers with its 977 locator address as the outer source address and the locator address 978 of Server(A) as the outer destination address. The ISP will pass the 979 packets without filtering since the (outer) source address is 980 topologically correct. Once the packets have been released into the 981 native Internet, routing will direct them to Server(A). 983 Server(A) receives the encapsulated packets from Client(A) then re- 984 encapsulates and forwards them to Relay(A), which simply decapsulates 985 them and forwards the unencapsulated packets into the Internet. 
Once 986 the packets are released into the Internet, routing will direct them 987 to the final destination B. (Note that Server(A) and Relay(A) are 988 depicted in Figure 8 as two halves of a unified gateway. In that 989 case, the "forwarding" between Server(A) and Relay(A) is a zero- 990 instruction imaginary operation within the gateway.) 992 This scenario always involves a Server and Relay owned by the VPC 993 that provides service to IRON EUN A. It therefore imparts a cost that 994 would need to be borne by either the VPC or its customers. 996 6.4.2.2. From Non-IRON Host B to IRON Host A 998 Figure 9 depicts the IRON reference operating scenario for packets 999 flowing from Host B in a non-IRON EUN to Host A in an IRON EUN: 1001 (Diagram of the IRON alongside the native Internet. Elements shown: Host B, non-IRON EUN B, Router B, ISP B, Relay(A) and Server(A) drawn as one unit, ISP A, Client(A), IRON EUN A and Host A.) 1030 Figure 9: From Non-IRON Host B to IRON Host A 1032 In this scenario, host B sends packets destined to host A via its 1033 network interface connected to non-IRON EUN B. Routing will direct 1034 the packets to Relay(A) which then forwards them to Server(A) using 1035 encapsulation if necessary. 
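The Relay(A) step above relies on the FIB lookup described in Section 6.3: the Relay finds the Server that serves the EP covering the destination address, then encapsulates with its own locator as the outer source. A minimal Python sketch of that lookup follows. The class name RelayFib, its methods and all addresses are hypothetical illustrations, and the sketch assumes the FIB is simply a table of EP-to-Server mappings announced and withdrawn by Servers as described in Section 5.2.

```python
import ipaddress


class RelayFib:
    """Hypothetical Relay FIB holding EP -> Server locator mappings."""

    def __init__(self, relay_locator):
        self.relay_locator = relay_locator
        self._routes = {}  # ip_network (EP) -> Server locator string

    def announce(self, ep, server_locator):
        """Record an EP announcement reported by a Server."""
        self._routes[ipaddress.ip_network(ep)] = server_locator

    def withdraw(self, ep):
        """Remove an EP when the Server withdraws its announcement."""
        self._routes.pop(ipaddress.ip_network(ep), None)

    def lookup(self, dst):
        """Longest-prefix match: locator of the Server serving dst, or None."""
        addr = ipaddress.ip_address(dst)
        matches = [ep for ep in self._routes
                   if ep.version == addr.version and addr in ep]
        if not matches:
            return None
        return self._routes[max(matches, key=lambda ep: ep.prefixlen)]

    def encapsulate(self, dst):
        """Return (outer source, outer destination) for a packet to dst."""
        server = self.lookup(dst)
        if server is None:
            return None
        return (self.relay_locator, server)
```

A Relay at the hypothetical locator 203.0.113.1 that has learned 2001:db8:100:1::/64 from a Server at 203.0.113.10 would encapsulate a packet for 2001:db8:100:1::7 with the outer address pair (203.0.113.1, 203.0.113.10).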
1037 Server(A) will then check its FIB to discover an entry that covers 1038 destination address A with Client(A) as the next hop. Server(A) then 1039 (re-)encapsulates the packets in an outer header that uses the source 1040 address, destination address and nonce parameters associated with the 1041 tunnel neighbor state for Client(A). Server(A) next forwards these 1042 (re-)encapsulated packets into the Internet, where routing will 1043 direct them to Client(A). Client(A) will in turn decapsulate the 1044 packets and forward the inner packets to host A via its network 1045 interface connected to IRON EUN A. 1047 This scenario always involves a Server and Relay owned by the VPC 1048 that provides service to IRON EUN A. It therefore imparts a cost that 1049 would need to be borne by either the VPC or its customers. 1051 6.5. Mobility, Multihoming and Traffic Engineering Considerations 1053 While IRON Servers and Relays can be considered as fixed 1054 infrastructure, Clients may need to move between different network 1055 points of attachment, connect to multiple ISPs, or explicitly manage 1056 their traffic flows. The following sections discuss mobility, 1057 multihoming and traffic engineering considerations for IRON client 1058 routers. 1060 6.5.1. Mobility Management 1062 When a Client changes its network point of attachment (e.g., due to a 1063 mobility event), it configures one or more new locators. If the 1064 Client has not moved far away from its previous network point of 1065 attachment, it simply informs its Server of any locator additions or 1066 deletions. This operation is performance-sensitive, and should be 1067 conducted immediately to avoid packet loss. 1069 If the Client has moved far away from its previous network point of 1070 attachment, however, it re-issues the anycast discovery procedure 1071 described in Section 6.1 to discover whether its candidate set of 1072 Servers has changed. 
If the Client's current Server is also included 1073 in the new list received from the VPC, this provides indication that 1074 the Client has not moved far enough to warrant changing to a new 1075 Server. Otherwise, the Client may wish to move to a new Server in 1076 order to reduce routing stretch. This operation is not performance- 1077 critical, and therefore can be conducted over a matter of seconds/ 1078 minutes instead of milliseconds/microseconds. 1080 To move to a new Server, the Client first engages in the EP 1081 registration process with the new Server as described in Section 5.3. 1082 The Client then informs its former Server that it has moved by 1083 providing it with the locator address of the new Server; again, via a 1084 VPC-specific reliable transaction. The former Server will then 1085 garbage-collect the stale FIB entries when their lifetime expires. 1087 This will allow the former Server to redirect existing correspondents 1088 to the new Server so that no packets are lost. 1090 6.5.2. Multihoming 1092 A Client may register multiple locators with its Server. It can 1093 assign metrics with its registrations to inform the Server of 1094 preferred locators, and can select outgoing locators according to its 1095 local preferences. Multihoming is therefore naturally supported. 1097 6.5.3. Inbound Traffic Engineering 1099 A Client can dynamically adjust the priorities of its prefix 1100 registrations with its Server in order to influence inbound traffic 1101 flows. It can also change between Servers when multiple Servers are 1102 available, but should strive for stability in its Server selection in 1103 order to limit VPC network routing churn. 1105 6.5.4. Outbound Traffic Engineering 1107 A Client can select outgoing locators, e.g., based on current QoS 1108 considerations such as minimizing one-way delay or one-way delay 1109 variance. 1111 6.6. 
Renumbering Considerations 1113 As new link layer technologies and/or service models emerge, 1114 customers will be motivated to select their service providers through 1115 healthy competition between ISPs. If a customer's EUN addresses are 1116 tied to a specific ISP, however, the customer may be forced to 1117 undergo a painstaking EUN renumbering process if it wishes to change 1118 to a different ISP [RFC4192][RFC5887]. 1120 When a customer obtains EP prefixes from a VPC, it can change between 1121 ISPs seamlessly and without need to renumber. If the VPC itself 1122 applies unreasonable costing structures for use of the EPs, however, 1123 the customer may be compelled to seek a different VPC and would again 1124 be required to confront a renumbering scenario. The IRON approach to 1125 renumbering avoidance therefore depends on VPCs conducting ethical 1126 business practices and offering reasonable rates. 1128 6.7. NAT Traversal Considerations 1130 The Internet today consists of a global public IPv4 routing and 1131 addressing system with non-IRON EUNs that use either public or 1132 private IPv4 addressing. The latter class of EUNs connects to the 1133 public Internet via Network Address Translators (NATs). When a 1134 Client is located behind a NAT, it selects Servers using the same 1135 procedures as for Clients with public addresses, e.g., it can send 1136 SRS messages to Servers in order to get SRA messages in return. The 1137 only requirement is that the Client must configure its SEAL 1138 encapsulation to use a transport protocol that supports NAT 1139 traversal, namely UDP. 1141 Since the Server maintains state about its Client customers, it can 1142 discover locator information for each Client by examining the UDP 1143 port number and IP address in the outer headers of the Client's 1144 encapsulated packets. 
When there is a NAT in the path, the UDP port 1145 number and IP address in each encapsulated packet will correspond to 1146 state in the NAT box and might not correspond to the actual values 1147 assigned to the Client. The Server can then encapsulate packets 1148 destined to hosts in the Client's EUN within outer headers that use 1149 this IP address and UDP port number. The NAT box will receive the 1150 packets, translate the values in the outer headers, then forward the 1151 packets to the Client. In this sense, the Server's "locator" for the 1152 Client consists of the concatenation of the IP address and UDP port 1153 number. 1155 IRON does not add any new issues to the complications already raised for 1156 NAT traversal or for applications that embed address referrals in 1157 their payload. 1159 6.8. Multicast Considerations 1161 IRON Servers and Relays are topologically positioned to provide 1162 Internet Group Management Protocol (IGMP) / Multicast Listener 1163 Discovery (MLD) proxying for their Clients [RFC4605]. Further 1164 multicast considerations for IRON (e.g., interactions with multicast 1165 routing protocols, traffic scaling, etc.) will be discussed in a 1166 separate document. 1168 6.9. Nested EUN Considerations 1170 Each Client configures a locator that may be taken from an ordinary 1171 non-EPA address assigned by an ISP or from an EPA address taken from 1172 an EP assigned to another Client. In the latter case, the Client is said 1173 to be "nested" within the EUN of another Client, and recursive 1174 nestings of multiple layers of encapsulations may be necessary. 1176 For example, in the network scenario depicted in Figure 10, Client(A) 1177 configures a locator EPA(B) taken from the EP assigned to EUN(B). 1178 Client(B) in turn configures a locator EPA(C) taken from the EP 1179 assigned to EUN(C). Finally, Client(C) configures a locator ISP(D) 1180 taken from a non-EPA address delegated by an ordinary ISP(D). 
Using 1181 this example, the "nested-IRON" case must be examined in which a host 1182 A which configures the address EPA(A) within EUN(A) exchanges packets 1183 with host Z located elsewhere in the Internet. 1185 (Diagram of nested EUNs. Elements shown: Host A with address EPA(A), EUN(A), Client(A) with locator EPA(B), EUN(B), Client(B) with locator EPA(C), EUN(C), Client(C) with locator ISP(D), ISP(D), the Internet, tunnels into the IRON, Server(A), Server(B), Server(C), Relay(Z), Server(Z), Client(Z) and Host Z.) 1217 Figure 10: Nested EUN Example 1219 The two cases of host A sending packets to host Z, and host Z sending 1220 packets to host A, must be considered separately as described below. 1222 6.9.1. Host A Sends Packets to Host Z 1224 Host A first forwards a packet with source address EPA(A) and 1225 destination address Z into EUN(A). Routing within EUN(A) will direct 1226 the packet to Client(A), which encapsulates it in an outer header 1227 with EPA(B) as the outer source address and Server(A) as the outer 1228 destination address then forwards the once-encapsulated packet into 1229 EUN(B). 
Routing within EUN(B) will direct the packet to Client(B), 1230 which encapsulates it in an outer header with EPA(C) as the outer 1231 source address and Server(B) as the outer destination address then 1232 forwards the twice-encapsulated packet into EUN(C). Routing within 1233 EUN(C) will direct the packet to Client(C), which encapsulates it in 1234 an outer header with ISP(D) as the outer source address and Server(C) 1235 as the outer destination address. Client(C) then sends this triple- 1236 encapsulated packet into the ISP(D) network, where it will be routed 1237 into the Internet to Server(C). 1239 When Server(C) receives the triple-encapsulated packet, it removes 1240 the outer layer of encapsulation and forwards the resulting twice- 1241 encapsulated packet into the Internet to Server(B). Next, Server(B) 1242 removes the outer layer of encapsulation and forwards the resulting 1243 once-encapsulated packet into the Internet to Server(A). Next, 1244 Server(A) checks the address type of the inner address 'Z'. If Z is 1245 a non-EPA address, Server(A) simply decapsulates the packet and 1246 forwards it into the Internet. Otherwise, Server(A) rewrites the 1247 outer source and destination addresses of the once-encapsulated 1248 packet and forwards it to Relay(Z). Relay(Z) in turn rewrites the 1249 outer destination address of the packet to the locator for Server(Z), 1250 then forwards the packet and sends a redirect to Server(A) (which 1251 forwards the redirect to Client(A)). Server(Z) then re-encapsulates 1252 the packet and forwards it to Client(Z), which decapsulates it and 1253 forwards the inner packet to host Z. Subsequent packets from 1254 Client(A) will then use Server(Z) as the next hop toward host Z, 1255 which eliminates Server(A) and Relay(Z) from the path. 1257 6.9.2. Host Z Sends Packets to Host A 1259 Whether or not host Z configures an EPA address, its packets destined 1260 to Host A will eventually reach Server(A). 
Server(A) will have a 1261 mapping that lists Client(A) as the next hop toward EPA(A). 1262 Server(A) will then encapsulate the packet with EPA(B) as the outer 1263 destination address and forward the packet into the Internet. 1264 Internet routing will convey this once-encapsulated packet to 1265 Server(B) which will have a mapping that lists Client(B) as the next 1266 hop toward EPA(B). Server(B) will then encapsulate the packet with 1267 EPA(C) as the outer destination address and forward the packet into 1268 the Internet. Internet routing will then convey this twice- 1269 encapsulated packet to Server(C) which will have a mapping that lists 1270 Client(C) as the next hop toward EPA(C). Server(C) will then 1271 encapsulate the packet with ISP(D) as the outer destination address 1272 and forward the packet into the Internet. Internet routing will then 1273 convey this triple-encapsulated packet to Client(C). 1275 When the triple-encapsulated packet arrives at Client(C), it strips 1276 the outer layer of encapsulation and forwards the twice-encapsulated 1277 packet to EPA(C) which is the locator address of Client(B). When 1278 Client(B) receives the twice-encapsulated packet, it strips the outer 1279 layer of encapsulation and forwards the once-encapsulated packet to 1280 EPA(B) which is the locator address of Client(A). When Client(A) 1281 receives the once-encapsulated packet, it strips the outer layer of 1282 encapsulation and forwards the unencapsulated packet to EPA(A) which 1283 is the host address of host A. 1285 7. Implications for the Internet 1287 The IRON architecture envisions a hybrid routing/mapping system that 1288 benefits from both the shortest-path routing afforded by pure dynamic 1289 routing systems and the routing scaling suppression afforded by pure 1290 mapping systems. IRON therefore targets the elusive "sweet spot" 1291 that pure routing and pure mapping systems alone cannot satisfy. 

   The IRON system requires a deployment of new routers/servers
   throughout the Internet and/or provider networks to maintain
   well-balanced virtual overlay networks. These routers/servers can be
   deployed incrementally without disruption to existing Internet
   infrastructure and appropriately managed to provide acceptable
   service levels to customers.

   End-to-end traffic that traverses an IRON virtual overlay network may
   experience delay variance between the initial packets and subsequent
   packets of a flow. This is due to the IRON system allowing longer
   path stretch for initial packets followed by timely route
   optimizations to utilize better next hop routers/servers for
   subsequent packets.

   IRON virtual overlay networks also work seamlessly with existing and
   emerging services within the native Internet. In particular,
   customers serviced by IRON virtual overlay networks will receive the
   same service enjoyed by customers serviced by non-IRON service
   providers. Internet services already deployed within the native
   Internet also need not make any changes to accommodate IRON virtual
   overlay network customers.

   The IRON system operates between routers within provider networks and
   end user networks. Within these networks, the underlying paths
   traversed by the virtual overlay networks may comprise links that
   accommodate varying MTUs. While the IRON system imposes an
   additional per-packet overhead that may cause the size of packets to
   become slightly larger than the underlying path can accommodate, IRON
   routers have a method for naturally detecting and tuning out all
   instances of path MTU underruns.
   In some cases, these MTU underruns may need to be reported back to
   the original hosts; however, the system will also allow for MTUs much
   larger than those typically available in current Internet paths to be
   discovered and utilized as more links with larger MTUs are deployed.

   Finally, and perhaps most importantly, the IRON system provides an
   in-built mobility management and multihoming capability that allows
   end user devices and networks to move about freely while both
   imparting minimal oscillations in the routing system and maintaining
   generally shortest-path routes. This mobility management is afforded
   through the very nature of the IRON customer/provider relationship,
   and therefore requires no adjunct mechanisms. The mobility
   management and multihoming capabilities are further supported by
   forward-path reachability detection that provides "hints of forward
   progress" in the same spirit as for IPv6 ND.

8. Additional Considerations

   Considerations for the scalability of Internet Routing due to
   multihoming, traffic engineering and provider-independent addressing
   are discussed in [I-D.narten-radir-problem-statement]. Other scaling
   considerations specific to IRON are discussed in Appendix B.

   Route optimization considerations for mobile networks are found in
   [RFC5522].

9. Related Initiatives

   IRON builds upon the concepts of the RANGER architecture [RFC5720],
   and therefore inherits the same set of related initiatives. The
   Internet Research Task Force (IRTF) Routing Research Group (RRG)
   mentions IRON in its recommendation for a routing architecture
   [I-D.irtf-rrg-recommendation].

   Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in
   Increasing Scopes (AIS) [I-D.zhang-evolution] provide the basis for
   the Virtual Prefix concepts.

   Internet vastly improved plumbing (Ivip) [I-D.whittle-ivip-arch] has
   contributed valuable insights, including the use of real-time
   mapping. The use of Servers as mobility anchor points is directly
   influenced by Ivip's associated TTR mobility extensions [TTRMOB].

   [I-D.bernardos-mext-nemo-ro-cr] discussed a route optimization
   approach using a Correspondent Router (CR) model. The IRON Server
   construct is similar to the CR concept described in this work;
   however, the manner in which customer EUNs coordinate with Servers is
   different and is based on the redirection model associated with NBMA
   links.

   Numerous publications have proposed NAT traversal techniques. The
   NAT traversal techniques adapted for IRON were inspired by the Simple
   Address Mapping for Premises Legacy Equipment (SAMPLE) proposal
   [I-D.carpenter-softwire-sample].

10. IANA Considerations

   There are no IANA considerations for this document.

11. Security Considerations

   Security considerations that apply to tunneling in general are
   discussed in [I-D.ietf-v6ops-tunnel-security-concerns]. Additional
   considerations that apply also to IRON are discussed in RANGER
   [RFC5720], VET [I-D.templin-intarea-vet] and SEAL
   [I-D.templin-intarea-seal].

   The IRON system further depends on mutual authentication of IRON
   Clients to Servers and Servers to Relays. This is accomplished
   through initial authentication exchanges followed by tunnel neighbor
   nonces that can be used to detect off-path attacks. As for all
   Internet communications, the IRON system also depends on Relays
   acting with integrity and not injecting false advertisements into the
   BGP (e.g., to mount traffic siphoning attacks).

   Each VPC overlay network requires a means for assuring the integrity
   of the interior routing system so that all Relays and Servers in the
   overlay have a consistent view of Client<->Server bindings. Finally,
   DoS attacks on IRON Relays and Servers can occur when packets with
   spoofed source addresses arrive at high data rates. This issue is no
   different than for any border router in the public Internet today,
   however.

12. Acknowledgements

   The ideas behind this work have benefited greatly from discussions
   with colleagues, some of which appear on the RRG and other IRTF/IETF
   mailing lists. Robin Whittle and Steve Russert co-authored the TTR
   mobility architecture which strongly influenced IRON. Eric
   Fleischman pointed out the opportunity to leverage anycast for
   discovering topologically-close Servers. Thomas Henderson
   recommended a quantitative analysis of scaling properties.

   The following individuals provided essential review input: Jari
   Arkko, Mohamed Boucadair, Stewart Bryant, John Buford, Ralph Droms,
   Wesley Eddy, Adrian Farrel, Dae Young Kim and Robin Whittle.

13. References

13.1. Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              September 1981.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

13.2. Informative References

   [BGPMON]   net, B., "BGPmon.net - Monitoring Your Prefixes,
              http://bgpmon.net/stat.php", June 2010.

   [I-D.bernardos-mext-nemo-ro-cr]
              Bernardos, C., Calderon, M., and I. Soto, "Correspondent
              Router based Route Optimisation for NEMO (CRON)",
              draft-bernardos-mext-nemo-ro-cr-00 (work in progress),
              July 2008.

   [I-D.carpenter-softwire-sample]
              Carpenter, B. and S.
              Jiang, "Legacy NAT Traversal for
              IPv6: Simple Address Mapping for Premises Legacy Equipment
              (SAMPLE)", draft-carpenter-softwire-sample-00 (work in
              progress), June 2010.

   [I-D.ietf-grow-va]
              Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and
              L. Zhang, "FIB Suppression with Virtual Aggregation",
              draft-ietf-grow-va-03 (work in progress), August 2010.

   [I-D.ietf-v6ops-tunnel-security-concerns]
              Krishnan, S., Thaler, D., and J. Hoagland, "Security
              Concerns With IP Tunneling",
              draft-ietf-v6ops-tunnel-security-concerns-04 (work in
              progress), October 2010.

   [I-D.irtf-rrg-recommendation]
              Li, T., "Recommendation for a Routing Architecture",
              draft-irtf-rrg-recommendation-16 (work in progress),
              November 2010.

   [I-D.narten-radir-problem-statement]
              Narten, T., "On the Scalability of Internet Routing",
              draft-narten-radir-problem-statement-05 (work in
              progress), February 2010.

   [I-D.russert-rangers]
              Russert, S., Fleischman, E., and F. Templin, "RANGER
              Scenarios", draft-russert-rangers-05 (work in progress),
              July 2010.

   [I-D.templin-intarea-seal]
              Templin, F., "The Subnetwork Encapsulation and Adaptation
              Layer (SEAL)", draft-templin-intarea-seal-25 (work in
              progress), December 2010.

   [I-D.templin-intarea-vet]
              Templin, F., "Virtual Enterprise Traversal (VET)",
              draft-templin-intarea-vet-19 (work in progress),
              December 2010.

   [I-D.whittle-ivip-arch]
              Whittle, R., "Ivip (Internet Vastly Improved Plumbing)
              Architecture", draft-whittle-ivip-arch-04 (work in
              progress), March 2010.

   [I-D.zhang-evolution]
              Zhang, B. and L. Zhang, "Evolution Towards Global Routing
              Scalability", draft-zhang-evolution-02 (work in progress),
              October 2009.

   [RFC1070]  Hagens, R., Hall, N., and M.
              Rose, "Use of the Internet as
              a subnetwork for experimentation with the OSI network
              layer", RFC 1070, February 1989.

   [RFC2526]  Johnson, D. and S. Deering, "Reserved IPv6 Subnet Anycast
              Addresses", RFC 2526, March 1999.

   [RFC3068]  Huitema, C., "An Anycast Prefix for 6to4 Relay Routers",
              RFC 3068, June 2001.

   [RFC3849]  Huston, G., Lord, A., and P. Smith, "IPv6 Address Prefix
              Reserved for Documentation", RFC 3849, July 2004.

   [RFC4192]  Baker, F., Lear, E., and R. Droms, "Procedures for
              Renumbering an IPv6 Network without a Flag Day", RFC 4192,
              September 2005.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4548]  Gray, E., Rutemiller, J., and G. Swallow, "Internet Code
              Point (ICP) Assignments for NSAP Addresses", RFC 4548,
              May 2006.

   [RFC4605]  Fenner, B., He, H., Haberman, B., and H. Sandick,
              "Internet Group Management Protocol (IGMP) / Multicast
              Listener Discovery (MLD)-Based Multicast Forwarding
              ("IGMP/MLD Proxying")", RFC 4605, August 2006.

   [RFC5214]  Templin, F., Gleeson, T., and D. Thaler, "Intra-Site
              Automatic Tunnel Addressing Protocol (ISATAP)", RFC 5214,
              March 2008.

   [RFC5522]  Eddy, W., Ivancic, W., and T. Davis, "Network Mobility
              Route Optimization Requirements for Operational Use in
              Aeronautics and Space Exploration Mobile Networks",
              RFC 5522, October 2009.

   [RFC5720]  Templin, F., "Routing and Addressing in Networks with
              Global Enterprise Recursion (RANGER)", RFC 5720,
              February 2010.

   [RFC5737]  Arkko, J., Cotton, M., and L. Vegoda, "IPv4 Address Blocks
              Reserved for Documentation", RFC 5737, January 2010.

   [RFC5743]  Falk, A., "Definition of an Internet Research Task Force
              (IRTF) Document Stream", RFC 5743, December 2009.

   [RFC5887]  Carpenter, B., Atkinson, R., and H. Flinck, "Renumbering
              Still Needs Work", RFC 5887, May 2010.

   [TTRMOB]   Whittle, R. and S. Russert, "TTR Mobility Extensions for
              Core-Edge Separation Solutions to the Internet's Routing
              Scaling Problem,
              http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf",
              August 2008.

Appendix A. IRON VPs Over Internetworks with Different Address Families

   The IRON architecture leverages the routing system by providing
   generally shortest-path routing for packets with EPA addresses from
   VPs that match the address family of the underlying Internetwork.
   When the VPs are of an address family that is not routable within the
   underlying Internetwork, however (e.g., when OSI/NSAP [RFC4548] VPs
   are used within an IPv4 Internetwork), a global mapping database is
   required to allow Servers to map VPs to companion prefixes taken from
   address families that are routable within the Internetwork. For
   example, an IPv6 VP (e.g., 2001:DB8::/32) could be paired with a
   companion IPv4 prefix (e.g., 192.0.2.0/24) so that encapsulated IPv6
   packets can be forwarded over IPv4-only Internetworks.

   Every VP in the IRON must therefore be represented in a globally
   distributed Master VP database (MVPd) that maintains VP-to-companion
   prefix mappings for all VPs in the IRON. The MVPd is maintained by a
   globally-managed assigned numbers authority in the same manner as the
   Internet Assigned Numbers Authority (IANA) currently maintains the
   master list of all top-level IPv4 and IPv6 delegations. The database
   can be replicated across multiple servers for load balancing much in
   the same way that FTP mirror sites are used to manage software
   distributions.

   Upon startup, each Server discovers the full set of VPs for the IRON
   by reading the MVPd. The Server reads the MVPd from a nearby server
   and periodically checks the server for deltas since the database was
   last read.
   After reading the MVPd, the Server has a full list of
   VP-to-companion prefix mappings.

   The Server can then forward packets toward EPAs covered by a VP by
   encapsulating them in an outer header of the VP's companion prefix
   address family and using any address taken from the companion prefix
   as the outer destination address. The companion prefix therefore
   serves as an anycast prefix.

   Possible encapsulations in this model include IPv6-in-IPv4,
   IPv4-in-IPv6, OSI/CLNP-in-IPv6, OSI/CLNP-in-IPv4, etc.

Appendix B. Scaling Considerations

   Scaling aspects of the IRON architecture have strong implications for
   its applicability in practical deployments. Scaling must be
   considered along multiple vectors including interdomain core routing
   scaling, scaling to accommodate large numbers of customer EUNs,
   traffic scaling, state requirements, etc.

   In terms of routing scaling, each VPC will advertise one or more VPs
   into the global Internet routing system from which EPs are delegated
   to customer EUNs. Routing scaling will therefore be minimized when
   each VP covers many EPs. For example, the IPv6 prefix 2001:DB8::/32
   contains 2^24 ::/56 EP prefixes for assignment to EUNs. The IRON
   could therefore accommodate 2^32 ::/56 EPs with only 2^8 ::/32 VPs
   advertised in the interdomain routing core. (When even longer EP
   prefixes are used, e.g., /64s assigned to individual handsets in a
   cellular provider network, considerable numbers of EUNs can be
   represented within only a single VP.) Each VP also has an associated
   anycast companion prefix; hence, there will be one anycast prefix
   advertised into the global routing system for each VP.
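
   As an illustrative sketch only (it is not part of the IRON
   specification), the prefix arithmetic above can be checked with
   Python's standard ipaddress module. The helper name count_eps and
   the constants EPS_PER_VP, NUM_VPS and TOTAL_EPS are hypothetical
   names introduced here for illustration; the prefix 2001:DB8::/32 and
   the /56 and /64 EP lengths are taken from the text.

```python
import ipaddress


def count_eps(vp: str, ep_prefix_len: int) -> int:
    """Return how many EPs of length ep_prefix_len fit inside one VP."""
    network = ipaddress.ip_network(vp)
    if not network.prefixlen <= ep_prefix_len <= network.max_prefixlen:
        raise ValueError("EP prefix length must lie within the VP")
    # Each extra bit of prefix length doubles the number of sub-prefixes.
    return 2 ** (ep_prefix_len - network.prefixlen)


# One /32 VP (documentation prefix from the text) divided into /56 EPs.
EPS_PER_VP = count_eps("2001:db8::/32", 56)

# Assumed deployment size from the text: 2^8 /32 VPs in the core.
NUM_VPS = 2 ** 8
TOTAL_EPS = NUM_VPS * EPS_PER_VP

if __name__ == "__main__":
    # bit_length() - 1 recovers the exponent of an exact power of two.
    print(f"/56 EPs per /32 VP: 2^{EPS_PER_VP.bit_length() - 1}")
    print(f"/56 EPs in {NUM_VPS} VPs: 2^{TOTAL_EPS.bit_length() - 1}")
    per_vp_64 = count_eps("2001:db8::/32", 64)
    print(f"/64 EPs per /32 VP: 2^{per_vp_64.bit_length() - 1}")
```

   The same helper shows why longer EP prefixes reduce the number of
   VPs needed: with /64 EPs, a single /32 VP already covers 2^32 EUNs.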

   In terms of traffic scaling for Relays, each Relay represents an ASBR
   of a "shell" enterprise network that simply directs arriving traffic
   packets with EPA destination addresses towards Servers that service
   customer EUNs. Moreover, the Relay sheds traffic destined to EPAs
   through redirection which removes it from the path for the vast
   majority of traffic packets. On the other hand, each Relay must
   handle all traffic packets forwarded between its customer EUNs and
   the non-IRON Internet. The scaling concerns for this latter class of
   traffic are no different than for ASBR routers that connect large
   enterprise networks to the Internet. In terms of traffic scaling for
   Servers, each Server services a set of the VPC overlay network's
   customer EUNs. The Server services all traffic packets destined to
   its EUNs but only services the initial packets of flows initiated
   from the EUNs and destined to EPAs. Therefore, traffic scaling for
   EPA-addressed traffic is an asymmetric consideration and is
   proportional to the number of EUNs each Server serves.

   In terms of state requirements for Relays, each Relay maintains a
   list of all Servers in the VPC overlay network as well as FIB entries
   for all customer EUNs that each Server serves. This state is
   therefore dominated by the number of EUNs in the VPC overlay network.
   Sizing the Relay to accommodate state information for all EUNs is
   therefore required during VPC overlay network planning. In terms of
   state requirements for Servers, each Server maintains tunnel neighbor
   state for each of the customer EUNs it serves but need not keep state
   for all EUNs in the VPC overlay network. Finally, neither Relays nor
   Servers need keep state for final destinations of outbound traffic.

   Clients source and sink all traffic packets originating from or
   destined to the customer EUN.
   Therefore traffic scaling considerations for Clients are the same as
   for any site border router. Clients also retain state for the
   Servers for final destinations of outbound traffic flows. This can
   be managed as soft state, since stale entries purged from the cache
   will be refreshed when new traffic packets are sent.

Author's Address

   Fred L. Templin (editor)
   Boeing Research & Technology
   P.O. Box 3707 MC 7L-49
   Seattle, WA 98124
   USA

   Email: fltemplin@acm.org