2 Internet Research Task Force F. Templin, Ed. 
3 (IRTF) Boeing Research & Technology 4 Internet-Draft December 20, 2010 5 Intended status: Experimental 6 Expires: June 23, 2011 8 The Internet Routing Overlay Network (IRON) 9 draft-templin-iron-14.txt 11 Abstract 13 Since the Internet must continue to support escalating growth due to 14 increasing demand, it is clear that current routing architectures and 15 operational practices must be updated. This document proposes an 16 Internet Routing Overlay Network (IRON) that supports sustainable 17 growth through Provider Independent addressing while requiring no 18 changes to end systems and no changes to the existing routing system. 19 IRON further addresses other important issues including routing 20 scaling, mobility management, multihoming, traffic engineering and 21 NAT traversal. While business considerations are an important 22 determining factor for widespread adoption, they are out of scope for 23 this document. This document is a product of the IRTF Routing 24 Research Group. 26 Status of this Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current Internet- 34 Drafts is at http://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on June 23, 2011. 43 Copyright Notice 45 Copyright (c) 2010 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 
48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (http://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document. Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 Table of Contents 60 1. Introduction 61 2. Terminology 62 3. The Internet Routing Overlay Network 63 3.1. IRON Client Router 64 3.2. IRON Serving Router 65 3.3. IRON Relay Router 66 4. IRON Organizational Principles 67 5. IRON Initialization 68 5.1. IRON Relay Router Initialization 69 5.2. IRON Serving Router Initialization 70 5.3. IRON Client Router Initialization 71 6. IRON Operation 72 6.1. IRON Client Router Operation 73 6.2. IRON Serving Router Operation 74 6.3. IRON Relay Router Operation 75 6.4. IRON Reference Operating Scenarios 76 6.4.1. Both Hosts Within IRON EUNs 77 6.4.2. Mixed IRON and Non-IRON Hosts 78 6.5. Mobility, Multihoming and Traffic Engineering 79 Considerations 80 6.5.1. Mobility Management 81 6.5.2. Multihoming 82 6.5.3. Inbound Traffic Engineering 83 6.5.4. Outbound Traffic Engineering 84 6.6. Renumbering Considerations 85 6.7. NAT Traversal Considerations 86 6.8. Nested EUN Considerations 87 6.8.1. Host A Sends Packets to Host Z 88 6.8.2. Host Z Sends Packets to Host A 89 7. Implications for the Internet 90 8. Additional Considerations 91 9. Related Initiatives 92 10. IANA Considerations 93 11. Security Considerations 94 12. Acknowledgements 95 13. References 96 13.1. Normative References 97 13.2. Informative References 98 Appendix A. IRON VPs Over Internetworks with Different 99 Address Families 100 Appendix B. Scaling Considerations 101 Author's Address 
103 1. Introduction 105 Growth in the number of entries instantiated in the Internet routing 106 system has led to concerns for unsustainable routing scaling 107 [I-D.narten-radir-problem-statement]. Operational practices such as 108 increased use of multihoming with IPv4 Provider-Independent (PI) 109 addressing are resulting in more and more fine-grained prefixes 110 injected into the routing system from more and more end-user 111 networks. 
Furthermore, the forthcoming depletion of the public IPv4 112 address space has raised concerns about both increased address space 113 fragmentation (leading to yet further routing table entries) and an 114 impending address space run-out scenario. At the same time, the IPv6 115 routing system is beginning to see growth in IPv6 Provider-Aggregated 116 (PA) prefixes [BGPMON] which must be managed in order to avoid the 117 same routing scaling issues the IPv4 Internet now faces. Since the 118 Internet must continue to scale to accommodate increasing demand, it 119 is clear that new routing methodologies and operational practices are 120 needed. 122 Several related works have investigated routing scaling issues. 123 Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in 124 Increasing Scopes (AIS) [I-D.zhang-evolution] are global routing 125 proposals that introduce routing overlays with Virtual Prefixes (VPs) 126 to reduce the number of entries required in each router's Forwarding 127 Information Base (FIB) and Routing Information Base (RIB). Routing 128 and Addressing in Networks with Global Enterprise Recursion (RANGER) 129 [RFC5720] examines recursive arrangements of enterprise networks that 130 can apply to a very broad set of use case scenarios 131 [I-D.russert-rangers]. In particular, RANGER supports encapsulation 132 and secure redirection by treating each layer in the recursive 133 hierarchy as a virtual non-broadcast, multiple access (NBMA) "link". 134 RANGER is an architectural framework that includes Virtual Enterprise 135 Traversal (VET) [I-D.templin-intarea-vet] and the Subnetwork 136 Encapsulation and Adaptation Layer (SEAL) [I-D.templin-intarea-seal] 137 as its functional building blocks. 139 This document proposes an Internet Routing Overlay Network (IRON) 140 with goals of supporting sustainable growth while requiring no 141 changes to the existing routing system. 
IRON borrows concepts from 142 VA, AIS and RANGER, and further borrows concepts from the Internet 143 Vastly Improved Plumbing (Ivip) [I-D.whittle-ivip-arch] architecture 144 proposal along with its associated Translating Tunnel Router (TTR) 145 mobility extensions [TTRMOB]. Indeed, the TTR model to a great 146 degree inspired the IRON mobility architecture design discussed in 147 this document. The Network Address Translator (NAT) traversal 148 techniques adapted for IRON were inspired by the Simple Address 149 Mapping for Premises Legacy Equipment (SAMPLE) proposal 150 [I-D.carpenter-softwire-sample]. 152 IRON specifically seeks to provide scalable PI addressing without 153 changing the current BGP [RFC4271] routing system. IRON observes the 154 Internet Protocol standards [RFC0791][RFC2460]. Other network layer 155 protocols that can be encapsulated within IP packets (e.g., OSI/CLNP 156 [RFC1070], etc.) are also within scope. 158 The IRON is a global routing system comprising virtual overlay 159 networks managed by Virtual Prefix Companies (VPCs) that own and 160 manage Virtual Prefixes (VPs) from which End User Network (EUN) PI 161 prefixes (EPs) are delegated to customer sites. The IRON is 162 motivated by a growing customer demand for multihoming, mobility 163 management and traffic engineering while using stable PI addressing 164 to avoid network renumbering [RFC4192][RFC5887]. The IRON uses the 165 existing IPv4 and IPv6 global Internet routing systems as virtual 166 links for tunneling inner network protocol packets within outer IPv4 167 or IPv6 headers (see: Section 3). The IRON requires deployment of a 168 small number of new BGP core routers and supporting servers, as well 169 as IRON-aware routers/servers in customer EUNs. No modifications to 170 hosts, and no modifications to most routers are required. 172 While the IRON architecture addresses network mobility, host mobility 173 considerations are outside the scope of this document. 
IP multicast 174 considerations are also out of scope. 176 Note: This document is offered in compliance with Internet Research 177 Task Force (IRTF) document stream procedures [RFC5743]; it is not an 178 IETF product and is not a standard. The views in this document were 179 considered controversial by the IRTF Routing Research Group (RRG) but 180 the RG reached a consensus that the document should still be 181 published. The document will undergo a period of review within the 182 RRG and through selected expert reviewers prior to publication. The 183 following sections discuss details of the IRON architecture. 185 2. Terminology 187 This document makes use of the following terms: 189 End User Network (EUN) 190 an edge network that connects an organization's devices (e.g., 191 computers, routers, printers, etc.) to the Internet. 193 End User Network PI Prefix (EP) 194 a more-specific Provider-Independent (PI) prefix derived from a 195 Virtual Prefix (VP) (e.g., an IPv4 /28, an IPv6 /56, etc.) and 196 delegated to an EUN by a Virtual Prefix Company (VPC). 198 End User Network PI Address (EPA) 199 a network layer address belonging to an EP and assigned to the 200 interface of an end system in an EUN. 202 Forwarding Information Base (FIB) 203 a data structure containing network prefix to next-hop mappings; 204 usually maintained in a router's fast-path processing lookup 205 tables. 207 Internet Routing Overlay Network (IRON) 208 a composite virtual overlay network that comprises the union of 209 all VPC overlay networks configured over a common Internetwork. 210 The IRON supports routing through encapsulation of inner packets 211 with EPA addresses within outer headers that use locator 212 addresses. 214 IRON Client Router ("Client") 215 a customer's router (or host with embedded gateway function) that 216 logically connects the customer's EUNs and their associated EPs to 217 the IRON via tunnels. 
219 IRON Serving Router ("Server") 220 a VPC's overlay network router that provides forwarding and 221 mapping services for the EPs owned by customer Client routers. 223 IRON Relay Router ("Relay") 224 a VPC's overlay network router that acts as a relay between the 225 IRON and the native Internet. 227 IRON Router (IR) 228 generically refers to any of an IRON Client/Server/Relay router. 230 Internet Service Provider (ISP) 231 a service provider which connects customer EUNs to the underlying 232 Internetwork. In other words, an ISP is responsible for providing 233 basic Internet connectivity for customer EUNs. 235 Locator 236 an IP address assigned to the interface of a router or end system 237 within a public or private network. Locators taken from public IP 238 prefixes are routable on a global basis, while locators taken from 239 private IP prefixes are made public via Network Address 240 Translation (NAT). 242 Provider Aggregated (PA) address or prefix 243 a network layer address or prefix delegated to an EUN by an ISP. 245 Provider Independent (PI) address or prefix 246 a network layer address or prefix delegated to an EUN by a third 247 party independently of the EUN's ISP arrangements. 249 Routing and Addressing in Networks with Global Enterprise Recursion 250 (RANGER) 251 an architectural examination of virtual overlay networks applied 252 to enterprise network scenarios, with implications for a wider 253 variety of use cases. 255 Subnetwork Encapsulation and Adaptation Layer (SEAL) 256 an encapsulation sublayer that provides extended packet 257 identification and a control message protocol to ensure 258 deterministic network-layer feedback. 260 Virtual Enterprise Traversal (VET) 261 a method for discovering border routers and forming dynamic point- 262 to-(multi)point tunnels over enterprise networks (or sites) with 263 varying properties. 265 Virtual Prefix (VP) 266 a PI prefix block (e.g., an IPv4 /16, an IPv6 /20, an OSI NSAP 267 prefix, etc.) 
that is owned and managed by a Virtual Prefix 268 Company (VPC). 270 Virtual Prefix Company (VPC) 271 a company that owns and manages a set of VPs from which it 272 delegates EPs to EUNs. 274 VPC Overlay Network 275 a specialized set of routers deployed by a VPC to service customer 276 EUNs through a virtual overlay network configured over an 277 underlying Internetwork (e.g., the global Internet). 279 3. The Internet Routing Overlay Network 281 The Internet Routing Overlay Network (IRON) is a system of virtual 282 overlay networks configured over a common Internetwork. While the 283 principles presented in this document are discussed within the 284 context of the public global Internet, they can also be applied to 285 any autonomous Internetwork. The rest of this document therefore 286 refers to the terms "Internet" and "Internetwork" interchangeably 287 except in cases where specific distinctions must be made. 289 The IRON consists of IRON Routers (IRs) that automatically tunnel the 290 packets of end-to-end communication sessions within encapsulating 291 headers used for Internet routing. IRs use Virtual Enterprise 292 Traversal (VET) [I-D.templin-intarea-vet] in conjunction with the 293 Subnetwork Encapsulation and Adaptation Layer (SEAL) 294 [I-D.templin-intarea-seal] to encapsulate inner network layer packets 295 within outer headers as shown in Figure 1: 297 +-------------------------+ 298 | Outer headers with | 299 ~ locator addresses ~ 300 | (IPv4 or IPv6) | 301 +-------------------------+ 302 | SEAL Header | 303 +-------------------------+ +-------------------------+ 304 | Inner Packet Header | --> | Inner Packet Header | 305 ~ with EP addresses ~ --> ~ with EP addresses ~ 306 | (IPv4, IPv6, OSI, etc.) | --> | (IPv4, IPv6, OSI, etc.) 
| 307 +-------------------------+ +-------------------------+ 308 | | --> | | 309 ~ Inner Packet Body ~ --> ~ Inner Packet Body ~ 310 | | --> | | 311 +-------------------------+ +-------------------------+ 313 Inner packet Outer packet 314 before encapsulation after encapsulation 316 Figure 1: Encapsulation of Inner Packets Within Outer IP Headers 318 VET specifies the automatic tunneling mechanisms used for 319 encapsulation, while SEAL specifies the format and usage of the SEAL 320 header as well as a set of control messages. Most notably, IRs use 321 the SEAL Control Message Protocol (SCMP) to deterministically 322 exchange and authenticate control messages such as route 323 redirections, indications of Path Maximum Transmission Unit (PMTU) 324 limitations, destination unreachables, etc. 326 The IRON is the union of all virtual overlay networks that are 327 configured over a common underlying Internet and are owned and 328 managed by Virtual Prefix Companies (VPCs). Each such virtual overlay 329 network comprises a set of IRs distributed throughout the Internet to 330 serve highly-aggregated Virtual Prefixes (VPs). VPCs delegate 331 sub-prefixes from their VPs which they lease to customers as End User 332 Network PI prefixes (EPs). The customers in turn assign the EPs to 333 their customer edge IRs which connect their End User Networks (EUNs) 334 to the IRON. 336 VPCs may have no affiliation with the ISP networks from which 337 customers obtain their basic Internet connectivity. Therefore, a 338 customer could procure its summary network services either through a 339 common broker or through separate entities. In that case, the VPC 340 can open for business and begin serving its customers immediately 341 without the need to coordinate its activities with ISPs or with other 342 VPCs. Further details on business considerations are out of scope 343 for this document. 
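As an informal illustration of the VP-to-EP delegation described earlier in this section, the following Python sketch carves fixed-length EPs out of a VP and checks whether a claimed EP falls within a given VP. The function names (delegate_eps, ep_within_vp) and the use of documentation prefixes as stand-in VPs are assumptions made for this example only; the document does not specify how a VPC allocates EPs.

```python
import ipaddress
from itertools import islice


def delegate_eps(vp: str, ep_prefixlen: int, count: int):
    """Return the first `count` EPs of length `ep_prefixlen` carved from `vp`.

    Hypothetical helper: sequential allocation is only one possible policy.
    """
    vp_net = ipaddress.ip_network(vp)
    # subnets() yields the more-specific prefixes in address order.
    return list(islice(vp_net.subnets(new_prefix=ep_prefixlen), count))


def ep_within_vp(ep: str, vp: str) -> bool:
    """True if `ep` is a more-specific prefix of `vp` (same address family)."""
    ep_net = ipaddress.ip_network(ep)
    vp_net = ipaddress.ip_network(vp)
    if ep_net.version != vp_net.version:
        return False
    return ep_net.subnet_of(vp_net)


if __name__ == "__main__":
    # Documentation prefixes stand in for real VPs in this sketch.
    for ep in delegate_eps("192.0.2.0/24", 28, 3):
        print("IPv4 EP:", ep)
    for ep in delegate_eps("2001:db8::/32", 56, 3):
        print("IPv6 EP:", ep)
```

A VPC could apply a check such as ep_within_vp() before accepting a registration for an EP, although the actual ownership proof described in this document relies on certificates rather than on prefix arithmetic alone.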
345 The IRON requires no changes to end systems and no changes to most 346 routers in the Internet. Instead, the IRON comprises IRs that are 347 deployed either as new platforms or as modifications to existing 348 platforms. IRs may be deployed incrementally without disturbing the 349 existing Internet routing system, and act as waypoints (or "cairns") 350 for navigating the IRON. The functional roles for IRs are described 351 in the following sections. 353 3.1. IRON Client Router 355 An IRON client router (or, simply, "Client") is a customer's router 356 (or host with embedded gateway function) that logically connects the 357 customer's EUNs and their associated EPs to the IRON via tunnels as 358 shown in Figure 2. Clients obtain EPs from VPCs and use them to 359 number subnets and interfaces within their EUNs. A Client can be 360 deployed on the same physical platform that also connects the 361 customer's EUNs to its ISPs, but it may also be a separate router or 362 even a standalone server system located within the EUN. (This model 363 applies even if the EUN connects to the ISP via a Network Address 364 Translator (NAT) - see Section 6.7). 365 .-. 366 ,-( _)-. 367 +--------+ .-(_ (_ )-. 368 | Client |--(_ ISP ) 369 +---+----+ `-(______)-' 370 | <= T \ .-. 371 .-. u \ ,-( _)-. 372 ,-( _)-. n .-(_ (- )-. 373 .-(_ (_ )-. n (_ Internet ) 374 (_ EUN ) e `-(______)- 375 `-(______)-' l ___ 376 | s => (:::)-. 377 +----+---+ .-(::::::::) 378 | Host | .-(::::::::::::)-. 379 +--------+ (:::: The IRON ::::) 380 `-(::::::::::::)-' 381 `-(::::::)-' 383 Figure 2: IRON Client Router Connecting EUN to the IRON 385 3.2. IRON Serving Router 387 An IRON serving router (or, simply, "Server") is a VPC's overlay 388 network router that provides forwarding and mapping services for the 389 EPs owned by customer Client routers. 
In typical deployments, a VPC 390 will deploy many Servers around the IRON in a globally-distributed 391 fashion (e.g., as depicted in Figure 3) so that Clients can discover 392 those that are nearby. 394 +--------+ +--------+ 395 | Boston | | Tokyo | 396 | Server | | Server | 397 +--+-----+ ++-------+ 398 +--------+ \ / 399 | Seattle| \ ___ / 400 | Server | \ (:::)-. +--------+ 401 +------+-+ .-(::::::::)------+ Paris | 402 \.-(::::::::::::)-. | Server | 403 (:::: The IRON ::::) +--------+ 404 `-(::::::::::::)-' 405 +--------+ / `-(::::::)-' \ +--------+ 406 | Moscow + | \--- + Sydney | 407 | Server | +----+---+ | Server | 408 +--------+ | Cairo | +--------+ 409 | Server | 410 +--------+ 412 Figure 3: IRON Serving Router Global Distribution Example 414 Each Server acts as a tunnel-endpoint router that forms a 415 bidirectional tunnel with each of its Client customers. Each Server 416 also associates with a set of Relays that can forward packets from 417 the IRON out to the native Internet and vice-versa as discussed in 418 the next section. 420 3.3. IRON Relay Router 422 An IRON Relay Router (or, simply, "Relay") is a VPC's overlay network 423 router that acts as a relay between the IRON and the native Internet. 424 It therefore also serves as an Autonomous System Border Router (ASBR) 425 that is owned and managed by the VPC. 427 Each VPC configures one or more Relays which advertise the company's 428 VPs into the IPv4 and IPv6 global Internet BGP routing systems. Each 429 Relay associates with all of the VPC's overlay network Servers, e.g., 430 via tunnels over the IRON, via a direct interconnect such as an 431 Ethernet cable, etc. The Relay role (as well as its relationship 432 with overlay network Servers) is depicted in Figure 4: 434 .-. 435 ,-( _)-. 436 .-(_ (_ )-. 437 (_ Internet ) 438 `-(______)-' | +--------+ 439 | |--| Server | 440 +----+---+ | +--------+ 441 | Relay |----| +--------+ 442 +--------+ |--| Server | 443 _|| | +--------+ 444 (:::)-. 
(Ethernet) 445 .-(::::::::) 446 +--------+ .-(::::::::::::)-. +--------+ 447 | Server |=(:::: The IRON ::::)=| Server | 448 +--------+ `-(::::::::::::)-' +--------+ 449 `-(::::::)-' 450 || (Tunnels) 451 +--------+ 452 | Server | 453 +--------+ 455 Figure 4: IRON Relay Router Connecting IRON to Native Internet 457 4. IRON Organizational Principles 459 The IRON consists of the union of all VPC overlay networks configured 460 over a common Internetwork (e.g., the public Internet). Each such 461 overlay network represents a distinct "patch" on the Internet 462 "quilt", where the patches are stitched together by tunnels over the 463 links, routers, bridges, etc., that connect the underlying Internetwork. When a 464 new VPC overlay network is deployed, it becomes yet another patch on 465 the quilt. The IRON is therefore a composite overlay network 466 consisting of multiple individual patches, where each patch 467 coordinates its activities independently of all others (with the 468 exception that the Servers of each patch must be aware of all VPs in 469 the IRON). In order to ensure mutual cooperation between all VPC 470 overlay networks, sufficient address space portions of the inner 471 network layer protocol (e.g., IPv4, IPv6, etc.) should be set aside 472 and designated as VP space. 474 Each VPC overlay network in the IRON maintains a set of Relays and 475 Servers that provide services to their Client customers. In order to 476 ensure adequate customer service levels, the VPC should conduct a 477 traffic scaling analysis and distribute sufficient Relays and Servers 478 for the overlay network globally throughout the Internet. Figure 5 479 depicts the logical arrangement of Relays, Servers, and Clients in an 480 IRON virtual overlay network: 482 .-. 483 ,-( _)-. 484 .-(_ (_ )-. 485 (__ Internet _) 486 `-(______)-' 488 <------------ Relays ------------> 489 ________________________ 490 (::::::::::::::::::::::::)-. 
491 .-(:::::::::::::::::::::::::::::) 492 .-(:::::::::::::::::::::::::::::::::)-. 493 (::::::::::: The IRON :::::::::::::::) 494 `-(:::::::::::::::::::::::::::::::::)-' 495 `-(::::::::::::::::::::::::::::)-' 497 <------------ Servers ------------> 498 .-. .-. .-. 499 ,-( _)-. ,-( _)-. ,-( _)-. 500 .-(_ (_ )-. .-(_ (_ )-. .-(_ (_ )-. 501 (__ ISP A _) (__ ISP B _) ... (__ ISP x _) 502 `-(______)-' `-(______)-' `-(______)-' 503 <----------- NATs ------------> 505 <----------- Clients and EUNs -----------> 507 Figure 5: Virtual Overlay Network Organization 509 Each Relay in the VPC overlay network connects the overlay directly 510 to the underlying IPv4 and IPv6 Internets. It also advertises the 511 VPC overlay network's IPv4 VPs into the IPv4 BGP routing system and 512 advertises the overlay network's IPv6 VPs into the IPv6 BGP routing 513 system. Relays will therefore receive packets with EPA destination 514 addresses sent by end systems in the Internet and direct them toward 515 EPA-addressed end systems connected to the VPC overlay network. 517 Each VPC overlay network also manages a set of Servers that connect 518 their Clients and associated EUNs to the IRON and to the IPv6 and 519 IPv4 Internets via their associations with Relays. IRON Servers 520 therefore need not be BGP routers themselves and can be simple 521 commodity hardware platforms. Moreover, the Server and Relay 522 functions can be deployed together on the same physical platform as a 523 unified gateway or they may be deployed on separate platforms (e.g., 524 for load balancing purposes). 526 Each Server maintains a working set of Clients for which it caches 527 EP-to-Client mappings in its Forwarding Information Base (FIB). 
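As an illustration only, the EP-to-Client cache can be pictured as a longest-prefix-match table keyed by EP. The following Python sketch uses hypothetical names (ServerFib, register, unregister, lookup) and a linear scan for clarity; it is an assumption made for this example and does not describe the data structure of any actual IRON Server, which would more likely use a trie or a hardware lookup table.

```python
import ipaddress


class ServerFib:
    """Toy EP-to-Client mapping table with longest-prefix-match lookup."""

    def __init__(self):
        # Maps an EP (ip_network object) to the Client's locator (string).
        self._entries = {}

    def register(self, ep: str, client_locator: str) -> None:
        """Record that `ep` is reachable through the Client at `client_locator`."""
        self._entries[ipaddress.ip_network(ep)] = client_locator

    def unregister(self, ep: str) -> None:
        """Remove the mapping for `ep`, if present."""
        self._entries.pop(ipaddress.ip_network(ep), None)

    def lookup(self, destination: str):
        """Return the locator for the most-specific EP covering `destination`.

        Returns None when no registered EP covers the address.
        """
        addr = ipaddress.ip_address(destination)
        best_net, best_locator = None, None
        for net, locator in self._entries.items():
            if net.version != addr.version or addr not in net:
                continue
            if best_net is None or net.prefixlen > best_net.prefixlen:
                best_net, best_locator = net, locator
        return best_locator


if __name__ == "__main__":
    fib = ServerFib()
    fib.register("192.0.2.16/28", "203.0.113.7")  # documentation addresses
    print(fib.lookup("192.0.2.20"))
    print(fib.lookup("192.0.2.40"))
```

The same table shape, with a Server locator in place of the Client locator, would serve for a Relay's EP-to-Server mappings.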
Each 528 Server also in turn propagates the list of EPs in its working set to 529 each of the Relays in the VPC overlay network via a dynamic routing 530 protocol (e.g., an overlay network internal BGP instance that carries 531 only the EP-to-Server mappings and does not interact with the 532 external BGP routing system). Each Server therefore only needs to 533 track the EPs for its current working set of Clients, while each 534 Relay will maintain a full EP-to-Server mapping table that represents 535 reachability information for all EPs in the VPC overlay network. 537 Customers establish Clients that obtain their basic Internet 538 connectivity from ISPs and connect to Servers to attach their EUNs to 539 the IRON. Each EUN can connect to the IRON via one or multiple 540 Clients as long as the Clients coordinate with one another, e.g., to 541 mitigate EUN partitions. Unlike Relays and Servers, Clients may use 542 private addresses behind one or several layers of NATs. Each Client 543 initially discovers a list of nearby Servers through an anycast 544 discovery process (described below). It then selects one of these 545 nearby Servers and forms a bidirectional tunnel through an initial 546 exchange followed by periodic keepalives. 548 After the Client selects a Server, it forwards initial outbound 549 packets from its EUNs by tunneling them to the Server which in turn 550 forwards them to the nearest Relay within the IRON that serves the 551 final destination. The Client will subsequently receive redirect 552 messages informing it of a more direct route through a Server that 553 serves the final destination EUN. 555 The IRON can also be used to support VPs of network layer address 556 families that cannot be routed natively in the underlying 557 Internetwork (e.g., OSI/CLNP over the public Internet, IPv6 over 558 IPv4-only Internetworks, IPv4 over IPv6-only Internetworks, etc.). 
559 Further details for support of IRON VPs of one address family over 560 Internetworks based on other address families are discussed in 561 Appendix A. 563 5. IRON Initialization 565 IRON initialization entails the startup actions of IRs within the VPC 566 overlay network and customer EUNs. The following sections discuss 567 these startup procedures. 569 5.1. IRON Relay Router Initialization 571 Before its first operational use, each Relay in a VPC overlay network 572 is provisioned with the list of VPs that it will serve as well as the 573 locators for all Servers that belong to the same overlay network. 574 The Relay is also provisioned with external BGP interconnections the 575 same as for any BGP router. 577 Upon startup, the Relay engages in BGP routing exchanges with its 578 peers in the IPv4 and IPv6 Internets the same as for any BGP router. 579 It then connects to all of the Servers in the overlay network (e.g., 580 via a TCP connection over a bidirectional tunnel, via an iBGP route 581 reflector, etc.) for the purpose of discovering EP->Server mappings. 582 After the Relay has fully populated its EP->Server mapping 583 information database, it is said to be "synchronized" with respect to its VPs. 585 After this initial synchronization procedure, the Relay then 586 advertises the overlay network's VPs externally. In particular, the 587 Relay advertises the IPv6 VPs into the IPv6 BGP routing system and 588 advertises the IPv4 VPs into the IPv4 BGP routing system. The Relay 589 additionally advertises an IPv4 /24 companion prefix (e.g., 590 192.0.2.0/24) into the IPv4 routing system and an IPv6 ::/64 591 companion prefix (e.g., 2001:DB8::/64) into the IPv6 routing system 592 (note that these may also be sub-prefixes taken from a VP). 
The 593 Relay then configures the host number '1' in the IPv4 companion 594 prefix (e.g., as 192.0.2.1) and the interface identifier '0' in the 595 IPv6 companion prefix (e.g., as 2001:DB8::0) and assigns the 596 resulting addresses as subnet router anycast addresses 597 [RFC3068][RFC2526] for the VPC overlay network. (See Appendix A for 598 more information on the discovery and use of companion prefixes.) 599 The Relay then engages in ordinary packet forwarding operations. 601 5.2. IRON Serving Router Initialization 603 Before its first operational use, each Server in a VPC overlay 604 network is provisioned with the locators for all Relays that 605 aggregate the overlay network's VPs. In order to support route 606 optimization, the Server must also be provisioned with the list of 607 all VPs in the IRON (i.e., not just the VPs of its own overlay 608 network) so that it can discern EPA and non-EPA addresses. (The 609 Server could therefore be greatly simplified if the list of VPs could 610 be covered within a small number of very short prefixes, e.g., one or 611 a few IPv6 ::/20's). The Server must also discover the VP companion 612 prefix relationships discussed in Section 5.1, e.g., via a global 613 database such as discussed in Appendix A. 615 Upon startup, each Server must connect to all of the Relays within 616 its overlay network (e.g., via a TCP connection over a bidirectional 617 tunnel, via an iBGP route reflector, etc.) for the purpose of 618 reporting its EP->Server mappings. The Server then actively listens 619 for Client customers which register their EP prefixes as part of 620 establishing a bidirectional tunnel. When a new Client registers its 621 EP prefixes, the Server announces the new EP additions to all Relays; 622 when an existing Client unregisters its EP prefixes, the Server 623 withdraws the corresponding announcements. 625 5.3. 
IRON Client Router Initialization 627 Before its first operational use, each Client must obtain one or more 628 EPs from its VPC as well as the companion prefixes associated with 629 the VPC overlay network (see Section 5.1). The Client must also 630 obtain a certificate and a public/private key pair from the VPC that 631 it can later use to prove ownership of its EPs. This implies that 632 each VPC must run its own public key infrastructure to be used only 633 for the purpose of verifying its customers' claimed right to use an 634 EP. Hence, the VPC need not coordinate its public key infrastructure 635 with any other organization. 637 Upon startup, the Client sends an SCMP Router Solicitation (SRS) 638 message to the VPC overlay network subnet router anycast address to 639 discover the nearest Relay. The Relay will return an SCMP Router 640 Advertisement (SRA) message that lists the locator addresses of one or more 641 nearby Servers. (This list is analogous to the ISATAP Potential 642 Router List (PRL) [RFC5214].) 644 After the Client receives an SRA message from the nearby Relay 645 listing the locator addresses of nearby Servers, it sends SRS test 646 messages to one or more of the locator addresses to elicit SRA 647 messages. The Server that configures the locator will include the 648 header of the soliciting SRS message in its SRA message so that the 649 Client can determine the number of hops along the forward path. The 650 Server also includes a metric in its SRA messages indicating its 651 service availability so that the Client can avoid selecting Servers 652 that are overloaded. The Server also includes a challenge/response 653 puzzle that the Client must answer if it wishes to connect to this 654 Server. 656 When the Client receives these SRA messages, it can measure the 657 time between sending the SRS and receiving the SRA as an 658 indication of round-trip delay.
If the Client wishes to enlist the 659 services of a specific Server (e.g., based on the measured 660 performance), it then calculates the answer to the puzzle using its 661 keying information and sends the answer back to the Server in a new 662 SRS message that also contains all of the Client's EP prefixes for 663 which it claims ownership. If the Client solved the puzzle 664 correctly, the Server will send back a new SRA message that includes 665 a non-zero default router lifetime and that signifies the 666 establishment of a bidirectional tunnel. (A zero default router 667 lifetime on the other hand signifies that the Server is currently 668 unable to establish a bidirectional tunnel, e.g., due to heavy load, 669 due to challenge/response failure, etc.) 671 Note that in the above procedure it is essential that the Client 672 select one and only one Server. This is to allow the VPC overlay 673 network mapping system to have one and only one active EP-to-Server 674 mapping at any point in time which shares fate with the Server 675 itself. If this Server fails, the Client will quickly select a new 676 one which will automatically update the VPC overlay network mapping 677 system with a new EP-to-Server mapping. 679 6. IRON Operation 681 Following the IRON initialization detailed in Section 5, IRs engage 682 in the steady-state process of receiving and forwarding packets. All 683 IRs forward encapsulated packets over the IRON using the mechanisms 684 of VET [I-D.templin-intarea-vet] and SEAL [I-D.templin-intarea-seal], 685 while Relays (and in some cases Servers) additionally forward packets 686 to and from the native IPv6 and IPv4 Internets. IRs also use SCMP to 687 coordinate with other IRs, including the process of sending and 688 receiving redirect messages, error messages, etc. (Note however that 689 an IR must not send an SCMP message in response to an SCMP error 690 message.) Each IR operates as specified in the following sub- 691 sections. 693 6.1. 
IRON Client Router Operation 695 After selecting its Server as specified in Section 5.3, the Client 696 should register each of its ISP connections with the Server in order 697 to establish multiple bidirectional tunnels for multihoming purposes. 698 To do so, it sends periodic SRS messages to its Server via each of 699 its ISPs to establish additional bidirectional tunnels and to keep 700 each tunnel alive. These messages need not include challenge/ 701 response mechanisms since prefix proof of ownership was already 702 established in the initial exchange and a nonce in the SEAL header 703 can be used to confirm that the SRS message was sent by the correct 704 Client. This implies that a single nonce is used to represent the 705 set of all bidirectional tunnels between the Client and the Server. 706 Therefore, there are multiple bidirectional tunnels, and the nonce 707 names this "bundle" of tunnels. (The Client and Server may 708 conceptually represent this "bundle" as a single tunnel with multiple 709 locator addresses, however each such locator address must be tested 710 independently in case there are NATs on the path.) 712 If the Client ceases to receive SRA messages from its Server via a 713 specific ISP connection, it marks the Server as unreachable from that 714 address and therefore over that ISP connection. (The Client should 715 also inform its Server of this outage via one of its working ISP 716 connections.) If the Client ceases to receive SRA messages from its 717 Server via multiple ISP connections, it marks the Server as unusable 718 and quickly attempts to establish a bidirectional tunnel with a new 719 Server. The act of establishing the tunnel with a new Server will 720 automatically purge the stale mapping state associated with the old 721 Server, since dynamic routing will propagate the new client/server 722 relationship to the VPC overlay network relay routers. 
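The reachability bookkeeping described above can be sketched as follows. This is only an illustrative sketch and not part of the IRON specification: the class name ServerTunnelBundle, the 30-second SRA timeout, the nonce check on received SRA messages, and the threshold of two failed ISP connections (one reading of "multiple") are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ServerTunnelBundle:
    """Hypothetical Client-side view of the tunnel "bundle" to one Server."""

    nonce: int                  # single nonce naming the whole bundle
    sra_timeout: float = 30.0   # assumed keepalive timeout in seconds
    last_sra: dict = field(default_factory=dict)  # ISP locator -> time of last SRA

    def register_isp(self, locator: str, now: float) -> None:
        # Start tracking one ISP connection (one locator of the bundle).
        self.last_sra[locator] = now

    def on_sra(self, locator: str, nonce: int, now: float) -> bool:
        # Accept an SRA only for a known locator and the bundle's nonce.
        if nonce != self.nonce or locator not in self.last_sra:
            return False
        self.last_sra[locator] = now
        return True

    def unreachable_via(self, now: float) -> list:
        # Locators over which no SRA has arrived within the timeout.
        return sorted(loc for loc, seen in self.last_sra.items()
                      if now - seen > self.sra_timeout)

    def server_unusable(self, now: float, threshold: int = 2) -> bool:
        # Assumed policy: "multiple" failed ISP connections means >= threshold.
        return len(self.unreachable_via(now)) >= threshold
```

A Client would call server_unusable() from its keepalive timer and, when it returns true, begin tunnel establishment with a new Server; each locator is tracked separately because each path may cross a different NAT.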
724 When an end system in an EUN sends a flow of packets to a 725 correspondent, the packets are forwarded through the EUN via normal 726 routing until they reach the Client, which then tunnels the initial 727 packets to its Server as the next hop. In particular, the Client 728 encapsulates each packet in an outer header with its locator as the 729 source address and the locator of its Server as the destination 730 address. Note that after sending the initial packets of a flow, the 731 Client may receive important SCMP messages such as indications of 732 PMTU limitations, redirects that point to a better next hop, etc. It 733 is therefore essential that the Client send the initial packets 734 through its Server to avoid loss of SCMP messages that cannot 735 traverse a NAT in the reverse direction. (The Server also provides a 736 control point for inbound traffic engineering and a mobility anchor 737 point and hence cannot be bypassed in the inbound direction.) 739 The Client uses the mechanisms specified in VET and SEAL to 740 encapsulate each forwarded packet. The Client further uses the SCMP 741 protocol to coordinate with Servers, including accepting redirects 742 and other SCMP messages. When the Client receives an SCMP message, 743 it checks the nonce field of the encapsulated packet-in-error to 744 verify that the message corresponds to the tunnel to its Server and 745 accepts the message if the nonce matches. (Note however that the 746 outer source and destination addresses of the packet-in-error may be 747 different from those in the original packet due to possible Server 748 and/or Relay address rewritings.) 750 6.2. IRON Serving Router Operation 752 After the Server is initialized, it responds to SRSs from Clients by 753 sending SRAs as described in Section 6.1. When the Server receives 754 an SRS message from a new Client, it sends back an SRA message with a 755 challenge/response puzzle.
The Client in turn sends an SRS message 756 with an answer to the puzzle. If this authentication fails, the 757 Server discards the message. Otherwise, it creates tunnel state for 758 this new Client, records the Client's EPs (see Section 5.3) in its 759 FIB, and records the locator address from the SCMP message as the 760 link-layer address of the next hop. The Server next sends an SRA 761 message back to the Client to complete the tunnel establishment. 763 When the Server receives a SEAL-encapsulated packet from one of its 764 Client tunnel endpoints, it examines the inner destination address. 765 If the inner destination address is not an EPA, the Server 766 decapsulates the packet and forwards it unencapsulated into the 767 Internet if it is able to do so without loss due to ingress 768 filtering. Otherwise, the Server re-encapsulates the packet (i.e., 769 it removes the outer header and replaces it with a new outer header 770 of the same address family) and sets the outer destination address to 771 the locator address of a Relay within its VPC overlay network. It 772 then forwards the re-encapsulated packet to the Relay, which will in 773 turn decapsulate it and forward it into the Internet. 775 If the inner destination address is an EPA, however, the Server 776 rewrites the outer source address to one of its own locator addresses 777 and rewrites the outer destination address to the subnet router 778 anycast address taken from the companion prefix associated with the 779 inner destination address (where the companion prefix of the same 780 address family as the outer IP protocol is used). The Server then 781 forwards the revised packet into the Internet via a default or more- 782 specific route, where it will be directed to the closest Relay within 783 the destination VPC overlay network. After sending the packet, the 784 Server may then receive an SCMP error or redirect message from a 785 Relay/Server within the destination VPC overlay network.
In that 786 case, the Server verifies that the nonce in the message matches the 787 tunnel corresponding to the Client that sent the original inner 788 packet and discards the message if the nonce does not match. 789 Otherwise, the Server re-encapsulates the SCMP message in a new outer 790 header that uses the source address, destination address and nonce 791 parameters associated with the tunnel to the Client; it then forwards 792 the message to the Client. This arrangement is necessary to allow 793 SCMP messages to flow through any NATs on the path. 795 When a Server ('A') receives a SEAL-encapsulated packet from a Relay 796 or from the Internet, if the inner destination address matches an EP 797 in its FIB 'A' re-encapsulates the packet in a new outer header that 798 uses the source address, destination address and nonce parameters 799 associated with the tunnel and forwards it to a Client ('B') which in 800 turn decapsulates the packet and forwards it to the correct end 801 system in the EUN. If 'B' has left notice with 'A' that it has moved 802 to a new Server ('C'), however, 'A' will instead forward the packet 803 to 'C' and also send an SCMP redirect message back to the source of 804 the packet. In this way, 'B' can leave behind forwarding information 805 when changing between Servers 'A' and 'C' (e.g., due to mobility 806 events) without exposing packets to loss. 808 6.3. IRON Relay Router Operation 810 After each Relay has synchronized its VPs (see: Section 5.1) it 811 advertises the full set of the company's VPs and companion prefixes 812 into the IPv4 and IPv6 Internet BGP routing systems. These prefixes 813 will be represented as ordinary routing information in the BGP, and 814 any packets originating from the IPv4 or IPv6 Internet destined to an 815 address covered by one of the prefixes will be forwarded to one of 816 the VPC overlay network's Relays. 
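Whether a destination address is "covered by one of the prefixes" is an ordinary longest-prefix-match question. The sketch below illustrates that test using Python's standard ipaddress module. The function name covering_prefix is hypothetical, and the documentation prefixes used as stand-in VPs are assumptions for illustration only, not prefixes defined by IRON.

```python
import ipaddress


def covering_prefix(address, prefixes):
    """Return the most specific prefix that covers `address`, or None.

    `prefixes` stands in for the VPs and companion prefixes that a VPC's
    Relays advertise; the values passed in are illustrative only.
    """
    addr = ipaddress.ip_address(address)
    best = None
    for text in prefixes:
        net = ipaddress.ip_network(text)
        if net.version != addr.version:
            continue  # an IPv4 address is never covered by an IPv6 prefix
        if addr in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return best


# Stand-in VPs taken from documentation address space (an assumption).
ADVERTISED = ["2001:db8::/32", "2001:db8:1::/48", "192.0.2.0/24"]
```

An address for which covering_prefix() returns None is not attracted to the overlay network's Relays and follows ordinary Internet routing instead.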
818 When a Relay receives a packet from the Internet destined to an EPA 819 covered by one of its VPs, it behaves as an ordinary IP router. In 820 particular, the Relay looks in its FIB to discover a locator of the 821 Server that serves the EP that covers the destination address. The 822 Relay then simply encapsulates the packet with its own locator as the 823 outer source address and the locator of the Server as the outer 824 destination address and forwards the packet to the Server. 826 When a Relay receives a packet from the Internet destined to one of 827 its subnet router anycast addresses, it discards the packet if it is 828 not SEAL-encapsulated. If the packet is an SCMP SRS message, the 829 Relay instead sends an SRA message back to the source listing the 830 locator addresses of nearby Servers then discards the message. The 831 Relay otherwise discards all other SCMP messages. 833 If the packet is an ordinary SEAL packet (i.e., one that encapsulates 834 an inner packet) the Relay sends an SCMP redirect message of the same 835 address family back to the source with the locator of the Server that 836 serves the EPA destination in the inner packet as the redirected 837 target. The source and destination addresses of the SCMP redirect 838 message use the outer destination and source addresses of the 839 original packet, respectively. After sending the redirect message, 840 the Relay then rewrites the outer destination address of the SEAL- 841 encapsulated packet to the locator of the Server and forwards the 842 revised packet to the Server. Note that in this arrangement any 843 errors that occur on the path between the Relay and the Server will 844 be delivered to the original source but with a different destination 845 address due to this Relay address rewriting. 847 6.4. 
IRON Reference Operating Scenarios 849 The IRON supports communications when one or both hosts are located 850 within EP-addressed EUNs regardless of whether the EPs are 851 provisioned by the same VPC or by different VPCs. When both hosts 852 are within IRON EUNs, route redirections that eliminate unnecessary 853 Servers and Relays from the path are possible. When only one host is 854 within an IRON EUN, however, route optimization cannot be used. The 855 following sections discuss the two scenarios. 857 6.4.1. Both Hosts Within IRON EUNs 859 When both hosts are within IRON EUNs, it is sufficient to consider 860 the scenario in a unidirectional fashion, i.e., by tracing packet 861 flows only in the forward direction from the source host to 862 destination host. The reverse direction can be considered 863 separately, and incurs the same considerations as for the forward 864 direction. 866 In this scenario, the initial packets of a flow produced by a source 867 host within an EUN connected to the IRON by a Client must flow 868 through both the Server of the source host and a Relay of the 869 destination host, but route optimization can eliminate these elements 870 from the path for subsequent packets in the flow. Figure 6 shows the 871 flow of initial packets from host A to host B within two IRON EUNs 872 (the same scenario applies whether the two EUNs are within the same 873 VPC overlay network or different overlay networks): 875 ________________________________________ 876 .-( .-. )-. 877 .-( ,-( _)-. )-. 878 .-( +========+(_ (_ +=====+ )-. 879 .( || (_|| Internet ||_) || ). 880 .( || ||-(______)-|| vv ). 881 .( +--------++--+ || || +------------+ ). 882 ( +==>| Server(A) | vv || | Server(B) |====+ ) 883 ( // +---------|\-+ +--++----++--+ +------------+ \\ ) 884 ( // .-. | \ | Relay(B) | .-. \\ ) 885 ( //,-( _)-. | \ +-v----------+ ,-( _)-\\ ) 886 ( .||_ (_ )-. | \____| .-(_ (_ ||. ) 887 ( _|| ISP A .) 
| (__ ISP B ||_)) 888 ( ||-(______)-' | (redirect) `-(______)|| ) 889 ( || | | | vv ) 890 ( +-----+-----+ | +-----+-----+ ) 891 | Client(A) | <--+ | Client(B) | 892 +-----+-----+ The IRON +-----+-----+ 893 | ( (Overlaid on the native Internet) ) | 894 .-. .-( .-) .-. 895 ,-( _)-. .-(________________________)-. ,-( _)-. 896 .-(_ (_ )-. .-(_ (_ )-. 897 (_ IRON EUN A ) (_ IRON EUN B ) 898 `-(______)-' `-(______)-' 899 | | 900 +---+----+ +---+----+ 901 | Host A | | Host B | 902 +--------+ +--------+ 904 Figure 6: Initial Packet Flow Before Redirects 906 With reference to Figure 6, host A sends packets destined to host B 907 via its network interface connected to EUN A. Routing within EUN A 908 will direct the packets to Client(A) as a default router for the EUN 909 which then uses VET and SEAL to encapsulate them in outer headers 910 with its locator address as the outer source address and the locator 911 address of Server(A) as the outer destination address. Client(A) 912 then simply releases the encapsulated packets into its ISP network 913 connection that provided its locator. The ISP will release the 914 packets into the Internet without filtering since the (outer) source 915 address is topologically correct. Once the packets have been 916 released into the Internet, routing will direct them to Server(A). 918 Server(A) receives the encapsulated packets from Client(A) then 919 rewrites the outer source address to one of its own locator 920 addresses, and rewrites the outer destination address to the subnet 921 router anycast address of the appropriate address family associated 922 with the inner destination address. Server(A) then releases the 923 revised packets into the Internet where routing will direct them to 924 Relay(B). 926 Relay(B) will intercept the encapsulated packets from Server(A) then 927 check its FIB to discover an entry that covers inner destination 928 address B with Server(B) as the next hop. 
Relay(B) then returns SCMP 929 redirect messages to Server(A) (*), rewrites the outer destination 930 address of the encapsulated packets to the locator address of 931 Server(B), and forwards these revised packets to Server(B). 933 Server(B) will receive the encapsulated packets from Relay(B) then 934 check its FIB to discover an entry that covers destination address B 935 with Client(B) as the next hop. Server(B) then re-encapsulates the 936 packets in a new outer header that uses the source address, 937 destination address and nonce parameters associated with the tunnel 938 to Client(B). Server(B) then releases these re-encapsulated packets 939 into the Internet, where routing will direct them to Client(B). 940 Client(B) will in turn decapsulate the packets and forward the inner 941 packets to host B via EUN B. 943 (*) Note that after the initial flow of packets, Server(A) will have 944 received one or more SCMP redirect messages from Relay(B) listing 945 Server(B) as a better next hop. Server(A) will in turn forward the 946 redirects to Client(A), which will thereafter forward its 947 encapsulated packets directly to the locator address of Server(B) 948 without involving either Server(A) or Relay(B) as shown in Figure 7: 950 ________________________________________ 951 .-( .-. )-. 952 .-( ,-( _)-. )-. 953 .-( +=============> .-(_ (_ )-.======+ )-. 954 .( // (__ Internet _) || ). 955 .( // `-(______)-' vv ). 956 .( // +------------+ ). 957 ( // | Server(B) |====+ ) 958 ( // +------------+ \\ ) 959 ( // .-. .-. \\ ) 960 ( //,-( _)-. ,-( _)-\\ ) 961 ( .||_ (_ )-. .-(_ (_ ||. ) 962 ( _|| ISP A .) (__ ISP B ||_)) 963 ( ||-(______)-' `-(______)|| ) 964 ( || | | vv ) 965 ( +-----+-----+ The IRON +-----+-----+ ) 966 | Client(A) | (Overlaid on the native Internet) | Client(B) | 967 +-----+-----+ +-----+-----+ 968 | ( ) | 969 .-. .-( .-) .-. 970 ,-( _)-. .-(________________________)-. ,-( _)-. 971 .-(_ (_ )-. .-(_ (_ )-. 
972 (_ IRON EUN A ) (_ IRON EUN B ) 973 `-(______)-' `-(______)-' 974 | | 975 +---+----+ +---+----+ 976 | Host A | | Host B | 977 +--------+ +--------+ 979 Figure 7: Sustained Packet Flow After Redirects 981 6.4.2. Mixed IRON and Non-IRON Hosts 983 When one host is within an IRON EUN and the other is in a non-IRON 984 EUN (i.e., one that connects to the native Internet instead of the 985 IRON), the IR elements involved depend on the packet flow directions. 986 The cases are described in the following sections. 988 6.4.2.1. From IRON Host A to Non-IRON Host B 990 Figure 8 depicts the IRON reference operating scenario for packets 991 flowing from Host A in an IRON EUN to Host B in a non-IRON EUN: 993 _________________________________________ 994 .-( )-. )-. 995 .-( +-------)----+ )-. 996 .-( | Relay(A) |--------------+ )-. 997 .( +------------+ \ ). 998 .( +=======>| Server(A) | \ ). 999 .( // +--------)---+ \ ). 1000 ( // ) \ ) 1001 ( // The IRON ) \ ) 1002 ( // .-. ) \ .-. ) 1003 ( //,-( _)-. ) \ ,-( _)-. ) 1004 ( .||_ (_ )-. ) The Native Internet .-|_ (_ )-. ) 1005 ( _|| ISP A ) ) (_ | ISP B )) 1006 ( ||-(______)-' ) |-(______)-' ) 1007 ( || | )-. v | ) 1008 ( +-----+ ----+ )-. +-----+-----+ ) 1009 | Client(A) |)-. | Router B | 1010 +-----+-----+ +-----+-----+ 1011 | ( ) | 1012 .-. .-(____________________________________)-. .-. 1013 ,-( _)-. ,-( _)-. 1014 .-(_ (_ )-. .-(_ (_ )-. 1015 (_ IRON EUN A ) (_non-IRON EUN B) 1016 `-(______)-' `-(______)-' 1017 | | 1018 +---+----+ +---+----+ 1019 | Host A | | Host B | 1020 +--------+ +--------+ 1022 Figure 8: From IRON Host A to Non-IRON Host B 1024 In this scenario, host A sends packets destined to host B via its 1025 network interface connected to IRON EUN A. 
Routing within EUN A will 1026 direct the packets to Client(A) as a default router for the EUN which 1027 then uses VET and SEAL to encapsulate them in outer headers with its 1028 locator address as the outer source address and the locator address 1029 of Server(A) as the outer destination address. The ISP will pass the 1030 packets without filtering since the (outer) source address is 1031 topologically correct. Once the packets have been released into the 1032 native Internet, routing will direct them to Server(A). 1034 Server(A) receives the encapsulated packets from Client(A) then re- 1035 encapsulates and forwards them to Relay(A), which simply decapsulates 1036 them and releases the unencapsulated packets into the Internet. Once 1037 the packets are released into the Internet, routing will direct them 1038 to the final destination B. (Note that Server(A) and Relay(A) are 1039 depicted in Figure 8 as two halves of a unified gateway. In that 1040 case, the "forwarding" between Server(A) and Relay(A) is a zero- 1041 instruction imaginary operation within the gateway.) 1043 This scenario always involves a Server and Relay owned by the VPC 1044 that provides service to IRON EUN A. It therefore imparts a cost that 1045 would need to be borne by either the VPC or its customers. 1047 6.4.2.2. From Non-IRON Host B to IRON Host A 1049 Figure 9 depicts the IRON reference operating scenario for packets 1050 flowing from Host B in a non-IRON EUN to Host A in an IRON EUN: 1052 _______________________________________ 1053 .-( )-. )-. 1054 .-( +-------)----+ )-. 1055 .-( | Relay(A) |<-------------+ )-. 1056 .( +------------+ \ ). 1057 .( +========| Server(A) | \ ). 1058 .( // +--------)---+ \ ). 1059 ( // ) \ ) 1060 ( // The IRON ) \ ) 1061 ( // .-. ) \ .-. ) 1062 ( //,-( _)-. ) \ ,-( _)-. ) 1063 ( .||_ (_ )-. ) The Native Internet .-|_ (_ )-. ) 1064 ( _|| ISP A ) ) (_ | ISP B )) 1065 ( ||-(______)-' ) |-(______)-' ) 1066 ( vv | )-. | | ) 1067 ( +-----+ ----+ )-.
+-----+-----+ ) 1068 | Client(A) |)-. | Router B | 1069 +-----+-----+ +-----+-----+ 1070 | ( ) | 1071 .-. .-(____________________________________)-. .-. 1072 ,-( _)-. ,-( _)-. 1073 .-(_ (_ )-. .-(_ (_ )-. 1074 (_ IRON EUN A ) (_non-IRON EUN B) 1075 `-(______)-' `-(_______)-' 1076 | | 1077 +---+----+ +---+----+ 1078 | Host A | | Host B | 1079 +--------+ +--------+ 1081 Figure 9: From Non-IRON Host B to IRON Host A 1083 In this scenario, host B sends packets destined to host A via its 1084 network interface connected to non-IRON EUN B. Routing will direct 1085 the packets to Relay(A) which then forwards them to Server(A) using 1086 encapsulation if necessary. 1088 Server(A) will then check its FIB to discover an entry that covers 1089 destination address A with Client(A) as the next hop. Server(A) then 1090 (re-)encapsulates the packets in an outer header that uses the source 1091 address, destination address and nonce parameters associated with the 1092 tunnel to Client(A). Server(A) next releases these (re-)encapsulated 1093 packets into the Internet, where routing will direct them to 1094 Client(A). Client(A) will in turn decapsulate the packets and 1095 forward the inner packets to host A via its network interface 1096 connected to IRON EUN A. 1098 This scenario always involves a Server and Relay owned by the VPC 1099 that provides service to IRON EUN A. It therefore imparts a cost that 1100 would need to be borne by either the VPC or its customers. 1102 6.5. Mobility, Multihoming and Traffic Engineering Considerations 1104 While IRON Servers and Relays can be considered as fixed 1105 infrastructure, Clients may need to move between different network 1106 points of attachment, connect to multiple ISPs, or explicitly manage 1107 their traffic flows. The following sections discuss mobility, multi- 1108 homing and traffic engineering considerations for IRON client 1109 routers. 1111 6.5.1. 
Mobility Management 1113 When a Client changes its network point of attachment (e.g., due to a 1114 mobility event), it configures one or more new locators. If the 1115 Client has not moved far away from its previous network point of 1116 attachment, it simply informs its Server of any locator additions or 1117 deletions. This operation is performance-sensitive, and should be 1118 conducted immediately to avoid packet loss. 1120 If the Client has moved far away from its previous network point of 1121 attachment, however, it re-issues the anycast discovery procedure 1122 described in Section 6.1 to discover whether its candidate set of 1123 Servers has changed. If the Client's current Server is also included 1124 in the new list received from the VPC, this provides indication that 1125 the Client has not moved far enough to warrant changing to a new 1126 Server. Otherwise, the Client may wish to move to a new Server in 1127 order to maintain optimal routing. This operation is not 1128 performance-critical, and therefore can be conducted over a matter of 1129 seconds/minutes instead of milliseconds/microseconds. 1131 To move to a new Server, the Client first engages in the EP 1132 registration process with the new Server and maintains the 1133 registrations through periodic SRS/SRA exchanges the same as 1134 described in Section 6.1. The Client then informs its former Server 1135 that it has moved by providing it with the locator address of the new 1136 Server. The Client then discontinues the SRS/SRA keepalive process 1137 with the former Server, which will garbage-collect the stale FIB 1138 entries when their lifetime expires. This will allow the former 1139 Server to redirect existing correspondents to the new Server so that 1140 no packets are lost. 1142 6.5.2. Multihoming 1144 A Client may register multiple locators with its Server. 
It can 1145 assign metrics with its registrations to inform the Server of 1146 preferred locators, and can select outgoing locators according to its 1147 local preferences. Multihoming is therefore naturally supported. 1149 6.5.3. Inbound Traffic Engineering 1151 A Client can dynamically adjust the priorities of its prefix 1152 registrations with its Server in order to influence inbound traffic 1153 flows. It can also change between Servers when multiple Servers are 1154 available, but should strive for stability in its Server selection in 1155 order to limit VPC network routing churn. 1157 6.5.4. Outbound Traffic Engineering 1159 A Client can select outgoing locators, e.g., based on current QoS 1160 considerations such as minimizing one-way delay or one-way delay 1161 variance. 1163 6.6. Renumbering Considerations 1165 As new link layer technologies and/or service models emerge, 1166 customers will be motivated to select their service providers through 1167 healthy competition between ISPs. If a customer's EUN addresses are 1168 tied to a specific ISP, however, the customer may be forced to 1169 undergo a painstaking EUN renumbering process if it wishes to change 1170 to a different ISP [RFC4192][RFC5887]. 1172 When a customer obtains EP prefixes from a VPC, it can change between 1173 ISPs seamlessly and without need to renumber. If the VPC itself 1174 applies unreasonable costing structures for use of the EPs, however, 1175 the customer may be compelled to seek a different VPC and would again 1176 be required to confront a renumbering scenario. The IRON approach to 1177 renumbering avoidance therefore depends on VPCs conducting ethical 1178 business practices and offering reasonable rates. 1180 6.7. NAT Traversal Considerations 1182 The Internet today consists of a global public IPv4 routing and 1183 addressing system with non-IRON EUNs that use either public or 1184 private IPv4 addressing. 
The latter class of EUNs connects to the 1185 public Internet via Network Address Translators (NATs). When a 1186 Client is located behind a NAT, it selects Servers using the same 1187 procedures as for Clients with public addresses, i.e., it will send 1188 SRS messages to Servers in order to get SRA messages in return. The 1189 only requirement is that the Client must configure its SEAL 1190 encapsulation to use a transport protocol that supports NAT 1191 traversal, namely UDP. 1193 Since the Server maintains state about its Client customers, it can 1194 discover locator information for each Client by examining the UDP 1195 port number and IP address in the outer headers of SRS messages. 1196 When there is a NAT in the path, the UDP port number and IP address 1197 in the SRS message will correspond to state in the NAT box and might 1198 not correspond to the actual values assigned to the Client. The 1199 Server can then encapsulate packets destined to hosts in the Client's 1200 EUN within outer headers that use this IP address and UDP port 1201 number. The NAT box will receive the packets, translate the values 1202 in the outer headers, then forward the packets to the Client. In 1203 this sense, the Server's "locator" for the Client consists of the 1204 concatenation of the IP address and UDP port number. 1206 IRON does not add any new complications to those already raised by 1207 NAT traversal or by applications that embed address referrals in 1208 their payload. 1210 6.8. Nested EUN Considerations 1212 Each Client configures a locator that may be taken from an ordinary 1213 non-EPA address assigned by an ISP or from an EPA address taken from 1214 an EP assigned to another Client. In the latter case, the Client is said 1215 to be "nested" within the EUN of another Client, and recursive 1216 nestings of multiple layers of encapsulations may be necessary.
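The number of encapsulation layers that nesting implies can be sketched by following each Client's locator to the EP (if any) that covers it. This is a hedged illustration only: the function name encapsulation_layers, the two dictionaries, and the documentation prefixes used as EPs are assumptions and do not come from the IRON specification.

```python
import ipaddress


def encapsulation_layers(client, locators, eps):
    """Count the outer headers on a packet leaving `client`'s EUN.

    locators: Client name -> locator address (hypothetical data model)
    eps:      Client name -> EP prefix assigned to that Client
    """
    layers = 0
    seen = set()
    current = client
    while current is not None:
        if current in seen:
            raise ValueError("locator loop between nested Clients")
        seen.add(current)
        layers += 1  # this Client adds one outer header toward its Server
        addr = ipaddress.ip_address(locators[current])
        # If the locator is an EPA, the Client owning that EP encapsulates next.
        current = next(
            (name for name, ep in eps.items()
             if name != current and addr in ipaddress.ip_network(ep)),
            None,
        )
    return layers


# Illustrative values only: A nests inside B, B nests inside C,
# and C has an ordinary ISP-assigned (non-EPA) locator.
EPS = {"A": "2001:db8:a::/48", "B": "2001:db8:b::/48", "C": "2001:db8:c::/48"}
LOCATORS = {"A": "2001:db8:b::1", "B": "2001:db8:c::1", "C": "198.51.100.7"}
```

With these values, a packet originating in the innermost EUN carries one outer header per Client on the way out, which is the recursive nesting described above.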
1218 For example, in the network scenario depicted in Figure 10 Client(A) 1219 configures a locator EPA(B) taken from the EP assigned to EUN(B). 1220 Client(B) in turn configures a locator EPA(C) taken from the EP 1221 assigned to EUN(C). Finally, Client(C) configures a locator ISP(D) 1222 taken from a non-EPA address delegated by an ordinary ISP(D). Using 1223 this example, the "nested-IRON" case must be examined in which a host 1224 A which configures the address EPA(A) within EUN(A) exchanges packets 1225 with host Z located elsewhere in the Internet. 1227 .-. 1228 ISP(D) ,-( _)-. 1229 +-----------+ .-(_ (_ )-. 1230 | Client(C) |--(_ ISP(D) ) 1231 +-----+-----+ `-(______)-' 1232 | <= T \ .-. 1233 .-. u \ ,-( _)-. 1234 ,-( _)-. n .-(_ (- )-. 1235 .-(_ (_ )-. n (_ Internet ) 1236 (_ EUN(C) ) e `-(______)-' 1237 `-(______)-' l ___ 1238 | EPA(C) s => (:::)-. 1239 +-----+-----+ .-(::::::::) 1240 | Client(B) | .-(::::::::::::)-. +-----------+ 1241 +-----+-----+ (:::: The IRON ::::) | Relay(Z) | 1242 | `-(::::::::::::)-' +-----------+ 1243 .-. `-(::::::)-' +-----------+ 1244 ,-( _)-. | Server(Z) | 1245 .-(_ (_ )-. +-----------+ +-----------+ 1246 (_ EUN(B) ) | Server(C) | +-----------+ 1247 `-(______)-' +-----------+ | Client(Z) | 1248 | EPA(B) +-----------+ +-----------+ 1249 +-----+-----+ | Server(B) | +--------+ 1250 | Client(A) | +-----------+ | Host Z | 1251 +-----------+ +-----------+ +--------+ 1252 | | Server(A) | 1253 .-. +-----------+ 1254 ,-( _)-. EPA(A) 1255 .-(_ (_ )-. +--------+ 1256 (_ EUN(A) )---| Host A | 1257 `-(______)-' +--------+ 1259 Figure 10: Nested EUN Example 1261 The two cases of host A sending packets to host Z, and host Z sending 1262 packets to host A, must be considered separately as described below. 1264 6.8.1. Host A Sends Packets to Host Z 1266 Host A first forwards a packet with source address EPA(A) and 1267 destination address Z into EUN(A). 
Routing within EUN(A) will direct 1268 the packet to Client(A), which encapsulates it in an outer header 1269 with EPA(B) as the outer source address and Server(A) as the outer 1270 destination address, then forwards the once-encapsulated packet into 1271 EUN(B). Routing within EUN(B) will direct the packet to Client(B), 1272 which encapsulates it in an outer header with EPA(C) as the outer 1273 source address and Server(B) as the outer destination address, then 1274 forwards the twice-encapsulated packet into EUN(C). Routing within 1275 EUN(C) will direct the packet to Client(C), which encapsulates it in 1276 an outer header with ISP(D) as the outer source address and Server(C) 1277 as the outer destination address. Client(C) then sends this triple- 1278 encapsulated packet into the ISP(D) network, where it will be routed 1279 into the Internet to Server(C). 1281 When Server(C) receives the triple-encapsulated packet, it removes 1282 the outer layer of encapsulation and forwards the resulting twice- 1283 encapsulated packet into the Internet to Server(B). Next, Server(B) 1284 removes the outer layer of encapsulation and forwards the resulting 1285 once-encapsulated packet into the Internet to Server(A). Next, 1286 Server(A) checks the address type of the inner address 'Z'. If Z is 1287 a non-EPA address, Server(A) simply decapsulates the packet and 1288 forwards it into the Internet. Otherwise, Server(A) rewrites the 1289 outer source and destination addresses of the once-encapsulated 1290 packet and forwards it to Relay(Z). Relay(Z) in turn rewrites the 1291 outer destination address of the packet to the locator for Server(Z), 1292 then forwards the packet and sends a redirect to Server(A) (which 1293 forwards the redirect to Client(A)). Server(Z) then re-encapsulates 1294 the packet and forwards it to Client(Z), which decapsulates it and 1295 forwards the inner packet to host Z.
Subsequent packets from 1296 Client(A) will then use Server(Z) as the next hop toward host Z, 1297 which eliminates Server(A) and Relay(Z) from the path. 1299 6.8.2. Host Z Sends Packets to Host A 1301 Whether or not host Z configures an EPA address, its packets destined 1302 to Host A will eventually reach Server(A). Server(A) will have a 1303 mapping that lists Client(A) as the next hop toward EPA(A). 1304 Server(A) will then encapsulate the packet with EPA(B) as the outer 1305 destination address and forward the packet into the Internet. 1306 Internet routing will convey this once-encapsulated packet to 1307 Server(B) which will have a mapping that lists Client(B) as the next 1308 hop toward EPA(B). Server(B) will then encapsulate the packet with 1309 EPA(C) as the outer destination address and forward the packet into 1310 the Internet. Internet routing will then convey this twice- 1311 encapsulated packet to Server(C) which will have a mapping that lists 1312 Client(C) as the next hop toward EPA(C). Server(C) will then 1313 encapsulate the packet with ISP(D) as the outer destination address 1314 and forward the packet into the Internet. Internet routing will then 1315 convey this triple-encapsulated packet to Client(C). 1317 When the triple-encapsulated packet arrives at Client(C), it strips 1318 the outer layer of encapsulation and forwards the twice-encapsulated 1319 packet to EPA(C) which is the locator address of Client(B). When 1320 Client(B) receives the twice-encapsulated packet, it strips the outer 1321 layer of encapsulation and forwards the once-encapsulated packet to 1322 EPA(B) which is the locator address of Client(A). When Client(A) 1323 receives the once-encapsulated packet, it strips the outer layer of 1324 encapsulation and forwards the unencapsulated packet to EPA(A) which 1325 is the host address of host A. 1327 7. 
Implications for the Internet 1329 The IRON architecture envisions a hybrid routing/mapping system that 1330 benefits from both the shortest-path routing afforded by pure dynamic 1331 routing systems and the routing scaling suppression afforded by pure 1332 mapping systems. IRON therefore targets the elusive "sweet spot" 1333 that pure routing and pure mapping systems alone cannot satisfy. 1335 The IRON system requires a deployment of new routers/servers 1336 throughout the Internet and/or provider networks to maintain well- 1337 balanced virtual overlay networks. These routers/servers can be 1338 deployed incrementally without disruption to existing Internet 1339 infrastructure and appropriately managed to provide acceptable 1340 service levels to customers. 1342 End-to-end traffic that traverses an IRON virtual overlay network may 1343 experience delay variance between the initial packets and subsequent 1344 packets of a flow. This is due to the IRON system allowing longer 1345 path stretch for initial packets followed by timely route 1346 optimizations to utilize better next hop routers/servers for 1347 subsequent packets. 1349 IRON virtual overlay networks also work seamlessly with existing and 1350 emerging services within the native Internet. In particular, 1351 customers serviced by IRON virtual overlay networks will receive the 1352 same service enjoyed by customers serviced by non-IRON service 1353 providers. Internet services already deployed within the native 1354 Internet also need not make any changes to accommodate IRON virtual 1355 overlay network customers. 1357 The IRON system operates between routers within provider networks and 1358 end user networks. Within these networks, the underlying paths 1359 traversed by the virtual overlay networks may comprise links that 1360 accommodate varying MTUs. 
While the IRON system imposes an 1361 additional per-packet overhead that may cause the size of packets to 1362 become slightly larger than the underlying path can accommodate, IRON 1363 routers have a method for naturally detecting and tuning out all 1364 instances of path MTU underruns. In some cases, these MTU underruns 1365 may need to be reported back to the original hosts; however, the 1366 system will also allow for MTUs much larger than those typically 1367 available in current Internet paths to be discovered and utilized as 1368 more links with larger MTUs are deployed. 1370 Finally, and perhaps most importantly, the IRON system provides a 1371 built-in mobility management and multihoming capability that allows 1372 end user devices and networks to move about freely while both 1373 imparting minimal oscillations in the routing system and maintaining 1374 generally shortest-path routes. This mobility management is afforded 1375 through the very nature of the IRON customer/provider relationship, 1376 and therefore requires no adjunct mechanisms. The mobility 1377 management and multihoming capabilities are further supported by 1378 forward-path reachability detection that provides "hints of forward 1379 progress" in the same spirit as for IPv6 ND. 1381 8. Additional Considerations 1383 Considerations for the scalability of Internet Routing due to 1384 multihoming, traffic engineering and provider-independent addressing 1385 are discussed in [I-D.narten-radir-problem-statement]. Other scaling 1386 considerations specific to IRON are discussed in Appendix B. 1388 Route optimization considerations for mobile networks are found in 1389 [RFC5522]. 1391 9. Related Initiatives 1393 IRON builds upon the concepts of the RANGER architecture [RFC5720], and 1394 therefore inherits the same set of related initiatives.
1396 Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in 1397 Increasing Scopes (AIS) [I-D.zhang-evolution] provide the basis for 1398 the Virtual Prefix concepts. 1400 Internet vastly improved plumbing (Ivip) [I-D.whittle-ivip-arch] has 1401 contributed valuable insights, including the use of real-time 1402 mapping. The use of Servers as mobility anchor points is directly 1403 influenced by Ivip's associated TTR mobility extensions [TTRMOB]. 1405 [I-D.bernardos-mext-nemo-ro-cr] discussed a route optimization 1406 approach using a Correspondent Router (CR) model. The IRON Server 1407 construct is similar to the CR concept described in this work; 1408 however, the manner in which customer EUNs coordinate with Servers is 1409 different and is based on the redirection model associated with NBMA 1410 links. 1412 Numerous publications have proposed NAT traversal techniques. The 1413 NAT traversal techniques adapted for IRON were inspired by the Simple 1414 Address Mapping for Premises Legacy Equipment (SAMPLE) proposal 1415 [I-D.carpenter-softwire-sample]. 1417 10. IANA Considerations 1419 There are no IANA considerations for this document. 1421 11. Security Considerations 1423 Security considerations that apply to tunneling in general are 1424 discussed in [I-D.ietf-v6ops-tunnel-security-concerns]. Additional 1425 considerations that apply also to IRON are discussed in RANGER 1426 [RFC5720], VET [I-D.templin-intarea-vet] and SEAL 1427 [I-D.templin-intarea-seal]. 1429 The IRON system further depends on mutual authentication of IRON 1430 Clients to Servers and Servers to Relays. This is accomplished 1431 through initial authentication exchanges followed by per-packet 1432 nonces that can be used to detect off-path attacks. As for all 1433 Internet communications, the IRON system also depends on Relays 1434 acting with integrity and not injecting false advertisements into the 1435 BGP (e.g., to mount traffic siphoning attacks).
1437 Each VPC overlay network requires a means for assuring the integrity 1438 of the interior routing system so that all Relays and Servers in the 1439 overlay have a consistent view of Client<->Server bindings. Finally, 1440 DOS attacks on IRON Relays and Servers can occur when packets with 1441 spoofed source addresses arrive at high data rates. This issue is no 1442 different than for any border router in the public Internet today, 1443 however. 1445 12. Acknowledgements 1447 The ideas behind this work have benefited greatly from discussions 1448 with colleagues, some of whom appear on the RRG and other IRTF/IETF 1449 mailing lists. Robin Whittle and Steve Russert co-authored the TTR 1450 mobility architecture which strongly influenced IRON. Eric 1451 Fleischman pointed out the opportunity to leverage anycast for 1452 discovering topologically-close Servers. Thomas Henderson 1453 recommended a quantitative analysis of scaling properties. 1455 The following individuals provided essential review input: Jari 1456 Arkko, Mohamed Boucadair, John Buford, Wesley Eddy, Dae Young Kim and 1457 Robin Whittle. 1459 13. References 1460 13.1. Normative References 1462 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1463 September 1981. 1465 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1466 (IPv6) Specification", RFC 2460, December 1998. 1468 13.2. Informative References 1470 [BGPMON] net, B., "BGPmon.net - Monitoring Your Prefixes, 1471 http://bgpmon.net/stat.php", June 2010. 1473 [I-D.bernardos-mext-nemo-ro-cr] 1474 Bernardos, C., Calderon, M., and I. Soto, "Correspondent 1475 Router based Route Optimisation for NEMO (CRON)", 1476 draft-bernardos-mext-nemo-ro-cr-00 (work in progress), 1477 July 2008. 1479 [I-D.carpenter-softwire-sample] 1480 Carpenter, B. and S.
Jiang, "Legacy NAT Traversal for 1481 IPv6: Simple Address Mapping for Premises Legacy Equipment 1482 (SAMPLE)", draft-carpenter-softwire-sample-00 (work in 1483 progress), June 2010. 1485 [I-D.ietf-grow-va] 1486 Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and 1487 L. Zhang, "FIB Suppression with Virtual Aggregation", 1488 draft-ietf-grow-va-03 (work in progress), August 2010. 1490 [I-D.ietf-v6ops-tunnel-security-concerns] 1491 Krishnan, S., Thaler, D., and J. Hoagland, "Security 1492 Concerns With IP Tunneling", 1493 draft-ietf-v6ops-tunnel-security-concerns-04 (work in 1494 progress), October 2010. 1496 [I-D.narten-radir-problem-statement] 1497 Narten, T., "On the Scalability of Internet Routing", 1498 draft-narten-radir-problem-statement-05 (work in 1499 progress), February 2010. 1501 [I-D.russert-rangers] 1502 Russert, S., Fleischman, E., and F. Templin, "RANGER 1503 Scenarios", draft-russert-rangers-05 (work in progress), 1504 July 2010. 1506 [I-D.templin-intarea-seal] 1507 Templin, F., "The Subnetwork Encapsulation and Adaptation 1508 Layer (SEAL)", draft-templin-intarea-seal-25 (work in 1509 progress), December 2010. 1511 [I-D.templin-intarea-vet] 1512 Templin, F., "Virtual Enterprise Traversal (VET)", 1513 draft-templin-intarea-vet-19 (work in progress), 1514 December 2010. 1516 [I-D.whittle-ivip-arch] 1517 Whittle, R., "Ivip (Internet Vastly Improved Plumbing) 1518 Architecture", draft-whittle-ivip-arch-04 (work in 1519 progress), March 2010. 1521 [I-D.zhang-evolution] 1522 Zhang, B. and L. Zhang, "Evolution Towards Global Routing 1523 Scalability", draft-zhang-evolution-02 (work in progress), 1524 October 2009. 1526 [RFC1070] Hagens, R., Hall, N., and M. Rose, "Use of the Internet as 1527 a subnetwork for experimentation with the OSI network 1528 layer", RFC 1070, February 1989. 1530 [RFC2526] Johnson, D. and S. Deering, "Reserved IPv6 Subnet Anycast 1531 Addresses", RFC 2526, March 1999. 
1533 [RFC3068] Huitema, C., "An Anycast Prefix for 6to4 Relay Routers", 1534 RFC 3068, June 2001. 1536 [RFC3849] Huston, G., Lord, A., and P. Smith, "IPv6 Address Prefix 1537 Reserved for Documentation", RFC 3849, July 2004. 1539 [RFC4192] Baker, F., Lear, E., and R. Droms, "Procedures for 1540 Renumbering an IPv6 Network without a Flag Day", RFC 4192, 1541 September 2005. 1543 [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway 1544 Protocol 4 (BGP-4)", RFC 4271, January 2006. 1546 [RFC4548] Gray, E., Rutemiller, J., and G. Swallow, "Internet Code 1547 Point (ICP) Assignments for NSAP Addresses", RFC 4548, 1548 May 2006. 1550 [RFC5214] Templin, F., Gleeson, T., and D. Thaler, "Intra-Site 1551 Automatic Tunnel Addressing Protocol (ISATAP)", RFC 5214, 1552 March 2008. 1554 [RFC5522] Eddy, W., Ivancic, W., and T. Davis, "Network Mobility 1555 Route Optimization Requirements for Operational Use in 1556 Aeronautics and Space Exploration Mobile Networks", 1557 RFC 5522, October 2009. 1559 [RFC5720] Templin, F., "Routing and Addressing in Networks with 1560 Global Enterprise Recursion (RANGER)", RFC 5720, 1561 February 2010. 1563 [RFC5737] Arkko, J., Cotton, M., and L. Vegoda, "IPv4 Address Blocks 1564 Reserved for Documentation", RFC 5737, January 2010. 1566 [RFC5743] Falk, A., "Definition of an Internet Research Task Force 1567 (IRTF) Document Stream", RFC 5743, December 2009. 1569 [RFC5887] Carpenter, B., Atkinson, R., and H. Flinck, "Renumbering 1570 Still Needs Work", RFC 5887, May 2010. 1572 [TTRMOB] Whittle, R. and S. Russert, "TTR Mobility Extensions for 1573 Core-Edge Separation Solutions to the Internet's Routing 1574 Scaling Problem, 1575 http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf", 1576 August 2008. 1578 Appendix A. 
IRON VPs Over Internetworks with Different Address Families 1580 The IRON architecture leverages the routing system by providing 1581 generally shortest-path routing for packets with EPA addresses from 1582 VPs that match the address family of the underlying Internetwork. 1583 When the VPs are of an address family that is not routable within the 1584 underlying Internetwork, however, (e.g., when OSI/NSAP [RFC4548] VPs 1585 are used within an IPv4 Internetwork) a global mapping database is 1586 required to allow Servers to map VPs to companion prefixes taken from 1587 address families that are routable within the Internetwork. For 1588 example, an IPv6 VP (e.g., 2001:DB8::/32) could be paired with a 1589 companion IPv4 prefix (e.g., 192.0.2.0/24) so that encapsulated IPv6 1590 packets can be forwarded over IPv4-only Internetworks. 1592 Every VP in the IRON must therefore be represented in a globally 1593 distributed Master VP database (MVPd) that maintains VP-to-companion 1594 prefix mappings for all VPs in the IRON. The MVPd is maintained by a 1595 globally-managed assigned numbers authority in the same manner as the 1596 Internet Assigned Numbers Authority (IANA) currently maintains the 1597 master list of all top-level IPv4 and IPv6 delegations. The database 1598 can be replicated across multiple servers for load balancing much in 1599 the same way that FTP mirror sites are used to manage software 1600 distributions. 1602 Upon startup, each Server discovers the full set of VPs for the IRON 1603 by reading the MVPd. The Server reads the MVPd from a nearby server 1604 and periodically checks the server for deltas since the database was 1605 last read. After reading the MVPd, the Server has a full list of VP 1606 to companion prefix mappings. 
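As a rough illustration of the VP-to-companion-prefix mapping described above, the following Python sketch models a tiny MVPd as a dictionary and looks up the companion prefix for an EPA by longest-prefix match. The MVPD table, companion_for(), and outer_destination() are hypothetical names; the draft does not define a data format or lookup API for the MVPd. The single entry reuses the example pairing given earlier in this appendix (2001:DB8::/32 with 192.0.2.0/24). The appendix goes on to say that any address from the companion prefix may serve as the outer destination, so picking the first host address here is an arbitrary assumption.

```python
import ipaddress

# Hypothetical MVPd contents: VP -> companion prefix.
# The pairing is the example used in this appendix (documentation prefixes).
MVPD = {
    ipaddress.ip_network("2001:db8::/32"): ipaddress.ip_network("192.0.2.0/24"),
}


def companion_for(epa, mvpd=MVPD):
    """Return the companion prefix of the longest VP covering epa, or None."""
    addr = ipaddress.ip_address(epa)
    matches = [vp for vp in mvpd
               if vp.version == addr.version and addr in vp]
    if not matches:
        return None
    best = max(matches, key=lambda vp: vp.prefixlen)
    return mvpd[best]


def outer_destination(epa, mvpd=MVPD):
    """Pick an outer destination address from the companion prefix.

    Any address in the companion prefix would do; the first host
    address is chosen here purely for illustration.
    """
    companion = companion_for(epa, mvpd)
    if companion is None:
        return None
    return companion.network_address + 1


# Example lookups against the hypothetical table.
covered = companion_for("2001:db8:1::1")
uncovered = companion_for("2001:db9::1")
anycast_dst = outer_destination("2001:db8::5")
```

A real Server would load this table from a nearby MVPd replica and refresh it with periodic deltas, as described above; the in-memory dictionary stands in for that step.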
1608 The Server can then forward packets toward EPAs covered by a VP by 1609 encapsulating them in an outer header of the VP's companion prefix 1610 address family and using any address taken from the companion prefix 1611 as the outer destination address. The companion prefix therefore 1612 serves as an anycast prefix. 1614 Possible encapsulations in this model include IPv6-in-IPv4, IPv4-in- 1615 IPv6, OSI/CLNP-in-IPv6, OSI/CLNP-in-IPv4, etc. 1617 Appendix B. Scaling Considerations 1619 Scaling aspects of the IRON architecture have strong implications for 1620 its applicability in practical deployments. Scaling must be 1621 considered along multiple vectors including interdomain core routing 1622 scaling, scaling to accommodate large numbers of customer EUNs, 1623 traffic scaling, state requirements, etc. 1625 In terms of routing scaling, each VPC will advertise one or more VPs 1626 from which EPs are delegated to customer EUNs. The routing load on the 1627 interdomain core will therefore be minimized when each VP covers many EPs. For example, 1628 the IPv6 prefix 2001:DB8::/32 contains 2^24 ::/56 EP prefixes for 1629 assignment to EUNs. The IRON could therefore accommodate 2^32 ::/56 1630 EPs with only 2^8 ::/32 VPs advertised in the interdomain routing 1631 core. 1633 In terms of traffic scaling for Relays, each Relay represents an ASBR 1634 of a "shell" enterprise network that simply directs arriving traffic 1635 packets with EPA destination addresses towards Servers that service 1636 customer EUNs. Moreover, the Relay sheds traffic destined to EPAs 1637 through redirection which removes it from the path for the vast 1638 majority of traffic packets. On the other hand, each Relay must 1639 handle all traffic packets forwarded between its customer EUNs and 1640 the non-IRON Internet. The scaling concerns for this latter class of 1641 traffic are no different than for ASBR routers that connect large 1642 enterprise networks to the Internet.
In terms of traffic scaling for 1643 Servers, each Server services a set of the VPC overlay network's 1644 customer EUNs. The Server services all traffic packets destined to 1645 its EUNs but only services the initial packets of flows initiated 1646 from the EUNs and destined to EPAs. Therefore, traffic scaling for 1647 EPA-addressed traffic is an asymmetric consideration and is 1648 proportional to the number of EUNs each Server serves. 1650 In terms of state requirements for Relays, each Relay maintains a 1651 list of all Servers in the VPC overlay network as well as FIB entries 1652 for all customer EUNs that each Server serves. This state is 1653 therefore dominated by the number of EUNs in the VPC overlay network. 1654 Sizing the Relay to accommodate state information for all EUNs is 1655 therefore required during VPC overlay network planning. In terms of 1656 state requirements for Servers, each Server maintains tunnel state 1657 for each of the customer EUNs it serves but need not keep state for 1658 all EUNs in the VPC overlay network. Finally, neither Relays nor 1659 Servers need keep state for final destinations of outbound traffic. 1661 Clients source and sink all traffic packets originating from or 1662 destined to the customer EUN. Therefore traffic scaling 1663 considerations for Clients are the same as for any site border 1664 router. Clients also retain state for the Servers for final 1665 destinations of outbound traffic flows. This can be managed as soft 1666 state, since stale entries purged from the cache will be refreshed 1667 when new traffic packets are sent. 1669 Author's Address 1671 Fred L. Templin (editor) 1672 Boeing Research & Technology 1673 P.O. Box 3707 MC 7L-49 1674 Seattle, WA 98124 1675 USA 1677 Email: fltemplin@acm.org