idnits 2.17.1 draft-templin-iron-11.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (August 26, 2010) is 4990 days in the past. Is this intentional? Checking references for intended status: Experimental ---------------------------------------------------------------------------- == Missing Reference: 'VP' is mentioned on line 468, but not defined == Missing Reference: 'CE' is mentioned on line 1234, but not defined == Missing Reference: 'VE' is mentioned on line 1548, but not defined == Missing Reference: 'VC' is mentioned on line 1543, but not defined == Missing Reference: 'B' is mentioned on line 1234, but not defined == Unused Reference: 'RFC3849' is defined on line 1425, but no explicit reference was found in the text == Unused Reference: 'RFC5737' is defined on line 1452, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) == Outdated reference: A later version (-06) exists of draft-ietf-grow-va-02 == Outdated reference: A later version (-04) exists of draft-ietf-v6ops-tunnel-security-concerns-02 == Outdated reference: A later version (-68) exists of draft-templin-intarea-seal-16 == Outdated reference: A later version (-40) exists of draft-templin-intarea-vet-16 Summary: 1 error (**), 0 
flaws (~~), 12 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Research Task Force F. Templin, Ed. 3 (IRTF) Boeing Research & Technology 4 Internet-Draft August 26, 2010 5 Intended status: Experimental 6 Expires: February 27, 2011 8 The Internet Routing Overlay Network (IRON) 9 draft-templin-iron-11.txt 11 Abstract 13 Since the Internet must continue to support escalating growth due to 14 increasing demand, it is clear that current routing architectures and 15 operational practices must be updated. This document proposes an 16 Internet Routing Overlay Network (IRON) that supports sustainable 17 growth through Provider Independent addressing while requiring no 18 changes to end systems and no changes to the existing routing system. 19 IRON further addresses other important issues including routing 20 scaling, mobility management, multihoming, traffic engineering and 21 NAT traversal. While business considerations are an important 22 determining factor for widespread adoption, they are out of scope for 23 this document. This document is a product of the IRTF Routing 24 Research Group. 26 Status of this Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current Internet- 34 Drafts is at http://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on February 27, 2011. 
43 Copyright Notice 45 Copyright (c) 2010 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (http://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document. Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 Table of Contents 60 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 61 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 62 3. The Internet Routing Overlay Network . . . . . . . . . . . . . 6 63 3.1. IR[CE] - IRON Customer Edge Router . . . . . . . . . . . . 8 64 3.2. IR[VE] - IRON Virtual Prefix Company Edge Router . . . . . 8 65 3.3. IR[VC] - IRON Virtual Prefix Company Core Router . . . . . 9 66 3.4. IR[VP] - IRON Virtual Prefix Company Combined Router . . . 10 67 4. IRON Organizational Principles . . . . . . . . . . . . . . . . 11 68 5. IRON Initialization . . . . . . . . . . . . . . . . . . . . . 12 69 5.1. IR[VC] Initialization . . . . . . . . . . . . . . . . . . 13 70 5.2. IR[VE] Initialization . . . . . . . . . . . . . . . . . . 13 71 5.3. IR[CE] Initialization . . . . . . . . . . . . . . . . . . 14 72 6. IRON Operation . . . . . . . . . . . . . . . . . . . . . . . . 15 73 6.1. IR[CE] Operation . . . . . . . . . . . . . . . . . . . . . 15 74 6.2. IR[VE] Operation . . . . . . . . . . . . . . . . . . . . . 17 75 6.3. IR[VC] Operation . . . . . . . . . . . . . . . . . . . . . 18 76 6.4. IRON Reference Operating Scenarios . . . . . . . . . . . . 19 77 6.4.1. Both Hosts Within IRON EUNs . . . . . . . . . . . . . 19 78 6.4.2. 
Mixed IRON and Non-IRON Hosts . . . . . . . . . . . . 22 79 6.5. Mobility, Multihoming and Traffic Engineering 80 Considerations . . . . . . . . . . . . . . . . . . . . . . 25 81 6.5.1. Mobility Management . . . . . . . . . . . . . . . . . 25 82 6.5.2. Multihoming . . . . . . . . . . . . . . . . . . . . . 26 83 6.5.3. Inbound Traffic Engineering . . . . . . . . . . . . . 26 84 6.5.4. Outbound Traffic Engineering . . . . . . . . . . . . . 26 85 6.6. Renumbering Considerations . . . . . . . . . . . . . . . . 26 86 6.7. NAT Traversal Considerations . . . . . . . . . . . . . . . 27 87 6.8. Nested EUN Considerations . . . . . . . . . . . . . . . . 27 88 6.8.1. Host A Sends Packets to Host Z . . . . . . . . . . . . 28 89 6.8.2. Host Z Sends Packets to Host A . . . . . . . . . . . . 29 90 7. Additional Considerations . . . . . . . . . . . . . . . . . . 30 91 8. Related Initiatives . . . . . . . . . . . . . . . . . . . . . 30 92 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 30 93 10. Security Considerations . . . . . . . . . . . . . . . . . . . 31 94 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 31 95 12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 31 96 12.1. Normative References . . . . . . . . . . . . . . . . . . . 31 97 12.2. Informative References . . . . . . . . . . . . . . . . . . 31 98 Appendix A. IRON VPs Over Internetworks with Different 99 Address Families . . . . . . . . . . . . . . . . . . 34 100 Appendix B. Scaling Considerations . . . . . . . . . . . . . . . 34 101 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 35 103 1. Introduction 105 Growth in the number of entries instantiated in the Internet routing 106 system has led to concerns for unsustainable routing scaling 107 [I-D.narten-radir-problem-statement]. 
Operational practices such as 108 increased use of multihoming with IPv4 Provider-Independent (PI) 109 addressing are resulting in more and more fine-grained prefixes 110 injected into the routing system from more and more end-user 111 networks. Furthermore, the forthcoming depletion of the public IPv4 112 address space has raised concerns for both increased deaggregation 113 (leading to yet further routing table entries) and an impending 114 address space run-out scenario. At the same time, the IPv6 routing 115 system is beginning to see growth in IPv6 Provider-Aggregated (PA) 116 prefixes [BGPMON] which must be managed in order to avoid the same 117 routing scaling issues the IPv4 Internet now faces. Since the 118 Internet must continue to scale to accommodate increasing demand, it 119 is clear that new routing methodologies and operational practices are 120 needed. 122 Several related works have investigated routing scaling issues. 123 Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in 124 Increasing Scopes (AIS) [I-D.zhang-evolution] are global routing 125 proposals that introduce routing overlays with Virtual Prefixes (VPs) 126 to reduce the number of entries required in each router's Forwarding 127 Information Base (FIB) and Routing Information Base (RIB). Routing 128 and Addressing in Networks with Global Enterprise Recursion (RANGER) 129 [RFC5720] examines recursive arrangements of enterprise networks that 130 can apply to a very broad set of use case scenarios 131 [I-D.russert-rangers]. In particular, RANGER supports encapsulation 132 and secure redirection by treating each layer in the recursive 133 hierarchy as a virtual non-broadcast, multiple access (NBMA) "link". 134 RANGER is an architectural framework that includes Virtual Enterprise 135 Traversal (VET) [I-D.templin-intarea-vet] and the Subnetwork 136 Encapsulation and Adaptation Layer (SEAL) [I-D.templin-intarea-seal] 137 as its functional building blocks. 
139 This document proposes an Internet Routing Overlay Network (IRON) 140 with goals of supporting sustainable growth while requiring no 141 changes to the existing routing system. IRON borrows concepts from 142 VA, AIS and RANGER, and further borrows concepts from the Internet 143 Vastly Improved Plumbing (Ivip) [I-D.whittle-ivip-arch] architecture 144 proposal along with its associated Translating Tunnel Router (TTR) 145 mobility extensions [TTRMOB]. Indeed, the TTR model to a great 146 degree inspired the IRON mobility architecture design discussed in 147 this document. The Network Address Translator (NAT) traversal 148 techniques adapted for IRON were inspired by the Simple Address 149 Mapping for Premises Legacy Equipment (SAMPLE) proposal 150 [I-D.carpenter-softwire-sample]. 152 IRON specifically seeks to provide scalable PI addressing without 153 changing the current BGP [RFC4271] routing system. IRON observes the 154 Internet Protocol standards [RFC0791][RFC2460]. Other network layer 155 protocols that can be encapsulated within IP packets (e.g., OSI/CLNP 156 [RFC1070], etc.) are also within scope. 158 The IRON is a global routing system comprising virtual overlay 159 networks managed by Virtual Prefix Companies (VPCs) that own and 160 manage Virtual Prefixes (VPs) from which End User Network (EUN) PI 161 prefixes (EPs) are delegated to customer sites. The IRON is 162 motivated by a growing customer demand for multihoming, mobility 163 management and traffic engineering while using stable PI addressing 164 to avoid network renumbering [RFC4192][RFC5887]. The IRON uses the 165 existing IPv4 and IPv6 global Internet routing systems as virtual 166 links for tunneling inner network protocol packets within outer IPv4 167 or IPv6 headers (see: Section 3). The IRON requires deployment of a 168 small number of new BGP core routers and supporting servers, as well 169 as IRON-aware routers/servers in customer EUNs. 
No modifications to 170 hosts, and no modifications to most routers are required. 172 Note: This document is offered in compliance with Internet Research 173 Task Force (IRTF) document stream procedures [RFC5743]; it is not an 174 IETF product and is not a standard. The views in this document were 175 considered controversial by the IRTF Routing Research Group (RRG) but 176 the RG reached a consensus that the document should still be 177 published. The document will undergo a period of review within the 178 RRG and through selected expert reviewers prior to publication. The 179 following sections discuss details of the IRON architecture. 181 2. Terminology 183 This document makes use of the following terms: 185 End User Network (EUN) 186 an edge network that connects an organization's devices (e.g., 187 computers, routers, printers, etc.) to the Internet and possibly 188 also the IRON. 190 Internet Service Provider (ISP) 191 a service provider which physically connects customer EUNs to the 192 Internet. In other words, an ISP is responsible for providing IP 193 connectivity to a customer owning an EUN. 195 Provider Aggregated (PA) address or prefix 196 a network layer address or prefix delegated to an EUN by an ISP. 198 Provider Independent (PI) address or prefix 199 a network layer address or prefix delegated to an EUN by a third 200 party independently of the EUN's ISP arrangements. 202 Virtual Prefix (VP) 203 a PI prefix block (e.g., an IPv4 /16, an IPv6 /20, an OSI NSAP 204 prefix, etc.) that is owned and managed by a Virtual Prefix 205 Company (VPC). 207 End User Network PI prefix (EP) 208 a more-specific PI prefix derived from a VP (e.g., an IPv4 /28, an 209 IPv6 /56, etc.) and delegated to an EUN by a VPC. 211 EP Address (EPA) 212 a network layer address belonging to an EP and assigned to the 213 interface of an end system in an EUN. 215 Locator 216 an IP address assigned to the interface of a router or end system 217 within a public or private network. 
Locators taken from public IP 218 prefixes are routable on a global basis, while locators taken from 219 private IP prefixes are made public via Network Address 220 Translation (NAT). 222 Virtual Prefix Company (VPC) 223 a company that owns and manages a set of VPs from which it 224 delegates End User Network PI Prefixes (EPs) to EUNs. 226 Internet Routing Overlay Network (IRON) 227 an overlay network configured over the global Internet. The IRON 228 supports routing through encapsulation of inner packets with EPA 229 addresses within outer headers that use locator addresses. 231 implicit anycast 232 an anycast discovery procedure whereby a customer router discovers 233 provider routers that are topologically nearby. Also a means by 234 which a router on the path to a tunnel egress makes its presence 235 known by sending a redirect informing the tunnel ingress of a 236 better route. 238 3. The Internet Routing Overlay Network 240 The Internet Routing Overlay Network (IRON) consists of IRON Routers 241 (IRs) that automatically tunnel the packets of end-to-end 242 communication sessions within encapsulating headers used for 243 Internetwork routing. IRs use Virtual Enterprise Traversal (VET) 244 [I-D.templin-intarea-vet] in conjunction with the Subnetwork 245 Encapsulation and Adaptation Layer (SEAL) [I-D.templin-intarea-seal] 246 to encapsulate inner network layer packets within outer headers as 247 shown in Figure 1: 249 +-------------------------+ 250 | Outer headers with | 251 ~ locator addresses ~ 252 | (IPv4 or IPv6) | 253 +-------------------------+ 254 | SEAL Header | 255 +-------------------------+ +-------------------------+ 256 | Inner Packet Header | --> | Inner Packet Header | 257 ~ with EP addresses ~ --> ~ with EP addresses ~ 258 | (IPv4, IPv6, OSI, etc.) | --> | (IPv4, IPv6, OSI, etc.) 
| 259 +-------------------------+ +-------------------------+ 260 | | --> | | 261 ~ Inner Packet Body ~ --> ~ Inner Packet Body ~ 262 | | --> | | 263 +-------------------------+ +-------------------------+ 265 Inner packet Outer packet 266 before encapsulation after encapsulation 268 Figure 1: Encapsulation of Inner Packets Within Outer IP Headers 270 VET specifies the automatic tunneling mechanisms used for 271 encapsulation, while SEAL specifies the format and usage of the SEAL 272 header as well as a set of control messages. Most notably, IRs use 273 the SEAL Control Message Protocol (SCMP) to deterministically 274 exchange and authenticate control messages such as route 275 redirections, indications of Path Maximum Transmission Unit (PMTU) 276 limitations, destination unreachables, etc. 278 The IRON is manifested through a business model in which Virtual 279 Prefix Companies (VPCs) own and manage virtual overlay networks 280 comprising a set of IRs that are distributed throughout the Internet 281 and serve highly-aggregated Virtual Prefixes (VPs). VPCs delegate 282 sub-prefixes from their VPs which they lease to customers as End User 283 Network PI prefixes (EPs). The customers in turn assign the EPs to 284 their customer edge IRs which connect their End User Networks (EUNs) 285 to the IRON. 287 VPCs may have no affiliation with the ISP networks from which 288 customers obtain their basic Internet connectivity. Therefore, 289 unless the ISP also acts as a VPC the customer must have two business 290 relationships - one with the ISP and a second with the VPC. In that 291 case, the VPC can open for business and begin serving its customers 292 immediately without the need to coordinate its activities with ISPs 293 or with other VPCs. Further details on business considerations are 294 out of scope for this document. 296 The IRON requires no changes to end systems and no changes to most 297 routers in the Internet. 
Instead, the IRON comprises IRs that are 298 deployed either as new platforms or as modifications to existing 299 platforms. IRs may be deployed incrementally without disturbing the 300 existing Internet routing system, and act as waypoints (or "cairns") 301 for navigating the IRON. The functional roles for IRs are described 302 in the following sections. 304 3.1. IR[CE] - IRON Customer Edge Router 306 An IR[CE] is a Customer Edge router (or host with embedded gateway 307 function) that logically connects the customer's EUNs and their 308 associated EPs to the IRON via tunnels as shown in Figure 2. IR[CE]s 309 obtain EPs from VPCs and use them to number subnets and interfaces 310 within their EUNs. An IR[CE] can be deployed on the same physical 311 platform that also connects the customer's EUNs to its ISPs, but it 312 may also be a separate router or even a standalone server system 313 located within the EUN. (This model applies even if the EUN connects 314 to the ISP via a Network Address Translator (NAT) - see Section 6.7). 315 .-. 316 ,-( _)-. 317 +--------+ .-(_ (_ )-. 318 | IR[CE] |--(_ ISP ) 319 +---+----+ `-(______)-' 320 | <= T \ .-. 321 .-. u \ ,-( _)-. 322 ,-( _)-. n .-(_ (- )-. 323 .-(_ (_ )-. n (_ Internet ) 324 (_ EUN ) e `-(______)- 325 `-(______)-' l ___ 326 | s => (:::)-. 327 +----+---+ .-(::::::::) 328 | Host | .-(::::::::::::)-. 329 +--------+ (:::: The IRON ::::) 330 `-(::::::::::::)-' 331 `-(::::::)-' 333 Figure 2: IR[CE] Connecting EUN to the IRON 335 3.2. IR[VE] - IRON Virtual Prefix Company Edge Router 337 An IR[VE] is a VPC's overlay network edge router that provides 338 forwarding and mapping services for the EPs owned by customer 339 IR[CE]s. In typical deployments, a VPC will deploy many IR[VE]s 340 around the IRON in a globally-distributed fashion (e.g., as depicted 341 in Figure 3) so that IR[CE] clients can discover those that are 342 nearby. 
344 +--------+ +--------+ 345 | IR[VE] | | IR[VE] | 346 | Boston | | Tokyo | 347 +--+-----+ ++-------+ 348 +--------+ \ / 349 | IR[VE] | \ ___ / 350 | Seattle| \ (:::)-. +--------+ 351 +------+-+ .-(::::::::)------+ IR[VE] | 352 \.-(::::::::::::)-. | Paris | 353 (:::: The IRON ::::) +--------+ 354 `-(::::::::::::)-' 355 +--------+ / `-(::::::)-' \ +--------+ 356 | IR[VE] + | \--- + IR[VE] | 357 | Moscow | +----+---+ | Sydney | 358 +--------+ | IR[VE] | +--------+ 359 | Cairo | 360 +--------+ 362 Figure 3: IR[VE] Global Distribution Example 364 Each IR[VE] serves as a customer-facing tunnel endpoint router that 365 IR[CE]s form bidirectional tunnels with over the IRON. Each IR[VE] 366 also associates with an Internet-facing IR[VC] that can forward 367 packets from the IRON out to the native public Internet and vice- 368 versa as discussed in the next section. 370 3.3. IR[VC] - IRON Virtual Prefix Company Core Router 372 An IR[VC] is a VPC's overlay network core router that acts as a 373 gateway between the IRON and the native public Internet. It 374 therefore also serves as an Autonomous System Border Router (ASBR) 375 that is owned and managed by the VPC. 377 Each VPC configures one or more IR[VC]s which advertise the company's 378 VPs into the IPv4 and IPv6 global Internet BGP routing systems. Each 379 IR[VC] associates with all of the VPC's overlay network IR[VE] 380 routers, e.g., via tunnels over the IRON, via a direct interconnect 381 such as an Ethernet cable, etc. The IR[VC] role (as well as its 382 relationship with overlay network IR[VE]s) is depicted in Figure 4: 384 ,-( _)-. 385 .-(_ (_ )-. 386 (_ Internet ) 387 `-(______)-' | +--------+ 388 | |--| IR[VE] | 389 +----+---+ | +--------+ 390 | IR[VC] |----| +--------+ 391 +--------+ |--| IR[VE] | 392 _|| | +--------+ 393 (:::)-. (Ethernet) 394 .-(::::::::) 395 +--------+ .-(::::::::::::)-. 
+--------+ 396 | IR[VE] |=(:::: The IRON ::::)=| IR[VE] | 397 +--------+ `-(::::::::::::)-' +--------+ 398 `-(::::::)-' 399 || (Tunnels) 400 +--------+ 401 | IR[VE] | 402 +--------+ 404 Figure 4: IR[VC] Connecting IRON to Native Internet 406 3.4. IR[VP] - IRON Virtual Prefix Company Combined Router 408 An IR[VP] is a VPC's overlay network router that combines the 409 functions of both the IR[VE] and IR[VC]. While not in itself a 410 fundamental building block of the architecture, it is mentioned here 411 to clarify an implementation option available to VPCs. 413 In the IR[VP] model, the IR[VE] and IR[VC] functions can be thought 414 of as "half-gateway" functions that together comprise a unified 415 IR[VP]. The IR[VE] and IR[VC] functions can therefore be discussed 416 separately even when both functions reside within the same physical 417 IR[VP] platform as shown in Figure 5: 419 ,-( _)-. 420 .-(_ (_ )-. 421 (_ Internet ) 422 `-(______)-' 423 | 424 +----------+----------+ 425 | IR[VC] half-gateway | 426 +---------------------+ 427 | IR[VE] half-gateway | 428 +----------+----------+ 429 <- IR[VP] Unified Gateway -> 430 _|_ 431 (:::)-. 432 .-(::::::::) 433 .-(::::::::::::)-. 434 (:::: The IRON ::::) 435 `-(::::::::::::)-' 436 `-(::::::)-' 438 Figure 5: IR[VP] Combining IR[VE] and IR[VC] Functions 440 4. IRON Organizational Principles 442 The IRON consists of the union of all VPC overlay networks worldwide 443 (where each VPC configures one or more overlay networks). Each such 444 overlay network represents a distinct "patch" on the Internet 445 "quilt", where the patches are stitched together by tunnels over the 446 links, routers, bridges, etc. that connect the public Internet. When 447 a new VPC overlay network is deployed, it becomes yet another patch 448 on the quilt. 
The IRON is therefore a composite overlay network 449 consisting of multiple individual patches, where each patch 450 coordinates its activities independently of all others (with the 451 exception that the IR[VE]s of each patch must be aware of all VPs in 452 the IRON). 454 Each VPC overlay network in the IRON maintains a set of IR[VC]s that 455 connect the overlay network directly to the public IPv4 and IPv6 456 Internets. Each IR[VC] advertises the VPC overlay network's IPv4 VPs 457 into the IPv4 BGP routing system and advertises the overlay network's 458 IPv6 VPs into the IPv6 BGP routing system. IR[VC]s will therefore 459 receive packets with EPA destination addresses sent by end systems in 460 the Internet and direct them toward EPA-addressed end systems 461 connected to the VPC overlay network. 463 Each VPC overlay network also manages a set of IR[VE]s that connect 464 customer EUNs to the IRON and to the IPv6 and IPv4 Internets via 465 their associations with IR[VC]s. IR[VE]s therefore need not be BGP 466 routers themselves and can be simple commodity hardware platforms. 467 Moreover, the IR[VE] and IR[VC] functions can be deployed together on 468 the same physical platform as an IR[VP] or they may be deployed on 469 separate platforms (e.g., for load balancing purposes). 471 Each IR[VE] maintains a working set of IR[CE]s for which it caches 472 EP-to-IR[CE] mappings in its Forwarding Information Base (FIB). Each 473 IR[VE] also in turn propagates the list of EPs in its working set to 474 each of the IR[VC]s in the VPC overlay network via a dynamic routing 475 protocol (e.g., an overlay network internal BGP instance that carries 476 only the EP-to-IR[VE] mappings and does not interact with the 477 external BGP routing system). 
Each IR[VE] therefore only needs to 478 track the EPs for its current working set of IR[CE]s, while each 479 IR[VC] will maintain a full EP-to-IR[VE] mapping table that 480 represents reachability information for all EPs in the VPC overlay 481 network. 483 Customers establish IR[CE]s to connect their EUNs to both the VPC 484 overlay network and to the rest of the IRON. Each EUN can connect to 485 the IRON via one or multiple IR[CE]s as long as the multiple IR[CE]s 486 coordinate with one another, e.g., to mitigate EUN partitions. 487 Unlike IR[VC]s and IR[VE]s, IR[CE]s may use private addresses behind 488 one or several layers of NATs. The IR[CE] initially discovers a list 489 of nearby IR[VE]s through an "implicit anycast" discovery process 490 (described below). It then selects one of these nearby IR[VE]s as 491 its server and forms a bidirectional tunnel with the IR[VE] through 492 an initial exchange followed by periodic keepalives. 494 After the IR[CE] selects a serving IR[VE], it forwards initial 495 outbound packets from its EUNs by tunneling them to its own serving 496 IR[VE] which in turn forwards them to the nearest IR[VC] within the 497 IRON that serves the final destination. The IR[CE] will subsequently 498 receive redirect messages informing it of a more direct route through 499 the IR[VE] that serves the final destination. 501 The IRON can also be used to support VPs of network layer address 502 families that cannot be routed natively in the underlying 503 Internetwork (e.g., OSI/CLNP over the public Internet, IPv6 over 504 IPv4-only Internetworks, IPv4 over IPv6-only Internetworks, etc.). 505 Further details for support of IRON VPs over non-native Internetworks 506 are discussed in Appendix A. 508 5. IRON Initialization 510 IRON initialization entails the startup actions of IRs within the VPC 511 overlay network and customer EUNs. The following sections discuss 512 these startup procedures. 514 5.1. 
IR[VC] Initialization 516 Before its first operational use, each IR[VC] in a VPC overlay 517 network is provisioned with the list of VPs that it will serve as 518 well as the locators for all IR[VE]s that belong to the same overlay 519 network. The IR[VC] is also provisioned with external BGP 520 interconnections the same as for any BGP router. 522 Upon startup, the IR[VC] engages in BGP routing exchanges with its 523 peers in the IPv4 and IPv6 Internets the same as for any BGP router. 524 It then connects to all of the IR[VE]s in the overlay network (e.g., 525 via a TCP connection over a bidirectional tunnel, via an iBGP route 526 reflector, etc.) for the purpose of discovering EP->IR[VE] mappings. 527 After the IR[VC] has fully populated its EP->IR[VE] mapping 528 information database, it is said to be "synchronized" with respect to its VPs. 530 After this initial synchronization procedure, the IR[VC] then 531 advertises the overlay network's VPs externally. In particular, the 532 IR[VC] advertises the IPv6 VPs into the IPv6 BGP routing system and 533 advertises the IPv4 VPs into the IPv4 BGP routing system. If the 534 IR[VC] only services IPv6 VPs (e.g., 2001:DB8::/32), it advertises 535 the IPv6 VPs into the IPv6 routing system and also advertises a 536 companion IPv4 prefix (e.g., 192.0.2.0/24) into the IPv4 routing 537 system that can be used by IR[CE]s/IR[VE]s from other VPC overlay 538 networks for implicit anycast discovery purposes. Similarly, if the 539 IR[VC] only services IPv4 VPs, it also advertises a companion IPv6 540 prefix (e.g., 2001:DB8::/56) into the IPv6 routing system. (See 541 Appendix A for more information on the discovery and use of companion 542 prefixes.) The IR[VC] then engages in ordinary packet forwarding 543 operations. 545 5.2. IR[VE] Initialization 547 Before its first operational use, each IR[VE] in a VPC overlay 548 network is provisioned with the locators for all IR[VC]s that serve 549 the overlay network's VPs. 
In order to support route optimization, 550 the IR[VE] must also be provisioned with the list of all VPs in the 551 IRON (i.e., not just the VPs of its own overlay network) so that 552 it can distinguish EPA from non-EPA addresses. (The IR[VE] could 553 therefore be greatly simplified if the list of VPs could be covered 554 within a small number of very short prefixes, e.g., one or a few IPv6 555 ::/20s.) The IR[VE] should also discover the VP companion prefix 556 relationships discussed in Section 5.1, e.g., via a global database 557 such as discussed in Appendix A. 559 Upon startup, each IR[VE] must connect to all of the IR[VC]s within 560 its overlay network (e.g., via a TCP connection over a bidirectional 561 tunnel, via an iBGP route reflector, etc.) for the purpose of 562 reporting its EP->IR[VE] mappings. The IR[VE] then actively listens 563 for IR[CE] customers which register their EPs as part of 564 establishing a bidirectional tunnel. When a new IR[CE] registers its 565 EPs, the IR[VE] announces the new EP additions to all 566 IR[VC]s; when an existing IR[CE] unregisters its EPs, the 567 IR[VE] withdraws its announcements. 569 5.3. IR[CE] Initialization 571 Before its first operational use, each IR[CE] must obtain one or more 572 EPs from its VPC as well as any companion prefixes of other address 573 families (see Section 5.1) associated with the EPs. The IR[CE] must 574 also obtain a certificate and a public/private key pair from the VPC 575 that it can later use to prove ownership of its EPs. This implies 576 that each VPC must run its own key infrastructure to be used only for 577 the purpose of verifying a customer's claimed right to use an EP. 578 Hence, the VPC need not coordinate its key infrastructure with any 579 other organization. 
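As an illustration of the Section 5.2 requirement that an IR[VE] be able to tell EPA destinations apart from ordinary Internet destinations, the following is a minimal sketch that checks an address against a provisioned VP list. The function name is_epa, the IRON_VPS list and the prefixes in it are assumptions made for this example (documentation prefixes, not real VPs); the draft does not specify any particular data structure or lookup method.

```python
import ipaddress

# Hypothetical VP list for illustration only; these are documentation
# prefixes, not real Virtual Prefixes.
IRON_VPS = [
    ipaddress.ip_network("2001:db8::/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_epa(address, vps=IRON_VPS):
    """Return True if the address falls within any provisioned VP.

    An address covered by a VP is treated as an EP address (EPA);
    anything else is treated as a non-EPA (native Internet) address.
    """
    addr = ipaddress.ip_address(address)
    # Compare only against VPs of the same address family.
    return any(addr.version == vp.version and addr in vp for vp in vps)
```

A linear scan is adequate when the VP list is covered by a handful of short prefixes, which is the simplification the text suggests; a larger list would more likely call for a prefix trie.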
581 Upon startup, the IR[CE] sends a SEAL Control Message Protocol (SCMP) 582 Router Solicitation (SRS) message using an implicit anycast procedure 583 to discover the nearest IR[VC] in its VPC overlay network. The 584 IR[VC] will in turn return a list of locators of the company's nearby 585 IR[VE]s. (This list is analogous to the ISATAP Potential Router List 586 (PRL) [RFC5214].) 588 To perform the implicit anycast procedure, the IR[CE] sets the source 589 address of the SRS message to one of its locator addresses and sets 590 the destination address of the message to any EPA taken from one of 591 its own EPs. (If the EP is of a different address family than the 592 IR[CE]'s locators, however, the IR[CE] instead sets the destination 593 address to any address taken from the companion prefix associated 594 with the EP.) This SRS message will be delivered to the nearest 595 IR[VC] that attaches the VPC overlay network to the Internet. When 596 the IR[VC] receives the SRS message, it sends back an SCMP Router 597 Advertisement (SRA) message that lists the locator addresses of one 598 or more nearby IR[VE] routers. 600 After the IR[CE] receives an SRA message from the nearby IR[VC] 601 listing the locator addresses of nearby IR[VE]s, it sends SRS test 602 messages to one or more of the locator addresses to elicit SRA 603 messages. The IR[VE] that configures the locator will include the 604 header of the soliciting SRS message in its SRA message so that the 605 IR[CE] can determine the number of hops along the forward path. The 606 IR[VE] also includes a metric in its SRA messages indicating its 607 service availability so that the IR[CE] can avoid selecting IR[VE]s 608 that are overloaded. The IR[VE] also includes a challenge/response 609 puzzle that the IR[CE] must answer if it wishes to enlist this 610 IR[VE]'s services. 
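The addressing rule of the implicit anycast procedure described above can be sketched as follows. This is an illustrative sketch only: the function name srs_destination and its parameters are hypothetical, and because the text allows "any" address within the prefix, the choice of the first address after the network address is an arbitrary assumption rather than anything the draft mandates.

```python
import ipaddress


def srs_destination(locator, ep, companion=None):
    """Pick a destination address for the initial SRS message.

    locator   -- the IR[CE] locator used as the SRS source address
    ep        -- one of the IR[CE]'s own EPs (a prefix string)
    companion -- companion prefix of the other address family, if any
    """
    loc = ipaddress.ip_address(locator)
    ep_net = ipaddress.ip_network(ep)

    if ep_net.version == loc.version:
        # Same address family: use an EPA taken from the EP itself.
        prefix = ep_net
    elif companion is not None:
        # Different family: fall back to the EP's companion prefix.
        prefix = ipaddress.ip_network(companion)
        if prefix.version != loc.version:
            raise ValueError("companion prefix does not match locator family")
    else:
        raise ValueError("no prefix available in the locator's address family")

    # Arbitrary choice (assumption): first address after the network address.
    if prefix.num_addresses == 1:
        return prefix.network_address
    return prefix.network_address + 1
```

For example, an IR[CE] with an IPv4 locator and an IPv6 EP would address its SRS to an address inside the IPv4 companion prefix, so the message is still routed toward the nearest IR[VC] that advertises that prefix.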
612 When the IR[CE] receives these SRA messages, it can measure the round 613 trip time between sending the SRS and receiving the SRA as an 614 indication of round-trip delay. If the IR[CE] wishes to enlist the 615 services of a specific IR[VE] (e.g., based on the measured 616 performance), it then calculates the answer to the puzzle using its 617 keying information and sends the answer back to the IR[VE] in a new 618 SRS message that also contains all of the IR[CE]'s EP prefixes for 619 which it claims ownership. If the IR[CE] answered the puzzle 620 correctly, the IR[VE] will send back a new SRA message that includes 621 a non-zero default router lifetime and that signifies the 622 establishment of a bidirectional tunnel. (A zero default router 623 lifetime on the other hand signifies that the IR[VE] is currently 624 unable to establish a bidirectional tunnel, e.g., due to heavy load, 625 due to challenge/response failure, etc.) 627 Note that in the above procedure it is essential that the IR[CE] 628 select one and only one IR[VE]. This is to allow the VPC overlay 629 network mapping system to have one and only one active EP-to-IR[VE] 630 mapping at any point in time which shares fate with the IR[VE] 631 itself. If this IR[VE] fails, the IR[CE] will quickly select a new 632 one which will automatically update the VPC overlay network mapping 633 system with a new EP-to-IR[VE] mapping. 635 6. IRON Operation 637 Following the IRON initialization detailed in Section 5, IRs engage 638 in the steady-state process of receiving and forwarding packets. All 639 IRs forward encapsulated packets over the IRON using the mechanisms 640 of VET [I-D.templin-intarea-vet] and SEAL [I-D.templin-intarea-seal], 641 while IR[VC]s (and in some cases IR[VE]s) additionally forward 642 packets to and from the native IPv6 and IPv4 Internets. 
IRs also use 643 the SEAL Control Message Protocol (SCMP) to coordinate with other 644 IRs, including the process of sending and receiving redirect 645 messages, error messages, etc. (Note however that an IR must not 646 send an SCMP message in response to an SCMP error message.) Each IR 647 operates as specified in the following sub-sections. 649 6.1. IR[CE] Operation 651 After selecting its serving IR[VE] as specified in Section 5.3, the 652 IR[CE] should register each of its ISP connections with the IR[VE] in 653 order to establish multiple bidirectional tunnels for multihoming 654 purposes. To do so, it sends periodic SRS messages to its serving 655 IR[VE] via each of its ISPs to establish additional bidirectional 656 tunnels and to keep each tunnel alive. These messages need not 657 include challenge/response mechanisms since prefix proof of ownership 658 was already established in the initial exchange and a nonce in the 659 SEAL header can be used to confirm that the SRS message was sent by 660 the correct IR[CE]. This implies that a single nonce is used to 661 represent the set of all bidirectional tunnels between the IR[CE] and 662 the IR[VE]. Therefore, there are multiple bidirectional tunnels, and 663 the nonce names this "bundle" of tunnels. (The IR[CE] and IR[VE] may 664 conceptually represent this "bundle" as a single tunnel with multiple 665 locator addresses, however each such locator address must be tested 666 independently in case there are NATs on the path.) 668 If the IR[CE] ceases to receive SRA messages from its serving IR[VE] 669 via a specific ISP connection, it marks the IR[VE] as unreachable 670 from that address and therefore over that ISP connection. (The 671 IR[CE] should also inform its serving IR[VE] of this outage via one 672 of its working ISP connections.) 
If the IR[CE] ceases to receive SRA 673 messages from its serving IR[VE] via multiple ISP connections, it 674 marks the IR[VE] as unusable and quickly attempts to establish a 675 bidirectional tunnel with a new IR[VE]. The act of establishing the 676 tunnel with a new serving IR[VE] will automatically purge the stale 677 mapping state associated with the old serving IR[VE]. 679 When an end system in an EUN sends a flow of packets to a 680 correspondent, the packets are forwarded through the EUN via normal 681 routing until they reach the IR[CE], which then tunnels the initial 682 packets to its serving IR[VE] as the next hop. In particular, the 683 IR[CE] encapsulates each packet in an outer header with its locator 684 as the source address and the locator of its serving IR[VE] as the 685 destination address. Note that after sending the initial packets of 686 a flow, the IR[CE] may receive critical SCMP messages such as 687 indications of PMTU limitations, redirects that point to a better 688 next hop, etc. It is therefore essential that the IR[CE] send the 689 initial packets through its serving IR[VE] to avoid loss of SCMP 690 messages that cannot traverse a NAT in the reverse direction. 692 The IR[CE] uses the mechanisms specified in VET and SEAL to 693 encapsulate each forwarded packet. The IR[CE] further uses the SCMP 694 protocol to coordinate with other IRs, including accepting redirects 695 and other SCMP messages. When the IR[CE] receives an SCMP message, 696 it checks the nonce field of the encapsulated packet-in-error to 697 verify that the message corresponds to a packet that it had 698 previously sent and accepts the message if the nonce matches. (Note 699 however that the outer source and destination addresses of the 700 packet-in-error may be different than those in the original packet 701 due to possible IR[VE] and/or IR[VC] address rewritings.) 703 6.2. 
IR[VE] Operation 705 After an IR[VE] is initialized, it responds to SRSs from IR[CE]s by 706 sending SRAs as described in Section 6.1. When the IR[VE] receives 707 an SRS message from a new IR[CE], it sends back an SRA message with a 708 challenge/response puzzle. The IR[CE] in turn sends an SRS message 709 with an answer to the puzzle. If this authentication fails, the 710 IR[VE] discards the message. Otherwise, it creates tunnel state for 711 this new IR[CE], records the EPs in its FIB, and records the locator 712 address from the SCMP message as the link-layer address of the next 713 hop. The IR[VE] next sends an SRA message back to the IR[CE] to 714 complete the tunnel establishment. 716 When the IR[VE] receives a SEAL-encapsulated packet from one of its 717 IR[CE] tunnel endpoints, it examines the inner destination address. 718 If the inner destination address is not an EPA, the IR[VE] 719 decapsulates the packet and forwards it unencapsulated into the 720 Internet if it is able to do so without loss due to ingress 721 filtering. Otherwise, the IR[VE] re-encapsulates the packet (i.e., 722 it removes the outer header and replaces it with a new outer header 723 of the same address family) and sets the outer destination address to 724 the locator address of an IR[VC] within its VPC overlay network. It 725 then forwards the re-encapsulated packet to the IR[VC], which will in 726 turn decapsulate it and forward it into the Internet. 728 If the inner destination address is an EPA, however, the IR[VE] 729 rewrites the outer source address to one of its own locator addresses 730 and rewrites the outer destination address to the inner destination 731 address. (If the outer header is of a different address family than 732 the inner header, the IR[VE] instead rewrites the destination address 733 to any address taken from the companion prefix associated with the 734 inner destination address.)
The IR[VE] then forwards the revised 735 packet into the Internet via a default or more-specific route, where 736 it may be interpreted as an implicit anycast by a router within the 737 destination VPC overlay network. After sending the packet, the 738 IR[VE] may then receive an SCMP error or redirect message from an 739 IR[VC]/IR[VE] within the destination VPC overlay network. In that 740 case, the IR[VE] verifies that the nonce in the message matches the 741 tunnel corresponding to the IR[CE] that sent the original inner 742 packet and discards the message if the nonce does not match. 743 Otherwise, the IR[VE] re-encapsulates the SCMP message in a new outer 744 header that uses the source address, destination address and nonce 745 parameters associated with the tunnel to the IR[CE]; it then forwards 746 the message to the IR[CE]. This arrangement is necessary to allow 747 SCMP messages to flow through any NATs on the path. 749 When an IR[VE](A) receives a SEAL-encapsulated packet from an IR[VC] 750 or from the Internet, if the inner destination address matches an EP 751 in its FIB, IR[VE](A) re-encapsulates the packet in a new outer header 752 that uses the source address, destination address and nonce 753 parameters associated with the tunnel and forwards it to its client 754 IR[CE](B), which in turn decapsulates the packet and forwards it to 755 the correct end system in the EUN. If IR[CE](B) has left notice with 756 IR[VE](A) that it has moved to a new IR[VE](C), however, IR[VE](A) 757 will instead forward the packet to IR[VE](C) and also send an SCMP 758 redirect message back to the source of the packet. In this way, 759 IR[CE](B) can leave behind forwarding information when changing 760 between IR[VE]s (e.g., due to mobility events) without exposing 761 packets to loss. 763 6.3.
IR[VC] Operation 765 After an IR[VC] has synchronized its VPs (see Section 5.1), it 766 advertises the full set of the company's VPs into the IPv4 and IPv6 767 Internet BGP routing systems. The VPs will be represented as 768 ordinary routing information in the BGP, and any packets originating 769 from the IPv4 or IPv6 Internet destined to an EPA covered by one of 770 the VPs will be forwarded into the VPC's overlay network by an 771 IR[VC]. 773 When an IR[VC] receives a packet from the Internet destined either to 774 an EPA covered by one of its VPs or to an address within one of its 775 companion prefixes, it intercepts the packet as though it were 776 addressed to itself, i.e., to support the implicit anycast service 777 model. It then examines the packet format to determine the proper 778 handling procedures as follows: 780 o If the packet is an SCMP SRS message, the IR[VC] sends an SRA 781 message back to the source listing the locator addresses of nearby 782 IR[VE] routers, then discards the message. 784 o If the packet is not SEAL-encapsulated, the IR[VC] looks in its FIB 785 to discover a locator of the IR[VE] that serves the destination 786 address. The IR[VC] then simply encapsulates the packet with its 787 own locator as the outer source address and the locator of the 788 IR[VE] as the outer destination address and forwards the packet to 789 the IR[VE]. 791 o If the packet is SEAL-encapsulated, the IR[VC] sends an SCMP 792 redirect message of the same address family back to the source 793 with the locator of the serving IR[VE] as the redirected target. 794 The source and destination addresses of the SCMP redirect message 795 use the outer destination and source addresses of the original 796 packet, respectively. After sending the redirect message, the 797 IR[VC] then rewrites the outer destination address of the SEAL-encapsulated 798 packet to the locator of the IR[VE] and forwards the 799 revised packet to the IR[VE].
Note that in this arrangement any 800 errors that occur on the path between the IR[VC] and the IR[VE] 801 will be delivered to the original source but with a different 802 destination address due to this IR[VC] address rewriting. 804 6.4. IRON Reference Operating Scenarios 806 The IRON is used to support communications when one or both hosts are 807 located within EP-addressed EUNs regardless of whether the EPs are 808 provisioned by the same VPC or by different VPCs. When both hosts 809 are within IRON EUNs, route redirections that eliminate unnecessary 810 IR[VE]s and IR[VC]s from the path are possible. When only one host 811 is within an IRON EUN, however, route optimization cannot be used. 812 The following sections discuss the two scenarios. 814 6.4.1. Both Hosts Within IRON EUNs 816 When both hosts are within IRON EUNs, it is sufficient to consider 817 the scenario in a unidirectional fashion, i.e., by tracing packet 818 flows only in the forward direction from the source host to 819 destination host. The reverse direction can be considered 820 separately, and incurs the same considerations as for the forward 821 direction. 823 In this scenario, the initial packets of a flow produced by a source 824 host must flow through both the source's serving IR[VE] and an IR[VC] 825 of the destination host, but route optimization can eliminate these 826 elements from the path for subsequent packets in the flow. Figure 6 827 shows the flow of initial packets from host A to host B within two 828 IRON EUNs.
[Figure 6 diagram: Host A in IRON EUN A connects through IR[CE](A) and ISP A to IR[VE](A); packets cross the Internet to IR[VC](B), then to IR[VE](B), and on through ISP B to IR[CE](B) and Host B in IRON EUN B. A redirect is returned from IR[VC](B) toward IR[CE](A). The IRON is overlaid on the native Internet.]
859 Figure 6: Initial Packet Flow Before Redirects 861 With reference to Figure 6, host A sends packets destined to host B 862 via its network interface connected to EUN A. Routing within EUN A 863 will direct the packets to IR[CE](A) as a default router for the EUN, 864 which then uses VET and SEAL to encapsulate them in outer headers 865 with its locator address as the outer source address and the locator 866 address of its serving IR[VE](A) as the outer destination address. 867 IR[CE](A) then simply releases the encapsulated packets into its ISP 868 network connection that provided its locator. The ISP will release 869 the packets into the Internet without filtering since the (outer) 870 source address is topologically correct. Once the packets have been 871 released into the Internet, routing will direct them to IR[VE](A). 873 IR[VE](A) receives the encapsulated packets from IR[CE](A), then 874 rewrites the outer source address to one of its own locator 875 addresses, and rewrites the outer destination address to the inner 876 destination address. (If the outer header is of a different address 877 family than the inner header, however, the IR[VE] instead rewrites 878 the destination address to any address taken from the companion 879 prefix associated with the inner destination address.)
IR[VE](A) 880 then releases the revised packets into the Internet where routing 881 will direct them to IR[VC](B), which advertises a prefix that covers 882 the outer destination address. 884 IR[VC](B) will intercept the encapsulated packets from IR[VE](A), then 885 check its FIB to discover an entry that covers inner destination 886 address B with IR[VE](B) as the next hop. IR[VC](B) then returns 887 SCMP redirect messages to IR[VE](A) (*), rewrites the outer 888 destination address of the encapsulated packets to the locator 889 address of IR[VE](B), and forwards these revised packets to 890 IR[VE](B). 892 IR[VE](B) will receive the encapsulated packets from IR[VC](B), then 893 check its FIB to discover an entry that covers destination address B 894 with IR[CE](B) as the next hop. IR[VE](B) then re-encapsulates the 895 packets in a new outer header that uses the source address, 896 destination address and nonce parameters associated with the tunnel 897 to IR[CE](B). IR[VE](B) then releases these re-encapsulated packets 898 into the Internet, where routing will direct them to IR[CE](B). 899 IR[CE](B) will in turn decapsulate the packets and forward the inner 900 packets to host B via EUN B. 902 (*) Note that after the initial flow of packets, IR[VE](A) will have 903 received one or more SCMP redirect messages from IR[VC](B) informing 904 it of IR[VE](B) as a better next hop. IR[VE](A) will in turn forward 905 the redirects to IR[CE](A), which will thereafter forward its 906 encapsulated packets directly to the locator address of IR[VE](B) 907 without involving either IR[VE](A) or IR[VC](B), as shown in Figure 7:
[Figure 7 diagram: after redirects, packets from IR[CE](A) travel via ISP A and the Internet directly to IR[VE](B), which forwards them via ISP B to IR[CE](B) and Host B in IRON EUN B; IR[VE](A) and IR[VC](B) are no longer on the path.]
938 Figure 7: Sustained Packet Flow After Redirects 940 6.4.2. Mixed IRON and Non-IRON Hosts 942 When one host is within an IRON EUN and the other is in a non-IRON 943 EUN (i.e., one that connects to the native Internet instead of the 944 IRON), the IR elements involved depend on the packet flow directions. 945 The cases are described in the following sections. 947 6.4.2.1. From IRON Host A to Non-IRON Host B 949 Figure 8 depicts the IRON reference operating scenario for packets 950 flowing from Host A in an IRON EUN to Host B in a non-IRON EUN:
[Figure 8 diagram: Host A in IRON EUN A sends through IR[CE](A) and ISP A to IR[VE](A), which is drawn together with IR[VC](A) at the boundary between the IRON and the native Internet; packets then cross the native Internet via ISP B and Router B to Host B in non-IRON EUN B.]
981 Figure 8: From IRON Host A to Non-IRON Host B 983 In this scenario, host A sends packets destined to host B via its 984 network interface connected to IRON EUN A. Routing within EUN A will 985 direct the packets to IR[CE](A) as a default router for the EUN, which 986 then uses VET and SEAL to encapsulate them in outer headers with its 987 locator address as the outer source address and the locator address 988 of IR[VE](A) as the outer destination address. The ISP will pass the 989 packets without filtering since the (outer) source address is 990 topologically correct. Once the packets have been released into the 991 native Internet, routing will direct them to IR[VE](A). 993 IR[VE](A) receives the encapsulated packets from IR[CE](A), then re-encapsulates 994 and forwards them to IR[VC](A), which simply 995 decapsulates them and releases the unencapsulated packets into the 996 Internet. Once the packets are released into the Internet, routing 997 will direct them to the final destination B. (Note that IR[VE](A) and 998 IR[VC](A) are depicted in Figure 8 as two halves of a unified 999 IR[VP](A). In that case, the "forwarding" between IR[VE](A) and 1000 IR[VC](A) is a zero-instruction imaginary operation.) 1002 This scenario always involves an IR[VE](A) and IR[VC](A) owned by the 1003 VPC that provides service to IRON EUN A. It therefore imparts a cost 1004 that would need to be borne by either the VPC or its customers. 1006 6.4.2.2. From Non-IRON Host B to IRON Host A 1008 Figure 9 depicts the IRON reference operating scenario for packets 1009 flowing from Host B in a non-IRON EUN to Host A in an IRON EUN:
[Figure 9 diagram: Host B in non-IRON EUN B sends through Router B and ISP B across the native Internet to IR[VC](A), which is drawn together with IR[VE](A) at the boundary between the native Internet and the IRON; IR[VE](A) then tunnels the packets via ISP A to IR[CE](A) and Host A in IRON EUN A.]
1040 Figure 9: From Non-IRON Host B to IRON Host A 1042 In this scenario, host B sends packets destined to host A via its 1043 network interface connected to non-IRON EUN B. Routing will direct 1044 the packets to IR[VC](A), which then forwards them to IR[VE](A) using 1045 encapsulation if necessary. (Note that in this diagram IR[VE](A) and 1046 IR[VC](A) are depicted as two halves of a unified IR[VP](A). In that 1047 case, the "forwarding" between IR[VE](A) and IR[VC](A) is a zero-instruction 1048 imaginary operation.) 1050 IR[VE](A) will then check its FIB to discover an entry that covers 1051 destination address A with IR[CE](A) as the next hop. IR[VE](A) then 1052 (re-)encapsulates the packets in an outer header that uses the source 1053 address, destination address and nonce parameters associated with the 1054 tunnel to IR[CE](A). IR[VE](A) next releases these (re-)encapsulated 1055 packets into the Internet, where routing will direct them to 1056 IR[CE](A). IR[CE](A) will in turn decapsulate the packets and 1057 forward the inner packets to host A via its network interface 1058 connected to IRON EUN A. 1060 This scenario always involves an IR[VE](A) and IR[VC](A) owned by the 1061 VPC that provides service to IRON EUN A.
It therefore imparts a cost 1062 that would need to be borne by either the VPC or its customers. 1064 6.5. Mobility, Multihoming and Traffic Engineering Considerations 1066 While IR[VE]s and IR[VC]s can be considered as fixed infrastructure, 1067 IR[CE]s may need to move between different network points of 1068 attachment, connect to multiple ISPs, or explicitly manage their 1069 traffic flows. The following sections discuss mobility, multihoming 1070 and traffic engineering considerations for IR[CE]s. 1072 6.5.1. Mobility Management 1074 When an IR[CE] changes its network point of attachment (e.g., due to 1075 a mobility event), it configures one or more new locators. If the 1076 IR[CE] has not moved far away from its previous network point of 1077 attachment, it simply informs its serving IR[VE] of any locator 1078 additions or deletions. This operation is performance-sensitive, and 1079 should be conducted immediately to avoid packet loss. 1081 If the IR[CE] has moved far away from its previous network point of 1082 attachment, however, it re-issues the implicit anycast discovery 1083 procedure described in Section 5.3 to discover whether its candidate 1084 set of serving IR[VE]s has changed. If the IR[CE]'s current serving 1085 IR[VE] is also included in the new list received from the VPC, this 1086 serves as indication that the IR[CE] has not moved far enough to 1087 warrant changing to a new serving IR[VE]. Otherwise, the IR[CE] may 1088 wish to move to a new serving IR[VE] in order to maintain optimal 1089 routing. This operation is not performance-critical, and therefore 1090 can be conducted over a matter of seconds/minutes instead of 1091 milliseconds/microseconds. 1093 To move to a new IR[VE], the IR[CE] first engages in the EP 1094 registration process with the new IR[VE] and maintains the 1095 registrations through periodic SRS/SRA exchanges the same as 1096 described in Section 6.1.
The IR[CE] then informs its former IR[VE] 1097 that it has moved by providing it with the locator address of the new 1098 IR[VE]. The IR[CE] then discontinues the SRS/SRA keepalive process 1099 with the former IR[VE], which will garbage-collect the stale FIB 1100 entries when their lifetime expires. This will allow the former 1101 IR[VE] to redirect existing correspondents to the new IR[VE] so that 1102 no packets are lost. 1104 6.5.2. Multihoming 1106 An IR[CE] may register multiple locators with its serving IR[VE]. It 1107 can assign metrics with its registrations to inform its IR[VE] of 1108 preferred locators, and can select outgoing locators according to its 1109 local preferences. Multihoming is therefore naturally supported. 1111 6.5.3. Inbound Traffic Engineering 1113 An IR[CE] can dynamically adjust the priorities of its prefix 1114 registrations with its serving IR[VE] in order to influence inbound 1115 traffic flows. It can also change between serving IR[VE]s when 1116 multiple IR[VE]s are available, but should strive for stability in 1117 its IR[VE] selection in order to limit VPC network routing churn. 1119 6.5.4. Outbound Traffic Engineering 1121 An IR[CE] can select outgoing locators, e.g., based on current QoS 1122 considerations such as minimizing one-way delay or one-way delay 1123 variance. 1125 6.6. Renumbering Considerations 1127 As better link layer technologies and service plans emerge, customers 1128 will be motivated to select their service providers through healthy 1129 competition between ISPs. If a customer's EUN addresses are tied to 1130 a specific ISP, however, the customer may be forced to undergo a 1131 painstaking EUN renumbering process if it wishes to change to a 1132 different ISP [RFC4192][RFC5887]. 1134 When a customer obtains EP prefixes from a VPC, it can change between 1135 ISPs seamlessly and without need to renumber. 
If the VPC itself 1136 applies unreasonable costing structures for use of the EPs, however, 1137 the customer may be compelled to seek a different VPC and would again 1138 be required to confront a renumbering scenario. The IRON approach to 1139 renumbering avoidance therefore depends on VPCs conducting ethical 1140 business practices and offering reasonable rates. 1142 6.7. NAT Traversal Considerations 1144 The Internet today consists of a global public IPv4 routing and 1145 addressing system with non-IRON EUNs that use either public or 1146 private IPv4 addressing. The latter class of EUNs connect to the 1147 public Internet via Network Address Translators (NATs). When an 1148 IR[CE] is located behind a NAT, it selects IR[VE]s using the same 1149 procedures as for IR[CE]s with public addresses, i.e., it will send 1150 SRS messages to IR[VE]s in order to get SRA messages in return. The 1151 only requirement is that the IR[CE] must configure its SEAL 1152 encapsulation to use a transport protocol that supports NAT 1153 traversal, namely UDP. 1155 Since the IR[VE] maintains state about its IR[CE] customers, it can 1156 discover locator information for each IR[CE] by examining the UDP 1157 port number and IP address in the outer headers of SRS messages. 1158 When there is a NAT in the path, the UDP port number and IP address 1159 in the SRS message will correspond to state in the NAT box and might 1160 not correspond to the actual values assigned to the IR[CE]. The 1161 IR[VE] can then encapsulate packets destined to hosts serviced by the 1162 IR[CE] within outer headers that use this IP address and UDP port 1163 number. The NAT box will receive the packets, translate the values 1164 in the outer headers to match those assigned to the IR[CE], then 1165 forward the packets to the IR[CE]. In this sense, the IR[VE]'s 1166 "locator" for the IR[CE] consists of the concatenation of the IP 1167 address and UDP port number.
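The locator learning described above can be sketched in Python as a small table kept by the IR[VE]. This is a minimal illustration under stated assumptions: the class name `VeLocatorTable`, its method names, and the use of an opaque customer identifier (`ce_id`) as the key are hypothetical and are not taken from VET, SEAL, or this document.

```python
from typing import Dict, Tuple

# An IR[CE] "locator" as seen by the IR[VE]: the outer source IP address and
# outer UDP source port observed on an SRS message (post-NAT values, if any).
Locator = Tuple[str, int]


class VeLocatorTable:
    """Hypothetical per-customer locator state held by an IR[VE]."""

    def __init__(self) -> None:
        self._locators: Dict[str, Locator] = {}

    def on_srs(self, ce_id: str, outer_src_ip: str, outer_src_port: int) -> None:
        # Record whatever the outer header carried. Behind a NAT this is the
        # NAT's mapping rather than the IR[CE]'s own address and port.
        self._locators[ce_id] = (outer_src_ip, outer_src_port)

    def outer_destination(self, ce_id: str) -> Locator:
        # Packets toward this IR[CE] are encapsulated with this outer
        # destination address and UDP destination port.
        return self._locators[ce_id]
```

Because each periodic SRS refreshes the entry, a change in the NAT mapping would be picked up on the next keepalive in this sketch.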
1169 IRON does not add any new issues to the complications already raised by 1170 NAT traversal or by applications that embed address referrals in 1171 their payloads. 1173 6.8. Nested EUN Considerations 1175 Each IR[CE] configures a locator that may be taken from an ordinary 1176 non-EPA address assigned by an ISP or from an EPA address taken from 1177 an EP assigned to another IR[CE]. In the latter case, the IR[CE] is said 1178 to be "nested" within the EUN of another IR[CE], and recursive 1179 nestings of multiple layers of encapsulations may be necessary. 1181 For example, in the network scenario depicted in Figure 10, IR[CE](A) 1182 configures a locator EPA(B) taken from the EP assigned to EUN(B). 1183 IR[CE](B) in turn configures a locator EPA(C) taken from the EP 1184 assigned to EUN(C). Finally, IR[CE](C) configures a locator ISP(D) 1185 taken from a non-EPA address delegated by an ordinary ISP(D). Using 1186 this example, the "nested-IRON" case must be examined in which a host 1187 A, which configures the address EPA(A) within EUN(A), exchanges packets 1188 with host Z located elsewhere in the Internet.
[Figure 10 diagram: Host A (EPA(A)) attaches to EUN(A) behind IR[CE](A); IR[CE](A) attaches to EUN(B) with locator EPA(B) behind IR[CE](B); IR[CE](B) attaches to EUN(C) with locator EPA(C) behind IR[CE](C); IR[CE](C) connects with locator ISP(D) to ISP(D) and the Internet. Tunnels run from the IR[CE]s across the IRON to IR[VE](A), IR[VE](B) and IR[VE](C). IR[VC](Z), IR[VE](Z), IR[CE](Z) and Host Z appear on the far side.]
1222 Figure 10: Nested EUN Example 1224 The two cases of host A sending packets to host Z, and host Z sending 1225 packets to host A, must be considered separately as described below. 1227 6.8.1. Host A Sends Packets to Host Z 1229 Host A first forwards a packet with source address EPA(A) and 1230 destination address Z into EUN(A). Routing within EUN(A) will direct 1231 the packet to IR[CE](A), which encapsulates it in an outer header 1232 with EPA(B) as the outer source address and IR[VE](A) as the outer 1233 destination address, then forwards the once-encapsulated packet into 1234 EUN(B). Routing within EUN(B) will direct the packet to IR[CE](B), 1235 which encapsulates it in an outer header with EPA(C) as the outer 1236 source address and IR[VE](B) as the outer destination address, then 1237 forwards the twice-encapsulated packet into EUN(C). Routing within 1238 EUN(C) will direct the packet to IR[CE](C), which encapsulates it in 1239 an outer header with ISP(D) as the outer source address and IR[VE](C) 1240 as the outer destination address. IR[CE](C) then sends this triple-encapsulated 1241 packet into the ISP(D) network, where it will be routed 1242 into the Internet to IR[VE](C). 1244 When IR[VE](C) receives the triple-encapsulated packet, it removes 1245 the outer layer of encapsulation and forwards the resulting twice-encapsulated 1246 packet into the Internet to IR[VE](B).
Next, IR[VE](B) 1247 removes the outer layer of encapsulation and forwards the resulting 1248 once-encapsulated packet into the Internet to IR[VE](A). Finally, 1249 IR[VE](A) checks the address type of the inner address 'Z'. If Z is 1250 a non-EPA address, IR[VE](A) simply decapsulates the packet and 1251 forwards it into the Internet. Otherwise, IR[VE](A) rewrites the 1252 outer source and destination addresses of the once-encapsulated 1253 packet and forwards it to IR[VC](Z). IR[VC](Z) in turn rewrites the 1254 outer destination address of the packet to the locator for IR[VE](Z), 1255 then forwards the packet and sends a redirect to IR[VE](A). 1256 IR[VE](Z) then re-encapsulates the packet and forwards it to 1257 IR[CE](Z), which decapsulates it and forwards the inner packet to 1258 host Z. Subsequent packets from IR[CE](A) will then use IR[VE](Z) as 1259 the next hop toward host Z. 1261 6.8.2. Host Z Sends Packets to Host A 1263 Whether or not host Z configures an EPA address, its packets destined 1264 to host A will eventually reach IR[VE](A). IR[VE](A) will have a 1265 mapping that lists IR[CE](A) as the next hop toward EPA(A). 1266 IR[VE](A) will then encapsulate the packet with EPA(B) as the outer 1267 destination address and forward the packet into the Internet. 1268 Internet routing will convey this once-encapsulated packet to 1269 IR[VE](B), which will have a mapping that lists IR[CE](B) as the next 1270 hop toward EPA(B). IR[VE](B) will then encapsulate the packet with 1271 EPA(C) as the outer destination address and forward the packet into 1272 the Internet. Internet routing will then convey this twice- 1273 encapsulated packet to IR[VE](C), which will have a mapping that lists 1274 IR[CE](C) as the next hop toward EPA(C). IR[VE](C) will then 1275 encapsulate the packet with ISP(D) as the outer destination address 1276 and forward the packet into the Internet. Internet routing will then 1277 convey this triple-encapsulated packet to IR[CE](C).
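The three encapsulation steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the IRON specification: the Packet class, the VE_MAPPINGS table, and the function names are hypothetical, and outer source addresses and all SEAL/VET header details are omitted.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet model: a destination plus an opaque or nested payload."""
    dst: str
    payload: Union["Packet", str]


# Assumed per-IR[VE] mappings from Figure 10: each served address maps to
# the locator of the IR[CE] that is the next hop toward it.
VE_MAPPINGS = {
    "EPA(A)": "EPA(B)",  # held by IR[VE](A): locator of IR[CE](A)
    "EPA(B)": "EPA(C)",  # held by IR[VE](B): locator of IR[CE](B)
    "EPA(C)": "ISP(D)",  # held by IR[VE](C): locator of IR[CE](C)
}


def encapsulate_toward(packet: Packet) -> Packet:
    """Add one outer header per nesting level until the outer destination
    is no longer an address covered by a mapping (i.e., a non-EPA locator)."""
    while packet.dst in VE_MAPPINGS:
        packet = Packet(dst=VE_MAPPINGS[packet.dst], payload=packet)
    return packet


def encapsulation_depth(packet: Packet) -> int:
    """Count how many outer headers wrap the innermost packet."""
    depth = 0
    while isinstance(packet.payload, Packet):
        depth += 1
        packet = packet.payload
    return depth


inner = Packet(dst="EPA(A)", payload="data from host Z")
outer = encapsulate_toward(inner)
```

Starting from an inner packet addressed to EPA(A), the loop produces a packet whose outermost destination is ISP(D) and which carries three layers of encapsulation, matching the triple-encapsulated packet that Internet routing conveys to IR[CE](C).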
1279 When IR[CE](C) receives the triple-encapsulated packet, it strips 1280 the outer layer of encapsulation and forwards the twice-encapsulated 1281 packet to EPA(C), which is the locator address of IR[CE](B). When 1282 IR[CE](B) receives the twice-encapsulated packet, it strips the outer 1283 layer of encapsulation and forwards the once-encapsulated packet to 1284 EPA(B), which is the locator address of IR[CE](A). When IR[CE](A) 1285 receives the once-encapsulated packet, it strips the outer layer of 1286 encapsulation and forwards the unencapsulated packet to EPA(A), which 1287 is the host address of host A. 1289 7. Additional Considerations 1291 Considerations for the scalability of Internet Routing due to 1292 multihoming, traffic engineering and provider-independent addressing 1293 are discussed in [I-D.narten-radir-problem-statement]. 1295 Route optimization considerations for mobile networks are found in 1296 [RFC5522]. 1298 8. Related Initiatives 1300 IRON builds upon the concepts of the RANGER architecture [RFC5720], and 1301 therefore inherits the same set of related initiatives. 1303 Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in 1304 Increasing Scopes (AIS) [I-D.zhang-evolution] provide the basis for 1305 the Virtual Prefix concepts. 1307 Internet vastly improved plumbing (Ivip) [I-D.whittle-ivip-arch] has 1308 contributed valuable insights, including the use of real-time 1309 mapping. The use of IR[VE]s as mobility anchor points is directly 1310 influenced by Ivip's associated TTR mobility extensions [TTRMOB]. 1312 [I-D.bernardos-mext-nemo-ro-cr] discussed a route optimization 1313 approach using a Correspondent Router (CR) model. The IRON IR[VE] 1314 construct is similar to the CR concept described in that work; 1315 however, the manner in which customer EUNs coordinate with IR[VE]s is 1316 different and based on the redirection model associated with NBMA 1317 links. 1319 Numerous publications have proposed NAT traversal techniques.
The 1320 NAT traversal techniques adapted for IRON were inspired by the Simple 1321 Address Mapping for Premises Legacy Equipment (SAMPLE) proposal 1322 [I-D.carpenter-softwire-sample]. 1324 9. IANA Considerations 1326 There are no IANA considerations for this document. 1328 10. Security Considerations 1330 Security considerations that apply to tunneling in general are 1331 discussed in [I-D.ietf-v6ops-tunnel-security-concerns]. Additional 1332 considerations that also apply to IRON are discussed in RANGER 1333 [RFC5720], VET [I-D.templin-intarea-vet] and SEAL 1334 [I-D.templin-intarea-seal]. 1336 IR[CE]s require a means for securely registering their EP-to-locator 1337 bindings with their VPC. Each VPC provides its customer IR[CE]s with 1338 a secure means for registering and re-registering their mappings. 1340 11. Acknowledgements 1342 The ideas behind this work have benefited greatly from discussions 1343 with colleagues, some of which appear on the RRG and other IRTF/IETF 1344 mailing lists. Robin Whittle and Steve Russert co-authored the TTR 1345 mobility architecture, which strongly influenced IRON. Eric 1346 Fleischman pointed out the opportunity to leverage anycast for 1347 discovering topologically-close servers. Thomas Henderson 1348 recommended a quantitative analysis of scaling properties. 1350 The following individuals provided essential review input: Mohamed 1351 Boucadair, Wesley Eddy, Dae Young Kim and Robin Whittle. 1353 12. References 1355 12.1. Normative References 1357 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1358 September 1981. 1360 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1361 (IPv6) Specification", RFC 2460, December 1998. 1363 12.2. Informative References 1365 [BGPMON] net, B., "BGPmon.net - Monitoring Your Prefixes, 1366 http://bgpmon.net/stat.php", June 2010. 1368 [I-D.bernardos-mext-nemo-ro-cr] 1369 Bernardos, C., Calderon, M., and I.
Soto, "Correspondent 1370 Router based Route Optimisation for NEMO (CRON)", 1371 draft-bernardos-mext-nemo-ro-cr-00 (work in progress), 1372 July 2008. 1374 [I-D.carpenter-softwire-sample] 1375 Carpenter, B. and S. Jiang, "Legacy NAT Traversal for 1376 IPv6: Simple Address Mapping for Premises Legacy Equipment 1377 (SAMPLE)", draft-carpenter-softwire-sample-00 (work in 1378 progress), June 2010. 1380 [I-D.ietf-grow-va] 1381 Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and 1382 L. Zhang, "FIB Suppression with Virtual Aggregation", 1383 draft-ietf-grow-va-02 (work in progress), March 2010. 1385 [I-D.ietf-v6ops-tunnel-security-concerns] 1386 Hoagland, J., Krishnan, S., and D. Thaler, "Security 1387 Concerns With IP Tunneling", 1388 draft-ietf-v6ops-tunnel-security-concerns-02 (work in 1389 progress), March 2010. 1391 [I-D.narten-radir-problem-statement] 1392 Narten, T., "On the Scalability of Internet Routing", 1393 draft-narten-radir-problem-statement-05 (work in 1394 progress), February 2010. 1396 [I-D.russert-rangers] 1397 Russert, S., Fleischman, E., and F. Templin, "RANGER 1398 Scenarios", draft-russert-rangers-05 (work in progress), 1399 July 2010. 1401 [I-D.templin-intarea-seal] 1402 Templin, F., "The Subnetwork Encapsulation and Adaptation 1403 Layer (SEAL)", draft-templin-intarea-seal-16 (work in 1404 progress), July 2010. 1406 [I-D.templin-intarea-vet] 1407 Templin, F., "Virtual Enterprise Traversal (VET)", 1408 draft-templin-intarea-vet-16 (work in progress), 1409 July 2010. 1411 [I-D.whittle-ivip-arch] 1412 Whittle, R., "Ivip (Internet Vastly Improved Plumbing) 1413 Architecture", draft-whittle-ivip-arch-04 (work in 1414 progress), March 2010. 1416 [I-D.zhang-evolution] 1417 Zhang, B. and L. Zhang, "Evolution Towards Global Routing 1418 Scalability", draft-zhang-evolution-02 (work in progress), 1419 October 2009. 1421 [RFC1070] Hagens, R., Hall, N., and M. 
Rose, "Use of the Internet as 1422 a subnetwork for experimentation with the OSI network 1423 layer", RFC 1070, February 1989. 1425 [RFC3849] Huston, G., Lord, A., and P. Smith, "IPv6 Address Prefix 1426 Reserved for Documentation", RFC 3849, July 2004. 1428 [RFC4192] Baker, F., Lear, E., and R. Droms, "Procedures for 1429 Renumbering an IPv6 Network without a Flag Day", RFC 4192, 1430 September 2005. 1432 [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway 1433 Protocol 4 (BGP-4)", RFC 4271, January 2006. 1435 [RFC4548] Gray, E., Rutemiller, J., and G. Swallow, "Internet Code 1436 Point (ICP) Assignments for NSAP Addresses", RFC 4548, 1437 May 2006. 1439 [RFC5214] Templin, F., Gleeson, T., and D. Thaler, "Intra-Site 1440 Automatic Tunnel Addressing Protocol (ISATAP)", RFC 5214, 1441 March 2008. 1443 [RFC5522] Eddy, W., Ivancic, W., and T. Davis, "Network Mobility 1444 Route Optimization Requirements for Operational Use in 1445 Aeronautics and Space Exploration Mobile Networks", 1446 RFC 5522, October 2009. 1448 [RFC5720] Templin, F., "Routing and Addressing in Networks with 1449 Global Enterprise Recursion (RANGER)", RFC 5720, 1450 February 2010. 1452 [RFC5737] Arkko, J., Cotton, M., and L. Vegoda, "IPv4 Address Blocks 1453 Reserved for Documentation", RFC 5737, January 2010. 1455 [RFC5743] Falk, A., "Definition of an Internet Research Task Force 1456 (IRTF) Document Stream", RFC 5743, December 2009. 1458 [RFC5887] Carpenter, B., Atkinson, R., and H. Flinck, "Renumbering 1459 Still Needs Work", RFC 5887, May 2010. 1461 [TTRMOB] Whittle, R. and S. Russert, "TTR Mobility Extensions for 1462 Core-Edge Separation Solutions to the Internet's Routing 1463 Scaling Problem, 1464 http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf", 1465 August 2008. 1467 Appendix A. 
IRON VPs Over Internetworks with Different Address Families 1469 The IRON architecture leverages the routing system by providing 1470 generally shortest-path routing for packets with EPA addresses from 1471 VPs that match the address family of the underlying Internetwork. 1472 When the VPs are of an address family that is not routable within the 1473 underlying Internetwork, however, (e.g., when OSI/NSAP [RFC4548] VPs 1474 are used within an IPv4 Internetwork) a global mapping database is 1475 required to allow IR[VE]s to map VPs to companion prefixes taken from 1476 address families that are routable within the Internetwork. For 1477 example, an IPv6 VP (e.g., 2001:DB8::/32) could be paired with a 1478 companion IPv4 prefix (e.g., 192.0.2.0/24) so that encapsulated IPv6 1479 packets can be forwarded over IPv4-only Internetworks. 1481 Every VP in the IRON must therefore be represented in a globally 1482 distributed Master VP database (MVPd) that maintains VP-to-companion 1483 prefix mappings for all VPs in the IRON. The MVPd is maintained by a 1484 globally-managed assigned numbers authority in the same manner as the 1485 Internet Assigned Numbers Authority (IANA) currently maintains the 1486 master list of all top-level IPv4 and IPv6 delegations. The database 1487 can be replicated across multiple servers for load balancing much in 1488 the same way that FTP mirror sites are used to manage software 1489 distributions. 1491 Upon startup, each IR[VE] discovers the full set of VPs for the IRON 1492 by reading the MVPd. The IR[VE] reads the MVPd from a nearby server 1493 and periodically checks the server for deltas since the database was 1494 last read. After reading the MVPd, the IR[VE] has a full list of VP 1495 to companion prefix mappings. 
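As an illustration of the table an IR[VE] might hold after reading the MVPd, the following Python sketch maps an EPA to the companion prefix of its covering VP. It is a sketch under stated assumptions only: the MVPD dictionary, the companion_prefix function, and the use of longest-prefix matching when several VPs cover an address are hypothetical choices made for the example, and the single entry reuses the documentation prefixes from the example above.

```python
import ipaddress

# Hypothetical in-memory copy of the MVPd: VP -> companion prefix.
# The entry pairs the IPv6 VP and IPv4 companion prefix used in the text.
MVPD = {
    ipaddress.ip_network("2001:db8::/32"): ipaddress.ip_network("192.0.2.0/24"),
}


def companion_prefix(epa, mvpd=MVPD):
    """Return the companion prefix of the most specific VP covering 'epa',
    or None if no VP in the table covers it.

    Longest-prefix matching is an assumption of this sketch; the text
    does not state how overlapping VPs would be resolved.
    """
    addr = ipaddress.ip_address(epa)
    matches = [
        vp for vp in mvpd
        if vp.version == addr.version and addr in vp
    ]
    if not matches:
        return None
    best = max(matches, key=lambda vp: vp.prefixlen)
    return mvpd[best]
```

For example, companion_prefix("2001:db8:1::1") returns the network 192.0.2.0/24, while an address outside every listed VP, such as 2001:db9::1, yields None.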
1497 The IR[VE] can then forward packets toward EPAs covered by a VP by 1498 encapsulating them in an outer header of the VP's companion prefix 1499 address family and using any address taken from the companion prefix 1500 as the outer destination address. The companion prefix therefore 1501 serves as an implicit anycast prefix. 1503 Possible encapsulations in this model include IPv6-in-IPv4, IPv4-in- 1504 IPv6, OSI/CLNP-in-IPv6, OSI/CLNP-in-IPv4, etc. 1506 Appendix B. Scaling Considerations 1508 Scaling aspects of the IRON architecture have strong implications for 1509 its applicability in practical deployments. Scaling must be 1510 considered along multiple vectors including interdomain core routing 1511 scaling, scaling to accommodate large numbers of customer EUNs, 1512 traffic scaling, state requirements, etc. 1514 In terms of routing scaling, each VPC will advertise one or more VPs 1515 from which EPs are delegated to customer EUNs. The number of advertised routes will 1516 therefore be minimized when each VP covers many EPs. For example, 1517 the IPv6 prefix 2001:DB8::/32 contains 2^24 ::/56 EP prefixes for 1518 assignment to EUNs. The IRON could therefore accommodate 2^32 ::/56 1519 EPs with only 2^8 ::/32 VPs advertised in the interdomain routing 1520 core. 1522 In terms of traffic scaling for IR[VC]s, each IR[VC] represents an 1523 ASBR of a "shell" enterprise network that simply directs arriving 1524 traffic packets with EPA destination addresses towards IR[VE]s that 1525 service customer EUNs. Moreover, the IR[VC] sheds traffic destined 1526 to EPAs through redirection, which removes it from the path for the 1527 vast majority of traffic packets. On the other hand, each IR[VC] 1528 must handle all traffic packets forwarded between its customer EUNs 1529 and the non-IRON Internet. The scaling concerns for this latter 1530 class of traffic are no different than for ASBRs that connect 1531 large enterprise networks to the Internet.
In terms of traffic 1532 scaling for IR[VE]s, each IR[VE] services a set of the VPC overlay 1533 network's customer EUNs. The IR[VE] services all traffic packets 1534 destined to its EUNs but only services the initial packets of flows 1535 initiated from the EUNs and destined to EPAs. Therefore, traffic 1536 scaling for EPA-addressed traffic is an asymmetric consideration and 1537 is proportional to the number of EUNs each IR[VE] serves. 1539 In terms of state requirements for IR[VC]s, each IR[VC] maintains a 1540 list of all IR[VE]s in the VPC overlay network as well as FIB entries 1541 for all customer EUNs that each IR[VE] serves. This state is 1542 therefore dominated by the number of EUNs in the VPC overlay network. 1543 Sizing the IR[VC] to accommodate state information for all EUNs is 1544 therefore required during VPC overlay network planning. In terms of 1545 state requirements for IR[VE]s, each IR[VE] maintains tunnel state 1546 for each of the customer EUNs it serves but need not keep state for 1547 all EUNs in the VPC overlay network. Finally, neither IR[VC]s nor 1548 IR[VE]s need keep state for final destinations of outbound traffic. 1550 IR[CE]s source and sink all traffic packets originating from or 1551 destined to the customer EUN. Therefore, traffic scaling 1552 considerations for IR[CE]s are the same as for any site border 1553 router. IR[CE]s also retain state for the final destinations of 1554 outbound traffic flows. This can be managed as soft state, since 1555 stale entries purged from the cache will be refreshed when new 1556 traffic packets are sent. 1558 Author's Address 1560 Fred L. Templin (editor) 1561 Boeing Research & Technology 1562 P.O. Box 3707 MC 7L-49 1563 Seattle, WA 98124 1564 USA 1566 Email: fltemplin@acm.org