Internet Research Task Force                             F. Templin, Ed.
(IRTF)                                      Boeing Research & Technology
Internet-Draft                                           August 12, 2010
Intended status: Experimental
Expires: February 13, 2011


              The Internet Routing Overlay Network (IRON)
                       draft-templin-iron-10.txt

Abstract

   Since the Internet must continue to support escalating growth due to
   increasing demand, it is clear that current routing architectures and
   operational practices must be updated. This document proposes an
   Internet Routing Overlay Network for supporting sustainable growth
   through Provider Independent addressing while requiring no changes to
   end systems and no changes to the existing routing system. While
   business considerations are an important determining factor for
   widespread adoption, they are out of scope for this document. This
   document is a product of the IRTF Routing Research Group.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 13, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. The Internet Routing Overlay Network
      3.1. IR[CE] - IRON Customer Edge Router
      3.2. IR[VE] - IRON Virtual Prefix Company Edge Router
      3.3. IR[VC] - IRON Virtual Prefix Company Core Router
      3.4. IR[VP] - IRON Virtual Prefix Company Combined Router
   4. IRON Organizational Principles
   5. IRON Initialization
      5.1. IR[VC] Initialization
      5.2. IR[VE] Initialization
      5.3. IR[CE] Initialization
   6. IRON Operation
      6.1. IR[CE] Operation
      6.2. IR[VE] Operation
      6.3. IR[VC] Operation
      6.4. IRON Reference Operating Scenarios
         6.4.1. Both Hosts Within IRON EUNs
         6.4.2. Mixed IRON and Non-IRON Hosts
      6.5. Mobility, Multihoming and Traffic Engineering Considerations
         6.5.1. Mobility Management
         6.5.2. Multihoming
         6.5.3. Inbound Traffic Engineering
         6.5.4. Outbound Traffic Engineering
      6.6. Renumbering Considerations
      6.7. NAT Traversal Considerations
      6.8. Nested EUN Considerations
         6.8.1. Host A Sends Packets to Host Z
         6.8.2. Host Z Sends Packets to Host A
   7. Additional Considerations
   8. Related Initiatives
   9. IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
      12.1. Normative References
      12.2. Informative References
   Appendix A. IRON VPs Over Internetworks with Different Address
               Families
   Appendix B. Scaling Considerations
   Author's Address

1. Introduction

   Growth in the number of entries instantiated in the Internet routing
   system has led to concerns for unsustainable routing scaling
   [I-D.narten-radir-problem-statement]. Operational practices such as
   increased use of multihoming with IPv4 Provider-Independent (PI)
   addressing are resulting in more and more fine-grained prefixes
   injected into the routing system from more and more end-user
   networks.
   Furthermore, the forthcoming depletion of the public IPv4
   address space has raised concerns for both increased deaggregation
   (leading to yet further routing table entries) and an impending
   address space run-out scenario. At the same time, the IPv6 routing
   system is beginning to see growth in IPv6 Provider-Aggregated (PA)
   prefixes [BGPMON] which must be managed in order to avoid the same
   routing scaling issues the IPv4 Internet now faces. Since the
   Internet must continue to scale to accommodate increasing demand, it
   is clear that new routing methodologies and operational practices are
   needed.

   Several related works have investigated routing scaling issues and
   proposed solutions. Virtual Aggregation (VA) [I-D.ietf-grow-va] and
   Aggregation in Increasing Scopes (AIS) [I-D.zhang-evolution] are
   global routing proposals that introduce routing overlays with Virtual
   Prefixes (VPs) to reduce the number of entries required in each
   router's Forwarding Information Base (FIB) and Routing Information
   Base (RIB). Routing and Addressing in Networks with Global
   Enterprise Recursion (RANGER) [RFC5720] examines recursive
   arrangements of enterprise networks that can apply to a very broad
   set of use case scenarios [I-D.russert-rangers]. In particular,
   RANGER supports encapsulation and secure redirection by treating each
   layer in the recursive hierarchy as a virtual non-broadcast, multiple
   access (NBMA) "link". RANGER is an architectural framework that
   includes Virtual Enterprise Traversal (VET) [I-D.templin-intarea-vet]
   and the Subnetwork Encapsulation and Adaptation Layer (SEAL)
   (including the SEAL Control Message Protocol (SCMP))
   [I-D.templin-intarea-seal] as its functional building blocks.

   This document proposes an Internet Routing Overlay Network (IRON)
   with goals of supporting sustainable growth while requiring no
   changes to the existing routing system.
   IRON borrows concepts from
   VA, AIS and RANGER, and further borrows concepts from the Internet
   Vastly Improved Plumbing (Ivip) [I-D.whittle-ivip-arch] architecture
   proposal along with its associated Translating Tunnel Router (TTR)
   mobility extensions [TTRMOB]. Indeed, the TTR model to a great
   degree inspired the IRON mobility architecture design discussed in
   this document. The Network Address Translator (NAT) traversal
   techniques adapted for IRON were inspired by the Simple Address
   Mapping for Premises Legacy Equipment (SAMPLE) proposal
   [I-D.carpenter-softwire-sample].

   IRON specifically seeks to provide scalable PI addressing without
   changing the current BGP [RFC4271] routing system. IRON observes the
   Internet Protocol standards [RFC0791][RFC2460]. Other network layer
   protocols that can be encapsulated within IP packets (e.g., OSI/CLNP
   [RFC1070], etc.) are also within scope.

   The IRON is a global routing system comprising virtual overlay
   networks managed by Virtual Prefix Companies (VPCs) that own and
   manage Virtual Prefixes (VPs) from which End User Network (EUN) PI
   prefixes (EPs) are delegated to customer sites. The IRON is
   motivated by a growing customer demand for multihoming, mobility
   management and traffic engineering while using stable PI addressing
   to avoid network renumbering [RFC4192][RFC5887]. The IRON uses the
   existing IPv4 and IPv6 global Internet routing systems as virtual
   links for tunneling inner network protocol packets within outer IPv4
   or IPv6 headers (see: Section 3). The IRON requires deployment of a
   small number of new BGP core routers and supporting servers, as well
   as IRON-aware routers/servers in customer EUNs. No modifications to
   hosts, and no modifications to most routers are required.

   Note: This document is offered in compliance with Internet Research
   Task Force (IRTF) document stream procedures [RFC5743]; it is not an
   IETF product and is not a standard. The views in this document were
   considered controversial by the IRTF Routing Research Group (RRG) but
   the RG reached a consensus that the document should still be
   published. The document will undergo a period of review within the
   RRG and through selected expert reviewers prior to publication. The
   following sections discuss details of the IRON architecture.

2. Terminology

   This document makes use of the following terms:

   End User Network (EUN)
      an edge network that connects an organization's devices (e.g.,
      computers, routers, printers, etc.) to the Internet and possibly
      also the IRON.

   Internet Service Provider (ISP)
      a service provider which physically connects customer EUNs to the
      Internet. In other words, an ISP is responsible for providing IP
      connectivity to a customer owning an EUN.

   Provider Aggregated (PA) address or prefix
      a network layer address or prefix delegated to an EUN by an ISP.

   Provider Independent (PI) address or prefix
      a network layer address or prefix delegated to an EUN by a third
      party independently of the EUN's ISP arrangements.

   Virtual Prefix (VP)
      a PI prefix block (e.g., an IPv4 /16, an IPv6 /20, an OSI NSAP
      prefix, etc.) that is owned and managed by a Virtual Prefix
      Company (VPC).

   End User Network PI prefix (EP)
      a more-specific PI prefix derived from a VP (e.g., an IPv4 /28, an
      IPv6 /56, etc.) and delegated to an EUN by a VPC.

   EP Address (EPA)
      a network layer address belonging to an EP and assigned to the
      interface of an end system in an EUN.

   Locator
      an IP address assigned to the interface of a router or end system
      within a public or private network.
      Locators taken from public IP
      prefixes are routable on a global basis, while locators taken from
      private IP prefixes are made public via Network Address
      Translation (NAT).

   Virtual Prefix Company (VPC)
      a company that owns and manages a set of VPs from which it
      delegates End User Network PI Prefixes (EPs) to EUNs.

   Internet Routing Overlay Network (IRON)
      an overlay network configured over the global Internet. The IRON
      supports routing through encapsulation of inner packets with EPA
      addresses within outer headers that use locator addresses.

3. The Internet Routing Overlay Network

   The Internet Routing Overlay Network (IRON) consists of IRON Routers
   (IRs) that automatically tunnel the packets of end-to-end
   communication sessions within encapsulating headers used for
   Internetwork routing. IRs use Virtual Enterprise Traversal (VET)
   [I-D.templin-intarea-vet] in conjunction with the Subnetwork
   Encapsulation and Adaptation Layer (SEAL) [I-D.templin-intarea-seal]
   to encapsulate inner network layer packets within outer headers as
   shown in Figure 1:

                                     +-------------------------+
                                     |   Outer headers with    |
                                     ~    locator addresses    ~
                                     |     (IPv4 or IPv6)      |
                                     +-------------------------+
                                     |       SEAL Header       |
   +-------------------------+       +-------------------------+
   |   Inner Packet Header   |  -->  |   Inner Packet Header   |
   ~    with EP addresses    ~  -->  ~    with EP addresses    ~
   | (IPv4, IPv6, OSI, etc.) |  -->  | (IPv4, IPv6, OSI, etc.) |
   +-------------------------+       +-------------------------+
   |                         |  -->  |                         |
   ~    Inner Packet Body    ~  -->  ~    Inner Packet Body    ~
   |                         |  -->  |                         |
   +-------------------------+       +-------------------------+

       Inner packet before               Outer packet after
          encapsulation                     encapsulation

     Figure 1: Encapsulation of Inner Packets Within Outer IP Headers

   VET specifies the automatic tunneling mechanisms used for
   encapsulation, while SEAL specifies the format and usage of the SEAL
   header as well as a set of control messages. Most notably, IRs use
   SEAL to deterministically exchange and authenticate control messages
   such as indications of Path Maximum Transmission Unit (PMTU)
   limitations.

   The IRON is manifested through a business model in which Virtual
   Prefix Companies (VPCs) own and manage virtual overlay networks
   comprising a set of IRs that are distributed throughout the Internet
   and serve highly-aggregated Virtual Prefixes (VPs). VPCs delegate
   sub-prefixes from their VPs which they lease to customers as End User
   Network PI prefixes (EPs). The customers in turn assign the EPs to
   their customer edge IRs which connect their End User Networks (EUNs)
   to the IRON.

   VPCs may have no affiliation with the ISP networks from which
   customers obtain their basic Internet connectivity. Therefore,
   unless the ISP also acts as a VPC, the customer must have two business
   relationships - one with the ISP and a second with the VPC. In that
   case, the VPC can open for business and begin serving its customers
   immediately without the need to coordinate its activities with ISPs
   or with other VPCs. Further details on business considerations are
   out of scope for this document.

   The IRON requires no changes to end systems and no changes to most
   routers in the Internet. Instead, the IRON comprises IRs that are
   deployed either as new platforms or as modifications to existing
   platforms.
   IRs may be deployed incrementally without disturbing the
   existing Internet routing system, and act as waypoints (or "cairns")
   for navigating the IRON. The functional roles for IRs are described
   in the following sections.

3.1. IR[CE] - IRON Customer Edge Router

   An IR[CE] is a Customer Edge router (or host with embedded gateway
   function) that logically connects the customer's EUNs and their
   associated EPs to the IRON via tunnels as shown in Figure 2. IR[CE]s
   obtain EPs from VPCs and use them to number subnets and interfaces
   within their EUNs. An IR[CE] can be deployed on the same physical
   platform that also connects the customer's EUNs to its ISPs, but it
   may also be a separate router or even a standalone server system
   located within the EUN. (This model applies even if the EUN connects
   to the ISP via a Network Address Translator (NAT) - see Section 6.7).

                           .-.
                        ,-(  _)-.
        +--------+   .-(_    (_  )-.
        | IR[CE] |--(_     ISP      )
        +---+----+     `-(______)-'
            |   <= T           \     .-.
           .-.       u          \ ,-(  _)-.
        ,-(  _)-.      n       .-(_    (_  )-.
     .-(_    (_  )-.     n    (_   Internet   )
    (_     EUN      )      e     `-(______)-'
       `-(______)-'          l       ___
            |                  s => (:::)-.
       +----+---+               .-(::::::::)
       |  Host  |            .-(::::::::::::)-.
       +--------+           (:::: The IRON ::::)
                             `-(::::::::::::)-'
                                `-(::::::)-'

               Figure 2: IR[CE] Connecting EUN to the IRON

3.2. IR[VE] - IRON Virtual Prefix Company Edge Router

   An IR[VE] is a VPC's overlay network edge router that provides
   forwarding and mapping services for the EPs owned by customer
   IR[CE]s. In typical deployments, a VPC will deploy many IR[VE]s
   around the IRON in a globally-distributed fashion (e.g., as depicted
   in Figure 3) so that IR[CE] clients can discover those that are
   nearby.

                      +--------+     +--------+
                      | IR[VE] |     | IR[VE] |
                      | Boston |     | Tokyo  |
                      +--+-----+     ++-------+
              +--------+  \          /
              | IR[VE] |   \   ___  /
              | Seattle|    \ (:::)-.       +--------+
              +------+-+  .-(::::::::)------+ IR[VE] |
                      \.-(::::::::::::)-.   | Paris  |
                      (:::: The IRON ::::)  +--------+
                       `-(::::::::::::)-'
           +--------+ /   `-(::::::)-'  \     +--------+
           | IR[VE] +          |         \--- + IR[VE] |
           | Moscow |     +----+---+          | Sydney |
           +--------+     | IR[VE] |          +--------+
                          | Cairo  |
                          +--------+

               Figure 3: IR[VE] Global Distribution Example

   Each IR[VE] serves as a customer-facing tunnel endpoint router with
   which IR[CE]s form bidirectional tunnels over the IRON. Each IR[VE]
   also associates with an Internet-facing IR[VC] that can forward
   packets from the IRON out to the native public Internet and vice
   versa as discussed in the next section.

3.3. IR[VC] - IRON Virtual Prefix Company Core Router

   An IR[VC] is a VPC's overlay network core router that acts as a
   gateway between the IRON and the native public Internet. It
   therefore also serves as an Autonomous System Border Router (ASBR)
   that is owned and managed by the VPC.

   Each VPC configures one or more IR[VC]s which advertise the company's
   VPs into the IPv4 and IPv6 global Internet BGP routing systems. Each
   IR[VC] associates with all of the VPC's overlay network edge routers,
   e.g., via tunnels over the IRON, via a direct interconnect such as an
   Ethernet cable, etc. The IR[VC] role is depicted in Figure 4:

                            ,-(  _)-.
                         .-(_    (_  )-.
                        (_   Internet   )
                           `-(______)-'
                                |
                           +----+---+
                           | IR[VC] |
                           +----+---+
                               _|_
                              (:::)-.
                          .-(::::::::)
        +--------+     .-(::::::::::::)-.     +--------+
        | IR[VE] |    (:::: The IRON ::::)    | IR[VE] |
        +--------+     `-(::::::::::::)-'     +--------+
                          `-(::::::)-'

                           +--------+
                           | IR[VE] |
                           +--------+

            Figure 4: IR[VC] Connecting IRON to Native Internet

3.4. IR[VP] - IRON Virtual Prefix Company Combined Router

   An IR[VP] is a VPC's overlay network router that combines the
   functions of both the IR[VE] and IR[VC].
   While not in itself a
   fundamental building block of the architecture, it is mentioned here
   to clarify an implementation option available to VPCs.

   In the IR[VP] model, the IR[VE] and IR[VC] functions can be thought
   of as "half-gateway" functions that together comprise a unified
   IR[VP]. The IR[VE] and IR[VC] functions can therefore be discussed
   separately even when both functions reside within the same physical
   IR[VP] platform as shown in Figure 5:

                            ,-(  _)-.
                         .-(_    (_  )-.
                        (_   Internet   )
                           `-(______)-'
                                |
                     +----------+----------+
                     | IR[VC] half-gateway |
                     +---------------------+
                     | IR[VE] half-gateway |
                     +----------+----------+
                  <- IR[VP] Unified Gateway ->
                               _|_
                              (:::)-.
                          .-(::::::::)
                       .-(::::::::::::)-.
                      (:::: The IRON ::::)
                       `-(::::::::::::)-'
                          `-(::::::)-'

          Figure 5: IR[VP] Combining IR[VE] and IR[VC] Functions

4. IRON Organizational Principles

   The IRON consists of the union of all VPC overlay networks worldwide
   (where each VPC configures one or more overlay networks). Each such
   overlay network represents a distinct "patch" on the Internet
   "quilt", where the patches are stitched together by tunnels over the
   links, routers, bridges, etc. that connect the public Internet. When
   a new VPC overlay network is deployed, it becomes yet another patch
   on the quilt. The IRON is therefore a composite overlay network
   consisting of multiple individual patches, where each patch can
   coordinate its activities independently of all others (with the
   exception that each patch must be aware of all VPs in the IRON).

   Each VPC overlay network in the IRON maintains a set of IR[VC]s that
   connect the overlay network directly to the public IPv4 and IPv6
   Internets. Each IR[VC] advertises the VPC overlay network's IPv4 VPs
   into the IPv4 BGP routing system and advertises the overlay network's
   IPv6 VPs into the IPv6 BGP routing system.
   IR[VC]s will therefore
   receive packets with EPA destination addresses sent by end systems in
   the Internet, and then re-encapsulate and forward them toward the
   correct EPA-addressed end systems connected to the VPC overlay
   network.

   Each VPC overlay network also manages a set of IR[VE]s that connect
   customer EUNs to the IRON and to the IPv6 and IPv4 Internets via
   their associations with IR[VC]s. IR[VE]s therefore need not be BGP
   routers themselves and can be simple commodity hardware platforms.
   Moreover, the IR[VE] and IR[VC] functions can be deployed together on
   the same physical platform as an IR[VP] or they may be deployed on
   separate platforms (e.g., for load balancing purposes).

   Each IR[VE] maintains a working set of IR[CE]s for which it caches
   EP-to-IR[CE] mappings in its Forwarding Information Base (FIB). Each
   IR[VE] also in turn propagates the list of EPs in its working set to
   each of the IR[VC]s in the VPC overlay network via a dynamic routing
   protocol (e.g., an overlay network internal BGP instance that carries
   only the EP-to-IR[VE] mappings and does not interact with the
   external BGP routing system). Each IR[VE] therefore only needs to
   track the EPs for its current working set of IR[CE]s, while each
   IR[VC] will maintain a full EP-to-IR[VE] mapping table that
   represents reachability information for all EPs in the VPC overlay
   network.

   Customers establish IR[CE]s to connect their EUNs to both the VPC
   overlay network and to the rest of the IRON. Each EUN can connect to
   the IRON via one or multiple IR[CE]s as long as the multiple IR[CE]s
   coordinate with one another, e.g., to mitigate EUN partitions.
   Unlike IR[VC]s and IR[VE]s, IR[CE]s may use private addresses behind
   one or several layers of NATs. The IR[CE] initially discovers a list
   of nearby IR[VE]s through an anycast discovery process.
   It then
   selects one of these nearby IR[VE]s as its server and forms a two-way
   tunnel with the IR[VE] through an initial exchange followed by
   periodic keepalives.

   After the IR[CE] selects a serving IR[VE], it forwards outbound
   packets from its EUNs by tunneling them to an IR[VC]/IR[VE] within
   the IRON that serves the final destination. When the IR[CE] cannot
   tunnel packets directly to an IR[VC]/IR[VE] that serves the final
   destination (e.g., when the destination address is a non-EPA address)
   it instead tunnels them to its own serving IR[VE].

   The IRON can also be used to support VPs of network layer address
   families that cannot be routed natively in the underlying
   Internetwork (e.g., OSI/CLNP within the public Internet, IPv6 within
   IPv4-only Internetworks, IPv4 within IPv6-only Internetworks, etc.).
   Further details for support of IRON VPs over non-native Internetworks
   are discussed in Appendix A.

5. IRON Initialization

   IRON initialization entails the startup actions of IRs within the VPC
   overlay network and customer EUNs. The following sections discuss
   these startup procedures.

5.1. IR[VC] Initialization

   Before its first operational use, each IR[VC] in a VPC overlay
   network is provisioned with the list of VPs that it will serve as
   well as the locators for all IR[VE]s that belong to the same overlay
   network. The IR[VC] is also provisioned with external BGP
   interconnections the same as for any BGP router.

   Upon startup, the IR[VC] engages in BGP routing exchanges with its
   peers in the IPv4 and IPv6 Internets the same as for any BGP router.
   It then connects to all of the IR[VE]s in the overlay network (e.g.,
   via a TCP connection over a two-way tunnel, via an iBGP route
   reflector, etc.) for the purpose of discovering EP->IR[VE] mappings.
   After the IR[VC] has fully populated its EP->IR[VE] mapping
   information database, it is said to be "synchronized" with respect to
   its VPs.

   After this initial synchronization procedure, the IR[VC] then
   advertises the overlay network's VPs externally. In particular, the
   IR[VC] advertises the IPv6 VPs into the IPv6 BGP routing system and
   advertises the IPv4 VPs into the IPv4 BGP routing system. If the
   IR[VC] only services IPv6 VPs (e.g., 2001:DB8::/32), it advertises
   the IPv6 VPs into the IPv6 routing system and also advertises a
   companion IPv4 prefix (e.g., 192.0.2.0/24) into the IPv4 routing
   system that can be used by IR[CE]s/IR[VE]s from other VPC overlay
   networks for anycast discovery purposes. Similarly, if the IR[VC]
   only services IPv4 VPs, it also advertises a companion IPv6 prefix
   (e.g., 2001:DB8::/56) into the IPv6 routing system. (See Appendix A
   for more information on the discovery and use of companion prefixes.)
   The IR[VC] then engages in ordinary packet forwarding operations.

5.2. IR[VE] Initialization

   Before its first operational use, each IR[VE] in a VPC overlay
   network is provisioned with the locators for all IR[VC]s that serve
   the overlay network's VPs. In order to support route optimization,
   the IR[VE] must also be provisioned with the list of all VPs in the
   IRON (i.e., not just the VPs of its own overlay network) so that it
   can discern EPA and non-EPA addresses. The IR[VE] should also
   discover the VP companion prefix relationships discussed in
   Section 5.1, e.g., via a global database such as the one discussed in
   Appendix A.

   Upon startup, each IR[VE] must connect to all of the IR[VC]s within
   its overlay network (e.g., via a TCP connection over a two-way
   tunnel, via an iBGP route reflector, etc.) for the purpose of
   reporting its EP->IR[VE] mappings.
   The IR[VE] then actively listens
   for IR[CE] customers which register their EPs as part of
   establishing a two-way tunnel. When a new IR[CE] registers its EPs,
   the IR[VE] announces the new EP additions to all IR[VC]s; when an
   existing IR[CE] unregisters its EPs, the IR[VE] withdraws its
   announcements.

5.3. IR[CE] Initialization

   Before its first operational use, each IR[CE] must obtain one or more
   EPs from its VPC as well as any companion prefixes of other address
   families (see Section 5.1) associated with the EPs. The IR[CE] must
   also be provisioned with the list of all VPs in the IRON (i.e., not
   just the VPs of its own overlay network) so that it can discern
   EPA and non-EPA addresses. The IR[CE] could therefore be greatly
   simplified if the list of VPs could be covered within a small number
   of very short prefixes, e.g., one or a few IPv6 ::/20s.

   The IR[CE] must also obtain a certificate and a public/private key
   pair from the VPC that it can later use to prove ownership of its
   EPs. This implies that each VPC must run its own key infrastructure
   to be used only for the purpose of verifying a customer's claimed
   right to use an EP. Hence, the VPC need not coordinate its key
   infrastructure with any other organization.

   Upon startup, the IR[CE] sends a SEAL Control Message Protocol (SCMP)
   Router Solicitation (SRS) message using an implicit anycast procedure
   to discover the nearest IR[VC] in its VPC overlay network. The
   IR[VC] will in turn return a list of locators of the company's nearby
   IR[VE]s. (This list is analogous to the ISATAP Potential Router List
   (PRL) [RFC5214].)

   To perform the implicit anycast procedure, the IR[CE] sets the source
   address of the SRS message to one of its locator addresses and sets
   the destination address of the message to any EPA taken from one of
   its own EPs.
   (If the EP is of a different address family than the
   IR[CE]'s locators, however, the IR[CE] instead sets the destination
   address to any address taken from the companion prefix associated
   with the EP.) This SRS message will be delivered to the nearest
   IR[VC] that attaches the VPC overlay network to the Internet. When
   the IR[VC] receives the SRS message, it sends back an SCMP Router
   Advertisement (SRA) message that lists the locator addresses of one
   or more nearby IR[VE] routers.

   After the IR[CE] receives an SRA message from the nearby IR[VC]
   listing the locator addresses of nearby IR[VE]s, it sends SRS test
   messages to one or more of the locator addresses to elicit SRA
   messages. The IR[VE] that configures the locator will include the
   header of the soliciting SRS message in its SRA message so that the
   IR[CE] can determine the number of hops along the forward path. The
   IR[VE] also includes a metric in its SRA messages indicating its
   service availability so that the IR[CE] can avoid selecting IR[VE]s
   that are overloaded. The IR[VE] also includes a challenge/response
   puzzle that the IR[CE] must answer if it wishes to enlist this
   IR[VE]'s services.

   When the IR[CE] receives these SRA messages, it can measure the time
   between sending the SRS and receiving the SRA as an indication of
   round-trip delay. If the IR[CE] wishes to enlist the
   services of a specific IR[VE] (e.g., based on the measured
   performance), it then calculates the answer to the puzzle using its
   keying information and sends the answer back to the IR[VE] in a new
   SRS message that also contains all of the EPs for which the IR[CE]
   claims ownership. If the IR[CE] answered the puzzle
   correctly, the IR[VE] will send back a new SRA message that includes
   a non-zero default router lifetime and that signifies the
   establishment of a two-way tunnel.
   (A zero default router lifetime, on the other hand, signifies that
   the IR[VE] is currently unable to establish a two-way tunnel, e.g.,
   due to heavy load or challenge/response failure.)

   Note that in the above procedure it is essential that the IR[CE]
   select one and only one IR[VE].  This allows the VPC overlay network
   mapping system to have one and only one active EP-to-IR[VE] mapping
   at any point in time, and that mapping shares fate with the IR[VE]
   itself.  If this IR[VE] fails, the IR[CE] will quickly select a new
   one, which will automatically update the VPC overlay network mapping
   system with a new EP-to-IR[VE] mapping.

6.  IRON Operation

   Following the IRON initialization detailed in Section 5, IRs engage
   in the steady-state process of receiving and forwarding packets.  All
   IRs forward encapsulated packets over the IRON using the mechanisms
   of VET [I-D.templin-intarea-vet] and SEAL [I-D.templin-intarea-seal],
   while IR[VC]s and IR[VE]s additionally forward packets to and from
   the native IPv6 and IPv4 Internets.  IRs also use the SEAL Control
   Message Protocol (SCMP) to coordinate with other IRs, including the
   process of sending and receiving redirect messages for route
   optimization.  Each IR operates as specified in the following
   subsections.

6.1.  IR[CE] Operation

   After selecting its serving IR[VE] as specified in Section 5.3, the
   IR[CE] should register each of its ISP connections with the IR[VE] in
   order to establish multiple two-way tunnels for multihoming purposes.
   To do so, it sends periodic SRS messages to its serving IR[VE] via
   each of its ISPs to establish additional two-way tunnels and to keep
   each two-way tunnel alive.
   These messages need not include challenge/response mechanisms, since
   proof of prefix ownership was already established in the initial
   exchange and a nonce in the SEAL header can be used to confirm that
   the SRS message was sent by the correct IR[CE].  This implies that a
   single nonce is used to represent the set of all two-way tunnels
   between the IR[CE] and the IR[VE].  Therefore, there are multiple
   two-way tunnels, and the nonce names this "bundle" of tunnels.

   If the IR[CE] ceases to receive SRA messages from its serving IR[VE]
   via a specific ISP connection, it marks the IR[VE] as unreachable
   from that address and therefore over that ISP connection.  (The
   IR[CE] must also inform its serving IR[VE] of this outage via one of
   its working ISP connections.)  If the IR[CE] ceases to receive SRA
   messages from its serving IR[VE] via multiple ISP connections, it
   marks the IR[VE] as unusable and quickly attempts to establish a
   connection with a new IR[VE].  The act of establishing the connection
   with a new serving IR[VE] will automatically purge the stale mapping
   state associated with the old serving IR[VE].

   When an end system in an EUN has a packet to send, the packet is
   forwarded through the EUN via normal routing until it reaches the
   IR[CE], which then tunnels the packet either to its serving IR[VE] or
   to an IR[VC]/IR[VE] that serves the packet's final destination.  When
   the IR[CE] does not know an outer destination locator address that
   can be used to reach an IR[VC]/IR[VE] that serves the packet's final
   destination (or if the final destination is a non-EPA address), the
   IR[CE] encapsulates the packet in an outer header with its locator as
   the source address and the locator of its serving IR[VE] as the
   destination address.
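
   As a rough illustration of the outer-destination choice that the
   paragraph above begins and the next paragraph completes, the
   following Python sketch applies the cases in the order the text gives
   them.  The function name, the argument names, and the cache of
   learned locators are assumptions made for this example; they are not
   defined by IRON, VET, or SEAL, and route-optimized next hops learned
   from redirects in the same-family case are deliberately left out.

```python
import ipaddress


def select_outer_destination(inner_dst, ce_locator, serving_ve_locator,
                             virtual_prefixes, learned_locators):
    """Pick the outer destination address for one packet (illustrative only).

    virtual_prefixes: iterable of ip_network objects, the provisioned VP list.
    learned_locators: dict mapping an ip_network (an EP) to the locator of an
        IR[VC]/IR[VE] serving it; a hypothetical cache, not an IRON structure.
    """
    dst = ipaddress.ip_address(inner_dst)
    loc = ipaddress.ip_address(ce_locator)
    serving = ipaddress.ip_address(serving_ve_locator)

    # A destination is an EPA only if a VP of the same family covers it.
    is_epa = any(vp.version == dst.version and dst in vp
                 for vp in virtual_prefixes)
    if not is_epa:
        return serving  # non-EPA destinations go to the serving IR[VE]

    if dst.version == loc.version:
        return dst  # same family: copy the inner destination outward

    # Different family: use a known locator for the covering EP, if any.
    matches = [(ep, locator) for ep, locator in learned_locators.items()
               if ep.version == dst.version and dst in ep]
    if matches:
        _, locator = max(matches, key=lambda m: m[0].prefixlen)
        return ipaddress.ip_address(locator)

    return serving  # nothing better is known
```

   For instance, with a VP of 2001:db8:1000::/36 (a documentation-range
   placeholder), an IR[CE] that holds only an IPv4 locator and has no
   learned locator for the destination would fall through to the last
   line and hand the packet to its serving IR[VE].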

   Otherwise, when the inner destination address matches the address
   family of the IR[CE]'s locator, the IR[CE] encapsulates the packet in
   an outer header with its locator as the source address and the
   destination address of the inner packet copied into the destination
   address of the outer packet.  When the inner destination address does
   not match the address family of the IR[CE]'s locator, but the IR[CE]
   knows of an outer locator address that can reach an IR[VC]/IR[VE]
   that serves the final destination, the IR[CE] encapsulates the packet
   with the outer destination address set to this outer locator address.
   The IR[CE] then forwards the encapsulated packet via one of its ISP
   connections, where normal Internet routing will convey it to an
   IR[VC]/IR[VE] that serves the destination.

   The IR[CE] uses the mechanisms specified in VET and SEAL to
   encapsulate each forwarded packet.  The IR[CE] further uses SCMP to
   coordinate with other IRs, including accepting redirect messages that
   indicate a better next hop.  When the IR[CE] receives an SCMP
   redirect, it checks the identification field of the encapsulated
   message to verify that the redirect corresponds to a packet that it
   had previously sent and accepts the redirect if there is a match.
   Thereafter, subsequent packets forwarded by the source IR[CE] will
   follow a route-optimized path.

6.2.  IR[VE] Operation

   After an IR[VE] is initialized, it responds to SRSs from IR[CE]s by
   sending SRAs as described in Section 6.1.  When the IR[VE] receives
   an SRS message from a new IR[CE], it sends back an SRA message with a
   challenge/response puzzle.  The IR[CE] in turn sends an SRS message
   with an answer to the puzzle.  If this authentication fails, the
   IR[VE] discards the message.
   Otherwise, it creates tunnel state for this new IR[CE], records the
   EPs in its FIB, and records the locator address from the SCMP message
   as the link-layer address of the next hop.  The IR[VE] next sends an
   SRA message back to the IR[CE] to complete the tunnel establishment.

   When the IR[VE] receives a SEAL-encapsulated packet from one of its
   IR[CE] tunnel endpoints, it examines the inner destination address.
   If the inner destination address is not an EPA, the IR[VE]
   decapsulates the packet and forwards it unencapsulated into the
   Internet if it is able to do so without loss due to ingress
   filtering.  Otherwise, the IR[VE] re-encapsulates the packet (i.e.,
   it removes the outer header and replaces it with a new outer header
   of the same address family) and sets the outer destination address to
   the locator address of an IR[VC] within its VPC overlay network.  It
   then forwards the re-encapsulated packet to the IR[VC], which will in
   turn decapsulate it and forward it into the Internet.

   If the inner destination address is an EPA, however, the IR[VE]
   re-encapsulates the packet, sets the outer source address to one of
   its own locator addresses, and sets the outer destination address to
   the inner destination address.  (If the outer header is of a
   different address family than the inner header, however, the IR[VE]
   instead sets the destination address to any address taken from the
   companion prefix associated with the inner destination address.)  The
   IR[VE] then forwards the re-encapsulated packet into the Internet via
   a default or more-specific route.  The IR[VE] may then receive SCMP
   redirect messages from an IR[VC]/IR[VE] that serves the destination
   EUN.  In that case, the IR[VE] forwards the redirect message to the
   IR[CE] that sent the original inner packet.
   The source and destination addresses of the forwarded SCMP redirect
   message use the outer destination and source addresses of the
   original packet, respectively.  This arrangement is necessary to
   allow the redirect messages to flow through any NATs on the path.

   When an IR[VE] 'A' receives a SEAL-encapsulated packet from an IR[VC]
   or from the Internet and the inner destination address matches an EP
   in its FIB, 'A' re-encapsulates the packet and forwards it to its
   client IR[CE] 'B', which in turn decapsulates the packet and forwards
   it to the correct end system in the EUN.  If 'B' has left notice with
   'A' that it has moved to a new IR[VE] 'C', however, 'A' will instead
   forward the re-encapsulated packet to 'C' and also send an SCMP
   redirect message back to the source of the packet.  In this way,
   IR[CE]s can change between IR[VE]s (e.g., due to mobility events)
   without exposing packets to loss.

6.3.  IR[VC] Operation

   After an IR[VC] has synchronized its VPs (see Section 5.1), it
   advertises the full set of the company's VPs into the IPv4 and IPv6
   Internet BGP routing systems.  The VPs will be represented as
   ordinary routing information in BGP, and any packets originating from
   the IPv4 or IPv6 Internet destined to an EPA covered by one of the
   VPs will be forwarded into the VPC's overlay network by an IR[VC].

   When an IR[VC] receives a packet from the Internet destined to an EPA
   covered by one of its VPs, it examines the packet format to determine
   the proper handling procedures as follows:

   o  If the packet is an SCMP SRS message, the IR[VC] sends an SRA
      message back to the source listing the locator addresses of nearby
      IR[VE] routers, then discards the SRS message.  The IR[VC]
      silently discards all other SCMP messages.

   o  If the packet is not SEAL-encapsulated, the IR[VC] looks in its
      FIB to discover a locator of the IR[VE] that serves the
      destination address.  The IR[VC] then simply encapsulates the
      packet with its own locator as the outer source address and the
      locator of the IR[VE] as the outer destination address and
      forwards the packet to the IR[VE].

   o  If the packet is SEAL-encapsulated, the IR[VC] sends an SCMP
      redirect message of the same address family back to the source
      with the locator of the serving IR[VE] as the redirected target.
      The source and destination addresses of the SCMP redirect message
      use the outer destination and source addresses of the original
      packet, respectively.  This arrangement is necessary to allow the
      redirect messages to flow through any NATs on the path.  After
      sending the redirect message, the IR[VC] rewrites the outer source
      address to one of its own locators, rewrites the outer destination
      address to the locator of the IR[VE], and forwards the packet to
      the IR[VE] (*).

   (*) Note that in this arrangement any errors that occur on the path
   between the IR[VC] and the IR[VE] will not be delivered to the
   original source.  This implies that the path between the IR[VC] and
   IR[VE] should be made as free from errors as possible (e.g., by
   connecting the IR[VC] and IR[VE] to the same physical link).

6.4.  IRON Reference Operating Scenarios

   The IRON is used to support communications when one or both hosts are
   located within EP-addressed EUNs, regardless of whether the EPs are
   provisioned by the same VPC or by different VPCs.  When both hosts
   are within IRON EUNs, route redirections that eliminate unnecessary
   IR[VC]s (and sometimes also IR[VE]s) from the path are possible.
   When only one host is within an IRON EUN, however, route optimization
   cannot be used.

   The following sections discuss the two scenarios.
   Note that it is sufficient to discuss the scenarios in a
   unidirectional fashion, i.e., by tracing packet flows only in the
   forward direction from the source host to the destination host.  The
   reverse direction can be considered separately and raises the same
   considerations as the forward direction.

6.4.1.  Both Hosts Within IRON EUNs

   When both hosts are within EP-addressed EUNs, the initial packets of
   the flow may need to involve an IR[VC] of the destination host, but
   route optimization can eliminate the IR[VC] from the path for
   subsequent packets.  Two sub-scenarios exist based on whether or not
   the IR[CE] of the source host configures a locator of the same
   address family as the inner packet.  The sub-cases are discussed in
   the following sections.

6.4.1.1.  IR[CE] of Source Host Configures a Locator of the Same
          Protocol Version as the EPA

   Figure 6 shows the flow of initial packets from host A to host B
   within two EP-addressed EUNs when the IR[CE] of the source host A
   configures a locator of the same protocol version as the inner
   packet:

   [Diagram: Host A in IRON EUN A attaches through IR[CE](A) and ISP A;
   Host B in IRON EUN B attaches through IR[CE](B) and ISP B.  Packets
   flow from IR[CE](A) across the Internet to IR[VC](B), then to
   IR[VE](B), and on through ISP B to IR[CE](B).  A redirect returns
   from IR[VC](B) to IR[CE](A).  The IRON is overlaid on the native
   Internet.]

        Figure 6: EPA/Locator Matching Scenario Before Redirects

   In this scenario, host A sends packets destined to host B (i.e.,
   packets with source address A and destination address B) via its
   network interface connected to EUN A.  (This interface could be a
   physical interface such as an Ethernet NIC, an ISATAP or VET tunnel
   virtual interface with IR[CE](A) as a PRL router, etc.)  Routing
   within EUN A will direct the packets to IR[CE](A) as a default router
   for the EUN, which then uses VET and SEAL to encapsulate them in
   outer headers with its locator address as the outer source address
   and B as the outer destination address (i.e., the inner and outer
   destination addresses will be the same).  IR[CE](A) then releases the
   encapsulated packets into its ISP network connection that provided
   its locator.  The ISP will release the packets into the Internet
   without filtering since the (outer) source address is topologically
   correct.  Once the packets have been released into the Internet,
   routing will direct them to the nearest IR[VC] that advertises
   reachability to a VP that covers destination address B (namely,
   IR[VC](B)).

   IR[VC](B) will receive the encapsulated packets from IR[CE](A) then
   check its FIB to discover an entry that covers address B with
   IR[VE](B) as the next hop.  IR[VC](B) will then issue SCMP redirect
   messages to inform IR[CE](A) that IR[VE](B) is a better next hop (*).
   IR[VC](B) then rewrites the outer source address of the encapsulated
   packets to its own locator address and rewrites the outer destination
   address of the encapsulated packets to the locator address of
   IR[VE](B).  IR[VC](B) then forwards these re-encapsulated packets to
   IR[VE](B).
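
   The IR[VC](B) step above (FIB lookup, redirect, outer-header rewrite)
   can be sketched in Python as follows.  This is a minimal model under
   stated assumptions: the Encapsulated record, the dictionary-based FIB
   keyed by EP, and the redirect represented as a plain dictionary are
   all hypothetical stand-ins, not SEAL or SCMP message formats.

```python
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Encapsulated:
    """Hypothetical view of a packet: one outer header around an inner one."""
    outer_src: str
    outer_dst: str
    inner_src: str
    inner_dst: str


def lookup_serving_ve(fib, inner_dst):
    """Longest-prefix match of inner_dst against the EP entries in the FIB."""
    dst = ipaddress.ip_address(inner_dst)
    matches = [(ep, ve) for ep, ve in fib.items()
               if ep.version == dst.version and dst in ep]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def vc_process(pkt, fib, vc_locator):
    """Return (redirect, forwarded_packet) for a SEAL-encapsulated packet."""
    ve_locator = lookup_serving_ve(fib, pkt.inner_dst)
    if ve_locator is None:
        return None, None  # no EP covers the destination; nothing to do here

    # The redirect swaps the original outer addresses so that it can
    # retrace the path of the original packet (including any NATs).
    redirect = {"src": pkt.outer_dst, "dst": pkt.outer_src,
                "target": ve_locator, "for": pkt.inner_dst}

    # Only the outer header changes; the inner packet is left untouched.
    forwarded = replace(pkt, outer_src=vc_locator, outer_dst=ve_locator)
    return redirect, forwarded
```

   In the Figure 6 scenario the original outer destination is B itself,
   so the redirect in this sketch carries B as its source address and
   the locator of IR[CE](A) as its destination address.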

   IR[VE](B) will receive the encapsulated packets from IR[VC](B) then
   check its FIB to discover an entry that covers destination address B
   with IR[CE](B) as the next hop.  IR[VE](B) then rewrites the outer
   source address of the packets to its own locator address and rewrites
   the outer destination address to the locator address of IR[CE](B).
   IR[VE](B) then tunnels these re-encapsulated packets to IR[CE](B),
   which will in turn decapsulate the packets and forward the inner
   packets to host B via EUN B.

   (*) Note that after the initial flow of packets, IR[CE](A) will have
   received one or more SCMP redirect messages from IR[VC](B) informing
   it of IR[VE](B) as a better next hop.  Thereafter, IR[CE](A) will
   forward its encapsulated packets directly to the locator address of
   IR[VE](B) without involving IR[VC](B) as shown in Figure 7:

   [Diagram: same hosts, EUNs, ISPs, and IR[CE]s as Figure 6, with
   IR[VC](B) no longer on the path.  Packets flow from IR[CE](A) across
   the Internet directly to IR[VE](B), then through ISP B to
   IR[CE](B).]

         Figure 7: EPA/Locator Matching Scenario After Redirects

6.4.1.2.  IR[CE] of Source Host Configures a Locator of a Different
          Protocol Version than the EPA

   Figure 8 shows the flow of initial packets from host A to host B
   within two EP-addressed EUNs when the IR[CE] of source host A cannot
   configure a locator of the same address family as the inner network
   layer protocol.  For example, if IR[CE](A) configures only an IPv4
   locator but EUN A uses IPv6 natively, IR[CE](A) is obliged to forward
   its initial packets through its serving IR[VE].

   [Diagram: same hosts, EUNs, ISPs, and IR[CE]s as Figure 6, with
   IR[VE](A) added.  Packets flow from IR[CE](A) to IR[VE](A), across
   the Internet to IR[VC](B), then to IR[VE](B), and on through ISP B to
   IR[CE](B).  A redirect is returned from IR[VC](B).]

       Figure 8: EPA/Locator Mismatching Scenario Before Redirects

   In this scenario, host A sends packets destined to host B via its
   network interface connected to EUN A.  Routing within EUN A will
   direct the packets to IR[CE](A) as a default router for the EUN,
   which then uses VET and SEAL to encapsulate them in outer headers
   with its locator address as the outer source address and the locator
   address of its serving IR[VE](A) as the outer destination address.
   IR[CE](A) then simply releases the encapsulated packets into its ISP
   network connection that provided its locator.  The ISP will release
   the packets into the Internet without filtering since the (outer)
   source address is topologically correct.  Once the packets have been
   released into the Internet, routing will direct them to IR[VE](A).

   IR[VE](A) receives the encapsulated packets from IR[CE](A) then
   rewrites the outer source address to its own locator address and
   rewrites the outer destination address to an address taken from the
   companion prefix associated with the VP that matches B.  IR[VE](A)
   then releases the re-encapsulated packets into the Internet, where
   routing will direct them to IR[VC](B), which advertises the companion
   prefix.

   IR[VC](B) will receive the encapsulated packets from IR[VE](A) then
   check its FIB to discover an entry that covers inner destination
   address B with IR[VE](B) as the next hop.  IR[VC](B) will then issue
   SCMP redirect messages to inform IR[VE](A) that IR[VE](B) is a better
   next hop (*).  IR[VC](B) then rewrites the outer source address of
   the encapsulated packets to its own locator address and rewrites the
   outer destination address to the locator address of IR[VE](B).
   IR[VC](B) then forwards these re-encapsulated packets to IR[VE](B).

   IR[VE](B) will receive the encapsulated packets from IR[VC](B) then
   check its FIB to discover an entry that covers destination address B
   with IR[CE](B) as the next hop.  IR[VE](B) then re-encapsulates the
   packets in an outer header with its own locator address as the outer
   source address and the locator address of IR[CE](B) as the outer
   destination address.  IR[VE](B) then releases these re-encapsulated
   packets into the Internet, where routing will direct them to
   IR[CE](B).  IR[CE](B) will in turn decapsulate the packets and
   forward the inner packets to host B via EUN B.
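
   To make the sequence of outer-header rewrites in this scenario easier
   to follow, the short Python sketch below traces the outer source and
   destination at each hop while the inner packet stays unchanged.  The
   symbolic names such as LOC(IR[VE](A)) and COMPANION(B) are
   placeholders invented for this illustration; they stand for the
   locator addresses and the companion-prefix address described above.

```python
def rewrite_outer(packet, outer_src, outer_dst):
    """Return a copy of the packet with a new outer header.

    The inner source and destination are carried over unchanged.
    """
    return {**packet, "outer_src": outer_src, "outer_dst": outer_dst}


def trace_mismatch_path():
    """List (hop, packet-as-sent) pairs for the initial packets of the flow."""
    # IR[CE](A) encapsulates toward its serving IR[VE](A).
    packet = {"inner_src": "A", "inner_dst": "B",
              "outer_src": "LOC(IR[CE](A))", "outer_dst": "LOC(IR[VE](A))"}
    trace = [("IR[CE](A)", packet)]

    # IR[VE](A) targets an address from the companion prefix for B's VP.
    packet = rewrite_outer(packet, "LOC(IR[VE](A))", "COMPANION(B)")
    trace.append(("IR[VE](A)", packet))

    # IR[VC](B) hands the packet to the IR[VE] that serves B.
    packet = rewrite_outer(packet, "LOC(IR[VC](B))", "LOC(IR[VE](B))")
    trace.append(("IR[VC](B)", packet))

    # IR[VE](B) tunnels the packet to IR[CE](B), which decapsulates it.
    packet = rewrite_outer(packet, "LOC(IR[VE](B))", "LOC(IR[CE](B))")
    trace.append(("IR[VE](B)", packet))
    return trace


for hop, pkt in trace_mismatch_path():
    print(f"{hop:10} {pkt['outer_src']:16} -> {pkt['outer_dst']}")
```

   Each row of the printed trace corresponds to one of the four
   paragraphs above, in order.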

   (*) Note that after the initial flow of packets, IR[VE](A) will have
   received one or more SCMP redirect messages from IR[VC](B) informing
   it of IR[VE](B) as a better next hop.  IR[VE](A) will in turn forward
   the redirects to IR[CE](A), which will thereafter forward its
   encapsulated packets directly to the locator address of IR[VE](B)
   without involving either IR[VE](A) or IR[VC](B) as shown earlier in
   Figure 7.

6.4.2.  Mixed IRON and Non-IRON Hosts

   When one host is within an IRON EUN and the other is in a non-IRON
   EUN (i.e., one that connects to the native Internet instead of the
   IRON), the IR elements involved depend on the packet flow directions.
   The cases are described in the following sections.

6.4.2.1.  From IRON Host A to Non-IRON Host B

   Figure 9 depicts the IRON reference operating scenario for packets
   flowing from Host A in an IRON EUN to Host B in a non-IRON EUN:

   [Diagram: Host A in IRON EUN A attaches through IR[CE](A) and ISP A
   inside the IRON; Host B in non-IRON EUN B attaches through Router B
   and ISP B in the native Internet.  IR[VC](A) and IR[VE](A) are drawn
   as two stacked halves of one box on the boundary between the IRON and
   the native Internet.  Packets flow from IR[CE](A) to IR[VE](A), then
   from IR[VC](A) across the native Internet to Router B.]

              Figure 9: From IRON Host A to Non-IRON Host B

   In this scenario, host A sends packets destined to host B via its
   network interface connected to IRON EUN A.
   Routing within EUN A will direct the packets to IR[CE](A) as a
   default router for the EUN, which then uses VET and SEAL to
   encapsulate them in outer headers with its locator address as the
   outer source address and the locator address of a serving IR[VE]
   (i.e., IR[VE](A)) as the outer destination address.  The ISP will
   pass the packets without filtering since the (outer) source address
   is topologically correct.  Once the packets have been released into
   the native Internet, routing will direct them to IR[VE](A).

   IR[VE](A) receives the encapsulated packets from IR[CE](A) then
   forwards them to IR[VC](A), which simply decapsulates them and
   releases the unencapsulated packets into the Internet.  Once the
   packets are released into the Internet, routing will direct them to
   the final destination B.  (Note that in this diagram IR[VE](A) and
   IR[VC](A) are depicted as two halves of a unified IR[VP](A).  In that
   case, the "forwarding" between IR[VE](A) and IR[VC](A) is a
   zero-instruction imaginary operation.)

   Note that this scenario always involves an IR[VE](A) and IR[VC](A)
   owned by the VPC that provides service to IRON EUN A.  This scenario
   therefore imparts a cost that would need to be borne by either the
   VPC or its customers.

6.4.2.2.  From Non-IRON Host B to IRON Host A

   Figure 10 depicts the IRON reference operating scenario for packets
   flowing from Host B in a non-IRON EUN to Host A in an IRON EUN:

   [Diagram: same elements as Figure 9 with the flow reversed.  Packets
   from Router B cross the native Internet to IR[VC](A), which is drawn
   stacked with IR[VE](A) as two halves of one box; IR[VE](A) then sends
   them through ISP A to IR[CE](A).]

             Figure 10: From Non-IRON Host B to IRON Host A

   In this scenario, host B sends packets destined to host A via its
   network interface connected to non-IRON EUN B.  Routing will direct
   the packets to IR[VC](A), which then forwards them to IR[VE](A) using
   encapsulation if necessary.  (Note that in this diagram IR[VE](A) and
   IR[VC](A) are depicted as two halves of a unified IR[VP](A).  In that
   case, the "forwarding" between IR[VE](A) and IR[VC](A) is a
   zero-instruction imaginary operation.)

   IR[VE](A) will then check its FIB to discover an entry that covers
   destination address A with IR[CE](A) as the next hop.  IR[VE](A) then
   encapsulates the packets using its own locator address as the outer
   source address and the locator address of IR[CE](A) as the outer
   destination address.  IR[VE](A) then releases these encapsulated
   packets into the Internet, where routing will direct them to
   IR[CE](A).  IR[CE](A) will in turn decapsulate the packets and
   forward the inner packets to host A via its network interface
   connected to IRON EUN A.

   Note that this scenario always involves an IR[VE](A) and IR[VC](A)
   owned by the VPC that provides service to IRON EUN A.  This scenario
   therefore imparts a cost that would need to be borne by either the
   VPC or its customers.

6.5.  Mobility, Multihoming and Traffic Engineering Considerations

   While IR[VE]s and IR[VC]s can be considered as fixed infrastructure,
   IR[CE]s may need to move between different network points of
   attachment, connect to multiple ISPs, or explicitly manage their
   traffic flows.  The following sections discuss mobility, multihoming
   and traffic engineering considerations for IR[CE]s.

6.5.1.  Mobility Management

   When an IR[CE] changes its network point of attachment (e.g., due to
   a mobility event), it configures one or more new locators.  If the
   IR[CE] has not moved far away from its previous network point of
   attachment, it simply informs its serving IR[VE] of any locator
   additions or deletions.  This operation is performance-sensitive and
   should be conducted immediately to avoid packet loss.

   If the IR[CE] has moved far away from its previous network point of
   attachment, however, it re-issues the implicit anycast discovery
   procedure described in Section 5.3 to discover whether its candidate
   set of serving IR[VE]s has changed.  If the IR[CE]'s current serving
   IR[VE] is also included in the new list received from the VPC, this
   serves as an indication that the IR[CE] has not moved far enough to
   warrant changing to a new serving IR[VE].  Otherwise, the IR[CE] may
   wish to move to a new serving IR[VE] in order to maintain optimal
   routing.  This operation is not performance-critical and can
   therefore be conducted over a matter of seconds or minutes instead of
   milliseconds or microseconds.

   To move to a new IR[VE], the IR[CE] first engages in the EP
   registration process with the new IR[VE] and maintains the
   registrations through periodic SRS/SRA exchanges in the same way as
   described in Section 6.1.  The IR[CE] then informs its former IR[VE]
   that it has moved by providing it with the locator address of the new
   IR[VE].
   The IR[CE] then discontinues the SRS/SRA keepalive process with the
   former IR[VE], which will garbage-collect the stale FIB entries when
   their lifetime expires.  Until then, the former IR[VE] can redirect
   existing correspondents to the new IR[VE] so that no packets are
   lost.

6.5.2.  Multihoming

   An IR[CE] may register multiple locators with its serving IR[VE].  It
   can assign metrics to its registrations to inform its IR[VE] of
   preferred locators, and can select outgoing locators according to its
   local preferences.  Multihoming is therefore naturally supported.

6.5.3.  Inbound Traffic Engineering

   An IR[CE] can dynamically adjust the priorities of its prefix
   registrations with its serving IR[VE] in order to influence inbound
   traffic flows.  It can also change between serving IR[VE]s when
   multiple IR[VE]s are available, but should strive for stability in
   its IR[VE] selection in order to limit routing churn.

6.5.4.  Outbound Traffic Engineering

   An IR[CE] can select outgoing locators, e.g., based on current QoS
   considerations such as minimizing one-way delay or one-way delay
   variation.

6.6.  Renumbering Considerations

   As better link layer technologies and service plans emerge, customers
   will be motivated to select their service providers through healthy
   competition between ISPs.  If a customer's EUN addresses are tied to
   a specific ISP, however, the customer may be forced to undergo a
   painstaking EUN renumbering process if it wishes to change to a
   different ISP [RFC4192][RFC5887].

   When a customer obtains EP prefixes from a VPC, it can change between
   ISPs seamlessly and without the need to renumber.  If the VPC itself
   applies unreasonable cost structures for use of the EPs, however, the
   customer may be compelled to seek a different VPC and would again be
   required to confront a renumbering scenario.
   The IRON approach to renumbering avoidance therefore depends on
   VPCs conducting ethical business practices and offering reasonable
   rates.

6.7.  NAT Traversal Considerations

   The Internet today consists of a global public IPv4 routing and
   addressing system with non-IRON EUNs that use either public or
   private IPv4 addressing. The latter class of EUNs connects to the
   public Internet via Network Address Translators (NATs). When an
   IR[CE] is located behind a NAT, it selects IR[VE]s using the same
   procedures as for IR[CE]s with public addresses, i.e., it will send
   SRS messages to IR[VE]s in order to get SRA messages in return. The
   only requirement is that the IR[CE] must configure its SEAL
   encapsulation to use a transport protocol that supports NAT
   traversal, namely UDP.

   Since the IR[VE] maintains state about its IR[CE] customers, it can
   discover locator information for each IR[CE] by examining the UDP
   port number and IP address in the outer headers of SRS messages.
   When there is a NAT in the path, the UDP port number and IP address
   in the SRS message will correspond to state in the NAT box and might
   not correspond to the actual values assigned to the IR[CE]. The
   IR[VE] can then encapsulate packets destined to hosts serviced by the
   IR[CE] within outer headers that use this IP address and UDP port
   number. The NAT box will receive the packets, translate the values
   in the outer headers to match those assigned to the IR[CE], then
   forward the packets to the IR[CE]. In this sense, the IR[VE]'s
   "locator" for the IR[CE] consists of the concatenation of the IP
   address and UDP port number.

   IRON does not add any new issues to the complications already raised
   by NAT traversal or by applications that embed address referrals in
   their payload.

6.8.  Nested EUN Considerations

   Each IR[CE] configures a locator that may be taken from an ordinary
   non-EPA address assigned by an ISP or from an EPA address taken from
   an EP assigned to another IR[CE]. In the latter case, the IR[CE] is
   said to be "nested" within the EUN of another IR[CE].

   For example, assume a configuration in which IR[CE](A) configures a
   locator EPA(B) taken from the EP assigned to EUN(B). IR[CE](B) in
   turn configures a locator EPA(C) taken from the EP assigned to
   EUN(C). Finally, IR[CE](C) configures a locator ISP(D) taken from a
   non-EPA address delegated by an ordinary ISP(D). Using this example,
   the "nested-IRON" case must be examined in which host A, which
   configures the address EPA(A) within EUN(A), exchanges packets with
   host Z located elsewhere in the Internet. The example configuration
   is depicted in Figure 11:

                         .-.
              EPA(D)  ,-(  _)-.
   +-----------+   .-(_    (_  )-.
   | IR[CE](C) |--(_    ISP(D)    )
   +-----+-----+     `-(______)-'
         |  <= T                 \       .-.
        .-.      u                \   ,-(  _)-.
     ,-(  _)-.     n               .-(_    (-  )-.
  .-(_    (_  )-.    n            (_   Internet   )
 (_    EUN(C)    )     e             `-(______)-    +--------+
    `-(______)-'         l       ___                | Host Z |
         | EPA(C)          s => (:::)-.             +--------+
   +-----+-----+            .-(::::::::)
   | IR[CE](B) |         .-(::::::::::::)-.
   +-----+-----+        (:::: The IRON ::::)
         |               `-(::::::::::::)-'
        .-.                 `-(::::::)-'
     ,-(  _)-.
  .-(_    (_  )-.        +-----------------+
 (_    EUN(B)    )       |  IR[VP/VC/VE]s  |
    `-(______)-'         +-----------------+
         | EPA(B)
   +-----+-----+
   | IR[CE](A) |
   +-----------+
         |
        .-.
     ,-(  _)-.    EPA(A)
  .-(_    (_  )-.    +--------+
 (_    EUN(A)    )---| Host A |
    `-(______)-'     +--------+

                   Figure 11: Nested EUN Example

   The two cases of host A sending packets to host Z, and host Z sending
   packets to host A, must be considered separately as described below.

6.8.1.  Host A Sends Packets to Host Z

6.8.1.1.  Nested IRON Example When Z Configures an EPA Address

   Host A first forwards a packet with source address EPA(A) and
   destination address EPA(Z) into EUN(A). Routing within EUN(A) will
   direct the packet to IR[CE](A), which encapsulates it in an outer
   header with EPA(B) as the outer source address and EPA(Z) as the
   outer destination address, then forwards the once-encapsulated packet
   into EUN(B). Routing within EUN(B) will direct the packet to
   IR[CE](B), which encapsulates it in an outer header with EPA(C) as
   the outer source address and EPA(Z) as the outer destination address,
   then forwards the twice-encapsulated packet into EUN(C). Routing
   within EUN(C) will direct the packet to IR[CE](C), which encapsulates
   it in an outer header with ISP(D) as the outer source address and
   EPA(Z) as the outer destination address. IR[CE](C) then sends this
   triple-encapsulated packet into the ISP(D) network, where it will be
   routed into the Internet to an IR[VC](Z) that advertises a VP that
   covers destination address EPA(Z).

   When IR[VC](Z) receives the triple-encapsulated packet, it consults
   its FIB to determine that IR[VE](Z) is the serving router for EP(Z).
   It then re-encapsulates the packet by changing the outer source
   address to its own locator address and the outer destination address
   to the locator address for IR[VE](Z). It also sends a redirect
   message back to IR[CE](C) as normal. When IR[VE](Z) receives the
   triple-encapsulated packet, it strips off all outer layers of
   encapsulation and re-encapsulates the inner packet in a single outer
   header using its own locator address as the source address and the
   locator address of IR[CE](Z) as the destination address. IR[VE](Z)
   then tunnels the packet to IR[CE](Z), which decapsulates the packet
   and forwards it to host Z.
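
   As an illustration only, the forwarding steps above can be traced
   with a minimal Python sketch. The Packet class, the helper
   functions, and the locator strings LOC(VC-Z), LOC(VE-Z), and
   LOC(CE-Z) are hypothetical names introduced here; they are not
   defined by IRON, VET, or SEAL. Real SEAL encapsulation carries
   additional header fields, and the redirect message sent by the
   IR[VC] is not modeled.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Packet:
    """An IP packet reduced to the fields this walkthrough needs."""
    src: str
    dst: str
    payload: Union["Packet", bytes]


def encapsulate(inner: Packet, src: str, dst: str) -> Packet:
    """Wrap `inner` in one more outer header."""
    return Packet(src, dst, inner)


def layers(pkt: Packet) -> int:
    """Count the outer headers wrapped around the innermost packet."""
    count = 0
    while isinstance(pkt.payload, Packet):
        count += 1
        pkt = pkt.payload
    return count


def vc_redirect(pkt: Packet, vc_locator: str, ve_locator: str) -> Packet:
    """IR[VC]: rewrite only the outermost source and destination."""
    return Packet(vc_locator, ve_locator, pkt.payload)


def ve_deliver(pkt: Packet, ve_locator: str, ce_locator: str) -> Packet:
    """IR[VE]: strip every outer layer, then add one header to the IR[CE]."""
    while isinstance(pkt.payload, Packet):
        pkt = pkt.payload
    return encapsulate(pkt, ve_locator, ce_locator)


# Host A's original packet, then one outer header per nested IR[CE].
packet = Packet("EPA(A)", "EPA(Z)", b"data")
for locator in ("EPA(B)", "EPA(C)", "ISP(D)"):
    packet = encapsulate(packet, locator, "EPA(Z)")
sent_depth = layers(packet)       # headers present when leaving IR[CE](C)

# IR[VC](Z) re-addresses the outer header; IR[VE](Z) collapses the stack.
packet = vc_redirect(packet, "LOC(VC-Z)", "LOC(VE-Z)")
packet = ve_deliver(packet, "LOC(VE-Z)", "LOC(CE-Z)")
delivered_depth = layers(packet)  # headers present when reaching IR[CE](Z)
```

   In this sketch the packet leaves IR[CE](C) with three outer headers
   and reaches IR[CE](Z) with one, which mirrors the sequence described
   above. Modeling the payload as a nested Packet keeps each layer
   explicit; an implementation would operate on byte buffers instead.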

   The key architectural requirement derived from this case is that each
   IR[VE] must iteratively decapsulate each layer of a
   multi-encapsulated packet when the outer destination address matches
   an EPA assigned to one of its IR[CE] customers. When the final such
   layer of encapsulation is reached, the IR[VE] must re-encapsulate the
   packet and forward it to the correct customer IR[CE].

6.8.1.2.  Nested IRON Example When Z Configures a Non-EPA Address

   Host A first forwards a packet with source address EPA(A) and
   destination address Z into EUN(A). Routing within EUN(A) will direct
   the packet to IR[CE](A), which encapsulates it in an outer header
   with EPA(B) as the outer source address and IR[VE](A) as the outer
   destination address, then forwards this once-encapsulated packet into
   EUN(B). (Note that IR[CE](A) must forward this packet via its
   serving IR[VE](A) for reasons explained in Section 6.4.2.) Routing
   within EUN(B) will direct the packet to IR[CE](B), which encapsulates
   it in an outer header with EPA(C) as the outer source address and
   IR[VE](B) as the outer destination address, then forwards this
   twice-encapsulated packet into EUN(C). Routing within EUN(C) will
   direct the packet to IR[CE](C), which encapsulates it in an outer
   header with ISP(D) as the outer source address and IR[VE](C) as the
   outer destination address. IR[CE](C) then sends this
   triple-encapsulated packet into its ISP network, where it will be
   routed to IR[VE](C).

   To ease discussion of this case, now consider that each IR[VE]
   named above is half of a unified IR[VP] that combines both the IR[VC]
   and IR[VE] functions.
   With this simplification in mind, when
   IR[VP](C) receives the triple-encapsulated packet, it removes the
   outermost layer of encapsulation and forwards the twice-encapsulated
   packet into the Internet where Internet routing will direct it to
   IR[VP](B). IR[VP](B) in turn removes the next layer of encapsulation
   and forwards the once-encapsulated packet into the Internet where
   Internet routing will direct it to IR[VP](A). IR[VP](A) will finally
   remove the final layer of encapsulation and forward the packet into
   the Internet where Internet routing will direct it to host Z.

   The key architectural requirement derived from this case is that each
   IR[VE] must iteratively decapsulate each layer of a
   multi-encapsulated packet when the outer destination address is one
   of its own locator addresses. When the final such layer of
   encapsulation is reached, the IR[VE] forwards the packet into the
   Internet.

6.8.2.  Host Z Sends Packets to Host A

   Whether or not host Z configures an EPA address, its packets destined
   to host A will eventually reach IR[VE](A). IR[VE](A) will have a
   mapping that lists IR[CE](A) as the next hop toward EPA(A).
   IR[VE](A) will then encapsulate the packet with EPA(B) as the outer
   destination address and forward the packet into the Internet.
   Internet routing will convey this once-encapsulated packet to
   IR[VE](B), which will have a mapping that lists IR[CE](B) as the next
   hop toward EPA(B). IR[VE](B) will then encapsulate the packet with
   EPA(C) as the outer destination address and forward the packet into
   the Internet. Internet routing will then convey this
   twice-encapsulated packet to IR[VE](C), which will have a mapping
   that lists IR[CE](C) as the next hop toward EPA(C). IR[VE](C) will
   then encapsulate the packet with ISP(D) as the outer destination
   address and forward the packet into the Internet.
   Internet routing will then
   convey this triple-encapsulated packet to IR[CE](C).

   When the triple-encapsulated packet arrives at IR[CE](C), IR[CE](C)
   strips the outer layer of encapsulation and forwards the
   twice-encapsulated packet to EPA(C), which is the locator address of
   IR[CE](B). When IR[CE](B) receives the twice-encapsulated packet, it
   strips the outer layer of encapsulation and forwards the
   once-encapsulated packet to EPA(B), which is the locator address of
   IR[CE](A). When IR[CE](A) receives the once-encapsulated packet, it
   strips the outer layer of encapsulation and forwards the
   unencapsulated packet to EPA(A), which is the host address of host A.

   The key architectural requirement derived from this case is that each
   IR[CE] must decapsulate only the outermost layer of a
   multi-encapsulated packet when the outer destination address matches
   an EPA assigned to a device in its EUN. This class of packets can be
   considered "inbound" with respect to the IR[CE]'s EUNs. The outbound
   cases are discussed in Section 6.8.1.

7.  Additional Considerations

   Considerations for the scalability of Internet Routing due to
   multihoming, traffic engineering and provider-independent addressing
   are discussed in [I-D.narten-radir-problem-statement]. Route
   optimization considerations for mobile networks are found in
   [RFC5522].

8.  Related Initiatives

   IRON builds upon the concepts of the RANGER architecture [RFC5720],
   and therefore inherits the same set of related initiatives.

   Virtual Aggregation (VA) [I-D.ietf-grow-va] and Aggregation in
   Increasing Scopes (AIS) [I-D.zhang-evolution] provide the basis for
   the Virtual Prefix concepts.

   Internet Vastly Improved Plumbing (Ivip) [I-D.whittle-ivip-arch] has
   contributed valuable insights, including the use of real-time
   mapping.
   The use of IR[VE]s as mobility anchor points is directly
   influenced by Ivip's associated TTR mobility extensions [TTRMOB].

   [I-D.bernardos-mext-nemo-ro-cr] discussed a route optimization
   approach using a Correspondent Router (CR) model. The IRON IR[VE]
   construct is similar to the CR concept described in that work;
   however, the manner in which customer EUNs coordinate with IR[VE]s is
   different and is based on the redirection model associated with NBMA
   links.

   Numerous publications have proposed NAT traversal techniques. The
   NAT traversal techniques adapted for IRON were inspired by the Simple
   Address Mapping for Premises Legacy Equipment (SAMPLE) proposal
   [I-D.carpenter-softwire-sample].

9.  IANA Considerations

   There are no IANA considerations for this document.

10.  Security Considerations

   Security considerations that apply to tunneling in general are
   discussed in [I-D.ietf-v6ops-tunnel-security-concerns]. Additional
   considerations that also apply to IRON are discussed in RANGER
   [RFC5720], VET [I-D.templin-intarea-vet] and SEAL
   [I-D.templin-intarea-seal].

   IR[CE]s require a means for securely registering their EP-to-locator
   bindings with their VPC. Each VPC provides its customer IR[CE]s with
   a secure means for registering and re-registering their mappings.

11.  Acknowledgements

   The ideas behind this work have benefited greatly from discussions
   with colleagues, some of which appear on the RRG and other IRTF/IETF
   mailing lists. Robin Whittle and Steve Russert co-authored the TTR
   mobility architecture, which strongly influenced IRON. Eric
   Fleischman pointed out the opportunity to leverage anycast for
   discovering topologically-close servers. Thomas Henderson
   recommended a quantitative analysis of scaling properties.

   The following individuals provided essential review input: Mohamed
   Boucadair, Wesley Eddy, Dae Young Kim and Robin Whittle.

12.  References

12.1.  Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              September 1981.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

12.2.  Informative References

   [BGPMON]   net, B., "BGPmon.net - Monitoring Your Prefixes,
              http://bgpmon.net/stat.php", June 2010.

   [I-D.bernardos-mext-nemo-ro-cr]
              Bernardos, C., Calderon, M., and I. Soto, "Correspondent
              Router based Route Optimisation for NEMO (CRON)",
              draft-bernardos-mext-nemo-ro-cr-00 (work in progress),
              July 2008.

   [I-D.carpenter-softwire-sample]
              Carpenter, B. and S. Jiang, "Legacy NAT Traversal for
              IPv6: Simple Address Mapping for Premises Legacy Equipment
              (SAMPLE)", draft-carpenter-softwire-sample-00 (work in
              progress), June 2010.

   [I-D.ietf-grow-va]
              Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and
              L. Zhang, "FIB Suppression with Virtual Aggregation",
              draft-ietf-grow-va-02 (work in progress), March 2010.

   [I-D.ietf-v6ops-tunnel-security-concerns]
              Hoagland, J., Krishnan, S., and D. Thaler, "Security
              Concerns With IP Tunneling",
              draft-ietf-v6ops-tunnel-security-concerns-02 (work in
              progress), March 2010.

   [I-D.narten-radir-problem-statement]
              Narten, T., "On the Scalability of Internet Routing",
              draft-narten-radir-problem-statement-05 (work in
              progress), February 2010.

   [I-D.russert-rangers]
              Russert, S., Fleischman, E., and F. Templin, "RANGER
              Scenarios", draft-russert-rangers-05 (work in progress),
              July 2010.

   [I-D.templin-intarea-seal]
              Templin, F., "The Subnetwork Encapsulation and Adaptation
              Layer (SEAL)", draft-templin-intarea-seal-16 (work in
              progress), July 2010.

   [I-D.templin-intarea-vet]
              Templin, F., "Virtual Enterprise Traversal (VET)",
              draft-templin-intarea-vet-16 (work in progress),
              July 2010.

   [I-D.whittle-ivip-arch]
              Whittle, R., "Ivip (Internet Vastly Improved Plumbing)
              Architecture", draft-whittle-ivip-arch-04 (work in
              progress), March 2010.

   [I-D.zhang-evolution]
              Zhang, B. and L. Zhang, "Evolution Towards Global Routing
              Scalability", draft-zhang-evolution-02 (work in progress),
              October 2009.

   [RFC1070]  Hagens, R., Hall, N., and M. Rose, "Use of the Internet as
              a subnetwork for experimentation with the OSI network
              layer", RFC 1070, February 1989.

   [RFC3849]  Huston, G., Lord, A., and P. Smith, "IPv6 Address Prefix
              Reserved for Documentation", RFC 3849, July 2004.

   [RFC4192]  Baker, F., Lear, E., and R. Droms, "Procedures for
              Renumbering an IPv6 Network without a Flag Day", RFC 4192,
              September 2005.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4548]  Gray, E., Rutemiller, J., and G. Swallow, "Internet Code
              Point (ICP) Assignments for NSAP Addresses", RFC 4548,
              May 2006.

   [RFC5214]  Templin, F., Gleeson, T., and D. Thaler, "Intra-Site
              Automatic Tunnel Addressing Protocol (ISATAP)", RFC 5214,
              March 2008.

   [RFC5522]  Eddy, W., Ivancic, W., and T. Davis, "Network Mobility
              Route Optimization Requirements for Operational Use in
              Aeronautics and Space Exploration Mobile Networks",
              RFC 5522, October 2009.

   [RFC5720]  Templin, F., "Routing and Addressing in Networks with
              Global Enterprise Recursion (RANGER)", RFC 5720,
              February 2010.

   [RFC5737]  Arkko, J., Cotton, M., and L. Vegoda, "IPv4 Address Blocks
              Reserved for Documentation", RFC 5737, January 2010.

   [RFC5743]  Falk, A., "Definition of an Internet Research Task Force
              (IRTF) Document Stream", RFC 5743, December 2009.

   [RFC5887]  Carpenter, B., Atkinson, R., and H. Flinck, "Renumbering
              Still Needs Work", RFC 5887, May 2010.

   [TTRMOB]   Whittle, R. and S. Russert, "TTR Mobility Extensions for
              Core-Edge Separation Solutions to the Internet's Routing
              Scaling Problem,
              http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf",
              August 2008.

Appendix A.  IRON VPs Over Internetworks with Different Address Families

   The IRON architecture leverages the routing system by providing
   generally shortest-path routing for packets with EPA addresses from
   VPs that match the address family of the underlying Internetwork.
   When the VPs are of an address family that is not routable within the
   underlying Internetwork, however (e.g., when OSI/NSAP [RFC4548] VPs
   are used within an IPv4 Internetwork), a global mapping database is
   required to allow IR[VE]s to map VPs to companion prefixes taken from
   address families that are routable within the Internetwork. For
   example, an IPv6 VP (e.g., 2001:DB8::/32) could be paired with a
   companion IPv4 prefix (e.g., 192.0.2.0/24) so that encapsulated IPv6
   packets can be forwarded over IPv4-only Internetworks.

   Every VP in the IRON must therefore be represented in a globally
   distributed Master VP database (MVPd) that maintains VP-to-companion
   prefix mappings for all VPs in the IRON. The MVPd is maintained by a
   globally-managed assigned numbers authority in the same manner as the
   Internet Assigned Numbers Authority (IANA) currently maintains the
   master list of all top-level IPv4 and IPv6 delegations. The database
   can be replicated across multiple servers for load balancing, much in
   the same way that FTP mirror sites are used to manage software
   distributions.

   Upon startup, each IR[VE] discovers the full set of VPs for the IRON
   by reading the MVPd.
   The IR[VE] reads the MVPd from a nearby server
   and periodically checks the server for deltas since the database was
   last read. After reading the MVPd, the IR[VE] has a full list of
   VP-to-companion prefix mappings and is said to be "synchronized with
   the IRON".

   The IR[VE] can then forward packets toward EPAs covered by a VP by
   encapsulating them in an outer header of the VP's companion prefix
   address family and using any address taken from the companion prefix
   as the outer destination address. The companion prefix therefore
   serves as an implicit anycast prefix.

   Possible encapsulations in this model include IPv6-in-IPv4,
   IPv4-in-IPv6, OSI/CLNP-in-IPv6, OSI/CLNP-in-IPv4, etc.

Appendix B.  Scaling Considerations

   Scaling aspects of the IRON architecture have strong implications for
   its applicability in practical deployments. Scaling must be
   considered along multiple vectors including interdomain core routing
   scaling, scaling to accommodate large numbers of customer EUNs,
   traffic scaling, state requirements, etc.

   In terms of routing scaling, each VPC will advertise one or more VPs
   from which EPs are delegated to customer EUNs. The routing load on
   the interdomain core is therefore minimized when each VP covers many
   EPs. For example, the IPv6 prefix 2001:DB8::/32 contains 2^24 ::/56
   prefixes for assignment to EUNs. Therefore, 2^24 EUNs can be
   represented as a single VP in the interdomain routing core. The IRON
   could therefore accommodate 10^10 IPv6 ::/56 EPs with only about 600
   IPv6 ::/32 VPs advertised in the interdomain routing core.

   In terms of traffic scaling for IR[VC]s, each IR[VC] represents an
   ASBR of a "shell" enterprise network that simply turns arriving
   traffic packets with EPA destination addresses back out into the
   Internet towards IR[VE]s that service customer EUNs.
   Moreover, the
   IR[VC] sheds traffic destined to EPAs through redirection, which
   removes it from the path for the vast majority of traffic packets.
   On the other hand, IR[VC]s must handle all traffic packets forwarded
   between EUNs and the non-IRON Internet. The scaling concerns for
   this latter class of traffic are no different than for ASBRs
   that connect large enterprise networks to the Internet. In terms of
   traffic scaling for IR[VE]s, each IR[VE] services a set of the VPC
   overlay network's customer EUNs. The IR[VE] services all traffic
   packets destined to its EUNs but only services the initial packets of
   flows initiated from the EUNs. Therefore, traffic scaling is an
   asymmetric consideration and is proportional to the number of EUNs
   each IR[VE] serves.

   In terms of state requirements for IR[VC]s, each IR[VC] maintains a
   list of all IR[VE]s in the VPC overlay network as well as all
   customer EUNs that each IR[VE] serves. This state is therefore
   dominated by the number of EUNs in the VPC overlay network. Sizing
   the IR[VC] to accommodate state information for all EUNs is therefore
   required during VPC overlay network planning. In terms of state
   requirements for IR[VE]s, each IR[VE] maintains two-way tunnel state
   for each of the customer EUNs it serves but need not keep state for
   all EUNs in the VPC overlay network. Finally, neither IR[VC]s nor
   IR[VE]s need keep state for final destinations of outbound traffic.

   IR[CE]s source and sink all traffic packets originating from or
   destined to the customer EUN. Therefore, traffic scaling
   considerations for IR[CE]s are the same as for any site border
   router. IR[CE]s also retain state for the final destinations of
   outbound traffic flows. This can be managed as soft state, since
   stale entries purged from the cache will be refreshed when new
   traffic packets are sent.
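
   The routing-scaling arithmetic earlier in this appendix can be
   checked with a short calculation. The sketch below is illustrative
   only: it uses Python's standard ipaddress module, and the variable
   names are assumptions introduced here. A ::/32 VP holds
   2^(56 - 32) = 2^24 ::/56 EPs, so the number of VPs needed for a
   given EP population is that population divided by 2^24, rounded up.

```python
import ipaddress

# Example VP and EP prefix length taken from the text of this appendix.
vp = ipaddress.ip_network("2001:db8::/32")
ep_prefix_len = 56

# Number of ::/56 EPs that fit inside one ::/32 VP.
eps_per_vp = 2 ** (ep_prefix_len - vp.prefixlen)

# VPs required to cover a target EP population (integer ceiling division).
target_eps = 10 ** 10
vps_needed = (target_eps + eps_per_vp - 1) // eps_per_vp

# First EP that would be delegated from this VP.
first_ep = next(vp.subnets(new_prefix=ep_prefix_len))

print(eps_per_vp, vps_needed, first_ep)
```

   The same calculation applies to other VP and EP lengths by changing
   the two prefix lengths.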

Author's Address

   Fred L. Templin (editor)
   Boeing Research & Technology
   P.O. Box 3707 MC 7L-49
   Seattle, WA 98124
   USA

   Email: fltemplin@acm.org