INTERNET-DRAFT                                             Matt Crawford
                                                                Fermilab
                                                          Allison Mankin
                                                                     ISI
                                                           Thomas Narten
                                                                     IBM
                                                    John W. Stewart, III
                                                                 Juniper
                                                             Lixia Zhang
                                                                    UCLA
                                                       February 12, 1999

           Separating Identifiers and Locators in Addresses:
                An Analysis of the GSE Proposal for IPv6

                 draft-ietf-ipngwg-esd-analysis-04.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026 except that the right to
   produce derivative works is not granted.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   On February 27-28, 1997, the IPng Working Group held an interim
   meeting in Palo Alto, California to consider adopting Mike O'Dell's
   "GSE - An Alternate Addressing Architecture for IPv6" proposal [GSE].
   In GSE, 16-byte IPv6 addresses are split into distinct portions for
   global routing, local routing and end-point identification. GSE
   includes the feature of configuring a node internal to a site with
   only the local routing and end-point identification portions of the
   address, thus hiding the full address from the node. When such a
   node generates a packet, only the low-order bytes of the source
   address are specified; the high-order bytes of the address are filled
   in by a border router when the packet leaves the site.

   There is a long history of a vague assertion in certain circles that
   IPv4 "got it wrong" by treating its addresses simultaneously as
   locators and identifiers. Despite these claims, however, there was
   never a complete proposal for a scalable network protocol which
   separated the functions. As a result, it wasn't possible to do a
   serious analysis comparing and contrasting a "separated" architecture
   and an "overloaded" architecture. The GSE proposal serves as a
   vehicle for just such an analysis, and that is the purpose of this
   paper.

   We conclude that an architecture that clearly separates locators and
   identifiers in addresses introduces new issues and problems that do
   not have an easy or clear solution. Indeed, the alleged
   disadvantages of overloading addresses turn out to provide some
   significant benefits over the non-overloaded approach.

Contents

   Status of this Memo

   1. Introduction

   2. Definitions and Terminology

   3. Addressing and Routing in IPv4
      3.1. The Need for Aggregation
      3.2. The Pre-CIDR Internet
      3.3. CIDR and Provider-Based Addressing
      3.4. Multi-Homed Sites and Aggregation

   4. The GSE Proposal
      4.1. Motivation For GSE
      4.2. GSE Address Format
         4.2.1. Routing Stuff (RG and STP)
         4.2.2. End-System Designator
      4.3. Address Rewriting by Border Routers
      4.4. Renumbering and Rehoming Mid-Level ISPs
      4.5. Support for Multi-Homed Sites
      4.6. Explicit Non-Goals for GSE

   5. Analysis: The Pros and Cons of Overloading Addresses
      5.1. Purpose of an Identifier
      5.2. Mapping an Identifier to a Locator
         5.2.1. Scalable Mapping of Identifiers to Locators
         5.2.2. Insufficient Hierarchy Space in ESDs
      5.3. Authentication of Identifiers
         5.3.1. Identifier Authentication in IPv4
         5.3.2. Identifier Authentication in GSE
      5.4. Transport Layer: What Locator Should Be Used?
         5.4.1. RG Selection On An Active Open
         5.4.2. RG Selection On A Passive Open
         5.4.3. Mid-Connection RG Changes
         5.4.4. The Impact of Corrupted Routing Goop
      5.5. On The Uniqueness Of ESDs
         5.5.1. Impact of Duplicate ESDs
         5.5.2. New Denial of Service Attacks
      5.6. Summary of Identifier Authentication Issues

   6. Conclusion

   7. Security Considerations

   8. Acknowledgments

   9. References

   10. Authors' Addresses

   Appendix A: Increased Reliance on Domain Name System (DNS)

   Appendix B: Additional Issues Related to GSE

   Appendix C: Ideas Incorporated Into IPv6

   Appendix D: Reverse Mapping of Complete GSE Addresses

1. Introduction

   In October of 1996, Mike O'Dell published an Internet-Draft (dubbed
   "8+8") that proposed significant changes to the IPv6 unicast
   addressing architecture. The 8+8 proposal was the topic of
   considerable discussion at the December 1996 IETF meeting in San
   Jose. Because the proposal offered both potential benefits (e.g.,
   enhanced routing scalability) and risks (e.g., changes to the basic
   IPv6 architecture), the IPng Working Group held an interim meeting on
   February 27-28, 1997 to consider adopting the 8+8 proposal.

   Shortly before the interim meeting, an updated version of the
   Internet-Draft was produced. This version changed the name of the
   proposal from "8+8" to "GSE" to identify the three separate
   components of a unicast address: Global, Site and End-System
   Designator.

   The well-attended meeting generated high-caliber, focused technical
   discussions on the issues involved, with participation by almost all
   of the attendees. By the middle of the second day there was
   unanimous agreement that the GSE proposal as written presented too
   many risks and should not be adopted as the basis for IPv6.
   The proposal did, however, challenge the group to make several
   improvements to the then-existing IPv6 specifications (including
   increasing the aggregatability of addresses, having hard boundaries
   between routing and non-routing parts of the address, and easing the
   DNS aspects of renumbering).

   This document focuses primarily on the issue of separating unicast
   addresses into distinct portions for identification and location
   purposes, a separation that IPv4 does not make but that is
   fundamental to GSE. We start with a discussion of the current
   architecture of IPv4 addressing and its impact on route scalability,
   identification, multi-homing, etc. Next, the details of the GSE
   proposal are described. Finally, the fundamental issue of
   decomposing addresses into multiple separate functional parts is
   analyzed in the context of the GSE proposal. Here we detail some of
   the practical reasons why separating addresses into locators and
   identifiers poses a number of new challenges, making it clear that
   having such a separation is no panacea. An appendix contains a
   summary of the IPng Working Group's deliberations on GSE and their
   results for IPv6 addressing.

   Finally, this document's focus on unicast issues should not be
   interpreted to mean that the impact of separating identifier and
   locating functions on non-unicast aspects of routing and addressing
   is well understood or trivial to deal with. Specifically,
   understanding how multicasting and anycast addressing [ANYCAST,
   RFC1884] fit into such a model requires further work.

2. Definitions and Terminology

   The following terminology is used throughout this document.

   Routing Goop --- A term defined by the GSE document. It refers to
             the first six bytes of a sixteen-byte IPv6 GSE
             address. The Routing Goop portion of an address
             identifies where a site connects to the public
             Internet.
             More generally, the term refers to the
             portion of an address's routing prefix that
             identifies where on the public Internet the site
             housing the address resides.

   Site Topology Partition --- A term defined by the GSE document
             that refers to the two bytes of a sixteen-byte IPv6
             GSE address immediately to the right of the Routing
             Goop. The Site Topology Partition part of an
             address identifies which link within a site an
             address resides on.

   Routing Stuff --- The part of an address that identifies which
             link the address resides on. Within the context of
             GSE, the Routing Stuff comprises the Routing Goop
             and Site Topology Partition parts of an address
             (i.e., the leftmost eight bytes).

   identifier --- a value that indicates the sender of a packet, or
             the intended recipient of a packet. Within the
             context of GSE, the ESD portion (i.e., the rightmost
             eight bytes) of the address is an identifier.

   locator --- a field in a packet header that is used by the routing
             subsystem to deliver a packet to the link on which a
             destination resides. The terms locator and Routing
             Stuff are similar; we use Routing Stuff when
             referring to the specific locator in GSE.

3. Addressing and Routing in IPv4

   Before dealing with details of GSE, we present some background about
   how routing and addressing work in "classical IP" (i.e., IPv4). We
   present this background because the GSE proposal makes a fairly
   major change to the base model. In order to properly evaluate GSE,
   one must understand what problems in IPv4 it claims to improve or
   fix.

   The structure and semantics of a network layer protocol's addresses
   are absolutely core to that protocol. Addressing substantially
   impacts the way packets are routed, the ability of a protocol to
   scale and the kinds of functionality higher layer protocols can count
   on.
   Indeed, addressing is intertwined with both routing and
   transport layer issues; a change in any one of these can impact
   another. Issues of administration and operation (e.g., address
   allocation/re-allocation and required renumbering), while not part of
   the pure exercise of engineering a network layer protocol, turn out
   to be critical to the scalability of that protocol in a global and
   commercial network. The interaction between addressing, routing and
   especially aggregation is particularly relevant to this document, so
   some time will be spent describing it.

   Addresses in IPv4 serve two purposes:

   1) Unique identification of an interface. A sending host tells the
      network the identity of the intended recipient by placing an IP
      address into the destination address field. In addition, the
      receiving host checks the destination address field of received
      packets to ensure that the packet is, in fact, for it.

   2) Location information of that interface. Routers use the
      packet's destination address in deciding where to forward the
      packet to get it closer to its ultimate destination. That is,
      addresses identify "where" the intended recipient is located
      within the Internet topology.

      For scalability, the location information contained in addresses
      must be aggregatable. In practice, this means that nodes
      topologically close to each other (e.g., connected to the same
      link, residing at the same site, or customers of the same ISP)
      must use addresses that share a common prefix.

   What is important to note is that these identification and location
   requirements have been met through the use of the same value, namely
   the IP address. As will be noted repeatedly in this document, the
   "overloading" of IPv4 addresses with multiple semantics has some
   undesirable implications.
   For example, the embedding of IPv4
   addresses within transport protocol addresses that identify the
   end-point of a connection couples those transport protocols with
   routing to a degree. This entanglement is inconsistent with a (too)
   strictly layered model in which routing would be a completely
   independent function of the network layer and not directly impact the
   transport layer.

   Combining locator and identifier functions also complicates the
   support for mobility. In a mobile environment, the location of an
   end-station may change even though its identity stays the same;
   ideally, transport connections should be able to survive such
   changes. In IPv4, however, one cannot change the locator without
   also changing the identifier since the same packet field is used for
   both.

   Consequently, there has been a train of thought for some time that
   having separate values for location and identification could be of
   significant benefit. The GSE proposal, among other things, attempts
   to make such a separation.

   This document frequently uses mobility as an example to demonstrate
   the pros and cons of separating the identifier from the locator.
   However, the reader should note the fundamental equivalence between
   the problems faced by mobile hosts and the problems faced by sites
   that change providers yet don't want to renumber their network. When
   a site changes providers, it moves topologically in much the same way
   a mobile node does when it moves from one place to another.
   Consequently, techniques that help or hinder mobility are often
   relevant to the issue of site renumbering.

3.1. The Need for Aggregation

   IPv4 has seen a number of different addressing schemes. Since the
   original specification, the two major additions have been subnetting
   and classless routing.
   The motivation for adding subnetting was to
   allow a collection of networks located at one site to be viewed from
   afar as a single IP network (i.e., to aggregate all of the individual
   networks into a single bigger network). The practical benefit of
   subnetting was that all of a site's hosts, even if scattered among
   tens or hundreds of LANs, could be represented by a single routing
   table entry in routers located far from the site. In contrast, prior
   to subnetting, a site with ten LANs would advertise ten separate
   network entries, and all routers would have to maintain ten separate
   entries, even though they contained essentially redundant
   information.

   The benefits of aggregation should be clear. The amount of work
   involved in constructing forwarding tables (i.e., selecting best
   routes and installing them into the switching subsystem) is dependent
   in part on the number of network routes (i.e., destinations) to which
   best paths are computed. If each site has 10 internal networks, and
   each of those networks is individually advertised to the global
   routing system, the complexity of computing forwarding tables can
   easily be an order of magnitude greater than if each site advertised
   a single entry that covered all of the addresses used within the
   site.

3.2. The Pre-CIDR Internet

   In the early days of the Internet, its topology and addressing were
   orthogonal. Specifically, when a site wanted to connect to the
   Internet, it approached the central Internet Assigned Numbers
   Authority (IANA) to obtain an address block and then approached a
   provider about procuring connectivity. This procedure for address
   allocation resulted in a system where the addresses used by customers
   of the same provider bore little relation to the addresses used by
   other customers of that same provider.
   In other words, though the
   actual topology of the Internet was mostly hierarchical, the
   addressing was not. An example of such a topology and addressing
   scheme is shown in Figure 1.

               +----------------+
               |                |------- Customer1 (192.2.2.0)
               |                |------- Customer2 (128.128.0.0)
               |   Provider A   |------- Customer3 (18.0.0.0)
               |                |------- Customer4 (193.3.3.0)
               |                |------- Customer5 (194.4.4.0)
               +----------------+
                       |
                       |
                       |
                       |
               +----------------+
               |   Provider B   |
               +----------------+

                            Figure 1

   Figure 1 shows Provider A having 5 customers, each with their own
   independently obtained network address. Providers A and B connect to
   each other. In order for Provider B to be able to send traffic to
   Customers1-5, Provider A must announce a separate route to Provider B
   for each of the 5 networks. That is, the routers within Provider B
   must have explicit routing entries for each of Provider A's customers
   -- 5 separate routes.

   Experience has shown that this approach scales very poorly. In the
   Default-Free Zone (DFZ) of the Public Internet, where routers must
   maintain routing entries for all reachable destinations, the cost of
   computing forwarding tables quickly becomes unacceptably large. A
   large part of the cost is related to the seemingly redundant
   computations that must be made for each individual network, even
   though many of them reside in the same topological location (e.g.,
   under the same provider). Looking at Figure 1, the problem is that
   Provider B performs 5 separate calculations to construct the
   forwarding table needed to reach each of A's customers, even though
   it is going to take the same path for all of them; in other words,
   there is an opportunity to do data abstraction.

   Figure 1 shows network numbers using the older "classful" notation.
   Since 1981, the first few bits of an address had syntactically
   determined which parts of the address formed its "network" and
   "local" portions. There were a small number of Class A
   addresses (intended for very large sites), a medium number of Class B
   addresses (for medium-sized sites) and a very large number of Class C
   addresses (for very small sites). In practice, the actual size of
   real networks didn't match the original allocation of Class A, B, and
   C addresses. Class B addresses were bigger than most sites needed
   (and there weren't enough of them), and Class C addresses were too
   small (i.e., typical sites would need to get 10 or more C blocks to
   cover all of the hosts on their networks). Consequently, classless
   addressing was developed [CIDR], which made the boundaries between
   the network and local parts of an address more flexible. With
   classless addressing, a separate prefix-length (i.e., network mask)
   specifies how many of the left-most bits of an address identify the
   network part of the address.

3.3. CIDR and Provider-Based Addressing

   One of the reasons CIDR (Classless Inter-Domain Routing) and its
   associated provider-assigned address allocation policy were
   introduced was to help reduce the cost of computing a routing table
   and the size of the forwarding table computed from the routing table.
   To achieve this goal CIDR aggressively aggregates network addresses.
   Aggregating network addresses means "merging" multiple addresses into
   a single "bigger" one, that is, using a common prefix to provide
   location information for all addresses sharing that same prefix.

   With CIDR, sites that want to connect to the Internet approach a
   provider to procure both connectivity and a network address.
   Individual providers have a block of address space covered by one
   prefix and assign pieces of that space to customers.
   Consequently,
   customers of the same provider have addresses that share the same
   prefix. The combination of CIDR and provider-based addressing
   results in the ability of a provider to address many hundreds of
   sites while introducing just one network address into the global
   routing system. An example of such a topology and addressing scheme
   is shown in Figure 2.

               +----------------+
               |                |------- Customer1 (204.1.0.0/19)
               |                |------- Customer2 (204.1.32.0/23)
               |   Provider A   |------- Customer3 (204.1.34.0/24)
               |                |------- Customer4 (204.1.35.0/24)
               |                |------- Customer5 (204.1.36.0/23)
               +----------------+
                       |
                       |  A announces
                       |  204.1/16 to B
                       |
               +----------------+
               |   Provider B   |
               +----------------+

                            Figure 2

   In Figure 2, Provider A has been assigned the classless block, or
   "aggregate", 204.1.0.0/16 (i.e., a prefix with the high-order 16 bits
   denoting a single network). Provider A has 5 customers, each of
   which has been assigned a prefix subordinate to the aggregate. In
   order for Provider B to be able to reach Customers1-5, Provider A
   only needs to announce the single prefix 204.1.0.0/16, and Provider
   B's routers need only a single routing table entry to reach all of
   Provider A's customers. Note the important difference between the
   cases described in Figures 1 and 2; the latter example uses fewer
   entries in the routing table to reach the same number of
   destinations.

   CIDR was a critical step for the Internet: in the early 1990s the
   size of default-free routing tables required to support the classful
   Internet was almost more than the commercially-available hardware and
   software of the day could handle. The introduction of BGP4's
   classless routing and provider-based address allocation policies
   resulted in a significant decrease in the growth rate of the routing
   tables. At the same time, however, CIDR introduced some new
   weaknesses.
   First, the Internet addressing model had to shift from
   one of "address owning" to "address lending" [RFC2008]. In pre-CIDR
   days sites acquired addresses from a central authority independent of
   their provider, and a site could assume it "owned" the address block
   it was given. Owning addresses meant that once one had been given a
   set of network addresses, one could always use them; no matter where
   one's site connected to the Internet, the prefix for that network
   could be injected into the public routing system. Today, however, it
   is simply not possible for all individual sites to have their own
   prefixes injected into the DFZ; there would be too many of them.
   Consequently, if a site decides to change providers, it needs to
   renumber all of its nodes using address space given to it by the new
   provider. The "old" addresses it had used are returned to its
   previous provider. To understand this, consider if, from Figure 2,
   Customer3 changes its provider from Provider A to Provider C, but
   does not renumber. The picture would be as follows:

                         +----------------+
                         |                |---- Customer1 (204.1.0.0/19)
                         |                |---- Customer2 (204.1.32.0/23)
                         |   Provider A   |
         +---------------|                |---- Customer4 (204.1.35.0/24)
         | A announces   |                |---- Customer5 (204.1.36.0/23)
         | 204.1/16 to B +----------------+
         |                        |
         |                        |
         |                        |
 +----------------+               |
 |   Provider B   |               |
 +----------------+               |
         |                        |
         |                        |
         |                        |
         | C announces            |
         | 204.1.34/24            |
         | to B          +----------------+
         +---------------|   Provider C   |---- Customer3 (204.1.34.0/24)
                         +----------------+

                            Figure 3

   In Figure 3, Providers A, B and C are all directly connected to each
   other. In order for Provider B to reach Customers 1, 2, 4 and 5,
   Provider A still only announces the 204.1.0.0/16 aggregate. However,
   in order for Provider B to reach Customer3, Provider C must announce
   the prefix 204.1.34.0/24.
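
   The prefix relationships in Figure 3 can be checked with a short
   sketch. The Python fragment below is illustrative only: it uses the
   standard ipaddress module, takes its prefixes from Figures 2 and 3,
   and introduces a hypothetical helper, routes_b_must_carry(), that is
   not part of any routing protocol or of the GSE proposal. It shows
   that every customer prefix lies inside Provider A's aggregate, and
   that Provider B's table grows by one entry for each customer that
   leaves Provider A without renumbering.

```python
from ipaddress import ip_network

# Prefixes copied from Figures 2 and 3.
AGGREGATE = ip_network("204.1.0.0/16")
CUSTOMERS = {
    "Customer1": ip_network("204.1.0.0/19"),
    "Customer2": ip_network("204.1.32.0/23"),
    "Customer3": ip_network("204.1.34.0/24"),
    "Customer4": ip_network("204.1.35.0/24"),
    "Customer5": ip_network("204.1.36.0/23"),
}


def routes_b_must_carry(moved):
    """Hypothetical helper: prefixes Provider B needs in order to reach
    all five customers when the customers named in `moved` have left
    Provider A but kept their old addresses."""
    routes = [AGGREGATE]                         # announced by Provider A
    routes += [CUSTOMERS[name] for name in moved]  # announced elsewhere
    return routes


# Every customer prefix is covered by Provider A's aggregate.
all_covered = all(net.subnet_of(AGGREGATE) for net in CUSTOMERS.values())

print(all_covered)
print(routes_b_must_carry([]))
print(routes_b_must_carry(["Customer3"]))
```

   With no customer moved, the single aggregate suffices; once
   Customer3 is reached through Provider C without having renumbered,
   Provider B needs two entries.
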
   Prefix 204.1.34.0/24 is called a "more-specific" of
   204.1.0.0/16; another way of saying this is that Customer3 and
   Provider C have "punched a hole" in Provider A's address block. From
   Provider B's view, the address space underneath 204.1.0.0/16 is no
   longer cleanly aggregated into a single prefix and instead the
   aggregation has been broken because the addressing is inconsistent
   with the topology; in order to maintain reachability to Customers1-5,
   Provider B must carry two prefixes where it used to have to carry
   only one.

   The example in Figure 3 explains why sites must renumber if existing
   levels of aggregation are to be maintained. While a small number of
   new exceptions could be tolerated, and certain prefixes have been
   grandfathered, the reality in today's Internet is that there are
   thousands of providers, many with thousands of individual customers.
   It is generally accepted that renumbering of sites is essential for
   maintaining sufficient aggregation.

   The empirical cost of renumbering a site in order to maintain
   aggregation has been the subject of much discussion. The practical
   reality, however, is that forcing all sites to renumber is difficult
   given the size and wealth of companies that now depend on the
   Internet for running their business. Thus, although the technical
   community came to consensus that, with the current practice of
   provider-based addressing, address lending was necessary in order for
   the Internet to continue to operate and grow, the reality has been
   that some of CIDR's benefits have been lost because not all sites
   renumber. It is worth noting that a number of providers today do
   route filtering based, in part, on prefix length; as a result, a site
   which does not renumber may have only partial connectivity to the
   Internet.
   That is, a site may advertise a long prefix into the
   routing system, but there is no assurance that all parts of the
   Internet will accept the route; some simply ignore it.

   One unfortunate characteristic of CIDR at an architectural level is
   that the pieces of the infrastructure that benefit from the
   aggregation (i.e., the providers which make up the DFZ) are not the
   pieces that incur the renumbering cost (i.e., the end site). The
   logical corollary of this statement is that the pieces of the
   infrastructure that do incur cost to achieve aggregation (e.g., sites
   which renumber when they change providers) don't directly see the
   benefit. (The word "directly" is used here because the continued
   operation of the Internet is a benefit, though it requires
   selflessness on the part of the site to recognize.) This can lead to
   a "tragedy of the commons" situation, where everyone agrees that some
   sites should renumber, but each wants to be one of those that do
   not.

3.4. Multi-Homed Sites and Aggregation

   As sites become more dependent on the Internet, they have begun to
   install additional connections to the Internet to improve robustness
   and performance. Such sites are called "multi-homed".
   Unfortunately, when a site connects to the Internet at multiple
   places, the impact on routing can be much like a site that switches
   providers but refuses to renumber.

   In the pre-CIDR days, multi-homed sites were typically known by only
   one network prefix, the prefix of their own address block. When that
   site's providers announced the site's network into the global routing
   system, a "shortest path" type of routing would occur so that pieces
   of the Internet closest to the first provider might use the first
   provider while other pieces of the Internet would use the second
   provider.
This allowed sites to use the routing system itself to load balance
traffic across their multiple connections. This type of multi-homing
assumes that a site's prefix can be propagated throughout the DFZ, an
assumption that is no longer universally true.

With CIDR, issues of addressing and aggregation complicate matters
significantly. At the highest level, there are three possible ways
to deal with multi-homed sites. The first possibility is to stay
with the pre-CIDR approach, allowing each multi-homed site to receive
its address block directly from a registry, independent of its
providers. The problem with this approach is that, because the
address block is obtained independently of any provider, it is not
aggregatable and therefore has a negative impact on the scaling of
global routing.

The second approach is for a multi-homed site to receive an
allocation from one of its providers and just use that single prefix.
The site would advertise its prefix to all of the providers to which
it connects. There are two problems with this approach. First,
although the prefix is aggregatable by the provider which made the
allocation, it is not aggregatable by the other providers. To the
other providers, the site's prefix poses the same problem that a
provider-independent address would. Second, due to CIDR's rule for
longest-match routing, it turns out that the site's prefix is not
always aggregatable in practice even by the provider that made the
allocation, if load-spreading through shortest-path routing is
desired. Consider Figure 4. Provider C has two paths for reaching
Customer1. Provider A advertises 204.1/16, an aggregate which
includes Customer1. But Provider C will also receive an
advertisement for prefix 204.1.0/19 from Provider B, and because the
prefix match through B is longer, C will choose that path.
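
The longest-match behavior just described can be illustrated with a
minimal Python sketch. It assumes a toy routing table holding only the
two advertisements from Figure 4; the function name longest_match and
the sample destination addresses are hypothetical and chosen purely
for illustration.

```python
import ipaddress

def longest_match(routes, destination):
    """Return the (prefix, next_hop) entry with the longest matching prefix."""
    addr = ipaddress.ip_address(destination)
    candidates = [(net, hop) for net, hop in routes if addr in net]
    if not candidates:
        return None
    # CIDR rule: the most specific (longest) matching prefix wins.
    return max(candidates, key=lambda entry: entry[0].prefixlen)

# Provider C's view of the two advertisements in Figure 4.
routes_at_c = [
    (ipaddress.ip_network("204.1.0.0/16"), "Provider A"),
    (ipaddress.ip_network("204.1.0.0/19"), "Provider B"),
]

# A hypothetical host inside Customer1's 204.1.0.0/19 block.
print(longest_match(routes_at_c, "204.1.5.9")[1])    # Provider B
# A hypothetical host covered only by Provider A's aggregate.
print(longest_match(routes_at_c, "204.1.200.1")[1])  # Provider A
```

As long as the /19 is present in the table, traffic toward Customer1
never uses the path through Provider A, whatever the relative path
lengths are.
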
In order for Provider C to be able to choose between the two paths,
Provider A would also have to advertise the longer prefix 204.1.0/19
in addition to the shorter 204.1/16. At this point, from the routing
perspective, the situation is very similar to the general problem
posed by the use of provider-independent addresses.

It should be noted that the above example simplifies a very complex
issue. Consider Figure 4 again. Provider A could choose not to
propagate a route entry for the longer 204.1.0/19 prefix, advertising
only the shorter 204.1/16. In such cases, Provider C would always
select Provider B. Internally, Provider A would continue to route
traffic from its other customers to Customer1 directly. If Provider
A had a large enough customer base, effective load sharing might be
achieved.

                                    A advertises
                    +------------+  204.1/16 to C  +------------+
                 ___| Provider A |-----------------| Provider C |
                /   +------------+                 +------------+
               /                         +----------/
              /                         /
   Customer1 ---                       /  B advertises 204.1.0/19 to C
   204.1.0.0/19 |                     /
                |       +------------+
                -----   | Provider B |
                        +------------+

                              Figure 4

The third approach is for a multi-homed site to receive an allocation
from each of its providers and not advertise the prefix obtained from
one provider to any of its other providers. This approach has
advantages from the perspective of route scaling because both
allocations are aggregatable. Unfortunately, the approach doesn't
necessarily meet the demands of the multi-homed site. A site that
has a prefix from each of its providers faces a number of choices
about how to use that address space. Possibilities include:

1) The site can number a distinct set of hosts out of each of the
   prefixes. Consider a configuration where a site is connected to
   ISP-A and ISP-B.
   If the link to ISP-A goes down, then unless the ISP-A prefix is
   announced to ISP-B (which breaks aggregation), the hosts numbered
   out of the ISP-A prefix would be unreachable.

2) The site could assign each host multiple addresses (i.e., one
   address for each ISP connection). There are two problems with
   this. First, it accelerates the consumption of the address
   space. While this may be a problem for the (limited) IPv4
   address space, it is not a significant issue in IPv6. Second,
   when the connection to ISP-A goes down, addresses numbered out
   of ISP-A's space become unreachable. Remote peers would have to
   have sufficient intelligence to use the second address. For
   example, when initiating a connection to a host, the DNS would
   return multiple candidate addresses. Clients would need to try
   them all before concluding that a destination is unreachable
   (something not all network applications currently do). In
   addition, a site's hosts would need a significant amount of
   intelligence for choosing the source addresses they use. A host
   shouldn't choose a source address corresponding to a link that
   is down. At present, hosts do not have such sophistication.

In summary, supporting multi-homing with IPv4/CIDR requires a
delicate balance between the scalability of routing and the site's
requirements for robustness and load-sharing. At this point in time,
no solution has been discovered that satisfies the competing
requirements of route scaling and robustness/performance. It is
worth noting, however, that some people are beginning to study the
issue more closely and propose novel ideas [BATES].

4. The GSE Proposal

This section provides a description of GSE with the intent of making
this document stand-alone with respect to the GSE "specification".
We begin by reviewing the motivation for GSE.
Next we review the salient technical details, and we conclude by
listing the explicit non-goals of the GSE proposal.

4.1. Motivation For GSE

The primary motivation for GSE was the concern that the chief initial
IPv6 global unicast address structure, provider-based [RFC 2073], was
fundamentally the same as IPv4 with CIDR and provider-based
aggregation. Provider-based addressing requires that sites renumber
when they switch providers, so that sites are always aggregated
within their provider's prefix. In practice, the cost of renumbering
(which can only grow as a site grows in size and becomes more
dependent on the Internet for day-to-day business) is high enough
that an increasing number of sites refuse to renumber when they
change providers. This cost is particularly relevant in cases where
end-users are asked to renumber because an upstream provider has
changed its transit provider (i.e., the end site is asked to renumber
for reasons outside of its control and for which it sees no direct
benefit). Consequently, the GSE draft asserts that IPv4 with CIDR
has not achieved the aggressive aggregation required for the route
computation functions of the DFZ of the Internet to scale for IPv4
and that the much larger address space of IPv6 simply exacerbates the
problem.

The GSE proposal does not seek to eliminate the need for renumbering.
Indeed, it asserts that end sites will have to renumber more
frequently in order to continue scaling the Internet. However, GSE
proposes to make the cost of renumbering small enough that a site can
be renumbered at essentially any time with little or no disruption to
its network connectivity, and in particular with no impact on
communications that are strictly within the site.

Finally, GSE attempts to address the problem of sites that have
multiple Internet connections.
In CIDR, the pressure for better multi-homing support can create
exceptions to route aggregation and result in poor scaling. That is,
the public routing infrastructure may have to carry multiple distinct
routes for some demanding multi-homed sites, one for each independent
path. GSE recognizes the "special work done by the global Internet
infrastructure on behalf of multi-homed sites" [GSE], and proposes a
way for multi-homed sites to gain certain benefit without impacting
global scaling. This includes a specific mechanism that providers
can use to support multi-homed sites, presumably at a cost that the
site would consider when deciding whether or not to become
multi-homed.

4.2. GSE Address Format

The key departure of GSE from classical IP addressing (both v4 and
v6) was that rather than over-loading addresses with both locator and
identifier functions, it splits the address into two elements: the
high-order 8 bytes used for routing purposes (called "Routing Stuff"
throughout the rest of this document) and the low-order 8 bytes for
unique identification of an end-point. The structure of GSE
addresses is:

     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |   Routing Goop   | STP| End System Designator |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
         6+ bytes     ~2 bytes       8 bytes

                       Figure 5

4.2.1. Routing Stuff (RG and STP)

The Routing Goop (RG) identifies where within the public Internet
topology a site connects and is used to route datagrams to the site.
RG is structured as follows:

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | xxx |     13 Bits of LSID     |     Upper 16 bits of Goop     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    3               4
    2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Bottom 18 bits of Routing Goop   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                              Figure 6

The RG describes the location of a site's connection by identifying
smaller and smaller regions of topology until finally it identifies
the link which connects the site. Before interpreting the bits in
the RG, it is important to understand that routing with GSE depends
on decomposing the Internet's topology into a specific graph. At the
highest level, the topology is broken into Large Structures (LSs).
An LS is a region that can aggregate significant amounts of topology.
Examples of potential LSs are large providers and exchange points.
Within an LS the topology is further divided into another graph of
structures, with each LS dividing itself however it sees fit. This
division of the topology into smaller and smaller structures can
recurse for a number of levels, where the trade-off is "between the
flat-routing complexity within a region and minimizing total depth of
the substructure" [ESD].

Having described the decomposition process, we now examine the bits
in the RG. After the 3-bit prefix identifying the address as having
a GSE format, the next 13 bits identify the LS. By limiting the
field to 13 bits, a ceiling is defined on the complexity of the
top-most routing level (i.e., what we currently call the DFZ).
In the next 34 bits, a series of subordinate structures is identified
until finally the leaf subordinate structure is identified, at which
point the remaining bits identify the individual link within that
leaf structure.

The remaining 14 bits of the Routing Stuff (i.e., the low-order 14
bits of the high-order 8 bytes) constitute the Site Topology
Partition (STP) and are used for routing structure within a site,
similar to subnetting with IPv4. These bits are not part of the
Routing Goop per se. The distinction between Routing Stuff and
Routing Goop is that RG controls routing in the public Internet,
while Routing Stuff includes the RG plus the STP.

The GSE proposal formalized the ideas of sites and of public versus
private topology. A site is a set of hosts, routers and media under
the same administrative control which have zero or more connections
to the Internet. A site can have an arbitrarily complicated
topology, but all of that complexity is hidden from everyone outside
of the site. A site only carries packets which originated from, or
are destined to, that site; in other words, a site cannot be a
transit network. A site is private topology, while the transit
networks form the public topology.

A datagram is routed through public topology using just the RG, but
within the destination site, routing is based on the Site Topology
Partition (STP).

4.2.2. End-System Designator

The End-System Designator (ESD) is an unstructured 8-byte field that
uniquely distinguishes an interface from all others. The most
important feature of the ESD is that it alone identifies an
interface; the Routing Stuff portion of an address, although used to
help deliver a packet to its destination, is not used to identify an
end point.
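
As an illustration of the layout described above, the following Python
sketch splits a 128-bit GSE address into its fields and reassembles
it. The field widths (3, 13, 34, 14 and 64 bits) are taken from the
text; the names GseFields, split_gse and join_gse, and all sample
values (including the value of the 3-bit format prefix, which the text
shows only as "xxx"), are hypothetical.

```python
from typing import NamedTuple

class GseFields(NamedTuple):
    prefix: int  # 3-bit format prefix ("xxx" in Figure 6)
    lsid: int    # 13-bit Large Structure identifier
    goop: int    # remaining 34 bits of Routing Goop
    stp: int     # 14-bit Site Topology Partition
    esd: int     # 64-bit End-System Designator

def split_gse(address: int) -> GseFields:
    """Split a 128-bit integer address into GSE fields."""
    if not 0 <= address < (1 << 128):
        raise ValueError("address must fit in 128 bits")
    esd = address & ((1 << 64) - 1)        # low-order 8 bytes
    routing_stuff = address >> 64          # high-order 8 bytes
    stp = routing_stuff & ((1 << 14) - 1)  # low 14 bits of Routing Stuff
    rg = routing_stuff >> 14               # 50-bit Routing Goop
    goop = rg & ((1 << 34) - 1)
    lsid = (rg >> 34) & ((1 << 13) - 1)
    prefix = rg >> 47
    return GseFields(prefix, lsid, goop, stp, esd)

def join_gse(fields: GseFields) -> int:
    """Inverse of split_gse."""
    rg = (fields.prefix << 47) | (fields.lsid << 34) | fields.goop
    routing_stuff = (rg << 14) | fields.stp
    return (routing_stuff << 64) | fields.esd

# Hypothetical sample values, chosen only to exercise every field.
sample = GseFields(prefix=0b001, lsid=0x1ABC, goop=0x2_0000_0001,
                   stp=0x3FFF, esd=0x0123_4567_89AB_CDEF)
assert split_gse(join_gse(sample)) == sample
```

Only the lsid field bounds the size of the top-most routing level:
13 bits allow at most 8192 Large Structures.
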
End-points of communication care about the ESD; as examples, TCP
peers could be identified by the source and destination ESDs alone
(together with port numbers), checksums would exclude the RG (the
sender doesn't even know its RG, as described later) and on receipt
of a packet only the ESD would be used in testing whether the packet
is intended for local delivery.

The leading contender for the role of a 64-bit globally unique ESD is
the recently defined "EUI-64" identifier [EUI64]. These identifiers
consist of a 24-bit "company_id" concatenated with a 40-bit
"extension". (Company_id is a new name for the "Organizationally
Unique Identifier" that forms the first half of an 802 MAC address.)
Manufacturers are expected to assign locally unique values to the
extension field, guaranteeing global uniqueness for the complete
64-bit identifier. A range of the EUI-64 space is reserved to cover
pre-existing 48-bit MAC addresses, and a defined mapping ensures that
an ESD derived from a MAC address will not duplicate the ESD of a
device that has a built-in EUI-64.

In some cases, interfaces may not have an appropriate MAC address or
EUI-64 identifier. A globally unique ESD must then be obtained
through some alternate mechanism. Several possible mechanisms can be
imagined (e.g., the IANA could hand out addresses from the company_id
it has been allocated). Although we do not explore them in detail
here, we note that a global coordination structure is required to
control the allocation of globally unique identifiers.

4.3. Address Rewriting by Border Routers

To obviate the need to renumber devices within sites because of
changing providers, the GSE design hides the global Routing Goop (RG)
from hosts in each site by having site border routers rewrite
addresses of the packets they forward across the boundary between the
site and public topology.
Within a site, nodes need not know the RG associated with their
addresses. They simply use a designated "Site-Local RG" value for
internal addresses. When a packet is forwarded to the public
topology, the border router replaces the Site-Local RG portion of the
packet's source address with an appropriate value. Likewise, when a
packet from the public topology is forwarded into a site, the border
router replaces the RG part of the destination address with the
designated Site-Local RG.

To simplify discussion, the following text uses the singular term RG
as if a site could have only one RG value (i.e., one connection to
the Internet). In fact, a site could have multiple Internet
connections and consequently multiple RGs.

GSE's approach is not so much to ease renumbering as to make it
transparent to end users. The RG by which a site is known is hidden
from nodes within that site. Instead, the RG for the site would be
known only by the exit router, either through static configuration or
through a dynamic protocol with an upstream provider.

Because end hosts don't know their RG, they don't know their entire
16-byte address, so they can't specify the full address in the source
fields of packets they originate. Consequently, when a datagram
leaves a site, the egress border router fills in the high-order
portion of the source address with the appropriate RG.

The point of keeping the RG hidden from nodes within the core of a
site is to ensure that the RG can be changed without impacting the
site itself. It is expected that the RG would need to change
relatively frequently (e.g., several times a year) in order to
support sufficient aggregation as the topology of the Internet
changes.
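
The border-router rewriting described above can be sketched in Python
as follows. This is a minimal illustration, not part of the GSE
proposal: it treats addresses as 128-bit integers whose top 50 bits
are the RG, and the SITE_LOCAL_RG constant is a hypothetical
placeholder, since the text does not give the designated value. The
Packet class and function names are likewise assumptions.

```python
from dataclasses import dataclass, replace

ADDR_BITS = 128
RG_BITS = 50                     # 3-bit prefix + 13-bit LSID + 34 bits
RG_SHIFT = ADDR_BITS - RG_BITS   # the low 78 bits hold STP + ESD
LOW_MASK = (1 << RG_SHIFT) - 1

# Hypothetical placeholder for the designated Site-Local RG value.
SITE_LOCAL_RG = (1 << RG_BITS) - 1

@dataclass(frozen=True)
class Packet:
    src: int
    dst: int

def with_rg(address: int, rg: int) -> int:
    """Return address with its RG bits replaced; STP and ESD are kept."""
    return (rg << RG_SHIFT) | (address & LOW_MASK)

def rewrite_outbound(packet: Packet, site_rg: int) -> Packet:
    """Egress border router: fill the site's RG into the source address."""
    return replace(packet, src=with_rg(packet.src, site_rg))

def rewrite_inbound(packet: Packet) -> Packet:
    """Ingress border router: hide the RG of the destination address."""
    return replace(packet, dst=with_rg(packet.dst, SITE_LOCAL_RG))

# Hypothetical addresses: low 78 bits are STP + ESD, top 50 bits the RG.
host = with_rg(0x0042_0123_4567_89AB_CDEF, SITE_LOCAL_RG)
remote = with_rg(0x0007_0FED_CBA9_8765_4321, 0x12345)

outbound = rewrite_outbound(Packet(src=host, dst=remote), site_rg=0x2ABCD)
reply = rewrite_inbound(Packet(src=remote, dst=outbound.src))
assert reply.dst == host   # the host sees only its Site-Local address
```

In this sketch, changing the site's RG amounts to passing a different
site_rg value at the egress router; nothing stored inside the site
changes.
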
A change to a site's RG would only require a change at the site's
egress point, and it is quite possible that this change could be
accomplished through a dynamic protocol with the upstream provider.

Hiding a site's RG from its internal nodes does not, however, mean
that changes to RG have no impact on end sites. Since the full
16-byte address of a node isn't a stable value (the RG portion can
change), a stored address may contain an invalid RG and be unusable
if it isn't "refreshed" through some other means. For example,
opening a TCP connection, writing the address of the peer to a file
and then later trying to reestablish a connection to that peer may
well fail. For intra-site communication, however, it is expected
that only the Site-Local RG would be used (and stored), which would
continue to work regardless of changes to the site's external RG.
This shields a site's intra-site traffic from any instabilities
resulting from renumbering.

In addition to rewriting the source addresses of packets that leave a
site, border routers must rewrite the destination addresses of
packets entering a site. To understand the motivation behind this,
consider a site with connections to three Internet providers.
Because each of those connections has its own RG, each destination
within the site would be known by three different 16-byte addresses.
As a result, intra-site routers would have to carry a routing table
three times larger than expected. To work around this, GSE proposed
replacing the RG in inbound packets with the special "Site-Local RG"
value to reduce intra-site routing tables to the minimum necessary.

In summary, when a node initiates a flow to a node at another site,
the initiating node is expected to know the full 16-byte address for
the destination through mechanisms such as a DNS query.
The initiating node does not, however, know its own RG, and uses the
Site-Local RG value in the RG part of the source address. When the
datagram reaches the exit border router, the router replaces the
Site-Local RG in the packet's source address with the site's actual
RG. When the datagram arrives at the entry router at the destination
site, the router replaces the RG portion of the destination address
with the distinguished "Site-Local RG" value. When the destination
host needs to send return traffic, that host knows the full 16-byte
address for the other host because it appeared in the source address
field of the arriving packet.

4.4. Renumbering and Rehoming Mid-Level ISPs

One of the most difficult-to-solve components of the renumbering
problem with CIDR is that of renumbering mid-level service providers.
Specifically, if SmallISP1 changes its transit provider from BigISP1
to BigISP2, then in order for the overall size of the routing tables
to stay the same, all of SmallISP1's customers would have to renumber
into address space covered by an aggregate of BigISP2. GSE deals
with this problem by handling the RG in DNS with indirection.
Specifically, a site's DNS server specifies the RG portion of its
addresses by referencing the "name" of its immediate provider, which
is a resolvable DNS name (this implies a new Resource Record type).
That provider may define some of the low-order bits of the RG and
then reference its immediate provider. This chain of reference
allows mid-level service providers to change transit providers, and
the customers of that mid-level will simply "inherit" the change in
RG. Note that this mechanism does not depend on the GSE address
format per se and can also be applied to IPv4 addressing.

4.5. Support for Multi-Homed Sites

GSE defines a specific mechanism for providers to use to support
multi-homed customers that gives those customers more reliability
than singly-homed sites, but without a negative impact on the scaling
of global routing. This mechanism is not specific to GSE and could
be applied to any multi-homing scenario where a site is known by
multiple prefixes (including provider-based addressing). Assume the
following topology:

          Provider1      Provider2
          +------+       +------+
          |      |       |      |
          | PBR1 |       | PBR2 |
          +----x-+       +-x----+
               |           |
           RG1 |           | RG2
               |           |
            +--x-----------x--+
            | SBR1       SBR2 |
            |                 |
            +-----------------+
                   Site

                 Figure 7

PBR1 is Provider1's border router while PBR2 is Provider2's border
router. SBR1 is the site's border router that connects to Provider1
while SBR2 is the site's border router that connects to Provider2.
Imagine, for example, that the line between Provider1 and the site
goes down. Any already existing flows that use a destination address
including RG1 would stop working. In addition, any addresses
returned from DNS queries that include RG1 would not be viable
addresses. If PBR1 and PBR2 knew about each other, however, then in
this case PBR1 could tunnel packets destined for RG1-prefixed
addresses to PBR2, thus keeping the communication working. (Note
that IP-in-IP encapsulation is necessary since routers between PBR1
and PBR2 would otherwise forward packets destined for addresses with
PBR1's prefix back towards PBR1.)

4.6. Explicit Non-Goals for GSE

It is worth noting explicitly that GSE did not attempt to address the
following issues:

1) Survival of TCP connections through renumbering events. If a
   site is renumbered, TCP connections using a previous address
   will continue to work only as long as the previous address still
   works (i.e., while it is still "valid" using RFC 1971
   terminology).
   No attempt is made to have existing connections switch to the new
   address.

2) It is not known how multicast can be made to work under GSE.

3) It is not known how mobility can be made to work under GSE.

4) The performance impact of having routers rewrite portions of the
   source and destination address in packet headers requires
   further study.

That GSE did not address the above issues does not mean that they
cannot be solved. Rather, the issues simply weren't studied in
sufficient depth.

5. Analysis: The Pros and Cons of Overloading Addresses

At this point we have given complete descriptions of two addressing
architectures: IPv4, which uses the overloading technique, and GSE,
which uses the separated technique. We now compare and contrast the
two techniques.

The following discussion is organized around three fundamental
points:

1) Identifiers indicate who the intended recipient of a packet is.
   At the network layer, an identifier refers to an interface; at
   the transport layer it refers to a process or other endpoint of
   a "connection".

2) Identifiers must be mapped into a locator that the network layer
   can use to actually deliver a packet to its intended
   destination.

3) There must be a suitable way to adequately authenticate the user
   of an identifier, so that communicating peers have sufficient
   confidence that packets sent to or received from a particular
   identifier correspond to the intended recipient.

5.1. Purpose of an Identifier

An identifier gives an entity the ability to refer to a communication
end point and to refer to the same endpoint over an extended period
of time. In terms of semantics, two or more packets sent to the same
identifier should be delivered to the same end point. Likewise, one
expects multiple packets received from the same identifier to have
been originated by the same sending entity.
That is, a source identifier indicates who the packet is from and a
destination identifier indicates who the packet is intended for.

In IPv4, when applications communicate, transport "identifiers"
consist of addresses and port numbers. For the purposes of this
discussion, we use the term "identifier" to mean the identifier of an
interface. It is assumed that port numbers will be present when
higher layer entities communicate; the exact port numbers used are
not relevant to this discussion.

In small networks, flat routing can be used to deliver packets to
their destination based only on the destination identifier carried in
a packet header (i.e., the identifier is the locator and is not
required to have any structure). However, in such systems, a
distinct route entry is required for every destination, an approach
that does not scale. In larger networks, packet addresses include a
locator that helps the network layer deliver a packet to its
destination. Such a locator typically has a structure to keep
routing tables small relative to the total number of reachable
destinations. In IPv4, the identifier and locator are combined in a
single address; it is not possible to separate the locator portion of
an address from the identifier portion. In contrast, the ESD portion
of a GSE address (which can easily be extracted from the address)
serves as an identifier, while the Routing Stuff plays the role of a
locator.

Having a clear separation between the locator and the identifier
portion of an address appears to provide protocols some additional
flexibility. Once a packet has been delivered to its intended
destination interface (i.e., node), for example, the locator has
served its purpose and is no longer needed to further demultiplex a
packet to its higher-layer end point.
This means that if a packet is delivered to the correct destination
node (that is, the identifier carried in the packet address matches
one of the node's interface identifiers), the node will accept the
packet, regardless of how the packet got there. In most
circumstances, the exact locator used does not matter, so long as it
gets the packet delivered to its proper destination.

The most obvious example that could benefit from the separation of
locators and identifiers involves communication with a mobile host.
Transport protocols such as TCP are unable to keep connections open
if either of the two endpoint identifiers for an open connection
changes. Fundamentally, the endpoint identifiers indicate the two
endpoint entities that are communicating. If a node were to receive
a packet from a node with which it had been communicating previously,
but the identifier used by the sending node has changed, the
recipient would be unable to distinguish this case from that of a
packet received from a completely different node.

In the specific case of TCP and IPv4, connections are identified
uniquely by the tuple: (srcIPaddr, dstIPaddr, srcport, dstport).
Because IPv4 addresses contain a combined locator/identifier, it is
not possible to have a node's location change without also having its
identifier change. Consequently, when a mobile node moves, its
existing connections no longer work, in the absence of special
protocols such as Mobile IP [MOBILITY].

In contrast, connections in GSE are identified by the ESDs rather
than full IPv6 addresses. That is, connections are identified
uniquely by the tuple: (srcESD, dstESD, srcport, dstport).
Consequently, when demultiplexing incoming packets to their proper
end point, TCP would ignore the Routing Stuff portions of addresses.
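
The difference between the two connection tuples can be made concrete
with a small Python sketch. It assumes addresses are 128-bit integers
whose low-order 64 bits are the ESD, as described in Section 4.2; the
function names and all sample values are hypothetical.

```python
ESD_MASK = (1 << 64) - 1

def esd_of(address: int) -> int:
    """Extract the ESD (low-order 8 bytes) from a 16-byte address."""
    return address & ESD_MASK

def gse_key(src_addr, dst_addr, src_port, dst_port):
    """Connection key as GSE would form it: (srcESD, dstESD, ports)."""
    return (esd_of(src_addr), esd_of(dst_addr), src_port, dst_port)

def full_address_key(src_addr, dst_addr, src_port, dst_port):
    """Connection key formed from full addresses, as in TCP over IPv4."""
    return (src_addr, dst_addr, src_port, dst_port)

# Hypothetical ESDs and Routing Stuff values.
CLIENT_ESD = 0x0200_00FF_FE00_0001
SERVER_ESD = 0x0200_00FF_FE00_0002
client_before = (0x1111 << 64) | CLIENT_ESD  # client at its first location
client_after = (0x2222 << 64) | CLIENT_ESD   # same client after moving
server = (0x3333 << 64) | SERVER_ESD

# With ESD-only keys the connection is found again after the move.
assert gse_key(client_before, server, 1025, 80) == \
    gse_key(client_after, server, 1025, 80)
# With full-address keys the packet looks like a new connection.
assert full_address_key(client_before, server, 1025, 80) != \
    full_address_key(client_after, server, 1025, 80)
```
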
Because the Routing Stuff portion of an address is ignored during
demultiplexing operations, a mobile node is free to move -- and
change its Routing Stuff -- without changing its identification.

As a side note, it is a requirement in GSE that packets be
demultiplexed to higher layer endpoints on ESDs alone, independent of
the Routing Stuff. If a site is multi-homed, the packets it sends
may exit the site at different egress border routers during the
lifetime of a connection. Because each border router will place its
own RG into the source addresses of outgoing packets, the receiving
TCP must ignore (at least) the RG portion of addresses when
demultiplexing received packets. The alternative would make TCP
unable to cope with common routing changes, i.e., if the path
changed, packets delivered correctly would be discarded by the
receiving TCP rather than accepted.

Not surprisingly, having separate locators and identifiers in
addresses leads to additional problems as well. First, an identifier
by itself provides only limited value. In order to actually deliver
packets to a destination identifier, a corresponding locator must be
known. The general problem of mapping identifiers into locators is
non-trivial to solve, and is the topic of the next Section. Second,
because the Routing Stuff is ignored when packets are demultiplexed
upward in the protocol stack, it becomes much easier for an intruder
to masquerade as someone else.

5.2. Mapping an Identifier to a Locator

The idea of using addresses that cleanly separate location and
identification information is not new. However, there are several
different flavors. In its pure form, a sender need only know the
identifier of an end-point in order to send packets to it.
When presented with a datagram to send, network software would be
responsible for determining the locator associated with an identifier
so that the packet can be delivered. A key question is: "who is
responsible for finding the Routing Stuff associated with a given
identifier"? There are a number of possibilities, each with a
different set of implications:

1) The network layer could be responsible for doing the mapping.
   The advantage of such a system is that an ESD could be stored
   essentially forever (e.g., in configuration files), but whenever
   it is actually used, network layer software would automatically
   perform the mapping to determine the appropriate Routing Stuff
   for the destination. Likewise, should an existing mapping
   become invalid, network layer software could dynamically
   determine the updated value. Unfortunately, building such a
   mapping mechanism that scales is difficult if not impossible
   with a flat identifier space (e.g., the ESD identifier).

2) The transport layer could be responsible for doing the mapping.
   It could perform the mapping when a connection is first opened,
   periodically refreshing the binding for long-running
   connections. Implementing such a scheme would change the
   existing transport layer protocols TCP and UDP significantly.
   However, in the case of TCP, such a scheme would have the
   benefit that applications would probably not need to be
   modified. For UDP-based applications, this may not hold, since
   most UDP-based protocols are implemented within applications.

3) Higher-layer software (e.g., the application itself) could be
   responsible for performing the mapping. This potentially
   increases the burden on application programmers significantly,
   especially if long-running connections are required to survive
   renumbering and/or deal with mobile nodes.

The GSE proposal uses the last approach.
The network and transport 1152 layers are always presented with both the Routing Stuff (RG + STP) 1153 and the ESD together in one IPv6 address. It is not the job of either 1154 layer to determine the Routing Stuff given only the ESD or to 1155 validate that the Routing Stuff is correct. When an application has 1156 data to send, it queries the DNS to obtain the IPv6 AAAA record for a 1157 destination. The returned AAAA record contains both the Routing 1158 Stuff and the ESD of the specified destination. While such an 1159 approach eliminates the need for the lower layers to be able to map 1160 ESDs into corresponding Routing Stuff, it also means that when 1161 presented with an address containing an incorrect (i.e., no longer 1162 valid) Routing Stuff, the network is unable to deliver the packet to 1163 its correct destination. Note that addresses containing invalid 1164 Routing Stuff will result whenever cached addresses are used 1165 after the Routing Stuff of the address becomes invalid. This may 1166 happen if addresses are stored in configuration files, a mobile node 1167 moves to a new location, long-running applications (clients and 1168 servers) cache the result of DNS queries, a long-running connection 1169 attempts to continue operating during a site renumbering event, etc. 1170 Whatever the causes, the failures are fundamentally due to dynamic 1171 topological changes at the network layer, yet in GSE such failures 1172 are left to be dealt with at the application level (through DNS), 1173 because neither the transport nor the network level has the ability 1174 to re-map identifiers to corresponding locators. 1176 To avoid the above problem, a network architecture must provide the 1177 ability to map an identifier to a locator. In IPv4, this mapping is 1178 trivial, since the identifier and locator are combined in a single 1179 quantity (i.e., the IPv4 address). GSE does not provide this mapping 1180 functionality directly.
Instead, GSE assumes that a node's DNS name 1181 serves as its stable identifier, and uses normal DNS queries to map 1182 the DNS "identifier" into an IPv6 address. The IPv6 address contains 1183 the ESD identifier together with its Routing Stuff, that is, an 1184 initial binding/mapping between the identifier and locator. When 1185 this binding breaks (for example due to dynamic topological changes), 1186 the ESD identifier cannot be mapped into a new locator by itself. 1187 Instead one must fall back to the application level, hoping that another DNS 1188 query will repair the broken binding between identifier 1189 and locator that is needed for network delivery. 1191 The use of DNS to provide identifier to locator mapping contributes 1192 to GSE's apparent simplicity. However, there are two fundamental 1193 problems with this approach, if the intention is to make it 1194 transparently easy to change locators over time. First, the burden 1195 of performing the mapping from identifier to locator is placed 1196 directly on the application, because lower layers (i.e., transport 1197 and network layers) cannot perform the mapping themselves due to 1198 layering violation concerns (i.e., TCP and UDP can't perform a DNS 1199 query). Second, following all RG changes the DNS database must be 1200 promptly updated and all expired information must be flushed out of 1201 all DNS caches. This stringent timing requirement imposed by lower 1202 level operation would represent a departure from the original DNS 1203 design, which provides DNS names to address mappings that only change 1204 slowly over time if at all, and which relies heavily on caching over 1205 relatively long time periods to scale well. 1207 The following subsections discuss a number of issues related to 1208 keeping track of or determining the locator associated with an 1209 identifier. 1211 5.2.1.
Scalable Mapping of Identifiers to Locators 1213 It is not difficult to construct a mapping from an identifier (such 1214 as an ESD) to a locator (as well as other information such as a name, 1215 cryptographic keys, etc.) provided one can structure the identifier 1216 space appropriately to support scalable lookups. In particular, 1217 identifiers must have sufficient structure to support the delegating 1218 mechanism of a distributed database such as DNS. On the other hand, 1219 no scalable mechanism is known for performing such a mapping on 1220 arbitrary identifiers taken from a flat space lacking any structure. 1222 Imposing a hierarchy on identifiers poses the following difficulties: 1224 - - It increases the size of the identifier. The exact size 1225 necessary to support sufficient hierarchy is unclear, though it 1226 is likely to be roughly the same as that used for the routing 1227 hierarchy. Analysis done during the original IPng debates 1228 [RFC1752] suggests that close to 48-bits of hierarchy are needed 1229 to identify all the possible sites 30-40 years from now. 1231 - - The assignment of identifiers must be tied to the delegation 1232 structure. That is, the site that "owns" an identifier is the 1233 one responsible for maintaining the identifier-to-locator 1234 mapping information about it. 1236 - - Due to the requirement of tying an identifier to the 1237 delegation structure the identifier of a node cannot be burned 1238 in during manufacturing. Instead a mechanism is needed to allow 1239 a node to learn its identifier. To be practical, such a 1240 mechanism would need to be automated and avoid the need for 1241 manual configuration. 1243 5.2.2. Insufficient Hierarchy Space in ESDs 1245 In the case of GSE's 8-byte ESD, the size of the identifier is not 1246 large enough to contain sufficient hierarchy to both create DNS-like 1247 delegation points and support stateless address autoconfiguration. 
1248 Stateless address autoconfiguration [RFC1971] already assumes that an 1249 interface's 6-byte link-layer (i.e., MAC) address can be appended to 1250 a link's routing prefix to produce a globally unique IPv6 address. 1251 With GSE, only two bytes would be available for hierarchy and 1252 delegation. 1254 It is also the case that the sorts of built-in identifiers now found 1255 in computing hardware, such as "EUI-48" and "EUI-64" addresses 1256 [IEEE802, IEEE1212], do not have the structure required for this 1257 delegation. Such identifiers have only two-levels of hierarchy; the 1258 top-level typically identifies a manufacturer, with the remaining 1259 part of the address being the equivalent of the serial number unique 1260 to the manufacturer. The delegation of the two-level hierarchy 1261 (i.e., equipment manufacturer) does not correspond to the 1262 administrator under which the end-user operates. Hence, stateless 1263 autoconfiguration [RFC1971] cannot create addresses with the 1264 necessary hierarchical property in the ESD portion of an address. 1266 Finally, imposing a required hierarchical structure on identifiers 1267 such as an ESD would also introduce a new administrative burden and a 1268 new or expanded registry system to manage ESD space (i.e., to insure 1269 that ESDs are globally unique). While the procedures for assigning 1270 ESDs, which need only organizational and not topological 1271 significance, would be simpler than the procedures for managing IPv4 1272 addresses, it seems a laudable goal to avoid the problem altogether 1273 if possible. In addition, it would likely increase the complexity 1274 for connecting new nodes to the Internet, a goal inconsistent with 1275 Stateless Address autoconfiguration [RFC1971]. 1277 The topic of mapping full 16-byte GSE addresses to a locator or other 1278 information is discussed in Appendix D. 1280 5.3. 
Authentication of Identifiers 1282 The true value of a globally unique identifier lies not on its 1283 uniqueness but on an ability to use the same identifier repeatedly 1284 and have it refer to the same end point. That is, there is an 1285 expectation that repeated and subsequent use of the same identifier 1286 results in continued communication with the same end point. To be 1287 useful then, a valid identifier must either be easily distinguishable 1288 from a fraudulent one, or the system must have a way to prevent 1289 identifiers from being used in an unauthorized manner. 1291 The remainder of this section discusses how identifier authentication 1292 is done in both IPv4 and GSE, and shows how overloading an address 1293 with both an identifier and a locator provides a significant 1294 automatic identifier authentication. In contrast, there is 1295 essentially no identifier authentication in GSE. It should be noted 1296 that the actual strength of authentication that would be considered 1297 sufficient is a topic in its own right, and we do not cover it here. 1298 Instead, we focus on the relative strengths in the two schemes. 1300 5.3.1. Identifier Authentication in IPv4 1302 As described earlier, an IPv4 address simultaneously plays two roles: 1303 a unique identifier and a locator. Using an overloaded address as an 1304 identifier has the side-effect of insuring that (for all practical 1305 purposes) the identifier is globally unique. Furthermore, because 1306 the same number is used both to identify an interface and to deliver 1307 data to that interface, it is impossible for some interface A to use 1308 the identification of another interface B in an attempt to receive 1309 data destined to B without being detected, unless the routing system 1310 is compromised. 1312 When both interfaces A and B claim the same unicast address, the 1313 routing subsystem generally delivers packets to only one of them. 
1314 The other node will quickly realize that something is wrong (since 1315 communication using the duplicate address fails) and take corrective 1316 actions, either correcting a misconfiguration or otherwise detecting 1317 and thwarting the intruder. To understand how the routing subsystem 1318 prevents the same address from being used in multiple locations, 1319 there are two cases to consider, depending on whether the two 1320 interfaces using duplicate addresses are attached to the same or to 1321 different links. 1323 When two interfaces on the same link use the same address, a node 1324 (host or router) sending traffic to the duplicate address will in 1325 practice send all packets to one of the nodes. On Ethernets, for 1326 example, the sender will use ARP (or Neighbor Discovery in IPv6) to 1327 determine the link-layer address corresponding to the destination 1328 address. When multiple ARP replies for the target IP address are 1329 received, the most recently received response replaces whatever is 1330 already in the cache. Consequently, the destinations a node using a 1331 duplicate IP address can communicate with depends on what its 1332 neighboring nodes have in their ARP caches. In most cases, such 1333 communication failures become apparent relatively quickly, since it 1334 is unlikely that communication can proceed correctly on both nodes. 1336 It is also the case that a number of ARP implementations (e.g., BSD- 1337 derived implementations) log warning messages when an ARP request is 1338 received from a node using the same address as the machine receiving 1339 the ARP request. 1341 When two interfaces on different links use the same address, the 1342 routing subsystem generally delivers packets to only one of the nodes 1343 because only one of the links has the right subnet corresponding to 1344 the IP address. 
Consequently, the node using the address on the 1345 "wrong" link will generally never receive any packets sent to it and 1346 will be unable to communicate with anyone. For obvious reasons, this 1347 condition is usually detected quickly. 1349 It should be noted that although an address containing a combined 1350 identifier and locator can be forged, the routing subsystem 1351 significantly limits communication using the forged address. First, 1352 return traffic will be sent to the correct destination and not the 1353 originator of the forged address. This alone prevents certain types 1354 of spoofing attacks. For example, if a destination receives an 1355 unexpected packet corresponding to a TCP connection that it is 1356 unaware of, it may return a TCP segment resetting the connection. 1357 Second, routers performing ingress filtering can refuse to forward 1358 traffic claiming to originate from a source whose claimed address 1359 does not match the expected addresses (from a topology perspective) 1360 for sources located within a particular region [RFC 2267]. To 1361 effectively masquerade as someone else requires subverting the 1362 intermediate routing subsystem. 1364 5.3.2. Identifier Authentication in GSE 1366 In GSE, it is not possible for the routing subsystem to provide any 1367 enforcement on the authenticity of identifiers with respect to their 1368 corresponding Routing Stuff, since the Routing Stuff and ESD portions 1369 of an address are by definition completely orthogonal quantities. 1370 This fundamental problem is compounded by the fact that GSE provides 1371 no way (at the transport or network layer) to map an ESD into its 1372 corresponding Routing Stuff. Thus, when looking at the source 1373 address of a received packet, there is no way to ascertain whether 1374 the Routing Stuff portion of the address corresponds to legitimate 1375 Routing Stuff with respect to the corresponding ESD.
Consequently, 1376 it becomes trivial in many cases for one node to masquerade as 1377 another. 1379 5.4. Transport Layer: What Locator Should Be Used? 1381 In the following, we focus on what Routing Stuff to use with TCP; UDP 1382 also depends on the Routing Stuff in a similar way. Indeed, we believe 1383 that TCP is the "easier" case to deal with, for two reasons. First, 1384 TCP is a stateful protocol in which both ends of the connection can 1385 negotiate with each other. UDP-based communications are stateless, 1386 and remember nothing from one packet to the next. Consequently, 1387 changing UDP to remember locator information in addition to the 1388 identifier of the peer may require the introduction of "session" 1389 features, perhaps as part of a common "library". Second, changes to 1390 UDP in practice mean changing individual applications themselves, 1391 raising deployability questions. 1393 There are three cases of interest from TCP's perspective: 1395 - - the sending side of an active open 1397 - - the sending side of a passive open (i.e., how to respond to an 1398 active open) 1400 - - changes to the Routing Stuff during an open connection. 1402 5.4.1. RG Selection On An Active Open 1404 If the host is performing a TCP "active open", the application first 1405 queries the DNS to obtain the destination address, which contains the 1406 appropriate RG for the remote peer. That is, the initiator of 1407 communication is assumed to provide the correct Routing Stuff when 1408 initiating communication to a specific destination. 1410 5.4.2. RG Selection On A Passive Open 1412 When a server passively accepts connections from arbitrary clients, 1413 it has no choice but to assume that the Routing Stuff in the source 1414 address of a received packet that initiated the communication is 1415 correct, because it has no way to authenticate its validity.
Note 1416 that the Routing Stuff is "correct" only in the sense that it 1417 corresponds to the site originating the connection, to which the server 1418 will send the reply. Whether the Routing Stuff paired with the 1419 received ESD actually matches the Routing Stuff located at the site 1420 where the legitimate owner of the ESD currently resides is not known 1421 and cannot be determined. Because the ESD alone cannot be mapped 1422 into a locator (or some other quantity that can provide input to an 1423 authentication procedure), there is no way to determine whether the 1424 received Routing Stuff corresponds to that legitimately associated 1425 with the source identifier of the received packet. The issue of 1426 spoofing is discussed in more detail later. 1428 5.4.3. Mid-Connection RG Changes 1430 While packets are flowing as part of an open connection, the RG 1431 appearing on subsequent packets is susceptible to change through 1432 renumbering events, or as a result of site-internal routing changes 1433 that cause the egress point for off-site traffic to change. It is 1434 even possible that traffic-balancing schemes could result in the use 1435 of two egress routers, with roughly every other packet exiting 1436 through a different egress router. 1438 Because TCP under GSE demultiplexes packets using only ESDs, newly 1439 arrived packets will be delivered to the correct end-point regardless 1440 of whether their source RG has changed. The GSE proposal calls for 1441 return traffic to continue to be sent via the "old" RG, even though 1442 it may have been deprecated or become less optimal because the peer's 1443 border router has changed. That is, the RG to use for reaching a 1444 peer is bound to a connection when the connection is established and 1445 does not change thereafter.
However, the completion of renumbering 1446 events (so that an earlier RG is now invalid) and certain topology 1447 changes would require TCP to switch sending to a new RG mid- 1448 connection. To explore the scenario, we consider ways of allowing 1449 the RG change to be made to existing established connections. 1451 If TCP connection identifiers are based on ESDs rather than full 1452 addresses, traffic from the same ESD would be viewed as coming from 1453 the same peer, regardless of the source RG. Because this 1454 vulnerability is already present in today's Internet (forging the 1455 source address of a packet is trivial), the mere delivery of incoming 1456 datagrams with the same ESD but a different RG does not introduce new 1457 vulnerability to TCP. In today's Internet, any node can already 1458 originate FINs/RSTs from an arbitrary source address and potentially 1459 or definitely disrupt the connection. Therefore, acceptance of 1460 traffic independent of its source RG does not appear to significantly 1461 worsen existing robustness. Note, however, that ingress filtering as 1462 described in Section 5.3.1, cannot be performed on packets containing 1463 GSE addresses. This does make it more difficult to prevent certain 1464 types of attacks. 1466 We also considered allowing TCP to reply to each segment using the RG 1467 of the most recently-received segment. Although this allows TCP 1468 connections to survive certain important events (e.g., renumbering), 1469 it also makes it trivial for anyone to hijack connections, 1470 unacceptably weakening robustness compared with today's Internet. A 1471 sender simply needs to guess the sequence numbers in use by a given 1472 TCP connection [Bellovin 89] and send traffic with a bogus RG to 1473 hijack a connection to an intruder at an arbitrary location. 
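The difference between the two return-path policies considered above can be illustrated with a small sketch. It is written in Python purely for illustration and is not part of the GSE proposal: the names split_address, Endpoint, follow_latest and demo are hypothetical, the address values are made up, and the model ignores TCP sequence numbers, which an attacker would additionally have to guess. We assume only what the text states, namely a 16-byte address whose low-order 8 bytes are the ESD, and we treat the remaining bytes as opaque Routing Stuff.

```python
from dataclasses import dataclass

ESD_BYTES = 8  # per the text: the ESD is the low-order 8 bytes of the address


def split_address(addr: bytes):
    """Split a 16-byte GSE-style address into (routing_stuff, esd)."""
    if len(addr) != 16:
        raise ValueError("expected a 16-byte address")
    return addr[:-ESD_BYTES], addr[-ESD_BYTES:]


@dataclass
class Connection:
    peer_esd: bytes
    send_rs: bytes  # Routing Stuff used when sending return traffic


class Endpoint:
    """Toy transport end-point that demultiplexes on ESD and ports only."""

    def __init__(self, follow_latest: bool):
        # follow_latest=True  : reply via the Routing Stuff of the most
        #                       recently received segment
        # follow_latest=False : keep the first Routing Stuff seen
        self.follow_latest = follow_latest
        self.conns = {}  # keyed by (peer ESD, peer port, local port)

    def receive(self, src_addr: bytes, src_port: int, dst_port: int):
        rs, esd = split_address(src_addr)
        key = (esd, src_port, dst_port)  # Routing Stuff is not in the key
        conn = self.conns.get(key)
        if conn is None:
            conn = Connection(peer_esd=esd, send_rs=rs)
            self.conns[key] = conn
        elif self.follow_latest:
            conn.send_rs = rs  # unauthenticated change of return path
        return conn


def demo(follow_latest: bool) -> str:
    """Return (as hex) the Routing Stuff used for replies after a forged segment."""
    esd = bytes.fromhex("0200000000000001")
    legit = bytes.fromhex("aaaaaaaaaaaa0001") + esd   # legitimate peer
    forged = bytes.fromhex("bbbbbbbbbbbb0001") + esd  # same ESD, bogus Routing Stuff
    ep = Endpoint(follow_latest)
    ep.receive(legit, 40000, 80)
    conn = ep.receive(forged, 40000, 80)
    return conn.send_rs.hex()
```

Under these assumptions, demo(False) keeps return traffic bound to the first Routing Stuff seen, whereas demo(True) redirects it to the forged Routing Stuff after a single segment carrying the same ESD and port numbers.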
1475 Providing protection from hijacking implies that the RG used to send 1476 packets must be bound to a connection end-point (e.g., it is part of 1477 the connection state). Although it may be reasonable to accept 1478 incoming traffic independent of the source RG, the choice of sending 1479 RG requires more careful consideration. Indeed, any subsequent 1480 change in the RG used for sending traffic must be properly 1481 authenticated (e.g., using cryptographic means). In the GSE 1482 proposal, there is no apparent way to authenticate such a change, since 1483 the remote peer doesn't even know its own RG. Consequently, the only 1484 reasonable approach in GSE is to send to the peer using the first RG 1485 used for the entire life of a connection. That is, always use the 1486 first RG seen, and accept the loss of connectivity whenever the RG 1487 changes. 1489 5.4.4. The Impact of Corrupted Routing Goop 1491 Another interesting issue that arises is what impact corrupted RG 1492 would have on robustness. Because the RG is not covered by the TCP 1493 checksum (the sender doesn't know what source RG will be inserted), 1494 no TCP mechanism can detect such corruption at the receiver. 1495 Moreover, once a specific RG is in use, it does not change for the 1496 duration of a connection. One interesting case occurs on the passive 1497 side of a TCP connection, where a server accepts incoming connections 1498 from remote clients. If the initial SYN from the client includes a 1499 corrupted RG, the server TCP will create a TCP connection (in the 1500 SYN-RECEIVED state) and cache the corrupted RG with the connection. 1501 The second packet of the 3-way handshake, the SYN-ACK packet, would 1502 be sent to the wrong RG and consequently not reach the correct 1503 destination. Later, when the client retransmits the unacknowledged 1504 SYN, the server will continue to send the SYN-ACK using the bad RG.
1505 Eventually the client times out, and the attempt to open a TCP 1506 connection fails. 1508 We next consider relaxing the restriction on switching RGs in an 1509 attempt to avoid the previous failure scenario. The situation is 1510 complicated by the fact that the RG on received packets may change 1511 for legitimate reasons (e.g., a multi-homed site load-shares traffic 1512 across multiple border routers). The key question is how one can 1513 determine which RG is valid and which is not. That is, for each of 1514 the destination RGs a sender attempts to use, how can it determine 1515 which RG worked and which did not? Solving this problem is more 1516 difficult than first appears, since one must cover the cases of 1517 delayed segments, lost segments, simultaneous opens, etc. If a SYN- 1518 ACK is retransmitted using different RGs, it is not possible to 1519 determine which of the two RGs worked correctly. We conclude that 1520 the only way TCP can determine that a particular RG is correct is by 1521 receiving an ACK for a specific sequence number in which all 1522 transmissions of that sequence number used the same RG. This would 1523 involve non-trivial changes to TCP implementations. 1525 At best, an RG selection algorithm for TCP would require new logic in 1526 implementations of TCP's opening handshake --- a significant 1527 transition and deployment issue. We are not certain that a valid 1528 algorithm is attainable, however. RG changes would have to be 1529 handled in all cases handled by the opening handshake: delayed 1530 segments, lost segments, undetected bit errors in RG, simultaneous 1531 opens, old segments, etc. 1533 In the end, we conclude that although the corrupted SYN case 1534 introduces potential problems, the changes that would need to be made 1535 to TCP to robustly deal with such corruption would be significant, if 1536 tractable at all. 
This would result in a transition to GSE also 1537 having a significant TCPng component, a significant drawback. 1539 5.5. On The Uniqueness Of ESDs 1541 Although ESDs are expected to be globally unique, their uniqueness 1542 property may be violated either due to mistakes in allocation or by 1543 malicious attacks. The exact uniqueness requirements for ESDs 1544 depend on what purpose they serve and how they are used. If the 1545 correctness of some applications relies on the global uniqueness of 1546 ESDs, then active checking and enforcement will be necessary. On the 1547 other hand, if ESDs are used only to uniquely identify individual 1548 endpoints within a session, then one may consider global uniqueness 1549 as unnecessary. 1551 5.5.1. Impact of Duplicate ESDs 1553 Consider what happens when two nodes using the same ESD attempt to 1554 communicate with each other. In the GSE proposal, a node queries the 1555 DNS to obtain an IPv6 address. The returned address includes the 1556 Routing Stuff of an address (the RG+STP portions). The sender may 1557 not notice the destination ESD is the same as its own ESD and may 1558 well forward the packet to a router that delivers the packet to its 1559 correct destination (using the information in the Routing Stuff). On 1560 receipt of the packet, however, the destination node would extract 1561 the ESD portion of the destination address and detect the conflict. 1563 A more problematic case occurs if two nodes having the same ESD 1564 communicate with a third party. To the third party, packets received 1565 from either machine might appear to be coming from the same machine 1566 since they all carry the same ESD.
Consequently, at the transport 1567 level, if both machines choose the same source and destination port 1568 numbers (one of the ports --- a server's well-known port number --- 1569 will likely be the same), packets belonging to two distinct transport 1570 connections will be demultiplexed to a single transport end-point. 1572 When packets from different sources using the same source ESD are 1573 delivered to the same transport end-point, a number of possibilities 1574 come to mind: 1576 1) Following the GSE specification, the transport end-point would 1577 accept the packet, without regard to the Routing Stuff of the 1578 source address. This may lead to a number of robustness 1579 problems (and at best will confuse the application). 1581 2) The transport end-point could verify that the Routing Stuff of 1582 the source address matches one of a set of expected values 1583 before processing the packet further. If the Routing Stuff 1584 doesn't match any expected value, the packet could be dropped. 1585 This would result in a connection from one host operating 1586 correctly, while a connection from another host (using the same 1587 ESD) would fail. 1589 3) When a packet is received with an unexpected Routing Stuff the 1590 receiver could invoke special-purpose code to deal with this 1591 case. Possible actions include attempting to verify whether the 1592 Routing Stuff is indeed correct (the saved values may have 1593 expired) or attempting to verify whether duplicate ESDs are in 1594 use (e.g., by inventing a protocol that sends packets using both 1595 Routing Stuff and verifies that they are delivered to the same 1596 end-point). 1598 5.5.2. New Denial of Service Attacks. 1600 It is clear that there are potential problems if identifiers are not 1601 globally unique. How common such problems would actually occur in 1602 practice depends on how many duplicates there actually are. 
Thus, 1603 one might be tempted to make the argument that a scheme for assigning 1604 identifiers could be made to be "unique enough" in practice. This 1605 would be a dangerous and naive assumption, because in the absence of 1606 any ESD enforcement (i.e. ensuring each host use only the assigned 1607 ESD), intruders will actively impersonate other sites for the sole 1608 purpose of invalidating the uniqueness assumption. For example, one 1609 could deny service to host foo.bar.com by querying the DNS for its 1610 corresponding ESD, and then impersonating that ESD. 1612 As a specific example, one GSE-specific denial-of-service attack 1613 would be for an intruder to masquerade as another host and "wedge" 1614 connections in a SYN-RECEIVED state by sending SYN segments 1615 containing an invalid RG in the source IP address for a specific ESD. 1616 Subsequent connection attempts to the wedged host from the legitimate 1617 owner of the ESD (if they used the same TCP port numbers) would then 1618 not complete, since return traffic would be sent to the wrong place. 1620 5.6. Summary of Identifier Authentication Issues 1622 In summary, changing the RG dynamically in a safe way for a 1623 connection requires that an originator of traffic be able to 1624 authenticate a proposed change in the RG before sending to a 1625 particular ESD via that RG. This is difficult for several reasons: 1627 1) It can't be done on an end-to-end basis in GSE (e.g., via IPSec) 1628 because the sender doesn't know what value the RG portion of the 1629 address will have when it reaches the receiver. 1631 2) It can't be easily done in GSE because there is no mechanism at 1632 or below the transport layer to map ESDs into a quantity that 1633 can be used as a key to jump start the authentication process 1634 (using the DNS would be problematic due to layering circularity 1635 considerations). 
1637 3) Any scheme that uses the full IPv6 address to do the 1638 authentication can be used with today's standard provider-based 1639 addressing, raising the question of what benefit is retained 1640 from having separate identifiers and locators. 1642 Our final conclusion is that with the GSE approach, transport 1643 protocol end-points must make an early, single choice of the RG to 1644 use when sending to a peer and stick with that choice for the 1645 duration of the connection. Specifically: 1647 1) The demultiplexing of arriving packets to their transport end 1648 points should use only the ESD, and not the Routing Stuff. 1650 2) If the application chooses an RG for the remote peer (i.e., an 1651 active open), use the provided RG for all traffic sent to that 1652 peer, even if alternative RGs are received on subsequent 1653 incoming datagrams from the same ESD. For all other cases, use 1654 the first RG received with a given ESD for all sending. 1656 3) Simultaneously, we understand that, with the above rules, there 1657 are still open issues with regard to invalid RGs, either through 1658 corruption or through active hostile attacks. 1660 One difficulty with the above recommendation is that there does not 1661 appear to be a straightforward way to use ESDs in conjunction with 1662 mobility or site renumbering (in which existing connections survive 1663 the renumbering). This presents a quandary. The main benefit of 1664 separating identifiers and locators is the ability to have 1665 communication (e.g., a TCP connection) continue transparently, even 1666 when the Routing Stuff associated with a particular ESD changes. 1667 However, switching to a new Routing Stuff without properly 1668 authenticating it makes it trivial to hijack connections. 1670 We cannot emphasize enough that the use of an ESD independent of an 1671 associated RG can be very dangerous.
That is, communicating with a 1672 peer implies that one is always talking to the same peer for the 1673 duration of the communication. But as has been described in previous 1674 sections, such assurance can only come from properly authenticating 1675 the RG associated with an ESD. That is not possible in GSE. 1677 6. Conclusion 1679 The GSE proposal provides a concrete example of a network protocol 1680 design that separates identifiers from locators in addresses. In 1681 this paper we compared GSE with IPv4's CIDR-style addressing to 1682 better understand the pros and cons of the respective design 1683 approaches. 1685 Functionally speaking, identifiers and locators each have a logically 1686 different role to play. Thus overloading both in one field causes 1687 problems whenever the location of a node changes but its identity 1688 does not. However, our analysis shows that overloading also presents 1689 two critically important benefits. 1691 First, for network entity A to send data to network entity B, A must 1692 not only know B's end identifier but also B's locator. No scalable 1693 way is known at this time to provide this mapping at the network 1694 layer, other than overloading the two quantities into an address as 1695 is done in IPv4. Fundamentally, a scalable mapping algorithm 1696 strongly suggests that the identifier space be structured 1697 hierarchically, yet identifiers in GSE are not sufficiently large to 1698 both contain sufficient hierarchy and support stateless address 1699 autoconfiguration. Instead, GSE forces applications to supply up- 1700 to-date locators. However, relying on the locator provided at the 1701 time communication is established as GSE does is inadequate when the 1702 remote locator can change dynamically, precisely the scenario that is 1703 supposed to benefit from the separation. 
That is, the benefits of
separating the identifier from the locator are largely lost if
changes in the identifier-to-locator binding are not tracked quickly.

Secondly, when communicating with a remote site, a change in the RG
introduces uncertainty as to whether a reliable TCP handshake is
possible (because a passively opened TCP connection must use the RGs
it obtains from arriving packets). Because the reliability of TCP's
byte stream is critically dependent on its three-way handshake, this
is a significant issue.

Finally, when communicating with a remote site, a receiver must be
able to ensure (with reasonable certainty) that received data does
indeed come from the expected remote entity. In IPv4, it is possible
to receive packets from a forged source, but the potential for
mischief between communicating peers is significantly limited because
return traffic will not generally reach the source of the forged
traffic. That is, communication involving packets sent in both
directions will not succeed. In contrast, architectures like GSE
that decouple the identifier and locator functions lose the built-in
protection available in classical IP and thus face great difficulty
assuring that traffic from a source identified only by an identifier
actually comes from the correct source. Short of using cryptographic
techniques (e.g., IPsec), there is no known mechanism that can use an
identifier alone to perform this remote entity authentication. Using
an identifier alone for authentication of received packets is
dangerously unsafe.

In summary, although overloading the address field with a combined
identifier and locator leads to difficulties in retaining the
identity of a node whenever its address changes, analysis in this
paper suggests that the benefit of the overloading actually
outweighs its cost.
Completely separating an identifier from its
locator renders the identifier untrustworthy, and thus useless, in
the absence of an accompanying authentication system.

7. Security Considerations

The primary security consideration with GSE or, more generally, a
network layer with addresses split into locator and identifier parts,
is that of one node impersonating another by copying the
identification without the location. Indeed, the main conclusion of
this paper is that a GSE-like addressing structure introduces new
security vulnerabilities that are not present in IP, and that those
problems are serious enough to question the benefits of an
architecture that separates locators and identifiers in addresses.

8. Acknowledgments

Thanks go to Steve Deering and Bob Hinden (the Chairs of the IPng
Working Group) as well as Sun Microsystems (the host for the interim
meeting) for the planning and execution of the interim meeting.
Thanks also go to Mike O'Dell for writing the 8+8 and GSE drafts; by
publishing these documents and speaking on their behalf, Mike was the
catalyst for some valuable discussions, both for IPv6 addressing and
for addressing architectures in general. Special thanks to the
attendees of the interim meeting whose high-caliber discussions
helped motivate and shape this document.

9. References

[ANYCAST] "Host Anycasting Service", C. Partridge, T. Mendez, & W.
     Milliken, RFC 1546.

[BATES] "Scalable Support for Multi-homed Multi-provider
     Connectivity", Tony Bates & Yakov Rekhter, RFC 2260,
     January, 1998.

[Bellovin 89] "Security Problems in the TCP/IP Protocol Suite",
     Bellovin, Steve, Computer Communications Review, Vol. 19,
     No. 2, pp. 32-48, April 1989.

[CIDR] "Classless Inter-Domain Routing (CIDR): an Address
     Assignment and Aggregation Strategy", V. Fuller, T. Li, J.
     Yu, & K.
     Varadhan, RFC 1519, September 1993.

[DHCP-DDNS] "Interaction between DHCP and DNS", Internet Draft,
     Yakov Rekhter, (Work in progress).

[DDNS] "Dynamic Updates in the Domain Name System (DNS UPDATE)",
     Paul Vixie (Editor), RFC 2136, April, 1997.

[EUI64] "64-Bit Global Identifier Format Tutorial",
     http://standards.ieee.org/db/oui/tutorials/EUI64.html.
     Note: "EUI-64" is claimed as a trademark by an organization
     which also forbids reference to itself in association with
     that term in a standards document which is not their own,
     unless they have approved that reference. However, since
     this document is not standards-track, it seems safe to name
     that organization: the IEEE.

[GSE] "GSE - An Alternate Addressing Architecture for IPv6", Mike
     O'Dell, (Work in progress).

[IEEE802] IEEE Std 802-1990, "Local and Metropolitan Area Networks:
     IEEE Standard Overview and Architecture."

[IEEE1212] IEEE Std 1212-1994, "Information technology--
     Microprocessor systems: Control and Status Registers (CSR)
     Architecture for microcomputer buses."

[IPv6-ADDRESS] "An IPv6 Aggregatable Global Unicast Address
     Format", R. Hinden, M. O'Dell, S. Deering, RFC 2374, July,
     1998.

[MOBILITY] "IP Mobility Support", C. Perkins, RFC 2002, October,
     1996.

[RFC1752] "The Recommendation for the IP Next Generation Protocol",
     S. Bradner, A. Mankin, RFC 1752, January, 1995.

[RFC1788] "ICMP Domain Name Messages", W. Simpson, RFC 1788, April,
     1995.

[RFC1884] "IP Version 6 Addressing Architecture", R. Hinden & S.
     Deering, Editors, RFC 1884.

[RFC1958] "Architectural Principles of the Internet", B. Carpenter,
     RFC 1958, June, 1996.

[RFC1971] "IPv6 Stateless Address Autoconfiguration", S. Thomson,
     T. Narten, RFC 1971, August, 1996.

[RFC2008] "Implications of Various Address Allocation Policies for
     Internet Routing", Y. Rekhter, T.
     Li, RFC 2008, October 1996.

[RFC2073] "An IPv6 Provider-Based Unicast Address Format", Y.
     Rekhter, P. Lothberg, R. Hinden, S. Deering, J. Postel, RFC
     2073, January, 1997.

[RFC2267] "Network Ingress Filtering: Defeating Denial of Service
     Attacks which employ IP Source Address Spoofing", P.
     Ferguson, D. Senie, RFC 2267.

[ROUTER-RENUM] "Router Renumbering for IPv6", M. Crawford,
     draft-ietf-ipngwg-router-renum-06.txt.

10. Authors' Addresses

Matt Crawford                      John Stewart
Fermilab MS 368                    Juniper Networks, Inc.
PO Box 500                         385 Ravendale Drive
Batavia, IL 60510 USA              Mountain View, CA 94043
Phone: 630-840-3461                Phone: +1 650 526 8000
EMail: crawdad@fnal.gov            EMail: jstewart@juniper.net

Allison Mankin                     Lixia Zhang
USC/ISI                            UCLA Computer Science Department
4350 North Fairfax Drive           4531G Boelter Hall
Suite 620                          Los Angeles, CA 90095-1596 USA
Arlington, VA 22203 USA            Phone: 310-825-2695
EMail: mankin@isi.edu              EMail: lixia@cs.ucla.edu
Phone: 703-812-3706

Thomas Narten
IBM Corporation
3039 Cornwallis Ave.
PO Box 12195 - F11/502
Research Triangle Park, NC 27709-2195
Phone: 919-254-7798
EMail: narten@raleigh.ibm.com

Appendix A: Increased Reliance on Domain Name System (DNS)

As we've discussed in previous sections, the motivation for
separating identifiers from locators in IP addresses is to allow the
locator portion to change more easily. However, because GSE does not
provide a mapping from an ESD to its locator, GSE falls back on the
DNS to provide that mapping whenever the locator changes.

Because any mapping scheme is complicated by renumbering, and because
recent IPv4 experience has shown a requirement for renumbering at
some frequency, it is worthwhile to explore the general renumbering
issue.

A.1: Renumbering and DNS: How Frequently Can We Renumber?

One premise of the GSE proposal [GSE] is that an ISP can renumber the
Routing Goop portion of a site's addresses transparently to the site
(i.e., without coordinating the change with the site). This would
make it possible for backbone providers to aggressively renumber the
Routing Goop part of addresses to achieve a high degree of route
aggregation. On closer examination, frequent (e.g., daily)
renumbering turns out to be difficult in practice because of a
circular dependency between the DNS and routing. Specifically, if a
site's Routing Stuff changes, nodes communicating with the site need
to obtain the new Routing Stuff. In the GSE proposal, one queries
the DNS to obtain this information. However, in order to reach a
site's DNS servers, the pointers controlling the downward delegation
of authoritative DNS servers (i.e., DNS "glue records") must use
addresses with Routing Stuff that is reachable. That is, in order
to find the address for the web server "www.foo.bar.com", DNS queries
might need to be sent to a root DNS server, as well as DNS servers
for "bar.com" and "foo.bar.com". Each of these servers must be
reachable from the querying client. Consequently, there must be an
adequate overlap period after the RG changes, during which both the
old Routing Stuff and the new Routing Stuff can be used
simultaneously. During the overlap period, DNS glue records will
need to be updated to use the new addresses (including Routing
Stuff), and the DNS RRs themselves need to be updated. Only after all
relevant DNS servers have been updated and all previously cached RRs
containing the old addresses have timed out can the old RG be
deleted.

An important observation is that the above issue is not specific to
GSE; the same requirement exists with today's provider-based
addressing architecture.
When a site is renumbered (e.g., it
switches ISPs and obtains a new set of addresses from its new
provider), the DNS must be updated in a similar fashion.

A.2: Efficient DNS Support for Site Renumbering

In the current Internet, when a site is renumbered, the addresses of
all the site's internal nodes change. This requires a potentially
large update to the RR database for that site. Although Dynamic DNS
[DDNS] could potentially be used, the cost is likely to be large due
to the large number of individual records that would need to be
updated. In addition, when DHCP and DDNS are used together
[DHCP-DDNS], it may be the case that individual hosts "own" their own
A or AAAA records, further complicating the question of who is able
to update the contents of DNS RRs.

With GSE, when a site renumbers to satisfy its ISP, only the site's
routing prefix needs to change. That is, the prefix reflects where
within the Internet the site resides. One DNS modification that
could reduce the cost of updating the DNS when a site is renumbered
is to store addresses in two distinct RRs: one for the Routing Goop
that reflects where a node attaches to the Internet and the other for
STP-plus-ESD that is the site-specific part of an address. During a
renumbering, the Routing Goop would change, but the "site-internal
part" would remain fixed. That way, renumbering a site would only
require that the Routing Goop RR of a site be updated; the
"site-internal part" of individual addresses would not change.

To obtain the address of a node from the DNS, a DNS query for the
name would return two quantities: the "site-internal part" and the
DNS name of the Routing Stuff for the site. An additional DNS query
would then obtain the specific Routing Goop RR of the site, and the
complete address would be synthesized by concatenating the two pieces
of information.
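
The two-step lookup and concatenation described above can be
sketched as follows. This is only an illustration under stated
assumptions: the field widths (a 6-byte Routing Goop followed by a
10-byte STP-plus-ESD "site-internal part"), the record names, and the
in-memory tables standing in for DNS zones are all hypothetical and
are not defined by this document.

```python
import ipaddress

RG_BITS = 48         # assumed Routing Goop width (6 bytes)
SITE_PART_BITS = 80  # assumed STP (2 bytes) + ESD (8 bytes) width

# Hypothetical zone data: one Routing Goop RR per site, and one
# "site-internal part" RR per node that names the site's RG record.
routing_goop_rrs = {"rg.site.example.com": 0x20010DB80001}
node_rrs = {
    "www.site.example.com": ("rg.site.example.com", 0x000A020000FFFE000001),
}


def synthesize_address(name: str) -> ipaddress.IPv6Address:
    """Concatenate a site's Routing Goop with a node's site-internal part."""
    rg_name, site_part = node_rrs[name]       # first query: the node's record
    routing_goop = routing_goop_rrs[rg_name]  # second query: the site's RG
    if routing_goop >> RG_BITS or site_part >> SITE_PART_BITS:
        raise ValueError("field does not fit its assumed width")
    return ipaddress.IPv6Address((routing_goop << SITE_PART_BITS) | site_part)


before = synthesize_address("www.site.example.com")
# Renumbering the site touches only the single Routing Goop RR.
routing_goop_rrs["rg.site.example.com"] = 0x20010DB80002
after = synthesize_address("www.site.example.com")
print(before, "->", after)
```

In this sketch, renumbering changes one table entry no matter how
many node records the site holds, which is the saving the two-RR
scheme is meant to provide.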

Implementing these DNS changes increases the practicality of using
Dynamic DNS to update a site's DNS records as it is renumbered. Only
the site's Routing Goop RRs would need updating.

Finally, it may be useful to divide a node's AAAA RR into the three
logical parts of the GSE proposal, namely RG, STP and ESD. Whether
it is more useful to have separate RRs for the STP and ESD portions
of an address or a single RR combining both is an issue that requires
further study.

If AAAA records are composed of multiple distinct RRs, then one
question is who should be responsible for synthesizing the AAAA from
its components: the resolver running on the querying client's machine
or the queried name server? To minimize the impact on client hosts
and make it easier to deploy future changes, it is recommended that
the synthesis of AAAA records from their constituent parts be done on
name servers rather than in client resolvers.

A.2.1: Two-Faced DNS

The GSE proposal attempts to hide the RG part of addresses from nodes
within a site. If the nodes do not know their own RG, then they
can't store or use it in ways that cause problems should the site
be renumbered and its RG change (i.e., the cached RG becomes invalid).
A site's DNS servers, however, will need to have more information
about the RG the site uses. Moreover, the responses a server returns
will depend on who queries it. A query from a node within the
site should return an address with a Site Local RG, whereas a query
for the same name from a client located at a different site should
return the global scope RG. This makes intra-site communication more
resilient to failures outside of the site. Such context-dependent
DNS servers are commonly referred to as "two-faced" DNS servers.
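
A minimal sketch of the context-dependent answer described above
follows. It assumes that queries from inside the site arrive with a
Site Local RG in their source address; the RG values, the field
widths, and the table of site-internal parts are hypothetical and
serve only to make the example concrete.

```python
import ipaddress

SITE_PART_BITS = 80               # assumed STP + ESD width (10 bytes)
SITE_LOCAL_RG = 0xFEC000000000    # hypothetical Site Local RG value
GLOBAL_RG = 0x20010DB80001        # hypothetical global-scope RG
SITE_LOCAL_NET = ipaddress.IPv6Network("fec0::/48")

# Hypothetical per-node "site-internal part" (STP plus ESD).
site_internal_parts = {"www.site.example.com": 0x000A020000FFFE000001}


def two_faced_answer(name: str, querier: str) -> ipaddress.IPv6Address:
    """Return an address whose RG depends on where the query came from."""
    site_part = site_internal_parts[name]
    inside = ipaddress.IPv6Address(querier) in SITE_LOCAL_NET
    rg = SITE_LOCAL_RG if inside else GLOBAL_RG
    return ipaddress.IPv6Address((rg << SITE_PART_BITS) | site_part)


internal_view = two_faced_answer("www.site.example.com",
                                 "fec0::a:200:ff:fe00:99")
external_view = two_faced_answer("www.site.example.com", "2001:db8:9::1")
print(internal_view, external_view)
```

The same name yields two different addresses, so any cache that
stores one of them without also recording which kind of querier it
was meant for can no longer be shared safely.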

Some issues that must be considered in this context:

1) A DNS server may recursively attempt to resolve a query on
   behalf of a requesting client. Consequently, a DNS query might
   be received from a proxy rather than from the client that
   actually seeks the information. Because the proxy may not be
   located at the same site as the originating client, a DNS server
   cannot reliably determine whether a DNS request is coming from
   the same site or a remote site. One solution would be to
   disallow recursive queries for off-site requesters, though this
   raises additional questions.

2) Since cached responses are, in general, context sensitive, a
   name server may be unable to correctly answer a query from its
   cache, since the information it has is incomplete. That is, it
   may have loaded the information via a query from a local client,
   and the information has a site-local prefix. If a subsequent
   request comes in from an off-site requester, the DNS server
   cannot return a correct response (i.e., one containing the
   correct RG).

A.2.2: Bootstrapping Issues

If Routing Stuff information is distributed via the DNS, key DNS
servers must always be reachable. In particular, the addresses
(including Routing Stuff) of all root DNS servers are, for all
practical purposes, well-known and assumed to never change. It is
not uncommon for the addresses of root servers to be hard-coded into
software distributions. Consequently, the Routing Stuff associated
with such addresses must always be usable for reaching root servers.

If it becomes necessary or desirable to change the Routing Stuff of
an address at which a root DNS server resides, the routing subsystem
will likely need to continue carrying "exceptions" for those
addresses.
Because the total number of root DNS servers is
relatively small, the routing subsystem is expected to be able to
handle this requirement.

All other DNS server addresses can be changed, since their addresses
are typically learned from an upper-level DNS server that has
delegated a part of the name space to them. So long as the
delegating server is configured with the new address, the addresses
of other servers can change.

Appendix B: Additional Issues Related to GSE

This paper focused primarily on the issues of separating identifiers
and locators in unicast addresses. It is worth noting that a number
of additional issues were identified during the IPng interim meeting
with respect to the GSE proposal that would need to be considered
before an architecture such as GSE could be deployed. Specifically:

- It is not known how multicast would work under GSE. One
  identified issue is that a site with multiple egress routers
  would (by default) inject multicast traffic through each of its
  egress routers, and each would then replace the source Routing
  Goop with a different value. This would lead to multiple copies
  of the same packet, each carrying a different IPv6 source address
  and thus being treated as coming from different sources.

- It would be more difficult to create tunnels. Any tunnel that
  crosses a site boundary (i.e., the entry and exit points are in
  differing sites) would in effect require that both tunnel
  endpoints be border routers to ensure that the addresses in the
  inner headers were rewritten correctly.

- Hiding a site's Routing Goop from internal nodes while making it
  visible to external nodes requires a two-faced DNS. The current
  DNS model assumes a single global database in which all queries
  are answered the same way, regardless of who issued the query.
  It is unclear how to make
  the DNS answer queries in a context-sensitive manner without
  also negatively impacting its caching model.

Appendix C: Ideas Incorporated Into IPv6

This section summarizes changes made to IPv6 specifications which
originated in the GSE proposal or in the discussions arising from it.

The unicast address format was changed to improve the aggregability
of unicast addresses. Instead of a topologically insignificant
Registry ID immediately following the Format Prefix [RFC2073], there
is now a Top-Level Aggregation Identifier [IPv6-ADDRESS]. This field
identifies a large routable aggregate to which an address belongs
rather than an administrative unit that assigned the address. The
TLA corresponds to the "Large Structure" of GSE. The IPv6 Next-Level
Aggregation Identifier (NLA) is roughly the rest of the GSE "Routing
Goop" and the Site-Level Aggregation Identifier (SLA) is a slightly
expanded GSE Site Topology Partition.

The decision to put fixed boundaries between parts of the unicast
address (TLA, NLA, SLA, Interface Identifier) into IPv6 addresses
[IPv6-ADDRESS] also came from GSE. The previous "provider-based"
addressing architecture for IPv6 [RFC2073] had fluid boundaries
between Registry ID, Provider ID, Subscriber ID and the
Intra-Subscriber part, as well as undefined divisions within the
Provider-ID and Intra-Subscriber part. (On subnetworks with a
MAC-layer address, the latter boundary was generally placed to
accommodate use of that address as an Interface ID.) The new
addressing architecture still expects divisions within the NLA
portion of the address, placed to reflect topological aggregation
points.
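
Because the boundaries are fixed, the fields can be pulled out of an
address with constant shifts and masks. The sketch below uses the
field widths given in [IPv6-ADDRESS] (RFC 2374), which are not
restated in this document: a 3-bit Format Prefix, 13-bit TLA, 8-bit
reserved field, 24-bit NLA, 16-bit SLA, and 64-bit Interface
Identifier. The type and function names are illustrative only.

```python
import ipaddress
from typing import NamedTuple


class AggregatableAddress(NamedTuple):
    fp: int            # Format Prefix, 3 bits
    tla: int           # Top-Level Aggregation Identifier, 13 bits
    res: int           # reserved, 8 bits
    nla: int           # Next-Level Aggregation Identifier, 24 bits
    sla: int           # Site-Level Aggregation Identifier, 16 bits
    interface_id: int  # Interface Identifier, 64 bits


def split_fields(address: str) -> AggregatableAddress:
    """Split a 128-bit address at the fixed RFC 2374 boundaries."""
    value = int(ipaddress.IPv6Address(address))
    return AggregatableAddress(
        fp=value >> 125,
        tla=(value >> 112) & 0x1FFF,
        res=(value >> 104) & 0xFF,
        nla=(value >> 80) & 0xFFFFFF,
        sla=(value >> 64) & 0xFFFF,
        interface_id=value & 0xFFFFFFFFFFFFFFFF,
    )


fields = split_fields("2001:db8:1:a:200:ff:fe00:1")
print(fields)
```

With fluid boundaries, as in the earlier provider-based format, the
shift amounts would have to be looked up per provider and per
subscriber instead of being constants.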

Defining a fixed boundary between the routable portion of the address
and the part indicating an interface on a specific link required
specifying an Interface Identifier that would be suitable for all
subnetwork technologies. The IEEE "EUI-64" identifier was selected,
having the advantages of an easy mapping from 48-bit MAC addresses
and a defined escape flag into locally-administered values.

Another change was the redefinition of the interface identifier to be
a 64-bit quantity. In the common case where a node has at least one
IEEE interface, the interface identifier is constructed from an IEEE
identifier (i.e., a MAC address) in such a way that there is a very
high probability that the identifier will be globally unique. In the
case where a globally unique identifier can't easily be constructed
automatically, a bit in the identifier indicates that the address is
not globally unique. At present, there are no plans for transport
protocols such as TCP to exploit interface identifiers, but the door
has been left open for a future protocol (e.g., TCPng) to take
advantage of the ESD concept.

Another change to come out of the GSE discussions relates to reducing
the number of DNS record changes required in the event of site
renumbering. This work is not finalized as of this writing, but the
result may be that individual IPv6 addresses are stored (and signed,
in the case of Secure DNS) as a partial address and an indirect
pointer which leads to the high-order part of the address. There may
be multiple levels of indirection and a changed record at any one
level would suffice to update the DNS's record of the IPv6 addresses
of every node in a given branch of the addressing hierarchy.

A change in the method of doing DNS address-to-name lookups is also
in the works.
This may be a change in the form and/or operation of
the ip6.int domain or some new mechanism which involves participation
by the routers or the end-nodes themselves.

Two other changes arising from GSE will not affect the IPv6 base
specifications themselves, but do direct additional work. Those are
the injection of global prefix information into a site from a
provider or exchange [ROUTER-RENUM], and some inter-provider
cooperative method of providing multihoming to mutual customers with
minimal impact on routing tables in distant parts of the network.

Appendix D: Reverse Mapping of Complete GSE Addresses

The ability to map an IP address into its corresponding DNS name is
used in several contexts:

1) Network packet tracing utilities (e.g., tcpdump) display the
   contents of packets. Printing out the DNS names appearing in
   those packets (rather than dotted IP addresses) requires access
   to an address-to-name mapping mechanism.

2) Some applications perform a "poor-man's" authentication by using
   the DNS to map the source address of a peer into a DNS name.
   The client then queries the DNS a second time, this time asking
   for the address(es) corresponding to the peer's DNS name. Only
   if one of the addresses returned by the DNS matches the peer
   address of the TCP connection is the source of the TCP
   connection accepted as being from the indicated DNS name.

   It is important to note that although two DNS queries are made
   during the above operation, it is the second one --- mapping the
   peer's DNS name back into an IP address --- that provides the
   authentication property. The first transaction simply obtains
   the peer's DNS name, but no assumption is made that the returned
   DNS name is correct. Thus, the first DNS query could be
   replaced by an alternate mechanism without weakening the already
   weak authentication check described above.
   One possible
   alternate mechanism, an ICMP "Who Are You" message, is described
   below.

3) Applications that log all incoming network connections (e.g.,
   anonymous FTP servers) may prefer logging recognizable DNS names
   to addresses.

4) Network administrators examining logs or other trace data
   containing addresses may wish to determine the DNS name of some
   addresses. Note that this may occur sometime after those
   addresses were actually used.

The following subsections describe techniques for mapping a full IPv6
address back into some quantity (e.g., a DNS name or locator). We
include these descriptions for completeness even though they do not
address the fundamental problem of how to perform the mapping on an
identifier alone. It should also be noted that because both
techniques operate on complete IPv6 addresses, they are both directly
applicable to provider-based addressing schemes and are not specific
to GSE.

D.1: DNS-Like Reverse Mapping of Full GSE Addresses

Although a global-scale reverse mapping of ESDs seems infeasible, it
may be feasible within a site to maintain a database keyed on
unstructured 8-byte ESDs. However, it is an open question
whether such a database can be kept up-to-date at reasonable cost,
without making unreasonable assumptions as to how large sites are
going to grow, and how frequently ESD registrations will be made or
updated. Note that the issue isn't just the physical database
itself, but the operational issues involved in keeping it up-to-date.
For the rest of this section, however, let us assume that such a
database can be built.

A mechanism supporting a lookup keyed on a flat-space ESD from an
arbitrary site requires having sufficient structure to identify the
site that needs to be queried.
In practice, since the Routing Stuff
is organized hierarchically, if an ESD is always used in conjunction
with Routing Stuff (i.e., a full 16-byte address), it becomes
feasible to maintain a DNS-like tree that maps full GSE addresses
into DNS names, in a fashion analogous to what is done with IPv4 PTR
records today.

It should be noted that a GSE address lookup will work only if the
Routing Stuff portion of the address is correctly entered in the DNS
tree. Because the Routing Stuff portion of an address is expected to
change over time, this assumption will not hold indefinitely.

As a consequence, a packet trace recorded in the past might not
contain enough information to identify the off-site sources of the
packets in the present. This problem can be addressed by requiring
that the database of RG delegations be maintained, together with
accurate timing information, for some period of time after the RG is
no longer usable for routing packets.

Finally, it should be noted that the problem where an address's RG
"expires" with the implication that the mapping of "expired"
addresses into DNS names may no longer hold is not a problem specific
to the GSE proposal. With provider-based addressing, the same issue
arises when a site renumbers into a new provider prefix and releases
the allocation from a previous block. The authors are aware of one
such renumbering incident in IPv4 where a block of returned
addresses was reassigned and reused within 24 hours of the
renumbering event.

D.2: The ICMP Who-Are-You Message

There is widespread agreement on the utility of being able to
determine the DNS name one is communicating with from the address
being used.
In addition to the fact that DNS names are more
meaningful to human users and more stable than addresses, many users
use this reverse mapping as part of a poor-man's authentication for
the remote peer; if one can map the obtained DNS name back to the
same address, one has increased confidence that the peer is a
legitimate one.

In practice, however, the IN-ADDR.ARPA domain is not fully populated
and is poorly maintained. Consequently, an old proposal to define an
ICMP Who-Are-You message was resurrected [RFC1788]. A client would
send such a message to a peer, and that peer would return an ICMP
message containing its DNS name. Asking a remote host to supply its
own name in no way implies that the returned information is accurate.
However, having a remote peer provide a piece of information that a
client can use as input to a separate authentication procedure
provides a starting point for performing strong authentication. The
actual strength of the authentication depends on the authentication
procedure invoked, rather than the untrustworthy piece of information
provided by a remote peer.

Reconsidering the "cheap" authentication procedure described earlier,
the ICMP Who-Are-You replaces the DNS PTR query used to obtain the
DNS name of a remote peer. The second DNS query, to map the DNS name
back into a set of addresses, would be performed as before. Because
the latter DNS query provides the strength of the authentication, the
use of an ICMP Who-Are-You message does not in any way weaken the
strength of the authentication method. Indeed, it can only make it
more useful in practice, because virtually all hosts can be expected
to implement the Who-Are-You message.
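
The two-step check can be sketched as below. The point of the sketch
is that the name source is interchangeable: the name_of callable may
stand for a PTR query or for a Who-Are-You exchange, and only the
forward lookup decides the outcome. Both lookups are hypothetical
in-memory stubs here, not real DNS or ICMP calls, and all names and
addresses are made up for illustration.

```python
import ipaddress
from typing import Callable, Iterable, Optional


def verify_peer(
    peer: str,
    name_of: Callable[[str], Optional[str]],
    addresses_of: Callable[[str], Iterable[str]],
) -> Optional[str]:
    """Return the peer's name only if the forward lookup confirms it."""
    claimed_name = name_of(peer)  # untrusted hint (PTR or Who-Are-You)
    if claimed_name is None:
        return None
    forward = {ipaddress.ip_address(a) for a in addresses_of(claimed_name)}
    return claimed_name if ipaddress.ip_address(peer) in forward else None


# Hypothetical data standing in for the DNS and for Who-Are-You replies.
forward_zone = {"host.example.com": ["2001:db8:1:a::1"]}
who_are_you_replies = {
    "2001:db8:1:a::1": "host.example.com",
    "2001:db8:ffff::66": "host.example.com",  # impostor claiming the name
}


def who_are_you(address: str) -> Optional[str]:
    return who_are_you_replies.get(address)


def forward_lookup(name: str) -> Iterable[str]:
    return forward_zone.get(name, [])


honest = verify_peer("2001:db8:1:a::1", who_are_you, forward_lookup)
impostor = verify_peer("2001:db8:ffff::66", who_are_you, forward_lookup)
print(honest, impostor)
```

The impostor supplies the same name as the honest host, yet is
rejected because the forward lookup does not list its address. The
check is therefore no weaker when the name comes from the peer
itself.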

The Who-Are-You message has advantages outside the context of GSE as
well, including a more decentralized, and hence more scalable,
administration and easier upkeep than a DNS reverse-lookup zone. It
also has drawbacks: it requires the target node to be up and
reachable at the time of the query and to know its fully qualified
domain name. It is also not possible to resolve addresses once those
addresses become unroutable. In contrast, the DNS PTR mirrors, but
is independent of, the routing hierarchy. The DNS can maintain
mappings long after the routing subsystem stops delivering packets to
certain addresses.

The requirement that the target node be up and reachable at the time
of the query makes it very uncertain that one would be able to take
addresses from a packet log and translate them to correct domain
names at a later time. One can argue that this is a design flaw in
the logging system, as it violates the architectural principle,
"Avoid any design that requires addresses to be ... stored on
non-volatile storage" [RFC1958]. A better-designed system would look
up domain names promptly from logged addresses. Indeed, one of the
authors has been doing that for some years.