idnits 2.17.1 

draft-ietf-ipngwg-esd-analysis-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 2 instances of too long lines in the document, the longest one
     being 3 characters in excess of 72.

  ** The abstract seems to contain references ([GSE]), which it shouldn't. 
     Please replace those with straight textual mentions of the documents in
     question.

  == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.

  == There are 7 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (February 12, 1999) is 9204 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC 2073' is mentioned on line 669, but not defined

  ** Obsolete undefined reference: RFC 2073 (Obsoleted by RFC 2374)

  == Missing Reference: 'ESD' is mentioned on line 759, but not defined

  == Missing Reference: 'RFC 2267' is mentioned on line 1360, but not defined

  ** Obsolete undefined reference: RFC 2267 (Obsoleted by RFC 2827)

  == Unused Reference: 'ANYCAST' is defined on line 1764, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC1884' is defined on line 1817, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC2267' is defined on line 1834, but no explicit
     reference was found in the text

  ** Downref: Normative reference to an Informational RFC: RFC 1546 (ref.
     'ANYCAST')

  ** Downref: Normative reference to an Informational RFC: RFC 2260 (ref.
     'BATES')

  -- Possible downref: Non-RFC (?) normative reference: ref. 'Bellovin 89'

  ** Obsolete normative reference: RFC 1519 (ref. 'CIDR') (Obsoleted by RFC
     4632)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'DHCP-DDNS'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'EUI64'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'GSE'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE802'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE1212'

  ** Obsolete normative reference: RFC 2374 (ref. 'IPv6-ADDRESS') (Obsoleted
     by RFC 3587)

  ** Obsolete normative reference: RFC 2002 (ref. 'MOBILITY') (Obsoleted by
     RFC 3220)

  ** Obsolete normative reference: RFC 1788 (Obsoleted by RFC 6918)

  ** Obsolete normative reference: RFC 1884 (Obsoleted by RFC 2373)

  ** Downref: Normative reference to an Informational RFC: RFC 1958

  ** Obsolete normative reference: RFC 1971 (Obsoleted by RFC 2462)

  ** Obsolete normative reference: RFC 2073 (Obsoleted by RFC 2374)

  ** Obsolete normative reference: RFC 2267 (Obsoleted by RFC 2827)

  == Outdated reference: A later version (-10) exists of
     draft-ietf-ipngwg-router-renum-06


     Summary: 20 errors (**), 0 flaws (~~), 10 warnings (==), 8 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	INTERNET-DRAFT                                             Matt Crawford
2	                                                                Fermilab
3	<draft-ietf-ipngwg-esd-analysis-04.txt>                   Allison Mankin
4	                                                                     ISI
5	                                                           Thomas Narten
6	                                                                     IBM
7	                                                    John W. Stewart, III
8	                                                                 Juniper
9	                                                             Lixia Zhang
10	                                                                    UCLA
11	                                                       February 12, 1999

13	             Separating Identifiers and Locators in Addresses:
14	                 An Analysis of the GSE Proposal for IPv6

16	                  <draft-ietf-ipngwg-esd-analysis-04.txt>

18	Status of this Memo

20	   This document is an Internet-Draft and is in full conformance with
21	   all provisions of Section 10 of RFC2026 except that the right to
22	   produce derivative works is not granted.

24	   Internet-Drafts are working documents of the Internet Engineering
25	   Task Force (IETF), its areas, and its working groups.  Note that
26	   other groups may also distribute working documents as Internet-
27	   Drafts.

29	   Internet-Drafts are draft documents valid for a maximum of six months
30	   and may be updated, replaced, or obsoleted by other documents at any
31	   time.  It is inappropriate to use Internet-Drafts as reference
32	   material or to cite them other than as "work in progress."

34	   The list of current Internet-Drafts can be accessed at
35	   http://www.ietf.org/ietf/1id-abstracts.txt

37	   The list of Internet-Draft Shadow Directories can be accessed at
38	   http://www.ietf.org/shadow.html.

40	Abstract

42	   On February 27-28, 1997, the IPng Working Group held an interim
43	   meeting in Palo Alto, California to consider adopting Mike O'Dell's
44	   "GSE - An Alternate Addressing Architecture for IPv6" proposal [GSE].
45	   In GSE, 16-byte IPv6 addresses are split into distinct portions for
46	   global routing, local routing and end-point identification.  GSE
47	   includes the feature of configuring a node internal to a site with
48	   only the local routing and end-point identification portions of the
49	   address, thus hiding the full address from the node.  When such a
50	   node generates a packet, only the low-order bytes of the source
51	   address are specified; the high-order bytes of the address are filled
52	   in by a border router when the packet leaves the site.

54	   There is a long history of a vague assertion in certain circles that
55	   IPv4 "got it wrong" by treating its addresses simultaneously as
56	   locators and identifiers.  Despite these claims, however, there was
57	   never a complete proposal for a scalable network protocol which
58	   separated the functions.  As a result, it wasn't possible to do a
59	   serious analysis comparing and contrasting a "separated" architecture
60	   and an "overloaded" architecture.  The GSE proposal serves as a
61	   vehicle for just such an analysis, and that is the purpose of this
62	   paper.

64	   We conclude that an architecture that clearly separates locators and
65	   identifiers in addresses introduces new issues and problems that do
66	   not have an easy or clear solution.  Indeed, the alleged
67	   disadvantages of overloading addresses turn out to provide some
68	   significant benefits over the non-overloaded approach.

70	   Contents

72	   Status of this Memo..........................................    1

74	   1.  Introduction.............................................    3

76	   2.  Definitions and Terminology..............................    4

78	   3.  Addressing and Routing in IPv4...........................    5
79	      3.1.  The Need for Aggregation............................    7
80	      3.2.  The Pre-CIDR Internet...............................    7
81	      3.3.  CIDR and Provider-Based Addressing..................    9
82	      3.4.  Multi-Homed Sites and Aggregation...................   12

84	   4.  The GSE Proposal.........................................   15
85	      4.1.  Motivation For GSE..................................   15
86	      4.2.  GSE Address Format..................................   16
87	         4.2.1.  Routing Stuff (RG and STP).....................   16
88	         4.2.2.  End-System Designator..........................   18
89	      4.3.  Address Rewriting by Border Routers.................   19
90	      4.4.  Renumbering and Rehoming Mid-Level ISPs.............   20
91	      4.5.  Support for Multi-Homed Sites.......................   21
92	      4.6.  Explicit Non-Goals for GSE..........................   22

94	   5.  Analysis: The Pros and Cons of Overloading Addresses.....   22
95	      5.1.  Purpose of an Identifier............................   23
96	      5.2.  Mapping an Identifier to a Locator..................   25
97	         5.2.1.  Scalable Mapping of Identifiers to Locators....   27
98	         5.2.2.  Insufficient Hierarchy Space in ESDs...........   27
99	      5.3.  Authentication of Identifiers.......................   28
100	         5.3.1.  Identifier Authentication in IPv4..............   29
101	         5.3.2.  Identifier Authentication in GSE...............   30
102	      5.4.  Transport Layer: What Locator Should Be Used?.......   30
103	         5.4.1.  RG Selection On An Active Open.................   31
104	         5.4.2.  RG Selection On A Passive Open.................   31
105	         5.4.3.  Mid-Connection RG Changes......................   31
106	         5.4.4.  The Impact of Corrupted Routing Goop...........   33
107	      5.5.  On The Uniqueness Of ESDs...........................   34
108	         5.5.1.  Impact of Duplicate ESDs.......................   34
109	         5.5.2.  New Denial of Service Attacks..................   35
110	      5.6.  Summary of Identifier Authentication Issues.........   35

112	   6.  Conclusion...............................................   37

114	   7.  Security Considerations..................................   38

116	   8.  Acknowledgments..........................................   38

118	   9.  References...............................................   38

120	   10.  Authors' Addresses......................................   40

122	   Appendix A: Increased Reliance on Domain Name System (DNS)...   41

124	   Appendix B: Additional Issues Related to GSE.................   45

126	   Appendix C: Ideas Incorporated Into IPv6.....................   46

128	   Appendix D: Reverse Mapping of Complete GSE Addresses........   47

130	1.  Introduction

132	   In October of 1996, Mike O'Dell published an Internet-Draft (dubbed
133	   "8+8") that proposed significant changes to the IPv6 unicast
134	   addressing architecture.  The 8+8 proposal was the topic of
135	   considerable discussion at the December 1996 IETF meeting in San
136	   Jose.  Because the proposal offered both potential benefits (e.g.,
137	   enhanced routing scalability) and risks (e.g., changes to the basic
138	   IPv6 architecture), the IPng Working Group held an interim meeting on
139	   February 27-28, 1997 to consider adopting the 8+8 proposal.

141	   Shortly before the interim meeting, an updated version of the
142	   Internet-Draft was produced.  This version changed the name of the
143	   proposal from "8+8" to "GSE" to identify the three separate
144	   components of a unicast address: Global, Site and End-System
145	   Designator.

147	   The well-attended meeting generated high caliber, focused technical
148	   discussions on the issues involved, with participation by almost all
149	   of the attendees.  By the middle of the second day there was
150	   unanimous agreement that the GSE proposal as written presented too
151	   many risks and should not be adopted as the basis for IPv6.  The
152	   proposal did, however, challenge the group to make several
153	   improvements to the then existing IPv6 specifications (including
154	   increasing the aggregatability of addresses, having hard boundaries
155	   between routing and non-routing parts of the address, and easing the
156	   DNS aspects of renumbering).

158	   This document focuses primarily on the issue of separating unicast
159	   addresses into distinct portions for identification and location
160	   purposes, a separation that IPv4 does not make but that is
161	   fundamental to GSE.  We start with a discussion of the current
162	   architecture of IPv4 addressing and its impact on route scalability,
163	   identification, multi-homing, etc.  Next, the details of the GSE
164	   proposal are described.  Finally, the fundamental issue of
165	   decomposing addresses into multiple separate functional parts is
166	   analyzed in the context of the GSE proposal.  Here we detail some of
167	   the practical reasons why separating addresses into locators and
168	   identifiers poses a number of new challenges, making it clear that
169	   having such a separation is no panacea.  An appendix contains a
170	   summary of the IPng Working Group's deliberations of GSE and the
171	   results on IPv6 addressing.

173	   Finally, this document's focus on unicast issues should not be
174	   interpreted to mean that the impact of separating identifier and
175	   locating functions on non-unicast aspects of routing and addressing
176	   is well understood or trivial to deal with.  Specifically,
177	   understanding how multicasting and anycast addressing [ANYCAST,
178	   RFC1884] fit into such a model requires further work.

180	2.  Definitions and Terminology

182	   The following terminology is used throughout this document.

184	      Routing Goop --- A term defined by the GSE document.  It refers to
185	                    the first six bytes of a sixteen byte IPv6 GSE
186	                    address.  The Routing Goop portion of an address
187	                    identifies where a site connects to the public
188	                    Internet.  More generally, the term refers to the
189	                    portion of an address's routing prefix that
190	                    identifies where on the public Internet the site
191	                    housing the address resides.

193	      Site Topology Partition --- A term defined by the GSE document
194	                    that refers to the two bytes of a sixteen byte IPv6
195	                    GSE address immediately to the right of the Routing
196	                    Goop.  The Site Topology Partition part of an
197	                    address identifies which link within a site an
198	                    address resides on.

200	      Routing Stuff --- The part of an address that identifies which
201	                    link the address resides on.  Within the context of
202	                    GSE, the Routing Stuff comprises the Routing Goop
203	                    and Site Topology Partition parts of an address
204	                    (i.e., the leftmost eight bytes).

206	      identifier --- a value that indicates the sender of a packet, or
207	                    the intended recipient of a packet.  Within the
208	                    context of GSE, the ESD portion (i.e., the rightmost
209	                    eight bytes) of the address is an identifier.

211	      locator --- a field in a packet header that is used by the routing
212	                    subsystem to deliver a packet to the link on which a
213	                    destination resides.  The terms locator and Routing
214	                    Stuff are similar; we use Routing Stuff when
215	                    referring to the specific locator in GSE.
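
The field boundaries defined above can be sketched in a few lines of
Python.  This is only an illustration: the helper names (split_gse,
routing_stuff, fill_routing_goop) and the sample addresses are
assumptions made for this example and do not come from the GSE
document.  The last helper mimics, under the same assumptions, a border
router filling in the Routing Goop of a source address that a site-
internal node left unspecified.

```python
import ipaddress

# Field widths follow the definitions above: a 6-byte Routing Goop,
# a 2-byte Site Topology Partition and an 8-byte End-System Designator.
RG_LEN, STP_LEN, ESD_LEN = 6, 2, 8


def split_gse(address: str) -> dict:
    """Split a 16-byte address into the three GSE fields (hypothetical helper)."""
    raw = ipaddress.IPv6Address(address).packed  # 16 bytes, network byte order
    return {
        "routing_goop": raw[:RG_LEN],
        "stp": raw[RG_LEN:RG_LEN + STP_LEN],
        "esd": raw[RG_LEN + STP_LEN:],
    }


def routing_stuff(address: str) -> bytes:
    """Routing Stuff (the locator) is RG plus STP: the leftmost eight bytes."""
    fields = split_gse(address)
    return fields["routing_goop"] + fields["stp"]


def fill_routing_goop(address: str, rg: bytes) -> ipaddress.IPv6Address:
    """Overwrite the leftmost six bytes, as a border router would on egress."""
    if len(rg) != RG_LEN:
        raise ValueError("Routing Goop must be exactly six bytes")
    raw = ipaddress.IPv6Address(address).packed
    return ipaddress.IPv6Address(rg + raw[RG_LEN:])


# Sample data only: 2001:db8::/32 is a documentation prefix, not a GSE one.
fields = split_gse("2001:db8:aaaa:1:211:22ff:fe33:4455")
print(fields["routing_goop"].hex())  # 20010db8aaaa
print(fields["stp"].hex())           # 0001
print(fields["esd"].hex())           # 021122fffe334455
```

Note that the ESD (the identifier) is untouched by fill_routing_goop;
only the locator portion of the address changes.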

217	3.  Addressing and Routing in IPv4

219	   Before dealing with details of GSE, we present some background about
220	   how routing and addressing work in "classical IP" (i.e., IPv4).  We
221	   present this background because the GSE proposal makes a fairly
222	   major change to the base model.  In order to properly evaluate GSE,
223	   one must understand what problems in IPv4 it claims to improve or
224	   fix.

226	   The structure and semantics of a network layer protocol's addresses
227	   are absolutely core to that protocol.  Addressing substantially
228	   impacts the way packets are routed, the ability of a protocol to
229	   scale and the kinds of functionality higher layer protocols can count
230	   on.  Indeed, addressing is intertwined with both routing and
231	   transport layer issues; a change in any one of these can impact
232	   another.  Issues of administration and operation (e.g., address
233	   allocation/re-allocation and required renumbering), while not part of
234	   the pure exercise of engineering a network layer protocol, turn out
235	   to be critical to the scalability of that protocol in a global and
236	   commercial network.  The interaction between addressing, routing and
237	   especially aggregation is particularly relevant to this document, so
238	   some time will be spent describing it.

240	   Addresses in IPv4 serve two purposes:

242	     1) Unique identification of an interface.  A sending host tells the
243	        network the identity of the intended recipient by placing an IP
244	        address into the destination address field.  In addition, the
245	        receiving host checks the destination address field of received
246	        packets to ensure that the packet is, in fact, for it.

248	     2) Location information of that interface.  Routers use the
249	        packet's destination address in deciding where to forward the
250	        packet to get it closer to its ultimate destination.  That is,
251	        addresses identify "where" the intended recipient is located
252	        within the Internet topology.

254	        For scalability, the location information contained in addresses
255	        must be aggregatable.  In practice, this means that nodes
256	        topologically close to each other (e.g., connected to the same
257	        link, residing at the same site, or customers of the same ISP)
258	        must use addresses that share a common prefix.

260	   What is important to note is that these identification and location
261	   requirements have been met through the use of the same value, namely
262	   the IP address.  As will be noted repeatedly in this document, the
263	   "overloading" of IPv4 addresses with multiple semantics has some
264	   undesirable implications.  For example, the embedding of IPv4
265	   addresses within transport protocol addresses that identify the end-
266	   point of a connection couples those transport protocols with routing
267	   to a degree. This entanglement is inconsistent with a (too) strictly
268	   layered model in which routing would be a completely independent
269	   function of the network layer and not directly impact the transport
270	   layer.

272	   Combining locator and identifier functions also complicates the
273	   support for mobility.  In a mobile environment, the location of an
274	   end-station may change even though its identity stays the same;
275	   ideally, transport connections should be able to survive such
276	   changes.  In IPv4, however, one cannot change the locator without
277	   also changing the identifier since the same packet field is used for
278	   both.

280	   Consequently, there has been a train of thought for some time that
281	   having separate values for location and identification could be of
282	   significant benefit.  The GSE proposal, among other things, attempts
283	   to make such a separation.

285	   This document frequently uses mobility as an example to demonstrate
286	   the pros and cons of separating the identifier from the locator.
287	   However, the reader should note the fundamental equivalence between
288	   the problems faced by mobile hosts and the problem faced by sites
289	   that change providers yet don't want to renumber their network.  When
290	   a site changes providers, it moves topologically in much the same way
291	   a mobile node does when it moves from one place to another.
292	   Consequently, techniques that help or hinder mobility are often
293	   relevant to the issue of site renumbering.

295	3.1.  The Need for Aggregation

297	   IPv4 has seen a number of different addressing schemes.  Since the
298	   original specification, the two major additions have been subnetting
299	   and classless routing.  The motivation for adding subnetting was to
300	   allow a collection of networks located at one site to be viewed from
301	   afar as a single IP network (i.e., to aggregate all of the individual
302	   networks into a single bigger network).  The practical benefit of
303	   subnetting was that all of a site's hosts, even if scattered among
304	   tens or hundreds of LANs, could be represented by a single routing
305	   table entry in routers located far from the site.  In contrast, prior
306	   to subnetting, a site with ten LANs would advertise ten separate
307	   network entries, and all routers would have to maintain ten separate
308	   entries, even though they contained essentially redundant
309	   information.

311	   The benefits of aggregation should be clear.  The amount of work
312	   involved in constructing forwarding tables (i.e., selecting best
313	   routes and installing them into the switching subsystem) is dependent
314	   in part on the number of network routes (i.e., destinations) to which
315	   best paths are computed.  If each site has 10 internal networks, and
316	   each of those networks is individually advertised to the global
317	   routing system, the complexity of computing forwarding tables can
318	   easily be an order of magnitude greater than if each site advertised
319	   a single entry that covered all of the addresses used within the
320	   site.

322	3.2.  The Pre-CIDR Internet

324	   In the early days of the Internet, its topology and addressing were
325	   orthogonal.  Specifically, when a site wanted to connect to the
326	   Internet, it approached the central Internet Assigned Numbers
327	   Authority (IANA) to obtain an address block and then approached a
328	   provider about procuring connectivity.  This procedure for address
329	   allocation resulted in a system where the addresses used by customers
330	   of the same provider bore little relation to the addresses used by
331	   other customers of that same provider.  In other words, though the
332	   actual topology of the Internet was mostly hierarchical, the
333	   addressing was not.  An example of such a topology and addressing
334	   scheme is shown in Figure 1.

336	                +----------------+
337	                |                |------- Customer1 (192.2.2.0)
338	                |                |------- Customer2 (128.128.0.0)
339	                |   Provider A   |------- Customer3 (18.0.0.0)
340	                |                |------- Customer4 (193.3.3.0)
341	                |                |------- Customer5 (194.4.4.0)
342	                +----------------+
343	                        |
344	                        |
345	                        |
346	                        |
347	                +----------------+
348	                |   Provider B   |
349	                +----------------+

351	                                 Figure 1

353	   Figure 1 shows Provider A having 5 customers, each with their own
354	   independently obtained network address.  Providers A and B connect to
355	   each other.  In order for Provider B to be able to send traffic to
356	   Customers1-5, Provider A must announce a separate route to Provider B
357	   for each of the 5 networks.  That is, the routers within Provider B
358	   must have explicit routing entries for each of Provider A's customers
359	   -- 5 separate routes.

361	   Experience has shown that this approach scales very poorly.  In the
362	   Default-Free Zone (DFZ) of the Public Internet, where routers must
363	   maintain routing entries for all reachable destinations, the cost of
364	   computing forwarding tables quickly becomes unacceptably large.  A
365	   large part of the cost is related to the seemingly redundant
366	   computations that must be made for each individual network, even
367	   though many of them reside in the same topological location (e.g.,
368	   under the same provider).  Looking at Figure 1, the problem is that
369	   provider B performs 5 separate calculations to construct the
370	   forwarding table needed to reach each of A's customers, even though
371	   it is going to take the same path for all of them; in other words,
372	   there is an opportunity to do data abstraction.
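
The missed opportunity for abstraction in Figure 1 can be made concrete
with a small sketch.  The prefix lengths below are an assumption: the
figure gives only network numbers, so the lengths are derived from the
classful rules (/8, /16, /24).  The sketch uses the collapse_addresses
function from Python's standard ipaddress module to merge prefixes
wherever possible; for these five networks nothing can be merged.

```python
import ipaddress

# Customer networks from Figure 1, with classful prefix lengths assumed.
figure1_networks = [
    ipaddress.ip_network("192.2.2.0/24"),    # Customer1
    ipaddress.ip_network("128.128.0.0/16"),  # Customer2
    ipaddress.ip_network("18.0.0.0/8"),      # Customer3
    ipaddress.ip_network("193.3.3.0/24"),    # Customer4
    ipaddress.ip_network("194.4.4.0/24"),    # Customer5
]

# collapse_addresses merges adjacent or overlapping prefixes where it can.
collapsed = list(ipaddress.collapse_addresses(figure1_networks))

# No two of these networks share a usable common prefix, so Provider B
# still needs one routing entry per customer.
print(len(collapsed))  # 5
```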

374	   Figure 1 shows network numbers using the older "classful" notation.
375	   Under this scheme, in use since 1981, the first few bits of an
376	   address determined which of its bits formed the "network" portion
377	   and which the "local" portion.  There were a small number of Class A
378	   addresses (intended for very large sites), a medium number of Class B
379	   addresses (for medium-sized sites) and a very large number of Class C
380	   addresses (for very small sites).  In practice, the actual size of
381	   real networks didn't match the original allocation of Class A, B, and
382	   C addresses.  Class B addresses were bigger than most sites needed
383	   (and there weren't enough of them), and Class C addresses were too
384	   small (i.e., typical sites would need to get 10 or more C blocks to
385	   cover all of the hosts on their networks).  Consequently, classless
386	   addressing was developed [CIDR], which made the boundaries between
387	   the network and local parts of an address more flexible.  With
388	   classless addressing, a separate prefix-length (i.e., network mask)
389	   specifies how many of the left-most bits of an address identify the
390	   network part of the address.

392	3.3.  CIDR and Provider-Based Addressing

394	   One of the reasons CIDR (Classless Inter-Domain Routing) and its
395	   associated provider-assigned address allocation policy were
396	   introduced was to help reduce the cost of computing a routing table
397	   and the size of the forwarding table computed from the routing table.
398	   To achieve this goal CIDR aggressively aggregates network addresses.
399	   Aggregating network addresses means "merging" multiple addresses into
400	   a single "bigger" one, that is, using a common prefix to provide
401	   location information for all addresses sharing that same prefix.

403	   With CIDR, sites that want to connect to the Internet approach a
404	   provider to procure both connectivity and a network address.
405	   Individual providers have a block of address space covered by one
406	   prefix and assign pieces of that space to customers.  Consequently,
407	   customers of the same provider have addresses that share the same
408	   prefix.  The combination of CIDR and provider-based addressing
409	   results in the ability of a provider to address many hundreds of
410	   sites while introducing just one network address into the global
411	   routing system.  An example of such a topology and addressing scheme
412	   is shown in Figure 2.

414	                +----------------+
415	                |                |------- Customer1 (204.1.0.0/19)
416	                |                |------- Customer2 (204.1.32.0/23)
417	                |   Provider A   |------- Customer3 (204.1.34.0/24)
418	                |                |------- Customer4 (204.1.35.0/24)
419	                |                |------- Customer5 (204.1.36.0/23)
420	                +----------------+
421	                        |
422	                        |  A announces
423	                        |  204.1/16 to B
424	                        |
425	                +----------------+
426	                |   Provider B   |
427	                +----------------+

429	                                  Figure 2

431	   In Figure 2, Provider A has been assigned the classless block, or
432	   "aggregate", 204.1.0.0/16 (i.e., a prefix with the high-order 16 bits
433	   denoting a single network).  Provider A has 5 customers, each of
434	   which has been assigned a prefix subordinate to the aggregate.  In
435	   order for Provider B to be able to reach Customers1-5, Provider A
436	   only needs to announce the single prefix 204.1.0.0/16, and Provider
437	   B's routers need only a single routing table entry to reach all of
438	   Provider A's customers.  Note the important difference between the
439	   cases described in Figures 1 and 2; the latter example uses fewer
440	   entries in the routing table to reach the same number of
441	   destinations.
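
The aggregation in Figure 2 can be checked mechanically.  The following
sketch, using Python's standard ipaddress module, confirms that every
customer prefix falls inside Provider A's aggregate, which is why a
single announcement suffices.  The variable names are assumptions made
for this example.

```python
import ipaddress

# Provider A's aggregate and the customer prefixes from Figure 2.
aggregate = ipaddress.ip_network("204.1.0.0/16")
customers = {
    "Customer1": ipaddress.ip_network("204.1.0.0/19"),
    "Customer2": ipaddress.ip_network("204.1.32.0/23"),
    "Customer3": ipaddress.ip_network("204.1.34.0/24"),
    "Customer4": ipaddress.ip_network("204.1.35.0/24"),
    "Customer5": ipaddress.ip_network("204.1.36.0/23"),
}

# Every customer prefix is subordinate to the aggregate ...
all_covered = all(net.subnet_of(aggregate) for net in customers.values())

# ... so Provider B needs one entry instead of one per customer.
routes_without_aggregation = len(customers)
routes_with_aggregation = 1

print(all_covered)                 # True
print(routes_without_aggregation)  # 5
print(routes_with_aggregation)     # 1
```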

443	   CIDR was a critical step for the Internet: in the early 1990s the
444	   size of default-free routing tables required to support the classful
445	   Internet was almost more than the commercially-available hardware and
446	   software of the day could handle.  The introduction of BGP4's
447	   classless routing and provider-based address allocation policies
448	   resulted in a significant decrease in the growth rate of the routing
449	   tables.  At the same time, however, CIDR introduced some new
450	   weaknesses.  First, the Internet addressing model had to shift from
451	   one of "address owning" to "address lending" [RFC2008].  In pre-CIDR
452	   days sites acquired addresses from a central authority independent of
453	   their provider, and a site could assume it "owned" the address block
454	   it was given.  Owning addresses meant that once one had been given a
455	   set of network addresses, one could always use them; no matter where
456	   one's site connected to the Internet, the prefix for that network
457	   could be injected into the public routing system.  Today, however, it
458	   is simply not possible for all individual sites to have their own
459	   prefixes injected into the DFZ; there would be too many of them.
460	   Consequently, if a site decides to change providers, it needs to
461	   renumber all of its nodes using address space given to it by the new
462	   provider.  The "old" addresses it had used are returned to its
463	   previous provider.  To understand why, consider what happens if
464	   Customer3 in Figure 2 changes its provider from Provider A to
465	   Provider C but does not renumber.  The picture would be as follows:

467	                        +----------------+
468	                        |                |---- Customer1 (204.1.0.0/19)
469	                        |                |---- Customer2 (204.1.32.0/23)
470	                        |   Provider A   |
471	        +---------------|                |---- Customer4 (204.1.35.0/24)
472	        | A announces   |                |---- Customer5 (204.1.36.0/23)
473	        | 204.1/16 to B +----------------+
474	        |                     |
475	        |                     |
476	        |                     |
477	      +----------------+      |
478	      |   Provider B   |      |
479	      +----------------+      |
480	        |                     |
481	        |                     |
482	        |                     |
483	        | C announces         |
484	        | 204.1.34/24         |
485	        | to B          +----------------+
486	        +---------------|   Provider C   |---- Customer3 (204.1.34.0/24)
487	                        +----------------+

489	                                  Figure 3

491	   In Figure 3, Providers A, B and C are all directly connected to each
492	   other.  In order for Provider B to reach Customers 1, 2, 4 and 5,
493	   Provider A still only announces the 204.1.0.0/16 aggregate.  However,
494	   in order for Provider B to reach Customer3, Provider C must announce
495	   the prefix 204.1.34.0/24.  Prefix 204.1.34.0/24 is called a "more-
496	   specific" of 204.1.0.0/16; another term used is that Customer3 and
497	   Provider C have "punched a hole" in Provider A's address block.  From
498	   Provider B's view, the address space underneath 204.1.0.0/16 is no
499	   longer cleanly aggregated into a single prefix and instead the
500	   aggregation has been broken because the addressing is inconsistent
501	   with the topology; in order to maintain reachability to Customer1-5,
502	   Provider B must carry two prefixes where it used to have to carry
503	   only one.
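
   As an illustration only, the following Python sketch models Provider
   B's view of Figure 3 using the standard ipaddress module.  The table
   layout, the next-hop labels and the next_hop() helper are assumptions
   made for this example; they do not describe any real router.

```python
import ipaddress

# Routes Provider B must hold once Customer3 has moved to Provider C
# without renumbering (Figure 3).  Next-hop labels are illustrative.
provider_b_table = {
    ipaddress.ip_network("204.1.0.0/16"): "Provider A",
    ipaddress.ip_network("204.1.34.0/24"): "Provider C",
}

def next_hop(table, destination):
    """Return the next hop of the longest prefix covering destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]

aggregate = ipaddress.ip_network("204.1.0.0/16")
hole = ipaddress.ip_network("204.1.34.0/24")

# The /24 lies inside the /16: it is a "more-specific" of the aggregate.
print(hole.subnet_of(aggregate))
# An address in Customer3's block follows the more-specific route.
print(next_hop(provider_b_table, "204.1.34.7"))
# An address in Customer4's block still follows the aggregate.
print(next_hop(provider_b_table, "204.1.35.7"))
```

   The table has two entries where Figure 2 needed one, which is the
   cost of the "hole".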

505	   The example in Figure 3 explains why sites must renumber if existing
506	   levels of aggregation are to be maintained.  While a small number of
507	   new exceptions could be tolerated, and certain prefixes have been
508	   grandfathered, the reality in today's Internet is that there are
509	   thousands of providers, many with thousands of individual customers.
510	   It is generally accepted that renumbering of sites is essential for
511	   maintaining sufficient aggregation.

513	   The empirical cost of renumbering a site in order to maintain
514	   aggregation has been the subject of much discussion.  The practical
515	   reality, however, is that forcing all sites to renumber is difficult
516	   given the size and wealth of companies that now depend on the
517	   Internet for running their business.  Thus, although the technical
518	   community came to consensus that, with the current practice of
519	   provider-based addressing, address lending was necessary in order for
520	   the Internet to continue to operate and grow, the reality has been
521	   that some of CIDR's benefits have been lost because not all sites
522	   renumber.  It is worth noting that a number of providers today do
523	   route filtering based, in part, on prefix length; as a result, a site
524	   which does not renumber may have only partial connectivity to the
525	   Internet.  That is, a site may advertise a long prefix into the
526	   routing system, but there is no assurance that all parts of the
527	   Internet will accept the route; some simply ignore it.

529	   One unfortunate characteristic of CIDR at an architectural level is
530	   that the pieces of the infrastructure that benefit from the
531	   aggregation (i.e., the providers which make up the DFZ) are not the
532	   pieces that incur the renumbering cost (i.e., the end site).  The
533	   logical corollary of this statement is that the pieces of the
534	   infrastructure that do incur cost to achieve aggregation (e.g., sites
535	   which renumber when they change providers) don't directly see the
536	   benefit. (The word "directly" is used here because the continued
537	   operation of the Internet is a benefit, though it requires
538	   selflessness on the part of the site to recognize.) This can lead to
539	   a "tragedy of the commons" situation, where everyone agrees that some
540	   sites should renumber, but each site wants to be one of those that
541	   do not.

543	3.4.  Multi-Homed Sites and Aggregation

545	   As sites become more dependent on the Internet, they have begun to
546	   install additional connections to the Internet to improve robustness
547	   and performance.  Such sites are called "multi-homed".
548	   Unfortunately, when a site connects to the Internet at multiple
549	   places, the impact on routing can be much like a site that switches
550	   providers but refuses to renumber.

552	   In the pre-CIDR days, multi-homed sites were typically known by only
553	   one network prefix, the prefix of their own address block.  When that
554	   site's providers announced the site's network into the global routing
555	   system, a "shortest path" type of routing would occur so that pieces
556	   of the Internet closest to the first provider might use the first
557	   provider while other pieces of the Internet would use the second
558	   provider.  This allowed sites to use the routing system itself to
559	   load balance traffic across their multiple connections.  This type of
560	   multi-homing assumes that a site's prefix can be propagated
561	   throughout the DFZ, an assumption that is no longer universally true.

563	   With CIDR, issues of addressing and aggregation complicate matters
564	   significantly.  At the highest level, there are three possible ways
565	   to deal with multi-homed sites.  The first possibility is to stay
566	   with the pre-CIDR approach, allowing each multi-homed site to receive
567	   address block directly from a registry, independent of its providers.
568	   The problem with this approach is that, because the address block is
569	   obtained independent of either provider, it is not aggregatable and
570	   therefore has a negative impact on the scaling of global routing.

572	   The second approach is for a multi-homed site to receive an
573	   allocation from one of its providers and just use that single prefix.
574	   The site would advertise its prefix to all of the providers to which
575	   it connects.  There are two problems with this approach.  First,
576	   although the prefix is aggregatable by the provider which made the
577	   allocation, it is not aggregatable by the other providers.  To the
578	   other providers, the site's prefix poses the same problem that a
579	   provider-independent address would.  Second, because CIDR routing
580	   selects the longest matching prefix, the site's prefix is not always
581	   aggregatable in practice even by the provider that made the
582	   allocation if shortest-path load-spreading is desired.
583	   Consider Figure 4.  Provider C has two paths for reaching Customer1.
584	   Provider A advertises 204.1/16, an aggregate which includes
585	   Customer1.  But Provider C will also receive an advertisement for
586	   prefix 204.1.0/19 from Provider B, and because the prefix match
587	   through B is longer, C will choose that path.  In order for Provider
588	   C to be able to choose between the two paths, Provider A would also
589	   have to advertise the longer prefix for 204.1.0/19 in addition to the
590	   shorter 204.1/16.  At this point, from the routing perspective, the
591	   situation is very similar to the general problem posed by the use of
592	   provider-independent addresses.

594	   It should be noted that the above example simplifies a very complex
595	   issue.  For example, consider the example in Figure 4 again.
596	   Provider A could choose not to propagate a route entry for the longer
597	   204.1.0/19 prefix, advertising only the shorter 204.1/16.  In such
598	   cases, Provider C would always select Provider B.  Internally,
599	   Provider A would continue to route traffic from its other customers
600	   to Customer1 directly.  If Provider A had a large enough customer
601	   base, effective load sharing might be achieved.

603	                                      A advertises
604	                     +------------+  204.1/16 to C  +------------+
605	                  ___| Provider A |-----------------| Provider C |
606	                 /   +------------+                 +------------+
607	                /                       +----------/
608	               /                       /
609	    Customer1 ---                     / B advertises 204.1.0/19 to C
610	   204.1.0.0/19  |                   /
611	                 |      +------------+
612	                  ----- | Provider B |
613	                        +------------+

615	                                Figure 4
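
   The path selection described for Figure 4 can be sketched in Python
   as follows.  This is a simplified model under stated assumptions:
   the candidate_paths() helper and the advertisement lists are
   hypothetical, and real route selection involves more than prefix
   length.

```python
import ipaddress

def candidate_paths(adverts, destination):
    """Return the providers tied for the longest matching prefix."""
    addr = ipaddress.ip_address(destination)
    covering = [
        (ipaddress.ip_network(prefix), via)
        for prefix, via in adverts
        if addr in ipaddress.ip_network(prefix)
    ]
    if not covering:
        return []
    longest = max(net.prefixlen for net, _ in covering)
    return sorted(via for net, via in covering if net.prefixlen == longest)

# What Provider C hears when A sends only its aggregate.
aggregate_only = [
    ("204.1.0.0/16", "Provider A"),
    ("204.1.0.0/19", "Provider B"),
]
# What Provider C hears when A also sends the longer prefix.
with_longer_prefix = aggregate_only + [("204.1.0.0/19", "Provider A")]

customer1_host = "204.1.1.1"  # illustrative address inside 204.1.0.0/19
print(candidate_paths(aggregate_only, customer1_host))
print(candidate_paths(with_longer_prefix, customer1_host))
```

   In the first case only the path through Provider B remains after
   longest-match; in the second both providers are candidates, at the
   price of an extra route in the table.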

617	   The third approach is for a multi-homed site to receive an allocation
618	   from each of its providers and not advertise the prefix obtained from
619	   one provider to any of its other providers.  This approach has
620	   advantages from the perspective of route scaling because both
621	   allocations are aggregatable.  Unfortunately, the approach doesn't
622	   necessarily meet the demands of the multi-homed site.  A site that
623	   has a prefix from each of its providers faces a number of choices
624	   about how to use that address space.  Possibilities include:

626	      1) The site can number a distinct set of hosts out of each of the
627	        prefixes.  Consider a configuration where a site is connected to
628	        ISP-A and ISP-B.  If the link to ISP-A goes down, then unless
629	        the ISP-A prefix is announced to ISP-B (which breaks
630	        aggregation), the hosts numbered out of the ISP-A prefix would
631	        be unreachable.

633	      2) The site could assign each host multiple addresses (i.e., one
634	        address for each ISP connection).  There are two problems with
635	        this.  First, it accelerates the consumption of the address
636	        space. While this may be a problem for the (limited) IPv4
637	        address space, it is not a significant issue in IPv6.  Second,
638	        when the connection to ISP-A goes down, addresses numbered out
639	        of ISP-A's space become unreachable.  Remote peers would have to
640	        have sufficient intelligence to use the second address.  For
641	        example, when initiating a connection to a host, the DNS would
642	        return multiple candidate addresses.  Clients would need to try
643	        them all before concluding that a destination is unreachable
644	        (something not all network applications currently do).  In
645	        addition, a site's hosts would need a significant amount of
646	        intelligence for choosing the source addresses they use.  A host
647	        shouldn't choose a source address corresponding to a link that
648	        is down.  At present, hosts do not have such sophistication.

650	   In summary, supporting multi-homing with IPv4/CIDR requires a
651	   delicate balance between the scalability of routing and the site's
652	   requirements of robustness and load-sharing.  At this point in time,
653	   no solution has been discovered that satisfies the competing
654	   requirements of route scaling and robustness/performance.  It is
655	   worth noting, however, that some people are beginning to study the
656	   issue more closely and propose novel ideas [BATES].

658	4.  The GSE Proposal

660	   This section provides a description of GSE with the intent of making
661	   this document stand-alone with respect to the GSE "specification".
662	   We begin by reviewing the motivation for GSE.  Next we review the
663	   salient technical details, and we conclude by listing the explicit
664	   non-goals of the GSE proposal.

666	4.1.  Motivation For GSE

668	   The primary motivation for GSE was the concern that the chief initial
669	   IPv6 global unicast address structure, provider-based [RFC 2073], was
670	   fundamentally the same as IPv4 with CIDR and provider-based
671	   aggregation.  Provider-based addressing requires that sites renumber
672	   when they switch providers, so that sites are always aggregated
673	   within their provider's prefix.  In practice, the cost of renumbering
674	   (which can only grow as a site grows in size and becomes more
675	   dependent on the Internet for day-to-day business) is high enough
676	   that an increasing number of sites refuse to renumber when they
677	   change providers.  This cost is particularly relevant in cases where
678	   end-users are asked to renumber because an upstream provider has
679	   changed its transit provider (i.e., the end site is asked to renumber
680	   for reasons outside of its control and for which it sees no direct
681	   benefit).  Consequently, the GSE draft asserts that IPv4 with CIDR
682	   has not achieved the aggressive aggregation required for the route
683	   computation functions of the DFZ of the Internet to scale for IPv4
684	   and that the much larger address space of IPv6 simply exacerbates the
685	   problem.

687	   The GSE proposal does not propose to eliminate the need for
688	   renumbering.  Indeed, it asserts that end sites will have to renumber
689	   more frequently in order to continue scaling the Internet.  However,
690	   GSE proposes to make the cost of renumbering small enough that sites
691	   can be renumbered at essentially any time with little or no
692	   disruption to their network connectivity, and in particular with no
693	   impact on communications that are strictly within the site.

695	   Finally, GSE attempts to address the problem of sites that have
696	   multiple Internet connections.  In CIDR, the pressure for better
697	   multi-homing support can create exceptions to route aggregation and
698	   result in poor scaling.  That is, the public routing infrastructure
699	   may have to carry multiple distinct routes for some demanding multi-
700	   homed sites, one for each independent path.  GSE recognizes the
701	   "special work done by the global Internet infrastructure on behalf of
702	   multi-homed sites" [GSE], and proposes a way for multi-homed sites to
703	   gain certain benefit without impacting global scaling.  This includes
704	   a specific mechanism that providers can use to support multi-homed
705	   sites, presumably at a cost that the site would consider when
706	   deciding whether or not to become multi-homed.

708	4.2.  GSE Address Format

710	   The key departure of GSE from classical IP addressing (both v4 and
711	   v6) was that rather than over-loading addresses with both locator and
712	   identifier functions, it splits the address into two elements: the
713	   high-order 8 bytes used for routing purposes (called "Routing Stuff"
714	   throughout the rest of this document) and the low-order 8 bytes for
715	   unique identification of an end-point.  The structure of GSE
716	   addresses is:

718	                0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
719	              +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
720	              |  Routing Goop    | STP| End System Designator |
721	              +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
722	                     6+ bytes   ~2 bytes       8 bytes

724	                                 Figure 5

726	4.2.1.  Routing Stuff (RG and STP)

728	   The Routing Goop (RG) identifies where within the public Internet
729	   topology a site connects and is used to route datagrams to the site.
730	   RG is structured as follows:

732	                           1                   2                   3
733	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
734	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
735	      | xxx | 13 Bits of LSID         |      Upper 16 bits of Goop    |
736	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

738	       3               4
739	       2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
740	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
741	      | Bottom 18 bits of Routing Goop    |
742	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

744	                                 Figure 6

746	   The RG describes the location of a site's connection by identifying
747	   smaller and smaller regions of topology until finally it identifies
748	   the link which connects the site.  Before interpreting the bits in
749	   the RG, it is important to understand that routing with GSE depends
750	   on decomposing the Internet's topology into a specific graph.  At the
751	   highest level, the topology is broken into Large Structures (LSs).
752	   An LS is a region that can aggregate significant amounts of topology.
753	   Examples of potential LSs are large providers and exchange points.
754	   Within an LS the topology is further divided into another graph of
755	   structures, with each LS dividing itself however it sees fit.  This
756	   division of the topology into smaller and smaller structures can
757	   recurse for a number of levels, where the trade-off is "between the
758	   flat-routing complexity within a region and minimizing total depth of
759	   the substructure" [ESD].

761	   Having described the decomposition process, we now examine the bits
762	   in the RG.  After the 3-bit prefix identifying the address as having
763	   a GSE format, the next 13 bits identify the LS.  By limiting the
764	   field to 13 bits, a ceiling is defined on the complexity of the top-
765	   most routing level (i.e., what we currently call the DFZ).  In the
766	   next 34 bits, a series of subordinate structure(s) are identified
767	   until finally the leaf subordinate structure is identified, at which
768	   point the remaining bits identify the individual link within that
769	   leaf structure.

771	   The remaining 14 bits of the Routing Stuff (i.e., the low-order 14
772	   bits of the high-order 8 bytes) comprise the STP and are used for
773	   routing structure within a site, similar to subnetting with IPv4.
774	   These bits are not part of the Routing Goop per se.  The distinction
775	   between Routing Stuff and Routing Goop is that RG controls routing in
776	   the Public Internet, while Routing Stuff includes the RG plus the
777	   Site Topology Partition (STP).  The STP is used for routing structure
778	   within a site.
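
   To make the field boundaries concrete, the following Python sketch
   packs and unpacks a 128-bit value using the widths given above (3 +
   13 + 34 bits of RG, 14 bits of STP, 64 bits of ESD).  The function
   names and the sample field values, including the 3-bit prefix value,
   are assumptions for illustration; the text does not define them.

```python
# Field widths in bits, high-order field first (Figures 5 and 6).
FIELDS = (("prefix", 3), ("lsid", 13), ("goop", 34), ("stp", 14), ("esd", 64))

def pack_gse(**values):
    """Pack the named fields into one 128-bit integer."""
    address = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        address = (address << width) | value
    return address

def unpack_gse(address):
    """Split a 128-bit integer back into its named fields."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = address & ((1 << width) - 1)
        address >>= width
    return fields

# Sample values are placeholders, not assigned numbers.
example = pack_gse(prefix=0b010, lsid=17, goop=0x123456789, stp=5,
                   esd=0x00A0C9FFFE123456)

routing_stuff = example >> 64        # high-order 8 bytes: RG + STP
routing_goop = example >> (14 + 64)  # RG only: the top 50 bits
print(unpack_gse(example))
```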

780	   The GSE proposal formalized the ideas of sites and of public versus
781	   private topology.  In the first case, a site is a set of hosts,
782	   routers and media under the same administrative control which have
783	   zero or more connections to the Internet.  A site can have an
784	   arbitrarily complicated topology, but all of that complexity is
785	   hidden from everyone outside of the site.  A site only carries
786	   packets which originated from, or are destined to, that site; in
787	   other words, a site cannot be a transit network.  A site is private
788	   topology, while the transit networks form the public topology.

790	   A datagram is routed through public topology using just the RG, but
791	   within the destination site, routing is based on the Site Topology
792	   Partition (STP).

794	4.2.2.  End-System Designator

796	   The End-System Designator (ESD) is an unstructured 8-byte field that
797	   uniquely identifies an interface from all others.  The most important
798	   feature of the ESD is that it alone identifies an interface; the
799	   Routing Stuff portion of an address, although used to help deliver a
800	   packet to its destination, is not used to identify an end point.
801	   End-points of communication care about the ESD; as examples, TCP
802	   peers could be identified by the source and destination ESDs alone
803	   (together with port numbers), checksums would exclude the RG (the
804	   sender doesn't even know its RG, as described later) and on receipt
805	   of a packet only the ESD would be used in testing whether the packet
806	   is intended for local delivery.
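
   As a rough sketch of what "identified by ESDs alone" could mean in
   practice, the Python fragment below derives a connection key from
   the low-order 64 bits of each address plus the port numbers.  The
   helper names and sample values are hypothetical; the GSE proposal
   does not prescribe a data structure.

```python
ESD_MASK = (1 << 64) - 1

def esd_of(address):
    """Return the low-order 8 bytes (the ESD) of a 128-bit address."""
    return address & ESD_MASK

def connection_key(src, src_port, dst, dst_port):
    """Key a connection on ESDs and ports, ignoring Routing Stuff."""
    return (esd_of(src), src_port, esd_of(dst), dst_port)

# The same two end points seen before and after the peer's Routing
# Stuff changes (illustrative values).
local = (0x1111 << 64) | 0xAAAA
peer_before = (0x2222 << 64) | 0xBBBB
peer_after = (0x3333 << 64) | 0xBBBB

key_before = connection_key(local, 1025, peer_before, 80)
key_after = connection_key(local, 1025, peer_after, 80)
print(key_before == key_after)
```

   Because the high-order 8 bytes never enter the key, a change in the
   peer's RG does not change which connection a packet belongs to.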

808	   The leading contender for the role of a 64-bit globally unique ESD is
809	   the recently defined "EUI-64" identifier [EUI64].  These identifiers
810	   consist of a 24-bit "company_id" concatenated with a 40-bit
811	   "extension".  (Company_id is a new name for the "Organizationally
812	   Unique Identifier" that forms the first half of an 802 MAC address).
813	   Manufacturers are expected to assign locally unique values to the
814	   extension field, guaranteeing global uniqueness for the complete 64-
815	   bit identifier.  A range of the EUI-64 space is reserved to cover
816	   pre-existing 48-bit MAC addresses, and a defined mapping ensures that
817	   an ESD derived from a MAC address will not duplicate the ESD of a
818	   device that has a built-in EUI-64.
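
   The text above does not spell out the mapping.  The Python sketch
   below assumes the common convention of inserting a reserved 16-bit
   filler between the 24-bit company_id and the 24-bit extension of a
   48-bit identifier; the default filler value 0xFFFE and the function
   name are assumptions for this example and should be checked against
   [EUI64].

```python
def mac_to_eui64(mac, filler=0xFFFE):
    """Expand a 48-bit MAC address into an 8-byte identifier.

    The filler is an assumed reserved value placed between the
    company_id and the extension.
    """
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected six colon-separated octets")
    company_id, extension = octets[:3], octets[3:]
    return company_id + filler.to_bytes(2, "big") + extension

identifier = mac_to_eui64("00:a0:c9:12:34:56")
print("-".join(f"{octet:02x}" for octet in identifier))
```

   Since the filler occupies a reserved part of the extension space, an
   identifier built this way cannot equal one that a manufacturer
   assigned natively, provided manufacturers avoid that reserved range.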

820	   In some cases, interfaces may not have an appropriate MAC address or
821	   EUI-64 identifier.  A globally unique ESD must then be obtained
822	   through some alternate mechanism.  Several possible mechanisms can be
823	   imagined (e.g., the IANA could hand out addresses from the company_id
824	   it has been allocated).  Although we do not explore them in detail
825	   here, we note that a global coordination structure is required to
826	   control the allocation of globally unique identifiers.

828	4.3.  Address Rewriting by Border Routers

830	   To obviate the need to renumber devices within sites because of
831	   changing providers, the GSE design hides the global Routing Goop (RG)
832	   from hosts in each site by having site border routers rewrite
833	   addresses of the packets they forward across the boundary between the
834	   site and public topology.  Within a site, nodes need not know the RG
835	   associated with their addresses.  They simply use a designated
836	   "Site-Local RG" value for internal addresses.  When a packet is
837	   forwarded to the public topology, the border router replaces the
838	   Site-Local RG portion of the packet's source address with an
839	   appropriate value.  Likewise, when a packet from the public topology
840	   is forwarded into a site, the border router replaces the RG part of
841	   the destination address with the designated Site-Local RG.

843	   To simplify discussion, the following text uses the singular term RG
844	   as if a site could have only one RG value (i.e., one connection to
845	   the Internet).  In fact, a site could have multiple Internet
846	   connections and consequently multiple RGs.

848	   GSE's approach to easing renumbering isn't so much to ease
849	   renumbering as to make it transparent to end users.  The RG by which
850	   a site is known is hidden from nodes within that site.  Instead, the
851	   RG for the site would be known only by the exit router, either
852	   through static configuration or through a dynamic protocol with an
853	   upstream provider.

855	   Because end hosts don't know their RG, they don't know their entire
856	   16-byte address, so they can't specify the full address in the source
857	   fields of packets they originate.  Consequently, when a datagram
858	   leaves a site, the egress border router fills in the high-order
859	   portion of the source address with the appropriate RG.

861	   The point of keeping the RG hidden from nodes within the core of a
862	   site is to ensure the changeability of the RG without impacting the
863	   site itself.  It is expected that the RG would need to change
864	   relatively frequently (e.g., several times a year) in order to
865	   support sufficient aggregation as the topology of the Internet
866	   changes.  A change to a site's RG would only require a change at the
867	   site's egress point, and this change could quite possibly be
868	   accomplished through a dynamic protocol with the upstream provider.

870	   Hiding a site's RG from its internal nodes does not, however, mean
871	   that changes to RG have no impact on end sites.  Since the full 16-
872	   byte address of a node isn't a stable value (the RG portion can
873	   change), a stored address may carry an invalid RG and be unusable if
874	   it isn't "refreshed" through some other means.  For example, opening
875	   a TCP connection, writing the address of the peer to a file and then
876	   later trying to reestablish a connection to that peer may well fail.
877	   For intra-site communication, however, it is expected that only the
878	   Site-Local RG would be used (and stored) which would continue to work
879	   for intra-site communication regardless of changes to the site's
880	   external RG.  This shields a site's intra-site traffic from any
881	   instabilities resulting from renumbering.

883	   In addition to rewriting source addresses that leave a site,
884	   destination addresses must be rewritten upon entering a site.  To
885	   understand the motivation behind this, consider a site with
886	   connections to three Internet providers.  Because each of those
887	   connections has its own RG, each destination within the site would be
888	   known by three different 16-byte addresses.  As a result, intra-site
889	   routers would have to carry a routing table three times larger than
890	   expected.  To work around this, GSE proposed replacing the RG in
891	   inbound packets with the special "Site-Local RG" value to reduce
892	   intra-site routing tables to the minimum necessary.

894	   In summary, when a node initiates a flow to a node at another site,
895	   the initiating node is expected to know the full 16-byte address for
896	   the destination through mechanisms such as a DNS query.  The
897	   initiating node does not, however, know its own RG, and uses the
898	   Site-Local RG value in the RG part of the source address.  When the
899	   datagram reaches the exit border router, the router replaces the RG
900	   of the packet's source address.  When the datagram arrives at the
901	   entry router at the destination site, the router replaces the RG
902	   portion of the destination address with the distinguished "Site-Local
903	   RG" value.  When the destination host needs to send return traffic,
904	   that host knows the full 16-byte address for the other host because
905	   it appeared in the source address field of the arriving packet.
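
   The rewriting steps summarized above can be sketched in Python as
   follows.  Addresses are modeled as 128-bit integers whose top 50
   bits are the RG.  The Site-Local RG value, the sample RG values and
   all function names are placeholders chosen for this example; the
   text does not assign them.

```python
RG_BITS = 50                        # 3-bit prefix + 13-bit LSID + 34 bits
LOW_BITS = 128 - RG_BITS            # STP (14 bits) + ESD (64 bits)
LOW_MASK = (1 << LOW_BITS) - 1
SITE_LOCAL_RG = (1 << RG_BITS) - 1  # hypothetical placeholder value

def with_rg(address, rg):
    """Return address with its RG portion replaced by rg."""
    return (rg << LOW_BITS) | (address & LOW_MASK)

def rg_of(address):
    return address >> LOW_BITS

def egress(packet, site_rg):
    """Exit border router: fill in the site's RG in the source."""
    return dict(packet, src=with_rg(packet["src"], site_rg))

def ingress(packet):
    """Entry border router: replace the destination RG with Site-Local."""
    return dict(packet, dst=with_rg(packet["dst"], SITE_LOCAL_RG))

RG_SITE1, RG_SITE2 = 0x1111, 0x2222  # illustrative RG values
HOST_A, HOST_B = 0xA, 0xB            # illustrative STP + ESD bits

# Host A knows B's full address (e.g. from DNS) but not its own RG.
sent = {"src": with_rg(HOST_A, SITE_LOCAL_RG),
        "dst": with_rg(HOST_B, RG_SITE2)}
in_transit = egress(sent, RG_SITE1)   # leaving site 1
delivered = ingress(in_transit)       # entering site 2
reply_dst = delivered["src"]          # B learns A's full address
print(hex(rg_of(reply_dst)))
```

   Note that neither host ever handles its own site's RG: each sees
   only the Site-Local value for itself and the full address of its
   peer.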

907	4.4.  Renumbering and Rehoming Mid-Level ISPs

909	   One of the most difficult-to-solve components of the renumbering
910	   problem with CIDR is that of renumbering mid-level service providers.
911	   Specifically, if SmallISP1 changes its transit provider from BigISP1
912	   to BigISP2, then in order for the overall size of the routing tables
913	   to stay the same, all of SmallISP1's customers would have to renumber
914	   into address space covered by an aggregate of BigISP2.  GSE deals
915	   with this problem by handling the RG in DNS with indirection.
916	   Specifically, a site's DNS server specifies the RG portion of its
917	   addresses by referencing the "name" of its immediate provider, which
918	   is a resolvable DNS name (this implies a new Resource Record type).
919	   That provider may define some of the low-order bits of the RG and
920	   then reference its immediate provider.  This chain of reference
921	   allows mid-level service providers to change transit providers, and
922	   the customers of that mid-level will simply "inherit" the change in
923	   RG.  Note that this mechanism does not depend on the GSE address
924	   format per se and can also be applied to IPv4 addressing.
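
   A minimal sketch of this chain of reference, in Python, is shown
   below.  It models each DNS entry as the bits a party contributes
   plus the name of its immediate provider.  The record layout, the
   names and the bit strings are all hypothetical; the proposal implies
   a new Resource Record type whose format is not described here.

```python
def resolve_rg(records, name):
    """Follow provider references upward and assemble the RG bits.

    records maps a name to (bits_defined_here, provider_name_or_None).
    Higher-level providers contribute the higher-order bits.
    """
    parts, seen = [], set()
    while name is not None:
        if name in seen:
            raise ValueError("reference loop at " + name)
        seen.add(name)
        bits, name = records[name]
        parts.append(bits)
    return "".join(reversed(parts))

records = {
    "bigisp1.example": ("0001", None),
    "bigisp2.example": ("0010", None),
    "smallisp1.example": ("01", "bigisp1.example"),
    "site.example": ("110", "smallisp1.example"),
}

before = resolve_rg(records, "site.example")

# SmallISP1 changes transit provider; the site's own record is untouched.
records["smallisp1.example"] = ("01", "bigisp2.example")
after = resolve_rg(records, "site.example")
print(before, after)
```

   Only the mid-level provider's entry changes, yet the RG resolved for
   the site follows the new transit provider.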

926	4.5.  Support for Multi-Homed Sites

928	   GSE defines a specific mechanism for providers to use to support
929	   multi-homed customers that gives those customers more reliability
930	   than singly-homed sites, but without a negative impact on the scaling
931	   of global routing.  This mechanism is not specific to GSE and could
932	   be applied to any multi-homing scenario where a site is known by
933	   multiple prefixes (including provider-based addressing).  Assume the
934	   following topology:

936	                             Provider1     Provider2
937	                             +------+       +------+
938	                             |      |       |      |
939	                             | PBR1 |       | PBR2 |
940	                             +----x-+       +-x----+
941	                                  |           |
942	                              RG1 |           | RG2
943	                                  |           |
944	                               +--x-----------x--+
945	                               | SBR1       SBR2 |
946	                               |                 |
947	                               +-----------------+
948	                                      Site

950	                                    Figure 7

952	   PBR1 is Provider1's border router while PBR2 is Provider2's border
953	   router.  SBR1 is the site's border router that connects to Provider1
954	   while SBR2 is the site's border router that connects to Provider2.
955	   Imagine, for example, that the line between Provider1 and the site
956	   goes down.  Any already existing flows that use a destination address
957	   including RG1 would stop working.  In addition, any addresses
958	   returned from DNS queries that include RG1 would not be viable
959	   addresses.  If PBR1 and PBR2 knew about each other, however, then in
960	   this case PBR1 could tunnel packets destined for RG1-prefixed
961	   addresses to PBR2, thus keeping the communication working.  (Note
962	   that IP-in-IP encapsulation is necessary since routers between PBR1
963	   and PBR2 would forward packets destined for addresses with PBR1's
964	   prefix back towards PBR1.)
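
   The fallback behavior can be sketched in Python as below.  This is a
   toy model under stated assumptions: packets are dictionaries, the
   tunnel is represented by wrapping the original packet in an outer
   header, and the function names are invented for this example.

```python
def pbr1_forward(packet, site_link_up, peer="PBR2"):
    """Provider1's border router: deliver directly or tunnel to a peer."""
    if site_link_up:
        return {"action": "deliver", "packet": packet}
    # Encapsulate so that routers between PBR1 and PBR2 forward on the
    # outer destination rather than on the RG1-prefixed inner address.
    return {"action": "tunnel", "outer_dst": peer, "inner": packet}

def pbr2_receive(tunneled):
    """Provider2's border router: strip the outer header."""
    return tunneled["inner"]

packet = {"dst": "RG1:host", "payload": "data"}

normal = pbr1_forward(packet, site_link_up=True)
fallback = pbr1_forward(packet, site_link_up=False)
print(normal["action"], fallback["action"])
print(pbr2_receive(fallback) == packet)
```

   The inner packet arrives at PBR2 unchanged, so the destination
   address containing RG1 remains usable while the direct link is down.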

966	4.6.  Explicit Non-Goals for GSE

968	   It is worth noting explicitly that GSE did not attempt to address the
969	   following issues:

971	     1) Survival of TCP connections through renumbering events.  If a
972	        site is renumbered, TCP connections using a previous address
973	        will continue to work only as long as the previous address still
974	        works (i.e., while it is still "valid" using RFC 1971
975	        terminology).  No attempt is made to have existing connections
976	        switch to the new address.

978	     2) It is not known how multicast can be made to work under GSE.

980	     3) It is not known how mobility can be made to work under GSE.

982	     4) The performance impact of having routers rewrite portions of the
983	        source and destination address in packet headers requires
984	        further study.

986	   That GSE didn't address the above issues does not mean they cannot
987	   be solved.  Rather, the issues simply weren't studied in sufficient
988	   depth.

990	5.  Analysis: The Pros and Cons of Overloading Addresses

992	   At this point we have given complete descriptions of two addressing
993	   architectures:  IPv4, which uses the overloading technique, and GSE,
994	   which uses the separated technique.  We now compare and contrast the
995	   two techniques.

997	   The following discussion is organized around three fundamental
998	   points:

1000	     1) Identifiers indicate who the intended recipient of a packet is.
1001	        At the network layer, an identifier refers to an interface; at
1002	        the transport layer it refers to a process or other endpoint of
1003	        a "connection".

1005	     2) Identifiers must be mapped into a locator that the network layer
1006	        can use to actually deliver a packet to its intended
1007	        destination.

1009	     3) There must be a suitable way to adequately authenticate the user
1010	        of an identifier, so that communicating peers have sufficient
1011	        confidence that packets sent to or received from a particular
1012	        identifier correspond to the intended recipient.

1014	5.1.  Purpose of an Identifier

1016	   An identifier gives an entity the ability to refer to a communication
1017	   end point and to refer to the same endpoint over an extended period
1018	   of time.  In terms of semantics, two or more packets sent to the same
1019	   identifier should be delivered to the same end point.  Likewise, one
1020	   expects multiple packets received from the same identifier to have
1021	   been originated by the same sending entity.  That is, a source
1022	   identifier indicates who the packet is from and a destination
1023	   identifier indicates who the packet is intended for.

1025	   In IPv4, when applications communicate, transport "identifiers"
1026	   consist of addresses and port numbers.  For the purposes of this
1027	   discussion, we use the term "identifier" to mean the identifier of an
1028	   interface.  It is assumed that port numbers will be present when
1029	   higher layer entities communicate; the exact port numbers used are
1030	   not relevant to this discussion.

1032	   In small networks, flat routing can be used to deliver packets to
1033	   their destination based only on the destination identifier carried in
1034	   a packet header (i.e., the identifier is the locator and is not
1035	   required to have any structure).  However, in such systems, a
1036	   distinct route entry is required for every destination, an approach
1037	   that does not scale.  In larger networks, packet addresses include a
1038	   locator that helps the network layer deliver a packet to its
1039	   destination.  Such a locator typically has a structure to keep
1040	   routing tables small relative to the total number of reachable
1041	   destinations.  In IPv4, the identifier and locator are combined in a
1042	   single address; it is not possible to separate the locator portion of
1043	   an address from the identifier portion.  In contrast, the ESD portion
1044	   of a GSE address (which can easily be extracted from the address)
1045	   serves as an identifier, while the Routing Stuff plays the role of a
1046	   locator.

1048	   Having a clear separation between the locator and the identifier
1049	   portion of an address appears to provide protocols some additional
1050	   flexibility.  Once a packet has been delivered to its intended
1051	   destination interface (i.e., node), for example, the locator has
1052	   served its purpose and is no longer needed to further demultiplex a
1053	   packet to its higher-layer end point.  This means that if a packet is
1054	   delivered to the correct destination node (that is, the identifier
1055	   carried in the packet address matches one of the node's interface
1056	   identifiers), the node will accept the packet, regardless of how the
1057	   packet got there.  The exact locator used does not matter, within
1058	   most Internet circumstances, so long as it gets the packet delivered
1059	   to its proper destination.

1061	   The most obvious example that could benefit from the separation of
1062	   locators and identifiers involves communication with a mobile host.
1063	   Transport protocols such as TCP are unable to keep connections open
1064	   if either of the two endpoint identifiers for an open connection
1065	   changes.  Fundamentally, the endpoint identifiers indicate the two
1066	   endpoint entities that are communicating.  If a node were to receive
1067	   a packet from a node with which it had been communicating previously,
1068	   but the identifier used by the sending node has changed, the
1069	   recipient would be unable to distinguish this case from that of a
1070	   packet received from a completely different node.

1072	   In the specific case of TCP and IPv4, connections are identified
1073	   uniquely by the tuple: (srcIPaddr, dstIPaddr, srcport, dstport).
1074	   Because IPv4 addresses contain a combined locator/identifier, it is
1075	   not possible to have a node's location change without also having its
1076	   identifier change.  Consequently, when a mobile node moves, its
1077	   existing connections no longer work, in the absence of special
1078	   protocols such as Mobile IP [MOBILITY].

1080	   In contrast, connections in GSE are identified by the ESDs rather
1081	   than full IPv6 addresses.  That is, connections are identified
1082	   uniquely by the tuple: (srcESD, dstESD, srcport, dstport).
1083	   Consequently, when demultiplexing incoming packets to their proper
1084	   end point, TCP would ignore the Routing Stuff portions of addresses.
1085	   Because the Routing Stuff portion of an address is ignored during
1086	   demultiplexing operations, a mobile node is free to move -- and
1087	   change its Routing Stuff -- without changing its identification.
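
   The ESD-only demultiplexing described above can be sketched in
   Python.  This is a minimal illustration, not part of the GSE
   proposal: the helper names (split_gse, conn_key) and the example
   byte values are hypothetical, and the sketch assumes that the
   Routing Stuff occupies the high-order 8 bytes of the 16-byte
   address and the ESD the low-order 8 bytes.

```python
from typing import NamedTuple, Tuple


class GseAddress(NamedTuple):
    routing_stuff: bytes  # locator: RG + STP (assumed high-order 8 bytes)
    esd: bytes            # identifier (assumed low-order 8 bytes)


def split_gse(addr: bytes) -> GseAddress:
    """Split a 16-byte address into its locator and identifier parts."""
    if len(addr) != 16:
        raise ValueError("expected a 16-byte address")
    return GseAddress(addr[:8], addr[8:])


def conn_key(src: bytes, dst: bytes, sport: int,
             dport: int) -> Tuple[bytes, bytes, int, int]:
    """Connection key built from ESDs only; Routing Stuff is ignored."""
    return (split_gse(src).esd, split_gse(dst).esd, sport, dport)


# Hypothetical addresses: the client keeps its ESD but changes its
# Routing Stuff after moving.
SERVER = bytes.fromhex("20010db800010001" "0000000000000002")
CLIENT_OLD = bytes.fromhex("20010db8000a0001" "0000000000000001")
CLIENT_NEW = bytes.fromhex("20010db8000b0007" "0000000000000001")

key_before = conn_key(CLIENT_OLD, SERVER, 40000, 80)
key_after = conn_key(CLIENT_NEW, SERVER, 40000, 80)

# Same connection under ESD-based demultiplexing ...
same_under_gse = key_before == key_after
# ... but a different one if the full address were part of the key,
# as in the IPv4-style (srcIPaddr, dstIPaddr, srcport, dstport) tuple.
same_under_overloading = (
    (CLIENT_OLD, SERVER, 40000, 80) == (CLIENT_NEW, SERVER, 40000, 80)
)
```

   In this sketch the two keys compare equal only because the Routing
   Stuff is dropped before the key is built, which is the behavior
   that lets a node move without changing its identification.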

1089	   As a side note, it is a requirement in GSE that packets be
1090	   demultiplexed to higher layer endpoints on ESDs alone independent of
1091	   the Routing Stuff.  If a site is multi-homed, the packets it sends
1092	   may exit the site at different egress border routers during the
1093	   lifetime of a connection.  Because each border router will place its
1094	   own RG into the source addresses of outgoing packets, the receiving
1095	   TCP must ignore (at least) the RG portion of addresses when
1096	   demultiplexing received packets.  The alternative would make TCP
1097	   unable to cope with common routing changes, i.e., if the path
1098	   changed, packets delivered correctly would be discarded by the
1099	   receiving TCP rather than accepted.

1101	   Not surprisingly, having separate locators and identifiers in
1102	   addresses leads to additional problems as well.  First, an identifier
1103	   by itself provides only limited value.  In order to actually deliver
1104	   packets to a destination identifier, a corresponding locator must be
1105	   known.  The general problem of mapping identifiers into locators is
1106	   non-trivial to solve, and is the topic of the next Section.  Second,
1107	   because the Routing Stuff is ignored when packets are demultiplexed
1108	   upward in the protocol stack, it becomes much easier for an intruder
1109	   to masquerade as someone else.

1111	5.2.  Mapping an Identifier to a Locator

1113	   The idea of using addresses that cleanly separate location and
1114	   identification information is not new.  However, there are several
1115	   different flavors.  In its pure form, a sender need only know the
1116	   identifier of an end-point in order to send packets to it.  When
1117	   presented with a datagram to send, network software would be
1118	   responsible for determining the locator associated with an identifier
1119	   so that the packet can be delivered.  A key question is: "who is
1120	   responsible for finding the Routing Stuff associated with a given
1121	   identifier?"  There are a number of possibilities, each with a
1122	   different set of implications:

1124	     1) The network layer could be responsible for doing the mapping.
1125	        The advantage of such a system is that an ESD could be stored
1126	        essentially forever (e.g., in configuration files), but whenever
1127	        it is actually used, network layer software would automatically
1128	        perform the mapping to determine the appropriate Routing Stuff
1129	        for the destination.  Likewise, should an existing mapping
1130	        become invalid, network layer software could dynamically
1131	        determine the updated value.  Unfortunately, building such a
1132	        mapping mechanism that scales is difficult if not impossible
1133	        with a flat identifier space (e.g., the ESD identifier).

1135	     2) The transport layer could be responsible for doing the mapping.
1136	        It could perform the mapping when a connection is first opened,
1137	        periodically refreshing the binding for long-running
1138	        connections.  Implementing such a scheme would change the
1139	        existing transport layer protocols TCP and UDP significantly.
1140	        However, in the case of TCP, such a scheme would have the
1141	        benefit that applications would probably not need to be
1142	        modified.  For UDP-based applications, this may not hold, since
1143	        most UDP-based protocols are implemented within applications.

1145	     3) Higher-layer software (e.g., the application itself) could be
1146	        responsible for performing the mapping.  This potentially
1147	        increases the burden on application programmers significantly,
1148	        especially if long-running connections are required to survive
1149	        renumbering and/or deal with mobile nodes.

1151	   The GSE proposal uses the last approach.  The network and transport
1152	   layers are always presented with both the Routing Stuff (RG + STP)
1153	   and the ESD together in one IPv6 address.  It is neither of these
1154	   layers' jobs to determine the Routing Stuff given only the ESD or to
1155	   validate that the Routing Stuff is correct.  When an application has
1156	   data to send, it queries the DNS to obtain the IPv6 AAAA record for a
1157	   destination.  The returned AAAA record contains both the Routing
1158	   Stuff and the ESD of the specified destination.  While such an
1159	   approach eliminates the need for the lower layers to be able to map
1160	   ESDs into corresponding Routing Stuff, it also means that when
1161	   presented with an address containing an incorrect (i.e., no longer
1162	   valid) Routing Stuff, the network is unable to deliver the packet to
1163	   its correct destination.  Note that addresses containing invalid
1164	   Routing Stuff will result whenever cached addresses are used
1165	   after the Routing Stuff of the address becomes invalid.  This may
1166	   happen if addresses are stored in configuration files, a mobile node
1167	   moves to a new location, long-running applications (clients and
1168	   servers) cache the result of DNS queries, a long-running connection
1169	   attempts to continue operating during a site renumbering event, etc.
1170	   Whatever the causes, the failures are fundamentally due to dynamic
1171	   topological changes at the network layer, yet in GSE such failures
1172	   are left to be dealt with at the application level (through DNS),
1173	   because neither the transport nor the network level has the ability
1174	   to remap identifiers to their corresponding locators.
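
   The stale-address failure described above can be illustrated with
   a toy model.  The sketch below rests on stated assumptions: the
   Network class, the dictionary standing in for the DNS, and all
   names and values are hypothetical, and no real DNS is queried.

```python
class Network:
    """Toy network: delivers only if the locator matches where the
    identifier currently resides.  It cannot remap an ESD by itself."""

    def __init__(self):
        self.current_locator = {}  # ESD -> Routing Stuff now in effect

    def deliver(self, routing_stuff: str, esd: str) -> bool:
        return self.current_locator.get(esd) == routing_stuff


net = Network()
dns = {}  # stand-in for AAAA records: name -> (Routing Stuff, ESD)

# Initial state: the node is reachable and registered in the "DNS".
net.current_locator["esd-1"] = "rg-old"
dns["host.example"] = ("rg-old", "esd-1")

cached = dns["host.example"]          # application caches the answer
ok_before = net.deliver(*cached)

# The site is renumbered (or the node moves); the DNS is updated.
net.current_locator["esd-1"] = "rg-new"
dns["host.example"] = ("rg-new", "esd-1")

ok_stale = net.deliver(*cached)                 # cached address fails
ok_requery = net.deliver(*dns["host.example"])  # a new query is needed
```

   The network object in this model has no method for looking up a
   locator from an ESD, which mirrors the point that recovery is left
   to the application.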

1176	   To avoid the above problem a network architecture must provide the
1177	   ability to map an identifier to a locator.  In IPv4, this mapping is
1178	   trivial, since the identifier and locator are combined in a single
1179	   quantity (i.e., the IPv4 address).  GSE does not provide this mapping
1180	   functionality directly.  Instead, GSE assumes that a node's DNS name
1181	   serves as its stable identifier, and uses normal DNS queries to map
1182	   the DNS "identifier" into an IPv6 address.  The IPv6 address contains
1183	   the ESD identifier together with its Routing Stuff, that is, an
1184	   initial binding/mapping between the identifier and locator.  When
1185	   this binding breaks (for example, due to dynamic topological
1186	   changes), the ESD identifier by itself cannot be mapped into a new
1187	   locator.  Instead, one must fall back to the application level and
1188	   hope that another DNS query repairs the broken identifier-to-locator
1189	   binding that is needed for network delivery.

1191	   The use of DNS to provide identifier to locator mapping contributes
1192	   to GSE's apparent simplicity.  However, there are two fundamental
1193	   problems with this approach, if the intention is to make it
1194	   transparently easy to change locators over time.  First, the burden
1195	   of performing the mapping from identifier to locator is placed
1196	   directly on the application, because lower layers (i.e., transport
1197	   and network layers) cannot perform the mapping themselves due to
1198	   layering violation concerns (i.e., TCP and UDP can't perform a DNS
1199	   query).  Second, following all RG changes the DNS database must be
1200	   promptly updated and all expired information must be flushed out of
1201	   all DNS caches.  This stringent timing requirement imposed by lower
1202	   level operation would represent a departure from the original DNS
1203	   design, which provides name-to-address mappings that change only
1204	   slowly over time, if at all, and which relies heavily on caching over
1205	   relatively long time periods to scale well.

1207	   The following subsections discuss a number of issues related to
1208	   keeping track of or determining the locator associated with an
1209	   identifier.

1211	5.2.1.  Scalable Mapping of Identifiers to Locators

1213	   It is not difficult to construct a mapping from an identifier (such
1214	   as an ESD) to a locator (as well as other information such as a name,
1215	   cryptographic keys, etc.) provided one can structure the identifier
1216	   space appropriately to support scalable lookups.  In particular,
1217	   identifiers must have sufficient structure to support the delegating
1218	   mechanism of a distributed database such as DNS.  On the other hand,
1219	   no scalable mechanism is known for performing such a mapping on
1220	   arbitrary identifiers taken from a flat space lacking any structure.

1222	   Imposing a hierarchy on identifiers poses the following difficulties:

1224	      - It increases the size of the identifier.  The exact size
1225	        necessary to support sufficient hierarchy is unclear, though it
1226	        is likely to be roughly the same as that used for the routing
1227	        hierarchy.  Analysis done during the original IPng debates
1228	        [RFC1752] suggests that close to 48 bits of hierarchy are needed
1229	        to identify all the possible sites 30-40 years from now.

1231	      - The assignment of identifiers must be tied to the delegation
1232	        structure.  That is, the site that "owns" an identifier is the
1233	        one responsible for maintaining the identifier-to-locator
1234	        mapping information about it.

1236	      - Due to the requirement of tying an identifier to the
1237	        delegation structure, the identifier of a node cannot be burned
1238	        in during manufacturing.  Instead a mechanism is needed to allow
1239	        a node to learn its identifier.  To be practical, such a
1240	        mechanism would need to be automated and avoid the need for
1241	        manual configuration.

1243	5.2.2.  Insufficient Hierarchy Space in ESDs

1245	   In the case of GSE's 8-byte ESD, the size of the identifier is not
1246	   large enough to contain sufficient hierarchy to both create DNS-like
1247	   delegation points and support stateless address autoconfiguration.
1248	   Stateless address autoconfiguration [RFC1971] already assumes that an
1249	   interface's 6-byte link-layer (i.e., MAC) address can be appended to
1250	   a link's routing prefix to produce a globally unique IPv6 address.
1251	   With GSE, only two bytes would be available for hierarchy and
1252	   delegation.
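
   The arithmetic behind this observation can be written out as a
   short sketch.  The constant names are illustrative only; the
   48-bit figure is the rough estimate quoted in Section 5.2.1 and
   is used here purely for comparison.

```python
ESD_BYTES = 8            # size of the GSE ESD
LINK_LAYER_BYTES = 6     # MAC address used by autoconfiguration

# Bits left over for hierarchy and delegation inside the ESD.
hierarchy_bits = (ESD_BYTES - LINK_LAYER_BYTES) * 8
delegation_values = 2 ** hierarchy_bits

# Rough estimate quoted in Section 5.2.1 for identifying all sites.
ESTIMATED_BITS_NEEDED = 48
shortfall_bits = ESTIMATED_BITS_NEEDED - hierarchy_bits
```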

1254	   It is also the case that the sorts of built-in identifiers now found
1255	   in computing hardware, such as "EUI-48" and "EUI-64" addresses
1256	   [IEEE802, IEEE1212], do not have the structure required for this
1257	   delegation.  Such identifiers have only two levels of hierarchy; the
1258	   top-level typically identifies a manufacturer, with the remaining
1259	   part of the address being the equivalent of the serial number unique
1260	   to the manufacturer.  The delegation of the two-level hierarchy
1261	   (i.e., equipment manufacturer) does not correspond to the
1262	   administrator under which the end-user operates.  Hence, stateless
1263	   autoconfiguration [RFC1971] cannot create addresses with the
1264	   necessary hierarchical property in the ESD portion of an address.

1266	   Finally, imposing a required hierarchical structure on identifiers
1267	   such as an ESD would also introduce a new administrative burden and a
1268	   new or expanded registry system to manage ESD space (i.e., to ensure
1269	   that ESDs are globally unique).  While the procedures for assigning
1270	   ESDs, which need only organizational and not topological
1271	   significance, would be simpler than the procedures for managing IPv4
1272	   addresses, it seems a laudable goal to avoid the problem altogether
1273	   if possible.  In addition, it would likely increase the complexity
1274	   for connecting new nodes to the Internet, a goal inconsistent with
1275	   Stateless Address autoconfiguration [RFC1971].

1277	   The topic of mapping full 16-byte GSE addresses to a locator or other
1278	   information is discussed in Appendix D.

1280	5.3.  Authentication of Identifiers

1282	   The true value of a globally unique identifier lies not in its
1283	   uniqueness but in the ability to use the same identifier repeatedly
1284	   and have it refer to the same end point.  That is, there is an
1285	   expectation that repeated and subsequent use of the same identifier
1286	   results in continued communication with the same end point.  To be
1287	   useful then, a valid identifier must either be easily distinguishable
1288	   from a fraudulent one, or the system must have a way to prevent
1289	   identifiers from being used in an unauthorized manner.

1291	   The remainder of this section discusses how identifier authentication
1292	   is done in both IPv4 and GSE, and shows how overloading an address
1293	   with both an identifier and a locator provides a significant
1294	   automatic identifier authentication.  In contrast, there is
1295	   essentially no identifier authentication in GSE.  It should be noted
1296	   that the actual strength of authentication that would be considered
1297	   sufficient is a topic in its own right, and we do not cover it here.
1298	   Instead, we focus on the relative strengths in the two schemes.

1300	5.3.1.  Identifier Authentication in IPv4

1302	   As described earlier, an IPv4 address simultaneously plays two roles:
1303	   a unique identifier and a locator.  Using an overloaded address as an
1304	   identifier has the side-effect of ensuring that (for all practical
1305	   purposes) the identifier is globally unique.  Furthermore, because
1306	   the same number is used both to identify an interface and to deliver
1307	   data to that interface, it is impossible for some interface A to use
1308	   the identification of another interface B in an attempt to receive
1309	   data destined to B without being detected, unless the routing system
1310	   is compromised.

1312	   When both interfaces A and B claim the same unicast address, the
1313	   routing subsystem generally delivers packets to only one of them.
1314	   The other node will quickly realize that something is wrong (since
1315	   communication using the duplicate address fails) and take corrective
1316	   actions, either correcting a misconfiguration or otherwise detecting
1317	   and thwarting the intruder.  To understand how the routing subsystem
1318	   prevents the same address from being used in multiple locations,
1319	   there are two cases to consider, depending on whether the two
1320	   interfaces using duplicate addresses are attached to the same or to
1321	   different links.

1323	   When two interfaces on the same link use the same address, a node
1324	   (host or router) sending traffic to the duplicate address will in
1325	   practice send all packets to one of the nodes.  On Ethernets, for
1326	   example, the sender will use ARP (or Neighbor Discovery in IPv6) to
1327	   determine the link-layer address corresponding to the destination
1328	   address.  When multiple ARP replies for the target IP address are
1329	   received, the most recently received response replaces whatever is
1330	   already in the cache.  Consequently, the destinations a node using a
1331	   duplicate IP address can communicate with depend on what its
1332	   neighboring nodes have in their ARP caches.  In most cases, such
1333	   communication failures become apparent relatively quickly, since it
1334	   is unlikely that communication can proceed correctly on both nodes.

1336	   It is also the case that a number of ARP implementations (e.g., BSD-
1337	   derived implementations) log warning messages when an ARP request is
1338	   received from a node using the same address as the machine receiving
1339	   the ARP request.
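
   The same-link behavior described above can be sketched as a toy
   ARP cache.  This is a simplified model under stated assumptions:
   the class and function names are hypothetical, cache timeouts and
   the ARP wire format are omitted, and the addresses are examples.

```python
from typing import Dict, Optional


class ArpCache:
    """Toy ARP cache in which the most recent reply wins."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def on_reply(self, ip: str, mac: str) -> None:
        # A later reply for the same IP replaces the earlier entry.
        self._entries[ip] = mac

    def lookup(self, ip: str) -> Optional[str]:
        return self._entries.get(ip)


def duplicate_claim(own_ip: str, own_mac: str,
                    sender_ip: str, sender_mac: str) -> bool:
    """True if another station claims our IP address (worth logging)."""
    return sender_ip == own_ip and sender_mac != own_mac


cache = ArpCache()
cache.on_reply("192.0.2.10", "02:00:00:00:00:0a")  # node A answers
cache.on_reply("192.0.2.10", "02:00:00:00:00:0b")  # node B answers later
next_hop = cache.lookup("192.0.2.10")  # all traffic now goes to B

# Node A sees an ARP request from B that carries A's own IP address.
a_sees_conflict = duplicate_claim(
    "192.0.2.10", "02:00:00:00:00:0a",
    "192.0.2.10", "02:00:00:00:00:0b",
)
```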

1341	   When two interfaces on different links use the same address, the
1342	   routing subsystem generally delivers packets to only one of the nodes
1343	   because only one of the links has the right subnet corresponding to
1344	   the IP address.  Consequently, the node using the address on the
1345	   "wrong" link will generally never receive any packets sent to it and
1346	   will be unable to communicate with anyone.  For obvious reasons, this
1347	   condition is usually detected quickly.

1349	   It should be noted that although an address containing a combined
1350	   identifier and locator can be forged, the routing subsystem
1351	   significantly limits communication using the forged address.  First,
1352	   return traffic will be sent to the correct destination and not the
1353	   originator of the forged address.  This alone prevents certain types
1354	   of spoofing attacks.  For example, if a destination receives an
1355	   unexpected packet corresponding to a TCP connection that it is
1356	   unaware of, it may return a TCP segment resetting the connection.
1357	   Second, routers performing ingress filtering can refuse to forward
1358	   traffic claiming to originate from a source whose claimed address
1359	   does not match the expected addresses (from a topology perspective)
1360	   for sources located within a particular region [RFC 2267].  To
1361	   effectively masquerade as someone else requires subverting the
1362	   intermediate routing subsystem.
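
   The ingress filtering check mentioned above can be sketched as
   follows.  This is a minimal illustration using Python's standard
   ipaddress module; the function name and the prefix list are
   assumptions, and a real router would apply such a check per
   interface in its forwarding path.

```python
import ipaddress
from typing import Iterable


def ingress_permits(source: str, expected: Iterable[str]) -> bool:
    """True if the claimed source lies within an expected prefix."""
    addr = ipaddress.ip_address(source)
    return any(addr in ipaddress.ip_network(p) for p in expected)


# Hypothetical attachment point: only 192.0.2.0/24 is expected here.
EXPECTED = ["192.0.2.0/24"]
legit = ingress_permits("192.0.2.77", EXPECTED)
forged = ingress_permits("198.51.100.5", EXPECTED)
```

   The check works only because the source address carries topology
   information that can be compared against the attachment point.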

1364	5.3.2.  Identifier Authentication in GSE

1366	   In GSE, it is not possible for the routing subsystem to provide any
1367	   enforcement on the authenticity of identifiers with respect to their
1368	   corresponding Routing Stuff, since the Routing Stuff and ESD portions
1369	   of an address are by definition completely orthogonal quantities.
1370	   This fundamental problem is compounded by the fact that GSE provides
1371	   no way (at the transport or network layer) to map an ESD into its
1372	   corresponding Routing Stuff.  Thus, when looking at the source
1373	   address of a received packet, there is no way to ascertain whether
1374	   the Routing Stuff portion of the address corresponds to legitimate
1375	   Routing Stuff with respect to the corresponding ESD.  Consequently,
1376	   it becomes trivial in many cases for one node to masquerade as
1377	   another.

1379	5.4.  Transport Layer: What Locator Should Be Used?

1381	   In the following, we focus on what Routing Stuff to use with TCP; UDP
1382	   depends on the Routing Stuff in a similar way.  Indeed, we believe
1383	   that TCP is the "easier" case to deal with, for two reasons.  First,
1384	   TCP is a stateful protocol in which both ends of the connection can
1385	   negotiate with each other.  UDP-based communications are stateless,
1386	   and remember nothing from one packet to the next.  Consequently,
1387	   changing UDP to remember locator information in addition to the
1388	   identifier of the peer may require the introduction of "session"
1389	   features, perhaps as part of a common "library".  Second, changes to
1390	   UDP in practice mean changing individual applications themselves,
1391	   raising deployability questions.

1393	   There are three cases of interest from TCP's perspective:

1395	    - the sending side of an active open

1397	    - the sending side of a passive open (i.e., how to respond to an
1398	      active open)

1400	    - changes to the Routing Stuff during an open connection.

1402	5.4.1.  RG Selection On An Active Open

1404	   If the host is performing a TCP "active open", the application first
1405	   queries the DNS to obtain the destination address, which contains the
1406	   appropriate RG for the remote peer.  That is, the initiator of
1407	   communication is assumed to provide the correct Routing Stuff when
1408	   initiating communication to a specific destination.

1410	5.4.2.  RG Selection On A Passive Open

1412	   When a server passively accepts connections from arbitrary clients,
1413	   it has no choice but to assume that the Routing Stuff in the source
1414	   address of a received packet that initiated the communication is
1415	   correct, because it has no way to authenticate its validity.  Note
1416	   that the Routing Stuff is "correct" only in the sense that it
1417	   corresponds to the site originating the connection, which the server
1418	   will send the reply to.  Whether the Routing Stuff paired with the
1419	   received ESD actually matches the Routing Stuff located at the site
1420	   where the legitimate owner of the ESD currently resides is not known
1421	   and cannot be determined.  Because the ESD alone cannot be mapped
1422	   into a locator (or some other quantity that can provide input to an
1423	   authentication procedure), there is no way to determine whether the
1424	   received Routing Stuff corresponds to that legitimately associated
1425	   with the source identifier of the received packet.  The issue of
1426	   spoofing is discussed in more detail later.

1428	5.4.3.  Mid-Connection RG Changes

1430	   While packets are flowing as part of an open connection, the RG
1431	   appearing on subsequent packets is susceptible to change through
1432	   renumbering events, or as a result of site-internal routing changes
1433	   that cause the egress point for off-site traffic to change.  It is
1434	   even possible that traffic-balancing schemes could result in the use
1435	   of two egress routers, with roughly every other packet exiting
1436	   through a different egress router.

1438	   Because TCP under GSE demultiplexes packets using only ESDs, newly
1439	   arrived packets will be delivered to the correct end-point regardless
1440	   of whether their source RG has changed.  The GSE proposal calls for
1441	   return traffic to continue to be sent via the "old" RG, even though
1442	   it may have been deprecated or become less optimal because the peer's
1443	   border router has changed.  That is, the RG to use for reaching a
1444	   peer is bound to a connection when the connection is established and
1445	   does not change thereafter.  However, the completion of renumbering
1446	   events (so that an earlier RG is now invalid) and certain topology
1447	   changes would require TCP to switch sending to a new RG mid-
1448	   connection.  To explore the scenario, we consider ways of allowing
1449	   the RG change to be made to existing established connections.

1451	   If TCP connection identifiers are based on ESDs rather than full
1452	   addresses, traffic from the same ESD would be viewed as coming from
1453	   the same peer, regardless of the source RG.  Because this
1454	   vulnerability is already present in today's Internet (forging the
1455	   source address of a packet is trivial), the mere delivery of incoming
1456	   datagrams with the same ESD but a different RG does not introduce new
1457	   vulnerability to TCP.  In today's Internet, any node can already
1458	   originate FINs/RSTs from an arbitrary source address and potentially
1459	   disrupt the connection.  Therefore, acceptance of
1460	   traffic independent of its source RG does not appear to significantly
1461	   worsen existing robustness.  Note, however, that ingress filtering as
1462	   described in Section 5.3.1 cannot be performed on packets containing
1463	   GSE addresses.  This does make it more difficult to prevent certain
1464	   types of attacks.

1466	   We also considered allowing TCP to reply to each segment using the RG
1467	   of the most recently-received segment.  Although this allows TCP
1468	   connections to survive certain important events (e.g., renumbering),
1469	   it also makes it trivial for anyone to hijack connections,
1470	   unacceptably weakening robustness compared with today's Internet.  A
1471	   sender simply needs to guess the sequence numbers in use by a given
1472	   TCP connection [Bellovin 89] and send traffic with a bogus RG to
1473	   hijack a connection to an intruder at an arbitrary location.

1475	   Providing protection from hijacking implies that the RG used to send
1476	   packets must be bound to a connection end-point (e.g., it is part of
1477	   the connection state).  Although it may be reasonable to accept
1478	   incoming traffic independent of the source RG, the choice of sending
1479	   RG requires more careful consideration.  Indeed, any subsequent
1480	   change in the RG used for sending traffic must be properly
1481	   authenticated (e.g., using cryptographic means).  In the GSE
1482	   proposal, there is no apparent way to authenticate such a change,
1483	   since the remote peer doesn't even know its own RG.  Consequently,
1484	   the only reasonable approach in GSE is to send to the peer using the
1485	   first RG used for the entire life of a connection.  That is, always
1486	   use the first RG seen, and accept the loss of connectivity whenever
1487	   the RG changes.
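
   The two sending policies discussed in this section can be compared
   in a small sketch.  It is an illustration under stated
   assumptions: the Connection class and its fields are hypothetical,
   and sequence-number checks are omitted, so the forged segment
   stands for one whose sequence numbers were guessed correctly.

```python
class Connection:
    """Toy connection state keyed by the peer ESD."""

    def __init__(self, peer_esd: str, first_rg: str,
                 follow_latest: bool) -> None:
        self.peer_esd = peer_esd
        self.send_rg = first_rg          # RG used for return traffic
        self.follow_latest = follow_latest

    def on_segment(self, src_rg: str, src_esd: str) -> bool:
        """Accept any segment with the right ESD, whatever its RG."""
        if src_esd != self.peer_esd:
            return False
        if self.follow_latest:
            # Unauthenticated update: this is what enables hijacking.
            self.send_rg = src_rg
        return True


follow = Connection("esd-peer", "rg-site", follow_latest=True)
bound = Connection("esd-peer", "rg-site", follow_latest=False)

# An intruder forges the peer's ESD but uses its own RG.
for conn in (follow, bound):
    conn.on_segment("rg-intruder", "esd-peer")

hijacked = follow.send_rg == "rg-intruder"
protected = bound.send_rg == "rg-site"
```

   The bound connection stays protected, but for the same reason it
   would keep sending to "rg-site" after a legitimate renumbering.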

1489	5.4.4.  The Impact of Corrupted Routing Goop

1491	   Another interesting issue that arises is what impact corrupted RG
1492	   would have on robustness.  Because the RG is not covered by the TCP
1493	   checksum (the sender doesn't know what source RG will be inserted),
1494	   no TCP mechanism can detect such corruption at the receiver.
1495	   Moreover, once a specific RG is in use, it does not change for the
1496	   duration of a connection.  One interesting case occurs on the passive
1497	   side of a TCP connection, where a server accepts incoming connections
1498	   from remote clients.  If the initial SYN from the client includes a
1499	   corrupted RG, the server TCP will create a TCP connection (in the
1500	   SYN-RECEIVED state) and cache the corrupted RG with the connection.
1501	   The second packet of the 3-way handshake, the SYN-ACK packet, would
1502	   be sent to the wrong RG and consequently not reach the correct
1503	   destination.  Later, when the client retransmits the unacknowledged
1504	   SYN, the server will continue to send the SYN-ACK using the bad RG.
1505	   Eventually the client times out, and the attempt to open a TCP
1506	   connection fails.
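
   The corrupted-SYN scenario above can be modeled in a few lines.
   This is a sketch under simplifying assumptions: the function name
   is hypothetical, each list element stands for the source RG that
   the server observes on one SYN (re)transmission, and timers are
   reduced to running out of retransmissions.

```python
from typing import List, Optional


def handshake_completes(syn_source_rgs: List[str],
                        client_rg: str) -> bool:
    """Model the passive side: the RG of the first SYN is cached and
    every SYN-ACK is sent to it, even for retransmitted SYNs."""
    cached_rg: Optional[str] = None
    for observed_rg in syn_source_rgs:
        if cached_rg is None:
            cached_rg = observed_rg      # bound in SYN-RECEIVED
        if cached_rg == client_rg:
            return True                  # SYN-ACK reaches the client
    return False                         # client eventually times out


clean = handshake_completes(["rg-good", "rg-good"], "rg-good")
corrupted_first = handshake_completes(
    ["rg-bad", "rg-good", "rg-good"], "rg-good")
```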

1508	   We next consider relaxing the restriction on switching RGs in an
1509	   attempt to avoid the previous failure scenario.  The situation is
1510	   complicated by the fact that the RG on received packets may change
1511	   for legitimate reasons (e.g., a multi-homed site load-shares traffic
1512	   across multiple border routers).  The key question is how one can
1513	   determine which RG is valid and which is not.  That is, for each of
1514	   the destination RGs a sender attempts to use, how can it determine
1515	   which RG worked and which did not?  Solving this problem is more
1516	   difficult than first appears, since one must cover the cases of
1517	   delayed segments, lost segments, simultaneous opens, etc.  If a SYN-
1518	   ACK is retransmitted using different RGs, it is not possible to
1519	   determine which of the two RGs worked correctly.  We conclude that
1520	   the only way TCP can determine that a particular RG is correct is by
1521	   receiving an ACK for a specific sequence number in which all
1522	   transmissions of that sequence number used the same RG.  This would
1523	   involve non-trivial changes to TCP implementations.
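
   The acceptance rule stated above can be illustrated with a minimal
   sketch.  It is not taken from GSE or from any TCP implementation;
   the names RgValidator, on_transmit and on_ack are hypothetical, and
   the model ignores cumulative ACKs and all other TCP state.

```python
class RgValidator:
    """Tracks which RGs were used to transmit each sequence number."""

    def __init__(self):
        self.sent = {}  # sequence number -> set of RGs used to send it

    def on_transmit(self, seq, rg):
        # Record every (re)transmission of this sequence number.
        self.sent.setdefault(seq, set()).add(rg)

    def on_ack(self, seq):
        """Return the RG proven to work by this ACK, or None if ambiguous."""
        rgs = self.sent.get(seq, set())
        if len(rgs) == 1:
            return next(iter(rgs))  # every transmission used the same RG
        return None                 # mixed RGs (or unknown seq): no proof
```

   Under this model, a segment retransmitted through two different RGs
   can never validate either of them, which is the ambiguity described
   in the text.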

1525	   At best, an RG selection algorithm for TCP would require new logic in
1526	   implementations of TCP's opening handshake --- a significant
1527	   transition and deployment issue.  We are not certain that a valid
1528	   algorithm is attainable, however.  RG changes would have to be
1529	   handled in all cases handled by the opening handshake: delayed
1530	   segments, lost segments, undetected bit errors in RG, simultaneous
1531	   opens, old segments, etc.

1533	   In the end, we conclude that although the corrupted SYN case
1534	   introduces potential problems, the changes that would need to be made
1535	   to TCP to robustly deal with such corruption would be significant, if
1536	   tractable at all.  This would result in a transition to GSE also
1537	   having a significant TCPng component, a significant drawback.

1539	5.5.  On The Uniqueness Of ESDs

1541	   Although ESDs are expected to be globally unique, their uniqueness
1542	   property may be violated either due to mistakes in allocation or by
1543	   malicious attacks.  The exact uniqueness requirements for ESDs
1544	   depend on what purpose they serve and how they are used.  If the
1545	   correctness of some applications relies on the global uniqueness of
1546	   ESDs, then active checking and enforcement will be necessary.  On the
1547	   other hand, if ESDs are used only to uniquely identify individual
1548	   endpoints within a session, then one may consider global uniqueness
1549	   unnecessary.

1551	5.5.1.  Impact of Duplicate ESDs

1553	   Consider what happens when two nodes using the same ESD attempt to
1554	   communicate with each other.  In the GSE proposal, a node queries the
1555	   DNS to obtain an IPv6 address.  The returned address includes the
1556	   Routing Stuff of an address (the RG+STP portions).  The sender may
1557	   not notice the destination ESD is the same as its own ESD and may
1558	   well forward the packet to a router that delivers the packet to its
1559	   correct destination (using the information in the Routing Stuff).  On
1560	   receipt of the packet, however, the destination node would extract
1561	   the ESD portion of the destination address and detect the conflict.

1563	   A more problematic case occurs if two nodes having the same ESD
1564	   communicate with a third party.  To the third party, packets received
1565	   from either machine might appear to be coming from the same machine
1566	   since they all carry the same ESD.  Consequently, at the transport
1567	   level, if both machines choose the same source and destination port
1568	   numbers (one of the ports --- a server's well-known port number ---
1569	   will likely be the same), packets belonging to two distinct transport
1570	   connections will be demultiplexed to a single transport end-point.
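
   The collision can be sketched as follows.  This is an illustrative
   model only: the Packet fields, the demux_key function and all values
   are hypothetical, and a real transport would key on more state.

```python
from collections import namedtuple

# Hypothetical, simplified view of the fields relevant to demultiplexing.
Packet = namedtuple("Packet", "src_rg src_esd src_port dst_port")

def demux_key(pkt):
    # ESD-only demultiplexing: the source Routing Stuff is ignored.
    return (pkt.src_esd, pkt.src_port, pkt.dst_port)

# Two distinct hosts at different sites that share one ESD and happen
# to pick the same source port when contacting the same server port.
host_a = Packet("rg-site-1", "esd-dup", 1025, 80)
host_b = Packet("rg-site-2", "esd-dup", 1025, 80)

endpoints = {}
for pkt in (host_a, host_b):
    endpoints.setdefault(demux_key(pkt), []).append(pkt.src_rg)
```

   After the loop, endpoints holds a single key whose list contains
   both RGs: two connections have been merged into one end-point.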

1572	   When packets from different sources using the same source ESD are
1573	   delivered to the same transport end-point, a number of possibilities
1574	   come to mind:

1576	     1) Following the GSE specification, the transport end-point would
1577	        accept the packet, without regard to the Routing Stuff of the
1578	        source address.  This may lead to a number of robustness
1579	        problems (and at best will confuse the application).

1581	     2) The transport end-point could verify that the Routing Stuff of
1582	        the source address matches one of a set of expected values
1583	        before processing the packet further.  If the Routing Stuff
1584	        doesn't match any expected value, the packet could be dropped.
1585	        This would result in a connection from one host operating
1586	        correctly, while a connection from another host (using the same
1587	        ESD) would fail.

1589	     3) When a packet is received with an unexpected Routing Stuff the
1590	        receiver could invoke special-purpose code to deal with this
1591	        case.  Possible actions include attempting to verify whether the
1592	        Routing Stuff is indeed correct (the saved values may have
1593	        expired) or attempting to verify whether duplicate ESDs are in
1594	        use (e.g., by inventing a protocol that sends packets using both
1595	        Routing Stuff and verifies that they are delivered to the same
1596	        end-point).

1598	5.5.2.  New Denial of Service Attacks

1600	   It is clear that there are potential problems if identifiers are not
1601	   globally unique.  How often such problems would occur in practice
1602	   depends on how many duplicates there actually are.  Thus,
1603	   one might be tempted to make the argument that a scheme for assigning
1604	   identifiers could be made to be "unique enough" in practice.  This
1605	   would be a dangerous and naive assumption, because in the absence of
1606	   any ESD enforcement (i.e., ensuring each host uses only the assigned
1607	   ESD), intruders will actively impersonate other sites for the sole
1608	   purpose of invalidating the uniqueness assumption.  For example, one
1609	   could deny service to host foo.bar.com by querying the DNS for its
1610	   corresponding ESD, and then impersonating that ESD.

1612	   As a specific example, one GSE-specific denial-of-service attack
1613	   would be for an intruder to masquerade as another host and "wedge"
1614	   connections in a SYN-RECEIVED state by sending SYN segments
1615	   containing an invalid RG in the source IP address for a specific ESD.
1616	   Subsequent connection attempts to the wedged host from the legitimate
1617	   owner of the ESD (if they used the same TCP port numbers) would then
1618	   not complete, since return traffic would be sent to the wrong place.

1620	5.6.  Summary of Identifier Authentication Issues

1622	   In summary, changing the RG dynamically in a safe way for a
1623	   connection requires that an originator of traffic be able to
1624	   authenticate a proposed change in the RG before sending to a
1625	   particular ESD via that RG.  This is difficult for several reasons:

1627	     1) It can't be done on an end-to-end basis in GSE (e.g., via IPsec)
1628	        because the sender doesn't know what value the RG portion of the
1629	        address will have when it reaches the receiver.

1631	     2) It can't be easily done in GSE because there is no mechanism at
1632	        or below the transport layer to map ESDs into a quantity that
1633	        can be used as a key to jump start the authentication process
1634	        (using the DNS would be problematic due to layering circularity
1635	        considerations).

1637	     3) Any scheme that uses the full IPv6 address to do the
1638	        authentication can be used with today's standard provider-based
1639	        addressing, raising the question of what benefit is retained
1640	        from having separate identifiers and locators.

1642	   Our final conclusion is that with the GSE approach, transport
1643	   protocol end-points must make an early, single choice of the RG to
1644	   use when sending to a peer and stick with that choice for the
1645	   duration of the connection.  Specifically:

1647	     1) The demultiplexing of arriving packets to their transport end
1648	        points should use only the ESD, and not the Routing Stuff.

1650	     2) If the application chooses an RG for the remote peer (i.e., an
1651	        active open), use the provided RG for all traffic sent to that
1652	        peer, even if alternative RGs are received on subsequent
1653	        incoming datagrams from the same ESD.  For all other cases, use
1654	        the first RG received with a given ESD for all sending.

1656	     3) Simultaneously, we understand that, with the above rules, there
1657	        are still open issues with regard to invalid RGs, either through
1658	        corruption or through active hostile attacks.
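
   The RG selection rule in item 2 can be sketched as below.  This is
   a minimal illustration under the assumption that one object tracks
   one peer ESD; the class name PeerRg and its methods are
   hypothetical and do not appear in the GSE proposal.

```python
class PeerRg:
    """RG used when sending to one remote ESD for one connection."""

    def __init__(self, app_rg=None):
        # Active open: the application supplies the RG (e.g., from DNS).
        # Passive open: no RG is known until the first packet arrives.
        self.rg = app_rg

    def on_receive(self, src_rg):
        # Latch the first RG seen if none was supplied; never switch later.
        if self.rg is None:
            self.rg = src_rg
        return self.rg
```

   Note that a corrupted or forged RG on the first received packet is
   latched just like a valid one, which is the open issue in item 3.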

1660	   One difficulty with the above recommendation is that there does not
1661	   appear to be a straightforward way to use ESDs in conjunction with
1662	   mobility or site renumbering (in which existing connections survive
1663	   the renumbering).  This presents a quandary.  The main benefit of
1664	   separating identifiers and locators is the ability to have
1665	   communication (e.g., a TCP connection) continue transparently, even
1666	   when the Routing Stuff associated with a particular ESD changes.
1667	   However, switching to a new Routing Stuff without properly
1668	   authenticating it makes it trivial to hijack connections.

1670	   We cannot emphasize enough that the use of an ESD independent of an
1671	   associated RG can be very dangerous.  That is, communicating with a
1672	   peer implies that one is always talking to the same peer for the
1673	   duration of the communication.  But as has been described in previous
1674	   sections, such assurance can only come from properly authenticating
1675	   the RG associated with an ESD.  That is not possible in GSE.

1677	6.  Conclusion

1679	   The GSE proposal provides a concrete example of a network protocol
1680	   design that separates identifiers from locators in addresses.  In
1681	   this paper we compared GSE with IPv4's CIDR-style addressing to
1682	   better understand the pros and cons of the respective design
1683	   approaches.

1685	   Functionally speaking, identifiers and locators each have a logically
1686	   different role to play.  Thus overloading both in one field causes
1687	   problems whenever the location of a node changes but its identity
1688	   does not.  However, our analysis shows that overloading also presents
1689	   two critically important benefits.

1691	   First, for network entity A to send data to network entity B, A must
1692	   not only know B's end identifier but also B's locator.  No scalable
1693	   way is known at this time to provide this mapping at the network
1694	   layer, other than overloading the two quantities into an address as
1695	   is done in IPv4.  Fundamentally, a scalable mapping algorithm
1696	   strongly suggests that the identifier space be structured
1697	   hierarchically, yet identifiers in GSE are not sufficiently large to
1698	   both contain sufficient hierarchy and support stateless address
1699	   autoconfiguration.  Instead, GSE forces applications to supply up-
1700	   to-date locators.  However, relying on the locator provided at the
1701	   time communication is established as GSE does is inadequate when the
1702	   remote locator can change dynamically, precisely the scenario that is
1703	   supposed to benefit from the separation.  That is, the benefits of
1704	   separating the identifier from the locator are largely lost, if the
1705	   changes in the identifier to locator binding are not tracked quickly.

1707	   Secondly, when communicating with a remote site, if the RG changes,
1708	   it becomes uncertain whether a reliable TCP handshake
1709	   is possible (because of the need for passively opened TCP to use the
1710	   RGs it obtains from the packets).  Because the reliability of TCP's
1711	   byte stream is critically dependent on its three-way handshake, this
1712	   is a significant issue.

1714	   Finally, when communicating with a remote site, a receiver must be
1715	   able to ensure (with reasonable certainty) that received data does
1716	   indeed come from the expected remote entity.  In IPv4, it is possible
1717	   to receive packets from a forged source, but the potential for
1718	   mischief between communicating peers is significantly limited because
1719	   return traffic will not generally reach the source of the forged
1720	   traffic.  That is, communication involving packets sent in both
1721	   directions will not succeed.  In contrast, architectures like GSE
1722	   that decouple the identifier and locator functions lose the built-in
1723	   protection available in classical IP and thus face great difficulty
1724	   assuring that traffic from a source identified only by an identifier
1725	   actually comes from the correct source.  Short of using cryptographic
1726	   techniques (e.g., IPsec), there is no known mechanism that can use an
1727	   identifier alone to perform this remote entity authentication.  Using
1728	   an identifier alone for authentication of received packets is
1729	   dangerously unsafe.

1731	   In summary, although overloading the address field with a combined
1732	   identifier and locator leads to difficulties in retaining the
1733	   identity of a node whenever its address changes, analysis in this
1734	   paper suggests that the benefit of the overloading actually out-
1735	   weighs its cost.  Completely separating an identifier from its
1736	   locator renders the identifier untrustworthy, thus useless, in the
1737	   absence of an accompanying authentication system.

1739	7.  Security Considerations

1741	   The primary security consideration with GSE or, more generally, a
1742	   network layer with addresses split into locator and identifier parts,
1743	   is that of one node impersonating another by copying the
1744	   identification without the location.  Indeed, the main conclusion of
1745	   this paper is that a GSE-like addressing structure introduces new
1746	   security vulnerabilities that are not present in IP, and that those
1747	   problems are serious enough to question the benefits of an
1748	   architecture that separates locators and identifiers in addresses.

1750	8.  Acknowledgments

1752	   Thanks go to Steve Deering and Bob Hinden (the Chairs of the IPng
1753	   Working Group) as well as Sun Microsystems (the host for the interim
1754	   meeting) for the planning and execution of the interim meeting.
1755	   Thanks also go to Mike O'Dell for writing the 8+8 and GSE drafts; by
1756	   publishing these documents and speaking on their behalf, Mike was the
1757	   catalyst for some valuable discussions, both for IPv6 addressing and
1758	   for addressing architectures in general.  Special thanks to the
1759	   attendees of the interim meeting whose high caliber discussions
1760	   helped motivate and shape this document.

1762	9.  References

1764	     [ANYCAST] "Host Anycasting Service", C. Partridge, T. Mendez, & W.
1765	             Milliken, RFC 1546.

1767	     [BATES] Scalable support for multi-homed multi-provider
1768	             connectivity, Tony Bates & Yakov Rekhter, RFC 2260,
1769	             January, 1998.

1771	     [Bellovin 89] "Security Problems in the TCP/IP Protocol Suite",
1772	             Bellovin, Steve, Computer Communications Review, Vol. 19,
1773	             No. 2, pp32-48, April 1989.

1775	     [CIDR] "Classless Inter-Domain Routing (CIDR): an Address
1776	             Assignment and Aggregation Strategy". V. Fuller, T. Li, J.
1777	             Yu, & K. Varadhan, RFC 1519, September 1993.

1779	     [DHCP-DDNS] Interaction between DHCP and DNS, Internet Draft, Yakov
1780	             Rekhter, (Work in Progress.)

1782	     [DDNS] "Dynamic Updates in the Domain Name System (DNS UPDATE)",
1783	             Paul Vixie (Editor), RFC 2136, April, 1997.

1785	     [EUI64] 64-Bit Global Identifier Format Tutorial.
1786	             http://standards.ieee.org/db/oui/tutorials/EUI64.html.
1787	             Note: "EUI-64" is claimed as a trademark by an organization
1788	             which also forbids reference to itself in association with
1789	             that term in a standards document which is not their own,
1790	             unless they have approved that reference.  However, since
1791	             this document is not standards-track, it seems safe to name
1792	             that organization: the IEEE.

1794	     [GSE] "GSE - An Alternate Addressing Architecture for IPv6", Mike
1795	             O'Dell, (Work in progress).

1797	     [IEEE802] IEEE Std 802-1990, "Local and Metropolitan Area Networks:
1798	             IEEE Standard Overview and Architecture."

1800	     [IEEE1212] IEEE Std 1212-1994, "Information technology--
1801	             Microprocessor systems: Control and Status Registers (CSR)
1802	             Architecture for microcomputer buses."

1804	     [IPv6-ADDRESS] "An IPv6 Aggregatable Global Unicast Address
1805	             Format", R. Hinden, M. O'Dell, S. Deering, RFC 2374, July,
1806	             1998.

1808	     [MOBILITY] "IP Mobility Support", C. Perkins, RFC 2002, October,
1809	             1996.

1811	     [RFC1752] "The Recommendation for the IP Next Generation Protocol",
1812	             S. Bradner, A. Mankin, RFC 1752, January, 1995.

1814	     [RFC1788] "ICMP Domain Name Messages", W. Simpson, RFC 1788, April,
1815	             1995.

1817	     [RFC1884] "IP Version 6 Addressing Architecture", R. Hinden & S.
1818	             Deering, Editors, RFC 1884.

1820	     [RFC1958] "Architectural Principles of the Internet", B. Carpenter,
1821	             RFC 1958, June, 1996.

1823	     [RFC1971] "IPv6 Stateless Address Autoconfiguration", S. Thomson,
1824	             T. Narten, RFC 1971, August, 1996.

1826	     [RFC2008] "Implications of Various Address Allocation Policies for
1827	             Internet Routing", Y. Rekhter, T. Li, RFC 2008, October
1828	             1996.

1830	     [RFC2073] An IPv6 Provider-Based Unicast Address Format.  Y.
1831	             Rekhter, P. Lothberg, R. Hinden, S. Deering, J. Postel. RFC
1832	             2073, January, 1997.

1834	     [RFC2267] Network Ingress Filtering: Defeating Denial of Service
1835	             Attacks which employ IP Source Address Spoofing, P.
1836	             Ferguson, D. Senie, RFC 2267.

1838	     [ROUTER-RENUM] "Router Renumbering for IPv6", M. Crawford, draft-
1839	             ietf-ipngwg-router-renum-06.txt.

1841	10.  Authors' Addresses

1843	   Matt Crawford                           John Stewart
1844	   Fermilab MS 368                         Juniper Networks, Inc.
1845	   PO Box 500                              385 Ravendale Drive
1846	   Batavia, IL 60510 USA                   Mountain View, CA  94043
1847	   Phone: 630-840-3461                     Phone: +1 650 526 8000
1848	   EMail: crawdad@fnal.gov                 EMail: jstewart@juniper.net

1850	   Allison Mankin                          Lixia Zhang
1851	   USC/ISI                                 UCLA Computer Science Department
1852	   4350 North Fairfax Drive                4531G Boelter Hall
1853	   Suite 620                               Los Angeles, CA 90095-1596 USA
1854	   Arlington, VA  22203 USA                Phone: 310-825-2695
1855	   EMail: mankin@isi.edu                   EMail: lixia@cs.ucla.edu
1856	   Phone: 703-812-3706

1858	   Thomas Narten
1859	   IBM Corporation
1860	   3039 Cornwallis Ave.
1861	   PO Box 12195 - F11/502
1862	   Research Triangle Park, NC 27709-2195
1863	   Phone: 919-254-7798
1864	   EMail: narten@raleigh.ibm.com

1866	Appendix A: Increased Reliance on Domain Name System (DNS)

1868	   As we've discussed in previous sections, the motivation for
1869	   separating identifiers from locators in IP addresses is to allow the
1870	   locator portion to change more easily.  However, because GSE does not
1871	   provide a mapping from an ESD to its locator, whenever the locator
1872	   changes, GSE falls back on DNS to provide such a mapping.

1874	   Because any mapping scheme is complicated by renumbering, and because
1875	   recent IPv4 experience has shown a requirement for renumbering at
1876	   some frequency, it is worthwhile to explore the general renumbering
1877	   issue.

1879	A.1: Renumbering and DNS: How Frequently Can We Renumber?

1881	   One premise of the GSE proposal [GSE] is that an ISP can renumber the
1882	   Routing Goop portion of a site's addresses transparently to the site
1883	   (i.e., without coordinating the change with the site).  This would
1884	   make it possible for backbone providers to aggressively renumber the
1885	   Routing Goop part of addresses to achieve a high degree of route
1886	   aggregation.  On closer examination, frequent (e.g., daily)
1887	   renumbering turns out to be difficult in practice because of a
1888	   circular dependency between the DNS and routing.  Specifically, if a
1889	   site's Routing Stuff changes, nodes communicating with the site need
1890	   to obtain the new Routing Stuff.  In the GSE proposal, one queries
1891	   the DNS to obtain this information.  However, in order to reach a
1892	   site's DNS servers, the pointers controlling the downward delegation
1893	   of authoritative DNS servers (i.e., DNS "glue records") must use
1894	   addresses with Routing Stuff that are reachable.  That is, in order
1895	   to find the address for the web server "www.foo.bar.com", DNS queries
1896	   might need to be sent to a root DNS server, as well as DNS servers
1897	   for "bar.com" and "foo.bar.com".  Each of these servers must be
1898	   reachable from the querying client.  Consequently, there must be an
1899	   adequate overlap period after the RG changes, during which both the
1900	   old Routing Stuff and the new Routing Stuff can be used
1901	   simultaneously.  During the overlap period, DNS glue records will
1902	   need to be updated to use the new addresses (including Routing Stuff)
1903	   and DNS RRs need to be updated.  Only after all relevant DNS
1904	   servers have been updated and all previously cached RRs containing
1905	   the old addresses have timed out can the old RG be deleted.
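
   The length of the required overlap period can be estimated with a
   simple calculation.  The sketch below rests on assumptions not made
   in the text: each relevant DNS server is described by the time at
   which it was updated and by the TTL of the old RRs it served, and a
   cached copy fetched just before the update lives for one full TTL.
   The function name is hypothetical.

```python
def earliest_old_rg_removal(servers):
    """servers: iterable of (update_time, old_ttl) pairs, in seconds.

    A resolver may have cached an old RR an instant before a server
    was updated, so that copy can persist until update_time + old_ttl.
    The old RG is safe to delete only after the latest such moment.
    """
    return max(update_time + old_ttl for update_time, old_ttl in servers)
```

   For example, with one server updated at time 0 serving a one-day
   TTL and another updated an hour later serving a one-hour TTL, the
   old RG must remain usable for at least one day.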

1907	   An important observation is that the above issue is not specific to
1908	   GSE; the same requirement exists with today's provider-based
1909	   addressing architecture.  When a site is renumbered (e.g., it
1910	   switches ISPs and obtains a new set of addresses from its new
1911	   provider), the DNS must be updated in a similar fashion.

1913	A.2: Efficient DNS support for Site Renumbering

1915	   In the current Internet, when a site is renumbered, the addresses of
1916	   all the site's internal nodes change.  This requires a potentially
1917	   large update to the RR database for that site.  Although Dynamic DNS
1918	   [DDNS] could potentially be used, the cost is likely to be large due
1919	   to the large number of individual records that would need to be
1920	   updated.  In addition, when DHCP and DDNS are used together [DHCP-
1921	   DDNS], it may be the case that individual hosts "own" their own A or
1922	   AAAA records, further complicating the question of who is able to
1923	   update the contents of DNS RRs.

1925	   With GSE, when a site renumbers to satisfy its ISP, only the site's
1926	   routing prefix needs to change.  That is, the prefix reflects where
1927	   within the Internet the site resides.  One DNS modification that
1928	   could reduce the cost of updating the DNS when a site is renumbered
1929	   is to store addresses in two distinct RRs: one for the Routing Goop
1930	   that reflects where a node attaches to the Internet and the other for
1931	   STP-plus-ESD that is the site-specific part of an address.  During a
1932	   renumbering, the Routing Goop would change, but the "site internal
1933	   part" would remain fixed.  That way, renumbering a site would only
1934	   require that the Routing Goop RR of a site be updated; the "site-
1935	   internal part" of individual addresses would not change.

1937	   To obtain the address of a node from the DNS, a DNS query for the
1938	   name would return two quantities: the "site internal part" and the
1939	   DNS name of the Routing Stuff for the site.  An additional DNS query
1940	   would then obtain the specific RR of the site, and the complete
1941	   address would be synthesized by concatenating the two pieces of
1942	   information.
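
   The two-step lookup and concatenation might look like the following
   sketch.  Several things here are assumptions for illustration only:
   the field widths (48-bit RG, 16-bit STP, 64-bit ESD), the two
   dictionaries standing in for the two kinds of RRs, and all names
   and values.  No real DNS record type is implied.

```python
import ipaddress

# Assumed field widths in bits (RG + STP + ESD = 128).
RG_BITS, STP_BITS, ESD_BITS = 48, 16, 64

# Hypothetical stand-ins for the two kinds of RRs.
SITE_RG = {"rg.site.example": 0x20010DB80001}        # RG name -> RG
NODE_RR = {                                          # name -> site part
    "www.site.example": (0x0002, 0x020000FFFE000001, "rg.site.example"),
}

def resolve(name):
    stp, esd, rg_name = NODE_RR[name]  # first query: site-internal part
    rg = SITE_RG[rg_name]              # second query: the site's RG
    value = (rg << (STP_BITS + ESD_BITS)) | (stp << ESD_BITS) | esd
    return ipaddress.IPv6Address(value)
```

   Renumbering the site then amounts to changing the single SITE_RG
   entry; every name at the site resolves to an address with the new
   RG while the per-node entries stay untouched.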

1944	   Implementing these DNS changes increases the practicality of using
1945	   Dynamic DNS to update a site's DNS records as it is renumbered.  Only
1946	   the site's Routing Goop RRs would need updating.

1948	   Finally, it may be useful to divide a node's AAAA RR into the three
1949	   logical parts of the GSE proposal, namely RG, STP and ESD.  Whether
1950	   or not it is useful to have separate RRs for the STP and ESD portions
1951	   of an address or a single RR combining both is an issue that requires
1952	   further study.

1954	   If AAAA records are composed of multiple distinct RRs, then one
1955	   question is who should be responsible for synthesizing the AAAA from
1956	   its components: the resolver running on the querying client's machine
1957	   or the queried name server? To minimize the impact on client hosts
1958	   and make it easier to deploy future changes, it is recommended that
1959	   the synthesis of AAAA records from their constituent parts be done on
1960	   name servers rather than in client resolvers.

1962	A.2.1: Two-Faced DNS

1964	   The GSE proposal attempts to hide the RG part of addresses from nodes
1965	   within a site.  If the nodes do not know their own RG, then they
1966	   can't store or use it in ways that cause problems should the site be
1967	   renumbered and its RG change (i.e., the cached RG becomes invalid).
1968	   A site's DNS servers, however, will need to have more information
1969	   about the RG its site uses.  Moreover, the responses it returns will
1970	   depend on who queries the server.  A query from a node within the
1971	   site should return an address with a Site Local RG, whereas a query
1972	   for the same name from a client located at a different site should
1973	   return the global scope RG.  This makes intra-site
1974	   communication more resilient to failures outside of the site.
1975	   Such context-dependent DNS servers are commonly referred to as "two-
1976	   faced" DNS servers.

1978	   Some issues that must be considered in this context:

1980	     1) A DNS server may recursively attempt to resolve a query on
1981	        behalf of a requesting client.  Consequently, a DNS query might
1982	        be received from a proxy rather than from the client that
1983	        actually seeks the information.  Because the proxy may not be
1984	        located at the same site as the originating client, a DNS server
1985	        cannot reliably determine whether a DNS request is coming from
1986	        the same site or a remote site.  One solution would be to
1987	        disallow recursive queries for off-site requesters, though this
1988	        raises additional questions.

1990	     2) Since cached responses are, in general, context sensitive, a
1991	        name server may be unable to correctly answer a query from its
1992	        cache, since the information it has is incomplete.  That is, it
1993	        may have loaded the information via a query from a local client,
1994	        and the information has a site-local prefix.  If a subsequent
1995	        request comes in from an off-site requester, the DNS server
1996	        cannot return a correct response (i.e., one containing the
1997	        correct RG).

1999	A.2.2: Bootstrapping Issues

2001	   If Routing Stuff information is distributed via the DNS, key DNS
2002	   servers must always be reachable.  In particular, the addresses
2003	   (including Routing Stuff) of all root DNS servers are, for all
2004	   practical purposes, well-known and assumed to never change.  It is
2005	   not uncommon for the addresses of root servers to be hard-coded into
2006	   software distributions.  Consequently, the Routing Stuff associated
2007	   with such addresses must always be usable for reaching root servers.

2009	   If it becomes necessary or desirable to change the Routing Stuff of
2010	   an address at which a root DNS server resides, the routing subsystem
2011	   will likely need to continue carrying "exceptions" for those
2012	   addresses.  Because the total number of root DNS servers is
2013	   relatively small, the routing subsystem is expected to be able to
2014	   handle this requirement.

2016	   All other DNS server addresses can be changed, since their addresses
2017	   are typically learned from an upper-level DNS server that has
2018	   delegated a part of the name space to them.  So long as the
2019	   delegating server is configured with the new address, the addresses
2020	   of other servers can change.

2022	Appendix B: Additional Issues Related to GSE

2024	   This paper focused primarily on the issues of separating identifiers
2025	   and locators in unicast addresses.  It is worth noting that a number
2026	   of additional issues were identified during the IPng interim meeting
2027	   with respect to the GSE proposal that would need to be considered
2028	   before an architecture such as GSE could be deployed.  Specifically:

2030	      - It is not known how multicast would work under GSE.  One
2031	        identified issue is that a site with multiple egress routers
2032	        would (by default) inject multicast traffic through each of the
2033	        egress routers, and each would then replace the source Routing
2034	        Goop with a differing value.  This would lead to multiple copies
2035	        of the same packet, each carrying a different IPv6 address, thus
2036	        being considered as from different sources.

2038	      - It would be more difficult to create tunnels.  Any tunnel that
2039	        crosses a site boundary (i.e., the entry and exit points are in
2040	        differing sites) would in effect require that both tunnel
2041	        endpoints be border routers to ensure that the addresses in the
2042	        inner headers were rewritten correctly.

2044	      - Hiding a site's Routing Goop from internal nodes while
2045	        keeping it visible to external nodes requires a
2046	        two-faced DNS.  The current DNS model assumes a single global
2047	        database in which all queries are answered the same way,
2048	        regardless of who issued the query.  It is unclear how to make
2049	        the DNS answer queries in a context-sensitive manner without
2050	        also negatively impacting its caching model.

2052	Appendix C: Ideas Incorporated Into IPv6

2054	   This section summarizes changes made to IPv6 specifications which
2055	   originated in the GSE proposal or in the discussions arising from it.

2057	   The unicast address format was changed to improve the aggregability
2058	   of unicast addresses.  Instead of a topologically insignificant
2059	   Registry ID immediately following the Format Prefix [RFC2073], there
2060	   is now a Top-Level Aggregation Identifier [IPv6-ADDRESS].  This field
2061	   identifies a large routable aggregate to which an address belongs
2062	   rather than an administrative unit that assigned the address.  The
2063	   TLA corresponds to the "Large Structure" of GSE.  The IPv6 Next-Level
2064	   Aggregation Identifier (NLA) is roughly the rest of the GSE "Routing
2065	   Goop" and the Site-Level Aggregation Identifier (SLA) is a slightly
2066	   expanded GSE Site Topology Partition.

2068	   The decision to put fixed boundaries between parts of the unicast
2069	   address (TLA, NLA, SLA, Interface Identifier) into IPv6 addresses
2070	   [IPv6-ADDRESS] also came from GSE.  The previous "provider-based"
2071	   addressing architecture for IPv6 [RFC2073] had fluid boundaries
2072	   between Registry ID, Provider ID, Subscriber ID and the Intra-
2073	   Subscriber part, as well as undefined divisions within the Provider-
2074	   ID and Intra-Subscriber part.  (On subnetworks with a MAC-layer
2075	   address, the latter boundary was generally placed to accommodate use
2076	   of that address as an Interface ID.)  The new addressing architecture
2077	   still expects divisions within the NLA portion of the address, placed
2078	   to reflect topological aggregation points.

2080	   Defining a fixed boundary between the routable portion of the address
2081	   and the part indicating an interface on a specific link required
2082	   specifying an Interface Identifier that would be suitable for all
2083	   subnetwork technologies.  The IEEE "EUI-64" identifier was selected,
2084	   having the advantages of an easy mapping from 48-bit MAC addresses
2085	   and a defined escape flag into locally-administered values.
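
   The mapping mentioned above can be sketched as follows.  This is a
   minimal illustration, not text from any specification quoted here.
   It assumes the convention used by the IPv6 addressing architecture:
   the octets 0xFFFE are inserted in the middle of the MAC address and
   the universal/local bit is inverted.  The function names are
   hypothetical.

```python
def mac_to_interface_id(mac: str) -> bytes:
    """Expand a 48-bit MAC address into a 64-bit interface identifier."""
    octets = bytes(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # EUI-64 mapping: insert 0xFFFE between the 24-bit company ID and
    # the 24-bit vendor-assigned extension.
    ident = bytearray(octets[:3] + b"\xff\xfe" + octets[3:])
    # Assumption (modified EUI-64): invert the universal/local bit so
    # that a set bit marks a globally administered identifier.
    ident[0] ^= 0x02
    return bytes(ident)


def claims_global_uniqueness(interface_id: bytes) -> bool:
    """True if the (inverted) universal/local bit is set."""
    return bool(interface_id[0] & 0x02)
```

   For example, the MAC address 34-56-78-9A-BC-DE would map to the
   identifier 36-56-78-FF-FE-9A-BC-DE under these assumptions.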

2087	   Another change was the redefinition of the interface identifier to be
2088	   a 64-bit quantity.  In the common case where a node has at least one
2089	   IEEE interface, the interface identifier is constructed from an IEEE
2090	   identifier (i.e., a MAC address) in such a way that there is a very
2091	   high probability that the identifier will be globally unique.  In the
2092	   case where a globally unique identifier can't easily be constructed
2093	   automatically, a bit in the identifier indicates that the address is
2094	   not globally unique.  At present, there are no plans for transport
2095	   protocols such as TCP to exploit interface identifiers, but the door
2096	   has been left open for a future protocol (e.g., TCPng) to take
2097	   advantage of the ESD concept.

2099	   Another change to come out of the GSE discussions relates to reducing
2100	   the number of DNS record changes required in the event of site
2101	   renumbering.  This work is not finalized as of this writing, but the
2102	   result may be that individual IPv6 addresses are stored (and signed,
2103	   in the case of Secure DNS) as a partial address and an indirect
2104	   pointer which leads to the high-order part of the address.  There may
2105	   be multiple levels of indirection and a changed record at any one
2106	   level would suffice to update the DNS's record of the IPv6 addresses
2107	   of every node in a given branch of the addressing hierarchy.

2109	   A change in the method of doing DNS address-to-name lookups is also
2110	   in the works.  This may be a change in the form and/or operation of
2111	   the ip6.int domain or some new mechanism which involves participation
2112	   by the routers or the end-nodes themselves.

2114	   Two other changes arising from GSE will not affect the IPv6 base
2115	   specifications themselves, but do direct additional work.  Those are
2116	   the injection of global prefix information into a site from a
2117	   provider or exchange [ROUTER-RENUM], and some inter-provider
2118	   cooperative method of providing multihoming to mutual customers with
2119	   minimal impact on routing tables in distant parts of the network.

2121	Appendix D: Reverse Mapping of Complete GSE Addresses

2123	   The ability to map an IP address into its corresponding DNS name is
2124	   used in several contexts:

2126	     1) Network packet tracing utilities (e.g., tcpdump) display the
2127	        contents of packets.  Printing out the DNS names appearing in
2128	        those packets (rather than dotted IP addresses) requires access
2129	        to an address-to-name mapping mechanism.

2131	     2) Some applications perform a "poor-man's" authentication by using
2132	        the DNS to map the source address of a peer into a DNS name.
2133	        The client then queries the DNS a second time, this time asking
2134	        for the address(es) corresponding to the peer's DNS name.  Only
2135	        if one of the addresses returned by the DNS matches the peer
2136	        address of the TCP connection is the source of the TCP
2137	        connection accepted as being from the indicated DNS name.

2139	        It is important to note that although two DNS queries are made
2140	        during the above operation, it is the second one --- mapping the
2141	        peer's DNS name back into an IP address --- that provides the
2142	        authentication property.  The first transaction simply obtains
2143	        the peer's DNS name, but no assumption is made that the returned
2144	        DNS name is correct.  Thus, the first DNS query could be
2145	        replaced by an alternate mechanism without weakening the already
2146	        weak authentication check described above.  One possible
2147	        alternate mechanism, an ICMP "Who Are You" message, is described
2148	        below.

2150	     3) Applications that log all incoming network connections (e.g.,
2151	        anonymous FTP servers) may prefer logging recognizable DNS names
2152	        to addresses.

2154	     4) Network administrators examining logs or other trace data
2155	        containing addresses may wish to determine the DNS name of some
2156	        addresses.  Note that this may occur some time after those
2157	        addresses were actually used.
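
   The two-step check in item 2 above can be sketched as follows.  This
   is an illustration under stated assumptions, not a recommended
   authentication method.  The function weak_peer_name() and its
   pluggable "reverse" and "forward" parameters are hypothetical names;
   the defaults use the standard socket resolver calls.

```python
import socket
from typing import Callable, Iterable, Optional


def default_reverse(addr: str) -> Optional[str]:
    """First query: map the peer address to a name (not trusted)."""
    try:
        return socket.gethostbyaddr(addr)[0]
    except OSError:
        return None


def default_forward(name: str) -> Iterable[str]:
    """Second query: map the claimed name back to its addresses."""
    try:
        return {info[4][0] for info in socket.getaddrinfo(name, None)}
    except OSError:
        return set()


def weak_peer_name(
    peer_addr: str,
    reverse: Callable[[str], Optional[str]] = default_reverse,
    forward: Callable[[str], Iterable[str]] = default_forward,
) -> Optional[str]:
    """Return the peer's name only if the forward lookup confirms it."""
    name = reverse(peer_addr)
    if name is None:
        return None
    # Addresses are compared as text here; a real implementation would
    # normalize them first.
    if peer_addr in forward(name):
        return name
    return None
```

   Because only the forward lookup is relied upon, the "reverse"
   callable can be swapped for any other source of a claimed name
   without changing the strength of the check.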

2159	   The following subsections describe techniques for mapping a full IPv6
2160	   address back into some quantity (e.g., a DNS name or locator).  We
2161	   include these descriptions for completeness even though they do not
2162	   address the fundamental problem of how to perform the mapping on an
2163	   identifier alone.  It should also be noted that because both
2164	   techniques operate on complete IPv6 addresses, they are both directly
2165	   applicable to provider-based addressing schemes and are not specific
2166	   to GSE.

2168	D.1: DNS-Like Reverse Mapping of Full GSE Addresses

2170	   Although a global-scale reverse mapping of ESDs seems infeasible,
2171	   within a single site it may be feasible to maintain a database
2172	   keyed on unstructured 8-byte ESDs.  However, it is an open question
2173	   whether such a database can be kept up-to-date at reasonable cost,
2174	   without making unreasonable assumptions as to how large sites are
2175	   going to grow, and how frequently ESD registrations will be made or
2176	   updated.  Note that the issue isn't just the physical database
2177	   itself, but the operational issues involved in keeping it up-to-date.
2178	   For the rest of this section, however, let us assume that such a
2179	   database can be built.

2181	   A mechanism supporting a lookup keyed on a flat-space ESD from an
2182	   arbitrary site requires having sufficient structure to identify the
2183	   site that needs to be queried.  In practice, since the Routing Stuff
2184	   is organized hierarchically, if an ESD is always used in conjunction
2185	   with Routing Stuff (i.e., a full 16-byte address), it becomes
2186	   feasible to maintain a DNS-like tree that maps full GSE addresses
2187	   into DNS names, in a fashion analogous to what is done with IPv4 PTR
2188	   records today.
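
   As an illustration, a lookup name for a full 16-byte address could be
   built the way existing IPv6 reverse zones are, one label per nibble
   in reverse order.  The nibble layout and the "ip6.int" default are
   assumptions carried over from that convention, and the function names
   are hypothetical.

```python
import ipaddress


def reverse_name(addr: str, suffix: str = "ip6.int") -> str:
    """Return the nibble-reversed lookup name for a full 16-byte address."""
    nibbles = ipaddress.IPv6Address(addr).packed.hex()  # 32 hex digits
    return ".".join(reversed(nibbles)) + "." + suffix


def split_labels(name: str):
    """Split a reverse name into (ESD labels, Routing Stuff labels)."""
    labels = name.split(".")
    # The low-order 8 bytes (ESD) come first because the name is reversed;
    # the next 16 labels are the high-order 8 bytes (Routing Stuff).
    return labels[:16], labels[16:32]
```

   The Routing Stuff labels sit closest to the root of the tree, so
   delegation can follow the routing hierarchy, while the flat ESD
   labels are resolved only within the zone of the site that owns them.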

2190	   It should be noted that a GSE address lookup will work only if the
2191	   Routing Stuff portion of the address is correctly entered in the DNS
2192	   tree.  Because the Routing Stuff portion of an address is expected to
2193	   change over time, this assumption will not hold indefinitely.

2195	   As a consequence, a packet trace recorded in the past might not
2196	   contain enough information to identify the off-Site sources of the
2197	   packets in the present.  This problem can be addressed by requiring
2198	   that the database of RG delegations be maintained, together with
2199	   accurate timing information, for some period of time after the RG is
2200	   no longer usable for routing packets.
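
   A minimal sketch of such a time-aware lookup follows.  The
   Delegation record, its fields, and holder_at() are hypothetical
   names chosen for illustration; the RG value is treated as an opaque
   string and times as POSIX timestamps.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Delegation:
    rg: str                       # Routing Goop value, opaque here
    holder: str                   # site that held the RG
    valid_from: float             # start of validity
    valid_until: Optional[float]  # end of validity; None if still active


def holder_at(history: List[Delegation], rg: str,
              when: float) -> Optional[str]:
    """Return who held `rg` at time `when`, or None if nobody did."""
    for d in history:
        still_valid = d.valid_until is None or when < d.valid_until
        if d.rg == rg and d.valid_from <= when and still_valid:
            return d.holder
    return None
```

   A packet trace would then be interpreted with the timestamp of each
   packet rather than the current time, so that a reassigned RG is
   attributed to the site that held it when the packet was sent.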

2202	   Finally, it should be noted that the problem of an address's RG
2203	   "expiring", so that the mapping of "expired" addresses into DNS
2204	   names may no longer hold, is not specific to the GSE proposal.
2205	   With provider-based addressing, the same issue arises when a site
2206	   renumbers into a new provider prefix and releases the allocation
2207	   from a previous block.  The authors are aware of one such
2208	   renumbering incident in IPv4 where a block of returned addresses
2209	   was reassigned and reused within 24 hours of the renumbering
2210	   event.

2212	D.2: The ICMP Who-Are-You Message

2214	   There is widespread agreement on the utility of being able to
2215	   determine the DNS name of the peer one is communicating with from
2216	   the address being used.  In addition to the fact that DNS names
2217	   are more meaningful to human users and more stable than addresses,
2218	   many users use this reverse mapping as part of a poor-man's
2219	   authentication for the remote peer; if one can map the obtained DNS
2220	   name back to the same address, one has increased confidence that
2221	   the peer is legitimate.

2223	   In practice, however, the IN-ADDR.ARPA domain is not fully populated
2224	   and is poorly maintained.  Consequently, an old proposal to define an
2225	   ICMP Who-Are-You message was resurrected [RFC1788].  A client would
2226	   send such a message to a peer, and that peer would return an ICMP
2227	   message containing its DNS name.  Asking a remote host to supply its
2228	   own name in no way implies that the returned information is accurate.
2229	   However, having a remote peer provide a piece of information that a
2230	   client can use as input to a separate authentication procedure
2231	   provides a starting point for performing strong authentication.  The
2232	   actual strength of the authentication depends on the authentication
2233	   procedure invoked, rather than on the untrusted piece of information
2234	   provided by a remote peer.

2236	   Reconsidering the "cheap" authentication procedure described earlier,
2237	   the ICMP Who-Are-You replaces the DNS PTR query used to obtain the
2238	   DNS name of a remote peer.  The second DNS query, to map the DNS name
2239	   back into a set of addresses, would be performed as before.  Because
2240	   the latter DNS query provides the strength of the authentication, the
2241	   use of an ICMP Who-Are-You message does not in any way weaken the
2242	   strength of the authentication method.  Indeed, it can only make it
2243	   more useful in practice, because virtually all hosts can be expected
2244	   to implement the Who-Are-You message.

2246	   The Who-Are-You message has advantages outside the context of GSE as
2247	   well, including a more decentralized, and hence more scalable,
2248	   administration and easier upkeep than a DNS reverse-lookup zone.  It
2249	   also has drawbacks: it requires the target node to be up and
2250	   reachable at the time of the query and to know its fully qualified
2251	   domain name.  It is also not possible to resolve addresses once those
2252	   addresses become unroutable.  In contrast, the DNS PTR mirrors, but
2253	   is independent of, the routing hierarchy.  The DNS can maintain
2254	   mappings long after the routing subsystem stops delivering packets to
2255	   certain addresses.

2257	   The requirement that the target node be up and reachable at the time
2258	   of the query makes it very uncertain that one would be able to take
2259	   addresses from a packet log and translate them to correct domain
2260	   names at a later time.  One can argue that this is a design flaw in
2261	   the logging system, as it violates the architectural principle,
2262	   "Avoid any design that requires addresses to be ... stored on non-
2263	   volatile storage" [RFC1958].  A better-designed system would look up
2264	   domain names promptly from logged addresses.  Indeed, one of the
2265	   authors has been doing that for some years.