idnits 2.17.1 

draft-ietf-ipngwg-esd-analysis-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 52 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 259 instances of too long lines in the document, the longest
     one being 3 characters in excess of 72.

  ** The abstract seems to contain references ([GSE]), which it shouldn't. 
     Please replace those with straight textual mentions of the documents in
     question.

  == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.

  == There are 7 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 1746 has weird spacing: '...t would  addre...'

  == Line 2154 has weird spacing: '...gh each  egres...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC 2073' is mentioned on line 640, but not defined

  ** Obsolete undefined reference: RFC 2073 (Obsoleted by RFC 2374)

  == Missing Reference: 'ESD' is mentioned on line 730, but not defined

  == Missing Reference: 'RFC 2267' is mentioned on line 1351, but not defined

  ** Obsolete undefined reference: RFC 2267 (Obsoleted by RFC 2827)

  == Unused Reference: 'ANYCAST' is defined on line 1872, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC1884' is defined on line 1929, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC2267' is defined on line 1946, but no explicit
     reference was found in the text

  ** Downref: Normative reference to an Informational RFC: RFC 1546 (ref.
     'ANYCAST')

  ** Downref: Normative reference to an Informational RFC: RFC 2260 (ref.
     'BATES')

  -- Possible downref: Non-RFC (?) normative reference: ref. 'Bellovin 89'

  ** Obsolete normative reference: RFC 1519 (ref. 'CIDR') (Obsoleted by RFC
     4632)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'DHCP-DDNS'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'EUI64'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'GSE'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE802'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE1212'

  ** Obsolete normative reference: RFC 2374 (ref. 'IPv6-ADDRESS') (Obsoleted
     by RFC 3587)

  ** Obsolete normative reference: RFC 2002 (ref. 'MOBILITY') (Obsoleted by
     RFC 3220)

  ** Downref: Normative reference to an Informational RFC: RFC 2663 (ref.
     'NAT')

  ** Obsolete normative reference: RFC 1788 (Obsoleted by RFC 6918)

  ** Obsolete normative reference: RFC 1884 (Obsoleted by RFC 2373)

  ** Downref: Normative reference to an Informational RFC: RFC 1958

  ** Obsolete normative reference: RFC 1971 (Obsoleted by RFC 2462)

  ** Obsolete normative reference: RFC 2073 (Obsoleted by RFC 2374)

  ** Obsolete normative reference: RFC 2267 (Obsoleted by RFC 2827)

  ** Obsolete normative reference: RFC 2401 (Obsoleted by RFC 4301)

  -- Duplicate reference: RFC2267, mentioned in 'RFC2409', was also mentioned
     in 'RFC2267'.

  ** Obsolete normative reference: RFC 2267 (ref. 'RFC2409') (Obsoleted by
     RFC 2827)

  == Outdated reference: A later version (-10) exists of
     draft-ietf-ipngwg-router-renum-06

  == Outdated reference: A later version (-05) exists of
     draft-ietf-ipngwg-site-prefixes-03

  -- Possible downref: Normative reference to a draft: ref. 'SITE-PREFIXES' 


     Summary: 23 errors (**), 0 flaws (~~), 14 warnings (==), 10 comments
     (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	INTERNET-DRAFT                                             Matt Crawford
2	                                                                Fermilab
3	<draft-ietf-ipngwg-esd-analysis-05.txt>                   Allison Mankin
4	                                                                     ISI
5	                                                           Thomas Narten
6	                                                                     IBM
7	                                                    John W. Stewart, III
8	                                                                 Juniper
9	                                                             Lixia Zhang
10	                                                                    UCLA
11	                                                           October, 1999

13	             Separating Identifiers and Locators in Addresses:            |
14	                 An Analysis of the GSE Proposal for IPv6

16	                  <draft-ietf-ipngwg-esd-analysis-05.txt>                 |

18	Status of this Memo                                                       |

20	   This document is an Internet-Draft and is in full conformance with
21	   all provisions of Section 10 of RFC2026 except that the right to
22	   produce derivative works is not granted.

24	   Internet-Drafts are working documents of the Internet Engineering
25	   Task Force (IETF), its areas, and its working groups.  Note that
26	   other groups may also distribute working documents as Internet-
27	   Drafts.

29	   Internet-Drafts are draft documents valid for a maximum of six months
30	   and may be updated, replaced, or obsoleted by other documents at any
31	   time.  It is inappropriate to use Internet-Drafts as reference
32	   material or to cite them other than as "work in progress."

34	   The list of current Internet-Drafts can be accessed at
35	   http://www.ietf.org/ietf/1id-abstracts.txt

37	   The list of Internet-Draft Shadow Directories can be accessed at
38	   http://www.ietf.org/shadow.html.

40	Abstract

42	   On February 27-28, 1997, the IPng Working Group held an interim
43	   meeting in Palo Alto, California to consider adopting Mike O'Dell's
44	   "GSE - An Alternate Addressing Architecture for IPv6" proposal [GSE].
45	   In GSE, 16-byte IPv6 addresses are split into distinct portions for
46	   global routing, local routing and end-point identification.  GSE
47	   includes the feature of configuring a node internal to a site with
48	   only the local routing and end-point identification portions of the
49	   address, thus hiding the full address from the node.  When such a
50	   node generates a packet, only the low-order bytes of the source
51	   address are specified; the high-order bytes of the address are filled
52	   in by a border router when the packet leaves the site.

54	   It has often been said that IPv4 "got it wrong" by treating its        |
55	   addresses simultaneously as locators and identifiers.  However, there  |
56	   has never been a detailed and comprehensive proposal for a             |
57	   scalable network protocol which separated the functions.  As a         |
58	   result, it wasn't possible to do a serious analysis comparing and      |
59	   contrasting a "separated" architecture and an "overloaded"             |
60	   architecture.  The GSE proposal serves as a vehicle for just such an   |
61	   analysis, and that is the purpose of this paper.

63	   We conclude that an architecture that clearly separates locators and
64	   identifiers in addresses introduces new issues and problems that do
65	   not have an easy or clear solution.  Indeed, the alleged
66	   disadvantages of overloading addresses turn out to provide some
67	   significant benefits over the non-overloaded approach.

69	   Contents

71	   Status of this Memo..........................................    1     |

73	   1.  Introduction.............................................    3     |

75	   2.  Definitions and Terminology..............................    4     |

77	   3.  Addressing and Routing in IPv4...........................    5     |

79	   4.  The GSE Proposal.........................................   14     |

81	   5.  Analysis: The Pros and Cons of Overloading Addresses.....   21     |

83	   6.  Conclusion...............................................   39     |

85	   7.  Security Considerations..................................   40     |

87	   8.  Acknowledgments..........................................   40     |

89	   9.  References...............................................   41     |

91	   10.  Authors' Addresses......................................   43     |

93	   Appendix A: Increased Reliance on Domain Name System (DNS)...   43     |
94	   Appendix B: Additional Issues Related Specifically to GSE....   47     |

96	   Appendix C: Ideas Incorporated Into IPv6.....................   48     |

98	   Appendix D: Reverse Mapping of Complete GSE Addresses........   49     |

100	1.  Introduction

102	   In October of 1996, Mike O'Dell published an Internet-Draft (dubbed
103	   "8+8") that proposed significant changes to the IPv6 unicast
104	   addressing architecture.  The 8+8 proposal was the topic of
105	   considerable discussion at the December 1996 IETF meeting in San
106	   Jose.  Because the proposal offered both potential benefits (e.g.,
107	   enhanced routing scalability) and risks (e.g., changes to the basic
108	   IPv6 architecture), the IPng Working Group held an interim meeting on
109	   February 27-28, 1997 to consider adopting the 8+8 proposal.

111	   Shortly before the interim meeting, an updated version of the
112	   Internet-Draft was produced.  This version changed the name of the
113	   proposal from "8+8" to "GSE" to identify the three separate
114	   components of a unicast address: Global, Site and End-System
115	   Designator.

117	   The well-attended meeting generated high-caliber, focused technical
118	   discussions on the issues involved, with participation by almost all
119	   of the attendees.  By the middle of the second day there was
120	   unanimous agreement that the GSE proposal as written presented too
121	   many risks and should not be adopted as the basis for IPv6.  The
122	   proposal did, however, challenge the group to make several
123	   improvements to the then-existing IPv6 specifications (including
124	   increasing the aggregatability of addresses, having hard boundaries
125	   between routing and non-routing parts of the address, and easing the
126	   DNS aspects of renumbering).

128	   This document focuses primarily on the issue of separating unicast
129	   addresses into distinct portions for identification and location
130	   purposes, a separation that IPv4 does not make but that is
131	   fundamental to GSE.  We start with a discussion of the current
132	   architecture of IPv4 addressing and its impact on route scalability,
133	   identification, multi-homing, etc.  Next, the details of the GSE
134	   proposal are described.  Finally, the fundamental issue of
135	   decomposing addresses into multiple separate functional parts is
136	   analyzed in the context of the GSE proposal.  Here we detail some of
137	   the practical reasons why separating addresses into locators and
138	   identifiers poses a number of new challenges, making it clear that
139	   having such a separation is no panacea.  An appendix contains a
140	   summary of the IPng Working Group's deliberations on GSE and their
141	   effect on IPv6 addressing.

143	   Finally, this document's focus on unicast issues should not be
144	   interpreted to mean that the impact of separating identifier and
145	   locator functions on non-unicast aspects of routing and addressing
146	   is well understood or trivial to deal with.  Specifically,
147	   understanding how multicast and anycast addressing [ANYCAST,
148	   RFC1884] fit into such a model requires further work.

150	2.  Definitions and Terminology

152	   The following terminology is used throughout this document.

154	      Routing Goop --- A term defined by the GSE document.  It refers to
155	                    the first six bytes of a sixteen byte IPv6 GSE
156	                    address.  The Routing Goop portion of an address
157	                    identifies where a site connects to the public
158	                    Internet.  More generally, the term refers to the
159	                    portion of an address's routing prefix that
160	                    identifies where on the public Internet the site
161	                    housing the address resides.

163	      Site Topology Partition --- A term defined by the GSE document
164	                    that refers to the two bytes of a sixteen byte IPv6
165	                    GSE address immediately to the right of the Routing
166	                    Goop.  The Site Topology Partition part of an
167	                    address identifies which link within a site an
168	                    address resides on.

170	      Routing Stuff --- The part of an address that identifies which
171	                    link the address resides on.  Within the context of
172	                    GSE, the Routing Stuff comprises the Routing Goop
173	                    and Site Topology Partition parts of an address
174	                    (i.e., the leftmost eight bytes).

176	      identifier --- a value that indicates the sender of a packet, or
177	                    the intended recipient of a packet.  Within the
178	                    context of GSE, the ESD portion (i.e., the rightmost
179	                    eight bytes) of the address is an identifier.

181	      locator --- a field in a packet header that is used by the routing
182	                    subsystem to deliver a packet to the link on which a
183	                    destination resides.  The terms locator and Routing
184	                    Stuff are similar; we use Routing Stuff when
185	                    referring to the specific locator in GSE.
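
   As an illustration of the terminology above, the following Python
   sketch splits a 16-byte address into the 6-byte Routing Goop, the
   2-byte Site Topology Partition and the 8-byte ESD, and shows a border
   router filling in the Routing Goop as described in the Abstract.  The
   function names and the example address (taken from the 2001:db8::/32
   documentation prefix) are assumptions made for illustration; they do
   not come from the GSE document.

```python
import ipaddress

# Field widths in bytes, as given in the definitions above.
ROUTING_GOOP_LEN = 6   # bytes 0-5: where the site attaches to the Internet
STP_LEN = 2            # bytes 6-7: which link within the site
ESD_LEN = 8            # bytes 8-15: end-system designator (identifier)


def split_gse_address(text):
    """Return (routing_goop, stp, esd) as bytes for an IPv6 address string."""
    raw = ipaddress.IPv6Address(text).packed  # always 16 bytes
    routing_goop = raw[:ROUTING_GOOP_LEN]
    stp = raw[ROUTING_GOOP_LEN:ROUTING_GOOP_LEN + STP_LEN]
    esd = raw[ROUTING_GOOP_LEN + STP_LEN:]
    return routing_goop, stp, esd


def routing_stuff(text):
    """Routing Stuff = Routing Goop + Site Topology Partition (leftmost 8 bytes)."""
    routing_goop, stp, _ = split_gse_address(text)
    return routing_goop + stp


def fill_routing_goop(text, goop):
    """Hypothetical border-router step: overwrite the leftmost 6 bytes."""
    if len(goop) != ROUTING_GOOP_LEN:
        raise ValueError("Routing Goop must be exactly 6 bytes")
    raw = ipaddress.IPv6Address(text).packed
    return ipaddress.IPv6Address(goop + raw[ROUTING_GOOP_LEN:])


if __name__ == "__main__":
    rg, stp, esd = split_gse_address("2001:db8:1:2:211:22ff:fe33:4455")
    print(rg.hex(), stp.hex(), esd.hex())
    # A node inside the site knows only the STP and ESD; the border
    # router supplies the high-order bytes when the packet leaves.
    print(fill_routing_goop("::2:211:22ff:fe33:4455",
                            bytes.fromhex("20010db80001")))
```

   Note that the ESD is untouched by the rewrite, which is what allows
   it to serve as an identifier independent of the site's location.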

187	3.  Addressing and Routing in IPv4

189	   Before dealing with details of GSE, we present some background about
190	   how routing and addressing work in "classical IP" (i.e., IPv4).  We
191	   present this background because the GSE proposal makes a fairly
192	   major change to the base model.  In order to properly evaluate GSE,
193	   one must understand what problems in IPv4 it claims to improve or
194	   fix.

196	   The structure and semantics of a network layer protocol's addresses
197	   are absolutely core to that protocol.  Addressing substantially
198	   impacts the way packets are routed, the ability of a protocol to
199	   scale and the kinds of functionality higher layer protocols can count
200	   on.  Indeed, addressing is intertwined with both routing and
201	   transport layer issues; a change in any one of these can impact
202	   another.  Issues of administration and operation (e.g., address
203	   allocation/re-allocation and required renumbering), while not part of
204	   the pure exercise of engineering a network layer protocol, turn out
205	   to be critical to the scalability of that protocol in a global and
206	   commercial network.  The interaction between addressing, routing and
207	   especially aggregation is particularly relevant to this document, so
208	   some time will be spent describing it.

210	   Addresses in IPv4 serve two purposes:

212	     1) Unique identification of an interface.  A sending host tells the
213	        network the identity of the intended recipient by placing an IP
214	        address into the destination address field.  In addition, the
215	        receiving host checks the destination address field of received
216	        packets to ensure that the packet is, in fact, for it.

218	     2) Location information of that interface.  Routers use the
219	        packet's destination address in deciding where to forward the
220	        packet to get it closer to its ultimate destination.  That is,
221	        addresses identify "where" the intended recipient is located
222	        within the Internet topology.

224	        For scalability, the location information contained in addresses
225	        must be aggregatable.  In practice, this means that nodes
226	        topologically close to each other (e.g., connected to the same
227	        link, residing at the same site, or customers of the same ISP)
228	        must use addresses that share a common prefix.

230	   What is important to note is that these identification and location
231	   requirements have been met through the use of the same value, namely
232	   the IP address.  As will be noted repeatedly in this document, the
233	   "overloading" of IPv4 addresses with multiple semantics has some
234	   undesirable implications.  For example, the embedding of IPv4
235	   addresses within transport protocol addresses that identify the end-
236	   point of a connection couples those transport protocols with routing
237	   to a degree. This entanglement is inconsistent with a (too) strictly
238	   layered model in which routing would be a completely independent
239	   function of the network layer and not directly impact the transport
240	   layer.

242	   Combining locator and identifier functions also complicates the
243	   support for mobility.  In a mobile environment, the location of an
244	   end-station may change even though its identity stays the same;
245	   ideally, transport connections should be able to survive such
246	   changes.  In IPv4, however, one cannot change the locator without
247	   also changing the identifier since the same packet field is used for
248	   both.

250	   Consequently, there has been a train of thought for some time that
251	   having separate values for location and identification could be of
252	   significant benefit.  The GSE proposal, among other things, attempts
253	   to make such a separation.

255	   This document frequently uses mobility as an example to demonstrate
256	   the pros and cons of separating the identifier from the locator.
257	   However, the reader should note the fundamental equivalence between
258	   the problems faced by mobile hosts and the problem faced by sites
259	   that change providers yet don't want to renumber their network.  When
260	   a site changes providers, it moves topologically in much the same way
261	   a mobile node does when it moves from one place to another.
262	   Consequently, techniques that help or hinder mobility are often
263	   relevant to the issue of site renumbering.

265	3.1.  The Need for Aggregation

267	   IPv4 has seen a number of different addressing schemes.  Since the
268	   original specification, the two major additions have been subnetting
269	   and classless routing.  The motivation for adding subnetting was to
270	   allow a collection of networks located at one site to be viewed from
271	   afar as a single IP network (i.e., to aggregate all of the individual
272	   networks into a single bigger network).  The practical benefit of
273	   subnetting was that all of a site's hosts, even if scattered among
274	   tens or hundreds of LANs, could be represented by a single routing
275	   table entry in routers located far from the site.  In contrast, prior
276	   to subnetting, a site with ten LANs would advertise ten separate
277	   network entries, and all routers would have to maintain ten separate
278	   entries, even though they contained essentially redundant
279	   information.

281	   The benefits of aggregation should be clear.  The amount of work
282	   involved in constructing forwarding tables (i.e., selecting best
283	   routes and installing them into the switching subsystem) is dependent
284	   in part on the number of network routes (i.e., destinations) to which
285	   best paths are computed.  If each site has 10 internal networks, and
286	   each of those networks is individually advertised to the global
287	   routing system, the complexity of computing forwarding tables can
288	   easily be an order of magnitude greater than if each site advertised
289	   a single entry that covered all of the addresses used within the
290	   site.
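
   The arithmetic behind this claim can be sketched in a few lines of
   Python.  The site prefix, the /24 LAN size and the figure of 1000
   sites are illustrative assumptions, not measurements; the point is
   only that the number of entries a distant router holds scales with
   the number of sites when each site is aggregated, and with the
   number of individual networks when it is not.

```python
import ipaddress
from itertools import islice


def routes_seen_remotely(num_sites, lans_per_site, aggregated):
    """Number of routing entries a distant router holds for these sites."""
    return num_sites if aggregated else num_sites * lans_per_site


# One hypothetical site: a /16 divided into ten /24 LANs.
site = ipaddress.ip_network("128.128.0.0/16")
lans = list(islice(site.subnets(new_prefix=24), 10))

# Every LAN falls inside the site prefix, so one entry covers them all.
all_covered = all(lan.subnet_of(site) for lan in lans)

if __name__ == "__main__":
    print(all_covered)
    print(routes_seen_remotely(1000, 10, aggregated=False))
    print(routes_seen_remotely(1000, 10, aggregated=True))
```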

292	3.2.  The Pre-CIDR Internet

294	   In the early days of the Internet, its topology and addressing were
295	   orthogonal.  Specifically, when a site wanted to connect to the
296	   Internet, it approached the central Internet Assigned Numbers
297	   Authority (IANA) to obtain an address block and then approached a
298	   provider about procuring connectivity.  This procedure for address
299	   allocation resulted in a system where the addresses used by one
300	   customer of a provider bore little relation to the addresses used by
301	   other customers of that same provider.  In other words, though the
302	   actual topology of the Internet was mostly hierarchical, the
303	   addressing was not.  An example of such a topology and addressing
304	   scheme is shown in Figure 1.

306	                +----------------+
307	                |                |------- Customer1 (192.2.2.0)
308	                |                |------- Customer2 (128.128.0.0)
309	                |   Provider A   |------- Customer3 (18.0.0.0)
310	                |                |------- Customer4 (193.3.3.0)
311	                |                |------- Customer5 (194.4.4.0)
312	                +----------------+
313	                        |
314	                        |
315	                        |
316	                        |
317	                +----------------+
318	                |   Provider B   |
319	                +----------------+

321	                                 Figure 1

323	   Figure 1 shows Provider A having 5 customers, each with their own
324	   independently obtained network address.  Providers A and B connect to
325	   each other.  In order for Provider B to be able to send traffic to
326	   Customers1-5, Provider A must announce a separate route to Provider B
327	   for each of the 5 networks.  That is, the routers within Provider B
328	   must have explicit routing entries for each of Provider A's customers
329	   -- 5 separate routes.

331	   Experience has shown that this approach scales very poorly.  In the
332	   Default-Free Zone (DFZ) of the Public Internet, where routers must
333	   maintain routing entries for all reachable destinations, the cost of
334	   computing forwarding tables quickly becomes unacceptably large.  A
335	   large part of the cost is related to the seemingly redundant
336	   computations that must be made for each individual network, even
337	   though many of them reside in the same topological location (e.g.,
338	   under the same provider).  Looking at Figure 1, the problem is that
339	   provider B performs 5 separate calculations to construct the
340	   forwarding table needed to reach each of A's customers, even though
341	   it is going to take the same path for all of them; in other words,
342	   there is an opportunity to do data abstraction.

344	   Figure 1 shows network numbers using the older "classful" notation.
345	   Since 1981, the leading bits of an address had determined where
346	   the boundary fell between the "network" and "local" portions of
347	   that address.  There were a small number of Class A
348	   addresses (intended for very large sites), a medium number of Class B
349	   addresses (for medium-sized sites) and a very large number of Class C
350	   addresses (for very small sites).  In practice, the actual size of
351	   real networks didn't match the original allocation of Class A, B, and
352	   C addresses.  Class B addresses were bigger than most sites needed
353	   (and there weren't enough of them), and Class C addresses were too
354	   small (i.e., typical sites would need to get 10 or more C blocks to
355	   cover all of the hosts on their networks).  Consequently, classless
356	   addressing was developed [CIDR], which made the boundaries between
357	   the network and local parts of an address more flexible.  With
358	   classless addressing, a separate prefix-length (i.e., network mask)
359	   specifies how many of the left-most bits of an address identify the
360	   network part of the address.
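
   The classful rule can be sketched as follows.  This is a minimal
   illustration, assuming the standard class boundaries (leading bit 0
   for Class A, 10 for Class B, 110 for Class C); the function name is
   hypothetical.  It derives the implied network for the addresses that
   appear in Figure 1, whereas under classless addressing the prefix
   length would have to be supplied separately.

```python
import ipaddress


def classful_network(text):
    """Return (class letter, implied network) for an IPv4 address string."""
    addr = ipaddress.IPv4Address(text)
    first_octet = int(addr) >> 24
    if first_octet < 128:        # leading bit 0
        cls, prefix_len = "A", 8
    elif first_octet < 192:      # leading bits 10
        cls, prefix_len = "B", 16
    elif first_octet < 224:      # leading bits 110
        cls, prefix_len = "C", 24
    else:
        raise ValueError("Class D/E addresses have no network/local split")
    # strict=False masks off any host bits that happen to be set.
    network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
    return cls, network


if __name__ == "__main__":
    for customer in ("192.2.2.0", "128.128.0.0", "18.0.0.0",
                     "193.3.3.0", "194.4.4.0"):
        print(customer, classful_network(customer))
```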

362	3.3.  CIDR and Provider-Based Addressing

364	   One of the reasons CIDR (Classless Inter-Domain Routing) and its
365	   associated provider-assigned address allocation policy were
366	   introduced was to help reduce the cost of computing a routing table
367	   and the size of the forwarding table computed from the routing table.
368	   To achieve this goal CIDR aggressively aggregates network addresses.
369	   Aggregating network addresses means "merging" multiple addresses into
370	   a single "bigger" one, that is, using a common prefix to provide
371	   location information for all addresses sharing that same prefix.

373	   With CIDR, sites that want to connect to the Internet approach a
374	   provider to procure both connectivity and a network address.

376	   Individual providers have a block of address space covered by one
377	   prefix and assign pieces of that space to customers.  Consequently,
378	   customers of the same provider have addresses that share the same
379	   prefix.  The combination of CIDR and provider-based addressing
380	   results in the ability of a provider to address many hundreds of
381	   sites while introducing just one network address into the global
382	   routing system.  An example of such a topology and addressing scheme
383	   is shown in Figure 2.

385	                +----------------+
386	                |                |------- Customer1 (204.1.0.0/19)
387	                |                |------- Customer2 (204.1.32.0/23)
388	                |   Provider A   |------- Customer3 (204.1.34.0/24)
389	                |                |------- Customer4 (204.1.35.0/24)
390	                |                |------- Customer5 (204.1.36.0/23)
391	                +----------------+
392	                        |
393	                        |  A announces
394	                        |  204.1/16 to B
395	                        |
396	                +----------------+
397	                |   Provider B   |
398	                +----------------+

400	                                  Figure 2

402	   In Figure 2, Provider A has been assigned the classless block, or
403	   "aggregate", 204.1.0.0/16 (i.e., a prefix with the high-order 16 bits
404	   denoting a single network).  Provider A has 5 customers, each of
405	   which has been assigned a prefix subordinate to the aggregate.  In
406	   order for Provider B to be able to reach Customers1-5, Provider A
407	   only needs to announce the single prefix 204.1.0.0/16, and Provider
408	   B's routers need only a single routing table entry to reach all of
409	   Provider A's customers.  Note the important difference between the
410	   cases described in Figures 1 and 2; the latter example uses fewer
411	   entries in the routing table to reach the same number of
412	   destinations.
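
   The relationship between the customer prefixes and the aggregate in
   Figure 2 can be checked with Python's standard ipaddress module.
   This is a sketch only; the variable and function names are
   assumptions introduced here.

```python
import ipaddress

AGGREGATE = ipaddress.ip_network("204.1.0.0/16")

# Customer assignments exactly as drawn in Figure 2.
CUSTOMERS = {
    "Customer1": ipaddress.ip_network("204.1.0.0/19"),
    "Customer2": ipaddress.ip_network("204.1.32.0/23"),
    "Customer3": ipaddress.ip_network("204.1.34.0/24"),
    "Customer4": ipaddress.ip_network("204.1.35.0/24"),
    "Customer5": ipaddress.ip_network("204.1.36.0/23"),
}


def routes_announced(aggregate, customers):
    """Prefixes Provider A must announce to reach all of its customers."""
    if all(net.subnet_of(aggregate) for net in customers.values()):
        # Every customer is subordinate to the aggregate: one route suffices.
        return [aggregate]
    # Otherwise each customer prefix would need its own announcement.
    return sorted(customers.values())


if __name__ == "__main__":
    print(routes_announced(AGGREGATE, CUSTOMERS))
```

   With the assignments shown, the function returns only the /16, which
   is the single routing table entry Provider B needs.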

414	   CIDR was a critical step for the Internet: in the early 1990s the
415	   size of default-free routing tables required to support the classful
416	   Internet was almost more than the commercially-available hardware and
417	   software of the day could handle.  The introduction of BGP4's
418	   classless routing and provider-based address allocation policies
419	   resulted in a significant decrease in the growth rate of the routing
420	   tables.  At the same time, however, CIDR introduced some new
421	   weaknesses.  First, the Internet addressing model had to shift from
422	   one of "address owning" to "address lending" [RFC2008].  In pre-CIDR
423	   days sites acquired addresses from a central authority independent of
424	   their provider, and a site could assume it "owned" the address block
425	   it was given.  Owning addresses meant that once one had been given a
426	   set of network addresses, one could always use them; no matter where
427	   one's site connected to the Internet, the prefix for that network
428	   could be injected into the public routing system.  Today, however, it
429	   is simply not possible for all individual sites to have their own
430	   prefixes injected into the DFZ; there would be too many of them.
431	   Consequently, if a site decides to change providers, it needs to
432	   renumber all of its nodes using address space given to it by the new
433	   provider.  The "old" addresses it had used are returned to its
434	   previous provider.  To understand why, suppose that Customer3 in
435	   Figure 2 changes its provider from Provider A to Provider C but
436	   does not renumber.  The picture would be as follows:

438	                        +----------------+
439	                        |                |---- Customer1 (204.1.0.0/19)
440	                        |                |---- Customer2 (204.1.32.0/23)
441	                        |   Provider A   |
442	        +---------------|                |---- Customer4 (204.1.35.0/24)
443	        | A announces   |                |---- Customer5 (204.1.36.0/23)
444	        | 204.1/16 to B +----------------+
445	        |                     |
446	        |                     |
447	        |                     |
448	      +----------------+      |
449	      |   Provider B   |      |
450	      +----------------+      |
451	        |                     |
452	        |                     |
453	        |                     |
454	        | C announces         |
455	        | 204.1.34/24         |
456	        | to B          +----------------+
457	        +---------------|   Provider C   |---- Customer3 (204.1.34.0/24)
458	                        +----------------+

460	                                  Figure 3

462	   In Figure 3, Providers A, B and C are all directly connected to each
463	   other.  In order for Provider B to reach Customers 1, 2, 4 and 5,
464	   Provider A still only announces the 204.1.0.0/16 aggregate.  However,
465	   in order for Provider B to reach Customer3, Provider C must announce
466	   the prefix 204.1.34.0/24.  Prefix 204.1.34.0/24 is called a "more-
467	   specific" of 204.1.0.0/16; another term used is that Customer3 and
468	   Provider C have "punched a hole" in Provider A's address block.  From
469	   Provider B's view, the address space underneath 204.1.0.0/16 is no
470	   longer cleanly aggregated into a single prefix; the aggregation has
471	   been broken because the addressing is inconsistent with the
472	   topology.  In order to maintain reachability to Customers 1 through
473	   5, Provider B must now carry two prefixes where it previously carried
474	   only one.
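
   As a rough illustration of the state Provider B ends up holding, the
   following Python sketch builds B's two-entry table from the prefixes
   in Figure 3 and performs a longest-match lookup.  The table layout
   and the lookup() helper are assumptions made for this example; they
   are not taken from any router implementation.

```python
import ipaddress

# Prefixes taken from Figure 3.
aggregate = ipaddress.ip_network("204.1.0.0/16")       # announced by Provider A
more_specific = ipaddress.ip_network("204.1.34.0/24")  # announced by Provider C

# Provider B's view: next hop per prefix (illustrative layout only).
provider_b_routes = {aggregate: "Provider A", more_specific: "Provider C"}


def lookup(routes, destination):
    """Return the next hop of the longest prefix that contains destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in routes if addr in net]
    if not matches:
        return None
    return routes[max(matches, key=lambda net: net.prefixlen)]


# The /24 lies inside the /16, which is what "punching a hole" means.
assert more_specific.subnet_of(aggregate)

print(lookup(provider_b_routes, "204.1.34.7"))  # an address in Customer3's block
print(lookup(provider_b_routes, "204.1.35.7"))  # an address in Customer4's block
print(len(provider_b_routes))                   # entries B must carry
```

   Every further customer that moves without renumbering adds one more
   entry of this kind to the table.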

476	   The example in Figure 3 explains why sites must renumber if existing
477	   levels of aggregation are to be maintained.  While a small number of
478	   new exceptions could be tolerated, and certain prefixes have been
479	   grandfathered, the reality in today's Internet is that there are
480	   thousands of providers, many with thousands of individual customers.
481	   It is generally accepted that renumbering of sites is essential for
482	   maintaining sufficient aggregation.

484	   The empirical cost of renumbering a site in order to maintain
485	   aggregation has been the subject of much discussion.  The practical
486	   reality, however, is that forcing all sites to renumber is difficult
487	   given the size and wealth of companies that now depend on the
488	   Internet for running their business.  Thus, although the technical
489	   community came to consensus that, with the current practice of
490	   provider-based addressing, address lending was necessary in order for
491	   the Internet to continue to operate and grow, the reality has been
492	   that some of CIDR's benefits have been lost because not all sites
493	   renumber.  It is worth noting that a number of providers today do
494	   route filtering based, in part, on prefix length; as a result, a site
495	   which does not renumber may have only partial connectivity to the
496	   Internet.  That is, a site may advertise a long prefix into the
497	   routing system, but there is no assurance that all parts of the
498	   Internet will accept the route; some simply ignore it.

500	   One unfortunate characteristic of CIDR at an architectural level is
501	   that the pieces of the infrastructure that benefit from the
502	   aggregation (i.e., the providers which make up the DFZ) are not the
503	   pieces that incur the renumbering cost (i.e., the end site).  The
504	   logical corollary of this statement is that the pieces of the
505	   infrastructure that do incur cost to achieve aggregation (e.g., sites
506	   which renumber when they change providers) don't directly see the
507	   benefit. (The word "directly" is used here because the continued
508	   operation of the Internet is a benefit, though it requires
509	   selflessness on the part of the site to recognize.) This can lead to
510	   a "tragedy of the commons" situation, where everyone agrees that some
511	   sites should renumber, but each site wants to be one of those that
512	   do not.

514	3.4.  Multi-Homed Sites and Aggregation

516	   As sites become more dependent on the Internet, they have begun to
517	   install additional connections to the Internet to improve robustness
518	   and performance.  Such sites are called "multi-homed".
519	   Unfortunately, when a site connects to the Internet at multiple
520	   places, the impact on routing can be much like a site that switches
521	   providers but refuses to renumber.

523	   In the pre-CIDR days, multi-homed sites were typically known by only
524	   one network prefix, the prefix of their own address block.  When that
525	   site's providers announced the site's network into the global routing
526	   system, a "shortest path" type of routing would occur so that pieces
527	   of the Internet closest to the first provider might use the first
528	   provider while other pieces of the Internet would use the second
529	   provider.  This allowed sites to use the routing system itself to
530	   load balance traffic across their multiple connections.  This type of
531	   multi-homing assumes that a site's prefix can be propagated
532	   throughout the DFZ, an assumption that is no longer universally true.

534	   With CIDR, issues of addressing and aggregation complicate matters
535	   significantly.  At the highest level, there are three possible ways
536	   to deal with multi-homed sites.  The first possibility is to stay
537	   with the pre-CIDR approach, letting each multi-homed site receive its
538	   address block directly from a registry, independent of its providers.
539	   The problem with this approach is that, because the address block is
540	   obtained independent of either provider, it is not aggregatable and
541	   therefore has a negative impact on the scaling of global routing.

543	   The second approach is for a multi-homed site to receive an
544	   allocation from one of its providers and just use that single prefix.
545	   The site would advertise its prefix to all of the providers to which
546	   it connects.  There are two problems with this approach.  First,
547	   although the prefix is aggregatable by the provider which made the
548	   allocation, it is not aggregatable by the other providers.  To the
549	   other providers, the site's prefix poses the same problem that a
550	   provider-independent address would.  Second, because CIDR forwarding
551	   uses longest-match, the site's prefix is not always aggregatable in
552	   practice, even by the provider that made the allocation, if traffic
553	   is to be spread across connections by shortest-path routing.
554	   Consider Figure 4.  Provider C has two paths for reaching Customer1.
555	   Provider A advertises 204.1/16, an aggregate which includes
556	   Customer1.  But Provider C will also receive an advertisement for
557	   prefix 204.1.0/19 from Provider B, and because the prefix match
558	   through B is longer, C will choose that path.  In order for Provider
559	   C to be able to choose between the two paths, Provider A would also
560	   have to advertise the longer prefix for 204.1.0/19 in addition to the
561	   shorter 204.1/16.  At this point, from the routing perspective, the
562	   situation is very similar to the general problem posed by the use of
563	   provider-independent addresses.

565	   It should be noted that the above example simplifies a very complex
566	   issue.  For example, consider the example in Figure 4 again.
567	   Provider A could choose not to propagate a route entry for the longer
568	   204.1.0/19 prefix, advertising only the shorter 204.1/16.  In such
569	   cases, Provider C would always select Provider B.  Internally,
570	   Provider A would continue to route traffic from its other customers
571	   to Customer1 directly.  If Provider A had a large enough customer
572	   base, effective load sharing might be achieved.

574	                                         A advertises
575	                        +------------+  204.1/16 to C  +------------+
576	                     ___| Provider A |-----------------| Provider C |
577	                    /   +------------+                 +------------+
578	                   /                       +----------/
579	                  /                       /
580	       Customer1 ---                     / B advertises 204.1.0/19 to C
581	      204.1.0.0/19  |                   /
582	                    |      +------------+
583	                     ----- | Provider B |
584	                           +------------+

586	                                   Figure 4
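
   The path selection described for Figure 4 can be sketched in Python
   as follows.  The best_paths() helper and the representation of
   advertisements as (neighbor, prefix) pairs are assumptions of this
   example.  It returns every neighbor tied for the longest matching
   prefix, so a single result means Provider C has no choice of path.

```python
import ipaddress
from collections import defaultdict


def best_paths(advertisements, destination):
    """Return the neighbors tied for the longest prefix matching destination."""
    addr = ipaddress.ip_address(destination)
    by_length = defaultdict(list)
    for neighbor, prefix in advertisements:
        net = ipaddress.ip_network(prefix)
        if addr in net:
            by_length[net.prefixlen].append(neighbor)
    if not by_length:
        return []
    return sorted(by_length[max(by_length)])


# What Provider C hears in Figure 4.
only_aggregate_from_a = [
    ("Provider A", "204.1.0.0/16"),
    ("Provider B", "204.1.0.0/19"),
]
# Provider A additionally advertises the longer prefix.
a_also_sends_more_specific = only_aggregate_from_a + [
    ("Provider A", "204.1.0.0/19"),
]

print(best_paths(only_aggregate_from_a, "204.1.0.1"))
print(best_paths(a_also_sends_more_specific, "204.1.0.1"))
```

   Only in the second case does Provider C see two candidate paths, and
   that case costs the routing system an extra prefix.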

588	   The third approach is for a multi-homed site to receive an allocation
589	   from each of its providers and not advertise the prefix obtained from
590	   one provider to any of its other providers.  This approach has
591	   advantages from the perspective of route scaling because both
592	   allocations are aggregatable.  Unfortunately, the approach doesn't
593	   necessarily meet the demands of the multi-homed site.  A site that
594	   has a prefix from each of its providers faces a number of choices
595	   about how to use that address space.  Possibilities include:

597	      1) The site can number a distinct set of hosts out of each of the
598	        prefixes.  Consider a configuration where a site is connected to
599	        ISP-A and ISP-B.  If the link to ISP-A goes down, then unless
600	        the ISP-A prefix is announced to ISP-B (which breaks
601	        aggregation), the hosts numbered out of the ISP-A prefix would
602	        be unreachable.

604	      2) The site could assign each host multiple addresses (i.e., one
605	        address for each ISP connection).  There are two problems with
606	        this.  First, it accelerates the consumption of the address
607	        space. While this may be a problem for the (limited) IPv4
608	        address space, it is not a significant issue in IPv6.  Second,
609	        when the connection to ISP-A goes down, addresses numbered out
610	        of ISP-A's space become unreachable.  Remote peers would have to
611	        have sufficient intelligence to use the second address.  For
612	        example, when initiating a connection to a host, the DNS would
613	        return multiple candidate addresses.  Clients would need to try
614	        them all before concluding that a destination is unreachable
615	        (something not all network applications currently do).  In
616	        addition, a site's hosts would need a significant amount of
617	        intelligence for choosing the source addresses they use.  A host
618	        shouldn't choose a source address corresponding to a link that
619	        is down.  At present, hosts do not have such sophistication.
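
   The client behavior that possibility 2 asks for (trying every
   candidate address before giving up) might look like the following
   Python sketch.  The connect_any() name and the default timeout are
   assumptions of this example.  Python's standard library offers
   similar behavior in socket.create_connection(); the loop is written
   out here only to make the steps visible.

```python
import socket


def connect_any(host, port, timeout=3.0):
    """Try each address the resolver returns; return the first connected socket."""
    last_error = None
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # This candidate is unreachable; remember why and try the next.
            last_error = exc
            if sock is not None:
                sock.close()
    if last_error is None:
        last_error = OSError("no addresses found for %s" % host)
    raise last_error
```

   A caller would use it as sock = connect_any("www.example.com", 80).
   Note that this covers only the remote peer's side of the problem;
   it does nothing for source address selection at the multi-homed
   site itself.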

621	   In summary, supporting multi-homing with IPv4/CIDR requires a
622	   delicate balance between the scalability of routing and the site's
623	   requirements for robustness and load-sharing.  At this point in time,
624	   no solution has been discovered that satisfies the competing
625	   requirements of route scaling and robustness/performance.  It is
626	   worth noting, however, that some people are beginning to study the
627	   issue more closely and propose novel ideas [BATES].

629	4.  The GSE Proposal

631	   This section provides a description of GSE with the intent of making
632	   this document stand-alone with respect to the GSE "specification".
633	   We begin by reviewing the motivation for GSE.  Next we review the
634	   salient technical details, and we conclude by listing the explicit
635	   non-goals of the GSE proposal.

637	4.1.  Motivation For GSE

639	   The primary motivation for GSE was the concern that the chief initial
640	   IPv6 global unicast address structure, provider-based [RFC 2073], was
641	   fundamentally the same as IPv4 with CIDR and provider-based
642	   aggregation.  Provider-based addressing requires that sites renumber
643	   when they switch providers, so that sites are always aggregated
644	   within their provider's prefix.  In practice, the cost of renumbering
645	   (which can only grow as a site grows in size and becomes more
646	   dependent on the Internet for day-to-day business) is high enough
647	   that an increasing number of sites refuse to renumber when they
648	   change providers.  This cost is particularly relevant in cases where
649	   end-users are asked to renumber because an upstream provider has
650	   changed its transit provider (i.e., the end site is asked to renumber
651	   for reasons outside of its control and for which it sees no direct
652	   benefit).  Consequently, the GSE draft asserts that IPv4 with CIDR
653	   has not achieved the aggressive aggregation required for the route
654	   computation functions of the DFZ of the Internet to scale for IPv4
655	   and that the much larger address space of IPv6 simply exacerbates the
656	   problem.

658	   The GSE proposal does not propose to eliminate the need for
659	   renumbering.  Indeed, it asserts that end sites will have to renumber
660	   more frequently in order to continue scaling the Internet.  However,
661	   GSE proposes to make the cost of renumbering small enough that sites
662	   can be renumbered at essentially any time with little or no
663	   disruption to their network connectivity, and in particular with no
664	   impact on communications that are strictly within the site.

666	   Finally, GSE attempts to address the problem of sites that have
667	   multiple Internet connections.  In CIDR, the pressure for better
668	   multi-homing support can create exceptions to route aggregation and
669	   result in poor scaling.  That is, the public routing infrastructure
670	   may have to carry multiple distinct routes for some demanding multi-
671	   homed sites, one for each independent path.  GSE recognizes the
672	   "special work done by the global Internet infrastructure on behalf of
673	   multi-homed sites" [GSE], and proposes a way for multi-homed sites to
674	   gain certain benefit without impacting global scaling.  This includes
675	   a specific mechanism that providers can use to support multi-homed
676	   sites, presumably at a cost that the site would consider when
677	   deciding whether or not to become multi-homed.

679	4.2.  GSE Address Format

681	   The key departure of GSE from classical IP addressing (both v4 and
682	   v6) is that, rather than overloading addresses with both locator and
683	   identifier functions, it splits the address into two elements: the
684	   high-order 8 bytes used for routing purposes (called "Routing Stuff"
685	   throughout the rest of this document) and the low-order 8 bytes for
686	   unique identification of an end-point.  The structure of GSE
687	   addresses is:

689	                0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
690	              +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
691	              |  Routing Goop    | STP| End System Designator |
692	              +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
693	                     6+ bytes   ~2 bytes       8 bytes

695	                                 Figure 5

697	4.2.1.  Routing Stuff (RG and STP)

699	   The Routing Goop (RG) identifies where within the public Internet
700	   topology a site connects and is used to route datagrams to the site.
701	   RG is structured as follows:

703	                           1                   2                   3
704	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
705	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
706	      | xxx | 13 Bits of LSID         |      Upper 16 bits of Goop    |
707	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

709	       3               4
710	       2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
711	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
712	      | Bottom 18 bits of Routing Goop    |
713	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

715	                                 Figure 6

717	   The RG describes the location of a site's connection by identifying
718	   smaller and smaller regions of topology until finally it identifies
719	   the link which connects the site.  Before interpreting the bits in
720	   the RG, it is important to understand that routing with GSE depends
721	   on decomposing the Internet's topology into a specific graph.  At the
722	   highest level, the topology is broken into Large Structures (LSs).
723	   An LS is a region that can aggregate significant amounts of topology.
724	   Examples of potential LSs are large providers and exchange points.
725	   Within an LS the topology is further divided into another graph of
726	   structures, with each LS dividing itself however it sees fit.  This
727	   division of the topology into smaller and smaller structures can
728	   recurse for a number of levels, where the trade-off is "between the
729	   flat-routing complexity within a region and minimizing total depth of
730	   the substructure" [ESD].

732	   Having described the decomposition process, we now examine the bits
733	   in the RG.  After the 3-bit prefix identifying the address as having
734	   a GSE format, the next 13 bits identify the LS.  By limiting the
735	   field to 13 bits, a ceiling is defined on the complexity of the top-
736	   most routing level (i.e., what we currently call the DFZ).  In the
737	   next 34 bits, a series of subordinate structure(s) are identified
738	   until finally the leaf subordinate structure is identified, at which
739	   point the remaining bits identify the individual link within that
740	   leaf structure.

742	   The remaining 14 bits of the Routing Stuff (i.e., the low-order 14
743	   bits of the high-order 8 bytes) comprise the STP and are used for
744	   routing structure within a site, similar to subnetting with IPv4.

746	   These bits are not part of the Routing Goop per se.  The distinction
747	   between Routing Stuff and Routing Goop is that the RG controls
748	   routing in the public Internet, while Routing Stuff consists of the
749	   RG plus the Site Topology Partition (STP), which is used only within
750	   the site.
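
   The field widths given above (3 + 13 + 34 + 14 = 64 bits) can be
   checked with a short Python sketch that packs and unpacks the
   high-order 8 bytes of an address.  The function and key names are
   assumptions of this example, and the sample values are arbitrary.

```python
PREFIX_BITS, LSID_BITS, GOOP_BITS, STP_BITS = 3, 13, 34, 14
assert PREFIX_BITS + LSID_BITS + GOOP_BITS + STP_BITS == 64


def build_routing_stuff(prefix, lsid, goop, stp):
    """Pack the four fields into a 64-bit integer, high-order field first."""
    return (prefix << 61) | (lsid << 48) | (goop << 14) | stp


def parse_routing_stuff(rs):
    """Split the high-order 64 bits of an address into its four fields."""
    if not 0 <= rs < (1 << 64):
        raise ValueError("Routing Stuff must fit in 64 bits")
    return {
        "format_prefix": rs >> 61,               # 3-bit format prefix
        "lsid": (rs >> 48) & 0x1FFF,             # 13-bit Large Structure ID
        "goop": (rs >> 14) & ((1 << 34) - 1),    # remaining 34 bits of RG
        "stp": rs & 0x3FFF,                      # 14-bit Site Topology Partition
    }


# Arbitrary sample values, chosen only to exercise every field.
sample = build_routing_stuff(0b001, 0x1ABC, 0x212345678, 0x2345)
print(parse_routing_stuff(sample))
```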

752	   The GSE proposal formalized the ideas of sites and of public versus
753	   private topology.  In the first case, a site is a set of hosts,
754	   routers and media under the same administrative control which have
755	   zero or more connections to the Internet.  A site can have an
756	   arbitrarily complicated topology, but all of that complexity is
757	   hidden from everyone outside of the site.  A site only carries
758	   packets which originated from, or are destined to, that site; in
759	   other words, a site cannot be a transit network.  A site is private
760	   topology, while the transit networks form the public topology.

762	   A datagram is routed through public topology using just the RG, but
763	   within the destination site, routing is based on the Site Topology
764	   Partition (STP).

766	4.2.2.  End-System Designator

768	   The End-System Designator (ESD) is an unstructured 8-byte field that
769	   uniquely identifies an interface from all others.  The most important
770	   feature of the ESD is that it alone identifies an interface; the
771	   Routing Stuff portion of an address, although used to help deliver a
772	   packet to its destination, is not used to identify an end point.
773	   End-points of communication care about the ESD; as examples, TCP
774	   peers could be identified by the source and destination ESDs alone
775	   (together with port numbers), checksums would exclude the RG (the
776	   sender doesn't even know its RG, as described later) and on receipt
777	   of a packet only the ESD would be used in testing whether the packet
778	   is intended for local delivery.
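
   To make the TCP example concrete, the following Python sketch keys
   a connection on ESDs and port numbers only.  The helper names are
   assumptions of this example, and the addresses are taken from the
   IPv6 documentation prefix purely as 16-byte stand-ins; they are not
   GSE-format addresses.

```python
import ipaddress


def esd_of(address):
    """Return the low-order 8 bytes (the ESD) of a 16-byte address."""
    return ipaddress.IPv6Address(address).packed[8:]


def connection_key(src, src_port, dst, dst_port):
    """Identify a connection by ESDs and ports, ignoring the Routing Stuff."""
    return (esd_of(src), src_port, esd_of(dst), dst_port)


# Same end points, but the source's high-order 8 bytes differ.
before = connection_key("2001:db8:aaaa:1:200:5eff:fe00:5301", 40000,
                        "2001:db8:bbbb:2:200:5eff:fe00:5302", 80)
after = connection_key("2001:db8:cccc:1:200:5eff:fe00:5301", 40000,
                       "2001:db8:bbbb:2:200:5eff:fe00:5302", 80)
print(before == after)
```

   Because the high-order 8 bytes never enter the key, a lookup made
   with either form of the source address finds the same connection.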

780	   The leading contender for the role of a 64-bit globally unique ESD is
781	   the recently defined "EUI-64" identifier [EUI64].  These identifiers
782	   consist of a 24-bit "company_id" concatenated with a 40-bit
783	   "extension".  (Company_id is a new name for the "Organizationally
784	   Unique Identifier" that forms the first half of an 802 MAC address).
785	   Manufacturers are expected to assign locally unique values to the
786	   extension field, guaranteeing global uniqueness for the complete 64-
787	   bit identifier.  A range of the EUI-64 space is reserved to cover
788	   pre-existing 48-bit MAC addresses, and a defined mapping ensures that
789	   an ESD derived from a MAC address will not duplicate the ESD of a
790	   device that has a built-in EUI-64.
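
   One widely used mapping of this kind inserts the two octets FF-FE
   between the 24-bit company_id and the 24-bit extension of the
   48-bit address.  The Python sketch below shows that insertion.  It
   assumes this is the mapping meant here.  The IPv6 interface
   identifier construction in RFC 4291, Appendix A, performs the same
   insertion and additionally inverts the universal/local bit, a step
   omitted from this sketch.

```python
def mac48_to_eui64(mac):
    """Expand a 48-bit MAC address string into an 8-byte EUI-64 value."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # company_id (3 octets) + FF-FE + extension (3 octets)
    return octets[:3] + b"\xff\xfe" + octets[3:]


print(mac48_to_eui64("00:00:5e:00:53:01").hex())
```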

792	   In some cases, interfaces may not have an appropriate MAC address or
793	   EUI-64 identifier.  A globally unique ESD must then be obtained
794	   through some alternate mechanism.  Several possible mechanisms can be
795	   imagined (e.g., the IANA could hand out addresses from the company_id
796	   it has been allocated).  Although we do not explore them in detail
797	   here, we note that a global coordination structure is required
798	   to control the allocation of globally unique identifiers.

800	4.3.  Address Rewriting by Border Routers

802	   To obviate the need to renumber devices within sites because of
803	   changing providers, the GSE design hides the global Routing Goop (RG)
804	   from hosts in each site by having site border routers rewrite
805	   addresses of the packets they forward across the boundary between the
806	   site and public topology.  Within a site, nodes need not know the RG
807	   associated with their addresses.  They simply use a designated
808	   "Site-Local RG" value for internal addresses.  When a packet is
809	   forwarded to the public topology, the border router replaces the
810	   Site-Local RG portion of the packet's source address with an
811	   appropriate value.  Likewise, when a packet from the public topology
812	   is forwarded into a site, the border router replaces the RG part of
813	   the destination address with the designated Site-Local RG.

815	   To simplify discussion, the following text uses the singular term RG
816	   as if a site could have only one RG value (i.e., one connection to
817	   the Internet).  In fact, a site could have multiple Internet
818	   connections and consequently multiple RGs.

820	   GSE's approach is not so much to ease renumbering as to make it
821	   transparent to end users.  The RG by which a site is known is hidden
822	   from nodes within that site.  Instead, the
823	   RG for the site would be known only by the exit router, either
824	   through static configuration or through a dynamic protocol with an
825	   upstream provider.

827	   Because end hosts don't know their RG, they don't know their entire
828	   16-byte address, so they can't specify the full address in the source
829	   fields of packets they originate.  Consequently, when a datagram
830	   leaves a site, the egress border router fills in the high-order
831	   portion of the source address with the appropriate RG.

833	   The point of keeping the RG hidden from nodes within the core of a
834	   site is to ensure the changeability of the RG without impacting the
835	   site itself.  It is expected that the RG would need to change
836	   relatively frequently (e.g., several times a year) in order to
837	   support sufficient aggregation as the topology of the Internet
838	   changes.  A change to a site's RG would only require a change at the
839	   site's egress point, and this change might well be accomplished
840	   through a dynamic protocol with the upstream provider.
841	   In addition, the site's DNS records would need updating to properly
842	   indicate the current RG value.

844	   Hiding a site's RG from its internal nodes does not, however, mean
845	   that changes to RG have no impact on end sites.  Since the full 16-
846	   byte address of a node isn't a stable value (the RG portion can
847	   change), a stored address may contain invalid RG and be unusable if
848	   it isn't "refreshed" through some other means.  For example, opening
849	   a TCP connection, writing the address of the peer to a file and then
850	   later trying to reestablish a connection to that peer may well fail.
851	   For intra-site communication, however, it is expected that only the
852	   Site-Local RG would be used (and stored) which would continue to work
853	   for intra-site communication regardless of changes to the site's
854	   external RG.  This shields a site's intra-site traffic from any
855	   instabilities resulting from renumbering.

857	   In addition to rewriting source addresses that leave a site,
858	   destination addresses must be rewritten upon entering a site.  To
859	   understand the motivation behind this, consider a site with
860	   connections to three Internet providers.  Because each of those
861	   connections has its own RG, each destination within the site would be
862	   known by three different 16-byte addresses.  As a result, intra-site
863	   routers would have to carry a routing table three times larger than
864	   expected.  To work around this, GSE proposed replacing the RG in
865	   inbound packets with the special "Site-Local RG" value to reduce
866	   intra-site routing tables to the minimum necessary.

868	   In summary, when a node initiates a flow to a node at another site,
869	   the initiating node is expected to know the full 16-byte address for
870	   the destination through mechanisms such as a DNS query.  The
871	   initiating node does not, however, know its own RG, and uses the
872	   Site-Local RG value in the RG part of the source address.  When the
873	   datagram reaches the exit border router, the router replaces the RG
874	   of the packet's source address.  When the datagram arrives at the
875	   entry router at the destination site, the router replaces the RG
876	   portion of the destination address with the distinguished "Site-Local
877	   RG" value.  When the destination host needs to send return traffic,
878	   that host knows the full 16-byte address for the other host because
879	   it appeared in the source address field of the arriving packet.
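
   The two rewriting steps summarized above can be sketched in Python
   as follows.  The sketch treats the RG as the 50 high-order bits of
   the address (3 + 13 + 34), leaving the STP and ESD untouched.  The
   value chosen for SITE_LOCAL_RG, the RG values in the usage lines,
   and the dictionary representation of a packet are all hypothetical
   and serve only to make the example runnable.

```python
import ipaddress

RG_BITS = 50                  # 3-bit prefix + 13-bit LSID + 34 further bits
RG_SHIFT = 128 - RG_BITS      # the STP (14 bits) and ESD (64 bits) lie below
LOW_MASK = (1 << RG_SHIFT) - 1
SITE_LOCAL_RG = 0             # hypothetical placeholder value


def with_rg(address, rg):
    """Return address with its 50 high-order bits replaced by rg."""
    if not 0 <= rg < (1 << RG_BITS):
        raise ValueError("RG must fit in 50 bits")
    low = int(ipaddress.IPv6Address(address)) & LOW_MASK
    return str(ipaddress.IPv6Address((rg << RG_SHIFT) | low))


def rg_of(address):
    """Return the 50 high-order bits of address as an integer."""
    return int(ipaddress.IPv6Address(address)) >> RG_SHIFT


def egress(packet, site_rg):
    """Exit border router: fill the site's RG into the source address."""
    return dict(packet, src=with_rg(packet["src"], site_rg))


def ingress(packet):
    """Entry border router: replace the destination RG with Site-Local RG."""
    return dict(packet, dst=with_rg(packet["dst"], SITE_LOCAL_RG))


RG_A, RG_B = 0x1234, 0x5678   # hypothetical RGs of the two sites
packet = {"src": "::5:200:5eff:fe00:5301",                    # Site-Local RG
          "dst": with_rg("::9:200:5eff:fe00:5302", RG_B)}     # learned via DNS
on_wire = egress(packet, RG_A)
delivered = ingress(on_wire)
print(on_wire, delivered)
```

   After ingress(), the source address still carries RG_A, which is
   what allows the destination host to send return traffic.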

881	4.4.  Renumbering and Rehoming Mid-Level ISPs

883	   One of the most difficult-to-solve components of the renumbering
884	   problem with CIDR is that of renumbering mid-level service providers.
885	   Specifically, if SmallISP1 changes its transit provider from BigISP1
886	   to BigISP2, then in order for the overall size of the routing tables
887	   to stay the same, all of SmallISP1's customers would have to renumber
888	   into address space covered by an aggregate of BigISP2.  GSE deals
889	   with this problem by handling the RG in DNS with indirection.
890	   Specifically, a site's DNS server specifies the RG portion of its
891	   addresses by referencing the "name" of its immediate provider, which
892	   is a resolvable DNS name (this implies a new Resource Record type).
893	   That provider may define some of the low-order bits of the RG and
894	   then reference its immediate provider.  This chain of reference
895	   allows mid-level service providers to change transit providers, and
896	   the customers of that mid-level will simply "inherit" the change in
897	   RG.  Note that this mechanism does not depend on the GSE address
898	   format per se and can also be applied to IPv4 addressing.
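
   The chain of reference can be illustrated with a small Python
   sketch.  No record format is defined here, so the table below is
   purely hypothetical: each name maps to the RG bits that the entity
   contributes and to the name of its immediate provider.  The names,
   the bit strings, and the choice to let the site's own record carry
   its lowest-order bits are assumptions of this example.

```python
# name -> (bits contributed, immediate provider or None); all hypothetical.
records = {
    "bigisp1.example": ("0010000000000001", None),
    "bigisp2.example": ("0010000000000010", None),
    "smallisp1.example": ("0101", "bigisp1.example"),
    "site.example": ("11", "smallisp1.example"),
}


def resolve_rg(name, records):
    """Follow provider references upward and join the bits, top level first."""
    parts = []
    seen = set()
    while name is not None:
        if name in seen:
            raise ValueError("reference loop at " + name)
        seen.add(name)
        bits, name = records[name]
        parts.append(bits)
    return "".join(reversed(parts))


rg_before = resolve_rg("site.example", records)

# SmallISP1 changes its transit provider; the site's record is untouched.
records["smallisp1.example"] = ("0101", "bigisp2.example")
rg_after = resolve_rg("site.example", records)
print(rg_before, rg_after)
```

   Only SmallISP1's record changed, yet the RG resolved for the site
   now falls under BigISP2.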

900	4.5.  Support for Multi-Homed Sites

902	   GSE defines a specific mechanism for providers to use to support
903	   multi-homed customers that gives those customers more reliability
904	   than singly-homed sites, but without a negative impact on the scaling
905	   of global routing.  This mechanism is not specific to GSE and could
906	   be applied to any multi-homing scenario where a site is known by
907	   multiple prefixes (including provider-based addressing).  Assume the
908	   following topology:

910	                             Provider1     Provider2
911	                             +------+       +------+
912	                             |      |       |      |
913	                             | PBR1 |       | PBR2 |
914	                             +----x-+       +-x----+
915	                                  |           |
916	                              RG1 |           | RG2
917	                                  |           |
918	                               +--x-----------x--+
919	                               | SBR1       SBR2 |
920	                               |                 |
921	                               +-----------------+
922	                                      Site

924	                                    Figure 7

926	   PBR1 is Provider1's border router while PBR2 is Provider2's border
927	   router.  SBR1 is the site's border router that connects to Provider1
928	   while SBR2 is the site's border router that connects to Provider2.
929	   Imagine, for example, that the line between Provider1 and the site
930	   goes down.  Any already existing flows that use a destination address
931	   including RG1 would stop working.  In addition, any addresses
932	   returned from DNS queries that include RG1 would not be viable
933	   addresses.  If PBR1 and PBR2 knew about each other, however, then in
934	   this case PBR1 could tunnel packets destined for RG1-prefixed
935	   addresses to PBR2, thus keeping the communication working.  (Note
936	   that IP-in-IP encapsulation is necessary since routers between PBR1
937	   and PBR2 would forward packets destined for addresses with PBR1's
938	   prefix back towards PBR1.)
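
   PBR1's forwarding decision in this scenario can be sketched in
   Python as follows.  The function name, the dictionary
   representation of packets, and the use of the string "PBR2" as a
   tunnel end point are assumptions of this example.

```python
def pbr1_forward(packet, link_to_site_up, tunnel_peer="PBR2"):
    """Deliver directly when the site link is up; otherwise tunnel to the peer."""
    if link_to_site_up:
        return {"action": "deliver", "packet": packet}
    # Encapsulate so that routers between PBR1 and PBR2 forward on the
    # outer destination rather than on the RG1-prefixed inner one.
    return {"action": "tunnel", "outer_dst": tunnel_peer, "inner": packet}


packet = {"dst": "RG1-prefixed address of a host in the site"}
print(pbr1_forward(packet, link_to_site_up=True))
print(pbr1_forward(packet, link_to_site_up=False))
```

   The inner packet is carried unchanged, so existing flows that use
   RG1 keep working while the link is down.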

940	4.6.  Explicit Non-Goals for GSE

942	   It is worth noting explicitly that GSE did not attempt to address the
943	   following issues:

945	     1) Survival of TCP connections through renumbering events.  If a
946	        site is renumbered, TCP connections using a previous address
947	        will continue to work only as long as the previous address still
948	        works (i.e., while it is still "valid" using RFC 1971
949	        terminology).  No attempt is made to have existing connections
950	        switch to the new address.

952	     2) It is not known how multicast can be made to work under GSE.

954	     3) It is not known how mobility can be made to work under GSE.

956	     4) The performance impact of having routers rewrite portions of the
957	        source and destination address in packet headers requires
958	        further study.

960	   That GSE didn't address the above issues does not mean they cannot be
961	   solved.  Rather, the issues simply weren't studied in sufficient
962	   depth.

964	5.  Analysis: The Pros and Cons of Overloading Addresses

966	   At this point we have given complete descriptions of two addressing
967	   architectures:  IPv4, which uses the overloading technique, and GSE,
968	   which uses the separated technique.  We now compare and contrast the
969	   two techniques.

971	   The following discussion is organized around three fundamental
972	   points:

974	     1) Identifiers indicate who the intended recipient of a packet is.
975	        At the network layer, an identifier refers to an interface; at
976	        the transport layer it refers to a process or other endpoint of
977	        a "connection".

979	     2) Identifiers must be mapped into a locator that the network layer
980	        can use to actually deliver a packet to its intended
981	        destination.

983	     3) There must be a suitable way to adequately authenticate the user
984	        of an identifier, so that communicating peers have sufficient
985	        confidence that packets sent to or received from a particular
986	        identifier correspond to the intended recipient.

988	5.1.  Purpose of an Identifier

990	   An identifier gives an entity the ability to refer to a communication
991	   end point and to refer to the same endpoint over an extended period
992	   of time.  In terms of semantics, two or more packets sent to the same
993	   identifier should be delivered to the same end point.  Likewise, one
994	   expects multiple packets received from the same identifier to have
995	   been originated by the same sending entity.  That is, a source
996	   identifier indicates who the packet is from and a destination
997	   identifier indicates who the packet is intended for.

999	   In IPv4, when applications communicate, transport "identifiers"
1000	   consist of addresses and port numbers.  For the purposes of this
1001	   discussion, we use the term "identifier" to mean the identifier of an
1002	   interface.  It is assumed that port numbers will be present when
1003	   higher layer entities communicate; the exact port numbers used are
1004	   not relevant to this discussion.

1006	   In small networks, flat routing can be used to deliver packets to
1007	   their destination based only on the destination identifier carried in
1008	   a packet header (i.e., the identifier is the locator and is not
1009	   required to have any structure).  However, in such systems, a
1010	   distinct route entry is required for every destination, an approach
1011	   that does not scale.  In larger networks, packet addresses include a
1012	   locator that helps the network layer deliver a packet to its
1013	   destination.  Such a locator typically has a structure to keep
1014	   routing tables small relative to the total number of reachable
1015	   destinations.  In IPv4, the identifier and locator are combined in a
1016	   single address; it is not possible to separate the locator portion of
1017	   an address from the identifier portion.  In contrast, the ESD portion
1018	   of a GSE address (which can easily be extracted from the address)
1019	   serves as an identifier, while the Routing Stuff plays the role of a
1020	   locator.

1022	   Having a clear separation between the locator and the identifier
1023	   portion of an address appears to provide protocols some additional
1024	   flexibility.  Once a packet has been delivered to its intended
1025	   destination interface (i.e., node), for example, the locator has
1026	   served its purpose and is no longer needed to further demultiplex a
1027	   packet to its higher-layer end point.  This means that if a packet is
1028	   delivered to the correct destination node (that is, the identifier
1029	   carried in the packet address matches one of the node's interface
1030	   identifiers), the node will accept the packet, regardless of how the
1031	   packet got there.  The exact locator used does not matter, within
1032	   most Internet circumstances, so long as it gets the packet delivered
1033	   to its proper destination.

1035	   The most obvious example that could benefit from the separation of
1036	   locators and identifiers involves communication with a mobile host.
1037	   Transport protocols such as TCP are unable to keep connections open
1038	   if either of the two endpoint identifiers for an open connection
1039	   changes.  Fundamentally, the endpoint identifiers indicate the two
1040	   endpoint entities that are communicating.  If a node were to receive
1041	   a packet from a node with which it had been communicating previously,
1042	   but the identifier used by the sending node has changed, the
1043	   recipient would be unable to distinguish this case from that of a
1044	   packet received from a completely different node.

1046	   In the specific case of TCP and IPv4, connections are identified
1047	   uniquely by the tuple: (srcIPaddr, dstIPaddr, srcport, dstport).
1048	   Because IPv4 addresses contain a combined locator/identifier, it is
1049	   not possible to have a node's location change without also having its
1050	   identifier change.  Consequently, when a mobile node moves, its
1051	   existing connections no longer work, in the absence of special
1052	   protocols such as Mobile IP [MOBILITY].

1054	   In contrast, connections in GSE are identified by the ESDs rather
1055	   than full IPv6 addresses.  That is, connections are identified
1056	   uniquely by the tuple: (srcESD, dstESD, srcport, dstport).
1057	   Consequently, when demultiplexing incoming packets to their proper
1058	   end point, TCP would ignore the Routing Stuff portions of addresses.
1059	   Because the Routing Stuff portion of an address is ignored during
1060	   demultiplexing operations, a mobile node is free to move -- and
1061	   change its Routing Stuff -- without changing its identification.
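
   The difference between the two connection tuples can be sketched in
   Python as follows.  This is an illustration only, not part of the
   GSE proposal: it assumes that the ESD occupies the low-order 8 bytes
   of the 16-byte address, and the function names, port numbers and
   2001:db8::/32 documentation addresses are hypothetical.

```python
import ipaddress


def esd(addr):
    """Return the ESD, assumed here to be the low-order 8 bytes."""
    return ipaddress.IPv6Address(addr).packed[8:]


def full_address_key(src, dst, sport, dport):
    """IPv4-style demultiplexing: whole addresses are part of the key."""
    return (ipaddress.ip_address(src).packed,
            ipaddress.ip_address(dst).packed, sport, dport)


def esd_key(src, dst, sport, dport):
    """GSE-style demultiplexing: the Routing Stuff is ignored."""
    return (esd(src), esd(dst), sport, dport)


# A hypothetical mobile node keeps its ESD but changes its Routing Stuff.
SERVER = "2001:db8:cccc:3:0:0:0:1"
BEFORE_MOVE = "2001:db8:aaaa:1:1234:5678:9abc:def0"
AFTER_MOVE = "2001:db8:bbbb:2:1234:5678:9abc:def0"

# True: the ESD-based key is unchanged, so the connection survives.
same_under_gse = (esd_key(BEFORE_MOVE, SERVER, 40000, 80)
                  == esd_key(AFTER_MOVE, SERVER, 40000, 80))
# False: the full-address key changes, so the connection is lost.
same_under_full = (full_address_key(BEFORE_MOVE, SERVER, 40000, 80)
                   == full_address_key(AFTER_MOVE, SERVER, 40000, 80))
```

   The same comparison applies to a multi-homed site whose border
   routers insert different RGs into outgoing source addresses.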

1063	   As a side note, it is a requirement in GSE that packets be
1064	   demultiplexed to higher layer endpoints on ESDs alone independent of
1065	   the Routing Stuff.  If a site is multi-homed, the packets it sends
1066	   may exit the site at different egress border routers during the
1067	   lifetime of a connection.  Because each border router will place its
1068	   own RG into the source addresses of outgoing packets, the receiving
1069	   TCP must ignore (at least) the RG portion of addresses when
1070	   demultiplexing received packets.  The alternative would make TCP
1071	   unable to cope with common routing changes, i.e., if the path
1072	   changed, packets delivered correctly would be discarded by the
1073	   receiving TCP rather than accepted.

1075	   Not surprisingly, having separate locators and identifiers in
1076	   addresses leads to additional problems as well.  First, an identifier
1077	   by itself provides only limited value.  In order to actually deliver
1078	   packets to a destination identifier, a corresponding locator must be
1079	   known.  The general problem of mapping identifiers into locators is
1080	   non-trivial to solve, and is the topic of the next Section.  Second,
1081	   because the Routing Stuff is ignored when packets are demultiplexed
1082	   upward in the protocol stack, it becomes much easier for an intruder
1083	   to masquerade as someone else.

1085	5.2.  Mapping an Identifier to a Locator

1087	   The idea of using addresses that cleanly separate location and
1088	   identification information is not new.  However, there are several
1089	   different flavors.  In its pure form, a sender need only know the
1090	   identifier of an end-point in order to send packets to it.  When
1091	   presented with a datagram to send, network software would be
1092	   responsible for determining the locator associated with an identifier
1093	   so that the packet can be delivered.  A key question is: "who is
1094	   responsible for finding the Routing Stuff associated with a given
1095	   identifier?"  There are a number of possibilities, each with a
1096	   different set of implications:

1098	     1) The network layer could be responsible for doing the mapping.
1099	        The advantage of such a system is that an ESD could be stored
1100	        essentially forever (e.g., in configuration files), but whenever
1101	        it is actually used, network layer software would automatically
1102	        perform the mapping to determine the appropriate Routing Stuff
1103	        for the destination.  Likewise, should an existing mapping
1104	        become invalid, network layer software could dynamically
1105	        determine the updated value.  Unfortunately, building such a
1106	        mapping mechanism that scales is difficult if not impossible
1107	        with a flat identifier space (e.g., the ESD identifier).

1109	     2) The transport layer could be responsible for doing the mapping.
1110	        It could perform the mapping when a connection is first opened,
1111	        periodically refreshing the binding for long-running
1112	        connections.  Implementing such a scheme would change the
1113	        existing transport layer protocols TCP and UDP significantly.
1114	        However, in the case of TCP, such a scheme would have the
1115	        benefit that applications would probably not need to be
1116	        modified.  For UDP-based applications, this may not hold, since
1117	        most UDP-based protocols are implemented within applications.

1119	     3) Higher-layer software (e.g., the application itself) could be
1120	        responsible for performing the mapping.  This potentially
1121	        increases the burden on application programmers significantly,
1122	        especially if long-running connections are required to survive
1123	        renumbering and/or deal with mobile nodes.

1125	   The GSE proposal uses the last approach.  The network and transport
1126	   layers are always presented with both the Routing Stuff (RG + STP)
1127	   and the ESD together in one IPv6 address.  It is neither of these
1128	   layers' jobs to determine the Routing Stuff given only the ESD or to
1129	   validate that the Routing Stuff is correct.  When an application has
1130	   data to send, it queries the DNS to obtain the IPv6 AAAA record for a
1131	   destination.  The returned AAAA record contains both the Routing
1132	   Stuff and the ESD of the specified destination.  While such an
1133	   approach eliminates the need for the lower layers to be able to map
1134	   ESDs into corresponding Routing Stuff, it also means that when
1135	   presented with an address containing an incorrect (i.e., no longer
1136	   valid) Routing Stuff, the network is unable to deliver the packet to
1137	   its correct destination.  Note that addresses containing invalid
1138	   Routing Stuff will result whenever cached addresses are used
1139	   after the Routing Stuff of the address becomes invalid.  This may
1140	   happen if addresses are stored in configuration files, a mobile node
1141	   moves to a new location, long-running applications (clients and
1142	   servers) cache the result of DNS queries, a long-running connection
1143	   attempts to continue operating during a site renumbering event, etc.
1144	   Whatever the causes, the failures are fundamentally due to dynamic
1145	   topological changes at the network layer, yet in GSE such failures
1146	   are left to be dealt with at the application level (through DNS),
1147	   because neither the transport nor the network level has the ability
1148	   to re-map identifiers to corresponding locators.                       |

1150	   To avoid the above problem, a network architecture must provide the
1151	   ability to map an identifier to a locator.  In IPv4, this mapping is
1152	   trivial, since the identifier and locator are combined in a single
1153	   quantity (i.e., the IPv4 address).  GSE does not provide this mapping
1154	   functionality directly.  Instead, GSE assumes that a node's DNS name
1155	   serves as its stable identifier, and uses normal DNS queries to map
1156	   the DNS "identifier" into an IPv6 address.  The IPv6 address contains
1157	   both the ESD identifier and its Routing Stuff, that is, an initial
1158	   binding/mapping between the identifier and locator.  When this
1159	   binding breaks (for example, due to dynamic topological changes),
1160	   the ESD identifier cannot be mapped into a new locator by itself.
1161	   Instead, one must fall back to the application level and hope that
1162	   another DNS query will repair the broken binding between identifier
1163	   and locator that is needed for network delivery.

1165	   The use of DNS to provide identifier to locator mapping contributes
1166	   to GSE's apparent simplicity.  However, there are two fundamental
1167	   problems with this approach, if the intention is to make it
1168	   transparently easy to change locators over time.  First, the burden
1169	   of performing the mapping from identifier to locator is placed
1170	   directly on the application, because lower layers (i.e., transport
1171	   and network layers) cannot perform the mapping themselves due to
1172	   layering violation concerns (i.e., TCP and UDP can't perform a DNS
1173	   query).  Second, following all RG changes the DNS database must be
1174	   promptly updated and all expired information must be flushed out of
1175	   all DNS caches.  This stringent timing requirement imposed by lower
1176	   level operation would represent a departure from the original DNS
1177	   design, which provides DNS names to address mappings that only change
1178	   slowly over time if at all, and which relies heavily on caching over
1179	   relatively long time periods to scale well.
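
   The burden this places on applications can be sketched as follows.
   The sketch is a toy model under stated assumptions, not an
   implementation of GSE: the DNS is modelled as a plain dictionary, an
   address as a (Routing Stuff, ESD) pair, and the class names Network
   and Application are hypothetical.

```python
class DeliveryError(Exception):
    """Raised when an address carries Routing Stuff that is not valid."""


class Network:
    """Toy network: delivers only to addresses it currently knows."""

    def __init__(self, valid_addresses):
        self.valid = set(valid_addresses)

    def deliver(self, address, payload):
        if address not in self.valid:
            raise DeliveryError(address)
        return address, payload


class Application:
    """Caches the (Routing Stuff, ESD) pair returned by a name lookup."""

    def __init__(self, name, dns, network):
        self.name, self.dns, self.network = name, dns, network
        self.address = dns[name]      # initial binding obtained via DNS
        self.lookups = 1

    def send(self, payload):
        try:
            return self.network.deliver(self.address, payload)
        except DeliveryError:
            # Lower layers cannot re-map the ESD, so the application
            # itself must repeat the name lookup and retry.
            self.address = self.dns[self.name]
            self.lookups += 1
            return self.network.deliver(self.address, payload)
```

   If the name lookup still returns the old record (for example from a
   cache that has not yet expired), the retry fails as well, which is
   the second problem described above.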

1181	   The following subsections discuss a number of issues related to
1182	   keeping track of or determining the locator associated with an
1183	   identifier.

1185	5.2.1.  Scalable Mapping of Identifiers to Locators

1187	   It is not difficult to construct a mapping from an identifier (such
1188	   as an ESD) to a locator (as well as other information such as a name,
1189	   cryptographic keys, etc.) provided one can structure the identifier
1190	   space appropriately to support scalable lookups.  In particular,
1191	   identifiers must have sufficient structure to support the delegating
1192	   mechanism of a distributed database such as DNS.  On the other hand,
1193	   no scalable mechanism is known for performing such a mapping on
1194	   arbitrary identifiers taken from a flat space lacking any structure.

1196	   Imposing a hierarchy on identifiers poses the following difficulties:

1198	      - It increases the size of the identifier.  The exact size
1199	        necessary to support sufficient hierarchy is unclear, though it
1200	        is likely to be roughly the same as that used for the routing
1201	        hierarchy.  Analysis done during the original IPng debates
1202	        [RFC1752] suggests that close to 48 bits of hierarchy are needed
1203	        to identify all the possible sites 30-40 years from now.

1205	      - The assignment of identifiers must be tied to the delegation
1206	        structure.  That is, the site that "owns" an identifier is the
1207	        one responsible for maintaining the identifier-to-locator
1208	        mapping information about it.

1210	      - Due to the requirement of tying an identifier to the
1211	        delegation structure, the identifier of a node cannot be burned
1212	        in during manufacturing.  Instead a mechanism is needed to allow
1213	        a node to learn its identifier.  To be practical, such a
1214	        mechanism would need to be automated and avoid the need for
1215	        manual configuration.
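
   To illustrate why structure matters, the following sketch resolves a
   hierarchical identifier by following delegations, in the manner of a
   distributed database such as DNS.  It is an assumption-laden toy:
   the Zone class, the representation of an identifier as a tuple of
   labels, and the locator strings are all hypothetical.

```python
class Zone:
    """One delegation point in a hypothetical identifier hierarchy."""

    def __init__(self):
        self.children = {}   # label -> Zone delegated to another authority
        self.records = {}    # final label -> locator kept by this zone


def register(root, labels, locator):
    """Store a locator under a hierarchical identifier (tuple of labels)."""
    zone = root
    for label in labels[:-1]:
        zone = zone.children.setdefault(label, Zone())
    zone.records[labels[-1]] = locator


def resolve(root, labels):
    """Follow delegations label by label.

    Returns (locator or None, number of zones consulted).
    """
    zone, consulted = root, 1
    for label in labels[:-1]:
        zone = zone.children.get(label)
        if zone is None:
            return None, consulted
        consulted += 1
    return zone.records.get(labels[-1]), consulted
```

   Each zone holds only its own records, and a lookup touches one zone
   per level.  A flat identifier offers no labels to delegate on, so
   this approach has nothing to work with.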

1217	5.2.2.  Insufficient Hierarchy Space in ESDs

1219	   In the case of GSE's 8-byte ESD, the size of the identifier is not
1220	   large enough to contain sufficient hierarchy to both create DNS-like
1221	   delegation points and support stateless address autoconfiguration.
1222	   Stateless address autoconfiguration [RFC1971] already assumes that an
1223	   interface's 6-byte link-layer (i.e., MAC) address can be appended to
1224	   a link's routing prefix to produce a globally unique IPv6 address.
1225	   With GSE, only two bytes would be available for hierarchy and
1226	   delegation.
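
   The arithmetic behind this observation is shown below.  The variable
   names are illustrative, and the 48-bit figure is the rough estimate
   quoted in Section 5.2.1 rather than a firm requirement.

```python
ESD_BYTES = 8            # size of the GSE ESD
LINK_LAYER_BYTES = 6     # MAC address consumed by autoconfiguration
NEEDED_BITS = 48         # hierarchy estimate cited in Section 5.2.1

available_bits = (ESD_BYTES - LINK_LAYER_BYTES) * 8
delegation_values = 2 ** available_bits
shortfall_bits = NEEDED_BITS - available_bits

print(available_bits, delegation_values, shortfall_bits)
# prints: 16 65536 32
```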

1228	   It is also the case that the sorts of built-in identifiers now found
1229	   in computing hardware, such as "EUI-48" and "EUI-64" addresses
1230	   [IEEE802, IEEE1212], do not have the structure required for this
1231	   delegation.  Such identifiers have only two levels of hierarchy; the
1232	   top-level typically identifies a manufacturer, with the remaining
1233	   part of the address being the equivalent of the serial number unique
1234	   to the manufacturer.  The delegation of the two-level hierarchy
1235	   (i.e., equipment manufacturer) does not correspond to the
1236	   administrator under which the end-user operates.  Hence, stateless
1237	   autoconfiguration [RFC1971] cannot create addresses with the
1238	   necessary hierarchical property in the ESD portion of an address.

1240	   Finally, imposing a required hierarchical structure on identifiers
1241	   such as an ESD would also introduce a new administrative burden and a
1242	   new or expanded registry system to manage ESD space (i.e., to ensure
1243	   that ESDs are globally unique).  While the procedures for assigning
1244	   ESDs, which need only organizational and not topological
1245	   significance, would be simpler than the procedures for managing IPv4
1246	   addresses, it seems a laudable goal to avoid the problem altogether
1247	   if possible.  In addition, it would likely increase the complexity
1248	   of connecting new nodes to the Internet, an outcome inconsistent with
1249	   the goals of stateless address autoconfiguration [RFC1971].

1251	   The topic of mapping full 16-byte GSE addresses to a locator or other
1252	   information is discussed in Appendix D.

1254	5.3.  Authentication of Identifiers

1256	   The true value of a globally unique identifier lies not in its
1257	   uniqueness but in the ability to use the same identifier repeatedly
1258	   and have it refer to the same end point.  That is, there is an
1259	   expectation that repeated and subsequent use of the same identifier
1260	   results in continued communication with the same end point.  To be
1261	   useful then, a valid identifier must either be easily distinguishable
1262	   from a fraudulent one, or the system must have a way to prevent
1263	   identifiers from being used in an unauthorized manner.

1265	   The remainder of this section discusses how identifier authentication
1266	   is done in both IPv4 and GSE, and shows how overloading an address
1267	   with both an identifier and a locator provides a significant
1268	   automatic identifier authentication.  In contrast, there is
1269	   essentially no identifier authentication in GSE.  It should be noted
1270	   that the actual strength of authentication that would be considered
1271	   sufficient is a topic in its own right, and we do not cover it here.
1272	   Instead, we focus on the relative strengths in the two schemes.

1274	   The following discussion assumes an absence of cryptographic           |
1275	   authentication to bind an identifier to an end site. Many of the       |
1276	   concerns described below would become non-issues if an appropriate     |
1277	   cryptographic infrastructure were available. Section 5.5 discusses     |
1278	   this issue in more detail.                                             |

1280	5.3.1.  Identifier Authentication in IPv4

1282	   As described earlier, an IPv4 address simultaneously plays two roles:
1283	   a unique identifier and a locator.  Using an overloaded address as an
1284	   identifier has the side-effect of ensuring that (for all practical
1285	   purposes) the identifier is globally unique.  Furthermore, because
1286	   the same number is used both to identify an interface and to deliver
1287	   data to that interface, it is impossible for some interface A to use
1288	   the identification of another interface B in an attempt to receive
1289	   data destined to B without being detected, unless the routing system
1290	   is compromised.                                                        |

1292	   When both interfaces A and B claim the same unicast address, an        |
1293	   (uncompromised) routing subsystem generally delivers packets to only   |
1294	   one of them.  The other node will quickly realize that something is    |
1295	   wrong (since communication using the duplicate address fails) and      |
1296	   take corrective actions, either correcting a misconfiguration or       |
1297	   otherwise detecting and thwarting the intruder.  To understand how     |
1298	   the routing subsystem prevents the same address from being used in     |
1299	   multiple locations, there are two cases to consider, depending on      |
1300	   whether the two interfaces using duplicate addresses are attached to   |
1301	   the same or to different links.

1303	   When two interfaces on the same link use the same address, a node
1304	   (host or router) sending traffic to the duplicate address will in
1305	   practice send all packets to one of the nodes.  On Ethernets, for
1306	   example, the sender will use ARP (or Neighbor Discovery in IPv6) to
1307	   determine the link-layer address corresponding to the destination
1308	   address.  When multiple ARP replies for the target IP address are
1309	   received, the most recently received response replaces whatever is
1310	   already in the cache.  Consequently, the destinations a node using a
1311	   duplicate IP address can communicate with depend on what its
1312	   neighboring nodes have in their ARP caches.  In most cases, such
1313	   communication failures become apparent relatively quickly, since it
1314	   is unlikely that communication can proceed correctly on both nodes.
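
   The cache behavior described above can be modelled in a few lines.
   This is a simplified sketch, not a description of any particular ARP
   implementation: the ArpCache class, the deliveries() helper and the
   link-layer address strings are hypothetical, and cache timeouts are
   ignored.

```python
class ArpCache:
    """Toy ARP cache in which the latest reply overwrites earlier ones."""

    def __init__(self):
        self.entries = {}

    def on_reply(self, ip, link_addr):
        # The most recently received response replaces the cached entry.
        self.entries[ip] = link_addr

    def lookup(self, ip):
        return self.entries.get(ip)


def deliveries(reply_order, ip, packets):
    """Count packets reaching each link-layer address that claims `ip`."""
    cache = ArpCache()
    for link_addr in reply_order:
        cache.on_reply(ip, link_addr)
    counts = {link_addr: 0 for link_addr in reply_order}
    for _ in range(packets):
        counts[cache.lookup(ip)] += 1
    return counts
```

   With two claimants for the same address, every packet from a given
   sender goes to whichever node answered last, and the other node
   receives nothing from that sender.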

1316	   It is also the case that a number of ARP implementations (e.g., BSD-
1317	   derived implementations) log warning messages when an ARP request is
1318	   received from a node using the same address as the machine receiving
1319	   the ARP request.

1321	   The previous discussion describes the operation of ARP in the absence  |
1322	   of intruders or other malicious users. ARP has a number of security    |
1323	   vulnerabilities that make it trivial for an intruder to intercept      |
1324	   traffic and selectively process traffic that traverses a link,         |
1325	   provided the intruder is attached to the link the traffic of interest  |
1326	   traverses. For example, an intruder could intercept all traffic to an  |
1327	   address by being the last to return an ARP response, and then          |
1328	   selectively relay the traffic (after examining and/or modifying it)    |
1329	   to its intended recipient. This is a classic man-in-the-middle         |
1330	   attack.                                                                |

1332	   When two interfaces on different links use the same address, the
1333	   routing subsystem generally delivers packets to only one of the nodes
1334	   because only one of the links has the right subnet corresponding to
1335	   the IP address.  Consequently, the node using the address on the
1336	   "wrong" link will generally never receive any packets sent to it and
1337	   will be unable to communicate with anyone.  For obvious reasons, this
1338	   condition is usually detected quickly.

1340	   It should be noted that although an address containing a combined
1341	   identifier and locator can be forged, the routing subsystem
1342	   significantly limits communication using the forged address.  First,
1343	   return traffic will be sent to the correct destination and not the
1344	   originator of the forged address.  This alone prevents certain types
1345	   of spoofing attacks.  For example, if a destination receives an
1346	   unexpected packet corresponding to a TCP connection that it is
1347	   unaware of, it may return a TCP segment resetting the connection.      |
1348	   Second, routers performing ingress filtering can refuse to forward
1349	   traffic claiming to originate from a source whose source address does  |
1350	   not match the expected addresses (from a topology perspective) for
1351	   sources located within a particular region [RFC2267].  To
1352	   effectively masquerade as someone else requires subverting the
1353	   intermediate routing subsystem.
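
   The ingress filtering check mentioned above amounts to a prefix
   test on the source address.  The following is a minimal sketch using
   Python's standard ipaddress module; the function name and the
   documentation prefixes are illustrative assumptions, not taken from
   [RFC2267].

```python
import ipaddress


def ingress_permits(source, expected_prefixes):
    """Return True if `source` falls within a prefix expected on this port."""
    src = ipaddress.ip_address(source)
    return any(src in ipaddress.ip_network(prefix)
               for prefix in expected_prefixes)


# A router facing a site that has been assigned 192.0.2.0/24.
SITE_PREFIXES = ["192.0.2.0/24"]

legitimate = ingress_permits("192.0.2.55", SITE_PREFIXES)    # True
forged = ingress_permits("198.51.100.7", SITE_PREFIXES)      # False
```

   The test works only because the locator is part of the address being
   checked.  It has no counterpart for an identifier that carries no
   topological information.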

1355	   To summarize, the routing subsystem in IPv4 provides a limited (but    |
1356	   quite significant) defense against arbitrary hijacking of packets to   |
1357	   an improper destination. We do not claim that this defense is          |
1358	   sufficient against all types of attacks by a determined intruder.      |
1359	   However, it does provide some degree of defense against accidental     |
1360	   misconfigurations (e.g., assigning an improper address to an           |
1361	   interface) and does erect hurdles that prevent an arbitrary node from  |
1362	   impersonating another node.  The more dangerous attack, subverting     |
1363	   the routing subsystem by injecting unauthorized routes, can be traced  |
1364	   and detected by appropriate tools.                                     |

1366	5.3.2.  Identifier Authentication in GSE

1368	   In GSE, it is not possible for the routing subsystem to provide any
1369	   enforcement on the authenticity of identifiers with respect to their
1370	   corresponding Routing Stuff, since the Routing Stuff and ESD portions  |
1371	   of an address are by definition completely orthogonal quantities.      |
1372	   Thus, even the limited protection offered by IPv4 is not immediately   |
1373	   available.                                                             |

1375	   An interesting question is whether any such protection is needed. One  |
1376	   argument is that address-based authentication is so inherently weak    |
1377	   as to be useless, thus the increased vulnerability of a GSE-like       |
1378	   scheme is not significant. Where authentication is desired, the use    |
1379	   of something based on cryptography is necessary (e.g., IPsec           |
1380	   [RFC2401]).                                                            |

1382	   There are at least two arguments against this line of thought.         |
1383	   First, the lack of protection comparable to IPv4 may lead to a new     |
1384	   set of (poorly understood) security threats; Section 5.5 below         |
1385	   describes one possible threat. These threats must be dealt with at     |
1386	   the transport (or lower) layer because the threats are to the          |
1387	   integrity of the transport layer itself. Attempting to solve them at   |
1388	   higher-layers (e.g., via IPsec [RFC2401] and IKE [RFC2409]) results    |
1389	   in a potential layering circularity, where the security mechanisms     |
1390	   rely on a correctly functioning transport, but the transport relies    |
1391	   on those same security mechanisms to provide a service. Whether such   |
1392	   a mechanism can be designed is an area of future work.                 |

1394	   Second, requiring that basic threats to the transport layer be dealt   |
1395	   with using cryptographic techniques significantly increases the cost   |
1396	   of formerly simple packet exchanges. Cryptographic security is no      |
1397	   longer a choice an application can make, but quite possibly a          |
1398	   requirement to protect against certain types of attacks. Thus, the     |
1399	   cost of deploying effective defenses against a new class of denial of  |
1400	   service attacks may be quite significant.

1402	5.4.  Transport Layer: What Locator Should Be Used?

1404	   In the following, we focus on what Routing Stuff to use with TCP; UDP
1405	   depends on the Routing Stuff in a similar way.  Indeed, we believe
1406	   that TCP is the "easier" case to deal with, for two reasons.  First,
1407	   TCP is a stateful protocol in which both ends of the connection can
1408	   negotiate with each other.  UDP-based communications are stateless,
1409	   and remember nothing from one packet to the next.  Consequently,
1410	   changing UDP to remember locator information in addition to the
1411	   identifier of the peer may require the introduction of "session"
1412	   features, perhaps as part of a common "library".  Second, changes to
1413	   UDP in practice mean changing individual applications themselves,
1414	   raising deployability questions.

1416	   There are three cases of interest from TCP's perspective:

1418	    - the sending side of an active open

1420	    - the sending side of a passive open (i.e., how to respond to an
1421	      active open)

1423	    - changes to the Routing Stuff during an open connection.

1425	5.4.1.  RG Selection On An Active Open

1427	   If the host is performing a TCP "active open", the application first
1428	   queries the DNS to obtain the destination address, which contains the
1429	   appropriate RG for the remote peer.  That is, the initiator of
1430	   communication is assumed to provide the correct Routing Stuff when
1431	   initiating communication to a specific destination.

1433	5.4.2.  RG Selection On A Passive Open

1435	   When a server passively accepts connections from arbitrary clients,
1436	   it has no choice but to assume that the Routing Stuff in the source
1437	   address of a received packet that initiated the communication is
1438	   correct, because it has no way to authenticate its validity.  Note
1439	   that the Routing Stuff is "correct" only in the sense that it
1440	   corresponds to the site originating the connection, which the server
1441	   will send the reply to.  Whether the Routing Stuff paired with the
1442	   received ESD actually matches the Routing Stuff located at the site
1443	   where the legitimate owner of the ESD currently resides is not known
1444	   and cannot be determined.  Because the ESD alone cannot be mapped
1445	   into a locator (or some other quantity that can provide input to an
1446	   authentication procedure), there is no way to determine whether the
1447	   received Routing Stuff corresponds to that legitimately associated
1448	   with the source identifier of the received packet.  The issue of
1449	   spoofing is discussed in more detail later.

1451	5.4.3.  Mid-Connection RG Changes

1453	   While packets are flowing as part of an open connection, the RG
1454	   appearing on subsequent packets is susceptible to change through
1455	   renumbering events, or as a result of site-internal routing changes
1456	   that cause the egress point for off-site traffic to change.  It is
1457	   even possible that traffic-balancing schemes could result in the use
1458	   of two egress routers, with roughly every other packet exiting
1459	   through a different egress router.

1461	   Because TCP under GSE demultiplexes packets using only ESDs, newly
1462	   arrived packets will be delivered to the correct end-point regardless
1463	   of whether their source RG has changed.  The GSE proposal calls for
1464	   return traffic to continue to be sent via the "old" RG, even though
1465	   it may have been deprecated or become less optimal because the peer's
1466	   border router has changed.  That is, the RG to use for reaching a
1467	   peer is bound to a connection when the connection is established and
1468	   does not change thereafter.  However, the completion of renumbering
1469	   events (so that an earlier RG is now invalid) and certain topology
1470	   changes would require TCP to switch sending to a new RG mid-
1471	   connection.  To explore the scenario, we consider ways of allowing
1472	   the RG change to be made to existing established connections.

1474	   If TCP connection identifiers are based on ESDs rather than full
1475	   addresses, traffic from the same ESD would be viewed as coming from
1476	   the same peer, regardless of the source RG.  Because this
1477	   vulnerability is already present in today's Internet (forging the
1478	   source address of a packet is trivial), the mere delivery of incoming
1479	   datagrams with the same ESD but a different RG does not introduce new
1480	   vulnerability to TCP.  In today's Internet, any node can already
1481	   originate FINs/RSTs from an arbitrary source address and potentially
1482	   or definitely disrupt the connection.  Therefore, acceptance of
1483	   traffic independent of its source RG does not appear to significantly
1484	   worsen existing robustness.  Note, however, that ingress filtering as
1485	   described in Section 5.3.1, cannot be performed on packets containing
1486	   GSE addresses.  This does make it more difficult to prevent certain
1487	   types of attacks.

1489	   We also considered allowing TCP to reply to each segment using the RG
1490	   of the most recently-received segment.  Although this allows TCP
1491	   connections to survive certain important events (e.g., renumbering),
1492	   it also makes it trivial for anyone to hijack connections,
1493	   unacceptably weakening robustness compared with today's Internet.  A
1494	   sender simply needs to guess the sequence numbers in use by a given
1495	   TCP connection [Bellovin 89] and send traffic with a bogus RG to
1496	   hijack a connection to an intruder at an arbitrary location.

1498	   Providing protection from hijacking implies that the RG used to send
1499	   packets must be bound to a connection end-point (e.g., it is part of
1500	   the connection state).  Although it may be reasonable to accept
1501	   incoming traffic independent of the source RG, the choice of sending
1502	   RG requires more careful consideration.  Indeed, any subsequent
1503	   change in the RG used for sending traffic must be properly
1504	   authenticated (e.g., using cryptographic means).  In the GSE
1505	   proposal, there is no apparent way to authenticate such a change, since
1506	   the remote peer doesn't even know its own RG.  Consequently, the only
1507	   reasonable approach in GSE is to send to the peer using the first RG
1508	   used for the entire life of a connection.  That is, always use the
1509	   first RG seen, and accept the loss of connectivity whenever the RG
1510	   changes.

1512	5.4.4.  The Impact of Corrupted Routing Goop

1514	   Another interesting issue that arises is what impact corrupted RG      |
1515	   would have on robustness, given that there is no IPv6 header checksum  |
1516	   that could help detect a corrupted source address field.  Because the  |
1517	   RG is not covered by the TCP checksum (the sender doesn't know what    |
1518	   source RG will be inserted), no TCP mechanism can detect such          |
1519	   corruption at the receiver.  Moreover, once a specific RG is in use,   |
1520	   it does not change for the duration of a connection.  One interesting  |
1521	   case occurs on the passive side of a TCP connection, where a server    |
1522	   accepts incoming connections from remote clients.  If the initial SYN  |
1523	   from the client includes a corrupted RG, the server TCP will create a  |
1524	   TCP connection (in the SYN-RECEIVED state) and cache the corrupted RG  |
1525	   with the connection.  The second packet of the 3-way handshake, the    |
1526	   SYN-ACK packet, would be sent to the wrong RG and consequently not     |
1527	   reach the correct destination.  Later, when the client retransmits     |
1528	   the unacknowledged SYN, the server will continue to send the SYN-ACK   |
1529	   using the bad RG.  Eventually the client times out, and the attempt    |
1530	   to open a TCP connection fails.

1532	   We next consider relaxing the restriction on switching RGs in an
1533	   attempt to avoid the previous failure scenario.  The situation is
1534	   complicated by the fact that the RG on received packets may change
1535	   for legitimate reasons (e.g., a multi-homed site load-shares traffic
1536	   across multiple border routers).  The key question is how one can
1537	   determine which RG is valid and which is not.  That is, for each of
1538	   the destination RGs a sender attempts to use, how can it determine
1539	   which RG worked and which did not? Solving this problem is more
1540	   difficult than first appears, since one must cover the cases of
1541	   delayed segments, lost segments, simultaneous opens, etc.  If a SYN-
1542	   ACK is retransmitted using different RGs, it is not possible to
1543	   determine which of the two RGs worked correctly.  We conclude that
1544	   the only way TCP can determine that a particular RG is correct is by
1545	   receiving an ACK for a specific sequence number in which all
1546	   transmissions of that sequence number used the same RG.  This would
1547	   involve non-trivial changes to TCP implementations.
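
   The acceptance rule stated above can be sketched as follows.  This
   is a minimal illustration only, not part of the GSE proposal: the
   class and method names are hypothetical, and RGs are represented as
   opaque strings.

```python
from collections import defaultdict


class RgValidator:
    """Hypothetical helper: tracks which RGs were used per sequence number."""

    def __init__(self):
        # sequence number -> set of destination RGs used to transmit it
        self._rgs_by_seq = defaultdict(set)

    def record_transmission(self, seq, rg):
        """Note that sequence number `seq` was (re)transmitted via `rg`."""
        self._rgs_by_seq[seq].add(rg)

    def validated_rg(self, acked_seq):
        """Return the RG proven to work by an ACK of `acked_seq`, else None.

        An RG is considered validated only if every transmission of the
        acknowledged sequence number used that same RG.
        """
        rgs = self._rgs_by_seq.get(acked_seq, set())
        if len(rgs) == 1:
            return next(iter(rgs))
        return None  # never sent, or sent via several RGs: ambiguous
```

   A SYN-ACK retransmitted via two different RGs leaves the result
   ambiguous (None), which is the case described in the text.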

1549	   At best, an RG selection algorithm for TCP would require new logic in
1550	   implementations of TCP's opening handshake --- a significant
1551	   transition and deployment issue.  We are not certain that a valid
1552	   algorithm is attainable, however.  RG changes would have to be
1553	   handled in all cases handled by the opening handshake: delayed
1554	   segments, lost segments, undetected bit errors in RG, simultaneous
1555	   opens, old segments, etc.

1557	   In the end, we conclude that although the corrupted SYN case
1558	   introduces potential problems, the changes that would need to be made
1559	   to TCP to robustly deal with such corruption would be significant, if
1560	   tractable at all.  This would result in a transition to GSE also
1561	   having a significant TCPng component, a significant drawback.

1563	5.5.  On The Uniqueness Of ESDs

1565	   Although ESDs are expected to be globally unique, their uniqueness
1566	   property may be violated either due to mistakes in allocation or by
1567	   malicious attacks.  The exact uniqueness requirements for ESDs
1568	   depend on what purpose they serve and how they are used.  If the
1569	   correctness of some applications relies on the global uniqueness of
1570	   ESDs, then active checking and enforcement will be necessary.  On the
1571	   other hand, if ESDs are used only to uniquely identify individual
1572	   endpoints within a session, then one may consider global uniqueness
1573	   as unnecessary.

1575	5.5.1.  Impact of Duplicate ESDs

1577	   Consider what happens when two nodes using the same ESD attempt to     |
1578	   communicate with each other.  In the GSE proposal, a node queries the  |
1579	   DNS to obtain an IPv6 address.  The returned address includes the      |
1580	   Routing Stuff of an address (the RG+STP portions).  At this point,     |
1581	   the sender might notice the destination ESD is the same as its own     |
1582	   ESD and indicate an error. If it doesn't check, however, it may well   |
1583	   forward the packet to a router that delivers the packet to its         |
1584	   correct destination (using the information in the Routing Stuff).  On  |
1585	   receipt of the packet, again, the destination node could examine the   |
1586	   ESD portion of the source address and determine that it is the same    |
1587	   as its own and indicate an error. Alternatively, it could just         |
1588	   process the packet without detecting the duplication and               |
1589	   communication would proceed as normal (unless there are port number    |
1590	   conflicts due to the sender and receiver allocating port numbers from  |
1591	   the same name space).

1593	   A more problematic case occurs if two nodes having the same ESD
1594	   communicate with a third party.  To the third party, packets received
1595	   from either machine might appear to be coming from the same machine
1596	   since they all carry the same ESD.  Consequently, at the transport
1597	   level, if both machines choose the same source and destination port
1598	   numbers (one of the ports --- a server's well-known port number ---
1599	   will likely be the same), packets belonging to two distinct transport
1600	   connections will be demultiplexed to a single transport end-point.
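
   The collision described above can be illustrated with a small
   sketch.  It assumes, for illustration only, that a transport
   end-point is selected by the tuple (source ESD, source port,
   destination port) and that the Routing Stuff is ignored; the names
   and string-valued fields are hypothetical.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src_rg: str    # Routing Stuff of the source address (ignored by demux)
    src_esd: str   # ESD portion of the source address
    src_port: int
    dst_port: int


def demux_key(pkt):
    """Key used to select the transport end-point; excludes the RG."""
    return (pkt.src_esd, pkt.src_port, pkt.dst_port)


def demultiplex(packets):
    """Group packets by end-point key, as an ESD-only demultiplexer would."""
    endpoints = {}
    for pkt in packets:
        endpoints.setdefault(demux_key(pkt), []).append(pkt)
    return endpoints
```

   Two hosts at different locations that share an ESD and happen to
   pick the same port pair yield a single key, so their packets land
   on one end-point.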

1602	   When packets from different sources using the same source ESD are
1603	   delivered to the same transport end-point, a number of possibilities
1604	   come to mind:

1606	     1) Following the GSE specification, the transport end-point would
1607	        accept the packet, without regard to the Routing Stuff of the
1608	        source address.  This may lead to a number of robustness
1609	        problems (and at best will confuse the application).

1611	     2) The transport end-point could verify that the Routing Stuff of
1612	        the source address matches one of a set of expected values
1613	        before processing the packet further.  If the Routing Stuff
1614	        doesn't match any expected value, the packet could be dropped.
1615	        This would result in a connection from one host operating
1616	        correctly, while a connection from another host (using the same
1617	        ESD) would fail.

1619	     3) When a packet is received with an unexpected Routing Stuff the
1620	        receiver could invoke special-purpose code to deal with this
1621	        case.  Possible actions include attempting to verify whether the
1622	        Routing Stuff is indeed correct (the saved values may have
1623	        expired) or attempting to verify whether duplicate ESDs are in
1624	        use (e.g., by inventing a protocol that sends packets using both
1625	        Routing Stuff and verifies that they are delivered to the same
1626	        end-point).

1628	5.5.2.  New Denial of Service Attacks

1630	   It is clear that there are potential problems if identifiers are not
1631	   globally unique.  How commonly such problems would occur in
1632	   practice depends on how many duplicates there actually are.  Thus,
1633	   one might be tempted to make the argument that a scheme for assigning
1634	   identifiers could be made to be "unique enough" in practice.  This
1635	   would be a dangerous and naive assumption, because in the absence of
1636	   any ESD enforcement (i.e., ensuring that each host uses only its
1637	   assigned ESD), intruders will actively impersonate other sites for
1638	   the sole purpose of invalidating the uniqueness assumption.  For
1639	   example, one could deny service to host foo.bar.com by querying the
1640	   DNS for its corresponding ESD, and then impersonating that ESD.

1642	   As a specific example, one GSE-specific denial-of-service attack
1643	   would be for an intruder to masquerade as another host and "wedge"
1644	   connections in a SYN-RECEIVED state by sending SYN segments
1645	   containing an invalid RG in the source IP address for a specific ESD.
1646	   Subsequent connection attempts to the wedged host from the legitimate
1647	   owner of the ESD (if they used the same TCP port numbers) would then
1648	   not complete, since return traffic would be sent to the wrong place.   |
1649	   Note that this attack is worse than the common syn-flood attack        |
1650	   because it not only ties up resources on the target machine, it        |
1651	   blocks out legitimate access to the target machine by a specific       |
1652	   third party.                                                           |

1654	   Another potential attack involves an intruder assuming the ESD of a    |
1655	   target site (e.g., mit.edu), then opening TCP connections using        |
1656	   mit.edu's ESD to a target server (e.g., big-server.com). Because the   |
1657	   RG would point back to the attacker, the attacker could create a       |
1658	   number of TCP connections in an OPEN state without needing to guess    |
1659	   the sequence numbers needed to complete a 3-way handshake. Once those  |
1660	   connections are open, it would be difficult to (automatically)         |
1661	   distinguish connections that are part of a denial-of-service           |
1662	   attack from those (idle) connections that are part of legitimate       |
1663	   activity.                                                              |

1665	   The previous discussion indicates that separating identifiers and      |
1666	   locators opens up new potential denial-of-service attack policies      |
1667	   that would need to be carefully studied. One way of addressing them    |
1668	   would be to have a way to authenticate the RG associated with an       |
1669	   identifier, as the attacks take advantage of the distinction between   |
1670	   identifiers and locators.

1672	5.6.  Summary of Identifier Authentication Issues

1674	   In summary, changing the RG dynamically in a safe way for a
1675	   connection requires that an originator of traffic be able to
1676	   authenticate a proposed change in the RG before sending to a
1677	   particular ESD via that RG.  This is difficult for several reasons:

1679	     1) It can't be done on an end-to-end basis in GSE (e.g., via IPsec)  |
1680	        because the sender doesn't know what value the RG portion of the  |
1681	        address will have when it reaches the receiver. This issue is     |
1682	        specific to GSE and other approaches in which the end node knows  |
1683	        its own RG would not automatically have this problem.             |

1685	     2) It can't be easily done in GSE using just the ESD because there   |
1686	        is no mechanism at or below the transport layer to map ESDs into  |
1687	        a quantity that can be used as a key to jump start the            |
1688	        authentication process (using the DNS would be problematic due    |
1689	        to layering circularity considerations).                          |

1691	     3) It is conceivable that one could send a "who are you" type        |
1692	        message to a peer asking it to return a more suitable identifier  |
1693	        that can be used to jump start the authentication process. This   |
1694	        additional information would include information needed to        |
1695	        obtain keys, certificates, etc. from an appropriate source that   |
1696	        can be used to verify proper use of an ESD by a particular node.  |
1697	        Note, however, that the "who are you" makes use of the full       |
1698	        address, not just the ESD portion.

1700	     4) Any scheme that uses the full IPv6 address to do the              |
1701	        authentication can be used with today's standard provider-based
1702	        addressing, raising the question of what benefit is retained
1703	        from having separate identifiers and locators.

1705	   Our final conclusion is that with the GSE approach, transport
1706	   protocol end-points must make an early, single choice of the RG to
1707	   use when sending to a peer and stick with that choice for the
1708	   duration of the connection.  Specifically:

1710	     1) The demultiplexing of arriving packets to their transport end
1711	        points should use only the ESD, and not the Routing Stuff.

1713	     2) If the application chooses an RG for the remote peer (i.e., an
1714	        active open), use the provided RG for all traffic sent to that
1715	        peer, even if alternative RGs are received on subsequent
1716	        incoming datagrams from the same ESD.  For all other cases, use
1717	        the first RG received with a given ESD for all sending.

1719	     3) Simultaneously, we understand that, with the above rules, there
1720	        are still open issues with regard to invalid RGs, either through
1721	        corruption or through active hostile attacks.
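
   Rules 1 and 2 above can be sketched as follows.  This is an
   illustrative model only: the class, its methods, and the use of an
   opaque key (standing for the remote ESD plus port numbers) are
   assumptions, not part of the GSE specification.

```python
class Endpoint:
    """Hypothetical transport end-point table that pins one sending RG."""

    def __init__(self):
        # connection key (remote ESD + ports, never the RG) -> pinned RG
        self._send_rg = {}

    def active_open(self, key, rg):
        """Rule 2, active open: the application supplies the remote RG."""
        self._send_rg[key] = rg

    def on_receive(self, key, src_rg):
        """Process an arriving packet and return the RG to reply to.

        Rule 1: the lookup uses only `key`; `src_rg` plays no part in it.
        Rule 2: if no RG is pinned yet, pin the first one received.
        Later packets carrying a different RG never change the pinned value.
        """
        return self._send_rg.setdefault(key, src_rg)
```

   The sketch also exposes the open issue in rule 3: if the first RG
   received on a passive open is corrupted or forged, every reply goes
   to that RG for the life of the connection.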

1723	   One difficulty with the above recommendation is that there does not
1724	   appear to be a straightforward way to use ESDs in conjunction with
1725	   mobility or site renumbering (in which existing connections survive
1726	   the renumbering).  This presents a quandary.  The main benefit of
1727	   separating identifiers and locators is the ability to have
1728	   communication (e.g., a TCP connection) continue transparently, even
1729	   when the Routing Stuff associated with a particular ESD changes.
1730	   However, switching to a new Routing Stuff without properly
1731	   authenticating it makes it trivial to hijack connections.

1733	   We cannot emphasize enough that the use of an ESD independent of an
1734	   associated RG can be very dangerous.  That is, communicating with a
1735	   peer implies that one is always talking to the same peer for the
1736	   duration of the communication.  But as has been described in previous
1737	   sections, such assurance can only come from properly authenticating    |
1738	   the RG associated with an ESD.  How to authenticate the RG associated  |
1739	   with an ESD in GSE does not appear to have a trivial solution and      |
1740	   remains an open problem.                                               |

1742	5.7.  The Need For Strong Authentication                                  |

1744	   The problems described earlier stem from an inability to verify        |
1745	   whether a particular RG is legitimately associated with an ESD. One    |
1746	   approach that would address this problem is to use cryptographic       |
1747	   techniques to verify the binding between RG and an ESD. There are two  |
1748	   cases to consider.                                                     |

1750	   First, for an existing connection, switching from one RG to another    |
1751	   risks the possibility of an intruder hijacking a connection.           |
1752	   Addressing this risk involves having one endpoint verify               |
1753	   (cryptographically) with its peer that the proposed RG is acceptable.  |
1754	   This requires only an ability to communicate with the peer using the   |
1755	   older (i.e., current) RG and using the older RG to verify the new RG.  |
1756	   For example, a node could send its peer a message requesting           |
1757	   cryptographic verification for a new RG prior to actually switching    |
1758	   to it. Such verification would not require a public key                |
1759	   infrastructure, as the purpose is not to verify that the legitimate    |
1760	   owner of the ESD approves use of the RG, but that the peer with which  |
1761	   one is currently communicating (and who is using a particular          |
1762	   ESD -- possibly illegally) approves switching to a different RG.       |
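
   One possible shape for such a verification exchange is sketched
   below.  It is an assumption-laden illustration, not a mechanism
   defined by GSE: it supposes that both peers already share a
   per-connection secret (how that secret is established is outside
   this sketch), and it uses an HMAC over a fresh challenge and the
   proposed RG.  All function names are hypothetical.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16  # fixed length keeps nonce || rg unambiguous


def make_challenge() -> bytes:
    """Verifier side: fresh random challenge, sent via the current RG."""
    return secrets.token_bytes(NONCE_LEN)


def approve_rg(session_key: bytes, nonce: bytes, new_rg: bytes) -> bytes:
    """Peer side: approve `new_rg` by tagging the challenge and the RG."""
    return hmac.new(session_key, nonce + new_rg, hashlib.sha256).digest()


def verify_rg(session_key: bytes, nonce: bytes, new_rg: bytes,
              tag: bytes) -> bool:
    """Verifier side: accept the switch only if the tag matches."""
    expected = hmac.new(session_key, nonce + new_rg, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

   As the text notes, a valid tag shows only that the current peer
   approves the new RG, not that the peer is the legitimate owner of
   the ESD.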

1764	   A more problematic case involves the wedging of connections as         |
1765	   described in Section 5.5.2. Here, an intruder improperly uses an       |
1766	   identifier legitimately belonging to someone else, denying the         |
1767	   legitimate owner service. Addressing this problem is more difficult.   |
1768	   One approach is to verify the RG associated with an identifier the     |
1769	   first time it is used. This would appear to require a global PKI       |
1770	   infrastructure (not available today) in which every potential node is  |
1771	   registered so that in the case of conflicts, it becomes possible to    |
1772	   determine the legitimate owner of an identifier.                       |

1774	   Another interesting question concerns at what layer such               |
1775	   cryptographic mechanisms would be needed. Ideally, the denial of       |
1776	   service threats must be dealt with at the transport (or lower) layer   |
1777	   because the threats are to the integrity of the transport layer        |
1778	   itself. Attempting to solve them at higher layers (e.g., via IPsec     |
1779	   and IKE) results in a potential layering circularity, where the        |
1780	   security mechanisms rely on a correctly functioning transport, but     |
1781	   the transport relies on those same security mechanisms to provide a    |
1782	   service. Further work is needed to determine whether such a mechanism  |
1783	   can be designed using IPsec.                                           |

1785	6.  Conclusion

1787	   The GSE proposal provides a concrete example of a network protocol
1788	   design that separates identifiers from locators in addresses.  In
1789	   this paper we compared GSE with IPv4's CIDR-style addressing to
1790	   better understand the pros and cons of the respective design
1791	   approaches.

1793	   Functionally speaking, identifiers and locators each have a logically
1794	   different role to play.  Thus overloading both in one field causes
1795	   problems whenever the location of a node changes but its identity
1796	   does not.  However, our analysis shows that overloading also presents  |
1797	   three critically important benefits.

1799	   First, for network entity A to send data to network entity B, A must
1800	   not only know B's end identifier but also B's locator.  No scalable
1801	   way is known at this time to provide this mapping at the network
1802	   layer, other than overloading the two quantities into an address as
1803	   is done in IPv4.  Fundamentally, a scalable mapping algorithm
1804	   strongly suggests that the identifier space be structured
1805	   hierarchically, yet identifiers in GSE are not sufficiently large to
1806	   both contain sufficient hierarchy and support stateless address
1807	   autoconfiguration.  Instead, GSE forces applications to supply up-
1808	   to-date locators.  However, relying on the locator provided at the
1809	   time communication is established as GSE does is inadequate when the
1810	   remote locator can change dynamically, precisely the scenario that is
1811	   supposed to benefit from the separation.  That is, the benefits of
1812	   separating the identifier from the locator are largely lost, if the
1813	   changes in the identifier to locator binding are not tracked quickly.

1815	   Second, when communicating with a remote site, if the RG changes       |
1816	   there begins to be uncertainty as to whether a reliable TCP handshake
1817	   is possible (because of the need for passively opened TCP to use the
1818	   RGs it obtains from the packets).  Because the reliability of TCP's
1819	   byte stream is critically dependent on its three-way handshake, this
1820	   is a significant issue.

1822	   Finally, when communicating with a remote site, a receiver must be
1823	   able to ensure (with reasonable certainty) that received data does
1824	   indeed come from the expected remote entity.  In IPv4, it is possible
1825	   to receive packets from a forged source, but the potential for
1826	   mischief between communicating peers is significantly limited because
1827	   return traffic will not generally reach the source of the forged
1828	   traffic.  That is, communication involving packets sent in both
1829	   directions will not succeed.  In contrast, architectures like GSE
1830	   that decouple the identifier and locator functions lose the built-in
1831	   protection available in classical IP and thus face great difficulty
1832	   assuring that traffic from a source identified only by an identifier
1833	   actually comes from the correct source.  Short of using cryptographic
1834	   techniques (e.g., IPsec), there is no known mechanism that can use an
1835	   identifier alone to perform this remote entity authentication.  Using
1836	   an identifier alone for authentication of received packets is
1837	   dangerously unsafe.

1839	   In summary, although overloading the address field with a combined
1840	   identifier and locator leads to difficulties in retaining the
1841	   identity of a node whenever its address changes, analysis in this
1842	   paper suggests that the benefit of the overloading actually
1843	   outweighs its cost.  Completely separating an identifier from its
1844	   locator renders the identifier untrustworthy, thus useless, in the
1845	   absence of an accompanying authentication system.

1847	7.  Security Considerations

1849	   The primary security consideration with GSE or, more generally, a
1850	   network layer with addresses split into locator and identifier parts,
1851	   is that of one node impersonating another by copying the
1852	   identification without the location.  Indeed, the main conclusion of
1853	   this paper is that a GSE-like addressing structure introduces new
1854	   security vulnerabilities that are not present in IP, and that those
1855	   problems are serious enough to question the benefits of an
1856	   architecture that separates locators and identifiers in addresses.

1858	8.  Acknowledgments

1860	   Thanks go to Steve Deering and Bob Hinden (the Chairs of the IPng
1861	   Working Group) as well as Sun Microsystems (the host for the interim
1862	   meeting) for the planning and execution of the interim meeting.
1863	   Thanks also go to Mike O'Dell for writing the 8+8 and GSE drafts; by
1864	   publishing these documents and speaking on their behalf, Mike was the
1865	   catalyst for some valuable discussions, both for IPv6 addressing and
1866	   for addressing architectures in general.  Special thanks to the
1867	   attendees of the interim meeting whose high caliber discussions
1868	   helped motivate and shape this document.

1870	9.  References

1872	     [ANYCAST] "Host Anycasting Service", C. Partridge, T. Mendez, & W.
1873	             Milliken, RFC 1546.

1875	     [BATES] Scalable support for multi-homed multi-provider
1876	             connectivity, Tony Bates & Yakov Rekhter, RFC 2260,
1877	             January, 1998.

1879	     [Bellovin 89] "Security Problems in the TCP/IP Protocol Suite",
1880	             Bellovin, Steve, Computer Communications Review, Vol. 19,
1881	             No. 2, pp32-48, April 1989.

1883	     [CIDR] "Classless Inter-Domain Routing (CIDR): an Address
1884	             Assignment and Aggregation Strategy". V. Fuller, T. Li, J.
1885	             Yu, & K. Varadhan, RFC 1519, September 1993.

1887	     [DHCP-DDNS] Interaction between DHCP and DNS, Internet Draft, Yakov
1888	             Rekhter, (Work in Progress.)

1890	     [DDNS] "Dynamic Updates in the Domain Name System (DNS UPDATE)",
1891	             Paul Vixie (Editor), RFC 2136, April, 1997.

1893	     [EUI64] 64-Bit Global Identifier Format Tutorial.
1894	             http://standards.ieee.org/db/oui/tutorials/EUI64.html.
1895	             Note: "EUI-64" is claimed as a trademark by an organization
1896	             which also forbids reference to itself in association with
1897	             that term in a standards document which is not their own,
1898	             unless they have approved that reference.  However, since
1899	             this document is not standards-track, it seems safe to name
1900	             that organization: the IEEE.

1902	     [GSE] "GSE - An Alternate Addressing Architecture for IPv6", Mike
1903	             O'Dell, (Work in progress).

1905	     [IEEE802] IEEE Std 802-1990, "Local and Metropolitan Area Networks:
1906	             IEEE Standard Overview and Architecture."

1908	     [IEEE1212] IEEE Std 1212-1994, "Information technology--
1909	             Microprocessor systems: Control and Status Registers (CSR)
1910	             Architecture for microcomputer buses."

1912	     [IPv6-ADDRESS] "An IPv6 Aggregatable Global Unicast Address
1913	             Format", R. Hinden, M. O'Dell, S. Deering, RFC 2374, July,
1914	             1998.

1916	     [MOBILITY] "IP Mobility Support", C. Perkins, RFC 2002, October,
1917	             1996.

1919	     [NAT] "IP Network Address Translator (NAT) Terminology and           |
1920	             Considerations", P. Srisuresh, M. Holdrege, RFC 2663,        |
1921	             August, 1999.                                                |

1923	     [RFC1752] "The Recommendation for the IP Next Generation Protocol",
1924	             S. Bradner, A. Mankin, RFC 1752, January, 1995.

1926	     [RFC1788] "ICMP Domain Name Messages", W. Simpson, RFC 1788, April,
1927	             1995.

1929	     [RFC1884] "IP Version 6 Addressing Architecture", R. Hinden & S.
1930	             Deering, Editors, RFC 1884.

1932	     [RFC1958] "Architectural Principles of the Internet", B. Carpenter,
1933	             RFC 1958, June, 1996.

1935	     [RFC1971] "IPv6 Stateless Address Autoconfiguration", S. Thomson,
1936	             T. Narten, RFC 1971, August, 1996.

1938	     [RFC2008] "Implications of Various Address Allocation Policies for
1939	             Internet Routing", Y. Rekhter, T. Li, RFC 2008, October
1940	             1996.

1942	     [RFC2073] An IPv6 Provider-Based Unicast Address Format.  Y.
1943	             Rekhter, P. Lothberg, R. Hinden, S. Deering, J. Postel. RFC
1944	             2073, January, 1997.

1946	     [RFC2267] Network Ingress Filtering: Defeating Denial of Service
1947	             Attacks which employ IP Source Address Spoofing, P.
1948	             Ferguson, D. Senie, RFC 2267, January, 1998.                 |

1950	     [RFC2401] Security Architecture for the Internet Protocol. S. Kent,  |
1951	             R.  Atkinson, RFC 2401, November 1998.                       |

1953	     [RFC2409] The Internet Key Exchange (IKE). D. Harkins, D. Carrel,    |
1954	             RFC 2409, November 1998.

1956	     [ROUTER-RENUM] "Router Renumbering for IPv6", M. Crawford, draft-
1957	             ietf-ipngwg-router-renum-06.txt.                             |

1959	     [SITE-PREFIXES] "Site prefixes in Neighbor Discovery", E. Nordmark,  |
1960	             draft-ietf-ipngwg-site-prefixes-03.txt.                      |

1962	10.  Authors' Addresses

1964	   Matt Crawford                           John Stewart
1965	   Fermilab MS 368                         Juniper Networks, Inc.
1966	   PO Box 500                              385 Ravendale Drive
1967	   Batavia, IL 60510 USA                   Mountain View, CA  94043
1968	   Phone: 630-840-3461                     Phone: +1 650 526 8000
1969	   EMail: crawdad@fnal.gov                 EMail: jstewart@juniper.net

1971	   Allison Mankin                          Lixia Zhang
1972	   USC/ISI                                 UCLA Computer Science Department
1973	   4350 North Fairfax Drive                4531G Boelter Hall
1974	   Suite 620                               Los Angeles, CA 90095-1596 USA
1975	   Arlington, VA  22203 USA                Phone: 310-825-2695
1976	   EMail: mankin@isi.edu                   EMail: lixia@cs.ucla.edu
1977	   Phone: 703-812-3706

1979	   Thomas Narten
1980	   IBM Corporation
1981	   3039 Cornwallis Ave.
1982	   PO Box 12195 - F11/502
1983	   Research Triangle Park, NC 27709-2195
1984	   Phone: 919-254-7798
1985	   EMail: narten@raleigh.ibm.com

1987	Appendix A: Increased Reliance on Domain Name System (DNS)

1989	   As we've discussed in previous sections, the motivation for
1990	   separating identifiers from locators in IP addresses is to allow the
1991	   locator portion to change more easily.  However, because GSE does not
1992	   provide a mapping from an ESD to its locator, whenever the locator
1993	   changes, GSE falls back on DNS to provide such a mapping.

1995	   Because any mapping scheme is complicated by renumbering, and because
1996	   recent IPv4 experience has shown a requirement for renumbering at
1997	   some frequency, it is worthwhile to explore the general renumbering
1998	   issue.

2000	A.1: Renumbering and DNS: How Frequently Can We Renumber?

2002	   One premise of the GSE proposal [GSE] is that an ISP can renumber the
2003	   Routing Goop portion of a site's addresses transparently to the site
2004	   (i.e., without coordinating the change with the site).  This would
2005	   make it possible for backbone providers to aggressively renumber the
2006	   Routing Goop part of addresses to achieve a high degree of route
2007	   aggregation.  On closer examination, frequent (e.g., daily)
2008	   renumbering turns out to be difficult in practice because of a
2009	   circular dependency between the DNS and routing.  Specifically, if a
2010	   site's Routing Stuff changes, nodes communicating with the site need
2011	   to obtain the new Routing Stuff.  In the GSE proposal, one queries
2012	   the DNS to obtain this information.  However, in order to reach a
2013	   site's DNS servers, the pointers controlling the downward delegation
2014	   of authoritative DNS servers (i.e., DNS "glue records") must use
2015	   addresses with Routing Stuff that are reachable.  That is, in order
2016	   to find the address for the web server "www.foo.bar.com", DNS queries
2017	   might need to be sent to a root DNS server, as well as DNS servers
2018	   for "bar.com" and "foo.bar.com".  Each of these servers must be
2019	   reachable from the querying client.  Consequently, there must be an
2020	   adequate overlap period after the RG changes, during which both the
2021	   old Routing Stuff and the new Routing Stuff can be used
2022	   simultaneously.  During the overlap period, DNS glue records will
2023	   need to be updated to use the new addresses (including Routing Stuff)
2024	   and DNS RRs need to be updated.  Only after all relevant DNS
2025	   servers have been updated and all previously cached RRs containing
2026	   the old addresses have timed out can the old RG be deleted.

2028	   An important observation is that the above issue is not specific to
2029	   GSE; the same requirement exists with today's provider-based
2030	   addressing architecture.  When a site is renumbered (e.g., it
2031	   switches ISPs and obtains a new set of addresses from its new
2032	   provider), the DNS must be updated in a similar fashion.

2034	A.2: Efficient DNS support for Site Renumbering

2036	   In the current Internet, when a site is renumbered, the addresses of
2037	   all the site's internal nodes change.  This requires a potentially
2038	   large update to the RR database for that site.  Although Dynamic DNS
2039	   [DDNS] could potentially be used, the cost is likely to be large due
2040	   to the large number of individual records that would need to be
2041	   updated.  In addition, when DHCP and DDNS are used together [DHCP-
2042	   DDNS], it may be the case that individual hosts "own" their own A or
2043	   AAAA records, further complicating the question of who is able to
2044	   update the contents of DNS RRs.

2046	   With GSE, when a site renumbers to satisfy its ISP, only the site's
2047	   routing prefix needs to change.  That is, the prefix reflects where
2048	   within the Internet the site resides.  One DNS modification that
2049	   could reduce the cost of updating the DNS when a site is renumbered
2050	   is to store addresses in two distinct RRs: one for the Routing Goop
2051	   that reflects where a node attaches to the Internet and the other for
2052	   STP-plus-ESD that is the site-specific part of an address.  During a
2053	   renumbering, the Routing Goop would change, but the "site internal
2054	   part" would remain fixed.  That way, renumbering a site would only
2055	   require that the Routing Goop RR of a site be updated; the "site-
2056	   internal part" of individual addresses would not change.

2058	   To obtain the address of a node from the DNS, a DNS query for the
2059	   name would return two quantities: the "site internal part" and the
2060	   DNS name of the Routing Stuff for the site.  An additional DNS query
2061	   would then obtain the specific RR of the site, and the complete
2062	   address would be synthesized by concatenating the two pieces of
2063	   information.
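
   The concatenation step can be sketched as follows.  This is a
   minimal illustration, not a specification: it assumes a 48-bit
   Routing Goop followed by an 80-bit site-internal part (STP plus
   ESD), and the function name and sample prefixes are hypothetical.

```python
import ipaddress

RG_BITS = 48                 # assumed width of the Routing Goop
LOW_BITS = 128 - RG_BITS     # STP plus ESD


def synthesize_address(routing_goop, site_internal):
    """Combine a Routing Goop RR value with a site-internal RR value."""
    rg = int(ipaddress.IPv6Address(routing_goop))
    low = int(ipaddress.IPv6Address(site_internal))
    if rg & ((1 << LOW_BITS) - 1):
        raise ValueError("Routing Goop must not set site-internal bits")
    if low >> LOW_BITS:
        raise ValueError("site-internal part must not set Routing Goop bits")
    return ipaddress.IPv6Address(rg | low)


# The site-internal part is stored once per node and never changes.
node = "::2:a00:20ff:fe12:3456"

# Renumbering the site only changes the single Routing Goop value.
before = synthesize_address("2001:db8:1::", node)
after = synthesize_address("2001:db8:ffff::", node)
print(before)
print(after)
```

   Only the value passed as the Routing Goop differs between the two
   calls, which mirrors the claim that a renumbering touches one RR
   rather than one RR per node.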

2065	   Implementing these DNS changes increases the practicality of using
2066	   Dynamic DNS to update a site's DNS records as it is renumbered.  Only
2067	   the site's Routing Goop RRs would need updating.

2069	   Finally, it may be useful to divide a node's AAAA RR into the three
2070	   logical parts of the GSE proposal, namely RG, STP and ESD.  Whether
2071	   or not it is useful to have separate RRs for the STP and ESD portions
2072	   of an address or a single RR combining both is an issue that requires
2073	   further study.

2075	   If AAAA records are composed of multiple distinct RRs, then one
2076	   question is who should be responsible for synthesizing the AAAA from
2077	   its components: the resolver running on the querying client's machine
2078	   or the queried name server? To minimize the impact on client hosts
2079	   and make it easier to deploy future changes, it is recommended that
2080	   the synthesis of AAAA records from their constituent parts be done on
2081	   name servers rather than in client resolvers.

2083	A.2.1: Two-Faced DNS

2085	   The GSE proposal attempts to hide the RG part of addresses from nodes
2086	   within a site.  If the nodes do not know their own RG, then they
2087	   can't store or use them in ways that cause problems should the site
2088	   be renumbered and its RG change (i.e., the cached RG becomes invalid).
2089	   A site's DNS servers, however, will need to have more information
2090	   about the RG its site uses.  Moreover, the responses it returns will
2091	   depend on who queries the server.  A query from a node within the
2092	   site should return an address with a Site Local RG, whereas a query
2093	   for the same name from a client located at a different site should
2094	   return the global scope RG.  This makes intra-site communication
2095	   more resilient to failures outside of the site.  Such
2096	   context-dependent DNS servers are commonly referred to as
2097	   "two-faced" DNS servers.

2099	   Some issues that must be considered in this context:

2101	     1) A DNS server may recursively attempt to resolve a query on
2102	        behalf of a requesting client.  Consequently, a DNS query might
2103	        be received from a proxy rather than from the client that
2104	        actually seeks the information.  Because the proxy may not be
2105	        located at the same site as the originating client, a DNS server
2106	        cannot reliably determine whether a DNS request is coming from
2107	        the same site or a remote site.  One solution would be to
2108	        disallow recursive queries for off-site requesters, though this
2109	        raises additional questions.

2111	     2) Since cached responses are, in general, context sensitive, a
2112	        name server may be unable to correctly answer a query from its
2113	        cache, since the information it has is incomplete.  That is, it
2114	        may have loaded the information via a query from a local client,
2115	        and the information has a site-local prefix.  If a subsequent
2116	        request comes in from an off-site requester, the DNS server
2117	        cannot return a correct response (i.e., one containing the
2118	        correct RG).
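
   The context-dependent behavior, and the caching problem in issue 2,
   can be illustrated with a small sketch.  Everything named here is an
   assumption made for illustration: the prefixes, the host name, the
   48-bit RG width, and the idea of classifying requesters by source
   prefix are not taken from the GSE proposal.

```python
import ipaddress

SITE_LOCAL_RG = "fec0::"         # assumed site-local Routing Goop
GLOBAL_RG = "2001:db8:1::"       # hypothetical global Routing Goop
SITE_NETWORKS = [
    ipaddress.ip_network("fec0::/48"),
    ipaddress.ip_network("2001:db8:1::/48"),
]
# Site-internal part (STP plus ESD) per name; hypothetical data.
SITE_INTERNAL = {"www.example.com": "::2:a00:20ff:fe12:3456"}


def is_on_site(requester):
    """Guess whether a query source address belongs to this site."""
    addr = ipaddress.ip_address(requester)
    return any(addr in net for net in SITE_NETWORKS)


def answer(name, requester):
    """Return the address a two-faced server would give this requester."""
    rg = SITE_LOCAL_RG if is_on_site(requester) else GLOBAL_RG
    low = int(ipaddress.IPv6Address(SITE_INTERNAL[name]))
    return str(ipaddress.IPv6Address(int(ipaddress.IPv6Address(rg)) | low))


# A cache keyed only on the name forgets who asked (issue 2).
naive_cache = {}


def cached_answer(name, requester):
    if name not in naive_cache:
        naive_cache[name] = answer(name, requester)
    return naive_cache[name]


print(answer("www.example.com", "fec0::2:1"))        # on-site requester
print(answer("www.example.com", "2001:db8:9::53"))   # off-site requester
```

   If cached_answer() is first called for an on-site requester, a later
   off-site requester receives the site-local address, which it cannot
   use.  Issue 1 shows up in is_on_site(): when the query arrives via a
   recursive server, the source address is that of the proxy, not the
   originating client.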

2120	A.2.2: Bootstrapping Issues

2122	   If Routing Stuff information is distributed via the DNS, key DNS
2123	   servers must always be reachable.  In particular, the addresses
2124	   (including Routing Stuff) of all root DNS servers are, for all
2125	   practical purposes, well-known and assumed to never change.  It is
2126	   not uncommon for the addresses of root servers to be hard-coded into
2127	   software distributions.  Consequently, the Routing Stuff associated
2128	   with such addresses must always be usable for reaching root servers.
2129	   If it becomes necessary or desirable to change the Routing Stuff of
2130	   an address at which a root DNS server resides, the routing subsystem
2131	   will likely need to continue carrying "exceptions" for those
2132	   addresses.  Because the total number of root DNS servers is
2133	   relatively small, the routing subsystem is expected to be able to
2134	   handle this requirement.

2136	   All other DNS server addresses can be changed, since their addresses
2137	   are typically learned from an upper-level DNS server that has
2138	   delegated a part of the name space to them.  So long as the
2139	   delegating server is configured with the new address, the addresses
2140	   of other servers can change.

2142	Appendix B: Additional Issues Related Specifically to GSE                 |

2144	   This paper focused primarily on the issues of separating identifiers
2145	   and locators in unicast addresses.  It is worth noting that a number   |
2146	   of GSE-specific additional issues were identified during the IPng      |
2147	   interim meeting. These stem from a GSE end node not knowing its own    |
2148	   RG and the need for border routers to translate the RG of addresses.   |
2149	   These issues would need to be considered before an architecture such   |
2150	   as GSE could be deployed.  Specifically:

2152	      - It is not known how multicast would work under GSE.  One
2153	        identified issue is that a site with multiple egress routers
2154	        would (by default) inject multicast traffic through each egress   |
2155	        router, each of which would then replace the source Routing Goop
2156	        with a different value.  This would lead to multiple copies of
2157	        the same packet, each carrying a different IPv6 source address
2158	        and thus being treated as coming from different sources.

2160	      - It would be more difficult to create tunnels.  Any tunnel that
2161	        crosses a site boundary (i.e., the entry and exit points are in
2162	        differing sites) would in effect require that both tunnel
2163	        endpoints be border routers to ensure that the addresses in the
2164	        inner headers were rewritten correctly.

2166	      - Hiding a site's Routing Goop from internal nodes while making
2167	        it visible to external nodes requires a two-faced DNS.  The
2168	        current DNS model assumes a single global database in which all
2169	        queries are answered the same way, regardless of who issued the
2170	        query.  It is unclear how to make the DNS answer queries in a
2171	        context-sensitive manner without also negatively impacting        |
2172	        (i.e., crippling) its caching model.                              |

2174	      - Applications that send addresses in payloads (e.g., FTP PORT      |
2175	        command) may run into difficulties with GSE. Because the sender   |
2176	        does not know its own RG, the addresses it sends in payloads      |
2177	        will contain only the site-local prefix in the RG portion of the  |
2178	        address. In order for the receiver to open a connection back to   |
2179	        that address, it needs the proper RG.  This problem is analogous  |
2180	        to that of NATs, where addresses in payloads need to be           |
2181	        rewritten (e.g., via an ALG) when crossing the boundary between   |
2182	        different addressing realms [NAT].                                |

2184	      - Border routers need to rewrite the source address of outgoing     |
2185	        packets. Additional parsing of packet headers is also required,   |
2186	        to find and rewrite any other addresses containing the site-      |
2187	        local prefix. For example, the source routing header may contain  |
2188	        additional addresses.
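
   The source-address rewriting performed by border routers, and the
   multicast problem it causes, can be sketched as follows.  The 48-bit
   RG width, the function name, and all prefixes are assumptions chosen
   for illustration only.

```python
import ipaddress

RG_BITS = 48                              # assumed Routing Goop width
LOW_MASK = (1 << (128 - RG_BITS)) - 1     # keeps STP plus ESD


def rewrite_source_rg(source, egress_rg):
    """Replace the RG of a source address, as a border router would.

    egress_rg is assumed to have all site-internal bits set to zero.
    """
    low = int(ipaddress.IPv6Address(source)) & LOW_MASK
    rg = int(ipaddress.IPv6Address(egress_rg))
    return str(ipaddress.IPv6Address(rg | low))


# One multicast packet leaves the site through two egress routers,
# each configured with a different (hypothetical) global RG.
inside_source = "fec0::2:a00:20ff:fe12:3456"
egress_rgs = ["2001:db8:a::", "2001:db8:b::"]

seen_sources = {rewrite_source_rg(inside_source, rg) for rg in egress_rgs}
for src in sorted(seen_sources):
    print(src)
```

   The set holds two distinct source addresses for what was a single
   packet from a single sender, so a receiver keyed on the full address
   would count two sources.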

2190	Appendix C: Ideas Incorporated Into IPv6

2192	   This section summarizes changes made to IPv6 specifications which
2193	   originated in the GSE proposal or in the discussions arising from it.

2195	   The unicast address format was changed to improve the aggregability
2196	   of unicast addresses.  Instead of a topologically insignificant
2197	   Registry ID immediately following the Format Prefix [RFC2073], there
2198	   is now a Top-Level Aggregation Identifier [IPv6-ADDRESS].  This field
2199	   identifies a large routable aggregate to which an address belongs
2200	   rather than an administrative unit that assigned the address.  The
2201	   TLA corresponds to the "Large Structure" of GSE.  The IPv6 Next-Level
2202	   Aggregation Identifier (NLA) is roughly the rest of the GSE "Routing
2203	   Goop" and the Site-Level Aggregation Identifier (SLA) is a slightly
2204	   expanded GSE Site Topology Partition.

2206	   The decision to put fixed boundaries between parts of the unicast
2207	   address (TLA, NLA, SLA, Interface Identifier) into IPv6 addresses
2208	   [IPv6-ADDRESS] also came from GSE.  The previous "provider-based"
2209	   addressing architecture for IPv6 [RFC2073] had fluid boundaries
2210	   between Registry ID, Provider ID, Subscriber ID and the Intra-
2211	   Subscriber part, as well as undefined divisions within the Provider-
2212	   ID and Intra-Subscriber part.  (On subnetworks with a MAC-layer
2213	   address, the latter boundary was generally placed to accommodate use
2214	   of that address as an Interface ID.)  The new addressing architecture
2215	   still expects divisions within the NLA portion of the address, placed
2216	   to reflect topological aggregation points.

2218	   Defining a fixed boundary between the routable portion of the address
2219	   and the part indicating an interface on a specific link required
2220	   specifying an Interface Identifier that would be suitable for all
2221	   subnetwork technologies.  The IEEE "EUI-64" identifier was selected,
2222	   having the advantages of an easy mapping from 48 bit MAC addresses
2223	   and a defined escape flag into locally-administered values.
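
   The mapping from a 48-bit MAC address can be sketched as below.  It
   assumes the "modified EUI-64" convention of the IPv6 addressing
   architecture (insert 0xFFFE between the two halves of the MAC
   address and invert the universal/local bit); the function name and
   the sample MAC address are hypothetical.

```python
def mac_to_interface_id(mac):
    """Build a 64-bit interface identifier from a 48-bit MAC address."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Insert 0xFFFE between the company ID and the extension ID.
    eui64 = bytearray(octets[:3] + b"\xff\xfe" + octets[3:])
    # Invert the universal/local bit (0x02 in the first octet).
    eui64[0] ^= 0x02
    return bytes(eui64)


print(mac_to_interface_id("08:00:20:12:34:56").hex())
```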

2225	   Another change was the redefinition of the interface identifier to be
2226	   a 64-bit quantity.  In the common case where a node has at least one
2227	   IEEE interface, the interface identifier is constructed from an IEEE
2228	   identifier (i.e., a MAC address) in such a way that there is a very
2229	   high probability that the identifier will be globally unique.  In the
2230	   case where a globally unique identifier can't easily be constructed
2231	   automatically, a bit in the identifier indicates that the address is
2232	   not globally unique.  At present, there are no plans for transport
2233	   protocols such as TCP to exploit interface identifiers, but the door
2234	   has been left open for a future protocol (e.g., TCPng) to take
2235	   advantage of the ESD concept.

2237	   Another change to come out of the GSE discussions relates to reducing
2238	   the number of DNS record changes required in the event of site
2239	   renumbering.  This work is not finalized as of this writing, but the
2240	   result may be that individual IPv6 addresses are stored (and signed,
2241	   in the case of Secure DNS) as a partial address and an indirect
2242	   pointer which leads to the high-order part of the address.  There may
2243	   be multiple levels of indirection and a changed record at any one
2244	   level would suffice to update the DNS's record of the IPv6 addresses
2245	   of every node in a given branch of the addressing hierarchy.

2247	   A change in the method of doing DNS address-to-name lookups is also
2248	   in the works.  This may be a change in the form and/or operation of
2249	   the ip6.int domain or some new mechanism which involves participation
2250	   by the routers or the end-nodes themselves.

2252	   Another example of follow-on work is site prefixes [SITE-PREFIXES],    |
2253	   whose aim is to have communicating parties prefer site-local           |
2254	   addresses for internal communication. Applications using site-local    |
2255	   addresses are generally immune to renumbering issues that affect only  |
2256	   global-scope addresses.                                                |

2258	   Two other changes arising from GSE will not affect the IPv6 base
2259	   specifications themselves, but do direct additional work.  Those are
2260	   the injection of global prefix information into a site from a
2261	   provider or exchange [ROUTER-RENUM], and some inter-provider
2262	   cooperative method of providing multihoming to mutual customers with
2263	   minimal impact on routing tables in distant parts of the network.

2265	Appendix D: Reverse Mapping of Complete GSE Addresses

2267	   The ability to map an IP address into its corresponding DNS name is
2268	   used in several contexts:

2270	     1) Network packet tracing utilities (e.g., tcpdump) display the
2271	        contents of packets.  Printing out the DNS names appearing in
2272	        those packets (rather than dotted IP addresses) requires access
2273	        to an address-to-name mapping mechanism.

2275	     2) Some applications perform a "poor-man's" authentication by using
2276	        the DNS to map the source address of a peer into a DNS name.
2277	        The client then queries the DNS a second time, this time asking
2278	        for the address(es) corresponding to the peer's DNS name.  Only
2279	        if one of the addresses returned by the DNS matches the peer
2280	        address of the TCP connection is the source of the TCP
2281	        connection accepted as being from the indicated DNS name.

2283	        It is important to note that although two DNS queries are made
2284	        during the above operation, it is the second one --- mapping the
2285	        peer's DNS name back into an IP address --- that provides the
2286	        authentication property.  The first transaction simply obtains
2287	        the peer's DNS name, but no assumption is made that the returned
2288	        DNS name is correct.  Thus, the first DNS query could be
2289	        replaced by an alternate mechanism without weakening the already
2290	        weak authentication check described above.  One possible
2291	        alternate mechanism, an ICMP "Who Are You" message, is described  |
2292	        below.

2294	     3) Applications that log all incoming network connections (e.g.,
2295	        anonymous FTP servers) may prefer logging recognizable DNS names
2296	        to addresses.

2298	     4) Network administrators examining logs or other trace data
2299	        containing addresses may wish to determine the DNS name of some
2300	        addresses.  Note that this may occur sometime after those
2301	        addresses were actually used.
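
   The two-step check in item 2 can be sketched as follows.  The lookup
   functions are passed in as parameters so that the first step can be
   a DNS reverse lookup or any alternate mechanism; the function name
   and the dictionary-backed stand-ins are hypothetical and used only
   to keep the example self-contained.

```python
import ipaddress


def peer_name_if_consistent(peer_addr, name_lookup, forward_lookup):
    """Return the peer's claimed name only if it maps back to peer_addr.

    name_lookup(addr)    -> claimed DNS name, or None (untrusted step)
    forward_lookup(name) -> iterable of addresses for that name
    """
    claimed = name_lookup(peer_addr)
    if claimed is None:
        return None
    peer = ipaddress.ip_address(peer_addr)
    # The forward lookup is the step that provides the (weak) check.
    for addr in forward_lookup(claimed):
        if ipaddress.ip_address(addr) == peer:
            return claimed
    return None


# Stand-in data; a real client would query the DNS instead.
reverse_db = {"2001:db8:1:2:a00:20ff:fe12:3456": "www.example.com",
              "2001:db8:9::66": "www.example.com"}   # second entry is a lie
forward_db = {"www.example.com": ["2001:db8:1:2:a00:20ff:fe12:3456"]}

for addr in reverse_db:
    print(addr, peer_name_if_consistent(
        addr, reverse_db.get, lambda n: forward_db.get(n, [])))
```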

2303	   The following subsections describe techniques for mapping a full IPv6
2304	   address back into some quantity (e.g., a DNS name or locator).  We
2305	   include these descriptions for completeness even though they do not
2306	   address the fundamental problem of how to perform the mapping on an
2307	   identifier alone.  It should also be noted that because both
2308	   techniques operate on complete IPv6 addresses, they are both directly
2309	   applicable to provider-based addressing schemes and are not specific
2310	   to GSE.

2312	D.1: DNS-Like Reverse Mapping of Full GSE Addresses

2314	   Although a global-scale reverse mapping of ESDs seems infeasible,
2315	   it may be feasible within a site to maintain a database
2316	   keyed on unstructured 8-byte ESDs.  However, it is an open question
2317	   whether such a database can be kept up-to-date at reasonable cost,
2318	   without making unreasonable assumptions as to how large sites are
2319	   going to grow, and how frequently ESD registrations will be made or
2320	   updated.  Note that the issue isn't just the physical database
2321	   itself, but the operational issues involved in keeping it up-to-date.
2322	   For the rest of this section, however, let us assume that such a
2323	   database can be built.

2325	   A mechanism supporting a lookup keyed on a flat-space ESD from an
2326	   arbitrary site requires having sufficient structure to identify the
2327	   site that needs to be queried.  In practice, since the Routing Stuff
2328	   is organized hierarchically, if an ESD is always used in conjunction
2329	   with Routing Stuff (i.e., a full 16-byte address), it becomes
2330	   feasible to maintain a DNS-like tree that maps full GSE addresses
2331	   into DNS names, in a fashion analogous to what is done with IPv4 PTR   |
2332	   records today.
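
   One way such a tree could be keyed is sketched below.  It assumes a
   nibble-reversed owner name, one label per 4 bits of the 16-byte
   address, under the ip6.int domain mentioned in Appendix C; the
   function name is hypothetical and the naming format is an assumption
   rather than something this document specifies.

```python
import ipaddress


def reverse_name(address, suffix="ip6.int"):
    """Return a nibble-reversed lookup name for a full 16-byte address."""
    # exploded gives all 32 hex digits, e.g. "2001:0db8:...:3456".
    nibbles = ipaddress.IPv6Address(address).exploded.replace(":", "")
    return ".".join(reversed(nibbles)) + "." + suffix


print(reverse_name("2001:db8:1:2:a00:20ff:fe12:3456"))
```

   Because the Routing Stuff occupies the high-order bits, it ends up
   in the labels closest to the suffix.  Delegation therefore follows
   the routing hierarchy, and the ESD labels sit at the leaves within
   the zone of the owning site.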

2334	   It should be noted that a GSE address lookup will work only if the
2335	   Routing Stuff portion of the address is correctly entered in the DNS
2336	   tree.  Because the Routing Stuff portion of an address is expected to
2337	   change over time, this assumption will not hold valid indefinitely.
2338	   As a consequence, a packet trace recorded in the past might not
2339	   contain enough information to identify the off-Site sources of the
2340	   packets in the present.  This problem can be addressed by requiring
2341	   that the database of RG delegations be maintained, together with
2342	   accurate timing information, for some period of time after the RG is
2343	   no longer usable for routing packets.

2345	   Finally, it should be noted that the problem where an address's RG
2346	   "expires" with the implication that the mapping of "expired"
2347	   addresses into DNS names may no longer hold is not a problem specific
2348	   to the GSE proposal.  With provider-based addressing, the same issue
2349	   arises when a site renumbers into a new provider prefix and releases
2350	   the allocation from a previous block.  The authors are aware of one
2351	   such renumbering incident in IPv4 where a block of returned
2352	   addresses was reassigned and reused within 24 hours of the
2353	   renumbering event.

2355	D.2: The ICMP Who-Are-You Message

2357	   There is widespread agreement on the utility of being able to
2358	   determine the DNS name one is communicating with from the address
2359	   being used.  In addition to the fact that DNS names are more
2360	   meaningful to human users and more stable than addresses, many users
2361	   use this reverse mapping as part of a poor-man's authentication for
2362	   the remote peer; if one can map the obtained DNS name back to the
2363	   same address, one has increased confidence that the peer is
2364	   legitimate.

2366	   In practice, however, the IN-ADDR.ARPA domain is not fully populated   |
2367	   and is poorly maintained.  Consequently, an old proposal to define an
2368	   ICMP Who-Are-You message was resurrected [RFC1788].  A client would
2369	   send such a message to a peer, and that peer would return an ICMP
2370	   message containing its DNS name.  Asking a remote host to supply its
2371	   own name in no way implies that the returned information is accurate.
2372	   However, having a remote peer provide a piece of information that a
2373	   client can use as input to a separate authentication procedure
2374	   provides a starting point for performing strong authentication.  The
2375	   actual strength of the authentication depends on the authentication
2376	   procedure invoked, rather than the untrustworthy piece of information
2377	   provided by a remote peer.

2379	   Reconsidering the "cheap" authentication procedure described earlier,
2380	   the ICMP Who-Are-You replaces the DNS PTR query used to obtain the
2381	   DNS name of a remote peer.  The second DNS query, to map the DNS name
2382	   back into a set of addresses, would be performed as before.  Because
2383	   the latter DNS query provides the strength of the authentication, the
2384	   use of an ICMP Who-Are-You message does not in any way weaken the
2385	   strength of the authentication method.  Indeed, it can only make it
2386	   more useful in practice, because virtually all hosts can be expected
2387	   to implement the Who-Are-You message.

2389	   The Who-Are-You message has advantages outside the context of GSE as
2390	   well, including a more decentralized, and hence more scalable,
2391	   administration and easier upkeep than a DNS reverse-lookup zone.  It
2392	   also has drawbacks: it requires the target node to be up and
2393	   reachable at the time of the query and to know its fully qualified
2394	   domain name.  It is also not possible to resolve addresses once those
2395	   addresses become unroutable.  In contrast, the DNS PTR mirrors, but
2396	   is independent of, the routing hierarchy.  The DNS can maintain
2397	   mappings long after the routing subsystem stops delivering packets to
2398	   certain addresses.

2400	   The requirement that the target node be up and reachable at the time
2401	   of the query makes it very uncertain that one would be able to take
2402	   addresses from a packet log and translate them to correct domain
2403	   names at a later time.  One can argue that this is a design flaw in
2404	   the logging system, as it violates the architectural principle,
2405	   "Avoid any design that requires addresses to be ... stored on non-
2406	   volatile storage" [RFC1958].  A better-designed system would look up
2407	   domain names promptly from logged addresses.  Indeed, one of the
2408	   authors has been doing that for some years.