idnits 2.17.1 

draft-templin-v6v4-ndisc-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 185: '...   bor MUST monitor its IPv4 reassembl...'
     RFC 2119 keyword, line 205: '... A; in addition, A MUST KNOW THAT B IS...'
     RFC 2119 keyword, line 209: '...   each node MUST implement the follow...'
     RFC 2119 keyword, line 233: '...or sends to its peer MUST be no larger...'
     RFC 2119 keyword, line 236: '...   - all packets SHOULD be sent with t...'
     (6 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == The "Author's Address" (or "Authors' Addresses") section title is
     misspelled.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (30 October 2002) is 7842 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'IPv6' is mentioned on line 85, but not defined

  == Unused Reference: 'ICMPv6' is defined on line 496, but no explicit
     reference was found in the text

  == Unused Reference: 'IPV6' is defined on line 502, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2463 (ref. 'ICMPv6') (Obsoleted by RFC
     4443)

  ** Obsolete normative reference: RFC 2460 (ref. 'IPV6') (Obsoleted by RFC
     8200)

  ** Obsolete normative reference: RFC 2461 (ref. 'DISC') (Obsoleted by RFC
     4861)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'FRAG'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'FOLK'

  ** Obsolete normative reference: RFC 1063 (ref. 'PROBE') (Obsoleted by RFC
     1191)

  ** Obsolete normative reference: RFC 1981 (ref. 'PMTUDv6') (Obsoleted by
     RFC 8201)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'MTUDWG'

  ** Obsolete normative reference: RFC  793 (ref. 'TCP') (Obsoleted by RFC
     9293)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'TCP-IP'


     Summary: 9 errors (**), 0 flaws (~~), 6 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	INTERNET-DRAFT                                              F. Templin
3	                                                                 Nokia

5	Expires 30 April 2003                                  30 October 2002

7	      Neighbor Affiliation Protocol for IPv6-over-(foo)-over-IPv4

9	                    draft-templin-v6v4-ndisc-01.txt

11	Abstract

13	   This document proposes extensions to IPv6 Neighbor Discovery for
14	   IPv6-over-(foo)-over-IPv4 links, where (foo) is either an
15	   encapsulating layer (e.g., UDP) or a NULL layer. It is essentially a
16	   lightweight, link-layer mechanism for neighbors to establish security
17	   associations, discover and dynamically re-adjust maximum receive unit
18	   (MRU) estimates, and perform unreachability detection. The protocol
19	   makes no attempt to ensure reliable message delivery; this function
20	   is performed by higher-layer protocols, e.g. TCP.

22	Status of this Memo

24	   This document is an Internet-Draft and is in full conformance with
25	   all provisions of Section 10 of RFC2026.

27	   Internet-Drafts are working documents of the Internet Engineering
28	   Task Force (IETF), its areas, and its working groups.  Note that
29	   other groups may also distribute working documents as Internet-
30	   Drafts.

32	   Internet-Drafts are draft documents valid for a maximum of six months
33	   and may be updated, replaced, or obsoleted by other documents at any
34	   time.  It is inappropriate to use Internet-Drafts as reference
35	   material or to cite them other than as "work in progress."

37	   The list of current Internet-Drafts can be accessed at
38	   http://www.ietf.org/ietf/1id-abstracts.txt

40	   The list of Internet-Draft Shadow Directories can be accessed at
41	   http://www.ietf.org/shadow.html.

43	   Copyright Notice

45	   Copyright (C) The Internet Society (2002).  All Rights Reserved.

47	1.  Introduction

49	   The author anticipates a long-term requirement for [IPv6] operation
50	   over IPv6-over-(foo)-over-IPv4 links, where (foo)-over-[IPv4] is
51	   treated as a link layer for IPv6. Neighbors that exchange data over
52	   such links will need to make the most efficient, robust and secure
53	   use of the intervening IPv4 paths possible. The author believes that
54	   this will require link-layer extensions to enable a hybrid proac-
55	   tive/on-demand neighbor discovery mechanism using bi-directional
56	   links.

58	   IPv6 nodes will need to establish security associations, determine
59	   and dynamically re-adjust maximum receive unit (MRU) estimates, and
60	   perform neighbor unreachability detection using an IPv4 intranet or
61	   internet as the link layer. Although IPv4 fragmentation is considered
62	   harmful [FRAG][FOLK], the author believes that strategically adapting
63	   to and minimizing fragmentation can provide a superior solution to
64	   strict fragmentation avoidance. Central to the design goals is the
65	   ability for neighbors to establish and maintain lightweight, link-
66	   layer associations to provide a continuous feedback loop. Although
67	   these associations take on certain aspects of reliable transport con-
68	   nections, the author prefers to refer to them as "neighbor affilia-
69	   tions".

71	2.  Applicability Statement

73	     - proposes a neighbor affiliation protocol for IPv6 neighbors
74	       on IPv6-over-(foo)-over-IPv4 links

76	     - works with either automatic or configured tunnels

78	     - may be useful for IPv6 operations in certain deployment scenarios

80	     - may extend to future IPv6 applications beyond the case of
81	       IPv6-over-(foo)-over-IPv4

83	3.  Terminology

85	   The terminology of [IPv4] and [IPv6] applies to this document. The
86	   following additional term is defined:

88	   neighbor affiliation:
89	     a lightweight association that enables robust, efficient and
90	     secure bi-directional links between IPv6 neighbors

92	4.  Neighbor Affiliation Protocol

94	   When multiple IPv4 hops intervene between nodes, [DISC] provides no
95	   trust basis (i.e., neighbors are not on the same connected LAN seg-
96	   ment) nor any specification for the nodes to monitor each other's
97	   reachability (i.e., multicast is not supported). Moreover, the path
98	   between any pair of neighbors is mutually exclusive from other on-
99	   link neighbors and may change over time. Thus, the maximum packet
100	   size a node A can receive from neighbor B may differ from the amount
101	   it can receive from neighbor C, even though all three nodes
102	   technically share the same "link". These issues are addressed through
103	   lightweight neighbor affiliations that are established on-demand and
104	   maintained proactively as long as they are in active use.

106	   [DISC] specifies mechanisms for IPv6 nodes to discover and maintain
107	   reachability information for active neighbors. Neighbor Solicitation
108	   (NS) and Neighbor Advertisement (NA) messages are normally used for
109	   this purpose on multiple access, broadcast media. But, when an IPv4
110	   intranet/internet is used as the link layer, the standard multicast
111	   mechanisms do not apply. For the purpose of this specification, uni-
112	   cast NS/NA messages are formatted as specified in [DISC, 4.3-4], and
113	   ALWAYS include a source/target link layer address option (even though
114	   [DISC] does not require these options for unicast operation). Also,
115	   as permitted in [DISC, 4.6.1], this document specifies the format of
116	   link-layer address options for use in IPv6-over-(foo)-over-IPv4
117	   links. The following subsections describe the neighbor affiliation
118	   protocol and specific adaptations of the mechanisms in [DISC] in
119	   detail:

121	4.1.  Affiliation Establishment

123	   Neighbor affiliations are established using the following algorithm.
124	   The algorithm assumes that a source has a unicast packet to send to a
125	   target and invokes address resolution ([DISC, 7.2]):

127	    1) The source sends an NS message to the target with a specially
128	       formatted source link layer address option containing a "SYN"
129	       indication. The source creates a neighbor cache entry for the
130	       target, and transitions from the "CLOSED" to the "SYN-SENT"
131	       state

133	    2) The target receives the NS and notices that it includes a
134	       source link layer option containing a "SYN" indication. The
135	       target creates a neighbor cache entry for the source and
136	       records the physical interface the NS arrived on. The target
137	       transitions from the "LISTEN" to the "SYN-RECEIVED" state,
138	       then sends a unicast Solicited NA to the source. The NA
139	       includes a target link layer address option that contains a
140	       "SYN/ACK" indication and a "Maximum_MRU" option initialized
141	       to the maximum receive unit (MRU) of the physical interface
142	       recorded above

144	    3) The source receives the Solicited NA and notices that it
145	       includes a target link layer option containing a "SYN/ACK"
146	       indication. It records the physical interface the NA arrived
147	       on and places the "Maximum_MRU" value in the target's neighbor
148	       cache entry. (This value is also optionally written into the
149	       destination cache entry(s) for the IPv6 destination address(es)
150	       of any packets waiting for address resolution completion
151	       [DISC, 7.2.2].) The "Maximum_MRU" value represents the current
152	       upper bound for the maximum packet size the target is able to
153	       receive, i.e., the MRU of the target's physical interface

155	       Next, the source transitions from the "SYN-SENT" to the
156	       "ESTABLISHED" state and sends a unicast Solicited NA to the
157	       target, i.e., the source treats the received NA as an
158	       "NS equivalent". The NA includes a target link layer
159	       address option that contains an "ACK" indication and a
160	       "Maximum_MRU" option initialized to the MRU of the physical
161	       interface recorded above

163	    4) The target receives the Solicited NA and notices that it
164	       includes a target link layer option containing an "ACK"
165	       indication. It places the "Maximum_MRU" value in the cache
166	       for the source and transitions from the "SYN-RECEIVED" to
167	       the "ESTABLISHED" state. The target and source have now
168	       established a bi-directional link using the exact mechanism
169	       specified in [TCP, 3.4] and are said to be "affiliated"
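
The four steps above can be sketched as a small state machine. This is
an illustrative Python sketch only, not part of the specification: the
`Msg` and `Node` classes and the in-memory message passing are
hypothetical, and the convention that a Maximum_MRU of 0 means "not
carried" is an assumption. Only the state names, the SYN and ACK
indications and the Maximum_MRU exchange come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Msg:
    kind: str                 # "NS" or "NA"
    syn: bool                 # "SYN" indication in the link layer option
    ack: bool                 # "ACK" indication in the link layer option
    maximum_mru: int = 0      # 0 means "not carried" (hypothetical convention)


@dataclass
class Node:
    mru: int                              # MRU of the local physical interface
    state: str = "CLOSED"
    peer_maximum_mru: Optional[int] = None

    def start(self) -> Msg:
        """Step 1: send an NS with SYN; CLOSED -> SYN-SENT."""
        self.state = "SYN-SENT"
        return Msg("NS", syn=True, ack=False)

    def receive(self, m: Msg) -> Optional[Msg]:
        """Steps 2-4: process a message and return the reply, if any."""
        if self.state == "LISTEN" and m.kind == "NS" and m.syn and not m.ack:
            # Step 2: target answers with SYN/ACK and its own Maximum_MRU.
            self.state = "SYN-RECEIVED"
            return Msg("NA", syn=True, ack=True, maximum_mru=self.mru)
        if self.state == "SYN-SENT" and m.kind == "NA" and m.syn and m.ack:
            # Step 3: source records the target's Maximum_MRU and sends ACK.
            self.peer_maximum_mru = m.maximum_mru
            self.state = "ESTABLISHED"
            return Msg("NA", syn=False, ack=True, maximum_mru=self.mru)
        if self.state == "SYN-RECEIVED" and m.kind == "NA" and m.ack and not m.syn:
            # Step 4: target records the source's Maximum_MRU.
            self.peer_maximum_mru = m.maximum_mru
            self.state = "ESTABLISHED"
        return None
```

Feeding the output of `start()` on a source into `receive()` on a target
that begins in "LISTEN", and passing each reply back to the other side,
leaves both nodes in "ESTABLISHED" with each holding the other's
Maximum_MRU.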

171	4.2.  Affiliation Maintenance

173	   Neighbor affiliations are established on-demand in response to packet
174	   transmissions, as described above. Once established, the neighbors
175	   work together to proactively maintain the affiliation as long as it
176	   is being actively used by either or both endpoints.  When the affili-
177	   ation is no longer needed, it is allowed to expire with stale cache
178	   entries eventually garbage-collected.

180	   Since the "Maximum_MRU" values exchanged in the affiliation estab-
181	   lishment phase only convey information about the maximum packet size
182	   each node can receive from neighbors on the same physical link, a
183	   mechanism is needed to detect whether an MRU reduction is incurred by
184	   the intervening IPv4 path.  To satisfy this requirement, each neigh-
185	   bor MUST monitor its IPv4 reassembly cache to detect packets being
186	   fragmented by the network. The size of the largest fragments arriving
187	   from a particular neighbor indicates the "Current_MRU" that can be
188	   accepted, and this value needs to be conveyed back to the neighbor
189	   causing the fragmentation (see below).

191	   Additionally, since the Neighbor Affiliation Protocol does not ensure
192	   reliable data delivery, no periodic acknowledgements are sent as in
193	   [TCP]. [DISC, 7.3] accepts hints from upper layer protocols of "for-
194	   ward progress" as reachability confirmation, but such indications may
195	   not be present when one (or both) of the neighbors are routers or
196	   when only "unidirectional" traffic (e.g., UDP-based continuous media
197	   streams) is present. Thus, the proactive maintenance process requires
198	   periodic transmission of "keepalive" messages.

200	   Of paramount importance is the fact that data traffic arrival conveys
201	   unidirectional link state information only. In particular, the fact
202	   that node A is receiving packets from node B only guarantees that the
203	   unidirectional link B->A is operational (and vice-versa for A->B). In
204	   other words, it is insufficient for A to receive packets from B and
205	   for B to receive packets from A; in addition, A MUST KNOW THAT B IS
206	   RECEIVING ITS PACKETS AND VICE-VERSA.

208	   In order to satisfy the above affiliation maintenance requirements,
209	   each node MUST implement the following keepalive algorithm:

211	    - if one or more data packets were received from an affiliated
212	      neighbor within the past N seconds, send the neighbor an
213	      unsolicited Neighbor Advertisement containing a target link
214	      layer address option with a "Current_MRU" option set to the
215	      size of the largest fragments arriving from the neighbor.
216	      If no fragmentation is taking place, "Current_MRU" is set
217	      to "Maximum_MRU". (Note that "Maximum_MRU" may change over
218	      time, e.g., due to routing fluctuations, so it too must
219	      be included in the NA.)

221	    - if no unsolicited NAs have arrived within the past N seconds
222	      even though packets have been sent to the neighbor, mark the
223	      affiliation as "stale".
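
One pass of the keepalive algorithm above might look like the following
Python sketch. It is not a definitive implementation: the `Affiliation`
record, its field names, the dictionary standing in for an unsolicited
NA, and the `n_seconds` parameter (the text leaves N unspecified) are
all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Affiliation:
    maximum_mru: int                       # local "Maximum_MRU" for this neighbor
    last_data_rx: Optional[float] = None   # time a data packet last arrived
    last_data_tx: Optional[float] = None   # time a data packet was last sent
    last_na_rx: Optional[float] = None     # time an unsolicited NA last arrived
    fragment_sizes: List[int] = field(default_factory=list)  # fragments seen
    stale: bool = False


def keepalive_tick(aff: Affiliation, now: float,
                   n_seconds: float) -> Optional[Dict[str, int]]:
    """Return the MRU values for an unsolicited NA to send, or None."""
    na = None
    if aff.last_data_rx is not None and now - aff.last_data_rx <= n_seconds:
        # Current_MRU is the largest fragment size seen, or Maximum_MRU
        # when no fragmentation is taking place.
        current = max(aff.fragment_sizes) if aff.fragment_sizes else aff.maximum_mru
        na = {"current_mru": current, "maximum_mru": aff.maximum_mru}
        aff.fragment_sizes.clear()

    sent_recently = aff.last_data_tx is not None and now - aff.last_data_tx <= n_seconds
    heard_na = aff.last_na_rx is not None and now - aff.last_na_rx <= n_seconds
    if sent_recently and not heard_na:
        # Packets were sent but the neighbor has not confirmed receipt.
        aff.stale = True
    return na
```

The sketch keeps the two rules independent, as in the text: whether to
send an NA depends only on data received, and whether to mark the
affiliation "stale" depends only on data sent and NAs received.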

225	4.3.  Fulfilling Contractual Obligations

227	   When a pair of neighbors engages in an affiliation as specified in
228	   the previous subsections, they effectively engage in a mutual con-
229	   tract that requires vigilance to maximize robustness and efficiency
230	   while doing no harm to each other or to the intervening network path.
231	   Mandatory contractual obligations include:

233	    - the packets a neighbor sends to its peer MUST be no larger
234	      than the peer's "Current_MRU" estimate

236	    - all packets SHOULD be sent with the IPv4 DF flag NOT set,
237	      regardless of their size

239	    - each node SHOULD attempt to maximize network utilization
240	      by periodically increasing the estimate for its neighbor
241	      to "Maximum_MRU" as long as fragmentation remains below
242	      "harmful" levels

244	    - each neighbor MUST take preemptive measures to reduce
245	      fragmentation if it reaches "harmful" levels
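
The sender-side obligations listed above can be sketched in Python as
follows. The `PeerEstimate` record and the function names are
hypothetical, and the `fragmentation_is_harmful` flag is an assumption:
the text does not give a numeric threshold for "harmful" levels, so the
caller is expected to supply that judgment.

```python
from dataclasses import dataclass


@dataclass
class PeerEstimate:
    current_mru: int   # latest "Current_MRU" reported by the peer
    maximum_mru: int   # latest "Maximum_MRU" reported by the peer


def on_unsolicited_na(est: PeerEstimate, current_mru: int, maximum_mru: int) -> None:
    """Adopt the values carried in an unsolicited NA from the peer."""
    est.current_mru = current_mru
    est.maximum_mru = maximum_mru


def may_send(est: PeerEstimate, packet_len: int) -> bool:
    """Packets MUST be no larger than the peer's Current_MRU estimate."""
    return packet_len <= est.current_mru


def periodic_probe(est: PeerEstimate, fragmentation_is_harmful: bool) -> None:
    """Raise the estimate back to Maximum_MRU unless fragmentation is harmful."""
    if not fragmentation_is_harmful:
        est.current_mru = est.maximum_mru
```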

247	   Preemptive measures to reduce harmful fragmentation (see "What con-
248	   stitutes harmful fragmentation?" below) include sending unsolicited
249	   NAs and ICMPv6 "packet too big" messages as appropriate. When a node
250	   detects harmful fragmentation, it MUST send a gratuitous unsolicited
251	   NA to the offending peer (as described in the previous section) with-
252	   out waiting for the normal affiliation maintenance timeout period.
253	   Nodes should employ a strategy to rate-limit such gratuitous NAs,
254	   since an RTT "burst" of fragmented packets may be in the pipeline.

256	   When fragments are lost such that a packet cannot be reassembled, the
257	   receiver MUST generate an ICMPv6 "packet too big" message, provided
258	   the first-fragment was not lost. The "packet too big" message encodes
259	   the length of the first-fragment in the MTU field and encapsulates as
260	   much of the IPv6 packet contained in the IPv4 first-fragment as pos-
261	   sible [ICMPv6, 3.2]. This message provides the peer with a transmit
262	   failure indication so lost data can be retransmitted.

264	   Note that [FRAG, 3.4] speaks favorably for the use of "transparent
265	   fragmentation", i.e., the use of reliable fragmentation and reassem-
266	   bly at a layer below IP. What is proposed here is essentially trans-
267	   parent link layer fragmentation with judicious and mitigated use so
268	   that network utilization is maximized while fragmentation is mini-
269	   mized.  In this sense, fragmentation in and of itself is NOT cause
270	   for alarm and rash actions - as long as both ends of the neighbor
271	   affiliation honor their contractual obligations and adapt their
272	   behavior appropriately.

274	4.4.  What Constitutes "Harmful" Fragmentation?

276	   In 1987, [FRAG] made an early assertion that fragmentation was harm-
277	   ful, and in 2002 the quantitative studies in [FOLK] concluded that
278	   this assertion is still true today. Even in today's Internet (where
279	   nodes by-and-large have correct fragmentation/reassembly
280	   implementations) fragmentation causes "slow-path" processing in
281	   routers and network performance degradation when the loss unit (a
282	   fragment) is smaller than the retransmission unit (a packet or seg-
283	   ment). [FOLK] observed that the principal contributors to fragmenta-
284	   tion in the Internet are continuous media applications (e.g., media
285	   players and interactive games) and tunnel ingress points with miscon-
286	   figured MTUs. Both produce UNMITIGATED and PERSISTENT fragmentation
287	   when no path MTU discovery feedback occurs, and it is ONLY this sort
288	   of fragmentation that this author believes should be considered harm-
289	   ful.

291	   Both [FRAG] and [FOLK] failed to note that the fundamental cause of
292	   unmitigated and persistent fragmentation is senders that disobey
293	   the robustness principle, i.e., nodes that are not "conservative in
294	   what they send". But, the contractual obligations of nodes that par-
295	   ticipate in the neighbor affiliation protocol specified in this docu-
296	   ment provide the necessary means for eliminating harmful fragmenta-
297	   tion. While fragmentation is an integral mechanism for efficient path
298	   probing in the specification, its use is an appropriate and mitigated
299	   application of the receiver's side of the robustness principle,
300	   i.e., be "liberal in what you receive".

302	4.5.  Willingness to Affiliate

304	   When a node initiates a neighbor affiliation as described in the pre-
305	   vious subsections, it may find that its peer either does not support
306	   the protocol or is otherwise unwilling to affiliate. In the former
307	   case, the NA that a peer returns in response to the initial NS will
308	   either not contain the "SYN/ACK" indication or not contain a target
309	   link layer option at all. In this case, the initiating node may:

311	     1) Safely estimate that the peer's MRU is the IPv6 MINMTU
312	        of 1280 bytes

314	     2) Bravely estimate that the peer's MRU is something larger
315	        than IPv6 MINMTU (e.g., 1480 bytes)

317	     3) Assume that the peer is unreachable, e.g., if no reasonable
318	        MRU estimate is possible or if a security association is
319	        required

321	   The MRU estimate MUST take into account that tunneled traffic is one
322	   of the primary contributors to harmful fragmentation in the Internet
323	   today [FOLK]. Nodes MUST honor the robustness principle of "be con-
324	   servative in what you send and liberal in what you receive" in all
325	   cases.
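
The three choices open to an initiating node whose peer does not
affiliate can be sketched as a small policy function. The policy names
and the `brave_estimate` parameter are hypothetical; the values 1280 and
1480 are the ones used in the list above.

```python
from typing import Optional

IPV6_MINMTU = 1280  # IPv6 minimum MTU in bytes


def fallback_mru(policy: str, brave_estimate: int = 1480) -> Optional[int]:
    """Return an MRU estimate for a non-affiliating peer, or None if the
    peer is to be treated as unreachable."""
    if policy == "safe":
        return IPV6_MINMTU
    if policy == "brave":
        # Never estimate below the IPv6 minimum MTU.
        return max(brave_estimate, IPV6_MINMTU)
    if policy == "unreachable":
        return None
    raise ValueError(f"unknown policy: {policy!r}")
```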

327	   Nodes may employ a strategy for allowing/disallowing particular
328	   affiliations, e.g., a router or server may choose not to answer a
329	   solicitation from a new host if its state cache is nearly full.
330	   Finally, security measures should be taken to ensure that the affili-
331	   ation protocol described above is not abused by malicious nodes. Can-
332	   didate mechanisms might be an adaptation of TCP "syn cookies" [refer-
333	   ence needed] or a shared secret between the neighbors.

335	4.6.  Source/Target Link Layer Option Format

337	   The NS/NA messages used for the neighbor affiliation protocol always
338	   encode the Source/Target Link Layer Address option, as specified by
339	   [DISC, 4.6.1]:

341	         0                   1                   2
342	         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8
343	        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
344	        |     Type      |    Length     |    Link-Layer Address ...
345	        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

347	   The "Type" is encoded exactly as in [DISC, 4.6.1] and the Length is
348	   always 1 for the purpose of this specification. But, the link-layer
349	   address field has the following special format:

351	             (octets 2-3)        (octets 4-5)        (octets 6-7)
352	        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
353	        |S|A|  Reserved     |    Current_MRU    |    Maximum_MRU    |
354	        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

356	   Thus, the "SYN" and "ACK" flags, the "Current_MRU" and the "Maxi-
357	   mum_MRU" values can be exchanged between affiliating neighbors, as
358	   required by the specification.
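
As a rough illustration of the layout above, the 8-octet option can be
packed and parsed with Python's `struct` module. The function names are
hypothetical, and the bit positions are an assumption read off the
diagram: S is taken to be the most significant bit of octets 2-3 and A
the next bit, in network byte order. Type values 1 (Source) and 2
(Target) are those of [DISC, 4.6.1].

```python
import struct

SOURCE_LLA = 1          # Source Link-Layer Address option type
TARGET_LLA = 2          # Target Link-Layer Address option type
S_FLAG = 0x8000         # "SYN" bit (assumed position: first bit of octets 2-3)
A_FLAG = 0x4000         # "ACK" bit (assumed position: second bit of octets 2-3)
_FMT = "!BBHHH"         # Type, Length, flags/reserved, Current_MRU, Maximum_MRU


def pack_option(opt_type: int, syn: bool, ack: bool,
                current_mru: int, maximum_mru: int) -> bytes:
    """Build the 8-octet option; Length is always 1 (units of 8 octets)."""
    flags = (S_FLAG if syn else 0) | (A_FLAG if ack else 0)
    return struct.pack(_FMT, opt_type, 1, flags, current_mru, maximum_mru)


def unpack_option(data: bytes) -> dict:
    """Parse an 8-octet option into its fields."""
    if len(data) != struct.calcsize(_FMT):
        raise ValueError("option must be exactly 8 octets")
    opt_type, length, flags, current_mru, maximum_mru = struct.unpack(_FMT, data)
    if length != 1:
        raise ValueError("Length must be 1 for this option format")
    return {
        "type": opt_type,
        "syn": bool(flags & S_FLAG),
        "ack": bool(flags & A_FLAG),
        "current_mru": current_mru,
        "maximum_mru": maximum_mru,
    }
```

Because both MRU fields are 16 bits wide, this layout cannot carry an
MRU larger than 65535 octets.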

360	5.  Operational Considerations

362	   Nodes that establish neighbor affiliations on IPv6-over-(foo)-over-
363	   IPv4 links must observe the following operational considerations:

365	5.1.  Default Maximum Transmit Unit (DFLT_MTU)

367	   IPv6-over-(foo)-over-IPv4 interfaces may be configured over one or
368	   more underlying physical interfaces for IPv4. The default maximum
369	   transmit unit (DFLT_MTU) for an IPv6-over-(foo)-over-IPv4 interface
370	   is the maximum MTU of all underlying physical interfaces for IPv4
371	   minus the sum of all header sizes required for IPv6-over-(foo)
372	   encapsulation.

374	   For example, if a multi-homed host has an Ethernet interface and an
375	   FDDI interface, the maximum MTU for underlying physical interfaces is
376	   4352. If the encapsulation is IPv6-over-UDP, then:

378	    DFLT_MTU = 4352 - 40 (sizeof(ipv6_hdr)) - 8 (sizeof(udp_hdr)) = 4304

380	   Obviously, the DFLT_MTU value may be overestimated for some traffic
381	   on multi-homed hosts with heterogeneous interfaces. But, read fur-
382	   ther.
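
The DFLT_MTU rule and the worked example above reduce to a one-line
computation. The function name below is hypothetical, and the header
sizes (40 and 8) are simply those used in the example.

```python
from typing import Iterable


def default_mtu(physical_mtus: Iterable[int], header_sizes: Iterable[int]) -> int:
    """Largest underlying IPv4 interface MTU minus the encapsulation headers."""
    return max(physical_mtus) - sum(header_sizes)


# Ethernet (1500) and FDDI (4352) interfaces, with the header sizes
# from the example in the text.
DFLT_MTU = default_mtu([1500, 4352], [40, 8])
```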

384	5.2.  Per-Neighbor Maximum Transmit Unit (NBR_MTU)

386	   As neighbor affiliations are established, the MRUs for "frequently
387	   accessed" neighbors are established in the destination cache. The
388	   destination cache entry for an active neighbor's MRU is always chosen
389	   as the MTU for that neighbor (NBR_MTU). When no destination cache
390	   entry exists, the DFLT_MTU is used.

392	5.3.  TCP MSS

394	   When a TCP connection is initiated to an on-link neighbor, the TCP
395	   MSS is initialized to (NBR_MTU - 40) if a destination cache entry
396	   exists; else (DFLT_MTU - 40). (40 = 20 bytes for TCP header plus
397	   20 bytes for IPv4 header). The TCP SYN segment carries the MSS option
398	   and initiates the neighbor affiliation process if no affiliation cur-
399	   rently exists (i.e., the TCP SYN segment is contained in the first
400	   packet out.) The neighbor affiliation may reduce the NBR_MTU value in
401	   the destination cache while the SYN packet waits on the Address Reso-
402	   lution queue. But, the system will self-correct when the peer
403	   responds with an MSS option in the SYN/ACK, since the peer will have
404	   up-to-date MRU information from the neighbor affiliation.
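
The MTU selection of Section 5.2 and the initial MSS rule above can be
sketched together. The dictionary used as a destination cache, the
function names and the example address are hypothetical; the overhead of
40 is the figure given in the text.

```python
from typing import Dict


def select_mtu(dest_cache: Dict[str, int], neighbor: str, dflt_mtu: int) -> int:
    """NBR_MTU if a destination cache entry exists, otherwise DFLT_MTU."""
    return dest_cache.get(neighbor, dflt_mtu)


def initial_tcp_mss(dest_cache: Dict[str, int], neighbor: str,
                    dflt_mtu: int, overhead: int = 40) -> int:
    """Initial TCP MSS: the selected MTU minus the overhead given in the text."""
    return select_mtu(dest_cache, neighbor, dflt_mtu) - overhead
```

For instance, with a cache entry of 1400 for a neighbor the initial MSS
is 1360, while a neighbor without an entry falls back to DFLT_MTU.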

406	5.4.  Adaptation to Overestimated DFLT_MTU, NBR_MTU

408	   When a packet is sent to a neighbor based on an overestimated
409	   DFLT_MTU or NBR_MTU, an ICMPv6 "packet too big" message is generated
410	   locally and the too-big packet is dropped. Upper layers will retrans-
411	   mit the data based on the reduced MTU specified in the packet too big
412	   message.

414	5.5.  Implementation Alternatives

416	   Obviously, the cleanest implementation would entail in-kernel instru-
417	   mentation of the IPv4 reassembly cache and intervention in the spe-
418	   cific IPv6-over-(foo)-over-IPv4 device drivers that will use neighbor
419	   affiliation. But, an alternative is to write an application that uses
420	   the Berkeley Packet Filter (libpcap) to monitor the fragments that
421	   enter the host and generate the NS/NA messages necessary to support
422	   the protocol outside of the context of the kernel. In this way, the
423	   neighbor affiliation protocol can be easily deployed on nodes with
424	   existing "vanilla" IPv6-over-(foo)-over-IPv4 tunnel drivers.

426	6.  Rationale for this Approach

428	   One might reasonably ask why the approach in this document is recom-
429	   mended instead of the current practices specified in
430	   [PMTUDv4],[PMTUDv6]. When IPv6 uses (foo)-over-IPv4 as a link layer,
431	   ICMPv6 "Packet Too Big" messages can only be produced by the tunnel
432	   encapsulator and decapsulator (i.e., the two IPv6 neighbor nodes),
433	   while ICMPv4 "Fragmentation Needed" messages are produced by the
434	   intervening IPv4 routers. But, the ICMPv4 messages are not readily
435	   translated into ICMPv6 since they are only guaranteed to include up
436	   to 8 bytes of the too big packet's data (i.e., not enough information
437	   to determine the IPv6 header).

439	   One possible solution is to cache the IPv6 packets at the encapsula-
440	   tor and match them up with any ICMPv4 "frag needed" messages that are
441	   delivered by the IPv4 network. But, this creates a state scaling
442	   issue - especially when the encapsulator is an IPv6 router. Also, an
443	   encapsulating router would need to retransmit the data in the too-big
444	   packets on behalf of the final destination, i.e., act as a "proxy"
445	   for the final destination - but, this would require the router to
446	   understand the semantics of the packetization layer of the original
447	   source when it receives ICMPv4 "frag needed" messages from the IPv4
448	   network.

450	   Finally (and most importantly), IPv4 routers in the path between the
451	   encapsulator and decapsulator cannot be trusted to reliably deliver
452	   ICMPv4 "frag needed" messages - they can be lost due to network con-
453	   gestion or filtering firewalls, and they can be forged by an attacker
454	   since the end nodes have no trust basis with the IPv4 routers. The
455	   only acceptable means is to engage the IPv6 endpoints in a neighbor
456	   affiliation as described in this document.

458	7.  IANA considerations

460	   N/A

462	8.  Security considerations

464	   This document provides a potential platform for integrating secure
465	   neighbor discovery mechanisms.

467	Acknowledgements

469	   The proposal herein is nearly identical to some presented on the TCP-
470	   IP discussion group [TCP-IP] and IETF Path MTU Discovery Working
471	   Group [MTUDWG] mailing lists, roughly during the period of May 1987
472	   through May 1990. The earliest proposal that most closely matches the
473	   one herein was offered by Charles Lynn on November 17, 1987. Others
474	   (e.g., Fox, Bohle, 1989, etc.) proposed combining the basic mechanism
475	   described by Lynn with transport layer protocols, e.g., TCP. To the
476	   best of the author's knowledge, this document presents the first sug-
477	   gested combination of the Lynn proposal with Neighbor Discovery.

479	   Earlier works from SRI International proposed a "router affiliation"
480	   protocol. The term "affiliation" as used in this document was
481	   directly derived from its use in those earlier works. Two SRI
482	   researchers who participated in this effort were Barbara Denny and
483	   Bob Gilligan.

485	   The author acknowledges those who participated in discussions on the
486	   NGTRANS and V6OPS mailing lists between August and October 2002 for
487	   helpful insights.

489	   The author would finally like to acknowledge the founding architects
490	   of the DARPA Internet protocols, who created the technologies used by
491	   the millions of nodes on the Internet today and the billions more to
492	   come in the foreseeable future.

494	Normative References

496	   [ICMPv6]   Conta, A. and S. Deering, "Internet Control Message
497	              Protocol (ICMPv6) for the Internet Protocol Version 6
498	              (IPv6) Specification", RFC 2463.

500	   [IPv4]     Postel, J., "Internet Protocol", RFC 791.

502	   [IPV6]     Deering, S., and R. Hinden, "Internet Protocol, Version 6
503	              (IPv6) Specification", RFC 2460.

505	   [DISC]     Narten, T., Nordmark, E., and W. Simpson, "Neighbor
506	              Discovery for IP Version 6 (IPv6)", RFC 2461,
507	              December 1998.

509	   [FRAG]     Kent, C., and J. Mogul, "Fragmentation Considered
510	              Harmful", December, 1987

512	   [FOLK]     Shannon, C., Moore, D. and k claffy, "Beyond Folklore:
513	              Observations on Fragmented Traffic"

515	   [PROBE]    Mogul, J., Kent, C., Partridge, C., and K. McCloghrie,
516	              "IP MTU Discovery Options", RFC 1063, July 1988.

518	   [PMTUDv4]  Mogul, J. and S. Deering, "Path MTU Discovery",
519	              RFC 1191, November 1990.

521	   [PMTUDv6]  McCann, J., Deering, S. and J. Mogul, "Path MTU
522	              Discovery for IP version 6", RFC 1981, August 1996.

524	   [MTUDWG]   IETF MTU Discovery Working group mailing list,
525	              gatekeeper.dec.com/pub/DEC/WRL/mogul/mtudwg-log,
526	              November 1989 - February 1995.

528	   [TCP]      Postel, J., "Transmission Control Protocol", RFC 793.

530	   [TCP-IP]   TCP-IP Mailing list archives,
531	              http://www-mice.cs.ucl.ac.uk/multimedia/misc/tcp_ip,
532	              May 1987 - May 1990.

534	Informative References

536	Author's Address

538	      Fred L. Templin
539	      Nokia
540	      313 Fairchild Drive
541	      Mountain View, CA, USA
542	      Phone: (650)-625-2331
543	      Email: ftemplin@iprg.nokia.com

545	APPENDIX A: Historic Evolution of PMTUD

547	   The topic of Path MTU discovery (PMTUD) saw a flurry of discussion
548	   and numerous proposals in the late 1980's through early 1990. The
549	   initial problem was posed by Art Berggreen on May 22, 1987 in a mes-
550	   sage to the TCP-IP discussion group [TCP-IP]. The discussion that
551	   followed provided significant reference material for [FRAG].  An IETF
552	   Path MTU Discovery Working Group [MTUDWG] was formed in late 1989
553	   with a charter to produce an RFC. Several variations on a very few
554	   basic proposals were entertained, including:

556	    1. Routers record the PMTUD estimate in ICMP-like path probe
557	       messages (proposed in [FRAG] and later [PROBE])

559	    2. The destination reports any fragmentation that occurs for
560	       packets received with the "RF" (Report Fragmentation) bit
561	       set (Steve Deering's 1989 adaptation of Charles Lynn's
562	       Nov. 1987 proposal)

564	    3. A hybrid combination of 1) and Charles Lynn's Nov. 1987
565	       proposal (straw RFC draft by McCloghrie, Fox and Mogul
566	       on Jan 12, 1990)

568	    4. Combination of the Lynn proposal with TCP (Fred Bohle,
569	       Jan 30, 1990)

571	    5. Fragmentation avoidance by setting "IP_DF" flag on all
572	       packets and retransmitting if ICMPv4 "fragmentation needed"
573	       messages occur (Geof Cooper's 1987 proposal; later adapted
574	       into [PMTUDv4] by Mogul and Deering)

576	   Option 1) seemed attractive to the group at the time, since it was
577	   believed that routers would migrate more quickly than hosts.  Option
578	   2) was a strong contender, but repeated attempts to secure an "RF"
579	   bit in the IPv4 header from the IESG failed and the proponents became
580	   discouraged. 3) was abandoned because it was perceived as too compli-
581	   cated, and 4) never received any apparent serious consideration. Pro-
582	   posal 5) was a late entry into the discussion from Steve Deering on
583	   Feb. 24th, 1990.  The discussion group soon thereafter seemingly lost
584	   track of all other proposals and adopted 5), which eventually evolved
585	   into [PMTUDv4] and later [PMTUDv6].

587	   In retrospect, the "RF" bit postulated in 2) is not needed if a "con-
588	   tract" is first established between the peers, as in proposal 4) and
589	   a message to the MTUDWG mailing list from jrd@PTT.LCS.MIT.EDU on Feb
590	   19, 1990. These proposals saw little discussion or rebuttal, and were
591	   dismissed based on the following assertions:

593	     - routers upgrade their software faster than hosts

595	     - PCs could not reassemble fragmented packets

597	     - Proteon and Wellfleet routers did not reproduce
598	       the "RF" bit properly in fragmented packets

600	     - Ethernet-FDDI bridges would need to perform fragmentation
601	       (i.e., "translucent" not "transparent" bridging)

603	     - the 16-bit IP_ID field could wrap around and disrupt
604	       reassembly at high packet arrival rates

606	   The first four assertions, although perhaps valid at the time, have
607	   been overcome by historical events leaving only the final to con-
608	   sider. But, [FOLK] has shown that IP_ID wraparound simply does not
609	   occur within several orders of magnitude of the reassembly timeout
610	   window on high-bandwidth networks.