1 INTERNET-DRAFT J. McCann, Digital Equipment Corporation 2 May 23, 1996 S. Deering, Xerox PARC 3 J. Mogul, Digital Equipment Corporation 5 Path MTU Discovery for IP version 6 7 draft-ietf-ipngwg-pmtuv6-03.txt 9 Abstract 11 This document describes Path MTU Discovery for IP version 6. It is 12 largely derived from RFC-1191, which describes Path MTU Discovery for 13 IP version 4. 15 Status of this Memo 17 This document is an Internet-Draft. Internet-Drafts are working 18 documents of the Internet Engineering Task Force (IETF), its areas, 19 and its working groups. Note that other groups may also distribute 20 working documents as Internet-Drafts. 22 Internet-Drafts are draft documents valid for a maximum of six months 23 and may be updated, replaced, or obsoleted by other documents at any 24 time. It is inappropriate to use Internet-Drafts as reference 25 material or to cite them other than as ``work in progress.'' 27 To learn the current status of any Internet-Draft, please check the 28 ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow 29 Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), 30 munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or 31 ftp.isi.edu (US West Coast). 33 Distribution of this document is unlimited. 35 Expiration 37 November 23, 1996 39 Contents 41 Abstract........................................................1 43 Status of this Memo.............................................1 45 Contents........................................................2 47 1. Introduction.................................................3 49 2. Terminology..................................................3 51 3. Protocol overview............................................4 53 4. Protocol Requirements........................................5 55 5. Implementation Issues........................................6 57 5.1. Layering...................................................6 59 5.2. Storing PMTU information...................................7 61 5.3. Purging stale PMTU information.............................9 63 5.4.
TCP layer actions.........................................10 65 5.5. Issues for other transport protocols......................11 67 5.6. Management interface......................................12 69 6. Security considerations.....................................12 71 Acknowledgements...............................................13 73 Appendix A - Comparison to RFC 1191............................14 75 References.....................................................15 77 Authors' Addresses.............................................16 79 1. Introduction 81 When one IPv6 node has a large amount of data to send to another 82 node, the data is transmitted in a series of IPv6 packets. It is 83 usually preferable that these packets be of the largest size that can 84 successfully traverse the path from the source node to the 85 destination node. This packet size is referred to as the Path MTU 86 (PMTU), and it is equal to the minimum link MTU of all the links in a 87 path. IPv6 defines a standard mechanism for a node to discover the 88 PMTU of an arbitrary path. 90 IPv6 nodes SHOULD implement Path MTU Discovery in order to discover 91 and take advantage of paths with PMTU greater than the IPv6 minimum 92 link MTU [IPv6-SPEC]. A minimal IPv6 implementation (e.g., in a boot 93 ROM) may choose to omit implementation of Path MTU Discovery. 95 Nodes not implementing Path MTU Discovery use the IPv6 minimum link 96 MTU defined in [IPv6-SPEC] as the maximum packet size. In most 97 cases, this will result in the use of smaller packets than necessary, 98 because most paths have a PMTU greater than the IPv6 minimum link 99 MTU. A node sending packets much smaller than the Path MTU allows is 100 wasting network resources and probably getting suboptimal throughput. 102 2. Terminology 104 node - a device that implements IPv6. 106 router - a node that forwards IPv6 packets not explicitly 107 addressed to itself. 109 host - any node that is not a router. 111 upper layer - a protocol layer immediately above IPv6. Examples are 112 transport protocols such as TCP and UDP, control 113 protocols such as ICMP, routing protocols such as OSPF, 114 and internet or lower-layer protocols being "tunneled" 115 over (i.e., encapsulated in) IPv6 such as IPX, 116 AppleTalk, or IPv6 itself. 118 link - a communication facility or medium over which nodes can 119 communicate at the link layer, i.e., the layer 120 immediately below IPv6. Examples are Ethernets (simple 121 or bridged); PPP links; X.25, Frame Relay, or ATM 122 networks; and internet (or higher) layer "tunnels", 123 such as tunnels over IPv4 or IPv6 itself. 125 interface - a node's attachment to a link. 127 address - an IPv6-layer identifier for an interface or a set of 128 interfaces. 130 packet - an IPv6 header plus payload. 132 link MTU - the maximum transmission unit, i.e., maximum packet 133 size in octets, that can be conveyed in one piece over 134 a link. 136 path - the set of links traversed by a packet between a source 137 node and a destination node 139 path MTU - the minimum link MTU of all the links in a path between 140 a source node and a destination node. 142 PMTU - path MTU 144 Path MTU 145 Discovery - process by which a node learns the PMTU of a path 147 flow - a sequence of packets sent from a particular source 148 to a particular (unicast or multicast) destination for 149 which the source desires special handling by the 150 intervening routers. 152 flow id - a combination of a source address and a non-zero 153 flow label. 155 3. 
Protocol overview 157 This memo describes a technique to dynamically discover the PMTU of a 158 path. The basic idea is that a source node initially assumes that 159 the PMTU of a path is the (known) MTU of the first hop in the path. 160 If any of the packets sent on that path are too large to be forwarded 161 by some node along the path, that node will discard them and return 162 ICMPv6 Packet Too Big messages [ICMPv6]. Upon receipt of such a 163 message, the source node reduces its assumed PMTU for the path based 164 on the MTU of the constricting hop as reported in the Packet Too Big 165 message. 167 The Path MTU Discovery process ends when the node's estimate of the 168 PMTU is less than or equal to the actual PMTU. Note that several 169 iterations of the packet-sent/Packet-Too-Big-message-received cycle 170 may occur before the Path MTU Discovery process ends, as there may be 171 links with smaller MTUs further along the path. 173 Alternatively, the node may elect to end the discovery process by 174 ceasing to send packets larger than the IPv6 minimum link MTU. 176 The PMTU of a path may change over time, due to changes in the 177 routing topology. Reductions of the PMTU are detected by Packet Too 178 Big messages. To detect increases in a path's PMTU, a node 179 periodically increases its assumed PMTU. This will almost always 180 result in packets being discarded and Packet Too Big messages being 181 generated, because in most cases the PMTU of the path will not have 182 changed. Therefore, attempts to detect increases in a path's PMTU 183 should be done infrequently. 185 Path MTU Discovery supports multicast as well as unicast 186 destinations. In the case of a multicast destination, copies of a 187 packet may traverse many different paths to many different nodes. 188 Each path may have a different PMTU, and a single multicast packet 189 may result in multiple Packet Too Big messages, each reporting a 190 different next-hop MTU. The minimum PMTU value across the set of 191 paths in use determines the size of subsequent packets sent to the 192 multicast destination. 194 Note that Path MTU Discovery must be performed even in cases where a 195 node "thinks" a destination is attached to the same link as itself. 196 In a situation such as when a neighboring router acts as proxy [ND] 197 for some destination, the destination can appear to be directly 198 connected but is in fact more than one hop away. 200 4. Protocol Requirements 202 As discussed in section 1, IPv6 nodes are not required to implement 203 Path MTU Discovery. The requirements in this section apply only to 204 those implementations that include Path MTU Discovery. 206 When a node receives a Packet Too Big message, it MUST reduce its 207 estimate of the PMTU for the relevant path, based on the value of the 208 MTU field in the message. The precise behavior of a node in this 209 circumstance is not specified, since different applications may have 210 different requirements, and since different implementation 211 architectures may favor different strategies. 213 After receiving a Packet Too Big message, a node MUST attempt to 214 avoid eliciting more such messages in the near future. The node MUST 215 reduce the size of the packets it is sending along the path. Using a 216 PMTU estimate larger than the IPv6 minimum link MTU may continue to 217 elicit Packet Too Big messages.
Since each of these messages (and 218 the dropped packets they respond to) consume network resources, the 219 node MUST force the Path MTU Discovery process to end. 221 Nodes using Path MTU Discovery MUST detect decreases in PMTU as fast 222 as possible. Nodes MAY detect increases in PMTU, but because doing 223 so requires sending packets larger than the current estimated PMTU, 224 and because the likelihood is that the PMTU will not have increased, 225 this MUST be done at infrequent intervals. An attempt to detect an 226 increase (by sending a packet larger than the current estimate) MUST 227 NOT be done less than 5 minutes after a Packet Too Big message has 228 been received for the given path. The recommended setting for this 229 timer is twice its minimum value (10 minutes). 231 A node MUST NOT reduce its estimate of the Path MTU below the IPv6 232 minimum link MTU. 234 Note: A node may receive a Packet Too Big message reporting a 235 next-hop MTU that is less than the IPv6 minimum link MTU. In that 236 case, the node is not required to reduce the size of subsequent 237 packets sent on the path to less than the IPv6 minimum link MTU, 238 but rather must include a Fragment header in those packets [IPv6- 239 SPEC]. 241 A node MUST NOT increase its estimate of the Path MTU in response to 242 the contents of a Packet Too Big message. A message purporting to 243 announce an increase in the Path MTU might be a stale packet that has 244 been floating around in the network, a false packet injected as part 245 of a denial-of-service attack, or the result of having multiple paths 246 to the destination, each with a different PMTU. 248 5. Implementation Issues 250 This section discusses a number of issues related to the 251 implementation of Path MTU Discovery. This is not a specification, 252 but rather a set of notes provided as an aid for implementors. 254 The issues include: 256 - What layer or layers implement Path MTU Discovery? 258 - How is the PMTU information cached? 260 - How is stale PMTU information removed? 262 - What must transport and higher layers do? 264 5.1. Layering 266 In the IP architecture, the choice of what size packet to send is 267 made by a protocol at a layer above IP. This memo refers to such a 268 protocol as a "packetization protocol". Packetization protocols are 269 usually transport protocols (for example, TCP) but can also be 270 higher-layer protocols (for example, protocols built on top of UDP). 272 Implementing Path MTU Discovery in the packetization layers 273 simplifies some of the inter-layer issues, but has several drawbacks: 274 the implementation may have to be redone for each packetization 275 protocol, it becomes hard to share PMTU information between different 276 packetization layers, and the connection-oriented state maintained by 277 some packetization layers may not easily extend to save PMTU 278 information for long periods. 280 It is therefore suggested that the IP layer store PMTU information 281 and that the ICMP layer process received Packet Too Big messages. 282 The packetization layers may respond to changes in the PMTU by 283 changing the size of the messages they send. To support this 284 layering, packetization layers require a way to learn of changes in 285 the value of MMS_S, the "maximum send transport-message size". The 286 MMS_S is derived from the Path MTU by subtracting the size of the 287 IPv6 header plus space reserved by the IP layer for additional 288 headers (if any).
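
   As an informal illustration of this layering (not part of the
   specification), the C fragment below sketches how an IP layer might
   derive MMS_S from a cached PMTU value.  The constant and function
   names are hypothetical; the sketch assumes the fixed 40-octet IPv6
   header and takes any space the IP layer reserves for additional
   headers as a parameter.

      #include <stdint.h>
      #include <stdio.h>

      #define IPV6_HDR_LEN  40     /* fixed IPv6 header size [IPv6-SPEC] */
      #define IPV6_MIN_MTU  1280   /* assumed IPv6 minimum link MTU */

      /*
       * Hypothetical helper: derive MMS_S, the maximum send
       * transport-message size, from the current PMTU estimate for a
       * path.  "reserved_hdr_len" is whatever space the IP layer sets
       * aside for additional headers (Routing header, Fragment header,
       * etc.); zero if none are in use.
       */
      static uint32_t mms_s_from_pmtu(uint32_t pmtu,
                                      uint32_t reserved_hdr_len)
      {
          if (pmtu < IPV6_MIN_MTU)
              pmtu = IPV6_MIN_MTU;  /* estimate is never below the minimum link MTU */
          return pmtu - IPV6_HDR_LEN - reserved_hdr_len;
      }

      int main(void)
      {
          /* e.g., an Ethernet-sized first hop, no extension headers reserved */
          printf("MMS_S = %u\n", (unsigned) mms_s_from_pmtu(1500, 0));  /* 1460 */
          return 0;
      }

   A packetization layer would be handed this value, or notified when
   it changes, and would size its messages accordingly.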
290 It is possible that a packetization layer, perhaps a UDP application 291 outside the kernel, is unable to change the size of messages it 292 sends. This may result in a packet size that exceeds the Path MTU. 293 To accommodate such situations, IPv6 defines a mechanism that allows 294 large payloads to be divided into fragments, with each fragment sent 295 in a separate packet (see [IPv6-SPEC] section "Fragment Header"). 296 However, packetization layers are encouraged to avoid sending 297 messages that will require fragmentation (for the case against 298 fragmentation, see [FRAG]). 300 5.2. Storing PMTU information 302 Ideally, a PMTU value should be associated with a specific path 303 traversed by packets exchanged between the source and destination 304 nodes. However, in most cases a node will not have enough 305 information to completely and accurately identify such a path. 306 Rather, a node must associate a PMTU value with some local 307 representation of a path. It is left to the implementation to select 308 the local representation of a path. 310 In the case of a multicast destination address, copies of a packet 311 may traverse many different paths to reach many different nodes. The 312 local representation of the "path" to a multicast destination must in 313 fact represent a potentially large set of paths. 315 Minimally, an implementation could maintain a single PMTU value to be 316 used for all packets originated from the node. This PMTU value would 317 be the minimum PMTU learned across the set of all paths in use by the 318 node. This approach is likely to result in the use of smaller 319 packets than is necessary for many paths. 321 An implementation could use the destination address as the local 322 representation of a path. The PMTU value associated with a 323 destination would be the minimum PMTU learned across the set of all 324 paths in use to that destination. The set of paths in use to a 325 particular destination is expected to be small, in many cases 326 consisting of a single path. This approach will result in the use of 327 optimally sized packets on a per-destination basis. This approach 328 integrates nicely with the conceptual model of a host as described in 329 [ND]: a PMTU value could be stored with the corresponding entry in 330 the destination cache. 332 If flows [IPv6-SPEC] are in use, an implementation could use the flow 333 id as the local representation of a path. Packets sent to a 334 particular destination but belonging to different flows may use 335 different paths, with the choice of path depending on the flow id. 336 This approach will result in the use of optimally sized packets on a 337 per-flow basis, providing finer granularity than PMTU values 338 maintained on a per-destination basis. 340 For source routed packets (i.e. packets containing an IPv6 Routing 341 header [IPv6-SPEC]), the source route may further qualify the local 342 representation of a path. In particular, a packet containing a type 343 0 Routing header in which all bits in the Strict/Loose Bit Map are 344 equal to 1 contains a complete path specification. An implementation 345 could use source route information in the local representation of a 346 path. 348 Note: Some paths may be further distinguished by different 349 security classifications. The details of such classifications are 350 beyond the scope of this memo. 352 Initially, the PMTU value for a path is assumed to be the (known) MTU 353 of the first-hop link. 
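
   As a non-normative sketch of the per-destination approach
   (hypothetical names and structures, written in C), the fragment
   below stores a PMTU estimate in a destination cache entry as
   suggested by the host model of [ND], initializes it to the
   first-hop link MTU, and applies the rules of section 4 when a
   Packet Too Big message arrives: the estimate is never raised in
   response to such a message and never reduced below the IPv6
   minimum link MTU.

      #include <stdint.h>
      #include <stdio.h>
      #include <string.h>
      #include <time.h>

      #define IPV6_MIN_MTU  1280   /* assumed IPv6 minimum link MTU */

      /* stand-in for struct in6_addr, to keep the sketch self-contained */
      struct in6_addr_ex { uint8_t s6_addr[16]; };

      /* one destination cache entry; the destination address is the
         local representation of the path */
      struct dest_cache_entry {
          struct in6_addr_ex dst;
          uint32_t pmtu;             /* current PMTU estimate for this destination */
          time_t   pmtu_timestamp;   /* 0 = "reserved": estimate equals first-hop MTU */
      };

      /* initially, the PMTU for a path is the (known) first-hop link MTU */
      static void dest_entry_init(struct dest_cache_entry *e,
                                  const struct in6_addr_ex *dst,
                                  uint32_t first_hop_mtu)
      {
          memcpy(&e->dst, dst, sizeof(*dst));
          e->pmtu = first_hop_mtu;
          e->pmtu_timestamp = 0;
      }

      /*
       * Apply the MTU field of a received Packet Too Big message to this
       * path.  Returns 1 if the estimate decreased (packetization layers
       * should then be notified), 0 otherwise.  The estimate is never
       * raised by this function and never reduced below IPV6_MIN_MTU;
       * smaller reported values mean "include a Fragment header" instead.
       */
      static int dest_entry_packet_too_big(struct dest_cache_entry *e,
                                           uint32_t reported_mtu)
      {
          uint32_t tentative = reported_mtu;

          if (tentative < IPV6_MIN_MTU)
              tentative = IPV6_MIN_MTU;
          if (tentative >= e->pmtu)
              return 0;                    /* stale or bogus report; ignore */

          e->pmtu = tentative;
          e->pmtu_timestamp = time(NULL);  /* remember the decrease, for later aging */
          return 1;
      }

      int main(void)
      {
          struct in6_addr_ex dst = { { 0x20, 0x01, 0x0d, 0xb8 } };  /* example */
          struct dest_cache_entry e;

          dest_entry_init(&e, &dst, 1500);          /* Ethernet-sized first hop */
          if (dest_entry_packet_too_big(&e, 1400))  /* a router reports MTU 1400 */
              printf("PMTU reduced to %u\n", (unsigned) e.pmtu);
          return 0;
      }

   An implementation keying its cache by flow id or by source route,
   as discussed above, could attach the same two fields to those
   entries instead.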
355 When a Packet Too Big message is received, the node determines which 356 path the message applies to based on the contents of the Packet Too 357 Big message. For example, if the destination address is used as the 358 local representation of a path, the destination address from the 359 original packet would be used to determine which path the message 360 applies to. 362 Note: if the original packet contained a Routing header, the 363 Routing header should be used to determine the location of the 364 destination address within the original packet. If Segments Left 365 is equal to zero, the destination address is in the Destination 366 Address field in the IPv6 header. If Segments Left is greater 367 than zero, the destination address is the last address 368 (Address[n]) in the Routing header. 370 The node then uses the value in the MTU field in the Packet Too Big 371 message as a tentative PMTU value, and compares the tentative PMTU to 372 the existing PMTU. If the tentative PMTU is less than the existing 373 PMTU estimate, the tentative PMTU replaces the existing PMTU as the 374 PMTU value for the path. 376 The packetization layers must be notified about decreases in the 377 PMTU. Any packetization layer instance (for example, a TCP 378 connection) that is actively using the path must be notified if the 379 PMTU estimate is decreased. 381 Note: even if the Packet Too Big message contains an Original 382 Packet Header that refers to a UDP packet, the TCP layer must be 383 notified if any of its connections use the given path. 385 Also, the instance that sent the packet that elicited the Packet Too 386 Big message should be notified that its packet has been dropped, even 387 if the PMTU estimate has not changed, so that it may retransmit the 388 dropped data. 390 Note: An implementation can avoid the use of an asynchronous 391 notification mechanism for PMTU decreases by postponing 392 notification until the next attempt to send a packet larger than 393 the PMTU estimate. In this approach, when an attempt is made to 394 SEND a packet that is larger than the PMTU estimate, the SEND 395 function should fail and return a suitable error indication. This 396 approach may be more suitable to a connectionless packetization 397 layer (such as one using UDP), which (in some implementations) may 398 be hard to "notify" from the ICMP layer. In this case, the normal 399 timeout-based retransmission mechanisms would be used to recover 400 from the dropped packets. 402 It is important to understand that the notification of the 403 packetization layer instances using the path about the change in the 404 PMTU is distinct from the notification of a specific instance that a 405 packet has been dropped. The latter should be done as soon as 406 practical (i.e., asynchronously from the point of view of the 407 packetization layer instance), while the former may be delayed until 408 a packetization layer instance wants to create a packet. 409 Retransmission should be done only for those packets that are 410 known to be dropped, as indicated by a Packet Too Big message. 412 5.3. Purging stale PMTU information 414 Internetwork topology is dynamic; routes change over time. While the 415 local representation of a path may remain constant, the actual 416 path(s) in use may change. Thus, PMTU information cached by a node 417 can become stale. 419 If the stale PMTU value is too large, this will be discovered almost 420 immediately once a large enough packet is sent on the path.
No such 421 mechanism exists for realizing that a stale PMTU value is too small, 422 so an implementation should "age" cached values. When a PMTU value 423 has not been decreased for a while (on the order of 10 minutes), the 424 PMTU estimate should be set to the MTU of the first-hop link, and the 425 packetization layers should be notified of the change. This will 426 cause the complete Path MTU Discovery process to take place again. 428 Note: an implementation should provide a means for changing the 429 timeout duration, including setting it to "infinity". For 430 example, nodes attached to an FDDI link which is then attached to 431 the rest of the Internet via a small MTU serial line are never 432 going to discover a new non-local PMTU, so they should not have to 433 put up with dropped packets every 10 minutes. 435 An upper layer must not retransmit data in response to an increase in 436 the PMTU estimate, since this increase never comes in response to an 437 indication of a dropped packet. 439 One approach to implementing PMTU aging is to associate a timestamp 440 field with a PMTU value. This field is initialized to a "reserved" 441 value, indicating that the PMTU is equal to the MTU of the first hop 442 link. Whenever the PMTU is decreased in response to a Packet Too Big 443 message, the timestamp is set to the current time. 445 Once a minute, a timer-driven procedure runs through all cached PMTU 446 values, and for each PMTU whose timestamp is not "reserved" and is 447 older than the timeout interval: 449 - The PMTU estimate is set to the MTU of the first hop link. 451 - The timestamp is set to the "reserved" value. 453 - Packetization layers using this path are notified of the increase. 455 5.4. TCP layer actions 457 The TCP layer must track the PMTU for the path(s) in use by a 458 connection; it should not send segments that would result in packets 459 larger than the PMTU. A simple implementation could ask the IP layer 460 for this value each time it created a new segment, but this could be 461 inefficient. Moreover, TCP implementations that follow the "slow- 462 start" congestion-avoidance algorithm [CONG] typically calculate and 463 cache several other values derived from the PMTU. It may be simpler 464 to receive asynchronous notification when the PMTU changes, so that 465 these variables may be updated. 467 A TCP implementation must also store the MSS value received from its 468 peer, and must not send any segment larger than this MSS, regardless 469 of the PMTU. In 4.xBSD-derived implementations, this may require 470 adding an additional field to the TCP state record. 472 The value sent in the TCP MSS option is independent of the PMTU. 473 This MSS option value is used by the other end of the connection, 474 which may be using an unrelated PMTU value. See [IPv6-SPEC] sections 475 "Packet Size Issues" and "Maximum Upper-Layer Payload Size" for 476 information on selecting a value for the TCP MSS option. 478 When a Packet Too Big message is received, it implies that a packet 479 was dropped by the node that sent the ICMP message. It is sufficient 480 to treat this as any other dropped segment, and wait until the 481 retransmission timer expires to cause retransmission of the segment. 482 If the Path MTU Discovery process requires several steps to find the 483 PMTU of the full path, this could delay the connection by many 484 round-trip times. 
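
   As a rough, non-normative illustration of the sizing rules above
   (hypothetical names, in C), the fragment below computes the largest
   segment a TCP sender should build for a connection: the smaller of
   the MSS advertised by the peer and the size permitted by the
   current PMTU estimate.  It assumes a packet with no extension
   headers and a TCP header without options.

      #include <stdint.h>
      #include <stdio.h>

      #define IPV6_HDR_LEN  40   /* fixed IPv6 header */
      #define TCP_HDR_LEN   20   /* TCP header without options (assumed) */

      /*
       * Hypothetical helper: the largest TCP segment (payload) to build
       * for a connection, limited both by the peer's advertised MSS and
       * by the current PMTU estimate for the path, so that no packet
       * larger than the PMTU is generated.
       */
      static uint32_t tcp_effective_mss(uint32_t peer_mss, uint32_t pmtu)
      {
          uint32_t from_pmtu = pmtu - IPV6_HDR_LEN - TCP_HDR_LEN;
          return peer_mss < from_pmtu ? peer_mss : from_pmtu;
      }

      int main(void)
      {
          /* the peer advertised MSS 8000, but the path only carries
             1500-octet packets */
          printf("effective MSS = %u\n",
                 (unsigned) tcp_effective_mss(8000, 1500));   /* 1440 */
          return 0;
      }

   When a Packet Too Big message lowers the PMTU estimate, a
   connection would recompute this limit before retransmitting the
   dropped segment.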
486 Alternatively, the retransmission could be done in immediate response 487 to a notification that the Path MTU has changed, but only for the 488 specific connection specified by the Packet Too Big message. The 489 packet size used in the retransmission should be no larger than the 490 new PMTU. 492 Note: A packetization layer must not retransmit in response to 493 every Packet Too Big message, since a burst of several oversized 494 segments will give rise to several such messages and hence several 495 retransmissions of the same data. If the new estimated PMTU is 496 still wrong, the process repeats, and there is an exponential 497 growth in the number of superfluous segments sent. 499 This means that the TCP layer must be able to recognize when a 500 Packet Too Big notification actually decreases the PMTU that it 501 has already used to send a packet on the given connection, and 502 should ignore any other notifications. 504 Many TCP implementations incorporate "congestion avoidance" and 505 "slow-start" algorithms to improve performance [CONG]. Unlike a 506 retransmission caused by a TCP retransmission timeout, a 507 retransmission caused by a Packet Too Big message should not change 508 the congestion window. It should, however, trigger the slow-start 509 mechanism (i.e., only one segment should be retransmitted until 510 acknowledgements begin to arrive again). 512 TCP performance can be reduced if the sender's maximum window size is 513 not an exact multiple of the segment size in use (this is not the 514 congestion window size, which is always a multiple of the segment 515 size). In many systems (such as those derived from 4.2BSD), the 516 segment size is often set to 1024 octets, and the maximum window size 517 (the "send space") is usually a multiple of 1024 octets, so the 518 proper relationship holds by default. If Path MTU Discovery is used, 519 however, the segment size may not be a submultiple of the send space, 520 and it may change during a connection; this means that the TCP layer 521 may need to change the transmission window size when Path MTU 522 Discovery changes the PMTU value. The maximum window size should be 523 set to the greatest multiple of the segment size that is less than or 524 equal to the sender's buffer space size. 526 5.5. Issues for other transport protocols 528 Some transport protocols (such as ISO TP4 [ISOTP]) are not allowed to 529 repacketize when doing a retransmission. That is, once an attempt is 530 made to transmit a segment of a certain size, the transport cannot 531 split the contents of the segment into smaller segments for 532 retransmission. In such a case, the original segment can be 533 fragmented by the IP layer during retransmission. Subsequent 534 segments, when transmitted for the first time, should be no larger 535 than allowed by the Path MTU. 537 The Sun Network File System (NFS) uses a Remote Procedure Call (RPC) 538 protocol [RPC] that, when used over UDP, in many cases will generate 539 payloads that must be fragmented even for the first-hop link. This 540 might improve performance in certain cases, but it is known to cause 541 reliability and performance problems, especially when the client and 542 server are separated by routers. 544 It is recommended that NFS implementations use Path MTU Discovery 545 whenever routers are involved. 
Most NFS implementations allow the 546 RPC datagram size to be changed at mount-time (indirectly, by 547 changing the effective file system block size), but might require 548 some modification to support changes later on. 550 Also, since a single NFS operation cannot be split across several UDP 551 datagrams, certain operations (primarily, those operating on file 552 names and directories) require a minimum payload size that if sent in 553 a single packet would exceed the PMTU. NFS implementations should 554 not reduce the payload size below this threshold, even if Path MTU 555 Discovery suggests a lower value. In this case the payload will be 556 fragmented by the IP layer. 558 5.6. Management interface 560 It is suggested that an implementation provide a way for a system 561 utility program to: 563 - Specify that Path MTU Discovery not be done on a given path. 565 - Change the PMTU value associated with a given path. 567 The former can be accomplished by associating a flag with the path; 568 when a packet is sent on a path with this flag set, the IP layer does 569 not send packets larger than the IPv6 minimum link MTU. 571 These features might be used to work around an anomalous situation, 572 or by a routing protocol implementation that is able to obtain Path 573 MTU values. 575 The implementation should also provide a way to change the timeout 576 period for aging stale PMTU information. 578 6. Security considerations 580 This Path MTU Discovery mechanism makes possible two denial-of- 581 service attacks, both based on a malicious party sending false Packet 582 Too Big messages to a node. 584 In the first attack, the false message indicates a PMTU much smaller 585 than reality. This should not entirely stop data flow, since the 586 victim node should never set its PMTU estimate below the IPv6 minimum 587 link MTU. It will, however, result in suboptimal performance. 589 In the second attack, the false message indicates a PMTU larger than 590 reality. If believed, this could cause temporary blockage as the 591 victim sends packets that will be dropped by some router. Within one 592 round-trip time, the node would discover its mistake (receiving 593 Packet Too Big messages from that router), but frequent repetition of 594 this attack could cause lots of packets to be dropped. A node, 595 however, should never raise its estimate of the PMTU based on a 596 Packet Too Big message, so should not be vulnerable to this attack. 598 A malicious party could also cause problems if it could stop a victim 599 from receiving legitimate Packet Too Big messages, but in this case 600 there are simpler denial-of-service attacks available. 602 Acknowledgements 604 We would like to acknowledge the authors of and contributors to 605 [RFC-1191], from which the majority of this document was derived. We 606 would also like to acknowledge the members of the IPng working group 607 for their careful review and constructive criticisms. 609 Appendix A - Comparison to RFC 1191 611 This document is based in large part on RFC 1191, which describes 612 Path MTU Discovery for IPv4. 
Certain portions of RFC 1191 were not 613 needed in this document: 615 router specification - Packet Too Big messages and corresponding 616 router behavior are defined in [ICMPv6] 618 Don't Fragment bit - there is no DF bit in IPv6 packets 620 TCP MSS discussion - selecting a value to send in the TCP MSS 621 option is discussed in [IPv6-SPEC] 623 old-style messages - all Packet Too Big messages report the 624 MTU of the constricting link 626 MTU plateau tables - not needed because there are no old-style 627 messages 629 References 631 [CONG] Van Jacobson. Congestion Avoidance and Control. Proc. 632 SIGCOMM '88 Symposium on Communications Architectures and 633 Protocols, pages 314-329. Stanford, CA, August, 1988. 635 [FRAG] C. Kent and J. Mogul. Fragmentation Considered Harmful. 636 In Proc. SIGCOMM '87 Workshop on Frontiers in Computer 637 Communications Technology. August, 1987. 639 [ICMPv6] A. Conta and S. Deering, "Internet Control Message 640 Protocol (ICMPv6) for the Internet Protocol Version 6 641 (IPv6) Specification", RFC 1885, December 1995 643 [IPv6-SPEC] S. Deering and R. Hinden, "Internet Protocol, Version 6 644 (IPv6) Specification", RFC 1883, December 1995 646 [ISOTP] ISO. ISO Transport Protocol Specification: ISO DP 8073. 647 RFC 905, SRI Network Information Center, April, 1984. 649 [ND] T. Narten, E. Nordmark, and W. Simpson, "Neighbor 650 Discovery for IP Version 6 (IPv6)", work in progress 651 draft-ietf-ipngwg-discovery-04.txt, February 1996. 653 [RFC-1191] J. Mogul and S. Deering, "Path MTU Discovery", 654 November 1990 656 [RPC] Sun Microsystems, Inc. RPC: Remote Procedure Call 657 Protocol. RFC 1057, SRI Network Information Center, 658 June, 1988. 660 Authors' Addresses 662 Jack McCann 663 Digital Equipment Corporation 664 110 Spitbrook Road, ZKO3-3/U14 665 Nashua, NH 03062 666 Phone: +1 603 881 2608 667 Fax: +1 603 881 0120 668 Email: mccann@zk3.dec.com 670 Stephen E. Deering 671 Xerox Palo Alto Research Center 672 3333 Coyote Hill Road 673 Palo Alto, CA 94304 674 Phone: +1 415 812 4839 675 Fax: +1 415 812 4471 676 Email: deering@parc.xerox.com 678 Jeffrey Mogul 679 Digital Equipment Corporation Western Research Laboratory 680 250 University Avenue 681 Palo Alto, CA 94301 682 Phone: +1 415 617 3304 683 Email: mogul@pa.dec.com 685 Expiration 687 November 23, 1996