idnits 2.17.1 

draft-ietf-tsvwg-datagram-plpmtud-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The abstract seems to indicate that this document updates RFC8201, but
     the header doesn't have an 'Updates:' line to match this.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC4821, updated by this document, for
     RFC5378 checks: 2003-10-21)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 5, 2018) is 2059 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-transport-14

  == Outdated reference: A later version (-32) exists of
     draft-ietf-tsvwg-udp-options-05

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 4960 (Obsoleted by RFC 9260)


     Summary: 2 errors (**), 0 flaws (~~), 3 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force                             G. Fairhurst
3	Internet-Draft                                                  T. Jones
4	Updates: 4821 (if approved)                       University of Aberdeen
5	Intended status: Standards Track                               M. Tuexen
6	Expires: March 9, 2019                                      I. Ruengeler
7	                                 Muenster University of Applied Sciences
8	                                                       September 5, 2018

10	     Packetization Layer Path MTU Discovery for Datagram Transports
11	                  draft-ietf-tsvwg-datagram-plpmtud-04

13	Abstract

15	   This document describes a robust method for Path MTU Discovery
16	   (PMTUD) for datagram Packetization Layers (PLs).  The document
17	   describes an extension to RFC 1191 and RFC 8201, which specify
18	   ICMP-based Path MTU Discovery for IPv4 and IPv6.  The method allows a
19	   PL, or a datagram application that uses a PL, to discover whether a
20	   network path can support the current size of datagram.  This can be
21	   used to detect and reduce the message size when a sender encounters a
22	   network black hole (where packets are discarded, and no ICMP message
23	   is received).  The method can also probe a network path with
24	   progressively larger packets to find whether the maximum packet size
25	   can be increased.  This allows a sender to determine an appropriate
26	   packet size, providing functionality for datagram transports that is
27	   equivalent to the Packetization Layer PMTUD specification for TCP,
28	   specified in RFC 4821.

30	   The document also provides implementation notes for incorporating
31	   Datagram PMTUD into IETF datagram transports or applications that use
32	   datagram transports.

34	   When published, this specification updates RFC 4821 when used with
35	   datagram transports.

37	Status of This Memo

39	   This Internet-Draft is submitted in full conformance with the
40	   provisions of BCP 78 and BCP 79.

42	   Internet-Drafts are working documents of the Internet Engineering
43	   Task Force (IETF).  Note that other groups may also distribute
44	   working documents as Internet-Drafts.  The list of current Internet-
45	   Drafts is at https://datatracker.ietf.org/drafts/current/.

47	   Internet-Drafts are draft documents valid for a maximum of six months
48	   and may be updated, replaced, or obsoleted by other documents at any
49	   time.  It is inappropriate to use Internet-Drafts as reference
50	   material or to cite them other than as "work in progress."

52	   This Internet-Draft will expire on March 9, 2019.

54	Copyright Notice

56	   Copyright (c) 2018 IETF Trust and the persons identified as the
57	   document authors.  All rights reserved.

59	   This document is subject to BCP 78 and the IETF Trust's Legal
60	   Provisions Relating to IETF Documents
61	   (https://trustee.ietf.org/license-info) in effect on the date of
62	   publication of this document.  Please review these documents
63	   carefully, as they describe your rights and restrictions with respect
64	   to this document.  Code Components extracted from this document must
65	   include Simplified BSD License text as described in Section 4.e of
66	   the Trust Legal Provisions and are provided without warranty as
67	   described in the Simplified BSD License.

69	Table of Contents

71	   1.  Introduction
72	     1.1.  Classical Path MTU Discovery
73	     1.2.  Packetization Layer Path MTU Discovery
74	     1.3.  Path MTU Discovery for Datagram Services
75	   2.  Terminology
76	   3.  Features Required to Provide Datagram PLPMTUD
77	   4.  DPLPMTUD Mechanisms
78	     4.1.  PLPMTU Probe Packets
79	     4.2.  Confirmation of Probed Packet Size
80	     4.3.  Detection of Black Holes
81	     4.4.  Response to PTB Messages
82	       4.4.1.  Validation of PTB Messages
83	       4.4.2.  Use of PTB Messages
84	   5.  Datagram Packetization Layer PMTUD
85	     5.1.  DPLPMTUD Components
86	       5.1.1.  Timers
87	       5.1.2.  Constants
88	       5.1.3.  Variables
89	     5.2.  DPLPMTUD Phases
90	       5.2.1.  Path Confirmation Phase
91	       5.2.2.  Search Phase
92	         5.2.2.1.  Resilience to inconsistent path information
93	       5.2.3.  Search Complete Phase
94	       5.2.4.  PROBE_BASE Phase
95	       5.2.5.  ERROR Phase
96	         5.2.5.1.  Robustness to inconsistent path
98	       5.2.6.  DISABLED Phase
99	     5.3.  State Machine
100	     5.4.  Search to Increase the PLPMTU
101	       5.4.1.  Probing for a larger PLPMTU
102	       5.4.2.  Selection of Probe Sizes
103	       5.4.3.  Resilience to inconsistent Path information
104	   6.  Specification of Protocol-Specific Methods
105	     6.1.  Application support for DPLPMTUD with UDP or UDP-Lite
106	       6.1.1.  Application Request
107	       6.1.2.  Application Response
108	       6.1.3.  Sending Application Probe Packets
109	       6.1.4.  Validating the Path
110	       6.1.5.  Handling of PTB Messages
111	     6.2.  DPLPMTUD with UDP Options
112	       6.2.1.  UDP Probe Request Option
113	       6.2.2.  UDP Probe Response Option
114	     6.3.  DPLPMTUD for SCTP
115	       6.3.1.  SCTP/IPv4 and SCTP/IPv6
116	         6.3.1.1.  Sending SCTP Probe Packets
117	         6.3.1.2.  Validating the Path with SCTP
118	         6.3.1.3.  PTB Message Handling by SCTP
119	       6.3.2.  DPLPMTUD for SCTP/UDP
120	         6.3.2.1.  Sending SCTP/UDP Probe Packets
121	         6.3.2.2.  Validating the Path with SCTP/UDP
122	         6.3.2.3.  Handling of PTB Messages by SCTP/UDP
123	       6.3.3.  DPLPMTUD for SCTP/DTLS
124	         6.3.3.1.  Sending SCTP/DTLS Probe Packets
125	         6.3.3.2.  Validating the Path with SCTP/DTLS
126	         6.3.3.3.  Handling of PTB Messages by SCTP/DTLS
127	     6.4.  DPLPMTUD for QUIC
128	       6.4.1.  Sending QUIC Probe Packets
129	       6.4.2.  Validating the Path with QUIC
130	       6.4.3.  Handling of PTB Messages by QUIC
131	   7.  Acknowledgements
132	   8.  IANA Considerations
133	   9.  Security Considerations
134	   10. References
135	     10.1.  Normative References
136	     10.2.  Informative References
137	   Appendix A.  Event-driven state changes
138	   Appendix B.  Revision Notes
139	   Authors' Addresses

141	1.  Introduction

143	   The IETF has specified datagram transport using UDP, SCTP, and DCCP,
144	   as well as protocols layered on top of these transports (e.g., SCTP/
145	   UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP
146	   network layer.  This document describes a robust method for Path MTU
147	   Discovery (PMTUD) that may be used with these transport protocols (or
148	   the applications that use their transport service) to discover an
149	   appropriate size of packet to use across an Internet path.

151	   This specification clarifies the PLPMTUD method for SCTP described in
152	   section 10.2 of [RFC4821] by specifying the procedure in Section 6.3
153	   of this document.

155	1.1.  Classical Path MTU Discovery

157	   Classical Path Maximum Transmission Unit Discovery (PMTUD) can be
158	   used with any transport that is able to process ICMP Packet Too Big
159	   (PTB) messages (e.g., [RFC1191] and [RFC8201]).  The term PTB message
160	   is applied to both IPv4 ICMP Unreachable messages (Type 3) that carry
161	   the error Fragmentation Needed (Type 3, Code 4) and ICMPv6 packet too
162	   big messages (Type 2).  When a sender receives a PTB message, it
163	   reduces the effective MTU to the value reported in the PTB message
164	   (in this document called the PTB_SIZE).  From time to time, the
165	   method increases the packet size in an attempt to discover an
166	   increase in the supported PMTU.  The packets sent with a size larger
167	   than the current effective PMTU are known as probe packets.

169	   Packets not intended as probe packets are either fragmented to the
170	   current effective PMTU, or an attempt to send a packet larger than
171	   the current effective PMTU fails with an error code.  Applications
172	   are sometimes provided with a primitive to let them read the maximum
173	   packet size, derived from the current effective PMTU.

175	   Classical PMTUD is subject to protocol failures.  One failure arises
176	   when traffic using a packet size larger than the actual PMTU is black
177	   holed (all datagrams sent with this size, or larger, are silently
178	   discarded without the sender receiving ICMP PTB messages).  This
179	   could arise when the PTB messages are not delivered back to the
180	   sender for some reason [RFC2923].  For example, ICMP messages are
181	   increasingly filtered by middleboxes (including firewalls) [RFC4890].
182	   A stateful firewall could be configured with a policy to block
183	   incoming ICMP messages, which would prevent reception of PTB messages
184	   to endpoints behind this firewall.  Other examples include cases
185	   where PTB messages are not correctly processed/generated by tunnel
186	   endpoints.

188	   Another failure could result if a node that is not on the network
189	   path sends a PTB message that attempts to force the sender to change
190	   the effective PMTU [RFC8201].  A sender can protect itself from
191	   reacting to such messages by utilising the quoted packet within a PTB
192	   message payload to validate that the received PTB message was
193	   generated in response to a packet that had actually originated from
194	   the sender.  However, there are situations where a sender would be
195	   unable to provide this validation.

197	   Examples where validation of the PTB message is not possible include:

199	   o  When the router issuing the ICMP message is acting on a tunneled
200	      packet, the ICMP message will be directed to the tunnel endpoint.
201	      This tunnel endpoint is responsible for forwarding the ICMP
202	      message and also processing the quoted packet within the payload
203	      field to remove the effect of the tunnel, and return a correctly
204	      formatted ICMP message to the sender.  Failure to do appropriate
205	      processing therefore results in black-holing.

207	   o  When a router issuing the ICMP message implements RFC 792
208	      [RFC0792], it is only required to include (quote) the first 64
209	      bits of the IP payload of the packet within the ICMP payload.
210	      This could be insufficient to perform the tunnel processing
211	      described in the previous bullet.  Even if the decapsulated
212	      message is processed by the tunnel endpoint, there could be
213	      insufficient bytes remaining for the sender to interpret the
214	      quoted transport information.  RFC 1812 [RFC1812] requires routers
215	      to return the full packet if possible.  This can result in black-
216	      holing when the path includes tunnels.

218	   o  When a router issuing the ICMP message quotes a packet with an
219	      encrypted transport, it may lack sufficient context to determine
220	      the original transport header.

222	   o  Even when the PTB message includes sufficient bytes of the quoted
223	      packet, the network layer could lack sufficient context to
224	      validate the ICMP message, because this depends on information
225	      about the active transport flows at an endpoint node (e.g., the
226	      socket/address pairs being used, and other protocol header
227	      information).
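
   The validation step discussed above can be illustrated with a minimal
   sketch.  It assumes an IPv4 PTB message whose quoted packet carries
   UDP, and it checks only that the quoted port pair belongs to a flow
   the sender actually has open.  The function names and the
   `active_flows` set are hypothetical and are not defined by this
   document; a real check would normally use additional information that
   an off-path attacker cannot easily guess.

```python
import struct
from typing import NamedTuple, Optional, Set, Tuple


class QuotedUdp(NamedTuple):
    src_port: int
    dst_port: int


def parse_quoted_ipv4_udp(quoted: bytes) -> Optional[QuotedUdp]:
    """Extract the UDP ports from the packet quoted in an ICMPv4 PTB.

    `quoted` starts at the quoted IPv4 header.  Returns None when the
    quote is too short to interpret or does not carry UDP.
    """
    if len(quoted) < 20:
        return None
    version = quoted[0] >> 4
    ihl = (quoted[0] & 0x0F) * 4      # IPv4 header length in bytes
    if version != 4 or ihl < 20:
        return None
    if quoted[9] != 17:               # IPv4 protocol field, 17 = UDP
        return None
    if len(quoted) < ihl + 4:         # both port fields must be present
        return None
    src_port, dst_port = struct.unpack_from("!HH", quoted, ihl)
    return QuotedUdp(src_port, dst_port)


def ptb_matches_active_flow(quoted: bytes,
                            active_flows: Set[Tuple[int, int]]) -> bool:
    """Accept a PTB only if its quoted packet matches a known flow."""
    udp = parse_quoted_ipv4_udp(quoted)
    return udp is not None and (udp.src_port, udp.dst_port) in active_flows
```

   A router that quotes only the first 64 bits of the IP payload still
   returns the complete 8-byte UDP header, so this particular check can
   succeed; the same quote would not be enough once tunnel headers or an
   encrypted transport header have to be interpreted.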

229	1.2.  Packetization Layer Path MTU Discovery

231	   The term Packetization Layer (PL) has been introduced to describe the
232	   layer that is responsible for placing data blocks into the payload of
233	   IP packets and selecting an appropriate Maximum Packet Size (MPS).
234	   This function is often performed by a transport protocol, but can
235	   also be performed by other encapsulation methods working above the
236	   transport layer.

238	   In contrast to PMTUD, Packetization Layer Path MTU Discovery
239	   (PLPMTUD) [RFC4821] does not rely upon reception and validation of
240	   PTB messages.  It is therefore more robust than Classical PMTUD.

242	   This has become the recommended approach for implementing PMTU
243	   discovery with TCP.

245	   It uses a general strategy where the PL sends probe packets to search
246	   for the largest size of unfragmented datagram that can be sent over a
247	   network path.  The probe packets are sent with a progressively larger
248	   packet size.  If a probe packet is successfully delivered (as
249	   determined by the PL), then the PLPMTU is raised to the size of the
250	   successful probe.  If no response is received to a probe packet, the
251	   method reduces the probe size.  This PLPMTU is used to set the
252	   application MPS.
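
   The general strategy in the preceding paragraph can be sketched as a
   simple search loop.  This is an illustration only and is not the
   search algorithm specified later in this document: the function name,
   the halving of the step size, and the `probe_ok` callback (which
   stands in for "the PL confirmed delivery of a probe of this size")
   are all assumptions made for the example.

```python
from typing import Callable


def search_plpmtu(base: int, max_size: int, step: int,
                  probe_ok: Callable[[int], bool]) -> int:
    """Search upwards from a confirmed size `base` for a larger PLPMTU.

    A confirmed probe raises the PLPMTU to the probed size.  A probe that
    is not confirmed leads to a smaller probe size on the next attempt.
    """
    plpmtu = base
    while step > 0:
        probe = min(plpmtu + step, max_size)
        if probe > plpmtu and probe_ok(probe):
            plpmtu = probe        # raise PLPMTU to the confirmed size
        else:
            step //= 2            # not confirmed: try a smaller increase
    return plpmtu
```

   For example, with a base of 1200 bytes, an upper bound of 1500 bytes,
   an initial step of 128 bytes and a path that delivers packets of up to
   1400 bytes, the loop converges on 1400.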

254	   PLPMTUD introduces flexibility in the implementation of PMTU
255	   discovery.  At one extreme, it can be configured to only perform PTB
256	   black hole detection and recovery to increase the robustness of
257	   Classical PMTUD, or at the other extreme, all PTB processing can be
258	   disabled and PLPMTUD can completely replace Classical PMTUD.

260	   PLPMTUD can also include additional consistency checks without
261	   increasing the risk of black-holing.  For instance, the information
262	   available at the PL, or higher layers, makes PTB validation more
263	   straightforward.

265	1.3.  Path MTU Discovery for Datagram Services

267	   Section 5 of this document presents a set of algorithms for datagram
268	   protocols to discover the largest size of unfragmented datagram that
269	   can be sent over a network path.  The method described relies on
270	   features of the PL described in Section 3 and applies to transport
271	   protocols operating over IPv4 and IPv6.  It does not require
272	   cooperation from the lower layers, although it can utilise ICMP PTB
273	   messages when these received messages are made available to the PL.

275	   The UDP Usage Guidelines [RFC8085] state "an application SHOULD
276	   either use the Path MTU information provided by the IP layer or
277	   implement Path MTU Discovery (PMTUD)", but do not provide a
278	   mechanism for discovering the largest size of unfragmented datagram
279	   that can be used on a network path.  Prior to this document, PLPMTUD
280	   had not been specified for UDP.

282	   Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the
283	   Stream Control Transmission Protocol (SCTP).  SCTP utilises heartbeat
284	   messages as probe packets, but RFC4821 does not provide a complete
285	   specification.  The present document provides the details to complete
286	   that specification.

288	   The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires
289	   implementations to support Classical PMTUD and states that a DCCP
290	   sender "MUST maintain the MPS allowed for each active DCCP session".
291	   It also defines the current congestion control MPS (CCMPS) supported
292	   by a network path.  This recommends use of PMTUD, and suggests use of
293	   control packets (DCCP-Sync) as path probe packets, because they do
294	   not risk application data loss.  The method defined in this
295	   specification could be used with DCCP.

297	   Section 6 specifies the method for a set of transports, and provides
298	   information to enable the implementation of PLPMTUD with other
299	   datagram transports and applications that use datagram transports.

301	2.  Terminology

303	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
304	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
305	   document are to be interpreted as described in [RFC2119].

307	   Other terminology is copied directly from [RFC4821] and from the
308	   definitions in [RFC1122].

310	   Actual PMTU:  The Actual PMTU is the PMTU of a network path between a
311	      sender PL and a destination PL, which the DPLPMTUD algorithm seeks
312	      to determine.

314	   Black Holed:  Packets are Black holed when the sender is unaware that
315	      packets are not delivered to the destination endpoint (e.g., when
316	      the sender transmits packets of a particular size with a
317	      previously known effective PMTU and they are silently discarded by
318	      the network, but the sender is not made aware, by ICMP messages,
319	      of a change to the path that resulted in a smaller PLPMTU).

321	   Classical Path MTU Discovery:  Classical PMTUD is a process described
322	      in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to
323	      learn the largest size of unfragmented datagram that can be used
324	      across a network path.

326	   Datagram:  A datagram is a transport-layer protocol data unit,
327	      transmitted in the payload of an IP packet.

329	   Effective PMTU:  The Effective PMTU is the current estimated value
330	      for the PMTU that is used by PMTUD.  This is equivalent to the
331	      PLPMTU derived by PLPMTUD.

333	   EMTU_S:  The Effective MTU for sending (EMTU_S) is defined in
334	      [RFC1122] as "the maximum IP datagram size that may be sent, for a
335	      particular combination of IP source and destination addresses...".

337	   EMTU_R:  The Effective MTU for receiving (EMTU_R) is designated in
338	      [RFC1122] as the largest datagram size that can be reassembled
339	      ("Effective MTU to receive").

341	   Link:  A Link is a communication facility or medium over which nodes
342	      can communicate at the link layer, i.e., a layer below the IP
343	      layer.  Examples are Ethernet LANs and Internet (or higher) layer
344	      tunnels.

346	   Link MTU:  The Link Maximum Transmission Unit (MTU) is the size in
347	      bytes of the largest IP packet, including the IP header and
348	      payload, that can be transmitted over a link.  Note that this
349	      could more properly be called the IP MTU, to be consistent with
350	      how other standards organizations use the acronym.  This includes
351	      the IP header, but excludes link layer headers and other framing
352	      that is not part of IP or the IP payload.  Other standards
353	      organizations generally define the link MTU to include the link
354	      layer headers.

356	   MPS:  The Maximum Packet Size (MPS) is the largest size of
357	      application data block that can be sent across a network path.  In
358	      DPLPMTUD this quantity is derived from the PLPMTU by taking into
359	      consideration the size of the lower protocol layer headers.

361	   MIN_PMTU:  The MIN_PMTU is the smallest size of PLPMTU that DPLPMTUD
362	      will attempt to use.

364	   Packet:  A Packet is the IP header plus the IP payload.

366	   Packetization Layer (PL):  The Packetization Layer (PL) is the layer
367	      of the network stack that places data into packets and performs
368	      transport protocol functions.

370	   Path:  The Path is the set of links and routers traversed by a packet
371	      between a source node and a destination node by a particular flow.

373	   Path MTU (PMTU):  The Path MTU (PMTU) is the minimum of the Link MTU
374	      of all the links forming a network path between a source node and
375	      a destination node.

377	   PTB_SIZE:  The PTB_SIZE is a value reported in a validated PTB
378	      message that indicates the next-hop link MTU of a router along
379	      the path.

381	   PLPMTU:  The Packetization Layer PMTU is an estimate of the actual
382	      PMTU provided by the DPLPMTUD algorithm.

384	   PLPMTUD:  Packetization Layer Path MTU Discovery (PLPMTUD), the
385	      method described in this document for datagram PLs, which is an
386	      extension to Classical PMTU Discovery.

388	   Probe packet:  A probe packet is a datagram sent with a purposely
389	      chosen size (typically the current PLPMTU or larger) to detect if
390	      packets of this size can be successfully sent end-to-end across
391	      the network path.
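
   The relationship between the Link MTU, the PMTU and the MPS defined
   above can be shown with a small worked example.  It assumes UDP over
   IP with no IPv4 options and no IPv6 extension headers; the helper
   names are hypothetical.

```python
from typing import Iterable

IPV4_HEADER = 20   # bytes, assuming no IPv4 options
IPV6_HEADER = 40   # bytes, assuming no extension headers
UDP_HEADER = 8     # bytes


def path_mtu(link_mtus: Iterable[int]) -> int:
    """PMTU: the minimum Link MTU of all links forming the path."""
    return min(link_mtus)


def mps_for_udp(plpmtu: int, ipv6: bool = False) -> int:
    """MPS: the PLPMTU minus the lower-layer protocol headers."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return plpmtu - ip_header - UDP_HEADER
```

   For a path whose links have MTUs of 1500, 1480 and 9000 bytes, the
   PMTU is 1480 bytes.  A PLPMTU of 1500 bytes gives a UDP MPS of 1472
   bytes over IPv4 and 1452 bytes over IPv6 under these assumptions.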

393	3.  Features Required to Provide Datagram PLPMTUD

395	   TCP PLPMTUD has been defined using standard TCP protocol mechanisms.
396	   All of the requirements in [RFC4821] also apply to the use of the
397	   technique with a datagram PL.  Unlike TCP, some datagram PLs require
398	   additional mechanisms to implement PLPMTUD.

400	   There are eight requirements for performing the datagram PLPMTUD
401	   method described in this specification:

403	   1.  PMTU parameters: A DPLPMTUD sender is RECOMMENDED to provide
404	       information about the maximum size of packet that can be
405	       transmitted by the sender on the local link (the local Link MTU).
406	       It MAY utilize similar information about the receiver when this
407	       is supplied (note this could be less than EMTU_R).  This avoids
408	       implementations trying to send probe packets that can not be
409	       transmitted by the local link.  Too high of a value could reduce
410	       the efficiency of the search algorithm.  Some applications also
411	       have a maximum transport protocol data unit (PDU) size, in which
412	       case there is no benefit from probing for a size larger than this
413	       (unless a transport allows multiplexing multiple applications
414	       PDUs into the same datagram).

416	   2.  PLPMTU: A datagram application is REQUIRED to be able to choose
417	       the size of datagrams sent to the network, up to the PLPMTU, or a
418	       smaller value (such as the MPS) derived from this.  This value is
419	       managed by the DPLPMTUD method.  The PLPMTU (specified as the
420	       effective PMTU in Section 1 of [RFC1191]) is equivalent to the
421	       EMTU_S (specified in [RFC1122]).

423	   3.  Probe packets: On request, a DPLPMTUD sender is REQUIRED to be
424	       able to transmit a packet larger than the PLPMTU.  This is used
425	       to send a probe packet.  In IPv4, a probe packet MUST be sent
426	       with the Don't Fragment (DF) bit set in the IP header, and
427	       without network layer endpoint fragmentation.  In IPv6, a probe
428	       packet is always sent without source fragmentation (as specified
429	       in section 5.4 of [RFC8201]).

431	   4.  Processing PTB messages: A DPLPMTUD sender MAY optionally utilize
432	       PTB messages received from the network layer to help identify
433	       when a network path does not support the current size of probe
434	       packet.  Any received PTB message MUST be validated before it is
435	       used to update the PLPMTU discovery information [RFC8201].  This
436	       validation confirms that the PTB message was sent in response to
437	       a packet originated by the sender, and needs to be performed
438	       before the PLPMTU discovery method reacts to the PTB message.
439	       When the PTB_SIZE is indicated in the PTB message, this MAY be
440	       used by DPLPMTUD to reduce the probe size but MUST NOT be used to
441	       increase the PLPMTU ([RFC8201]).  This validation SHOULD utilise
442	       information that can not be simply determined by an off-path
443	       attacker, for example, by checking the value of a protocol header
444	       field known only to the two PL endpoints.  (Some datagram
445	       applications use well-known source and destination ports and
446	       therefore this check needs to rely on other information.)

448	   5.  Reception feedback: The destination PL endpoint is REQUIRED to
449	       provide a feedback method that indicates to the DPLPMTUD sender
450	       when a probe packet has been received by the destination PL
451	       endpoint.  The mechanism needs to be robust to the possibility
452	       that packets could be significantly delayed along a network path.
453	       The local PL endpoint at the sending node is REQUIRED to pass
454	       this feedback to the sender-side DPLPMTUD method.

456	   6.  Probing and congestion control: The isolated loss of a probe
457	       packet SHOULD NOT be treated as an indication of congestion and
458	       its loss SHOULD NOT directly trigger a congestion control
459	       reaction [RFC4821].

461	   7.  Probe loss recovery: If the data block carried by a probe packet
462	       needs to be sent reliably, the PL (or layers above) are REQUIRED
463	       to arrange any retransmission/repair of any resulting loss.  This
464	       method is REQUIRED to be robust in the case where probe packets
465	       are lost due to other reasons (including link transmission error,
466	       congestion).  The DPLPMTUD sender treats isolated loss of a probe
467	       packet (with or without a PTB message) as a potential indication
468	       of a PMTU limit for the path, but not as an indication of
469	       congestion (see requirement 6).

471	   8.  Shared PLPMTU state: The PLPMTU value could also be stored with
472	       the corresponding entry in the destination cache and used by
473	       other PL instances.  The specification of PLPMTUD [RFC4821]
474	       states: "If PLPMTUD updates the MTU for a particular path, all
475	       Packetization Layer sessions that share the path representation
476	       (as described in Section 5.2 of [RFC4821]) SHOULD be notified to
477	       make use of the new MTU and make the required congestion control
478	       adjustments".  Such methods MUST be robust to the wide variety of
479	       underlying network forwarding behaviours.  PLPMTU adjustments
480	       based on shared PLPMTU values should be incorporated in the
481	       search algorithms.  Section 5.2 of [RFC8201] provides guidance on
482	       the caching of PMTU information and also the relation to IPv6
483	       flow labels.
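
   The PTB handling rule in requirement 4 can be sketched as follows.
   The sketch is a simplification: the function name and parameters are
   hypothetical, validation is assumed to have been done elsewhere, and
   the case where a validated PTB_SIZE is not larger than the current
   PLPMTU is deliberately left untouched here.

```python
def next_probe_size_after_ptb(probe_size: int, plpmtu: int,
                              ptb_size: int, validated: bool) -> int:
    """Return the probe size to use after receiving a PTB message.

    Only a validated PTB message is considered.  The PTB_SIZE is used
    solely to reduce the probe size; the PLPMTU itself is never
    increased (or otherwise changed) by this function.
    """
    if not validated:
        return probe_size             # ignore unvalidated PTB messages
    if plpmtu < ptb_size < probe_size:
        return ptb_size               # reduce the next probe to PTB_SIZE
    return probe_size                 # a larger PTB_SIZE changes nothing
```

   Keeping the PLPMTU out of the return value makes the "MUST NOT
   increase" rule structural in this sketch: the only way the PLPMTU can
   grow is through a confirmed probe.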

485	   In addition, the following principles are stated for design of a
486	   DPLPMTUD method:

488	   o  MPS: A method is REQUIRED to signal an appropriate MPS to the
489	      higher layer using the PL.  The value of the MPS can change
490	      following a change to the path.  It is RECOMMENDED that methods
491	      avoid forcing an application to use an arbitrarily small MPS
492	      (PLPMTU) for transmission while the method is searching for the
493	      currently supported PLPMTU.  Datagram PLs do not necessarily
494	      support fragmentation of PDUs larger than the PLPMTU.  A reduced
495	      MPS can adversely impact the performance of a datagram
496	      application.

498	   o  Path validation: It is RECOMMENDED that methods are robust to path
499	      changes that could have occurred since the path characteristics
500	      were last confirmed, and to the possibility of inconsistent path
501	      information being received.

503	   o  Datagram reordering: A method is REQUIRED to be robust to the
504	      possibility that a flow encounters reordering, or the traffic
505	      (including probe packets) is divided over more than one network
506	      path.

508	   o  When to probe: It is RECOMMENDED that methods determine whether
509	      the path capacity has increased since the path was last measured.
510	      This determines when the path should again be probed.

512	4.  DPLPMTUD Mechanisms

514	   This section lists the protocol mechanisms used in this
515	   specification.

517	4.1.  PLPMTU Probe Packets

519	   The DPLPMTUD method relies upon the PL sender being able to generate
520	   probe packets with a specific size.  TCP is able to generate these
521	   probe packets by choosing to appropriately segment data being sent
522	   [RFC4821].  In contrast, a datagram PL that needs to construct a
523	   probe packet has to either request an application to send a data
524	   block that is larger than that generated by an application, or to
525	   utilise padding functions to extend a datagram beyond the size of the
526	   application data block.  Protocols that permit exchange of control
527	   messages (without an application data block) could alternatively
528	   prefer to generate a probe packet by extending a control message with
529	   padding data.

531	   A receiver needs to be able to distinguish an in-band data block from
532	   any added padding.  This is needed to ensure that any added padding
533	   is not passed on to an application at the receiver.

535	   This results in three possible ways that a sender can create a probe
536	   packet, listed in order of preference:

538	   Probing using padding data:  A probe packet that contains only
539	      control information together with any padding needed to inflate
540	      it to the size required for the probe packet.  Since
541	      these probe packets do not carry an application-supplied data
542	      block, they do not typically require retransmission, although they
543	      do still consume network capacity and incur endpoint processing.

545	   Probing using application data and padding data:  A probe packet that
546	      contains a data block supplied by an application that is combined
547	      with padding to inflate the length of the datagram to the size
548	      required for the probe packet.  If the application/transport needs
549	      protection from the loss of this probe packet, the application/
550	      transport could perform transport-layer retransmission/repair of
551	      the data block (e.g., by retransmission after loss is detected or
552	      by duplicating the data block in a datagram without the padding
553	      data).

555	   Probing using application data:  A probe packet that contains a data
556	      block supplied by an application that matches the size required
557	      for the probe packet.  This method requests the application to
558	      issue a data block of the desired probe size.  If the application/
559	      transport needs protection from the loss of an unsuccessful probe
560	      packet, the application/transport needs then to perform transport-
561	      layer retransmission/repair of the data block (e.g., by
562	      retransmission after loss is detected).

564	   A PL that uses a probe packet carrying an application data block
565	   could need to retransmit this application data block if the probe
566	   fails.  This could require the PL to re-fragment the data block to a
567	   smaller packet size that is expected to traverse the end-to-end path
568	   (which could utilise endpoint network-layer or PL fragmentation when
569	   these are available).

571	   DPLPMTUD MAY choose to use only one of these methods to simplify the
572	   implementation.

574	   Probe messages sent by a PL MUST contain enough information to
575	   uniquely identify the probe within Maximum Segment Lifetime, while
576	   being robust to reordering and replay of probe response and ICMP PTB
577	   messages.
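
The "probing using padding data" option above can be sketched as
follows.  This is a minimal illustration, not part of the
specification: the control layout (a 2-byte length followed by a
random token), the names build_padding_probe and parse_probe_token,
and the 8-byte token length are all assumptions made for the example.
The length field lets the receiver separate control information from
padding, and the token lets the sender match an acknowledgment to a
specific probe.

```python
import os
import struct
from typing import Tuple

PROBE_TOKEN_LEN = 8  # hypothetical token length, not defined by the draft


def build_padding_probe(probe_size: int, overhead: int) -> Tuple[bytes, bytes]:
    """Build a padding-only probe payload.

    probe_size is the desired packet size and overhead is the number of
    bytes added below the PL (e.g., IP and UDP headers).  Returns
    (token, payload) with overhead + len(payload) == probe_size.
    """
    payload_len = probe_size - overhead
    control_len = 2 + PROBE_TOKEN_LEN
    if payload_len < control_len:
        raise ValueError("probe size too small to carry control information")
    token = os.urandom(PROBE_TOKEN_LEN)
    # The 2-byte length tells the receiver where the control part ends.
    control = struct.pack("!H", PROBE_TOKEN_LEN) + token
    padding = bytes(payload_len - control_len)  # zero-filled padding
    return token, control + padding


def parse_probe_token(payload: bytes) -> bytes:
    """Return the probe token; the padding is never passed to the application."""
    (token_len,) = struct.unpack_from("!H", payload, 0)
    return payload[2:2 + token_len]
```

A sender would retain the token until the probe is acknowledged or the
PROBE_TIMER expires, and ignore acknowledgments carrying an unknown
token.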

579	4.2.  Confirmation of Probed Packet Size

581	   The PL needs a method to determine (confirm) when probe packets have
582	   been successfully received end-to-end across a network path.

584	   Transport protocols can include end-to-end methods that detect and
585	   report reception of specific datagrams that they send (e.g., DCCP and
586	   SCTP provide keep-alive/heartbeat features).  When supported, this
587	   mechanism SHOULD also be used by DPLPMTUD to acknowledge reception of
588	   a probe packet.

590	   A PL that does not acknowledge data reception (e.g., UDP and UDP-
591	   Lite) is unable itself to detect when the packets that it sends are
592	   discarded because their size is greater than the actual PMTU.  These
593	   PLs need to either rely on an application protocol to detect this
594	   loss, or make use of an additional transport method such as UDP-
595	   Options [I-D.ietf-tsvwg-udp-options].

597	   Section 5 specifies this function for a set of IETF-specified
598	   protocols.

600	4.3.  Detection of Black Holes

602	   A PL sender needs to reduce the PLPMTU when it discovers the actual
603	   PMTU supported by a network path is less than the PLPMTU (i.e., to
604	   detect that traffic is being black holed).  This can be triggered
605	   when a validated PTB message is received, or by another event that
606	   indicates the network path no longer sustains the current packet
607	   size, such as a loss report from the PL or repeated lack of response
608	   to probe packets sent to confirm the PLPMTU.  Detection is followed
609	   by a reduction of the PLPMTU.

611	   Black Hole detection is performed by periodically sending packet
612	   probes of size PLPMTU to verify that a network path still supports
613	   the last acknowledged PLPMTU size.  There are two ways a DPLPMTUD
614	   sender can detect that the current PLPMTU is not sustained by the
615	   path (i.e., to detect a black hole):

617	   o  A PL can rely upon a mechanism implemented within the PL protocol
618	      to detect excessive loss of data sent with a specific packet size
619	      and then conclude that this excessive loss could be a result of an
620	      invalid PMTU (as in PLPMTUD for TCP [RFC4821]).

622	   o  A PL can use the probing mechanism to send confirmation probe
623	      packets of the size of the current PLPMTU and a timer to track
624	      whether acknowledgments are received (e.g., the number of probe
625	      packets sent without receiving an acknowledgement, PROBE_COUNT,
626	      becomes greater than the MAX_PROBES).  These messages need to be
627	      generated periodically (e.g., using the CONFIRMATION_TIMER, see
628	      Section 5.1.1), and should be suppressed when the PL is not
629	      actively sending data.  Successive loss of probes is an indication
630	      that the current path no longer supports the PLPMTU.
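
The second approach (confirmation probes counted against MAX_PROBES)
can be sketched as below.  The class and method names are
hypothetical.  The text above says PROBE_COUNT "becomes greater than"
MAX_PROBES, whereas Figure 4 uses PROBE_COUNT = MAX_PROBES; this
sketch assumes the latter and declares a black hole when the count
reaches MAX_PROBES.

```python
MAX_PROBES = 10  # default value given in Section 5.1.2


class BlackHoleDetector:
    """Counts consecutive unacknowledged confirmation probes."""

    def __init__(self, max_probes: int = MAX_PROBES) -> None:
        self.max_probes = max_probes
        self.probe_count = 0  # PROBE_COUNT for the current PLPMTU

    def on_probe_acked(self) -> None:
        # The path still supports the PLPMTU; start counting afresh.
        self.probe_count = 0

    def on_probe_timeout(self) -> bool:
        """Record one lost probe; return True when a black hole is declared."""
        self.probe_count += 1
        return self.probe_count >= self.max_probes
```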

632	   When the method detects the current PLPMTU is not supported (a black
633	   hole is found), DPLPMTUD sets a lower MPS.  The PL then confirms that
634	   the updated PLPMTU can be successfully used across the path.  This
635	   can need the PL to send a probe packet with a size less than the size
636	   of the data block generated by an application.  In this case, the PL
637	   could provide a way to fragment a datagram at the PL, or could
638	   instead utilise a control packet with padding.

640	4.4.  Response to PTB Messages

642	   This method requires the DPLPMTUD sender to validate any received PTB
643	   message before using the PTB information.  The response to a PTB
644	   message depends on the PTB_SIZE indicated in the PTB message, the
645	   state of the PLPMTUD state machine, and the IP protocol being used.

647	   Section 4.4.1 first describes validation for both IPv4 ICMP
648	   Unreachable messages (type 3) and ICMPv6 packet too big messages,
649	   both of which are referred to as PTB messages in this document.

651	4.4.1.  Validation of PTB Messages

653	   A PL that receives a PTB message from a router or middlebox MUST
654	   perform ICMP validation as specified in Section 5.2 of [RFC8085].
655	   This needs the PL to check the protocol information in the quoted
656	   payload to validate the message originated from the sending node.
657	   This check includes determining the appropriate port and IP
658	   information - necessary for the PTB message to be passed to the PL.
659	   In addition, the PL SHOULD validate information from the ICMP payload
660	   to determine that the quoted packet was sent by the PL.  These checks
661	   are intended to provide protection from packets that originate from a
662	   node that is not on the network path.  PTB messages are discarded if
663	   they fail to pass these checks, or where there is insufficient ICMP
664	   payload to perform the checks.

666	   PTB messages that have been validated can be utilised by the DPLPMTUD
667	   algorithm.  A method that utilises these PTB messages can improve the
668	   speed at which the algorithm detects an appropriate PLPMTU,
669	   compared to one that relies solely on probing.

671	4.4.2.  Use of PTB Messages

673	   A set of checks are intended to provide protection from a router that
674	   reports an unexpected PTB_SIZE.  The PL needs to check that the
675	   indicated PTB_SIZE is less than the size used by probe packets and
676	   larger than the minimum size accepted.

678	   This section provides an informative summary of how PTB messages can
679	   be utilised.

681	   Validating PTB Messages:

683	      *  A simple implementation is permitted to ignore received PTB
684	         messages and therefore the PLPMTU is not updated when a PTB
685	         message is received.

687	      *  An implementation that supports PTB messages MUST validate
688	         messages before they are processed.

690	   MIN_PMTU < PTB_SIZE < BASE_PMTU

692	      *  A robust PL MAY enter the PROBE_ERROR state for an IPv4 path
693	         when the PTB_SIZE reported in the PTB message >= 576B and when
694	         this is less than the BASE_PMTU.

696	      *  A robust PL MAY enter the PROBE_ERROR state for an IPv6 path
697	         when the PTB_SIZE reported in the PTB message >= 1280B and when
698	         this is less than the BASE_PMTU.

700	   PTB_SIZE = PLPMTU

702	      *  Transition to SEARCH_COMPLETE.

704	   PTB_SIZE > PROBED_SIZE

706	      *  A PTB_SIZE > PROBED_SIZE is an inconsistent network signal.
707	         These PTB messages ought to be discarded without further
708	         processing (the PLPMTU is not updated).

710	      *  The information could be utilised as an input to trigger
711	         enabling a resilience mode.

713	   BASE_PMTU <= PTB_SIZE < PLPMTU

715	      *  Black hole detection is triggered and the PLPMTU ought to be
716	         set to BASE_PMTU.

718	      *  The PL could use PTB_SIZE reported in the PTB message to
719	         initialise a search algorithm.

721	   PLPMTU < PTB_SIZE < PROBED_SIZE

723	      *  The PLPMTU continues to be valid, but the last PROBED_SIZE
724	         searched was larger than the actual PMTU.

726	      *  The PLPMTU is not updated.

728	      *  The PL can use the reported PTB_SIZE from the PTB message as
729	         the next search point when it resumes the search algorithm.
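
The informative summary above can be read as a classification of an
already-validated PTB_SIZE against the current PLPMTU and
PROBED_SIZE.  The following sketch is one possible reading; the
function and enum names are hypothetical, and values that fall outside
the ranges listed above (for example PTB_SIZE <= MIN_PMTU or PTB_SIZE
equal to PROBED_SIZE) are reported as UNSPECIFIED rather than guessed
at.

```python
from enum import Enum, auto


class PtbAction(Enum):
    DISCARD_INCONSISTENT = auto()  # PTB_SIZE > PROBED_SIZE
    SEARCH_COMPLETE = auto()       # PTB_SIZE = PLPMTU
    NEXT_SEARCH_POINT = auto()     # PLPMTU < PTB_SIZE < PROBED_SIZE
    BLACK_HOLE = auto()            # BASE_PMTU <= PTB_SIZE < PLPMTU
    PROBE_ERROR = auto()           # MIN_PMTU < PTB_SIZE < BASE_PMTU (MAY)
    UNSPECIFIED = auto()           # not covered by the summary


def classify_ptb(ptb_size: int, plpmtu: int, probed_size: int,
                 min_pmtu: int, base_pmtu: int) -> PtbAction:
    """Map a validated PTB_SIZE to the action suggested by the summary."""
    if ptb_size > probed_size:
        return PtbAction.DISCARD_INCONSISTENT
    if ptb_size == plpmtu:
        return PtbAction.SEARCH_COMPLETE
    if plpmtu < ptb_size < probed_size:
        return PtbAction.NEXT_SEARCH_POINT
    if base_pmtu <= ptb_size < plpmtu:
        return PtbAction.BLACK_HOLE
    if min_pmtu < ptb_size < base_pmtu:
        return PtbAction.PROBE_ERROR
    return PtbAction.UNSPECIFIED
```

For example, with PLPMTU = 1400, PROBED_SIZE = 1450 and BASE_PMTU =
1200, a PTB_SIZE of 1420 leaves the PLPMTU unchanged and supplies the
next search point, while a PTB_SIZE of 1300 triggers black hole
detection.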

731	5.  Datagram Packetization Layer PMTUD

733	   This section specifies Datagram PLPMTUD (DPLPMTUD).  The method can
734	   be introduced at various points in the IP protocol stack to discover
735	   the PLPMTU so that an application can utilise an appropriate MPS for
736	   the current network path.

738	     +----------------------+
739	     |         APP*         |
740	     +-+-------+----+---+---+
741	       |       |    |   |
742	   +---+--+ +--+--+ | +-+---+
743	   | QUIC*| |UDPO*| | |SCTP*|
744	   +---+--+ +--+--+ | ++--+-+
745	       |       |    |  |  |
746	       +-------+-+  |  |  |
747	                 |  |  |  |
748	                ++-+--++  |
749	                | UDP  |  |
750	                +---+--+  |
751	                    |     |
752	     +--------------+-----+-+
753	     |  Network Interface   |
754	     +----------------------+

756	           Figure 1: Examples where DPLPMTUD can be implemented

758	   The central idea of DPLPMTUD is probing by a sender.  Probe packets
759	   are sent to find the maximum size of user message that is completely
760	   transferred across the network path from the sender to the
761	   destination.

763	   This section identifies the components needed for implementation, the
764	   phases of operation, the state machine and search algorithm.

766	5.1.  DPLPMTUD Components

768	   This section describes components of DPLPMTUD.

770	5.1.1.  Timers

772	   The method utilises three timers:

774	   PROBE_TIMER:  The PROBE_TIMER is configured to expire after a period
775	      longer than the maximum time to receive an acknowledgment to a
776	      probe packet.  This value MUST be larger than 1 second, and SHOULD
777	      be larger than 15 seconds.  Guidance on selection of the timer
778	      value is provided in Section 3.1.1 of the UDP Usage Guidelines
779	      [RFC8085].

781	      If the PL has a path Round Trip Time (RTT) estimate and timely
782	      acknowledgements the PROBE_TIMER can be derived from the PL RTT
783	      estimate.

785	   PMTU_RAISE_TIMER:  The PMTU_RAISE_TIMER is configured to the period a
786	      sender will continue to use the current PLPMTU, after which it re-
787	      enters the Search phase.  This timer has a period of 600 secs, as
788	      recommended by PLPMTUD [RFC4821].

790	      DPLPMTUD SHOULD inhibit sending probe packets when no application
791	      data has been sent since the previous probe packet.

793	   CONFIRMATION_TIMER:  The CONFIRMATION_TIMER is configured to the
794	      period a PL sender waits before confirming the current PLPMTU is
795	      still supported.  This is less than the PMTU_RAISE_TIMER and used
796	      to decrease the PLPMTU (e.g., when a black hole is encountered).
797	      Confirmation needs to be frequent enough when data is flowing that
798	      the sending PL does not black hole extensive amounts of traffic.
799	      Guidance on selection of the timer value is provided in Section
800	      3.1.1 of the UDP Usage Guidelines [RFC8085].

802	      DPLPMTUD SHOULD inhibit sending probe packets when no application
803	      data has been sent since the previous probe packet.

805	   An implementation could implement the various timers using a single
806	   timer process.

808	5.1.2.  Constants

810	   The following constants are defined:

812	   MAX_PROBES:  MAX_PROBES is the maximum value of the PROBE_COUNT.
813	      The default value of MAX_PROBES is 10.

815	   MIN_PMTU:  MIN_PMTU is the smallest allowed probe packet size.  For
816	      IPv6, this value is 1280 bytes, as specified in [RFC2460].  For
817	      IPv4, the minimum value is 68 bytes.  (An IPv4 router is required
818	      to be able to forward a datagram of 68 octets without further
819	      fragmentation.  This is the combined size of an IPv4 header and
820	      the minimum fragment size of 8 octets.  In addition, receivers are
821	      required to be able to reassemble fragmented datagrams at least up
822	      to 576B, as stated in section 3.3.3 of [RFC1122].)

824	   MAX_PMTU:  The MAX_PMTU is the largest size of PLPMTU.  This has to
825	      be less than or equal to the minimum of the local MTU of the
826	      outgoing interface and the destination PMTU for receiving.  An
827	      application or PL MAY reduce the MAX_PMTU when there is no need to
828	      send packets larger than a specific size.

830	   BASE_PMTU:  The BASE_PMTU is a configured size expected to work for
831	      most paths.  The size is equal to or larger than the MIN_PMTU and
832	      smaller than the MAX_PMTU.  In the case of IPv6, this value is
833	      1280 bytes [RFC2460].  When using IPv4, a size of 1200 bytes is
834	      RECOMMENDED.
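
The relationships between these constants can be captured in a small
configuration object.  This is a sketch only: the class name, the
helper constants_for_path and its parameters are assumptions for
illustration.  The numeric values are those stated above, and the
check MIN_PMTU <= BASE_PMTU < MAX_PMTU follows the BASE_PMTU
definition literally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DplpmtudConstants:
    min_pmtu: int
    base_pmtu: int
    max_pmtu: int
    max_probes: int = 10  # default MAX_PROBES

    def __post_init__(self) -> None:
        # BASE_PMTU is >= MIN_PMTU and smaller than MAX_PMTU.
        if not (self.min_pmtu <= self.base_pmtu < self.max_pmtu):
            raise ValueError("need MIN_PMTU <= BASE_PMTU < MAX_PMTU")
        if self.max_probes < 1:
            raise ValueError("MAX_PROBES must be at least 1")


def constants_for_path(ip_version: int, local_mtu: int,
                       dest_pmtu: int) -> DplpmtudConstants:
    """Pick constants for IPv4 or IPv6 using the values in this section."""
    # MAX_PMTU is bounded by the local interface MTU and by the
    # destination PMTU for receiving.
    max_pmtu = min(local_mtu, dest_pmtu)
    if ip_version == 6:
        return DplpmtudConstants(min_pmtu=1280, base_pmtu=1280,
                                 max_pmtu=max_pmtu)
    if ip_version == 4:
        return DplpmtudConstants(min_pmtu=68, base_pmtu=1200,
                                 max_pmtu=max_pmtu)
    raise ValueError("ip_version must be 4 or 6")
```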

836	5.1.3.  Variables

838	   This method utilises a set of variables:

840	   PROBED_SIZE:  The PROBED_SIZE is the size of the current probe
841	      packet.  This is a tentative value for the PLPMTU, which is
842	      awaiting confirmation by an acknowledgment.

844	   PROBE_COUNT:  The PROBE_COUNT is a count of the number of
845	      unsuccessful probe packets that have been sent with a size of
846	      PROBED_SIZE.  The value is initialised to zero when a particular
847	      size of PROBED_SIZE is first attempted.

849	   The figure below illustrates the relationship between the packet size
850	   constants and variables, in this case when the DPLPMTUD algorithm
851	   performs path probing to increase the size of the PLPMTU.  The MPS is
852	   less than the PLPMTU.  A probe packet has been sent of size
853	   PROBED_SIZE.  When this is acknowledged, the PLPMTU will be raised to
854	   PROBED_SIZE allowing the PROBED_SIZE to be increased towards the
855	   actual PMTU.

857	        MIN_PMTU                                             MAX_PMTU
858	          <------------------------------------------------------>
859	                         |       |    |     |           |
860	                         V       |    |     |           V
861	                     BASE_PMTU   V    |     V     Actual PMTU
862	                                MPS   |  PROBED_SIZE
863	                                      V
864	                                    PLPMTU

866	          Figure 2: Relationships between probe and packet sizes

868	5.2.  DPLPMTUD Phases

870	   The Datagram PLPMTUD algorithm moves through several phases of
871	   operation.

873	   An implementation that only reduces the PLPMTU to a suitable size
874	   would be sufficient to ensure reliable operation, but can be very
875	   inefficient when the actual PMTU changes or when the method (for
876	   whatever reason) makes a suboptimal choice for the PLPMTU.

878	   A full implementation of DPLPMTUD provides an algorithm enabling the
879	   DPLPMTUD sender to increase the PLPMTU following a change in the
880	   characteristics of the path, such as when a link is reconfigured with
881	   a larger MTU, or when there is a change in the set of links traversed
882	   by an end-to-end flow (e.g., after a routing or path fail-over
883	   decision).

885	   Black hole detection (see Section 4.3) and PTB processing (see
886	   Section 4.4) proceed in parallel with these phases of operation.

888	                        +-------------------+
889	                        | Path Confirmation +--       Connectivity
890	                        +--------+----------+  \-----   or BASE_PMTU
891	                                 |     /\          \/ Confirmation Fails
892	            Connectivity and     |     |         +-------+
893	             BASE_PMTU confirmed |      ---------+ Error |
894	                                 |               +-------+
895	                                 |   CONFIRMATION_TIMER
896	                                 |        Fires
897	                                 \/
898	+----------------+          +--------------+
899	| Search Complete|<---------+   Search     |
900	+----------------+          +--------------+
901	                Search Algorithm
902	                   Completes

904	                         Figure 3: DPLPMTUD Phases

906	   Path Confirmation

908	      *  Connectivity is confirmed.

910	      *  DPLPMTUD confirms the BASE_PMTU is supported across the network
911	         path.

913	      *  DPLPMTUD then enters the search phase.

915	   Search

917	      *  DPLPMTUD performs probing to increase the PLPMTU.

919	      *  DPLPMTUD then enters the search complete or an error phase.

921	   Search Complete

923	      *  DPLPMTUD has found a suitable PLPMTU that is supported across
924	         the network path.

926	      *  Black hole detection will confirm this PLPMTU continues to be
927	         supported.

929	      *  On a longer time-frame, DPLPMTUD will re-enter the search phase
930	         to discover if the PLPMTU can be raised.

932	   Error

934	      *  Inconsistent or invalid network signals cause DPLPMTUD to be
935	         unable to progress.

937	      *  This causes the algorithm to lower the MPS until the path is
938	         shown to support the BASE_PMTU, or to suspend DPLPMTUD.

940	5.2.1.  Path Confirmation Phase

942	   DPLPMTUD starts in the Path confirmation phase.  Path confirmation is
943	   performed in two stages:

945	   1.  Connectivity to the remote peer is first confirmed.  When a
946	       connection-oriented PL is used, this stage is implicit.  It is
947	       performed as part of the normal PL connection handshake.  In
948	       contrast, a connectionless PL MUST send an acknowledged probe
949	       packet to confirm that the remote peer is reachable.

951	   2.  In the second stage, the PL confirms it can successfully send a
952	       datagram of the BASE_PMTU size across the current path.

954	   A PL that does not wish to support a network path with a PLPMTU less
955	   than BASE_PMTU can simplify the phase into a single step by
956	   performing connectivity checks with probes of the BASE_PMTU size.

958	   A PL MAY respond to PTB messages while in this phase, see
959	   Section 4.4.

961	   Once path confirmation has completed, DPLPMTUD can advertise an MPS
962	   to an upper layer.

964	   If DPLPMTUD fails to complete these tests it enters the
965	   PROBE_DISABLED phase, see Section 5.2.6, and ceases using DPLPMTUD.

967	5.2.2.  Search Phase

969	   The search phase utilises a search algorithm in an attempt to
970	   increase the PLPMTU (see Section 5.4.1).  The PL sender increases
971	   the MPS each time a packet probe confirms a larger PLPMTU is
972	   supported by the path.  The algorithm concludes by entering the
973	   SEARCH_COMPLETE phase, see Section 5.2.3.

975	   A PL MAY respond to PTB messages while in this phase, using the PTB
976	   to advance or terminate the search, see Section 4.4.  Similarly, black
977	   hole detection can terminate the search by entering the PROBE_BASE
978	   phase, see Section 5.2.4.

980	5.2.2.1.  Resilience to inconsistent path information

982	   Sometimes a PL sender is able to detect inconsistent results from the
983	   sequence of PLPMTU probes that it sends or the sequence of PTB
984	   messages that it receives.  This could be manifested as excessive
985	   fluctuation of the MPS.

987	   When inconsistent path information is detected, a PL sender can
988	   enable an alternate search mode that clamps the offered MPS to a
989	   smaller value for a period of time.  This avoids unnecessary black-
990	   holing of packets.

992	5.2.3.  Search Complete Phase

994	   On entry to the search complete phase, the DPLPMTUD sender starts the
995	   PMTU_RAISE_TIMER.  In this phase, the PLPMTU remains at the value
996	   confirmed by the last successful probe packet.

998	   In this phase, the PL MUST periodically confirm that the PLPMTU is
999	   still supported by the path.  If the PL is designed in a way that is
1000	   unable to confirm reachability to the destination endpoint after
1001	   probing has completed, the method uses a CONFIRMATION_TIMER to
1002	   periodically repeat a probe packet for the current PLPMTU size.

1004	   If the DPLPMTUD sender is unable to confirm reachability for packets
1005	   with a size of the current PLPMTU (e.g., if the CONFIRMATION_TIMER
1006	   expires) or the PL signals a lack of reachability, the method exits
1007	   the phase and enters the PROBE_BASE phase, see Section 5.2.4.

1009	   If the PMTU_RAISE_TIMER expires, the DPLPMTUD sender re-enters the
1010	   Search phase, see Section 5.2.2, and resumes probing for a larger
1011	   PLPMTU.

1013	   Black hole detection can be used in parallel to check that a network
1014	   path continues to support a previously confirmed PLPMTU.  If a black
1015	   hole is detected the algorithm moves to the PROBE_BASE phase, see
1016	   Section 5.2.4.

1018	   The phase can also be exited when a validated PTB message is received
1019	   (see Section 4.4.1).

1021	5.2.4.  PROBE_BASE Phase

1023	   This phase is entered when black hole detection or a PTB message
1024	   indicates that the PLPMTU is not supported by the path.

1026	   On entry to this phase, the PLPMTU is set to the BASE_PMTU, and a
1027	   corresponding reduced MPS is advertised.

1029	   PROBED_SIZE is then set to the PLPMTU (i.e., the BASE_PMTU), to
1030	   confirm this size is supported across the path.  If confirmed,
1031	   DPLPMTUD enters the Search Phase to determine whether the PL sender
1032	   can use a larger PLPMTU.

1034	   If the path cannot be confirmed to support the BASE_PMTU after
1035	   sending MAX_PROBES, DPLPMTUD moves to the Error phase, see
1036	   Section 5.2.5.

1038	5.2.5.  ERROR Phase

1040	   The ERROR phase is entered when there is conflicting or invalid
1041	   PLPMTU information for the path (e.g., a failure to support the
1042	   BASE_PMTU).  In this phase, the MPS is set to a value less than the
1043	   BASE_PMTU, but at least the size of the MIN_PMTU.

1045	   DPLPMTUD remains in the ERROR phase until a consistent view of the
1046	   path can be discovered and it has also been confirmed that the path
1047	   supports the BASE_PMTU.

1049	   Note: MIN_PMTU may be identical to BASE_PMTU, simplifying the actions
1050	   in this phase.

1052	   If no acknowledgement is received for PROBE_COUNT probes of size
1053	   MIN_PMTU, the method suspends DPLPMTUD, see Section 5.2.6.

1055	5.2.5.1.  Robustness to inconsistent path

1057	   Robustness to paths unable to sustain the BASE_PMTU.  Some paths
1058	   could be unable to sustain packets of the BASE_PMTU size.  These
1059	   paths could use an alternate algorithm to implement the PROBE_ERROR
1060	   phase that allows fallback to a smaller than desired PLPMTU, rather
1061	   than suffer connectivity failure.

1063	   This could also utilise methods such as endpoint IP fragmentation to
1064	   enable the PL sender to communicate using packets smaller than the
1065	   BASE_PMTU.

1067	5.2.6.  DISABLED Phase

1069	   This phase suspends operation of DPLPMTUD.  It disables probing for
1070	   the PLPMTU until action is taken by the PL or application using the
1071	   PL.

1073	5.3.  State Machine

1075	   A state machine for DPLPMTUD is depicted in Figure 4.  If multihoming
1076	   is supported, a state machine is needed for each active path.

1078	                                             PROBE_TIMER expiry
1079	                                         (PROBE_COUNT = MAX_PROBES)
1080	                           +-------------------+       +--------------+
1081	                           |    PROBE_START    +------>|PROBE_DISABLED|
1082	                           +-------------------+       +--------------+
1083	                                      |                              ^
1084	                                      | Path confirmed               |
1085	                                      v                              |
1086	   MAX_PMTU acked or           +--------------+-+ (PROBE_COUNT       |
1087	   PTB (BASE_PMTU <= +---------| PROBE_SEARCH | |  < MAX_PROBES)     |
1088	     PTB_SIZE        |    +--> +--------------+<+  or Probe acked    |
1089	   <PROBED_SIZE)     |    |           |   ^                          |
1090	       or            |    |           |   |                          |
1091	   (PROBE_COUNT      |    |           |   |                          |
1092	     =MAX_PROBES)    |    |           |   |                          |
1093	     +---------------+    |           |   |                          |
1094	     |                    |           |   |                          |
1095	     |                    |           |   |                          |
1096	     |   PMTU_RAISE_TIMER |           |   |                          |
1097	     |                    |           |   |                          |
1098	     |                    |           |   |                          |
1099	     |        +-----------+           |   |                          |
1100	     |        |                       |   |                          |
1101	     |        |                       |   |                          |
1102	     |        |    (PTB_SIZE < PLPMTU)|   |                          |
1103	     |        |           or          |   | BASE_PMTU                |
1104	     |        |   Black hole detected |   | Probe acked              |
1105	     v        |                       v   |                          |
1106	   +----------+----+            +--------------+        +-------------+
1107	   |SEARCH_COMPLETE|----------->|  PROBE_BASE  |<-------| PROBE_ERROR |
1108	   +------+--------+            +--------------+        +-------------+
1109	    /\    |    Black hole detected  ^  | |  BASE_PMTU Probe acked:   ^
1110	     |    |             or          |  | |                           |
1111	     |    |    (PTB_SIZE < PLPMTU)  |  | | Probe BASE_PMTU:          |
1112	     |    |                         |  | | (PROBE_COUNT = MAX_PROBES)|
1113	     |    |                         |  | +---------------------------+
1114	     +----+                         +--+
1115	    Confirmation:                 PROBE_TIMER expiry:
1116	   (PROBE_COUNT < MAX_PROBES)    (PROBE_COUNT < MAX_PROBES)
1117	            or
1118	    PLPMTU Probe acked

1120	      Figure 4: State machine for Datagram PLPMTUD.  Note: Some state
1121	               changes are not shown to simplify the diagram.

1123	   The following states are defined:

1125	   PROBE_START:  The PROBE_START state is the initial state before
1126	      probing has started.  The state confirms connectivity to the
1127	      remote PL.

1129	      The PLPMTU is set to the BASE_PMTU size.  Probing ought to start
1130	      immediately after connection setup to prevent the loss
1131	      of user data.  PLPMTUD is not performed in this state.  The state
1132	      transitions to PROBE_SEARCH, when a network path has been
1133	      confirmed, i.e., when a sent packet has been acknowledged on this
1134	      network path and the BASE_PMTU is confirmed to be supported.  If
1135	      the network path cannot be confirmed this state transitions to
1136	      PROBE_DISABLED.

1138	   PROBE_SEARCH:  The PROBE_SEARCH state is the main probing state.
1139	      This state is entered when probing for the BASE_PMTU was
1140	      successful.

1142	      The PROBE_COUNT is set to zero when the first probe packet is sent
1143	      for each probe size.  Each time a probe packet is acknowledged,
1144	      the PLPMTU is set to the PROBED_SIZE, and then the PROBED_SIZE is
1145	      increased using the search algorithm.

1147	      When a probe packet is sent and not acknowledged within the period
1148	      of the PROBE_TIMER, the PROBE_COUNT is incremented and the probe
1149	      packet is retransmitted.  The state is exited when the PROBE_COUNT
1150	      reaches MAX_PROBES; a PTB message is validated; a probe of size
1151	      MAX_PMTU is acknowledged; or black hole detection is triggered.

1153	   SEARCH_COMPLETE:  The SEARCH_COMPLETE state indicates a successful
1154	      end to the PROBE_SEARCH state.  DPLPMTUD remains in this state
1155	      until either the PMTU_RAISE_TIMER expires; a received PTB message
1156	      is validated; or black hole detection is triggered.

1158	      When DPLPMTUD uses an unacknowledged PL and is in the
1159	      SEARCH_COMPLETE state, a CONFIRMATION_TIMER periodically resets
1160	      the PROBE_COUNT and schedules a probe packet with the size of the
1161	      PLPMTU.  If the probe packet fails to be acknowledged after
1162	      MAX_PROBES attempts, the method enters the PROBE_BASE state.  When
1163	      used with an acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT
1164	      continue to generate PLPMTU probes in this state.

1166	   PROBE_BASE:  The PROBE_BASE state is used to confirm whether the
1167	      BASE_PMTU size is supported by the network path and is designed to
1168	      allow an application to continue working when there are transient
1169	      reductions in the actual PMTU.  It also seeks to avoid long
1170	      periods where traffic is black holed while searching for a larger
1171	      PLPMTU.

1173	      On entry, the PROBED_SIZE is set to the BASE_PMTU size and the
1174	      PROBE_COUNT is set to zero.

1176	      Each time a probe packet is sent, the PROBE_TIMER is started.
1177	      The state is exited when the probe packet is acknowledged, and the
1178	      PL sender enters the PROBE_SEARCH state.

1180	      The state is also left when the PROBE_COUNT reaches MAX_PROBES or
1181	      a PTB message is validated.  This causes the PL sender to enter
1182	      the PROBE_ERROR state.

1184	   PROBE_ERROR:  The PROBE_ERROR state represents the case where the
1185	      network path is not known to support a PLPMTU of at least the
1186	      BASE_PMTU size.  It is entered when either a probe of size
1187	      BASE_PMTU has not been acknowledged or a validated PTB message
1188	      indicates a PTB_SIZE smaller than the BASE_PMTU.

1190	      On entry, the PROBE_COUNT is set to zero and the PROBED_SIZE is
1191	      set to the MIN_PMTU size, and the PLPMTU is reset to MIN_PMTU
1192	      size.  In this state, a probe packet is sent, and the PROBE_TIMER
1193	      is started.  The state transitions to the PROBE_SEARCH state when
1194	      a probe packet of at least size BASE_PMTU is acknowledged.  Robust
1195	      implementations may validate the BASE_PMTU several times before
1196	      transitioning to the PROBE_SEARCH state.

1198	      Implementations are permitted to enable endpoint fragmentation if
1199	      the DPLPMTUD is unable to validate MIN_PMTU within PROBE_COUNT
1200	      probes.  If DPLPMTUD is unable to validate MIN_PMTU the
1201	      implementation should transition to PROBE_DISABLED.

1203	   PROBE_DISABLED:  The PROBE_DISABLED state indicates that connectivity
1204	      could not be established.  DPLPMTUD MUST NOT probe in this state.

1206	   Appendix A contains an informative description of key events.
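
   The state descriptions above can be read as a transition table.  The
   following is a minimal, non-normative sketch in Python.  The event
   names and the next_state() helper are hypothetical labels introduced
   for illustration; only transitions described in this section are
   listed, and any other (state, event) pair is assumed to leave the
   state unchanged.

```python
from enum import Enum, auto


class State(Enum):
    PROBE_BASE = auto()
    PROBE_SEARCH = auto()
    SEARCH_COMPLETE = auto()
    PROBE_ERROR = auto()
    PROBE_DISABLED = auto()


class Event(Enum):                 # hypothetical event labels
    BASE_PMTU_ACKED = auto()       # probe of at least BASE_PMTU acknowledged
    PMTU_MAX_ACKED = auto()        # probe of size PMTU_MAX acknowledged
    MAX_PROBES_REACHED = auto()    # PROBE_COUNT reached MAX_PROBES
    PTB_VALIDATED = auto()         # a received PTB message was validated
    MIN_PMTU_UNCONFIRMED = auto()  # MIN_PMTU could not be validated


TRANSITIONS = {
    (State.PROBE_BASE, Event.BASE_PMTU_ACKED): State.PROBE_SEARCH,
    (State.PROBE_BASE, Event.MAX_PROBES_REACHED): State.PROBE_ERROR,
    (State.PROBE_BASE, Event.PTB_VALIDATED): State.PROBE_ERROR,
    (State.PROBE_SEARCH, Event.PMTU_MAX_ACKED): State.SEARCH_COMPLETE,
    (State.PROBE_SEARCH, Event.MAX_PROBES_REACHED): State.SEARCH_COMPLETE,
    # Confirmation probes (unacknowledged PL) failed MAX_PROBES times.
    (State.SEARCH_COMPLETE, Event.MAX_PROBES_REACHED): State.PROBE_BASE,
    (State.PROBE_ERROR, Event.BASE_PMTU_ACKED): State.PROBE_SEARCH,
    (State.PROBE_ERROR, Event.MIN_PMTU_UNCONFIRMED): State.PROBE_DISABLED,
}


def next_state(state, event):
    """Return the next state, or the same state if no rule is listed."""
    return TRANSITIONS.get((state, event), state)
```

   A table keeps each rule visible on one line, which makes it easier to
   compare an implementation against the prose.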

1208	5.4.  Search to Increase the PLPMTU

1210	   This section describes the algorithms used by DPLPMTUD to search for
1211	   a larger PLPMTU.

1213	5.4.1.  Probing for a larger PLPMTU

1215	   Implementations use a search algorithm across the search range to
1216	   determine whether a larger PLPMTU can be supported across a network
1217	   path.

1219	   The method discovers the search range by confirming the minimum
1220	   PLPMTU and then using the probe method to select a PROBED_SIZE less
1221	   than or equal to PMTU_MAX.  PMTU_MAX is the minimum of the local MTU
1222	   and EMTU_R (learned from the remote endpoint).  The PMTU_MAX MAY be
1223	   reduced by an application that sets a maximum to the size of
1224	   datagrams it will send.

1226	   The PROBE_COUNT is initialised to zero when a probe packet is first
1227	   sent with a particular size.  A timer is used by the search algorithm
1228	   to trigger the sending of probe packets of size PROBED_SIZE, larger
1229	   than the PLPMTU.  Each probe packet successfully sent to the remote
1230	   peer is confirmed by acknowledgement at the PL, see Section 4.1.

1232	   Each time a probe packet is sent to the destination, the PROBE_TIMER
1233	   is started.  The timer is cancelled when the PL receives
1234	   acknowledgment that the probe packet has been successfully sent
1235	   across the path (see Section 4.1).  This confirms that the
1236	   PROBED_SIZE is supported, and the PROBED_SIZE value is then assigned
1237	   to the PLPMTU.  The search algorithm can continue to send subsequent
1238	   probe packets of an increasing size.

1240	   If the timer expires before a probe packet is acknowledged, the probe
1241	   has failed to confirm the PROBED_SIZE.  Each time the PROBE_TIMER
1242	   expires, the PROBE_COUNT is incremented, the PROBE_TIMER is
1243	   reinitialised, and a probe packet of the same size is retransmitted
1244	   (the replicated probe improves the resilience to loss).  The maximum
1245	   number of retransmissions for a particular size is configured
1246	   (MAX_PROBES).  If the value of the PROBE_COUNT reaches MAX_PROBES,
1247	   probing will stop, and the PL sender enters the SEARCH_COMPLETE
1248	   state.
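
   The probing procedure in this section can be sketched as a loop.
   This is an illustrative simulation, not a definitive implementation:
   the function name, the send_probe callback (which stands in for
   "send a probe and wait for the PROBE_TIMER"), and the candidate size
   list are assumptions made for the example.

```python
def search_for_larger_plpmtu(plpmtu, candidate_sizes, send_probe, max_probes):
    """Return the PLPMTU reached after probing candidate_sizes in order.

    send_probe(size) is a hypothetical callback: it returns True if the
    probe was acknowledged before the PROBE_TIMER expired, else False.
    """
    for probed_size in candidate_sizes:
        probe_count = 0                    # reset for each new probe size
        while True:
            if send_probe(probed_size):    # acknowledged: size confirmed
                plpmtu = probed_size
                break
            probe_count += 1               # PROBE_TIMER expired
            if probe_count >= max_probes:  # stop probing: SEARCH_COMPLETE
                return plpmtu
    return plpmtu


# Simulated path that silently drops anything larger than 1400 bytes.
sent = []


def fake_path(size):
    sent.append(size)
    return size <= 1400


result = search_for_larger_plpmtu(1200, [1300, 1400, 1500], fake_path,
                                  max_probes=3)
```

   In this run the 1500-byte probe is sent three times before the search
   ends, and the PLPMTU remains at the last confirmed size.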

1250	5.4.2.  Selection of Probe Sizes

1252	   The search algorithm needs to determine a minimum useful gain in
1253	   PLPMTU.  It would not be constructive for a PL sender to attempt to
1254	   probe for all sizes - this would incur unnecessary load on the path
1255	   and has the undesirable effect of slowing the time to reach a more
1256	   optimal MPS.  Implementations SHOULD select the set of probe packet
1257	   sizes to maximise the gain in PLPMTU from each search step.

1259	   Implementations could optimize the search procedure by selecting step
1260	   sizes from a table of common PMTU sizes.  When selecting the
1261	   appropriate next size to search, an implementor ought to also
1262	   consider that there can be common sizes of MPS that applications seek
1263	   to use.
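
   One possible way to select step sizes from a table is sketched below.
   The table contents and the next_probe_size() helper are assumptions
   chosen for illustration; this document does not mandate a table or a
   selection method.

```python
# Illustrative table only; a deployment would choose its own values.
COMMON_PMTU_SIZES = (1280, 1400, 1492, 1500, 9000)


def next_probe_size(plpmtu, pmtu_max, table=COMMON_PMTU_SIZES):
    """Return the next size to probe, or None when nothing larger is left.

    Picks the smallest table entry above the current PLPMTU that does not
    exceed PMTU_MAX.  If no entry fits, PMTU_MAX itself is tried.
    """
    for size in sorted(table):
        if plpmtu < size <= pmtu_max:
            return size
    return pmtu_max if pmtu_max > plpmtu else None
```

   For example, with a PLPMTU of 1200 and a PMTU_MAX of 1500 this helper
   steps through 1280, 1400, 1492 and 1500 rather than probing every
   intermediate size.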

1265	   xxx Author Note: A future version of this section will detail example
1266	   methods for selecting probe size values, but does not plan to mandate
1267	   a single method. xxx

1269	5.4.3.  Resilience to inconsistent Path information

1271	   A decision to increase the PLPMTU needs to be resilient to the
1272	   possibility that information learned about the network path is
1273	   inconsistent (this could happen when probe packets are lost due to
1274	   other reasons, or some of the packets in a flow are forwarded along a
1275	   portion of the path that supports a different actual PMTU).

1277	   Frequent path changes could occur due to unexpected "flapping" -
1278	   where some packets from a flow pass along one path, but other packets
1279	   follow a different path with different properties.  DPLPMTUD can be
1280	   made resilient to these anomalies by introducing hysteresis into the
1281	   search decision to increase the MPS.
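
   The form of hysteresis is not specified here.  As one hedged example,
   an implementation could require several consecutive acknowledgements
   of the same probe size before raising the PLPMTU.  The class name and
   the required_acks parameter below are hypothetical.

```python
class RaiseHysteresis:
    """Raise the PLPMTU only after required_acks consecutive successes."""

    def __init__(self, plpmtu, required_acks=2):
        self.plpmtu = plpmtu
        self.required_acks = required_acks
        self._candidate = None   # size currently being confirmed
        self._streak = 0         # consecutive acks seen for _candidate

    def on_probe_result(self, probed_size, acked):
        """Record one probe outcome and return the current PLPMTU."""
        if not acked or probed_size <= self.plpmtu:
            # A loss (or an irrelevant size) breaks the streak.
            self._candidate, self._streak = None, 0
            return self.plpmtu
        if probed_size != self._candidate:
            self._candidate, self._streak = probed_size, 0
        self._streak += 1
        if self._streak >= self.required_acks:
            self.plpmtu = probed_size
            self._candidate, self._streak = None, 0
        return self.plpmtu
```

   With required_acks set to 2, a single acknowledged probe that happened
   to follow a larger-PMTU path during flapping does not by itself raise
   the PLPMTU.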

1283	6.  Specification of Protocol-Specific Methods

1285	   This section specifies protocol-specific details for datagram PLPMTUD
1286	   for IETF-specified transports.

1288	   The first subsection provides guidance on how to implement the
1289	   DPLPMTUD method as a part of an application using UDP or UDP-Lite.
1290	   The guidance also applies to other datagram services that do not
1291	   include a specific transport protocol (such as a tunnel
1292	   encapsulation).  The following subsections describe how DPLPMTUD can
1293	   be implemented as a part of the transport service, allowing
1294	   applications using the service to benefit from discovery of the
1295	   PLPMTU without themselves needing to implement this method.

1297	6.1.  Application support for DPLPMTUD with UDP or UDP-Lite

1299	   The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do
1300	   not define a method in the RFC-series that supports PLPMTUD.  In
1301	   particular, the UDP transport does not provide the transport layer
1302	   features needed to implement datagram PLPMTUD.

1304	   The DPLPMTUD method can be implemented as a part of an application
1305	   built directly or indirectly on UDP or UDP-Lite, but relies on
1306	   higher-layer protocol features to implement the method [RFC8085].

1308	   Some primitives used by DPLPMTUD might not be available via the
1309	   Datagram API (e.g., the ability to access the PLPMTU cache, or
1310	   interpret received ICMP PTB messages).

1312	   In addition, it is desirable that PMTU discovery is not performed by
1313	   multiple protocol layers.  An application SHOULD avoid implementing
1314	   DPLPMTUD when the underlying transport system provides this
1315	   capability.  Using a common method for managing the PLPMTU has
1316	   benefits, both in the ability to share state between different
1317	   processes and opportunities to coordinate probing.

1319	6.1.1.  Application Request

1321	   An application needs an application-layer protocol mechanism (such as
1322	   a message acknowledgement method) that solicits a response from a
1323	   destination endpoint.  The method SHOULD allow the sender to check
1324	   the value returned in the response to provide additional protection
1325	   from off-path insertion of data [RFC8085].  Suitable methods include
1326	   a parameter known only to the two endpoints, such as a session ID or
1327	   an initialised sequence number.

1329	6.1.2.  Application Response

1331	   An application needs an application-layer protocol mechanism to
1332	   communicate the response from the destination endpoint.  This
1333	   response may indicate successful reception of the probe across the
1334	   path, but could also indicate that some (or all) packets have failed
1335	   to reach the destination.

1337	6.1.3.  Sending Application Probe Packets

1339	   A probe packet may carry an application data block, but the
1340	   successful transmission of this data is at risk when used for
1341	   probing.  Some applications may prefer to use a probe packet that
1342	   does not carry an application data block to avoid disruption to
1343	   normal data transfer.

1345	6.1.4.  Validating the Path

1347	   An application that does not have other higher-layer information
1348	   confirming correct delivery of datagrams SHOULD implement the
1349	   CONFIRMATION_TIMER to periodically send probe packets while in the
1350	   SEARCH_COMPLETE state.

1352	6.1.5.  Handling of PTB Messages

1354	   An application that is able and wishes to receive PTB messages MUST
1355	   perform ICMP validation as specified in Section 5.2 of [RFC8085].
1356	   This requires the application to check each received PTB message to
1357	   validate that it was received in response to transmitted traffic and
1358	   that the reported PTB_SIZE is less than the current probed size.  A
1359	   validated PTB message MAY be used as input to the
1360	   DPLPMTUD algorithm, but MUST NOT be used directly to set the PLPMTU.
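
   The two checks described above can be sketched as follows.  This is a
   simplified illustration: the FlowId tuple and the ptb_is_valid()
   helper are hypothetical, and a real implementation would follow the
   full ICMP validation rules in Section 5.2 of [RFC8085].

```python
from typing import NamedTuple, Set


class FlowId(NamedTuple):
    """Addresses and ports taken from the packet quoted in the PTB."""
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int


def ptb_is_valid(quoted: FlowId, ptb_size: int,
                 active_flows: Set[FlowId], probed_size: int) -> bool:
    """Return True if the PTB message passes both checks."""
    if quoted not in active_flows:
        return False               # not a response to traffic we sent
    return ptb_size < probed_size  # must report less than the probed size


flows = {FlowId("192.0.2.1", "198.51.100.7", 5000, 4433)}
ok = ptb_is_valid(FlowId("192.0.2.1", "198.51.100.7", 5000, 4433),
                  1400, flows, 1500)
```

   A True result only makes the PTB_SIZE eligible as an input to the
   search; the sketch deliberately does not assign it to the PLPMTU.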

1362	6.2.  DPLPMTUD with UDP Options

1364	   UDP Options [I-D.ietf-tsvwg-udp-options] can supply the additional
1365	   functionality required to implement DPLPMTUD within the UDP transport
1366	   service.  Implementing DPLPMTUD using UDP Options avoids the need for
1367	   each application to implement the DPLPMTUD method.

1369	   Section 5.6 of [I-D.ietf-tsvwg-udp-options] defines the MSS option,
1370	   which allows the local sender to indicate the EMTU_R to the peer.
1371	   The value received in this option can be used to initialise PMTU_MAX.

1373	   UDP Options enables padding to be added to UDP datagrams that are
1374	   used as Probe Packets.  Feedback confirming reception of each Probe
1375	   Packet is provided by two new UDP Options:

1377	   o  The Probe Request Option (Section 6.2.1) is set by a sending PL to
1378	      solicit a response from a remote endpoint.  A four-byte token
1379	      identifies each request.

1381	   o  The Probe Response Option (Section 6.2.2) is generated by the UDP
1382	      Options receiver in response to reception of a previously received
1383	      Probe Request Option.  Each Probe Response Option echoes a
1384	      previously received four-byte token.

1386	   The token value allows implementations to distinguish between
1387	   acknowledgements for initial probe packets and acknowledgements
1388	   confirming receipt of subsequent probe packets (e.g., travelling
1389	   along alternate paths with a larger RTT).  Each probe packet needs to
1390	   be uniquely identifiable by the UDP Options sender within the Maximum
1391	   Segment Lifetime (MSL).  The UDP Options sender therefore needs to
1392	   not recycle token values until they have expired or have been
1393	   acknowledged.  A 4 byte value for the token field provides sufficient
1394	   space for multiple unique probes to be made within the MSL.
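
   The token handling rule above (no reuse until a token has expired or
   been acknowledged) can be sketched as below.  The TokenTable class,
   its method names, and the use of random token values are assumptions
   for illustration; the MSL value is supplied by the caller.

```python
import secrets


class TokenTable:
    """Track outstanding probe tokens so that none is reused too early."""

    def __init__(self, msl_seconds):
        self.msl = msl_seconds
        self._outstanding = {}   # token -> (probed_size, expiry_time)

    def _expire(self, now):
        expired = [t for t, (_, exp) in self._outstanding.items()
                   if exp <= now]
        for token in expired:
            del self._outstanding[token]

    def allocate(self, probed_size, now):
        """Return a fresh 32-bit token for a probe of probed_size."""
        self._expire(now)
        while True:
            token = secrets.randbits(32)
            if token not in self._outstanding:   # never reuse a live token
                break
        self._outstanding[token] = (probed_size, now + self.msl)
        return token

    def acknowledge(self, token, now):
        """Return the probed size for a live token, else None."""
        self._expire(now)
        entry = self._outstanding.pop(token, None)
        return None if entry is None else entry[0]
```

   Because each token maps to one probe, a late response to an earlier
   probe is not mistaken for confirmation of a later, larger one.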

1396	   Implementations ought to only send a probe packet with a Probe
1397	   Request Option when required by their local state machine, i.e., when
1398	   probing to grow the PLPMTU or to confirm the current PLPMTU.  The
1399	   procedure to handle the loss of a response packet is the
1400	   responsibility of the sender of the request.

1402	   A PL needs to determine that the path can still support the size of
1403	   datagram that the application is currently sending in the DPLPMTUD
1404	   SEARCH_COMPLETE state (i.e., to detect black-holing of data).  One
1405	   way to achieve this is to send probe packets of size PLPMTU or to use
1406	   a higher-layer method that provides explicit feedback indicating any
1407	   packet loss.  Another possibility is to utilise data packets that
1408	   carry a Timestamp Option.  Reception of a valid timestamp that was
1409	   echoed by the remote endpoint can be used to infer connectivity.

1411	   This can provide useful feedback even over paths with asymmetric
1412	   capacity and/or that carry UDP Option flows that have very asymmetric
1413	   datagram rates, because an echo of the most recent timestamp still
1414	   indicates reception of at least one packet of the transmitted size.
1415	   This is sufficient to confirm there is no black hole.

1417	   In contrast, when sending a probe to increase the PLPMTU, a timestamp
1418	   may be unable to unambiguously identify that a specific probe packet
1419	   has been received.  Timestamp mechanisms cannot be used to confirm
1420	   the reception of individual probe messages and cannot be used to
1421	   stimulate a response from the remote peer.

1423	6.2.1.  UDP Probe Request Option

1425	   The Probe Request Option allows a sending endpoint to solicit a
1426	   response from a destination endpoint.

1428	   The Probe Request Option carries a four byte token set by the sender.
1429	   This token can be set to a value that is likely to be known only to
1430	   the sender (and is sent along the end-to-end path).  The sender can
1431	   then check the value returned in the UDP Probe Response Option.  The
1432	   value of the Token field uniquely identifies a probe within the
1433	   maximum segment lifetime and can also provide additional protection
1434	   from off-path insertion of data [RFC8085].

1436	                       +---------+--------+-----------------+
1437	                       | Kind=9  | Len=6  |     Token       |
1438	                       +---------+--------+-----------------+
1439	                         1 byte    1 byte       4 bytes

1441	                   Figure 5: UDP Probe REQ Option Format

1443	6.2.2.  UDP Probe Response Option

1445	   The Probe Response Option is generated in response to reception of a
1446	   previously received Probe Request Option.

1448	   The Probe Response Option carries a four byte token field.  The Token
1449	   field associates the response with the Token value carried in the
1450	   most recently received Probe Request.  The rate of generation of UDP
1451	   packets carrying a Probe Response Option MAY be rate-limited.

1453	                       +---------+--------+-----------------+
1454	                       | Kind=10 | Len=6  |     Token       |
1455	                       +---------+--------+-----------------+
1456	                         1 byte    1 byte       4 bytes

1458	                   Figure 6: UDP Probe RES Option Format
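
   The option layouts in Figures 5 and 6 can be encoded and parsed as
   sketched below.  The function names are hypothetical, and carrying the
   token in network byte order is an assumption; the figures define only
   the field sizes and the Kind and Len values.

```python
import struct

PROBE_REQUEST_KIND = 9     # Kind value shown in Figure 5
PROBE_RESPONSE_KIND = 10   # Kind value shown in Figure 6
OPTION_LEN = 6             # Kind (1) + Len (1) + Token (4)


def encode_probe_option(kind, token):
    """Pack Kind, Len and a 32-bit token into six bytes."""
    return struct.pack("!BBI", kind, OPTION_LEN, token)


def decode_probe_option(data):
    """Return (kind, token), raising ValueError on a malformed option."""
    if len(data) != OPTION_LEN:
        raise ValueError("option must be exactly 6 bytes")
    kind, length, token = struct.unpack("!BBI", data)
    if length != OPTION_LEN:
        raise ValueError("unexpected Len field")
    if kind not in (PROBE_REQUEST_KIND, PROBE_RESPONSE_KIND):
        raise ValueError("not a probe option")
    return kind, token


def build_response(request):
    """Echo the token of a Probe Request in a Probe Response."""
    kind, token = decode_probe_option(request)
    if kind != PROBE_REQUEST_KIND:
        raise ValueError("expected a Probe Request")
    return encode_probe_option(PROBE_RESPONSE_KIND, token)
```

   The receiver echoes the token unchanged, so the sender can match each
   response to the probe that solicited it.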

1460	6.3.  DPLPMTUD for SCTP

1462	   Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing
1463	   method for SCTP.  It recommends the use of the PAD chunk, defined in
1464	   [RFC4820] to be attached to a minimum length HEARTBEAT chunk to build
1465	   a probe packet.  This enables probing without affecting the transfer
1466	   of user messages and without interfering with congestion control.
1467	   This is preferred to using DATA chunks (with padding as required) as
1468	   path probes.

1470	   XXX Author Note: Future versions of this document might define a
1471	   parameter contained in the INIT and INIT ACK chunk to indicate the
1472	   remote peer MTU to the local peer.  However, multihoming makes this a
1473	   bit complex, so it might not be worth doing.  XXX

1475	6.3.1.  SCTP/IPv4 and SCTP/IPv6

1477	   The base protocol is specified in [RFC4960].  This provides an
1478	   acknowledged PL.  A sender can therefore enter the PROBE_BASE state
1479	   as soon as connectivity has been confirmed.

1481	6.3.1.1.  Sending SCTP Probe Packets

1483	   Probe packets consist of an SCTP common header followed by a
1484	   HEARTBEAT chunk and a PAD chunk.  The PAD chunk is used to control
1485	   the length of the probe packet.  The HEARTBEAT chunk is used to
1486	   trigger the sending of a HEARTBEAT ACK chunk.  The reception of the
1487	   HEARTBEAT ACK chunk acknowledges reception of a successful probe.

1489	   The HEARTBEAT chunk carries a Heartbeat Information parameter which
1490	   should include, besides the information suggested in [RFC4960], the
1491	   probe size, which is the size of the complete datagram.  The size of
1492	   the PAD chunk is therefore computed by reducing the probing size by
1493	   the IPv4 or IPv6 header size, the SCTP common header, the HEARTBEAT
1494	   request and the PAD chunk header.  The payload of the PAD chunk
1495	   contains arbitrary data.
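
   The PAD chunk computation above can be written out as simple
   arithmetic.  This is a hedged sketch: the function and parameter names
   are hypothetical, the IP header sizes assume no IPv4 options or IPv6
   extension headers, and heartbeat_chunk_len is the total length of the
   HEARTBEAT chunk (assumed here to be a multiple of 4 bytes).  The
   udp_encap flag reflects the 8-byte UDP header used by the SCTP/UDP
   encapsulation.

```python
IPV4_HEADER = 20         # bytes, without IP options
IPV6_HEADER = 40         # bytes, without extension headers
SCTP_COMMON_HEADER = 12  # bytes
PAD_CHUNK_HEADER = 4     # bytes (type, flags, length)
UDP_HEADER = 8           # bytes, only for UDP encapsulation


def pad_payload_len(probe_size, heartbeat_chunk_len,
                    ipv6=False, udp_encap=False):
    """Return the PAD chunk payload length for a probe of probe_size."""
    overhead = (IPV6_HEADER if ipv6 else IPV4_HEADER)
    overhead += SCTP_COMMON_HEADER + heartbeat_chunk_len + PAD_CHUNK_HEADER
    if udp_encap:
        overhead += UDP_HEADER
    payload = probe_size - overhead
    if payload < 0:
        raise ValueError("probe size is smaller than the fixed overhead")
    return payload
```

   For instance, with an assumed 48-byte HEARTBEAT chunk, a 1400-byte
   IPv4 probe leaves 1400 - 20 - 12 - 48 - 4 = 1316 bytes of PAD payload.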

1497	   To avoid fragmentation of retransmitted data, probing starts right
1498	   after the handshake, before data is sent.  Assuming normal behaviour
1499	   (i.e., the PMTU is smaller than or equal to the interface MTU), this
1500	   process will take a few round trip time periods depending on the
1501	   number of PMTU sizes probed.  The Heartbeat timer can be used to
1502	   implement the PROBE_TIMER.

1504	6.3.1.2.  Validating the Path with SCTP

1506	   Since SCTP provides an acknowledged PL, a sender MUST NOT implement
1507	   the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

1509	6.3.1.3.  PTB Message Handling by SCTP

1511	   Normal ICMP validation MUST be performed as specified in Appendix C
1512	   of [RFC4960].  This requires that the first 8 bytes of the SCTP
1513	   common header are quoted in the payload of the PTB message, which can
1514	   be the case for ICMPv4 and is normally the case for ICMPv6.

1516	   When a PTB message has been validated, the PTB_SIZE reported in the
1517	   PTB message SHOULD be used with the DPLPMTUD algorithm, providing
1518	   that the reported PTB_SIZE is less than the current probe size.

1520	6.3.2.  DPLPMTUD for SCTP/UDP

1522	   The UDP encapsulation of SCTP is specified in [RFC6951].

1524	6.3.2.1.  Sending SCTP/UDP Probe Packets

1526	   Packet probing can be performed as specified in Section 6.3.1.1.  The
1527	   maximum payload is reduced by 8 bytes, which has to be considered
1528	   when filling the PAD chunk.

1530	6.3.2.2.  Validating the Path with SCTP/UDP

1532	   Since SCTP provides an acknowledged PL, a sender MUST NOT implement
1533	   the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

1535	6.3.2.3.  Handling of PTB Messages by SCTP/UDP

1537	   Normal ICMP validation MUST be performed for PTB messages as
1538	   specified in Appendix C of [RFC4960].  This requires that the first 8
1539	   bytes of the SCTP common header are contained in the PTB message,
1540	   which can be the case for ICMPv4 (but note the UDP header also
1541	   consumes a part of the quoted packet header) and is normally the case
1542	   for ICMPv6.  When the validation is completed, the PTB_SIZE indicated
1543	   in the PTB message SHOULD be used with the DPLPMTUD providing that
1544	   the reported PTB_SIZE is less than the current probe size.

1546	6.3.3.  DPLPMTUD for SCTP/DTLS

1548	   The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is
1549	   specified in [RFC8261].  It is used for data channels in WebRTC
1550	   implementations.

1552	6.3.3.1.  Sending SCTP/DTLS Probe Packets

1554	   Packet probing can be done as specified in Section 6.3.1.1.

1556	6.3.3.2.  Validating the Path with SCTP/DTLS

1558	   Since SCTP provides an acknowledged PL, a sender MUST NOT implement
1559	   the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

1561	6.3.3.3.  Handling of PTB Messages by SCTP/DTLS

1563	   It is not possible to perform normal ICMP validation as specified in
1564	   [RFC4960], since even if the ICMP message payload contains sufficient
1565	   information, the reflected SCTP common header would be encrypted.
1566	   Therefore it is not possible to process PTB messages at the PL.

1568	6.4.  DPLPMTUD for QUIC

1570	   Quick UDP Internet Connection (QUIC) [I-D.ietf-quic-transport] is a
1571	   UDP-based transport that provides reception feedback.

1573	   Section 9.2 of [I-D.ietf-quic-transport] describes the path
1574	   considerations when sending QUIC packets.  It recommends the use of
1575	   PADDING frames to build the probe packet.  This enables probing
1576	   without affecting the transfer of other QUIC frames.

1578	   This provides an acknowledged PL.  A sender can therefore enter the
1579	   PROBE_BASE state as soon as connectivity has been confirmed.

1581	6.4.1.  Sending QUIC Probe Packets

1583	   A probe packet consists of a QUIC Header and a payload containing
1584	   only PADDING Frames.  PADDING Frames are a single octet (0x00) and
1585	   several of these can be used to create a probe packet of size
1586	   PROBED_SIZE.  QUIC provides an acknowledged PL.  A sender can
1587	   therefore enter the PROBE_BASE state as soon as connectivity has been
1588	   confirmed.

1590	   The current specification of QUIC sets the following:

1592	   o  BASE_PMTU: 1200 bytes.  A QUIC sender needs to pad initial packets
1593	      to 1200 bytes to confirm the path can support packets of a useful
1594	      size.

1596	   o  MIN_PMTU: 1200 bytes.  A QUIC sender that determines the PMTU has
1597	      fallen below 1200 bytes MUST immediately stop sending on the
1598	      affected path.

1600	6.4.2.  Validating the Path with QUIC

1602	   QUIC provides an acknowledged PL.  A sender therefore MUST NOT
1603	   implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

1605	6.4.3.  Handling of PTB Messages by QUIC

1607	   QUIC operates over the UDP transport, and the guidelines on ICMP
1608	   validation as specified in Section 5.2 of [RFC8085] therefore apply.
1609	   Although QUIC does not currently specify a method for validating ICMP
1610	   responses, it does provide some guidelines to make it harder for an
1611	   off-path attacker to inject ICMP messages.

1613	   o  Set the IPv4 Don't Fragment (DF) bit on a small proportion of
1614	      packets, so that most invalid ICMP messages arrive when there are
1615	      no DF packets outstanding, and can therefore be identified as
1616	      spurious.

1618	   o  Store additional information from the IP or UDP headers from DF
1619	      packets (for example, the IP ID or UDP checksum) to further
1620	      authenticate incoming Datagram Too Big messages.

1622	   o  Any reduction in PMTU due to a report contained in an ICMP packet
1623	      is provisional until QUIC's loss detection algorithm determines
1624	      that the packet is actually lost.

1626	   XXX The above list was pulled whole from quic-transport - input is
1627	   invited from QUIC contributors.  XXX

1629	7.  Acknowledgements

1631	   This work was partially funded by the European Union's Horizon 2020
1632	   research and innovation programme under grant agreement No. 644334
1633	   (NEAT).  The views expressed are solely those of the author(s).

1635	8.  IANA Considerations

1637	   This memo includes no request to IANA.

1639	   XXX If new UDP Options are specified in this document, a request to
1640	   IANA will be included here.  XXX

1642	   If there are no requirements for IANA, the section will be removed
1643	   during conversion into an RFC by the RFC Editor.

1645	9.  Security Considerations

1647	   The security considerations for the use of UDP and SCTP are provided
1648	   in the referenced RFCs.  Security guidance for applications using UDP
1649	   is provided in the UDP Usage Guidelines [RFC8085].

1651	   There are cases where PTB messages are not delivered due to policy,
1652	   configuration or equipment design (see Section 1.1).  This method
1653	   therefore does not rely upon PTB messages being received, but is able
1654	   to utilise these when they are received by the sender.  PTB messages
1655	   could potentially be used to cause a node to inappropriately reduce
1656	   the PLPMTU.  A node supporting DPLPMTUD MUST therefore appropriately
1657	   validate the payload of PTB messages to ensure these are received in
1658	   response to transmitted traffic (i.e., a reported error condition
1659	   that corresponds to a datagram actually sent by the path layer).

1661	   Parallel forwarding paths may need to be considered.  Section 5.2.5.1
1662	   identifies the need for robustness in the method when the path
1663	   information may be inconsistent.

1665	   A node performing DPLPMTUD could experience conflicting information
1666	   about the size of supported probe packets.  This could occur when
1667	   multiple paths are concurrently in use and these exhibit a
1668	   different PMTU.  If not considered, this could result in data being
1669	   black holed when the PLPMTU is larger than the smallest PMTU across
1670	   the current paths.

1672	   An on-path attacker could forge PTB messages to drive down the PLPMTU.

1674	10.  References

1676	10.1.  Normative References

1678	   [I-D.ietf-quic-transport]
1679	              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
1680	              and Secure Transport", draft-ietf-quic-transport-14 (work
1681	              in progress), August 2018.

1683	   [I-D.ietf-tsvwg-udp-options]
1684	              Touch, J., "Transport Options for UDP", draft-ietf-tsvwg-
1685	              udp-options-05 (work in progress), July 2018.

1687	   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
1688	              DOI 10.17487/RFC0768, August 1980,
1689	              <https://www.rfc-editor.org/info/rfc768>.

1691	   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
1692	              RFC 792, DOI 10.17487/RFC0792, September 1981,
1693	              <https://www.rfc-editor.org/info/rfc792>.

1695	   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
1696	              Communication Layers", STD 3, RFC 1122,
1697	              DOI 10.17487/RFC1122, October 1989,
1698	              <https://www.rfc-editor.org/info/rfc1122>.

1700	   [RFC1812]  Baker, F., Ed., "Requirements for IP Version 4 Routers",
1701	              RFC 1812, DOI 10.17487/RFC1812, June 1995,
1702	              <https://www.rfc-editor.org/info/rfc1812>.

1704	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1705	              Requirement Levels", BCP 14, RFC 2119,
1706	              DOI 10.17487/RFC2119, March 1997,
1707	              <https://www.rfc-editor.org/info/rfc2119>.

1709	   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1710	              (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
1711	              December 1998, <https://www.rfc-editor.org/info/rfc2460>.

1713	   [RFC3828]  Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed.,
1714	              and G. Fairhurst, Ed., "The Lightweight User Datagram
1715	              Protocol (UDP-Lite)", RFC 3828, DOI 10.17487/RFC3828, July
1716	              2004, <https://www.rfc-editor.org/info/rfc3828>.

1718	   [RFC4820]  Tuexen, M., Stewart, R., and P. Lei, "Padding Chunk and
1719	              Parameter for the Stream Control Transmission Protocol
1720	              (SCTP)", RFC 4820, DOI 10.17487/RFC4820, March 2007,
1721	              <https://www.rfc-editor.org/info/rfc4820>.

1723	   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
1724	              RFC 4960, DOI 10.17487/RFC4960, September 2007,
1725	              <https://www.rfc-editor.org/info/rfc4960>.

1727	   [RFC6951]  Tuexen, M. and R. Stewart, "UDP Encapsulation of Stream
1728	              Control Transmission Protocol (SCTP) Packets for End-Host
1729	              to End-Host Communication", RFC 6951,
1730	              DOI 10.17487/RFC6951, May 2013,
1731	              <https://www.rfc-editor.org/info/rfc6951>.

1733	   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
1734	              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
1735	              March 2017, <https://www.rfc-editor.org/info/rfc8085>.

1737	   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
1738	              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
1739	              DOI 10.17487/RFC8201, July 2017,
1740	              <https://www.rfc-editor.org/info/rfc8201>.

1742	   [RFC8261]  Tuexen, M., Stewart, R., Jesup, R., and S. Loreto,
1743	              "Datagram Transport Layer Security (DTLS) Encapsulation of
1744	              SCTP Packets", RFC 8261, DOI 10.17487/RFC8261, November
1745	              2017, <https://www.rfc-editor.org/info/rfc8261>.

1747	10.2.  Informative References

1749	   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
1750	              DOI 10.17487/RFC1191, November 1990,
1751	              <https://www.rfc-editor.org/info/rfc1191>.

1753	   [RFC2923]  Lahey, K., "TCP Problems with Path MTU Discovery",
1754	              RFC 2923, DOI 10.17487/RFC2923, September 2000,
1755	              <https://www.rfc-editor.org/info/rfc2923>.

1757	   [RFC4340]  Kohler, E., Handley, M., and S. Floyd, "Datagram
1758	              Congestion Control Protocol (DCCP)", RFC 4340,
1759	              DOI 10.17487/RFC4340, March 2006,
1760	              <https://www.rfc-editor.org/info/rfc4340>.

1762	   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
1763	              Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007,
1764	              <https://www.rfc-editor.org/info/rfc4821>.

1766	   [RFC4890]  Davies, E. and J. Mohacsi, "Recommendations for Filtering
1767	              ICMPv6 Messages in Firewalls", RFC 4890,
1768	              DOI 10.17487/RFC4890, May 2007,
1769	              <https://www.rfc-editor.org/info/rfc4890>.

1771	Appendix A.  Event-driven state changes

1773	   This appendix contains an informative description of key events:

1775	   Path Setup:  When a new path is initiated, the state is set to
1776	      PROBE_START.  This sends a probe packet with the size of the
1777	      BASE_PMTU.  As soon as the path is confirmed, the state changes to
1778	      PROBE_SEARCH.

1780	   Arrival of an Acknowledgment:  Depending on the probing state, the
1781	      reaction differs according to Figure 7, which is a simplification
1782	      of Figure 4 focusing on this event.

1784	  +--------------+                                    +----------------+
1785	  | PROBE_START  | --3------------------------------> | PROBE_DISABLED |
1786	  +--------------+ --4----------------  ------------> +----------------+
1787	                                      \/
1788	  +--------------+                    /\              +--------------+
1789	  | PROBE_ERROR  | -------------------- \ ----------> |  PROBE_BASE  |
1790	  +--------------+ --4--------------/    \            +--------------+
1791	                                          \
1792	  +--------------+ --1 --------            \          +--------------+
1793	  |  PROBE_BASE  |             \        --- \ ------> | PROBE_ERROR  |
1794	  +--------------+ --3--------- \ -----/     \       +--------------+
1795	                                 \            \
1796	  +--------------+                \            -----> +--------------+
1797	  | PROBE_SEARCH | --2---          -----------------> | PROBE_SEARCH |
1798	  +--------------+       \        ------------------> +--------------+
1799	                          \ ---- /
1800	  +---------------+      / \                          +---------------+
1801	  |SEARCH_COMPLETE| -1---   \                         |SEARCH_COMPLETE|
1802	  +---------------+ -5--     -----------------------> +---------------+
1803	                        \
1804	                         \                            +--------------+
1805	                          --------------------------> |  PROBE_BASE  |
1806	                                                      +--------------+

1808	   Condition 1: The maximum PMTU size has not yet been reached.
1809	   Condition 2: The maximum PMTU size has been reached.
1810	   Condition 3: Probe Timer expires and PROBE_COUNT = MAX_PROBES.
1811	   Condition 4: PROBE_ACK received.
1812	   Condition 5: Black hole detected.

1813	        Figure 7: State changes at the arrival of an acknowledgment

1815	   Probing timeout:  The PROBE_COUNT is initialised to zero each time
1816	      the value of PROBED_SIZE is changed and when an acknowledgment
1817	      confirms delivery of a probe packet.  The PROBE_TIMER is started
1818	      each time a probe packet is sent.  It is stopped when an
1819	      acknowledgment arrives that confirms delivery of a probe packet of
1820	      PROBED_SIZE.  If the probe packet is not acknowledged before the
1821	      PROBE_TIMER expires, the PROBE_COUNT is incremented.  When the
1822	      PROBE_COUNT equals the value MAX_PROBES, the state is changed;
1823	      otherwise, a probe packet of the same size (PROBED_SIZE) is
1824	      resent.  The state transitions are illustrated in Figure 8.  This
1825	      shows a simplification of Figure 4 with a focus only on this
1826	      event.

1828	  +--------------+                                    +----------------+
1829	  |  PROBE_START | --2------------------------------->| PROBE_DISABLED |
1830	  +--------------+                                    +----------------+

1832	  +--------------+                                    +--------------+
1833	  | PROBE_ERROR  |                 -----------------> | PROBE_ERROR  |
1834	  +--------------+                /                   +--------------+
1835	                                 /
1836	  +--------------+ --2----------/                     +--------------+
1837	  |  PROBE_BASE  | --1------------------------------> |  PROBE_BASE  |
1838	  +--------------+                                    +--------------+

1840	  +--------------+                                    +--------------+
1841	  | PROBE_SEARCH | --1------------------------------> | PROBE_SEARCH |
1842	  +--------------+ --2---------                       +--------------+
1843	                               \
1844	  +---------------+             \                     +---------------+
1845	  |SEARCH_COMPLETE|              -------------------> |SEARCH_COMPLETE|
1846	  +---------------+                                   +---------------+

1848	   Condition 1: The maximum number of probe packets has not been
1849	      reached.
1850	   Condition 2: The maximum number of probe packets has been reached.
1851	   XXX This diagram has not been validated.

1852	       Figure 8: State changes at the expiration of the probe timer
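
   The probe timer handling described above can be sketched as follows.
   This is a minimal illustration, not a definitive implementation.  It
   follows one reading of Figure 8, which the draft itself marks as not
   yet validated.  The class name Prober, its method names, and the
   value chosen for MAX_PROBES are assumptions made for this example.
   The sketch only tracks state; it does not send packets.

```python
from dataclasses import dataclass

# Assumed illustrative value; this appendix does not fix a number.
MAX_PROBES = 3

# State entered when PROBE_COUNT reaches MAX_PROBES, as read from
# Figure 8.  PROBE_ERROR and SEARCH_COMPLETE have no outgoing edge in
# that figure, so they are absent here and are left unchanged.
ON_MAX_PROBES = {
    "PROBE_START": "PROBE_DISABLED",
    "PROBE_BASE": "PROBE_ERROR",
    "PROBE_SEARCH": "SEARCH_COMPLETE",
}


@dataclass
class Prober:
    """Hypothetical holder for the probing state of one path."""
    state: str
    probed_size: int
    probe_count: int = 0
    timer_running: bool = False

    def send_probe(self) -> None:
        # The PROBE_TIMER is started each time a probe packet is sent.
        self.timer_running = True

    def set_probed_size(self, size: int) -> None:
        # PROBE_COUNT is reset whenever PROBED_SIZE changes.
        if size != self.probed_size:
            self.probed_size = size
            self.probe_count = 0

    def on_ack(self, acked_size: int) -> None:
        # An acknowledgment for a probe of PROBED_SIZE stops the timer
        # and resets PROBE_COUNT.
        if acked_size == self.probed_size:
            self.timer_running = False
            self.probe_count = 0

    def on_probe_timer_expiry(self) -> None:
        if not self.timer_running:
            return
        self.timer_running = False
        self.probe_count += 1
        if self.probe_count == MAX_PROBES:
            # Condition 2: change state according to Figure 8.
            self.state = ON_MAX_PROBES.get(self.state, self.state)
        else:
            # Condition 1: resend a probe of the same PROBED_SIZE.
            self.send_probe()
```

   For example, a Prober in PROBE_SEARCH stays in that state and
   resends after each of the first MAX_PROBES - 1 expiries, then moves
   to SEARCH_COMPLETE on the next one.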

1854	   PMTU raise timer timeout:  DPLPMTUD periodically sends a probe packet
1855	      to detect whether a larger PMTU is possible.  This probe packet is
1856	      generated by the PMTU_RAISE_TIMER.

1858	   Arrival of a PTB message:  The active probing of the path can be
1859	      supported by the arrival of a PTB message indicating the PTB_SIZE.
1860	      Two examples are:

1862	      1.  The PTB_SIZE is between the PLPMTU and the probe that
1863	          triggered the PTB message.

1865	      2.  The PTB_SIZE is smaller than the PLPMTU.

1867	      In the first case, the PROBE_BASE state transitions to the
1868	      PROBE_ERROR state.  In the PROBE_SEARCH state, a new probe packet
1869	      is sent with the size reported by the PTB message.

1871	      In the second case, the probing starts again with a value of
1872	      PROBE_BASE.
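
   The two PTB examples above can be sketched as a small decision
   function.  This is an illustration under stated assumptions, not the
   method defined by this document.  The function name handle_ptb and
   its return convention are hypothetical.  The PTB message is assumed
   to have been validated already.  "Starts again with a value of
   PROBE_BASE" is read here as returning to the PROBE_BASE state.  PTB
   sizes outside the two examples are left unhandled.

```python
from typing import Optional, Tuple


def handle_ptb(state: str, plpmtu: int, probed_size: int,
               ptb_size: int) -> Tuple[str, Optional[int]]:
    """Return (next_state, next_probe_size).

    next_probe_size is None when the text does not name a size for
    the next probe packet.
    """
    if plpmtu < ptb_size < probed_size:
        # Example 1: PTB_SIZE lies between the PLPMTU and the probe
        # that triggered the PTB message.
        if state == "PROBE_BASE":
            return "PROBE_ERROR", None
        if state == "PROBE_SEARCH":
            # Probe next with the size reported by the PTB message.
            return "PROBE_SEARCH", ptb_size
        return state, None
    if ptb_size < plpmtu:
        # Example 2: PTB_SIZE is smaller than the PLPMTU, so probing
        # starts again (assumed: from the PROBE_BASE state).
        return "PROBE_BASE", None
    # Not covered by the two examples in the text.
    return state, None
```

   For instance, with a PLPMTU of 1280 and a probe of 1500 bytes in
   PROBE_SEARCH, a PTB_SIZE of 1400 yields a new probe of 1400 bytes.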

1874	Appendix B.  Revision Notes

1876	   Note to RFC-Editor: please remove this entire section prior to
1877	   publication.

1879	   Individual draft -00:

1881	   o  Comments and corrections are welcome directly to the authors or
1882	      via the IETF TSVWG working group mailing list.

1884	   o  This update is proposed for WG comments.

1886	   Individual draft -01:

1888	   o  Contains the first representation of the algorithm, showing the
1889	      states and timers

1891	   o  This update is proposed for WG comments.

1893	   Individual draft -02:

1895	   o  Contains updated representation of the algorithm, and textual
1896	      corrections.

1898	   o  The text describing when to set the effective PMTU has not yet
1899	      been validated by the authors

1901	   o  To determine security against off-path attacks: We need to
1902	      decide whether a received PTB message SHOULD/MUST be validated.
1903	      The text on how to handle a PTB message indicating a link MTU
1904	      larger than the probe has not yet been validated by the authors.

1906	   o  No text currently describes how to handle inconsistent results
1907	      from arbitrary re-routing along different parallel paths

1909	   o  This update is proposed for WG comments.

1911	   Working Group draft -00:

1913	   o  This draft follows a successful adoption call for TSVWG

1915	   o  There is still work to complete, please comment on this draft.

1917	   Working Group draft -01:

1919	   o  This draft includes improved introduction.

1921	   o  The draft is updated to require ICMP validation prior to accepting
1922	      PTB messages - this to be confirmed by WG

1924	   o  Section added to discuss Selection of Probe Size - methods to be
1925	      evaluated and recommendations to be considered

1927	   o  Section added to align with work proposed in the QUIC WG.

1929	   Working Group draft -02:

1931	   o  The draft was updated based on feedback from the WG, and a
1932	      detailed review by Magnus Westerlund.

1934	   o  The document updates RFC 4821.

1936	   o  Requirements list updated.

1938	   o  Added more explicit discussion of a simpler black-hole detection
1939	      mode.

1941	   o  This draft includes reorganisation of the section on IETF
1942	      protocols.

1944	   o  Added more discussion of implementation within an application.

1946	   o  Added text on flapping paths.

1948	   o  Replaced 'effective MTU' with new term PLPMTU.

1950	   Working Group draft -03:

1952	   o  Updated figures

1954	   o  Added more discussion on blackhole detection

1956	   o  Added figure describing just blackhole detection

1958	   o  Added figure relating MPS sizes

1960	   Working Group draft -04:

1962	   o  Described phases and named these consistently.

1964	   o  Corrected transition from confirmation directly to the search
1965	      phase (Base has been checked).

1967	   o  Redrawn state diagrams.

1969	   o  Renamed BASE_MTU to BASE_PMTU (because it is a base for the PMTU).

1971	   o  Clarified Error state.

1973	   o  Clarified suspending DPLPMTUD.

1975	   o  Verified normative text in requirements section.

1977	   o  Removed duplicate text.

1979	   o  Changed all text to refer to /packet probe/probe packet/
1980	      /validation/verification/ added term /Probe Confirmation/ and
1981	      clarified BlackHole detection.

1983	Authors' Addresses

1985	   Godred Fairhurst
1986	   University of Aberdeen
1987	   School of Engineering
1988	   Fraser Noble Building
1989	   Aberdeen  AB24 3U
1990	   UK

1992	   Email: gorry@erg.abdn.ac.uk

1994	   Tom Jones
1995	   University of Aberdeen
1996	   School of Engineering
1997	   Fraser Noble Building
1998	   Aberdeen  AB24 3U
1999	   UK

2001	   Email: tom@erg.abdn.ac.uk

2003	   Michael Tuexen
2004	   Muenster University of Applied Sciences
2005	   Stegerwaldstrasse 39
2006	   Steinfurt  48565
2007	   DE

2009	   Email: tuexen@fh-muenster.de

2010	   Irene Ruengeler
2011	   Muenster University of Applied Sciences
2012	   Stegerwaldstrasse 39
2013	   Steinfurt  48565
2014	   DE

2016	   Email: i.ruengeler@fh-muenster.de