Internet Engineering Task Force                             G. Fairhurst
Internet-Draft                                                  T. Jones
Updates: 4821 (if approved)                       University of Aberdeen
Intended status: Standards Track                               M. Tuexen
Expires: 7 June 2020                                        I. Ruengeler
                                                              T. Voelker
                                 Muenster University of Applied Sciences
                                                         5 December 2019

     Packetization Layer Path MTU Discovery for Datagram Transports
                  draft-ietf-tsvwg-datagram-plpmtud-12

Abstract

   This document describes a robust method for Path MTU Discovery
   (PMTUD) for datagram Packetization Layers (PLs). It describes an
   extension to RFC 1191 and RFC 8201, which specify ICMP-based Path
   MTU Discovery for IPv4 and IPv6. The method allows a PL, or a
   datagram application that uses a PL, to discover whether a network
   path can support the current size of datagram. This can be used to
   detect and reduce the message size when a sender encounters a network
   black hole (where packets are discarded). The method can probe a
   network path with progressively larger packets to discover whether
   the maximum packet size can be increased. This allows a sender to
   determine an appropriate packet size, providing functionality for
   datagram transports that is equivalent to the Packetization Layer
   PMTUD specification for TCP, specified in RFC 4821.

   The document also provides implementation notes for incorporating
   Datagram PMTUD into IETF datagram transports or applications that use
   datagram transports.

   When published, this specification updates RFC 4821.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 7 June 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Classical Path MTU Discovery
     1.2.  Packetization Layer Path MTU Discovery
     1.3.  Path MTU Discovery for Datagram Services
   2.  Terminology
   3.  Features Required to Provide Datagram PLPMTUD
   4.  DPLPMTUD Mechanisms
     4.1.  PLPMTU Probe Packets
     4.2.  Confirmation of Probed Packet Size
     4.3.  Detection of Unsupported PLPMTU Size, aka Black Hole
           Detection
     4.4.  Disabling the Effect of PMTUD
     4.5.  Response to PTB Messages
       4.5.1.  Validation of PTB Messages
       4.5.2.  Use of PTB Messages
   5.  Datagram Packetization Layer PMTUD
     5.1.  DPLPMTUD Components
       5.1.1.  Timers
       5.1.2.  Constants
       5.1.3.  Variables
       5.1.4.  Overview of DPLPMTUD Phases
     5.2.  State Machine
     5.3.  Search to Increase the PLPMTU
       5.3.1.  Probing for a larger PLPMTU
       5.3.2.  Selection of Probe Sizes
       5.3.3.  Resilience to Inconsistent Path Information
     5.4.  Robustness to Inconsistent Paths
   6.  Specification of Protocol-Specific Methods
     6.1.  Application support for DPLPMTUD with UDP or UDP-Lite
       6.1.1.  Application Request
       6.1.2.  Application Response
       6.1.3.  Sending Application Probe Packets
       6.1.4.  Initial Connectivity
       6.1.5.  Validating the Path
       6.1.6.  Handling of PTB Messages
     6.2.  DPLPMTUD for SCTP
       6.2.1.  SCTP/IPv4 and SCTP/IPv6
       6.2.2.  DPLPMTUD for SCTP/UDP
       6.2.3.  DPLPMTUD for SCTP/DTLS
     6.3.  DPLPMTUD for QUIC
       6.3.1.  Initial Connectivity
       6.3.2.  Sending QUIC Probe Packets
       6.3.3.  Validating the Path with QUIC
       6.3.4.  Handling of PTB Messages by QUIC
   7.  Acknowledgements
   8.  IANA Considerations
   9.  Security Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  Revision Notes
   Authors' Addresses

1.  Introduction

   The IETF has specified datagram transport using UDP, SCTP, and DCCP,
   as well as protocols layered on top of these transports (e.g.,
   SCTP/UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over
   the IP network layer. This document describes a robust method for
   Path MTU Discovery (PMTUD) that may be used with these transport
   protocols (or the applications that use their transport service) to
   discover an appropriate size of packet to use across an Internet
   path.

1.1.  Classical Path MTU Discovery

   Classical Path Maximum Transmission Unit Discovery (PMTUD) can be
   used with any transport that is able to process ICMP Packet Too Big
   (PTB) messages (e.g., [RFC1191] and [RFC8201]). In this document,
   the term PTB message is applied to both IPv4 ICMP Unreachable
   messages (Type 3) that carry the error Fragmentation Needed (Type 3,
   Code 4) [RFC0792] and ICMPv6 Packet Too Big messages (Type 2)
   [RFC4443]. When a sender receives a PTB message, it reduces the
   effective MTU to the value reported as the Link MTU in the PTB
   message. The sender also uses a method that, from time to time,
   increases the packet size in an attempt to discover an increase in
   the supported PMTU. The packets sent with a size larger than the
   current effective PMTU are known as probe packets.

   Packets not intended as probe packets are either fragmented to the
   current effective PMTU, or the attempt to send fails with an error
   code.
   Applications are sometimes provided with a primitive to let
   them read the Maximum Packet Size (MPS), derived from the current
   effective PMTU.

   Classical PMTUD is subject to protocol failures. One failure arises
   when traffic using a packet size larger than the actual PMTU is
   black-holed (all datagrams sent with this size, or larger, are
   discarded). This could arise when the PTB messages are not delivered
   back to the sender for some reason (see for example [RFC2923]).

   Examples where PTB messages are not delivered include:

   *  The generation of ICMP messages is usually rate limited. This
      could result in no PTB messages being sent to the sender (see
      section 2.4 of [RFC4443]).

   *  ICMP messages can be filtered by middleboxes (including firewalls)
      [RFC4890]. A stateful firewall could be configured with a policy
      to block incoming ICMP messages, which would prevent reception of
      PTB messages by a sending endpoint behind this firewall.

   *  When the router issuing the ICMP message drops a tunneled packet,
      the resulting ICMP message will be directed to the tunnel ingress.
      This tunnel endpoint is responsible for forwarding the ICMP
      message and also processing the quoted packet within the payload
      field to remove the effect of the tunnel, and return a correctly
      formatted ICMP message to the sender [I-D.ietf-intarea-tunnels].
      Failure to do this prevents the PTB message reaching the original
      sender.

   *  Asymmetry in forwarding can result in there being no return route
      to the original sender, which would prevent an ICMP message being
      delivered to the sender. This issue can also arise when
      policy-based routing is used, Equal Cost Multipath (ECMP) routing
      is used, or a middlebox acts as an application load balancer. An
      example is where the path towards the server is chosen by ECMP
      routing depending on bytes in the IP payload.
      In this case, when
      a packet sent by the server encounters a problem after the ECMP
      router, any resulting ICMP message also needs to be directed
      by the ECMP router towards the original sender.

   *  There are additional cases where the next hop destination fails to
      receive a packet because of its size. This could be due to
      misconfiguration of the layer 2 path between nodes, for instance
      the MTU configured in a layer 2 switch, or misconfiguration of the
      Maximum Receive Unit (MRU). If the packet is dropped by the link,
      this will not cause a PTB message to be sent to the original
      sender.

   Another failure could result if a node that is not on the network
   path sends a PTB message that attempts to force a sender to change
   the effective PMTU [RFC8201]. A sender can protect itself from
   reacting to such messages by utilising the quoted packet within a PTB
   message payload to validate that the received PTB message was
   generated in response to a packet that had actually originated from
   the sender. However, there are situations where a sender would be
   unable to provide this validation. Examples where validation of the
   PTB message is not possible include:

   *  When a router issuing the ICMP message implements RFC792
      [RFC0792], it is only required to include the first 64 bits of the
      IP payload of the packet within the quoted payload. There could
      be insufficient bytes remaining for the sender to interpret the
      quoted transport information.

      Note: The recommendation in RFC1812 [RFC1812] is that IPv4 routers
      return a quoted packet with as much of the original datagram as
      possible without the length of the ICMP datagram exceeding 576
      bytes. IPv6 routers include as much of the invoking packet as
      possible without the ICMPv6 packet exceeding 1280 bytes [RFC4443].

   *  The use of tunnels/encryption can reduce the size of the quoted
      packet returned to the original source address, increasing the
      risk that there could be insufficient bytes remaining for the
      sender to interpret the quoted transport information.

   *  Even when the PTB message includes sufficient bytes of the quoted
      packet, the network layer could lack sufficient context to
      validate the message, because validation depends on information
      about the active transport flows at an endpoint node (e.g., the
      socket/address pairs being used, and other protocol header
      information).

   *  When a packet is encapsulated/tunneled over an encrypted
      transport, the tunnel/encapsulation ingress might have
      insufficient context, or computational power, to reconstruct the
      transport header that would be needed to perform validation.

1.2.  Packetization Layer Path MTU Discovery

   The term Packetization Layer (PL) has been introduced to describe the
   layer that is responsible for placing data blocks into the payload of
   IP packets and selecting an appropriate MPS. This function is often
   performed by a transport protocol, but can also be performed by other
   encapsulation methods working above the transport layer.

   In contrast to PMTUD, Packetization Layer Path MTU Discovery
   (PLPMTUD) [RFC4821] does not rely upon reception and validation of
   PTB messages. It is therefore more robust than Classical PMTUD.
   This has become the recommended approach for implementing PMTU
   discovery.

   It uses a general strategy where the PL sends probe packets to search
   for the largest size of unfragmented datagram that can be sent over a
   network path. Probe packets are sent with a progressively larger
   packet size. If a probe packet is successfully delivered (as
   determined by the PL), then the PLPMTU is raised to the size of the
   successful probe.
   If no response is received to a probe packet, the
   method reduces the probe size. The result of probing with the PLPMTU
   is used to set the application MPS.

   PLPMTUD introduces flexibility in the implementation of PMTU
   discovery. At one extreme, it can be configured to only perform ICMP
   Black Hole Detection and recovery to increase the robustness of
   Classical PMTUD, or at the other extreme, all PTB processing can be
   disabled and PLPMTUD can completely replace Classical PMTUD (see
   Section 4.5).

   PLPMTUD can also include additional consistency checks without
   increasing the risk that data is lost when probing to discover the
   path MTU. For example, information available at the PL, or higher
   layers, enables received PTB messages to be validated before being
   utilized.

1.3.  Path MTU Discovery for Datagram Services

   Section 5 of this document presents a set of algorithms for datagram
   protocols to discover the largest size of unfragmented datagram that
   can be sent over a network path. The method described relies on
   features of the PL described in Section 3 and applies to transport
   protocols operating over IPv4 and IPv6. It does not require
   cooperation from the lower layers, although it can utilize PTB
   messages when these received messages are made available to the PL.

   The UDP Usage Guidelines [RFC8085] state "an application SHOULD
   either use the Path MTU information provided by the IP layer or
   implement Path MTU Discovery (PMTUD)", but do not provide a
   mechanism for discovering the largest size of unfragmented datagram
   that can be used on a network path. Prior to this document, PLPMTUD
   had not been specified for UDP.

   Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the
   Stream Control Transmission Protocol (SCTP).
   SCTP utilizes probe
   packets consisting of a minimal sized HEARTBEAT chunk bundled with a
   PAD chunk as defined in [RFC4820], but RFC4821 does not provide a
   complete specification. The present document provides the details to
   complete that specification.

   The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires
   implementations to support Classical PMTUD and states that a DCCP
   sender "MUST maintain the MPS allowed for each active DCCP session".
   It also defines the current congestion control MPS (CCMPS) supported
   by a network path. It recommends use of PMTUD, and suggests use of
   control packets (DCCP-Sync) as path probe packets, because they do
   not risk application data loss. The method defined in this
   specification could be used with DCCP.

   Section 6 specifies the method for a set of transports, and provides
   information to enable the implementation of PLPMTUD with other
   datagram transports and applications that use datagram transports.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Other terminology is directly copied from [RFC4821], and the
   definitions in [RFC1122].

   Actual PMTU:  The Actual PMTU is the PMTU of a network path between a
      sender PL and a destination PL, which the DPLPMTUD algorithm seeks
      to determine.

   Black Hole:  A Black Hole is encountered when a sender is unaware
      that packets are not being delivered to the destination endpoint.
      Two types of Black Hole are relevant to DPLPMTUD:

      Packet Black Hole:  Packets encounter a Packet Black Hole when
         packets are not delivered to the destination endpoint (e.g.,
         when the sender transmits packets of a particular size with a
         previously known effective PMTU and they are discarded by the
         network).

      ICMP Black Hole:  An ICMP Black Hole is encountered when the
         sender is unaware that packets are not delivered to the
         destination endpoint because PTB messages are not received by
         the originating PL sender.

   Black holed:  Traffic is black-holed when the sender is unaware that
      packets are not being delivered. This could be due to a Packet
      Black Hole or an ICMP Black Hole.

   Classical Path MTU Discovery:  Classical PMTUD is a process described
      in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to
      learn the largest size of unfragmented datagram that can be used
      across a network path.

   Datagram:  A datagram is a transport-layer protocol data unit,
      transmitted in the payload of an IP packet.

   Effective PMTU:  The Effective PMTU is the current estimated value
      for PMTU that is used by PMTUD. This is equivalent to the
      PLPMTU derived by PLPMTUD.

   EMTU_S:  The Effective MTU for sending (EMTU_S) is defined in
      [RFC1122] as "the maximum IP datagram size that may be sent, for a
      particular combination of IP source and destination addresses...".

   EMTU_R:  The Effective MTU for receiving (EMTU_R) is designated in
      [RFC1122] as the largest datagram size that can be reassembled.

   Link:  A Link is a communication facility or medium over which nodes
      can communicate at the link layer, i.e., a layer below the IP
      layer. Examples are Ethernet LANs and Internet (or higher) layer
      tunnels.

   Link MTU:  The Link Maximum Transmission Unit (MTU) is the size in
      bytes of the largest IP packet, including the IP header and
      payload, that can be transmitted over a link. Note that this
      could more properly be called the IP MTU, to be consistent with
      how other standards organizations use the acronym. This includes
      the IP header, but excludes link layer headers and other framing
      that is not part of IP or the IP payload. Other standards
      organizations generally define the link MTU to include the link
      layer headers.

   MAX_PMTU:  The MAX_PMTU is the largest size of PLPMTU that DPLPMTUD
      will attempt to use.

   MPS:  The Maximum Packet Size (MPS) is the largest size of
      application data block that can be sent across a network path by a
      PL. In DPLPMTUD this quantity is derived from the PLPMTU by
      taking into consideration the size of the lower protocol layer
      headers. Probe packets generated by DPLPMTUD can have a size
      larger than the MPS.

   MIN_PMTU:  The MIN_PMTU is the smallest size of PLPMTU that DPLPMTUD
      will attempt to use.

   Packet:  A Packet is the IP header plus the IP payload.

   Packetization Layer (PL):  The Packetization Layer (PL) is the layer
      of the network stack that places data into packets and performs
      transport protocol functions.

   Path:  The Path is the set of links and routers traversed by a packet
      between a source node and a destination node by a particular flow.

   Path MTU (PMTU):  The Path MTU (PMTU) is the minimum of the Link MTU
      of all the links forming a network path between a source node and
      a destination node.

   PTB_SIZE:  The PTB_SIZE is a value reported in a validated PTB
      message that indicates the next hop Link MTU of a router along the
      path.

   PLPMTU:  The Packetization Layer PMTU is an estimate of the actual
      PMTU provided by the DPLPMTUD algorithm.

   PLPMTUD:  Packetization Layer Path MTU Discovery (PLPMTUD) is the
      method described in this document for datagram PLs, which is an
      extension to Classical PMTU Discovery.

   Probe packet:  A probe packet is a datagram sent with a purposely
      chosen size (typically the current PLPMTU or larger) to detect if
      packets of this size can be successfully sent end-to-end across
      the network path.

3.  Features Required to Provide Datagram PLPMTUD

   TCP PLPMTUD has been defined using standard TCP protocol mechanisms.
   All of the requirements in [RFC4821] also apply to the use of the
   technique with a datagram PL. Unlike TCP, some datagram PLs require
   additional mechanisms to implement PLPMTUD.

   There are eight requirements for performing the datagram PLPMTUD
   method described in this specification:

   1.  PMTU parameters: A DPLPMTUD sender is RECOMMENDED to provide
       information about the maximum size of packet that can be
       transmitted by the sender on the local link (the local Link MTU).
       It MAY utilize similar information about the receiver when this
       is supplied (note this could be less than EMTU_R). This avoids
       implementations trying to send probe packets that cannot be
       transmitted by the local link. Too high a value could reduce
       the efficiency of the search algorithm. Some applications also
       have a maximum transport protocol data unit (PDU) size, in which
       case there is no benefit from probing for a size larger than this
       (unless a transport allows multiplexing multiple application
       PDUs into the same datagram).

   2.  PLPMTU: A datagram application using a PL not supporting
       fragmentation is REQUIRED to be able to choose the size of
       datagrams sent to the network, up to the PLPMTU, or a smaller
       value (such as the MPS) derived from this. This value is managed
       by the DPLPMTUD method.
       The PLPMTU (specified as the effective
       PMTU in Section 1 of [RFC1191]) is equivalent to the EMTU_S
       (specified in [RFC1122]).

   3.  Probe packets: On request, a DPLPMTUD sender is REQUIRED to be
       able to transmit a packet larger than the PLPMTU. This is used
       to send a probe packet. In IPv4, a probe packet MUST be sent
       with the Don't Fragment (DF) bit set in the IP header, and
       without network layer endpoint fragmentation. In IPv6, a probe
       packet is always sent without source fragmentation (as specified
       in section 5.4 of [RFC8201]).

   4.  Processing PTB messages: A DPLPMTUD sender MAY optionally utilize
       PTB messages received from the network layer to help identify
       when a network path does not support the current size of probe
       packet. Any received PTB message MUST be validated before it is
       used to update the PLPMTU discovery information [RFC8201]. This
       validation confirms that the PTB message was sent in response to
       a packet originated by the sender, and needs to be performed
       before the PLPMTU discovery method reacts to the PTB message. A
       PTB message MUST NOT be used to increase the PLPMTU [RFC8201].

   5.  Reception feedback: The destination PL endpoint is REQUIRED to
       provide a feedback method that indicates to the DPLPMTUD sender
       when a probe packet has been received by the destination PL
       endpoint. The mechanism needs to be robust to the possibility
       that packets could be significantly delayed along a network path.

       The local PL endpoint at the sending node is REQUIRED to pass
       this feedback to the sender DPLPMTUD method.

   6.  Probe loss recovery: It is RECOMMENDED to use probe packets that
       do not carry any user data that would require retransmission if
       lost. Most datagram transports permit this.
       If a probe packet
       contains user data requiring retransmission in case of loss, the
       PL (or layers above) are REQUIRED to arrange any retransmission/
       repair of any resulting loss. DPLPMTUD is REQUIRED to be robust
       in the case where probe packets are lost due to other reasons
       (including link transmission error, congestion).

   7.  Probing and congestion control: The DPLPMTUD sender treats
       isolated loss of a probe packet (with or without a corresponding
       PTB message) as a potential indication of a PMTU limit for the
       path. Loss of a probe packet SHOULD NOT be treated as an
       indication of congestion. The loss of a probe packet SHOULD NOT
       directly trigger a congestion control reaction [RFC4821] because
       this could result in unnecessary reduction of the sending rate.
       The interval between probe packets MUST be at least one RTT.

   8.  Shared PLPMTU state: The PLPMTU value MAY also be stored with the
       corresponding entry associated with the destination in the IP
       layer cache, and used by other PL instances. The specification
       of PLPMTUD [RFC4821] states: "If PLPMTUD updates the MTU for a
       particular path, all Packetization Layer sessions that share the
       path representation (as described in Section 5.2 of [RFC4821])
       SHOULD be notified to make use of the new MTU". Such methods
       MUST be robust to the wide variety of underlying network
       forwarding behaviors. Section 5.2 of [RFC8201] provides guidance
       on the caching of PMTU information and also the relation to IPv6
       flow labels.

   In addition, the following principles are stated for design of a
   DPLPMTUD method:

   *  MPS: A method is REQUIRED to signal an appropriate MPS to the
      higher layer using the PL. The value of the MPS can change
      following a change to the path.
      It is RECOMMENDED that methods
      avoid forcing an application to use an arbitrarily small MPS
      (PLPMTU) for transmission while the method is searching for the
      currently supported PLPMTU. Datagram PLs do not necessarily
      support fragmentation of PDUs larger than the PLPMTU. A reduced
      MPS can adversely impact the performance of a datagram
      application.

   *  Path validation: It is RECOMMENDED that methods are robust to path
      changes that could have occurred since the path characteristics
      were last confirmed, and to the possibility of inconsistent path
      information being received.

   *  Datagram reordering: A method is REQUIRED to be robust to the
      possibility that a flow encounters reordering, or the traffic
      (including probe packets) is divided over more than one network
      path.

   *  When to probe: It is RECOMMENDED that methods determine whether
      the path has changed since they last measured the path. This can
      help determine when to probe the path again.

4.  DPLPMTUD Mechanisms

   This section lists the protocol mechanisms used in this
   specification.

4.1.  PLPMTU Probe Packets

   The DPLPMTUD method relies upon the PL sender being able to generate
   probe packets with a specific size. TCP is able to generate these
   probe packets by choosing to appropriately segment data being sent
   [RFC4821]. In contrast, a datagram PL that needs to construct a
   probe packet has to either request an application to send a data
   block that is larger than that generated by an application, or to
   utilize padding functions to extend a datagram beyond the size of the
   application data block. Protocols that permit exchange of control
   messages (without an application data block) MAY prefer to generate a
   probe packet by extending a control message with padding data.

   A receiver is REQUIRED to be able to distinguish an in-band data
   block from any added padding.
   This is needed to ensure that any
   added padding is not passed on to an application at the receiver.

   This results in three possible ways that a sender can create a probe
   packet:

   Probing using padding data:  A probe packet that contains only
      control information together with any padding needed to inflate
      it to the size required for the probe packet. Since these probe
      packets do not carry an application-supplied data block, they do
      not typically require retransmission, although they do still
      consume network capacity and incur endpoint processing.

   Probing using application data and padding data:  A probe packet
      that contains a data block supplied by an application that is
      combined with padding to inflate the length of the datagram to
      the size required for the probe packet. If the application/
      transport needs protection from the loss of this probe packet,
      the application/transport could perform transport-layer
      retransmission/repair of the data block (e.g., by retransmission
      after loss is detected or by duplicating the data block in a
      datagram without the padding data).

   Probing using application data:  A probe packet that contains a data
      block supplied by an application that matches the size required
      for the probe packet. This method requests the application to
      issue a data block of the desired probe size. If the application/
      transport needs protection from the loss of an unsuccessful probe
      packet, the application/transport then needs to perform
      transport-layer retransmission/repair of the data block (e.g., by
      retransmission after loss is detected).

   A PL that uses a probe packet carrying an application data block
   could need to retransmit this application data block if the probe
   fails.
This could require the PL to re-fragment the data block to a 595 smaller packet size that is expected to traverse the end-to-end path 596 (which could utilize endpoint network-layer or PL fragmentation when 597 these are available). 599 DPLPMTUD MAY choose to use only one of these methods to simplify the 600 implementation. 602 Probe messages sent by a PL MUST contain enough information to 603 uniquely identify the probe within the Maximum Segment Lifetime, while 604 being robust to reordering and replay of probe response and PTB 605 messages. 607 4.2. Confirmation of Probed Packet Size 609 The PL needs a method to determine (confirm) when probe packets have 610 been successfully received end-to-end across a network path. 612 Transport protocols can include end-to-end methods that detect and 613 report reception of specific datagrams that they send (e.g., DCCP and 614 SCTP provide keep-alive/heartbeat features). When supported, this 615 mechanism SHOULD also be used by DPLPMTUD to acknowledge reception of 616 a probe packet. 618 A PL that does not acknowledge data reception (e.g., UDP and UDP- 619 Lite) is unable itself to detect when the packets that it sends are 620 discarded because their size is greater than the actual PMTU. These 621 PLs need to rely on an application protocol to detect this 622 loss. 624 Section 6 specifies this function for a set of IETF-specified 625 protocols. 627 4.3. Detection of Unsupported PLPMTU Size, aka Black Hole Detection 629 A PL sender needs to reduce the PLPMTU when it discovers the actual 630 PMTU supported by a network path is less than the PLPMTU. This can 631 be triggered when a validated PTB message is received, or by another 632 event that indicates the network path no longer sustains the current 633 packet size, such as a loss report from the PL, or repeated lack of 634 response to probe packets sent to confirm the PLPMTU. Detection is 635 followed by a reduction of the PLPMTU.
637 This is performed by sending packet probes of size PLPMTU to verify 638 that a network path still supports the last acknowledged PLPMTU size. 639 There are two alternative mechanisms: 641 * A PL can rely upon a mechanism implemented within the PL to detect 642 excessive loss of data sent with a specific packet size and then 643 conclude that this excessive loss could be a result of an invalid 644 PMTU (as in PLPMTUD for TCP [RFC4821]). 646 * A PL can use the DPLPMTUD probing mechanism to periodically 647 generate probe packets of the size of the current PLPMTU (e.g., 648 using the confirmation timer, see Section 5.1.1). A timer tracks 649 whether acknowledgments are received. Successive loss of probes 650 is an indication that the current path no longer supports the 651 PLPMTU (e.g., when the number of probe packets sent without 652 receiving an acknowledgment, PROBE_COUNT, becomes greater than 653 MAX_PROBES). 655 A PL MAY inhibit sending probe packets when no application data has 656 been sent since the previous probe packet. A PL preferring to use an 657 up-to-date PLPMTU once user data is sent again MAY choose to 658 continue PLPMTU discovery for each path. However, this may result in 659 additional packets being sent. 661 When the method detects the current PLPMTU is not supported, DPLPMTUD 662 sets a lower MPS. The PL then confirms that the updated PLPMTU can 663 be successfully used across the path. The PL could need to send a 664 probe packet with a size less than the size of the data block 665 generated by an application. In this case, the PL could provide a 666 way to fragment a datagram at the PL, or use a control packet as the 667 packet probe. 669 4.4. Disabling the Effect of PMTUD 671 A PL implementing this specification MUST suspend network layer 672 processing of outgoing packets that enforces a PMTU 673 [RFC1191][RFC8201] for each flow utilising DPLPMTUD, and instead use 674 DPLPMTUD to control the size of packets that are sent by a flow.
675 This removes the need for the network layer to drop or fragment sent 676 packets that have a size greater than the PMTU. 678 4.5. Response to PTB Messages 680 This method requires the DPLPMTUD sender to validate any received PTB 681 message before using the PTB information. The response to a PTB 682 message depends on the PTB_SIZE indicated in the PTB message, the 683 state of the PLPMTUD state machine, and the IP protocol being used. 685 Section 4.5.1 first describes validation for both IPv4 ICMP 686 Unreachable messages (type 3) and ICMPv6 Packet Too Big messages, 687 both of which are referred to as PTB messages in this document. 689 4.5.1. Validation of PTB Messages 691 This section specifies utilization of PTB messages. 693 * A simple implementation MAY ignore received PTB messages and in 694 this case the PLPMTU is not updated when a PTB message is 695 received. 697 * An implementation that supports PTB messages MUST validate 698 messages before they are further processed. 700 A PL that receives a PTB message from a router or middlebox performs 701 ICMP validation as specified in Section 5.2 of [RFC8085][RFC8201]. 702 Because DPLPMTUD operates at the PL, the PL needs to check that each 703 received PTB message is received in response to a packet transmitted 704 by the endpoint PL performing DPLPMTUD. 706 The PL MUST check the protocol information in the quoted packet 707 carried in an ICMP PTB message payload to validate the message 708 originated from the sending node. This validation includes 709 determining that the combination of the IP addresses, the protocol, 710 the source port and destination port match those returned in the 711 quoted packet - this is also necessary for the PTB message to be 712 passed to the corresponding PL. 714 The validation SHOULD utilize information that is not simple for 715 an off-path attacker to determine [RFC8085].
For example, this can be done by 716 checking the value of a protocol header field known only to the two 717 PL endpoints. A datagram application that uses well-known source and 718 destination ports ought to also rely on other information to complete 719 this validation. 721 These checks are intended to provide protection from packets that 722 originate from a node that is not on the network path. A PTB message 723 that does not complete the validation MUST NOT be further utilized by 724 the DPLPMTUD method. 726 PTB messages that have been validated MAY be utilized by the DPLPMTUD 727 algorithm, but MUST NOT be used directly to set the PLPMTU. A method 728 that utilizes these PTB messages can improve the speed at which 729 the algorithm detects an appropriate PLPMTU, compared to one that 730 relies solely on probing. Section 4.5.2 describes this processing. 732 4.5.2. Use of PTB Messages 734 A set of checks is intended to provide protection from a router that 735 reports an unexpected PTB_SIZE. The PL also needs to check that the 736 indicated PTB_SIZE is less than the size used by probe packets and 737 larger than the minimum size accepted. 739 This section provides a summary of how PTB messages can be utilized. 740 This processing depends on the PTB_SIZE and the current value of a 741 set of variables: 743 PTB_SIZE < MIN_PMTU 744 * Invalid PTB_SIZE, see Section 4.5.1. 746 * PTB message ought to be discarded without further processing 747 (e.g., PLPMTU not modified). 749 * The information could be utilized as an input to trigger 750 enabling a resilience mode. 752 MIN_PMTU < PTB_SIZE < BASE_PMTU 753 * A robust PL MAY enter an error state (see Section 5.2) for an 754 IPv4 path when the PTB_SIZE reported in the PTB message is 755 larger than or equal to 68 bytes [RFC0791] and when this is 756 less than the BASE_PMTU.
758 * A robust PL MAY enter an error state (see Section 5.2) for an 759 IPv6 path when the PTB_SIZE reported in the PTB message is 760 larger than or equal to 1280 bytes [RFC8200] and when this is 761 less than the BASE_PMTU. 763 PTB_SIZE = PLPMTU 764 * Completes the search for a larger PLPMTU. 766 PTB_SIZE > PROBED_SIZE 767 * Inconsistent network signal. 769 * PTB message ought to be discarded without further processing 770 (e.g., PLPMTU not modified). 772 * The information could be utilized as an input to trigger 773 enabling a resilience mode. 775 BASE_PMTU <= PTB_SIZE < PLPMTU 776 * Black Hole Detection is triggered and the PLPMTU ought to be 777 set to BASE_PMTU. 779 * The PL could use the PTB_SIZE reported in the PTB message to 780 initialize a search algorithm. 782 PLPMTU < PTB_SIZE < PROBED_SIZE 783 * The PLPMTU continues to be valid, but the last PROBED_SIZE 784 searched was larger than the actual PMTU. 786 * The PLPMTU is not updated. 788 * The PL can use the reported PTB_SIZE from the PTB message as 789 the next search point when it resumes the search algorithm. 791 5. Datagram Packetization Layer PMTUD 793 This section specifies Datagram PLPMTUD (DPLPMTUD). The method can 794 be introduced at various points (as indicated with * in the figure 795 below) in the IP protocol stack to discover the PLPMTU so that an 796 application can utilize an appropriate MPS for the current network 797 path. DPLPMTUD SHOULD NOT be used by an application if it is already 798 used in a lower layer. 800 +----------------------+ 801 | Application* | 802 +-+-------+----+----+--+ 803 | | | | 804 +---+--+ +--+--+ | +-+---+ 805 | QUIC*| |UDPO*| | |SCTP*| 806 +---+--+ +--+--+ | +--+--+ 807 | | | | | 808 +-------+--+ | | | 809 | | | | 810 +-+-+--+ | 811 | UDP | | 812 +---+--+ | 813 | | 814 +--------------+-----+-+ 815 | Network Interface | 816 +----------------------+ 818 Figure 1: Examples where DPLPMTUD can be implemented 820 The central idea of DPLPMTUD is probing by a sender.
Probe packets 821 are sent to find the maximum size of a user message that can be 822 completely transferred across the network path from the sender to the 823 destination. 825 The following sections identify the components needed for 826 implementation, provide an overview of the phases of operation, and 827 specify the state machine and search algorithm. 829 5.1. DPLPMTUD Components 831 This section describes the timers, constants, and variables of 832 DPLPMTUD. 834 5.1.1. Timers 836 The method utilizes up to three timers: 838 PROBE_TIMER: The PROBE_TIMER is configured to expire after a 839 period longer than the maximum time to receive 840 an acknowledgment to a probe packet. This value 841 MUST NOT be smaller than 1 second, and SHOULD be 842 larger than 15 seconds. Guidance on selection 843 of the timer value is provided in Section 3.1.1 844 of the UDP Usage Guidelines [RFC8085]. 846 If the PL has a path Round Trip Time (RTT) 847 estimate and timely acknowledgements, the 848 PROBE_TIMER can be derived from the PL RTT 849 estimate. 851 PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period 852 a sender will continue to use the current 853 PLPMTU, after which it re-enters the Search 854 phase. This timer has a period of 600 seconds, 855 as recommended by PLPMTUD [RFC4821]. 857 DPLPMTUD MAY inhibit sending probe packets when 858 no application data has been sent since the 859 previous probe packet. A PL preferring to use 860 an up-to-date PMTU once user data is sent again 861 can choose to continue PMTU discovery for each 862 path. However, this may result in sending 863 additional packets. 865 CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST 866 NOT be used. For other PLs, the 867 CONFIRMATION_TIMER is configured to the period a 868 PL sender waits before confirming the current 869 PLPMTU is still supported. This is less than 870 the PMTU_RAISE_TIMER and used to decrease the 871 PLPMTU (e.g., when a black hole is encountered).
872 Confirmation needs to be frequent enough when 873 data is flowing that the sending PL does not 874 black hole extensive amounts of traffic. 875 Guidance on selection of the timer value is 876 provided in Section 3.1.1 of the UDP Usage 877 Guidelines [RFC8085]. 879 DPLPMTUD MAY inhibit sending probe packets when 880 no application data has been sent since the 881 previous probe packet. A PL preferring to use 882 an up-to-date PMTU once user data is sent again 883 can choose to continue PMTU discovery for each 884 path. However, this may result in sending 885 additional packets. 887 An implementation could implement the various timers using a single 888 timer. 890 5.1.2. Constants 892 The following constants are defined: 894 MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT 895 counter (see Section 5.1.3). MAX_PROBES represents the 896 limit for the number of consecutive probe attempts of 897 any size. The default value of MAX_PROBES is 3. This 898 value is greater than 1 to provide robustness to 899 isolated packet loss. 901 MIN_PMTU: The MIN_PMTU is the smallest allowed probe packet size. 902 For IPv6, this value is 1280 bytes, as specified in 903 [RFC8200]. For IPv4, the minimum value is 68 bytes. 905 Note: An IPv4 router is required to be able to forward a 906 datagram of 68 bytes without further fragmentation. 907 This is the combined size of an IPv4 header and the 908 minimum fragment size of 8 bytes. In addition, 909 receivers are required to be able to reassemble 910 fragmented datagrams at least up to 576 bytes, as stated 911 in Section 3.3.3 of [RFC1122]. 913 MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU. This has to 914 be less than or equal to the minimum of the local MTU of 915 the outgoing interface and the destination PMTU for 916 receiving. An application, or PL, MAY choose a smaller 917 MAX_PMTU when there is no need to send packets larger 918 than a specific size.
920 BASE_PMTU: The BASE_PMTU is a configured size expected to work for 921 most paths. The size is equal to or larger than the 922 MIN_PMTU and smaller than the MAX_PMTU. In the case of 923 IPv6, this value is 1280 bytes [RFC8200]. When using 924 IPv4, a size of 1200 bytes is RECOMMENDED. 926 5.1.3. Variables 928 This method utilizes a set of variables: 930 PROBED_SIZE: The PROBED_SIZE is the size of the current probe 931 packet. This is a tentative value for the PLPMTU, 932 which is awaiting confirmation by an acknowledgment. 934 PROBE_COUNT: The PROBE_COUNT is a count of the number of successive 935 unsuccessful probe packets that have been sent. Each 936 time a probe packet is acknowledged, the value is set 937 to zero. 939 The figure below illustrates the relationship between the packet size 940 constants and variables at a point of time when the DPLPMTUD 941 algorithm performs path probing to increase the size of the PLPMTU. 942 A probe packet has been sent of size PROBED_SIZE. Once this is 943 acknowledged, the PLPMTU will raise to PROBED_SIZE allowing the 944 DPLPMTUD algorithm to further increase PROBED_SIZE towards the actual 945 PMTU. 947 MIN_PMTU MAX_PMTU 948 <--------------------------------------------------> 949 | | | | 950 v | | v 951 BASE_PMTU | v Actual PMTU 952 | PROBED_SIZE 953 v 954 PLPMTU 956 Figure 2: Relationships between packet size constants and variables 958 5.1.4. Overview of DPLPMTUD Phases 960 This section provides a high-level informative view of the DPLPMTUD 961 method, by describing the movement of the method through several 962 phases of operation. More detail is available in the state machine 963 Section 5.2. 
965 +------+ 966 +------->| Base |----------------+ Connectivity 967 | +------+ | or BASE_PMTU 968 | | | confirmation failed 969 | | v 970 | | Connectivity +-------+ 971 | | and BASE_PMTU | Error | 972 | | confirmed +-------+ 973 | | | Consistent 974 | v | connectivity 975 PLPMTU | +--------+ | and BASE_PMTU 976 confirmation | | Search |<--------------+ confirmed 977 failed | +--------+ 978 | ^ | 979 | | | 980 | Raise | | Search 981 | timer | | algorithm 982 | expired | | completed 983 | | | 984 | | v 985 | +-----------------+ 986 +---| Search Complete | 987 +-----------------+ 989 Figure 3: DPLPMTUD Phases 991 Base: The Base Phase confirms connectivity to the remote 992 peer. This phase is implicit for a connection- 993 oriented PL (where it can be performed in a PL 994 connection handshake). A connectionless PL needs 995 to send an acknowledged probe packet to confirm 996 that the remote peer is reachable. The sender also 997 confirms that BASE_PMTU is supported across the 998 network path. 1000 A PL that does not wish to support a path with a 1001 PLPMTU less than BASE_PMTU can simplify the phase 1002 into a single step by performing the connectivity 1003 checks with a probe of the BASE_PMTU size. 1005 Once confirmed, DPLPMTUD enters the Search Phase. 1006 If this phase fails to confirm, DPLPMTUD enters the 1007 Error Phase. 1009 Search: The Search Phase utilizes a search algorithm to 1010 send probe packets to seek to increase the PLPMTU. 1011 The algorithm concludes when it has found a 1012 suitable PLPMTU, by entering the Search Complete 1013 Phase. 1015 A PL could respond to PTB messages using the PTB to 1016 advance or terminate the search, see Section 4.5. 1018 Search Complete: The Search Complete Phase is entered when the 1019 PLPMTU is supported across the network path. A PL 1020 can use a CONFIRMATION_TIMER to periodically repeat 1021 a probe packet for the current PLPMTU size. 
If the 1022 sender is unable to confirm reachability (e.g., if 1023 the CONFIRMATION_TIMER expires) or the PL signals a 1024 lack of reachability, DPLPMTUD enters the Base 1025 Phase. 1027 The PMTU_RAISE_TIMER is used to periodically resume 1028 the Search Phase to discover if the PLPMTU can be 1029 raised. Black Hole Detection or receipt of a 1030 validated PTB message (see Section 4.5.1) can cause 1031 the sender to enter the Base Phase. 1033 Error: The Error Phase is entered when there is 1034 conflicting or invalid PLPMTU information for the 1035 path (e.g., a failure to support the BASE_PMTU) that 1036 causes DPLPMTUD to be unable to progress and the 1037 PLPMTU is lowered. 1039 DPLPMTUD remains in the Error Phase until a 1040 consistent view of the path can be discovered and 1041 it has also been confirmed that the path supports 1042 the BASE_PMTU (or DPLPMTUD is suspended). 1044 An implementation that only reduces the PLPMTU to a suitable size 1045 would be sufficient to ensure reliable operation, but can be very 1046 inefficient when the actual PMTU changes or when the method (for 1047 whatever reason) makes a suboptimal choice for the PLPMTU. 1049 A full implementation of DPLPMTUD provides an algorithm enabling the 1050 DPLPMTUD sender to increase the PLPMTU following a change in the 1051 characteristics of the path, such as when a link is reconfigured with 1052 a larger MTU, or when there is a change in the set of links traversed 1053 by an end-to-end flow (e.g., after a routing or path fail-over 1054 decision). 1056 5.2. State Machine 1058 A state machine for DPLPMTUD is depicted in Figure 4. If multipath 1059 or multihoming is supported, a state machine is needed for each path. 1061 Note: Not all changes are shown, to simplify the diagram.
1063 | | 1064 | Start | PL indicates loss 1065 | | of connectivity 1066 v v 1067 +---------------+ +---------------+ 1068 | DISABLED | | ERROR | 1069 +---------------+ PROBE_TIMER expiry: +---------------+ 1070 | PL indicates PROBE_COUNT = MAX_PROBES or ^ | 1071 | connectivity PTB: PTB_SIZE < BASE_PMTU | | 1072 +--------------------+ +---------------+ | 1073 | | | 1074 v | BASE_PMTU Probe | 1075 +---------------+ acked | 1076 | BASE |----------------------+ 1077 +---------------+ | 1078 Black hole detected or ^ | ^ ^ Black hole detected or | 1079 PTB: PTB_SIZE < PLPMTU | | | | PTB: PTB_SIZE < PLPMTU | 1080 +--------------------+ | | +--------------------+ | 1081 | +----+ | | 1082 | PROBE_TIMER expiry: | | 1083 | PROBE_COUNT < MAX_PROBES | | 1084 | | | 1085 | PMTU_RAISE_TIMER expiry | | 1086 | +-----------------------------------------+ | | 1087 | | | | | 1088 | | v | v 1089 +---------------+ +---------------+ 1090 |SEARCH_COMPLETE| | SEARCHING | 1091 +---------------+ +---------------+ 1092 | ^ ^ | | ^ 1093 | | | | | | 1094 | | +-----------------------------------------+ | | 1095 | | MAX_PMTU Probe acked or PROBE_TIMER | | 1096 | | expiry: PROBE_COUNT = MAX_PROBES or | | 1097 +----+ PTB: PTB_SIZE = PLPMTU +----+ 1098 CONFIRMATION_TIMER expiry: PROBE_TIMER expiry: 1099 PROBE_COUNT < MAX_PROBES or PROBE_COUNT < MAX_PROBES or 1100 PLPMTU Probe acked Probe acked or PTB: 1101 PLPMTU < PTB_SIZE < PROBED_SIZE 1103 Figure 4: State machine for Datagram PLPMTUD 1105 The following states are defined: 1107 DISABLED: The DISABLED state is the initial state before 1108 probing has started. It is also entered from any 1109 other state, when the PL indicates loss of 1110 connectivity. This state is left, once the PL 1111 indicates connectivity to the remote PL. 
1113 BASE: The BASE state is used to confirm that the 1114 BASE_PMTU size is supported by the network path and 1115 is designed to allow an application to continue 1116 working when there are transient reductions in the 1117 actual PMTU. It also seeks to avoid long periods 1118 where traffic is black holed while searching for a 1119 larger PLPMTU. 1121 On entry, the PROBED_SIZE is set to the BASE_PMTU 1122 size and the PROBE_COUNT is set to zero. 1124 Each time a probe packet is sent, the PROBE_TIMER 1125 is started. The state is exited when the probe 1126 packet is acknowledged, and the PL sender enters 1127 the SEARCHING state. 1129 The state is also left when the PROBE_COUNT reaches 1130 MAX_PROBES or a received PTB message is validated. 1131 This causes the PL sender to enter the ERROR state. 1133 SEARCHING: The SEARCHING state is the main probing state. 1134 This state is entered when probing for the 1135 BASE_PMTU was successful. 1137 Each time a probe packet is acknowledged, the 1138 PROBE_COUNT is set to zero, the PLPMTU is set to 1139 the PROBED_SIZE and then the PROBED_SIZE is 1140 increased using the search algorithm. 1142 When a probe packet is sent and not acknowledged 1143 within the period of the PROBE_TIMER, the 1144 PROBE_COUNT is incremented and a new probe packet 1145 is transmitted. The state is exited when the 1146 PROBE_COUNT reaches MAX_PROBES, a received PTB 1147 message is validated, a probe of size MAX_PMTU is 1148 acknowledged, or a black hole is detected. 1150 SEARCH_COMPLETE: The SEARCH_COMPLETE state indicates a successful 1151 end to the SEARCHING state. DPLPMTUD remains in 1152 this state until either the PMTU_RAISE_TIMER 1153 expires, a received PTB message is validated, or a 1154 black hole is detected. 1156 When DPLPMTUD uses an unacknowledged PL and is in 1157 the SEARCH_COMPLETE state, a CONFIRMATION_TIMER 1158 periodically resets the PROBE_COUNT and schedules a 1159 probe packet with the size of the PLPMTU. 
If 1160 MAX_PROBES successive PLPMTU-sized probes fail to 1161 be acknowledged, the method enters the BASE state. 1162 When used with an acknowledged PL (e.g., SCTP), 1163 DPLPMTUD SHOULD NOT continue to generate PLPMTU 1164 probes in this state. 1166 ERROR: The ERROR state represents the case where either 1167 the network path is not known to support a PLPMTU 1168 of at least the BASE_PMTU size or when there is 1169 contradictory information about the network path 1170 that would otherwise result in excessive variation 1171 in the MPS signalled to the higher layer. The 1172 state implements a method to mitigate oscillation 1173 in the state-event engine. It signals a 1174 conservative value of the MPS to the higher layer 1175 by the PL. The state is exited when packet probes 1176 no longer detect the error or when the PL indicates 1177 that connectivity has been lost. 1179 Implementations are permitted to enable endpoint 1180 fragmentation if the DPLPMTUD is unable to validate 1181 MIN_PMTU within PROBE_COUNT probes. If DPLPMTUD is 1182 unable to validate MIN_PMTU the implementation 1183 should transition to the DISABLED state. 1185 Note: MIN_PMTU may be identical to BASE_PMTU, 1186 simplifying the actions in this state. 1188 5.3. Search to Increase the PLPMTU 1190 This section describes the algorithms used by DPLPMTUD to search for 1191 a larger PLPMTU. 1193 5.3.1. Probing for a larger PLPMTU 1195 Implementations use a search algorithm across the search range to 1196 determine whether a larger PLPMTU can be supported across a network 1197 path. 1199 The method discovers the search range by confirming the minimum 1200 PLPMTU and then using the probe method to select a PROBED_SIZE less 1201 than or equal to MAX_PMTU. MAX_PMTU is the minimum of the local MTU 1202 and EMTU_R (learned from the remote endpoint). The MAX_PMTU MAY be 1203 reduced by an application that sets a maximum to the size of 1204 datagrams it will send.
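The upper bound of the search range described above can be sketched as follows. This is an illustrative sketch only and is not part of the specification: the function names max_pmtu() and bound_probed_size(), and the parameter names local_mtu, emtu_r and app_limit, are hypothetical and chosen for this example.

```python
from typing import Optional


def max_pmtu(local_mtu: int, emtu_r: int, app_limit: Optional[int] = None) -> int:
    """Return MAX_PMTU: the minimum of the local MTU and EMTU_R,
    optionally reduced by an application-imposed maximum datagram size
    (app_limit is a hypothetical parameter name)."""
    limit = min(local_mtu, emtu_r)
    if app_limit is not None:
        limit = min(limit, app_limit)
    return limit


def bound_probed_size(candidate: int, plpmtu: int, limit: int) -> Optional[int]:
    """Bound a candidate PROBED_SIZE so that it is larger than the
    confirmed PLPMTU and less than or equal to MAX_PMTU (limit).
    Returns None when no larger size remains to be probed."""
    if plpmtu >= limit:
        return None
    return min(max(candidate, plpmtu + 1), limit)
```

For instance, with a local MTU of 1500 bytes and an EMTU_R of 9000 bytes, the sketch gives a MAX_PMTU of 1500 bytes, and a candidate of 1600 bytes is bounded to 1500 bytes.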
1206 The PROBE_COUNT is initialized to zero when the first probe with a 1207 size greater than or equal to PLPMTU is sent. A timer is used by 1208 the search algorithm to trigger the sending of probe packets of size 1209 PROBED_SIZE, larger than the PLPMTU. Each probe packet successfully 1210 sent to the remote peer is confirmed by acknowledgement at the PL, 1211 see Section 4.1. 1213 Each time a probe packet is sent to the destination, the PROBE_TIMER 1214 is started. The timer is canceled when the PL receives 1215 acknowledgment that the probe packet has been successfully sent 1216 across the path (see Section 4.1). This confirms that the PROBED_SIZE is 1217 supported, and the PROBED_SIZE value is then assigned to the PLPMTU. 1218 The search algorithm can continue to send subsequent probe packets of 1219 an increasing size. 1221 If the timer expires before a probe packet is acknowledged, the probe 1222 has failed to confirm the PROBED_SIZE. Each time the PROBE_TIMER 1223 expires, the PROBE_COUNT is incremented, the PROBE_TIMER is 1224 reinitialized, and a new probe of the same size or any other size 1225 (determined by the search algorithm) can be sent. The maximum number 1226 of consecutive failed probes is configured (MAX_PROBES). If the 1227 value of the PROBE_COUNT reaches MAX_PROBES, probing will stop, and 1228 the PL sender enters the SEARCH_COMPLETE state. 1230 5.3.2. Selection of Probe Sizes 1232 The search algorithm needs to determine a minimum useful gain in 1233 PLPMTU. It would not be constructive for a PL sender to attempt to 1234 probe for all sizes. This would incur unnecessary load on the path 1235 and has the undesirable effect of slowing the time to reach a more 1236 optimal MPS. Implementations SHOULD select the set of probe packet 1237 sizes to maximize the gain in PLPMTU from each search step. 1239 Implementations could optimize the search procedure by selecting step 1240 sizes from a table of common PMTU sizes.
When selecting the 1241 appropriate next size to search, an implementer ought to also 1242 consider that there can be common sizes of MPS that applications seek 1243 to use, and there could be common sizes of MTU used within the 1244 network. 1246 5.3.3. Resilience to Inconsistent Path Information 1248 A decision to increase the PLPMTU needs to be resilient to the 1249 possibility that information learned about the network path is 1250 inconsistent. A path is inconsistent when, for example, probe 1251 packets are lost due to other reasons (i.e., not packet size) or due 1252 to frequent path changes. Frequent path changes could occur by 1253 unexpected "flapping" - where some packets from a flow pass along one 1254 path, but other packets follow a different path with different 1255 properties. 1257 A PL sender is able to detect inconsistency from the sequence of 1258 PLPMTU probes that it sends or the sequence of PTB messages that it 1259 receives. When inconsistent path information is detected, a PL 1260 sender could use an alternate search mode that clamps the offered MPS 1261 to a smaller value for a period of time. This avoids unnecessary 1262 loss of packets due to MTU limitation. 1264 5.4. Robustness to Inconsistent Paths 1266 Some paths could be unable to sustain packets of the BASE_PMTU size. 1267 To be robust to these paths, an implementation could implement the 1268 Error State. This allows fallback to a smaller than desired PLPMTU, 1269 rather than suffer connectivity failure. This could utilize methods 1270 such as endpoint IP fragmentation to enable the PL sender to 1271 communicate using packets smaller than the BASE_PMTU. 1273 6. Specification of Protocol-Specific Methods 1275 DPLPMTUD requires protocol-specific details to be specified for each 1276 PL that is used. 1278 The first subsection provides guidance on how to implement the 1279 DPLPMTUD method as a part of an application using UDP or UDP-Lite.
1280 The guidance also applies to other datagram services that do not 1281 include a specific transport protocol (such as a tunnel 1282 encapsulation). The following subsections describe how DPLPMTUD can 1283 be implemented as a part of the transport service, allowing 1284 applications using the service to benefit from discovery of the 1285 PLPMTU without themselves needing to implement this method. 1287 6.1. Application support for DPLPMTUD with UDP or UDP-Lite 1289 The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do 1290 not define a method in the RFC-series that supports PLPMTUD. In 1291 particular, the UDP transport does not provide the transport layer 1292 features needed to implement datagram PLPMTUD. 1294 The DPLPMTUD method can be implemented as a part of an application 1295 built directly or indirectly on UDP or UDP-Lite, but relies on 1296 higher-layer protocol features to implement the method [RFC8085]. 1298 Some primitives used by DPLPMTUD might not be available via the 1299 Datagram API (e.g., the ability to access the PLPMTU from the IP 1300 layer cache, or interpret received PTB messages). 1302 In addition, it is desirable that PMTU discovery is not performed by 1303 multiple protocol layers. An application SHOULD avoid using DPLPMTUD 1304 when the underlying transport system provides this capability. 1305 Using a common method for managing the PLPMTU has benefits, both in the 1306 ability to share state between different processes and opportunities 1307 to coordinate probing. 1309 6.1.1. Application Request 1311 An application needs an application-layer protocol mechanism (such as 1312 a message acknowledgement method) that solicits a response from a 1313 destination endpoint.
The method SHOULD allow the sender to check 1314 the value returned in the response to provide additional protection 1315 from off-path insertion of data [RFC8085]. Suitable methods include a 1316 parameter known only to the two endpoints, such as a session ID or 1317 initialized sequence number. 1319 6.1.2. Application Response 1321 An application needs an application-layer protocol mechanism to 1322 communicate the response from the destination endpoint. This 1323 response may indicate successful reception of the probe across the 1324 path, but could also indicate that some (or all) packets have failed 1325 to reach the destination. 1327 6.1.3. Sending Application Probe Packets 1329 A probe packet may carry an application data block, but the 1330 successful transmission of this data is at risk when used for 1331 probing. Some applications may prefer to use a probe packet that 1332 does not carry an application data block to avoid disruption to data 1333 transfer. 1335 6.1.4. Initial Connectivity 1337 An application that does not have other higher-layer information 1338 confirming connectivity with the remote peer SHOULD implement a 1339 connectivity mechanism using acknowledged probe packets before 1340 entering the BASE state. 1342 6.1.5. Validating the Path 1344 An application that does not have other higher-layer information 1345 confirming correct delivery of datagrams SHOULD implement the 1346 CONFIRMATION_TIMER to periodically send probe packets while in the 1347 SEARCH_COMPLETE state. 1349 6.1.6. Handling of PTB Messages 1351 An application that is able and wishes to receive PTB messages MUST 1352 perform ICMP validation as specified in Section 5.2 of [RFC8085]. 1353 This requires the application to check each received PTB 1354 message to validate that it is received in response to transmitted 1355 traffic and that the reported PTB_SIZE is less than the current 1356 probed size (see Section 4.5.2).
   A validated PTB message MAY be used as input to the DPLPMTUD
   algorithm, but MUST NOT be used directly to set the PLPMTU.

6.2. DPLPMTUD for SCTP

   Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing
   method for SCTP. It recommends the use of the PAD chunk, defined in
   [RFC4820], to be attached to a minimum length HEARTBEAT chunk to
   build a probe packet. This enables probing without affecting the
   transfer of user messages and without interfering with congestion
   control. This is preferred to using DATA chunks (with padding as
   required) as path probes.

6.2.1. SCTP/IPv4 and SCTP/IPv6

6.2.1.1. Initial Connectivity

   The base protocol is specified in [RFC4960]. This provides an
   acknowledged PL. A sender can therefore enter the BASE state as soon
   as connectivity has been confirmed.

6.2.1.2. Sending SCTP Probe Packets

   Probe packets consist of an SCTP common header followed by a
   HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control
   the length of the probe packet. The HEARTBEAT chunk is used to
   trigger the sending of a HEARTBEAT ACK chunk. The reception of the
   HEARTBEAT ACK chunk acknowledges reception of a successful probe.

   The HEARTBEAT chunk carries a Heartbeat Information parameter which
   should include, besides the information suggested in [RFC4960], the
   probe size, which is the size of the complete datagram. The size of
   the PAD chunk is therefore computed by reducing the probe size by
   the IPv4 or IPv6 header size, the SCTP common header, the HEARTBEAT
   request and the PAD chunk header. The payload of the PAD chunk
   contains arbitrary data.

   To avoid fragmentation of retransmitted data, probing starts right
   after the PL handshake, before data is sent.
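
   The PAD chunk sizing described in this section amounts to a
   subtraction, sketched below for illustration. The function name
   pad_payload_size and its parameters are hypothetical. The sketch
   assumes a 20-byte IPv4 header without options or a 40-byte IPv6
   header without extension headers, the 12-byte SCTP common header
   and 4-byte chunk header of [RFC4960], and that the caller supplies
   the full length of the HEARTBEAT chunk. It does not handle 4-byte
   chunk alignment.

```python
IPV4_HEADER = 20         # bytes, assuming no IPv4 options
IPV6_HEADER = 40         # bytes, assuming no extension headers
SCTP_COMMON_HEADER = 12  # bytes
PAD_CHUNK_HEADER = 4     # bytes: chunk type, flags and length fields


def pad_payload_size(probe_size: int, heartbeat_chunk_len: int,
                     ipv6: bool = False, encaps_overhead: int = 0) -> int:
    """Return the number of padding bytes to place in the PAD chunk.

    probe_size is the size of the complete datagram. encaps_overhead
    is a hypothetical extra parameter for any encapsulation header,
    such as the 8 bytes mentioned for SCTP/UDP in Section 6.2.2.2.
    """
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    remaining = (probe_size - ip_header - encaps_overhead
                 - SCTP_COMMON_HEADER - heartbeat_chunk_len
                 - PAD_CHUNK_HEADER)
    if remaining < 0:
        raise ValueError("probe size is too small to hold the headers")
    return remaining
```

   For example, a 1500-byte IPv4 probe carrying a 44-byte HEARTBEAT
   chunk leaves 1500 - 20 - 12 - 44 - 4 = 1420 bytes of padding.
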
   Assuming probing starts right after the handshake (i.e., the PMTU is
   smaller than or equal to the interface MTU), this process will take
   a few round-trip time periods, depending on the number of PMTU sizes
   probed. The Heartbeat timer can be used to implement the
   PROBE_TIMER.

6.2.1.3. Validating the Path with SCTP

   Since SCTP provides an acknowledged PL, a sender MUST NOT implement
   the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

6.2.1.4. PTB Message Handling by SCTP

   Normal ICMP validation MUST be performed as specified in Appendix C
   of [RFC4960]. This requires that the first 8 bytes of the SCTP
   common header are quoted in the payload of the PTB message, which can
   be the case for ICMPv4 and is normally the case for ICMPv6.

   When a PTB message has been validated, the PTB_SIZE reported in the
   PTB message SHOULD be used with the DPLPMTUD algorithm, provided
   that the reported PTB_SIZE is less than the current probe size (see
   Section 4.5).

6.2.2. DPLPMTUD for SCTP/UDP

   The UDP encapsulation of SCTP is specified in [RFC6951].

6.2.2.1. Initial Connectivity

   A sender can enter the BASE state as soon as SCTP connectivity has
   been confirmed.

6.2.2.2. Sending SCTP/UDP Probe Packets

   Packet probing can be performed as specified in Section 6.2.1.2. The
   maximum payload is reduced by 8 bytes (the size of the UDP header),
   which has to be considered when filling the PAD chunk.

6.2.2.3. Validating the Path with SCTP/UDP

   Since SCTP provides an acknowledged PL, a sender MUST NOT implement
   the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

6.2.2.4. Handling of PTB Messages by SCTP/UDP

   ICMP validation MUST be performed for PTB messages as specified in
   Appendix C of [RFC4960].
   This requires that the first 8 bytes of the SCTP common header are
   contained in the PTB message, which can be the case for ICMPv4 (but
   note the UDP header also consumes a part of the quoted packet
   header) and is normally the case for ICMPv6. When the validation is
   completed, the PTB_SIZE indicated in the PTB message SHOULD be used
   with the DPLPMTUD algorithm, provided that the reported PTB_SIZE is
   less than the current probe size.

6.2.3. DPLPMTUD for SCTP/DTLS

   The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is
   specified in [RFC8261]. It is used for data channels in WebRTC
   implementations.

6.2.3.1. Initial Connectivity

   A sender can enter the BASE state as soon as SCTP connectivity has
   been confirmed.

6.2.3.2. Sending SCTP/DTLS Probe Packets

   Packet probing can be done as specified in Section 6.2.1.2.

6.2.3.3. Validating the Path with SCTP/DTLS

   Since SCTP provides an acknowledged PL, a sender MUST NOT implement
   the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

6.2.3.4. Handling of PTB Messages by SCTP/DTLS

   It is not possible to perform ICMP validation as specified in
   [RFC4960], since even if the ICMP message payload contains sufficient
   information, the reflected SCTP common header would be encrypted.
   Therefore, it is not possible to process PTB messages at the PL.

6.3. DPLPMTUD for QUIC

   QUIC [I-D.ietf-quic-transport] is a UDP-based transport that provides
   reception feedback. The UDP payload includes the QUIC packet header,
   protected payload, and any authentication fields. QUIC depends on a
   PMTU of at least 1280 bytes.

   Section 14.1 of [I-D.ietf-quic-transport] describes the path
   considerations when sending QUIC packets. It recommends the use of
   PADDING frames to build the probe packet.
   Pure probe-only packets are constructed with PADDING frames and PING
   frames to create a padding-only packet that will elicit an
   acknowledgement. Such padding-only packets enable probing without
   affecting the transfer of other QUIC frames.

   The recommendation for QUIC endpoints implementing DPLPMTUD is that
   an MPS is maintained for each combination of local and remote IP
   addresses [I-D.ietf-quic-transport]. If a QUIC endpoint determines
   that the PMTU between any pair of local and remote IP addresses has
   fallen below an acceptable MPS, it needs to immediately cease sending
   QUIC packets on the affected path. This could result in termination
   of the connection if an alternative path cannot be found
   [I-D.ietf-quic-transport].

6.3.1. Initial Connectivity

   The base protocol is specified in [I-D.ietf-quic-transport]. This
   provides an acknowledged PL. A sender can therefore enter the BASE
   state as soon as connectivity has been confirmed.

6.3.2. Sending QUIC Probe Packets

   A probe packet consists of a QUIC Header and a payload containing
   PADDING Frames and a PING Frame. Each PADDING Frame is a single
   octet (0x00), and several of these can be used to create a probe
   packet of size PROBED_SIZE. QUIC provides an acknowledged PL; a
   sender can therefore enter the BASE state as soon as connectivity
   has been confirmed.

   The current specification of QUIC sets the following:

   * BASE_PMTU: 1200 bytes. A QUIC sender needs to pad initial packets
     to 1200 bytes to confirm the path can support packets of a useful
     size.

   * MIN_PMTU: 1200 bytes. A QUIC sender that determines the PMTU has
     fallen below 1200 bytes MUST immediately stop sending on the
     affected path.

6.3.3. Validating the Path with QUIC

   QUIC provides an acknowledged PL. A sender therefore MUST NOT
   implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

6.3.4. Handling of PTB Messages by QUIC

   QUIC operates over the UDP transport, and the guidelines on ICMP
   validation as specified in Section 5.2 of [RFC8085] therefore apply.
   In addition to UDP port validation, QUIC can validate an ICMP message
   by looking for valid Connection IDs in the quoted packet.

7. Acknowledgements

   This work was partially funded by the European Union's Horizon 2020
   research and innovation programme under grant agreement No. 644334
   (NEAT). The views expressed are solely those of the author(s).

   Thanks to all who have commented or contributed, the TSVWG and QUIC
   working groups, and Mathew Calder and Julius Flohr for providing
   implementations.

8. IANA Considerations

   This memo includes no request to IANA.

   If there are no requirements for IANA, the section will be removed
   during conversion into an RFC by the RFC Editor.

9. Security Considerations

   The security considerations for the use of UDP and SCTP are provided
   in the referenced RFCs. The interval between individual probe
   packets MUST be at least one RTT, and the interval between rounds of
   probing is determined by the PMTU_RAISE_TIMER.

   A PL sender needs to ensure that the method used to confirm reception
   of probe packets offers protection from off-path attackers injecting
   packets into the path. This protection is provided in IETF-defined
   protocols (e.g., TCP, SCTP) using a randomly-initialized sequence
   number. A description of one way to do this when using UDP is
   provided in Section 5.1 of [RFC8085].

   There are cases where ICMP Packet Too Big (PTB) messages are not
   delivered due to policy, configuration or equipment design (see
   Section 1.1). This method therefore does not rely upon PTB messages
   being received, but is able to utilize these when they are received
   by the sender.
   PTB messages could potentially be used to cause a node to
   inappropriately reduce the PLPMTU. A node supporting DPLPMTUD MUST
   therefore appropriately validate the payload of PTB messages to
   ensure these are received in response to transmitted traffic (i.e.,
   a reported error condition that corresponds to a datagram actually
   sent by the path layer, see Section 4.5.1).

   An on-path attacker able to create a PTB message could forge PTB
   messages that include a valid quoted IP packet. Such an attack could
   be used to drive down the PLPMTU. This method mitigates such attacks
   in two ways. First, a PL sender never reduces the PLPMTU below the
   base size solely in response to receiving a PTB message. This is
   achieved by first entering the BASE state when such a message is
   received. Second, the design does not require processing of PTB
   messages. A PL sender could therefore suspend processing of PTB
   messages (e.g., in a robustness mode after detecting that subsequent
   probes actually confirm that a size larger than the PTB_SIZE is
   supported by a path).

   Parallel forwarding paths SHOULD be considered. Section 5.4
   identifies the need for robustness in the method when the path
   information may be inconsistent.

   A node performing DPLPMTUD could experience conflicting information
   about the size of supported probe packets. This could occur when
   multiple paths are concurrently in use and these exhibit different
   PMTUs. If not considered, this could result in data being black
   holed when the PLPMTU is larger than the smallest PMTU across the
   current paths.

10. References

10.1. Normative References

   [I-D.ietf-quic-transport]
              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
              and Secure Transport", draft-ietf-quic-transport-20 (work
              in progress), 23 April 2019.

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              DOI 10.17487/RFC0768, August 1980.

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              DOI 10.17487/RFC0791, September 1981.

   [RFC1191]  Mogul, J.C. and S.E. Deering, "Path MTU discovery",
              RFC 1191, DOI 10.17487/RFC1191, November 1990.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3828]  Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed.,
              and G. Fairhurst, Ed., "The Lightweight User Datagram
              Protocol (UDP-Lite)", RFC 3828, DOI 10.17487/RFC3828, July
              2004.

   [RFC4820]  Tuexen, M., Stewart, R., and P. Lei, "Padding Chunk and
              Parameter for the Stream Control Transmission Protocol
              (SCTP)", RFC 4820, DOI 10.17487/RFC4820, March 2007.

   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
              RFC 4960, DOI 10.17487/RFC4960, September 2007.

   [RFC6951]  Tuexen, M. and R. Stewart, "UDP Encapsulation of Stream
              Control Transmission Protocol (SCTP) Packets for End-Host
              to End-Host Communication", RFC 6951,
              DOI 10.17487/RFC6951, May 2013.

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", STD 86, RFC 8200,
              DOI 10.17487/RFC8200, July 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

   [RFC8261]  Tuexen, M., Stewart, R., Jesup, R., and S. Loreto,
              "Datagram Transport Layer Security (DTLS) Encapsulation of
              SCTP Packets", RFC 8261, DOI 10.17487/RFC8261, November
              2017.

10.2. Informative References

   [I-D.ietf-intarea-tunnels]
              Touch, J. and M. Townsley, "IP Tunnels in the Internet
              Architecture", draft-ietf-intarea-tunnels-10 (work in
              progress), 12 September 2019.

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, DOI 10.17487/RFC0792, September 1981.

   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122,
              DOI 10.17487/RFC1122, October 1989.

   [RFC1812]  Baker, F., Ed., "Requirements for IP Version 4 Routers",
              RFC 1812, DOI 10.17487/RFC1812, June 1995.

   [RFC2923]  Lahey, K., "TCP Problems with Path MTU Discovery",
              RFC 2923, DOI 10.17487/RFC2923, September 2000.

   [RFC4340]  Kohler, E., Handley, M., and S. Floyd, "Datagram
              Congestion Control Protocol (DCCP)", RFC 4340,
              DOI 10.17487/RFC4340, March 2006.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, Ed., "Internet
              Control Message Protocol (ICMPv6) for the Internet
              Protocol Version 6 (IPv6) Specification", STD 89,
              RFC 4443, DOI 10.17487/RFC4443, March 2006.

   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007.

   [RFC4890]  Davies, E. and J. Mohacsi, "Recommendations for Filtering
              ICMPv6 Messages in Firewalls", RFC 4890,
              DOI 10.17487/RFC4890, May 2007.

Appendix A. Revision Notes

   Note to RFC-Editor: please remove this entire section prior to
   publication.

   Individual draft -00:

   * Comments and corrections are welcome directly to the authors or
     via the IETF TSVWG working group mailing list.

   * This update is proposed for WG comments.

   Individual draft -01:

   * Contains the first representation of the algorithm, showing the
     states and timers

   * This update is proposed for WG comments.

   Individual draft -02:

   * Contains updated representation of the algorithm, and textual
     corrections.

   * The text describing when to set the effective PMTU has not yet
     been validated by the authors

   * To determine security to off-path-attacks: We need to decide
     whether a received PTB message SHOULD/MUST be validated? The text
     on how to handle a PTB message indicating a link MTU larger than
     the probe has not yet been validated by the authors

   * No text currently describes how to handle inconsistent results
     from arbitrary re-routing along different parallel paths

   * This update is proposed for WG comments.

   Working Group draft -00:

   * This draft follows a successful adoption call for TSVWG

   * There is still work to complete, please comment on this draft.

   Working Group draft -01:

   * This draft includes improved introduction.

   * The draft is updated to require ICMP validation prior to accepting
     PTB messages - this to be confirmed by WG

   * Section added to discuss Selection of Probe Size - methods to be
     evaluated and recommendations to be considered

   * Section added to align with work proposed in the QUIC WG.

   Working Group draft -02:

   * The draft was updated based on feedback from the WG, and a
     detailed review by Magnus Westerlund.

   * The document updates RFC 4821.

   * Requirements list updated.

   * Added more explicit discussion of a simpler black-hole detection
     mode.

   * This draft includes reorganisation of the section on IETF
     protocols.

   * Added more discussion of implementation within an application.

   * Added text on flapping paths.

   * Replaced 'effective MTU' with new term PLPMTU.

   Working Group draft -03:

   * Updated figures

   * Added more discussion on blackhole detection

   * Added figure describing just blackhole detection

   * Added figure relating MPS sizes

   Working Group draft -04:

   * Described phases and named these consistently.

   * Corrected transition from confirmation directly to the search
     phase (Base has been checked).

   * Redrawn state diagrams.

   * Renamed BASE_MTU to BASE_PMTU (because it is a base for the PMTU).

   * Clarified Error state.

   * Clarified suspending DPLPMTUD.

   * Verified normative text in requirements section.

   * Removed duplicate text.

   * Changed all text to refer to /packet probe/probe packet/
     /validation/verification/ added term /Probe Confirmation/ and
     clarified BlackHole detection.

   Working Group draft -05:

   * Updated security considerations.

   * Feedback after speaking with Joe Touch helped improve UDP-Options
     description.

   Working Group draft -06:

   * Updated description of ICMP issues in section 1.1

   * Update to description of QUIC.

   Working group draft -07:

   * Moved description of the PTB processing method from the PTB
     requirements section.

   * Clarified what is performed in the PTB validation check.

   * Updated security consideration to explain PTB security without
     needing to read the rest of the document.

   * Reformatted state machine diagram

   Working group draft -08:

   * Moved to rfcxml v3+

   * Rendered diagrams to svg in html version.

   * Removed Appendix A. Event-driven state changes.

   * Removed section on DPLPMTUD with UDP Options.

   * Shortened the description of phases.

   Working group draft -09:

   * Remove final mention of UDP Options

   * Add Initial Connectivity sections to each PL

   * Add to disable outgoing pmtu enforcement of packets

   Working group draft -10:

   * Address comments from Lars Eggert

   * Reinforce that PROBE_COUNT is successive attempts to probe for any
     size

   * Redefine MAX_PROBES to 3

   * Address PTB_SIZE of 0 or less than MIN_PMTU

   Working group draft -11:

   * Restore a sentence removed in previous rev

   * De-acronymise QUIC

   * Address some nits

   Working group draft -12:

   * Add TSVWG, QUIC and implementers to acknowledgements

   * Shorten a diagram line

   * Address nits from Julius and Wes

   * Be clearer when talking about IP layer caches

Authors' Addresses

   Godred Fairhurst
   University of Aberdeen
   School of Engineering, Fraser Noble Building
   Aberdeen
   AB24 3UE
   United Kingdom

   Email: gorry@erg.abdn.ac.uk

   Tom Jones
   University of Aberdeen
   School of Engineering, Fraser Noble Building
   Aberdeen
   AB24 3UE
   United Kingdom

   Email: tom@erg.abdn.ac.uk

   Michael Tuexen
   Muenster University of Applied Sciences
   Stegerwaldstrasse 39
   48565 Steinfurt
   Germany

   Email: tuexen@fh-muenster.de

   Irene Ruengeler
   Muenster University of Applied Sciences
   Stegerwaldstrasse 39
   48565 Steinfurt
   Germany

   Email: i.ruengeler@fh-muenster.de

   Timo Voelker
   Muenster University of Applied Sciences
   Stegerwaldstrasse 39
   48565 Steinfurt
   Germany

   Email: timo.voelker@fh-muenster.de