idnits 2.17.1

draft-ietf-tsvwg-datagram-plpmtud-11.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There is 1 instance of too long lines in the document, the longest one
     being 2 characters in excess of 72.

  -- The abstract seems to indicate that this document updates RFC8201, but
     the header doesn't have an 'Updates:' line to match this.

  -- The abstract seems to indicate that this document updates RFC4821, but
     the header doesn't have an 'Updates:' line to match this.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (20 November 2019) is 1591 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-transport-20

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 4960 (Obsoleted by RFC 9260)

  == Outdated reference: A later version (-13) exists of
     draft-ietf-intarea-tunnels-10

     Summary: 3 errors (**), 0 flaws (~~), 3 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

Internet Engineering Task Force                             G. Fairhurst
Internet-Draft                                                  T. Jones
Updates: 4821 (if approved)                       University of Aberdeen
Intended status: Standards Track                               M. Tuexen
Expires: 23 May 2020                                        I. Ruengeler
                                                              T. Voelker
                                 Muenster University of Applied Sciences
                                                        20 November 2019

     Packetization Layer Path MTU Discovery for Datagram Transports
                  draft-ietf-tsvwg-datagram-plpmtud-11

Abstract

   This document describes a robust method for Path MTU Discovery
   (PMTUD) for datagram Packetization Layers (PLs). It describes an
   extension to RFC 1191 and RFC 8201, which specify ICMP-based Path
   MTU Discovery for IPv4 and IPv6. The method allows a PL, or a
   datagram application that uses a PL, to discover whether a network
   path can support the current size of datagram. This can be used to
   detect and reduce the message size when a sender encounters a network
   black hole (where packets are discarded). The method can probe a
   network path with progressively larger packets to discover whether
   the maximum packet size can be increased. This allows a sender to
   determine an appropriate packet size, providing functionality for
   datagram transports that is equivalent to the Packetization Layer
   PMTUD specification for TCP, specified in RFC 4821.

   The document also provides implementation notes for incorporating
   Datagram PMTUD into IETF datagram transports or applications that use
   datagram transports.

   When published, this specification updates RFC 4821.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 23 May 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Classical Path MTU Discovery
     1.2.  Packetization Layer Path MTU Discovery
     1.3.  Path MTU Discovery for Datagram Services
   2.  Terminology
   3.  Features Required to Provide Datagram PLPMTUD
   4.  DPLPMTUD Mechanisms
     4.1.  PLPMTU Probe Packets
     4.2.  Confirmation of Probed Packet Size
     4.3.  Detection of Unsupported PLPMTU Size, aka Black Hole
           Detection
     4.4.  Disabling the Effect of PMTUD
     4.5.  Response to PTB Messages
       4.5.1.  Validation of PTB Messages
       4.5.2.  Use of PTB Messages
   5.  Datagram Packetization Layer PMTUD
     5.1.  DPLPMTUD Components
       5.1.1.  Timers
       5.1.2.  Constants
       5.1.3.  Variables
       5.1.4.  Overview of DPLPMTUD Phases
     5.2.  State Machine
     5.3.  Search to Increase the PLPMTU
       5.3.1.  Probing for a larger PLPMTU
       5.3.2.  Selection of Probe Sizes
       5.3.3.  Resilience to Inconsistent Path Information
     5.4.  Robustness to Inconsistent Paths
   6.  Specification of Protocol-Specific Methods
     6.1.  Application support for DPLPMTUD with UDP or UDP-Lite
       6.1.1.  Application Request
       6.1.2.  Application Response
       6.1.3.  Sending Application Probe Packets
       6.1.4.  Initial Connectivity
       6.1.5.  Validating the Path
       6.1.6.  Handling of PTB Messages
     6.2.  DPLPMTUD for SCTP
       6.2.1.  SCTP/IPv4 and SCTP/IPv6
       6.2.2.  DPLPMTUD for SCTP/UDP
       6.2.3.  DPLPMTUD for SCTP/DTLS
     6.3.  DPLPMTUD for QUIC
       6.3.1.  Initial Connectivity
       6.3.2.  Sending QUIC Probe Packets
       6.3.3.  Validating the Path with QUIC
       6.3.4.  Handling of PTB Messages by QUIC
   7.  Acknowledgements
   8.  IANA Considerations
   9.  Security Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  Revision Notes
   Authors' Addresses

1.  Introduction

   The IETF has specified datagram transport using UDP, SCTP, and DCCP,
   as well as protocols layered on top of these transports (e.g.,
   SCTP/UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the
   IP network layer. This document describes a robust method for Path
   MTU Discovery (PMTUD) that may be used with these transport protocols
   (or the applications that use their transport service) to discover an
   appropriate size of packet to use across an Internet path.

1.1.  Classical Path MTU Discovery

   Classical Path Maximum Transmission Unit Discovery (PMTUD) can be
   used with any transport that is able to process ICMP Packet Too Big
   (PTB) messages (e.g., [RFC1191] and [RFC8201]). In this document,
   the term PTB message is applied to both IPv4 ICMP Unreachable
   messages (Type 3) that carry the error Fragmentation Needed (Type 3,
   Code 4) [RFC0792] and ICMPv6 Packet Too Big messages (Type 2)
   [RFC4443]. When a sender receives a PTB message, it reduces the
   effective MTU to the value reported as the Link MTU in the PTB
   message. A method that from time to time increases the packet size
   is then used in an attempt to discover an increase in the supported
   PMTU. The packets sent with a size larger than the current effective
   PMTU are known as probe packets.
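
   The reaction to a PTB message described above can be sketched as
   follows. This is a minimal illustration, not part of the
   specification: the function name, the parameter names and the choice
   to ignore reports below a configured floor are assumptions made for
   the example.

```python
# Hypothetical sketch of a Classical PMTUD reaction to a PTB message.
# Names and the floor policy are illustrative assumptions.

IPV6_MIN_LINK_MTU = 1280  # minimum IPv6 link MTU


def on_ptb(effective_pmtu: int, reported_link_mtu: int,
           floor: int = IPV6_MIN_LINK_MTU) -> int:
    """Return the new effective PMTU after a PTB message.

    The estimate is only ever reduced: a PTB message reporting a value
    equal to or larger than the current estimate is ignored.  Reports
    below the floor are also ignored (an assumption of this sketch).
    """
    if reported_link_mtu >= effective_pmtu:
        return effective_pmtu      # never increase on a PTB message
    if reported_link_mtu < floor:
        return effective_pmtu      # implausibly small report: ignore
    return reported_link_mtu


new_pmtu = on_ptb(effective_pmtu=1500, reported_link_mtu=1400)
```

   Increasing the estimate again is left to the separate probing
   method, which sends packets larger than the current effective PMTU.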

   Packets not intended as probe packets are either fragmented to the
   current effective PMTU, or the attempt to send fails with an error
   code. Applications are sometimes provided with a primitive to let
   them read the Maximum Packet Size (MPS), derived from the current
   effective PMTU.

   Classical PMTUD is subject to protocol failures. One failure arises
   when traffic using a packet size larger than the actual PMTU is
   black-holed (all datagrams sent with this size, or larger, are
   discarded). This could arise when the PTB messages are not delivered
   back to the sender for some reason (see for example [RFC2923]).

   Examples where PTB messages are not delivered include:

   *  The generation of ICMP messages is usually rate limited. This
      could result in no PTB messages being generated to the sender (see
      Section 2.4 of [RFC4443]).

   *  ICMP messages can be filtered by middleboxes (including firewalls)
      [RFC4890]. A stateful firewall could be configured with a policy
      to block incoming ICMP messages, which would prevent reception of
      PTB messages to a sending endpoint behind this firewall.

   *  When the router issuing the ICMP message drops a tunneled packet,
      the resulting ICMP message will be directed to the tunnel ingress.
      This tunnel endpoint is responsible for forwarding the ICMP
      message and also processing the quoted packet within the payload
      field to remove the effect of the tunnel, and return a correctly
      formatted ICMP message to the sender [I-D.ietf-intarea-tunnels].
      Failure to do this prevents the PTB message reaching the original
      sender.

   *  Asymmetry in forwarding can result in there being no return route
      to the original sender, which would prevent an ICMP message being
      delivered to the sender. This issue can also arise when
      policy-based routing is used, Equal Cost Multipath (ECMP) routing
      is used, or a middlebox acts as an application load balancer.
      An example is where the path towards the server is chosen by ECMP
      routing depending on bytes in the IP payload. In this case, when
      a packet sent by the server encounters a problem after the ECMP
      router, then any resulting ICMP message needs to also be directed
      by the ECMP router towards the original sender.

   *  There are additional cases where the next hop destination fails to
      receive a packet because of its size. This could be due to
      misconfiguration of the layer 2 path between nodes, for instance
      the MTU configured in a layer 2 switch, or misconfiguration of the
      Maximum Receive Unit (MRU). If the packet is dropped by the link,
      this will not cause a PTB message to be sent to the original
      sender.

   Another failure could result if a node that is not on the network
   path sends a PTB message that attempts to force a sender to change
   the effective PMTU [RFC8201]. A sender can protect itself from
   reacting to such messages by utilising the quoted packet within a PTB
   message payload to validate that the received PTB message was
   generated in response to a packet that had actually originated from
   the sender. However, there are situations where a sender would be
   unable to provide this validation. Examples where validation of the
   PTB message is not possible include:

   *  When a router issuing the ICMP message implements RFC 792
      [RFC0792], it is only required to include the first 64 bits of the
      IP payload of the packet within the quoted payload. There could
      be insufficient bytes remaining for the sender to interpret the
      quoted transport information.

      Note: The recommendation in RFC 1812 [RFC1812] is that IPv4
      routers return a quoted packet with as much of the original
      datagram as possible without the length of the ICMP datagram
      exceeding 576 bytes.
      IPv6 routers include as much of the invoking packet as
      possible without the ICMPv6 packet exceeding 1280 bytes [RFC4443].

   *  The use of tunnels/encryption can reduce the size of the quoted
      packet returned to the original source address, increasing the
      risk that there could be insufficient bytes remaining for the
      sender to interpret the quoted transport information.

   *  Even when the PTB message includes sufficient bytes of the quoted
      packet, the network layer could lack sufficient context to
      validate the message, because validation depends on information
      about the active transport flows at an endpoint node (e.g., the
      socket/address pairs being used, and other protocol header
      information).

   *  When a packet is encapsulated/tunneled over an encrypted
      transport, the tunnel/encapsulation ingress might have
      insufficient context, or computational power, to reconstruct the
      transport header that would be needed to perform validation.

1.2.  Packetization Layer Path MTU Discovery

   The term Packetization Layer (PL) has been introduced to describe the
   layer that is responsible for placing data blocks into the payload of
   IP packets and selecting an appropriate MPS. This function is often
   performed by a transport protocol, but can also be performed by other
   encapsulation methods working above the transport layer.

   In contrast to PMTUD, Packetization Layer Path MTU Discovery
   (PLPMTUD) [RFC4821] does not rely upon reception and validation of
   PTB messages. It is therefore more robust than Classical PMTUD.
   This has become the recommended approach for implementing PMTU
   discovery.

   It uses a general strategy where the PL sends probe packets to search
   for the largest size of unfragmented datagram that can be sent over a
   network path. Probe packets are sent with a progressively larger
   packet size.
   If a probe packet is successfully delivered (as
   determined by the PL), then the PLPMTU is raised to the size of the
   successful probe. If no response is received to a probe packet, the
   method reduces the probe size. The result of probing with the PLPMTU
   is used to set the application MPS.

   PLPMTUD introduces flexibility in the implementation of PMTU
   discovery. At one extreme, it can be configured to only perform ICMP
   Black Hole Detection and recovery to increase the robustness of
   Classical PMTUD, or at the other extreme, all PTB processing can be
   disabled and PLPMTUD can completely replace Classical PMTUD (see
   Section 4.5).

   PLPMTUD can also include additional consistency checks without
   increasing the risk that data is lost when probing to discover the
   path MTU. For example, information available at the PL, or higher
   layers, enables received PTB messages to be validated before being
   utilized.

1.3.  Path MTU Discovery for Datagram Services

   Section 5 of this document presents a set of algorithms for datagram
   protocols to discover the largest size of unfragmented datagram that
   can be sent over a network path. The method described relies on
   features of the PL described in Section 3 and applies to transport
   protocols operating over IPv4 and IPv6. It does not require
   cooperation from the lower layers, although it can utilize PTB
   messages when these received messages are made available to the PL.

   The UDP Usage Guidelines [RFC8085] state "an application SHOULD
   either use the Path MTU information provided by the IP layer or
   implement Path MTU Discovery (PMTUD)", but do not provide a
   mechanism for discovering the largest size of unfragmented datagram
   that can be used on a network path. Prior to this document, PLPMTUD
   had not been specified for UDP.
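
   The general probing strategy of Section 1.2 (raise the PLPMTU after
   a successful probe, reduce the probe size after an unanswered one)
   can be illustrated with the sketch below. It is not the search
   algorithm specified in Section 5. The function search_plpmtu, the
   callback probe_ok and the step-halving rule are hypothetical choices
   made for this example, and the starting size is assumed to have been
   confirmed already.

```python
# Illustrative search loop: names and the step-halving policy are
# assumptions of this sketch, not taken from the specification.
from typing import Callable


def search_plpmtu(probe_ok: Callable[[int], bool],
                  base_plpmtu: int,
                  max_pmtu: int,
                  step: int = 64) -> int:
    """Return the largest probe size that was confirmed.

    probe_ok(size) stands in for "send a probe of this size and report
    whether the PL confirmed its delivery".
    """
    plpmtu = base_plpmtu                  # last confirmed size
    while step >= 1:
        candidate = min(plpmtu + step, max_pmtu)
        if candidate > plpmtu and probe_ok(candidate):
            plpmtu = candidate            # success: raise the PLPMTU
        else:
            step //= 2                    # no response: try a smaller probe
    return plpmtu


# Example: a simulated path that delivers packets of up to 1400 bytes.
found = search_plpmtu(lambda size: size <= 1400, 1200, 1500)
```

   A real PL would also need to pace probes and tolerate isolated probe
   loss, which this sketch ignores.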

   Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the
   Stream Control Transmission Protocol (SCTP). SCTP utilizes probe
   packets consisting of a minimal sized HEARTBEAT chunk bundled with a
   PAD chunk as defined in [RFC4820], but RFC 4821 does not provide a
   complete specification. The present document provides the details to
   complete that specification.

   The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires
   implementations to support Classical PMTUD and states that a DCCP
   sender "MUST maintain the MPS allowed for each active DCCP session".
   It also defines the current congestion control MPS (CCMPS) supported
   by a network path. The DCCP specification recommends use of PMTUD,
   and suggests use of control packets (DCCP-Sync) as path probe
   packets, because they do not risk application data loss. The method
   defined in this specification could be used with DCCP.

   Section 6 specifies the method for a set of transports, and provides
   information to enable the implementation of PLPMTUD with other
   datagram transports and applications that use datagram transports.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Other terminology is directly copied from [RFC4821], and the
   definitions in [RFC1122].

   Actual PMTU: The Actual PMTU is the PMTU of a network path between a
      sender PL and a destination PL, which the DPLPMTUD algorithm seeks
      to determine.

   Black Hole: A Black Hole is encountered when a sender is unaware
      that packets are not being delivered to the destination endpoint.
      Two types of Black Hole are relevant to DPLPMTUD:

      Packet Black Hole: Packets encounter a Packet Black Hole when
         packets are not delivered to the destination endpoint (e.g.,
         when the sender transmits packets of a particular size with a
         previously known effective PMTU and they are discarded by the
         network).

      ICMP Black Hole: An ICMP Black Hole is encountered when the
         sender is unaware that packets are not delivered to the
         destination endpoint because PTB messages are not received by
         the originating PL sender.

   Black holed: Traffic is black-holed when the sender is unaware that
      packets are not being delivered. This could be due to a Packet
      Black Hole or an ICMP Black Hole.

   Classical Path MTU Discovery: Classical PMTUD is a process described
      in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to
      learn the largest size of unfragmented datagram that can be used
      across a network path.

   Datagram: A datagram is a transport-layer protocol data unit,
      transmitted in the payload of an IP packet.

   Effective PMTU: The Effective PMTU is the current estimated value
      for PMTU that is used by a PMTUD. This is equivalent to the
      PLPMTU derived by PLPMTUD.

   EMTU_S: The Effective MTU for sending (EMTU_S) is defined in
      [RFC1122] as "the maximum IP datagram size that may be sent, for a
      particular combination of IP source and destination addresses...".

   EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in
      [RFC1122] as the largest datagram size that can be reassembled.

   Link: A Link is a communication facility or medium over which nodes
      can communicate at the link layer, i.e., a layer below the IP
      layer. Examples are Ethernet LANs and Internet (or higher) layer
      tunnels.

   Link MTU: The Link Maximum Transmission Unit (MTU) is the size in
      bytes of the largest IP packet, including the IP header and
      payload, that can be transmitted over a link. Note that this
      could more properly be called the IP MTU, to be consistent with
      how other standards organizations use the acronym. This includes
      the IP header, but excludes link layer headers and other framing
      that is not part of IP or the IP payload. Other standards
      organizations generally define the link MTU to include the link
      layer headers.

   MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU that DPLPMTUD
      will attempt to use.

   MPS: The Maximum Packet Size (MPS) is the largest size of
      application data block that can be sent across a network path by a
      PL. In DPLPMTUD this quantity is derived from the PLPMTU by
      taking into consideration the size of the lower protocol layer
      headers. Probe packets generated by DPLPMTUD can have a size
      larger than the MPS.

   MIN_PMTU: The MIN_PMTU is the smallest size of PLPMTU that DPLPMTUD
      will attempt to use.

   Packet: A Packet is the IP header plus the IP payload.

   Packetization Layer (PL): The Packetization Layer (PL) is the layer
      of the network stack that places data into packets and performs
      transport protocol functions.

   Path: The Path is the set of links and routers traversed by a packet
      between a source node and a destination node by a particular flow.

   Path MTU (PMTU): The Path MTU (PMTU) is the minimum of the Link MTU
      of all the links forming a network path between a source node and
      a destination node.

   PTB_SIZE: The PTB_SIZE is a value reported in a validated PTB
      message that indicates the next-hop Link MTU of a router along the
      path.

   PLPMTU: The Packetization Layer PMTU is an estimate of the actual
      PMTU provided by the DPLPMTUD algorithm.

   PLPMTUD: Packetization Layer Path MTU Discovery (PLPMTUD), the
      method described in this document for datagram PLs, which is an
      extension to Classical PMTU Discovery.

   Probe packet: A probe packet is a datagram sent with a purposely
      chosen size (typically the current PLPMTU or larger) to detect if
      packets of this size can be successfully sent end-to-end across
      the network path.

3.  Features Required to Provide Datagram PLPMTUD

   TCP PLPMTUD has been defined using standard TCP protocol mechanisms.
   All of the requirements in [RFC4821] also apply to the use of the
   technique with a datagram PL. Unlike TCP, some datagram PLs require
   additional mechanisms to implement PLPMTUD.

   There are eight requirements for performing the datagram PLPMTUD
   method described in this specification:

   1.  PMTU parameters: A DPLPMTUD sender is RECOMMENDED to provide
       information about the maximum size of packet that can be
       transmitted by the sender on the local link (the local Link MTU).
       It MAY utilize similar information about the receiver when this
       is supplied (note this could be less than EMTU_R). This avoids
       implementations trying to send probe packets that cannot be
       transmitted by the local link. Too high a value could reduce
       the efficiency of the search algorithm. Some applications also
       have a maximum transport protocol data unit (PDU) size, in which
       case there is no benefit from probing for a size larger than this
       (unless a transport allows multiplexing multiple application
       PDUs into the same datagram).

   2.  PLPMTU: A datagram application using a PL not supporting
       fragmentation is REQUIRED to be able to choose the size of
       datagrams sent to the network, up to the PLPMTU, or a smaller
       value (such as the MPS) derived from this. This value is managed
       by the DPLPMTUD method.
       The PLPMTU (specified as the effective
       PMTU in Section 1 of [RFC1191]) is equivalent to the EMTU_S
       (specified in [RFC1122]).

   3.  Probe packets: On request, a DPLPMTUD sender is REQUIRED to be
       able to transmit a packet larger than the PLPMTU. This is used
       to send a probe packet. In IPv4, a probe packet MUST be sent
       with the Don't Fragment (DF) bit set in the IP header, and
       without network layer endpoint fragmentation. In IPv6, a probe
       packet is always sent without source fragmentation (as specified
       in Section 5.4 of [RFC8201]).

   4.  Processing PTB messages: A DPLPMTUD sender MAY utilize
       PTB messages received from the network layer to help identify
       when a network path does not support the current size of probe
       packet. Any received PTB message MUST be validated before it is
       used to update the PLPMTU discovery information [RFC8201]. This
       validation confirms that the PTB message was sent in response to
       a packet originated by the sender, and needs to be performed
       before the PLPMTU discovery method reacts to the PTB message. A
       PTB message MUST NOT be used to increase the PLPMTU [RFC8201].

   5.  Reception feedback: The destination PL endpoint is REQUIRED to
       provide a feedback method that indicates to the DPLPMTUD sender
       when a probe packet has been received by the destination PL
       endpoint. The mechanism needs to be robust to the possibility
       that packets could be significantly delayed along a network path.

       The local PL endpoint at the sending node is REQUIRED to pass
       this feedback to the sender DPLPMTUD method.

   6.  Probe loss recovery: It is RECOMMENDED to use probe packets that
       do not carry any user data that would require retransmission if
       lost. Most datagram transports permit this.
       If a probe packet
       contains user data requiring retransmission in case of loss, the
       PL (or layers above) is REQUIRED to arrange any retransmission/
       repair of any resulting loss. DPLPMTUD is REQUIRED to be robust
       in the case where probe packets are lost due to other reasons
       (including link transmission error, congestion).

   7.  Probing and congestion control: The DPLPMTUD sender treats
       isolated loss of a probe packet (with or without a corresponding
       PTB message) as a potential indication of a PMTU limit for the
       path. Loss of a probe packet SHOULD NOT be treated as an
       indication of congestion. The loss of a probe packet SHOULD NOT
       directly trigger a congestion control reaction [RFC4821] because
       this could result in unnecessary reduction of the sending rate.
       The interval between probe packets MUST be at least one RTT.

   8.  Shared PLPMTU state: The PLPMTU value MAY also be stored with the
       corresponding entry in the destination cache and used by other PL
       instances. The specification of PLPMTUD [RFC4821] states: "If
       PLPMTUD updates the MTU for a particular path, all Packetization
       Layer sessions that share the path representation (as described
       in Section 5.2 of [RFC4821]) SHOULD be notified to make use of
       the new MTU". Such methods MUST be robust to the wide variety of
       underlying network forwarding behaviors. Section 5.2 of
       [RFC8201] provides guidance on the caching of PMTU information
       and also the relation to IPv6 flow labels.

   In addition, the following principles are stated for design of a
   DPLPMTUD method:

   *  MPS: A method is REQUIRED to signal an appropriate MPS to the
      higher layer using the PL. The value of the MPS can change
      following a change to the path. It is RECOMMENDED that methods
      avoid forcing an application to use an arbitrarily small MPS
      (PLPMTU) for transmission while the method is searching for the
      currently supported PLPMTU.
      Datagram PLs do not necessarily
      support fragmentation of PDUs larger than the PLPMTU. A reduced
      MPS can adversely impact the performance of a datagram
      application.

   *  Path validation: It is RECOMMENDED that methods are robust to path
      changes that could have occurred since the path characteristics
      were last confirmed, and to the possibility of inconsistent path
      information being received.

   *  Datagram reordering: A method is REQUIRED to be robust to the
      possibility that a flow encounters reordering, or the traffic
      (including probe packets) is divided over more than one network
      path.

   *  When to probe: It is RECOMMENDED that methods determine whether
      the path has changed since it last measured the path. This can
      help determine when to probe the path again.

4.  DPLPMTUD Mechanisms

   This section lists the protocol mechanisms used in this
   specification.

4.1.  PLPMTU Probe Packets

   The DPLPMTUD method relies upon the PL sender being able to generate
   probe packets with a specific size. TCP is able to generate these
   probe packets by choosing to appropriately segment data being sent
   [RFC4821]. In contrast, a datagram PL that needs to construct a
   probe packet has to either request an application to send a data
   block that is larger than that generated by an application, or to
   utilize padding functions to extend a datagram beyond the size of the
   application data block. Protocols that permit exchange of control
   messages (without an application data block) MAY prefer to generate a
   probe packet by extending a control message with padding data.

   A receiver is REQUIRED to be able to distinguish an in-band data
   block from any added padding. This is needed to ensure that any
   added padding is not passed on to an application at the receiver.
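
   One way a PL could let the receiver separate a data block from added
   padding is to carry an explicit length field, as in the sketch
   below. The wire layout (a 2-byte length prefix followed by the data
   block and zero padding) and the function names are hypothetical and
   are not defined by this document. The probe size here means the size
   of the PL payload, so lower-layer headers would need to be accounted
   for separately.

```python
# Hypothetical PL framing: 2-byte big-endian data length, then the data
# block, then zero padding up to the requested probe size.
import struct

LENGTH_FIELD = struct.Struct("!H")


def build_probe(data: bytes, probe_size: int) -> bytes:
    """Pad a (possibly empty) data block out to probe_size bytes."""
    used = LENGTH_FIELD.size + len(data)
    if used > probe_size:
        raise ValueError("data block does not fit in the probe")
    return LENGTH_FIELD.pack(len(data)) + data + bytes(probe_size - used)


def extract_data(datagram: bytes) -> bytes:
    """Return only the data block, discarding any padding."""
    (length,) = LENGTH_FIELD.unpack_from(datagram)
    if LENGTH_FIELD.size + length > len(datagram):
        raise ValueError("length field exceeds datagram size")
    return datagram[LENGTH_FIELD.size:LENGTH_FIELD.size + length]


probe = build_probe(b"hello", 1400)
received = extract_data(probe)   # padding is not passed to the application
```

   An empty data block gives a probe that carries only padding, which
   avoids any need to retransmit application data if the probe is lost.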

   This results in three possible ways that a sender can create a probe
   packet:

   Probing using padding data: A probe packet that contains only
      control information together with any padding needed to inflate
      it to the size required for the probe packet. Since these probe
      packets do not carry an application-supplied data block, they do
      not typically require retransmission, although they do still
      consume network capacity and incur endpoint processing.

   Probing using application data and padding data: A probe packet that
      contains a data block supplied by an application that is combined
      with padding to inflate the length of the datagram to the size
      required for the probe packet. If the application/transport needs
      protection from the loss of this probe packet, the application/
      transport could perform transport-layer retransmission/repair of
      the data block (e.g., by retransmission after loss is detected or
      by duplicating the data block in a datagram without the padding
      data).

   Probing using application data: A probe packet that contains a data
      block supplied by an application that matches the size required
      for the probe packet. This method requests the application to
      issue a data block of the desired probe size. If the application/
      transport needs protection from the loss of an unsuccessful probe
      packet, the application/transport then needs to perform
      transport-layer retransmission/repair of the data block (e.g., by
      retransmission after loss is detected).

   A PL that uses a probe packet carrying an application data block
   could need to retransmit this application data block if the probe
   fails. This could need the PL to re-fragment the data block to a
   smaller packet size that is expected to traverse the end-to-end path
   (which could utilize endpoint network-layer or PL fragmentation when
   these are available).
598 DPLPMTUD MAY choose to use only one of these methods to simplify the 599 implementation. 601 Probe messages sent by a PL MUST contain enough information to 602 uniquely identify the probe within the Maximum Segment Lifetime, while 603 being robust to reordering and replay of probe response and PTB 604 messages. 606 4.2. Confirmation of Probed Packet Size 608 The PL needs a method to determine (confirm) when probe packets have 609 been successfully received end-to-end across a network path. 611 Transport protocols can include end-to-end methods that detect and 612 report reception of specific datagrams that they send (e.g., DCCP and 613 SCTP provide keep-alive/heartbeat features). When supported, this 614 mechanism SHOULD also be used by DPLPMTUD to acknowledge reception of 615 a probe packet. 617 A PL that does not acknowledge data reception (e.g., UDP and UDP- 618 Lite) is unable itself to detect when the packets that it sends are 619 discarded because their size is greater than the actual PMTU. These 620 PLs need to rely on an application protocol to detect this 621 loss. 623 Section 6 specifies this function for a set of IETF-specified 624 protocols. 626 4.3. Detection of Unsupported PLPMTU Size, aka Black Hole Detection 628 A PL sender needs to reduce the PLPMTU when it discovers the actual 629 PMTU supported by a network path is less than the PLPMTU. This can 630 be triggered when a validated PTB message is received, or by another 631 event that indicates the network path no longer sustains the current 632 packet size, such as a loss report from the PL, or repeated lack of 633 response to probe packets sent to confirm the PLPMTU. Detection is 634 followed by a reduction of the PLPMTU. 636 This is performed by sending packet probes of size PLPMTU to verify 637 that a network path still supports the last acknowledged PLPMTU size.
638 There are two alternative mechanisms: 640 * A PL can rely upon a mechanism implemented within the PL to detect 641 excessive loss of data sent with a specific packet size and then 642 conclude that this excessive loss could be a result of an invalid 643 PMTU (as in PLPMTUD for TCP [RFC4821]). 645 * A PL can use the DPLPMTUD probing mechanism to periodically 646 generate probe packets of the size of the current PLPMTU (e.g., 647 using the confirmation timer, see Section 5.1.1). A timer tracks 648 whether acknowledgments are received. Successive loss of probes 649 is an indication that the current path no longer supports the 650 PLPMTU (e.g., when the number of probe packets sent without 651 receiving an acknowledgment, PROBE_COUNT, becomes greater than 652 MAX_PROBES). 654 A PL MAY inhibit sending probe packets when no application data has 655 been sent since the previous probe packet. A PL preferring to use an 656 up-to-date PLPMTU once user data is sent again MAY choose to 657 continue PLPMTU discovery for each path. However, this may result in 658 additional packets being sent. 660 When the method detects the current PLPMTU is not supported, DPLPMTUD 661 sets a lower MPS. The PL then confirms that the updated PLPMTU can 662 be successfully used across the path. The PL could need to send a 663 probe packet with a size less than the size of the data block 664 generated by an application. In this case, the PL could provide a 665 way to fragment a datagram at the PL, or use a control packet as the 666 packet probe. 668 4.4. Disabling the Effect of PMTUD 670 A PL implementing this specification MUST suspend network layer 671 processing of outgoing packets that enforces a PMTU 672 [RFC1191][RFC8201] for each flow utilising DPLPMTUD, and instead use 673 DPLPMTUD to control the size of packets that are sent by a flow. 674 This removes the need for the network layer to drop or fragment sent 675 packets that have a size greater than the PMTU. 677 4.5.
Response to PTB Messages 679 This method requires the DPLPMTUD sender to validate any received PTB 680 message before using the PTB information. The response to a PTB 681 message depends on the PTB_SIZE indicated in the PTB message, the 682 state of the PLPMTUD state machine, and the IP protocol being used. 684 Section 4.5.1 first describes validation for both IPv4 ICMP 685 Unreachable messages (type 3) and ICMPv6 Packet Too Big messages, 686 both of which are referred to as PTB messages in this document. 688 4.5.1. Validation of PTB Messages 690 This section specifies utilization of PTB messages. 692 * A simple implementation MAY ignore received PTB messages and in 693 this case the PLPMTU is not updated when a PTB message is 694 received. 696 * An implementation that supports PTB messages MUST validate 697 messages before they are further processed. 699 A PL that receives a PTB message from a router or middlebox, performs 700 ICMP validation as specified in Section 5.2 of [RFC8085][RFC8201]. 701 Because DPLPMTUD operates at the PL, the PL needs to check that each 702 received PTB message is received in response to a packet transmitted 703 by the endpoint PL performing DPLPMTUD. 705 The PL MUST check the protocol information in the quoted packet 706 carried in an ICMP PTB message payload to validate the message 707 originated from the sending node. This validation includes 708 determining that the combination of the IP addresses, the protocol, 709 the source port and destination port match those returned in the 710 quoted packet - this is also necessary for the PTB message to be 711 passed to the corresponding PL. 713 The validation SHOULD utilize information that it is not simple for 714 an off-path attacker to determine [RFC8085]. For example, by 715 checking the value of a protocol header field known only to the two 716 PL endpoints. 
A datagram application that uses well-known source and 717 destination ports ought to also rely on other information to complete 718 this validation. 720 These checks are intended to provide protection from packets that 721 originate from a node that is not on the network path. A PTB message 722 that does not complete the validation MUST NOT be further utilized by 723 the DPLPMTUD method. 725 PTB messages that have been validated MAY be utilized by the DPLPMTUD 726 algorithm, but MUST NOT be used directly to set the PLPMTU. A method 727 that utilizes these PTB messages can improve the speed at which 728 the algorithm detects an appropriate PLPMTU, compared to one that 729 relies solely on probing. Section 4.5.2 describes this processing. 731 4.5.2. Use of PTB Messages 733 A set of checks is intended to provide protection from a router that 734 reports an unexpected PTB_SIZE. The PL also needs to check that the 735 indicated PTB_SIZE is less than the size used by probe packets and 736 larger than the minimum size accepted. 738 This section provides a summary of how PTB messages can be utilized. 739 This processing depends on the PTB_SIZE and the current value of a 740 set of variables: 742 PTB_SIZE < MIN_PMTU 743 * Invalid PTB_SIZE, see Section 4.5.1. 745 * PTB message ought to be discarded without further processing 746 (e.g., PLPMTU not modified). 748 * The information could be utilized as an input to trigger 749 enabling a resilience mode. 751 MIN_PMTU < PTB_SIZE < BASE_PMTU 752 * A robust PL MAY enter an error state (see Section 5.2) for an 753 IPv4 path when the PTB_SIZE reported in the PTB message is 754 larger than or equal to 68 bytes and when this is less than the 755 BASE_PMTU. 757 * A robust PL MAY enter an error state (see Section 5.2) for an 758 IPv6 path when the PTB_SIZE reported in the PTB message is 759 larger than or equal to 1280 bytes and when this is less than 760 the BASE_PMTU.
762 PTB_SIZE = PLPMTU 763 * Completes the search for a larger PLPMTU. 765 PTB_SIZE > PROBED_SIZE 766 * Inconsistent network signal. 768 * PTB message ought to be discarded without further processing 769 (e.g., PLPMTU not modified). 771 * The information could be utilized as an input to trigger 772 enabling a resilience mode. 774 BASE_PMTU <= PTB_SIZE < PLPMTU 775 * Black Hole Detection is triggered and the PLPMTU ought to be 776 set to BASE_PMTU. 778 * The PL could use the PTB_SIZE reported in the PTB message to 779 initialize a search algorithm. 781 PLPMTU < PTB_SIZE < PROBED_SIZE 782 * The PLPMTU continues to be valid, but the last PROBED_SIZE 783 searched was larger than the actual PMTU. 785 * The PLPMTU is not updated. 787 * The PL can use the reported PTB_SIZE from the PTB message as 788 the next search point when it resumes the search algorithm. 790 5. Datagram Packetization Layer PMTUD 792 This section specifies Datagram PLPMTUD (DPLPMTUD). The method can 793 be introduced at various points (as indicated with * in the figure 794 below) in the IP protocol stack to discover the PLPMTU so that an 795 application can utilize an appropriate MPS for the current network 796 path. DPLPMTUD SHOULD NOT be used by an application if it is already 797 used in a lower layer. 799 +----------------------+ 800 | Application* | 801 +-+-------+----+----+--+ 802 | | | | 803 +---+--+ +--+--+ | +-+---+ 804 | QUIC*| |UDPO*| | |SCTP*| 805 +---+--+ +--+--+ | +--+--+ 806 | | | | | 807 +-------+--+ | | | 808 | | | | 809 +-+-+--+ | 810 | UDP | | 811 +---+--+ | 812 | | 813 +--------------+-----+-+ 814 | Network Interface | 815 +----------------------+ 817 Figure 1: Examples where DPLPMTUD can be implemented 819 The central idea of DPLPMTUD is probing by a sender. Probe packets 820 are sent to find the maximum size of a user message that can be 821 completely transferred across the network path from the sender to the 822 destination.
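The PTB_SIZE cases summarized in Section 4.5.2 can be read as a single decision function applied to a PTB message that has already passed validation. The sketch below is illustrative only. The names PtbAction and classify_ptb are hypothetical, and two boundary choices are assumptions of this example rather than statements of the specification: a PTB_SIZE equal to MIN_PMTU is placed in the error-state range (following the "larger than or equal to" wording of the bullets), and a PTB_SIZE equal to PROBED_SIZE is discarded (following the requirement that PTB_SIZE be less than the probe size). Entering the error state is a MAY in the text, so a PL is free to treat that outcome differently.

```python
from enum import Enum


class PtbAction(Enum):
    DISCARD = "discard, PLPMTU not modified"
    ENTER_ERROR = "robust PL may enter the error state"
    BLACK_HOLE = "black hole detected, fall back to BASE_PMTU"
    SEARCH_COMPLETE = "search for a larger PLPMTU is complete"
    NEXT_SEARCH_POINT = "keep PLPMTU, use PTB_SIZE as next search point"


def classify_ptb(ptb_size, min_pmtu, base_pmtu, plpmtu, probed_size):
    """Map a validated PTB_SIZE to one of the Section 4.5.2 outcomes."""
    if ptb_size < min_pmtu:
        return PtbAction.DISCARD            # invalid PTB_SIZE
    if ptb_size >= probed_size:
        return PtbAction.DISCARD            # inconsistent network signal
    if ptb_size < base_pmtu:
        return PtbAction.ENTER_ERROR        # path may not support BASE_PMTU
    if ptb_size < plpmtu:
        return PtbAction.BLACK_HOLE         # BASE_PMTU <= PTB_SIZE < PLPMTU
    if ptb_size == plpmtu:
        return PtbAction.SEARCH_COMPLETE
    return PtbAction.NEXT_SEARCH_POINT      # PLPMTU < PTB_SIZE < PROBED_SIZE
```

For example, with MIN_PMTU = 68, BASE_PMTU = 1200, PLPMTU = 1400 and PROBED_SIZE = 1500, a reported PTB_SIZE of 1450 leaves the PLPMTU unchanged and suggests 1450 as the next size to probe.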
824 The following sections identify the components needed for 825 implementation, provide an overview of the phases of operation, and 826 specify the state machine and search algorithm. 828 5.1. DPLPMTUD Components 830 This section describes the timers, constants, and variables of 831 DPLPMTUD. 833 5.1.1. Timers 835 The method utilizes up to three timers: 837 PROBE_TIMER: The PROBE_TIMER is configured to expire after a 838 period longer than the maximum time to receive 839 an acknowledgment to a probe packet. This value 840 MUST NOT be smaller than 1 second, and SHOULD be 841 larger than 15 seconds. Guidance on selection 842 of the timer value is provided in Section 3.1.1 843 of the UDP Usage Guidelines [RFC8085]. 845 If the PL has a path Round Trip Time (RTT) 846 estimate and timely acknowledgments, the 847 PROBE_TIMER can be derived from the PL RTT 848 estimate. 850 PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period 851 a sender will continue to use the current 852 PLPMTU, after which it re-enters the Search 853 phase. This timer has a period of 600 seconds, 854 as recommended by PLPMTUD [RFC4821]. 856 DPLPMTUD MAY inhibit sending probe packets when 857 no application data has been sent since the 858 previous probe packet. A PL preferring to use 859 an up-to-date PMTU once user data is sent again 860 can choose to continue PMTU discovery for each 861 path. However, this may result in sending 862 additional packets. 864 CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST 865 NOT be used. For other PLs, the 866 CONFIRMATION_TIMER is configured to the period a 867 PL sender waits before confirming the current 868 PLPMTU is still supported. This is less than 869 the PMTU_RAISE_TIMER and is used to decrease the 870 PLPMTU (e.g., when a black hole is encountered). 871 Confirmation needs to be frequent enough when 872 data is flowing that the sending PL does not 873 black hole extensive amounts of traffic.
874 Guidance on selection of the timer value is 875 provided in Section 3.1.1 of the UDP Usage 876 Guidelines [RFC8085]. 878 DPLPMTUD MAY inhibit sending probe packets when 879 no application data has been sent since the 880 previous probe packet. A PL preferring to use 881 an up-to-date PMTU once user data is sent again 882 can choose to continue PMTU discovery for each 883 path. However, this may result in sending 884 additional packets. 886 An implementation could implement the various timers using a single 887 timer. 889 5.1.2. Constants 891 The following constants are defined: 893 MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT 894 counter (see Section 5.1.3). MAX_PROBES represents the 895 limit for the number of consecutive probe attempts of 896 any size. The default value of MAX_PROBES is 3. This 897 value is greater than 1 to provide robustness to 898 isolated packet loss. 900 MIN_PMTU: The MIN_PMTU is the smallest allowed probe packet size. 901 For IPv6, this value is 1280 bytes, as specified in 902 [RFC2460]. For IPv4, the minimum value is 68 bytes. 904 Note: An IPv4 router is required to be able to forward a 905 datagram of 68 bytes without further fragmentation. 906 This is the combined size of a maximum-length IPv4 header 907 (60 bytes) and the minimum fragment size of 8 bytes. In addition, 908 receivers are required to be able to reassemble 909 fragmented datagrams at least up to 576 bytes, as stated 910 in Section 3.3.3 of [RFC1122]. 912 MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU. This has to 913 be less than or equal to the minimum of the local MTU of 914 the outgoing interface and the destination PMTU for 915 receiving. An application, or PL, MAY choose a smaller 916 MAX_PMTU when there is no need to send packets larger 917 than a specific size. 919 BASE_PMTU: The BASE_PMTU is a configured size expected to work for 920 most paths. The size is equal to or larger than the 921 MIN_PMTU and smaller than the MAX_PMTU.
In the case of 922 IPv6, this value is 1280 bytes [RFC2460]. When using 923 IPv4, a size of 1200 bytes is RECOMMENDED. 925 5.1.3. Variables 927 This method utilizes a set of variables: 929 PROBED_SIZE: The PROBED_SIZE is the size of the current probe 930 packet. This is a tentative value for the PLPMTU, 931 which is awaiting confirmation by an acknowledgment. 933 PROBE_COUNT: The PROBE_COUNT is a count of the number of successive 934 unsuccessful probe packets that have been sent. Each 935 time a probe packet is acknowledged, the value is set 936 to zero. 938 The figure below illustrates the relationship between the packet size 939 constants and variables at a point of time when the DPLPMTUD 940 algorithm performs path probing to increase the size of the PLPMTU. 941 A probe packet has been sent of size PROBED_SIZE. Once this is 942 acknowledged, the PLPMTU will raise to PROBED_SIZE allowing the 943 DPLPMTUD algorithm to further increase PROBED_SIZE towards the actual 944 PMTU. 946 MIN_PMTU MAX_PMTU 947 <--------------------------------------------------> 948 | | | | 949 v | | v 950 BASE_PMTU | v Actual PMTU 951 | PROBED_SIZE 952 v 953 PLPMTU 955 Figure 2: Relationships between packet size constants and variables 957 5.1.4. Overview of DPLPMTUD Phases 959 This section provides a high-level informative view of the DPLPMTUD 960 method, by describing the movement of the method through several 961 phases of operation. More detail is available in the state machine 962 Section 5.2. 
964 +------+ 965 +------->| Base |----------------+ Connectivity 966 | +------+ | or BASE_PMTU 967 | | | confirmation failed 968 | | v 969 | | Connectivity +-------+ 970 | | and BASE_PMTU | Error | 971 | | confirmed +-------+ 972 | | | 973 | v | Consistent connectivity 974 PLPMTU | +--------+ | and BASE_PMTU 975 confirmation | | Search |<--------------+ confirmed 976 failed | +--------+ 977 | ^ | 978 | | | 979 | Raise | | Search 980 | timer | | algorithm 981 | expired | | completed 982 | | | 983 | | v 984 | +-----------------+ 985 +---| Search Complete | 986 +-----------------+ 988 Figure 3: DPLPMTUD Phases 990 Base: The Base Phase confirms connectivity to the remote 991 peer. This phase is implicit for a connection- 992 oriented PL (where it can be performed in a PL 993 connection handshake). A connectionless PL needs 994 to send an acknowledged probe packet to confirm 995 that the remote peer is reachable. The sender also 996 confirms that BASE_PMTU is supported across the 997 network path. 999 A PL that does not wish to support a path with a 1000 PLPMTU less than BASE_PMTU can simplify the phase 1001 into a single step by performing the connectivity 1002 checks with a probe of the BASE_PMTU size. 1004 Once confirmed, DPLPMTUD enters the Search Phase. 1005 If this phase fails to confirm, DPLPMTUD enters the 1006 Error Phase. 1008 Search: The Search Phase utilizes a search algorithm to 1009 send probe packets to seek to increase the PLPMTU. 1010 The algorithm concludes when it has found a 1011 suitable PLPMTU, by entering the Search Complete 1012 Phase. 1014 A PL could respond to PTB messages using the PTB to 1015 advance or terminate the search, see Section 4.5. 1017 Search Complete: The Search Complete Phase is entered when the 1018 PLPMTU is supported across the network path. A PL 1019 can use a CONFIRMATION_TIMER to periodically repeat 1020 a probe packet for the current PLPMTU size. 
If the 1021 sender is unable to confirm reachability (e.g., if 1022 the CONFIRMATION_TIMER expires) or the PL signals a 1023 lack of reachability, DPLPMTUD enters the Base 1024 Phase. 1026 The PMTU_RAISE_TIMER is used to periodically resume 1027 the Search Phase to discover if the PLPMTU can be 1028 raised. Black Hole Detection or receipt of a 1029 validated PTB message (see Section 4.5.1) can cause 1030 the sender to enter the Base Phase. 1032 Error: The Error Phase is entered when there is 1033 conflicting or invalid PLPMTU information for the 1034 path (e.g., a failure to support the BASE_PMTU) that 1035 causes DPLPMTUD to be unable to progress, and the 1036 PLPMTU is lowered. 1038 DPLPMTUD remains in the Error Phase until a 1039 consistent view of the path can be discovered and 1040 it has also been confirmed that the path supports 1041 the BASE_PMTU (or DPLPMTUD is suspended). 1043 An implementation that only reduces the PLPMTU to a suitable size 1044 would be sufficient to ensure reliable operation, but can be very 1045 inefficient when the actual PMTU changes or when the method (for 1046 whatever reason) makes a suboptimal choice for the PLPMTU. 1048 A full implementation of DPLPMTUD provides an algorithm enabling the 1049 DPLPMTUD sender to increase the PLPMTU following a change in the 1050 characteristics of the path, such as when a link is reconfigured with 1051 a larger MTU, or when there is a change in the set of links traversed 1052 by an end-to-end flow (e.g., after a routing or path fail-over 1053 decision). 1055 5.2. State Machine 1057 A state machine for DPLPMTUD is depicted in Figure 4. If multipath 1058 or multihoming is supported, a state machine is needed for each path. 1060 Note: Not all changes are shown, to simplify the diagram.
1062 | | 1063 | Start | PL indicates loss 1064 | | of connectivity 1065 v v 1066 +---------------+ +---------------+ 1067 | DISABLED | | ERROR | 1068 +---------------+ PROBE_TIMER expiry: +---------------+ 1069 | PL indicates PROBE_COUNT = MAX_PROBES or ^ | 1070 | connectivity PTB: PTB_SIZE < BASE_PMTU | | 1071 +--------------------+ +---------------+ | 1072 | | | 1073 v | BASE_PMTU Probe | 1074 +---------------+ acked | 1075 | BASE |----------------------+ 1076 +---------------+ | 1077 Black hole detected or ^ | ^ ^ Black hole detected or | 1078 PTB: PTB_SIZE < PLPMTU | | | | PTB: PTB_SIZE < PLPMTU | 1079 +--------------------+ | | +--------------------+ | 1080 | +----+ | | 1081 | PROBE_TIMER expiry: | | 1082 | PROBE_COUNT < MAX_PROBES | | 1083 | | | 1084 | PMTU_RAISE_TIMER expiry | | 1085 | +-----------------------------------------+ | | 1086 | | | | | 1087 | | v | v 1088 +---------------+ +---------------+ 1089 |SEARCH_COMPLETE| | SEARCHING | 1090 +---------------+ +---------------+ 1091 | ^ ^ | | ^ 1092 | | | | | | 1093 | | +-----------------------------------------+ | | 1094 | | MAX_PMTU Probe acked or PROBE_TIMER | | 1095 | | expiry: PROBE_COUNT = MAX_PROBES or | | 1096 +----+ PTB: PTB_SIZE = PLPMTU +----+ 1097 CONFIRMATION_TIMER expiry: PROBE_TIMER expiry: 1098 PROBE_COUNT < MAX_PROBES or PROBE_COUNT < MAX_PROBES or 1099 PLPMTU Probe acked Probe acked or PTB: 1100 PLPMTU < PTB_SIZE < PROBED_SIZE 1102 Figure 4: State machine for Datagram PLPMTUD 1104 The following states are defined: 1106 DISABLED: The DISABLED state is the initial state before 1107 probing has started. It is also entered from any 1108 other state, when the PL indicates loss of 1109 connectivity. This state is left, once the PL 1110 indicates connectivity to the remote PL. 
1112 BASE: The BASE state is used to confirm that the 1113 BASE_PMTU size is supported by the network path and 1114 is designed to allow an application to continue 1115 working when there are transient reductions in the 1116 actual PMTU. It also seeks to avoid long periods 1117 where traffic is black holed while searching for a 1118 larger PLPMTU. 1120 On entry, the PROBED_SIZE is set to the BASE_PMTU 1121 size and the PROBE_COUNT is set to zero. 1123 Each time a probe packet is sent, the PROBE_TIMER 1124 is started. The state is exited when the probe 1125 packet is acknowledged, and the PL sender enters 1126 the SEARCHING state. 1128 The state is also left when the PROBE_COUNT reaches 1129 MAX_PROBES or a received PTB message is validated. 1130 This causes the PL sender to enter the ERROR state. 1132 SEARCHING: The SEARCHING state is the main probing state. 1133 This state is entered when probing for the 1134 BASE_PMTU was successful. 1136 Each time a probe packet is acknowledged, the 1137 PROBE_COUNT is set to zero, the PLPMTU is set to 1138 the PROBED_SIZE and then the PROBED_SIZE is 1139 increased using the search algorithm. 1141 When a probe packet is sent and not acknowledged 1142 within the period of the PROBE_TIMER, the 1143 PROBE_COUNT is incremented and a new probe packet 1144 is transmitted. The state is exited when the 1145 PROBE_COUNT reaches MAX_PROBES, a received PTB 1146 message is validated, a probe of size MAX_PMTU is 1147 acknowledged, or a black hole is detected. 1149 SEARCH_COMPLETE: The SEARCH_COMPLETE state indicates a successful 1150 end to the SEARCHING state. DPLPMTUD remains in 1151 this state until either the PMTU_RAISE_TIMER 1152 expires, a received PTB message is validated, or a 1153 black hole is detected. 1155 When DPLPMTUD uses an unacknowledged PL and is in 1156 the SEARCH_COMPLETE state, a CONFIRMATION_TIMER 1157 periodically resets the PROBE_COUNT and schedules a 1158 probe packet with the size of the PLPMTU. 
If 1159 MAX_PROBES successive PLPMTU-sized probes fail to 1160 be acknowledged, the method enters the BASE state. 1161 When used with an acknowledged PL (e.g., SCTP), 1162 DPLPMTUD SHOULD NOT continue to generate PLPMTU 1163 probes in this state. 1165 ERROR: The ERROR state represents the case where either 1166 the network path is not known to support a PLPMTU 1167 of at least the BASE_PMTU size or there is 1168 contradictory information about the network path 1169 that would otherwise result in excessive variation 1170 in the MPS signalled to the higher layer. The 1171 state implements a method to mitigate oscillation 1172 in the state-event engine. The PL signals a 1173 conservative value of the MPS to the higher 1174 layer. The state is exited when packet probes 1175 no longer detect the error or when the PL indicates 1176 that connectivity has been lost. 1178 Implementations are permitted to enable endpoint 1179 fragmentation if the DPLPMTUD is unable to validate 1180 MIN_PMTU within PROBE_COUNT probes. If DPLPMTUD is 1181 unable to validate MIN_PMTU, the implementation 1182 should transition to the DISABLED state. 1184 Note: MIN_PMTU may be identical to BASE_PMTU, 1185 simplifying the actions in this state. 1187 5.3. Search to Increase the PLPMTU 1189 This section describes the algorithms used by DPLPMTUD to search for 1190 a larger PLPMTU. 1192 5.3.1. Probing for a Larger PLPMTU 1194 Implementations use a search algorithm across the search range to 1195 determine whether a larger PLPMTU can be supported across a network 1196 path. 1198 The method discovers the search range by confirming the minimum 1199 PLPMTU and then using the probe method to select a PROBED_SIZE less 1200 than or equal to MAX_PMTU. MAX_PMTU is the minimum of the local MTU 1201 and EMTU_R (learned from the remote endpoint). The MAX_PMTU MAY be 1202 reduced by an application that sets a maximum to the size of 1203 datagrams it will send.
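The probing procedure of Section 5.3.1 can be sketched as a loop over candidate sizes, bounded by MAX_PMTU and by MAX_PROBES consecutive failures. The sketch below is not a definitive implementation. The function name search_plpmtu, the send_probe callable (which stands in for sending a probe and waiting for either an acknowledgment or PROBE_TIMER expiry), and the fixed linear step are all assumptions of this example; the specification leaves the choice of probe sizes to the implementation (see Section 5.3.2).

```python
def search_plpmtu(send_probe, base_pmtu, local_mtu, emtu_r,
                  app_limit=None, step=32, max_probes=3):
    """Return the largest size confirmed by a simple linear search.

    send_probe(size) is a hypothetical callable: it returns True when the
    probe of that size is acknowledged and False when PROBE_TIMER expires.
    """
    # MAX_PMTU is the minimum of the local MTU and EMTU_R, optionally
    # reduced by an application limit on datagram size.
    max_pmtu = min(local_mtu, emtu_r)
    if app_limit is not None:
        max_pmtu = min(max_pmtu, app_limit)

    plpmtu = base_pmtu      # assumed already confirmed in the BASE state
    probe_count = 0
    probed_size = min(plpmtu + step, max_pmtu)

    while plpmtu < max_pmtu and probe_count < max_probes:
        if send_probe(probed_size):
            # Acknowledged: PROBED_SIZE becomes the PLPMTU, then grow it.
            probe_count = 0
            plpmtu = probed_size
            probed_size = min(plpmtu + step, max_pmtu)
        else:
            # Timer expired: count the failure and retry the same size.
            probe_count += 1

    return plpmtu           # the caller then enters SEARCH_COMPLETE
```

A caller could model a path that supports 1400-byte packets with search_plpmtu(lambda size: size <= 1400, 1200, 1500, 9000, step=100). A fixed step is the simplest possible choice; a table of common PMTU sizes or a binary search would usually reach a good PLPMTU with fewer probes.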
1205 The PROBE_COUNT is initialized to zero when the first probe with a 1206 size greater than or equal to the PLPMTU is sent. A timer is used by 1207 the search algorithm to trigger the sending of probe packets of size 1208 PROBED_SIZE, larger than the PLPMTU. Each probe packet successfully 1209 sent to the remote peer is confirmed by acknowledgment at the PL 1210 (see Section 4.1). 1212 Each time a probe packet is sent to the destination, the PROBE_TIMER 1213 is started. The timer is canceled when the PL receives 1214 acknowledgment that the probe packet has been successfully sent 1215 across the path (see Section 4.1). This confirms that the PROBED_SIZE is 1216 supported, and the PROBED_SIZE value is then assigned to the PLPMTU. 1217 The search algorithm can continue to send subsequent probe packets of 1218 an increasing size. 1220 If the timer expires before a probe packet is acknowledged, the probe 1221 has failed to confirm the PROBED_SIZE. Each time the PROBE_TIMER 1222 expires, the PROBE_COUNT is incremented, the PROBE_TIMER is 1223 reinitialized, and a new probe of the same size or any other size 1224 (determined by the search algorithm) can be sent. The maximum number 1225 of consecutive failed probes is configured (MAX_PROBES). If the 1226 value of the PROBE_COUNT reaches MAX_PROBES, probing will stop, and 1227 the PL sender enters the SEARCH_COMPLETE state. 1229 5.3.2. Selection of Probe Sizes 1231 The search algorithm needs to determine a minimum useful gain in 1232 PLPMTU. It would not be constructive for a PL sender to attempt to 1233 probe for all sizes. This would incur unnecessary load on the path 1234 and would have the undesirable effect of slowing the time to reach a more 1235 optimal MPS. Implementations SHOULD select the set of probe packet 1236 sizes to maximize the gain in PLPMTU from each search step. 1238 Implementations could optimize the search procedure by selecting step 1239 sizes from a table of common PMTU sizes.
When selecting the 1240 appropriate next size to search, an implementer ought to also 1241 consider that there can be common sizes of MPS that applications seek 1242 to use, and there could be common sizes of MTU used within the 1243 network. 1245 5.3.3. Resilience to Inconsistent Path Information 1247 A decision to increase the PLPMTU needs to be resilient to the 1248 possibility that information learned about the network path is 1249 inconsistent. A path is inconsistent when, for example, probe 1250 packets are lost due to other reasons (i.e., not packet size) or due 1251 to frequent path changes. Frequent path changes could occur due to 1252 unexpected "flapping" - where some packets from a flow pass along one 1253 path, but other packets follow a different path with different 1254 properties. 1256 A PL sender is able to detect inconsistency from the sequence of 1257 PLPMTU probes that it sends or the sequence of PTB messages that it 1258 receives. When inconsistent path information is detected, a PL 1259 sender could use an alternate search mode that clamps the offered MPS 1260 to a smaller value for a period of time. This avoids unnecessary 1261 loss of packets due to MTU limitation. 1263 5.4. Robustness to Inconsistent Paths 1265 Some paths could be unable to sustain packets of the BASE_PMTU size. 1266 To be robust to these paths, an implementation could implement the 1267 ERROR state. This allows fallback to a smaller than desired PLPMTU, 1268 rather than suffering connectivity failure. This could utilize methods 1269 such as endpoint IP fragmentation to enable the PL sender to 1270 communicate using packets smaller than the BASE_PMTU. 1272 6. Specification of Protocol-Specific Methods 1274 DPLPMTUD requires protocol-specific details to be specified for each 1275 PL that is used. 1277 The first subsection provides guidance on how to implement the 1278 DPLPMTUD method as a part of an application using UDP or UDP-Lite.
1279 The guidance also applies to other datagram services that do not 1280 include a specific transport protocol (such as a tunnel 1281 encapsulation). The following subsections describe how DPLPMTUD can 1282 be implemented as a part of the transport service, allowing 1283 applications using the service to benefit from discovery of the 1284 PLPMTU without themselves needing to implement this method. 1286 6.1. Application Support for DPLPMTUD with UDP or UDP-Lite 1288 The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do 1289 not define a method in the RFC-series that supports PLPMTUD. In 1290 particular, the UDP transport does not provide the transport-layer 1291 features needed to implement datagram PLPMTUD. 1293 The DPLPMTUD method can be implemented as a part of an application 1294 built directly or indirectly on UDP or UDP-Lite, but relies on 1295 higher-layer protocol features to implement the method [RFC8085]. 1297 Some primitives used by DPLPMTUD might not be available via the 1298 Datagram API (e.g., the ability to access the PLPMTU cache, or 1299 interpret received PTB messages). 1301 In addition, it is desirable that PMTU discovery is not performed by 1302 multiple protocol layers. An application SHOULD avoid using DPLPMTUD 1303 when the underlying transport system provides this capability. Using 1304 a common method for managing the PLPMTU has benefits, both in the 1305 ability to share state between different processes and in opportunities 1306 to coordinate probing. 1308 6.1.1. Application Request 1310 An application needs an application-layer protocol mechanism (such as 1311 a message acknowledgment method) that solicits a response from a 1312 destination endpoint.
The method SHOULD allow the sender to check 1313 the value returned in the response to provide additional protection 1314 from off-path insertion of data [RFC8085]. Suitable methods include a 1315 parameter known only to the two endpoints, such as a session ID or 1316 initialized sequence number. 1318 6.1.2. Application Response 1320 An application needs an application-layer protocol mechanism to 1321 communicate the response from the destination endpoint. This 1322 response may indicate successful reception of the probe across the 1323 path, but could also indicate that some (or all) packets have failed 1324 to reach the destination. 1326 6.1.3. Sending Application Probe Packets 1328 A probe packet may carry an application data block, but the 1329 successful transmission of this data is at risk when used for 1330 probing. Some applications may prefer to use a probe packet that 1331 does not carry an application data block to avoid disruption to data 1332 transfer. 1334 6.1.4. Initial Connectivity 1336 An application that does not have other higher-layer information 1337 confirming connectivity with the remote peer SHOULD implement a 1338 connectivity mechanism using acknowledged probe packets before 1339 entering the BASE state. 1341 6.1.5. Validating the Path 1343 An application that does not have other higher-layer information 1344 confirming correct delivery of datagrams SHOULD implement the 1345 CONFIRMATION_TIMER to periodically send probe packets while in the 1346 SEARCH_COMPLETE state. 1348 6.1.6. Handling of PTB Messages 1350 An application that is able and wishes to receive PTB messages MUST 1351 perform ICMP validation as specified in Section 5.2 of [RFC8085]. 1352 This requires the application to check each received PTB 1353 message to validate that it was received in response to transmitted 1354 traffic and that the reported PTB_SIZE is less than the current 1355 probed size (see Section 4.5.2).
   A validated PTB message MAY be used
   as input to the DPLPMTUD algorithm, but MUST NOT be used directly to
   set the PLPMTU.

6.2. DPLPMTUD for SCTP

   Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing
   method for SCTP. It recommends that the PAD chunk, defined in
   [RFC4820], be attached to a minimum-length HEARTBEAT chunk to build
   a probe packet. This enables probing without affecting the transfer
   of user messages and without interfering with congestion control.
   This is preferred to using DATA chunks (with padding as required) as
   path probes.

6.2.1. SCTP/IPv4 and SCTP/IPv6

6.2.1.1. Initial Connectivity

   The base protocol is specified in [RFC4960]. This provides an
   acknowledged PL. A sender can therefore enter the BASE state as soon
   as connectivity has been confirmed.

6.2.1.2. Sending SCTP Probe Packets

   Probe packets consist of an SCTP common header followed by a
   HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control
   the length of the probe packet. The HEARTBEAT chunk is used to
   trigger the sending of a HEARTBEAT ACK chunk. The reception of the
   HEARTBEAT ACK chunk acknowledges reception of a successful probe.

   The HEARTBEAT chunk carries a Heartbeat Information parameter that
   should include, besides the information suggested in [RFC4960], the
   probe size, which is the size of the complete datagram. The size of
   the PAD chunk is therefore computed by reducing the probing size by
   the IPv4 or IPv6 header size, the SCTP common header, the HEARTBEAT
   request and the PAD chunk header. The payload of the PAD chunk
   contains arbitrary data.

   To avoid fragmentation of retransmitted data, probing starts right
   after the PL handshake, before data is sent.
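
   As a non-normative illustration of the PAD chunk sizing described
   earlier in this section, the arithmetic might be sketched as follows.
   The function name pad_payload_size is hypothetical. The sketch
   assumes an IPv4 header without options or an IPv6 header without
   extension headers, and assumes that the probe size is chosen so that
   the resulting PAD payload needs no further 4-byte alignment.

```python
IPV4_HEADER = 20         # assumption: no IPv4 options
IPV6_HEADER = 40         # assumption: no IPv6 extension headers
SCTP_COMMON_HEADER = 12  # source port, destination port, vtag, checksum
CHUNK_HEADER = 4         # type, flags, length (HEARTBEAT and PAD alike)
PARAM_HEADER = 4         # Heartbeat Information parameter type and length


def pad_payload_size(probe_size, ip_header, heartbeat_info_len):
    """Return the number of PAD payload bytes for a probe of probe_size.

    probe_size         -- size of the complete datagram being probed
    ip_header          -- IPV4_HEADER or IPV6_HEADER
    heartbeat_info_len -- bytes of Heartbeat Information carried
    """
    heartbeat = CHUNK_HEADER + PARAM_HEADER + heartbeat_info_len
    # SCTP chunks are padded to a multiple of 4 bytes on the wire.
    heartbeat += -heartbeat % 4
    overhead = ip_header + SCTP_COMMON_HEADER + heartbeat + CHUNK_HEADER
    if overhead > probe_size:
        raise ValueError("probe size is too small to hold the headers")
    return probe_size - overhead
```

   For example, under these assumptions a 1400-byte probe over IPv4
   with 8 bytes of Heartbeat Information leaves 1348 bytes for the PAD
   chunk payload, and the same probe over IPv6 leaves 1328 bytes.
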
   Assuming this behavior
   (i.e., the PMTU is smaller than or equal to the interface MTU), this
   process will take a few round-trip time periods depending on the
   number of PMTU sizes probed. The Heartbeat timer can be used to
   implement the PROBE_TIMER.

6.2.1.3. Validating the Path with SCTP

   Since SCTP provides an acknowledged PL, a sender MUST NOT implement
   the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

6.2.1.4. PTB Message Handling by SCTP

   Normal ICMP validation MUST be performed as specified in Appendix C
   of [RFC4960]. This requires that the first 8 bytes of the SCTP
   common header are quoted in the payload of the PTB message, which can
   be the case for ICMPv4 and is normally the case for ICMPv6.

   When a PTB message has been validated, the PTB_SIZE reported in the
   PTB message SHOULD be used with the DPLPMTUD algorithm, provided
   that the reported PTB_SIZE is less than the current probe size (see
   Section 4.5).

6.2.2. DPLPMTUD for SCTP/UDP

   The UDP encapsulation of SCTP is specified in [RFC6951].

6.2.2.1. Initial Connectivity

   A sender can enter the BASE state as soon as SCTP connectivity has
   been confirmed.

6.2.2.2. Sending SCTP/UDP Probe Packets

   Packet probing can be performed as specified in Section 6.2.1.2. The
   maximum payload is reduced by the 8 bytes of the UDP header, which
   has to be considered when filling the PAD chunk.

6.2.2.3. Validating the Path with SCTP/UDP

   Since SCTP provides an acknowledged PL, a sender MUST NOT implement
   the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

6.2.2.4. Handling of PTB Messages by SCTP/UDP

   ICMP validation MUST be performed for PTB messages as specified in
   Appendix C of [RFC4960].
   This requires that the first 8 bytes of the
   SCTP common header are contained in the PTB message, which can be the
   case for ICMPv4 (but note the UDP header also consumes a part of the
   quoted packet header) and is normally the case for ICMPv6. When the
   validation is completed, the PTB_SIZE indicated in the PTB message
   SHOULD be used with the DPLPMTUD algorithm, provided that the
   reported PTB_SIZE is less than the current probe size.

6.2.3. DPLPMTUD for SCTP/DTLS

   The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is
   specified in [RFC8261]. It is used for data channels in WebRTC
   implementations.

6.2.3.1. Initial Connectivity

   A sender can enter the BASE state as soon as SCTP connectivity has
   been confirmed.

6.2.3.2. Sending SCTP/DTLS Probe Packets

   Packet probing can be done as specified in Section 6.2.1.2.

6.2.3.3. Validating the Path with SCTP/DTLS

   Since SCTP provides an acknowledged PL, a sender MUST NOT implement
   the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

6.2.3.4. Handling of PTB Messages by SCTP/DTLS

   It is not possible to perform ICMP validation as specified in
   [RFC4960], since even if the ICMP message payload contains sufficient
   information, the reflected SCTP common header would be encrypted.
   Therefore it is not possible to process PTB messages at the PL.

6.3. DPLPMTUD for QUIC

   QUIC [I-D.ietf-quic-transport] is a UDP-based transport that provides
   reception feedback. The UDP payload includes the QUIC packet header,
   protected payload, and any authentication fields. QUIC depends on a
   PMTU of at least 1280 bytes.

   Section 14.1 of [I-D.ietf-quic-transport] describes the path
   considerations when sending QUIC packets. It recommends the use of
   PADDING frames to build the probe packet.
   Pure probe-only packets
   are constructed with PADDING frames and PING frames to create a
   padding-only packet that will elicit an acknowledgement. Such
   padding-only packets enable probing without affecting the transfer of
   other QUIC frames.

   The recommendation for QUIC endpoints implementing DPLPMTUD is that
   an MPS is maintained for each combination of local and remote IP
   addresses [I-D.ietf-quic-transport]. If a QUIC endpoint determines
   that the PMTU between any pair of local and remote IP addresses has
   fallen below an acceptable MPS, it needs to immediately cease sending
   QUIC packets on the affected path. This could result in termination
   of the connection if an alternative path cannot be found
   [I-D.ietf-quic-transport].

6.3.1. Initial Connectivity

   The base protocol is specified in [I-D.ietf-quic-transport]. This
   provides an acknowledged PL. A sender can therefore enter the BASE
   state as soon as connectivity has been confirmed.

6.3.2. Sending QUIC Probe Packets

   A probe packet consists of a QUIC Header and a payload containing
   PADDING Frames and a PING Frame. PADDING Frames are a single octet
   (0x00) and several of these can be used to create a probe packet of
   size PROBED_SIZE. QUIC provides an acknowledged PL; a sender can
   therefore enter the BASE state as soon as connectivity has been
   confirmed.

   The current specification of QUIC sets the following:

   * BASE_PMTU: 1200 bytes. A QUIC sender needs to pad initial packets
     to 1200 bytes to confirm the path can support packets of a useful
     size.

   * MIN_PMTU: 1200 bytes. A QUIC sender that determines the PMTU has
     fallen below 1200 bytes MUST immediately stop sending on the
     affected path.

6.3.3. Validating the Path with QUIC

   QUIC provides an acknowledged PL. A sender therefore MUST NOT
   implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

6.3.4. Handling of PTB Messages by QUIC

   QUIC operates over the UDP transport, and the guidelines on ICMP
   validation as specified in Section 5.2 of [RFC8085] therefore apply.
   In addition to UDP port validation, QUIC can validate an ICMP message
   by looking for valid Connection IDs in the quoted packet.

7. Acknowledgements

   This work was partially funded by the European Union's Horizon 2020
   research and innovation programme under grant agreement No. 644334
   (NEAT). The views expressed are solely those of the author(s).

8. IANA Considerations

   This memo includes no request to IANA.

   If there are no requirements for IANA, the section will be removed
   during conversion into an RFC by the RFC Editor.

9. Security Considerations

   The security considerations for the use of UDP and SCTP are provided
   in the referenced RFCs. The interval between individual probe
   packets MUST be at least one RTT, and the interval between rounds of
   probing is determined by the PMTU_RAISE_TIMER.

   A PL sender needs to ensure that the method used to confirm reception
   of probe packets offers protection from off-path attackers injecting
   packets into the path. This protection is provided in IETF-defined
   protocols (e.g., TCP, SCTP) using a randomly-initialized sequence
   number. A description of one way to do this when using UDP is
   provided in Section 5.1 of [RFC8085].

   There are cases where ICMP Packet Too Big (PTB) messages are not
   delivered due to policy, configuration or equipment design (see
   Section 1.1). This method therefore does not rely upon PTB messages
   being received, but is able to utilize these when they are received
   by the sender. PTB messages could potentially be used to cause a
   node to inappropriately reduce the PLPMTU.
   A node supporting
   DPLPMTUD MUST therefore appropriately validate the payload of PTB
   messages to ensure these are received in response to transmitted
   traffic (i.e., a reported error condition that corresponds to a
   datagram actually sent by the path layer, see Section 4.5.1).

   An on-path attacker able to create a PTB message could forge PTB
   messages that include a valid quoted IP packet. Such an attack could
   be used to drive down the PLPMTU. The method mitigates such attacks
   in two ways. First, a PL sender never reduces the PLPMTU below the
   base size solely in response to receiving a PTB message. This is
   achieved by first entering the BASE state when such a message is
   received. Second, the design does not require processing of PTB
   messages; a PL sender could therefore suspend processing of PTB
   messages (e.g., in a robustness mode after detecting that subsequent
   probes actually confirm that a size larger than the PTB_SIZE is
   supported by a path).

   Parallel forwarding paths SHOULD be considered. Section 5.4
   identifies the need for robustness in the method when the path
   information may be inconsistent.

   A node performing DPLPMTUD could experience conflicting information
   about the size of supported probe packets. This could occur when
   multiple paths are concurrently in use and these exhibit different
   PMTUs. If not considered, this could result in data being
   black holed when the PLPMTU is larger than the smallest PMTU across
   the current paths.

10. References

10.1. Normative References

   [I-D.ietf-quic-transport]
              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
              and Secure Transport", draft-ietf-quic-transport-20 (work
              in progress), 23 April 2019.

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              DOI 10.17487/RFC0768, August 1980.

   [RFC1191]  Mogul, J.C. and S.E. Deering, "Path MTU discovery",
              RFC 1191, DOI 10.17487/RFC1191, November 1990.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
              December 1998.

   [RFC3828]  Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed.,
              and G. Fairhurst, Ed., "The Lightweight User Datagram
              Protocol (UDP-Lite)", RFC 3828, DOI 10.17487/RFC3828, July
              2004.

   [RFC4820]  Tuexen, M., Stewart, R., and P. Lei, "Padding Chunk and
              Parameter for the Stream Control Transmission Protocol
              (SCTP)", RFC 4820, DOI 10.17487/RFC4820, March 2007.

   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
              RFC 4960, DOI 10.17487/RFC4960, September 2007.

   [RFC6951]  Tuexen, M. and R. Stewart, "UDP Encapsulation of Stream
              Control Transmission Protocol (SCTP) Packets for End-Host
              to End-Host Communication", RFC 6951,
              DOI 10.17487/RFC6951, May 2013.

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

   [RFC8261]  Tuexen, M., Stewart, R., Jesup, R., and S. Loreto,
              "Datagram Transport Layer Security (DTLS) Encapsulation of
              SCTP Packets", RFC 8261, DOI 10.17487/RFC8261, November
              2017.

10.2. Informative References

   [I-D.ietf-intarea-tunnels]
              Touch, J. and M.
              Townsley, "IP Tunnels in the Internet
              Architecture", draft-ietf-intarea-tunnels-10 (work in
              progress), 12 September 2019.

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, DOI 10.17487/RFC0792, September 1981.

   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122,
              DOI 10.17487/RFC1122, October 1989.

   [RFC1812]  Baker, F., Ed., "Requirements for IP Version 4 Routers",
              RFC 1812, DOI 10.17487/RFC1812, June 1995.

   [RFC2923]  Lahey, K., "TCP Problems with Path MTU Discovery",
              RFC 2923, DOI 10.17487/RFC2923, September 2000.

   [RFC4340]  Kohler, E., Handley, M., and S. Floyd, "Datagram
              Congestion Control Protocol (DCCP)", RFC 4340,
              DOI 10.17487/RFC4340, March 2006.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, Ed., "Internet
              Control Message Protocol (ICMPv6) for the Internet
              Protocol Version 6 (IPv6) Specification", STD 89,
              RFC 4443, DOI 10.17487/RFC4443, March 2006.

   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007.

   [RFC4890]  Davies, E. and J. Mohacsi, "Recommendations for Filtering
              ICMPv6 Messages in Firewalls", RFC 4890,
              DOI 10.17487/RFC4890, May 2007.

Appendix A. Revision Notes

   Note to RFC-Editor: please remove this entire section prior to
   publication.

   Individual draft -00:

   * Comments and corrections are welcome directly to the authors or
     via the IETF TSVWG working group mailing list.

   * This update is proposed for WG comments.

   Individual draft -01:

   * Contains the first representation of the algorithm, showing the
     states and timers

   * This update is proposed for WG comments.

   Individual draft -02:

   * Contains updated representation of the algorithm, and textual
     corrections.

   * The text describing when to set the effective PMTU has not yet
     been validated by the authors

   * To determine security to off-path-attacks: We need to decide
     whether a received PTB message SHOULD/MUST be validated? The text
     on how to handle a PTB message indicating a link MTU larger than
     the probe has not yet been validated by the authors

   * No text currently describes how to handle inconsistent results
     from arbitrary re-routing along different parallel paths

   * This update is proposed for WG comments.

   Working Group draft -00:

   * This draft follows a successful adoption call for TSVWG

   * There is still work to complete, please comment on this draft.

   Working Group draft -01:

   * This draft includes improved introduction.

   * The draft is updated to require ICMP validation prior to accepting
     PTB messages - this to be confirmed by WG

   * Section added to discuss Selection of Probe Size - methods to be
     evaluated and recommendations to be considered

   * Section added to align with work proposed in the QUIC WG.

   Working Group draft -02:

   * The draft was updated based on feedback from the WG, and a
     detailed review by Magnus Westerlund.

   * The document updates RFC 4821.

   * Requirements list updated.

   * Added more explicit discussion of a simpler black-hole detection
     mode.

   * This draft includes reorganisation of the section on IETF
     protocols.

   * Added more discussion of implementation within an application.

   * Added text on flapping paths.

   * Replaced 'effective MTU' with new term PLPMTU.

   Working Group draft -03:

   * Updated figures

   * Added more discussion on blackhole detection

   * Added figure describing just blackhole detection

   * Added figure relating MPS sizes

   Working Group draft -04:

   * Described phases and named these consistently.

   * Corrected transition from confirmation directly to the search
     phase (Base has been checked).

   * Redrawn state diagrams.

   * Renamed BASE_MTU to BASE_PMTU (because it is a base for the PMTU).

   * Clarified Error state.

   * Clarified suspending DPLPMTUD.

   * Verified normative text in requirements section.

   * Removed duplicate text.

   * Changed all text to refer to /packet probe/probe packet/
     /validation/verification/ added term /Probe Confirmation/ and
     clarified BlackHole detection.

   Working Group draft -05:

   * Updated security considerations.

   * Feedback after speaking with Joe Touch helped improve UDP-Options
     description.

   Working Group draft -06:

   * Updated description of ICMP issues in section 1.1

   * Update to description of QUIC.

   Working group draft -07:

   * Moved description of the PTB processing method from the PTB
     requirements section.

   * Clarified what is performed in the PTB validation check.

   * Updated security consideration to explain PTB security without
     needing to read the rest of the document.

   * Reformatted state machine diagram

   Working group draft -08:

   * Moved to rfcxml v3+

   * Rendered diagrams to svg in html version.

   * Removed Appendix A. Event-driven state changes.

   * Removed section on DPLPMTUD with UDP Options.

   * Shortened the description of phases.

   Working group draft -09:

   * Remove final mention of UDP Options

   * Add Initial Connectivity sections to each PL

   * Add to disable outgoing pmtu enforcement of packets

   Working group draft -10:

   * Address comments from Lars Eggert

   * Reinforce that PROBE_COUNT is successive attempts to probe for any
     size

   * Redefine MAX_PROBES to 3

   * Address PTB_SIZE of 0 or less than MIN_PMTU

Authors' Addresses

   Godred Fairhurst
   University of Aberdeen
   School of Engineering, Fraser Noble Building
   Aberdeen
   AB24 3UE
   United Kingdom

   Email: gorry@erg.abdn.ac.uk

   Tom Jones
   University of Aberdeen
   School of Engineering, Fraser Noble Building
   Aberdeen
   AB24 3UE
   United Kingdom

   Email: tom@erg.abdn.ac.uk

   Michael Tuexen
   Muenster University of Applied Sciences
   Stegerwaldstrasse 39
   48565 Steinfurt
   Germany

   Email: tuexen@fh-muenster.de

   Irene Ruengeler
   Muenster University of Applied Sciences
   Stegerwaldstrasse 39
   48565 Steinfurt
   Germany

   Email: i.ruengeler@fh-muenster.de

   Timo Voelker
   Muenster University of Applied Sciences
   Stegerwaldstrasse 39
   48565 Steinfurt
   Germany

   Email: timo.voelker@fh-muenster.de