idnits 2.17.1 draft-ietf-6man-rfc1981bis-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 104: '... IPv6 nodes SHOULD implement Path MT...' RFC 2119 keyword, line 110: '...th MTU Discovery MUST use the IPv6 min...' RFC 2119 keyword, line 249: '... Nodes SHOULD appropriately validate...' RFC 2119 keyword, line 255: '...nimum link MTU, it MUST discard it. A...' RFC 2119 keyword, line 256: '... node MUST NOT reduce its estimate o...' (10 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 179 has weird spacing: '...scovery proce...' == Line 705 has weird spacing: '...ent bit the...' == Line 707 has weird spacing: '...cussion sel...' == Line 710 has weird spacing: '...essages all...' == Line 713 has weird spacing: '... tables not...' == The document seems to contain a disclaimer for pre-RFC5378 work, but was first submitted on or after 10 November 2008. The disclaimer is usually necessary only for documents that revise or obsolete older RFCs, and that take significant amounts of text from those RFCs. 
If you can contact all authors of the source material and they are willing to grant the BCP78 rights to the IETF Trust, you can and should remove the disclaimer. Otherwise, the disclaimer is needed and you can ignore this comment. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (March 31, 2017) is 2584 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Outdated reference: A later version (-13) exists of draft-ietf-6man-rfc2460bis-09 -- Possible downref: Normative reference to a draft: ref. 'I-D.ietf-6man-rfc2460bis' -- Obsolete informational reference (is this intentional?): RFC 6691 (Obsoleted by RFC 9293) Summary: 1 error (**), 0 flaws (~~), 8 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group J. McCann 3 Internet-Draft Digital Equipment Corporation 4 Obsoletes: 1981 (if approved) S. Deering 5 Intended status: Standards Track Retired 6 Expires: October 2, 2017 J. Mogul 7 Digital Equipment Corporation 8 R. Hinden, Ed. 9 Check Point Software 10 March 31, 2017 12 Path MTU Discovery for IP version 6 13 draft-ietf-6man-rfc1981bis-05 15 Abstract 17 This document describes Path MTU Discovery for IP version 6. It is 18 largely derived from RFC 1191, which describes Path MTU Discovery for 19 IP version 4. It obsoletes RFC1981. 21 Status of This Memo 23 This Internet-Draft is submitted in full conformance with the 24 provisions of BCP 78 and BCP 79. 26 Internet-Drafts are working documents of the Internet Engineering 27 Task Force (IETF). 
Note that other groups may also distribute 28 working documents as Internet-Drafts. The list of current Internet- 29 Drafts is at http://datatracker.ietf.org/drafts/current/. 31 Internet-Drafts are draft documents valid for a maximum of six months 32 and may be updated, replaced, or obsoleted by other documents at any 33 time. It is inappropriate to use Internet-Drafts as reference 34 material or to cite them other than as "work in progress." 36 This Internet-Draft will expire on October 2, 2017. 38 Copyright Notice 40 Copyright (c) 2017 IETF Trust and the persons identified as the 41 document authors. All rights reserved. 43 This document is subject to BCP 78 and the IETF Trust's Legal 44 Provisions Relating to IETF Documents 45 (http://trustee.ietf.org/license-info) in effect on the date of 46 publication of this document. Please review these documents 47 carefully, as they describe your rights and restrictions with respect 48 to this document. Code Components extracted from this document must 49 include Simplified BSD License text as described in Section 4.e of 50 the Trust Legal Provisions and are provided without warranty as 51 described in the Simplified BSD License. 53 This document may contain material from IETF Documents or IETF 54 Contributions published or made publicly available before November 55 10, 2008. The person(s) controlling the copyright in some of this 56 material may not have granted the IETF Trust the right to allow 57 modifications of such material outside the IETF Standards Process. 58 Without obtaining an adequate license from the person(s) controlling 59 the copyright in such materials, this document may not be modified 60 outside the IETF Standards Process, and derivative works of it may 61 not be created outside the IETF Standards Process, except to format 62 it for publication as an RFC or to translate it into languages other 63 than English. 65 Table of Contents
67 1. Introduction
68 2. Terminology
69 3. Protocol Overview
70 4. Protocol Requirements
71 5. Implementation Issues
72 5.1. Layering
73 5.2. Storing PMTU information
74 5.3. Purging stale PMTU information
75 5.4. Packetization layer actions
76 5.5. Issues for other transport protocols
77 5.6. Management interface
78 6. Security Considerations
79 7. Acknowledgements
80 8. IANA Considerations
81 9. References
82 9.1. Normative References
83 9.2. Informative References
84 Appendix A. Comparison to RFC 1191
85 Appendix B. Changes Since RFC 1981
86 Authors' Addresses
88 1. Introduction 90 When one IPv6 node has a large amount of data to send to another 91 node, the data is transmitted in a series of IPv6 packets. These 92 packets can have a size less than or equal to the Path MTU (PMTU). 93 Alternatively, they can be larger packets that are fragmented into a 94 series of fragments each with a size less than or equal to the PMTU. 96 It is usually preferable that these packets be of the largest size 97 that can successfully traverse the path from the source node to the 98 destination node without the need for IPv6 fragmentation.
This 99 packet size is referred to as the Path MTU, and it is equal to the 100 minimum link MTU of all the links in a path. This document defines a 101 standard mechanism for a node to discover the PMTU of an arbitrary 102 path. 104 IPv6 nodes SHOULD implement Path MTU Discovery in order to discover 105 and take advantage of paths with PMTU greater than the IPv6 minimum 106 link MTU [I-D.ietf-6man-rfc2460bis]. A minimal IPv6 implementation 107 (e.g., in a boot ROM) may choose to omit implementation of Path MTU 108 Discovery. 110 Nodes not implementing Path MTU Discovery MUST use the IPv6 minimum 111 link MTU defined in [I-D.ietf-6man-rfc2460bis] as the maximum packet 112 size. In most cases, this will result in the use of smaller packets 113 than necessary, because most paths have a PMTU greater than the IPv6 114 minimum link MTU. A node sending packets much smaller than the Path 115 MTU allows is wasting network resources and probably getting 116 suboptimal throughput. 118 Nodes implementing Path MTU Discovery and sending packets larger than 119 the IPv6 minimum link MTU are susceptible to loss if ICMPv6 [ICMPv6] 120 messages are blocked or not transmitted. For example, this will 121 result in connections that complete the TCP three-way handshake 122 correctly but then hang when data is transferred. This state is 123 referred to as a black hole connection. Path MTU Discovery relies on 124 such messages to determine the MTU of the path. 126 An extension to Path MTU Discovery defined in this document can be 127 found in [RFC4821]. RFC4821 defines a method for Packetization Layer 128 Path MTU Discovery (PLPMTUD) designed for use over paths where 129 delivery of ICMPv6 messages to a host is not assured. 131 2. Terminology 133 node a device that implements IPv6. 135 router a node that forwards IPv6 packets not explicitly 136 addressed to itself. 138 host any node that is not a router. 140 upper layer a protocol layer immediately above IPv6. 
141 Examples are transport protocols such as TCP and 142 UDP, control protocols such as ICMPv6, routing 143 protocols such as OSPF, and internet or lower- 144 layer protocols being "tunneled" over (i.e., 145 encapsulated in) IPv6 such as IPX, AppleTalk, or 146 IPv6 itself. 148 link a communication facility or medium over which 149 nodes can communicate at the link layer, i.e., 150 the layer immediately below IPv6. Examples are 151 Ethernets (simple or bridged); PPP links; X.25, 152 Frame Relay, or ATM networks; and internet (or 153 higher) layer "tunnels", such as tunnels over 154 IPv4 or IPv6 itself. 156 interface a node's attachment to a link. 158 address an IPv6-layer identifier for an interface or a 159 set of interfaces. 161 packet an IPv6 header plus payload. The packet can have 162 a size less than or equal to the PMTU. 163 Alternatively, this can be a larger packet that 164 is fragmented into a series of fragments each 165 with a size less than or equal to the PMTU. 167 link MTU the maximum transmission unit, i.e., maximum 168 packet size in octets, that can be conveyed in 169 one piece over a link. 171 path the set of links traversed by a packet between a 172 source node and a destination node. 174 path MTU the minimum link MTU of all the links in a path 175 between a source node and a destination node. 177 PMTU path MTU 179 Path MTU Discovery process by which a node learns the PMTU of a path 181 EMTU_S Effective MTU for sending, used by upper layer 182 protocols to limit the size of IP packets they 183 queue for sending [RFC6691]. 185 EMTU_R Effective MTU for receiving, the largest packet 186 that can be reassembled at the receiver. 188 flow a sequence of packets sent from a particular 189 source to a particular (unicast or multicast) 190 destination for which the source desires special 191 handling by the intervening routers. 193 flow id a combination of a source address and a non-zero 194 flow label. 196 3. 
Protocol Overview 198 This memo describes a technique to dynamically discover the PMTU of a 199 path. The basic idea is that a source node initially assumes that 200 the PMTU of a path is the (known) MTU of the first hop in the path. 201 If any of the packets sent on that path are too large to be forwarded 202 by some node along the path, that node will discard them and return 203 ICMPv6 Packet Too Big messages. Upon receipt of such a message, the 204 source node reduces its assumed PMTU for the path based on the MTU of 205 the constricting hop as reported in the Packet Too Big message. The 206 decreased PMTU causes the source to send smaller fragments or to change 207 EMTU_S so that the upper layer reduces the size of the IP packets it 208 sends. 210 The Path MTU Discovery process ends when the node's estimate of the 211 PMTU is less than or equal to the actual PMTU. Note that several 212 iterations of the packet-sent/Packet-Too-Big-message-received cycle 213 may occur before the Path MTU Discovery process ends, as there may be 214 links with smaller MTUs further along the path. 216 Alternatively, the node may elect to end the discovery process by 217 ceasing to send packets larger than the IPv6 minimum link MTU. 219 The PMTU of a path may change over time, due to changes in the 220 routing topology. Reductions of the PMTU are detected by Packet Too 221 Big messages. To detect increases in a path's PMTU, a node 222 periodically increases its assumed PMTU. This will almost always 223 result in packets being discarded and Packet Too Big messages being 224 generated, because in most cases the PMTU of the path will not have 225 changed. Therefore, attempts to detect increases in a path's PMTU 226 should be done infrequently. 228 Path MTU Discovery supports multicast as well as unicast 229 destinations. In the case of a multicast destination, copies of a 230 packet may traverse many different paths to many different nodes.
231 Each path may have a different PMTU, and a single multicast packet 232 may result in multiple Packet Too Big messages, each reporting a 233 different next-hop MTU. The minimum PMTU value across the set of 234 paths in use determines the size of subsequent packets sent to the 235 multicast destination. 237 Note that Path MTU Discovery must be performed even in cases where a 238 node "thinks" a destination is attached to the same link as itself. 239 In a situation such as when a neighboring router acts as proxy [ND] 240 for some destination, the destination can appear to be directly 241 connected but is in fact more than one hop away. 243 4. Protocol Requirements 245 As discussed in Section 1, IPv6 nodes are not required to implement 246 Path MTU Discovery. The requirements in this section apply only to 247 those implementations that include Path MTU Discovery. 249 Nodes SHOULD appropriately validate the payload of ICMPv6 Packet Too Big (PTB) 250 messages to ensure these are received in response to transmitted 251 traffic (i.e., a reported error condition that corresponds to an IPv6 252 packet actually sent by the application) per [ICMPv6]. 254 If a node receives a Packet Too Big message reporting a next-hop MTU 255 that is less than the IPv6 minimum link MTU, it MUST discard it. A 256 node MUST NOT reduce its estimate of the Path MTU below the IPv6 257 minimum link MTU. 259 When a node receives a Packet Too Big message, it MUST reduce its 260 estimate of the PMTU for the relevant path, based on the value of the 261 MTU field in the message. The precise behavior of a node in this 262 circumstance is not specified, since different applications may have 263 different requirements, and since different implementation 264 architectures may favor different strategies. 266 After receiving a Packet Too Big message, a node MUST attempt to 267 avoid eliciting more such messages in the near future. The node MUST 268 reduce the size of the packets it is sending along the path.
Using a 269 PMTU estimate larger than the IPv6 minimum link MTU may continue to 270 elicit Packet Too Big messages. Since each of these messages (and 271 the dropped packets they respond to) consumes network resources, the 272 node MUST force the Path MTU Discovery process to end. 274 Nodes using Path MTU Discovery MUST detect decreases in PMTU as fast 275 as possible. Nodes MAY detect increases in PMTU, but because doing 276 so requires sending packets larger than the current estimated PMTU, 277 and because the PMTU is unlikely to have increased, 278 this MUST be done at infrequent intervals. An attempt to detect an 279 increase (by sending a packet larger than the current estimate) MUST 280 NOT be done less than 5 minutes after a Packet Too Big message has 281 been received for the given path. The recommended setting for this 282 timer is twice its minimum value (10 minutes). 284 A node MUST NOT increase its estimate of the Path MTU in response to 285 the contents of a Packet Too Big message. A message purporting to 286 announce an increase in the Path MTU might be a stale packet that has 287 been floating around in the network, a false packet injected as part 288 of a denial-of-service attack, or the result of having multiple paths 289 to the destination, each with a different PMTU. 291 5. Implementation Issues 293 This section discusses a number of issues related to the 294 implementation of Path MTU Discovery. This is not a specification, 295 but rather a set of notes provided as an aid for implementers. 297 The issues include: 299 - What layer or layers implement Path MTU Discovery? 301 - How is the PMTU information cached? 303 - How is stale PMTU information removed? 305 - What must transport and higher layers do? 307 5.1. Layering 309 In the IP architecture, the choice of what size packet to send is 310 made by a protocol at a layer above IP. This memo refers to such a 311 protocol as a "packetization protocol".
Packetization protocols are 312 usually transport protocols (for example, TCP) but can also be 313 higher-layer protocols (for example, protocols built on top of UDP). 315 Implementing Path MTU Discovery in the packetization layers 316 simplifies some of the inter-layer issues, but has several drawbacks: 317 the implementation may have to be redone for each packetization 318 protocol, it becomes hard to share PMTU information between different 319 packetization layers, and the connection-oriented state maintained by 320 some packetization layers may not easily extend to save PMTU 321 information for long periods. 323 It is therefore suggested that the IP layer store PMTU information 324 and that the ICMPv6 layer process received Packet Too Big messages. 325 The packetization layers may respond to changes in the PMTU by 326 changing the size of the messages they send. To support this 327 layering, packetization layers require a way to learn of changes in 328 the value of MMS_S, the "maximum send transport-message size". 330 MMS_S is a transport message size calculated by subtracting the size 331 of the IPv6 header (including IPv6 extension headers) from the 332 largest IP packet that can be sent, EMTU_S. MMS_S is limited by a 333 combination of factors, including the PMTU, support for packet 334 fragmentation and reassembly, and the packet reassembly limit (see 335 [I-D.ietf-6man-rfc2460bis] section "Fragment Header"). When source 336 fragmentation is available, EMTU_S is set to EMTU_R, as indicated by 337 the receiver using an upper layer protocol or based on protocol 338 requirements (1500 octets for IPv6). When a message larger than PMTU 339 is to be transmitted, the source creates fragments, each limited by 340 PMTU. When source fragmentation is not desired, EMTU_S is set to 341 PMTU, and the upper layer protocol is expected to either perform its 342 own fragmentation and reassembly or otherwise limit the size of its 343 messages accordingly.
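The relationship between PMTU, EMTU_R, EMTU_S, and MMS_S described above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the specification: the function names, the source_fragmentation flag, and the ext_header_octets parameter are hypothetical. The 40-octet size of the fixed IPv6 header is taken from the IPv6 specification.

```python
IPV6_FIXED_HEADER = 40  # octets in the fixed IPv6 header


def emtu_s(pmtu, emtu_r, source_fragmentation):
    """Largest IP packet the upper layer may queue for sending.

    With source fragmentation available, the limit is the receiver's
    reassembly limit (EMTU_R); otherwise it is the PMTU.
    """
    return emtu_r if source_fragmentation else pmtu


def mms_s(emtu_s_value, ext_header_octets=0):
    """Transport message size: EMTU_S minus the IPv6 header and any
    extension headers (ext_header_octets is a hypothetical parameter)."""
    return emtu_s_value - IPV6_FIXED_HEADER - ext_header_octets
```

For example, with a PMTU of 1400 octets, no extension headers, and source fragmentation not desired, mms_s(emtu_s(1400, 1500, False)) yields 1360 octets.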
345 However, packetization layers are encouraged to avoid sending 346 messages that will require source fragmentation (for the case against 347 fragmentation, see [FRAG]). 349 5.2. Storing PMTU information 351 Ideally, a PMTU value should be associated with a specific path 352 traversed by packets exchanged between the source and destination 353 nodes. However, in most cases a node will not have enough 354 information to completely and accurately identify such a path. 355 Rather, a node must associate a PMTU value with some local 356 representation of a path. It is left to the implementation to select 357 the local representation of a path. 359 In the case of a multicast destination address, copies of a packet 360 may traverse many different paths to reach many different nodes. The 361 local representation of the "path" to a multicast destination must 362 represent a potentially large set of paths. 364 Minimally, an implementation could maintain a single PMTU value to be 365 used for all packets originated from the node. This PMTU value would 366 be the minimum PMTU learned across the set of all paths in use by the 367 node. This approach is likely to result in the use of smaller 368 packets than is necessary for many paths. In the case of multipath 369 routing (e.g., Equal Cost Multipath Routing, ECMP), a set of paths 370 can exist even for a single source and destination pair. 372 An implementation could use the destination address as the local 373 representation of a path. The PMTU value associated with a 374 destination would be the minimum PMTU learned across the set of all 375 paths in use to that destination. This approach will result in the 376 use of optimally sized packets on a per-destination basis. This 377 approach integrates nicely with the conceptual model of a host as 378 described in [ND]: a PMTU value could be stored with the 379 corresponding entry in the destination cache. 
381 If flows [I-D.ietf-6man-rfc2460bis] are in use, an implementation 382 could use the flow id as the local representation of a path. Packets 383 sent to a particular destination but belonging to different flows may 384 use different paths, as with ECMP, in which the choice of path might 385 depend on the flow id. This approach might result in the use of 386 optimally sized packets on a per-flow basis, providing finer 387 granularity than PMTU values maintained on a per-destination basis. 389 For source routed packets (i.e., packets containing an IPv6 Routing 390 header [I-D.ietf-6man-rfc2460bis]), the source route may further 391 qualify the local representation of a path. 393 Initially, the PMTU value for a path is assumed to be the (known) MTU 394 of the first-hop link. 396 When a Packet Too Big message is received, the node determines which 397 path the message applies to based on the contents of the Packet Too 398 Big message. For example, if the destination address is used as the 399 local representation of a path, the destination address from the 400 original packet would be used to determine which path the message 401 applies to. 403 Note: if the original packet contained a Routing header, the 404 Routing header should be used to determine the location of the 405 destination address within the original packet. If Segments Left 406 is equal to zero, the destination address is in the Destination 407 Address field in the IPv6 header. If Segments Left is greater 408 than zero, the destination address is the last address 409 (Address[n]) in the Routing header. 411 The node then uses the value in the MTU field in the Packet Too Big 412 message as a tentative PMTU value, or the IPv6 minimum link MTU 413 if that is larger, and compares the tentative PMTU to the existing 414 PMTU. If the tentative PMTU is less than the existing PMTU estimate, 415 the tentative PMTU replaces the existing PMTU as the PMTU value for 416 the path.
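The update rule in the preceding paragraph can be sketched as follows, assuming the destination address is used as the local representation of a path. The class and method names are hypothetical, and the sketch mirrors only this paragraph: a fuller implementation would first apply the validation and discard rules of Section 4. The value 1280 is the IPv6 minimum link MTU from the IPv6 specification.

```python
IPV6_MIN_LINK_MTU = 1280  # octets


class PmtuCache:
    """Per-destination PMTU estimates (hypothetical illustration)."""

    def __init__(self, first_hop_mtu):
        self.first_hop_mtu = first_hop_mtu
        self._pmtu = {}  # destination address -> PMTU estimate

    def lookup(self, destination):
        # Until something is learned, assume the first-hop link MTU.
        return self._pmtu.get(destination, self.first_hop_mtu)

    def on_packet_too_big(self, destination, reported_mtu):
        """Apply one Packet Too Big report; return True if the
        estimate for this destination decreased."""
        # Tentative PMTU: the reported MTU, or the IPv6 minimum link
        # MTU if that is larger.
        tentative = max(reported_mtu, IPV6_MIN_LINK_MTU)
        if tentative < self.lookup(destination):
            self._pmtu[destination] = tentative
            return True
        # A report never raises the estimate.
        return False
```

Here destination stands for the destination address recovered from the original packet carried in the Packet Too Big message.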
418 The packetization layers must be notified about decreases in the 419 PMTU. Any packetization layer instance (for example, a TCP 420 connection) that is actively using the path must be notified if the 421 PMTU estimate is decreased. 423 Note: even if the Packet Too Big message contains an Original 424 Packet Header that refers to a UDP packet, the TCP layer must be 425 notified if any of its connections use the given path. 427 Also, the instance that sent the packet that elicited the Packet Too 428 Big message should be notified that its packet has been dropped, even 429 if the PMTU estimate has not changed, so that it may retransmit the 430 dropped data. 432 Note: An implementation can avoid the use of an asynchronous 433 notification mechanism for PMTU decreases by postponing 434 notification until the next attempt to send a packet larger than 435 the PMTU estimate. In this approach, when an attempt is made to 436 SEND a packet that is larger than the PMTU estimate, the SEND 437 function should fail and return a suitable error indication. This 438 approach may be more suitable for a connectionless packetization 439 layer (such as one using UDP), which (in some implementations) may 440 be hard to "notify" from the ICMPv6 layer. In this case, the 441 normal timeout-based retransmission mechanisms would be used to 442 recover from the dropped packets. 444 It is important to understand that the notification of the 445 packetization layer instances using the path about the change in the 446 PMTU is distinct from the notification of a specific instance that a 447 packet has been dropped. The latter should be done as soon as 448 practical (i.e., asynchronously from the point of view of the 449 packetization layer instance), while the former may be delayed until 450 a packetization layer instance wants to create a packet. 451 Retransmission should be done only for those packets that are 452 known to be dropped, as indicated by a Packet Too Big message. 454 5.3.
Purging stale PMTU information 456 Internetwork topology is dynamic; routes change over time. While the 457 local representation of a path may remain constant, the actual 458 path(s) in use may change. Thus, PMTU information cached by a node 459 can become stale. 461 If the stale PMTU value is too large, this will be discovered almost 462 immediately once a large enough packet is sent on the path. No such 463 mechanism exists for realizing that a stale PMTU value is too small, 464 so an implementation SHOULD "age" cached values. When a PMTU value 465 has not been decreased for a while (on the order of 10 minutes), the 466 PMTU estimate should be set to the MTU of the first-hop link, and the 467 packetization layers should be notified of the change. This will 468 cause the complete Path MTU Discovery process to take place again. 470 Note: an implementation should provide a means for changing the 471 timeout duration, including setting it to "infinity". For 472 example, nodes attached to an FDDI link which is then attached to 473 the rest of the Internet via a small MTU serial line are never 474 going to discover a new non-local PMTU, so they should not have to 475 put up with dropped packets every 10 minutes. 477 An upper layer must not retransmit data in response to an increase in 478 the PMTU estimate, since this increase never comes in response to an 479 indication of a dropped packet. 481 One approach to implementing PMTU aging is to associate a timestamp 482 field with a PMTU value. This field is initialized to a "reserved" 483 value, indicating that the PMTU is equal to the MTU of the first hop 484 link. Whenever the PMTU is decreased in response to a Packet Too Big 485 message, the timestamp is set to the current time. 
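The timestamp approach can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class and method names are hypothetical, None stands in for the "reserved" timestamp value, times are plain seconds supplied by the caller, and the 600-second default reflects the "on the order of 10 minutes" guidance above.

```python
RESERVED = None  # "reserved" timestamp: PMTU equals the first-hop MTU


class PmtuEntry:
    """One cached PMTU value with its aging timestamp (hypothetical)."""

    def __init__(self, first_hop_mtu):
        self.first_hop_mtu = first_hop_mtu
        self.pmtu = first_hop_mtu
        self.timestamp = RESERVED

    def decrease(self, new_pmtu, now):
        # Record the time only when the estimate actually decreases.
        if new_pmtu < self.pmtu:
            self.pmtu = new_pmtu
            self.timestamp = now

    def age(self, now, timeout=600.0):
        """Reset a stale estimate to the first-hop MTU.

        Returns True when a reset happened, so that the caller can
        notify the packetization layers of the increase.
        """
        if self.timestamp is RESERVED or now - self.timestamp <= timeout:
            return False
        self.pmtu = self.first_hop_mtu
        self.timestamp = RESERVED
        return True
```

Passing timeout=float("inf") disables aging in this sketch, which corresponds to setting the timeout to "infinity".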
487 Once a minute, a timer-driven procedure runs through all cached PMTU 488 values, and for each PMTU whose timestamp is not "reserved" and is 489 older than the timeout interval: 491 - The PMTU estimate is set to the MTU of the first hop link. 493 - The timestamp is set to the "reserved" value. 495 - Packetization layers using this path are notified of the increase. 497 5.4. Packetization layer actions 499 A packetization layer (e.g., TCP) must track the PMTU for the path(s) 500 in use by a connection; it should not send segments that would result 501 in packets larger than the PMTU, except to probe during PMTU 502 discovery (this probe packet must not be fragmented to the PMTU). A 503 simple implementation could ask the IP layer for this value each time 504 it creates a new segment, but this could be inefficient. An 505 implementation typically caches other values derived from the PMTU. 506 It may be simpler to receive asynchronous notification when the PMTU 507 changes, so that these variables may also be updated. 509 A TCP implementation must also store the Maximum Segment Size (MSS) 510 value received from its peer, which represents the EMTU_R, the 511 largest packet that can be reassembled by the receiver, and must not 512 send any segment larger than this MSS, regardless of the PMTU. 514 The value sent in the TCP MSS option is independent of the PMTU; it 515 is determined by the receiver reassembly limit EMTU_R. This MSS 516 option value is used by the other end of the connection, which may be 517 using an unrelated PMTU value. See [I-D.ietf-6man-rfc2460bis] 518 sections "Packet Size Issues" and "Maximum Upper-Layer Payload Size" 519 for information on selecting a value for the TCP MSS option. 521 When a Packet Too Big message is received, it implies that a packet 522 was dropped by the node that sent the ICMPv6 message.
It is 523 sufficient to treat this in the same way as any other dropped 524 segment; the loss will be recovered by normal retransmission methods. If 525 the Path MTU Discovery process requires several steps to find the 526 PMTU of the full path, this could delay the connection by many 527 round-trip times. 529 Alternatively, the retransmission could be done in immediate response 530 to a notification that the Path MTU has changed, but only for the 531 specific connection specified by the Packet Too Big message. The 532 packet size used in the retransmission should be no larger than the 533 new PMTU. 535 Note: A packetization layer must not retransmit in response to 536 every Packet Too Big message, since a burst of several oversized 537 segments will give rise to several such messages and hence several 538 retransmissions of the same data. If the new estimated PMTU is 539 still wrong, the process repeats, and there is an exponential 540 growth in the number of superfluous segments sent. 541 Retransmissions can increase network load in response to 542 congestion, worsening that congestion. Any packetization layer 543 that uses retransmission is responsible for congestion control of 544 its retransmissions. See [RFC8085] for more information. 546 This means that the TCP layer must be able to recognize when a 547 Packet Too Big notification actually decreases the PMTU that it 548 has already used to send a packet on the given connection, and 549 should ignore any other notifications. 551 Many TCP implementations incorporate "congestion avoidance" and 552 "slow-start" algorithms to improve performance [CONG]. Unlike a 553 retransmission caused by a TCP retransmission timeout, a 554 retransmission caused by a Packet Too Big message should not change 555 the congestion window. It should, however, trigger the slow-start 556 mechanism (i.e., only one segment should be retransmitted until 557 acknowledgements begin to arrive again).
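The filtering of Packet Too Big notifications described above can be sketched as follows. This is a loose illustration under stated assumptions, not how any particular TCP implementation works: the Connection record, its field names, and the retransmit_budget counter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    cwnd: int             # congestion window, in segments
    pmtu_in_use: int      # PMTU this connection has used to send packets
    retransmit_budget: int = 0  # segments that may be resent before an ACK


def on_pmtu_notification(conn, new_pmtu):
    """Return True if the connection should retransmit immediately."""
    # Ignore notifications that do not decrease the PMTU already used;
    # a burst of oversized segments produces several such messages.
    if new_pmtu >= conn.pmtu_in_use:
        return False
    conn.pmtu_in_use = new_pmtu
    # Slow-start behavior: resend a single segment until ACKs resume.
    # The congestion window itself is deliberately left unchanged.
    conn.retransmit_budget = 1
    return True
```

In this sketch a second notification carrying the same MTU is ignored, because the connection has already switched to the smaller value.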
559 TCP performance can be reduced if the sender's maximum window size is 560 not an exact multiple of the segment size in use (this is not the 561 congestion window size). 563 5.5. Issues for other transport protocols 565 Some transport protocols are not allowed to repacketize when doing a 566 retransmission. That is, once an attempt is made to transmit a 567 segment of a certain size, the transport cannot split the contents of 568 the segment into smaller segments for retransmission. In such a 569 case, the original segment can be fragmented by the IP layer during 570 retransmission. Subsequent segments, when transmitted for the first 571 time, should be no larger than allowed by the Path MTU. 573 Path MTU Discovery for IPv4 [RFC1191] used NFS as an example of a 574 UDP-based application that benefits from PMTU discovery. Since then, 575 [RFC7530] has stated that the supported transport layer between NFS and IP 576 must be an IETF standardized transport protocol that is specified to 577 avoid network congestion; such transports include TCP and the Stream 578 Control Transmission Protocol (SCTP). In this case, the transport is 579 itself responsible for determining and using an effective Path MTU, 580 including implementing PMTU discovery when this is needed. 582 5.6. Management interface 584 It is suggested that an implementation provide a way for a system 585 utility program to: 587 - Specify that Path MTU Discovery not be done on a given path. 589 - Change the PMTU value associated with a given path. 591 The former can be accomplished by associating a flag with the path; 592 when a packet is sent on a path with this flag set, the IP layer does 593 not send packets larger than the IPv6 minimum link MTU. 595 These features might be used to work around an anomalous situation, 596 or by a routing protocol implementation that is able to obtain Path 597 MTU values.
599 The implementation should also provide a way to change the timeout 600 period for aging stale PMTU information. 602 6. Security Considerations 604 This Path MTU Discovery mechanism makes possible two denial-of- 605 service attacks, both based on a malicious party sending false Packet 606 Too Big messages to a node. 608 In the first attack, the false message indicates a PMTU much 609 smaller than reality. In response, the victim node should never 610 set its PMTU estimate below the IPv6 minimum link MTU. A sender 611 that falsely reduces to this MTU would observe suboptimal 612 performance. 614 In the second attack, the false message indicates a PMTU larger 615 than reality. If believed, this could cause temporary blockage as 616 the victim sends packets that will be dropped by some router. 617 Within one round-trip time, the node would discover its mistake 618 (receiving Packet Too Big messages from that router), but frequent 619 repetition of this attack could cause many packets to be 620 dropped. A node, however, should never raise its estimate of the 621 PMTU based on a Packet Too Big message, so should not be 622 vulnerable to this attack. 624 A malicious party could also cause problems if it could stop a victim 625 from receiving legitimate Packet Too Big messages, but in this case 626 there are simpler denial-of-service attacks available. 628 If ICMPv6 filtering prevents reception of ICMPv6 Packet Too Big 629 messages, the source will not learn the actual path MTU. 630 Packetization Layer Path MTU Discovery [RFC4821] does not rely upon 631 network support for ICMPv6 messages and is therefore considered more 632 robust than standard PMTUD. It is not susceptible to "black holing" 633 of ICMPv6 messages. 635 7. Acknowledgements 637 We would like to acknowledge the authors of and contributors to 638 [RFC1191], from which the majority of this document was derived.
We 639 would also like to acknowledge the members of the IPng working group 640 for their careful review and constructive criticisms. 642 8. IANA Considerations 644 This document does not have any IANA actions. 646 9. References 648 9.1. Normative References 650 [I-D.ietf-6man-rfc2460bis] 651 Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) 652 Specification", draft-ietf-6man-rfc2460bis-09 (work in 653 progress), March 2017. 655 [ICMPv6] Conta, A., Deering, S., and M. Gupta, Ed., "Internet 656 Control Message Protocol (ICMPv6) for the Internet 657 Protocol Version 6 (IPv6) Specification", RFC 4443, DOI 658 10.17487/RFC4443, March 2006, 659 . 661 9.2. Informative References 663 [CONG] Jacobson, V., "Congestion Avoidance and Control", Proc. 664 SIGCOMM '88 Symposium on Communications Architectures and 665 Protocols, August 1988. 667 [FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful", 668 In Proc. SIGCOMM '87 Workshop on Frontiers in Computer 669 Communications Technology, August 1987. 671 [ND] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, 672 "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, 673 DOI 10.17487/RFC4861, September 2007, 674 . 676 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 677 DOI 10.17487/RFC1191, November 1990, 678 . 680 [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 681 Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, 682 . 684 [RFC6691] Borman, D., "TCP Options and Maximum Segment Size (MSS)", 685 RFC 6691, DOI 10.17487/RFC6691, July 2012, 686 . 688 [RFC7530] Haynes, T., Ed. and D. Noveck, Ed., "Network File System 689 (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530, 690 March 2015, . 692 [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage 693 Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, 694 March 2017, . 696 Appendix A.
Comparison to RFC 1191 698 This document is based in large part on RFC 1191, which describes 699 Path MTU Discovery for IPv4. Certain portions of RFC 1191 were not 700 needed in this document: 702 router specification: Packet Too Big messages and corresponding 703 router behavior are defined in [ICMPv6] 705 Don't Fragment bit: there is no DF bit in IPv6 packets 707 TCP MSS discussion: selecting a value to send in the TCP MSS option 708 is discussed in [I-D.ietf-6man-rfc2460bis] 710 old-style messages: all Packet Too Big messages report the MTU of 711 the constricting link 713 MTU plateau tables: not needed because there are no old-style 714 messages 716 Appendix B. Changes Since RFC 1981 718 This document has the following changes from RFC1981. Numbers 719 identify the Internet-Draft version where the change was made: 721 Working Group Internet Drafts 723 05) Changes based on IETF last call reviews by Gorry Fairhurst, 724 Joe Touch, Susan Hares, Stewart Bryant, Rifaat Shekh-Yusef, 725 and Donald Eastlake. This includes: 727 o Clarify that the purpose of PMTUD is to reduce the need 728 for IPv6 Fragmentation. 730 o Added text to Introduction about effects on PMTUD when 731 ICMPv6 messages are blocked. 733 o Clarified in Section 4 that nodes should validate the 734 payload of ICMPv6 PTB messages per RFC4443. 736 o Removed text in Section 5.2 about the number of paths to a 737 destination. 739 o Changed title of Section 5.4 to "Packetization layer 740 actions". 742 o Clarified first paragraph in Section 5.4 to cover all 743 packetization layers, not just TCP. 745 o Clarified text in Section 5.4 to use normal retransmission 746 methods. 748 o Add clarification to Note in Section 5.4 about 749 retransmissions. 751 o Removed text in Section 5.4 that described 4.2BSD as it is 752 now obsolete. 754 o Removed reference to TP4 in Section 5.5. 756 o Updated text in Section 5.5 about NFS including adding a 757 current reference to NFS and removing obsolete text.
759 o Revised text in Section 6 to clarify first attack 760 response. 762 o Added new text in Section 6 to clarify the effect of 763 ICMPv6 filtering on PMTUD. 765 o Aligned terminology for the packetization 766 layer. 768 o Editorial changes. 770 04) Changes based on AD Evaluation including removing details 771 about RFC4821 algorithm in Section 1, removing text about 772 decrementing hop limit from Section 3, and removing text about 773 obsolete security classifications from Section 5.2. 775 04) Editorial changes and clarification in Section 5.2 based on 776 IP Directorate review by Donald Eastlake. 778 03) Remove text in Section 5.3 regarding RH0 since it was 779 deprecated by RFC5095. 781 02) Clarified in Section 3 that ICMPv6 Packet Too Big should be 782 sent even if the node doesn't decrement the hop limit. 784 01) Revised the text about PLPMTUD to use the word "path". 786 01) Editorial changes. 788 00) Added text to discard an ICMPv6 Packet Too Big message 789 containing an MTU less than the IPv6 minimum link MTU. 791 00) Revision of text regarding RFC4821. 793 00) Added R. Hinden as Editor to facilitate ID submission. 795 00) Editorial changes. 797 Individual Internet Drafts 799 01) Remove Note about a Packet Too Big message reporting a next- 800 hop MTU that is less than the IPv6 minimum link MTU. This 801 was removed from [I-D.ietf-6man-rfc2460bis]. 803 01) Include a link to RFC4821 along with a short summary of what 804 it does. 806 01) Assigned references to informative and normative. 808 01) Editorial changes. 810 00) Establish a baseline from RFC1981. The only intended changes 811 are formatting (XML is slightly different from .nroff), 812 differences between an RFC and Internet Draft, fixing a few 813 ID Nits, updating references, and updates to the authors 814 information. There should not be any content changes to the 815 specification. 817 Authors' Addresses 819 Jack McCann 820 Digital Equipment Corporation 822 Stephen E.
Deering 823 Retired 824 Vancouver, British Columbia 825 Canada 827 Jeffrey Mogul 828 Digital Equipment Corporation 830 Robert M. Hinden (editor) 831 Check Point Software 832 959 Skyway Road 833 San Carlos, CA 94070 834 USA 836 Email: bob.hinden@gmail.com