idnits 2.17.1 draft-ietf-6man-rfc1981bis-06.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 105: '... IPv6 nodes SHOULD implement Path MT...' RFC 2119 keyword, line 111: '...th MTU Discovery MUST use the IPv6 min...' RFC 2119 keyword, line 250: '... Nodes SHOULD appropriately validate...' RFC 2119 keyword, line 256: '...nimum link MTU, it MUST discard it. A...' RFC 2119 keyword, line 257: '... node MUST NOT reduce its estimate o...' (10 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 180 has weird spacing: '...scovery proce...' == Line 712 has weird spacing: '...ent bit the...' == Line 714 has weird spacing: '...cussion sel...' == Line 717 has weird spacing: '...essages all...' == Line 720 has weird spacing: '... tables not...' == The document seems to contain a disclaimer for pre-RFC5378 work, but was first submitted on or after 10 November 2008. The disclaimer is usually necessary only for documents that revise or obsolete older RFCs, and that take significant amounts of text from those RFCs. 
If you can contact all authors of the source material and they are willing to grant the BCP78 rights to the IETF Trust, you can and should remove the disclaimer. Otherwise, the disclaimer is needed and you can ignore this comment. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (April 7, 2017) is 2576 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Outdated reference: A later version (-13) exists of draft-ietf-6man-rfc2460bis-09 -- Possible downref: Normative reference to a draft: ref. 'I-D.ietf-6man-rfc2460bis' -- Obsolete informational reference (is this intentional?): RFC 6691 (Obsoleted by RFC 9293) Summary: 1 error (**), 0 flaws (~~), 8 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group J. McCann 3 Internet-Draft Digital Equipment Corporation 4 Obsoletes: 1981 (if approved) S. Deering 5 Intended status: Standards Track Retired 6 Expires: October 9, 2017 J. Mogul 7 Digital Equipment Corporation 8 R. Hinden, Ed. 9 Check Point Software 10 April 7, 2017 12 Path MTU Discovery for IP version 6 13 draft-ietf-6man-rfc1981bis-06 15 Abstract 17 This document describes Path MTU Discovery for IP version 6. It is 18 largely derived from RFC 1191, which describes Path MTU Discovery for 19 IP version 4. It obsoletes RFC1981. 21 Status of This Memo 23 This Internet-Draft is submitted in full conformance with the 24 provisions of BCP 78 and BCP 79. 26 Internet-Drafts are working documents of the Internet Engineering 27 Task Force (IETF). 
Note that other groups may also distribute 28 working documents as Internet-Drafts. The list of current Internet- 29 Drafts is at http://datatracker.ietf.org/drafts/current/. 31 Internet-Drafts are draft documents valid for a maximum of six months 32 and may be updated, replaced, or obsoleted by other documents at any 33 time. It is inappropriate to use Internet-Drafts as reference 34 material or to cite them other than as "work in progress." 36 This Internet-Draft will expire on October 9, 2017. 38 Copyright Notice 40 Copyright (c) 2017 IETF Trust and the persons identified as the 41 document authors. All rights reserved. 43 This document is subject to BCP 78 and the IETF Trust's Legal 44 Provisions Relating to IETF Documents 45 (http://trustee.ietf.org/license-info) in effect on the date of 46 publication of this document. Please review these documents 47 carefully, as they describe your rights and restrictions with respect 48 to this document. Code Components extracted from this document must 49 include Simplified BSD License text as described in Section 4.e of 50 the Trust Legal Provisions and are provided without warranty as 51 described in the Simplified BSD License. 53 This document may contain material from IETF Documents or IETF 54 Contributions published or made publicly available before November 55 10, 2008. The person(s) controlling the copyright in some of this 56 material may not have granted the IETF Trust the right to allow 57 modifications of such material outside the IETF Standards Process. 58 Without obtaining an adequate license from the person(s) controlling 59 the copyright in such materials, this document may not be modified 60 outside the IETF Standards Process, and derivative works of it may 61 not be created outside the IETF Standards Process, except to format 62 it for publication as an RFC or to translate it into languages other 63 than English. 65 Table of Contents 67 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 
2 68 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 69 3. Protocol Overview . . . . . . . . . . . . . . . . . . . . . . 5 70 4. Protocol Requirements . . . . . . . . . . . . . . . . . . . . 6 71 5. Implementation Issues . . . . . . . . . . . . . . . . . . . . 7 72 5.1. Layering . . . . . . . . . . . . . . . . . . . . . . . . 7 73 5.2. Storing PMTU information . . . . . . . . . . . . . . . . 8 74 5.3. Purging stale PMTU information . . . . . . . . . . . . . 10 75 5.4. Packetization layer actions . . . . . . . . . . . . . . . 11 76 5.5. Issues for other transport protocols . . . . . . . . . . 12 77 5.6. Management interface . . . . . . . . . . . . . . . . . . 13 78 6. Security Considerations . . . . . . . . . . . . . . . . . . . 13 79 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 14 80 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 81 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 14 82 9.1. Normative References . . . . . . . . . . . . . . . . . . 14 83 9.2. Informative References . . . . . . . . . . . . . . . . . 14 84 Appendix A. Comparison to RFC 1191 . . . . . . . . . . . . . . . 15 85 Appendix B. Changes Since RFC 1981 . . . . . . . . . . . . . . . 16 86 B.1. Change History Since RFC1981 . . . . . . . . . . . . . . 17 87 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 19 89 1. Introduction 91 When one IPv6 node has a large amount of data to send to another 92 node, the data is transmitted in a series of IPv6 packets. These 93 packets can have a size less than or equal to the Path MTU (PMTU). 94 Alternatively, they can be larger packets that are fragmented into a 95 series of fragments each with a size less than or equal to the PMTU. 97 It is usually preferable that these packets be of the largest size 98 that can successfully traverse the path from the source node to the 99 destination node without the need for IPv6 fragmentation. 
This 100 packet size is referred to as the Path MTU, and it is equal to the 101 minimum link MTU of all the links in a path. This document defines a 102 standard mechanism for a node to discover the PMTU of an arbitrary 103 path. 105 IPv6 nodes SHOULD implement Path MTU Discovery in order to discover 106 and take advantage of paths with PMTU greater than the IPv6 minimum 107 link MTU [I-D.ietf-6man-rfc2460bis]. A minimal IPv6 implementation 108 (e.g., in a boot ROM) may choose to omit implementation of Path MTU 109 Discovery. 111 Nodes not implementing Path MTU Discovery MUST use the IPv6 minimum 112 link MTU defined in [I-D.ietf-6man-rfc2460bis] as the maximum packet 113 size. In most cases, this will result in the use of smaller packets 114 than necessary, because most paths have a PMTU greater than the IPv6 115 minimum link MTU. A node sending packets much smaller than the Path 116 MTU allows is wasting network resources and probably getting 117 suboptimal throughput. 119 Nodes implementing Path MTU Discovery and sending packets larger than 120 the IPv6 minimum link MTU are susceptible to problematic connectivity 121 if ICMPv6 [ICMPv6] messages are blocked or not transmitted. For 122 example, this will result in connections that complete the TCP three- 123 way handshake correctly but then hang when data is transferred. This 124 state is referred to as a black hole connection. Path MTU Discovery 125 relies on such messages to determine the MTU of the path. 127 An extension to Path MTU Discovery defined in this document can be 128 found in [RFC4821]. RFC4821 defines a method for Packetization Layer 129 Path MTU Discovery (PLPMTUD) designed for use over paths where 130 delivery of ICMPv6 messages to a host is not assured. 132 2. Terminology 134 node a device that implements IPv6. 136 router a node that forwards IPv6 packets not explicitly 137 addressed to itself. 139 host any node that is not a router. 141 upper layer a protocol layer immediately above IPv6. 
142 Examples are transport protocols such as TCP and 143 UDP, control protocols such as ICMPv6, routing 144 protocols such as OSPF, and internet or lower- 145 layer protocols being "tunneled" over (i.e., 146 encapsulated in) IPv6 such as IPX, AppleTalk, or 147 IPv6 itself. 149 link a communication facility or medium over which 150 nodes can communicate at the link layer, i.e., 151 the layer immediately below IPv6. Examples are 152 Ethernets (simple or bridged); PPP links; X.25, 153 Frame Relay, or ATM networks; and internet (or 154 higher) layer "tunnels", such as tunnels over 155 IPv4 or IPv6 itself. 157 interface a node's attachment to a link. 159 address an IPv6-layer identifier for an interface or a 160 set of interfaces. 162 packet an IPv6 header plus payload. The packet can have 163 a size less than or equal to the PMTU. 164 Alternatively, this can be a larger packet that 165 is fragmented into a series of fragments each 166 with a size less than or equal to the PMTU. 168 link MTU the maximum transmission unit, i.e., maximum 169 packet size in octets, that can be conveyed in 170 one piece over a link. 172 path the set of links traversed by a packet between a 173 source node and a destination node. 175 path MTU the minimum link MTU of all the links in a path 176 between a source node and a destination node. 178 PMTU path MTU 180 Path MTU Discovery process by which a node learns the PMTU of a path 182 EMTU_S Effective MTU for sending, used by upper layer 183 protocols to limit the size of IP packets they 184 queue for sending [RFC6691]. 186 EMTU_R Effective MTU for receiving, the largest packet 187 that can be reassembled at the receiver. 189 flow a sequence of packets sent from a particular 190 source to a particular (unicast or multicast) 191 destination for which the source desires special 192 handling by the intervening routers. 194 flow id a combination of a source address and a non-zero 195 flow label. 197 3. 
Protocol Overview 199 This memo describes a technique to dynamically discover the PMTU of a 200 path. The basic idea is that a source node initially assumes that 201 the PMTU of a path is the (known) MTU of the first hop in the path. 202 If any of the packets sent on that path are too large to be forwarded 203 by some node along the path, that node will discard them and return 204 ICMPv6 Packet Too Big messages. Upon receipt of such a message, the 205 source node reduces its assumed PMTU for the path based on the MTU of 206 the constricting hop as reported in the Packet Too Big message. The 207 decreased PMTU causes the source to send smaller fragments or change 208 EMTU_S to cause upper layer to reduce the size of IP packets it 209 sends. 211 The Path MTU Discovery process ends when the node's estimate of the 212 PMTU is less than or equal to the actual PMTU. Note that several 213 iterations of the packet-sent/Packet-Too-Big-message-received cycle 214 may occur before the Path MTU Discovery process ends, as there may be 215 links with smaller MTUs further along the path. 217 Alternatively, the node may elect to end the discovery process by 218 ceasing to send packets larger than the IPv6 minimum link MTU. 220 The PMTU of a path may change over time, due to changes in the 221 routing topology. Reductions of the PMTU are detected by Packet Too 222 Big messages. To detect increases in a path's PMTU, a node 223 periodically increases its assumed PMTU. This will almost always 224 result in packets being discarded and Packet Too Big messages being 225 generated, because in most cases the PMTU of the path will not have 226 changed. Therefore, attempts to detect increases in a path's PMTU 227 should be done infrequently. 229 Path MTU Discovery supports multicast as well as unicast 230 destinations. In the case of a multicast destination, copies of a 231 packet may traverse many different paths to many different nodes. 
232 Each path may have a different PMTU, and a single multicast packet 233 may result in multiple Packet Too Big messages, each reporting a 234 different next-hop MTU. The minimum PMTU value across the set of 235 paths in use determines the size of subsequent packets sent to the 236 multicast destination. 238 Note that Path MTU Discovery must be performed even in cases where a 239 node "thinks" a destination is attached to the same link as itself. 240 In a situation such as when a neighboring router acts as proxy [ND] 241 for some destination, the destination can appear to be directly 242 connected but is in fact more than one hop away. 244 4. Protocol Requirements 246 As discussed in Section 1, IPv6 nodes are not required to implement 247 Path MTU Discovery. The requirements in this section apply only to 248 those implementations that include Path MTU Discovery. 250 Nodes SHOULD appropriately validate the payload of ICMPv6 PTB 251 messages to ensure these are received in response to transmitted 252 traffic (i.e., a reported error condition that corresponds to an IPv6 253 packet actually sent by the application) per [ICMPv6]. 255 If a node receives a Packet Too Big message reporting a next-hop MTU 256 that is less than the IPv6 minimum link MTU, it MUST discard it. A 257 node MUST NOT reduce its estimate of the Path MTU below the IPv6 258 minimum link MTU. 260 When a node receives a Packet Too Big message, it MUST reduce its 261 estimate of the PMTU for the relevant path, based on the value of the 262 MTU field in the message. The precise behavior of a node in this 263 circumstance is not specified, since different applications may have 264 different requirements, and since different implementation 265 architectures may favor different strategies. 267 After receiving a Packet Too Big message, a node MUST attempt to 268 avoid eliciting more such messages in the near future. The node MUST 269 reduce the size of the packets it is sending along the path. 
Using a 270 PMTU estimate larger than the IPv6 minimum link MTU may continue to 271 elicit Packet Too Big messages. Since each of these messages (and 272 the dropped packets they respond to) consume network resources, the 273 node MUST force the Path MTU Discovery process to end. 275 Nodes using Path MTU Discovery MUST detect decreases in PMTU as fast 276 as possible. Nodes MAY detect increases in PMTU, but because doing 277 so requires sending packets larger than the current estimated PMTU, 278 and because the likelihood is that the PMTU will not have increased, 279 this MUST be done at infrequent intervals. An attempt to detect an 280 increase (by sending a packet larger than the current estimate) MUST 281 NOT be done less than 5 minutes after a Packet Too Big message has 282 been received for the given path. The recommended setting for this 283 timer is twice its minimum value (10 minutes). 285 A node MUST NOT increase its estimate of the Path MTU in response to 286 the contents of a Packet Too Big message. A message purporting to 287 announce an increase in the Path MTU might be a stale packet that has 288 been floating around in the network, a false packet injected as part 289 of a denial-of-service attack, or the result of having multiple paths 290 to the destination, each with a different PMTU. 292 5. Implementation Issues 294 This section discusses a number of issues related to the 295 implementation of Path MTU Discovery. This is not a specification, 296 but rather a set of notes provided as an aid for implementers. 298 The issues include: 300 - What layer or layers implement Path MTU Discovery? 302 - How is the PMTU information cached? 304 - How is stale PMTU information removed? 306 - What must transport and higher layers do? 308 5.1. Layering 310 In the IP architecture, the choice of what size packet to send is 311 made by a protocol at a layer above IP. This memo refers to such a 312 protocol as a "packetization protocol". 
Packetization protocols are 313 usually transport protocols (for example, TCP) but can also be 314 higher-layer protocols (for example, protocols built on top of UDP). 316 Implementing Path MTU Discovery in the packetization layers 317 simplifies some of the inter-layer issues, but has several drawbacks: 318 the implementation may have to be redone for each packetization 319 protocol, it becomes hard to share PMTU information between different 320 packetization layers, and the connection-oriented state maintained by 321 some packetization layers may not easily extend to save PMTU 322 information for long periods. 324 It is therefore suggested that the IP layer store PMTU information 325 and that the ICMPv6 layer process received Packet Too Big messages. 326 The packetization layers may respond to changes in the PMTU by 327 changing the size of the messages they send. To support this 328 layering, packetization layers require a way to learn of changes in 329 the value of MMS_S, the "maximum send transport-message size". 331 MMS_S is a transport message size calculated by subtracting the size 332 of the IPv6 header (including IPv6 extension headers) from the 333 largest IP packet that can be sent, EMTU_S. MMS_S is limited by a 334 combination of factors, including the PMTU, support for packet 335 fragmentation and reassembly, and the packet reassembly limit (see 336 [I-D.ietf-6man-rfc2460bis] section "Fragment Header"). When source 337 fragmentation is available, EMTU_S is set to EMTU_R, as indicated by 338 the receiver using an upper layer protocol or based on protocol 339 requirements (1500 octets for IPv6). When a message larger than PMTU 340 is to be transmitted, the source creates fragments, each limited by 341 PMTU. When source fragmentation is not desired, EMTU_S is set to 342 PMTU, and the upper layer protocol is expected to either perform its 343 own fragmentation and reassembly or otherwise limit the size of its 344 messages accordingly. 
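The relationship between PMTU, EMTU_R, EMTU_S, and MMS_S described above can be sketched as follows. This is a minimal illustration, not part of the specification: the function names and the ext_header_len parameter are hypothetical, and the 40-octet value is the fixed IPv6 header size.

```python
IPV6_HEADER_LEN = 40  # size of the fixed IPv6 header, in octets


def emtu_s(pmtu, emtu_r, source_fragmentation):
    """Largest IP packet the upper layer may queue for sending.

    With source fragmentation, the limit is the receiver's reassembly
    limit (EMTU_R); without it, the limit is the PMTU.
    """
    return emtu_r if source_fragmentation else pmtu


def mms_s(emtu_s_value, ext_header_len=0):
    """Maximum send transport-message size: EMTU_S minus the IPv6
    header and any extension headers (ext_header_len is an assumption
    supplied by the caller)."""
    return emtu_s_value - IPV6_HEADER_LEN - ext_header_len


# No source fragmentation on a path whose PMTU is 1400 octets.
print(mms_s(emtu_s(pmtu=1400, emtu_r=1500, source_fragmentation=False)))
```
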
346 However, packetization layers are encouraged to avoid sending 347 messages that will require source fragmentation (for the case against 348 fragmentation, see [FRAG]). 350 5.2. Storing PMTU information 352 Ideally, a PMTU value should be associated with a specific path 353 traversed by packets exchanged between the source and destination 354 nodes. However, in most cases a node will not have enough 355 information to completely and accurately identify such a path. 356 Rather, a node must associate a PMTU value with some local 357 representation of a path. It is left to the implementation to select 358 the local representation of a path. 360 In the case of a multicast destination address, copies of a packet 361 may traverse many different paths to reach many different nodes. The 362 local representation of the "path" to a multicast destination must 363 represent a potentially large set of paths. 365 Minimally, an implementation could maintain a single PMTU value to be 366 used for all packets originated from the node. This PMTU value would 367 be the minimum PMTU learned across the set of all paths in use by the 368 node. This approach is likely to result in the use of smaller 369 packets than is necessary for many paths. In the case of multipath 370 routing (e.g., Equal Cost Multipath Routing, ECMP), a set of paths 371 can exist even for a single source and destination pair. 373 An implementation could use the destination address as the local 374 representation of a path. The PMTU value associated with a 375 destination would be the minimum PMTU learned across the set of all 376 paths in use to that destination. This approach will result in the 377 use of optimally sized packets on a per-destination basis. This 378 approach integrates nicely with the conceptual model of a host as 379 described in [ND]: a PMTU value could be stored with the 380 corresponding entry in the destination cache. 
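As a rough sketch of the per-destination option, the cache below keys PMTU values by destination address and keeps the minimum value learned for each destination. The class and method names are hypothetical, and a real implementation would more likely attach the value to its existing destination cache entry.

```python
class DestinationPmtuCache:
    """Hypothetical PMTU store that uses the destination address as
    the local representation of a path."""

    def __init__(self, first_hop_mtu):
        self.first_hop_mtu = first_hop_mtu
        self._pmtu = {}  # destination address -> smallest PMTU learned

    def lookup(self, dest):
        # Destinations with no learned value use the first-hop link MTU.
        return self._pmtu.get(dest, self.first_hop_mtu)

    def learn(self, dest, path_pmtu):
        # Keep the minimum across all paths in use to this destination.
        self._pmtu[dest] = min(self.lookup(dest), path_pmtu)
        return self._pmtu[dest]


cache = DestinationPmtuCache(first_hop_mtu=1500)
cache.learn("2001:db8::1", 1400)
cache.learn("2001:db8::1", 1450)  # a second path with a larger PMTU
print(cache.lookup("2001:db8::1"), cache.lookup("2001:db8::2"))
```
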
382 If flows [I-D.ietf-6man-rfc2460bis] are in use, an implementation 383 could use the flow id as the local representation of a path. Packets 384 sent to a particular destination but belonging to different flows may 385 use different paths, as with ECMP, in which the choice of path might 386 depend on the flow id. This approach might result in the use of 387 optimally sized packets on a per-flow basis, providing finer 388 granularity than PMTU values maintained on a per-destination basis. 390 For source routed packets (i.e., packets containing an IPv6 Routing 391 header [I-D.ietf-6man-rfc2460bis]), the source route may further 392 qualify the local representation of a path. 394 Initially, the PMTU value for a path is assumed to be the (known) MTU 395 of the first-hop link. 397 When a Packet Too Big message is received, the node determines which 398 path the message applies to based on the contents of the Packet Too 399 Big message. For example, if the destination address is used as the 400 local representation of a path, the destination address from the 401 original packet would be used to determine which path the message 402 applies to. 404 Note: if the original packet contained a Routing header, the 405 Routing header should be used to determine the location of the 406 destination address within the original packet. If Segments Left 407 is equal to zero, the destination address is in the Destination 408 Address field in the IPv6 header. If Segments Left is greater 409 than zero, the destination address is the last address 410 (Address[n]) in the Routing header. 412 The node then uses the value in the MTU field in the Packet Too Big 413 message as a tentative PMTU value or the IPv6 minimum link MTU if 414 that is larger, and compares the tentative PMTU to the existing PMTU. 415 If the tentative PMTU is less than the existing PMTU estimate, the 416 tentative PMTU replaces the existing PMTU as the PMTU value for the 417 path. 
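The update rule in the preceding paragraph, together with the Routing header note, might look like the sketch below. The function names are hypothetical, and 1280 octets is the IPv6 minimum link MTU. The sketch follows the clamping behavior described in this section; an implementation applying the stricter rule in Section 4 would discard a message reporting an MTU below the minimum instead of clamping it.

```python
IPV6_MIN_LINK_MTU = 1280  # IPv6 minimum link MTU, in octets


def original_destination(ipv6_dst, routing_addresses=None, segments_left=0):
    """Locate the final destination of the original packet quoted in a
    Packet Too Big message."""
    if routing_addresses and segments_left > 0:
        return routing_addresses[-1]  # Address[n] in the Routing header
    return ipv6_dst  # Destination Address field of the IPv6 header


def apply_packet_too_big(current_pmtu, reported_mtu):
    """Return (new_pmtu, decreased) for one Packet Too Big message."""
    tentative = max(reported_mtu, IPV6_MIN_LINK_MTU)
    if tentative < current_pmtu:
        return tentative, True
    # Never raise the estimate based on a Packet Too Big message.
    return current_pmtu, False


print(apply_packet_too_big(1500, 1400))
print(apply_packet_too_big(1400, 1500))
```

The boolean in the result indicates whether the estimate actually decreased, which is the condition for notifying the packetization layers.
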
419 The packetization layers must be notified about decreases in the 420 PMTU. Any packetization layer instance (for example, a TCP 421 connection) that is actively using the path must be notified if the 422 PMTU estimate is decreased. 424 Note: even if the Packet Too Big message contains an Original 425 Packet Header that refers to a UDP packet, the TCP layer must be 426 notified if any of its connections use the given path. 428 Also, the instance that sent the packet that elicited the Packet Too 429 Big message should be notified that its packet has been dropped, even 430 if the PMTU estimate has not changed, so that it may retransmit the 431 dropped data. 433 Note: An implementation can avoid the use of an asynchronous 434 notification mechanism for PMTU decreases by postponing 435 notification until the next attempt to send a packet larger than 436 the PMTU estimate. In this approach, when an attempt is made to 437 SEND a packet that is larger than the PMTU estimate, the SEND 438 function should fail and return a suitable error indication. This 439 approach may be more suitable to a connectionless packetization 440 layer (such as one using UDP), which (in some implementations) may 441 be hard to "notify" from the ICMPv6 layer. In this case, the 442 normal timeout-based retransmission mechanisms would be used to 443 recover from the dropped packets. 445 It is important to understand that the notification of the 446 packetization layer instances using the path about the change in the 447 PMTU is distinct from the notification of a specific instance that a 448 packet has been dropped. The latter should be done as soon as 449 practical (i.e., asynchronously from the point of view of the 450 packetization layer instance), while the former may be delayed until 451 a packetization layer instance wants to create a packet. 452 Retransmission should be done only for those packets that are 453 known to be dropped, as indicated by a Packet Too Big message. 455 5.3. 
Purging stale PMTU information 457 Internetwork topology is dynamic; routes change over time. While the 458 local representation of a path may remain constant, the actual 459 path(s) in use may change. Thus, PMTU information cached by a node 460 can become stale. 462 If the stale PMTU value is too large, this will be discovered almost 463 immediately once a large enough packet is sent on the path. No such 464 mechanism exists for realizing that a stale PMTU value is too small, 465 so an implementation SHOULD "age" cached values. When a PMTU value 466 has not been decreased for a while (on the order of 10 minutes), the 467 PMTU estimate should be set to the MTU of the first-hop link, and the 468 packetization layers should be notified of the change. This will 469 cause the complete Path MTU Discovery process to take place again. 471 Note: an implementation should provide a means for changing the 472 timeout duration, including setting it to "infinity". For 473 example, nodes attached to an FDDI link which is then attached to 474 the rest of the Internet via a small MTU serial line are never 475 going to discover a new non-local PMTU, so they should not have to 476 put up with dropped packets every 10 minutes. 478 An upper layer must not retransmit data in response to an increase in 479 the PMTU estimate, since this increase never comes in response to an 480 indication of a dropped packet. 482 One approach to implementing PMTU aging is to associate a timestamp 483 field with a PMTU value. This field is initialized to a "reserved" 484 value, indicating that the PMTU is equal to the MTU of the first hop 485 link. Whenever the PMTU is decreased in response to a Packet Too Big 486 message, the timestamp is set to the current time. 
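A minimal sketch of the timestamp approach, assuming a 10 minute (600 second) timeout and using None as the "reserved" value. The class, its method names, and the choice of seconds as the time unit are hypothetical; the caller is expected to notify the packetization layers whenever age() returns True.

```python
RESERVED = None  # marks "PMTU equals the first-hop link MTU"


class PmtuEntry:
    def __init__(self, first_hop_mtu):
        self.first_hop_mtu = first_hop_mtu
        self.pmtu = first_hop_mtu
        self.timestamp = RESERVED

    def decrease(self, new_pmtu, now):
        # Record the time only when the estimate actually decreases.
        if new_pmtu < self.pmtu:
            self.pmtu = new_pmtu
            self.timestamp = now

    def age(self, now, timeout=600.0):
        """Reset a stale estimate; return True if the PMTU was raised."""
        if self.timestamp is RESERVED or now - self.timestamp <= timeout:
            return False
        self.pmtu = self.first_hop_mtu
        self.timestamp = RESERVED
        return True


entry = PmtuEntry(first_hop_mtu=1500)
entry.decrease(1400, now=0.0)
print(entry.age(now=601.0), entry.pmtu)
```

Passing timeout=float("inf") disables aging for the entry, which corresponds to setting the timeout to "infinity".
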
488 Once a minute, a timer-driven procedure runs through all cached PMTU 489 values, and for each PMTU whose timestamp is not "reserved" and is 490 older than the timeout interval: 492 - The PMTU estimate is set to the MTU of the first hop link. 494 - The timestamp is set to the "reserved" value. 496 - Packetization layers using this path are notified of the increase. 498 5.4. Packetization layer actions 500 A packetization layer (e.g., TCP) must track the PMTU for the path(s) 501 in use by a connection; it should not send segments that would result 502 in packets larger than the PMTU, except to probe during PMTU 503 discovery (this probe packet must not be fragmented to the PMTU). A 504 simple implementation could ask the IP layer for this value each time 505 it created a new segment, but this could be inefficient. An 506 implementation typically caches other values derived from the PMTU. 507 It may be simpler to receive asynchronous notification when the PMTU 508 changes, so that these variables may be also updated. 510 A TCP implementation must also store the Maximum Segment Size (MSS) 511 value received from its peer, which represents the EMTU_R, the 512 largest packet that can be reassembled by the receiver, and must not 513 send any segment larger than this MSS, regardless of the PMTU. 515 The value sent in the TCP MSS option is independent of the PMTU; it 516 is determined by the receiver reassembly limit EMTU_R. This MSS 517 option value is used by the other end of the connection, which may be 518 using an unrelated PMTU value. See [I-D.ietf-6man-rfc2460bis] 519 sections "Packet Size Issues" and "Maximum Upper-Layer Payload Size" 520 for information on selecting a value for the TCP MSS option. 522 When a Packet Too Big message is received, it implies that a packet 523 was dropped by the node that sent the ICMPv6 message. 
It is 524 sufficient to treat this in the same way as any other dropped 525 segment, which will be recovered by normal retransmission methods. If 526 the Path MTU Discovery process requires several steps to find the 527 PMTU of the full path, this could delay the connection by many round- 528 trip times. 530 Alternatively, the retransmission could be done in immediate response 531 to a notification that the Path MTU has changed, but only for the 532 specific connection specified by the Packet Too Big message. The 533 packet size used in the retransmission should be no larger than the 534 new PMTU. 536 Note: A packetization layer must not retransmit in response to 537 every Packet Too Big message, since a burst of several oversized 538 segments will give rise to several such messages and hence several 539 retransmissions of the same data. If the new estimated PMTU is 540 still wrong, the process repeats, and there is an exponential 541 growth in the number of superfluous segments sent. 542 Retransmissions can increase network load in response to 543 congestion, worsening that congestion. Any packetization layer 544 that uses retransmission is responsible for congestion control of 545 its retransmissions. See [RFC8085] for more information. 547 This means that the TCP layer must be able to recognize when a 548 Packet Too Big notification actually decreases the PMTU that it 549 has already used to send a packet on the given connection, and 550 should ignore any other notifications. 552 Many TCP implementations incorporate "congestion avoidance" and 553 "slow-start" algorithms to improve performance [CONG]. Unlike a 554 retransmission caused by a TCP retransmission timeout, a 555 retransmission caused by a Packet Too Big message should not change 556 the congestion window. It should, however, trigger the slow-start 557 mechanism (i.e., only one segment should be retransmitted until 558 acknowledgements begin to arrive again). 
560 TCP performance can be reduced if the sender's maximum window size is 561 not an exact multiple of the segment size in use (this is not the 562 congestion window size). 564 5.5. Issues for other transport protocols 566 Some transport protocols are not allowed to repacketize when doing a 567 retransmission. That is, once an attempt is made to transmit a 568 segment of a certain size, the transport cannot split the contents of 569 the segment into smaller segments for retransmission. In such a 570 case, the original segment can be fragmented by the IP layer during 571 retransmission. Subsequent segments, when transmitted for the first 572 time, should be no larger than allowed by the Path MTU. 574 Path MTU Discovery for IPv4 [RFC1191] used NFS as an example of a 575 UDP-based application that benefits from PMTU discovery. Since then, 576 [RFC7530] has stated that the supported transport layer between NFS and IP 577 must be an IETF standardized transport protocol that is specified to 578 avoid network congestion; such transports include TCP and the Stream 579 Control Transmission Protocol (SCTP). In this case, the transport is 580 itself responsible for determining and using an effective Path MTU, 581 including implementing PMTU discovery when this is needed. 583 5.6. Management interface 585 It is suggested that an implementation provide a way for a system 586 utility program to: 588 - Specify that Path MTU Discovery not be done on a given path. 590 - Change the PMTU value associated with a given path. 592 The former can be accomplished by associating a flag with the path; 593 when a packet is sent on a path with this flag set, the IP layer does 594 not send packets larger than the IPv6 minimum link MTU. 596 These features might be used to work around an anomalous situation, 597 or by a routing protocol implementation that is able to obtain Path 598 MTU values. 
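The two management operations could be exposed roughly as below. This is an illustrative sketch only: the PathPolicy class and its attribute names are hypothetical, and the rejection of values below 1280 octets reflects the rule that the estimate is never set below the IPv6 minimum link MTU.

```python
IPV6_MIN_LINK_MTU = 1280  # IPv6 minimum link MTU, in octets


class PathPolicy:
    """Hypothetical per-path settings a system utility could change."""

    def __init__(self, pmtu, pmtud_disabled=False):
        self.pmtu = pmtu
        self.pmtud_disabled = pmtud_disabled

    def set_pmtu(self, value):
        if value < IPV6_MIN_LINK_MTU:
            raise ValueError("PMTU below the IPv6 minimum link MTU")
        self.pmtu = value

    def max_packet_size(self):
        # With discovery disabled, never exceed the minimum link MTU.
        if self.pmtud_disabled:
            return min(self.pmtu, IPV6_MIN_LINK_MTU)
        return self.pmtu


policy = PathPolicy(pmtu=1500)
policy.pmtud_disabled = True
print(policy.max_packet_size())
```
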
600     The implementation should also provide a way to change the timeout
601     period for aging stale PMTU information.

603  6.  Security Considerations

605     This Path MTU Discovery mechanism makes possible two denial-of-
606     service attacks, both based on a malicious party sending false Packet
607     Too Big messages to a node.

609        In the first attack, the false message indicates a PMTU much
610        smaller than reality. In response, the victim node should never
611        set its PMTU estimate below the IPv6 minimum link MTU. A sender
612        that falsely reduces to this MTU would observe suboptimal
613        performance.

615        In the second attack, the false message indicates a PMTU larger
616        than reality. If believed, this could cause temporary blockage as
617        the victim sends packets that will be dropped by some router.
618        Within one round-trip time, the node would discover its mistake
619        (receiving Packet Too Big messages from that router), but frequent
620        repetition of this attack could cause lots of packets to be
621        dropped. A node, however, should never raise its estimate of the
622        PMTU based on a Packet Too Big message, so should not be
623        vulnerable to this attack.

625     A malicious party could also cause problems if it could stop a victim
626     from receiving legitimate Packet Too Big messages, but in this case
627     there are simpler denial-of-service attacks available.

629     If ICMPv6 filtering prevents reception of ICMPv6 Packet Too Big
630     messages, the source will not learn the actual path MTU.
631     Packetization Layer Path MTU Discovery [RFC4821] does not rely upon
632     network support for ICMPv6 messages and is therefore considered more
633     robust than standard PMTUD. It is not susceptible to "black holing"
634     of ICMPv6 messages. See [RFC4890] for recommendations regarding
635     filtering ICMPv6 messages.

637  7.  Acknowledgements

639     We would like to acknowledge the authors of and contributors to
640     [RFC1191], from which the majority of this document was derived.
We
641     would also like to acknowledge the members of the IPng working group
642     for their careful review and constructive criticisms.

644  8.  IANA Considerations

646     This document does not have any IANA actions.

648  9.  References

650  9.1.  Normative References

652     [I-D.ietf-6man-rfc2460bis]
653                Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)
654                Specification", draft-ietf-6man-rfc2460bis-09 (work in
655                progress), March 2017.

657     [ICMPv6]   Conta, A., Deering, S., and M. Gupta, Ed., "Internet
658                Control Message Protocol (ICMPv6) for the Internet
659                Protocol Version 6 (IPv6) Specification", RFC 4443, DOI
660                10.17487/RFC4443, March 2006.

663  9.2.  Informative References

665     [CONG]     Jacobson, V., "Congestion Avoidance and Control", Proc.
666                SIGCOMM '88 Symposium on Communications Architectures and
667                Protocols, August 1988.

669     [FRAG]     Kent, C. and J. Mogul, "Fragmentation Considered Harmful",
670                In Proc. SIGCOMM '87 Workshop on Frontiers in Computer
671                Communications Technology, August 1987.

673     [ND]       Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
674                "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
675                DOI 10.17487/RFC4861, September 2007.

678     [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
679                DOI 10.17487/RFC1191, November 1990.

682     [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
683                Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007.

686     [RFC4890]  Davies, E. and J. Mohacsi, "Recommendations for Filtering
687                ICMPv6 Messages in Firewalls", RFC 4890, DOI 10.17487/
688                RFC4890, May 2007.

691     [RFC6691]  Borman, D., "TCP Options and Maximum Segment Size (MSS)",
692                RFC 6691, DOI 10.17487/RFC6691, July 2012.

695     [RFC7530]  Haynes, T., Ed. and D. Noveck, Ed., "Network File System
696                (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530,
697                March 2015.

699     [RFC8085]  Eggert, L., Fairhurst, G., and G.
Shepherd, "UDP Usage
700                Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
701                March 2017.

703  Appendix A.  Comparison to RFC 1191

705     This document is based in large part on RFC 1191, which describes
706     Path MTU Discovery for IPv4. Certain portions of RFC 1191 were not
707     needed in this document:

709     router specification  Packet Too Big messages and corresponding
710                           router behavior are defined in [ICMPv6]

712     Don't Fragment bit    there is no DF bit in IPv6 packets

714     TCP MSS discussion    selecting a value to send in the TCP MSS option
715                           is discussed in [I-D.ietf-6man-rfc2460bis]

717     old-style messages    all Packet Too Big messages report the MTU of
718                           the constricting link

720     MTU plateau tables    not needed because there are no old-style
721                           messages

723  Appendix B.  Changes Since RFC 1981

725     This document is based on RFC1981 and has the following changes from
726     RFC1981:

728     o  Clarified in Section 1 "Introduction" that the purpose of PMTUD is
729        to reduce the need for IPv6 fragmentation.

731     o  Added text to Section 1 "Introduction" and Section 6 "Security
732        Considerations" about the effects on PMTUD when ICMPv6 messages
733        are blocked.

735     o  Added to Section 1 "Introduction" a short summary of
736        Packetization Layer Path MTU Discovery (PLPMTUD) and a reference
737        to RFC4821, which defines it.

739     o  Aligned text in Section 2 "Terminology" to match current
740        packetization layer terminology.

742     o  Added clarification in Section 4 "Protocol Requirements" that
743        nodes should validate the payload of ICMPv6 PTB messages per RFC4443.

745     o  Removed the Note from Section 4 "Protocol Requirements" about a
746        Packet Too Big message reporting a next-hop MTU that is less than
747        the IPv6 minimum link MTU because this was removed from
748        [I-D.ietf-6man-rfc2460bis].

750     o  Added clarification in Section 5.2 "Storing PMTU information" to
751        discard an ICMPv6 Packet Too Big message if it contains an MTU less
752        than the IPv6 minimum link MTU.
754     o  Removed text in Section 5.2 "Storing PMTU information" about the
755        RH0 routing header because it was deprecated by RFC5095.

757     o  Removed text about obsolete security classification from
758        Section 5.2 "Storing PMTU information".

760     o  Changed title of Section 5.4 to "Packetization Layer actions" and
761        changed the text in the first paragraph to generalize this
762        section to cover all packetization layers, not just TCP.

764     o  Clarified text in Section 5.4 "Packetization Layer actions" to use
765        normal packetization layer retransmission methods.

767     o  Removed text in Section 5.4 "Packetization Layer actions" that
768        described 4.2 BSD because it is obsolete, and removed reference to
769        TP4.

771     o  Updated text in Section 5.5 "Issues for other transport protocols"
772        about NFS including adding a current reference to NFS and removing
773        obsolete text.

775     o  Editorial Changes.

777  B.1.  Change History Since RFC1981

779     NOTE TO RFC EDITOR: Please remove this subsection prior to RFC
780     Publication

782     This section describes the changes made in each Internet Draft
783     that went into producing this version. The numbers identify the
784     Internet-Draft version in which the change was made.

786     Working Group Internet Drafts

788        06)  Revised Appendix B "Changes since RFC1981" to have a summary
789             of changes since RFC1981 and a separate subsection with a
790             change history of each Internet Draft. This subsection will
791             be removed when the RFC is published.

793        06)  Editorial changes based on comments received after publishing
794             the -05 draft.

796        05)  Changes based on IETF last call reviews by Gorry Fairhurst,
797             Joe Touch, Susan Hares, Stewart Bryant, Rifaat Shekh-Yusef,
798             and Donald Eastlake. This includes:

800             o  Clarify that the purpose of PMTUD is to reduce the need
801                for IPv6 Fragmentation.

803             o  Added text to Introduction about effects on PMTUD when
804                ICMPv6 messages are blocked.

806             o  Clarified in Section 4.
that nodes should validate the
807                payload of ICMPv6 PTB messages per RFC4443.

809             o  Removed text in Section 5.2 about the number of paths to a
810                destination.

812             o  Changed title of Section 5.4 to "Packetization layer
813                actions".

815             o  Clarified first paragraph in Section 5.4 to cover all
816                packetization layers, not just TCP.

818             o  Clarified text in Section 5.4 to use normal retransmission
819                methods.

821             o  Added clarification to the Note in Section 5.4 about
822                retransmissions.

824             o  Removed text in Section 5.4 that described 4.2BSD as it is
825                now obsolete.

827             o  Removed reference to TP4 in Section 5.5.

829             o  Updated text in Section 5.5 about NFS including adding a
830                current reference to NFS and removing obsolete text.

832             o  Revised text in Section 6 to clarify first attack
833                response.

835             o  Added new text in Section 6 to clarify the effect of
836                ICMPv6 filtering on PMTUD.

838             o  Aligned text to match current packetization layer
839                terminology.

841             o  Editorial changes.

843        04)  Changes based on AD Evaluation including removing details
844             about RFC4821 algorithm in Section 1, removing text about
845             decrementing hop limit from Section 3, and removing text about
846             obsolete security classifications from Section 5.2.

848        04)  Editorial changes and clarification in Section 5.2 based on
849             IP Directorate review by Donald Eastlake.

851        03)  Removed text in Section 5.3 regarding RH0 since it was
852             deprecated by RFC5095.

854        02)  Clarified in Section 3 that ICMPv6 Packet Too Big should be
855             sent even if the node doesn't decrement the hop limit.

857        01)  Revised the text about PLPMTUD to use the word "path".

859        01)  Editorial changes.

861        00)  Added text to discard an ICMPv6 Packet Too Big message
862             containing an MTU less than the IPv6 minimum link MTU.

864        00)  Revision of text regarding RFC4821.

866        00)  Added R. Hinden as Editor to facilitate ID submission.

868        00)  Editorial changes.
870     Individual Internet Drafts

872        01)  Removed the Note about a Packet Too Big message reporting a next-
873             hop MTU that is less than the IPv6 minimum link MTU. This
874             was removed from [I-D.ietf-6man-rfc2460bis].

876        01)  Included a link to RFC4821 along with a short summary of what
877             it does.

879        01)  Assigned references to informative and normative.

881        01)  Editorial changes.

883        00)  Established a baseline from RFC1981. The only intended changes
884             are formatting (XML is slightly different from .nroff),
885             differences between an RFC and Internet Draft, fixing a few
886             ID Nits, updating references, and updates to the authors'
887             information. There should not be any content changes to the
888             specification.

890  Authors' Addresses

892     Jack McCann
893     Digital Equipment Corporation

895     Stephen E. Deering
896     Retired
897     Vancouver, British Columbia
898     Canada

900     Jeffrey Mogul
901     Digital Equipment Corporation

902     Robert M. Hinden (editor)
903     Check Point Software
904     959 Skyway Road
905     San Carlos, CA 94070
906     USA

908     Email: bob.hinden@gmail.com