idnits 2.17.1

draft-ietf-ipngwg-pmtuv6-01.txt:

Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------

** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-25) according to https://trustee.ietf.org/license-info :

   IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

      This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

   IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

      Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.

   IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

      This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------

** Missing expiration date. The document expiration date should appear on the first and last page.

** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents.

** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error?

** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts.

** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories.

== No 'Intended status' indicated for this document; assuming Proposed Standard

Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------

** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.)

** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references.

** The document seems to lack both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords.

   RFC 2119 keyword, line 197: '...et Too Big message, it MUST reduce its...'
   RFC 2119 keyword, line 204: '...oo Big message, a node MUST attempt to...'
   RFC 2119 keyword, line 205: '...ges in the near future. The node MUST...'
   RFC 2119 keyword, line 210: '... node MUST force the Path MTU Discov...'
   RFC 2119 keyword, line 212: '...th MTU Discovery MUST detect decreases...'
   (5 more instances...)

Miscellaneous warnings:
----------------------------------------------------------------------------

-- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008.
   If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.)

-- Couldn't find a document date in the document -- date freshness check skipped.

Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------

(See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs)

-- Possible downref: Non-RFC (?) normative reference: ref. 'CONG'

-- Possible downref: Non-RFC (?) normative reference: ref. 'FRAG'

** Obsolete normative reference: RFC 1885 (ref. 'ICMPv6') (Obsoleted by RFC 2463)

** Obsolete normative reference: RFC 1883 (ref. 'IPv6-SPEC') (Obsoleted by RFC 2460)

** Downref: Normative reference to an Unknown state RFC: RFC 905 (ref. 'ISOTP')

== Outdated reference: A later version (-06) exists of draft-ietf-ipngwg-discovery-04

** Downref: Normative reference to an Informational RFC: RFC 1057 (ref. 'RPC')

Summary: 13 errors (**), 0 flaws (~~), 2 warnings (==), 4 comments (--).

Run idnits with the --verbose option for more detailed information about the items above.

--------------------------------------------------------------------------------

INTERNET-DRAFT                   J. McCann, Digital Equipment Corporation
February 21, 1996                                  S. Deering, Xerox PARC
                                 J. Mogul, Digital Equipment Corporation

                   Path MTU Discovery for IP version 6

                     draft-ietf-ipngwg-pmtuv6-01.txt

Abstract

This document describes Path MTU Discovery for IP version 6. It is largely derived from RFC-1191, which describes Path MTU Discovery for IP version 4.

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

Distribution of this document is unlimited.

Expiration

August 21, 1996

Contents

   Abstract
   Status of this Memo
   Contents
   1. Introduction
   2. Terminology
   3. Protocol overview
   4. Protocol Requirements
   5. Implementation suggestions
      5.1. Layering
      5.2. Storing PMTU information
      5.3. Purging stale PMTU information
      5.4. TCP layer actions
      5.5. Issues for other transport protocols
      5.6. Management interface
   6. Security considerations
   Acknowledgements
   Appendix A - Comparison to RFC 1191
   References
   Authors' Addresses

1. Introduction

When one IPv6 node has a large amount of data to send to another node, the data is transmitted in a series of IPv6 packets. It is usually preferable that these packets be of the largest size that can successfully traverse the path from the source node to the destination node. This packet size is referred to as the Path MTU (PMTU), and it is equal to the minimum link MTU of all the links in a path. IPv6 defines a standard mechanism for a node to discover the PMTU of an arbitrary path.

Nodes not implementing Path MTU Discovery use the IPv6 minimum link MTU defined in [IPv6-SPEC] as the maximum packet size. In most cases, this will result in the use of smaller packets than necessary, because most paths have a PMTU greater than the IPv6 minimum link MTU. A node sending packets much smaller than the Path MTU allows is wasting network resources and probably getting suboptimal throughput.

2. Terminology

   node        - a device that implements IPv6.

   router      - a node that forwards IPv6 packets not explicitly addressed to itself.

   host        - any node that is not a router.

   upper layer - a protocol layer immediately above IPv6. Examples are transport protocols such as TCP and UDP, control protocols such as ICMP, routing protocols such as OSPF, and internet or lower-layer protocols being "tunneled" over (i.e., encapsulated in) IPv6 such as IPX, AppleTalk, or IPv6 itself.

   link        - a communication facility or medium over which nodes can communicate at the link layer, i.e., the layer immediately below IPv6. Examples are Ethernets (simple or bridged); PPP links; X.25, Frame Relay, or ATM networks; and internet (or higher) layer "tunnels", such as tunnels over IPv4 or IPv6 itself.

   interface   - a node's attachment to a link.

   address     - an IPv6-layer identifier for an interface or a set of interfaces.

   packet      - an IPv6 header plus payload.

   link MTU    - the maximum transmission unit, i.e., maximum packet size in octets, that can be conveyed in one piece over a link.

   path        - the set of links traversed by a packet between a source node and a destination node.

   path MTU    - the minimum link MTU of all the links in a path between a source node and a destination node.

   PMTU        - path MTU.

   Path MTU
   Discovery   - the process by which a node learns the PMTU of a path.

   flow        - a sequence of packets sent from a particular source to a particular (unicast or multicast) destination for which the source desires special handling by the intervening routers.

   flow id     - a combination of a source address and a non-zero flow label.
3. Protocol overview

This memo describes a technique to dynamically discover the PMTU of a path. The basic idea is that a source node initially assumes that the PMTU of a path is the (known) MTU of the first hop in the path. If any of the packets sent on that path are too large to be forwarded by some node along the path, that node will discard them and return ICMPv6 Packet Too Big messages [ICMPv6]. Upon receipt of such a message, the source node reduces its assumed PMTU for the path based on the MTU of the constricting hop as reported in the Packet Too Big message.

The Path MTU Discovery process ends when the node's estimate of the PMTU is less than or equal to the actual PMTU. Note that several iterations of the packet-sent/Packet-Too-Big-message-received cycle may occur before the Path MTU Discovery process ends, as there may be links with smaller MTUs further along the path.

Alternatively, the node may elect to end the discovery process by ceasing to send packets larger than the IPv6 minimum link MTU.

The PMTU of a path may change over time, due to changes in the routing topology. Reductions of the PMTU are detected by Packet Too Big messages. To detect increases in a path's PMTU, a node periodically increases its assumed PMTU. This will almost always result in packets being discarded and Packet Too Big messages being generated, because in most cases the PMTU of the path will not have changed. Therefore, attempts to detect increases in a path's PMTU should be done infrequently.

Path MTU Discovery supports multicast as well as unicast destinations. In the case of a multicast destination, copies of a packet may traverse many different paths to many different nodes. Each path may have a different PMTU, and a single multicast packet may result in multiple Packet Too Big messages, each reporting a different next-hop MTU. The minimum PMTU value across the set of paths in use determines the size of subsequent packets sent to the multicast destination.

Note that Path MTU Discovery must be performed even in cases where a node "thinks" a destination is attached to the same link as itself. In a situation such as when a neighboring router acts as proxy [ND] for some destination, the destination can appear to be directly connected but is in fact more than one hop away.
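As a non-normative illustration of the overview above, the sender-side loop can be sketched in a few lines of Python. The helper send_probe() and its return convention (the MTU reported by a Packet Too Big message, or None if no such message arrived) are invented for this example and are not part of any defined interface.

   IPV6_MINIMUM_LINK_MTU = 576   # per [IPv6-SPEC] (RFC 1883); later IPv6 specifications use 1280

   def discover_pmtu(send_probe, first_hop_mtu):
       """Lower the PMTU estimate until packets of that size stop eliciting
       Packet Too Big messages, or until the minimum link MTU is reached."""
       pmtu = first_hop_mtu                 # initial assumption: first-hop MTU
       while pmtu > IPV6_MINIMUM_LINK_MTU:
           reported_mtu = send_probe(pmtu)  # None: no Packet Too Big received
           if reported_mtu is None or reported_mtu >= pmtu:
               break                        # estimate <= actual PMTU (or bogus report)
           pmtu = max(reported_mtu, IPV6_MINIMUM_LINK_MTU)
       return pmtu

   # Example: a path whose narrowest link has an MTU of 1400 octets.
   print(discover_pmtu(lambda size: 1400 if size > 1400 else None, 1500))   # 1400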
4. Protocol Requirements

When a node receives a Packet Too Big message, it MUST reduce its estimate of the PMTU for the relevant path, based on the value of the MTU field in the message. The precise behavior of a node in this circumstance is not specified, since different applications may have different requirements, and since different implementation architectures may favor different strategies.

After receiving a Packet Too Big message, a node MUST attempt to avoid eliciting more such messages in the near future. The node MUST reduce the size of the packets it is sending along the path. Using a PMTU estimate larger than the IPv6 minimum link MTU may continue to elicit Packet Too Big messages. Since each of these messages (and the dropped packets they respond to) consumes network resources, the node MUST force the Path MTU Discovery process to end.

Nodes using Path MTU Discovery MUST detect decreases in PMTU as fast as possible. Nodes MAY detect increases in PMTU, but because doing so requires sending packets larger than the current estimated PMTU, and because the likelihood is that the PMTU will not have increased, this MUST be done at infrequent intervals. An attempt to detect an increase (by sending a packet larger than the current estimate) MUST NOT be done less than 5 minutes after a Packet Too Big message has been received for the given path. The recommended setting for this timer is twice its minimum value (10 minutes).

A node MUST NOT reduce its estimate of the Path MTU below the IPv6 minimum link MTU.

   Note: A node may receive a Packet Too Big message reporting a next-hop MTU that is less than the IPv6 minimum link MTU. In that case, the node is not required to reduce the size of subsequent packets sent on the path to less than the IPv6 minimum link MTU, but rather must include a Fragment header in those packets [IPv6-SPEC].

A node MUST NOT increase its estimate of the Path MTU in response to the contents of a Packet Too Big message. A message purporting to announce an increase in the Path MTU might be a stale packet that has been floating around in the network, a false packet injected as part of a denial-of-service attack, or the result of having multiple paths to the destination, each with a different PMTU.
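These requirements can be restated as a small decision procedure applied to each received Packet Too Big message. The following Python sketch is one possible, non-normative reading; the function name and the returned flag are invented for illustration.

   IPV6_MINIMUM_LINK_MTU = 576   # per [IPv6-SPEC] (RFC 1883); later IPv6 specifications use 1280

   def apply_packet_too_big(current_pmtu, reported_mtu):
       """Return (new_pmtu, add_fragment_header) for one Packet Too Big message.

       The estimate is only ever reduced, never increased, by such a message,
       and never below the IPv6 minimum link MTU.  If the reported MTU is
       below that minimum, subsequent packets instead carry a Fragment header."""
       if reported_mtu >= current_pmtu:
           return current_pmtu, False           # MUST NOT increase the estimate
       if reported_mtu < IPV6_MINIMUM_LINK_MTU:
           return IPV6_MINIMUM_LINK_MTU, True   # keep the floor, add a Fragment header
       return reported_mtu, False

   assert apply_packet_too_big(1500, 1400) == (1400, False)
   assert apply_packet_too_big(1400, 9000) == (1400, False)   # purported increase ignored
   assert apply_packet_too_big(1400, 500)  == (576, True)     # below the minimum link MTU

An analogous guard, not shown, would keep a per-path timer so that an attempt to detect a larger PMTU is made no sooner than 5 minutes (recommended: 10 minutes) after the last Packet Too Big message for that path.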
5. Implementation suggestions

This section discusses a number of issues related to the implementation of Path MTU Discovery. This is not a specification, but rather a set of notes provided as an aid for implementors.

The issues include:

   - What layer or layers implement Path MTU Discovery?

   - How is the PMTU information cached?

   - How is stale PMTU information removed?

   - What must transport and higher layers do?

5.1. Layering

In the IP architecture, the choice of what size packet to send is made by a protocol at a layer above IP. This memo refers to such a protocol as a "packetization protocol". Packetization protocols are usually transport protocols (for example, TCP) but can also be higher-layer protocols (for example, protocols built on top of UDP).

Implementing Path MTU Discovery in the packetization layers simplifies some of the inter-layer issues, but has several drawbacks: the implementation may have to be redone for each packetization protocol, it becomes hard to share PMTU information between different packetization layers, and the connection-oriented state maintained by some packetization layers may not easily extend to save PMTU information for long periods.

It is therefore suggested that the IP layer store PMTU information and that the ICMP layer process received Packet Too Big messages. The packetization layers may respond to changes in the PMTU by changing the size of the messages they send. To support this layering, packetization layers require a way to learn of changes in the value of MMS_S, the "maximum send transport-message size". The MMS_S is derived from the Path MTU by subtracting the size of the IPv6 header plus space reserved by the IP layer for additional headers (if any).
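For example (the numbers are illustrative only): with a PMTU of 1500 octets, a 40-octet IPv6 header, and no space reserved for additional headers, MMS_S is 1460 octets. A minimal sketch of the derivation:

   IPV6_HEADER_LEN = 40   # fixed IPv6 header size, in octets

   def mms_s(pmtu, reserved_header_len=0):
       """Maximum send transport-message size for a given Path MTU.
       reserved_header_len is whatever the IP layer sets aside for
       extension headers (if any)."""
       return pmtu - IPV6_HEADER_LEN - reserved_header_len

   print(mms_s(1500))      # 1460
   print(mms_s(1500, 8))   # 1452, e.g. with 8 octets reserved for a Fragment header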
It is possible that a packetization layer, perhaps a UDP application outside the kernel, is unable to change the size of messages it sends. This may result in a packet size that exceeds the Path MTU. To accommodate such situations, IPv6 defines a mechanism that allows large payloads to be divided into fragments, with each fragment sent in a separate packet (see [IPv6-SPEC] section "Fragment Header"). However, packetization layers are encouraged to avoid sending messages that will require fragmentation (for the case against fragmentation, see [FRAG]).

5.2. Storing PMTU information

Ideally, a PMTU value should be associated with a specific path traversed by packets exchanged between the source and destination nodes. However, in most cases a node will not have enough information to completely and accurately identify such a path. Rather, a node must associate a PMTU value with some local representation of a path. It is left to the implementation to select the local representation of a path.

In the case of a multicast destination address, copies of a packet may traverse many different paths to reach many different nodes. The local representation of the "path" to a multicast destination must in fact represent a potentially large set of paths.

Minimally, an implementation could maintain a single PMTU value to be used for all packets originated from the node. This PMTU value would be the minimum PMTU learned across the set of all paths in use by the node. This approach is likely to result in the use of smaller packets than is necessary for many paths.

An implementation could use the destination address as the local representation of a path. The PMTU value associated with a destination would be the minimum PMTU learned across the set of all paths in use to that destination. The set of paths in use to a particular destination is expected to be small, in many cases consisting of a single path. This approach will result in the use of optimally sized packets on a per-destination basis. This approach integrates nicely with the conceptual model of a host as described in [ND]: a PMTU value could be stored with the corresponding entry in the destination cache.

If flows [IPv6-SPEC] are in use, an implementation could use the flow id as the local representation of a path. Packets sent to a particular destination but belonging to different flows may use different paths, with the choice of path depending on the flow id. This approach will result in the use of optimally sized packets on a per-flow basis, providing finer granularity than PMTU values maintained on a per-destination basis.

For source routed packets (i.e., packets containing an IPv6 Routing header [IPv6-SPEC]), the source route may further qualify the local representation of a path. In particular, a packet containing a type 0 Routing header in which all bits in the Strict/Loose Bit Map are equal to 1 contains a complete path specification. An implementation could use source route information in the local representation of a path.

   Note: Some paths may be further distinguished by different security classifications. The details of such classifications are beyond the scope of this memo.

Initially, the PMTU value for a path is assumed to be the (known) MTU of the first-hop link.

When a Packet Too Big message is received, the node determines which path the message applies to based on the contents of the Packet Too Big message. For example, if the destination address is used as the local representation of a path, the destination address from the original packet would be used to determine which path the message applies to.

   Note: if the original packet contained a Routing header, the Routing header should be used to determine the location of the destination address within the original packet. If Segments Left is equal to zero, the destination address is in the Destination Address field in the IPv6 header. If Segments Left is greater than zero, the destination address is the last address (Address[n]) in the Routing header.

The node then uses the value in the MTU field in the Packet Too Big message as a tentative PMTU value, and compares the tentative PMTU to the existing PMTU. If the tentative PMTU is less than the existing PMTU estimate, the tentative PMTU replaces the existing PMTU as the PMTU value for the path.
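The per-destination bookkeeping described above might look like the following non-normative sketch; the class and method names are invented for illustration, and in the conceptual host model of [ND] the same state would simply live in the destination cache.

   import time

   IPV6_MINIMUM_LINK_MTU = 576   # per [IPv6-SPEC] (RFC 1883)

   class PmtuCache:
       """PMTU estimates keyed by destination address (one possible local
       representation of a path)."""

       def __init__(self, first_hop_mtu):
           self.first_hop_mtu = first_hop_mtu
           self.entries = {}   # destination -> (pmtu, time of last decrease)

       def lookup(self, dst):
           # A destination with no entry starts at the first-hop link MTU.
           return self.entries.get(dst, (self.first_hop_mtu, None))[0]

       def packet_too_big(self, dst, reported_mtu):
           """Apply the MTU field of a Packet Too Big message to one path.
           Returns True if the estimate decreased, in which case the
           packetization layers using the path must be notified."""
           tentative = max(reported_mtu, IPV6_MINIMUM_LINK_MTU)
           if tentative < self.lookup(dst):
               self.entries[dst] = (tentative, time.monotonic())
               return True
           return False

   cache = PmtuCache(first_hop_mtu=1500)
   cache.packet_too_big("2001:db8::1", 1400)   # hypothetical destination address
   print(cache.lookup("2001:db8::1"))          # 1400

A finer-grained variant could key the same structure on a (destination, flow label) pair or on a source route, as discussed earlier in this section.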
The packetization layers must be notified about decreases in the PMTU. Any packetization layer instance (for example, a TCP connection) that is actively using the path must be notified if the PMTU estimate is decreased.

   Note: even if the Packet Too Big message contains an Original Packet Header that refers to a UDP packet, the TCP layer must be notified if any of its connections use the given path.

Also, the instance that sent the packet that elicited the Packet Too Big message should be notified that its packet has been dropped, even if the PMTU estimate has not changed, so that it may retransmit the dropped data.

   Note: An implementation can avoid the use of an asynchronous notification mechanism for PMTU decreases by postponing notification until the next attempt to send a packet larger than the PMTU estimate. In this approach, when an attempt is made to SEND a packet that is larger than the PMTU estimate, the SEND function should fail and return a suitable error indication. This approach may be more suitable to a connectionless packetization layer (such as one using UDP), which (in some implementations) may be hard to "notify" from the ICMP layer. In this case, the normal timeout-based retransmission mechanisms would be used to recover from the dropped packets.

It is important to understand that the notification of the packetization layer instances using the path about the change in the PMTU is distinct from the notification of a specific instance that a packet has been dropped. The latter should be done as soon as practical (i.e., asynchronously from the point of view of the packetization layer instance), while the former may be delayed until a packetization layer instance wants to create a packet. Retransmission should be done only for those packets that are known to be dropped, as indicated by a Packet Too Big message.

5.3. Purging stale PMTU information

Internetwork topology is dynamic; routes change over time. While the local representation of a path may remain constant, the actual path(s) in use may change. Thus, PMTU information cached by a node can become stale.

If the stale PMTU value is too large, this will be discovered almost immediately once a large enough packet is sent on the path. No such mechanism exists for realizing that a stale PMTU value is too small, so an implementation should "age" cached values. When a PMTU value has not been decreased for a while (on the order of 10 minutes), the PMTU estimate should be set to the MTU of the first-hop link, and the packetization layers should be notified of the change. This will cause the complete Path MTU Discovery process to take place again.

   Note: an implementation should provide a means for changing the timeout duration, including setting it to "infinity". For example, nodes attached to an FDDI link which is then attached to the rest of the Internet via a small MTU serial line are never going to discover a new non-local PMTU, so they should not have to put up with dropped packets every 10 minutes.

An upper layer must not retransmit data in response to an increase in the PMTU estimate, since this increase never comes in response to an indication of a dropped packet.

One approach to implementing PMTU aging is to associate a timestamp field with a PMTU value. This field is initialized to a "reserved" value, indicating that the PMTU is equal to the MTU of the first-hop link. Whenever the PMTU is decreased in response to a Packet Too Big message, the timestamp is set to the current time.

Once a minute, a timer-driven procedure runs through all cached PMTU values, and for each PMTU whose timestamp is not "reserved" and is older than the timeout interval (see the sketch after this list):

   - The PMTU estimate is set to the MTU of the first-hop link.

   - The timestamp is set to the "reserved" value.

   - Packetization layers using this path are notified of the increase.
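A non-normative sketch of that timer-driven procedure, assuming cache entries shaped like those in the Section 5.2 sketch (a destination mapped to a (pmtu, timestamp) pair, with None as the "reserved" timestamp):

   import time

   RESERVED = None          # "reserved" timestamp: PMTU equals the first-hop MTU
   PMTU_AGE_TIMEOUT = 600   # timeout interval in seconds; 10 minutes is the suggested default

   def age_pmtu_cache(entries, first_hop_mtu, notify_increase, now=None):
       """Run once a minute: reset estimates that have not decreased recently."""
       now = time.monotonic() if now is None else now
       for dst, (_, stamp) in entries.items():
           if stamp is not RESERVED and now - stamp > PMTU_AGE_TIMEOUT:
               entries[dst] = (first_hop_mtu, RESERVED)   # back to the first-hop MTU
               notify_increase(dst, first_hop_mtu)        # tell the packetization layers

   # Example: an entry whose PMTU was last decreased 15 minutes ago is reset.
   cache = {"2001:db8::1": (1400, time.monotonic() - 900)}
   age_pmtu_cache(cache, 1500, lambda dst, mtu: print(dst, "->", mtu))
   print(cache)   # {'2001:db8::1': (1500, None)}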
5.4. TCP layer actions

The TCP layer must track the PMTU for the path(s) in use by a connection; it should not send segments that would result in packets larger than the PMTU. A simple implementation could ask the IP layer for this value each time it created a new segment, but this could be inefficient. Moreover, TCP implementations that follow the "slow-start" congestion-avoidance algorithm [CONG] typically calculate and cache several other values derived from the PMTU. It may be simpler to receive asynchronous notification when the PMTU changes, so that these variables may be updated.

A TCP implementation must also store the MSS value received from its peer, and must not send any segment larger than this MSS, regardless of the PMTU. In 4.xBSD-derived implementations, this may require adding an additional field to the TCP state record.

The value sent in the TCP MSS option is independent of the PMTU. This MSS option value is used by the other end of the connection, which may be using an unrelated PMTU value. See [IPv6-SPEC] sections "Packet Size Issues" and "Maximum Upper-Layer Payload Size" for information on selecting a value for the TCP MSS option.

When a Packet Too Big message is received, it implies that a packet was dropped by the node that sent the ICMP message. It is sufficient to treat this as any other dropped segment, and wait until the retransmission timer expires to cause retransmission of the segment. If the Path MTU Discovery process requires several steps to find the PMTU of the full path, this could delay the connection by many round-trip times.

Alternatively, the retransmission could be done in immediate response to a notification that the Path MTU has changed, but only for the specific connection specified by the Packet Too Big message. The packet size used in the retransmission should be no larger than the new PMTU.

   Note: A packetization layer must not retransmit in response to every Packet Too Big message, since a burst of several oversized segments will give rise to several such messages and hence several retransmissions of the same data. If the new estimated PMTU is still wrong, the process repeats, and there is an exponential growth in the number of superfluous segments sent. This means that the TCP layer must be able to recognize when a Packet Too Big notification actually decreases the PMTU that it has already used to send a packet on the given connection, and should ignore any other notifications.

Many TCP implementations incorporate "congestion avoidance" and "slow-start" algorithms to improve performance [CONG]. Unlike a retransmission caused by a TCP retransmission timeout, a retransmission caused by a Packet Too Big message should not change the congestion window. It should, however, trigger the slow-start mechanism (i.e., only one segment should be retransmitted until acknowledgements begin to arrive again).

TCP performance can be reduced if the sender's maximum window size is not an exact multiple of the segment size in use (this is not the congestion window size, which is always a multiple of the segment size). In many systems (such as those derived from 4.2BSD), the segment size is often set to 1024 octets, and the maximum window size (the "send space") is usually a multiple of 1024 octets, so the proper relationship holds by default. If Path MTU Discovery is used, however, the segment size may not be a submultiple of the send space, and it may change during a connection; this means that the TCP layer may need to change the transmission window size when Path MTU Discovery changes the PMTU value. The maximum window size should be set to the greatest multiple of the segment size that is less than or equal to the sender's buffer space size.
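As a non-normative illustration of the two limits on segment size (the PMTU-derived limit and the peer's MSS) and of rounding the window to a multiple of the segment size, assuming a plain IPv6 header and a 20-octet TCP header with no options:

   IPV6_HEADER_LEN = 40
   TCP_HEADER_LEN = 20    # without TCP options

   def tcp_segment_size(pmtu, peer_mss, reserved_header_len=0):
       """Largest segment to send: bounded by the path (PMTU minus the IPv6
       and TCP headers) and by the MSS advertised by the peer."""
       path_limit = pmtu - IPV6_HEADER_LEN - reserved_header_len - TCP_HEADER_LEN
       return min(path_limit, peer_mss)

   def max_window(send_buffer, segment_size):
       """Greatest multiple of the segment size not exceeding the send buffer."""
       return (send_buffer // segment_size) * segment_size

   seg = tcp_segment_size(pmtu=1500, peer_mss=1440)
   print(seg)                        # 1440
   print(max_window(65536, seg))     # 64800 (45 segments of 1440 octets)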
5.5. Issues for other transport protocols

Some transport protocols (such as ISO TP4 [ISOTP]) are not allowed to repacketize when doing a retransmission. That is, once an attempt is made to transmit a segment of a certain size, the transport cannot split the contents of the segment into smaller segments for retransmission. In such a case, the original segment can be fragmented by the IP layer during retransmission. Subsequent segments, when transmitted for the first time, should be no larger than allowed by the Path MTU.

The Sun Network File System (NFS) uses a Remote Procedure Call (RPC) protocol [RPC] that, when used over UDP, in many cases will generate payloads that must be fragmented even for the first-hop link. This might improve performance in certain cases, but it is known to cause reliability and performance problems, especially when the client and server are separated by routers.

It is recommended that NFS implementations use Path MTU Discovery whenever routers are involved. Most NFS implementations allow the RPC datagram size to be changed at mount-time (indirectly, by changing the effective file system block size), but might require some modification to support changes later on.

Also, since a single NFS operation cannot be split across several UDP datagrams, certain operations (primarily, those operating on file names and directories) require a minimum payload size that, if sent in a single packet, would exceed the PMTU. NFS implementations should not reduce the payload size below this threshold, even if Path MTU Discovery suggests a lower value. In this case the payload will be fragmented by the IP layer.

5.6. Management interface

It is suggested that an implementation provide a way for a system utility program to:

   - Specify that Path MTU Discovery not be done on a given path.

   - Change the PMTU value associated with a given path.

The former can be accomplished by associating a flag with the path; when a packet is sent on a path with this flag set, the IP layer does not send packets larger than the IPv6 minimum link MTU.

These features might be used to work around an anomalous situation, or by a routing protocol implementation that is able to obtain Path MTU values.

The implementation should also provide a way to change the timeout period for aging stale PMTU information.
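A minimal, non-normative sketch of such a management interface; the configuration layout and names are invented for illustration:

   IPV6_MINIMUM_LINK_MTU = 576   # per [IPv6-SPEC] (RFC 1883)

   # Hypothetical per-path management state: a utility program can disable
   # Path MTU Discovery on a path, or set the path's PMTU value directly.
   path_config = {
       "2001:db8::1": {"pmtu_discovery": False, "pmtu": None},
       "2001:db8::2": {"pmtu_discovery": True,  "pmtu": 1400},
   }

   def max_packet_size(dst, first_hop_mtu):
       cfg = path_config.get(dst, {"pmtu_discovery": True, "pmtu": None})
       if not cfg["pmtu_discovery"]:
           # Discovery disabled on this path: never exceed the minimum link MTU.
           return IPV6_MINIMUM_LINK_MTU
       return cfg["pmtu"] if cfg["pmtu"] is not None else first_hop_mtu

   print(max_packet_size("2001:db8::1", 1500))   # 576
   print(max_packet_size("2001:db8::2", 1500))   # 1400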
Deering, "Internet Control Message 631 Protocol (ICMPv6) for the Internet Protocol Version 6 632 (IPv6) Specification", RFC 1885, December 1995 634 [IPv6-SPEC] S. Deering and R. Hinden, "Internet Protocol, Version 6 635 (IPv6) Specification", RFC 1883, December 1995 637 [ISOTP] ISO. ISO Transport Protocol Specification: ISO DP 8073. 638 RFC 905, SRI Network Information Center, April, 1984. 640 [ND] T. Narten, E. Nordmark, and W. Simpson, "Neighbor 641 Discovery for IP Version 6 (IPv6)", work in progress 642 draft-ietf-ipngwg-discovery-04.txt, February 1996. 644 [RFC-1191] J. Mogul and S. Deering, "Path MTU Discovery", 645 November 1990 647 [RPC] Sun Microsystems, Inc. RPC: Remote Procedure Call 648 Protocol. RFC 1057, SRI Network Information Center, 649 June, 1988. 651 Authors' Addresses 653 Jack McCann 654 Digital Equipment Corporation 655 110 Spitbrook Road, ZKO3-3/U14 656 Nashua, NH 03062 657 Phone: +1 603 881 2608 658 Fax: +1 603 881 0120 659 Email: mccann@zk3.dec.com 661 Stephen E. Deering 662 Xerox Palo Alto Research Center 663 3333 Coyote Hill Road 664 Palo Alto, CA 94304 665 Phone: +1 415 812 4839 666 Fax: +1 415 812 4471 667 Email: deering@parc.xerox.com 669 Jeffrey Mogul 670 Digital Equipment Corporation Western Research Laboratory 671 250 University Avenue 672 Palo Alto, CA 94301 673 Phone: +1 415 617 3304 674 Email: mogul@pa.dec.com 676 Expiration 678 August 21, 1996