idnits 2.17.1 draft-welzl-pmtud-options-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords -- however, there's a paragraph with a matching beginning. Boilerplate error? RFC 2119 keyword, line 160: '...-hop network. Also, the host MUST set...' RFC 2119 keyword, line 170: '...a datagram containing this option MUST...' RFC 2119 keyword, line 173: '...n, the MTU field MUST be set to the lo...' RFC 2119 keyword, line 177: '... changes). Additionally, a router MUST...' RFC 2119 keyword, line 192: '...tream routers. It MUST be initialized...' (23 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). 
Found 'SHOULD not' in this paragraph: A host that implements PMTU-Options SHOULD wait for feedback to arrive before initiating the subsequent PMTUD or PLPMTUD process, which remains unchanged except for one difference: the MTU during the process SHOULD not exceed the value received from PMTU-Options. Feedback from PMTU-Options SHOULD not be kept any longer -- it is only intended as an aid for the subsequent PMTUD or PLPMTUD process. Since packet drops can substantially delay the reception of feedback, a host MAY use a timer to initiate PTMUD or PLPMTUD even when no feedback has arrived; if such a timer is implemented, a means to configure this timer MUST be provided. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (February 2003) is 7733 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC1435' is defined on line 918, but no explicit reference was found in the text ** Obsolete normative reference: RFC 1981 (Obsoleted by RFC 8201) ** Downref: Normative reference to an Informational RFC: RFC 1435 ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) ** Downref: Normative reference to an Informational RFC: RFC 2923 ** Obsolete normative reference: RFC 2960 (Obsoleted by RFC 4960) ** Obsolete normative reference: RFC 2402 (Obsoleted by RFC 4302, RFC 4305) -- Obsolete informational reference (is this intentional?): RFC 1063 (Obsoleted by RFC 1191) -- Obsolete informational reference (is this intentional?): RFC 2406 (Obsoleted by RFC 4303, RFC 4305) Summary: 9 errors (**), 0 flaws (~~), 4 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 INTERNET-DRAFT Michael Welzl 3 draft-welzl-pmtud-options-01.txt University of Innsbruck 4 Institute of Computer Science 5 Experimental 9. February 2003 7 PMTU-Options: Path MTU Discovery Using Options 9 Status of this Memo 11 This document is an Internet-Draft and is in full conformance with 12 all provisions of Section 10 of RFC 2026. 14 Internet-Drafts are working documents of the Internet Engineering 15 Task Force (IETF), its areas, and its working groups. Note that 16 other groups may also distribute working documents as Internet- 17 Drafts. 19 Internet-Drafts are draft documents valid for a maximum of six months 20 and may be updated, replaced, or obsoleted by other documents at any 21 time. 
It is inappropriate to use Internet- Drafts as reference 22 material or to cite them other than as "work in progress." 24 The list of current Internet-Drafts can be accessed at 25 http://www.ietf.org/ietf/1id-abstracts.txt 27 The list of Internet-Draft Shadow Directories can be accessed at 28 http://www.ietf.org/shadow.html 30 Abstract 32 This document describes an experimental enhancement of Path MTU 33 Discovery for special scenarios (e.g., tunnels, or to detect PMTU 34 increases). It has the potential of reducing loss, speeding up 35 convergence, reducing load in routers which would otherwise need to 36 generate a large amount of ICMP messages, and alleviating certain 37 additional problems (interactions with tunnels, Black Hole 38 Detection). The idea is to use an IP Option which queries routers for 39 their MTU before starting a Path MTU Discovery process. The result 40 retrieved in this manner is used as an upper limit for Path MTU 41 Discovery. To this end, it is fed back to the source either at the 42 packetization layer (recommended) or at the IP layer. 44 Changes from draft-welzl-pmtud-options-00.txt: 46 o The addition of a "TTL-Check" field. 48 o The addition of a section on packet drops due to options. 50 o Update of section 5.1 on the impact of Slow Path processing. 52 o Update of references. 54 Table of Contents 56 Status of this Memo ....................................... 1 57 1. Definitions ....................................... 2 58 2. Introduction ...................................... 2 59 3. Specification ..................................... 3 60 3.1. Probe MTU Option Format for IPv4 ................ 3 61 3.2. Feedback Format ................................. 4 62 3.2.1 IP ............................................ 4 63 3.2.2 TCP ........................................... 5 64 3.2.3 SCTP .......................................... 5 65 3.2.4 DCCP .......................................... 6 66 3.3. 
Host Operation .................................. 7 67 3.4. IPv6 Usage ...................................... 8 68 3.4.1 Probe MTU Option Format for IPv6 .............. 8 69 3.4.2 Feedback Format: IP ........................... 9 70 3.4.3 Feedback Format: TCP .......................... 11 71 3.4.4 Feedback Format: SCTP ......................... 11 72 3.4.5 Feedback Format: DCCP ......................... 12 73 3.5. IP Tunnels ...................................... 12 74 4. Potential Advantages .............................. 13 75 4.1. Reducing Packet Loss ............................ 13 76 4.2. Circumventing Black Hole Detection .............. 13 77 4.3. Other Problems with ICMP Fragmentation Needed ... 14 78 4.4. Circumventing Problems with IP Tunnels .......... 14 79 5. Discussion of Issues .............................. 15 80 5.1. Slow Path Processing ............................ 15 81 5.2. Dropped Packets ................................. 16 82 5.2. Placing Additional Burden on Routers ............ 16 83 5.3. Motivating Deployment ........................... 17 84 6. Discussion of Usage Scenarios ..................... 17 85 6.1. Detecting PMTU Increases ........................ 17 86 6.2. RTT-Robust Transport Protocols .................. 18 87 6.3. Tunnels ......................................... 18 88 7. Related Work ...................................... 18 89 8. Security Considerations ........................... 18 90 9. IANA Considerations ............................... 19 91 10. Normative References .............................. 20 92 11. Informative References ............................ 20 93 12. Acknowledgements .................................. 21 94 13. Author's Address .................................. 22 95 14. Full Copyright Statement .......................... 23 97 1. 
Definitions 99 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 100 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY" and "OPTIONAL" in this 101 document are to be interpreted as described in RFC 2119. 103 Throughout this document, "Path MTU Discovery" (PMTUD) refers to the 104 mechanism described in RFC 1191 [RFC1191] and "Packetization Layer 105 Path MTU Discovery" (PLPMTUD) refers to the mechanism described in 106 [draft-PLPMTUD]. The mechanism described in this document is called 107 "Path MTU Discovery using Options" (PMTU-Options). 109 In the IP architecture, the choice of what size datagram to send is 110 made by a protocol at a layer above IP. We refer to such a protocol 111 as a "packetization protocol". Packetization protocols are usually 112 transport protocols (for example, TCP) but can also be higher-layer 113 protocols (for example, protocols built on top of UDP). 115 2. Introduction 117 This memo specifies how options can be used as an enhancement for 118 PMTUD and PLPMTUD. The method resembles the mechanism described in 119 [RFC1063]: a sender includes an IP Option containing the MTU of its 120 outgoing link. Upon forwarding, each router compares the value with 121 the MTUs of the links which are traversed by the datagram and updates 122 the field if one of the MTUs of its links is smaller. If all routers 123 support this scheme, the receiver has the correct MTU in the option 124 and can communicate it back to the sender (preferably at the 125 packetization layer instead of the IP layer as specified in 126 [RFC1063]). 128 The main difference is that the mechanism described in this document 129 does not rely on all routers along a path to support the IP option. 130 Instead, when this scheme is carried out just before doing PMTUD or 131 PLPMTUD, the result is used as an upper limit for the MTU of the path 132 (i.e. the Path MTU will definitely not exceed the value obtained with 133 this mechanism), no matter how many routers support it. 
This method 134 has several potential advantages over standard PMTUD or PLPMTUD 135 (listed in section 4) but also some issues (listed in section 5). 136 Being an experimental specification, it is mainly intended for 137 special usage scenarios, some of which are described in section 6. 139 3. Specification 141 3.1. Probe MTU Option Format for IPv4 143 The "Probe MTU Option" for IPv4 that has routers update the MTU value 144 (IP option number 11) is specified as in [RFC1063], with the 145 exception of additional "TTL-Check" and "MTU Nonce" fields: 147 0 1 2 3 148 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 149 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 150 | Type = 11 | Size = 8 | MTU | 151 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 152 | TTL-Check | MTU Nonce | 153 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 155 This option always contains the lowest MTU of all the links that have 156 been traversed so far by the datagram. 158 A host that sends this option must initialize the MTU field to be the 159 MTU of the directly-connected network. If the host is multi-homed, 160 this should be for the first-hop network. Also, the host MUST set 161 the TTL-Check field to a random value. It also calculates and 162 remembers TTL-Diff, the difference between the TTL value and the 163 value of TTL-Check in the transmitted packet, as follows: 165 TTL-Diff = ( TTL - TTL-Check ) mod 256 167 The purpose of this calculation is to detect whether all routers 168 along the path supported the option. 170 Each router that receives a datagram containing this option MUST 171 compare the MTU field with the MTUs of the inbound and outbound links 172 for the datagram. If either MTU is lower than the value in the MTU 173 field of the option, the MTU field MUST be set to the lower MTU. 
174 (Note that routers conforming to RFC-1812 may not know either the 175 inbound interface or the outbound interface at the time that IP 176 options are processed. Accordingly, support for this option may 177 require major router software changes). Additionally, a router MUST 178 decrement the TTL-Check field on forwarding. 180 Any host receiving a datagram which contains this option should 181 confirm that the value of the MTU field of the option is less than or 182 equal to that of the inbound link, and if necessary, reduce the MTU 183 field value, before processing the option. 185 If the receiving host is not able to accept datagrams as large as 186 specified by the value of the MTU field of the option, then it should 187 reduce the MTU field to the size of the largest datagram it can 188 accept. 190 The MTU Nonce field is a means to prevent attackers from hiding the 191 fact that a router has updated the MTU field and provide some 192 protection against broken downstream routers. It MUST be initialized 193 with a random non-zero 24-bit number by the sender. The number MUST 194 be kept for later comparison with the Nonce value which is fed back 195 to the sender. A router which updates the MTU field in the option 196 MUST set the MTU Nonce field to 0. 198 3.2. Feedback Format 200 IP Option processing is known to be a costly operation. To avoid 201 placing this burden on routers along the backward path, feedback 202 SHOULD be stored at the packetization layer; it MAY be stored at the 203 IP layer in special cases (e.g. if PMTU-Options is used by a UDP 204 based application). Header extensions are specified for IP, TCP, 205 SCTP and DCCP. A host implementing PMTU-Options SHOULD react to this 206 feedback in all of the supported protocols and provide the MTU value 207 to the local PMTUD or PLPMTUD instance. The PMTUD or PLPMTUD instance 208 MUST NOT increase a cached PMTU value in response to PMTU-Options 209 feedback. 
If the MTU value from this feedback is greater than or 210 equal to the cached MTU value and the MTU Nonce is set to 0, there is 211 a chance that an intermediate node or the receiver misbehaved (due to 212 broken software or because of an attack). 214 3.2.1. IP 216 The Reply MTU Option for IPv4 (IP option number 12) is specified as 217 in [RFC1063], with the exception of additional "TTL-Diff" and "MTU 218 Nonce" fields: 220 0 1 2 3 221 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 222 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 223 | Type = 12 | Size = 8 | MTU | 224 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 225 | TTL-Diff | MTU Nonce | 226 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 228 This option is used to return the value learned from a Probe MTU 229 Option to the sender of the Probe MTU Option at the IP level. Since 230 it causes unnecessary processing overhead in routers along the 231 backward path, its use is recommended only for rare special cases. 232 In particular, it may be helpful for UDP-based applications which 233 utilize PMTU-Options. 235 The first octet of this option contains the option type, identifying 236 the IP option. The second octet of this option contains the size 237 field, specifying the option length in octets. The size field is set 238 to 8. The next three fields (two, one and three octets, respectively) 239 contain the result of the PMTU-Options process, TTL-Diff and the MTU 240 Nonce; the MTU and MTU Nonce fields must be copied from the IPv4 241 Probe MTU Option by the receiver. The receiver must set the TTL-Diff 242 field to ( TTL - TTL-Check ) mod 256, where "TTL" is the TTL field in 243 the IP header and "TTL-Check" is the field in the received Probe MTU Option. 245 3.2.2. 
TCP 247 The Reply MTU Option format for TCP over IPv4 is: 249 0 1 2 3 250 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 251 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 252 | Kind | Length = 8 | MTU | 253 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 254 | TTL-Diff | MTU Nonce | 255 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 257 The first octet of this option contains the option kind, identifying 258 the TCP option (to be specified by IANA). The second octet of this 259 option contains the length field, specifying the option length in 260 octets. The length field is set to 8. The MTU, TTL-Diff and MTU Nonce 261 fields are similar to the specification in section 3.2.1. 263 3.2.3. SCTP 265 In SCTP [RFC2960], the sender is informed by the receiver about PMTU- 266 Options feedback by including the Reply MTU chunk. This chunk looks 267 as follows: 269 0 1 2 3 270 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 271 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 272 | Chunk Type | Flags=00000000| Chunk Length = 12 | 273 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 274 | MTU | TTL-Diff | MTU Nonce | 275 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 276 | MTU Nonce (continued) | Padding (0) | 277 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 279 Note: The Reply MTU chunk is considered a Control chunk. 281 The Chunk Type field identifies the chunk; it consists of two high- 282 order bits "11", indicating that a processing SCTP node which does 283 not recognize this chunk type must skip this chunk and continue 284 processing, but report in an ERROR Chunk using the "Unrecognized 285 Chunk Type" cause of error. The sender of the Reply MTU chunk SHOULD 286 send the MTU feedback via some other means (via a different active 287 protocol or an IP option) in response to this ERROR Chunk. 
The 288 trailing six bits are currently undefined (to be specified by IANA). 289 Since this chunk has no specific flags, the Flags field is set to 0. 290 The Chunk Length field is set to 12 (the length of the chunk 291 including the Chunk Type, Flags and Length fields). 293 The MTU, TTL-Diff and MTU Nonce fields are similar to the 294 specification in section 3.2.1; since the length of an SCTP chunk must 295 be a multiple of 4 octets, the last two octets are filled with 0 (padding). 297 3.2.4. DCCP 299 The Reply MTU Option format for DCCP over IPv4 is: 301 0 1 2 3 302 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 303 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 304 | Type | Length = 8 | MTU | 305 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 306 | TTL-Diff | MTU Nonce | 307 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 309 The Type field identifies the option (number to be specified by 310 IANA). The rest of the option is similar to the TCP option defined in 311 section 3.2.2. 313 3.3. Host Operation 315 Generally, the PMTU-Options process is carried out before initiating 316 a regular PMTUD or PLPMTUD process. 318 Since IP options require processing at every router along a path, it 319 is important to ensure that no packets unnecessarily include IP 320 options. Thus, a host implementing PMTU-Options must keep state in 321 routing table entries and only initiate a PMTUD or PLPMTUD (and 322 accompanying PMTU-Options) process when this information becomes 323 stale. The process of purging stale PMTU information is specified in 324 [RFC1191], sections 6.2 and 6.3. Additionally, the number of 325 datagrams carrying IP options should be restricted to a fixed 326 percentage of total datagrams that are sent by a host to ensure 327 scalability. It can also be reduced by using PMTU-Options in special 328 situations only; a discussion of such potential usage scenarios is 329 provided in section 6. 
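The restriction on option-carrying datagrams described above can be read as a simple budget. The following Python sketch is an illustration only and is not part of the specification: the class name ProbeBudget, its methods and the default limit of 1 percent are assumptions, since this document does not name a concrete percentage.

```python
class ProbeBudget:
    """Caps datagrams that carry a Probe MTU Option at a fixed share of all
    datagrams sent by a host.

    Hypothetical helper: the limit value is an assumption, not taken from
    the draft.
    """

    def __init__(self, max_percent=1):
        if not 0 < max_percent <= 100:
            raise ValueError("max_percent must be in the range 1..100")
        self.max_percent = max_percent
        self.total_sent = 0    # all datagrams sent so far
        self.with_option = 0   # datagrams that carried the option

    def may_attach_option(self):
        # Integer arithmetic avoids rounding surprises: would the share stay
        # within the limit if the next datagram carried the option?
        return (self.with_option + 1) * 100 <= self.max_percent * (self.total_sent + 1)

    def record_sent(self, carried_option):
        self.total_sent += 1
        if carried_option:
            self.with_option += 1
```

A sender would call may_attach_option() before adding the option and record_sent() after every transmission; whether such a budget is kept per host or per destination is a design choice left open here.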
331 It is assumed that the datagram size used when doing PMTU-Options is 332 already some sensible value (e.g., the result of a recent PMTU 333 process or the first-hop data-link MTU). When a Probe MTU Option is 334 added to a datagram, the datagram grows by 8 octets (16 octets with 335 IPv6). This number must therefore be taken into account when 336 generating a datagram -- when doing PMTU-Options, the size of the 337 datagram including the Probe MTU Option must not exceed the recent 338 PMTU value. The correct size must be communicated to the 339 packetization layer protocol (see [RFC1191], section 6.4, for a 340 suggestion of how such communication could be implemented). 342 A host receiving a packet that carries the Probe MTU Option MUST feed 343 back the information to the sender by copying the MTU and MTU Nonce 344 fields and TTL-Diff as specified in section 3.2 to the next return 345 datagram. To this end, it SHOULD inform a packetization layer 346 protocol which is communicating with the originator of the Probe MTU 347 Option. Alternatively, a host receiving a packet that carries the 348 Probe MTU Option MAY use the Reply MTU IP Option; this method is not 349 recommended because it involves unnecessary processing overhead in 350 routers on the return path. 352 A host MUST be able to accept MTU feedback from IP and SHOULD be able 353 to accept feedback from all packetization layer protocols. This 354 feedback contains an MTU Nonce value which either has the same value 355 as the MTU Nonce field in the original Probe MTU Option (a random 356 number initialized by the sender), meaning that the initial MTU value 357 in the Probe MTU Option was not changed by routers, or 0, meaning 358 that the initial MTU value in the Probe MTU Option was changed by 359 routers. If the PMTU upper limit from PMTU-Options feedback is 360 greater than the initial PMTU value stored at the sender, the latter 361 value MUST be used. 
If, in such a case, the MTU Nonce was changed, 362 there is a chance that an intermediate node or the receiver 363 misbehaved (due to broken software or because of an attack). 365 The resulting PMTU upper limit at the PMTU-Options sender MUST be 366 communicated to the local PMTUD or PLPMTUD instance. Note that 367 because the options are placed on unreliable datagrams, the original 368 sender will have to resend probes (possibly once per window of data) 369 until it receives feedback. 371 If the value of TTL-Diff in the reply packet is equal to the stored 372 value of TTL-Diff, all routers along the path supported the option. 373 This means that the MTU value in the reply packet is not an upper 374 limit for the PMTU but the actual end result; this fact SHOULD be 375 communicated to the local PMTUD or PLPMTUD instance, which MAY then 376 terminate immediately using the value from the packet carrying the 377 MTU feedback. 379 A host that implements PMTU-Options SHOULD wait for feedback to 380 arrive before initiating the subsequent PMTUD or PLPMTUD process, 381 which remains unchanged except for one difference: the MTU during the 382 process SHOULD NOT exceed the value received from PMTU-Options. 383 Feedback from PMTU-Options SHOULD NOT be kept any longer -- it is 384 only intended as an aid for the subsequent PMTUD or PLPMTUD process. 385 Since packet drops can substantially delay the reception of feedback, 386 a host MAY use a timer to initiate PMTUD or PLPMTUD even when no 387 feedback has arrived; if such a timer is implemented, a means to 388 configure this timer MUST be provided. 390 3.4. IPv6 Usage 392 Path MTU Discovery for IPv6 is specified in [RFC1981]. PMTU-Options 393 can be combined with this PMTUD variant just like regular PMTUD or 394 PLPMTUD; the mechanism should work with IPv6 without requiring any 395 substantial changes. 
The following sections describe the IPv6 formats 396 for the Probe MTU Option and feedback -- these formats differ from 397 the IPv4 variants in that the MTU field is larger (4 instead of 2 398 octets) to support IPv6 jumbograms [RFC2675] and the MTU Nonce field 399 is larger (7 instead of 3 octets) to ensure 8-octet alignment without 400 wasting space for padding octets. Also, the role of the TTL field in the IPv4 401 header, which is used for calculations related to TTL-Check, is taken by the 402 Hop Limit field in the case of IPv6. 404 3.4.1. Probe MTU Option Format for IPv6 405 In the IPv6 case, the option specified in section 3.1 is carried in a Hop-by-Hop Options 406 header. The PMTU-Options Hop-by-Hop Options header 407 is identified by a Next Header value of 0 in the IPv6 header, and has 408 the following format: 410 0 1 2 3 411 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 412 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 413 | Option Type |Opt Data Len=12| 414 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 415 | MTU | 416 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 417 | TTL-Check | | 418 +-+-+-+-+-+-+-+-+ 7-octet MTU Nonce field + 419 | | 420 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 422 The Option Type field identifies the option; it consists of two high- 423 order bits "00", indicating that a processing IPv6 node which does 424 not recognize this option type must skip over this option and 425 continue processing the header. The following bit "1" indicates that 426 the Option Data field (MTU, TTL-Check and MTU Nonce) may change en- 427 route. The trailing five bits are currently undefined (to be 428 specified by IANA). The MTU field was stretched to 4 octets to 429 support IPv6 jumbograms [RFC2675]. 
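As an illustration of the layout above, the following Python sketch packs and unpacks the 14-octet IPv6 Probe MTU Option and applies the router rules from section 3.1 (lower the MTU if a link MTU is smaller, set the MTU Nonce to 0 when the MTU is updated, decrement TTL-Check). It is a sketch under stated assumptions, not a reference implementation: the low five bits of the Option Type are not yet assigned, so OPTION_TYPE_LOW_BITS is a placeholder, and all function names are hypothetical.

```python
import struct

# Option Type: bits "00" (skip if unrecognized), bit "1" (data may change
# en route), then five bits that are not yet assigned. 0b11110 is a
# placeholder chosen for this sketch, not a value assigned to this option.
OPTION_TYPE_LOW_BITS = 0b11110
OPTION_TYPE = (0b001 << 5) | OPTION_TYPE_LOW_BITS
OPT_DATA_LEN = 12  # 4-octet MTU + 1-octet TTL-Check + 7-octet MTU Nonce
_LAYOUT = "!BBIB7s"  # network byte order, no padding: 14 octets in total


def pack_probe_option(mtu, ttl_check, nonce):
    """Return the option as bytes: type, data length, MTU, TTL-Check, nonce."""
    if len(nonce) != 7:
        raise ValueError("MTU Nonce must be exactly 7 octets")
    return struct.pack(_LAYOUT, OPTION_TYPE, OPT_DATA_LEN, mtu, ttl_check, nonce)


def unpack_probe_option(data):
    """Return (mtu, ttl_check, nonce) or raise ValueError."""
    opt_type, length, mtu, ttl_check, nonce = struct.unpack(_LAYOUT, data)
    if opt_type != OPTION_TYPE or length != OPT_DATA_LEN:
        raise ValueError("not a Probe MTU Option")
    return mtu, ttl_check, nonce


def router_update(data, inbound_mtu, outbound_mtu):
    """Apply the per-router processing of section 3.1 to a packed option."""
    mtu, ttl_check, nonce = unpack_probe_option(data)
    lowest_link = min(inbound_mtu, outbound_mtu)
    if lowest_link < mtu:
        mtu = lowest_link
        nonce = bytes(7)  # a router that updates the MTU sets the nonce to 0
    ttl_check = (ttl_check - 1) % 256  # decremented on forwarding
    return pack_probe_option(mtu, ttl_check, nonce)
```

For example, router_update(pack_probe_option(9000, 0, b"\x01" * 7), 9000, 1500) yields an option with MTU 1500, TTL-Check 255 and an all-zero nonce.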
431 Since it may be assumed that, when either of the option-bearing 432 headers are present, they carry a very small number of options -- 433 usually only one [RFC2460] -- the MTU Nonce field was stretched to 434 fit the 8-octet alignment (otherwise, a PadN Option of 6 octets would 435 have to be used in most cases). This way, the security of PMTU- 436 Options is enhanced. A complete Hop-by-Hop Options header containing 437 this one option would look as follows: 439 0 1 2 3 440 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 441 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 442 | Next Header | Hdr Ext Len=1 | Option Type |Opt Data Len=12| 443 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 444 | MTU | 445 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 446 | TTL-Check | | 447 +-+-+-+-+-+-+-+-+ 7-octet MTU Nonce field + 448 | | 449 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 451 3.4.2. Feedback Format: IP 452 The option format specified in section 3.2.1 is a Destination Options 453 header in the IPv6 case. The PMTU-Options Destination Options header 454 is identified by a Next Header value of 60 in the immediately 455 preceding header, and has the following format: 457 0 1 2 3 458 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 459 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 460 | Option Type |Opt Data Len=12| 461 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 462 | MTU | 463 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 464 | TTL-Diff | | 465 +-+-+-+-+-+-+-+-+ 7-octet MTU Nonce field + 466 | | 467 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 469 The Option Type field identifies the option; it consists of two high- 470 order bits "00", indicating that a processing IPv6 node which does 471 not recognize this option type must skip over this option and 472 continue processing the header. 
The following bit "0" indicates that 473 the Option Data field (MTU, TTL-Diff and MTU Nonce) does not change 474 en-route. The trailing five bits are currently undefined (to be 475 specified by IANA). 477 The second octet of this option contains the Opt Data Len field, 478 specifying the length of the Option Data field in octets. The Opt 479 Data Len field is set to 12. The next three fields (four, one and 480 seven octets, respectively) contain the result of the PMTU-Options 481 process, TTL-Diff and the MTU Nonce; the MTU and MTU Nonce fields 482 must be copied from the IPv6 Probe MTU Option by the receiver. The 483 receiver must set the TTL-Diff field to ( Hop Limit - TTL-Check ) mod 256, 484 where "Hop Limit" is the Hop Limit field in the IPv6 header. 486 A complete Destination Options header containing this one option 487 would look as follows: 489 0 1 2 3 490 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 491 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 492 | Next Header | Hdr Ext Len=1 | Option Type |Opt Data Len=12| 493 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 494 | MTU | 495 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 496 | TTL-Diff | | 497 +-+-+-+-+-+-+-+-+ 7-octet MTU Nonce field + 498 | | 499 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 501 3.4.3. 
Feedback Format: TCP 503 The Reply MTU Option format for TCP over IPv6 is: 505 0 1 2 3 506 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 507 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 508 | Kind | Length = 14 | MTU | 509 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 510 | MTU (continued) | TTL-Diff | MTU Nonce | 511 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 512 | MTU Nonce (continued) | MTU Nonce (continued) | 513 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 514 | MTU Nonce (continued) | 515 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 517 The first octet of this option contains the option kind, identifying 518 the TCP option. The second octet of this option contains the length 519 field, specifying the total option length in octets. The length field 520 is set to 14. The MTU, TTL-Diff and MTU Nonce fields are similar to 521 the specification in section 3.4.2. 523 3.4.4. Feedback Format: SCTP 525 In SCTP [RFC2960], the sender is informed by the receiver about PMTU- 526 Options feedback by including the Reply MTU chunk. For IPv6, this 527 chunk looks as follows: 529 0 1 2 3 530 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 531 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 532 | Chunk Type | Flags=00000000| Chunk Length = 16 | 533 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 534 | MTU | 535 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 536 | TTL-Diff | | 537 +-+-+-+-+-+-+-+-+ 7-octet MTU Nonce field + 538 | | 539 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 541 Note: The Reply MTU chunk is considered a Control chunk. 
543 The Chunk Type field identifies the chunk; it consists of two high- 544 order bits "11", indicating that a processing SCTP node which does 545 not recognize this chunk type must skip this chunk and continue 546 processing, but report in an ERROR Chunk using the "Unrecognized 547 Chunk Type" cause of error. The sender of the Reply MTU chunk SHOULD 548 send the MTU feedback via some other means (via a different active 549 protocol or an IP option) in response to this ERROR Chunk. The 550 trailing six bits are currently undefined (to be specified by IANA). 551 Since this chunk has no specific flags, the Flags field is set to 0. 552 The Chunk Length field is set to 16 (the length of the chunk 553 including the Chunk Type, Flags and Length fields). 555 The MTU, TTL-Diff and MTU Nonce fields are similar to the 556 specification in section 3.4.2. 558 3.4.5. Feedback Format: DCCP 560 The Reply MTU Option format for DCCP over IPv6 is: 562 0 1 2 3 563 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 564 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 565 | Type | Length = 14 | MTU | 566 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 567 | MTU (continued) | TTL-Diff | MTU Nonce | 568 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 569 | MTU Nonce (continued) | MTU Nonce (continued) | 570 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 571 | MTU Nonce (continued) | 572 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 574 The Type field identifies the option (number to be 575 specified by IANA). The rest of the option is similar 576 to the TCP option defined in section 3.4.3. 578 3.5. IP Tunnels 580 If a network node that performs IP-in-IP tunneling (be it because 581 of IPsec, IPv6 or for any other purpose) encapsulates a datagram 582 carrying the Probe MTU Option, it should copy this option to 583 the outer IP header, no matter how many headers there are 584 in between. 
If the network node at the "other side of the tunnel"
(the node that unpackages an IP datagram by removing the outer
header) receives a datagram carrying the Probe MTU Option in the
(outer) IP header, it should subtract the length of the outer and
all intermediate headers from the value in the (outer header) option
and then copy it into the inner header option on unpackaging.

Such nodes SHOULD NOT change the MTU Nonce field.

4. Potential Advantages

The PMTU-Options mechanism has a number of potential advantages:

4.1. Reducing Packet Loss

Probing for the Path MTU means that at some point, a datagram with
a size exceeding the Path MTU must be sent with the DF bit set to 1.
Such a datagram will be dropped, which normally requires the
data it carried to be retransmitted. This retransmission is
additional traffic which is caused by PMTUD or PLPMTUD.

The result of PMTU-Options is to be interpreted as an upper limit
for the Path MTU. If it turns out to be the Path MTU during the
subsequent PMTUD or PLPMTUD process, the Path MTU is detected
without causing packet loss.

4.2. Circumventing Black Hole Detection

PMTUD is known to have a problem called "Black Hole Detection"
[RFC2923], which happens when a PMTU sender does not receive an
ICMP Fragmentation Needed message even though the size of a
datagram with DF set to 1 has exceeded the MTU of a link. This
may happen if, for instance, routers or firewalls are misconfigured.
PLPMTUD is robust against this failure because it does not rely
on ICMP.
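
The upper-limit rule from section 4.1, which is also what the black
hole argument in this section relies on, can be pictured with a
minimal sketch. The function name and its parameters are hypothetical
and only illustrate the capping step; the surrounding PMTUD or PLPMTUD
logic is not shown.

```python
def cap_probe_size(candidate_size, pmtu_options_limit=None):
    """Return the datagram size to probe with next.

    candidate_size: size that PMTUD or PLPMTUD would try on its own.
    pmtu_options_limit: MTU value received as PMTU-Options feedback,
        or None if no feedback is available (assumption: the process
        then runs unchanged).
    """
    if pmtu_options_limit is None:
        return candidate_size
    # The feedback is only an upper limit, never a guaranteed Path MTU.
    return min(candidate_size, pmtu_options_limit)
```

A sender that would otherwise probe with 1500 octets but has received
1400 as feedback would thus probe with 1400 and never send a datagram
that exceeds the reported value.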

Since the information from PMTU-Options is explicit even when
the size of a datagram has not exceeded the MTU of a link, the
"Black Hole Detection" problem of PMTUD could be circumvented
in a special case that is best explained by means of a simple
example:

        MTU A  ------  MTU B  ------  MTU C  ------  MTU A
Sender --------| R1 |---------| R2 |---------| R3 |--------- Receiver
               ------         ------         ------

In this example, a datagram traverses routers R1, R2 and R3, and MTU
A > MTU B > MTU C. PMTUD has converged a long time ago, and a path
change has occurred. Not having noticed any problems, the host now
probes for a larger MTU. Routers R1 and R2 are misconfigured not to
send ICMP "Fragmentation Needed" messages -- the size is increased
and exceeds MTU B, but the sender never notices this because R1
simply drops the datagram.

Now let us assume that PMTU-Options is used. The sender first submits
a datagram carrying a Probe MTU Option; R3, which is properly
configured, updates the value in the option (originally A) with C.
Since PMTUD uses this value as an upper limit, it never exceeds the
MTU of the link between routers R1 and R2, and Black Hole Detection
does not occur.

4.3. Other Problems with ICMP Fragmentation Needed

In addition to the danger of leading to the Black Hole Detection
problem, utilizing ICMP Fragmentation Needed messages has the
following additional problems:

o  Generating an ICMP packet requires memory to be allocated, header
   fields to be initialized etc., which is a costly operation for a
   router.
o  An ICMP message is additional signaling traffic that consumes
   network capacity.
o  An ICMP message can be dropped, e.g. as the result of congestion
   on the return path.

By replacing the generation of an ICMP Fragmentation Needed message
with a simple option field update, PMTU-Options circumvents these
problems.

4.4.
Circumventing Problems with IP Tunnels

Sometimes, the DF bit is ignored by network nodes that encapsulate IP
packets: the total length of a packet with additional headers may
exceed the Path MTU of the tunnel even if this is not the case
without the extra headers. Also, the inner nodes of a tunnel are
often invisible to the data flow that is carried through the tunnel.
It would therefore be up to the network nodes at the edge of the
tunnel to perform PMTUD or PLPMTUD and fragment a packet
appropriately, if necessary. Simply ignoring the DF bit is an
attractive alternative.

If the PMTU-Options mechanism is supported by routers along a tunnel
path and IP Options are copied to the outer IP header, it is possible
to detect a potentially smaller MTU in the tunnel, thereby decreasing
the chance of fragmentation in the tunnel.

5. Discussion of Issues

In what follows, we discuss some obvious problems with PMTU-Options:

5.1. Slow Path Processing

IP packets carrying options are known to be processed in the "Slow
Path" (software, as opposed to the hardware-only "Fast Path") in most
normal IP routers. This means that packets carrying the Probe MTU
Option will experience a notable delay along the forward path. It is
therefore not recommended to use datagrams that belong to a
PMTU-Options process for round-trip time estimation; this makes it
questionable whether PMTU-Options should be used at the beginning of
a TCP connection.

In IETF and IRTF mailing lists, the impact of option processing has
been claimed to be immense (e.g., slowing down packets by a factor of
100) on several occasions. While this may be true for packet
processing in a single router, a recent measurement study has shown
that the impact on end-to-end delay can be much less severe [WeRo].
The study encompassed two separate tests -- one in July/August 2002
and one in August/September 2003.
In each of these tests, a ping carrying a NOP
IP Option was sent to a host from the list at [TBIT], followed by a
ping without the option; this process was repeated 100 times per
host, and there was a pause of 1 second in between all pings -- thus,
these measurements do not say anything about the (possibly different)
behavior of routers when they are flooded with a large number of
packets that contain IP options. Additionally, a traceroute was
carried out for each host.

4427 and 4401 hosts reacted to pings with options, with a number of
hops ranging from 6 to 34 (most hosts were in the range of 14 - 25
hops). The pings traversed 5726 unique router addresses (not
necessarily as many different machines because different interfaces
on a single router may show up as different addresses in traceroute)
in 2002 and 5194 in 2003. The measurements from hosts that answered
with options (approximately 1/4) were ignored in the final statistics
because the backward path is unknown. The rest of the data was
weighted based on the frequency of occurrence (a router which shows up
100 times in the measurements is 100 times less important than a
router which shows up only once), path length and variance (to
diminish the effect of queuing delay).

The final result was that on average, slow path processing in routers
caused an additional delay of 10% (2002) and 7% (2003) along the
forward path if a packet contains a NOP option. These results appear
to concur with the results reported in [FrJo], where similar
measurements are described.

5.2. Dropped Packets

In addition to the resulting impact that slow path processing has on
the round-trip time, the measurements reported in [WeRo] revealed an
alarming fact: in the 2003 test run, 3507 out of 7908 hosts reacted
to regular pings but did not react when a NOP option was used.
At this point, it is unclear who dropped the pings that carried
options: routers along the paths to the hosts, firewalls, the hosts
themselves, or routers along the backward paths (because the hosts
may have included the option in their responses). Also, the reaction
to different types of packets (e.g., TCP SYN) is unknown.

For PMTU-Options, this means that the mechanism can easily lead to
packet drops, in particular if the receiver is not known to support
it. This may have adverse effects on the packetization layer if these
drops are interpreted as a sign of congestion. This is one particular
reason to consider PMTU-Options as experimental and propose its usage
for special scenarios only.

5.3. Placing Additional Burden on Routers

PMTUD and PLPMTUD only require the "problematic" router (router
attached to a link with an MTU that was just exceeded) to do
substantial extra work (notably, all the routers on the return path
from the "problematic" router to the sender are involved in
forwarding an ICMP Fragmentation Needed message in the case of
standard PMTUD [RFC1191]). PMTU-Options involves all routers along
the path by having them process IP options. In other words, while the
burden for an individual router is smaller (processing of the Probe
MTU Option is probably a less costly operation than generating an
ICMP Fragmentation Needed message), the burden for the whole network
is perhaps greater.

Since the processing overhead caused by the Probe MTU option in
routers is unknown, it is important to limit the amount of such
packets in a network; clearly, PMTU-Options should not be used for
each and every new TCP connection but in special scenarios only.

5.4.
Motivating Deployment

From the perspective of a single node, there is no immediate gain
when deploying PMTU-Options; at least the sender, receiver and a
router (ideally attached to the link with the Path MTU -- otherwise
the only benefit of PMTU-Options could be faster PMTUD convergence
because it starts with a smaller value) must participate for the
mechanism to be beneficial. However, the same is true for Explicit
Congestion Notification (ECN) [RFC3168], a deployment overview of
which is given at [ECN].

PMTU-Options has the following motivating deployment factor: a router
with a particularly small MTU will typically need to send a large
number of ICMP packets. This is where PMTU-Options deployment would
be most beneficial because it might lead to reduced CPU load. From
the perspective of an end host where PMTU-Options is used, such
routers are exactly the ones that should be updated because they have
small MTUs.

6. Discussion of Usage Scenarios

Since the Probe MTU Option places an additional burden on routers via
IP option processing and the additional delay from Slow Path
processing can falsify a round-trip time estimation, it is
questionable whether the mechanism should be used at the beginning of
a standard TCP connection. Being an experimental PMTUD enhancement,
PMTU-Options is rather intended to be used under special
circumstances -- depending on the importance of the aforementioned
advantages as opposed to the gravity of the aforementioned issues
with the mechanism. In what follows, some examples of potential usage
scenarios are given:

6.1. Detecting PMTU Increases

Since the PMTU may change (e.g., when a routing change occurs) and
even become larger while a long-lasting connection is active,
[RFC1191] describes a method to probe for increased MTUs (which
should be done rarely).
The recommended method increases the packet
size according to a table specified in [RFC1191]; alternatively, the
"aged" cached PMTU values may be reset to the first-hop data-link
MTU. Since chances are high that the PMTU did not change and either
of these processes will therefore immediately exceed the current
PMTU, it may be advisable to use PMTU-Options before increasing a
cached PMTU value.

6.2. RTT-robust Transport Protocols

Transport protocols that do not rely on round-trip time estimates as
heavily as TCP or SCTP may be a good fit for PMTU-Options. In
particular, this includes DCCP [draft-DCCP], where the importance of
round-trip time estimates depends on the Congestion Control ID
(CCID), and UDP, where no round-trip time estimation is specified.
Currently, PMTUD is unavailable for applications that utilize UDP --
it is up to such applications to find out about the ideal size of a
datagram. PMTUD or PLPMTUD may however be provided to such
applications in the future (the issue has been discussed in the PMTUD
Working Group). Assuming that it is available to applications running
on top of UDP, it may be advisable for such an application to use
PMTU-Options depending on its requirements.

6.3. Tunnels

The operation of PMTU-Options across tunnels is specified in section
3.5; the potential advantage of this kind of operation is described
in section 4.4. In addition to the viewpoint of an application
traversing a tunnel without wanting PMTUD to fail, the endpoints of a
tunnel may sometimes also need to determine the MTU in between them
(the viewpoint being a tunnel with endpoints which do not care about
actual application endpoints) [draft-TunnelMTU]. In such a case,
using PMTU-Options may be advisable due to the potentially
controlled (or known) tunnel environment and sporadic MTU
determination at tunnel endpoints.

7.
Related Work

This work can be seen as an extension of the basic idea in [RFC1063].
A related document is [draft-PMTUDv6], where the idea of utilizing
options for PMTUD is described for IPv6. This is the only other text
that the author is aware of where a Probe MTU option is combined with
regular PMTUD. Some of the design choices in this document (e.g., the
MTU Nonce) were based on [RFC3168] and [draft-QS].

8. Security Considerations

Since PMTU-Options requires routers to change a value in packets and
must therefore always be placed in the outermost IP header to remain
functioning, it cannot be fully protected with IPsec; however, as the
explicit MTU update to the sender originates from the receiver of an
end-to-end connection and not an intermediate router as with ICMP
Fragmentation Needed messages, the receiver can be authenticated
using the Encapsulating Security Payload (ESP) header [RFC2406] or
the Authentication Header (AH) [RFC2402]. For this purpose, the
Probe MTU Option is classified as mutable and the Reply MTU Option is
classified as immutable. The mutable header field (MTU field in the
Probe MTU Option) is where IPsec cannot help -- it cannot prevent
intermediate attackers from sending false data. The following attacks
from such a malicious node must be considered:

o  Sending an MTU value that is too large:

   A host MUST ignore feedback containing an MTU that is larger than
   the MTU it initially wrote into the option. If a router reduces
   the MTU value, it MUST set the MTU Nonce to 0 -- to hide this
   fact, a malicious node would have to guess the original MTU Nonce,
   which has a 1/(2^32) chance of success (1/2^128 with IPv6). If a
   malicious node ignores the MTU Nonce and still increases the MTU
   value, the attacker can only succeed if the new value is smaller
   than the (unknown) original value from the sender.
   Even then, the value received from this option is only used as an
   upper limit for PMTUD and PLPMTUD, which will still work in case
   of such an attack. Only the benefit of PMTU-Options can be lost.

o  Sending an MTU value that is too small:

   This attack, which is similar to an attack with ICMP
   "Fragmentation needed" messages carrying a value that is too
   small, can degrade the performance of a sender. PMTU-Options does
   not provide a mechanism to prevent this attack. This is one of the
   most important reasons to consider it experimental: the result of
   PMTU-Options should merely be used as a hint.

o  Lying about the number of routers that supported the option:

   The number of routers that supported the option can be faked by
   altering TTL-Check or TTL-Diff. The only plausible attack based on
   this value would be to claim that all routers along the path
   supported the option, as this might cause the sender to use the
   MTU value in the reply packet right away without starting a PMTUD
   or PLPMTUD process. However, since the initial value of TTL-Check
   is a random number, the chance of a malicious node guessing the
   right value of TTL-Check is at most 1/256.

9. IANA Considerations

This specification reuses the obsolete IPv4 option numbers 11 and 12.
It requires two IPv6 Option Type numbers (with leading bits "001" and
"000"), a TCP option number, an SCTP Chunk Type number (with leading
bits "11", as described in section 3.4.4) and a DCCP option number.

10. Normative References

[RFC1191] Mogul, J.C., and Deering, S.E., "Path MTU discovery", RFC
1191, November 1990.

[RFC1981] McCann, J., Deering, S. and Mogul, J., "Path MTU Discovery
for IP version 6", RFC 1981, August 1996.

[draft-PLPMTUD] Mathis, M., Heffner, J. and Lahey, K., "Path MTU
Discovery", Internet-draft draft-ietf-pmtud-method-00.txt, October
19, 2003.

[RFC1435] Knowles, S., "IESG Advice from Experience with Path MTU
Discovery", RFC 1435, March 1993.

[RFC2460] Deering, S., and Hinden, R., "Internet Protocol, Version 6
(IPv6)", RFC 2460, December 1998.

[RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", RFC
2923, September 2000.

[RFC2960] Stewart, R., Xie, Q., Morneault, K., Sharp, C.,
Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M., Zhang, L., and
Paxson, V., "Stream Control Transmission Protocol", RFC 2960, October
2000.

[draft-DCCP] Kohler, E., Handley, M., Floyd, S., and Padhye, J.,
"Datagram Congestion Control Protocol (DCCP)", Internet-draft
draft-ietf-dccp-spec-05.txt, October 2003.

[RFC2402] Kent, S., and Atkinson, R., "IP Authentication Header", RFC
2402, November 1998.

11. Informative References

[RFC1063] Mogul, J.C., Kent, C.A., Partridge, C., and McCloghrie, K.,
"IP MTU discovery options", RFC 1063, July 1988.

[WeRo] Welzl, M., and Rossi, M., "On The Impact of IP Option
Processing", Preprint-Reihe des Fachbereichs Mathematik-Informatik
(technical report), No. 15, 2003. Available from
http://www.welzl.at/ip-options

[FrJo] Fransson, P., and Jonsson, A., "The Need for an Alternative to
IPv4-Options", RVK (RadioVetenskap och Kommunikation), Stockholm,
Sweden, pp. 162-166, June 2002.

[RFC2675] Borman, D., Deering, S., and Hinden, R., "IPv6 Jumbograms",
RFC 2675, August 1999.

[draft-TunnelMTU] Templin, F., "Dynamic MTU Determination for
IPv6-in-IPv4 Tunnels", Internet-draft
draft-ietf-templin-tunnelmtu-06.txt, November 2003. Currently
available from: http://www.geocities.com/osprey67/tunnelmtu-06.txt

[RFC3168] Ramakrishnan, K., Floyd, S., and Black, D., "The Addition
of Explicit Congestion Notification (ECN) to IP", RFC 3168, September
2001.

[ECN] The ECN website, http://www.icir.org/floyd/ecn.html

[draft-PMTUDv6] Park, S.
D., and Lee, H., "The PMTU Discovery for IPv6 Using Hop-by-Hop
Option Header", Internet-draft
draft-park-pmtu-ipv6-option-header-00.txt, March 2003 (expired).

[draft-QS] Jain, A., and Floyd, S., "Quick-Start for TCP and IP",
Internet-draft draft-amit-quick-start-02.txt, October 2002 (expired).
Available from http://www.icir.org/floyd/quickstart.html

[TBIT] The TBIT website, http://www.icir.org/tbit/

[RFC2406] Kent, S., and Atkinson, R., "IP Encapsulating Security
Payload (ESP)", RFC 2406, November 1998.

12. Acknowledgements

The author would like to thank everybody who contributed to this
document via helpful discussions. In particular, this includes: Eddie
Kohler, Simon Leinen, Matt Mathis, Michael Richardson and Fred
Templin.

13. Author's Address

Michael Welzl
University of Innsbruck
Institute fuer Informatik
Technikerstr. 25/7
A-6020 Innsbruck, Austria

Phone: +43 (512) 507-6110
Fax:   +43 (5122) 507-2977
Email: michael.welzl@uibk.ac.at
Web:   http://www.welzl.at

14. Full Copyright Statement

Copyright (C) The Internet Society (1997). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are
included on all such copies and derivative works.
However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.