Network Working Group                                    F. Templin, Ed.
Internet-Draft                              Boeing Research & Technology
Updates: 8200, 8201, 4443, 1191 (if approved)          November 17, 2021
Intended status: Standards Track
Expires: May 21, 2022


    IPv6 Fragment Retransmission and Path MTU Discovery Soft Errors
                     draft-templin-6man-fragrep-02

Abstract

   Internet Protocol version 6 (IPv6) provides a fragmentation and
   reassembly service for end systems allowing for the transmission of
   packets that exceed the path MTU. However, loss of just a single
   fragment requires retransmission of the original packet in its
   entirety, with potentially devastating effects on performance. This
   document specifies an IPv6 fragment retransmission scheme that
   matches the loss unit to the retransmission unit. The document
   further specifies an update to Path MTU Discovery that distinguishes
   hard link size restrictions from reassembly congestion events.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 21, 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Common Use Cases
   4.  IPv6 Fragmentation
   5.  IPv6 Fragment Retransmission
   6.  Packet Too Big (PTB) Soft Errors
   7.  Implementation Status
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Author's Address

1.  Introduction

   Internet Protocol version 6 (IPv6) [RFC8200] provides a fragmentation
   and reassembly service similar to that found in IPv4 [RFC0791], with
   the exception that only the source host (i.e., not routers on the
   path) may perform fragmentation. When an IPv6 packet is fragmented,
   the loss unit (i.e., a single IPv6 fragment) becomes smaller than the
   retransmission unit (i.e., the entire packet), which under
   intermittent loss conditions could result in sustained retransmission
   storms with little or no forward progress [RFC8900].

   The presumed drawbacks of fragmentation are tempered by the fact that
   greater performance can often be realized when the source sends large
   packets that exceed the path MTU. This is because a single large
   IPv6 packet produced by upper layers results in a burst of multiple
   fragment packets produced by lower layers with minimal inter-packet
   delays. These bursts yield high network utilization for the burst
   duration, while modern reassembly implementations have proven capable
   of accommodating such bursts. If the loss unit can be made to match
   the retransmission unit, the performance benefits of IPv6
   fragmentation can be realized.

   This document therefore proposes an IPv6 fragment retransmission
   service in which the source marks each fragment with an "Ordinal"
   number, and the destination may request retransmissions of any
   ordinal fragments that are lost. This retransmission request service
   is intended only for short-duration and opportunistic best-effort
   recovery (i.e., not true end-to-end reliability). In this way, the
   service mirrors the Automatic Repeat Request (ARQ) function of
   common data links [RFC3366] by considering an imaginary virtual link
   that extends from the IPv6 source to destination. The goal therefore
   is for the destination to quickly obtain missing individual fragments
   of partial reassemblies before true end-to-end timers would cause
   retransmission of the entire packet.

   When conditions suggest that original sources should begin sending
   smaller packets, the fragmentation source and/or reassembly
   destination can return a new type of ICMPv6 Packet Too Big or ICMPv4
   Fragmentation Needed message termed a PTB "soft error" that is
   distinguished from classic "hard errors" by including a non-zero
   value in the PTB Code (ICMPv6) or unused (ICMPv4) field.
   The
   fragmentation source can return soft errors (subject to rate
   limiting) suggesting a smaller packet size while fragmentation of
   large packets is producing excessive numbers of fragments.
   Similarly, the reassembly destination can return soft errors (via the
   fragmentation source) while reassembly of large packets is causing
   excessive reassembly congestion. Original sources that receive these
   soft errors should reduce the size of packets they send for the short
   term, but can again begin to increase their packet sizes without
   delay as long as no further soft or hard errors arrive.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119][RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Common Use Cases

   A common use case of interest is to improve the state of affairs for
   IPv6 encapsulation (i.e., "tunneling") [RFC2473] when the original
   source may be many IP hops away from the tunnel ingress, and the
   tunnel packet may be fragmented following encapsulation. The tunnel
   is seen as a "link" on the path from the original source to the final
   destination, and the goal is to increase the reliability of that link
   in order to minimize wasteful end-to-end retransmissions.

   When the original source and IPv6 fragmentation source are located on
   the same platform (physical or virtual) the window of opportunity for
   successful retransmission of individual fragments may be narrow
   unless the link persistence timeframe is carefully coordinated with
   upper layer retransmission timers.
   (In an uncoordinated case, upper
   layers may retransmit the entire packet before or at roughly the same
   time the IPv6 fragmentation source retransmits individual fragments,
   leading to increased congestion and wasted retransmissions.)

4.  IPv6 Fragmentation

   IPv6 fragmentation is specified in Section 4.5 of [RFC8200] and is
   based on the IPv6 Fragment extension header formatted as shown below:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Next Header  |   Reserved    |      Fragment Offset    |Res|M|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Identification                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   In this format:

   o  Next Header is a 1-octet selector that identifies the type of the
      next header following the Fragment Header.

   o  Reserved is a 1-octet reserved field set to 0 on transmission and
      ignored on reception.

   o  Fragment Offset is a 13-bit field that provides the offset (in
      8-octet units) of the data portion that follows from the beginning
      of the packet.

   o  Res is a 2-bit field set to 0 on transmission and ignored on
      reception.

   o  M is the "More Fragments" bit telling whether additional fragments
      follow.

   o  Identification is a 32-bit numerical identification value for the
      entire IPv6 packet. The value is copied into each fragment of the
      same IPv6 packet.

   The fragmentation and reassembly specification in [RFC8200] can be
   considered as the standard method which adheres to the details of
   that RFC. This document presents an enhanced method that allows for
   retransmissions of individual fragments.

5.  IPv6 Fragment Retransmission

   Fragmentation implementations that obey this specification write an
   "Ordinal" value beginning with 0 and monotonically incremented for
   each successive fragment in the (formerly) "Reserved" field of the
   IPv6 Fragment Header, which is redefined as a 6-bit "Ordinal" field
   followed by a 1-bit R(eserved) flag followed by a 1-bit A(RQ) flag as
   shown below:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Next Header  |  Ordinal  |R|A|      Fragment Offset    |Res|M|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Identification                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   In particular, when a source that obeys this specification fragments
   an IPv6 packet it sets the Ordinal value for the first fragment to
   '0', the Ordinal value for the second fragment to '1', the Ordinal
   value for the third fragment to '2', etc. up to either the final
   fragment or the 64th fragment (whichever comes first). The source
   also sets the A flag to 1 in each fragment to inform the destination
   that fragment retransmission is supported for this packet.

   When a destination that obeys this specification receives IPv6
   fragments with the A flag set to 1, it infers that the source
   participates in the protocol and maintains a checklist of all Ordinal
   numbered fragments received for a specific Identification number.

   If the destination notices one or more Ordinals missing after most
   other Ordinals for the same Identification have arrived, it can
   prepare an ICMPv6 Fragmentation Report (FRAGREP) message [RFC4443] to
   send back to the source.
   The message is formatted as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Identification (0)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Ordinal Bitmap (0) (0-31)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Ordinal Bitmap (0) (32-63)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Identification (1)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Ordinal Bitmap (1) (0-31)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Ordinal Bitmap (1) (32-63)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              ...                              |
   |                              ...                              |

   In this format, the destination prepares the FRAGREP message as a
   list of 12-octet (Identification(i), Bitmap(i)) pairs. The first 4
   octets in each pair encode the Identification value for the IPv6
   packet that is the subject of the report, while the remaining 8
   octets encode a 64-bit Bitmap of Ordinal fragments received for this
   Identification. For example, if the destination receives Ordinals 0,
   1, 3, 4, 6, and 8 it sets Bitmap bits 0, 1, 3, 4, 6, and 8 to '1' and
   sets all other bits to '0'. The destination may include as many
   (Identification, Bitmap) pairs as necessary without causing the
   entire message to exceed the minimum IPv6 MTU of 1280 bytes. (If
   additional pairs are necessary, the destination may prepare and send
   multiple messages.)

   The destination next transmits the FRAGREP message to the IPv6
   fragment source. When the source receives the message, it examines
   each entry to determine the per-Identification Ordinal fragments that
   require retransmission.
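
   The (Identification, Bitmap) encoding described above can be
   sketched in Python as follows. This is a minimal illustration, not
   part of the specification. It assumes that Bitmap bit 0 is the most
   significant bit of the 8-octet field (the leftmost bit in the
   message diagram), which the text does not state explicitly. All
   function names are hypothetical, and the ICMPv6 Type, Code and
   Checksum fields are left out.

```python
import struct

ORDINAL_BITS = 64  # width of the Ordinal Bitmap in each FRAGREP entry


def bitmap_from_ordinals(received):
    """Return a 64-bit integer with one bit set per received Ordinal.

    Assumption: Bitmap bit 0 is the most significant bit of the field.
    """
    bitmap = 0
    for ordinal in received:
        if not 0 <= ordinal < ORDINAL_BITS:
            raise ValueError(f"Ordinal out of range: {ordinal}")
        bitmap |= 1 << (ORDINAL_BITS - 1 - ordinal)
    return bitmap


def pack_fragrep_entries(entries):
    """Pack (Identification, received Ordinals) pairs as 12-octet entries."""
    return b"".join(
        struct.pack("!IQ", ident, bitmap_from_ordinals(received))
        for ident, received in entries
    )


def unpack_fragrep_entries(body):
    """Yield (Identification, bitmap) pairs from a packed entry list."""
    if len(body) % 12:
        raise ValueError("entry list is not a multiple of 12 octets")
    for offset in range(0, len(body), 12):
        yield struct.unpack_from("!IQ", body, offset)


def missing_ordinals(bitmap, fragments_sent):
    """List the Ordinals the source sent that the Bitmap does not report."""
    limit = min(fragments_sent, ORDINAL_BITS)
    return [
        n for n in range(limit)
        if not bitmap & (1 << (ORDINAL_BITS - 1 - n))
    ]
```

   The source needs its own record of how many fragments it sent for
   each Identification, since the Bitmap alone does not say where the
   packet ends. Under these assumptions, an entry reporting Ordinals
   0, 1, 3, 4, 6 and 8 for a nine-fragment packet yields missing
   Ordinals 2, 5 and 7.
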
   For example, if the source receives a Bitmap
   for Identification 0x12345678 with bits 0, 1, 3, 4, 6 and 8 set to
   '1', it would retransmit Ordinal fragments (0x12345678, 2),
   (0x12345678, 5) and (0x12345678, 7).

   This implies that the source should maintain a cache of recently
   transmitted fragments for a time interval known as "link persistence"
   [RFC3366]. The link persistence should be at least as long as the
   round-trip time from the fragmentation source to the reassembly
   destination, plus an additional small delay to allow for reassembly
   processing overhead. Then, if the source receives a FRAGREP message
   requesting retransmission of one or more Ordinals, it can retransmit
   if it still holds the Ordinal in its cache. Otherwise, the Ordinal
   will incur a cache miss and the original source will eventually
   retransmit the original packet in its entirety. After processing all
   entries in the FRAGREP, the source discards the message.

   Note that the maximum-sized IPv6 packet that a source can submit for
   fragmentation is 64 KB, and the minimum IPv6 path MTU is 1280 bytes.
   Assuming the minimum IPv6 path MTU as the nominal size for non-final
   fragments, the number of Ordinals for each IPv6 packet should
   therefore fit within the allotted 64 Bitmap bits when the fragments
   are transmitted over IPv6-only network paths. However, when the path
   may traverse one or more IPv4 networks (e.g., via tunneling) the path
   MTU may be significantly smaller. In that case, the number of IPv6
   fragments needed may exceed the maximum number of Ordinal candidates
   for retransmission (i.e., 64).

   When the number of IPv6 fragments exceeds 64, the source assigns an
   Ordinal value and sets A to 1 in the first 64 fragments, but sets
   both Ordinal and A to 0 in all remaining fragments, then transmits
   all fragments.
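
   The fragment-count arithmetic and the Ordinal assignment rule above
   can be sketched in Python as follows. This is an illustration under
   stated assumptions rather than a normative procedure: it assumes the
   only headers in each fragment are the 40-octet IPv6 header and the
   8-octet Fragment Header, it ignores any encapsulation overhead, and
   it assumes the R flag is sent as 0. The function names are
   hypothetical.

```python
IPV6_HEADER_LEN = 40      # fixed IPv6 header
FRAGMENT_HEADER_LEN = 8   # IPv6 Fragment extension header
MAX_ORDINALS = 64         # 6-bit Ordinal field, 64-bit Bitmap


def fragments_needed(fragmentable_len, path_mtu):
    """Count fragments, assuming no other extension headers are present."""
    # Non-final fragments carry a multiple of 8 octets of data.
    per_fragment = (path_mtu - IPV6_HEADER_LEN - FRAGMENT_HEADER_LEN) // 8 * 8
    # Integer ceiling division.
    return -(-fragmentable_len // per_fragment)


def ordinal_octet(index):
    """Return the octet that replaces the former Reserved field.

    Layout per the header diagram: 6-bit Ordinal, then R, then A.
    """
    if index < MAX_ORDINALS:
        ordinal, a_flag = index, 1
    else:
        ordinal, a_flag = 0, 0  # beyond the 64th fragment: best effort only
    r_flag = 0                  # assumption: R(eserved) is sent as 0
    return (ordinal << 2) | (r_flag << 1) | a_flag
```

   Under these assumptions, a 65535-octet fragmentable part sent over a
   1280-octet path MTU needs 54 fragments, which fits within the 64
   available Ordinals, while the same payload over a 576-octet path MTU
   would need 125.
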
   When the destination receives the fragments, it may
   return a FRAGREP to request retransmission of any of the first 64
   fragments, but may not request retransmission of any additional
   fragments, for which the default behavior of best-effort delivery
   applies. (However, all fragments are presented equally to the
   reassembly cache where successful reassembly is likely.)

   Finally, transmission of IPv6 fragments over IPv6-only paths can
   safely proceed without a fragmentation-layer integrity check since
   IPv6 includes reassembly safeguards and a 32-bit Identification
   value. Conversely, transmission of IPv6 fragments over IPv4-only or
   mixed IPv6/IPv4 paths requires a fragmentation-layer integrity check
   inserted by the source before fragmentation and verified by the
   destination following reassembly since IPv4 provides only a 16-bit
   Identification and no reassembly safeguards. (In cases where the
   full path cannot be determined a priori, an integrity check should
   always be included as specified in AERO [I-D.templin-6man-aero] and
   OMNI [I-D.templin-6man-omni].)

6.  Packet Too Big (PTB) Soft Errors

   When an IPv6 fragmentation source forwards packets that produce what
   it considers excessive numbers of fragments (e.g., 32, 48, 64, or
   more), the fragmentation source can also return PTB "soft errors" to
   the original source (subject to rate limiting). Either the
   fragmentation source or reassembly destination may also return PTB
   soft errors if the frequency of retransmissions or reassembly
   failures exceeds acceptable thresholds.

   PTB soft errors are distinguished from ordinary "hard errors" through
   a non-zero value in the ICMPv6 "Code" field [RFC8201][RFC4443] or
   ICMPv4 "unused" field [RFC1191].
   The following values are currently
   defined:

   o  0 - "PTB hard error" - Original sources that receive these
      messages obey the classic Path MTU Discovery (PMTUD)
      specifications found in [RFC8201][RFC1191].

   o  1 - "PTB soft error (packet lost)" - Original sources that receive
      these messages should reduce their packet sizes while
      retransmitting the data from the lost packet, but need not wait
      the prescribed 10 minutes before attempting to again increase
      packet sizes.

   o  2 - "PTB soft error (packet forwarded)" - Original sources that
      receive these messages should reduce their packet sizes without
      invoking retransmission, and also need not wait the prescribed 10
      minutes before attempting to again increase packet sizes.

   o  3-255 - reserved for future use.

   PTB soft errors include as much of the invoking packet as possible
   without the message exceeding the minimum MTU (i.e., 1280 bytes for
   IPv6 or 576 bytes for IPv4). Original sources that recognize PTB
   soft errors should follow common logic to dynamically tune their
   packet sizes to obtain the best performance. In particular, an
   original source can gradually increase the size of packets it sends
   while no or few PTB soft errors are arriving, then again reduce
   packet sizes when excessive soft errors arrive.

   Original sources that do not recognize PTB soft errors (i.e., that do
   not examine the Code/unused field value) follow the same standards as
   for hard errors as described above. These sources may miss
   opportunities to realize improved performance.

7.  Implementation Status

   TBD.

8.  IANA Considerations

   A new ICMPv6 Message Type code for "Fragmentation Report (FRAGREP)"
   is requested.

   The IANA is instructed to create new registries for "ICMPv6 Packet
   Too Big Code field" and "ICMPv4 Fragmentation Needed unused field"
   values.
   Both registries should have the following initial values:

      Value    Sub-Type name                  Reference
      -----    -------------                  ----------
      0        PTB hard error                 [RFCXXXX]
      1        PTB soft error (loss)          [RFCXXXX]
      2        PTB soft error (no loss)       [RFCXXXX]
      3-252    Unassigned
      253-254  Reserved for Experimentation   [RFCXXXX]
      255      Reserved by IANA               [RFCXXXX]

             Figure 1: Packet Too Big Code/unused Values

9.  Security Considerations

   Communications networking security is necessary to preserve
   confidentiality, integrity and availability.

10.  Acknowledgements

   This work was inspired by ongoing AERO/OMNI/DTN investigations.

11.  References

11.1.  Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              DOI 10.17487/RFC0791, September 1981.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              DOI 10.17487/RFC1191, November 1990.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, Ed., "Internet
              Control Message Protocol (ICMPv6) for the Internet
              Protocol Version 6 (IPv6) Specification", STD 89,
              RFC 4443, DOI 10.17487/RFC4443, March 2006.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", STD 86, RFC 8200,
              DOI 10.17487/RFC8200, July 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

11.2.  Informative References

   [I-D.templin-6man-aero]
              Templin, F. L., "Automatic Extended Route Optimization
              (AERO)", draft-templin-6man-aero-37 (work in progress),
              November 2021.

   [I-D.templin-6man-omni]
              Templin, F. L. and T. Whyman, "Transmission of IP Packets
              over Overlay Multilink Network (OMNI) Interfaces",
              draft-templin-6man-omni-51 (work in progress), November
              2021.

   [RFC2473]  Conta, A. and S. Deering, "Generic Packet Tunneling in
              IPv6 Specification", RFC 2473, DOI 10.17487/RFC2473,
              December 1998.

   [RFC8900]  Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O.,
              and F. Gont, "IP Fragmentation Considered Fragile",
              BCP 230, RFC 8900, DOI 10.17487/RFC8900, September 2020.

Author's Address

   Fred L. Templin (editor)
   Boeing Research & Technology
   P.O. Box 3707
   Seattle, WA 98124
   USA

   Email: fltemplin@acm.org