6lo                                                      P. Thubert, Ed.
Internet-Draft                                             Cisco Systems
Updates: 4944 (if approved)                              23 October 2019
Intended status: Standards Track
Expires: 25 April 2020


                  6LoWPAN Selective Fragment Recovery
                  draft-ietf-6lo-fragment-recovery-07

Abstract

   This draft updates RFC 4944 with a simple protocol to recover
   individual fragments across a route-over mesh network, with a minimal
   flow control to protect the network against bloat.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 25 April 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
     2.1.  BCP 14
     2.2.  References
     2.3.  6LoWPAN Acronyms
     2.4.  Referenced Work
     2.5.  New Terms
   3.  Updating RFC 4944
   4.  Extending draft-ietf-6lo-minimal-fragment
     4.1.  Slack in the First Fragment
     4.2.  Gap between frames
     4.3.  Modifying the First Fragment
   5.  New Dispatch types and headers
     5.1.  Recoverable Fragment Dispatch type and Header
     5.2.  RFRAG Acknowledgment Dispatch type and Header
   6.  Fragments Recovery
     6.1.  Forwarding Fragments
       6.1.1.  Upon the first fragment
       6.1.2.  Upon the next fragments
     6.2.  Upon the RFRAG Acknowledgments
     6.3.  Aborting the Transmission of a Fragmented Packet
   7.  Management Considerations
     7.1.  Protocol Parameters
     7.2.  Observing the network
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgments
   11. Normative References
   12. Informative References
   Appendix A.  Rationale
   Appendix B.  Requirements
   Appendix C.  Considerations On Flow Control
   Author's Address

1.  Introduction

   In most Low Power and Lossy Network (LLN) applications, the bulk of
   the traffic consists of small chunks of data (on the order of a few
   bytes to a few tens of bytes) at a time.  Given that an IEEE Std.
   802.15.4 [IEEE.802.15.4] frame can carry a payload of 74 bytes or
   more, fragmentation is usually not required.  However, and though
   this happens only occasionally, a number of mission critical
   applications do require the capability to transfer larger chunks of
   data, for instance to support the firmware upgrade of the LLN nodes
   or the extraction of logs from LLN nodes.  In the former case, the
   large chunk of data is transferred to the LLN node, whereas in the
   latter, the large chunk flows away from the LLN node.  In both cases,
   the size can be on the order of 10 kilobytes or more and an end-to-
   end reliable transport is required.

   "Transmission of IPv6 Packets over IEEE 802.15.4 Networks" [RFC4944]
   defines the original 6LoWPAN datagram fragmentation mechanism for
   LLNs.  One critical issue with this original design is that routing
   an IPv6 [RFC8200] packet across a route-over mesh requires
   reassembling the full packet at each hop, which may cause latency
   along a path and an overall buffer bloat in the network.  The "6TiSCH
   Architecture" [I-D.ietf-6tisch-architecture] recommends using a hop-
   by-hop fragment forwarding technique to alleviate those undesirable
   effects.  "LLN Minimal Fragment Forwarding"
   [I-D.ietf-6lo-minimal-fragment] proposes such a technique, in a
   fashion that is compatible with [RFC4944] without the need to define
   a new protocol.

   However, adding that capability alone to the local implementation of
   the original 6LoWPAN fragmentation would not address the inherent
   fragility of fragmentation (see [I-D.ietf-intarea-frag-fragile]), in
   particular the issues of resources locked on the receiver and the
   wasted transmissions due to the loss of a single fragment in a whole
   datagram.  [Kent] compares the unreliable delivery of fragments with
   a mechanism it calls "selective acknowledgements" that recovers the
   loss of a fragment individually.  The paper illustrates the benefits
   that can be derived from such a method in figures 1, 2 and 3, pages 6
   and 7.  [RFC4944] has no selective recovery, and the whole datagram
   fails when one fragment is not delivered to the destination 6LoWPAN
   endpoint.  Constrained memory resources are blocked on the receiver
   until the receiver times out, possibly causing the loss of subsequent
   packets that cannot be received for lack of buffers.

   That problem is exacerbated when forwarding fragments over multiple
   hops, since a loss at an intermediate hop will not be discovered by
   either the source or the destination.  The source will keep on
   sending fragments, wasting even more resources in the network and
   possibly contributing to the condition that caused the loss, to no
   avail since the datagram cannot arrive in its entirety.  RFC 4944 is
   also missing signaling to abort a multi-fragment transmission at any
   time and from either end, and, if the capability to forward fragments
   is implemented, to clean up the related state in the network.  It
   also lacks flow control capabilities to avoid contributing to
   congestion that may in turn cause the loss of a fragment and
   potentially the retransmission of the full datagram.

   This specification provides a method to forward fragments across a
   multi-hop route-over mesh, and a selective acknowledgment to recover
   individual fragments between 6LoWPAN endpoints.

   The method is designed to limit congestion loss in the network and
   addresses the requirements that are detailed in Appendix B.

2.  Terminology

2.1.  BCP 14

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119][RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.2.  References

   In this document, readers will encounter terms and concepts that are
   discussed in "Problem Statement and Requirements for IPv6 over
   Low-Power Wireless Personal Area Network (6LoWPAN) Routing"
   [RFC6606].

2.3.  6LoWPAN Acronyms

   This document uses the following acronyms:

   6BBR:  6LoWPAN Backbone Router
   6LBR:  6LoWPAN Border Router
   6LN:   6LoWPAN Node
   6LR:   6LoWPAN Router
   LLN:   Low-Power and Lossy Network

2.4.  Referenced Work

   Past experience with fragmentation has shown that misassociated or
   lost fragments can lead to poor network behavior and, occasionally,
   trouble at the application layer.  The reader is encouraged to read
   "IPv4 Reassembly Errors at High Data Rates" [RFC4963] and follow the
   references for more information.

   That experience led to the definition of the "Path MTU discovery"
   [RFC8201] (PMTUD) protocol that limits fragmentation over the
   Internet.

   Specifically in the case of UDP, valuable additional information can
   be found in "UDP Usage Guidelines for Application Designers"
   [RFC8085].

   Readers are expected to be familiar with all the terms and concepts
   that are discussed in "IPv6 over Low-Power Wireless Personal Area
   Networks (6LoWPANs): Overview, Assumptions, Problem Statement, and
   Goals" [RFC4919] and "Transmission of IPv6 Packets over IEEE 802.15.4
   Networks" [RFC4944].

   "The Benefits of Using Explicit Congestion Notification (ECN)"
   [RFC8087] provides useful information on the potential benefits and
   pitfalls of using ECN.

   Quoting the "Multiprotocol Label Switching (MPLS) Architecture"
   [RFC3031]: with MPLS, 'packets are "labeled" before they are
   forwarded'.  At subsequent hops, there is no further analysis of the
   packet's network layer header.  Rather, the label is used as an index
   into a table which specifies the next hop, and a new label.  The
   MPLS technique is leveraged in the present specification to forward
   fragments that actually do not have a network layer header, since the
   fragmentation occurs below IP.

   "LLN Minimal Fragment Forwarding" [I-D.ietf-6lo-minimal-fragment]
   introduces the concept of a Virtual Reassembly Buffer (VRB) and an
   associated technique to forward fragments as they come, using the
   datagram_tag as a label in a fashion similar to MPLS.  This
   specification reuses that technique with slightly modified controls.

2.5.  New Terms

   This specification uses the following terms:

   6LoWPAN endpoints:  The LLN nodes in charge of generating or
      expanding a 6LoWPAN header from/to a full IPv6 packet.  The
      6LoWPAN endpoints are the points where fragmentation and
      reassembly take place.

   Compressed Form:  This specification uses the generic term Compressed
      Form to refer to the format of a datagram after the action of
      [RFC6282] and possibly [RFC8138] for RPL [RFC6550] artifacts.

   datagram_size:  The size of the datagram in its Compressed Form
      before it is fragmented.  The datagram_size is expressed in a unit
      that depends on the MAC layer technology, by default a byte.

   datagram_tag:  An identifier of a datagram that is locally unique to
      the Layer-2 sender.  Associated with the MAC address of the
      sender, this becomes a globally unique identifier for the
      datagram.

   fragment_offset:  The offset of a particular fragment of a datagram
      in its Compressed Form.  The fragment_offset is expressed in a
      unit that depends on the MAC layer technology and is by default a
      byte.

   RFRAG:  Recoverable Fragment

   RFRAG-ACK:  Recoverable Fragment Acknowledgment

   RFRAG Acknowledgment Request:  An RFRAG with the Acknowledgment
      Request flag ('X' flag) set.

   NULL bitmap:  Refers to a bitmap with all bits set to zero.

   FULL bitmap:  Refers to a bitmap with all bits set to one.

   Forward:  The direction of a Label Switched Path (LSP), followed by
      the RFRAG.

   Reverse:  The reverse direction of an LSP, taken by the RFRAG-ACK.

3.  Updating RFC 4944

   This specification updates the fragmentation mechanism that is
   specified in "Transmission of IPv6 Packets over IEEE 802.15.4
   Networks" [RFC4944] for use in route-over LLNs by providing a model
   where fragments can be forwarded end-to-end across a 6LoWPAN LLN, and
   where fragments that are lost on the way can be recovered
   individually.
   A new format for fragments is introduced and new dispatch types are
   defined in Section 5.

   [RFC8138] allows modifying the size of a packet en-route by removing
   the consumed hops in a compressed Routing Header.  As a result,
   fragment_offset and datagram_size (see Section 2.5) must also be
   modified en-route, which is difficult to do in the uncompressed form.
   This specification expresses those fields in the Compressed Form and
   allows them to be modified en-route easily (see Section 4.3).

   Note that, consistent with Section 2 of [RFC6282] for the
   fragmentation mechanism described in Section 5.3 of [RFC4944], any
   header that cannot fit within the first fragment MUST NOT be
   compressed when using the fragmentation mechanism described in this
   specification.

4.  Extending draft-ietf-6lo-minimal-fragment

   This specification extends the fragment forwarding mechanism
   specified in "LLN Minimal Fragment Forwarding"
   [I-D.ietf-6lo-minimal-fragment] by providing additional operations to
   improve the management of the Virtual Reassembly Buffer (VRB) in the
   context of recoverable fragments.

4.1.  Slack in the First Fragment

   At the time of this writing, [I-D.ietf-6lo-minimal-fragment] allows
   for refragmenting in intermediate nodes, meaning that some bytes from
   a given fragment may be left in the VRB to be added to the next
   fragment.  This can happen when the outgoing fragment needs space
   that the incoming fragment did not, for instance because the 6LoWPAN
   Header Compression is not as efficient on the outgoing link.  For
   example, the Interface ID (IID) of the source IPv6 address may be
   elided by the originator on the first hop because it matches the
   source MAC address, but it cannot be elided on the next hops because
   the source MAC address changes.

   This specification cannot allow this operation since fragments are
   recovered end-to-end based on a sequence number.  This means that the
   fragments that contain a 6LoWPAN-compressed header MUST have enough
   slack to enable a less efficient compression in the next hops that
   still fits in one MAC frame.  For instance, if the IID of the source
   IPv6 address is elided by the originator, then it MUST compute the
   fragment_size as if the MTU were 8 bytes smaller.  This way, the next
   hop can restore the source IID to the first fragment without
   impacting the second fragment.

4.2.  Gap between frames

   This specification introduces the concept of an Inter-Frame Gap,
   which is a configurable interval of time between transmissions to the
   same next hop.  In the case of half duplex interfaces, this
   InterFrameGap ensures that the next hop has finished handling the
   previous frame and is capable of receiving the next one.

   In the case of a mesh operating at a single frequency with
   omnidirectional antennas, a larger InterFrameGap is required to
   protect the frame against hidden terminal collisions with the
   previous frame of the same flow that is still progressing along a
   common path.

   The Inter-Frame Gap is useful even for unfragmented datagrams, but it
   becomes a necessity for fragments that are typically generated in a
   fast sequence and are all sent over the exact same path.

4.3.  Modifying the First Fragment

   The compression of the Hop Limit, of the source and destination
   addresses in the IPv6 Header, and of the Routing Header, may change
   en-route in a Route-Over mesh LLN.  If the size of the first fragment
   is modified, then the intermediate node MUST adapt the datagram_size
   to reflect that difference.
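
   As an illustration only, the slack rule of Section 4.1 and the
   datagram_size adjustment above could be sketched in Python as
   follows.  The function names and the example sizes are hypothetical
   and are not defined by this specification; the 8-byte constant is the
   size of the IID that a next hop may have to restore.

```python
IID_SIZE = 8  # bytes; size of an IID that a next hop may need to restore inline


def first_fragment_budget(mtu: int, source_iid_elided: bool) -> int:
    """Bytes the originator may use for the first fragment (hypothetical helper).

    If the source IID was elided, size the fragment as if the MTU were
    8 bytes smaller, so that a later hop can restore the IID in place.
    """
    return mtu - IID_SIZE if source_iid_elided else mtu


def adapt_first_fragment(datagram_size: int, size_in: int, size_out: int):
    """Return (new datagram_size, difference) after a hop resizes the first fragment."""
    difference = size_out - size_in
    return datagram_size + difference, difference


print(first_fragment_budget(96, True))    # 88
print(adapt_first_fragment(500, 80, 88))  # (508, 8)
```

   In this sketch, the returned difference is the value that an
   intermediate node would keep with the VRB state for the datagram.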

   The intermediate node MUST also save the difference in datagram_size
   of the first fragment in the VRB and add it to the datagram_size and
   to the fragment_offset of all the subsequent fragments for that
   datagram.

5.  New Dispatch types and headers

   This specification enables the 6LoWPAN fragmentation sublayer to
   provide an MTU up to 2048 bytes to the upper layer, which can be the
   6LoWPAN Header Compression sublayer that is defined in the
   "Compression Format for IPv6 Datagrams" [RFC6282] specification.  In
   order to achieve this, this specification enables the fragmentation
   and the reliable transmission of fragments over a multihop 6LoWPAN
   mesh network.

   This specification provides a technique that is derived from MPLS to
   forward individual fragments across a 6LoWPAN route-over mesh without
   reassembly at each hop.  The datagram_tag is used as a label; it is
   locally unique to the node that owns the source MAC address of the
   fragment, so together the MAC address and the label can identify the
   fragment globally.  A node may build the datagram_tag in its own
   locally-significant way, as long as the chosen datagram_tag stays
   unique to the particular datagram for the lifetime of that datagram.
   As a result, the label does not need to be globally unique, but it
   must be swapped at each hop as the source MAC address changes.

   This specification extends RFC 4944 [RFC4944] with two new Dispatch
   types: one for the Recoverable Fragment (RFRAG) and one for the RFRAG
   Acknowledgment that is sent back.  The new 6LoWPAN Dispatch types are
   taken from Page 0 [RFC8025] as indicated in Table 1 in Section 9.

   In the following sections, a "datagram_tag" extends the semantics
   defined in Section 5.3 of [RFC4944], "Fragmentation Type and Header".
   The datagram_tag is a locally unique identifier for the datagram from
   the perspective of the sender.
   This means that the datagram_tag identifies a datagram uniquely in
   the network when associated with the source of the datagram.  As the
   datagram gets forwarded, the source changes and the datagram_tag must
   be swapped as detailed in [I-D.ietf-6lo-minimal-fragment].

5.1.  Recoverable Fragment Dispatch type and Header

   In this specification, if the packet is compressed, then the size and
   offset of the fragments are expressed relative to the Compressed Form
   of the packet, as opposed to the uncompressed (native) packet form.

   The format of the fragment header is shown in Figure 1.  It is the
   same for all fragments.  The format has a length and an offset, as
   well as a sequence field.  This would be redundant if the offset was
   computed as the product of the sequence by the length, but this is
   not the case.  The position of a fragment in the reassembly buffer is
   neither correlated with the value of the sequence field nor with the
   order in which the fragments are received.  This enables out-of-
   sequence subfragmenting, e.g., a fragment seq. 5 that is retried end-
   to-end as smaller fragments seq. 5, 13 and 14 due to a change of MTU
   along the path between the 6LoWPAN endpoints.

                           1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                      |1 1 1 0 1 0 0|E|  datagram_tag |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |X| sequence|   fragment_size   |        fragment_offset        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                               X set == Ack-Request

                 Figure 1: RFRAG Dispatch type and Header

   There is no requirement on the receiver to check for contiguity of
   the received fragments, and the sender MUST ensure that when all
   fragments are acknowledged, then the datagram is fully received.
   This may be useful in particular in the case where the MTU changes
   and a fragment sequence is retried with a smaller fragment_size, the
   remainder of the original fragment being retried with new sequence
   values.

   The first fragment is recognized by a sequence of 0; it carries its
   fragment_size and the datagram_size of the compressed packet before
   it is fragmented, whereas the other fragments carry their
   fragment_size and fragment_offset.  The last fragment for a datagram
   is recognized when its fragment_offset and its fragment_size add up
   to the datagram_size.

   Recoverable Fragments are sequenced and a bitmap is used in the RFRAG
   Acknowledgment to indicate the received fragments by setting the
   individual bits that correspond to their sequence.

   X:  1 bit; Ack-Request: when set, the sender requires an RFRAG
      Acknowledgment from the receiver.

   E:  1 bit; Explicit Congestion Notification; the "E" flag is cleared
      by the source of the fragment and set by intermediate routers to
      signal that this fragment experienced congestion along its path.

   Fragment_size:  10 bit unsigned integer; the size of this fragment in
      a unit that depends on the MAC layer technology.  Unless
      overridden by a more specific specification, that unit is the
      octet, which allows fragments of up to 1023 bytes.

   datagram_tag:  8 bits, as shown in Figure 1; an identifier of the
      datagram that is locally unique to the sender.

   Sequence:  5 bit unsigned integer; the sequence number of the
      fragment in the acknowledgment bitmap.  Fragments are numbered
      [0..N] where N is in [0..31].  A Sequence of 0 indicates the first
      fragment in a datagram, but non-zero values are not indicative of
      the position in the reassembly buffer.

   Fragment_offset:  16 bit unsigned integer.

      When the Fragment_offset is set to a non-0 value, its semantics
      depend on the value of the Sequence field as follows:

      *  For a first fragment (i.e.
         with a Sequence of 0), this field indicates the datagram_size
         of the compressed datagram, to help the receiver allocate an
         adequately sized buffer for the reception and reassembly
         operations.  The fragment may be stored for local reassembly.
         Alternatively, it may be routed based on the destination IPv6
         address.  In that case, a VRB state must be installed as
         described in Section 6.1.1.
      *  When the Sequence is not 0, this field indicates the offset of
         the fragment in the Compressed Form of the datagram.  The
         fragment may be added to a local reassembly buffer or forwarded
         based on an existing VRB as described in Section 6.1.2.

      A Fragment_offset that is set to a value of 0 indicates an abort
      condition and all state regarding the datagram should be cleaned
      up once the processing of the fragment is complete; the processing
      of the fragment depends on whether there is a VRB already
      established for this datagram, and the next hop is still
      reachable:

      *  if a VRB already exists and is not broken, the fragment is to
         be forwarded along the associated Label Switched Path (LSP) as
         described in Section 6.1.2, but regardless of the value of the
         Sequence field;
      *  else, if the Sequence is 0, then the fragment is to be routed
         as described in Section 6.1.1 but no state is conserved
         afterwards.  In that case, the session, if it exists, is
         aborted and the packet is also forwarded in an attempt to clean
         up the next hops along the path indicated by the IPv6 header
         (possibly including a routing header).

      If the fragment cannot be forwarded or routed, then an abort
      RFRAG-ACK is sent back to the source as described in
      Section 6.1.2.

5.2.  RFRAG Acknowledgment Dispatch type and Header

   This specification also defines a 4-octet RFRAG Acknowledgment bitmap
   that is used by the reassembling endpoint to confirm selectively the
   reception of individual fragments.
   A given offset in the bitmap maps one to one with a given sequence
   number and indicates which fragment is acknowledged as follows:

                           1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                  RFRAG Acknowledgment Bitmap                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       ^                 ^
       |                 |    bitmap indicating whether:
       |                 +----- Fragment with sequence 9 was received
       +----------------------- Fragment with sequence 0 was received

              Figure 2: RFRAG Acknowledgment bitmap encoding

   Figure 3 shows an example Acknowledgment bitmap which indicates that
   all fragments from sequence 0 to 20 were received, except for
   fragments 1, 2 and 16 that were lost and must be retried.

                           1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1|0|0|1|1|1|1|1|1|1|1|1|1|1|1|1|0|1|1|1|1|0|0|0|0|0|0|0|0|0|0|0|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 3: Example RFRAG Acknowledgment Bitmap

   The RFRAG Acknowledgment Bitmap is included in an RFRAG
   Acknowledgment header, as follows:

                           1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                      |1 1 1 0 1 0 1|E|  datagram_tag |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |             RFRAG Acknowledgment Bitmap (32 bits)             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 4: RFRAG Acknowledgment Dispatch type and Header

   E:  1 bit; Explicit Congestion Notification Echo

      When set, the sender indicates that at least one of the
      acknowledged fragments was received with an Explicit Congestion
      Notification, indicating that the path followed by the fragments
      is subject to congestion.  See Appendix C for more.

   RFRAG Acknowledgment Bitmap:  An RFRAG Acknowledgment Bitmap, whereby
      setting the bit at offset x indicates that fragment x was
      received, as shown in Figure 2.  A NULL bitmap indicates that the
      fragmentation process is aborted.  A FULL bitmap indicates that
      the fragmentation process is complete: all fragments were received
      at the reassembly endpoint.

6.  Fragments Recovery

   The Recoverable Fragment header RFRAG is used to transport a fragment
   and optionally request an RFRAG Acknowledgment that will confirm the
   successful reception of one or more fragments.  An RFRAG
   Acknowledgment is carried as a standalone fragment header (i.e. with
   no 6LoWPAN payload) in a message that is propagated back to the
   6LoWPAN endpoint that was the originator of the fragments.  To
   achieve this, each hop that performed an MPLS-like operation on
   fragments reverses that operation for the RFRAG-ACK by sending a
   frame from the next hop to the previous hop as known by its MAC
   address in the VRB.  The datagram_tag in the RFRAG-ACK is unique to
   the receiver and is enough information for an intermediate hop to
   locate the VRB that contains the datagram_tag used by the previous
   hop and the Layer-2 information associated with it (interface and MAC
   address).

   The 6LoWPAN endpoint that fragments the packets at 6LoWPAN level (the
   sender) also controls the number of acknowledgments by setting the
   Ack-Request flag in the RFRAG packets.  The sender may set the Ack-
   Request flag on any fragment to perform congestion control by
   limiting the number of outstanding fragments, which are the fragments
   that have been sent but for which reception or loss was not
   positively confirmed by the reassembling endpoint.  The maximum
   number of outstanding fragments is the Window-Size.  It is
   configurable and may vary in case of ECN notification.
   When the 6LoWPAN endpoint that reassembles the packets at 6LoWPAN
   level (the receiver) receives a fragment with the Ack-Request flag
   set, it MUST send an RFRAG Acknowledgment back to the originator to
   confirm reception of all the fragments it has received so far.

   The Ack-Request ('X') set in an RFRAG marks the end of a window.
   This flag MUST be set on the last fragment if the sender wishes to
   protect the datagram, and it MAY be set in any intermediate fragment
   for the purpose of flow control.  This ARQ process MUST be protected
   by a timer, and the fragment that carries the 'X' flag MAY be retried
   upon time out a configurable number of times (see Section 7.1).  Upon
   exhaustion of the retries, the sender may either abort the
   transmission of the datagram or retry the datagram from the first
   fragment with an 'X' flag set in order to reestablish a path and
   discover which fragments were received over the old path in the
   acknowledgment bitmap.  When the sender of the fragment knows that an
   underlying link-layer mechanism protects the fragments, it may
   refrain from using the RFRAG Acknowledgment mechanism, and never set
   the Ack-Request bit.

   The RFRAG Acknowledgment can optionally carry an ECN indication for
   flow control (see Appendix C).  The receiver of a fragment with the
   'E' (ECN) flag set MUST echo that information by setting the 'E'
   (ECN) flag in the next RFRAG Acknowledgment.

   In order to protect the datagram, the sender transfers a controlled
   number of fragments and flags the last fragment of a window with an
   RFRAG Acknowledgment Request.  The receiver MUST acknowledge a
   fragment with the acknowledgment request bit set.  If any fragment
   immediately preceding an acknowledgment request is still missing,
   the receiver MAY intentionally delay its acknowledgment to allow in-
   transit fragments to arrive.
Because it might defeat the round-trip delay computation, delaying
the acknowledgment should be configurable and not enabled by default.

The receiver MAY issue unsolicited acknowledgments.  An unsolicited
acknowledgment signals to the sender endpoint that it can resume
sending if it had reached its maximum number of outstanding
fragments.  Another use is to inform the sender that the reassembling
endpoint aborted the processing of an individual datagram.

When all the fragments are received, the receiving endpoint
reconstructs the packet, passes it to the upper layer, sends an RFRAG
Acknowledgment on the reverse path with a FULL bitmap, and arms a
short timer to absorb packets that are still in flight for that
datagram without creating new state, and to abort the communication
if fragments keep arriving.

Note that acknowledgments might consume precious resources, so the
use of unsolicited acknowledgments should be configurable and not
enabled by default.

An observation is that streamlining the forwarding of fragments
generally reduces the latency over the LLN mesh, providing room for
retries within existing upper-layer reliability mechanisms.  The
sender protects the transmission over the LLN mesh with a retry timer
that is computed according to the method detailed in [RFC6298].  It
is expected that the upper-layer retries obey the recommendations in
"UDP Usage Guidelines" [RFC8085], in which case a single round of
fragment recovery should fit within the upper-layer recovery timers.

Fragments are sent in a round-robin fashion: the sender sends all the
fragments for a first time before it retries any lost fragment; lost
fragments are retried in sequence, oldest first.  This mechanism
enables the receiver to acknowledge fragments that were delayed in
the network before they are retried.
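
The round-robin rule above can be sketched as follows.  This is an
illustration only, not a normative algorithm: the class and method
names are hypothetical, "oldest first" is interpreted here as lowest
sequence first, and window limits and timers are left out.

```python
import heapq


class RoundRobinSender:
    """Order of transmission: every fragment once, then retries."""

    def __init__(self, num_fragments):
        self._num = num_fragments
        self._next_new = 0    # next sequence never sent so far
        self._retries = []    # min-heap of sequences believed lost
        self._acked = set()   # sequences confirmed by the receiver

    def mark_lost(self, seq):
        """Queue a fragment for retry unless it is acked or queued."""
        if seq not in self._acked and seq not in self._retries:
            heapq.heappush(self._retries, seq)

    def mark_acked(self, seq):
        """Record an acknowledgment, possibly for a delayed fragment."""
        self._acked.add(seq)

    def next_fragment(self):
        """Return the next sequence to send, or None if nothing is left."""
        # First pass: send every fragment once before any retry.
        if self._next_new < self._num:
            seq = self._next_new
            self._next_new += 1
            return seq
        # Retry pass: lowest sequence first, skipping late-acked ones.
        while self._retries:
            seq = heapq.heappop(self._retries)
            if seq not in self._acked:
                return seq
        return None
```

In this sketch, a fragment that was presumed lost but is acknowledged
before its turn comes up in the retry pass is simply skipped, which
is the benefit the paragraph above describes.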
When a single frequency is used by contiguous hops, the sender should
wait a reasonable amount of time between fragments so as to let a
fragment progress a few hops and avoid hidden terminal issues.  This
precaution is not required on channel hopping technologies such as
Time-Slotted Channel Hopping (TSCH) [RFC7554], where nodes that
communicate at Layer-2 are scheduled to send and receive
respectively, and different hops operate on different channels.

6.1.  Forwarding Fragments

It is assumed that the first fragment is large enough to carry the
IPv6 header and make routing decisions.  If that is not so, then this
specification MUST NOT be used.

This specification extends the Virtual Reassembly Buffer (VRB)
technique to forward fragments with no intermediate reconstruction of
the entire packet.  It inherits operations like datagram_tag
switching and using a timer to clean the VRB when the traffic dries
up.  In more detail, the first fragment carries the IP header and it
is routed all the way from the fragmenting endpoint to the
reassembling endpoint.  Upon the first fragment, the routers along
the path install a label-switched path (LSP), and the following
fragments are label-switched along that path.  As a consequence, the
next fragments can only follow the path that was set up by the first
fragment and cannot follow an alternate route.  The datagram_tag is
used to carry the label, which is swapped at each hop.  All fragments
follow the same path and fragments are delivered in the order in
which they are sent.

6.1.1.  Upon the first fragment

In Route-Over mode, the source and destination MAC addresses in a
frame change at each hop.  The label that is formed and placed in the
datagram_tag is associated with the source MAC and only valid (and
unique) for that source MAC.  Upon a first fragment (i.e.
with a sequence of zero), an intermediate router creates a VRB and
the associated LSP state for the tuple (source MAC address,
datagram_tag) and the fragment is forwarded along the IPv6 route that
matches the destination IPv6 address in the IPv6 header as prescribed
by [I-D.ietf-6lo-minimal-fragment], whereas the receiving endpoint
allocates a reassembly buffer.  The LSP state makes it possible to
match the (previous MAC address, datagram_tag) in an incoming
fragment to the tuple (next MAC address, swapped datagram_tag) used
in the forwarded fragment and points at the VRB.  In addition, the
router also forms a Reverse LSP state indexed by the MAC address of
the next hop and the swapped datagram_tag.  This Reverse LSP state
also points at the VRB and makes it possible to match the (next MAC
address, swapped_datagram_tag) found in an RFRAG Acknowledgment to
the tuple (previous MAC address, datagram_tag) used when forwarding a
Fragment Acknowledgment (RFRAG-ACK) back to the sender endpoint.

6.1.2.  Upon the next fragments

Upon a next fragment (i.e., with a non-zero sequence), an
intermediate router looks up an LSP indexed by the tuple (MAC
address, datagram_tag) found in the fragment.  If it is found, the
router forwards the fragment using the associated VRB as prescribed
by [I-D.ietf-6lo-minimal-fragment].

If the VRB for the tuple is not found, the router builds an RFRAG-ACK
to abort the transmission of the packet.  The resulting message has
the following information:

*  The source and destination MAC addresses are swapped from those
   found in the fragment
*  The datagram_tag is set to the datagram_tag found in the fragment
*  A NULL bitmap is used to signal the abort condition

At this point the router is all set and can send the RFRAG-ACK back
to the previous router.
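
The per-hop state described in Sections 6.1.1 and 6.1.2 can be
pictured with the following Python sketch.  It is a simplified model
under stated assumptions: MAC addresses are plain strings, the next
hop is supplied by the caller instead of an IPv6 route lookup, the
swapped datagram_tag comes from a simple counter with no wraparound,
a NULL bitmap is represented as 0, and all names are hypothetical.
VRB timers are omitted.

```python
from dataclasses import dataclass

NULL_BITMAP = 0  # assumption: a NULL bitmap is encoded as all zeroes


@dataclass
class VRB:
    prev_mac: str   # MAC address of the previous hop
    prev_tag: int   # datagram_tag used by the previous hop
    next_mac: str   # MAC address of the next hop
    next_tag: int   # swapped datagram_tag used towards the next hop


class FragmentRouter:
    def __init__(self):
        self.lsp = {}           # (prev_mac, prev_tag) -> VRB
        self.reverse_lsp = {}   # (next_mac, next_tag) -> VRB
        self._tag_counter = 0   # hypothetical tag allocator

    def on_first_fragment(self, prev_mac, tag, next_mac):
        """Create the VRB and install both LSP and Reverse LSP state."""
        vrb = VRB(prev_mac, tag, next_mac, self._tag_counter)
        self._tag_counter += 1
        self.lsp[(prev_mac, tag)] = vrb
        self.reverse_lsp[(next_mac, vrb.next_tag)] = vrb
        return ("forward", vrb.next_mac, vrb.next_tag)

    def on_next_fragment(self, prev_mac, tag):
        """Label-switch the fragment, or abort if no state exists."""
        vrb = self.lsp.get((prev_mac, tag))
        if vrb is None:
            # Abort: reply to the previous hop, same tag, NULL bitmap.
            return ("abort_ack", prev_mac, tag, NULL_BITMAP)
        return ("forward", vrb.next_mac, vrb.next_tag)

    def on_rfrag_ack(self, src_mac, tag):
        """Map an RFRAG-ACK back to the previous hop via the Reverse LSP."""
        vrb = self.reverse_lsp.get((src_mac, tag))
        if vrb is None:
            return None  # no Reverse LSP: nothing to forward
        return ("forward_ack", vrb.prev_mac, vrb.prev_tag)
```

The two dictionaries point at the same VRB object, mirroring the text
where both the LSP state and the Reverse LSP state point at the VRB.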
The RFRAG-ACK should normally be forwarded all the way to the source
using the Reverse LSP state in the VRBs in the intermediate routers
as described in the next section.

6.2.  Upon the RFRAG Acknowledgments

Upon an RFRAG-ACK, the router looks up a Reverse LSP indexed by the
tuple (MAC address, datagram_tag), which are respectively the source
MAC address of the received frame and the received datagram_tag.  If
it is found, the router forwards the fragment using the associated
VRB as prescribed by [I-D.ietf-6lo-minimal-fragment], but using the
Reverse LSP so that the RFRAG-ACK flows back to the sender endpoint.

If the Reverse LSP is not found, the router MUST silently drop the
RFRAG-ACK message.

Either way, if the RFRAG-ACK indicates that the datagram was entirely
received (FULL bitmap), the router arms a short timer, and upon
timeout, the VRB and all the associated state are destroyed.  Until
the timer elapses, fragments of that datagram may still be received,
e.g., if the RFRAG-ACK was lost on the way back and the source
retried the last fragment.  In that case, the router forwards the
fragment according to the state in the VRB.

This specification does not provide a method to discover the number
of hops or the minimal value of MTU along those hops.  But should the
minimal MTU decrease, it is possible to retry a long fragment (say
sequence of 5) with first a shorter fragment of the same sequence (5
again) and then one or more other fragments with a sequence that was
not used before (e.g., 13 and 14).  Note that Path MTU Discovery is
out of scope for this document.

6.3.  Aborting the Transmission of a Fragmented Packet

A reset is signaled on the forward path with a pseudo fragment that
has the fragment_offset, sequence and fragment_size all set to 0, and
no data.
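
As an illustration of the reset condition just stated, the following
Python sketch models an RFRAG as a small record and tests for the
pseudo fragment.  The record layout and the names RFrag, make_reset
and is_reset are hypothetical; only the three zeroed fields and the
empty payload come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RFrag:
    datagram_tag: int
    sequence: int
    fragment_offset: int
    fragment_size: int
    ack_request: bool = False   # the 'X' flag
    payload: bytes = b""


def make_reset(datagram_tag, ack_request=False):
    """Build a reset pseudo fragment for the given datagram_tag."""
    return RFrag(datagram_tag, 0, 0, 0, ack_request, b"")


def is_reset(frag):
    """True if offset, sequence and size are all 0 and there is no data."""
    return (frag.fragment_offset == 0
            and frag.sequence == 0
            and frag.fragment_size == 0
            and not frag.payload)
```

A regular first fragment also has a sequence of zero, so the test
must check the fragment_size and the payload as well, as done here.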
When the sender or a router on the way decides that a packet should
be dropped and the fragmentation process aborted, it generates a
reset pseudo fragment and forwards it down the fragment path.

Each router along the path forwards the pseudo fragment based on the
VRB state.  If an acknowledgment is not requested, the VRB and all
associated state are destroyed.

Upon reception of the pseudo fragment, the receiver cleans up all
resources for the packet associated with the datagram_tag.  If an
acknowledgment is requested, the receiver responds with a NULL
bitmap.

The other way around, the receiver might need to abort the processing
of a fragmented packet for internal reasons, for instance if it is
out of reassembly buffers, or if it keeps receiving fragments beyond
a reasonable time while it considers that this packet is already
fully reassembled and was passed to the upper layer.  In that case,
the receiver SHOULD indicate so to the sender with a NULL bitmap in
an RFRAG Acknowledgment.  Upon an acknowledgment with a NULL bitmap,
the sender endpoint MUST abort the transmission of the fragmented
datagram.

7.  Management Considerations

7.1.  Protocol Parameters

There is no particular configuration on the receiver, as echoing ECN
is always on.  The configuration only applies to the sender, which is
in control of the transmission.  The management system SHOULD be
capable of providing the parameters below:

MinFragmentSize:  The MinFragmentSize is the minimum value for the
   Fragment_Size.

OptFragmentSize:  The OptFragmentSize is the value for the
   Fragment_Size that the sender should use to start with.

MaxFragmentSize:  The MaxFragmentSize is the maximum value for the
   Fragment_Size.  It MUST be lower than the minimum MTU along the
   path.  A large value increases the chances of buffer bloat and
   transmission loss.
   The value MUST be less than 512 if the unit that is defined for
   the PHY layer is the octet.

UseECN:  Indicates whether the sender should react to ECN.  When the
   sender reacts to ECN, the Window_Size will vary between
   MinWindowSize and MaxWindowSize.

MinWindowSize:  The minimum value of Window_Size that the sender can
   use.

OptWindowSize:  The OptWindowSize is the value for the Window_Size
   that the sender should use to start with.

MaxWindowSize:  The maximum value of Window_Size that the sender can
   use.  The value MUST be less than 32.

InterFrameGap:  Indicates a minimum amount of time between
   transmissions.  All packets to the same destination, and in
   particular fragments, may be subject to receive-while-transmitting
   and hidden-terminal collisions with the next or the previous
   transmission as the fragments progress along the same path.  The
   InterFrameGap protects the propagation of one transmission before
   the next one is triggered and creates a duty cycle that controls
   the ratio of air time and memory in intermediate nodes that a
   particular datagram will use.

MinARQTimeOut:  The minimum amount of time a node should wait for an
   RFRAG Acknowledgment before it takes a next action.

OptARQTimeOut:  The initial value of the amount of time that a
   sender should wait for an RFRAG Acknowledgment before it takes a
   next action.

MaxARQTimeOut:  The maximum amount of time a node should wait for an
   RFRAG Acknowledgment before it takes a next action.

MaxFragRetries:  The maximum number of retries for a particular
   fragment.

MaxDatagramRetries:  The maximum number of retries from scratch for a
   particular datagram.

7.2.
Observing the network

The management system should monitor the number of retries and of ECN
settings that can be observed from the perspective of both the sender
and the receiver, and may tune the optimum size of Fragment_Size and
of the Window_Size, OptFragmentSize and OptWindowSize respectively,
at the sender.  The values should be bounded by the expected number
of hops and reduced beyond that when the number of datagrams that can
traverse an intermediate point may exceed its capacity and cause a
congestion loss.  The InterFrameGap is another tool that can be used
to increase the spacing between fragments of the same datagram and
reduce the ratio of time when a particular intermediate node holds a
fragment of that datagram.

8.  Security Considerations

The considerations in the Security section of [I-D.ietf-core-cocoa]
apply equally to this specification.

The process of recovering fragments does not appear to create any
opening for new threats compared to "Transmission of IPv6 Packets
over IEEE 802.15.4 Networks" [RFC4944].

The Virtual Reassembly Buffer inherited from
[I-D.ietf-6lo-minimal-fragment] may be used to perform a
Denial-of-Service (DoS) attack against the intermediate routers since
the routers need to maintain a state per flow.  The VRB
implementation technique described in
[I-D.ietf-lwig-6lowpan-virtual-reassembly] allows realigning which
data goes in which fragment, which causes the intermediate node to
store a portion of the data, which adds an attack vector that is not
present with this specification.  With this specification, the data
that is transported in each fragment is conserved and the state to
keep does not include any data that would not fit in the previous
fragment.

9.
IANA Considerations

This document allocates 4 values in Page 0 for recoverable fragments
from the "Dispatch Type Field" registry that was created by
"Transmission of IPv6 Packets over IEEE 802.15.4 Networks" [RFC4944]
and reformatted by "6LoWPAN Paging Dispatch" [RFC8025].

The suggested values (to be confirmed by IANA) are indicated in
Table 1.

+-------------+------+----------------------------------+-----------+
| Bit Pattern | Page | Header Type                      | Reference |
+=============+======+==================================+===========+
| 11 10100x   | 0    | RFRAG - Recoverable Fragment     | THIS RFC  |
+-------------+------+----------------------------------+-----------+
| 11 10101x   | 0    | RFRAG-ACK - RFRAG                | THIS RFC  |
|             |      | Acknowledgment                   |           |
+-------------+------+----------------------------------+-----------+

           Table 1: Additional Dispatch Value Bit Patterns

10.  Acknowledgments

The author wishes to thank Michel Veillette, Dario Tedeschi, Laurent
Toutain, Carles Gomez Montenegro, Thomas Watteyne and Michael
Richardson for in-depth reviews and comments.  Also many thanks to
Jonathan Hui, Jay Werb, Christos Polyzois, Soumitri Kolavennu, Pat
Kinney, Margaret Wasserman, Richard Kelsey, Carsten Bormann and Harry
Courtice for their various contributions.

11.  Normative References

[I-D.ietf-6lo-minimal-fragment]
           Watteyne, T., Bormann, C., and P. Thubert, "6LoWPAN
           Fragment Forwarding", Work in Progress, Internet-Draft,
           draft-ietf-6lo-minimal-fragment-04, 2 September 2019.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119,
           DOI 10.17487/RFC2119, March 1997.

[RFC4944]  Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler,
           "Transmission of IPv6 Packets over IEEE 802.15.4
           Networks", RFC 4944, DOI 10.17487/RFC4944, September 2007.

[RFC6282]  Hui, J., Ed. and P.
           Thubert, "Compression Format for IPv6
           Datagrams over IEEE 802.15.4-Based Networks", RFC 6282,
           DOI 10.17487/RFC6282, September 2011.

[RFC6554]  Hui, J., Vasseur, JP., Culler, D., and V. Manral, "An IPv6
           Routing Header for Source Routes with the Routing Protocol
           for Low-Power and Lossy Networks (RPL)", RFC 6554,
           DOI 10.17487/RFC6554, March 2012.

[RFC8025]  Thubert, P., Ed. and R. Cragie, "IPv6 over Low-Power
           Wireless Personal Area Network (6LoWPAN) Paging Dispatch",
           RFC 8025, DOI 10.17487/RFC8025, November 2016.

[RFC8138]  Thubert, P., Ed., Bormann, C., Toutain, L., and R. Cragie,
           "IPv6 over Low-Power Wireless Personal Area Network
           (6LoWPAN) Routing Header", RFC 8138, DOI 10.17487/RFC8138,
           April 2017.

[RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
           2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
           May 2017.

12.  Informative References

[I-D.ietf-6tisch-architecture]
           Thubert, P., "An Architecture for IPv6 over the TSCH mode
           of IEEE 802.15.4", Work in Progress, Internet-Draft,
           draft-ietf-6tisch-architecture-27, 18 October 2019.

[I-D.ietf-core-cocoa]
           Bormann, C., Betzler, A., Gomez, C., and I. Demirkol,
           "CoAP Simple Congestion Control/Advanced", Work in
           Progress, Internet-Draft, draft-ietf-core-cocoa-03,
           21 February 2018.

[I-D.ietf-intarea-frag-fragile]
           Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O.,
           and F. Gont, "IP Fragmentation Considered Fragile", Work
           in Progress, Internet-Draft,
           draft-ietf-intarea-frag-fragile-17, 30 September 2019.

[I-D.ietf-lwig-6lowpan-virtual-reassembly]
           Bormann, C. and T. Watteyne, "Virtual reassembly buffers
           in 6LoWPAN", Work in Progress, Internet-Draft,
           draft-ietf-lwig-6lowpan-virtual-reassembly-01,
           11 March 2019.

[IEEE.802.15.4]
           IEEE, "IEEE Standard for Low-Rate Wireless Networks",
           IEEE Standard 802.15.4, DOI 10.1109/IEEE
           P802.15.4-REVd/D01, October 2019.

[Kent]     Kent, C. and J. Mogul, "Fragmentation Considered
           Harmful", In Proc. SIGCOMM '87 Workshop on Frontiers in
           Computer Communications Technology,
           DOI 10.1145/55483.55524, August 1987.

[RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
           RFC 2914, DOI 10.17487/RFC2914, September 2000.

[RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
           Label Switching Architecture", RFC 3031,
           DOI 10.17487/RFC3031, January 2001.

[RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
           of Explicit Congestion Notification (ECN) to IP",
           RFC 3168, DOI 10.17487/RFC3168, September 2001.

[RFC4919]  Kushalnagar, N., Montenegro, G., and C. Schumacher, "IPv6
           over Low-Power Wireless Personal Area Networks (6LoWPANs):
           Overview, Assumptions, Problem Statement, and Goals",
           RFC 4919, DOI 10.17487/RFC4919, August 2007.

[RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
           Errors at High Data Rates", RFC 4963,
           DOI 10.17487/RFC4963, July 2007.

[RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
           Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

[RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
           "Computing TCP's Retransmission Timer", RFC 6298,
           DOI 10.17487/RFC6298, June 2011.

[RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
           Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
           JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
           Low-Power and Lossy Networks", RFC 6550,
           DOI 10.17487/RFC6550, March 2012.

[RFC6606]  Kim, E., Kaspar, D., Gomez, C., and C.
           Bormann, "Problem
           Statement and Requirements for IPv6 over Low-Power
           Wireless Personal Area Network (6LoWPAN) Routing",
           RFC 6606, DOI 10.17487/RFC6606, May 2012.

[RFC7554]  Watteyne, T., Ed., Palattella, M., and L. Grieco, "Using
           IEEE 802.15.4e Time-Slotted Channel Hopping (TSCH) in the
           Internet of Things (IoT): Problem Statement", RFC 7554,
           DOI 10.17487/RFC7554, May 2015.

[RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
           Recommendations Regarding Active Queue Management",
           BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015.

[RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
           Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
           March 2017.

[RFC8087]  Fairhurst, G. and M. Welzl, "The Benefits of Using
           Explicit Congestion Notification (ECN)", RFC 8087,
           DOI 10.17487/RFC8087, March 2017.

[RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
           (IPv6) Specification", STD 86, RFC 8200,
           DOI 10.17487/RFC8200, July 2017.

[RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
           "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
           DOI 10.17487/RFC8201, July 2017.

Appendix A.  Rationale

There are a number of uses for large packets in Wireless Sensor
Networks.  Such usages may not be the most typical or represent the
largest amount of traffic over the LLN; however, the associated
functionality can be critical enough to justify extra care for
ensuring effective transport of large packets across the LLN.

The list of those usages includes:

Towards the LLN node:

   Firmware update:  For example, a new version of the LLN node
      software is downloaded from a system manager over unicast or
      multicast services.  Such a reflashing operation typically
      involves updating a large number of similar LLN nodes over a
      relatively short period of time.

   Packages of Commands:  A number of commands or a full
      configuration can be packaged as a single message to ensure
      consistency and enable atomic execution or complete roll back.
      Until such commands are fully received and interpreted, the
      intended operation will not take effect.

From the LLN node:

   Waveform captures:  A number of consecutive samples are measured
      at a high rate for a short time and then transferred from a
      sensor to a gateway or an edge server as a single large report.

   Data logs:  LLN nodes may generate large logs of sampled data for
      later extraction.  LLN nodes may also generate system logs to
      assist in diagnosing problems on the node or network.

   Large data packets:  Rich data types might require more than one
      fragment.

Uncontrolled firmware download or waveform upload can easily result
in a massive increase of the traffic and saturate the network.

When a fragment is lost in transmission, the lack of recovery in the
original fragmentation system of RFC 4944 implies that all fragments
would need to be resent, further contributing to the congestion that
caused the initial loss, and potentially leading to congestion
collapse.

This saturation may lead to excessive radio interference, or random
early discard (leaky bucket) in relaying nodes.  Additional queuing
and memory congestion may result while waiting for a low-power next
hop to emerge from its sleeping state.

Considering that RFC 4944 defines an MTU of 1280 bytes and that in
most incarnations (except 802.15.4g) an IEEE Std. 802.15.4 frame can
limit the MAC payload to as few as 74 bytes, a packet might be
fragmented into at least 18 fragments at the 6LoWPAN shim layer.
Taking into account the worst-case header overhead for 6LoWPAN
Fragmentation and Mesh Addressing headers will increase the number of
required fragments to around 32.
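
The fragment counts quoted above can be checked with a short
calculation.  In this Python sketch, the 1280-byte MTU and the
74-byte MAC payload come from the text; the 40-byte figure is a
hypothetical usable payload chosen only to show what per-frame budget
would yield 32 fragments, since the exact worst-case overhead is not
given here.

```python
def fragments_needed(datagram_size, payload_per_frame):
    """Ceiling division: frames needed to carry the whole datagram."""
    if payload_per_frame <= 0:
        raise ValueError("payload_per_frame must be positive")
    return -(-datagram_size // payload_per_frame)


MTU = 1280          # bytes, from the text
MAC_PAYLOAD = 74    # bytes, smallest MAC payload quoted in the text

# Lower bound: the whole MAC payload carries datagram bytes.
best_case = fragments_needed(MTU, MAC_PAYLOAD)

# Hypothetical: headers leave about 40 usable bytes per frame.
HYPOTHETICAL_USABLE = 40
worst_case = fragments_needed(MTU, HYPOTHETICAL_USABLE)

print(best_case, worst_case)  # prints "18 32"
```

Under that assumption, roughly 34 of the 74 bytes in each frame would
go to headers, which is consistent with the order of magnitude the
paragraph describes.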
This level of fragmentation is much higher than that traditionally
experienced over the Internet with IPv4 fragments.  At the same time,
the use of radios increases the probability of transmission loss and
Mesh-Under techniques compound that risk over multiple hops.

Mechanisms such as TCP or application-layer segmentation could be
used to support end-to-end reliable transport.  One option to support
bulk data transfer over a frame-size-constrained LLN is to set the
Maximum Segment Size to fit within the link maximum frame size.
Doing so, however, can add significant header overhead to each
802.15.4 frame.  In addition, deploying such a mechanism requires
that the end-to-end transport is aware of the delivery properties of
the underlying LLN, which is a layer violation, and difficult to
achieve from the far end of the IPv6 network.

Appendix B.  Requirements

For one-hop communications, a number of Low Power and Lossy Network
(LLN) link-layers propose a local acknowledgment mechanism that is
enough to detect and recover the loss of fragments.  In a multihop
environment, an end-to-end fragment recovery mechanism might be a
good complement to a hop-by-hop MAC level recovery.  This draft
introduces a simple protocol to recover individual fragments between
6LoWPAN endpoints that may be multiple hops away.  The method
addresses the following requirements of an LLN:

Number of fragments:  The recovery mechanism must support highly
   fragmented packets, with a maximum of 32 fragments per packet.

Minimum acknowledgment overhead:  Because the radio is half duplex,
   and because of silent time spent in the various medium access
   mechanisms, an acknowledgment consumes roughly as many resources
   as a data fragment.

   The new end-to-end fragment recovery mechanism should be able to
   acknowledge multiple fragments in a single message and not require
   an acknowledgment at all if fragments are already protected at a
   lower layer.

Controlled latency:  The recovery mechanism must succeed or give up
   within the time boundary imposed by the recovery process of the
   Upper Layer Protocols.

Optional congestion control:  The aggregation of multiple concurrent
   flows may lead to the saturation of the radio network and
   congestion collapse.

   The recovery mechanism should provide means for controlling the
   number of fragments in transit over the LLN.

Appendix C.  Considerations On Flow Control

Considering that a multi-hop LLN can be a very sensitive environment
due to the limited queuing capabilities of a large population of its
nodes, this draft recommends a simple and conservative approach to
Congestion Control, based on TCP congestion avoidance.

Congestion on the forward path is assumed in case of packet loss, and
packet loss is assumed upon timeout.  This draft makes it possible to
control the number of outstanding fragments, i.e., fragments that
have been transmitted but for which an acknowledgment has not yet
been received.  It must be noted that the number of outstanding
fragments should not exceed the number of hops in the network, but
the way to figure the number of hops is out of scope for this
document.

Congestion on the forward path can also be indicated by an Explicit
Congestion Notification (ECN) mechanism.  Though whether and how ECN
[RFC3168] is carried out over the LoWPAN is out of scope, this draft
provides a way for the destination endpoint to echo an ECN indication
back to the source endpoint in an acknowledgment message as
represented in Figure 4 in Section 5.2.

It must be noted that congestion and collision are different topics.
In particular, when a mesh operates on the same channel over multiple
hops, then the forwarding of a fragment over a certain hop may
collide with the forwarding of a next fragment that is following over
a previous hop but in the same interference domain.  This draft
enables an end-to-end flow control, but leaves it to the sender stack
to pace individual fragments within a transmit window, so that a
given fragment is sent only when the previous fragment has had a
chance to progress beyond the interference domain of this hop.  In
the case of 6TiSCH [I-D.ietf-6tisch-architecture], which operates
over the Time-Slotted Channel Hopping [RFC7554] (TSCH) mode of
operation of IEEE 802.15.4, a fragment is forwarded over a different
channel at a different time and it makes full sense to transmit the
next fragment as soon as the previous fragment has had its chance to
be forwarded at the next hop.

From the standpoint of a source 6LoWPAN endpoint, an outstanding
fragment is a fragment that was sent but for which no explicit
acknowledgment was received yet.  This means that the fragment might
be on the way, received but not yet acknowledged, or the
acknowledgment might be on the way back.  It is also possible that
either the fragment or the acknowledgment was lost on the way.

From the sender standpoint, all outstanding fragments might still be
in the network and contribute to its congestion.  There is an
assumption, though, that after a certain amount of time, a frame is
either received or lost, so it is not causing congestion anymore.
This amount of time can be estimated based on the round-trip delay
between the 6LoWPAN endpoints.  The method detailed in [RFC6298] is
recommended for that computation.

The reader is encouraged to read through "Congestion Control
Principles" [RFC2914].
Additionally, [RFC7567] and [RFC5681] provide deeper information on
why this mechanism is needed and how TCP handles Congestion Control.
Basically, the goal here is to manage the number of fragments present
in the network; this is achieved by reducing the number of
outstanding fragments over a congested path by throttling the
sources.

Section 6 describes how the sender decides how many fragments are
(re)sent before an acknowledgment is required, and how the sender
adapts that number to the network conditions.

Author's Address

Pascal Thubert (editor)
Cisco Systems, Inc
Building D, 45 Allee des Ormes - BP1200
06254 MOUGINS - Sophia Antipolis
France

Phone: +33 497 23 26 34
Email: pthubert@cisco.com