idnits 2.17.1

draft-ietf-6lo-fragment-recovery-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC4944, updated by this document, for
     RFC5378 checks: 2005-07-13)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (July 22, 2019) is 1712 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-15) exists of
     draft-ietf-6lo-minimal-fragment-02

  == Outdated reference: A later version (-02) exists of
     draft-ietf-lwig-6lowpan-virtual-reassembly-01

  ** Downref: Normative reference to an Informational draft:
     draft-ietf-lwig-6lowpan-virtual-reassembly (ref.
     'I-D.ietf-lwig-6lowpan-virtual-reassembly')

  == Outdated reference: A later version (-30) exists of
     draft-ietf-6tisch-architecture-24

  == Outdated reference: A later version (-17) exists of
     draft-ietf-intarea-frag-fragile-15

     Summary: 1 error (**), 0 flaws (~~), 5 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

6lo                                                      P. Thubert, Ed.
Internet-Draft                                             Cisco Systems
Updates: 4944 (if approved)                                July 22, 2019
Intended status: Standards Track
Expires: January 23, 2020

                  6LoWPAN Selective Fragment Recovery
                  draft-ietf-6lo-fragment-recovery-05

Abstract

   This draft updates RFC 4944 with a simple protocol to recover
   individual fragments across a route-over mesh network, with a minimal
   flow control to protect the network against bloat.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 23, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
     2.1.  BCP 14
     2.2.  References
     2.3.  6LoWPAN Acronyms
     2.4.  Referenced Work
     2.5.  New Terms
   3.  Updating RFC 4944
   4.  Extending draft-ietf-6lo-minimal-fragment
     4.1.  Slack in the First Fragment
     4.2.  Gap between frames
     4.3.  Modifying the First Fragment
   5.  New Dispatch types and headers
     5.1.  Recoverable Fragment Dispatch type and Header
     5.2.  RFRAG Acknowledgment Dispatch type and Header
   6.  Fragments Recovery
     6.1.  Forwarding Fragments
       6.1.1.  Upon the first fragment
       6.1.2.  Upon the next fragments
     6.2.  Upon the RFRAG Acknowledgments
     6.3.  Aborting the Transmission of a Fragmented Packet
   7.  Management Considerations
     7.1.  Protocol Parameters
     7.2.  Observing the network
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgments
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Rationale
   Appendix B.  Requirements
   Appendix C.  Considerations On Flow Control
   Author's Address

1.  Introduction

   In most Low Power and Lossy Network (LLN) applications, the bulk of
   the traffic consists of small chunks of data (on the order of a few
   bytes to a few tens of bytes) at a time.  Given that an IEEE Std.
   802.15.4 [IEEE.802.15.4] frame can carry a payload of 74 bytes or
   more, fragmentation is usually not required.  However, and though
   this happens only occasionally, a number of mission critical
   applications do require the capability to transfer larger chunks of
   data, for instance to support the firmware upgrade of the LLN nodes
   or the extraction of logs from LLN nodes.  In the former case, the
   large chunk of data is transferred to the LLN node, whereas in the
   latter, the large chunk flows away from the LLN node.  In both cases,
   the size can be on the order of 10 kilobytes or more and an
   end-to-end reliable transport is required.

   "Transmission of IPv6 Packets over IEEE 802.15.4 Networks" [RFC4944]
   defines the original 6LoWPAN datagram fragmentation mechanism for
   LLNs.
   One critical issue with this original design is that routing an IPv6
   [RFC8200] packet across a route-over mesh requires reassembling the
   full packet at each hop, which may cause latency along a path and an
   overall buffer bloat in the network.  The "6TiSCH Architecture"
   [I-D.ietf-6tisch-architecture] recommends using a hop-by-hop fragment
   forwarding technique to alleviate those undesirable effects.  "LLN
   Minimal Fragment Forwarding" [I-D.ietf-6lo-minimal-fragment] proposes
   such a technique, in a fashion that is compatible with [RFC4944]
   without the need to define a new protocol.

   However, adding that capability alone to the local implementation of
   the original 6LoWPAN fragmentation would not address the inherent
   fragility of fragmentation (see [I-D.ietf-intarea-frag-fragile]), in
   particular the issues of resources locked on the receiver and the
   wasted transmissions due to the loss of a single fragment in a whole
   datagram.  [Kent] compares the unreliable delivery of fragments with
   a mechanism it calls "selective acknowledgements" that recovers the
   loss of a fragment individually.  The paper illustrates the benefits
   that can be derived from such a method in figures 1, 2 and 3, pages 6
   and 7.  [RFC4944] has no selective recovery, and the whole datagram
   fails when one fragment is not delivered to the destination 6LoWPAN
   endpoint.  Constrained memory resources are blocked on the receiver
   until the receiver times out, possibly causing the loss of subsequent
   packets that cannot be received for lack of buffers.
   That problem is exacerbated when forwarding fragments over multiple
   hops since a loss at an intermediate hop will not be discovered by
   either the source or the destination, and the source will keep on
   sending fragments, wasting even more resources in the network and
   possibly contributing to the condition that caused the loss, to no
   avail since the datagram cannot arrive in its entirety.  RFC 4944 is
   also missing signaling to abort a multi-fragment transmission at any
   time and from either end, and, if the capability to forward fragments
   is implemented, to clean up the related state in the network.  It
   also lacks flow control capabilities to avoid contributing to
   congestion that may in turn cause the loss of a fragment and
   potentially the retransmission of the full datagram.

   This specification provides a method to forward fragments across a
   multi-hop route-over mesh, and a selective acknowledgment to recover
   individual fragments between 6LoWPAN endpoints.

   The method is designed to limit congestion loss in the network and
   addresses the requirements that are detailed in Appendix B.

2.  Terminology

2.1.  BCP 14

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119][RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.2.  References

   In this document, readers will encounter terms and concepts that are
   discussed in "Problem Statement and Requirements for IPv6 over
   Low-Power Wireless Personal Area Network (6LoWPAN) Routing"
   [RFC6606].

2.3.  6LoWPAN Acronyms

   This document uses the following acronyms:

   6BBR:  6LoWPAN Backbone Router

   6LBR:  6LoWPAN Border Router

   6LN:  6LoWPAN Node

   6LR:  6LoWPAN Router

   LLN:  Low-Power and Lossy Network

2.4.  Referenced Work

   Past experience with fragmentation has shown that misassociated or
   lost fragments can lead to poor network behavior and, occasionally,
   trouble at the application layer.  The reader is encouraged to read
   "IPv4 Reassembly Errors at High Data Rates" [RFC4963] and follow the
   references for more information.

   That experience led to the definition of the "Path MTU discovery"
   [RFC8201] (PMTUD) protocol that limits fragmentation over the
   Internet.

   Specifically in the case of UDP, valuable additional information can
   be found in "UDP Usage Guidelines for Application Designers"
   [RFC8085].

   Readers are expected to be familiar with all the terms and concepts
   that are discussed in "IPv6 over Low-Power Wireless Personal Area
   Networks (6LoWPANs): Overview, Assumptions, Problem Statement, and
   Goals" [RFC4919] and "Transmission of IPv6 Packets over IEEE 802.15.4
   Networks" [RFC4944].

   "The Benefits of Using Explicit Congestion Notification (ECN)"
   [RFC8087] provides useful information on the potential benefits and
   pitfalls of using ECN.

   Quoting the "Multiprotocol Label Switching (MPLS) Architecture"
   [RFC3031]: with MPLS, "packets are 'labeled' before they are
   forwarded.  At subsequent hops, there is no further analysis of the
   packet's network layer header.  Rather, the label is used as an index
   into a table which specifies the next hop, and a new label".  The
   MPLS technique is leveraged in the present specification to forward
   fragments that actually do not have a network layer header, since the
   fragmentation occurs below IP.

   "LLN Minimal Fragment Forwarding" [I-D.ietf-6lo-minimal-fragment]
   introduces the concept of a Virtual Reassembly Buffer (VRB) and an
   associated technique to forward fragments as they come, using the
   datagram_tag as a label in a fashion similar to MPLS.  This
   specification reuses that technique with slightly modified controls.

2.5.  New Terms

   This specification uses the following terms:

   6LoWPAN endpoints:  The LLN nodes in charge of generating or
      expanding a 6LoWPAN header from/to a full IPv6 packet.  The
      6LoWPAN endpoints are the points where fragmentation and
      reassembly take place.

   Compressed Form:  This specification uses the generic term Compressed
      Form to refer to the format of a datagram after the action of
      [RFC6282] and possibly [RFC8138] for RPL [RFC6550] artifacts.

   datagram_size:  The size of the datagram in its Compressed Form
      before it is fragmented.  The datagram_size is expressed in a unit
      that depends on the MAC layer technology, by default a byte.

   fragment_offset:  The offset of a particular fragment of a datagram
      in its Compressed Form.  The fragment_offset is expressed in a
      unit that depends on the MAC layer technology and is by default a
      byte.

   datagram_tag:  An identifier of a datagram that is locally unique to
      the Layer-2 sender.  Associated with the MAC address of the
      sender, this becomes a globally unique identifier for the
      datagram.

   RFRAG:  Recoverable Fragment

   RFRAG-ACK:  Recoverable Fragment Acknowledgement

   RFRAG Acknowledgment Request:  An RFRAG with the Acknowledgement
      Request flag ('X' flag) set.

   All 0's:  Refers to a bitmap with all bits set to zero.

   All 1's:  Refers to a bitmap with all bits set to one.

3.  Updating RFC 4944

   This specification updates the fragmentation mechanism that is
   specified in "Transmission of IPv6 Packets over IEEE 802.15.4
   Networks" [RFC4944] for use in route-over LLNs by providing a model
   where fragments can be forwarded end-to-end across a 6LoWPAN LLN, and
   where fragments that are lost on the way can be recovered
   individually.  A new fragment format is introduced and new dispatch
   types are defined in Section 5.
   [RFC8138] allows the size of a packet to be modified en route by
   removing the consumed hops in a compressed Routing Header.  As a
   result, fragment_offset and datagram_size (see Section 2.5) must also
   be modified en route, which is difficult to do in the uncompressed
   form.  This specification expresses those fields in the Compressed
   Form and allows them to be modified en route easily (see
   Section 4.3).

   Note that, consistent with Section 2 of [RFC6282] for the
   fragmentation mechanism described in Section 5.3 of [RFC4944], any
   header that cannot fit within the first fragment MUST NOT be
   compressed when using the fragmentation mechanism described in this
   specification.

4.  Extending draft-ietf-6lo-minimal-fragment

   This specification extends the fragment forwarding mechanism
   specified in "LLN Minimal Fragment Forwarding"
   [I-D.ietf-6lo-minimal-fragment] by providing additional operations to
   improve the management of the Virtual Reassembly Buffer (VRB) in the
   context of recoverable fragments.

4.1.  Slack in the First Fragment

   At the time of this writing, [I-D.ietf-6lo-minimal-fragment] allows
   for refragmenting in intermediate nodes, meaning that some bytes from
   a given fragment may be left in the VRB to be added to the next
   fragment.  The reason for this to happen would be the need for space
   in the outgoing fragment that was not needed in the incoming
   fragment, for instance because the 6LoWPAN Header Compression is not
   as efficient on the outgoing link, e.g., if the Interface ID (IID) of
   the source IPv6 address is elided by the originator on the first hop
   because it matches the source MAC address, but cannot be elided on
   the next hops because the source MAC address changes.

   This specification cannot allow this operation since fragments are
   recovered end-to-end based on a sequence number.  This means that the
   fragments that contain a 6LoWPAN-compressed header MUST have enough
   slack to enable a less efficient compression in the next hops that
   still fits in one MAC frame.  For instance, if the IID of the source
   IPv6 address is elided by the originator, then it MUST compute the
   fragment_size as if the MTU were 8 bytes less.  This way, the next
   hop can restore the source IID to the first fragment without
   impacting the second fragment.

4.2.  Gap between frames

   This specification introduces a concept of Inter-Frame Gap, which is
   a configurable interval of time between transmissions to the same
   next hop.  In the case of half duplex interfaces, this InterFrameGap
   ensures that the next hop has finished processing the previous frame
   and is capable of receiving the next one.

   In the case of a mesh operating at a single frequency with
   omnidirectional antennas, a larger InterFrameGap is required to
   protect the frame against hidden terminal collisions with the
   previous frame of the same flow that is still progressing along a
   common path.

   The Inter-Frame Gap is useful even for unfragmented datagrams, but it
   becomes a necessity for fragments that are typically generated in a
   fast sequence and are all sent over the exact same path.

4.3.  Modifying the First Fragment

   The compression of the Hop Limit, of the source and destination
   addresses in the IPv6 Header, and of the Routing Header, may change
   en route in a Route-Over mesh LLN.  If the size of the first fragment
   is modified, then the intermediate node MUST adapt the datagram_size
   to reflect that difference.

   The intermediate node MUST also save the difference of datagram_size
   of the first fragment in the VRB and add it to the datagram_size and
   to the fragment_offset of all the subsequent fragments for that
   datagram.

5.  New Dispatch types and headers

   This specification enables the 6LoWPAN fragmentation sublayer to
   provide an MTU up to 2048 bytes to the upper layer, which can be the
   6LoWPAN Header Compression sublayer that is defined in the
   "Compression Format for IPv6 Datagrams" [RFC6282] specification.  In
   order to achieve this, this specification enables the fragmentation
   and the reliable transmission of fragments over a multihop 6LoWPAN
   mesh network.

   This specification provides a technique that is derived from MPLS to
   forward individual fragments across a 6LoWPAN route-over mesh without
   reassembly at each hop.  The datagram_tag is used as a label; it is
   locally unique to the node that owns the source MAC address of the
   fragment, so together the MAC address and the label can identify the
   fragment globally.  A node may build the datagram_tag in its own
   locally-significant way, as long as the chosen datagram_tag stays
   unique to the particular datagram for the lifetime of that datagram.
   As a result, the label does not need to be globally unique, but it
   must be swapped at each hop as the source MAC address changes.

   This specification extends RFC 4944 [RFC4944] with two new Dispatch
   types: one for the Recoverable Fragment (RFRAG) and one for the RFRAG
   Acknowledgment that is sent back.  The new 6LoWPAN Dispatch types are
   taken from Page 0 [RFC8025] as indicated in Table 1 in Section 9.

   In the following sections, a "datagram_tag" extends the semantics
   defined in Section 5.3 of [RFC4944], "Fragmentation Type and Header".
   The datagram_tag is a locally unique identifier for the datagram from
   the perspective of the sender.  This means that the datagram_tag
   identifies a datagram uniquely in the network when associated with
   the source of the datagram.  As the datagram gets forwarded, the
   source changes and the datagram_tag must be swapped as detailed in
   [I-D.ietf-6lo-minimal-fragment].

5.1.  Recoverable Fragment Dispatch type and Header

   In this specification, if the packet is compressed, then the size and
   offset of the fragments are expressed relative to the Compressed Form
   of the packet, as opposed to the uncompressed (native) packet form.

   The format of the fragment header is shown in Figure 1.  It is the
   same for all fragments.  The format has a length and an offset, as
   well as a sequence field.  This would be redundant if the offset was
   computed as the product of the sequence by the length, but this is
   not the case.  The position of a fragment in the reassembly buffer is
   neither correlated with the value of the sequence field nor with the
   order in which the fragments are received.  This enables
   out-of-sequence subfragmenting, e.g., a fragment seq. 5 that is
   retried end-to-end as smaller fragments seq. 5, 13 and 14 due to a
   change of MTU along the path between the 6LoWPAN endpoints.

                           1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                      |1 1 1 0 1 0 0|E|  datagram_tag |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |X| sequence|   fragment_size   |       fragment_offset         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                                X set == Ack-Request

               Figure 1: RFRAG Dispatch type and Header

   There is no requirement on the receiver to check for contiguity of
   the received fragments, and the sender MUST ensure that when all
   fragments are acknowledged, then the datagram is fully received.
   This may be useful in particular in the case where the MTU changes
   and a fragment sequence is retried with a smaller fragment_size, the
   remainder of the original fragment being retried with new sequence
   values.
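
   As an informal illustration only, the layout drawn in Figure 1 can be
   packed and parsed as sketched below.  The function names pack_rfrag
   and unpack_rfrag are hypothetical and not part of this specification.
   The sketch takes the datagram_tag to be one octet wide, as drawn in
   Figure 1; that width is an assumption of the example, as are the
   range checks.

```python
import struct

RFRAG_DISPATCH = 0b1110100  # 7-bit dispatch pattern drawn in Figure 1


def pack_rfrag(ecn, datagram_tag, ack_request, sequence,
               fragment_size, fragment_offset):
    """Build the 6-octet RFRAG header as laid out in Figure 1 (sketch)."""
    # Assumption: datagram_tag occupies one octet, as drawn in Figure 1.
    for name, value, limit in (("datagram_tag", datagram_tag, 0xFF),
                               ("sequence", sequence, 0x1F),
                               ("fragment_size", fragment_size, 0x3FF),
                               ("fragment_offset", fragment_offset, 0xFFFF)):
        if not 0 <= value <= limit:
            raise ValueError(f"{name} out of range: {value}")
    # First octet: 7 dispatch bits followed by the E (ECN) flag.
    first = (RFRAG_DISPATCH << 1) | int(bool(ecn))
    # 32-bit word: X (1 bit), sequence (5), fragment_size (10), offset (16).
    word = ((int(bool(ack_request)) << 31) | (sequence << 26)
            | (fragment_size << 16) | fragment_offset)
    return struct.pack("!BBI", first, datagram_tag, word)


def unpack_rfrag(data):
    """Parse the first 6 octets of a frame into a dict of header fields."""
    first, datagram_tag, word = struct.unpack("!BBI", data[:6])
    if first >> 1 != RFRAG_DISPATCH:
        raise ValueError("not an RFRAG header")
    return {
        "ecn": bool(first & 1),
        "datagram_tag": datagram_tag,
        "ack_request": bool(word >> 31),
        "sequence": (word >> 26) & 0x1F,
        "fragment_size": (word >> 16) & 0x3FF,
        "fragment_offset": word & 0xFFFF,
    }
```

   For instance, pack_rfrag(False, 0x2A, True, 5, 80, 400) yields the
   six octets e8 2a 94 50 01 90, and unpack_rfrag returns the same field
   values from those octets.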
   The first fragment is recognized by a sequence of 0; it carries its
   fragment_size and the datagram_size of the compressed packet before
   it is fragmented, whereas the other fragments carry their
   fragment_size and fragment_offset.  The last fragment for a datagram
   is recognized when its fragment_offset and its fragment_size add up
   to the datagram_size.

   Recoverable Fragments are sequenced and a bitmap is used in the RFRAG
   Acknowledgment to indicate the received fragments by setting the
   individual bits that correspond to their sequence.

   X:  1 bit; Ack-Request: when set, the sender requires an RFRAG
      Acknowledgment from the receiver.

   E:  1 bit; Explicit Congestion Notification; the "E" flag is reset by
      the source of the fragment and set by intermediate routers to
      signal that this fragment experienced congestion along its path.

   Fragment_size:  10 bit unsigned integer; the size of this fragment in
      a unit that depends on the MAC layer technology.  Unless
      overridden by a more specific specification, that unit is the
      octet, which allows fragments up to 1023 bytes.

   datagram_tag:  16 bits; an identifier of the datagram that is locally
      unique to the sender.

   Sequence:  5 bit unsigned integer; the sequence number of the
      fragment in the acknowledgement bitmap.  Fragments are numbered
      [0..N] where N is in [0..31].  A Sequence of 0 indicates the first
      fragment in a datagram, but non-zero values are not indicative of
      the position in the reassembly buffer.

   Fragment_offset:  16 bit unsigned integer;

      *  When the Fragment_offset is set to a non-0 value, its semantics
         depend on the value of the Sequence field.

         +  For a first fragment (i.e. with a Sequence of 0), this field
            indicates the datagram_size of the compressed datagram, to
            help the receiver allocate an adequately sized buffer for
            the reception and reassembly operations.  The fragment may
            be stored for local reassembly.
            Alternatively, it may be routed based on the destination
            IPv6 address.  In that case, a VRB state must be installed
            as described in Section 6.1.1.

         +  When the Sequence is not 0, this field indicates the offset
            of the fragment in the Compressed Form of the datagram.  The
            fragment may be added to a local reassembly buffer or
            forwarded based on an existing VRB as described in
            Section 6.1.2.

      *  A Fragment_offset that is set to a value of 0 indicates an
         abort condition and all state regarding the datagram should be
         cleaned up once the processing of the fragment is complete; the
         processing of the fragment depends on whether there is a VRB
         already established for this datagram, and the next hop is
         still reachable:

         +  if a VRB already exists and is not broken, the fragment is
            to be forwarded along the associated Label Switched Path
            (LSP) as described in Section 6.1.2, but regardless of the
            value of the Sequence field;

         +  else, if the Sequence is 0, then the fragment is to be
            routed as described in Section 6.1.1 but no state is kept
            afterwards.  In that case, the session, if it exists, is
            aborted and the packet is also forwarded in an attempt to
            clean up the next hops along the path indicated by the IPv6
            header (possibly including a routing header).

         If the fragment cannot be forwarded or routed, then an abort
         RFRAG-ACK is sent back to the source as described in
         Section 6.1.2.

5.2.  RFRAG Acknowledgment Dispatch type and Header

   This specification also defines a 4-octet RFRAG Acknowledgment bitmap
   that is used by the reassembling end point to confirm selectively the
   reception of individual fragments.
   A given offset in the bitmap maps one to one with a given sequence
   number and indicates which fragment is acknowledged as follows:

                           1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |           RFRAG Acknowledgment Bitmap                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       ^                 ^
       |                 |    bitmap indicating whether:
       |                 +----- Fragment with sequence 9 was received
       +----------------------- Fragment with sequence 0 was received

             Figure 2: RFRAG Acknowledgment bitmap encoding

   Figure 3 shows an example Acknowledgment bitmap which indicates that
   all fragments from sequence 0 to 20 were received, except for
   fragments 1, 2 and 16 that were lost and must be retried.

                           1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1|0|0|1|1|1|1|1|1|1|1|1|1|1|1|1|0|1|1|1|1|0|0|0|0|0|0|0|0|0|0|0|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 3: Example RFRAG Acknowledgment Bitmap

   The RFRAG Acknowledgment Bitmap is included in an RFRAG
   Acknowledgment header, as follows:

                           1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                      |1 1 1 0 1 0 1|E|  datagram_tag |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |          RFRAG Acknowledgment Bitmap (32 bits)                |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 4: RFRAG Acknowledgment Dispatch type and Header

   E:  1 bit; Explicit Congestion Notification Echo

      When set, the sender indicates that at least one of the
      acknowledged fragments was received with an Explicit Congestion
      Notification, indicating that the path followed by the fragments
      is subject to congestion.  More in Appendix C.
   RFRAG Acknowledgment Bitmap

      An RFRAG Acknowledgment Bitmap, whereby setting the bit at offset
      x indicates that fragment x was received, as shown in Figure 2.
      All 0's is a NULL bitmap that indicates that the fragmentation
      process is aborted.  All 1's is a FULL bitmap that indicates that
      the fragmentation process is complete: all fragments were received
      at the reassembly end point.

6.  Fragments Recovery

   The Recoverable Fragment header RFRAG is used to transport a fragment
   and optionally request an RFRAG Acknowledgment that will confirm the
   successful reception of one or more fragments.  An RFRAG
   Acknowledgment is carried as a standalone fragment header (i.e. with
   no 6LoWPAN payload) in a message that is propagated back to the
   6LoWPAN endpoint that was the originator of the fragments.  To
   achieve this, each hop that performed an MPLS-like operation on
   fragments reverses that operation for the RFRAG-ACK by sending a
   frame from the next hop to the previous hop as known by its MAC
   address in the VRB.  The datagram_tag in the RFRAG-ACK is unique to
   the receiver and is enough information for an intermediate hop to
   locate the VRB that contains the datagram_tag used by the previous
   hop and the Layer-2 information associated with it (interface and MAC
   address).

   The 6LoWPAN endpoint that fragments the packets at 6LoWPAN level (the
   sender) also controls the number of acknowledgments by setting the
   Ack-Request flag in the RFRAG packets.  The sender may set the
   Ack-Request flag on any fragment to perform congestion control by
   limiting the number of outstanding fragments, which are the fragments
   that have been sent but for which reception or loss was not
   positively confirmed by the reassembling endpoint.  The maximum
   number of outstanding fragments is the Window-Size.  It is
   configurable and may vary in case of ECN notification.
   When the 6LoWPAN endpoint that reassembles the packets at 6LoWPAN
   level (the receiver) receives a fragment with the Ack-Request flag
   set, it MUST send an RFRAG Acknowledgment back to the originator to
   confirm reception of all the fragments it has received so far.

   The Ack-Request ('X') flag set in an RFRAG marks the end of a window.
   This flag SHOULD be set on the last fragment to protect the datagram,
   and it MAY be set in any intermediate fragment for the purpose of
   flow control.  This ARQ process MUST be protected by a timer, and the
   fragment that carries the 'X' flag MAY be retried upon timeout a
   configurable number of times (see Section 7.1).  Upon exhaustion of
   the retries, the sender may either abort the transmission of the
   datagram or retry the datagram from the first fragment with an 'X'
   flag set in order to reestablish a path and discover which fragments
   were received over the old path in the acknowledgment bitmap.  When
   the sender of the fragment knows that an underlying link-layer
   mechanism protects the fragments, it may refrain from using the RFRAG
   Acknowledgment mechanism, and never set the Ack-Request bit.

   The RFRAG Acknowledgment can optionally carry an ECN indication for
   flow control (see Appendix C).  The receiver of a fragment with the
   'E' (ECN) flag set MUST echo that information by setting the 'E'
   (ECN) flag in the next RFRAG Acknowledgment.

   The sender transfers a controlled number of fragments and MAY flag
   the last fragment of a window with an RFRAG Acknowledgment Request.
   The receiver MUST acknowledge a fragment with the acknowledgment
   request bit set.  If any fragment immediately preceding an
   acknowledgment request is still missing, the receiver MAY
   intentionally delay its acknowledgment to allow in-transit fragments
   to arrive.
   Because it might defeat the round trip delay computation, delaying
   the acknowledgment should be configurable and not enabled by
   default.

   The receiver MAY issue unsolicited acknowledgments.  An unsolicited
   acknowledgment signals to the sender endpoint that it can resume
   sending if it had reached its maximum number of outstanding
   fragments.  Another use is to inform the sender that the reassembling
   endpoint aborted the processing of an individual datagram.

   Note that acknowledgments might consume precious resources so the use
   of unsolicited acknowledgments should be configurable and not enabled
   by default.

   An observation is that streamlining forwarding of fragments generally
   reduces the latency over the LLN mesh, providing room for retries
   within existing upper-layer reliability mechanisms.  The sender
   protects the transmission over the LLN mesh with a retry timer that
   is computed according to the method detailed in [RFC6298].  It is
   expected that the upper layer retries obey the recommendations in
   "UDP Usage Guidelines" [RFC8085], in which case a single round of
   fragment recovery should fit within the upper layer recovery timers.

   Fragments are sent in a round robin fashion: the sender sends all the
   fragments for a first time before it retries any lost fragment; lost
   fragments are retried in sequence, oldest first.  This mechanism
   enables the receiver to acknowledge fragments that were delayed in
   the network before they are retried.

   When a single frequency is used by contiguous hops, the sender should
   wait a reasonable amount of time between fragments so as to let a
   fragment progress a few hops and avoid hidden terminal issues.  This
   precaution is not required on channel hopping technologies such as
   Time Slotted Channel Hopping (TSCH) [RFC7554].

6.1.  Forwarding Fragments

   It is assumed that the first Fragment is large enough to carry the
   IPv6 header and make routing decisions.  If that is not so, then this
   specification MUST NOT be used.

   This specification extends the Virtual Reassembly Buffer (VRB)
   technique to forward fragments with no intermediate reconstruction of
   the entire packet.  It inherits operations like datagram_tag
   Switching and using a timer to clean the VRB when the traffic dries
   up.  In more detail, the first fragment carries the IP header and it
   is routed all the way from the fragmenting end point to the
   reassembling end point.  Upon the first fragment, the routers along
   the path install a label-switched path (LSP), and the following
   fragments are label-switched along that path.  As a consequence, the
   next fragments can only follow the path that was set up by the first
   fragment and cannot follow an alternate route.  The datagram_tag is
   used to carry the label, which is swapped at each hop.  All fragments
   follow the same path and fragments are delivered in the order in
   which they are sent.

6.1.1.  Upon the first fragment

   In Route-Over mode, the source and destination MAC addresses in a
   frame change at each hop.  The label that is formed and placed in the
   datagram_tag is associated to the source MAC and only valid (and
   unique) for that source MAC.  Upon a first fragment (i.e. with a
   sequence of zero), a VRB and the associated LSP state are created for
   the tuple (source MAC address, datagram_tag) and the fragment is
   forwarded along the IPv6 route that matches the destination IPv6
   address in the IPv6 header as prescribed by
   [I-D.ietf-6lo-minimal-fragment].  The LSP state makes it possible to
   match the (previous MAC address, datagram_tag) in an incoming
   fragment to the tuple (next MAC address, swapped datagram_tag) used
   in the forwarded fragment and points at the VRB.
   In addition, the router also forms a Reverse LSP state indexed by the
   MAC address of the next hop and the swapped datagram_tag.  This
   reverse LSP state also points at the VRB and makes it possible to
   match the (next MAC address, swapped_datagram_tag) found in an RFRAG
   Acknowledgment to the tuple (previous MAC address, datagram_tag) used
   when forwarding a Fragment Acknowledgment (RFRAG-ACK) back to the
   sender endpoint.

6.1.2.  Upon the next fragments

   Upon a next fragment (i.e. with a non-zero sequence), the router
   looks up an LSP indexed by the tuple (MAC address, datagram_tag)
   found in the fragment.  If it is found, the router forwards the
   fragment using the associated VRB as prescribed by
   [I-D.ietf-6lo-minimal-fragment].

   If the VRB for the tuple is not found, the router builds an RFRAG-ACK
   to abort the transmission of the packet.  The resulting message has
   the following information:

   o  The source and destination MAC addresses are swapped from those
      found in the fragment

   o  The datagram_tag set to the datagram_tag found in the fragment

   o  A NULL bitmap is used to signal the abort condition

   At this point the router is all set and can send the RFRAG-ACK back
   to the previous router.  The RFRAG-ACK should normally be forwarded
   all the way to the source using the reverse LSP state in the VRBs in
   the intermediate routers as described in the next section.

6.2.  Upon the RFRAG Acknowledgments

   Upon an RFRAG-ACK, the router looks up a Reverse LSP indexed by the
   tuple (MAC address, datagram_tag), which are respectively the source
   MAC address of the received frame and the received datagram_tag.  If
   it is found, the router forwards the fragment using the associated
   VRB as prescribed by [I-D.ietf-6lo-minimal-fragment], but using the
   Reverse LSP so that the RFRAG-ACK flows back to the sender endpoint.
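
   The forward and reverse LSP lookups described in Sections 6.1.1, 6.1.2
   and 6.2 can be pictured as two dictionaries that point at the same VRB.
   The following is a minimal, non-normative sketch.  The names
   FragmentForwarder, VRB, on_first_fragment, on_next_fragment and
   on_rfrag_ack are hypothetical, the tag allocator is a plain counter,
   and the tag range (tag_space) is an assumption of this sketch rather
   than a value taken from the header format.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Key = Tuple[str, int]  # (MAC address, datagram_tag)


@dataclass
class VRB:
    """State shared by the forward and reverse LSP entries (hypothetical layout)."""
    prev_mac: str
    prev_tag: int
    next_mac: str
    next_tag: int


class FragmentForwarder:
    """Illustrative router-side state; not a normative implementation."""

    def __init__(self, tag_space: int = 256) -> None:
        # tag_space is an assumption of this sketch, not a value from the draft.
        self.tag_space = tag_space
        self.lsp: Dict[Key, VRB] = {}          # indexed by (previous MAC, datagram_tag)
        self.reverse_lsp: Dict[Key, VRB] = {}  # indexed by (next MAC, swapped tag)
        self._next_tag = 0

    def _allocate_tag(self) -> int:
        tag = self._next_tag
        self._next_tag = (self._next_tag + 1) % self.tag_space
        return tag

    def on_first_fragment(self, src_mac: str, tag: int, next_hop_mac: str) -> Key:
        """Create the VRB plus both LSP entries; return (next MAC, swapped tag)."""
        vrb = VRB(src_mac, tag, next_hop_mac, self._allocate_tag())
        self.lsp[(src_mac, tag)] = vrb
        self.reverse_lsp[(vrb.next_mac, vrb.next_tag)] = vrb
        return (vrb.next_mac, vrb.next_tag)

    def on_next_fragment(self, src_mac: str, tag: int) -> Optional[Key]:
        """Label-switch a later fragment; None means the caller should send
        an RFRAG-ACK with a NULL bitmap back to src_mac."""
        vrb = self.lsp.get((src_mac, tag))
        return None if vrb is None else (vrb.next_mac, vrb.next_tag)

    def on_rfrag_ack(self, src_mac: str, tag: int) -> Optional[Key]:
        """Map an RFRAG-ACK to (previous MAC, original tag); None means the
        RFRAG-ACK is silently dropped."""
        vrb = self.reverse_lsp.get((src_mac, tag))
        return None if vrb is None else (vrb.prev_mac, vrb.prev_tag)
```

   Keeping both indexes pointed at one VRB object means that a single
   cleanup, on timer expiry or abort, can remove the forward and the
   reverse state together.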

   If the Reverse LSP is not found, the router MUST silently drop the
   RFRAG-ACK message.

   Either way, if the RFRAG-ACK indicates that the datagram was entirely
   received (FULL bitmap), the router arms a short timer, and upon
   timeout, the VRB and all the associated state are destroyed.  Until
   the timer elapses, fragments of that datagram may still be received,
   e.g. if the RFRAG-ACK was lost on the way back and the source retried
   the last fragment.  In that case, the router forwards the fragment
   according to the state in the VRB.

   This specification does not provide a method to discover the number
   of hops or the minimal value of MTU along those hops.  But should the
   minimal MTU decrease, it is possible to retry a long fragment (say
   sequence of 5) with first a shorter fragment of the same sequence (5
   again) and then one or more other fragments with a sequence that was
   not used before (e.g., 13 and 14).  Note that Path MTU Discovery is
   out of scope for this document.

6.3.  Aborting the Transmission of a Fragmented Packet

   A reset is signaled on the forward path with a pseudo fragment that
   has the fragment_offset, sequence and fragment_size all set to 0, and
   no data.

   When the sender or a router on the way decides that a packet should
   be dropped and the fragmentation process aborted, it generates a
   reset pseudo fragment and forwards it down the fragment path.

   Each router along the path forwards the pseudo fragment based on the
   VRB state.  If an acknowledgment is not requested, the VRB and all
   associated state are destroyed.

   Upon reception of the pseudo fragment, the receiver cleans up all
   resources for the packet associated to the datagram_tag.  If an
   acknowledgment is requested, the receiver responds with a NULL
   bitmap.
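
   The NULL and FULL bitmaps used for the abort and completion cases can
   be told apart from an ordinary partial acknowledgment with a few bit
   operations.  The sketch below is illustrative only.  It assumes a
   32-bit bitmap in which offset 0 is the most significant bit; the
   function names are hypothetical and the bit ordering should be checked
   against Figure 2.

```python
BITMAP_BITS = 32          # assumption: 32-bit RFRAG Acknowledgment Bitmap
NULL_BITMAP = 0x00000000  # all 0's: fragmentation process aborted
FULL_BITMAP = 0xFFFFFFFF  # all 1's: all fragments were received


def set_received(bitmap: int, sequence: int) -> int:
    """Return the bitmap with the bit at offset `sequence` set.

    Assumes offset 0 maps to the most significant bit.
    """
    if not 0 <= sequence < BITMAP_BITS:
        raise ValueError("sequence out of range")
    return bitmap | (1 << (BITMAP_BITS - 1 - sequence))


def is_received(bitmap: int, sequence: int) -> bool:
    """True if the bit at offset `sequence` is set."""
    return bool(bitmap & (1 << (BITMAP_BITS - 1 - sequence)))


def missing(bitmap: int, sent_sequences: list) -> list:
    """Sequences that were sent but are not acknowledged in the bitmap."""
    return [s for s in sent_sequences if not is_received(bitmap, s)]


def classify(bitmap: int) -> str:
    """Sender-side interpretation of a received bitmap."""
    if bitmap == NULL_BITMAP:
        return "abort"       # sender MUST abort the datagram
    if bitmap == FULL_BITMAP:
        return "complete"    # all fragments reached the receiver
    return "partial"         # retry the sequences reported by missing()
```

   A sender that receives a "partial" bitmap would retry only the
   sequences returned by missing(), oldest first.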

   The other way around, the receiver might need to abort the process of
   a fragmented packet for internal reasons, for instance if it is out
   of reassembly buffers, or considers that this packet is already fully
   reassembled and passed to the upper layer.  In that case, the
   receiver SHOULD indicate so to the sender with a NULL bitmap in an
   RFRAG Acknowledgment.  Upon an acknowledgment with a NULL bitmap, the
   sender endpoint MUST abort the transmission of the fragmented
   datagram.

7.  Management Considerations

7.1.  Protocol Parameters

   There is no particular configuration on the receiver, as echoing ECN
   is always on.  The configuration only applies to the sender, which is
   in control of the transmission.  The management system SHOULD be
   capable of providing the parameters below:

   MinFragmentSize:  The MinFragmentSize is the minimum value for the
      Fragment_Size.

   OptFragmentSize:  The OptFragmentSize is the value for the
      Fragment_Size that the sender should use to start with.

   MaxFragmentSize:  The MaxFragmentSize is the maximum value for the
      Fragment_Size.  It MUST be lower than the minimum MTU along the
      path.  A large value augments the chances of buffer bloat and
      transmission loss.  The value MUST be less than 512 if the unit
      that is defined for the PHY layer is the octet.

   UseECN:  Indicates whether the sender should react to ECN.  When the
      sender reacts to ECN the Window_Size will vary between
      MinWindowSize and MaxWindowSize.

   MinWindowSize:  The minimum value of Window_Size that the sender can
      use.

   OptWindowSize:  The OptWindowSize is the value for the Window_Size
      that the sender should use to start with.

   MaxWindowSize:  The maximum value of Window_Size that the sender can
      use.  The value MUST be less than 32.

   InterFrameGap:  Indicates a minimum amount of time between
      transmissions.
      All packets to a same destination, and in particular fragments,
      may be subject to receive-while-transmitting and hidden-terminal
      collisions with the next or the previous transmission as the
      fragments progress along a same path.  The InterFrameGap protects
      the propagation of one transmission before the next one is
      triggered and creates a duty cycle that controls the ratio of air
      time and memory in intermediate nodes that a particular datagram
      will use.

   MinARQTimeOut:  The minimum amount of time a node should wait for an
      RFRAG Acknowledgment before it takes a next action.

   OptARQTimeOut:  The initial value of the amount of time that a sender
      should wait for an RFRAG Acknowledgment before it takes a next
      action.

   MaxARQTimeOut:  The maximum amount of time a node should wait for an
      RFRAG Acknowledgment before it takes a next action.

   MaxFragRetries:  The maximum number of retries for a particular
      Fragment.

   MaxDatagramRetries:  The maximum number of retries from scratch for a
      particular Datagram.

7.2.  Observing the network

   The management system should monitor the number of retries and of ECN
   settings that can be observed from the perspective of both the sender
   and the receiver, and may tune the optimum size of Fragment_Size and
   of the Window_Size, OptFragmentSize and OptWindowSize respectively,
   at the sender.  The values should be bounded by the expected number
   of hops and reduced beyond that when the number of datagrams that can
   traverse an intermediate point may exceed its capacity and cause a
   congestion loss.  The InterFrameGap is another tool that can be used
   to increase the spacing between fragments of a same datagram and
   reduce the ratio of time when a particular intermediate node holds a
   fragment of that datagram.

8.  Security Considerations

   The considerations in the Security section of [I-D.ietf-core-cocoa]
   apply equally to this specification.

   The process of recovering fragments does not appear to create any
   opening for new threats compared to "Transmission of IPv6 Packets
   over IEEE 802.15.4 Networks" [RFC4944].

   The technique of Virtual Reassembly Buffers inherited from
   [I-D.ietf-6lo-minimal-fragment] may be used to perform a Denial-of-
   Service (DoS) attack against the intermediate Routers since the
   routers need to maintain a state per flow.  Note that as opposed to
   the VRB described in [I-D.ietf-lwig-6lowpan-virtual-reassembly] the
   data that is transported in each fragment is conserved and the state
   to keep does not include any data that would not fit in the previous
   fragment.

9.  IANA Considerations

   This document allocates 4 values in Page 0 for recoverable fragments
   from the "Dispatch Type Field" registry that was created by
   "Transmission of IPv6 Packets over IEEE 802.15.4 Networks" [RFC4944]
   and reformatted by "6LoWPAN Paging Dispatch" [RFC8025].

   The suggested values (to be confirmed by IANA) are indicated in
   Table 1.

   +-------------+------+----------------------------------+-----------+
   | Bit Pattern | Page | Header Type                      | Reference |
   +-------------+------+----------------------------------+-----------+
   | 11 10100x   | 0    | RFRAG - Recoverable Fragment     | RFC THIS  |
   | 11 10101x   | 0    | RFRAG-ACK - RFRAG Acknowledgment | RFC THIS  |
   +-------------+------+----------------------------------+-----------+

             Table 1: Additional Dispatch Value Bit Patterns

10.  Acknowledgments

   The author wishes to thank Michel Veillette, Dario Tedeschi, Laurent
   Toutain, Carles Gomez Montenegro, Thomas Watteyne and Michael
   Richardson for in-depth reviews and comments.
   Also many thanks to Jonathan Hui, Jay Werb, Christos Polyzois,
   Soumitri Kolavennu, Pat Kinney, Margaret Wasserman, Richard Kelsey,
   Carsten Bormann and Harry Courtice for their various contributions.

11.  References

11.1.  Normative References

   [I-D.ietf-6lo-minimal-fragment]
              Watteyne, T., Bormann, C., and P. Thubert, "LLN Minimal
              Fragment Forwarding", draft-ietf-6lo-minimal-fragment-02
              (work in progress), June 2019.

   [I-D.ietf-lwig-6lowpan-virtual-reassembly]
              Bormann, C. and T. Watteyne, "Virtual reassembly buffers
              in 6LoWPAN", draft-ietf-lwig-6lowpan-virtual-reassembly-01
              (work in progress), March 2019.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4944]  Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler,
              "Transmission of IPv6 Packets over IEEE 802.15.4
              Networks", RFC 4944, DOI 10.17487/RFC4944, September 2007.

   [RFC6282]  Hui, J., Ed. and P. Thubert, "Compression Format for IPv6
              Datagrams over IEEE 802.15.4-Based Networks", RFC 6282,
              DOI 10.17487/RFC6282, September 2011.

   [RFC6554]  Hui, J., Vasseur, JP., Culler, D., and V. Manral, "An IPv6
              Routing Header for Source Routes with the Routing Protocol
              for Low-Power and Lossy Networks (RPL)", RFC 6554,
              DOI 10.17487/RFC6554, March 2012.

   [RFC8025]  Thubert, P., Ed. and R. Cragie, "IPv6 over Low-Power
              Wireless Personal Area Network (6LoWPAN) Paging Dispatch",
              RFC 8025, DOI 10.17487/RFC8025, November 2016.

   [RFC8138]  Thubert, P., Ed., Bormann, C., Toutain, L., and R. Cragie,
              "IPv6 over Low-Power Wireless Personal Area Network
              (6LoWPAN) Routing Header", RFC 8138, DOI 10.17487/RFC8138,
              April 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

11.2.  Informative References

   [I-D.ietf-6tisch-architecture]
              Thubert, P., "An Architecture for IPv6 over the TSCH mode
              of IEEE 802.15.4", draft-ietf-6tisch-architecture-24 (work
              in progress), July 2019.

   [I-D.ietf-core-cocoa]
              Bormann, C., Betzler, A., Gomez, C., and I. Demirkol,
              "CoAP Simple Congestion Control/Advanced", draft-ietf-
              core-cocoa-03 (work in progress), February 2018.

   [I-D.ietf-intarea-frag-fragile]
              Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O.,
              and F. Gont, "IP Fragmentation Considered Fragile", draft-
              ietf-intarea-frag-fragile-15 (work in progress), July
              2019.

   [IEEE.802.15.4]
              IEEE, "IEEE Standard for Low-Rate Wireless Networks",
              IEEE Standard 802.15.4, DOI 10.1109/IEEE
              P802.15.4-REVd/D01.

   [Kent]     Kent, C. and J. Mogul, "Fragmentation Considered
              Harmful", In Proc. SIGCOMM '87 Workshop on Frontiers in
              Computer Communications Technology,
              DOI 10.1145/55483.55524, August 1987.

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, DOI 10.17487/RFC2914, September 2000.

   [RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
              Label Switching Architecture", RFC 3031,
              DOI 10.17487/RFC3031, January 2001.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

   [RFC4919]  Kushalnagar, N., Montenegro, G., and C. Schumacher, "IPv6
              over Low-Power Wireless Personal Area Networks (6LoWPANs):
              Overview, Assumptions, Problem Statement, and Goals",
              RFC 4919, DOI 10.17487/RFC4919, August 2007.

   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
              Errors at High Data Rates", RFC 4963,
              DOI 10.17487/RFC4963, July 2007.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298,
              DOI 10.17487/RFC6298, June 2011.

   [RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
              Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
              JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
              Low-Power and Lossy Networks", RFC 6550,
              DOI 10.17487/RFC6550, March 2012.

   [RFC6606]  Kim, E., Kaspar, D., Gomez, C., and C. Bormann, "Problem
              Statement and Requirements for IPv6 over Low-Power
              Wireless Personal Area Network (6LoWPAN) Routing",
              RFC 6606, DOI 10.17487/RFC6606, May 2012.

   [RFC7554]  Watteyne, T., Ed., Palattella, M., and L. Grieco, "Using
              IEEE 802.15.4e Time-Slotted Channel Hopping (TSCH) in the
              Internet of Things (IoT): Problem Statement", RFC 7554,
              DOI 10.17487/RFC7554, May 2015.

   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
              Recommendations Regarding Active Queue Management",
              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015.

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017.

   [RFC8087]  Fairhurst, G. and M. Welzl, "The Benefits of Using
              Explicit Congestion Notification (ECN)", RFC 8087,
              DOI 10.17487/RFC8087, March 2017.

   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", STD 86, RFC 8200,
              DOI 10.17487/RFC8200, July 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

Appendix A.  Rationale

   There are a number of uses for large packets in Wireless Sensor
   Networks.
   Such usages may not be the most typical or represent the largest
   amount of traffic over the LLN; however, the associated
   functionality can be critical enough to justify extra care for
   ensuring effective transport of large packets across the LLN.

   The list of those usages includes:

   Towards the LLN node:

      Firmware update:  For example, a new version of the LLN node
         software is downloaded from a system manager over unicast or
         multicast services.  Such a reflashing operation typically
         involves updating a large number of similar LLN nodes over a
         relatively short period of time.

      Packages of Commands:  A number of commands or a full
         configuration can be packaged as a single message to ensure
         consistency and enable atomic execution or complete roll back.
         Until such commands are fully received and interpreted, the
         intended operation will not take effect.

   From the LLN node:

      Waveform captures:  A number of consecutive samples are measured
         at a high rate for a short time and then transferred from a
         sensor to a gateway or an edge server as a single large report.

      Data logs:  LLN nodes may generate large logs of sampled data for
         later extraction.  LLN nodes may also generate system logs to
         assist in diagnosing problems on the node or network.

      Large data packets:  Rich data types might require more than one
         fragment.

   Uncontrolled firmware download or waveform upload can easily result
   in a massive increase of the traffic and saturate the network.

   When a fragment is lost in transmission, the lack of recovery in the
   original fragmentation system of RFC 4944 implies that all fragments
   would need to be resent, further contributing to the congestion that
   caused the initial loss, and potentially leading to congestion
   collapse.
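
   A rough feel for the cost of resending every fragment can be obtained
   with a back-of-the-envelope model.  The sketch below is illustrative
   only and rests on simplifying assumptions that this document does not
   make: each fragment is lost independently with probability p,
   acknowledgments are never lost, and without recovery the sender
   resends the whole datagram until all n fragments arrive within the
   same attempt.  The function names are hypothetical.

```python
def expected_tx_without_recovery(n: int, p: float) -> float:
    """Expected fragment transmissions when any loss forces a full resend.

    One attempt sends n fragments and succeeds with probability (1 - p)**n,
    so the expected number of attempts is 1 / (1 - p)**n.
    """
    return n / (1.0 - p) ** n


def expected_tx_with_recovery(n: int, p: float) -> float:
    """Expected fragment transmissions when only lost fragments are retried.

    Each fragment independently needs 1 / (1 - p) transmissions on average.
    """
    return n / (1.0 - p)


def overhead_ratio(n: int, p: float) -> float:
    """How many times more transmissions the full-resend scheme needs."""
    return expected_tx_without_recovery(n, p) / expected_tx_with_recovery(n, p)
```

   Under these assumptions the ratio between the two schemes is
   1 / (1 - p)**(n - 1), so the penalty of resending everything grows
   geometrically with the number of fragments.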

   This saturation may lead to excessive radio interference, or random
   early discard (leaky bucket) in relaying nodes.  Additional queuing
   and memory congestion may result while waiting for a low power next
   hop to emerge from its sleeping state.

   Considering that RFC 4944 defines an MTU of 1280 bytes and that in
   most incarnations (except 802.15.4g) an IEEE Std. 802.15.4 frame can
   limit the MAC payload to as few as 74 bytes, a packet might be
   fragmented into at least 18 fragments at the 6LoWPAN shim layer.
   Taking into account the worst-case header overhead for 6LoWPAN
   Fragmentation and Mesh Addressing headers will increase the number of
   required fragments to around 32.  This level of fragmentation is much
   higher than that traditionally experienced over the Internet with
   IPv4 fragments.  At the same time, the use of radios increases the
   probability of transmission loss and Mesh-Under techniques compound
   that risk over multiple hops.

   Mechanisms such as TCP or application-layer segmentation could be
   used to support end-to-end reliable transport.  One option to support
   bulk data transfer over a frame-size-constrained LLN is to set the
   Maximum Segment Size to fit within the link maximum frame size.
   Doing so, however, can add significant header overhead to each
   802.15.4 frame.  In addition, deploying such a mechanism requires
   that the end-to-end transport is aware of the delivery properties of
   the underlying LLN, which is a layer violation, and difficult to
   achieve from the far end of the IPv6 network.

Appendix B.  Requirements

   For one-hop communications, a number of Low Power and Lossy Network
   (LLN) link-layers propose a local acknowledgment mechanism that is
   enough to detect and recover the loss of fragments.
   In a multihop environment, an end-to-end fragment recovery mechanism
   might be a good complement to a hop-by-hop MAC level recovery.  This
   draft introduces a simple protocol to recover individual fragments
   between 6LoWPAN endpoints that may be multiple hops away.  The method
   addresses the following requirements of an LLN:

   Number of fragments

      The recovery mechanism must support highly fragmented packets,
      with a maximum of 32 fragments per packet.

   Minimum acknowledgment overhead

      Because the radio is half duplex, and because of silent time spent
      in the various medium access mechanisms, an acknowledgment
      consumes roughly as many resources as a data fragment.

      The new end-to-end fragment recovery mechanism should be able to
      acknowledge multiple fragments in a single message and not require
      an acknowledgment at all if fragments are already protected at a
      lower layer.

   Controlled latency

      The recovery mechanism must succeed or give up within the time
      boundary imposed by the recovery process of the Upper Layer
      Protocols.

   Optional congestion control

      The aggregation of multiple concurrent flows may lead to the
      saturation of the radio network and congestion collapse.

      The recovery mechanism should provide means for controlling the
      number of fragments in transit over the LLN.

Appendix C.  Considerations On Flow Control

   Considering that a multi-hop LLN can be a very sensitive environment
   due to the limited queuing capabilities of a large population of its
   nodes, this draft recommends a simple and conservative approach to
   Congestion Control, based on TCP congestion avoidance.

   Congestion on the forward path is assumed in case of packet loss, and
   packet loss is assumed upon time out.
   The draft allows controlling the number of outstanding fragments,
   that is, fragments that have been transmitted but for which an
   acknowledgment has not been received yet.  It must be noted that the
   number of outstanding fragments should not exceed the number of hops
   in the network, but the way to figure the number of hops is out of
   scope for this document.

   Congestion on the forward path can also be indicated by an Explicit
   Congestion Notification (ECN) mechanism.  Though whether and how ECN
   [RFC3168] is carried out over the LoWPAN is out of scope, this draft
   provides a way for the destination endpoint to echo an ECN indication
   back to the source endpoint in an acknowledgment message as
   represented in Figure 4 in Section 5.2.

   It must be noted that congestion and collision are different topics.
   In particular, when a mesh operates on a same channel over multiple
   hops, then the forwarding of a fragment over a certain hop may
   collide with the forwarding of a next fragment that is following over
   a previous hop but in a same interference domain.  This draft enables
   an end-to-end flow control, but leaves it to the sender stack to pace
   individual fragments within a transmit window, so that a given
   fragment is sent only when the previous fragment has had a chance to
   progress beyond the interference domain of this hop.  In the case of
   6TiSCH [I-D.ietf-6tisch-architecture], which operates over the
   TimeSlotted Channel Hopping [RFC7554] (TSCH) mode of operation of
   IEEE 802.15.4, a fragment is forwarded over a different channel at a
   different time and it makes full sense to transmit the next fragment
   as soon as the previous fragment has had its chance to be forwarded
   at the next hop.

   From the standpoint of a source 6LoWPAN endpoint, an outstanding
   fragment is a fragment that was sent but for which no explicit
   acknowledgment was received yet.
   This means that the fragment might be on the way, received but not
   yet acknowledged, or the acknowledgment might be on the way back.  It
   is also possible that either the fragment or the acknowledgment was
   lost on the way.

   From the sender standpoint, all outstanding fragments might still be
   in the network and contribute to its congestion.  There is an
   assumption, though, that after a certain amount of time, a frame is
   either received or lost, so it is not causing congestion anymore.
   This amount of time can be estimated based on the round trip delay
   between the 6LoWPAN endpoints.  The method detailed in [RFC6298] is
   recommended for that computation.

   The reader is encouraged to read through "Congestion Control
   Principles" [RFC2914].  Additionally [RFC7567] and [RFC5681] provide
   deeper information on why this mechanism is needed and how TCP
   handles Congestion Control.  Basically, the goal here is to manage
   the number of fragments present in the network; this is achieved by
   reducing the number of outstanding fragments over a congested path
   by throttling the sources.

   Section 6 describes how the sender decides how many fragments are
   (re)sent before an acknowledgment is required, and how the sender
   adapts that number to the network conditions.

Author's Address

   Pascal Thubert (editor)
   Cisco Systems, Inc
   Building D
   45 Allee des Ormes - BP1200
   MOUGINS - Sophia Antipolis  06254
   FRANCE

   Phone: +33 497 23 26 34
   Email: pthubert@cisco.com