idnits 2.17.1

draft-ietf-6lo-fragment-recovery-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC4944, updated by this document, for
     RFC5378 checks: 2005-07-13)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (21 October 2019) is 1641 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-15) exists of
     draft-ietf-6lo-minimal-fragment-04

  == Outdated reference: A later version (-30) exists of
     draft-ietf-6tisch-architecture-27

  == Outdated reference: A later version (-02) exists of
     draft-ietf-lwig-6lowpan-virtual-reassembly-01


     Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

6lo                                                      P. Thubert, Ed.
Internet-Draft                                             Cisco Systems
Updates: 4944 (if approved)                              21 October 2019
Intended status: Standards Track
Expires: 23 April 2020


                  6LoWPAN Selective Fragment Recovery
                  draft-ietf-6lo-fragment-recovery-06

Abstract

   This draft updates RFC 4944 with a simple protocol to recover
   individual fragments across a route-over mesh network, with a minimal
   flow control to protect the network against bloat.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 23 April 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
     2.1.  BCP 14
     2.2.  References
     2.3.  6LoWPAN Acronyms
     2.4.  Referenced Work
     2.5.  New Terms
   3.  Updating RFC 4944
   4.  Extending draft-ietf-6lo-minimal-fragment
     4.1.  Slack in the First Fragment
     4.2.  Gap between frames
     4.3.  Modifying the First Fragment
   5.  New Dispatch types and headers
     5.1.  Recoverable Fragment Dispatch type and Header
     5.2.  RFRAG Acknowledgment Dispatch type and Header
   6.  Fragments Recovery
     6.1.  Forwarding Fragments
       6.1.1.  Upon the first fragment
       6.1.2.  Upon the next fragments
     6.2.  Upon the RFRAG Acknowledgments
     6.3.  Aborting the Transmission of a Fragmented Packet
   7.  Management Considerations
     7.1.  Protocol Parameters
     7.2.  Observing the network
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgments
   11. Normative References
   12. Informative References
   Appendix A.  Rationale
   Appendix B.  Requirements
   Appendix C.  Considerations On Flow Control
   Author's Address

1.  Introduction

   In most Low Power and Lossy Network (LLN) applications, the bulk of
   the traffic consists of small chunks of data (on the order of a few
   bytes to a few tens of bytes) at a time. Given that an IEEE Std.
   802.15.4 [IEEE.802.15.4] frame can carry a payload of 74 bytes or
   more, fragmentation is usually not required. However, and though
   this happens only occasionally, a number of mission-critical
   applications do require the capability to transfer larger chunks of
   data, for instance to support the firmware upgrade of the LLN nodes
   or the extraction of logs from LLN nodes. In the former case, the
   large chunk of data is transferred to the LLN node, whereas in the
   latter, the large chunk flows away from the LLN node. In both cases,
   the size can be on the order of 10 kilobytes or more and an
   end-to-end reliable transport is required.

   "Transmission of IPv6 Packets over IEEE 802.15.4 Networks" [RFC4944]
   defines the original 6LoWPAN datagram fragmentation mechanism for
   LLNs. One critical issue with this original design is that routing
   an IPv6 [RFC8200] packet across a route-over mesh requires
   reassembling the full packet at each hop, which may cause latency
   along a path and an overall buffer bloat in the network. The "6TiSCH
   Architecture" [I-D.ietf-6tisch-architecture] recommends using a
   hop-by-hop fragment forwarding technique to alleviate those
   undesirable effects. "LLN Minimal Fragment Forwarding"
   [I-D.ietf-6lo-minimal-fragment] proposes such a technique, in a
   fashion that is compatible with [RFC4944] without the need to define
   a new protocol.

   However, adding that capability alone to the local implementation of
   the original 6LoWPAN fragmentation would not address the inherent
   fragility of fragmentation (see [I-D.ietf-intarea-frag-fragile]), in
   particular the issues of resources locked on the receiver and the
   wasted transmissions due to the loss of a single fragment in a whole
   datagram. [Kent] compares the unreliable delivery of fragments with
   a mechanism it calls "selective acknowledgements" that recovers the
   loss of a fragment individually. The paper illustrates the benefits
   that can be derived from such a method in figures 1, 2 and 3, pages 6
   and 7. [RFC4944] has no selective recovery, and the whole datagram
   fails when one fragment is not delivered to the destination 6LoWPAN
   endpoint. Constrained memory resources are blocked on the receiver
   until the receiver times out, possibly causing the loss of subsequent
   packets that cannot be received for lack of buffers.

   That problem is exacerbated when forwarding fragments over multiple
   hops, since a loss at an intermediate hop will not be discovered by
   either the source or the destination. The source will keep on
   sending fragments to no avail, since the datagram cannot arrive in
   its entirety, wasting even more resources in the network and possibly
   contributing to the condition that caused the loss. RFC 4944 is
   also missing signaling to abort a multi-fragment transmission at any
   time and from either end and, if the capability to forward fragments
   is implemented, to clean up the related state in the network. It
   also lacks flow control capabilities to avoid contributing to
   congestion that may in turn cause the loss of a fragment and
   potentially the retransmission of the full datagram.

   This specification provides a method to forward fragments across a
   multi-hop route-over mesh, and a selective acknowledgment to recover
   individual fragments between 6LoWPAN endpoints.

   The method is designed to limit congestion loss in the network and
   addresses the requirements that are detailed in Appendix B.

2.  Terminology

2.1.  BCP 14

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119][RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.2.  References

   In this document, readers will encounter terms and concepts that are
   discussed in "Problem Statement and Requirements for IPv6 over
   Low-Power Wireless Personal Area Network (6LoWPAN) Routing"
   [RFC6606].

2.3.  6LoWPAN Acronyms

   This document uses the following acronyms:

   6BBR:  6LoWPAN Backbone Router
   6LBR:  6LoWPAN Border Router
   6LN:   6LoWPAN Node
   6LR:   6LoWPAN Router
   LLN:   Low-Power and Lossy Network

2.4.  Referenced Work

   Past experience with fragmentation has shown that misassociated or
   lost fragments can lead to poor network behavior and, occasionally,
   trouble at the application layer. The reader is encouraged to read
   "IPv4 Reassembly Errors at High Data Rates" [RFC4963] and follow the
   references for more information.

   That experience led to the definition of the "Path MTU discovery"
   [RFC8201] (PMTUD) protocol that limits fragmentation over the
   Internet.

   Specifically in the case of UDP, valuable additional information can
   be found in "UDP Usage Guidelines for Application Designers"
   [RFC8085].

   Readers are expected to be familiar with all the terms and concepts
   that are discussed in "IPv6 over Low-Power Wireless Personal Area
   Networks (6LoWPANs): Overview, Assumptions, Problem Statement, and
   Goals" [RFC4919] and "Transmission of IPv6 Packets over IEEE 802.15.4
   Networks" [RFC4944].

   "The Benefits of Using Explicit Congestion Notification (ECN)"
   [RFC8087] provides useful information on the potential benefits and
   pitfalls of using ECN.

   Quoting the "Multiprotocol Label Switching (MPLS) Architecture"
   [RFC3031]: with MPLS, 'packets are "labeled" before they are
   forwarded. At subsequent hops, there is no further analysis of the
   packet's network layer header. Rather, the label is used as an index
   into a table which specifies the next hop, and a new label'. The
   MPLS technique is leveraged in the present specification to forward
   fragments that actually do not have a network layer header, since the
   fragmentation occurs below IP.

   "LLN Minimal Fragment Forwarding" [I-D.ietf-6lo-minimal-fragment]
   introduces the concept of a Virtual Reassembly Buffer (VRB) and an
   associated technique to forward fragments as they come, using the
   datagram_tag as a label in a fashion similar to MPLS. This
   specification reuses that technique with slightly modified controls.

2.5.  New Terms

   This specification uses the following terms:

   6LoWPAN endpoints:  The LLN nodes in charge of generating or
      expanding a 6LoWPAN header from/to a full IPv6 packet. The
      6LoWPAN endpoints are the points where fragmentation and
      reassembly take place.

   Compressed Form:  This specification uses the generic term Compressed
      Form to refer to the format of a datagram after the action of
      [RFC6282] and possibly [RFC8138] for RPL [RFC6550] artifacts.

   datagram_size:  The size of the datagram in its Compressed Form
      before it is fragmented. The datagram_size is expressed in a unit
      that depends on the MAC layer technology, by default a byte.

   fragment_offset:  The offset of a particular fragment of a datagram
      in its Compressed Form. The fragment_offset is expressed in a
      unit that depends on the MAC layer technology and is by default a
      byte.

   datagram_tag:  An identifier of a datagram that is locally unique to
      the Layer-2 sender. Associated with the MAC address of the
      sender, this becomes a globally unique identifier for the
      datagram.

   RFRAG:  Recoverable Fragment

   RFRAG-ACK:  Recoverable Fragment Acknowledgement

   RFRAG Acknowledgment Request:  An RFRAG with the Acknowledgement
      Request flag ('X' flag) set.

   All 0's:  Refers to a bitmap with all bits set to zero.

   All 1's:  Refers to a bitmap with all bits set to one.

3.  Updating RFC 4944

   This specification updates the fragmentation mechanism that is
   specified in "Transmission of IPv6 Packets over IEEE 802.15.4
   Networks" [RFC4944] for use in route-over LLNs by providing a model
   where fragments can be forwarded end-to-end across a 6LoWPAN LLN, and
   where fragments that are lost on the way can be recovered
   individually. A new fragment format is introduced and new
   dispatch types are defined in Section 5.

   [RFC8138] allows modifying the size of a packet en-route by removing
   the consumed hops in a compressed Routing Header. As a result,
   fragment_offset and datagram_size (see Section 2.5) must also be
   modified en-route, which is difficult to do in the uncompressed form.
   This specification expresses those fields in the Compressed Form and
   allows them to be modified en-route easily (see Section 4.3).

   Note that, consistent with Section 2 of [RFC6282] for the
   fragmentation mechanism described in Section 5.3 of [RFC4944], any
   header that cannot fit within the first fragment MUST NOT be
   compressed when using the fragmentation mechanism described in this
   specification.

4.  Extending draft-ietf-6lo-minimal-fragment

   This specification extends the fragment forwarding mechanism
   specified in "LLN Minimal Fragment Forwarding"
   [I-D.ietf-6lo-minimal-fragment] by providing additional operations to
   improve the management of the Virtual Reassembly Buffer (VRB) in the
   context of recoverable fragments.

4.1.  Slack in the First Fragment

   At the time of this writing, [I-D.ietf-6lo-minimal-fragment] allows
   for refragmenting in intermediate nodes, meaning that some bytes from
   a given fragment may be left in the VRB to be added to the next
   fragment. This happens when the outgoing fragment needs space that
   the incoming fragment did not, for instance because the 6LoWPAN
   Header Compression is not as efficient on the outgoing link. For
   example, the Interface ID (IID) of the source IPv6 address may be
   elided by the originator on the first hop because it matches the
   source MAC address, but it cannot be elided on the next hops because
   the source MAC address changes.

   This specification cannot allow this operation since fragments are
   recovered end-to-end based on a sequence number. This means that the
   fragments that contain a 6LoWPAN-compressed header MUST have enough
   slack to enable a less efficient compression in the next hops that
   still fits in one MAC frame. For instance, if the IID of the source
   IPv6 address is elided by the originator, then it MUST compute the
   fragment_size as if the MTU were 8 bytes smaller. This way, the next
   hop can restore the source IID to the first fragment without
   impacting the second fragment.

4.2.  Gap between frames

   This specification introduces a concept of Inter-Frame Gap, which is
   a configurable interval of time between transmissions to the same
   next hop. In the case of half duplex interfaces, this Inter-Frame
   Gap ensures that the next hop has finished forwarding the previous
   frame and is capable of receiving the next one.

   In the case of a mesh operating at a single frequency with
   omnidirectional antennas, a larger Inter-Frame Gap is required to
   protect the frame against hidden terminal collisions with the
   previous frame of the same flow that is still progressing along a
   common path.

   The Inter-Frame Gap is useful even for unfragmented datagrams, but it
   becomes a necessity for fragments that are typically generated in a
   fast sequence and are all sent over the exact same path.

4.3.  Modifying the First Fragment

   The compression of the Hop Limit, of the source and destination
   addresses in the IPv6 Header, and of the Routing Header, may change
   en-route in a Route-Over mesh LLN. If the size of the first fragment
   is modified, then the intermediate node MUST adapt the datagram_size
   to reflect that difference.

   The intermediate node MUST also save the difference of datagram_size
   of the first fragment in the VRB and add it to the datagram_size and
   to the fragment_offset of all the subsequent fragments for that
   datagram.

5.  New Dispatch types and headers

   This specification enables the 6LoWPAN fragmentation sublayer to
   provide an MTU up to 2048 bytes to the upper layer, which can be the
   6LoWPAN Header Compression sublayer that is defined in the
   "Compression Format for IPv6 Datagrams" [RFC6282] specification. To
   achieve this, this specification enables the fragmentation
   and the reliable transmission of fragments over a multihop 6LoWPAN
   mesh network.

   This specification provides a technique that is derived from MPLS to
   forward individual fragments across a 6LoWPAN route-over mesh without
   reassembly at each hop. The datagram_tag is used as a label; it is
   locally unique to the node that owns the source MAC address of the
   fragment, so together the MAC address and the label can identify the
   fragment globally. A node may build the datagram_tag in its own
   locally-significant way, as long as the chosen datagram_tag stays
   unique to the particular datagram for the lifetime of that datagram.
   As a result, the label does not need to be globally unique, but it
   must be swapped at each hop as the source MAC address changes.

   This specification extends RFC 4944 [RFC4944] with two new Dispatch
   types, one for the Recoverable Fragment (RFRAG) and one for the RFRAG
   Acknowledgment sent back. The new 6LoWPAN Dispatch types are taken
   from Page 0 [RFC8025] as indicated in Table 1 in Section 9.

   In the following sections, a "datagram_tag" extends the semantics
   defined in Section 5.3, "Fragmentation Type and Header", of
   [RFC4944]. The datagram_tag is a locally unique identifier for the
   datagram from the perspective of the sender. This means that the
   datagram_tag identifies a datagram uniquely in the network when
   associated with the source of the datagram. As the datagram gets
   forwarded, the source changes and the datagram_tag must be swapped as
   detailed in [I-D.ietf-6lo-minimal-fragment].

5.1.  Recoverable Fragment Dispatch type and Header

   In this specification, if the packet is compressed then the size and
   offset of the fragments are expressed on the Compressed Form of the
   packet as opposed to the uncompressed (native) packet form.

   The format of the fragment header is shown in Figure 1. It is the
   same for all fragments. The format has a length and an offset, as
   well as a sequence field. This would be redundant if the offset were
   computed as the product of the sequence by the length, but this is
   not the case. The position of a fragment in the reassembly buffer is
   neither correlated with the value of the sequence field nor with the
   order in which the fragments are received. This enables
   out-of-sequence subfragmenting, e.g., a fragment seq. 5 that is
   retried end-to-end as smaller fragments seq. 5, 13 and 14 due to a
   change of MTU along the path between the 6LoWPAN endpoints.

                          1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                     |1 1 1 0 1 0 0|E|  datagram_tag |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |X| sequence|   fragment_size   |       fragment_offset         |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                               X set == Ack-Request

                Figure 1: RFRAG Dispatch type and Header

   There is no requirement on the receiver to check for contiguity of
   the received fragments, and the sender MUST ensure that when all
   fragments are acknowledged, then the datagram is fully received.
   This may be useful in particular in the case where the MTU changes
   and a fragment sequence is retried with a smaller fragment_size, the
   remainder of the original fragment being retried with new sequence
   values.

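   As an illustration of the layout in Figure 1, the following Python
   sketch packs and unpacks the 6-octet RFRAG header. It is not part of
   the specification: the function names are hypothetical, the unit is
   assumed to be the octet, and the datagram_tag is taken to be 8 bits
   wide, as the figure is drawn.

```python
import struct

RFRAG_DISPATCH = 0b1110100  # 7-bit dispatch pattern drawn in Figure 1


def pack_rfrag(tag, seq, size, offset, ack_request=False, ecn=False):
    """Build the 6-octet RFRAG header (hypothetical helper, 8-bit tag assumed)."""
    if not (0 <= tag <= 0xFF and 0 <= seq <= 0x1F
            and 0 <= size <= 0x3FF and 0 <= offset <= 0xFFFF):
        raise ValueError("field out of range")
    # First octet: 7-bit dispatch followed by the E (ECN) flag.
    first = (RFRAG_DISPATCH << 1) | int(ecn)
    # 32-bit word: X (1) | sequence (5) | fragment_size (10) | fragment_offset (16)
    word = (int(ack_request) << 31) | (seq << 26) | (size << 16) | offset
    return struct.pack("!BBI", first, tag, word)


def unpack_rfrag(header):
    """Decode the first 6 octets of a frame as an RFRAG header."""
    first, tag, word = struct.unpack("!BBI", header[:6])
    if first >> 1 != RFRAG_DISPATCH:
        raise ValueError("not an RFRAG header")
    return {
        "ecn": bool(first & 1),
        "datagram_tag": tag,
        "ack_request": bool(word >> 31),
        "sequence": (word >> 26) & 0x1F,
        "fragment_size": (word >> 16) & 0x3FF,
        "fragment_offset": word & 0xFFFF,
    }
```

   For a first fragment (sequence 0), the caller would place the
   datagram_size in the offset argument, since that field is overloaded
   in this header.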
   The first fragment is recognized by a sequence of 0; it carries its
   fragment_size and the datagram_size of the compressed packet before
   it is fragmented, whereas the other fragments carry their
   fragment_size and fragment_offset. The last fragment for a datagram
   is recognized when its fragment_offset and its fragment_size add up
   to the datagram_size.

   Recoverable Fragments are sequenced and a bitmap is used in the RFRAG
   Acknowledgment to indicate the received fragments by setting the
   individual bits that correspond to their sequence.

   X:  1 bit; Ack-Request: when set, the sender requires an RFRAG
      Acknowledgment from the receiver.

   E:  1 bit; Explicit Congestion Notification; the "E" flag is reset by
      the source of the fragment and set by intermediate routers to
      signal that this fragment experienced congestion along its path.

   Fragment_size:  10-bit unsigned integer; the size of this fragment in
      a unit that depends on the MAC layer technology. Unless
      overridden by a more specific specification, that unit is the
      octet, which allows fragments up to 1023 bytes.

   datagram_tag:  8 bits, as shown in Figure 1; an identifier of the
      datagram that is locally unique to the sender.

   Sequence:  5-bit unsigned integer; the sequence number of the
      fragment in the acknowledgement bitmap. Fragments are numbered
      [0..N] where N is in [0..31]. A Sequence of 0 indicates the first
      fragment in a datagram, but non-zero values are not indicative of
      the position in the reassembly buffer.

   Fragment_offset:  16-bit unsigned integer.

      When the Fragment_offset is set to a non-0 value, its semantics
      depend on the value of the Sequence field as follows:

      *  For a first fragment (i.e., with a Sequence of 0), this field
         indicates the datagram_size of the compressed datagram, to help
         the receiver allocate an adapted buffer for the reception and
         reassembly operations. The fragment may be stored for local
         reassembly. Alternatively, it may be routed based on the
         destination IPv6 address. In that case, a VRB state must be
         installed as described in Section 6.1.1.
      *  When the Sequence is not 0, this field indicates the offset of
         the fragment in the Compressed Form of the datagram. The
         fragment may be added to a local reassembly buffer or forwarded
         based on an existing VRB as described in Section 6.1.2.

      A Fragment_offset that is set to a value of 0 indicates an abort
      condition and all state regarding the datagram should be cleaned
      up once the processing of the fragment is complete; the processing
      of the fragment depends on whether there is a VRB already
      established for this datagram, and the next hop is still
      reachable:

      *  if a VRB already exists and is not broken, the fragment is to
         be forwarded along the associated Label Switched Path (LSP) as
         described in Section 6.1.2, but regardless of the value of the
         Sequence field;
      *  else, if the Sequence is 0, then the fragment is to be routed
         as described in Section 6.1.1 but no state is conserved
         afterwards. In that case, the session, if it exists, is
         aborted and the packet is also forwarded in an attempt to clean
         up the next hops along the path indicated by the IPv6 header
         (possibly including a routing header).

      If the fragment cannot be forwarded or routed, then an abort
      RFRAG-ACK is sent back to the source as described in
      Section 6.1.2.

5.2.  RFRAG Acknowledgment Dispatch type and Header

   This specification also defines a 4-octet RFRAG Acknowledgment bitmap
   that is used by the reassembling endpoint to confirm selectively the
   reception of individual fragments. A given offset in the bitmap maps
   one to one with a given sequence number and indicates which fragment
   is acknowledged as follows:

                          1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |           RFRAG Acknowledgment Bitmap                         |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      ^                 ^
      |                 |    bitmap indicating whether:
      |                 +----- Fragment with sequence 9 was received
      +----------------------- Fragment with sequence 0 was received

             Figure 2: RFRAG Acknowledgment bitmap encoding

   Figure 3 shows an example Acknowledgment bitmap which indicates that
   all fragments from sequence 0 to 20 were received, except for
   fragments 1, 2 and 16 that were lost and must be retried.

                          1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |1|0|0|1|1|1|1|1|1|1|1|1|1|1|1|1|0|1|1|1|1|0|0|0|0|0|0|0|0|0|0|0|
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 3: Example RFRAG Acknowledgment Bitmap

   The RFRAG Acknowledgment Bitmap is included in an RFRAG
   Acknowledgment header, as follows:

                          1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                     |1 1 1 0 1 0 1|E|  datagram_tag |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |          RFRAG Acknowledgment Bitmap (32 bits)                |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 4: RFRAG Acknowledgment Dispatch type and Header

   E:  1 bit; Explicit Congestion Notification Echo

      When set, the sender indicates that at least one of the
      acknowledged fragments was received with an Explicit Congestion
      Notification, indicating that the path followed by the fragments
      is subject to congestion. See Appendix C for more.

533 RFRAG Acknowledgment Bitmap: An RFRAG Acknowledgment Bitmap, whereby 534 setting the bit at offset x indicates that fragment x was 535 received, as shown in Figure 2. All 0's is a NULL bitmap that 536 indicates that the fragmentation process is aborted. All 1's is a 537 FULL bitmap that indicates that the fragmentation process is 538 complete, all fragments were received at the reassembly end point. 540 6. Fragments Recovery 542 The Recoverable Fragment header RFRAG is used to transport a fragment 543 and optionally request an RFRAG Acknowledgment that will confirm the 544 good reception of one or more fragments. An RFRAG Acknowledgment is 545 carried as a standalone fragment header (i.e. with no 6LoWPAN 546 payload) in a message that is propagated back to the 6LoWPAN endpoint 547 that was the originator of the fragments. To achieve this, each hop 548 that performed an MPLS-like operation on fragments reverses that 549 operation for the RFRAG_ACK by sending a frame from the next hop to 550 the previous hop as known by its MAC address in the VRB. The 551 datagram_tag in the RFRAG_ACK is unique to the receiver and is enough 552 information for an intermediate hop to locate the VRB that contains 553 the datagram_tag used by the previous hop and the Layer-2 information 554 associated to it (interface and MAC address). 556 The 6LoWPAN endpoint that fragments the packets at 6LoWPAN level (the 557 sender) also controls the amount of acknowledgments by setting the 558 Ack-Request flag in the RFRAG packets. The sender may set the Ack- 559 Request flag on any fragment to perform congestion control by 560 limiting the number of outstanding fragments, which are the fragments 561 that have been sent but for which reception or loss was not 562 positively confirmed by the reassembling endpoint. The maximum 563 number of outstanding fragments is the Window-Size. It is 564 configurable and may vary in case of ECN notification. 
When the 565 6LoWPAN endpoint that reassembles the packets at 6LoWPAN level (the 566 receiver) receives a fragment with the Ack-Request flag set, it MUST 567 send an RFRAG Acknowledgment back to the originator to confirm 568 reception of all the fragments it has received so far. 570 The Ack-Request ('X') set in an RFRAG marks the end of a window. 571 This flag SHOULD be set on the last fragment to protect the datagram, 572 and it MAY be set in any intermediate fragment for the purpose of 573 flow control. This ARQ process MUST be protected by a timer, and the 574 fragment that carries the 'X' flag MAY be retried upon time out a 575 configurable amount of times (see Section 7.1). Upon exhaustion of 576 the retries the sender may either abort the transmission of the 577 datagram or retry the datagram from the first fragment with an 'X' 578 flag set in order to reestablish a path and discover which fragments 579 were received over the old path in the acknowledgment bitmap. When 580 the sender of the fragment knows that an underlying link-layer 581 mechanism protects the fragments, it may refrain from using the RFRAG 582 Acknowledgment mechanism, and never set the Ack-Request bit. 584 The RFRAG Acknowledgment can optionally carry an ECN indication for 585 flow control (see Appendix C). The receiver of a fragment with the 586 'E' (ECN) flag set MUST echo that information by setting the 'E' 587 (ECN) flag in the next RFRAG Acknowledgment. 589 The sender transfers a controlled number of fragments and MAY flag 590 the last fragment of a window with an RFRAG Acknowledgment Request. 591 The receiver MUST acknowledge a fragment with the acknowledgment 592 request bit set. If any fragment immediately preceding an 593 acknowledgment request is still missing, the receiver MAY 594 intentionally delay its acknowledgment to allow in-transit fragments 595 to arrive. 
   Because it might defeat the round-trip delay computation, delaying
   the acknowledgment should be configurable and not enabled by
   default.

   The receiver MAY issue unsolicited acknowledgments.  An unsolicited
   acknowledgment signals to the sender endpoint that it can resume
   sending if it had reached its maximum number of outstanding
   fragments.  Another use is to inform the sender that the reassembling
   endpoint aborted the processing of an individual datagram.

   Note that acknowledgments might consume precious resources, so the
   use of unsolicited acknowledgments should be configurable and not
   enabled by default.

   An observation is that streamlining forwarding of fragments generally
   reduces the latency over the LLN mesh, providing room for retries
   within existing upper-layer reliability mechanisms.  The sender
   protects the transmission over the LLN mesh with a retry timer that
   is computed according to the method detailed in [RFC6298].  It is
   expected that the upper-layer retries obey the recommendations in
   "UDP Usage Guidelines" [RFC8085], in which case a single round of
   fragment recovery should fit within the upper-layer recovery timers.

   Fragments are sent in a round-robin fashion: the sender sends all the
   fragments for a first time before it retries any lost fragment; lost
   fragments are retried in sequence, oldest first.  This mechanism
   enables the receiver to acknowledge fragments that were delayed in
   the network before they are retried.

   When a single frequency is used by contiguous hops, the sender should
   wait a reasonable amount of time between fragments so as to let a
   fragment progress a few hops and avoid hidden terminal issues.  This
   precaution is not required on channel hopping technologies such as
   Time-Slotted Channel Hopping (TSCH) [RFC7554].

6.1.  Forwarding Fragments

   It is assumed that the first fragment is large enough to carry the
   IPv6 header and make routing decisions.  If that is not so, then this
   specification MUST NOT be used.

   This specification extends the Virtual Reassembly Buffer (VRB)
   technique to forward fragments with no intermediate reconstruction of
   the entire packet.  It inherits operations like datagram_tag
   switching and using a timer to clean the VRB when the traffic dries
   up.  In more detail, the first fragment carries the IP header and it
   is routed all the way from the fragmenting endpoint to the
   reassembling endpoint.  Upon the first fragment, the routers along
   the path install a label-switched path (LSP), and the following
   fragments are label-switched along that path.  As a consequence, the
   next fragments can only follow the path that was set up by the first
   fragment and cannot follow an alternate route.  The datagram_tag is
   used to carry the label, which is swapped at each hop.  All fragments
   follow the same path and fragments are delivered in the order in
   which they are sent.

6.1.1.  Upon the first fragment

   In Route-Over mode, the source and destination MAC addresses in a
   frame change at each hop.  The label that is formed and placed in the
   datagram_tag is associated with the source MAC and only valid (and
   unique) for that source MAC.  Upon a first fragment (i.e., with a
   sequence of zero), a VRB and the associated LSP state are created for
   the tuple (source MAC address, datagram_tag) and the fragment is
   forwarded along the IPv6 route that matches the destination IPv6
   address in the IPv6 header as prescribed by
   [I-D.ietf-6lo-minimal-fragment].  The LSP state makes it possible to
   match the (previous MAC address, datagram_tag) in an incoming
   fragment to the tuple (next MAC address, swapped datagram_tag) used
   in the forwarded fragment and points at the VRB.
   In addition, the router also forms a Reverse LSP state indexed by the
   MAC address of the next hop and the swapped datagram_tag.  This
   Reverse LSP state also points at the VRB and makes it possible to
   match the (next MAC address, swapped datagram_tag) found in an RFRAG
   Acknowledgment to the tuple (previous MAC address, datagram_tag) used
   when forwarding a Fragment Acknowledgment (RFRAG-ACK) back to the
   sender endpoint.

6.1.2.  Upon the next fragments

   Upon a next fragment (i.e., with a non-zero sequence), the router
   looks up an LSP indexed by the tuple (MAC address, datagram_tag)
   found in the fragment.  If it is found, the router forwards the
   fragment using the associated VRB as prescribed by
   [I-D.ietf-6lo-minimal-fragment].

   If the VRB for the tuple is not found, the router builds an RFRAG-ACK
   to abort the transmission of the packet.  The resulting message has
   the following information:

   *  The source and destination MAC addresses are swapped from those
      found in the fragment

   *  The datagram_tag is set to the datagram_tag found in the fragment

   *  A NULL bitmap is used to signal the abort condition

   At this point the router is all set and can send the RFRAG-ACK back
   to the previous router.  The RFRAG-ACK should normally be forwarded
   all the way to the source using the Reverse LSP state in the VRBs in
   the intermediate routers as described in the next section.

6.2.  Upon the RFRAG Acknowledgments

   Upon an RFRAG-ACK, the router looks up a Reverse LSP indexed by the
   tuple (MAC address, datagram_tag), which are respectively the source
   MAC address of the received frame and the received datagram_tag.  If
   it is found, the router forwards the RFRAG-ACK using the associated
   VRB as prescribed by [I-D.ietf-6lo-minimal-fragment], but using the
   Reverse LSP so that the RFRAG-ACK flows back to the sender endpoint.
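
   As an illustration only, the forward and reverse lookups of
   Sections 6.1.1, 6.1.2 and 6.2 can be sketched in Python as below.
   The names Vrb, Router and the three handler methods are hypothetical
   and not defined by this specification; the next-hop MAC address is
   assumed to come from an IPv6 route lookup that is not shown, and the
   tag allocator is a plain counter that ignores the width of the
   datagram_tag field.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical key type: (neighbor MAC address, datagram_tag).
Key = Tuple[str, int]


@dataclass
class Vrb:
    """State shared by the LSP and the Reverse LSP for one datagram."""
    prev_mac: str
    prev_tag: int
    next_mac: str
    next_tag: int


class Router:
    def __init__(self) -> None:
        self.lsp: Dict[Key, Vrb] = {}          # indexed by (previous MAC, tag)
        self.reverse_lsp: Dict[Key, Vrb] = {}  # indexed by (next MAC, swapped tag)
        self._last_tag = 0

    def _allocate_tag(self) -> int:
        # Assumption: a simple counter; field width and reuse are not modeled.
        self._last_tag += 1
        return self._last_tag

    def on_first_fragment(self, src_mac: str, tag: int, next_mac: str) -> Key:
        """Create the VRB plus both LSP states; return (next MAC, swapped tag)."""
        vrb = Vrb(src_mac, tag, next_mac, self._allocate_tag())
        self.lsp[(src_mac, tag)] = vrb
        self.reverse_lsp[(next_mac, vrb.next_tag)] = vrb
        return (vrb.next_mac, vrb.next_tag)

    def on_next_fragment(self, src_mac: str, tag: int) -> Optional[Key]:
        """Label-switch a later fragment. None means no VRB was found, in
        which case the caller would answer src_mac with a NULL-bitmap
        RFRAG-ACK."""
        vrb = self.lsp.get((src_mac, tag))
        return None if vrb is None else (vrb.next_mac, vrb.next_tag)

    def on_rfrag_ack(self, src_mac: str, tag: int) -> Optional[Key]:
        """Switch an RFRAG-ACK back toward the sender. None means the
        Reverse LSP was not found and the RFRAG-ACK is silently dropped."""
        vrb = self.reverse_lsp.get((src_mac, tag))
        return None if vrb is None else (vrb.prev_mac, vrb.prev_tag)
```

   In this sketch both tables point at the same Vrb object, which
   mirrors the text: the LSP and the Reverse LSP are two indexes into
   one piece of per-datagram state.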

   If the Reverse LSP is not found, the router MUST silently drop the
   RFRAG-ACK message.

   Either way, if the RFRAG-ACK indicates that the datagram was entirely
   received (FULL bitmap), the router arms a short timer, and upon
   timeout, the VRB and all the associated state are destroyed.  Until
   the timer elapses, fragments of that datagram may still be received,
   e.g., if the RFRAG-ACK was lost on the way back and the source
   retried the last fragment.  In that case, the router forwards the
   fragment according to the state in the VRB.

   This specification does not provide a method to discover the number
   of hops or the minimal value of MTU along those hops.  But should the
   minimal MTU decrease, it is possible to retry a long fragment (say
   sequence of 5) with first a shorter fragment of the same sequence (5
   again) and then one or more other fragments with a sequence that was
   not used before (e.g., 13 and 14).  Note that Path MTU Discovery is
   out of scope for this document.

6.3.  Aborting the Transmission of a Fragmented Packet

   A reset is signaled on the forward path with a pseudo fragment that
   has the fragment_offset, sequence and fragment_size all set to 0, and
   no data.

   When the sender or a router on the way decides that a packet should
   be dropped and the fragmentation process aborted, it generates a
   reset pseudo fragment and forwards it down the fragment path.

   Each subsequent router along the path forwards the pseudo fragment
   based on the VRB state.  If an acknowledgment is not requested, the
   VRB and all associated state are destroyed.

   Upon reception of the pseudo fragment, the receiver cleans up all
   resources for the packet associated with the datagram_tag.  If an
   acknowledgment is requested, the receiver responds with a NULL
   bitmap.
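
   The following Python sketch shows one way an endpoint might build
   and interpret the acknowledgment bitmap, including the NULL and FULL
   cases used above.  It is illustrative only: the 32-bit width is
   assumed from the 32-fragment limit stated in this document, the
   choice of the most significant bit for offset 0 is an assumption
   about the layout in Figure 2, and the function names are
   hypothetical.

```python
from typing import Iterable, List, Tuple

BITMAP_BITS = 32                       # assumed width of the bitmap
NULL_BITMAP = 0                        # all 0's: fragmentation aborted
FULL_BITMAP = (1 << BITMAP_BITS) - 1   # all 1's: datagram complete


def bit_for(sequence: int) -> int:
    """Mask for fragment `sequence`; offset 0 is assumed to be the
    leftmost (most significant) bit."""
    if not 0 <= sequence < BITMAP_BITS:
        raise ValueError("sequence out of range")
    return 1 << (BITMAP_BITS - 1 - sequence)


def build_bitmap(received: Iterable[int], total: int,
                 aborted: bool = False) -> int:
    """Receiver side: bitmap to place in an RFRAG Acknowledgment."""
    if aborted:
        return NULL_BITMAP
    got = set(received)
    if total > 0 and got >= set(range(total)):
        return FULL_BITMAP
    bitmap = 0
    for sequence in got:
        bitmap |= bit_for(sequence)
    return bitmap


def interpret(bitmap: int, sent: Iterable[int]) -> Tuple[str, List[int]]:
    """Sender side: classify an acknowledgment and list the fragments
    that were sent but are not acknowledged, oldest first."""
    if bitmap == NULL_BITMAP:
        return ("abort", [])
    if bitmap == FULL_BITMAP:
        return ("complete", [])
    missing = [s for s in sorted(set(sent)) if not bitmap & bit_for(s)]
    return ("retry", missing)
```

   For example, a receiver holding fragments 0, 1 and 3 of a
   five-fragment datagram would acknowledge with three bits set, and
   the sender would conclude that fragments 2 and 4 remain outstanding.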

   The other way around, the receiver might need to abort the
   processing of a fragmented packet for internal reasons, for instance
   if it is out of reassembly buffers, or considers that this packet is
   already fully reassembled and passed to the upper layer.  In that
   case, the receiver SHOULD indicate so to the sender with a NULL
   bitmap in an RFRAG Acknowledgment.  Upon an acknowledgment with a
   NULL bitmap, the sender endpoint MUST abort the transmission of the
   fragmented datagram.

7.  Management Considerations

7.1.  Protocol Parameters

   There is no particular configuration on the receiver, as echoing ECN
   is always on.  The configuration only applies to the sender, which is
   in control of the transmission.  The management system SHOULD be
   capable of providing the parameters below:

   MinFragmentSize:  The MinFragmentSize is the minimum value for the
      Fragment_Size.

   OptFragmentSize:  The OptFragmentSize is the value for the
      Fragment_Size that the sender should use to start with.

   MaxFragmentSize:  The MaxFragmentSize is the maximum value for the
      Fragment_Size.  It MUST be lower than the minimum MTU along the
      path.  A large value increases the chances of buffer bloat and
      transmission loss.  The value MUST be less than 512 if the unit
      that is defined for the PHY layer is the octet.

   UseECN:  Indicates whether the sender should react to ECN.  When the
      sender reacts to ECN, the Window_Size will vary between
      MinWindowSize and MaxWindowSize.

   MinWindowSize:  The minimum value of Window_Size that the sender can
      use.

   OptWindowSize:  The OptWindowSize is the value for the Window_Size
      that the sender should use to start with.

   MaxWindowSize:  The maximum value of Window_Size that the sender can
      use.  The value MUST be less than 32.

   InterFrameGap:  Indicates a minimum amount of time between
      transmissions.
      All packets to a same destination, and in particular fragments,
      may be subject to receive-while-transmitting and hidden terminal
      collisions with the next or the previous transmission as the
      fragments progress along a same path.  The InterFrameGap protects
      the propagation of one transmission before the next one is
      triggered and creates a duty cycle that controls the ratio of air
      time and memory in intermediate nodes that a particular datagram
      will use.

   MinARQTimeOut:  The minimum amount of time a node should wait for an
      RFRAG Acknowledgment before it takes a next action.

   OptARQTimeOut:  The initial value of the amount of time that a sender
      should wait for an RFRAG Acknowledgment before it takes a next
      action.

   MaxARQTimeOut:  The maximum amount of time a node should wait for an
      RFRAG Acknowledgment before it takes a next action.

   MaxFragRetries:  The maximum number of retries for a particular
      fragment.

   MaxDatagramRetries:  The maximum number of retries from scratch for a
      particular datagram.

7.2.  Observing the network

   The management system should monitor the number of retries and of
   ECN settings that can be observed from the perspective of both the
   sender and the receiver, and may tune the optimum values of the
   Fragment_Size and of the Window_Size, OptFragmentSize and
   OptWindowSize respectively, at the sender.  The values should be
   bounded by the expected number of hops and reduced beyond that when
   the number of datagrams that can traverse an intermediate point may
   exceed its capacity and cause a congestion loss.  The InterFrameGap
   is another tool that can be used to increase the spacing between
   fragments of a same datagram and reduce the ratio of time when a
   particular intermediate node holds a fragment of that datagram.

8.  Security Considerations

   The considerations in the Security section of [I-D.ietf-core-cocoa]
   apply equally to this specification.

   The process of recovering fragments does not appear to create any
   opening for new threats compared to "Transmission of IPv6 Packets
   over IEEE 802.15.4 Networks" [RFC4944].

   The Virtual Reassembly Buffer inherited from
   [I-D.ietf-6lo-minimal-fragment] may be used to perform a
   Denial-of-Service (DoS) attack against the intermediate routers,
   since the routers need to maintain a state per flow.  The VRB
   implementation technique described in
   [I-D.ietf-lwig-6lowpan-virtual-reassembly] allows realigning which
   data goes in which fragment.  This causes the intermediate node to
   store a portion of the data, which adds an attack vector that is not
   present with this specification.  With this specification, the data
   that is transported in each fragment is conserved and the state to
   keep does not include any data that would not fit in the previous
   fragment.

9.  IANA Considerations

   This document allocates 4 values in Page 0 for recoverable fragments
   from the "Dispatch Type Field" registry that was created by
   "Transmission of IPv6 Packets over IEEE 802.15.4 Networks" [RFC4944]
   and reformatted by "6LoWPAN Paging Dispatch" [RFC8025].

   The suggested values (to be confirmed by IANA) are indicated in
   Table 1.

   +-------------+------+----------------------------------+-----------+
   | Bit Pattern | Page | Header Type                      | Reference |
   +=============+======+==================================+===========+
   | 11 10100x   | 0    | RFRAG - Recoverable Fragment     | THIS RFC  |
   +-------------+------+----------------------------------+-----------+
   | 11 10101x   | 0    | RFRAG-ACK - RFRAG                | THIS RFC  |
   |             |      | Acknowledgment                   |           |
   +-------------+------+----------------------------------+-----------+

           Table 1: Additional Dispatch Value Bit Patterns

10.  Acknowledgments

   The author wishes to thank Michel Veillette, Dario Tedeschi, Laurent
   Toutain, Carles Gomez Montenegro, Thomas Watteyne and Michael
   Richardson for in-depth reviews and comments.  Also many thanks to
   Jonathan Hui, Jay Werb, Christos Polyzois, Soumitri Kolavennu, Pat
   Kinney, Margaret Wasserman, Richard Kelsey, Carsten Bormann and Harry
   Courtice for their various contributions.

11.  Normative References

   [I-D.ietf-6lo-minimal-fragment]
              Watteyne, T., Bormann, C., and P. Thubert, "6LoWPAN
              Fragment Forwarding", Work in Progress, Internet-Draft,
              draft-ietf-6lo-minimal-fragment-04, 2 September 2019.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4944]  Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler,
              "Transmission of IPv6 Packets over IEEE 802.15.4
              Networks", RFC 4944, DOI 10.17487/RFC4944, September 2007.

   [RFC6282]  Hui, J., Ed. and P. Thubert, "Compression Format for IPv6
              Datagrams over IEEE 802.15.4-Based Networks", RFC 6282,
              DOI 10.17487/RFC6282, September 2011.

   [RFC6554]  Hui, J., Vasseur, JP., Culler, D., and V. Manral, "An IPv6
              Routing Header for Source Routes with the Routing Protocol
              for Low-Power and Lossy Networks (RPL)", RFC 6554,
              DOI 10.17487/RFC6554, March 2012.

   [RFC8025]  Thubert, P., Ed. and R. Cragie, "IPv6 over Low-Power
              Wireless Personal Area Network (6LoWPAN) Paging Dispatch",
              RFC 8025, DOI 10.17487/RFC8025, November 2016.

   [RFC8138]  Thubert, P., Ed., Bormann, C., Toutain, L., and R. Cragie,
              "IPv6 over Low-Power Wireless Personal Area Network
              (6LoWPAN) Routing Header", RFC 8138, DOI 10.17487/RFC8138,
              April 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

12.  Informative References

   [I-D.ietf-6tisch-architecture]
              Thubert, P., "An Architecture for IPv6 over the TSCH mode
              of IEEE 802.15.4", Work in Progress, Internet-Draft,
              draft-ietf-6tisch-architecture-27, 18 October 2019.

   [I-D.ietf-core-cocoa]
              Bormann, C., Betzler, A., Gomez, C., and I. Demirkol,
              "CoAP Simple Congestion Control/Advanced", Work in
              Progress, Internet-Draft, draft-ietf-core-cocoa-03,
              21 February 2018.

   [I-D.ietf-intarea-frag-fragile]
              Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O.,
              and F. Gont, "IP Fragmentation Considered Fragile", Work
              in Progress, Internet-Draft,
              draft-ietf-intarea-frag-fragile-17, 30 September 2019.

   [I-D.ietf-lwig-6lowpan-virtual-reassembly]
              Bormann, C. and T. Watteyne, "Virtual reassembly buffers
              in 6LoWPAN", Work in Progress, Internet-Draft,
              draft-ietf-lwig-6lowpan-virtual-reassembly-01,
              11 March 2019.

   [IEEE.802.15.4]
              IEEE, "IEEE Standard for Low-Rate Wireless Networks",
              IEEE Standard 802.15.4, DOI 10.1109/IEEE
              P802.15.4-REVd/D01, October 2019.

   [Kent]     Kent, C. and J. Mogul, "Fragmentation Considered
              Harmful", In Proc. SIGCOMM '87 Workshop on Frontiers in
              Computer Communications Technology,
              DOI 10.1145/55483.55524, August 1987.

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, DOI 10.17487/RFC2914, September 2000.

   [RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
              Label Switching Architecture", RFC 3031,
              DOI 10.17487/RFC3031, January 2001.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

   [RFC4919]  Kushalnagar, N., Montenegro, G., and C.
              Schumacher, "IPv6 over Low-Power Wireless Personal Area
              Networks (6LoWPANs): Overview, Assumptions, Problem
              Statement, and Goals", RFC 4919, DOI 10.17487/RFC4919,
              August 2007.

   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
              Errors at High Data Rates", RFC 4963,
              DOI 10.17487/RFC4963, July 2007.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298,
              DOI 10.17487/RFC6298, June 2011.

   [RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
              Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
              JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
              Low-Power and Lossy Networks", RFC 6550,
              DOI 10.17487/RFC6550, March 2012.

   [RFC6606]  Kim, E., Kaspar, D., Gomez, C., and C. Bormann, "Problem
              Statement and Requirements for IPv6 over Low-Power
              Wireless Personal Area Network (6LoWPAN) Routing",
              RFC 6606, DOI 10.17487/RFC6606, May 2012.

   [RFC7554]  Watteyne, T., Ed., Palattella, M., and L. Grieco, "Using
              IEEE 802.15.4e Time-Slotted Channel Hopping (TSCH) in the
              Internet of Things (IoT): Problem Statement", RFC 7554,
              DOI 10.17487/RFC7554, May 2015.

   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
              Recommendations Regarding Active Queue Management",
              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015.

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017.

   [RFC8087]  Fairhurst, G. and M. Welzl, "The Benefits of Using
              Explicit Congestion Notification (ECN)", RFC 8087,
              DOI 10.17487/RFC8087, March 2017.

   [RFC8200]  Deering, S. and R.
              Hinden, "Internet Protocol, Version 6 (IPv6)
              Specification", STD 86, RFC 8200, DOI 10.17487/RFC8200,
              July 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

Appendix A.  Rationale

   There are a number of uses for large packets in Wireless Sensor
   Networks.  Such usages may not be the most typical or represent the
   largest amount of traffic over the LLN; however, the associated
   functionality can be critical enough to justify extra care for
   ensuring effective transport of large packets across the LLN.

   The list of those usages includes:

   Towards the LLN node:

      Firmware update:  For example, a new version of the LLN node
         software is downloaded from a system manager over unicast or
         multicast services.  Such a reflashing operation typically
         involves updating a large number of similar LLN nodes over a
         relatively short period of time.

      Packages of Commands:  A number of commands or a full
         configuration can be packaged as a single message to ensure
         consistency and enable atomic execution or complete roll back.
         Until such commands are fully received and interpreted, the
         intended operation will not take effect.

   From the LLN node:

      Waveform captures:  A number of consecutive samples are measured
         at a high rate for a short time and then transferred from a
         sensor to a gateway or an edge server as a single large report.

      Data logs:  LLN nodes may generate large logs of sampled data for
         later extraction.  LLN nodes may also generate system logs to
         assist in diagnosing problems on the node or network.

      Large data packets:  Rich data types might require more than one
         fragment.

   Uncontrolled firmware download or waveform upload can easily result
   in a massive increase of the traffic and saturate the network.

   When a fragment is lost in transmission, the lack of recovery in the
   original fragmentation system of RFC 4944 implies that all fragments
   would need to be resent, further contributing to the congestion that
   caused the initial loss, and potentially leading to congestion
   collapse.

   This saturation may lead to excessive radio interference, or random
   early discard (leaky bucket) in relaying nodes.  Additional queuing
   and memory congestion may result while waiting for a low-power next
   hop to emerge from its sleeping state.

   Considering that RFC 4944 defines an MTU of 1280 bytes and that in
   most incarnations (except 802.15.4g) an IEEE Std. 802.15.4 frame can
   limit the MAC payload to as few as 74 bytes, a packet might be
   fragmented into at least 18 fragments at the 6LoWPAN shim layer.
   Taking into account the worst-case header overhead for 6LoWPAN
   Fragmentation and Mesh Addressing headers will increase the number of
   required fragments to around 32.  This level of fragmentation is much
   higher than that traditionally experienced over the Internet with
   IPv4 fragments.  At the same time, the use of radios increases the
   probability of transmission loss and Mesh-Under techniques compound
   that risk over multiple hops.

   Mechanisms such as TCP or application-layer segmentation could be
   used to support end-to-end reliable transport.  One option to support
   bulk data transfer over a frame-size-constrained LLN is to set the
   Maximum Segment Size to fit within the link maximum frame size.
   Doing so, however, can add significant header overhead to each
   802.15.4 frame.  In addition, deploying such a mechanism requires
   that the end-to-end transport is aware of the delivery properties of
   the underlying LLN, which is a layer violation, and difficult to
   achieve from the far end of the IPv6 network.

Appendix B.  Requirements

   For one-hop communications, a number of Low Power and Lossy Network
   (LLN) link-layers propose a local acknowledgment mechanism that is
   enough to detect and recover the loss of fragments.  In a multihop
   environment, an end-to-end fragment recovery mechanism might be a
   good complement to a hop-by-hop MAC-level recovery.  This draft
   introduces a simple protocol to recover individual fragments between
   6LoWPAN endpoints that may be multiple hops away.  The method
   addresses the following requirements of an LLN:

   Number of fragments

      The recovery mechanism must support highly fragmented packets,
      with a maximum of 32 fragments per packet.

   Minimum acknowledgment overhead

      Because the radio is half duplex, and because of silent time spent
      in the various medium access mechanisms, an acknowledgment
      consumes roughly as many resources as a data fragment.

      The new end-to-end fragment recovery mechanism should be able to
      acknowledge multiple fragments in a single message and not require
      an acknowledgment at all if fragments are already protected at a
      lower layer.

   Controlled latency

      The recovery mechanism must succeed or give up within the time
      boundary imposed by the recovery process of the Upper Layer
      Protocols.

   Optional congestion control

      The aggregation of multiple concurrent flows may lead to the
      saturation of the radio network and congestion collapse.

      The recovery mechanism should provide means for controlling the
      number of fragments in transit over the LLN.

Appendix C.  Considerations On Flow Control

   Considering that a multi-hop LLN can be a very sensitive environment
   due to the limited queuing capabilities of a large population of its
   nodes, this draft recommends a simple and conservative approach to
   Congestion Control, based on TCP congestion avoidance.
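
   As a rough illustration of such a conservative approach, the Python
   sketch below adapts the Window_Size between MinWindowSize and
   MaxWindowSize.  The exact rule is an assumption made for this
   example and is not mandated by this draft: the window is halved when
   an acknowledgment echoes ECN, collapsed to the minimum upon time
   out, and otherwise grown by one fragment per acknowledgment.  The
   class name, method names and default values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WindowControl:
    """Hypothetical sender-side window state (names are illustrative)."""
    min_window: int = 1    # MinWindowSize (example value)
    opt_window: int = 3    # OptWindowSize, the starting Window_Size
    max_window: int = 8    # MaxWindowSize, which must be less than 32
    use_ecn: bool = True   # UseECN

    def __post_init__(self) -> None:
        if not 1 <= self.min_window <= self.opt_window <= self.max_window < 32:
            raise ValueError("inconsistent window parameters")
        self.window = self.opt_window

    def on_ack(self, ecn_echoed: bool) -> int:
        """Update the window when an RFRAG Acknowledgment arrives."""
        if ecn_echoed and self.use_ecn:
            # Assumed multiplicative decrease on an echoed ECN indication.
            self.window = max(self.min_window, self.window // 2)
        else:
            # Assumed additive increase while the path looks uncongested.
            self.window = min(self.max_window, self.window + 1)
        return self.window

    def on_timeout(self) -> int:
        """Loss is assumed upon time out: fall back to the minimum."""
        self.window = self.min_window
        return self.window
```

   A sender using this sketch would refrain from having more than
   `window` fragments outstanding, and would call on_ack() or
   on_timeout() as acknowledgments arrive or the ARQ timer expires.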

   Congestion on the forward path is assumed in case of packet loss, and
   packet loss is assumed upon time out.  The draft makes it possible to
   control the number of outstanding fragments, which are the fragments
   that have been transmitted but for which an acknowledgment was not
   received yet.  It must be noted that the number of outstanding
   fragments should not exceed the number of hops in the network, but
   the way to figure the number of hops is out of scope for this
   document.

   Congestion on the forward path can also be indicated by an Explicit
   Congestion Notification (ECN) mechanism.  Though whether and how ECN
   [RFC3168] is carried out over the LoWPAN is out of scope, this draft
   provides a way for the destination endpoint to echo an ECN indication
   back to the source endpoint in an acknowledgment message as
   represented in Figure 4 in Section 5.2.

   It must be noted that congestion and collision are different topics.
   In particular, when a mesh operates on a same channel over multiple
   hops, then the forwarding of a fragment over a certain hop may
   collide with the forwarding of a next fragment that is following over
   a previous hop but in a same interference domain.  This draft enables
   an end-to-end flow control, but leaves it to the sender stack to pace
   individual fragments within a transmit window, so that a given
   fragment is sent only when the previous fragment has had a chance to
   progress beyond the interference domain of this hop.  In the case of
   6TiSCH [I-D.ietf-6tisch-architecture], which operates over the
   Time-Slotted Channel Hopping (TSCH) [RFC7554] mode of operation of
   IEEE 802.15.4, a fragment is forwarded over a different channel at a
   different time and it makes full sense to transmit the next fragment
   as soon as the previous fragment has had its chance to be forwarded
   at the next hop.

   From the standpoint of a source 6LoWPAN endpoint, an outstanding
   fragment is a fragment that was sent but for which no explicit
   acknowledgment was received yet.  This means that the fragment might
   be on the way, received but not yet acknowledged, or the
   acknowledgment might be on the way back.  It is also possible that
   either the fragment or the acknowledgment was lost on the way.

   From the sender standpoint, all outstanding fragments might still be
   in the network and contribute to its congestion.  There is an
   assumption, though, that after a certain amount of time, a frame is
   either received or lost, so it is not causing congestion anymore.
   This amount of time can be estimated based on the round-trip delay
   between the 6LoWPAN endpoints.  The method detailed in [RFC6298] is
   recommended for that computation.

   The reader is encouraged to read through "Congestion Control
   Principles" [RFC2914].  Additionally, [RFC7567] and [RFC5681] provide
   deeper information on why this mechanism is needed and how TCP
   handles Congestion Control.  Basically, the goal here is to manage
   the number of fragments present in the network; this is achieved by
   reducing the number of outstanding fragments over a congested path
   by throttling the sources.

   Section 6 describes how the sender decides how many fragments are
   (re)sent before an acknowledgment is required, and how the sender
   adapts that number to the network conditions.

Author's Address

   Pascal Thubert (editor)
   Cisco Systems, Inc
   Building D, 45 Allee des Ormes - BP1200
   06254 MOUGINS - Sophia Antipolis
   France

   Phone: +33 497 23 26 34
   Email: pthubert@cisco.com