idnits 2.17.1

draft-ietf-lwig-tcp-constrained-node-networks-11.txt:

Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------

No issues found here.

Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------

No issues found here.

Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------

** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.)

** There are 32 instances of too long lines in the document, the longest one being 90 characters in excess of 72.

Miscellaneous warnings:
----------------------------------------------------------------------------

== The copyright year in the IETF Trust and authors Copyright Line does not match the current year

== The document doesn't use any RFC 2119 keywords, yet has text resembling RFC 2119 boilerplate text.

-- The document date (October 8, 2020) is 1267 days in the past. Is this intentional?

Checking references for intended status: Informational
----------------------------------------------------------------------------

== Missing Reference: 'RFC 7228' is mentioned on line 897, but not defined

== Unused Reference: 'RFC6092' is defined on line 1243, but no explicit reference was found in the text

** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

** Obsolete normative reference: RFC 6691 (Obsoleted by RFC 9293)

== Outdated reference: A later version (-02) exists of draft-ietf-core-fasor-00

-- Obsolete informational reference (is this intentional?): RFC 7230 (Obsoleted by RFC 9110, RFC 9112)

-- Obsolete informational reference (is this intentional?): RFC 7540 (Obsoleted by RFC 9113)

Summary: 5 errors (**), 0 flaws (~~), 5 warnings (==), 3 comments (--).

Run idnits with the --verbose option for more detailed information about the items above.

--------------------------------------------------------------------------------

2 LWIG Working Group                                              C. Gomez
3 Internet-Draft                                                       UPC
4 Intended status: Informational                              J. Crowcroft
5 Expires: April 11, 2021                          University of Cambridge
6                                                                M. Scharf
7                                                     Hochschule Esslingen
8                                                          October 8, 2020

10 TCP Usage Guidance in the Internet of Things (IoT)
11 draft-ietf-lwig-tcp-constrained-node-networks-11

13 Abstract

15 This document provides guidance on how to implement and use the
16 Transmission Control Protocol (TCP) in Constrained-Node Networks
17 (CNNs), which are a characteristic of the Internet of Things (IoT).
18 Such environments require a lightweight TCP implementation and may
19 not make use of optional functionality. This document explains a
20 number of known and deployed techniques to simplify a TCP stack as
21 well as corresponding tradeoffs. The objective is to help embedded
22 developers with decisions on which TCP features to use.

24 Status of This Memo

26 This Internet-Draft is submitted in full conformance with the
27 provisions of BCP 78 and BCP 79.

29 Internet-Drafts are working documents of the Internet Engineering
30 Task Force (IETF). Note that other groups may also distribute
31 working documents as Internet-Drafts. The list of current Internet-
32 Drafts is at https://datatracker.ietf.org/drafts/current/.

34 Internet-Drafts are draft documents valid for a maximum of six months
35 and may be updated, replaced, or obsoleted by other documents at any
36 time. It is inappropriate to use Internet-Drafts as reference
37 material or to cite them other than as "work in progress."

39 This Internet-Draft will expire on April 11, 2021.

41 Copyright Notice

43 Copyright (c) 2020 IETF Trust and the persons identified as the
44 document authors. All rights reserved.

46 This document is subject to BCP 78 and the IETF Trust's Legal
47 Provisions Relating to IETF Documents
48 (https://trustee.ietf.org/license-info) in effect on the date of
49 publication of this document. Please review these documents
50 carefully, as they describe your rights and restrictions with respect
51 to this document. Code Components extracted from this document must
52 include Simplified BSD License text as described in Section 4.e of
53 the Trust Legal Provisions and are provided without warranty as
54 described in the Simplified BSD License.

56 Table of Contents

58 1. Introduction
59 2. Conventions used in this document
60 3. Characteristics of CNNs relevant for TCP
61   3.1. Network and link properties
62   3.2. Usage scenarios
63   3.3. Communication and traffic patterns
64 4. TCP implementation and configuration in CNNs
65   4.1. Addressing path properties
66     4.1.1. Maximum Segment Size (MSS)
67     4.1.2. Explicit Congestion Notification (ECN)
68     4.1.3. Explicit loss notifications
69   4.2. TCP guidance for single-MSS stacks
70     4.2.1. Single-MSS stacks - benefits and issues
71     4.2.2. TCP options for single-MSS stacks
72     4.2.3. Delayed Acknowledgments for single-MSS stacks
73     4.2.4. RTO calculation for single-MSS stacks
74   4.3. General recommendations for TCP in CNNs
75     4.3.1. Loss recovery and congestion/flow control
76       4.3.1.1. Selective Acknowledgments (SACK)
77     4.3.2. Delayed Acknowledgments
78     4.3.3. Initial Window
79 5. TCP usage recommendations in CNNs
80   5.1. TCP connection initiation
81   5.2. Number of concurrent connections
82   5.3. TCP connection lifetime
83 6. Security Considerations
84 7. Acknowledgments
85 8. Annex. TCP implementations for constrained devices
86   8.1. uIP
87   8.2. lwIP
88   8.3. RIOT
89   8.4. TinyOS
90   8.5. FreeRTOS
91   8.6. uC/OS
92   8.7. Summary
93 9. Annex. Changes compared to previous versions
94   9.1. Changes between -00 and -01
95   9.2. Changes between -01 and -02
96   9.3. Changes between -02 and -03
97   9.4. Changes between -03 and -04
98   9.5. Changes between -04 and -05
99   9.6. Changes between -05 and -06
100   9.7. Changes between -06 and -07
101   9.8. Changes between -07 and -08
102   9.9. Changes between -08 and -09
103   9.10. Changes between -09 and -10
104 10. References
105   10.1. Normative References
106   10.2. Informative References
107 Authors' Addresses

109 1. Introduction

111 The Internet Protocol suite is being used for connecting Constrained-
112 Node Networks (CNNs) to the Internet, enabling the so-called Internet
113 of Things (IoT) [RFC7228]. In order to meet the requirements that
114 stem from CNNs, the IETF has produced a suite of new protocols
115 specifically designed for such environments (see e.g. [RFC8352]).
116 New IETF protocol stack components include the IPv6 over Low-power
117 Wireless Personal Area Networks (6LoWPAN) adaptation layer
118 [RFC4944][RFC6282][RFC6775], the IPv6 Routing Protocol for Low-power
119 and lossy networks (RPL) [RFC6550], and the
120 Constrained Application Protocol (CoAP) [RFC7252].

122 At the time of writing, the main transport layer protocols in IP-
123 based IoT scenarios are UDP and TCP. However, TCP has been
124 criticized (often unfairly) as a protocol for the IoT. In fact,
125 some TCP features are not optimal for IoT scenarios, such as a
126 relatively large header size, unsuitability for multicast, and always-
127 confirmed data delivery.
However, many typical claims about TCP
128 unsuitability for IoT (e.g. high complexity, incompatibility of the
129 connection-oriented approach with radio duty-cycling, and spurious
130 congestion control activation in wireless links) are not valid,
131 describe problems that can be solved, or apply equally to well-accepted
132 IoT end-to-end reliability mechanisms (see [IntComp] for a detailed analysis).

134 At the application layer, CoAP was developed over UDP [RFC7252].
135 However, the integration of some CoAP deployments with existing
136 infrastructure is being challenged by middleboxes such as firewalls,
137 which may limit and even block UDP-based communications. This is the
138 main reason why a CoAP over TCP specification has been developed
139 [RFC8323].

141 Other application layer protocols not specifically designed for CNNs
142 are also being considered for the IoT space. Some examples include
143 HTTP/2 and even HTTP/1.1, both of which run over TCP by default
144 [RFC7230] [RFC7540], and the Extensible Messaging and Presence
145 Protocol (XMPP) [RFC6120]. TCP is also used by non-IETF application-
146 layer protocols in the IoT space such as the Message Queue Telemetry
147 Transport (MQTT) and its lightweight variants.

149 TCP is a sophisticated transport protocol that includes optional
150 functionality (e.g. TCP options) that may improve performance in
151 some environments. However, many optional TCP extensions require
152 complex logic inside the TCP stack and increase the code size and the
153 memory requirements. Many TCP extensions are not required for
154 interoperability with other standard-compliant TCP endpoints. Given
155 the limited resources on constrained devices, careful selection of
156 optional TCP features can make an implementation more lightweight.

158 This document provides guidance on how to implement and configure
159 TCP, as well as on how applications should use TCP,
160 in CNNs.
The overarching goal is to offer simple measures to allow
161 for lightweight TCP implementation and suitable operation in such
162 environments. A TCP implementation following the guidance in this
163 document is intended to be compatible with a TCP endpoint that is
164 compliant with the TCP standards, albeit possibly with lower
165 performance. This implies that such a TCP client would always be
166 able to connect with a standard-compliant TCP server, and a
167 corresponding TCP server would always be able to connect with a
168 standard-compliant TCP client.

170 This document assumes that the reader is familiar with TCP. A
171 comprehensive survey of the TCP standards can be found in [RFC7414].
172 Similar guidance regarding the use of TCP in special environments has
173 been published before, e.g., for cellular wireless networks
174 [RFC3481].

176 2. Conventions used in this document

178 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
179 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
180 document are to be interpreted as described in [RFC2119].

182 3. Characteristics of CNNs relevant for TCP

184 3.1. Network and link properties

186 CNNs are defined in [RFC7228] as networks whose characteristics are
187 influenced by being composed of a significant portion of constrained
188 nodes. The latter are characterized by significant limitations on
189 processing, memory, and energy resources, among others [RFC7228].
190 The first two dimensions pose constraints on the complexity and on
191 the memory footprint of the protocols that constrained nodes can
192 support. The third requires techniques to save energy, such as
193 radio duty-cycling in wireless devices [RFC8352], as well as
194 minimization of the number of messages transmitted/received (and
195 their size).

197 [RFC7228] lists typical network constraints in CNNs, including low
198 achievable bitrate/throughput, high packet loss and high variability
199 of packet loss, highly asymmetric link characteristics, severe
200 penalties for using larger packets, limits on reachability over time,
201 etc. CNNs may use wireless or wired technologies (e.g., Power Line
202 Communication), and the transmission rates are typically low (e.g.
203 below 1 Mbps).

205 For use of TCP, one challenge is that not all technologies in CNNs may
206 be aligned with typical Internet subnetwork design principles
207 [RFC3819]. For instance, constrained nodes often use physical/link
208 layer technologies that have been characterized as 'lossy', i.e.,
209 that exhibit a relatively high bit error rate. Dealing with corruption
210 loss is one of the open issues in the Internet [RFC6077].

212 3.2. Usage scenarios

214 There are different deployment and usage scenarios for CNNs. Some
215 CNNs follow a star topology, whereby one or several hosts are
216 linked to a central device that acts as a router connecting the CNN
217 to the Internet. CNNs may also follow a multihop topology
218 [RFC6606].

220 In constrained environments, there can be different types of devices
221 [RFC7228]. For example, there can be devices with a single combined
222 send/receive buffer, devices with a separate send and receive buffer,
223 or devices with a pool of multiple send/receive buffers. In the
224 latter case, the buffers may also be shared with other
225 protocols.

227 One key use case for TCP in CNNs is a model where
228 constrained devices connect to unconstrained servers in the Internet.
229 But it is also possible that both TCP endpoints run on constrained
230 devices. In the first case, communication possibly has to traverse a
231 middlebox (e.g. a firewall, NAT, etc.). Figure 1 illustrates such
232 a scenario.
Note that the scenario is asymmetric, as the unconstrained
233 device will typically not suffer the severe constraints of the
234 constrained device. The unconstrained device is expected to be
235 mains-powered, to have a large amount of memory and processing power,
236 and to be connected to a resource-rich network.

238 Assuming that a majority of constrained devices will be
239 sensor nodes, the amount of data traffic sent by constrained devices
240 (e.g. sensor node measurements) is expected to be higher than the
241 amount of data traffic in the opposite direction. Nevertheless,
242 constrained devices may receive requests (to which they may respond),
243 commands (for configuration purposes and for constrained devices
244 including actuators) and relatively infrequent firmware/software
245 updates.

247                                                +---------------+
248  o     o   <-------- TCP communication ----->  |               |
249    o     o                                     |               |
250  o     o                                       | Unconstrained |
251    o     o         +-----------+               |    device     |
252  o   o   o  ------ | Middlebox | -------       |               |
253    o   o           +-----------+               | (e.g. cloud)  |
254  o   o   o                                     |               |
255                                                +---------------+
256  constrained devices

258 Figure 1: TCP communication between a constrained device and an
259 unconstrained device, traversing a middlebox.

261 3.3. Communication and traffic patterns

263 IoT applications are characterized by a number of different
264 communication patterns. The following non-comprehensive list
265 explains some typical examples:

267 o Unidirectional transfers: An IoT device (e.g. a sensor) can
268 repeatedly send updates to the other endpoint. An application
269 response back to the IoT device is not needed in every
270 case.

272 o Request-response patterns: An IoT device receives a request from
273 the other endpoint, which triggers a response from the IoT device.

275 o Bulk data transfers: A typical example of a long file transfer
276 would be an IoT device firmware update.

278 A typical communication pattern is that a constrained device
279 communicates with an unconstrained device (cf. Figure 1). But it is
280 also possible that constrained devices communicate amongst
281 themselves.

283 4. TCP implementation and configuration in CNNs

285 This section explains how a TCP stack can deal with typical
286 constraints in CNNs. The guidance in this section relates to the TCP
287 implementation and its configuration.

289 4.1. Addressing path properties

291 4.1.1. Maximum Segment Size (MSS)

293 Assuming that IPv6 is used, and for the sake of lightweight
294 implementation and operation, unless applications require handling
295 large data units (i.e. leading to an IPv6 datagram size greater than
296 1280 bytes), it may be desirable to limit the IP datagram size to
297 1280 bytes in order to avoid the need to support Path MTU Discovery
298 [RFC8201]. In addition, an IP datagram size of 1280 bytes avoids
299 incurring IPv6-layer fragmentation.

301 An IPv6 datagram larger than 1280 bytes can be avoided by setting the
302 TCP MSS to at most 1220 bytes (1280 minus the 40-byte IPv6 header and
303 the 20-byte TCP header). This assumes that the remote sender uses no
304 TCP options, aside from possibly the MSS option in the initial TCP SYN.

306 In order to accommodate unrequested TCP options that may be used by
307 some TCP implementations, a constrained device may advertise an MSS
308 smaller than 1220 bytes (e.g. not larger than 1200 bytes). Note that
309 TCP implementations are advised to consume payload space
310 instead of increasing datagram size when including IP or TCP options
311 in an IP packet to be sent [RFC6691]. Therefore, the suggestion of
312 advertising an MSS smaller than 1220 bytes is likely to be
313 overcautious and its suitability should be considered carefully.

315 Note that setting the MTU to 1280 bytes is possible for link layer
316 technologies in the CNN space, even if some of them are characterized
317 by a short data unit payload size, e.g.
up to a few tens or hundreds
318 of bytes. For example, the maximum frame size in IEEE 802.15.4 is
319 127 bytes. 6LoWPAN defined an adaptation layer to support IPv6 over
320 IEEE 802.15.4 networks. The adaptation layer includes a
321 fragmentation mechanism, since IPv6 requires the layer below to
322 support an MTU of 1280 bytes [RFC2460], while IEEE 802.15.4 lacked
323 fragmentation mechanisms. 6LoWPAN defines an IEEE 802.15.4 link MTU
324 of 1280 bytes [RFC4944]. Other technologies, such as Bluetooth LE
325 [RFC7668], ITU-T G.9959 [RFC7428] or DECT-ULE [RFC8105], also use
326 6LoWPAN-based adaptation layers in order to enable IPv6 support.
327 These technologies do support link layer fragmentation. By
328 exploiting this functionality, the adaptation layers that enable IPv6
329 over such technologies also define an MTU of 1280 bytes.

331 On the other hand, there exist technologies also used in the CNN
332 space, such as Master-Slave/Token-Passing (MS/TP) [RFC8163],
333 Narrowband IoT (NB-IoT) [RFC8376] or IEEE 802.11ah
334 [I-D.delcarpio-6lo-wlanah], that do not suffer the same degree of
335 frame size limitations as the technologies mentioned above. The MTU
336 for MS/TP is recommended to be 1500 bytes [RFC8163], the MTU in NB-
337 IoT is 1600 bytes, and the maximum frame payload size for IEEE
338 802.11ah is 7991 bytes.

340 Using a larger MSS (to a suitable extent) may be beneficial in some
341 scenarios, especially when transferring large payloads, as it reduces
342 the number of packets (and packet headers) required for a given
343 payload. However, the characteristics of the constrained network
344 need to be considered. In particular, in a lossy network where
345 unreliable fragment delivery is used, the amount of data that TCP
346 unnecessarily retransmits due to fragment loss increases (and
347 throughput decreases) quickly with the MSS. This happens because the
348 loss of a fragment leads to the loss of the whole fragmented packet
349 being transmitted.
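
The effect of fragment loss on a large MSS can be illustrated with a short calculation. The following Python sketch is only an illustration under stated assumptions: fragment losses are independent with a fixed probability, no fragment recovery is used, each datagram carries a 40-byte IPv6 header and a 20-byte TCP header without options (header compression is ignored), and the 96-byte per-fragment payload is a hypothetical value chosen for the example, not a figure from this document. The function names are likewise made up for this sketch.

```python
import math

IPV6_HEADER = 40  # bytes; fixed IPv6 header, no extension headers (assumption)
TCP_HEADER = 20   # bytes; TCP header without options (assumption)


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP MSS that keeps the IPv6 datagram within `mtu` bytes."""
    return mtu - IPV6_HEADER - TCP_HEADER


def packet_loss_probability(mss: int, fragment_payload: int,
                            fragment_loss: float) -> float:
    """Probability that a datagram carrying `mss` bytes of TCP payload is
    lost, given that losing any single link-layer fragment loses the whole
    datagram (independent fragment losses, no fragment recovery)."""
    datagram = mss + IPV6_HEADER + TCP_HEADER
    fragments = math.ceil(datagram / fragment_payload)
    return 1.0 - (1.0 - fragment_loss) ** fragments


if __name__ == "__main__":
    # Hypothetical link: 96 bytes of datagram data per fragment, 1% fragment loss.
    for mss in (100, 400, mss_for_mtu(1280)):
        p = packet_loss_probability(mss, fragment_payload=96, fragment_loss=0.01)
        print(f"MSS {mss:5d} bytes -> datagram loss probability {p:.3f}")
```

Under these assumptions, the loss probability of a datagram grows with the number of fragments it is split into, so every additional fragment raises the amount of data that TCP has to retransmit after a single fragment loss.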
Unnecessary data retransmission is particularly
350 harmful in CNNs due to the resource constraints of such environments.
351 Note that, while the original 6LoWPAN fragmentation mechanism
352 [RFC4944] does not offer reliable fragment delivery, fragment
353 recovery functionality for 6LoWPAN or 6Lo environments is being
354 standardized at the time of writing [I-D.ietf-6lo-fragment-recovery].

356 4.1.2. Explicit Congestion Notification (ECN)

358 Explicit Congestion Notification (ECN) [RFC3168] allows a router
359 to signal in the IP header of a packet that congestion is arising,
360 for example when a queue size reaches a certain threshold. An ECN-
361 enabled TCP receiver will echo back the congestion signal to the TCP
362 sender by setting a flag in its next TCP ACK. The sender triggers
363 congestion control measures as if a packet loss had happened.

365 [RFC8087] outlines the principal gains in terms of
366 increased throughput, reduced delay, and other benefits when ECN is
367 used over a network path that includes equipment that supports
368 Congestion Experienced (CE) marking. In the context of CNNs, a
369 remarkable feature of ECN is that congestion can be signalled without
370 incurring packet drops (which would lead to retransmissions and
371 consumption of limited resources such as energy and bandwidth).

373 ECN can further reduce packet losses since congestion control
374 measures can be applied earlier [RFC2884]. Fewer lost packets
375 imply that the number of retransmitted segments decreases, which is
376 particularly beneficial in CNNs, where energy and bandwidth resources
377 are typically limited. Also, it makes sense to try to avoid packet
378 drops for transactional workloads with small data sizes, which are
379 typical for CNNs. In such traffic patterns, it is more difficult and
380 often impossible to detect packet loss without retransmission
381 timeouts (e.g., as there may not be three duplicate ACKs).
Any
382 retransmission timeout slows down the data transfer significantly.
383 In addition, if the constrained device uses power saving techniques,
384 a retransmission timeout will incur a wake-up action, in contrast to
385 ACK-clock-triggered sending. When the congestion window of a TCP
386 sender has a size of one segment and a TCP ACK with an ECN signal
387 (ECE flag) arrives at the TCP sender, the TCP sender resets the
388 retransmit timer, and the sender will only be able to send a new
389 packet when the retransmit timer expires. Effectively, at that
390 moment the TCP sender reduces its sending rate from 1 segment per
391 Round Trip Time (RTT) to 1 segment per RTO and reduces the sending
392 rate further on each ECN signal received in subsequent TCP ACKs.
393 Otherwise, if an ECN signal is not present in a subsequent TCP ACK,
394 the TCP sender resumes the normal ACK-clocked transmission of
395 segments [RFC3168].

397 ECN can be incrementally deployed in the Internet. Guidance on
398 configuration and usage of ECN is provided in [RFC7567]. Given the
399 benefits, more and more TCP stacks in the Internet support ECN, and
400 it specifically makes sense to leverage ECN in controlled
401 environments such as CNNs. Note, however, that supporting ECN
402 increases implementation complexity.

404 4.1.3. Explicit loss notifications

406 There has been a significant body of research on solutions capable of
407 explicitly indicating whether a TCP segment loss is due to
408 corruption, in order to avoid activation of congestion control
409 mechanisms [ETEN] [RFC2757]. While such solutions may provide
410 significant improvement, they have not been widely deployed and
411 remain experimental. In fact, as of today, the IETF has not
412 standardized any such solution.

414 4.2. TCP guidance for single-MSS stacks

416 This section discusses TCP stacks that allow transferring a single
417 MSS. More general guidance is provided in Section 4.3.

419 4.2.1. Single-MSS stacks - benefits and issues

421 A TCP stack can reduce the memory requirements by advertising a TCP
422 window size of one MSS, and by transmitting at most one MSS of
423 unacknowledged data. In that case, the implementation of both
424 congestion and flow control is quite simple. Such a small receive and send
425 window may be sufficient for simple message exchanges in the CNN
426 space. However, only using a window of one MSS can significantly
427 affect performance. A stop-and-wait operation results in low
428 throughput for transfers that exceed the length of one MSS, e.g., a
429 firmware download. Furthermore, a single-MSS solution relies solely
430 on timer-based loss recovery, therefore missing the performance gain
431 of Fast Retransmit and Fast Recovery (which require a larger window
432 size, see Section 4.3.1).

434 If CoAP is used over TCP with the default setting for NSTART in
435 [RFC7252], a CoAP endpoint is not allowed to send a new message to a
436 destination until a response for the previous message sent to that
437 destination has been received. This is equivalent to an application-
438 layer window size of 1 data unit. For this use of CoAP, a maximum
439 TCP window of one MSS may be sufficient, as long as the CoAP message
440 size does not exceed one MSS. An exception in CoAP over TCP, though,
441 is the Capabilities and Settings Message (CSM) that must be sent at
442 the start of the TCP connection. The first application message
443 carrying user data is allowed to be sent immediately after the
444 CSM. If the sum of the CSM size plus the application message
445 size exceeds the MSS, a sender using a single-MSS stack will need to
446 wait for the ACK confirming the CSM before sending the application
447 message.

449 4.2.2. TCP options for single-MSS stacks

451 A TCP implementation needs to support, at a minimum, TCP options 2, 1
452 and 0.
These are, respectively, the Maximum Segment Size (MSS)
453 option, the No-Operation option, and the End Of Option List marker
454 [RFC0793]. None of these are a substantial burden to support. These
455 options are sufficient for interoperability with a standard-compliant
456 TCP endpoint, although many TCP stacks support additional options and
457 can negotiate their use. A TCP implementation is permitted to
458 silently ignore all other TCP options.

460 A TCP implementation for a constrained device that uses a single-MSS
461 TCP receive or transmit window size may not benefit from supporting
462 the following TCP options: Window scale [RFC7323], TCP Timestamps
463 [RFC7323], Selective Acknowledgments (SACK) and SACK-Permitted
464 [RFC2018]. Other TCP options may also not be required on a
465 constrained device with a very lightweight implementation. With
466 regard to the Window scale option, note that it is only useful if a
467 window size greater than 64 kB is needed.

469 Note that a TCP sender can benefit from the TCP Timestamps option
470 [RFC7323] in detecting spurious RTOs. The latter are quite likely to
471 occur in CNN scenarios for a number of reasons (e.g. route changes
472 in a multihop scenario, link layer retries, etc.). The header
473 overhead incurred by the Timestamps option (of up to 12 bytes) needs
474 to be taken into account.

476 One potentially relevant TCP option in the context of CNNs is TCP
477 Fast Open (TFO) [RFC7413]. As described in Section 5.3, TFO can be
478 used to address the problem of traversing middleboxes that perform
479 early filter state record deletion.

481 4.2.3. Delayed Acknowledgments for single-MSS stacks

483 TCP Delayed Acknowledgments are meant to reduce the number of ACKs
484 sent within a TCP connection, thus reducing network overhead, but
485 they may increase the time until a sender receives an ACK. In
486 general, the usefulness of Delayed ACKs depends heavily on the usage
487 scenario (see Section 4.3.2).
There can be interactions with
488 single-MSS stacks.

490 When traffic is unidirectional, if the sender can send at most one
491 MSS of data or the receiver advertises a receive window not greater
492 than the MSS, Delayed ACKs may unnecessarily contribute delay (up to
493 500 ms) to the RTT [RFC5681], which limits the throughput and can
494 increase data delivery time. Note that, in some cases, it may not be
495 possible to disable Delayed ACKs. One known workaround is to split
496 the data to be sent into two segments of smaller size. A standard-
497 compliant TCP receiver may immediately acknowledge the second
498 segment, which can improve throughput. However, this 'split hack' may
499 not always work since a TCP receiver is required to acknowledge every
500 second full-sized segment, but not two consecutive small segments.
501 The overhead of sending two IP packets instead of one is another
502 downside of the 'split hack'.

504 Similar issues may happen when the sender uses the Nagle algorithm,
505 since the sender may need to wait for an unnecessarily delayed ACK to
506 send a new segment. Disabling the algorithm will have no impact if
507 the sender can only handle stop-and-wait operation at the TCP level.

509 For request-response traffic, when the receiver uses Delayed ACKs, a
510 response to a data message can piggyback an ACK, as long as the
511 response is sent before the Delayed ACK timer expires, thus avoiding
512 unnecessary ACKs without payload. Disabling Delayed ACKs at the
513 sender allows an immediate ACK for the data segment carrying the
514 response.

516 4.2.4. RTO calculation for single-MSS stacks

518 The Retransmission Timeout (RTO) calculation is one of the
519 fundamental TCP algorithms [RFC6298]. There is a fundamental trade-
520 off: A short, aggressive RTO behavior reduces wait time before
521 retransmissions, but it also increases the probability of spurious
522 timeouts.
The latter lead to an unnecessary waste of potentially scarce
523 resources in CNNs such as energy and bandwidth. In contrast, a
524 conservative timeout can result in long error recovery times and thus
525 needlessly delay data delivery.

527 If a TCP sender uses a very small window size, and it cannot benefit
528 from Fast Retransmit/Fast Recovery or SACK, the RTO algorithm has a
529 large impact on performance. In that case, RTO algorithm tuning may
530 be considered, although careful assessment of possible drawbacks is
531 recommended [I-D.ietf-tcpm-rto-consider].

533 As an example, adaptive RTO algorithms defined for CoAP over UDP have
534 been found to perform well in CNN scenarios [Commag]
535 [I-D.ietf-core-fasor].

537 4.3. General recommendations for TCP in CNNs

539 This section summarizes some widely used techniques to improve TCP,
540 with a focus on their use in CNNs. The TCP extensions discussed here
541 are useful in a wide range of network scenarios, including CNNs.
542 This section is not comprehensive. A comprehensive survey of TCP
543 extensions is published in [RFC7414].

545 4.3.1. Loss recovery and congestion/flow control

547 Devices that have enough memory to allow a larger (i.e. more than 3
548 MSS of data) TCP window size can leverage a more efficient loss
549 recovery than the timer-based approach used for smaller TCP window
550 sizes (see Section 4.2.1) by using Fast Retransmit and Fast
551 Recovery [RFC5681], at the expense of slightly greater complexity and
552 Transmission Control Block (TCB) size. Assuming that Delayed ACKs
553 are used by the receiver, a window size of up to 5 MSS is required
554 for Fast Retransmit and Fast Recovery to work efficiently: If in a
555 given TCP transmission of full-sized segments 1, 2, 3, 4, 5, and 6,
556 segment 2 gets lost, and the ACK for segment 1 is held by the Delayed
557 ACK timer, then the sender should get an ACK for segment 1 when 3
558 arrives and duplicate ACKs when segments 4, 5, and 6 arrive.
   It will retransmit segment 2 when the third duplicate ACK arrives.
   In order to have segments 2, 3, 4, 5, and 6 sent, the window has to
   be at least 5 MSS. With an MSS of 1220 bytes, a buffer of 5 MSS
   would require 6100 bytes.

   The example in the previous paragraph did not use a further TCP
   improvement such as Limited Transmit [RFC3042]. The latter may also
   be useful for any transfer that has more than one segment in flight.
   Small transfers tend to benefit more from Limited Transmit, because
   they are more likely to not receive enough duplicate ACKs. Assuming
   the example in the previous paragraph, Limited Transmit allows
   sending 5 MSS with a congestion window (cwnd) of 3 segments, plus two
   additional segments for the first two duplicate ACKs. With Limited
   Transmit, even a cwnd of 2 segments allows sending 5 MSS, at the
   expense of additional delay contributed by the Delayed ACK timer for
   the ACK that confirms segment 1.

   When a multiple-segment window is used, the receiver will need to
   manage the reception of possible out-of-order received segments,
   requiring sufficient buffer space.

4.3.1.1.  Selective Acknowledgments (SACK)

   If a device with less severe memory and processing constraints can
   afford advertising a TCP window size of several MSS, it makes sense
   to support the SACK option to improve performance. SACK allows a
   data receiver to inform the data sender of non-contiguous data blocks
   received, so that a sender (having previously sent the SACK-Permitted
   option) can avoid performing unnecessary retransmissions, saving
   energy and bandwidth, as well as reducing latency. In addition, SACK
   often allows for faster loss recovery when there is more than one
   lost segment in a window of data, since with SACK recovery may
   complete in fewer RTTs. SACK is particularly useful for bulk data
   transfers.
   A receiver supporting SACK will need to keep track of the
   SACK blocks that need to be received. The sender will also need to
   keep track of which data segments need to be resent after learning
   which data blocks are missing at the receiver. SACK adds 8*n+2 bytes
   to the TCP header, where n denotes the number of data blocks
   received, up to 4 blocks. For a low number of out-of-order segments,
   the header overhead penalty of SACK is compensated by avoiding
   unnecessary retransmissions. When the sender discovers the data
   blocks that have already been received, it needs to also store the
   necessary state to avoid unnecessary retransmission of data segments
   that have already been received.

4.3.2.  Delayed Acknowledgments

   For certain traffic patterns, Delayed ACKs may have a detrimental
   effect, as already noted in Section 4.2.3. Advanced TCP stacks may
   use heuristics to determine the maximum delay for an ACK. For CNNs,
   the recommendation depends on the expected communication patterns.

   When traffic over a CNN is expected to mostly be unidirectional
   messages with a size typically up to one MSS, and the time between
   two consecutive message transmissions is greater than the Delayed ACK
   timeout, it may make sense to use a small timeout or disable Delayed
   ACKs at the receiver. This avoids incurring additional delay, as
   well as the energy consumption of the sender (which might e.g. keep
   its radio interface in receive mode) during that time. Note that
   disabling Delayed ACKs may only be possible if the peer device is
   administered by the same entity managing the constrained device. For
   request-response traffic, enabling Delayed ACKs is recommended at the
   server end, in order to allow combining a response with the ACK into
   a single segment, thus increasing efficiency.
   In addition, if a
   client issues requests infrequently, disabling Delayed ACKs at the
   client allows an immediate ACK for the data segment carrying the
   response.

   In contrast, Delayed ACKs reduce the number of ACKs in bulk transfer
   types of traffic, e.g. for firmware/software updates or for
   transferring larger data units containing a batch of sensor readings.

   Note that, in many scenarios, the peer that a constrained device
   communicates with will be a general purpose system that communicates
   with both constrained and unconstrained devices. Since Delayed ACKs
   are often configured through system-wide parameters, Delayed ACK
   behavior at the peer will be the same regardless of the nature of the
   endpoints it talks to. Such a peer will typically have Delayed ACKs
   enabled.

4.3.3.  Initial Window

   RFC 5681 specifies a TCP Initial Window (IW) of roughly 4 kB
   [RFC5681]. Subsequently, RFC 6928 defined an experimental new value
   for the IW, which in practice will result in an IW of 10 MSS
   [RFC6928]. The latter is nowadays used in many TCP implementations.

   Note that a 10-MSS IW was recommended for resource-rich environments
   (e.g. broadband environments), which are significantly different from
   CNNs. In CNNs, many application layer data units are relatively
   small (e.g. below one MSS). However, larger objects (e.g. large
   files containing sensor readings, firmware updates, etc.) may also
   need to be transferred in CNNs. If such a large object is
   transferred in CNNs, with an IW setting of 10 MSS, there is
   significant buffer overflow risk. In order to avoid such problems,
   in CNNs the IW needs to be carefully set, based on device and network
   resource constraints. In many cases, a safe IW setting will be
   smaller than 10 MSS.

5.  TCP usage recommendations in CNNs

   This section discusses how TCP can be used by applications that are
   developed for CNN scenarios.
   These remarks are by and large
   independent of how exactly TCP is implemented.

5.1.  TCP connection initiation

   In the constrained device to unconstrained device scenario
   illustrated above, a TCP connection is typically initiated by the
   constrained device, in order for this device to support possible
   sleep periods to save energy.

5.2.  Number of concurrent connections

   TCP endpoints with a small amount of memory may only support a small
   number of connections. Each TCP connection requires storing a number
   of variables in the TCB. Depending on the internal TCP
   implementation, each connection may result in further memory
   overhead, and connections may compete for scarce resources (e.g.
   further memory overhead for send and receive buffers, etc.).

   A careful application design may try to keep the number of concurrent
   connections as small as possible. A client can for instance limit
   the number of simultaneous open connections that it maintains to a
   given server. Multiple connections could for instance be used to
   avoid the "head-of-line blocking" problem in an application transfer.
   However, in addition to consuming resources, using multiple
   connections can also cause undesirable side effects in congested
   networks. For example, the HTTP/1.1 specification encourages clients
   to be conservative when opening multiple connections [RFC7230].
   Furthermore, each new connection will start with a 3-way handshake,
   therefore increasing message overhead.

   Being conservative when opening multiple TCP connections is of
   particular importance in Constrained-Node Networks.

5.3.  TCP connection lifetime

   In order to minimize message overhead, it makes sense to keep a TCP
   connection open as long as the two TCP endpoints have more data to
   send.
   If applications exchange data rather infrequently, i.e., if
   TCP connections would stay idle for a long time, the idle time can
   result in problems. For instance, certain middleboxes such as
   firewalls or NAT devices are known to delete state records after an
   inactivity interval. RFC 5382 specifies a minimum value for such an
   interval of 124 minutes. Measurement studies have reported that TCP
   NAT binding timeouts are highly variable across devices, with a
   median around 60 minutes, the shortest timeout being around 2
   minutes, and more than 50% of the devices with a timeout shorter than
   the aforementioned minimum timeout of 124 minutes [HomeGateway]. The
   timeout duration used by a middlebox implementation may not be known
   to the TCP endpoints.

   In CNNs, such middleboxes may e.g. be present at the boundary between
   the CNN and other networks. If the middlebox can be optimized for
   CNN use cases, it makes sense to increase the initial value for
   filter state inactivity timers to avoid problems with idle
   connections. Apart from that, this problem can be dealt with by
   different connection handling strategies, each having pros and cons.

   One approach for infrequent data transfer is to use short-lived TCP
   connections. Instead of trying to maintain a TCP connection for a
   long time, short-lived connections can be opened between two
   endpoints and closed when no more data needs to be exchanged.
   For use cases that can cope with the additional messages and the
   latency resulting from starting new connections, it is recommended to
   use a sequence of short-lived connections, instead of maintaining a
   single long-lived connection.
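
   The short-lived connection pattern described above can be
   illustrated with a minimal Python sketch. It is only an
   illustration under stated assumptions: the function name
   send_report, the 10-second default timeout and the 1024-byte read
   size are hypothetical choices, and the peer is assumed to send its
   reply and then close the connection.

```python
import socket


def send_report(address, payload, timeout_s=10.0):
    """Open a TCP connection, send one payload, read the reply, close.

    Hypothetical helper: each call pays for a full three-way handshake
    and connection teardown, so no idle connection state has to survive
    in middleboxes between two reports.
    """
    with socket.create_connection(address, timeout=timeout_s) as conn:
        conn.sendall(payload)
        # Half-close: tell the peer that no more data will follow,
        # while still allowing its reply to be received.
        conn.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            chunk = conn.recv(1024)
            if not chunk:  # the peer has closed its side
                break
            chunks.append(chunk)
    # Leaving the "with" block closes the socket.
    return b"".join(chunks)
```

   An application would call such a helper once per report, e.g. once
   per sensor reading, and keep no TCP state between two calls.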

   The message and latency overhead that stems from using a sequence of
   short-lived connections could be reduced by TCP Fast Open (TFO)
   [RFC7413], which is an experimental TCP extension, at the expense of
   increased implementation complexity and increased TCB size. TFO
   allows data to be carried in SYN (and SYN-ACK) segments, and to be
   consumed immediately by the receiving endpoint.
   This reduces the message and latency overhead compared to the
   traditional three-way handshake to establish a TCP connection. For
   security reasons, the connection initiator has to request a TFO
   cookie from the other endpoint. The cookie, with a size of 4 or 16
   bytes, is then included in SYN packets of subsequent connections.
   The cookie needs to be refreshed (and obtained by the client) after a
   certain amount of time. Nevertheless, TFO is more efficient than
   frequently opening new TCP connections with the traditional three-way
   handshake, as long as the cookie can be reused in subsequent
   connections. However, as stated in RFC 7413, TFO deviates from the
   standard TCP semantics, since the data in the SYN could be replayed
   to an application in some rare circumstances. Applications should
   not use TFO unless they can tolerate this issue, e.g., by using
   Transport Layer Security (TLS) [RFC7413]. A comprehensive discussion
   on TFO can be found in RFC 7413.

   Another approach is to use long-lived TCP connections with
   application-layer heartbeat messages. Various application protocols
   support such heartbeat messages (e.g. CoAP over TCP [RFC8323]).
   Periodic application-layer heartbeats can prevent early filter state
   record deletion in middleboxes. If the TCP binding timeout for a
   middlebox to be traversed by a given connection is known, middlebox
   filter state deletion will be avoided if the heartbeat period is
   lower than the middlebox TCP binding timeout.
   Otherwise, the
   implementer needs to take into account that middlebox TCP binding
   timeouts fall in a wide range of possible values [HomeGateway], and
   it may be hard to find a proper heartbeat period for
   application-layer heartbeat messages.

   One specific advantage of heartbeat messages is that they also allow
   aliveness checks at the application level. In general, it makes
   sense to realize aliveness checks at the highest protocol layer
   possible that is meaningful to the application, in order to maximize
   the depth of the aliveness check. In addition, timely detection of a
   dead peer may allow savings in terms of TCB memory use. However, the
   transmission of heartbeat messages consumes resources. This aspect
   needs to be assessed carefully, considering the characteristics of
   each specific CNN.

   A TCP implementation may also be able to send "keep-alive" segments
   to test a TCP connection. According to [RFC1122], "keep-alives" are
   an optional TCP mechanism that is turned off by default, i.e., an
   application must explicitly enable it for a TCP connection. The
   interval between "keep-alive" messages must be configurable and it
   must default to no less than two hours. With this large timeout, TCP
   keep-alive messages might not always be useful to avoid deletion of
   filter state records in some middleboxes. However, sending TCP
   keep-alive probes more frequently risks draining power on
   energy-constrained devices.

6.  Security Considerations

   Best current practice for securing TCP and TCP-based communication
   also applies to CNNs. As an example, use of Transport Layer Security
   (TLS) is strongly recommended if it is applicable.

   There are also TCP options which can improve TCP security. One
   example is the TCP Authentication Option (TCP-AO) [RFC5925].
   However, this option adds overhead and complexity. TCP-AO typically
   has a size of 16-20 bytes.
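
   The header overhead of TCP-AO can be made concrete with a small
   Python sketch of the TCP option space budget. It is a rough
   illustration, not a normative calculation: it combines the 16-20
   byte TCP-AO size stated above with the 8*n+2 byte SACK option size
   from Section 4.3.1.1 and with the 40-byte limit on TCP options (a
   60-byte maximum TCP header minus the 20-byte fixed part). The
   function names are hypothetical, and padding as well as any other
   options (e.g. timestamps) are ignored.

```python
MAX_TCP_OPTION_BYTES = 40  # 60-byte maximum TCP header - 20 fixed bytes


def sack_option_bytes(n_blocks):
    """Size in bytes of a SACK option reporting n_blocks blocks (8*n+2)."""
    return 8 * n_blocks + 2


def max_sack_blocks(other_option_bytes):
    """Number of SACK blocks (at most 4) that fit next to other options.

    other_option_bytes is the space already used by other TCP options,
    e.g. 16 to 20 bytes for TCP-AO. Padding is ignored (assumption).
    """
    free = MAX_TCP_OPTION_BYTES - other_option_bytes
    return max(0, min(4, (free - 2) // 8))
```

   Under these assumptions, max_sack_blocks(0) yields 4 blocks, while
   a TCP-AO option of either 16 or 20 bytes leaves room for 2 blocks.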

   For the mechanisms discussed in this document, the corresponding
   considerations apply. For instance, if TFO is used, the security
   considerations of [RFC7413] apply.

   Constrained devices are expected to support smaller TCP window sizes
   than less limited devices. In such conditions, segment
   retransmission triggered by RTO expiration is expected to be
   relatively frequent, due to lack of (enough) duplicate ACKs,
   especially when a constrained device uses a single-MSS
   implementation. For this reason, constrained devices running TCP may
   appear as particularly appealing victims of the so-called "shrew"
   Denial of Service (DoS) attack [shrew], whereby one or more sources
   generate a packet spike targeted to coincide with consecutive
   RTO-expiration-triggered retry attempts of a victim node. Note that
   the attack may be performed by Internet-connected devices, including
   constrained devices in the same CNN as the victim, as well as remote
   ones. Mitigation techniques include RTO randomization and attack
   blocking by routers able to detect shrew attacks based on their
   traffic pattern.

7.  Acknowledgments

   Carles Gomez has been funded in part by the Spanish Government
   (Ministerio de Educacion, Cultura y Deporte) through the Jose
   Castillejo grants CAS15/00336 and CAS18/00170, and by European
   Regional Development Fund (ERDF) and the Spanish Government through
   projects TEC2016-79988-P, PID2019-106808RA-I00, AEI/FEDER, UE, and by
   Generalitat de Catalunya Grant 2017 SGR 376. Part of his
   contribution to this work has been carried out during his stays as a
   visiting scholar at the Computer Laboratory of the University of
   Cambridge.

   The authors appreciate the feedback received for this document.
   The
   following folks provided comments that helped improve the document:
   Carsten Bormann, Zhen Cao, Wei Genyu, Ari Keranen, Abhijan
   Bhattacharyya, Andres Arcia-Moret, Yoshifumi Nishida, Joe Touch, Fred
   Baker, Nik Sultana, Kerry Lynn, Erik Nordmark, Markku Kojo, Hannes
   Tschofenig, David Black, Ilpo Jarvinen, Emmanuel
   Baccelli, Stuart Cheshire, Gorry Fairhurst, Ingemar Johansson, and
   Ted Lemon. Simon Brummer provided details, and kindly performed RAM
   and ROM usage measurements, on the RIOT TCP implementation. Xavi
   Vilajosana provided details on the OpenWSN TCP implementation. Rahul
   Jadhav kindly performed code size measurements on the Contiki-NG and
   lwIP 2.1.2 TCP implementations. He also provided details on the uIP
   TCP implementation.

8.  Annex. TCP implementations for constrained devices

   This section overviews the main features of TCP implementations for
   constrained devices. The survey is limited to open source stacks
   with small footprint. It is not meant to be all-encompassing. For
   more powerful embedded systems (e.g., with 32-bit processors), there
   are further stacks that comprehensively implement TCP. On the other
   hand, please be aware that this Annex is based on information
   available as of this writing.

8.1.  uIP

   uIP is a TCP/IP stack, targeted for 8- and 16-bit microcontrollers,
   which pioneered TCP/IP implementations for constrained devices. uIP
   has been deployed with Contiki and the Arduino Ethernet shield. A
   code size of ~5 kB (which comprises checksumming, IPv4, ICMP and TCP)
   has been reported for uIP [Dunk]. Later versions of uIP implement
   IPv6 as well.

   uIP uses the same global buffer for both incoming and outgoing
   traffic, which has a size of a single packet. In case of a
   retransmission, an application must be able to reproduce the same
   user data that had been transmitted.
   Multiple connections are
   supported, but need to share the global buffer.

   The MSS is announced via the MSS option on connection establishment
   and the receive window size (of one MSS) is not modified during a
   connection. Stop-and-wait operation is used for sending data. Among
   other optimizations, this avoids sliding window operations, which
   use 32-bit arithmetic extensively and are expensive on 8-bit CPUs.

   Contiki uses the "split hack" technique (see Section 4.2.3) to avoid
   Delayed ACKs for senders using a single segment.

   The code size of the TCP implementation in Contiki-NG has been
   measured to be 3.2 kB on CC2538DK, cross-compiling on Linux.

8.2.  lwIP

   lwIP is a TCP/IP stack, targeted for 8- and 16-bit microcontrollers.
   lwIP has a total code size of ~14 kB to ~22 kB (which comprises
   memory management, checksumming, network interfaces, IPv4, ICMP and
   TCP), and a TCP code size of ~9 kB to ~14 kB [Dunk]. Both IPv4 and
   IPv6 are supported in lwIP since v2.0.0.

   In contrast with uIP, lwIP decouples applications from the network
   stack. lwIP supports a TCP transmission window greater than a single
   segment, as well as buffering of incoming and outgoing data. Other
   implemented mechanisms comprise slow start, congestion avoidance,
   fast retransmit and fast recovery. SACK and Window Scale support
   have recently been added to lwIP.

8.3.  RIOT

   The RIOT TCP implementation (called GNRC TCP) has been designed for
   Class 1 devices [RFC7228]. The main target platforms are 8- and
   16-bit microcontrollers, with 32-bit platforms also supported. GNRC
   TCP offers a function set similar to that of uIP, but it provides
   and maintains an independent receive buffer for each connection. In
   contrast to uIP, retransmission is also handled by GNRC TCP. For
   simplicity, GNRC TCP uses a single-MSS implementation.
   The
   application programmer does not need to know anything about the TCP
   internals, therefore GNRC TCP can be seen as a user-friendly uIP TCP
   implementation.

   The MSS is set on connection establishment and cannot be changed
   during connection lifetime. GNRC TCP allows multiple connections in
   parallel, but each TCB must be allocated somewhere in the system. By
   default there is only enough memory allocated for a single TCP
   connection, but it can be increased at compile time if the user needs
   multiple parallel connections.

   The RIOT TCP implementation offers an optional POSIX socket wrapper
   that enables POSIX compliance, if needed.

   Further details on RIOT and GNRC can be found in the literature
   [RIOT], [GNRC].

8.4.  TinyOS

   TinyOS was important as a platform for early constrained devices.
   TinyOS has an experimental TCP stack that uses a simple nonblocking
   library-based implementation of TCP, which provides a subset of the
   socket interface primitives. The application is responsible for
   buffering. The TCP library does not do any receive-side buffering.
   Instead, it will immediately dispatch new, in-order data to the
   application and otherwise drop the segment. A send buffer is
   provided by the application. Multiple TCP connections are possible.
   Recently there has been little further work on the stack.

8.5.  FreeRTOS

   FreeRTOS is a real-time operating system kernel for embedded devices
   that is supported by 16- and 32-bit microprocessors. Its TCP
   implementation is based on a multiple-segment window size, although
   a 'Tiny-TCP' option, which is a single-MSS variant, can be enabled.
   Delayed ACKs are supported, with a 20-ms Delayed ACK timer as a
   technique intended 'to gain performance'.

8.6.  uC/OS

   uC/OS is a real-time operating system kernel for embedded devices,
   which is maintained by Micrium. uC/OS is intended for 8-, 16- and
   32-bit microprocessors.
   The uC/OS TCP implementation supports a
   multiple-segment window size.

8.7.  Summary

                        +---+---------+--------+----+------+--------+-----+
                        |uIP|lwIP orig|lwIP 2.1|RIOT|TinyOS|FreeRTOS|uC/OS|
   +------+-------------+---+---------+--------+----+------+--------+-----+
   |Memory|Code size(kB)| <5|~9 to ~14|   38   | <7 | N/A  |  <9.2  | N/A |
   |      |             |(a)|  (T1)   |  (T4)  |(T3)|      |  (T2)  |     |
   +------+-------------+---+---------+--------+----+------+--------+-----+
   |      | Single-Segm.|Yes|   No    |   No   | Yes|  No  |   No   |  No |
   |      +-------------+---+---------+--------+----+------+--------+-----+
   |      | Slow start  | No|   Yes   |  Yes   | No | Yes  |   No   | Yes |
   |  T   +-------------+---+---------+--------+----+------+--------+-----+
   |  C   |Fast rec/retx| No|   Yes   |  Yes   | No | Yes  |   No   | Yes |
   |  P   +-------------+---+---------+--------+----+------+--------+-----+
   |      | Keep-alive  | No|   No    |  Yes   | No |  No  |  Yes   | Yes |
   |      +-------------+---+---------+--------+----+------+--------+-----+
   |  f   | Win. Scale  | No|   No    |  Yes   | No |  No  |  Yes   |  No |
   |  e   +-------------+---+---------+--------+----+------+--------+-----+
   |  a   | TCP timest. | No|   No    |  Yes   | No |  No  |  Yes   |  No |
   |  t   +-------------+---+---------+--------+----+------+--------+-----+
   |  u   | SACK        | No|   No    |  Yes   | No |  No  |  Yes   |  No |
   |  r   +-------------+---+---------+--------+----+------+--------+-----+
   |  e   | Del. ACKs   | No|   Yes   |  Yes   | No |  No  |  Yes   | Yes |
   |  s   +-------------+---+---------+--------+----+------+--------+-----+
   |      | Socket      | No|   No    |Optional|(I) |Subset|  Yes   | Yes |
   |      +-------------+---+---------+--------+----+------+--------+-----+
   |      |Concur. Conn.|Yes|   Yes   |  Yes   | Yes| Yes  |  Yes   | Yes |
   +------+-------------+---+---------+--------+----+------+--------+-----+
   |   TLS supported    | No|   No    |  Yes   | Yes| Yes  |  Yes   | Yes |
   +--------------------+---+---------+--------+----+------+--------+-----+

   (T1) = TCP-only, on x86 and AVR platforms
   (T2) = TCP-only, on ARM Cortex-M platform
   (T3) = TCP-only, on ARM Cortex-M0+ platform (NOTE: RAM usage for
          the same platform is ~2.5 kB for one TCP connection plus
          ~1.2 kB for each additional connection)
   (T4) = TCP-only, on CC2538DK, cross-compiling on Linux
   (a)  = includes IP, ICMP and TCP on x86 and AVR platforms. The
          Contiki-NG TCP implementation has a code size of 3.2 kB on
          CC2538DK, cross-compiling on Linux
   (I)  = optional POSIX socket wrapper which enables POSIX compliance
          if needed
   Mult. = Multiple
   N/A  = Not Available

   Figure 2: Summary of TCP features for different lightweight TCP
   implementations. None of the implementations considered in this
   Annex support ECN or TFO.

9.  Annex. Changes compared to previous versions

   RFC Editor: To be removed prior to publication

9.1.  Changes between -00 and -01

   o  Changed title and abstract

   o  Clarification that communication with standard-compliant TCP
      endpoints is required, based on feedback from Joe Touch

   o  Additional discussion on communication patterns

   o  Numerous changes to address a comprehensive review from Hannes
      Tschofenig

   o  Reworded security considerations

   o  Additional references and better distinction between normative and
      informative entries

   o  Feedback from Rahul Jadhav on the uIP TCP implementation

   o  Basic data for the TinyOS TCP implementation added, based on
      source code analysis

9.2.  Changes between -01 and -02

   o  Added text to the Introduction section, and a reference, on
      traditional bad perception of TCP for IoT

   o  Added sections on FreeRTOS and uC/OS

   o  Updated TinyOS section

   o  Updated summary table

   o  Reorganized Section 4 (single-MSS vs multiple-MSS window size),
      some content now also in new Section 5

9.3.  Changes between -02 and -03

   o  Rewording to better explain the benefit of ECN

   o  Additional context information on the surveyed implementations

   o  Added details, but removed "Data size" row, in the summary table

   o  Added discussion on shrew attacks

9.4.  Changes between -03 and -04

   o  Addressing the remaining TODOs

   o  Alignment of the wording on TCP "keep-alives" with related
      discussions in the IETF transport area

   o  Added further discussion on delayed ACKs

   o  Removed OpenWSN subsection from the Annex

9.5.  Changes between -04 and -05

   o  Addressing comments by Yoshifumi Nishida

   o  Removed mentioning MD5 as an example (comment by David Black)

   o  Added memory footprint details of TCP implementations (Contiki-NG
      and lwIP 2.1.2) provided by Rahul Jadhav in the Annex

   o  Addressed comments by Ilpo Jarvinen throughout the whole document

   o  Improved the RIOT section in the Annex, based on feedback from
      Emmanuel Baccelli

9.6.  Changes between -05 and -06

   o  Incorporated suggestions by Stuart Cheshire

9.7.  Changes between -06 and -07

   o  Addressed comments by Gorry Fairhurst

9.8.  Changes between -07 and -08

   o  Addressed WGLC comments by Ilpo Jarvinen, Markku Kojo and Ingemar
      Johansson throughout the document, including the addition of a new
      subsection on Initial Window considerations.

9.9.  Changes between -08 and -09

   o  Addressed second round of comments by Ilpo Jarvinen and Markku
      Kojo, based on the previous draft update.

9.10.  Changes between -09 and -10

   o  Addressed comments by Erik Kline.

   o  Addressed a comment by Markku Kojo on advice given in RFC 6691.

10.  References

10.1.  Normative References

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, DOI 10.17487/RFC0793, September 1981.

   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122,
              DOI 10.17487/RFC1122, October 1989.

   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
              Selective Acknowledgment Options", RFC 2018,
              DOI 10.17487/RFC2018, October 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
              December 1998.

   [RFC3042]  Allman, M., Balakrishnan, H., and S. Floyd, "Enhancing
              TCP's Loss Recovery Using Limited Transmit", RFC 3042,
              DOI 10.17487/RFC3042, January 2001.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

   [RFC3819]  Karn, P., Ed., Bormann, C., Fairhurst, G., Grossman, D.,
              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
              RFC 3819, DOI 10.17487/RFC3819, July 2004.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
              June 2010.

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298,
              DOI 10.17487/RFC6298, June 2011.

   [RFC6691]  Borman, D., "TCP Options and Maximum Segment Size (MSS)",
              RFC 6691, DOI 10.17487/RFC6691, July 2012.

   [RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
              "Increasing TCP's Initial Window", RFC 6928,
              DOI 10.17487/RFC6928, April 2013.

   [RFC7228]  Bormann, C., Ersue, M., and A. Keranen, "Terminology for
              Constrained-Node Networks", RFC 7228,
              DOI 10.17487/RFC7228, May 2014.

   [RFC7323]  Borman, D., Braden, B., Jacobson, V., and R.
              Scheffenegger, Ed., "TCP Extensions for High Performance",
              RFC 7323, DOI 10.17487/RFC7323, September 2014.

   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014.

10.2.  Informative References

   [Commag]   A. Betzler, C. Gomez, I. Demirkol, J. Paradells, "CoAP
              Congestion Control for the Internet of Things", IEEE
              Communications Magazine, June 2016.

   [Dunk]     A. Dunkels, "Full TCP/IP for 8-Bit Architectures", 2003.

   [ETEN]     R. Krishnan et al, "Explicit transport error notification
              (ETEN) for error-prone wireless and satellite networks",
              Computer Networks 2004.

   [GNRC]     M. Lenders et al., "Connecting the World of Embedded
              Mobiles: The RIOT Approach to Ubiquitous Networking for
              the IoT", 2018.

   [HomeGateway]
              Haetoenen, S., Nyrhinen, A., Eggert, L., Strowes, S.,
              Sarolahti, P., and M. Kojo, "An Experimental Study of Home
              Gateway Characteristics", Proceedings of the 10th ACM
              SIGCOMM conference on Internet measurement 2010.

   [I-D.delcarpio-6lo-wlanah]
              Vega, L., Robles, I., and R. Morabito, "IPv6 over
              802.11ah", draft-delcarpio-6lo-wlanah-01 (work in
              progress), October 2015.

   [I-D.ietf-6lo-fragment-recovery]
              Thubert, P., "6LoWPAN Selective Fragment Recovery",
              draft-ietf-6lo-fragment-recovery-21 (work in progress),
              March 2020.

   [I-D.ietf-core-fasor]
              Jarvinen, I., Kojo, M., Raitahila, I., and Z. Cao,
              "Fast-Slow Retransmission Timeout and Congestion Control
              Algorithm for CoAP", draft-ietf-core-fasor-00 (work in
              progress), March 2020.

   [I-D.ietf-tcpm-rto-consider]
              Allman, M., "Requirements for Time-Based Loss Detection",
              draft-ietf-tcpm-rto-consider-17 (work in progress), July
              2020.

   [IntComp]  C. Gomez, A. Arcia-Moret, J. Crowcroft, "TCP in the
              Internet of Things: from ostracism to prominence", IEEE
              Internet Computing, January-February 2018.

   [RFC2757]  Montenegro, G., Dawkins, S., Kojo, M., Magret, V., and N.
              Vaidya, "Long Thin Networks", RFC 2757,
              DOI 10.17487/RFC2757, January 2000.

   [RFC2884]  Hadi Salim, J. and U. Ahmed, "Performance Evaluation of
              Explicit Congestion Notification (ECN) in IP Networks",
              RFC 2884, DOI 10.17487/RFC2884, July 2000.

   [RFC3481]  Inamura, H., Ed., Montenegro, G., Ed., Ludwig, R., Gurtov,
              A., and F. Khafizov, "TCP over Second (2.5G) and Third
              (3G) Generation Wireless Networks", BCP 71, RFC 3481,
              DOI 10.17487/RFC3481, February 2003.

   [RFC4944]  Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler,
              "Transmission of IPv6 Packets over IEEE 802.15.4
              Networks", RFC 4944, DOI 10.17487/RFC4944, September 2007.

   [RFC6077]  Papadimitriou, D., Ed., Welzl, M., Scharf, M., and B.
              Briscoe, "Open Research Issues in Internet Congestion
              Control", RFC 6077, DOI 10.17487/RFC6077, February 2011.

   [RFC6092]  Woodyatt, J., Ed., "Recommended Simple Security
              Capabilities in Customer Premises Equipment (CPE) for
              Providing Residential IPv6 Internet Service", RFC 6092,
              DOI 10.17487/RFC6092, January 2011.

   [RFC6120]  Saint-Andre, P., "Extensible Messaging and Presence
              Protocol (XMPP): Core", RFC 6120, DOI 10.17487/RFC6120,
              March 2011.

   [RFC6282]  Hui, J., Ed. and P. Thubert, "Compression Format for IPv6
              Datagrams over IEEE 802.15.4-Based Networks", RFC 6282,
              DOI 10.17487/RFC6282, September 2011.

   [RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
              Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
              JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
              Low-Power and Lossy Networks", RFC 6550,
              DOI 10.17487/RFC6550, March 2012.

   [RFC6606]  Kim, E., Kaspar, D., Gomez, C., and C. Bormann, "Problem
              Statement and Requirements for IPv6 over Low-Power
              Wireless Personal Area Network (6LoWPAN) Routing",
              RFC 6606, DOI 10.17487/RFC6606, May 2012.

   [RFC6775]  Shelby, Z., Ed., Chakrabarti, S., Nordmark, E., and C.
              Bormann, "Neighbor Discovery Optimization for IPv6 over
              Low-Power Wireless Personal Area Networks (6LoWPANs)",
              RFC 6775, DOI 10.17487/RFC6775, November 2012.

   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Message Syntax and Routing",
              RFC 7230, DOI 10.17487/RFC7230, June 2014.

   [RFC7252]  Shelby, Z., Hartke, K., and C. Bormann, "The Constrained
              Application Protocol (CoAP)", RFC 7252,
              DOI 10.17487/RFC7252, June 2014.

   [RFC7414]  Duke, M., Braden, R., Eddy, W., Blanton, E., and A.
              Zimmermann, "A Roadmap for Transmission Control Protocol
              (TCP) Specification Documents", RFC 7414,
              DOI 10.17487/RFC7414, February 2015.

   [RFC7428]  Brandt, A. and J. Buron, "Transmission of IPv6 Packets
              over ITU-T G.9959 Networks", RFC 7428,
              DOI 10.17487/RFC7428, February 2015.

   [RFC7540]  Belshe, M., Peon, R., and M.
              Thomson, Ed., "Hypertext Transfer Protocol Version 2
              (HTTP/2)", RFC 7540, DOI 10.17487/RFC7540, May 2015.

   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
              Recommendations Regarding Active Queue Management",
              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015.

   [RFC7668]  Nieminen, J., Savolainen, T., Isomaki, M., Patil, B.,
              Shelby, Z., and C. Gomez, "IPv6 over BLUETOOTH(R) Low
              Energy", RFC 7668, DOI 10.17487/RFC7668, October 2015.

   [RFC8087]  Fairhurst, G. and M. Welzl, "The Benefits of Using
              Explicit Congestion Notification (ECN)", RFC 8087,
              DOI 10.17487/RFC8087, March 2017.

   [RFC8105]  Mariager, P., Petersen, J., Ed., Shelby, Z., Van de Logt,
              M., and D. Barthel, "Transmission of IPv6 Packets over
              Digital Enhanced Cordless Telecommunications (DECT) Ultra
              Low Energy (ULE)", RFC 8105, DOI 10.17487/RFC8105, May
              2017.

   [RFC8163]  Lynn, K., Ed., Martocci, J., Neilson, C., and S.
              Donaldson, "Transmission of IPv6 over
              Master-Slave/Token-Passing (MS/TP) Networks", RFC 8163,
              DOI 10.17487/RFC8163, May 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

   [RFC8323]  Bormann, C., Lemay, S., Tschofenig, H., Hartke, K.,
              Silverajan, B., and B. Raymor, Ed., "CoAP (Constrained
              Application Protocol) over TCP, TLS, and WebSockets",
              RFC 8323, DOI 10.17487/RFC8323, February 2018.

   [RFC8352]  Gomez, C., Kovatsch, M., Tian, H., and Z. Cao, Ed.,
              "Energy-Efficient Features of Internet of Things
              Protocols", RFC 8352, DOI 10.17487/RFC8352, April 2018.

   [RFC8376]  Farrell, S., Ed., "Low-Power Wide Area Network (LPWAN)
              Overview", RFC 8376, DOI 10.17487/RFC8376, May 2018.

   [RIOT]     E.
              Baccelli et al., "RIOT: an Open Source Operating System
              for Low-end Embedded Devices in the IoT", 2018.

   [shrew]    A. Kuzmanovic, E. Knightly, "Low-Rate TCP-Targeted Denial
              of Service Attacks", SIGCOMM'03 2003.

Authors' Addresses

   Carles Gomez
   UPC
   C/Esteve Terradas, 7
   Castelldefels 08860
   Spain

   Email: carlesgo@entel.upc.edu


   Jon Crowcroft
   University of Cambridge
   JJ Thomson Avenue
   Cambridge, CB3 0FD
   United Kingdom

   Email: jon.crowcroft@cl.cam.ac.uk


   Michael Scharf
   Hochschule Esslingen
   Flandernstr. 101
   Esslingen 73732
   Germany

   Email: michael.scharf@hs-esslingen.de