LWIG Working Group                                              C. Gomez
Internet-Draft                                                       UPC
Intended status: Informational                              J. Crowcroft
Expires: May 3, 2021                             University of Cambridge
                                                               M. Scharf
                                                    Hochschule Esslingen
                                                        October 30, 2020

           TCP Usage Guidance in the Internet of Things (IoT)
            draft-ietf-lwig-tcp-constrained-node-networks-13

Abstract

   This document provides guidance on how to implement and use the
   Transmission Control Protocol (TCP) in Constrained-Node Networks
   (CNNs), which are a characteristic of the Internet of Things (IoT).
   Such environments require a lightweight TCP implementation and may
   not make use of optional functionality. This document explains a
   number of known and deployed techniques to simplify a TCP stack as
   well as corresponding tradeoffs. The objective is to help embedded
   developers with decisions on which TCP features to use.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 3, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Characteristics of CNNs relevant for TCP
     2.1.  Network and link properties
     2.2.  Usage scenarios
     2.3.  Communication and traffic patterns
   3.  TCP implementation and configuration in CNNs
     3.1.  Addressing path properties
       3.1.1.  Maximum Segment Size (MSS)
       3.1.2.  Explicit Congestion Notification (ECN)
       3.1.3.  Explicit loss notifications
     3.2.  TCP guidance for single-MSS stacks
       3.2.1.  Single-MSS stacks - benefits and issues
       3.2.2.  TCP options for single-MSS stacks
       3.2.3.  Delayed Acknowledgments for single-MSS stacks
       3.2.4.  RTO calculation for single-MSS stacks
     3.3.  General recommendations for TCP in CNNs
       3.3.1.  Loss recovery and congestion/flow control
         3.3.1.1.  Selective Acknowledgments (SACK)
       3.3.2.  Delayed Acknowledgments
       3.3.3.  Initial Window
   4.  TCP usage recommendations in CNNs
     4.1.  TCP connection initiation
     4.2.  Number of concurrent connections
     4.3.  TCP connection lifetime
   5.  Security Considerations
   6.  Acknowledgments
   7.  Annex. TCP implementations for constrained devices
     7.1.  uIP
     7.2.  lwIP
     7.3.  RIOT
     7.4.  TinyOS
     7.5.  FreeRTOS
     7.6.  uC/OS
     7.7.  Summary
   8.  Annex. Changes compared to previous versions
     8.1.  Changes between -00 and -01
     8.2.  Changes between -01 and -02
     8.3.  Changes between -02 and -03
     8.4.  Changes between -03 and -04
     8.5.  Changes between -04 and -05
     8.6.  Changes between -05 and -06
     8.7.  Changes between -06 and -07
     8.8.  Changes between -07 and -08
     8.9.  Changes between -08 and -09
     8.10. Changes between -09 and -10
     8.11. Changes between -10 and -11
     8.12. Changes between -11 and -12
     8.13. Changes between -12 and -13
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   The Internet Protocol suite is being used for connecting
   Constrained-Node Networks (CNNs) to the Internet, enabling the
   so-called Internet of Things (IoT) [RFC7228]. In order to meet the
   requirements that stem from CNNs, the IETF has produced a suite of
   new protocols specifically designed for such environments (see e.g.
   [RFC8352]). New IETF protocol stack components include the IPv6 over
   Low-power Wireless Personal Area Networks (6LoWPAN) adaptation layer
   [RFC4944][RFC6282][RFC6775], the IPv6 Routing Protocol for Low-power
   and lossy networks (RPL) [RFC6550], and the Constrained Application
   Protocol (CoAP) [RFC7252].

   As of this writing, the main transport layer protocols in IP-based
   IoT scenarios are UDP and TCP. TCP has been criticized, often
   unfairly, as a protocol that is unsuitable for the IoT. It is true
   that some TCP features, such as relatively long header size,
   unsuitability for multicast, and always-confirmed data delivery, are
   not optimal for IoT scenarios. However, many typical claims on TCP
   unsuitability for IoT (e.g. a high complexity, connection-oriented
   approach incompatibility with radio duty-cycling, and spurious
   congestion control activation in wireless links) are not valid, can
   be solved, or are also found in well accepted IoT end-to-end
   reliability mechanisms (see [IntComp] for a detailed analysis).

   At the application layer, CoAP was developed over UDP [RFC7252].
   However, the integration of some CoAP deployments with existing
   infrastructure is being challenged by middleboxes such as firewalls,
   which may limit and even block UDP-based communications. This is the
   main reason why a CoAP over TCP specification has been developed
   [RFC8323].

   Other application layer protocols not specifically designed for CNNs
   are also being considered for the IoT space. Some examples include
   HTTP/2 and even HTTP/1.1, both of which run over TCP by default
   [RFC7230] [RFC7540], and the Extensible Messaging and Presence
   Protocol (XMPP) [RFC6120]. TCP is also used by non-IETF
   application-layer protocols in the IoT space such as the Message
   Queuing Telemetry Transport (MQTT) [MQTT] and its lightweight
   variants.

   TCP is a sophisticated transport protocol that includes optional
   functionality (e.g. TCP options) that may improve performance in
   some environments. However, many optional TCP extensions require
   complex logic inside the TCP stack and increase the code size and the
   memory requirements. Many TCP extensions are not required for
   interoperability with other standard-compliant TCP endpoints. Given
   the limited resources on constrained devices, careful selection of
   optional TCP features can make an implementation more lightweight.

   This document provides guidance on how to implement and configure
   TCP in CNNs, as well as on how applications are advised to use TCP
   in such networks. The overarching goal is to offer simple measures
   to allow for lightweight TCP implementation and suitable operation in
   such environments. A TCP implementation following the guidance in
   this document is intended to be compatible with a TCP endpoint that
   is compliant with the TCP standards, albeit possibly with a lower
   performance. This implies that such a TCP client would always be
   able to connect with a standard-compliant TCP server, and a
   corresponding TCP server would always be able to connect with a
   standard-compliant TCP client.

   This document assumes that the reader is familiar with TCP. A
   comprehensive survey of the TCP standards can be found in [RFC7414].
   Similar guidance regarding the use of TCP in special environments has
   been published before, e.g., for cellular wireless networks
   [RFC3481].

2.  Characteristics of CNNs relevant for TCP

2.1.  Network and link properties

   CNNs are defined in [RFC7228] as networks whose characteristics are
   influenced by being composed of a significant portion of constrained
   nodes. The latter are characterized by significant limitations on
   processing, memory, and energy resources, among others [RFC7228].
   The first two dimensions pose constraints on the complexity and on
   the memory footprint of the protocols that constrained nodes can
   support. The third requires techniques to save energy, such as
   radio duty-cycling in wireless devices [RFC8352], as well as
   minimization of the number of messages transmitted/received (and
   their size).

   [RFC7228] lists typical network constraints in CNNs, including low
   achievable bitrate/throughput, high packet loss and high variability
   of packet loss, highly asymmetric link characteristics, severe
   penalties for using larger packets, limits on reachability over time,
   etc. CNNs may use wireless or wired technologies (e.g., Power Line
   Communication), and the transmission rates are typically low (e.g.
   below 1 Mbps).

   For use of TCP, one challenge is that not all technologies in CNNs
   may be aligned with typical Internet subnetwork design principles
   [RFC3819]. For instance, constrained nodes often use physical/link
   layer technologies that have been characterized as 'lossy', i.e.,
   exhibit a relatively high bit error rate. Dealing with corruption
   loss is one of the open issues in the Internet [RFC6077].

2.2.  Usage scenarios

   There are different deployment and usage scenarios for CNNs. Some
   CNNs follow the star topology, whereby one or several hosts are
   linked to a central device that acts as a router connecting the CNN
   to the Internet.
   Alternatively, CNNs may also follow the multihop topology [RFC6606].

   In constrained environments, there can be different types of devices
   [RFC7228]. For example, there can be devices with a single combined
   send/receive buffer, devices with a separate send and receive buffer,
   or devices with a pool of multiple send/receive buffers. In the
   latter case, it is possible that buffers are also shared with other
   protocols.

   One key use case for TCP in CNNs is a model where constrained devices
   connect to unconstrained servers in the Internet. But it is also
   possible that both TCP endpoints run on constrained devices. In the
   first case, communication possibly has to traverse a middlebox (e.g.
   a firewall, NAT, etc.). Figure 1 illustrates such a scenario. Note
   that the scenario is asymmetric, as the unconstrained device will
   typically not suffer the severe constraints of the constrained
   device. The unconstrained device is expected to be mains-powered, to
   have a large amount of memory and processing power, and to be
   connected to a resource-rich network.

   Assuming that a majority of constrained devices will correspond to
   sensor nodes, the amount of data traffic sent by constrained devices
   (e.g. sensor node measurements) is expected to be higher than the
   amount of data traffic in the opposite direction. Nevertheless,
   constrained devices may receive requests (to which they may respond),
   commands (for configuration purposes and for constrained devices
   including actuators) and relatively infrequent firmware/software
   updates.

                                                 +---------------+
  o    o      <-------- TCP communication -----> |               |
   o  o                                          |               |
  o    o                                         | Unconstrained |
   o  o          +-----------+                   |    device     |
  o  o  o ------ | Middlebox | -------           |               |
   o  o          +-----------+                   | (e.g. cloud)  |
  o  o  o                                        |               |
                                                 +---------------+
  constrained devices

   Figure 1: TCP communication between a constrained device and an
   unconstrained device, traversing a middlebox.

2.3.  Communication and traffic patterns

   IoT applications are characterized by a number of different
   communication patterns. The following non-comprehensive list
   explains some typical examples:

   o  Unidirectional transfers: An IoT device (e.g. a sensor) can
      repeatedly send updates to the other endpoint. There is not
      always a need for an application response back to the IoT device.

   o  Request-response patterns: An IoT device receives a request from
      the other endpoint, which triggers a response from the IoT device.

   o  Bulk data transfers: A typical example for a long file transfer
      would be an IoT device firmware update.

   A typical communication pattern is that a constrained device
   communicates with an unconstrained device (cf. Figure 1). But it is
   also possible that constrained devices communicate amongst
   themselves.

3.  TCP implementation and configuration in CNNs

   This section explains how a TCP stack can deal with typical
   constraints in CNNs. The guidance in this section relates to the TCP
   implementation and its configuration.

3.1.  Addressing path properties

3.1.1.  Maximum Segment Size (MSS)

   Assuming that IPv6 is used, and for the sake of lightweight
   implementation and operation, unless applications require handling
   large data units (i.e. leading to an IPv6 datagram size greater than
   1280 bytes), it may be desirable to limit the IP datagram size to
   1280 bytes in order to avoid the need to support Path MTU Discovery
   [RFC8201]. In addition, an IP datagram size of 1280 bytes avoids
   incurring IPv6-layer fragmentation [RFC8900].

   An IPv6 datagram size exceeding 1280 bytes can be avoided by setting
   the TCP MSS to no more than 1220 bytes.
   Note that it is already a requirement that TCP implementations
   consume payload space instead of increasing datagram size when
   including IP or TCP options in an IP packet to be sent [RFC6691].
   Therefore, it is not required to advertise an MSS smaller than 1220
   bytes in order to accommodate TCP options.

   Note that setting the MTU to 1280 bytes is possible for link layer
   technologies in the CNN space, even if some of them are characterized
   by a short data unit payload size, e.g. up to a few tens or hundreds
   of bytes. For example, the maximum frame size in IEEE 802.15.4 is
   127 bytes. 6LoWPAN defined an adaptation layer to support IPv6 over
   IEEE 802.15.4 networks. The adaptation layer includes a
   fragmentation mechanism, since IPv6 requires the layer below to
   support an MTU of 1280 bytes [RFC8200], while IEEE 802.15.4 lacked
   fragmentation mechanisms. 6LoWPAN defines an IEEE 802.15.4 link MTU
   of 1280 bytes [RFC4944]. Other technologies, such as Bluetooth LE
   [RFC7668], ITU-T G.9959 [RFC7428] or DECT-ULE [RFC8105], also use
   6LoWPAN-based adaptation layers in order to enable IPv6 support.
   These technologies do support link layer fragmentation. By
   exploiting this functionality, the adaptation layers that enable IPv6
   over such technologies also define an MTU of 1280 bytes.

   On the other hand, there exist technologies also used in the CNN
   space, such as Master-Slave/Token-Passing (MS/TP) [RFC8163],
   Narrowband IoT (NB-IoT) [RFC8376] or IEEE 802.11ah
   [I-D.delcarpio-6lo-wlanah], that do not suffer the same degree of
   frame size limitations as the technologies mentioned above. The MTU
   for MS/TP is recommended to be 1500 bytes [RFC8163], the MTU in
   NB-IoT is 1600 bytes, and the maximum frame payload size for IEEE
   802.11ah is 7991 bytes.
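
   As an illustration of the MSS arithmetic in this section, the
   following Python sketch derives an MSS from a link MTU. It assumes
   a 40-byte IPv6 header without extension headers and a 20-byte TCP
   header without options. The constant and function names are
   hypothetical and are not taken from any TCP stack.

```python
IPV6_MIN_MTU = 1280     # minimum MTU that IPv6 requires from the layer below
IPV6_HEADER_LEN = 40    # assumption: fixed IPv6 header, no extension headers
TCP_HEADER_LEN = 20     # assumption: TCP header without options


def mss_for_mtu(mtu: int = IPV6_MIN_MTU) -> int:
    """Largest MSS that keeps the IPv6 datagram within the given MTU."""
    return mtu - IPV6_HEADER_LEN - TCP_HEADER_LEN


def payload_per_segment(mss: int, tcp_options_len: int) -> int:
    """User data per segment when TCP options are present.

    Options consume payload space instead of enlarging the datagram,
    so the advertised MSS itself does not need to shrink.
    """
    return mss - tcp_options_len
```

   For the 1280-byte MTU this yields an MSS of 1220 bytes, and 1208
   bytes of user data per segment if 12 bytes of TCP options are
   carried. For the 1500-byte MTU recommended for MS/TP, the same
   calculation yields 1440 bytes.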

   Using a larger MSS (to a suitable extent) may be beneficial in some
   scenarios, especially when transferring large payloads, as it reduces
   the number of packets (and packet headers) required for a given
   payload. However, the characteristics of the constrained network
   need to be considered. In particular, in a lossy network where
   unreliable fragment delivery is used, the amount of data that TCP
   unnecessarily retransmits due to fragment loss increases (and
   throughput decreases) quickly with the MSS. This happens because the
   loss of a fragment leads to the loss of the whole fragmented packet
   being transmitted. Unnecessary data retransmission is particularly
   harmful in CNNs due to the resource constraints of such environments.
   Note that, while the original 6LoWPAN fragmentation mechanism
   [RFC4944] does not offer reliable fragment delivery, fragment
   recovery functionality for 6LoWPAN or 6Lo environments is being
   standardized as of this writing [I-D.ietf-6lo-fragment-recovery].

3.1.2.  Explicit Congestion Notification (ECN)

   Explicit Congestion Notification (ECN) [RFC3168] allows a router
   to signal in the IP header of a packet that congestion is arising,
   for example when a queue size reaches a certain threshold. An
   ECN-enabled TCP receiver will echo back the congestion signal to the
   TCP sender by setting a flag in its next TCP ACK. The sender
   triggers congestion control measures as if a packet loss had
   happened.

   The document [RFC8087] outlines the principal gains in terms of
   increased throughput, reduced delay, and other benefits when ECN is
   used over a network path that includes equipment that supports
   Congestion Experienced (CE) marking. In the context of CNNs, a
   remarkable feature of ECN is that congestion can be signalled without
   incurring packet drops (which will lead to retransmissions and
   consumption of limited resources such as energy and bandwidth).
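
   The echo behavior described above can be sketched as follows. This
   is a simplified, non-authoritative model of the receiver side only,
   using the IP ECN codepoints and TCP flag bits defined in [RFC3168].
   The class and method names are hypothetical, and the negotiation of
   ECN during connection setup is omitted.

```python
# Two-bit ECN field in the IP header
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11

# TCP header flag bits used by ECN
TCP_FLAG_ECE = 0x40   # ECN-Echo, set by the receiver
TCP_FLAG_CWR = 0x80   # Congestion Window Reduced, set by the sender


class EcnReceiver:
    """Hypothetical receiver-side echo logic (illustrative only)."""

    def __init__(self) -> None:
        self.echo_pending = False

    def on_segment(self, ip_ecn_bits: int, tcp_flags: int) -> None:
        # CWR from the sender confirms that it has reacted,
        # so the receiver stops echoing.
        if tcp_flags & TCP_FLAG_CWR:
            self.echo_pending = False
        # A CE mark (re)starts the echo, even on a segment carrying CWR.
        if ip_ecn_bits == CE:
            self.echo_pending = True

    def ack_flags(self) -> int:
        """ECN-related flag bits to set in the next outgoing ACK."""
        return TCP_FLAG_ECE if self.echo_pending else 0
```

   In this sketch, the receiver keeps setting ECE in its ACKs after a
   CE-marked packet arrives and stops once it sees a segment with CWR
   set, which is how the sender indicates that it has reduced its
   congestion window.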

   ECN can further reduce packet losses since congestion control
   measures can be applied earlier [RFC2884]. Fewer lost packets
   imply that the number of retransmitted segments decreases, which is
   particularly beneficial in CNNs, where energy and bandwidth resources
   are typically limited. Also, it makes sense to try to avoid packet
   drops for transactional workloads with small data sizes, which are
   typical for CNNs. In such traffic patterns, it is more difficult and
   often impossible to detect packet loss without retransmission
   timeouts (e.g., as there may be no three duplicate ACKs). Any
   retransmission timeout slows down the data transfer significantly.
   In addition, if the constrained device uses power saving techniques,
   a retransmission timeout will incur a wake-up action, in contrast to
   ACK-clock-triggered sending. When the congestion window of a TCP
   sender has a size of one segment and a TCP ACK with an ECN signal
   (ECE flag) arrives at the TCP sender, the TCP sender resets the
   retransmit timer, and the sender will only be able to send a new
   packet when the retransmit timer expires. Effectively, the TCP
   sender reduces at that moment its sending rate from 1 segment per
   Round Trip Time (RTT) to 1 segment per Retransmission Timeout (RTO)
   and reduces the sending rate further on each ECN signal received in
   subsequent TCP ACKs. Otherwise, if an ECN signal is not present in a
   subsequent TCP ACK, the TCP sender resumes the normal ACK-clocked
   transmission of segments [RFC3168].

   ECN can be incrementally deployed in the Internet. Guidance on
   configuration and usage of ECN is provided in [RFC7567]. Given the
   benefits, more and more TCP stacks in the Internet support ECN, and
   it specifically makes sense to leverage ECN in controlled
   environments such as CNNs.
   As of this writing, there is ongoing work to extend the types of TCP
   packets that are ECN-capable, including pure ACKs
   [I-D.ietf-tcpm-generalized-ecn]. Such a feature may further increase
   the benefits of ECN in CNN environments. Note, however, that
   supporting ECN increases implementation complexity.

3.1.3.  Explicit loss notifications

   There has been a significant body of research on solutions capable of
   explicitly indicating whether a TCP segment loss is due to
   corruption, in order to avoid activation of congestion control
   mechanisms [ETEN] [RFC2757]. While such solutions may provide
   significant improvement, they have not been widely deployed and
   remain experimental. In fact, as of today, the IETF has not
   standardized any such solution.

3.2.  TCP guidance for single-MSS stacks

   This section discusses TCP stacks that allow transferring a single
   MSS. More general guidance is provided in Section 3.3.

3.2.1.  Single-MSS stacks - benefits and issues

   A TCP stack can reduce the memory requirements by advertising a TCP
   window size of one MSS, and also transmit at most one MSS of
   unacknowledged data. In that case, both congestion and flow control
   implementation are quite simple. Such a small receive and send
   window may be sufficient for simple message exchanges in the CNN
   space. However, only using a window of one MSS can significantly
   affect performance. A stop-and-wait operation results in low
   throughput for transfers that exceed the length of one MSS, e.g., a
   firmware download. Furthermore, a single-MSS solution relies solely
   on timer-based loss recovery, therefore missing the performance gain
   of Fast Retransmit and Fast Recovery (which require a larger window
   size, see Section 3.3.1).
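
   The throughput limitation of stop-and-wait operation can be
   illustrated with a rough upper bound: at most one MSS is delivered
   per RTT. The following Python sketch computes this bound under the
   assumptions of no packet loss, full-sized segments, and no Delayed
   ACK penalty. The function names and the RTT values used below are
   illustrative assumptions, not measurements.

```python
def stop_and_wait_throughput_bps(mss_bytes: int, rtt_s: float) -> float:
    """Upper bound on throughput with one MSS in flight per RTT."""
    return mss_bytes * 8 / rtt_s


def transfer_time_s(total_bytes: int, mss_bytes: int, rtt_s: float) -> float:
    """Lower bound on transfer time: one round trip per segment."""
    segments = -(-total_bytes // mss_bytes)   # ceiling division
    return segments * rtt_s
```

   With an MSS of 1220 bytes and an assumed RTT of 1 s, the bound is
   9760 bit/s. A 100000-byte firmware image at an assumed RTT of
   0.5 s needs 82 round trips, i.e. at least 41 s.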

   If CoAP is used over TCP with the default setting for NSTART in
   [RFC7252], a CoAP endpoint is not allowed to send a new message to a
   destination until a response for the previous message sent to that
   destination has been received. This is equivalent to an
   application-layer window size of 1 data unit. For this use of CoAP,
   a maximum TCP window of one MSS may be sufficient, as long as the
   CoAP message size does not exceed one MSS. An exception in CoAP over
   TCP, though, is the Capabilities and Settings Message (CSM) that must
   be sent at the start of the TCP connection. The first application
   message carrying user data is allowed to be sent immediately after
   the CSM. If the sum of the CSM size plus the application message
   size exceeds the MSS, a sender using a single-MSS stack will need to
   wait for the ACK confirming the CSM before sending the application
   message.

3.2.2.  TCP options for single-MSS stacks

   A TCP implementation needs to support, at a minimum, TCP options 2, 1
   and 0. These are, respectively, the Maximum Segment Size (MSS)
   option, the No-Operation option, and the End Of Option List marker
   [RFC0793]. None of these are a substantial burden to support. These
   options are sufficient for interoperability with a standard-compliant
   TCP endpoint, although many TCP stacks support additional options and
   can negotiate their use. A TCP implementation is permitted to
   silently ignore all other TCP options.

   A TCP implementation for a constrained device that uses a single-MSS
   TCP receive or transmit window size may not benefit from supporting
   the following TCP options: Window scale [RFC7323], TCP Timestamps
   [RFC7323], Selective Acknowledgments (SACK) and SACK-Permitted
   [RFC2018]. Other TCP options may also not be required on a
   constrained device with a very lightweight implementation.
   With regard to the Window scale option, note that it is only useful
   if a window size greater than 64 kB is needed.

   Note that a TCP sender can benefit from the TCP Timestamps option
   [RFC7323] in detecting spurious RTOs. The latter are quite likely to
   occur in CNN scenarios due to a number of reasons (e.g. route changes
   in a multihop scenario, link layer retries, etc.). The header
   overhead incurred by the Timestamps option (of up to 12 bytes) needs
   to be taken into account.

3.2.3.  Delayed Acknowledgments for single-MSS stacks

   TCP Delayed Acknowledgments are meant to reduce the number of ACKs
   sent within a TCP connection, thus reducing network overhead, but
   they may increase the time until a sender may receive an ACK. In
   general, usefulness of Delayed ACKs depends heavily on the usage
   scenario (see Section 3.3.2). There can be interactions with
   single-MSS stacks.

   When traffic is unidirectional, if the sender can send at most one
   MSS of data or the receiver advertises a receive window not greater
   than the MSS, Delayed ACKs may unnecessarily contribute delay (up to
   500 ms) to the RTT [RFC5681], which limits the throughput and can
   increase data delivery time. Note that, in some cases, it may not be
   possible to disable Delayed ACKs. One known workaround is to split
   the data to be sent into two segments of smaller size. A
   standard-compliant TCP receiver may immediately acknowledge the
   second segment, which can improve throughput. However, this
   'split hack' may not always work since a TCP receiver is required to
   acknowledge every second full-sized segment, but not two consecutive
   small segments. The overhead of sending two IP packets instead of
   one is another downside of the 'split hack'.

   Similar issues may happen when the sender uses the Nagle algorithm,
   since the sender may need to wait for an unnecessarily delayed ACK to
   send a new segment.
   Disabling the algorithm will have no impact if the sender can only
   handle stop-and-wait operation at the TCP level.

   For request-response traffic, when the receiver uses Delayed ACKs, a
   response to a data message can piggyback an ACK, as long as the
   latter is sent before the Delayed ACK timer expires, thus avoiding
   unnecessary ACKs without payload. Disabling Delayed ACKs at the
   request sender allows an immediate ACK for the data segment carrying
   the response.

3.2.4.  RTO calculation for single-MSS stacks

   The RTO calculation is one of the fundamental TCP algorithms
   [RFC6298]. There is a fundamental trade-off: A short, aggressive RTO
   behavior reduces wait time before retransmissions, but it also
   increases the probability of spurious timeouts. Spurious timeouts
   waste potentially scarce resources in CNNs such as energy and
   bandwidth. In contrast, a conservative timeout can result in long
   error recovery times and thus needlessly delay data delivery.

   If a TCP sender uses a very small window size, and it cannot benefit
   from Fast Retransmit/Fast Recovery or SACK, the RTO algorithm has a
   large impact on performance. In that case, RTO algorithm tuning may
   be considered, although careful assessment of possible drawbacks is
   recommended [I-D.ietf-tcpm-rto-consider].

   As an example, adaptive RTO algorithms defined for CoAP over UDP have
   been found to perform well in CNN scenarios [Commag]
   [I-D.ietf-core-fasor].

3.3.  General recommendations for TCP in CNNs

   This section summarizes some widely used techniques to improve TCP,
   with a focus on their use in CNNs. The TCP extensions discussed here
   are useful in a wide range of network scenarios, including CNNs.
   This section is not comprehensive. A comprehensive survey of TCP
   extensions is published in [RFC7414].

3.3.1.  Loss recovery and congestion/flow control

   Devices that have enough memory to allow a larger (i.e. more than 3
   MSS of data) TCP window size can leverage a more efficient loss
   recovery than the timer-based approach used for smaller TCP window
   size (see Section 3.2.1) by using Fast Retransmit and Fast Recovery
   [RFC5681], at the expense of slightly greater complexity and
   Transmission Control Block (TCB) size. Assuming that Delayed ACKs
   are used by the receiver, a window size of at least 5 MSS is required
   for Fast Retransmit and Fast Recovery to work efficiently: If in a
   given TCP transmission of full-sized segments 1, 2, 3, 4, 5, and 6,
   segment 2 gets lost, and the ACK for segment 1 is held by the Delayed
   ACK timer, then the sender should get an ACK for segment 1 when
   segment 3 arrives and duplicate ACKs when segments 4, 5, and 6
   arrive. It will retransmit segment 2 when the third duplicate ACK
   arrives. In order to have segments 2, 3, 4, 5, and 6 sent, the
   window has to be of at least 5 MSS. With an MSS of 1220 bytes, a
   buffer of a size of 5 MSS would require 6100 bytes.

   The example in the previous paragraph did not use a further TCP
   improvement such as Limited Transmit [RFC3042]. The latter may also
   be useful for any transfer that has more than one segment in flight.
   Small transfers tend to benefit more from Limited Transmit, because
   they are more likely to not receive enough duplicate ACKs. Assuming
   the example in the previous paragraph, Limited Transmit allows
   sending 5 MSS with a congestion window (cwnd) of 3 segments, plus two
   additional segments for the first two duplicate ACKs. With Limited
   Transmit, even a cwnd of 2 segments allows sending 5 MSS, at the
   expense of additional delay contributed by the Delayed ACK timer for
   the ACK that confirms segment 1.
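
   The Fast Retransmit example in this section (segment 2 lost,
   segments up to 6 sent, Delayed ACKs at the receiver) can be traced
   with the following Python sketch. It is a deliberately simplified
   model: segments are identified by index instead of byte sequence
   numbers, the Delayed ACK timer is assumed not to expire during the
   burst, and out-of-order data is not buffered. The function names
   are hypothetical.

```python
from typing import Iterable, List, Set


def receiver_acks(sent: Iterable[int], lost: Set[int]) -> List[int]:
    """Cumulative ACKs (next expected segment) emitted by the receiver."""
    next_expected = 1
    ack_delayed = False
    acks: List[int] = []
    for seg in sent:
        if seg in lost:
            continue
        if seg == next_expected:
            next_expected += 1
            if ack_delayed:
                # Every second in-order full-sized segment is ACKed.
                acks.append(next_expected)
                ack_delayed = False
            else:
                ack_delayed = True
        else:
            # An out-of-order segment triggers an immediate ACK.
            acks.append(next_expected)
            ack_delayed = False
    return acks


def fast_retransmit_triggered(acks: List[int], threshold: int = 3) -> bool:
    """True once `threshold` duplicate ACKs for the same value are seen."""
    last, dups = None, 0
    for ack in acks:
        if ack == last:
            dups += 1
            if dups >= threshold:
                return True
        else:
            last, dups = ack, 0
    return False
```

   For segments 1 to 6 with segment 2 lost, the receiver emits four
   ACKs that all request segment 2: the first one acknowledges
   segment 1, and the following three are duplicates, which triggers
   Fast Retransmit. If only segments 1 to 5 are sent, the sender sees
   just two duplicate ACKs and has to rely on the retransmission
   timer (or on Limited Transmit).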
562 When a multiple-segment window is used, the receiver will need to 563 manage the reception of possible out-of-order received segments, 564 requiring sufficient buffer space. Note that even when a 1-MSS 565 window is used, out-of-order arrival should also be managed, as the 566 sender may send multiple sub-MSS packets that fit in the window. (On 567 the other hand, the receiver is free to simply drop out-of-order 568 segments, thus forcing retransmissions.) 570 3.3.1.1. Selective Acknowledgments (SACK) 572 If a device with less severe memory and processing constraints can 573 afford advertising a TCP window size of several MSS, it makes sense 574 to support the SACK option to improve performance. SACK allows a 575 data receiver to inform the data sender of non-contiguous data blocks 576 received, so that a sender (having previously sent the SACK-Permitted 577 option) can avoid performing unnecessary retransmissions, saving 578 energy and bandwidth, as well as reducing latency. In addition, SACK 579 often allows for faster loss recovery when there is more than one 580 lost segment in a window of data, since SACK recovery may complete 581 with fewer RTTs. SACK is particularly useful for bulk data transfers. 582 A receiver supporting SACK will need to keep track of the data blocks 583 that need to be received. The sender will also need to keep track of 584 which data segments need to be resent after learning which data 585 blocks are missing at the receiver. SACK adds 8*n+2 bytes to the TCP 586 header, where n denotes the number of data blocks received, up to 4 587 blocks. For a low number of out-of-order segments, the header 588 overhead penalty of SACK is compensated by avoiding unnecessary 589 retransmissions. When the sender discovers the data blocks that have 590 already been received, it also needs to store the necessary state to 591 avoid unnecessary retransmission of data segments that have already 592 been received. 594 3.3.2.
Delayed Acknowledgments 596 For certain traffic patterns, Delayed ACKs may have a detrimental 597 effect, as already noted in Section 3.2.3. Advanced TCP stacks may 598 use heuristics to determine the maximum delay for an ACK. For CNNs, 599 the recommendation depends on the expected communication patterns. 601 When traffic over a CNN is expected to mostly be unidirectional 602 messages with a size typically up to one MSS, and the time between 603 two consecutive message transmissions is greater than the Delayed ACK 604 timeout, it may make sense to use a smaller timeout or disable 605 Delayed ACKs at the receiver. This avoids incurring additional 606 delay, as well as the energy consumption of the sender (which might, 607 e.g., keep its radio interface in receive mode) during that time. 608 Note that disabling Delayed ACKs may only be possible if the peer 609 device is administered by the same entity managing the constrained 610 device. For request-response traffic, enabling Delayed ACKs is 611 recommended at the server end, in order to allow combining a response 612 with the ACK into a single segment, thus increasing efficiency. In 613 addition, if a client issues requests infrequently, disabling Delayed 614 ACKs at the client allows an immediate ACK for the data segment 615 carrying the response. 617 In contrast, Delayed ACKs reduce the number of ACKs in bulk-transfer 618 traffic, e.g., for firmware/software updates or for 619 transferring larger data units containing a batch of sensor readings. 621 Note that, in many scenarios, the peer that a constrained device 622 communicates with will be a general-purpose system that communicates 623 with both constrained and unconstrained devices. Since Delayed ACKs 624 are often configured through system-wide parameters, Delayed ACK 625 behavior at the peer will be the same regardless of the nature of the 626 endpoints it talks to. Such a peer will typically have Delayed ACKs 627 enabled. 629 3.3.3.
Initial Window 631 RFC 5681 specifies a TCP Initial Window (IW) of roughly 4 kB 632 [RFC5681]. Subsequently, RFC 6928 defined an experimental new value 633 for the IW, which in practice will result in an IW of 10 MSS 634 [RFC6928]. The latter is nowadays used in many TCP implementations. 636 Note that a 10-MSS IW was recommended for resource-rich environments 637 (e.g., broadband environments), which are significantly different from 638 CNNs. In CNNs, many application layer data units are relatively 639 small (e.g., below one MSS). However, larger objects (e.g., large 640 files containing sensor readings, firmware updates, etc.) may also 641 need to be transferred in CNNs. If such a large object is 642 transferred in a CNN with an IW setting of 10 MSS, there is a 643 significant risk of buffer overflow, since many CNN devices support 644 network or radio buffers of a size smaller than 10 MSS. In order to 645 avoid such problems, in CNNs the IW needs to be carefully set, based 646 on device and network resource constraints. In many cases, a safe IW 647 setting will be smaller than 10 MSS. 649 4. TCP usage recommendations in CNNs 651 This section discusses how TCP can be used by applications that are 652 developed for CNN scenarios. These remarks are by and large 653 independent of how TCP is exactly implemented. 655 4.1. TCP connection initiation 657 In the constrained device to unconstrained device scenario 658 illustrated above, a TCP connection is typically initiated by the 659 constrained device, in order for this device to support possible 660 sleep periods to save energy. 662 4.2. Number of concurrent connections 664 TCP endpoints with a small amount of memory may only support a small 665 number of connections. Each TCP connection requires storing a number 666 of variables in the TCB. Depending on the internal TCP 667 implementation, each connection may result in further memory 668 overhead, and connections may compete for scarce resources (e.g.
669 further memory overhead for send and receive buffers, etc). 671 A careful application design may try to keep the number of concurrent 672 connections as small as possible. A client can for instance limit 673 the number of simultaneous open connections that it maintains to a 674 given server. Multiple connections could for instance be used to 675 avoid the "head-of-line blocking" problem in an application transfer. 676 However, in addition to consuming resources, using multiple 677 connections can also cause undesirable side effects in congested 678 networks. For example, the HTTP/1.1 specification encourages clients 679 to be conservative when opening multiple connections [RFC7230]. 680 Furthermore, each new connection will start with a 3-way handshake, 681 therefore increasing message overhead. 683 Being conservative when opening multiple TCP connections is of 684 particular importance in Constrained-Node Networks. 686 4.3. TCP connection lifetime 688 In order to minimize message overhead, it makes sense to keep a TCP 689 connection open as long as the two TCP endpoints have more data to 690 send. If applications exchange data rather infrequently, i.e., if 691 TCP connections would stay idle for a long time, the idle time can 692 result in problems. For instance, certain middleboxes such as 693 firewalls or NAT devices are known to delete state records after an 694 inactivity interval. RFC 5382 specifies a minimum value for such 695 interval of 124 minutes. Measurement studies have reported that TCP 696 NAT binding timeouts are highly variable across devices, with a 697 median around 60 minutes, the shortest timeout being around 2 698 minutes, and more than 50% of the devices with a timeout shorter than 699 the aforementioned minimum timeout of 124 minutes [HomeGateway]. The 700 timeout duration used by a middlebox implementation may not be known 701 to the TCP endpoints. 703 In CNNs, such middleboxes may e.g. 
be present at the boundary between 704 the CNN and other networks. If the middlebox can be optimized for 705 CNN use cases, it makes sense to increase the initial value for 706 filter state inactivity timers to avoid problems with idle 707 connections. Apart from that, this problem can be dealt with by 708 different connection handling strategies, each having pros and cons. 710 One approach for infrequent data transfer is to use short-lived TCP 711 connections. Instead of trying to maintain a TCP connection for a 712 long time, possibly short-lived connections can be opened between two 713 endpoints, which are closed if no more data needs to be exchanged. 714 For use cases that can cope with the additional messages and the 715 latency resulting from starting new connections, it is recommended to 716 use a sequence of short-lived connections, instead of maintaining a 717 single long-lived connection. 719 The message and latency overhead that stems from using a sequence of 720 short-lived connections could be reduced by TCP Fast Open (TFO) 721 [RFC7413], which is an experimental TCP extension, at the expense of 722 increased implementation complexity and increased TCP Control Block 723 (TCB) size. TFO allows data to be carried in SYN (and SYN-ACK) 724 segments, and to be consumed immediately by the receiving endpoint. 725 This reduces the message and latency overhead compared to the 726 traditional three-way handshake to establish a TCP connection. For 727 security reasons, the connection initiator has to request a TFO 728 cookie from the other endpoint. The cookie, with a size of 4 or 16 729 bytes, is then included in SYN packets of subsequent connections. 730 The cookie needs to be refreshed (and obtained by the client) after a 731 certain amount of time. While a given cookie is used for multiple 732 connections between the same two endpoints, the latter may become 733 vulnerable to privacy threats. 
In addition, a valid cookie may be 734 stolen from a compromised host and may be used to perform SYN flood 735 attacks, as well as amplified reflection attacks to victim hosts (see 736 Section 5 of RFC 7413). Nevertheless, TFO is more efficient than 737 frequently opening new TCP connections with the traditional three-way 738 handshake, as long as the cookie can be reused in subsequent 739 connections. However, as stated in RFC 7413, TFO deviates from the 740 standard TCP semantics, since the data in the SYN could be replayed 741 to an application in some rare circumstances. Applications should 742 not use TFO unless they can tolerate this issue, e.g., by using 743 Transport Layer Security (TLS) [RFC7413]. A comprehensive discussion 744 on TFO can be found at RFC 7413. 746 Another approach is to use long-lived TCP connections with 747 application-layer heartbeat messages. Various application protocols 748 support such heartbeat messages (e.g. CoAP over TCP [RFC8323]). 749 Periodic application-layer heartbeats can prevent early filter state 750 record deletion in middleboxes. If the TCP binding timeout for a 751 middlebox to be traversed by a given connection is known, middlebox 752 filter state deletion will be avoided if the heartbeat period is 753 lower than the middlebox TCP binding timeout. Otherwise, the 754 implementer needs to take into account that middlebox TCP binding 755 timeouts fall in a wide range of possible values [HomeGateway], and 756 it may be hard to find a proper heartbeat period for application- 757 layer heartbeat messages. 759 One specific advantage of Heartbeat messages is that they also allow 760 aliveness checks at the application level. In general, it makes 761 sense to realize aliveness checks at the highest protocol layer 762 possible that is meaningful to the application, in order to maximize 763 the depth of the aliveness check. In addition, timely detection of a 764 dead peer may allow savings in terms of TCB memory use. 
However, the 765 transmission of heartbeat messages consumes resources. This aspect 766 needs to be assessed carefully, considering the characteristics of 767 each specific CNN. 769 A TCP implementation may also be able to send "keep-alive" segments 770 to test a TCP connection. According to [RFC1122], "keep-alives" are 771 an optional TCP mechanism that is turned off by default, i.e., an 772 application must explicitly enable it for a TCP connection. The 773 interval between "keep-alive" messages must be configurable and it 774 must default to no less than two hours. With this large timeout, TCP 775 keep-alive messages might not always be useful to avoid deletion of 776 filter state records in some middleboxes. However, sending TCP keep- 777 alive probes more frequently risks draining power on energy- 778 constrained devices. 780 5. Security Considerations 782 Best current practice for securing TCP and TCP-based communication 783 also applies to CNNs. As an example, use of Transport Layer Security 784 (TLS) [RFC8446] is strongly recommended if it is applicable. 785 However, note that TLS protects only the contents of the data 786 segments. 788 There are TCP options which can actually protect the transport layer. 789 One example is the TCP Authentication Option (TCP-AO) [RFC5925]. 790 However, this option adds overhead and complexity. TCP-AO typically 791 has a size of 16-20 bytes. An implementer needs to assess the trade- 792 off between security and performance when using TCP-AO, considering 793 the characteristics (in terms of energy, bandwidth and computational 794 power) of the environment where TCP will be used. 796 For the mechanisms discussed in this document, the corresponding 797 security considerations apply. For instance, if TFO is used, the security 798 considerations of [RFC7413] apply. 800 Constrained devices are expected to support smaller TCP window sizes 801 than less limited devices.
In such conditions, segment 802 retransmission triggered by RTO expiration is expected to be 803 relatively frequent, due to lack of (enough) duplicate ACKs, 804 especially when a constrained device uses a single-MSS 805 implementation. For this reason, constrained devices running TCP may 806 appear as particularly appealing victims of the so-called "shrew" 807 Denial of Service (DoS) attack [shrew], whereby one or more sources 808 generate a packet spike targeted to coincide with consecutive RTO- 809 expiration-triggered retry attempts of a victim node. Note that the 810 attack may be performed by Internet-connected devices, including 811 constrained devices in the same CNN as the victim, as well as remote 812 ones. Mitigation techniques include RTO randomization and attack 813 blocking by routers able to detect shrew attacks based on their 814 traffic pattern. 816 6. Acknowledgments 818 Carles Gomez has been funded in part by the Spanish Government 819 (Ministerio de Educacion, Cultura y Deporte) through the Jose 820 Castillejo grants CAS15/00336 and CAS18/00170, and by the European 821 Regional Development Fund (ERDF) and the Spanish Government through 822 projects TEC2016-79988-P, PID2019-106808RA-I00, AEI/FEDER, UE, and by 823 Generalitat de Catalunya Grant 2017 SGR 376. Part of his 824 contribution to this work has been carried out during his stays as a 825 visiting scholar at the Computer Laboratory of the University of 826 Cambridge. 828 The authors appreciate the feedback received for this document. The 829 following folks provided comments that helped improve the document: 830 Carsten Bormann, Zhen Cao, Wei Genyu, Ari Keranen, Abhijan 831 Bhattacharyya, Andres Arcia-Moret, Yoshifumi Nishida, Joe Touch, Fred 832 Baker, Nik Sultana, Kerry Lynn, Erik Nordmark, Markku Kojo, Hannes 833 Tschofenig, David Black, Ilpo Jarvinen, Emmanuel 834 Baccelli, Stuart Cheshire, Gorry Fairhurst, Ingemar Johansson, Ted 835 Lemon, and Michael Tuexen.
Simon Brummer provided details, and 836 kindly performed RAM and ROM usage measurements, on the RIOT TCP 837 implementation. Xavi Vilajosana provided details on the OpenWSN TCP 838 implementation. Rahul Jadhav kindly performed code size measurements 839 on the Contiki-NG and lwIP 2.1.2 TCP implementations. He also 840 provided details on the uIP TCP implementation. 842 7. Annex. TCP implementations for constrained devices 844 This section overviews the main features of TCP implementations for 845 constrained devices. The survey is limited to open source stacks 846 with small footprint. It is not meant to be all-encompassing. For 847 more powerful embedded systems (e.g., with 32-bit processors), there 848 are further stacks that comprehensively implement TCP. On the other 849 hand, please be aware that this Annex is based on information 850 available as of this writing. 852 7.1. uIP 854 uIP is a TCP/IP stack, targeted for 8- and 16-bit microcontrollers, 855 which pioneered TCP/IP implementations for constrained devices. uIP 856 has been deployed with Contiki and the Arduino Ethernet shield. A 857 code size of ~5 kB (which comprises checksumming, IPv4, ICMP and TCP) 858 has been reported for uIP [Dunk]. Later versions of uIP implement 859 IPv6 as well. 861 uIP uses the same global buffer for both incoming and outgoing 862 traffic, which has a size of a single packet. In case of a 863 retransmission, an application must be able to reproduce the same 864 user data that had been transmitted. Multiple connections are 865 supported, but need to share the global buffer. 867 The MSS is announced via the MSS option on connection establishment 868 and the receive window size (of one MSS) is not modified during a 869 connection. Stop-and-wait operation is used for sending data. Among 870 other optimizations, this avoids sliding window operations, 871 which use 32-bit arithmetic extensively and are expensive on 8-bit 872 CPUs.
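To illustrate the kind of 32-bit arithmetic that sliding window operation involves, the following minimal sketch shows wraparound-safe comparisons on TCP sequence numbers. It is an illustration only and is not taken from uIP or any other stack: the helper names `seq_lt` and `in_window` are hypothetical, and Python integers are reduced modulo 2^32 to mimic 32-bit unsigned arithmetic.

```python
SEQ_MOD = 1 << 32  # TCP sequence numbers live in a 32-bit modular space


def seq_lt(a, b):
    """True if sequence number a precedes b, allowing for wraparound.

    Mirrors the common C idiom (int32_t)(a - b) < 0.
    """
    return ((a - b) % SEQ_MOD) >= (SEQ_MOD >> 1)


def in_window(seq, left_edge, size):
    """True if seq falls inside [left_edge, left_edge + size), modulo 2^32."""
    return ((seq - left_edge) % SEQ_MOD) < size


# A window that straddles the 32-bit wraparound point.
print(seq_lt(0xFFFFFFF0, 0x00000010))     # True: 0xFFFFFFF0 comes first
print(in_window(5, 0xFFFFFFFE, 10))       # True: 7 bytes past the left edge
```

A sliding-window implementation performs checks of this kind on several 32-bit quantities for each segment sent or received. On an 8-bit CPU each such operation takes multiple instructions, which is the cost that a stop-and-wait design reduces.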
874 Contiki uses the "split hack" technique (see Section 3.2.3) to avoid 875 Delayed ACKs for senders using a single segment. 877 The code size of the TCP implementation in Contiki-NG has been 878 measured at 3.2 kB on CC2538DK, cross-compiling on Linux. 880 7.2. lwIP 882 lwIP is a TCP/IP stack, targeted for 8- and 16-bit microcontrollers. 883 lwIP has a total code size of ~14 kB to ~22 kB (which comprises 884 memory management, checksumming, network interfaces, IPv4, ICMP and 885 TCP), and a TCP code size of ~9 kB to ~14 kB [Dunk]. Both IPv4 and 886 IPv6 are supported in lwIP since v2.0.0. 888 In contrast with uIP, lwIP decouples applications from the network 889 stack. lwIP supports a TCP transmission window greater than a single 890 segment, as well as buffering of incoming and outgoing data. Other 891 implemented mechanisms comprise slow start, congestion avoidance, 892 fast retransmit and fast recovery. SACK and Window Scale support have 893 recently been added to lwIP. 895 7.3. RIOT 897 The RIOT TCP implementation (called GNRC TCP) has been designed for 898 Class 1 devices [RFC 7228]. The main target platforms are 8- and 899 16-bit microcontrollers, with 32-bit platforms also supported. GNRC 900 TCP offers a function set similar to that of uIP, but it provides and 901 maintains an independent receive buffer for each connection. In 902 contrast to uIP, retransmission is also handled by GNRC TCP. For 903 simplicity, GNRC TCP uses a single-MSS implementation. The 904 application programmer does not need to know anything about the TCP 905 internals, therefore GNRC TCP can be seen as a user-friendly uIP TCP 906 implementation. 908 The MSS is set on connection establishment and cannot be changed 909 during connection lifetime. GNRC TCP allows multiple connections in 910 parallel, but each TCB must be allocated somewhere in the system.
By 911 default there is only enough memory allocated for a single TCP 912 connection, but it can be increased at compile time if the user needs 913 multiple parallel connections. 915 The RIOT TCP implementation offers an optional POSIX socket wrapper 916 that enables POSIX compliance, if needed. 918 Further details on RIOT and GNRC can be found in the literature 919 [RIOT], [GNRC]. 921 7.4. TinyOS 923 TinyOS was important as a platform for early constrained devices. 924 TinyOS has an experimental TCP stack that uses a simple nonblocking 925 library-based implementation of TCP, which provides a subset of the 926 socket interface primitives. The application is responsible for 927 buffering. The TCP library does not do any receive-side buffering. 928 Instead, it will immediately dispatch new, in-order data to the 929 application and otherwise drop the segment. A send buffer is 930 provided by the application. Multiple TCP connections are possible. 931 Recently there has been little further work on the stack. 933 7.5. FreeRTOS 935 FreeRTOS is a real-time operating system kernel for embedded devices 936 that is supported by 16- and 32-bit microprocessors. Its TCP 937 implementation is based on multiple-segment window size, although a 938 'Tiny-TCP' option, which is a single-MSS variant, can be enabled. 939 Delayed ACKs are supported, with a 20-ms Delayed ACK timer as a 940 technique intended 'to gain performance'. 942 7.6. uC/OS 944 uC/OS is a real-time operating system kernel for embedded devices, 945 which is maintained by Micrium. uC/OS is intended for 8-, 16- and 946 32-bit microprocessors. The uC/OS TCP implementation supports a 947 multiple-segment window size. 949 7.7. 
Summary
951                      +---+---------+--------+----+------+--------+-----+
952                      |uIP|lwIP orig|lwIP 2.1|RIOT|TinyOS|FreeRTOS|uC/OS|
953 +------+-------------+---+---------+--------+----+------+--------+-----+
954 |Memory|Code size(kB)| <5|~9 to ~14|   38   | <7 | N/A  |  <9.2  | N/A |
955 |      |             |(a)|  (T1)   |  (T4)  |(T3)|      |  (T2)  |     |
956 +------+-------------+---+---------+--------+----+------+--------+-----+
957 |      | Single-Segm.|Yes|   No    |   No   | Yes|  No  |   No   |  No |
958 |      +-------------+---+---------+--------+----+------+--------+-----+
959 |      | Slow start  | No|   Yes   |  Yes   | No | Yes  |   No   | Yes |
960 |  T   +-------------+---+---------+--------+----+------+--------+-----+
961 |  C   |Fast rec/retx| No|   Yes   |  Yes   | No | Yes  |   No   | Yes |
962 |  P   +-------------+---+---------+--------+----+------+--------+-----+
963 |      | Keep-alive  | No|   No    |  Yes   | No |  No  |  Yes   | Yes |
964 |      +-------------+---+---------+--------+----+------+--------+-----+
965 |  f   | Win. Scale  | No|   No    |  Yes   | No |  No  |  Yes   |  No |
966 |  e   +-------------+---+---------+--------+----+------+--------+-----+
967 |  a   | TCP timest. | No|   No    |  Yes   | No |  No  |  Yes   |  No |
968 |  t   +-------------+---+---------+--------+----+------+--------+-----+
969 |  u   |    SACK     | No|   No    |  Yes   | No |  No  |  Yes   |  No |
970 |  r   +-------------+---+---------+--------+----+------+--------+-----+
971 |  e   |  Del. ACKs  | No|   Yes   |  Yes   | No |  No  |  Yes   | Yes |
972 |  s   +-------------+---+---------+--------+----+------+--------+-----+
973 |      |   Socket    | No|   No    |Optional|(I) |Subset|  Yes   | Yes |
974 |      +-------------+---+---------+--------+----+------+--------+-----+
975 |      |Concur. Conn.|Yes|   Yes   |  Yes   | Yes| Yes  |  Yes   | Yes |
976 +------+-------------+---+---------+--------+----+------+--------+-----+
977 |   TLS supported    | No|   No    |  Yes   | Yes| Yes  |  Yes   | Yes |
978 +--------------------+---+---------+--------+----+------+--------+-----+
980 (T1) = TCP-only, on x86 and AVR platforms
981 (T2) = TCP-only, on ARM Cortex-M platform
982 (T3) = TCP-only, on ARM Cortex-M0+ platform (NOTE: RAM usage for the same platform
983 is ~2.5 kB for one TCP connection plus ~1.2 kB for each additional connection)
984 (T4) = TCP-only, on CC2538DK, cross-compiling on Linux
985 (a) = includes IP, ICMP and TCP on x86 and AVR platforms. The Contiki-NG TCP implementation has a code size of 3.2 kB on CC2538DK, cross-compiling on Linux
986 (I) = optional POSIX socket wrapper which enables POSIX compliance if needed
987 Mult. = Multiple
988 N/A = Not Available
990 Figure 2: Summary of TCP features for different lightweight TCP 991 implementations. None of the implementations considered in this 992 Annex support ECN or TFO. 994 8. Annex. Changes compared to previous versions 996 RFC Editor: To be removed prior to publication 998 8.1. Changes between -00 and -01 1000 o Changed title and abstract 1002 o Clarification that communication with standard-compliant TCP 1003 endpoints is required, based on feedback from Joe Touch 1005 o Additional discussion on communication patterns 1007 o Numerous changes to address a comprehensive review from Hannes 1008 Tschofenig 1010 o Reworded security considerations 1012 o Additional references and better distinction between normative and 1013 informative entries 1015 o Feedback from Rahul Jadhav on the uIP TCP implementation 1017 o Basic data for the TinyOS TCP implementation added, based on 1018 source code analysis 1020 8.2.
Changes between -01 and -02 1022 o Added text to the Introduction section, and a reference, on 1023 traditional bad perception of TCP for IoT 1025 o Added sections on FreeRTOS and uC/OS 1027 o Updated TinyOS section 1029 o Updated summary table 1031 o Reorganized Section 4 (single-MSS vs multiple-MSS window size), 1032 some content now also in new Section 5 1034 8.3. Changes between -02 and -03 1036 o Rewording to better explain the benefit of ECN 1038 o Additional context information on the surveyed implementations 1040 o Added details, but removed "Data size" row, in the summary table 1041 o Added discussion on shrew attacks 1043 8.4. Changes between -03 and -04 1045 o Addressing the remaining TODOs 1047 o Alignment of the wording on TCP "keep-alives" with related 1048 discussions in the IETF transport area 1050 o Added further discussion on delayed ACKs 1052 o Removed OpenWSN section from the Annex 1054 8.5. Changes between -04 and -05 1056 o Addressing comments by Yoshifumi Nishida 1058 o Removed mentioning MD5 as an example (comment by David Black) 1060 o Added memory footprint details of TCP implementations (Contiki-NG 1061 and lwIP 2.1.2) provided by Rahul Jadhav in the Annex 1063 o Addressed comments by Ilpo Jarvinen throughout the whole document 1065 o Improved the RIOT section in the Annex, based on feedback from 1066 Emmanuel Baccelli 1068 8.6. Changes between -05 and -06 1070 o Incorporated suggestions by Stuart Cheshire 1072 8.7. Changes between -06 and -07 1074 o Addressed comments by Gorry Fairhurst 1076 8.8. Changes between -07 and -08 1078 o Addressed WGLC comments by Ilpo Jarvinen, Markku Kojo and Ingemar 1079 Johansson throughout the document, including the addition of a new 1080 section on Initial Window considerations. 1082 8.9. Changes between -08 and -09 1084 o Addressed second round of comments by Ilpo Jarvinen and Markku 1085 Kojo, based on the previous draft update. 1087 8.10.
Changes between -09 and -10 1089 o Addressed comments by Erik Kline. 1091 o Addressed a comment by Markku Kojo on advice given in RFC 6691. 1093 8.11. Changes between -10 and -11 1095 o Addressed a comment by Ted Lemon on MSS advice. 1097 8.12. Changes between -11 and -12 1099 o Addressed comments from IESG and various directorates. 1101 8.13. Changes between -12 and -13 1103 o Fixed two typos. 1105 o Addressed a comment by Barry Leiba. 1107 9. References 1109 9.1. Normative References 1111 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 1112 RFC 793, DOI 10.17487/RFC0793, September 1981, 1113 . 1115 [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - 1116 Communication Layers", STD 3, RFC 1122, 1117 DOI 10.17487/RFC1122, October 1989, 1118 . 1120 [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP 1121 Selective Acknowledgment Options", RFC 2018, 1122 DOI 10.17487/RFC2018, October 1996, 1123 . 1125 [RFC3042] Allman, M., Balakrishnan, H., and S. Floyd, "Enhancing 1126 TCP's Loss Recovery Using Limited Transmit", RFC 3042, 1127 DOI 10.17487/RFC3042, January 2001, 1128 . 1130 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 1131 of Explicit Congestion Notification (ECN) to IP", 1132 RFC 3168, DOI 10.17487/RFC3168, September 2001, 1133 . 1135 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 1136 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, 1137 . 1139 [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, 1140 "Computing TCP's Retransmission Timer", RFC 6298, 1141 DOI 10.17487/RFC6298, June 2011, 1142 . 1144 [RFC6691] Borman, D., "TCP Options and Maximum Segment Size (MSS)", 1145 RFC 6691, DOI 10.17487/RFC6691, July 2012, 1146 . 1148 [RFC6928] Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis, 1149 "Increasing TCP's Initial Window", RFC 6928, 1150 DOI 10.17487/RFC6928, April 2013, 1151 . 1153 [RFC7228] Bormann, C., Ersue, M., and A. 
Keranen, "Terminology for 1154 Constrained-Node Networks", RFC 7228, 1155 DOI 10.17487/RFC7228, May 2014, 1156 . 1158 [RFC7323] Borman, D., Braden, B., Jacobson, V., and R. 1159 Scheffenegger, Ed., "TCP Extensions for High Performance", 1160 RFC 7323, DOI 10.17487/RFC7323, September 2014, 1161 . 1163 [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP 1164 Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, 1165 . 1167 [RFC7567] Baker, F., Ed. and G. Fairhurst, Ed., "IETF 1168 Recommendations Regarding Active Queue Management", 1169 BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015, 1170 . 1172 [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1173 (IPv6) Specification", STD 86, RFC 8200, 1174 DOI 10.17487/RFC8200, July 2017, 1175 . 1177 9.2. Informative References 1179 [Commag] A. Betzler, C. Gomez, I. Demirkol, J. Paradells, "CoAP 1180 Congestion Control for the Internet of Things", IEEE 1181 Communications Magazine, June 2016. 1183 [Dunk] A. Dunkels, "Full TCP/IP for 8-Bit Architectures", 2003. 1185 [ETEN] R. Krishnan et al, "Explicit transport error notification 1186 (ETEN) for error-prone wireless and satellite networks", 1187 Computer Networks 2004. 1189 [GNRC] M. Lenders et al., "Connecting the World of Embedded 1190 Mobiles: The RIOT Approach to Ubiquitous Networking for the 1191 IoT", 2018. 1193 [HomeGateway] 1194 Haetoenen, S., Nyrhinen, A., Eggert, L., Strowes, S., 1195 Sarolahti, P., and M. Kojo, "An Experimental Study of Home 1196 Gateway Characteristics", Proceedings of the 10th ACM 1197 SIGCOMM conference on Internet measurement 2010. 1199 [I-D.delcarpio-6lo-wlanah] 1200 Vega, L., Robles, I., and R. Morabito, "IPv6 over 1201 802.11ah", draft-delcarpio-6lo-wlanah-01 (work in 1202 progress), October 2015. 1204 [I-D.ietf-6lo-fragment-recovery] 1205 Thubert, P., "6LoWPAN Selective Fragment Recovery", draft- 1206 ietf-6lo-fragment-recovery-21 (work in progress), March 1207 2020.
1209 [I-D.ietf-core-fasor] 1210 Jarvinen, I., Kojo, M., Raitahila, I., and Z. Cao, "Fast- 1211 Slow Retransmission Timeout and Congestion Control 1212 Algorithm for CoAP", draft-ietf-core-fasor-01 (work in 1213 progress), October 2020. 1215 [I-D.ietf-tcpm-generalized-ecn] 1216 Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit 1217 Congestion Notification (ECN) to TCP Control Packets", 1218 draft-ietf-tcpm-generalized-ecn-05 (work in progress), 1219 November 2019. 1221 [I-D.ietf-tcpm-rto-consider] 1222 Allman, M., "Requirements for Time-Based Loss Detection", 1223 draft-ietf-tcpm-rto-consider-17 (work in progress), July 1224 2020. 1226 [IntComp] C. Gomez, A. Arcia-Moret, J. Crowcroft, "TCP in the 1227 Internet of Things: from ostracism to prominence", IEEE 1228 Internet Computing, January-February 2018. 1230 [MQTT] ISO/IEC 20922:2016, "Message Queuing Telemetry Transport 1231 (MQTT) v3.1.1", 2016. 1233 [RFC2757] Montenegro, G., Dawkins, S., Kojo, M., Magret, V., and N. 1234 Vaidya, "Long Thin Networks", RFC 2757, 1235 DOI 10.17487/RFC2757, January 2000, 1236 . 1238 [RFC2884] Hadi Salim, J. and U. Ahmed, "Performance Evaluation of 1239 Explicit Congestion Notification (ECN) in IP Networks", 1240 RFC 2884, DOI 10.17487/RFC2884, July 2000, 1241 . 1243 [RFC3481] Inamura, H., Ed., Montenegro, G., Ed., Ludwig, R., Gurtov, 1244 A., and F. Khafizov, "TCP over Second (2.5G) and Third 1245 (3G) Generation Wireless Networks", BCP 71, RFC 3481, 1246 DOI 10.17487/RFC3481, February 2003, 1247 . 1249 [RFC3819] Karn, P., Ed., Bormann, C., Fairhurst, G., Grossman, D., 1250 Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L. 1251 Wood, "Advice for Internet Subnetwork Designers", BCP 89, 1252 RFC 3819, DOI 10.17487/RFC3819, July 2004, 1253 . 1255 [RFC4944] Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler, 1256 "Transmission of IPv6 Packets over IEEE 802.15.4 1257 Networks", RFC 4944, DOI 10.17487/RFC4944, September 2007, 1258 . 
   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
              June 2010.

   [RFC6077]  Papadimitriou, D., Ed., Welzl, M., Scharf, M., and B.
              Briscoe, "Open Research Issues in Internet Congestion
              Control", RFC 6077, DOI 10.17487/RFC6077, February 2011.

   [RFC6120]  Saint-Andre, P., "Extensible Messaging and Presence
              Protocol (XMPP): Core", RFC 6120, DOI 10.17487/RFC6120,
              March 2011.

   [RFC6282]  Hui, J., Ed. and P. Thubert, "Compression Format for IPv6
              Datagrams over IEEE 802.15.4-Based Networks", RFC 6282,
              DOI 10.17487/RFC6282, September 2011.

   [RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
              Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
              JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
              Low-Power and Lossy Networks", RFC 6550,
              DOI 10.17487/RFC6550, March 2012.

   [RFC6606]  Kim, E., Kaspar, D., Gomez, C., and C. Bormann, "Problem
              Statement and Requirements for IPv6 over Low-Power
              Wireless Personal Area Network (6LoWPAN) Routing",
              RFC 6606, DOI 10.17487/RFC6606, May 2012.

   [RFC6775]  Shelby, Z., Ed., Chakrabarti, S., Nordmark, E., and C.
              Bormann, "Neighbor Discovery Optimization for IPv6 over
              Low-Power Wireless Personal Area Networks (6LoWPANs)",
              RFC 6775, DOI 10.17487/RFC6775, November 2012.

   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Message Syntax and Routing",
              RFC 7230, DOI 10.17487/RFC7230, June 2014.

   [RFC7252]  Shelby, Z., Hartke, K., and C. Bormann, "The Constrained
              Application Protocol (CoAP)", RFC 7252,
              DOI 10.17487/RFC7252, June 2014.

   [RFC7414]  Duke, M., Braden, R., Eddy, W., Blanton, E., and A.
              Zimmermann, "A Roadmap for Transmission Control Protocol
              (TCP) Specification Documents", RFC 7414,
              DOI 10.17487/RFC7414, February 2015.

   [RFC7428]  Brandt, A. and J. Buron, "Transmission of IPv6 Packets
              over ITU-T G.9959 Networks", RFC 7428,
              DOI 10.17487/RFC7428, February 2015.

   [RFC7540]  Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
              DOI 10.17487/RFC7540, May 2015.

   [RFC7668]  Nieminen, J., Savolainen, T., Isomaki, M., Patil, B.,
              Shelby, Z., and C. Gomez, "IPv6 over BLUETOOTH(R) Low
              Energy", RFC 7668, DOI 10.17487/RFC7668, October 2015.

   [RFC8087]  Fairhurst, G. and M. Welzl, "The Benefits of Using
              Explicit Congestion Notification (ECN)", RFC 8087,
              DOI 10.17487/RFC8087, March 2017.

   [RFC8105]  Mariager, P., Petersen, J., Ed., Shelby, Z., Van de Logt,
              M., and D. Barthel, "Transmission of IPv6 Packets over
              Digital Enhanced Cordless Telecommunications (DECT) Ultra
              Low Energy (ULE)", RFC 8105, DOI 10.17487/RFC8105, May
              2017.

   [RFC8163]  Lynn, K., Ed., Martocci, J., Neilson, C., and S.
              Donaldson, "Transmission of IPv6 over
              Master-Slave/Token-Passing (MS/TP) Networks", RFC 8163,
              DOI 10.17487/RFC8163, May 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

   [RFC8323]  Bormann, C., Lemay, S., Tschofenig, H., Hartke, K.,
              Silverajan, B., and B. Raymor, Ed., "CoAP (Constrained
              Application Protocol) over TCP, TLS, and WebSockets",
              RFC 8323, DOI 10.17487/RFC8323, February 2018.

   [RFC8352]  Gomez, C., Kovatsch, M., Tian, H., and Z. Cao, Ed.,
              "Energy-Efficient Features of Internet of Things
              Protocols", RFC 8352, DOI 10.17487/RFC8352, April 2018.
   [RFC8376]  Farrell, S., Ed., "Low-Power Wide Area Network (LPWAN)
              Overview", RFC 8376, DOI 10.17487/RFC8376, May 2018.

   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018.

   [RFC8900]  Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O.,
              and F. Gont, "IP Fragmentation Considered Fragile",
              BCP 230, RFC 8900, DOI 10.17487/RFC8900, September 2020.

   [RIOT]     E. Baccelli et al., "RIOT: an Open Source Operating
              System for Low-end Embedded Devices in the IoT", 2018.

   [shrew]    A. Kuzmanovic, E. Knightly, "Low-Rate TCP-Targeted Denial
              of Service Attacks", SIGCOMM'03 2003.

Authors' Addresses

   Carles Gomez
   UPC
   C/Esteve Terradas, 7
   Castelldefels 08860
   Spain

   Email: carlesgo@entel.upc.edu

   Jon Crowcroft
   University of Cambridge
   JJ Thomson Avenue
   Cambridge, CB3 0FD
   United Kingdom

   Email: jon.crowcroft@cl.cam.ac.uk

   Michael Scharf
   Hochschule Esslingen
   Flandernstr. 101
   Esslingen 73732
   Germany

   Email: michael.scharf@hs-esslingen.de