idnits 2.17.1 

draft-ietf-lwig-tcp-constrained-node-networks-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** There are 32 instances of too long lines in the document, the longest
     one being 90 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet has text resembling
     RFC 2119 boilerplate text.

  -- The document date (September 6, 2020) is 1321 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC 7228' is mentioned on line 888, but not defined

  == Unused Reference: 'RFC6092' is defined on line 1229, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC  793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 6691 (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC 7230
     (Obsoleted by RFC 9110, RFC 9112)

  -- Obsolete informational reference (is this intentional?): RFC 7540
     (Obsoleted by RFC 9113)


     Summary: 5 errors (**), 0 flaws (~~), 4 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	LWIG Working Group                                              C. Gomez
3	Internet-Draft                                                       UPC
4	Intended status: Informational                              J. Crowcroft
5	Expires: March 10, 2021                          University of Cambridge
6	                                                               M. Scharf
7	                                                    Hochschule Esslingen
8	                                                       September 6, 2020

10	           TCP Usage Guidance in the Internet of Things (IoT)
11	            draft-ietf-lwig-tcp-constrained-node-networks-10

13	Abstract

15	   This document provides guidance on how to implement and use the
16	   Transmission Control Protocol (TCP) in Constrained-Node Networks
17	   (CNNs), which are a characteristic of the Internet of Things (IoT).
18	   Such environments require a lightweight TCP implementation and may
19	   not make use of optional functionality.  This document explains a
20	   number of known and deployed techniques to simplify a TCP stack as
21	   well as corresponding tradeoffs.  The objective is to help embedded
22	   developers with decisions on which TCP features to use.

24	Status of This Memo

26	   This Internet-Draft is submitted in full conformance with the
27	   provisions of BCP 78 and BCP 79.

29	   Internet-Drafts are working documents of the Internet Engineering
30	   Task Force (IETF).  Note that other groups may also distribute
31	   working documents as Internet-Drafts.  The list of current Internet-
32	   Drafts is at https://datatracker.ietf.org/drafts/current/.

34	   Internet-Drafts are draft documents valid for a maximum of six months
35	   and may be updated, replaced, or obsoleted by other documents at any
36	   time.  It is inappropriate to use Internet-Drafts as reference
37	   material or to cite them other than as "work in progress."

39	   This Internet-Draft will expire on March 10, 2021.

41	Copyright Notice

43	   Copyright (c) 2020 IETF Trust and the persons identified as the
44	   document authors.  All rights reserved.

46	   This document is subject to BCP 78 and the IETF Trust's Legal
47	   Provisions Relating to IETF Documents
48	   (https://trustee.ietf.org/license-info) in effect on the date of
49	   publication of this document.  Please review these documents
50	   carefully, as they describe your rights and restrictions with respect
51	   to this document.  Code Components extracted from this document must
52	   include Simplified BSD License text as described in Section 4.e of
53	   the Trust Legal Provisions and are provided without warranty as
54	   described in the Simplified BSD License.

56	Table of Contents

58	   1.  Introduction
59	   2.  Conventions used in this document
60	   3.  Characteristics of CNNs relevant for TCP
61	     3.1.  Network and link properties
62	     3.2.  Usage scenarios
63	     3.3.  Communication and traffic patterns
64	   4.  TCP implementation and configuration in CNNs
65	     4.1.  Addressing path properties
66	       4.1.1.  Maximum Segment Size (MSS)
67	       4.1.2.  Explicit Congestion Notification (ECN)
68	       4.1.3.  Explicit loss notifications
69	     4.2.  TCP guidance for single-MSS stacks
70	       4.2.1.  Single-MSS stacks - benefits and issues
71	       4.2.2.  TCP options for single-MSS stacks
72	       4.2.3.  Delayed Acknowledgments for single-MSS stacks
73	       4.2.4.  RTO calculation for single-MSS stacks
74	     4.3.  General recommendations for TCP in CNNs
75	       4.3.1.  Loss recovery and congestion/flow control
76	         4.3.1.1.  Selective Acknowledgments (SACK)
77	       4.3.2.  Delayed Acknowledgments
78	       4.3.3.  Initial Window
79	   5.  TCP usage recommendations in CNNs
80	     5.1.  TCP connection initiation
81	     5.2.  Number of concurrent connections
82	     5.3.  TCP connection lifetime
83	   6.  Security Considerations
84	   7.  Acknowledgments
85	   8.  Annex. TCP implementations for constrained devices
86	     8.1.  uIP
87	     8.2.  lwIP
88	     8.3.  RIOT
89	     8.4.  TinyOS
90	     8.5.  FreeRTOS
91	     8.6.  uC/OS
92	     8.7.  Summary
93	   9.  Annex. Changes compared to previous versions
94	     9.1.  Changes between -00 and -01
95	     9.2.  Changes between -01 and -02
96	     9.3.  Changes between -02 and -03
97	     9.4.  Changes between -03 and -04
98	     9.5.  Changes between -04 and -05
99	     9.6.  Changes between -05 and -06
100	     9.7.  Changes between -06 and -07
101	     9.8.  Changes between -07 and -08
102	     9.9.  Changes between -08 and -09
103	     9.10. Changes between -09 and -10
104	   10. References
105	     10.1.  Normative References
106	     10.2.  Informative References
107	   Authors' Addresses

109	1.  Introduction

111	   The Internet Protocol suite is being used for connecting Constrained-
112	   Node Networks (CNNs) to the Internet, enabling the so-called Internet
113	   of Things (IoT) [RFC7228].  In order to meet the requirements that
114	   stem from CNNs, the IETF has produced a suite of new protocols
115	   specifically designed for such environments (see, e.g., [RFC8352]).
116	   New IETF protocol stack components include the IPv6 over Low-power
117	   Wireless Personal Area Networks (6LoWPAN) adaptation layer
118	   [RFC4944][RFC6282][RFC6775], the IPv6 Routing Protocol for Low-Power
119	   and Lossy Networks (RPL) [RFC6550], and the
120	   Constrained Application Protocol (CoAP) [RFC7252].

122	   As of this writing, the main transport layer protocols in
123	   IP-based IoT scenarios are UDP and TCP.  However, TCP has been
124	   criticized (often unfairly) as a protocol for the IoT.  It is true
125	   that some TCP features are not optimal for IoT scenarios, such as
126	   the relatively long header size, unsuitability for multicast, and
127	   always-confirmed data delivery.  However, many typical claims about
128	   TCP's unsuitability for the IoT (e.g., high complexity,
129	   incompatibility of the connection-oriented approach with radio
130	   duty-cycling, and spurious congestion control activation in wireless
131	   links) are not valid, can be solved, or also apply to well-accepted
132	   IoT end-to-end reliability mechanisms (see [IntComp] for details).

134	   At the application layer, CoAP was developed over UDP [RFC7252].
135	   However, the integration of some CoAP deployments with existing
136	   infrastructure is being challenged by middleboxes such as firewalls,
137	   which may limit and even block UDP-based communications.  This is the
138	   main reason why a CoAP over TCP specification has been developed
139	   [RFC8323].

141	   Other application layer protocols not specifically designed for CNNs
142	   are also being considered for the IoT space.  Some examples include
143	   HTTP/2 and even HTTP/1.1, both of which run over TCP by default
144	   [RFC7230] [RFC7540], and the Extensible Messaging and Presence
145	   Protocol (XMPP) [RFC6120].  TCP is also used by non-IETF application-
146	   layer protocols in the IoT space such as the Message Queue Telemetry
147	   Transport (MQTT) and its lightweight variants.

149	   TCP is a sophisticated transport protocol that includes optional
150	   functionality (e.g., TCP options) that may improve performance in
151	   some environments.  However, many optional TCP extensions require
152	   complex logic inside the TCP stack and increase the code size and
153	   the memory requirements.  Many TCP extensions are not required for
154	   interoperability with other standard-compliant TCP endpoints.  Given
155	   the limited resources on constrained devices, careful selection of
156	   optional TCP features can make an implementation more lightweight.

158	   This document provides guidance on how to implement and configure
159	   TCP in CNNs, as well as on how applications are advised to use it.
160	   The overarching goal is to offer simple measures that allow for
161	   lightweight TCP implementation and suitable operation in such
162	   environments.  A TCP implementation following the guidance in this
163	   document is intended to be compatible with a TCP endpoint that is
164	   compliant with the TCP standards, albeit possibly with lower
165	   performance.  This implies that such a TCP client would always be
166	   able to connect with a standard-compliant TCP server, and a
167	   corresponding TCP server would always be able to connect with a
168	   standard-compliant TCP client.

170	   This document assumes that the reader is familiar with TCP.  A
171	   comprehensive survey of the TCP standards can be found in [RFC7414].
172	   Similar guidance regarding the use of TCP in special environments has
173	   been published before, e.g., for cellular wireless networks
174	   [RFC3481].

176	2.  Conventions used in this document

178	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
179	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
180	   document are to be interpreted as described in [RFC2119].

182	3.  Characteristics of CNNs relevant for TCP

184	3.1.  Network and link properties

186	   CNNs are defined in [RFC7228] as networks whose characteristics are
187	   influenced by being composed of a significant portion of constrained
188	   nodes.  The latter are characterized by significant limitations on
189	   processing, memory, and energy resources, among others [RFC7228].
190	   The first two dimensions pose constraints on the complexity and on
191	   the memory footprint of the protocols that constrained nodes can
192	   support.  The latter requires techniques to save energy, such as
193	   radio duty-cycling in wireless devices [RFC8352], as well as
194	   minimization of the number of messages transmitted/received (and
195	   their size).

197	   [RFC7228] lists typical network constraints in CNNs, including low
198	   achievable bitrate/throughput, high packet loss and high variability
199	   of packet loss, highly asymmetric link characteristics, severe
200	   penalties for using larger packets, limits on reachability over time,
201	   etc.  CNNs may use wireless or wired technologies (e.g., Power Line
202	   Communication), and the transmission rates are typically low (e.g.,
203	   below 1 Mbps).

205	   For use of TCP, one challenge is that not all technologies in CNNs
206	   may be aligned with typical Internet subnetwork design principles
207	   [RFC3819].  For instance, constrained nodes often use physical/link
208	   layer technologies that have been characterized as 'lossy', i.e.,
209	   exhibit a relatively high bit error rate.  Dealing with corruption
210	   loss is one of the open issues in the Internet [RFC6077].

212	3.2.  Usage scenarios

214	   There are different deployment and usage scenarios for CNNs.  Some
215	   CNNs follow the star topology, whereby one or several hosts are
216	   linked to a central device that acts as a router connecting the CNN
217	   to the Internet.  CNNs may also follow the multihop topology
218	   [RFC6606].

220	   In constrained environments, there can be different types of devices
221	   [RFC7228].  For example, there can be devices with a single combined
222	   send/receive buffer, devices with separate send and receive buffers,
223	   or devices with a pool of multiple send/receive buffers.  In the
224	   latter case, the buffers may also be shared with other
225	   protocols.

227	   One key use case for TCP in CNNs is a model where
228	   constrained devices connect to unconstrained servers in the Internet.
229	   But it is also possible that both TCP endpoints run on constrained
230	   devices.  In the first case, communication possibly has to traverse a
231	   middlebox (e.g., a firewall or NAT).  Figure 1 illustrates such a
232	   scenario.  Note that the scenario is asymmetric, as the unconstrained
233	   device will typically not suffer the severe constraints of the
234	   constrained device.  The unconstrained device is expected to be
235	   mains-powered, to have a large amount of memory and processing
236	   power, and to be connected to a resource-rich network.

238	   Assuming that a majority of constrained devices will correspond to
239	   sensor nodes, the amount of data traffic sent by constrained devices
240	   (e.g. sensor node measurements) is expected to be higher than the
241	   amount of data traffic in the opposite direction.  Nevertheless,
242	   constrained devices may receive requests (to which they may respond),
243	   commands (for configuration purposes and for constrained devices
244	   including actuators) and relatively infrequent firmware/software
245	   updates.

247	                                                      +---------------+
248	           o     o <-------- TCP communication -----> |               |
249	          o     o                                     |               |
250	             o     o                                  | Unconstrained |
251	       o        o              +-----------+          |    device     |
252	           o     o   o  ------ | Middlebox |  ------- |               |
253	            o   o              +-----------+          |  (e.g. cloud) |
254	          o    o  o                                   |               |
255	                                                      +---------------+
256	      constrained devices

258	      Figure 1: TCP communication between a constrained device and an
259	               unconstrained device, traversing a middlebox.

261	3.3.  Communication and traffic patterns

263	   IoT applications are characterized by a number of different
264	   communication patterns.  The following non-comprehensive list
265	   explains some typical examples:

267	   o  Unidirectional transfers: An IoT device (e.g., a sensor) can
268	      repeatedly send updates to the other endpoint.  An application
269	      response back to the IoT device is not needed in every
270	      case.

272	   o  Request-response patterns: An IoT device receives a request from
273	      the other endpoint, which triggers a response from the IoT device.

275	   o  Bulk data transfers: A typical example of a long file transfer
276	      would be an IoT device firmware update.

278	   A typical communication pattern is that a constrained device
279	   communicates with an unconstrained device (cf.  Figure 1).  But it is
280	   also possible that constrained devices communicate amongst
281	   themselves.

283	4.  TCP implementation and configuration in CNNs

285	   This section explains how a TCP stack can deal with typical
286	   constraints in CNNs.  The guidance in this section relates to the TCP
287	   implementation and its configuration.

289	4.1.  Addressing path properties

291	4.1.1.  Maximum Segment Size (MSS)

293	   Assuming that IPv6 is used, and for the sake of lightweight
294	   implementation and operation, unless applications require handling
295	   large data units (i.e. leading to an IPv6 datagram size greater than
296	   1280 bytes), it may be desirable to limit the IP datagram size to
297	   1280 bytes in order to avoid the need to support Path MTU Discovery
298	   [RFC8201].  In addition, an IP datagram size of 1280 bytes avoids
299	   incurring IPv6-layer fragmentation.

301	   An IPv6 datagram size exceeding 1280 bytes can be avoided by setting
302	   the TCP MSS not larger than 1220 bytes.  This assumes that the remote
303	   sender will use no TCP options, aside from possibly the MSS option,
304	   which is only used in the initial TCP SYN packet.
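
   The arithmetic behind the 1220-byte figure can be sketched as
   follows.  This is a minimal illustration, assuming a fixed 40-byte
   IPv6 header without extension headers and a 20-byte TCP header
   without options.  The function name and the option_allowance
   parameter are hypothetical and are not defined by this document.

```python
# Sketch: largest TCP MSS that keeps an IPv6 datagram within a given
# size.  Assumes no IPv6 extension headers; option_allowance is a
# hypothetical safety margin for TCP options added by the peer.

IPV6_HEADER_BYTES = 40     # fixed IPv6 header
TCP_HEADER_BYTES = 20      # TCP header without options
IPV6_MIN_LINK_MTU = 1280   # minimum link MTU required by IPv6


def max_mss(datagram_size=IPV6_MIN_LINK_MTU, option_allowance=0):
    """Return the MSS that keeps a datagram within datagram_size bytes."""
    mss = (datagram_size - IPV6_HEADER_BYTES - TCP_HEADER_BYTES
           - option_allowance)
    if mss <= 0:
        raise ValueError("datagram size too small for the headers")
    return mss


print(max_mss())                     # 1220
print(max_mss(option_allowance=20))  # 1200
```

   With a margin of 20 bytes reserved for TCP options, the same
   calculation yields an MSS of 1200 bytes.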

306	   In order to accommodate unrequested TCP options that may be used by
307	   some TCP implementations, a constrained device may advertise an MSS
308	   smaller than 1220 bytes (e.g. not larger than 1200 bytes).  Note that
309	   it is advised for TCP implementations to consume payload space
310	   instead of increasing datagram size when including IP or TCP options
311	   in an IP packet to be sent [RFC6691].  Therefore, the suggestion of
312	   advertising an MSS smaller than 1220 bytes is likely to be
313	   overcautious and its suitability should be considered carefully.

315	   Note that setting the MTU to 1280 bytes is possible for link layer
316	   technologies in the CNN space, even if some of them are characterized
317	   by a short data unit payload size, e.g. up to a few tens or hundreds
318	   of bytes.  For example, the maximum frame size in IEEE 802.15.4 is
319	   127 bytes.  6LoWPAN defined an adaptation layer to support IPv6 over
320	   IEEE 802.15.4 networks.  The adaptation layer includes a
321	   fragmentation mechanism, since IPv6 requires the layer below to
322	   support an MTU of 1280 bytes [RFC2460], while IEEE 802.15.4 lacked
323	   fragmentation mechanisms.  6LoWPAN defines an IEEE 802.15.4 link MTU
324	   of 1280 bytes [RFC4944].  Other technologies, such as Bluetooth LE
325	   [RFC7668], ITU-T G.9959 [RFC7428] or DECT-ULE [RFC8105], also use
326	   6LoWPAN-based adaptation layers in order to enable IPv6 support.
327	   These technologies do support link layer fragmentation.  By
328	   exploiting this functionality, the adaptation layers that enable IPv6
329	   over such technologies also define an MTU of 1280 bytes.

331	   On the other hand, there exist technologies also used in the CNN
332	   space, such as Master-Slave/Token-Passing (MS/TP) [RFC8163],
333	   Narrowband IoT (NB-IoT) [RFC8376] or IEEE 802.11ah
334	   [I-D.delcarpio-6lo-wlanah], that do not suffer the same degree of
335	   frame size limitations as the technologies mentioned above.  The MTU
336	   for MS/TP is recommended to be 1500 bytes [RFC8163], the MTU in
337	   NB-IoT is 1600 bytes, and the maximum frame payload size for IEEE
338	   802.11ah is 7991 bytes.

340	   Note that using larger MSS (to a suitable extent) may be beneficial,
341	   especially when transferring large payloads, as it reduces the number
342	   of packets (and packet headers) required for a given payload.
343	   However, use of MTUs that exceed 1280 bytes (including the typical
344	   1500-byte MTU for communication paths including broader Internet
345	   segments) may incur Path MTU Discovery overhead.

347	4.1.2.  Explicit Congestion Notification (ECN)

349	   Explicit Congestion Notification (ECN) [RFC3168] allows a router
350	   to signal in the IP header of a packet that congestion is arising,
351	   for example when a queue size reaches a certain threshold.  An ECN-
352	   enabled TCP receiver will echo back the congestion signal to the TCP
353	   sender by setting a flag in its next TCP ACK.  The sender triggers
354	   congestion control measures as if a packet loss had happened.
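
   The receiver side of this signalling loop can be sketched as
   follows.  This is an illustrative sketch only, not a complete ECN
   implementation: it assumes that ECN was already negotiated for the
   connection, and the class and method names are hypothetical.  The
   codepoint and flag values are those of [RFC3168], under which the
   receiver keeps echoing the signal until the sender confirms its
   reaction with the CWR flag.

```python
# Sketch of ECN echoing at a TCP receiver (after RFC 3168).
# EcnEchoState and on_segment() are hypothetical names.

IP_ECN_CE = 0b11      # Congestion Experienced codepoint (IP header)
TCP_FLAG_ACK = 0x10   # ACK flag (TCP header)
TCP_FLAG_ECE = 0x40   # ECN-Echo flag (TCP header)
TCP_FLAG_CWR = 0x80   # Congestion Window Reduced flag (TCP header)


class EcnEchoState:
    """Per-connection echo state of an ECN-capable TCP receiver."""

    def __init__(self):
        self.echo_pending = False

    def on_segment(self, ip_ecn_bits, tcp_flags):
        """Process one received segment; return flags for the next ACK."""
        if tcp_flags & TCP_FLAG_CWR:
            # The sender has reacted to an earlier echo: stop echoing.
            self.echo_pending = False
        if ip_ecn_bits == IP_ECN_CE:
            # A router marked this packet: echo the congestion signal.
            self.echo_pending = True
        ack_flags = TCP_FLAG_ACK
        if self.echo_pending:
            ack_flags |= TCP_FLAG_ECE
        return ack_flags
```

   The only per-connection state needed on the receiver side in this
   sketch is a single boolean.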

356	   The document [RFC8087] outlines the principal gains in terms of
357	   increased throughput, reduced delay, and other benefits when ECN is
358	   used over a network path that includes equipment that supports
359	   Congestion Experienced (CE) marking.  In the context of CNNs, a
360	   remarkable feature of ECN is that congestion can be signalled without
361	   incurring packet drops (which will lead to retransmissions and
362	   consumption of limited resources such as energy and bandwidth).

364	   ECN can further reduce packet losses since congestion control
365	   measures can be applied earlier [RFC2884].  Fewer lost packets
366	   imply that the number of retransmitted segments decreases, which is
367	   particularly beneficial in CNNs, where energy and bandwidth resources
368	   are typically limited.  Also, it makes sense to try to avoid packet
369	   drops for transactional workloads with small data sizes, which are
370	   typical for CNNs.  In such traffic patterns, it is more difficult and
371	   often impossible to detect packet loss without retransmission
372	   timeouts (e.g., as there may be no three duplicate ACKs).  Any
373	   retransmission timeout slows down the data transfer significantly.
374	   In addition, if the constrained device uses power saving techniques,
375	   a retransmission timeout will incur a wake-up action, in contrast to
376	   ACK clock-triggered sending.  When the congestion window of a TCP
377	   sender has a size of one segment and a TCP ACK with an ECN signal
378	   (ECE flag) arrives at the TCP sender, the TCP sender resets the
379	   retransmit timer, and the sender will only be able to send a new
380	   packet when the retransmit timer expires.  Effectively, the TCP
381	   sender reduces at that moment its sending rate from 1 segment per
382	   Round Trip Time (RTT) to 1 segment per RTO and reduces the sending
383	   rate further on each ECN signal received in subsequent TCP ACKs.
384	   Otherwise, if an ECN signal is not present in a subsequent TCP ACK
385	   the TCP sender resumes the normal ACK-clocked transmission of
386	   segments [RFC3168].

388	   ECN can be incrementally deployed in the Internet.  Guidance on
389	   configuration and usage of ECN is provided in [RFC7567].  Given the
390	   benefits, more and more TCP stacks in the Internet support ECN, and
391	   it specifically makes sense to leverage ECN in controlled
392	   environments such as CNNs.  Note, however, that supporting ECN
393	   increases implementation complexity.

395	4.1.3.  Explicit loss notifications

397	   There has been a significant body of research on solutions capable of
398	   explicitly indicating whether a TCP segment loss is due to
399	   corruption, in order to avoid activation of congestion control
400	   mechanisms [ETEN] [RFC2757].  While such solutions may provide
401	   significant improvement, they have not been widely deployed and
402	   remain experimental work.  In fact, as of today, the IETF has not
403	   standardized any such solution.

405	4.2.  TCP guidance for single-MSS stacks

407	   This section discusses TCP stacks that allow at most a single MSS of
408	   data in flight.  More general guidance is provided in Section 4.3.

410	4.2.1.  Single-MSS stacks - benefits and issues

412	   A TCP stack can reduce its memory requirements by advertising a TCP
413	   window size of one MSS and by transmitting at most one MSS of
414	   unacknowledged data.  In that case, both congestion and flow control
415	   implementation are quite simple.  Such a small receive and send
416	   window may be sufficient for simple message exchanges in the CNN
417	   space.  However, only using a window of one MSS can significantly
418	   affect performance.  A stop-and-wait operation results in low
419	   throughput for transfers that exceed the length of one MSS, e.g., a
420	   firmware download.  Furthermore, a single-MSS solution relies solely
421	   on timer-based loss recovery, therefore missing the performance gain
422	   of Fast Retransmit and Fast Recovery (which require a larger window
423	   size, see Subsection 4.3.1).
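
   The performance cost of stop-and-wait operation can be estimated
   with simple arithmetic, sketched below.  The sketch assumes a
   loss-free path, immediate ACKs, and a serialization time that is
   negligible compared to the RTT, so it gives a lower bound on the
   transfer time.  The function names and the example figures (a
   100 KiB firmware image and a 0.5 s RTT) are hypothetical.

```python
import math

# Sketch: lower-bound estimates for a single-MSS (stop-and-wait)
# sender.  Assumes no loss, no ACK delay, and that one segment per
# RTT is delivered.


def stop_and_wait_transfer_time(payload_bytes, mss_bytes, rtt_seconds):
    """Seconds needed to deliver payload_bytes, one segment per RTT."""
    segments = math.ceil(payload_bytes / mss_bytes)
    return segments * rtt_seconds


def stop_and_wait_throughput_bps(mss_bytes, rtt_seconds):
    """Upper bound on throughput in bit/s with one MSS per RTT."""
    return 8 * mss_bytes / rtt_seconds


# Hypothetical example: 100 KiB image, MSS of 1220 bytes, RTT of 0.5 s.
print(stop_and_wait_transfer_time(100 * 1024, 1220, 0.5))  # 42.0
print(stop_and_wait_throughput_bps(1220, 0.5))             # 19520.0
```

   In this example the RTT, not the link bitrate, bounds the
   throughput at about 19.5 kbit/s.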

425	   If CoAP is used over TCP with the default setting for NSTART in
426	   [RFC7252], a CoAP endpoint is not allowed to send a new message to a
427	   destination until a response for the previous message sent to that
428	   destination has been received.  This is equivalent to an application-
429	   layer window size of 1 data unit.  For this use of CoAP, a maximum
430	   TCP window of one MSS may be sufficient, as long as the CoAP message
431	   size does not exceed one MSS.  An exception in CoAP over TCP, though,
432	   is the Capabilities and Settings Message (CSM) that must be sent at
433	   the start of the TCP connection.  The first application message
434	   carrying user data is allowed to be sent immediately after the CSM
435	   message.  If the sum of the CSM size plus the application message
436	   size exceeds the MSS, a sender using a single-MSS stack will need to
437	   wait for the ACK confirming the CSM before sending the application
438	   message.
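
   The condition described above can be written as a one-line check.
   This is a sketch only.  The function name and the byte counts in
   the usage lines are hypothetical and are not taken from [RFC8323].

```python
# Sketch: does a single-MSS sender have to wait for the ACK of the
# CSM before sending its first application message?


def must_wait_for_csm_ack(csm_bytes, first_message_bytes, mss_bytes):
    """True if CSM plus first message exceed one MSS of unacked data."""
    return csm_bytes + first_message_bytes > mss_bytes


# Hypothetical sizes in bytes.
print(must_wait_for_csm_ack(10, 100, 1220))   # False
print(must_wait_for_csm_ack(10, 1215, 1220))  # True
```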

440	4.2.2.  TCP options for single-MSS stacks

442	   A TCP implementation needs to support, at a minimum, TCP options 2, 1
443	   and 0.  These are, respectively, the Maximum Segment Size (MSS)
444	   option, the No-Operation option, and the End Of Option List marker
445	   [RFC0793].  None of these are a substantial burden to support.  These
446	   options are sufficient for interoperability with a standard-compliant
447	   TCP endpoint, although many TCP stacks support additional options and
448	   can negotiate their use.  A TCP implementation is permitted to
449	   silently ignore all other TCP options.
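
   A minimal option parser along these lines is sketched below.  It
   handles only option kinds 0, 1, and 2 and skips every other option
   by its length field.  The function name and the error handling are
   hypothetical, and a real stack would typically do this in C on the
   received header buffer.

```python
# Sketch: parse the options area of a TCP header, honouring only
# End of Option List (0), No-Operation (1), and MSS (2).

OPT_EOL = 0   # End of Option List, single byte
OPT_NOP = 1   # No-Operation, single byte
OPT_MSS = 2   # Maximum Segment Size, length 4, 16-bit value


def parse_tcp_options(options):
    """Return the peer's MSS from raw option bytes, or None if absent."""
    mss = None
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == OPT_EOL:
            break
        if kind == OPT_NOP:
            i += 1
            continue
        # All other options carry a length byte covering kind+length.
        if i + 1 >= len(options):
            raise ValueError("truncated option")
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            raise ValueError("malformed option length")
        if kind == OPT_MSS and length == 4:
            mss = int.from_bytes(options[i + 2:i + 4], "big")
        # Any other option kind is silently ignored.
        i += length
    return mss


# MSS option announcing 1220 bytes (0x04C4).
print(parse_tcp_options(bytes([2, 4, 0x04, 0xC4])))  # 1220
```

   Unknown options cost only a length check and a pointer advance in
   this sketch.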

451	   A TCP implementation for a constrained device that uses a single-MSS
452	   TCP receive or transmit window size may not benefit from supporting
453	   the following TCP options: Window scale [RFC7323], TCP Timestamps
454	   [RFC7323], Selective Acknowledgments (SACK) and SACK-Permitted
455	   [RFC2018].  Also other TCP options may not be required on a
456	   constrained device with a very lightweight implementation.  With
457	   regard to the Window scale option, note that it is only useful if a
458	   window size greater than 64 kB is needed.

460	   Note that a TCP sender can benefit from the TCP Timestamps option
461	   [RFC7323] in detecting spurious RTOs.  The latter are quite likely to
462	   occur in CNN scenarios due to a number of reasons (e.g. route changes
463	   in a multihop scenario, link layer retries, etc.).  The header
464	   overhead incurred by the Timestamps option (of up to 12 bytes) needs
465	   to be taken into account.

467	   One potentially relevant TCP option in the context of CNNs is TCP
468	   Fast Open (TFO) [RFC7413].  As described in Section 5.3, TFO can be
469	   used to address the problem of traversing middleboxes that perform
470	   early filter state record deletion.

472	4.2.3.  Delayed Acknowledgments for single-MSS stacks

474	   TCP Delayed Acknowledgments are meant to reduce the number of ACKs
475	   sent within a TCP connection, thus reducing network overhead, but
476	   they may increase the time until a sender may receive an ACK.  In
477	   general, usefulness of Delayed ACKs depends heavily on the usage
478	   scenario (see subsection 4.3.2).  There can be interactions with
479	   single-MSS stacks.

481	   When traffic is unidirectional, if the sender can send at most one
482	   MSS of data or the receiver advertises a receive window not greater
483	   than the MSS, Delayed ACKs may unnecessarily contribute delay (up to
484	   500 ms) to the RTT [RFC5681], which limits the throughput and can
485	   increase data delivery time.  Note that, in some cases, it may not be
486	   possible to disable Delayed ACKs.  One known workaround is to split
487	   the data to be sent into two segments of smaller size.  A
488	   standard-compliant TCP receiver may immediately acknowledge the
489	   second segment, which can improve throughput.  However, this 'split
490	   hack' may not always work, since a TCP receiver is required to
491	   acknowledge every second full-sized segment, but not two consecutive
492	   small segments.  The overhead of sending two IP packets instead of
493	   one is another downside of the 'split hack'.

495	   Similar issues may happen when the sender uses the Nagle algorithm,
496	   since the sender may need to wait for an unnecessarily delayed ACK to
497	   send a new segment.  Disabling the algorithm will have no impact if
498	   the sender can only handle stop-and-wait operation at the TCP level.

500	   For request-response traffic, when the receiver uses Delayed ACKs, a
501	   response to a data message can piggyback an ACK, as long as the
502	   latter is sent before the Delayed ACK timer expires, thus avoiding
503	   unnecessary ACKs without payload.  Disabling Delayed ACKs at the
504	   request sender allows an immediate ACK for the data segment carrying
505	   the response.

507	4.2.4.  RTO calculation for single-MSS stacks

509	   The Retransmission Timeout (RTO) calculation is one of the
510	   fundamental TCP algorithms [RFC6298].  There is a fundamental trade-
511	   off: A short, aggressive RTO behavior reduces wait time before
512	   retransmissions, but it also increases the probability of spurious
513	   timeouts.  The latter lead to unnecessary waste of potentially scarce
514	   resources in CNNs such as energy and bandwidth.  In contrast, a
515	   conservative timeout can result in long error recovery times and thus
516	   needlessly delay data delivery.

518	   If a TCP sender uses a very small window size, and it cannot benefit
519	   from Fast Retransmit/Fast Recovery or SACK, the RTO algorithm has a
520	   large impact on performance.  In that case, RTO algorithm tuning may
521	   be considered, although careful assessment of possible drawbacks is
522	   recommended [I-D.ietf-tcpm-rto-consider].
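
   For reference, the standard RTO computation of [RFC6298] can be
   sketched as follows.  This is a minimal Python illustration, not
   code from any of the stacks discussed in this document; the class
   name, the default clock granularity of 100 ms and the 60-second
   upper bound are assumptions chosen for the example.

```python
class RtoEstimator:
    """Retransmission timer state following the rules of RFC 6298."""

    ALPHA = 1 / 8  # gain applied to new samples for SRTT
    BETA = 1 / 4   # gain applied to new samples for RTTVAR
    K = 4          # RTTVAR multiplier

    def __init__(self, granularity_s=0.1, min_rto_s=1.0, max_rto_s=60.0):
        self.g = granularity_s      # clock granularity (assumed value)
        self.min_rto = min_rto_s    # RFC 6298 rounds RTO up to 1 s
        self.max_rto = max_rto_s    # optional cap, at least 60 s
        self.srtt = None
        self.rttvar = None
        self.rto = 1.0              # initial RTO before any sample

    def on_rtt_sample(self, r):
        """Update SRTT, RTTVAR and RTO with an RTT measurement r (s)."""
        if self.srtt is None:
            # First measurement.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - r))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.g, self.K * self.rttvar)
        self.rto = min(self.max_rto, max(self.min_rto, rto))
        return self.rto

    def on_timeout(self):
        """Back off the timer after a retransmission timeout."""
        self.rto = min(self.max_rto, self.rto * 2)
        return self.rto
```

   Any tuning for a single-MSS stack would change the constants or the
   lower bound used here, which is where the trade-off between
   spurious timeouts and recovery time shows up.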

524	   As an example, adaptive RTO algorithms defined for CoAP over UDP have
525	   been found to perform well in CNN scenarios [Commag]
526	   [I-D.jarvinen-core-fasor].

528	4.3.  General recommendations for TCP in CNNs

530	   This section summarizes some widely used techniques to improve TCP,
531	   with a focus on their use in CNNs.  The TCP extensions discussed here
532	   are useful in a wide range of network scenarios, including CNNs.
533	   This section is not comprehensive.  A comprehensive survey of TCP
534	   extensions is published in [RFC7414].

536	4.3.1.  Loss recovery and congestion/flow control

538	   Devices that have enough memory to allow a larger (i.e. more than 3
539	   MSS of data) TCP window size can leverage a more efficient loss
540	   recovery than the timer-based approach used for smaller TCP window
541	   size (see Subsection 3.2.1) by using Fast Retransmit and Fast
542	   Recovery [RFC5681], at the expense of slightly greater complexity and
543	   Transmission Control Block (TCB) size.  Assuming that Delayed ACKs
544	   are used by the receiver, a window size of at least 5 MSS is required
545	   for Fast Retransmit and Fast Recovery to work efficiently: If, in a
546	   given TCP transmission of full-sized segments 1 through 6,
547	   segment 2 gets lost, and the ACK for segment 1 is held by the Delayed
548	   ACK timer, then the sender should get an ACK for segment 1 when 3
549	   arrives and duplicate ACKs when segments 4, 5, and 6 arrive.  It will
550	   retransmit segment 2 when the third duplicate ACK arrives.  In order
551	   to have segments 2, 3, 4, 5, and 6 sent, the window has to be of at
552	   least 5 MSS.  With an MSS of 1220 bytes, a buffer of a size of 5 MSS
553	   would require 6100 bytes.
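
   The loss scenario above (segment 2 lost, Delayed ACKs at the
   receiver) can be reproduced with a small Python model.  This is a
   simplified sketch: segments are numbered from 1, "ACK n" means that
   segment n is the next one expected, and the Delayed ACK timer is
   assumed not to expire during the burst.  The function name is
   hypothetical.

```python
def count_duplicate_acks(num_segments, lost_segment):
    """Count duplicate ACKs seen by the sender for one lost segment."""
    # Receiver side: delay the ACK for an in-order segment until a
    # second in-order segment arrives; ACK out-of-order segments at once.
    next_expected = 1
    ack_pending = False
    acks = []
    for seg in range(1, num_segments + 1):
        if seg == lost_segment:
            continue
        if seg == next_expected:
            next_expected += 1
            if ack_pending:
                acks.append(next_expected)
                ack_pending = False
            else:
                ack_pending = True
        else:
            acks.append(next_expected)
            ack_pending = False

    # Sender side: an ACK that does not advance snd_una is a duplicate.
    snd_una = 1
    duplicates = 0
    for ack in acks:
        if ack > snd_una:
            snd_una = ack
            duplicates = 0
        else:
            duplicates += 1
    return duplicates


print(count_duplicate_acks(6, lost_segment=2))  # segments 1..6 sent
print(count_duplicate_acks(5, lost_segment=2))  # segments 1..5 sent
print(5 * 1220)                                 # buffer for 5 MSS, bytes
```

   In this model, the third duplicate ACK is only produced once segment
   6 has been delivered, which is why segments 2 through 6 have to fit
   in the window.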

555	   The example in the previous paragraph did not use a further TCP
556	   improvement such as Limited Transmit [RFC3042].  The latter may also
557	   be useful for any transfer that has more than one segment in flight.
558	   Small transfers tend to benefit more from Limited Transmit, because
559	   they are more likely to not receive enough duplicate ACKs.  Assuming
560	   the example in the previous paragraph, Limited Transmit allows
561	   sending 5 MSS with a congestion window (cwnd) of 3 segments, plus two
562	   additional segments for the first two duplicate ACKs.  With Limited
563	   Transmit, even a cwnd of 2 segments allows sending 5 MSS, at the
564	   expense of additional delay contributed by the Delayed ACK timer for
565	   the ACK that confirms segment 1.

567	   When a multiple-segment window is used, the receiver will need to
568	   manage the reception of possible out-of-order received segments,
569	   requiring sufficient buffer space.

571	4.3.1.1.  Selective Acknowledgments (SACK)

573	   If a device with less severe memory and processing constraints can
574	   afford advertising a TCP window size of several MSS, it makes sense
575	   to support the SACK option to improve performance.  SACK allows a
576	   data receiver to inform the data sender of non-contiguous data blocks
577	   received, so that a sender (having previously sent the SACK-Permitted
578	   option) can avoid performing unnecessary retransmissions, saving
579	   energy and bandwidth, as well as reducing latency.  In addition, SACK
580	   often allows for faster loss recovery when there is more than one
581	   lost segment in a window of data, since with SACK, recovery may
582	   complete in fewer RTTs.  SACK is particularly useful for bulk data
583	   transfers.  A receiver supporting SACK will need to keep track of the
584	   SACK blocks that need to be received.  The sender will also need to
585	   keep track of which data segments need to be resent after learning
586	   which data blocks are missing at the receiver.  SACK adds 8*n+2 bytes
587	   to the TCP header, where n denotes the number of data blocks
588	   received, up to 4 blocks.  For a low number of out-of-order segments,
589	   the header overhead penalty of SACK is compensated by avoiding
590	   unnecessary retransmissions.  When the sender discovers the data
591	   blocks that have already been received, it needs to also store the
592	   necessary state to avoid unnecessary retransmission of data segments
593	   that have already been received.
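
   The header overhead stated above (8*n+2 bytes for n blocks, up to 4
   blocks) works out as follows.  The helper name is an assumption made
   for this sketch.

```python
def sack_option_length(num_blocks):
    """Length in bytes of a SACK option carrying num_blocks blocks."""
    if not 1 <= num_blocks <= 4:
        raise ValueError("a SACK option carries 1 to 4 blocks")
    # 2 bytes of kind and length, plus two 32-bit edges per block.
    return 8 * num_blocks + 2


for n in range(1, 5):
    print(n, "block(s):", sack_option_length(n), "bytes")
```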

595	4.3.2.  Delayed Acknowledgments

597	   For certain traffic patterns, Delayed ACKs may have a detrimental
598	   effect, as already noted in Section 4.2.3.  Advanced TCP stacks may
599	   use heuristics to determine the maximum delay for an ACK.  For CNNs,
600	   the recommendation depends on the expected communication patterns.

602	   When traffic over a CNN is expected to mostly be unidirectional
603	   messages with a size typically up to one MSS, and the time between
604	   two consecutive message transmissions is greater than the Delayed ACK
605	   timeout, it may make sense to use a small timeout or disable Delayed
606	   ACKs at the receiver.  This avoids incurring additional delay, as
607	   well as the energy consumption of the sender (which might e.g. keep
608	   its radio interface in receive mode) during that time.  Note that
609	   disabling Delayed ACKs may only be possible if the peer device is
610	   administered by the same entity managing the constrained device.  For
611	   request-response traffic, enabling Delayed ACKs is recommended at the
612	   server end, in order to allow combining a response with the ACK into
613	   a single segment, thus increasing efficiency.  In addition, if a
614	   client issues requests infrequently, disabling Delayed ACKs at the
615	   client allows an immediate ACK for the data segment carrying the
616	   response.
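
   Where the receiver is under the same administration and runs Linux,
   one way to ask for immediate ACKs per connection is the
   TCP_QUICKACK socket option.  The Python sketch below assumes such a
   platform; the helper name is hypothetical, and other systems may
   offer no equivalent or only a system-wide setting.  On Linux the
   option is not permanent, so applications typically set it again
   after receiving data.

```python
import socket


def request_immediate_acks(sock):
    """Ask the local stack to send ACKs without delay, if supported.

    Returns True if the option was set, False if it is unavailable.
    """
    option = getattr(socket, "TCP_QUICKACK", None)  # Linux-specific
    if option is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, option, 1)
    except OSError:
        return False
    return True
```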

618	   In contrast, Delayed ACKs help reduce the number of ACKs in bulk
619	   transfer type of traffic, e.g. for firmware/software updates or for
620	   transferring larger data units containing a batch of sensor readings.

622	   Note that, in many scenarios, the peer that a constrained device
623	   communicates with will be a general purpose system that communicates
624	   with both constrained and unconstrained devices.  Since Delayed ACKs
625	   are often configured through system-wide parameters, Delayed ACK
626	   behavior at the peer will be the same regardless of the nature of the
627	   endpoints it talks to.  Such a peer will typically have Delayed ACKs
628	   enabled.

630	4.3.3.  Initial Window

632	   RFC 5681 specifies a TCP Initial Window (IW) of roughly 4 kB
633	   [RFC5681].  Subsequently, RFC 6928 defined an experimental new value
634	   for the IW, which in practice will result in an IW of 10 MSS
635	   [RFC6928].  The latter is nowadays used in many TCP implementations.

637	   Note that a 10-MSS IW was recommended for resource-rich environments
638	   (e.g. broadband environments), which are significantly different from
639	   CNNs.  In CNNs, many application layer data units are relatively
640	   small (e.g. below one MSS).  However, larger objects (e.g. large
641	   files containing sensor readings, firmware updates, etc.) may also
642	   need to be transferred in CNNs.  If such a large object is
643	   transferred in CNNs, with an IW setting of 10 MSS, there is
644	   significant buffer overflow risk.  To avoid such a problem, in CNNs
645	   the IW needs to be carefully set, based on device and network
646	   resource constraints.  In many cases, a safe IW setting will be
647	   smaller than 10 MSS.
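
   The two IW definitions mentioned above can be compared numerically.
   The first two functions in the Python sketch below follow the
   formulas of [RFC5681] and [RFC6928]; the third one is a hypothetical
   cap based on the buffer space of a constrained receiver and is only
   one possible way to derive a smaller IW.

```python
def iw_rfc5681(smss):
    """Initial window in bytes as specified in RFC 5681."""
    if smss > 2190:
        return 2 * smss
    if smss > 1095:
        return 3 * smss
    return 4 * smss


def iw_rfc6928(mss):
    """Experimental initial window in bytes as defined in RFC 6928."""
    return min(10 * mss, max(2 * mss, 14600))


def iw_capped(mss, base_iw, peer_buffer_bytes):
    """Hypothetical cap: whole segments that fit in the peer buffer,
    but never less than one segment."""
    segments = max(1, min(base_iw, peer_buffer_bytes) // mss)
    return segments * mss


mss = 1220
print(iw_rfc5681(mss), iw_rfc6928(mss))
print(iw_capped(mss, iw_rfc6928(mss), peer_buffer_bytes=5 * mss))
```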

649	5.  TCP usage recommendations in CNNs

651	   This section discusses how TCP can be used by applications that are
652	   developed for CNN scenarios.  These remarks are by and large
653	   independent of how TCP is exactly implemented.

655	5.1.  TCP connection initiation

657	   In the constrained device to unconstrained device scenario
658	   illustrated above, a TCP connection is typically initiated by the
659	   constrained device, in order for this device to support possible
660	   sleep periods to save energy.

662	5.2.  Number of concurrent connections

664	   TCP endpoints with a small amount of memory may only support a small
665	   number of connections.  Each TCP connection requires storing a number
666	   of variables in the TCB.  Depending on the internal TCP
667	   implementation, each connection may result in further memory
668	   overhead, and connections may compete for scarce resources (e.g.,
669	   further memory overhead for send and receive buffers).

671	   A careful application design may try to keep the number of concurrent
672	   connections as small as possible.  A client can for instance limit
673	   the number of simultaneous open connections that it maintains to a
674	   given server.  Multiple connections could for instance be used to
675	   avoid the "head-of-line blocking" problem in an application transfer.
676	   However, in addition to consuming resources, using multiple
677	   connections can also cause undesirable side effects in congested
678	   networks.  For example, the HTTP/1.1 specification encourages clients
679	   to be conservative when opening multiple connections [RFC7230].
680	   Furthermore, each new connection will start with a 3-way handshake,
681	   therefore increasing message overhead.

683	   Being conservative when opening multiple TCP connections is of
684	   particular importance in Constrained-Node Networks.

686	5.3.  TCP connection lifetime

688	   In order to minimize message overhead, it makes sense to keep a TCP
689	   connection open as long as the two TCP endpoints have more data to
690	   send.  If applications exchange data rather infrequently, i.e., if
691	   TCP connections would stay idle for a long time, the idle time can
692	   result in problems.  For instance, certain middleboxes such as
693	   firewalls or NAT devices are known to delete state records after an
694	   inactivity interval.  RFC 5382 specifies a minimum value for such an
695	   interval of 124 minutes.  Measurement studies have reported that TCP
696	   NAT binding timeouts are highly variable across devices, with a
697	   median around 60 minutes, the shortest timeout being around 2
698	   minutes, and more than 50% of the devices with a timeout shorter than
699	   the aforementioned minimum timeout of 124 minutes [HomeGateway].  The
700	   timeout duration used by a middlebox implementation may not be known
701	   to the TCP endpoints.

703	   In CNNs, such middleboxes may e.g. be present at the boundary between
704	   the CNN and other networks.  If the middlebox can be optimized for
705	   CNN use cases, it makes sense to increase the initial value for
706	   filter state inactivity timers to avoid problems with idle
707	   connections.  Apart from that, this problem can be dealt with by
708	   different connection handling strategies, each having pros and cons.

710	   One approach for infrequent data transfer is to use short-lived TCP
711	   connections.  Instead of trying to maintain a TCP connection for a
712	   long time, short-lived connections can be opened between two
713	   endpoints and closed when no more data needs to be exchanged.
714	   For use cases that can cope with the additional messages and the
715	   latency resulting from starting new connections, it is recommended to
716	   use a sequence of short-lived connections, instead of maintaining a
717	   single long-lived connection.

719	   The message and latency overhead that stems from using a sequence of
720	   short-lived connections could be reduced by TCP Fast Open (TFO)
721	   [RFC7413], which is an experimental TCP extension, at the expense of
722	   increased implementation complexity and increased TCP Control Block
723	   (TCB) size.  TFO allows data to be carried in SYN (and SYN-ACK)
724	   segments, and to be consumed immediately by the receiving endpoint.
725	   This reduces the message and latency overhead compared to the
726	   traditional three-way handshake to establish a TCP connection.  For
727	   security reasons, the connection initiator has to request a TFO
728	   cookie from the other endpoint.  The cookie, with a size of 4 or 16
729	   bytes, is then included in SYN packets of subsequent connections.
730	   The cookie needs to be refreshed (and obtained by the client) after a
731	   certain amount of time.  Nevertheless, TFO is more efficient than
732	   frequently opening new TCP connections with the traditional three-way
733	   handshake, as long as the cookie can be reused in subsequent
734	   connections.  However, as stated in RFC 7413, TFO deviates from the
735	   standard TCP semantics, since the data in the SYN could be replayed
736	   to an application in some rare circumstances.  Applications should
737	   not use TFO unless they can tolerate this issue, e.g., by using
738	   Transport Layer Security (TLS) [RFC7413].  A comprehensive discussion
739	   on TFO can be found in RFC 7413.

741	   Another approach is to use long-lived TCP connections with
742	   application-layer heartbeat messages.  Various application protocols
743	   support such heartbeat messages (e.g.  CoAP over TCP [RFC8323]).
744	   Periodic application-layer heartbeats can prevent early filter state
745	   record deletion in middleboxes.  If the TCP binding timeout for a
746	   middlebox to be traversed by a given connection is known, middlebox
747	   filter state deletion will be avoided if the heartbeat period is
748	   shorter than the middlebox TCP binding timeout.  Otherwise, the
749	   implementer needs to take into account that middlebox TCP binding
750	   timeouts fall in a wide range of possible values [HomeGateway], and
751	   it may be hard to find a proper heartbeat period for application-
752	   layer heartbeat messages.
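
   A simple way to express the rule above is to derive the heartbeat
   period from the binding timeout with a safety margin.  The Python
   sketch below is only an illustration: the function name, the safety
   factor of 0.5 and the 60-second fallback for an unknown timeout are
   assumptions, not recommendations from this document.

```python
def heartbeat_period_s(binding_timeout_s=None, safety_factor=0.5,
                       fallback_s=60.0):
    """Pick a heartbeat period shorter than the binding timeout.

    If the middlebox timeout is unknown (None), return a conservative
    fallback value chosen by the implementer.
    """
    if not 0 < safety_factor < 1:
        raise ValueError("safety_factor must be between 0 and 1")
    if binding_timeout_s is None:
        return fallback_s
    return binding_timeout_s * safety_factor


print(heartbeat_period_s(124 * 60))  # timeout known: 124 minutes
print(heartbeat_period_s())          # timeout unknown
```

   A shorter period keeps state alive in more middleboxes but costs
   more transmissions, so the fallback needs to be weighed against the
   energy budget of the device.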

754	   One specific advantage of heartbeat messages is that they also allow
755	   aliveness checks at the application level.  In general, it makes
756	   sense to realize aliveness checks at the highest protocol layer
757	   possible that is meaningful to the application, in order to maximize
758	   the depth of the aliveness check.  In addition, timely detection of a
759	   dead peer may allow savings in terms of TCB memory use.  However, the
760	   transmission of heartbeat messages consumes resources.  This aspect
761	   needs to be assessed carefully, considering the characteristics of
762	   each specific CNN.

764	   A TCP implementation may also be able to send "keep-alive" segments
765	   to test a TCP connection.  According to [RFC1122], "keep-alives" are
766	   an optional TCP mechanism that is turned off by default, i.e., an
767	   application must explicitly enable it for a TCP connection.  The
768	   interval between "keep-alive" messages must be configurable and it
769	   must default to no less than two hours.  With this large timeout, TCP
770	   keep-alive messages might not always be useful to avoid deletion of
771	   filter state records in some middleboxes.  However, sending TCP keep-
772	   alive probes more frequently risks draining power on energy-
773	   constrained devices.
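
   On systems with a sockets API, an application enables keep-alives
   explicitly per connection.  The Python sketch below assumes a
   platform that exposes SO_KEEPALIVE; the TCP_KEEPIDLE, TCP_KEEPINTVL
   and TCP_KEEPCNT tuning options are platform dependent and are
   skipped when missing.  The helper name and the example values are
   hypothetical.

```python
import socket


def enable_keepalive(sock, idle_s, interval_s, probes):
    """Turn on TCP keep-alives and tune them where the OS allows it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle time before first probe
        ("TCP_KEEPINTVL", interval_s),  # time between probes
        ("TCP_KEEPCNT", probes),        # unanswered probes tolerated
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock, idle_s=600, interval_s=30, probes=3)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
sock.close()
```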

775	6.  Security Considerations

777	   Best current practice for securing TCP and TCP-based communication
778	   also applies to CNNs.  As an example, use of Transport Layer Security
779	   (TLS) is strongly recommended if it is applicable.

781	   There are also TCP options which can improve TCP security.  One
782	   example is the TCP Authentication Option (TCP-AO) [RFC5925].
783	   However, this option adds overhead and complexity.  TCP-AO typically
784	   has a size of 16-20 bytes.

786	   For the mechanisms discussed in this document, the corresponding
787	   considerations apply.  For instance, if TFO is used, the security
788	   considerations of [RFC7413] apply.

790	   Constrained devices are expected to support smaller TCP window sizes
791	   than less limited devices.  In such conditions, segment
792	   retransmission triggered by RTO expiration is expected to be
793	   relatively frequent, due to lack of (enough) duplicate ACKs,
794	   especially when a constrained device uses a single-MSS
795	   implementation.  For this reason, constrained devices running TCP may
796	   appear as particularly appealing victims of the so-called "shrew"
797	   Denial of Service (DoS) attack [shrew], whereby one or more sources
798	   generate a packet spike targeted to coincide with consecutive RTO-
799	   expiration-triggered retry attempts of a victim node.  Note that the
800	   attack may be performed by Internet-connected devices, including
801	   constrained devices in the same CNN as the victim, as well as remote
802	   ones.  Mitigation techniques include RTO randomization and attack
803	   blocking by routers able to detect shrew attacks based on their
804	   traffic pattern.
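
   As an illustration of RTO randomization, the Python sketch below
   stretches a computed RTO by a random factor so that retransmission
   instants become harder to predict.  The function name, the uniform
   distribution and the spread of 0.5 are assumptions made for this
   example; other randomization schemes are possible.

```python
import random


def randomized_rto(rto_s, spread=0.5, rng=random):
    """Return rto_s scaled by a random factor in [1, 1 + spread].

    The timer is only ever lengthened, so the randomization does not
    by itself make spurious timeouts more likely.
    """
    if rto_s <= 0 or spread < 0:
        raise ValueError("rto_s must be positive, spread non-negative")
    return rto_s * (1.0 + rng.uniform(0.0, spread))
```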

806	7.  Acknowledgments

808	   Carles Gomez has been funded in part by the Spanish Government
809	   (Ministerio de Educacion, Cultura y Deporte) through the Jose
810	   Castillejo grants CAS15/00336 and CAS18/00170, and by European
811	   Regional Development Fund (ERDF) and the Spanish Government through
812	   projects TEC2016-79988-P, PID2019-106808RA-I00, AEI/FEDER, UE, and by
813	   Generalitat de Catalunya Grant 2017 SGR 376.  Part of his
814	   contribution to this work has been carried out during his stays as a
815	   visiting scholar at the Computer Laboratory of the University of
816	   Cambridge.

818	   The authors appreciate the feedback received for this document.  The
819	   following folks provided comments that helped improve the document:
820	   Carsten Bormann, Zhen Cao, Wei Genyu, Ari Keranen, Abhijan
821	   Bhattacharyya, Andres Arcia-Moret, Yoshifumi Nishida, Joe Touch, Fred
822	   Baker, Nik Sultana, Kerry Lynn, Erik Nordmark, Markku Kojo, Hannes
823	   Tschofenig, David Black, Ilpo Jarvinen, Emmanuel
824	   Baccelli, Stuart Cheshire, Gorry Fairhurst, and Ingemar Johansson.
825	   Simon Brummer provided details, and kindly performed RAM and ROM
826	   usage measurements, on the RIOT TCP implementation.  Xavi Vilajosana
827	   provided details on the OpenWSN TCP implementation.  Rahul Jadhav
828	   kindly performed code size measurements on the Contiki-NG and lwIP
829	   2.1.2 TCP implementations.  He also provided details on the uIP TCP
830	   implementation.

832	8.  Annex.  TCP implementations for constrained devices

834	   This section overviews the main features of TCP implementations for
835	   constrained devices.  The survey is limited to open source stacks
836	   with small footprint.  It is not meant to be all-encompassing.  For
837	   more powerful embedded systems (e.g., with 32-bit processors), there
838	   are further stacks that comprehensively implement TCP.  On the other
839	   hand, please be aware that this Annex is based on information
840	   available as of the writing.

842	8.1.  uIP

844	   uIP is a TCP/IP stack, targeted at 8- and 16-bit microcontrollers,
845	   which pioneered TCP/IP implementations for constrained devices.  uIP
846	   has been deployed with Contiki and the Arduino Ethernet shield.  A
847	   code size of ~5 kB (which comprises checksumming, IPv4, ICMP and TCP)
848	   has been reported for uIP [Dunk].  Later versions of uIP implement
849	   IPv6 as well.

851	   uIP uses the same global buffer for both incoming and outgoing
852	   traffic, which has a size of a single packet.  In case of a
853	   retransmission, an application must be able to reproduce the same
854	   user data that had been transmitted.  Multiple connections are
855	   supported, but need to share the global buffer.

857	   The MSS is announced via the MSS option on connection establishment
858	   and the receive window size (of one MSS) is not modified during a
859	   connection.  Stop-and-wait operation is used for sending data.  Among
860	   other optimizations, this avoids sliding window operations,
861	   which use 32-bit arithmetic extensively and are expensive on 8-bit
862	   CPUs.

864	   Contiki uses the "split hack" technique (see Section 4.2.3) to avoid
865	   Delayed ACKs for senders using a single segment.

867	   The code size of the TCP implementation in Contiki-NG has been
868	   measured to be 3.2 kB on CC2538DK, cross-compiling on Linux.

870	8.2.  lwIP

872	   lwIP is a TCP/IP stack, targeted at 8- and 16-bit microcontrollers.
873	   lwIP has a total code size of ~14 kB to ~22 kB (which comprises
874	   memory management, checksumming, network interfaces, IPv4, ICMP and
875	   TCP), and a TCP code size of ~9 kB to ~14 kB [Dunk].  Both IPv4 and
876	   IPv6 are supported in lwIP since v2.0.0.

878	   In contrast with uIP, lwIP decouples applications from the network
879	   stack. lwIP supports a TCP transmission window greater than a single
880	   segment, as well as buffering of incoming and outgoing data.  Other
881	   implemented mechanisms comprise slow start, congestion avoidance,
882	   fast retransmit and fast recovery.  SACK and Window Scale support has
883	   been recently added to lwIP.

885	8.3.  RIOT

887	   The RIOT TCP implementation (called GNRC TCP) has been designed for
888	   Class 1 devices [RFC7228].  The main target platforms are 8- and
889	   16-bit microcontrollers, with 32-bit platforms also supported.  GNRC
890	   TCP offers a function set similar to that of uIP, but it provides and
891	   maintains an independent receive buffer for each connection.  In
892	   contrast to uIP, retransmission is also handled by GNRC TCP.  For
893	   simplicity, GNRC TCP uses a single-MSS implementation.  The
894	   application programmer does not need to know anything about the TCP
895	   internals, therefore GNRC TCP can be seen as a user-friendly uIP TCP
896	   implementation.

898	   The MSS is set on connection establishment and cannot be changed
899	   during connection lifetime.  GNRC TCP allows multiple connections in
900	   parallel, but each TCB must be allocated somewhere in the system.  By
901	   default there is only enough memory allocated for a single TCP
902	   connection, but it can be increased at compile time if the user needs
903	   multiple parallel connections.

905	   The RIOT TCP implementation offers an optional POSIX socket wrapper
906	   that enables POSIX compliance, if needed.

908	   Further details on RIOT and GNRC can be found in the literature
909	   [RIOT], [GNRC].

911	8.4.  TinyOS

913	   TinyOS was important as a platform for early constrained devices.
914	   TinyOS has an experimental TCP stack that uses a simple nonblocking
915	   library-based implementation of TCP, which provides a subset of the
916	   socket interface primitives.  The application is responsible for
917	   buffering.  The TCP library does not do any receive-side buffering.
918	   Instead, it will immediately dispatch new, in-order data to the
919	   application and otherwise drop the segment.  A send buffer is
920	   provided by the application.  Multiple TCP connections are possible.
921	   Recently there has been little further work on the stack.

923	8.5.  FreeRTOS

925	   FreeRTOS is a real-time operating system kernel for embedded devices
926	   that is supported by 16- and 32-bit microprocessors.  Its TCP
927	   implementation is based on multiple-segment window size, although a
928	   'Tiny-TCP' option, which is a single-MSS variant, can be enabled.
929	   Delayed ACKs are supported, with a 20-ms Delayed ACK timer as a
930	   technique intended 'to gain performance'.

932	8.6.  uC/OS

934	   uC/OS is a real-time operating system kernel for embedded devices,
935	   which is maintained by Micrium. uC/OS is intended for 8-, 16- and
936	   32-bit microprocessors.  The uC/OS TCP implementation supports a
937	   multiple-segment window size.

939	8.7.  Summary
940	                        +---+---------+--------+----+------+--------+-----+
941	                        |uIP|lwIP orig|lwIP 2.1|RIOT|TinyOS|FreeRTOS|uC/OS|
942	   +------+-------------+---+---------+--------+----+------+--------+-----+
943	   |Memory|Code size(kB)| <5|~9 to ~14|   38   | <7 | N/A  |  <9.2  | N/A |
944	   |      |             |(a)|   (T1)  |  (T4)  |(T3)|      |  (T2)  |     |
945	   +------+-------------+---+---------+--------+----+------+--------+-----+
946	   |      | Single-Segm.|Yes|    No   |   No   | Yes|  No  |   No   |  No |
947	   |      +-------------+---+---------+--------+----+------+--------+-----+
948	   |      |  Slow start | No|   Yes   |   Yes  | No | Yes  |   No   | Yes |
949	   |  T   +-------------+---+---------+--------+----+------+--------+-----+
950	   |  C   |Fast rec/retx| No|   Yes   |   Yes  | No | Yes  |   No   | Yes |
951	   |  P   +-------------+---+---------+--------+----+------+--------+-----+
952	   |      |  Keep-alive | No|    No   |   Yes  | No |  No  |  Yes   | Yes |
953	   |      +-------------+---+---------+--------+----+------+--------+-----+
954	   |  f   |  Win. Scale | No|    No   |   Yes  | No |  No  |  Yes   |  No |
955	   |  e   +-------------+---+---------+--------+----+------+--------+-----+
956	   |  a   |  TCP timest.| No|    No   |   Yes  | No |  No  |  Yes   |  No |
957	   |  t   +-------------+---+---------+--------+----+------+--------+-----+
958	   |  u   |      SACK   | No|    No   |   Yes  | No |  No  |  Yes   |  No |
959	   |  r   +-------------+---+---------+--------+----+------+--------+-----+
960	   |  e   |  Del. ACKs  | No|   Yes   |   Yes  | No |  No  |  Yes   | Yes |
961	   |  s   +-------------+---+---------+--------+----+------+--------+-----+
962	   |      |     Socket  | No|    No   |Optional|(I) |Subset|  Yes   | Yes |
963	   |      +-------------+---+---------+--------+----+------+--------+-----+
964	   |      |Concur. Conn.|Yes|   Yes   |   Yes  | Yes| Yes  |  Yes   | Yes |
965	   +------+-------------+---+---------+--------+----+------+--------+-----+
966	   |    TLS supported   | No|    No   |   Yes  | Yes| Yes  |  Yes   | Yes |
967	   +--------------------+---+---------+--------+----+------+--------+-----+

969	     (T1)  = TCP-only, on x86 and AVR platforms
970	     (T2)  = TCP-only, on ARM Cortex-M platform
971	     (T3)  = TCP-only, on ARM Cortex-M0+ platform (NOTE: RAM usage for the same platform
972	             is ~2.5 kB for one TCP connection plus ~1.2 kB for each additional connection)
973	     (T4)  = TCP-only, on CC2538DK, cross-compiling on Linux
974	     (a)   = includes IP, ICMP and TCP on x86 and AVR platforms. The Contiki-NG TCP implementation has a code size of 3.2 kB on CC2538DK, cross-compiling on Linux
975	     (I)   = optional POSIX socket wrapper which enables POSIX compliance if needed
976	     Mult. = Multiple
977	     N/A   = Not Available

979	     Figure 2: Summary of TCP features for different lightweight TCP
980	     implementations.  None of the implementations considered in this
981	                         Annex support ECN or TFO.

983	9.  Annex.  Changes compared to previous versions

985	   RFC Editor: To be removed prior to publication

987	9.1.  Changes between -00 and -01

989	   o  Changed title and abstract

991	   o  Clarification that communication with standard-compliant TCP
992	      endpoints is required, based on feedback from Joe Touch

994	   o  Additional discussion on communication patterns

996	   o  Numerous changes to address a comprehensive review from Hannes
997	      Tschofenig

999	   o  Reworded security considerations

1001	   o  Additional references and better distinction between normative and
1002	      informative entries

1004	   o  Feedback from Rahul Jadhav on the uIP TCP implementation

1006	   o  Basic data for the TinyOS TCP implementation added, based on
1007	      source code analysis

1009	9.2.  Changes between -01 and -02

1011	   o  Added text to the Introduction section, and a reference, on
1012	      traditional bad perception of TCP for IoT

1014	   o  Added sections on FreeRTOS and uC/OS

1016	   o  Updated TinyOS section

1018	   o  Updated summary table

1020	   o  Reorganized Section 4 (single-MSS vs multiple-MSS window size),
1021	      some content now also in new Section 5

1023	9.3.  Changes between -02 and -03

1025	   o  Rewording to better explain the benefit of ECN

1027	   o  Additional context information on the surveyed implementations

1029	   o  Added details, but removed "Data size" row, in the summary table
1030	   o  Added discussion on shrew attacks

1032	9.4.  Changes between -03 and -04

1034	   o  Addressing the remaining TODOs

1036	   o  Alignment of the wording on TCP "keep-alives" with related
1037	      discussions in the IETF transport area

1039	   o  Added further discussion on delayed ACKs

1041	   o  Removed OpenWSN subsection from the Annex

1043	9.5.  Changes between -04 and -05

1045	   o  Addressing comments by Yoshifumi Nishida

1047	   o  Removed mentioning MD5 as an example (comment by David Black)

1049	   o  Added memory footprint details of TCP implementations (Contiki-NG
1050	      and lwIP 2.1.2) provided by Rahul Jadhav in the Annex

1052	   o  Addressed comments by Ilpo Jarvinen throughout the whole document

1054	   o  Improved the RIOT section in the Annex, based on feedback from
1055	      Emmanuel Baccelli

1057	9.6.  Changes between -05 and -06

1059	   o  Incorporated suggestions by Stuart Cheshire

1061	9.7.  Changes between -06 and -07

1063	   o  Addressed comments by Gorry Fairhurst

1065	9.8.  Changes between -07 and -08

1067	   o  Addressed WGLC comments by Ilpo Jarvinen, Markku Kojo and Ingemar
1068	      Johansson throughout the document, including the addition of a new
1069	      subsection on Initial Window considerations.

1071	9.9.  Changes between -08 and -09

1073	   o  Addressed second round of comments by Ilpo Jarvinen and Markku
1074	      Kojo, based on the previous draft update.

1076	9.10.  Changes between -09 and -10

1078	   o  Addressed comments by Erik Kline.

1080	   o  Addressed a comment by Markku Kojo on advice given in RFC 6691.

1082	10.  References

1084	10.1.  Normative References

1086	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
1087	              RFC 793, DOI 10.17487/RFC0793, September 1981,
1088	              <https://www.rfc-editor.org/info/rfc793>.

1090	   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
1091	              Communication Layers", STD 3, RFC 1122,
1092	              DOI 10.17487/RFC1122, October 1989,
1093	              <https://www.rfc-editor.org/info/rfc1122>.

1095	   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
1096	              Selective Acknowledgment Options", RFC 2018,
1097	              DOI 10.17487/RFC2018, October 1996,
1098	              <https://www.rfc-editor.org/info/rfc2018>.

1100	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1101	              Requirement Levels", BCP 14, RFC 2119,
1102	              DOI 10.17487/RFC2119, March 1997,
1103	              <https://www.rfc-editor.org/info/rfc2119>.

1105	   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1106	              (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
1107	              December 1998, <https://www.rfc-editor.org/info/rfc2460>.

1109	   [RFC3042]  Allman, M., Balakrishnan, H., and S. Floyd, "Enhancing
1110	              TCP's Loss Recovery Using Limited Transmit", RFC 3042,
1111	              DOI 10.17487/RFC3042, January 2001,
1112	              <https://www.rfc-editor.org/info/rfc3042>.

1114	   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
1115	              of Explicit Congestion Notification (ECN) to IP",
1116	              RFC 3168, DOI 10.17487/RFC3168, September 2001,
1117	              <https://www.rfc-editor.org/info/rfc3168>.

1119	   [RFC3819]  Karn, P., Ed., Bormann, C., Fairhurst, G., Grossman, D.,
1120	              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
1121	              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
1122	              RFC 3819, DOI 10.17487/RFC3819, July 2004,
1123	              <https://www.rfc-editor.org/info/rfc3819>.

1125	   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
1126	              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
1127	              <https://www.rfc-editor.org/info/rfc5681>.

1129	   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
1130	              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
1131	              June 2010, <https://www.rfc-editor.org/info/rfc5925>.

1133	   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
1134	              "Computing TCP's Retransmission Timer", RFC 6298,
1135	              DOI 10.17487/RFC6298, June 2011,
1136	              <https://www.rfc-editor.org/info/rfc6298>.

1138	   [RFC6691]  Borman, D., "TCP Options and Maximum Segment Size (MSS)",
1139	              RFC 6691, DOI 10.17487/RFC6691, July 2012,
1140	              <https://www.rfc-editor.org/info/rfc6691>.

1142	   [RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
1143	              "Increasing TCP's Initial Window", RFC 6928,
1144	              DOI 10.17487/RFC6928, April 2013,
1145	              <https://www.rfc-editor.org/info/rfc6928>.

1147	   [RFC7228]  Bormann, C., Ersue, M., and A. Keranen, "Terminology for
1148	              Constrained-Node Networks", RFC 7228,
1149	              DOI 10.17487/RFC7228, May 2014,
1150	              <https://www.rfc-editor.org/info/rfc7228>.

1152	   [RFC7323]  Borman, D., Braden, B., Jacobson, V., and R.
1153	              Scheffenegger, Ed., "TCP Extensions for High Performance",
1154	              RFC 7323, DOI 10.17487/RFC7323, September 2014,
1155	              <https://www.rfc-editor.org/info/rfc7323>.

1157	   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
1158	              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
1159	              <https://www.rfc-editor.org/info/rfc7413>.

1161	10.2.  Informative References

1163	   [Commag]   A. Betzler, C. Gomez, I. Demirkol, J. Paradells, "CoAP
1164	              Congestion Control for the Internet of Things", IEEE
1165	              Communications Magazine, June 2016.

1167	   [Dunk]     A. Dunkels, "Full TCP/IP for 8-Bit Architectures", 2003.

1169	   [ETEN]     R. Krishnan et al., "Explicit transport error notification
1170	              (ETEN) for error-prone wireless and satellite networks",
1171	              Computer Networks 2004.

1173	   [GNRC]     M. Lenders et al., "Connecting the World of Embedded
1174	              Mobiles: The RIOT Approach to Ubiquitous Networking for the
1175	              IoT", 2018.

1177	   [HomeGateway]
1178	              Haetoenen, S., Nyrhinen, A., Eggert, L., Strowes, S.,
1179	              Sarolahti, P., and M. Kojo, "An Experimental Study of Home
1180	              Gateway Characteristics", Proceedings of the 10th ACM
1181	              SIGCOMM conference on Internet measurement 2010.

1183	   [I-D.delcarpio-6lo-wlanah]
1184	              Vega, L., Robles, I., and R. Morabito, "IPv6 over
1185	              802.11ah", draft-delcarpio-6lo-wlanah-01 (work in
1186	              progress), October 2015.

1188	   [I-D.ietf-tcpm-rto-consider]
1189	              Allman, M., "Requirements for Time-Based Loss Detection",
1190	              draft-ietf-tcpm-rto-consider-17 (work in progress), July
1191	              2020.

1193	   [I-D.jarvinen-core-fasor]
1194	              Jarvinen, I., Kojo, M., Raitahila, I., and Z. Cao, "Fast-
1195	              Slow Retransmission Timeout and Congestion Control
1196	              Algorithm for CoAP", draft-jarvinen-core-fasor-02 (work in
1197	              progress), July 2019.

1199	   [IntComp]  C. Gomez, A. Arcia-Moret, J. Crowcroft, "TCP in the
1200	              Internet of Things: from ostracism to prominence", IEEE
1201	              Internet Computing, January-February 2018.

1203	   [RFC2757]  Montenegro, G., Dawkins, S., Kojo, M., Magret, V., and N.
1204	              Vaidya, "Long Thin Networks", RFC 2757,
1205	              DOI 10.17487/RFC2757, January 2000,
1206	              <https://www.rfc-editor.org/info/rfc2757>.

1208	   [RFC2884]  Hadi Salim, J. and U. Ahmed, "Performance Evaluation of
1209	              Explicit Congestion Notification (ECN) in IP Networks",
1210	              RFC 2884, DOI 10.17487/RFC2884, July 2000,
1211	              <https://www.rfc-editor.org/info/rfc2884>.

1213	   [RFC3481]  Inamura, H., Ed., Montenegro, G., Ed., Ludwig, R., Gurtov,
1214	              A., and F. Khafizov, "TCP over Second (2.5G) and Third
1215	              (3G) Generation Wireless Networks", BCP 71, RFC 3481,
1216	              DOI 10.17487/RFC3481, February 2003,
1217	              <https://www.rfc-editor.org/info/rfc3481>.

1219	   [RFC4944]  Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler,
1220	              "Transmission of IPv6 Packets over IEEE 802.15.4
1221	              Networks", RFC 4944, DOI 10.17487/RFC4944, September 2007,
1222	              <https://www.rfc-editor.org/info/rfc4944>.

1224	   [RFC6077]  Papadimitriou, D., Ed., Welzl, M., Scharf, M., and B.
1225	              Briscoe, "Open Research Issues in Internet Congestion
1226	              Control", RFC 6077, DOI 10.17487/RFC6077, February 2011,
1227	              <https://www.rfc-editor.org/info/rfc6077>.

1229	   [RFC6092]  Woodyatt, J., Ed., "Recommended Simple Security
1230	              Capabilities in Customer Premises Equipment (CPE) for
1231	              Providing Residential IPv6 Internet Service", RFC 6092,
1232	              DOI 10.17487/RFC6092, January 2011,
1233	              <https://www.rfc-editor.org/info/rfc6092>.

1235	   [RFC6120]  Saint-Andre, P., "Extensible Messaging and Presence
1236	              Protocol (XMPP): Core", RFC 6120, DOI 10.17487/RFC6120,
1237	              March 2011, <https://www.rfc-editor.org/info/rfc6120>.

1239	   [RFC6282]  Hui, J., Ed. and P. Thubert, "Compression Format for IPv6
1240	              Datagrams over IEEE 802.15.4-Based Networks", RFC 6282,
1241	              DOI 10.17487/RFC6282, September 2011,
1242	              <https://www.rfc-editor.org/info/rfc6282>.

1244	   [RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
1245	              Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
1246	              JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
1247	              Low-Power and Lossy Networks", RFC 6550,
1248	              DOI 10.17487/RFC6550, March 2012,
1249	              <https://www.rfc-editor.org/info/rfc6550>.

1251	   [RFC6606]  Kim, E., Kaspar, D., Gomez, C., and C. Bormann, "Problem
1252	              Statement and Requirements for IPv6 over Low-Power
1253	              Wireless Personal Area Network (6LoWPAN) Routing",
1254	              RFC 6606, DOI 10.17487/RFC6606, May 2012,
1255	              <https://www.rfc-editor.org/info/rfc6606>.

1257	   [RFC6775]  Shelby, Z., Ed., Chakrabarti, S., Nordmark, E., and C.
1258	              Bormann, "Neighbor Discovery Optimization for IPv6 over
1259	              Low-Power Wireless Personal Area Networks (6LoWPANs)",
1260	              RFC 6775, DOI 10.17487/RFC6775, November 2012,
1261	              <https://www.rfc-editor.org/info/rfc6775>.

1263	   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
1264	              Protocol (HTTP/1.1): Message Syntax and Routing",
1265	              RFC 7230, DOI 10.17487/RFC7230, June 2014,
1266	              <https://www.rfc-editor.org/info/rfc7230>.

1268	   [RFC7252]  Shelby, Z., Hartke, K., and C. Bormann, "The Constrained
1269	              Application Protocol (CoAP)", RFC 7252,
1270	              DOI 10.17487/RFC7252, June 2014,
1271	              <https://www.rfc-editor.org/info/rfc7252>.

1273	   [RFC7414]  Duke, M., Braden, R., Eddy, W., Blanton, E., and A.
1274	              Zimmermann, "A Roadmap for Transmission Control Protocol
1275	              (TCP) Specification Documents", RFC 7414,
1276	              DOI 10.17487/RFC7414, February 2015,
1277	              <https://www.rfc-editor.org/info/rfc7414>.

1279	   [RFC7428]  Brandt, A. and J. Buron, "Transmission of IPv6 Packets
1280	              over ITU-T G.9959 Networks", RFC 7428,
1281	              DOI 10.17487/RFC7428, February 2015,
1282	              <https://www.rfc-editor.org/info/rfc7428>.

1284	   [RFC7540]  Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
1285	              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
1286	              DOI 10.17487/RFC7540, May 2015,
1287	              <https://www.rfc-editor.org/info/rfc7540>.

1289	   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
1290	              Recommendations Regarding Active Queue Management",
1291	              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015,
1292	              <https://www.rfc-editor.org/info/rfc7567>.

1294	   [RFC7668]  Nieminen, J., Savolainen, T., Isomaki, M., Patil, B.,
1295	              Shelby, Z., and C. Gomez, "IPv6 over BLUETOOTH(R) Low
1296	              Energy", RFC 7668, DOI 10.17487/RFC7668, October 2015,
1297	              <https://www.rfc-editor.org/info/rfc7668>.

1299	   [RFC8087]  Fairhurst, G. and M. Welzl, "The Benefits of Using
1300	              Explicit Congestion Notification (ECN)", RFC 8087,
1301	              DOI 10.17487/RFC8087, March 2017,
1302	              <https://www.rfc-editor.org/info/rfc8087>.

1304	   [RFC8105]  Mariager, P., Petersen, J., Ed., Shelby, Z., Van de Logt,
1305	              M., and D. Barthel, "Transmission of IPv6 Packets over
1306	              Digital Enhanced Cordless Telecommunications (DECT) Ultra
1307	              Low Energy (ULE)", RFC 8105, DOI 10.17487/RFC8105, May
1308	              2017, <https://www.rfc-editor.org/info/rfc8105>.

1310	   [RFC8163]  Lynn, K., Ed., Martocci, J., Neilson, C., and S.
1311	              Donaldson, "Transmission of IPv6 over Master-Slave/Token-
1312	              Passing (MS/TP) Networks", RFC 8163, DOI 10.17487/RFC8163,
1313	              May 2017, <https://www.rfc-editor.org/info/rfc8163>.

1315	   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
1316	              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
1317	              DOI 10.17487/RFC8201, July 2017,
1318	              <https://www.rfc-editor.org/info/rfc8201>.

1320	   [RFC8323]  Bormann, C., Lemay, S., Tschofenig, H., Hartke, K.,
1321	              Silverajan, B., and B. Raymor, Ed., "CoAP (Constrained
1322	              Application Protocol) over TCP, TLS, and WebSockets",
1323	              RFC 8323, DOI 10.17487/RFC8323, February 2018,
1324	              <https://www.rfc-editor.org/info/rfc8323>.

1326	   [RFC8352]  Gomez, C., Kovatsch, M., Tian, H., and Z. Cao, Ed.,
1327	              "Energy-Efficient Features of Internet of Things
1328	              Protocols", RFC 8352, DOI 10.17487/RFC8352, April 2018,
1329	              <https://www.rfc-editor.org/info/rfc8352>.

1331	   [RFC8376]  Farrell, S., Ed., "Low-Power Wide Area Network (LPWAN)
1332	              Overview", RFC 8376, DOI 10.17487/RFC8376, May 2018,
1333	              <https://www.rfc-editor.org/info/rfc8376>.

1335	   [RIOT]     E. Baccelli et al., "RIOT: an Open Source Operating
1336	              System for Low-end Embedded Devices in the IoT", 2018.

1338	   [shrew]    A. Kuzmanovic, E. Knightly, "Low-Rate TCP-Targeted Denial
1339	              of Service Attacks", SIGCOMM'03 2003.

1341	Authors' Addresses

1343	   Carles Gomez
1344	   UPC
1345	   C/Esteve Terradas, 7
1346	   Castelldefels  08860
1347	   Spain

1349	   Email: carlesgo@entel.upc.edu

1351	   Jon Crowcroft
1352	   University of Cambridge
1353	   JJ Thomson Avenue
1354	   Cambridge, CB3 0FD
1355	   United Kingdom

1357	   Email: jon.crowcroft@cl.cam.ac.uk
1358	   Michael Scharf
1359	   Hochschule Esslingen
1360	   Flandernstr. 101
1361	   Esslingen  73732
1362	   Germany

1364	   Email: michael.scharf@hs-esslingen.de