idnits 2.17.1 

draft-ietf-lwig-tcp-constrained-node-networks-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** There are 32 instances of too long lines in the document, the longest
     one being 90 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet has text resembling
     RFC 2119 boilerplate text.

  -- The document date (November 3, 2019) is 1635 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC 7228' is mentioned on line 880, but not defined

  == Unused Reference: 'RFC6092' is defined on line 1211, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC  793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  == Outdated reference: A later version (-17) exists of
     draft-ietf-tcpm-rto-consider-08

  -- Obsolete informational reference (is this intentional?): RFC 7230
     (Obsoleted by RFC 9110, RFC 9112)

  -- Obsolete informational reference (is this intentional?): RFC 7540
     (Obsoleted by RFC 9113)


     Summary: 4 errors (**), 0 flaws (~~), 5 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	LWIG Working Group                                              C. Gomez
3	Internet-Draft                                                       UPC
4	Intended status: Informational                              J. Crowcroft
5	Expires: May 6, 2020                             University of Cambridge
6	                                                               M. Scharf
7	                                                    Hochschule Esslingen
8	                                                        November 3, 2019

10	           TCP Usage Guidance in the Internet of Things (IoT)
11	            draft-ietf-lwig-tcp-constrained-node-networks-09

13	Abstract

15	   This document provides guidance on how to implement and use the
16	   Transmission Control Protocol (TCP) in Constrained-Node Networks
17	   (CNNs), which are a characteristic of the Internet of Things (IoT).
18	   Such environments require a lightweight TCP implementation and may
19	   not make use of optional functionality.  This document explains a
20	   number of known and deployed techniques to simplify a TCP stack as
21	   well as corresponding tradeoffs.  The objective is to help embedded
22	   developers with decisions on which TCP features to use.

24	Status of This Memo

26	   This Internet-Draft is submitted in full conformance with the
27	   provisions of BCP 78 and BCP 79.

29	   Internet-Drafts are working documents of the Internet Engineering
30	   Task Force (IETF).  Note that other groups may also distribute
31	   working documents as Internet-Drafts.  The list of current Internet-
32	   Drafts is at https://datatracker.ietf.org/drafts/current/.

34	   Internet-Drafts are draft documents valid for a maximum of six months
35	   and may be updated, replaced, or obsoleted by other documents at any
36	   time.  It is inappropriate to use Internet-Drafts as reference
37	   material or to cite them other than as "work in progress."

39	   This Internet-Draft will expire on May 6, 2020.

41	Copyright Notice

43	   Copyright (c) 2019 IETF Trust and the persons identified as the
44	   document authors.  All rights reserved.

46	   This document is subject to BCP 78 and the IETF Trust's Legal
47	   Provisions Relating to IETF Documents
48	   (https://trustee.ietf.org/license-info) in effect on the date of
49	   publication of this document.  Please review these documents
50	   carefully, as they describe your rights and restrictions with respect
51	   to this document.  Code Components extracted from this document must
52	   include Simplified BSD License text as described in Section 4.e of
53	   the Trust Legal Provisions and are provided without warranty as
54	   described in the Simplified BSD License.

56	Table of Contents

58	   1.  Introduction
59	   2.  Conventions used in this document
60	   3.  Characteristics of CNNs relevant for TCP
61	     3.1.  Network and link properties
62	     3.2.  Usage scenarios
63	     3.3.  Communication and traffic patterns
64	   4.  TCP implementation and configuration in CNNs
65	     4.1.  Addressing path properties
66	       4.1.1.  Maximum Segment Size (MSS)
67	       4.1.2.  Explicit Congestion Notification (ECN)
68	       4.1.3.  Explicit loss notifications
69	     4.2.  TCP guidance for single-MSS stacks
70	       4.2.1.  Single-MSS stacks - benefits and issues
71	       4.2.2.  TCP options for single-MSS stacks
72	       4.2.3.  Delayed Acknowledgments for single-MSS stacks
73	       4.2.4.  RTO calculation for single-MSS stacks
74	     4.3.  General recommendations for TCP in CNNs
75	       4.3.1.  Loss recovery and congestion/flow control
76	         4.3.1.1.  Selective Acknowledgments (SACK)
77	       4.3.2.  Delayed Acknowledgments
78	       4.3.3.  Initial Window
79	   5.  TCP usage recommendations in CNNs
80	     5.1.  TCP connection initiation
81	     5.2.  Number of concurrent connections
82	     5.3.  TCP connection lifetime
83	   6.  Security Considerations
84	   7.  Acknowledgments
85	   8.  Annex. TCP implementations for constrained devices
86	     8.1.  uIP
87	     8.2.  lwIP
88	     8.3.  RIOT
89	     8.4.  TinyOS
90	     8.5.  FreeRTOS
91	     8.6.  uC/OS
92	     8.7.  Summary
93	   9.  Annex. Changes compared to previous versions
94	     9.1.  Changes between -00 and -01
95	     9.2.  Changes between -01 and -02
96	     9.3.  Changes between -02 and -03
97	     9.4.  Changes between -03 and -04
98	     9.5.  Changes between -04 and -05
99	     9.6.  Changes between -05 and -06
100	     9.7.  Changes between -06 and -07
101	     9.8.  Changes between -07 and -08
102	     9.9.  Changes between -08 and -09
103	   10. References
104	     10.1.  Normative References
105	     10.2.  Informative References
106	   Authors' Addresses

108	1.  Introduction

110	   The Internet Protocol suite is being used for connecting Constrained-
111	   Node Networks (CNNs) to the Internet, enabling the so-called Internet
112	   of Things (IoT) [RFC7228].  In order to meet the requirements that
113	   stem from CNNs, the IETF has produced a suite of new protocols
114	   specifically designed for such environments (see e.g.  [RFC8352]).
115	   New IETF protocol stack components include the IPv6 over Low-power
116	   Wireless Personal Area Networks (6LoWPAN) adaptation layer
117	   [RFC4944][RFC6282][RFC6775], the IPv6 Routing Protocol for Low-power
118	   and lossy networks (RPL) routing protocol [RFC6550], and the
119	   Constrained Application Protocol (CoAP) [RFC7252].

121	   As of this writing, the main transport layer protocols in IP-
122	   based IoT scenarios are UDP and TCP.  However, TCP has been
123	   criticized (often, unfairly) as a protocol for the IoT.  In fact,
124	   some TCP features are not optimal for IoT scenarios, such as
125	   relatively long header size, unsuitability for multicast, and always-
126	   confirmed data delivery.  However, many typical claims on TCP
127	   unsuitability for IoT (e.g. a high complexity, connection-oriented
128	   approach incompatibility with radio duty-cycling, and spurious
129	   congestion control activation in wireless links) are not valid, can
130	   be solved, or are also found in well accepted IoT end-to-end
131	   reliability mechanisms (see [IntComp] for a detailed analysis).

133	   At the application layer, CoAP was developed over UDP [RFC7252].
134	   However, the integration of some CoAP deployments with existing
135	   infrastructure is being challenged by middleboxes such as firewalls,
136	   which may limit and even block UDP-based communications.  This is the
137	   main reason why a CoAP over TCP specification has been developed
138	   [RFC8323].

140	   Other application layer protocols not specifically designed for CNNs
141	   are also being considered for the IoT space.  Some examples include
142	   HTTP/2 and even HTTP/1.1, both of which run over TCP by default
143	   [RFC7230] [RFC7540], and the Extensible Messaging and Presence
144	   Protocol (XMPP) [RFC6120].  TCP is also used by non-IETF application-
145	   layer protocols in the IoT space such as the Message Queue Telemetry
146	   Transport (MQTT) and its lightweight variants.

148	   TCP is a sophisticated transport protocol that includes optional
149	   functionality (e.g.  TCP options) that may improve performance in
150	   some environments.  However, many optional TCP extensions require
151	   complex logic inside the TCP stack and increase the codesize and the
152	   memory requirements.  Many TCP extensions are not required for
153	   interoperability with other standard-compliant TCP endpoints.  Given
154	   the limited resources on constrained devices, careful selection of
155	   optional TCP features can make an implementation more lightweight.

157	   This document provides guidance on how to implement and configure
158	   TCP, as well as on how applications are advised to use TCP, in
159	   CNNs.  The overarching goal is to offer simple measures to allow
160	   for lightweight TCP implementation and suitable operation in such
161	   environments.  A TCP implementation following the guidance in this
162	   document is intended to be compatible with a TCP endpoint that is
163	   compliant to the TCP standards, albeit possibly with a lower
164	   performance.  This implies that such a TCP client would always be
165	   able to connect with a standard-compliant TCP server, and a
166	   corresponding TCP server would always be able to connect with a
167	   standard-compliant TCP client.

169	   This document assumes that the reader is familiar with TCP.  A
170	   comprehensive survey of the TCP standards can be found in [RFC7414].
171	   Similar guidance regarding the use of TCP in special environments has
172	   been published before, e.g., for cellular wireless networks
173	   [RFC3481].

175	2.  Conventions used in this document

177	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
178	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
179	   document are to be interpreted as described in [RFC2119].

181	3.  Characteristics of CNNs relevant for TCP

183	3.1.  Network and link properties

185	   CNNs are defined in [RFC7228] as networks whose characteristics are
186	   influenced by being composed of a significant portion of constrained
187	   nodes.  The latter are characterized by significant limitations on
188	   processing, memory, and energy resources, among others [RFC7228].
189	   The first two dimensions pose constraints on the complexity and on
190	   the memory footprint of the protocols that constrained nodes can
191	   support.  The latter requires techniques to save energy, such as
192	   radio duty-cycling in wireless devices [RFC8352], as well as
193	   minimization of the number of messages transmitted/received (and
194	   their size).

196	   [RFC7228] lists typical network constraints in CNNs, including low
197	   achievable bitrate/throughput, high packet loss and high variability
198	   of packet loss, highly asymmetric link characteristics, severe
199	   penalties for using larger packets, limits on reachability over time,
200	   etc.  CNNs may use wireless or wired technologies (e.g., Power Line
201	   Communication), and the transmission rates are typically low (e.g.
202	   below 1 Mbps).

204	   For use of TCP, one challenge is that not all technologies in CNNs may
205	   be aligned with typical Internet subnetwork design principles
206	   [RFC3819].  For instance, constrained nodes often use physical/link
207	   layer technologies that have been characterized as 'lossy', i.e.,
208	   exhibit a relatively high bit error rate.  Dealing with corruption
209	   loss is one of the open issues in the Internet [RFC6077].

211	3.2.  Usage scenarios

213	   There are different deployment and usage scenarios for CNNs.  Some
214	   CNNs follow the star topology, whereby one or several hosts are
215	   linked to a central device that acts as a router connecting the CNN
216	   to the Internet.  CNNs may also follow the multihop topology
217	   [RFC6606].

219	   In constrained environments, there can be different types of devices
220	   [RFC7228].  For example, there can be devices with single combined
221	   send/receive buffer, devices with a separate send and receive buffer,
222	   or devices with a pool of multiple send/receive buffers.  In the
223	   latter case, the buffers may also be shared with other
224	   protocols.

226	   One key use case for the use of TCP in CNNs is a model where
227	   constrained devices connect to unconstrained servers in the Internet.
228	   But it is also possible that both TCP endpoints run on constrained
229	   devices.  In the first case, communication possibly has to traverse a
230	   middlebox (e.g. a firewall, NAT, etc.).  Figure 1 illustrates such
231	   a scenario.  Note that it is asymmetric, as the unconstrained
232	   device will typically not suffer the severe constraints of the
233	   constrained device.  The unconstrained device is expected to be
234	   mains-powered, to have a large amount of memory and processing power,
235	   and to be connected to a resource-rich network.

237	   Assuming that a majority of constrained devices will correspond to
238	   sensor nodes, the amount of data traffic sent by constrained devices
239	   (e.g. sensor node measurements) is expected to be higher than the
240	   amount of data traffic in the opposite direction.  Nevertheless,
241	   constrained devices may receive requests (to which they may respond),
242	   commands (for configuration purposes and for constrained devices
243	   including actuators) and relatively infrequent firmware/software
244	   updates.

246	                                                      +---------------+
247	           o     o <-------- TCP communication -----> |               |
248	          o     o                                     |               |
249	             o     o                                  | Unconstrained |
250	       o        o              +-----------+          |    device     |
251	           o     o   o  ------ | Middlebox |  ------- |               |
252	            o   o              +-----------+          |  (e.g. cloud) |
253	          o    o  o                                   |               |
254	                                                      +---------------+
255	      constrained devices

257	      Figure 1: TCP communication between a constrained device and an
258	               unconstrained device, traversing a middlebox.

260	3.3.  Communication and traffic patterns

262	   IoT applications are characterized by a number of different
263	   communication patterns.  The following non-comprehensive list
264	   explains some typical examples:

266	   o  Unidirectional transfers: An IoT device (e.g. a sensor) can
267	      repeatedly send updates to the other endpoint.  An
268	      application response back to the IoT device is not always
269	      needed.

271	   o  Request-response patterns: An IoT device receives a request from
272	      the other endpoint, which triggers a response from the IoT device.

274	   o  Bulk data transfers: A typical example for a long file transfer
275	      would be an IoT device firmware update.

277	   A typical communication pattern is that a constrained device
278	   communicates with an unconstrained device (cf.  Figure 1).  But it is
279	   also possible that constrained devices communicate amongst
280	   themselves.

282	4.  TCP implementation and configuration in CNNs

284	   This section explains how a TCP stack can deal with typical
285	   constraints in CNNs.  The guidance in this section relates to the TCP
286	   implementation and its configuration.

288	4.1.  Addressing path properties

290	4.1.1.  Maximum Segment Size (MSS)

292	   Assuming that IPv6 is used, and for the sake of lightweight
293	   implementation and operation, unless applications require handling
294	   large data units (i.e. leading to an IPv6 datagram size greater than
295	   1280 bytes), it may be desirable to limit the IP datagram size to
296	   1280 bytes in order to avoid the need to support Path MTU Discovery
297	   [RFC8201].  In addition, an IP datagram size of 1280 bytes avoids
298	   incurring IPv6-layer fragmentation.

300	   An IPv6 datagram size exceeding 1280 bytes can be avoided by setting
301	   the TCP MSS not larger than 1220 bytes.  This assumes that the remote
302	   sender will use no TCP options, aside from possibly the MSS option,
303	   which is only used in the initial TCP SYN packet.
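
   The 1220-byte value follows from subtracting the fixed header sizes
   from the 1280-byte datagram limit.  The following is a minimal
   sketch of that arithmetic, assuming a 40-byte IPv6 header without
   extension headers and a 20-byte TCP header without options.  The
   function and parameter names are illustrative only and are not taken
   from any TCP stack.

```python
IPV6_MIN_MTU = 1280     # minimum link MTU that IPv6 requires, in bytes
IPV6_HEADER_LEN = 40    # fixed IPv6 header; no extension headers assumed
TCP_HEADER_LEN = 20     # TCP header without options


def max_mss(datagram_size=IPV6_MIN_MTU, option_margin=0):
    """Largest MSS that keeps the IPv6 datagram within datagram_size.

    option_margin is an optional safety margin (in bytes) reserved for
    TCP options that the remote sender might add without being asked.
    """
    return datagram_size - IPV6_HEADER_LEN - TCP_HEADER_LEN - option_margin


print(max_mss())                   # 1220
print(max_mss(option_margin=20))   # 1200
```

   A margin of 20 bytes yields the more cautious 1200-byte value; the
   margin to reserve, if any, depends on the scenario.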

305	   In order to accommodate unrequested TCP options that may be used by
306	   some TCP implementations, a constrained device may advertise an MSS
307	   smaller than 1220 bytes (e.g. not larger than 1200 bytes).  Note
308	   that, in many implementations, TCP options generally consume payload
309	   space instead of increasing datagram size, therefore this suggestion
310	   might be overcautious and its suitability will depend on each
311	   specific scenario.

313	   Note that setting the MTU to 1280 bytes is possible for link layer
314	   technologies in the CNN space, even if some of them are characterized
315	   by a short data unit payload size, e.g. up to a few tens or hundreds
316	   of bytes.  For example, the maximum frame size in IEEE 802.15.4 is
317	   127 bytes.  6LoWPAN defined an adaptation layer to support IPv6 over
318	   IEEE 802.15.4 networks.  The adaptation layer includes a
319	   fragmentation mechanism, since IPv6 requires the layer below to
320	   support an MTU of 1280 bytes [RFC2460], while IEEE 802.15.4 lacked
321	   fragmentation mechanisms.  6LoWPAN defines an IEEE 802.15.4 link MTU
322	   of 1280 bytes [RFC4944].  Other technologies, such as Bluetooth LE
323	   [RFC7668], ITU-T G.9959 [RFC7428] or DECT-ULE [RFC8105], also use
324	   6LoWPAN-based adaptation layers in order to enable IPv6 support.
325	   These technologies do support link layer fragmentation.  By
326	   exploiting this functionality, the adaptation layers that enable IPv6
327	   over such technologies also define an MTU of 1280 bytes.

329	   On the other hand, there exist technologies also used in the CNN
330	   space, such as Master Slave / Token Passing (MS/TP) [RFC8163],
331	   Narrowband IoT (NB-IoT) [RFC8376] or IEEE 802.11ah
332	   [I-D.delcarpio-6lo-wlanah], that do not suffer the same degree of
333	   frame size limitations as the technologies mentioned above.  The MTU
334	   for MS/TP is recommended to be 1500 bytes [RFC8163], the MTU in NB-
335	   IoT is 1600 bytes, and the maximum frame payload size for IEEE
336	   802.11ah is 7991 bytes.

338	   Finally, note that using larger MSS (to a suitable extent) may be
339	   beneficial, especially when transferring large payloads, as it
340	   reduces the number of packets (and packet headers) required for a
341	   given payload.

343	4.1.2.  Explicit Congestion Notification (ECN)

345	   Explicit Congestion Notification (ECN) [RFC3168] allows a router
346	   to signal in the IP header of a packet that congestion is arising,
347	   for example when a queue size reaches a certain threshold.  An ECN-
348	   enabled TCP receiver will echo back the congestion signal to the TCP
349	   sender by setting a flag in its next TCP ACK.  The sender triggers
350	   congestion control measures as if a packet loss had happened.

352	   The document [RFC8087] outlines the principal gains in terms of
353	   increased throughput, reduced delay, and other benefits when ECN is
354	   used over a network path that includes equipment that supports
355	   Congestion Experienced (CE) marking.  In the context of CNNs, a
356	   remarkable feature of ECN is that congestion can be signalled without
357	   incurring packet drops (which will lead to retransmissions and
358	   consumption of limited resources such as energy and bandwidth).

360	   ECN can further reduce packet losses since congestion control
361	   measures can be applied earlier [RFC2884].  Fewer lost packets imply
362	   that the number of retransmitted segments decreases, which is
363	   particularly beneficial in CNNs, where energy and bandwidth resources
364	   are typically limited.  Also, it makes sense to try to avoid packet
365	   drops for transactional workloads with small data sizes, which are
366	   typical for CNNs.  In such traffic patterns, it is more difficult and
367	   often impossible to detect packet loss without retransmission
368	   timeouts (e.g., as there may be no three duplicate ACKs).  Any
369	   retransmission timeout slows down the data transfer significantly.
370	   In addition, if the constrained device uses power saving techniques,
371	   a retransmission timeout will incur a wake-up action, in contrast to
372	   ACK clock-triggered sending.  When the congestion window of a TCP
373	   sender has a size of one segment and a TCP ACK with an ECN signal
374	   (ECE flag) arrives at the TCP sender, the TCP sender resets the
375	   retransmit timer, and the sender will only be able to send a new
376	   packet when the retransmit timer expires.  Effectively, the TCP
377	   sender reduces at that moment its sending rate from 1 segment per
378	   Round Trip Time (RTT) to 1 segment per RTO and reduces the sending
379	   rate further on each ECN signal received in subsequent TCP ACKs.
380	   Otherwise, if an ECN signal is not present in a subsequent TCP ACK
381	   the TCP sender resumes the normal ACK-clocked transmission of
382	   segments [RFC3168].

384	   ECN can be incrementally deployed in the Internet.  Guidance on
385	   configuration and usage of ECN is provided in [RFC7567].  Given the
386	   benefits, more and more TCP stacks in the Internet support ECN, and
387	   it specifically makes sense to leverage ECN in controlled
388	   environments such as CNNs.  Note, however, that supporting ECN
389	   increases implementation complexity.

391	4.1.3.  Explicit loss notifications

393	   There has been a significant body of research on solutions capable of
394	   explicitly indicating whether a TCP segment loss is due to
395	   corruption, in order to avoid activation of congestion control
396	   mechanisms [ETEN] [RFC2757].  While such solutions may provide
397	   significant improvement, they have not been widely deployed and
398	   remain as experimental work.  In fact, as of today, the IETF has not
399	   standardized any such solution.

401	4.2.  TCP guidance for single-MSS stacks

403	   This section discusses TCP stacks that allow transferring a single
404	   MSS.  More general guidance is provided in Section 4.3.

406	4.2.1.  Single-MSS stacks - benefits and issues

408	   A TCP stack can reduce the memory requirements by advertising a TCP
409	   window size of one MSS, and also transmit at most one MSS of
410	   unacknowledged data.  In that case, both congestion and flow control
411	   implementation are quite simple.  Such a small receive and send
412	   window may be sufficient for simple message exchanges in the CNN
413	   space.  However, only using a window of one MSS can significantly
414	   affect performance.  A stop-and-wait operation results in low
415	   throughput for transfers that exceed the length of one MSS, e.g., a
416	   firmware download.  Furthermore, a single-MSS solution relies solely
417	   on timer-based loss recovery, therefore missing the performance gain
418	   of Fast Retransmit and Fast Recovery (which require a larger window
419	   size, see Subsection 4.3.1).
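
   The cost of stop-and-wait operation can be estimated with simple
   arithmetic: at most one MSS is delivered per round trip.  The sketch
   below is a rough model only.  It ignores connection setup, losses,
   and serialization delay, and the 0.5 s RTT and 100000-byte transfer
   size are assumed example values, not measurements.

```python
import math


def stop_and_wait_throughput_bps(mss_bytes, rtt_s):
    """Upper bound on throughput when one MSS is sent per round trip."""
    return mss_bytes * 8 / rtt_s


def stop_and_wait_transfer_time_s(payload_bytes, mss_bytes, rtt_s):
    """Approximate transfer time: one round trip per segment."""
    segments = math.ceil(payload_bytes / mss_bytes)
    return segments * rtt_s


# Assumed example values: MSS of 1220 bytes, RTT of 0.5 s.
print(stop_and_wait_throughput_bps(1220, 0.5))            # 19520.0
print(stop_and_wait_transfer_time_s(100_000, 1220, 0.5))  # 41.0
```

   Under these assumptions the achievable rate stays below 20 kbit/s
   regardless of the link bitrate, since the window, not the link, is
   the limiting factor.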

421	   If CoAP is used over TCP with the default setting for NSTART in
422	   [RFC7252], a CoAP endpoint is not allowed to send a new message to a
423	   destination until a response for the previous message sent to that
424	   destination has been received.  This is equivalent to an application-
425	   layer window size of 1 data unit.  For this use of CoAP, a maximum
426	   TCP window of one MSS may be sufficient, as long as the CoAP message
427	   size does not exceed one MSS.  An exception in CoAP over TCP, though,
428	   is the Capabilities and Settings Message (CSM) that must be sent at
429	   the start of the TCP connection.  The first application message
430	   carrying user data is allowed to be sent immediately after the CSM
431	   message.  If the sum of the CSM size plus the application message
432	   size exceeds the MSS, a sender using a single-MSS stack will need to
433	   wait for the ACK confirming the CSM before sending the application
434	   message.
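
   The CSM case above can be illustrated with a small sketch that
   groups application messages into "flights", where the sender must
   wait for an ACK between flights.  This is a simplified model that
   assumes each message fits in one MSS and that a flight is fully
   acknowledged before the next one starts.  The function name and the
   message sizes are hypothetical.

```python
def plan_flights(message_sizes, window_bytes):
    """Group messages into flights that each fit in the send window.

    Each returned inner list holds the sizes of messages that can be
    outstanding together; an ACK wait separates consecutive flights.
    """
    flights, current, used = [], [], 0
    for size in message_sizes:
        if size > window_bytes:
            raise ValueError("message does not fit in the window")
        if used + size > window_bytes:
            # Window is full: close this flight and wait for the ACK.
            flights.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        flights.append(current)
    return flights


# Hypothetical sizes: a 10-byte CSM followed by one application message.
print(plan_flights([10, 200], 1220))    # [[10, 200]]  no extra wait
print(plan_flights([10, 1215], 1220))   # [[10], [1215]]  wait for ACK
```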

436	4.2.2.  TCP options for single-MSS stacks

438	   A TCP implementation needs to support, at a minimum, TCP options 2, 1
439	   and 0.  These are, respectively, the Maximum Segment Size (MSS)
440	   option, the No-Operation option, and the End Of Option List marker
441	   [RFC0793].  None of these are a substantial burden to support.  These
442	   options are sufficient for interoperability with a standard-compliant
443	   TCP endpoint, albeit many TCP stacks support additional options and
444	   can negotiate their use.  A TCP implementation is permitted to
445	   silently ignore all other TCP options.
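
   A minimal option parser along these lines can be sketched as
   follows.  It handles option kinds 0, 1, and 2 and skips every other
   option by using its length field.  The function name and return
   format are illustrative assumptions, and a real stack would apply
   the MSS option only to SYN segments.

```python
def parse_tcp_options(data: bytes):
    """Parse a TCP options field; return (mss, ignored_kinds).

    mss is None if no MSS option is present.  ignored_kinds lists the
    kinds of all options that were skipped.
    """
    mss = None
    ignored = []
    i = 0
    while i < len(data):
        kind = data[i]
        if kind == 0:            # End of Option List: stop parsing
            break
        if kind == 1:            # No-Operation: single padding byte
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated option")
        length = data[i + 1]     # covers kind, length, and value bytes
        if length < 2 or i + length > len(data):
            raise ValueError("malformed option length")
        if kind == 2:            # Maximum Segment Size
            if length != 4:
                raise ValueError("bad MSS option length")
            mss = int.from_bytes(data[i + 2:i + 4], "big")
        else:                    # any other option is silently ignored
            ignored.append(kind)
        i += length
    return mss, ignored


# MSS = 1220, NOP, Window scale (kind 3, ignored), End of Option List
options = bytes([2, 4, 0x04, 0xC4, 1, 3, 3, 7, 0])
print(parse_tcp_options(options))   # (1220, [3])
```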

447	   A TCP implementation for a constrained device that uses a single-MSS
448	   TCP receive or transmit window size may not benefit from supporting
449	   the following TCP options: Window scale [RFC7323], TCP Timestamps
450	   [RFC7323], Selective Acknowledgments (SACK) and SACK-Permitted
451	   [RFC2018].  Also other TCP options may not be required on a
452	   constrained device with a very lightweight implementation.  With
453	   regard to the Window scale option, note that it is only useful if a
454	   window size greater than 64 kB is needed.

456	   Note that a TCP sender can benefit from the TCP Timestamps option
457	   [RFC7323] in detecting spurious RTOs.  The latter are quite likely to
458	   occur in CNN scenarios due to a number of reasons (e.g. route changes
459	   in a multihop scenario, link layer retries, etc.).  The header
460	   overhead incurred by the Timestamps option (of up to 12 bytes) needs
461	   to be taken into account.

463	   One potentially relevant TCP option in the context of CNNs is TCP
464	   Fast Open (TFO) [RFC7413].  As described in Section 5.3, TFO can be
465	   used to address the problem of traversing middleboxes that perform
466	   early filter state record deletion.

468	4.2.3.  Delayed Acknowledgments for single-MSS stacks

470	   TCP Delayed Acknowledgments are meant to reduce the number of ACKs
471	   sent within a TCP connection, thus reducing network overhead, but
472	   they may increase the time until a sender may receive an ACK.  In
473	   general, usefulness of Delayed ACKs depends heavily on the usage
474	   scenario (see subsection 4.3.2).  There can be interactions with
475	   single-MSS stacks.

477	   When traffic is unidirectional, if the sender can send at most one
478	   MSS of data or the receiver advertises a receive window not greater
479	   than the MSS, Delayed ACKs may unnecessarily contribute delay (up to
480	   500 ms) to the RTT [RFC5681], which limits the throughput and can
481	   increase data delivery time.  Note that, in some cases, it may not be
482	   possible to disable Delayed ACKs.  One known workaround is to split
483	   the data to be sent into two segments of smaller size.  A standard-
484	   compliant TCP receiver may immediately acknowledge the second MSS of
485	   data, which can improve throughput.  However, this 'split hack' may
486	   not always work since a TCP receiver is required to acknowledge every
487	   second full-sized segment, but not two consecutive small segments.
488	   Furthermore, the overhead of sending two IP packets instead of one is
489	   another downside of the 'split hack'.

491	   Similar issues may happen when the sender uses the Nagle algorithm,
492	   since the sender may need to wait for an unnecessarily delayed ACK to
493	   send a new segment.  Disabling the algorithm will have no impact if
494	   the sender can only handle stop-and-wait operation at the TCP level.

496	   For request-response traffic, when the receiver uses Delayed ACKs, a
497	   response to a data message can piggyback an ACK, as long as the
498	   latter is sent before the Delayed ACK timer expires, thus avoiding
499	   unnecessary ACKs without payload.  Disabling Delayed ACKs at the
500	   sender allows an immediate ACK for the data segment carrying the
501	   response.

503	4.2.4.  RTO calculation for single-MSS stacks

505	   The Retransmission Timeout (RTO) calculation is one of the
506	   fundamental TCP algorithms [RFC6298].  There is a fundamental trade-
507	   off: A short, aggressive RTO behavior reduces wait time before
508	   retransmissions, but it also increases the probability of spurious
509	   timeouts.  The latter lead to unnecessary waste of potentially scarce
510	   resources in CNNs such as energy and bandwidth.  In contrast, a
511	   conservative timeout can result in long error recovery times and thus
512	   needlessly delay data delivery.

514	   If a TCP sender uses a very small window size, and it cannot benefit
515	   from Fast Retransmit/Fast Recovery or SACK, the RTO algorithm has a
516	   large impact on performance.  In that case, RTO algorithm tuning may
517	   be considered, although careful assessment of possible drawbacks is
518	   recommended [I-D.ietf-tcpm-rto-consider].
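
   As a point of reference for any such tuning, the baseline
   computation of [RFC6298] can be sketched in Python as follows.
   This is an illustrative sketch only: the class name, the method
   names and the 100 ms clock granularity are assumptions made for
   this example and are not taken from any of the stacks surveyed in
   this document.

```python
from dataclasses import dataclass
from typing import Optional

ALPHA = 1 / 8    # gain for the smoothed RTT (RFC 6298)
BETA = 1 / 4     # gain for the RTT variance (RFC 6298)
K = 4            # variance multiplier (RFC 6298)
MIN_RTO = 1.0    # seconds; lower bound recommended by RFC 6298
MAX_RTO = 60.0   # seconds; an upper bound is optional, at least 60 s


@dataclass
class RtoEstimator:
    """Hypothetical helper tracking SRTT, RTTVAR and RTO for one connection."""
    clock_granularity: float = 0.1   # G, in seconds (assumed value)
    srtt: Optional[float] = None
    rttvar: Optional[float] = None
    rto: float = 1.0                 # initial RTO before any RTT sample

    def on_rtt_sample(self, r: float) -> float:
        """Update the estimator with a new RTT measurement r (seconds)."""
        if self.srtt is None:
            # First measurement.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r
        rto = self.srtt + max(self.clock_granularity, K * self.rttvar)
        self.rto = min(max(rto, MIN_RTO), MAX_RTO)
        return self.rto

    def on_timeout(self) -> float:
        """Exponential backoff after a retransmission timeout."""
        self.rto = min(self.rto * 2, MAX_RTO)
        return self.rto
```

   With a single-MSS window, every loss is repaired through
   on_timeout(), which is why the choice of these constants has such
   a visible effect on performance in that configuration.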

520	   As an example, adaptive RTO algorithms defined for CoAP over UDP have
521	   been found to perform well in CNN scenarios [Commag]
522	   [I-D.jarvinen-core-fasor].

524	4.3.  General recommendations for TCP in CNNs

526	   This section summarizes some widely used techniques to improve TCP,
527	   with a focus on their use in CNNs.  The TCP extensions discussed here
528	   are useful in a wide range of network scenarios, including CNNs.
529	   This section is not comprehensive.  A comprehensive survey of TCP
530	   extensions is published in [RFC7414].

532	4.3.1.  Loss recovery and congestion/flow control

534	   Devices that have enough memory to allow a larger (i.e. more than 3
535	   MSS of data) TCP window size can leverage a more efficient loss
536	   recovery than the timer-based approach used for smaller TCP window
537	   size (see Subsection 3.2.1) by using Fast Retransmit and Fast
538	   Recovery [RFC5681], at the expense of slightly greater complexity and
539	   TCB size.  Assuming that Delayed ACKs are used by the receiver, a
540	   window size of at least 5 MSS is required for Fast Retransmit and
541	   Fast Recovery to work efficiently: If, in a given TCP transmission
542	   of full-sized segments 1 through 6, segment 2 gets lost, and the ACK
543	   for segment 1 is held by the Delayed ACK timer, then the sender
544	   should get an ACK for segment 1 when 3 arrives and duplicate ACKs
545	   when segments 4, 5, and 6 arrive.  It will retransmit segment 2 when
546	   the third duplicate ACK arrives.  In order to have segments 2, 3, 4,
547	   5, and 6 sent, the window has to be at least 5 MSS.  With an MSS of
548	   1220 bytes, a buffer of 5 MSS would require 6100 bytes.

550	   The example in the previous paragraph did not use a further TCP
551	   improvement such as Limited Transmit [RFC3042].  The latter may also
552	   be useful for any transfer that has more than one segment in flight.
553	   Small transfers tend to benefit more from Limited Transmit, because
554	   they are more likely to not receive enough duplicate ACKs.  Assuming
555	   the example in the previous paragraph, Limited Transmit allows
556	   sending 5 MSS with a congestion window (cwnd) of 3 segments, plus two
557	   additional segments for the first two duplicate ACKs.  With Limited
558	   Transmit, even a cwnd of 2 segments allows sending 5 MSS, at the
559	   expense of additional delay contributed by the Delayed ACK timer for
560	   the ACK that confirms segment 1.

562	   When a multiple-segment window is used, the receiver will need to
563	   manage the reception of possibly out-of-order segments, requiring
564	   sufficient buffer space.
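
   The Delayed ACK example given earlier in this subsection (segment
   2 lost, no Limited Transmit) can be reproduced with the following
   Python sketch.  The receiver model is deliberately simplified and
   is an assumption of this example: it acknowledges every second
   in-order full-sized segment, acknowledges every out-of-order
   segment immediately, and ignores the Delayed ACK timer and any
   retransmission.  The function names are hypothetical.

```python
def acks_from_receiver(sent, lost):
    """Return the cumulative ACKs emitted by a simplified Delayed ACK receiver.

    Segments are numbered from 1.  An ACK value n means that segments
    1..n have been received in order.
    """
    next_expected = 1
    pending = 0          # in-order segments received but not yet ACKed
    acks = []
    for seg in sent:
        if seg in lost:
            continue
        if seg == next_expected:
            next_expected += 1
            pending += 1
            if pending == 2:             # ACK every second full-sized segment
                acks.append(next_expected - 1)
                pending = 0
        else:
            # Out-of-order segment: ACK immediately for the last
            # in-order segment (this also releases any held ACK).
            acks.append(next_expected - 1)
            pending = 0
    return acks


def count_duplicate_acks(acks):
    """Count ACKs that repeat the value of the ACK just before them."""
    return sum(1 for prev, cur in zip(acks, acks[1:]) if cur == prev)


MSS = 1220  # bytes, as in the example in the text

six = acks_from_receiver(range(1, 7), lost={2})
five = acks_from_receiver(range(1, 6), lost={2})
print(count_duplicate_acks(six), count_duplicate_acks(five), 5 * MSS)
```

   Under this model, sending segments 1 to 6 yields the three
   duplicate ACKs needed to trigger Fast Retransmit, whereas stopping
   at segment 5 yields only two, so recovery would fall back to the
   RTO.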

566	4.3.1.1.  Selective Acknowledgments (SACK)

568	   If a device with less severe memory and processing constraints can
569	   afford advertising a TCP window size of several MSS, it makes sense
570	   to support the SACK option to improve performance.  SACK allows a
571	   data receiver to inform the data sender of non-contiguous data blocks
572	   received, so that a sender (having previously sent the SACK-Permitted
573	   option) can avoid performing unnecessary retransmissions, saving
574	   energy and bandwidth, as well as reducing latency.  In addition, SACK
575	   often allows for faster loss recovery when there is more than one
576	   lost segment in a window of data, since, with SACK, recovery may
577	   complete in fewer RTTs.  SACK is particularly useful for bulk data
578	   transfers.  A receiver supporting SACK will need to keep track of the
579	   data blocks that need to be received.  The sender will also need to
580	   keep track of which data segments need to be resent after learning
581	   which data blocks are missing at the receiver.  SACK adds 8*n+2 bytes
582	   to the TCP header, where n denotes the number of data blocks
583	   received, up to 4 blocks.  For a low number of out-of-order segments,
584	   the header overhead penalty of SACK is compensated by avoiding
585	   unnecessary retransmissions.  When the sender discovers the data
586	   blocks that have already been received, it needs to also store the
587	   necessary state to avoid unnecessary retransmission of data segments
588	   that have already been received.
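
   The header overhead mentioned above can be made concrete with a
   short Python sketch.  The 8*n+2 formula and the limit of 4 blocks
   come from the text; the 40-byte limit is the size of the TCP
   options space.  The function names, and the idea of passing in the
   bytes already used by other options, are assumptions made for this
   example.

```python
MAX_TCP_OPTION_SPACE = 40   # bytes available for options in a TCP header
MAX_SACK_BLOCKS = 4


def sack_option_length(n_blocks):
    """Bytes added to the TCP header by a SACK option with n_blocks blocks."""
    if not 1 <= n_blocks <= MAX_SACK_BLOCKS:
        raise ValueError("a SACK option carries between 1 and 4 blocks")
    return 8 * n_blocks + 2


def max_sack_blocks(other_option_bytes=0):
    """Largest number of SACK blocks that fits next to other options."""
    free = MAX_TCP_OPTION_SPACE - other_option_bytes - 2
    return max(0, min(MAX_SACK_BLOCKS, free // 8))


for n in range(1, MAX_SACK_BLOCKS + 1):
    print(n, "block(s):", sack_option_length(n), "bytes")
```

   For instance, if 12 bytes of option space are already taken by
   another option, max_sack_blocks(12) leaves room for 3 blocks.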

590	4.3.2.  Delayed Acknowledgments

592	   For certain traffic patterns, Delayed ACKs may have a detrimental
593	   effect, as already noted in Section 4.2.3.  Advanced TCP stacks may
594	   use heuristics to determine the maximum delay for an ACK.  For CNNs,
595	   the recommendation depends on the expected communication patterns.

597	   When traffic over a CNN is expected to mostly be unidirectional
598	   messages with a size typically up to one MSS, and the time between
599	   two consecutive message transmissions is greater than the Delayed ACK
600	   timeout, it may make sense to use a small timeout or disable Delayed
601	   ACKs at the receiver.  This avoids incurring additional delay, as
602	   well as the energy consumption of the sender (which might e.g. keep
603	   its radio interface in receive mode) during that time.  Note that
604	   disabling Delayed ACKs may only be possible if the peer device is
605	   administered by the same entity managing the constrained device.  For
606	   request-response traffic, enabling Delayed ACKs is recommended at the
607	   server end, in order to allow combining a response with the ACK into
608	   a single segment, thus increasing efficiency.  In addition, if a
609	   client issues requests infrequently, disabling Delayed ACKs at the
610	   client allows an immediate ACK for the data segment carrying the
611	   response.

613	   In contrast, Delayed ACKs reduce the number of ACKs in bulk-transfer
614	   traffic, e.g. for firmware/software updates or for transferring
615	   larger data units containing a batch of sensor readings.

617	   Note that, in many scenarios, the peer that a constrained device
618	   communicates with will be a general purpose system that communicates
619	   with both constrained and unconstrained devices.  Since Delayed ACKs
620	   are often configured through system-wide parameters, Delayed ACK
621	   behavior at the peer will be the same regardless of the nature of the
622	   endpoints it talks to.  Such a peer will typically have Delayed ACKs
623	   enabled.

625	4.3.3.  Initial Window

627	   RFC 5681 specifies a TCP Initial Window (IW) of roughly 4 kB
628	   [RFC5681].  Subsequently, RFC 6928 defined an experimental new value
629	   for the IW, which in practice will result in an IW of 10 MSS
630	   [RFC6928].  The latter is nowadays used in many TCP implementations.

632	   Note that a 10-MSS IW was recommended for resource-rich environments
633	   (e.g. broadband environments), which are significantly different from
634	   CNNs.  In CNNs, many application layer data units are relatively
635	   small (e.g. below one MSS).  However, larger objects (e.g. large
636	   files containing sensor readings, firmware updates, etc.) may also
637	   need to be transferred in CNNs.  If such a large object is
638	   transferred in CNNs, with an IW setting of 10 MSS, there is
639	   significant buffer overflow risk.  To avoid such problems, in CNNs
640	   the IW needs to be carefully set, based on device and network
641	   resource constraints.  In many cases, a safe IW setting will be
642	   smaller than 10 MSS.
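
   The following Python sketch compares the IW values of [RFC5681]
   and [RFC6928] for a given MSS, and shows one possible way of
   capping the IW for a CNN.  The two IW formulas are those of the
   respective RFCs.  The capping function, its name and its
   byte-budget parameter are purely hypothetical and only illustrate
   the idea of deriving the IW from resource constraints; how such a
   budget would be chosen is outside the scope of this sketch.

```python
def iw_rfc5681(mss):
    """Initial Window in bytes according to RFC 5681."""
    if mss > 2190:
        return 2 * mss
    if mss > 1095:
        return 3 * mss
    return 4 * mss


def iw_rfc6928(mss):
    """Experimental Initial Window in bytes according to RFC 6928."""
    return min(10 * mss, max(2 * mss, 14600))


def capped_iw(mss, budget_bytes):
    """Hypothetical CNN policy: RFC 6928 IW, limited to a byte budget.

    The result is a whole number of segments and never less than 1 MSS.
    """
    whole_segments = (budget_bytes // mss) * mss
    return min(iw_rfc6928(mss), max(mss, whole_segments))


mss = 1220
print(iw_rfc5681(mss), iw_rfc6928(mss), capped_iw(mss, 3000))
```

   With an MSS of 1220 bytes, the RFC 6928 value is 10 MSS (12200
   bytes), whereas a 3000-byte budget reduces the IW to 2 MSS.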

644	5.  TCP usage recommendations in CNNs

646	   This section discusses how TCP can be used by applications that are
647	   developed for CNN scenarios.  These remarks are by and large
648	   independent of how TCP is exactly implemented.

650	5.1.  TCP connection initiation

652	   In the constrained device to unconstrained device scenario
653	   illustrated above, a TCP connection is typically initiated by the
654	   constrained device, in order for this device to support possible
655	   sleep periods to save energy.

657	5.2.  Number of concurrent connections

659	   TCP endpoints with a small amount of memory may only support a small
660	   number of connections.  Each TCP connection requires storing a number
661	   of variables in the Transmission Control Block (TCB).  Depending on
662	   the internal TCP implementation, each connection may result in
663	   further memory overhead, and connections may compete for scarce
664	   resources (e.g. further memory overhead for send and receive
665	   buffers).

667	   A careful application design may try to keep the number of concurrent
668	   connections as small as possible.  A client can for instance limit
669	   the number of simultaneous open connections that it maintains to a
670	   given server.  Multiple connections could for instance be used to
671	   avoid the "head-of-line blocking" problem in an application transfer.
672	   However, in addition to consuming resources, using multiple
673	   connections can also cause undesirable side effects in congested
674	   networks.  For example, the HTTP/1.1 specification encourages clients
675	   to be conservative when opening multiple connections [RFC7230].
676	   Furthermore, each new connection will start with a 3-way handshake,
677	   therefore increasing message overhead.

679	   Being conservative when opening multiple TCP connections is of
680	   particular importance in Constrained-Node Networks.

682	5.3.  TCP connection lifetime

684	   In order to minimize message overhead, it makes sense to keep a TCP
685	   connection open as long as the two TCP endpoints have more data to
686	   send.  If applications exchange data rather infrequently, i.e., if
687	   TCP connections would stay idle for a long time, the idle time can
688	   result in problems.  For instance, certain middleboxes such as
689	   firewalls or NAT devices are known to delete state records after an
690	   inactivity interval.  RFC 5382 specifies a minimum value of 124
691	   minutes for such an interval.  Measurement studies have reported that
692	   TCP NAT binding timeouts are highly variable across devices, with a
693	   median around 60 minutes, the shortest timeout being around 2
694	   minutes, and more than 50% of the devices with a timeout shorter than
695	   the aforementioned minimum timeout of 124 minutes [HomeGateway].  The
696	   timeout duration used by a middlebox implementation may not be known
697	   to the TCP endpoints.

699	   In CNNs, such middleboxes may e.g. be present at the boundary between
700	   the CNN and other networks.  If the middlebox can be optimized for
701	   CNN use cases, it makes sense to increase the initial value for
702	   filter state inactivity timers to avoid problems with idle
703	   connections.  Apart from that, this problem can be dealt with by
704	   different connection handling strategies, each having pros and cons.

706	   One approach for infrequent data transfer is to use short-lived TCP
707	   connections.  Instead of trying to maintain a TCP connection for a
708	   long time, short-lived connections can be opened between two
709	   endpoints and closed when no more data needs to be exchanged.
710	   For use cases that can cope with the additional messages and the
711	   latency resulting from starting new connections, it is recommended to
712	   use a sequence of short-lived connections, instead of maintaining a
713	   single long-lived connection.

715	   The message and latency overhead that stems from using a sequence of
716	   short-lived connections could be reduced by TCP Fast Open (TFO)
717	   [RFC7413], which is an experimental TCP extension, at the expense of
718	   increased implementation complexity and increased TCP Control Block
719	   (TCB) size.  TFO allows data to be carried in SYN (and SYN-ACK)
720	   segments, and to be consumed immediately by the receiving endpoint.
721	   This reduces the message and latency overhead compared to the
722	   traditional three-way handshake to establish a TCP connection.  For
723	   security reasons, the connection initiator has to request a TFO
724	   cookie from the other endpoint.  The cookie, with a size of 4 to 16
725	   bytes, is then included in SYN packets of subsequent connections.
726	   The cookie needs to be refreshed (and obtained by the client) after a
727	   certain amount of time.  Nevertheless, TFO is more efficient than
728	   frequently opening new TCP connections with the traditional three-way
729	   handshake, as long as the cookie can be reused in subsequent
730	   connections.  However, as stated in RFC 7413, TFO deviates from the
731	   standard TCP semantics, since the data in the SYN could be replayed
732	   to an application in some rare circumstances.  Applications should
733	   not use TFO unless they can tolerate this issue, e.g., by using
734	   Transport Layer Security (TLS) [RFC7413].  A comprehensive discussion
735	   of TFO can be found in RFC 7413.

737	   Another approach is to use long-lived TCP connections with
738	   application-layer heartbeat messages.  Various application protocols
739	   support such heartbeat messages (e.g., CoAP over TCP [RFC8323]).
740	   Periodic application-layer heartbeats can prevent early filter state
741	   record deletion in middleboxes.  If the TCP binding timeout for a
742	   middlebox to be traversed by a given connection is known, middlebox
743	   filter state deletion will be avoided if the heartbeat period is
744	   shorter than the middlebox TCP binding timeout.  Otherwise, the
745	   implementer needs to take into account that middlebox TCP binding
746	   timeouts fall in a wide range of possible values [HomeGateway], and
747	   it may be hard to find a proper heartbeat period for application-
748	   layer heartbeat messages.
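
   The trade-off between heartbeat period and message overhead can be
   illustrated with a small Python sketch.  The timeout values used
   below (124 minutes and 2 minutes) are the ones quoted earlier in
   this section; the safety factor of 0.5 and the function names are
   assumptions made for this example, not recommendations.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def heartbeat_period_s(binding_timeout_s, safety_factor=0.5):
    """Pick a heartbeat period below a known middlebox binding timeout.

    safety_factor is a hypothetical margin: the period is set to that
    fraction of the timeout, leaving room for lost or delayed heartbeats.
    """
    if binding_timeout_s <= 0 or not 0 < safety_factor < 1:
        raise ValueError("timeout must be positive, factor within (0, 1)")
    return binding_timeout_s * safety_factor


def heartbeats_per_day(period_s):
    """Number of heartbeat messages sent per day for a given period."""
    return SECONDS_PER_DAY / period_s


for timeout_min in (124, 2):
    period = heartbeat_period_s(timeout_min * 60)
    print(timeout_min, "min timeout:", period, "s period,",
          round(heartbeats_per_day(period), 1), "heartbeats/day")
```

   Designing for the shortest reported timeouts rather than for the
   RFC 5382 minimum increases the number of heartbeats per day by
   roughly two orders of magnitude, which is the cost an implementer
   has to weigh when the actual timeout is unknown.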

750	   One specific advantage of heartbeat messages is that they also allow
751	   aliveness checks at the application level.  In general, it makes
752	   sense to realize aliveness checks at the highest protocol layer
753	   possible that is meaningful to the application, in order to maximize
754	   the depth of the aliveness check.  In addition, timely detection of a
755	   dead peer may allow savings in terms of TCB memory use.  However, the
756	   transmission of heartbeat messages consumes resources.  This aspect
757	   needs to be assessed carefully, considering the characteristics of
758	   each specific CNN.

760	   A TCP implementation may also be able to send "keep-alive" segments
761	   to test a TCP connection.  According to [RFC1122], "keep-alives" are
762	   an optional TCP mechanism that is turned off by default, i.e., an
763	   application must explicitly enable it for a TCP connection.  The
764	   interval between "keep-alive" messages must be configurable and it
765	   must default to no less than two hours.  With this large timeout, TCP
766	   keep-alive messages might not always be useful to avoid deletion of
767	   filter state records in some middleboxes.  However, sending TCP keep-
768	   alive probes more frequently risks draining power on energy-
769	   constrained devices.
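
   As an illustration of keep-alives being an explicit, per-connection
   choice of the application, the following Python sketch enables them
   through the sockets API of a general purpose system (e.g., the
   unconstrained peer of a constrained device).  SO_KEEPALIVE is
   widely available; the three tuning options are platform-specific
   and are only applied where the Python socket module defines them.
   The function name and the default values are assumptions made for
   this example, and constrained stacks may expose a different
   interface or none at all.

```python
import socket


def enable_keepalive(sock, idle_s=7200, interval_s=75, probes=9):
    """Turn on TCP keep-alives for one socket (they are off by default).

    idle_s defaults to two hours, the smallest default interval that
    RFC 1122 allows.  interval_s and probes are illustrative values.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket tuning is not portable: skip options that this
    # platform's socket module does not define.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(s)
print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
s.close()
```

   Lowering idle_s below typical middlebox timeouts keeps filter state
   alive, at the energy cost discussed above.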

771	6.  Security Considerations

773	   Best current practice for securing TCP and TCP-based communication
774	   also applies to CNNs.  As an example, use of Transport Layer Security
775	   (TLS) is strongly recommended if it is applicable.

777	   There are also TCP options which can improve TCP security.  One
778	   example is the TCP Authentication Option (TCP-AO) [RFC5925].
779	   However, this option adds overhead and complexity.  TCP-AO typically
780	   has a size of 16-20 bytes.

782	   For the mechanisms discussed in this document, the corresponding
783	   considerations apply.  For instance, if TFO is used, the security
784	   considerations of [RFC7413] apply.

786	   Constrained devices are expected to support smaller TCP window sizes
787	   than less limited devices.  In such conditions, segment
788	   retransmission triggered by RTO expiration is expected to be
789	   relatively frequent, due to lack of (enough) duplicate ACKs,
790	   especially when a constrained device uses a single-MSS
791	   implementation.  For this reason, constrained devices running TCP may
792	   appear as particularly appealing victims of the so-called "shrew"
793	   Denial of Service (DoS) attack [shrew], whereby one or more sources
794	   generate a packet spike targeted to coincide with consecutive
795	   RTO-expiration-triggered retry attempts of a victim node.  Note that
796	   the attack may be performed by Internet-connected devices, including
797	   constrained devices in the same CNN as the victim, as well as remote
798	   ones.  Mitigation techniques include RTO randomization and attack
799	   blocking by routers able to detect shrew attacks based on their
800	   traffic pattern.
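
   One possible form of RTO randomization is sketched below in
   Python.  This document does not specify how the randomization
   should be done, so the approach shown here (stretching the computed
   RTO by a random factor, and never shortening it), the function name
   and the default spread of 50% are all assumptions of this example.

```python
import random


def randomized_rto(rto_s, spread=0.5, rng=random):
    """Return rto_s stretched by a random factor in [1, 1 + spread].

    The timer is only ever lengthened, so the result is never more
    aggressive than the RTO computed by the underlying algorithm.
    """
    if rto_s <= 0 or spread < 0:
        raise ValueError("rto_s must be positive and spread non-negative")
    return rto_s * (1.0 + rng.uniform(0.0, spread))


rng = random.Random(7)   # seeded only to make the example repeatable
print([round(randomized_rto(1.0, rng=rng), 3) for _ in range(3)])
```

   Because successive retransmission instants are no longer a
   deterministic function of the RTO, an attacker has a harder time
   aligning periodic packet spikes with them.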

802	7.  Acknowledgments

804	   Carles Gomez has been funded in part by the Spanish Government
805	   (Ministerio de Educacion, Cultura y Deporte) through the Jose
806	   Castillejo grants CAS15/00336 and CAS18/00170, and by European
807	   Regional Development Fund (ERDF) and the Spanish Government through
808	   project TEC2016-79988-P, AEI/FEDER, UE.  Part of his contribution to
809	   this work has been carried out during his stays as a visiting scholar
810	   at the Computer Laboratory of the University of Cambridge.

812	   The authors appreciate the feedback received for this document.  The
813	   following folks provided comments that helped improve the document:
814	   Carsten Bormann, Zhen Cao, Wei Genyu, Ari Keranen, Abhijan
815	   Bhattacharyya, Andres Arcia-Moret, Yoshifumi Nishida, Joe Touch, Fred
816	   Baker, Nik Sultana, Kerry Lynn, Erik Nordmark, Markku Kojo, Hannes
817	   Tschofenig, David Black, Ilpo Jarvinen, Emmanuel Baccelli, Stuart
818	   Cheshire, Gorry Fairhurst, and Ingemar Johansson.
819	   Simon Brummer provided details, and kindly performed RAM and ROM
820	   usage measurements, on the RIOT TCP implementation.  Xavi Vilajosana
821	   provided details on the OpenWSN TCP implementation.  Rahul Jadhav
822	   kindly performed code size measurements on the Contiki-NG and lwIP
823	   2.1.2 TCP implementations.  He also provided details on the uIP TCP
824	   implementation.

826	8.  Annex.  TCP implementations for constrained devices

828	   This section overviews the main features of TCP implementations for
829	   constrained devices.  The survey is limited to open source stacks
830	   with small footprint.  It is not meant to be all-encompassing.  For
831	   more powerful embedded systems (e.g., with 32-bit processors), there
832	   are further stacks that comprehensively implement TCP.  Note that
833	   this Annex is based on information available at the time of
834	   writing.

836	8.1.  uIP

838	   uIP is a TCP/IP stack, targeted at 8- and 16-bit microcontrollers,
839	   which pioneered TCP/IP implementations for constrained devices.  uIP
840	   has been deployed with Contiki and the Arduino Ethernet shield.  A
841	   code size of ~5 kB (which comprises checksumming, IP, ICMP and TCP)
842	   has been reported for uIP [Dunk].

844	   uIP uses the same global buffer for both incoming and outgoing
845	   traffic, which has a size of a single packet.  In case of a
846	   retransmission, an application must be able to reproduce the same
847	   user data that had been transmitted.  Multiple connections are
848	   supported, but need to share the global buffer.

850	   The MSS is announced via the MSS option on connection establishment
851	   and the receive window size (of one MSS) is not modified during a
852	   connection.  Stop-and-wait operation is used for sending data.  Among
853	   other optimizations, this avoids sliding window operations, which
854	   use 32-bit arithmetic extensively and are expensive on 8-bit
855	   CPUs.

857	   Contiki uses the "split hack" technique (see Section 4.2.3) to avoid
858	   Delayed ACKs for senders using a single segment.

860	   The code size of the TCP implementation in Contiki-NG has been
861	   measured to be 3.2 kB on CC2538DK, cross-compiling on Linux.

863	8.2.  lwIP

865	   lwIP is a TCP/IP stack, targeted at 8- and 16-bit microcontrollers.
866	   lwIP has a total code size of ~14 kB to ~22 kB (which comprises
867	   memory management, checksumming, network interfaces, IP, ICMP and
868	   TCP), and a TCP code size of ~9 kB to ~14 kB [Dunk].

870	   In contrast with uIP, lwIP decouples applications from the network
871	   stack.  lwIP supports a TCP transmission window greater than a single
872	   segment, as well as buffering of incoming and outgoing data.  Other
873	   implemented mechanisms comprise slow start, congestion avoidance,
874	   fast retransmit and fast recovery.  SACK and Window Scale support has
875	   been recently added to lwIP.

877	8.3.  RIOT

879	   The RIOT TCP implementation (called GNRC TCP) has been designed for
880	   Class 1 devices [RFC7228].  The main target platforms are 8- and
881	   16-bit microcontrollers, with 32-bit platforms also supported.  GNRC
882	   TCP offers a function set similar to that of uIP, but it provides and
883	   maintains an independent receive buffer for each connection.  In
884	   contrast to uIP, retransmission is also handled by GNRC TCP.  For
885	   simplicity, GNRC TCP uses a single-MSS implementation.  The
886	   application programmer does not need to know anything about the TCP
887	   internals, therefore GNRC TCP can be seen as a user-friendly uIP TCP
888	   implementation.

890	   The MSS is set on connection establishment and cannot be changed
891	   during connection lifetime.  GNRC TCP allows multiple connections in
892	   parallel, but each TCB must be allocated somewhere in the system.  By
893	   default there is only enough memory allocated for a single TCP
894	   connection, but it can be increased at compile time if the user needs
895	   multiple parallel connections.

897	   The RIOT TCP implementation offers an optional POSIX socket wrapper
898	   that enables POSIX compliance, if needed.

900	   Further details on RIOT and GNRC can be found in the literature
901	   [RIOT], [GNRC].

903	8.4.  TinyOS

905	   TinyOS was important as a platform for early constrained devices.
906	   TinyOS has an experimental TCP stack that uses a simple nonblocking
907	   library-based implementation of TCP, which provides a subset of the
908	   socket interface primitives.  The application is responsible for
909	   buffering.  The TCP library does not do any receive-side buffering.
910	   Instead, it will immediately dispatch new, in-order data to the
911	   application and otherwise drop the segment.  A send buffer is
912	   provided by the application.  Multiple TCP connections are possible.
913	   Recently there has been little further work on the stack.

915	8.5.  FreeRTOS

917	   FreeRTOS is a real-time operating system kernel for embedded devices
918	   that is supported by 16- and 32-bit microprocessors.  Its TCP
919	   implementation is based on a multiple-segment window size, although
920	   a 'Tiny-TCP' option, which is a single-MSS variant, can be enabled.
921	   Delayed ACKs are supported, with a 20-ms Delayed ACK timer as a
922	   technique intended 'to gain performance'.

924	8.6.  uC/OS

926	   uC/OS is a real-time operating system kernel for embedded devices,
927	   which is maintained by Micrium.  uC/OS is intended for 8-, 16- and
928	   32-bit microprocessors.  The uC/OS TCP implementation supports a
929	   multiple-segment window size.

931	8.7.  Summary
932	                        +---+---------+--------+----+------+--------+-----+
933	                        |uIP|lwIP orig|lwIP 2.1|RIOT|TinyOS|FreeRTOS|uC/OS|
934	   +------+-------------+---+---------+--------+----+------+--------+-----+
935	   |Memory|Code size(kB)| <5|~9 to ~14|   38   | <7 | N/A  |  <9.2  | N/A |
936	   |      |             |(a)|   (T1)  |  (T4)  |(T3)|      |  (T2)  |     |
937	   +------+-------------+---+---------+--------+----+------+--------+-----+
938	   |      | Single-Segm.|Yes|    No   |   No   | Yes|  No  |   No   |  No |
939	   |      +-------------+---+---------+--------+----+------+--------+-----+
940	   |      |  Slow start | No|   Yes   |   Yes  | No | Yes  |   No   | Yes |
941	   |  T   +-------------+---+---------+--------+----+------+--------+-----+
942	   |  C   |Fast rec/retx| No|   Yes   |   Yes  | No | Yes  |   No   | Yes |
943	   |  P   +-------------+---+---------+--------+----+------+--------+-----+
944	   |      |  Keep-alive | No|    No   |   Yes  | No |  No  |  Yes   | Yes |
945	   |      +-------------+---+---------+--------+----+------+--------+-----+
946	   |  f   |  Win. Scale | No|    No   |   Yes  | No |  No  |  Yes   |  No |
947	   |  e   +-------------+---+---------+--------+----+------+--------+-----+
948	   |  a   |  TCP timest.| No|    No   |   Yes  | No |  No  |  Yes   |  No |
949	   |  t   +-------------+---+---------+--------+----+------+--------+-----+
950	   |  u   |      SACK   | No|    No   |   Yes  | No |  No  |  Yes   |  No |
951	   |  r   +-------------+---+---------+--------+----+------+--------+-----+
952	   |  e   |  Del. ACKs  | No|   Yes   |   Yes  | No |  No  |  Yes   | Yes |
953	   |  s   +-------------+---+---------+--------+----+------+--------+-----+
954	   |      |     Socket  | No|    No   |Optional|(I) |Subset|  Yes   | Yes |
955	   |      +-------------+---+---------+--------+----+------+--------+-----+
956	   |      |Concur. Conn.|Yes|   Yes   |   Yes  | Yes| Yes  |  Yes   | Yes |
957	   +------+-------------+---+---------+--------+----+------+--------+-----+
958	   |    TLS supported   | No|    No   |   Yes  | Yes| Yes  |  Yes   | Yes |
959	   +--------------------+---+---------+--------+----+------+--------+-----+

961	     (T1)  = TCP-only, on x86 and AVR platforms
962	     (T2)  = TCP-only, on ARM Cortex-M platform
963	     (T3)  = TCP-only, on ARM Cortex-M0+ platform (NOTE: RAM usage for the same platform
964	             is ~2.5 kB for one TCP connection plus ~1.2 kB for each additional connection)
965	     (T4)  = TCP-only, on CC2538DK, cross-compiling on Linux
966	     (a)   = includes IP, ICMP and TCP on x86 and AVR platforms. The Contiki-NG TCP implementation has a code size of 3.2 kB on CC2538DK, cross-compiling on Linux
967	     (I)   = optional POSIX socket wrapper which enables POSIX compliance if needed
968	     Mult. = Multiple
969	     N/A   = Not Available

971	     Figure 2: Summary of TCP features for different lightweight TCP
972	     implementations.  None of the implementations considered in this
973	                         Annex support ECN or TFO.
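
   As a worked example of note (T3), the following Python sketch
   estimates how many concurrent GNRC TCP connections fit into a given
   RAM budget.  The figures of ~2.5 kB for the first connection and
   ~1.2 kB for each additional one are those reported in note (T3).
   Treating 1 kB as 1000 bytes, the function names and the 8000-byte
   budget are assumptions made for this example.

```python
FIRST_CONNECTION_BYTES = 2500        # ~2.5 kB, from note (T3)
ADDITIONAL_CONNECTION_BYTES = 1200   # ~1.2 kB, from note (T3)


def tcp_ram_bytes(connections):
    """Approximate RAM needed for a number of concurrent connections."""
    if connections < 1:
        return 0
    return (FIRST_CONNECTION_BYTES
            + (connections - 1) * ADDITIONAL_CONNECTION_BYTES)


def max_connections(budget_bytes):
    """Largest number of connections whose RAM use fits the budget."""
    if budget_bytes < FIRST_CONNECTION_BYTES:
        return 0
    extra = budget_bytes - FIRST_CONNECTION_BYTES
    return 1 + extra // ADDITIONAL_CONNECTION_BYTES


budget = 8000  # bytes of RAM set aside for TCP (hypothetical)
n = max_connections(budget)
print(n, "connections use about", tcp_ram_bytes(n), "bytes")
```

   Such an estimate only covers the RAM figures quoted in note (T3);
   application buffers and other protocol layers need to be accounted
   for separately.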

975	9.  Annex.  Changes compared to previous versions

977	   RFC Editor: To be removed prior to publication

979	9.1.  Changes between -00 and -01

981	   o  Changed title and abstract

983	   o  Clarification that communication with standard-compliant TCP
984	      endpoints is required, based on feedback from Joe Touch

986	   o  Additional discussion on communication patterns

988	   o  Numerous changes to address a comprehensive review from Hannes
989	      Tschofenig

991	   o  Reworded security considerations

993	   o  Additional references and better distinction between normative and
994	      informative entries

996	   o  Feedback from Rahul Jadhav on the uIP TCP implementation

998	   o  Basic data for the TinyOS TCP implementation added, based on
999	      source code analysis

1001	9.2.  Changes between -01 and -02

1003	   o  Added text to the Introduction section, and a reference, on
1004	      traditional bad perception of TCP for IoT

1006	   o  Added sections on FreeRTOS and uC/OS

1008	   o  Updated TinyOS section

1010	   o  Updated summary table

1012	   o  Reorganized Section 4 (single-MSS vs multiple-MSS window size),
1013	      some content now also in new Section 5

1015	9.3.  Changes between -02 and -03

1017	   o  Rewording to better explain the benefit of ECN

1019	   o  Additional context information on the surveyed implementations

1021	   o  Added details, but removed "Data size" row, in the summary table
1022	   o  Added discussion on shrew attacks

1024	9.4.  Changes between -03 and -04

1026	   o  Addressing the remaining TODOs

1028	   o  Alignment of the wording on TCP "keep-alives" with related
1029	      discussions in the IETF transport area

1031	   o  Added further discussion on delayed ACKs

1033	   o  Removed OpenWSN subsection from the Annex

1035	9.5.  Changes between -04 and -05

1037	   o  Addressing comments by Yoshifumi Nishida

1039	   o  Removed mentioning MD5 as an example (comment by David Black)

1041	   o  Added memory footprint details of TCP implementations (Contiki-NG
1042	      and lwIP 2.1.2) provided by Rahul Jadhav in the Annex

1044	   o  Addressed comments by Ilpo Jarvinen throughout the whole document

1046	   o  Improved the RIOT section in the Annex, based on feedback from
1047	      Emmanuel Baccelli

1049	9.6.  Changes between -05 and -06

1051	   o  Incorporated suggestions by Stuart Cheshire

1053	9.7.  Changes between -06 and -07

1055	   o  Addressed comments by Gorry Fairhurst

1057	9.8.  Changes between -07 and -08

1059	   o  Addressed WGLC comments by Ilpo Jarvinen, Markku Kojo and Ingemar
1060	      Johansson throughout the document, including the addition of a new
1061	      subsection on Initial Window considerations.

1063	9.9.  Changes between -08 and -09

1065	   o  Addressed second round of comments by Ilpo Jarvinen and Markku
1066	      Kojo, based on the previous draft update.

1068	10.  References

1070	10.1.  Normative References

1072	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
1073	              RFC 793, DOI 10.17487/RFC0793, September 1981,
1074	              <https://www.rfc-editor.org/info/rfc793>.

1076	   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
1077	              Communication Layers", STD 3, RFC 1122,
1078	              DOI 10.17487/RFC1122, October 1989,
1079	              <https://www.rfc-editor.org/info/rfc1122>.

1081	   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
1082	              Selective Acknowledgment Options", RFC 2018,
1083	              DOI 10.17487/RFC2018, October 1996,
1084	              <https://www.rfc-editor.org/info/rfc2018>.

1086	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1087	              Requirement Levels", BCP 14, RFC 2119,
1088	              DOI 10.17487/RFC2119, March 1997,
1089	              <https://www.rfc-editor.org/info/rfc2119>.

1091	   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1092	              (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
1093	              December 1998, <https://www.rfc-editor.org/info/rfc2460>.

1095	   [RFC3042]  Allman, M., Balakrishnan, H., and S. Floyd, "Enhancing
1096	              TCP's Loss Recovery Using Limited Transmit", RFC 3042,
1097	              DOI 10.17487/RFC3042, January 2001,
1098	              <https://www.rfc-editor.org/info/rfc3042>.

1100	   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
1101	              of Explicit Congestion Notification (ECN) to IP",
1102	              RFC 3168, DOI 10.17487/RFC3168, September 2001,
1103	              <https://www.rfc-editor.org/info/rfc3168>.

1105	   [RFC3819]  Karn, P., Ed., Bormann, C., Fairhurst, G., Grossman, D.,
1106	              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
1107	              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
1108	              RFC 3819, DOI 10.17487/RFC3819, July 2004,
1109	              <https://www.rfc-editor.org/info/rfc3819>.

1111	   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
1112	              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
1113	              <https://www.rfc-editor.org/info/rfc5681>.

1115	   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
1116	              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
1117	              June 2010, <https://www.rfc-editor.org/info/rfc5925>.

1119	   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
1120	              "Computing TCP's Retransmission Timer", RFC 6298,
1121	              DOI 10.17487/RFC6298, June 2011,
1122	              <https://www.rfc-editor.org/info/rfc6298>.

1124	   [RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
1125	              "Increasing TCP's Initial Window", RFC 6928,
1126	              DOI 10.17487/RFC6928, April 2013,
1127	              <https://www.rfc-editor.org/info/rfc6928>.

1129	   [RFC7228]  Bormann, C., Ersue, M., and A. Keranen, "Terminology for
1130	              Constrained-Node Networks", RFC 7228,
1131	              DOI 10.17487/RFC7228, May 2014,
1132	              <https://www.rfc-editor.org/info/rfc7228>.

1134	   [RFC7323]  Borman, D., Braden, B., Jacobson, V., and R.
1135	              Scheffenegger, Ed., "TCP Extensions for High Performance",
1136	              RFC 7323, DOI 10.17487/RFC7323, September 2014,
1137	              <https://www.rfc-editor.org/info/rfc7323>.

1139	   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
1140	              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
1141	              <https://www.rfc-editor.org/info/rfc7413>.

1143	10.2.  Informative References

1145	   [Commag]   A. Betzler, C. Gomez, I. Demirkol, J. Paradells, "CoAP
1146	              Congestion Control for the Internet of Things", IEEE
1147	              Communications Magazine, June 2016.

1149	   [Dunk]     A. Dunkels, "Full TCP/IP for 8-Bit Architectures", 2003.

1151	   [ETEN]     R. Krishnan et al., "Explicit transport error
1152	              notification (ETEN) for error-prone wireless and satellite
1153	              networks", Computer Networks, 2004.

1155	   [GNRC]     M. Lenders et al., "Connecting the World of Embedded
1156	              Mobiles: The RIOT Approach to Ubiquitous Networking for
1157	              the IoT", 2018.

1159	   [HomeGateway]
1160	              Haetoenen, S., Nyrhinen, A., Eggert, L., Strowes, S.,
1161	              Sarolahti, P., and M. Kojo, "An Experimental Study of Home
1162	              Gateway Characteristics", Proceedings of the 10th ACM
1163	              SIGCOMM conference on Internet measurement 2010.

1165	   [I-D.delcarpio-6lo-wlanah]
1166	              Vega, L., Robles, I., and R. Morabito, "IPv6 over
1167	              802.11ah", draft-delcarpio-6lo-wlanah-01 (work in
1168	              progress), October 2015.

1170	   [I-D.ietf-tcpm-rto-consider]
1171	              Allman, M., "Retransmission Timeout Requirements", draft-
1172	              ietf-tcpm-rto-consider-08 (work in progress), February
1173	              2019.

1175	   [I-D.jarvinen-core-fasor]
1176	              Jarvinen, I., Kojo, M., Raitahila, I., and Z. Cao, "Fast-
1177	              Slow Retransmission Timeout and Congestion Control
1178	              Algorithm for CoAP", draft-jarvinen-core-fasor-02 (work in
1179	              progress), July 2019.

1181	   [IntComp]  C. Gomez, A. Arcia-Moret, J. Crowcroft, "TCP in the
1182	              Internet of Things: from ostracism to prominence", IEEE
1183	              Internet Computing, January-February 2018.

1185	   [RFC2757]  Montenegro, G., Dawkins, S., Kojo, M., Magret, V., and N.
1186	              Vaidya, "Long Thin Networks", RFC 2757,
1187	              DOI 10.17487/RFC2757, January 2000,
1188	              <https://www.rfc-editor.org/info/rfc2757>.

1190	   [RFC2884]  Hadi Salim, J. and U. Ahmed, "Performance Evaluation of
1191	              Explicit Congestion Notification (ECN) in IP Networks",
1192	              RFC 2884, DOI 10.17487/RFC2884, July 2000,
1193	              <https://www.rfc-editor.org/info/rfc2884>.

1195	   [RFC3481]  Inamura, H., Ed., Montenegro, G., Ed., Ludwig, R., Gurtov,
1196	              A., and F. Khafizov, "TCP over Second (2.5G) and Third
1197	              (3G) Generation Wireless Networks", BCP 71, RFC 3481,
1198	              DOI 10.17487/RFC3481, February 2003,
1199	              <https://www.rfc-editor.org/info/rfc3481>.

1201	   [RFC4944]  Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler,
1202	              "Transmission of IPv6 Packets over IEEE 802.15.4
1203	              Networks", RFC 4944, DOI 10.17487/RFC4944, September 2007,
1204	              <https://www.rfc-editor.org/info/rfc4944>.

1206	   [RFC6077]  Papadimitriou, D., Ed., Welzl, M., Scharf, M., and B.
1207	              Briscoe, "Open Research Issues in Internet Congestion
1208	              Control", RFC 6077, DOI 10.17487/RFC6077, February 2011,
1209	              <https://www.rfc-editor.org/info/rfc6077>.

1211	   [RFC6092]  Woodyatt, J., Ed., "Recommended Simple Security
1212	              Capabilities in Customer Premises Equipment (CPE) for
1213	              Providing Residential IPv6 Internet Service", RFC 6092,
1214	              DOI 10.17487/RFC6092, January 2011,
1215	              <https://www.rfc-editor.org/info/rfc6092>.

1217	   [RFC6120]  Saint-Andre, P., "Extensible Messaging and Presence
1218	              Protocol (XMPP): Core", RFC 6120, DOI 10.17487/RFC6120,
1219	              March 2011, <https://www.rfc-editor.org/info/rfc6120>.

1221	   [RFC6282]  Hui, J., Ed. and P. Thubert, "Compression Format for IPv6
1222	              Datagrams over IEEE 802.15.4-Based Networks", RFC 6282,
1223	              DOI 10.17487/RFC6282, September 2011,
1224	              <https://www.rfc-editor.org/info/rfc6282>.

1226	   [RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
1227	              Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
1228	              JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
1229	              Low-Power and Lossy Networks", RFC 6550,
1230	              DOI 10.17487/RFC6550, March 2012,
1231	              <https://www.rfc-editor.org/info/rfc6550>.

1233	   [RFC6606]  Kim, E., Kaspar, D., Gomez, C., and C. Bormann, "Problem
1234	              Statement and Requirements for IPv6 over Low-Power
1235	              Wireless Personal Area Network (6LoWPAN) Routing",
1236	              RFC 6606, DOI 10.17487/RFC6606, May 2012,
1237	              <https://www.rfc-editor.org/info/rfc6606>.

1239	   [RFC6775]  Shelby, Z., Ed., Chakrabarti, S., Nordmark, E., and C.
1240	              Bormann, "Neighbor Discovery Optimization for IPv6 over
1241	              Low-Power Wireless Personal Area Networks (6LoWPANs)",
1242	              RFC 6775, DOI 10.17487/RFC6775, November 2012,
1243	              <https://www.rfc-editor.org/info/rfc6775>.

1245	   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
1246	              Protocol (HTTP/1.1): Message Syntax and Routing",
1247	              RFC 7230, DOI 10.17487/RFC7230, June 2014,
1248	              <https://www.rfc-editor.org/info/rfc7230>.

1250	   [RFC7252]  Shelby, Z., Hartke, K., and C. Bormann, "The Constrained
1251	              Application Protocol (CoAP)", RFC 7252,
1252	              DOI 10.17487/RFC7252, June 2014,
1253	              <https://www.rfc-editor.org/info/rfc7252>.

1255	   [RFC7414]  Duke, M., Braden, R., Eddy, W., Blanton, E., and A.
1256	              Zimmermann, "A Roadmap for Transmission Control Protocol
1257	              (TCP) Specification Documents", RFC 7414,
1258	              DOI 10.17487/RFC7414, February 2015,
1259	              <https://www.rfc-editor.org/info/rfc7414>.

1261	   [RFC7428]  Brandt, A. and J. Buron, "Transmission of IPv6 Packets
1262	              over ITU-T G.9959 Networks", RFC 7428,
1263	              DOI 10.17487/RFC7428, February 2015,
1264	              <https://www.rfc-editor.org/info/rfc7428>.

1266	   [RFC7540]  Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
1267	              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
1268	              DOI 10.17487/RFC7540, May 2015,
1269	              <https://www.rfc-editor.org/info/rfc7540>.

1271	   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
1272	              Recommendations Regarding Active Queue Management",
1273	              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015,
1274	              <https://www.rfc-editor.org/info/rfc7567>.

1276	   [RFC7668]  Nieminen, J., Savolainen, T., Isomaki, M., Patil, B.,
1277	              Shelby, Z., and C. Gomez, "IPv6 over BLUETOOTH(R) Low
1278	              Energy", RFC 7668, DOI 10.17487/RFC7668, October 2015,
1279	              <https://www.rfc-editor.org/info/rfc7668>.

1281	   [RFC8087]  Fairhurst, G. and M. Welzl, "The Benefits of Using
1282	              Explicit Congestion Notification (ECN)", RFC 8087,
1283	              DOI 10.17487/RFC8087, March 2017,
1284	              <https://www.rfc-editor.org/info/rfc8087>.

1286	   [RFC8105]  Mariager, P., Petersen, J., Ed., Shelby, Z., Van de Logt,
1287	              M., and D. Barthel, "Transmission of IPv6 Packets over
1288	              Digital Enhanced Cordless Telecommunications (DECT) Ultra
1289	              Low Energy (ULE)", RFC 8105, DOI 10.17487/RFC8105, May
1290	              2017, <https://www.rfc-editor.org/info/rfc8105>.

1292	   [RFC8163]  Lynn, K., Ed., Martocci, J., Neilson, C., and S.
1293	              Donaldson, "Transmission of IPv6 over Master-Slave/Token-
1294	              Passing (MS/TP) Networks", RFC 8163, DOI 10.17487/RFC8163,
1295	              May 2017, <https://www.rfc-editor.org/info/rfc8163>.

1297	   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
1298	              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
1299	              DOI 10.17487/RFC8201, July 2017,
1300	              <https://www.rfc-editor.org/info/rfc8201>.

1302	   [RFC8323]  Bormann, C., Lemay, S., Tschofenig, H., Hartke, K.,
1303	              Silverajan, B., and B. Raymor, Ed., "CoAP (Constrained
1304	              Application Protocol) over TCP, TLS, and WebSockets",
1305	              RFC 8323, DOI 10.17487/RFC8323, February 2018,
1306	              <https://www.rfc-editor.org/info/rfc8323>.

1308	   [RFC8352]  Gomez, C., Kovatsch, M., Tian, H., and Z. Cao, Ed.,
1309	              "Energy-Efficient Features of Internet of Things
1310	              Protocols", RFC 8352, DOI 10.17487/RFC8352, April 2018,
1311	              <https://www.rfc-editor.org/info/rfc8352>.

1313	   [RFC8376]  Farrell, S., Ed., "Low-Power Wide Area Network (LPWAN)
1314	              Overview", RFC 8376, DOI 10.17487/RFC8376, May 2018,
1315	              <https://www.rfc-editor.org/info/rfc8376>.

1317	   [RIOT]     E. Baccelli et al., "RIOT: an Open Source Operating
1318	              System for Low-end Embedded Devices in the IoT", 2018.

1320	   [shrew]    A. Kuzmanovic, E. Knightly, "Low-Rate TCP-Targeted Denial
1321	              of Service Attacks", SIGCOMM'03 2003.

1323	Authors' Addresses

1325	   Carles Gomez
1326	   UPC
1327	   C/Esteve Terradas, 7
1328	   Castelldefels  08860
1329	   Spain

1331	   Email: carlesgo@entel.upc.edu

1333	   Jon Crowcroft
1334	   University of Cambridge
1335	   JJ Thomson Avenue
1336	   Cambridge, CB3 0FD
1337	   United Kingdom

1339	   Email: jon.crowcroft@cl.cam.ac.uk

1341	   Michael Scharf
1342	   Hochschule Esslingen
1343	   Flandernstr. 101
1344	   Esslingen  73732
1345	   Germany

1347	   Email: michael.scharf@hs-esslingen.de