idnits 2.17.1 

draft-ietf-core-cocoa-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (February 21, 2018) is 2249 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: '1' on line 560

  -- Looks like a reference, but probably isn't: '2' on line 564

  -- Looks like a reference, but probably isn't: '3' on line 567

  -- Looks like a reference, but probably isn't: '4' on line 571

  == Missing Reference: '5-10' is mentioned on line 556, but not defined

  -- Looks like a reference, but probably isn't: '5' on line 576

  -- Looks like a reference, but probably isn't: '6' on line 581

  -- Looks like a reference, but probably isn't: '7' on line 586

  -- Looks like a reference, but probably isn't: '8' on line 590

  -- Looks like a reference, but probably isn't: '9' on line 595

  -- Looks like a reference, but probably isn't: '10' on line 600


     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 11 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	CoRE Working Group                                            C. Bormann
3	Internet-Draft                                   Universitaet Bremen TZI
4	Intended status: Informational                                A. Betzler
5	Expires: August 25, 2018                                  Fundacio i2CAT
6	                                                                C. Gomez
7	                                                             I. Demirkol
8	                     Universitat Politecnica de Catalunya/Fundacio i2CAT
9	                                                       February 21, 2018

11	                CoAP Simple Congestion Control/Advanced
12	                        draft-ietf-core-cocoa-03

14	Abstract

16	   CoAP, the Constrained Application Protocol, needs to be implemented
17	   in such a way that it does not cause persistent congestion on the
18	   network it uses.  The CoRE CoAP specification defines basic behavior
19	   that exhibits low risk of congestion with minimal implementation
20	   requirements.  It also leaves room for combining the base
21	   specification with advanced congestion control mechanisms with higher
22	   performance.

24	   This specification defines more advanced, but still simple CoRE
25	   Congestion Control mechanisms, called CoCoA.  The core of these
26	   mechanisms is a Retransmission TimeOut (RTO) algorithm that makes use
27	   of Round-Trip Time (RTT) estimates, in contrast with how the RTO is
28	   determined as per the base CoAP specification (RFC 7252).  The
29	   mechanisms defined in this document have relatively low complexity,
30	   yet they improve the default CoAP RTO algorithm.  The design of the
31	   mechanisms in this specification has made use of input from
32	   simulations and experiments in real networks.

34	Status of This Memo

36	   This Internet-Draft is submitted in full conformance with the
37	   provisions of BCP 78 and BCP 79.

39	   Internet-Drafts are working documents of the Internet Engineering
40	   Task Force (IETF).  Note that other groups may also distribute
41	   working documents as Internet-Drafts.  The list of current Internet-
42	   Drafts is at https://datatracker.ietf.org/drafts/current/.

44	   Internet-Drafts are draft documents valid for a maximum of six months
45	   and may be updated, replaced, or obsoleted by other documents at any
46	   time.  It is inappropriate to use Internet-Drafts as reference
47	   material or to cite them other than as "work in progress."
48	   This Internet-Draft will expire on August 25, 2018.

50	Copyright Notice

52	   Copyright (c) 2018 IETF Trust and the persons identified as the
53	   document authors.  All rights reserved.

55	   This document is subject to BCP 78 and the IETF Trust's Legal
56	   Provisions Relating to IETF Documents
57	   (https://trustee.ietf.org/license-info) in effect on the date of
58	   publication of this document.  Please review these documents
59	   carefully, as they describe your rights and restrictions with respect
60	   to this document.  Code Components extracted from this document must
61	   include Simplified BSD License text as described in Section 4.e of
62	   the Trust Legal Provisions and are provided without warranty as
63	   described in the Simplified BSD License.

65	Table of Contents

67	   1.  Introduction
68	     1.1.  Terminology
69	   2.  Context
70	   3.  Area of Applicability
71	   4.  Advanced CoAP Congestion Control: RTO Estimation
72	     4.1.  Blind RTO Estimate
73	     4.2.  Measurement-based RTO Estimate
74	       4.2.1.  Differences with the algorithm of RFC 6298
75	       4.2.2.  Discussion
76	     4.3.  Lifetime, Aging
77	   5.  Advanced CoAP Congestion Control: Non-Confirmables
78	     5.1.  Discussion
79	   6.  IANA Considerations
80	   7.  Security Considerations
81	   8.  References
82	     8.1.  Normative References
83	     8.2.  Informative References
84	   Appendix A.  Supporting evidence
85	     A.1.  Older versions of the draft and improvement
86	     A.2.  References
87	   Appendix B.  Pseudocode
88	     B.1.  Updating the RTO estimator
89	     B.2.  RTO aging
90	     B.3.  Variable Backoff Factor
91	   Appendix C.  Examples
92	     C.1.  Example A.1: weak RTTs
93	     C.2.  Example A.2: VBF and aging
94	     C.3.  Example B: VBF and aging
95	   Appendix D.  Analysis: difference between strong and weak
96	                estimators
97	   Acknowledgements
98	   Authors' Addresses

100	1.  Introduction

102	   CoAP, the Constrained Application Protocol, needs to be implemented
103	   in such a way that it does not cause persistent congestion on the
104	   network it uses.  The CoRE CoAP specification defines basic behavior
105	   that exhibits low risk of congestion with minimal implementation
106	   requirements.  It also leaves room for combining the base
107	   specification with advanced congestion control mechanisms with higher
108	   performance.

110	   The present specification defines such an advanced CoRE Congestion
111	   Control mechanism, with the goal of improving performance while
112	   retaining safety as well as the simplicity that is appropriate for
113	   constrained devices.  Hence, we are calling this mechanism Simple
114	   Congestion Control/Advanced, or CoCoA for short.

116	   CoCoA calculates the retransmission time-out (RTO) based on RTT
117	   estimations with and without loss.  By taking retransmissions (in a
118	   potentially lossy network) into account when estimating the RTT, this
119	   algorithm reacts to congestion with a lower sending rate.  For non-
120	   confirmable packets, it also limits the sending rate to 1/RTO;
121	   assuming that the RTO estimation in CoCoA works as expected, RTO
122	   should be slightly greater than the RTT, thus CoCoA would be more
123	   conservative than the original specification in [RFC7641].

125	   In the Internet, congestion control is typically implemented in a way
126	   that it can be introduced or upgraded unilaterally.  Still, a new
127	   congestion control scheme must not be introduced lightly.  To ensure
128	   that the new scheme is not posing a danger to the network,
129	   considerable work has been done on simulations and experiments in
130	   real networks.  Some of this work will be mentioned in "Discussion"
131	   subsections in the following sections; an overview is given in
132	   Appendix A.  Extended rationale for this specification can also be
133	   found in the historical Internet-Drafts
134	   [I-D.bormann-core-congestion-control] and
135	   [I-D.eggert-core-congestion-control], as well as in the minutes of
136	   the IETF 84 CoRE WG meetings.

138	1.1.  Terminology

140	   This specification uses terms from [RFC7252].  In addition, it
141	   defines the following terminology:

143	   Initiator:  The endpoint that sends the message that initiates an
144	      exchange.  E.g., the party that sends a confirmable message, or a
145	      non-confirmable message (see Section 4.3 of [RFC7252]) conveying a
146	      request.

148	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
149	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
150	   "OPTIONAL" in this document are to be interpreted as described in
151	   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
152	   capitals, as shown here.

154	   The term "byte", abbreviated by "B", is used in its now customary
155	   sense as a synonym for "octet".

157	2.  Context

159	   In the definition of the CoAP protocol [RFC7252], an approach was
160	   taken that includes a very simple basic scheme (lock-step with the
161	   number of parallel exchanges usually limited to 1) in the base
162	   specification together with performance-enhancing advanced
163	   mechanisms.

165	   The present specification is based on the approved text in the
166	   [RFC7252] base specification.  It is making use of the text that
167	   permits advanced congestion control mechanisms and allows them to
168	   change protocol parameters, including NSTART and the binary
169	   exponential backoff mechanism.  Note that Section 4.8 of [RFC7252]
170	   limits the leeway that implementations have in changing the CoRE
171	   protocol parameters.

173	   The present specification also assumes that, outside of exchanges,
174	   non-confirmable messages can only be used at a limited rate without
175	   an advanced congestion control mechanism (this is mainly relevant for
176	   [RFC7641]).  It is also intended to address the [RFC8085] guideline
177	   about combining congestion control state for a destination, and to
178	   clarify its meaning for CoAP using the definition of an endpoint.

180	   The present specification does not address multicast or dithering
181	   beyond basic retransmission dithering.

183	3.  Area of Applicability

185	   The present algorithm is intended to be generally applicable.  The
186	   objective is to be "better" than default CoAP congestion control in a
187	   number of characteristics, including achievable goodput for a given
188	   offered load, latency, and recovery from bursts, while providing more
189	   predictable stress to the network and the same level of safety from
190	   catastrophic congestion.  The algorithm defined in this document is
191	   intended to adapt to the current characteristics of any underlying
192	   network, and therefore is well suited for a wide range of network
193	   conditions, in terms of bandwidth, latency, load, loss rate,
194	   topology, etc.  In particular, CoCoA has been found to perform well
195	   in scenarios with latencies ranging from the order of milliseconds to
196	   peaks of dozens of seconds, as well as in single-hop and multihop
197	   topologies.  Link technologies used in existing evaluation work
198	   comprise IEEE 802.15.4, GPRS, UMTS and Wi-Fi (see Appendix A).  CoCoA
199	   is also expected to work suitably across the general Internet.  The
200	   algorithm does require three state variables per scope plus the state
201	   needed to do RTT measurements, so it may not be applicable to the
202	   most constrained devices (say, class 1 as per [RFC7228]).

204	   The scope of each instance of the algorithm in the current set of
205	   evaluations has been the five-tuple, i.e., CoAP + endpoint (transport
206	   address) for Initiator and Responder.  Potential applicability to
207	   larger scopes needs to be examined.

209	4.  Advanced CoAP Congestion Control: RTO Estimation

211	   For an initiator that plans to make multiple requests to one
212	   destination endpoint, it may be worthwhile to make RTT measurements
213	   in order to compute a more appropriate RTO than the default initial
214	   timeout of 2 to 3 s.  In particular, a wide spectrum of RTT values is
215	   expected in different types of networks where CoAP is used.  Those
216	   RTTs range from several orders of magnitude below the default initial
217	   timeout to values larger than the default.  The algorithm defined in
218	   this document is based on the algorithm for RTO estimation defined in
219	   [RFC6298], with appropriately extended default/base values, as
220	   proposed in Section 4.2.1.  Note that such a mechanism must, during
221	   idle periods, decay RTO estimates that are shorter or longer than the
222	   default RTO estimate back to the default RTO estimate, until fresh
223	   measurements become available again, as proposed in Section 4.3.

225	   RTT variability challenges RTO estimation.  In TCP, delayed ACKs
226	   contribute to RTT variability, since this option adds a delay of up
227	   to 500 ms (typically, 200 ms) before an ACK is sent by a receiving
228	   TCP endpoint.  However, one important consideration not relevant for
229	   TCP is the fact that a CoAP round-trip may include application
230	   processing time, which may be hard to predict, and may differ between
231	   different resources available at the same endpoint.  Also, for
232	   communications with networks of constrained devices that apply radio
233	   duty cycling, large and variable round-trip times are likely to be
234	   observed.  Servers will only trigger their early ACKs (with a non-
235	   piggybacked response to be sent later) based on the default timers,
236	   e.g. after 1 s.  A client that has arrived at an RTO estimate shorter
237	   than 1 s SHOULD therefore use a larger backoff factor for
238	   retransmissions to avoid expending all of its retransmissions
239	   (MAX_RETRANSMIT, see Section 4.2 of [RFC7252], normally 4) in the
240	   default interval of 2 to 3 s.  The approach chosen for a mechanism
241	   with variable backoff factors is presented in Section 4.2.1.

243	   It may also be worthwhile to perform RTT estimation not just based on
244	   information measured from a single destination endpoint, but also
245	   based on entire hosts (IP addresses) and/or complete prefixes (e.g.,
246	   maintain an RTT estimate for a whole /64).  The exact way this can be
247	   used to reduce the amount of state in an initiator is for further
248	   study.

250	4.1.  Blind RTO Estimate

252	   The initial RTO estimate for an endpoint is set to 2 seconds (the
253	   initial RTO estimate is used as the initial value for both E_weak_
254	   and E_strong_ below).

256	   If only the initial RTO estimate is available, the RTO estimate for
257	   each of up to NSTART exchanges started in parallel is set to 2 s
258	   times the number of parallel exchanges, e.g. if two exchanges are
259	   already running, the initial RTO estimate for an additional exchange
260	   is 6 seconds.
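
   The blind estimate above can be sketched as follows.  This is only
   an illustration: the function name and the convention that the
   count includes the exchange being started are assumptions, and the
   caller is assumed to keep the count at or below NSTART.

```python
INITIAL_RTO_S = 2.0  # initial RTO estimate from Section 4.1, in seconds


def blind_rto(parallel_exchanges: int) -> float:
    """RTO for each exchange when only the initial estimate is available.

    parallel_exchanges counts every exchange in flight, including the
    one being started (assumed convention for this sketch).
    """
    if parallel_exchanges < 1:
        raise ValueError("at least one exchange must be counted")
    return INITIAL_RTO_S * parallel_exchanges
```

   With two exchanges already running, the additional exchange is the
   third one, so blind_rto(3) yields the 6 seconds given in the text.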

262	4.2.  Measurement-based RTO Estimate

264	   The RTO estimator runs two copies of the algorithm defined in
265	   [RFC6298], using the same variables and calculations to estimate the
266	   RTO, with the differences introduced in Section 4.2.1: One copy for
267	   exchanges that complete on initial transmissions (the "strong
268	   estimator", E_strong_), and one copy for exchanges that have run into
269	   retransmissions, where only the first two retransmissions are
270	   considered (the "weak estimator", E_weak_).  For the latter, there is
271	   some ambiguity whether a response is based on the initial
272	   transmission or the retransmissions.  For the purposes of the weak
273	   estimator, the time from the initial transmission counts.  Responses
274	   obtained after the third retransmission are not used to update an
275	   estimator.
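
   As an illustration of the selection rule above, the following
   sketch maps a completed exchange to the estimator it feeds.  The
   function name, the string labels, and the timestamp parameters are
   assumptions made for this example; the thresholds follow the text
   (no retransmission: strong; one or two retransmissions: weak; three
   or more: sample discarded).

```python
from typing import Optional, Tuple


def rtt_sample(initial_tx_time: float, response_time: float,
               retransmissions: int) -> Optional[Tuple[str, float]]:
    """Return (estimator_name, rtt) or None if the sample is not usable."""
    if retransmissions < 0 or response_time < initial_tx_time:
        raise ValueError("invalid timing or retransmission count")
    # The RTT is always measured from the initial transmission, even
    # when the response may have been triggered by a retransmission.
    rtt = response_time - initial_tx_time
    if retransmissions == 0:
        return ("strong", rtt)
    if retransmissions <= 2:
        return ("weak", rtt)
    return None  # third or later retransmission: no estimator is updated
```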

277	   The overall RTO estimate is an exponentially weighted moving average
278	   of the strong and the weak estimator.  It is updated after each
279	   contribution to the weak estimator (1) or to the strong estimator
280	   (2), using the value of whichever estimator received the most recent
281	   contribution:

283	   RTO := w_weak   * E_weak_   + (1 - w_weak)   * RTO       (1)

285	   RTO := w_strong * E_strong_ + (1 - w_strong) * RTO       (2)
286	   (Splitting this update into the two cases avoids making the
287	   contribution of the weak estimator too big in naturally lossy
288	   networks.)

290	   The default values for the corresponding weights, w_weak and
291	   w_strong, are 0.25 and 0.5, respectively.  These values have been
292	   found to offer good performance in evaluations (see Appendix A).
293	   Pseudocode and examples for the overall RTO estimate presented are
294	   available in Appendix B.1 and Appendix C.1.
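
   A minimal sketch of updates (1) and (2) is shown below.  It is not
   the pseudocode of Appendix B.1.  The class and method names are
   hypothetical.  The gains alpha = 1/8 and beta = 1/4 and the
   multiplier K = 4 for the strong estimator are the RFC 6298 values;
   K = 1 for the weak estimator is the setting from Section 4.2.1.
   The sketch ignores the clock granularity term and the minimum RTO
   of RFC 6298, and it initializes each estimator from its first
   sample, which is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional

ALPHA = 1 / 8        # SRTT smoothing gain from RFC 6298
BETA = 1 / 4         # RTTVAR smoothing gain from RFC 6298
INITIAL_RTO_S = 2.0  # initial overall RTO, in seconds


@dataclass
class Estimator:
    k: float                      # RTT variance multiplier K
    weight: float                 # w_strong or w_weak
    srtt: Optional[float] = None  # smoothed RTT, unset until first sample
    rttvar: float = 0.0

    def update(self, rtt: float) -> float:
        """Feed one RTT sample (seconds) and return the estimator value E."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - rtt)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * rtt
        return self.srtt + self.k * self.rttvar


class OverallRto:
    def __init__(self) -> None:
        self.estimators = {
            "strong": Estimator(k=4, weight=0.5),
            "weak": Estimator(k=1, weight=0.25),
        }
        self.rto = INITIAL_RTO_S

    def on_sample(self, kind: str, rtt: float) -> float:
        """Apply update (1) for kind == "weak" or (2) for kind == "strong"."""
        est = self.estimators[kind]
        e = est.update(rtt)
        self.rto = est.weight * e + (1 - est.weight) * self.rto
        return self.rto
```

   Under these assumptions, a first strong sample of 1 s gives
   E_strong_ = 1 + 4 * 0.5 = 3 s, and the overall RTO moves from 2 s to
   0.5 * 3 + 0.5 * 2 = 2.5 s.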

296	4.2.1.  Differences with the algorithm of RFC 6298

298	   This subsection presents three differences between the algorithm
299	   defined in this document and the one defined in [RFC6298].  The first
300	   two recommend new parameter settings.  The third one is the variable
301	   backoff factor (VBF), which replaces RFC 6298's simple exponential
302	   backoff that always multiplies the RTO by a factor of 2 when the RTO
303	   timer expires.

305	   The initial value for each of the two RTO estimators is 2 s.

307	   For the weak estimator, the factor K (the RTT variance multiplier) is
308	   set to 1 instead of 4.  This is necessary to avoid a strong increase
309	   of the RTO in the case that the RTTVAR value is very large, which may
310	   be the case if a weak RTT measurement is obtained after one or more
311	   retransmissions.

313	   To prevent exchanges with small initial RTOs (i.e., an RTO estimate
314	   lower than 1 s) from using up all retransmissions in a short
315	   interval of time, the RTO for a retransmission is multiplied by 3 for
316	   each retransmission as long as the RTO is less than 1 s.

318	   Conversely, to ensure that exchanges with large initial RTOs (i.e.,
319	   an RTO estimate greater than 3 s) can still carry out all
320	   retransmissions within MAX_TRANSMIT_WAIT (normally 93 s), the RTO is
321	   multiplied only by 1.5 when the RTO is greater than 3 s.

323	   Pseudocode for the variable backoff factor is in Appendix B.3.

325	   The binary exponential backoff is truncated at 32 seconds.  Similar
326	   to the way retransmissions are handled in the base specification,
327	   they are dithered between 1 x RTO and ACK_RANDOM_FACTOR x RTO.
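
   The backoff rules of this subsection can be sketched as follows.
   This is an illustration, not the pseudocode of Appendix B.3.  The
   function names are hypothetical, and the sketch assumes that the
   factor is re-evaluated against the current RTO before every
   retransmission and that a factor of 2 applies between 1 s and 3 s.
   ACK_RANDOM_FACTOR = 1.5 and MAX_RETRANSMIT = 4 are the RFC 7252
   defaults.

```python
import random

ACK_RANDOM_FACTOR = 1.5   # RFC 7252 default
MAX_BACKOFF_RTO_S = 32.0  # truncation limit from this section


def backoff_factor(rto: float) -> float:
    """Variable backoff factor for the current RTO (seconds)."""
    if rto < 1.0:
        return 3.0
    if rto > 3.0:
        return 1.5
    return 2.0


def next_rto(rto: float) -> float:
    """RTO for the next retransmission, truncated at 32 s."""
    return min(rto * backoff_factor(rto), MAX_BACKOFF_RTO_S)


def dithered_timeout(rto: float, rng=random) -> float:
    """Pick the actual timeout between 1 x RTO and ACK_RANDOM_FACTOR x RTO."""
    return rng.uniform(rto, ACK_RANDOM_FACTOR * rto)


def rto_schedule(initial_rto: float, max_retransmit: int = 4) -> list:
    """Undithered RTO for the initial transmission and each retransmission."""
    schedule = [initial_rto]
    for _ in range(max_retransmit):
        schedule.append(next_rto(schedule[-1]))
    return schedule
```

   For an initial RTO of 0.5 s, rto_schedule(0.5) returns
   [0.5, 1.5, 3.0, 6.0, 9.0]: the first step uses the factor 3, the
   next two use 2, and the last one uses 1.5.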

329	4.2.2.  Discussion

331	   In contrast to [RFC6298], this algorithm attempts to make use of
332	   ambiguous information from retransmissions.  This is motivated by the
333	   high non-congestion loss rates expected in constrained node networks,
334	   and the need to update the RTO estimators even in the presence of
335	   loss.  This approach appears to contravene the mandate in
336	   Section 3.1.1 of [RFC8085] that "latency samples MUST NOT be derived
337	   from ambiguous transactions".  However, those samples are not simply
338	   combined into the strong estimator, but are used to correct the
339	   limited knowledge that can be gained from the strong RTT measurements
340	   by employing an additional weak estimator.  In fact, the weak
341	   estimator keeps the RTO estimate better updated when mostly weak
342	   RTTs are available, either due to the lossy nature of links or due to
343	   congestion-induced losses.  In the presence of the latter, and
344	   compared to a strong-only estimator (w_weak=0), spurious timeouts are
345	   avoided and the rate of retries is reduced, which helps to decrease
346	   congestion.  Evidence collected from experiments appears to support
347	   the conclusion that the overall effect of using this data in the way
348	   described is beneficial (see Appendix A).

350	   Some evaluation has been done on earlier versions of this
351	   specification [Betzler2013].  A more recent (and more comprehensive)
352	   reference is [Betzler2015].

354	4.3.  Lifetime, Aging

356	   The state of the RTO estimators for an endpoint SHOULD be kept as
357	   long as possible.  If other state is kept for the endpoint (such as a
358	   DTLS connection), it is very strongly RECOMMENDED to keep the RTO
359	   state alive at least as long as this other state.  In the absence of
360	   such other state, the RTO state SHOULD be kept at least long enough
361	   to avoid frequent returns to inappropriate initial values.  For the
362	   default parameter set of Section 4.8 of [RFC7252], it is strongly
363	   RECOMMENDED to keep it for at least 255 s.

365	   If an estimator has a value that is lower than 1 s, and it is left
366	   without further update for 16 times its current value, the RTO
367	   estimate is doubled.  If an estimator has a value that is higher than
368	   3 s, and it is left without further update for 4 times its current
369	   value, the RTO estimate is set to be

371	      RTO := 1 s + (0.5 * RTO)

373	   (Note that, instead of running a timer, it is possible to implement
374	   these RTO aging calculations cumulatively at the time the estimator
375	   is used next.)

377	   Pseudocode and examples for the aging mechanism presented are
378	   available in Appendix B.2 and in Appendix C.2.
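
   The cumulative variant mentioned in the note above might look like
   the following sketch.  It is not the pseudocode of Appendix B.2.
   The function name is hypothetical, and the sketch assumes that an
   aging step fires once the idle time reaches (rather than strictly
   exceeds) the stated multiple of the current value, and that the
   idle time consumed by one step is not counted again for the next.

```python
def age_rto(rto: float, idle_s: float) -> float:
    """Apply RTO aging for idle_s seconds without an estimator update."""
    if rto <= 0 or idle_s < 0:
        raise ValueError("rto must be positive and idle_s non-negative")
    while True:
        if rto < 1.0:
            interval = 16 * rto    # small estimates age after 16 x value
            aged = 2 * rto
        elif rto > 3.0:
            interval = 4 * rto     # large estimates age after 4 x value
            aged = 1.0 + 0.5 * rto
        else:
            return rto             # values between 1 s and 3 s do not age
        if idle_s < interval:
            return rto
        idle_s -= interval
        rto = aged
```

   For example, an estimate of 10 s that stays idle for 64 s is aged
   twice under these assumptions: after 40 s it becomes 6 s, and after
   a further 24 s it becomes 4 s.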

380	5.  Advanced CoAP Congestion Control: Non-Confirmables

382	   A CoAP endpoint MUST NOT send non-confirmables to another CoAP
383	   endpoint at a rate higher than defined by this document.  Independent
384	   of any congestion control mechanisms, a CoAP endpoint can always send
385	   non-confirmables if their rate does not exceed 1 B/s.

387	   Non-confirmables that form part of exchanges are governed by the
388	   rules for exchanges.

390	   Non-confirmables outside exchanges (e.g., [RFC7641] notifications
391	   sent as non-confirmables) are governed by the following rules:

393	   1.  Of any 16 consecutive messages towards this endpoint that aren't
394	       responses or acknowledgments, at least 2 of the messages must be
395	       confirmable.

397	   2.  An RTO as specified in Section 4 must be used for confirmable
398	       messages.

400	   3.  The packet rate of non-confirmable messages cannot exceed 1/RTO,
401	       where RTO is the overall RTO estimator value at the time the non-
402	       confirmable packet is sent.
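
   One way a sender could enforce rules 1 and 3 is sketched below.
   The class and its methods are hypothetical.  The sketch interprets
   the 1/RTO limit as a minimum spacing of one RTO between
   non-confirmable messages, which is stricter than an average rate,
   and it does not model the 1 B/s allowance.  Messages passed to
   record() are assumed to exclude responses and acknowledgments.

```python
from collections import deque


class NonPacer:
    WINDOW = 16   # rule 1: window of consecutive messages
    MIN_CON = 2   # rule 1: confirmables required per window

    def __init__(self) -> None:
        # True marks a confirmable; only the last 15 messages matter,
        # because the candidate message completes the window of 16.
        self.recent = deque(maxlen=self.WINDOW - 1)
        self.last_non_time = None

    def may_send_non(self, now: float, rto: float) -> bool:
        """True if a non-confirmable may be sent at time now (seconds)."""
        # Rule 3: keep non-confirmables at least one RTO apart.
        if self.last_non_time is not None and now - self.last_non_time < rto:
            return False
        # Rule 1: at most WINDOW - MIN_CON non-confirmables per window.
        non_count = sum(1 for is_con in self.recent if not is_con)
        return non_count + 1 <= self.WINDOW - self.MIN_CON

    def record(self, now: float, confirmable: bool) -> None:
        """Register a message that was actually sent."""
        self.recent.append(confirmable)
        if not confirmable:
            self.last_non_time = now
```

   A sender that falls back to a confirmable message whenever
   may_send_non() returns False for rule 1 ends up with 14
   non-confirmables followed by 2 confirmables at the start of a
   stream.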

404	5.1.  Discussion

406	   The mechanism defined above for non-confirmables is relatively
407	   conservative.  More advanced versions of this algorithm could run a
408	   TFRC-style Loss Event Rate calculator [RFC5348] and apply the TCP
409	   equation to achieve a higher rate than 1/RTO.

411	   [RFC7641], Section 4.5.1, specifies that the rate of Non-Confirmables
412	   SHOULD NOT exceed 1/RTT on average, if the server can maintain an RTT
413	   estimate for a client.  CoCoA limits the packet rate of Non-
414	   Confirmables in this situation to 1/RTO.  Assuming that the RTO
415	   estimation in CoCoA works as expected, RTO[k] should be slightly
416	   greater than the RTT[k], thus CoCoA would be more conservative.  The
417	   expectation therefore is that complying with the NON rate set by
418	   CoCoA leads to complying with [RFC7641].

420	6.  IANA Considerations

422	   This document makes no requirements on IANA.  (This section to be
423	   removed by RFC editor.)

425	7.  Security Considerations

427	   The security considerations of, e.g., [RFC5681], [RFC2914], and
428	   [RFC8085] apply.  Some issues are already discussed in the security
429	   considerations of [RFC7252].

431	   If a malicious node manages to prevent the delivery of some packets,
432	   a consequence will be an RTO increase, which will further reduce
433	   network performance.  Note that this type of attack is not specific
434	   to CoCoA (and not even specific to CoAP), and many congestion
435	   control algorithms increase the RTO upon packet loss detection.
436	   While it is hard to prevent radio jamming, some mitigation for other
437	   forms of this type of attack is provided by network access control
438	   techniques.  Also, the weak estimator in CoCoA increases the chances
439	   of obtaining RTT measurements in the presence of heavy packet losses,
440	   which keeps the RTO updated and in turn allows recovery from a
441	   jamming attack in reasonable time.

443	8.  References

445	8.1.  Normative References

447	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
448	              Requirement Levels", BCP 14, RFC 2119,
449	              DOI 10.17487/RFC2119, March 1997,
450	              <https://www.rfc-editor.org/info/rfc2119>.

452	   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
453	              RFC 2914, DOI 10.17487/RFC2914, September 2000,
454	              <https://www.rfc-editor.org/info/rfc2914>.

456	   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
457	              "Computing TCP's Retransmission Timer", RFC 6298,
458	              DOI 10.17487/RFC6298, June 2011,
459	              <https://www.rfc-editor.org/info/rfc6298>.

461	   [RFC7252]  Shelby, Z., Hartke, K., and C. Bormann, "The Constrained
462	              Application Protocol (CoAP)", RFC 7252,
463	              DOI 10.17487/RFC7252, June 2014,
464	              <https://www.rfc-editor.org/info/rfc7252>.

466	   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
467	              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
468	              March 2017, <https://www.rfc-editor.org/info/rfc8085>.

470	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
471	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
472	              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

474	8.2.  Informative References

476	   [Betzler2013]
477	              Betzler, A., Gomez, C., Demirkol, I., and J. Paradells,
478	              "Congestion control in reliable CoAP communication",
479	              ACM MSWIM'13 p. 365-372, DOI 10.1145/2507924.2507954,
480	              2013.

482	   [Betzler2015]
483	              Betzler, A., Gomez, C., Demirkol, I., and J. Paradells,
484	              "CoCoA+: an Advanced Congestion Control Mechanism for
485	              CoAP", Ad Hoc Networks Vol. 33 pp. 126-139,
486	              DOI 10.1016/j.adhoc.2015.04.007, October 2015.

488	   [I-D.bormann-core-congestion-control]
489	              Bormann, C. and K. Hartke, "Congestion Control Principles
490	              for CoAP", draft-bormann-core-congestion-control-02 (work
491	              in progress), July 2012.

493	   [I-D.eggert-core-congestion-control]
494	              Eggert, L., "Congestion Control for the Constrained
495	              Application Protocol (CoAP)", draft-eggert-core-
496	              congestion-control-01 (work in progress), January 2011.

498	   [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
499	              Friendly Rate Control (TFRC): Protocol Specification",
500	              RFC 5348, DOI 10.17487/RFC5348, September 2008,
501	              <https://www.rfc-editor.org/info/rfc5348>.

503	   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
504	              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
505	              <https://www.rfc-editor.org/info/rfc5681>.

507	   [RFC7228]  Bormann, C., Ersue, M., and A. Keranen, "Terminology for
508	              Constrained-Node Networks", RFC 7228,
509	              DOI 10.17487/RFC7228, May 2014,
510	              <https://www.rfc-editor.org/info/rfc7228>.

512	   [RFC7641]  Hartke, K., "Observing Resources in the Constrained
513	              Application Protocol (CoAP)", RFC 7641,
514	              DOI 10.17487/RFC7641, September 2015,
515	              <https://www.rfc-editor.org/info/rfc7641>.

517	Appendix A.  Supporting evidence

519	   (Editor's note: The references local to this appendix may need to be
520	   merged with those from the specification proper, depending on the
521	   discretion of the RFC editor.)
522	   CoCoA has been evaluated by means of simulation and experimentation
523	   in diverse scenarios comprising different link layer technologies,
524	   network topologies, traffic patterns and device classes.  The main
525	   overall evaluation result is that CoCoA consistently delivers a
526	   performance which is better than, or at least similar to, that of
527	   default CoAP congestion control.  While the latter is insensitive to
528	   network conditions, CoCoA is adaptive and makes good use of RTT
529	   samples.

531	   It has been shown over real GPRS and IEEE 802.15.4 mesh network
532	   testbeds that in these settings, in comparison to default CoAP, CoCoA
533	   increases throughput and reduces the time it takes for a network to
534	   process traffic bursts, while not sacrificing fairness.  In contrast,
535	   other RTT-sensitive approaches such as Linux-RTO or Peak-Hopper-RTO
536	   may be too simple or may not adapt well to IoT scenarios,
537	   underperforming default CoAP under certain conditions [1].  On the
538	   other hand, CoCoA has been found to reduce latency in GPRS and WiFi
539	   setups, compared with default CoAP [2].

541	   CoCoA performance has also been evaluated for non-confirmable traffic
542	   over emulated GPRS/UMTS links and over a real IEEE 802.15.4 mesh
543	   testbed.  Results show that since CoCoA is adaptive, it yields better
544	   packet delivery ratio than default CoAP (which does not apply
545	   congestion control to non-confirmable messages) or Observe (which
546	   introduces congestion control that is not adaptive to network
547	   conditions) [3, 4].

549	A.1.  Older versions of the draft and improvement

551	   CoCoA has evolved since its initial draft version.  Its core has
552	   remained mostly stable since draft-bormann-core-cocoa-02.  The
553	   evolution of CoCoA has been driven by research work.  This process,
554	   including evaluations of early versions of CoCoA, as well as
555	   improvement proposals that were finally incorporated in CoCoA, is
556	   reflected in published works [5-10].

558	A.2.  References

560	   [1] A. Betzler, C. Gomez, I. Demirkol, J. Paradells, "CoAP
561	   congestion control for the Internet of Things", IEEE Communications
562	   Magazine, July 2016.

564	   [2] F. Zheng, B. Fu, Z. Cao, "CoAP Latency Evaluation", draft-
565	   zheng-core-coap-lantency-evaluation-00, 2016 (work in progress).

567	   [3] A. Betzler, C. Gomez, I. Demirkol, "Evaluation of Advanced
568	   Congestion Control Mechanisms for Unreliable CoAP Communications",
569	   PE-WASUN, Cancun, Mexico, 2015.

571	   [4] A. Betzler, J. Isern, C. Gomez, I. Demirkol, J. Paradells,
572	   "Experimental Evaluation of Congestion Control for CoAP
573	   Communications without End-to-End Reliability", Ad Hoc Networks,
574	   Volume 52, 1 December 2016, Pages 183-194.

576	   [5] A. Betzler, C. Gomez, I. Demirkol, J. Paradells, "Congestion
577	   Control in Reliable CoAP Communication", 16th ACM International
578	   Conference on Modeling, Analysis and Simulation of Wireless and
579	   Mobile Systems (MSWIM'13), Barcelona, Spain, Nov. 2013.

581	   [6] A. Betzler, C. Gomez, I. Demirkol, M. Kovatsch, "Congestion
582	   Control for CoAP cloud services", 8th International Workshop on
583	   Service-Oriented Cyber-Physical Systems in Converging Networked
584	   Environments (SOCNE) 2014, Barcelona, Spain, Sept. 2014.

586	   [7] A. Betzler, C. Gomez, I. Demirkol, J. Paradells, "CoCoA+: an
587	   advanced congestion control mechanism for CoAP", Ad Hoc Networks
588	   journal, 2015.

590	   [8] R. Bhalerao, S. S. Subramanian, J. Pasquale, "An analysis and
591	   improvement of congestion control in the CoAP Internet-of-Things
592	   protocol", 13th IEEE Annual Consumer Communications & Networking
593	   Conference (CCNC), 2016.

595	   [9] I. Jaervinen, L. Daniel, M. Kojo, "Experimental evaluation of
596	   alternative congestion control algorithms for Constrained Application
597	   Protocol (CoAP)", IEEE 2nd World Forum on Internet of Things
598	   (WF-IoT), 2015.

600	   [10] E. Balandina, Y. Koucheryavy, A. Gurtov, "Computing the
601	   retransmission timeout in CoAP", Internet of Things, Smart Spaces,
602	   and Next Generation Networking, Springer Berlin Heidelberg, 2013,
603	   pp. 352-362.

605	Appendix B.  Pseudocode

607	B.1.  Updating the RTO estimator
608	   // Default values
609	   ALPHA = 0.125 // RFC 6298
610	   BETA = 0.25 // RFC 6298
611	   W_STRONG = 0.5
612	   W_WEAK = 0.25

614	   updateRTO(retransmissions, RTT) {
615	     if (retransmissions == 0) {
616	       RTTVAR_strong = (1 - BETA) * RTTVAR_strong
617	                     + BETA * abs(RTT_strong - RTT);
618	       RTT_strong = (1 - ALPHA) * RTT_strong + ALPHA * RTT;
619	       E_strong = RTT_strong + 4 * RTTVAR_strong;
620	       RTO = W_STRONG * E_strong + (1 - W_STRONG) * RTO;
621	     } else if (retransmissions <= 2) {
622	       RTTVAR_weak = (1 - BETA) * RTTVAR_weak
623	                   + BETA * abs(RTT_weak - RTT);
624	       RTT_weak = (1 - ALPHA) * RTT_weak + ALPHA * RTT;
625	       E_weak = RTT_weak + 1 * RTTVAR_weak;
626	       RTO = W_WEAK * E_weak + (1 - W_WEAK) * RTO;
627	     }
628	   }
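
   The estimator update above can also be written as a small runnable
   sketch.  The following Python version is illustrative only: the
   names Estimator and update_rto, the constants K_STRONG and K_WEAK,
   and the seeding of the estimator state are assumptions made for
   this example and are not defined by this document.  All times are
   in seconds.

```python
from dataclasses import dataclass

ALPHA = 0.125    # gain for the smoothed RTT (RFC 6298)
BETA = 0.25      # gain for the RTT variance (RFC 6298)
K_STRONG = 4     # variance multiplier of the strong estimator
K_WEAK = 1       # variance multiplier of the weak estimator
W_STRONG = 0.5   # weight of the strong estimate in the overall RTO
W_WEAK = 0.25    # weight of the weak estimate in the overall RTO


@dataclass
class Estimator:
    """State of one RTT estimator (hypothetical helper type)."""
    k: int
    srtt: float
    rttvar: float

    def update(self, rtt: float) -> float:
        # RTTVAR is updated first, using the previous smoothed RTT.
        self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - rtt)
        self.srtt = (1 - ALPHA) * self.srtt + ALPHA * rtt
        return self.srtt + self.k * self.rttvar


def update_rto(rto, strong, weak, retransmissions, rtt):
    """Return the overall RTO after one RTT sample."""
    if retransmissions == 0:
        return W_STRONG * strong.update(rtt) + (1 - W_STRONG) * rto
    if retransmissions <= 2:
        return W_WEAK * weak.update(rtt) + (1 - W_WEAK) * rto
    return rto  # samples taken after more retransmissions are not used


# Usage: one strong sample of 1.0 s against an assumed starting state.
strong = Estimator(K_STRONG, srtt=1.0, rttvar=0.5)
weak = Estimator(K_WEAK, srtt=2.0, rttvar=1.0)
rto = update_rto(2.0, strong, weak, retransmissions=0, rtt=1.0)  # 2.25
```

   Keeping the two estimators as separate objects mirrors the
   separate RTT_strong/RTT_weak state in the pseudocode; only the
   overall RTO is shared between them.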

630	B.2.  RTO aging

632	   checkAging() {
633	     clock_time difference = getCurrentTime() - lastUpdatedTime;

635	     if ((RTO < 1s) && (difference > (16 * RTO))) {
636	       RTO = 2 * RTO;
637	       lastUpdatedTime = getCurrentTime();
638	     } else if ((RTO > 3s) && (difference > (4 * RTO))) {
639	       RTO = 1s + 0.5 * RTO;
640	       lastUpdatedTime = getCurrentTime();
641	     }
642	   }

644	B.3.  Variable Backoff Factor

646	   backOffRTO() {
647	     if (RTO < 1s) {
648	       RTO = RTO * 3;
649	     } else if (RTO > 3s) {
650	       RTO = RTO * 1.5;
651	     } else {
652	       RTO = RTO * 2;
653	     }
654	   }

656	Appendix C.  Examples

658	C.1.  Example A.1: weak RTTs

660	   A large network of sensor nodes that report periodic measurements
661	   is operating normally, without congestion.  The nodes transmit their
662	   sensor readings via CON messages every 20 s in an asynchronous way
663	   towards a server located behind a gateway, obtaining strong RTT
664	   measurements (RTT 1.1 s, RTTVAR 0.1 s) that lead to the calculation
665	   of an RTO of 1.5 s (on average) in each node.  In this mode of
666	   operation, no aging is applied, since the RTO is refreshed before the
667	   aging mechanism applies.

669	   Suddenly, upon detection of a global event, the majority of sensor
670	   nodes start transmitting at a higher rate (every 5 s) to increase the
671	   resolution of the acquired data, which creates heavy congestion that
672	   leads to packet losses and a significant increase in the real RTT
673	   between the nodes and the server (RTT 2 s, RTTVAR 1 s).  Due to the
674	   packet losses and spurious retransmissions (which can fuel congestion
675	   even more), many nodes are not able to update their RTO via strong
676	   RTT measurements, but they are able to obtain weak RTT measurements.
677	   A node with an initial RTO of 1.5 s would run into a retransmission
678	   before obtaining an ACK (given the RTT of 2 s and assuming that the
679	   ACK is not lost).

681	   This weak RTT measurement would increase the overall RTO of the node
682	   to 1.875 s (RTO = 0.25 * 3 s + 0.75 * 1.5 s).  Following the same
683	   calculation (and RTT/RTTVAR values), after obtaining another weak
684	   RTT, the RTO would increase to 2.156 s.  At this point, the benefits
685	   of the weak RTT measurements are twofold:

687	   1.  Further spurious retransmissions are avoided as the RTO has
688	       increased above the real RTT.

690	   2.  The increase of RTOs across the whole network reduces the rate
691	       at which retransmissions are generated, decreasing the network
692	       congestion (which leads to an RTT and packet loss decrease).
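
   The two weak updates in this example can be reproduced with a few
   lines of Python.  This is a sketch under the example's own
   simplification that the weak RTT and RTTVAR stay fixed at 2 s and
   1 s; the function name weak_update is an assumption made for
   illustration.

```python
W_WEAK = 0.25   # weight of the weak estimate in the overall RTO
K_WEAK = 1      # variance multiplier of the weak estimator


def weak_update(rto, rtt_weak, rttvar_weak):
    """Blend the weak estimate E_weak into the overall RTO (seconds)."""
    e_weak = rtt_weak + K_WEAK * rttvar_weak   # 2 s + 1 * 1 s = 3 s
    return W_WEAK * e_weak + (1 - W_WEAK) * rto


rto = 1.5       # RTO obtained from the strong estimator before congestion
history = []
for _ in range(2):
    rto = weak_update(rto, rtt_weak=2.0, rttvar_weak=1.0)
    history.append(rto)

print(history)  # [1.875, 2.15625]
```

   The second value, 2.15625 s, is the 2.156 s quoted above and is
   the first RTO that exceeds the real RTT of 2 s.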

694	C.2.  Example A.2: VBF and aging

696	   Assume that the frequency of message generation is even higher
697	   (every 3 s) and that the real RTT increases further due to
698	   congestion, so that the RTO at some point reaches 4 s.  Since the
699	   RTO is now above 3 s, a backoff factor of 1.5 is used instead of
700	   binary backoff, to avoid the RTO growing too much in case of
701	   retransmissions.  When the generation of data from the nodes ceases
702	   (the network returns to a normal state), the aging mechanism reduces
703	   the RTO automatically (with an RTO of 4 s, after 16 s without a new
704	   RTT measurement the RTO is shifted to 3 s).

706	C.3.  Example B: VBF and aging

708	   A network of nodes connected over 4G with an Internet service is
709	   calculating very small RTO values (0.3 s) and the nodes are
710	   transmitting CON messages every 1 s.  Suddenly, the connection
711	   quality gets worse and the nodes switch to a more stable, yet slower
712	   connection via GPRS.  As a result of this change, the nodes run into
713	   retransmissions, as the real RTT has increased above the calculated
714	   RTO.

716	   Since the RTO is below 1 s, the Variable Backoff Factor increases the
717	   backoff values quickly to avoid spurious retransmissions (0.9 s first
718	   retry, 2.7 s second retry, etc.).  Further, if due to the packet
719	   losses and increased delays in the network no new RTT measurements
720	   are obtained, the aging mechanism automatically increases the RTO
721	   (doubling it) after 4.8 s (16 * 0.3 s) to adapt better to the sudden
722	   changes of network conditions.  Without the Variable Backoff Factor
723	   and the aging mechanism, the number of spurious retransmissions would
724	   be much higher and the RTO would be corrected more slowly.
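
   The numbers in Examples A.2 and B can be replayed with the
   following Python sketch of the aging and Variable Backoff Factor
   rules from Appendix B.  Representing times as integer milliseconds
   is a choice made here to keep the arithmetic exact; the function
   names age_rto and back_off are assumptions made for illustration.

```python
def age_rto(rto_ms: int, idle_ms: int) -> int:
    """Return the RTO after one aging check (times in milliseconds)."""
    if rto_ms < 1000 and idle_ms > 16 * rto_ms:
        return 2 * rto_ms            # small RTOs are doubled
    if rto_ms > 3000 and idle_ms > 4 * rto_ms:
        return 1000 + rto_ms // 2    # large RTOs decay towards 1 s
    return rto_ms


def back_off(rto_ms: int) -> int:
    """Return the backed-off RTO using the Variable Backoff Factor."""
    if rto_ms < 1000:
        return rto_ms * 3
    if rto_ms > 3000:
        return rto_ms * 3 // 2       # factor 1.5
    return rto_ms * 2


# Example B: a 0.3 s RTO backs off by a factor of 3 while below 1 s.
first_retry = back_off(300)           # 900
second_retry = back_off(first_retry)  # 2700
# Without a new RTT sample, aging doubles the RTO once more than
# 16 * 0.3 s = 4.8 s have passed since the last update.
aged_small = age_rto(300, 4801)       # 600

# Example A.2: a 4 s RTO backs off by 1.5 only and is aged to 3 s
# once more than 4 * 4 s = 16 s have passed.
gentle_retry = back_off(4000)         # 6000
aged_large = age_rto(4000, 16001)     # 3000
```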

726	Appendix D.  Analysis: difference between strong and weak estimators

728	   This section analyzes the difference between the strong and weak RTO
729	   estimators.  If there is no congestion, assume a static RTT of R'.
730	   Then, E_strong_ can be expressed as:

732	      E_strong_ = R' + G,

734	   since RTTVAR is reduced constantly by RTTVAR = RTTVAR * 3/4
735	   (according to [RFC6298], and SRTT=R'), G would be the dominant term
736	   in the max(G, K * RTTVAR) expression in the long run.

738	   For the weak estimator: assume that the RTO setting converges to
739	   E_strong_ calculated above in the long run.  If there is a packet
740	   loss, and an RTT is obtained for the first retransmission, then the
741	   weak RTT sample obtained by the weak estimator is:

743	      RW' = R' + G + R'

745	   Therefore, E_weak_ can be expressed as:

747	      E_weak_ = RW' + max(G, RW'/2) = 3 * R'
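
   The last step can be spelled out as follows.  This is one reading
   of the expression above, not a statement taken from the document:
   it assumes that RW'/2 is the larger argument of max(), which holds
   whenever R' >= G/2, and that G is small compared to R'.

```latex
\begin{align*}
RW' &= R' + G + R' = 2R' + G \\
E_{weak} &= RW' + \max\!\left(G,\ \tfrac{RW'}{2}\right)
          = \tfrac{3}{2}\,RW'
          && \text{since } \tfrac{RW'}{2} = R' + \tfrac{G}{2} \ge G \\
         &= 3R' + \tfrac{3}{2}\,G \;\approx\; 3R'
          && \text{for } G \ll R'
\end{align*}
```

   Under these assumptions, the weak estimate is about three times
   the strong estimate E_strong_ = R' + G.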

749	Acknowledgements

751	   The first document to examine CoAP congestion control issues in
752	   detail was [I-D.eggert-core-congestion-control], to which this draft
753	   owes a lot.

755	   Michael Scharf did a review of CoAP congestion control issues that
756	   asked a lot of good questions.  Several Transport Area
757	   representatives made further significant inputs to this discussion
758	   during IETF84, including Lars Eggert, Michael Scharf, and David
759	   Black.  Andrew McGregor, Eric Rescorla, Richard Kelsey, Ed Beroset,
760	   Jari Arkko, Zach Shelby, Matthias Kovatsch and many others provided
761	   very useful additions.  Further reviews by Michael Scharf and Ingemar
762	   Johansson led to further improvements, including some more discussion
763	   in the appendices.

765	   Authors from Universitat Politecnica de Catalunya have been supported
766	   in part by the Spanish Government's Ministerio de Economia y
767	   Competitividad through projects TEC2009-11453, TEC2012-32531,
768	   TEC2016-79988-P and FEDER.

770	   Carles Gomez has been funded in part by the Spanish Government
771	   (Ministerio de Educacion, Cultura y Deporte) through the Jose
772	   Castillejo grant CAS15/00336.  His contribution to this work has been
773	   carried out in part during his stay as a visiting scholar at the
774	   Computer Laboratory of the University of Cambridge, in collaboration
775	   with Prof. Jon Crowcroft.

777	Authors' Addresses

779	   Carsten Bormann
780	   Universitaet Bremen TZI
781	   Postfach 330440
782	   Bremen  D-28359
783	   Germany

785	   Phone: +49-421-218-63921
786	   Email: cabo@tzi.org

788	   August Betzler
789	   Fundacio i2CAT
790	   Mobile and Wireless Internet Group
791	   C/ del Gran Capita, 2
792	   Barcelona  08034
793	   Spain

795	   Email: august.betzler@i2cat.net

796	   Carles Gomez
797	   Universitat Politecnica de Catalunya/Fundacio i2CAT
798	   Escola d'Enginyeria de Telecomunicacio i Aeroespacial
799	   de Castelldefels
800	   C/Esteve Terradas, 7
801	   Castelldefels  08860
802	   Spain

804	   Phone: +34-93-413-7206
805	   Email: carlesgo@entel.upc.edu

807	   Ilker Demirkol
808	   Universitat Politecnica de Catalunya/Fundacio i2CAT
809	   Departament d'Enginyeria Telematica
810	   C/Jordi Girona, 1-3
811	   Barcelona  08034
812	   Spain

814	   Email: ilker.demirkol@entel.upc.edu