CoRE Working Group                                            C. Bormann
Internet-Draft                                   Universitaet Bremen TZI
Intended status: Informational                                A. Betzler
Expires: August 25, 2018                                  Fundacio i2CAT
                                                                C. Gomez
                                                             I. Demirkol
                     Universitat Politecnica de Catalunya/Fundacio i2CAT
                                                       February 21, 2018


                CoAP Simple Congestion Control/Advanced
                        draft-ietf-core-cocoa-03

Abstract

   CoAP, the Constrained Application Protocol, needs to be implemented
   in such a way that it does not cause persistent congestion on the
   network it uses.  The CoRE CoAP specification defines basic behavior
   that exhibits low risk of congestion with minimal implementation
   requirements.  It also leaves room for combining the base
   specification with advanced congestion control mechanisms with higher
   performance.

   This specification defines more advanced, but still simple CoRE
   Congestion Control mechanisms, called CoCoA.  The core of these
   mechanisms is a Retransmission TimeOut (RTO) algorithm that makes use
   of Round-Trip Time (RTT) estimates, in contrast with how the RTO is
   determined as per the base CoAP specification (RFC 7252).  The
   mechanisms defined in this document have relatively low complexity,
   yet they improve the default CoAP RTO algorithm.  The design of the
   mechanisms in this specification has made use of input from
   simulations and experiments in real networks.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 25, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  Context
   3.  Area of Applicability
   4.  Advanced CoAP Congestion Control: RTO Estimation
     4.1.  Blind RTO Estimate
     4.2.  Measurement-based RTO Estimate
       4.2.1.  Differences with the algorithm of RFC 6298
       4.2.2.  Discussion
     4.3.  Lifetime, Aging
   5.  Advanced CoAP Congestion Control: Non-Confirmables
     5.1.  Discussion
   6.  IANA Considerations
   7.  Security Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Supporting evidence
     A.1.  Older versions of the draft and improvement
     A.2.  References
   Appendix B.  Pseudocode
     B.1.  Updating the RTO estimator
     B.2.  RTO aging
     B.3.  Variable Backoff Factor
   Appendix C.  Examples
     C.1.  Example A.1: weak RTTs
     C.2.  Example A.2: VBF and aging
     C.3.  Example B: VBF and aging
   Appendix D.  Analysis: difference between strong and weak estimators
   Acknowledgements
   Authors' Addresses

1.  Introduction

   CoAP, the Constrained Application Protocol, needs to be implemented
   in such a way that it does not cause persistent congestion on the
   network it uses.  The CoRE CoAP specification defines basic behavior
   that exhibits low risk of congestion with minimal implementation
   requirements.  It also leaves room for combining the base
   specification with advanced congestion control mechanisms with higher
   performance.

   The present specification defines such an advanced CoRE Congestion
   Control mechanism, with the goal of improving performance while
   retaining safety as well as the simplicity that is appropriate for
   constrained devices.  Hence, we are calling this mechanism Simple
   Congestion Control/Advanced, or CoCoA for short.

   CoCoA calculates the retransmission time-out (RTO) based on RTT
   estimations with and without loss.
   By taking retransmissions (in a
   potentially lossy network) into account when estimating the RTT, this
   algorithm reacts to congestion with a lower sending rate.  For non-
   confirmable packets, it also limits the sending rate to 1/RTO;
   assuming that the RTO estimation in CoCoA works as expected, RTO
   should be slightly greater than the RTT, thus CoCoA would be more
   conservative than the original specification in [RFC7641].

   In the Internet, congestion control is typically implemented in a way
   that it can be introduced or upgraded unilaterally.  Still, a new
   congestion control scheme must not be introduced lightly.  To ensure
   that the new scheme is not posing a danger to the network,
   considerable work has been done on simulations and experiments in
   real networks.  Some of this work will be mentioned in "Discussion"
   subsections in the following sections; an overview is given in
   Appendix A.  Extended rationale for this specification can also be
   found in the historical Internet-Drafts
   [I-D.bormann-core-congestion-control] and
   [I-D.eggert-core-congestion-control], as well as in the minutes of
   the IETF 84 CoRE WG meetings.

1.1.  Terminology

   This specification uses terms from [RFC7252].  In addition, it
   defines the following terminology:

   Initiator:  The endpoint that sends the message that initiates an
      exchange.  E.g., the party that sends a confirmable message, or a
      non-confirmable message (see Section 4.3 of [RFC7252]) conveying a
      request.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   The term "byte", abbreviated by "B", is used in its now customary
   sense as a synonym for "octet".

2.  Context

   In the definition of the CoAP protocol [RFC7252], an approach was
   taken that includes a very simple basic scheme (lock-step with the
   number of parallel exchanges usually limited to 1) in the base
   specification together with performance-enhancing advanced
   mechanisms.

   The present specification is based on the approved text in the
   [RFC7252] base specification.  It is making use of the text that
   permits advanced congestion control mechanisms and allows them to
   change protocol parameters, including NSTART and the binary
   exponential backoff mechanism.  Note that Section 4.8 of [RFC7252]
   limits the leeway that implementations have in changing the CoRE
   protocol parameters.

   The present specification also assumes that, outside of exchanges,
   non-confirmable messages can only be used at a limited rate without
   an advanced congestion control mechanism (this is mainly relevant for
   [RFC7641]).  It is also intended to address the [RFC8085] guideline
   about combining congestion control state for a destination; and to
   clarify its meaning for CoAP using the definition of an endpoint.

   The present specification does not address multicast or dithering
   beyond basic retransmission dithering.

3.  Area of Applicability

   The present algorithm is intended to be generally applicable.  The
   objective is to be "better" than default CoAP congestion control in a
   number of characteristics, including achievable goodput for a given
   offered load, latency, and recovery from bursts, while providing more
   predictable stress to the network and the same level of safety from
   catastrophic congestion.  The algorithm defined in this document is
   intended to adapt to the current characteristics of any underlying
   network, and therefore is well suited for a wide range of network
   conditions, in terms of bandwidth, latency, load, loss rate,
   topology, etc.
   In particular, CoCoA has been found to perform well
   in scenarios with latencies ranging from the order of milliseconds to
   peaks of dozens of seconds, as well as in single-hop and multihop
   topologies.  Link technologies used in existing evaluation work
   comprise IEEE 802.15.4, GPRS, UMTS and Wi-Fi (see Appendix A).  CoCoA
   is also expected to work suitably across the general Internet.  The
   algorithm does require three state variables per scope plus the state
   needed to do RTT measurements, so it may not be applicable to the
   most constrained devices (say, class 1 as per [RFC7228]).

   The scope of each instance of the algorithm in the current set of
   evaluations has been the five-tuple, i.e., CoAP + endpoint (transport
   address) for Initiator and Responder.  Potential applicability to
   larger scopes needs to be examined.

4.  Advanced CoAP Congestion Control: RTO Estimation

   For an initiator that plans to make multiple requests to one
   destination endpoint, it may be worthwhile to make RTT measurements
   in order to compute a more appropriate RTO than the default initial
   timeout of 2 to 3 s.  In particular, a wide spectrum of RTT values is
   expected in different types of networks where CoAP is used.  Those
   RTTs range from several orders of magnitude below the default initial
   timeout to values larger than the default.  The algorithm defined in
   this document is based on the algorithm for RTO estimation defined in
   [RFC6298], with appropriately extended default/base values, as
   proposed in Section 4.2.1.  Note that such a mechanism must, during
   idle periods, decay RTO estimates that are shorter or longer than the
   default RTO estimate back to the default RTO estimate, until fresh
   measurements become available again, as proposed in Section 4.3.

   RTT variability challenges RTO estimation.
   In TCP, delayed ACKs
   contribute to RTT variability, since this option adds a delay of up
   to 500 ms (typically, 200 ms) before an ACK is sent by a receiving
   TCP endpoint.  However, one important consideration not relevant for
   TCP is the fact that a CoAP round-trip may include application
   processing time, which may be hard to predict, and may differ between
   different resources available at the same endpoint.  Also, for
   communications with networks of constrained devices that apply radio
   duty cycling, large and variable round-trip times are likely to be
   observed.  Servers will only trigger their early ACKs (with a non-
   piggybacked response to be sent later) based on the default timers,
   e.g., after 1 s.  A client that has arrived at an RTO estimate
   shorter than 1 s SHOULD therefore use a larger backoff factor for
   retransmissions to avoid expending all of its retransmissions
   (MAX_RETRANSMIT, see Section 4.2 of [RFC7252], normally 4) in the
   default interval of 2 to 3 s.  The approach chosen for a mechanism
   with variable backoff factors is presented in Section 4.2.1.

   It may also be worthwhile to perform RTT estimation not just based on
   information measured from a single destination endpoint, but also
   based on entire hosts (IP addresses) and/or complete prefixes (e.g.,
   maintain an RTT estimate for a whole /64).  The exact way this can be
   used to reduce the amount of state in an initiator is for further
   study.

4.1.  Blind RTO Estimate

   The initial RTO estimate for an endpoint is set to 2 seconds (the
   initial RTO estimate is used as the initial value for both E_weak_
   and E_strong_ below).

   If only the initial RTO estimate is available, the RTO estimate for
   each of up to NSTART exchanges started in parallel is set to 2 s
   times the number of parallel exchanges.  For example, if two
   exchanges are already running, the initial RTO estimate for an
   additional exchange is 6 seconds.

4.2.  Measurement-based RTO Estimate

   The RTO estimator runs two copies of the algorithm defined in
   [RFC6298], using the same variables and calculations to estimate the
   RTO, with the differences introduced in Section 4.2.1: One copy for
   exchanges that complete on initial transmissions (the "strong
   estimator", E_strong_), and one copy for exchanges that have run into
   retransmissions, where only the first two retransmissions are
   considered (the "weak estimator", E_weak_).  For the latter, there is
   some ambiguity whether a response is based on the initial
   transmission or the retransmissions.  For the purposes of the weak
   estimator, the time from the initial transmission counts.  Responses
   obtained after the third retransmission are not used to update an
   estimator.

   The overall RTO estimate is an exponentially weighted moving average
   computed from the strong and the weak estimator.  It is evolved after
   each contribution to the weak estimator (1) or to the strong
   estimator (2), using the estimator (either weak or strong) that made
   the most recent contribution:

      RTO := w_weak * E_weak_ + (1 - w_weak) * RTO            (1)

      RTO := w_strong * E_strong_ + (1 - w_strong) * RTO      (2)

   (Splitting this update into the two cases avoids making the
   contribution of the weak estimator too big in naturally lossy
   networks.)

   The default values for the corresponding weights, w_weak and
   w_strong, are 0.25 and 0.5, respectively.  These values have been
   found to offer good performance in evaluations (see Appendix A).
   Pseudocode and examples for the overall RTO estimate are available
   in Appendix B.1 and Appendix C.1.

4.2.1.  Differences with the algorithm of RFC 6298

   This subsection presents three differences between the algorithm
   defined in this document and the one defined in [RFC6298].  The first
   two recommend new parameter settings.  The third one is the variable
   backoff factor (VBF), which replaces RFC 6298's simple exponential
   backoff that always multiplies the RTO by a factor of 2 when the RTO
   timer expires.

   The initial value for each of the two RTO estimators is 2 s.

   For the weak estimator, the factor K (the RTT variance multiplier) is
   set to 1 instead of 4.  This is necessary to avoid a strong increase
   of the RTO in the case that the RTTVAR value is very large, which may
   be the case if a weak RTT measurement is obtained after one or more
   retransmissions.

   To prevent exchanges with small initial RTOs (i.e., an RTO estimate
   lower than 1 s) from using up all retransmissions in a short
   interval of time, the RTO for a retransmission is multiplied by 3 for
   each retransmission as long as the RTO is less than 1 s.

   On the other hand, to ensure that exchanges with large initial RTOs
   (i.e., an RTO estimate greater than 3 s) are able to carry out all
   retransmissions within MAX_TRANSMIT_WAIT (normally 93 s), the RTO is
   multiplied only by 1.5 when the RTO is greater than 3 s.

   Pseudocode for the variable backoff factor is in Appendix B.3.

   The binary exponential backoff is truncated at 32 seconds.  Similar
   to the way retransmissions are handled in the base specification,
   they are dithered between 1 x RTO and ACK_RANDOM_FACTOR x RTO.

4.2.2.  Discussion

   In contrast to [RFC6298], this algorithm attempts to make use of
   ambiguous information from retransmissions.  This is motivated by the
   high non-congestion loss rates expected in constrained node networks,
   and the need to update the RTO estimators even in the presence of
   loss.
   This approach appears to contravene the mandate in
   Section 3.1.1 of [RFC8085] that "latency samples MUST NOT be derived
   from ambiguous transactions".  However, those samples are not simply
   combined into the strong estimator, but are used to correct the
   limited knowledge that can be gained from the strong RTT measurements
   by employing an additional weak estimator.  In fact, the weak
   estimator makes it possible to better update the RTO estimator when
   mostly weak RTTs are available, either due to the lossy nature of
   links or due to congestion-induced losses.  In the presence of the
   latter, and compared to a strong-only estimator (w_weak=0), spurious
   timeouts are avoided and the rate of retries is reduced, which helps
   to decrease congestion.  Evidence that has been collected from
   experiments appears to support that the overall effect of using this
   data in the way described is beneficial (Appendix A).

   Some evaluation has been done on earlier versions of this
   specification [Betzler2013].  A more recent (and more comprehensive)
   reference is [Betzler2015].

4.3.  Lifetime, Aging

   The state of the RTO estimators for an endpoint SHOULD be kept as
   long as possible.  If other state is kept for the endpoint (such as a
   DTLS connection), it is very strongly RECOMMENDED to keep the RTO
   state alive at least as long as this other state.  In the absence of
   such other state, the RTO state SHOULD be kept at least long enough
   to avoid frequent returns to inappropriate initial values.  For the
   default parameter set of Section 4.8 of [RFC7252], it is strongly
   RECOMMENDED to keep it for at least 255 s.

   If an estimator has a value that is lower than 1 s, and it is left
   without further update for 16 times its current value, the RTO
   estimate is doubled.
   If an estimator has a value that is higher than
   3 s, and it is left without further update for 4 times its current
   value, the RTO estimate is set to be

      RTO := 1 s + (0.5 * RTO)

   (Note that, instead of running a timer, it is possible to implement
   these RTO aging calculations cumulatively at the time the estimator
   is used next.)

   Pseudocode and examples for the aging mechanism presented are
   available in Appendix B.2 and in Appendix C.2.

5.  Advanced CoAP Congestion Control: Non-Confirmables

   A CoAP endpoint MUST NOT send non-confirmables to another CoAP
   endpoint at a rate higher than defined by this document.  Independent
   of any congestion control mechanisms, a CoAP endpoint can always send
   non-confirmables if their rate does not exceed 1 B/s.

   Non-confirmables that form part of exchanges are governed by the
   rules for exchanges.

   Non-confirmables outside exchanges (e.g., [RFC7641] notifications
   sent as non-confirmables) are governed by the following rules:

   1.  Of any 16 consecutive messages towards this endpoint that aren't
       responses or acknowledgments, at least 2 of the messages must be
       confirmable.

   2.  An RTO as specified in Section 4 must be used for confirmable
       messages.

   3.  The packet rate of non-confirmable messages cannot exceed 1/RTO,
       where RTO is the overall RTO estimator value at the time the non-
       confirmable packet is sent.

5.1.  Discussion

   The mechanism defined above for non-confirmables is relatively
   conservative.  More advanced versions of this algorithm could run a
   TFRC-style Loss Event Rate calculator [RFC5348] and apply the TCP
   equation to achieve a higher rate than 1/RTO.

   [RFC7641], Section 4.5.1, specifies that the rate of Non-Confirmables
   SHOULD NOT exceed 1/RTT on average, if the server can maintain an RTT
   estimate for a client.  CoCoA limits the packet rate of Non-
   Confirmables in this situation to 1/RTO.
   Assuming that the RTO
   estimation in CoCoA works as expected, RTO[k] should be slightly
   greater than the RTT[k], thus CoCoA would be more conservative.  The
   expectation therefore is that complying with the NON rate set by
   CoCoA leads to complying with [RFC7641].

6.  IANA Considerations

   This document makes no requirements on IANA.  (This section to be
   removed by RFC editor.)

7.  Security Considerations

   The security considerations of, e.g., [RFC5681], [RFC2914], and
   [RFC8085] apply.  Some issues are already discussed in the security
   considerations of [RFC7252].

   If a malicious node manages to prevent the delivery of some packets,
   a consequence will be an RTO increase, which will further reduce
   network performance.  Note that this type of attack is not specific
   to CoCoA (and not even specific to CoAP), and many congestion
   control algorithms increase the RTO upon packet loss detection.
   While it is hard to prevent radio jamming, some mitigation for other
   forms of this type of attack is provided by network access control
   techniques.  Also, the weak estimator in CoCoA increases the chances
   of obtaining RTT measurements in the presence of heavy packet losses.
   This keeps the RTO updated, which in turn allows recovery from a
   jamming attack in reasonable time.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, DOI 10.17487/RFC2914, September 2000.

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298,
              DOI 10.17487/RFC6298, June 2011.

   [RFC7252]  Shelby, Z., Hartke, K., and C. Bormann, "The Constrained
              Application Protocol (CoAP)", RFC 7252,
              DOI 10.17487/RFC7252, June 2014.

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

8.2.  Informative References

   [Betzler2013]
              Betzler, A., Gomez, C., Demirkol, I., and J. Paradells,
              "Congestion control in reliable CoAP communication",
              ACM MSWIM'13 p. 365-372, DOI 10.1145/2507924.2507954,
              2013.

   [Betzler2015]
              Betzler, A., Gomez, C., Demirkol, I., and J. Paradells,
              "CoCoA+: an Advanced Congestion Control Mechanism for
              CoAP", Ad Hoc Networks Vol. 33 pp. 126-139,
              DOI 10.1016/j.adhoc.2015.04.007, October 2015.

   [I-D.bormann-core-congestion-control]
              Bormann, C. and K. Hartke, "Congestion Control Principles
              for CoAP", draft-bormann-core-congestion-control-02 (work
              in progress), July 2012.

   [I-D.eggert-core-congestion-control]
              Eggert, L., "Congestion Control for the Constrained
              Application Protocol (CoAP)",
              draft-eggert-core-congestion-control-01 (work in
              progress), January 2011.

   [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
              Friendly Rate Control (TFRC): Protocol Specification",
              RFC 5348, DOI 10.17487/RFC5348, September 2008.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC7228]  Bormann, C., Ersue, M., and A. Keranen, "Terminology for
              Constrained-Node Networks", RFC 7228,
              DOI 10.17487/RFC7228, May 2014.

   [RFC7641]  Hartke, K., "Observing Resources in the Constrained
              Application Protocol (CoAP)", RFC 7641,
              DOI 10.17487/RFC7641, September 2015.

Appendix A.  Supporting evidence

   (Editor's note: The references local to this appendix may need to be
   merged with those from the specification proper, depending on the
   discretion of the RFC editor.)

   CoCoA has been evaluated by means of simulation and experimentation
   in diverse scenarios comprising different link layer technologies,
   network topologies, traffic patterns and device classes.  The main
   overall evaluation result is that CoCoA consistently delivers a
   performance which is better than, or at least similar to, that of
   default CoAP congestion control.  While the latter is insensitive to
   network conditions, CoCoA is adaptive and makes good use of RTT
   samples.

   It has been shown over real GPRS and IEEE 802.15.4 mesh network
   testbeds that in these settings, in comparison to default CoAP, CoCoA
   increases throughput and reduces the time it takes for a network to
   process traffic bursts, while not sacrificing fairness.  In contrast,
   other RTT-sensitive approaches such as Linux-RTO or Peak-Hopper-RTO
   may be too simple or do not adapt well to IoT scenarios,
   underperforming default CoAP under certain conditions [1].  On the
   other hand, CoCoA has been found to reduce latency in GPRS and Wi-Fi
   setups, compared with default CoAP [2].

   CoCoA performance has also been evaluated for non-confirmable traffic
   over emulated GPRS/UMTS links and over a real IEEE 802.15.4 mesh
   testbed.  Results show that since CoCoA is adaptive, it yields better
   packet delivery ratio than default CoAP (which does not apply
   congestion control to non-confirmable messages) or Observe (which
   introduces congestion control that is not adaptive to network
   conditions) [3, 4].

A.1.  Older versions of the draft and improvement

   CoCoA has evolved since its initial draft version.  Its core has
   remained mostly stable since draft-bormann-core-cocoa-02.  The
   evolution of CoCoA has been driven by research work.
   This process,
   including evaluations of early versions of CoCoA, as well as
   improvement proposals that were finally incorporated in CoCoA, is
   reflected in published works [5-10].

A.2.  References

   [1]  A. Betzler, C. Gomez, I. Demirkol, J. Paradells, "CoAP
        congestion control for the Internet of Things", IEEE
        Communications Magazine, July 2016.

   [2]  F. Zheng, B. Fu, Z. Cao, "CoAP Latency Evaluation",
        draft-zheng-core-coap-lantency-evaluation-00, 2016 (work in
        progress).

   [3]  A. Betzler, C. Gomez, I. Demirkol, "Evaluation of Advanced
        Congestion Control Mechanisms for Unreliable CoAP
        Communications", PE-WASUN, Cancun, Mexico, 2015.

   [4]  A. Betzler, J. Isern, C. Gomez, I. Demirkol, J. Paradells,
        "Experimental Evaluation of Congestion Control for CoAP
        Communications without End-to-End Reliability", Ad Hoc Networks,
        Volume 52, 1 December 2016, Pages 183-194.

   [5]  A. Betzler, C. Gomez, I. Demirkol, J. Paradells, "Congestion
        Control in Reliable CoAP Communication", 16th ACM International
        Conference on Modeling, Analysis and Simulation of Wireless and
        Mobile Systems (MSWIM'13), Barcelona, Spain, Nov. 2013.

   [6]  A. Betzler, C. Gomez, I. Demirkol, M. Kovatsch, "Congestion
        Control for CoAP cloud services", 8th International Workshop on
        Service-Oriented Cyber-Physical Systems in Converging Networked
        Environments (SOCNE) 2014, Barcelona, Spain, Sept. 2014.

   [7]  A. Betzler, C. Gomez, I. Demirkol, J. Paradells, "CoCoA+: an
        advanced congestion control mechanism for CoAP", Ad Hoc Networks
        journal, 2015.

   [8]  Bhalerao, Rahul, Sridhar Srinivasa Subramanian, and Joseph
        Pasquale.  "An analysis and improvement of congestion control in
        the CoAP Internet-of-Things protocol."  2016 13th IEEE Annual
        Consumer Communications & Networking Conference (CCNC).  IEEE,
        2016.

   [9]  I. Jaervinen, L. Daniel, M. Kojo, "Experimental evaluation of
        alternative congestion control algorithms for Constrained
        Application Protocol (CoAP)", IEEE 2nd World Forum on Internet
        of Things (WF-IoT), 2015.

   [10] Balandina, Ekaterina, Yevgeni Koucheryavy, and Andrei Gurtov.
        "Computing the retransmission timeout in coap."  Internet of
        Things, Smart Spaces, and Next Generation Networking.  Springer
        Berlin Heidelberg, 2013.  352-362.

Appendix B.  Pseudocode

B.1.  Updating the RTO estimator

   // Default values
   ALPHA = 0.125    // RFC 6298
   BETA = 0.25      // RFC 6298
   W_STRONG = 0.5
   W_WEAK = 0.25

   // retransmissions: number of retransmissions before the response
   // RTT: time measured from the initial transmission
   updateRTO(retransmissions, RTT) {
     if (retransmissions == 0) {
       // Strong estimator, K = 4
       RTTVAR_strong = (1 - BETA) * RTTVAR_strong
         + BETA * abs(RTT_strong - RTT);
       RTT_strong = (1 - ALPHA) * RTT_strong + ALPHA * RTT;
       E_strong = RTT_strong + 4 * RTTVAR_strong;
       RTO = W_STRONG * E_strong + (1 - W_STRONG) * RTO;
     } else if (retransmissions <= 2) {
       // Weak estimator, K = 1
       RTTVAR_weak = (1 - BETA) * RTTVAR_weak
         + BETA * abs(RTT_weak - RTT);
       RTT_weak = (1 - ALPHA) * RTT_weak + ALPHA * RTT;
       E_weak = RTT_weak + 1 * RTTVAR_weak;
       RTO = W_WEAK * E_weak + (1 - W_WEAK) * RTO;
     }
     // Responses after more than two retransmissions are ignored
   }

B.2.  RTO aging

   checkAging() {
     clock_time difference = getCurrentTime() - lastUpdatedTime;

     if ((RTO < 1s) && (difference > (16 * RTO))) {
       RTO = 2 * RTO;
       lastUpdatedTime = getCurrentTime();
     } else if ((RTO > 3s) && (difference > (4 * RTO))) {
       RTO = 1s + 0.5 * RTO;
       lastUpdatedTime = getCurrentTime();
     }
   }

B.3.  Variable Backoff Factor

   backOffRTO() {
     if (RTO < 1s) {
       RTO = RTO * 3;
     } else if (RTO > 3s) {
       RTO = RTO * 1.5;
     } else {
       RTO = RTO * 2;
     }
   }

Appendix C.  Examples

C.1.  Example A.1: weak RTTs

   A large network of sensor nodes that report periodic measurements
   is operating normally, without congestion.
   The nodes transmit their sensor readings via CON messages every 20 s
   in an asynchronous way towards a server located behind a gateway,
   obtaining strong RTT measurements (RTT 1.1 s, RTTVAR 0.1 s) that
   lead to the calculation of an RTO of 1.5 s (on average) in each
   node.  In this mode of operation, no aging is applied, since the RTO
   is refreshed before the aging mechanism applies.

   Suddenly, upon detection of a global event, the majority of sensor
   nodes start transmitting at a higher rate (every 5 s) to increase the
   resolution of the acquired data, which creates heavy congestion that
   leads to packet losses and a significant increase in the real RTT
   between the nodes and the server (RTT 2 s, RTTVAR 1 s).  Due to the
   packet losses and spurious retransmissions (which can fuel congestion
   even more), many nodes are not able to update their RTO via strong
   RTT measurements, but they are able to obtain weak RTT measurements.
   A node with an initial RTO of 1.5 s would run into a retransmission
   before obtaining an ACK (given the RTT of 2 s and assuming that the
   ACK is not lost).

   This weak RTT measurement would increase the overall RTO of the node
   to 1.875 s (RTO = 0.25 * 3 s + 0.75 * 1.5 s).  Following the same
   calculation (and RTT/RTTVAR values), after obtaining another weak
   RTT, the RTO would increase to 2.156 s.  At this point, the benefits
   of the weak RTT measurements are twofold:

   1.  Further spurious retransmissions are avoided as the RTO has
       increased above the real RTT.

   2.  The increase of RTOs across the whole network reduces the rate
       with which retransmissions are generated, decreasing the network
       congestion (which leads to an RTT and packet loss decrease).

C.2.  Example A.2: VBF and aging

   Assuming that the frequency of message generation is even higher
   (every 3 s) and the real RTT would further increase due to
   congestion, the RTO at some point would increase to 4 s.
   Since the RTO is now above 3 s, a binary backoff is no longer used
   (a backoff factor of 1.5 applies instead), to avoid the RTO growing
   too much in case of retransmissions.  As the generation of data from
   the nodes ceases at some point (the network returns to a normal
   state), the aging mechanism would reduce the RTO automatically (with
   an RTO of 4 s, after 16 s the RTO would be shifted to 3 s before a
   new RTT is measured).

C.3.  Example B: VBF and aging

   A network of nodes connected over 4G with an Internet service is
   calculating very small RTO values (0.3 s) and the nodes are
   transmitting CON messages every 1 s.  Suddenly, the connection
   quality gets worse and the nodes switch to a more stable, yet slower
   connection via GPRS.  As a result of this change, the nodes run into
   retransmissions, as the real RTT has increased above the calculated
   RTO.

   Since the RTO is below 1 s, the Variable Backoff Factor increases the
   backoff values quickly to avoid spurious retransmissions (0.9 s first
   retry, 2.7 s second retry, etc.).  Further, if due to the packet
   losses and increased delays in the network no new RTT measurements
   are obtained, the aging mechanism automatically increases the RTO
   (doubling it) after 4.8 s (16 * 0.3 s) to adapt better to the sudden
   changes of network conditions.  Without the Variable Backoff Factor
   and the aging mechanism, the number of spurious retransmissions would
   be much higher and the RTO would be corrected more slowly.

Appendix D.  Analysis: difference between strong and weak estimators

   This section analyzes the difference between the strong and weak RTO
   estimators.  If there is no congestion, assume a static RTT of R'.
   Then, E_strong can be expressed as:

      E_strong = R' + G,

   since RTTVAR is reduced constantly by RTTVAR = RTTVAR * 3/4
   (according to [RFC6298], and SRTT = R'), so G would be the dominant
   term in the max(G, K * RTTVAR) expression in the long run.
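   As a rough check of the statement above, the following Python sketch
   (not part of this specification) feeds a constant RTT into the
   [RFC6298] update rules and evaluates SRTT + max(G, K * RTTVAR).  The
   function name strong_estimate, the value chosen for the clock
   granularity G, and the use of the [RFC6298] first-sample
   initialization (SRTT = R, RTTVAR = R/2) are assumptions made for
   illustration only.

```python
# Sketch only: strong-estimator behaviour for a static RTT.
ALPHA = 0.125  # RFC 6298 gain for SRTT
BETA = 0.25    # RFC 6298 gain for RTTVAR
K = 4          # multiplier used by the strong estimator (see B.1)
G = 0.01       # ASSUMED clock granularity in seconds (hypothetical value)


def strong_estimate(rtt, samples):
    """Feed `samples` identical RTT measurements (samples >= 1).

    Returns the tuple (SRTT, RTTVAR, E_strong).
    """
    # First measurement: RFC 6298 initialization (assumed here).
    srtt = rtt
    rttvar = rtt / 2
    # Subsequent measurements: with a static RTT, |SRTT - R'| is zero,
    # so each update multiplies RTTVAR by (1 - BETA) = 3/4.
    for _ in range(samples - 1):
        rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - rtt)
        srtt = (1 - ALPHA) * srtt + ALPHA * rtt
    e_strong = srtt + max(G, K * rttvar)
    return srtt, rttvar, e_strong


if __name__ == "__main__":
    for n in (1, 10, 40):
        print(n, strong_estimate(1.0, n))
```

   Under these assumptions, K * RTTVAR shrinks by a factor of 3/4 with
   every sample, so after enough samples only the G term remains and
   the estimate settles at R' + G.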
   For the weak estimator: assume that the RTO setting converges to
   the E_strong value calculated above in the long run.  If there is a
   packet loss, and an RTT is obtained for the first retransmission,
   then the weak RTT sample obtained by the weak estimator (one expired
   RTO of R' + G plus one RTT of R') is:

      RW' = R' + G + R' = 2 * R' + G

   The first sample initializes RTTVAR to RW'/2 (according to
   [RFC6298]), and the weak estimator uses K = 1.  Assuming that G is
   small compared to R', E_weak can therefore be expressed as:

      E_weak = RW' + max(G, RW'/2) = 3 * R' + 1.5 * G,

   i.e., approximately 3 * R'.

Acknowledgements

   The first document to examine CoAP congestion control issues in
   detail was [I-D.eggert-core-congestion-control], to which this draft
   owes a lot.

   Michael Scharf did a review of CoAP congestion control issues that
   asked a lot of good questions.  Several Transport Area
   representatives made further significant inputs to this discussion
   during IETF84, including Lars Eggert, Michael Scharf, and David
   Black.  Andrew McGregor, Eric Rescorla, Richard Kelsey, Ed Beroset,
   Jari Arkko, Zach Shelby, Matthias Kovatsch and many others provided
   very useful additions.  Further reviews by Michael Scharf and Ingemar
   Johansson led to further improvements, including some more discussion
   in the appendices.

   Authors from Universitat Politecnica de Catalunya have been supported
   in part by the Spanish Government's Ministerio de Economia y
   Competitividad through projects TEC2009-11453, TEC2012-32531,
   TEC2016-79988-P and FEDER.

   Carles Gomez has been funded in part by the Spanish Government
   (Ministerio de Educacion, Cultura y Deporte) through the Jose
   Castillejo grant CAS15/00336.  His contribution to this work has been
   carried out in part during his stay as a visiting scholar at the
   Computer Laboratory of the University of Cambridge, in collaboration
   with Prof. Jon Crowcroft.
Authors' Addresses

   Carsten Bormann
   Universitaet Bremen TZI
   Postfach 330440
   Bremen  D-28359
   Germany

   Phone: +49-421-218-63921
   Email: cabo@tzi.org

   August Betzler
   Fundacio i2CAT
   Mobile and Wireless Internet Group
   C/ del Gran Capita, 2
   Barcelona  08034
   Spain

   Email: august.betzler@i2cat.net

   Carles Gomez
   Universitat Politecnica de Catalunya/Fundacio i2CAT
   Escola d'Enginyeria de Telecomunicacio i Aeroespacial
   de Castelldefels
   C/Esteve Terradas, 7
   Castelldefels  08860
   Spain

   Phone: +34-93-413-7206
   Email: carlesgo@entel.upc.edu

   Ilker Demirkol
   Universitat Politecnica de Catalunya/Fundacio i2CAT
   Departament d'Enginyeria Telematica
   C/Jordi Girona, 1-3
   Barcelona  08034
   Spain

   Email: ilker.demirkol@entel.upc.edu