CoRE Working Group                                            C. Bormann
Internet-Draft                                   Universitaet Bremen TZI
Intended status: Informational                                A. Betzler
Expires: April 21, 2016                                         C. Gomez
                                                             I. Demirkol
                     Universitat Politecnica de Catalunya/Fundacio i2CAT
                                                        October 19, 2015


                CoAP Simple Congestion Control/Advanced
                      draft-bormann-core-cocoa-03

Abstract

   The CoAP protocol needs to be implemented in such a way that it does
   not cause persistent congestion on the network it uses.  The CoRE
   CoAP specification defines basic behavior that exhibits low risk of
   congestion with minimal implementation requirements.
   It also leaves room for combining the base specification with
   advanced congestion control mechanisms with higher performance.

   This specification defines some simple advanced CoRE Congestion
   Control mechanisms, Simple CoCoA.  In the present version -03, it
   makes use of input from simulations and experiments in real
   networks.  The specification might still benefit from further
   simplification.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 21, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  Context
   3.  Area of Applicability
   4.  Advanced CoAP Congestion Control: RTO Estimation
     4.1.  Blind RTO Estimate
     4.2.  Measured RTO Estimate
       4.2.1.  Modifications to the algorithm of RFC 6298
       4.2.2.  Discussion
     4.3.  Lifetime, Aging
   5.  Advanced CoAP Congestion Control: Non-Confirmables
     5.1.  Discussion
   6.  Advanced CoAP Congestion Control: Aggregate Congestion Control
     6.1.  Proposed Algorithm
     6.2.  Example
     6.3.  Discussion
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   (See Abstract.)

   Extended rationale for this specification can be found in
   [I-D.bormann-core-congestion-control] and
   [I-D.eggert-core-congestion-control], as well as in the minutes of
   the IETF 84 CoRE WG meetings.

1.1.  Terminology

   This specification uses terms from [RFC7252].  In addition, it
   defines the following terminology:

   Initiator:  The endpoint that sends the message that initiates an
      exchange.
      E.g., the party that sends a confirmable message, or a
      non-confirmable message conveying a request.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119] when they
   appear in ALL CAPS.  These words may also appear in this document in
   lower case as plain English words, absent their normative meanings.

   (Note that this document is itself informational, but it is
   discussing normative statements.)

   The term "byte", abbreviated by "B", is used in its now customary
   sense as a synonym for "octet".

2.  Context

   In the Vancouver IETF 84 CoRE meeting, a path forward was defined
   that includes a very simple basic scheme (lock-step with a number of
   parallel exchanges of 1) in the base specification together with
   performance-enhancing advanced mechanisms.

   The present specification is based on the approved text in the
   [RFC7252] base specification.  It is making use of the text that
   permits advanced congestion control mechanisms and allows them to
   change protocol parameters, including NSTART and the binary
   exponential backoff mechanism.  Note that Section 4.8 of [RFC7252]
   limits the leeway that implementations have in changing the CoRE
   protocol parameters.

   The present specification also assumes that, outside of exchanges,
   non-confirmable messages can only be used at a limited rate without
   an advanced congestion control mechanism (this is mainly relevant for
   [RFC7641]).  It is also intended to address the [RFC5405] guideline
   about combining congestion control state for a destination; and to
   clarify its meaning for CoAP using the definition of an endpoint.

   The present specification does not address multicast or dithering
   beyond basic retransmission dithering.

3.  Area of Applicability

   The present algorithm is intended to be generally applicable.  The
   objective is to be "better" than default CoAP congestion control in a
   number of characteristics, including achievable goodput for a given
   offered load, latency, and recovery from bursts, while providing more
   predictable stress to the network and the same level of safety from
   catastrophic congestion.  It does require three state variables per
   scope plus the state needed to do RTT measurements, so it may not be
   applicable to the most constrained devices (class 1 as per
   [RFC7228]).

   The scope of each instance of the algorithm in the current set of
   evaluations has been the five-tuple, i.e., CoAP + endpoint (transport
   address) for Initiator and Responder.  Potential applicability to
   larger scopes needs to be examined.

4.  Advanced CoAP Congestion Control: RTO Estimation

   For an initiator that plans to make multiple requests to one
   destination endpoint, it may be worthwhile to make RTT measurements
   in order to obtain a better RTO estimation than that implied by the
   default initial timeout of 2 to 3 s.  This is based on the usual
   algorithms for RTO estimation [RFC6298], with appropriately extended
   default/base values, as proposed in Section 4.2.1.  Note that such a
   mechanism must, during idle periods, decay RTO estimates that are
   shorter or longer than the basic RTO estimate back to the basic RTO
   estimate, until fresh measurements become available again, as
   proposed in Section 4.3.

   One important consideration not relevant for TCP is the fact that a
   CoAP round-trip may include application processing time, which may be
   hard to predict, and may differ between different resources available
   at the same endpoint.  Also, for communications with networks of
   constrained devices that apply radio duty cycling, large and variable
   round-trip times are likely to be observed.
   Servers will only trigger their early ACKs (with a non-piggybacked
   response to be sent later) based on the default timers, e.g., after
   1 s.  A client that has arrived at an RTO estimate shorter than 1 s
   SHOULD therefore use a larger backoff factor for retransmissions to
   avoid expending all of its retransmissions in the default interval of
   2 to 3 s.  A proposal for a mechanism with variable backoff factors
   is presented in Section 4.2.1.

   It may also be worthwhile to do RTT estimates not just based on
   information measured from a single destination endpoint, but also
   based on entire hosts (IP addresses) and/or complete prefixes (e.g.,
   maintain an RTT estimate for a whole /64).  The exact way this can be
   used to reduce the amount of state in an initiator is for further
   study.

4.1.  Blind RTO Estimate

   The initial RTO estimate for an endpoint is set to 2 seconds (the
   initial RTO estimate is used as the initial value for both E_weak_
   and E_strong_ below).

   If only the initial RTO estimate is available, the RTO estimate for
   each of up to NSTART exchanges started in parallel is set to 2 s
   times the number of parallel exchanges.  For example, if two
   exchanges are already running, the initial RTO estimate for an
   additional exchange is 6 seconds.

4.2.  Measured RTO Estimate

   The RTO estimator runs two copies of the algorithm defined in
   [RFC6298], as modified in Section 4.2.1: One copy for exchanges that
   complete on initial transmissions (the "strong estimator",
   E_strong_), and one copy for exchanges that have run into
   retransmissions, where only the first two retransmissions are
   considered (the "weak estimator", E_weak_).  For the latter, there is
   some ambiguity whether a response is based on the initial
   transmission or the retransmissions.  For the purposes of the weak
   estimator, the time from the initial transmission counts.
   Responses obtained after the third retransmission are not used to
   update an estimator.

   The overall RTO estimate is an exponentially weighted moving average
   (alpha = 0.5 and 0.25, respectively) computed from the strong and the
   weak estimator.  It is updated after each contribution to the weak
   estimator (1) or to the strong estimator (2), using the estimator
   that made the most recent contribution:

   RTO := 0.25 * E_weak_ + 0.75 * RTO                              (1)

   RTO := 0.5 * E_strong_ + 0.5 * RTO                              (2)

   (Splitting this update into the two cases avoids making the
   contribution of the weak estimator too big in naturally lossy
   networks.)

4.2.1.  Modifications to the algorithm of RFC 6298

   This subsection presents three modifications that must be applied to
   the algorithm of [RFC6298] as per this document.  The first two
   recommend new parameter settings.  The third one is the variable
   backoff factor mechanism.

   The initial value for each of the two RTO estimators is 2 s.

   For the weak estimator, the factor K (the RTT variance multiplier) is
   set to 1 instead of 4.  This is necessary to avoid a strong increase
   of the RTO in the case that the RTTVAR value is very large, which may
   be the case if a weak RTT measurement is obtained after one or more
   retransmissions.

   If an RTO estimation is lower than 1 s or higher than 3 s, a variable
   backoff factor is used instead of the binary backoff factor.  For RTO
   estimations below 1 s, the RTO for a retransmission is multiplied by
   3, while for estimations above 3 s, the RTO is multiplied only by 1.5
   (this updated choice of numbers to be verified by more simulations).
   This keeps exchanges with small initial RTOs from using up all
   retransmissions in a short interval of time, and it keeps exchanges
   with large initial RTOs from being unable to carry out all
   retransmissions within MAX_TRANSMIT_WAIT (93 s).
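
   The estimator updates of Section 4.2 and the variable backoff factor
   described above can be illustrated by the following minimal Python
   sketch.  It is not part of the specification.  The names Estimator,
   CocoaRto, on_response, and backoff_factor are hypothetical.  The
   sketch assumes the [RFC6298] gains alpha = 1/8 and beta = 1/4,
   ignores the clock granularity and the minimum RTO of [RFC6298], and
   assumes that the caller reports how many retransmissions preceded
   the response.

```python
from dataclasses import dataclass
from typing import Optional

ALPHA = 1 / 8        # SRTT gain, taken from RFC 6298 (assumption)
BETA = 1 / 4         # RTTVAR gain, taken from RFC 6298 (assumption)
INITIAL_RTO = 2.0    # seconds, initial value of both estimators


@dataclass
class Estimator:
    """One copy of the RFC 6298 algorithm (hypothetical helper)."""
    k: float                       # RTT variance multiplier K
    srtt: Optional[float] = None   # smoothed RTT, None before 1st sample
    rttvar: float = 0.0
    rto: float = INITIAL_RTO

    def update(self, rtt: float) -> float:
        if self.srtt is None:
            # First measurement: initialise SRTT and RTTVAR.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first, using the old SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - rtt)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * rtt
        # Clock granularity and minimum RTO are ignored in this sketch.
        self.rto = self.srtt + self.k * self.rttvar
        return self.rto


class CocoaRto:
    """Overall RTO state for one destination endpoint (hypothetical)."""

    def __init__(self) -> None:
        self.strong = Estimator(k=4)   # exchanges without retransmission
        self.weak = Estimator(k=1)     # K = 1 for the weak estimator
        self.rto = INITIAL_RTO

    def on_response(self, rtt: float, retransmissions: int) -> float:
        """rtt is measured from the initial transmission, in seconds."""
        if retransmissions == 0:
            self.rto = 0.5 * self.strong.update(rtt) + 0.5 * self.rto    # (2)
        elif retransmissions <= 2:
            self.rto = 0.25 * self.weak.update(rtt) + 0.75 * self.rto    # (1)
        # Responses after the third retransmission update nothing.
        return self.rto


def backoff_factor(rto: float) -> float:
    """Variable backoff factor for a given RTO estimate in seconds."""
    if rto < 1.0:
        return 3.0
    if rto > 3.0:
        return 1.5
    return 2.0   # binary backoff in between
```

   For instance, a first response with an RTT of 0.4 s and no
   retransmission sets E_strong_ to 0.4 + 4 * 0.2 = 1.2 s and moves the
   overall RTO from 2 s to 0.5 * 1.2 + 0.5 * 2 = 1.6 s.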

   The binary exponential backoff is truncated at 32 seconds.  Similar
   to the way retransmissions are handled in the base specification,
   they are dithered between 1 x RTO and ACK_RANDOM_FACTOR x RTO.

4.2.2.  Discussion

   In contrast to [RFC6298], this algorithm attempts to make use of
   ambiguous information from retransmissions.  This is motivated by the
   high non-congestion loss rates expected in constrained node networks,
   and the need to update the RTO estimators even in the presence of
   loss.  Additional investigation is required to determine whether this
   is indeed justified.

   Some evaluation has been done on earlier versions of this
   specification [Betzler2013].  A more recent (and more comprehensive)
   reference is [Betzler2015].  Additional investigation is required.

4.3.  Lifetime, Aging

   The state of the RTO estimators for an endpoint SHOULD be kept as
   long as possible.  If other state is kept for the endpoint (such as a
   DTLS connection), it is very strongly RECOMMENDED to keep the RTO
   state alive at least as long as this other state.  It MUST be kept
   for at least 255 s.

   If an estimator has a value that is lower than 1 s, and it is left
   without further update for 16 times its current value, the RTO
   estimate is doubled.  If an estimator has a value that is higher than
   3 s, and it is left without further update for 4 times its current
   value, the RTO estimate is set to be

   RTO := 1 s + (0.5 * RTO)

   (Note that, instead of running a timer, it is possible to implement
   these RTO aging calculations cumulatively at the time the estimator
   is used next.)

5.  Advanced CoAP Congestion Control: Non-Confirmables

   (TO DO: Align this with final consensus on -observe!)

   A CoAP endpoint MUST NOT send non-confirmables to another CoAP
   endpoint at a rate higher than defined by this document.
   Independent of any congestion control mechanisms, a CoAP endpoint can
   always send non-confirmables if their rate does not exceed 1 B/s.

   Non-confirmables that form part of exchanges are governed by the
   rules for exchanges.

   Non-confirmables outside exchanges (e.g., [RFC7641] notifications
   sent as non-confirmables) are governed by the following rules:

   1.  Of any 16 consecutive messages towards this endpoint that aren't
       responses or acknowledgments, at least 2 of the messages must be
       confirmable.

   2.  The confirmable messages must be sent under an RTO estimator, as
       specified in Section 4.

   3.  The packet rate of non-confirmable messages cannot exceed 1/RTO,
       where RTO is the overall RTO estimator value at the time the
       non-confirmable packet is sent.

5.1.  Discussion

   This is relatively conservative.  More advanced versions of this
   algorithm could run a TFRC-style Loss Event Rate calculator [RFC5348]
   and apply the TCP equation to achieve a higher rate than 1/RTO.

6.  Advanced CoAP Congestion Control: Aggregate Congestion Control

   (This section is still more experimental than the previous ones.)

6.1.  Proposed Algorithm

   To avoid possible congestion when sending many packets to different
   destination endpoints in parallel, the overall number of outstanding
   interactions towards different destination endpoints should be
   limited.  An upper limit PLIMIT determines the maximum number of
   outstanding interactions towards different destinations that are
   allowed in parallel.  When a request is sent to a destination
   endpoint, PLIMIT is determined according to Equation (3) in the case
   that valid RTO information is already available for the destination
   endpoint, or using Equation (4) in case that no RTO information is
   available for the destination endpoint.

   PLIMIT = max(LAMBDA, (LAMBDA*ACK_TIMEOUT)/mean(RTO))           (3)

   PLIMIT = LAMBDA                                                (4)

   where LAMBDA determines the minimum value for the maximum number of
   allowed outstanding interactions and is suggested to be set to 4, and
   mean(RTO) is the average value of all valid RTO estimations
   maintained by the device.  A new interaction may only be processed if
   the current overall number of outstanding interactions is lower than
   the PLIMIT calculated when the request is initiated.

6.2.  Example

   In the following we give an example, with LAMBDA = 4 (our proposed
   default LAMBDA):

   Assume that a sender has so far obtained RTO estimations for two
   destination endpoints A (RTO = 0.5 s) and B (RTO = 1.5 s), and
   currently pcount (a variable which accounts for the number of
   outstanding interactions towards different endpoints) is equal to 0.
   Now three transactions are initiated consecutively in the following
   order: one for A, one for B and one for a new destination C.

   When an interaction with node A is initiated, PLIMIT is calculated:

   PLIMIT = max(4, (4*2 s)/mean(0.5 s, 1.5 s)) = max(4, 8 s/1 s) =
   max(4, 8) = 8

   This means that with the current RTO information that the sender has
   obtained about the destination endpoints, up to 8 outstanding
   interactions to different endpoints would be allowed.  By initiating
   an interaction with A, pcount is increased to 1, which is still below
   PLIMIT.  Thus, the interaction may be processed.  The same applies to
   B: pcount increases to 2 after obtaining the same PLIMIT value of 8.

   Destination C is unknown to CoCoA, therefore the updated PLIMIT
   before processing the interaction with node C is 4.

   The CoAP request may be processed (pcount = 3).  If two more
   interactions with different unknown destination endpoints had been
   initiated, only the first one would have met the requirements for
   processing (PLIMIT = 4, pcount = 4).
   The second interaction would have increased pcount to 5, which is not
   permitted, since PLIMIT is 4.  In particular cases, pcount may
   already exceed PLIMIT; in such a case, the interaction is not
   permitted either.

6.3.  Discussion

   The idea of the proposal is to allow more parallel transactions to
   different destination endpoints if we have low RTO estimations for
   them (which can be interpreted as good connections and a low degree
   of congestion).  If the RTO estimations are large or interactions
   with unknown destinations are initiated, the mechanism behaves more
   conservatively by reducing the maximum number of parallel
   interactions towards different destinations, but allowing at least
   LAMBDA outstanding interactions.  If no RTO information is available
   for a destination endpoint, PLIMIT is simply set to be LAMBDA.

   If at any moment pcount would exceed PLIMIT, CoAP does not
   immediately perform the transaction.  Further, it is important that
   NSTART for each destination endpoint (which, for now, we assume to
   be 1) applies in parallel.  Overall, LAMBDA determines how
   aggressively or conservatively CoCoA behaves by default and it should
   be chosen carefully.

   It will be necessary to see whether this approach is effective in the
   sense that it avoids congestion in use cases where transactions to a
   multitude of different destination endpoints are initiated.  An
   important aspect of such evaluations would be how the choice of
   LAMBDA affects the performance.  On the other hand, a safer approach
   would use max(RTO) instead of mean(RTO).  Other concerns include the
   fact that the congestion degree of the paths to "known" endpoints
   influences whether a new interaction is permitted to some new
   endpoint, which may be in very different conditions in terms of
   congestion.  However, it is desirable to avoid adding a lot of
   complexity to the current CoCoA mechanisms.

7.  IANA Considerations

   This document makes no requirements on IANA.  (This section to be
   removed by RFC editor.)

8.  Security Considerations

   (TBD.  The security considerations of, e.g., [RFC5681], [RFC2914],
   and [RFC5405] apply.  Some issues are already discussed in the
   security considerations of [RFC7252].)

9.  Acknowledgements

   The first document to examine CoAP congestion control issues in
   detail was [I-D.eggert-core-congestion-control], to which this draft
   owes a lot.

   Michael Scharf did a review of CoAP congestion control issues that
   asked a lot of good questions.  Several Transport Area
   representatives made further significant inputs to this discussion
   during IETF 84, including Lars Eggert, Michael Scharf, and David
   Black.  Andrew McGregor, Eric Rescorla, Richard Kelsey, Ed Beroset,
   Jari Arkko, Zach Shelby, Matthias Kovatsch and many others provided
   very useful additions.

   Authors from Universitat Politecnica de Catalunya have been supported
   in part by the Spanish Government's Ministerio de Economia y
   Competitividad through projects TEC2009-11453 and TEC2012-32531, and
   FEDER.

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, DOI 10.17487/RFC2914, September 2000.

   [RFC5405]  Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines
              for Application Designers", BCP 145, RFC 5405,
              DOI 10.17487/RFC5405, November 2008.

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298,
              DOI 10.17487/RFC6298, June 2011.

   [RFC7252]  Shelby, Z., Hartke, K., and C. Bormann, "The Constrained
              Application Protocol (CoAP)", RFC 7252,
              DOI 10.17487/RFC7252, June 2014.

10.2.  Informative References

   [Betzler2013]
              Betzler, A., Gomez, C., Demirkol, I., and J. Paradells,
              "Congestion control in reliable CoAP communication",
              ACM MSWIM'13 p. 365-372, DOI 10.1145/2507924.2507954,
              2013.

   [Betzler2015]
              Betzler, A., Gomez, C., Demirkol, I., and J. Paradells,
              "CoCoA+: an Advanced Congestion Control Mechanism for
              CoAP", Ad Hoc Networks Vol. 33 pp. 126-139,
              DOI 10.1016/j.adhoc.2015.04.007, October 2015.

   [I-D.bormann-core-congestion-control]
              Bormann, C. and K. Hartke, "Congestion Control Principles
              for CoAP", draft-bormann-core-congestion-control-02 (work
              in progress), July 2012.

   [I-D.eggert-core-congestion-control]
              Eggert, L., "Congestion Control for the Constrained
              Application Protocol (CoAP)",
              draft-eggert-core-congestion-control-01 (work in
              progress), January 2011.

   [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
              Friendly Rate Control (TFRC): Protocol Specification",
              RFC 5348, DOI 10.17487/RFC5348, September 2008.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC7228]  Bormann, C., Ersue, M., and A. Keranen, "Terminology for
              Constrained-Node Networks", RFC 7228,
              DOI 10.17487/RFC7228, May 2014.

   [RFC7641]  Hartke, K., "Observing Resources in the Constrained
              Application Protocol (CoAP)", RFC 7641,
              DOI 10.17487/RFC7641, September 2015.

Authors' Addresses

   Carsten Bormann
   Universitaet Bremen TZI
   Postfach 330440
   Bremen  D-28359
   Germany

   Phone: +49-421-218-63921
   Email: cabo@tzi.org


   August Betzler
   Universitat Politecnica de Catalunya/Fundacio i2CAT
   Departament d'Enginyeria Telematica
   C/Jordi Girona, 1-3
   Barcelona  08034
   Spain

   Email: august.betzler@entel.upc.edu


   Carles Gomez
   Universitat Politecnica de Catalunya/Fundacio i2CAT
   Escola d'Enginyeria de Telecomunicacio i Aeroespacial
   de Castelldefels
   C/Esteve Terradas, 7
   Castelldefels  08860
   Spain

   Phone: +34-93-413-7206
   Email: carlesgo@entel.upc.edu


   Ilker Demirkol
   Universitat Politecnica de Catalunya/Fundacio i2CAT
   Departament d'Enginyeria Telematica
   C/Jordi Girona, 1-3
   Barcelona  08034
   Spain

   Email: ilker.demirkol@entel.upc.edu