Network Working Group                                          L. Eggert
Internet-Draft                                                     Nokia
Intended status: Experimental                           January 27, 2011
Expires: July 31, 2011

   Congestion Control for the Constrained Application Protocol (CoAP)
                 draft-eggert-core-congestion-control-01

Abstract

   The Constrained Application Protocol (CoAP) is a simple,
   low-overhead, UDP-based protocol for use with resource-constrained
   IP networks and nodes.
   CoAP defines a simple technique to individually retransmit lost
   messages, but has no other congestion control mechanisms.

   This document motivates the need for additional congestion control
   mechanisms, and defines some simple strawman proposals.  The goal
   is to encourage experimentation with these and other proposals, in
   order to determine which mechanisms are feasible to implement on
   resource-constrained nodes and are effective in real deployments.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.  This document may not be
   modified, and derivative works of it may not be created, except to
   format it for publication as an RFC or to translate it into
   languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   This Internet-Draft will expire on July 31, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1.  Introduction

   The Constrained Application Protocol (CoAP) [I-D.ietf-core-coap]
   is a simple, low-overhead, UDP-based protocol for use with
   resource-constrained IP networks and nodes.

   CoAP defines two kinds of interactions between end-points:

   1.  a client/server interaction model, where request or notify
       messages initiate a transaction with a server, which may send
       a response to the client with a matching transaction ID

   2.  an asynchronous subscribe/notify interaction model, where a
       server can send notify messages to a client about a resource
       which the client has subscribed to

   CoAP uses the User Datagram Protocol (UDP) [RFC0768] to transmit
   these messages.  For reliable messages, i.e., messages for which a
   delivery confirmation is required, CoAP defines a simple mechanism
   to individually retransmit such "confirmable" messages for which
   no delivery acknowledgement was received.  This mechanism uses an
   exponentially backed-off timer to schedule a fixed number of
   retransmission attempts.

   This document argues that although this retransmission mechanism
   is a required first step to implement congestion control for CoAP,
   it alone is not sufficient to alleviate network overload in all
   conditions.  Section 2 gives a short summary of Internet
   congestion control principles, and Section 3 presents some simple
   strawman proposals that attempt to complement the current message
   retransmission mechanism in CoAP.

2.  Discussion of Internet Congestion Control Principles

   [RFC2914] describes the best current practices for congestion
   control in the Internet, and requires that Internet communication
   employ congestion control mechanisms.
   Because UDP itself provides no congestion control mechanisms, it
   is up to the applications and application-layer protocols that use
   UDP for Internet communication to employ suitable mechanisms to
   prevent congestion collapse and establish a degree of fairness.
   CoAP is one such application-layer protocol.

   [RFC2914] identifies two major reasons why congestion control
   mechanisms are critical for the stable operation of the Internet:

   1.  The prevention of congestion collapse, i.e., a state where an
       increase in network load results in a decrease in useful work
       done by the network.

   2.  The establishment of a degree of fairness, i.e., allowing
       multiple flows to share the capacity of a path reasonably
       equitably.

   Bulk transfers account for the overwhelming majority of the bytes
   on the Internet, and the traditional congestion control mechanisms
   used for bulk transfers are engineered to saturate the network
   without driving it into congestive collapse.  Fairness between
   flows is an important secondary consideration when the network
   operates around the saturation point, so that new flows are not
   disadvantaged compared to established flows, and can obtain a
   reasonable share of the capacity quickly.

   The environments that CoAP targets are IP networks, although more
   resource-constrained ones than the "big-I" Internet.  This does
   not eliminate the need for end-point-based congestion control!  If
   anything, the environments that CoAP will be deployed in have
   fewer capabilities for network provisioning, queuing and queue
   management, traffic engineering and capacity allocation, which are
   among the techniques that can sometimes offset the need for
   end-to-end congestion control to some degree.
   However, the environments that CoAP targets are sufficiently
   different from the "big-I" Internet that the motivations for
   congestion control from [RFC2914] should probably be weighted
   differently.  CoAP networks will not be used for bulk data
   transfers, and CoAP nodes will not need to use a significant
   fraction of the capacity of a path to provide a useful service.
   (In fact, they are often too resource-constrained to do so in the
   first place.)  Under normal operation, a CoAP network will be
   mostly idle, which means that fairness between the transmissions
   of different CoAP nodes is not a large issue.  A CoAP congestion
   control mechanism can hence focus on preventing congestion
   collapse, i.e., preventing situations where the amount of useful
   work done approaches zero as network load increases.  This is a
   much more tractable problem given the specific conditions of CoAP
   environments.

   The current IETF congestion control mechanisms, such as TCP
   [RFC5681] or TFRC [RFC5348], all focus on determining a "safe"
   sending rate for a bulk transfer, i.e., for a single flow of many
   packets between a sender and a destination, where many packets are
   in flight at any given time.  They measure the path
   characteristics, such as the round-trip time (RTT) and the packet
   loss rate, by monitoring the ongoing transfer, and use this
   information to adjust the sending rate of the flow during the
   transmission.

   This approach is not feasible for CoAP.  The infrequent request/
   response interaction that CoAP supports does not generate
   sufficient data about the path characteristics to drive a
   traditional congestion control loop, even if the notion of "a
   flow" to a destination is extended from "one CoAP transaction" to
   "a sequence of CoAP transactions".
   Further complications can arise for CoAP deployments that involve
   low-capacity, low-power radio links, which can cause highly
   variable path characteristics that are more challenging to adapt
   to than those of traditional "big-I" Internet paths.  This
   approach is also not applicable to multicast transmissions, which
   may see frequent use in some CoAP deployments.

   [RFC5405] documents the IETF's current best practices for using
   UDP for unicast communication in the Internet.  It provides
   guidance on topics such as message sizes, reliability, checksums,
   middlebox traversal and congestion control.  Section 3.1.2 of
   [RFC5405], which focuses on congestion control for low data-volume
   applications, is especially relevant to CoAP.

   Section 3.1.2 of [RFC5405] acknowledges that the traditional IETF
   congestion control mechanisms are not applicable to low
   data-volume application protocols such as CoAP.  Instead, it
   recommends that such application protocols:

   o  maintain an estimate of the RTT for any destination with which
      they communicate, or assume a conservative fixed value of 3
      seconds when no RTT estimate can be obtained (e.g., for
      unidirectional communication)

   o  control their transmission behavior by not sending on average
      more than one UDP datagram per RTT to a destination

   o  detect packet loss and exponentially back off their
      retransmission timer when a loss event occurs

   o  employ congestion control for both directions of a
      bi-directional communication

   CoAP already follows some of these guidelines.  At the moment, it
   uses a fixed value of 2 seconds for its retransmission timer for
   both requests and responses, which, although somewhat shorter than
   the value recommended in [RFC5405], is likely appropriate for many
   of its deployment scenarios.  CoAP also uses exponential back-off
   for its retransmission timer.
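   This timer behavior can be sketched as follows.  The constants are
   the values currently given in [I-D.ietf-core-coap]; the function
   name and the return convention are illustrative assumptions of
   this sketch, not part of any specification.

```python
# Sketch of CoAP's exponential back-off for "confirmable" messages.
# RESPONSE_TIMEOUT and MAX_RETRANSMIT are the current values from
# [I-D.ietf-core-coap]; everything else here is illustrative.

RESPONSE_TIMEOUT = 2.0  # seconds, initial retransmission timeout
MAX_RETRANSMIT = 4      # maximum number of retransmission attempts

def retransmission_timeouts():
    """Return the timeout, in seconds, that precedes each of the up
    to MAX_RETRANSMIT retransmissions of a confirmable message; the
    timeout doubles after every attempt."""
    return [RESPONSE_TIMEOUT * 2 ** i for i in range(MAX_RETRANSMIT)]

print(retransmission_timeouts())  # [2.0, 4.0, 8.0, 16.0]
```

   Under these assumptions, a sender that never receives an
   acknowledgement gives up roughly 30 seconds (2 + 4 + 8 + 16) after
   the initial transmission.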
   This alone, however, does not result in a complete congestion
   control mechanism for CoAP.  Section 3 defines an experimental
   complement to the current CoAP mechanism described in
   [I-D.ietf-core-coap].

3.  CoAP Congestion Control

   This section proposes several congestion control techniques for
   CoAP that are intended to improve its ability to prevent
   congestion collapse.  At the moment, these techniques are
   described with the intent of encouraging experimentation with such
   proposals in CoAP simulations and experimental testbed
   deployments.  Of particular interest are mechanisms that require
   little computation and state, i.e., mechanisms that can be
   implemented in resource-constrained nodes without much overhead.

3.1.  Retransmissions

   CoAP already defines a simple retransmission scheme with
   exponential back-off, where messages that have not been responded
   to within RESPONSE_TIMEOUT are retransmitted, after which
   RESPONSE_TIMEOUT is doubled.  Up to MAX_RETRANSMIT retransmission
   attempts are made.  (At the moment, [I-D.ietf-core-coap] defines
   RESPONSE_TIMEOUT to be 2 seconds and MAX_RETRANSMIT to be four
   attempts.)  As stated above, although RESPONSE_TIMEOUT is somewhat
   shorter than what [RFC5405] recommends, the shorter value is
   unlikely to cause large issues in many of the deployments that
   CoAP targets.

   However, using a fixed value for RESPONSE_TIMEOUT instead of
   basing it on the measured RTT to a destination has some minor
   drawbacks.  CoAP may be used in deployments where the path RTTs
   can approach the currently defined RESPONSE_TIMEOUT of 2 seconds,
   such as Internet deployments involving GSM or 3G links, or in
   cases where preparing a response involves significant computation
   or otherwise incurs delays, such as long sleep cycles at the
   receiver.
   Fixed timeouts that are too short can cause spurious
   retransmissions, i.e., unnecessary retransmissions in cases where
   either the request or the response is still in transit.  Spurious
   retransmissions, especially persistent ones, waste resources.

   This section therefore proposes that CoAP deployments experiment
   with maintaining an estimate of the RTT for any destination with
   which they (frequently) communicate.  Specifically, it is
   suggested that deployments experiment with the algorithm specified
   in [RFC2988] to compute a smoothed RTT (SRTT) estimate, and
   compute RESPONSE_TIMEOUT in the same way that [RFC2988] computes
   the RTO.

   This suggestion unfortunately requires maintaining per-destination
   state at the sender, which may be undesirable.  The amount of
   required state can be reduced by maintaining a single "upper
   bound" RTT measurement across all destinations.  The downside here
   is that retransmissions may be delayed longer than they would be
   with per-destination state; the upside is that multicast messages
   are supported.

   A second suggestion is to experiment with a longer
   RESPONSE_TIMEOUT, such as the 3 seconds or longer that [RFC5405]
   recommends, in order to determine whether there are significant
   drawbacks or whether the default value could be lengthened.

3.2.  Aggregate Congestion Control

   Traditional Internet congestion control algorithms control the
   sending rate of a single flow.  When a node establishes multiple,
   parallel flows, their congestion control loops run (mostly)
   independently of one another.  Interactions between the control
   loops of parallel flows are (mostly) indirect, e.g., a rate
   increase of one flow may cause packet loss and an eventual rate
   decrease for another.
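   Before continuing, the [RFC2988] computation suggested in
   Section 3.1 can be made concrete with the following sketch.  The
   class name and the clock-granularity value are illustrative
   assumptions; only the update rules themselves follow [RFC2988].

```python
# Sketch of deriving a RESPONSE_TIMEOUT from measured RTTs using the
# [RFC2988] RTO rules.  The class name and the granularity value G
# are illustrative assumptions, not taken from any CoAP document.

class RttEstimator:
    ALPHA = 1.0 / 8   # SRTT smoothing gain ([RFC2988])
    BETA = 1.0 / 4    # RTTVAR smoothing gain ([RFC2988])
    G = 0.1           # assumed clock granularity, seconds

    def __init__(self):
        self.srtt = None
        self.rttvar = None
        self.rto = 3.0  # initial RTO before any measurement

    def update(self, r):
        """Fold one RTT measurement r (seconds) into the estimate
        and return the resulting retransmission timeout."""
        if self.srtt is None:
            # First measurement: SRTT = R, RTTVAR = R/2.
            self.srtt = r
            self.rttvar = r / 2.0
        else:
            # Subsequent measurements: exponentially weighted
            # averages of the RTT and its variation.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - r))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        self.rto = self.srtt + max(self.G, 4.0 * self.rttvar)
        return self.rto
```

   RESPONSE_TIMEOUT for a destination would then be initialized from
   the current RTO, with the existing exponential back-off applied on
   top of it for retransmissions.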
   CoAP "flows", i.e., sequences of infrequent CoAP transactions
   between the same two nodes, do not require much more per-flow
   congestion control than a retransmission scheme that reduces the
   rate of a flow (increases the back-off) under loss, and a (low)
   cap on the number of allowed outstanding requests to a
   destination.  ([RFC5405] recommends "on average not more than one"
   outstanding transaction to a given destination.)

   On the other hand, CoAP applications may want to initiate many
   transactions with different nodes at the same time.  Allowing CoAP
   applications to initiate an unlimited number of parallel
   transactions gives them the means to cause overload, and makes its
   detection and mitigation depend on application-level measures.
   Because each individual transaction only consumes a very limited
   amount of resources, it is arguably more important to control the
   total number of outstanding transactions than the rate at which
   each individual one is being (re)transmitted.  The CoAP
   specification [I-D.ietf-core-coap] does not currently impose any
   limit on how many parallel transactions to different nodes an
   end-point may have outstanding.

   Given the importance of preventing congestion collapse, this
   document argues that the CoAP protocol should specify a common
   mechanism for congestion controlling the aggregate traffic a CoAP
   node sends into the network.  In other words, under overload the
   CoAP stack should locally drop application-generated messages (or
   indicate to applications that no transmission is currently
   permissible), rather than attempt to send them into the network,
   irrespective of their destination.

   One proposal is to implement a simple windowing algorithm.  In
   this mechanism, a CoAP node has a certain number of "transmission
   credits" available during a time interval.
   Sending one CoAP message consumes one transmission credit,
   independent of the destination it is being sent to.  If all
   transmission credits have been used up during a time interval, the
   CoAP node drops any additional messages that the applications
   attempt to send during the remainder of the interval (or it
   prevents applications from generating the messages in the first
   place).  At the end of a time interval, the CoAP node determines
   whether acknowledgments have been received for all "confirmable"
   messages it has sent within the interval.  If this is the case,
   the CoAP node increases the number of transmission credits by one
   for the following time interval.  If acknowledgments fail to
   arrive for some of the "confirmable" messages sent during the
   interval, the number of transmission credits is cut in half for
   the next interval.

   The description above leaves several questions unanswered.  These
   include the length of the time interval and whether it is fixed or
   adapted over time, whether an increase by one and a reduction by
   half are the correct parameters for the proposed AIMD (additive
   increase, multiplicative decrease) scheme, whether the decrease
   should be proportional to the loss rate, how non-confirmable and
   multicast messages are handled, and others.

   At the moment, this document does not attempt to answer these
   questions.  Instead, it encourages simulations and implementations
   to explore the design space, and also to consider other
   non-windowing approaches.

3.3.  Explicit Congestion Notification

   Explicit Congestion Notification (ECN) [RFC3168] is an extension
   to IP that allows routers to inform end nodes when they approach
   congestion, by setting a bit in the IP header.  The receiver of a
   message echoes this bit back to the sender, which reacts as if
   packet loss had occurred for the flow.
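   For concreteness, the windowing algorithm of Section 3.2 might be
   implemented along the following lines; an ECN indication, as
   discussed in this section, could be treated like a missing
   acknowledgement.  The class and method names, the initial credit
   value and the lower bound of one credit are illustrative
   assumptions, not part of any specification.

```python
# Sketch of the transmission-credit windowing algorithm proposed in
# Section 3.2.  All names and the initial/minimum credit values are
# illustrative assumptions.

class CreditWindow:
    def __init__(self, initial_credits=2, min_credits=1):
        self.credits = initial_credits      # credits per interval
        self.min_credits = min_credits      # assumed floor, not spec'd
        self.remaining = initial_credits    # credits left this interval

    def try_send(self):
        """Consume one credit, independent of the destination.
        False means the message is dropped (or deferred)."""
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True

    def end_interval(self, all_confirmed):
        """Close the interval and apply the AIMD update.

        all_confirmed: True if acknowledgements arrived for every
        "confirmable" message sent during the interval (an ECN-marked
        response could be counted like a missing acknowledgement)."""
        if all_confirmed:
            self.credits += 1                      # additive increase
        else:
            self.credits = max(self.min_credits,   # multiplicative
                               self.credits // 2)  # decrease
        self.remaining = self.credits
```

   Note that the credit check is applied before any message leaves
   the node, regardless of its destination, which is what makes the
   control aggregate rather than per-flow.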
   Deployment of ECN can reduce overall packet loss, because senders
   can react to congestion early, i.e., before packet loss occurs.
   This is especially attractive in resource-constrained
   environments, because retransmissions can be avoided, which
   conserves resources.

   If CoAP uses an aggregate congestion control mechanism such as the
   one described in Section 3.2, it would reduce the number of
   transmission credits for the next time interval when some of the
   responses received had the ECN bit set.  (Other reactions to ECN
   markings may be possible.)

   Whether ECN support is possible in CoAP deployments remains to be
   investigated, because ECN usage requires a negotiation handshake
   (which could potentially be avoided if support were made mandatory
   for CoAP deployments) and because routers need to support ECN
   marking.  At this point, simulations are therefore likely the
   easiest way to quantify the benefits that ECN brings to CoAP.

3.4.  Multicast Considerations

   CoAP requests may be multicast, and may result in several replies
   from different end-points, potentially consuming much more
   resource capacity for the request and response transmissions than
   a single unicast transaction.  It can therefore be argued that the
   sending of multicast requests should be more conservatively
   controlled than the sending of unicast requests.

   CoAP already acknowledges this to some degree by not
   retransmitting multicast requests at the CoAP level.
   Unfortunately, CoAP currently has no means for preventing an
   application from doing application-level retransmissions of
   multicast requests.  Given that the prevention of congestion
   collapse is important, such a mechanism should be added.

   The aggregate congestion control proposal in Section 3.2 puts a
   cap on the number of transmissions allowed during a time interval,
   including multicast requests.
   It is currently unclear whether additional means are required for
   CoAP deployments that make heavy use of multicast.  As before,
   experimentation is encouraged to understand the problem space.

4.  IANA Considerations

   This document requests no actions from IANA.

   [Note to the RFC Editor: Please remove this section upon
   publication.]

5.  Security Considerations

   This document has no known security implications.

   [Note to the RFC Editor: Please remove this section upon
   publication.]

6.  Acknowledgments

   Lars Eggert is partly funded by [TRILOGY], a research project
   supported by the European Commission under its Seventh Framework
   Program.

7.  References

7.1.  Normative References

   [I-D.ietf-core-coap]
              Shelby, Z., Hartke, K., Bormann, C., and B. Frank,
              "Constrained Application Protocol (CoAP)",
              draft-ietf-core-coap-04 (work in progress),
              January 2011.

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              August 1980.

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, September 2000.

   [RFC2988]  Paxson, V. and M. Allman, "Computing TCP's
              Retransmission Timer", RFC 2988, November 2000.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The
              Addition of Explicit Congestion Notification (ECN) to
              IP", RFC 3168, September 2001.

   [RFC5405]  Eggert, L. and G. Fairhurst, "Unicast UDP Usage
              Guidelines for Application Designers", BCP 145,
              RFC 5405, November 2008.

7.2.  Informative References

   [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
              Friendly Rate Control (TFRC): Protocol Specification",
              RFC 5348, September 2008.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, September 2009.

   [TRILOGY]  "Trilogy Project", http://www.trilogy-project.org/.

Author's Address

   Lars Eggert
   Nokia Research Center
   P.O. Box 407
   Nokia Group 00045
   Finland

   Phone: +358 50 48 24461
   Email: lars.eggert@nokia.com
   URI:   http://research.nokia.com/people/lars_eggert