Network Working Group                                          L. Eggert
Internet-Draft                                                     Nokia
Intended status: Experimental                              June 23, 2010
Expires: December 25, 2010


   Congestion Control for the Constrained Application Protocol (CoAP)
                draft-eggert-core-congestion-control-00

Abstract

   The Constrained Application Protocol (CoAP) is a simple,
   low-overhead, UDP-based protocol for use with resource-constrained IP
   networks and nodes.
   CoAP defines a simple technique to individually retransmit lost
   messages, but has no other congestion control mechanisms.  This
   document motivates the need for additional congestion control
   mechanisms, and defines some simple strawman proposals.  The goal is
   to encourage experimentation with these and other proposals, in
   order to determine which mechanisms are feasible to implement on
   resource-constrained nodes and are effective in real deployments.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.  This document may not be modified,
   and derivative works of it may not be created, except to format it
   for publication as an RFC or to translate it into languages other
   than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 25, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

1.  Introduction

   The Constrained Application Protocol (CoAP) [I-D.ietf-core-coap] is a
   simple, low-overhead, UDP-based protocol for use with
   resource-constrained IP networks and nodes.

   CoAP defines two kinds of interactions between end-points:

   1.  a client/server interaction model, where request or notify
       messages initiate a transaction with a server, which may send a
       response to the client with a matching transaction ID

   2.  an asynchronous subscribe/notify interaction model, where a
       server can send notify messages to a client about a resource
       to which the client has subscribed

   CoAP uses the User Datagram Protocol (UDP) [RFC0768] to transmit
   these messages and defines a simple mechanism to individually
   retransmit lost messages using an exponentially backed-off timer.

   This document argues that although this retransmission mechanism is a
   required first step to implement congestion control for CoAP, it
   alone is not sufficient to alleviate network overload in all
   conditions.  Section 2 gives a short summary of Internet congestion
   control principles, and Section 3 presents some simple strawman
   proposals that attempt to complement the current message
   retransmission mechanism in CoAP.

2.  Discussion of Internet Congestion Control Principles

   [RFC2914] describes the best current practices for congestion control
   in the Internet, and requires that Internet communication employ
   congestion control mechanisms.  Because UDP itself provides no
   congestion control mechanisms, it is up to the applications and
   application-layer protocols that use UDP for Internet communication
   to employ suitable mechanisms to prevent congestion collapse and
   establish a degree of fairness.
   CoAP is one such application-layer protocol.

   [RFC2914] identifies two major reasons why congestion control
   mechanisms are critical for the stable operation of the Internet:

   1.  The prevention of congestion collapse, i.e., a state where an
       increase in network load results in a decrease in useful work
       done by the network.

   2.  The establishment of a degree of fairness, i.e., allowing
       multiple flows to share the capacity of a path reasonably
       equitably.

   The overwhelming majority of the bytes on the Internet are caused by
   bulk transfers, and the traditional congestion control mechanisms are
   engineered to saturate the network without driving it into congestion
   collapse.  Fairness between flows is an important consideration when
   the network operates around the saturation point, so that new flows
   are not disadvantaged compared to established flows, and can obtain a
   reasonable share of the capacity quickly.

   The environments that CoAP targets are IP networks, although more
   resource-constrained ones than the "big-I" Internet.  This does not
   eliminate the need for end-point-based congestion control.  If
   anything, the environments that CoAP will be deployed in have fewer
   capabilities for network provisioning, traffic engineering and
   capacity allocation, which are among the techniques that can
   sometimes offset the need for end-to-end congestion control to some
   degree.

   However, the environments that CoAP targets are sufficiently
   different from the "big-I" Internet that the motivations for
   congestion control from [RFC2914] should probably be weighted
   differently.  CoAP networks will not be used for bulk data transfers
   and CoAP nodes will not need to use a significant fraction of the
   capacity of a path to provide a useful service.  (In fact, they are
   often too resource-constrained to do so in the first place.)
   Under normal operation, a CoAP network will be mostly idle, which
   means that fairness between the transmissions of different CoAP nodes
   is not a large issue.  A CoAP congestion control mechanism can hence
   focus on preventing congestion collapse, which is a much more
   tractable problem given the specific conditions of CoAP environments.

   The current IETF congestion control mechanisms, such as TCP [RFC5681]
   or TFRC [RFC5348], all focus on determining a "safe" sending rate for
   a bulk transfer, i.e., for a single flow of many packets between a
   sender and a destination where many packets are in flight at any
   given time.  They measure the path characteristics, such as
   round-trip time (RTT) and packet loss rate, by monitoring the ongoing
   transfer and use this information to adjust the sending rate of the
   flow during the transmission.

   This approach is not feasible for CoAP.  The infrequent
   request/response interaction that CoAP supports does not generate
   sufficient data about the path characteristics to drive a traditional
   congestion control loop, even if the notion of "a flow" to a
   destination is extended from "one CoAP transaction" to "a sequence of
   CoAP transactions".  This approach is also not applicable to
   multicast transmissions, which CoAP offers.

   [RFC5405] documents the IETF's current best practices for using UDP
   for unicast communication in the Internet.  It provides guidance on
   topics such as message sizes, reliability, checksums, middlebox
   traversal and congestion control.  Section 3.1.2 of [RFC5405], which
   focuses on congestion control for low data-volume applications, is
   especially relevant to CoAP.

   Section 3.1.2 of [RFC5405] acknowledges that the traditional IETF
   congestion control mechanisms are not applicable to low data-volume
   application protocols such as CoAP.
   Instead, it recommends that such application protocols:

   o  maintain an estimate of the RTT for any destination with which
      they communicate, or assume a conservative fixed value of 3
      seconds when no RTT estimate can be obtained (e.g., unidirectional
      communication)

   o  control their transmission behavior by not sending on average more
      than one UDP datagram per RTT to a destination

   o  detect packet loss and exponentially back off their retransmission
      timer when a loss event occurs

   o  employ congestion control for both directions of a bi-directional
      communication

   CoAP follows some of these guidelines already.  It uses a fixed value
   of 1 second for its retransmission timer for both requests and
   responses, which although shorter than the recommended value in
   [RFC5405] is likely appropriate for many of its deployment scenarios.
   CoAP also uses exponential back-off for its retransmission timer.

   This alone, however, does not result in a complete congestion control
   mechanism for CoAP.  Section 3 defines an experimental complement to
   the current CoAP mechanism described in [I-D.ietf-core-coap].

3.  CoAP Congestion Control

   This section proposes several congestion control techniques for CoAP
   that are intended to improve its ability to prevent congestion
   collapse.  At the moment, these techniques are described with the
   intent of encouraging experimentation with such proposals in CoAP
   simulations and testbed deployments.  Of particular interest are
   mechanisms requiring little computation and state, i.e., mechanisms
   that can be implemented in resource-constrained nodes without much
   overhead.

3.1.  Retransmissions

   CoAP already defines a simple retransmission scheme with exponential
   back-off, where messages that have not been responded to in
   RESPONSE_TIMEOUT are retransmitted, followed by doubling
   RESPONSE_TIMEOUT.
   Up to MAX_RETRANSMIT retransmission attempts are made.  (At the
   moment, [I-D.ietf-core-coap] defines RESPONSE_TIMEOUT to be 1 second
   and MAX_RETRANSMIT to be five attempts.)  As stated above, although
   RESPONSE_TIMEOUT is shorter than what [RFC5405] recommends, the
   shorter value is unlikely to cause major issues in many deployments
   that CoAP targets.

   However, using a fixed value for RESPONSE_TIMEOUT instead of basing
   it on the measured RTT to a destination has some minor drawbacks.
   CoAP may be used in deployments where the path RTTs can approach the
   currently defined RESPONSE_TIMEOUT of 1 second, such as Internet
   deployments involving GSM or 3G links, or cases where preparing a
   response can involve significant computation or where it otherwise
   incurs delays, such as long sleep cycles at the receiver.  Fixed
   timeouts that are too short can cause spurious retransmissions, i.e.,
   unnecessary retransmissions in cases where either the request or the
   response is still in transit.  Spurious retransmissions, especially
   persistent ones, waste resources.

   This section therefore proposes that CoAP deployments experiment with
   maintaining an estimate of the RTT for any destination with which
   they communicate.  Specifically, it is suggested that deployments
   experiment with the algorithm specified in [RFC2988] to compute a
   smoothed RTT (SRTT) estimate, and compute RESPONSE_TIMEOUT in the
   same way [RFC2988] computes RTO.

   A second suggestion is to experiment with a longer RESPONSE_TIMEOUT,
   such as 3 seconds, which is what [RFC5405] recommends, in order to
   determine whether there are significant drawbacks or whether the
   current value could be lengthened.

3.2.  Aggregate Congestion Control

   Traditional Internet congestion control algorithms control the
   sending rate of a single flow.
   When a node establishes multiple, parallel flows, their congestion
   control loops run (mostly) independently of one another.
   Interactions between the control loops of parallel flows are (mostly)
   indirect, e.g., a rate increase of one flow may cause packet loss and
   a rate decrease in another.

   CoAP "flows", i.e., sequences of infrequent CoAP transactions between
   the same two nodes, do not require much more per-flow congestion
   control than a retransmission scheme that reduces the rate (increases
   the back-off) of a flow under loss, and a (low) cap on the number of
   allowed outstanding requests to a destination.  ([RFC5405] recommends
   "on average not more than one" outstanding transaction to a given
   destination.)

   On the other hand, CoAP applications may potentially want to initiate
   many transactions with different nodes at the same time.  Allowing
   CoAP applications to initiate an unlimited number of parallel
   transactions gives them the means for causing overload, and depends
   on application-level measures to detect and correctly mitigate this
   failure.  Because each transaction only consumes a very limited
   amount of resources, it is arguably more important to control the
   total outstanding number of transactions, compared to controlling the
   rate at which each individual one is being (re)transmitted.  The CoAP
   spec [I-D.ietf-core-coap] currently does not impose any limit on how
   many parallel transactions to different nodes an end-point may have
   outstanding.

   Given the importance of preventing congestion collapse, this document
   argues that the CoAP protocol should specify a common mechanism for
   congestion controlling the aggregate traffic a CoAP node sends into
   the network.
   In other words, the CoAP stack should locally drop
   application-generated messages under overload situations, rather than
   attempting to send them into the network, irrespective of the
   destination.

   One proposal is to implement a simple windowing algorithm.  In this
   mechanism, a CoAP node has a certain number of "transmission credits"
   available during a time interval.  Sending one CoAP message consumes
   one transmission credit, independent of which destination it is being
   sent to.  If all transmission credits have been used up during a time
   interval, the CoAP node drops any additional messages that the
   applications attempt to send during the remainder of the time
   interval.  At the end of a time interval, the CoAP node determines
   whether responses have been received for all requests it has issued
   within the time interval.  If this is the case, the CoAP node
   increases the number of transmission credits by one for the following
   time interval.  If responses fail to arrive for some of the requests
   issued during the time interval, the number of transmission credits
   is cut in half for the next interval.

   The description above leaves several questions unanswered.  These
   include the length of the time interval and whether it is fixed or
   adapted over time, whether an increase by one and a reduction by half
   are the correct parameters for the proposed AIMD (additive increase,
   multiplicative decrease) scheme, whether the decrease should be
   proportional to the loss rate, and others.

   This document does not at the moment attempt to answer these
   questions.  Instead, it encourages simulations and implementations to
   explore the design space, and also consider other non-windowing
   approaches.

3.3.  Explicit Congestion Notification

   Explicit Congestion Notification (ECN) [RFC3168] is an extension to
   IP that allows routers to inform end nodes when they approach
   congestion by setting a bit in the IP header.  The receiver of a
   message echoes this bit to the sender, which reacts as if a packet
   loss had occurred for the flow.

   Deployment of ECN can reduce overall packet loss, because senders can
   react to congestion early, i.e., before packet loss occurs.  This is
   especially attractive in resource-constrained environments, because
   retransmissions can be avoided.

   If CoAP uses an aggregate congestion control mechanism such as the
   one described in Section 3.2, it will reduce the number of
   transmission credits for the next time interval when some of the
   responses received had the ECN bit set.  (Other reactions to ECN
   markings may be possible.)

   Whether ECN support is possible in CoAP deployments remains to be
   investigated, because ECN usage requires a negotiation handshake
   (which could potentially be avoided if support is made mandatory for
   CoAP deployments) and because routers need to support ECN marking.
   At this point, simulations attempting to quantify the benefits may
   therefore be easiest to obtain.

3.4.  Multicast Considerations

   CoAP requests may be multicast, and result in several replies from
   different end-points, potentially consuming much more capacity for
   the request and response transmissions than a single unicast
   transaction.  It can therefore be argued that sending multicast
   requests should be more conservatively controlled than the sending of
   unicast requests.

   CoAP already acknowledges this to some degree by not retransmitting
   multicast requests at the CoAP level.  Unfortunately, CoAP currently
   has no means for preventing an application from doing
   application-level retransmissions of multicast requests.
   Given that the prevention of congestion collapse is important, such a
   mechanism should be added.

   The aggregate congestion control proposal in Section 3.2 puts a cap
   on the number of transmissions allowed during a time interval,
   including multicast requests.  It is currently unclear whether
   additional means are required for CoAP deployments that make heavy
   use of multicast.  As before, experimentation is encouraged to
   understand the problem space.

4.  IANA Considerations

   This document requests no actions from IANA.

   [Note to the RFC Editor: Please remove this section upon
   publication.]

5.  Security Considerations

   This document has no known security implications.

   [Note to the RFC Editor: Please remove this section upon
   publication.]

6.  Acknowledgments

   Lars Eggert is partly funded by [TRILOGY], a research project
   supported by the European Commission under its Seventh Framework
   Program.

7.  References

7.1.  Normative References

   [I-D.ietf-core-coap]
              Shelby, Z., Frank, B., and D. Sturek, "Constrained
              Application Protocol (CoAP)", draft-ietf-core-coap-00
              (work in progress), June 2010.

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              August 1980.

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, September 2000.

   [RFC2988]  Paxson, V. and M. Allman, "Computing TCP's Retransmission
              Timer", RFC 2988, November 2000.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, September 2001.

   [RFC5405]  Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines
              for Application Designers", BCP 145, RFC 5405,
              November 2008.

7.2.  Informative References

   [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
              Friendly Rate Control (TFRC): Protocol Specification",
              RFC 5348, September 2008.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, September 2009.

   [TRILOGY]  "Trilogy Project", http://www.trilogy-project.org/.

Author's Address

   Lars Eggert
   Nokia Research Center
   P.O. Box 407
   Nokia Group  00045
   Finland

   Phone: +358 50 48 24461
   Email: lars.eggert@nokia.com
   URI:   http://research.nokia.com/people/lars_eggert
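
   As an illustration only, the transmission-credit window proposed in
   Section 3.2 might be sketched as follows.  This is not part of the
   proposal itself.  The class name, the method names, the initial
   credit count and the floor of one credit are all assumptions made for
   the example, as is the choice to treat requests still unanswered at
   the end of an interval as lost.  The draft leaves the interval length
   and the AIMD parameters open, so the caller is expected to invoke
   end_interval() from whatever timer it chooses.

```python
class CreditWindow:
    """Hypothetical AIMD credit window shared across all destinations."""

    def __init__(self, initial_credits=4, min_credits=1):
        # Both defaults are assumptions; the draft specifies neither.
        self.credits = initial_credits   # credits granted per interval
        self.min_credits = min_credits   # assumed floor so sending can resume
        self.used = 0                    # credits consumed in this interval
        self.outstanding = set()         # transaction IDs awaiting a response

    def try_send(self, transaction_id):
        """Consume one credit, or return False if the message is dropped."""
        if self.used >= self.credits:
            return False                 # credits exhausted: drop locally
        self.used += 1
        self.outstanding.add(transaction_id)
        return True

    def on_response(self, transaction_id):
        """Record that a response arrived for an earlier request."""
        self.outstanding.discard(transaction_id)

    def end_interval(self):
        """Apply additive increase or multiplicative decrease."""
        if self.outstanding:
            # Some requests went unanswered: halve the credits.
            self.credits = max(self.min_credits, self.credits // 2)
        else:
            # Every request was answered: add one credit.
            self.credits += 1
        self.used = 0
        self.outstanding.clear()         # assumption: unanswered == lost
        return self.credits


window = CreditWindow(initial_credits=2)
results = [window.try_send(tid) for tid in (1, 2, 3)]  # third is dropped
window.on_response(1)
window.on_response(2)
next_credits = window.end_interval()  # all answered, so 2 becomes 3
```

   A sketch like this keeps only a counter and a small set of
   transaction IDs, which is in line with the interest in mechanisms
   that need little computation and state.  An ECN-marked response, as
   discussed in Section 3.3, could be handled by leaving the
   corresponding ID in the outstanding set, but that is only one of
   several possible reactions.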