Network Working Group                                          L. Eggert
Internet-Draft                                                     Nokia
Intended status: Experimental                           January 27, 2011
Expires: July 31, 2011

   Congestion Control for the Constrained Application Protocol (CoAP)
                 draft-eggert-core-congestion-control-01

Abstract

   The Constrained Application Protocol (CoAP) is a simple,
   low-overhead, UDP-based protocol for use with resource-constrained
   IP networks and nodes.
   CoAP defines a simple technique to individually retransmit lost
   messages, but has no other congestion control mechanisms.

   This document motivates the need for additional congestion control
   mechanisms, and defines some simple strawman proposals.  The goal
   is to encourage experimentation with these and other proposals, in
   order to determine which mechanisms are feasible to implement on
   resource-constrained nodes and are effective in real deployments.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.  This document may not be
   modified, and derivative works of it may not be created, except to
   format it for publication as an RFC or to translate it into
   languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   This Internet-Draft will expire on July 31, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1.  Introduction

   The Constrained Application Protocol (CoAP) [I-D.ietf-core-coap]
   is a simple, low-overhead, UDP-based protocol for use with
   resource-constrained IP networks and nodes.

   CoAP defines two kinds of interactions between end-points:

   1.  a client/server interaction model, where request or notify
       messages initiate a transaction with a server, which may send
       a response to the client with a matching transaction ID

   2.  an asynchronous subscribe/notify interaction model, where a
       server can send notify messages to a client about a resource
       which the client has subscribed to

   CoAP uses the User Datagram Protocol (UDP) [RFC0768] to transmit
   these messages.  For reliable messages, i.e., messages for which a
   delivery confirmation is required, CoAP defines a simple mechanism
   to individually retransmit such "confirmable" messages for which
   no delivery acknowledgement was received.  This mechanism uses an
   exponentially backed-off timer to schedule a fixed number of
   retransmission attempts.

   This document argues that although this retransmission mechanism
   is a required first step to implement congestion control for CoAP,
   it alone is not sufficient to alleviate network overload in all
   conditions.  Section 2 gives a short summary of Internet
   congestion control principles, and Section 3 presents some simple
   strawman proposals that attempt to complement the current message
   retransmission mechanism in CoAP.

2.  Discussion of Internet Congestion Control Principles

   [RFC2914] describes the best current practices for congestion
   control in the Internet, and requires that Internet communication
   employ congestion control mechanisms.
   Because UDP itself provides no congestion control mechanisms, it
   is up to the applications and application-layer protocols that use
   UDP for Internet communication to employ suitable mechanisms to
   prevent congestion collapse and establish a degree of fairness.
   CoAP is one such application-layer protocol.

   [RFC2914] identifies two major reasons why congestion control
   mechanisms are critical for the stable operation of the Internet:

   1.  The prevention of congestion collapse, i.e., a state where an
       increase in network load results in a decrease in useful work
       done by the network.

   2.  The establishment of a degree of fairness, i.e., allowing
       multiple flows to share the capacity of a path reasonably
       equitably.

   Bulk transfers account for the overwhelming majority of the bytes
   on the Internet, and the traditional congestion control mechanisms
   used for bulk transfers are engineered to saturate the network
   without driving it into congestive collapse.  Fairness between
   flows is an important secondary consideration when the network
   operates around the saturation point, so that new flows are not
   disadvantaged compared to established flows, and can obtain a
   reasonable share of the capacity quickly.

   The environments that CoAP targets are IP networks, although more
   resource-constrained ones than the "big-I" Internet.  This does
   not eliminate the need for end-point-based congestion control!  If
   anything, the environments that CoAP will be deployed in have
   fewer capabilities for network provisioning, queuing and queue
   management, traffic engineering and capacity allocation, which are
   among the techniques that can sometimes offset the need for
   end-to-end congestion control to some degree.
   However, the environments that CoAP targets are sufficiently
   different from the "big-I" Internet that the motivations for
   congestion control from [RFC2914] should probably be weighted
   differently.  CoAP networks will not be used for bulk data
   transfers, and CoAP nodes will not need to use a significant
   fraction of the capacity of a path to provide a useful service.
   (In fact, they are often too resource-constrained to do so in the
   first place.)  Under normal operation, a CoAP network will be
   mostly idle, which means that fairness between the transmissions
   of different CoAP nodes is not a large issue.  A CoAP congestion
   control mechanism can hence focus on preventing congestion
   collapse, i.e., preventing situations where the amount of useful
   work done approaches zero as network load increases.  This is a
   much more tractable problem given the specific conditions of CoAP
   environments.

   The current IETF congestion control mechanisms, such as TCP
   [RFC5681] or TFRC [RFC5348], all focus on determining a "safe"
   sending rate for a bulk transfer, i.e., for a single flow of many
   packets between a sender and a destination, where many packets are
   in flight at any given time.  They measure the path
   characteristics, such as the round-trip time (RTT) and the packet
   loss rate, by monitoring the ongoing transfer, and use this
   information to adjust the sending rate of the flow during the
   transmission.

   This approach is not feasible for CoAP.  The infrequent request/
   response interaction that CoAP supports does not generate
   sufficient data about the path characteristics to drive a
   traditional congestion control loop, even if the notion of "a
   flow" to a destination is extended from "one CoAP transaction" to
   "a sequence of CoAP transactions".
   Further complications can arise for CoAP deployments that involve
   low-capacity, low-power radio links, which can cause highly
   variable path characteristics that are more challenging to adapt
   to than those of traditional "big-I" Internet paths.  This
   approach is also not applicable to multicast transmissions, which
   may see frequent use in some CoAP deployments.

   [RFC5405] documents the IETF's current best practices for using
   UDP for unicast communication in the Internet.  It provides
   guidance on topics such as message sizes, reliability, checksums,
   middlebox traversal and congestion control.  Section 3.1.2 of
   [RFC5405], which focuses on congestion control for low data-volume
   applications, is especially relevant to CoAP.

   Section 3.1.2 of [RFC5405] acknowledges that the traditional IETF
   congestion control mechanisms are not applicable to low
   data-volume application protocols such as CoAP.  Instead, it
   recommends that such application protocols:

   o  maintain an estimate of the RTT for any destination with which
      they communicate, or assume a conservative fixed value of 3
      seconds when no RTT estimate can be obtained (e.g., for
      unidirectional communication)

   o  control their transmission behavior by not sending on average
      more than one UDP datagram per RTT to a destination

   o  detect packet loss and exponentially back off their
      retransmission timer when a loss event occurs

   o  employ congestion control for both directions of a
      bi-directional communication

   CoAP already follows some of these guidelines.  At the moment, it
   uses a fixed value of 2 seconds for its retransmission timer for
   both requests and responses, which, although somewhat shorter than
   the value recommended in [RFC5405], is likely appropriate for many
   of its deployment scenarios.  CoAP also uses exponential back-off
   for its retransmission timer.
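   This timer behavior can be sketched as follows.  The constants are
   the values currently given in [I-D.ietf-core-coap]; the function
   name and the return convention are illustrative assumptions of
   this sketch, not part of any specification.

```python
# Sketch of CoAP's exponential back-off for "confirmable" messages.
# RESPONSE_TIMEOUT and MAX_RETRANSMIT are the current values from
# [I-D.ietf-core-coap]; everything else here is illustrative.

RESPONSE_TIMEOUT = 2.0  # seconds, initial retransmission timeout
MAX_RETRANSMIT = 4      # maximum number of retransmission attempts

def retransmission_timeouts():
    """Return the timeout, in seconds, that precedes each of the up
    to MAX_RETRANSMIT retransmissions of a confirmable message; the
    timeout doubles after every attempt."""
    return [RESPONSE_TIMEOUT * 2 ** i for i in range(MAX_RETRANSMIT)]

print(retransmission_timeouts())  # [2.0, 4.0, 8.0, 16.0]
```

   Under these assumptions, a sender that never receives an
   acknowledgement gives up roughly 30 seconds (2 + 4 + 8 + 16) after
   the initial transmission.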
   This alone, however, does not result in a complete congestion
   control mechanism for CoAP.  Section 3 defines an experimental
   complement to the current CoAP mechanism described in
   [I-D.ietf-core-coap].

3.  CoAP Congestion Control

   This section proposes several congestion control techniques for
   CoAP that are intended to improve its ability to prevent
   congestion collapse.  At the moment, these techniques are
   described with the intent of encouraging experimentation with such
   proposals in CoAP simulations and experimental testbed
   deployments.  Of particular interest are mechanisms that require
   little computation and state, i.e., mechanisms that can be
   implemented in resource-constrained nodes without much overhead.

3.1.  Retransmissions

   CoAP already defines a simple retransmission scheme with
   exponential back-off, where messages that have not been responded
   to within RESPONSE_TIMEOUT are retransmitted, after which
   RESPONSE_TIMEOUT is doubled.  Up to MAX_RETRANSMIT retransmission
   attempts are made.  (At the moment, [I-D.ietf-core-coap] defines
   RESPONSE_TIMEOUT to be 2 seconds and MAX_RETRANSMIT to be four
   attempts.)  As stated above, although RESPONSE_TIMEOUT is somewhat
   shorter than what [RFC5405] recommends, the shorter value is
   unlikely to cause large issues in many of the deployments that
   CoAP targets.

   However, using a fixed value for RESPONSE_TIMEOUT instead of
   basing it on the measured RTT to a destination has some minor
   drawbacks.  CoAP may be used in deployments where the path RTTs
   can approach the currently defined RESPONSE_TIMEOUT of 2 seconds,
   such as Internet deployments involving GSM or 3G links, or in
   cases where preparing a response involves significant computation
   or otherwise incurs delays, such as long sleep cycles at the
   receiver.
   Fixed timeouts that are too short can cause spurious
   retransmissions, i.e., unnecessary retransmissions in cases where
   either the request or the response is still in transit.  Spurious
   retransmissions, especially persistent ones, waste resources.

   This section therefore proposes that CoAP deployments experiment
   with maintaining an estimate of the RTT for any destination with
   which they (frequently) communicate.  Specifically, it is
   suggested that deployments experiment with the algorithm specified
   in [RFC2988] to compute a smoothed RTT (SRTT) estimate, and
   compute RESPONSE_TIMEOUT in the same way that [RFC2988] computes
   the RTO.

   This suggestion unfortunately requires maintaining per-destination
   state at the sender, which may be undesirable.  The amount of
   required state can be reduced by maintaining a single "upper
   bound" RTT measurement across all destinations.  The downside here
   is that retransmissions may be delayed longer than they would be
   with per-destination state; the upside is that multicast messages
   are supported.

   A second suggestion is to experiment with a longer
   RESPONSE_TIMEOUT, such as the 3 seconds or longer that [RFC5405]
   recommends, in order to determine whether there are significant
   drawbacks or whether the default value could be lengthened.

3.2.  Aggregate Congestion Control

   Traditional Internet congestion control algorithms control the
   sending rate of a single flow.  When a node establishes multiple,
   parallel flows, their congestion control loops run (mostly)
   independently of one another.  Interactions between the control
   loops of parallel flows are (mostly) indirect, e.g., a rate
   increase of one flow may cause packet loss and an eventual rate
   decrease for another.
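   Before continuing, the [RFC2988] computation suggested in
   Section 3.1 can be made concrete with the following sketch.  The
   class name and the clock-granularity value are illustrative
   assumptions; only the update rules themselves follow [RFC2988].

```python
# Sketch of deriving a RESPONSE_TIMEOUT from measured RTTs using the
# [RFC2988] RTO rules.  The class name and the granularity value G
# are illustrative assumptions, not taken from any CoAP document.

class RttEstimator:
    ALPHA = 1.0 / 8   # SRTT smoothing gain ([RFC2988])
    BETA = 1.0 / 4    # RTTVAR smoothing gain ([RFC2988])
    G = 0.1           # assumed clock granularity, seconds

    def __init__(self):
        self.srtt = None
        self.rttvar = None
        self.rto = 3.0  # initial RTO before any measurement

    def update(self, r):
        """Fold one RTT measurement r (seconds) into the estimate
        and return the resulting retransmission timeout."""
        if self.srtt is None:
            # First measurement: SRTT = R, RTTVAR = R/2.
            self.srtt = r
            self.rttvar = r / 2.0
        else:
            # Subsequent measurements: exponentially weighted
            # averages of the RTT and its variation.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - r))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        self.rto = self.srtt + max(self.G, 4.0 * self.rttvar)
        return self.rto
```

   RESPONSE_TIMEOUT for a destination would then be initialized from
   the current RTO, with the existing exponential back-off applied on
   top of it for retransmissions.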
   CoAP "flows", i.e., sequences of infrequent CoAP transactions
   between the same two nodes, do not require much more per-flow
   congestion control than a retransmission scheme that reduces the
   rate of a flow (increases the back-off) under loss, and a (low)
   cap on the number of allowed outstanding requests to a
   destination.  ([RFC5405] recommends "on average not more than one"
   outstanding transaction to a given destination.)

   On the other hand, CoAP applications may want to initiate many
   transactions with different nodes at the same time.  Allowing CoAP
   applications to initiate an unlimited number of parallel
   transactions gives them the means to cause overload, and makes its
   detection and mitigation depend on application-level measures.
   Because each individual transaction only consumes a very limited
   amount of resources, it is arguably more important to control the
   total number of outstanding transactions than the rate at which
   each individual one is being (re)transmitted.  The CoAP
   specification [I-D.ietf-core-coap] does not currently impose any
   limit on how many parallel transactions to different nodes an
   end-point may have outstanding.

   Given the importance of preventing congestion collapse, this
   document argues that the CoAP protocol should specify a common
   mechanism for congestion controlling the aggregate traffic a CoAP
   node sends into the network.  In other words, under overload the
   CoAP stack should locally drop application-generated messages (or
   indicate to applications that no transmission is currently
   permissible), rather than attempt to send them into the network,
   irrespective of their destination.

   One proposal is to implement a simple windowing algorithm.  In
   this mechanism, a CoAP node has a certain number of "transmission
   credits" available during a time interval.
   Sending one CoAP message consumes one transmission credit,
   independent of the destination it is being sent to.  If all
   transmission credits have been used up during a time interval, the
   CoAP node drops any additional messages that the applications
   attempt to send during the remainder of the interval (or it
   prevents applications from generating the messages in the first
   place).  At the end of a time interval, the CoAP node determines
   whether acknowledgments have been received for all "confirmable"
   messages it has sent within the interval.  If this is the case,
   the CoAP node increases the number of transmission credits by one
   for the following time interval.  If acknowledgments fail to
   arrive for some of the "confirmable" messages sent during the
   interval, the number of transmission credits is cut in half for
   the next interval.

   The description above leaves several questions unanswered.  These
   include the length of the time interval and whether it is fixed or
   adapted over time, whether an increase by one and a reduction by
   half are the correct parameters for the proposed AIMD (additive
   increase, multiplicative decrease) scheme, whether the decrease
   should be proportional to the loss rate, how non-confirmable and
   multicast messages are handled, and others.

   At the moment, this document does not attempt to answer these
   questions.  Instead, it encourages simulations and implementations
   to explore the design space, and also to consider other
   non-windowing approaches.

3.3.  Explicit Congestion Notification

   Explicit Congestion Notification (ECN) [RFC3168] is an extension
   to IP that allows routers to inform end nodes when they approach
   congestion, by setting a bit in the IP header.  The receiver of a
   message echoes this bit back to the sender, which reacts as if
   packet loss had occurred for the flow.
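   For concreteness, the windowing algorithm of Section 3.2 might be
   implemented along the following lines; an ECN indication, as
   discussed in this section, could be treated like a missing
   acknowledgement.  The class and method names, the initial credit
   value and the lower bound of one credit are illustrative
   assumptions, not part of any specification.

```python
# Sketch of the transmission-credit windowing algorithm proposed in
# Section 3.2.  All names and the initial/minimum credit values are
# illustrative assumptions.

class CreditWindow:
    def __init__(self, initial_credits=2, min_credits=1):
        self.credits = initial_credits      # credits per interval
        self.min_credits = min_credits      # assumed floor, not spec'd
        self.remaining = initial_credits    # credits left this interval

    def try_send(self):
        """Consume one credit, independent of the destination.
        False means the message is dropped (or deferred)."""
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True

    def end_interval(self, all_confirmed):
        """Close the interval and apply the AIMD update.

        all_confirmed: True if acknowledgements arrived for every
        "confirmable" message sent during the interval (an ECN-marked
        response could be counted like a missing acknowledgement)."""
        if all_confirmed:
            self.credits += 1                      # additive increase
        else:
            self.credits = max(self.min_credits,   # multiplicative
                               self.credits // 2)  # decrease
        self.remaining = self.credits
```

   Note that the credit check is applied before any message leaves
   the node, regardless of its destination, which is what makes the
   control aggregate rather than per-flow.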
   Deployment of ECN can reduce overall packet loss, because senders
   can react to congestion early, i.e., before packet loss occurs.
   This is especially attractive in resource-constrained
   environments, because retransmissions can be avoided, which
   conserves resources.

   If CoAP uses an aggregate congestion control mechanism such as the
   one described in Section 3.2, it would reduce the number of
   transmission credits for the next time interval when some of the
   responses received had the ECN bit set.  (Other reactions to ECN
   markings may be possible.)

   Whether ECN support is possible in CoAP deployments remains to be
   investigated, because ECN usage requires a negotiation handshake
   (which could potentially be avoided if support were made mandatory
   for CoAP deployments) and because routers need to support ECN
   marking.  At this point, simulations are therefore likely the
   easiest way to quantify the benefits that ECN brings to CoAP.

3.4.  Multicast Considerations

   CoAP requests may be multicast, and may result in several replies
   from different end-points, potentially consuming much more
   resource capacity for the request and response transmissions than
   a single unicast transaction.  It can therefore be argued that the
   sending of multicast requests should be more conservatively
   controlled than the sending of unicast requests.

   CoAP already acknowledges this to some degree by not
   retransmitting multicast requests at the CoAP level.
   Unfortunately, CoAP currently has no means for preventing an
   application from doing application-level retransmissions of
   multicast requests.  Given that the prevention of congestion
   collapse is important, such a mechanism should be added.

   The aggregate congestion control proposal in Section 3.2 puts a
   cap on the number of transmissions allowed during a time interval,
   including multicast requests.
   It is currently unclear whether additional means are required for
   CoAP deployments that make heavy use of multicast.  As before,
   experimentation is encouraged to understand the problem space.

4.  IANA Considerations

   This document requests no actions from IANA.

   [Note to the RFC Editor: Please remove this section upon
   publication.]

5.  Security Considerations

   This document has no known security implications.

   [Note to the RFC Editor: Please remove this section upon
   publication.]

6.  Acknowledgments

   Lars Eggert is partly funded by [TRILOGY], a research project
   supported by the European Commission under its Seventh Framework
   Program.

7.  References

7.1.  Normative References

   [I-D.ietf-core-coap]
              Shelby, Z., Hartke, K., Bormann, C., and B. Frank,
              "Constrained Application Protocol (CoAP)",
              draft-ietf-core-coap-04 (work in progress),
              January 2011.

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              August 1980.

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, September 2000.

   [RFC2988]  Paxson, V. and M. Allman, "Computing TCP's
              Retransmission Timer", RFC 2988, November 2000.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The
              Addition of Explicit Congestion Notification (ECN) to
              IP", RFC 3168, September 2001.

   [RFC5405]  Eggert, L. and G. Fairhurst, "Unicast UDP Usage
              Guidelines for Application Designers", BCP 145,
              RFC 5405, November 2008.

7.2.  Informative References

   [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
              Friendly Rate Control (TFRC): Protocol Specification",
              RFC 5348, September 2008.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, September 2009.

   [TRILOGY]  "Trilogy Project", http://www.trilogy-project.org/.

Author's Address

   Lars Eggert
   Nokia Research Center
   P.O. Box 407
   Nokia Group 00045
   Finland

   Phone: +358 50 48 24461
   Email: lars.eggert@nokia.com
   URI:   http://research.nokia.com/people/lars_eggert