Congestion Exposure                                          T. Moncaster
Internet-Draft                                                    L. Krug
Intended status: Informational                                         BT
Expires: September 2, 2010                                       M. Menth
                                                 University of Wuerzburg
                                                                J. Araujo
                                                                      UCL
                                                                 S. Blake
                                                         Extreme Networks
                                                           R. Woundy, Ed.
                                                                  Comcast
                                                            March 1, 2010

            The Need for Congestion Exposure in the Internet
                     draft-moncaster-conex-problem-00

Abstract

Today's Internet is a product of its history.  TCP is the main transport protocol responsible for sharing out bandwidth and preventing a recurrence of congestion collapse, while packet drop is the primary signal of congestion at bottlenecks.  Since packet drop (and increased delay) impacts all their customers negatively, network operators would like to be able to distinguish between overly aggressive congestion control and a confluence of many low-bandwidth, low-impact flows.  But they are unable to see the actual congestion signal and thus have to implement bandwidth and/or usage limits based on the only information they can see or measure (the contents of the packet headers and the rate of the traffic).  Such measures don't solve the packet-drop problems effectively and are leading to calls for government regulation (which also won't solve the problem).

We propose congestion exposure as a possible solution.
This allows packets to carry an accurate prediction of the congestion they expect to cause downstream, thus making that congestion visible to ISPs and network operators.  This memo sets out the motivations for congestion exposure and introduces a strawman protocol designed to achieve it.

Status of This Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups.  Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time.  It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on September 2, 2010.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the document authors.  All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document.  Please review these documents carefully, as they describe your rights and restrictions with respect to this document.  Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the BSD License.

Table of Contents

   1.  Introduction
       1.1.  Definitions
       1.2.  Changes from previous versions
   2.  The Problem
       2.1.  Congestion is not the problem
       2.2.  Increase capacity or manage traffic?
             2.2.1.  Making Congestion Visible
             2.2.2.  ECN - a Step in the Right Direction
   3.  Existing Approaches to Traffic Control
       3.1.  Layer 3 Measurement
             3.1.1.  Volume Accounting
             3.1.2.  Rate Measurement
       3.2.  Higher Layer Discrimination
             3.2.1.  Bottleneck Rate Policing
             3.2.2.  DPI and Application Rate Policing
   4.  Why Now?
   5.  Requirements for a Solution
   6.  A Strawman Congestion Exposure Protocol
   7.  Use Cases
       7.1.  Improved Policing
             7.1.1.  Per Aggregate Policing
             7.1.2.  Per Customer Policing
   8.  IANA Considerations
   9.  Security Considerations
   10. Conclusions
   11. Acknowledgements
   12. Informative References

1. Introduction

The Internet has grown from humble origins to become a global phenomenon with billions of end-users able to share the network and exchange data.  One of the key elements in this success has been the use of distributed algorithms such as TCP that share capacity while avoiding congestion collapse.  These algorithms rely on the end-systems altruistically reducing their transmission rate in response to any congestion they see.

In recent years ISPs have seen a minority of users taking a larger share of the network by using applications that transfer data continuously for hours or even days at a time and even opening multiple simultaneous TCP connections.  This issue became prevalent with the advent of "always on" broadband connections.  Frequently peer-to-peer protocols have been held responsible [RFC5594], but streaming video traffic is becoming increasingly significant.  In order to improve the network experience for the majority of their customers, many ISPs have chosen to impose controls on how their network's capacity is shared rather than continually buying more capacity.  They calculate that most customers will be unwilling to contribute to the cost of extra shared capacity if that will only really benefit a minority of users.  Approaches include volume counting or charging, and application rate limiting.  Typically these traffic controls, whilst not impacting most customers, set a restriction on a customer's level of network usage, as defined in a "fair usage policy".

We believe that such traffic controls seek to control the wrong quantity.  What matters in the network is neither the volume of traffic nor the rate of traffic; it is the contribution to congestion over time.  Congestion means that your traffic impacts other users, and conversely that their traffic impacts you.  So if there is no congestion there need not be any restriction on the amount a user can send; restrictions only need to apply when others are sending traffic such that there is congestion.  In fact some of the current work at the IETF [LEDBAT] and IRTF [CC-open-research] already reflects this thinking.  For example, an application intending to transfer large amounts of data could use LEDBAT to try to reduce its transmission rate before any competing TCP flows do, by detecting an increase in end-to-end delay (as a measure of incipient congestion).  However these techniques rely on voluntary, altruistic action by end users and their application providers.  ISPs cannot enforce their use.  This leads to our second point.

The Internet was designed so that end-hosts detect and control congestion.  We believe that congestion needs to be visible to network nodes as well, not just to the end hosts.  More specifically, a network needs to be able to measure how much congestion traffic causes between the monitoring point in the network and the destination ("rest-of-path congestion").
This would be a new capability; today a network can use explicit congestion notification (ECN) [RFC3168] to detect how much congestion traffic has suffered between the source and a monitoring point in the network, but not beyond.  Such a capability would enable an ISP to give incentives for the use, without restrictions, of LEDBAT-like applications whilst perhaps restricting excessive use of TCP and UDP ones.

So we propose a new approach which we call congestion exposure.  We propose that congestion information should be made visible at the IP layer, so that any network node can measure the contribution to congestion of an aggregate of traffic as easily as straight volume can be measured today.  Once the information is exposed in this way, it is then possible to use it to measure the true impact of any traffic on the network.  Lacking the ability to see congestion, some ISPs count the volume each user transfers.  On this basis LEDBAT applications would get blamed for hogging the network, given the large amount of volume they transfer.  However, because they yield rather than hog, they actually contribute very little to congestion.  One use of exposed congestion information would be to measure the congestion attributable to a given user, and thereby incentivise the use of protocols such as [LEDBAT] which aim to reduce the congestion caused by bulk data transfers.

Creating the incentive to deploy low-congestion protocols such as LEDBAT is just one of many motivations for congestion exposure.  In general, congestion exposure gives ISPs a principled way to hold their customers accountable for the impact on others of their network usage and to reward them for choosing congestion-sensitive applications.  It can measure the impact of an individual consumer, a large enterprise network or the traffic crossing a border from another ISP - anywhere where volume is used today as a (poor) measure of usage.  Section 7 gives a range of potential use cases for congestion exposure, showing the breadth of ways in which the exposed congestion information could be used.

1.1. Definitions

We refer to congestion throughout this document.  Congestion has a wide range of definitions.  For the purposes of this document it is defined using the simplest way that it can be measured: the instantaneous fraction of loss.  More precisely, congestion is bits lost divided by bits sent, taken over any brief period.  By extension, if explicit congestion notification (ECN) is being used, the fraction of bits marked (rather than lost) gives a useful metric that can be thought of as analogous to congestion.  Strictly, congestion should measure impairment, whereas ECN aims to avoid any loss or delay impairments due to congestion.  But for the purposes of this document, the two will both be called congestion.

We also need to define two specific terms carefully:

Upstream Congestion:  The congestion that has already been experienced by a packet as it travels along its path.  In other words, at any point on the path it is the congestion between that point and the source of the packet.

Downstream Congestion:  The congestion that a packet still has to experience on the remainder of its path.  In other words, at any point it is the congestion still to be experienced as the packet travels between that point and its destination.
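To make these definitions concrete, here is a small worked example in Python (ours, purely illustrative; the function name and the figures are not part of any protocol): congestion over a brief interval is the fraction of bytes lost or marked, and downstream congestion at a monitoring point is simply whole-path congestion minus upstream congestion.

   def congestion_fraction(bytes_lost_or_marked, bytes_sent):
       """Congestion over a brief interval, per Section 1.1: bytes lost
       (or ECN-marked) divided by bytes sent."""
       if bytes_sent == 0:
           return 0.0
       return bytes_lost_or_marked / bytes_sent

   # A monitoring point part-way along a path (figures are made up).
   whole_path = congestion_fraction(20_000, 1_000_000)  # 2.0% over the whole path
   upstream = congestion_fraction(5_000, 1_000_000)      # 0.5% from source to here

   # Downstream ("rest-of-path") congestion is what remains to be experienced.
   downstream = whole_path - upstream                     # 1.5%
   print(f"upstream={upstream:.1%} downstream={downstream:.1%}")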
1.2. Changes from previous versions

From -03 to -04 (current version):

   Many edits throughout per comments from Bob Briscoe about the intentions of ConEx.

   References section updated; reference to Comcast congestion management system added as ISP example.

   NOTE: there are still sections needing more work, especially the Use Cases.  The whole document also needs trimming in places and checking for repetition or omission.

From -02 to -03:

   Abstract re-written again following comments from John Leslie.

   Use Cases Section re-written.

   Security Considerations section improved.

   This ChangeLog added.

From -01 to -02:

   Extensive changes throughout the document:

   +  Abstract and Introduction re-written.

   +  The Problem section re-written and extended significantly.

   +  Why Now? Section re-written and extended.

   +  Requirements extended.

   +  Security Considerations expanded.

   Other less major changes throughout.

From -00 to -01:

   Significant changes throughout including re-organising the main structure.

   New Abstract and changes to Introduction.

2. The Problem

2.1. Congestion is not the problem

The problem is not congestion itself.  The problem is how best to share available capacity.  When too much traffic meets too little capacity, congestion occurs.  Then we have to share out what capacity there is.  But we should not (and cannot) solve the capacity sharing problem by trying to make it go away - by saying there should somehow be no congestion, whether through slower traffic or more capacity.  That misses the whole point of the Internet: to multiplex or share available capacity at maximum bit-rate.

So, as we say, the problem is not congestion in itself.  Every elastic data transfer should (and usually will) congest a healthy data network.  If it doesn't, its transport protocol is broken.  There should always be periods approaching 100% utilisation at some link along every data path through the Internet, implying that frequent periods of congestion are a healthy sign.  If transport protocols are too weak to congest capacity, they are under-utilising it and hanging around longer than they need to, reducing the capacity available for the next data transfers that might be about to start.

2.2. Increase capacity or manage traffic?

Some say the answer is simply for ISPs to invest in more capacity.  Certainly increasing capacity should make the congested periods during data transfers shorter and the non-congested gaps between them longer.  The argument goes that if capacity were large enough it would make the periods when there is a capacity sharing problem insignificant and not worth solving.

Yet ISPs are facing a quandary: traffic is growing rapidly and traffic patterns are changing significantly (see Section 4 and [Cisco-VNI]).  They know that any increases in capacity will have to be paid for by all their customers, but capacity growth will be of most benefit to the heaviest users.  Faced with these problems, some ISPs are seeking to reduce what they regard as "heavy usage" in order to improve the service experienced by the majority of their customers.

If done properly, managing traffic should be a valid alternative to increasing capacity.
An ISP's customers can vote with their feet if the ISP chooses the wrong balance between managing heavy traffic and charging for too much shared capacity.  Current traffic management techniques (Section 3) fight against the capacity shares that TCP is aiming for.  Ironically, they try to impose something approaching LEDBAT-like behaviour on heavier flows.  But as we have seen, they cannot give LEDBAT the credit for doing this itself - the network just sees a LEDBAT flow as a large amount of volume.

Thus the problem for the IETF is to ensure that ISPs and their equipment suppliers have appropriate protocol support - not just to impose good capacity sharing themselves, but to encourage end-to-end protocols to share out capacity in everyone's best interests.

2.2.1. Making Congestion Visible

Unfortunately ISPs are only able to see limited information about the traffic they forward.  As we will see in Section 3, they are forced to use the only information they do have available, which leads to myopic control that has scant regard for the actual impact of the traffic or the underlying network conditions.  All their approaches are unsound because they cannot measure the most useful metric.  The volume or rate of a given flow or aggregate doesn't directly affect other users, but the congestion it causes does.  This can be seen with a simple illustration.  A 5Mbps flow in an otherwise empty 10Mbps bottleneck causes no congestion and so affects no other users.  By contrast a 1Mbps flow entering a 10Mbps bottleneck that is already fully occupied causes significant congestion and impacts every other user sharing that bottleneck, as well as suffering impairment itself.  So the real problem that needs to be addressed is how to close this information gap.  How can we expose congestion at the IP layer so that it can be used as the basis for measuring the impact of any traffic on the network as a whole?

2.2.2. ECN - a Step in the Right Direction

Explicit Congestion Notification (ECN) [RFC3168] allows routers to explicitly tell end-hosts that they are approaching the point of congestion.  ECN builds on active queue management (AQM) mechanisms such as random early detection (RED) [RFC2309] by allowing the router to mark a packet with a Congestion Experienced (CE) codepoint, rather than dropping it.  The probability of a packet being marked increases with the length of the queue, and thus the rate of CE marks is a guide to the level of congestion at that queue.  This CE codepoint travels forward through the network to the receiver, which then informs the sender that it has seen congestion.  The sender is then required to respond as if it had experienced a packet loss.  Because the CE codepoint is visible in the IP layer, this approach reveals the upstream congestion level for a packet.
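As a rough sketch of this mechanism (ours, not taken from [RFC3168] or [RFC2309]; the thresholds and the linear marking rule are simplifying assumptions), the Python fragment below shows how an AQM queue might turn its length into a CE-marking probability, so that the rate of CE marks grows with the level of congestion at that queue.

   import random

   def mark_probability(queue_len, min_th=5, max_th=15, max_p=0.1):
       """Simplified RED-style rule: no marking below min_th packets,
       probability rising linearly to max_p at max_th, certain above that."""
       if queue_len < min_th:
           return 0.0
       if queue_len >= max_th:
           return 1.0
       return max_p * (queue_len - min_th) / (max_th - min_th)

   def forward(packet, queue_len):
       """Mark ECN-capable packets with CE instead of dropping them."""
       if random.random() < mark_probability(queue_len):
           packet["ce"] = True   # the receiver echoes this back to the sender
       return packet

   print(forward({"ce": False}, queue_len=12))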
So is ECN the solution?  Alas not.  ECN does allow downstream nodes to measure the upstream congestion for any flow, but this is not enough.  This can make a receiver accountable for the congestion caused by incoming traffic.  But a receiver can only control incoming congestion indirectly, by politely asking the sender to control it.  A receiver cannot make a sender install an adaptive codec, or install LEDBAT instead of TCP.  And a receiver cannot ask an attacker to stop flooding it with traffic.  What is needed is knowledge of the downstream congestion level, for which you need additional information that is still concealed from the network - by design.

3. Existing Approaches to Traffic Control

Existing approaches intended to address the problems outlined above can be broadly divided into two groups: those that passively monitor traffic and can thus measure the apparent impact of a given flow of packets, and those that can actively discriminate against certain packets, flows, applications or users based on various characteristics or metrics.

3.1. Layer 3 Measurement

L3 measurement of traffic relies on using the information that can be measured directly or is revealed in the IP header of the packet (or lower layers).  Architecturally, L3 measurement is best since it fits with the hourglass design of the Internet [RFC3439], which asserts that "the complexity of the Internet belongs at the edges, and the IP layer of the Internet should remain as simple as possible."

3.1.1. Volume Accounting

Volume accounting is a technique that is often used to discriminate between heavy and light users.  The volume of traffic sent by a given user or network is one of the easiest pieces of information to monitor in a network.  Measuring the size of every packet from the header and adding them up is a simple operation.  Consequently this has long been a favoured measure used by operators to control their customers.

The precise manner in which this volume information is used may vary.  Typically ISPs may impose an overall volume cap on their customers (perhaps 10 Gbytes a month).  Alternatively they may decide that the heaviest users each month are subjected to some sanction.

Volume is naively thought to indicate the impact that one party's traffic has on others.  But the same volume can cause very different impacts on others if it is transferred at slightly different times, or between slightly different endpoints.  Also the impact on others greatly depends on how responsive the transport is to congestion, whether responsive (TCP), very responsive (LEDBAT), aggressive (multiple TCPs) or totally unresponsive.

3.1.2. Rate Measurement

Rate measurements might be thought indicative of the impact of one aggregate of traffic on others, and rate is often limited to avoid impact on others.  However such limits generally constrain everyone much more than they need to, just in case most parties send fast at the same time.  And such limits constrain everyone too little at other times, when everyone actually does send fast at the same time.

The problem with measuring rate is that it doesn't say how much of the shared capacity a sender occupies over time, nor whether the high rate of one user comes at times when others want a high rate.

3.2. Higher Layer Discrimination

Over recent years a number of traffic management techniques have emerged that explicitly differentiate between different traffic types, applications and even users.  This is done because ISPs and operators feel they need such techniques to better control a new raft of applications that break some of the implicit design assumptions behind TCP (short-lived flows, limited flows per connection, generally between server and client).
3.2.1. Bottleneck Rate Policing

Bottleneck flow rate policers such as [XCHOKe] and [pBox] have been proposed as approaches for rate policing traffic.  But they must be deployed at bottlenecks in order to work.  Unfortunately, capacity sharing is not only about the congestion-responsive behaviour of each flow, but also about how long the flows occupy the capacity and the combined total of multiple flows.  Such rate policers also make an assumption about what constitutes acceptable per-flow behaviour.  If these bottleneck policers were widely deployed, the Internet could find itself with one universal rate adaptation policy embedded throughout the network.  With TCP's congestion control algorithm approaching its scalability limits as network bandwidth continues to increase, new algorithms are being developed for high-speed congestion control.  Embedding assumptions about acceptable rate adaptation would make evolution to such new algorithms extremely painful.

3.2.2. DPI and Application Rate Policing

Some operators use deep packet inspection (DPI) and traffic analysis to identify certain applications they believe to have an excessive impact on the network.  ISPs generally pick on applications that they judge as low value to the customer in question and high impact on other customers.  A common example is peer-to-peer file-sharing.  Having identified a flow as belonging to such an application, the operator uses differential scheduling to limit the impact of that flow on others, which usually limits its throughput as well.  This has fuelled the on-going battle between application developers and DPI vendors.

When operators first started to limit the throughput of P2P, it soon became common knowledge that turning on encryption could boost your throughput.  The DPI vendors then improved their equipment so that it could identify P2P traffic by the pattern of packets it sends.  This risks becoming an endless vicious cycle - an arms race that neither side can win.  Furthermore such techniques may put the operator in direct conflict with its customers, regulators and content providers.

4. Why Now?

The accountability and capacity sharing problems highlighted so far have always characterised the Internet to some extent.  In 1988 Van Jacobson coded capacity sharing into TCP's e2e congestion control algorithms [TCPcc].  But fair queuing algorithms were already being written for network operators to ensure each active user received an equal share of a link and couldn't game the system [RFC0970].  The two approaches have divergent objectives, but they have co-existed ever since.

The main new factor has been the introduction of residential broadband, making 'always-on' available to all, not just campuses and enterprises.  Neither TCP nor approaches like fair queuing take account of how much of each user's data is occupying a link over time, which can significantly reduce the capacity available to lighter usage.  Therefore residential ISPs have been introducing new traffic management equipment that can prioritise based on each customer's usage volume, e.g. [Comcast].  Otherwise capacity upgrades get eaten up by transfers of large amounts of data, with little gain for interactive usage [BB-Incentive].
In campus networks, capacity upgrades are the easiest way to mitigate the inability of TCP or FQ to take account of activity over time.  But capacity upgrades are much more expensive in residential broadband networks that are spread over large geographic areas, and customers will only be happy to pay more for their service if the majority can see a significant benefit.

However, these traffic management techniques fight against the capacity shares that e2e protocols are aiming for, rather than working with them.  And the more optimal ISPs try to make their controls, the more they need application knowledge within the network - which isn't how the Internet was designed to work.  Congestion exposure hasn't been considered before, because the depth of the problem has only recently been understood.  We now understand that both networks and end-systems need to focus on contribution to congestion, not volume or rate.  Then application knowledge is only needed on the end-system, where it should be.  But the reason this isn't happening is that the network cannot see the information it needs (congestion).

As long as ISPs continue to use rate and volume as the key metrics for determining when to control traffic, there is no incentive to use LEDBAT or other low-congestion protocols to improve the performance of competing interactive traffic.  We believe that congestion exposure gives ISPs the information they need to be able to discriminate in favour of such low-congestion transports.  In turn this will give users a direct benefit from using such transports and so encourage their wider use.

5. Requirements for a Solution

This section proposes some requirements for any solution to this problem.  We believe that a solution that meets most of these requirements is likely to be better than one that doesn't, but we recognise that if a working group is established in this area, it may have to make tradeoffs.

o  Allow both upstream and downstream congestion to be visible at the IP layer -- visibility at the IP layer allows congestion in the heart of the network to be monitored at the edges without deploying complicated and intrusive equipment such as DPI boxes.  This gives several advantages:

   1.  It enables bulk policing of traffic based on the congestion it is actually going to cause in the network.

   2.  It allows the amount of congestion across ISP borders to be monitored.

   3.  It supports a diversity of intra-domain and inter-domain congestion management practices.

   4.  It allows the contribution to congestion over time to be counted as easily as volume can be counted today.

   5.  It supports contractual arrangements for managing traffic (acceptable use policies, SLAs, etc.) between just the two parties exchanging traffic across their point of attachment, without involving others.

o  Avoid making assumptions about the behaviour of specific applications (i.e. be agnostic to application and transport behaviour).

o  Support the widest possible range of transport protocols for the widest range of data types (elastic, inelastic, real-time, background, etc.) -- don't force a "universal rate adaptation policy" such as TCP-friendliness [RFC3448].

o  Be responsive to real-time congestion in the network.
o  Allow incremental deployment of the solution, and ideally design for permanent partial deployment, to increase the chances of successful deployment.

o  Ensure packets supporting congestion exposure are distinguishable from others, so that each transport can control when it chooses to deploy congestion exposure, and ISPs can manage the two types of traffic distinctly.

o  Support mechanisms that ensure the integrity of congestion notifications, thus making it hard for a user or network to distort the congestion signal.

o  Be robust in the face of DoS attacks, so that congestion information can be used to identify and limit DoS traffic and to protect the hosts and network elements implementing congestion exposure.

Many of these requirements are by no means unique to the problem of congestion exposure.  Incremental deployment, for instance, is a critical requirement for any new protocol that affects something as fundamental as IP.  Being robust under attack is also a pre-requisite for any protocol to succeed in the real Internet; this is covered in more detail in Section 9.

6. A Strawman Congestion Exposure Protocol

In this section we explore a simple strawman protocol that would solve the congestion exposure problem.  This protocol neatly illustrates how a solution might work.  A practical implementation of this protocol has been produced, and both simulations and real-life testing show that it works.  The protocol is based on a concept known as re-feedback [Re-fb] and builds on existing active queue management techniques like RED [RFC2309] and ECN [RFC3168] that network elements can already use to measure and expose congestion.

Re-feedback, standing for re-inserted feedback, is a system designed to allow end-hosts to reveal to the network information about their network path that they have received via conventional feedback (for instance congestion).

In our strawman protocol we imagine that packets have two "congestion" fields in their IP header:

o  The first is a congestion experienced field to record the upstream congestion level along the path.  Routers indicate their current congestion level by updating this field in every packet.  As the packet traverses the network it builds up a record of the overall congestion along its path in this field.  This data is sent back to the sender, who uses it to determine its transmission rate.

o  The other is a whole-path congestion field that uses re-feedback to record the total congestion along the path.  The sender does this by re-inserting the current congestion level for the path into this field for every packet it transmits.

Thus at any node downstream of the sender you can see the upstream congestion for the packet (the congestion thus far) and the whole-path congestion (with a time lag of 1 RTT), and can calculate the downstream congestion by subtracting one from the other.

So congestion exposure can be achieved by coupling congestion notification from routers with the re-insertion of this information by the sender.  This establishes information symmetry between users and network providers.
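The Python sketch below is our illustration of this arithmetic only, not a wire format; the field names and the additive marking model are simplifying assumptions.  It shows the sender re-inserting its whole-path estimate, routers accumulating upstream congestion, and any node on the path recovering rest-of-path (downstream) congestion by subtraction.

   from dataclasses import dataclass

   @dataclass
   class Packet:
       upstream: float = 0.0    # "congestion experienced" accumulated so far
       whole_path: float = 0.0  # re-inserted by the sender (re-feedback)

   def send(path_estimate):
       """The sender re-inserts its latest whole-path congestion estimate,
       learned from the receiver's feedback roughly one RTT earlier."""
       return Packet(upstream=0.0, whole_path=path_estimate)

   def router(pkt, local_congestion):
       """Each router adds its current congestion level to the upstream
       field (a simplification of probabilistic CE-style marking)."""
       pkt.upstream += local_congestion
       return pkt

   def downstream_at(pkt):
       """Any node can estimate rest-of-path congestion by subtraction."""
       return max(pkt.whole_path - pkt.upstream, 0.0)

   # A path of three routers currently running at 1%, 0.5% and 2% congestion.
   pkt = send(path_estimate=0.035)
   pkt = router(pkt, 0.010)
   pkt = router(pkt, 0.005)
   print(f"rest-of-path seen mid-path: {downstream_at(pkt):.1%}")    # ~2.0%
   pkt = router(pkt, 0.020)
   print(f"rest-of-path at the receiver: {downstream_at(pkt):.1%}")  # ~0.0%

Note that the subtraction only tells the truth if the sender's re-inserted estimate is honest, which is one reason the integrity mechanisms discussed in Section 9 matter.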
7. Use Cases

Once downstream congestion information is revealed in the IP header it can be used for a number of purposes.  Precise details of how the information might be used are beyond the scope of this document, but this section gives an overview of some possible uses.  {ToDo: write up the rest of this section properly.  Concentrate on a couple of the most useful potential use cases (traffic management and accountability?) and mention a couple of more arcane uses (traffic engineering and e2e QoS).  The key thing is to clarify that Congestion Exposure is a tool that can be used for many other things...}

It allows an ISP to accurately identify which traffic is having the greatest impact on the network and either police directly on that basis or use it to determine which users should be policed.  It can form the basis of inter-domain contracts between operators.  It could even be used as the basis for inter-domain routing, thus encouraging operators to invest appropriately in improving their infrastructure.

From Rich Woundy: "I would add a section about use cases.  The primary use case would seem to be an "incentive environment that ensures optimal sharing of capacity", although that could use a better title.  Other use cases may include "DDoS mitigation", "end-to-end QoS", "traffic engineering", and "inter-provider service monitoring".  (You can see I am stealing liberally from the motivation draft here.  We'll have to see whether the other use cases are "core" to this group, or "freebies" that come along with re-ECN as a particular protocol.)"

My take on this is we need to concentrate on one or two major use cases.  The most obvious one is using this to control user behaviour and encourage the use of "congestion friendly" protocols such as LEDBAT.

{Comments from Louise Krug:} Simply say that operators must turn off any kind of rate limitation for LEDBAT traffic, and what that might mean for the amount of bandwidth they see compared to a throttled customer?  You could then extend that to say how it leads to better QoS differentiation under the assumption that there is a broad traffic mix anyway?  Not sure how much detail you want to go into here though?

{ToDo: better incorporate this text from Mirja into Michael's text below.}  Congestion exposure can enable ISPs to give end-systems an incentive to respond to congestion in a way that leads to a better share of the available capacity.  For example, the introduction of a per-user congestion volume might motivate heavy users to back off with their high-bandwidth traffic (when congestion occurs) to save their congestion volume for more time-critical traffic.  If every end-system reacts to congestion in such a way that it avoids congestion for non-critical traffic and allows a certain level of congestion for the more important traffic (from the user's point of view), the overall user experience will improve.  Moreover, the network might be utilised more evenly when less important traffic is shifted to less congested time slots.

7.1. Improved Policing

As described earlier in this document, ISPs throttle traffic not because it causes congestion in the network but because users have exceeded their traffic profile or because individual applications or flows are suspected of causing congestion.  This is done because it is not possible to police only the traffic that is causing congestion.  Congestion exposure allows new possibilities for rate policing.
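As a rough illustration of the kind of policer this makes possible (a sketch under our own assumptions; the allowance figures and interface are hypothetical, loosely anticipating the per-customer allowance of Section 7.1.2), congestion-volume can be metered with a simple token bucket that is drained by bytes sent multiplied by the downstream congestion they declare.

   class CongestionPolicer:
       """Hypothetical per-customer congestion-volume policer (token
       bucket).  allowance_bps is an allowance of congestion-volume in
       bytes per second: bytes sent times the downstream congestion
       they declare."""

       def __init__(self, allowance_bps, burst):
           self.allowance_bps = allowance_bps
           self.burst = burst
           self.bucket = burst

       def tick(self, seconds):
           """Refill the allowance as time passes."""
           self.bucket = min(self.burst,
                             self.bucket + self.allowance_bps * seconds)

       def packet(self, size_bytes, downstream_congestion):
           """Charge the packet by the congestion it declares it will
           cause; return False once the customer should be throttled."""
           self.bucket -= size_bytes * downstream_congestion
           return self.bucket >= 0

   policer = CongestionPolicer(allowance_bps=1_000, burst=10_000)
   policer.tick(1.0)
   # The same 1500-byte packet costs 200 times more at 2% congestion than
   # at 0.01%, so congestion-avoiding (LEDBAT-like) traffic is barely charged.
   print(policer.packet(1500, 0.0001), policer.packet(1500, 0.02))

A transfer that hardly ever sees congestion drains such a bucket far more slowly than the same volume pushed through congested bottlenecks, which is exactly the incentive described in the rest of this section.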
7.1.1. Per Aggregate Policing

A straightforward application of congestion exposure is per-flow or per-aggregate congestion policing.  Instead of limiting flows or aggregates because they have exceeded certain rate thresholds, they can be throttled if they cause too much congestion in the network.  This is throttling on evidence instead of suspicion.

7.1.2. Per Customer Policing

The assumption is that every customer has an allowance of congestion per second.  If a customer causes more congestion than this throughout the network, their traffic can be policed or shaped to ensure they stay within their allowance.  The nice features of this approach are that it sets incentives for the use of congestion-minimising transport protocols such as LEDBAT and allows tariffs that better reflect the relative impact of customers on each other.

Incentives for congestion-minimising transports:  A user generates foreground and background traffic.  Foreground traffic needs to go fast while background traffic can afford to go slow.  With per-customer congestion policing, users can optimise their network experience by using congestion-minimising transport protocols for background traffic and normal TCP-like or even high-speed transport protocols for foreground traffic.  Doing so means background traffic only causes minimal congestion, so foreground traffic can go faster than when both were transmitted over the same transport protocols.  Hence, per-customer congestion policing sets incentives for selfish users to utilise congestion-minimising transport protocols.

Improved tariff structures:  Currently customers are offered tariffs with all manner of differentiators, from peak access rate to volume limit and even specific application rate limits.  Congestion policing offers a better means of distinguishing between tariffs.  Heavy users and light users will get equal access in terms of speed and short-term throughput, but customers that cause more congestion and thus have a bigger impact on others will have to pay for the privilege or suffer reduced throughput during periods of heavy congestion.  However tariffs are a subject best left to the market to determine, not the IETF.

8. IANA Considerations

This document makes no request to IANA.

9. Security Considerations

One intended use of exposed congestion information is to hold the e2e transport and the network accountable to each other.  Therefore, any congestion exposure protocol will have to provide the necessary hooks to mechanisms that can assure the integrity of this information.  The network cannot be relied on to report information to the receiver against its interest, and the same applies to the information the receiver feeds back to the sender, and that the sender reports back to the network.  Looking at each in turn:

o  The Network.  In general it is not in any network's interest to under-declare congestion, since this will have potentially negative consequences for all users of that network.  It may be in its interest to over-declare congestion if, for instance, it wishes to force traffic to move away to a different network or indeed simply wants to reduce the amount of traffic it is carrying.
   Congestion Exposure itself shouldn't significantly alter the incentives for and against honest declaration of congestion by a network, but it is possible to imagine applications of Congestion Exposure that would change these incentives.  There is a general perception among networks that their level of congestion is a business secret.  Actually, in the Internet architecture congestion is one of the worst-kept secrets a network has, because end-hosts can see congestion better than networks can.  Nonetheless, one goal of a congestion exposure protocol is to allow networks to pinpoint whether congestion is on one side or the other of a border.  Although this extra transparency should be good for ISPs with low congestion, those with underprovisioned networks may try to obstruct deployment.

o  The Receiver.  Receivers generally have an incentive to under-declare congestion, since they generally wish to receive the data from the sender as rapidly as possible.  [Savage] explains how a receiver can significantly improve its throughput by failing to declare congestion.  This is a problem with or without Congestion Exposure.  [KGao] explains one possible technique to encourage receivers to be honest in their declaration of congestion.

o  The Sender.  One proposed mechanism for congestion exposure adds a requirement for a sender to let the network know how much congestion it has suffered or caused.  Although most senders currently respond to congestion they are informed of, one use of exposed congestion information might be to encourage sources of excessive congestion to respond more than previously.  Then clearly there may be an incentive for the sender to under-declare congestion.  This will be a particular problem with sources of flooding attacks.

In addition there are potential problems from source spoofing.  A malicious sender can pretend to be another user by spoofing the source address.  A congestion exposure protocol will need to be robust against the injection of false congestion information into the forward path that could distort or disrupt the integrity of the congestion signal.

10. Conclusions

Congestion exposure is the idea that traffic itself indicates to all nodes on its path how much congestion it causes on the entire path.  It enables network operators to police traffic only when it really causes congestion in the Internet, instead of doing blind rate capping independently of the congestion situation.  This change would give users incentives to adopt new transport protocols such as LEDBAT which try to avoid congestion more than TCP does.  Requirements for congestion exposure in the IP header were summarised, one technical solution was presented, and additional use cases for congestion exposure were discussed.

11. Acknowledgements

A number of people other than the authors have provided text and comments for this memo.  The document is being produced in support of a BoF on Congestion Exposure as discussed extensively on the mailing list.

12. Informative References

[BB-Incentive]  MIT Communications Futures Program (CFP) and Cambridge University Communications Research Network, "The Broadband Incentive Problem", September 2005.
[CC-open-research]  Welzl, M., Scharf, M., Briscoe, B., and D. Papadimitriou, "Open Research Issues in Internet Congestion Control", draft-irtf-iccrg-welzl-congestion-control-open-research-05 (work in progress), September 2009.

[Cisco-VNI]  Cisco Systems, Inc., "Cisco Visual Networking Index: Forecast and Methodology, 2008-2013", June 2009.

[Comcast]  Bastian, C., Klieber, T., Livingood, J., Mills, J., and R. Woundy, "Comcast's Protocol-Agnostic Congestion Management System", draft-livingood-woundy-congestion-mgmt-03 (work in progress), February 2010.

[KGao]  Gao, K. and C. Wang, "Incrementally Deployable Prevention to TCP Attack with Misbehaving Receivers", December 2004.

[LEDBAT]  Shalunov, S., "Low Extra Delay Background Transport (LEDBAT)", draft-ietf-ledbat-congestion-00 (work in progress), October 2009.

[RFC0970]  Nagle, J., "On packet switches with infinite storage", RFC 970, December 1985.

[RFC2309]  Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S., Wroclawski, J., and L. Zhang, "Recommendations on Queue Management and Congestion Avoidance in the Internet", RFC 2309, April 1998.

[RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001.

[RFC3439]  Bush, R. and D. Meyer, "Some Internet Architectural Guidelines and Philosophy", RFC 3439, December 2002.

[RFC3448]  Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP Friendly Rate Control (TFRC): Protocol Specification", RFC 3448, January 2003.

[RFC5594]  Peterson, J. and A. Cooper, "Report from the IETF Workshop on Peer-to-Peer (P2P) Infrastructure, May 28, 2008", RFC 5594, July 2009.

[Re-fb]  Briscoe, B., Jacquet, A., Di Cairano-Gilfedder, C., Salvatori, A., Soppera, A., and M. Koyabe, "Policing Congestion Response in an Internetwork Using Re-Feedback", ACM SIGCOMM CCR 35(4) 277--288, August 2005.

[Savage]  Savage, S., Wetherall, D., and T. Anderson, "TCP Congestion Control with a Misbehaving Receiver", ACM SIGCOMM Computer Communication Review, 1999.

[TCPcc]  Jacobson, V. and M. Karels, "Congestion Avoidance and Control", Proc. ACM SIGCOMM'88 Symposium, Computer Communication Review 18(4) 314--329, August 1988.

[XCHOKe]  Chhabra, P., Chuig, S., Goel, A., John, A., Kumar, A., Saran, H., and R. Shorey, "XCHOKe: Malicious Source Control for Congestion Avoidance at Internet Gateways", Proceedings of IEEE International Conference on Network Protocols (ICNP-02), November 2002.

[pBox]  Floyd, S. and K. Fall, "Promoting the Use of End-to-End Congestion Control in the Internet", IEEE/ACM Transactions on Networking 7(4) 458--472, August 1999.
Authors' Addresses

Toby Moncaster
BT
B54/70, Adastral Park
Martlesham Heath
Ipswich  IP5 3RE
UK

Phone: +44 7918 901170
EMail: toby.moncaster@bt.com

Louise Krug
BT
B54/77, Adastral Park
Martlesham Heath
Ipswich  IP5 3RE
UK

EMail: louise.burness@bt.com

Michael Menth
University of Wuerzburg
Room B206, Institute of Computer Science
Am Hubland
Wuerzburg  D-97074
Germany

Phone: +49 931 888 6644
EMail: menth@informatik.uni-wuerzburg.de

Joao Taveira Araujo
UCL
GS206, Department of Electronic and Electrical Engineering
Torrington Place
London  WC1E 7JE
UK

EMail: j.araujo@ee.ucl.ac.uk

Steven Blake
Extreme Networks
Pamlico Building One, Suite 100
3306/08 E. NC Hwy 54
RTP, NC 27709
US

EMail: sblake@extremenetworks.com

Richard Woundy (editor)
Comcast
Comcast Cable Communications
27 Industrial Avenue
Chelmsford, MA 01824
US

EMail: richard_woundy@cable.comcast.com
URI:   http://www.comcast.com