Internet Engineering Task Force                    Sally Floyd, Editor
INTERNET DRAFT                                                   ACIRI
draft-floyd-cong-04.txt                                      June 2000
Expires: December 2000

                     Congestion Control Principles

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time.
It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

1. Introduction.

The goal of this document is to explain the need for congestion control in the Internet, and to discuss what constitutes correct congestion control. One specific goal is to illustrate the dangers of neglecting to apply proper congestion control. A second goal is to discuss the role of the IETF in standardizing new congestion control protocols.

This document draws heavily from earlier RFCs, in some cases reproducing entire sections of the text of earlier documents [RFC2309, RFC2357]. We have also borrowed heavily from earlier publications addressing the need for end-to-end congestion control [FF99].

2. Current standards on congestion control

IETF standards concerning end-to-end congestion control focus either on specific protocols (e.g., TCP [RFC2581], reliable multicast protocols [RFC2357]) or on the syntax and semantics of communications between the end nodes and routers about congestion information (e.g., Explicit Congestion Notification [RFC2481]) or desired quality-of-service (diff-serv). The role of end-to-end congestion control is also discussed in an Informational RFC on "Recommendations on Queue Management and Congestion Avoidance in the Internet" [RFC2309]. RFC 2309 recommends the deployment of active queue management mechanisms in routers, and the continuation of design efforts towards mechanisms in routers to deal with flows that are unresponsive to congestion notification. We freely borrow from RFC 2309 some of its general discussion of end-to-end congestion control.
In contrast to the RFCs discussed above, this document is a more general discussion of the principles of congestion control. One of the keys to the success of the Internet has been the congestion avoidance mechanisms of TCP. While TCP is still the dominant transport protocol in the Internet, it is not ubiquitous, and there is an increasing number of applications that, for one reason or another, choose not to use TCP. Such traffic includes not only multicast traffic, but also unicast traffic such as streaming multimedia that does not require reliability, and traffic such as DNS or routing messages that consists of short transfers deemed critical to the operation of the network. Much of this traffic does not use any form of either bandwidth reservation or end-to-end congestion control. The continued use of end-to-end congestion control by best-effort traffic is critical for maintaining the stability of the Internet.

This document also discusses the general role of the IETF in the standardization of new congestion control protocols.

Congestion control principles for differentiated services or integrated services are not addressed in this document. Some categories of integrated or differentiated services include a guarantee by the network of end-to-end bandwidth, and as such do not require end-to-end congestion control mechanisms.

3. The development of end-to-end congestion control.

3.1. Preventing congestion collapse.

The Internet protocol architecture is based on a connectionless end-to-end packet service using the IP protocol. The advantages of its connectionless design, flexibility and robustness, have been amply demonstrated. However, these advantages are not without cost: careful design is required to provide good service under heavy load.
In fact, lack of attention to the dynamics of packet forwarding can result in severe service degradation or "Internet meltdown". This phenomenon was first observed during the early growth phase of the Internet in the mid 1980s [RFC896], and is technically called "congestion collapse".

The original specification of TCP [RFC793] included window-based flow control as a means for the receiver to govern the amount of data sent by the sender. This flow control was used to prevent overflow of the receiver's data buffer space available for that connection. [RFC793] reported that segments could be lost due either to errors or to network congestion, but did not include dynamic adjustment of the flow-control window in response to congestion.

The original fix for Internet meltdown was provided by Van Jacobson. Beginning in 1986, Jacobson developed the congestion avoidance mechanisms that are now required in TCP implementations [Jacobson88, RFC2581]. These mechanisms operate in the hosts to cause TCP connections to "back off" during congestion. We say that TCP flows are "responsive" to congestion signals (i.e., dropped packets) from the network. It is these TCP congestion avoidance algorithms that prevent the congestion collapse of today's Internet.

However, that is not the end of the story. Considerable research has been done on Internet dynamics since 1988, and the Internet has grown. It has become clear that the TCP congestion avoidance mechanisms [RFC2581], while necessary and powerful, are not sufficient to provide good service in all circumstances. In addition to the development of new congestion control mechanisms [RFC2357], router-based mechanisms are in development that complement the endpoint congestion avoidance mechanisms.
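The "back off" behavior described above can be sketched in simplified form. The following is an illustrative model only, not the full algorithm of [RFC2581] (which adds slow-start, retransmission timeouts, and fast retransmit/recovery); the class and its parameter names are hypothetical:

```python
# Simplified sketch of a TCP-style "responsive" flow: the congestion
# window grows by one segment per loss-free round trip and is halved
# when the network signals congestion via a dropped packet.

class ResponsiveFlow:
    def __init__(self):
        self.cwnd = 1.0  # congestion window, in segments

    def on_rtt(self, congestion_signal):
        if congestion_signal:                     # e.g., a packet drop
            self.cwnd = max(1.0, self.cwnd / 2)   # multiplicative decrease
        else:
            self.cwnd += 1.0                      # additive increase

flow = ResponsiveFlow()
for _ in range(10):
    flow.on_rtt(congestion_signal=False)
print(flow.cwnd)   # 11.0 after ten loss-free round trips
flow.on_rtt(congestion_signal=True)
print(flow.cwnd)   # 5.5: the flow "backs off" when congestion is signaled
```

It is this halving in response to congestion, applied by every flow, that keeps the aggregate load on a congested link from growing without bound.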
A major issue that still needs to be addressed is the potential for future congestion collapse of the Internet due to flows that do not use responsible end-to-end congestion control. RFC 896 [RFC896] suggested in 1984 that gateways should detect and `squelch' misbehaving hosts: "Failure to respond to an ICMP Source Quench message, though, should be regarded as grounds for action by a gateway to disconnect a host. Detecting such failure is non-trivial but is a worthwhile area for further research." Current papers still propose that routers detect and penalize flows that are not employing acceptable end-to-end congestion control [FF99].

3.2. Fairness

In addition to a concern about congestion collapse, there is a concern about `fairness' for best-effort traffic. Because TCP "backs off" during congestion, a large number of TCP connections can share a single, congested link in such a way that bandwidth is shared reasonably equitably among similarly situated flows. The equitable sharing of bandwidth among flows depends on the fact that all flows are running compatible congestion control algorithms. For TCP, this means congestion control algorithms conformant with the current TCP specification [RFC793, RFC1122, RFC2581].

The issue of fairness among competing flows has become increasingly important for several reasons. First, using window scaling [RFC1323], individual TCPs can use high bandwidth even over high-propagation-delay paths. Second, with the growth of the web, Internet users increasingly want high-bandwidth and low-delay communications, rather than the leisurely transfer of a long file in the background. The growth of best-effort traffic that does not use TCP underscores this concern about fairness between competing best-effort traffic in times of congestion.
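The window-scaling point above can be made concrete: a sender can keep at most one window of data in flight per round-trip time, so sustaining rate B over a path with round-trip time R requires a window of roughly B*R bytes. A small sketch with illustrative numbers:

```python
# Why window scaling [RFC1323] matters on high bandwidth-delay paths:
# a sender can have at most one window of data in flight per RTT, so
# achievable rate <= window / RTT.

def window_needed(bandwidth_bps, rtt_s):
    """Bytes of window needed to fill a path of the given bandwidth and RTT."""
    return bandwidth_bps / 8 * rtt_s

# A 100 Mbps path with a 100 ms RTT needs a window of about 1.25 MB...
need = window_needed(100e6, 0.100)
print(int(need))                       # 1250000 bytes

# ...but without window scaling, TCP's 16-bit window field tops out
# at 65535 bytes, capping the achievable rate on that path:
max_unscaled = 65535
print(max_unscaled / 0.100 * 8 / 1e6)  # roughly 5.2 Mbps
```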
The popularity of the Internet has caused a proliferation in the number of TCP implementations. Some of these may fail to implement the TCP congestion avoidance mechanisms correctly because of poor implementation [RFC2525]. Others may deliberately be implemented with congestion avoidance algorithms that are more aggressive in their use of bandwidth than other TCP implementations; this would allow a vendor to claim to have a "faster TCP". The logical consequence of such implementations would be a spiral of increasingly aggressive TCP implementations, or increasingly aggressive transport protocols, leading back to the point where there is effectively no congestion avoidance and the Internet is chronically congested.

There is a well-known way to achieve more aggressive performance without even changing the transport protocol, by changing the level of granularity: open multiple connections to the same place, as has been done in the past by some Web browsers. Thus, instead of a spiral of increasingly aggressive transport protocols, we would instead have a spiral of increasingly aggressive web browsers, or increasingly aggressive applications.

This raises the issue of the appropriate granularity of a "flow", where we define a `flow' as the level of granularity appropriate for the application of both fairness and congestion control. From RFC 2309: "There are a few `natural' answers: 1) a TCP or UDP connection (source address/port, destination address/port); 2) a source/destination host pair; 3) a given source host or a given destination host. We would guess that the source/destination host pair gives the most appropriate granularity in many circumstances. The granularity of flows for congestion management is, at least in part, a policy question that needs to be addressed in the wider IETF community."
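The three `natural' granularities quoted above can be viewed as alternative aggregation keys under which packets are grouped for fairness and congestion control. A small sketch (the packet field names are hypothetical):

```python
# Three possible definitions of a "flow", following the RFC 2309 quote:
# packets sharing a key belong to the same flow.

def connection_key(pkt):   # 1) a TCP or UDP connection (the 4-tuple)
    return (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"])

def host_pair_key(pkt):    # 2) a source/destination host pair
    return (pkt["src"], pkt["dst"])

def dest_host_key(pkt):    # 3) a given destination host
    return pkt["dst"]

pkts = [
    {"src": "10.0.0.1", "sport": 1025, "dst": "10.0.0.2", "dport": 80},
    {"src": "10.0.0.1", "sport": 1026, "dst": "10.0.0.2", "dport": 80},
]
# Two parallel connections, but a single flow at host-pair granularity:
print(len({connection_key(p) for p in pkts}))  # 2
print(len({host_pair_key(p) for p in pkts}))   # 1
```

At host-pair granularity, a browser opening many parallel connections gains no congestion-control advantage over a single connection, which is one motivation for that choice of granularity.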
Again borrowing from RFC 2309, we use the term "TCP-compatible" for a flow that behaves under congestion like a flow produced by a conformant TCP. A TCP-compatible flow is responsive to congestion notification, and in steady-state uses no more bandwidth than a conformant TCP running under comparable conditions (drop rate, RTT, MTU, etc.).

It is convenient to divide flows into three classes: (1) TCP-compatible flows, (2) unresponsive flows, i.e., flows that do not slow down when congestion occurs, and (3) flows that are responsive but are not TCP-compatible. The last two classes contain more aggressive flows that pose significant threats to Internet performance, as we discuss below.

In addition to steady-state fairness, the fairness of the initial slow-start is also a concern. One concern is the transient effect on other flows of a flow with an overly aggressive slow-start procedure. Slow-start performance is particularly important for the many flows that are short-lived and have only a small amount of data to transfer.

3.3. Optimizing performance regarding throughput, delay, and loss.

In addition to the prevention of congestion collapse and concerns about fairness, a third reason for a flow to use end-to-end congestion control can be to optimize its own performance regarding throughput, delay, and loss. In some circumstances, for example in environments of high statistical multiplexing, the delay and loss rate experienced by a flow are largely independent of its own sending rate. However, in environments with lower levels of statistical multiplexing or with per-flow scheduling, the delay and loss rate experienced by a flow are in part a function of the flow's own sending rate. Thus, a flow can use end-to-end congestion control to limit the delay or loss experienced by its own packets.
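The notion above of using "no more bandwidth than a conformant TCP running under comparable conditions" is often made quantitative with a rough steady-state model of TCP throughput, the widely cited formula of Mathis et al. (a deterministic approximation, quoted here for illustration rather than as part of this document's recommendations):

```latex
% Rough steady-state throughput T of a conformant TCP, in bytes per
% second, for segment size MSS, round-trip time R, and packet drop
% rate p:
T \approx \frac{\mathit{MSS}}{R}\,\sqrt{\frac{3}{2p}}
```

A flow whose steady-state sending rate stays at or below this bound for the path's measured R and p is, in this rough sense, TCP-compatible.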
We would note, however, that in an environment like the current best-effort Internet, concerns regarding congestion collapse and fairness with competing flows limit the range of congestion control behaviors available to a flow.

4. The role of the standards process

The standardization of a transport protocol includes not only standardization of aspects of the protocol that could affect interoperability (e.g., information exchanged by the end-nodes), but also standardization of mechanisms deemed critical to performance (e.g., in TCP, reduction of the congestion window in response to a packet drop). At the same time, implementation-specific details and other aspects of the transport protocol that do not affect interoperability and do not significantly interfere with performance do not require standardization. Areas of TCP that do not require standardization include the details of TCP's Fast Recovery procedure after a Fast Retransmit [RFC2582]. The appendix uses examples from TCP to discuss in more detail the role of the standards process in the development of congestion control.

4.1. The development of new transport protocols.

In addition to addressing the danger of congestion collapse, the standardization process for new transport protocols takes care to avoid a congestion control `arms race' among competing protocols. As an example, in RFC 2357 [RFC2357] the TSV Area Directors and their Directorate outline criteria for the publication as RFCs of Internet-Drafts on reliable multicast transport protocols. From [RFC2357]: "A particular concern for the IETF is the impact of reliable multicast traffic on other traffic in the Internet in times of congestion, in particular the effect of reliable multicast traffic on competing TCP traffic....
The challenge to the IETF is to encourage research and implementations of reliable multicast, and to enable the needs of applications for reliable multicast to be met as expeditiously as possible, while at the same time protecting the Internet from the congestion disaster or collapse that could result from the widespread use of applications with inappropriate reliable multicast mechanisms."

The list of technical criteria that must be addressed by RFCs on new reliable multicast transport protocols includes the following: "Is there a congestion control mechanism? How well does it perform? When does it fail? Note that congestion control mechanisms that operate on the network more aggressively than TCP will face a great burden of proof that they don't threaten network stability."

It is reasonable to expect that these concerns about the effect of new transport protocols on competing traffic will apply not only to reliable multicast protocols, but to unreliable unicast, reliable unicast, and unreliable multicast traffic as well.

4.2. Application-level issues that affect congestion control

The specific issue of a browser opening multiple connections to the same destination has been addressed by RFC 2616 [RFC2616], which states in Section 8.1.4 that "Clients that use persistent connections SHOULD limit the number of simultaneous connections that they maintain to a given server. A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy."

4.3. New developments in the standards process

The most obvious developments in the IETF that could affect the evolution of congestion control are the development of integrated and differentiated services [RFC2212, RFC2475] and of Explicit Congestion Notification (ECN) [RFC2481]. However, other less dramatic developments are likely to affect congestion control as well.
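The ECN mechanism [RFC2481] mentioned above can be sketched at the router. This is a simplified model with hypothetical thresholds; real deployments use RED-style average queue lengths rather than instantaneous ones, and later specifications define the codepoints in detail:

```python
# Simplified router behavior with ECN [RFC2481]: under moderate
# congestion, mark ECN-capable packets instead of dropping them;
# when the buffer is full there is no choice but to drop.

QUEUE_LIMIT = 100      # packets the buffer can hold
MARK_THRESHOLD = 50    # hypothetical onset of "moderate congestion"

def enqueue(queue_len, ecn_capable):
    """Return the fate of an arriving packet: 'enqueue', 'mark', or 'drop'."""
    if queue_len >= QUEUE_LIMIT:
        return "drop"                        # buffer overflow: must drop
    if queue_len >= MARK_THRESHOLD:
        return "mark" if ecn_capable else "drop"
    return "enqueue"

print(enqueue(10, True))    # enqueue
print(enqueue(60, True))    # mark: congestion signaled without loss
print(enqueue(60, False))   # drop: a non-ECN flow still sees a drop
print(enqueue(100, True))   # drop: under heavy congestion, marking cannot help
```

The last case is the point made later in this document: ECN replaces drops with marks only in times of moderate congestion, so it does not by itself remove the need for end-to-end congestion control.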
One such effort is the development of Endpoint Congestion Management [BS00], which enables multiple concurrent flows from a sender to the same receiver to share congestion control state. By allowing multiple connections to the same destination to act as one flow in terms of end-to-end congestion control, a Congestion Manager could allow individual connections slow-starting to take advantage of previous information about the congestion state of the end-to-end path. Further, the use of a Congestion Manager could remove the congestion control dangers of multiple flows being opened between the same source/destination pair, and could perhaps be used to allow a browser to open many simultaneous connections to the same destination.

5. A description of congestion collapse

This section discusses congestion collapse from undelivered packets in some detail, and shows how unresponsive flows could contribute to congestion collapse in the Internet. This section draws heavily on material from [FF99].

Informally, congestion collapse occurs when an increase in the network load results in a decrease in the useful work done by the network. As discussed in Section 3, congestion collapse was first reported in the mid 1980s [RFC896], and was largely due to TCP connections unnecessarily retransmitting packets that were either in transit or had already been received at the receiver. We call the congestion collapse that results from the unnecessary retransmission of packets classical congestion collapse. Classical congestion collapse is a stable condition that can result in throughput that is a small fraction of normal [RFC896]. Problems with classical congestion collapse have generally been corrected by the timer improvements and congestion control mechanisms in modern implementations of TCP [Jacobson88].
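Among the "timer improvements" referred to above is exponential backoff of the retransmission timer, which keeps a sender from bombarding an already-congested network with duplicates. A sketch of the backoff behavior only (illustrative; [Jacobson88] additionally specifies how the base timeout is estimated from the smoothed RTT and its variance, and the cap value here is hypothetical):

```python
# Retransmission-timer backoff, one of the fixes for classical
# congestion collapse: each successive retransmission of the same
# segment doubles the retransmission timeout, up to a cap.

def rto_after_backoffs(initial_rto, n_retransmits, max_rto=64.0):
    """Retransmission timeout (seconds) after n successive retransmits."""
    rto = initial_rto
    for _ in range(n_retransmits):
        rto = min(rto * 2, max_rto)   # exponential backoff, capped
    return rto

print(rto_after_backoffs(1.0, 0))    # 1.0
print(rto_after_backoffs(1.0, 3))    # 8.0
print(rto_after_backoffs(1.0, 10))   # 64.0 (capped)
```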
A second form of potential congestion collapse occurs due to undelivered packets. Congestion collapse from undelivered packets arises when bandwidth is wasted by delivering packets through the network that are dropped before reaching their ultimate destination. This is probably the largest unresolved danger with respect to congestion collapse in the Internet today. Different scenarios can result in different degrees of congestion collapse, in terms of the fraction of the congested links' bandwidth used for productive work. The danger of congestion collapse from undelivered packets is due primarily to the increasing deployment of open-loop applications not using end-to-end congestion control. Even more destructive would be best-effort applications that *increase* their sending rate in response to an increased packet drop rate (e.g., automatically using an increased level of FEC).

Table 1 gives the results from a scenario with congestion collapse from undelivered packets, where scarce bandwidth is wasted by packets that never reach their destination. The simulation uses a scenario with three TCP flows and one UDP flow competing over a congested 1.5 Mbps link. The access links for all nodes are 10 Mbps, except that the access link to the receiver of the UDP flow is 128 Kbps, only 9% of the bandwidth of the shared link. When the UDP source rate exceeds 128 Kbps, most of the UDP packets will be dropped at the output port to that final link.
     UDP
   Arrival      UDP       TCP      Total
    Rate      Goodput   Goodput   Goodput
   --------------------------------------
     0.7        0.7       98.5      99.2
     1.8        1.7       97.3      99.1
     2.6        2.6       96.0      98.6
     5.3        5.2       92.7      97.9
     8.8        8.4       87.1      95.5
    10.5        8.4       84.8      93.2
    13.1        8.4       81.4      89.8
    17.5        8.4       77.3      85.7
    26.3        8.4       64.5      72.8
    52.6        8.4       38.1      46.4
    58.4        8.4       32.8      41.2
    65.7        8.4       28.5      36.8
    75.1        8.4       19.7      28.1
    87.6        8.4       11.3      19.7
   105.2        8.4        3.4      11.8
   131.5        8.4        2.4      10.7

   Table 1. A simulation with three TCP flows and one UDP flow.

Table 1 shows the UDP arrival rate from the sender, the UDP goodput (defined as the bandwidth delivered to the receiver), the TCP goodput (as delivered to the TCP receivers), and the aggregate goodput on the congested 1.5 Mbps link. Each rate is given as a percentage of the bandwidth of the congested link. As the UDP source rate increases, the TCP goodput decreases roughly linearly, and the UDP goodput is nearly constant. Thus, as the UDP flow increases its offered load, its only effect is to hurt the TCP and aggregate goodput. On the congested link, the UDP flow ultimately `wastes' the bandwidth that could have been used by the TCP flow, and reduces the goodput in the network as a whole down to a small fraction of the bandwidth of the congested link.

The simulations in Table 1 illustrate both unfairness and congestion collapse. As [FF99] discusses, compatible congestion control is not the only way to provide fairness; per-flow scheduling at the congested routers is an alternative mechanism that guarantees fairness. However, as discussed in [FF99], per-flow scheduling cannot be relied upon to prevent congestion collapse.

There are only two alternatives for eliminating the danger of congestion collapse from undelivered packets.
The first alternative for preventing congestion collapse from undelivered packets is the use of effective end-to-end congestion control by the end nodes. More specifically, the requirement would be that a flow avoid a pattern of significant losses at links downstream from the first congested link on the path. (Here, we would consider any link a `congested link' if any flow is using bandwidth that would otherwise be used by other traffic on the link.) Given that an end-node is generally unable to distinguish between a path with one congested link and a path with multiple congested links, the most reliable way for a flow to avoid a pattern of significant losses at a downstream congested link is for the flow to use end-to-end congestion control, and reduce its sending rate in the presence of loss.

A second alternative for preventing congestion collapse from undelivered packets would be a guarantee by the network that packets accepted at a congested link in the network will be delivered all the way to the receiver [RFC2212, RFC2475]. We note that the choice between the first alternative of end-to-end congestion control and the second alternative of end-to-end bandwidth guarantees does not have to be an either/or decision; congestion collapse can be prevented by the use of effective end-to-end congestion control by some of the traffic, and the use of end-to-end bandwidth guarantees from the network for the rest of the traffic.

6. Forms of end-to-end congestion control

This document has discussed concerns about congestion collapse and about fairness with TCP for new forms of congestion control.
This does not mean, however, that concerns about congestion collapse and fairness with TCP necessitate that all best-effort traffic deploy congestion control based on TCP's Additive-Increase Multiplicative-Decrease (AIMD) algorithm of halving the sending rate in response to each packet drop. This section separately discusses the implications of these two concerns of congestion collapse and fairness with TCP.

6.1. End-to-end congestion control for avoiding congestion collapse.

The avoidance of congestion collapse from undelivered packets requires that flows avoid a scenario of a high sending rate, multiple congested links, and a persistent high packet drop rate at the downstream link. Because congestion collapse from undelivered packets consists of packets that waste valuable bandwidth only to be dropped downstream, this form of congestion collapse is not possible in an environment where each flow traverses only one congested link, or where only a small number of packets are dropped at links downstream of the first congested link. Thus, any form of congestion control that successfully avoids a high sending rate in the presence of a high packet drop rate should be sufficient to avoid congestion collapse from undelivered packets.

We would note that the addition of Explicit Congestion Notification (ECN) to the IP architecture would not, in and of itself, remove the danger of congestion collapse for best-effort traffic. ECN allows routers to set a bit in packet headers as an indication of congestion to the end-nodes, rather than being forced to rely on packet drops to indicate congestion. However, with ECN, packet-marking would replace packet-dropping only in times of moderate congestion. In particular, when congestion is heavy, and a router's buffers overflow, the router has no choice but to drop arriving packets.

6.2.
End-to-end congestion control for fairness with TCP.

The concern expressed in [RFC2357] about fairness with TCP places a significant though not crippling constraint on the range of viable end-to-end congestion control mechanisms for best-effort traffic. An environment with per-flow scheduling at all congested links would isolate flows from each other, and eliminate the need for congestion control mechanisms to be TCP-compatible. An environment with differentiated services, where flows marked as belonging to a certain diff-serv class would be scheduled in isolation from best-effort traffic, could allow the emergence of an entire diff-serv class of traffic where congestion control was not required to be TCP-compatible. Similarly, a pricing-controlled environment, or a diff-serv class with its own pricing paradigm, could supersede the concern about fairness with TCP. However, for the current Internet environment, where other best-effort traffic could compete in a FIFO queue with TCP traffic, the absence of fairness with TCP could lead to one flow `starving out' another flow in a time of high congestion, as was illustrated in Table 1 above.

However, the list of TCP-compatible congestion control procedures is not limited to AIMD with the same increase/decrease parameters as TCP. Other TCP-compatible congestion control procedures include rate-based variants of AIMD; AIMD with different sets of increase/decrease parameters that give the same steady-state behavior; equation-based congestion control, where the sender adjusts its sending rate in response to information about the long-term packet drop rate; layered multicast, where receivers subscribe and unsubscribe from layered multicast groups; and possibly other forms that we have not yet begun to consider.

7.
Acknowledgements

Much of this document draws directly on previous RFCs addressing end-to-end congestion control. It attempts to summarize ideas that have been discussed for many years by many people. In particular, acknowledgement is due to the members of the End-to-End Research Group, the Reliable Multicast Research Group, and the Transport Area Directorate. This document has also benefited from discussion and feedback from the Transport Area Working Group. Particular thanks are due to Mark Allman for feedback on an earlier version of this document.

8. References

[BS00] Hari Balakrishnan and Srinivasan Seshan, The Congestion Manager, draft-balakrishnan-cm-02.txt, internet draft, work in progress, March 2000.

[DMKM00] S. Dawkins, G. Montenegro, M. Kojo, and V. Magret, End-to-end Performance Implications of Slow Links, draft-ietf-pilc-slow-03.txt, internet draft, work in progress, March 2000.

[FF99] S. Floyd and K. Fall, Promoting the Use of End-to-End Congestion Control in the Internet, IEEE/ACM Transactions on Networking, August 1999. URL "http://www-nrg.ee.lbl.gov/floyd/end2end-paper.html".

[HPF00] M. Handley, J. Padhye, and S. Floyd, TCP Congestion Window Validation, draft-handley-tcp-cwv-02.txt, internet draft, work in progress, March 2000.

[Jacobson88] V. Jacobson, Congestion Avoidance and Control, ACM SIGCOMM '88, August 1988.

[RFC793] J. Postel, Transmission Control Protocol, RFC 793, September 1981.

[RFC896] J. Nagle, Congestion Control in IP/TCP Internetworks, RFC 896, January 1984.

[RFC1122] R. Braden, Ed., Requirements for Internet Hosts -- Communication Layers, STD 3, RFC 1122, October 1989.

[RFC1323] V. Jacobson, R. Braden, and D. Borman, TCP Extensions for High Performance, RFC 1323, May 1992.

[RFC2119] S. Bradner, Key words for use in RFCs to Indicate Requirement Levels, RFC 2119, March 1997.
[RFC2212] Shenker, S., Partridge, C., and Guerin, R., Specification
of Guaranteed Quality of Service, RFC 2212, September 1997.

[RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge,
C., Peterson, L., Ramakrishnan, K.K., Shenker, S., Wroclawski, J.,
and Zhang, L., Recommendations on Queue Management and Congestion
Avoidance in the Internet, RFC 2309, April 1998.

[RFC2357] Mankin, A., Romanow, A., Bradner, S., and Paxson, V., IETF
Criteria for Evaluating Reliable Multicast Transport and Application
Protocols, RFC 2357, June 1998.

[RFC2414] Allman, M., Floyd, S., and Partridge, C., Increasing TCP's
Initial Window, RFC 2414, Experimental, September 1998.

[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
and Weiss, W., An Architecture for Differentiated Services, RFC
2475, December 1998.

[RFC2481] Ramakrishnan, K. and Floyd, S., A Proposal to add Explicit
Congestion Notification (ECN) to IP, RFC 2481, January 1999.

[RFC2525] Paxson, V., Allman, M., Dawson, S., Fenner, W., Griner,
J., Heavens, I., Lahey, K., Semke, J., and Volz, B., Known TCP
Implementation Problems, RFC 2525, March 1999.

[RFC2581] Allman, M., Paxson, V., and Stevens, W., TCP Congestion
Control, RFC 2581, April 1999.

[RFC2582] Floyd, S. and Henderson, T., The NewReno Modification to
TCP's Fast Recovery Algorithm, RFC 2582, April 1999.

[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
Masinter, L., Leach, P., and Berners-Lee, T., Hypertext Transfer
Protocol -- HTTP/1.1, RFC 2616, June 1999.

[SCWA99] Savage, S., Cardwell, N., Wetherall, D., and Anderson, T.,
TCP Congestion Control with a Misbehaving Receiver, ACM Computer
Communications Review, October 1999.

[TCPB98] Hari Balakrishnan, Venkata N. Padmanabhan, Srinivasan
Seshan, Mark Stemm, and Randy H.
Katz, TCP Behavior of a Busy Internet Server: Analysis and
Improvements, IEEE Infocom, March 1998. Available from:
"http://www.cs.berkeley.edu/~hari/papers/infocom98.ps.gz".

[TCPF98] Lin, D. and Kung, H.T., TCP Fast Recovery Strategies:
Analysis and Improvements, IEEE Infocom, March 1998. Available from:
"http://www.eecs.harvard.edu/networking/papers/infocom-tcp-final-198.pdf".

9. TCP-Specific Issues

In this section we discuss some of the particulars of TCP congestion
control, to illustrate one realization of the congestion control
principles, including some of the details that arise when
incorporating them into a production transport protocol.

9.1. Slow-start.

The TCP sender cannot open a new connection by sending a large burst
of data (e.g., a receiver's advertised window) all at once. The TCP
sender is limited by a small initial value for the congestion
window. During slow-start, the TCP sender can increase its sending
rate by at most a factor of two in one roundtrip time. Slow-start
ends when congestion is detected, or when the sender's congestion
window is greater than the slow-start threshold ssthresh.

An issue that potentially affects global congestion control, and
therefore has been explicitly addressed in the standards process, is
the increase in the value of the initial window [RFC2414,RFC2581].

Issues that have not been addressed in the standards process, and
are generally considered not to require standardization, include the
use (or non-use) of rate-based pacing, and mechanisms for ending
slow-start early, before the congestion window reaches ssthresh.
Such mechanisms result in slow-start behavior that is as
conservative as, or more conservative than, standard TCP.

9.2. Additive Increase, Multiplicative Decrease.
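The window rules discussed in this section, together with the
slow-start doubling of Section 9.1, can be summarized in a short
sketch. This is an illustrative simplification, not the
specification: it counts the window in whole packets, whereas real
TCPs count bytes, cap the window by the receiver's advertised
window, and follow the precise rules of [RFC2581].

```python
def on_ack(cwnd, ssthresh):
    """Congestion window growth when an acknowledgement arrives."""
    if cwnd < ssthresh:
        # Slow-start: one more packet per ACK, so the window
        # can at most double in one roundtrip time.
        return cwnd + 1
    # Congestion avoidance (additive increase): roughly one
    # additional packet per roundtrip time.
    return cwnd + 1.0 / cwnd

def on_congestion(cwnd):
    """Multiplicative decrease: halve the window, floor of one packet."""
    return max(cwnd / 2.0, 1.0)
```

For example, starting from a window of one packet with ssthresh set
to eight, successive roundtrip times take the window through 2, 4,
and 8 packets, after which growth becomes linear until a congestion
indication halves the window.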
In the absence of congestion, the TCP sender increases its
congestion window by at most one packet per roundtrip time. In
response to a congestion indication, the TCP sender decreases its
congestion window by half. (More precisely, the new congestion
window is half of the minimum of the congestion window and the
receiver's advertised window.)

An issue that potentially affects global congestion control, and
therefore would be likely to be explicitly addressed in the
standards process, would be a proposed addition of congestion
control for the return stream of `pure acks'.

An issue that has not been addressed in the standards process, and
is generally not considered to require standardization, would be a
change to the congestion window to apply as an upper bound on the
number of bytes presumed to be in the pipe, instead of applying as a
sliding window starting from the cumulative acknowledgement.
(Clearly, the receiver's advertised window applies as a sliding
window starting from the cumulative acknowledgement field, because
packets received above the cumulative acknowledgement are held in
TCP's receive buffer, and have not been delivered to the
application. However, the congestion window applies to the number of
packets outstanding in the pipe, and does not necessarily have to
include packets that have been received out-of-order by the TCP
receiver.)

9.3. Retransmit timers.

The TCP sender sets a retransmit timer to infer that a packet has
been dropped in the network. When the retransmit timer expires, the
sender infers that a packet has been lost, sets ssthresh to half of
the current window, and goes into slow-start, retransmitting the
lost packet.
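This reaction to a timer expiry, together with the timer backoff
described next, might be sketched as follows. The function and
variable names are hypothetical, and the sketch is simplified:
[RFC2581] specifies the precise rules, including how ssthresh is
computed from the amount of outstanding data.

```python
def on_retransmit_timeout(cwnd, rto, packet_was_retransmitted):
    """Sketch of a TCP sender's reaction when the retransmit timer expires.

    Returns (new_cwnd, new_ssthresh, new_rto), with windows counted
    in packets for simplicity.
    """
    # Set ssthresh to half of the current window (floor of two packets) ...
    ssthresh = max(cwnd // 2, 2)
    # ... and drop back into slow-start from a one-packet window,
    # retransmitting the lost packet (retransmission itself omitted here).
    cwnd = 1
    # If the timer expired for a packet that had already been
    # retransmitted, also back off the timer, doubling the next
    # retransmit timeout interval.
    if packet_was_retransmitted:
        rto = 2 * rto
    return cwnd, ssthresh, rto
```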
If the retransmit timer expires because no acknowledgement has been
received for a retransmitted packet, the retransmit timer is also
"backed off", doubling the value of the next retransmit timeout
interval.

An issue that potentially affects global congestion control, and
therefore would be likely to be explicitly addressed in the
standards process, might be a modified mechanism for setting the
retransmit timer that could significantly increase the number of
retransmit timers that expire prematurely, when the acknowledgement
has not yet arrived at the sender but in fact no packets have been
dropped. This could be of concern to the Internet standards process
because retransmit timers that expire prematurely could lead to an
increase in the number of packets unnecessarily transmitted on a
congested link.

9.4. Fast Retransmit and Fast Recovery.

After seeing three duplicate acknowledgements, the TCP sender infers
a packet loss. The TCP sender sets ssthresh to half of the current
window, reduces the congestion window to at most half of the
previous window, and retransmits the lost packet.

An issue that potentially affects global congestion control, and
therefore would be likely to be explicitly addressed in the
standards process, might be a proposal (if there were one) for
inferring a lost packet after only one or two duplicate
acknowledgements. If poorly designed, such a proposal could lead to
an increase in the number of packets unnecessarily transmitted on a
congested path.

An issue that has not been addressed in the standards process, and
would not be expected to require standardization, would be a
proposal to send a "new" or presumed-lost packet in response to a
duplicate or partial acknowledgement, if allowed by the congestion
window.
An example of this would be sending a new packet in response to a
single duplicate acknowledgement, to keep the `ack clock' going in
case no further acknowledgements arrive. Such a proposal is an
example of a beneficial change that does not involve
interoperability and does not affect global congestion control, and
that therefore could be implemented by vendors without requiring the
intervention of the IETF standards process. (This issue has in fact
been addressed in [DMKM00], which suggests that "researchers may
wish to experiment with injecting new traffic into the network when
duplicate acknowledgements are being received, as described in
[TCPB98] and [TCPF98].")

9.5. Other aspects of TCP congestion control.

Other aspects of TCP congestion control that have not been discussed
in the sections above include TCP's recovery from an idle or
application-limited period [HPF00].

10. Security Considerations

This document has been about the risks associated with congestion
control, or with the absence of congestion control. Section 3.2
discusses the potential for unfairness if competing flows don't use
compatible congestion control mechanisms, and Section 5 considers
the dangers of congestion collapse if flows don't use end-to-end
congestion control.

Because this document does not propose any specific congestion
control mechanisms, it is not necessary to present specific security
measures associated with congestion control. However, we would note
that there is a range of security considerations associated with
congestion control that should be considered in IETF documents.

For example, individual congestion control mechanisms should be as
robust as possible against attempts by individual end-nodes to
subvert end-to-end congestion control [SCWA99].
This is a particular concern in multicast congestion control,
because of the far-reaching distribution of the traffic and the
greater opportunities for individual receivers to fail to report
congestion.

RFC 2309 also discusses the potential dangers to the Internet of
unresponsive flows, that is, flows that don't reduce their sending
rate in the presence of congestion, and describes the need for
mechanisms in the network to deal with flows that are unresponsive
to congestion notification. We would note that there is still a need
for research, engineering, measurement, and deployment in these
areas.

Because the Internet aggregates very large numbers of flows, the
risk to the whole infrastructure from subverting the congestion
control of a few individual flows is limited. Rather, the risk to
the infrastructure would come from the widespread deployment of many
end-nodes subverting end-to-end congestion control.

AUTHORS' ADDRESSES

Sally Floyd
AT&T Center for Internet Research at ICSI (ACIRI)
Phone: +1 (510) 642-4274 x189
Email: floyd@aciri.org
URL: http://www.aciri.org/floyd/

This draft was created in June 2000.
It expires December 2000.