Network Working Group                                     F. Baker, Ed.
Internet-Draft                                            Cisco Systems
Obsoletes: 2309 (if approved)                         G. Fairhurst, Ed.
Intended status: Best Current Practice           University of Aberdeen
Expires: August 27, 2015                              February 23, 2015

        IETF Recommendations Regarding Active Queue Management
                  draft-ietf-aqm-recommendation-10

Abstract

This memo presents recommendations to the Internet community
concerning measures to improve and preserve Internet performance.  It
presents a strong recommendation for testing, standardization, and
widespread deployment of active queue management (AQM) in network
devices, to improve the performance of today's Internet.  It also
urges a concerted effort of research, measurement, and ultimate
deployment of AQM mechanisms to protect the Internet from flows that
are not sufficiently responsive to congestion notification.

This memo replaces the recommendations of RFC 2309 based on fifteen
years of experience and new research.

Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF).  Note that other groups may also distribute
working documents as Internet-Drafts.  The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 27, 2015.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the
document authors.  All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document.  Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.  Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008.  The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.

Table of Contents

1.  Introduction
    1.1.  Congestion Collapse
    1.2.  Active Queue Management to Manage Latency
    1.3.  Document Overview
    1.4.  Changes to the recommendations of RFC2309
    1.5.  Requirements Language
2.  The Need For Active Queue Management
    2.1.  AQM and Multiple Queues
    2.2.  AQM and Explicit Congestion Notification (ECN)
    2.3.  AQM and Buffer Size
3.  Managing Aggressive Flows
4.  Conclusions and Recommendations
    4.1.  Operational deployments SHOULD use AQM procedures
    4.2.  Signaling to the transport endpoints
          4.2.1.  AQM and ECN
    4.3.  AQM algorithms deployed SHOULD NOT require operational
          tuning
    4.4.  AQM algorithms SHOULD respond to measured congestion, not
          application profiles
    4.5.  AQM algorithms SHOULD NOT be dependent on specific
          transport protocol behaviours
    4.6.  Interactions with congestion control algorithms
    4.7.  The need for further research
5.  IANA Considerations
6.  Security Considerations
7.  Privacy Considerations
8.  Acknowledgements
9.  References
    9.1.  Normative References
    9.2.  Informative References
Appendix A.  Change Log
Authors' Addresses

1.  Introduction

The Internet protocol architecture is based on a connectionless end-
to-end packet service using the Internet Protocol, whether IPv4
[RFC0791] or IPv6 [RFC2460].  The advantages of its connectionless
design, flexibility and robustness, have been amply demonstrated.
However, these advantages are not without cost: careful design is
required to provide good service under heavy load.  In fact, lack of
attention to the dynamics of packet forwarding can result in severe
service degradation or "Internet meltdown".  This phenomenon,
technically called "congestion collapse", was first observed during
the early growth phase of the Internet in the mid-1980s
[RFC0896][RFC0970], and was a key focus of RFC2309.

Although wide-scale congestion collapse is not common in the
Internet, the presence of localised congestion collapse is by no
means rare.  It is therefore important to continue to avoid
congestion collapse.

Since 1998, when RFC2309 was written, the Internet has come to carry
a wide variety of traffic.  In the current Internet, low latency is
extremely important for many interactive and transaction-based
applications.
The same type of technology that RFC2309 advocated for
combating congestion collapse is also effective at limiting delays to
reduce the interaction delay (latency) experienced by applications
[Bri15].  High or unpredictable latency can impact the performance of
the control loops used by end-to-end protocols (including congestion
control algorithms used by TCP).  There is now also a focus on
reducing network latency using the same technology.

The mechanisms described in this document may be implemented in
network devices on the path between end-points, including routers,
switches, and other network middleboxes.  The methods may also be
implemented in the networking stacks within endpoint devices that
connect to the network.

1.1.  Congestion Collapse

The original fix for Internet meltdown was provided by Van Jacobson.
Beginning in 1986, Jacobson developed the congestion avoidance
mechanisms [Jacobson88] that are now required for implementations of
the Transmission Control Protocol (TCP) [RFC0793] [RFC1122].
([RFC7414] provides a roadmap to help identify TCP-related
documents.)  These mechanisms operate in Internet hosts to cause TCP
connections to "back off" during congestion.  We say that TCP flows
are "responsive" to congestion signals (i.e., packets that are
dropped or marked with explicit congestion notification [RFC3168]).
It is primarily these TCP congestion avoidance algorithms that
prevent the congestion collapse of today's Internet.  Similar
algorithms are specified for other non-TCP transports.

However, that is not the end of the story.  Considerable research has
been done on Internet dynamics since 1988, and the Internet has
grown.  It has become clear that the congestion avoidance mechanisms
[RFC5681], while necessary and powerful, are not sufficient to
provide good service in all circumstances.
Basically, there is a
limit to how much control can be accomplished from the edges of the
network.  Some mechanisms are needed in network devices to complement
the endpoint congestion avoidance mechanisms.

1.2.  Active Queue Management to Manage Latency

Internet latency has become a focus of attention to increase the
responsiveness of Internet applications and protocols.  One major
source of delay is the build-up of queues in network devices.
Queueing occurs whenever the arrival rate of data at the ingress to a
device exceeds the current egress rate.  Such queueing is normal in a
packet-switched network and is often necessary to absorb bursts in
transmission and perform statistical multiplexing of traffic, but
excessive queueing can lead to unwanted delay, reducing the
performance of some Internet applications.

RFC 2309 introduced the concept of "Active Queue Management" (AQM), a
class of technologies that, by signaling to common congestion-
controlled transports such as TCP, manages the size of queues that
build in network buffers.  RFC 2309 also describes a specific AQM
algorithm, Random Early Detection (RED), and recommends that this be
widely implemented and used by default in routers.

With an appropriate set of parameters, RED is an effective algorithm.
However, dynamically predicting this set of parameters was found to
be difficult.  As a result, RED has not been enabled by default, and
its present use in the Internet is limited.  Other AQM algorithms
have been developed since RFC2309 was published, some of which are
self-tuning within a range of applicability.
Hence, while this memo
continues to recommend the deployment of AQM, it no longer recommends
that RED or any other specific algorithm be used by default; instead,
it provides recommendations on how to select appropriate algorithms,
and recommends that a selected algorithm be able to automate any
required tuning for common deployment scenarios.

Deploying AQM in the network can significantly reduce the latency
across an Internet path, and since RFC2309 was written, this has
become a key motivation for using AQM in the Internet.  In the
context of AQM, it is useful to distinguish between two related
classes of algorithms: "queue management" versus "scheduling"
algorithms.  To a rough approximation, queue management algorithms
manage the length of packet queues by marking or dropping packets
when necessary or appropriate, while scheduling algorithms determine
which packet to send next and are used primarily to manage the
allocation of bandwidth among flows.  While these two mechanisms are
closely related, they address different performance issues and
operate on different timescales.  Both may be used in combination.

1.3.  Document Overview

The discussion in this memo applies to "best-effort" traffic, which
is to say, traffic generated by applications that accept the
occasional loss, duplication, or reordering of traffic in flight.  It
also applies to other traffic, such as real-time traffic that can
adapt its sending rate to reduce loss and/or delay.  It is most
effective when the adaptation occurs on time scales of a single Round
Trip Time (RTT) or a small number of RTTs, for elastic traffic
[RFC1633].

Two performance issues are highlighted:

The first issue is the need for an advanced form of queue management
that we call "Active Queue Management", AQM.  Section 2 summarizes
the benefits that active queue management can bring.
A number of AQM
procedures are described in the literature, with different
characteristics.  This document does not recommend any of them in
particular, but does make recommendations that ideally would affect
the choice of procedure used in a given implementation.

The second issue, discussed in Section 4 of this memo, is the
potential for future congestion collapse of the Internet due to flows
that are unresponsive, or not sufficiently responsive, to congestion
indications.  Unfortunately, while scheduling can mitigate some of
the side-effects of sharing a network queue with an unresponsive
flow, there is currently no consensus solution to controlling the
congestion caused by such aggressive flows.  Methods such as
congestion exposure (ConEx) [RFC6789] offer a framework [CONEX] that
can update network devices to alleviate these effects.  Significant
research and engineering will be required before any solution will be
available.  It is imperative that work to mitigate the impact of
unresponsive flows is energetically pursued, to ensure acceptable
performance and the future stability of the Internet.

Section 4 concludes the memo with a set of recommendations to the
Internet community on the use of AQM and recommendations for defining
AQM algorithms.

1.4.  Changes to the recommendations of RFC2309

This memo replaces the recommendations in [RFC2309], which resulted
from past discussions of end-to-end performance, Internet congestion,
and RED in the End-to-End Research Group of the Internet Research
Task Force (IRTF).  It follows experience with this and other
algorithms, and the AQM discussion within the IETF [AQM-WG].

Whereas RFC2309 described AQM in terms of the length of a queue, this
memo uses AQM to refer to any method that allows network devices to
control either the queue length and/or the mean time that a packet
spends in a queue.
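
The difference between controlling queue length and controlling the
time a packet spends in a queue can be sketched in a few lines of
code.  This is a hypothetical illustration only; the threshold values
and function names are invented for this sketch and are not part of
any recommended algorithm:

```python
import collections

# Hypothetical example thresholds; real AQM algorithms derive or
# adapt such values rather than hard-coding them.
LENGTH_THRESHOLD_PKTS = 100   # trigger based on queue length
DELAY_THRESHOLD_SEC = 0.005   # trigger based on sojourn time (5 ms)

queue = collections.deque()

def enqueue(packet, now):
    # Record the arrival time so the sojourn time can be measured
    # when the packet is later dequeued.
    queue.append((packet, now))

def dequeue(now):
    # Returns (packet, congested): 'congested' is True when either
    # the length-based or the delay-based condition indicates that
    # the AQM should signal congestion.
    packet, arrival = queue.popleft()
    sojourn = now - arrival
    congested = (len(queue) >= LENGTH_THRESHOLD_PKTS or
                 sojourn >= DELAY_THRESHOLD_SEC)
    return packet, congested
```

A length-based condition reacts only to the backlog in packets,
while the sojourn-time condition directly measures the quantity this
memo is equally concerned with: the mean time a packet spends queued.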

This memo also explicitly obsoletes the recommendation that Random
Early Detection (RED) was to be used as the default AQM mechanism for
the Internet.  This is replaced by a detailed set of recommendations
for selecting an appropriate AQM algorithm.  As in RFC2309, this memo
also motivates the need for continued research, but clarifies the
research with examples appropriate at the time that this memo is
published.

1.5.  Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].

2.  The Need For Active Queue Management

Active Queue Management (AQM) is a method that allows network devices
to control the queue length or the mean time that a packet spends in
a queue.  Although AQM can be applied across a range of deployment
environments, the recommendations in this document are directed to
use in the general Internet.  It is expected that the principles and
guidance are also applicable to a wide range of environments, but may
require tuning for specific types of link/network (e.g., to
accommodate the traffic patterns found in data centres, the
challenges of wireless infrastructure, or the higher delay
encountered on satellite Internet links).  The remainder of this
section identifies the need for AQM and the advantages of deploying
AQM methods.

The traditional technique for managing the queue length in a network
device is to set a maximum length (in terms of packets) for each
queue, accept packets for the queue until the maximum length is
reached, then reject (drop) subsequent incoming packets until the
queue decreases because a packet from the queue has been transmitted.
This technique is known as "tail drop", since the packet that arrived
most recently (i.e., the one on the tail of the queue) is dropped
when the queue is full.  This method has served the Internet well for
years, but it has four important drawbacks:

1.  Full Queues

    The tail drop discipline allows queues to maintain a full (or,
    almost full) status for long periods of time, since tail drop
    signals congestion (via a packet drop) only when the queue has
    become full.  It is important to reduce the steady-state queue
    size, and this is perhaps the most important goal for queue
    management.

    The naive assumption might be that there is a simple tradeoff
    between delay and throughput, and that the recommendation that
    queues be maintained in a "non-full" state essentially translates
    to a recommendation that low end-to-end delay is more important
    than high throughput.  However, this does not take into account
    the critical role that packet bursts play in Internet
    performance.  For example, even though TCP constrains the
    congestion window of a flow, packets often arrive at network
    devices in bursts [Leland94].  If the queue is full or almost
    full, an arriving burst will cause multiple packets to be dropped
    from the same flow.  Bursts of loss can result in a global
    synchronization of flows throttling back, followed by a sustained
    period of lowered link utilization, reducing overall throughput
    [Flo94], [Zha90].

    The goal of buffering in the network is to absorb data bursts and
    to transmit them during the (hopefully) ensuing bursts of
    silence.  This is essential to permit transmission of bursts of
    data.  Normally small queues are preferred in network devices,
    with sufficient queue capacity to absorb the bursts.  The
    counter-intuitive result is that maintaining normally-small
    queues can result in higher throughput as well as lower end-to-
    end delay.

    In summary, queue limits should not reflect the steady state
    queues we want to be maintained in the network; instead, they
    should reflect the size of bursts that a network device needs to
    absorb.

2.  Lock-Out

    In some situations tail drop allows a single connection or a few
    flows to monopolize the queue space, starving other connections
    and preventing them from getting room in the queue [Flo92].

3.  Mitigating the Impact of Packet Bursts

    Large bursts of packets can delay other packets, disrupting the
    control loop (e.g., the pacing of flows by the TCP ACK-Clock),
    and reducing the performance of flows that share a common
    bottleneck.

4.  Control loop synchronization

    Congestion control, like other end-to-end mechanisms, introduces
    a control loop between hosts.  Sessions that share a common
    network bottleneck can therefore become synchronised, introducing
    periodic disruption (e.g., jitter/loss).  "Lock-out" is often
    also the result of synchronization or other timing effects.

Besides tail drop, two alternative queue management disciplines that
can be applied when a queue becomes full are "random drop on full"
and "head drop on full".  When a new packet arrives at a full queue
using the random drop on full discipline, the network device drops a
randomly selected packet from the queue (which can be an expensive
operation, since it naively requires an O(N) walk through the packet
queue).  When a new packet arrives at a full queue using the head
drop on full discipline, the network device drops the packet at the
front of the queue [Lakshman96].  Both of these solve the lock-out
problem, but neither solves the full-queues problem described above.

We know in general how to solve the full-queues problem for
"responsive" flows, i.e., those flows that throttle back in response
to congestion notification.
In the current Internet, dropped packets
provide a critical mechanism indicating congestion notification to
hosts.  The solution to the full-queues problem is for network
devices to drop or ECN-mark packets before a queue becomes full, so
that hosts can respond to congestion before buffers overflow.  We
call such a proactive approach AQM.  By dropping or ECN-marking
packets before buffers overflow, AQM allows network devices to
control when and how many packets to drop.

In summary, an active queue management mechanism can provide the
following advantages for responsive flows.

1.  Reduce number of packets dropped in network devices

    Packet bursts are an unavoidable aspect of packet networks
    [Willinger95].  If all the queue space in a network device is
    already committed to "steady state" traffic or if the buffer
    space is inadequate, then the network device will have no ability
    to buffer bursts.  By keeping the average queue size small, AQM
    will provide greater capacity to absorb naturally-occurring
    bursts without dropping packets.

    Furthermore, without AQM, more packets will be dropped when a
    queue does overflow.  This is undesirable for several reasons.
    First, with a shared queue and the tail drop discipline, this can
    result in unnecessary global synchronization of flows, resulting
    in lowered average link utilization, and hence lowered network
    throughput.  Second, unnecessary packet drops represent a waste
    of network capacity on the path before the drop point.

    While AQM can manage queue lengths and reduce end-to-end latency
    even in the absence of end-to-end congestion control, it will be
    able to reduce packet drops only in an environment that continues
    to be dominated by end-to-end congestion control.

2.  Provide a lower-delay interactive service

    By keeping a small average queue size, AQM will reduce the delays
    experienced by flows.
This is particularly important for
    interactive applications such as short web transfers, POP/IMAP,
    DNS, terminal traffic (telnet, ssh, mosh, RDP, etc.), gaming, or
    interactive audio-video sessions, whose subjective (and
    objective) performance is better when the end-to-end delay is
    low.

3.  Avoid lock-out behavior

    AQM can prevent lock-out behavior by ensuring that there will
    almost always be a buffer available for an incoming packet.  For
    the same reason, AQM can prevent a bias against low capacity, but
    highly bursty, flows.

    Lock-out is undesirable because it constitutes a gross unfairness
    among groups of flows.  However, we stop short of calling this
    benefit "increased fairness", because general fairness among
    flows requires per-flow state, which is not provided by queue
    management.  For example, in a network device using AQM with only
    FIFO scheduling, two TCP flows may receive very different shares
    of the network capacity simply because they have different round-
    trip times [Floyd91], and a flow that does not use congestion
    control may receive more capacity than a flow that does.  AQM can
    therefore be combined with a scheduling mechanism that divides
    network traffic between multiple queues (Section 2.1).

4.  Reduce the probability of control loop synchronization

    The probability of network control loop synchronization can be
    reduced if network devices introduce randomness in the AQM
    functions that trigger congestion avoidance at the sending host.

2.1.  AQM and Multiple Queues

A network device may use per-flow or per-class queuing with a
scheduling algorithm to either prioritize certain applications or
classes of traffic, limit the rate of transmission, or to provide
isolation between different traffic flows within a common class.
For
example, a router may maintain per-flow state to achieve general
fairness by a per-flow scheduling algorithm such as various forms of
Fair Queueing (FQ) [Dem90] [Sut99], including Weighted Fair Queuing
(WFQ), Stochastic Fairness Queueing (SFQ) [McK90], and Deficit Round
Robin (DRR) [Shr96], [Nic12], and/or a Class-Based Queue scheduling
algorithm such as CBQ [Floyd95].  Hierarchical queues may also be
used, e.g., as part of a Hierarchical Token Bucket (HTB) or
Hierarchical Fair Service Curve (HFSC) [Sto97].  These methods are
also used to realize a range of Quality of Service (QoS) behaviours
designed to meet the needs of traffic classes (e.g., using the
integrated or differentiated service models).

AQM is needed even for network devices that use per-flow or per-class
queuing, because scheduling algorithms by themselves do not control
the overall queue size or the size of individual queues.  AQM
mechanisms might need to control the overall queue sizes, to ensure
that arriving bursts can be accommodated without dropping packets.
AQM should also be used to control the queue size for each individual
flow or class, so that they do not experience unnecessarily high
delay.  Using a combination of AQM and scheduling between multiple
queues has been shown to offer good results in experimental and some
types of operational use.

In short, scheduling algorithms and queue management should be seen
as complementary, not as replacements for each other.

2.2.  AQM and Explicit Congestion Notification (ECN)

An AQM method may use Explicit Congestion Notification (ECN)
[RFC3168] instead of dropping to mark packets under mild or moderate
congestion.  ECN-marking can allow a network device to signal
congestion at a point before a transport experiences congestion loss
or additional queuing delay [ECN-Benefit].  Section 4.2.1 describes
some of the benefits of using ECN with AQM.
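
The mark-instead-of-drop decision can be sketched as follows.  This
is an illustrative fragment, assuming the AQM algorithm has already
decided that a given packet should carry a congestion signal; the
codepoint values are those defined by RFC 3168, while the function
name is invented for this sketch:

```python
# ECN field codepoints in the IP header (RFC 3168).
NOT_ECT = 0b00  # transport is not ECN-capable
ECT_1 = 0b01    # ECN-Capable Transport
ECT_0 = 0b10    # ECN-Capable Transport
CE = 0b11       # Congestion Experienced

def signal_congestion(ecn_field):
    # Given the ECN field of a packet that the AQM has selected to
    # carry a congestion signal, return the action to take: packets
    # from ECN-capable transports are marked CE instead of dropped.
    if ecn_field in (ECT_0, ECT_1, CE):
        return ("mark", CE)
    return ("drop", None)
```

A Not-ECT packet must still be dropped, since a drop is the only
congestion signal its transport can observe.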

2.3.  AQM and Buffer Size

It is important to differentiate between the choice of buffer size
for a queue in a switch/router or other network device, and the
threshold(s) and other parameters that determine how and when an AQM
algorithm operates.  The optimum buffer size is a function of
operational requirements and should generally be sized to be
sufficient to buffer the largest normal traffic burst that is
expected.  This size depends on the number and burstiness of traffic
arriving at the queue and the rate at which traffic leaves the queue.

One objective of AQM is to minimize the effect of lock-out, where one
flow prevents other flows from effectively gaining capacity.  This
need can be illustrated by a simple example of drop-tail queuing when
a new TCP flow injects packets into a queue that happens to be almost
full.  A TCP flow's congestion control algorithm [RFC5681] increases
the flow rate to maximize its effective window.  This builds a queue
in the network, inducing latency for the flow and for other flows
that share this queue.  Once a drop-tail queue fills, there will also
be loss.  A new flow, sending its initial burst, has an enhanced
probability of filling the remaining queue and dropping packets.  As
a result, the new flow can be prevented from effectively sharing the
queue for a period of many RTTs.  In contrast, AQM can minimize the
mean queue depth, thereby reducing the probability that competing
sessions can materially prevent each other from performing well.

AQM frees a designer from having to limit the buffer space assigned
to a queue to achieve acceptable performance, allowing allocation of
sufficient buffering to satisfy the needs of the particular traffic
pattern.  Different types of traffic and deployment scenarios will
lead to different requirements.
The choice of AQM algorithm and
associated parameters is therefore a function of the way in which
congestion is experienced and the required reaction to achieve
acceptable performance.  This latter is the primary topic of the
following sections.

3.  Managing Aggressive Flows

One of the keys to the success of the Internet has been the
congestion avoidance mechanisms of TCP.  Because TCP "backs off"
during congestion, a large number of TCP connections can share a
single, congested link in such a way that link bandwidth is shared
reasonably equitably among similarly situated flows.  The equitable
sharing of bandwidth among flows depends on all flows running
compatible congestion avoidance algorithms, i.e., methods conformant
with the current TCP specification [RFC5681].

In this document a flow is known as "TCP-friendly" when it has a
congestion response that approximates the average response expected
of a TCP flow.  One example of a TCP-friendly scheme is the
TCP-Friendly Rate Control algorithm [RFC5348].  In this document, the
term is used more generally to describe this and other algorithms
that meet these goals.

There are a variety of types of network flow.  Some convenient
classes that describe flows are: (1) TCP-friendly flows, (2)
unresponsive flows, i.e., flows that do not slow down when congestion
occurs, and (3) flows that are responsive but are less responsive to
congestion than TCP.  The last two classes contain more aggressive
flows that can pose significant threats to Internet performance.

1.  TCP-Friendly flows

    A TCP-friendly flow responds to congestion notification within a
    small number of path Round Trip Times (RTT), and in steady-state
    it uses no more capacity than a conformant TCP running under
    comparable conditions (drop rate, RTT, packet size, etc.).  This
    is described in the remainder of the document.

2.
Non-Responsive Flows 550 A flow that does not adjust its rate in response to congestion 551 notification within a small number of path RTTs can also use 552 more capacity than a conformant TCP running under comparable 553 conditions. There is a growing set of applications whose 554 congestion avoidance algorithms are inadequate or nonexistent 555 (i.e., a flow that does not throttle its sending rate when it 556 experiences congestion). 558 The User Datagram Protocol (UDP) [RFC0768] provides a minimal, 559 best-effort transport to applications and upper-layer protocols 560 (both simply called "applications" in the remainder of this 561 document) and does not itself provide mechanisms to prevent 562 congestion collapse and establish a degree of fairness [RFC5405]. 563 Examples that use UDP include some streaming applications for 564 packet voice and video, and some multicast bulk data transport. 565 Other traffic, when aggregated, may also become unresponsive to 566 congestion notification. If no action is taken, such 567 unresponsive flows could lead to a new congestion collapse 568 [RFC2914]. Some applications can even increase their traffic 569 volume in response to congestion (e.g., by adding forward error 570 correction when loss is experienced), with the possibility that 571 they contribute to congestion collapse. 573 In general, applications need to incorporate effective congestion 574 avoidance mechanisms [RFC5405]. Research continues to be needed 575 to identify and develop ways to accomplish congestion avoidance 576 for presently unresponsive applications. Network devices need to 577 be able to protect themselves against unresponsive flows, and 578 mechanisms to accomplish this must be developed and deployed. 579 Deployment of such mechanisms would provide an incentive for all 580 applications to become responsive by either using a congestion- 581 controlled transport (e.g., TCP, SCTP [RFC4960], and DCCP 582 [RFC4340])
or by incorporating their own congestion control in 583 the application [RFC5405], [RFC6679]. 585 3. Transport Flows that are less responsive than TCP 587 A second threat is posed by transport protocol implementations 588 that are responsive to congestion, but, either deliberately or 589 through faulty implementation, reduce less than a TCP flow would 590 have done in response to congestion. This covers a spectrum of 591 behaviours between (1) and (2). If applications are not 592 sufficiently responsive to congestion signals, they may gain an 593 unfair share of the available network capacity. 595 For example, the popularity of the Internet has caused a 596 proliferation in the number of TCP implementations. Some of 597 these may fail to implement the TCP congestion avoidance 598 mechanisms correctly because of poor implementation. Others may 599 deliberately be implemented with congestion avoidance algorithms 600 that are more aggressive in their use of capacity than other TCP 601 implementations; this would allow a vendor to claim to have a 602 "faster TCP". The logical consequence of such implementations 603 would be a spiral of increasingly aggressive TCP implementations, 604 leading back to the point where there is effectively no 605 congestion avoidance and the Internet is chronically congested. 607 Another example could be an RTP/UDP video flow that uses an 608 adaptive codec, but responds incompletely to indications of 609 congestion or responds over an excessively long time period. 610 Such flows are unlikely to be responsive to congestion signals in 611 a timeframe comparable to a small number of end-to-end 612 transmission delays. However, over a longer timescale, perhaps 613 seconds in duration, they could moderate their speed, or increase 614 their speed if they determine capacity to be available. 616 Tunneled traffic aggregates carrying multiple (short) TCP flows 617 can be more aggressive than standard bulk TCP. 
Applications 618 (e.g., web browsers primarily supporting HTTP 1.1 and peer-to- 619 peer file-sharing) have exploited this by opening multiple 620 connections to the same endpoint. 622 Lastly, some applications (e.g., web browsers primarily 623 supporting HTTP 1.1) open a large number of successive short TCP 624 flows for a single session. This can lead to each individual 625 flow spending the majority of time in the exponential TCP slow- 626 start phase, rather than in TCP congestion avoidance. The 627 resulting traffic aggregate can therefore be much less responsive 628 than a single standard TCP flow. 630 The projected increase in the fraction of total Internet traffic for 631 more aggressive flows in classes 2 and 3 could pose a threat to the 632 performance of the future Internet. There is therefore an urgent 633 need for measurements of current conditions and for further research 634 into the ways of managing such flows. This raises many difficult 635 issues in finding methods with an acceptable overhead cost that can 636 identify and isolate unresponsive flows or flows that are less 637 responsive than TCP. Finally, there is as yet little measurement or 638 simulation evidence available about the rate at which these threats 639 are likely to be realized, or about the expected benefit of 640 algorithms for managing such flows. 642 Another topic requiring consideration is the appropriate granularity 643 of a "flow" when considering a queue management method. There are a 644 few "natural" answers: 1) a transport (e.g., TCP or UDP) flow (source 645 address/port, destination address/port, protocol); 2) Differentiated 646 Services Code Point, DSCP; 3) a source/destination host pair (IP 647 address); 4) a given source host or a given destination host, or 648 various combinations of the above; 5) a subscriber or site receiving 649 the Internet service (enterprise or residential).
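The candidate flow granularities listed above can be sketched as alternative aggregation keys (a hypothetical illustration; the field names and function are not taken from this memo):

```python
def flow_key(pkt, granularity):
    """Return the aggregation key for a packet under several candidate
    flow granularities (sketch; packet field names are hypothetical)."""
    if granularity == "transport":   # (1) a transport flow: the 5-tuple
        return (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"], pkt["proto"])
    if granularity == "dscp":        # (2) Differentiated Services Code Point
        return pkt["dscp"]
    if granularity == "host-pair":   # (3) source/destination host pair
        return (pkt["src"], pkt["dst"])
    raise ValueError("unknown granularity: " + granularity)

pkt = {"src": "192.0.2.1", "sport": 5000, "dst": "198.51.100.2",
       "dport": 443, "proto": "tcp", "dscp": 0}
print(flow_key(pkt, "host-pair"))  # ('192.0.2.1', '198.51.100.2')
```

Coarser keys aggregate more traffic per queue-management state entry; finer keys isolate individual flows at higher cost, which is part of the policy trade-off discussed next.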
651 The source/destination host pair gives an appropriate granularity in 652 many circumstances. However, different vendors/providers use 653 different granularities for defining a flow (as a way of 654 "distinguishing" themselves from one another), and different 655 granularities may be chosen for different places in the network. It 656 may be the case that the granularity is less important than the fact 657 that a network device needs to be able to deal with more unresponsive 658 flows at *some* granularity. The granularity of flows for congestion 659 management is, at least in part, a question of policy that needs to 660 be addressed in the wider IETF community. 662 4. Conclusions and Recommendations 664 The IRTF, in publishing [RFC2309], and the IETF in subsequent 665 discussion, have developed a set of specific recommendations regarding 666 the implementation and operational use of AQM procedures. The 667 recommendations provided by this document are summarised as: 669 1. Network devices SHOULD implement some AQM mechanism to manage 670 queue lengths, reduce end-to-end latency, and avoid lock-out 671 phenomena within the Internet. 673 2. Deployed AQM algorithms SHOULD support Explicit Congestion 674 Notification (ECN) as well as loss to signal congestion to 675 endpoints. 677 3. AQM algorithms SHOULD NOT require tuning of initial or 678 configuration parameters in common use cases. 680 4. AQM algorithms SHOULD respond to measured congestion, not 681 application profiles. 683 5. AQM algorithms SHOULD NOT interpret specific transport protocol 684 behaviours. 686 6. Transport protocol congestion control algorithms SHOULD maximize 687 their use of available capacity (when there is data to send) 688 without incurring undue loss or undue round trip delay. 690 7.
Research, engineering, and measurement efforts are needed 691 regarding the design of mechanisms to deal with flows that are 692 unresponsive to congestion notification or are responsive, but 693 are more aggressive than present TCP. 695 These recommendations are expressed using the word "SHOULD". This is 696 in recognition that there may be use cases that have not been 697 envisaged in this document in which the recommendation does not 698 apply. Therefore, care should be taken in concluding that one's use 699 case falls in that category; during the life of the Internet, such 700 use cases have been rarely, if ever, observed and reported. To the 701 contrary, available research [Choi04] says that even high-speed links 702 in network cores that are normally very stable in depth and behavior 703 experience occasional issues that need moderation. The 704 recommendations are detailed in the following sections. 706 4.1. Operational deployments SHOULD use AQM procedures 708 AQM procedures are designed to minimize the delay and buffer 709 exhaustion induced in the network by queues that have filled as a 710 result of host behavior. Marking and loss behaviors provide a signal 711 that buffers within network devices are becoming unnecessarily full, 712 and that the sender would do well to moderate its behavior. 714 The use of scheduling mechanisms, such as priority queuing, classful 715 queuing, and fair queuing, is often effective in networks to help a 716 network serve the needs of a range of applications. Network 717 operators can use these methods to manage traffic passing a choke 718 point. This is discussed in [RFC2474] and [RFC2475]. When 719 scheduling is used, AQM should be applied across the classes or flows 720 as well as within each class or flow: 722 o AQM mechanisms need to control the overall queue sizes, to ensure 723 that arriving bursts can be accommodated without dropping packets.
725 o AQM mechanisms need to allow combination with other mechanisms, 726 such as scheduling, to allow implementation of policies for 727 providing fairness between different flows. 729 o AQM should be used to control the queue size for each individual 730 flow or class, so that they do not experience unnecessarily high 731 delay. 733 4.2. Signaling to the transport endpoints 735 There are a number of ways a network device may signal to the end 736 point that the network is becoming congested and trigger a reduction 737 in rate. The signalling methods include: 739 o Delaying transport segments (packets) in flight, such as in a 740 queue. 742 o Dropping transport segments (packets) in transit. 744 o Marking transport segments (packets), such as using Explicit 745 Congestion Notification (ECN) [RFC3168] [RFC4301] [RFC4774] [RFC6040] 746 [RFC6679]. 748 Increased network latency is used as an implicit signal of 749 congestion. For example, in TCP, additional delay can affect ACK clocking 750 and has the result of reducing the rate of transmission of new data. 751 In the Real Time Protocol (RTP), network latency impacts the RTCP- 752 reported RTT, and increased latency can trigger a sender to adjust its 753 rate. Methods such as Low Extra Delay Background Transport (LEDBAT) 754 [RFC6817] assume increased latency as a primary signal of congestion. 755 Appropriate use of delay-based methods and the implications of AQM 756 presently remain an area for further research. 758 It is essential that all Internet hosts respond to loss [RFC5681], 759 [RFC5405], [RFC4960], [RFC4340]. Packet dropping by network devices that 760 are under load has two effects: it protects the network, which is the 761 primary reason that network devices drop packets.
The detection of 762 loss also provides a signal to a reliable transport (e.g., TCP, SCTP) 763 that there is potential congestion using a pragmatic heuristic: "when 764 the network discards a message in flight, it may imply the presence 765 of faulty equipment or media in a path, and it may imply the presence 766 of congestion. To be conservative, a transport must assume it may be 767 the latter." Applications using unreliable transports (e.g., using 768 UDP) need to react similarly to loss [RFC5405]. 770 Network devices SHOULD use an AQM algorithm to measure local 771 congestion and to determine the packets to mark or drop so that the 772 congestion is managed. 774 In general, dropping multiple packets from the same session in the 775 same RTT is ineffective, and can reduce throughput. Also, dropping 776 or marking packets from multiple sessions simultaneously can have the 777 effect of synchronizing them, resulting in increasing peaks and 778 troughs in the subsequent traffic load. Hence, AQM algorithms SHOULD 779 randomize dropping in time, to reduce the probability that congestion 780 indications are only experienced by a small proportion of the active 781 flows. 783 Loss due to dropping also has an effect on the efficiency of a flow 784 and can significantly impact some classes of application. In 785 reliable transports, the dropped data must be subsequently 786 retransmitted. While other applications/transports may adapt to the 787 absence of lost data, this still implies inefficient use of available 788 capacity, and the dropped traffic can affect other flows. Hence, 789 congestion signalling by loss is not entirely positive; it is a 790 necessary evil. 792 4.2.1. AQM and ECN 794 Explicit Congestion Notification (ECN) [RFC4301] [RFC4774] [RFC6040] 795 [RFC6679] is a network-layer function that allows a transport to 796 receive network congestion information from a network device without 797 incurring the unintended consequences of loss.
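One way an AQM might randomize its drop decision in time, and prefer CE-marking for ECN-capable traffic, is illustrated by the following RED-style sketch (not a recommended or standardized algorithm; thresholds and names are hypothetical):

```python
import random

def aqm_decision(avg_queue, min_th, max_th, max_p, ecn_capable,
                 rng=random.random):
    """RED-style sketch: as the smoothed queue depth grows between two
    thresholds, mark (if ECN-capable) or drop with rising probability."""
    if avg_queue < min_th:
        return "forward"
    if avg_queue >= max_th:
        return "mark" if ecn_capable else "drop"
    # Probability grows linearly between the thresholds, so congestion
    # signals are spread randomly in time across the active flows.
    p = max_p * (avg_queue - min_th) / (max_th - min_th)
    if rng() < p:
        return "mark" if ecn_capable else "drop"
    return "forward"

print(aqm_decision(2, 5, 15, 0.1, ecn_capable=False))  # forward
print(aqm_decision(20, 5, 15, 0.1, ecn_capable=True))  # mark
```

Because the decision is probabilistic rather than tied to a full buffer, congestion indications are distributed across flows instead of synchronizing them, in line with the recommendation above.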
ECN includes both 798 transport mechanisms and functions implemented in network devices; 799 the latter rely upon AQM to decide whether and when to ECN- 800 mark. 802 Congestion for ECN-capable transports is signalled by a network 803 device setting the "Congestion Experienced (CE)" codepoint in the IP 804 header. This codepoint is noted by the remote receiving end point 805 and signalled back to the sender using a transport protocol 806 mechanism, allowing the sender to trigger timely congestion control. 807 The decision to set the CE codepoint requires an AQM algorithm 808 configured with a threshold. Non-ECN-capable flows (the default) are 809 dropped under congestion. 811 Network devices SHOULD use an AQM algorithm that marks ECN-capable 812 traffic when making decisions about the response to congestion. 814 Network devices need to implement this method by marking ECN-capable 815 traffic or by dropping non-ECN-capable traffic. 817 Safe deployment of ECN requires that network devices drop excessive 818 traffic, even when marked as originating from an ECN-capable 819 transport. This is a necessary safety precaution because: 821 1. A non-conformant, broken, or malicious receiver could conceal an 822 ECN mark and not report this to the sender; 824 2. A non-conformant, broken, or malicious sender could ignore a 825 reported ECN mark, as it could ignore a loss without using ECN; 827 3. A malfunctioning or non-conforming network device may "hide" an 828 ECN mark (or fail to correctly set the ECN codepoint at an egress 829 of a network tunnel). 831 In normal operation, such cases should be very uncommon; however, 832 overload protection is desirable to protect traffic from 833 misconfigured or malicious use of ECN (e.g., a denial-of-service 834 attack that generates ECN-capable traffic that is unresponsive to CE- 835 marking). 837 An AQM algorithm that supports ECN needs to define the threshold and 838 algorithm for ECN-marking.
This threshold MAY differ from that used 839 for dropping packets that are not marked as ECN-capable, and SHOULD 840 be configurable. 842 Network devices SHOULD use an algorithm to drop excessive traffic 843 (e.g., at some level above the threshold for CE-marking), even when 844 the packets are marked as originating from an ECN-capable transport. 846 4.3. AQM algorithms deployed SHOULD NOT require operational tuning 848 A number of AQM algorithms have been proposed. Many require some 849 form of tuning or setting of parameters for initial network 850 conditions. This can make these algorithms difficult to use in 851 operational networks. 853 AQM algorithms need to consider both "initial conditions" and 854 "operational conditions". The former includes values that exist 855 before any experience is gathered about the use of the algorithm, 856 such as the configured speed of the interface, support for full-duplex 857 communication, the interface MTU, and other properties of the link. The 858 latter includes information observed from monitoring the size of the 859 queue, experienced queueing delay, rate of packet discard, etc. 861 This document therefore specifies that AQM algorithms that are 862 proposed for deployment in the Internet have the following 863 properties: 865 o AQM algorithm deployment SHOULD NOT require tuning in common use 866 cases. An algorithm needs to provide a default behaviour that 867 auto-tunes to a reasonable performance for typical network 868 operational conditions. This is expected to ease deployment and 869 operation. Initial conditions, such as the interface rate and MTU 870 size or other values derived from these, MAY be required by an AQM 871 algorithm. 873 o An algorithm MAY support further manual tuning that could improve performance 874 in a specific deployed network. Algorithms that lack such 875 variables are acceptable, but if such variables exist, they SHOULD 876 be externalized (made visible to the operator).
Guidance needs to 877 be provided on the cases where auto-tuning is unlikely to achieve 878 acceptable performance and to identify the set of parameters that 879 can be tuned. For example, the expected response of an algorithm 880 may need to be configured to accommodate the largest expected path 881 RTT, since this value cannot be known at initialization. This 882 guidance is expected to enable the algorithm to be deployed in 883 networks that have specific characteristics (paths with variable/ 884 larger delay; networks where capacity is impacted by interactions 885 with lower-layer mechanisms, etc.). 887 o An algorithm MAY provide logging and alarm signals to assist in identifying if 888 it is functioning as 889 expected under manual or auto-tuning (e.g., this could be based on an internal consistency 890 check between input, output, and mark/drop rates over time). This 891 is expected to encourage deployment by default and allow operators 892 to identify potential interactions with other network functions. 894 Hence, self-tuning algorithms are to be preferred. Algorithms 895 recommended for general Internet deployment by the IETF need to be 896 designed so that they do not require operational (especially manual) 897 configuration or tuning. 899 4.4. AQM algorithms SHOULD respond to measured congestion, not 900 application profiles. 902 Not all applications transmit packets of the same size. Although 903 applications may be characterized by particular profiles of packet 904 size, this should not be used as the basis for AQM (see next section). 905 Other methods exist, e.g., Differentiated Services queueing and Pre- 906 Congestion Notification (PCN) [RFC5559], that can be used to 907 differentiate and police classes of application. Network devices may 908 combine AQM with these traffic classification mechanisms and perform 909 AQM only on specific queues within a network device.
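The combination of traffic classification with per-queue AQM can be sketched as follows (illustrative only; the class, names, and depth test below are hypothetical placeholders, not a real AQM algorithm):

```python
from collections import defaultdict

class TinyAqmQueue:
    """A stand-in for one AQM-managed queue; the simple depth test is
    a placeholder for a real AQM algorithm's mark/drop decision."""
    def __init__(self, limit=20):
        self.limit = limit
        self.queue = []

    def enqueue(self, packet):
        if len(self.queue) >= self.limit:
            return "drop"
        self.queue.append(packet)
        return "queued"

# Classify on the DSCP, then run an independent AQM instance per queue,
# so congestion in one traffic class is managed separately from others.
queues = defaultdict(TinyAqmQueue)
for dscp in (0, 46, 0):
    queues[dscp].enqueue({"dscp": dscp})
print({dscp: len(q.queue) for dscp, q in queues.items()})  # {0: 2, 46: 1}
```

Each queue carries only the traffic selected by the classifier, so the AQM instance attached to it responds to the congestion actually measured for that class.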
911 An AQM algorithm should not deliberately try to prejudice the size of 912 packet that performs best (i.e., preferentially drop/mark based only 913 on packet size). Procedures for selecting packets to mark/drop 914 SHOULD observe the actual or projected time that a packet is in a 915 queue (bytes at a rate being an analog to time). When an AQM 916 algorithm decides whether to drop (or mark) a packet, it is 917 RECOMMENDED that the size of the particular packet should not be 918 taken into account [RFC7141]. 920 Applications (or transports) generally know the packet size that they 921 are using and can hence make their judgments about whether to use 922 small or large packets based on the data they wish to send and the 923 expected impact on the delay or throughput, or other performance 924 parameter. When a transport or application responds to a dropped or 925 marked packet, the size of the rate reduction should be proportionate 926 to the size of the packet that was sent [RFC7141]. 928 An AQM-enabled system MAY instantiate different instances of an AQM 929 algorithm to be applied within the same traffic class. Traffic 930 classes may be differentiated based on an Access Control List (ACL), 931 the packet Differentiated Services Code Point (DSCP) [RFC2474], 932 enabling use of the ECN field (i.e., any of ECT(0), ECT(1), or 933 CE) [RFC3168] [RFC4774], a multi-field (MF) classifier that combines 934 the values of a set of protocol fields (e.g., IP address, transport, 935 ports), or an equivalent codepoint at a lower layer. This 936 recommendation goes beyond what is defined in RFC 3168, by allowing 937 that an implementation MAY use more than one instance of an AQM 938 algorithm to handle both ECN-capable and non-ECN-capable packets. 940 4.5.
AQM algorithms SHOULD NOT be dependent on specific transport 941 protocol behaviours 943 In deploying AQM, network devices need to support a range of Internet 944 traffic and SHOULD NOT make implicit assumptions about the 945 characteristics desired by the set of transports/applications the 946 network supports. That is, AQM methods should be opaque to the 947 choice of transport and application. 949 AQM algorithms are often evaluated by considering TCP [RFC0793] with 950 a limited number of applications. Although TCP is the predominant 951 transport in the Internet today, this no longer represents a 952 sufficient selection of traffic for verification. There is 953 significant use of UDP [RFC0768] in voice and video services, and 954 some applications find utility in SCTP [RFC4960] and DCCP [RFC4340]. 955 Hence, AQM algorithms should also demonstrate operation with 956 transports other than TCP and need to consider a variety of 957 applications. Selection of AQM algorithms also needs to consider use 958 of tunnel encapsulations that may carry traffic aggregates. 960 AQM algorithms SHOULD NOT target or derive implicit assumptions about 961 the characteristics desired by specific transports/applications. 962 Transports and applications need to respond to the congestion signals 963 provided by AQM (i.e., dropping or ECN-marking) in a timely manner 964 (within a few RTTs at the latest). 966 4.6. Interactions with congestion control algorithms 968 Applications and transports need to react to received implicit or 969 explicit signals that indicate the presence of congestion. This 970 section identifies issues that can impact the design of transport 971 protocols when using paths that use AQM. 973 Transport protocols and applications need timely signals of 974 congestion. The time taken to detect and respond to congestion is 975 increased when network devices queue packets in buffers.
It can be 976 difficult to detect tail losses at a higher layer, and this may 977 sometimes require transport timers or probe packets to detect and 978 respond to such loss. Loss patterns may also impact timely 979 detection, e.g., the time may be reduced when network devices do not 980 drop long runs of packets from the same flow. 982 A common objective of an elastic transport congestion control 983 protocol is to allow an application to deliver the maximum rate of 984 data without inducing excessive delays when packets are queued in 985 buffers within the network. To achieve this, a transport should try 986 to operate at a rate below the inflexion point of the load/delay curve 987 (the bend of what is sometimes called a "hockey-stick" curve) 988 [Jain94]. When the congestion window allows the load to approach 989 this bend, the end-to-end delay starts to rise - a result of 990 congestion, as packets probabilistically arrive at non-overlapping 991 times. On the one hand, a transport that operates above this point 992 can experience congestion loss and could also trigger operator 993 activities, such as those discussed in [RFC6057]. On the other hand, 994 a flow may achieve both near-maximum throughput and low latency when 995 it operates close to this knee point, with minimal contribution to 996 router congestion. Choice of an appropriate rate/congestion window 997 can therefore significantly impact the loss and delay experienced by 998 a flow and will impact other flows that share a common network queue. 1000 Some applications may send less than permitted by the congestion 1001 control window (or rate). Examples include multimedia codecs that 1002 stream at some natural rate (or set of rates) or an application that 1003 is naturally interactive (e.g., some web applications, interactive 1004 server-based gaming, transaction-based protocols). Such applications 1005 may have different objectives.
They may not wish to maximize 1006 throughput, but may desire a lower loss rate or bounded delay. 1008 The correct operation of an AQM-enabled network device MUST NOT rely 1009 upon specific transport responses to congestion signals. 1011 4.7. The need for further research 1013 The second recommendation of [RFC2309] called for further research 1014 into the interaction between network queues and host applications, 1015 and the means of signaling between them. This research has occurred, 1016 and we as a community have learned a lot. However, we are not done. 1018 We have learned that the problems of congestion, latency, and buffer- 1019 sizing have not gone away, and are becoming more important to many 1020 users. A number of self-tuning AQM algorithms have been found that 1021 offer significant advantages for deployed networks. There is also 1022 renewed interest in deploying AQM and the potential of ECN. 1024 Traffic patterns can depend on the network deployment scenario, and 1025 Internet research therefore needs to consider the implications of a 1026 diverse range of application interactions. At the time of writing 1027 (in 2015), an obvious example of further research is the need to 1028 consider the many-to-one communication patterns found in data 1029 centers, known as incast [Ren12] (e.g., produced by Map/Reduce 1030 applications). Such analysis needs to study not only each 1031 application traffic type, but should also include combinations of 1032 types of traffic. 1034 Research also needs to consider the need to extend our taxonomy of 1035 transport sessions to include not only "mice" and "elephants", but 1036 also "lemmings", where "lemmings" are flash crowds of "mice" that the 1037 network inadvertently tries to signal to as if they were elephant 1038 flows, resulting in head-of-line blocking in a data center deployment 1039 scenario. 1041 Examples of other required research include: 1043 o Research into new AQM and scheduling algorithms.
1045 o Appropriate use of delay-based methods and the implications of 1046 AQM. 1048 o Research into suitable algorithms for marking ECN-capable packets 1049 that do not require operational configuration or tuning for common 1050 use. 1052 o Experience in the deployment of ECN alongside AQM. 1054 o Tools for enabling AQM (and ECN) deployment and measuring the 1055 performance. 1057 o Methods for mitigating the impact of non-conformant and malicious 1058 flows. 1060 o Research to understand the implications of using new network and 1061 transport methods on applications. 1063 This document therefore reiterates the call of RFC 2309: we 1064 need continuing research as applications develop. 1066 5. IANA Considerations 1068 This memo asks the IANA for no new parameters. 1070 6. Security Considerations 1072 While security is a very important issue, it is largely orthogonal to 1073 the performance issues discussed in this memo. 1075 This recommendation requires algorithms to be independent of specific 1076 transport or application behaviors. Therefore, a network device does 1077 not require visibility or access to upper-layer protocol information 1078 to implement an AQM algorithm. This ability to operate in an 1079 application-agnostic fashion is therefore an example of a privacy- 1080 enhancing feature. 1082 Many deployed network devices use queueing methods that allow 1083 unresponsive traffic to capture network capacity, denying access to 1084 other traffic flows. This could potentially be used as a denial-of- 1085 service attack. This threat could be reduced in network devices that 1086 deploy AQM or some form of scheduling. We note, however, that a 1087 denial-of-service attack that results in unresponsive traffic flows 1088 may be indistinguishable from other traffic flows (e.g., tunnels 1089 carrying aggregates of short flows, high-rate isochronous 1090 applications).
New methods therefore may remain vulnerable, and this 1091 document recommends that ongoing research should consider ways to 1092 mitigate such attacks. 1094 7. Privacy Considerations 1096 This document, by itself, presents no new privacy issues. 1098 8. Acknowledgements 1100 The original version of this document describing best current 1101 practice was based on the informational text of [RFC2309]. This was 1102 written by the End-to-End Research Group, which is to say Bob Braden, 1103 Dave Clark, Jon Crowcroft, Bruce Davie, Steve Deering, Deborah 1104 Estrin, Sally Floyd, Van Jacobson, Greg Minshall, Craig Partridge, 1105 Larry Peterson, KK Ramakrishnan, Scott Shenker, John Wroclawski, and 1106 Lixia Zhang. Although there are important differences, many of the 1107 key arguments in the present document remain unchanged from those in 1108 RFC 2309. 1110 The need for an updated document was agreed to in the tsvarea meeting 1111 at IETF 86. This document was reviewed on the aqm@ietf.org list. 1112 Comments were received from Colin Perkins, Richard Scheffenegger, 1113 Dave Taht, John Leslie, David Collier-Brown and many others. 1115 Gorry Fairhurst was in part supported by the European Community under 1116 its Seventh Framework Programme through the Reducing Internet 1117 Transport Latency (RITE) project (ICT-317700). 1119 9. References 1121 9.1. Normative References 1123 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1124 Requirement Levels", BCP 14, RFC 2119, March 1997. 1126 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 1127 of Explicit Congestion Notification (ECN) to IP", RFC 1128 3168, September 2001. 1130 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 1131 Internet Protocol", RFC 4301, December 2005. 1133 [RFC4774] Floyd, S., "Specifying Alternate Semantics for the 1134 Explicit Congestion Notification (ECN) Field", BCP 124, 1135 RFC 4774, November 2006. 1137 [RFC5405] Eggert, L. and G. 
Fairhurst, "Unicast UDP Usage Guidelines 1138 for Application Designers", BCP 145, RFC 5405, November 1139 2008. 1141 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 1142 Control", RFC 5681, September 2009. 1144 [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion 1145 Notification", RFC 6040, November 2010. 1147 [RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P., 1148 and K. Carlberg, "Explicit Congestion Notification (ECN) 1149 for RTP over UDP", RFC 6679, August 2012. 1151 [RFC7141] Briscoe, B. and J. Manner, "Byte and Packet Congestion 1152 Notification", BCP 41, RFC 7141, February 2014. 1154 9.2. Informative References 1156 [AQM-WG] "IETF AQM WG", . 1158 [Bri15] Briscoe, Bob., Brunstrom, Anna., Petlund, Andreas., Hayes, 1159 David., Ros, David., Tsang, Ing-Jyh., Gjessing, Stein., 1160 Fairhurst, Gorry., Griwodz, Carsten., and Michael. Welzl, 1161 "Reducing Internet Latency: A Survey of Techniques and 1162 their Merit, IEEE Communications Surveys & Tutorials", 1163 2015. 1165 [CONEX] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) 1166 Concepts, Abstract Mechanism and Requirements", IETF 1167 (Work-in-Progress) draft-ietf-conex-abstract-mech, March 1168 2014. 1170 [Choi04] Choi, Baek-Young., Moon, Sue., Zhang, Zhi-Li., 1171 Papagiannaki, K., and C. Diot, "Analysis of Point-To-Point 1172 Packet Delay In an Operational Network", March 2004. 1174 [Dem90] Demers, A., Keshav, S., and S. Shenker, "Analysis and 1175 Simulation of a Fair Queueing Algorithm, Internetworking: 1176 Research and Experience", SIGCOMM Symposium proceedings on 1177 Communications architectures and protocols , 1990. 1179 [ECN-Benefit] 1180 Welzl, M. and G. Fairhurst, "The Benefits to Applications 1181 of using Explicit Congestion Notification (ECN)", IETF 1182 (Work-in-Progress) , February 2014. 1184 [Flo92] Floyd, S. and V. Jacobson, "On Traffic Phase Effects in 1185 Packet-Switched Gateways", 1992. 1187 [Flo94] Floyd, S. and V.
Jacobson, "The Synchronization of 1188 Periodic Routing Messages", 1189 http://ee.lbl.gov/papers/sync_94.pdf, 1994. 1191 [Floyd91] Floyd, S., "Connections with Multiple Congested Gateways 1192 in Packet-Switched Networks Part 1: One-way Traffic", 1193 Computer Communications Review, October 1991. 1195 [Floyd95] Floyd, S. and V. Jacobson, "Link-sharing and Resource 1196 Management Models for Packet Networks", IEEE/ACM 1197 Transactions on Networking, August 1995. 1199 [Jacobson88] 1200 Jacobson, V., "Congestion Avoidance and Control", SIGCOMM 1201 Symposium proceedings on Communications architectures and 1202 protocols, August 1988. 1204 [Jain94] Jain, R., Ramakrishnan, K., and D. Chiu, 1205 "Congestion avoidance scheme for computer networks", US 1206 Patent Office 5377327, December 1994. 1208 [Lakshman96] 1209 Lakshman, TV., Neidhardt, A., and T. Ott, "The Drop From 1210 Front Strategy in TCP Over ATM and Its Interworking with 1211 Other Control Features", IEEE INFOCOM, 1996. 1213 [Leland94] 1214 Leland, W., Taqqu, M., Willinger, W., and D. Wilson, "On 1215 the Self-Similar Nature of Ethernet Traffic (Extended 1216 Version)", IEEE/ACM Transactions on Networking, February 1217 1994. 1219 [McK90] McKenney, PE. and G. Varghese, "Stochastic Fairness 1220 Queuing", 1221 http://www2.rdrop.com/~paulmck/scalability/paper/ 1222 sfq.2002.06.04.pdf, 1990. 1224 [Nic12] Nichols, K., "Controlling Queue Delay", Communications of 1225 the ACM Vol. 55 No. 11, pp. 42-50, July 2012. 1227 [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, 1228 August 1980. 1230 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, September 1231 1981. 1233 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, RFC 1234 793, September 1981. 1236 [RFC0896] Nagle, J., "Congestion control in IP/TCP internetworks", 1237 RFC 896, January 1984. 1239 [RFC0970] Nagle, J., "On packet switches with infinite storage", RFC 1240 970, December 1985.
1242 [RFC1122] Braden, R., "Requirements for Internet Hosts - 1243 Communication Layers", STD 3, RFC 1122, October 1989. 1245 [RFC1633] Braden, R., Clark, D., and S. Shenker, "Integrated 1246 Services in the Internet Architecture: an Overview", RFC 1247 1633, June 1994. 1249 [RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, 1250 S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., 1251 Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, 1252 S., Wroclawski, J., and L. Zhang, "Recommendations on 1253 Queue Management and Congestion Avoidance in the 1254 Internet", RFC 2309, April 1998. 1256 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1257 (IPv6) Specification", RFC 2460, December 1998. 1259 [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, 1260 "Definition of the Differentiated Services Field (DS 1261 Field) in the IPv4 and IPv6 Headers", RFC 2474, December 1262 1998. 1264 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 1265 and W. Weiss, "An Architecture for Differentiated 1266 Services", RFC 2475, December 1998. 1268 [RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, RFC 1269 2914, September 2000. 1271 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 1272 Congestion Control Protocol (DCCP)", RFC 4340, March 2006. 1274 [RFC4960] Stewart, R., "Stream Control Transmission Protocol", RFC 1275 4960, September 2007. 1277 [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP 1278 Friendly Rate Control (TFRC): Protocol Specification", RFC 1279 5348, September 2008. 1281 [RFC5559] Eardley, P., "Pre-Congestion Notification (PCN) 1282 Architecture", RFC 5559, June 2009. 1284 [RFC6057] Bastian, C., Klieber, T., Livingood, J., Mills, J., and R. 1285 Woundy, "Comcast's Protocol-Agnostic Congestion Management 1286 System", RFC 6057, December 2010. 1288 [RFC6789] Briscoe, B., Woundy, R., and A.
Cooper, "Congestion 1289 Exposure (ConEx) Concepts and Use Cases", RFC 6789, 1290 December 2012. 1292 [RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind, 1293 "Low Extra Delay Background Transport (LEDBAT)", RFC 6817, 1294 December 2012. 1296 [RFC7414] Duke, M., Braden, R., Eddy, W., Blanton, E., and A. 1297 Zimmermann, "A Roadmap for Transmission Control Protocol 1298 (TCP) Specification Documents", RFC 7414, February 2015. 1300 [Ren12] Ren, Y., Zhao, Y., and P. Liu, "A survey on TCP Incast in 1301 data center networks", International Journal of 1302 Communication Systems, Volume 27, Issue 8, pages 1303 1160-117, 2012. 1305 [Shr96] Shreedhar, M. and G. Varghese, "Efficient Fair Queueing 1306 Using Deficit Round Robin", IEEE/ACM Transactions on 1307 Networking Vol 4, No. 3, July 1996. 1309 [Sto97] Stoica, I. and H. Zhang, "A Hierarchical Fair Service 1310 Curve algorithm for Link sharing, real-time and priority 1311 services", ACM SIGCOMM, 1997. 1313 [Sut99] Suter, B., "Buffer Management Schemes for Supporting TCP 1314 in Gigabit Routers with Per-flow Queueing", IEEE Journal 1315 on Selected Areas in Communications Vol. 17 Issue 6, 1316 pp. 1159-1169, June 1999. 1318 [Willinger95] 1319 Willinger, W., Taqqu, M., Sherman, R., Wilson, D., and V. 1320 Jacobson, "Self-Similarity Through High-Variability: 1321 Statistical Analysis of Ethernet LAN Traffic at the Source 1322 Level", SIGCOMM Symposium proceedings on Communications 1323 architectures and protocols, August 1995. 1325 [Zha90] Zhang, L. and D. Clark, "Oscillating Behavior of Network 1326 Traffic: A Case Study Simulation", 1327 http://groups.csail.mit.edu/ana/Publications/Zhang-DDC- 1328 Oscillating-Behavior-of-Network-Traffic-1990.pdf, 1990. 1330 Appendix A. Change Log 1332 RFC-Editor please remove this appendix before publication.
1334 Initial Version: March 2013 1335 Minor update of the algorithms that the IETF recommends SHOULD NOT 1336 require operational (especially manual) configuration or tuning 1337 April 2013 1339 Major surgery. This draft is for discussion at IETF-87 and expected 1340 to be further updated. 1341 July 2013 1343 -00 WG Draft - Updated transport recommendations; revised deployment 1344 configuration section; numerous minor edits. 1345 Oct 2013 1347 -01 WG Draft - Updated transport recommendations; revised deployment 1348 configuration section; numerous minor edits. 1349 Jan 2014 - Feedback from WG. 1351 -02 WG Draft - Minor edits Feb 2014 - Mainly language fixes. 1353 -03 WG Draft - Minor edits Feb 2014 - Comments from David Collier- 1354 Brown and David Taht. 1356 -04 WG Draft - Minor edits May 2014 - Comments during WGLC: Provided 1357 some introductory subsections to help people (with subsections and 1358 better text). - Written more on the role of scheduling. - Clarified 1359 that the ECN mark threshold needs to be configurable. - Reworked the 1360 "knee" para. Various updates in response to feedback. 1362 -05 WG Draft - Minor edits June 2014 - New text added to address 1363 further comments, and improve introduction - adding context, 1364 reference to ConEx, linking between sections, added text on 1365 synchronization. 1367 -06 WG Draft - Minor edits July 2014 - Reorganised the introduction 1368 following WG feedback to better explain how this relates to the 1369 original goals of RFC2309. Added item on packet bursts. Various 1370 minor corrections incorporated - no change to main 1371 recommendations. 1373 -07 WG Draft - Minor edits July 2014 - Replaced ID REF by RFC 7141. 1374 Changes made to introduction following inputs from Wes Eddy and 1375 John Leslie. Corrections and additions proposed by Bob Briscoe. 1377 -08 WG Draft - Minor edits August 2014 - Review comments from John 1378 Leslie and Bob Briscoe.
Text corrections including: updated 1379 Acknowledgements (RFC2309 ref); s/congestive/congestion/g; changed 1380 the bolder language from RFC2309 to reflect a more considered 1381 perceived threat to Internet performance; modified the category 1382 that is not-TCP-like to be "less responsive to congestion than 1383 TCP" and more clearly noted that this represents a range of 1384 behaviours. 1386 -09 WG Draft - Minor edits Jan 2015 - Edits following LC comments. 1388 -10 WG Draft - Minor edits Feb 2015 - Update following IESG Review 1390 o Gorry's Unresolved to-do list 1392 o ---------------- 1394 o DISCUSS: This sentence above could be understood in different 1395 ways. For example, that any configuration is wrong. The ability 1396 to activate AQM is a good thing IMO. The section 4.3 title is 1397 closer to what you intend to say: "AQM algorithms deployed SHOULD 1398 NOT require operational tuning" The issue is that you only define 1399 what you mean by "operational configuration" in section 4.3 1401 o ---------------- 1403 o DISCUSS1: OLD: 3. The algorithms that the IETF recommends SHOULD 1404 NOT require operational (especially manual) configuration or 1405 tuning. NEW: AQM algorithm deployment SHOULD NOT require tuning 1406 of initial or configuration parameters. Gorry proposes: AQM 1407 algorithms specified by the IETF SHOULD NOT require tuning of 1408 initial or configuration parameters. 1410 o ------------------------- 1412 o DISCUSS2: OLD: 4.3 AQM algorithms deployed SHOULD NOT require 1413 operational tuning NEW: 4.3 AQM algorithm deployment SHOULD NOT 1414 require tuning 1416 o --------------------------------- 1418 o COMMENT: I see relatively little mention, including here, that 1419 the time scales for queue management and for application 1420 responsiveness to congestion signals are wildly 1421 different, e.g. one is measured in usecs, the other is bounded by 1422 RTT.
Queue sizing and policing around aberrant events, for example 1423 micro-AS loops driven by a prefix withdrawal, aren't really dealt with 1424 at all. 1426 o - GF: This is true, although we hint at the same issue with paths 1427 with satellite delay (when RTT >> AQM loop response time). Do 1428 we have ideas of what to write? 1430 o --------------------------------- 1431 o GEN-ART COMMENT: "Ensuring that mechanisms do not interact badly: 1432 Given that a number of different mechanisms are being developed 1433 and potentially may all be deployed in various quantities in 1434 routers, etc., along the path that a packet takes, ensuring that 1435 this does not lead to instability or other interactions should 1436 also be a target of research. A number of applications now have 1437 flow control mechanisms that may be deployed as an adjunct to TCP 1438 so that a single path may have multiple nested end-to-end feedback 1439 loops (notably HTTP/2, just about to be standardized!) and it 1440 would be very wise to ensure that adding AQM into the loop does 1441 not lead to problems." - GF: What is this comment about? I know 1442 the problems of flow control interaction in HTTP/2 with TCP, but 1443 I don't understand how that applies and what new text this needs: 1444 Ideas? 1446 Authors' Addresses 1448 Fred Baker (editor) 1449 Cisco Systems 1450 Santa Barbara, California 93117 1451 USA 1453 Email: fred@cisco.com 1455 Godred Fairhurst (editor) 1456 University of Aberdeen 1457 School of Engineering 1458 Fraser Noble Building 1459 Aberdeen, Scotland AB24 3UE 1460 UK 1462 Email: gorry@erg.abdn.ac.uk 1463 URI: http://www.erg.abdn.ac.uk