Network Working Group                                     F. Baker, Ed.
Internet-Draft                                            Cisco Systems
Obsoletes: 2309 (if approved)                         G. Fairhurst, Ed.
Intended status: Best Current Practice           University of Aberdeen
Expires: July 17, 2015                                 January 13, 2015

        IETF Recommendations Regarding Active Queue Management
                   draft-ietf-aqm-recommendation-09

Abstract

   This memo presents recommendations to the Internet community
   concerning measures to improve and preserve Internet performance.
   It presents a strong recommendation for testing, standardization,
   and widespread deployment of active queue management (AQM) in
   network devices, to improve the performance of today's Internet.
   It also urges a concerted effort of research, measurement, and
   ultimate deployment of AQM mechanisms to protect the Internet from
   flows that are not sufficiently responsive to congestion
   notification.

   This note largely repeats the recommendations of RFC 2309, which it
   replaces after fifteen years of experience and new research.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   This Internet-Draft will expire on July 17, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.
   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Congestion Collapse
     1.2.  Active Queue Management to Manage Latency
     1.3.  Document Overview
     1.4.  Changes to the recommendations of RFC2309
     1.5.  Requirements Language
   2.  The Need For Active Queue Management
     2.1.  AQM and Multiple Queues
     2.2.  AQM and Explicit Congestion Marking (ECN)
     2.3.  AQM and Buffer Size
   3.  Managing Aggressive Flows
   4.  Conclusions and Recommendations
     4.1.  Operational deployments SHOULD use AQM procedures
     4.2.  Signaling to the transport endpoints
       4.2.1.  AQM and ECN
     4.3.  AQM algorithms deployed SHOULD NOT require operational
           tuning
     4.4.  AQM algorithms SHOULD respond to measured congestion, not
           application profiles
     4.5.  AQM algorithms SHOULD NOT be dependent on specific
           transport protocol behaviours
     4.6.  Interactions with congestion control algorithms
     4.7.  The need for further research
   5.  IANA Considerations
   6.  Security Considerations
   7.  Privacy Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Change Log
   Authors' Addresses

1.  Introduction

   The Internet protocol architecture is based on a connectionless
   end-to-end packet service using the Internet Protocol, whether IPv4
   [RFC0791] or IPv6 [RFC2460].  The advantages of its connectionless
   design, flexibility and robustness, have been amply demonstrated.
   However, these advantages are not without cost: careful design is
   required to provide good service under heavy load.  In fact, lack
   of attention to the dynamics of packet forwarding can result in
   severe service degradation or "Internet meltdown".  This phenomenon
   was first observed during the early growth phase of the Internet in
   the mid 1980s [RFC0896][RFC0970]; it is technically called
   "congestion collapse" and was a key focus of RFC2309.

   Since 1998, when RFC2309 was written, the Internet has come to be
   used for a wide variety of traffic.  In the current Internet, low
   latency is extremely important for many interactive and
   transaction-based applications.  The same type of technology that
   RFC2309 advocated for combating congestion collapse is also
   effective at limiting queueing delays, reducing the interaction
   delay experienced by applications.
   While there is still a need to avoid congestion collapse, there is
   now also a focus on reducing network latency using the same
   technology.

1.1.  Congestion Collapse

   The original fix for Internet meltdown was provided by Van
   Jacobson.  Beginning in 1986, Jacobson developed the congestion
   avoidance mechanisms [Jacobson88] that are now required for
   implementations of the Transmission Control Protocol (TCP)
   [RFC0793] [RFC1122].  These mechanisms operate in Internet hosts to
   cause TCP connections to "back off" during congestion.  We say that
   TCP flows are "responsive" to congestion signals (i.e., packets
   that are dropped or marked with explicit congestion notification
   [RFC3168]).  It is primarily these TCP congestion avoidance
   algorithms that prevent the congestion collapse of today's
   Internet.  Similar algorithms are specified for other non-TCP
   transports.

   However, that is not the end of the story.  Considerable research
   has been done on Internet dynamics since 1988, and the Internet has
   grown.  It has become clear that the congestion avoidance
   mechanisms [RFC5681], while necessary and powerful, are not
   sufficient to provide good service in all circumstances.
   Basically, there is a limit to how much control can be accomplished
   from the edges of the network.  Some mechanisms are needed in
   network devices to complement the endpoint congestion avoidance
   mechanisms.  These mechanisms may be implemented in network devices
   that include routers, switches, and other network middleboxes.

1.2.  Active Queue Management to Manage Latency

   Internet latency has become a focus of attention to increase the
   responsiveness of Internet applications and protocols.  One major
   source of delay is the build-up of queues in network devices.
   Queueing occurs whenever the arrival rate of data at the ingress to
   a device exceeds the current egress rate.
   Such queueing is normal in a packet-switched network, and is often
   necessary to absorb bursts in transmission and perform statistical
   multiplexing of traffic, but excessive queueing can lead to
   unwanted delay, reducing the performance of some Internet
   applications.

   RFC 2309 introduced the concept of "Active Queue Management" (AQM),
   a class of technologies that, by signaling to common congestion-
   controlled transports such as TCP, manages the size of queues that
   build in network buffers.  RFC 2309 also describes a specific AQM
   algorithm, Random Early Detection (RED), and recommends that this
   be widely implemented and used by default in routers.

   With an appropriate set of parameters, RED is an effective
   algorithm.  However, dynamically predicting this set of parameters
   was found to be difficult.  As a result, RED has not been enabled
   by default, and its present use in the Internet is limited.  Other
   AQM algorithms have been developed since RFC2309 was published,
   some of which are self-tuning within a range of applicability.
   Hence, while this memo continues to recommend the deployment of
   AQM, it no longer recommends that RED or any other specific
   algorithm be used as a default; instead, it provides
   recommendations on how to select appropriate algorithms, and
   recommends that any algorithm chosen be able to automate required
   tuning for common deployment scenarios.

   Deploying AQM in the network can significantly reduce the latency
   across an Internet path, and since RFC2309 was written, this has
   become a key motivation for using AQM in the Internet.  In the
   context of AQM, it is useful to distinguish between two related
   classes of algorithms: "queue management" versus "scheduling"
   algorithms.
   To a rough approximation, queue management algorithms manage the
   length of packet queues by marking or dropping packets when
   necessary or appropriate, while scheduling algorithms determine
   which packet to send next and are used primarily to manage the
   allocation of bandwidth among flows.  While these two mechanisms
   are closely related, they address different performance issues and
   operate on different timescales.  Both may be used in combination.

1.3.  Document Overview

   The discussion in this memo applies to "best-effort" traffic, which
   is to say, traffic generated by applications that accept the
   occasional loss, duplication, or reordering of traffic in flight.
   It also applies to other traffic, such as real-time traffic that
   can adapt its sending rate to reduce loss and/or delay.  It is most
   effective when the adaptation occurs on time scales of a single
   Round-Trip Time (RTT) or a small number of RTTs, for elastic
   traffic [RFC1633].

   Two performance issues are highlighted:

   The first issue is the need for an advanced form of queue
   management that we call "Active Queue Management" (AQM).  Section 2
   summarizes the benefits that active queue management can bring.  A
   number of AQM procedures are described in the literature, with
   different characteristics.  This document does not recommend any of
   them in particular, but does make recommendations that ideally
   would affect the choice of procedure used in a given
   implementation.

   The second issue, discussed in Section 4 of this memo, is the
   potential for future congestion collapse of the Internet due to
   flows that are unresponsive, or not sufficiently responsive, to
   congestion indications.  Unfortunately, while scheduling can
   mitigate some of the side effects of sharing a network queue with
   an unresponsive flow, there is currently no consensus solution for
   controlling the congestion caused by such aggressive flows.
   Methods such as congestion exposure (ConEx) [RFC6789] offer a
   framework [CONEX] that can update network devices to alleviate
   these effects.  Significant research and engineering will be
   required before any solution is available.  It is imperative that
   work to mitigate the impact of unresponsive flows is energetically
   pursued, to ensure acceptable performance and the future stability
   of the Internet.

   Section 4 concludes the memo with a set of recommendations to the
   Internet community on the use of AQM and recommendations for
   defining AQM algorithms.

1.4.  Changes to the recommendations of RFC2309

   This memo replaces the recommendations in [RFC2309], which resulted
   from past discussions of end-to-end performance, Internet
   congestion, and RED in the End-to-End Research Group of the
   Internet Research Task Force (IRTF).  It follows experience with
   this and other algorithms, and the AQM discussion within the IETF
   [AQM-WG].

   While RFC2309 described AQM in terms of the length of a queue, this
   memo uses AQM to refer to any method that allows network devices to
   control either the queue length and/or the mean time that a packet
   spends in a queue.

   This memo also explicitly obsoletes the recommendation that Random
   Early Detection (RED) be used as the default AQM mechanism for the
   Internet.  This is replaced by a detailed set of recommendations
   for selecting an appropriate AQM algorithm.  As in RFC2309, this
   memo also motivates the need for continued research, but clarifies
   that need with examples appropriate at the time this memo is
   published.

1.5.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in [RFC2119].

2.  The Need For Active Queue Management

   Active Queue Management (AQM) is a method that allows network
   devices to control the queue length or the mean time that a packet
   spends in a queue.  Although AQM can be applied across a range of
   deployment environments, the recommendations in this document are
   directed to use in the general Internet.  It is expected that the
   principles and guidance are also applicable to a wide range of
   environments, but they may require tuning for specific types of
   link/network (e.g., to accommodate the traffic patterns found in
   data centres, the challenges of wireless infrastructure, or the
   higher delay encountered on satellite Internet links).  The
   remainder of this section identifies the need for AQM and the
   advantages of deploying AQM methods.

   The traditional technique for managing the queue length in a
   network device is to set a maximum length (in terms of packets) for
   each queue, accept packets for the queue until the maximum length
   is reached, then reject (drop) subsequent incoming packets until
   the queue decreases because a packet from the queue has been
   transmitted.  This technique is known as "tail drop", since the
   packet that arrived most recently (i.e., the one on the tail of the
   queue) is dropped when the queue is full.  This method has served
   the Internet well for years, but it has four important drawbacks:

   1.  Full Queues

       The tail drop discipline allows queues to maintain a full (or
       almost full) status for long periods of time, since tail drop
       signals congestion (via a packet drop) only when the queue has
       become full.  It is important to reduce the steady-state queue
       size, and this is perhaps the most important goal for queue
       management.
       The naive assumption might be that there is a simple tradeoff
       between delay and throughput, and that the recommendation that
       queues be maintained in a "non-full" state essentially
       translates to a recommendation that low end-to-end delay is
       more important than high throughput.  However, this does not
       take into account the critical role that packet bursts play in
       Internet performance.  For example, even though TCP constrains
       the congestion window of a flow, packets often arrive at
       network devices in bursts [Leland94].  If the queue is full or
       almost full, an arriving burst will cause multiple packets to
       be dropped from the same flow.  Bursts of loss can result in a
       global synchronization of flows throttling back, followed by a
       sustained period of lowered link utilization, reducing overall
       throughput [Flo94], [Zha90].

       The goal of buffering in the network is to absorb data bursts
       and to transmit them during the (hopefully) ensuing bursts of
       silence.  This is essential to permit transmission of bursts of
       data.  Normally-small queues are preferred in network devices,
       with sufficient queue capacity to absorb the bursts.  The
       counter-intuitive result is that maintaining normally-small
       queues can result in higher throughput as well as lower end-to-
       end delay.  In summary, queue limits should not reflect the
       steady-state queues we want maintained in the network; instead,
       they should reflect the size of bursts that a network device
       needs to absorb.

   2.  Lock-Out

       In some situations tail drop allows a single connection or a
       few flows to monopolize the queue space, starving other
       connections and preventing them from getting room in the queue
       [Flo92].

   3.  Mitigating the Impact of Packet Bursts

       Large bursts of packets can delay other packets, disrupting the
       control loop (e.g.
       the pacing of flows by the TCP ACK-Clock), and reducing the
       performance of flows that share a common bottleneck.

   4.  Control loop synchronization

       Congestion control, like other end-to-end mechanisms,
       introduces a control loop between hosts.  Sessions that share a
       common network bottleneck can therefore become synchronised,
       introducing periodic disruption (e.g., jitter/loss).  "Lock-
       out" is often also the result of synchronization or other
       timing effects.

   Besides tail drop, two alternative queue management disciplines
   that can be applied when a queue becomes full are "random drop on
   full" and "head drop on full".  When a new packet arrives at a full
   queue using the random drop on full discipline, the network device
   drops a randomly selected packet from the queue (which can be an
   expensive operation, since it naively requires an O(N) walk through
   the packet queue).  When a new packet arrives at a full queue using
   the head drop on full discipline, the network device drops the
   packet at the front of the queue [Lakshman96].  Both of these solve
   the lock-out problem, but neither solves the full-queues problem
   described above.

   We know in general how to solve the full-queues problem for
   "responsive" flows, i.e., those flows that throttle back in
   response to congestion notification.  In the current Internet,
   dropped packets provide a critical mechanism for indicating
   congestion to hosts.  The solution to the full-queues problem is
   for network devices to drop or ECN-mark packets before a queue
   becomes full, so that hosts can respond to congestion before
   buffers overflow.  We call such a proactive approach AQM.  By
   dropping or ECN-marking packets before buffers overflow, AQM allows
   network devices to control when and how many packets to drop.

   In summary, an active queue management mechanism can provide the
   following advantages for responsive flows.

   1.  Reduce the number of packets dropped in network devices

       Packet bursts are an unavoidable aspect of packet networks
       [Willinger95].  If all the queue space in a network device is
       already committed to "steady state" traffic, or if the buffer
       space is inadequate, then the network device will have no
       ability to buffer bursts.  By keeping the average queue size
       small, AQM will provide greater capacity to absorb naturally-
       occurring bursts without dropping packets.

       Furthermore, without AQM, more packets will be dropped when a
       queue does overflow.  This is undesirable for several reasons.
       First, with a shared queue and the tail drop discipline, this
       can result in unnecessary global synchronization of flows,
       resulting in lowered average link utilization, and hence
       lowered network throughput.  Second, unnecessary packet drops
       represent a waste of network capacity on the path before the
       drop point.

       While AQM can manage queue lengths and reduce end-to-end
       latency even in the absence of end-to-end congestion control,
       it will be able to reduce packet drops only in an environment
       that continues to be dominated by end-to-end congestion
       control.

   2.  Provide a lower-delay interactive service

       By keeping a small average queue size, AQM will reduce the
       delays experienced by flows.  This is particularly important
       for interactive applications such as short web transfers,
       POP/IMAP, DNS, terminal traffic (telnet, ssh, mosh, RDP, etc.),
       gaming, and interactive audio-video sessions, whose subjective
       (and objective) performance is better when the end-to-end delay
       is low.

   3.  Avoid lock-out behavior

       AQM can prevent lock-out behavior by ensuring that there will
       almost always be a buffer available for an incoming packet.
       For the same reason, AQM can prevent a bias against low-
       capacity, but highly bursty, flows.
       Lock-out is undesirable because it constitutes a gross
       unfairness among groups of flows.  However, we stop short of
       calling this benefit "increased fairness", because general
       fairness among flows requires per-flow state, which is not
       provided by queue management.  For example, in a network device
       using AQM with only FIFO scheduling, two TCP flows may receive
       very different shares of the network capacity simply because
       they have different round-trip times [Floyd91], and a flow that
       does not use congestion control may receive more capacity than
       a flow that does.  AQM can therefore be combined with a
       scheduling mechanism that divides network traffic between
       multiple queues (Section 2.1).

   4.  Reduce the probability of control loop synchronization

       The probability of network control loop synchronization can be
       reduced if network devices introduce randomness in the AQM
       functions that trigger congestion avoidance at the sending
       host.

2.1.  AQM and Multiple Queues

   A network device may use per-flow or per-class queuing with a
   scheduling algorithm to prioritize certain applications or classes
   of traffic, limit the rate of transmission, or provide isolation
   between different traffic flows within a common class.  For
   example, a router may maintain per-flow state to achieve general
   fairness by a per-flow scheduling algorithm such as various forms
   of Fair Queueing (FQ) [Dem90] [Sut99], including Weighted Fair
   Queueing (WFQ), Stochastic Fairness Queueing (SFQ) [McK90], and
   Deficit Round Robin (DRR) [Shr96], [Nic12], and/or a Class-Based
   Queue scheduling algorithm such as CBQ [Floyd95].  Hierarchical
   queues may also be used, e.g., as part of a Hierarchical Token
   Bucket (HTB) or Hierarchical Fair Service Curve (HFSC) [Sto97].
   These methods are also used to realize a range of Quality of
   Service (QoS) behaviours designed to meet the needs of traffic
   classes (e.g.
   using the integrated or differentiated service models).

   AQM is needed even for network devices that use per-flow or per-
   class queuing, because scheduling algorithms by themselves do not
   control the overall queue size or the size of individual queues.
   AQM mechanisms might need to control the overall queue sizes, to
   ensure that arriving bursts can be accommodated without dropping
   packets.  AQM should also be used to control the queue size for
   each individual flow or class, so that they do not experience
   unnecessarily high delay.  Using a combination of AQM and
   scheduling between multiple queues has been shown to offer good
   results in experimental and some types of operational use.

   In short, scheduling algorithms and queue management should be seen
   as complementary, not as replacements for each other.

2.2.  AQM and Explicit Congestion Marking (ECN)

   An AQM method may use Explicit Congestion Notification (ECN)
   [RFC3168] to mark packets, instead of dropping them, under mild or
   moderate congestion.  ECN-marking can allow a network device to
   signal congestion at a point before a transport experiences
   congestion loss or additional queuing delay [ECN-Benefit].
   Section 4.2.1 describes some of the benefits of using ECN with AQM.

2.3.  AQM and Buffer Size

   It is important to differentiate between the choice of buffer size
   for a queue in a switch/router or other network device, and the
   threshold(s) and other parameters that determine how and when an
   AQM algorithm operates.  The optimum buffer size is a function of
   operational requirements and should generally be sized to be
   sufficient to buffer the largest normal traffic burst that is
   expected.  This size depends on the number and burstiness of
   traffic flows arriving at the queue and the rate at which traffic
   leaves the queue.
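   The distinction above can be sketched in code.  The following toy
   queue is a hypothetical illustration, not an algorithm recommended
   by this memo: the names `capacity` and `threshold` and the linear
   drop ramp are assumptions chosen for exposition, loosely in the
   style of RED-like probabilistic early drop.  It keeps the physical
   buffer size and the AQM drop threshold as separate parameters, so
   that early dropping begins well before the buffer is exhausted
   while the remaining buffer stays available to absorb bursts.

```python
import random
from collections import deque


class EarlyDropQueue:
    """Toy queue separating buffer size from an AQM drop threshold.

    Illustrative sketch only: once the instantaneous queue depth
    exceeds `threshold`, arriving packets are dropped with a
    probability that rises linearly toward 1 as the depth approaches
    `capacity` (the physical buffer size).
    """

    def __init__(self, capacity, threshold):
        assert 0 < threshold < capacity
        self.capacity = capacity    # buffer: sized for the largest normal burst
        self.threshold = threshold  # AQM parameter: where early dropping begins
        self.queue = deque()

    def enqueue(self, packet):
        depth = len(self.queue)
        if depth >= self.capacity:
            return False            # buffer exhausted: unavoidable tail drop
        if depth > self.threshold:
            # Drop probability grows from 0 at the threshold to 1 at capacity,
            # signaling congestion to responsive flows before overflow.
            p = (depth - self.threshold) / (self.capacity - self.threshold)
            if random.random() < p:
                return False        # early (AQM) drop
        self.queue.append(packet)
        return True

    def dequeue(self):
        # FIFO service: the oldest packet leaves first.
        return self.queue.popleft() if self.queue else None
```

   A configuration with `threshold` close to `capacity` behaves almost
   like plain tail drop; moving the threshold well below the buffer
   size starts congestion signaling early while preserving headroom
   for bursts, which is the separation of concerns this section
   describes.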
   One objective of AQM is to minimize the effect of lock-out, where
   one flow prevents other flows from effectively gaining capacity.
   This need can be illustrated by a simple example of drop-tail
   queuing when a new TCP flow injects packets into a queue that
   happens to be almost full.  A TCP flow's congestion control
   algorithm [RFC5681] increases the flow rate to maximize its
   effective window.  This builds a queue in the network, inducing
   latency for the flow and for other flows that share this queue.
   Once a drop-tail queue fills, there will also be loss.  A new flow,
   sending its initial burst, has an enhanced probability of filling
   the remaining queue and dropping packets.  As a result, the new
   flow can be prevented from effectively sharing the queue for a
   period of many RTTs.  In contrast, AQM can minimize the mean queue
   depth, and therefore reduce the probability that competing sessions
   can materially prevent each other from performing well.

   AQM frees a designer from having to limit the buffer space assigned
   to a queue to achieve acceptable performance, allowing allocation
   of sufficient buffering to satisfy the needs of the particular
   traffic pattern.  Different types of traffic and deployment
   scenarios will lead to different requirements.  The choice of AQM
   algorithm and associated parameters is therefore a function of the
   way in which congestion is experienced and the required reaction to
   achieve acceptable performance.  The latter is the primary topic of
   the following sections.

3.  Managing Aggressive Flows

   One of the keys to the success of the Internet has been the
   congestion avoidance mechanisms of TCP.  Because TCP "backs off"
   during congestion, a large number of TCP connections can share a
   single, congested link in such a way that link bandwidth is shared
   reasonably equitably among similarly situated flows.
   The equitable sharing of bandwidth among flows depends on all flows
   running compatible congestion avoidance algorithms, i.e., methods
   conformant with the current TCP specification [RFC5681].

   In this document a flow is known as "TCP-friendly" when it has a
   congestion response that approximates the average response expected
   of a TCP flow.  One example of a TCP-friendly scheme is the TCP-
   Friendly Rate Control algorithm [RFC5348].  In this document, the
   term is used more generally to describe this and other algorithms
   that meet these goals.

   There are a variety of types of network flow.  Some convenient
   classes that describe flows are: (1) TCP-friendly flows, (2)
   unresponsive flows, i.e., flows that do not slow down when
   congestion occurs, and (3) flows that are responsive but are less
   responsive to congestion than TCP.  The last two classes contain
   more aggressive flows that can pose significant threats to Internet
   performance.

   1.  TCP-Friendly flows

       A TCP-friendly flow responds to congestion notification within
       a small number of path Round-Trip Times (RTTs), and in steady
       state it uses no more capacity than a conformant TCP running
       under comparable conditions (drop rate, RTT, packet size,
       etc.).  This is described in the remainder of this document.

   2.  Non-Responsive Flows

       The User Datagram Protocol (UDP) [RFC0768] provides a minimal,
       best-effort transport to applications and upper-layer protocols
       (both simply called "applications" in the remainder of this
       document) and does not itself provide mechanisms to prevent
       congestion collapse and establish a degree of fairness
       [RFC5405].

       There is a growing set of UDP-based applications whose
       congestion avoidance algorithms are inadequate or nonexistent
       (i.e., a flow that does not throttle its sending rate when it
       experiences congestion).
       Examples include some UDP streaming applications for packet
       voice and video, and some multicast bulk data transport.  If no
       action is taken, such unresponsive flows could lead to a new
       congestion collapse.  Some applications can even increase their
       traffic volume in response to congestion (e.g., by adding
       forward error correction when loss is experienced), with the
       possibility that they contribute to congestion collapse.

       In general, UDP-based applications need to incorporate
       effective congestion avoidance mechanisms [RFC5405].  Research
       continues to be needed to identify and develop ways to
       accomplish congestion avoidance for presently unresponsive
       applications.  Network devices need to be able to protect
       themselves against unresponsive flows, and mechanisms to
       accomplish this must be developed and deployed.  Deployment of
       such mechanisms would provide an incentive for all applications
       to become responsive, either by using a congestion-controlled
       transport (e.g., TCP, SCTP [RFC4960], and DCCP [RFC4340]) or by
       incorporating their own congestion control in the application
       [RFC5405], [RFC6679].

       Lastly, some applications (e.g., current web browsers) open a
       large number of short TCP flows for a single session.  This can
       lead to each individual flow spending the majority of its time
       in the exponential TCP slow-start phase, rather than in TCP
       congestion avoidance.  The resulting traffic aggregate can
       therefore be much less responsive than a single standard TCP
       flow.

   3.  Transport Flows that are less responsive than TCP

       A second threat is posed by transport protocol implementations
       that are responsive to congestion but, either deliberately or
       through faulty implementation, reduce their rate less than a
       TCP flow would have done in response to congestion.  This
       covers a spectrum of behaviours between (1) and (2).
If applications are not 571 sufficiently responsive to congestion signals, they may gain an 572 unfair share of the available network capacity. 574 For example, the popularity of the Internet has caused a 575 proliferation in the number of TCP implementations. Some of 576 these may fail to implement the TCP congestion avoidance 577 mechanisms correctly because of poor implementation. Others may 578 deliberately be implemented with congestion avoidance algorithms 579 that are more aggressive in their use of capacity than other TCP 580 implementations; this would allow a vendor to claim to have a 581 "faster TCP". The logical consequence of such implementations 582 would be a spiral of increasingly aggressive TCP implementations, 583 leading back to the point where there is effectively no 584 congestion avoidance and the Internet is chronically congested. 586 Another example could be an RTP/UDP video flow that uses an 587 adaptive codec, but responds incompletely to indications of 588 congestion or responds over an excessively long time period. 589 Such flows are unlikely to be responsive to congestion signals in 590 a timeframe comparable to a small number of end-to-end 591 transmission delays. However, over a longer timescale, perhaps 592 seconds in duration, they could moderate their speed, or increase 593 their speed if they determine capacity to be available. 595 Tunneled traffic aggregates carrying multiple (short) TCP flows 596 can be more aggressive than standard bulk TCP. Applications 597 (e.g. web browsers and peer-to-peer file-sharing) have exploited 598 this by opening multiple connections to the same endpoint. 600 The projected increase in the fraction of total Internet traffic for 601 more aggressive flows in classes 2 and 3 could pose a threat to the 602 performance of the future Internet. There is therefore an urgent 603 need for measurements of current conditions and for further research 604 into the ways of managing such flows. 
This raises many difficult 605 issues in finding methods with an acceptable overhead cost that can 606 identify and isolate unresponsive flows or flows that are less 607 responsive than TCP. Finally, there is as yet little measurement or 608 simulation evidence available about the rate at which these threats 609 are likely to be realized, or about the expected benefit of 610 algorithms for managing such flows. 612 Another topic requiring consideration is the appropriate granularity 613 of a "flow" when considering a queue management method. There are a 614 few "natural" answers: 1) a transport (e.g. TCP or UDP) flow (source 615 address/port, destination address/port, protocol); 2) Differentiated 616 Services Code Point, DSCP; 3) a source/destination host pair (IP 617 address); 4) a given source host or a given destination host, or 618 various combinations of the above; 5) a subscriber or site receiving 619 the Internet service (enterprise or residential). 621 The source/destination host pair gives an appropriate granularity in 622 many circumstances. However, different vendors/providers use 623 different granularities for defining a flow (as a way of 624 "distinguishing" themselves from one another), and different 625 granularities may be chosen for different places in the network. It 626 may be the case that the granularity is less important than the fact 627 that a network device needs to be able to deal with more unresponsive 628 flows at *some* granularity. The granularity of flows for congestion 629 management is, at least in part, a question of policy that needs to 630 be addressed in the wider IETF community. 632 4. Conclusions and Recommendations 634 The IRTF, in publishing [RFC2309], and the IETF in subsequent 635 discussion, has developed a set of specific recommendations regarding 636 the implementation and operational use of AQM procedures. The 637 recommendations provided by this document are summarised as: 639 1.
Network devices SHOULD implement some AQM mechanism to manage 640 queue lengths, reduce end-to-end latency, and avoid lock-out 641 phenomena within the Internet. 643 2. Deployed AQM algorithms SHOULD support Explicit Congestion 644 Notification (ECN) as well as loss to signal congestion to 645 endpoints. 647 3. The algorithms that the IETF recommends SHOULD NOT require 648 operational (especially manual) configuration or tuning. 650 4. AQM algorithms SHOULD respond to measured congestion, not 651 application profiles. 653 5. AQM algorithms SHOULD NOT interpret specific transport protocol 654 behaviours. 656 6. Transport protocol congestion control algorithms SHOULD maximize 657 their use of available capacity (when there is data to send) 658 without incurring undue loss or undue round trip delay. 660 7. Research, engineering, and measurement efforts are needed 661 regarding the design of mechanisms to deal with flows that are 662 unresponsive to congestion notification or are responsive, but 663 are more aggressive than present TCP. 665 These recommendations are expressed using the word "SHOULD". This is 666 in recognition that there may be use cases that have not been 667 envisaged in this document in which the recommendation does not 668 apply. Therefore, care should be taken in concluding that one's use 669 case falls in that category; during the life of the Internet, such 670 use cases have been rarely if ever observed and reported. To the 671 contrary, available research [Choi04] says that even high speed links 672 in network cores that are normally very stable in depth and behavior 673 experience occasional issues that need moderation. The 674 recommendations are detailed in the following sections. 676 4.1. Operational deployments SHOULD use AQM procedures 678 AQM procedures are designed to minimize the delay and buffer 679 exhaustion induced in the network by queues that have filled as a 680 result of host behavior. 
Marking and loss behaviors provide a signal 681 that buffers within network devices are becoming unnecessarily full, 682 and that the sender would do well to moderate its behavior. 684 The use of scheduling mechanisms, such as priority queuing, classful 685 queuing, and fair queuing, is often effective in networks to help a 686 network serve the needs of a range of applications. Network 687 operators can use these methods to manage traffic passing a choke 688 point. This is discussed in [RFC2474] and [RFC2475]. When 689 scheduling is used, AQM should be applied across the classes or flows 690 as well as within each class or flow: 692 o AQM mechanisms need to control the overall queue sizes, to ensure 693 that arriving bursts can be accommodated without dropping packets. 695 o AQM mechanisms need to allow combination with other mechanisms, 696 such as scheduling, to allow implementation of policies for 697 providing fairness between different flows. 699 o AQM should be used to control the queue size for each individual 700 flow or class, so that they do not experience unnecessarily high 701 delay. 703 4.2. Signaling to the transport endpoints 705 There are a number of ways a network device may signal to the end 706 point that the network is becoming congested and trigger a reduction 707 in rate. The signalling methods include: 709 o Delaying transport segments (packets) in flight, such as in a 710 queue. 712 o Dropping transport segments (packets) in transit. 714 o Marking transport segments (packets), such as using Explicit 715 Congestion Notification (ECN) [RFC3168] [RFC4301] [RFC4774] [RFC6040] 716 [RFC6679]. 718 Increased network latency is used as an implicit signal of 719 congestion. For example, in TCP additional delay can affect ACK clocking 720 and has the result of reducing the rate of transmission of new data. 721 In the Real-time Transport Protocol (RTP), network latency impacts the RTCP- 722 reported RTT and increased latency can trigger a sender to adjust its 723 rate.
Methods such as Low Extra Delay Background Transport (LEDBAT) 724 [RFC6817] assume increased latency as a primary signal of congestion. 725 Appropriate use of delay-based methods and the implications of AQM 726 presently remains an area for further research. 728 It is essential that all Internet hosts respond to loss [RFC5681], 729 [RFC5405][RFC4960][RFC4340]. Packet dropping by network devices that 730 are under load has two effects: It protects the network, which is the 731 primary reason that network devices drop packets. The detection of 732 loss also provides a signal to a reliable transport (e.g. TCP, SCTP) 733 that there is potential congestion using a pragmatic heuristic; "when 734 the network discards a message in flight, it may imply the presence 735 of faulty equipment or media in a path, and it may imply the presence 736 of congestion. To be conservative, a transport must assume it may be 737 the latter." Unreliable transports (e.g. using UDP) need to 738 similarly react to loss [RFC5405]. 740 Network devices SHOULD use an AQM algorithm to measure local 741 congestion and to determine the packets to mark or drop so that the 742 congestion is managed. 744 In general, dropping multiple packets from the same session in the 745 same RTT is ineffective, and can reduce throughput. Also, dropping 746 or marking packets from multiple sessions simultaneously can have the 747 effect of synchronizing them, resulting in increasing peaks and 748 troughs in the subsequent traffic load. Hence, AQM algorithms SHOULD 749 randomize dropping in time, to reduce the probability that congestion 750 indications are only experienced by a small proportion of the active 751 flows. 753 Loss due to dropping also has an effect on the efficiency of a flow 754 and can significantly impact some classes of application. In 755 reliable transports the dropped data must be subsequently 756 retransmitted.
While other applications/transports may adapt to the 757 absence of lost data, this still implies inefficient use of available 758 capacity and the dropped traffic can affect other flows. Hence, 759 congestion signalling by loss is not entirely positive; it is a 760 necessary evil. 762 4.2.1. AQM and ECN 764 Explicit Congestion Notification (ECN) [RFC4301] [RFC4774] [RFC6040] 765 [RFC6679] is a network-layer function that allows a transport to 766 receive network congestion information from a network device without 767 incurring the unintended consequences of loss. ECN includes both 768 transport mechanisms and functions implemented in network devices, 769 the latter relying upon AQM to decide when and whether to ECN- 770 mark. 772 Congestion for ECN-capable transports is signalled by a network 773 device setting the "Congestion Experienced (CE)" codepoint in the IP 774 header. This codepoint is noted by the remote receiving end point 775 and signalled back to the sender using a transport protocol 776 mechanism, allowing the sender to trigger timely congestion control. 777 The decision to set the CE codepoint requires an AQM algorithm 778 configured with a threshold. Non-ECN capable flows (the default) are 779 dropped under congestion. 781 Network devices SHOULD use an AQM algorithm that marks ECN-capable 782 traffic when making decisions about the response to congestion. 783 Network devices need to implement this method by marking ECN-capable 784 traffic or by dropping non-ECN-capable traffic. 786 Safe deployment of ECN requires that network devices drop excessive 787 traffic, even when marked as originating from an ECN-capable 788 transport. This is a necessary safety precaution because: 790 1. A non-conformant, broken or malicious receiver could conceal an 791 ECN mark, and not report this to the sender; 793 2. A non-conformant, broken or malicious sender could ignore a 794 reported ECN mark, as it could ignore a loss without using ECN; 796 3.
A malfunctioning or non-conforming network device may "hide" an 797 ECN mark (or fail to correctly set the ECN codepoint at an egress 798 of a network tunnel). 800 In normal operation, such cases should be very uncommon; however, 801 overload protection is desirable to protect traffic from 802 misconfigured or malicious use of ECN (e.g. a denial-of-service 803 attack that generates ECN-capable traffic that is unresponsive to CE- 804 marking). 806 An AQM algorithm that supports ECN needs to define the threshold and 807 algorithm for ECN-marking. This threshold MAY differ from that used 808 for dropping packets that are not marked as ECN-capable, and SHOULD 809 be configurable. 811 Network devices SHOULD use an algorithm to drop excessive traffic 812 (e.g. at some level above the threshold for CE-marking), even when 813 the packets are marked as originating from an ECN-capable transport. 815 4.3. AQM algorithms deployed SHOULD NOT require operational tuning 817 A number of AQM algorithms have been proposed. Many require some 818 form of tuning or setting of parameters for initial network 819 conditions. This can make these algorithms difficult to use in 820 operational networks. 822 AQM algorithms need to consider both "initial conditions" and 823 "operational conditions". The former includes values that exist 824 before any experience is gathered about the use of the algorithm, 825 such as the configured speed of the interface, support for full duplex 826 communication, interface MTU and other properties of the link. The 827 latter includes information observed from monitoring the size of the 828 queue, experienced queueing delay, rate of packet discard, etc. 830 This document therefore specifies that AQM algorithms that are 831 proposed for deployment in the Internet have the following 832 properties: 834 o SHOULD NOT require tuning of initial or configuration parameters.
835 An algorithm needs to provide a default behaviour that auto-tunes 836 to a reasonable performance for typical network operational 837 conditions. This is expected to ease deployment and operation. 838 Initial conditions, such as the interface rate and MTU size or 839 other values derived from these, MAY be required by an AQM 840 algorithm. 842 o MAY support further manual tuning that could improve performance 843 in a specific deployed network. Algorithms that lack such 844 variables are acceptable, but if such variables exist, they SHOULD 845 be externalized (made visible to the operator). Guidance needs to 846 be provided on the cases where auto-tuning is unlikely to achieve 847 acceptable performance and to identify the set of parameters that 848 can be tuned. For example, the expected response of an algorithm 849 may need to be configured to accommodate the largest expected Path 850 RTT, since this value cannot be known at initialization. This 851 guidance is expected to enable the algorithm to be deployed in 852 networks that have specific characteristics (paths with variable/ 853 larger delay; networks where capacity is impacted by interactions 854 with lower layer mechanisms, etc.). 856 o MAY provide logging and alarm signals to assist in identifying if 857 an algorithm using manual or auto-tuning is functioning as 858 expected (e.g., this could be based on an internal consistency 859 check between input, output, and mark/drop rates over time). This 860 is expected to encourage deployment by default and allow operators 861 to identify potential interactions with other network functions. 863 Hence, self-tuning algorithms are to be preferred. Algorithms 864 recommended for general Internet deployment by the IETF need to be 865 designed so that they do not require operational (especially manual) 866 configuration or tuning. 868 4.4. AQM algorithms SHOULD respond to measured congestion, not 869 application profiles.
871 Not all applications transmit packets of the same size. Although 872 applications may be characterized by particular profiles of packet 873 size, this should not be used as the basis for AQM (see next section). 874 Other methods exist, e.g. Differentiated Services queueing, Pre- 875 Congestion Notification (PCN) [RFC5559], that can be used to 876 differentiate and police classes of application. Network devices may 877 combine AQM with these traffic classification mechanisms and perform 878 AQM only on specific queues within a network device. 880 An AQM algorithm should not deliberately try to prejudice the size of 881 packet that performs best (i.e. preferentially drop/mark based only 882 on packet size). Procedures for selecting packets to mark/drop 883 SHOULD observe the actual or projected time that a packet is in a 884 queue (bytes at a rate being an analog to time). When an AQM 885 algorithm decides whether to drop (or mark) a packet, it is 886 RECOMMENDED that the size of the particular packet should not be 887 taken into account [RFC7141]. 889 Applications (or transports) generally know the packet size that they 890 are using and can hence make their judgments about whether to use 891 small or large packets based on the data they wish to send and the 892 expected impact on the delay or throughput, or other performance 893 parameter. When a transport or application responds to a dropped or 894 marked packet, the size of the rate reduction should be proportionate 895 to the size of the packet that was sent [RFC7141]. 897 An AQM-enabled system MAY instantiate different instances of an AQM 898 algorithm to be applied within the same traffic class. Traffic 899 classes may be differentiated based on an Access Control List (ACL), 900 the packet Differentiated Services Code Point (DSCP) [RFC2474], 901 enabling use of the ECN field (i.e.
any of ECT(0), ECT(1) or 902 CE) [RFC3168] [RFC4774], a multi-field (MF) classifier that combines 903 the values of a set of protocol fields (e.g. IP address, transport, 904 ports) or an equivalent codepoint at a lower layer. This 905 recommendation goes beyond what is defined in RFC 3168, by allowing 906 that an implementation MAY use more than one instance of an AQM 907 algorithm to handle both ECN-capable and non-ECN-capable packets. 909 4.5. AQM algorithms SHOULD NOT be dependent on specific transport 910 protocol behaviours 912 In deploying AQM, network devices need to support a range of Internet 913 traffic and SHOULD NOT make implicit assumptions about the 914 characteristics desired by the set of transports/applications the 915 network supports. That is, AQM methods should be opaque to the 916 choice of transport and application. 918 AQM algorithms are often evaluated by considering TCP [RFC0793] with 919 a limited number of applications. Although TCP is the predominant 920 transport in the Internet today, this no longer represents a 921 sufficient selection of traffic for verification. There is 922 significant use of UDP [RFC0768] in voice and video services, and 923 some applications find utility in SCTP [RFC4960] and DCCP [RFC4340]. 924 Hence, AQM algorithms should also demonstrate operation with 925 transports other than TCP and need to consider a variety of 926 applications. Selection of AQM algorithms also needs to consider use 927 of tunnel encapsulations that may carry traffic aggregates. 929 AQM algorithms SHOULD NOT target or derive implicit assumptions about 930 the characteristics desired by specific transports/applications. 931 Transports and applications need to respond to the congestion signals 932 provided by AQM (i.e. dropping or ECN-marking) in a timely manner 933 (within a few RTT at the latest). 935 4.6.
Interactions with congestion control algorithms 937 Applications and transports need to react to received implicit or 938 explicit signals that indicate the presence of congestion. This 939 section identifies issues that can impact the design of transport 940 protocols when using paths that use AQM. 942 Transport protocols and applications need timely signals of 943 congestion. The time taken to detect and respond to congestion is 944 increased when network devices queue packets in buffers. It can be 945 difficult to detect tail losses at a higher layer and this may 946 sometimes require transport timers or probe packets to detect and 947 respond to such loss. Loss patterns may also impact timely 948 detection, e.g. the time may be reduced when network devices do not 949 drop long runs of packets from the same flow. 951 A common objective of an elastic transport congestion control 952 protocol is to allow an application to deliver the maximum rate of 953 data without inducing excessive delays when packets are queued in 954 buffers within the network. To achieve this, a transport should try 955 to operate at a rate below the inflexion point of the load/delay curve 956 (the bend of what is sometimes called a "hockey-stick" curve) 958 [Jain94]. When the congestion window allows the load to approach 959 this bend, the end-to-end delay starts to rise - a result of 960 congestion, as packets probabilistically arrive at non-overlapping 961 times. On the one hand, a transport that operates above this point 962 can experience congestion loss and could also trigger operator 963 activities, such as those discussed in [RFC6057]. On the other hand, 964 a flow may achieve both near-maximum throughput and low latency when 965 it operates close to this knee point, with minimal contribution to 966 router congestion.
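The "hockey-stick" load/delay relationship described above can be sketched with a simple M/M/1 queueing model. This model is an illustrative assumption for this sketch only (real Internet queues are not M/M/1), but it shows why delay rises sharply once the offered load approaches the bend of the curve:

```python
# Sketch: queueing delay versus offered load for a single M/M/1 queue.
# Illustrates the "hockey-stick" curve only; the exact shape of real
# network queues differs, and this is not part of any AQM algorithm.

def mm1_delay(load, service_time=1.0):
    """Mean time in system for an M/M/1 queue at the given utilisation.

    load: offered load as a fraction of link capacity (0 <= load < 1).
    service_time: mean time to transmit one packet (arbitrary units).
    """
    if not 0 <= load < 1:
        raise ValueError("utilisation must be in [0, 1)")
    # Mean time in system for M/M/1: T = service_time / (1 - load).
    return service_time / (1.0 - load)

if __name__ == "__main__":
    for load in (0.1, 0.5, 0.8, 0.9, 0.95, 0.99):
        print(f"load {load:4.2f} -> mean delay {mm1_delay(load):6.1f}")
```

Delay grows slowly at moderate load but steeply near saturation, which is why a transport that keeps its congestion window below the bend can hold both high throughput and low delay.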
Choice of an appropriate rate/congestion window 967 can therefore significantly impact the loss and delay experienced by 968 a flow and will impact other flows that share a common network queue. 970 Some applications may send less than permitted by the congestion 971 control window (or rate). Examples include multimedia codecs that 972 stream at some natural rate (or set of rates) or an application that 973 is naturally interactive (e.g., some web applications, gaming, 974 transaction-based protocols). Such applications may have different 975 objectives. They may not wish to maximize throughput, but may desire 976 a lower loss rate or bounded delay. 978 The correct operation of an AQM-enabled network device MUST NOT rely 979 upon specific transport responses to congestion signals. 981 4.7. The need for further research 983 The second recommendation of [RFC2309] called for further research 984 into the interaction between network queues and host applications, 985 and the means of signaling between them. This research has occurred, 986 and we as a community have learned a lot. However, we are not done. 988 We have learned that the problems of congestion, latency and buffer- 989 sizing have not gone away, and are becoming more important to many 990 users. A number of self-tuning AQM algorithms have been found that 991 offer significant advantages for deployed networks. There is also 992 renewed interest in deploying AQM and the potential of ECN. 994 Traffic patterns can depend on the network deployment scenario, and 995 Internet research therefore needs to consider the implications of a 996 diverse range of application interactions. At the time of writing 997 (in 2015), an obvious example of further research is the need to 998 consider the many-to-one communication patterns found in data 999 centers, known as incast [Ren12] (e.g., produced by Map/Reduce 1000 applications).
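The incast pattern noted above can be illustrated with simple arithmetic: when many synchronized senders each release a burst towards one receiver through a shared switch buffer, the aggregate can exceed the buffer even though each flow is individually small. The sender counts, burst size, and buffer size below are hypothetical values chosen only for illustration:

```python
# Sketch: many-to-one (incast) burst arithmetic at a shared buffer.
# All values are hypothetical illustration, not measurements.

def incast_overflow(n_senders, burst_bytes, buffer_bytes):
    """Bytes that overflow when n_senders synchronized senders each
    inject burst_bytes into a shared buffer of buffer_bytes capacity."""
    aggregate = n_senders * burst_bytes
    return max(0, aggregate - buffer_bytes)

if __name__ == "__main__":
    # E.g., 64 KiB in flight per sender into a 1 MiB shared buffer:
    for n in (4, 16, 64):
        lost = incast_overflow(n, burst_bytes=64 * 1024,
                               buffer_bytes=1024 * 1024)
        print(f"{n:3d} senders -> {lost} bytes overflow")
```

With these hypothetical numbers, a handful of senders fits easily, but 64 synchronized senders overflow the buffer by several megabytes, producing correlated loss across many flows at once.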
1002 Research also needs to consider the need to extend our taxonomy of 1003 transport sessions to include not only "mice" and "elephants", but 1004 also "lemmings", where "lemmings" are flash crowds of "mice" that the 1005 network inadvertently tries to signal to as if they were elephant 1006 flows, resulting in head-of-line blocking in a data center deployment 1007 scenario. 1009 Examples of other required research include: 1011 o Research into new AQM and scheduling algorithms. 1013 o Appropriate use of delay-based methods and the implications of 1014 AQM. 1016 o Research into suitable algorithms for marking ECN-capable packets 1017 that do not require operational configuration or tuning for common 1018 use. 1020 o Experience in the deployment of ECN alongside AQM. 1022 o Tools for enabling AQM (and ECN) deployment and measuring the 1023 performance. 1025 o Methods for mitigating the impact of non-conformant and malicious 1026 flows. 1028 o Research to understand the implications of using new network and 1029 transport methods on applications. 1031 This document therefore reiterates the call of RFC 2309: we 1032 need continuing research as applications develop. 1034 5. IANA Considerations 1036 This memo asks the IANA for no new parameters. 1038 6. Security Considerations 1040 While security is a very important issue, it is largely orthogonal to 1041 the performance issues discussed in this memo. 1043 Many deployed network devices use queueing methods that allow 1044 unresponsive traffic to capture network capacity, denying access to 1045 other traffic flows. This could potentially be used as a denial-of- 1046 service attack. This threat could be reduced if network devices 1047 deploy AQM or some form of scheduling. We note, however, that a 1048 denial-of-service attack that results in unresponsive traffic flows 1049 may be indistinguishable from other traffic flows (e.g. tunnels 1050 carrying aggregates of short flows, high-rate isochronous 1051 applications).
New methods therefore may remain vulnerable, and this 1052 document recommends that ongoing research should consider ways to 1053 mitigate such attacks. 1055 7. Privacy Considerations 1057 This document, by itself, presents no new privacy issues. 1059 8. Acknowledgements 1061 The original version of this document describing best current 1062 practice was based on the informational text of [RFC2309]. This was 1063 written by the End-to-End Research Group, which is to say Bob Braden, 1064 Dave Clark, Jon Crowcroft, Bruce Davie, Steve Deering, Deborah 1065 Estrin, Sally Floyd, Van Jacobson, Greg Minshall, Craig Partridge, 1066 Larry Peterson, KK Ramakrishnan, Scott Shenker, John Wroclawski, and 1067 Lixia Zhang. Although there are important differences, many of the 1068 key arguments in the present document remain unchanged from those in 1069 RFC 2309. 1071 The need for an updated document was agreed to in the tsvarea meeting 1072 at IETF 86. This document was reviewed on the aqm@ietf.org list. 1073 Comments were received from Colin Perkins, Richard Scheffenegger, 1074 Dave Taht, John Leslie, David Collier-Brown and many others. 1076 Gorry Fairhurst was in part supported by the European Community under 1077 its Seventh Framework Programme through the Reducing Internet 1078 Transport Latency (RITE) project (ICT-317700). 1080 9. References 1082 9.1. Normative References 1084 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1085 Requirement Levels", BCP 14, RFC 2119, March 1997. 1087 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 1088 of Explicit Congestion Notification (ECN) to IP", RFC 1089 3168, September 2001. 1091 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 1092 Internet Protocol", RFC 4301, December 2005. 1094 [RFC4774] Floyd, S., "Specifying Alternate Semantics for the 1095 Explicit Congestion Notification (ECN) Field", BCP 124, 1096 RFC 4774, November 2006. 1098 [RFC5405] Eggert, L. and G. 
Fairhurst, "Unicast UDP Usage Guidelines 1099 for Application Designers", BCP 145, RFC 5405, November 1100 2008. 1102 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 1103 Control", RFC 5681, September 2009. 1105 [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion 1106 Notification", RFC 6040, November 2010. 1108 [RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P., 1109 and K. Carlberg, "Explicit Congestion Notification (ECN) 1110 for RTP over UDP", RFC 6679, August 2012. 1112 [RFC7141] Briscoe, B. and J. Manner, "Byte and Packet Congestion 1113 Notification", BCP 41, RFC 7141, February 2014. 1115 9.2. Informative References 1117 [AQM-WG] "IETF AQM WG", . 1119 [CONEX] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) 1120 Concepts, Abstract Mechanism and Requirements", IETF 1121 (Work-in-Progress) draft-ietf-conex-abstract-mech, March 1122 2014. 1124 [Choi04] Choi, Baek-Young., Moon, Sue., Zhang, Zhi-Li., 1125 Papagiannaki, K., and C. Diot, "Analysis of Point-To-Point 1126 Packet Delay In an Operational Network", March 2004. 1128 [Dem90] Demers, A., Keshav, S., and S. Shenker, "Analysis and 1129 Simulation of a Fair Queueing Algorithm, Internetworking: 1130 Research and Experience", SIGCOMM Symposium proceedings on 1131 Communications architectures and protocols , 1990. 1133 [ECN-Benefit] 1134 Welzl, M. and G. Fairhurst, "The Benefits to Applications 1135 of using Explicit Congestion Notification (ECN)", IETF 1136 (Work-in-Progress) , February 2014. 1138 [Flo92] Floyd, S. and V. Jacobson, "On Traffic Phase Effects in 1139 Packet-Switched Gateways", 1992. 1141 [Flo94] Floyd, S. and V. Jacobson, "The Synchronization of 1142 Periodic Routing Messages, 1143 http://ee.lbl.gov/papers/sync_94.pdf", 1994. 1145 [Floyd91] Floyd, S., "Connections with Multiple Congested Gateways 1146 in Packet-Switched Networks Part 1: One-way Traffic.", 1147 Computer Communications Review , October 1991. 1149 [Floyd95] Floyd, S. and V.
Jacobson, "Link-sharing and Resource 1150 Management Models for Packet Networks", IEEE/ACM 1151 Transactions on Networking , August 1995. 1153 [Jacobson88] 1154 Jacobson, V., "Congestion Avoidance and Control", SIGCOMM 1155 Symposium proceedings on Communications architectures and 1156 protocols , August 1988. 1158 [Jain94] Jain, Raj., Ramakrishnan, KK., and Chiu, Dah-Ming, 1159 "Congestion avoidance scheme for computer networks", US 1160 Patent Office 5377327, December 1994. 1162 [Lakshman96] 1163 Lakshman, TV., Neidhardt, A., and T. Ott, "The Drop From 1164 Front Strategy in TCP Over ATM and Its Interworking with 1165 Other Control Features", IEEE Infocomm , 1996. 1167 [Leland94] 1168 Leland, W., Taqqu, M., Willinger, W., and D. Wilson, "On 1169 the Self-Similar Nature of Ethernet Traffic (Extended 1170 Version)", IEEE/ACM Transactions on Networking , February 1171 1994. 1173 [McK90] McKenney, PE. and G. Varghese, "Stochastic Fairness 1174 Queuing", 1175 http://www2.rdrop.com/~paulmck/scalability/paper/ 1176 sfq.2002.06.04.pdf , 1990. 1178 [Nic12] Nichols, K., "Controlling Queue Delay", Communications of 1179 the ACM Vol. 55 No. 11, July, 2012, pp.42-50. , July 2012. 1181 [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, 1182 August 1980. 1184 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, September 1185 1981. 1187 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, RFC 1188 793, September 1981. 1190 [RFC0896] Nagle, J., "Congestion control in IP/TCP internetworks", 1191 RFC 896, January 1984. 1193 [RFC0970] Nagle, J., "On packet switches with infinite storage", RFC 1194 970, December 1985. 1196 [RFC1122] Braden, R., "Requirements for Internet Hosts - 1197 Communication Layers", STD 3, RFC 1122, October 1989. 1199 [RFC1633] Braden, B., Clark, D., and S. Shenker, "Integrated 1200 Services in the Internet Architecture: an Overview", RFC 1201 1633, June 1994.
1203 [RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, 1204 S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., 1205 Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, 1206 S., Wroclawski, J., and L. Zhang, "Recommendations on 1207 Queue Management and Congestion Avoidance in the 1208 Internet", RFC 2309, April 1998. 1210 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1211 (IPv6) Specification", RFC 2460, December 1998. 1213 [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, 1214 "Definition of the Differentiated Services Field (DS 1215 Field) in the IPv4 and IPv6 Headers", RFC 2474, December 1216 1998. 1218 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 1219 and W. Weiss, "An Architecture for Differentiated 1220 Services", RFC 2475, December 1998. 1222 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 1223 Congestion Control Protocol (DCCP)", RFC 4340, March 2006. 1225 [RFC4960] Stewart, R., "Stream Control Transmission Protocol", RFC 1226 4960, September 2007. 1228 [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP 1229 Friendly Rate Control (TFRC): Protocol Specification", RFC 1230 5348, September 2008. 1232 [RFC5559] Eardley, P., "Pre-Congestion Notification (PCN) 1233 Architecture", RFC 5559, June 2009. 1235 [RFC6057] Bastian, C., Klieber, T., Livingood, J., Mills, J., and R. 1236 Woundy, "Comcast's Protocol-Agnostic Congestion Management 1237 System", RFC 6057, December 2010. 1239 [RFC6789] Briscoe, B., Woundy, R., and A. Cooper, "Congestion 1240 Exposure (ConEx) Concepts and Use Cases", RFC 6789, 1241 December 2012. 1243 [RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind, 1244 "Low Extra Delay Background Transport (LEDBAT)", RFC 6817, 1245 December 2012. 1247 [Ren12] Ren, Y., Zhao, Y., and P. 
Liu, "A survey on TCP Incast in 1248 data center networks", International Journal of 1249 Communication Systems, Volume 27, Issue 8, pages 1250 1160-117, 2012. 1252 [Shr96] Shreedhar, M. and G. Varghese, "Efficient Fair Queueing 1253 Using Deficit Round Robin", IEEE/ACM Transactions on 1254 Networking, Vol. 4, No. 3, July 1996. 1256 [Sto97] Stoica, I. and H. Zhang, "A Hierarchical Fair Service 1257 Curve Algorithm for Link-Sharing, Real-Time and Priority 1258 Services", ACM SIGCOMM, 1997. 1260 [Sut99] Suter, B., "Buffer Management Schemes for Supporting TCP 1261 in Gigabit Routers with Per-flow Queueing", IEEE Journal 1262 on Selected Areas in Communications, Vol. 17, Issue 6, 1263 pp. 1159-1169, June 1999. 1265 [Willinger95] 1266 Willinger, W., Taqqu, M., Sherman, R., Wilson, D., and V. 1267 Jacobson, "Self-Similarity Through High-Variability: 1268 Statistical Analysis of Ethernet LAN Traffic at the Source 1269 Level", SIGCOMM Symposium proceedings on Communications 1270 architectures and protocols, August 1995. 1272 [Zha90] Zhang, L. and D. Clark, "Oscillating Behavior of Network 1273 Traffic: A Case Study Simulation", 1274 http://groups.csail.mit.edu/ana/Publications/Zhang-DDC- 1275 Oscillating-Behavior-of-Network-Traffic-1990.pdf, 1990. 1277 Appendix A. Change Log 1279 RFC-Editor please remove this appendix before publication. 1281 Initial Version: March 2013 1283 Minor update of the algorithms that the IETF recommends SHOULD NOT 1284 require operational (especially manual) configuration or tuning. 1285 April 2013 1287 Major surgery. This draft is for discussion at IETF-87 and is expected 1288 to be further updated. 1289 July 2013 1291 -00 WG Draft - Updated transport recommendations; revised deployment 1292 configuration section; numerous minor edits. 1293 Oct 2013 1295 -01 WG Draft - Updated transport recommendations; revised deployment 1296 configuration section; numerous minor edits. 1297 Jan 2014 - Feedback from WG.
1299 -02 WG Draft - Minor edits Feb 2014 - Mainly language fixes. 1301 -03 WG Draft - Minor edits Feb 2014 - Comments from David Collier- 1302 Brown and David Taht. 1304 -04 WG Draft - Minor edits May 2014 - Comments during WGLC: Provided 1305 some introductory subsections to help readers (with subsections and 1306 better text). - Wrote more on the role of scheduling. - Clarified 1307 that the ECN mark threshold needs to be configurable. - Reworked the 1308 "knee" paragraph. Various updates in response to feedback. 1310 -05 WG Draft - Minor edits June 2014 - New text added to address 1311 further comments and improve the introduction - adding context, a 1312 reference to ConEx, linking between sections; added text on 1313 synchronization. 1315 -06 WG Draft - Minor edits July 2014 - Reorganised the introduction 1316 following WG feedback to better explain how this relates to the 1317 original goals of RFC2309. Added an item on packet bursts. Various 1318 minor corrections incorporated - no change to the main 1319 recommendations. 1321 -07 WG Draft - Minor edits July 2014 - Replaced ID REF by RFC 7141. 1322 Changes made to the introduction following inputs from Wes Eddy and 1323 John Leslie. Corrections and additions proposed by Bob Briscoe. 1325 -08 WG Draft - Minor edits August 2014 - Review comments from John 1326 Leslie and Bob Briscoe. Text corrections including: updated 1327 Acknowledgments (RFC2309 ref); s/congestive/congestion/g; toned down 1328 the bolder language from RFC2309 to reflect a more considered view of 1329 the perceived threat to Internet performance; modified the category 1330 that is not-TCP-like to be "less responsive to congestion than 1331 TCP" and more clearly noted that this represents a range of 1332 behaviours. 1334 -09 WG Draft - Minor edits Jan 2015 - Edits following LC comments.
1336 Authors' Addresses 1338 Fred Baker (editor) 1339 Cisco Systems 1340 Santa Barbara, California 93117 1341 USA 1343 Email: fred@cisco.com 1345 Godred Fairhurst (editor) 1346 University of Aberdeen 1347 School of Engineering 1348 Fraser Noble Building 1349 Aberdeen, Scotland AB24 3UE 1350 UK 1352 Email: gorry@erg.abdn.ac.uk 1353 URI: http://www.erg.abdn.ac.uk