Network Working Group                                         N. Khademi
Internet-Draft                                                  M. Welzl
Updates: 3168 (if approved)                           University of Oslo
Intended status: Experimental                                G. Armitage
Expires: October 5, 2016              Swinburne University of Technology
                                                            G. Fairhurst
                                                  University of Aberdeen
                                                          April 03, 2016


                 TCP Alternative Backoff with ECN (ABE)
                draft-khademi-alternativebackoff-ecn-03

Abstract

   This memo provides an experimental update to RFC3168.  It updates the
   TCP sender-side reaction to a congestion notification received via
   Explicit Congestion Notification (ECN).  The updated method reduces
   cwnd by a smaller amount than TCP does in reaction to loss.  The
   intention is to achieve good throughput when the queue at the
   bottleneck is smaller than the bandwidth-delay-product of the
   connection.  This is more likely when an Active Queue Management
   (AQM) mechanism has used ECN to CE-mark a packet, than when a packet
   was lost.  Future versions of this document will discuss SCTP as well
   as other transports using ECN.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.
   The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 5, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Discussion
     2.1.  Why use ECN to vary the degree of backoff?
     2.2.  Choice of ABE multiplier
   3.  NEW: Updating the Sender-side ECN Reaction
     3.1.  RFC 2119
     3.2.  Update to RFC 3168
     3.3.  Status of the Update
   4.  Acknowledgements
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Authors' Addresses

1.  Introduction

   Explicit Congestion Notification (ECN) is specified in [RFC3168].  It
   allows a network device that uses Active Queue Management (AQM) to
   set the Congestion Experienced (CE) codepoint in the ECN field of the
   IP packet header, rather than drop ECN-capable packets when incipient
   congestion is detected.  When an ECN-capable transport is used over a
   path that supports ECN, it provides the opportunity for flows to
   improve their performance in the presence of incipient congestion
   [I-D.AQM-ECN-benefits].

   [RFC3168] not only specifies the router use of the ECN field, it also
   specifies a TCP procedure for using ECN.  This states that a TCP
   sender should treat the ECN indication of congestion in the same way
   as that of a non-ECN-Capable TCP flow experiencing loss, by halving
   the congestion window "cwnd" and by reducing the slow start threshold
   "ssthresh".  [RFC5681] stipulates that TCP congestion control sets
   "ssthresh" to max(FlightSize / 2, 2*SMSS) in response to packet loss.
   Consequently, a standard TCP flow using this reaction needs
   significant network queue space: it can only fully utilise a
   bottleneck when the length of the link queue (or the AQM dropping
   threshold) is at least the bandwidth-delay product (BDP) of the flow.

   A backoff multiplier of 0.5 (halving cwnd and ssthresh after packet
   loss) is not the only available strategy.  As defined in [ID.CUBIC],
   CUBIC multiplies the current cwnd by 0.8 in response to loss
   (although the Linux implementation of CUBIC has used a multiplier of
   0.7 since kernel version 2.6.25 released in 2008).  Consequently,
   CUBIC utilises paths well even when the bottleneck queue is shorter
   than the bandwidth-delay product of the flow.
   However, in the case of a DropTail (FIFO) queue without AQM, such
   less-aggressive backoff increases the risk of creating a standing
   queue [CODEL2012].

   Devices implementing AQM are likely to be the dominant (and possibly
   only) source of ECN CE-marking for packets from ECN-capable senders.
   AQM mechanisms typically strive to maintain a small queue length,
   regardless of the bandwidth-delay product of flows passing through
   them.  Receipt of an ECN CE-mark might therefore reasonably be taken
   to indicate that a small bottleneck queue exists in the path, and
   hence the TCP flow would benefit from using a less aggressive backoff
   multiplier.

   Results reported in [ABE2015] show significant benefits (improved
   throughput) when reacting to ECN-Echo by multiplying cwnd and
   ssthresh with a value in the range [0.7..0.85].  Section 2 describes
   the rationale for this change.  Section 3 specifies a change to the
   TCP sender backoff behaviour in response to an indication that
   CE-marks have been received by the receiver.

2.  Discussion

   Much of the background to this proposal can be found in [ABE2015].
   Using a mix of experiments, theory and simulations with standard
   NewReno and CUBIC, [ABE2015] recommends enabling ECN and "...letting
   individual TCP senders use a larger multiplicative decrease factor in
   reaction to ECN CE-marks from AQM-enabled bottlenecks."  Such a
   change is noted to result in "...significant performance gains in
   lightly-multiplexed scenarios, without losing the delay-reduction
   benefits of deploying CoDel or PIE."

2.1.  Why use ECN to vary the degree of backoff?

   The classic rule-of-thumb dictates a BDP of bottleneck buffering if a
   TCP connection wishes to optimise path utilisation.  A single TCP
   connection running through such a bottleneck will have opened cwnd up
   to 2*BDP by the time packet loss occurs.
   [RFC5681]'s halving of cwnd and ssthresh pushes the TCP connection
   back to allowing only a BDP of packets in flight -- just enough to
   maintain 100% utilisation of the network path.

   AQM schemes like CoDel [I-D.CoDel] and PIE [I-D.PIE] use congestion
   notifications to constrain the queuing delays experienced by packets,
   rather than in response to impending or actual bottleneck buffer
   exhaustion.  With current default delay targets, CoDel and PIE both
   effectively emulate a shallow buffered bottleneck (section II,
   [ABE2015]) while allowing short traffic bursts into the queue.  This
   interacts acceptably for TCP connections over low BDP paths, or
   highly multiplexed scenarios (many concurrent TCP connections).
   However, it interacts badly with lightly-multiplexed cases (few
   concurrent connections) over high BDP paths.  Conventional TCP
   backoff in such cases leads to gaps in packet transmission and
   under-utilisation of the path.

   In an ideal world, the TCP sender would adapt its backoff strategy to
   match the effective depth at which a bottleneck begins indicating
   congestion.  In the practical world, [ABE2015] proposes using the
   existence of ECN CE-marks to infer whether a path's bottleneck is
   AQM-enabled (shallow queue) or classic DropTail (deep queue), and
   adjust backoff accordingly.  This results in a change to [RFC3168],
   which recommended that TCP senders respond in the same way following
   indication of a received ECN CE-mark and a packet loss, making these
   equivalent signals of congestion.  (The idea to change this behaviour
   pre-dates ABE.  [ICC2002] also proposed using ECN CE-marks to modify
   TCP congestion control behaviour, using a larger multiplicative
   decrease factor in conjunction with a smaller additive increase
   factor to deal with RED-based bottlenecks that were not necessarily
   configured to emulate a shallow queue.)
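
   The buffering argument at the start of this subsection can be checked
   with a small calculation.  The sketch below assumes an idealised
   single long-lived flow whose cwnd peaks at BDP plus the queue depth Q
   and is then multiplied by beta, so the link stays busy only if
   beta * (BDP + Q) >= BDP.  This model and the function names are
   illustrative assumptions; they are not taken from [ABE2015] or
   [RFC5681].

```python
def min_queue_for_full_utilisation(bdp: float, beta: float) -> float:
    """Smallest queue depth Q (same units as bdp) with beta * (bdp + Q) >= bdp.

    Assumes one long-lived flow and a peak cwnd of bdp + Q (idealised model).
    """
    if bdp <= 0.0:
        raise ValueError("bdp must be positive")
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return bdp * (1.0 - beta) / beta


def utilisation_after_backoff(bdp: float, queue: float, beta: float) -> float:
    """Fraction of the BDP still covered by cwnd right after one backoff."""
    if bdp <= 0.0:
        raise ValueError("bdp must be positive")
    cwnd_after = beta * (bdp + queue)
    return min(1.0, cwnd_after / bdp)


if __name__ == "__main__":
    # Queue needed per 100 packets of BDP for a few multipliers.
    for beta in (0.5, 0.7, 0.85):
        q = min_queue_for_full_utilisation(100.0, beta)
        print(f"beta={beta}: queue of at least {q:.1f} packets")
```

   Under these assumptions a multiplier of 0.5 needs a full BDP of
   queue, matching the rule-of-thumb above, while 0.7 and 0.85 need
   roughly 43% and 18% of a BDP respectively.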

   [RFC7567] states that "deployed AQM algorithms SHOULD support
   Explicit Congestion Notification (ECN) as well as loss to signal
   congestion to endpoints" and [I-D.AQM-ECN-benefits] encourages this
   deployment.  Apple recently announced their intention to enable ECN
   in iOS 9 and OS X 10.11 devices [WWDC2015].  By 2014, server-side ECN
   negotiation was observed to be provided by the majority of the top
   million web servers [PAM2015], and only 0.5% of websites incurred
   additional connection setup latency using RFC3168-compliant
   ECN-fallback mechanisms.

2.2.  Choice of ABE multiplier

   ABE decouples a TCP sender's reaction to loss and ECN CE-marks.  The
   description respectively uses beta_{loss} and beta_{ecn} to refer to
   the multiplicative decrease factors applied in response to packet
   loss and in response to an indication of a received ECN CE-mark on an
   ECN-enabled TCP connection (based on the terms used in [ABE2015]).

   For non-ECN-enabled TCP connections, no ECN CE-marks are received and
   only beta_{loss} applies.

   In other words, in response to detected loss:

      FlightSize_(n+1) = FlightSize_n * beta_{loss}

   and in response to an indication of a received ECN CE-mark:

      FlightSize_(n+1) = FlightSize_n * beta_{ecn}

   where, as in [RFC5681], FlightSize is the amount of outstanding data
   in the network, upper-bounded by the sender's congestion window
   (cwnd) and the receiver's advertised window (rwnd).  The higher the
   values of beta_*, the less aggressive the response of any individual
   backoff event.

   The appropriate choice for beta_{loss} and beta_{ecn} values is a
   balancing act between path utilisation and draining the bottleneck
   queue.  More aggressive backoff (smaller beta_*) risks underutilising
   the path, while less aggressive backoff (larger beta_*) can result in
   slower draining of the bottleneck queue.
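
   The two reactions above can be sketched as a single sender-side
   routine.  This is a minimal illustration, not a reference
   implementation: the 2*SMSS floor is borrowed from the [RFC5681]
   ssthresh rule and applying it to both multipliers is an assumption,
   as are the names Signal, reduced_flight_size and
   on_congestion_signal.  The default beta_{ecn} of 0.8 is the value
   this memo recommends for experiments in Section 3.2.

```python
from enum import Enum

BETA_LOSS = 0.5  # multiplier of [RFC5681], as quoted in this memo
BETA_ECN = 0.8   # multiplier recommended for experiments in Section 3.2


class Signal(Enum):
    """Congestion signal seen by the sender (hypothetical helper type)."""
    LOSS = "loss"
    ECN_ECHO = "ecn-echo"


def reduced_flight_size(flight_size: int, smss: int, beta: float) -> int:
    """Return FlightSize_(n+1) = FlightSize_n * beta, floored at 2*SMSS.

    The 2*SMSS floor mirrors max(FlightSize / 2, 2*SMSS) from [RFC5681];
    extending it to other values of beta is an assumption of this sketch.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return max(int(flight_size * beta), 2 * smss)


def on_congestion_signal(flight_size: int, smss: int, signal: Signal,
                         beta_loss: float = BETA_LOSS,
                         beta_ecn: float = BETA_ECN) -> int:
    """Pick the multiplier by signal type and return the new window in bytes."""
    beta = beta_ecn if signal is Signal.ECN_ECHO else beta_loss
    return reduced_flight_size(flight_size, smss, beta)


if __name__ == "__main__":
    flight = 100 * 1460  # 100 full-sized segments in flight
    print(on_congestion_signal(flight, 1460, Signal.LOSS))
    print(on_congestion_signal(flight, 1460, Signal.ECN_ECHO))
```

   Keeping the choice of multiplier separate from the reduction itself
   lets an implementation change beta_{ecn} without touching its
   loss response.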

   The Internet has already been running with at least two different
   beta_{loss} values for several years: the value in [RFC5681] is 0.5,
   and Linux CUBIC uses 0.7.  ABE proposes no change to beta_{loss} used
   by any current TCP implementations.

   beta_{ecn} depends on how we want to optimise the response of a TCP
   connection to shallow AQM marking thresholds.  beta_{loss} reflects
   the preferred response of each TCP algorithm when faced with
   exhaustion of buffers (of unknown depth) signalled by packet loss.
   Consequently, for any given TCP algorithm the choice of beta_{ecn} is
   likely to be algorithm-specific, rather than a constant multiple of
   the algorithm's existing beta_{loss}.

   A range of experiments (section IV, [ABE2015]) with NewReno and CUBIC
   over CoDel and PIE in lightly multiplexed scenarios have explored
   this choice of parameter.  These experiments indicate that CUBIC
   connections benefit from beta_{ecn} of 0.85 (cf. beta_{loss} = 0.7),
   and NewReno connections see improvements with beta_{ecn} in the range
   0.7 to 0.85 (cf. beta_{loss} = 0.5).

3.  NEW: Updating the Sender-side ECN Reaction

   This section specifies an experimental update to [RFC3168].

3.1.  RFC 2119

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.2.  Update to RFC 3168

   This document specifies an update to the TCP sender reaction that
   follows when the TCP receiver signals that ECN CE-marked packets have
   been received.

   The first paragraph of Section 6.1.2, "The TCP Sender", in [RFC3168]
   contains the following text:

   "If the sender receives an ECN-Echo (ECE) ACK packet (that is, an ACK
   packet with the ECN-Echo flag set in the TCP header), then the sender
   knows that congestion was encountered in the network on the path from
   the sender to the receiver.  The indication of congestion should be
   treated just as a congestion loss in non-ECN-Capable TCP.  That is,
   the TCP source halves the congestion window "cwnd" and reduces the
   slow start threshold "ssthresh"."

   This memo updates this by replacing it with the following text:

   "If the sender receives an ECN-Echo (ECE) ACK packet (that is, an ACK
   packet with the ECN-Echo flag set in the TCP header), then the sender
   knows that congestion was encountered in the network on the path from
   the sender to the receiver.  This indication of congestion could be
   treated in the same way as a congestion loss; however, reception of
   the ECN-Echo flag SHOULD produce a reduction in FlightSize that is
   less than the reduction had the flow experienced loss.  The reduction
   needs to be sufficient to allow flows sharing a bottleneck to
   increase their share of the capacity.  The multiplier applied to
   FlightSize MUST NOT exceed 0.85 (that is, FlightSize is reduced by at
   least 15%).

   An ECN-capable network device cannot eliminate the possibility of
   loss, because a drop may occur due to a traffic burst exceeding the
   instantaneous available capacity of a network buffer or as a result
   of the AQM algorithm (overload protection mechanisms, etc [RFC7567]).
   Whatever the cause of loss, detection of a missing packet needs to
   trigger the standard loss-based congestion control response.  This
   memo explicitly does not update this behaviour.

   In addition, this document RECOMMENDS that experimental deployments
   multiply the FlightSize by 0.8 and reduce the slow start threshold
   'ssthresh' in response to reception of a TCP segment that sets the
   ECN-Echo flag."

3.3.  Status of the Update

   This update is a sender-side only change.  Like other changes to
   congestion-control algorithms it does not require any change to the
   TCP receiver or to network devices (except to enable an ECN-marking
   algorithm [RFC3168] [RFC7567]).  If the method is only deployed by
   some TCP senders, and not by others, the senders that use this method
   can gain an advantage, possibly at the expense of other flows that do
   not use this updated method.  This advantage applies only to
   ECN-marked packets and not to loss indications.  Hence, the new
   method cannot lead to congestion collapse.

   The present specification has been assigned an Experimental status,
   to provide Internet deployment experience before being proposed as a
   Standards-Track update.

4.  Acknowledgements

   Authors N. Khademi, M. Welzl and G. Fairhurst were part-funded by
   the European Community under its Seventh Framework Programme through
   the Reducing Internet Transport Latency (RITE) project (ICT-317700).
   The views expressed are solely those of the authors.

   The authors would like to thank the following people for their
   contributions to [ABE2015]: Chamil Kulatunga, David Ros, Stein
   Gjessing, Sebastian Zander.  Thanks to (in alphabetical order) Bob
   Briscoe, John Leslie, Dave Taht and the TCPM WG for providing
   valuable feedback on this document.

   The authors are also grateful for the feedback on the congestion
   control behaviour specified in this update that was received from the
   IRTF Internet Congestion Control Research Group (ICCRG).

5.  IANA Considerations

   XX RFC ED - PLEASE REMOVE THIS SECTION XXX

   This memo includes no request to IANA.

6.  Security Considerations

   The described method is a sender-side only transport change, and does
   not change the protocol messages exchanged.  The security
   considerations of RFC 3168 therefore still apply.

   This document describes a change to TCP congestion control with ECN
   that will typically lead to a change in the capacity achieved when
   flows share a network bottleneck.  Similar unfairness in the way that
   capacity is shared is also exhibited by other congestion control
   mechanisms that have been in use in the Internet for many years
   (e.g., CUBIC [ID.CUBIC]).  Unfairness may also be a result of other
   factors, including the round trip time experienced by a flow.  The
   advantage gained by this method applies only to ECN-marked packets
   and not to loss indications, and will therefore not lead to
   congestion collapse.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
              Recommendations Regarding Active Queue Management",
              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015.

7.2.  Informative References

   [ABE2015]  Khademi, N., Welzl, M., Armitage, G., Kulatunga, C., Ros,
              D., Fairhurst, G., Gjessing, S., and S. Zander,
              "Alternative Backoff: Achieving Low Latency and High
              Throughput with ECN and AQM", CAIA Technical Report
              CAIA-TR-150710A, Swinburne University of Technology,
              July 2015.

   [CODEL2012]
              Nichols, K. and V. Jacobson, "Controlling Queue Delay",
              July 2012.

   [I-D.AQM-ECN-benefits]
              Fairhurst, G. and M. Welzl, "The Benefits of using
              Explicit Congestion Notification (ECN)", Internet-draft,
              IETF work-in-progress draft-ietf-aqm-ecn-benefits-08,
              November 2015.

   [I-D.CoDel]
              Nichols, K., Jacobson, V., McGregor, A., and J. Iyengar,
              "Controlled Delay Active Queue Management",
              Internet-draft, IETF work-in-progress
              draft-ietf-aqm-codel-02, December 2015.

   [I-D.PIE]  Pan, R., Natarajan, P., Baker, F., White, G., VerSteeg,
              B., Prabhu, M., Piglione, C., and V. Subramanian, "PIE: A
              Lightweight Control Scheme To Address the Bufferbloat
              Problem", Internet-draft, IETF work-in-progress
              draft-ietf-aqm-pie-03, November 2015.

   [ICC2002]  Kwon, M. and S. Fahmy, "TCP Increase/Decrease Behavior
              with Explicit Congestion Notification (ECN)", IEEE
              ICC 2002, New York, New York, USA, May 2002.

   [ID.CUBIC]
              Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
              R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
              Internet-draft, IETF work-in-progress
              draft-ietf-tcpm-cubic-00, June 2015.

   [PAM2015]  Trammell, B., Kuhlewind, M., Boppart, D., Learmonth, I.,
              Fairhurst, G., and R. Scheffenegger, "Enabling
              Internet-wide Deployment of Explicit Congestion
              Notification", Proceedings of the 2015 Passive and Active
              Measurement Conference, New York, March 2015.

   [WWDC2015]
              Lakhera, P. and S. Cheshire, "Your App and Next Generation
              Networks", Apple Worldwide Developers Conference 2015, San
              Francisco, USA, June 2015.

Authors' Addresses

   Naeem Khademi
   University of Oslo
   PO Box 1080 Blindern
   Oslo  N-0316
   Norway

   Email: naeemk@ifi.uio.no


   Michael Welzl
   University of Oslo
   PO Box 1080 Blindern
   Oslo  N-0316
   Norway

   Email: michawe@ifi.uio.no


   Grenville Armitage
   Centre for Advanced Internet Architectures
   Swinburne University of Technology
   PO Box 218
   John Street, Hawthorn
   Victoria  3122
   Australia

   Email: garmitage@swin.edu.au


   Godred Fairhurst
   University of Aberdeen
   School of Engineering, Fraser Noble Building
   Aberdeen  AB24 3UE
   UK

   Email: gorry@erg.abdn.ac.uk