idnits 2.17.1

draft-khademi-tcpm-alternativebackoff-ecn-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (October 30, 2016) is 2734 days in the past.  Is this
     intentional?

  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-04) exists of
     draft-black-tsvwg-ecn-experimentation-02

  == Outdated reference: A later version (-07) exists of
     draft-ietf-tcpm-cubic-02

  == Outdated reference: A later version (-10) exists of
     draft-ietf-aqm-codel-04

  == Outdated reference: A later version (-28) exists of
     draft-ietf-tcpm-accurate-ecn-01

  == Outdated reference: A later version (-10) exists of
     draft-ietf-tcpm-dctcp-02

     Summary: 0 errors (**), 0 flaws (~~), 7 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                         N. Khademi
Internet-Draft                                                  M. Welzl
Intended status: Experimental                         University of Oslo
Expires: May 3, 2017                                         G. Armitage
                                                 Swinburne University of
                                                              Technology
                                                            G. Fairhurst
                                                  University of Aberdeen
                                                        October 30, 2016

                 TCP Alternative Backoff with ECN (ABE)
              draft-khademi-tcpm-alternativebackoff-ecn-01

Abstract

   This memo updates the TCP sender-side reaction to a congestion
   notification received via Explicit Congestion Notification (ECN).
   The updated method reduces FlightSize in Congestion Avoidance by a
   smaller amount than the TCP reaction to loss.  The intention is to
   achieve good throughput when the queue at the bottleneck is smaller
   than the bandwidth-delay-product of the connection.  This is more
   likely when an Active Queue Management (AQM) mechanism has used ECN
   to CE-mark a packet, than when a packet was lost.  Future versions of
   this document will also describe a corresponding method for SCTP.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 3, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Definitions
   2.  Introduction
   3.  Discussion
     3.1.  Why Use ECN to Vary the Degree of Backoff?
     3.2.  Focus on ECN as Defined in RFC3168
     3.3.  Discussion: Choice of ABE Multiplier
   4.  Specification
   5.  Status of the Update
   6.  Acknowledgements
   7.  IANA Considerations
   8.  Implementation Status
   9.  Security Considerations
   10. Revision Information
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Introduction

   Complementing [I-D.AQM-ECN-benefits], [I-D.ECN-exp] enables wider ECN
   deployment by updating rules in [RFC3168] that prohibited certain
   experiments.
   Specifically, [I-D.ECN-exp] allows for experiments to
   specify a congestion control response to a CE-marked packet that
   differs from the response to a dropped packet.  This memo defines
   such a different congestion control response, called "ABE"
   (Alternative Backoff with ECN).  ABE is thus an Experiment in
   accordance with [I-D.ECN-exp].

   [RFC5681] stipulates that TCP congestion control sets "ssthresh" to
   max(FlightSize / 2, 2*SMSS) in response to packet loss.  This
   corresponds to a backoff multiplier of 0.5 (halving cwnd and
   ssthresh after packet loss).  Consequently, a standard TCP flow
   using this reaction needs significant network queue space: it can
   only fully utilise a bottleneck when the length of the link queue (or
   the AQM dropping threshold) is at least the bandwidth-delay product
   (BDP) of the flow.

   A backoff multiplier of 0.5 is not the only available strategy.  As
   defined in [I-D.CUBIC], CUBIC multiplies the current cwnd by 0.7 in
   response to loss (the Linux implementation of CUBIC has used a
   multiplier of 0.7 since kernel version 2.6.25, released in 2008).
   Consequently, CUBIC utilises paths well even when the bottleneck
   queue is shorter than the bandwidth-delay product of the flow.
   However, in the case of a DropTail (FIFO) queue without AQM, such
   less-aggressive backoff increases the risk of creating a standing
   queue [CODEL2012].

   The standard TCP backoff behaviour defined in [RFC5681] entails
   reduced link utilisation in situations with short queues and low
   statistical multiplexing.  This memo proposes a concrete
   sender-side-only congestion control response that remedies this
   problem.

   Devices implementing AQM are likely to be the dominant (and possibly
   only) source of ECN CE-marking for packets from ECN-capable senders.
   AQM mechanisms typically strive to maintain a small average queue
   length, regardless of the bandwidth-delay product of flows passing
   through them.  Receipt of an ECN CE-mark might therefore reasonably
   be taken to indicate that a small bottleneck queue exists in the
   path, and hence the TCP flow would benefit from using a less
   aggressive backoff multiplier.

   Much of the background to this proposal can be found in [ABE2015].
   Using a mix of experiments, theory and simulations with standard
   NewReno and CUBIC, [ABE2015] recommends enabling ECN and letting
   individual TCP senders use a larger multiplicative decrease factor as
   a reaction to the receiver reporting ECN CE-marks from AQM-enabled
   bottlenecks.  Such a change is noted to result in "...significant
   performance gains in lightly-multiplexed scenarios, without losing
   the delay-reduction benefits of deploying CoDel or PIE" [I-D.CoDel]
   [I-D.PIE].  This is achieved when reacting to ECN-Echo in Congestion
   Avoidance by multiplying cwnd and ssthresh with a value in the range
   [0.7..0.85].

3.  Discussion

3.1.  Why Use ECN to Vary the Degree of Backoff?

   The classic rule-of-thumb dictates that a transport provides a BDP of
   bottleneck buffering if a TCP connection wishes to optimise path
   utilisation.  A single TCP connection running through such a
   bottleneck will have opened cwnd up to 2*BDP by the time packet loss
   occurs.  [RFC5681]'s halving of cwnd and ssthresh pushes the TCP
   connection back to allowing only a BDP of packets in flight -- just
   sufficient to maintain 100% utilisation of the network path.

   AQM schemes like CoDel [I-D.CoDel] and PIE [I-D.PIE] use congestion
   notifications to constrain the queuing delays experienced by packets,
   rather than in response to impending or actual bottleneck buffer
   exhaustion.
   With current default delay targets, CoDel and PIE both
   effectively emulate a shallow buffered bottleneck (section II,
   [ABE2015]) while allowing short traffic bursts into the queue.  This
   interacts acceptably with TCP connections over low BDP paths, or in
   highly multiplexed scenarios (many concurrent TCP connections).
   However, it interacts badly with lightly-multiplexed cases (few
   concurrent connections) over a high BDP path.  Conventional TCP
   backoff in such cases leads to gaps in packet transmission and
   under-utilisation of the path.

   The idea of reacting to an ECN CE-mark differently than to loss
   pre-dates [ABE2015].  [ICC2002] also proposed using ECN CE-marks to
   modify TCP congestion control behaviour, using a larger
   multiplicative decrease factor in conjunction with a smaller additive
   increase factor to work with RED-based bottlenecks that were not
   necessarily configured to emulate a shallow queue.

3.2.  Focus on ECN as Defined in RFC3168

   Some mechanisms rely on ECN semantics that differ from the
   definitions in [RFC3168] -- for example, Congestion Exposure (ConEx)
   [RFC7713] and DCTCP [I-D.ietf-tcpm-dctcp] need more accurate ECN
   information than the feedback mechanism in [RFC3168] offers (defined
   in [I-D.ietf-tcpm-accurate-ecn]).  Such mechanisms allow the sending
   rate to be adjusted more frequently than once per RTT.  These
   mechanisms are out of the scope of the current document.

3.3.  Discussion: Choice of ABE Multiplier

   Alternative Backoff with ECN (ABE) decouples a TCP sender's reaction
   to loss and ECN CE-marks in Congestion Avoidance.  The description
   uses beta_{loss} and beta_{ecn} to refer to the multiplicative
   decrease factors applied, respectively, in response to packet loss
   and in response to a receiver indicating that an ECN CE-mark was
   received on an ECN-enabled TCP connection (based on the terms used in
   [ABE2015]).
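
   The two multiplicative decrease factors named above can be
   illustrated with a minimal sketch.  It is not taken from the ABE
   patches: the function name, the SMSS value, and the choice of 0.8
   for beta_{ecn} (one value inside the [0.7..0.85] range quoted in
   the Introduction) are assumptions made for illustration, as is
   applying the 2*SMSS floor of [RFC5681] to the ECN case.

```python
SMSS = 1460  # sender maximum segment size in bytes (assumed value)

BETA_LOSS = 0.5  # reaction to loss per RFC 5681, as quoted in the Introduction
BETA_ECN = 0.8   # illustrative value inside [0.7..0.85]; an assumption


def reduced_window(flight_size: float, ce_marked: bool,
                   beta_loss: float = BETA_LOSS,
                   beta_ecn: float = BETA_ECN) -> float:
    """Return the window target after a single congestion event.

    ce_marked=True models an ECN-Echo indication, False models loss.
    The 2*SMSS floor follows RFC 5681 for loss; using the same floor
    for the ECN case is an assumption of this sketch.
    """
    beta = beta_ecn if ce_marked else beta_loss
    return max(flight_size * beta, 2 * SMSS)


if __name__ == "__main__":
    flight = 100 * SMSS  # 100 full-sized segments in flight
    print("after loss:   ", reduced_window(flight, ce_marked=False))
    print("after CE-mark:", reduced_window(flight, ce_marked=True))
```

   With these assumed values, a CE-mark leaves the sender with a larger
   window than a loss would, which is the decoupling that ABE relies
   on.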
   For non-ECN-enabled TCP connections, no ECN CE-marks are
   received and only beta_{loss} applies.

   In other words, in response to detected loss:

      FlightSize_(n+1) = FlightSize_n * beta_{loss}

   and in response to an indication of a received ECN CE-mark:

      FlightSize_(n+1) = FlightSize_n * beta_{ecn}

   where, as in [RFC5681], FlightSize is the amount of outstanding data
   in the network, upper-bounded by the sender's congestion window
   (cwnd) and the receiver's advertised window (rwnd).  The higher the
   values of beta_{loss} and beta_{ecn}, the less aggressive the
   response of any individual backoff event.

   The appropriate choice for beta_{loss} and beta_{ecn} values is a
   balancing act between path utilisation and draining the bottleneck
   queue.  More aggressive backoff (smaller beta_*) risks underutilising
   the path, while less aggressive backoff (larger beta_*) can result in
   slower draining of the bottleneck queue.

   The Internet has already been running with at least two different
   beta_{loss} values for several years: the value in [RFC5681] is 0.5,
   and Linux CUBIC uses 0.7.  ABE proposes no change to the beta_{loss}
   used by any current TCP implementation.

   beta_{ecn} depends on how the response of a TCP connection to shallow
   AQM marking thresholds is optimised.  beta_{loss} reflects the
   preferred response of each TCP algorithm when faced with exhaustion
   of buffers (of unknown depth) signalled by packet loss.
   Consequently, for any given TCP algorithm the choice of beta_{ecn} is
   likely to be algorithm-specific, rather than a constant multiple of
   the algorithm's existing beta_{loss}.

   A range of experiments (section IV, [ABE2015]) with NewReno and CUBIC
   over CoDel and PIE in lightly-multiplexed scenarios have explored
   this choice of parameter.  These experiments indicate that CUBIC
   connections benefit from beta_{ecn} of 0.85 (cf. beta_{loss} = 0.7),
   and NewReno connections see improvements with beta_{ecn} in the range
   0.7 to 0.85 (cf. beta_{loss} = 0.5).

4.  Specification

   This document RECOMMENDS that experimental deployments multiply the
   FlightSize by 0.8 and reduce the slow start threshold 'ssthresh' in
   Congestion Avoidance in response to reception of a TCP segment that
   sets the ECN-Echo flag.

5.  Status of the Update

   This update is a sender-side only change.  Like other changes to
   congestion-control algorithms, it does not require any change to the
   TCP receiver or to network devices (except to enable an ECN-marking
   algorithm [RFC3168] [RFC7567]).  If the method is only deployed by
   some TCP senders, and not by others, the senders that use this method
   can gain advantage, possibly at the expense of other flows that do
   not use this updated method.  This advantage applies only to
   ECN-marked packets and not to loss indications.  Hence, the new
   method cannot lead to congestion collapse.

   The present specification has been assigned an Experimental status,
   to provide Internet deployment experience before being proposed as a
   Standards-Track update.

6.  Acknowledgements

   Authors N. Khademi, M. Welzl and G. Fairhurst were part-funded by the
   European Community under its Seventh Framework Programme through the
   Reducing Internet Transport Latency (RITE) project (ICT-317700).  The
   views expressed are solely those of the authors.

   The authors would like to thank the following people for their
   contributions to [ABE2015]: Chamil Kulatunga, David Ros, Stein
   Gjessing, Sebastian Zander.  Thanks to (in alphabetical order) Bob
   Briscoe, Markku Kojo, John Leslie, Dave Taht and the TCPM WG for
   providing valuable feedback on this document.

   The authors are also grateful for the feedback on the congestion
   control behaviour specified in this update that was received from the
   IRTF Internet Congestion Control Research Group (ICCRG).

7.  IANA Considerations

   XX RFC ED - PLEASE REMOVE THIS SECTION XXX

   This memo includes no request to IANA.

8.  Implementation Status

   ABE is implemented as a patch for Linux and FreeBSD.  It is meant for
   research and available for download from
   <http://heim.ifi.uio.no/naeemk/research/ABE/>.  This code was used to
   produce the test results that are reported in [ABE2015].

9.  Security Considerations

   The described method is a sender-side only transport change, and does
   not change the protocol messages exchanged.  The security
   considerations of [RFC3168] therefore still apply.

   This document describes a change to TCP congestion control with ECN
   that will typically lead to a change in the capacity achieved when
   flows share a network bottleneck.  Similar unfairness in the way that
   capacity is shared is also exhibited by other congestion control
   mechanisms that have been in use in the Internet for many years
   (e.g., CUBIC [I-D.CUBIC]).  Unfairness may also be a result of other
   factors, including the round trip time experienced by a flow.  This
   advantage applies only to ECN-marked packets and not to loss
   indications, and will therefore not lead to congestion collapse.

10.  Revision Information

   XX RFC ED - PLEASE REMOVE THIS SECTION XXX

   -01.  This I-D now refers to
   draft-black-tsvwg-ecn-experimentation-02, which replaces
   draft-khademi-tsvwg-ecn-response-00 to make a broader update to
   RFC3168 for the sake of allowing experiments.  As a result, some of
   the motivating and discussing text that was moved from
   draft-khademi-alternativebackoff-ecn-03 to
   draft-khademi-tsvwg-ecn-response-00 has now been re-inserted here.

   -00.  draft-khademi-tsvwg-ecn-response-00 and
   draft-khademi-tcpm-alternativebackoff-ecn-00 replace
   draft-khademi-alternativebackoff-ecn-03, following discussion in the
   TSVWG and TCPM working groups.

11.  References

11.1.  Normative References

   [I-D.ECN-exp]
              Black, D., "Explicit Congestion Notification (ECN)
              Experimentation", Internet-draft, IETF
              work-in-progress draft-black-tsvwg-ecn-experimentation-02,
              October 2016.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
              Recommendations Regarding Active Queue Management",
              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015.

11.2.  Informative References

   [ABE2015]  Khademi, N., Welzl, M., Armitage, G., Kulatunga, C., Ros,
              D., Fairhurst, G., Gjessing, S., and S. Zander,
              "Alternative Backoff: Achieving Low Latency and High
              Throughput with ECN and AQM", CAIA Technical Report
              CAIA-TR-150710A, Swinburne University of Technology,
              July 2015.

   [CODEL2012]
              Nichols, K. and V. Jacobson, "Controlling Queue Delay",
              July 2012.

   [I-D.AQM-ECN-benefits]
              Fairhurst, G. and M. Welzl, "The Benefits of using
              Explicit Congestion Notification (ECN)", Internet-draft,
              IETF work-in-progress draft-ietf-aqm-ecn-benefits-08,
              November 2015.

   [I-D.CUBIC]
              Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
              R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
              Internet-draft, IETF
              work-in-progress draft-ietf-tcpm-cubic-02, August 2016.

   [I-D.CoDel]
              Nichols, K., Jacobson, V., McGregor, A., and J. Iyengar,
              "Controlled Delay Active Queue Management",
              Internet-draft, IETF work-in-progress
              draft-ietf-aqm-codel-04, June 2016.

   [I-D.PIE]  Pan, R., Natarajan, P., Baker, F., and G. White, "PIE: A
              Lightweight Control Scheme To Address the Bufferbloat
              Problem", Internet-draft, IETF
              work-in-progress draft-ietf-aqm-pie-10, September 2016.

   [I-D.ietf-tcpm-accurate-ecn]
              Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More
              Accurate ECN Feedback in TCP",
              draft-ietf-tcpm-accurate-ecn-01 (work in progress),
              June 2016.

   [I-D.ietf-tcpm-dctcp]
              Bensley, S., Eggert, L., Thaler, D., Balasubramanian, P.,
              and G. Judd, "Datacenter TCP (DCTCP): TCP Congestion
              Control for Datacenters", draft-ietf-tcpm-dctcp-02 (work
              in progress), July 2016.

   [ICC2002]  Kwon, M. and S. Fahmy, "TCP Increase/Decrease Behavior
              with Explicit Congestion Notification (ECN)", IEEE
              ICC 2002, New York, New York, USA, May 2002.

   [RFC7713]  Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx)
              Concepts, Abstract Mechanism, and Requirements", RFC 7713,
              DOI 10.17487/RFC7713, December 2015.

Authors' Addresses

   Naeem Khademi
   University of Oslo
   PO Box 1080 Blindern
   Oslo, N-0316
   Norway

   Email: naeemk@ifi.uio.no

   Michael Welzl
   University of Oslo
   PO Box 1080 Blindern
   Oslo, N-0316
   Norway

   Email: michawe@ifi.uio.no

   Grenville Armitage
   Centre for Advanced Internet Architectures
   Swinburne University of Technology
   PO Box 218
   John Street, Hawthorn
   Victoria, 3122
   Australia

   Email: garmitage@swin.edu.au

   Godred Fairhurst
   University of Aberdeen
   School of Engineering, Fraser Noble Building
   Aberdeen, AB24 3UE
   UK

   Email: gorry@erg.abdn.ac.uk