idnits 2.17.1

draft-barik-mptcp-lisa-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (June 27, 2016) is 2858 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Experimental RFC: RFC 6356

  ** Downref: Normative reference to an Experimental RFC: RFC 6928

     Summary: 2 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Multipath TCP                                                   R. Barik
Internet-Draft                                        University of Oslo
Intended status: Standards Track                               S. Ferlin
Expires: December 29, 2016                    Simula Research Laboratory
                                                                M. Welzl
                                                      University of Oslo
                                                           June 27, 2016

                A Linked Slow-Start Algorithm for MPTCP
                       draft-barik-mptcp-lisa-01

Abstract

   This document describes the LISA (Linked Slow-Start Algorithm) for
   Multipath TCP (MPTCP).
   Currently during slow-start, subflows behave like independent TCP
   flows making MPTCP unfair to cross-traffic and causing more
   congestion at the bottleneck.  This also yields more losses among the
   MPTCP subflows.  LISA couples the initial windows (IW) of MPTCP
   subflows during the initial slow-start phase to remove this adverse
   behavior.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 29, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Definitions
   2.  MPTCP Slow-Start Problem Description
     2.1.  Example of current MPTCP slow-start problem
   3.  Linked Slow-Start Algorithm
     3.1.  Description of LISA
     3.2.  Algorithm
   4.  Implementation Status
   5.  Acknowledgements
   6.  IANA Considerations
   7.  Security Considerations
   8.  Change History
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   The current MPTCP implementation provides multiple congestion control
   algorithms, which aim to provide fairness to TCP flows at the shared
   bottlenecks.  However, in RFC 6356 [RFC6356], the subflows'
   slow-start phase remains unchanged from RFC 5681 [RFC5681], and all
   the subflows at this stage behave like independent TCP flows.
   Following the development of IW as per [RFC6928], each MPTCP subflow
   can start with IW = 10.  With an increasing number of subflows, the
   subflows' collective behavior during the initial slow-start phase can
   temporarily be very aggressive towards a concurrent regular TCP flow
   at the shared bottleneck.

   According to [UIT02], most of the TCP sessions in the Internet
   consist of short flows, e.g., HTTP requests, where TCP will likely
   never leave slow-start.  Therefore, the slow-start behavior becomes
   of critical importance for the overall performance.

   To mitigate the adverse effect during initial slow-start, we
   introduce LISA, the "Linked Slow-Start Algorithm".
   LISA shares the congestion window among MPTCP subflows in slow-start
   whenever a new subflow joins.

1.1.  Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   Acronyms used in this document:

      IW -- Initial Window

      RTT -- Round Trip Time

      CWND -- Congestion Window

      Inflight -- MPTCP subflow's inflight data

      old_subflow.CWND -- Congestion Window of the subflow having the
      largest sending rate

      new_subflow.CWND -- New incoming subflow's Congestion Window

      Ignore_ACKs -- a boolean variable indicating whether ACKs should
      be ignored

      ACKs_To_Ignore -- the number of ACKs for which old_subflow.CWND
      stops increasing during slow-start

      compound CWND -- sum of CWND of the subflows in slow-start

2.  MPTCP Slow-Start Problem Description

   Since it takes 1 RTT for the sender to receive any feedback on a
   given TCP connection, sending an additional segment after every ACK
   is rather aggressive.  Therefore, in slow-start, all subflows
   independently doubling their CWND as in regular TCP results in MPTCP
   also doubling its compound CWND.  The MPTCP aggregate only diverges
   from this behavior when the number of subflows changes.  Coupling of
   CWND is therefore not necessary in slow-start except when a new
   subflow joins.

2.1.  Example of current MPTCP slow-start problem

   We illustrate the problematic MPTCP slow-start behavior with an
   example: Consider an MPTCP connection consisting of 2 subflows.  The
   first subflow starts with IW = 10, and after 2 RTTs the CWND becomes
   40 and a new subflow joins, again with IW = 10.  Then, the compound
   CWND becomes 40+10 = 50.  With an increasing number of subflows, the
   compound CWND in MPTCP becomes larger than that of a concurrent TCP
   flow.
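
   As a rough illustration of the arithmetic in this example, the
   following Python sketch computes the compound CWND of uncoupled
   subflows at the moment new subflows join.  The function name and its
   parameters are hypothetical and not part of this document; the sketch
   assumes ideal doubling per RTT (no losses, no delayed ACKs) and that
   every joining subflow starts with its own IW.

```python
def uncoupled_compound_cwnd(iw, rtts_elapsed, joining_subflows):
    """Compound CWND right after `joining_subflows` new subflows join.

    Assumption: the first subflow doubled its CWND every RTT and each
    joining subflow starts independently with its own IW.
    """
    first_subflow_cwnd = iw * 2 ** rtts_elapsed
    return first_subflow_cwnd + joining_subflows * iw


# First subflow starts with IW = 10; after 2 RTTs one new subflow joins.
print(uncoupled_compound_cwnd(10, 2, 1))  # 40 + 10 = 50
```
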

   For example, MPTCP with eight subflows (as recommended in [DCMPTCP11]
   for datacenters) will have a compound CWND of 110 (40+7*10).  As a
   result, MPTCP would behave unfairly to a concurrent TCP flow sharing
   the bottleneck.  This aggressive behavior of MPTCP also affects the
   performance of MPTCP.  If multiple subflows share a bottleneck, each
   of them doubling its rate every RTT will cause excessive losses at
   the bottleneck.  This makes MPTCP enter the congestion avoidance
   phase earlier and thereby increases the completion time of the
   transfer.

   This problem, and the improvement attained with LISA, are documented
   in detail in [lisa].

3.  Linked Slow-Start Algorithm

3.1.  Description of LISA

   The idea behind LISA is that each new subflow takes a 'packet credit'
   from an existing subflow in slow-start for its own IW.  We design the
   mechanism such that a new subflow has 10 segments as the upper limit
   [RFC6928] and 3 segments as the lower limit [RFC3390].  The main
   reason behind these limits is to let these subflows compete
   reasonably with other flows.  We also divide the CWND fairly in order
   to give all subflows an equal chance when competing with each other.

   LISA first finds the subflow with the largest sending rate measured
   over the last RTT.  Depending on the subflow's CWND, between 3 and 10
   segments are taken from it as packet credit and used for the new
   subflow's IW.  The packet credit is realized by reducing the CWND of
   the old subflow and halting its increase for ACKs_To_Ignore ACKs.

   We clarify LISA with the example given in Section 2.1.  After 2 RTTs,
   the old_subflow.CWND = 40 and a new_subflow joins the connection.
   Since old_subflow.CWND >= 20 (refer to Section 3.2), 10 packets can
   be taken by the new_subflow.CWND, resulting in old_subflow.CWND = 30
   and new_subflow.CWND = 10.
   Hence, MPTCP's compound CWND, whose current size is 40, should
   ideally become 60+20 = 80 after 1 RTT (assuming a receiver without
   delayed ACKs).  However, if 40 segments from old_subflow.CWND are
   already in flight, the compound CWND becomes in fact 70+20 = 90.
   Here, LISA keeps old_subflow.CWND from increasing for the next 10
   ACKs.  In comparison, MPTCP without LISA would have a compound CWND
   of 80+20 = 100 after 1 RTT.

3.2.  Algorithm

   Below, we describe the LISA algorithm.  LISA is invoked before a new
   subflow sends its IW.

   1.  Before computing the new_subflow.CWND, Ignore_ACKs = False and
       ACKs_To_Ignore = 0.

   2.  Then, ignoring the new_subflow, the subflow in slow-start with
       the largest sending rate (old_subflow.CWND, measured over the
       last RTT) is selected.

   3.  If there is no such subflow, new_subflow.CWND = 10.  Otherwise,
       the following steps are executed:

       if old_subflow.CWND >= 20         // take IW(10) packets

          old_subflow.CWND -= 10

          new_subflow.CWND = 10

          Ignore_ACKs = True

       else if old_subflow.CWND >= 6     // take half the packets

          new_subflow.CWND = old_subflow.CWND / 2

          old_subflow.CWND -= new_subflow.CWND

          Ignore_ACKs = True

       else

          new_subflow.CWND = 3           // can't take from old_subflow

   4.  if Ignore_ACKs and Inflight > old_subflow.CWND

          // do not increase CWND when ACKs arrive

          ACKs_To_Ignore = Inflight - old_subflow.CWND

4.  Implementation Status

   LISA is implemented as a patch to the Linux kernel 3.14.33+ and
   within MPTCP's v0.89.5.  It is meant for research and provided by the
   University of Oslo and Simula Research Laboratory, and available for
   download from http://heim.ifi.uio.no/runabk/lisa.  This code was used
   to produce the test results that are reported in [lisa].

5.
Acknowledgements

   This work was part-funded by the European Community under its Seventh
   Framework Programme through the Reducing Internet Transport Latency
   (RITE) project (ICT-317700).  The authors also would like to thank
   David Hayes (UiO) for his comments.  The views expressed are solely
   those of the authors.

6.  IANA Considerations

   This memo includes no request to IANA.

7.  Security Considerations

8.  Change History

   Changes made to this document:

   00->01 :  Some minor text improvements and updated a reference.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3390]  Allman, M., Floyd, S., and C. Partridge, "Increasing TCP's
              Initial Window", RFC 3390, DOI 10.17487/RFC3390,
              October 2002.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC6356]  Raiciu, C., Handley, M., and D. Wischik, "Coupled
              Congestion Control for Multipath Transport Protocols",
              RFC 6356, DOI 10.17487/RFC6356, October 2011.

   [RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
              "Increasing TCP's Initial Window", RFC 6928,
              DOI 10.17487/RFC6928, April 2013.

9.2.  Informative References

   [DCMPTCP11]
              Raiciu, C., Barre, S., Pluntke, C., Greenhalgh, A.,
              Wischik, D., and M. Handley, "Improving datacenter
              performance and robustness with multipath TCP", ACM
              SIGCOMM p266-277, August 2011.

   [UIT02]    Brownlee, N. and K. Claffy, "Understanding internet
              traffic streams: Dragonflies and tortoises", IEEE
              Communications Magazine p110-117, 2002.

   [lisa]     Barik, R., Welzl, M., Ferlin, S., and O. Alay, "LISA: A
              Linked Slow-Start Algorithm for MPTCP", IEEE ICC 2016,
              Kuala Lumpur, Malaysia, 2016.
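
Appendix A.  Illustrative Sketch of the Algorithm in Section 3.2

   The following Python sketch is not part of the LISA specification and
   is not the Linux implementation mentioned in Section 4.  It is one
   possible reading of the steps in Section 3.2, given only to make the
   control flow concrete.  The names Subflow and lisa_on_join, the
   "rate" field used to pick the old subflow, and the use of integer
   division for "half the packets" are assumptions made for this
   illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Subflow:
    cwnd: int                   # congestion window, in segments
    inflight: int = 0           # segments currently in flight
    rate: float = 0.0           # sending rate over the last RTT (assumed given)
    in_slow_start: bool = True


def lisa_on_join(existing: List[Subflow]) -> Tuple[Subflow, int]:
    """Return the new subflow and ACKs_To_Ignore for the old subflow.

    Meant to be called before the new subflow sends its IW.
    """
    ignore_acks = False         # step 1
    acks_to_ignore = 0

    # Step 2: subflow in slow-start with the largest sending rate.
    candidates = [s for s in existing if s.in_slow_start]
    if not candidates:          # step 3: no subflow to take credit from
        return Subflow(cwnd=10), 0
    old = max(candidates, key=lambda s: s.rate)

    if old.cwnd >= 20:          # take IW (10) packets
        old.cwnd -= 10
        new = Subflow(cwnd=10)
        ignore_acks = True
    elif old.cwnd >= 6:         # take half the packets (integer division)
        new = Subflow(cwnd=old.cwnd // 2)
        old.cwnd -= new.cwnd
        ignore_acks = True
    else:                       # can't take from the old subflow
        new = Subflow(cwnd=3)

    # Step 4: number of ACKs for which the old subflow's CWND stays fixed.
    if ignore_acks and old.inflight > old.cwnd:
        acks_to_ignore = old.inflight - old.cwnd
    return new, acks_to_ignore


# Example from Section 3.1: old subflow with CWND = 40, all 40 in flight.
old = Subflow(cwnd=40, inflight=40, rate=40.0)
new, acks = lisa_on_join([old])
print(old.cwnd, new.cwnd, acks)  # 30 10 10
```

   With the values of the example in Section 3.1, the sketch yields
   old_subflow.CWND = 30, new_subflow.CWND = 10 and ACKs_To_Ignore = 10.
   How the old subflow actually skips its CWND increase for those ACKs is
   left out here, since it depends on the TCP stack.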

Authors' Addresses

   Runa Barik
   University of Oslo
   PO Box 1080 Blindern
   Oslo  N-0316
   Norway

   Email: runabk@ifi.uio.no

   Simone Ferlin
   Simula Research Laboratory
   P.O.Box 134
   Lysaker, 1325
   Norway

   Email: ferlin@simula.no

   Michael Welzl
   University of Oslo
   PO Box 1080 Blindern
   Oslo, N-0316
   Norway

   Phone: +47 2285 2420
   Email: michawe@ifi.uio.no