BANANA                                                        N. Leymann
Internet Draft                                              C. Heidemann
Intended Category: Informational                     Deutsche Telekom AG
                                                                G. Liang
                                                            China Mobile
                                                                 J. Shen
                                                  China Telecom Co., Ltd
                                                                M. Zhang
                                                                 L. Chen
                                                                  Huawei
                                                               M. Cullen
                                                       Painless Security
Expires: August 22, 2017                               February 18, 2017


           BANdwidth Aggregation for interNet Access (BANANA)
                   Load Rebalance for Bonding Tunnels
               draft-leymann-banana-load-rebalance-00.txt

Abstract

   BANdwidth Aggregation for interNet Access (BANANA) makes use of a
   subscriber's multiple points of attachment to the Internet to provide
   the subscriber with higher bandwidth and reliability than what is
   provided by any single one of these attachments.

   Various tunnel-based methods have been developed to realize BANANA.
   This document specifies a throughput-increasing mechanism that can be
   commonly adopted by bonding tunnel methods. In essence, the ingress
   node adaptively adjusts its load distribution function according to
   the quality of the bonding tunnels so as to make the best use of the
   bonding bandwidth.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Acronyms and Terminology
   3. Problem: Bonding Reordering Buffer Bloating
   4. Related Work
   5. Load Rebalance
     5.1. Adaptive Splitting Ratio
   6. Protocol Extensions
   7. Security Considerations
   8. IANA Considerations
   9. References
     9.1. Normative References
     9.2. Informative References
   Author's Addresses

1. Introduction

   BANdwidth Aggregation for interNet Access (BANANA) enables
   subscribers to make use of multiple access technologies to achieve
   reliable and high bandwidth Internet access. Various bonding tunnel
   technologies have been proposed to realize BANANA [GREbond] [GTPbond]
   [MIPbond].
   Since bonding tunnels adopt per-packet traffic distribution, the
   latency difference between the two tunnels may reorder the packets
   of a single traffic flow that is split across them. Therefore, a
   reordering buffer for the bonding tunnels is used at the egress node
   to restore packet order. It is referred to as the "bonding
   reordering buffer" in the remainder of this document.

   The egress node places a limit (see OUTOFORDER_TIMER in [RFC2890]) on
   the time that a packet can wait in the bonding reordering buffer and
   places a limit on the number of packets in the bonding reordering
   buffer (MAX_REORDER_BUFFER, see MAX_PERFLOW_BUFFER in [RFC2890]). Any
   packet that would cause violation of either of the two limits MUST be
   forcibly delivered by the egress node. Bloating of the bonding
   reordering buffer may break these two limits, which forces packet
   delivery and thereby causes mass loss of TCP packets. As a result,
   the throughput of the bonding tunnels may decrease dramatically. It
   is therefore important to minimize the usage of the bonding
   reordering buffer (the "Bonding Reordering Buffer Size") in order to
   reduce the possibility of breaking the above two limits.

   BANANA may measure the Round Trip Time (RTT) and data rate of each
   tunnel and monitor the usage of the bonding reordering buffer. Based
   on these measurements, the ingress node may dynamically adjust the
   traffic distribution function in order to achieve a higher throughput
   of the bonding tunnels. For example, it may adaptively update the
   splitting ratio or adaptively arrange the packet sequence into the
   bonding tunnels.

2. Acronyms and Terminology

   CIR: Committed Information Rate [RFC2697]

   RTT: Round Trip Time

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3. Problem: Bonding Reordering Buffer Bloating

   The latency difference between the two tunnels reorders the packets
   of a traffic flow that is split across them. The bonding reordering
   buffer at the egress, which operates on the bonding sequence number,
   is used to "absorb" this latency difference. Figure 3.1 illustrates
   the reordering operation.

                        +-+
                        |7|  Bonding Sequence Number
                        +-+
                        .4.  Sequence Number
                        ...
   +-+       +----->----Tunnel 1---->------+         +-+
   |8|       |                             |         |1|
   +-+  +--------------+           +---------------+ +-+
   ---->| Distribution |           | Recombination |---->
        +--------------+           +---------------+
             |                             |     ^|
             +----->----Tunnel 2---->------+     |v
                  +-+    +-+    +-+           +-------+
                  |6|    |4|    |2|           | +-+-+ |Bonding
                  +-+    +-+    +-+           | |5|3| |Reordering
                  .3.    .2.    .1.           | +-+-+ |Buffer
                  ...    ...    ...           +-------+

            Figure 3.1: Bonding Tunnel Reordering Operation

   [RFC2890] places two limits on the reordering buffer of a tunnel. One
   is the timer limit, OUTOFORDER_TIMER, and the other is the size
   limit, MAX_PERFLOW_BUFFER. For bonding tunnels, the first limit is
   reused, while the second becomes the maximum bonding reordering
   buffer size of the entire bonding tunnel rather than of a specific
   flow.

   [RFC5681] defines Flight Size as the amount of data that has been
   sent but not yet cumulatively acknowledged. In this document, the
   Flight Size of a tunnel indicates the amount of data that has been
   sent by the ingress node onto this tunnel but has not yet passed
   through the reordering buffer (which is not shown in the figure) of
   this tunnel.
   The Flight Size of the entire bonding tunnel indicates the amount of
   data that has been sent by the ingress node over either tunnel but
   has not yet passed through the bonding reordering buffer. From the
   sequence number of the last packet sent by the ingress node and the
   latest sequence number acknowledged by the egress node, the ingress
   node can monitor the Flight Size of a tunnel. For the entire bonding
   tunnel, the egress node might acknowledge the bonding sequence number
   via either of the two tunnels. The highest bonding sequence number
   acknowledged across the two tunnels is the latest acknowledged
   bonding sequence number.

   As shown in Figure 3.1, the Flight Sizes of the tunnels can be used
   to estimate the load of the tunnels and the usage of the bonding
   reordering buffer. Suppose the Flight Size of Tunnel_1 is F_1, the
   Flight Size of Tunnel_2 is F_2, the Flight Size of the entire bonding
   tunnel is F_B, and the Bonding Reordering Buffer Size is B. Counting
   in packets, Figure 3.1 shows F_1 = 1 (packet 7), F_2 = 3 (packets 2,
   4 and 6) and F_B = 6 (packets 2 through 7), so B can be calculated as

      B = F_B - F_1 - F_2
        = 6 - 1 - 3
        = 2

   The bonding reordering buffer may bloat due to a large delay
   difference between the two tunnels. This bloating might lead to the
   violation of the timer limit and/or the buffer size limit. The egress
   node then has to deliver the violating packets, which will cause mass
   packet loss and retransmission of the carried TCP traffic. Throughput
   of the bonded tunnels will drop dramatically. Therefore, it is
   important to minimize the size of the bonding reordering buffer.

4. Related Work

   Several TCP congestion-avoidance algorithms are implemented for
   congestion control in the Internet. TCP New Reno, defined by
   [RFC6582], improves retransmission during the fast-recovery phase.
   In the absence of SACK [RFC2018], TCP New Reno responds to partial
   acknowledgments (ACKs that cover new data, but not all the data
   outstanding when loss was detected) and sends the next packet beyond
   the ACKed sequence number. TCP BIC [BIC] uses binary search to
   iteratively find the proper congestion window size in each RTT
   interval. [CUBIC] is a less aggressive and more systematic
   derivative of BIC, in which the window is a cubic function of the
   time since the last congestion event, so that window growth does not
   depend on RTT and RTT fairness is improved. Explicit Congestion
   Notification (ECN) [RFC3168] signals impending network congestion by
   setting a mark in the IP header instead of dropping packets. When
   the receiver echoes the congestion indication to the sender, the
   sender should reduce its transmission rate accordingly.

   However, neither traditional TCP congestion-avoidance algorithms nor
   the ECN mechanism is applicable to bonding tunnels, for the following
   reasons. Bonding tunnels adopt per-packet rather than per-flow load
   balancing. Bonding tunnels are established between a pair of network
   devices rather than host-to-host. The ingress node of bonding tunnels
   cannot alter the sending rate of the traffic. It does not keep send
   buffers, so it cannot retransmit lost packets either.

5. Load Rebalance

   Parameters such as the Round-Trip Time and the packet loss rate of
   each tunnel, the usage of the bonding reordering buffer and the data
   rate of the tunnels might be measured. The measurement could be done
   in either a one-way or a two-way manner. The measured information
   might be carried either by data packets or by control messages.

   Based on the measured information, the ingress node can judge whether
   one tunnel is already congested, in which case the proportion of
   traffic loaded onto it should be decreased.
   The ingress node can therefore adjust the traffic distribution
   function in a timely manner to realize a "load rebalance". This load
   rebalance helps the BANANA system make the best use of the bandwidth
   of the two tunnels and reduce the queue length in the bonding
   reordering buffer before the congestion control of the user's TCP
   traffic reacts.

5.1. Adaptive Splitting Ratio

   A coloring mechanism is used to achieve per-packet traffic
   distribution across bonded tunnels [GREbond] [GTPbond]. The coloring
   mechanism is defined by [RFC2697] and [RFC2698]. The Committed
   Information Rate (CIR) determines the traffic rate distributed into a
   given tunnel. The CIR of the primary tunnel is fixed while the CIR of
   the secondary tunnel can be tuned dynamically. The ingress node may
   monitor the latency of the two tunnels by measuring their RTT. If the
   latency difference between the two tunnels exceeds a pre-configured
   threshold (a value in the range from 0 to 100 ms), the CIR of the
   secondary tunnel is decreased (e.g., by a half). Otherwise, its CIR
   is additively increased, up to the maximum traffic rate of the
   secondary tunnel. As the ingress node tunes the CIR, the traffic
   splitting ratio changes adaptively as well.

6. Protocol Extensions

   TBD.

   The specification about protocol extensions in this document is
   intended to be applicable to various bonding tunnel protocols.

7. Security Considerations

   Security should be considered by specific bonding tunnel protocols.

8. IANA Considerations

   This document does not require any allocations by the IANA and
   therefore does not have any new IANA considerations.

9. References

9.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, DOI
             10.17487/RFC2119, March 1997.

   [RFC2697] Heinanen, J. and R. Guerin, "A Single Rate Three Color
             Marker", RFC 2697, DOI 10.17487/RFC2697, September 1999.

   [RFC2698] Heinanen, J. and R. Guerin, "A Two Rate Three Color
             Marker", RFC 2698, DOI 10.17487/RFC2698, September 1999.

   [RFC2890] Dommety, G., "Key and Sequence Number Extensions to GRE",
             RFC 2890, DOI 10.17487/RFC2890, September 2000.

   [RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The
             NewReno Modification to TCP's Fast Recovery Algorithm",
             RFC 6582, DOI 10.17487/RFC6582, April 2012.

   [CUBIC]   Rhee, I. and L. Xu, "CUBIC: A New TCP-Friendly High-Speed
             TCP Variant".

9.2. Informative References

   [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
             Selective Acknowledgment Options", RFC 2018, DOI
             10.17487/RFC2018, October 1996.

   [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of
             Explicit Congestion Notification (ECN) to IP", RFC 3168,
             DOI 10.17487/RFC3168, September 2001.

   [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
             Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [GREbond] Leymann, N., Heidemann, C., Zhang, M., et al., "GRE Tunnel
             Bonding", draft-zhang-gre-tunnel-bonding, work in progress.

   [GTPbond] Muley, P., Henderichx, W., Liang, G., and H. Liu, "Network
             based Bonding solution for Hybrid Access",
             draft-muley-network-based-bonding-hybrid-access, work in
             progress.

   [MIPbond] Seite, P., Yegin, A., and S. Gundavelli, "Multihoming
             support for Residential Gateways",
             draft-seite-dmm-rg-multihoming, work in progress.
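
   The following non-normative sketch illustrates the buffer-size
   estimate of Section 3 and the adaptive splitting ratio of Section
   5.1. It is not part of the specification. The names
   reorder_buffer_occupancy and SecondaryCirTuner, the 50 ms default
   threshold, the 1 Mbit/s additive step and the choice to start at the
   maximum rate are assumptions made for illustration only; the text
   above fixes only the 0 to 100 ms threshold range, the "e.g., by a
   half" decrease and the additive increase capped at the maximum rate
   of the secondary tunnel. RTT is used here as a stand-in for tunnel
   latency.

```python
from dataclasses import dataclass, field


def reorder_buffer_occupancy(f_bonding: int, f_tunnel1: int, f_tunnel2: int) -> int:
    """Estimate B = F_B - F_1 - F_2 (Section 3), all in the same unit."""
    return f_bonding - f_tunnel1 - f_tunnel2


@dataclass
class SecondaryCirTuner:
    """Hypothetical tuner for the secondary tunnel's CIR (Section 5.1)."""

    max_cir_bps: float                 # maximum traffic rate of the secondary tunnel
    threshold_ms: float = 50.0         # assumed; the text only gives 0..100 ms
    step_bps: float = 1_000_000.0      # assumed additive-increase step
    decrease_factor: float = 0.5       # "e.g., by a half"
    cir_bps: float = field(init=False)

    def __post_init__(self) -> None:
        # Assumption: start with the secondary tunnel fully usable.
        self.cir_bps = self.max_cir_bps

    def update(self, rtt_primary_ms: float, rtt_secondary_ms: float) -> float:
        """Apply one adjustment step and return the new CIR in bit/s."""
        latency_gap_ms = abs(rtt_secondary_ms - rtt_primary_ms)
        if latency_gap_ms > self.threshold_ms:
            # Latency difference too large: multiplicative decrease.
            self.cir_bps *= self.decrease_factor
        else:
            # Otherwise: additive increase, capped at the maximum rate.
            self.cir_bps = min(self.cir_bps + self.step_bps, self.max_cir_bps)
        return self.cir_bps


if __name__ == "__main__":
    # Values read from Figure 3.1, counted in packets.
    print(reorder_buffer_occupancy(6, 1, 3))
    tuner = SecondaryCirTuner(max_cir_bps=10_000_000.0)
    print(tuner.update(rtt_primary_ms=20.0, rtt_secondary_ms=100.0))
    print(tuner.update(rtt_primary_ms=20.0, rtt_secondary_ms=30.0))
```

   An ingress node would call update() once per RTT measurement
   interval and feed the returned CIR to its coloring mechanism; how
   often to measure and how large a step to use are deployment choices
   that this document does not fix.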

Author's Addresses

   Nicolai Leymann
   Deutsche Telekom AG
   Winterfeldtstrasse 21-27
   Berlin 10781
   Germany

   Phone: +49-170-2275345
   Email: n.leymann@telekom.de

   Cornelius Heidemann
   Deutsche Telekom AG
   Heinrich-Hertz-Strasse 3-7
   Darmstadt 64295
   Germany

   Phone: +4961515812721
   Email: heidemannc@telekom.de

   Geng Liang
   China Mobile
   32 Xuanwumen West Street,
   Xicheng District, Beijing, 100053,
   China

   EMail: gengliang@chinamobile.com

   Jun Shen
   China Telecom Co., Ltd
   109 West Zhongshan Ave, Tianhe District
   Guangzhou 510630
   P.R. China

   EMail: shenjun@gsta.com

   Mingui Zhang
   Huawei Technologies
   No.156 Beiqing Rd. Haidian District,
   Beijing 100095 P.R. China

   EMail: zhangmingui@huawei.com

   Lihao Chen
   Huawei Technologies
   No.156 Beiqing Rd. Haidian District,
   Beijing 100095 P.R. China

   EMail: lihao.chen@huawei.com

   Margaret Cullen
   Painless Security
   14 Summer St. Suite 202
   Malden, MA 02148 USA

   EMail: margaret@painless-security.com