Internet Engineering Task Force                               Jim Gettys
Internet-Draft                                  Alcatel-Lucent Bell Labs
Intended status: Informational                           August 26, 2011
Expires: February 27, 2012

                         IW10 Considered Harmful
                 draft-gettys-iw10-considered-harmful-00

Abstract

   The proposed change of the TCP initial window to 10 segments in
   draft-ietf-tcpm-initcwnd must be considered deeply harmful: not
   because the proposed change is evil taken in isolation, but because,
   combined with other changes in web browsers and web sites over the
   last decade, it makes the problem of transient congestion at a
   user's broadband connection two and a half times worse.  This result
   has been hidden by the bufferbloat already widespread in broadband
   connections.  Packet loss in isolation is no longer a useful metric
   of a path's quality.  The very drive to improve the latency of web
   page rendering is already destroying other low-latency applications,
   such as VOIP and gaming, and will prevent reliable rich real-time
   web applications such as those contemplated by the IETF rtcweb
   working group.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 27, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.
   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Discussion
   3.  Solutions
   4.  IANA Considerations
   5.  Security Considerations
   6.  Informative References
       Author's Address

1.  Introduction

   In the second half of the 2000's, competition among web browsers
   reappeared and changed focus from features alone to speed (meaning
   latency, at least as seen from data centers, which can be highly
   misleading), with the discovery (most clearly understood by Google)
   that web sites are stickier the faster (lower latency) they are.
   Perhaps Sergey Brin and Larry Page knew Stuart Cheshire at Stanford?
   [Cheshire].

   The problem, in short, is the multiplicative effect of the following
   factors (a back-of-the-envelope sketch of the combined burst follows
   this list):

   o  Browsers ignoring the RFC 2068 [RFC2068] and RFC 2616 [RFC2616]
      requirement to use no more than two simultaneous TCP connections,
      with current browsers often using 6, or sometimes many more,
      simultaneous TCP connections.

   o  "Sharded" web sites that sometimes deliberately hide the path to
      servers actually located in the same data center, to encourage
      browsers to use even more simultaneous TCP connections.

   o  The proposed change to the TCP initial congestion window, which
      allows each fresh TCP connection to send as much as 2.5 times as
      much data as in the past.

   o  Current broadband connections offering customers a single queue,
      which is usually badly over-buffered, hiding packet loss.

   o  Web pages containing large numbers of embedded objects.

   o  Web servers having such large memory caches and so much
      processing power when generating objects on the fly that
      responses are often, or usually, transmitted effectively
      instantaneously at line rate.

   The result can easily be a horrifyingly large impulse of packets
   sent effectively simultaneously to the user as a continuous packet
   train, landing in, and clogging, the one queue in their broadband
   connection and/or home router for extended periods of time.  Any
   chance for your VOIP call to work correctly, or for you to avoid
   being fragged in your game, evaporates.
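   As a rough illustration of this multiplication, the following sketch
   (assuming 6 connections, 1500-byte packets as implied by the table
   in Section 2, and no protocol overhead; the numbers are illustrative,
   not measurements) computes the burst a single page fetch can inject
   into the subscriber's one queue before any ACK clocking begins:

      # Back-of-the-envelope sketch of the combined effect of more
      # connections and a larger initial congestion window (IW).
      PACKET_BYTES = 1500           # assumed full-size packet, no overhead

      def initial_burst(connections, iw_segments):
          """Bytes that can be emitted before the first ACKs return."""
          return connections * iw_segments * PACKET_BYTES

      old = initial_burst(2, 4)     # RFC 2616 limit with ICW=4:  12000 bytes
      new = initial_burst(6, 10)    # common browser with IW10:   90000 bytes
      print(new / old)              # 7.5x larger burst into the same queue

   The 2.5x factor from IW10 alone thus compounds with the tripling (or
   worse) of simultaneous connections.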
2.  Discussion

   The original reasons for the two-connection rule in Section 8.1.4 of
   RFC 2068 [RFC2068] and RFC 2616 [RFC2616] are long gone.  In the
   1990's, dial-up modem banks were often badly under-buffered, and
   multiple simultaneous connections could easily cause excessive
   packet loss due to self-congestion, either on the dialup port itself
   or in the dialup bank overall.

   Since the 1990's, memory has become very cheap, and we now have the
   opposite problem: buffering in broadband equipment is much, much too
   large, larger than any sane amount, as shown by the Netalyzr
   [Netalyzr] and FCC data [Sundaresan], a phenomenon I christened
   "bufferbloat", as we lacked a good term for this problem.

   What is more, broadband equipment usually provides only a single,
   unmanaged, bloated queue to the subscriber; a large impulse of
   packets for a single user will block other packets from other
   applications of that user, or from other users who share that
   connection.  This buffering is so large that slow start is badly
   damaged (TCP will attempt to run many times faster than it should
   until packet loss finally brings it back under control), and
   congestion avoidance is no longer stable, as I discovered in 2010
   [Gettys].

   I had expected and hoped that high performance would be achieved via
   HTTP pipelining [HTTPPerf] and that web traffic would have longer
   TCP sessions.  HTTP pipelining is painful due to HTTP's lack of any
   multiplexing layer and its lack of response numbering to allow out-
   of-order responses; "poor man's multiplexing" is possible, but
   complex.  The benefits of pipelining to the length of TCP sessions
   are somewhat less than one might naively presume, as significantly
   fewer packets are ultimately necessary.  But HTTP pipelining has
   never seen widespread browser deployment (though it is supported by
   a high fraction of web servers).  You will seldom see packet loss
   from using many TCP connections simultaneously in today's Internet,
   as buffers are now so large they can absorb huge transients,
   sometimes even a megabyte or more.
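   To see why such buffering hides packet loss and destabilizes TCP's
   control loops, it helps to convert buffer occupancy into queueing
   delay.  The sketch below is only a rough illustration: the one-
   megabyte figure is the transient size mentioned above, and the rates
   are merely representative of shared 802.11 and broadband links.

      # Delay added by a standing queue: delay = bytes * 8 / line rate.
      def queue_delay_ms(buffer_bytes, mbps):
          return buffer_bytes * 8 / (mbps * 1_000_000) * 1000

      for mbps in (1, 10, 50):
          # A one-megabyte queue at representative access-link rates.
          print(f"{mbps:>2} Mbps: {queue_delay_ms(1_000_000, mbps):6.0f} ms")

   At 1 Mbps, a megabyte of queued data represents eight full seconds
   of delay; even at 50 Mbps it is 160 ms, far more than VOIP or gaming
   can tolerate, and far longer than the round-trip times TCP's
   feedback loop was designed around.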
   Web browsers (seeing no packet loss) changed from obeying the RFC
   2616 requirement of two TCP connections to using 6 or even 15 TCP
   connections from a browser to a web server.  What is more, some web
   sites, called "sharded" web sites, deliberately split themselves
   across multiple names to trick web browsers into even more
   profligate use of TCP connections, and there is no easy way for a
   web client to determine that a web site has been so engineered.

   A browser may then render a page with many embedded objects (e.g.,
   images).  Current web browsers will therefore simultaneously open or
   reuse 6, or often many more, TCP connections at once, and the
   initial congestion window's worth of packets on each of those
   connections may be sent from the same data center simultaneously to
   a user's broadband connection.  These packets enter the single queue
   present in most broadband systems between the Internet and the home
   and, with no QOS or other fair queuing present, induce transient
   latency; your VOIP or gaming packet will be stuck behind this burst
   of web traffic until the burst drains.  Similarly, current home
   routers often lack any QOS or sophisticated queuing to ensure
   fairness between different users.  The proposal by Chu, et al. [Chu]
   to raise the initial congestion window from four to ten makes the
   blocking, and the resulting latency and jitter problem, up to 2.5
   times worse.

   Note that broadband equipment is not the only overbuffered equipment
   in most users' paths.  Home routers, 3G wireless, and users'
   operating systems are usually even worse than broadband equipment.
   In a user's home, whenever the wireless bandwidth happens to be
   below that of the broadband connection, the bottleneck link is the
   wireless hop, and so the problem may occur there rather than in the
   broadband connection.  It is the rate of the bottleneck link that
   matters, not the theoretical bandwidth of the links: 802.11g is at
   best 20 Mbps, and often much worse.  Other bottleneck points in
   users' paths may also lack AQM.

   I believe the performance analysis in draft-ietf-tcpm-initcwnd is
   flawed not by being incorrect in what it presents, but by
   overlooking the latency and jitter inflicted on other traffic
   sharing the broadband link, due to the large buffering in these
   links and their typically single queue.  The issue is the damage the
   IW change would do to other real-time applications sharing that link
   (including rtcweb applications), or what those sharing that link do
   to you.

   Simple arithmetic to compute the induced transient latency, even
   ignoring all overhead, comes up with scary results:

   +------+---------+--------+---------+---------+--------+---------+
   |  #   |  ICW=4  |  Time  |  Time   |  ICW=10 |  Time  |  Time   |
   | conn | (bytes) | @1Mbps | @50Mbps | (bytes) | @1Mbps | @50Mbps |
   +------+---------+--------+---------+---------+--------+---------+
   |   2  |   12000 |   96ms |  1.92ms |   30000 |  240ms |   4.8ms |
   |   6  |   36000 |  288ms |  5.76ms |   90000 |  720ms |  14.4ms |
   |  15  |   90000 |  720ms |  14.4ms |  225000 | 1800ms |    36ms |
   |  30  |  180000 | 1440ms |  28.8ms |  450000 | 3600ms |    72ms |
   +------+---------+--------+---------+---------+--------+---------+

                      Table 1: Unloaded Latency

   1 Mbps may be your fair share of a loaded 802.11 link.  50 Mbps is
   near the top end of today's broadband.  Available bandwidths in
   other parts of the world are often much, much lower than in parts of
   the world where broadband has been deployed.
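   The table entries follow directly from burst size divided by line
   rate; the short sketch below (assuming 1500-byte packets and
   ignoring all overhead, exactly as the table does) reproduces them
   and can be re-run for other connection counts or link rates:

      PACKET_BYTES = 1500   # per-packet size assumed by Table 1

      def burst_bytes(connections, icw):
          # Total data all connections may emit in their initial windows.
          return connections * icw * PACKET_BYTES

      def drain_ms(nbytes, mbps):
          # Time for that burst to drain through the bottleneck link.
          return nbytes * 8 / (mbps * 1_000_000) * 1000

      for conns in (2, 6, 15, 30):
          for icw in (4, 10):
              b = burst_bytes(conns, icw)
              print(conns, icw, b,
                    round(drain_ms(b, 1)), round(drain_ms(b, 50), 2))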
   Simple experiments over 50 Mbps home cable service against Google
   Images confirm latencies that reach, or sometimes double, those in
   the table.  Steady-state competing TCP traffic will multiply these
   times correspondingly; even at 50 Mbps, reliable, low-latency VOIP
   can therefore be problematic.  From this table, it becomes obvious
   that QOS in shared wireless networks has become essential, if only
   because of this change in web browser behavior.  Note that the two-
   connection rule still results in roughly 100 ms of latency on 1 Mbps
   connections, which is already very problematic for VOIP because of
   the jitter it induces.  Two TCP connections are capable of driving a
   megabit link at saturation over most paths today, even from a cold
   start with ICW=4.

   In the effort to maximise speed (as seen by a data center), web
   browsers and servers have turned web traffic into a delta function,
   congesting the user's queue for extended periods.  Since the
   broadband edge is badly over-buffered, as first shown by Netalyzr
   [Netalyzr], packets are usually not lost but instead fill the one
   queue separating people from the rest of the Internet until they
   drain.

   Many carriers' telephony services are not blocked by this web
   traffic, since the carriers have generally provisioned voice
   channels independently of data service; but most competing services,
   such as Vonage or Skype, will be blocked, as they must use the
   single, oversized queue.  While I do not believe this advantage was
   by design, it is an effect of bufferbloat and of current broadband
   supporting only a single queue, at most accelerating ACKs ahead of
   other bulk data packets.  In the presently deployed broadband
   infrastructure, any other queues are usually unavailable for use by
   time-sensitive traffic, and DiffServ [RFC3260] is not implemented in
   broadband head-end equipment.  Time-sensitive packets therefore
   share the same queue as non-time-sensitive bulk data (HTTP) traffic.

3.  Solutions

   If HTTP pipelining were deployed, it would result in lower actual
   times for most users: fewer bytes are needed due to sharing packets
   among objects and requests, packet overhead and ACK traffic are much
   lower, and TCP congestion behavior is significantly better.  While
   increasing the initial window may someday indeed make sense, it is
   truly frightening to raise the ICW during this arms race, given the
   already deployed HTTP/1.1 implementations.  SPDY [SPDY] should have
   similar (or better) results, but requires server-side support that
   will take time to develop and deploy, whereas most deployed web
   servers have supported pipelining for over a decade (sometimes with
   bugs, which is part of why it is painful to deploy web client HTTP
   pipelining).

   A full discussion of solutions that would improve latency for
   general web browsing without destroying real-time applications is
   beyond the scope of this document.  I note a few quickly (they are
   not mutually exclusive) that can and should be pursued.  They all
   have differing time scales and costs; all are desirable in my view,
   but a full discussion would be much more than I can cover here.

   o  Deployment of HTTP/1.1 pipelining (with a reduction of the number
      of simultaneous connections back to RFC 2616 levels).

   o  Deployment of SPDY.

   o  DiffServ deployment in the broadband edge and its use by
      applications (a minimal socket-marking sketch follows this list).

   o  DiffServ deployment in home routers (which, often unbeknownst to
      those outside the gaming industry, has already partially occurred
      due to its inclusion in the default Linux pfifo_fast queuing
      discipline).

   o  Some sort of "per user" or "per machine" queuing mechanism on
      broadband connections, so that complete starvation of service for
      extended periods can be avoided.
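   As one concrete illustration of the "use by applications" bullet
   above, an application can at least request better treatment by
   marking its packets with a DSCP.  The sketch below (the peer address
   and port are placeholders, and whether anything along today's
   broadband path honors the marking is, as noted above, doubtful)
   marks a UDP socket with Expedited Forwarding:

      import socket

      DSCP_EF = 46             # Expedited Forwarding code point
      TOS_EF = DSCP_EF << 2    # DSCP sits in the top six bits of the TOS byte

      sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      # IP_TOS is exposed on Linux and most Unix-like systems.
      sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_EF)
      sock.sendto(b"voice frame", ("192.0.2.1", 5004))   # placeholder peer

   Home routers running the default Linux pfifo_fast queuing discipline
   select their priority band from this TOS byte, which is why the
   partial home-router deployment mentioned above has already quietly
   occurred.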
   At a deeper and more fundamental level, individual applications
   (such as web browsers) may game the network, with consequent bad
   results for the user (or for other users sharing that edge
   connection), and with the advent of WebSockets, even individual web
   applications may similarly game the network's behavior.  With the
   huge dynamic range of today's edge environments, we have no good way
   to know how large an initial impulse of packets a server may safely
   send into the network in any given situation.  Today there is no
   disincentive for applications that abuse the network.  Congestion
   exposure mechanisms such as ConEx [ConEx] are badly needed, as is
   some way to enable users (and their applications, on their behalf)
   to be aware of, and react to, badly behaved applications.

4.  IANA Considerations

   This memo includes no request to IANA.

5.  Security Considerations

   The current practice of web browsers, in concert with "sharded" web
   sites, the changes to the initial congestion window, and the
   currently deployed broadband infrastructure, can be considered a
   denial of (low latency) service attack on consumers' broadband
   service.

6.  Informative References

   [Cheshire]   Cheshire, "It's the Latency, Stupid", 1996.

   [Chu]        Chu, Dukkipati, Cheng, and Mathis, "Increasing TCP's
                Initial Window", draft-ietf-tcpm-initcwnd (work in
                progress), 2011.

   [ConEx]      Briscoe, "Congestion Exposure (ConEx), Re-feedback and
                Re-ECN", 2005.

   [Gettys]     Gettys, "Whose house is of glasse, must not throw
                stones at another", January 2011.

   [HTTPPerf]   Nielsen, Gettys, Baird-Smith, Prud'hommeaux, Lie, and
                Lilley, "Network Performance Effects of HTTP/1.1, CSS1,
                and PNG", June 1997.

   [Netalyzr]   Kreibich, Weaver, Nechaev, and Paxson, "Netalyzr:
                Illuminating the Edge Network", November 2010.

   [RFC2068]    Fielding, R., Gettys, J., Mogul, J., Nielsen, H., and
                T. Berners-Lee, "Hypertext Transfer Protocol --
                HTTP/1.1", RFC 2068, January 1997.

   [RFC2616]    Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
                Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
                Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC3260]    Grossman, D., "New Terminology and Clarifications for
                Diffserv", RFC 3260, April 2002.

   [SPDY]       Belshe, "SPDY: An Experimental Protocol for a Faster
                Web", 2011.

   [Sundaresan] Sundaresan, de Donato, Feamster, Teixeira, Crawford,
                and Pescape, "Broadband Internet Performance: A View
                From the Gateway", Proceedings of SIGCOMM 2011,
                August 2011.

Author's Address

   Jim Gettys
   Alcatel-Lucent Bell Labs
   21 Oak Knoll Road
   Carlisle, Massachusetts  01741
   USA

   Phone: +1 978 254-7060
   Email: jg@freedesktop.org