TCP Maintenance Working Group                                  M. Mathis
Internet-Draft                                               Google, Inc
Intended status: Experimental                              July 15, 2012
Expires: January 16, 2013

    Laminar TCP and the case for refactoring TCP congestion control
                  draft-mathis-tcpm-tcp-laminar-01.txt

Abstract

   The primary state variables used by all TCP congestion control
   algorithms, cwnd and ssthresh, are heavily overloaded, carrying
   different semantics in different states.  This leads to excess
   implementation complexity and poorly defined behaviors under some
   combinations of events, such as application stalls during loss
   recovery.  We propose a new framework for TCP congestion control and
   a recasting of the current standard algorithms to use new state
   variables.  This new framework will not generally change the
   behavior of any of the primary congestion control algorithms when
   they are invoked in isolation.  It will permit new algorithms with
   better behaviors in many corner cases, such as when two distinct
   primary algorithms are invoked concurrently.  It will also foster
   the creation of new algorithms to address some events that are
   poorly treated by today's standards.  For the vast majority of
   traditional algorithms the transformation to the new state variables
   is completely straightforward.  However, the resulting
   implementation is likely to be technically in violation of existing
   TCP standards, even if it is fully compliant with their principles
   and intent.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.
   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 16, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Overview of the new algorithm
   3.  Standards Impact
   4.  Meta Language
   5.  State variables and definitions
   6.  Updated Algorithms
     6.1.  Congestion avoidance
     6.2.  Proportional Rate Reduction
     6.3.  Restart after idle, Congestion Window Validation and Pacing
     6.4.  RTO and F-RTO
     6.5.  Undo
     6.6.  Control Block Interdependence
     6.7.  New Reno
   7.  Example Pseudocode
   8.  Compatibility with existing implementations
   9.  Security Considerations
   10. IANA Considerations
   11. References
   Author's Address

1.  Introduction

   The primary state variables used by all TCP congestion control
   algorithms, cwnd and ssthresh, are heavily overloaded, carrying
   different semantics in different states.  Having multiple algorithms
   share the same state variables leads to excess complexity and
   conflicting correctness constraints, and makes it unreasonably
   difficult to implement, test, and evaluate new algorithms.

   We propose a new framework for TCP congestion control that
   separates transmission scheduling, which determines precisely when
   data is sent, from pure congestion control, which determines the
   amount of data to be sent in each RTT.  This separation is
   implemented with new state variables and greatly simplifies the
   interactions between the two subsystems.  It permits a vast range
   of new algorithms that are not feasible with the current
   parameterization.

   This note describes the new framework and presents a preliminary
   mapping between current standards and new algorithms based on the
   new state variables.
   At this point the new algorithms are not fully specified, and many
   still have unconstrained design choices.  In most cases, our goal
   is to precisely mimic today's standard TCP, at least as far as its
   well-defined primary behaviors are concerned.  In general, it is a
   non-goal to mimic behaviors in poorly defined corner cases, or in
   other cases where the standard behaviors are viewed as problematic.

   The framework is called Laminar because one of its design goals is
   to eliminate unnecessary turbulence introduced by TCP itself.

2.  Overview of the new algorithm

   The new framework separates transmission scheduling, which
   determines precisely when data is sent, from pure congestion
   control, which determines the total amount of data sent in any
   given RTT.

   The default algorithm for transmission scheduling is a strict
   implementation of Van Jacobson's packet conservation principle
   [Jacobson88].  Data arriving at the receiver causes ACKs, which in
   turn cause the sender to transmit an equivalent quantity of data
   back into the network.  The primary state variable is implicit in
   the quantity of data and ACKs circulating in the network.  This
   state is observed through an improved "total_pipe" estimator, which
   is based on "pipe" as described in RFC 3517 [RFC3517], but which
   also includes the quantity of data reported by the current ACK and
   any pending transmissions that have passed congestion control but
   are waiting for other events such as TSO.

   A new state variable, CCwin, is the primary congestion control
   state variable.  It is updated only by the congestion control
   algorithms, which are concerned with detecting and regulating the
   overall level of congestion along the path.  CCwin is TCP's best
   estimate of an appropriate average window size.  In general, it
   rises when the network seems to be underfilled and is reduced in
   the presence of congestion signals, such as loss, ECN marks, or
   increased delay.  Although CCwin resembles cwnd, cwnd is overloaded
   and used by multiple algorithms (such as burst suppression) with
   different and sometimes conflicting goals.

   Any time total_pipe differs from CCwin, the transmission scheduling
   algorithm slightly adjusts the number of segments sent in response
   to each ACK.  Slow start and Proportional Rate Reduction [PRRid]
   are both embedded in the transmission scheduling algorithm.

   If CCwin is larger than total_pipe, the default algorithm to grow
   total_pipe is for each ACK to trigger one segment of additional
   data.  This is essentially an implicit slowstart, but it is gated
   by the difference between CCwin and total_pipe, rather than by the
   difference between cwnd and ssthresh.  In the future, additional
   algorithms, such as pacing, might be used to raise total_pipe.

   During Fast Retransmit, the congestion control algorithm, such as
   CUBIC, generally reduces CCwin in a single step.  Proportional Rate
   Reduction [PRRid] is used to gradually reduce total_pipe to agree
   with CCwin.  PRR was based on Laminar principles, so its
   specification has many parallels to this document.

   Connection startup is accomplished as follows: CCwin is set to
   MAX_WIN (akin to ssthresh), and IW segments are transmitted.  The
   ACKs from these segments trigger additional data transmissions, and
   slowstart proceeds as it does today.  The very first congestion
   event is a special case because there is no prior value for CCwin.
   By default, and on the first congestion event only, CCwin would be
   set from total_pipe, and then standard congestion control is
   invoked.
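   To make the division of labor concrete, the core per-ACK scheduling
   decision can be sketched as follows.  This is a simplified preview
   of the fuller pseudocode in Section 7, using the same variable
   names; it is illustrative, not a complete specification.

      // On every ACK: packet conservation by default, gently
      // steered toward CCwin.
      sndcnt = DeliveredData                 // conserve packets
      if total_pipe < CCwin:
         sndcnt += MIN(DeliveredData, 2*MSS) // implicit slowstart
      else if total_pipe > CCwin:
         // sndcnt is reduced by PRR; see Sections 6.2 and 7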
   The primary advantage of the Laminar framework is that by
   partitioning congestion control and transmission scheduling into
   separate subsystems, each is subject to simpler design constraints,
   making it far easier to develop many new algorithms that are not
   feasible with the current organization of the code.

3.  Standards Impact

   Since we are proposing to refactor existing standards into new
   state variables, all of the current congestion control standards
   documents will potentially need to be reviewed.  Although there are
   roughly 60 RFCs that mention cwnd or ssthresh, most need only self-
   evident reinterpretation.  Others, such as MIBs, warrant a sentence
   or two clarifying how to map CCwin and total_pipe onto existing
   specifications that use cwnd and ssthresh.  There are, however,
   several RFCs that explicitly address the interplay between cwnd and
   ssthresh in today's TCP, including RFC 5681 [RFC5681], RFC 5682
   [RFC5682], RFC 4015 [RFC4015], and RFC 6582 [RFC6582].  These need
   to be reviewed more carefully.  In most cases the algorithms can
   easily be restated under the Laminar framework.  Others, such as
   Congestion Window Validation [RFC2861], potentially require
   redesign.

   This document does not propose to change the TCP-friendly paradigm
   [RFC2914].  By default, all updated algorithms using these new
   state variables would have behaviors similar to current TCP
   implementations; however, over the longer term the intent is to
   permit new algorithms that are not feasible today.  For example,
   since CCwin does not directly affect transmissions during recovery,
   it is straightforward to permit recovery ACKs to raise CCwin even
   while PRR is reducing total_pipe.  This facilitates so-called
   "fluid model" algorithms, which further decouple congestion control
   from the details of the TCP protocol.

   But even without these advanced algorithms, we do anticipate some
   second order effects.  For example, while testing PRR it was
   observed that suppressing bursts by slightly delaying transmissions
   can improve average performance, even though in a strict sense the
   new algorithm is less aggressive than the old [IMC11PRR].

4.  Meta Language

   We use the following terms when describing algorithms and their
   alternatives:

   Standard - The current state of the art, including both formal
   standards and widely deployed algorithms that have come into
   standard use, even though they may not be formally specified.
   [Although PRR does not yet technically meet these criteria, we
   include it here.]

   default - The simplest or most straightforward algorithm that fits
   within the Laminar framework, for example the implicit slowstart
   invoked whenever total_pipe is less than CCwin.  This term does not
   make a statement about the relative aggressiveness or any other
   properties of the algorithm, except that it is a reasonable choice
   and straightforward to implement.

   conformant - An algorithm that can produce the same packet trace
   as a TCP implementation that strictly conforms to the current
   standards.

   mimic - An algorithm constructed to be conformant to standards.
   opportunity - An algorithm that can do something better than the
   standard algorithm, typically better behavior in corner cases that
   are either not well specified or where the standard behavior is
   viewed as less than ideal.

   more/less aggressive - Any algorithm that sends segments earlier/
   later than another (typically conformant) algorithm under identical
   sequences of events.  Note that this is an evaluation of the packet
   level behavior, and does not reflect any higher order effects.

   Observed performance - A statement about algorithm performance
   based on a measurement study or other observations over a
   significant sample of authentic Internet paths.  For example, an
   algorithm might have an observed data rate that differs from that
   of another (typically conformant) algorithm.

   application stall - The application is failing to keep up with TCP:
   either the sender is running out of data to send, or the receiver
   is not reading it fast enough.  When there is an application stall,
   congestion control does not regulate data transmission, and some of
   the protocol events are triggered by application reads or writes,
   as appropriate.

5.  State variables and definitions

   CCwin - The primary congestion control state variable.

   DeliveredData - The total number of bytes that the current ACK
   indicates have been delivered to the receiver.  (See [PRRid] for
   more details.)

   total_pipe - The total quantity of circulating data and ACKs.  In
   addition to RFC 3517 pipe, it includes DeliveredData for the
   current ACK, plus any data held for delayed transmission, for
   example to permit a later TSO transmission.

   sndcnt - The quantity of data to be sent in response to the current
   ACK or other event.

6.  Updated Algorithms

   This section surveys standard, common, and proposed algorithms, and
   how they might be reimplemented under the Laminar framework.

6.1.  Congestion avoidance

   Under the Laminar framework the loss recovery mechanism does not,
   by default, interfere with the primary congestion control
   algorithms.  The CCwin state variable is updated only by the
   algorithms that decide how much data to send on successive round
   trips.  For example, standard Reno AIMD congestion control
   [RFC5681] can be implemented by raising CCwin by one segment every
   CCwin worth of ACKs (once per RTT) and halving it on every loss or
   ECN signal (e.g., CCwin = CCwin/2); see the sketch below.  During
   recovery, the transmission scheduling part of the Laminar framework
   makes the necessary adjustments to bring total_pipe into agreement
   with CCwin, without tampering with CCwin.

   This separation between computing CCwin and transmission scheduling
   will enable new classes of congestion control algorithms, such as
   fluid models that adjust CCwin on every ACK, even during recovery.
   This is safe because raising CCwin does not directly trigger any
   transmissions; it just steers the transmission scheduling closer to
   the end of recovery.  Fluid models have a number of advantages,
   such as simpler closed form mathematical representations, and they
   are intrinsically more tolerant of reordering, since non-recovery
   disordered states don't inhibit window growth.

   Investigating alternative algorithms and their impact is out of
   scope for this document.
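   As a concrete illustration, the Reno AIMD rule described above can
   be expressed purely on CCwin.  The sketch uses appropriate byte
   counting for the additive increase; it is illustrative, not a
   complete specification.

      // Once per ACK, outside recovery, not application limited:
      CCwin += MSS*MSS/CCwin   // about one segment per CCwin of ACKs

      // On entering recovery (loss or ECN signal):
      CCwin = CCwin/2          // multiplicative decrease; recovery
                               // scheduling (PRR) brings total_pipe
                               // down to match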
   It is important to note that while our goal here is not to alter
   the TCP-friendly paradigm, Laminar does not include any implicit or
   explicit mechanism to prevent a Tragedy of the Commons.  However,
   see the comments in Section 9.

   The initial slowstart does not use CCwin, except that CCwin starts
   at the largest possible value.  The transmission scheduling
   algorithms are responsible for performing the slowstart.  On the
   first loss it is necessary to compute a reasonable CCwin from
   total_pipe.  Ideally, we might save total_pipe at the time each
   segment is scheduled for transmission, and use the saved value
   associated with the lost segment to prime CCwin.  However, this
   approach requires extra state attached to every segment in the
   retransmit queue.  A simpler approach is to have a mathematical
   model of the slowstart, and to prime CCwin from total_pipe at the
   time the loss is detected, scaled down by the effective slowstart
   multiplier (e.g., 1.5 or 2).  In either case, once CCwin is primed
   from total_pipe, it is typically appropriate to invoke the
   reduction-on-loss function, to reduce it again per the congestion
   control algorithm.

   Nearly all congestion control algorithms need some mechanism to
   prevent CCwin from growing while it is not regulating
   transmissions, e.g., during prolonged application stalls.

6.2.  Proportional Rate Reduction

   Since PRR [PRRid] was designed with Laminar principles in mind,
   updating it is a straightforward variable substitution: CCwin
   replaces ssthresh, and RecoverFS is initialized from total_pipe at
   the beginning of recovery.  Thus PRR provides a gradual window
   reduction from the prior total_pipe down to the new CCwin.

   There is one important difference from the current standards: CCwin
   is computed solely on the basis of the prior value of CCwin.
   Compare this to RFC 5681, which specifies that the congestion
   control function is computed on the basis of the FlightSize (e.g.,
   ssthresh = FlightSize/2).  This change from the prior standard
   completely alters how application stalls interact with congestion
   control.

   Consider what happens if there is an application stall for most of
   the RTT just before a Fast Retransmit.  Under Laminar it is likely
   that CCwin will be set to a value that is larger than total_pipe,
   and, subject to available application data, PRR will go directly
   into slowstart mode to raise total_pipe up to CCwin.  Note that the
   final CCwin value does not depend on the duration of the
   application stall.

   With standard TCP, any application stall reduces the final value of
   cwnd at the end of recovery.  In some sense, application stalls
   during recovery are treated as though they were additional losses,
   and they have a detrimental effect on the connection data rate that
   lasts far longer than the stall itself.

   If there are no application stalls, the standard and Laminar
   variants of the PRR algorithm should have identical behaviors.
   Although it is tempting to characterize Laminar as more aggressive
   than the standards, it would be more apropos to characterize the
   standard as excessively timid under certain combinations of
   overlapping events that are not well represented by benchmarks or
   models.
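   The contrast reduces to the window-reduction step performed on
   entering recovery.  The following sketch (assuming Reno-style
   halving) highlights the difference described above:

      // Standard (RFC 5681): an application stall shrinks
      // FlightSize, and therefore the window after recovery.
      ssthresh = FlightSize/2

      // Laminar: the new window depends only on the prior CCwin, so
      // a stall just before Fast Retransmit does not shrink it.
      CCwin = CCwin/2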
6.3.  Restart after idle, Congestion Window Validation and Pacing

   Decoupling congestion control from transmission scheduling permits
   us to develop new algorithms to raise total_pipe to CCwin after an
   application stall or other events.  Although it was stated earlier
   that the default transmission scheduling algorithm for raising
   total_pipe is an implicit slowstart, there is an opportunity for
   better algorithms.

   We imagine a class of hybrid transmission scheduling algorithms
   that use a combination of pacing and slowstart to reestablish TCP's
   self clock (see [Visweswaraiah99]).  For example, whenever
   total_pipe is significantly below CCwin, RTT and CCwin can be used
   to directly compute a pacing rate.  We suspect that pacing at the
   previous full rate will prove to be somewhat brittle, sometimes
   causing excessive loss and yielding erratic results.  It is more
   likely that a hybrid strategy will work better and be better for
   the network, for example by pacing at some fraction (1/2 or 1/4) of
   the prior rate until total_pipe reaches some fraction of CCwin
   (e.g., CCwin/2) and then using conventional slowstart to bring
   total_pipe the rest of the way up to CCwin.

   This is far less aggressive than standard TCP without cwnd
   validation [RFC2861], or when the application stall was shorter
   than one RTO, since the standards permit TCP to send a full cwnd
   size burst in these situations.  It is potentially more aggressive
   than the conventional slowstart invoked by cwnd validation when the
   application stall is longer than several RTOs.  Both standard
   behaviors in these situations have always been viewed as
   problematic, because interface rate bursts are clearly too
   aggressive and a full slowstart is clearly too conservative.
   Mimicking either is a non-goal when there is ample opportunity to
   find a better compromise.

   Although, strictly speaking, any new transmission scheduling
   algorithms are independent of the Laminar framework, they are
   expected to have substantially better behavior in many common
   environments, and as such they strongly motivate the effort
   required to refactor TCP implementations and standards.

6.4.  RTO and F-RTO

   We are not proposing any changes to the RTO timer or to the F-RTO
   [RFC5682] algorithm used to detect spurious retransmissions.  Once
   it is determined that segments were lost, CCwin is updated to a new
   value as determined by the congestion control function, and the
   Laminar implicit slowstart is used to clock out (re)transmissions.
   Once all holes are filled, hybrid paced transmissions can be used
   to reestablish TCP's self clock at the new data rate.  This can be
   the same hybrid pacing algorithm as is used to recover the self
   clock after application stalls.

   Note that as long as there is non-contiguous data at the receiver,
   the retransmission algorithms require timely SACK information to
   make proper decisions about which segments to send.  Pacing during
   loss recovery is not recommended without further investigation.

6.5.  Undo

   Since CCwin is not used to implement transmission scheduling, undo
   is trivial: CCwin can just be set back to its prior value, and the
   transmission scheduling algorithm will transmit more (or less) data
   as needed.  It is useful to note that the discussion about ssthresh
   in [RFC4015] also applies to CCwin in TCP Laminar.  Some people
   might find it useful to think of CCwin as being equivalent to
   MAX(ssthresh, cwnd).
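   A minimal sketch of this trivial undo follows.  The variable
   prior_CCwin is hypothetical; it merely records the value to be
   restored.

      // On entering recovery:
      prior_CCwin = CCwin
      CCwin = CCwin/2             // or other reduction on loss

      // On determining that the retransmissions were spurious:
      CCwin = prior_CCwin         // transmission scheduling then
                                  // raises total_pipe as needed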
   There is an opportunity to do substantially better than current
   algorithms.  Undo can be implemented by saving the arithmetic
   difference between the current and prior values of CCwin, and then
   adding this delta back into CCwin when all retransmissions are
   deemed to be spurious.  If the congestion avoidance algorithm is
   linear (or can be linearized), and is mathematically transportable
   across undo, it is possible to design a congestion control
   algorithm that is completely immune to reordering, in the sense
   that the overall evolution of CCwin is not affected by low level
   reordering, even if it is pervasive.  This is an area for future
   research.

6.6.  Control Block Interdependence

   Under the Laminar framework, congestion control state can be easily
   shared between connections [RFC2140].  An ensemble of connections
   can each maintain their own total_pipe (partial_pipe?), which in
   aggregate tracks a single common CCwin.  A master transmission
   scheduler allocates permission to send (sndcnt) to each of the
   constituent connections on the basis of the difference between
   CCwin and the aggregate total_pipe, and a fairness or capacity
   allocation policy that balances the flows.  Note that ACKs on one
   connection in an ensemble might be used to clock transmissions on
   another connection, and that following a loss, the window
   reductions can be allocated to flows other than the one
   experiencing the loss.

6.7.  New Reno

   The key to making Laminar function well without SACK is having good
   estimators for DeliveredData and total_pipe.  By definition, every
   duplicate ACK indicates that one segment has arrived at the
   receiver and total_pipe has fallen by one.  On any ACK that
   advances snd.una, total_pipe can be updated from snd.nxt - snd.una,
   and DeliveredData is the change in snd.una, minus the sum of the
   estimated DeliveredData of the preceding duplicate ACKs.  As with
   SACK, the total DeliveredData must agree with the overall forward
   progress over time, as illustrated in the sketch below.
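   A sketch of these estimators, in the style of the pseudocode in
   Section 7 (dup_bytes is a hypothetical helper variable):

      // On each duplicate ACK:
      DeliveredData = 1*MSS          // one segment reached receiver
      dup_bytes += 1*MSS

      // On an ACK that advances snd.una:
      DeliveredData = delta(snd.una) - dup_bytes
                                     // may need a floor at 0
      dup_bytes = 0
      total_pipe = snd.nxt - snd.una // plus sndBank, as in Section 7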
7.  Example Pseudocode

   On startup:

      CCwin = MAX_WIN
      sndBank = IW

   On every ACK:

      DeliveredData = delta(snd.una) + delta(SACKd)
      pipe = (RFC 3517 pipe algorithm)
      total_pipe = pipe + DeliveredData + sndBank
      sndcnt = DeliveredData          // Default # transmissions

      if new_recovery():
         if CCwin == MAX_WIN:
            CCwin = total_pipe/2      // Prime CCwin, first time only
         CCwin = CCwin/2              // Reno congestion control
         prr_delivered = 0            // Total bytes delivered during
                                      // recovery
         prr_out = 0                  // Total bytes sent during
                                      // recovery
         RecoverFS = total_pipe

      if !in_recovery() && !application_limited():
         CCwin += MSS*MSS/CCwin       // Appropriate byte counting

      prr_delivered += DeliveredData  // noop if not in recovery
      if total_pipe > CCwin:
         // Proportional Rate Reduction
         sndcnt = CEIL(prr_delivered * CCwin / RecoverFS) - prr_out

      else if total_pipe < CCwin:
         if in_recovery():
            // PRR Slow Start Reduction Bound
            limit = MAX(prr_delivered - prr_out, DeliveredData) + SMSS
            sndcnt = MIN(CCwin - total_pipe, limit)
         else:
            // Slowstart with appropriate byte counting
            inc = MIN(DeliveredData, 2*MSS)
            sndcnt = DeliveredData + inc

      // Cue the transmission machinery
      sndBank += sndcnt
      limit = maxBank()
      if sndBank > limit:
         sndBank = limit
      tcp_output()

   For any data transmission or retransmission:

      tcp_output():
         while sndBank && tso_ok():
            len = sendsomething()
            sndBank -= len
            prr_out += len            // noop if not in recovery

8.  Compatibility with existing implementations

   On a segment by segment basis, the above algorithm is [believed to
   be] fully conformant with, or less aggressive than, the standards
   under all conditions.

   However, this condition is not sufficient to guarantee that the
   observed performance can't be better than the standards.  Consider
   an application that keeps TCP in bulk mode nearly all of the time,
   but has occasional pauses that last some fraction of one RTT.  A
   fully conformant TCP would be permitted to "catch up" by sending a
   partial window burst at full interface rate.  On some networks,
   such bursts might be very disruptive, causing otherwise unnecessary
   packet losses and corresponding cwnd reductions.

   In Laminar the default algorithm would be slowstart.  Other
   algorithms that might cause the same bursts would be permitted,
   although they are not described here.  A better algorithm would be
   to pace the data at (some fraction of) the prior rate.  Neither
   pacing nor slowstart is likely to cause unnecessary losses, and as
   was observed while testing PRR, being less aggressive at the
   segment level has the potential to increase the observed
   performance [IMC11PRR].  In this scenario Laminar with pacing has
   the potential to outperform both of the behaviors described by the
   standards.

9.  Security Considerations

   The Laminar framework does not change the risk profile for TCP (or
   other transport protocols) themselves.

   However, the complexity of the current algorithms, as embodied in
   today's code, presents a substantial barrier to people wishing to
   cheat "TCP friendliness".  It is a fairly well known and easily
   rediscovered result that custom tweaks to make TCP more aggressive
   in one environment generally make it fragile, so that it performs
   less well across the extreme diversity of the Internet.  This
   negative outcome is a substantial intrinsic barrier to wide
   deployment of rogue congestion control algorithms.
   A direct consequence of the changes proposed in this note,
   decoupling congestion control from the other algorithms, is likely
   to reduce that barrier to rogue algorithms.  However, this
   separation and the ability to introduce new congestion control
   algorithms is a key part of the motivation for this work.

   It is also important to note that web browsers have already largely
   defeated TCP's ability to regulate congestion by opening many
   concurrent connections.  When a Web page contains content served
   from multiple domains (the norm these days), all modern browsers
   open between 35 and 60 connections (see
   http://www.browserscope.org/?category=network).  This is the Web
   community's deliberate workaround for TCP's perceived poor
   performance and its inability to make full use of certain types of
   consumer grade networks.  As a consequence, the transport layer has
   already lost a substantial portion of its ability to regulate
   congestion.  It was not anticipated that the tragedy of the commons
   in Internet congestion would be driven by competition between
   applications rather than between TCP implementations.

   In the short term, we can continue to try to use standards and peer
   pressure to moderate the rise in overall congestion levels;
   however, the only real solution is to develop mechanisms in the
   Internet itself to apply some sort of backpressure to overly
   aggressive applications and transport protocols.  We need to
   redouble efforts by the ConEx WG and others to develop mechanisms
   to inform policy with information about congestion and its causes.
   Otherwise we have a looming tragedy of the commons, in which TCP
   has only a minor role.

   Implementers that change Laminar from counting bytes to counting
   segments have to be cautious about the effects of ACK splitting
   attacks [Savage99], in which the receiver acknowledges partial
   segments for the purpose of confusing the sender's congestion
   accounting.

10.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as
   an RFC.

11.  References

   [Jacobson88]   Jacobson, V., "Congestion Avoidance and Control",
                  SIGCOMM 18(4), August 1988.

   [RFC2140]      Touch, J., "TCP Control Block Interdependence",
                  RFC 2140, April 1997.

   [RFC2861]      Handley, M., Padhye, J., and S. Floyd, "TCP
                  Congestion Window Validation", RFC 2861, June 2000.

   [RFC2914]      Floyd, S., "Congestion Control Principles", BCP 41,
                  RFC 2914, September 2000.

   [RFC3517]      Blanton, E., Allman, M., Fall, K., and L. Wang, "A
                  Conservative Selective Acknowledgment (SACK)-based
                  Loss Recovery Algorithm for TCP", RFC 3517,
                  April 2003.

   [RFC4015]      Ludwig, R. and A. Gurtov, "The Eifel Response
                  Algorithm for TCP", RFC 4015, February 2005.

   [RFC5681]      Allman, M., Paxson, V., and E. Blanton, "TCP
                  Congestion Control", RFC 5681, September 2009.

   [RFC5682]      Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata,
                  "Forward RTO-Recovery (F-RTO): An Algorithm for
                  Detecting Spurious Retransmission Timeouts with
                  TCP", RFC 5682, September 2009.

   [RFC6582]      Henderson, T., Floyd, S., Gurtov, A., and Y.
                  Nishida, "The NewReno Modification to TCP's Fast
                  Recovery Algorithm", RFC 6582, April 2012.

   [PRRid]        Mathis, M., Dukkipati, N., and Y. Cheng,
                  "Proportional Rate Reduction for TCP",
                  draft-mathis-tcpm-proportional-rate-reduction-01
                  (work in progress), July 2011.
   [IMC11PRR]     Mathis, M., Dukkipati, N., Cheng, Y., and M.
                  Ghobadi, "Proportional Rate Reduction for TCP",
                  Proceedings of the 2011 ACM SIGCOMM Internet
                  Measurement Conference, 2011.

   [Savage99]     Savage, S., Cardwell, N., Wetherall, D., and T.
                  Anderson, "TCP congestion control with a misbehaving
                  receiver", SIGCOMM Comput. Commun. Rev. 29(5),
                  October 1999.

   [Visweswaraiah99]
                  Visweswaraiah, V., "Improving Restart of Idle TCP
                  Connections", Tech Report USC TR 97-661,
                  November 1997.

Author's Address

   Matt Mathis
   Google, Inc
   1600 Amphitheatre Parkway
   Mountain View, California 94043
   USA

   Email: mattmathis@google.com