idnits 2.17.1

draft-ietf-tcpm-ecnsyn-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.i or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     (You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Feb 2009 rather than one of the newer Notices. See
     https://trustee.ietf.org/license-info/.)

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 367: '... Section 6.1.1 of RFC 3168 [RFC3168] states that "A host MUST NOT set...'
     RFC 2119 keyword, line 493: '... [RFC3168], which specifies that "the sending TCP MUST restart the...'
     RFC 2119 keyword, line 646: '...ms followed at the end-systems MUST be...'

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document seems to contain a disclaimer for pre-RFC5378 work, and may
     have content which was first submitted before 10 November 2008. The
     disclaimer is necessary when there are original authors that you have
     been unable to contact, or if some do not wish to grant the BCP78 rights
     to the IETF Trust. If you are able to get all authors (current and
     original) to grant those rights, you can and should remove the
     disclaimer; otherwise, the disclaimer is needed and you can ignore this
     comment. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (13 May 2009) is 5461 days in the past. Is this
     intentional?

  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  -- Obsolete informational reference (is this intentional?): RFC 793
     (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC 2309
     (Obsoleted by RFC 7567)

  -- Obsolete informational reference (is this intentional?): RFC 2581
     (Obsoleted by RFC 5681)

  -- Obsolete informational reference (is this intentional?): RFC 2988
     (Obsoleted by RFC 6298)

     Summary: 2 errors (**), 0 flaws (~~), 1 warning (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1    Internet Engineering Task Force                            A. Kuzmanovic
2    INTERNET-DRAFT                                                 A. Mondal
3    Intended status: Experimental                    Northwestern University
4    Expires: 13 November 2009                                       S. Floyd
5                                                                        ICSI
6                                                           K.K. Ramakrishnan
7                                                                        AT&T
8                                                                 13 May 2009

10           Adding Explicit Congestion Notification (ECN) Capability
11                           to TCP's SYN/ACK Packets
12                        draft-ietf-tcpm-ecnsyn-09.txt

14   Status of this Memo

16      This Internet-Draft is submitted to IETF in full conformance with the
17      provisions of BCP 78 and BCP 79.

19      This document may contain material from IETF Documents or IETF
20      Contributions published or made publicly available before November
21      10, 2008. The person(s) controlling the copyright in some of this
22      material may not have granted the IETF Trust the right to allow
23      modifications of such material outside the IETF Standards Process.
24      Without obtaining an adequate license from the person(s) controlling
25      the copyright in such materials, this document may not be modified
26      outside the IETF Standards Process, and derivative works of it may
27      not be created outside the IETF Standards Process, except to format
28      it for publication as an RFC or to translate it into languages other
29      than English.

31      Internet-Drafts are working documents of the Internet Engineering
32      Task Force (IETF), its areas, and its working groups. Note that
33      other groups may also distribute working documents as Internet-
34      Drafts.

36      Internet-Drafts are draft documents valid for a maximum of six months
37      and may be updated, replaced, or obsoleted by other documents at any
38      time. It is inappropriate to use Internet-Drafts as reference
39      material or to cite them other than as "work in progress."

41      The list of current Internet-Drafts can be accessed at
42      http://www.ietf.org/ietf/1id-abstracts.txt.

44      The list of Internet-Draft Shadow Directories can be accessed at
45      http://www.ietf.org/shadow.html.

47      This Internet-Draft will expire on 13 November 2009.

49   Copyright Notice

51      Copyright (c) 2009 IETF Trust and the persons identified as the
52      document authors. All rights reserved.

54      This document is subject to BCP 78 and the IETF Trust's Legal
55      Provisions Relating to IETF Documents in effect on the date of
56      publication of this document (http://trustee.ietf.org/license-info).
57      Please review these documents carefully, as they describe your rights
58      and restrictions with respect to this document.

60   Abstract

62      The proposal in this document is experimental. While it may be
63      deployed in the current Internet, it does not represent a consensus
64      that this is the best possible mechanism for the use of ECN in TCP
65      SYN/ACK packets.

67      This draft describes an optional, experimental modification to RFC
68      3168 to allow TCP SYN/ACK packets to be ECN-Capable. For TCP, RFC
69      3168 specifies setting an ECN-Capable codepoint on data packets, but
70      not on SYN and SYN/ACK packets. However, because of the high cost to
71      the TCP transfer of having a SYN/ACK packet dropped, with the
72      resulting retransmission timeout, this document describes the use of
73      ECN for the SYN/ACK packet itself, when sent in response to a SYN
74      packet with the two ECN flags set in the TCP header, indicating a
75      willingness to use ECN. Setting the initial TCP SYN/ACK packet as
76      ECN-Capable can be of great benefit to the TCP connection, avoiding
77      the severe penalty of a retransmission timeout for a connection that
78      has not yet started placing a load on the network. The TCP responder
79      (the sender of the SYN/ACK packet) must reply to a report of an ECN-
80      marked SYN/ACK packet by resending a SYN/ACK packet that is not ECN-
81      Capable. If the resent SYN/ACK packet is acknowledged, then the TCP
82      responder reduces its initial congestion window from two, three, or
83      four segments to one segment, thereby reducing the subsequent load
84      from that connection on the network. If instead the SYN/ACK packet
85      is dropped, or for some other reason the TCP responder does not
86      receive an acknowledgement in the specified time, the TCP responder
87      follows TCP standards for a dropped SYN/ACK packet (setting the
88      retransmission timer).

90   Table of Contents

92      1. Introduction
93      2. Conventions and Terminology
94      3. Specification
95         3.1. SYN/ACK Packets Dropped in the Network
96         3.2. SYN/ACK Packets ECN-Marked in the Network
97         3.3. Management Interface
98      4. Discussion
99         4.1. Flooding Attacks
100        4.2. The TCP SYN Packet
101        4.3. SYN/ACK Packets and Packet Size
102        4.4. Response to ECN-marking of SYN/ACK Packets
103     5. Related Work
104     6. Performance Evaluation
105        6.1. The Costs and Benefit of Adding ECN-Capability
106        6.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK
107             Packets
108     7. Security Considerations
109        7.1. 'Bad' Routers or Middleboxes
110        7.2. Congestion Collapse
111     8. Conclusions
112     9. Acknowledgements
113     A. Report on Simulations
114        A.1. Simulations with RED in Packet Mode
115        A.2. Simulations with RED in Byte Mode
116     B. Issues of Incremental Deployment
117     Informative References
118     IANA Considerations

120  NOTE TO RFC EDITOR: PLEASE DELETE THIS NOTE UPON PUBLICATION.

122     Changes from draft-ietf-tcpm-ecnsyn-08:

124     * Minor editing and bug-fixes. Feedback from Anil Agarwal and
125       Alfred Hoenes.

127     * Changed the specification so that after the first SYN/ACK packet
128       is ECN-marked, and the responder receives an ECN-Echo, the
129       responder does not set the CWR flag in the second SYN/ACK packet.
130       We also specified that on receiving the non-ECN-marked SYN/ACK
131       packet, the TCP initiator clears the ECN-Echo flag on replying
132       packets. Feedback from Anil Agarwal.

134     * Changed it so that the initiator moves from the "SYN-Sent" state
135       to the "Established" state when it receives a SYN/ACK packet
136       that is not ECN-marked.

138     Changes from draft-ietf-tcpm-ecnsyn-07:

140     * Updated boilerplates.

142     * Changed proposed status from "Proposed Standard" to "Experimental",
143       and modified text in the Introduction to match. The added text
144       in the introduction is based on similar text in the Introduction of
145       RFC 3649.

147     * Specified that with ECN+/TryOnce, the originator restarts the
148       retransmission timer when it receives an ECN-marked SYN/ACK.
149       Also reran simulations for ECN+/TryOnce, and updated
150       Tables 1-6.

152     * Specified that the originator follows the traditional rules in
153       setting the cumulative ack field for the ACK acking the SYN/ACK.

155     * Minor editing.

157     Changes from draft-ietf-tcpm-ecnsyn-06:

159     * Updated text and simulation results to specify ECN+/TryOnce
160       instead of ECN+. Added tables on CDFs.

162     * Acknowledged Adam's Linux implementation of ECN+/TryOnce.

164     Changes from draft-ietf-tcpm-ecnsyn-05:

166     * Added "Updates: 3168" to the header. Added a reference
167       to RFC 4987. Mild editing.
168       Feedback from Lars's Area Director review.

170     * Updated simulation results with new simulation scripts that
171       don't require any modifications to the ns simulator, and that
172       all use the same seed for generating traffic. The results are
173       somewhat different for the very-high-congestion scenarios
174       (with loss rates of 25% in the absence of ECN-capability
175       for SYN/ACK packets). This is reflected in the simulations with
176       a target load of 125% in Tables 1 and 2.

178     * Added the URL for the web page that has the simulation scripts.

180     Changes from draft-ietf-tcpm-ecnsyn-04:

182     * Updating the copyright date.

184     Changes from draft-ietf-tcpm-ecnsyn-03:

186     * General editing. This includes using the terms "initiator"
187       and "responder" for the two ends of the TCP connection.
188       Feedback from Alfred Hoenes.

190     * Added some text to the backwards compatibility discussion,
191       now in Appendix B, about the pros and cons of using a TCP
192       flag for the TCP initiator to signal that it understands
193       ECN-Capable SYN/ACK packets. The consensus at this time is
194       not to use such a flag. Also added a recommendation that
195       TCP implementations include a management interface to turn
196       off the use of ECN for SYN/ACK packets. From email from
197       Bob Briscoe.

199     Changes from draft-ietf-tcpm-ecnsyn-02:

201     * Added to the discussion in the Security section of whether
202       ECN-Capable TCP SYN packets have problems with firewalls,
203       over and above the known problems of TCP data packets
204       (e.g., as in the Microsoft report). From a question raised
205       at the TCPM meeting at the July 2007 IETF.

207     * Added a sentence to the discussion of routers or middleboxes that
208       *might* drop TCP SYN packets on the basis of IP header fields.
209       Feedback from Remi Denis-Courmont.

211     * General editing. Feedback from Alfred Hoenes.

213     Changes from draft-ietf-tcpm-ecnsyn-01:

215     * Changes in response to feedback from Anil Agarwal.

217     * Added a look at the costs of adding ECN-Capability to
218       SYN/ACKs in a highly-congested scenario.
219       From feedback from Mark Allman and Janardhan Iyengar.

221     * Added a comparative evaluation of two possible responses
222       to an ECN-marked SYN/ACK packet. From Mark Allman.

224     Changes from draft-ietf-tcpm-ecnsyn-00:

226     * Only updating the revision number.

228     Changes from draft-ietf-twvsg-ecnsyn-00:

230     * Changed name of draft to draft-ietf-tcpm-ecnsyn.

232     * Added a discussion in Section 3 of "Response to
233       ECN-marking of SYN/ACK packets". Based on
234       suggestions from Mark Allman.

236     * Added a discussion to the Conclusions about adding
237       ECN-capability to relevant set-up packets in other
238       protocols. From a suggestion from Wesley Eddy.

240     * Added a description of SYN exchanges with SYN cookies.
241       From a suggestion from Wesley Eddy.

243     * Added a discussion of one-way data transfers, where the
244       host sending the SYN/ACK packet sends no data packets.

246     * Minor editing, from feedback from Mark Allman and Janardhan
247       Iyengar.

249     * Future work: a look at the costs of adding
250       ECN-Capability in a worst-case scenario.
251       From feedback from Mark Allman and Janardhan Iyengar.

253     * Future work: a comparative evaluation of two
254       possible responses to an ECN-marked SYN/ACK packet.

256     Changes from draft-kuzmanovic-ecn-syn-00.txt:

258     * Changed name of draft to draft-ietf-twvsg-ecnsyn.

260  END OF NOTE TO RFC EDITOR.

262  1. Introduction

264     TCP's congestion control mechanism has primarily used packet loss as
265     the congestion indication, with packets dropped when buffers
266     overflow. With such tail-drop mechanisms, the packet delay can be
267     high, as the queue at bottleneck routers can be fairly large.
268     Dropping packets only when the queue overflows, and having TCP react
269     only to such losses, results in:
270        1) significantly higher packet delay;
271        2) unnecessarily many packet losses; and
272        3) unfairness due to synchronization effects.

274     The adoption of Active Queue Management (AQM) mechanisms allows
275     better control of bottleneck queues [RFC2309]. This use of AQM has
276     the following potential benefits:
277        1) better control of the queue, with reduced queueing delay;
278        2) fewer packet drops; and
279        3) better fairness because of fewer synchronization effects.

281     With the adoption of ECN, performance may be further improved. When
282     the router detects congestion before buffer overflow, the router can
283     provide a congestion indication either by dropping a packet, or by
284     setting the Congestion Experienced (CE) codepoint in the Explicit
285     Congestion Notification (ECN) field in the IP header [RFC3168]. The
286     IETF has standardized the use of the Congestion Experienced (CE)
287     codepoint in the IP header for routers to indicate congestion. For
288     incremental deployment and backwards compatibility, the RFC on ECN
289     [RFC3168] specifies that routers may mark ECN-capable packets that
290     would otherwise have been dropped, using the Congestion Experienced
291     codepoint in the ECN field. The use of ECN allows TCP to react to
292     congestion while avoiding unnecessary retransmission timeouts. Thus,
293     using ECN has several benefits:

295     1) For short transfers, a TCP connection's congestion window may be
296     small. For example, if the current window contains only one packet,
297     and that packet is dropped, TCP will have to wait for a
298     retransmission timeout to recover, reducing its overall throughput.
299     Similarly, if the current window contains only a few packets and one
300     of those packets is dropped, there might not be enough duplicate
301     acknowledgements for a fast retransmission, and the sender of the
302     data packet might have to wait for a delay of several round-trip
303     times using Limited Transmit [RFC3042]. With the use of ECN, short
304     flows are less likely to have packets dropped, sometimes avoiding
305     unnecessary delays or costly retransmission timeouts.

307     2) While longer flows may not see substantially improved throughput
308     with the use of ECN, they may experience lower loss. This may benefit
309     TCP applications that are latency- and loss-sensitive, because of the
310     avoidance of retransmissions.

312     RFC 3168 [RFC3168] specifies setting the ECN-Capable codepoint on TCP
313     data packets, but not on TCP SYN and SYN/ACK packets. RFC 3168
314     [RFC3168] specifies the negotiation of the use of ECN between the two
315     TCP end-points in the TCP SYN and SYN-ACK exchange, using flags in
316     the TCP header. Erring on the side of being conservative, RFC 3168
317     [RFC3168] does not specify the use of ECN for the first SYN/ACK
318     packet itself. However, because of the high cost to the TCP transfer
319     of having a SYN/ACK packet dropped, with the resulting retransmission
320     timeout, this document specifies the use of ECN for the SYN/ACK
321     packet itself. This can be of great benefit to the TCP connection,
322     avoiding the severe penalty of a retransmission timeout for a
323     connection that has not yet started placing a load on the network.
324     The sender of the SYN/ACK packet must respond to a report of an ECN-
325     marked SYN/ACK packet (a SYN/ACK packet with the CE codepoint set in
326     the ECN field in the IP header) by sending a non-ECN-Capable SYN/ACK
327     packet, and by reducing its initial congestion window from two,
328     three, or four segments to one segment, reducing the subsequent load
329     from that connection on the network.

331     The use of ECN for SYN/ACK packets has the following potential
332     benefits:
333        1) Avoidance of a retransmission timeout;
334        2) Improvement in the throughput of short connections.

336     This draft specifies a modification to RFC 3168 [RFC3168] to allow
337     TCP SYN/ACK packets to be ECN-Capable. Section 3 contains the
338     specification of the change, while Section 4 discusses some of the
339     issues, and Section 5 discusses related work. Section 6 contains an
340     evaluation of the specified change.

342  2. Conventions and Terminology

344     We use the following terminology from RFC 3168 [RFC3168]:

346     The ECN field in the IP header:
347        o CE: the Congestion Experienced codepoint; and
348        o ECT: either one of the two ECN-Capable Transport codepoints.

350     The ECN flags in the TCP header:
351        o CWR: the Congestion Window Reduced flag; and
352        o ECE: the ECN-Echo flag.

354     ECN-setup packets:
355        o ECN-setup SYN packet: a SYN packet with the ECE and CWR flags;
356        o ECN-setup SYN-ACK packet: a SYN-ACK packet with ECE but not CWR.

358     In this document we use the terms "initiator" and "responder" to
359     refer to the sender of the SYN packet and of the SYN-ACK packet,
360     respectively.

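As an illustration of the terminology in Section 2, the following Python
sketch shows how the ECN codepoints and the two kinds of ECN-setup packet
could be recognized from raw header bits. It is a minimal sketch, not part
of the specification: the codepoint and flag bit values are the ones
assigned in RFC 3168 and the TCP header definition, while the names
EcnCodepoint, ecn_field, is_ect, is_ecn_setup_syn, and is_ecn_setup_synack
are hypothetical and chosen only for this example.

```python
from enum import IntEnum


class EcnCodepoint(IntEnum):
    """Two-bit ECN field in the IP header (bit values from RFC 3168)."""
    NOT_ECT = 0b00  # not ECN-Capable
    ECT_1 = 0b01    # ECN-Capable Transport, ECT(1)
    ECT_0 = 0b10    # ECN-Capable Transport, ECT(0)
    CE = 0b11       # Congestion Experienced


# TCP header flag bits (the constant names are hypothetical).
TCP_SYN = 0x02
TCP_ACK = 0x10
TCP_ECE = 0x40  # ECN-Echo
TCP_CWR = 0x80  # Congestion Window Reduced


def ecn_field(tos_byte: int) -> EcnCodepoint:
    """Extract the ECN field from the low two bits of the IPv4 TOS byte."""
    return EcnCodepoint(tos_byte & 0b11)


def is_ect(codepoint: EcnCodepoint) -> bool:
    """True for either of the two ECN-Capable Transport codepoints."""
    return codepoint in (EcnCodepoint.ECT_0, EcnCodepoint.ECT_1)


def is_ecn_setup_syn(flags: int) -> bool:
    """A SYN (without ACK) that carries both the ECE and CWR flags."""
    wanted = TCP_SYN | TCP_ECE | TCP_CWR
    return (flags & (wanted | TCP_ACK)) == wanted


def is_ecn_setup_synack(flags: int) -> bool:
    """A SYN-ACK that carries ECE but not CWR."""
    wanted = TCP_SYN | TCP_ACK | TCP_ECE
    return (flags & (wanted | TCP_CWR)) == wanted
```

For example, is_ecn_setup_synack(TCP_SYN | TCP_ACK | TCP_ECE) is true,
while the same flags with CWR added are not an ECN-setup SYN-ACK.
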
362  3. Specification

364     This section specifies the modification to RFC 3168 [RFC3168] to
365     allow TCP SYN/ACK packets to be ECN-Capable.

367     Section 6.1.1 of RFC 3168 [RFC3168] states that "A host MUST NOT set
368     ECT on SYN or SYN-ACK packets." In this section, we specify that a
369     TCP node may respond to an initial ECN-setup SYN packet by setting
370     ECT in the responding ECN-setup SYN/ACK packet, indicating to routers
371     that the SYN/ACK packet is ECN-Capable. This allows a congested
372     router along the path to mark the packet instead of dropping the
373     packet as an indication of congestion.

375     Assume that TCP node A transmits to TCP node B an ECN-setup SYN
376     packet, indicating willingness to use ECN for this connection. As
377     specified by RFC 3168 [RFC3168], if TCP node B is willing to use ECN,
378     node B responds with an ECN-setup SYN-ACK packet.

380  3.1. SYN/ACK Packets Dropped in the Network

382     Figure 1 shows an interchange with the SYN/ACK packet dropped by a
383     congested router. Node B waits for a retransmission timeout, and
384     then retransmits the SYN/ACK packet.

386     ---------------------------------------------------------------
387        TCP Node A             Router                  TCP Node B
388        (initiator)                                    (responder)
389        ----------             ------                  ----------

391        ECN-setup SYN packet --->
392                                        ECN-setup SYN packet --->

394                             <--- ECN-setup SYN/ACK, possibly ECT
395                                                 3-second timer set
396                             SYN/ACK dropped                  .
397                                                              .
398                                                              .
399                                             3-second timer expires
400                             <--- ECN-setup SYN/ACK, not ECT
401        <--- ECN-setup SYN/ACK
402        Data/ACK --->
403                                        Data/ACK --->
404                             <--- Data (one to four segments)
405     ---------------------------------------------------------------

407     Figure 1: SYN exchange with the SYN/ACK packet dropped.

409     If the SYN/ACK packet is dropped in the network, the responder (node
410     B) responds by waiting three seconds for the retransmission timer to
411     expire [RFC2988]. If a SYN/ACK packet with the ECT codepoint is
412     dropped, the responder should resend the SYN/ACK packet without the
413     ECN-Capable codepoint. (Although we are not aware of any middleboxes
414     that drop SYN/ACK packets that contain an ECN-Capable codepoint in
415     the IP header, we have learned to design our protocols defensively in
416     this regard [RFC3360].)

418     We note that if syn-cookies were used by the responder (node B) in
419     the exchange in Figure 1, the responder wouldn't set a timer upon
420     transmission of the SYN/ACK packet [SYN-COOK] [RFC4987]. In this
421     case, if the SYN/ACK packet was lost, the initiator (Node A) would
422     have to time out and retransmit the SYN packet in order to trigger
423     another SYN-ACK.

425  3.2. SYN/ACK Packets ECN-Marked in the Network

427     Figure 2 shows an interchange with the SYN/ACK packet sent as ECN-
428     Capable, and ECN-marked instead of dropped at the congested router.
429     This document specifies ECN+/TryOnce, which differs from the original
430     proposal for ECN+ in [ECN+]; with ECN+/TryOnce, if the TCP responder
431     is informed that the SYN/ACK was ECN-marked, the TCP responder
432     immediately sends a SYN/ACK packet that is not ECN-Capable. The TCP
433     responder is only allowed to send data packets after the TCP
434     initiator reports the receipt of a SYN/ACK packet that is not ECN-
435     marked.

437     ---------------------------------------------------------------
438        TCP Node A             Router                  TCP Node B
439        (initiator)                                    (responder)
440        ----------             ------                  ----------

442        ECN-setup SYN packet --->
443                                        ECN-setup SYN packet --->

445                             <--- ECN-setup SYN/ACK, ECT
446                                                 3-second timer set
447                             <--- Sets CE on SYN/ACK
448        <--- ECN-setup SYN/ACK, CE

450        ACK, ECN-Echo --->
451                                        ACK, ECN-Echo --->
452                                      Window reduced to one segment.
453                             <--- ECN-setup SYN/ACK, not ECT
454        <--- ECN-setup SYN/ACK

456        Data/ACK, ECT --->
457                                        Data/ACK, ECT --->
458                             <--- Data, ECT (one segment only)
459     ---------------------------------------------------------------

461     Figure 2: SYN exchange with the SYN/ACK packet marked.
462     ECN+/TryOnce.

464     If the initiator (node A) receives a SYN/ACK packet that has been
465     ECN-marked by the congested router, with the CE codepoint set, the
466     initiator restarts the retransmission timer. The initiator responds
467     to the ECN-marked SYN/ACK packet by setting the ECN-Echo flag in the
468     TCP header of the responding ACK packet. The initiator uses the
469     standard rules in setting the cumulative acknowledgement field in the
470     responding ACK packet.

472     The initiator does not advance from the "SYN-Sent" to the
473     "Established" state until it receives a SYN/ACK packet that is not
474     ECN-marked.

476     When the responder (node B) receives the ECN-Echo packet reporting
477     the Congestion Experienced indication in the SYN/ACK packet, the
478     responder sets the initial congestion window to one segment, instead
479     of two segments as allowed by [RFC2581], or three or four segments
480     allowed by [RFC3390]. As illustrated in Figure 2, if the responder
481     (node B) receives an ECN-Echo packet informing it of a Congestion
482     Experienced indication on its SYN/ACK packet, the responder sends a
483     SYN/ACK packet that is not ECN-Capable, in addition to setting the
484     initial window to one segment. The responder does not advance the
485     send sequence number. The responder also sets the retransmission
486     timer. The responder follows RFC 2988 [RFC2988] in setting the RTO
487     (retransmission timeout).

489     The TCP hosts follow the standard specification for the response to
490     duplicate SYN/ACK packets (e.g., Section 3.4 of RFC 793 [RFC793]).

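The responder behavior described in Sections 3.1 and 3.2 can be sketched as
a small state machine. The Python model below is illustrative only and
makes several assumptions that are not part of this document: the class
name Responder, its method and field names, the state strings, and the
default initial window of three segments are all hypothetical, and the
retransmission timer is simplified to the fixed three-second value shown
in the figures rather than an RTO computed according to RFC 2988.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

INITIAL_RTO_SECONDS = 3.0  # the "3-second timer" shown in Figures 1 and 2


@dataclass
class Responder:
    """Toy model of TCP node B during the SYN exchange (hypothetical API)."""
    ecn_synack_enabled: bool = True   # assumed management switch
    default_initial_window: int = 3   # assumed; the text allows 2, 3, or 4
    state: str = "LISTEN"
    initial_window: int = 0
    last_synack_ect: bool = False
    timer: Optional[float] = None
    sent: List[Tuple[str, str]] = field(default_factory=list)

    def _send_synack(self, ect: bool) -> None:
        # A resent SYN/ACK reuses the same sequence number; only the
        # ECN codepoint in the IP header differs.
        self.sent.append(("SYN/ACK", "ECT" if ect else "not-ECT"))
        self.last_synack_ect = ect
        self.timer = INITIAL_RTO_SECONDS  # simplification: fixed value

    def on_ecn_setup_syn(self) -> None:
        """An ECN-setup SYN arrived: answer with an ECN-setup SYN/ACK."""
        self.state = "SYN-RECEIVED"
        self.initial_window = self.default_initial_window
        self._send_synack(ect=self.ecn_synack_enabled)

    def on_timeout(self) -> None:
        """SYN/ACK presumed dropped: resend it without the ECT codepoint."""
        if self.state == "SYN-RECEIVED":
            self._send_synack(ect=False)

    def on_ack(self, ecn_echo: bool) -> None:
        """Handle the initiator's ACK of the SYN/ACK."""
        if self.state != "SYN-RECEIVED":
            return
        if ecn_echo and self.last_synack_ect:
            # The ECT SYN/ACK was CE-marked: shrink the initial window
            # and try once more with a SYN/ACK that is not ECN-Capable.
            self.initial_window = 1
            self._send_synack(ect=False)
            return
        self.timer = None
        self.state = "ESTABLISHED"
```

Driving the model with on_ecn_setup_syn() followed by on_ack(ecn_echo=True)
reproduces the exchange of Figure 2: the responder stays in its
pre-established state, reduces the initial window to one segment, and
resends a SYN/ACK that is not ECT. Constructing the model with
ecn_synack_enabled=False gives the RFC 3168 behavior of never setting ECT
on the SYN/ACK.
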
492 We note that the mechanism in this document differs from RFC 3168 493 [RFC3168], which specifies that "the sending TCP MUST restart the 494 retransmission timer on receiving the ECN-Echo packet when the 495 congestion window is one." RFC 3168 [RFC3168] does not allow SYN/ACK 496 packets to be ECN-Capable. RFC 3168 [RFC3168] specifies that in 497 response to an ECN-Echo packet, the TCP responder also sets the CWR 498 flag in the TCP header of the next data packet sent, to acknowledge 499 its receipt of and reaction to the ECN-Echo flag. In contrast, in 500 response to an ECN-Echo packet acknowledging the receipt of an ECN- 501 Capable SYN/ACK packet, the TCP responder doesn't set the CWR flag, 502 but simply sends a SYN/ACK packet that is not ECN-Capable. On 503 receiving the non-ECN-Capable SYN/ACK packet, the TCP initiator 504 clears the ECN-Echo flag on replying packets. 506 --------------------------------------------------------------- 507 TCP Node A Router TCP Node B 508 (initiator) (responder) 509 ---------- ------ ---------- 511 ECN-setup SYN packet ---> 512 ECN-setup SYN packet ---> 514 <--- ECN-setup SYN/ACK, ECT 515 <--- Sets CE on SYN/ACK 516 <--- ECN-setup SYN/ACK, CE 518 ACK, ECN-Echo ---> 519 ACK, ECN-Echo ---> 520 Window reduced to one segment. 522 <--- ECN-setup SYN/ACK, not ECT 523 3-second timer set 524 SYN/ACK dropped . 525 . 526 . 527 3-second timer expires 528 <--- ECN-setup SYN/ACK, not ECT 529 <--- ECN-setup SYN/ACK, not ECT 530 Data/ACK, ECT ---> 531 Data/ACK, ECT ---> 532 <--- Data, ECT (one segment only) 533 --------------------------------------------------------------- 535 Figure 3: SYN exchange with the first SYN/ACK packet marked, 536 and the second SYN/ACK packet dropped. ECN+/TryOnce. 538 In contrast to Figure 2, Figure 3 shows an interchange where the 539 first SYN/ACK packet is ECN-marked and the second SYN/ACK packet is 540 dropped in the network. 
As in Figure 2, the TCP responder sets a 541 timer when the second SYN/ACK packet is sent. Figure 3 shows that if 542 the timer expires before the TCP responder receives an 543 acknowledgement for the other end, the TCP responder resends the 544 SYN/ACK packet, following the TCP standards. 546 3.3. Management Interface 548 The TCP implementation using ECN-Capable SYN/ACK packets should 549 include a management interface to allow the use of ECN to be turned 550 off for SYN/ACK packets. This is to deal with possible backwards 551 compatibility problems such as those discussed in Appendix B. 553 4. Discussion 555 The rationale for the specification in this document is the 556 following. When node B receives a TCP SYN packet with ECN-Echo bit 557 set in the TCP header, this indicates that node A is ECN-capable. If 558 node B is also ECN-capable, there are no obstacles to immediately 559 setting one of the ECN-Capable codepoints in the IP header in the 560 responding TCP SYN/ACK packet. 562 There can be a great benefit in setting an ECN-capable codepoint in 563 SYN/ACK packets, as is discussed further in [ECN+], and reported 564 briefly in Section 5 below. Congestion is most likely to occur in 565 the server-to-client direction. As a result, setting an ECN-capable 566 codepoint in SYN/ACK packets can reduce the occurrence of three- 567 second retransmission timeouts resulting from the drop of SYN/ACK 568 packets. 570 4.1. Flooding Attacks 572 Setting an ECN-Capable codepoint in the responding TCP SYN/ACK 573 packets does not raise any new or additional security 574 vulnerabilities. For example, provoking servers or hosts to send 575 SYN/ACK packets to a third party in order to perform a "SYN/ACK 576 flood" attack would be highly inefficient. Third parties would 577 immediately drop such packets, since they would know that they didn't 578 generate the TCP SYN packets in the first place. 
Moreover, such 579 SYN/ACK attacks would have the same signatures as the existing TCP 580 SYN attacks. Provoking servers or hosts to reply with SYN/ACK packets 581 in order to congest a certain link would also be highly inefficient 582 because SYN/ACK packets are small in size. 584 However, the addition of ECN-Capability to SYN/ACK packets could 585 allow SYN/ACK packets to persist for more hops along a network path 586 before being dropped, thus adding somewhat to the ability of a 587 SYN/ACK attack to flood a network link. 589 4.2. The TCP SYN Packet 591 There are several reasons why an ECN-Capable codepoint must not be 592 set in the IP header of the initiating TCP SYN packet. First, when 593 the TCP SYN packet is sent, there are no guarantees that the other 594 TCP endpoint (node B in Figure 2) is ECN-capable, or that it would be 595 able to understand and react if the ECN CE codepoint was set by a 596 congested router. 598 Second, the ECN-Capable codepoint in TCP SYN packets could be misused 599 by malicious clients to `improve' the well-known TCP SYN attack. By 600 setting an ECN-Capable codepoint in TCP SYN packets, a malicious host 601 might be able to inject a large number of TCP SYN packets through a 602 potentially congested ECN-enabled router, congesting it even further. 604 For both these reasons, we continue the restriction that the TCP SYN 605 packet must not have the ECN-Capable codepoint in the IP header set. 607 4.3. SYN/ACK Packets and Packet Size 609 There are a number of router buffer architectures that have smaller 610 dropping rates for small (SYN) packets than for large (data) packets. 611 For example, for a Drop Tail queue in units of packets, where each 612 packet takes a single slot in the buffer regardless of packet size, 613 small and large packets are equally likely to be dropped. However, 614 for a Drop Tail queue in units of bytes, small packets are less 615 likely to be dropped than are large ones. 
Similarly, for RED in 616 packet mode, small and large packets are equally likely to be dropped 617 or marked, while for RED in byte mode, a packet's chance of being 618 dropped or marked is proportional to the packet size in bytes. 620 For a congested router with an AQM mechanism in byte mode, where a 621 packet's chance of being dropped or marked is proportional to the 622 packet size in bytes, the drop or marking rate for TCP SYN/ACK 623 packets should generally be low. In this case, the benefit of making 624 SYN/ACK packets ECN-Capable should be similarly moderate. However, 625 for a congested router with a Drop Tail queue in units of packets or 626 with an AQM mechanism in packet mode, and with no priority queueing 627 for smaller packets, small and large packets should have the same 628 probability of being dropped or marked. In such a case, making 629 SYN/ACK packets ECN-Capable should be of significant benefit. 631 We believe that there are a wide range of behaviors in the real world 632 in terms of the drop or mark behavior at routers as a function of 633 packet size [Tools] (Section 10). We note that all of these 634 alternatives listed above are available in the NS simulator (Drop 635 Tail queues are by default in units of packets, while the default for 636 RED queue management has been changed from packet mode to byte mode). 638 4.4. Response to ECN-marking of SYN/ACK Packets 640 One question is why TCP SYN/ACK packets should be treated differently 641 from other packets in terms of the end node's response to an ECN- 642 marked packet. Section 5 of RFC 3168 [RFC3168] specifies the 643 following: 645 "Upon the receipt by an ECN-Capable transport of a single CE packet, 646 the congestion control algorithms followed at the end-systems MUST be 647 essentially the same as the congestion control response to a *single* 648 dropped packet. 
For example, for ECN-Capable TCP the source TCP is 649 required to halve its congestion window for any window of data 650 containing either a packet drop or an ECN indication." 652 In particular, Section 6.1.2 of RFC 3168 [RFC3168] specifies that 653 when the TCP congestion window consists of a single packet and that 654 packet is ECN-marked in the network, then the data sender must reduce 655 the sending rate below one packet per round-trip time, by waiting for 656 one RTO before sending another packet. If the RTO was set to the 657 average round-trip time, this would result in halving the sending 658 rate; because the RTO is in fact larger than the average round-trip 659 time, the sending rate is reduced to less than half of its previous 660 value. 662 TCP's congestion control response to the *dropping* of a SYN/ACK 663 packet is to wait a default time before sending another packet. This 664 document argues that ECN gives end-systems a wider range of possible 665 responses to the *marking* of a SYN/ACK packet, and that waiting a 666 default time before sending another packet is not the desired 667 response. 669 On the conservative end, one could assume an effective congestion 670 window of one packet for the SYN/ACK packet, and respond to an ECN- 671 marked SYN/ACK packet by reducing the sending rate to one packet 672 every two round-trip times. As an approximation, the TCP end-node 673 could measure the round-trip time T between the sending of the 674 SYN/ACK packet and the receipt of the acknowledgement, and reply to 675 the acknowledgement of the ECN-marked SYN/ACK packet by waiting T 676 seconds before sending a data packet. 678 However, we note that for an ECN-marked SYN/ACK packet, halving the 679 *congestion window* is not the same as halving the *sending rate*; 680 there is no `sending rate' associated with an ECN-Capable SYN/ACK 681 packet, as such packets are only sent as the first packet in a 682 connection from that host. 
Further, a router's marking of a SYN/ACK 683 packet is not affected by any past history of that connection. 685 Adding ECN-Capability to SYN/ACK packets allows the responder to 686 respond to an ECN mark by setting the initial congestion window to one packet, 687 instead of its allowed default value of two, three, or four packets. 688 The responder sends a non-ECN-Capable SYN/ACK packet, and proceeds 689 with a cautious sending rate of one data packet per round-trip time 690 after that SYN/ACK packet is acknowledged. This document argues that 691 this approach is useful to users, with no dangers of congestion 692 collapse or of starvation of competing traffic. This is discussed in 693 more detail below in Section 6.2. 695 We note that if the data transfer is entirely from Node A to Node B, 696 there is still a difference in performance between the original 697 mechanism ECN+ and the mechanism ECN+/TryOnce specified in this 698 document. In particular, with ECN+/TryOnce the TCP originator does 699 not send data packets until it has received a non-ECN-marked SYN/ACK 700 packet from the other end. 702 5. Related Work 704 The addition of ECN-capability to TCP's SYN/ACK packets was initially 705 proposed in [ECN+]. The paper includes an extensive set of 706 simulation and testbed experiments to evaluate the effects of the 707 proposal, using several Active Queue Management (AQM) mechanisms, 708 including Random Early Detection (RED) [RED], Random Exponential 709 Marking (REM) [REM], and Proportional Integrator (PI) [PI]. The 710 performance measures were the end-to-end response times for each 711 request/response pair, and the aggregate throughput on the bottleneck 712 link. The end-to-end response time was computed as the time from the 713 moment when the request for the file is sent to the server, until 714 that file is successfully downloaded by the client.
716 The measurements from [ECN+] show that setting an ECN-Capable 717 codepoint in the IP packet header in TCP SYN/ACK packets 718 systematically improves performance with all evaluated AQM schemes. 719 When SYN/ACK packets at a congested router are ECN-marked instead of 720 dropped, this can avoid a long initial retransmission timeout, 721 improving the response time for the affected flow dramatically. 723 [ECN+] shows that the impact on aggregate throughput can also be 724 quite significant, because marking SYN/ACK packets can prevent larger 725 flows from suffering long timeouts before being "admitted" into the 726 network. In addition, the testbed measurements from [ECN+] show that 727 web servers setting the ECN-Capable codepoint in TCP SYN/ACK packets 728 could serve more requests. 730 As a final step, [ECN+] explores the co-existence of flows that do 731 and don't set the ECN-capable codepoint in TCP SYN/ACK packets. The 732 results in [ECN+] show that both types of flows can coexist, with 733 some performance degradation for flows that don't use ECN+. Flows 734 that do use ECN+ improve their end-to-end performance. At the same 735 time, the performance degradation for flows that don't use ECN+, as a 736 result of the flows that do use ECN+, increases as a greater fraction 737 of flows use ECN+. 739 6. Performance Evaluation 741 6.1. The Costs and Benefits of Adding ECN-Capability 743 [ECN+] explores the costs and benefits of adding ECN-Capability to 744 SYN/ACK packets with both simulations and experiments. The addition 745 of ECN-capability to SYN/ACK packets could be of significant benefit 746 for those ECN connections that would have had the SYN/ACK packet 747 dropped in the network, and for which the ECN-Capability would allow 748 the SYN/ACK to be marked rather than dropped. 750 The percentage of SYN/ACK packets on a link can be quite high.
In 751 particular, measurements on links dominated by web traffic indicate 752 that 15-20% of the packets can be SYN/ACK packets [SCJO01]. 754 The benefit of adding ECN-capability to SYN/ACK packets depends in 755 part on the size of the data transfer. The drop of a SYN/ACK packet 756 can increase the download time of a short file by an order of 757 magnitude, by requiring a three-second retransmission timeout. For 758 longer-lived flows, the effect of a dropped SYN/ACK packet on file 759 download time is less dramatic. However, even for longer-lived 760 flows, the addition of ECN-capability to SYN/ACK packets can improve 761 the fairness among long-lived flows, as newly-arriving flows would be 762 less likely to have to wait for retransmission timeouts. 764 One question that arises is what fraction of connections would see 765 the benefit from making SYN/ACK packets ECN-capable, in a particular 766 scenario. Specifically: 768 (1) What fraction of arriving SYN/ACK packets are dropped at the 769 congested router when the SYN/ACK packets are not ECN-capable? 771 (2) Of those SYN/ACK packets that are dropped, what fraction would 772 have been ECN-marked instead of dropped if the SYN/ACK packets had 773 been ECN-capable? 775 To answer (1), it is necessary to consider not only the level of 776 congestion but also the queue architecture at the congested link. As 777 described in Section 4 above, for some queue architectures small 778 packets are less likely to be dropped than large ones. In such an 779 environment, SYN/ACK packets would have lower packet drop rates; 780 question (1) could not necessarily be inferred from the overall 781 packet drop rate, but could be answered by measuring the drop rate 782 for SYN/ACK packets directly. In such an environment, adding ECN- 783 capability to SYN/ACK packets would be of less dramatic benefit than 784 in environments where all packets are equally likely to be dropped 785 regardless of packet size. 
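To illustrate why the answer to question (1) depends on the queue architecture, the following minimal sketch compares the drop-or-mark probability seen by a small SYN/ACK packet and by a full-sized data packet under a packet-mode and a byte-mode AQM. It is a simplified model and not part of this specification: the function name, the 5% base probability, and the 40-, 500-, and 1500-byte packet sizes are illustrative assumptions. The byte-mode scaling follows the proportional-to-size behavior described in Section 4.3.

```python
def mark_probability(base_prob, packet_bytes, mean_packet_bytes, byte_mode):
    """Return a simplified drop-or-mark probability for one packet.

    Packet mode: every packet sees the same probability.
    Byte mode: the probability scales with packet size (capped at 1.0).
    All parameter values used below are hypothetical illustration values.
    """
    if not byte_mode:
        return base_prob
    return min(1.0, base_prob * packet_bytes / mean_packet_bytes)


BASE_PROB = 0.05      # assumed base drop/mark probability at the queue
MEAN_BYTES = 500      # assumed mean packet size configured at the queue
SYN_ACK_BYTES = 40    # assumed size of a SYN/ACK packet without options
DATA_BYTES = 1500     # assumed size of a full-sized data packet

for mode_name, byte_mode in (("packet mode", False), ("byte mode", True)):
    p_synack = mark_probability(BASE_PROB, SYN_ACK_BYTES, MEAN_BYTES, byte_mode)
    p_data = mark_probability(BASE_PROB, DATA_BYTES, MEAN_BYTES, byte_mode)
    print(f"{mode_name}: SYN/ACK {p_synack:.4f}, data {p_data:.4f}")
```

Under these assumed values, the SYN/ACK packet and the data packet see the same probability in packet mode, while in byte mode the SYN/ACK packet sees a much smaller probability than the data packet. This is why the overall packet drop rate alone may not answer question (1).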
787 As question (2) implies, even if all of the SYN/ACK packets were ECN- 788 capable, there could still be some SYN/ACK packets dropped instead of 789 marked at the congested link; the full answer to question (2) depends 790 on the details of the queue management mechanism at the router. If 791 congestion is sufficiently bad, and the queue management mechanism 792 cannot prevent the buffer from overflowing, then SYN/ACK packets will 793 be dropped rather than marked upon buffer overflow whether or not 794 they are ECN-capable. 796 For some AQM mechanisms, ECN-capable packets are marked instead of 797 dropped any time this is possible, that is, any time the buffer is 798 not yet full. For other AQM mechanisms however, such as the RED 799 mechanism as recommended in [RED], packets are dropped rather than 800 marked when the packet drop/mark rate exceeds a certain threshold, 801 e.g., 10%, even if the packets are ECN-capable. For a router with 802 such an AQM mechanism, when congestion is sufficiently severe to 803 cause a high drop/mark rate, some SYN/ACK packets would be dropped 804 instead of marked whether or not they were ECN-capable. 806 Thus, the degree of benefit of adding ECN-Capability to SYN/ACK 807 packets depends not only on the overall packet drop rate in the 808 network, but also on the queue management architecture at the 809 congested link. 811 6.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK Packets 813 This document specifies that the end-node responds to the report of 814 an ECN-marked SYN/ACK packet by setting the initial congestion window 815 to one segment, instead of its possible default value of two to four 816 segments, and resending a SYN/ACK packet that is not ECN-Capable. We 817 call this ECN+/TryOnce. 819 However, Section 4 discussed two other possible responses to an ECN- 820 marked SYN/ACK packet. 
In ECN+, the original proposal from [ECN+], 821 the end node responds to the report of an ECN-marked SYN/ACK packet 822 by setting the initial congestion window to one segment and 823 immediately sending a data packet, if it has one to send. In 824 ECN+/Wait, the end node responds to the report of an ECN-marked 825 SYN/ACK packet by setting the initial congestion window to one 826 segment and waiting an RTT before sending a data packet. 828 Simulations comparing the performance with Standard ECN (without ECN- 829 marked SYN/ACK packets), ECN+, ECN+/Wait, and ECN+/TryOnce show little 830 difference, in terms of aggregate congestion, between ECN+ and 831 ECN+/Wait. However, for some scenarios with queues that are packet- 832 based rather than byte-based, and with packet drop rates above 25% 833 without ECN+, the use of ECN+ or of ECN+/Wait can more than double 834 the packet drop rates, to greater than 50%. The details are given in 835 Tables 1 and 3 of Appendix A below. ECN+/TryOnce does not increase 836 the packet drop rate in scenarios of high congestion. Therefore, 837 ECN+/TryOnce is superior to ECN+ or to ECN+/Wait, which both 838 significantly increase the packet drop rate in scenarios of high 839 congestion. At the same time, ECN+/TryOnce gives a performance 840 improvement similar to that of ECN+ or ECN+/Wait (Tables 2 and 4 of 841 Appendix A). 843 Our conclusions are that ECN+/TryOnce is safe, and has significant 844 benefits to the user, and avoids the problems of ECN+ or ECN+/Wait 845 under extreme levels of congestion. As a consequence, this document 846 specifies the use of ECN+/TryOnce. 848 [Note: We only discovered the occasional congestion-related problems 849 of ECN+ and of ECN+/Wait when re-running the simulations with an 850 updated version of the ns-2 simulator, after the internet-draft had 851 almost completed the standardization process.] 853 7.
Security Considerations 855 TCP packets carrying the ECT codepoint in IP headers can be marked 856 rather than dropped by ECN-capable routers. This raises several 857 security concerns that we discuss below. 859 7.1. 'Bad' Routers or Middleboxes 861 There are a number of known deployment problems from using ECN with 862 TCP traffic in the Internet. The first reported problem, dating back 863 to 2000, is of a small but decreasing number of routers or 864 middleboxes that reset a TCP connection in response to TCP SYN 865 packets using flags in the TCP header to negotiate ECN-capability 866 [Kelson00] [RFC3360] [MAF05]. Dave Thaler reported at the March 2007 867 IETF on two new problems encountered by TCP connections using ECN; 868 the first of the two problems concerns routers that crash when a TCP 869 data packet arrives with the ECN field in the IP header with the 870 codepoint ECT(0) or ECT(1), indicating that an ECN-Capable connection 871 has been established [SBT07]. 873 While there is no evidence that any routers or middleboxes drop 874 SYN/ACK packets that contain an ECN-Capable or CE codepoint in the IP 875 header, such behavior cannot be excluded. (There seem to be a 876 number of routers or middleboxes that drop TCP SYN packets that 877 contain known or unknown IP options [MAF05] (Figure 1).) Thus, as 878 specified in Section 3, if a SYN/ACK packet with the ECT or CE 879 codepoint is dropped, the TCP node should resend the SYN/ACK packet 880 without the ECN-Capable codepoint. There is also no evidence that 881 any routers or middleboxes crash when a SYN/ACK arrives with an ECN- 882 Capable or CE codepoint in the IP header (over and above the routers 883 already known to crash when a data packet arrives with either ECT(0) 884 or ECT(1)), but we have not conducted any measurement studies of this 885 [F07]. 887 7.2.
Congestion Collapse 889 Because TCP SYN/ACK packets carrying an ECT codepoint could be ECN- 890 marked instead of dropped at an ECN-capable router, the concern is 891 whether this can either invoke congestion, or worsen performance in 892 highly congested scenarios. However, after learning that a SYN/ACK 893 packet was ECN-marked, the responder sends a SYN/ACK packet that is 894 not ECN-Capable; if this SYN/ACK packet is dropped, the responder 895 then waits for a retransmission timeout, as specified in the TCP 896 standards. In addition, routers are free to drop rather than mark 897 arriving packets in times of high congestion, regardless of whether 898 the packets are ECN-capable. When congestion is very high and a 899 router's buffer is full, the router has no choice but to drop rather 900 than to mark an arriving packet. 902 The simulations reported in Appendix A show that even with demanding 903 traffic mixes dominated by short flows and high levels of congestion, 904 the aggregate packet dropping rates are not significantly different 905 with Standard ECN or with ECN+/TryOnce. However, in our simulations, 906 we have one scenario where ECN+ or ECN+/Wait results in a 907 significantly higher packet drop rate than ECN or ECN+/TryOnce 908 (Tables 1 and 3 in Appendix A below). 910 8. Conclusions 912 This draft specifies a modification to RFC 3168 [RFC3168] to allow 913 TCP nodes to send SYN/ACK packets as being ECN-Capable. Making the 914 SYN/ACK packet ECN-Capable avoids the high cost to a TCP transfer 915 when a SYN/ACK packet is dropped by a congested router, by avoiding 916 the resulting retransmission timeout. This improves the throughput 917 of short connections. 
This document specifies the ECN+/TryOnce 918 mechanism for ECN-Capability for SYN/ACK packets, where the sender of 919 the SYN/ACK packet responds to an ECN mark by reducing its initial 920 congestion window from two, three, or four segments to one segment, 921 and sending a SYN/ACK packet that is not ECN-Capable. The addition 922 of ECN-capability to SYN/ACK packets is particularly beneficial in 923 the server-to-client direction, where congestion is more likely to 924 occur. In this case, the initial information provided by the ECN 925 marking in the SYN/ACK packet enables the server to appropriately 926 adjust the initial load it places on the network, while avoiding the 927 delay of a retransmission timeout. 929 9. Acknowledgements 931 We thank Anil Agarwal, Mark Allman, Remi Denis-Courmont, Wesley Eddy, 932 Lars Eggert, Alfred Hoenes, Janardhan Iyengar, and Pasi Sarolahti for 933 feedback on earlier versions of this draft. We thank Adam Langley 934 [L08] for contributing a patch for ECN+/TryOnce for the Linux 935 development tree. 937 A. Report on Simulations 939 This section reports on simulations showing the costs of adding ECN+ 940 in highly-congested scenarios. This section also reports on 941 simulations for a comparative evaluation between ECN, ECN+, 942 ECN+/Wait, and ECN+/TryOnce. 944 The simulations are run with a range of file-size distributions, 945 using the PackMime traffic generator in the ns-2 simulator. They all 946 use a heavy-tailed distribution of file sizes. The simulations 947 reported in the tables below use a mean file size of 3 KBytes, to 948 show the results with a traffic mix with a large number of small 949 transfers. Other simulations were run with mean file sizes of 5 950 KBytes, 7 KBytes, 14 KBytes, and 17 KBytes. The title of each chart 951 gives the targeted average load from the traffic generator.
Because 952 the simulations use a heavy-tailed distribution of file sizes, and 953 run for only 85 seconds (including ten seconds of warm-up time), the 954 actual load is often much smaller than the targeted load. The 955 congested link is 100 Mbps. RED is run in gentle mode, and arriving 956 ECN-Capable packets are only dropped instead of marked if the buffer 957 is full (and the router has no choice). 959 We explore three possible mechanisms for a TCP node's response to a 960 report of an ECN-marked SYN/ACK packet. With ECN+, the TCP node 961 sends a data packet immediately (with an initial congestion window of 962 one segment). With ECN+/Wait, the TCP node waits a round-trip time 963 before sending a data packet; the responder already has one 964 measurement of the round-trip time when the acknowledgement for the 965 SYN/ACK packet is received. With ECN+/TryOnce, the mechanism 966 standardized in this document, the TCP responder replies to a report 967 of an ECN-marked SYN/ACK packet by sending a SYN/ACK packet that is 968 not ECN-Capable, and reducing the initial congestion window to one 969 segment. 971 The simulation scripts are available at [ECN-SYN], along with graphs 972 showing the distribution of response times for the TCP connections. 974 A.1. Simulations with RED in Packet Mode 976 The simulations with RED in packet mode and with the queue in packets 977 show that ECN+ is useful in times of moderate or of high congestion. 978 However, for the simulations with a target load of 125%, with a 979 packet loss rate of over 25% for ECN, ECN+ and ECN+/Wait both result 980 in a packet loss rate of over 50%. (In contrast, the packet loss 981 rate with ECN+/TryOnce is less than that of ECN alone.) For the 982 distribution of response times, the simulations show that ECN+, 983 ECN+/Wait, and ECN+/TryOnce all significantly improve the response 984 times, compared to the response times with plain ECN.
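The three responder behaviors compared in this appendix can be sketched as a small decision function. This is an illustrative sketch only, not an implementation of any TCP stack: the names `ResponderAction` and `respond_to_synack_report` are hypothetical, and the unmarked-case initial window of four segments is an assumption (the document allows two, three, or four).

```python
from dataclasses import dataclass

DEFAULT_INITIAL_WINDOW = 4  # assumed; the allowed default is two, three, or four


@dataclass
class ResponderAction:
    initial_cwnd_segments: int       # initial congestion window, in segments
    resend_synack_without_ect: bool  # True only for ECN+/TryOnce
    delay_before_data_s: float       # extra wait before the first data packet


def respond_to_synack_report(variant, synack_marked, measured_rtt_s):
    """Return the responder's action once the SYN/ACK is acknowledged.

    variant is one of "ECN+", "ECN+/Wait", "ECN+/TryOnce".
    synack_marked says whether the SYN/ACK was reported as ECN-marked.
    measured_rtt_s is the round-trip time measured on the SYN/ACK exchange.
    """
    if not synack_marked:
        # No congestion indication: proceed with the default initial window.
        return ResponderAction(DEFAULT_INITIAL_WINDOW, False, 0.0)
    if variant == "ECN+":
        # Send a data packet immediately with a one-segment window.
        return ResponderAction(1, False, 0.0)
    if variant == "ECN+/Wait":
        # Wait one measured round-trip time before sending data.
        return ResponderAction(1, False, measured_rtt_s)
    if variant == "ECN+/TryOnce":
        # Resend the SYN/ACK as not ECN-Capable; data follows only after
        # that second SYN/ACK is acknowledged (no extra timer modeled here).
        return ResponderAction(1, True, 0.0)
    raise ValueError(f"unknown variant: {variant}")


print(respond_to_synack_report("ECN+/TryOnce", True, 0.08))
```

The sketch makes the difference between the variants explicit: all three reduce the initial window to one segment, and only ECN+/TryOnce sends a second, non-ECN-Capable SYN/ACK packet that can itself be dropped under heavy congestion.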
986 Table 1 shows the congestion levels for simulations with RED in 987 packet mode, with a queue in packets. To explore a worst-case 988 scenario, these simulations use a traffic mix with an unrealistically 989 small flow size distribution, with a mean flow size of 3 Kbytes. For 990 each table showing a particular traffic load, the four rows show the 991 number of packets dropped, the number of packets ECN-marked, the 992 aggregate packet drop rate, and the aggregate throughput, and the 993 four columns show the simulations with Standard ECN, ECN+, ECN+/Wait, 994 and ECN+/TryOnce. 996 These simulations were run with RED set to mark instead of drop 997 packets any time that the queue is not full. This is a worst-case 998 scenario for ECN+ and its variants. For the default implementation 999 of RED in the ns-2 simulator, when the average queue size exceeds a 1000 configured threshold, the router drops all arriving packets. For 1001 scenarios with this RED mechanism, it is less likely that ECN+ or 1002 one of its variants would increase the average queue size above the 1003 configured threshold. 1005 The usefulness of ECN+: The first thing to observe is that for all of 1006 the simulations, the use of ECN+ or ECN+/Wait significantly increases 1007 the number of packets marked. In contrast, the use of ECN+/TryOnce 1008 significantly increases the number of packets marked in the 1009 simulations with moderate congestion, and gives a more moderate 1010 increase in the number of packets marked for the simulations with 1011 higher levels of congestion. However, the cumulative distribution 1012 function (CDF) in Table 2 shows that ECN+, ECN+/Wait, and 1013 ECN+/TryOnce all improve response times for all of the simulations, 1014 with moderate or with larger levels of congestion.
1016 Little increase in congestion, sometimes: The second thing to observe 1017 is that for the simulations with low or moderate levels of congestion 1018 (that is, with packet drop rates less than 10%), the use of ECN+, 1019 ECN+/Wait, and ECN+/TryOnce all decrease the aggregate packet drop 1020 rate, relative to the simulations with ECN. This makes sense, since 1021 with low or moderate levels of congestion, ECN+ allows SYN/ACK 1022 packets to be marked instead of dropped, and the use of ECN+ doesn't 1023 add to the aggregate congestion. However, for the simulations with 1024 packet drop rates of 15% or higher with ECN, the use of ECN+ or 1025 ECN+/Wait increases the aggregate packet drop rate, sometimes even 1026 doubling it. 1028 Comparing ECN+, ECN+/Wait, and ECN+/TryOnce: The aggregate packet 1029 drop rate is generally higher with ECN+/Wait than with ECN+. Thus, 1030 there is no congestion-related reason to prefer ECN+/Wait over ECN+. 1031 In contrast, the aggregate packet drop rate with ECN+/TryOnce is 1032 often significantly lower than the aggregate packet drop rate with 1033 ECN, ECN+, or ECN+/Wait.
1035 Target Load = 95%:
1036             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1037             -------     -------     -------     ----------
1038 Dropped     20,516      11,226      11,735      16,755
1039 Marked      30,586      37,741      37,425      40,764
1040 Loss rate   1.41%       0.78%       0.81%       1.02%
1041 Throughput  81%         81%         81%         81%

1043 Target Load = 110%:
1044             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1045             -------     -------     -------     ----------
1046 Dropped     165,566     106,083     147,180     208,422
1047 Marked      179,735     281,306     308,473     235,483
1048 Loss rate   9.01%       6.12%       8.02%       6.89%
1049 Throughput  92%         92%         92%         94%

1051 Target Load = 125%:
1052             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1053             -------     -------     -------     ----------
1054 Dropped     600,628     1,746,768   2,176,530   625,552
1055 Marked      418,433     1,166,450   1,164,932   439,847
1056 Loss rate   25.45%      51.73%      56.87%      18.31%
1057 Throughput  94%         98%         97%         95%

1059 Target Load = 150%
1060             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1061             -------     -------     -------     ----------
1062 Dropped     1,449,945   1,565,0517  1,563,0801  1,351,637
1063 Marked      669,840     583,378     591,315     684,715
1064 Loss rate   46.7%       59.0%       59.0%       32.7%
1065 Throughput  88%         94%         94%         92%

1067 Table 1: Simulations with an average flow size of 3 Kbytes, a
1068 100 Mbps link, RED in packet mode, queue in packets.
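As a worked check of the statement that ECN+ and ECN+/Wait can more than double the packet drop rate under high congestion, the following sketch divides each loss rate in the Target Load = 125% block of Table 1 by the Standard ECN loss rate. The loss rates are copied from the table; the function and variable names are illustrative assumptions.

```python
# Aggregate loss rates (percent) copied from Table 1, Target Load = 125%.
loss_rate_125 = {
    "ECN": 25.45,
    "ECN+": 51.73,
    "ECN+/Wait": 56.87,
    "ECN+/TryOnce": 18.31,
}


def loss_ratio_vs_ecn(loss_rates):
    """Return each variant's loss rate divided by the Standard ECN loss rate."""
    baseline = loss_rates["ECN"]
    return {name: rate / baseline for name, rate in loss_rates.items()}


ratios = loss_ratio_vs_ecn(loss_rate_125)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} times the Standard ECN loss rate")
```

For this block of the table, the ratios for ECN+ and ECN+/Wait come out above 2, while the ratio for ECN+/TryOnce is below 1, consistent with the comparison given in Section A.1.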
1070 Target Load = 95%:
1071 TIME: 10   100  200  300  400  500  1000 2000 3000 4000 5000
1072 ------------------------------------------------------
1073 ECN:  0.00 0.07 0.26 0.51 0.82 0.96 0.97 0.97 0.97 1.00 1.00
1074 ECN+: 0.00 0.07 0.27 0.53 0.85 0.99 1.00 1.00 1.00 1.00 1.00
1075 Wait: 0.00 0.07 0.26 0.51 0.83 0.97 1.00 1.00 1.00 1.00 1.00
1076 Once: 0.00 0.07 0.24 0.49 0.83 0.97 1.00 1.00 1.00 1.00 1.00

1078 Target Load = 110%:
1079 TIME: 10   100  200  300  400  500  1000 2000 3000 4000 5000
1080 ------------------------------------------------------
1081 ECN:  0.00 0.05 0.19 0.41 0.67 0.79 0.80 0.80 0.80 0.96 0.96
1082 ECN+: 0.00 0.07 0.22 0.48 0.81 0.96 1.00 1.00 1.00 1.00 1.00
1083 Wait: 0.00 0.05 0.18 0.38 0.64 0.77 0.95 1.00 1.00 1.00 1.00
1084 Once: 0.00 0.06 0.19 0.42 0.70 0.86 0.95 0.96 0.96 0.99 0.99

1086 Target Load = 125%:
1087 TIME: 10   100  200  300  400  500  1000 2000 3000 4000 5000
1088 ------------------------------------------------------
1089 ECN:  0.00 0.04 0.13 0.27 0.46 0.56 0.58 0.59 0.59 0.82 0.82
1090 ECN+: 0.00 0.06 0.18 0.33 0.58 0.76 0.97 0.99 0.99 1.00 1.00
1091 Wait: 0.00 0.01 0.06 0.13 0.21 0.27 0.68 0.98 0.99 1.00 1.00
1092 Once: 0.00 0.05 0.16 0.34 0.58 0.73 0.85 0.87 0.87 0.95 0.96

1094 Target Load = 150%:
1095 TIME: 10   100  200  300  400  500  1000 2000 3000 4000 5000
1096 ------------------------------------------------------
1097 ECN:  0.00 0.03 0.08 0.18 0.31 0.39 0.42 0.42 0.43 0.68 0.68
1098 ECN+: 0.00 0.06 0.18 0.39 0.67 0.81 0.83 0.84 0.84 0.93 0.93
1099 Wait: 0.00 0.06 0.18 0.39 0.67 0.81 0.83 0.84 0.84 0.93 0.94
1100 Once: 0.00 0.04 0.13 0.27 0.46 0.59 0.72 0.75 0.75 0.88 0.88

1102 Table 2: The cumulative distribution function (CDF) for transfer
1103 times, for simulations with an average flow size of 3 Kbytes, a
1104 100 Mbps link, RED in packet mode, queue in packets. (The graphs are
1105 available from "http://www.icir.org/floyd/ecn-syn/".)
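One way to read Table 2 is to look up, for each mechanism, the fraction of transfers that have completed by a given time. The sketch below does this for the Target Load = 125% block. The CDF values are copied from the table; the names `TIME_POINTS`, `CDF_125`, and `fraction_done_by` are illustrative assumptions, and the unit of the TIME row is not stated in the table (we assume here that it is milliseconds).

```python
# TIME row of Table 2 (unit not stated in the table; assumed milliseconds).
TIME_POINTS = [10, 100, 200, 300, 400, 500, 1000, 2000, 3000, 4000, 5000]

# CDF rows copied from Table 2, Target Load = 125%.
CDF_125 = {
    "ECN": [0.00, 0.04, 0.13, 0.27, 0.46, 0.56, 0.58, 0.59, 0.59, 0.82, 0.82],
    "ECN+": [0.00, 0.06, 0.18, 0.33, 0.58, 0.76, 0.97, 0.99, 0.99, 1.00, 1.00],
    "ECN+/Wait": [0.00, 0.01, 0.06, 0.13, 0.21, 0.27, 0.68, 0.98, 0.99, 1.00, 1.00],
    "ECN+/TryOnce": [0.00, 0.05, 0.16, 0.34, 0.58, 0.73, 0.85, 0.87, 0.87, 0.95, 0.96],
}


def fraction_done_by(variant, time_point):
    """Return the fraction of transfers completed by the given TIME column."""
    column = TIME_POINTS.index(time_point)  # raises ValueError if not a column
    return CDF_125[variant][column]


for variant in CDF_125:
    print(f"{variant}: {fraction_done_by(variant, 1000):.2f} done by TIME = 1000")
```

At the TIME = 1000 column, the fraction of completed transfers is higher for each of ECN+, ECN+/Wait, and ECN+/TryOnce than for Standard ECN, which is the response-time improvement described in Section A.1.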
1106 Target Load = 95%
1107             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1108             -------     -------     -------     ----------
1109 Dropped     8,448       6,362       7,740       14,107
1110 Marked      9,891       16,787      17,456      16,132
1111 Loss rate   5.5%        4.3%        5.0%        5.0%
1112 Throughput  78%         78%         78%         81%

1114 Target Load = 110%
1115             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1116             -------     -------     -------     ----------
1117 Dropped     31,284      29,773      49,297      45,277
1118 Marked      28,429      54,729      60,383      34,622
1119 Loss rate   15.3%       15.2%       21.9%       13.6%
1120 Throughput  97%         96%         96%         94%

1122 Target Load = 125%
1123             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1124             -------     -------     -------     ----------
1125 Dropped     61,433      176,682     214,096     75,612
1126 Marked      44,408      119,728     117,301     49,442
1127 Loss rate   25.4%       51.9%       56.0%       22.3%
1128 Throughput  97%         98%         98%         96%

1130 Target Load = 150%
1131             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1132             -------     -------     -------     ----------
1133 Dropped     130,007     251,856     326,845     133,603
1134 Marked      63,066      146,757     147,239     66,444
1135 Loss rate   42.5%       61.3%       67.3%       31.7%
1136 Throughput  93%         99%         99%         94%

1138 Table 3: Simulations with an average flow size of 3 Kbytes, a 10 Mbps
1139 link, RED in packet mode, queue in packets.
1141 Target Load = 95%:
1142 TIME: 10   100  200  300  400  500  1000 2000 3000 4000 5000
1143 ------------------------------------------------------
1144 ECN:  0.00 0.05 0.18 0.42 0.70 0.86 0.88 0.88 0.88 0.98 0.98
1145 ECN+: 0.00 0.06 0.20 0.45 0.78 0.96 1.00 1.00 1.00 1.00 1.00
1146 Wait: 0.00 0.05 0.18 0.40 0.68 0.84 0.96 1.00 1.00 1.00 1.00
1147 Once: 0.00 0.05 0.18 0.40 0.71 0.88 0.96 0.97 0.97 0.99 0.99

1149 Target Load = 110%:
1150 TIME: 10   100  200  300  400  500  1000 2000 3000 4000 5000
1151 ------------------------------------------------------
1152 ECN:  0.00 0.03 0.13 0.29 0.52 0.66 0.69 0.69 0.69 0.91 0.91
1153 ECN+: 0.00 0.05 0.17 0.36 0.66 0.88 0.98 0.99 1.00 1.00 1.00
1154 Wait: 0.00 0.02 0.08 0.20 0.35 0.47 0.76 0.98 1.00 1.00 1.00
1155 Once: 0.00 0.05 0.15 0.32 0.58 0.75 0.88 0.90 0.90 0.97 0.97

1157 Target Load = 125%:
1158 TIME: 10   100  200  300  400  500  1000 2000 3000 4000 5000
1159 ------------------------------------------------------
1160 ECN:  0.00 0.03 0.10 0.22 0.40 0.52 0.56 0.56 0.57 0.82 0.82
1161 ECN+: 0.00 0.03 0.14 0.27 0.49 0.70 0.96 0.99 0.99 0.99 1.00
1162 Wait: 0.00 0.00 0.03 0.07 0.12 0.18 0.50 0.94 0.99 0.99 1.00
1163 Once: 0.00 0.04 0.13 0.28 0.51 0.66 0.81 0.84 0.84 0.94 0.94

1165 Target Load = 150%:
1166 TIME: 10   100  200  300  400  500  1000 2000 3000 4000 5000
1167 ------------------------------------------------------
1168 ECN:  0.00 0.02 0.07 0.15 0.28 0.38 0.42 0.42 0.43 0.67 0.68
1169 ECN+: 0.00 0.00 0.00 0.00 0.01 0.05 0.68 0.83 0.95 0.97 0.98
1170 Wait: 0.00 0.00 0.00 0.00 0.00 0.00 0.10 0.62 0.83 0.93 0.97
1171 Once: 0.00 0.03 0.11 0.24 0.42 0.56 0.71 0.75 0.75 0.88 0.88

1173 Table 4: The cumulative distribution function (CDF) for transfer
1174 times, for simulations with an average flow size of 3 Kbytes, a
1175 10 Mbps link, RED in packet mode, queue in packets. (The graphs are
1176 available from "http://www.icir.org/floyd/ecn-syn/".)

1178 A.2.
Simulations with RED in Byte Mode

1180 Table 5 below shows simulations with RED in byte mode and the queue 1181 in bytes. There is no significant increase in aggregate congestion 1182 with the use of ECN+, ECN+/Wait, or ECN+/TryOnce.

1184 However, unlike the simulations with RED in packet mode, the 1185 simulations with RED in byte mode show little benefit from the use of 1186 ECN+ or ECN+/Wait, in that the packet marking rate with ECN+ or 1187 ECN+/Wait is not much different than the packet marking rate with 1188 Standard ECN. This is because with RED in byte mode, small packets 1189 like SYN/ACK packets are rarely dropped or marked - that is, there is 1190 no drawback from the use of ECN+ in these scenarios, but not much 1191 need for ECN+ either, in a scenario where small packets are unlikely 1192 to be dropped or marked.

1194 Target Load = 95%
1195             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1196             -------     -------     -------     ----------
1197 Dropped     766         446         427         408
1198 Marked      32,683      34,289      33,412      31,892
1199 Loss rate   0.05%       0.03%       0.03%       0.03%
1200 Throughput  81%         81%         81%         81%

1202 Target Load = 110%
1203             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1204             -------     -------     -------     ----------
1205 Dropped     2,496       2,110       1,733       2,020
1206 Marked      220,573     258,696     230,955     214,604
1207 Loss rate   0.15%       0.13%       0.11%       0.11%
1208 Throughput  92%         91%         92%         92%

1210 Target Load = 125%
1211             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1212             -------     -------     -------     ----------
1213 Dropped     20,032      13,555      13,979      16,918
1214 Marked      725,165     726,992     726,823     615,235
1215 Loss rate   1.11%       0.76%       0.78%       0.66%
1216 Throughput  95%         95%         95%         96%

1218 Target Load = 150%
1219             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1220             -------     -------     -------     ----------
1221 Dropped     484,251     483,847     507,727     600,737
1222 Marked      865,905     872,254     873,317     818,451
1223 Loss rate   19.09%      19.13%      19.71%      12.66%
1224 Throughput  99%         98%         99%         99%

1226 Table 5: Simulations with an average flow size of 3 Kbytes, a
1227 100 Mbps link, RED in byte mode, queue in bytes.
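To make the contrast between packet mode and byte mode concrete, the following sketch computes the relative change in the number of marked packets when moving from Standard ECN to ECN+, at a target load of 125%. The counts are copied from Table 1 (packet mode) and Table 5 (byte mode); the function and variable names are illustrative assumptions.

```python
# Marked-packet counts at Target Load = 125%, as (Standard ECN, ECN+).
marked_125 = {
    "packet mode (Table 1)": (418_433, 1_166_450),
    "byte mode (Table 5)": (725_165, 726_992),
}


def relative_increase(baseline, value):
    """Return (value - baseline) / baseline as a fraction."""
    return (value - baseline) / baseline


changes = {
    mode: relative_increase(ecn, ecn_plus)
    for mode, (ecn, ecn_plus) in marked_125.items()
}
for mode, change in changes.items():
    print(f"{mode}: marked packets change by {change:+.1%} with ECN+")
```

With these counts, the number of marked packets more than doubles in packet mode but changes by well under one percent in byte mode, which matches the observation that ECN+ makes little difference when small packets are rarely dropped or marked.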
1229 Target Load = 95%
1230             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1231             -------     -------     -------     ----------
1232 Dropped     142         77          103         99
1233 Marked      11,694      11,387      11,604      12,129
1234 Loss rate   0.1%        0.1%        0.1%        0.1%
1235 Throughput  78%         78%         78%         78%

1237 Target Load = 110%
1238             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1239             -------     -------     -------     ----------
1240 Dropped     338         210         247         274
1241 Marked      41,676      40,412      44,173      36,265
1242 Loss rate   0.2%        0.1%        0.1%        0.1%
1243 Throughput  94%         94%         94%         96%

1245 Target Load = 125%
1246             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1247             -------     -------     -------     ----------
1248 Dropped     1,559       951         978         1,723
1249 Marked      74,933      75,499      75,481      59,670
1250 Loss rate   0.8%        0.5%        0.5%        0.6%
1251 Throughput  99%         99%         99%         96%

1253 Target Load = 150%
1254             ECN         ECN+        ECN+/Wait   ECN+/TryOnce
1255             -------     -------     -------     ----------
1256 Dropped     2,374       1,528       1,515       4,848
1257 Marked      85,739      86,428      86,144      81,350
1258 Loss rate   1.2%        0.8%        0.8%        1.4%
1259 Throughput  99%         98%         98%         98%

1261 Table 6: Simulations with an average flow size of 3 Kbytes, a 10 Mbps
1262 link, RED in byte mode, queue in bytes.

1264 B. Issues of Incremental Deployment

1266 In order for TCP node B to send a SYN/ACK packet as ECN-Capable, node 1267 B must have received an ECN-setup SYN packet from node A. However, 1268 it is possible that node A supports ECN, but either ignores the CE 1269 codepoint on received SYN/ACK packets, or ignores SYN/ACK packets 1270 with the ECT or CE codepoint set. If the TCP initiator ignores the 1271 CE codepoint on received SYN/ACK packets, this would mean that the 1272 TCP responder would not respond to this congestion indication. 1273 However, this seems to us an acceptable cost to pay in the 1274 incremental deployment of ECN-Capability for TCP's SYN/ACK packets. 1275 It would mean that the responder would not reduce the initial 1276 congestion window from two, three, or four segments down to one 1277 segment, as it should.
It also would not send a non-ECN-Capable SYN/ACK 1278 packet to complete the SYN exchange. However, the TCP end nodes 1279 would still respond correctly to any subsequent CE indications on 1280 data packets later on in the connection.

1282 Figure 4 shows an interchange with the SYN/ACK packet ECN-marked, but 1283 with the ECN mark ignored by the TCP originator.

1285 ---------------------------------------------------------------
1286 TCP Node A             Router                  TCP Node B
1287 (initiator)                                    (responder)
1288 ----------             ------                  ----------

1290 ECN-setup SYN packet --->
1291                                 ECN-setup SYN packet --->

1293                           <--- ECN-setup SYN/ACK, ECT
1294                        <--- Sets CE on SYN/ACK
1295 <--- ECN-setup SYN/ACK, CE

1297 Data/ACK, No ECN-Echo --->
1298                                 Data/ACK --->
1299                           <--- Data (up to four packets)
1300 ---------------------------------------------------------------

1302 Figure 4: SYN exchange with the SYN/ACK packet marked,
1303 but with the ECN mark ignored by the TCP initiator.

1305 Thus, to be explicit, when a TCP connection includes an initiator 1306 that supports ECN but *does not* support ECN-Capability for SYN/ACK 1307 packets, in combination with a responder that *does* support ECN- 1308 Capability for SYN/ACK packets, it is possible that the ECN-Capable 1309 SYN/ACK packets will be marked rather than dropped in the network, 1310 and that the responder will not learn about the ECN mark on the 1311 SYN/ACK packet. This would not be a problem if most packets from the 1312 responder supporting ECN for SYN/ACK packets were in long-lived TCP 1313 connections, but it would be more problematic if most of the packets 1314 were from TCP connections consisting of four data packets, and the 1315 TCP responder for these connections was ready to send its data 1316 packets immediately after the SYN/ACK exchange.
Of course, with *severe* congestion, the SYN/ACK packets would likely
be dropped rather than ECN-marked at the congested router, preventing
the TCP responder from adding to the congestion by sending its
initial window of four data packets.

It is also possible that in some older TCP implementations, the
initiator would ignore arriving SYN/ACK packets that had the ECT or
CE codepoint set. This would result in a delay in connection set-up
for that TCP connection, with the initiator re-sending the SYN packet
after a retransmission timeout. We are not aware of any TCP
implementations with this behavior.

One possibility for coping with problems of backwards compatibility
would be for TCP initiators to use a TCP flag that means "I
understand ECN-Capable SYN/ACK packets". If this document were to
standardize the use of such an "ECN-SYN" flag, then the TCP responder
would only send a SYN/ACK packet as ECN-Capable if the incoming SYN
packet had the "ECN-SYN" flag set. An ECN-SYN flag would prevent the
backwards compatibility problems described in the paragraphs above.

One drawback to the use of an ECN-SYN flag is that it would use one
of the four remaining reserved bits in the TCP header, for a
transient backwards compatibility problem. This drawback is limited
by the fact that the "ECN-SYN" flag would be defined only for use
with ECN-setup SYN packets; that bit in the TCP header could be
defined to have other uses for other kinds of TCP packets.

Factors in deciding not to use an ECN-SYN flag include the following:

(1) The limited installed base: At the time that this document was
written, the TCP implementations in Microsoft Vista and Mac OS X
included ECN, but ECN was not enabled by default [SBT07]. Thus,
there was not a large deployed base of ECN-Capable TCP
implementations.
This limits the scope of any backwards compatibility problems.

(2) Limits to the scope of the problem: The backwards compatibility
problem would not be serious enough to cause congestion collapse;
with severe congestion, the buffer at the congested router will
overflow, and the congested router will drop rather than ECN-mark
arriving SYN packets. Some active queue management mechanisms might
switch from packet-marking to packet-dropping in times of high
congestion before buffer overflow, as recommended in Section 19.1 of
RFC 3168 [RFC3168]. This helps to prevent congestion collapse
problems with the use of ECN.

(3) Detection of and response to backwards-compatibility problems: A
TCP responder such as a web server can't differentiate between a
SYN/ACK packet that is not ECN-marked in the network, and a SYN/ACK
packet that is ECN-marked, but where the ECN mark is ignored by the
TCP initiator. However, a TCP responder *can* detect if a SYN/ACK
packet is sent as ECN-Capable and not reported as ECN-marked, but
data packets are dropped or marked from the initial window of data.
We will call this scenario "initial-window-congestion". If a web
server frequently experienced initial-window-congestion (without
SYN/ACK congestion), then the web server *might* be experiencing
backwards compatibility problems with ECN-Capable SYN/ACK packets,
and could respond by not sending SYN/ACK packets as ECN-Capable.

Informative References

[ECN+] A. Kuzmanovic, The Power of Explicit Congestion Notification,
SIGCOMM 2005.

[ECN-SYN] ECN-SYN web page with simulation scripts, URL
"http://www.icir.org/floyd/ecn-syn".

[F07] S.
Floyd, "[BEHAVE] Response of firewalls and middleboxes to
TCP SYN packets that are ECN-Capable?", August 2, 2007, email sent to
the BEHAVE mailing list, URL
"http://www1.ietf.org/mail-archive/web/behave/current/msg02644.html".

[Kelson00] Dax Kelson, note sent to the Linux kernel mailing list,
September 10, 2000.

[L08] A. Landley, "Re: [tcpm] I-D Action:draft-ietf-tcpm-ecnsyn-06.txt",
Email to the tcpm mailing list, August 24, 2008.

[MAF05] A. Medina, M. Allman, and S. Floyd, Measuring the Evolution
of Transport Protocols in the Internet, ACM CCR, April 2005.

[PI] C. Hollot, V. Misra, W. Gong, and D. Towsley, On Designing
Improved Controllers for AQM Routers Supporting TCP Flows, April
1998.

[RED] Floyd, S., and Jacobson, V., Random Early Detection gateways
for Congestion Avoidance, IEEE/ACM Transactions on Networking, V.1
N.4, August 1993.

[REM] S. Athuraliya, V. H. Li, S. H. Low and Q. Yin, REM: Active
Queue Management, IEEE Network, May 2001.

[RFC793] Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
September 1981.

[RFC2309] B. Braden et al., Recommendations on Queue Management and
Congestion Avoidance in the Internet, RFC 2309, April 1998.

[RFC2581] M. Allman, V. Paxson, and W. Stevens, TCP Congestion
Control, RFC 2581, April 1999.

[RFC2988] V. Paxson and M. Allman, Computing TCP's Retransmission
Timer, RFC 2988, November 2000.

[RFC3042] M. Allman, H. Balakrishnan, and S. Floyd, Enhancing TCP's
Loss Recovery Using Limited Transmit, RFC 3042, Proposed Standard,
January 2001.

[RFC3168] K.K. Ramakrishnan, S. Floyd, and D. Black, The Addition of
Explicit Congestion Notification (ECN) to IP, RFC 3168, Proposed
Standard, September 2001.

[RFC3360] S. Floyd, Inappropriate TCP Resets Considered Harmful, RFC
3360, August 2002.

[RFC3390] M. Allman, S. Floyd, and C.
Partridge, Increasing TCP's
Initial Window, RFC 3390, October 2002.

[RFC4987] W. Eddy, TCP SYN Flooding Attacks and Common Mitigations,
RFC 4987, August 2007.

[SCJO01] F. Smith, F. Campos, K. Jeffay, and D. Ott, What TCP/IP
Protocol Headers Can Tell us about the Web, SIGMETRICS, June 2001.

[SYN-COOK] Dan J. Bernstein, SYN cookies, 1997, see also

[SBT07] M. Sridharan, D. Bansal, and D. Thaler, Implementation Report
on Experiences with Various TCP RFCs, Presentation in the TSVAREA,
IETF 68, March 2007. URL
"http://www3.ietf.org/proceedings/07mar/slides/tsvarea-3/sld6.htm".

[Tools] S. Floyd and E. Kohler, Tools for the Evaluation of
Simulation and Testbed Scenarios, Internet-draft
draft-irtf-tmrg-tools-05, work in progress, February 2008.

IANA Considerations

There are no IANA considerations regarding this document.

Authors' Addresses

Aleksandar Kuzmanovic
Phone: +1 (847) 467-5519
Northwestern University
Email: akuzma at northwestern.edu
URL: http://cs.northwestern.edu/~a

Amit Mondal
Northwestern University
Email: a-mondal at northwestern.edu

Sally Floyd
Phone: +1 (510) 666-2989
ICIR (ICSI Center for Internet Research)
Email: floyd@icir.org
URL: http://www.icir.org/floyd/

K. K. Ramakrishnan
Phone: +1 (973) 360-8764
AT&T Labs Research
Email: kkrama at research.att.com
URL: http://www.research.att.com/info/kkrama