idnits 2.17.1 draft-ietf-tcpm-ecnsyn-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 19. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 1445. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1456. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1463. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1469. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year (Using the creation date from RFC3168, updated by this document, for RFC5378 checks: 2000-11-17) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (3 November 2008) is 5646 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Obsolete informational reference (is this intentional?): RFC 2309 (Obsoleted by RFC 7567) -- Obsolete informational reference (is this intentional?): RFC 2581 (Obsoleted by RFC 5681) -- Obsolete informational reference (is this intentional?): RFC 2988 (Obsoleted by RFC 6298) Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force A. Kuzmanovic 2 INTERNET-DRAFT A. Mondal 3 Intended status: Proposed Standard Northwestern University 4 Expires: 3 May 2009 S. Floyd 5 Updates: 3168 ICIR 6 K.K. Ramakrishnan 7 AT&T 8 3 November 2008 10 Adding Explicit Congestion Notification (ECN) Capability 11 to TCP's SYN/ACK Packets 12 draft-ietf-tcpm-ecnsyn-07.txt 14 Status of this Memo 16 By submitting this Internet-Draft, each author represents that any 17 applicable patent or other IPR claims of which he or she is aware 18 have been or will be disclosed, and any of which he or she becomes 19 aware will be disclosed, in accordance with Section 6 of BCP 79. 21 Internet-Drafts are working documents of the Internet Engineering 22 Task Force (IETF), its areas, and its working groups. Note that 23 other groups may also distribute working documents as Internet- 24 Drafts. 26 Internet-Drafts are draft documents valid for a maximum of six months 27 and may be updated, replaced, or obsoleted by other documents at any 28 time. It is inappropriate to use Internet-Drafts as reference 29 material or to cite them other than as "work in progress." 
31 The list of current Internet-Drafts can be accessed at 32 http://www.ietf.org/ietf/1id-abstracts.txt. 34 The list of Internet-Draft Shadow Directories can be accessed at 35 http://www.ietf.org/shadow.html. 37 This Internet-Draft will expire on May 2009. 39 Copyright Notice 41 Copyright (C) The IETF Trust (2008). 43 Abstract 45 This draft specifies a modification to RFC 3168 to allow TCP SYN/ACK 46 packets to be ECN-Capable. For TCP, RFC 3168 only specifies setting 47 an ECN-Capable codepoint on data packets, and not on SYN and SYN/ACK 48 packets. However, because of the high cost to the TCP transfer of 49 having a SYN/ACK packet dropped, with the resulting retransmit 50 timeout, this document specifies the use of ECN for the SYN/ACK 51 packet itself, when sent in response to a SYN packet with the two ECN 52 flags set in the TCP header, indicating a willingness to use ECN. 53 Setting the initial TCP SYN/ACK packet as ECN-Capable can be of great 54 benefit to the TCP connection, avoiding the severe penalty of a 55 retransmit timeout for a connection that has not yet started placing 56 a load on the network. The TCP responder (the sender of the SYN/ACK 57 packet) must reply to a report of an ECN-marked SYN/ACK packet by 58 resending a SYN/ACK packet that is not ECN-Capable. If the resent 59 SYN/ACK packet is acknowledged, then the TCP responder reduces its 60 initial congestion window from two, three, or four segments to one 61 segment, thereby reducing the subsequent load from that connection on 62 the network. If instead the SYN/ACK packet is dropped, or for some 63 other reason the TCP responder does not receive an acknowledgement in 64 the specified time, the TCP responder follows TCP standards for a 65 dropped SYN/ACK packet (setting the retransmit timer). This document 66 updates RFC 3168. 68 Table of Contents 70 1. Introduction ....................................................5 71 2. Conventions and Terminology .....................................7 72 3. 
Specification ...................................................7 73 3.1. SYN/ACK Packets Dropped in the Network .....................8 74 3.2. SYN/ACK Packets ECN-Marked in the Network ..................9 75 3.3. Management Interface ......................................11 76 4. Discussion .....................................................12 77 4.1. Flooding Attacks ..........................................12 78 4.2. The TCP SYN Packet ........................................12 79 4.3. SYN/ACK Packets and Packet Size ...........................13 80 4.4. Response to ECN-marking of SYN/ACK Packets ................13 81 5. Related Work ...................................................15 82 6. Performance Evaluation .........................................16 83 6.1. The Costs and Benefit of Adding ECN-Capability ............16 84 6.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK 85 Packets ........................................................17 86 7. Security Considerations ........................................18 87 7.1. 'Bad' Routers or Middleboxes ..............................18 88 7.2. Congestion Collapse .......................................19 89 8. Conclusions ....................................................19 90 9. Acknowledgements ...............................................20 91 A. Report on Simulations ..........................................20 92 A.1. Simulations with RED in Packet Mode .......................21 93 A.2. Simulations with RED in Byte Mode .........................25 94 B. 
Issues of Incremental Deployment ...............................27 95 Normative References ..............................................30 96 Informative References ............................................30 97 IANA Considerations ...............................................31 98 Full Copyright Statement ..........................................32 99 Intellectual Property .............................................32 101 NOTE TO RFC EDITOR: PLEASE DELETE THIS NOTE UPON PUBLICATION. 103 Changes from draft-ietf-tcpm-ecnsyn-06: 105 * Updated text and simulation results to specify ECN+/TryOnce 106 instead of ECN+. Added tables on CDFs. 108 * Acknowledged Adam's Linux implementation of ECN+/TryOnce. 110 Changes from draft-ietf-tcpm-ecnsyn-05: 112 * Added "Updates: 3168" to the header. Added a reference 113 to RFC 4987. Mild editing. 114 Feedback from Lars's Area Director review. 116 * Updated simulation results with new simulation scripts that 117 don't require any modifications to the ns simulator, and that 118 all use the same seed for generating traffic. The results are 119 somewhat different for the very-high-congestion scenarios 120 (with loss rates of 25% in the absence of ECN-capability 121 for SYN/ACK packets). This is reflected in the simulations with 122 a target load of 125% in Tables 1 and 2. 124 * Added the URL for the web page that has the simulation scripts. 126 Changes from draft-ietf-tcpm-ecnsyn-04: 128 * Updating the copyright date. 130 Changes from draft-ietf-tcpm-ecnsyn-03: 132 * General editing. This includes using the terms "initiator" 133 and "responder" for the two ends of the TCP connection. 134 Feedback from Alfred Hoenes. 136 * Added some text to the backwards compatibility discussion, 137 now in Appendix B, about the pros and cons of using a TCP 138 flag for the TCP initiator to signal that it understands 139 ECN-Capable SYN/ACK packets. The consensus at this time is 140 not to use such a flag. 
Also added a recommendation that 141 TCP implementations include a management interface to turn 142 off the use of ECN for SYN/ACK packets. From email from 143 Bob Briscoe. 145 Changes from draft-ietf-tcpm-ecnsyn-02: 147 * Added to the discussion in the Security section of whether 148 ECN-Capable TCP SYN packets have problems with firewalls, 149 over and above the known problems of TCP data packets 150 (e.g., as in the Microsoft report). From a question raised 151 at the TCPM meeting at the July 2007 IETF. 153 * Added a sentence to the discussion of routers or middleboxes that 154 *might* drop TCP SYN packets on the basis of IP header fields. 155 Feedback from Remi Denis-Courmont. 157 * General editing. Feedback from Alfred Hoenes. 159 Changes from draft-ietf-tcpm-ecnsyn-01: 161 * Changes in response to feedback from Anil Agarwal. 163 * Added a look at the costs of adding ECN-Capability to 164 SYN/ACKs in a highly-congested scenario. 165 From feedback from Mark Allman and Janardhan Iyengar. 167 * Added a comparative evaluation of two possible responses 168 to an ECN-marked SYN/ACK packet. From Mark Allman. 170 Changes from draft-ietf-tcpm-ecnsyn-00: 172 * Only updating the revision number. 174 Changes from draft-ietf-twvsg-ecnsyn-00: 176 * Changed name of draft to draft-ietf-tcpm-ecnsyn. 178 * Added a discussion in Section 3 of "Response to 179 ECN-marking of SYN/ACK packets". Based on 180 suggestions from Mark Allman. 182 * Added a discussion to the Conclusions about adding 183 ECN-capability to relevant set-up packets in other 184 protocols. From a suggestion from Wesley Eddy. 186 * Added a description of SYN exchanges with SYN cookies. 187 From a suggestion from Wesley Eddy. 189 * Added a discussion of one-way data transfers, where the 190 host sending the SYN/ACK packet sends no data packets. 192 * Minor editing, from feedback from Mark Allman and Janardhan 193 Iyengar. 
195 * Future work: a look at the costs of adding 196 ECN-Capability in a worst-case scenario. 197 From feedback from Mark Allman and Janardhan Iyengar. 199 * Future work: a comparative evaluation of two 200 possible responses to an ECN-marked SYN/ACK packet. 202 Changes from draft-kuzmanovic-ecn-syn-00.txt: 204 * Changed name of draft to draft-ietf-twvsg-ecnsyn. 206 END OF NOTE TO RFC EDITOR. 208 1. Introduction 210 TCP's congestion control mechanism has primarily used packet loss as 211 the congestion indication, with packets dropped when buffers 212 overflow. With such tail-drop mechanisms, the packet delay can be 213 high, as the queue at bottleneck routers can be fairly large. 214 Dropping packets only when the queue overflows, and having TCP react 215 only to such losses, results in: 216 1) significantly higher packet delay; 217 2) unnecessarily many packet losses; and 218 3) unfairness due to synchronization effects. 220 The adoption of Active Queue Management (AQM) mechanisms allows 221 better control of bottleneck queues [RFC2309]. This use of AQM has 222 the following potential benefits: 223 1) better control of the queue, with reduced queueing delay; 224 2) fewer packet drops; and 225 3) better fairness because of fewer synchronization effects. 227 With the adoption of ECN, performance may be further improved. When 228 the router detects congestion before buffer overflow, the router can 229 provide a congestion indication either by dropping a packet, or by 230 setting the Congestion Experienced (CE) codepoint in the Explicit 231 Congestion Notification (ECN) field in the IP header [RFC3168]. The 232 IETF has standardized the use of the Congestion Experienced (CE) 233 codepoint in the IP header for routers to indicate congestion. 
For 234 incremental deployment and backwards compatibility, the RFC on ECN 235 [RFC3168] specifies that routers may mark ECN-capable packets that 236 would otherwise have been dropped, using the Congestion Experienced 237 codepoint in the ECN field. The use of ECN allows TCP to react to 238 congestion while avoiding unnecessary retransmit timeouts. Thus, 239 using ECN has several benefits: 241 1) For short transfers, a TCP connection's congestion window may be 242 small. For example, if the current window contains only one packet, 243 and that packet is dropped, TCP will have to wait for a retransmit 244 timeout to recover, reducing its overall throughput. Similarly, if 245 the current window contains only a few packets and one of those 246 packets is dropped, there might not be enough duplicate 247 acknowledgements for a fast retransmission, and the sender of the 248 data packet might have to wait for a delay of several round-trip 249 times using Limited Transmit [RFC3042]. With the use of ECN, short 250 flows are less likely to have packets dropped, sometimes avoiding 251 unnecessary delays or costly retransmit timeouts. 253 2) While longer flows may not see substantially improved throughput 254 with the use of ECN, they may experience lower loss. This may benefit 255 TCP applications that are latency- and loss-sensitive, because of the 256 avoidance of retransmissions. 258 RFC 3168 only specifies marking the Congestion Experienced codepoint 259 on TCP's data packets, and not on SYN and SYN/ACK packets. RFC 3168 260 specifies the negotiation of the use of ECN between the two TCP end- 261 points in the TCP SYN and SYN-ACK exchange, using flags in the TCP 262 header. Erring on the side of being conservative, RFC 3168 does not 263 specify the use of ECN for the first SYN/ACK packet itself. 
However, 264 because of the high cost to the TCP transfer of having a SYN/ACK 265 packet dropped, with the resulting retransmit timeout, this document 266 specifies the use of ECN for the SYN/ACK packet itself. This can be 267 of great benefit to the TCP connection, avoiding the severe penalty 268 of a retransmit timeout for a connection that has not yet started 269 placing a load on the network. The sender of the SYN/ACK packet must 270 respond to a report of an ECN-marked SYN/ACK packet by sending a non- 271 ECN-Capable SYN/ACK packet, and by reducing its initial congestion 272 window from two, three, or four segments to one segment, reducing the 273 subsequent load from that connection on the network. 275 The use of ECN for SYN/ACK packets has the following potential 276 benefits: 277 1) Avoidance of a retransmit timeout; 278 2) Improvement in the throughput of short connections. 280 This draft specifies a modification to RFC 3168 to allow TCP SYN/ACK 281 packets to be ECN-Capable. Section 3 contains the specification of 282 the change, while Section 4 discusses some of the issues, and Section 283 5 discusses related work. Section 6 contains an evaluation of the 284 specified change. 286 2. Conventions and Terminology 288 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 289 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 290 document are to be interpreted as described in [RFC 2119]. 292 We use the following terminology from RFC 3168: 294 The ECN field in the IP header: 295 o CE: the Congestion Experienced codepoint; and 296 o ECT: either one of the two ECN-Capable Transport codepoints. 298 The ECN flags in the TCP header: 299 o CWR: the Congestion Window Reduced flag; and 300 o ECE: the ECN-Echo flag. 302 ECN-setup packets: 303 o ECN-setup SYN packet: a SYN packet with the ECE and CWR flags; 304 o ECN-setup SYN-ACK packet: a SYN-ACK packet with ECE but not CWR. 
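As an illustration of the terminology above, the following Python sketch classifies packets from their header bits. The flag masks and the two-bit ECN field values follow the header layouts given in RFC 3168; the constant and function names are assumptions introduced for this example and do not come from either document.

```python
# TCP header flag bits (flags octet as extended by RFC 3168).
SYN = 0x02
ACK = 0x10
ECE = 0x40  # ECN-Echo
CWR = 0x80  # Congestion Window Reduced

# Two-bit ECN field in the IP header (RFC 3168).
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11

_HANDSHAKE_BITS = SYN | ACK | ECE | CWR


def is_ecn_setup_syn(tcp_flags: int) -> bool:
    """A SYN (without ACK) that carries both ECE and CWR."""
    return (tcp_flags & _HANDSHAKE_BITS) == (SYN | ECE | CWR)


def is_ecn_setup_syn_ack(tcp_flags: int) -> bool:
    """A SYN-ACK that carries ECE but not CWR."""
    return (tcp_flags & _HANDSHAKE_BITS) == (SYN | ACK | ECE)


def is_ect(ecn_field: int) -> bool:
    """True for either of the two ECN-Capable Transport codepoints."""
    return ecn_field in (ECT_0, ECT_1)
```

Masking with all four handshake-related bits before comparing keeps unrelated flags, such as PSH, from affecting the classification.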
306     In this document we use the terms "initiator" and "responder" to
307     refer to the sender of the SYN packet and of the SYN-ACK packet,
308     respectively.

310  3.  Specification

312     This section specifies the modification to RFC 3168 to allow TCP
313     SYN/ACK packets to be ECN-Capable.

315     RFC 3168, in Section 6.1.1, states that "A host MUST NOT set ECT on
316     SYN or SYN-ACK packets." In this section, we specify that a TCP node
317     MAY respond to an initial ECN-setup SYN packet by setting ECT in the
318     responding ECN-setup SYN/ACK packet, indicating to routers that the
319     SYN/ACK packet is ECN-Capable. This allows a congested router along
320     the path to mark the packet instead of dropping the packet as an
321     indication of congestion.

323     Assume that TCP node A transmits to TCP node B an ECN-setup SYN
324     packet, indicating willingness to use ECN for this connection. As
325     specified by RFC 3168, if TCP node B is willing to use ECN, node B
326     responds with an ECN-setup SYN-ACK packet.

328  3.1.  SYN/ACK Packets Dropped in the Network

330     Figure 1 shows an interchange with the SYN/ACK packet dropped by a
331     congested router. Node B waits for a retransmit timeout, and then
332     retransmits the SYN/ACK packet.

334     ---------------------------------------------------------------
335        TCP Node A             Router                  TCP Node B
336        (initiator)                                    (responder)
337        ----------             ------                  ----------

339        ECN-setup SYN packet --->
340                                          ECN-setup SYN packet --->

342                           <--- ECN-setup SYN/ACK, possibly ECT
343                                                 3-second timer set
344                               SYN/ACK dropped               .
345                                                             .
346                                                             .
347                                             3-second timer expires
348                                <--- ECN-setup SYN/ACK, not ECT
349        <--- ECN-setup SYN/ACK
350        Data/ACK --->
351                                                    Data/ACK --->
352                                <--- Data (one to four segments)
353     ---------------------------------------------------------------

355     Figure 1: SYN exchange with the SYN/ACK packet dropped.
357     If the SYN/ACK packet is dropped in the network, the responder (node
358     B) responds by waiting three seconds for the retransmit timer to
359     expire [RFC2988]. If a SYN/ACK packet with the ECT codepoint is
360     dropped, the responder SHOULD resend the SYN/ACK packet without the
361     ECN-Capable codepoint. (Although we are not aware of any middleboxes
362     that drop SYN/ACK packets that contain an ECN-Capable codepoint in
363     the IP header, we have learned to design our protocols defensively in
364     this regard [RFC3360].)

366     We note that if SYN cookies were used by the responder (node B) in
367     the exchange in Figure 1, the responder wouldn't set a timer upon
368     transmission of the SYN/ACK packet [SYN-COOK] [RFC4987]. In this
369     case, if the SYN/ACK packet was lost, the initiator (node A) would
370     have to time out and retransmit the SYN packet in order to trigger
371     another SYN-ACK.

373  3.2.  SYN/ACK Packets ECN-Marked in the Network

375     Figure 2 shows an interchange with the SYN/ACK packet sent as ECN-
376     Capable, and ECN-marked instead of dropped at the congested router.
377     This document specifies ECN+/TryOnce, which differs from the original
378     proposal for ECN+ in [ECN+]; with ECN+/TryOnce, if the TCP responder
379     is informed that the SYN/ACK was ECN-marked, the TCP responder
380     immediately sends a SYN/ACK packet that is not ECN-Capable. The TCP
381     responder is only allowed to send data packets after the TCP
382     initiator reports the receipt of a SYN/ACK packet that is neither
383     marked nor dropped.

385     ---------------------------------------------------------------
386        TCP Node A             Router                  TCP Node B
387        (initiator)                                    (responder)
388        ----------             ------                  ----------

390        ECN-setup SYN packet --->
391                                          ECN-setup SYN packet --->

393                                    <--- ECN-setup SYN/ACK, ECT
394                                                 3-second timer set
395                               <--- Sets CE on SYN/ACK
396        <--- ECN-setup SYN/ACK, CE

398        Data/ACK, ECN-Echo --->
399                                          Data/ACK, ECN-Echo --->
400                                     Window reduced to one segment.
401                            <--- ECN-setup SYN/ACK, CWR, not ECT
402        <--- ECN-setup SYN/ACK, CWR

404        Data/ACK --->
405                                                    Data/ACK --->
406                                     <--- Data (one segment only)
407     ---------------------------------------------------------------

409     Figure 2: SYN exchange with the SYN/ACK packet marked.
410               ECN+/TryOnce.

412     If the initiator (node A) receives a SYN/ACK packet that has been
413     marked by the congested router, with the CE codepoint set, the
414     initiator MUST respond by setting the ECN-Echo flag in the TCP header
415     of the responding ACK packet. However, with ECN+/TryOnce the
416     initiator does not advance from the "SYN-Sent" to the "SYN-Received"
417     state until it receives a SYN/ACK packet that is not ECN-marked. As
418     specified in RFC 3168, the initiator continues to set the ECN-Echo
419     flag in packets until it receives a packet with the CWR flag set.

421     When the responder (node B) receives the ECN-Echo packet reporting
422     the Congestion Experienced indication in the SYN/ACK packet, the
423     responder MUST set the initial congestion window to one segment,
424     instead of two segments as allowed by [RFC2581], or three or four
425     segments allowed by [RFC3390]. In the original proposal for ECN+, if
426     the responder (node B) received an ECN-Echo packet informing it of a
427     Congestion Experienced indication on its SYN/ACK packet, the
428     responder would have been able to send data packets using an initial
429     window of one segment, without waiting for a retransmit timeout. In
430     contrast, this document specifies ECN+/TryOnce, illustrated in Figure
431     2; if the responder (node B) receives an ECN-Echo packet informing it
432     of a Congestion Experienced indication on its SYN/ACK packet, the
433     responder sends a SYN/ACK packet that is not ECN-Capable, in addition
434     to setting the initial window to one segment.
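The responder behaviour described above can be sketched as a small Python model. This is only an illustration under stated assumptions, not an implementation of a TCP stack: the class and method names are hypothetical, the default initial window of three segments is one arbitrary choice among the allowed values of two to four, and carrying the CWR flag over on a timeout retransmission is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SynAck:
    """Hypothetical model of a SYN/ACK; not a wire format."""
    ect: bool  # ECN-Capable codepoint set in the IP header
    cwr: bool  # CWR flag set in the TCP header


@dataclass
class Responder:
    """Hypothetical handshake model for the responder (node B)."""
    default_initial_window: int = 3  # assumed; two to four segments are allowed
    initial_window: Optional[int] = None  # in segments
    may_send_data: bool = False
    sent: List[SynAck] = field(default_factory=list)

    def _send(self, ect: bool, cwr: bool) -> SynAck:
        pkt = SynAck(ect=ect, cwr=cwr)
        self.sent.append(pkt)
        return pkt

    def on_ecn_setup_syn(self) -> SynAck:
        # The first SYN/ACK MAY be sent as ECN-Capable.
        return self._send(ect=True, cwr=False)

    def on_retransmit_timeout(self) -> SynAck:
        # A dropped SYN/ACK is resent without ECT. Carrying the CWR flag
        # over from the previous SYN/ACK is an assumption of this sketch.
        return self._send(ect=False, cwr=self.sent[-1].cwr)

    def on_ack(self, ecn_echo: bool) -> Optional[SynAck]:
        if ecn_echo and self.sent[-1].ect:
            # The ECN-Capable SYN/ACK was CE-marked: reduce the initial
            # window to one segment and try once more, with CWR set and
            # without ECT. No data may be sent yet.
            self.initial_window = 1
            return self._send(ect=False, cwr=True)
        # An unmarked SYN/ACK was acknowledged: data may now be sent.
        if self.initial_window is None:
            self.initial_window = self.default_initial_window
        self.may_send_data = True
        return None
```

In this model, a marked first SYN/ACK yields a second SYN/ACK with CWR set and without ECT, and data is released with a one-segment window only after that second SYN/ACK is acknowledged. An unmarked first SYN/ACK releases data with the default window.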
436     We note that this document updates RFC 3168, which specified that
437     "the sending TCP MUST reset the retransmit timer on receiving the
438     ECN-Echo packet when the congestion window is one." As an update,
439     this document specifies the response of a TCP host to receiving an
440     ECN-Echo packet acknowledging the receipt of an ECN-Capable SYN/ACK
441     packet.

443     RFC 3168 specifies that in response to an ECN-Echo packet, the TCP
444     responder also sets the CWR flag in the TCP header of the next data
445     packet sent, to acknowledge its receipt of and reaction to the ECN-
446     Echo flag. This document updates RFC 3168 by specifying that in
447     response to an ECN-Echo packet acknowledging the receipt of an ECN-
448     Capable SYN/ACK packet, the responder sets the CWR flag in the TCP
449     header of the non-ECN-Capable SYN/ACK packet.

451     ---------------------------------------------------------------
452        TCP Node A             Router                  TCP Node B
453        (initiator)                                    (responder)
454        ----------             ------                  ----------

456        ECN-setup SYN packet --->
457                                          ECN-setup SYN packet --->

459                                    <--- ECN-setup SYN/ACK, ECT
460                               <--- Sets CE on SYN/ACK
461        <--- ECN-setup SYN/ACK, CE

463        Data/ACK, ECN-Echo --->
464                                          Data/ACK, ECN-Echo --->
465                                     Window reduced to one segment.

467                            <--- ECN-setup SYN/ACK, CWR, not ECT
468                                                 3-second timer set
469                               SYN/ACK dropped               .
470                                                             .
471                                                             .
472                                             3-second timer expires
473                            <--- ECN-setup SYN/ACK, CWR, not ECT
474        <--- ECN-setup SYN/ACK, CWR, not ECT
475        Data/ACK --->
476                                                    Data/ACK --->
477                                     <--- Data (one segment only)
478     ---------------------------------------------------------------

480     Figure 3: SYN exchange with the first SYN/ACK packet marked,
481               and the second SYN/ACK packet dropped. ECN+/TryOnce.

483     In contrast to Figure 2, Figure 3 shows an interchange where the
484     first SYN/ACK packet is ECN-marked and the second SYN/ACK packet is
485     dropped in the network. As in Figure 2, the TCP responder sets a
486     timer when the second SYN/ACK packet is sent.
        Figure 3 shows that if
487     the timer expires before the TCP responder receives an
488     acknowledgement from the other end, the TCP responder resends the
489     SYN/ACK packet, following the TCP standards.

491  3.3.  Management Interface

493     The TCP implementation using ECN-Capable SYN/ACK packets SHOULD
494     include a management interface to allow the use of ECN to be turned
495     off for SYN/ACK packets. This is to deal with possible backwards
496     compatibility problems such as those discussed in Appendix B.

498  4.  Discussion

500     The rationale for the specification in this document is the
501     following. When node B receives a TCP SYN packet with the ECN-Echo
502     bit set in the TCP header, this indicates that node A is ECN-capable.
503     If node B is also ECN-capable, there are no obstacles to immediately
504     setting one of the ECN-Capable codepoints in the IP header in the
505     responding TCP SYN/ACK packet.

507     There can be a great benefit in setting an ECN-capable codepoint in
508     SYN/ACK packets, as is discussed further in [ECN+], and reported
509     briefly in Section 5 below. Congestion is most likely to occur in
510     the server-to-client direction. As a result, setting an ECN-capable
511     codepoint in SYN/ACK packets can reduce the occurrence of three-
512     second retransmit timeouts resulting from the drop of SYN/ACK
513     packets.

515  4.1.  Flooding Attacks

517     Setting an ECN-Capable codepoint in the responding TCP SYN/ACK
518     packets does not raise any new or additional security
519     vulnerabilities. For example, provoking servers or hosts to send
520     SYN/ACK packets to a third party in order to perform a "SYN/ACK
521     flood" attack would be highly inefficient. Third parties would
522     immediately drop such packets, since they would know that they didn't
523     generate the TCP SYN packets in the first place. Moreover, such
524     SYN/ACK attacks would have the same signatures as the existing TCP
525     SYN attacks.
Provoking servers or hosts to reply with SYN/ACK packets 526 in order to congest a certain link would also be highly inefficient 527 because SYN/ACK packets are small in size. 529 However, the addition of ECN-Capability to SYN/ACK packets could 530 allow SYN/ACK packets to persist for more hops along a network path 531 before being dropped, thus adding somewhat to the ability of a 532 SYN/ACK attack to flood a network link. 534 4.2. The TCP SYN Packet 536 There are several reasons why an ECN-Capable codepoint MUST NOT be 537 set in the IP header of the initiating TCP SYN packet. First, when 538 the TCP SYN packet is sent, there are no guarantees that the other 539 TCP endpoint (node B in Figure 2) is ECN-capable, or that it would be 540 able to understand and react if the ECN CE codepoint was set by a 541 congested router. 543 Second, the ECN-Capable codepoint in TCP SYN packets could be misused 544 by malicious clients to `improve' the well-known TCP SYN attack. By 545 setting an ECN-Capable codepoint in TCP SYN packets, a malicious host 546 might be able to inject a large number of TCP SYN packets through a 547 potentially congested ECN-enabled router, congesting it even further. 549 For both these reasons, we continue the restriction that the TCP SYN 550 packet MUST NOT have the ECN-Capable codepoint in the IP header set. 552 4.3. SYN/ACK Packets and Packet Size 554 There are a number of router buffer architectures that have smaller 555 dropping rates for small (SYN) packets than for large (data) packets. 556 For example, for a Drop Tail queue in units of packets, where each 557 packet takes a single slot in the buffer regardless of packet size, 558 small and large packets are equally likely to be dropped. However, 559 for a Drop Tail queue in units of bytes, small packets are less 560 likely to be dropped than are large ones. 
Similarly, for RED in 561 packet mode, small and large packets are equally likely to be dropped 562 or marked, while for RED in byte mode, a packet's chance of being 563 dropped or marked is proportional to the packet size in bytes. 565 For a congested router with an AQM mechanism in byte mode, where a 566 packet's chance of being dropped or marked is proportional to the 567 packet size in bytes, the drop or marking rate for TCP SYN/ACK 568 packets should generally be low. In this case, the benefit of making 569 SYN/ACK packets ECN-Capable should be similarly moderate. However, 570 for a congested router with a Drop Tail queue in units of packets or 571 with an AQM mechanism in packet mode, and with no priority queueing 572 for smaller packets, small and large packets should have the same 573 probability of being dropped or marked. In such a case, making 574 SYN/ACK packets ECN-Capable should be of significant benefit. 576 We believe that there are a wide range of behaviors in the real world 577 in terms of the drop or mark behavior at routers as a function of 578 packet size [Tools] (Section 10). We note that all of these 579 alternatives listed above are available in the NS simulator (Drop 580 Tail queues are by default in units of packets, while the default for 581 RED queue management has been changed from packet mode to byte mode). 583 4.4. Response to ECN-marking of SYN/ACK Packets 585 One question is why TCP SYN/ACK packets should be treated differently 586 from other packets in terms of the end node's response to an ECN- 587 marked packet. Section 5 of RFC 3168 specifies the following: 589 "Upon the receipt by an ECN-Capable transport of a single CE packet, 590 the congestion control algorithms followed at the end-systems MUST be 591 essentially the same as the congestion control response to a *single* 592 dropped packet. 
For example, for ECN-Capable TCP the source TCP is 593 required to halve its congestion window for any window of data 594 containing either a packet drop or an ECN indication." 596 In particular, Section 6.1.2 of RFC 3168 specifies that when the TCP 597 congestion window consists of a single packet and that packet is ECN- 598 marked in the network, then the data sender must reduce the sending 599 rate below one packet per round-trip time, by waiting for one RTO 600 before sending another packet. If the RTO was set to the average 601 round-trip time, this would result in halving the sending rate; 602 because the RTO is in fact larger than the average round-trip time, 603 the sending rate is reduced to less than half of its previous value. 605 TCP's congestion control response to the *dropping* of a SYN/ACK 606 packet is to wait a default time before sending another packet. This 607 document argues that ECN gives end-systems a wider range of possible 608 responses to the *marking* of a SYN/ACK packet, and that waiting a 609 default time before sending another packet is not the desired 610 response. 612 On the conservative end, one could assume an effective congestion 613 window of one packet for the SYN/ACK packet, and respond to an ECN- 614 marked SYN/ACK packet by reducing the sending rate to one packet 615 every two round-trip times. As an approximation, the TCP end-node 616 could measure the round-trip time T between the sending of the 617 SYN/ACK packet and the receipt of the acknowledgement, and reply to 618 the acknowledgement of the ECN-marked SYN/ACK packet by waiting T 619 seconds before sending a data packet. 621 However, we note that for an ECN-marked SYN/ACK packet, halving the 622 *congestion window* is not the same as halving the *sending rate*; 623 there is no `sending rate' associated with an ECN-Capable SYN/ACK 624 packet, as such packets are only sent as the first packet in a 625 connection from that host. 
Further, a router's marking of a SYN/ACK
packet is not affected by any past history of that connection.

Adding ECN-Capability to SYN/ACK packets allows the responder to
react to an ECN-marked SYN/ACK packet by setting the initial
congestion window to one packet, instead of its allowed default value
of two, three, or four packets. The responder sends a non-ECN-Capable
SYN/ACK packet, and proceeds with a cautious sending rate of one data
packet per round-trip time after that SYN/ACK packet is acknowledged.
This document argues that this approach is useful to users, with no
dangers of congestion collapse or of starvation of competing traffic.
This is discussed in more detail below in Section 6.2.

We note that if the data transfer is entirely from Node A to Node B,
there is still a difference in performance between the original
mechanism ECN+ and the mechanism ECN+/TryOnce specified in this
document. In particular, with ECN+/TryOnce the TCP originator does
not send data packets until it has received a non-ECN-marked SYN/ACK
packet from the other end.

5. Related Work

The addition of ECN-capability to TCP's SYN/ACK packets was initially
proposed in [ECN+]. The paper includes an extensive set of
simulation and testbed experiments to evaluate the effects of the
proposal, using several Active Queue Management (AQM) mechanisms,
including Random Early Detection (RED) [RED], Random Exponential
Marking (REM) [REM], and Proportional Integrator (PI) [PI]. The
performance measures were the end-to-end response times for each
request/response pair, and the aggregate throughput on the bottleneck
link. The end-to-end response time was computed as the time from the
moment when the request for the file is sent to the server, until
that file is successfully downloaded by the client.
The measurements from [ECN+] show that setting an ECN-Capable
codepoint in the IP packet header in TCP SYN/ACK packets
systematically improves performance with all evaluated AQM schemes.
When SYN/ACK packets at a congested router are ECN-marked instead of
dropped, this can avoid a long initial retransmit timeout, improving
the response time for the affected flow dramatically.

[ECN+] shows that the impact on aggregate throughput can also be
quite significant, because marking SYN/ACK packets can prevent larger
flows from suffering long timeouts before being "admitted" into the
network. In addition, the testbed measurements from [ECN+] show that
web servers setting the ECN-Capable codepoint in TCP SYN/ACK packets
could serve more requests.

As a final step, [ECN+] explores the co-existence of flows that do
and don't set the ECN-capable codepoint in TCP SYN/ACK packets. The
results in [ECN+] show that both types of flows can coexist, with
some performance degradation for flows that don't use ECN+. Flows
that do use ECN+ improve their end-to-end performance. At the same
time, the performance degradation for flows that don't use ECN+, as a
result of the flows that do use ECN+, increases as a greater fraction
of flows use ECN+.

6. Performance Evaluation

6.1. The Costs and Benefits of Adding ECN-Capability

[ECN+] explores the costs and benefits of adding ECN-Capability to
SYN/ACK packets with both simulations and experiments. The addition
of ECN-capability to SYN/ACK packets could be of significant benefit
for those ECN connections that would have had the SYN/ACK packet
dropped in the network, and for which the ECN-Capability would allow
the SYN/ACK to be marked rather than dropped.

The percent of SYN/ACK packets on a link can be quite high.
In
particular, measurements on links dominated by web traffic indicate
that 15-20% of the packets can be SYN/ACK packets [SCJO01].

The benefit of adding ECN-capability to SYN/ACK packets depends in
part on the size of the data transfer. The drop of a SYN/ACK packet
can increase the download time of a short file by an order of
magnitude, by requiring a three-second retransmit timeout. For
longer-lived flows, the effect of a dropped SYN/ACK packet on file
download time is less dramatic. However, even for longer-lived
flows, the addition of ECN-capability to SYN/ACK packets can improve
the fairness among long-lived flows, as newly-arriving flows would be
less likely to have to wait for retransmit timeouts.

One question that arises is what fraction of connections would see
the benefit from making SYN/ACK packets ECN-capable, in a particular
scenario. Specifically:

(1) What fraction of arriving SYN/ACK packets are dropped at the
congested router when the SYN/ACK packets are not ECN-capable?

(2) Of those SYN/ACK packets that are dropped, what fraction would
have been ECN-marked instead of dropped if the SYN/ACK packets had
been ECN-capable?

To answer (1), it is necessary to consider not only the level of
congestion but also the queue architecture at the congested link. As
described in Section 4 above, for some queue architectures small
packets are less likely to be dropped than large ones. In such an
environment, SYN/ACK packets would have lower packet drop rates;
question (1) could not necessarily be inferred from the overall
packet drop rate, but could be answered by measuring the drop rate
for SYN/ACK packets directly. In such an environment, adding
ECN-capability to SYN/ACK packets would be of less dramatic benefit
than in environments where all packets are equally likely to be
dropped regardless of packet size.
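The dependence on queue architecture described above can be
illustrated with a minimal sketch. It assumes a simplified model in
which byte mode scales a base drop/mark probability by the ratio of
the packet size to a maximum packet size; the function name, the
40-byte SYN/ACK size, the 1500-byte maximum, and the base probability
are all assumptions chosen for illustration, not values taken from
this document or from any router implementation.

```python
def drop_or_mark_probability(base_prob, packet_bytes,
                             max_packet_bytes=1500, byte_mode=False):
    """Per-packet drop/mark probability under a simplified AQM model.

    Packet mode: every packet sees the same probability.
    Byte mode:   the probability is proportional to the packet size.
    (Hypothetical helper; all parameter values are illustrative.)
    """
    if not byte_mode:
        return base_prob
    return base_prob * packet_bytes / max_packet_bytes


if __name__ == "__main__":
    SYN_ACK_BYTES = 40    # assumed size of a SYN/ACK packet
    DATA_BYTES = 1500     # assumed size of a full data packet
    BASE = 0.10           # assumed base drop/mark probability

    for byte_mode in (False, True):
        mode = "byte mode" if byte_mode else "packet mode"
        p_syn = drop_or_mark_probability(BASE, SYN_ACK_BYTES,
                                         byte_mode=byte_mode)
        p_data = drop_or_mark_probability(BASE, DATA_BYTES,
                                          byte_mode=byte_mode)
        print(f"{mode}: SYN/ACK {p_syn:.4f}, data {p_data:.4f}")
```

Under these assumptions a small SYN/ACK packet is far less likely to
be dropped or marked in byte mode than in packet mode, which is why
the overall drop rate alone may not answer question (1).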
As question (2) implies, even if all of the SYN/ACK packets were
ECN-capable, there could still be some SYN/ACK packets dropped
instead of marked at the congested link; the full answer to question
(2) depends on the details of the queue management mechanism at the
router. If congestion is sufficiently bad, and the queue management
mechanism cannot prevent the buffer from overflowing, then SYN/ACK
packets will be dropped rather than marked upon buffer overflow
whether or not they are ECN-capable.

For some AQM mechanisms, ECN-capable packets are marked instead of
dropped any time this is possible, that is, any time the buffer is
not yet full. For other AQM mechanisms however, such as the RED
mechanism as recommended in [RED], packets are dropped rather than
marked when the packet drop/mark rate exceeds a certain threshold,
e.g., 10%, even if the packets are ECN-capable. For a router with
such an AQM mechanism, when congestion is sufficiently severe to
cause a high drop/mark rate, some SYN/ACK packets would be dropped
instead of marked whether or not they were ECN-capable.

Thus, the degree of benefit of adding ECN-Capability to SYN/ACK
packets depends not only on the overall packet drop rate in the
network, but also on the queue management architecture at the
congested link.

6.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK Packets

This document specifies that the end-node responds to the report of
an ECN-marked SYN/ACK packet by setting the initial congestion window
to one segment, instead of its possible default value of two to four
segments, and resending a SYN/ACK packet that is not ECN-Capable. We
call this ECN+/TryOnce.

However, Section 4 discussed two other possible responses to an
ECN-marked SYN/ACK packet.
In ECN+, the original proposal from [ECN+],
the end node responds to the report of an ECN-marked SYN/ACK packet
by setting the initial congestion window to one segment and
immediately sending a data packet, if it has one to send. In
ECN+/Wait, the end node responds to the report of an ECN-marked
SYN/ACK packet by setting the initial congestion window to one
segment and waiting an RTT before sending a data packet.

Simulations comparing the performance with Standard ECN (without
ECN-marked SYN/ACK packets), ECN+, ECN+/Wait, and ECN+/TryOnce show
little difference, in terms of aggregate congestion, between ECN+ and
ECN+/Wait. However, for some scenarios with queues that are
packet-based rather than byte-based, and with packet drop rates above
25% without ECN+, the use of ECN+ or of ECN+/Wait can more than
double the packet drop rates, to greater than 50%. The details are
given in Tables 1 and 3 of Appendix A below. ECN+/TryOnce does not
increase the packet drop rate in scenarios of high congestion.
Therefore, ECN+/TryOnce is superior to ECN+ or to ECN+/Wait, which
both significantly increase the packet drop rate in scenarios of high
congestion. At the same time, ECN+/TryOnce gives a performance
improvement similar to that of ECN+ or ECN+/Wait (Tables 2 and 4 of
Appendix A).

Our conclusions are that ECN+/TryOnce is safe, and has significant
benefits to the user, and avoids the problems of ECN+ or ECN+/Wait
under extreme levels of congestion. As a consequence, this document
specifies the use of ECN+/TryOnce.

[Note: We only discovered the occasional congestion-related problems
of ECN+ and of ECN+/Wait when re-running the simulations with an
updated version of the ns-2 simulator, after the internet-draft had
almost completed the standardization process.]

7. Security Considerations

TCP packets carrying the ECT codepoint in IP headers can be marked
rather than dropped by ECN-capable routers. This raises several
security concerns that we discuss below.

7.1. 'Bad' Routers or Middleboxes

There are a number of known deployment problems from using ECN with
TCP traffic in the Internet. The first reported problem, dating back
to 2000, is of a small but decreasing number of routers or
middleboxes that reset a TCP connection in response to TCP SYN
packets using flags in the TCP header to negotiate ECN-capability
[Kelson00] [RFC3360] [MAF05]. Dave Thaler reported at the March 2007
IETF on two new problems encountered by TCP connections using ECN;
the first of the two problems concerns routers that crash when a TCP
data packet arrives with the ECN field in the IP header with the
codepoint ECT(0) or ECT(1), indicating that an ECN-Capable connection
has been established [SBT07].

While there is no evidence that any routers or middleboxes drop
SYN/ACK packets that contain an ECN-Capable or CE codepoint in the IP
header, such behavior cannot be excluded. (There seem to be a
number of routers or middleboxes that drop TCP SYN packets that
contain known or unknown IP options [MAF05] (Figure 1).) Thus, as
specified in Section 3, if a SYN/ACK packet with the ECT or CE
codepoint is dropped, the TCP node SHOULD resend the SYN/ACK packet
without the ECN-Capable codepoint. There is also no evidence that
any routers or middleboxes crash when a SYN/ACK arrives with an
ECN-Capable or CE codepoint in the IP header (over and above the
routers already known to crash when a data packet arrives with either
ECT(0) or ECT(1)), but we have not conducted any measurement studies
of this [F07].

7.2. Congestion Collapse

Because TCP SYN/ACK packets carrying an ECT codepoint could be
ECN-marked instead of dropped at an ECN-capable router, the concern
is whether this can either invoke congestion, or worsen performance
in highly congested scenarios. However, after learning that a SYN/ACK
packet was ECN-marked, the responder sends a SYN/ACK packet that is
not ECN-Capable; if this SYN/ACK packet is dropped, the responder
then waits for a retransmission timeout, as specified in the TCP
standards. In addition, routers are free to drop rather than mark
arriving packets in times of high congestion, regardless of whether
the packets are ECN-capable. When congestion is very high and a
router's buffer is full, the router has no choice but to drop rather
than to mark an arriving packet.

The simulations reported in Appendix A show that even with demanding
traffic mixes dominated by short flows and high levels of congestion,
the aggregate packet dropping rates are not significantly different
with Standard ECN or with ECN+/TryOnce. However, in our simulations,
we have one scenario where ECN+ or ECN+/Wait results in a
significantly higher packet drop rate than ECN or ECN+/TryOnce
(Tables 1 and 3 in Appendix A below).

8. Conclusions

This draft specifies a modification to RFC 3168 to allow TCP nodes to
send SYN/ACK packets as being ECN-Capable. Making the SYN/ACK packet
ECN-Capable avoids the high cost to a TCP transfer when a SYN/ACK
packet is dropped by a congested router, by avoiding the resulting
retransmit timeout. This improves the throughput of short
connections.
This document specifies the ECN+/TryOnce mechanism for
ECN-Capability for SYN/ACK packets, where the sender of the SYN/ACK
packet responds to an ECN mark by reducing its initial congestion
window from two, three, or four segments to one segment, and sending
a SYN/ACK packet that is not ECN-Capable. The addition of
ECN-capability to SYN/ACK packets is particularly beneficial in the
server-to-client direction, where congestion is more likely to occur.
In this case, the initial information provided by the ECN marking in
the SYN/ACK packet enables the server to appropriately adjust the
initial load it places on the network, while avoiding the delay of a
retransmit timeout.

9. Acknowledgements

We thank Anil Agarwal, Mark Allman, Remi Denis-Courmont, Wesley Eddy,
Lars Eggert, Alfred Hoenes, Janardhan Iyengar, and Pasi Sarolahti for
feedback on earlier versions of this draft. We thank Adam Langley
[L08] for contributing a patch for ECN+/TryOnce for the Linux
development tree.

A. Report on Simulations

This section reports on simulations showing the costs of adding ECN+
in highly-congested scenarios. This section also reports on
simulations for a comparative evaluation between ECN, ECN+,
ECN+/Wait, and ECN+/TryOnce.

The simulations are run with a range of file-size distributions,
using the PackMime traffic generator in the ns-2 simulator. They all
use a heavy-tailed distribution of file sizes. The simulations
reported in the tables below use a mean file size of 3 KBytes, to
show the results with a traffic mix with a large number of small
transfers. Other simulations were run with mean file sizes of 5
KBytes, 7 KBytes, 14 KBytes, and 17 KBytes. The title of each chart
gives the targeted average load from the traffic generator.
Because
the simulations use a heavy-tailed distribution of file sizes, and
run for only 85 seconds (including ten seconds of warm-up time), the
actual load is often much smaller than the targeted load. The
congested link is 100 Mbps. RED is run in gentle mode, and arriving
ECN-Capable packets are only dropped instead of marked if the buffer
is full (and the router has no choice).

We explore three possible mechanisms for a TCP node's response to a
report of an ECN-marked SYN/ACK packet. With ECN+, the TCP node
sends a data packet immediately (with an initial congestion window of
one segment). With ECN+/Wait, the TCP node waits a round-trip time
before sending a data packet; the responder already has one
measurement of the round-trip time when the acknowledgement for the
SYN/ACK packet is received. With ECN+/TryOnce, the mechanism
standardized in this document, the TCP responder replies to a report
of an ECN-marked SYN/ACK packet by sending a SYN/ACK packet that is
not ECN-Capable, and reducing the initial congestion window to one
segment.

The simulation scripts are available on [ECN-SYN], along with graphs
showing the distribution of response times for the TCP connections.

A.1. Simulations with RED in Packet Mode

The simulations with RED in packet mode and with the queue in packets
show that ECN+ is useful in times of moderate or of high congestion.
However, for the simulations with a target load of 125%, with a
packet loss rate of over 25% for ECN, ECN+ and ECN+/Wait both result
in a packet loss rate of over 50%. (In contrast, the packet loss
rate with ECN+/TryOnce is less than that of ECN alone.) For the
distribution of response times, the simulations show that ECN+,
ECN+/Wait, and ECN+/TryOnce all significantly improve the response
times, compared to the response times with plain ECN.
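The three responder mechanisms compared in this appendix can be
summarized in a small sketch. This is only an illustration of the
descriptions above, not the simulation code: the function name, the
action strings, and the default initial window of three segments
(one of the two-to-four-segment defaults mentioned in this document)
are assumptions, and retransmit timeouts are not modeled.

```python
def responder_plan(mechanism, mark_reported, default_initial_window=3):
    """Return (initial_window_segments, ordered_actions) for the
    responder once the ACK of its ECN-Capable SYN/ACK arrives.

    Hypothetical helper: the names and strings are illustrative only.
    """
    if not mark_reported:
        # No congestion indication: proceed with the default window.
        return default_initial_window, ["send data"]

    if mechanism == "ECN+":
        return 1, ["send data"]
    if mechanism == "ECN+/Wait":
        return 1, ["wait one round-trip time", "send data"]
    if mechanism == "ECN+/TryOnce":
        return 1, ["send SYN/ACK that is not ECN-Capable",
                   "wait for its acknowledgement",
                   "send data"]
    raise ValueError(f"unknown mechanism: {mechanism}")


if __name__ == "__main__":
    for name in ("ECN+", "ECN+/Wait", "ECN+/TryOnce"):
        window, actions = responder_plan(name, mark_reported=True)
        print(f"{name}: initial window {window}; " + " -> ".join(actions))
```

In all three cases the initial congestion window is reduced to one
segment; the mechanisms differ only in what the responder does before
sending its first data packet.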
Table 1 shows the congestion levels for simulations with RED in
packet mode, with a queue in packets. To explore a worst-case
scenario, these simulations use a traffic mix with an unrealistically
small flow size distribution, with a mean flow size of 3 KBytes. For
each table showing a particular traffic load, the four rows show the
number of packets dropped, the number of packets ECN-marked, the
aggregate packet drop rate, and the aggregate throughput, and the
four columns show the simulations with Standard ECN, ECN+, ECN+/Wait,
and ECN+/TryOnce.

These simulations were run with RED set to mark instead of drop
packets any time that the queue is not full. This is a worst-case
scenario for ECN+ and its variants. For the default implementation
of RED in the ns-2 simulator, when the average queue size exceeds a
configured threshold, the router drops all arriving packets. For
scenarios with this RED mechanism, it is less likely that ECN+ or
one of its variants would increase the average queue size above the
configured threshold.

The usefulness of ECN+: The first thing to observe is that for all of
the simulations, the use of ECN+ or ECN+/Wait significantly increases
the number of packets marked. In contrast, the use of ECN+/TryOnce
significantly increases the number of packets marked in the
simulations with moderate congestion, and gives a more moderate
increase in the number of packets marked for the simulations with
higher levels of congestion. However, the cumulative distribution
function (CDF) in Table 2 shows that ECN+, ECN+/Wait, and
ECN+/TryOnce all improve response times for all of the simulations,
with moderate or with larger levels of congestion.
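The difference between the two RED configurations mentioned above
(mark whenever the queue is not full, versus drop all arrivals once
the average queue size exceeds a configured threshold) can be sketched
as a decision function. This is a simplified illustration under
stated assumptions, not the ns-2 code: the function and parameter
names are hypothetical, and the packet is assumed to have already
been selected by RED for a drop or mark.

```python
def red_action(ect, buffer_full, avg_queue, drop_threshold,
               mark_whenever_possible):
    """Decide whether a packet selected by RED is dropped or marked.

    ect:                    packet carries an ECN-Capable codepoint
    buffer_full:            no room left in the queue
    avg_queue:              current average queue size
    drop_threshold:         configured threshold (hypothetical units)
    mark_whenever_possible: True for the configuration used in these
                            simulations, False for the
                            drop-above-threshold behavior
    """
    if buffer_full:
        return "drop"      # the router has no choice
    if not ect:
        return "drop"      # non-ECN-Capable packets cannot be marked
    if not mark_whenever_possible and avg_queue > drop_threshold:
        return "drop"      # threshold exceeded: drop even ECT packets
    return "mark"


if __name__ == "__main__":
    # An ECN-Capable SYN/ACK arriving while the average queue is high.
    for flag in (True, False):
        print(flag, red_action(ect=True, buffer_full=False,
                               avg_queue=90, drop_threshold=60,
                               mark_whenever_possible=flag))
```

With marking whenever possible, ECN-Capable SYN/ACK packets keep
being marked under heavy load, which is the worst case for ECN+ and
its variants that these simulations explore.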
Little increase in congestion, sometimes: The second thing to observe
is that for the simulations with low or moderate levels of congestion
(that is, with packet drop rates less than 10%), the use of ECN+,
ECN+/Wait, and ECN+/TryOnce all decrease the aggregate packet drop
rate, relative to the simulations with ECN. This makes sense, since
with low or moderate levels of congestion, ECN+ allows SYN/ACK
packets to be marked instead of dropped, and the use of ECN+ doesn't
add to the aggregate congestion. However, for the simulations with
packet drop rates of 15% or higher with ECN, the use of ECN+ or
ECN+/Wait increases the aggregate packet drop rate, sometimes even
doubling it.

Comparing ECN+, ECN+/Wait, and ECN+/TryOnce: The aggregate packet
drop rate is generally higher with ECN+/Wait than with ECN+. Thus,
there is no congestion-related reason to prefer ECN+/Wait over ECN+.
In contrast, the aggregate packet drop rate with ECN+/TryOnce is
often significantly lower than the aggregate packet drop rate with
ECN, ECN+, or ECN+/Wait.
Target Load = 95%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped          20,516      11,226      11,735        16,446
Marked           30,586      37,741      37,425        40,530
Loss rate         1.41%       0.78%       0.81%         1.01%
Throughput          81%         81%         81%           81%

Target Load = 110%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped         165,566     106,083     147,180       218,594
Marked          179,735     281,306     308,473       242,969
Loss rate         9.01%       6.12%       8.02%         7.14%
Throughput          92%         92%         92%           94%

Target Load = 125%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped         600,628   1,746,768   2,176,530       650,781
Marked          418,433   1,166,450   1,164,932       440,432
Loss rate        25.45%      51.73%      56.87%        18.22%
Throughput          94%         98%         97%           95%

Target Load = 150%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped       1,449,945  1,565,0517  1,563,0801     1,372,067
Marked          669,840     583,378     591,315       675,290
Loss rate         46.7%       59.0%       59.0%         32.3%
Throughput          88%         94%         94%           93%

Table 1: Simulations with an average flow size of 3 Kbytes, a
100 Mbps link, RED in packet mode, queue in packets.
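As a worked check of the "more than double" observation, the loss
rates from the Target Load = 125% block of Table 1 can be compared
against Standard ECN. The short script below only restates numbers
from the table; the variable and function names are illustrative.

```python
# Loss rates (percent) from Table 1, Target Load = 125%.
LOSS_RATE_125 = {
    "ECN": 25.45,
    "ECN+": 51.73,
    "ECN+/Wait": 56.87,
    "ECN+/TryOnce": 18.22,
}


def ratio_to_standard_ecn(loss_rates):
    """Return each mechanism's loss rate divided by the ECN loss rate."""
    baseline = loss_rates["ECN"]
    return {name: rate / baseline for name, rate in loss_rates.items()}


if __name__ == "__main__":
    for name, ratio in ratio_to_standard_ecn(LOSS_RATE_125).items():
        print(f"{name:<13} {ratio:.2f}x the Standard ECN loss rate")
```

A ratio above 2 for ECN+ and ECN+/Wait, and below 1 for ECN+/TryOnce,
corresponds to the behavior described in Section 6.2.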
Target Load = 95%:

TIME:   10  100  200  300  400  500 1000 2000 3000 4000 5000
      ------------------------------------------------------
ECN:  0.00 0.07 0.26 0.51 0.82 0.96 0.97 0.97 0.97 1.00 1.00
ECN+: 0.00 0.07 0.27 0.53 0.85 0.99 1.00 1.00 1.00 1.00 1.00
Wait: 0.00 0.07 0.26 0.51 0.83 0.97 1.00 1.00 1.00 1.00 1.00
Once: 0.00 0.07 0.24 0.49 0.83 0.97 1.00 1.00 1.00 1.00 1.00

Target Load = 110%:

TIME:   10  100  200  300  400  500 1000 2000 3000 4000 5000
      ------------------------------------------------------
ECN:  0.00 0.05 0.19 0.41 0.67 0.79 0.80 0.80 0.80 0.96 0.96
ECN+: 0.00 0.07 0.22 0.48 0.81 0.96 1.00 1.00 1.00 1.00 1.00
Wait: 0.00 0.05 0.18 0.38 0.64 0.77 0.95 1.00 1.00 1.00 1.00
Once: 0.00 0.06 0.19 0.41 0.70 0.86 0.95 0.96 0.96 0.99 0.99

Target Load = 125%:

TIME:   10  100  200  300  400  500 1000 2000 3000 4000 5000
      ------------------------------------------------------
ECN:  0.00 0.04 0.13 0.27 0.46 0.56 0.58 0.59 0.59 0.82 0.82
ECN+: 0.00 0.06 0.18 0.33 0.58 0.76 0.97 0.99 0.99 1.00 1.00
Wait: 0.00 0.01 0.06 0.13 0.21 0.27 0.68 0.98 0.99 1.00 1.00
Once: 0.00 0.05 0.16 0.34 0.58 0.73 0.85 0.87 0.87 0.95 0.96

Target Load = 150%:

TIME:   10  100  200  300  400  500 1000 2000 3000 4000 5000
      ------------------------------------------------------
ECN:  0.00 0.03 0.08 0.18 0.31 0.39 0.42 0.42 0.43 0.68 0.68
ECN+: 0.00 0.06 0.18 0.39 0.67 0.81 0.83 0.84 0.84 0.93 0.93
Wait: 0.00 0.06 0.18 0.39 0.67 0.81 0.83 0.84 0.84 0.93 0.94
Once: 0.00 0.04 0.13 0.28 0.47 0.60 0.72 0.75 0.76 0.88 0.89

Table 2: The cumulative distribution function (CDF) for transfer
times, for simulations with an average flow size of 3 Kbytes, a
100 Mbps link, RED in packet mode, queue in packets. (The graphs are
available from "http://www.icir.org/floyd/ecn-syn/".)
Target Load = 95%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped           8,448       6,362       7,740        16,323
Marked            9,891      16,787      17,456        17,186
Loss rate          5.5%        4.3%        5.0%          5.4%
Throughput          78%         78%         78%           82%

Target Load = 110%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped          31,284      29,773      49,297        42,201
Marked           28,429      54,729      60,383        33,672
Loss rate         15.3%       15.2%       21.9%         13.5%
Throughput          97%         96%         96%           95%

Target Load = 125%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped          61,433     176,682     214,096        79,463
Marked           44,408     119,728     117,301        48,991
Loss rate         25.4%       51.9%       56.0%         22.5%
Throughput          97%         98%         98%           95%

Target Load = 150%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped         130,007     251,856     326,845       141,418
Marked           63,066     146,757     147,239        67,772
Loss rate         42.5%       61.3%       67.3%         33.3%
Throughput          93%         99%         99%           94%

Table 3: Simulations with an average flow size of 3 Kbytes, a 10 Mbps
link, RED in packet mode, queue in packets.
Target Load = 95%:

TIME:   10  100  200  300  400  500 1000 2000 3000 4000 5000
      ------------------------------------------------------
ECN:  0.00 0.05 0.18 0.42 0.70 0.86 0.88 0.88 0.88 0.98 0.98
ECN+: 0.00 0.06 0.20 0.45 0.78 0.96 1.00 1.00 1.00 1.00 1.00
Wait: 0.00 0.05 0.18 0.40 0.68 0.84 0.96 1.00 1.00 1.00 1.00
Once: 0.00 0.05 0.18 0.39 0.69 0.87 0.96 0.96 0.96 0.99 0.99

Target Load = 110%:

TIME:   10  100  200  300  400  500 1000 2000 3000 4000 5000
      ------------------------------------------------------
ECN:  0.00 0.03 0.13 0.29 0.52 0.66 0.69 0.69 0.69 0.91 0.91
ECN+: 0.00 0.05 0.17 0.36 0.66 0.88 0.98 0.99 1.00 1.00 1.00
Wait: 0.00 0.02 0.08 0.20 0.35 0.47 0.76 0.98 1.00 1.00 1.00
Once: 0.00 0.04 0.15 0.33 0.59 0.76 0.89 0.91 0.91 0.98 0.98

Target Load = 125%:

TIME:   10  100  200  300  400  500 1000 2000 3000 4000 5000
      ------------------------------------------------------
ECN:  0.00 0.03 0.10 0.22 0.40 0.52 0.56 0.56 0.57 0.82 0.82
ECN+: 0.00 0.03 0.14 0.27 0.49 0.70 0.96 0.99 0.99 0.99 1.00
Wait: 0.00 0.00 0.03 0.07 0.12 0.18 0.50 0.94 0.99 0.99 1.00
Once: 0.00 0.04 0.13 0.29 0.51 0.66 0.81 0.84 0.84 0.94 0.94

Target Load = 150%:

TIME:   10  100  200  300  400  500 1000 2000 3000 4000 5000
      ------------------------------------------------------
ECN:  0.00 0.02 0.07 0.15 0.28 0.38 0.42 0.42 0.43 0.67 0.68
ECN+: 0.00 0.00 0.00 0.00 0.01 0.05 0.68 0.83 0.95 0.97 0.98
Wait: 0.00 0.00 0.00 0.00 0.00 0.00 0.10 0.62 0.83 0.93 0.97
Once: 0.00 0.03 0.11 0.23 0.42 0.56 0.71 0.74 0.74 0.87 0.88

Table 4: The cumulative distribution function (CDF) for transfer
times, for simulations with an average flow size of 3 Kbytes, a
10 Mbps link, RED in packet mode, queue in packets. (The graphs are
available from "http://www.icir.org/floyd/ecn-syn/".)

A.2. Simulations with RED in Byte Mode

Table 5 below shows simulations with RED in byte mode and the queue
in bytes. There is no significant increase in aggregate congestion
with the use of ECN+, ECN+/Wait, or ECN+/TryOnce.

However, unlike the simulations with RED in packet mode, the
simulations with RED in byte mode show little benefit from the use of
ECN+ or ECN+/Wait, in that the packet marking rate with ECN+ or
ECN+/Wait is not much different than the packet marking rate with
Standard ECN. This is because with RED in byte mode, small packets
like SYN/ACK packets are rarely dropped or marked - that is, there is
no drawback from the use of ECN+ in these scenarios, but not much
need for ECN+ either, in a scenario where small packets are unlikely
to be dropped or marked.

Target Load = 95%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped             766         446         427           408
Marked           32,683      34,289      33,412        31,892
Loss rate         0.05%       0.03%       0.03%         0.03%
Throughput          81%         81%         81%           81%

Target Load = 110%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped           2,496       2,110       1,733         2,024
Marked          220,573     258,696     230,955       224,338
Loss rate         0.15%       0.13%       0.11%         0.11%
Throughput          92%         91%         92%           92%

Target Load = 125%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped          20,032      13,555      13,979        19,544
Marked          725,165     726,992     726,823       627,088
Loss rate         1.11%       0.76%       0.78%         0.72%
Throughput          95%         95%         95%           95%

Target Load = 150%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped         484,251     483,847     507,727       572,373
Marked          865,905     872,254     873,317       816,841
Loss rate        19.09%      19.13%      19.71%        12.28%
Throughput          99%         98%         99%           99%

Table 5: Simulations with an average flow size of 3 Kbytes, a
100 Mbps link, RED in byte mode, queue in bytes.
Target Load = 95%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped             142          77         103            99
Marked           11,694      11,387      11,604        12,129
Loss rate          0.1%        0.1%        0.1%          0.1%
Throughput          78%         78%         78%           78%

Target Load = 110%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped             338         210         247           292
Marked           41,676      40,412      44,173        37,527
Loss rate          0.2%        0.1%        0.1%          0.1%
Throughput          94%         94%         94%           95%

Target Load = 125%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped           1,559         951         978         1,490
Marked           74,933      75,499      75,481        57,721
Loss rate          0.8%        0.5%        0.5%          0.5%
Throughput          99%         99%         99%           96%

Target Load = 150%:
                    ECN        ECN+   ECN+/Wait  ECN+/TryOnce
                -------     -------     -------    ----------
Dropped           2,374       1,528       1,515         4,517
Marked           85,739      86,428      86,144        81,695
Loss rate          1.2%        0.8%        0.8%          1.3%
Throughput          99%         98%         98%           98%

Table 6: Simulations with an average flow size of 3 Kbytes, a 10 Mbps
link, RED in byte mode, queue in bytes.

B. Issues of Incremental Deployment

In order for TCP node B to send a SYN/ACK packet as ECN-Capable, node
B must have received an ECN-setup SYN packet from node A. However,
it is possible that node A supports ECN, but either ignores the CE
codepoint on received SYN/ACK packets, or ignores SYN/ACK packets
with the ECT or CE codepoint set. If the TCP initiator ignores the
CE codepoint on received SYN/ACK packets, this would mean that the
TCP responder would not respond to this congestion indication.
However, this seems to us an acceptable cost to pay in the
incremental deployment of ECN-Capability for TCP's SYN/ACK packets.
It would mean that the responder would not reduce the initial
congestion window from two, three, or four segments down to one
segment, as it should, and would not send a non-ECN-Capable SYN/ACK
packet to complete the SYN exchange. However, the TCP end nodes
would still respond correctly to any subsequent CE indications on
data packets later on in the connection.

Figure 4 shows an interchange with the SYN/ACK packet ECN-marked, but
with the ECN mark ignored by the TCP originator.

---------------------------------------------------------------
   TCP Node A             Router                  TCP Node B
   (initiator)                                    (responder)
   ----------             ------                  ----------

   ECN-setup SYN packet --->
                                        ECN-setup SYN packet --->

                                   <--- ECN-setup SYN/ACK, ECT
                          <--- Sets CE on SYN/ACK
   <--- ECN-setup SYN/ACK, CE

   Data/ACK, No ECN-Echo --->
                                        Data/ACK --->
                                   <--- Data (up to four packets)
---------------------------------------------------------------

Figure 4: SYN exchange with the SYN/ACK packet marked,
but with the ECN mark ignored by the TCP initiator.

Thus, to be explicit, when a TCP connection includes an initiator
that supports ECN but *does not* support ECN-Capability for SYN/ACK
packets, in combination with a responder that *does* support
ECN-Capability for SYN/ACK packets, it is possible that the
ECN-Capable SYN/ACK packets will be marked rather than dropped in the
network, and that the responder will not learn about the ECN mark on
the SYN/ACK packet. This would not be a problem if most packets from
the responder supporting ECN for SYN/ACK packets were in long-lived
TCP connections, but it would be more problematic if most of the
packets were from TCP connections consisting of four data packets,
and the TCP responder for these connections was ready to send its
data packets immediately after the SYN/ACK exchange.
Of course, with
*severe* congestion, the SYN/ACK packets would likely be dropped
rather than ECN-marked at the congested router, preventing the TCP
responder from adding to the congestion by sending its initial window
of four data packets.

It is also possible that in some older TCP implementations, the
initiator would ignore arriving SYN/ACK packets that had the ECT or
CE codepoint set. This would result in a delay in connection set-up
for that TCP connection, with the initiator re-sending the SYN packet
after a retransmit timeout. We are not aware of any TCP
implementations with this behavior.

One possibility for coping with problems of backwards compatibility
would be for TCP initiators to use a TCP flag that means "I
understand ECN-Capable SYN/ACK packets". If this document were to
standardize the use of such an "ECN-SYN" flag, then the TCP responder
would only send a SYN/ACK packet as ECN-capable if the incoming SYN
packet had the "ECN-SYN" flag set. An ECN-SYN flag would prevent the
backwards compatibility problems described in the paragraphs above.

One drawback to the use of an ECN-SYN flag is that it would use one
of the four remaining reserved bits in the TCP header, for a
transient backwards compatibility problem. This drawback is limited
by the fact that the "ECN-SYN" flag would be defined only for use
with ECN-setup SYN packets; that bit in the TCP header could be
defined to have other uses for other kinds of TCP packets.

Factors in deciding not to use an ECN-SYN flag include the following:

(1) The limited installed base: At the time that this document was
written, the TCP implementations in Microsoft Vista and Mac OS X
included ECN, but ECN was not enabled by default [SBT07]. Thus,
there was not a large deployed base of ECN-Capable TCP
implementations.
   This limits the scope of any backwards compatibility problems.

   (2) Limits to the scope of the problem: The backwards compatibility
   problem would not be serious enough to cause congestion collapse;
   with severe congestion, the buffer at the congested router will
   overflow, and the congested router will drop rather than ECN-mark
   arriving SYN packets. Some active queue management mechanisms might
   switch from packet-marking to packet-dropping in times of high
   congestion before buffer overflow, as recommended in Section 19.1 of
   RFC 3168. This helps to prevent congestion collapse problems with
   the use of ECN.

   (3) Detection of and response to backwards-compatibility problems: A
   TCP responder such as a web server cannot differentiate between a
   SYN/ACK packet that is not ECN-marked in the network, and a SYN/ACK
   packet that is ECN-marked, but where the ECN mark is ignored by the
   TCP initiator. However, a TCP responder *can* detect if a SYN/ACK
   packet is sent as ECN-capable and not reported as ECN-marked, but
   data packets are dropped or marked from the initial window of data.
   We will call this scenario "initial-window-congestion". If a web
   server frequently experienced initial-window-congestion (without
   SYN/ACK congestion), then the web server *might* be experiencing
   backwards compatibility problems with ECN-Capable SYN/ACK packets,
   and could respond by not sending SYN/ACK packets as ECN-Capable.

Normative References

   [RFC 2119] S. Bradner, Key words for use in RFCs to Indicate
   Requirement Levels, RFC 2119, March 1997.

   [RFC3168] K.K. Ramakrishnan, S. Floyd, and D. Black, The Addition of
   Explicit Congestion Notification (ECN) to IP, RFC 3168, Proposed
   Standard, September 2001.

Informative References

   [ECN+] A. Kuzmanovic, The Power of Explicit Congestion Notification,
   SIGCOMM 2005.

   [ECN-SYN] ECN-SYN web page with simulation scripts, URL
   "http://www.icir.org/floyd/ecn-syn".

   [F07] S. Floyd, "[BEHAVE] Response of firewalls and middleboxes to
   TCP SYN packets that are ECN-Capable?", August 2, 2007, email sent to
   the BEHAVE mailing list, URL
   "http://www1.ietf.org/mail-archive/web/behave/current/msg02644.html".

   [Kelson00] Dax Kelson, note sent to the Linux kernel mailing list,
   September 10, 2000.

   [L08] A. Landley, "Re: [tcpm] I-D Action:
   draft-ietf-tcpm-ecnsyn-06.txt", Email to the tcpm mailing list,
   August 24, 2008.

   [MAF05] A. Medina, M. Allman, and S. Floyd, Measuring the Evolution
   of Transport Protocols in the Internet, ACM CCR, April 2005.

   [PI] C. Hollot, V. Misra, W. Gong, and D. Towsley, On Designing
   Improved Controllers for AQM Routers Supporting TCP Flows, April
   1998.

   [RED] S. Floyd and V. Jacobson, Random Early Detection gateways for
   Congestion Avoidance, IEEE/ACM Transactions on Networking, V.1 N.4,
   August 1993.

   [REM] S. Athuraliya, V. H. Li, S. H. Low and Q. Yin, REM: Active
   Queue Management, IEEE Network, May 2001.

   [RFC2309] B. Braden et al., Recommendations on Queue Management and
   Congestion Avoidance in the Internet, RFC 2309, April 1998.

   [RFC2581] M. Allman, V. Paxson, and W. Stevens, TCP Congestion
   Control, RFC 2581, April 1999.

   [RFC2988] V. Paxson and M. Allman, Computing TCP's Retransmission
   Timer, RFC 2988, November 2000.

   [RFC3042] M. Allman, H. Balakrishnan, and S. Floyd, Enhancing TCP's
   Loss Recovery Using Limited Transmit, RFC 3042, Proposed Standard,
   January 2001.

   [RFC3360] S. Floyd, Inappropriate TCP Resets Considered Harmful, RFC
   3360, August 2002.

   [RFC3390] M. Allman, S. Floyd, and C. Partridge, Increasing TCP's
   Initial Window, RFC 3390, October 2002.

   [RFC4987] W. Eddy, TCP SYN Flooding Attacks and Common Mitigations,
   RFC 4987, August 2007.

   [SCJO01] F. Smith, F. Campos, K. Jeffay, and D. Ott, What TCP/IP
   Protocol Headers Can Tell us about the Web, SIGMETRICS, June 2001.

   [SYN-COOK] Dan J. Bernstein, SYN cookies, 1997, see also

   [SBT07] M. Sridharan, D. Bansal, and D. Thaler, Implementation Report
   on Experiences with Various TCP RFCs, Presentation in the TSVAREA,
   IETF 68, March 2007. URL
   "http://www3.ietf.org/proceedings/07mar/slides/tsvarea-3/sld6.htm".

   [Tools] S. Floyd and E. Kohler, Tools for the Evaluation of
   Simulation and Testbed Scenarios, Internet-draft
   draft-irtf-tmrg-tools-05, work in progress, February 2008.

IANA Considerations

   There are no IANA considerations regarding this document.

Authors' Addresses

   Aleksandar Kuzmanovic
   Phone: +1 (847) 467-5519
   Northwestern University
   Email: akuzma at northwestern.edu
   URL: http://cs.northwestern.edu/~a

   Amit Mondal
   Northwestern University
   Email: a-mondal at northwestern.edu

   Sally Floyd
   Phone: +1 (510) 666-2989
   ICIR (ICSI Center for Internet Research)
   Email: floyd@icir.org
   URL: http://www.icir.org/floyd/

   K. K. Ramakrishnan
   Phone: +1 (973) 360-8764
   AT&T Labs Research
   Email: kkrama at research.att.com
   URL: http://www.research.att.com/info/kkrama

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.