idnits 2.17.1 draft-ietf-tcpm-ecnsyn-06.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 19. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 1249. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1260. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1267. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1273. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year (Using the creation date from RFC3168, updated by this document, for RFC5378 checks: 2000-11-17) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (22 August 2008) is 5698 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Obsolete informational reference (is this intentional?): RFC 2309 (Obsoleted by RFC 7567) -- Obsolete informational reference (is this intentional?): RFC 2581 (Obsoleted by RFC 5681) -- Obsolete informational reference (is this intentional?): RFC 2988 (Obsoleted by RFC 6298) Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force A. Kuzmanovic 2 INTERNET-DRAFT A. Mondal 3 Intended status: Proposed Standard Northwestern University 4 Expires: 22 February 2009 S. Floyd 5 Updates: 3168 ICIR 6 K.K. Ramakrishnan 7 AT&T 8 22 August 2008 10 Adding Explicit Congestion Notification (ECN) Capability 11 to TCP's SYN/ACK Packets 12 draft-ietf-tcpm-ecnsyn-06.txt 14 Status of this Memo 16 By submitting this Internet-Draft, each author represents that any 17 applicable patent or other IPR claims of which he or she is aware 18 have been or will be disclosed, and any of which he or she becomes 19 aware will be disclosed, in accordance with Section 6 of BCP 79. 21 Internet-Drafts are working documents of the Internet Engineering 22 Task Force (IETF), its areas, and its working groups. Note that 23 other groups may also distribute working documents as Internet- 24 Drafts. 26 Internet-Drafts are draft documents valid for a maximum of six months 27 and may be updated, replaced, or obsoleted by other documents at any 28 time. It is inappropriate to use Internet-Drafts as reference 29 material or to cite them other than as "work in progress." 
31 The list of current Internet-Drafts can be accessed at 32 http://www.ietf.org/ietf/1id-abstracts.txt. 34 The list of Internet-Draft Shadow Directories can be accessed at 35 http://www.ietf.org/shadow.html. 37 This Internet-Draft will expire on 22 February 2009. 39 Copyright Notice 41 Copyright (C) The IETF Trust (2008). 43 Abstract 45 This draft specifies a modification to RFC 3168 to allow TCP SYN/ACK 46 packets to be ECN-Capable. For TCP, RFC 3168 only specifies setting 47 an ECN-Capable codepoint on data packets, and not on SYN and SYN/ACK 48 packets. However, because of the high cost to the TCP transfer of 49 having a SYN/ACK packet dropped, with the resulting retransmit 50 timeout, this document specifies the use of ECN for the SYN/ACK 51 packet itself, when sent in response to a SYN packet with the two ECN 52 flags set in the TCP header, indicating a willingness to use ECN. 53 Setting TCP SYN/ACK packets as ECN-Capable can be of great benefit to 54 the TCP connection, avoiding the severe penalty of a retransmit 55 timeout for a connection that has not yet started placing a load on 56 the network. The sender of the SYN/ACK packet must respond to a 57 report of an ECN-marked SYN/ACK packet by reducing its initial 58 congestion window from two, three, or four segments to one segment, 59 thereby reducing the subsequent load from that connection on the 60 network. This document updates RFC 3168. 62 Table of Contents 64 1. Introduction 65 2. Conventions and Terminology 66 3. Specification 67 3.1. SYN/ACK Packets Dropped in the Network 68 3.2. SYN/ACK Packets ECN-Marked in the Network 69 3.3. Management Interface 70 4. Discussion 71 4.1. 
Flooding Attacks 72 4.2. The TCP SYN Packet 73 4.3. SYN/ACK Packets and Packet Size 74 4.4. Response to ECN-marking of SYN/ACK Packets 75 5. Related Work 76 6. Performance Evaluation 77 6.1. The Costs and Benefit of Adding ECN-Capability 78 6.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK 79 Packets 80 7. Security Considerations 81 7.1. 'Bad' Routers or Middleboxes 82 7.2. Congestion Collapse 83 8. Conclusions 84 9. Acknowledgements 85 A. Report on Simulations 86 A.1. Simulations with RED in Packet Mode 87 A.2. Simulations with RED in Byte Mode 88 B. Issues of Incremental Deployment 89 Normative References 90 Informative References 91 IANA Considerations 92 Full Copyright Statement 93 Intellectual Property 95 NOTE TO RFC EDITOR: PLEASE DELETE THIS NOTE UPON PUBLICATION. 97 Changes from draft-ietf-tcpm-ecnsyn-05: 99 * Added "Updates: 3168" to the header. Added a reference 100 to RFC 4987. Mild editing. 101 Feedback from Lars's Area Director review. 
103 * Updated simulation results with new simulation scripts that 104 don't require any modifications to the ns simulator, and that 105 all use the same seed for generating traffic. The results are 106 somewhat different for the very-high-congestion scenarios 107 (with loss rates of 25% in the absence of ECN-capability 108 for SYN/ACK packets). This is reflected in the simulations with 109 a target load of 125% in Tables 1 and 2. 111 * Added the URL for the web page that has the simulation scripts. 113 Changes from draft-ietf-tcpm-ecnsyn-04: 115 * Updating the copyright date. 117 Changes from draft-ietf-tcpm-ecnsyn-03: 119 * General editing. This includes using the terms "initiator" 120 and "responder" for the two ends of the TCP connection. 121 Feedback from Alfred Hoenes. 123 * Added some text to the backwards compatibility discussion, 124 now in Appendix B, about the pros and cons of using a TCP 125 flag for the TCP initiator to signal that it understands 126 ECN-Capable SYN/ACK packets. The consensus at this time is 127 not to use such a flag. Also added a recommendation that 128 TCP implementations include a management interface to turn 129 off the use of ECN for SYN/ACK packets. From email from 130 Bob Briscoe. 132 Changes from draft-ietf-tcpm-ecnsyn-02: 134 * Added to the discussion in the Security section of whether 135 ECN-Capable TCP SYN packets have problems with firewalls, 136 over and above the known problems of TCP data packets 137 (e.g., as in the Microsoft report). From a question raised 138 at the TCPM meeting at the July 2007 IETF. 140 * Added a sentence to the discussion of routers or middleboxes that 141 *might* drop TCP SYN packets on the basis of IP header fields. 142 Feedback from Remi Denis-Courmont. 144 * General editing. Feedback from Alfred Hoenes. 146 Changes from draft-ietf-tcpm-ecnsyn-01: 148 * Changes in response to feedback from Anil Agarwal. 
150 * Added a look at the costs of adding ECN-Capability to 151 SYN/ACKs in a highly-congested scenario. 152 From feedback from Mark Allman and Janardhan Iyengar. 154 * Added a comparative evaluation of two possible responses 155 to an ECN-marked SYN/ACK packet. From Mark Allman. 157 Changes from draft-ietf-tcpm-ecnsyn-00: 159 * Only updating the revision number. 161 Changes from draft-ietf-twvsg-ecnsyn-00: 163 * Changed name of draft to draft-ietf-tcpm-ecnsyn. 165 * Added a discussion in Section 3 of "Response to 166 ECN-marking of SYN/ACK packets". Based on 167 suggestions from Mark Allman. 169 * Added a discussion to the Conclusions about adding 170 ECN-capability to relevant set-up packets in other 171 protocols. From a suggestion from Wesley Eddy. 173 * Added a description of SYN exchanges with SYN cookies. 174 From a suggestion from Wesley Eddy. 176 * Added a discussion of one-way data transfers, where the 177 host sending the SYN/ACK packet sends no data packets. 179 * Minor editing, from feedback from Mark Allman and Janardhan 180 Iyengar. 182 * Future work: a look at the costs of adding 183 ECN-Capability in a worst-case scenario. 184 From feedback from Mark Allman and Janardhan Iyengar. 186 * Future work: a comparative evaluation of two 187 possible responses to an ECN-marked SYN/ACK packet. 189 Changes from draft-kuzmanovic-ecn-syn-00.txt: 191 * Changed name of draft to draft-ietf-twvsg-ecnsyn. 193 END OF NOTE TO RFC EDITOR. 195 1. Introduction 197 TCP's congestion control mechanism has primarily used packet loss as 198 the congestion indication, with packets dropped when buffers 199 overflow. With such tail-drop mechanisms, the packet delay can be 200 high, as the queue at bottleneck routers can be fairly large. 
201 Dropping packets only when the queue overflows, and having TCP react 202 only to such losses, results in: 203 1) significantly higher packet delay; 204 2) unnecessarily many packet losses; and 205 3) unfairness due to synchronization effects. 207 The adoption of Active Queue Management (AQM) mechanisms allows 208 better control of bottleneck queues [RFC2309]. This use of AQM has 209 the following potential benefits: 210 1) better control of the queue, with reduced queueing delay; 211 2) fewer packet drops; and 212 3) better fairness because of fewer synchronization effects. 214 With the adoption of ECN, performance may be further improved. When 215 the router detects congestion before buffer overflow, the router can 216 provide a congestion indication either by dropping a packet, or by 217 setting the Congestion Experienced (CE) codepoint in the Explicit 218 Congestion Notification (ECN) field in the IP header [RFC3168]. The 219 IETF has standardized the use of the Congestion Experienced (CE) 220 codepoint in the IP header for routers to indicate congestion. For 221 incremental deployment and backwards compatibility, the RFC on ECN 222 [RFC3168] specifies that routers may mark ECN-capable packets that 223 would otherwise have been dropped, using the Congestion Experienced 224 codepoint in the ECN field. The use of ECN allows TCP to react to 225 congestion while avoiding unnecessary retransmissions and, in some 226 cases, unnecessary retransmit timeouts. Thus, using ECN has several 227 benefits: 229 1) For short transfers, a TCP connection's congestion window may be 230 small. For example, if the current window contains only one packet, 231 and that packet is dropped, TCP will have to wait for a retransmit 232 timeout to recover, reducing its overall throughput. 
Similarly, if 233 the current window contains only a few packets and one of those 234 packets is dropped, there might not be enough duplicate 235 acknowledgements for a fast retransmission, and the sender of the 236 data packet might have to wait for a delay of several round-trip 237 times using Limited Transmit [RFC3042]. With the use of ECN, short 238 flows are less likely to have packets dropped, sometimes avoiding 239 unnecessary delays or costly retransmit timeouts. 241 2) While longer flows may not see substantially improved throughput 242 with the use of ECN, they experience lower loss. This may benefit TCP 243 applications that are latency- and loss-sensitive, because of the 244 avoidance of retransmissions. 246 RFC 3168 only specifies marking the Congestion Experienced codepoint 247 on TCP's data packets, and not on SYN and SYN/ACK packets. RFC 3168 248 specifies the negotiation of the use of ECN between the two TCP end- 249 points in the TCP SYN and SYN-ACK exchange, using flags in the TCP 250 header. Erring on the side of being conservative, RFC 3168 does not 251 specify the use of ECN for the SYN/ACK packet itself. However, 252 because of the high cost to the TCP transfer of having a SYN/ACK 253 packet dropped, with the resulting retransmit timeout, this document 254 specifies the use of ECN for the SYN/ACK packet itself. This can be 255 of great benefit to the TCP connection, avoiding the severe penalty 256 of a retransmit timeout for a connection that has not yet started 257 placing a load on the network. The sender of the SYN/ACK packet must 258 respond to a report of an ECN-marked SYN/ACK packet by reducing its 259 initial congestion window from two, three, or four segments to one 260 segment, reducing the subsequent load from that connection on the 261 network. 263 The use of ECN for SYN/ACK packets has the following potential 264 benefits: 265 1) Avoidance of a retransmit timeout; 266 2) Improvement in the throughput of short connections. 
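The responder behaviour described above, together with the retransmission rule given in Section 3.1, can be illustrated with a minimal sketch. This is not taken from any TCP implementation: the class name `Responder`, its fields, and both method names are hypothetical, and the default initial window of three segments is just one of the permitted values assumed for the example.

```python
from dataclasses import dataclass

ONE_SEGMENT = 1  # initial window mandated after an ECN-marked SYN/ACK


@dataclass
class Responder:
    """Hypothetical model of the SYN/ACK sender; not a real TCP stack."""

    default_initial_window: int = 3  # segments; assumed to be 2, 3, or 4
    synack_transmissions: int = 0

    def next_synack_is_ect(self, syn_was_ecn_setup: bool) -> bool:
        """Return True if the next SYN/ACK may carry an ECT codepoint."""
        self.synack_transmissions += 1
        # Only the first SYN/ACK answering an ECN-setup SYN is sent as
        # ECN-Capable; a SYN/ACK retransmitted after a timeout is sent
        # without ECT (the draft's SHOULD, modelled here as always).
        return syn_was_ecn_setup and self.synack_transmissions == 1

    def initial_window(self, ack_has_ecn_echo: bool) -> int:
        """Return the initial congestion window in segments."""
        # An ECN-Echo on the ACK of the SYN/ACK reports that the SYN/ACK
        # was CE-marked, so the initial window is cut to one segment.
        if ack_has_ecn_echo:
            return ONE_SEGMENT
        return self.default_initial_window


responder = Responder(default_initial_window=4)
first_is_ect = responder.next_synack_is_ect(syn_was_ecn_setup=True)
retransmit_is_ect = responder.next_synack_is_ect(syn_was_ecn_setup=True)
marked_window = responder.initial_window(ack_has_ecn_echo=True)
unmarked_window = responder.initial_window(ack_has_ecn_echo=False)
```

The sketch deliberately leaves out timers and the CWR flag; it only captures the two decisions the responder makes around the SYN/ACK packet.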
268 This draft specifies ECN+, a modification to RFC 3168 to allow TCP 269 SYN/ACK packets to be ECN-Capable. Section 3 contains the 270 specification of the change, while Section 4 discusses some of the 271 issues, and Section 5 discusses related work. Section 6 contains an 272 evaluation of the specified change. 274 2. Conventions and Terminology 276 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 277 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 278 document are to be interpreted as described in [RFC2119]. 280 We use the following terminology from RFC 3168: 282 The ECN field in the IP header: 283 o CE: the Congestion Experienced codepoint; and 284 o ECT: either one of the two ECN-Capable Transport codepoints. 286 The ECN flags in the TCP header: 287 o CWR: the Congestion Window Reduced flag; and 288 o ECE: the ECN-Echo flag. 290 ECN-setup packets: 291 o ECN-setup SYN packet: a SYN packet with the ECE and CWR flags set; 292 o ECN-setup SYN-ACK packet: a SYN-ACK packet with ECE set but not CWR. 294 In this document we use the terms "initiator" and "responder" to 295 refer to the sender of the SYN packet and of the SYN-ACK packet, 296 respectively. 298 3. Specification 300 This section specifies the modification to RFC 3168 to allow TCP 301 SYN/ACK packets to be ECN-Capable. 303 Section 6.1.1 of RFC 3168 states that "A host MUST NOT set ECT on 304 SYN or SYN-ACK packets." In this section, we specify that a TCP node 305 MAY respond to an ECN-setup SYN packet by setting ECT in the 306 responding ECN-setup SYN/ACK packet, indicating to routers that the 307 SYN/ACK packet is ECN-Capable. This allows a congested router along 308 the path to mark the packet instead of dropping the packet as an 309 indication of congestion. 311 Assume that TCP node A transmits to TCP node B an ECN-setup SYN 312 packet, indicating willingness to use ECN for this connection. 
As 313 specified by RFC 3168, if TCP node B is willing to use ECN, node B 314 responds with an ECN-setup SYN-ACK packet. 316 3.1. SYN/ACK Packets Dropped in the Network 318 Figure 1 shows an interchange with the SYN/ACK packet dropped by a 319 congested router. Node B waits for a retransmit timeout, and then 320 retransmits the SYN/ACK packet.

   ---------------------------------------------------------------
      TCP Node A             Router                  TCP Node B
      ----------             ------                  ----------

      ECN-setup SYN packet --->
                                        ECN-setup SYN packet --->

                            <--- ECN-setup SYN/ACK, possibly ECT
                                                3-second timer set
                           SYN/ACK dropped               .
                                                         .
                                                         .
                                              3-second timer expires
                                 <--- ECN-setup SYN/ACK, not ECT
      <--- ECN-setup SYN/ACK
      Data/ACK --->
                                                    Data/ACK --->
                           <--- Data (one to four segments)
   ---------------------------------------------------------------

342 Figure 1: SYN exchange with the SYN/ACK packet dropped. 344 If the SYN/ACK packet is dropped in the network, the responder (node 345 B) responds by waiting three seconds for the retransmit timer to 346 expire [RFC2988]. If a SYN/ACK packet with the ECT codepoint is 347 dropped, the responder SHOULD resend the SYN/ACK packet without the 348 ECN-Capable codepoint. (Although we are not aware of any middleboxes 349 that drop SYN/ACK packets that contain an ECN-Capable codepoint in 350 the IP header, we have learned to design our protocols defensively in 351 this regard [RFC3360].) 353 We note that if SYN cookies were used by the responder (node B) in 354 the exchange in Figure 1, the responder wouldn't set a timer upon 355 transmission of the SYN/ACK packet [SYN-COOK] [RFC4987]. In this 356 case, if the SYN/ACK packet was lost, the initiator (Node A) would 357 have to time out and retransmit the SYN packet in order to trigger 358 another SYN-ACK. 360 3.2. 
SYN/ACK Packets ECN-Marked in the Network 362 Figure 2 shows an interchange with the SYN/ACK packet sent as ECN- 363 Capable, and ECN-marked instead of dropped at the congested router.

   ---------------------------------------------------------------
      TCP Node A             Router                  TCP Node B
      ----------             ------                  ----------

      ECN-setup SYN packet --->
                                        ECN-setup SYN packet --->

                                      <--- ECN-setup SYN/ACK, ECT
                           <--- Sets CE on SYN/ACK
      <--- ECN-setup SYN/ACK, CE

      Data/ACK, ECN-Echo --->
                                          Data/ACK, ECN-Echo --->
                                      Window reduced to one segment.
                                 <--- Data, CWR (one segment only)
   ---------------------------------------------------------------

382 Figure 2: SYN exchange with the SYN/ACK packet marked. 384 If the initiator (node A) receives a SYN/ACK packet that has been 385 marked by the congested router, with the CE codepoint set, the 386 initiator MUST respond by setting the ECN-Echo flag in the TCP header 387 of the responding ACK packet. As specified in RFC 3168, the 388 initiator continues to set the ECN-Echo flag in packets until it 389 receives a packet with the CWR flag set. 391 When the responder (node B) receives the ECN-Echo packet reporting 392 the Congestion Experienced indication in the SYN/ACK packet, the 393 responder MUST set the initial congestion window to one segment, 394 instead of two segments as allowed by [RFC2581], or three or four 395 segments allowed by [RFC3390]. If the responder (node B) was going 396 to use an initial window of one segment, and receives an ECN-Echo 397 packet informing it of a Congestion Experienced indication on its 398 SYN/ACK packet, the responder MAY continue to send with an initial 399 window of one segment, without waiting for a retransmit timeout. We 400 note that this updates RFC 3168, which specifies that "the sending 401 TCP MUST reset the retransmit timer on receiving the ECN-Echo packet 402 when the congestion window is one." 
As specified by RFC 3168, the 403 responder (node B) also sets the CWR flag in the TCP header of the 404 next data packet sent, to acknowledge its receipt of and reaction to 405 the ECN-Echo flag. 407 If the data transfer in Figure 2 is entirely from Node A to Node B, 408 then data packets from Node A continue to set the ECN-Echo flag in 409 data packets, waiting for the CWR flag from Node B acknowledging a 410 response to the ECN-Echo flag. 412 3.3. Management Interface 414 The TCP implementation using ECN-Capable SYN/ACK packets SHOULD 415 include a management interface to allow the use of ECN to be turned 416 off for SYN/ACK packets. This is to deal with possible backwards 417 compatibility problems such as those discussed in Appendix B. 419 4. Discussion 421 The rationale for the specification in this document is the 422 following. When node B receives a TCP SYN packet with ECN-Echo bit 423 set in the TCP header, this indicates that node A is ECN-capable. If 424 node B is also ECN-capable, there are no obstacles to immediately 425 setting one of the ECN-Capable codepoints in the IP header in the 426 responding TCP SYN/ACK packet. 428 There can be a great benefit in setting an ECN-capable codepoint in 429 SYN/ACK packets, as is discussed further in [ECN+], and reported 430 briefly in Section 5 below. Congestion is most likely to occur in 431 the server-to-client direction. As a result, setting an ECN-capable 432 codepoint in SYN/ACK packets can reduce the occurrence of three- 433 second retransmit timeouts resulting from the drop of SYN/ACK 434 packets. 436 4.1. Flooding Attacks 438 Setting an ECN-Capable codepoint in the responding TCP SYN/ACK 439 packets does not raise any novel security vulnerabilities. For 440 example, provoking servers or hosts to send SYN/ACK packets to a 441 third party in order to perform a "SYN/ACK flood" attack would be 442 highly inefficient. 
Third parties would immediately drop such 443 packets, since they would know that they didn't generate the TCP SYN 444 packets in the first place. Moreover, such SYN/ACK attacks would 445 have the same signatures as the existing TCP SYN attacks. Provoking 446 servers or hosts to reply with SYN/ACK packets in order to congest a 447 certain link would also be highly inefficient because SYN/ACK packets 448 are small in size. 450 However, the addition of ECN-Capability to SYN/ACK packets could 451 allow SYN/ACK packets to persist for more hops along a network path 452 before being dropped, thus adding somewhat to the ability of a 453 SYN/ACK attack to flood a network link. 455 4.2. The TCP SYN Packet 457 There are several reasons why an ECN-Capable codepoint MUST NOT be 458 set in the IP header of the initiating TCP SYN packet. First, when 459 the TCP SYN packet is sent, there are no guarantees that the other 460 TCP endpoint (node B in Figure 2) is ECN-capable, or that it would be 461 able to understand and react if the ECN CE codepoint was set by a 462 congested router. 464 Second, the ECN-Capable codepoint in TCP SYN packets could be misused 465 by malicious clients to `improve' the well-known TCP SYN attack. By 466 setting an ECN-Capable codepoint in TCP SYN packets, a malicious host 467 might be able to inject a large number of TCP SYN packets through a 468 potentially congested ECN-enabled router, congesting it even further. 470 For both these reasons, we continue the restriction that the TCP SYN 471 packet MUST NOT have the ECN-Capable codepoint in the IP header set. 473 4.3. SYN/ACK Packets and Packet Size 475 There are a number of router buffer architectures that have smaller 476 dropping rates for small (SYN) packets than for large (data) packets. 477 For example, for a Drop Tail queue in units of packets, where each 478 packet takes a single slot in the buffer regardless of packet size, 479 small and large packets are equally likely to be dropped. 
However, 480 for a Drop Tail queue in units of bytes, small packets are less 481 likely to be dropped than are large ones. Similarly, for RED in 482 packet mode, small and large packets are equally likely to be dropped 483 or marked, while for RED in byte mode, a packet's chance of being 484 dropped or marked is proportional to the packet size in bytes. 486 For a congested router with an AQM mechanism in byte mode, where a 487 packet's chance of being dropped or marked is proportional to the 488 packet size in bytes, the drop or marking rate for TCP SYN/ACK 489 packets should generally be low. In this case, the benefit of making 490 SYN/ACK packets ECN-Capable should be similarly moderate. However, 491 for a congested router with a Drop Tail queue in units of packets or 492 with an AQM mechanism in packet mode, and with no priority queueing 493 for smaller packets, small and large packets should have the same 494 probability of being dropped or marked. In such a case, making 495 SYN/ACK packets ECN-Capable should be of significant benefit. 497 We believe that there is a wide range of behaviors in the real world 498 in terms of the drop or mark behavior at routers as a function of 499 packet size [Tools] (Section 10). We note that all of the 500 alternatives listed above are available in the NS simulator (Drop 501 Tail queues are by default in units of packets, while the default for 502 RED queue management has been changed from packet mode to byte mode). 504 4.4. Response to ECN-marking of SYN/ACK Packets 506 One question is why TCP SYN/ACK packets should be treated differently 507 from other packets in terms of the end node's response to an ECN- 508 marked packet. Section 5 of RFC 3168 specifies the following: 510 "Upon the receipt by an ECN-Capable transport of a single CE packet, 511 the congestion control algorithms followed at the end-systems MUST be 512 essentially the same as the congestion control response to a *single* 513 dropped packet. 
For example, for ECN-Capable TCP the source TCP is 514 required to halve its congestion window for any window of data 515 containing either a packet drop or an ECN indication." 517 In particular, Section 6.1.2 of RFC 3168 specifies that when the TCP 518 congestion window consists of a single packet and that packet is ECN- 519 marked in the network, then the data sender must reduce the sending 520 rate below one packet per round-trip time, by waiting for one RTO 521 before sending another packet. If the RTO was set to the average 522 round-trip time, this would result in halving the sending rate; 523 because the RTO is in fact larger than the average round-trip time, 524 the sending rate is reduced to less than half of its previous value. 526 TCP's congestion control response to the *dropping* of a SYN/ACK 527 packet is to wait a default time before sending another packet. This 528 document argues that ECN gives end-systems a wider range of possible 529 responses to the *marking* of a SYN/ACK packet, and that waiting a 530 default time before sending a data packet is not the desired 531 response. 533 On the conservative end, one could assume an effective congestion 534 window of one packet for the SYN/ACK packet, and respond to an ECN- 535 marked SYN/ACK packet by reducing the sending rate to one packet 536 every two round-trip times. As an approximation, the TCP end-node 537 could measure the round-trip time T between the sending of the 538 SYN/ACK packet and the receipt of the acknowledgement, and reply to 539 the acknowledgement of the ECN-marked SYN/ACK packet by waiting T 540 seconds before sending a data packet. 542 However, we note that for an ECN-marked SYN/ACK packet, halving the 543 *congestion window* is not the same as halving the *sending rate*; 544 there is no `sending rate' associated with an ECN-Capable SYN/ACK 545 packet, as such packets are only sent as the first packet in a 546 connection from that host. 
Further, a router's marking of a SYN/ACK 547 packet is not affected by any past history of that connection. 549 Adding ECN-Capability to SYN/ACK packets allows the simple response 550 of the responder setting the initial congestion window to one packet, 551 instead of its allowed default value of two, three, or four packets, 552 with the responder proceeding with a cautious sending rate of one 553 packet per round-trip time. If that data packet is ECN-marked or 554 dropped, then the responder will wait an RTO before sending another 555 packet. This document argues that this approach is useful to users, 556 with no dangers of congestion collapse or of starvation of competing 557 traffic. This is discussed in more detail below in Section 6.2. In 558 particular, Section 6.2 discusses simulation results that support the 559 responder's specified behavior of setting the initial congestion 560 window to one packet in response to an ECN-marked SYN/ACK packet. 562 We note that if the data transfer is entirely from Node A to Node B, 563 then there is no effective difference between the two possible 564 responses to an ECN-marked SYN/ACK packet outlined above. In either 565 case, Node B sends no data packets, only sending acknowledgement 566 packets in response to received data packets. 568 5. Related Work 570 The addition of ECN-capability to TCP's SYN/ACK packets was proposed 571 in [ECN+]. The paper includes an extensive set of simulation and 572 testbed experiments to evaluate the effects of the proposal, using 573 several Active Queue Management (AQM) mechanisms, including Random 574 Early Detection (RED) [RED], Random Exponential Marking (REM) [REM], 575 and Proportional Integrator (PI) [PI]. The performance measures were 576 the end-to-end response times for each request/response pair, and the 577 aggregate throughput on the bottleneck link. 
The end-to-end response 578 time was computed as the time from the moment when the request for 579 the file is sent to the server, until that file is successfully 580 downloaded by the client. 582 The measurements from [ECN+] show that setting an ECN-Capable 583 codepoint in the IP packet header in TCP SYN/ACK packets 584 systematically improves performance with all evaluated AQM schemes. 585 When SYN/ACK packets at a congested router are ECN-marked instead of 586 dropped, this can avoid a long initial retransmit timeout, improving 587 the response time for the affected flow dramatically. 589 [ECN+] shows that the impact on aggregate throughput can also be 590 quite significant, because marking SYN/ACK packets can prevent larger 591 flows from suffering long timeouts before being "admitted" into the 592 network. In addition, the testbed measurements from [ECN+] show that 593 web servers setting the ECN-Capable codepoint in TCP SYN/ACK packets 594 could serve more requests. 596 As a final step, [ECN+] explores the co-existence of flows that do 597 and don't set the ECN-capable codepoint in TCP SYN/ACK packets. The 598 results in [ECN+] show that both types of flows can coexist, with 599 some performance degradation for flows that don't use ECN+. Flows 600 that do use ECN+ improve their end-to-end performance. At the same 601 time, the performance degradation for flows that don't use ECN+, as a 602 result of the flows that do use ECN+, increases as a greater fraction 603 of flows use ECN+. 605 6. Performance Evaluation 607 6.1. The Costs and Benefit of Adding ECN-Capability 609 [ECN+] explores the costs and benefits of adding ECN-Capability to 610 SYN/ACK packets with both simulations and experiments. 
The addition of ECN-capability to SYN/ACK packets could be of
significant benefit for those ECN connections that would have had the
SYN/ACK packet dropped in the network, and for which the
ECN-Capability would allow the SYN/ACK to be marked rather than
dropped.

The percentage of SYN/ACK packets on a link can be quite high. In
particular, measurements on links dominated by web traffic indicate
that 15-20% of the packets can be SYN/ACK packets [SCJO01].

The benefit of adding ECN-capability to SYN/ACK packets depends in
part on the size of the data transfer. The drop of a SYN/ACK packet
can increase the download time of a short file by an order of
magnitude, by requiring a three-second retransmit timeout. For
longer-lived flows, the effect of a dropped SYN/ACK packet on file
download time is less dramatic. However, even for longer-lived
flows, the addition of ECN-capability to SYN/ACK packets can improve
the fairness among long-lived flows, as newly-arriving flows would be
less likely to have to wait for retransmit timeouts.

One question that arises is what fraction of connections would see
the benefit from making SYN/ACK packets ECN-capable, in a particular
scenario. Specifically:

(1) What fraction of arriving SYN/ACK packets are dropped at the
congested router when the SYN/ACK packets are not ECN-capable?

(2) Of those SYN/ACK packets that are dropped, what fraction would
have been ECN-marked instead of dropped if the SYN/ACK packets had
been ECN-capable?

To answer (1), it is necessary to consider not only the level of
congestion but also the queue architecture at the congested link. As
described in Section 4 above, for some queue architectures small
packets are less likely to be dropped than large ones.
In such an environment, SYN/ACK packets would have lower packet drop
rates; the answer to question (1) could not necessarily be inferred
from the overall packet drop rate, but could be obtained by measuring
the drop rate for SYN/ACK packets directly. In such an environment,
adding ECN-capability to SYN/ACK packets would be of less dramatic
benefit than in environments where all packets are equally likely to
be dropped regardless of packet size.

As question (2) implies, even if all of the SYN/ACK packets were
ECN-capable, there could still be some SYN/ACK packets dropped instead
of marked at the congested link; the full answer to question (2)
depends on the details of the queue management mechanism at the
router. If congestion is sufficiently bad, and the queue management
mechanism cannot prevent the buffer from overflowing, then SYN/ACK
packets will be dropped rather than marked upon buffer overflow
whether or not they are ECN-capable.

For some AQM mechanisms, ECN-capable packets are marked instead of
dropped any time this is possible, that is, any time the buffer is
not yet full. For other AQM mechanisms however, such as the RED
mechanism as recommended in [RED], packets are dropped rather than
marked when the packet drop/mark rate exceeds a certain threshold,
e.g., 10%, even if the packets are ECN-capable. For a router with
such an AQM mechanism, when congestion is sufficiently severe to
cause a high drop/mark rate, some SYN/ACK packets would be dropped
instead of marked whether or not they were ECN-capable.

Thus, the degree of benefit of adding ECN-Capability to SYN/ACK
packets depends not only on the overall packet drop rate in the
network, but also on the queue management architecture at the
congested link.

6.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK Packets

This document specifies that the end-node responds to the report of
an ECN-marked SYN/ACK packet by setting the initial congestion window
to one segment, instead of its possible default value of two to four
segments. We call this ECN+ with NoWaiting. However, Section 4
discussed another possible response to an ECN-marked SYN/ACK packet,
of the end-node waiting an RTT before sending a data packet. We call
this approach ECN+ with Waiting.

Simulations comparing the performance with Standard ECN (without
ECN-marked SYN/ACK packets), ECN+ with NoWaiting, and ECN+ with
Waiting show little difference, in terms of aggregate congestion,
between ECN+ with NoWaiting and ECN+ with Waiting. The details are
given in Appendix A below. Our conclusions are that ECN+ with
NoWaiting is perfectly safe, and there are no congestion-related
reasons for preferring ECN+ with Waiting over ECN+ with NoWaiting.
That is, there is no need for the TCP end-node to wait a round-trip
time before sending a data packet after receiving an acknowledgement
of an ECN-marked SYN/ACK packet.

7. Security Considerations

TCP packets carrying the ECT codepoint in IP headers can be marked
rather than dropped by ECN-capable routers. This raises several
security concerns that we discuss below.

7.1. 'Bad' Routers or Middleboxes

There are a number of known deployment problems from using ECN with
TCP traffic in the Internet. The first reported problem, dating back
to 2000, is of a small but decreasing number of routers or
middleboxes that reset a TCP connection in response to TCP SYN
packets using flags in the TCP header to negotiate ECN-capability
[Kelson00] [RFC3360] [MAF05].
Dave Thaler reported at the March 2007 IETF two new problems
encountered by TCP connections using ECN; the first of the two
problems concerns routers that crash when a TCP data packet arrives
with the ECN field in the IP header with the codepoint ECT(0) or
ECT(1), indicating that an ECN-Capable connection has been
established [SBT07].

While there is no evidence that any routers or middleboxes drop
SYN/ACK packets that contain an ECN-Capable or CE codepoint in the IP
header, such behavior cannot be excluded. (There seem to be a
number of routers or middleboxes that drop TCP SYN packets that
contain known or unknown IP options [MAF05] (Figure 1).) Thus, as
specified in Section 3, if a SYN/ACK packet with the ECT or CE
codepoint is dropped, the TCP node SHOULD resend the SYN/ACK packet
without the ECN-Capable codepoint. There is also no evidence that
any routers or middleboxes crash when a SYN/ACK arrives with an
ECN-Capable or CE codepoint in the IP header (over and above the
routers already known to crash when a data packet arrives with either
ECT(0) or ECT(1)), but we have not conducted any measurement studies
of this [F07].

7.2. Congestion Collapse

Because TCP SYN/ACK packets carrying an ECT codepoint could be
ECN-marked instead of dropped at an ECN-capable router, the concern is
whether this can either invoke congestion, or worsen performance in
highly congested scenarios. However, after learning that a SYN/ACK
packet was ECN-marked, the responder will only send one data packet;
if this data packet is ECN-marked, the responder will then wait for a
retransmission timeout. In addition, routers are free to drop rather
than mark arriving packets in times of high congestion, regardless of
whether the packets are ECN-capable.
When congestion is very high and a router's buffer is full, the router
has no choice but to drop rather than to mark an arriving packet.

The simulations reported in Appendix A show that even with demanding
traffic mixes dominated by short flows and high levels of congestion,
the aggregate packet dropping rates are not significantly different
with Standard ECN, ECN+ with NoWaiting, or ECN+ with Waiting. In
particular, the simulations show that in periods of very high
congestion the packet-marking rate is low with or without ECN+, and
the use of ECN+ does not significantly increase the number of dropped
or marked packets.

The simulations show that ECN+ is most effective in times of moderate
congestion. In these moderately congested scenarios, the use of ECN+
increases the number of ECN-marked packets, because ECN+ allows
SYN/ACK packets to be ECN-marked. At the same time, in these times
of moderate congestion, the use of ECN+ instead of Standard ECN does
not significantly affect the overall levels of congestion.

The simulations show that the use of ECN+ is less effective in times
of high congestion; the simulations show that in times of high
congestion more packets are dropped instead of marked, both with
Standard ECN and with ECN+. In times of high congestion, the buffer
can overflow, even with Active Queue Management and ECN; when the
buffer is full, arriving packets are dropped rather than marked,
whether the packets are ECN-capable or not. Thus while ECN+ is less
effective in times of high congestion, it still doesn't result in a
significant increase in the level of congestion. More details are
given in the appendix.

8. Conclusions

This draft specifies a modification to RFC 3168 to allow TCP nodes to
send SYN/ACK packets as being ECN-Capable.
Making the SYN/ACK packet ECN-Capable avoids the high cost to a TCP
transfer when a SYN/ACK packet is dropped by a congested router, by
avoiding the resulting retransmit timeout. This improves the
throughput of short connections. The sender of the SYN/ACK packet
responds to an ECN mark by reducing its initial congestion window from
two, three, or four segments to one segment, reducing the subsequent
load from that connection on the network. The addition of
ECN-capability to SYN/ACK packets is particularly beneficial in the
server-to-client direction, where congestion is more likely to occur.
In this case, the initial information provided by the ECN marking in
the SYN/ACK packet enables the server to more appropriately adjust the
initial load it places on the network.

9. Acknowledgements

We thank Anil Agarwal, Mark Allman, Remi Denis-Courmont, Wesley Eddy,
Lars Eggert, Alfred Hoenes, Janardhan Iyengar, and Pasi Sarolahti for
feedback on earlier versions of this draft.

A. Report on Simulations

This section reports on simulations showing the costs of adding ECN+
in highly-congested scenarios. This section also reports on
simulations for a comparative evaluation between ECN+ with NoWaiting
and ECN+ with Waiting.

The simulations are run with a range of file-size distributions,
using the PackMime traffic generator in the ns-2 simulator. They all
use a heavy-tailed distribution of file sizes. The simulations
reported in the tables below use a mean file size of 3 KBytes, to
show the results with a traffic mix with a large number of small
transfers. Other simulations were run with mean file sizes of 5
KBytes, 7 KBytes, 14 KBytes, and 17 KBytes. The title of each chart
gives the targeted average load from the traffic generator.
Because the simulations use a heavy-tailed distribution of file sizes,
and run for only 85 seconds (including ten seconds of warm-up time),
the actual load is often much smaller than the targeted load. The
congested link is 100 Mbps. RED is run in gentle mode, and arriving
ECN-Capable packets are only dropped instead of marked if the buffer
is full (and the router has no choice).

We explore two alternatives for a TCP node's response to a report of
an ECN-marked SYN/ACK packet. With ECN+ with NoWaiting, the TCP node
sends a data packet immediately (with an initial congestion window of
one segment). With the alternative ECN+ with Waiting, the TCP node
waits a round-trip time before sending a data packet; the responder
already has one measurement of the round-trip time when the
acknowledgement for the SYN/ACK packet is received.

In the tables below, ECN+ refers to ECN+ with NoWaiting, where the
responder starts transmitting immediately, and ECN+/wait refers to
ECN+ with Waiting, where the responder waits a round-trip time before
sending a data packet into the network.

The simulation scripts are available on [ECN-SYN].

A.1. Simulations with RED in Packet Mode

The simulations with RED in packet mode and with the queue in packets
show that ECN+ is useful in times of moderate congestion, though it
adds little benefit in times of high congestion. The simulations
show a minimal increase in levels of congestion with either ECN+ with
Waiting or ECN+ with NoWaiting, either in terms of packet dropping or
marking rates or in terms of the distribution of response times.
Thus, the simulations show no problems with ECN+ in times of high
congestion, and no reason to use ECN+ with Waiting instead of ECN+
with NoWaiting.

Table 1 shows the congestion levels for simulations with RED in
packet mode, with a queue in packets.
To explore a worst-case scenario, these simulations use a traffic mix
with an unrealistically small flow size distribution, with a mean flow
size of 3 Kbytes. For each table showing a particular traffic load,
the three rows show the number of packets dropped, the number of
packets ECN-marked, and the aggregate packet drop rate, and the three
columns show the simulations with Standard ECN, ECN+ (NoWaiting) and
ECN+/wait.

These simulations were run with RED set to mark instead of drop
packets any time that the queue is not full. For the default
implementation of RED in the ns-2 simulator, the router drops instead
of marks arriving packets when the average queue size exceeds a
configured threshold.

The usefulness of ECN+: The first thing to observe is that for all of
the simulations, the use of ECN+ or ECN+/wait significantly increased
the number of packets marked. This indicates that with ECN+ or
ECN+/wait, many SYN/ACK packets are marked instead of dropped.

Little increase in congestion, sometimes: The second thing to observe
is that for the simulations with low or moderate levels of congestion
(that is, with packet drop rates less than 10%), the use of ECN+ or
ECN+/wait decreases the aggregate packet drop rate, relative to the
simulations with ECN. This makes sense, since with low or moderate
levels of congestion, ECN+ allows SYN/ACK packets to be marked
instead of dropped, and the use of ECN+ doesn't add to the aggregate
congestion. However, for the simulations with packet drop rates of
15% or higher with ECN, the use of ECN+ or ECN+/wait increases the
aggregate packet drop rate, sometimes even doubling it.

Comparing ECN+ and ECN+/wait: The third thing to observe is that the
aggregate packet drop rate is generally higher with ECN+/wait than
with ECN+. Thus, there is no congestion-related reason to prefer
ECN+/wait over ECN+.
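
The three observations above can be checked arithmetically against
the loss rates reported in Tables 1 and 2. The following is a minimal
sketch of that check. The numbers are copied from those tables; the
names LOSS_RATES, loss_ratio, and summarize are illustrative
assumptions and are not part of the simulation scripts at [ECN-SYN].

```python
# Loss rates (percent) copied from Tables 1 and 2 (RED in packet mode).
# Keys are (link speed, target load in percent); names are illustrative.
LOSS_RATES = {
    ("100Mbps", 95):  {"ECN": 1.27,  "ECN+": 0.78,  "ECN+/wait": 0.84},
    ("100Mbps", 110): {"ECN": 9.04,  "ECN+": 6.36,  "ECN+/wait": 7.94},
    ("100Mbps", 125): {"ECN": 24.55, "ECN+": 52.00, "ECN+/wait": 57.64},
    ("10Mbps", 95):   {"ECN": 5.68,  "ECN+": 4.50,  "ECN+/wait": 4.75},
    ("10Mbps", 110):  {"ECN": 15.68, "ECN+": 16.11, "ECN+/wait": 21.92},
    ("10Mbps", 125):  {"ECN": 25.08, "ECN+": 51.59, "ECN+/wait": 56.27},
    ("10Mbps", 150):  {"ECN": 43.34, "ECN+": 61.11, "ECN+/wait": 67.33},
}


def loss_ratio(rates, variant):
    """Loss rate of `variant` relative to Standard ECN (1.0 = unchanged)."""
    return rates[variant] / rates["ECN"]


def summarize():
    """Return (link, load, ECN+ ratio, ECN+/wait ratio) for every scenario."""
    rows = []
    for (link, load), rates in sorted(LOSS_RATES.items()):
        rows.append((link, load,
                     round(loss_ratio(rates, "ECN+"), 2),
                     round(loss_ratio(rates, "ECN+/wait"), 2)))
    return rows


if __name__ == "__main__":
    for row in summarize():
        print(row)
```

A ratio below 1.0 means that the variant lowered the aggregate loss
rate relative to Standard ECN; a ratio near 2.0 corresponds to the
"doubling" noted for the most heavily congested scenarios.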

Target Load = 95%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped           18,512      11,244      12,135
Marked            27,026      36,977      38,743
Loss rate          1.27%       0.78%       0.84%
Throughput           81%         81%         81%

Target Load = 110%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped          165,866     110,525     144,821
Marked           180,714     290,629     311,233
Loss rate          9.04%       6.36%       7.94%
Throughput           92%         92%         92%

Target Load = 125%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped          574,114   1,764,677   2,229,280
Marked           409,441   1,172,550   1,181,209
Loss rate         24.55%      52.00%      57.64%
Throughput           94%         98%         97%

Table 1: Simulations with an average flow size of 3 Kbytes, a
100 Mbps link, RED in packet mode, queue in packets.

Target Load = 95%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped            8,754       6,719       7,269
Marked            10,376      17,637      16,956
Loss rate          5.68%       4.50%       4.75%
Throughput           78%         78%         78%

Target Load = 110%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped           32,110      32,014      48,838
Marked            28,476      56,550      62,252
Loss rate         15.68%      16.11%      21.92%
Throughput           96%         96%         96%

Target Load = 125%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped           60,710     174,920     215,001
Marked            43,497     119,620     118,172
Loss rate         25.08%      51.59%      56.27%
Throughput           98%         98%         98%

Target Load = 150%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped          133,128     250,762     327,584
Marked            63,306     146,581     147,307
Loss rate         43.34%      61.11%      67.33%
Throughput           93%        100%        100%

Table 2: Simulations with an average flow size of 3 Kbytes, a 10 Mbps
link, RED in packet mode, queue in packets.

A.2. Simulations with RED in Byte Mode

Table 3 below shows simulations with RED in byte mode and the queue
in bytes. There is no significant increase in aggregate congestion
with the use of ECN+ or ECN+/wait, and no congestion-related reason
to prefer ECN+/wait over ECN+.

However, unlike the simulations with RED in packet mode, the
simulations with RED in byte mode show little benefit from the use of
ECN+ or ECN+/wait, in that the packet marking rate with ECN+ or
ECN+/wait is not much different than the packet marking rate with
Standard ECN. This is because with RED in byte mode, small packets
like SYN/ACK packets are rarely dropped or marked - that is, there is
no drawback from the use of ECN+ in these scenarios, but not much
need for ECN+ either, in a scenario where small packets are unlikely
to be dropped or marked.

Target Load = 95%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped              739         438         442
Marked            32,405      34,357      34,000
Loss rate          0.05%       0.03%       0.03%
Throughput           81%         81%         81%

Target Load = 110%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped            2,473       1,679       3,020
Marked           226,971     222,234     327,608
Loss rate          0.15%       0.10%       0.18%
Throughput           92%         92%         91%

Target Load = 125%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped           19,358      14,057      14,064
Marked           717,123     728,513     729,001
Loss rate          1.07%       0.78%       0.78%
Throughput           95%         95%         95%

Table 3: Simulations with an average flow size of 3 Kbytes, a
100 Mbps link, RED in byte mode, queue in bytes.
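
To illustrate why small packets are rarely dropped or marked with RED
in byte mode, the sketch below scales a base drop/mark probability by
packet size, in the spirit of the byte-mode variant described in
[RED]. This is an assumption made for illustration only: the function
name, the 1500-byte reference size, and the 40-byte SYN/ACK size are
hypothetical and are not taken from the ns-2 configuration used for
the simulations reported here.

```python
def byte_mode_probability(base_prob, packet_bytes, reference_bytes=1500):
    """Scale a RED drop/mark probability by packet size (byte-mode sketch).

    base_prob       -- probability RED would apply to a reference-size packet
    packet_bytes    -- size of the arriving packet in bytes
    reference_bytes -- assumed reference packet size (hypothetical default)
    """
    if not 0.0 <= base_prob <= 1.0:
        raise ValueError("base_prob must be between 0 and 1")
    if packet_bytes <= 0 or reference_bytes <= 0:
        raise ValueError("packet sizes must be positive")
    # Smaller packets get a proportionally smaller probability; cap at 1.0.
    return min(1.0, base_prob * packet_bytes / reference_bytes)


if __name__ == "__main__":
    # A 40-byte SYN/ACK versus a 1500-byte data packet at the same base rate.
    print(byte_mode_probability(0.10, 40))
    print(byte_mode_probability(0.10, 1500))
```

Under these assumptions a 40-byte SYN/ACK packet is dropped or marked
far less often than a full-sized data packet at the same congestion
level, which is consistent with the small differences between Standard
ECN and ECN+ seen with RED in byte mode.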

Target Load = 95%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped              142          81          78
Marked            11,694      11,812      11,964
Loss rate          0.01%       0.06%       0.05%
Throughput           78%         78%         78%

Target Load = 110%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped              314         215         188
Marked            39,697      42,388      40,229
Loss rate          0.19%       0.13%       0.11%
Throughput           95%         94%         95%

Target Load = 125%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped            1,599       1,011         985
Marked            74,567      75,782      75,528
Loss rate          0.87%       0.56%       0.54%
Throughput           98%         98%         98%

Target Load = 150%:
                     ECN        ECN+   ECN+/wait
                 -------     -------     -------
Dropped            2,429       1,538       1,571
Marked            85,312      86,481      86,476
Loss rate          1.22%       0.78%       0.79%
Throughput           98%         98%         98%

Table 4: Simulations with an average flow size of 3 Kbytes, a 10 Mbps
link, RED in byte mode, queue in bytes.

B. Issues of Incremental Deployment

In order for TCP node B to send a SYN/ACK packet as ECN-Capable, node
B must have received an ECN-setup SYN packet from node A. However,
it is possible that node A supports ECN, but either ignores the CE
codepoint on received SYN/ACK packets, or ignores SYN/ACK packets
with the ECT or CE codepoint set. If the TCP initiator ignores the
CE codepoint on received SYN/ACK packets, this would mean that the
TCP responder would not respond to this congestion indication.
However, this seems to us an acceptable cost to pay in the
incremental deployment of ECN-Capability for TCP's SYN/ACK packets.
It would mean that the responder would not reduce the initial
congestion window from two, three, or four segments down to one
segment, as it should. However, the TCP end nodes would still
respond correctly to any subsequent CE indications on data packets
later on in the connection.
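
The responder behavior discussed above can be sketched as a small
decision function. This is a minimal illustration under stated
assumptions, not an implementation of any particular TCP stack: the
function name and parameter names are hypothetical, and the sketch
assumes that the responder learns of the mark through an ECN-Echo
indication on the acknowledgement of the SYN/ACK packet.

```python
def initial_window_segments(synack_sent_ect, ack_has_ecn_echo, default_iw=3):
    """Choose the responder's initial congestion window, in segments.

    synack_sent_ect  -- True if the SYN/ACK was sent as ECN-Capable
    ack_has_ecn_echo -- True if the initiator reported an ECN mark
    default_iw       -- the allowed default of two, three, or four segments
    """
    if default_iw not in (2, 3, 4):
        raise ValueError("default initial window must be 2, 3, or 4 segments")
    if synack_sent_ect and ack_has_ecn_echo:
        # The SYN/ACK was reported as ECN-marked: start with one segment.
        return 1
    # No mark reported, including the case where the initiator ignored
    # the CE codepoint: the responder uses its default initial window.
    return default_iw


if __name__ == "__main__":
    print(initial_window_segments(True, True, default_iw=4))   # mark reported
    print(initial_window_segments(True, False, default_iw=4))  # mark ignored
```

The second call corresponds to the incremental-deployment case
described in this appendix: the SYN/ACK packet was marked, but because
the initiator did not report the mark, the responder sends its full
default initial window.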

Figure 3 shows an interchange with the SYN/ACK packet ECN-marked, but
with the ECN mark ignored by the TCP initiator.

---------------------------------------------------------------
   TCP Node A             Router                  TCP Node B
   ----------             ------                  ----------

   ECN-setup SYN packet --->
                                     ECN-setup SYN packet --->

                                  <--- ECN-setup SYN/ACK, ECT
                          <--- Sets CE on SYN/ACK
   <--- ECN-setup SYN/ACK, CE

   Data/ACK, No ECN-Echo --->
                                     Data/ACK --->
                                  <--- Data (up to four packets)
---------------------------------------------------------------

Figure 3: SYN exchange with the SYN/ACK packet marked,
but with the ECN mark ignored by the TCP initiator.

Thus, to be explicit, when a TCP connection includes an initiator
that supports ECN but *does not* support ECN-Capability for SYN/ACK
packets, in combination with a responder that *does* support
ECN-Capability for SYN/ACK packets, it is possible that the
ECN-Capable SYN/ACK packets will be marked rather than dropped in the
network, and that the responder will not learn about the ECN mark on
the SYN/ACK packet. This would not be a problem if most packets from
the responder supporting ECN for SYN/ACK packets were in long-lived
TCP connections, but it would be more problematic if most of the
packets were from TCP connections consisting of four data packets, and
the TCP responder for these connections was ready to send its data
packets immediately after the SYN/ACK exchange. Of course, with
*severe* congestion, the SYN/ACK packets would likely be dropped
rather than ECN-marked at the congested router, preventing the TCP
responder from adding to the congestion by sending its initial window
of four data packets.

It is also possible that in some older TCP implementations, the
initiator would ignore arriving SYN/ACK packets that had the ECT or
CE codepoint set.
This would result in a delay in connection set-up for that TCP
connection, with the initiator re-sending the SYN packet after a
retransmit timeout. We are not aware of any TCP implementations with
this behavior.

One possibility for coping with problems of backwards compatibility
would be for TCP initiators to use a TCP flag that means "I
understand ECN-Capable SYN/ACK packets". If this document were to
standardize the use of such an "ECN-SYN" flag, then the TCP responder
would only send a SYN/ACK packet as ECN-capable if the incoming SYN
packet had the "ECN-SYN" flag set. An ECN-SYN flag would prevent the
backwards compatibility problems described in the paragraphs above.

One drawback to the use of an ECN-SYN flag is that it would use one
of the four remaining reserved bits in the TCP header, for a
transient backwards compatibility problem. This drawback is limited
by the fact that the "ECN-SYN" flag would be defined only for use
with ECN-setup SYN packets; that bit in the TCP header could be
defined to have other uses for other kinds of TCP packets.

Factors in deciding not to use an ECN-SYN flag include the following:

(1) The limited installed base: At the time that this document was
written, the TCP implementations in Microsoft Vista and Mac OS X
included ECN, but ECN was not enabled by default [SBT07]. Thus,
there was not a large deployed base of ECN-Capable TCP
implementations. This limits the scope of any backwards
compatibility problems.

(2) Limits to the scope of the problem: The backwards compatibility
problem would not be serious enough to cause congestion collapse;
with severe congestion, the buffer at the congested router will
overflow, and the congested router will drop rather than ECN-mark
arriving SYN packets.
Some active queue management mechanisms might switch from
packet-marking to packet-dropping in times of high congestion before
buffer overflow, as recommended in Section 19.1 of RFC 3168. This
helps to prevent congestion collapse problems with the use of ECN.

(3) Detection of and response to backwards-compatibility problems: A
TCP responder such as a web server can't differentiate between a
SYN/ACK packet that is not ECN-marked in the network, and a SYN/ACK
packet that is ECN-marked, but where the ECN mark is ignored by the
TCP initiator. However, a TCP responder *can* detect if a SYN/ACK
packet is sent as ECN-capable and not reported as ECN-marked, but
data packets are dropped or marked from the initial window of data.
We will call this scenario "initial-window-congestion". If a web
server frequently experienced initial-window congestion (without
SYN/ACK congestion), then the web server *might* be experiencing
backwards compatibility problems with ECN-Capable SYN/ACK packets,
and could respond by not sending SYN/ACK packets as ECN-Capable.

Normative References

[RFC2119] S. Bradner, Key words for use in RFCs to Indicate
Requirement Levels, RFC 2119, March 1997.

[RFC3168] K.K. Ramakrishnan, S. Floyd, and D. Black, The Addition of
Explicit Congestion Notification (ECN) to IP, RFC 3168, Proposed
Standard, September 2001.

Informative References

[ECN+] A. Kuzmanovic, The Power of Explicit Congestion Notification,
SIGCOMM 2005.

[ECN-SYN] ECN-SYN web page with simulation scripts, URL
"http://www.icir.org/floyd/ecn-syn".

[F07] S. Floyd, "[BEHAVE] Response of firewalls and middleboxes to
TCP SYN packets that are ECN-Capable?", August 2, 2007, email sent to
the BEHAVE mailing list, URL
"http://www1.ietf.org/mail-archive/web/behave/current/msg02644.html".

[Kelson00] Dax Kelson, note sent to the Linux kernel mailing list,
September 10, 2000.

[MAF05] A. Medina, M. Allman, and S. Floyd. Measuring the Evolution
of Transport Protocols in the Internet, ACM CCR, April 2005.

[PI] C. Hollot, V. Misra, W. Gong, and D. Towsley, On Designing
Improved Controllers for AQM Routers Supporting TCP Flows, April
1998.

[RED] Floyd, S., and Jacobson, V. Random Early Detection gateways
for Congestion Avoidance. IEEE/ACM Transactions on Networking, V.1
N.4, August 1993.

[REM] S. Athuraliya, V. H. Li, S. H. Low and Q. Yin, REM: Active
Queue Management, IEEE Network, May 2001.

[RFC2309] B. Braden et al., Recommendations on Queue Management and
Congestion Avoidance in the Internet, RFC 2309, April 1998.

[RFC2581] M. Allman, V. Paxson, and W. Stevens, TCP Congestion
Control, RFC 2581, April 1999.

[RFC2988] V. Paxson and M. Allman, Computing TCP's Retransmission
Timer, RFC 2988, November 2000.

[RFC3042] M. Allman, H. Balakrishnan, and S. Floyd, Enhancing TCP's
Loss Recovery Using Limited Transmit, RFC 3042, Proposed Standard,
January 2001.

[RFC3360] S. Floyd, Inappropriate TCP Resets Considered Harmful, RFC
3360, August 2002.

[RFC3390] M. Allman, S. Floyd, and C. Partridge, Increasing TCP's
Initial Window, RFC 3390, October 2002.

[RFC4987] W. Eddy, TCP SYN Flooding Attacks and Common Mitigations,
RFC 4987, August 2007.

[SCJO01] F. Smith, F. Campos, K. Jeffay, and D. Ott, What TCP/IP
Protocol Headers Can Tell us about the Web, SIGMETRICS, June 2001.

[SYN-COOK] Dan J. Bernstein, SYN cookies, 1997, see also

[SBT07] M. Sridharan, D. Bansal, and D. Thaler, Implementation Report
on Experiences with Various TCP RFCs, Presentation in the TSVAREA,
IETF 68, March 2007. URL
"http://www3.ietf.org/proceedings/07mar/slides/tsvarea-3/sld6.htm".

[Tools] S. Floyd and E. Kohler, Tools for the Evaluation of
Simulation and Testbed Scenarios, Internet-draft
draft-irtf-tmrg-tools-05, work in progress, February 2008.

IANA Considerations

There are no IANA considerations regarding this document.

Authors' Addresses

Aleksandar Kuzmanovic
Phone: +1 (847) 467-5519
Northwestern University
Email: akuzma at northwestern.edu
URL: http://cs.northwestern.edu/~a

Amit Mondal
Northwestern University
Email: a-mondal at northwestern.edu

Sally Floyd
Phone: +1 (510) 666-2989
ICIR (ICSI Center for Internet Research)
Email: floyd@icir.org
URL: http://www.icir.org/floyd/

K. K. Ramakrishnan
Phone: +1 (973) 360-8764
AT&T Labs Research
Email: kkrama at research.att.com
URL: http://www.research.att.com/info/kkrama

Full Copyright Statement

Copyright (C) The IETF Trust (2008).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.