idnits 2.17.1 draft-ietf-tcpm-ecnsyn-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 19. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 965. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 976. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 983. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 989. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (30 June 2007) is 6145 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Obsolete informational reference (is this intentional?): RFC 2309 (Obsoleted by RFC 7567) -- Obsolete informational reference (is this intentional?): RFC 2581 (Obsoleted by RFC 5681) -- Obsolete informational reference (is this intentional?): RFC 2988 (Obsoleted by RFC 6298) == Outdated reference: A later version (-05) exists of draft-irtf-tmrg-tools-03 Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force A. Kuzmanovic 2 INTERNET-DRAFT A. Mondal 3 Intended status: Proposed Standard Northwestern University 4 Expires: 30 December 2007 S. Floyd 5 ICIR 6 K.K. Ramakrishnan 7 AT&T 8 30 June 2007 10 Adding Explicit Congestion Notification (ECN) Capability to TCP's 11 SYN/ACK Packets 12 draft-ietf-tcpm-ecnsyn-02.txt 14 Status of this Memo 16 By submitting this Internet-Draft, each author represents that any 17 applicable patent or other IPR claims of which he or she is aware 18 have been or will be disclosed, and any of which he or she becomes 19 aware will be disclosed, in accordance with Section 6 of BCP 79. 21 Internet-Drafts are working documents of the Internet Engineering 22 Task Force (IETF), its areas, and its working groups. Note that 23 other groups may also distribute working documents as Internet- 24 Drafts. 26 Internet-Drafts are draft documents valid for a maximum of six months 27 and may be updated, replaced, or obsoleted by other documents at any 28 time. It is inappropriate to use Internet-Drafts as reference 29 material or to cite them other than as "work in progress." 
31 The list of current Internet-Drafts can be accessed at 32 http://www.ietf.org/ietf/1id-abstracts.txt. 34 The list of Internet-Draft Shadow Directories can be accessed at 35 http://www.ietf.org/shadow.html. 37 This Internet-Draft will expire on December 2007. 39 Copyright Notice 41 Copyright (C) The IETF Trust (2007). 43 Abstract 45 This draft specifies a modification to RFC 3168 to allow TCP SYN/ACK 46 packets to be ECN-Capable. For TCP, RFC 3168 only specifies setting 47 an ECN-Capable codepoint on data packets, and not on SYN and SYN/ACK 48 packets. However, because of the high cost to the TCP transfer of 49 having a SYN/ACK packet dropped, with the resulting retransmit 50 timeout, this document specifies the use of ECN for the SYN/ACK 51 packet itself, when sent in response to a SYN packet with the two ECN 52 flags set in the TCP header, indicating a willingness to use ECN. 53 Setting TCP SYN/ACK packets as ECN-Capable can be of great benefit to 54 the TCP connection, avoiding the severe penalty of a retransmit 55 timeout for a connection that has not yet started placing a load on 56 the network. The sender of the SYN/ACK packet must respond to a 57 report of an ECN-marked SYN/ACK packet by reducing its initial 58 congestion window from two, three, or four segments to one segment, 59 thereby reducing the subsequent load from that connection on the 60 network. 62 Table of Contents 64 1. Conventions .....................................................4 65 2. Introduction ....................................................4 66 3. Proposal ........................................................5 67 4. Discussion ......................................................8 68 5. Related Work ...................................................11 69 6. Performance Evaluation .........................................12 70 6.1. The Costs and Benefit of Adding ECN-Capability ............12 71 6.2. 
An Evaluation of Different Responses to ECN-Marked SYN/ACK 72 Packets ........................................................14 73 7. Security Considerations ........................................14 74 8. Conclusions ....................................................15 75 9. Acknowledgements ...............................................16 76 A. Report on Simulations ..........................................16 77 A.1. Simulations with RED in Packet Mode .......................17 78 A.2. Simulations with RED in Byte Mode .........................18 79 Normative References ..............................................19 80 Informative References ............................................19 81 IANA Considerations ...............................................21 82 Full Copyright Statement ..........................................21 83 Intellectual Property .............................................22 84 NOTE TO RFC EDITOR: PLEASE DELETE THIS NOTE UPON PUBLICATION. 86 Changes from draft-ietf-tcpm-ecnsyn-01: 88 * Changes in response to feedback from Anil Agarwal. 90 * Added a look at the costs of adding ECN-Capability to 91 SYN/ACKs in a highly-congested scenario. 92 From feedback from Mark Allman and Janardhan Iyengar. 94 * Added a comparative evaluation of two possible responses 95 to an ECN-marked SYN/ACK packet. From Mark Allman. 97 Changes from draft-ietf-tcpm-ecnsyn-00: 99 * Only updating the revision number. 101 Changes from draft-ietf-twvsg-ecnsyn-00: 103 * Changed name of draft to draft-ietf-tcpm-ecnsyn. 105 * Added a discussion in Section 3 of "Response to 106 ECN-marking of SYN/ACK packets". Based on 107 suggestions from Mark Allman. 109 * Added a discussion to the Conclusions about adding 110 ECN-capability to relevant set-up packets in other 111 protocols. From a suggestion from Wesley Eddy. 113 * Added a description of SYN exchanges with SYN cookies. 114 From a suggestion from Wesley Eddy. 
116 * Added a discussion of one-way data transfers, where the 117 host sending the SYN/ACK packet sends no data packets. 119 * Minor editing, from feedback from Mark Allman and Janardhan 120 Iyengar. 122 * Future work: a look at the costs of adding 123 ECN-Capability in a worst-case scenario. 124 From feedback from Mark Allman and Janardhan Iyengar. 126 * Future work: a comparative evaluation of two 127 possible responses to an ECN-marked SYN/ACK packet. 129 Changes from draft-kuzmanovic-ecn-syn-00.txt: 131 * Changed name of draft to draft-ietf-twvsg-ecnsyn. 133 END OF NOTE TO RFC EDITOR. 135 1. Conventions 137 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 138 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 139 document are to be interpreted as described in [RFC 2119]. 141 2. Introduction 143 TCP's congestion control mechanism has primarily used packet loss as 144 the congestion indication, with packets dropped when buffers 145 overflow. With such tail-drop mechanisms, the packet delay can be 146 high, as the queue at bottleneck routers can be fairly large. 147 Dropping packets only when the queue overflows, and having TCP react 148 only to such losses, results in: 149 1) significantly higher packet delay; 150 2) unnecessarily many packet losses; and 151 3) unfairness due to synchronization effects. 153 The adoption of Active Queue Management (AQM) mechanisms allows 154 better control of bottleneck queues [RFC2309]. This use of AQM has 155 the following potential benefits: 156 1) better control of the queue, with reduced queueing delay; 157 2) fewer packet drops; and 158 3) better fairness because of fewer synchronization effects. 160 With the adoption of ECN, performance may be further improved. 
When 161 the router detects congestion before buffer overflow, the router can 162 provide a congestion indication either by dropping a packet, or by 163 setting the Congestion Experienced (CE) codepoint in the Explicit 164 Congestion Notification (ECN) field in the IP header [RFC3168]. The 165 IETF has standardized the use of the Congestion Experienced (CE) 166 codepoint in the IP header for routers to indicate congestion. For 167 incremental deployment and backwards compatibility, the RFC on ECN 168 [RFC3168] specifies that routers may mark ECN-capable packets that 169 would otherwise have been dropped, using the Congestion Experienced 170 codepoint in the ECN field. The use of ECN allows TCP to react to 171 congestion while avoiding unnecessary retransmissions and, in some 172 cases, unnecessary retransmit timeouts. Thus, using ECN has several 173 benefits: 175 1) For short transfers, a TCP connection's congestion window may be 176 small. For example, if the current window contains only one packet, 177 and that packet is dropped, TCP will have to wait for a retransmit 178 timeout to recover, reducing its overall throughput. Similarly, if 179 the current window contains only a few packets and one of those 180 packets is dropped, there might not be enough duplicate 181 acknowledgements for a fast retransmission, and the sender might have 182 to wait for a delay of several round-trip times using Limited 183 Transmit [RFC3042]. With the use of ECN, short flows are less likely 184 to have packets dropped, sometimes avoiding unnecessary delays or 185 costly retransmit timeouts. 187 2) While longer flows may not see substantially improved throughput 188 with the use of ECN, they experience lower loss. This may benefit TCP 189 applications that are latency- and loss-sensitive, because of the 190 avoidance of retransmissions. 192 RFC 3168 only specifies marking the Congestion Experienced codepoint 193 on TCP's data packets, and not on SYN and SYN/ACK packets.
RFC 3168 194 specifies the negotiation of the use of ECN between the two TCP end- 195 points in the TCP SYN and SYN-ACK exchange, using flags in the TCP 196 header. Erring on the side of being conservative, RFC 3168 does not 197 specify the use of ECN for the SYN/ACK packet itself. However, 198 because of the high cost to the TCP transfer of having a SYN/ACK 199 packet dropped, with the resulting retransmit timeout, this document 200 specifies the use of ECN for the SYN/ACK packet itself. This can be 201 of great benefit to the TCP connection, avoiding the severe penalty 202 of a retransmit timeout for a connection that has not yet started 203 placing a load on the network. The sender of the SYN/ACK packet must 204 respond to a report of an ECN-marked SYN/ACK packet by reducing its 205 initial congestion window from two, three, or four segments to one 206 segment, reducing the subsequent load from that connection on the 207 network. 209 The use of ECN for SYN/ACK packets has the following potential 210 benefits: 211 1) Avoidance of a retransmit timeout; 212 2) Improvement in the throughput of short connections. 214 This draft specifies ECN+, a modification to RFC 3168 to allow TCP 215 SYN/ACK packets to be ECN-Capable. Section 3 contains the 216 specification of the change, while Section 4 discusses some of the 217 issues, and Section 5 discusses related work. Section 6 contains an 218 evaluation of the proposed change. 220 3. Proposal 222 This section specifies the modification to RFC 3168 to allow TCP 223 SYN/ACK packets to be ECN-Capable. We use the following terminology 224 from RFC 3168: 226 The ECN field in the IP header: 227 o CE: the Congestion Experienced codepoint; and 228 o ECT: either one of the two ECN-Capable Transport codepoints. 230 The ECN flags in the TCP header: 231 o CWR: the Congestion Window Reduced flag; and 232 o ECE: the ECN-Echo flag.
234 ECN-setup packets: 235 o ECN-setup SYN packet: a SYN packet with the ECE and CWR flags; 236 o ECN-setup SYN-ACK packet: a SYN-ACK packet with ECE but not CWR. 238 RFC 3168 in Section 6.1.1. states that "A host MUST NOT set ECT on 239 SYN or SYN-ACK packets." In this section, we specify that a TCP node 240 MAY respond to an ECN-setup SYN packet by setting ECT in the 241 responding ECN-setup SYN/ACK packet, indicating to routers that the 242 SYN/ACK packet is ECN-Capable. This allows a congested router along 243 the path to mark the packet instead of dropping the packet as an 244 indication of congestion. 246 Assume that TCP node A transmits to TCP node B an ECN-setup SYN 247 packet, indicating willingness to use ECN for this connection. As 248 specified by RFC 3168, if TCP node B is willing to use ECN, node B 249 responds with an ECN-setup SYN-ACK packet. 251 Table 1 shows an interchange with the SYN/ACK packet dropped by a 252 congested router. Node B waits for a retransmit timeout, and then 253 retransmits the SYN/ACK packet.
255  ---------------------------------------------------------------
256     TCP Node A             Router                  TCP Node B
257     ----------             ------                  ----------
259     ECN-setup SYN packet --->
260                                      ECN-setup SYN packet --->
262                           <--- ECN-setup SYN/ACK, possibly ECT
263                                                 3-second timer set
264                          SYN/ACK dropped                      .
265                                                               .
266                                                               .
267                                               3-second timer expires
268                           <--- ECN-setup SYN/ACK, not ECT
269     <--- ECN-setup SYN/ACK
270     Data/ACK --->
271                                      Data/ACK --->
272                           <--- Data (one to four segments)
273  ---------------------------------------------------------------
275  Table 1: SYN exchange with the SYN/ACK packet dropped.
277 If the SYN/ACK packet is dropped in the network, the TCP host (node 278 B) responds by waiting three seconds for the retransmit timer to 279 expire [RFC2988]. If a SYN/ACK packet with the ECT codepoint is 280 dropped, the TCP node SHOULD resend the SYN/ACK packet without the 281 ECN-Capable codepoint.
(Although we are not aware of any middleboxes 282 that drop SYN/ACK packets that contain an ECN-Capable codepoint in 283 the IP header, we have learned to design our protocols defensively in 284 this regard [RFC3360].) 286 We note that if syn-cookies were used by Node B in the exchange in 287 Table 1, TCP Node B wouldn't set a timer upon transmission of the 288 SYN/ACK packet [SYN-COOK]. In this case, if the SYN/ACK packet was 289 lost, the initiator (Node A) would have to timeout and retransmit the 290 SYN packet in order to trigger another SYN-ACK. 292 Table 2 shows an interchange with the SYN/ACK packet sent as ECN- 293 Capable, and ECN-marked instead of dropped at the congested router.
295  ---------------------------------------------------------------
296     TCP Node A             Router                  TCP Node B
297     ----------             ------                  ----------
299     ECN-setup SYN packet --->
300                                      ECN-setup SYN packet --->
302                           <--- ECN-setup SYN/ACK, ECT
303                           <--- Sets CE on SYN/ACK
304     <--- ECN-setup SYN/ACK, CE
306     Data/ACK, ECN-Echo --->
307                                      Data/ACK, ECN-Echo --->
308                                      Window reduced to one segment.
309                           <--- Data, CWR (one segment only)
310  ---------------------------------------------------------------
312  Table 2: SYN exchange with the SYN/ACK packet marked.
314 If the receiving node (node A) receives a SYN/ACK packet that has 315 been marked by the congested router, with the CE codepoint set, the 316 receiving node MUST respond by setting the ECN-Echo flag in the TCP 317 header of the responding ACK packet. As specified in RFC 3168, the 318 receiving node continues to set the ECN-Echo flag in packets until it 319 receives a packet with the CWR flag set. 321 When the sending node (node B) receives the ECN-Echo packet reporting 322 the Congestion Experienced indication in the SYN/ACK packet, the node 323 MUST set the initial congestion window to one segment, instead of two 324 segments as allowed by [RFC2581], or three or four segments allowed 325 by [RFC3390].
If the sending node (node B) was going to use an 326 initial window of one segment, and receives an ECN-Echo packet 327 informing it of a Congestion Experienced indication on its SYN/ACK 328 packet, the sending node MAY continue to send with an initial window 329 of one segment, without waiting for a retransmit timeout. We note 330 that this updates RFC 3168, which specifies that "the sending TCP 331 MUST reset the retransmit timer on receiving the ECN-Echo packet when 332 the congestion window is one." As specified by RFC 3168, the sending 333 node (node B) also sets the CWR flag in the TCP header of the next 334 data packet sent, to acknowledge its receipt of and reaction to the 335 ECN-Echo flag. 337 If the data transfer in Table 2 is entirely from Node A to Node B, 338 then data packets from Node A continue to set the ECN-Echo flag in 339 data packets, waiting for the CWR flag from Node B acknowledging a 340 response to the ECN-Echo flag. 342 4. Discussion 344 Motivation: 345 The rationale for the proposed change is the following. When node B 346 receives a TCP SYN packet with the ECN-Echo bit set in the TCP header, 347 this indicates that node A is ECN-capable. If node B is also ECN- 348 capable, there are no obstacles to immediately setting one of the 349 ECN-Capable codepoints in the IP header in the responding TCP SYN/ACK 350 packet. 352 There can be a great benefit in setting an ECN-capable codepoint in 353 SYN/ACK packets, as is discussed further in Section 5. Congestion is 354 most likely to occur in the server-to-client direction. As a result, 355 setting an ECN-capable codepoint in SYN/ACK packets can reduce the 356 occurrence of three-second retransmit timeouts resulting from the 357 drop of SYN/ACK packets. 359 Flooding attacks: 360 Setting an ECN-Capable codepoint in the responding TCP SYN/ACK 361 packets does not raise any novel security vulnerabilities.
For 362 example, provoking servers or hosts to send SYN/ACK packets to a 363 third party in order to perform a "SYN/ACK flood" attack would be 364 highly inefficient. Third parties would immediately drop such 365 packets, since they would know that they didn't generate the TCP SYN 366 packets in the first place. Moreover, such SYN/ACK attacks would 367 have the same signatures as the existing TCP SYN attacks. Provoking 368 servers or hosts to reply with SYN/ACK packets in order to congest a 369 certain link would also be highly inefficient because SYN/ACK packets 370 are small in size. 372 However, the addition of ECN-Capability to SYN/ACK packets could 373 allow SYN/ACK packets to persist for more hops along a network path 374 before being dropped, thus adding somewhat to the ability of a 375 SYN/ACK attack to flood a network link. 377 The TCP SYN packet: 378 There are several reasons why an ECN-Capable codepoint MUST NOT be 379 set in the IP header of the initiating TCP SYN packet. First, when 380 the TCP SYN packet is sent, there are no guarantees that the other 381 TCP endpoint (node B in Table 2) is ECN-capable, or that it would be 382 able to understand and react if the ECN CE codepoint was set by a 383 congested router. 385 Second, the ECN-Capable codepoint in TCP SYN packets could be misused 386 by malicious clients to `improve' the well-known TCP SYN attack. By 387 setting an ECN-Capable codepoint in TCP SYN packets, a malicious host 388 might be able to inject a large number of TCP SYN packets through a 389 potentially congested ECN-enabled router, congesting it even further. 391 For both these reasons, we continue the restriction that the TCP SYN 392 packet MUST NOT have the ECN-Capable codepoint in the IP header set. 394 Backwards compatibility: 395 In order for TCP node B to send a SYN/ACK packet as ECN-Capable, node 396 B must have received an ECN-setup SYN packet from node A. 
However, 397 it is possible that node A supports ECN, but either ignores the CE 398 codepoint on received SYN/ACK packets, or ignores SYN/ACK packets 399 with the ECT or CE codepoint set. If the TCP sender ignores the CE 400 codepoint on received SYN/ACK packets, this would mean that the TCP 401 connection would not respond to this congestion indication. As 402 discussed in Section 2 under "Backwards compatibility", this would 403 not be an insurmountable problem. It would mean that the sender of 404 the SYN/ACK packet would not reduce the initial congestion window 405 from two, three, or four segments down to one segment, as it should. 406 However, the TCP sender would still respond correctly to any 407 subsequent CE indications on data packets later on in the connection. 409 It is also possible that in some older TCP implementation, the TCP 410 sender ignores SYN/ACK packets with the ECT or CE codepoint set. 411 This would result in a delay in connection set-up for that TCP 412 connection, with the TCP sender re-sending the SYN packet after a 413 retransmit timeout. 415 SYN/ACK packets and packet size: 416 There are a number of router buffer architectures that have smaller 417 dropping rates for small (SYN) packets than for large (data) packets. 419 For example, for a Drop Tail queue in units of packets, where each 420 packet takes a single slot in the buffer regardless of packet size, 421 small and large packets are equally likely to be dropped. However, 422 for a Drop Tail queue in units of bytes, small packets are less 423 likely to be dropped than are large ones. Similarly, for RED in 424 packet mode, small and large packets are equally likely to be dropped 425 or marked, while for RED in byte mode, a packet's chance of being 426 dropped or marked is proportional to the packet size in bytes. 
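The packet-mode and byte-mode behaviors described above can be illustrated with a small sketch. It is not taken from any router implementation or from the NS simulator. The function name, the 1500-byte reference size, the 40-byte SYN/ACK size, and the 5% base probability are assumptions chosen for illustration; only the proportional-to-size scaling comes from the text.

```python
def drop_or_mark_probability(base_prob, packet_bytes, mode, reference_bytes=1500):
    """Per-packet drop/mark probability under a simplified queue model.

    Hypothetical helper: base_prob is the probability applied to a packet
    of reference_bytes, and reference_bytes is itself an assumed value.
    """
    if mode == "packet":
        # Packet mode: every packet faces the same probability.
        return base_prob
    if mode == "byte":
        # Byte mode: probability scales in proportion to packet size.
        return min(1.0, base_prob * packet_bytes / reference_bytes)
    raise ValueError("mode must be 'packet' or 'byte'")


if __name__ == "__main__":
    for mode in ("packet", "byte"):
        syn_ack = drop_or_mark_probability(0.05, 40, mode)   # small SYN/ACK
        data = drop_or_mark_probability(0.05, 1500, mode)    # full-size data
        print(f"{mode:6s} mode: SYN/ACK {syn_ack:.4f}, data {data:.4f}")
```

Under these assumed numbers, the SYN/ACK packet sees the same probability as a data packet in packet mode and a much smaller one in byte mode, which is why the benefit of making SYN/ACK packets ECN-Capable differs between the two.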
428 For a congested router with an AQM mechanism in byte mode, where a 429 packet's chance of being dropped or marked is proportional to the 430 packet size in bytes, the drop or marking rate for TCP SYN/ACK 431 packets should generally be low. In this case, the benefit of making 432 SYN/ACK packets ECN-Capable should be similarly moderate. However, 433 for a congested router with a Drop Tail queue in units of packets or 434 with an AQM mechanism in packet mode, and with no priority queueing 435 for smaller packets, small and large packets should have the same 436 probability of being dropped or marked. In such a case, making 437 SYN/ACK packets ECN-Capable should be of significant benefit. 439 We believe that there are a wide range of behaviors in the real world 440 in terms of the drop or mark behavior at routers as a function of 441 packet size [Tools] (Section 10). We note that all of these 442 alternatives listed above are available in the NS simulator (Drop 443 Tail queues are by default in units of packets, while the default for 444 RED queue management has been changed from packet mode to byte mode). 446 Response to ECN-marking of SYN/ACK packets: 447 One question is why TCP SYN/ACK packets should be treated differently 448 from other packets in terms of the packet sender's response to an 449 ECN-marked packet. Section 5 of RFC 3168 specifies the following: 451 "Upon the receipt by an ECN-Capable transport of a single CE packet, 452 the congestion control algorithms followed at the end-systems MUST be 453 essentially the same as the congestion control response to a *single* 454 dropped packet. For example, for ECN-Capable TCP the source TCP is 455 required to halve its congestion window for any window of data 456 containing either a packet drop or an ECN indication." 
458 In particular, Section 6.1.2 of RFC 3168 specifies that when the TCP 459 congestion window consists of a single packet and that packet is ECN- 460 marked in the network, then the sender must reduce the sending rate 461 below one packet per round-trip time, by waiting for one RTO before 462 sending another packet. If the RTO was set to the average round-trip 463 time, this would result in halving the sending rate; because the RTO 464 is in fact larger than the average round-trip time, the sending rate 465 is reduced to less than half of its previous value. 467 TCP's congestion control response to the *dropping* of a SYN/ACK 468 packet is to wait a default time before sending another packet. This 469 document argues that ECN gives end-systems a wider range of possible 470 responses to the *marking* of a SYN/ACK packet, and that waiting a 471 default time before sending a data packet is not the desired 472 response. 474 On the conservative end, one could assume an effective congestion 475 window of one packet for the SYN/ACK packet, and respond to an ECN- 476 marked SYN/ACK packet by reducing the sending rate to one packet 477 every two round-trip times. As an approximation, the TCP end-node 478 could measure the round-trip time T between the sending of the 479 SYN/ACK packet and the receipt of the acknowledgement, and reply to 480 the acknowledgement of the ECN-marked SYN/ACK packet by waiting T 481 seconds before sending a data packet. 483 However, we note that for an ECN-marked SYN/ACK packet, halving the 484 *congestion window* is not the same as halving the *sending rate*; 485 there is no `sending rate' associated with an ECN-Capable SYN/ACK 486 packet, as such packets are only sent as the first packet in a 487 connection from that host. Further, a router's marking of a SYN/ACK 488 packet is not affected by any past history of that connection. 
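The "less than half" observation for a one-packet window, made above in connection with Section 6.1.2 of RFC 3168, can be checked with a little arithmetic. The sketch below assumes a simplified model that is not spelled out in the text: a packet is sent, its ECN-Echo acknowledgement arrives one RTT later, and the sender then waits one RTO before sending again, so consecutive packets are RTT + RTO apart. The function names and the sample values are hypothetical.

```python
def rate_one_packet_per_rtt(rtt_s):
    """Sending rate (packets/second) with a congestion window of one packet."""
    return 1.0 / rtt_s


def rate_after_mark(rtt_s, rto_s):
    """Sending rate if each packet is followed by one RTT plus one RTO wait.

    Assumed model: the ECN-Echo arrives after rtt_s, then the sender waits
    rto_s before it may send the next packet.
    """
    return 1.0 / (rtt_s + rto_s)


if __name__ == "__main__":
    rtt = 0.1  # assumed round-trip time in seconds
    for rto in (0.1, 1.0, 3.0):  # assumed RTO values in seconds
        before = rate_one_packet_per_rtt(rtt)
        after = rate_after_mark(rtt, rto)
        print(f"RTO={rto:.1f}s: {before:.2f} -> {after:.2f} packets/s")
```

With the RTO equal to the RTT the rate is exactly halved in this model; any larger RTO reduces it further. The same model gives one packet every two round-trip times for the conservative response of waiting T seconds after the acknowledgement.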
490 Adding ECN-Capability to SYN/ACK packets allows the simple response 491 of setting the initial congestion window to one packet, instead of 492 its allowed default value of two, three, or four packets, with the 493 host proceeding with a cautious sending rate of one packet per round- 494 trip time. If that packet is ECN-marked or dropped, then the sender 495 will wait an RTO before sending another packet. This document argues 496 that this approach is useful to users, with no danger of congestion 497 collapse or of starvation of competing traffic. This is discussed in 498 more detail below in Section 6.2. 500 We note that if the data transfer is entirely from Node A to Node B, 501 then there is no effective difference between the two possible 502 responses to an ECN-marked SYN/ACK packet outlined above. In either 503 case, Node B sends no data packets, only sending acknowledgement 504 packets in response to received data packets. 506 5. Related Work 508 The addition of ECN-capability to TCP's SYN/ACK packets was proposed 509 in [ECN+]. The paper includes an extensive set of simulation and 510 testbed experiments to evaluate the effects of the proposal, using 511 several Active Queue Management (AQM) mechanisms, including Random 512 Early Detection (RED) [RED], Random Exponential Marking (REM) [REM], 513 and Proportional Integrator (PI) [PI]. The performance measures were 514 the end-to-end response times for each request/response pair, and the 515 aggregate throughput on the bottleneck link. The end-to-end response 516 time was computed as the time from the moment when the request for 517 the file was sent to the server, until that file was successfully 518 downloaded by the client. 520 The measurements from [ECN+] show that setting an ECN-Capable 521 codepoint in the IP packet header in TCP SYN/ACK packets 522 systematically improves performance with all evaluated AQM schemes.
523 When SYN/ACK packets at a congested router are ECN-marked instead of 524 dropped, this can avoid a long initial retransmit timeout, improving 525 the response time for the affected flow dramatically. 527 [ECN+] shows that the impact on aggregate throughput can also be 528 quite significant, because marking SYN/ACK packets can prevent larger 529 flows from suffering long timeouts before being "admitted" into the 530 network. In addition, the testbed measurements from [ECN+] show that 531 web servers setting the ECN-Capable codepoint in TCP SYN/ACK packets 532 could serve more requests. 534 As a final step, [ECN+] explores the coexistence of flows that do 535 and don't set the ECN-capable codepoint in TCP SYN/ACK packets. The 536 results in [ECN+] show that both types of flows can coexist, with 537 some performance degradation for flows that don't use ECN+. Flows 538 that do use ECN+ improve their end-to-end performance. At the same 539 time, the performance degradation for flows that don't use ECN+, as a 540 result of the flows that do use ECN+, increases as a greater fraction 541 of flows use ECN+. 543 6. Performance Evaluation 545 6.1. The Costs and Benefit of Adding ECN-Capability 547 [ECN+] explores the costs and benefits of adding ECN-Capability to 548 SYN/ACK packets with both simulations and experiments. The addition 549 of ECN-capability to SYN/ACK packets could be of significant benefit 550 for those ECN connections that would have had the SYN/ACK packet 551 dropped in the network, and for which the ECN-Capability would allow 552 the SYN/ACK to be marked rather than dropped. 554 The percentage of SYN/ACK packets on a link can be quite high. In 555 particular, measurements on links dominated by web traffic indicate 556 that 15-20% of the packets can be SYN/ACK packets [SCJO01]. 558 The benefit of adding ECN-capability to SYN/ACK packets depends in 559 part on the size of the data transfer.
The drop of a SYN/ACK packet 560 can increase the download time of a short file by an order of 561 magnitude, by requiring a three-second retransmit timeout. For 562 longer-lived flows, the effect of a dropped SYN/ACK packet on file 563 download time is less dramatic. However, even for longer-lived 564 flows, the addition of ECN-capability to SYN/ACK packets can improve 565 the fairness among long-lived flows, as newly-arriving flows would be 566 less likely to have to wait for retransmit timeouts. 568 One question that arises is what fraction of connections would see 569 the benefit from making SYN/ACK packets ECN-capable in a particular 570 scenario. Specifically: 572 (1) What fraction of arriving SYN/ACK packets are dropped at the 573 congested router when the SYN/ACK packets are not ECN-capable? 575 (2) Of those SYN/ACK packets that are dropped, what fraction of those 576 drops would have been ECN-marks instead of drops if the SYN/ACK 577 packets had been ECN-capable? 579 To answer (1), it is necessary to consider not only the level of 580 congestion but also the queue architecture at the congested link. As 581 described in Section 4 above, for some queue architectures small 582 packets are less likely to be dropped than large ones. In such an 583 environment, SYN/ACK packets would have lower packet drop rates; 584 the answer to question (1) could not necessarily be inferred from the overall 585 packet drop rate, but could be obtained by measuring the drop rate 586 for SYN/ACK packets directly. In such an environment, adding ECN- 587 capability to SYN/ACK packets would be of less dramatic benefit than 588 in environments where all packets are equally likely to be dropped 589 regardless of packet size.
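One way to combine questions (1) and (2) is to multiply the two fractions. This is a sketch under the assumption that the two quantities can be treated as simple fractions for a given scenario. The function name and the sample values are hypothetical and are not measurements from this document or from [ECN+].

```python
def fraction_benefiting(drop_fraction_without_ecn, markable_fraction_of_drops):
    """Fraction of SYN/ACK packets that ECN-capability would spare from a drop.

    drop_fraction_without_ecn: answer to question (1), in [0, 1].
    markable_fraction_of_drops: answer to question (2), in [0, 1].
    """
    for value in (drop_fraction_without_ecn, markable_fraction_of_drops):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    return drop_fraction_without_ecn * markable_fraction_of_drops


if __name__ == "__main__":
    # Hypothetical scenario: 2% of SYN/ACKs dropped, 90% of those markable.
    print(fraction_benefiting(0.02, 0.9))
```

A low answer to either question keeps the product low, which matches the observation that byte-mode queues, or queues that drop rather than mark under heavy congestion, reduce the benefit.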

As question (2) implies, even if all of the SYN/ACK packets were ECN-
capable, there could still be some SYN/ACK packets dropped instead of
marked at the congested link; the full answer to question (2) depends
on the details of the queue management mechanism at the router. If
congestion is sufficiently bad, and the queue management mechanism
cannot prevent the buffer from overflowing, then SYN/ACK packets will
be dropped rather than marked upon buffer overflow whether or not
they are ECN-capable.

For some AQM mechanisms, ECN-capable packets are marked instead of
dropped any time this is possible, that is, any time the buffer is
not yet full. For other AQM mechanisms, however, such as the RED
mechanism as recommended in [RED], packets are dropped rather than
marked when the packet drop/mark rate exceeds a certain threshold,
e.g., 10%, even if the packets are ECN-capable. For a router with
such an AQM mechanism, when congestion is sufficiently severe to
cause a high drop/mark rate, some SYN/ACK packets would be dropped
instead of marked whether or not they were ECN-capable.

Thus, the degree of benefit of adding ECN-Capability to SYN/ACK
packets depends not only on the overall packet drop rate in the
network, but also on the queue management architecture at the
congested link.

6.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK Packets

This document specifies that the end-node responds to the report of
an ECN-marked SYN/ACK packet by setting the initial congestion window
to one segment, instead of its possible default value of two to four
segments. We call this ECN+ with NoWaiting. However, Section 3
discussed another possible response to an ECN-marked SYN/ACK packet,
in which the end-node waits an RTT before sending a data packet. We
call this approach ECN+ with Waiting.
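
The two responses can be contrasted with a short sketch. This is an
illustration under stated assumptions, not an implementation of any
TCP stack: the type InitialSendPlan, the function plan_after_synack,
and its parameters are hypothetical names introduced here.

```python
# Illustrative sketch only; all names are hypothetical.
from dataclasses import dataclass


@dataclass
class InitialSendPlan:
    initial_cwnd_segments: int   # initial congestion window, in segments
    delay_before_data_s: float   # time to wait before the first data packet


def plan_after_synack(synack_was_marked, default_iw_segments,
                      measured_rtt_s, wait_variant=False):
    """Choose the SYN/ACK sender's initial behavior.

    synack_was_marked:   the peer reported the SYN/ACK as ECN-marked.
    default_iw_segments: the default initial window (two to four segments).
    measured_rtt_s:      RTT sample from the SYN/ACK and its acknowledgement.
    wait_variant:        False for ECN+ with NoWaiting (what this document
                         specifies), True for ECN+ with Waiting.
    """
    if not synack_was_marked:
        # No congestion indication: use the default initial window at once.
        return InitialSendPlan(default_iw_segments, 0.0)
    # Marked SYN/ACK: initial congestion window of one segment in both
    # variants; only the Waiting variant also delays by one RTT.
    delay = measured_rtt_s if wait_variant else 0.0
    return InitialSendPlan(1, delay)


if __name__ == "__main__":
    print(plan_after_synack(True, 4, 0.1))
    print(plan_after_synack(True, 4, 0.1, wait_variant=True))
```

The only difference between the two variants in this sketch is the
delay before the first data packet; the initial congestion window is
one segment in both.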

Simulations comparing the performance with Standard ECN (without ECN-
marked SYN/ACK packets), ECN+ with NoWaiting, and ECN+ with Waiting
show little difference, in terms of aggregate congestion, between
ECN+ with NoWaiting and ECN+ with Waiting. The details are given in
Appendix A below. Our conclusions are that ECN+ with NoWaiting is
perfectly safe, and there are no congestion-related reasons for
preferring ECN+ with Waiting over ECN+ with NoWaiting. That is,
there is no need for the TCP end-node to wait a round-trip time
before sending a data packet after receiving an acknowledgement of an
ECN-marked SYN/ACK packet.

7. Security Considerations

TCP packets carrying the ECT codepoint in IP headers can be marked
rather than dropped by ECN-capable routers. This raises several
security concerns that we discuss below.

"Bad" routers or middleboxes:
There is a small but decreasing number of routers or middleboxes that
drop or reset SYN and SYN/ACK packets based on the ECN-related flags
in the TCP header [MAF05], [RFC3360]. While there is no evidence
that any middleboxes drop SYN/ACK packets that contain an ECN-Capable
or CE codepoint in the *IP header*, such behavior cannot be excluded.
Thus, as specified in Section 2, if a SYN/ACK packet with the ECT or
CE codepoint is dropped, the TCP node SHOULD resend the SYN/ACK
packet without the ECN-Capable codepoint.

Congestion collapse:
Because TCP SYN/ACK packets carrying an ECT codepoint could be ECN-
marked instead of dropped at an ECN-capable router, the concern is
whether this can either invoke congestion, or worsen performance in
highly congested scenarios. However, after learning that a SYN/ACK
packet was ECN-marked, the sender of that packet will only send one
data packet; if this data packet is ECN-marked, the sender will then
wait for a retransmission timeout.
In addition, routers are free to
drop rather than mark arriving packets in times of high congestion,
regardless of whether the packets are ECN-capable. When congestion
is very high and a router's buffer is full, the router has no choice
but to drop rather than to mark an arriving packet.

The simulations reported in Appendix A show that even with demanding
traffic mixes dominated by short flows and high levels of congestion,
the aggregate packet dropping rates are not significantly different
with Standard ECN, ECN+ with NoWaiting, or ECN+ with Waiting. In
particular, the simulations show that in periods of very high
congestion the packet-marking rate is low with or without ECN+, and
the use of ECN+ does not significantly increase the number of dropped
or marked packets.

The simulations show that ECN+ is most effective in times of moderate
congestion. In these moderately congested scenarios, the use of ECN+
increases the number of ECN-marked packets, because ECN+ allows
SYN/ACK packets to be ECN-marked. At the same time, in these times
of moderate congestion, the use of ECN+ instead of Standard ECN does
not significantly affect the overall levels of congestion.

The simulations show that the use of ECN+ is less effective in times
of high congestion; in times of high congestion more packets are
dropped instead of marked, both with Standard ECN and with ECN+. In
times of high congestion, the buffer can overflow, even with Active
Queue Management and ECN; when the buffer is full, arriving packets
are dropped rather than marked, whether the packets are ECN-capable
or not. Thus while ECN+ is less effective in times of high
congestion, it still doesn't result in a significant increase in the
level of congestion. More details are given in Appendix A.

8. Conclusions

This draft specifies a modification to RFC 3168 to allow TCP nodes to
send SYN/ACK packets as being ECN-Capable. Making the SYN/ACK packet
ECN-Capable avoids the high cost to a TCP transfer when a SYN/ACK
packet is dropped by a congested router, by avoiding the resulting
retransmit timeout. This improves the throughput of short
connections. The sender of the SYN/ACK packet responds to an ECN
mark by reducing its initial congestion window from two, three, or
four segments to one segment, reducing the subsequent load from that
connection on the network. The addition of ECN-capability to SYN/ACK
packets is particularly beneficial in the server-to-client direction,
where congestion is more likely to occur. In this case, the initial
information provided by the ECN marking in the SYN/ACK packet enables
the server to more appropriately adjust the initial load it places on
the network.

Future work will address the more general question of adding ECN-
Capability to relevant handshake packets in other protocols that use
retransmission-based reliability in their setup phase (e.g., SCTP,
DCCP, HIP, and the like).

9. Acknowledgements

We thank Anil Agarwal, Mark Allman, Wesley Eddy, Janardhan Iyengar,
and Pasi Sarolahti for feedback on earlier versions of this draft.

A. Report on Simulations

This section reports on simulations showing the costs of adding ECN+
in highly-congested scenarios. This section also reports on
simulations for a comparative evaluation between ECN+ with NoWaiting
and ECN+ with Waiting.

The simulations are run with a range of file-size distributions. As
a baseline, they use the empirical heavy-tailed distribution reported
in [SCJO01], with a mean file size of around 7 KBytes.
This flow-size distribution is manipulated by skewing the flow sizes
towards lower and higher values to get distributions with mean file
sizes of 3 KBytes, 5 KBytes, 14 KBytes and 17 KBytes. The congested
link is 100 Mbps. RED is run in gentle mode, and arriving ECN-Capable
packets are only dropped instead of marked if the buffer is full (and
the router has no choice).

We explore two alternatives for a TCP node's response to a report of
an ECN-marked SYN/ACK packet. With ECN+ with NoWaiting, the TCP node
sends a data packet immediately (with an initial congestion window of
one segment). With the alternative ECN+ with Waiting, the TCP node
waits a round-trip time before sending a data packet; the sender
already has one measurement of the round-trip time when the
acknowledgement for the SYN/ACK packet is received.

In the tables below, ECN+ refers to ECN+ with NoWaiting, where the
sender starts transmitting immediately, and ECN+/wait refers to ECN+
with Waiting, where the sender waits a round-trip time before sending
a data packet into the network.

The simulation scripts are available at [ECN-SYN], along with graphs
showing the distribution of response times for the TCP connections.

A.1. Simulations with RED in Packet Mode

The simulations with RED in packet mode and with the queue in packets
show that ECN+ is useful in times of moderate congestion, though it
adds little benefit in times of high congestion. The simulations
show a minimal increase in levels of congestion with either ECN+ with
Waiting or ECN+ with NoWaiting, either in terms of packet dropping or
marking rates or in terms of the distribution of response times.
Thus, the simulations show no problems with ECN+ in times of high
congestion, and no reason to use ECN+ with Waiting instead of ECN+
with NoWaiting.

Table 3 shows the congestion levels for simulations with RED in
packet mode, with a queue in packets. To explore a worst-case
scenario, these simulations use a traffic mix with an unrealistically
small flow size distribution, with a mean flow size of 3 KBytes. For
each table showing a particular traffic load, the three rows show the
number of packets dropped, the number of packets ECN-marked, and the
aggregate packet drop rate, and the three columns show the
simulations with Standard ECN, ECN+ (NoWaiting) and ECN+/wait.

The usefulness of ECN+: The first thing to observe is that for the
simulations with the somewhat moderate load of 95%, with packet drop
rates of 5-6%, the use of ECN+ or ECN+/wait more than doubled the
number of packets marked. This indicates that with ECN+ or
ECN+/wait, many SYN/ACK packets are marked instead of dropped.

No increase in congestion: The second thing to observe is that in all
of the simulations, the use of ECN+ or ECN+/wait does not
significantly increase the aggregate packet drop rate.

Comparing ECN+ and ECN+/wait: The third thing to observe is that
there is little difference between ECN+ and ECN+/wait in terms of the
aggregate packet drop rate. Thus, there is no congestion-related
reason to prefer ECN+/wait over ECN+.
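
The first observation can be checked directly against the 95% load
figures in Table 3. The short script below is only a convenience for
that arithmetic; the variable and function names are hypothetical,
and the packet counts are copied from Table 3.

```python
# Packet counts copied from Table 3, Traffic Load = 95%.
# Variable and function names are hypothetical.
table3_load_95 = {
    "ECN":       {"dropped": 74645, "marked": 7639},
    "ECN+":      {"dropped": 64034, "marked": 17681},
    "ECN+/wait": {"dropped": 64983, "marked": 16914},
}


def marked_ratio(variant, baseline="ECN"):
    """Marked packets for a variant, relative to Standard ECN."""
    return (table3_load_95[variant]["marked"]
            / table3_load_95[baseline]["marked"])


def dropped_change(variant, baseline="ECN"):
    """Change in dropped packets relative to Standard ECN (negative = fewer)."""
    return (table3_load_95[variant]["dropped"]
            - table3_load_95[baseline]["dropped"])


if __name__ == "__main__":
    for variant in ("ECN+", "ECN+/wait"):
        print(variant, round(marked_ratio(variant), 2),
              dropped_change(variant))
```

With these counts the marked-packet ratios are roughly 2.3 for ECN+
and 2.2 for ECN+/wait, and both variants drop about ten thousand
fewer packets than Standard ECN at this load.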

Traffic Load = 95%:
               ECN       ECN+    ECN+/wait
             -------   -------   ---------
Dropped       74,645    64,034     64,983
Marked         7,639    17,681     16,914
Loss rate      6.05%     5.26%      5.33%

Traffic Load = 110%:
               ECN       ECN+    ECN+/wait
             -------   -------   ---------
Dropped      161,644   163,620    165,196
Marked         4,375     6,653      6,144
Loss rate     10.38%    10.45%     10.53%

Traffic Load = 125%:
               ECN       ECN+    ECN+/wait
             -------   -------   ---------
Dropped      257,671   268,161    264,437
Marked         2,885     3,712      3,359
Loss rate     14.52%    15.00%     14.83%

Traffic Load = 150%:
               ECN       ECN+    ECN+/wait
             -------   -------   ---------
Loss rate     24.36%    24.61%     24.46%

Traffic Load = 200%:
               ECN       ECN+    ECN+/wait
             -------   -------   ---------
Loss rate     29.99%    30.22%     30.23%

Table 3: Simulations with an average flow size of 3 KBytes, RED in
packet mode, queue in packets.

A.2. Simulations with RED in Byte Mode

Table 4 below shows simulations with RED in byte mode and the queue
in bytes. As in the simulations with RED in packet mode, there is no
significant increase in aggregate congestion with the use of ECN+ or
ECN+/wait, and no congestion-related reason to prefer ECN+/wait over
ECN+.

However, unlike the simulations with RED in packet mode, the
simulations with RED in byte mode show little benefit from the use of
ECN+ or ECN+/wait, in that the packet marking rate with ECN+ or
ECN+/wait is not much different from the packet marking rate with
Standard ECN. This is because with RED in byte mode, small packets
like SYN/ACK packets are rarely dropped or marked - that is, there is
no drawback from the use of ECN+ in these scenarios, but not much
need for ECN+ either, in a scenario where small packets are unlikely
to be dropped or marked.
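
To illustrate why small packets are rarely dropped or marked in byte
mode, the sketch below assumes one common byte-mode formulation, in
which the RED drop/mark probability is scaled by the ratio of the
packet size to a reference packet size. This is an assumption made
for illustration and may differ from the simulator's exact behavior.
The function names, thresholds, max_p value and packet sizes are all
hypothetical, and only the linear region between the two thresholds
is modelled (gentle mode above max_th is not).

```python
# Illustrative sketch only; all names and parameter values are hypothetical.

def red_base_probability(avg_queue, min_th, max_th, max_p):
    """Packet-mode RED drop/mark probability in the linear region.

    Returns 0 below min_th. Behavior at or above max_th (including
    gentle mode) is deliberately not modelled here.
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        raise ValueError("sketch only covers avg_queue below max_th")
    return max_p * (avg_queue - min_th) / (max_th - min_th)


def byte_mode_probability(base_p, packet_bytes, reference_bytes):
    """Scale the base probability by packet size (assumed byte-mode rule)."""
    return min(1.0, base_p * packet_bytes / reference_bytes)


if __name__ == "__main__":
    base = red_base_probability(avg_queue=15, min_th=5, max_th=25, max_p=0.1)
    # Hypothetical sizes: a 40-byte SYN/ACK and a 1500-byte data packet.
    print("packet mode, any size:", base)
    print("byte mode, 40 bytes:  ", byte_mode_probability(base, 40, 1500))
    print("byte mode, 1500 bytes:", byte_mode_probability(base, 1500, 1500))
```

Under these assumed values a 40-byte packet sees a probability about
37 times smaller than a 1500-byte packet, which is consistent with
the observation above that there is not much need for ECN+ when the
queue is in byte mode.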

Traffic Load = 95%:
               ECN       ECN+    ECN+/wait
             -------   -------   ---------
Dropped       13,044    13,323     14,855
Marked        18,880    19,175     19,049
Loss rate      1.13%     1.16%      1.29%

Traffic Load = 110%:
               ECN       ECN+    ECN+/wait
             -------   -------   ---------
Dropped       84,809    83,013     83,564
Marked         4,086     4,644      4,826
Loss rate      5.90%     5.78%      5.81%

Traffic Load = 125%:
               ECN       ECN+    ECN+/wait
             -------   -------   ---------
Dropped      157,305   157,435    158,368
Marked         2,183     2,363      2,663
Loss rate      9.89%     9.87%      9.93%

Table 4: Simulations with an average flow size of 3 KBytes, RED in
byte mode, queue in bytes.

Normative References

[RFC2119] S. Bradner, Key words for use in RFCs to Indicate
Requirement Levels, RFC 2119, March 1997.

[RFC3168] K.K. Ramakrishnan, S. Floyd, and D. Black, The Addition of
Explicit Congestion Notification (ECN) to IP, RFC 3168, Proposed
Standard, September 2001.

Informative References

[ECN+] A. Kuzmanovic, The Power of Explicit Congestion Notification,
SIGCOMM 2005.

[ECN-SYN] ECN-SYN web page with simulation scripts, URL to be added.

[MAF05] A. Medina, M. Allman, and S. Floyd, Measuring the Evolution
of Transport Protocols in the Internet, ACM CCR, April 2005.

[PI] C. Hollot, V. Misra, W. Gong, and D. Towsley, On Designing
Improved Controllers for AQM Routers Supporting TCP Flows, April
1998.

[RED] S. Floyd and V. Jacobson, Random Early Detection Gateways
for Congestion Avoidance, IEEE/ACM Transactions on Networking, V.1
N.4, August 1993.

[REM] S. Athuraliya, V. H. Li, S. H. Low and Q. Yin, REM: Active
Queue Management, IEEE Network, May 2001.

[RFC2309] B. Braden et al., Recommendations on Queue Management and
Congestion Avoidance in the Internet, RFC 2309, April 1998.

[RFC2581] M. Allman, V. Paxson, and W. Stevens, TCP Congestion
Control, RFC 2581, April 1999.

[RFC2988] V. Paxson and M.
Allman, Computing TCP's Retransmission
Timer, RFC 2988, November 2000.

[RFC3042] M. Allman, H. Balakrishnan, and S. Floyd, Enhancing TCP's
Loss Recovery Using Limited Transmit, RFC 3042, Proposed Standard,
January 2001.

[RFC3360] S. Floyd, Inappropriate TCP Resets Considered Harmful, RFC
3360, August 2002.

[RFC3390] M. Allman, S. Floyd, and C. Partridge, Increasing TCP's
Initial Window, RFC 3390, October 2002.

[SCJO01] F. Smith, F. Campos, K. Jeffay, D. Ott, What TCP/IP
Protocol Headers Can Tell us about the Web, SIGMETRICS, June 2001.

[SYN-COOK] Dan J. Bernstein, SYN cookies, 1997, see also

[Tools] S. Floyd and E. Kohler, Tools for the Evaluation of
Simulation and Testbed Scenarios, Internet-draft draft-irtf-tmrg-
tools-03, work in progress, December 2006.

IANA Considerations

There are no IANA considerations regarding this document.

AUTHORS' ADDRESSES

Aleksandar Kuzmanovic
Phone: +1 (847) 467-5519
Northwestern University
Email: akuzma at northwestern.edu
URL: http://cs.northwestern.edu/~a

Amit Mondal
Northwestern University
Email: a-mondal at northwestern.edu

Sally Floyd
Phone: +1 (510) 666-2989
ICIR (ICSI Center for Internet Research)
Email: floyd at icir.org
URL: http://www.icir.org/floyd/

K. K. Ramakrishnan
Phone: +1 (973) 360-8764
AT&T Labs Research
Email: kkrama at research.att.com
URL: http://www.research.att.com/info/kkrama

Full Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at ietf-
ipr@ietf.org.