Internet Engineering Task Force                            A. Kuzmanovic
INTERNET DRAFT                                   Northwestern University
draft-ietf-tcpm-ecnsyn-01.txt                                   S. Floyd
                                                                    ICIR
                                                       K.K. Ramakrishnan
                                                                    AT&T
                                                           October, 2006

   Adding Explicit Congestion Notification (ECN) Capability to TCP's
                           SYN/ACK Packets

Status of this Memo

By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

This Internet-Draft will expire on July 2006.

Abstract

This draft specifies a modification to RFC 3168 to allow TCP SYN/ACK
packets to be ECN-Capable. For TCP, RFC 3168 only specified setting
an ECN-Capable codepoint on data packets, and not on SYN and SYN/ACK
packets. However, because of the high cost to the TCP transfer of
having a SYN/ACK packet dropped, with the resulting retransmit
timeout, this document is specifying the use of ECN for the SYN/ACK
packet itself, when sent in response to a SYN packet with the two ECN
flags set in the TCP header, indicating a willingness to use ECN.
Setting TCP SYN/ACK packets as ECN-Capable can be of great benefit to
the TCP connection, avoiding the severe penalty of a retransmit
timeout for a connection that has not yet started placing a load on
the network. The sender of the SYN/ACK packet must respond to an ECN
mark by reducing its initial congestion window from two, three, or
four segments to one segment, reducing the subsequent load from that
connection on the network.

NOTE TO RFC EDITOR: PLEASE DELETE THIS NOTE UPON PUBLICATION.

Changes from draft-ietf-tcpm-ecnsyn-00:

* Only updating the revision number.

Changes from draft-ietf-twvsg-ecnsyn-00:

* Changed name of draft to draft-ietf-tcpm-ecnsyn.

* Added a discussion in Section 3 of "Response to
  ECN-marking of SYN/ACK packets". Based on
  suggestions from Mark Allman.

* Added a discussion to the Conclusions about adding
  ECN-capability to relevant set-up packets in other
  protocols. From a suggestion from Wesley Eddy.

* Added a description of SYN exchanges with SYN cookies.
  From a suggestion from Wesley Eddy.

* Added a discussion of one-way data transfers, where the
  host sending the SYN/ACK packet sends no data packets.

* Minor editing, from feedback from Mark Allman and Janardhan
  Iyengar.

* Future work: a look at the costs of adding
  ECN-Capability in a worst-case scenario.
  From feedback from Mark Allman and Janardhan Iyengar.

* Future work: a comparative evaluation of two
  possible responses to an ECN-marked SYN/ACK packet.

Changes from draft-kuzmanovic-ecn-syn-00.txt:

* Changed name of draft to draft-ietf-twvsg-ecnsyn.

END OF NOTE TO RFC EDITOR.

1. Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC 2119].

1. Introduction

TCP's congestion control mechanism has primarily used packet loss as
the congestion indication, with packets dropped when buffers
overflow. With such tail-drop mechanisms, the packet delay can be
high, as the queue at bottleneck routers can be fairly large.
Dropping packets only when the queue overflows, and having TCP react
only to such losses, results in:
1) significantly higher packet delay;
2) unnecessarily many packet losses; and
3) unfairness due to synchronization effects.

The adoption of Active Queue Management (AQM) mechanisms allows
better control of bottleneck queues [RFC2309]. This use of AQM has
the following potential benefits:
1) better control of the queue, with reduced queueing delay;
2) fewer packet drops; and
3) better fairness because of fewer synchronization effects.

With the adoption of ECN, performance may be further improved. When
the router detects congestion before buffer overflow, the router can
provide a congestion indication either by dropping a packet, or by
setting the Congestion Experienced (CE) codepoint in the Explicit
Congestion Notification (ECN) field in the IP header [RFC3168]. The
IETF has standardized the use of the Congestion Experienced (CE)
codepoint in the IP header for routers to indicate congestion. For
incremental deployment and backwards compatibility, the RFC on ECN
[RFC3168] specifies that routers may mark ECN-capable packets that
would otherwise have been dropped, using the Congestion Experienced
codepoint in the ECN field. The use of ECN allows TCP to react to
congestion while avoiding unnecessary retransmissions and, in some
cases, unnecessary retransmit timeouts. Thus, using ECN has several
benefits:

1) For short transfers, a TCP connection's congestion window may be
small. For example, if the current window contains only one packet,
and that packet is dropped, TCP will have to wait for a retransmit
timeout to recover, reducing its overall throughput. Similarly, if
the current window contains only a few packets and one of those
packets is dropped, there might not be enough duplicate
acknowledgements for a fast retransmission, and the sender might have
to wait for a delay of several round-trip times using Limited
Transmit [RFC3042]. With the use of ECN, short flows are less likely
to have packets dropped, sometimes avoiding unnecessary delays or
costly retransmit timeouts.

2) While longer flows may not see substantially improved throughput
with the use of ECN, they experience lower loss. This may benefit TCP
applications that are latency- and loss-sensitive, because of the
avoidance of retransmissions.
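
The small-window argument in benefit 1) can be illustrated with a
short sketch. It is a simplified model, not part of the specification:
it assumes exactly one drop per window, an immediate duplicate ACK for
every segment that arrives after the dropped one, the usual threshold
of three duplicate ACKs for fast retransmit, and no Limited Transmit.
The function names are hypothetical.

```python
DUPACK_THRESHOLD = 3  # duplicate ACKs needed to trigger fast retransmit


def dupacks_after_single_drop(window: int, dropped_index: int) -> int:
    """Duplicate ACKs generated when one segment of a window is dropped.

    dropped_index is 1-based. Assumption: every segment sent after the
    dropped one arrives and triggers exactly one duplicate ACK.
    """
    if not 1 <= dropped_index <= window:
        raise ValueError("dropped_index must lie within the window")
    return window - dropped_index


def needs_timeout(window: int, dropped_index: int) -> bool:
    """True if the sender cannot fast-retransmit and must wait for a
    retransmit timeout (in this simplified model)."""
    return dupacks_after_single_drop(window, dropped_index) < DUPACK_THRESHOLD


if __name__ == "__main__":
    for window in (1, 2, 4, 10):
        print(window, [needs_timeout(window, i) for i in range(1, window + 1)])
```

In this model a window of one to three segments never produces enough
duplicate ACKs, which is the case in which marking instead of dropping
helps most.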

RFC 3168 only specified marking the Congestion Experienced codepoint
on TCP's data packets, and not on SYN and SYN/ACK packets. RFC 3168
specified the negotiation of the use of ECN between the two TCP end-
points in the TCP SYN and SYN-ACK exchange, using flags in the TCP
header. Erring on the side of being conservative, RFC 3168 did not
specify the use of ECN for the SYN/ACK packet itself. However,
because of the high cost to the TCP transfer of having a SYN/ACK
packet dropped, with the resulting retransmit timeout, this document
is specifying the use of ECN for the SYN/ACK packet itself. This can
be of great benefit to the TCP connection, avoiding the severe
penalty of a retransmit timeout for a connection that has not yet
started placing a load on the network. The sender of the SYN/ACK
packet must respond to an ECN mark by reducing its initial congestion
window from two, three, or four segments to one segment, reducing the
subsequent load from that connection on the network.

The use of ECN for SYN/ACK packets has the following potential
benefits:
1) Avoidance of a retransmit timeout;
2) Improvement in the throughput of short connections.

This draft specifies a modification to RFC 3168 to allow TCP SYN/ACK
packets to be ECN-Capable. Section 2 contains the specification of
the change, while Section 3 discusses some of the issues, and Section
4 discusses related work. Section 5 contains an evaluation of the
proposed change.

2. Proposal

This section specifies the modification to RFC 3168 to allow TCP
SYN/ACK packets to be ECN-Capable. We use the following terminology
from RFC 3168:

The ECN field in the IP header:
o CE: the Congestion Experienced codepoint; and
o ECT: either one of the two ECN-Capable Transport codepoints.

The ECN flags in the TCP header:
o CWR: the Congestion Window Reduced flag; and
o ECE: the ECN-Echo flag.
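
As an illustration of the two groups of terms above, the following
sketch shows how the ECN field and the ECN flags could be read from
raw header octets. The bit values follow RFC 3168 (the ECN field is
the two least-significant bits of the former TOS octet; CWR and ECE
are TCP flag bits 0x80 and 0x40). The constant and function names are
hypothetical and chosen only for this example.

```python
# ECN field codepoints in the IP header (two bits), as assigned in RFC 3168.
NOT_ECT = 0b00  # not ECN-Capable Transport
ECT_1 = 0b01    # ECT(1)
ECT_0 = 0b10    # ECT(0)
CE = 0b11       # Congestion Experienced

# ECN flags in the TCP header flags octet.
TCP_ECE = 0x40  # ECN-Echo
TCP_CWR = 0x80  # Congestion Window Reduced


def ecn_field(tos_octet: int) -> int:
    """Extract the ECN field from the IPv4 TOS / IPv6 Traffic Class octet."""
    return tos_octet & 0x03


def is_ect(tos_octet: int) -> bool:
    """True if the packet carries either of the two ECT codepoints."""
    return ecn_field(tos_octet) in (ECT_0, ECT_1)


def is_ce(tos_octet: int) -> bool:
    """True if a router has set the Congestion Experienced codepoint."""
    return ecn_field(tos_octet) == CE


def ecn_flags(tcp_flags: int):
    """Return (CWR set?, ECE set?) for a TCP flags octet."""
    return bool(tcp_flags & TCP_CWR), bool(tcp_flags & TCP_ECE)


if __name__ == "__main__":
    print(is_ect(0x02), is_ce(0x03), ecn_flags(0xC2))
```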

ECN-setup packets:
o ECN-setup SYN packet: a SYN packet with the ECE and CWR flags;
o ECN-setup SYN-ACK packet: a SYN-ACK packet with ECE but not CWR.

Section 6.1.1 of RFC 3168 states that "A host MUST NOT set ECT on
SYN or SYN-ACK packets." In this section, we specify that a TCP node
MAY respond to an ECN-setup SYN packet by setting ECT in the
responding ECN-setup SYN/ACK packet, indicating to routers that the
SYN/ACK packet is ECN-Capable. This allows a congested router along
the path to mark the packet instead of dropping the packet as an
indication of congestion.

Assume that TCP node A transmits to TCP node B an ECN-setup SYN
packet, indicating willingness to use ECN for this connection. As
specified by RFC 3168, if TCP node B is willing to use ECN, node B
responds with an ECN-setup SYN-ACK packet.

Table 1 shows an interchange with the SYN/ACK packet dropped by a
congested router. Node B waits for a retransmit timeout, and then
retransmits the SYN/ACK packet.

---------------------------------------------------------------
   TCP Node A             Router                  TCP Node B
   ----------             ------                  ----------

   ECN-setup SYN packet --->
                                        ECN-setup SYN packet --->

                          <--- ECN-setup SYN/ACK, possibly ECT
                                                  3-second timer set
                         SYN/ACK dropped                      .
                                                              .
                                                              .
                                              3-second timer expires
                          <--- ECN-setup SYN/ACK, not ECT
   <--- ECN-setup SYN/ACK
   Data/ACK --->
                                        Data/ACK --->
                          <--- Data (one to four segments)
---------------------------------------------------------------

Table 1: SYN exchange with the SYN/ACK packet dropped.

If the SYN/ACK packet is dropped in the network, the TCP host (node
B) responds by waiting three seconds for the retransmit timer to
expire [RFC2988]. If a SYN/ACK packet with the ECT codepoint is
dropped, the TCP node SHOULD resend the SYN/ACK packet without the
ECN-Capable codepoint. (Although we are not aware of any middleboxes
that drop SYN/ACK packets that contain an ECN-Capable codepoint in
the IP header, we have learned to design our protocols defensively in
this regard [RFC3360].)

We note that if SYN cookies were used by Node B in the exchange in
Table 1, TCP Node B would not set a timer upon transmission of the
SYN/ACK packet [SYN-COOK]. In this case, if the SYN/ACK packet was
lost, the initiator (Node A) would have to time out and retransmit
the SYN packet in order to trigger another SYN-ACK.

Table 2 shows an interchange with the SYN/ACK packet sent as ECN-
Capable, and ECN-marked instead of dropped at the congested router.

---------------------------------------------------------------
   TCP Node A             Router                  TCP Node B
   ----------             ------                  ----------

   ECN-setup SYN packet --->
                                        ECN-setup SYN packet --->

                          <--- ECN-setup SYN/ACK, ECT
                          <--- Sets CE on SYN/ACK
   <--- ECN-setup SYN/ACK, CE

   Data/ACK, ECN-Echo --->
                                        Data/ACK, ECN-Echo --->
                                        Window reduced to one segment.
                          <--- Data, CWR (one segment only)
---------------------------------------------------------------

Table 2: SYN exchange with the SYN/ACK packet marked.

If the receiving node (node A) receives a SYN/ACK packet that has
been marked by the congested router, with the CE codepoint set, the
receiving node MUST respond by setting the ECN-Echo flag in the TCP
header of the responding ACK packet. As specified in RFC 3168, the
receiving node continues to set the ECN-Echo flag in packets until it
receives a packet with the CWR flag set.

When the sending node (node B) receives the ECN-Echo packet reporting
the Congestion Experienced indication in the SYN/ACK packet, the node
MUST set the initial congestion window to one segment, instead of two
segments as allowed by [RFC2581], or three or four segments allowed
by [RFC3390].
If the sending node (node B) was going to use an
initial window of one segment, and receives an ECN-Echo packet
informing it of a Congestion Experienced indication on its SYN/ACK
packet, the sending node MAY continue to send with an initial window
of one segment, without waiting for a retransmit timeout. We note
that this updates RFC 3168, which specifies that "the sending TCP
MUST reset the retransmit timer on receiving the ECN-Echo packet when
the congestion window is one." As specified by RFC 3168, the sending
node (node B) also sets the CWR flag in the TCP header of the next
data packet sent, to acknowledge its receipt of and reaction to the
ECN-Echo flag.

If the data transfer in Table 2 is entirely from Node A to Node B,
then Node A continues to set the ECN-Echo flag in its data packets,
waiting for the CWR flag from Node B acknowledging a response to the
ECN-Echo flag.

3. Discussion

Motivation:
The rationale for the proposed change is the following. When node B
receives a TCP SYN packet with the ECN-Echo and CWR flags set in the
TCP header (an ECN-setup SYN packet), this indicates that node A is
ECN-capable. If node B is also ECN-capable, there are no obstacles to
immediately setting one of the ECN-Capable codepoints in the IP
header in the responding TCP SYN/ACK packet.

There can be a great benefit in setting an ECN-capable codepoint in
SYN/ACK packets, as is discussed further in Section 4. Congestion is
most likely to occur in the server-to-client direction. As a result,
setting an ECN-capable codepoint in SYN/ACK packets can reduce the
occurrence of three-second retransmit timeouts resulting from the
drop of SYN/ACK packets.

Flooding attacks:
Setting an ECN-Capable codepoint in the responding TCP SYN/ACK
packets does not raise any novel security vulnerabilities. For
example, provoking servers or hosts to send SYN/ACK packets to a
third party in order to perform a "SYN/ACK flood" attack would be
highly inefficient. Third parties would immediately drop such
packets, since they would know that they didn't generate the TCP SYN
packets in the first place. Moreover, such SYN/ACK attacks would
have the same signatures as the existing TCP SYN attacks. Provoking
servers or hosts to reply with SYN/ACK packets in order to congest a
certain link would also be highly inefficient because SYN/ACK packets
are small in size.

However, the addition of ECN-Capability to SYN/ACK packets could
allow SYN/ACK packets to persist for more hops along a network path
before being dropped, thus adding somewhat to the ability of a
SYN/ACK attack to flood a network link.

The TCP SYN packet:
There are several reasons why an ECN-Capable codepoint MUST NOT be
set in the IP header of the initiating TCP SYN packet. First, when
the TCP SYN packet is sent, there are no guarantees that the other
TCP endpoint (node B in Table 2) is ECN-capable, or that it would be
able to understand and react if the ECN CE codepoint was set by a
congested router.

Second, the ECN-Capable codepoint in TCP SYN packets could be misused
by malicious clients to `improve' the well-known TCP SYN attack. By
setting an ECN-Capable codepoint in TCP SYN packets, a malicious host
might be able to inject a large number of TCP SYN packets through a
potentially congested ECN-enabled router, congesting it even further.

For both these reasons, we continue the restriction that the TCP SYN
packet MUST NOT have the ECN-Capable codepoint in the IP header set.

Backwards compatibility:
If there are some older TCP implementations that don't respond to the
Congestion Experienced codepoint in a SYN/ACK packet, that would not
be an insurmountable problem.
It would mean that the sender of the
SYN/ACK packet would not reduce the initial congestion window from
two, three, or four segments down to one segment, as it should.
However, the TCP sender would still respond correctly to any
subsequent CE indications on data packets later on in the connection.

SYN/ACK packets and packet size:
Some router buffer architectures have lower drop rates for small
(SYN) packets than for large (data) packets, while others do not.
For example, for a Drop Tail queue in units of packets, where each
packet takes a single slot in the buffer regardless of packet size,
small and large packets are equally likely to be dropped. However,
for a Drop Tail queue in units of bytes, small packets are less
likely to be dropped than are large ones. Similarly, for RED in
packet mode, small and large packets are equally likely to be dropped
or marked, while for RED in byte mode, a packet's chance of being
dropped or marked is proportional to the packet size in bytes.

For a congested router with an AQM mechanism in byte mode, where a
packet's chance of being dropped or marked is proportional to the
packet size in bytes, the drop or marking rate for TCP SYN/ACK
packets should generally be low. In this case, the benefit of making
SYN/ACK packets ECN-Capable should be similarly moderate. However,
for a congested router with a Drop Tail queue in units of packets or
with an AQM mechanism in packet mode, and with no priority queueing
for smaller packets, small and large packets should have the same
probability of being dropped or marked. In such a case, making
SYN/ACK packets ECN-Capable should be of significant benefit.

We believe that there is a wide range of behaviors in the real world
in terms of the drop or mark behavior at routers as a function of
packet size [Tools] (Section 10). We note that all of the
alternatives listed above are available in the NS simulator (Drop
Tail queues are by default in units of packets, while the default for
RED queue management has been changed from packet mode to byte mode).

Response to ECN-marking of SYN/ACK packets:
One question is why TCP SYN/ACK packets should be treated differently
from other packets in terms of the packet sender's response to an
ECN-marked packet. Section 5 of RFC 3168 specifies the following:

"Upon the receipt by an ECN-Capable transport of a single CE packet,
the congestion control algorithms followed at the end-systems MUST be
essentially the same as the congestion control response to a *single*
dropped packet. For example, for ECN-Capable TCP the source TCP is
required to halve its congestion window for any window of data
containing either a packet drop or an ECN indication."

In particular, Section 6.1.2 of RFC 3168 specifies that when the TCP
congestion window consists of a single packet and that packet is ECN-
marked in the network, then the sender must reduce the sending rate
below one packet per round-trip time, by waiting for one RTO before
sending another packet. If the RTO was set to the average round-trip
time, this would result in halving the sending rate; because the RTO
is in fact larger than the average round-trip time, the sending rate
is reduced to less than half of its previous value.

TCP's congestion control response to the *dropping* of a SYN/ACK
packet is to wait a default time before sending another packet. This
document argues that ECN gives end-systems a wider range of possible
responses to the *marking* of a SYN/ACK packet, and that waiting a
default time before sending a data packet is not the desired
response.
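
The arithmetic behind the RFC 3168 Section 6.1.2 behavior described
above can be sketched as follows. This is an illustrative model under
stated assumptions, not text from RFC 3168: with a window of one
packet the sender transmits once per round-trip time, and after a mark
it waits one RTO, counted from the arrival of the ECN-Echo, before
sending again. The function names and the example values are
hypothetical.

```python
def sending_rate_before_mark(rtt: float) -> float:
    """Packets per second with a congestion window of one packet."""
    return 1.0 / rtt


def sending_rate_after_mark(rtt: float, rto: float) -> float:
    """Packets per second when each marked packet costs one extra RTO.

    Assumed cycle: the packet is sent, the ECN-Echo returns one RTT
    later, and the next packet is sent one RTO after that.
    """
    return 1.0 / (rtt + rto)


def rate_ratio(rtt: float, rto: float) -> float:
    """Fraction of the original sending rate that remains after the mark."""
    return sending_rate_after_mark(rtt, rto) / sending_rate_before_mark(rtt)


if __name__ == "__main__":
    # RTO equal to the RTT: the rate is halved.
    print(rate_ratio(rtt=0.5, rto=0.5))
    # RTO larger than the RTT: the rate falls below one half.
    print(rate_ratio(rtt=0.5, rto=1.5))
```

The second case is the realistic one, since the RTO exceeds the
average round-trip time.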

On the conservative end, one could assume an effective congestion
window of one packet for the SYN/ACK packet, and respond to an ECN-
marked SYN/ACK packet by reducing the sending rate to one packet
every two round-trip times. As an approximation, the TCP end-node
could measure the round-trip time T between the sending of the
SYN/ACK packet and the receipt of the acknowledgement, and reply to
the acknowledgement of the ECN-marked SYN/ACK packet by waiting T
seconds before sending a data packet. However, we note that for an
ECN-marked SYN/ACK packet, halving the *congestion window* is not the
same as halving the *sending rate*; there is no `sending rate'
associated with an ECN-Capable SYN/ACK packet, as such packets are
only sent as the first packet in a connection from that host.
Further, a router's marking of a SYN/ACK packet is not affected by
any past history of that connection.

Adding ECN-Capability to SYN/ACK packets allows the simple response
of setting the initial congestion window to one packet, instead of
its allowed default value of two, three, or four packets, with the
host proceeding with a cautious sending rate of one packet per round-
trip time. If that packet is ECN-marked or dropped, then the sender
will wait an RTO before sending another packet. This document argues
that such an approach is useful to users, with no dangers of
congestion collapse or of starvation of competing traffic.

We note that if the data transfer is entirely from Node A to Node B,
then there is no effective difference between the two possible
responses to an ECN-marked SYN/ACK packet outlined above. In either
case, Node B sends no data packets, only sending acknowledgement
packets in response to received data packets.

4. Related Work

The addition of ECN-capability to TCP's SYN/ACK packets was proposed
in [ECN+]. The paper includes an extensive set of simulation and
testbed experiments to evaluate the effects of the proposal, using
several Active Queue Management (AQM) mechanisms, including Random
Early Detection (RED) [RED], Random Exponential Marking (REM) [REM],
and Proportional Integrator (PI) [PI]. The performance measures were
the end-to-end response times for each request/response pair, and the
aggregate throughput on the bottleneck link. The end-to-end response
time was computed as the time from the moment when the request for
the file is sent to the server, until that file is successfully
downloaded by the client.

The measurements from [ECN+] showed that setting an ECN-Capable
codepoint in the IP packet header in TCP SYN/ACK packets
systematically improves performance with all evaluated AQM schemes.
When SYN/ACK packets at a congested router are ECN-marked instead of
dropped, this can avoid a long initial retransmit timeout, improving
the response time for the affected flow dramatically.

[ECN+] showed that the impact on aggregate throughput can also be
quite significant, because marking SYN/ACK packets can prevent larger
flows from suffering long timeouts before being "admitted" into the
network. In addition, the testbed measurements from [ECN+] showed
that Web servers setting the ECN-Capable codepoint in TCP SYN/ACK
packets could serve more requests.

As a final step, [ECN+] explored the co-existence of flows that do
and don't set the ECN-capable codepoint in TCP SYN/ACK packets. The
results in [ECN+] show that both types of flows can coexist, with
some performance degradation for flows that don't apply the change.
Flows that apply the change improve their end-to-end performance.
At the same time, the performance degradation for flows that don't
apply the change, as a result of the flows that do apply the change,
increases as a greater fraction of flows apply the change.

5. Performance Evaluation

5.1. The Costs and Benefits of Adding ECN-Capability

[ECN+] explored the costs and benefits of adding ECN-Capability to
SYN/ACK packets with both simulations and experiments. The addition
of ECN-capability to SYN/ACK packets could be of significant benefit
for those ECN connections that would have had the SYN/ACK packet
dropped in the network, and for which the ECN-Capability would allow
the SYN/ACK to be marked rather than dropped.

The percentage of SYN/ACK packets on a link can be quite high. In
particular, measurements on links dominated by Web traffic indicate
that 15-20% of the packets can be SYN/ACK packets [SCJO01].

The benefit of adding ECN-capability to SYN/ACK packets depends in
part on the size of the data transfer. The drop of a SYN/ACK packet
can increase the download time of a short file by an order of
magnitude, by requiring a three-second retransmit timeout. For
longer-lived flows, the effect of a dropped SYN/ACK packet on file
download time is less dramatic. However, even for longer-lived
flows, the addition of ECN-capability to SYN/ACK packets can improve
the fairness among long-lived flows, as newly-arriving flows would be
less likely to have to wait for retransmit timeouts.

The question that arises, of course, is what fraction of connections
would benefit from making SYN/ACK packets ECN-capable in a particular
scenario. Specifically:

(1) What fraction of arriving SYN/ACK packets are dropped at the
congested router when the SYN/ACK packets are not ECN-capable?

(2) Of those SYN/ACK packets that are dropped, what fraction of those
drops would have been ECN-marks instead of drops if the SYN/ACK
packets had been ECN-capable?

To answer (1), it is necessary to consider not only the level of
congestion but also the queue architecture at the congested link. As
described in Section 3 above, for some queue architectures small
packets are less likely to be dropped than large ones. In such an
environment, SYN/ACK packets would have lower packet drop rates; the
answer to question (1) could not necessarily be inferred from the
overall packet drop rate, but could be obtained by measuring the drop
rate for SYN/ACK packets directly. In such an environment, adding
ECN-capability to SYN/ACK packets would be of less dramatic benefit
than in environments where all packets are equally likely to be
dropped regardless of packet size.

As question (2) implies, even if all of the SYN/ACK packets were ECN-
capable, there could still be some SYN/ACK packets dropped instead of
marked at the congested link; the full answer to question (2) depends
on the details of the queue management mechanism at the router. If
congestion is sufficiently bad, and the queue management mechanism
cannot prevent the buffer from overflowing, then SYN/ACK packets will
be dropped rather than marked upon buffer overflow whether or not
they are ECN-capable.

For some AQM mechanisms, ECN-capable packets are marked instead of
dropped any time this is possible, that is, any time the buffer is
not yet full. For other AQM mechanisms however, such as the RED
mechanism as recommended in [RED], packets are dropped rather than
marked when the packet drop/mark rate exceeds a certain threshold,
e.g., 10%, even if the packets are ECN-capable.
For a router with
such an AQM mechanism, when congestion is sufficiently severe to
cause a high drop/mark rate, some SYN/ACK packets would be dropped
instead of marked whether or not they were ECN-capable.

Thus, the degree of benefit of adding ECN-Capability to SYN/ACK
packets depends not only on the overall packet drop rate in the
network, but also on the queue management architecture at the
congested link.

5.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK
Packets.

This document specifies that the end-node responds to the report of
an ECN-marked SYN/ACK packet by setting the initial congestion window
to one packet, instead of its possible default value of two to four
packets. However, in Section 3 we discussed another possible
response to an ECN-marked SYN/ACK packet, of the end-node waiting an
RTT before sending a data packet. Future work will include a
comparative evaluation of these two methods.

6. Security Considerations

TCP packets carrying the ECT codepoint in IP headers can be marked
rather than dropped by ECN-capable routers. This raises several
security concerns that we discuss below.

"Bad" middleboxes:
There is a small but decreasing number of middleboxes that drop or
reset SYN and SYN/ACK packets based on the ECN-related flags in the
TCP header [MAF05], [RFC3360]. While there is no evidence that any
middleboxes drop SYN/ACK packets that contain an ECN-Capable
codepoint in the *IP header*, such behavior cannot be excluded.
Thus, as specified in Section 2, if a SYN/ACK packet with the ECT
codepoint is dropped, the TCP node SHOULD resend the SYN/ACK packet
without the ECN-Capable codepoint.
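
The sender-side rules from Section 2, including the retransmission
rule restated above, can be summarized in a minimal sketch. It is an
illustration under assumptions rather than a reference implementation:
it assumes node B is itself willing to use ECN, treats every SYN/ACK
retransmission as following the drop of an ECT-marked SYN/ACK, and
uses hypothetical function and parameter names.

```python
NOT_ECT = "Not-ECT"
ECT = "ECT"


def synack_codepoint(peer_sent_ecn_setup_syn: bool,
                     is_retransmission: bool) -> str:
    """IP-header ECN codepoint for a SYN/ACK sent by node B.

    The first SYN/ACK answering an ECN-setup SYN MAY carry ECT; a
    retransmitted SYN/ACK SHOULD be sent without ECT.
    """
    if peer_sent_ecn_setup_syn and not is_retransmission:
        return ECT
    return NOT_ECT


def initial_window(default_iw_segments: int, ack_has_ecn_echo: bool) -> int:
    """Initial congestion window, in segments, once the handshake ACK
    arrives.

    If the ACK reports (via ECN-Echo) that the SYN/ACK was CE-marked,
    the window MUST be one segment; otherwise the configured default of
    one to four segments is kept.
    """
    if not 1 <= default_iw_segments <= 4:
        raise ValueError("default initial window must be 1 to 4 segments")
    return 1 if ack_has_ecn_echo else default_iw_segments


if __name__ == "__main__":
    print(synack_codepoint(True, False), synack_codepoint(True, True))
    print(initial_window(4, True), initial_window(4, False))
```

As specified in Section 2, node B also sets the CWR flag on the next
data packet it sends after reacting to the ECN-Echo; that step is not
modeled here.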
585 Congestion collapse: 586 Because TCP SYN/ACK packets carrying an ECT codepoint could be ECN- 587 marked instead of dropped at an ECN-capable router, the concern is 588 whether this can either invoke congestion, or worsen performance in 589 highly congested scenarios. This is not a problem because after 590 learning that the SYN/ACK packet was ECN-marked, the sender of that 591 packet will only send one data packet; in the case that this data 592 packet is ECN-marked, the sender will wait for a retransmission 593 timeout. In addition, routers are free to drop rather than mark 594 arriving packets in times of high congestion, regardless of whether 595 the packets are ECN-capable. 597 7. Conclusions 599 This draft specifies a modification to RFC 3168 to allow TCP nodes to 600 send SYN/ACK packets as being ECN-Capable. Making the SYN/ACK packet 601 ECN-Capable avoids the high cost to a TCP transfer when a SYN/ACK 602 packet is dropped by a congested router, by avoiding the resulting 603 retransmit timeout. This improves the throughput of short 604 connections. The sender of the SYN/ACK packet responds to an ECN 605 mark by reducing its initial congestion window from two, three, or 606 four segments to one segment, reducing the subsequent load from that 607 connection on the network. The addition of ECN-capability to SYN/ACK 608 packets is particularly beneficial in the server-to-client direction, 609 where congestion is more likely to occur. In this case, the initial 610 information provided by the ECN marking in the SYN/ACK packet enables 611 the server to more appropriately adjust the initial load it places on 612 the network. 614 Future work will address the more general question of adding ECN- 615 Capability to relevant handshake packets in other protocols that use 616 retransmission-based reliability in their setup phase (e.g., SCTP, 617 DCCP, HIP, and the like). 619 8. 
Acknowledgements 621 We thank Mark Allman, Wesley Eddy, Janardhan Iyengar, and Pasi 622 Sarolahti for feedback on earlier versions of this draft. 624 9. Normative References 626 [RFC2119] S. Bradner, Key words for use in RFCs to Indicate 627 Requirement Levels, RFC 2119, March 1997. 629 [RFC3168] K.K. Ramakrishnan, S. Floyd, and D. Black, The Addition of 630 Explicit Congestion Notification (ECN) to IP, RFC 3168, Proposed 631 Standard, September 2001. 633 [RFC3390] M. Allman, S. Floyd, and C. Partridge, Increasing TCP's 634 Initial Window, RFC 3390, October 2002. 636 10. Informative References 638 [ECN+] A. Kuzmanovic, The Power of Explicit Congestion Notification, 639 SIGCOMM 2005. 641 [MAF05] A. Medina, M. Allman, and S. Floyd, Measuring the Evolution 642 of Transport Protocols in the Internet, ACM CCR, April 2005. 644 [PI] C. Hollot, V. Misra, W. Gong, and D. Towsley, On Designing 645 Improved Controllers for AQM Routers Supporting TCP Flows, INFOCOM, 646 June 2001. 648 [RED] S. Floyd and V. Jacobson, Random Early Detection Gateways for 649 Congestion Avoidance, IEEE/ACM Transactions on Networking, V.1, N.4, 650 1993. 652 [REM] S. Athuraliya, V. Li, S. Low, and Q. Yin, REM: Active Queue 653 Management, IEEE Network, V.15, N.3, May 2001. 655 [RFC2309] B. Braden et al., Recommendations on Queue Management and 656 Congestion Avoidance in the Internet, RFC 2309, April 1998. 658 [RFC2581] M. Allman, V. Paxson, and W. Stevens, TCP Congestion 659 Control, RFC 2581, April 1999. 661 [RFC2988] V. Paxson and M. Allman, Computing TCP's Retransmission 662 Timer, RFC 2988, November 2000. 664 [RFC3042] M. Allman, H. Balakrishnan, and S. Floyd, Enhancing TCP's 665 Loss Recovery Using Limited Transmit, RFC 3042, Proposed Standard, 666 January 2001. 668 [RFC3360] S. Floyd, Inappropriate TCP Resets Considered Harmful, RFC 669 3360, August 2002. 671 [SCJO01] F. Smith, F. Campos, K. Jeffay, and D. Ott, What TCP/IP 672 Protocol Headers Can Tell us about the Web, SIGMETRICS, June 2001.
674 [SYN-COOK] Dan J. Bernstein, SYN cookies, 1997, see also 675 677 [Tools] S. Floyd and E. Kohler, Tools for the Evaluation of 678 Simulation and Testbed Scenarios, Internet-draft draft-irtf-tmrg- 679 tools-02, work in progress, June 2006. 681 11. IANA Considerations 683 There are no IANA considerations regarding this document. 685 AUTHORS' ADDRESSES 687 Aleksandar Kuzmanovic 688 Phone: +1 (847) 467-5519 689 Northwestern University 690 Email: akuzma@northwestern.edu 691 URL: http://cs.northwestern.edu/~a 693 Sally Floyd 694 Phone: +1 (510) 666-2989 695 ICIR (ICSI Center for Internet Research) 696 Email: floyd@icir.org 697 URL: http://www.icir.org/floyd/ 699 K. K. Ramakrishnan 700 Phone: +1 (973) 360-8764 701 AT&T Labs Research 702 Email: kkrama@research.att.com 703 URL: http://www.research.att.com/info/kkrama 705 Full Copyright Statement 707 Copyright (C) The Internet Society (2006). This document is subject 708 to the rights, licenses and restrictions contained in BCP 78, and 709 except as set forth therein, the authors retain all their rights. 711 This document and the information contained herein are provided on 712 an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE 713 REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE 714 INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR 715 IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 716 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 717 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
719 Intellectual Property 721 The IETF takes no position regarding the validity or scope of any 722 Intellectual Property Rights or other rights that might be claimed 723 to pertain to the implementation or use of the technology described 724 in this document or the extent to which any license under such 725 rights might or might not be available; nor does it represent that 726 it has made any independent effort to identify any such rights. 727 Information on the procedures with respect to rights in RFC 728 documents can be found in BCP 78 and BCP 79. 730 Copies of IPR disclosures made to the IETF Secretariat and any 731 assurances of licenses to be made available, or the result of an 732 attempt made to obtain a general license or permission for the use 733 of such proprietary rights by implementers or users of this 734 specification can be obtained from the IETF on-line IPR repository 735 at http://www.ietf.org/ipr. 736 The IETF invites any interested party to bring to its attention any 737 copyrights, patents or patent applications, or other proprietary 738 rights that may cover technology that may be required to implement 739 this standard. Please address the information to the IETF at ietf- 740 ipr@ietf.org.