idnits 2.17.1 draft-ietf-tcpm-ecnsyn-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 17. -- Found old boilerplate from RFC 3978, Section 5.5 on line 710. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 721. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 733. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 733. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. 
If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC 2119' is mentioned on line 96, but not defined == Missing Reference: 'RFC 3168' is mentioned on line 124, but not defined == Missing Reference: 'Section 10' is mentioned on line 383, but not defined -- Obsolete informational reference (is this intentional?): RFC 2309 (Obsoleted by RFC 7567) -- Obsolete informational reference (is this intentional?): RFC 2581 (Obsoleted by RFC 5681) -- Obsolete informational reference (is this intentional?): RFC 2988 (Obsoleted by RFC 6298) == Outdated reference: A later version (-05) exists of draft-irtf-tmrg-tools-00 Summary: 3 errors (**), 0 flaws (~~), 6 warnings (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force A. Kuzmanovic 2 INTERNET DRAFT Northwestern University 3 draft-ietf-tcpm-ecnsyn-00.txt S. Floyd 4 ICIR 5 K.K. Ramakrishnan 6 AT&T 7 January, 2006 9 Adding Explicit Congestion Notification (ECN) Capability to TCP's 10 SYN/ACK Packets 12 Status of this Memo 14 By submitting this Internet-Draft, each author represents that any 15 applicable patent or other IPR claims of which he or she is aware 16 have been or will be disclosed, and any of which he or she becomes 17 aware will be disclosed, in accordance with Section 6 of BCP 79. 
19 Internet-Drafts are working documents of the Internet Engineering 20 Task Force (IETF), its areas, and its working groups. Note that 21 other groups may also distribute working documents as Internet- 22 Drafts. 24 Internet-Drafts are draft documents valid for a maximum of six months 25 and may be updated, replaced, or obsoleted by other documents at any 26 time. It is inappropriate to use Internet-Drafts as reference 27 material or to cite them other than as "work in progress." 29 The list of current Internet-Drafts can be accessed at 30 http://www.ietf.org/ietf/1id-abstracts.txt. 32 The list of Internet-Draft Shadow Directories can be accessed at 33 http://www.ietf.org/shadow.html. 35 This Internet-Draft will expire on July 2006. 37 Abstract 39 This draft specifies a modification to RFC 3168 to allow TCP SYN/ACK 40 packets to be ECN-Capable. For TCP, RFC 3168 only specified setting 41 an ECN-Capable codepoint on data packets, and not on SYN and SYN/ACK 42 packets. However, because of the high cost to the TCP transfer of 43 having a SYN/ACK packet dropped, with the resulting retransmit 44 timeout, this document is specifying the use of ECN for the SYN/ACK 45 packet itself, when sent in response to a SYN packet with the two ECN 46 flags set in the TCP header, indicating a willingness to use ECN. 47 Setting TCP SYN/ACK packets as ECN-Capable can be of great benefit to 48 the TCP connection, avoiding the severe penalty of a retransmit 49 timeout for a connection that has not yet started placing a load on 50 the network. The sender of the SYN/ACK packet must respond to an ECN 51 mark by reducing its initial congestion window from two, three, or 52 four segments to one segment, reducing the subsequent load from that 53 connection on the network. 55 NOTE TO RFC EDITOR: PLEASE DELETE THIS NOTE UPON PUBLICATION. 57 Changes from draft-ietf-twvsg-ecnsyn: 59 * Changed name of draft to draft-ietf-tcpm-ecnsyn. 
61 * Added a discussion in Section 3 of "Response to 62 ECN-marking of SYN/ACK packets". Based on 63 suggestions from Mark Allman. 65 * Added a discussion to the Conclusions about adding 66 ECN-capability to relevant set-up packets in other 67 protocols. From a suggestion from Wesley Eddy. 69 * Added a description of SYN exchanges with SYN cookies. 70 From a suggestion from Wesley Eddy. 72 * Added a discussion of one-way data transfers, where the 73 host sending the SYN/ACK packet sends no data packets. 75 * Minor editing, from feedback from Mark Allman and Janardhan 76 Iyengar. 78 * Future work: a look at the costs of adding 79 ECN-Capability in a worst-case scenario. 80 From feedback from Mark Allman and Janardhan Iyengar. 82 * Future work: a comparative evaluation of two 83 possible responses to an ECN-marked SYN/ACK packet. 85 Changes from draft-kuzmanovic-ecn-syn-00.txt: 87 * Changed name of draft to draft-ietf-twvsg-ecnsyn. 89 END OF NOTE TO RFC EDITOR. 91 1. Conventions 93 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 94 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 95 document are to be interpreted as described in [RFC 2119]. 97 1. Introduction 99 TCP's congestion control mechanism has primarily used packet loss as 100 the congestion indication, with packets dropped when buffers 101 overflow. With such tail-drop mechanisms, the packet delay can be 102 high, as the queue at bottleneck routers can be fairly large. 103 Dropping packets only when the queue overflows, and having TCP react 104 only to such losses, results in: 105 1) significantly higher packet delay; 106 2) unnecessarily many packet losses; and 107 3) unfairness due to synchronization effects. 109 The adoption of Active Queue Management (AQM) mechanisms allows 110 better control of bottleneck queues [RFC2309]. 
This use of AQM has 111 the following potential benefits: 112 1) better control of the queue, with reduced queueing delay; 113 2) fewer packet drops; and 114 3) better fairness because of fewer synchronization effects. 116 With the adoption of ECN, performance may be further improved. When 117 the router detects congestion before buffer overflow, the router can 118 provide a congestion indication either by dropping a packet, or by 119 setting the Congestion Experienced (CE) codepoint in the Explicit 120 Congestion Notification (ECN) field in the IP header [RFC3168]. The 121 IETF has standardized the use of the Congestion Experienced (CE) 122 codepoint in the IP header for routers to indicate congestion. For 123 incremental deployment and backwards compatibility, the RFC on ECN 124 [RFC 3168] specifies that routers may mark ECN-capable packets that 125 would otherwise have been dropped, using the Congestion Experienced 126 codepoint in the ECN field. The use of ECN allows TCP to react to 127 congestion while avoiding unnecessary retransmissions and, in some 128 cases, unnecessary retransmit timeouts. Thus, using ECN has several 129 benefits: 131 1) For short transfers, a TCP connection's congestion window may be 132 small. For example, if the current window contains only one packet, 133 and that packet is dropped, TCP will have to wait for a retransmit 134 timeout to recover, reducing its overall throughput. Similarly, if 135 the current window contains only a few packets and one of those 136 packets is dropped, there might not be enough duplicate 137 acknowledgements for a fast retransmission, and the sender might have 138 to wait for a delay of several round-trip times using Limited 139 Transmit [RFC3042]. With the use of ECN, short flows are less likely 140 to have packets dropped, sometimes avoiding unnecessary delays or 141 costly retransmit timeouts. 
143 2) While longer flows may not see substantially improved throughput 144 with the use of ECN, they experience lower loss. This may benefit TCP 145 applications that are latency- and loss-sensitive, because of the 146 avoidance of retransmissions. 148 RFC 3168 only specified marking the Congestion Experienced codepoint 149 on TCP's data packets, and not on SYN and SYN/ACK packets. RFC 3168 150 specified the negotiation of the use of ECN between the two TCP end- 151 points in the TCP SYN and SYN-ACK exchange, using flags in the TCP 152 header. Erring on the side of being conservative, RFC 3168 did not 153 specify the use of ECN for the SYN/ACK packet itself. However, 154 because of the high cost to the TCP transfer of having a SYN/ACK 155 packet dropped, with the resulting retransmit timeout, this document 156 is specifying the use of ECN for the SYN/ACK packet itself. This can 157 be of great benefit to the TCP connection, avoiding the severe 158 penalty of a retransmit timeout for a connection that has not yet 159 started placing a load on the network. The sender of the SYN/ACK 160 packet must respond to an ECN mark by reducing its initial congestion 161 window from two, three, or four segments to one segment, reducing the 162 subsequent load from that connection on the network. 164 The use of ECN for SYN/ACK packets has the following potential 165 benefits: 166 1) Avoidance of a retransmit timeout; 167 2) Improvement in the throughput of short connections. 169 This draft specifies a modification to RFC 3168 to allow TCP SYN/ACK 170 packets to be ECN-Capable. Section 2 contains the specification of 171 the change, while Section 3 discusses some of the issues, and Section 172 4 discusses related work. Section 5 contains an evaluation of the 173 proposed change. 175 2. Proposal 177 This section specifies the modification to RFC 3168 to allow TCP 178 SYN/ACK packets to be ECN-Capable. 
We use the following terminology 179 from RFC 3168:

181 The ECN field in the IP header:
182 o CE: the Congestion Experienced codepoint; and
183 o ECT: either one of the two ECN-Capable Transport codepoints.

185 The ECN flags in the TCP header:
186 o CWR: the Congestion Window Reduced flag; and
187 o ECE: the ECN-Echo flag.

189 ECN-setup packets:
190 o ECN-setup SYN packet: a SYN packet with the ECE and CWR flags;
191 o ECN-setup SYN-ACK packet: a SYN-ACK packet with ECE but not CWR.

193 RFC 3168 in Section 6.1.1. states that "A host MUST NOT set ECT on 194 SYN or SYN-ACK packets." In this section, we specify that a TCP node 195 MAY respond to an ECN-setup SYN packet by setting ECT in the 196 responding ECN-setup SYN/ACK packet, indicating to routers that the 197 SYN/ACK packet is ECN-Capable. This allows a congested router along 198 the path to mark the packet instead of dropping the packet as an 199 indication of congestion.

201 Assume that TCP node A transmits to TCP node B an ECN-setup SYN 202 packet, indicating willingness to use ECN for this connection. As 203 specified by RFC 3168, if TCP node B is willing to use ECN, node B 204 responds with an ECN-setup SYN-ACK packet.

206 Table 1 shows an interchange with the SYN/ACK packet dropped by a 207 congested router. Node B waits for a retransmit timeout, and then 208 retransmits the SYN/ACK packet.

210 ---------------------------------------------------------------
211    TCP Node A             Router                  TCP Node B
212    ----------             ------                  ----------

214    ECN-setup SYN packet --->
215                                      ECN-setup SYN packet --->

217                           <--- ECN-setup SYN/ACK, possibly ECT
218                                               3-second timer set
219                          SYN/ACK dropped               .
220                                                        .
221                                                        .
222                                           3-second timer expires
223                           <--- ECN-setup SYN/ACK, not ECT
224    <--- ECN-setup SYN/ACK
225    Data/ACK --->
226                                               Data/ACK --->
227                           <--- Data (one to four segments)
228 ---------------------------------------------------------------

230 Table 1: SYN exchange with the SYN/ACK packet dropped.
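
The terminology above and the exchange in Table 1 can be sketched in a few lines of code. This is an illustration only, not part of the specification. It assumes the standard TCP flag bit values and the two-bit ECN field codepoints of RFC 3168; the function names are hypothetical. The sketch picks ECT(0) for the first SYN/ACK, although the text above allows either ECT codepoint, and it sends a retransmitted SYN/ACK as not ECT, as shown in Table 1.

```python
# TCP header flag bits (standard values).
SYN, ACK, ECE, CWR = 0x02, 0x10, 0x40, 0x80

# Two-bit ECN field codepoints in the IP header (RFC 3168).
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11

_MASK = SYN | ACK | ECE | CWR


def is_ecn_setup_syn(tcp_flags: int) -> bool:
    """ECN-setup SYN: SYN without ACK, with both ECE and CWR set."""
    return (tcp_flags & _MASK) == (SYN | ECE | CWR)


def is_ecn_setup_synack(tcp_flags: int) -> bool:
    """ECN-setup SYN-ACK: SYN and ACK, with ECE set but not CWR."""
    return (tcp_flags & _MASK) == (SYN | ACK | ECE)


def synack_ecn_field(syn_flags: int, use_ecn: bool, retransmission: bool) -> int:
    """Choose the IP ECN field for a SYN/ACK sent by node B (hypothetical helper)."""
    if not (use_ecn and is_ecn_setup_syn(syn_flags)):
        return NOT_ECT      # no ECN negotiation, so never ECN-Capable
    if retransmission:
        return NOT_ECT      # Table 1: the retransmitted SYN/ACK is not ECT
    return ECT_0            # assumed choice; either ECT codepoint is allowed
```

For example, synack_ecn_field(SYN | ECE | CWR, use_ecn=True, retransmission=False) selects an ECN-Capable codepoint, while the same call with retransmission=True falls back to a SYN/ACK that is not ECN-Capable.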
232 If the SYN/ACK packet is dropped in the network, the TCP host (node 233 B) responds by waiting three seconds for the retransmit timer to 234 expire [RFC2988]. If a SYN/ACK packet with the ECT codepoint is 235 dropped, the TCP node SHOULD resend the SYN/ACK packet without the 236 ECN-Capable codepoint. (Although we are not aware of any middleboxes 237 that drop SYN/ACK packets that contain an ECN-Capable codepoint in 238 the IP header, we have learned to design our protocols defensively in 239 this regard [RFC3360].)

241 We note that if SYN cookies were used by Node B in the exchange in 242 Table 1, TCP Node B wouldn't set a timer upon transmission of the 243 SYN/ACK packet [SYN-COOK]. In this case, if the SYN/ACK packet was 244 lost, the initiator (Node A) would have to time out and retransmit the 245 SYN packet in order to trigger another SYN-ACK.

247 Table 2 shows an interchange with the SYN/ACK packet sent as ECN- 248 Capable, and ECN-marked instead of dropped at the congested router.

250 ---------------------------------------------------------------
251    TCP Node A             Router                  TCP Node B
252    ----------             ------                  ----------

254    ECN-setup SYN packet --->
255                                      ECN-setup SYN packet --->

257                           <--- ECN-setup SYN/ACK, ECT
258                           <--- Sets CE on SYN/ACK
259    <--- ECN-setup SYN/ACK, CE

261    Data/ACK, ECN-Echo --->
262                                      Data/ACK, ECN-Echo --->
263                           Window reduced to one segment.
264                           <--- Data, CWR (one segment only)
265 ---------------------------------------------------------------

267 Table 2: SYN exchange with the SYN/ACK packet marked.

269 If the receiving node (node A) receives a SYN/ACK packet that has 270 been marked by the congested router, with the CE codepoint set, the 271 receiving node MUST respond by setting the ECN-Echo flag in the TCP 272 header of the responding ACK packet. As specified in RFC 3168, the 273 receiving node continues to set the ECN-Echo flag in packets until it 274 receives a packet with the CWR flag set.
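
The exchange in Table 2 can be sketched as two small state machines. This is a minimal illustration under stated assumptions, not an implementation of any TCP stack: the class and method names are hypothetical, and the default initial window of three segments is an assumed value within the two-to-four range shown in Table 1.

```python
from dataclasses import dataclass


@dataclass
class NodeA:
    """Receiver of the SYN/ACK (the connection initiator)."""
    echo_ece: bool = False

    def on_synack(self, ce_marked: bool) -> None:
        # A CE-marked SYN/ACK starts ECN-Echo on outgoing packets.
        if ce_marked:
            self.echo_ece = True

    def on_packet(self, cwr: bool) -> None:
        # ECN-Echo continues until a packet with CWR arrives.
        if cwr:
            self.echo_ece = False


@dataclass
class NodeB:
    """Sender of the SYN/ACK."""
    default_iw_segments: int = 3   # assumed default (two to four per Table 1)
    iw_segments: int = 0           # unset until the handshake ACK arrives
    send_cwr_next: bool = False

    def on_handshake_ack(self, ece: bool) -> None:
        if ece:
            # Table 2: window reduced to one segment, CWR on next data packet.
            self.iw_segments = 1
            self.send_cwr_next = True
        else:
            self.iw_segments = self.default_iw_segments

    def next_data_has_cwr(self) -> bool:
        # CWR is set once, on the next data packet sent.
        cwr, self.send_cwr_next = self.send_cwr_next, False
        return cwr
```

A marked handshake then runs as follows: node A calls on_synack(ce_marked=True), node B calls on_handshake_ack(ece=a.echo_ece) and ends up with an initial window of one segment, and node A stops echoing once it processes the data packet for which next_data_has_cwr() returned true.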
276 When the sending node (node B) receives the ECN-Echo packet reporting 277 the Congestion Experienced indication in the SYN/ACK packet, the node 278 MUST set the initial congestion window to one segment, instead of two 279 segments as allowed by [RFC2581], or three or four segments allowed 280 by [RFC3390]. If the sending node (node B) was going to use an 281 initial window of one segment, and receives an ECN-Echo packet 282 informing it of a Congestion Experienced indication on its SYN/ACK 283 packet, the sending node MAY continue to send with an initial window 284 of one segment, without waiting for a retransmit timeout. We note 285 that this updates RFC 3168, which specifies that "the sending TCP 286 MUST reset the retransmit timer on receiving the ECN-Echo packet when 287 the congestion window is one." As specified by RFC 3168, the sending 288 node (node B) also sets the CWR flag in the TCP header of the next 289 data packet sent, to acknowledge its receipt of and reaction to the 290 ECN-Echo flag. 292 If the data transfer in Table 2 is entirely from Node A to Node B, 293 then data packets from Node A continue to set the ECN-Echo flag in 294 data packets, waiting for the CWR flag from Node B acknowledging a 295 response to the ECN-Echo flag. 297 3. Discussion 299 Motivation: 300 The rationale for the proposed change is the following. When node B 301 receives a TCP SYN packet with ECN-Echo bit set in the TCP header, 302 this indicates that node A is ECN-capable. If node B is also ECN- 303 capable, there are no obstacles to immediately setting one of the 304 ECN-Capable codepoints in the IP header in the responding TCP SYN/ACK 305 packet. 307 There can be a great benefit in setting an ECN-capable codepoint in 308 SYN/ACK packets, as is discussed further in Section 4. Congestion is 309 most likely to occur in the server-to-client direction. 
As a result, 310 setting an ECN-capable codepoint in SYN/ACK packets can reduce the 311 occurrence of three-second retransmit timeouts resulting from the drop 312 of SYN/ACK packets. 314 Flooding attacks: 315 Setting an ECN-Capable codepoint in the responding TCP SYN/ACK 316 packets does not raise any novel security vulnerabilities. For 317 example, provoking servers or hosts to send SYN/ACK packets to a 318 third party in order to perform a "SYN/ACK flood" attack would be 319 highly inefficient. Third parties would immediately drop such 320 packets, since they would know that they didn't generate the TCP SYN 321 packets in the first place. Moreover, such SYN/ACK attacks would 322 have the same signatures as the existing TCP SYN attacks. Provoking 323 servers or hosts to reply with SYN/ACK packets in order to congest a 324 certain link would also be highly inefficient because SYN/ACK packets 325 are small in size. 327 However, the addition of ECN-Capability to SYN/ACK packets could 328 allow SYN/ACK packets to persist for more hops along a network path 329 before being dropped, thus adding somewhat to the ability of a 330 SYN/ACK attack to flood a network link. 332 The TCP SYN packet: 333 There are several reasons why an ECN-Capable codepoint MUST NOT be 334 set in the IP header of the initiating TCP SYN packet. First, when 335 the TCP SYN packet is sent, there are no guarantees that the other 336 TCP endpoint (node B in Table 2) is ECN-capable, or that it would be 337 able to understand and react if the ECN CE codepoint was set by a 338 congested router. 340 Second, the ECN-Capable codepoint in TCP SYN packets could be misused 341 by malicious clients to `improve' the well-known TCP SYN attack. By 342 setting an ECN-Capable codepoint in TCP SYN packets, a malicious host 343 might be able to inject a large number of TCP SYN packets through a 344 potentially congested ECN-enabled router, congesting it even further.
346 For both these reasons, we continue the restriction that the TCP SYN 347 packet MUST NOT have the ECN-Capable codepoint in the IP header set. 349 Backwards compatibility: 350 If there are some older TCP implementations that don't respond to the 351 Congestion Experienced codepoint in a SYN/ACK packet, that would not 352 be an insurmountable problem. It would mean that the sender of the 353 SYN/ACK packet would not reduce the initial congestion window from 354 two, three, or four segments down to one segment, as it should. 355 However, the TCP sender would still respond correctly to any 356 subsequent CE indications on data packets later on in the connection. 358 SYN/ACK packets and packet size: 359 There are a number of router buffer architectures that have smaller 360 dropping rates for small (SYN) packets than for large (data) packets. 361 For example, for a Drop Tail queue in units of packets, where each 362 packet takes a single slot in the buffer regardless of packet size, 363 small and large packets are equally likely to be dropped. However, 364 for a Drop Tail queue in units of bytes, small packets are less 365 likely to be dropped than are large ones. Similarly, for RED in 366 packet mode, small and large packets are equally likely to be dropped 367 or marked, while for RED in byte mode, a packet's chance of being 368 dropped or marked is proportional to the packet size in bytes. 370 For a congested router with an AQM mechanism in byte mode, where a 371 packet's chance of being dropped or marked is proportional to the 372 packet size in bytes, the drop or marking rate for TCP SYN/ACK 373 packets should generally be low. In this case, the benefit of making 374 SYN/ACK packets ECN-Capable should be similarly moderate. 
However, 375 for a congested router with a Drop Tail queue in units of packets or 376 with an AQM mechanism in packet mode, and with no priority queueing 377 for smaller packets, small and large packets should have the same 378 probability of being dropped or marked. In such a case, making 379 SYN/ACK packets ECN-Capable should be of significant benefit. 381 We believe that there are a wide range of behaviors in the real world 382 in terms of the drop or mark behavior at routers as a function of 383 packet size [Tools, Section 10]. We note that all of these 384 alternatives listed above are available in the NS simulator (Drop 385 Tail queues are by default in units of packets, while the default for 386 RED queue management has been changed from packet mode to byte mode). 388 Response to ECN-marking of SYN/ACK packets: 389 One question is why TCP SYN/ACK packets should be treated differently 390 from other packets in terms of the packet sender's response to an 391 ECN-marked packet. Section 5 of RFC 3168 specifies the following: 393 "Upon the receipt by an ECN-Capable transport of a single CE packet, 394 the congestion control algorithms followed at the end-systems MUST be 395 essentially the same as the congestion control response to a *single* 396 dropped packet. For example, for ECN-Capable TCP the source TCP is 397 required to halve its congestion window for any window of data 398 containing either a packet drop or an ECN indication." 400 In particular, Section 6.1.2 of RFC 3168 specifies that when the TCP 401 congestion window consists of a single packet and that packet is ECN- 402 marked in the network, then the sender must reduce the sending rate 403 below one packet per round-trip time, by waiting for one RTO before 404 sending another packet. 
If the RTO was set to the average round-trip 405 time, this would result in halving the sending rate; because the RTO 406 is in fact larger than the average round-trip time, the sending rate 407 is reduced to less than half of its previous value. 409 TCP's congestion control response to the *dropping* of a SYN/ACK 410 packet is to wait a default time before sending another packet. This 411 document argues that ECN gives end-systems a wider range of possible 412 responses to the *marking* of a SYN/ACK packet, and that waiting a 413 default time before sending a data packet is not the desired 414 response. 416 On the conservative end, one could assume an effective congestion 417 window of one packet for the SYN/ACK packet, and respond to an ECN- 418 marked SYN/ACK packet by reducing the sending rate to one packet 419 every two round-trip times. As an approximation, the TCP end-node 420 could measure the round-trip time T between the sending of the 421 SYN/ACK packet and the receipt of the acknowledgement, and reply to 422 the acknowledgement of the ECN-marked SYN/ACK packet by waiting T 423 seconds before sending a data packet. However, we note that for an 424 ECN-marked SYN/ACK packet, halving the *congestion window* is not the 425 same as halving the *sending rate*; there is no `sending rate' 426 associated with an ECN-Capable SYN/ACK packet, as such packets are 427 only sent as the first packet in a connection from that host. 428 Further, a router's marking of a SYN/ACK packet is not affected by 429 any past history of that connection. 431 Adding ECN-Capability to SYN/ACK packets allows the simple response 432 of setting the initial congestion window to one packet, instead of 433 its allowed default value of two, three, or four packets, with the 434 host proceeding with a cautious sending rate of one packet per round- 435 trip time. If that packet is ECN-marked or dropped, then the sender 436 will wait an RTO before sending another packet. 
This document argues 437 that such an approach is useful to users, with no dangers of 438 congestion collapse or of starvation of competing traffic. 440 We note that if the data transfer is entirely from Node A to Node B, 441 then there is no effective difference between the two possible 442 responses to an ECN-marked SYN/ACK packet outlined above. In either 443 case, Node B sends no data packets, only sending acknowledgement 444 packets in response to received data packets. 446 4. Related Work 448 The addition of ECN-capability to TCP's SYN/ACK packets was proposed 449 in [ECN+]. The paper includes an extensive set of simulation and 450 testbed experiments to evaluate the effects of the proposal, using 451 several Active Queue Management (AQM) mechanisms, including Random 452 Early Detection (RED) [RED], Random Exponential Marking (REM) [REM], 453 and Proportional Integrator (PI) [PI]. The performance measures were 454 the end-to-end response times for each request/response pair, and the 455 aggregate throughput on the bottleneck link. The end-to-end response 456 time was computed as the time from the moment when the request for 457 the file is sent to the server, until that file is successfully 458 downloaded by the client. 460 The measurements from [ECN+] showed that setting an ECN-Capable 461 codepoint in the IP packet header in TCP SYN/ACK packets 462 systematically improves performance with all evaluated AQM schemes. 463 When SYN/ACK packets at a congested router are ECN-marked instead of 464 dropped, this can avoid a long initial retransmit timeout, improving 465 the response time for the affected flow dramatically. 467 [ECN+] showed that the impact on aggregate throughput can also be 468 quite significant, because marking SYN/ACK packets can prevent larger 469 flows from suffering long timeouts before being "admitted" into the 470 network.
In addition, the testbed measurements from [ECN+] showed 471 that Web servers setting the ECN-Capable codepoint in TCP SYN/ACK 472 packets could serve more requests. 474 As a final step, [ECN+] explored the coexistence of flows that do 475 and don't set the ECN-capable codepoint in TCP SYN/ACK packets. The 476 results in [ECN+] show that both types of flows can coexist, with 477 some performance degradation for flows that don't apply the change. 478 Flows that apply the change improve their end-to-end performance. At 479 the same time, the performance degradation for flows that don't apply 480 the change, as a result of the flows that do apply the change, 481 increases as a greater fraction of flows apply the change. 483 5. Performance Evaluation 485 5.1. The Costs and Benefits of Adding ECN-Capability 487 [ECN+] explored the costs and benefits of adding ECN-Capability to 488 SYN/ACK packets with both simulations and experiments. The addition 489 of ECN-capability to SYN/ACK packets could be of significant benefit 490 for those ECN connections that would have had the SYN/ACK packet 491 dropped in the network, and for which the ECN-Capability would allow 492 the SYN/ACK to be marked rather than dropped. 494 The percentage of SYN/ACK packets on a link can be quite high. In 495 particular, measurements on links dominated by Web traffic indicate 496 that 15-20% of the packets can be SYN/ACK packets [SCJO01]. 498 The benefit of adding ECN-capability to SYN/ACK packets depends in 499 part on the size of the data transfer. The drop of a SYN/ACK packet 500 can increase the download time of a short file by an order of 501 magnitude, by requiring a three-second retransmit timeout. For 502 longer-lived flows, the effect of a dropped SYN/ACK packet on file 503 download time is less dramatic.
However, even for longer-lived 504 flows, the addition of ECN-capability to SYN/ACK packets can improve 505 the fairness among long-lived flows, as newly-arriving flows would be 506 less likely to have to wait for retransmit timeouts. 508 The question that arises of course is what fraction of connections 509 would see the benefit from making SYN/ACK packets ECN-capable, in a 510 particular scenario? Specifically: 512 (1) What fraction of arriving SYN/ACK packets are dropped at the 513 congested router when the SYN/ACK packets are not ECN-capable? 515 (2) Of those SYN/ACK packets that are dropped, what fraction of those 516 drops would have been ECN-marks instead of drops if the SYN/ACK 517 packets had been ECN-capable? 518 To answer (1), it is necessary to consider not only the level of 519 congestion but also the queue architecture at the congested link. As 520 described in Section 3 above, for some queue architectures small 521 packets are less likely to be dropped than large ones. In such an 522 environment, SYN/ACK packets would have lower packet drop rates; 523 question (1) could not necessarily be inferred from the overall 524 packet drop rate, but could be answered by measuring the drop rate 525 for SYN/ACK packets directly. In such an environment, adding ECN- 526 capability to SYN/ACK packets would be of less dramatic benefit than 527 in environments where all packets are equally likely to be dropped 528 regardless of packet size. 530 As question (2) implies, even if all of the SYN/ACK packets were ECN- 531 capable, there could still be some SYN/ACK packets dropped instead of 532 marked at the congested link; the full answer to question (2) depends 533 on the details of the queue management mechanism at the router. If 534 congestion is sufficiently bad, and the queue management mechanism 535 cannot prevent the buffer from overflowing, then SYN/ACK packets will 536 be dropped rather than marked upon buffer overflow whether or not 537 they are ECN-capable. 
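
Questions (1) and (2) can be made concrete with a toy model of the congested queue. The sketch below is illustrative only: the function names, the byte-mode normalization by a reference packet size, and any numbers passed to it are assumptions, not measurements and not a description of a particular router. It models only the two cases discussed above, a full buffer and an AQM congestion indication, and ignores any further rule a specific AQM mechanism may apply.

```python
def indication_probability(base_p: float, pkt_bytes: int,
                           ref_bytes: int, byte_mode: bool) -> float:
    """Probability that the queue selects this packet for a drop or mark.

    Packet mode: independent of packet size.
    Byte mode: proportional to packet size, normalized here by a
    hypothetical reference size ref_bytes.
    """
    p = base_p * pkt_bytes / ref_bytes if byte_mode else base_p
    return min(1.0, p)


def queue_action(buffer_full: bool, selected: bool, ect: bool) -> str:
    """Return 'drop', 'mark', or 'forward' for an arriving packet."""
    if buffer_full:
        return "drop"                       # overflow drops even ECN-capable packets
    if selected:
        return "mark" if ect else "drop"    # ECT turns the indication into a mark
    return "forward"


def fraction_helped(drop_fraction: float, markable_fraction: float) -> float:
    """Answer to (1) times answer to (2): share of SYN/ACK packets that
    would be marked instead of dropped if they were ECN-capable."""
    return drop_fraction * markable_fraction
```

With a byte-mode queue, a small SYN/ACK packet gets a much lower indication_probability than a full-sized data packet, which is why the answer to question (1) cannot simply be read off the overall drop rate.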
539 For some AQM mechanisms, ECN-capable packets are marked instead of 540 dropped any time this is possible, that is, any time the buffer is 541 not yet full. For other AQM mechanisms however, such as the RED 542 mechanism as recommended in [RED], packets are dropped rather than 543 marked when the packet drop/mark rate exceeds a certain threshold, 544 e.g., 10%, even if the packets are ECN-capable. For a router with 545 such an AQM mechanism, when congestion is sufficiently severe to 546 cause a high drop/mark rate, some SYN/ACK packets would be dropped 547 instead of marked whether or not they were ECN-capable. 549 Thus, the degree of benefit of adding ECN-Capability to SYN/ACK 550 packets depends not only on the overall packet drop rate in the 551 network, but also on the queue management architecture at the 552 congested link. 554 5.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK 555 Packets. 557 This document specifies that the end-node responds to the report of 558 an ECN-marked SYN/ACK packet by setting the initial congestion window 559 to one packet, instead of its possible default value of two to four 560 packets. However, in Section 3 we discussed another possible 561 response to an ECN-marked SYN/ACK packet, of the end-node waiting an 562 RTT before sending a data packet. Future work will include a 563 comparative evaluation of these two methods. 565 6. Security Considerations 567 TCP packets carrying the ECT codepoint in IP headers can be marked 568 rather than dropped by ECN-capable routers. This raises several 569 security concerns that we discuss below. 571 "Bad" middleboxes: 572 There is a small but decreasing number of middleboxes that drop or 573 reset SYN and SYN/ACK packets based on the ECN-related flags in the 574 TCP header [MAF05,RFC3360]. While there is no evidence that any 575 middleboxes drop SYN/ACK packets that contain an ECN-Capable 576 codepoint in the *IP header*, such behavior cannot be excluded. 
Thus, as specified in Section 2, if a SYN/ACK packet with the ECT
codepoint is dropped, the TCP node SHOULD resend the SYN/ACK packet
without the ECN-Capable codepoint.

Congestion collapse:
Because TCP SYN/ACK packets carrying an ECT codepoint could be
ECN-marked instead of dropped at an ECN-capable router, the concern
is whether this can either induce congestion or worsen performance
in highly congested scenarios. This is not a problem because after
learning that the SYN/ACK packet was ECN-marked, the sender of that
packet will only send one data packet; in the case that this data
packet is ECN-marked, the sender will wait for a retransmission
timeout. In addition, routers are free to drop rather than mark
arriving packets in times of high congestion, regardless of whether
the packets are ECN-capable.

7. Conclusions

This draft specifies a modification to RFC 3168 to allow TCP nodes to
send SYN/ACK packets as being ECN-Capable. Making the SYN/ACK packet
ECN-Capable avoids the high cost to a TCP transfer when a SYN/ACK
packet is dropped by a congested router, by avoiding the resulting
retransmit timeout. This improves the throughput of short
connections. The sender of the SYN/ACK packet responds to an ECN
mark by reducing its initial congestion window from two, three, or
four segments to one segment, reducing the subsequent load from that
connection on the network. The addition of ECN-capability to SYN/ACK
packets is particularly beneficial in the server-to-client direction,
where congestion is more likely to occur. In this case, the initial
information provided by the ECN marking in the SYN/ACK packet enables
the server to more appropriately adjust the initial load it places on
the network.
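The sender-side behavior summarized above can be sketched in a few
lines. This is an illustrative sketch only, not a normative
implementation: the function names and the default_iw parameter are
assumptions made for this example, and a real TCP stack would carry
this state in its connection block.

```python
def initial_cwnd_segments(synack_was_marked: bool, default_iw: int = 3) -> int:
    """Initial congestion window, in segments, for the sender of the SYN/ACK.

    default_iw is the window the sender would otherwise use
    (two, three, or four segments).
    """
    if not 2 <= default_iw <= 4:
        raise ValueError("default initial window must be 2, 3, or 4 segments")
    # An ECN-marked SYN/ACK reduces the initial window to one segment.
    return 1 if synack_was_marked else default_iw


def synack_is_ect(transmission_attempt: int) -> bool:
    """Whether to set the ECT codepoint on a given SYN/ACK transmission.

    Attempt 0 is the first SYN/ACK. Later attempts are retransmissions
    after a presumed drop and are sent without the ECN-Capable codepoint.
    """
    if transmission_attempt < 0:
        raise ValueError("transmission_attempt must be non-negative")
    return transmission_attempt == 0
```

For example, initial_cwnd_segments(True) yields one segment, while
synack_is_ect(1) is False, so a retransmitted SYN/ACK is not
ECN-Capable.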
Future work will address the more general question of adding
ECN-Capability to relevant handshake packets in other protocols that
use retransmission-based reliability in their setup phase (e.g.,
SCTP, DCCP, HIP, and the like).

8. Acknowledgements

We thank Mark Allman, Wesley Eddy, Janardhan Iyengar, and Pasi
Sarolahti for feedback on earlier versions of this draft.

9. Normative References

[RFC3168] K.K. Ramakrishnan, S. Floyd, and D. Black, The Addition of
Explicit Congestion Notification (ECN) to IP, RFC 3168, Proposed
Standard, September 2001.

[RFC3390] M. Allman, S. Floyd, and C. Partridge, Increasing TCP's
Initial Window, RFC 3390, October 2002.

10. Informative References

[ECN+] A. Kuzmanovic, The Power of Explicit Congestion Notification,
SIGCOMM 2005.

[MAF05] A. Medina, M. Allman, and S. Floyd, Measuring the Evolution
of Transport Protocols in the Internet, ACM CCR, April 2005.

[PI] C. Hollot, V. Misra, W. Gong, and D. Towsley, On Designing
Improved Controllers for AQM Routers Supporting TCP Flows, INFOCOM,
June 2001.

[RED] S. Floyd and V. Jacobson, Random Early Detection Gateways for
Congestion Avoidance, IEEE/ACM Transactions on Networking, V.1, N.4,
1993.

[REM] S. Athuraliya, V. Li, S. Low, and Q. Yin, REM: Active Queue
Management, IEEE Network, V.15, N.3, May 2001.

[RFC2309] B. Braden et al., Recommendations on Queue Management and
Congestion Avoidance in the Internet, RFC 2309, April 1998.

[RFC2581] M. Allman, V. Paxson, and W. Stevens, TCP Congestion
Control, RFC 2581, April 1999.

[RFC2988] V. Paxson and M. Allman, Computing TCP's Retransmission
Timer, RFC 2988, November 2000.

[RFC3042] M. Allman, H. Balakrishnan, and S. Floyd, Enhancing TCP's
Loss Recovery Using Limited Transmit, RFC 3042, Proposed Standard,
January 2001.

[RFC3360] S.
Floyd, Inappropriate TCP Resets Considered Harmful, RFC
3360, August 2002.

[SCJO01] F. Smith, F. Campos, K. Jeffay, and D. Ott, What TCP/IP
Protocol Headers Can Tell Us About the Web, SIGMETRICS, June 2001.

[SYN-COOK] Dan J. Bernstein, SYN cookies, 1997, see also

[Tools] S. Floyd and E. Kohler, Tools for the Evaluation of
Simulation and Testbed Scenarios, Internet-draft
draft-irtf-tmrg-tools-00, work in progress, September 2005.

11. IANA Considerations

There are no IANA considerations regarding this document.

AUTHORS' ADDRESSES

Aleksandar Kuzmanovic
Phone: +1 (847) 467-5519
Northwestern University
Email: akuzma@northwestern.edu
URL: http://cs.northwestern.edu/~a

Sally Floyd
Phone: +1 (510) 666-2989
ICIR (ICSI Center for Internet Research)
Email: floyd@icir.org
URL: http://www.icir.org/floyd/

K. K. Ramakrishnan
Phone: +1 (973) 360-8764
AT&T Labs Research
Email: kkrama@research.att.com
URL: http://www.research.att.com/info/kkrama

Full Copyright Statement

Copyright (C) The Internet Society (2006). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on
an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed
to pertain to the implementation or use of the technology described
in this document or the extent to which any license under such
rights might or might not be available; nor does it represent that
it has made any independent effort to identify any such rights.
Information on the procedures with respect to rights in RFC
documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use
of such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository
at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.