idnits 2.17.1 draft-kuehlewind-tcpm-accurate-ecn-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'SHOULD not' in this paragraph: In this document the Urgent Pointer field is defined to be (re)usable for auxiliary data if the URG flag is not set. Note that as the contents of this field were previously undefined when the URG bit is not set, a new mechanism using these bits SHOULD not rely on the correct delivery. Further below in this document a new usage for four bits of the Urgent Pointer counter is defined. -- The document date (June 20, 2013) is 3963 days in the past. Is this intentional? 
Checking references for intended status: Experimental ---------------------------------------------------------------------------- == Unused Reference: 'I-D.briscoe-tsvwg-re-ecn-tcp' is defined on line 572, but no explicit reference was found in the text == Unused Reference: 'RFC5681' is defined on line 583, but no explicit reference was found in the text == Unused Reference: 'RFC5690' is defined on line 586, but no explicit reference was found in the text Summary: 0 errors (**), 0 flaws (~~), 5 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 TCP Maintenance and Minor Extensions (tcpm) M. Kuehlewind, Ed. 3 Internet-Draft University of Stuttgart 4 Intended status: Experimental R. Scheffenegger 5 Expires: December 22, 2013 NetApp, Inc. 6 June 20, 2013 8 More Accurate ECN Feedback in TCP 9 draft-kuehlewind-tcpm-accurate-ecn-02 11 Abstract 13 Explicit Congestion Notification (ECN) is an IP/TCP mechanism where 14 network nodes can mark IP packets instead of dropping them to 15 indicate congestion to the end-points. ECN-capable receivers will 16 feedback this information to the sender. ECN is specified for TCP in 17 such a way that only one feedback signal can be transmitted per 18 Round-Trip Time (RTT). Recently, new TCP mechanisms like ConEx or 19 DCTCP need more accurate ECN feedback information in the case where 20 more than one marking is received in one RTT. This document 21 specifies a different scheme for the ECN feedback in the TCP header 22 to provide more than one feedback signal per RTT. Furthermore this 23 document specifies a re-use of the Urgent Pointer in the TCP header 24 if the URG flag is not set to increase the robustness of the proposed 25 ECN feedback scheme. 27 Status of This Memo 29 This Internet-Draft is submitted in full conformance with the 30 provisions of BCP 78 and BCP 79. 
32     Internet-Drafts are working documents of the Internet Engineering
33     Task Force (IETF).  Note that other groups may also distribute
34     working documents as Internet-Drafts.  The list of current Internet-
35     Drafts is at http://datatracker.ietf.org/drafts/current/.

37     Internet-Drafts are draft documents valid for a maximum of six months
38     and may be updated, replaced, or obsoleted by other documents at any
39     time.  It is inappropriate to use Internet-Drafts as reference
40     material or to cite them other than as "work in progress."

42     This Internet-Draft will expire on December 22, 2013.

44  Copyright Notice

46     Copyright (c) 2013 IETF Trust and the persons identified as the
47     document authors.  All rights reserved.

49     This document is subject to BCP 78 and the IETF Trust's Legal
50     Provisions Relating to IETF Documents
51     (http://trustee.ietf.org/license-info) in effect on the date of
52     publication of this document.  Please review these documents
53     carefully, as they describe your rights and restrictions with respect
54     to this document.  Code Components extracted from this document must
55     include Simplified BSD License text as described in Section 4.e of
56     the Trust Legal Provisions and are provided without warranty as
57     described in the Simplified BSD License.

59  Table of Contents

61     1.  Introduction
62       1.1.  Overview ECN and ECN Nonce in IP/TCP
63       1.2.  Re-Use of the Urgent field in TCP
64       1.3.  Requirements Language
65     2.  More Accurate ECN Feedback
66       2.1.  Negotiation during the TCP handshake
67       2.2.  Feedback Coding
68         2.2.1.  Codepoint Coding of the more Accurate ECN (ACE) field
69         2.2.2.  Use with ECN Nonce
70         2.2.3.  Auxiliary data in the Urgent Pointer field
71       2.3.
More Accurate ECN TCP Receiver
72       2.4.  More Accurate ECN TCP Sender
73     3.  Acknowledgements
74     4.  IANA Considerations
75     5.  Security Considerations
76     6.  References
77       6.1.  Normative References
78       6.2.  Informative References
79     Authors' Addresses

81  1.  Introduction

83     Explicit Congestion Notification (ECN) [RFC3168] is an IP/TCP
84     mechanism where network nodes can mark IP packets instead of dropping
85     them to indicate congestion to the end-points.  ECN-capable receivers
86     feed this information back to the sender.  ECN is specified for
87     TCP in such a way that only one feedback signal can be transmitted
88     per Round-Trip Time (RTT).  Recently proposed mechanisms like
89     Congestion Exposure (ConEx) or DCTCP [Ali10] need more accurate ECN
90     feedback information when more than one marking is received
91     in one RTT.

93     This document specifies a different scheme for the ECN feedback in
94     the TCP header to provide more than one feedback signal per RTT.
95     This modification does not obsolete [RFC3168].  To avoid confusion we
96     call the ECN specification of [RFC3168] 'classic ECN' in this
97     document.  This document provides an extension that requires
98     additional negotiation in the TCP handshake by using the TCP nonce
99     sum (NS) bit, as specified in [RFC3540], which is currently not used
100     when SYN is set.  If the more accurate ECN extension has been
101     negotiated successfully, the meaning of the ECN TCP bits and the NS
102     bit differs from the specification in [RFC3168], as does the meaning
103     of some bits of the largely unused TCP Urgent Pointer field as long
104     as the URG flag is not set.
This document specifies the additional negotiation as
105     well as the new coding of the TCP ECN/NS bits.

107     The proposed coding scheme maintains the given bit space in the TCP
108     header as the ECN feedback information is needed in a timely manner
109     and as such should be reported in every ACK.  The reuse avoids
110     additional network load as neither the ACK size nor the number of
111     ACKs increases.  Moreover, the more accurate ECN information
112     replaces the classic ECN feedback if negotiated, so those bits are
113     not needed otherwise.  However, the proposed scheme also requires
114     the use of the NS bit in the TCP handshake as well as for the more
115     accurate ECN feedback.  The proposed more accurate ECN feedback
116     extension includes the ECN-Nonce integrity mechanism as some coding
117     space is left open.

119  1.1.  Overview ECN and ECN Nonce in IP/TCP

121     ECN requires two bits in the IP header.  The ECN capability of a
122     packet is indicated when either one of the two bits is set.  An ECN
123     sender can set one or the other bit to indicate an ECN-capable
124     transport (ECT), which results in two signals, ECT(0) and ECT(1).  A
125     network node can set both bits simultaneously when it experiences
126     congestion.  When both bits are set the packet is regarded as
127     "Congestion Experienced" (CE).

129     In the TCP header the first two bits in byte 14 are defined for the
130     use of ECN.  The TCP mechanism for signaling the reception of a
131     congestion mark uses the ECN-Echo (ECE) flag in the TCP header.  To
132     enable the TCP receiver to determine when to stop setting the ECN-
133     Echo flag, the CWR flag is set by the sender upon reception of the
134     feedback signal.  This always leads to a full RTT of ACKs with ECE
135     set.  Thus any additional CE markings arriving within this RTT
136     cannot be signaled back.
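
   The one-signal-per-RTT limitation described above can be illustrated
   with a minimal C sketch of a classic ECN receiver.  The IP ECN
   codepoint values are those of [RFC3168]; the type and function names
   (classic_rx, classic_rx_on_data) are hypothetical and chosen only for
   this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* IP ECN field codepoints as defined in RFC 3168. */
enum ecn_codepoint {
    ECN_NOT_ECT = 0x0, /* 00: not ECN-capable transport */
    ECN_ECT1    = 0x1, /* 01: ECT(1) */
    ECN_ECT0    = 0x2, /* 10: ECT(0) */
    ECN_CE      = 0x3  /* 11: Congestion Experienced */
};

/* Hypothetical receiver state for classic ECN feedback. */
struct classic_rx {
    bool ece_latched; /* true while ECE is echoed on every ACK */
};

/*
 * Process one incoming data segment and return the value of the ECE
 * flag that the next ACK will carry.
 */
bool classic_rx_on_data(struct classic_rx *rx,
                        enum ecn_codepoint ip_ecn, bool cwr_flag)
{
    assert(rx != NULL);
    if (cwr_flag)
        rx->ece_latched = false; /* sender has reacted to the echo */
    if (ip_ecn == ECN_CE)
        rx->ece_latched = true;  /* start (or keep) echoing ECE */
    return rx->ece_latched;
}
```

   Because the latch is a single bit, a second or third CE mark that
   arrives before the CWR flag is seen leaves the feedback unchanged;
   the sender cannot tell one mark from many.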
138     ECN-Nonce [RFC3540] is an optional addition to ECN that is used to
139     protect the TCP sender against accidental or malicious concealment of
140     marked or dropped packets.  This addition defines the last bit of
141     byte 13 in the TCP header as the Nonce Sum (NS) bit.  With ECN-Nonce
142     a nonce sum is maintained that counts the occurrence of ECT(1) packets.

144        0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
145      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
146      |               |           | N | C | E | U | A | P | R | S | F |
147      | Header Length | Reserved  | S | W | C | R | C | S | S | Y | I |
148      |               |           |   | R | E | G | K | H | T | N | N |
149      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

151     Figure 1: The (post-ECN Nonce) definition of the TCP header flags

153  1.2.  Re-Use of the Urgent field in TCP

155     RFC0793 specified a mechanism to indicate "urgent data" to a
156     receiver.  However, this mechanism is rarely used, and RFC6093 argues
157     to deprecate the use of the mechanism.  Furthermore, the content of
158     the Urgent Pointer was always defined to be valid only when the URG
159     TCP header flag is set.  The position of the Urgent Pointer field as
160     well as the URG flag are displayed in Figure 2.

162     In this document the Urgent Pointer field is defined to be (re)usable
163     for auxiliary data if the URG flag is not set.  Note that as the
164     contents of this field were previously undefined when the URG bit is
165     not set, a new mechanism using these bits SHOULD NOT rely on their
166     correct delivery.  Further below in this document a new usage for
167     four bits of the Urgent Pointer field is defined.
169       0                   1                   2                   3
170       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
171      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
172      |          Source Port          |       Destination Port        |
173      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
174      |                        Sequence Number                        |
175      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
176      |                    Acknowledgment Number                      |
177      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
178      | Data  | Res |N|C|E|U|A|P|R|S|F|                               |
179      | Offset| erv |S|W|C|R|C|S|S|Y|I|            Window             |
180      |       | ed  | |R|E|G|K|H|T|N|N|                               |
181      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
182      |           Checksum            |         Urgent Pointer        |
183      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

185     Figure 2: TCP Header Format showing the 16 bit Urgent pointer

187  1.3.  Requirements Language
188     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
189     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
190     document are to be interpreted as described in RFC 2119 [RFC2119].

192     We use the following terminology from [RFC3168] and [RFC3540]:

194     The ECN field in the IP header:

196        CE: the Congestion Experienced codepoint, and

198        ECT(0): the first ECN-Capable Transport codepoint, and

200        ECT(1): the second ECN-Capable Transport codepoint.

202     The ECN flags in the TCP header:

204        CWR: the Congestion Window Reduced flag,

206        ECE: the ECN-Echo flag, and

208        NS: ECN Nonce Sum.

210     In this document, we will call the ECN feedback scheme as specified
211     in [RFC3168] the 'classic ECN' and our new proposal the 'accurate ECN
212     feedback' scheme.  A 'congestion mark' is defined as an IP packet
213     where the CE codepoint is set.  A 'congestion event' refers to one or
214     more congestion marks belonging to the same overload situation in the
215     network (usually during one RTT).

217  2.
More Accurate ECN Feedback

219     In this section we designate the sender to be the one sending data
220     and the receiver as the one that will acknowledge this data.  Of
221     course such a scenario describes only one half-connection of a
222     TCP connection.  The proposed scheme, if negotiated, will be used for
223     both half-connections, as both sender and receiver need to be capable
224     of echoing and understanding the accurate ECN feedback scheme.

226  2.1.  Negotiation during the TCP handshake

228     During the TCP handshake at the start of a connection, an originator
229     of the connection (host A) MUST indicate a request to get more
230     accurate ECN feedback by setting the TCP flags NS=1, CWR=1 and ECE=1
231     in the initial SYN.  This coding implicitly negotiates
232     classic ECN if the receiver does not support the more
233     accurate ECN feedback scheme.

235     A responding host (host B) MUST return a SYN/ACK with flags CWR=1
236     and ECE=0.  The NS flag may be either 0 or 1, as described below.
237     The responding host MUST NOT set this combination of flags unless the
238     preceding SYN has already requested support for accurate ECN
239     feedback as above.

241     These handshakes, including the fallback when the receiver only
242     supports classic ECN or ECN-Nonce, are summarized in Table 1 below.
243     X indicates that NS can be either 0 or 1 depending on whether
244     congestion had been experienced (see below).  The handshakes
245     indicating the other flavors of ECN are also shown for
246     comparison.  To compress the width of the table, the headings of the
247     first four columns have been severely abbreviated, as follows:

249     Ac:  *Ac*curate ECN Feedback

251     N:  ECN-*N*once (RFC3540)

253     E:  *E*CN (RFC3168)

255     I:  Not-ECN (*I*mplicit congestion notification).
257      +----+---+---+---+------------+----------------+------------------+
258      | Ac | N | E | I | SYN A->B   | SYN/ACK B->A   | Mode             |
259      +----+---+---+---+------------+----------------+------------------+
260      |    |   |   |   | NS CWR ECE | NS CWR ECE     |                  |
261      | AB |   |   |   | 1   1   1  | X   1   0      | accurate ECN     |
262      | A  | B |   |   | 1   1   1  | 1   0   1      | ECN Nonce        |
263      | A  |   | B |   | 1   1   1  | 0   0   1      | classic ECN      |
264      | A  |   |   | B | 1   1   1  | 0   0   0      | Not ECN          |
265      | A  |   |   | B | 1   1   1  | X   1   1      | Not ECN (broken) |
266      +----+---+---+---+------------+----------------+------------------+

268     Table 1: ECN capability negotiation between Sender (A) and
269     Receiver (B)

271     The responding host (B) MAY set the NS bit to 1 to indicate a
272     congestion feedback for the SYN packet.  Otherwise the receiver (B)
273     MUST reply to the sender with NS=0.  The addition of ECN to TCP
274     SYN/ACK packets is discussed and specified as experimental in
275     [RFC5562] where the addition of ECN to the SYN packet is optionally
276     described.  The security implications when using this option are not
277     further discussed here.  Only if the initial SYN from client A is
278     marked CE, the server B SHOULD set the NS flag to 1 to indicate the
279     congestion immediately, instead of delaying the signal to the first
280     acknowledgment when the actual data transmission has started.  So,
281     server B MAY set the alternative TCP header flags in its SYN/ACK:
282     NS=1, CWR=1 and ECE=0.

284     Recall that, if the SYN/ACK reflects the same flag settings as the
285     preceding SYN (because there may exist broken TCP implementations
286     that behave this way), [RFC3168] specifies that the whole connection
287     MUST revert to Not-ECT.

289  2.2.  Feedback Coding

291     This section proposes the new coding to provide a more accurate ECN
292     feedback by use of the two ECN TCP bits (ECE/CWR) as well as the TCP
293     NS bit and the optional use of the Urgent Pointer if the URG flag is
294     not set.
This coding MUST only be used if the more accurate ECN
295     Feedback has been negotiated successfully in the TCP handshake.

297  2.2.1.  Codepoint Coding of the more Accurate ECN (ACE) field

299     The more accurate ECN feedback coding uses the ECE, CWR and NS bits
300     as one field to encode 8 distinct codepoints.  This overloaded use of
301     these 3 header flags as one 3-bit more Accurate ECN (ACE) field is
302     shown in Figure 3.  The actual definition of the TCP header,
303     including the addition of support for the ECN Nonce, is shown for
304     comparison in Figure 1.  This specification does not redefine the
305     names of these three TCP flags; it merely overloads them with another
306     definition once a flow with more accurate ECN feedback is
307     established.

309        0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
310      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
311      |               |           |           | U | A | P | R | S | F |
312      | Header Length | Reserved  |    ACE    | R | C | S | S | Y | I |
313      |               |           |           | G | K | H | T | N | N |
314      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

316     Figure 3: Definition of the ACE field within bytes 13 and 14 of the
317     TCP Header (when SYN=0).

319     The 8 possible codepoints are shown below.  Five of them are used to
320     encode a "congestion indication" (CI) counter.  The other three
321     codepoints are defined in the next section to be used for an
322     integrity check based on ECN-Nonce.  The CI counter maintains the
323     number of CE marks observed at the receiver (see Section 2.3).
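
   The overloading of the three flags as one field can be sketched in C
   as follows.  This is only an illustration, not part of the
   specification: the function names are hypothetical, and the bit order
   (NS as the most significant bit, ECE as the least significant) is
   assumed from the order in which the flags appear in the TCP header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ACE_CI_CODEPOINTS 5u /* codepoints 0..4 carry the CI counter */

/* Combine the NS, CWR and ECE flags into a 3-bit ACE value (0..7). */
uint8_t ace_pack(bool ns, bool cwr, bool ece)
{
    return (uint8_t)((ns ? 4u : 0u) | (cwr ? 2u : 0u) | (ece ? 1u : 0u));
}

/* Split a 3-bit ACE value back into the three header flags. */
void ace_unpack(uint8_t ace, bool *ns, bool *cwr, bool *ece)
{
    assert(ace < 8u);
    *ns  = (ace & 4u) != 0;
    *cwr = (ace & 2u) != 0;
    *ece = (ace & 1u) != 0;
}

/* True if the codepoint encodes a CI counter value (CI modulo 5). */
bool ace_is_ci(uint8_t ace)
{
    return ace < ACE_CI_CODEPOINTS;
}
```

   With this mapping, the flags NS=1, CWR=0, ECE=0 yield the ACE value
   4, the highest codepoint that carries a CI value; the values 5 to 7
   are the three remaining codepoints.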
325      +-----+----+-----+-----+------------+
326      | ACE | NS | CWR | ECE | CI (base5) |
327      +-----+----+-----+-----+------------+
328      |  0  | 0  |  0  |  0  |     0      |
329      |  1  | 0  |  0  |  1  |     1      |
330      |  2  | 0  |  1  |  0  |     2      |
331      |  3  | 0  |  1  |  1  |     3      |
332      |  4  | 1  |  0  |  0  |     4      |
333      |  5  | 1  |  0  |  1  |     -      |
334      |  6  | 1  |  1  |  0  |     -      |
335      |  7  | 1  |  1  |  1  |     -      |
336      +-----+----+-----+-----+------------+

338     Table 2: Codepoint assignment for accurate ECN feedback

340     Also note that, whenever the SYN flag of a TCP segment is set
341     (including when the ACK flag is also set), the NS, CWR and ECE flags
342     (i.e. the ACE field bits of such a segment) MUST NOT be interpreted
343     as the 3-bit codepoint, which is only used in non-SYN packets.

345  2.2.2.  Use with ECN Nonce

347     In ECN Nonce, by comparing the number of incoming ECT(1)
348     notifications with the actual number of packets that were transmitted
349     with an ECT(1) mark as well as the sum of the sender's two internal
350     counters, the sender can probabilistically detect a receiver that
351     sends false marks or suppresses accurate ECN feedback, or a path that
352     does not properly support ECN.

354     If an ECT(1) mark is received, an ECT(1) counter (E1) is incremented.
355     The receiver has to convey that updated information to the sender
356     with the next possible ACK using the three remaining codepoints as
357     shown in Table 3.

359      +-----+----+-----+-----+------------+------------+
360      | ACE | NS | CWR | ECE | CI (base5) | E1 (base3) |
361      +-----+----+-----+-----+------------+------------+
362      |  0  | 0  |  0  |  0  |     0      |     -      |
363      |  1  | 0  |  0  |  1  |     1      |     -      |
364      |  2  | 0  |  1  |  0  |     2      |     -      |
365      |  3  | 0  |  1  |  1  |     3      |     -      |
366      |  4  | 1  |  0  |  0  |     4      |     -      |
367      |  5  | 1  |  0  |  1  |     -      |     0      |
368      |  6  | 1  |  1  |  0  |     -      |     1      |
369      |  7  | 1  |  1  |  1  |     -      |     2      |
370      +-----+----+-----+-----+------------+------------+

372     Table 3: Codepoint assignment for accurate ECN feedback and ECN Nonce

374  2.2.3.
Auxiliary data in the Urgent Pointer field
375     The limited number of bits in the existing TCP flags field is
376     insufficient to provide improved resiliency against ACK loss or ACK
377     thinning.  At the same time it is not necessary to deliver higher
378     order bits with every returned segment, or even reliably at all.
379     Therefore four bits of the reused Urgent Pointer field are defined as
380     the "Top ACE" field of the more accurate ECN feedback, as indicated
381     in Figure 4.  This field carries the high-order (binary) part of the
382     counter value if the corresponding codepoint signals a counter.
383     Therefore, we call this field "Top ACE".

385        0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
386      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
387      |                                               |               |
388      |                   Reserved                    |    Top ACE    |
389      |                                               |               |
390      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

392     Figure 4: Definition of the Top ACE field within the Urgent Pointer
        field (when URG=0)

394     As 5 codepoints are set aside to provide reasonable resiliency under
395     typical marking and loss regimes, the combination of the 4 bits
396     in the Top ACE field and the 5 codepoints in the ACE field allows for
397     up to 16*5 = 80 congestion indications to be unambiguously signaled
398     back to the sender, even with more extreme levels of CE marking or
399     return ACK loss.

401     A combination of the 3 remaining codepoints (e.g. to signal a
402     counter for the number of observed ECT(1) packets) and this field
403     allows for up to 16*3 = 48 distinct indications.

405     The reserved bits SHOULD be set to zero, and MUST NOT be interpreted
406     when evaluating the combination of the "Top ACE":"ACE" fields.  Also,
407     when the URG flag is set, the entire Urgent Pointer MUST NOT be
408     interpreted to carry significance for the Accurate ECN feedback.

410  2.3.  More Accurate ECN TCP Receiver

412     This section describes the receiver-side action to signal the
413     accurate ECN feedback back to the sender.
To select the correct
414     codepoint for each ACK, the receiver will need to maintain a
415     congestion indication (CI) counter of how many CE markings have been
416     seen during a connection and an ECT(1) counter (E1) that is
417     incremented on the reception of an ECT(1) marked packet.

419     Thus for each incoming segment with a CE marking, the receiver will
420     increase CI by 1.  With each ACK the receiver will calculate CI
421     modulo 5 and set the respective codepoint in the ACE field (see Table
422     2).  In addition, the receiver calculates the integer part of CI
423     divided by 5 and may set the "Top ACE" field to this value, provided
424     the URG flag is not set in the segment.  To avoid counter wrap-around
425     in a high congestion situation, the receiver MAY switch from delayed
426     ACKs to sending ACKs immediately after data packet reception.

428     By default an accurate ECN receiver SHOULD echo the current value of
429     the CI counter, using one of the codepoints encoding the CI counter.
430     Whenever a CE marked segment is received and thus the value of the CI
431     is changed, the receiver MUST echo the then current CI value in the
432     next ACK sent.  The receiver MAY use the "Top ACE" field in addition
433     if the URG flag is not set.

435     The requirement to signal an updated CI value immediately with the
436     next ACK may conflict with delayed ACK ratios larger than two when
437     only the ACE codepoints are available, i.e. when "Top ACE" cannot
438     be used.  A receiver MAY change the ACK'ing rate such that a
439     sufficient rate of feedback signals can be sent.  However, in
440     combination with the redefined Urgent Pointer field, no change in the
441     ACK rate should be required.

443     Whenever an ECT(1) marked packet arrives, the receiver SHOULD signal
444     the current value of the E1 counter (modulo 3) in the next ACK using
445     the respective codepoint.  If a CE mark was received before sending
446     the next ACK (e.g.
delayed ACKs), sending the current CI value update
447     MUST take precedence.  Further resilience against lost ACKs MAY be
448     provided by inserting the high order bits of the E1 counter (E1
449     divided by 3) into the Top ACE field.

451     For the implementation it is suggested to maintain two counters so as
452     to avoid costly division operations while processing the header
453     information for the ACK.  The first counter can be mapped directly
454     into the ACE field.  A wrap by the count of 5 is implemented as a
455     single conditional check, and when that happens, a secondary, high-
456     order counter is increased once.  This secondary counter can then be
457     mapped directly into the Top ACE field.

459        if (CE) {
460            if (CIcnt == 4) {              /* CIcnt runs from 0 to 4 */
461                CIcnt = 0;
462                CIovf = (CIovf + 1) & 0xF; /* Top ACE is 4 bits wide */
463            } else
464                CIcnt += 1;
465        }

467        ACE = CIcnt;
468        TopACE = CIovf;

470     Figure 5: Implementation example

472  2.4.  More Accurate ECN TCP Sender

474     This section specifies the sender-side action describing how to
475     extract the number of congestion markings from the given receiver
476     feedback signal.

478     When the more accurate ECN feedback scheme is supported by the
479     sender, the sender will maintain a congestion indication received
480     (CI.r) counter.  This CI.r counter will hold the number of CE marks
481     as signaled by the receiver, and reconstructed by the sender.

483     On the arrival of every ACK, if the URG flag is not set, the sender
484     updates the local CI.r value to the signaled CI value in the ACK as
485     conveyed by the combination of the ACE field and the "Top ACE" field
486     in the Urgent Pointer.

488     If the URG flag is set and thus the "Top ACE" field in the Urgent
489     Pointer field is not available, the sender calculates a value D as
490     the difference, modulo 5, between the value of the ACE field and the
491     current CI.r value.  D is assumed to be the number of CE marked packets
492     that arrived at the receiver since it sent the previously received
493     ACK.  Thus the local counter CI.r must be increased by D.
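
   The sender-side update described above can be sketched in C as
   follows.  The names are hypothetical and the sketch rests on
   assumptions that go beyond the text: the ACK carries a CI codepoint
   (ACE value 0 to 4), the Top ACE field carries the 4 low-order bits of
   CI divided by 5 (so the combined value wraps at 80), and fewer than 5
   (without Top ACE) or 80 (with Top ACE) new CE marks arrived since the
   previously received ACK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ACE_CI_MOD  5u  /* number of CI codepoints in the ACE field */
#define TOP_ACE_MOD 16u /* 4-bit Top ACE field */

/*
 * Update the sender's CI.r counter from one ACK and return the new value.
 *   cir       current CI.r value
 *   ace       ACE field value, must be a CI codepoint (0..4)
 *   top_ace   Top ACE value (0..15), ignored when top_valid is false
 *   top_valid false if URG is set, i.e. Top ACE is not available
 */
uint32_t cir_update(uint32_t cir, uint8_t ace, uint8_t top_ace,
                    bool top_valid)
{
    uint32_t mod = top_valid ? ACE_CI_MOD * TOP_ACE_MOD : ACE_CI_MOD;
    uint32_t signaled = top_valid ? (uint32_t)top_ace * ACE_CI_MOD + ace
                                  : ace;
    uint32_t d;

    assert(ace < ACE_CI_MOD && top_ace < TOP_ACE_MOD);
    /* D = (signaled - CI.r) mod m, computed without going negative */
    d = (signaled + mod - cir % mod) % mod;
    return cir + d;
}
```

   For example, with CI.r = 3 and an ACK carrying ACE = 1 while URG is
   set, D = (1 - 3) mod 5 = 3 and CI.r becomes 6.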
495 As only a limited number of E1 codepoints exist and the receiver 496 might not acknowledge every single data packet immediately (e.g. 497 delayed ACKs), a sender SHOULD NOT mark more than 1/m of the packets 498 with ECT(1), where m is the ACK ratio (e.g. 50% when every second 499 data packet triggers an ACK). This constraint can be lifted when a 500 sender determines, that the auxiliary data is available (the Top ACE 501 field of an ACK with an E1 codepoint is increasing with the number of 502 sent ECT(1) segments). A sender SHOULD send no more than 3 503 consecutive packets marked with ECT(1), as long as the validity of 504 the auxiliary data in the Top ACE field has not been confirmed. 506 3. Acknowledgements 508 We want to thank Bob Briscoe and Michael Welzl for their input and 509 discussion. Special thanks to Bob Briscoe, who first proposed the 510 use of the ECN bits as one field. 512 4. IANA Considerations 514 This memo includes a request to IANA, to set up a new registry. This 515 registry redefines the use of the 16 bit "Urgent Pointer" while the 516 URG flag is not set. 4 of those bits ("Top ACE") are defined within 517 this document to be interpreted in conjunction with another field 518 ("ACE"), overwriting three of the existing TCP flags into a single 519 field. 521 5. Security Considerations 523 TBD 525 ACK loss 527 This scheme sends each codepoint only once. In the worst case at 528 least one, and often two or more consecutive ACKs can be dropped 529 without losing congestion information, even when the auxiliary data 530 field in the former Urgent Pointer field is unavailable (i.e. the URG 531 flag is set, or a middlebox clears its contents). 533 At low congestion rates, the sending of the current value of the CI 534 counter by default allows higher numbers of consecutive ACKs to be 535 lost, without impacting the accuracy of the ECN signal. 
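
   A rough way to check the ACK loss claim above is to count how many CE
   marks may accumulate between two ACKs that actually reach the sender.
   The following C sketch is an illustration under stated assumptions,
   not part of the specification: every data packet is CE-marked (the
   worst case), the receiver sends one ACK per ack_ratio data packets,
   and the function name is hypothetical.  Without the Top ACE field the
   sender can resolve at most 4 new marks (5 codepoints); with it, at
   most 79 (80 combined values).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if the sender can still reconstruct CI unambiguously
 * after lost_acks consecutive ACKs were dropped, assuming that every
 * data packet carries a CE mark.
 */
bool ci_survives_ack_loss(uint32_t ack_ratio, uint32_t lost_acks,
                          bool top_ace_ok)
{
    uint32_t max_delta = top_ace_ok ? 79u : 4u;
    uint32_t marks;

    assert(ack_ratio > 0u);
    /* marks covered by the lost ACKs plus the ACK that finally arrives */
    marks = ack_ratio * (lost_acks + 1u);
    return marks <= max_delta;
}
```

   Under these assumptions a delayed ACK ratio of two tolerates one lost
   ACK without auxiliary data (4 marks), but not two (6 marks); at lower
   marking rates more consecutive ACKs can be lost.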
537 ECN Nonce 539 In the proposed scheme there are three more codepoints available that 540 could be used for an integrity check like ECN Nonce. If ECN nonce 541 would be implemented as proposed in Section 2.2.2, even more 542 information would be provided for ECN Nonce than in the original 543 specification. 545 A delayed ACK ratio of two can be sustained indefinitely without 546 reverting to auxiliary information, even during heavy congestion, but 547 not during excessive ECT(1) marking, which is under the control of 548 the sender. A higher ACK ratio can be sustained when congestion is 549 low, and the auxiliary data is available. 551 6. References 552 6.1. Normative References 554 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 555 Requirement Levels", BCP 14, RFC 2119, March 1997. 557 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 558 of Explicit Congestion Notification (ECN) to IP", RFC 559 3168, September 2001. 561 [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit 562 Congestion Notification (ECN) Signaling with Nonces", RFC 563 3540, June 2003. 565 6.2. Informative References 567 [Ali10] Alizadeh, M., Greenberg, A., Maltz, D., Padhye, J., Patel, 568 P., Prabhakar, B., Sengupta, S., and M. Sridharan, "DCTCP: 569 Efficient Packet Transport for the Commoditized Data 570 Center", Jan 2010. 572 [I-D.briscoe-tsvwg-re-ecn-tcp] 573 Briscoe, B., Jacquet, A., Moncaster, T., and A. Smith, 574 "Re-ECN: Adding Accountability for Causing Congestion to 575 TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-09 (work in 576 progress), October 2010. 578 [RFC5562] Kuzmanovic, A., Mondal, A., Floyd, S., and K. 579 Ramakrishnan, "Adding Explicit Congestion Notification 580 (ECN) Capability to TCP's SYN/ACK Packets", RFC 5562, June 581 2009. 583 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 584 Control", RFC 5681, September 2009. 586 [RFC5690] Floyd, S., Arcia, A., Ros, D., and J. 
Iyengar, "Adding 587 Acknowledgement Congestion Control to TCP", RFC 5690, 588 February 2010. 590 [draft-kuehlewind-tcpm-accurate-ecn-option] 591 Kuehlewind, M. and R. Scheffenegger, "Accurate ECN 592 Feedback Option in TCP", draft-kuehlewind-tcpm-accurate- 593 ecn-option-01 (work in progress), Jul 2012. 595 Authors' Addresses 597 Mirja Kuehlewind (editor) 598 University of Stuttgart 599 Pfaffenwaldring 47 600 Stuttgart 70569 601 Germany 603 Email: mirja.kuehlewind@ikr.uni-stuttgart.de 605 Richard Scheffenegger 606 NetApp, Inc. 607 Am Euro Platz 2 608 Vienna 1120 609 Austria 611 Phone: +43 1 3676811 3146 612 Email: rs@netapp.com