TCP Maintenance and Minor Extensions                  M. Kuehlewind, Ed.
(tcpm)                                           University of Stuttgart
Internet-Draft                                          R. Scheffenegger
Intended status: Informational                              NetApp, Inc.
Expires: April 18, 2013                                 October 15, 2012


   Problem Statement and Requirements for a More Accurate ECN Feedback
                  draft-kuehlewind-tcpm-accecn-reqs-00

Abstract

   Explicit Congestion Notification (ECN) is an IP/TCP mechanism where
   network nodes can mark IP packets instead of dropping them to
   indicate congestion to the end-points.  An ECN-capable receiver
   feeds this information back to the sender.  ECN is specified for TCP
   in such a way that only one feedback signal can be transmitted per
   Round-Trip Time (RTT).  Recently, new TCP mechanisms like ConEx or
   DCTCP need more accurate ECN feedback information in the case where
   more than one marking is received in one RTT.  This document
   specifies requirements for a different ECN feedback scheme in the
   TCP header that provides more than one feedback signal per RTT.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 18, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.  Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Overview ECN and ECN Nonce in IP/TCP
   3.  Requirements
   4.  Design Approaches
     4.1.  Re-use of Header Bits
     4.2.  Use of Reserved Bits
     4.3.  TCP Option
   5.  Acknowledgements
   6.  IANA Considerations
   7.  Security Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   Explicit Congestion Notification (ECN) [RFC3168] is an IP/TCP
   mechanism where network nodes can mark IP packets instead of dropping
   them to indicate congestion to the end-points.  An ECN-capable
   receiver feeds this information back to the sender.  ECN is
   specified for TCP in such a way that only one feedback signal can be
   transmitted per Round-Trip Time (RTT).
   Recently proposed mechanisms like Congestion Exposure (ConEx) or
   DCTCP [Ali10] need more accurate ECN feedback information when more
   than one marking is received in one RTT.

   The following scenarios briefly show where accurate feedback is
   needed or provides additional value:

   A standard (RFC5681) TCP sender that supports ConEx:
      In this case the congestion control algorithm still ignores
      multiple marks per RTT, while the ConEx mechanism uses the
      extra information per RTT to re-echo more precise congestion
      information.

   A sender using DCTCP congestion control without ConEx:
      The congestion control algorithm uses the extra information per
      RTT to perform its decrease depending on the number of
      congestion marks.

   A sender using DCTCP congestion control that supports ConEx:
      Both the congestion control algorithm and ConEx use the
      accurate ECN feedback mechanism.

   A standard TCP sender (using the RFC5681 congestion control
   algorithm) without ConEx:
      No accurate feedback is necessary here.  The congestion
      control algorithm still reacts to only one signal per RTT.
      However, it is best to have one generic feedback mechanism,
      whether it is used or not.

   This document ...

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   We use the following terminology from [RFC3168] and [RFC3540]:

   The ECN field in the IP header:

      CE:  the Congestion Experienced codepoint,

      ECT(0):  the first ECN-Capable Transport codepoint, and

      ECT(1):  the second ECN-Capable Transport codepoint.

   The ECN flags in the TCP header:

      CWR:  the Congestion Window Reduced flag,

      ECE:  the ECN-Echo flag, and

      NS:  ECN Nonce Sum.
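
   To make the terminology above concrete, the following Python sketch
   names the ECN codepoints and the ECN-related TCP flags as bit values
   and decodes them from raw header fields.  The bit values are those
   defined in [RFC3168] and [RFC3540]; the constant and function names
   are illustrative assumptions and are not taken from any
   specification or implementation.

```python
# Sketch only: bit values follow RFC 3168 / RFC 3540; all names are
# illustrative assumptions.

# ECN field: the two least significant bits of the IPv4 TOS byte
# (or of the IPv6 Traffic Class byte).
NOT_ECT = 0b00   # transport is not ECN-capable
ECT_1 = 0b01     # ECT(1)
ECT_0 = 0b10     # ECT(0)
CE = 0b11        # Congestion Experienced

ECN_NAMES = {NOT_ECT: "Not-ECT", ECT_1: "ECT(1)", ECT_0: "ECT(0)", CE: "CE"}

# ECN-related TCP flags, as masks over the 16-bit word formed by
# bytes 13 and 14 of the TCP header (header length, reserved bits, flags).
TCP_NS = 0x0100
TCP_CWR = 0x0080
TCP_ECE = 0x0040


def ecn_codepoint(tos_byte):
    """Return the 2-bit ECN codepoint carried in a TOS/Traffic Class byte."""
    return tos_byte & 0b11


def ecn_tcp_flags(flags_word):
    """Return the NS, CWR and ECE flags found in the 16-bit flags word."""
    return {
        "NS": bool(flags_word & TCP_NS),
        "CWR": bool(flags_word & TCP_CWR),
        "ECE": bool(flags_word & TCP_ECE),
    }


if __name__ == "__main__":
    # TOS byte with DSCP 0 and both ECN bits set.
    print(ECN_NAMES[ecn_codepoint(0x03)])
    # Flags word with header length 5 and the ACK and ECE flags set.
    print(ecn_tcp_flags(0x5050))
```

   Keeping the codepoints as plain integers mirrors how they appear on
   the wire; a real implementation would more likely use the constants
   already provided by its network stack.
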

   In this document, we call the ECN feedback scheme specified in
   [RFC3168] 'classic ECN' and our new proposal the 'more accurate ECN
   feedback' scheme.  A 'congestion mark' is defined as an IP packet
   in which the CE codepoint is set.  A 'congestion event' refers to
   one or more congestion marks that belong to the same overload
   situation in the network (usually during one RTT).

2.  Overview ECN and ECN Nonce in IP/TCP

   ECN requires two bits in the IP header.  The ECN capability of a
   packet is indicated when either one of the two bits is set.  An ECN
   sender can set one or the other bit to indicate an ECN-capable
   transport (ECT), which results in two signals, ECT(0) and ECT(1).  A
   network node can set both bits simultaneously when it experiences
   congestion.  When both bits are set, the packet is regarded as
   "Congestion Experienced" (CE).

   In the TCP header the first two bits in byte 14 are defined for the
   use of ECN.  The TCP mechanism for signaling the reception of a
   congestion mark uses the ECN-Echo (ECE) flag in the TCP header.  To
   enable the TCP receiver to determine when to stop setting the
   ECN-Echo flag, the CWR flag is set by the sender upon reception of
   the feedback signal.  This always leads to a full RTT of ACKs with
   ECE set.  Thus any additional CE markings arriving within this RTT
   cannot be signaled back.

   ECN-Nonce [RFC3540] is an optional addition to ECN that is used to
   protect the TCP sender against accidental or malicious concealment of
   marked or dropped packets.  This addition defines the last bit of
   byte 13 in the TCP header as the Nonce Sum (NS) bit.  With ECN-Nonce,
   a nonce sum is maintained that counts the occurrence of ECT(1)
   packets.
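
   The ECE/CWR behavior described above can be illustrated with a
   small Python sketch of a classic ECN receiver.  It is a simplified
   model under stated assumptions: one ACK per data packet, no ACK
   loss, and the sender's CWR arriving with the first packet of the
   next RTT.  The class and function names are hypothetical.  The
   sketch shows that several CE marks within one RTT collapse into a
   single run of ECE flags, so the sender can only learn that at least
   one mark occurred.

```python
# Simplified model of classic ECN feedback (RFC 3168 behavior as
# summarized in the text). All names are hypothetical.

class ClassicEcnReceiver:
    """Latches ECE after a CE mark until a packet with CWR arrives."""

    def __init__(self):
        self.echo = False

    def on_data(self, ce, cwr):
        """Process one data packet and return the ECE flag of its ACK."""
        if cwr:
            self.echo = False   # sender confirmed it reacted
        if ce:
            self.echo = True    # (re)start echoing congestion
        return self.echo


def classic_feedback(packets):
    """Map a list of (ce, cwr) data packets to the ECE flag on each ACK."""
    receiver = ClassicEcnReceiver()
    return [receiver.on_data(ce, cwr) for ce, cwr in packets]


def congestion_signals(ece_flags):
    """Count runs of ECE: the number of events a classic sender can tell apart."""
    signals = 0
    previous = False
    for ece in ece_flags:
        if ece and not previous:
            signals += 1
        previous = ece
    return signals


if __name__ == "__main__":
    # One RTT worth of data packets, three of them CE-marked, no CWR yet.
    window = [(False, False), (True, False), (True, False),
              (False, False), (True, False), (False, False)]
    # The first packet of the next RTT carries CWR.
    next_rtt = [(False, True), (False, False)]

    ece = classic_feedback(window + next_rtt)
    print("CE marks received:", sum(ce for ce, _ in window))
    print("ECE flags on ACKs:", ece)
    print("signals visible to the sender:", congestion_signals(ece))
```

   With delayed ACKs or ACK loss the run of ECE flags would look
   different, but the sender would still be unable to recover the
   number of CE marks from it.
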

     0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   |               |           | N | C | E | U | A | P | R | S | F |
   | Header Length |  Reserved | S | W | C | R | C | S | S | Y | I |
   |               |           |   | R | E | G | K | H | T | N | N |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

     Figure 1: The (post-ECN Nonce) definition of the TCP header flags

3.  Requirements

   The requirements for an accurate ECN feedback protocol for use by,
   e.g., ConEx or DCTCP are fairly accurate (not necessarily perfect),
   timely and protected signaling.  This leads to the following
   requirements:

   Resilience
      The ECN feedback signal is carried within the TCP
      acknowledgment.  TCP ACKs can get lost.  Moreover, delayed
      ACKs are commonly used with TCP.  That means in most cases only
      every second data packet triggers an ACK.  In a high
      congestion situation where most of the packets are marked with
      CE, an accurate feedback mechanism must still be able to
      signal sufficient congestion information.  Thus the accurate
      ECN feedback extension has to take delayed ACKs and ACK loss
      into account.

   Timely
      The CE marking is induced by a network node on the
      transmission path and echoed by the receiver in the TCP
      acknowledgment.  Thus when this information arrives at the
      sender, it is naturally already about one RTT old.  With a
      sufficient ACK rate, a further delay of a small number of ACKs
      can be tolerated, but with large delays this information will
      be outdated due to the high dynamics in the network.  TCP
      congestion control, which introduces part of these dynamics,
      operates on a time scale of one RTT.  Thus the congestion
      feedback information should be delivered in a timely manner
      (within one RTT).

   Integrity
      With ECN Nonce, a misbehaving receiver or network node can be
      detected with a certain probability.  As this accurate ECN
      feedback reuses the NS bit, it is encouraged to ensure
      integrity at least as good as that of ECN Nonce.  If this is
      not possible, alternative approaches should be provided by
      which a mechanism using the accurate ECN feedback extension can
      re-ensure integrity or give strong incentives for the receiver
      and network node to cooperate honestly.

   Accuracy
      Classic ECN feeds back one congestion notification per RTT,
      as this is supposed to be used for TCP congestion control,
      which reduces the sending rate at most once per RTT.  The
      accurate ECN feedback scheme has to ensure that if a
      congestion event occurs, at least one congestion notification
      is echoed and received per RTT, as classic ECN would do.  Of
      course, the goal of this extension is to reconstruct the
      number of CE markings more accurately.  However, a sender
      should not assume that it gets the exact number of congestion
      markings in all situations.

   Complexity
      Of course, the more accurate ECN feedback can also be used
      even if only one ECN feedback signal per RTT is needed.  The
      implementation should be as simple as possible and only a
      minimum of additional state information should be needed.  A
      proposal for a more accurate ECN feedback that fulfills this
      can then also be the standard ECN feedback mechanism.

4.  Design Approaches

4.1.  Re-use of Header Bits

   The idea is to use the ECE, CWR and NS bits for additional capability
   negotiation during the TCP handshake exchange, and then for the more
   accurate ECN feedback itself on subsequent packets in the flow (where
   SYN is not set).  This approach provides only limited resilience
   against ACK loss.

   There have been several codings proposed so far: The one bit scheme
   sends one ECE for each CE received (+ redundancy in next ACK using
   the CWR bit).
   The 3 bit counter scheme uses all three bits to continuously feed
   back the three least significant bits of a CE counter.  The 3 bit
   codepoint scheme encodes either a CE counter or an ECT(1) counter in
   8 codepoints.

   Discussion on ACK loss and ECN...

   ToDo: Use of other header bit?

4.2.  Use of Reserved Bits

   As seen in Figure 1, there are currently three unused flag bits in
   the TCP header.  The proposed scheme could be extended by one or more
   bits to add higher resilience against ACK loss.  The relative gain
   would be proportionally higher resilience against ACK loss, while the
   respective drawbacks would remain identical.

4.3.  TCP Option

   Alternatively, a new TCP option could be introduced to help maintain
   the accuracy and integrity of the ECN feedback between receiver and
   sender.  Such an option could provide more information.  For example,
   ECN for RTP/UDP explicitly provides the number of ECT(0), ECT(1), CE,
   non-ECT marked and lost packets.  However, deploying new TCP options
   has its own challenges.  A separate document proposes a new TCP
   Option for accurate ECN feedback
   [I-D.kuehlewind-tcpm-accurate-ecn-option].  This option could be used
   in addition to a more accurate ECN feedback scheme described here or
   in addition to classic ECN, when available and needed.

5.  Acknowledgements

6.  IANA Considerations

   This memo includes no request to IANA.

7.  Security Considerations

   TBD

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, September 2001.

   [RFC3540]  Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
              Congestion Notification (ECN) Signaling with Nonces",
              RFC 3540, June 2003.

8.2.  Informative References

   [Ali10]    Alizadeh, M., Greenberg, A., Maltz, D., Padhye, J., Patel,
              P., Prabhakar, B., Sengupta, S., and M. Sridharan, "DCTCP:
              Efficient Packet Transport for the Commoditized Data
              Center", Jan 2010.

   [I-D.briscoe-tsvwg-re-ecn-tcp]
              Briscoe, B., Jacquet, A., Moncaster, T., and A. Smith,
              "Re-ECN: Adding Accountability for Causing Congestion to
              TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-09 (work in
              progress), October 2010.

   [I-D.kuehlewind-tcpm-accurate-ecn-option]
              Kuehlewind, M. and R. Scheffenegger, "Accurate ECN
              Feedback Option in TCP",
              draft-kuehlewind-tcpm-accurate-ecn-option-01 (work in
              progress), July 2012.

   [RFC5562]  Kuzmanovic, A., Mondal, A., Floyd, S., and K.
              Ramakrishnan, "Adding Explicit Congestion Notification
              (ECN) Capability to TCP's SYN/ACK Packets", RFC 5562,
              June 2009.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, September 2009.

   [RFC5690]  Floyd, S., Arcia, A., Ros, D., and J. Iyengar, "Adding
              Acknowledgement Congestion Control to TCP", RFC 5690,
              February 2010.

Authors' Addresses

   Mirja Kuehlewind (editor)
   University of Stuttgart
   Pfaffenwaldring 47
   Stuttgart  70569
   Germany

   Email: mirja.kuehlewind@ikr.uni-stuttgart.de


   Richard Scheffenegger
   NetApp, Inc.
   Am Euro Platz 2
   Vienna,  1120
   Austria

   Phone: +43 1 3676811 3146
   Email: rs@netapp.com