QUIC                                                          J. Iyengar
Internet-Draft                                                    Fastly
Intended status: Standards Track                                I. Swett
Expires: 27 January 2021                                          Google
                                                            26 July 2020


            Sender Control of Acknowledgement Delays in QUIC
                   draft-iyengar-quic-delayed-ack-01

Abstract

   This document describes a QUIC extension for an endpoint to control
   its peer's delaying of acknowledgements.

Note to Readers

   Discussion of this draft takes place on the QUIC working group
   mailing list (quic@ietf.org), which is archived at .

   Working Group information can be found at ; source code and issues
   list for this draft can be found at .

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 27 January 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terms and Definitions
   2.  Motivation
   3.  Negotiating Extension Use
   4.  ACK_FREQUENCY Frame
   5.  Multiple ACK_FREQUENCY Frames
   6.  Sending Acknowledgments
     6.1.  Response to Reordering
     6.2.  Expediting Congestion Signals
     6.3.  Batch Processing of Packets
   7.  Computation of Probe Timeout Period
   8.  Implementation Considerations
     8.1.  Loss Detection
     8.2.  New Connections
     8.3.  Window-based Congestion Controllers
   9.  Security Considerations
   10. IANA Considerations
   11. Normative References
   Appendix A.  Change Log
   Acknowledgments
   Authors' Addresses

1.  Introduction

   This document describes a QUIC extension for an endpoint to control
   its peer's delaying of acknowledgements.

1.1.  Terms and Definitions

   The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   In the rest of this document, "sender" refers to a QUIC data sender
   (and acknowledgement receiver).
   Similarly, "receiver" refers to a QUIC data receiver (and
   acknowledgement sender).

   An "acknowledgement packet" refers to a QUIC packet that contains
   only an ACK frame.

   This document uses terms, definitions, and notational conventions
   described in Section 1.2 and Section 1.3 of [QUIC-TRANSPORT].

2.  Motivation

   A receiver acknowledges received packets, but it can delay sending
   these acknowledgements. The delaying of acknowledgements can impact
   connection throughput, loss detection and congestion controller
   performance at a data sender, and CPU utilization at both a data
   sender and a data receiver.

   Reducing the frequency of acknowledgement packets can improve
   connection and endpoint performance in the following ways:

   *  Sending UDP packets can be noticeably CPU intensive on some
      platforms. Reducing the number of packets that only contain
      acknowledgements can therefore reduce the amount of CPU consumed
      at a data receiver. Experience shows that this cost reduction can
      be significant for high bandwidth connections.

   *  Similarly, receiving and processing UDP packets can also be CPU
      intensive, and reducing acknowledgement frequency reduces this
      cost at a data sender.

   *  On severely asymmetric link technologies, such as DOCSIS, LTE, and
      satellite links, connection throughput in the data direction
      becomes constrained when the reverse bandwidth is filled by
      acknowledgment packets. When traversing such links, reducing the
      number of acknowledgments allows connection throughput to scale
      much further.

   As discussed in Section 8, however, there are undesirable
   consequences to congestion control and loss recovery if a receiver
   unilaterally reduces the acknowledgment frequency. A sender's
   constraints on the acknowledgement frequency need to be taken into
   account to maximize congestion controller and loss recovery
   performance.

   [QUIC-TRANSPORT] currently specifies a simple delayed acknowledgement
   mechanism that a receiver can use: send an acknowledgement for every
   other packet, and for every packet when reordering is observed. This
   simple mechanism does not allow a sender to signal its constraints.
   This extension provides a mechanism to solve this problem.

3.  Negotiating Extension Use

   Endpoints advertise their support of the extension described in this
   document by sending the following transport parameter (Section 7.2 of
   [QUIC-TRANSPORT]):

   min_ack_delay (0xDE1A):  A variable-length integer representing the
      minimum amount of time in microseconds by which the endpoint can
      delay an acknowledgement. Values of 2^24 or greater are invalid,
      and receipt of these values MUST be treated as a connection error
      of type TRANSPORT_PARAMETER_ERROR.

   An endpoint's min_ack_delay MUST NOT be greater than its
   max_ack_delay. Endpoints that support this extension MUST treat
   receipt of a min_ack_delay that is greater than the received
   max_ack_delay as a connection error of type
   TRANSPORT_PARAMETER_ERROR. Note that while the endpoint's
   max_ack_delay transport parameter is in milliseconds (Section 18.2 of
   [QUIC-TRANSPORT]), min_ack_delay is specified in microseconds.

   This Transport Parameter is encoded as per Section 18 of
   [QUIC-TRANSPORT].

4.  ACK_FREQUENCY Frame

   Delaying acknowledgements as much as possible reduces both work done
   by the endpoints and network load. An endpoint's loss detection and
   congestion control mechanisms, however, need to be tolerant of this
   delay at the peer. An endpoint signals its tolerance to its peer
   using an ACK_FREQUENCY frame, shown below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          0xAF (i)                           ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Sequence Number (i)                     ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Packet Tolerance (i)                     ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Update Max Ack Delay (i)                   ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Ignore Order (8)|
   +-+-+-+-+-+-+-+-+-+

   Following the common frame format described in Section 12.4 of
   [QUIC-TRANSPORT], ACK_FREQUENCY frames have a type of 0xAF, and
   contain the following fields:

   Sequence Number:  A variable-length integer representing the sequence
      number assigned to the ACK_FREQUENCY frame by the sender to allow
      receivers to ignore obsolete frames; see Section 5.

   Packet Tolerance:  A variable-length integer representing the maximum
      number of ack-eliciting packets after which the receiver sends an
      acknowledgement. A value of 1 will result in an acknowledgement
      being sent for every ack-eliciting packet received. A value of 0
      is invalid. Receipt of an invalid value MUST be treated as a
      connection error of type FRAME_ENCODING_ERROR.

   Update Max Ack Delay:  A variable-length integer representing an
      update to the peer's "max_ack_delay" transport parameter
      (Section 18.2 of [QUIC-TRANSPORT]). The value of this field is in
      microseconds. Any value smaller than the "min_ack_delay"
      advertised by this endpoint is invalid. Receipt of an invalid
      value MUST be treated as a connection error of type
      PROTOCOL_VIOLATION.

   Ignore Order:  An 8-bit field representing a boolean truth value.
      This field MUST have the value 0x00 (representing "false") or 0x01
      (representing "true"). This field can be set to "true" by an
      endpoint that does not wish to receive an immediate
      acknowledgement when the peer observes reordering (Section 6.1).
      Receipt of any other value MUST be treated as a connection error
      of type FRAME_ENCODING_ERROR.

   ACK_FREQUENCY frames are ack-eliciting. However, their loss does not
   require retransmission if an ACK_FREQUENCY frame with a larger
   Sequence Number value has been sent.

   An endpoint MAY send ACK_FREQUENCY frames multiple times during a
   connection and with different values.

   An endpoint will have committed a "max_ack_delay" value to the peer,
   which specifies the maximum amount of time by which the endpoint will
   delay sending acknowledgments. When the endpoint receives an
   ACK_FREQUENCY frame, it MUST update this maximum time to the value
   proposed by the peer in the Update Max Ack Delay field.

5.  Multiple ACK_FREQUENCY Frames

   An endpoint can send multiple ACK_FREQUENCY frames, and each one of
   them can have different values in all fields. An endpoint MUST use a
   sequence number of 0 for the first ACK_FREQUENCY frame it constructs
   and sends, and a strictly increasing value thereafter.

   An endpoint MUST allow reordered ACK_FREQUENCY frames to be received
   and processed; see Section 13.3 of [QUIC-TRANSPORT].

   On the first received ACK_FREQUENCY frame in a connection, an
   endpoint MUST immediately record all values from the frame. The
   sequence number of the frame is recorded as the largest seen sequence
   number. The new Packet Tolerance and Update Max Ack Delay values
   MUST be immediately used for delaying acknowledgements; see
   Section 6.

   On a subsequently received ACK_FREQUENCY frame, the endpoint MUST
   check if this frame is more recent than any previous ones, as
   follows:

   *  If the frame's sequence number is not greater than the largest one
      seen so far, the endpoint MUST ignore this frame.

   *  If the frame's sequence number is greater than the largest one
      seen so far, the endpoint MUST immediately replace old recorded
      state with values received in this frame.
      The endpoint MUST start using the new values immediately for
      delaying acknowledgements; see Section 6. The endpoint MUST also
      replace the recorded sequence number.

6.  Sending Acknowledgments

   Prior to receiving an ACK_FREQUENCY frame, endpoints send
   acknowledgements as specified in Section 13.2.1 of [QUIC-TRANSPORT].

   On receiving an ACK_FREQUENCY frame and updating its recorded
   "max_ack_delay" and "Packet Tolerance" values (Section 5), the
   endpoint MUST send an acknowledgement when one of the following
   conditions is met:

   *  Since the last acknowledgement was sent, the number of received
      ack-eliciting packets is greater than or equal to the recorded
      "Packet Tolerance".

   *  Since the last acknowledgement was sent, "max_ack_delay" amount of
      time has passed.

   Section 6.1, Section 6.2, and Section 6.3 describe exceptions to this
   strategy.

   An endpoint is expected to bundle acknowledgements when possible.
   Every time an acknowledgement is sent, bundled or otherwise, all
   counters and timers related to delaying of acknowledgments are reset.

6.1.  Response to Reordering

   As specified in Section 13.3.1 of [QUIC-TRANSPORT], endpoints are
   expected to send an acknowledgement immediately on receiving a
   reordered ack-eliciting packet. This extension modifies this
   behavior.

   If the endpoint has not yet received an ACK_FREQUENCY frame, or if
   the most recent frame received from the peer has an "Ignore Order"
   value of "false" (0x00), the endpoint MUST immediately acknowledge
   any subsequent packets that are received out of order.

   If the most recent ACK_FREQUENCY frame received from the peer has an
   "Ignore Order" value of "true" (0x01), the endpoint does not make
   this exception.
   That is, the endpoint MUST NOT send an immediate acknowledgement in
   response to packets received out of order, and instead continues to
   use the peer's "Packet Tolerance" and "max_ack_delay" thresholds for
   sending acknowledgements.

6.2.  Expediting Congestion Signals

   As specified in Section 13.3.1 of [QUIC-TRANSPORT], an endpoint
   SHOULD immediately acknowledge packets marked with the ECN Congestion
   Experienced (CE) codepoint in the IP header. Doing so reduces the
   peer's response time to congestion events.

6.3.  Batch Processing of Packets

   For performance reasons, an endpoint can receive incoming packets
   from the underlying platform in a batch of multiple packets. This
   batch can contain enough packets to cause multiple acknowledgements
   to be sent.

   To avoid sending multiple acknowledgements in rapid succession, an
   endpoint MAY process all packets in a batch before determining
   whether a threshold has been met and an acknowledgement is to be sent
   in response.

7.  Computation of Probe Timeout Period

   On sending an update to the peer's "max_ack_delay", an endpoint can
   use this new value in later computations of its Probe Timeout (PTO)
   period; see Section 5.2.1 of [QUIC-RECOVERY]. The endpoint MUST
   however wait until the ACK_FREQUENCY frame that carries this new
   value is acknowledged by the peer.

   Until the frame is acknowledged, the endpoint MUST use the greater of
   the current "max_ack_delay" and the value that is in flight when
   computing the PTO period. Doing so avoids spurious PTOs that can be
   caused by an update that increases the peer's "max_ack_delay".

   While it is expected that endpoints will have only one ACK_FREQUENCY
   frame in flight at any given time, this extension does not prohibit
   having more than one in flight.
   Generally, when using "max_ack_delay" for PTO computations, endpoints
   MUST use the maximum of the current value and all those in flight.

8.  Implementation Considerations

   There are tradeoffs inherent in a sender sending an ACK_FREQUENCY
   frame to the receiver. As such, it is recommended that implementers
   experiment with different strategies and find those which best suit
   their applications and congestion controllers. There are, however,
   noteworthy considerations when devising strategies for sending
   ACK_FREQUENCY frames.

8.1.  Loss Detection

   A sender relies on receipt of acknowledgements to determine the
   amount of data in flight and to detect losses, e.g., when packets
   experience reordering; see [QUIC-RECOVERY]. Consequently, how often
   a receiver sends acknowledgments determines how long it takes for
   losses to be detected at the sender.

8.2.  New Connections

   Many congestion control algorithms have a startup mechanism during
   the beginning phases of a connection. It is typical that in this
   period the congestion controller will quickly increase the amount of
   data in the network until it is signalled to stop. While the
   mechanism used to achieve this increase varies, acknowledgments by
   the peer are generally critical during this phase to drive the
   congestion controller's machinery. A sender can send ACK_FREQUENCY
   frames while its congestion controller is in this state, ensuring
   that the receiver will send acknowledgments at a rate which is
   optimal for the sender's congestion controller.

8.3.  Window-based Congestion Controllers

   Congestion controllers that are purely window-based and strictly
   adherent to packet conservation, such as the one defined in
   [QUIC-RECOVERY], rely on receipt of acknowledgments to move the
   congestion window forward and send additional data into the network.
   Such controllers will suffer degraded performance if acknowledgments
   are delayed excessively. Similarly, if these controllers rely on the
   timing of peer acknowledgments (an "ACK clock"), delaying
   acknowledgments will cause undesirable bursts of data into the
   network.

9.  Security Considerations

   TBD.

10.  IANA Considerations

   TBD.

11.  Normative References

   [QUIC-RECOVERY]
              Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection
              and Congestion Control", Work in Progress, Internet-Draft,
              draft-ietf-quic-recovery-latest.

   [QUIC-TRANSPORT]
              Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
              Multiplexed and Secure Transport", Work in Progress,
              Internet-Draft, draft-ietf-quic-transport-latest.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

Appendix A.  Change Log

      *RFC Editor's Note:* Please remove this section prior to
      publication of a final version of this document.

Acknowledgments

   The following people directly contributed key ideas that shaped this
   draft: Bob Briscoe, Kazuho Oku, Marten Seemann.

Authors' Addresses

   Jana Iyengar
   Fastly

   Email: jri.ietf@gmail.com


   Ian Swett
   Google

   Email: ian.swett@google.com
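
   The receiver-side rules of Sections 5 and 6 and the sender-side rule
   of Section 7 can be sketched in Python as follows. This is a minimal,
   non-normative illustration and not part of the extension. The names
   AckFrequencyFrame, AckFrequencyError, ReceiverAckState, and
   max_ack_delay_for_pto are hypothetical. All times are in
   microseconds. The initial Packet Tolerance of 2 is an assumption that
   models the "every other packet" default of [QUIC-TRANSPORT]. Wire
   decoding, including validation of the Ignore Order octet, is assumed
   to have happened before these functions are called.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class AckFrequencyFrame:
    """Already-decoded ACK_FREQUENCY fields (hypothetical container)."""
    sequence_number: int
    packet_tolerance: int
    update_max_ack_delay_us: int
    ignore_order: bool


class AckFrequencyError(Exception):
    """Stands in for closing the connection with the named error code."""


class ReceiverAckState:
    def __init__(self, min_ack_delay_us: int, max_ack_delay_us: int) -> None:
        self.min_ack_delay_us = min_ack_delay_us  # advertised by this endpoint
        self.max_ack_delay_us = max_ack_delay_us  # currently committed value
        self.packet_tolerance = 2   # assumption: models the pre-frame default
        self.ignore_order = False
        self.largest_seq: Optional[int] = None
        self.pending = 0            # ack-eliciting packets since the last ACK
        self.last_ack_sent_us = 0

    def on_ack_frequency(self, frame: AckFrequencyFrame) -> bool:
        """Apply a frame per Section 5; return False if it was obsolete."""
        if frame.packet_tolerance == 0:
            raise AckFrequencyError("FRAME_ENCODING_ERROR")
        if frame.update_max_ack_delay_us < self.min_ack_delay_us:
            raise AckFrequencyError("PROTOCOL_VIOLATION")
        if (self.largest_seq is not None
                and frame.sequence_number <= self.largest_seq):
            return False  # not newer than the largest seen: ignore it
        self.largest_seq = frame.sequence_number
        self.packet_tolerance = frame.packet_tolerance
        self.max_ack_delay_us = frame.update_max_ack_delay_us
        self.ignore_order = frame.ignore_order
        return True

    def on_packet(self, now_us: int, ack_eliciting: bool = True,
                  out_of_order: bool = False, ce_marked: bool = False) -> bool:
        """Return True if an acknowledgement should be sent now (Section 6)."""
        if not ack_eliciting:
            return False  # assumption: only ack-eliciting packets count
        self.pending += 1
        if ce_marked:                                # Section 6.2
            return True
        if out_of_order and not self.ignore_order:   # Section 6.1
            return True
        if self.pending >= self.packet_tolerance:    # Packet Tolerance reached
            return True
        # max_ack_delay has passed since the last acknowledgement was sent
        return now_us - self.last_ack_sent_us >= self.max_ack_delay_us

    def on_ack_sent(self, now_us: int) -> None:
        """Reset counters and timers whenever any acknowledgement is sent."""
        self.pending = 0
        self.last_ack_sent_us = now_us


def max_ack_delay_for_pto(current_us: int, in_flight_us: Iterable[int]) -> int:
    """Section 7: use the maximum of the current value and all in flight."""
    return max([current_us, *in_flight_us])
```

   In this sketch the delay check runs only when a packet arrives. A
   real endpoint would also arm a timer so that an acknowledgement is
   sent once "max_ack_delay" elapses with no further arrivals, and it
   might call on_packet() for a whole batch before acting on the result
   (Section 6.3).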