QUIC                                                              Y. Liu
Internet-Draft                                                     Y. Ma
Intended status: Standards Track                            Alibaba Inc.
Expires: 9 September 2021                                     C. Huitema
                                                    Private Octopus Inc.
                                                                   Q. An
                                                            Alibaba Inc.
                                                                   Z. Li
                                                                 ICT-CAS
                                                            8 March 2021

                      Multipath Extension for QUIC
                      draft-liu-multipath-quic-03

Abstract

   This document specifies a multipath extension for the QUIC protocol
   to enable the simultaneous usage of multiple paths for a single
   connection.  The extension is compliant with the single-path QUIC
   design.  The design principle is to support multipath by adding a
   limited extension to [QUIC-TRANSPORT].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 9 September 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions and Definitions
   3.  Enable Multipath QUIC - Handshake
   4.  Path Management
     4.1.  Path Identifier and Connection ID
     4.2.  Path Packet Number Spaces
     4.3.  Path Initiation
     4.4.  Path State Management
     4.5.  Path Close
       4.5.1.  Use PATH_STATUS frame to close a path
       4.5.2.  Effect of RETIRE_CONNECTION_ID frame
       4.5.3.  Idle timeout
   5.  Using TLS to Secure QUIC Multipath
     5.1.  Packet protection for QUIC Multipath
     5.2.  Key Update for QUIC Multipath
   6.  Using Multipath QUIC with load balancers
   7.  Packet scheduling
     7.1.  Basic Scheduling
     7.2.  Scheduling with QoE Feedback
     7.3.  Per-stream Policy
   8.  Congestion control and loss detection
     8.1.  Congestion control
     8.2.  Packet number space and acknowledgements
     8.3.  Flow control
   9.  New frames
     9.1.  PATH_STATUS frame
     9.2.  ACK_MP frame
     9.3.  QOE_CONTROL_SIGNALS frame
   10. Implementation Considerations
     10.1.  Handling of 0-RTT packets
   11. Security Considerations
   12. IANA Considerations
   13. Changelog
   14. Appendix.A Scenarios related to migration
   15. Appendix.B Considerations on RTT estimate and loss detection
   16. Appendix.C Difference from past proposals
   17. References
     17.1.  Normative References
     17.2.  Informative References
   Authors' Addresses

1.  Introduction

   In this document, we propose an extension to the current QUIC design
   to enable the simultaneous usage of multiple paths for a single
   connection.

   This proposal is based on several basic design points:

   *  Reuse the mechanisms of QUIC v1 as much as possible; QUIC v1
      already supports connection migration and path validation.

   *  To avoid the risk of packets being dropped by middleboxes (which
      may only support QUIC v1), use the same packet header formats as
      QUIC v1.

   *  Endpoints need a Path Identifier for each path, which is used to
      track the state of packets.  As we want to keep the packet header
      formats of [QUIC-TRANSPORT] unchanged, Connection IDs (and the
      sequence numbers of Connection IDs) are a good choice of Path
      Identifier.

   *  For the convenience of packet loss detection and recovery,
      endpoints use a different packet number space for each Path
      Identifier.

   *  Congestion control, RTT measurements, and PMTU discovery should be
      per-path (following [QUIC-TRANSPORT]).

   This document is organized as follows.  It first provides definitions
   for multipath QUIC in Section 2.  It then specifies how to enable
   multipath QUIC during the handshake in Section 3, and path management
   in Section 4.  It discusses packet scheduling in Section 7, and
   congestion control in Section 8.  The new frames are defined in
   Section 9.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   We assume that the reader is familiar with the terminology used in
   [QUIC-TRANSPORT].  In addition, we define the following terms:

   *  Path Identifier: An identifier that is used to identify a path in
      a QUIC connection at an endpoint.  It is defined as the sequence
      number of the Destination Connection ID used for sending packets
      on that particular path.

   *  Each node maintains a list of "Received Packets" for each of the
      CIDs that it provided to the peer, which is used for acknowledging
      packets received with that CID.

3.  Enable Multipath QUIC - Handshake

   This extension defines a new transport parameter, used to negotiate
   the use of the multipath extension during the connection handshake,
   as specified in [QUIC-TRANSPORT].  The new transport parameter is
   defined as follows:

   *  name: enable_multipath (TBD - experiments use 0xbaba)

   *  value: 0 (default) for disabled, 1 for enabled

   If the peer does not send the enable_multipath (TBD - experiments use
   0xbaba) transport parameter, the peer does not support multipath.  In
   that case, the endpoint MUST fall back to single-path
   [QUIC-TRANSPORT], MUST NOT send any MP frames in subsequent packets,
   and MUST NOT use the multipath-specific AEAD usage defined in
   Section 5.1.

   Note that the transport parameter "active_connection_id_limit"
   [QUIC-TRANSPORT] limits the number of usable Connection IDs, and
   therefore also limits the number of concurrent paths.

4.  Path Management

   After both endpoints have negotiated the multipath feature during the
   handshake, they can start using multiple paths.

   This proposal adds one frame for path management:

   *  PATH_STATUS frame for the receiver side to claim the path state
      and preference

   All the new MP frames are sent in 1-RTT packets [QUIC-TRANSPORT].

4.1.  Path Identifier and Connection ID

   Endpoints need a Path Identifier for each path, which is used to
   track the state of packets.  Endpoints use the Connection ID in the
   1-RTT packet header as the Path Identifier in each direction, and use
   the sequence number of the Connection ID in MP frames to identify the
   path being referred to.

   Following [QUIC-TRANSPORT], each endpoint uses NEW_CONNECTION_ID
   frames to provide usable Connection IDs for itself.  Before an
   endpoint adds a new path, it SHOULD check whether there is at least
   one unused available Connection ID for each side.

   Endpoints can find which path a received packet belongs to from the
   Destination Connection ID of the 1-RTT packet.  Endpoints can find
   the context of a path by its Connection ID or by the sequence number
   of that Connection ID.

4.2.  Path Packet Number Spaces

   For the convenience of packet loss detection and recovery, endpoints
   use a different packet number space for each Path Identifier
   (Connection ID).  The ACK_MP frame includes the sequence number of
   the Destination Connection ID of the acknowledged packets as the Path
   Identifier.

4.3.  Path Initiation

   Figure 1 illustrates an example of new path establishment.

      Client                                                  Server

      (Exchanges start on default path)
      1-RTT[]: NEW_CONNECTION_ID[C1, Seq=1] -->
                            <-- 1-RTT[]: NEW_CONNECTION_ID[S1, Seq=1]
                            <-- 1-RTT[]: NEW_CONNECTION_ID[S2, Seq=2]
      ...
      (starts new path)
      1-RTT[0]: DCID=S2, PATH_CHALLENGE[X] -->
                          Checks AEAD using nonce(CID sequence 2, PN 0)
           <-- 1-RTT[0]: DCID=C1, PATH_RESPONSE[X], PATH_CHALLENGE[Y],
                                                   ACK_MP[Seq=2,PN=0]
      Checks AEAD using nonce(CID sequence 1, PN 0)
      1-RTT[1]: DCID=S2, PATH_RESPONSE[Y],
               ACK_MP[Seq=1, PN=0], ... -->

               Figure 1: Example of new path establishment

   As shown in Figure 1, the client provides one unused available
   Connection ID (C1 with sequence number 1), and the server provides
   two available Connection IDs (S1 with sequence number 1, and S2 with
   sequence number 2).  When the client wants to start a new path, it
   checks whether there are unused available Connection IDs for each
   side, and chooses an available Connection ID, S2, as the Destination
   Connection ID on the new path.

   Endpoints need to exchange unused available Connection IDs with the
   NEW_CONNECTION_ID frame before an endpoint starts a new path.  For
   example, if the goal is to maintain 2 paths, each endpoint should
   provide at least 3 CIDs to its peer: 2 in use, and one spare.  If the
   client has used all the allocated CIDs, it is supposed to retire
   those that are no longer used, and the server is supposed to provide
   replacements, as specified in [QUIC-TRANSPORT].

   If the transport parameter "active_connection_id_limit" is negotiated
   as N, the server has provided N Connection IDs, and the client has
   started N paths, the limit is reached.  If the client wants to start
   a new path, it has to retire one of the established paths.

   Path validation uses the PATH_CHALLENGE and PATH_RESPONSE frames
   defined in [QUIC-TRANSPORT].

4.4.  Path State Management

   An endpoint uses PATH_STATUS frames to inform the peer that it should
   send packets according to the preference expressed by these frames.
   In a PATH_STATUS frame, an endpoint uses the sequence number of the
   CID used by the peer (describing the sender's path identifier).

   In the example of Figure 1, if the client wants to send a PATH_STATUS
   frame to tell the server that it prefers the path with CID sequence
   number 1 (of the server's side), the client should use the identifier
   of the server (sequence 1) in the PATH_STATUS frame.

   The PATH_STATUS frame describes 4 kinds of path states:

   *  Abandon a path, and release the corresponding resources.

   *  Mark a path as "available", i.e., allow the peer to use its own
      logic to split traffic among available paths.

   *  Mark a path as "standby", i.e., suggest that no traffic should be
      sent on that path if another path is available.

   *  Mark the priority of a path.  For example, if path 1 has weight 8
      and path 2 has weight 2, path 1 has higher priority than path 2,
      and the peer should try to send more data on path 1.

   A PATH_STATUS frame can be sent on a path other than the one
   identified by its Path Identifier field.

4.5.  Path Close

   An endpoint that wants to delete a path SHOULD NOT rely on implicit
   signals like idle time or packet losses, but instead SHOULD
   explicitly request that the path be abandoned by sending a
   PATH_STATUS frame.

4.5.1.  Use PATH_STATUS frame to close a path

   Both the client and the server can close a path by sending a
   PATH_STATUS frame that abandons the path with the corresponding Path
   Identifier.  Once a path is marked as "abandon", the resources
   related to the path can be released.

   Figure 2 illustrates an example of path closing.  In this case, we
   are going to close the first path.  For the first path, the server's
   1-RTT packets use DCID C1, which has a sequence number of 1; the
   client's 1-RTT packets use DCID S2, which has a sequence number of 2.
   For the second path, the server's 1-RTT packets use DCID C2, which
   has a sequence number of 2; the client's 1-RTT packets use DCID S3,
   which has a sequence number of 3.  Note that the two paths use
   different packet number spaces.  (To distinguish the CID sequence
   number from the PATH_STATUS sequence number, we refer to the
   "PATH_STATUS sequence number" as "PSSN".)

   Client                                                      Server

   (client tells server to abandon a path)
   1-RTT[X]: DCID=S2 PATH_STATUS[id=1, PSSN1, status=abandon, pri.=0] ->
                              (server tells client to abandon a path)
   <- 1-RTT[Y]: DCID=C1 PATH_STATUS[id=2, PSSN2, status=abandon, pri.=0],
                                                   ACK_MP[Seq=2, PN=X]
   (client abandons the path that it is using)
   1-RTT[U]: DCID=S3 RETIRE_CONNECTION_ID[2], ACK_MP[Seq=1, PN=Y] ->
                          (server abandons the path that it is using)
   <- 1-RTT[V]: DCID=C2 RETIRE_CONNECTION_ID[1], ACK_MP[Seq=3, PN=U]

                   Figure 2: Example of closing a path

   The client or the server can terminate a path immediately, for
   example when the client detects a change in the network environment
   (the client's 4G or Wi-Fi interface is turned off, or the Wi-Fi
   signal fades below a threshold), or when an endpoint detects that the
   RTT or loss rate of the path is getting worse.

4.5.2.  Effect of RETIRE_CONNECTION_ID frame

   Receiving a RETIRE_CONNECTION_ID frame causes the endpoint to discard
   the resources associated with that connection ID.  If the connection
   ID was used by the peer to identify a path from the peer to this
   endpoint, the resources include the list of received packets used to
   send acknowledgements.  The peer MAY decide to keep sending data
   using the same IP addresses and UDP ports previously associated with
   the connection ID, but MUST use a different connection ID when doing
   so.

4.5.3.  Idle timeout

   [QUIC-TRANSPORT] allows for closing of connections if they stay idle
   for too long.  The connection idle timeout in multipath QUIC is
   defined as "no packet received on any path for the duration of the
   idle timeout".  It means that if all paths remain idle for the idle
   timeout, the connection is implicitly closed.

5.  Using TLS to Secure QUIC Multipath

   In order to facilitate loss detection and recovery when sending data
   over multiple paths, this specification defines how packets sent over
   multiple paths use different packet number spaces.  This requires
   changes in the way AEAD is applied for packet protection, as
   explained in Section 5.1, and tighter constraints for key updates, as
   explained in Section 5.2.

5.1.  Packet protection for QUIC Multipath

   Packet protection for QUIC v1 is specified in Section 5 of
   [QUIC-TLS].  The general principles of packet protection are not
   changed for QUIC Multipath.  No changes are needed for setting packet
   protection keys, initial secrets, header protection, use of 0-RTT
   keys, receiving out-of-order protected packets, receiving protected
   packets, or retry packet integrity.  However, the use of multiple
   number spaces for 1-RTT packets requires changes in AEAD usage.

   Section 5.3 of [QUIC-TLS] specifies AEAD usage, and in particular the
   use of a nonce, N, formed by combining the packet protection IV with
   the packet number.  QUIC multipath uses multiple packet number
   spaces, and thus the packet number alone would not guarantee the
   uniqueness of the nonce.  In order to guarantee this uniqueness, we
   construct the nonce N by combining the packet protection IV with the
   packet number and with the identifier of the path, which for 1-RTT
   packets is the Sequence Number of the Destination Connection ID
   present in the packet header, as defined in Section 5.1.1 of
   [QUIC-TRANSPORT], or zero if the Connection ID is zero-length.
   Section 19 of [QUIC-TRANSPORT] encodes this Connection ID Sequence
   Number as a variable-length integer, allowing values up to 2^62-1;
   for QUIC multipath, we require that a range of no more than 2^32-1
   values be used without updating the packet protection key.
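
   As a rough illustration of the nonce construction in this section,
   the sketch below builds the 96-bit path-and-packet-number, left-pads
   it to the IV length, and XORs it with the IV.  The function name
   multipath_nonce and its argument names are hypothetical and not part
   of this specification; the sketch assumes the 32-bit Connection ID
   Sequence Number is encoded in network byte order.

```python
def multipath_nonce(iv: bytes, cid_seq: int, packet_number: int) -> bytes:
    """Sketch: derive a per-packet AEAD nonce for multipath QUIC.

    Hypothetical helper, not defined by the draft.
    """
    if len(iv) < 12:
        raise ValueError("IV must be at least 96 bits long")
    if not 0 <= cid_seq < 2**32:
        raise ValueError("CID sequence number must fit in 32 bits")
    if not 0 <= packet_number < 2**62:
        raise ValueError("packet number must fit in 62 bits")

    # 32-bit CID sequence number, followed by one 64-bit field holding
    # two zero bits and the 62-bit packet number, in network byte order.
    path_and_pn = cid_seq.to_bytes(4, "big") + packet_number.to_bytes(8, "big")

    # Left-pad with zeros to the IV size, then XOR byte by byte.
    padded = path_and_pn.rjust(len(iv), b"\x00")
    return bytes(a ^ b for a, b in zip(iv, padded))
```

   For instance, with the IV 6b26114b9cba2b63a9e8dd4f, Connection ID
   sequence number 3, and packet number 0xaead, the call
   multipath_nonce(bytes.fromhex("6b26114b9cba2b63a9e8dd4f"), 3, 0xaead)
   returns the bytes 6b2611489cba2b63a9e873e2.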

   For QUIC multipath, the construction of the nonce starts with the
   construction of a 96-bit path-and-packet-number, composed of the
   32-bit Connection ID Sequence Number in network byte order, two zero
   bits, and the 62 bits of the reconstructed QUIC packet number in
   network byte order.  If the IV is larger than 96 bits, the
   path-and-packet-number is left-padded with zeros to the size of the
   IV.  The exclusive OR of the padded path-and-packet-number and the IV
   forms the AEAD nonce.

   For example, assuming the IV value is "6b26114b9cba2b63a9e8dd4f", the
   connection ID sequence number is "3", and the packet number is
   "aead", the nonce will be set to "6b2611489cba2b63a9e873e2".

5.2.  Key Update for QUIC Multipath

   The Key Phase bit update process for QUIC v1 is specified in
   Section 6 of [QUIC-TLS].  The general principles of key update are
   not changed for Multipath QUIC.  Following QUIC v1, the Key Phase bit
   is used to indicate which packet protection keys are used to protect
   the packet.  The Key Phase bit is toggled to signal each subsequent
   key update.  Because of network delays, packets protected with the
   older key might arrive later than the packets protected with the new
   key.  Therefore, the endpoint needs to retain old packet keys to
   allow these delayed packets to be processed, and it must distinguish
   between the new key and the old key.  In QUIC v1, this is done using
   packet numbers, so the rule is simple: use the older key if the
   packet number is lower than any packet number from the current key
   phase.

   In QUIC multipath, some care is needed when initiating a key update.
   Because different paths use different packet number spaces but share
   a single key, when a key update is initiated on one path, the other
   paths need to know when the transition is complete.  Otherwise,
   another path might send packets with the old keys, skip the current
   key phase entirely, and jump directly to sending packets in the next
   key phase.  When that happens, because the endpoint can retain only
   two sets of packet protection keys with the 1-bit Key Phase, the
   receiver cannot determine which key to use to decrypt packets
   received on the other paths, which results in a key rotation
   synchronization problem.

   To address this synchronization issue in QUIC multipath, if a key
   update is initiated on one path, the sender should send at least one
   packet with the new key on all active paths.  When responding to a
   key update, the endpoint MUST NOT initiate a subsequent key update
   until a packet with the current key has been acknowledged on each
   path.

   Following Section 5.4 of [QUIC-TLS], the Key Phase bit is protected,
   so sending multiple packets with the Key Phase bit flipped at the
   same time should not cause linkability issues.

6.  Using Multipath QUIC with load balancers

   This specification follows the Connection ID negotiation defined in
   [QUIC-TRANSPORT].  For stateless or low-state load balancers
   supporting Multipath QUIC, implementations SHOULD use the
   specification of Connection ID generation and load balancer routing
   defined in [QUIC-LB], which guarantees that packets with Connection
   IDs belonging to the same connection can be routed to the same
   server.

7.  Packet scheduling

7.1.  Basic Scheduling

   For an outgoing packet, the packet scheduler decides on which path
   the packet shall be transmitted.  A basic static scheduling strategy
   consists of four major components:

   1.  Path state: A scheduler may want to decide which paths shall be
       activated to transmit data.  For instance, a scheduler can choose
       to use only one of two paths and completely ignore the other one.
       A scheduler marks the selected paths as being in the "available"
       state and the unselected ones as being in the "standby" state.

   2.  Path priority: The costs of transmitting data over different
       paths are not always equal.  For example, the energy (battery)
       costs of a 5G path and a Wi-Fi path are very different.  In
       another example, transmissions over a Wi-Fi path and a cellular
       path may incur different charges per packet.  Note that a user's
       preference may change over time.  For instance, certain mobile
       carriers offer unlimited free data for a particular streaming
       app.  Therefore, the path priority should be made available to
       the scheduler.

   3.  Path selection algorithm: A selection algorithm splits packets
       across different paths and determines the order in which paths
       are selected.  The selection algorithm takes congestion
       controller states as inputs, such as smoothed RTTs (sRTTs),
       estimated bandwidths (eBWs), and congestion window sizes (CWNDs),
       as well as application-defined information such as path
       priorities and path states.  The output of the algorithm is an
       ordered list of paths to put a packet on.  Some of the commonly
       used algorithms are:

       -  Round-Robin: There is no priority.  It selects paths one by
          one in order to transmit data.

       -  Lowest-RTT: It first chooses the path with the lowest RTT and
          feeds packets to it until that path's congestion window is
          full.  Then it chooses the path with the second lowest RTT.

       -  Highest-Sending-Rate: It first chooses the path with the
          highest bandwidth and feeds packets to it until that path's
          congestion window is full.  Then it chooses the path with the
          second highest bandwidth.

   4.  Packet redundancy: One major challenge in multipath transmission
       is that a packet loss on the slow path might block the overall
       transmission when packets are split across fast-changing paths.
       The path selection algorithm takes its inputs from the congestion
       controllers' predictions of the network, which may not be
       accurate enough for fast-changing wireless channels, and such an
       imprecise estimate could lead to network overuse or underuse.  A
       solution to this problem is to implement a packet redundancy
       strategy.  A redundancy strategy can be applied to only ACK
       packets (partial redundancy) or to all data packets (full
       redundancy).  It is up to the application to determine whether,
       when, and on which packets to activate redundancy.

   The path state and path priority are managed by the PATH_STATUS
   frame.  The path selection algorithm and packet redundancy are
   application related and should be controlled by the application.

7.2.  Scheduling with QoE Feedback

   Applications may have completely different QoE requirements:
   interactive applications are delay sensitive, while video streaming
   applications are more throughput sensitive.  There is thus a trend of
   cross-layer design that takes applications' demands into account when
   managing paths or scheduling packets.  The QoE feedback is used to
   fully support application-awareness in multipath scheduling and is
   carried in QOE_CONTROL_SIGNALS frames (Figure 6).  The
   QOE_CONTROL_SIGNALS frames can include general application-level
   information that is needed by the schedulers.  The frequency of such
   feedback should be controlled to limit the number of extra packets.
   The QoE control signal allows the two end hosts to synchronize their
   viewpoints.  It is up to the application to determine the
   interpretation of QoE control signals.

7.3.  Per-stream Policy

   As QUIC supports stream multiplexing, streams can be associated with
   stream priorities to express the application's intent.  For instance,
   objects in a web page may depend on others and thus have different
   priorities in the multipath QUIC scheduler.  A stream priority-aware
   packet scheduling algorithm will improve the performance notably.

   High priority   /\      +---------+
                   ||      |         |
                   ||      +---------+
                   ||      +---------+
                   ||      |         |
                   ||      +---------+
                   ||          ...          User-defined stream priority
                   ||      +---------+
   Low priority    ||      |         |
                   ||      +---------+
   -----------------------------------------------------------
   High priority   /\      +---------+
                   ||      |         |
                   ||      +---------+
                   ||      +---------+
                   ||      |         |
                   ||      +---------+
                   ||          ...          Default stream priority
                   ||      +---------+
   Low priority    ||      |         |
                   ||      +---------+

                        Figure 3: Stream priority

   The priority management scheme comprises two separate priority
   ranges.  The user-defined priority range includes the streams to
   which the application has explicitly assigned priorities, while the
   default priority range includes the streams with no priorities set by
   the application.  The streams in the default priority range can send
   only when the streams in the user-defined range have no data to send.
   Within the same range, one can use weighted round-robin scheduling,
   in which the higher-priority streams get more quota for data to send
   in each round.  One can also dynamically set or change the priorities
   of the streams in the default priority range to enable
   short-stream-first scheduling if needed.

8.  Congestion control and loss detection

8.1.  Congestion control

   Implementations MAY support coupled congestion controllers such as
   LIA [MPTCP-LIA] and OLIA [MPTCP-OLIA], or support decoupled
   congestion controllers in environments using disjoint network paths.

   In decoupled congestion control, each path runs its own congestion
   controller without interacting with the congestion controllers of
   other paths.  That is to say, with respect to congestion control, a
   path behaves exactly the same as a normal QUIC connection over the
   same network path.
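
   The decoupled case can be sketched as one independent controller
   object per Path Identifier.  The class below is a deliberately
   simplified illustration (slow-start growth on acknowledgement, window
   halving on loss) and is not an algorithm defined by this document;
   the names PathCongestionState, MAX_DATAGRAM_SIZE, and MIN_WINDOW, and
   the constants chosen, are assumptions made for the example.

```python
from dataclasses import dataclass

MAX_DATAGRAM_SIZE = 1200             # assumed datagram size in bytes
MIN_WINDOW = 2 * MAX_DATAGRAM_SIZE   # assumed lower bound on the window


@dataclass
class PathCongestionState:
    """Per-path congestion state; shares nothing with other paths."""
    cwnd: int = 10 * MAX_DATAGRAM_SIZE
    bytes_in_flight: int = 0

    def can_send(self, size: int) -> bool:
        # A path may send only while its own window has room.
        return self.bytes_in_flight + size <= self.cwnd

    def on_sent(self, size: int) -> None:
        self.bytes_in_flight += size

    def on_ack(self, size: int) -> None:
        # Simplified slow-start style growth.
        self.bytes_in_flight = max(0, self.bytes_in_flight - size)
        self.cwnd += size

    def on_loss(self, size: int) -> None:
        # Halve this path's window only; other paths are untouched.
        self.bytes_in_flight = max(0, self.bytes_in_flight - size)
        self.cwnd = max(self.cwnd // 2, MIN_WINDOW)


# One controller per Path Identifier (CID sequence number).
controllers = {1: PathCongestionState(), 2: PathCongestionState()}
```

   With this layout, calling controllers[1].on_loss(...) reduces the
   window of path 1 and leaves controllers[2] unchanged, which is the
   defining property of decoupled control.  A coupled controller such as
   LIA would instead need access to the state of all paths.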

   Each path MAY choose its congestion control algorithm independently.

8.2.  Packet number space and acknowledgements

   Each path has its own packet number space for transmitting 1-RTT
   packets.

   Acknowledgements of Initial and Handshake packets MUST be carried
   using ACK frames, as specified in [QUIC-TRANSPORT].  The ACK frames,
   as defined in [QUIC-TRANSPORT], do not carry path identifiers.  If
   for some reason ACK frames are received in 1-RTT packets while the
   state of multipath negotiation is ambiguous, they MUST be interpreted
   as acknowledging packets sent on path number 0.  After endpoints
   successfully negotiate multipath support, they SHOULD use ACK_MP
   frames instead of ACK frames to signal acknowledgement of 1-RTT
   packets, and also of 0-RTT packets as specified in Section 10.1.

   An ACK_MP frame (Section 9.2) can be returned either on the path
   identified by its Path Identifier or on a different path, depending
   on the strategy used for sending ACK_MP frames.

8.3.  Flow control

   TBD.

9.  New frames

   All the new frames MUST be sent in 1-RTT packets, and MUST NOT use
   other encryption levels.

   If an endpoint receives MP frames in packets of other encryption
   levels, it MUST return MP_PROTOCOL_VIOLATION as a connection error
   and close the connection.

9.1.  PATH_STATUS frame

   PATH_STATUS frames are used by endpoints to inform the peer of the
   current status of one path, and the peer should send packets
   according to the preference expressed in these frames.  In a
   PATH_STATUS frame, an endpoint uses the sequence number of the CID
   used by the peer (describing the sender's path identifier).
   PATH_STATUS frames are formatted as shown in Figure 4.
621 PATH_STATUS Frame { 622 Type (i) = TBD-03 (experiments use 0xbaba03), 623 Path Identifier (i), 624 Path Status sequence number (i), 625 Path Status (i), 626 Path Priority (i), 627 } 629 Figure 4: PATH_STATUS Frame Format 631 PATH_STATUS Frames contain the following fields: 633 Path Identifier: A variable-length integer specifying the path 634 identifier. 636 Path Status sequence number: A variable-length integer specifying the 637 sequence number assigned for this PATH_STATUS frame. There is a 638 different path status sequence number space for each path. 640 Available values of Path Status field are: 642 * 0: Abandon 644 * 1: Standby 646 * 2: Available 648 If the value of Path Status field is 2-available, the receiver side 649 can use the Path Priority field to express the priority weight of a 650 path for the peer. 652 Frames may be received out of order. A peer MUST ignore an incoming 653 PATH_STATUS frame if it previously received another PATH_STATUS frame 654 for the same Path Identifier with a sequence number equal to or 655 higher than the sequence number of the incoming frame. 657 PATH_STATUS frames SHOULD be acknowledged. If a packet containing a 658 PATH_STATUS frame is considered lost, the peer should only repeat it 659 if it was the last status sent for that path -- as indicated by the 660 sequence number. 662 9.2. ACK_MP frame 664 ACK_MP frame allows for acknowledgements on different paths. ACK_MP 665 frame is formatted by adding a Path Identifier field to 666 [QUIC-TRANSPORT] ACK frame. ACK_MP frame is formatted as shown in 667 Figure 5. 669 ACK_MP Frame { 670 Type (i) = TBD-00..TBD-01 (experiments use 0xbaba00..0xbaba01), 671 Path Identifier (i), 672 Largest Acknowledged (i), 673 ACK Delay (i), 674 ACK Range Count (i), 675 First ACK Range (i), 676 ACK Range (..) 
..., 677 [ECN Counts (..)], 678 } 680 Figure 5: ACK_MP Frame Format 682 Type(i) = TBD-00 (experiments use 0xbaba00) , with no ECN Counts 683 Type(i) = TBD-01 (experiments use 0xbaba01) , with ECN Counts 685 9.3. QOE_CONTROL_SIGNALS frame 687 QOE_CONTROL_SIGNALS frame is used to carry quality of experience 688 (QoE) information. A typical use of such information is to provide 689 feedback to help application-aware scheduling. Note that different 690 applications may have very different needs, the interpretation of the 691 QoE control signal can be up to the users. QOE_CONTROL_SIGNALS 692 frames are formatted as shown in Figure 6. 694 QOE_CONTROL_SIGNALS Frame { 695 Type (i) = TBD-02 (experiments use 0xbaba02), 696 Path Identifier (i), 697 QoE Control Signals Length(8), 698 QoE Control Signals (..) 699 } 701 Figure 6: QOE_CONTROL_SIGNALS Frame Format 703 QOE_CONTROL_SIGNALS frames may be received out of order, peers SHOULD 704 pass them to the application as they arrive. Although 705 QOE_CONTROL_SIGNALS frames are not retransmitted upon loss detection, 706 they are ack-eliciting [QUIC-RECOVERY]. 708 10. Implementation Considerations 710 ## Management of acknowledgements delay If implementation uses 711 ACK_FREQUENCY Frame in [QUIC-DELAYED-ACK] to let senders control the 712 frequency of acknowledgements, the same mechanism can be used in 713 multi-path QUIC. There are two parameters in the ACK_FREQUENCY 714 Frame, "Packet Tolerance" and "Update Max Ack Delay". 716 Those two parameters are typically computed in real time based on 717 observed performance: 719 * "Packet Tolerance" is set to a fraction of the congestion window 721 * "Update Max Ack Delay" is set to a fraction of the RTT -- but not 722 smaller than the specified min delay 724 In multi-path QUIC, there are multiple paths with different RTT and 725 different congestion windows. 
   In this draft, it is suggested that implementations can use the
   smallest RTT of the available paths to compute the delay, and use
   the sum of the congestion windows of all available paths (not
   including paths in standby or abandon state) to compute the packet
   tolerance.

10.1.  Handling of 0-RTT packets

   The draft specifies a packet number space for each path.  Because
   multi-path is enabled only after the handshake negotiation
   completes, there will be a separate context for each Connection ID
   after multi-path is negotiated.  0-RTT packets are sent before these
   per-path contexts are established.  To avoid confusion, this draft
   provides a way for implementations to deal with 0-RTT packets that
   is both easy to implement and compatible with [QUIC-TRANSPORT]:

   *  All 0-RTT packets are initially tracked in the "global"
      application context.

   *  On the client side, 0-RTT packets are initially sent in the
      "global" application context.  The handshake concludes before any
      1-RTT packet can be sent or received.  When the handshake
      completes, if multipath is negotiated, the tracking of 0-RTT
      packets moves from the "global" application context to the "path
      0" application context.  That means the sequence number of the
      first 1-RTT packets sent by the client will follow the sequence
      number of the last 0-RTT packet.

   *  On the server side, the negotiation completes after the client
      first flight is received and the server first flight is sent.
      0-RTT packets are received after that.  If multipath is
      negotiated, they are considered received on "path 0".

   In conclusion, 0-RTT packets are tracked and processed with path
   identifier 0.

11.  Security Considerations

   TBD.

12.  IANA Considerations

   This document defines a new transport parameter for negotiating the
   use of multiple paths for QUIC, and three new frame types.
   The draft defines provisional values for experiments, but we expect
   IANA to allocate short values if the draft is approved.

   The following entry in Table 1 should be added to the "QUIC
   Transport Parameters" registry under the "QUIC Protocol" heading.

   +==============================+==================+===============+
   | Value                        | Parameter Name   | Specification |
   +==============================+==================+===============+
   | TBD (experiments use 0xbaba) | enable_multipath | Section 3     |
   +------------------------------+------------------+---------------+

          Table 1: Addition to QUIC Transport Parameters Entries

   The following frame types defined in Table 2 should be added to the
   "QUIC Frame Types" registry under the "QUIC Protocol" heading.

   +====================+=====================+===============+
   | Value              | Frame Name          | Specification |
   +====================+=====================+===============+
   | TBD-00 - TBD-01    | ACK_MP              | Section 9.2   |
   | (experiments use   |                     |               |
   | 0xbaba00-0xbaba01) |                     |               |
   +--------------------+---------------------+---------------+
   | TBD-02             | QOE_CONTROL_SIGNALS | Section 9.3   |
   | (experiments use   |                     |               |
   | 0xbaba02)          |                     |               |
   +--------------------+---------------------+---------------+
   | TBD-03             | PATH_STATUS         | Section 9.1   |
   | (experiments use   |                     |               |
   | 0xbaba03)          |                     |               |
   +--------------------+---------------------+---------------+

             Table 2: Addition to QUIC Frame Types Entries

13.  Changelog

14.  Appendix.A Scenarios related to migration

   In QUIC V1, there are four scenarios related to migration: CID
   renewal, NAT Rebinding, controlled migration, and migration to
   server preferred address.  It would be useful to explain exactly how
   these four scenarios are supported or changed with Multipath QUIC.
   For V1, these scenarios are described as follows:

   *  CID Renewal happens when the client starts using a new CID for
      1-RTT packets, while still using the same four-tuple.  This is
      typically done for privacy, for example after a long period of
      silence.  The expected result is that the server will also use a
      new CID for its next packets.  In that scenario, RTT and
      congestion control parameters remain the same before and after
      migration.

   *  NAT Rebinding happens when a NAT on the path changes its
      mappings.  The server receives packets that bear the same CID as
      previously, but arrive on a different four-tuple.  The
      complication is that this could be an attack in which the
      attacker captures a packet from the client and resends it from a
      different address.  The server is expected to perform continuity
      tests for both the old and the new path, typically using a
      different CID for the new path.  If the continuity test on the
      new path succeeds before the one on the old path, the server
      migrates to the new path; otherwise it continues using the old
      path and ignores the new path.

   *  Controlled migration happens when a client tests a new path.  The
      server receives packets that bear a new CID and arrive on a new
      four-tuple.  The server responds to the path challenge and
      performs its own continuity test on the new path.  If the client
      sends non-path-validation packets on the new path, the server
      switches to sending on the new path and discards the old path.

   *  Preferred address migration happens when the server sends the
      preferred address transport parameter during the exchange.  The
      client performs a controlled migration to the new path, and if
      that is successful, discards the old path.
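
   The scenarios above, together with the case where nothing changes,
   can be told apart from three observations about an incoming packet:
   whether its CID is new, whether its four-tuple is new, and whether
   the new address matches the server preferred address.  The Python
   sketch below is illustrative only.  The names Scenario and classify
   and the boolean parameters are hypothetical and are not defined by
   this draft or by [QUIC-TRANSPORT]; the sketch assumes the endpoint
   has already determined those three facts.

```python
from enum import Enum


class Scenario(Enum):
    """Migration-related scenarios in QUIC V1 (names are illustrative)."""
    NOT_A_MIGRATION = "Not a migration"
    NAT_REBINDING = "NAT Rebinding"
    CID_RENEWAL = "CID Renewal"
    PREFERRED_ADDRESS = "Migration to Preferred Address"
    CONTROLLED_MIGRATION = "Controlled Migration"


def classify(new_cid: bool, new_four_tuple: bool,
             matches_preferred_address: bool = False) -> Scenario:
    """Map what changed on an incoming packet to a scenario.

    matches_preferred_address only matters when both the CID and the
    four-tuple are new.
    """
    if not new_cid and not new_four_tuple:
        return Scenario.NOT_A_MIGRATION
    if not new_cid:
        # Same CID arriving on a different four-tuple.
        return Scenario.NAT_REBINDING
    if not new_four_tuple:
        # New CID on the unchanged four-tuple.
        return Scenario.CID_RENEWAL
    if matches_preferred_address:
        return Scenario.PREFERRED_ADDRESS
    return Scenario.CONTROLLED_MIGRATION
```

   For example, classify(new_cid=False, new_four_tuple=True) yields
   Scenario.NAT_REBINDING, the case in which the server is expected to
   validate the new path before using it.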

   We could sum up these scenarios in the following table:

   +=====+=========+===================+====================+
   | CID | 4-tuple | preferred address | result             |
   +=====+=========+===================+====================+
   | Old | Old     | -                 | Not a migration.   |
   +-----+---------+-------------------+--------------------+
   | Old | New     | -                 | NAT Rebinding.     |
   +-----+---------+-------------------+--------------------+
   | New | Old     | -                 | CID Renewal.       |
   +-----+---------+-------------------+--------------------+
   | New | New     | matches PFA       | Migration to       |
   |     |         |                   | Preferred Address. |
   +-----+---------+-------------------+--------------------+
   | New | New     | other             | Controlled         |
   |     |         |                   | Migration.         |
   +-----+---------+-------------------+--------------------+

                Table 3: Scenarios related to migration

   The expectation in those scenarios is:

   +==============+============================================+
   | Scenario     | Expectation                                |
   +==============+============================================+
   | Not a        | Continue using existing path               |
   | migration    |                                            |
   +--------------+--------------------------------------------+
   | NAT          | After validation, use new path and discard |
   | Rebinding    | previous path.                             |
   +--------------+--------------------------------------------+
   | CID Renewal  | Create new path with new CIDs, discard old |
   |              | path. Reuse RTT and CC parameters.         |
   +--------------+--------------------------------------------+
   | Controlled   | Create new path with new CIDs. Server      |
   | Migration    | creates a new path, ready to use both      |
   |              | paths. Client may later discard old path.  |
   +--------------+--------------------------------------------+
   | Migration to | Same as Controlled Migration, but the      |
   | Preferred    | client is expected to abandon the old path |
   | Address      |                                            |
   +--------------+--------------------------------------------+

         Table 4: Expectation in scenarios related to migration

   In multipath QUIC, the client or server creates a new path and
   abandons the old path to achieve exactly the same result as
   connection migration in the previous scenarios.

15.  Appendix.B Considerations on RTT estimate and loss detection

   QUIC implementations use RTT estimates in many ways:

   *  For loss detection, RTT estimates are used to evaluate how long
      to wait for an acknowledgement before a packet is declared lost.

   *  Several congestion control algorithms (e.g. LEDBAT, VEGAS,
      HYSTART) use variations of the RTT above the minimum value to
      detect the beginning of congestion.

   *  BBR uses the minimal RTT to compute the minimal size of the
      congestion window for a target data rate.

   *  ACK delays are often set as a fraction of the RTT.

   In a multipath environment, the RTT can be estimated each time a new
   packet is acknowledged.  However, the observed RTT will vary not
   only based on the state of the send path, but also based on the
   choice of the return path used for acknowledgements.  Each RTT
   measurement will be the sum of the one-way delay on the send path
   and the one-way delay on the return path.  This has a number of
   implications for the different ways of using the RTT presented
   above:

   *  If the goal is to detect possible losses, it is probably
      sufficient to consider all RTT measurements for a given path.
      Classic formulas like adding smoothed RTT and a number of
      deviations aim at estimating a reasonable upper bound of the
      acknowledgement delays.
      Statistics on observed acknowledgement delays will provide a
      valid estimate, regardless of the selection of the return path by
      the peer.

   *  If the goal is to detect the onset of congestion and tune a
      congestion control algorithm, the variations of delays due to the
      choices of return paths will be a source of errors.
      Implementations will need to pick a strategy, such as only
      considering acknowledgements received through the "fastest"
      return path, or maybe those received through the matching
      four-tuple for the sending path.  An alternative would be to use
      time stamps to directly estimate variations of the one-way
      delays.  [QUIC-Timestamp] provides good support for such
      one-way-delay computation.

   *  If BBR is in use and ACKs are returned on different paths, it may
      cause an ambiguity issue with the computation of the
      bandwidth-delay product (BDP).  In BBR, BDP is used to limit the
      number of inflight packets.  One may choose to use the smallest
      RTT measured to compute BDP.  However, if the majority of ACKs
      are returned from a high-latency path, the cwnd = cwnd_gain *
      bandwidth * min_rtt may be lower than what is needed to achieve
      good performance.  One possible solution is to transmit a new
      packet and its ACK on the same path.  Other possible solutions
      may include transmitting ACKs on the shortest path with relative
      increase of cwnd_gain.  For the time being, we think this is a
      research problem and it is up to the implementers to pick the
      best solution.

16.  Appendix.C Difference from past proposals

   This proposal differs from past proposals
   [I-D.deconinck-quic-multipath] in two fundamental respects:

   *  The multi-path QUIC is built on top of the concept of
      bidirectional paths, which readily fits the nature of both
      cellular and Wi-Fi links that cover the majority of multi-path
      applications in QUIC, while keeping the design simple and easy to
      implement.  In doing so, we are able to re-use most of the
      current QUIC transport design with the sole addition of three new
      frames.

   *  The multi-path QUIC design enables a feedback-based dynamic
      scheduling strategy.  As the major goal of multi-path QUIC is to
      enhance performance in mobile applications, where the sender and
      receiver may have different viewpoints about the fast-changing
      wireless connectivity, especially in high-mobility scenarios, the
      proposed design allows the sender and receiver to synchronize
      their viewpoints via message exchange in ACK packets in order to
      maximize performance.

17.  References

17.1.  Normative References

   [QUIC-DELAYED-ACK]
              Iyengar, J., Ed. and I. Swett, Ed., "Sender Control of
              Acknowledgement Delays in QUIC", Work in Progress,
              Internet-Draft, draft-iyengar-quic-delayed-ack-02.

   [QUIC-LB]  Duke, M., Ed. and N. Banks, Ed., "QUIC-LB: Generating
              Routable QUIC Connection IDs", Work in Progress,
              Internet-Draft, draft-ietf-quic-load-balancers.

   [QUIC-RECOVERY]
              Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection
              and Congestion Control", Work in Progress, Internet-Draft,
              draft-ietf-quic-recovery.

   [QUIC-TLS] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure
              QUIC", Work in Progress, Internet-Draft,
              draft-ietf-quic-tls.

   [QUIC-TRANSPORT]
              Iyengar, J., Ed. and M.
              Thomson, Ed., "QUIC: A UDP-Based Multiplexed and Secure
              Transport", Work in Progress, Internet-Draft,
              draft-ietf-quic-transport.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

17.2.  Informative References

   [I-D.deconinck-quic-multipath]
              Coninck, Q. and O. Bonaventure, "Multipath Extensions for
              QUIC (MP-QUIC)", Work in Progress, Internet-Draft,
              draft-deconinck-quic-multipath-06, 2 November 2020.

   [MPTCP-LIA]
              Raiciu, C., Handley, M., and D. Wischik, "Coupled
              Congestion Control for Multipath Transport Protocols",
              October 2011.

   [MPTCP-OLIA]
              Khalili, R., Gast, N., and J. Boudec, "Opportunistic
              Linked-Increases Congestion Control Algorithm for MPTCP",
              July 2014.

   [QUIC-Timestamp]
              Huitema, C., "Quic Timestamps For Measuring One-Way
              Delays", August 2020.

Authors' Addresses

   Yanmei Liu
   Alibaba Inc.

   Email: miaoji.lym@alibaba-inc.com


   Yunfei Ma
   Alibaba Inc.

   Email: yunfei.ma@alibaba-inc.com


   Christian Huitema
   Private Octopus Inc.

   Email: huitema@huitema.net


   Qing An
   Alibaba Inc.

   Email: anqing.aq@alibaba-inc.com


   Zhenyu Li
   ICT-CAS

   Email: zyli@ict.ac.cn