idnits 2.17.1 draft-liu-multipath-quic-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 1) being 1012 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([QUIC-TRANSPORT]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (16 December 2020) is 1226 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '0' on line 225 == Missing Reference: 'X' is mentioned on line 310, but not defined == Missing Reference: 'Y' is mentioned on line 312, but not defined -- Looks like a reference, but probably isn't: '1' on line 317 -- Looks like a reference, but probably isn't: '2' on line 315 == Outdated reference: A later version (-07) exists of draft-deconinck-quic-multipath-06 Summary: 1 error (**), 0 flaws (~~), 5 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 QUIC Y. Liu 3 Internet-Draft Y. Ma 4 Intended status: Standards Track Alibaba Inc. 5 Expires: 19 June 2021 C. Huitema 6 Private Octopus Inc. 7 Q. An 8 Alibaba Inc. 9 Z. Li 10 ICT-CAS 11 16 December 2020 13 Multipath Extension for QUIC 14 draft-liu-multipath-quic-02 16 Abstract 18 This document specifies multipath extension for the QUIC protocol to 19 enable the simultaneous usage of multiple paths for a single 20 connection. The extension is compliant with the single-path QUIC 21 design. The design principle is to support multipath by adding 22 limited extension to [QUIC-TRANSPORT]. 24 Status of This Memo 26 This Internet-Draft is submitted in full conformance with the 27 provisions of BCP 78 and BCP 79. 29 Internet-Drafts are working documents of the Internet Engineering 30 Task Force (IETF). Note that other groups may also distribute 31 working documents as Internet-Drafts. The list of current Internet- 32 Drafts is at https://datatracker.ietf.org/drafts/current/. 34 Internet-Drafts are draft documents valid for a maximum of six months 35 and may be updated, replaced, or obsoleted by other documents at any 36 time. It is inappropriate to use Internet-Drafts as reference 37 material or to cite them other than as "work in progress." 39 This Internet-Draft will expire on 19 June 2021. 41 Copyright Notice 43 Copyright (c) 2020 IETF Trust and the persons identified as the 44 document authors. All rights reserved. 46 This document is subject to BCP 78 and the IETF Trust's Legal 47 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 48 license-info) in effect on the date of publication of this document. 49 Please review these documents carefully, as they describe your rights 50 and restrictions with respect to this document. 
Code Components 51 extracted from this document must include Simplified BSD License text 52 as described in Section 4.e of the Trust Legal Provisions and are 53 provided without warranty as described in the Simplified BSD License. 55 Table of Contents 57 1. Introduction 58 2. Conventions and Definitions 59 3. Enable Multipath QUIC - Handshake 60 4. Path Management 61 4.1. Path Identifier and Connection ID 62 4.2. Path Packet Number Spaces 63 4.3. Path Initiation 64 4.4. Path State Management 65 4.5. Path Close 66 4.5.1. Use PATH_STATUS frame to close a path 67 4.5.2. Use RETIRE_CONNECTION_ID frame to close a path 68 4.5.3. Idle timeout 69 5. Using TLS to Secure QUIC Multipath 70 5.1. Packet protection for QUIC Multipath 71 5.2. Key Update for QUIC Multipath 72 6. Using Multipath QUIC with load balancers 73 7. Packet scheduling 74 7.1. Basic Scheduling 75 7.2. Scheduling with QoE Feedback 76 7.3. Per-stream Policy 77 8. Congestion control and loss detection 78 8.1. Congestion control 79 8.2. Packet number space and acknowledgements 80 8.3. Flow control 81 9. New frames 82 9.1. PATH_STATUS frame 83 9.2. ACK_MP frame 84 9.3. QOE_CONTROL_SIGNALS frame 85 10. Security Considerations 86 11. IANA Considerations 87 12. Changelog 88 13. Appendix.A Scenarios related to migration 89 14. Appendix.B Considerations on RTT estimate and loss detection 90 15. Appendix.C Difference from past proposals 91 16. References 92 16.1. Normative References 93 16.2. Informative References 94 Authors' Addresses 96 1. Introduction 98 In this document, we propose an extension to the current QUIC design 99 to enable the simultaneous usage of multiple paths for a single 100 connection. 102 This proposal is based on several basic design points: 104 * Re-use as much as possible mechanisms of QUIC-v1, which has 105 supported connection migration and path validation. 
107 * To avoid the risk of packets being dropped by middleboxes (which 108 may only support QUIC V1), use the same packet header formats as 109 QUIC V1. 111 * Endpoints need a Path Identifier for each different path, which is 112 used to track the state of packets. As we want to keep the packet 113 header formats unchanged [QUIC-TRANSPORT], Connection IDs (and the 114 sequence numbers of Connection IDs) are a good choice of Path 115 Identifier. 117 * For the convenience of packet loss detection and recovery, 118 endpoints use a different packet number space for each Path 119 Identifier. 121 * Congestion control, RTT measurements and PMTU discovery should be 122 per-path (following [QUIC-TRANSPORT]). 124 This document is organized as follows. It first provides definitions 125 for Multipath QUIC in Section 2. It then specifies how to enable 126 Multipath QUIC during the handshake in Section 3, and path management in 127 Section 4. It discusses packet scheduling in Section 7, and 128 congestion control in Section 8. The new frames are defined in 129 Section 9. 131 2. Conventions and Definitions 133 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 134 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 135 "OPTIONAL" in this document are to be interpreted as described in BCP 136 14 [RFC2119] [RFC8174] when, and only when, they appear in all 137 capitals, as shown here. 139 We assume that the reader is familiar with the terminology used in 140 [QUIC-TRANSPORT]. In addition, we define the following terms: 142 * Path Identifier: An identifier that is used to identify a path in 143 a QUIC connection at an endpoint. It is defined as the sequence 144 number of the destination Connection ID used for sending packets 145 on that particular path. 147 * Each node maintains a list of "Received Packets" for each of the 148 CIDs that it provided to the peer, which is used for acknowledging 149 packets received with that CID. 151 3. 
Enable Multipath QUIC - Handshake 153 This extension defines a new transport parameter, used to negotiate 154 the use of the multipath extension during the connection handshake, 155 as specified in [QUIC-TRANSPORT]. The new transport parameter is 156 defined as follows: 158 * name: enable_multipath (TBD - experiments use 0xbaba) 160 * value: 0 (default) for disabled, 1 for enabled 162 If the peer does not send the enable_multipath (TBD - experiments use 163 0xbaba) transport parameter, the peer does not support 164 multipath; the endpoint MUST fall back to [QUIC-TRANSPORT] with a single 165 path, MUST NOT send any MP frames in subsequent packets, and 166 MUST NOT use the multipath-specific AEAD algorithm defined in 167 Section 5.1. 169 Note that the transport parameter "active_connection_id_limit" 170 [QUIC-TRANSPORT] limits the number of usable Connection IDs, and therefore also 171 limits the number of concurrent paths. 173 4. Path Management 175 After both endpoints have negotiated the multipath feature during the 176 handshake, they can start using multiple paths. 178 This proposal adds one frame for path management: 180 * PATH_STATUS frame for the receiver side to claim the path state 181 and preference 183 All the new MP frames are sent in 1-RTT packets [QUIC-TRANSPORT]. 185 4.1. Path Identifier and Connection ID 187 Endpoints need a Path Identifier for each different path, which is 188 used to track the state of packets. Endpoints use the Connection IDs in 189 1-RTT packet headers as Path Identifiers in each direction, and use 190 the sequence numbers of Connection IDs in MP frames to identify the 191 path referred to. 193 Following [QUIC-TRANSPORT], each endpoint uses NEW_CONNECTION_ID 194 frames to claim usable Connection IDs for itself. Before an 195 endpoint adds a new path, it SHOULD check whether there is at least 196 one unused available Connection ID for each side. 
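The check described above can be sketched in Python as follows. This is a minimal illustration and not part of the draft: the names CidPool and can_open_path are hypothetical, and the sketch assumes that the number of concurrent paths is bounded by the negotiated "active_connection_id_limit".

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CidPool:
    """Connection IDs issued by one endpoint, keyed by sequence number (hypothetical helper)."""
    issued: dict[int, bytes] = field(default_factory=dict)
    in_use: set[int] = field(default_factory=set)  # sequence numbers already bound to a path

    def unused(self) -> list[int]:
        # Sequence numbers that were issued but are not yet bound to any path.
        return sorted(seq for seq in self.issued if seq not in self.in_use)


def can_open_path(local: CidPool, remote: CidPool,
                  active_paths: int, cid_limit: int) -> bool:
    """Return True if a new path may be started.

    Assumption: a new path needs one unused CID on each side, and the number
    of paths may not exceed active_connection_id_limit (cid_limit).
    """
    if active_paths >= cid_limit:
        return False
    return bool(local.unused()) and bool(remote.unused())
```

For example, with two CIDs issued locally, three issued by the peer, and one path in use on each side, can_open_path returns True while fewer paths are active than the limit allows.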
198 Endpoints can find which path a received packet belongs to according 199 to the Destination Connection ID of the 1-RTT packet. Endpoints can 200 find the context of a path by its Connection ID or by the sequence 201 number of that Connection ID. 203 4.2. Path Packet Number Spaces 205 For the convenience of packet loss detection and recovery, endpoints 206 use a different packet number space for each Path Identifier 207 (Connection ID). The ACK_MP frame includes the sequence number of the 208 Destination Connection ID of the acknowledged packets as the Path 209 Identifier. 211 4.3. Path Initiation 213 Figure 1 illustrates an example of new path establishment. 215 Client Server 217 (Exchanges start on default path) 218 1-RTT[]: NEW_CONNECTION_ID[C1, Seq=1] --> 219 <-- 1-RTT[]: NEW_CONNECTION_ID[S1, Seq=1] 220 <-- 1-RTT[]: NEW_CONNECTION_ID[S2, Seq=2] 221 ... 222 (starts new path) 223 1-RTT[0]: DCID=S2, PATH_CHALLENGE[X] --> 224 Checks AEAD using nonce(CID sequence 2, PN 0) 225 <-- 1-RTT[0]: DCID=C1, PATH_RESPONSE[X], PATH_CHALLENGE[Y], 226 ACK_MP[Seq=1,PN=0] 227 Checks AEAD using nonce(CID sequence 1, PN 0) 228 1-RTT[1]: DCID=S2, PATH_RESPONSE[Y], 229 ACK_MP[Seq=1, PN=0], ... --> 231 Figure 1: Example of new path establishment 233 As shown in Figure 1, the client provides one unused available Connection 234 ID (C1 with sequence number 1), and the server provides two available 235 Connection IDs (S1 with sequence number 1, and S2 with sequence 236 number 2). When the client wants to start a new path, it checks whether 237 there are unused available Connection IDs for each side, and chooses an 238 available Connection ID, S2, as the Destination Connection ID of the 239 new path. 241 Endpoints need to exchange unused available Connection IDs with the 242 NEW_CONNECTION_ID frame before an endpoint starts a new path. For 243 example, if the goal is to maintain 2 paths, each endpoint should 244 provide at least 3 CIDs to its peer: 2 in use, and one spare. 
If the 245 client has used all the allocated CIDs, it is supposed to retire those 246 that are not used anymore, and the server is supposed to provide 247 replacements, as specified in [QUIC-TRANSPORT]. 249 If the transport parameter "active_connection_id_limit" is negotiated 250 as N, and the server has provided N Connection IDs and the client has 251 started N paths, the limit is reached. If the client wants to start 252 a new path, it has to retire one of the established paths. 254 Path validation uses the PATH_CHALLENGE and PATH_RESPONSE frames 255 defined in [QUIC-TRANSPORT]. 257 4.4. Path State Management 259 An endpoint uses PATH_STATUS frames to inform the peer that it should 260 send packets according to the preference expressed in these frames. An 261 endpoint uses the sequence number of the CID used by the peer for 262 PATH_STATUS frames (describing the sender's path identifier). 264 In the example of Figure 1, if the client wants to send a PATH_STATUS 265 frame to tell the server that it prefers the path with CID sequence 266 number 1 (of the server's side), the client should use the identifier 267 of the server (sequence 1) in the PATH_STATUS frame. 269 The PATH_STATUS frame describes 4 kinds of path states: 271 * Abandon a path, and release the corresponding resource. 273 * Mark a path as "available", i.e., allow the peer to use its own 274 logic to split traffic among available paths. 276 * Mark a path as "standby", i.e., suggest that no traffic should be 277 sent on that path if another path is available. 279 * Mark the priority of a path. For example, if path 1 has weight 8 and path 2 has 280 weight 2, this suggests that path 1 has higher priority than path 2, and that the 281 peer should try to send more data on path 1. 283 A PATH_STATUS frame can be sent via a different path, instead of the 284 path identified by the Path Identifier field. 286 4.5. 
Path Close 288 An endpoint that wants to delete a path SHOULD NOT rely on implicit 289 signals like idle time or packet losses, but instead SHOULD use 290 explicit signals like retiring a Connection ID or asking to abandon the 291 path. 293 4.5.1. Use PATH_STATUS frame to close a path 295 Both client and server can close a path by sending a PATH_STATUS frame 296 which abandons the path with a corresponding Path Identifier. Once a 297 path is marked as "abandon", it means that the resources related to 298 the path can be released. 300 Figure 2 illustrates an example of path closing. In this case, the 301 path identifier used by the server is CID C1, sequence number of CID 302 is 1; the path identifier used by the client is CID S2, sequence 303 number of CID is 2. (For the convenience of distinguishing the CID 304 sequence number and the PATH_STATUS sequence number, we call the 305 "PATH_STATUS sequence number" the "PSSN".) 307 Client Server 309 (client tells server to abandon a path) 310 1-RTT[X]: PATH_STATUS[id=1, PSSN1, status=abandon, pri.=0] -> 311 (server tells client to abandon a path) 312 <- 1-RTT[Y]: PATH_STATUS[id=2, PSSN2, status=abandon, pri.=0], 313 ACK_MP[Seq=1, PN=X] 314 (client abandons the path that it is using) 315 1-RTT[X+1]: RETIRE_CONNECTION_ID[2], ACK_MP[Seq=2, PN=Y] -> 316 (server abandons the path that it is using) 317 <- 1-RTT[Y+1]: RETIRE_CONNECTION_ID[1], ACK_MP[Seq=1, PN=X+1] 319 Figure 2: Example of closing a path 321 In scenarios where the client detects a change in the network environment 322 (the client's 4G/Wi-Fi is turned off, or the Wi-Fi signal fades to a 323 threshold), or where endpoints detect that the RTT or loss rate 324 is becoming worse, the client or server can terminate a path immediately. 326 4.5.2. Use RETIRE_CONNECTION_ID frame to close a path 328 A sender can close a path by retiring the associated Connection ID. 329 The RETIRE_CONNECTION_ID frame can be sent on any path. 
331 Receiving a RETIRE_CONNECTION_ID frame causes the endpoint to discard 332 the resources associated with that Connection ID. If the Connection 333 ID was used by the peer to identify a path from the peer to this 334 endpoint, the resources include the list of received packets used to 335 send acknowledgements. There is no reason for the endpoint to send a 336 PATH_STATUS(abandon) for that path, since the peer has already 337 abandoned it. An endpoint SHOULD only send RETIRE_CONNECTION_ID to 338 the peer if all packets sent with that CID are either acknowledged or 339 considered lost. 341 This has no direct effect on reverse paths from this endpoint to the 342 peer. If the peer wants to direct the endpoint to abandon such 343 paths, it should send PATH_STATUS(abandon) frames for the relevant 344 paths. 346 4.5.3. Idle timeout 348 [QUIC-TRANSPORT] allows for closing of connections if they stay idle 349 for too long. The connection idle timeout in multipath QUIC is 350 defined as "no packet received on any path for the duration of the 351 idle timeout". It means that if all paths remain idle for the idle 352 timeout, the connection is implicitly closed. 354 5. Using TLS to Secure QUIC Multipath 356 In order to facilitate loss detection and recovery when sending data 357 over multiple paths, this specification defines how packets sent over 358 multiple paths use different packet number spaces. This requires 359 changes in the way AEAD is applied for packet protection, as 360 explained in Section 5.1, and tighter constraints for key updates, as 361 explained in Section 5.2. 363 5.1. Packet protection for QUIC Multipath 365 Packet protection for QUIC V1 is specified in Section 5 of 366 [QUIC-TLS]. The general principles of packet protection are not 367 changed for QUIC Multipath. 
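As a worked illustration of the nonce construction that this section goes on to specify, the following Python sketch builds the 96-bit path-and-packet-number (32-bit Connection ID sequence number, two zero bits, 62-bit packet number, all in network byte order), left-pads it to the IV length and XORs it with the IV. The function name multipath_nonce is hypothetical. With IV 6b26114b9cba2b63a9e8dd4f, Connection ID sequence number 3 and packet number 0xaead, the byte-wise XOR yields 6b2611489cba2b63a9e873e2.

```python
def multipath_nonce(iv: bytes, cid_seq: int, packet_number: int) -> bytes:
    """Sketch of the Section 5.1 nonce: IV XOR left-padded path-and-packet-number."""
    if len(iv) < 12:
        raise ValueError("IV must be at least 96 bits")
    if not 0 <= cid_seq < 2**32:
        raise ValueError("CID sequence number must fit in 32 bits")
    if not 0 <= packet_number < 2**62:
        raise ValueError("packet number must fit in 62 bits")
    # 4 bytes of CID sequence number, then 8 bytes holding two zero bits
    # followed by the 62-bit packet number, both big-endian.
    ppn = cid_seq.to_bytes(4, "big") + packet_number.to_bytes(8, "big")
    padded = ppn.rjust(len(iv), b"\x00")  # left-pad with zeros to the IV size
    return bytes(a ^ b for a, b in zip(iv, padded))


iv = bytes.fromhex("6b26114b9cba2b63a9e8dd4f")
print(multipath_nonce(iv, 3, 0xAEAD).hex())  # 6b2611489cba2b63a9e873e2
```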
No changes are needed for setting packet 368 protection keys, initial secrets, header protection, use of 0-RTT 369 keys, receiving out-of-order protected packets, receiving protected 370 packets, or retry packet integrity. However, the use of multiple 371 number spaces for 1-RTT packets requires changes in AEAD usage. 373 Section 5.3 of [QUIC-TLS] specifies AEAD usage, and in particular the 374 use of a nonce, N, formed by combining the packet protection IV with 375 the packet number. QUIC multipath uses multiple packet number 376 spaces, and thus the packet number alone would not guarantee the 377 uniqueness of the nonce. In order to guarantee this uniqueness, we 378 construct the nonce N by combining the packet protection IV with the 379 packet number and with the identifier of the path, which for 1-RTT 380 packets is the Sequence Number of the Destination Connection ID 381 present in the packet header, as defined in Section 5.1.1 of 382 [QUIC-TRANSPORT], or zero if the Connection ID is zero-length. 383 Section 19 of [QUIC-TRANSPORT] encodes this Connection ID Sequence 384 Number as a variable-length integer, allowing values up to 2^62-1; 385 for QUIC multipath, we require that these values be no larger than 386 2^32-1. 388 For QUIC multipath, the construction of the nonce starts with the 389 construction of a 96 bit path-and-packet-number, composed of the 32 390 bit Connection ID Sequence Number in network byte order, two zero bits, and 391 the 62 bits of the reconstructed QUIC packet number in network byte 392 order. If the IV is larger than 96 bits, the path-and-packet-number is 393 left-padded with zeros to the size of the IV. The exclusive OR of 394 the padded path-and-packet-number and the IV forms the AEAD nonce. 396 For example, assuming the IV value is "6b26114b9cba2b63a9e8dd4f", the 397 connection ID sequence number is "3", and the packet number is 398 "aead", the nonce will be set to "6b2611489cba2b63a9e873e2". 400 5.2. 
Key Update for QUIC Multipath 402 The Key Phase bit update process for QUIC V1 is specified in 403 Section 6 of [QUIC-TLS]. The general principles of key update are 404 not changed for Multipath QUIC. Following QUIC V1, the Key Phase bit 405 is used to indicate which packet protection keys are used to protect 406 the packet. The Key Phase bit is toggled to signal each subsequent 407 key update. Because of network delays, packets protected with the 408 older key might arrive later than the packets protected with the new 409 key. Therefore, the endpoint needs to retain old packet keys to 410 allow these delayed packets to be processed and it must distinguish 411 between the new key and the old key. In QUIC V1, this is done using 412 packet numbers so that the rule is made simple: Use the older key if 413 the packet number is lower than any packet number from the current key 414 phase. 416 In QUIC multipath, some care is needed when initiating a key 417 update. Because different paths use different packet number spaces 418 but share a single key, when a key update is initiated on one path, 419 the other paths need to know when the transition is 420 complete. Otherwise, it is possible that the other paths send 421 packets with the old keys, but skip sending any packets in the 422 current key phase and directly jump to sending packets in the next key 423 phase. When that happens, as the endpoint can only retain two sets 424 of packet protection keys with the 1-bit Key Phase bit, the other 425 paths cannot distinguish which key should be used to decrypt received 426 packets, which results in a key rotation synchronization problem. 428 To address this synchronization issue, in QUIC multipath, if a key 429 update is initiated on one path, the sender should send at least one 430 packet with the new key on all active paths. 
When 431 responding to a key update, the endpoint MUST NOT initiate a 432 subsequent key update until a packet with the current key has been 433 acknowledged on each path. 435 Following Section 5.4 of [QUIC-TLS], the Key Phase bit is 436 protected, so sending multiple packets with the Key Phase bit flipped at 437 the same time should not cause a linkability issue. 439 6. Using Multipath QUIC with load balancers 441 This specification follows the Connection ID negotiation defined in 442 [QUIC-TRANSPORT]. For stateless or low-state load balancers 443 supporting Multipath QUIC, implementations SHOULD use the 444 specification of Connection ID generation and load balancer routing 445 defined in [QUIC-LB], to guarantee that packets with Connection IDs 446 belonging to the same connection can be routed to the same server. 448 7. Packet scheduling 450 7.1. Basic Scheduling 452 For an outgoing packet, the packet scheduler decides on which path the 453 packet shall be transmitted. A basic static scheduling strategy 454 consists of four major components: 456 1. Path state: A scheduler may want to decide which path shall be 457 activated to transmit data. For instance, a scheduler can choose 458 to use only one of the two paths and completely ignore the other 459 one. A scheduler marks the selected paths to be in the 460 "available" state and the un-selected ones in the "standby" 461 state. 463 2. Path priority: The costs of transmitting data 464 over different paths are not always equal. For example, the 465 energy (battery) costs over a 5G path and a Wi-Fi path are very 466 different. In another example, transmissions over a Wi-Fi path 467 and a cellular path may incur different charges per packet. Note 468 that a user's preference may change over time. For instance, 469 certain mobile carriers offer unlimited free data for a 470 particular streaming app. Therefore, the path priority should be 471 made available in the scheduler. 473 3. 
Path selection algorithm: A selection algorithm splits packets 474 across different paths and determines the order of paths to be 475 selected. The selection algorithm takes congestion controller 476 states as inputs, such as smoothed RTTs (sRTTs), estimated 477 bandwidths (eBWs) and congestion window sizes (CWNDs), as well as 478 application-defined information such as path priorities and path 479 states. The output of the algorithm is an ordered list of paths 480 to put a packet on. Some of the commonly used 481 algorithms are: - Round-Robin: There is no priority. It selects 482 paths one by one in order to transmit data. - Lowest-RTT: It 483 first chooses the path with the lowest RTT and feeds packets to 484 it until that path's congestion window is full. Then it chooses 485 the path with the second lowest RTT. - Highest-Sending-Rate: It 486 first chooses the path with the highest bandwidth and feeds 487 packets to it until that path's congestion window is full. Then 488 it chooses the path with the second highest bandwidth. 490 4. Packet redundancy: One major challenge in multi-path transmission 491 is that a packet loss on the slow path might block the overall 492 transmission when packets are split across fast-changing paths. 493 As the path selection algorithm takes inputs from congestion 494 controllers on predictions of the network which may not be 495 accurate enough for fast-changing wireless channels, such an 496 imprecise estimation could lead to network overuse/underuse. A 497 solution to this problem is to implement a packet redundancy 498 strategy. A redundancy strategy can be applied to only ACK 499 packets (partial redundancy) or all data packets (full 500 redundancy). It is up to the application to determine whether, 501 when, and on which packets to activate redundancy. 503 The path state and path priority are managed by the PATH_STATUS frame. 
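The Lowest-RTT algorithm listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not a normative scheduler: the PathState fields and the function lowest_rtt_order are hypothetical names, and the sketch interprets "standby" as "use only when no path in the available state exists".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PathState:
    path_id: int           # CID sequence number used as Path Identifier
    srtt_ms: float         # smoothed RTT reported by the congestion controller
    cwnd: int              # congestion window, in bytes
    bytes_in_flight: int   # bytes sent but not yet acknowledged or declared lost
    standby: bool = False  # True if the path was marked "standby"


def lowest_rtt_order(paths: list[PathState]) -> list[int]:
    """Return path identifiers in the order a Lowest-RTT scheduler would try them."""
    # Prefer paths in the "available" state; fall back to standby paths
    # only when no available path exists (assumption of this sketch).
    preferred = [p for p in paths if not p.standby]
    candidates = preferred if preferred else list(paths)
    # Skip paths whose congestion window is already full.
    open_paths = [p for p in candidates if p.bytes_in_flight < p.cwnd]
    return [p.path_id for p in sorted(open_paths, key=lambda p: p.srtt_ms)]
```

A Highest-Sending-Rate variant would differ only in the sort key, using an estimated bandwidth in descending order instead of the smoothed RTT.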
504 The path selection algorithm and packet redundancy are application 505 related and should be controlled by the application. 507 7.2. Scheduling with QoE Feedback 509 Applications may have completely different QoE requirements---the 510 interactive applications are delay sensitive, while the video 511 streaming applications are more throughput sensitive. There is thus 512 a trend of cross-layer design that takes applications' demands into 513 account when managing paths or scheduling packets. The QoE feedback 514 is used to fully support application-awareness in multipath 515 scheduling and is carried in the QOE_CONTROL_SIGNALS frames (Figure 6). 516 The QOE_CONTROL_SIGNALS frames can include general application-level 517 information that is needed by the schedulers. The frequency of such 518 feedback should be controlled to limit the number of extra packets. 519 The QoE control signal allows a synchronization of viewpoints between 520 two endhosts. It is up to the application to determine the 521 interpretation of QoE control signals. 523 7.3. Per-stream Policy 525 As QUIC supports stream multiplexing, streams are allowed to 526 carry stream priorities that express the application's intent. For 527 instance, objects in a web page may be dependent on others and thus 528 have different priorities in the Multipath QUIC scheduler. A stream 529 priority-aware packet scheduling algorithm will improve the 530 performance notably. 532 High priority /\ +---------+ 533 || | | 534 || +---------+ 535 || +---------+ 536 || | | 537 || +---------+ 538 || ... User-defined stream priority 539 || +---------+ 540 Low priority || | | 541 || +---------+ 542 ----------------------------------------------------------- 543 High priority /\ +---------+ 544 || | | 545 || +---------+ 546 || +---------+ 547 || | | 548 || +---------+ 549 || ... 
Default stream priority 550 || +---------+ 551 Low priority || | | 552 || +---------+ 554 Figure 3: Stream priority 556 The priority management scheme comprises two separate priority 557 ranges. The user-defined priority range includes those streams for which 558 the application explicitly designates priorities, while the default 559 priority range includes the streams with no priorities set by the 560 application. The streams in the default priority range can send only 561 when the streams in the user-defined range have no data to send. 562 Within the same range, one can use weighted round-robin for 563 scheduling---the higher-priority streams get more quota for data to 564 send in each round. One can also dynamically set or change the 565 priorities of the streams in the default priority range to enable 566 short stream first if needed. 568 8. Congestion control and loss detection 570 8.1. Congestion control 572 Implementations MAY support coupled congestion controllers such as 573 LIA [MPTCP-LIA] and OLIA [MPTCP-OLIA], or support decoupled 574 congestion controllers in environments using disjoint network paths. 576 In decoupled congestion control, each path runs its own congestion 577 controller without interacting with the congestion controllers of 578 other paths. That is to say, with respect to congestion control, a 579 path behaves exactly the same as a normal QUIC connection over the 580 same network path. 582 Each path MAY choose its congestion control algorithm independently. 584 8.2. Packet number space and acknowledgements 586 Each path has its own packet number space for transmitting 1-RTT 587 packets. 589 An ACK frame [QUIC-TRANSPORT] MUST be returned via the same path on 590 which the corresponding packets were sent. 592 An ACK_MP frame can be returned via either a different path, or the same 593 path identified by the Path Identifier, based on different strategies 594 of sending ACK_MP frames. 
596 Note: Only an ACK_MP frame returned via the same path can be used to 597 calculate the RTT (round trip time). 599 8.3. Flow control 601 TBD. 603 9. New frames 605 All the new frames MUST be sent in 1-RTT packets, and MUST NOT use 606 other encryption levels. 608 If an endpoint receives MP frames from packets of other encryption 609 levels, it MUST return MP_PROTOCOL_VIOLATION as a connection error 610 and close the connection. 612 9.1. PATH_STATUS frame 614 PATH_STATUS frames are used by endpoints to inform the peer of the 615 current status of one path, and the peer should send packets 616 according to the preference expressed in these frames. An endpoint uses 617 the sequence number of the CID used by the peer for PATH_STATUS 618 frames (describing the sender's path identifier). PATH_STATUS frames 619 are formatted as shown in Figure 4. 621 PATH_STATUS Frame { 622 Type (i) = TBD-03 (experiments use 0xbaba03), 623 Path Identifier (i), 624 Path Status sequence number (i), 625 Path Status (i), 626 Path Priority (i), 627 } 629 Figure 4: PATH_STATUS Frame Format 631 PATH_STATUS frames contain the following fields: 633 Path Identifier: A variable-length integer specifying the path 634 identifier. 636 Path Status sequence number: A variable-length integer specifying the 637 sequence number assigned for this PATH_STATUS frame. There is a 638 different path status sequence number space for each path. 640 Available values of the Path Status field are: 642 * 0: Abandon 644 * 1: Standby 646 * 2: Available 648 If the value of the Path Status field is 2 (Available), the receiver side 649 can use the Path Priority field to express the priority weight of a 650 path for the peer. 652 Frames may be received out of order. A peer MUST ignore an incoming 653 PATH_STATUS frame if it previously received another PATH_STATUS frame 654 for the same Path Identifier with a sequence number equal to or 655 higher than the sequence number of the incoming frame. 657 PATH_STATUS frames SHOULD be acknowledged. 
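The PATH_STATUS layout of Figure 4 and the out-of-order rule above can be sketched in Python as follows. The helper names (encode_varint, encode_path_status, PathStatusReceiver) are hypothetical; the variable-length integer encoding follows the scheme defined in [QUIC-TRANSPORT], and the frame type is the experimental value 0xbaba03 given in this document.

```python
PATH_STATUS_TYPE = 0xBABA03  # experimental frame type used by this draft
ABANDON, STANDBY, AVAILABLE = 0, 1, 2  # Path Status values


def encode_varint(value: int) -> bytes:
    """QUIC variable-length integer: 2-bit length prefix, big-endian payload."""
    if value < 0:
        raise ValueError("varint must be non-negative")
    if value < 1 << 6:
        return value.to_bytes(1, "big")
    if value < 1 << 14:
        return (value | 0x4000).to_bytes(2, "big")
    if value < 1 << 30:
        return (value | 0x80000000).to_bytes(4, "big")
    if value < 1 << 62:
        return (value | 0xC000000000000000).to_bytes(8, "big")
    raise ValueError("varint too large")


def encode_path_status(path_id: int, seq: int, status: int, priority: int) -> bytes:
    """Serialize the five fields of Figure 4 in order."""
    fields = (PATH_STATUS_TYPE, path_id, seq, status, priority)
    return b"".join(encode_varint(f) for f in fields)


class PathStatusReceiver:
    """Tracks the highest PATH_STATUS sequence number seen per Path Identifier."""

    def __init__(self) -> None:
        self._last_seq: dict = {}

    def accept(self, path_id: int, seq: int) -> bool:
        # Ignore a frame whose sequence number is not strictly higher than
        # the last one processed for the same Path Identifier.
        last = self._last_seq.get(path_id)
        if last is not None and seq <= last:
            return False
        self._last_seq[path_id] = seq
        return True
```

Because each path has its own PATH_STATUS sequence number space, the receiver keeps one counter per Path Identifier rather than a single connection-wide counter.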
If a packet containing a 658 PATH_STATUS frame is considered lost, the peer should only repeat it 659 if it was the last status sent for that path -- as indicated by the 660 sequence number. 662 9.2. ACK_MP frame 664 The ACK_MP frame allows for acknowledgements on different paths. The ACK_MP 665 frame is formatted by adding a Path Identifier field to the 666 [QUIC-TRANSPORT] ACK frame. The ACK_MP frame is formatted as shown in 667 Figure 5. 669 ACK_MP Frame { 670 Type (i) = TBD-00..TBD-01 (experiments use 0xbaba00..0xbaba01), 671 Path Identifier (i), 672 Largest Acknowledged (i), 673 ACK Delay (i), 674 ACK Range Count (i), 675 First ACK Range (i), 676 ACK Range (..) ..., 677 [ECN Counts (..)], 678 } 680 Figure 5: ACK_MP Frame Format 682 Type(i) = TBD-00 (experiments use 0xbaba00) , with no ECN Counts 683 Type(i) = TBD-01 (experiments use 0xbaba01) , with ECN Counts 685 9.3. QOE_CONTROL_SIGNALS frame 687 The QOE_CONTROL_SIGNALS frame is used to carry quality of experience 688 (QoE) information. A typical use of such information is to provide 689 feedback to help application-aware scheduling. Because different 690 applications may have very different needs, the interpretation of the 691 QoE control signal is left to the users. QOE_CONTROL_SIGNALS 692 frames are formatted as shown in Figure 6. 694 QOE_CONTROL_SIGNALS Frame { 695 Type (i) = TBD-02 (experiments use 0xbaba02), 696 Path Identifier (i), 697 QoE Control Signals Length(8), 698 QoE Control Signals (..) 699 } 701 Figure 6: QOE_CONTROL_SIGNALS Frame Format 703 QOE_CONTROL_SIGNALS frames may be received out of order; peers SHOULD 704 pass them to the application as they arrive. Although 705 QOE_CONTROL_SIGNALS frames are not retransmitted upon loss detection, 706 they are ack-eliciting [QUIC-RECOVERY]. 708 10. Security Considerations 710 TBD. 712 11. IANA Considerations 714 This document defines a new transport parameter for negotiating 715 the use of multiple paths for QUIC, and three new frame types. 
   The draft defines provisional values for experiments, but we expect
   IANA to allocate short values if the draft is approved.

   The following entry in Table 1 should be added to the "QUIC Transport
   Parameters" registry under the "QUIC Protocol" heading.

   +==============================+==================+===============+
   | Value                        | Parameter Name   | Specification |
   +==============================+==================+===============+
   | TBD (experiments use 0xbaba) | enable_multipath | Section 3     |
   +------------------------------+------------------+---------------+

         Table 1: Addition to QUIC Transport Parameters Entries

   The following frame types defined in Table 2 should be added to the
   "QUIC Frame Types" registry under the "QUIC Protocol" heading.

   +====================+=====================+===============+
   | Value              | Frame Name          | Specification |
   +====================+=====================+===============+
   | TBD-00 - TBD-01    | ACK_MP              | Section 9.2   |
   | (experiments use   |                     |               |
   | 0xbaba00-0xbaba01) |                     |               |
   +--------------------+---------------------+---------------+
   | TBD-02             | QOE_CONTROL_SIGNALS | Section 9.3   |
   | (experiments use   |                     |               |
   | 0xbaba02)          |                     |               |
   +--------------------+---------------------+---------------+
   | TBD-03             | PATH_STATUS         | Section 9.1   |
   | (experiments use   |                     |               |
   | 0xbaba03)          |                     |               |
   +--------------------+---------------------+---------------+

              Table 2: Addition to QUIC Frame Types Entries

12.  Changelog

13.  Appendix.A Scenarios related to migration

   In QUIC V1, there are four scenarios related to migration: CID
   renewal, NAT Rebinding, controlled migration, and migration to server
   preferred address.  It would be useful to explain exactly how these
   four scenarios are supported or changed with Multipath QUIC.
   For V1, these scenarios are described as follows:

   *  CID Renewal happens when the client starts using a new CID for
      1-RTT packets, while still using the same four-tuple.  This is
      typically done for privacy, for example after a long period of
      silence.  The expected result is that the server will also use a
      new CID for its next packets.  In that scenario, RTT and
      congestion control parameters remain the same before and after
      migration.

   *  NAT Rebinding happens when a NAT on the path changes its mappings.
      The server receives packets that bear the same CID as previously,
      but arrive on a different four-tuple.  The complication is that
      this could be an attack in which the attacker captures a packet
      from the client and resends it from a different address.  The
      server is expected to perform continuity tests for both the old
      and the new path, typically using a different CID for the new
      path.  If the continuity test on the new path succeeds before the
      old path, the server migrates to the new path; otherwise it
      continues using the old path and ignores the new path.

   *  Controlled migration happens when a client tests a new path.  The
      server receives packets that bear a new CID and arrive on a new
      four-tuple.  The server responds to the path challenge and
      performs its own continuity test on the new path.  If the client
      sends non-path-validation packets on the new path, the server
      switches to sending on the new path and discards the old path.

   *  Preferred address migration happens when the server sends the
      preferred address TP during the exchange.  The client performs a
      controlled migration to the new path, and if that is successful
      discards the old path.
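
   As a rough illustration only, the V1 scenarios above can be told
   apart by three observations about an incoming packet: whether its
   CID is new, whether its four-tuple is new, and whether the new
   address matches the server's preferred address.  The sketch below is
   not part of this specification; the names Scenario and
   classify_migration and the boolean parameters are hypothetical.

```python
from enum import Enum


class Scenario(Enum):
    NOT_A_MIGRATION = "Not a migration"
    NAT_REBINDING = "NAT Rebinding"
    CID_RENEWAL = "CID Renewal"
    PREFERRED_ADDRESS = "Migration to Preferred Address"
    CONTROLLED_MIGRATION = "Controlled Migration"


def classify_migration(new_cid: bool, new_four_tuple: bool,
                       matches_preferred_address: bool = False) -> Scenario:
    """Hypothetical helper: map packet observations to a V1 scenario."""
    if not new_cid and not new_four_tuple:
        # Same CID on the same four-tuple: nothing changed.
        return Scenario.NOT_A_MIGRATION
    if not new_cid:
        # Same CID arriving on a different four-tuple.
        return Scenario.NAT_REBINDING
    if not new_four_tuple:
        # New CID on the same four-tuple.
        return Scenario.CID_RENEWAL
    if matches_preferred_address:
        # New CID and new four-tuple toward the preferred address.
        return Scenario.PREFERRED_ADDRESS
    # New CID and new four-tuple on any other address.
    return Scenario.CONTROLLED_MIGRATION
```

   For example, classify_migration(new_cid=False, new_four_tuple=True)
   yields Scenario.NAT_REBINDING.  A real endpoint would still need to
   validate the new path before acting on this classification.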

   We could sum up these scenarios in the following table:

   +=====+=========+===================+====================+
   | CID | 4-tuple | preferred address | result             |
   +=====+=========+===================+====================+
   | Old | Old     | -                 | Not a migration.   |
   +-----+---------+-------------------+--------------------+
   | Old | New     | -                 | NAT Rebinding.     |
   +-----+---------+-------------------+--------------------+
   | New | Old     | -                 | CID Renewal.       |
   +-----+---------+-------------------+--------------------+
   | New | New     | matches PFA       | Migration to       |
   |     |         |                   | Preferred Address. |
   +-----+---------+-------------------+--------------------+
   | New | New     | other             | Controlled         |
   |     |         |                   | Migration.         |
   +-----+---------+-------------------+--------------------+

                 Table 3: Scenarios related to migration

   The expectation in those scenarios is:

   +==============+============================================+
   | Scenario     | Expectation                                |
   +==============+============================================+
   | Not a        | Continue using existing path               |
   | migration    |                                            |
   +--------------+--------------------------------------------+
   | NAT          | After validation, use new path and discard |
   | Rebinding    | previous path.                             |
   +--------------+--------------------------------------------+
   | CID Renewal  | Create new path with new CIDs, discard old |
   |              | path.  Reuse RTT and CC parameters.        |
   +--------------+--------------------------------------------+
   | Controlled   | Create new path with new CIDs.  Server     |
   | Migration    | creates a new path, ready to use both      |
   |              | paths.  Client may later discard old path. |
   +--------------+--------------------------------------------+
   | Migration to | Same as Controlled Migration, but the      |
   | Preferred    | client is expected to abandon the old path |
   | Address      |                                            |
   +--------------+--------------------------------------------+

         Table 4: Expectation in scenarios related to migration

   In Multipath QUIC, the client and server create a new path and
   abandon the old path to achieve exactly the same result as connection
   migration in the previous scenarios.

14.  Appendix.B Considerations on RTT estimate and loss detection

   QUIC implementations use RTT estimates in many ways:

   *  For loss detection, RTT estimates are used to evaluate how long to
      wait for an acknowledgement before a packet is declared lost.

   *  Several congestion control algorithms (e.g., LEDBAT, VEGAS,
      HYSTART) use variations of the RTT above the minimum value to
      detect the beginning of congestion.

   *  BBR uses the minimal RTT to compute the minimal size of the
      congestion window for a target data rate.

   *  ACK delays are often set as a fraction of the RTT.

   In a multipath environment, the RTT can be estimated each time a new
   packet is acknowledged.  However, the observed RTT will vary not only
   based on the state of the send path, but also based on the choice of
   the return path used for acknowledgements.  Each RTT measurement will
   be the sum of the one-way delay on the send path and the one-way
   delay on the return path.  This has a number of implications for the
   different ways of using the RTT presented above:

   *  If the goal is to detect possible losses, it is probably
      sufficient to consider all RTT measurements for a given path.
      Classic formulas like adding smoothed RTT and a number of
      deviations aim at estimating a reasonable upper bound of the
      acknowledgement delays.
      Statistics on observed acknowledgement delays will provide a valid
      estimate, regardless of the selection of the return path by the
      peer.

   *  If the goal is to detect the onset of congestion and tune a
      congestion control algorithm, the variations of delays due to the
      choices of return paths will be a source of errors.
      Implementations will need to pick a strategy, such as only
      considering acknowledgements received through the "fastest" return
      path, or maybe those received through the matching four-tuple for
      the sending path.  An alternative would be to use time stamps to
      directly estimate variations of the one-way delays.
      [QUIC-Timestamp] provides good support for such one-way-delay
      computation.

   *  If BBR is in use and ACKs are returned on different paths, it may
      cause an ambiguity issue with the computation of the
      bandwidth-delay product (BDP).  In BBR, BDP is used to limit the
      number of inflight packets.  One may choose to use the smallest
      RTT measured to compute BDP.  However, if the majority of ACKs are
      returned from a high-latency path, the resulting cwnd = cwnd_gain
      * bandwidth * min_rtt may be lower than what is needed to achieve
      good performance.  One possible solution is to transmit a new
      packet and its ACK on the same path.  Other possible solutions may
      include transmitting ACKs on the shortest path with a relative
      increase of cwnd_gain.  For the time being, we think this is a
      research problem and it is up to the implementers to pick the best
      solution.

15.  Appendix.C Difference from past proposals

   This proposal differs from past proposals
   [I-D.deconinck-quic-multipath] in two fundamental respects:

   *  The multi-path QUIC is built on top of the concept of
      bidirectional paths, which readily fits the nature of both
      cellular and Wi-Fi links that cover the majority of multi-path
      applications in QUIC, while keeping the design simple and easy to
      implement.  In doing so, we are able to re-use most of the current
      QUIC transport design with the sole addition of three new frames.

   *  The multi-path QUIC design enables a feedback-based dynamic
      scheduling strategy.  The major goal of multi-path QUIC is to
      enhance performance in mobile applications, where the sender and
      receiver may have different viewpoints about the fast-changing
      wireless connectivity, especially in high-mobility scenarios.  The
      proposed design therefore allows the sender and receiver to
      synchronize their viewpoints via message exchange in ACK packets
      in order to maximize performance.

16.  References

16.1.  Normative References

   [QUIC-LB]  Duke, M., Ed. and N. Banks, Ed., "QUIC-LB: Generating
              Routable QUIC Connection IDs", Work in Progress,
              Internet-Draft, draft-ietf-quic-load-balancers.

   [QUIC-RECOVERY]
              Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection
              and Congestion Control", Work in Progress, Internet-Draft,
              draft-ietf-quic-recovery.

   [QUIC-TLS] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure
              QUIC", Work in Progress, Internet-Draft,
              draft-ietf-quic-tls.

   [QUIC-TRANSPORT]
              Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
              Multiplexed and Secure Transport", Work in Progress,
              Internet-Draft, draft-ietf-quic-transport.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

16.2.  Informative References

   [I-D.deconinck-quic-multipath]
              Coninck, Q. and O. Bonaventure, "Multipath Extensions for
              QUIC (MP-QUIC)", Work in Progress, Internet-Draft,
              draft-deconinck-quic-multipath-06, 2 November 2020.

   [MPTCP-LIA]
              Raiciu, C., Handley, M., and D. Wischik, "Coupled
              Congestion Control for Multipath Transport Protocols",
              October 2011.

   [MPTCP-OLIA]
              Khalili, R., Gast, N., and J. Boudec, "Opportunistic
              Linked-Increases Congestion Control Algorithm for MPTCP",
              July 2014.

   [QUIC-Timestamp]
              Huitema, C., "Quic Timestamps For Measuring One-Way
              Delays", August 2020.

Authors' Addresses

   Yanmei Liu
   Alibaba Inc.

   Email: miaoji.lym@alibaba-inc.com

   Yunfei Ma
   Alibaba Inc.

   Email: yunfei.ma@alibaba-inc.com

   Christian Huitema
   Private Octopus Inc.

   Email: huitema@huitema.net

   Qing An
   Alibaba Inc.

   Email: anqing.aq@alibaba-inc.com

   Zhenyu Li
   ICT-CAS

   Email: zyli@ict.ac.cn