QUIC                                                              Y. Liu
Internet-Draft                                                     Y. Ma
Intended status: Standards Track                            Alibaba Inc.
Expires: 14 June 2021                                         C. Huitema
                                                    Private Octopus Inc.
                                                                   Q. An
                                                            Alibaba Inc.
                                                                   Z. Li
                                                                 ICT-CAS
                                                        11 December 2020


                      Multipath Extension for QUIC
                      draft-liu-multipath-quic-00

Abstract

   This document specifies a multipath extension for the QUIC protocol
   to enable the simultaneous usage of multiple paths for a single
   connection.  The extension is compliant with the single-path QUIC
   design.  The design principle is to support multipath by adding
   limited extensions to [QUIC-TRANSPORT].

Discussion Venues

   This note is to be removed before publishing as an RFC.

   Source for this draft and an issue tracker can be found at
   https://github.com/yfmascgy/Multipath-QUIC-IETF-Draft.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 14 June 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions and Definitions
   3.  Enable Multipath QUIC - Handshake
   4.  Path Management
     4.1.  Path Identifier and Connection ID
     4.2.  Path Packet Number Spaces
     4.3.  Path Initiation
     4.4.  Path State Management
     4.5.  Path Close
       4.5.1.  Use PATH_STATUS frame to close a path
       4.5.2.  Use RETIRE_CONNECTION_ID frame to close a path
       4.5.3.  Idle timeout
   5.  Using TLS to Secure QUIC Multipath
     5.1.  Packet protection for QUIC Multipath
     5.2.  Key Update for QUIC Multipath
   6.  Using Multipath QUIC with load balancers
   7.  Packet scheduling
     7.1.  Basic Scheduling
     7.2.  Scheduling with QoE Feedback
     7.3.  Per-stream Policy
   8.  Congestion control and loss detection
     8.1.  Congestion control
     8.2.  Packet number space and acknowledgements
     8.3.  Flow control
   9.  New frames
     9.1.  PATH_STATUS frame
     9.2.  MP_ACK frame
     9.3.  QOE_CONTROL_SIGNALS frame
     9.4.  MP_ADD_ADDRESS frame
     9.5.  MP_REMOVE_ADDRESS frame
   10. Security Considerations
   11. IANA Considerations
   12. Changelog
   13. Appendix.A Scenarios related to migration
   14. Appendix.B Considerations on RTT estimate and loss detection
   15. Appendix.C Difference from past proposals
   16. References
     16.1.  Normative References
     16.2.  Informative References
   Authors' Addresses

1.  Introduction

   In this document, we propose an extension to the current QUIC design
   to enable the simultaneous usage of multiple paths for a single
   connection.

   This proposal is based on several basic design points:

   *  Re-use the mechanisms of QUIC v1 as much as possible, since QUIC
      v1 already supports connection migration and path validation.

   *  To avoid the risk of packets being dropped by middleboxes (which
      may only support QUIC v1), use the same packet header formats as
      QUIC v1.

   *  Endpoints need a Path Identifier for each different path, which is
      used to track the state of packets.  As we want to keep the packet
      header formats of [QUIC-TRANSPORT] unchanged, Connection IDs (and
      the sequence numbers of Connection IDs) are a good choice of Path
      Identifier.

   *  For the convenience of packet loss detection and recovery,
      endpoints use a different packet number space for each Path
      Identifier.

   *  Congestion control, RTT measurements and PMTU discovery should be
      per-path (following [QUIC-TRANSPORT]).

   This document is organized as follows.  It first provides the
   definitions used for Multipath QUIC in Section 2.  It then specifies
   how to enable Multipath QUIC during the handshake in Section 3, and
   path management in Section 4.  It discusses packet scheduling in
   Section 7, and congestion control in Section 8.  The new frames are
   defined in Section 9.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   We assume that the reader is familiar with the terminology used in
   [QUIC-TRANSPORT].  In addition, we define the following terms:

   *  Path Identifier: An identifier that is used to identify a path in
      a QUIC connection at an endpoint.  It is defined as the sequence
      number of the destination Connection ID used for sending packets
      on that particular path.

   *  Each node maintains a list of "Received Packets" for each of the
      CIDs that it provided to the peer, which is used for acknowledging
      packets received with that CID.

3.  Enable Multipath QUIC - Handshake

   This extension defines a new transport parameter, used to negotiate
   the use of the multipath extension during the connection handshake,
   as specified in [QUIC-TRANSPORT].
   The new transport parameter is defined as follows:

   *  name: enable_multipath (0x40)

   *  value: 0 (default) for disabled, 1 for enabled

   If the peer does not send the enable_multipath (0x40) transport
   parameter, the peer does not support multipath.  In that case the
   endpoint MUST fall back to single-path [QUIC-TRANSPORT], MUST NOT
   send any MP frames in subsequent packets, and MUST NOT use the
   multipath-specific AEAD algorithm defined in Section 5.1.

   Note that the transport parameter "active_connection_id_limit"
   [QUIC-TRANSPORT] limits the number of usable Connection IDs, and
   therefore also limits the number of concurrent paths.

4.  Path Management

   After both endpoints have negotiated the multipath feature during the
   handshake, they can start using multiple paths.

   This proposal adds one frame for path management:

   *  PATH_STATUS frame for the receiver side to claim the path state
      and preference

   All the new MP frames are sent in 1-RTT packets [QUIC-TRANSPORT].

4.1.  Path Identifier and Connection ID

   Endpoints need a Path Identifier for each different path, which is
   used to track the state of packets.  Endpoints use the Connection ID
   in the 1-RTT packet header as the Path Identifier in each direction,
   and use the sequence number of the Connection ID in MP frames to
   identify the path referred to.

   Following [QUIC-TRANSPORT], each endpoint uses NEW_CONNECTION_ID
   frames to claim usable Connection IDs for itself.  Before an endpoint
   adds a new path, it SHOULD check whether there is at least one unused
   available Connection ID for each side.

   Endpoints can find which path a received packet belongs to according
   to the Destination Connection ID of the 1-RTT packet.  Endpoints can
   find the context of a path by its Connection ID or by the sequence
   number of that Connection ID.

4.2.  Path Packet Number Spaces

   For the convenience of packet loss detection and recovery, endpoints
   use a different packet number space for each Path Identifier
   (Connection ID).  The MP_ACK frame includes the sequence number of
   the Destination Connection ID of the acknowledged packets as the Path
   Identifier.

4.3.  Path Initiation

   Figure 1 illustrates an example of new path establishment.

      Client                                                  Server

      (Exchanges start on default path)
      1-RTT[]: NEW_CONNECTION_ID[C1, Seq=1] -->
                            <-- 1-RTT[]: NEW_CONNECTION_ID[S1, Seq=1]
                            <-- 1-RTT[]: NEW_CONNECTION_ID[S2, Seq=2]
      ...
      (starts new path)
      1-RTT[0]: DCID=S2, PATH_CHALLENGE[X] -->
                          Checks AEAD using nonce(CID sequence 2, PN 0)
        <-- 1-RTT[0]: DCID=C1, PATH_RESPONSE[X], PATH_CHALLENGE[Y],
                                                    MP_ACK[Seq=1,PN=0]
      Checks AEAD using nonce(CID sequence 1, PN 0)
      1-RTT[1]: DCID=S2, PATH_RESPONSE[Y],
               MP_ACK[Seq=1, PN=0], ... -->

                Figure 1: Example of new path establishment

   As shown in Figure 1, the client provides one unused available
   Connection ID (C1 with sequence number 1), and the server provides
   two available Connection IDs (S1 with sequence number 1, and S2 with
   sequence number 2).  When the client wants to start a new path, it
   checks whether there are unused available Connection IDs for each
   side, and chooses an available Connection ID, S2, as the Destination
   Connection ID on the new path.

   Endpoints need to exchange unused available Connection IDs with the
   NEW_CONNECTION_ID frame before an endpoint starts a new path.  For
   example, if the goal is to maintain 2 paths, each endpoint should
   provide at least 3 CIDs to its peer: 2 in use, and one spare.  If the
   client has used all the allocated CIDs, it is supposed to retire
   those that are not used anymore, and the server is supposed to
   provide replacements, as specified in [QUIC-TRANSPORT].
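
   The Connection ID bookkeeping described above can be sketched in
   Python as follows.  This is an illustrative sketch only and not part
   of the protocol: CidPool, min_cids_for and pick_cids_for_new_path
   are hypothetical names, and picking the lowest unused sequence
   number is an arbitrary choice (in Figure 1 the client picks S2).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class CidPool:
    """Connection IDs issued by one endpoint, tracked by sequence number."""
    issued: Set[int] = field(default_factory=set)  # announced via NEW_CONNECTION_ID
    in_use: Set[int] = field(default_factory=set)  # already bound to a path

    def unused(self) -> List[int]:
        # Sequence numbers that were issued but are not bound to any path.
        return sorted(self.issued - self.in_use)


def min_cids_for(paths: int) -> int:
    """CIDs each endpoint should provide: one per path plus one spare."""
    return paths + 1


def pick_cids_for_new_path(local: CidPool,
                           remote: CidPool) -> Optional[Tuple[int, int]]:
    """Reserve one unused CID on each side, or return None if a side has none."""
    local_free, remote_free = local.unused(), remote.unused()
    if not local_free or not remote_free:
        return None
    local.in_use.add(local_free[0])
    remote.in_use.add(remote_free[0])
    return local_free[0], remote_free[0]
```

   With the Connection IDs of Figure 1 (sequence 0 assumed to be in use
   on the default path), the first call to pick_cids_for_new_path
   succeeds, and a second call fails until the client issues another
   Connection ID.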

   If the transport parameter "active_connection_id_limit" is negotiated
   as N, and the server has provided N Connection IDs and the client has
   started N paths, the limit is reached.  If the client wants to start
   a new path, it has to retire one of the established paths.

   Path validation uses the PATH_CHALLENGE and PATH_RESPONSE frames
   defined in [QUIC-TRANSPORT].

4.4.  Path State Management

   An endpoint uses PATH_STATUS frames to tell the peer which paths it
   prefers; the peer should send packets according to the preference
   expressed by these frames.  An endpoint uses the sequence number of
   the CID used by the peer for PATH_STATUS frames (describing the
   sender's path identifier).

   In the example of Figure 1, if the client wants to send a PATH_STATUS
   frame to tell the server that it prefers the path with CID sequence
   number 1 (of the server's side), the client should use the identifier
   of the server (sequence 1) in the PATH_STATUS frame.

   The PATH_STATUS frame describes 4 kinds of path states:

   *  Abandon a path, and release the corresponding resources.

   *  Mark a path as "available", i.e., allow the peer to use its own
      logic to split traffic among available paths.

   *  Mark a path as "standby", i.e., suggest that no traffic should be
      sent on that path if another path is available.

   *  Mark the priority of a path.  For example, if path 1 has weight 8
      and path 2 has weight 2, path 1 has higher priority than path 2,
      and the peer should try to send more data on path 1.

   A PATH_STATUS frame can be sent on a path other than the one
   identified by its Path Identifier field.

4.5.  Path Close

   An endpoint that wants to delete a path SHOULD NOT rely on implicit
   signals like idle time or packet losses, but instead SHOULD use
   explicit signals like retiring a Connection ID or asking to abandon
   the path.

4.5.1.  Use PATH_STATUS frame to close a path

   Both client and server can close a path by sending a PATH_STATUS
   frame which abandons the path with the corresponding Path Identifier.
   Once a path is marked as "abandon", the resources related to the path
   can be released.

   Figure 2 illustrates an example of path closing.  In this case, the
   path identifier used by the server is CID C1, whose CID sequence
   number is 1; the path identifier used by the client is CID S2, whose
   CID sequence number is 2.  (To distinguish the CID sequence number
   from the PATH_STATUS sequence number, we refer to the latter as
   "PSSN".)

      Client                                                      Server

      (client tells server to abandon a path)
      1-RTT[X]: PATH_STATUS[id=1, PSSN1, status=abandon, pri.=0] ->
                             (server tells client to abandon a path)
         <- 1-RTT[Y]: PATH_STATUS[id=2, PSSN2, status=abandon, pri.=0],
                                                     MP_ACK[Seq=1, PN=X]
      (client abandons the path that it is using)
      1-RTT[X+1]: RETIRE_CONNECTION_ID[2], MP_ACK[Seq=2, PN=Y] ->
                         (server abandons the path that it is using)
         <- 1-RTT[Y+1]: RETIRE_CONNECTION_ID[1], MP_ACK[Seq=1, PN=X+1]

                    Figure 2: Example of closing a path

   When the client detects a change in the network environment (for
   example, its 4G or Wi-Fi interface is turned off, or the Wi-Fi signal
   fades below a threshold), or when either endpoint detects that the
   RTT or loss rate of a path is getting worse, the client or server can
   terminate the path immediately.

4.5.2.  Use RETIRE_CONNECTION_ID frame to close a path

   A sender can close a path by retiring the associated Connection ID.
   The RETIRE_CONNECTION_ID frame can be sent on any path.

   Receiving a RETIRE_CONNECTION_ID frame causes the endpoint to discard
   the resources associated with that Connection ID.  If the Connection
   ID was used by the peer to identify a path from the peer to this
   endpoint, the resources include the list of received packets used to
   send acknowledgements.
   There is no reason for the endpoint to send a PATH_STATUS(abandon)
   for that path, since the peer has already abandoned it.  An endpoint
   SHOULD only send RETIRE_CONNECTION_ID to the peer if all packets sent
   with that CID are either acknowledged or considered lost.

   This has no direct effect on reverse paths from this endpoint to the
   peer.  If the peer wants to direct the endpoint to abandon such
   paths, it should send PATH_STATUS(abandon) frames for the relevant
   paths.

4.5.3.  Idle timeout

   [QUIC-TRANSPORT] allows for closing of connections if they stay idle
   for too long.  The connection idle timeout in Multipath QUIC is
   defined as "no packet received on any path for the duration of the
   idle timeout".  It means that if all paths remain idle for the idle
   timeout, the connection is implicitly closed.

5.  Using TLS to Secure QUIC Multipath

   In order to facilitate loss detection and recovery when sending data
   over multiple paths, this specification defines how packets sent over
   multiple paths use different packet number spaces.  This requires
   changes in the way AEAD is applied for packet protection, as
   explained in Section 5.1, and tighter constraints for key updates, as
   explained in Section 5.2.

5.1.  Packet protection for QUIC Multipath

   Packet protection for QUIC v1 is specified in Section 5 of
   [QUIC-TLS].  The general principles of packet protection are not
   changed for QUIC Multipath.  No changes are needed for setting packet
   protection keys, initial secrets, header protection, use of 0-RTT
   keys, receiving out-of-order protected packets, receiving protected
   packets, or retry packet integrity.  However, the use of multiple
   number spaces for 1-RTT packets requires changes in AEAD usage.

   Section 5.3 of [QUIC-TLS] specifies AEAD usage, and in particular the
   use of a nonce, N, formed by combining the packet protection IV with
   the packet number.
   QUIC multipath uses multiple packet number spaces, and thus the
   packet number alone would not guarantee the uniqueness of the nonce.
   In order to guarantee this uniqueness, we construct the nonce N by
   combining the packet protection IV with the packet number and with
   the identifier of the path, which for 1-RTT packets is the Sequence
   Number of the Destination Connection ID present in the packet header,
   as defined in Section 5.1.1 of [QUIC-TRANSPORT], or zero if the
   Connection ID is zero-length.  Section 19 of [QUIC-TRANSPORT] encodes
   this Connection ID Sequence Number as a variable-length integer,
   allowing values up to 2^62-1; for QUIC multipath, we require that
   these values be no larger than 2^32-1.

   For QUIC multipath, the construction of the nonce starts with the
   construction of a 96-bit path-and-packet-number, composed of the
   32-bit Connection ID Sequence Number in network byte order, two zero
   bits, and the 62 bits of the reconstructed QUIC packet number in
   network byte order.  If the IV is larger than 96 bits, the path-and-
   packet-number is left-padded with zeros to the size of the IV.  The
   exclusive OR of the padded path-and-packet-number and the IV forms
   the AEAD nonce.

   For example, assuming the IV value is "6b26114b9cba2b63a9e8dd4f", the
   connection ID sequence number is "3", and the packet number is
   "aead", the path-and-packet-number is "000000030000000000aead" padded
   to 96 bits, that is "00000003000000000000aead", and the nonce will be
   set to "6b2611489cba2b63a9e873e2".

5.2.  Key Update for QUIC Multipath

   The Key Phase bit update process for QUIC v1 is specified in
   Section 6 of [QUIC-TLS].  The general principles of key update are
   not changed for Multipath QUIC.  Following QUIC v1, the Key Phase bit
   is used to indicate which packet protection keys are used to protect
   the packet.  The Key Phase bit is toggled to signal each subsequent
   key update.  Because of network delays, packets protected with the
   older key might arrive later than the packets protected with the new
   key.
   Therefore, the endpoint needs to retain old packet keys to allow
   these delayed packets to be processed, and it must distinguish
   between the new key and the old key.  In QUIC v1, this is done using
   packet numbers so that the rule is made simple: use the older key if
   the packet number is lower than any packet number from the current
   key phase.

   In QUIC multipath, some care is needed when initiating the Key Update
   process.  Because different paths use different packet number spaces
   but share a single key, when a key update is initiated on one path,
   the other paths need to know when the transition is complete.
   Otherwise, it is possible that the other paths send packets with the
   old keys, but skip sending any packets in the current key phase and
   directly jump to sending packets in the next key phase.  When that
   happens, as the endpoint can only retain two sets of packet
   protection keys with the 1-bit Key Phase bit, the other paths cannot
   distinguish which key should be used to decode received packets,
   which results in a key rotation synchronization problem.

   To address this synchronization issue, in QUIC multipath, if a key
   update is initiated on one path, the sender should send at least one
   packet with the new key on all active paths.  When responding to a
   key update, the endpoint MUST NOT initiate a subsequent key update
   until a packet with the current key has been acknowledged on each
   path.

   Following Section 5.4 of [QUIC-TLS], the Key Phase bit is protected,
   so sending multiple packets with the Key Phase bit flipped at the
   same time should not cause linkability issues.

6.  Using Multipath QUIC with load balancers

   This specification follows the Connection ID negotiation defined in
   [QUIC-TRANSPORT].
   For stateless or low-state load balancers supporting Multipath QUIC,
   implementations SHOULD use the specification of Connection ID
   generation and load balancer routing defined in [QUIC-LB], to
   guarantee that packets with Connection IDs belonging to the same
   connection can be routed to the same server.

7.  Packet scheduling

7.1.  Basic Scheduling

   For an outgoing packet, the packet scheduler decides on which path
   the packet shall be transmitted.  A basic static scheduling strategy
   consists of four major components:

   1.  Path state: A scheduler may want to decide which path shall be
       activated to transmit data.  For instance, a scheduler can choose
       to use only one of the two paths and completely ignore the other
       one.  A scheduler marks the selected paths to be in the
       "available" state and the un-selected ones in the "standby"
       state.

   2.  Path priority: The costs of transmitting data over different
       paths are not always equal.  For example, the energy (battery)
       costs over a 5G path and a Wi-Fi path are very different.  In
       another example, transmissions over a Wi-Fi path and a cellular
       path may incur different charges per packet.  Note that a user's
       preference may change over time.  For instance, certain mobile
       carriers offer unlimited free data for a particular streaming
       app.  Therefore, the path priority should be made available in
       the scheduler.

   3.  Path selection algorithm: A selection algorithm splits packets
       across different paths and determines the order of paths to be
       selected.  The selection algorithm takes congestion controller
       states as inputs, such as smoothed RTTs (sRTTs), estimated
       bandwidths (eBWs) and congestion window sizes (CWNDs), as well as
       application-defined information such as path priorities and path
       states.  The output of the algorithm is an ordered list of paths
       to put a packet on.
       Some of the commonly used algorithms are:

       -  Round-Robin: There is no priority.  It selects paths one by
          one in order to transmit data.

       -  Lowest-RTT: It first chooses the path with the lowest RTT and
          feeds packets to it until that path's congestion window is
          full.  Then it chooses the path with the second lowest RTT.

       -  Highest-Sending-Rate: It first chooses the path with the
          highest bandwidth and feeds packets to it until that path's
          congestion window is full.  Then it chooses the path with the
          second highest bandwidth.

   4.  Packet redundancy: One major challenge in multipath transmission
       is that a packet loss on the slow path might block the overall
       transmission when packets are split across fast-changing paths.
       As the path selection algorithm takes inputs from congestion
       controllers on predictions of the network which may not be
       accurate enough for fast-changing wireless channels, such an
       imprecise estimation could lead to network overuse or underuse.
       A solution to this problem is to implement a packet redundancy
       strategy.  A redundancy strategy can be applied to only ACK
       packets (partial redundancy) or to all data packets (full
       redundancy).  It is up to the application to determine whether,
       when, and on which packets to activate redundancy.

   The path state and path priority are managed by the PATH_STATUS
   frame.  The path selection algorithm and packet redundancy are
   application related and should be controlled by the application.

7.2.  Scheduling with QoE Feedback

   Applications may have completely different QoE requirements:
   interactive applications are delay sensitive, while video streaming
   applications are more throughput sensitive.  There is thus a trend of
   cross-layer design that takes applications' demands into account when
   managing paths or scheduling packets.
   The QoE feedback is used to fully support application-awareness in
   multipath scheduling and is carried in QOE_CONTROL_SIGNALS frames
   (Figure 6).  The QOE_CONTROL_SIGNALS frames can include general
   application-level information that is needed by the schedulers.  The
   frequency of such feedback should be controlled to limit the number
   of extra packets.  The QoE control signal allows a synchronization of
   viewpoints between the two end hosts.  It is up to the application to
   determine the interpretation of QoE control signals.

7.3.  Per-stream Policy

   As QUIC supports stream multiplexing, streams can be associated with
   stream priorities to express the application's intent.  For instance,
   objects in a web page may depend on others and thus have different
   priorities in the Multipath QUIC scheduler.  A stream priority-aware
   packet scheduling algorithm can improve performance notably.

   High priority  /\   +---------+
                  ||   |         |
                  ||   +---------+
                  ||   +---------+
                  ||   |         |
                  ||   +---------+
                  ||       ...       User-defined stream priority
                  ||   +---------+
   Low priority   ||   |         |
                  ||   +---------+
   -----------------------------------------------------------
   High priority  /\   +---------+
                  ||   |         |
                  ||   +---------+
                  ||   +---------+
                  ||   |         |
                  ||   +---------+
                  ||       ...       Default stream priority
                  ||   +---------+
   Low priority   ||   |         |
                  ||   +---------+

                         Figure 3: Stream priority

   The priority management scheme comprises two separate priority
   ranges.  The user-defined priority range includes the streams to
   which the application has explicitly assigned priorities, while the
   default priority range includes the streams with no priorities set by
   the application.  Streams in the default priority range can send only
   when the streams in the user-defined range have no data to send.
   Within the same range, one can use weighted round-robin scheduling,
   in which higher-priority streams get more quota for data to send in
   each round.  One can also dynamically set or change the priorities of
   the streams in the default priority range to enable short-stream-
   first scheduling if needed.

8.  Congestion control and loss detection

8.1.  Congestion control

   Implementations MAY support coupled congestion controllers such as
   LIA [MPTCP-LIA] and OLIA [MPTCP-OLIA], or support decoupled
   congestion controllers in environments using disjoint network paths.

   In decoupled congestion control, each path runs its own congestion
   controller without interacting with the congestion controllers of
   other paths.  That is to say, with respect to congestion control, a
   path behaves exactly the same as a normal QUIC connection over the
   same network path.

   Each path MAY choose its congestion control algorithm independently.

8.2.  Packet number space and acknowledgements

   Each path has its own packet number space for transmitting 1-RTT
   packets.

   An ACK frame [QUIC-TRANSPORT] MUST be returned via the same path on
   which the corresponding packets were sent.

   An MP_ACK frame can be returned via either a different path, or the
   same path identified by the Path Identifier, based on different
   strategies of sending MP_ACK frames.

   Note: Only an MP_ACK frame returned via the same path can be used to
   calculate the RTT (round-trip time).

8.3.  Flow control

   TBD.

9.  New frames

   All the new frames MUST be sent in 1-RTT packets, and MUST NOT use
   other encryption levels.

   If an endpoint receives MP frames from packets of other encryption
   levels, it MAY return MP_PROTOCOL_VIOLATION as a connection error and
   close the connection.

9.1.  PATH_STATUS frame

   PATH_STATUS frames are used by endpoints to inform the peer of the
   current status of one path, and the peer should send packets
   according to the preference expressed in these frames.  Endpoints use
   the sequence number of the CID used by the peer for PATH_STATUS
   frames (describing the sender's path identifier).  PATH_STATUS frames
   are formatted as shown in Figure 4.

     PATH_STATUS Frame {
       Type (i) = 0x2a,
       Path Identifier (i),
       Path Status sequence number (i),
       Path Status (i),
       Path Priority (i),
     }

                    Figure 4: PATH_STATUS Frame Format

   PATH_STATUS frames contain the following fields:

   Path Identifier:  A variable-length integer specifying the path
      identifier.

   Path Status sequence number:  A variable-length integer specifying
      the sequence number assigned for this PATH_STATUS frame.  There is
      a different path status sequence number space for each path.

   Available values of the Path Status field are:

   *  0: Abandon

   *  1: Standby

   *  2: Available

   If the value of the Path Status field is 2 (Available), the receiver
   side can use the Path Priority field to express the priority weight
   of a path for the peer.

   Frames may be received out of order.  A peer MUST ignore an incoming
   PATH_STATUS frame if it previously received another PATH_STATUS frame
   for the same Path Identifier with a sequence number equal to or
   higher than the sequence number of the incoming frame.

   PATH_STATUS frames SHOULD be acknowledged.  If a packet containing a
   PATH_STATUS frame is considered lost, the peer should only repeat it
   if it was the last status sent for that path, as indicated by the
   sequence number.

9.2.  MP_ACK frame

   The MP_ACK frame allows for acknowledgements on different paths.  It
   is formed by adding a Path Identifier field to the [QUIC-TRANSPORT]
   ACK frame.  The MP_ACK frame is formatted as shown in Figure 5.

     MP_ACK Frame {
       Type (i) = 0x22..0x23,
       Path Identifier (i),
       Largest Acknowledged (i),
       ACK Delay (i),
       ACK Range Count (i),
       First ACK Range (i),
       ACK Range (..) ...,
       [ECN Counts (..)],
     }

                      Figure 5: MP_ACK Frame Format

   Type (i) = 0x22, with no ECN Counts
   Type (i) = 0x23, with ECN Counts

9.3.  QOE_CONTROL_SIGNALS frame

   The QOE_CONTROL_SIGNALS frame is used to carry quality of experience
   (QoE) information.  A typical use of such information is to provide
   feedback to help application-aware scheduling.  Because different
   applications may have very different needs, the interpretation of
   the QoE control signals is left to the users.  QOE_CONTROL_SIGNALS
   frames are formatted as shown in Figure 6.

     QOE_CONTROL_SIGNALS Frame {
       Type (i) = 0x24,
       Path Identifier (i),
       QoE Control Signals Length (8),
       QoE Control Signals (..)
     }

                Figure 6: QOE_CONTROL_SIGNALS Frame Format

   QOE_CONTROL_SIGNALS frames may be received out of order; peers
   SHOULD pass them to the application as they arrive.  Although
   QOE_CONTROL_SIGNALS frames are not retransmitted upon loss detection,
   they are ack-eliciting [QUIC-RECOVERY].

9.4.  MP_ADD_ADDRESS frame

   TBD.

9.5.  MP_REMOVE_ADDRESS frame

   TBD.

10.  Security Considerations

   TBD.

11.  IANA Considerations

   This document makes no request of IANA.

12.  Changelog

13.  Appendix.A Scenarios related to migration

   In QUIC V1, there are four scenarios related to migration: CID
   renewal, NAT rebinding, controlled migration, and migration to the
   server's preferred address.  It would be useful to explain exactly
   how these four scenarios are supported or changed with Multipath
   QUIC.  For V1, these scenarios are described as follows:

   *  CID Renewal happens when the client starts using a new CID for
      1-RTT packets, while still using the same four-tuple.
      This is typically done for privacy, for example after a long
      period of silence.  The expected result is that the server will
      also use a new CID for its next packets.  In that scenario, RTT
      and congestion control parameters remain the same before and
      after migration.

   *  NAT Rebinding happens when a NAT on the path changes its mappings.
      The server receives packets that bear the same CID as previously,
      but arrive on a different four-tuple.  The complication is that
      this could be an attack in which the attacker captures a packet
      from the client and resends it from a different address.  The
      server is expected to perform continuity tests for both the old
      and the new path, typically using a different CID for the new
      path.  If the continuity test on the new path succeeds before the
      one on the old path, the server migrates to the new path;
      otherwise it continues using the old path and ignores the new
      path.

   *  Controlled migration happens when a client tests a new path.  The
      server receives packets that bear a new CID and arrive on a new
      four-tuple.  The server responds to the path challenge and
      performs its own continuity test on the new path.  If the client
      sends non-path-validation packets on the new path, the server
      switches to sending on the new path and discards the old path.

   *  Preferred address migration happens when the server sends the
      preferred address transport parameter (TP) during the exchange.
      The client performs a controlled migration to the new path and,
      if that is successful, discards the old path.

   We could sum up these scenarios in the following table:

   +=====+=========+===================+====================+
   | CID | 4-tuple | preferred address | result             |
   +=====+=========+===================+====================+
   | Old | Old     | -                 | Not a migration.   |
   +-----+---------+-------------------+--------------------+
   | Old | New     | -                 | NAT Rebinding.     |
   +-----+---------+-------------------+--------------------+
   | New | Old     | -                 | CID Renewal.       |
   +-----+---------+-------------------+--------------------+
   | New | New     | matches PFA       | Migration to       |
   |     |         |                   | Preferred Address. |
   +-----+---------+-------------------+--------------------+
   | New | New     | other             | Controlled         |
   |     |         |                   | Migration.         |
   +-----+---------+-------------------+--------------------+

                 Table 1: Scenarios related to migration

   The expectation in those scenarios is:

   +==============+============================================+
   | Scenario     | Expectation                                |
   +==============+============================================+
   | Not a        | Continue using existing path               |
   | migration    |                                            |
   +--------------+--------------------------------------------+
   | NAT          | After validation, use new path and discard |
   | Rebinding    | previous path.                             |
   +--------------+--------------------------------------------+
   | CID Renewal  | Create new path with new CIDs, discard old |
   |              | path.  Reuse RTT and CC parameters.        |
   +--------------+--------------------------------------------+
   | Controlled   | Create new path with new CIDs.  Server     |
   | Migration    | creates a new path, ready to use both      |
   |              | paths.  Client may later discard old path. |
   +--------------+--------------------------------------------+
   | Migration to | Same as Controlled Migration, but the      |
   | Preferred    | client is expected to abandon the old path |
   | Address      |                                            |
   +--------------+--------------------------------------------+

          Table 2: Expectation in scenarios related to migration

   In Multipath QUIC, the client or server creates a new path and
   abandons the old path to do exactly the same thing as connection
   migration in the previous scenarios.

14.  Appendix.B Considerations on RTT estimate and loss detection

   QUIC implementations use RTT estimates in many ways:

   *  For loss detection, RTT estimates are used to evaluate how long to
      wait for an acknowledgement before a packet is declared lost.

   *  Several congestion control algorithms (e.g., LEDBAT, VEGAS,
      HYSTART) use variations of the RTT above the minimum value to
      detect the beginning of congestion.

   *  BBR uses the minimal RTT to compute the minimal size of the
      congestion window for a target data rate.

   *  ACK delays are often set as a fraction of the RTT.

   In a multipath environment, the RTT can be estimated each time a new
   packet is acknowledged.  However, the observed RTT will vary not only
   based on the state of the send path, but also based on the choice of
   the return path used for acknowledgements.  Each RTT measurement will
   be the sum of the one-way delay on the send path and the one-way
   delay on the return path.  This has a number of implications for the
   different ways of using the RTT presented above:

   *  If the goal is to detect possible losses, it is probably
      sufficient to consider all RTT measurements for a given path.
      Classic formulas like adding smoothed RTT and a number of
      deviations aim at estimating a reasonable upper bound of the
      acknowledgement delays.  Statistics on observed acknowledgement
      delays will provide a valid estimate, regardless of the selection
      of the return path by the peer.

   *  If the goal is to detect the onset of congestion and tune a
      congestion control algorithm, the variations of delays due to the
      choices of return paths will be a source of errors.
      Implementations will need to pick a strategy, such as only
      considering acknowledgements received through the "fastest"
      return path, or maybe those received through the matching
      four-tuple for the sending path.
      An alternative would be to use time stamps to directly estimate
      variations of the one-way delays.  [QUIC-Timestamp] provides good
      support for such one-way-delay computation.

   *  If BBR is in use and ACKs are returned on different paths, it may
      cause an ambiguity issue with the computation of the
      bandwidth-delay product (BDP).  In BBR, the BDP is used to limit
      the number of inflight packets.  One may choose to use the
      smallest RTT measured to compute the BDP.  However, if the
      majority of ACKs are returned from a high-latency path, the
      resulting cwnd = cwnd_gain * bandwidth * min_rtt may be lower
      than what is needed to achieve good performance.  One possible
      solution is to always transmit a new packet and its ACK on the
      same path.

15.  Appendix.C Difference from past proposals

   This proposal differs from past proposals
   [I-D.deconinck-quic-multipath] in two fundamental respects:

   *  The multi-path QUIC is built on top of the concept of
      bidirectional paths, which readily fits the nature of both
      cellular and Wi-Fi links that cover the majority of multi-path
      applications in QUIC, while keeping the design simple and easy to
      implement.  In doing so, we are able to re-use most of the current
      QUIC transport design with the sole addition of six new frames.

   *  The multi-path QUIC design enables a feedback-based dynamic
      scheduling strategy.  As the major goal of multi-path QUIC is to
      enhance performance in mobile applications, where the sender and
      receiver may have different viewpoints about the fast-changing
      wireless connectivity, especially in high-mobility scenarios, the
      proposed design allows the sender and receiver to synchronize
      their viewpoints via message exchange in ACK packets in order to
      maximize performance.

16.  References

16.1.  Normative References

   [QUIC-LB]  Duke, M., Ed. and N.
              Banks, Ed., "QUIC-LB: Generating Routable QUIC Connection
              IDs", Work in Progress, Internet-Draft,
              draft-ietf-quic-load-balancers.

   [QUIC-RECOVERY]
              Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection
              and Congestion Control", Work in Progress, Internet-Draft,
              draft-ietf-quic-recovery.

   [QUIC-TLS] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure
              QUIC", Work in Progress, Internet-Draft,
              draft-ietf-quic-tls.

   [QUIC-TRANSPORT]
              Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
              Multiplexed and Secure Transport", Work in Progress,
              Internet-Draft, draft-ietf-quic-transport.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

16.2.  Informative References

   [I-D.deconinck-quic-multipath]
              Coninck, Q. and O. Bonaventure, "Multipath Extensions for
              QUIC (MP-QUIC)", Work in Progress, Internet-Draft,
              draft-deconinck-quic-multipath-06, 2 November 2020.

   [MPTCP-LIA]
              Raiciu, C., Handley, M., and D. Wischik, "Coupled
              Congestion Control for Multipath Transport Protocols",
              October 2011.

   [MPTCP-OLIA]
              Khalili, R., Gast, N., and J. Boudec, "Opportunistic
              Linked-Increases Congestion Control Algorithm for MPTCP",
              July 2014.

   [QUIC-Timestamp]
              Huitema, C., "Quic Timestamps For Measuring One-Way
              Delays", August 2020.

Authors' Addresses

   Yanmei Liu
   Alibaba Inc.

   Email: miaoji.lym@alibaba-inc.com


   Yunfei Ma
   Alibaba Inc.

   Email: yunfei.ma@alibaba-inc.com


   Christian Huitema
   Private Octopus Inc.

   Email: huitema@huitema.net


   Qing An
   Alibaba Inc.

   Email: anqing.aq@alibaba-inc.com


   Zhenyu Li
   ICT-CAS

   Email: zyli@ict.ac.cn
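
Illustrative sketch: PATH_STATUS encoding and ordering

   The following is a minimal, non-normative sketch of how the
   PATH_STATUS frame of Section 9.1 (Figure 4) might be serialized, and
   of how a receiver might apply the rule that a frame is ignored when
   its sequence number is not higher than one already received for the
   same Path Identifier.  It assumes the variable-length integer
   encoding of [QUIC-TRANSPORT] for every "(i)" field.  The names
   PathStatusFrame, PathStatusTracker, encode_varint, and decode_varint
   are hypothetical and chosen only for this example; they do not come
   from this document or from any existing implementation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

PATH_STATUS_TYPE = 0x2A  # frame type taken from Figure 4


def encode_varint(value: int) -> bytes:
    """Encode a QUIC variable-length integer (1, 2, 4 or 8 bytes)."""
    if value < 0 or value >= 1 << 62:
        raise ValueError("value out of range for a QUIC varint")
    if value < 1 << 6:
        return value.to_bytes(1, "big")
    if value < 1 << 14:
        return (value | 0x4000).to_bytes(2, "big")
    if value < 1 << 30:
        return (value | 0x8000_0000).to_bytes(4, "big")
    return (value | 0xC000_0000_0000_0000).to_bytes(8, "big")


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one varint at buf[pos]; return (value, next position)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    length = 1 << (buf[pos] >> 6)  # top two bits select the length
    if pos + length > len(buf):
        raise ValueError("truncated varint")
    value = buf[pos] & 0x3F
    for byte in buf[pos + 1 : pos + length]:
        value = (value << 8) | byte
    return value, pos + length


@dataclass(frozen=True)
class PathStatusFrame:
    """Hypothetical in-memory form of the frame in Figure 4."""

    path_id: int
    sequence: int
    status: int    # 0: Abandon, 1: Standby, 2: Available
    priority: int

    def encode(self) -> bytes:
        fields = (PATH_STATUS_TYPE, self.path_id, self.sequence,
                  self.status, self.priority)
        return b"".join(encode_varint(field) for field in fields)

    @classmethod
    def decode(cls, buf: bytes) -> "PathStatusFrame":
        frame_type, pos = decode_varint(buf)
        if frame_type != PATH_STATUS_TYPE:
            raise ValueError("not a PATH_STATUS frame")
        values = []
        for _ in range(4):
            value, pos = decode_varint(buf, pos)
            values.append(value)
        return cls(*values)


class PathStatusTracker:
    """Keeps the newest PATH_STATUS frame accepted for each path."""

    def __init__(self) -> None:
        self._latest: Dict[int, PathStatusFrame] = {}

    def on_frame(self, frame: PathStatusFrame) -> bool:
        """Return True if the frame is applied, False if ignored."""
        previous = self._latest.get(frame.path_id)
        if previous is not None and previous.sequence >= frame.sequence:
            return False  # stale or duplicate: ignore, per Section 9.1
        self._latest[frame.path_id] = frame
        return True
```

   In this sketch each path keeps its own sequence number space, so a
   frame with sequence number 0 on one path is accepted even after
   higher sequence numbers have been seen on another path.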