idnits 2.17.1

draft-ietf-quic-load-balancers-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 15 instances of too long lines in the document, the longest
     one being 22 characters in excess of 72.

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to use 'NOT RECOMMENDED' as an RFC 2119 keyword, but
     does not include the phrase in its RFC 2119 key words list.

  -- The document date (4 October 2021) is 935 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-18) exists of
     draft-ietf-tls-esni-13

  -- Obsolete informational reference (is this intentional?): RFC 4347
     (Obsoleted by RFC 6347)

  -- Obsolete informational reference (is this intentional?): RFC 6347
     (Obsoleted by RFC 9147)

     Summary: 1 error (**), 0 flaws (~~), 4 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

QUIC                                                             M. Duke
Internet-Draft                                         F5 Networks, Inc.
Intended status: Standards Track                                N. Banks
Expires: 7 April 2022                                          Microsoft
                                                          4 October 2021

            QUIC-LB: Generating Routable QUIC Connection IDs
                   draft-ietf-quic-load-balancers-08

Abstract

   The QUIC protocol design is resistant to transparent packet
   inspection, injection, and modification by intermediaries.  However,
   the server can explicitly cooperate with network services by agreeing
   to certain conventions and/or sharing state with those services.
   This specification provides a standardized means of solving three
   problems: (1) maintaining routability to servers via a low-state load
   balancer even when the connection IDs in use change; (2) explicit
   encoding of the connection ID length in all packets to assist
   hardware accelerators; and (3) injection of QUIC Retry packets by an
   anti-Denial-of-Service agent on behalf of the server.

Note to Readers

   Discussion of this document takes place on the QUIC Working Group
   mailing list (quic@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/browse/quic/.

   Source for this draft and an issue tracker can be found at
   https://github.com/quicwg/load-balancers.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 7 April 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
     1.2.  Notation
   2.  Protocol Objectives
     2.1.  Simplicity
     2.2.  Security
   3.  First CID octet
     3.1.  Config Rotation
     3.2.  Configuration Failover
     3.3.  Length Self-Description
     3.4.  Format
   4.  Load Balancing Preliminaries
     4.1.  Unroutable Connection IDs
     4.2.  Fallback Algorithms
     4.3.  Server ID Allocation
       4.3.1.  Static Allocation
       4.3.2.  Dynamic Allocation
     4.4.  CID format
   5.  Routing Algorithms
     5.1.  Plaintext CID Algorithm
       5.1.1.  Configuration Agent Actions
       5.1.2.  Load Balancer Actions
       5.1.3.  Server Actions
     5.2.  Stream Cipher CID Algorithm
       5.2.1.  Configuration Agent Actions
       5.2.2.  Load Balancer Actions
       5.2.3.  Server Actions
     5.3.  Block Cipher CID Algorithm
       5.3.1.  Configuration Agent Actions
       5.3.2.  Load Balancer Actions
       5.3.3.  Server Actions
   6.  ICMP Processing
   7.  Retry Service
     7.1.  Common Requirements
       7.1.1.  Considerations for Non-Initial Packets
     7.2.  No-Shared-State Retry Service
       7.2.1.  Configuration Agent Actions
       7.2.2.  Service Requirements
       7.2.3.  Server Requirements
     7.3.  Shared-State Retry Service
       7.3.1.  Token Protection with AEAD
       7.3.2.  Configuration Agent Actions
       7.3.3.  Service Requirements
       7.3.4.  Server Requirements
   8.  Configuration Requirements
   9.  Additional Use Cases
     9.1.  Load balancer chains
     9.2.  Moving connections between servers
   10. Version Invariance of QUIC-LB
   11. Security Considerations
     11.1.  Attackers not between the load balancer and server
     11.2.  Attackers between the load balancer and server
     11.3.  Multiple Configuration IDs
     11.4.  Limited configuration scope
     11.5.  Stateless Reset Oracle
     11.6.  Connection ID Entropy
     11.7.  Shared-State Retry Keys
     11.8.  Resource Consumption of the SID table
   12. IANA Considerations
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Appendix A.  QUIC-LB YANG Model
     A.1.  Tree Diagram
   Appendix B.  Load Balancer Test Vectors
     B.1.  Plaintext Connection ID Algorithm
     B.2.  Stream Cipher Connection ID Algorithm
     B.3.  Block Cipher Connection ID Algorithm
     B.4.  Shared State Retry Tokens
   Appendix C.  Interoperability with DTLS over UDP
     C.1.  DTLS 1.0 and 1.2
     C.2.  DTLS 1.3
     C.3.  Future Versions of DTLS
   Appendix D.  Acknowledgments
   Appendix E.
   Change Log
     E.1.  since draft-ietf-quic-load-balancers-07
     E.2.  since draft-ietf-quic-load-balancers-06
     E.3.  since draft-ietf-quic-load-balancers-05
     E.4.  since draft-ietf-quic-load-balancers-04
     E.5.  since-draft-ietf-quic-load-balancers-03
     E.6.  since-draft-ietf-quic-load-balancers-02
     E.7.  since-draft-ietf-quic-load-balancers-01
     E.8.  since-draft-ietf-quic-load-balancers-00
     E.9.  Since draft-duke-quic-load-balancers-06
     E.10. Since draft-duke-quic-load-balancers-05
     E.11. Since draft-duke-quic-load-balancers-04
     E.12. Since draft-duke-quic-load-balancers-03
     E.13. Since draft-duke-quic-load-balancers-02
     E.14. Since draft-duke-quic-load-balancers-01
     E.15. Since draft-duke-quic-load-balancers-00
   Authors' Addresses

1.  Introduction

   QUIC packets [RFC9000] usually contain a connection ID to allow
   endpoints to associate packets with different address/port 4-tuples
   to the same connection context.  This feature makes connections
   robust in the event of NAT rebinding.  QUIC endpoints usually
   designate the connection ID which peers use to address packets.
   Server-generated connection IDs create a potential need for
   out-of-band communication to support QUIC.

   QUIC allows servers (or load balancers) to designate an initial
   connection ID to encode useful routing information for load
   balancers.  It also encourages servers, in packets protected by
   cryptography, to provide additional connection IDs to the client.
   This allows clients that know they are going to change IP address or
   port to use a separate connection ID on the new path, thus reducing
   linkability as clients move through the world.

   There is a tension between the requirements to provide routing
   information and mitigate linkability.  Ultimately, because new
   connection IDs are in protected packets, they must be generated at
   the server if the load balancer does not have access to the
   connection keys.  However, it is the load balancer that has the
   context necessary to generate a connection ID that encodes useful
   routing information.  In the absence of any shared state between load
   balancer and server, the load balancer must maintain a relatively
   expensive table of server-generated connection IDs, and will not
   route packets correctly if they use a connection ID that was
   originally communicated in a protected NEW_CONNECTION_ID frame.

   This specification provides common algorithms for encoding the server
   mapping in a connection ID given some shared parameters.  The mapping
   is generally only discoverable by observers that have the parameters,
   preserving unlinkability as much as possible.

   Aside from load balancing, a QUIC server may also desire to offload
   other protocol functions to trusted intermediaries.  These
   intermediaries might include hardware assist on the server host
   itself, without access to fully decrypted QUIC packets.  For example,
   this document specifies a means of offloading stateless retry to
   counter Denial of Service attacks.  It also proposes a system for
   self-encoding connection ID length in all packets, so that crypto
   offload can consistently look up key information.

   While this document describes a small set of configuration parameters
   to make the server mapping intelligible, the means of distributing
   these parameters between load balancers, servers, and other trusted
   intermediaries is out of its scope.  There are numerous well-known
   infrastructures for distribution of configuration.

1.1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS.  Lower case uses of these words are not to be
   interpreted as carrying significance described in RFC 2119.

   In this document, "client" and "server" refer to the endpoints of a
   QUIC connection unless otherwise indicated.  A "load balancer" is an
   intermediary for that connection that does not possess QUIC
   connection keys, but it may rewrite IP addresses or conduct other IP
   or UDP processing.  A "configuration agent" is the entity that
   determines the QUIC-LB configuration parameters for the network and
   leverages some system to distribute that configuration.

   Note that stateful load balancers that act as proxies, by terminating
   a QUIC connection with the client and then retrieving data from the
   server using QUIC or another protocol, are treated as a server with
   respect to this specification.

   For brevity, "Connection ID" will often be abbreviated as "CID".

1.2.  Notation

   All wire formats will be depicted using the notation defined in
   Section 1.3 of [RFC9000].
   There is one addition: the function len()
   refers to the length of a field which can serve as a limit on a
   different field, so that the lengths of two fields can be concisely
   defined as limited to a sum, for example:

   x(A..B) y(C..B-len(x))

   indicates that x can be of any length between A and B, and y can be
   of any length between C and B provided that (len(x) + len(y)) does
   not exceed B.

   The example below illustrates the basic framework:

   Example Structure {
     One-bit Field (1),
     7-bit Field with Fixed Value (7) = 61,
     Field with Variable-Length Integer (i),
     Arbitrary-Length Field (..),
     Variable-Length Field (8..24),
     Variable-Length Field with Dynamic Limit (8..24-len(Variable-Length Field)),
     Field With Minimum Length (16..),
     Field With Maximum Length (..128),
     [Optional Field (64)],
     Repeated Field (8) ...,
   }

                         Figure 1: Example Format

2.  Protocol Objectives

2.1.  Simplicity

   QUIC is intended to provide unlinkability across connection
   migration, but servers are not required to provide additional
   connection IDs that effectively prevent linkability.  If the
   coordination scheme is too difficult to implement, servers behind
   load balancers using connection IDs for routing will use trivially
   linkable connection IDs.  Clients will therefore be forced to choose
   between terminating the connection during migration or remaining
   linkable, subverting a design objective of QUIC.

   The solution should be both simple to implement and require little
   additional infrastructure for cryptographic keys, etc.

2.2.  Security

   In the limit where there are very few connections to a pool of
   servers, no scheme can prevent the linking of two connection IDs with
   high probability.
   In the opposite limit, where all servers have many
   connections that start and end frequently, it will be difficult to
   associate two connection IDs even if they are known to map to the
   same server.

   QUIC-LB is relevant in the region between these extremes: when the
   information that two connection IDs map to the same server is helpful
   to linking two connection IDs.  Obviously, any scheme that
   transparently communicates this mapping to outside observers
   compromises QUIC's defenses against linkability.

   Though not an explicit goal of the QUIC-LB design, concealing the
   server mapping also complicates attempts to focus attacks on a
   specific server in the pool.

3.  First CID octet

   The first octet of a Connection ID is reserved for two special
   purposes, one mandatory (config rotation) and one optional (length
   self-description).

   Subsequent sections of this document refer to the contents of this
   octet as the "first octet."

3.1.  Config Rotation

   The first two bits of any connection ID MUST encode an identifier for
   the configuration that the connection ID uses.  This enables
   incremental deployment of new QUIC-LB settings (e.g., keys).

   When new configuration is distributed to servers, there will be a
   transition period when connection IDs reflecting old and new
   configuration coexist in the network.  The rotation bits allow load
   balancers to apply the correct routing algorithm and parameters to
   incoming packets.

   Configuration Agents SHOULD deliver new configurations to load
   balancers before doing so to servers, so that load balancers are
   ready to process CIDs using the new parameters when they arrive.

   A Configuration Agent SHOULD NOT use a codepoint to represent a new
   configuration until it takes precautions to make sure that all
   connections using CIDs with an old configuration at that codepoint
   have closed or transitioned.
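
   As an illustration of how a load balancer might use the rotation
   bits, the following minimal Python sketch reads the two most
   significant bits of the first octet and looks up the matching
   configuration.  The function names and the dictionary used as a
   configuration table are hypothetical and are not defined by this
   document.

```python
from typing import Dict, Optional


def config_rotation_bits(cid: bytes) -> int:
    """Return the first two (most significant) bits of the first CID octet."""
    if not cid:
        raise ValueError("connection ID is empty")
    return cid[0] >> 6


def select_config(cid: bytes, configs: Dict[int, dict]) -> Optional[dict]:
    """Look up the configuration for a CID.

    `configs` is a hypothetical table keyed by the config rotation
    codepoint. None means no active configuration uses that codepoint.
    """
    return configs.get(config_rotation_bits(cid))
```

   During a transition period the table would hold both the old and the
   new configuration under different codepoints, so CIDs of either
   generation select the right parameters.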

   Servers MUST NOT generate new connection IDs using an old
   configuration after receiving a new one from the configuration agent.
   Servers MUST send NEW_CONNECTION_ID frames that provide CIDs using
   the new configuration, and retire CIDs using the old configuration
   using the "Retire Prior To" field of that frame.

   It is also possible to use these bits for more long-lived distinction
   of different configurations, but this has privacy implications (see
   Section 11.3).

3.2.  Configuration Failover

   If a server has not received a valid QUIC-LB configuration, and
   believes that low-state, Connection-ID aware load balancers are in
   the path, it SHOULD generate connection IDs with the config rotation
   bits set to '11' and SHOULD use the "disable_active_migration"
   transport parameter in all new QUIC connections.  It SHOULD NOT send
   NEW_CONNECTION_ID frames with new values.

   A load balancer that sees a connection ID with config rotation bits
   set to '11' MUST revert to 5-tuple routing.

3.3.  Length Self-Description

   Local hardware cryptographic offload devices may accelerate QUIC
   servers by receiving keys from the QUIC implementation indexed to the
   connection ID.  However, on physical devices operating multiple QUIC
   servers, it is impractical to efficiently look up these keys if the
   connection ID does not self-encode its own length.

   Note that this is a function of particular server devices and is
   irrelevant to load balancers.  As such, load balancers MAY omit this
   from their configuration.  However, the remaining 6 bits in the first
   octet of the Connection ID are reserved to express the length of the
   following connection ID, not including the first octet.

   A server not using this functionality SHOULD make the six bits appear
   to be random.

3.4.
Format

   First Octet {
     Config Rotation (2),
     CID Len or Random Bits (6),
   }

                       Figure 2: First Octet Format

   The first octet has the following fields:

   Config Rotation:  Indicates the configuration used to interpret the
      CID.

   CID Len or Random Bits:  Length Self-Description (if applicable), or
      random bits otherwise.  When used for Length Self-Description,
      encodes the length of the Connection ID following the First Octet.

4.  Load Balancing Preliminaries

   In QUIC-LB, load balancers do not generate individual connection IDs
   for servers.  Instead, they communicate the parameters of an
   algorithm to generate routable connection IDs.

   The algorithms differ in the complexity of configuration at both load
   balancer and server.  Increasing complexity improves obfuscation of
   the server mapping.

   This section describes three participants: the configuration agent,
   the load balancer, and the server.  For any given QUIC-LB
   configuration that enables connection-ID-aware load balancing, there
   must be a choice of (1) routing algorithm, (2) server ID allocation
   strategy, and (3) algorithm parameters.

   Fundamentally, servers generate connection IDs that encode their
   server ID.  Load balancers decode the server ID from the CID in
   incoming packets to route to the correct server.

   There are situations where a server pool might be operating two or
   more routing algorithms or parameter sets simultaneously.  The load
   balancer uses the first two bits of the connection ID to multiplex
   incoming DCIDs over these schemes (see Section 3.1).

4.1.  Unroutable Connection IDs

   QUIC-LB servers will generate Connection IDs that are decodable to
   extract a server ID in accordance with a specified algorithm and
   parameters.  However, QUIC often uses client-generated Connection IDs
   prior to receiving a packet from the server.

   These client-generated CIDs might not conform to the expectations of
   the routing algorithm and therefore not be routable by the load
   balancer.  Those that are not routable are "unroutable DCIDs" and
   receive similar treatment regardless of why they're unroutable:

   *  The config rotation bits (Section 3.1) may not correspond to an
      active configuration.  Note: a packet with a DCID that indicates
      5-tuple routing (see Section 3.2) is always routable.

   *  The DCID might not be long enough for the decoder to process.

   *  The extracted server mapping might not correspond to an active
      server.

   All other DCIDs are routable.

   Load balancers MUST forward packets with routable DCIDs to a server
   in accordance with the chosen routing algorithm.

   Load balancers SHOULD drop short header packets with unroutable
   DCIDs.

   The routing of long headers with unroutable DCIDs depends on the
   server ID allocation strategy, described in Section 4.3.  However,
   the load balancer MUST NOT drop these packets, with one exception.

   Load balancers MAY drop packets with long headers and unroutable
   DCIDs if and only if they know that the encoded QUIC version does not
   allow an unroutable DCID in a packet with that signature.  For
   example, a load balancer can safely drop a QUIC version 1 Handshake
   packet with an unroutable DCID, as a version 1 Handshake packet sent
   to a QUIC-LB routable server will always have a server-generated
   routable CID.  The prohibition against dropping packets with long
   headers remains for unknown QUIC versions.

   Furthermore, while the load balancer function MUST NOT drop packets,
   the device might implement other security policies, outside the scope
   of this specification, that might force a drop.

   Servers that receive packets with unroutable CIDs MUST use the
   available mechanisms to induce the client to use a routable CID in
   future packets.
   In QUIC version 1, this requires using a routable
   CID in the Source CID field of server-generated long headers.

4.2.  Fallback Algorithms

   There are conditions described below where a load balancer routes a
   packet using a "fallback algorithm."  It can choose any algorithm,
   without coordination with the servers, but the algorithm SHOULD be
   deterministic over short time scales so that related packets go to
   the same server.  The design of this algorithm SHOULD consider the
   version-invariant properties of QUIC described in [RFC8999] to
   maximize its robustness to future versions of QUIC.

   A fallback algorithm MUST NOT make the routing behavior dependent on
   any bits in the first octet of the QUIC packet header, except the
   first bit, which indicates a long header.  All other bits are QUIC
   version-dependent and intermediaries SHOULD NOT base their design on
   version-specific templates.

   For example, one fallback algorithm might convert an unroutable DCID
   to an integer and divide it by the number of servers, using the
   remainder to select the server to which the packet is forwarded.  The
   number of servers is usually consistent on the time scale of a QUIC
   connection handshake.  Another might simply hash the address/port
   4-tuple.  See also Section 10.

4.3.  Server ID Allocation

   For any given configuration, the configuration agent must specify if
   server IDs will be statically or dynamically allocated.  Load
   Balancer configurations with statically allocated server IDs
   explicitly include a mapping of server IDs to forwarding addresses.
   The corresponding server configurations contain one or more unique
   server IDs.

   A dynamically allocated configuration does not have a pre-defined
   assignment, reducing configuration complexity.  However, it places
   limits on the maximum server ID length and requires more state at the
   load balancer.
   In certain edge cases, it can force parts of the
   system to fail over to 5-tuple routing for a short time.

   In either case, the configuration agent chooses a server ID length
   for each configuration that MUST be at least one octet.  For static
   allocation, the maximum length depends on the algorithm.  For dynamic
   allocation, the maximum length is 7 octets.

   A QUIC-LB configuration MAY significantly over-provision the server
   ID space (i.e., provide far more codepoints than there are servers)
   to increase the probability that a randomly generated Destination
   Connection ID is unroutable.

   Conceptually, each configuration has its own set of server ID
   allocations, though two static configurations with identical server
   ID lengths MAY use a common allocation between them.

   A server encodes one of its assigned server IDs in any CID it
   generates using the relevant configuration.

4.3.1.  Static Allocation

   In the static allocation method, the configuration agent assigns at
   least one server ID to each server.

   Load balancers MUST forward packets with long headers and unroutable
   DCIDs using a fallback algorithm as specified in Section 4.2.

4.3.2.  Dynamic Allocation

   In the dynamic allocation method, the load balancer assigns server
   IDs dynamically so that configuration does not require fixed server
   ID assignment.  This reduces linkability and simplifies
   configuration.  However, it also limits the length of the server ID
   and requires the load balancer to lie on the path of outbound
   packets.  As the server mapping is no longer part of the
   configuration, standby load balancers need an out-of-band mechanism
   to synchronize server ID allocations in the event of failures of the
   primary device.

   To summarize, the load balancer forwards incoming Initial packets
   arbitrarily and both load balancer and server are sometimes able to
   infer a potential server ID allocation from the CID in the packet.
   The server can signal acceptance of that allocation by using it
   immediately, in which case both entities add it to their permanent
   table.  Usually, however, the server will reject the allocation by
   not using it, in which case it is not added to the permanent
   assignment list.

4.3.2.1.  Configuration Agent Actions

   The configuration agent does not assign server IDs, but does
   configure a server ID length.  The server ID MUST be at least one and
   no more than seven octets.  See Section 11.8 for other considerations
   if also using the Plaintext CID algorithm.

4.3.2.2.  Load Balancer Actions

   The load balancer maintains a mapping of assigned server IDs to
   routing information for servers, initialized as empty.  This mapping
   is independent for each operating configuration.

   Note that when the load balancer's tables for a configuration are
   empty, all incoming DCIDs corresponding to that configuration are
   unroutable by definition.

   The load balancer processes a long header packet as follows:

   *  If the config rotation bits do not match a known configuration,
      the load balancer routes the packet using a fallback algorithm
      (see Section 4.2).  It does not extract a server ID.

   *  If there is a matching configuration, but the CID is not long
      enough to apply the algorithm, the load balancer pads the
      connection ID with zeros to the required length.

   *  Otherwise, the load balancer extracts the server ID in accordance
      with the configured algorithm and parameters.

   If the load balancer extracted a server ID already in its mapping, it
   routes the packet accordingly.
   If the server ID is not in the
   mapping, it routes the packet according to a fallback algorithm and
   awaits the first long header the server sends in response.

   If the load balancer extracted an unassigned server ID and observes
   that the first long header packet the server sends has a Source
   Connection ID that encodes the same server ID, it adds that server ID
   to the mapping.  Otherwise, it takes no action.

4.3.2.3.  Server Actions

   Each server maintains a list of server IDs assigned to it,
   initialized empty.

   Upon receipt of a packet with a client-generated DCID, the server
   MUST follow these steps in order:

   *  If the config rotation bits do not correspond to a known
      configuration, do not attempt to extract a server ID.

   *  If the DCID is not long enough to decode using the configured
      algorithm, pad it with zeros to the required length and extract a
      server ID.

   *  If the DCID is long enough to decode, extract the server ID.

   If the server ID is not already in its list, the server MUST decide
   whether or not to immediately use it to encode a CID on the new
   connection.  If it chooses to use it, it adds the server ID to its
   list.  If it does not, it MUST NOT use the server ID in future CIDs.

   The server SHOULD NOT use more than one CID, unless it is close to
   exhausting the nonces for an existing assignment.  Note also that the
   load balancer may observe a single entity claiming multiple server
   IDs because that entity actually represents multiple server devices
   or processors.

   The server MUST generate a new connection ID if the client-generated
   CID is of insufficient length for the configuration.

   The server then processes the packet normally.

   When a server needs a new connection ID, it uses one of the server
   IDs in its list to populate the server ID field of that CID.  It MAY
   vary this selection to reduce linkability within a connection.
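
   The server steps above can be sketched in Python as follows.  This is
   a minimal illustration, not a definitive implementation: the class
   and method names are hypothetical, and the sketch assumes an
   algorithm in which the server ID occupies the octets immediately
   after the first octet without encryption.  A cipher-based algorithm
   would decode the server ID differently.

```python
from typing import List, Optional


class DynamicSidServer:
    """Hypothetical server-side state for dynamic server ID allocation."""

    def __init__(self, config_id: int, sid_len: int):
        # Dynamically allocated server IDs are one to seven octets long.
        if not 1 <= sid_len <= 7:
            raise ValueError("server ID length must be 1..7 octets")
        self.config_id = config_id
        self.sid_len = sid_len
        self.sids: List[bytes] = []  # list of assigned IDs, initially empty

    def candidate_sid(self, dcid: bytes) -> Optional[bytes]:
        """Extract a potential server ID from a client-generated DCID."""
        # Step 1: unknown config rotation bits, so do not extract anything.
        if not dcid or dcid[0] >> 6 != self.config_id:
            return None
        # Step 2: pad a short DCID with zeros to the required length.
        # Assumption: the required length is first octet plus server ID.
        needed = 1 + self.sid_len
        padded = dcid.ljust(needed, b"\x00")
        # Step 3: extract the server ID.
        return padded[1:needed]

    def on_client_dcid(self, dcid: bytes, accept: bool) -> Optional[bytes]:
        """Return a server ID usable for this connection, if any."""
        sid = self.candidate_sid(dcid)
        if sid is None:
            return None
        if sid in self.sids:
            return sid
        if accept:
            # Using the ID immediately signals acceptance of the allocation.
            self.sids.append(sid)
            return sid
        return None
```

   A server ID that is not accepted is not added to the list and is
   therefore never used by this sketch when generating CIDs.  How the
   server decides whether to accept is left to the caller through the
   `accept` argument.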
624 After loading a new configuration, a server may not have any 625 available SIDs. This is because an incoming packet may not contain 626 the config rotation bits necessary to extract a server ID in 627 accordance with the algorithm above. When required to generate a CID 628 under these conditions, the server MUST generate CIDs using the 629 5-tuple routing codepoint (see Section 3.2). Note that these 630 connections will not be robust to client address changes while they 631 use this connection ID. For this reason, a server SHOULD retire 632 these connection IDs and replace them with routable ones once it 633 receives a client-generated CID that allows it to acquire a server 634 ID. As, statistically, one in every four such CIDs can provide a 635 server ID, this is typically a short interval. 637 4.4. CID Format 639 All connection IDs use the following format: 641 QUIC-LB Connection ID { 642 First Octet (8), 643 Server ID (..), 644 Nonce (..), 645 For Server Use (..), 646 } 648 Figure 3: CID Format 650 Each configuration specifies the length of the Server ID and Nonce 651 fields, with limits defined for each algorithm. 653 The Server ID is assigned to each server in accordance with 654 Section 4.3. Dynamically allocated SIDs are limited to seven octets 655 or fewer. Statically allocated ones have different limits for each 656 algorithm. 658 The Nonce is selected by the server when it generates a CID. As the 659 name implies, a server MUST use a nonce no more than once when 660 generating a CID for a given server ID and unique set of 661 configuration parameters. Limits on the length of the nonce are 662 different for each algorithm. 664 The First Octet, Server ID, and Nonce comprise the minimum length 665 Connection ID for any given algorithm. The load balancer need not 666 know the full connection ID length to successfully process a packet, 667 provided that the connection ID is at least this minimum length.
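The field layout in Figure 3 can be illustrated with a short sketch. It shows only the unencrypted layout, before any routing algorithm is applied, and assumes the 20-octet maximum connection ID length of QUIC version 1. The function and type names are hypothetical.

```python
import os
from typing import NamedTuple

MAX_CID_LEN = 20  # maximum connection ID length in QUIC version 1


class QuicLbCid(NamedTuple):
    first_octet: int
    server_id: bytes
    nonce: bytes
    server_use: bytes


def build_cid(first_octet: int, server_id: bytes, nonce: bytes,
              server_use_len: int = 0) -> bytes:
    """Concatenate the fields of Figure 3; For Server Use is filled randomly."""
    cid = bytes([first_octet]) + server_id + nonce + os.urandom(server_use_len)
    if len(cid) > MAX_CID_LEN:
        raise ValueError("connection ID exceeds the version 1 length limit")
    return cid


def split_cid(cid: bytes, sid_len: int, nonce_len: int) -> QuicLbCid:
    """Split a CID using only the configured Server ID and Nonce lengths."""
    minimum = 1 + sid_len + nonce_len
    if len(cid) < minimum:
        raise ValueError("connection ID is shorter than the configured minimum")
    return QuicLbCid(
        first_octet=cid[0],
        server_id=cid[1:1 + sid_len],
        nonce=cid[1 + sid_len:minimum],
        server_use=cid[minimum:],  # opaque to the load balancer
    )
```

Note that `split_cid` never needs the total CID length, which mirrors the statement that the load balancer can process a packet knowing only the minimum length.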
669 The For Server Use field has any value and length chosen by the 670 server, within the connection ID length limits in the operative QUIC 671 version. It SHOULD appear random and SHOULD NOT link two connection 672 IDs to the same connection, or indicate they originate from the same 673 server. 675 5. Routing Algorithms 677 Encryption in the algorithms below uses the AES-128-ECB cipher. 678 Future standards could add new algorithms that use other ciphers to 679 provide cryptographic agility in accordance with [RFC7696]. QUIC-LB 680 implementations SHOULD be extensible to support new algorithms. 682 5.1. Plaintext CID Algorithm 684 The Plaintext CID Algorithm makes no attempt to obscure the mapping 685 of connections to servers, significantly increasing linkability. 687 5.1.1. Configuration Agent Actions 689 For static SID allocation, the server ID length is limited to 16 690 octets. The nonce length MUST be zero. 692 5.1.2. Load Balancer Actions 694 On each incoming packet, the load balancer extracts consecutive 695 octets, beginning with the second octet. These bytes represent the 696 server ID. 698 5.1.3. Server Actions 700 The server chooses how many octets to reserve for its own use, which 701 MUST be at least one octet. 703 When a server needs a new connection ID, it encodes one of its 704 assigned server IDs in consecutive octets beginning with the second. 706 5.2. Stream Cipher CID Algorithm 708 The Stream Cipher CID algorithm provides cryptographic protection at 709 the cost of additional per-packet processing at the load balancer to 710 decrypt every incoming connection ID. The CID format is depicted 711 below. 713 5.2.1. Configuration Agent Actions 715 The configuration agent assigns a server ID to every server in its 716 pool, and determines a server ID length (in octets) sufficiently 717 large to encode all server IDs, including potential future servers. 719 The nonce length MUST be no fewer than 4 and no more than 16 octets. 
721 The server ID length and nonce length MUST sum to 19 or fewer octets, 722 and SHOULD sum to 15 or fewer octets to allow space for server use. 724 5.2.2. Load Balancer Actions 726 Upon receipt of a QUIC packet, the load balancer extracts as many of 727 the earliest octets from the destination connection ID as necessary 728 to match the server ID. The nonce immediately follows. 730 The load balancer decrypts the nonce and the server ID using the 731 following three pass algorithm: 733 * Pass 1: The load balancer decrypts the server ID using 128-bit AES 734 Electronic Codebook (ECB) mode, much like QUIC header protection. 735 The encrypted nonce octets are zero-padded to 16 octets. AES-ECB 736 encrypts this encrypted nonce using its key to generate a mask 737 which it applies to the encrypted server id. This provides an 738 intermediate value of the server ID, referred to as server-id 739 intermediate. 741 server_id_intermediate = encrypted_server_id ^ AES-ECB(key, padded- 742 encrypted-nonce) 744 * Pass 2: The load balancer decrypts the nonce octets using 128-bit 745 AES ECB mode, using the server-id intermediate as "nonce" for this 746 pass. The server-id intermediate octets are zero-padded to 16 747 octets. AES-ECB encrypts this padded server-id intermediate using 748 its key to generate a mask which it applies to the encrypted 749 nonce. This provides the decrypted nonce value. 751 nonce = encrypted_nonce ^ AES-ECB(key, padded-server_id_intermediate) 753 * Pass 3: The load balancer decrypts the server ID using 128-bit AES 754 ECB mode. The nonce octets are zero-padded to 16 octets. AES-ECB 755 encrypts this nonce using its key to generate a mask which it 756 applies to the intermediate server id. This provides the 757 decrypted server ID. 759 server_id = server_id_intermediate ^ AES-ECB(key, padded-nonce) 760 For example, if the nonce length is 10 octets and the server ID 761 length is 2 octets, the connection ID can be as small as 13 octets. 
762 The load balancer uses the second through eleventh octets of the 763 connection ID for the nonce, zero-pads it to 16 octets, encrypts it, and xors the 764 first two octets of the result with the twelfth and thirteenth octets. The result is padded 765 with 14 octets of zeros and encrypted to obtain a mask that is xored 766 with the nonce octets. Finally, the nonce octets are padded with six 767 octets of zeros, encrypted, and the first two octets xored with the 768 server ID octets to obtain the actual server ID. 770 This three-pass algorithm is a simplified version of the FFX 771 algorithm, with the property that each encrypted nonce value depends 772 on all server ID bits, and each encrypted server ID bit depends on 773 all nonce bits and all server ID bits. This mitigates attacks 774 against stream ciphers in which attackers simply flip encrypted 775 server-ID bits. 777 The output of the decryption is the server ID that the load balancer 778 uses for routing. 780 5.2.3. Server Actions 782 When generating a routable connection ID, the server writes arbitrary 783 bits into its nonce octets, and its provided server ID into the 784 server ID octets. Servers MAY opt to have a longer connection ID 785 beyond the nonce and server ID. The additional bits MAY encode 786 additional information, but SHOULD appear essentially random to 787 observers. 789 If the decrypted nonce bits increase monotonically, that guarantees 790 that nonces are not reused between connection IDs from the same 791 server. 793 The server encrypts the server ID using exactly the algorithm 794 described in Section 5.2.2, performing the three passes in reverse 795 order. 797 5.3. Block Cipher CID Algorithm 799 The Block Cipher CID Algorithm, by using a full 16 octets of 800 plaintext and a 128-bit cipher, provides higher cryptographic 801 protection and detection of unroutable connection IDs. However, it 802 also requires connection IDs of at least 17 octets, increasing 803 overhead of client-to-server packets. 805 5.3.1.
Configuration Agent Actions 807 The server ID length MUST be no more than 12 octets. The server ID 808 length and nonce length MUST sum to exactly 16 octets. 810 The configuration agent also selects a 16-octet AES-ECB key to use 811 for connection ID decryption. 813 5.3.2. Load Balancer Actions 815 Upon receipt of a QUIC packet, the load balancer reads the first 816 octet to obtain the config rotation bits. It then decrypts the 817 subsequent 16 octets using AES-ECB decryption and the chosen key. 819 The decrypted plaintext contains the server ID and opaque server data 820 in that order. The load balancer uses the server ID octets for 821 routing. 823 5.3.3. Server Actions 825 The server encrypts both its server ID and a nonce in a 16-octet block 826 with the configured AES-ECB key. 828 6. ICMP Processing 830 For protocols where 4-tuple load balancing is sufficient, it is 831 straightforward to deliver ICMP packets from the network to the 832 correct server, by reading the echoed IP and transport-layer headers 833 to obtain the 4-tuple. When routing is based on connection ID, 834 further measures are required, as most QUIC packets that trigger ICMP 835 responses will only contain a client-generated connection ID that 836 contains no routing information. 838 To solve this problem, load balancers MAY maintain a mapping of 839 client IP and port to server ID based on recently observed packets. 841 Alternatively, servers MAY implement the technique described in 842 Section 14.4.1 of [RFC9000] to increase the likelihood a Source 843 Connection ID is included in ICMP responses to Path Maximum 844 Transmission Unit (PMTU) probes. Load balancers MAY parse the echoed 845 packet to extract the Source Connection ID, if it contains a QUIC 846 long header, and extract the Server ID as if it were in a Destination 847 CID. 849 7.
Retry Service 851 When a server is under load, QUICv1 allows it to defer storage of 852 connection state until the client proves it can receive packets at 853 its advertised IP address. Through the use of a Retry packet, a 854 token in subsequent client Initial packets, and transport parameters, 855 servers verify address ownership and clients verify that there is no 856 on-path attacker generating Retry packets. 858 A "Retry Service" detects potential Denial of Service attacks and 859 handles sending of Retry packets on behalf of the server. As it is, 860 by definition, literally an on-path entity, the service must 861 communicate some of the original connection IDs back to the server so 862 that it can pass client verification. It also must either verify the 863 address itself (with the server trusting this verification) or make 864 sure there is common context for the server to verify the address 865 using a service-generated token. 867 There are two different mechanisms to allow offload of DoS mitigation 868 to a trusted network service. One requires no shared state; the 869 server need only be configured to trust a retry service, though this 870 imposes other operational constraints. The other requires a shared 871 key, but has no such constraints. 873 7.1. Common Requirements 875 Regardless of mechanism, a retry service has an active mode, where it 876 is generating Retry packets, and an inactive mode, where it is not, 877 based on its assessment of server load and the likelihood an attack 878 is underway. The choice of mode MAY be made on a per-packet or per- 879 connection basis, through a stochastic process or based on client 880 address. 882 A configuration agent MUST distribute a list of QUIC versions the 883 Retry Service supports. It MAY also distribute either an "Allow- 884 List" or a "Deny-List" of other QUIC versions. It MUST NOT 885 distribute both an Allow-List and a Deny-List. 
887 The Allow-List or Deny-List MUST NOT include any versions included 888 for Retry Service Support. 890 The configuration agent MUST provide a means for the entity that 891 controls the Retry Service to report its supported version(s) to the 892 configuration agent. If the entity has not reported this 893 information, it MUST NOT activate the Retry Service and the 894 configuration agent MUST NOT distribute configuration that activates 895 it. 897 The configuration agent MAY delete versions from the final supported 898 version list if policy does not require the Retry Service to operate 899 on those versions. 901 The configuration agent MUST provide a means for the entities that 902 control servers behind the Retry Service to report either an Allow- 903 List or a Deny-List. 905 If all entities supply Allow-Lists, the consolidated list MUST be the 906 union of these sets. If all entities supply Deny-Lists, the 907 consolidated list MUST be the intersection of these sets. 909 If entities provide a mixture of Allow-Lists and Deny-Lists, the 910 consolidated list MUST be a Deny-List that is the intersection of all 911 provided Deny-Lists and the inverses of all Allow-Lists. 913 If no entities that control servers have reported Allow-Lists or 914 Deny-Lists, the default is a Deny-List with the null set (i.e., all 915 unsupported versions will be admitted). This preserves the future 916 extensibility of QUIC. 918 A retry service MUST forward all packets for a QUIC version it does 919 not support that are present on an Allow-List or absent from a Deny-List. 920 Note that if servers support versions the retry service does not, 921 this may increase load on the servers. 923 Note that future versions of QUIC might not have Retry packets, 924 require different information in Retry, or use different packet type 925 indicators. 927 7.1.1.
Considerations for Non-Initial Packets 929 Initial Packets are especially effective at consuming server 930 resources because they cause the server to create connection state. 931 Even when mitigating this load with Retry Packets, the act of 932 validating an Initial Token and sending a Retry Packet is more 933 expensive than the response to a non-Initial packet with an unknown 934 Connection ID: simply dropping it and/or sending a Stateless Reset. 936 Nevertheless, a Retry Service in Active Mode might desire to shield 937 servers from non-Initial packets that do not correspond to a 938 previously admitted Initial Packet. This has a number of 939 considerations. 941 * If a Retry Service maintains no per-flow state whatsoever, it 942 cannot distinguish between valid and invalid non-Initial packets 943 and MUST forward all non-Initial Packets to the server. 945 * For QUIC versions the Retry Service does not support and are 946 present on the Allow-List (or absent from the Deny-List), the 947 Retry Service cannot distinguish Initial Packets from other long 948 headers and therefore MUST admit all long headers. 950 * If a Retry Service keeps per-flow state, it can identify 4-tuples 951 that have been previously approved, admit non-Initial packets from 952 those flows, and drop all others. However, dropping short headers 953 will effectively break Address Migration and NAT Rebinding when in 954 Active Mode, as post-migration packets will arrive with a 955 previously unknown 4-tuple. This policy will also break 956 connection attempts using any new QUIC versions that begin 957 connections with a short header. 959 * If a Retry Service is integrated with a QUIC-LB routable load 960 balancer, it can verify that the Destination Connection ID is 961 routable, and only admit non-Initial packets with routable DCIDs. 962 As the Connection ID encoding is invariant across QUIC versions, 963 the Retry Service can do this for all short headers. 
965 Nothing in this section prevents Retry Services from making basic 966 syntax correctness checks on packets with QUIC versions that it 967 understands (e.g., enforcing the Initial Packet datagram size minimum 968 in version 1) and dropping packets that are not routable with the 969 QUIC specification. 971 7.2. No-Shared-State Retry Service 973 The no-shared-state retry service requires no coordination, except 974 that the server must be configured to accept this service and know 975 which QUIC versions the retry service supports. The scheme uses the 976 first bit of the token to distinguish between tokens from Retry 977 packets (codepoint '0') and tokens from NEW_TOKEN frames (codepoint 978 '1'). 980 7.2.1. Configuration Agent Actions 982 See Section 7.1. 984 7.2.2. Service Requirements 986 A no-shared-state retry service MUST be present on all paths from 987 potential clients to the server. These paths MUST fail to pass QUIC 988 traffic should the service fail for any reason. That is, if the 989 service is not operational, the server MUST NOT be exposed to client 990 traffic. Otherwise, servers that have already disabled their Retry 991 capability would be vulnerable to attack. 993 The path between service and server MUST be free of any potential 994 attackers. Note that this and other requirements above severely 995 restrict the operational conditions in which a no-shared-state retry 996 service can safely operate. 998 Retry tokens generated by the service MUST have the format below. 1000 Non-Shared-State Retry Service Token { 1001 Token Type (1) = 0, 1002 ODCIL (7) = 8..20, 1003 Original Destination Connection ID (64..160), 1004 Opaque Data (..), 1005 } 1007 Figure 4: Format of non-shared-state retry service tokens 1009 The first bit of retry tokens generated by the service MUST be zero. 1010 The token has the following additional fields: 1012 ODCIL: The length of the original destination connection ID from the 1013 triggering Initial packet. 
This is in cleartext to be readable for 1014 the server, but authenticated later in the token. The Retry Service 1015 SHOULD reject any token in which the value is less than 8. 1017 Original Destination Connection ID: This is also in cleartext and 1018 authenticated later. 1020 Opaque Data: This data contains the information necessary to 1021 authenticate the Retry token in accordance with the QUIC 1022 specification. A straightforward implementation would encode the 1023 Retry Source Connection ID, client IP address, and a timestamp in the 1024 Opaque Data. A more space-efficient implementation would use the 1025 Retry Source Connection ID and Client IP as associated data in an 1026 encryption operation, and encode only the timestamp and the 1027 authentication tag in the Opaque Data. If the Initial Packet has 1028 altered the Connection ID or source IP address, authentication of the 1029 token will fail. 1031 Upon receipt of an Initial packet with a token that begins with '0', 1032 the retry service MUST validate the token in accordance with the QUIC 1033 specification. 1035 In active mode, the service MUST issue Retry packets for all client 1036 Initial packets that contain no token, or a token that has the first 1037 bit set to '1'. It MUST NOT forward the packet to the server. The 1038 service MUST validate all tokens with the first bit set to '0'. If 1039 successful, the service MUST forward the packet with the token 1040 intact. If unsuccessful, it MUST drop the packet. The Retry Service 1041 MAY send an Initial Packet containing a CONNECTION_CLOSE frame with 1042 the INVALID_TOKEN error code when dropping the packet. 1044 Note that this scheme has a performance drawback. When the retry 1045 service is in active mode, clients with a token from a NEW_TOKEN 1046 frame will suffer a 1-RTT penalty even though their token provides 1047 proof of address.
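The token layout in Figure 4 and the active-mode rules above can be sketched as follows. This is an illustration only. It assumes that "first bit" means the most significant bit of the first octet, and it leaves token authentication to a caller-supplied `is_valid` callback, since the contents of the Opaque Data are implementation-specific. All function names are hypothetical.

```python
from typing import Callable, Optional, Tuple

RETRY_TOKEN = 0  # Token Type codepoint for tokens from Retry packets
MIN_ODCIL, MAX_ODCIL = 8, 20


def encode_service_token(odcid: bytes, opaque: bytes) -> bytes:
    """Build Token Type (1) = 0, ODCIL (7), ODCID, Opaque Data."""
    if not MIN_ODCIL <= len(odcid) <= MAX_ODCIL:
        raise ValueError("ODCIL must be between 8 and 20 octets")
    # The length fits in 7 bits, so the top (Token Type) bit stays zero.
    return bytes([len(odcid)]) + odcid + opaque


def token_type(token: bytes) -> Optional[int]:
    """Return the first bit of the token, or None if there is no token."""
    return token[0] >> 7 if token else None


def parse_service_token(token: bytes) -> Tuple[bytes, bytes]:
    """Return (original DCID, opaque data) from a Retry Service token."""
    if token_type(token) != RETRY_TOKEN:
        raise ValueError("not a Retry Service token")
    odcil = token[0] & 0x7F
    if not MIN_ODCIL <= odcil <= MAX_ODCIL or len(token) < 1 + odcil:
        raise ValueError("invalid ODCIL")
    return token[1:1 + odcil], token[1 + odcil:]


def active_mode_action(token: bytes, is_valid: Callable[[bytes], bool]) -> str:
    """Decide what an active-mode service does with a client Initial packet."""
    if token_type(token) != RETRY_TOKEN:
        return "retry"  # no token, or a NEW_TOKEN token (first bit '1')
    return "forward" if is_valid(token) else "drop"
```

For example, `active_mode_action(b"", check)` returns `"retry"` for any `check`, because an Initial packet without a token always triggers a Retry in active mode.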
1049 In inactive mode, the service MUST forward all packets that have no 1050 token or a token with the first bit set to '1'. It MUST validate all 1051 tokens with the first bit set to '0'. If successful, the service 1052 MUST forward the packet with the token intact. If unsuccessful, it 1053 MUST either drop the packet or forward it with the token removed. 1054 The latter requires decryption and re-encryption of the entire 1055 Initial packet to avoid authentication failure. Forwarding the 1056 packet causes the server to respond without the 1057 original_destination_connection_id transport parameter, which 1058 preserves the normal QUIC signal to the client that there is an on- 1059 path attacker. 1061 7.2.3. Server Requirements 1063 A server behind a non-shared-state retry service MUST NOT send Retry 1064 packets for a QUIC version the retry service understands. It MAY 1065 send Retry for QUIC versions the Retry Service does not understand. 1067 Tokens sent in NEW_TOKEN frames MUST have the first bit set to '1'. 1069 If a server receives an Initial Packet with a token whose first bit is set to '1', 1070 the token could be from a server-generated NEW_TOKEN frame and should be 1071 processed in accordance with the QUIC specification. If a server 1072 receives an Initial Packet with a token whose first bit is set to '0', it is a Retry 1073 token and the server MUST NOT attempt to validate it. Instead, it 1074 MUST assume the address is validated, MUST include the packet's 1075 Destination Connection ID in a Retry Source Connection ID transport 1076 parameter, and MUST extract the Original Destination Connection ID 1077 from the token cleartext for use in the transport parameter of the 1078 same name. 1080 7.3. Shared-State Retry Service 1082 A shared-state retry service uses a shared key, so that the server 1083 can decode the service's retry tokens.
It does not require that all 1084 traffic pass through the Retry service, so servers MAY send Retry 1085 packets in response to Initial packets that don't include a valid 1086 token. 1088 Both server and service must have time synchronized with respect to 1089 one another to prevent tokens being incorrectly marked as expired, 1090 though tight synchronization is unnecessary. 1092 The tokens are protected using AES128-GCM AEAD, as explained in 1093 Section 7.3.1. All tokens, generated by either the server or retry 1094 service, MUST use the following format, which includes: 1096 * A 1 bit token type identifier. 1098 * A 7 bit token key identifier. 1100 * A 96 bit unique token number transmitted in clear text, but 1101 protected as part of the AEAD associated data. 1103 * A token body, encoding the Original Destination Connection ID and 1104 the Timestamp, optionally followed by server specific Opaque Data. 1106 The token protection uses an 128 bit representation of the source IP 1107 address from the triggering Initial packet. The client IP address is 1108 16 octets. If an IPv4 address, the last 12 octets are zeroes. It 1109 also uses the Source Connection ID of the Retry packet, which will 1110 cause an authentication failure if it differs from the Destination 1111 Connection ID of the packet bearing the token. 1113 If there is a Network Address Translator (NAT) in the server 1114 infrastructure that changes the client IP, the Retry Service MUST 1115 either be positioned behind the NAT, or the NAT must have the token 1116 key to rewrite the Retry token accordingly. Note also that a host 1117 that obtains a token through a NAT and then attempts to connect over 1118 a path that does not have an identically configured NAT will fail 1119 address validation. 1121 The 96 bit unique token number is set to a random value using a 1122 cryptography-grade random number generator. 
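The two inputs described above, the 128-bit client address and the 96-bit unique token number, can be produced as in the following sketch. It uses only the Python standard library; the function names are hypothetical, and the zero-padding of IPv4 addresses follows the rule stated above (the last 12 octets are zeroes).

```python
import ipaddress
import secrets


def token_ip_field(addr: str) -> bytes:
    """Return the 16-octet representation of a client IP address.

    IPv6 addresses are used as-is; IPv4 addresses occupy the first four
    octets and the last 12 octets are zeroes.
    """
    packed = ipaddress.ip_address(addr).packed  # 4 octets (IPv4) or 16 (IPv6)
    return packed.ljust(16, b"\x00")


def new_unique_token_number() -> bytes:
    """Return 96 random bits from a cryptographically strong generator."""
    return secrets.token_bytes(12)
```

Because the service and the server must compute identical values for the same client, both sides need to apply the same padding rule to IPv4 addresses.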
1124 The token key identifier and the corresponding AEAD key and AEAD IV 1125 are provisioned by the configuration agent. 1127 The token body is encoded as follows: 1129 Shared-State Retry Service Token Body { 1130 Timestamp (64), 1131 [ODCIL (8) = 8..20], 1132 [Original Destination Connection ID (64..160)], 1133 [Port (16)], 1134 Opaque Data (..), 1135 } 1137 Figure 5: Body of shared-state retry service tokens 1139 The token body has the following fields: 1141 Timestamp: The Timestamp is a 64-bit integer, in network order, that 1142 expresses the expiration time of the token as a number of seconds in 1143 POSIX time (see Sec. 4.16 of [TIME_T]). 1145 ODCIL: The original destination connection ID length. Tokens in 1146 NEW_TOKEN frames do not have this field. 1148 Original Destination Connection ID: The server or Retry Service 1149 copies this from the field in the client Initial packet. Tokens in 1150 NEW_TOKEN frames do not have this field. 1152 Port: The Source Port of the UDP datagram that triggered the Retry 1153 packet. This field MUST be present if and only if the ODCIL is 1154 greater than zero. This field is therefore always absent in tokens 1155 in NEW_TOKEN frames. 1157 Opaque Data: The server may use this field to encode additional 1158 information, such as congestion window, RTT, or MTU. The Retry 1159 Service MUST have zero-length opaque data. 1161 Some implementations of QUIC encode in the token the Initial Packet 1162 Number used by the client, in order to verify that the client sends 1163 the retried Initial with a PN larger than that of the triggering Initial. 1164 Such implementations will encode the Initial Packet Number as part of 1165 the opaque data. As tokens may be generated by the Service, servers 1166 MUST NOT reject tokens because they lack opaque data and therefore 1167 the packet number. 1169 Shared-state Retry Services use the AES-128-ECB cipher.
Future 1170 standards could add new algorithms that use other ciphers to provide 1171 cryptographic agility in accordance with [RFC7696]. Retry Service 1172 and server implementations SHOULD be extensible to support new 1173 algorithms. 1175 7.3.1. Token Protection with AEAD 1177 On the wire, the token is presented as: 1179 Shared-State Retry Service Token { 1180 Token Type (1), 1181 Key Sequence (7), 1182 Unique Token Number (96), 1183 Encrypted Shared-State Retry Service Token Body (64..), 1184 AEAD Integrity Check Value (128), 1185 } 1187 Figure 6: Wire image of shared-state retry service tokens 1189 The tokens are protected using AES128-GCM as follows: 1191 * The Key Sequence is the 7-bit identifier used to retrieve the token key 1192 and IV. 1194 * The AEAD IV is a 96-bit value produced by the implementer's 1195 custom AEAD IV derivation function. 1197 * The AEAD nonce, N, is formed by combining the AEAD IV with the 96- 1198 bit unique token number. The 96 bits of the unique token number 1199 are left-padded with zeros to the size of the IV. The exclusive 1200 OR of the padded unique token number and the AEAD IV forms the 1201 AEAD nonce. 1203 * The associated data is formatted as a pseudoheader by combining 1204 the cleartext part of the token with the IP address of the client. 1205 The format of the pseudoheader depends on whether the Token Type 1206 bit is '1' (a NEW_TOKEN token) or '0' (a Retry token). 1208 Shared-State Retry Service Token Pseudoheader { 1209 IP Address (128), 1210 Token Type (1), 1211 Key Sequence (7), 1212 Unique Token Number (96), 1213 [RSCIL (8)], 1214 [Retry Source Connection ID (0..160)], 1215 } 1217 Figure 7: Pseudoheader for shared-state retry service tokens 1219 RSCIL: The Retry Source Connection ID Length in octets. This field 1220 is only present when the Token Type is '0'. 1222 Retry Source Connection ID: To create a Retry Token, populate this 1223 field with the Source Connection ID the Retry packet will use.
To 1224 validate a Retry token, populate it with the Destination Connection 1225 ID of the Initial packet that carries the token. This field is only 1226 present when the Token Type is '0'. 1228 * The input plaintext for the AEAD is the token body. The output 1229 ciphertext of the AEAD is transmitted in place of the token body. 1231 * The AEAD Integrity Check Value(ICV), defined in Section 6 of 1232 [RFC4106], is computed as part of the AEAD encryption process, and 1233 is verified during decryption. 1235 7.3.2. Configuration Agent Actions 1237 The configuration agent generates and distributes a "token key", a 1238 "token IV", a key sequence, and the information described in 1239 Section 7.1. 1241 7.3.3. Service Requirements 1243 In inactive mode, the Retry service forwards all packets without 1244 further inspection or processing. The rest of this section only 1245 applies to a service in active mode. 1247 Retry services MUST NOT issue Retry packets except where explicitly 1248 allowed below, to avoid sending a Retry packet in response to a Retry 1249 token. 1251 The service MUST generate Retry tokens with the format described 1252 above when it receives a client Initial packet with no token. 1254 If there is a token of either type, the service MUST attempt to 1255 decrypt it. 1257 To decrypt a packet, the service checks the Token Type and constructs 1258 a pseudoheader with the appropriate format for that type, using the 1259 bearing packet's Destination Connection ID to populate the Retry 1260 Source Connection ID field, if any. 
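The nonce and pseudoheader construction described in this section can be sketched as below. The sketch covers only the inputs to the AEAD; the AES128-GCM operation itself is omitted because it requires a cryptographic library outside the Python standard library. It assumes the Token Type is the most significant bit of the octet it shares with the Key Sequence, and the function names are hypothetical.

```python
def aead_nonce(aead_iv: bytes, unique_token_number: bytes) -> bytes:
    """XOR the 96-bit AEAD IV with the 96-bit unique token number."""
    if len(aead_iv) != 12 or len(unique_token_number) != 12:
        raise ValueError("IV and unique token number must both be 96 bits")
    return bytes(a ^ b for a, b in zip(aead_iv, unique_token_number))


def pseudoheader(ip16: bytes, token_type: int, key_sequence: int,
                 unique_token_number: bytes, rscid: bytes = b"") -> bytes:
    """Build the AEAD associated data laid out in Figure 7.

    For Retry tokens (type 0), rscid is the Retry packet's Source
    Connection ID when creating a token, and the bearing packet's
    Destination Connection ID when validating one.
    """
    if len(ip16) != 16 or len(unique_token_number) != 12:
        raise ValueError("unexpected field length")
    if token_type not in (0, 1) or not 0 <= key_sequence < 128:
        raise ValueError("unexpected token type or key sequence")
    if len(rscid) > 20:
        raise ValueError("Retry Source Connection ID exceeds 20 octets")
    header = ip16 + bytes([(token_type << 7) | key_sequence]) + unique_token_number
    if token_type == 0:
        # RSCIL and Retry Source Connection ID are present only for Retry tokens.
        header += bytes([len(rscid)]) + rscid
    return header
```

Because the connection ID is part of the associated data, a Retry token presented with a different Destination Connection ID yields a different pseudoheader and therefore fails the integrity check.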
1262 A token is invalid if: 1264 * it uses an unknown key sequence, 1266 * the AEAD ICV does not match the expected value (by construction, 1267 it will only match if the client IP Address, and any Retry Source 1268 Connection ID, also matches), 1270 * the ODCIL, if present, is invalid for a client-generated CID (less 1271 than 8 or more than 20 in QUIC version 1), 1273 * the Timestamp of a token points to a time in the past (however, in 1274 order to allow for clock skew, it SHOULD NOT consider tokens to be 1275 expired if the Timestamp encodes a few seconds in the past), or 1277 * the port number, if present, does not match the source port in the 1278 encapsulating UDP header. 1280 Packets with valid tokens MUST be forwarded to the server. 1282 The service MUST drop packets with invalid tokens. If the token is 1283 of type '1' (NEW_TOKEN), it MUST respond with a Retry packet. If of 1284 type '0', it MUST NOT respond with a Retry packet. 1286 7.3.4. Server Requirements 1288 The server MAY issue Retry or NEW_TOKEN tokens in accordance with 1289 [RFC9000]. When doing so, it MUST follow the format above. 1291 The server MUST validate all tokens that arrive in Initial packets, 1292 as they may have bypassed the Retry service. It determines validity 1293 using the procedure in Section 7.3.3. 1295 If the token is a valid Retry token, the server populates the 1296 original_destination_connection_id transport parameter using the 1297 corresponding token field. It populates the 1298 retry_source_connection_id transport parameter with the Destination 1299 Connection ID of the packet bearing the token. 1301 In all other respects, the server processes both valid and invalid 1302 tokens in accordance with [RFC9000]. 1304 For QUIC versions the service does not support, the server MAY use 1305 any token format. 1307 8. Configuration Requirements 1309 QUIC-LB requires common configuration to synchronize understanding of 1310 encodings and guarantee explicit consent of the server.
1312 The load balancer and server MUST agree on a routing algorithm, 1313 server ID allocation method, and the relevant parameters for that 1314 algorithm. 1316 All algorithm configurations can have a server ID length, nonce 1317 length, and key. However, for Plaintext CID, the key is not used and 1318 the nonce length is always zero. For Block Cipher CID, the nonce 1319 length is directly computed from the server ID length. 1321 If server IDs are statically allocated, the load balancer MUST 1322 receive the full table of mappings, and each server must receive its 1323 assigned SID(s), from the configuration agent. 1325 Note that server IDs are opaque bytes, not integers, so there is no 1326 notion of network order or host order. 1328 A server configuration MUST specify if the first octet encodes the 1329 CID length. Note that a load balancer does not need the CID length, 1330 as the required bytes are present in the QUIC packet. 1332 A full QUIC-LB server configuration MUST also specify the supported 1333 QUIC versions of any Retry Service. If a shared-state service, the 1334 server also must have the token key. 1336 A non-shared-state Retry Service need only be configured with the 1337 QUIC versions it supports, and an Allow- or Deny-List. A shared- 1338 state Retry Service also needs the token key, and to be aware if a 1339 NAT sits between it and the servers. 1341 Appendix A provides a YANG Model of a full QUIC-LB configuration. 1343 9. Additional Use Cases 1345 This section discusses considerations for some deployment scenarios 1346 not implied by the specification above. 1348 9.1. Load balancer chains 1350 Some network architectures may have multiple tiers of low-state load 1351 balancers, where a first tier of devices makes a routing decision to 1352 the next tier, and so on, until packets reach the server. Although 1353 QUIC-LB is not explicitly designed for this use case, it is possible 1354 to support it.
1356 If each load balancer is assigned a range of server IDs that is a 1357 subset of the range of IDs assigned to devices that are closer to the 1358 client, then the first devices to process an incoming packet can 1359 extract the server ID and then map it to the correct forwarding 1360 address. Note that this solution is extensible to arbitrarily large 1361 numbers of load-balancing tiers, as the maximum server ID space is 1362 quite large. 1364 9.2. Moving connections between servers 1366 Some deployments may transparently move a connection from one server 1367 to another. The means of transferring connection state between 1368 servers is out of scope of this document. 1370 To support a handover, a server involved in the transition could 1371 issue CIDs that map to the new server via a NEW_CONNECTION_ID frame, 1372 and retire CIDs associated with the old server using the "Retire 1373 Prior To" field in that frame. 1375 Alternatively, if the old server is going offline, the load balancer 1376 could simply map its server ID to the new server's address. 1378 10. Version Invariance of QUIC-LB 1380 Non-shared-state Retry Services are inherently dependent on the 1381 format (and existence) of Retry Packets in each version of QUIC, and 1382 so Retry Service configuration explicitly includes the supported QUIC 1383 versions. 1385 The server ID encodings, and requirements for their handling, are 1386 designed to be QUIC version independent (see [RFC8999]). A QUIC-LB 1387 load balancer will generally not require changes as servers deploy 1388 new versions of QUIC. However, there are several unlikely future 1389 design decisions that could impact the operation of QUIC-LB. 1391 The maximum Connection ID length could be below the minimum necessary 1392 for one or more encoding algorithms. 1394 Section 4.1 provides guidance about how load balancers should handle 1395 unroutable DCIDs.
This guidance, and the implementation of an 1396 algorithm to handle these DCIDs, rest on some assumptions: 1398 * Incoming short headers do not contain DCIDs that are client- 1399 generated. 1401 * The use of client-generated incoming DCIDs does not persist beyond 1402 a few round trips in the connection. 1404 * While the client is using DCIDs it generated, some exposed fields 1405 (IP address, UDP port, client-generated destination Connection ID) 1406 remain constant for all packets sent on the same connection. 1408 * Dynamic server ID allocation is dependent on client-generated 1409 Destination CIDs in Initial Packets being at least 8 octets in 1410 length. If they are not, the load balancer may not be able to 1411 extract a valid server ID to add to its table. Configuring a 1412 shorter server ID length can increase robustness to such a change. 1414 While this document does not update the commitments in [RFC8999], the 1415 additional assumptions are minimal and narrowly scoped, and provide a 1416 likely set of constants that load balancers can use with minimal risk 1417 of version-dependence. 1419 If these assumptions are invalid, this specification is likely to 1420 lead to loss of packets that contain unroutable DCIDs, and in extreme 1421 cases connection failure. 1423 Some load balancers might inspect elements of the Server Name 1424 Indication (SNI) extension in the TLS Client Hello to make a routing 1425 decision. Note that the format and cryptographic protection of this 1426 information may change in future versions or extensions of TLS or 1427 QUIC, and therefore this functionality is inherently not version- 1428 invariant. 1430 11. Security Considerations 1432 QUIC-LB is intended to prevent linkability. Attacks would therefore 1433 attempt to subvert this purpose. 1435 Note that the Plaintext CID algorithm makes no attempt to obscure the 1436 server mapping, and therefore does not address these concerns.
It 1437 exists to allow consistent CID encoding for compatibility across a 1438 network infrastructure, which makes QUIC robust to NAT rebinding. 1439 Servers that are running the Plaintext CID algorithm SHOULD only use 1440 it to generate new CIDs for the Server Initial Packet and SHOULD NOT 1441 send CIDs in QUIC NEW_CONNECTION_ID frames, except to send one 1442 new Connection ID in the event of config rotation (Section 3.1). Sending 1443 additional CIDs might falsely suggest to the client that said CIDs were generated 1444 in a secure fashion. 1446 A linkability attack would find some means of determining that two 1447 connection IDs route to the same server. As described above, there 1448 is no scheme that strictly prevents linkability for all traffic 1449 patterns, and therefore efforts to frustrate any analysis of server 1450 ID encoding have diminishing returns. 1452 11.1. Attackers not between the load balancer and server 1454 Any attacker might open a connection to the server infrastructure and 1455 aggressively simulate migration to obtain a large sample of IDs that 1456 map to the same server. It could then apply analytical techniques to 1457 try to obtain the server encoding. 1459 The Stream and Block Cipher CID algorithms provide robust protection 1460 against any sort of linkage. The Plaintext CID algorithm makes no 1461 attempt to protect this encoding. 1463 Were this analysis to obtain the server encoding, then on-path 1464 observers might apply this analysis to correlate different client 1465 IP addresses. 1467 11.2. Attackers between the load balancer and server 1469 Attackers in this privileged position are intrinsically able to map 1470 two connection IDs to the same server. The QUIC-LB algorithms do 1471 prevent the linkage of two connection IDs to the same individual 1472 connection if servers make reasonable selections when generating new 1473 IDs for that connection. 1475 11.3.
Multiple Configuration IDs 1477 During the period in which there are multiple deployed configuration 1478 IDs (see Section 3.1), there is a slight increase in linkability. 1479 The server space is effectively divided into segments with CIDs that 1480 have different config rotation bits. Entities that manage servers 1481 SHOULD strive to minimize these periods by quickly deploying new 1482 configurations across the server pool. 1484 11.4. Limited configuration scope 1486 A simple deployment of QUIC-LB in a cloud provider might use the same 1487 global QUIC-LB configuration across all its load balancers that route 1488 to customer servers. An attacker could then simply become a 1489 customer, obtain the configuration, and then extract server IDs of 1490 other customers' connections at will. 1492 To avoid this, the configuration agent SHOULD issue QUIC-LB 1493 configurations to mutually distrustful servers that have different 1494 keys for encryption algorithms. In many cases, the load balancers 1495 can distinguish these configurations by external IP address. 1497 However, assigning multiple entities to an IP address is 1498 complementary to concealing DNS requests (e.g., DoH [RFC8484]) and 1499 the TLS Server Name Indication (SNI) ([I-D.ietf-tls-esni]) to obscure 1500 the ultimate destination of traffic.
While the load balancer's 1501 fallback algorithm (Section 4.2) can use the SNI to make a routing 1502 decision on the first packet, there are three ways to route 1503 subsequent packets: 1505 * all co-tenants can use the same QUIC-LB configuration, leaking the 1506 server mapping to each other as described above; 1508 * co-tenants can be issued one of up to three configurations 1509 distinguished by the config rotation bits (Section 3.1), exposing 1510 information about the target domain to the entire network; or 1512 * tenants can use 4-tuple routing in their CIDs (in which case they 1513 SHOULD disable migration in their connections), which neutralizes 1514 the value of QUIC-LB but preserves privacy. 1516 When configuring QUIC-LB, administrators must evaluate the privacy 1517 tradeoff considering the relative value of each of these properties, 1518 given the trust model between tenants, the presence of methods to 1519 obscure the domain name, and value of address migration in the tenant 1520 use cases. 1522 As the plaintext algorithm makes no attempt to conceal the server 1523 mapping, these deployments SHOULD simply use a common configuration. 1525 11.5. Stateless Reset Oracle 1527 Section 21.9 of [RFC9000] discusses the Stateless Reset Oracle 1528 attack. For a server deployment to be vulnerable, an attacking 1529 client must be able to cause two packets with the same Destination 1530 CID to arrive at two different servers that share the same 1531 cryptographic context for Stateless Reset tokens. As QUIC-LB 1532 requires deterministic routing of DCIDs over the life of a 1533 connection, it is a sufficient means of avoiding an Oracle without 1534 additional measures. 1536 Note also that when a server starts using a new QUIC-LB config 1537 rotation codepoint, new CIDs might not be unique with respect to 1538 previous configurations that occupied that codepoint, and therefore 1539 different clients may have observed the same CID and stateless reset 1540 token. 
A straightforward method of managing stateless reset keys is 1541 to maintain a separate key for each config rotation codepoint, and 1542 replace each key when the configuration for that codepoint changes. 1543 Thus, as a server transitions from one config to another, it will be 1544 able to generate correct tokens for connections using either type of 1545 CID. 1547 11.6. Connection ID Entropy 1549 The Stream Cipher and Block Cipher algorithms need to generate 1550 different cipher text for each generated Connection ID instance to 1551 protect the Server ID. To do so, at least four octets of the CID are 1552 reserved for a nonce that, if used only once, will result in unique 1553 cipher text for each Connection ID. 1555 If servers simply increment the nonce by one with each generated 1556 connection ID, then it is safe to use the existing keys until any 1557 server's nonce counter exhausts the allocated space and rolls over to 1558 zero. Whether or not it implements this method, the server MUST NOT 1559 reuse a nonce until it switches to a configuration with new keys. 1561 Configuration agents SHOULD implement an out-of-band method to 1562 discover when servers are in danger of exhausting their nonce space, 1563 and SHOULD respond by issuing a new configuration. A server that has 1564 exhausted its nonces MUST either switch to a different configuration, 1565 or if none exists, use the 4-tuple routing config rotation codepoint. 1567 11.7. Shared-State Retry Keys 1569 The Shared-State Retry Service defined in Section 7.3 describes the 1570 format of retry tokens or new tokens protected and encrypted using 1571 AES128-GCM. Each token includes a 96 bit randomly generated unique 1572 token number, and an 8 bit identifier used to get the AES-GCM 1573 encryption context. The AES-GCM encryption context contains a 128 1574 bit key and an AEAD IV.
There are three important security 1575 considerations for these tokens: 1577 * An attacker that obtains a copy of the encryption key will be able 1578 to decrypt and forge tokens. 1580 * Attackers may be able to retrieve the key if they capture a 1581 sufficiently large number of retry tokens encrypted with a given 1582 key. 1584 * Confidentiality of the token data will fail if separate tokens 1585 reuse the same 96 bit unique token number and the same key. 1587 To protect against disclosure of keys to attackers, the service and 1588 servers MUST ensure that the keys are stored securely. To limit the 1589 consequences of potential exposures, the time to live of any given 1590 key should be limited. 1592 Section 6.6 of [RFC9001] states that "Endpoints MUST count the number 1593 of encrypted packets for each set of keys. If the total number of 1594 encrypted packets with the same key exceeds the confidentiality limit 1595 for the selected AEAD, the endpoint MUST stop using those keys." It 1596 goes on with the specific limit: "For AEAD_AES_128_GCM and 1597 AEAD_AES_256_GCM, the confidentiality limit is 2^23 encrypted 1598 packets; see Appendix B.1." It is prudent to adopt the same limit 1599 here, and configure the service in such a way that no more than 2^23 1600 tokens are generated with the same key. 1602 In order to protect against collisions, the 96 bit unique token 1603 numbers should be generated using a cryptographically secure 1604 pseudorandom number generator (CSPRNG), as specified in Appendix C.1 1605 of the TLS 1.3 specification [RFC8446]. With proper random numbers, 1606 if fewer than 2^40 tokens are generated with a single key, the risk 1607 of collisions is lower than 0.001%. 1609 11.8. Resource Consumption of the SID table 1611 When using Dynamic SID allocation, the load balancer's SID table can 1612 be as large as 2^56 entries, which is prohibitively large.
To 1613 constrain the size of this table, servers are encouraged to accept as 1614 few SIDs as possible, so that the remainder do not enter the load 1615 balancer's table. 1617 12. IANA Considerations 1619 There are no IANA requirements. 1621 13. References 1623 13.1. Normative References 1625 [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol 1626 Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, 1627 . 1629 [RFC8999] Thomson, M., "Version-Independent Properties of QUIC", 1630 RFC 8999, DOI 10.17487/RFC8999, May 2021, 1631 . 1633 [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based 1634 Multiplexed and Secure Transport", RFC 9000, 1635 DOI 10.17487/RFC9000, May 2021, 1636 . 1638 [TIME_T] "Open Group Standard: Vol. 1: Base Definitions, Issue 7", 1639 IEEE Std 1003.1 , 2018, 1640 . 1643 13.2. Informative References 1645 [I-D.draft-ietf-tls-dtls13] 1646 Rescorla, E., Tschofenig, H., and N. Modadugu, "The 1647 Datagram Transport Layer Security (DTLS) Protocol Version 1648 1.3", Work in Progress, Internet-Draft, draft-ietf-tls- 1649 dtls13-43, 30 April 2021, 1650 . 1653 [I-D.ietf-tls-dtls-connection-id] 1654 Rescorla, E., Tschofenig, H., Fossati, T., and A. Kraus, 1655 "Connection Identifiers for DTLS 1.2", Work in Progress, 1656 Internet-Draft, draft-ietf-tls-dtls-connection-id-13, 22 1657 June 2021, . 1660 [I-D.ietf-tls-esni] 1661 Rescorla, E., Oku, K., Sullivan, N., and C. A. Wood, "TLS 1662 Encrypted Client Hello", Work in Progress, Internet-Draft, 1663 draft-ietf-tls-esni-13, 12 August 2021, 1664 . 1667 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1668 Requirement Levels", BCP 14, RFC 2119, 1669 DOI 10.17487/RFC2119, March 1997, 1670 . 1672 [RFC4106] Viega, J. and D. McGrew, "The Use of Galois/Counter Mode 1673 (GCM) in IPsec Encapsulating Security Payload (ESP)", 1674 RFC 4106, DOI 10.17487/RFC4106, June 2005, 1675 . 1677 [RFC4347] Rescorla, E. and N. 
Modadugu, "Datagram Transport Layer 1678 Security", RFC 4347, DOI 10.17487/RFC4347, April 2006, 1679 . 1681 [RFC6020] Bjorklund, M., Ed., "YANG - A Data Modeling Language for 1682 the Network Configuration Protocol (NETCONF)", RFC 6020, 1683 DOI 10.17487/RFC6020, October 2010, 1684 . 1686 [RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer 1687 Security Version 1.2", RFC 6347, DOI 10.17487/RFC6347, 1688 January 2012, . 1690 [RFC7696] Housley, R., "Guidelines for Cryptographic Algorithm 1691 Agility and Selecting Mandatory-to-Implement Algorithms", 1692 BCP 201, RFC 7696, DOI 10.17487/RFC7696, November 2015, 1693 . 1695 [RFC7983] Petit-Huguenin, M. and G. Salgueiro, "Multiplexing Scheme 1696 Updates for Secure Real-time Transport Protocol (SRTP) 1697 Extension for Datagram Transport Layer Security (DTLS)", 1698 RFC 7983, DOI 10.17487/RFC7983, September 2016, 1699 . 1701 [RFC8340] Bjorklund, M. and L. Berger, Ed., "YANG Tree Diagrams", 1702 BCP 215, RFC 8340, DOI 10.17487/RFC8340, March 2018, 1703 . 1705 [RFC8484] Hoffman, P. and P. McManus, "DNS Queries over HTTPS 1706 (DoH)", RFC 8484, DOI 10.17487/RFC8484, October 2018, 1707 . 1709 [RFC9001] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure 1710 QUIC", RFC 9001, DOI 10.17487/RFC9001, May 2021, 1711 . 1713 Appendix A. QUIC-LB YANG Model 1715 This YANG model conforms to [RFC6020] and expresses a complete QUIC- 1716 LB configuration. 
1718 module ietf-quic-lb { 1719 yang-version "1.1"; 1720 namespace "urn:ietf:params:xml:ns:yang:ietf-quic-lb"; 1721 prefix "quic-lb"; 1723 import ietf-yang-types { 1724 prefix yang; 1725 reference 1726 "RFC 6991: Common YANG Data Types."; 1727 } 1729 import ietf-inet-types { 1730 prefix inet; 1731 reference 1732 "RFC 6991: Common YANG Data Types."; 1733 } 1735 organization 1736 "IETF QUIC Working Group"; 1738 contact 1739 "WG Web: 1740 WG List: 1742 Authors: Martin Duke (martin.h.duke at gmail dot com) 1743 Nick Banks (nibanks at microsoft dot com)"; 1745 description 1746 "This module enables the explicit cooperation of QUIC servers with 1747 trusted intermediaries without breaking important protocol features. 1749 Copyright (c) 2021 IETF Trust and the persons identified as 1750 authors of the code. All rights reserved. 1752 Redistribution and use in source and binary forms, with or 1753 without modification, is permitted pursuant to, and subject to 1754 the license terms contained in, the Simplified BSD License set 1755 forth in Section 4.c of the IETF Trust's Legal Provisions 1756 Relating to IETF Documents 1757 (https://trustee.ietf.org/license-info). 1759 This version of this YANG module is part of RFC XXXX 1760 (https://www.rfc-editor.org/info/rfcXXXX); see the RFC itself 1761 for full legal notices. 
1763 The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL 1764 NOT', 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'NOT RECOMMENDED', 1765 'MAY', and 'OPTIONAL' in this document are to be interpreted as 1766 described in BCP 14 (RFC 2119) (RFC 8174) when, and only when, 1767 they appear in all capitals, as shown here."; 1769 revision "2021-01-29" { 1770 description 1771 "Initial Version"; 1772 reference 1773 "RFC XXXX, QUIC-LB: Generating Routable QUIC Connection IDs"; 1774 } 1776 container quic-lb { 1777 presence "The container for QUIC-LB configuration."; 1779 description 1780 "QUIC-LB container."; 1782 typedef quic-lb-key { 1783 type yang:hex-string { 1784 length 47; 1785 } 1786 description 1787 "This is a 16-byte key, represented with 47 bytes"; 1788 } 1790 list cid-configs { 1791 key "config-rotation-bits"; 1792 description 1793 "List up to three load balancer configurations"; 1795 leaf config-rotation-bits { 1796 type uint8 { 1797 range "0..2"; 1798 } 1799 mandatory true; 1800 description 1801 "Identifier for this CID configuration."; 1802 } 1804 leaf first-octet-encodes-cid-length { 1805 type boolean; 1806 default false; 1807 description 1808 "If true, the six least significant bits of the first CID 1809 octet encode the CID length minus one."; 1810 } 1812 leaf cid-key { 1813 type quic-lb-key; 1814 description 1815 "Key for encrypting the connection ID. If absent, the 1816 configuration uses the Plaintext algorithm."; 1817 } 1819 leaf nonce-length { 1820 type uint8 { 1821 range "4..16"; 1822 } 1823 must '(../cid-key)' { 1824 error-message "nonce-length only valid if cid-key is set"; 1825 } 1826 description 1827 "Length, in octets, of the nonce. If absent when cid-key is 1828 present, the configuration uses the Block Cipher Algorithm. 
1829 If present along with cid-key, the configuration uses the 1830 Stream Cipher Algorithm."; 1831 } 1833 leaf dynamic-sid { 1834 type boolean; 1835 description 1836 "If true, server IDs are allocated dynamically."; 1837 } 1839 leaf server-id-length { 1840 type uint8 { 1841 range "1..18"; 1842 } 1843 must '(../dynamic-sid and . <= 7) or 1844 (not(../dynamic-sid)) and 1845 ((not(../cid-key) and . <= 16) or 1846 ((../nonce-length) and . <= (19 - ../nonce-length)) or 1847 ((../cid-key) and not(../nonce-length) and . <= 12))' { 1848 error-message 1849 "Server ID length too long for routing algorithm and server ID 1850 allocation method"; 1851 } 1852 mandatory true; 1853 description 1854 "Length (in octets) of a server ID. Further range-limited 1855 by dynamic-sid, cid-key, and nonce-length."; 1856 } 1858 list server-id-mappings { 1859 when "not(../dynamic-sid)"; 1860 key "server-id"; 1861 description "Statically allocated Server IDs"; 1863 leaf server-id { 1864 type yang:hex-string; 1865 must "string-length(.) = 3 * ../../server-id-length - 1"; 1866 mandatory true; 1867 description 1868 "An allocated server ID"; 1869 } 1871 leaf server-address { 1872 type inet:ip-address; 1873 mandatory true; 1874 description 1875 "Destination address corresponding to the server ID"; 1876 } 1877 } 1878 } 1880 container retry-service-config { 1881 description 1882 "Configuration of Retry Service. If supported-versions is empty, there 1883 is no retry service. If token-keys is empty, it uses the non-shared- 1884 state service. If present, it uses shared-state tokens."; 1886 leaf-list supported-versions { 1887 type uint32; 1888 description 1889 "QUIC versions that the retry service supports.
If empty, there 1890 is no retry service."; 1891 } 1893 leaf unsupported-version-default { 1894 type enumeration { 1895 enum allow { 1896 description "Unsupported versions admitted by default"; 1897 } 1898 enum deny { 1899 description "Unsupported versions denied by default"; 1900 } 1901 } 1902 default allow; 1903 description 1904 "Are unsupported versions not in version-exceptions allowed 1905 or denied?"; 1906 } 1908 leaf-list version-exceptions { 1909 type uint32; 1910 description 1911 "Exceptions to the default-deny or default-allow rule."; 1912 } 1914 list token-keys { 1915 key "key-sequence-number"; 1916 description 1917 "list of active keys, for key rotation purposes. Existence implies 1918 shared-state format"; 1920 leaf key-sequence-number { 1921 type uint8 { 1922 range "0..127"; 1923 } 1924 mandatory true; 1925 description 1926 "Identifies the key used to encrypt the token"; 1927 } 1929 leaf token-key { 1930 type quic-lb-key; 1931 mandatory true; 1932 description 1933 "16-byte key to encrypt the token"; 1934 } 1936 leaf token-iv { 1937 type yang:hex-string { 1938 length 23; 1939 } 1940 mandatory true; 1941 description 1942 "8-byte IV to encrypt the token, encoded in 23 bytes"; 1943 } 1944 } 1945 } 1946 } 1947 } 1949 A.1. Tree Diagram 1951 This summary of the YANG model uses the notation in [RFC8340]. 1953 module: ietf-quic-lb 1954 +--rw quic-lb 1955 +--rw cid-configs* 1956 | [config-rotation-bits] 1957 | +--rw config-rotation-bits uint8 1958 | +--rw first-octet-encodes-cid-length? boolean 1959 | +--rw cid-key? yang:hex-string 1960 | +--rw nonce-length? uint8 1961 | +--rw dynamic-sid boolean 1962 | +--rw server-id-length uint8 1963 | +--rw server-id-mappings*? 
1964 | | [server-id] 1965 | | +--rw server-id yang:hex-string 1966 | | +--rw server-address inet:ip-address 1967 +--ro retry-service-config 1968 | +--rw supported-versions* 1969 | | +--rw version uint32 1970 | +--rw unsupported-version-default enumeration {allow deny} 1971 | +--rw version-exceptions* 1972 | | +--rw version uint32 1973 | +--rw token-keys*? 1974 | | [key-sequence-number] 1975 | | +--rw key-sequence-number uint8 1976 | | +--rw token-key yang:hex-string 1977 | | +--rw token-iv yang:hex-string 1979 Appendix B. Load Balancer Test Vectors 1981 Each section of this appendix includes multiple sets of load balancer 1982 configuration, each of which has five examples of server ID and 1983 server use bytes and how they are encoded in a CID. 1985 In some cases, there are no server use bytes. Note that, for 1986 simplicity, the first octet bits used for neither config rotation nor 1987 length self-encoding are random, rather than listed in the server use 1988 field. Therefore, a server implementation using these parameters may 1989 generate CIDs with a slightly different first octet. 1991 This section uses the following abbreviations: 1993 cid Connection ID 1994 cr_bits Config Rotation Bits 1995 LB Load Balancer 1996 sid Server ID 1997 sid_len Server ID length 1998 su Server Use Bytes 2000 All values except length_self_encoding, nonce_len, and sid_len are expressed in 2001 hexadecimal format. 2003 B.1.
Plaintext Connection ID Algorithm 2004 LB configuration: cr_bits 0x0 length_self_encoding: y sid_len 1 2006 cid 01be sid be su 2007 cid 0221b7 sid 21 su b7 2008 cid 03cadfd8 sid ca su dfd8 2009 cid 041e0c9328 sid 1e su 0c9328 2010 cid 050c8f6d9129 sid 0c su 8f6d9129 2012 LB configuration: cr_bits 0x0 length_self_encoding: n sid_len 2 2014 cid 02aab0 sid aab0 su 2015 cid 3ac4b106 sid c4b1 su 06 2016 cid 08bd3cf4a0 sid bd3c su f4a0 2017 cid 3771d59502d6 sid 71d5 su 9502d6 2018 cid 1d57dee8b888f3 sid 57de su e8b888f3 2020 LB configuration: cr_bits 0x0 length_self_encoding: y sid_len 3 2022 cid 0336c976 sid 36c976 su 2023 cid 04aa291806 sid aa2918 su 06 2024 cid 0586897bd8b6 sid 86897b su d8b6 2025 cid 063625bcae4de0 sid 3625bc su ae4de0 2026 cid 07966fb1f3cb535f sid 966fb1 su f3cb535f 2028 LB configuration: cr_bits 0x0 length_self_encoding: n sid_len 4 2030 cid 185172fab8 sid 5172fab8 su 2031 cid 2eb7ff2c9297 sid b7ff2c92 su 97 2032 cid 14f3eb3dd3edbe sid f3eb3dd3 su edbe 2033 cid 3feb31cece744b74 sid eb31cece su 744b74 2034 cid 06b9f34c353ce23bb5 sid b9f34c35 su 3ce23bb5 2036 LB configuration: cr_bits 0x0 length_self_encoding: y sid_len 5 2038 cid 05bdcd8d0b1d sid bdcd8d0b1d su 2039 cid 06aee673725a63 sid aee673725a su 63 2040 cid 07bbf338ddbf37f4 sid bbf338ddbf su 37f4 2041 cid 08fbbca64c26756840 sid fbbca64c26 su 756840 2042 cid 09e7737c495b93894e34 sid e7737c495b su 93894e34 2044 B.2. Stream Cipher Connection ID Algorithm 2046 In each case below, the server is using a plain text nonce value of 2047 zero. 
2049 LB configuration: cr_bits 0x0 length_self_encoding: y nonce_len 12 sid_len 1 2050 key 4d9d0fd25a25e7f321ef464e13f9fa3d 2052 cid 0d69fe8ab8293680395ae256e89c sid c5 su 2053 cid 0e420d74ed99b985e10f5073f43027 sid d5 su 27 2054 cid 0f380f440c6eefd3142ee776f6c16027 sid 10 su 6027 2055 cid 1020607efbe82049ddbf3a7c3d9d32604d sid 3c su 32604d 2056 cid 11e132d12606a1bb0fa17e1caef00ec54c10 sid e3 su 0ec54c10 2058 LB configuration: cr_bits 0x0 length_self_encoding: n nonce_len 12 sid_len 2 2059 key 49e1cec7fd264b1f4af37413baf8ada9 2061 cid 3d3a5e1126414271cc8dc2ec7c8c15 sid f7fe su 2062 cid 007042539e7c5f139ac2adfbf54ba748 sid eaf4 su 48 2063 cid 2bc125dd2aed2aafacf59855d99e029217 sid e880 su 9217 2064 cid 3be6728dc082802d9862c6c8e4dda3d984d8 sid 62c6 su d984d8 2065 cid 1afe9c6259ad350fc7bad28e0aeb2e8d4d4742 sid 8502 su 8d4d4742 2067 LB configuration: cr_bits 0x0 length_self_encoding: y nonce_len 14 sid_len 3 2068 key 2c70df0b399bd33a7335523dcdb884ad 2070 cid 11d62e8670565cd30b552edff6782ff5a740 sid d794bb su 2071 cid 12c70e481f49363cabd9370d1fd5012c12bca5 sid 2cbd5d su a5 2072 cid 133b95dfd8ad93566782f8424df82458069fc9e9 sid d126cd su c9e9 2073 cid 13ac6ffcd635532ab60370306c7ee572d6b6e795 sid 539e42 su e795 2074 cid 1383ed07a9700777ff450bb39bb9c1981266805c sid 9094dd su 805c 2076 LB configuration: cr_bits 0x0 length_self_encoding: n nonce_len 12 sid_len 4 2077 key 2297b8a95c776cf9c048b76d9dc27019 2079 cid 32873890c3059ca62628089439c44c1f84 sid 7398d8ca su 2080 cid 1ff7c7d7b9823954b178636c99a7dc93ac83 sid 9655f091 su 83 2081 cid 31044000a5ebb3bf2fa7629a17f2c78b077c17 sid 8b035fc6 su 7c17 2082 cid 1791bd28c66721e8fea0c6f34fd2d8e663a6ef70 sid 6672e0e2 su a6ef70 2083 cid 3df1d90ad5ccd5f8f475f040e90aeca09ec9839d sid b98b1fff su c9839d 2085 LB configuration: cr_bits 0x0 length_self_encoding: y nonce_len 8 sid_len 5 2086 key 484b2ed942d9f4765e45035da3340423 2088 cid 0da995b7537db605bfd3a38881ae sid 391a7840dc su 2089 cid 0ed8d02d55b91d06443540d1bf6e98 sid 10f7f7b284 su 98 
2090 cid 0f3f74be6d46a84ccb1fd1ee92cdeaf2 sid 0606918fc0 su eaf2 2091 cid 1045626dbf20e03050837633cc5650f97c sid e505eea637 su 50f97c 2092 cid 11bb9a17f691ab446a938427febbeb593eaa sid 99343a2a96 su eb593eaa 2093 B.3. Block Cipher Connection ID Algorithm 2095 In each case below, the server is using a plain text nonce value of 2096 zero. 2098 TBD 2100 B.4. Shared State Retry Tokens 2102 In this case, the shared-state retry token is issued by retry 2103 service, so the opaque data of shared-state retry token body would be 2104 null (Section 7.3). 2106 LB configuration: 2107 key_seq 0x00 2108 encrypt_key 0x30313233343536373839303132333435 2109 AEAD_IV 0x313233343536373839303132 2111 Shared-State Retry Service Token Body: 2112 ODCIL 0x12 2113 RSCIL 0x10 2114 port 0x1a0a 2115 original_destination_connection_id 0x0c3817b544ca1c94313bba41757547eec937 2116 retry_source_connection_id 0x0301e770d24b3b13070dd5c2a9264307 2117 timestamp 0x0000000060c7bf4d 2119 Shared-State Retry Service Token: 2120 unique_token_number 0x59ef316b70575e793e1a8782 2121 key_sequence 0x00 2122 encrypted_shared_state_retry_service_token_body 2123 0x7d38b274aa4427c7a1557c3fa666945931defc65da387a83855196a7cb73caac1e28e5346fd76868de94f8b62294 2124 AEAD_ICV 0xf91174fdd711543a32d5e959867f9c22 2126 AEAD related parameters: 2127 client_ip_addr 127.0.0.1 2128 client_port 6666 2129 AEAD_nonce 0x68dd025f45616941072ab6b0 2130 AEAD_associated_data 0x7f00000100000000000000000000000059ef316b70575e793e1a878200 2132 Appendix C. Interoperability with DTLS over UDP 2134 Some environments may contain DTLS traffic as well as QUIC operating 2135 over UDP, which may be hard to distinguish. 2137 In most cases, the packet parsing rules above will cause a QUIC-LB 2138 load balancer to route DTLS traffic in an appropriate way. 
DTLS 1.3 2139 implementations that use the connection_id extension 2140 [I-D.ietf-tls-dtls-connection-id] might use the techniques in this 2141 document to generate connection IDs and achieve robust routability 2142 for DTLS associations if they meet a few additional requirements. 2143 This non-normative appendix describes this interaction. 2145 C.1. DTLS 1.0 and 1.2 2147 DTLS 1.0 [RFC4347] and 1.2 [RFC6347] use packet formats that a QUIC- 2148 LB router will interpret as short header packets with CIDs that 2149 request 4-tuple routing. As such, they will route such packets 2150 consistently as long as the 4-tuple does not change. Note that DTLS 2151 1.0 has been deprecated by the IETF. 2153 The first octet of every DTLS 1.0 or 1.2 datagram contains the 2154 content type. A QUIC-LB load balancer will interpret any content 2155 type less than 128 as a short header packet, meaning that the 2156 subsequent octets should contain a connection ID. 2158 Existing TLS content types comfortably fit in the range below 128. 2159 Assignment of codepoints greater than 64 would require coordination 2160 in accordance with [RFC7983], and anyway would likely create problems 2161 demultiplexing DTLS and version 1 of QUIC. Therefore, this document 2162 believes it is extremely unlikely that TLS content types of 128 or 2163 greater will be assigned. Nevertheless, such an assignment would 2164 cause a QUIC-LB load balancer to interpret the packet as a QUIC long 2165 header with an essentially random connection ID, which is likely to 2166 be routed irregularly. 2168 The second octet of every DTLS 1.0 or 1.2 datagram is the bitwise 2169 complement of the DTLS Major version (i.e. version 1.x = 0xfe). A 2170 QUIC-LB load balancer will interpret this as a connection ID that 2171 requires 4-tuple based load balancing, meaning that the routing will 2172 be consistent as long as the 4-tuple remains the same. 
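As a non-normative illustration of the parsing behavior described above, the following Python sketch classifies a UDP payload the way an unmodified QUIC-LB load balancer would see it. The function name and the returned labels are hypothetical. The sketch assumes that the config rotation bits are the two most significant bits of the first connection ID octet, and that the codepoint 0b11 selects 4-tuple routing; this appendix does not restate either detail, so treat both as assumptions.

```python
def classify_datagram(payload: bytes) -> str:
    """Label a UDP payload as a QUIC-LB load balancer might interpret it.

    Hypothetical helper for illustration only; the labels are not
    defined by this document.
    """
    if not payload:
        return "empty"

    # Most significant bit set: parsed as a QUIC long header.
    if payload[0] & 0x80:
        return "long-header"

    # Otherwise parsed as a short header, with the connection ID
    # assumed to start at the second octet.
    if len(payload) < 2:
        return "short-header/no-cid"

    # Assumption: the two most significant bits of the first CID octet
    # are the config rotation bits, and 0b11 means 4-tuple routing.
    config_rotation = payload[1] >> 6
    if config_rotation == 0b11:
        return "short-header/4-tuple"
    return f"short-header/config-{config_rotation}"


# DTLS 1.2 handshake record: content type 22, version octets 0xfe 0xfd.
dtls12_record = bytes([22, 0xFE, 0xFD])
print(classify_datagram(dtls12_record))
```

Under these assumptions, a DTLS 1.0 or 1.2 record falls into the 4-tuple case because both of the two most significant bits of 0xfe are set.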
2174 [I-D.ietf-tls-dtls-connection-id] defines an extension to add 2175 connection IDs to DTLS 1.2. Unfortunately, a QUIC-LB load balancer 2176 will not correctly parse the connection ID and will continue 4-tuple 2177 routing. A modified QUIC-LB load balancer that correctly identifies 2178 DTLS and parses a DTLS 1.2 datagram for the connection ID is outside 2179 the scope of this document. 2181 C.2. DTLS 1.3 2183 DTLS 1.3 [I-D.draft-ietf-tls-dtls13] changes the structure of 2184 datagram headers in relevant ways. 2186 Handshake packets continue to have a TLS content type in the first 2187 octet and 0xfe in the second octet, so they will be 4-tuple routed, 2188 which should not present problems for likely NAT rebinding or address 2189 change events. 2191 Non-handshake packets always have zero in their most significant bit 2192 and will therefore always be treated as QUIC short headers. If the 2193 connection ID is present, it follows in the succeeding octets. 2194 Therefore, a DTLS 1.3 association where the server utilizes 2195 Connection IDs and the encodings in this document will be routed 2196 correctly in the presence of client address and port changes. 2198 However, if the client does not include the connection_id extension 2199 in its ClientHello, the server is unable to use connection IDs. In 2200 this case, non-handshake packets will appear to contain random 2201 connection IDs and be routed randomly. Thus, unmodified QUIC-LB load 2202 balancers will not work with DTLS 1.3 if the client does not 2203 advertise support for connection IDs, or the server does not request 2204 the use of a compliant connection ID. 2206 A QUIC-LB load balancer might be modified to identify DTLS 1.3 2207 packets and correctly parse the fields to identify when there is no 2208 connection ID and revert to 4-tuple routing, removing the server 2209 requirement above.
However, such a modification is outside the scope
of this document, and classifying some packets as DTLS might be
incompatible with future versions of QUIC.

C.3.  Future Versions of DTLS

As DTLS does not have an IETF consensus document that defines what
parts of DTLS will be invariant in future versions, it is difficult
to speculate about the applicability of this section to future
versions of DTLS.

Appendix D.  Acknowledgments

The authors would like to thank Christian Huitema and Ian Swett for
their major design contributions.

Manasi Deval, Erik Fuller, Toma Gavrichenkov, Jana Iyengar, Subodh
Iyengar, Ladislav Lhotka, Jan Lindblad, Ling Tao Nju, Kazuho Oku,
Udip Pant, Martin Thomson, Dmitri Tikhonov, Victor Vasiliev, and
William Zeng Ke all provided useful input to this document.

Appendix E.  Change Log

   *RFC Editor's Note:* Please remove this section prior to
   publication of a final version of this document.

E.1.  since draft-ietf-quic-load-balancers-07

*  Shortened SSCID nonce minimum length to 4 bytes

*  Removed RSCID from Retry token body

*  Simplified CID formats

*  Shrunk size of SID table

E.2.  since draft-ietf-quic-load-balancers-06

*  Added interoperability with DTLS

*  Changed "non-compliant" to "unroutable"

*  Changed "arbitrary" algorithm to "fallback"

*  Revised security considerations for mistrustful tenants

*  Added retry service considerations for non-Initial packets

E.3.  since draft-ietf-quic-load-balancers-05

*  Added low-config CID for further discussion

*  Complete revision of shared-state Retry Token

*  Added YANG model

*  Updated configuration limits to ensure CID entropy

*  Switched to notation from quic-transport

E.4.  since draft-ietf-quic-load-balancers-04

*  Rearranged the shared-state retry token to simplify token
   processing

*  More compact timestamp in shared-state retry token

*  Revised server requirements for shared-state retries

*  Eliminated zero padding from the test vectors

*  Added server use bytes to the test vectors

*  Additional compliant DCID criteria

E.5.  since-draft-ietf-quic-load-balancers-03

*  Improved Config Rotation text

*  Added stream cipher test vectors

*  Deleted the Obfuscated CID algorithm

E.6.  since-draft-ietf-quic-load-balancers-02

*  Replaced stream cipher algorithm with three-pass version

*  Updated Retry format to encode info for required TPs

*  Added discussion of version invariance

*  Cleaned up text about config rotation

*  Added Reset Oracle and limited configuration considerations

*  Allow dropped long-header packets for known QUIC versions

E.7.  since-draft-ietf-quic-load-balancers-01

*  Test vectors for load balancer decoding

*  Deleted remnants of in-band protocol

*  Light edit of Retry Services section

*  Discussed load balancer chains

E.8.  since-draft-ietf-quic-load-balancers-00

*  Removed in-band protocol from the document

E.9.  Since draft-duke-quic-load-balancers-06

*  Switch to IETF WG draft.

E.10.  Since draft-duke-quic-load-balancers-05

*  Editorial changes

*  Made load balancer behavior independent of QUIC version

*  Got rid of token in stream cipher encoding, because server might
   not have it

*  Defined "non-compliant DCID" and specified rules for handling
   them.

*  Added pseudocode for config schema

E.11.  Since draft-duke-quic-load-balancers-04

*  Added standard for retry services

E.12.  Since draft-duke-quic-load-balancers-03

*  Renamed Plaintext CID algorithm as Obfuscated CID

*  Added new Plaintext CID algorithm

*  Updated to allow 20B CIDs

*  Added self-encoding of CID length

E.13.  Since draft-duke-quic-load-balancers-02

*  Added Config Rotation

*  Added failover mode

*  Tweaks to existing CID algorithms

*  Added Block Cipher CID algorithm

*  Reformatted QUIC-LB packets

E.14.  Since draft-duke-quic-load-balancers-01

*  Complete rewrite

*  Supports multiple security levels

*  Lightweight messages

E.15.  Since draft-duke-quic-load-balancers-00

*  Converted to markdown

*  Added variable length connection IDs

Authors' Addresses

Martin Duke
F5 Networks, Inc.

Email: martin.h.duke@gmail.com

Nick Banks
Microsoft

Email: nibanks@microsoft.com