Network Working Group                                      M. Kuehlewind
Internet-Draft                                                  Ericsson
Intended status: Informational                               B. Trammell
Expires: 26 July 2021                                             Google
                                                         22 January 2021


              Manageability of the QUIC Transport Protocol
                    draft-ietf-quic-manageability-09

Abstract

   This document discusses manageability of the QUIC transport protocol,
   focusing on caveats impacting network operations involving QUIC
   traffic. Its intended audience is network operators, as well as
   content providers that rely on the use of QUIC-aware middleboxes,
   e.g. for load balancing.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 26 July 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Features of the QUIC Wire Image
     2.1.  QUIC Packet Header Structure
     2.2.  Coalesced Packets
     2.3.  Use of Port Numbers
     2.4.  The QUIC Handshake
     2.5.  Integrity Protection of the Wire Image
     2.6.  Connection ID and Rebinding
     2.7.  Packet Numbers
     2.8.  Version Negotiation and Greasing
   3.  Network-visible Information about QUIC Flows
     3.1.  Identifying QUIC Traffic
       3.1.1.  Identifying Negotiated Version
       3.1.2.  Rejection of Garbage Traffic
     3.2.  Connection Confirmation
     3.3.  Application Identification
       3.3.1.  Extracting Server Name Indication (SNI) Information
     3.4.  Flow Association
     3.5.  Flow teardown
     3.6.  Flow Symmetry Measurement
     3.7.  Round-Trip Time (RTT) Measurement
       3.7.1.  Measuring Initial RTT
       3.7.2.  Using the Spin Bit for Passive RTT Measurement
   4.  Specific Network Management Tasks
     4.1.  Stateful Treatment of QUIC Traffic
     4.2.  Passive Network Performance Measurement and Troubleshooting
     4.3.  Server Cooperation with Load Balancers
     4.4.  DDoS Detection and Mitigation
     4.5.  UDP Policing
     4.6.  Distinguishing Acknowledgment traffic
     4.7.  Quality of Service handling and ECMP
     4.8.  QUIC and Network Address Translation (NAT)
       4.8.1.  Resource Conservation
       4.8.2.  "Helping" with routing infrastructure issues
     4.9.  Filtering behavior
   5.  IANA Considerations
   6.  Security Considerations
   7.  Contributors
   8.  Acknowledgments
   9.  Appendix
     9.1.  Distinguishing IETF QUIC and Google QUIC Versions
     9.2.  Extracting the CRYPTO frame
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   QUIC [QUIC-TRANSPORT] is a new transport protocol encapsulated in UDP
   and encrypted by default.
   QUIC integrates TLS [QUIC-TLS] to encrypt
   all payload data and most control information. The design focused on
   support of semantics for HTTP, which required changes to HTTP known
   as HTTP/3 [QUIC-HTTP].

   Given that QUIC is an end-to-end transport protocol, all information
   in the protocol header, even that which can be inspected, is not
   meant to be mutable by the network, and is therefore
   integrity-protected. While less information is visible to the network
   than for TCP, integrity protection can also simplify troubleshooting,
   because none of the nodes on the network path can modify the
   transport layer information.

   This document provides guidance for network operations that manage
   QUIC traffic. This includes guidance on how to interpret and utilize
   information that is exposed by QUIC to the network, requirements and
   assumptions that the QUIC design makes with respect to network
   treatment, and a description of how common network management
   practices will be impacted by QUIC.

   Since QUIC's wire image [WIRE-IMAGE] is integrity protected and not
   modifiable on path, in-network operations are not possible without
   terminating the QUIC connection, for instance using a back-to-back
   proxy. Proxy operations are not in scope for this document. A proxy
   can either explicitly identify itself as providing a proxy service,
   or may share the TLS credentials to authenticate as the server and
   (in some cases) client, acting as a front-facing instance for the
   endpoint itself.

   Network management is not a one-size-fits-all endeavour: practices
   considered necessary or even mandatory within enterprise networks
   with certain compliance requirements, for example, would be
   impermissible on other networks without those requirements.
   This
   document therefore does not make any specific recommendations as to
   which practices should or should not be applied; for each practice,
   it describes what is and is not possible with the QUIC transport
   protocol as defined.

2.  Features of the QUIC Wire Image

   In this section, we discuss those aspects of the QUIC transport
   protocol that have an impact on the design and operation of devices
   that forward QUIC packets. Here, we are concerned primarily with the
   unencrypted part of QUIC's wire image [WIRE-IMAGE], which we define
   as the information available in the packet header in each QUIC
   packet, and the dynamics of that information. Since QUIC is a
   versioned protocol, the wire image of the header format can also
   change from version to version. However, the field that identifies
   the QUIC version in some packets, and the format of the Version
   Negotiation Packet, are both inspectable and invariant
   [QUIC-INVARIANTS].

   This document describes version 1 of the QUIC protocol, whose wire
   image is fully defined in [QUIC-TRANSPORT] and [QUIC-TLS]. Features
   of the wire image described herein may change in future versions of
   the protocol, except when specified as an invariant
   [QUIC-INVARIANTS], and cannot be used to identify QUIC as a protocol
   or to infer the behavior of future versions of QUIC.

   Section 9.1 provides non-normative guidance on the identification of
   QUIC version 1 packets compared to some pre-standard versions.

2.1.  QUIC Packet Header Structure

   QUIC packets may have either a long header, or a short header. The
   first bit of the QUIC header is the Header Form bit, and indicates
   which type of header is present. The purpose of this bit is
   invariant across QUIC versions.

   The long header exposes more information. It is used during
   connection establishment, including version negotiation, retry, and
   0-RTT data.
   It contains a version number, as well as source and
   destination connection IDs for grouping packets belonging to the same
   flow. The definition and location of these fields in the QUIC long
   header are invariant for future versions of QUIC, although future
   versions of QUIC may provide additional fields in the long header
   [QUIC-INVARIANTS].

   Short headers are used after connection establishment, and contain
   only an optional destination connection ID and the spin bit for RTT
   measurement.

   The following information is exposed in QUIC packet headers:

   *  "fixed bit": the second most significant bit of the first octet of
      most QUIC packets of the current version is set to 1, so that
      endpoints can demultiplex QUIC from other UDP-encapsulated
      protocols. Even though this bit is fixed in the QUICv1
      specification, endpoints may use a version or extension that
      varies the bit. Therefore, observers cannot reliably use it as an
      identifier for QUIC.

   *  latency spin bit: the third most significant bit of the first
      octet in the short packet header. The spin bit is set by endpoints
      such that tracking edge transitions can be used to passively
      observe end-to-end RTT. See Section 3.7.2 for further details.

   *  header type: the long header has a 2-bit packet type field
      following the Header Form and fixed bits. Header types correspond
      to stages of the handshake; see Section 17.2 of [QUIC-TRANSPORT]
      for details.

   *  version number: the version number is present in the long header,
      and identifies the version used for that packet. During Version
      Negotiation (see Section 2.8 and Section 17.2.1 of
      [QUIC-TRANSPORT]), the version number field has a special value
      (0x00000000) that identifies the packet as a Version Negotiation
      packet. Many QUIC versions that start with 0xff implement IETF
      drafts. QUIC versions that start with 0x0000 are reserved for
      IETF consensus documents.
      For example, QUIC version 1 uses
      version 0x00000001. Operators should expect to observe packets
      with other version numbers as a result of various Internet
      experiments and future standards.

   *  source and destination connection ID: short and long packet
      headers carry a destination connection ID, a variable-length field
      that can be used to identify the connection associated with a QUIC
      packet, for load-balancing and NAT rebinding purposes; see
      Section 4.3 and Section 2.6. Long packet headers additionally
      carry a source connection ID. The source connection ID
      corresponds to the destination connection ID the source would like
      to have on packets sent to it, and is only present on long packet
      headers. On long header packets, the length of the connection IDs
      is also present; on short header packets, the length of the
      destination connection ID is implicit.

   *  length: the length of the remaining QUIC packet after the length
      field, present on long headers. This field is used to implement
      coalesced packets during the handshake (see Section 2.2).

   *  token: Initial packets may contain a token, a variable-length
      opaque value optionally sent from client to server, used for
      validating the client's address. Retry packets also contain a
      token, which can be used by the client in an Initial packet on a
      subsequent connection attempt. The length of the token is
      explicit in both cases.

   Retry (Section 17.2.5 of [QUIC-TRANSPORT]) and Version Negotiation
   (Section 17.2.1 of [QUIC-TRANSPORT]) packets are not encrypted or
   obfuscated in any way. For other kinds of packets, other information
   in the packet headers is cryptographically obfuscated:

   *  packet number: All packets except Version Negotiation and Retry
      packets have an associated packet number; however, this packet
      number is encrypted, and therefore not of use to on-path
      observers.
      The offset of the packet number is encoded in long
      headers, while it is implicit (depending on destination connection
      ID length) in short headers. The length of the packet number is
      cryptographically obfuscated.

   *  key phase: The Key Phase bit, present in short headers, specifies
      the keys used to encrypt the packet to support key rotation. The
      Key Phase bit is cryptographically obfuscated.

2.2.  Coalesced Packets

   Multiple QUIC packets may be coalesced into a UDP datagram, with a
   datagram carrying one or more long header packets followed by zero or
   one short header packets. When packets are coalesced, the Length
   fields in the long headers are used to separate QUIC packets; see
   Section 12.2 of [QUIC-TRANSPORT]. The Length header field is
   variable length, and its position in the header is also variable
   depending on the length of the source and destination connection ID;
   see Section 17.2 of [QUIC-TRANSPORT].

2.3.  Use of Port Numbers

   Applications that have a mapping for TCP as well as QUIC are expected
   to use the same port number for both services. However, as with
   TCP-based services, especially when application layer information is
   encrypted, there is no guarantee that a specific application will use
   the registered port, or that the port in use is carrying traffic
   belonging to the respective registered service. For example,
   [QUIC-HTTP] specifies the use of Alt-Svc for discovery of HTTP/3
   services on other ports.

   Further, as QUIC has a connection ID, it is also possible to maintain
   multiple QUIC connections over one 5-tuple. However, if the
   connection ID is not present in the packet header, all packets of the
   5-tuple belong to the same QUIC connection.

2.4.  The QUIC Handshake

   New QUIC connections are established using a handshake, which is
   distinguishable on the wire and contains some information that can be
   passively observed.

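   As an illustration of the Length-based separation of coalesced
   packets described in Section 2.2, the following is a minimal sketch,
   not a definitive implementation. It assumes the version 1 long header
   layout from [QUIC-TRANSPORT] (packet type in bits 0x30 of the first
   octet, a token only on Initial packets, variable-length integer
   encoding for the Token Length and Length fields). The function names
   read_varint and split_coalesced are hypothetical.

```python
from typing import List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a QUIC variable-length integer; return (value, next_pos)."""
    first = buf[pos]
    size = 1 << (first >> 6)          # top two bits select 1, 2, 4 or 8 bytes
    if pos + size > len(buf):
        raise ValueError("truncated variable-length integer")
    value = first & 0x3F
    for byte in buf[pos + 1:pos + size]:
        value = (value << 8) | byte
    return value, pos + size


def split_coalesced(datagram: bytes) -> List[bytes]:
    """Split one UDP payload into QUIC packets (assumes v1 header layout).

    Malformed or truncated input raises ValueError or IndexError.
    """
    packets = []
    pos = 0
    while pos < len(datagram):
        first = datagram[pos]
        if not first & 0x80:
            # Short header: no Length field, so it runs to the datagram end.
            packets.append(datagram[pos:])
            break
        version = int.from_bytes(datagram[pos + 1:pos + 5], "big")
        packet_type = (first >> 4) & 0x03   # assumption: v1 type encoding
        if version == 0 or packet_type == 3:
            # Version Negotiation and Retry carry no Length field.
            packets.append(datagram[pos:])
            break
        cur = pos + 5
        cur += 1 + datagram[cur]            # DCID length octet + DCID
        cur += 1 + datagram[cur]            # SCID length octet + SCID
        if packet_type == 0:                # Initial: token length + token
            token_len, cur = read_varint(datagram, cur)
            cur += token_len
        length, cur = read_varint(datagram, cur)
        end = cur + length                  # Length covers the rest of packet
        if end > len(datagram):
            raise ValueError("Length field exceeds datagram size")
        packets.append(datagram[pos:end])
        pos = end
    return packets
```

   The sketch only looks at unprotected header fields, so it does not
   need any keys. An observer cannot tell from the result what the
   protected payloads contain.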
   To illustrate the information visible in the QUIC wire image during
   the handshake, we first show the general communication pattern
   visible in the UDP datagrams containing the QUIC handshake, then
   examine each of the datagrams in detail.

   In the nominal case, the QUIC handshake can be recognized on the wire
   through at least four datagrams we'll call "QUIC Client Hello", "QUIC
   Server Hello", "Initial Completion", and "Handshake Completion", for
   purposes of this illustration, as shown in Figure 1.

   Packets in the handshake belong to three separate cryptographic and
   transport contexts ("Initial", which contains observable payload, and
   "Handshake" and "1-RTT", which do not). QUIC packets in separate
   contexts during the handshake are generally coalesced (see
   Section 2.2) in order to reduce the number of UDP datagrams sent
   during the handshake.

   As shown here, the client can send 0-RTT data as soon as it has sent
   its Client Hello, and the server can send 1-RTT data as soon as it
   has sent its Server Hello.

   Client                                    Server
     |                                          |
     +----QUIC Client Hello-------------------->|
     +----(zero or more 0RTT)------------------>|
     |                                          |
     |<--------------------QUIC Server Hello----+
     |<---------(1RTT encrypted data starts)----+
     |                                          |
     +----Initial Completion------------------->|
     +----(1RTT encrypted data starts)--------->|
     |                                          |
     |<-----------------Handshake Completion----+
     |                                          |

   Figure 1: General communication pattern visible in the QUIC handshake

   A typical handshake starts with the client sending a QUIC Client
   Hello datagram as shown in Figure 2, which elicits a QUIC Server
   Hello datagram as shown in Figure 3, typically containing three
   packets: an Initial packet with the Server Hello, a Handshake packet
   with the rest of the server's side of the TLS handshake, and initial
   1-RTT data, if present.

   The Initial Completion datagram contains at least one Handshake
   packet, and some also include an Initial packet.

   Datagrams that contain a QUIC Initial Packet (Client Hello, Server
   Hello, and some Initial Completion) must be at least 1200 octets
   long. This protects against amplification attacks and verifies that
   the network path meets minimum Maximum Transmission Unit (MTU)
   requirements. This is usually accomplished with either the addition
   of PADDING frames to the Initial packet, or coalescing of the Initial
   Packet with packets from other encryption contexts.

   The content of QUIC Initial packets is encrypted using Initial
   Secrets, which are derived from a per-version constant and the
   client's destination connection ID; it is therefore observable by
   any on-path device that knows the per-version constant. We therefore
   consider it as visible in our illustration. The content of QUIC
   Handshake packets is encrypted using keys established during the
   initial handshake exchange, and is therefore not visible.

   Initial, Handshake, and the Short Header packets transmitted after
   the handshake belong to separate cryptographic and transport
   contexts. The Initial Completion (Figure 4) and the Handshake
   Completion (Figure 5) datagrams finish the first two of these
   contexts, by sending the final acknowledgment and finishing the
   transmission of CRYPTO frames.

   +----------------------------------------------------------+
   | UDP header (source and destination UDP ports)            |
   +----------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                 |  |
   +----------------------------------------------------------+  |
   | TLS Client Hello (incl. TLS SNI)                         |  |
   +----------------------------------------------------------+  |
   | QUIC PADDING frames                                      |  |
   +----------------------------------------------------------+<-+

   Figure 2: Typical QUIC Client Hello datagram pattern with no 0-RTT

   The Client Hello datagram exposes the version number and the source
   and destination connection IDs in the clear. Information in the TLS
   Client Hello frame, including any TLS Server Name Indication (SNI)
   present, is obfuscated using the Initial secret. Note that the
   location of PADDING is implementation-dependent, and PADDING frames
   may not appear in a coalesced Initial packet.

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                   |  |
   +------------------------------------------------------------+  |
   | TLS Server Hello                                           |  |
   +------------------------------------------------------------+  |
   | QUIC ACK frame (acknowledging client hello)                |  |
   +------------------------------------------------------------+<-+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably CRYPTO frames)               |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

   Figure 3: Typical QUIC Server Hello datagram pattern

   The Server Hello datagram also exposes the version number, the source
   and destination connection IDs, and information in the TLS Server
   Hello message, which is obfuscated using the Initial
   secret.

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | QUIC ACK frame (acknowledging Server Hello Initial)        |  |
   +------------------------------------------------------------+<-+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably CRYPTO/ACK frames)           |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

   Figure 4: Typical QUIC Initial Completion datagram pattern

   The Initial Completion datagram does not expose any additional
   information; however, recognizing it can be used to determine that a
   handshake has completed (see Section 3.2), and for three-way
   handshake RTT estimation as in Section 3.7.

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably ACK frame)                   |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

   Figure 5: Typical QUIC Handshake Completion datagram pattern

   Similar to Initial Completion, Handshake Completion also exposes no
   additional information; observing it serves only to determine that
   the handshake has completed.

   When the client uses 0-RTT connection resumption, 0-RTT data may also
   be seen in the QUIC Client Hello datagram, as shown in Figure 6.

   +----------------------------------------------------------+
   | UDP header (source and destination UDP ports)            |
   +----------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                 |  |
   +----------------------------------------------------------+  |
   | TLS Client Hello (incl. TLS SNI)                         |  |
   +----------------------------------------------------------+<-+
   | QUIC long header (type = 0RTT, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | 0-RTT encrypted payload                                  |  |
   +----------------------------------------------------------+<-+

   Figure 6: Typical 0-RTT QUIC Client Hello datagram pattern

   In a 0-RTT QUIC Client Hello datagram, the PADDING frame is only
   present if necessary to increase the size of the datagram with 0-RTT
   data to at least 1200 bytes. Additional datagrams containing only
   0-RTT protected long header packets may be sent from the client to
   the server after the Client Hello datagram, containing the rest of
   the 0-RTT data. The amount of 0-RTT protected data is limited by the
   initial congestion window, typically around 10 packets [RFC6928].

2.5.  Integrity Protection of the Wire Image

   As soon as the cryptographic context is established, all information
   in the QUIC header, including exposed information, is integrity
   protected. Further, information that was sent and exposed in
   handshake packets sent before the cryptographic context was
   established is validated later during the cryptographic handshake.
   Therefore, devices on path cannot alter any information or bits in
   QUIC packet headers, except specific parts of Initial packets, since
   alteration of header information will lead to a failed integrity
   check at the receiver, and can even lead to connection termination.

2.6.  Connection ID and Rebinding

   The connection ID in the QUIC packet headers allows routing of QUIC
   packets at load balancers on information other than the five-tuple,
   ensuring that related flows are appropriately balanced together, and
   allows rebinding of a connection after one of the endpoint's
   addresses changes, usually the client's.
   Client and server
   negotiate connection IDs during the handshake; typically, however,
   only the server will request a connection ID for the lifetime of the
   connection. Connection IDs for either endpoint may change during the
   lifetime of a connection, with the new connection ID being negotiated
   via encrypted frames. See Section 5.1 of [QUIC-TRANSPORT].
   Therefore, observing a new connection ID does not necessarily
   indicate a new connection.

   Server-generated connection IDs should seek to obscure any encoding
   of routing identities or any other information. Exposing the server
   mapping would allow linkage of multiple IP addresses to the same host
   if the server also supports migration. Furthermore, this opens an
   attack vector on specific servers or pools.

   The best way to obscure an encoding is to appear random to observers,
   which is most rigorously achieved with encryption. Even when
   encrypted, a scheme could embed the unencrypted length of the
   connection ID in the connection ID itself, instead of remembering it.

   [QUIC_LB] further specifies possible algorithms to generate
   connection IDs at load balancers.

2.7.  Packet Numbers

   The packet number field is always present in the QUIC packet header;
   however, it is always encrypted. The encryption key for packet
   number protection on handshake packets sent before cryptographic
   context establishment is specific to the QUIC version, while packet
   number protection on subsequent packets uses secrets derived from the
   end-to-end cryptographic context. Packet numbers are therefore not
   part of the wire image that is visible to on-path observers.

2.8.  Version Negotiation and Greasing

   Version Negotiation packets are not intrinsically protected, but QUIC
   versions can use later encrypted messages to verify that they were
   authentic.
   Therefore any manipulation of the version list carried in
   these packets will be detected and may cause the endpoints to
   terminate the connection attempt.

   Also note that the list of versions in the Version Negotiation packet
   may contain reserved versions. This mechanism is used to avoid
   ossification in the implementation of the selection mechanism.
   Further, a client may send a Client Initial packet with a reserved
   version number to trigger version negotiation. In the Version
   Negotiation packet, the connection IDs of the Client Initial packet
   are reflected to provide a proof of return-routability. Therefore
   changing this information will also cause the connection to fail.

   QUIC is expected to evolve rapidly, so new versions, both
   experimental and IETF standard versions, will be deployed in the
   Internet more often than with traditional Internet- and
   transport-layer protocols. Using a particular version number to
   recognize valid QUIC traffic is likely to persistently miss a
   fraction of QUIC flows and completely fail in the near future, and is
   therefore not recommended. In addition, due to the speed of evolution
   of the protocol, devices that attempt to distinguish QUIC traffic
   from non-QUIC traffic for purposes of network admission control
   should admit all QUIC traffic regardless of version.

3.  Network-visible Information about QUIC Flows

   This section addresses the different kinds of observations and
   inferences that can be made about QUIC flows by a passive observer in
   the network based on the wire image in Section 2. Here we assume a
   bidirectional observer (one that can see packets in both directions
   in the sequence in which they are carried on the wire) unless noted.

3.1.  Identifying QUIC Traffic

   The QUIC wire image is not specifically designed to be
   distinguishable from other UDP traffic.

   The only application binding defined by the IETF QUIC WG is HTTP/3
   [QUIC-HTTP] at the time of this writing; however, many other
   applications are currently being defined and deployed over QUIC, so
   an assumption that all QUIC traffic is HTTP/3 is not valid. HTTP
   over QUIC uses UDP port 443 by default, although URLs referring to
   resources available over HTTP/3 may specify alternate port numbers.
   Simple assumptions about whether a given flow is using QUIC based
   upon a UDP port number may therefore not hold; see also Section 5 of
   [RFC7605].

   While the second most significant bit (0x40) of the first octet is
   set to 1 in most QUIC packets of the current version (see
   Section 2.1), this method of recognizing QUIC traffic is NOT
   RECOMMENDED. First, it only provides one bit of information and is
   quite prone to collide with UDP-based protocols other than those that
   this static bit is meant to allow multiplexing with. Second, this
   feature of the wire image is not invariant [QUIC-INVARIANTS] and may
   change in future versions of the protocol, or even be negotiated
   during the handshake via the use of transport parameters.

   Even though transport parameters transmitted in the client Initial
   are observable by the network, they cannot be modified by the network
   without risking connection failure. Further, the negotiated reply
   from the server cannot be observed, so observers on the network
   cannot know which parameters are actually in use.

3.1.1.  Identifying Negotiated Version

   An in-network observer assuming that a set of packets belongs to a
   QUIC flow can infer the version number in use by observing the
   handshake: an Initial packet with a given version from a client to
   which a server responds with an Initial packet with the same version
   implies acceptance of that version.

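   The inference above can be sketched as follows. This is a minimal
   illustration under stated assumptions, not a definitive
   implementation: it assumes the observer has already grouped datagrams
   into one flow and labeled their direction, and it assumes the version
   1 encoding in which long header type bits 0x30 equal to 0 denote an
   Initial packet. The names initial_version and infer_version and the
   direction labels "c2s" and "s2c" are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def initial_version(datagram: bytes) -> Optional[int]:
    """Return the version if the first packet looks like an Initial packet."""
    if len(datagram) < 5 or not datagram[0] & 0x80:
        return None                       # too short, or short header
    version = int.from_bytes(datagram[1:5], "big")
    if version == 0:
        return None                       # Version Negotiation packet
    if (datagram[0] >> 4) & 0x03 != 0:
        return None                       # assumption: type 0 = Initial (v1)
    return version


def infer_version(flow: Iterable[Tuple[str, bytes]]) -> Optional[int]:
    """Infer the accepted version from (direction, datagram) observations."""
    offered = None
    for direction, datagram in flow:
        version = initial_version(datagram)
        if version is None:
            continue
        if direction == "c2s":
            offered = version             # latest version the client offered
        elif direction == "s2c" and version == offered:
            return version                # server answered with same version
    return None
```

   A result of None only means that no matching pair of Initial packets
   was seen; it does not show that the flow is not QUIC.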
599 Negotiated version cannot be identified for flows for which a 600 handshake is not observed, such as in the case of connection 601 migration; however, these flows can be associated with flows for 602 which a version has been identified; see Section 3.4. 604 This document focuses on QUIC Version 1, and this section applies 605 only to packets belonging to Version 1 QUIC flows; for purposes of 606 on-path observation, it assumes that these packets have been 607 identified as such through the observation of a version number 608 exchange as described above. 610 3.1.2. Rejection of Garbage Traffic 612 A related question is whether a first packet of a given flow on a 613 known QUIC-associated port is a valid QUIC packet, to support in- 614 network filtering of garbage UDP packets (reflection attacks, random 615 backscatter). While heuristics based on the first byte of the packet 616 (packet type) could be used to separate valid from invalid first 617 packet types, the deployment of such heuristics is not recommended, 618 as packet types may have different meanings in future versions of the 619 protocol. 621 3.2. Connection Confirmation 623 Connection establishment uses Initial and Handshake packets 624 containing a TLS handshake, and Retry packets that do not contain 625 parts of the handshake. Connection establishment can therefore be 626 detected using heuristics similar to those used to detect TLS over 627 TCP. A client initiating a 0-RTT connection may also send data 628 packets in 0-RTT Protected packets directly after the Initial packet 629 containing the TLS Client Hello. Since these packets may be 630 reordered in the network, 0-RTT Protected data packets could be seen 631 before the Initial packet. 633 Note that clients send Initial packets before servers do, servers 634 send Handshake packets before clients do, and only clients send 635 Initial packets with tokens. 
Therefore, the role as a client or 636 server can generally be confirmed by an on-path observer. An 637 attempted connection after Retry can be detected by correlating the 638 token on the Retry with the token on the subsequent Initial packet 639 and the destination connection ID of the new Initial packet. 641 3.3. Application Identification 643 The cleartext TLS handshake may contain Server Name Indication (SNI) 644 [RFC6066], by which the client reveals the name of the server it 645 intends to connect to, in order to allow the server to present a 646 certificate based on that name. It may also contain information from 647 Application-Layer Protocol Negotiation (ALPN) [RFC7301], by which the 648 client exposes the names of application-layer protocols it supports; 649 an observer can deduce that one of those protocols will be used if 650 the connection continues. 652 Work is currently underway in the TLS working group to encrypt the 653 SNI in TLS 1.3 [TLS-ESNI]. This would make SNI-based application 654 identification impossible through passive measurement for QUIC and 655 other protocols that use TLS. 657 3.3.1. Extracting Server Name Indication (SNI) Information 659 If the SNI is not encrypted, it can be derived from the QUIC Initial 660 packet by calculating the Initial Secret to decrypt the packet 661 payload and parse the QUIC CRYPTO frame containing the TLS 662 ClientHello. 664 As both the initial salt for the Initial Secret and the CRYPTO 665 frame itself are version-specific, the first step is always to parse 666 the version number (second to fifth byte of the long header). Note 667 that only long header packets carry the version number, so it is 668 necessary to also check if the first bit of the QUIC packet is set to 669 1, indicating a long header. 671 Note that proprietary QUIC versions that were deployed before 672 standardization might not set the first bit in a QUIC long header 673 packet to 1.
To parse these versions, example code is provided in 674 the appendix (see Section 9.1); however, it is expected that these 675 versions will gradually disappear over time. 677 When the version has been identified as QUIC version 1, the packet 678 type needs to be verified as an Initial packet by checking that the 679 third and fourth bits of the header are both set to 0. Then the 680 client destination connection ID needs to be extracted to calculate 681 the Initial Secret together with the version-specific initial salt, 682 as described in [QUIC-TLS]. The length of the connection ID is 683 indicated in the 6th byte of the header followed by the connection ID 684 itself. 686 To determine the end of the header and find the start of the payload, 687 the packet number length, the source connection ID length, and the 688 token length need to be extracted. The packet number length is 689 defined by the seventh and eighth bits of the header as described in 690 Section 17.2 of [QUIC-TRANSPORT], but is obfuscated as described in 691 [QUIC-TLS]. The source connection ID length is specified in the byte 692 after the destination connection ID. The token length, which 693 follows the source connection ID, is a variable length integer as 694 specified in Section 16 of [QUIC-TRANSPORT]. 696 After decryption, the client Initial packet can be parsed to detect 697 the CRYPTO frame that contains the TLS Client Hello, which then can 698 be parsed similarly to TLS over TCP connections. The client Initial 699 packet may contain other frames, so the first bytes of each frame 700 need to be checked to identify the frame type and, if needed, to skip 701 over it. Note that the length of the frames is dependent on the 702 frame type. In QUIC version 1, the packet is expected to only carry 703 the CRYPTO frame and optionally PADDING frames. However, PADDING 704 frames, which are each one byte of zeros, may also occur before or 705 after the CRYPTO frame.
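The header walk and frame scan described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than a complete parser: the function names and the returned dictionary keys are hypothetical, only QUIC version 1 is handled, the Length field read after the token is the one defined for Initial packets in [QUIC-TRANSPORT], frame types are assumed to use their one-byte encoding, and key derivation, header protection removal, and payload decryption as described in [QUIC-TLS] are left out entirely.

```python
from typing import Dict, List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a QUIC variable-length integer; return (value, next position)."""
    first = buf[pos]
    length = 1 << (first >> 6)   # top two bits select 1, 2, 4, or 8 bytes
    value = first & 0x3F         # remaining six bits start the value
    for byte in buf[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, pos + length


def parse_initial_header(pkt: bytes) -> Dict[str, object]:
    """Parse the unprotected fields of a QUIC version 1 Initial packet."""
    if not (pkt[0] & 0x80):
        raise ValueError("not a long header packet")
    if int.from_bytes(pkt[1:5], "big") != 1:
        raise ValueError("this sketch only handles QUIC version 1")
    if pkt[0] & 0x30:
        raise ValueError("not an Initial packet")
    pos = 5
    dcid_len = pkt[pos]                      # 6th byte: DCID length
    dcid = pkt[pos + 1:pos + 1 + dcid_len]
    pos += 1 + dcid_len
    scid_len = pkt[pos]                      # byte after the DCID
    scid = pkt[pos + 1:pos + 1 + scid_len]
    pos += 1 + scid_len
    token_len, pos = read_varint(pkt, pos)
    token = pkt[pos:pos + token_len]
    pos += token_len
    length, pos = read_varint(pkt, pos)      # packet number plus payload
    return {"dcid": dcid, "scid": scid, "token": token,
            "length": length, "pn_offset": pos}


def crypto_frames(payload: bytes) -> List[Tuple[int, bytes]]:
    """Collect (offset, data) from CRYPTO frames in a decrypted payload."""
    frames = []
    pos = 0
    while pos < len(payload):
        frame_type = payload[pos]
        pos += 1
        if frame_type == 0x00:               # PADDING: a single zero byte
            continue
        if frame_type != 0x06:               # other types need own parsers
            raise ValueError(f"unhandled frame type 0x{frame_type:02x}")
        offset, pos = read_varint(payload, pos)
        data_len, pos = read_varint(payload, pos)
        frames.append((offset, payload[pos:pos + data_len]))
        pos += data_len
    return frames
```

The dcid value returned by parse_initial_header is the input an observer would feed, with the version-specific salt, into the Initial Secret derivation. The crypto_frames helper deliberately raises on any frame other than PADDING or CRYPTO, since skipping a frame requires knowing its type-specific length.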
707 Note that client Initial packets after the first do not always use 708 the destination connection ID that was used to generate the Initial 709 keys. Therefore, attempts to decrypt these packets using the 710 procedure above might fail. 712 3.4. Flow Association 714 The QUIC connection ID (see Section 2.6) is designed to allow an on- 715 path device such as a load-balancer to associate two flows as 716 identified by five-tuple when the address and port of one of the 717 endpoints changes, e.g., due to NAT rebinding or server IP address 718 migration. An observer keeping flow state can associate a connection 719 ID with a given flow, and can associate a known flow with a new flow 720 when observing a packet sharing a connection ID and one endpoint 721 address (IP address and port) with the known flow. 723 However, since the connection ID may change multiple times during the 724 lifetime of a flow, and the negotiation of connection ID changes is 725 encrypted, packets with the same 5-tuple but different connection IDs 726 may or may not belong to the same connection. 728 The connection ID value should be treated as opaque; see Section 4.3 729 for caveats regarding connection ID selection at servers. 731 3.5. Flow teardown 733 QUIC does not expose the end of a connection; the only indication to 734 on-path devices that a flow has ended is that packets are no longer 735 observed. Stateful devices on path such as NATs and firewalls must 736 therefore use idle timeouts to determine when to drop state for QUIC 737 flows; see Section 4.1. 739 3.6. Flow Symmetry Measurement 741 QUIC explicitly exposes which side of a connection is a client and 742 which side is a server during the handshake.
In addition, the 743 symmetry of a flow (whether primarily client-to-server, primarily 744 server-to-client, or roughly bidirectional, as input to basic traffic 745 classification techniques) can be inferred through the measurement of 746 data rate in each direction. While QUIC traffic is protected and 747 ACKs may be padded, padding is not required. 749 3.7. Round-Trip Time (RTT) Measurement 751 Round-trip time of QUIC flows can be inferred by observation once per 752 flow, during the handshake, as in passive TCP measurement; this 753 requires parsing of the QUIC packet header and recognition of the 754 handshake, as illustrated in Section 2.4. It can also be inferred 755 during the flow's lifetime, if the endpoints use the spin bit 756 facility described below and in [QUIC-TRANSPORT], section 17.3.1. 758 3.7.1. Measuring Initial RTT 760 In the common case, the delay between the Initial packet containing 761 the TLS Client Hello and the Handshake packet containing the TLS 762 Server Hello represents the RTT component on the path between the 763 observer and the server. The delay between the TLS Server Hello and 764 the Handshake packet containing the TLS Finished message sent by the 765 client represents the RTT component on the path between the observer 766 and the client. While the client may send 0-RTT Protected packets 767 after the Initial packet during 0-RTT connection re-establishment, 768 these can be ignored for RTT measurement purposes. 770 Handshake RTT can be measured by adding the client-to-observer and 771 observer-to-server RTT components together. This measurement 772 necessarily includes any transport and application layer delay (the 773 latter mainly caused by the asymmetric crypto operations associated 774 with the TLS handshake) at both sides. 776 3.7.2. 
Using the Spin Bit for Passive RTT Measurement 778 The spin bit provides an additional method to measure per-flow RTT 779 from observation points on the network path throughout the duration 780 of a connection. Endpoint participation in spin bit signaling is 781 optional in QUIC. That is, while its location is fixed in this 782 version of QUIC, an endpoint can unilaterally choose to not support 783 "spinning" the bit. Use of the spin bit for RTT measurement by 784 devices on path is only possible when both endpoints enable it. Some 785 endpoints may disable use of the spin bit by default, others only in 786 specific deployment scenarios, e.g. for servers and clients where the 787 RTT would reveal the presence of a VPN or proxy. To avoid making 788 these connections identifiable based on the usage of the spin bit, 789 all endpoints randomly disable "spinning" for at least one eighth of 790 connections, even if otherwise enabled by default. An endpoint not 791 participating in spin bit signaling for a given connection can use a 792 fixed spin value for the duration of the connection, or can set the 793 bit randomly on each packet sent. 795 When in use and a QUIC flow sends data continuously, the latency spin 796 bit in each direction changes value once per round-trip time (RTT). 797 An on-path observer can observe the time difference between edges 798 (changes from 1 to 0 or 0 to 1) in the spin bit signal in a single 799 direction to measure one sample of end-to-end RTT. 801 Note that this measurement, as with passive RTT measurement for TCP, 802 includes any transport protocol delay (e.g., delayed sending of 803 acknowledgements) and/or application layer delay (e.g., waiting for a 804 response to be generated). It therefore provides devices on path a 805 good instantaneous estimate of the RTT as experienced by the 806 application. A simple linear smoothing or moving minimum filter can 807 be applied to the stream of RTT information to get a more stable 808 estimate. 
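The edge-timing measurement and the moving minimum filter mentioned above can be sketched in Python as follows. This is a minimal sketch, not a production measurement tool: the function names and the default window size are hypothetical choices, the samples are assumed to come from a single direction of one flow in arrival order, and no attempt is made to reject samples distorted by loss, reordering, or endpoints that do not spin the bit.

```python
from typing import Iterable, List, Tuple


def spin_edge_rtts(samples: Iterable[Tuple[float, int]]) -> List[float]:
    """Return the time between consecutive spin bit edges.

    Each sample is (timestamp, spin_bit) for one observed packet.
    """
    rtts: List[float] = []
    last_bit = None
    last_edge_time = None
    for timestamp, bit in samples:
        if last_bit is not None and bit != last_bit:
            # The spin value changed: this packet marks an edge.
            if last_edge_time is not None:
                rtts.append(timestamp - last_edge_time)
            last_edge_time = timestamp
        last_bit = bit
    return rtts


def moving_minimum(values: List[float], window: int = 4) -> List[float]:
    """Smooth a series by taking the minimum over the last `window` values."""
    return [min(values[max(0, i - window + 1):i + 1])
            for i in range(len(values))]
```

For example, with timestamps in milliseconds, the samples (0, 0), (10, 0), (50, 1), (60, 1), (100, 0), (160, 1) contain edges at 50, 100, and 160, giving two RTT samples of 50 and 60. The first edge only starts the clock, so a flow needs at least two observed edges before it yields a sample.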
810 However, application-limited and flow-control-limited senders can 811 have application and transport layer delay, respectively, that are 812 much greater than network RTT. When the sender is application- 813 limited and, e.g., only sends a small amount of periodic application 814 traffic, where that period is longer than the RTT, measuring the spin 815 bit provides information about the application period, not the 816 network RTT. 818 Since the spin bit logic at each endpoint considers only samples from 819 packets that advance the largest packet number, signal generation 820 itself is resistant to reordering. However, reordering can cause 821 problems at an observer by causing spurious edge detection and 822 therefore inaccurate (i.e., lower) RTT estimates, if reordering 823 occurs across a spin-bit flip in the stream. 825 Simple heuristics based on the observed data rate per flow or changes 826 in the RTT series can be used to reject bad RTT samples due to lost 827 or reordered edges in the spin signal, as well as application or flow 828 control limitation; for example, QoF [TMA-QOF] rejects component RTTs 829 significantly higher than RTTs over the history of the flow. These 830 heuristics may use the handshake RTT as an initial RTT estimate for a 831 given flow. Usually such heuristics would also detect if the spin is 832 either constant or randomly set for a connection. 834 An on-path observer that can see traffic in both directions (from 835 client to server and from server to client) can also use the spin bit 836 to measure "upstream" and "downstream" component RTT; i.e., the 837 component of the end-to-end RTT attributable to the paths between the 838 observer and the server and the observer and the client, 839 respectively. It does this by measuring the delay between a spin 840 edge observed in the upstream direction and that observed in the 841 downstream direction, and vice versa. 843 4.
Specific Network Management Tasks 845 In this section, we review specific network management and 846 measurement techniques and how QUIC's design impacts them. 848 4.1. Stateful Treatment of QUIC Traffic 850 Stateful treatment of QUIC traffic (e.g., at a firewall or NAT 851 middlebox) is possible through QUIC traffic and version 852 identification (Section 3.1) and observation of the handshake for 853 connection confirmation (Section 3.2). The lack of any visible end- 854 of-flow signal (Section 3.5) means that this state must be purged 855 either through timers or through least-recently-used eviction, 856 depending on application requirements. 858 [RFC4787] recommends a 2-minute timeout interval for UDP. However, 859 timers can be lower, in the range of 15 to 30 seconds. In contrast, 860 [RFC5382] recommends a timeout of more than 2 hours for TCP, given 861 that TCP is a connection-oriented protocol with well-defined closure 862 semantics. For network devices that are QUIC-aware, it is 863 recommended to also use longer timeouts for QUIC traffic, as QUIC is 864 connection-oriented. As such, a handshake packet from the server 865 indicates the willingness of the server to communicate with the 866 client. 868 The QUIC header optionally contains a connection ID, which can be used 869 as additional entropy beyond the 5-tuple, if needed. The QUIC 870 handshake needs to be observed in order to understand whether the 871 connection ID is present and what length it has. However, connection 872 IDs may be renegotiated during a connection, and this renegotiation 873 is not visible to the path. Keying state off the connection ID may 874 therefore cause undetectable and unrecoverable loss of state in the 875 middle of a connection. Use of the connection ID is specifically 876 discouraged for NAT applications. 878 4.2. Passive Network Performance Measurement and Troubleshooting 880 Limited RTT measurement is possible by passive observation of QUIC 881 traffic; see Section 3.7.
No passive measurement of loss is possible 882 with the present wire image. Extremely limited observation of 883 upstream congestion may be possible via the observation of CE 884 markings on ECN-enabled QUIC traffic. 886 4.3. Server Cooperation with Load Balancers 888 In the case of content distribution networking architectures 889 including load balancers, the connection ID provides a way for the 890 server to signal information about the desired treatment of a flow to 891 the load balancers. Guidance on assigning connection IDs is given in 892 [QUIC-APPLICABILITY]. 894 4.4. DDoS Detection and Mitigation 896 Current practices in detection and mitigation of Distributed Denial 897 of Service (DDoS) attacks generally involve classification of 898 incoming traffic (as packets, flows, or some other aggregate) into 899 "good" (productive) and "bad" (DDoS) traffic, and then differential 900 treatment of this traffic to forward only good traffic. This 901 operation is often done in a separate specialized mitigation 902 environment through which all traffic is filtered; a generalized 903 architecture for separation of concerns in mitigation is given in 904 [DOTS-ARCH]. 906 Key to successful DDoS mitigation is efficient classification of this 907 traffic in the mitigation environment. Limited first-packet garbage 908 detection as in Section 3.1.2 and stateful tracking of QUIC traffic 909 as in Section 4.1 above may be useful during classification. 911 Note that the use of a connection ID to support connection migration 912 renders 5-tuple based filtering insufficient and requires more state 913 to be maintained by DDoS defense systems. For the common case of NAT 914 rebinding, DDoS defense systems can detect a change in the client's 915 endpoint address by linking flows based on the server's connection 916 IDs. QUIC's linkability resistance ensures that a deliberate 917 connection migration is accompanied by a change in the connection ID. 
919 It is questionable whether connection migrations must be supported 920 during a DDoS attack. If the connection migration is not visible to 921 the network that performs the DDoS detection, an active, migrated 922 QUIC connection may be blocked by such a system under attack. As 923 soon as the connection blocking is detected by the client, the client 924 may rely on the fast resumption mechanism provided by QUIC. When 925 clients migrate to a new path, they should be prepared for the 926 migration to fail and attempt to reconnect quickly. 928 TCP syncookies [RFC4937] are a well-established method of mitigating 929 some kinds of TCP DDoS attacks. QUIC Retry packets are the 930 functional analogue to syncookies, forcing clients to prove 931 possession of their IP address before committing server state. 932 However, there are safeguards in QUIC against unsolicited injection 933 of these packets by intermediaries who do not have the consent of the end 934 server. See [QUIC_LB] for standard ways for intermediaries to send 935 Retry packets on behalf of consenting servers. 937 4.5. UDP Policing 939 Today, UDP is the most prevalent DDoS vector, since it is easy for 940 compromised non-admin applications to send a flood of large UDP 941 packets (while with TCP the attacker gets throttled by the congestion 942 controller) or to craft reflection and amplification attacks. 943 Networks should therefore be prepared for UDP flood attacks on ports 944 used for QUIC traffic. One possible response to this threat is to 945 police UDP traffic on the network, allocating a fixed portion of the 946 network capacity to UDP and blocking UDP datagrams over that cap. 948 The recommended way to police QUIC packets is to either drop them all 949 or to throttle them based on the hash of the UDP datagram's source 950 and destination addresses, blocking a portion of the hash space that 951 corresponds to the fraction of UDP traffic one wishes to drop.
When 952 the handshake is blocked, QUIC-capable applications may fail over to 953 TCP (at least applications using well-known UDP ports). However, 954 blindly blocking a significant fraction of QUIC packets will allow 955 many QUIC handshakes to complete, preventing a TCP failover, but the 956 connections will suffer from severe packet loss. 958 4.6. Distinguishing Acknowledgment traffic 960 Some deployed in-network functions distinguish pure-acknowledgment 961 (ACK) packets from packets carrying upper-layer data in order to 962 attempt to enhance performance, for example by queueing ACKs 963 differently or manipulating ACK signaling. Distinguishing ACK 964 packets is trivial in TCP, but not supported by QUIC, since 965 acknowledgment signaling is carried inside QUIC's encrypted payload, 966 and ACK manipulation is impossible. Specifically, heuristics 967 attempting to distinguish ACK-only packets from payload-carrying 968 packets based on packet size are likely to fail, and are emphatically 969 NOT RECOMMENDED. 971 4.7. Quality of Service handling and ECMP 973 It is expected that any QoS handling in the network, e.g. based on 974 use of DiffServ Code Points (DSCPs) [RFC2475] as well as Equal-Cost 975 Multi-Path (ECMP) routing, is applied on a per-flow basis (and not 976 per-packet), such that all packets belonging to the same QUIC 977 connection get uniform treatment. Using ECMP to distribute packets 978 from a single flow across multiple network paths or any other non- 979 uniform treatment of packets belonging to the same connection could 980 result in variations in order, delivery rate, and drop rate. As 981 feedback about loss or delay of each packet is used as input to the 982 congestion controller, these variations could adversely affect 983 performance. 985 Depending on the loss recovery mechanism implemented, QUIC may be 986 more tolerant of packet reordering than traditional TCP traffic (see 987 Section 2.7).
However, it cannot be known by the network which exact 988 recovery mechanism is used and therefore reordering tolerance should 989 be considered as unknown. 991 4.8. QUIC and Network Address Translation (NAT) 993 QUIC Connection IDs are opaque byte fields that are expressed 994 consistently across all QUIC versions [QUIC-INVARIANTS], see 995 Section 2.6. This feature may appear to present opportunities to 996 optimize NAT port usage and simplify the work of the QUIC server. In 997 fact, NAT behavior that relies on CID may instead cause connection 998 failure when endpoints change Connection ID, and disable important 999 protocol security features. NATs should retain their existing 4- 1000 tuple-based operation and refrain from parsing or otherwise using 1001 QUIC connection IDs. 1003 This section uses the colloquial term NAT to mean NAPT (section 2.2 1004 of [RFC3022]), which overloads several IP addresses to one IP address 1005 or to an IP address pool, as commonly deployed in carrier-grade NATs 1006 or residential NATs. 1008 The remainder of this section explains how QUIC supports NATs better 1009 than other connection-oriented protocols, why NAT use of Connection 1010 ID might appear attractive, and how NAT use of CID can create serious 1011 problems for the endpoints. 1013 [RFC4787] contains some guidance on building NATs to interact 1014 constructively with a wide range of applications. This section 1015 extends the discussion to QUIC. 1017 By using the CID, QUIC connections can survive NAT rebindings as long 1018 as no routing function in the path is dependent on client IP address 1019 and port to deliver packets between server and NAT. Reducing the 1020 timeout on UDP NATs might be tempting in light of this property, but 1021 not all QUIC server deployments will be robust to rebinding. 1023 4.8.1. 
Resource Conservation 1025 NATs sometimes hit an operational limit where they exhaust available 1026 public IP addresses and ports, and must evict flows from their 1027 address/port mapping. CIDs might appear to offer a way to multiplex 1028 many connections over a single address and port. 1030 However, QUIC endpoints may negotiate new connection IDs inside 1031 cryptographically protected packets, and begin using them at will. 1032 Imagine two clients behind a NAT that are sharing the same public IP 1033 address and port. The NAT is differentiating them using the incoming 1034 Connection ID. If one client secretly changes its connection ID, 1035 there will be no mapping for the NAT, and the connection will 1036 suddenly break. 1038 QUIC is deliberately designed to fail rather than persist when the 1039 network cannot support its operation. For HTTP/3, this extends to 1040 recommending a fallback to TCP-based versions of HTTP rather than 1041 persisting with a QUIC connection that might be unstable. In addition, 1042 [I-D.ietf-quic-applicability] recommends TCP fallback for other 1043 protocols on the basis that this is preferable to sudden connection 1044 errors and timeouts. Furthermore, wide deployment of NATs with this 1045 behavior hinders the use of QUIC's migration function, which relies 1046 on the ability to change the connection ID at any time during the 1047 lifetime of a QUIC connection. 1049 It is possible, in principle, to encode the client's identity in a 1050 connection ID using the techniques described in [QUIC_LB] and 1051 explicit coordination with the NAT. However, this implies that the 1052 client shares configuration with the NAT, which might be logistically 1053 difficult. This adds administrative overhead while not resolving the 1054 case where a client migrates to a point behind the NAT.
1056 Note that multiplexing connection IDs over a single port in any case 1057 violates the best common practice to avoid "port overloading" as 1058 described in [RFC4787]. 1060 4.8.2. "Helping" with routing infrastructure issues 1062 Concealing client address changes in order to simplify operational 1063 routing issues will mask important signals that drive security 1064 mechanisms, and therefore open QUIC up to various attacks. 1066 One challenge in QUIC deployments that want to benefit from QUIC's 1067 migration capability is server infrastructures with routers and 1068 switches that direct traffic based on address-port 4-tuple rather 1069 than connection ID. The use of source IP address means that a NAT 1070 rebinding or address migration will deliver packets to the wrong 1071 server. As all QUIC payloads are encrypted, routers and switches 1072 will not have access to negotiated but not-yet-in-use CIDs. This is 1073 a particular problem for low-state load balancers. [QUIC_LB] 1074 addresses this problem by proposing a QUIC extension to allow some 1075 server-load balancer coordination for routable CIDs. 1077 It might seem that a NAT anywhere in front of such an infrastructure 1078 setup could save the effort of converting all these devices by 1079 decoding routable connection IDs and rewriting the packet IP 1080 addresses to allow consistent routing by legacy devices. 1082 Unfortunately, the change of IP address or port is an important 1083 signal to QUIC endpoints. It requires a review of path-dependent 1084 variables like congestion control parameters. It can also signify 1085 various attacks that mislead one endpoint about the best peer address 1086 for the connection (see section 9 of [QUIC-TRANSPORT]). The QUIC 1087 PATH_CHALLENGE and PATH_RESPONSE frames are intended to detect and 1088 mitigate these attacks and verify connectivity to the new address. 1089 This mechanism cannot work if the NAT is bleaching peer address 1090 changes.
1092 For example, an attacker might copy a legitimate QUIC packet and 1093 change the source address to match its own. In the absence of a 1094 bleaching NAT, the receiving endpoint would interpret this as a 1095 potential NAT rebinding and use a PATH_CHALLENGE frame to prove that 1096 the peer endpoint is not truly at the new address, thus thwarting the 1097 attack. A bleaching NAT has no means of sending an encrypted 1098 PATH_CHALLENGE frame, so it might start redirecting all QUIC traffic 1099 to the attacker's address and thus allow an observer to break the 1100 connection. 1102 4.9. Filtering behavior 1104 [RFC4787] describes possible packet filtering behaviors that relate 1105 to NATs. Though the guidance there holds, a particularly unwise 1106 behavior is to admit a handful of UDP packets and then make a 1107 decision as to whether or not to filter the flow. QUIC applications are 1108 encouraged to fail over to TCP if early packets do not arrive at 1109 their destination. Admitting a few packets allows the QUIC endpoint 1110 to determine that the path accepts QUIC. Sudden drops afterwards 1111 will result in slow and costly timeouts before abandoning the 1112 connection. 1114 5. IANA Considerations 1116 This document has no actions for IANA. 1118 6. Security Considerations 1120 QUIC is an encrypted and authenticated transport. That means that, once 1121 the cryptographic handshake is complete, QUIC endpoints discard most 1122 packets that are not authenticated, greatly limiting the ability of 1123 an attacker to interfere with existing connections. 1125 However, some information is still observable, as supporting 1126 manageability of QUIC traffic inherently involves tradeoffs with the 1127 confidentiality of QUIC's control information; this entire document 1128 is therefore security-relevant.
1130 More security considerations for QUIC are discussed in 1131 [QUIC-TRANSPORT] and [QUIC-TLS], generally considering active or 1132 passive attackers in the network as well as attacks on specific QUIC 1133 mechanisms. 1135 Version Negotiation packets do not contain any mechanism to prevent 1136 version downgrade attacks. However, future versions of QUIC that use 1137 Version Negotiation packets are required to define a mechanism that is 1138 robust against version downgrade attacks. Therefore a network node 1139 should not attempt to impact version selection, as version downgrade 1140 may result in connection failure. 1142 7. Contributors 1144 The following people have contributed text to sections of this 1145 document: 1147 * Dan Druta 1149 * Martin Duke 1151 * Marcus Ilhar 1153 * Igor Lubashev 1155 * David Schinazi 1157 8. Acknowledgments 1159 Special thanks to Martin Thomson and Martin Duke for the detailed 1160 reviews and feedback. 1162 This work is partially supported by the European Commission under 1163 Horizon 2020 grant agreement no. 688421 Measurement and Architecture 1164 for a Middleboxed Internet (MAMI), and by the Swiss State Secretariat 1165 for Education, Research, and Innovation under contract no. 15.0268. 1166 This support does not imply endorsement. 1168 9. Appendix 1170 This appendix uses the following conventions: array[i] - one byte at 1171 index i of array; array[i:j] - subset of array starting with index i 1172 (inclusive) up to j-1 (inclusive); array[i:] - subset of array 1173 starting with index i (inclusive) up to the end of the array. 1175 9.1. Distinguishing IETF QUIC and Google QUIC Versions 1177 This section contains algorithms that allow parsing versions from 1178 both Google QUIC and IETF QUIC. These mechanisms will become 1179 irrelevant when IETF QUIC is fully deployed and Google QUIC is 1180 deprecated. 1182 Note that other than this appendix, nothing in this document applies 1183 to Google QUIC.
The purpose of this appendix is merely to 1184 distinguish IETF QUIC from any versions of Google QUIC. 1186 Conceptually, a Google QUIC version is an opaque 32-bit field. When 1187 we refer to a version with four printable characters, we use its 1188 ASCII representation: for example, Q050 refers to {'Q', '0', '5', 1189 '0'}, which is equal to {0x51, 0x30, 0x35, 0x30}. Otherwise, we use 1190 its hexadecimal representation: for example, 0xff00001d refers to 1191 {0xff, 0x00, 0x00, 0x1d}. 1193 QUIC versions that start with 'Q' or 'T' followed by three digits are 1194 Google QUIC versions. Versions up to and including 43 are documented 1195 by . Versions 1197 Q046, Q050, T050, and T051 are not fully documented, but this 1198 appendix should contain enough information to allow parsing Client 1199 Hellos for those versions. 1201 To extract the version number itself, one needs to look at the first 1202 byte of the QUIC packet, in other words the first byte of the UDP 1203 payload. 1205 first_byte = packet[0] 1206 first_byte_bit1 = ((first_byte & 0x80) != 0) 1207 first_byte_bit2 = ((first_byte & 0x40) != 0) 1208 first_byte_bit3 = ((first_byte & 0x20) != 0) 1209 first_byte_bit4 = ((first_byte & 0x10) != 0) 1210 first_byte_bit5 = ((first_byte & 0x08) != 0) 1211 first_byte_bit6 = ((first_byte & 0x04) != 0) 1212 first_byte_bit7 = ((first_byte & 0x02) != 0) 1213 first_byte_bit8 = ((first_byte & 0x01) != 0) 1214 if (first_byte_bit1) { 1215 version = packet[1:5] 1216 } else if (first_byte_bit5 && !first_byte_bit2) { 1217 if (!first_byte_bit8) { 1218 abort("Packet without version") 1219 } 1220 if (first_byte_bit5) { 1221 version = packet[9:13] 1222 } else { 1223 version = packet[5:9] 1224 } 1225 } else { 1226 abort("Packet without version") 1227 } 1229 9.2.
Extracting the CRYPTO frame 1230 counter = 0 1231 while (payload[counter] == 0) { 1232 counter += 1 1233 } 1234 first_nonzero_payload_byte = payload[counter] 1235 fnz_payload_byte_bit3 = ((first_nonzero_payload_byte & 0x20) != 0) 1237 if (first_nonzero_payload_byte != 0x06) { 1238 abort("Unexpected frame") 1239 } 1240 if (payload[counter+1] != 0x00) { 1241 abort("Unexpected crypto stream offset") 1242 } 1243 counter += 2 1244 if ((payload[counter] & 0xc0) == 0) { 1245 crypto_data_length = payload[counter] 1246 counter += 1 1247 } else { 1248 crypto_data_length = payload[counter:counter+2] 1249 counter += 2 1250 } 1251 crypto_data = payload[counter:counter+crypto_data_length] 1252 ParseTLS(crypto_data) 1254 10. References 1256 10.1. Normative References 1258 [QUIC-TLS] Thomson, M. and S. Turner, "Using TLS to Secure QUIC", 1259 Work in Progress, Internet-Draft, draft-ietf-quic-tls-34, 1260 14 January 2021, . 1263 [QUIC-TRANSPORT] 1264 Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed 1265 and Secure Transport", Work in Progress, Internet-Draft, 1266 draft-ietf-quic-transport-34, 14 January 2021, 1267 . 1270 10.2. Informative References 1272 [Ding2015] Ding, H. and M. Rabinovich, "TCP Stretch Acknowledgments 1273 and Timestamps - Findings and Impliciations for Passive 1274 RTT Measurement (ACM Computer Communication Review)", July 1275 2015, . 1278 [DOTS-ARCH] 1279 Mortensen, A., Reddy.K, T., Andreasen, F., Teague, N., and 1280 R. Compton, "Distributed-Denial-of-Service Open Threat 1281 Signaling (DOTS) Architecture", Work in Progress, 1282 Internet-Draft, draft-ietf-dots-architecture-18, 6 March 1283 2020, . 1286 [I-D.ietf-quic-applicability] 1287 Kuehlewind, M. and B. Trammell, "Applicability of the QUIC 1288 Transport Protocol", Work in Progress, Internet-Draft, 1289 draft-ietf-quic-applicability-08, 2 November 2020, 1290 . 1293 [IPIM] Allman, M., Beverly, R., and B. 
              Trammell, "In-Protocol Internet Measurement (arXiv preprint
              1612.02902)", 9 December 2016.

   [QUIC-APPLICABILITY]
              Kuehlewind, M. and B. Trammell, "Applicability of the QUIC
              Transport Protocol", Work in Progress, Internet-Draft,
              draft-ietf-quic-applicability-08, 2 November 2020.

   [QUIC-HTTP]
              Bishop, M., "Hypertext Transfer Protocol Version 3
              (HTTP/3)", Work in Progress, Internet-Draft,
              draft-ietf-quic-http-33, 15 December 2020.

   [QUIC-INVARIANTS]
              Thomson, M., "Version-Independent Properties of QUIC",
              Work in Progress, Internet-Draft,
              draft-ietf-quic-invariants-13, 14 January 2021.

   [QUIC_LB]  Duke, M. and N. Banks, "QUIC-LB: Generating Routable QUIC
              Connection IDs", Work in Progress, Internet-Draft,
              draft-ietf-quic-load-balancers-05, 30 October 2020.

   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
              and W. Weiss, "An Architecture for Differentiated
              Services", RFC 2475, DOI 10.17487/RFC2475, December 1998.

   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
              Address Translator (Traditional NAT)", RFC 3022,
              DOI 10.17487/RFC3022, January 2001.

   [RFC4787]  Audet, F., Ed. and C. Jennings, "Network Address
              Translation (NAT) Behavioral Requirements for Unicast
              UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January
              2007.

   [RFC4937]  Arberg, P. and V. Mammoliti, "IANA Considerations for PPP
              over Ethernet (PPPoE)", RFC 4937, DOI 10.17487/RFC4937,
              June 2007.

   [RFC5382]  Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P.
              Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
              RFC 5382, DOI 10.17487/RFC5382, October 2008.

   [RFC6066]  Eastlake 3rd, D., "Transport Layer Security (TLS)
              Extensions: Extension Definitions", RFC 6066,
              DOI 10.17487/RFC6066, January 2011.

   [RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M.
              Mathis, "Increasing TCP's Initial Window", RFC 6928,
              DOI 10.17487/RFC6928, April 2013.

   [RFC7301]  Friedl, S., Popov, A., Langley, A., and E. Stephan,
              "Transport Layer Security (TLS) Application-Layer Protocol
              Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301,
              July 2014.

   [RFC7605]  Touch, J., "Recommendations on Using Assigned Transport
              Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605,
              August 2015.

   [TLS-ESNI] Rescorla, E., Oku, K., Sullivan, N., and C. Wood, "TLS
              Encrypted Client Hello", Work in Progress, Internet-Draft,
              draft-ietf-tls-esni-09, 16 December 2020.

   [TMA-QOF]  Trammell, B., Gugelmann, D., and N. Brownlee, "Inline Data
              Integrity Signals for Passive Measurement (in Proc. TMA
              2014)", April 2014.

   [WIRE-IMAGE]
              Trammell, B. and M. Kuehlewind, "The Wire Image of a
              Network Protocol", RFC 8546, DOI 10.17487/RFC8546, April
              2019.

Authors' Addresses

   Mirja Kuehlewind
   Ericsson

   Email: mirja.kuehlewind@ericsson.com


   Brian Trammell
   Google
   Gustav-Gull-Platz 1
   CH-8004 Zurich
   Switzerland

   Email: ietf@trammell.ch