Network Working Group                                      M. Kuehlewind
Internet-Draft                                                  Ericsson
Intended status: Informational                               B. Trammell
Expires: 26 August 2021                                           Google
                                                        22 February 2021


              Manageability of the QUIC Transport Protocol
                    draft-ietf-quic-manageability-10

Abstract

   This document discusses manageability of the QUIC transport protocol,
   focusing on caveats impacting network operations involving QUIC
   traffic.
   Its intended audience is network operators, as well as content
   providers that rely on the use of QUIC-aware middleboxes, e.g., for
   load balancing.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 26 August 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Features of the QUIC Wire Image
     2.1.  QUIC Packet Header Structure
     2.2.  Coalesced Packets
     2.3.  Use of Port Numbers
     2.4.  The QUIC Handshake
     2.5.  Integrity Protection of the Wire Image
     2.6.  Connection ID and Rebinding
     2.7.  Packet Numbers
     2.8.  Version Negotiation and Greasing
   3.  Network-visible Information about QUIC Flows
     3.1.  Identifying QUIC Traffic
       3.1.1.  Identifying Negotiated Version
       3.1.2.  Rejection of Garbage Traffic
     3.2.  Connection Confirmation
     3.3.  Distinguishing Acknowledgment traffic
     3.4.  Application Identification
       3.4.1.  Extracting Server Name Indication (SNI) Information
     3.5.  Flow Association
     3.6.  Flow teardown
     3.7.  Flow Symmetry Measurement
     3.8.  Round-Trip Time (RTT) Measurement
       3.8.1.  Measuring Initial RTT
       3.8.2.  Using the Spin Bit for Passive RTT Measurement
   4.  Specific Network Management Tasks
     4.1.  Stateful Treatment of QUIC Traffic
     4.2.  Passive Network Performance Measurement and Troubleshooting
     4.3.  Server Cooperation with Load Balancers
     4.4.  DDoS Detection and Mitigation
     4.5.  UDP Policing
     4.6.  Handling ICMP Messages
     4.7.  Quality of Service handling and ECMP
     4.8.  QUIC and Network Address Translation (NAT)
       4.8.1.  Resource Conservation
       4.8.2.  "Helping" with routing infrastructure issues
     4.9.  Filtering behavior
   5.  IANA Considerations
   6.  Security Considerations
   7.  Contributors
   8.  Acknowledgments
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Appendix
     A.1.  Distinguishing IETF QUIC and Google QUIC Versions
     A.2.  Extracting the CRYPTO frame
   Authors' Addresses

1.  Introduction

   QUIC [QUIC-TRANSPORT] is a new transport protocol that is
   encapsulated in UDP.  QUIC integrates TLS [QUIC-TLS] to encrypt all
   payload data and most control information.  QUIC version 1 was
   designed primarily as a transport for HTTP, with the resulting
   protocol being known as HTTP/3 [QUIC-HTTP].

   Given that QUIC is an end-to-end transport protocol, all information
   in the protocol header, even that which can be inspected, is not
   meant to be mutable by the network, and is therefore
   integrity-protected.  While less information is visible to the
   network than for TCP, integrity protection can also simplify
   troubleshooting, because none of the nodes on the network path can
   modify the transport layer information.

   This document provides guidance for network operations that manage
   QUIC traffic.
   This includes guidance on how to interpret and utilize information
   that is exposed by QUIC to the network, requirements and assumptions
   that the QUIC design makes with respect to network treatment, and a
   description of how common network management practices will be
   impacted by QUIC.

   Since QUIC's wire image [WIRE-IMAGE] is integrity-protected,
   in-network operations that depend on modification of data are not
   possible without the cooperation of an endpoint.  Network operation
   practices that alter data are only possible if performed as a QUIC
   endpoint; this might be possible with the introduction of a proxy
   which authenticates as an endpoint.  Proxy operations are not in
   scope for this document.

   Network management is not a one-size-fits-all endeavour: practices
   considered necessary or even mandatory within enterprise networks
   with certain compliance requirements, for example, would be
   impermissible on other networks without those requirements.  This
   document therefore does not make any specific recommendations as to
   which practices should or should not be applied; for each practice,
   it describes what is and is not possible with the QUIC transport
   protocol as defined.

2.  Features of the QUIC Wire Image

   In this section, we discuss those aspects of the QUIC transport
   protocol that have an impact on the design and operation of devices
   that forward QUIC packets.  Here, we are concerned primarily with the
   unencrypted part of QUIC's wire image [WIRE-IMAGE], which we define
   as the information available in the packet header in each QUIC
   packet, and the dynamics of that information.  Since QUIC is a
   versioned protocol, the wire image of the header format can also
   change from version to version.  However, the field that identifies
   the QUIC version in some packets, and the format of the Version
   Negotiation Packet, are both inspectable and invariant
   [QUIC-INVARIANTS].

   This document describes version 1 of the QUIC protocol, whose wire
   image is fully defined in [QUIC-TRANSPORT] and [QUIC-TLS].  Features
   of the wire image described herein may change in future versions of
   the protocol, except when specified as an invariant
   [QUIC-INVARIANTS], and cannot be used to identify QUIC as a protocol
   or to infer the behavior of future versions of QUIC.

   Appendix A.1 provides non-normative guidance on the identification of
   QUIC version 1 packets compared to some pre-standard versions.

2.1.  QUIC Packet Header Structure

   QUIC packets may have either a long header or a short header.  The
   first bit of the QUIC header is the Header Form bit, and indicates
   which type of header is present.  The purpose of this bit is
   invariant across QUIC versions.

   The long header exposes more information.  It is used during
   connection establishment, including version negotiation, retry, and
   0-RTT data.  It contains a version number, as well as source and
   destination connection IDs for grouping packets belonging to the same
   flow.  The definition and location of these fields in the QUIC long
   header are invariant for future versions of QUIC, although future
   versions of QUIC may provide additional fields in the long header
   [QUIC-INVARIANTS].

   Short headers are used after connection establishment, and contain
   only an optional destination connection ID and the spin bit for RTT
   measurement.

   The following information is exposed in QUIC packet headers:

   *  "fixed bit": the second most significant bit of the first octet of
      most QUIC packets of the current version is set to 1, so that
      endpoints can demultiplex QUIC from other UDP-encapsulated
      protocols.  Even though this bit is fixed in the QUICv1
      specification, endpoints may use a version or extension that
      varies the bit.  Therefore, observers cannot depend on it as an
      identifier for QUIC.

   *  latency spin bit: the third most significant bit of the first
      octet in the short packet header.  The spin bit is set by
      endpoints such that tracking edge transitions can be used to
      passively observe end-to-end RTT.  See Section 3.8.2 for further
      details.

   *  header type: the long header has a 2-bit packet type field
      following the Header Form and fixed bits.  Header types correspond
      to stages of the handshake; see Section 17.2 of [QUIC-TRANSPORT]
      for details.

   *  version number: the version number is present in the long header,
      and identifies the version used for that packet.  During Version
      Negotiation (see Section 2.8 and Section 17.2.1 of
      [QUIC-TRANSPORT]), the version number field has a special value
      (0x00000000) that identifies the packet as a Version Negotiation
      packet.  At the time of publication of this document, QUIC
      versions that start with 0xff implement IETF drafts.  QUIC version
      1 uses version 0x00000001.  Operators should expect to observe
      packets with other version numbers as a result of various Internet
      experiments and future standards.

   *  source and destination connection ID: short and long packet
      headers carry a destination connection ID, a variable-length field
      that can be used to identify the connection associated with a QUIC
      packet, for load-balancing and NAT rebinding purposes; see
      Section 4.3 and Section 2.6.  Long packet headers additionally
      carry a source connection ID.  The source connection ID
      corresponds to the destination connection ID the source would like
      to have on packets sent to it, and is only present on long packet
      headers.  On long header packets, the length of the connection IDs
      is also present; on short header packets, the length of the
      destination connection ID is implicit.

   *  length: the length of the remaining QUIC packet after the Length
      field, present on long headers.
      This field is used to implement coalesced packets during the
      handshake (see Section 2.2).

   *  token: Initial packets may contain a token, a variable-length
      opaque value optionally sent from client to server, used for
      validating the client's address.  Retry packets also contain a
      token, which can be used by the client in an Initial packet on a
      subsequent connection attempt.  The length of the token is
      explicit in both cases.

   Retry (Section 17.2.5 of [QUIC-TRANSPORT]) and Version Negotiation
   (Section 17.2.1 of [QUIC-TRANSPORT]) packets are not encrypted or
   obfuscated in any way.  For other kinds of packets, other information
   in the packet headers is cryptographically obfuscated:

   *  packet number: All packets except Version Negotiation and Retry
      packets have an associated packet number; however, this packet
      number is encrypted, and therefore not of use to on-path
      observers.  The offset of the packet number is encoded in long
      headers, while it is implicit (depending on destination connection
      ID length) in short headers.  The length of the packet number is
      cryptographically obfuscated.

   *  key phase: The Key Phase bit, present in short headers, specifies
      the keys used to encrypt the packet to support key rotation.  The
      Key Phase bit is cryptographically obfuscated.

2.2.  Coalesced Packets

   Multiple QUIC packets may be coalesced into a UDP datagram, with a
   datagram carrying one or more long header packets followed by zero or
   one short header packets.  When packets are coalesced, the Length
   fields in the long headers are used to separate QUIC packets; see
   Section 12.2 of [QUIC-TRANSPORT].  The Length field is
   variable-length, and its position in the header also varies with the
   length of the source and destination connection IDs; see Section 17.2
   of [QUIC-TRANSPORT].

2.3.  Use of Port Numbers

   Applications that have a mapping for TCP as well as QUIC are expected
   to use the same port number for both services.  However, as for all
   other IETF transports [RFC7605], there is no guarantee that a
   specific application will use a given registered port, or that a
   given port carries traffic belonging to the respective registered
   service, especially when application layer information is encrypted.
   For example, [QUIC-HTTP] specifies the use of Alt-Svc for discovery
   of HTTP/3 services on other ports.

   Further, as QUIC has a connection ID, it is also possible to maintain
   multiple QUIC connections over one 5-tuple.  However, if the
   connection ID is not present in the packet header, all packets of the
   5-tuple belong to the same QUIC connection.

2.4.  The QUIC Handshake

   New QUIC connections are established using a handshake, which is
   distinguishable on the wire and contains some information that can be
   passively observed.

   To illustrate the information visible in the QUIC wire image during
   the handshake, we first show the general communication pattern
   visible in the UDP datagrams containing the QUIC handshake, then
   examine each of the datagrams in detail.

   The QUIC handshake can normally be recognized on the wire through at
   least four datagrams, which for purposes of this illustration we call
   "QUIC Client Hello", "QUIC Server Hello", "Initial Completion", and
   "Handshake Completion", as shown in Figure 1.

   Packets in the handshake belong to three separate cryptographic and
   transport contexts ("Initial", which contains observable payload, and
   "Handshake" and "1-RTT", which do not).  QUIC packets in separate
   contexts during the handshake are generally coalesced (see
   Section 2.2) in order to reduce the number of UDP datagrams sent
   during the handshake.

   As shown in Figure 1, the client can send 0-RTT data as soon as it
   has sent its Client Hello, and the server can send 1-RTT data as soon
   as it has sent its Server Hello.

   Client                                    Server
     |                                          |
     +----QUIC Client Hello-------------------->|
     +----(zero or more 0RTT)------------------>|
     |                                          |
     |<--------------------QUIC Server Hello----+
     |<---------(1RTT encrypted data starts)----+
     |                                          |
     +----Initial Completion------------------->|
     +----(1RTT encrypted data starts)--------->|
     |                                          |
     |<-----------------Handshake Completion----+
     |                                          |

   Figure 1: General communication pattern visible in the QUIC handshake

   A typical handshake starts with the client sending a QUIC Client
   Hello datagram as shown in Figure 2, which elicits a QUIC Server
   Hello datagram as shown in Figure 3, typically containing three
   packets: an Initial packet with the Server Hello, a Handshake packet
   with the rest of the server's side of the TLS handshake, and initial
   1-RTT data, if present.

   The Initial Completion datagram contains at least one Handshake
   packet, and in some cases also an Initial packet.

   Datagrams that contain a QUIC Initial Packet (Client Hello, Server
   Hello, and some Initial Completion) contain at least 1200 octets of
   UDP payload.  This protects against amplification attacks and
   verifies that the network path meets the requirements for the minimum
   QUIC IP packet size; see Section 14 of [QUIC-TRANSPORT].  This is
   accomplished by either adding PADDING frames within the Initial
   packet, coalescing other packets with the Initial packet, or leaving
   unused payload in the UDP packet after the Initial packet.  A network
   path needs to be able to forward at least this size of packet for
   QUIC to be used.

   The content of QUIC Initial packets is encrypted using Initial
   Secrets, which are derived from a per-version constant and the
   client's destination connection ID; it is therefore observable by any
   on-path device that knows the per-version constant, and is considered
   visible in this illustration.  The content of QUIC Handshake packets
   is encrypted using keys established during the initial handshake
   exchange, and is therefore not visible.

   Initial, Handshake, and the Short Header packets transmitted after
   the handshake belong to separate cryptographic and transport
   contexts.  The Initial Completion (Figure 4) and Handshake Completion
   (Figure 5) datagrams finish the first two of these contexts by
   sending the final acknowledgment and finishing the transmission of
   CRYPTO frames.

   +----------------------------------------------------------+
   | UDP header (source and destination UDP ports)            |
   +----------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                 |  |
   +----------------------------------------------------------+  |
   | TLS Client Hello (incl. TLS SNI)                         |  |
   +----------------------------------------------------------+  |
   | QUIC PADDING frames                                      |  |
   +----------------------------------------------------------+<-+

     Figure 2: Typical QUIC Client Hello datagram pattern with no 0-RTT

   The Client Hello datagram exposes the version number and the source
   and destination connection IDs without encryption.  Information in
   the TLS Client Hello message, including any TLS Server Name
   Indication (SNI) present, is obfuscated using the Initial secret.
   Note that the location of PADDING is implementation-dependent, and
   PADDING frames may not appear in a coalesced Initial packet.
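
   The fields that a Client Hello datagram exposes, and the derivation
   of the Initial secret from the client's destination connection ID,
   can be sketched as follows.  This is an illustrative sketch only, not
   part of the QUIC specifications: the function names are hypothetical,
   the field layout is the long header layout described in Section 2.1,
   and the salt is assumed to be the version 1 value published in
   [QUIC-TLS], which should be checked against the version actually
   observed.

```python
import hashlib
import hmac

# Assumption: Initial salt for QUIC version 1 as published in [QUIC-TLS].
# Other versions use other salts; verify before relying on this value.
INITIAL_SALT_V1 = bytes.fromhex("38762cf7f55934b34d179ae6a4c80cadccbb7f0a")


def parse_long_header(datagram: bytes) -> dict:
    """Return the unencrypted long header fields of the first packet."""
    # The most significant bit of the first octet is the Header Form bit.
    if len(datagram) < 7 or not (datagram[0] & 0x80):
        raise ValueError("not a long header packet")
    version = int.from_bytes(datagram[1:5], "big")
    dcid_len = datagram[5]
    dcid = datagram[6:6 + dcid_len]
    scid_len_pos = 6 + dcid_len
    if scid_len_pos >= len(datagram):
        raise ValueError("truncated header")
    scid_len = datagram[scid_len_pos]
    scid = datagram[scid_len_pos + 1:scid_len_pos + 1 + scid_len]
    if len(scid) != scid_len:
        raise ValueError("truncated header")
    return {"version": version, "dcid": dcid, "scid": scid}


def initial_secret_v1(client_dcid: bytes) -> bytes:
    """HKDF-Extract(salt, DCID) with SHA-256, i.e. HMAC keyed by the salt."""
    return hmac.new(INITIAL_SALT_V1, client_dcid, hashlib.sha256).digest()
```

   An observer would pass the destination connection ID from the
   client's first Initial packet to initial_secret_v1(); deriving the
   actual packet protection keys from that secret requires further
   HKDF-Expand-Label steps that this sketch omits.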

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                   |  |
   +------------------------------------------------------------+  |
   | TLS Server Hello                                           |  |
   +------------------------------------------------------------+  |
   | QUIC ACK frame (acknowledging client hello)                |  |
   +------------------------------------------------------------+<-+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably CRYPTO frames)               |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

          Figure 3: Typical QUIC Server Hello datagram pattern

   The Server Hello datagram also exposes the version number and the
   source and destination connection IDs.  Information in the TLS Server
   Hello message is obfuscated using the Initial secret.
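
   A datagram such as the one in Figure 3 can be separated into its
   coalesced packets using the Length fields described in Section 2.2.
   The following is a minimal sketch under stated assumptions: it
   handles only QUIC version 1, the names split_coalesced and
   read_varint are hypothetical, and the packet type values and
   variable-length integer encoding are those of [QUIC-TRANSPORT].

```python
from typing import List, Tuple

# Long header packet types in QUIC version 1 (two bits of the first octet).
LONG_TYPES = {0: "Initial", 1: "0-RTT", 2: "Handshake", 3: "Retry"}


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a variable-length integer; return (value, next position)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    size = 1 << (buf[pos] >> 6)  # top two bits select 1, 2, 4 or 8 octets
    if pos + size > len(buf):
        raise ValueError("truncated varint")
    value = buf[pos] & 0x3F
    for octet in buf[pos + 1:pos + size]:
        value = (value << 8) | octet
    return value, pos + size


def split_coalesced(datagram: bytes) -> List[Tuple[str, bytes]]:
    """Split a version 1 UDP payload into (kind, packet bytes) pairs."""
    packets = []
    pos = 0
    try:
        while pos < len(datagram):
            first = datagram[pos]
            if not first & 0x80:
                # A short header packet has no Length field, so it runs to
                # the end of the datagram. Unused payload after an Initial
                # packet would also land here; this sketch cannot tell.
                packets.append(("short", datagram[pos:]))
                break
            if int.from_bytes(datagram[pos + 1:pos + 5], "big") != 1:
                raise ValueError("sketch handles QUIC version 1 only")
            ptype = (first >> 4) & 0x03
            cur = pos + 5
            cur += 1 + datagram[cur]  # skip DCID length and DCID
            cur += 1 + datagram[cur]  # skip SCID length and SCID
            if ptype == 3:
                # Retry packets carry no Length field.
                packets.append((LONG_TYPES[ptype], datagram[pos:]))
                break
            if ptype == 0:
                token_len, cur = read_varint(datagram, cur)
                cur += token_len
            length, cur = read_varint(datagram, cur)
            end = cur + length
            if end > len(datagram):
                raise ValueError("Length exceeds datagram")
            packets.append((LONG_TYPES[ptype], datagram[pos:end]))
            pos = end
    except IndexError:
        raise ValueError("truncated header") from None
    return packets
```

   Only the packet boundaries and header types are recovered this way;
   the packet numbers and payloads remain protected as described in
   Section 2.1.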

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | QUIC ACK frame (acknowledging Server Hello Initial)        |  |
   +------------------------------------------------------------+<-+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably CRYPTO/ACK frames)           |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

       Figure 4: Typical QUIC Initial Completion datagram pattern

   The Initial Completion datagram does not expose any additional
   information; however, recognizing it can be used to determine that a
   handshake has completed (see Section 3.2), and for three-way
   handshake RTT estimation as in Section 3.8.
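
   The three-way handshake RTT estimation mentioned above can be
   sketched as simple arithmetic on the arrival times of the Client
   Hello, Server Hello, and Initial Completion datagrams at a single
   observation point.  This is a hedged illustration, not a method
   defined by this document: the class and function names are
   hypothetical, timestamps are assumed to be integer milliseconds from
   one clock, and the result includes endpoint processing delay as well
   as path delay.

```python
from dataclasses import dataclass


@dataclass
class HandshakeTimes:
    """Observation times in milliseconds at one on-path vantage point."""
    client_hello: int
    server_hello: int
    initial_completion: int


def handshake_rtt(times: HandshakeTimes) -> dict:
    """Estimate RTT components from the first three handshake datagrams."""
    # Client Hello -> Server Hello covers the observer-to-server leg.
    to_server = times.server_hello - times.client_hello
    # Server Hello -> Initial Completion covers the observer-to-client leg.
    to_client = times.initial_completion - times.server_hello
    if to_server < 0 or to_client < 0:
        raise ValueError("datagrams observed out of handshake order")
    return {
        "observer_to_server_ms": to_server,
        "observer_to_client_ms": to_client,
        "end_to_end_ms": to_server + to_client,
    }
```

   Because a handshake yields only a single sample, an estimate of this
   kind is best read as an upper bound on the path RTT at connection
   start.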

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably ACK frame)                   |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

      Figure 5: Typical QUIC Handshake Completion datagram pattern

   Similar to Initial Completion, Handshake Completion also exposes no
   additional information; observing it serves only to determine that
   the handshake has completed.

   When the client uses 0-RTT connection resumption, 0-RTT data may also
   be seen in the QUIC Client Hello datagram, as shown in Figure 6.

   +----------------------------------------------------------+
   | UDP header (source and destination UDP ports)            |
   +----------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                 |  |
   +----------------------------------------------------------+  |
   | TLS Client Hello (incl. TLS SNI)                         |  |
   +----------------------------------------------------------+<-+
   | QUIC long header (type = 0RTT, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | 0-RTT encrypted payload                                  |  |
   +----------------------------------------------------------+<-+

       Figure 6: Typical 0-RTT QUIC Client Hello datagram pattern

   In a 0-RTT QUIC Client Hello datagram, the PADDING frame is only
   present if necessary to increase the size of the datagram with 0-RTT
   data to at least 1200 bytes.  Additional datagrams containing only
   0-RTT protected long header packets may be sent from the client to
   the server after the Client Hello datagram, containing the rest of
   the 0-RTT data.  The amount of 0-RTT protected data that can be sent
   in the first round is limited by the initial congestion window,
   typically around 10 packets (see Section 7.2 of [QUIC-RECOVERY]).

2.5.  Integrity Protection of the Wire Image

   As soon as the cryptographic context is established, all information
   in the QUIC header, including exposed information, is integrity
   protected.  Further, information that was sent and exposed in
   handshake packets before the cryptographic context was established is
   validated later during the cryptographic handshake.  Therefore,
   devices on path cannot alter any information or bits in QUIC packets.
   Such alterations would cause the integrity check to fail, which
   results in the receiver discarding the packet.  Some parts of Initial
   packets could be altered by removing and re-applying the
   authenticated encryption without immediate discard at the receiver.
   However, the cryptographic handshake validates most fields, and any
   modification of those fields will cause connection establishment to
   fail later on.

2.6.  Connection ID and Rebinding

   The connection ID in the QUIC packet headers allows load balancers to
   route QUIC packets on information other than the five-tuple, ensuring
   that related flows are appropriately balanced together.  It also
   allows rebinding of a connection after one of the endpoints'
   addresses changes, usually the client's.  Client and server negotiate
   connection IDs during the handshake; typically, however, only the
   server will request a connection ID for the lifetime of the
   connection.  Connection IDs for either endpoint may change during the
   lifetime of a connection, with the new connection ID being negotiated
   via encrypted frames.  See Section 5.1 of [QUIC-TRANSPORT].
   Therefore, observing a new connection ID does not necessarily
   indicate a new connection.

   Server-generated connection IDs should seek to obscure any encoding
   of routing identities or any other information.  Exposing the server
   mapping would allow linkage of multiple IP addresses to the same host
   if the server also supports migration.  Furthermore, this opens an
   attack vector on specific servers or pools.

   The best way to obscure an encoding is to appear random to observers,
   which is most rigorously achieved with encryption.  Even when
   encrypted, a scheme could embed the unencrypted length of the
   connection ID in the connection ID itself, instead of remembering it.

   [QUIC_LB] further specifies possible algorithms to generate
   connection IDs at load balancers.

2.7.  Packet Numbers

   The packet number field is always present in the QUIC packet header;
   however, it is always encrypted.  The encryption key for packet
   number protection on handshake packets sent before cryptographic
   context establishment is specific to the QUIC version, while packet
   number protection on subsequent packets uses secrets derived from the
   end-to-end cryptographic context.
   Packet numbers are therefore not part of the wire image that is
   visible to on-path observers.

2.8.  Version Negotiation and Greasing

   Version Negotiation packets are used by the server to indicate that a
   requested version from the client is not supported (see Section 6 of
   [QUIC-TRANSPORT]).  Version Negotiation packets are not intrinsically
   protected, but QUIC versions can use later encrypted messages to
   verify that they were authentic.  Therefore any modification of the
   list of supported versions they carry will be detected and may cause
   the endpoints to terminate the connection attempt.

   Also note that the list of versions in the Version Negotiation packet
   may contain reserved versions.  This mechanism is used to avoid
   ossification in the implementation of the selection mechanism.
   Further, a client may send an Initial packet with a reserved version
   number to trigger version negotiation.  In the Version Negotiation
   packet, the connection IDs of the client's Initial packet are
   reflected to provide a proof of return-routability.  Therefore
   changing this information will also cause the connection to fail.

   QUIC is expected to evolve rapidly, so new versions, both
   experimental and IETF standard versions, will be deployed in the
   Internet more often than with traditional Internet- and
   transport-layer protocols.  Using a particular version number to
   recognize valid QUIC traffic is likely to persistently miss a
   fraction of QUIC flows and completely fail in the near future, and is
   therefore not recommended.  In addition, due to the speed of
   evolution of the protocol, devices that attempt to distinguish QUIC
   traffic from non-QUIC traffic for purposes of network admission
   control should admit all QUIC traffic regardless of version.

3.  Network-visible Information about QUIC Flows

   This section addresses the different kinds of observations and
   inferences that can be made about QUIC flows by a passive observer in
   the network based on the wire image in Section 2.  Here we assume a
   bidirectional observer (one that can see packets in both directions
   in the sequence in which they are carried on the wire) unless noted.

3.1.  Identifying QUIC Traffic

   The QUIC wire image is not specifically designed to be
   distinguishable from other UDP traffic.

   At the time of this writing, the only application binding defined by
   the IETF QUIC WG is HTTP/3 [QUIC-HTTP]; however, many other
   applications are currently being defined and deployed over QUIC, so
   an assumption that all QUIC traffic is HTTP/3 is not valid.  HTTP
   over QUIC uses UDP port 443 by default, although URLs referring to
   resources available over HTTP/3 may specify alternate port numbers.
   Simple assumptions about whether a given flow is using QUIC based
   upon a UDP port number may therefore not hold; see also Section 5 of
   [RFC7605].

   While the second most significant bit (0x40) of the first octet is
   set to 1 in most QUIC packets of the current version (see Section 2.1
   and Section 17 of [QUIC-TRANSPORT]), this method of recognizing QUIC
   traffic is not reliable.  First, it only provides one bit of
   information and is prone to collision with UDP-based protocols other
   than those that this static bit is meant to allow multiplexing with.
   Second, this feature of the wire image is not invariant
   [QUIC-INVARIANTS] and may change in future versions of the protocol,
   or even be negotiated during the handshake via the use of transport
   parameters.

   Even though transport parameters transmitted in the client Initial
   are observable by the network, they cannot be modified by the network
   without risking connection failure.
Further, the negotiated reply from the server cannot be observed, so observers on the network cannot know which parameters are actually in use.

3.1.1. Identifying Negotiated Version

An in-network observer assuming that a set of packets belongs to a QUIC flow can infer the version number in use by observing the handshake: for QUIC version 1, if the version number in the Initial packet from a client is the same as the version number in the Initial packet of the server response, that version has been accepted by both endpoints to be used for the rest of the connection.

The negotiated version cannot be identified for flows for which a handshake is not observed, such as in the case of connection migration; however, it might be possible to associate a flow with a flow for which a version has been identified; see Section 3.5.

This document focuses on QUIC Version 1, and this section applies only to packets belonging to Version 1 QUIC flows; for purposes of on-path observation, it assumes that these packets have been identified as such through the observation of a version number exchange as described above.

3.1.2. Rejection of Garbage Traffic

A related question is whether a first packet of a given flow on a known QUIC-associated port is a valid QUIC packet, to support in-network filtering of garbage UDP packets (reflection attacks, random backscatter). While heuristics based on the first byte of the packet (packet type) could be used to separate valid from invalid first packet types, the deployment of such heuristics is not recommended, as packet types may have different meanings in future versions of the protocol.

3.2. Connection Confirmation

Connection establishment uses Initial and Handshake packets containing a TLS handshake, and Retry packets that do not contain parts of the handshake.
Connection establishment can therefore be detected using heuristics similar to those used to detect TLS over TCP. A client initiating a 0-RTT connection may also send data packets in 0-RTT Protected packets directly after the Initial packet containing the TLS Client Hello. Since these packets may be reordered in the network, 0-RTT Protected data packets could be seen before the Initial packet.

Note that clients send Initial packets before servers do, servers send Handshake packets before clients do, and only clients send Initial packets with tokens. Therefore, the role as a client or server can generally be confirmed by an on-path observer. An attempted connection after Retry can be detected by correlating the token on the Retry with the token on the subsequent Initial packet and the destination connection ID of the new Initial packet.

3.3. Distinguishing Acknowledgment traffic

Some deployed in-network functions distinguish pure-acknowledgment (ACK) packets from packets carrying upper-layer data in order to attempt to enhance performance, for example by queueing ACKs differently or manipulating ACK signaling. Distinguishing ACK packets is trivial in TCP, but not supported by QUIC, since acknowledgment signaling is carried inside QUIC's encrypted payload, and ACK manipulation is impossible. Specifically, heuristics attempting to distinguish ACK-only packets from payload-carrying packets based on packet size are likely to fail, and are not recommended as a way to infer internals of QUIC's operation, as those mechanisms can change, e.g., due to the use of extensions.

3.4. Application Identification

The cleartext TLS handshake may contain Server Name Indication (SNI) [RFC6066], by which the client reveals the name of the server it intends to connect to, in order to allow the server to present a certificate based on that name.
It may also contain information from Application-Layer Protocol Negotiation (ALPN) [RFC7301], by which the client exposes the names of application-layer protocols it supports; an observer can deduce that one of those protocols will be used if the connection continues.

Work is currently underway in the TLS working group to encrypt the SNI in TLS 1.3 [TLS-ESNI]. This would make SNI-based application identification impossible by on-path observation for QUIC and other protocols that use TLS.

3.4.1. Extracting Server Name Indication (SNI) Information

If the SNI is not encrypted, it can be derived from the QUIC Initial packet by calculating the Initial Secret, decrypting the packet payload, and parsing the QUIC CRYPTO frame containing the TLS ClientHello.

As both the initial salt for the Initial Secret and the CRYPTO frame itself are version-specific, the first step is always to parse the version number (second to fifth byte of the long header). Note that only long header packets carry the version number, so it is necessary to also check if the first bit of the QUIC packet is set to 1, indicating a long header.

Note that proprietary QUIC versions that were deployed before standardization might not set the first bit in QUIC long header packets to 1. To parse these versions, example code is provided in the appendix (see Appendix A.1); however, it is expected that these versions will gradually disappear over time.

When the version has been identified as QUIC version 1, the packet type needs to be verified as an Initial packet by checking that the third and fourth bits of the header are both set to 0. Then the client destination connection ID needs to be extracted to calculate the Initial Secret together with the version-specific initial salt, as described in [QUIC-TLS].
The length of the connection ID is indicated in the 6th byte of the header followed by the connection ID itself.

To determine the end of the header and find the start of the payload, the packet number length, the source connection ID length, and the token length need to be extracted. The packet number length is defined by the seventh and eighth bits of the header as described in section 17.2 of [QUIC-TRANSPORT], but is obfuscated as described in [QUIC-TLS]. The source connection ID length is specified in the byte after the destination connection ID. The token length, which follows the source connection ID, is a variable-length integer as specified in section 16 of [QUIC-TRANSPORT].

After decryption, the Client Initial packet can be parsed to detect the CRYPTO frame that contains the TLS Client Hello, which then can be parsed similarly to TLS over TCP connections. The Client Initial packet may contain other frames, so the first bytes of each frame need to be checked to identify the frame type and, if needed, skip over it. Note that the length of the frames is dependent on the frame type. In QUIC version 1, the packet is expected to carry only the CRYPTO frame and optionally PADDING frames. Note that PADDING frames, each consisting of a single zero byte, may occur before or after the CRYPTO frame.

Note that client Initial packets after the first do not always use the destination connection ID that was used to generate the Initial keys. Therefore, attempts to decrypt these packets using the procedure above might fail.

3.5. Flow Association

The QUIC connection ID (see Section 2.6) is designed to allow an on-path device such as a load balancer to associate two flows as identified by five-tuple when the address and port of one of the endpoints changes; e.g. due to NAT rebinding or server IP address migration.
An observer keeping flow state can associate a connection ID with a given flow, and can associate a known flow with a new flow when observing a packet sharing a connection ID and one endpoint address (IP address and port) with the known flow.

However, since the connection ID may change multiple times during the lifetime of a flow, and the negotiation of connection ID changes is encrypted, packets with the same 5-tuple but different connection IDs may or may not belong to the same connection.

The connection ID value should be treated as opaque; see Section 4.3 for caveats regarding connection ID selection at servers.

3.6. Flow teardown

QUIC does not expose the end of a connection; the only indication to on-path devices that a flow has ended is that packets are no longer observed. Stateful devices on path such as NATs and firewalls must therefore use idle timeouts to determine when to drop state for QUIC flows; see also Section 4.1.

3.7. Flow Symmetry Measurement

QUIC explicitly exposes which side of a connection is a client and which side is a server during the handshake. In addition, the symmetry of a flow (whether primarily client-to-server, primarily server-to-client, or roughly bidirectional, as input to basic traffic classification techniques) can be inferred through the measurement of data rate in each direction. While QUIC traffic is protected and ACKs may be padded, padding is not required.

3.8. Round-Trip Time (RTT) Measurement

Round-trip time of QUIC flows can be inferred by observation once per flow, during the handshake, as in passive TCP measurement; this requires parsing of the QUIC packet header and recognition of the handshake, as illustrated in Section 2.4. It can also be inferred during the flow's lifetime, if the endpoints use the spin bit facility described below and in [QUIC-TRANSPORT], section 17.3.1.
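
As an informal illustration of the header parsing mentioned above, the following Python sketch checks whether a UDP payload starts with a long header and extracts the version, the connection IDs, and, for Version 1 only, the packet type. The field offsets follow the description in Section 3.4.1, and the type values other than Initial are taken from section 17.2 of [QUIC-TRANSPORT]. The names LongHeader and parse_long_header are hypothetical and are not taken from any QUIC implementation.

```python
from dataclasses import dataclass
from typing import Optional

QUIC_V1 = 0x00000001
# Long header packet types for Version 1 only (bits 0x30 of the first byte).
V1_LONG_TYPES = {0: "Initial", 1: "0-RTT", 2: "Handshake", 3: "Retry"}


@dataclass
class LongHeader:
    version: int
    packet_type: Optional[str]  # None when the version is not Version 1
    dcid: bytes
    scid: bytes


def parse_long_header(datagram: bytes) -> Optional[LongHeader]:
    """Parse the start of a long header; return None for anything else."""
    # Minimum: first byte, 4 version bytes, and two connection ID length bytes.
    if len(datagram) < 7 or not datagram[0] & 0x80:
        return None
    version = int.from_bytes(datagram[1:5], "big")  # bytes 2 to 5
    pos = 5
    dcid_len = datagram[pos]  # 6th byte: destination connection ID length
    pos += 1
    if len(datagram) < pos + dcid_len + 1:
        return None
    dcid = datagram[pos:pos + dcid_len]
    pos += dcid_len
    scid_len = datagram[pos]  # byte after the destination connection ID
    pos += 1
    if len(datagram) < pos + scid_len:
        return None
    scid = datagram[pos:pos + scid_len]
    packet_type = None
    if version == QUIC_V1:
        packet_type = V1_LONG_TYPES[(datagram[0] & 0x30) >> 4]
    return LongHeader(version, packet_type, dcid, scid)
```

An observer could timestamp the first packet of each type seen in each direction and derive handshake timing from the differences. The sketch only looks at the first packet in a datagram and does not handle coalesced packets (Section 2.2) or reordering.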

3.8.1. Measuring Initial RTT

In the common case, the delay between the Initial packet containing the TLS Client Hello and the Initial packet containing the TLS Server Hello represents the RTT component on the path between the observer and the server. The delay between the TLS Server Hello and the Handshake packet containing the TLS Finished message sent by the client represents the RTT component on the path between the observer and the client. While the client may send 0-RTT Protected packets after the Initial packet during 0-RTT connection re-establishment, these can be ignored for RTT measurement purposes.

Handshake RTT can be measured by adding the client-to-observer and observer-to-server RTT components together. This measurement necessarily includes any transport and application layer delay (the latter mainly caused by the asymmetric crypto operations associated with the TLS handshake) at both sides.

3.8.2. Using the Spin Bit for Passive RTT Measurement

The spin bit provides a version-specific method to measure per-flow RTT from observation points on the network path throughout the duration of a connection. See section 17.4 of [QUIC-TRANSPORT] for the definition of the spin bit in Version 1 of QUIC. Endpoint participation in spin bit signaling is optional. That is, while its location is fixed in this version of QUIC, an endpoint can unilaterally choose to not support "spinning" the bit.

Use of the spin bit for RTT measurement by devices on path is only possible when both endpoints enable it. Some endpoints may disable use of the spin bit by default, others only in specific deployment scenarios, e.g. for servers and clients where the RTT would reveal the presence of a VPN or proxy.
To avoid making these connections identifiable based on the usage of the spin bit, all endpoints randomly disable "spinning" for at least one eighth of connections, even if otherwise enabled by default. An endpoint not participating in spin bit signaling for a given connection can use a fixed spin value for the duration of the connection, or can set the bit randomly on each packet sent.

When the spin bit is in use and a QUIC flow sends data continuously, the latency spin bit in each direction changes value once per round-trip time (RTT). An on-path observer can observe the time difference between edges (changes from 1 to 0 or 0 to 1) in the spin bit signal in a single direction to measure one sample of end-to-end RTT. This mechanism follows the principles of protocol measurability laid out in [IPIM].

Note that this measurement, as with passive RTT measurement for TCP, includes any transport protocol delay (e.g., delayed sending of acknowledgements) and/or application layer delay (e.g., waiting for a response to be generated). It therefore provides devices on path a good instantaneous estimate of the RTT as experienced by the application.

However, application-limited and flow-control-limited senders can have application and transport layer delay, respectively, that are much greater than network RTT. When the sender is application-limited and e.g. only sends a small amount of periodic application traffic, where that period is longer than the RTT, measuring the spin bit provides information about the application period, not the network RTT.

Since the spin bit logic at each endpoint considers only samples from packets that advance the largest packet number, signal generation itself is resistant to reordering.
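
The edge-based measurement described above can be sketched as follows. This is a minimal illustration and not a complete observer: it assumes Version 1 short header packets, in which the spin bit is the 0x20 bit of the first byte (see [QUIC-TRANSPORT]), takes (timestamp, first byte) pairs for one direction of a single flow in arrival order, and applies no filtering of bad samples. The function name spin_rtt_samples is hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple

LONG_HEADER_BIT = 0x80  # set in long header packets, which carry no spin bit
SPIN_BIT = 0x20         # spin bit in the first byte of a Version 1 short header


def spin_rtt_samples(packets: Iterable[Tuple[float, int]]) -> Iterator[float]:
    """Yield one RTT sample per pair of consecutive spin bit edges.

    `packets` holds (timestamp_seconds, first_byte) for packets observed in
    a single direction of one flow, in the order they were seen.
    """
    last_spin: Optional[bool] = None
    last_edge_time: Optional[float] = None
    for timestamp, first_byte in packets:
        if first_byte & LONG_HEADER_BIT:
            continue  # handshake packets do not carry the spin bit
        spin = bool(first_byte & SPIN_BIT)
        if last_spin is not None and spin != last_spin:
            # An edge: the spin value differs from the previous packet.
            if last_edge_time is not None:
                yield timestamp - last_edge_time
            last_edge_time = timestamp
        last_spin = spin
```

The first edge only starts the clock, so a flow needs at least two edges before any sample is produced. An endpoint that keeps the bit constant yields no samples at all, and one that sets it randomly yields many implausibly short ones.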
However, reordering can cause problems at an observer by causing spurious edge detection and therefore inaccurate (i.e., lower) RTT estimates, if reordering occurs across a spin-bit flip in the stream.

Simple heuristics based on the observed data rate per flow or changes in the RTT series can be used to reject bad RTT samples due to lost or reordered edges in the spin signal, as well as application or flow control limitation; for example, QoF [TMA-QOF] rejects component RTTs significantly higher than RTTs over the history of the flow. These heuristics may use the handshake RTT as an initial RTT estimate for a given flow. Usually such heuristics would also detect if the spin is either constant or randomly set for a connection.

An on-path observer that can see traffic in both directions (from client to server and from server to client) can also use the spin bit to measure "upstream" and "downstream" component RTT; i.e., the component of the end-to-end RTT attributable to the paths between the observer and the server and the observer and the client, respectively. It does this by measuring the delay between a spin edge observed in the upstream direction and that observed in the downstream direction, and vice versa.

Raw RTT samples generated using these techniques can be processed in various ways to generate useful network performance metrics. A simple linear smoothing or moving minimum filter can be applied to the stream of RTT samples to get a more stable estimate of application-experienced RTT. RTT samples measured from the spin bit can also be used to generate RTT distribution information, including minimum RTT (which approximates network RTT over longer time windows) and RTT variance (which approximates jitter as seen by the application).

4. Specific Network Management Tasks

In this section, we review specific network management and measurement techniques and how QUIC's design impacts them.

4.1. Stateful Treatment of QUIC Traffic

Stateful treatment of QUIC traffic (e.g., at a firewall or NAT middlebox) is possible through QUIC traffic and version identification (Section 3.1) and observation of the handshake for connection confirmation (Section 3.2). The lack of any visible end-of-flow signal (Section 3.6) means that this state must be purged either through timers or through least-recently-used eviction, depending on application requirements.

[RFC4787] requires a timeout that is not less than 2 minutes for most UDP traffic. However, in practice, timers are often lower, in the range of 15 to 30 seconds. In contrast, [RFC5382] recommends a timeout of more than 2 hours for TCP, given that TCP is a connection-oriented protocol with well-defined closure semantics. For network devices that are QUIC-aware, it is recommended to also use longer timeouts for QUIC traffic, as QUIC is connection-oriented. As such, a handshake packet from the server indicates the willingness of the server to communicate with the client.

The QUIC header optionally contains a connection ID which can be used as additional entropy beyond the 5-tuple, if needed. The QUIC handshake needs to be observed in order to understand whether the connection ID is present and what length it has. However, connection IDs may be renegotiated after the handshake, and this renegotiation is not visible to the path. Using the connection ID as a flow key field for stateful treatment of flows may therefore cause undetectable and unrecoverable loss of state in the middle of a connection. Use of connection IDs is specifically discouraged for NAT applications.

4.2. Passive Network Performance Measurement and Troubleshooting

Limited RTT measurement is possible by passive observation of QUIC traffic; see Section 3.8. No passive measurement of loss is possible with the present wire image. Extremely limited observation of upstream congestion may be possible via the observation of CE markings on ECN-enabled QUIC traffic.

4.3. Server Cooperation with Load Balancers

In the case of content distribution networking architectures including load balancers, the connection ID provides a way for the server to signal information about the desired treatment of a flow to the load balancers. Guidance on assigning connection IDs is given in [QUIC-APPLICABILITY].

4.4. DDoS Detection and Mitigation

Current practices in detection and mitigation of Distributed Denial of Service (DDoS) attacks generally involve classification of incoming traffic (as packets, flows, or some other aggregate) into "good" (productive) and "bad" (DDoS) traffic, and then differential treatment of this traffic to forward only good traffic. This operation is often done in a separate specialized mitigation environment through which all traffic is filtered; a generalized architecture for separation of concerns in mitigation is given in [DOTS-ARCH].

Key to successful DDoS mitigation is efficient classification of this traffic in the mitigation environment. Limited first-packet garbage detection as in Section 3.1.2 and stateful tracking of QUIC traffic as in Section 4.1 above may be useful during classification.

Note that the use of a connection ID to support connection migration renders 5-tuple based filtering insufficient and requires more state to be maintained by DDoS defense systems. For the common case of NAT rebinding, DDoS defense systems can detect a change in the client's endpoint address by linking flows based on the server's connection IDs.
QUIC's linkability resistance ensures that a deliberate connection migration is accompanied by a change in the connection ID.

It is questionable whether connection migrations must be supported during a DDoS attack. If the connection migration is not visible to the network that performs the DDoS detection, an active, migrated QUIC connection may be blocked by such a system under attack. As soon as the connection blocking is detected by the client, the client may rely on the fast resumption mechanism provided by QUIC. When clients migrate to a new path, they should be prepared for the migration to fail and attempt to reconnect quickly.

TCP syncookies [RFC4937] are a well-established method of mitigating some kinds of TCP DDoS attacks. QUIC Retry packets are the functional analogue to syncookies, forcing clients to prove possession of their IP address before committing server state. However, there are safeguards in QUIC against unsolicited injection of these packets by intermediaries who do not have consent of the end server. See [QUIC_LB] for standard ways for intermediaries to send Retry packets on behalf of consenting servers.

4.5. UDP Policing

Today, UDP is the most prevalent DDoS vector, since it is easy for compromised non-admin applications to send a flood of large UDP packets (while with TCP the attacker gets throttled by the congestion controller) or to craft reflection and amplification attacks. Networks should therefore be prepared for UDP flood attacks on ports used for QUIC traffic. One possible response to this threat is to police UDP traffic on the network, allocating a fixed portion of the network capacity to UDP and blocking UDP datagrams over that cap.
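
A minimal sketch of throttling by address hash, which this section goes on to recommend over dropping arbitrary datagrams above a cap, is given below. It assumes SHA-256 as the hash function and an operator-chosen salt; both are illustrative choices, as is the function name admit. Because the decision depends only on the address pair, a given pair is either always forwarded or always dropped, so an affected handshake fails outright instead of completing over a lossy path.

```python
import hashlib
import ipaddress


def admit(src_ip: str, dst_ip: str, drop_fraction: float, salt: bytes = b"") -> bool:
    """Return True if datagrams between src_ip and dst_ip should be forwarded.

    drop_fraction is the share of the hash space to block, from 0.0 to 1.0.
    """
    if not 0.0 <= drop_fraction <= 1.0:
        raise ValueError("drop_fraction must be between 0.0 and 1.0")
    key = (
        salt
        + ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
    )
    digest = hashlib.sha256(key).digest()
    # Map the first 8 bytes of the digest to a point in [0, 1).
    point = int.from_bytes(digest[:8], "big") / 2**64
    return point >= drop_fraction
```

For example, admit("192.0.2.1", "198.51.100.7", 0.25) gives the same answer for every datagram between those two addresses, and roughly a quarter of all address pairs are blocked.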

The recommended way to police QUIC packets is to either drop them all or to throttle them based on the hash of the UDP datagram's source and destination addresses, blocking a portion of the hash space that corresponds to the fraction of UDP traffic one wishes to drop. When the handshake is blocked, QUIC-capable applications may fail over to TCP (at least applications using well-known UDP ports). However, blindly blocking a significant fraction of QUIC packets will allow many QUIC handshakes to complete, preventing a TCP failover, but the connections will suffer from severe packet loss.

4.6. Handling ICMP Messages

Datagram Packetization Layer PMTU Discovery (PLPMTUD) can be used by QUIC to probe for the supported PMTU. PLPMTUD optionally uses ICMP messages (e.g., IPv6 Packet Too Big messages). Given known attacks with the use of ICMP messages, the use of PLPMTUD in QUIC has been designed to safely use but not rely on receiving ICMP feedback (see section 14.2.1 of [QUIC-TRANSPORT]).

Networks are recommended to forward these ICMP messages and retain as much of the original packet as possible without exceeding the minimum MTU for the IP version when generating ICMP messages as recommended in [RFC1812] and [RFC4443].

4.7. Quality of Service handling and ECMP

It is expected that any QoS handling in the network, e.g. based on use of DiffServ Code Points (DSCPs) [RFC2475] as well as Equal-Cost Multi-Path (ECMP) routing, is applied on a per-flow basis (and not per-packet), such that all packets belonging to the same QUIC connection get uniform treatment. Using ECMP to distribute packets from a single flow across multiple network paths or any other non-uniform treatment of packets belonging to the same connection could result in variations in order, delivery rate, and drop rate.
As feedback about loss or delay of each packet is used as input to the congestion controller, these variations could adversely affect performance.

Depending on the loss recovery mechanism implemented, QUIC may be more tolerant of packet re-ordering than traditional TCP traffic (see Section 2.7). However, it cannot be known by the network which exact recovery mechanism is used and therefore reordering tolerance should be considered as unknown.

4.8. QUIC and Network Address Translation (NAT)

QUIC Connection IDs are opaque byte fields that are expressed consistently across all QUIC versions [QUIC-INVARIANTS], see Section 2.6. This feature may appear to present opportunities to optimize NAT port usage and simplify the work of the QUIC server. In fact, NAT behavior that relies on CID may instead cause connection failure when endpoints change Connection ID, and disable important protocol security features. NATs should retain their existing 4-tuple-based operation and refrain from parsing or otherwise using QUIC connection IDs.

This section uses the colloquial term NAT to mean NAPT (section 2.2 of [RFC3022]), which overloads several IP addresses to one IP address or to an IP address pool, as commonly deployed in carrier-grade NATs or residential NATs.

The remainder of this section explains how QUIC supports NATs better than other connection-oriented protocols, why NAT use of Connection ID might appear attractive, and how NAT use of CID can create serious problems for the endpoints.

[RFC4787] contains some guidance on building NATs to interact constructively with a wide range of applications. This section extends the discussion to QUIC.

By using the CID, QUIC connections can survive NAT rebindings as long as no routing function in the path is dependent on client IP address and port to deliver packets between server and NAT. Reducing the timeout on UDP NATs might be tempting in light of this property, but not all QUIC server deployments will be robust to rebinding.

4.8.1. Resource Conservation

NATs sometimes hit an operational limit where they exhaust available public IP addresses and ports, and must evict flows from their address/port mapping. CIDs might appear to offer a way to multiplex many connections over a single address and port.

However, QUIC endpoints may negotiate new connection IDs inside cryptographically protected packets, and begin using them at will. Imagine two clients behind a NAT that are sharing the same public IP address and port. The NAT is differentiating them using the incoming Connection ID. If one client secretly changes its connection ID, there will be no mapping for the NAT, and the connection will suddenly break.

QUIC is deliberately designed to fail rather than persist when the network cannot support its operation. For HTTP/3, this extends to recommending a fallback to TCP-based versions of HTTP rather than persisting with a QUIC connection that might be unstable. And [QUIC-APPLICABILITY] recommends TCP fallback for other protocols on the basis that this is preferable to sudden connection errors and timeouts. Furthermore, wide deployment of NATs with this behavior hinders the use of QUIC's migration function, which relies on the ability to change the connection ID any time during the lifetime of a QUIC connection.

It is possible, in principle, to encode the client's identity in a connection ID using the techniques described in [QUIC_LB] and explicit coordination with the NAT.
However, this implies that the client shares configuration with the NAT, which might be logistically difficult. This adds administrative overhead while not resolving the case where a client migrates to a point behind the NAT.

Note that multiplexing connection IDs over a single port anyway violates the best common practice to avoid "port overloading" as described in [RFC4787].

4.8.2. "Helping" with routing infrastructure issues

Concealing client address changes in order to simplify operational routing issues will mask important signals that drive security mechanisms, and therefore opens QUIC up to various attacks.

One challenge in QUIC deployments that want to benefit from QUIC's migration capability is server infrastructures with routers and switches that direct traffic based on address-port 4-tuple rather than connection ID. The use of source IP address means that a NAT rebinding or address migration will deliver packets to the wrong server. As all QUIC payloads are encrypted, routers and switches will not have access to negotiated but not-yet-in-use CIDs. This is a particular problem for low-state load balancers. [QUIC_LB] addresses this problem by proposing a QUIC extension to allow some server-load balancer coordination for routable CIDs.

It seems that a NAT anywhere in the front of such an infrastructure setup could save the effort of converting all these devices by decoding routable connection IDs and rewriting the packet IP addresses to allow consistent routing by legacy devices.

Unfortunately, the change of IP address or port is an important signal to QUIC endpoints. It requires a review of path-dependent variables like congestion control parameters. It can also signify various attacks that mislead one endpoint about the best peer address for the connection (see section 9 of [QUIC-TRANSPORT]).
The QUIC PATH_CHALLENGE and PATH_RESPONSE frames are intended to detect and mitigate these attacks and verify connectivity to the new address. This mechanism cannot work if the NAT is bleaching peer address changes.

For example, an attacker might copy a legitimate QUIC packet and change the source address to match its own. In the absence of a bleaching NAT, the receiving endpoint would interpret this as a potential NAT rebinding and use a PATH_CHALLENGE frame to prove that the peer endpoint is not truly at the new address, thus thwarting the attack. A bleaching NAT has no means of sending an encrypted PATH_CHALLENGE frame, so it might start redirecting all QUIC traffic to the attacker address and thus allow an observer to break the connection.

4.9. Filtering behavior

[RFC4787] describes possible packet filtering behaviors that relate to NATs. Though the guidance there holds, a particularly unwise behavior is to admit a handful of UDP packets and then make a decision as to whether or not to filter it. QUIC applications are encouraged to fail over to TCP if early packets do not arrive at their destination. Admitting a few packets allows the QUIC endpoint to determine that the path accepts QUIC. Sudden drops afterwards will result in slow and costly timeouts before abandoning the connection.

5. IANA Considerations

This document has no actions for IANA.

6. Security Considerations

QUIC is an encrypted and authenticated transport. That means, once the cryptographic handshake is complete, QUIC endpoints discard most packets that are not authenticated, greatly limiting the ability of an attacker to interfere with existing connections.

   However, some information is still observable, as supporting
   manageability of QUIC traffic inherently involves tradeoffs with the
   confidentiality of QUIC's control information; this entire document
   is therefore security-relevant.

   More security considerations for QUIC are discussed in
   [QUIC-TRANSPORT] and [QUIC-TLS], generally considering active or
   passive attackers in the network as well as attacks on specific QUIC
   mechanisms.

   Version Negotiation packets do not contain any mechanism to prevent
   version downgrade attacks. However, future versions of QUIC that use
   Version Negotiation packets are required to define a mechanism that
   is robust against version downgrade attacks. Therefore, a network
   node should not attempt to impact version selection, as version
   downgrade may result in connection failure.

7. Contributors

   The following people have contributed text to sections of this
   document:

   * Dan Druta

   * Martin Duke

   * Marcus Ilhar

   * Igor Lubashev

   * David Schinazi

8. Acknowledgments

   Special thanks to Martin Thomson and Martin Duke for the detailed
   reviews and feedback.

   This work is partially supported by the European Commission under
   Horizon 2020 grant agreement no. 688421 Measurement and Architecture
   for a Middleboxed Internet (MAMI), and by the Swiss State Secretariat
   for Education, Research, and Innovation under contract no. 15.0268.
   This support does not imply endorsement.

9. References

9.1. Normative References

   [QUIC-TLS] Thomson, M. and S. Turner, "Using TLS to Secure QUIC",
              Work in Progress, Internet-Draft, draft-ietf-quic-tls-34,
              14 January 2021.

   [QUIC-TRANSPORT]
              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
              and Secure Transport", Work in Progress, Internet-Draft,
              draft-ietf-quic-transport-34, 14 January 2021.

9.2. Informative References

   [DOTS-ARCH]
              Mortensen, A., Reddy, T., Andreasen, F., Teague, N., and
              R. Compton, "DDoS Open Threat Signaling (DOTS)
              Architecture", Work in Progress, Internet-Draft,
              draft-ietf-dots-architecture-18, 6 March 2020.

   [IPIM]     Allman, M., Beverly, R., and B. Trammell, "In-Protocol
              Internet Measurement (arXiv preprint 1612.02902)",
              9 December 2016.

   [QUIC-APPLICABILITY]
              Kuehlewind, M. and B. Trammell, "Applicability of the QUIC
              Transport Protocol", Work in Progress, Internet-Draft,
              draft-ietf-quic-applicability-09, 22 January 2021.

   [QUIC-HTTP]
              Bishop, M., "Hypertext Transfer Protocol Version 3
              (HTTP/3)", Work in Progress, Internet-Draft,
              draft-ietf-quic-http-34, 2 February 2021.

   [QUIC-INVARIANTS]
              Thomson, M., "Version-Independent Properties of QUIC",
              Work in Progress, Internet-Draft,
              draft-ietf-quic-invariants-13, 14 January 2021.

   [QUIC-RECOVERY]
              Iyengar, J. and I. Swett, "QUIC Loss Detection and
              Congestion Control", Work in Progress, Internet-Draft,
              draft-ietf-quic-recovery-34, 14 January 2021.

   [QUIC_LB]  Duke, M. and N. Banks, "QUIC-LB: Generating Routable QUIC
              Connection IDs", Work in Progress, Internet-Draft,
              draft-ietf-quic-load-balancers-06, 4 February 2021.

   [RFC1812]  Baker, F., Ed., "Requirements for IP Version 4 Routers",
              RFC 1812, DOI 10.17487/RFC1812, June 1995.

   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
              and W. Weiss, "An Architecture for Differentiated
              Services", RFC 2475, DOI 10.17487/RFC2475, December 1998.

   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
              Address Translator (Traditional NAT)", RFC 3022,
              DOI 10.17487/RFC3022, January 2001.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, Ed., "Internet
              Control Message Protocol (ICMPv6) for the Internet
              Protocol Version 6 (IPv6) Specification", STD 89,
              RFC 4443, DOI 10.17487/RFC4443, March 2006.

   [RFC4787]  Audet, F., Ed. and C. Jennings, "Network Address
              Translation (NAT) Behavioral Requirements for Unicast
              UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787,
              January 2007.

   [RFC4937]  Arberg, P. and V. Mammoliti, "IANA Considerations for PPP
              over Ethernet (PPPoE)", RFC 4937, DOI 10.17487/RFC4937,
              June 2007.

   [RFC5382]  Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P.
              Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
              RFC 5382, DOI 10.17487/RFC5382, October 2008.

   [RFC6066]  Eastlake 3rd, D., "Transport Layer Security (TLS)
              Extensions: Extension Definitions", RFC 6066,
              DOI 10.17487/RFC6066, January 2011.

   [RFC7301]  Friedl, S., Popov, A., Langley, A., and E. Stephan,
              "Transport Layer Security (TLS) Application-Layer Protocol
              Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301,
              July 2014.

   [RFC7605]  Touch, J., "Recommendations on Using Assigned Transport
              Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605,
              August 2015.

   [TLS-ESNI] Rescorla, E., Oku, K., Sullivan, N., and C. A. Wood, "TLS
              Encrypted Client Hello", Work in Progress, Internet-Draft,
              draft-ietf-tls-esni-09, 16 December 2020.

   [TMA-QOF]  Trammell, B., Gugelmann, D., and N. Brownlee, "Inline Data
              Integrity Signals for Passive Measurement (in Proc. TMA
              2014)", April 2014.

   [WIRE-IMAGE]
              Trammell, B. and M. Kuehlewind, "The Wire Image of a
              Network Protocol", RFC 8546, DOI 10.17487/RFC8546,
              April 2019.

Appendix A. Appendix

   This appendix uses the following conventions:

   * array[i] - one byte at index i of array

   * array[i:j] - subset of array starting with index i (inclusive) up
     to j-1 (inclusive)

   * array[i:] - subset of array starting with index i (inclusive) up
     to the end of the array

A.1. Distinguishing IETF QUIC and Google QUIC Versions

   This section contains algorithms that allow parsing versions from
   both Google QUIC and IETF QUIC. These mechanisms will become
   irrelevant when IETF QUIC is fully deployed and Google QUIC is
   deprecated.

   Note that other than this appendix, nothing in this document applies
   to Google QUIC. The purpose of this appendix is merely to
   distinguish IETF QUIC from any versions of Google QUIC.

   Conceptually, a Google QUIC version is an opaque 32-bit field. When
   we refer to a version with four printable characters, we use its
   ASCII representation: for example, Q050 refers to {'Q', '0', '5',
   '0'} which is equal to {0x51, 0x30, 0x35, 0x30}. Otherwise, we use
   its hexadecimal representation: for example, 0xff00001d refers to
   {0xff, 0x00, 0x00, 0x1d}.

   QUIC versions that start with 'Q' or 'T' followed by three digits are
   Google QUIC versions. Versions up to and including 43 are documented
   by . Versions Q046, Q050, T050, and T051 are not fully documented,
   but this appendix should contain enough information to allow parsing
   Client Hellos for those versions.

   To extract the version number itself, one needs to look at the first
   byte of the QUIC packet, in other words the first byte of the UDP
   payload.
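   As an illustration only, the version conventions of this section
   can be sketched in runnable Python. The function names
   (extract_version, version_name, is_google_quic) are hypothetical and
   not part of any QUIC specification. The extraction function follows
   only the reachable branches of this section's pseudocode, and the
   error handling via ValueError is an assumption standing in for
   abort().

```python
def extract_version(packet: bytes) -> bytes:
    """Return the 4-byte version field of a UDP payload, or raise ValueError."""
    if not packet:
        raise ValueError("Empty packet")
    first_byte = packet[0]
    if first_byte & 0x80:
        # Most significant bit set: the version follows the first byte.
        version = packet[1:5]
    elif (first_byte & 0x08) and not (first_byte & 0x40):
        if not (first_byte & 0x01):
            raise ValueError("Packet without version")
        # The 0x08 bit is already known to be set in this branch.
        version = packet[9:13]
    else:
        raise ValueError("Packet without version")
    if len(version) != 4:
        raise ValueError("Packet too short to hold a version")
    return version


def version_name(version: bytes) -> str:
    """Render a version as ASCII if all four bytes are printable, else as hex."""
    if all(0x20 <= b <= 0x7E for b in version):
        return version.decode("ascii")
    return "0x" + version.hex()


def is_google_quic(version: bytes) -> bool:
    """True for 'Q' or 'T' followed by three ASCII digits."""
    return (
        len(version) == 4
        and version[:1] in (b"Q", b"T")
        and version[1:].isdigit()
    )


# Two synthetic payloads: one with the 0x80 bit set, one with 0x08 and 0x01.
ietf_like = bytes([0xC0]) + bytes.fromhex("ff00001d") + bytes(16)
gquic_like = bytes([0x09]) + bytes(8) + b"Q043" + bytes(4)
for pkt in (ietf_like, gquic_like):
    v = extract_version(pkt)
    print(version_name(v), is_google_quic(v))
```

   The two sample payloads are synthetic and carry no valid QUIC
   content beyond the first byte and the version field. The document's
   own pseudocode for the extraction step follows.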

   first_byte = packet[0]
   first_byte_bit1 = ((first_byte & 0x80) != 0)
   first_byte_bit2 = ((first_byte & 0x40) != 0)
   first_byte_bit3 = ((first_byte & 0x20) != 0)
   first_byte_bit4 = ((first_byte & 0x10) != 0)
   first_byte_bit5 = ((first_byte & 0x08) != 0)
   first_byte_bit6 = ((first_byte & 0x04) != 0)
   first_byte_bit7 = ((first_byte & 0x02) != 0)
   first_byte_bit8 = ((first_byte & 0x01) != 0)
   if (first_byte_bit1) {
     version = packet[1:5]
   } else if (first_byte_bit5 && !first_byte_bit2) {
     if (!first_byte_bit8) {
       abort("Packet without version")
     }
     if (first_byte_bit5) {
       version = packet[9:13]
     } else {
       version = packet[5:9]
     }
   } else {
     abort("Packet without version")
   }

A.2. Extracting the CRYPTO frame

   // skip leading PADDING frames (type 0x00)
   counter = 0
   while (payload[counter] == 0) {
     counter += 1
   }
   first_nonzero_payload_byte = payload[counter]
   fnz_payload_byte_bit3 = ((first_nonzero_payload_byte & 0x20) != 0)

   // 0x06 is the CRYPTO frame type
   if (first_nonzero_payload_byte != 0x06) {
     abort("Unexpected frame")
   }
   if (payload[counter+1] != 0x00) {
     abort("Unexpected crypto stream offset")
   }
   counter += 2
   if ((payload[counter] & 0xc0) == 0) {
     // one-byte variable-length integer
     crypto_data_length = payload[counter]
     counter += 1
   } else {
     // treated as a two-byte variable-length integer: the two most
     // significant bits of the first byte encode the integer's size
     // and are not part of the value
     crypto_data_length = ((payload[counter] & 0x3f) << 8) |
                          payload[counter+1]
     counter += 2
   }
   crypto_data = payload[counter:counter+crypto_data_length]
   ParseTLS(crypto_data)

Authors' Addresses

   Mirja Kuehlewind
   Ericsson

   Email: mirja.kuehlewind@ericsson.com


   Brian Trammell
   Google
   Gustav-Gull-Platz 1
   CH- 8004 Zurich
   Switzerland

   Email: ietf@trammell.ch