Network Working Group                                      M. Kuehlewind
Internet-Draft                                                  Ericsson
Intended status: Informational                               B. Trammell
Expires: 23 October 2021                         Google Switzerland GmbH
                                                           21 April 2021


              Manageability of the QUIC Transport Protocol
                    draft-ietf-quic-manageability-11

Abstract

   This document discusses manageability of the QUIC transport protocol,
   focusing on the implications of QUIC's design and wire image on
   network operations involving QUIC traffic.  Its intended audience is
   network operators and equipment vendors who rely on the use of
   transport-aware network functions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 23 October 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Features of the QUIC Wire Image
     2.1.  QUIC Packet Header Structure
     2.2.  Coalesced Packets
     2.3.  Use of Port Numbers
     2.4.  The QUIC Handshake
     2.5.  Integrity Protection of the Wire Image
     2.6.  Connection ID and Rebinding
     2.7.  Packet Numbers
     2.8.  Version Negotiation and Greasing
   3.  Network-Visible Information about QUIC Flows
     3.1.  Identifying QUIC Traffic
       3.1.1.  Identifying Negotiated Version
       3.1.2.  First Packet Identification for Garbage Rejection
     3.2.  Connection Confirmation
     3.3.  Distinguishing Acknowledgment Traffic
     3.4.  Server Name Indication (SNI)
       3.4.1.  Extracting Server Name Indication (SNI) Information
     3.5.  Flow Association
     3.6.  Flow Teardown
     3.7.  Flow Symmetry Measurement
     3.8.  Round-Trip Time (RTT) Measurement
       3.8.1.  Measuring Initial RTT
       3.8.2.  Using the Spin Bit for Passive RTT Measurement
   4.  Specific Network Management Tasks
     4.1.  Passive Network Performance Measurement and Troubleshooting
     4.2.  Stateful Treatment of QUIC Traffic
     4.3.  Address Rewriting to Ensure Routing Stability
     4.4.  Server Cooperation with Load Balancers
     4.5.  Filtering Behavior
     4.6.  UDP Blocking or Throttling
     4.7.  DDoS Detection and Mitigation
     4.8.  Quality of Service handling and ECMP
     4.9.  Handling ICMP Messages
     4.10. Guiding Path MTU
   5.  IANA Considerations
   6.  Security Considerations
   7.  Contributors
   8.  Acknowledgments
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Distinguishing IETF QUIC and Google QUIC Versions
     A.1.  Extracting the CRYPTO frame
   Authors' Addresses

1.  Introduction

   QUIC [QUIC-TRANSPORT] is a new transport protocol that is
   encapsulated in UDP.  QUIC integrates TLS [QUIC-TLS] to encrypt all
   payload data and most control information.  QUIC version 1 was
   designed primarily as a transport for HTTP, with the resulting
   protocol being known as HTTP/3 [QUIC-HTTP].

   This document provides guidance for network operations that manage
   QUIC traffic.  This includes guidance on how to interpret and utilize
   information that is exposed by QUIC to the network, requirements and
   assumptions of the QUIC design with respect to network treatment, and
   a description of how common network management practices will be
   impacted by QUIC.

   QUIC is an end-to-end transport protocol.  No information in the
   protocol header, even that which can be inspected, is meant to be
   mutable by the network.
   This is achieved through integrity
   protection of the wire image [WIRE-IMAGE].  Encryption of most
   control signaling means that less information is visible to the
   network than is the case with TCP.

   Integrity protection can also simplify troubleshooting, because none
   of the nodes on the network path can modify transport layer
   information.  However, it does imply that in-network operations that
   depend on modification of data are not possible without the
   cooperation of a QUIC endpoint.  This might be possible with the
   introduction of a proxy which authenticates as an endpoint.  Proxy
   operations are not in scope for this document.

   Network management is not a one-size-fits-all endeavour: practices
   considered necessary or even mandatory within enterprise networks
   with certain compliance requirements, for example, would be
   impermissible on other networks without those requirements.  This
   document therefore does not make any specific recommendations as to
   which practices should or should not be applied; for each practice,
   it describes what is and is not possible with the QUIC transport
   protocol as defined.

2.  Features of the QUIC Wire Image

   In this section, we discuss those aspects of the QUIC transport
   protocol that have an impact on the design and operation of devices
   that forward QUIC packets.  Here, we are concerned primarily with the
   unencrypted part of QUIC's wire image [WIRE-IMAGE], which we define
   as the information available in the packet header in each QUIC
   packet, and the dynamics of that information.  Since QUIC is a
   versioned protocol, the wire image of the header format can also
   change from version to version.  However, the field that identifies
   the QUIC version in some packets, and the format of the Version
   Negotiation Packet, are both inspectable and invariant
   [QUIC-INVARIANTS].
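
   As an illustration of what "inspectable and invariant" means in
   practice, the following minimal sketch reads only the
   version-independent long header fields from a UDP payload.  The
   field layout (Header Form bit, 32-bit version, length-prefixed
   connection IDs) is taken from [QUIC-INVARIANTS]; the function and
   type names are hypothetical and chosen for this example only.

```python
from typing import NamedTuple, Optional


class InvariantLongHeader(NamedTuple):
    """Fields an observer can read from any long header, in any version."""
    version: int
    dcid: bytes
    scid: bytes
    is_version_negotiation: bool


def parse_invariant_long_header(payload: bytes) -> Optional[InvariantLongHeader]:
    """Parse the version-independent part of a QUIC long header.

    Returns None for short header packets and for truncated input.
    Hypothetical helper; not part of any QUIC library.
    """
    # Smallest long header: first byte, 4-byte version, two length bytes.
    if len(payload) < 7:
        return None
    # Header Form bit (most significant bit) is 1 for long headers.
    if not (payload[0] & 0x80):
        return None

    version = int.from_bytes(payload[1:5], "big")

    pos = 5
    dcid_len = payload[pos]
    pos += 1
    # The DCID must be followed by at least the SCID length byte.
    if pos + dcid_len + 1 > len(payload):
        return None
    dcid = payload[pos:pos + dcid_len]
    pos += dcid_len

    scid_len = payload[pos]
    pos += 1
    if pos + scid_len > len(payload):
        return None
    scid = payload[pos:pos + scid_len]

    # Version 0x00000000 marks a Version Negotiation packet.
    return InvariantLongHeader(version, dcid, scid, version == 0)
```

   Everything after the source connection ID is version-specific, so
   an observer that does not recognize the version number should stop
   parsing at that point.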

   This document describes version 1 of the QUIC protocol, whose wire
   image is fully defined in [QUIC-TRANSPORT] and [QUIC-TLS].  Features
   of the wire image described herein may change in future versions of
   the protocol, except when specified as an invariant
   [QUIC-INVARIANTS], and cannot be used to identify QUIC as a protocol
   or to infer the behavior of future versions of QUIC.

   Appendix A provides non-normative guidance on the identification of
   QUIC version 1 packets compared to some pre-standard versions.

2.1.  QUIC Packet Header Structure

   QUIC packets may have either a long header or a short header.  The
   first bit of the QUIC header is the Header Form bit, and indicates
   which type of header is present.  The purpose of this bit is
   invariant across QUIC versions.

   The long header exposes more information.  In version 1 of QUIC, it
   is used during connection establishment, including version
   negotiation, retry, and 0-RTT data.  It contains a version number, as
   well as source and destination connection IDs for grouping packets
   belonging to the same flow.  The definition and location of these
   fields in the QUIC long header are invariant for future versions of
   QUIC, although future versions of QUIC may provide additional fields
   in the long header [QUIC-INVARIANTS].

   Short headers contain only an optional destination connection ID and
   the spin bit for RTT measurement.  In version 1 of QUIC, they are
   used after connection establishment.

   The following information is exposed in QUIC packet headers in all
   versions of QUIC:

   *  version number: the version number is present in the long header,
      and identifies the version used for that packet.  During Version
      Negotiation (see Section 17.2.1 of [QUIC-TRANSPORT] and
      Section 2.8), the version number field has a special value
      (0x00000000) that identifies the packet as a Version Negotiation
      packet.
      QUIC version 1 uses version 0x00000001.  Operators should
      expect to observe packets with other version numbers as a result
      of various Internet experiments, future standards, and greasing.
      All deployed versions are maintained in an IANA registry (see
      Section 22.2 of [QUIC-TRANSPORT]).

   *  source and destination connection ID: short and long packet
      headers carry a destination connection ID, a variable-length field
      that can be used to identify the connection associated with a QUIC
      packet, for load-balancing and NAT rebinding purposes; see
      Section 4.4 and Section 2.6.  Long packet headers additionally
      carry a source connection ID.  The source connection ID
      corresponds to the destination connection ID the source would like
      to have on packets sent to it, and is only present on long packet
      headers.  On long header packets, the length of the connection IDs
      is also present; on short header packets, the length of the
      destination connection ID is implicit.

   In version 1 of QUIC, the following additional information is
   exposed:

   *  "fixed bit": The second-most-significant bit of the first octet of
      most QUIC packets of the current version is set to 1, enabling
      endpoints to demultiplex with other UDP-encapsulated protocols.
      Even though this bit is fixed in the version 1 specification,
      endpoints might use an extension that varies the bit.  Therefore,
      observers cannot reliably use it as an identifier for QUIC.

   *  latency spin bit: The third-most-significant bit of the first
      octet in the short packet header for version 1.  The spin bit is
      set by endpoints such that tracking edge transitions can be used
      to passively observe end-to-end RTT.  See Section 3.8.2 for
      further details.

   *  header type: The long header has a 2-bit packet type field
      following the Header Form and fixed bits.
      Header types correspond
      to stages of the handshake; see Section 17.2 of [QUIC-TRANSPORT]
      for details.

   *  length: The length of the remaining QUIC packet after the length
      field, present on long headers.  This field is used to implement
      coalesced packets during the handshake (see Section 2.2).

   *  token: Initial packets may contain a token, a variable-length
      opaque value optionally sent from client to server, used for
      validating the client's address.  Retry packets also contain a
      token, which can be used by the client in an Initial packet on a
      subsequent connection attempt.  The length of the token is
      explicit in both cases.

   Retry (Section 17.2.5 of [QUIC-TRANSPORT]) and Version Negotiation
   (Section 17.2.1 of [QUIC-TRANSPORT]) packets are not encrypted or
   obfuscated in any way.  For other kinds of packets, version 1 of QUIC
   cryptographically obfuscates other information in the packet headers:

   *  packet number: All packets except Version Negotiation and Retry
      packets have an associated packet number; however, this packet
      number is encrypted, and therefore not of use to on-path
      observers.  The offset of the packet number is encoded in long
      headers, while it is implicit (depending on destination connection
      ID length) in short headers.  The length of the packet number is
      cryptographically obfuscated.

   *  key phase: The Key Phase bit, present in short headers, specifies
      the keys used to encrypt the packet to support key rotation.  The
      Key Phase bit is cryptographically obfuscated.

2.2.  Coalesced Packets

   Multiple QUIC packets may be coalesced into a UDP datagram, with a
   datagram carrying one or more long header packets followed by zero or
   one short header packets.  When packets are coalesced, the Length
   fields in the long headers are used to separate QUIC packets; see
   Section 12.2 of [QUIC-TRANSPORT].
   The Length header field is
   variable length, and its position in the header is also variable
   depending on the length of the source and destination connection ID;
   see Section 17.2 of [QUIC-TRANSPORT].

2.3.  Use of Port Numbers

   Applications that have a mapping for TCP as well as QUIC are expected
   to use the same port number for both services.  However, as for all
   other IETF transports [RFC7605], there is no guarantee that a
   specific application will use a given registered port, or that a
   given port carries traffic belonging to the respective registered
   service, especially when application layer information is encrypted.
   For example, [QUIC-HTTP] specifies the use of Alt-Svc for discovery
   of HTTP/3 services on other ports.

   Further, as QUIC has a connection ID, it is also possible to maintain
   multiple QUIC connections over one 5-tuple.  However, if the
   connection ID is zero-length, all packets of the 5-tuple belong to
   the same QUIC connection.

2.4.  The QUIC Handshake

   New QUIC connections are established using a handshake, which is
   distinguishable on the wire and contains some information that can be
   passively observed.

   To illustrate the information visible in the QUIC wire image during
   the handshake, we first show the general communication pattern
   visible in the UDP datagrams containing the QUIC handshake, then
   examine each of the datagrams in detail.

   The QUIC handshake can normally be recognized on the wire through at
   least four datagrams, which we call "Client Initial", "Server
   Initial", "Client Completion", and "Server Completion" for purposes
   of this illustration, as shown in Figure 1.

   Packets in the handshake belong to three separate cryptographic and
   transport contexts ("Initial", which contains observable payload, and
   "Handshake" and "1-RTT", which do not).
   QUIC packets in separate
   contexts during the handshake are generally coalesced (see
   Section 2.2) in order to reduce the number of UDP datagrams sent
   during the handshake.

   As shown in Figure 1, the client can send 0-RTT data as soon as it
   has sent its Client Hello, and the server can send 1-RTT data as soon
   as it has sent its Server Hello.

   Client                                    Server
     |                                          |
     +----Client Initial----------------------->|
     +----(zero or more 0RTT)------------------>|
     |                                          |
     |<-----------------------Server Initial----+
     |<---------(1RTT encrypted data starts)----+
     |                                          |
     +----Client Completion-------------------->|
     +----(1RTT encrypted data starts)--------->|
     |                                          |
     |<--------------------Server Completion----+
     |                                          |

   Figure 1: General communication pattern visible in the QUIC handshake

   A typical handshake starts with the client sending a Client Initial
   datagram as shown in Figure 2, which elicits a Server Initial
   datagram as shown in Figure 3 typically containing three packets: an
   Initial packet with the TLS Server Hello, a Handshake packet with the
   rest of the server's side of the TLS handshake, and initial 1-RTT
   data, if present.

   The Client Completion datagram contains at least one Handshake packet
   and may also include an Initial packet.

   Datagrams that contain an Initial packet (Client Initial, Server
   Initial, and some Client Completion) contain at least 1200 octets of
   UDP payload.  This protects against amplification attacks and
   verifies that the network path meets the requirements for the
   minimum QUIC IP packet size; see Section 14 of [QUIC-TRANSPORT].
   This is accomplished by either adding PADDING frames within the
   Initial packet, coalescing other packets with the Initial packet, or
   leaving unused payload in the UDP packet after the Initial packet.  A
   network path needs to be able to forward at least this size of packet
   for QUIC to be used.
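
   The size requirement above can be checked by an observer with very
   little parsing.  The following is a minimal sketch, assuming that
   the Initial packet is the first QUIC packet in the datagram and that
   the long header packet type value for Initial is 0, as defined for
   version 1 in Section 17.2 of [QUIC-TRANSPORT].  The function and
   constant names are hypothetical.

```python
QUIC_V1 = 0x00000001
MIN_INITIAL_DATAGRAM_SIZE = 1200  # octets of UDP payload
LONG_TYPE_INITIAL = 0             # version 1 long header packet type


def first_packet_is_v1_initial(udp_payload: bytes) -> bool:
    """Return True if the datagram starts with a version 1 Initial packet.

    Only unprotected header bits are examined: the Header Form bit
    (0x80), the 2-bit long packet type (0x30), and the version field.
    """
    if len(udp_payload) < 5:
        return False
    first = udp_payload[0]
    is_long_header = bool(first & 0x80)
    packet_type = (first & 0x30) >> 4
    version = int.from_bytes(udp_payload[1:5], "big")
    return is_long_header and version == QUIC_V1 and packet_type == LONG_TYPE_INITIAL


def initial_datagram_meets_minimum(udp_payload: bytes) -> bool:
    """Return True if an Initial-bearing datagram is at least 1200 octets."""
    return (first_packet_is_v1_initial(udp_payload)
            and len(udp_payload) >= MIN_INITIAL_DATAGRAM_SIZE)
```

   The check deliberately looks at the UDP payload length rather than
   the QUIC Length field, since the minimum size may be reached by
   padding, by coalescing, or by unused payload after the Initial
   packet.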

   The content of Client Initial packets is encrypted using Initial
   Secrets, which are derived from a per-version constant and the
   client's destination connection ID; it is therefore observable by
   any on-path device that knows the per-version constant, and is
   considered visible in this illustration.  The content of QUIC
   Handshake packets is encrypted using keys established during the
   initial handshake exchange, and is therefore not visible.

   Initial, Handshake, and the Short Header packets transmitted after
   the handshake belong to separate cryptographic and transport
   contexts.  The Client Completion (Figure 4) and Server Completion
   (Figure 5) datagrams finish the first two of these contexts by
   sending the final acknowledgment and finishing the transmission of
   CRYPTO frames.

   +----------------------------------------------------------+
   | UDP header (source and destination UDP ports)            |
   +----------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                 |  |
   +----------------------------------------------------------+  |
   | TLS Client Hello (incl. TLS SNI)                         |  |
   +----------------------------------------------------------+  |
   | QUIC PADDING frames                                      |  |
   +----------------------------------------------------------+<-+

       Figure 2: Typical Client Initial datagram pattern without 0-RTT

   The Client Initial datagram exposes version number, source and
   destination connection IDs without encryption.  Information in the
   TLS Client Hello frame, including any TLS Server Name Indication
   (SNI) present, is obfuscated using the Initial secret.  Note that the
   location of PADDING is implementation-dependent, and PADDING frames
   might not appear in a coalesced Initial packet.
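
   To make concrete why Initial packet contents are only obfuscated
   rather than confidential, the following minimal sketch derives the
   client and server Initial secrets from the client's destination
   connection ID.  The salt value and the "client in" and "server in"
   labels are the version 1 constants defined in [QUIC-TLS], not in
   this document; the function names are hypothetical.  Deriving
   packet protection keys, removing header protection, and decrypting
   the payload are further steps that are not shown.

```python
import hashlib
import hmac
from typing import Tuple

# Version 1 initial salt from [QUIC-TLS]: the "per-version constant".
INITIAL_SALT_V1 = bytes.fromhex("38762cf7f55934b34d179ae6a4c80cadccbb7f0a")


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract with SHA-256."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand_label(secret: bytes, label: str, length: int) -> bytes:
    """TLS 1.3 HKDF-Expand-Label with SHA-256 and an empty context."""
    full_label = b"tls13 " + label.encode("ascii")
    # HkdfLabel: uint16 length, length-prefixed label, empty context.
    info = (length.to_bytes(2, "big")
            + bytes([len(full_label)]) + full_label
            + b"\x00")
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(secret, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def initial_secrets(client_dcid: bytes) -> Tuple[bytes, bytes]:
    """Return (client_initial_secret, server_initial_secret).

    client_dcid is the Destination Connection ID of the client's
    first Initial packet, which is visible on the wire.
    """
    initial_secret = hkdf_extract(INITIAL_SALT_V1, client_dcid)
    client_secret = hkdf_expand_label(initial_secret, "client in", 32)
    server_secret = hkdf_expand_label(initial_secret, "server in", 32)
    return client_secret, server_secret
```

   Because both inputs are public, any observer can compute the same
   secrets as the endpoints; a future version with a different salt
   would require the observer to be updated.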

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID)   (Length)
   +------------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                   |  |
   +------------------------------------------------------------+  |
   | TLS Server Hello                                           |  |
   +------------------------------------------------------------+  |
   | QUIC ACK frame (acknowledging client hello)                |  |
   +------------------------------------------------------------+<-+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably CRYPTO frames)               |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

             Figure 3: Typical Server Initial datagram pattern

   The Server Initial datagram also exposes version number, source and
   destination connection IDs in the clear; information in the TLS
   Server Hello message is obfuscated using the Initial secret.

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID)   (Length)
   +------------------------------------------------------------+  |
   | QUIC ACK frame (acknowledging Server Initial)              |  |
   +------------------------------------------------------------+<-+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably CRYPTO/ACK frames)           |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

           Figure 4: Typical Client Completion datagram pattern

   The Client Completion datagram does not expose any additional
   information; however, recognizing it can be used to determine that a
   handshake has completed (see Section 3.2), and for three-way
   handshake RTT estimation as in Section 3.8.

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably ACK frame)                   |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

           Figure 5: Typical Server Completion datagram pattern

   Similar to Client Completion, Server Completion also exposes no
   additional information; observing it serves only to determine that
   the handshake has completed.

   When the client uses 0-RTT connection resumption, 0-RTT data may also
   be seen in the Client Initial datagram, as shown in Figure 6.

   +----------------------------------------------------------+
   | UDP header (source and destination UDP ports)            |
   +----------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                 |  |
   +----------------------------------------------------------+  |
   | TLS Client Hello (incl. TLS SNI)                         |  |
   +----------------------------------------------------------+<-+
   | QUIC long header (type = 0-RTT, Version, DCID, SCID)   (Length)
   +----------------------------------------------------------+  |
   | 0-RTT encrypted payload                                  |  |
   +----------------------------------------------------------+<-+

          Figure 6: Typical 0-RTT Client Initial datagram pattern

   In a 0-RTT Client Initial datagram, the PADDING frame is only present
   if necessary to increase the size of the datagram with 0-RTT data to
   at least 1200 bytes.  Additional datagrams containing only 0-RTT
   protected long header packets may be sent from the client to the
   server after the Client Initial datagram, containing the rest of the
   0-RTT data.  The amount of 0-RTT protected data that can be sent in
   the first round is limited by the initial congestion window,
   typically around 10 packets (see Section 7.2 of [QUIC-RECOVERY]).

2.5.  Integrity Protection of the Wire Image

   As soon as the cryptographic context is established, all information
   in the QUIC header, including exposed information, is
   integrity-protected.  Further, information that was exposed in
   packets sent before the cryptographic context was established is
   validated during the cryptographic handshake.  Therefore, devices on
   path cannot alter any information or bits in QUIC packets.  Such
   alterations would cause the integrity check to fail, which results in
   the receiver discarding the packet.  Some parts of Initial packets
   could be altered by removing and re-applying the authenticated
   encryption without immediate discard at the receiver.  However, the
   cryptographic handshake validates most fields and any modifications
   in those fields will result in connection establishment failing later
   on.

2.6.  Connection ID and Rebinding

   The connection ID in the QUIC packet headers allows association of
   QUIC packets using information independent of the five-tuple.  This
   allows rebinding of a connection after one of the endpoints (usually
   the client) has experienced an address change.  Further, it can be
   used by in-network devices to ensure that related 5-tuple flows are
   appropriately balanced together.

   Client and server negotiate connection IDs during the handshake;
   typically, however, only the server will request a connection ID for
   the lifetime of the connection.  Connection IDs for either endpoint
   may change during the lifetime of a connection, with the new
   connection ID being supplied via encrypted frames (see Section 5.1 of
   [QUIC-TRANSPORT]).  Therefore, observing a new connection ID does not
   necessarily indicate a new connection.

   [QUIC_LB] specifies algorithms for encoding the server mapping in a
   connection ID in order to share this information with selected
   on-path devices such as load balancers.  Server mappings should only
   be exposed to selected entities.  Uncontrolled exposure would allow
   linkage of multiple IP addresses to the same host if the server also
   supports migration, which opens an attack vector on specific servers
   or pools.  The best way to obscure an encoding is to appear random to
   any other observers, which is most rigorously achieved with
   encryption.  As a result, any attempt to infer information from
   specific parts of a connection ID is unlikely to be useful.

2.7.  Packet Numbers

   The packet number field is always present in the QUIC packet header
   in version 1; however, it is always encrypted.
   The encryption key
   for packet number protection on handshake packets sent before
   cryptographic context establishment is specific to the QUIC version,
   while packet number protection on subsequent packets uses secrets
   derived from the end-to-end cryptographic context.  Packet numbers
   are therefore not part of the wire image that is visible to on-path
   observers.

2.8.  Version Negotiation and Greasing

   Version Negotiation packets are used by the server to indicate that a
   requested version from the client is not supported (see Section 6 of
   [QUIC-TRANSPORT]).  Version Negotiation packets are not intrinsically
   protected, but future QUIC versions will use later encrypted messages
   to verify that they were authentic.  Therefore, any modification of
   the list of versions will be detected and may cause the endpoints to
   terminate the connection attempt.

   Also note that the list of versions in the Version Negotiation packet
   may contain reserved versions.  This mechanism is used to avoid
   ossification in the implementation of the selection mechanism.
   Further, a client may send a Client Initial packet with a reserved
   version number to trigger version negotiation.  In the Version
   Negotiation packet, the connection IDs of the Client Initial packet
   are reflected to provide a proof of return-routability.  Therefore,
   changing this information will also cause the connection to fail.

   QUIC is expected to evolve rapidly, so new versions, both
   experimental and IETF standard versions, will be deployed in the
   Internet more often than with traditional Internet- and
   transport-layer protocols.  Using a particular version number to
   recognize valid QUIC traffic is likely to persistently miss a
   fraction of QUIC flows and completely fail in the near future, and is
   therefore not recommended.
   In addition, due to the speed of evolution of the
   protocol, devices that attempt to distinguish QUIC traffic from
   non-QUIC traffic for purposes of network admission control should
   admit all QUIC traffic regardless of version.

3.  Network-Visible Information about QUIC Flows

   This section addresses the different kinds of observations and
   inferences that can be made about QUIC flows by a passive observer in
   the network based on the wire image in Section 2.  Here we assume a
   bidirectional observer (one that can see packets in both directions
   in the sequence in which they are carried on the wire) unless noted.

3.1.  Identifying QUIC Traffic

   The QUIC wire image is not specifically designed to be
   distinguishable from other UDP traffic.

   The only application binding defined by the IETF QUIC WG is HTTP/3
   [QUIC-HTTP] at the time of this writing; however, many other
   applications are currently being defined and deployed over QUIC, so
   an assumption that all QUIC traffic is HTTP/3 is not valid.  HTTP/3
   uses UDP port 443 by default, although URLs referring to resources
   available over HTTP/3 may specify alternate port numbers.  Simple
   assumptions about whether a given flow is using QUIC based upon a UDP
   port number may therefore not hold; see also Section 5 of [RFC7605].

   While the second-most-significant bit (0x40) of the first octet is
   set to 1 in most QUIC packets of the current version (see Section 2.1
   and Section 17 of [QUIC-TRANSPORT]), this method of recognizing QUIC
   traffic is not reliable.  First, it only provides one bit of
   information and is prone to collision with UDP-based protocols other
   than those considered in [RFC7983].  Second, this feature of the wire
   image is not invariant [QUIC-INVARIANTS] and may change in future
   versions of the protocol, or even be negotiated during the handshake
   via the use of an extension.

   Even though transport parameters transmitted in the client's Initial
   packet are observable by the network, they cannot be modified by the
   network without risking connection failure. Further, the reply from
   the server cannot be observed, so observers on the network cannot
   know which parameters are actually in use.

3.1.1. Identifying Negotiated Version

   An in-network observer assuming that a set of packets belongs to a
   QUIC flow can infer the version number in use by observing the
   handshake: for QUIC version 1, if the version number in the Initial
   packet from a client is the same as the version number in the Initial
   packet of the server response, that version has been accepted by both
   endpoints to be used for the rest of the connection.

   The negotiated version cannot be identified for flows for which a
   handshake is not observed, such as in the case of connection
   migration; however, it might be possible to associate a flow with a
   flow for which a version has been identified; see Section 3.5.

3.1.2. First Packet Identification for Garbage Rejection

   A related question is whether the first packet of a given flow on a
   port known to be associated with QUIC is a valid QUIC packet. This
   determination supports in-network filtering of garbage UDP packets
   (reflection attacks, random backscatter, etc.). While heuristics
   based on the first byte of the packet (packet type) could be used to
   separate valid from invalid first packet types, the deployment of
   such heuristics is not recommended, as bits in the first byte may
   have different meanings in future versions of the protocol.

3.2. Connection Confirmation

   This document focuses on QUIC version 1, and this section applies
   only to packets belonging to QUIC version 1 flows; for purposes of
   on-path observation, it assumes that these packets have been
   identified as such through the observation of a version number
   exchange as described above.

   Connection establishment uses Initial and Handshake packets
   containing a TLS handshake, and Retry packets that do not contain
   parts of the handshake. Connection establishment can therefore be
   detected using heuristics similar to those used to detect TLS over
   TCP. A client initiating a connection may also send data in 0-RTT
   packets directly after the Initial packet containing the TLS Client
   Hello. Since these packets may be reordered in the network, 0-RTT
   packets could be seen before the Initial packet.

   Note that in this version of QUIC, clients send Initial packets
   before servers do, servers send Handshake packets before clients do,
   and only clients send Initial packets with tokens. Therefore, an
   endpoint can be identified as a client or server by an on-path
   observer. An attempted connection after Retry can be detected by
   correlating the contents of the Retry packet with the Token and the
   Destination Connection ID fields of the new Initial packet.

3.3. Distinguishing Acknowledgment Traffic

   Some deployed in-network functions distinguish pure-acknowledgment
   (ACK) packets from packets carrying upper-layer data in order to
   attempt to enhance performance, for example by queueing ACKs
   differently or manipulating ACK signaling. Distinguishing ACK
   packets is trivial in TCP, but not supported by QUIC, since
   acknowledgment signaling is carried inside QUIC's encrypted payload,
   and ACK manipulation is impossible.
   Specifically, heuristics attempting to distinguish ACK-only packets
   from payload-carrying packets based on packet size are likely to
   fail, and their use as a way to construe internals of QUIC's
   operation is not recommended, as those mechanisms can change, e.g.,
   due to the use of extensions.

3.4. Server Name Indication (SNI)

   The client's TLS ClientHello may contain a Server Name Indication
   (SNI) [RFC6066] extension, by which the client reveals the name of
   the server it intends to connect to, in order to allow the server to
   present a certificate based on that name. It may also contain an
   Application-Layer Protocol Negotiation (ALPN) [RFC7301] extension, by
   which the client exposes the names of application-layer protocols it
   supports; an observer can deduce that one of those protocols will be
   used if the connection continues.

   Work is currently underway in the TLS working group to encrypt the
   contents of the ClientHello in TLS 1.3 [TLS-ECH]. This would make
   SNI-based application identification impossible by on-path
   observation for QUIC and other protocols that use TLS.

3.4.1. Extracting Server Name Indication (SNI) Information

   If the ClientHello is not encrypted, the SNI can be derived from the
   client's Initial packet by calculating the Initial secret to decrypt
   the packet payload and parsing the QUIC CRYPTO frame containing the
   TLS ClientHello.

   As both the derivation of the Initial secret and the structure of the
   Initial packet itself are version-specific, the first step is always
   to parse the version number (second through fifth bytes of the long
   header). Note that only long header packets carry the version
   number, so it is necessary to also check if the first bit of the QUIC
   packet is set to 1, indicating a long header.
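
   The first parsing step can be sketched as follows. This is a minimal
   illustration in Python, not a complete parser. The function name
   parse_long_header_version is hypothetical, and the sketch assumes
   that the 32-bit Version field immediately follows the first byte of
   a long header packet in network byte order.

```python
import struct
from typing import Optional


def parse_long_header_version(datagram: bytes) -> Optional[int]:
    """Return the 32-bit version of a long header packet, or None.

    Hypothetical helper, for illustration only.
    """
    # The first byte plus the 4-byte Version field must be present.
    if len(datagram) < 5:
        return None
    # The most significant bit of the first byte is the header form
    # bit; only long header packets (bit set to 1) carry a version.
    if not datagram[0] & 0x80:
        return None
    (version,) = struct.unpack_from("!I", datagram, 1)
    return version
```

   A returned value of 0x00000001 corresponds to QUIC version 1; any
   other value means the remaining steps in this section do not
   necessarily apply.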

   Note that proprietary QUIC versions that were deployed before
   standardization might not set the first bit in a QUIC long header
   packet to 1. To parse these versions, example code is provided in
   the appendix (see Appendix A). However, it is expected that these
   versions will gradually disappear over time.

   When the version has been identified as QUIC version 1, the packet
   type needs to be verified as an Initial packet by checking that the
   third and fourth bits of the header are both set to 0. Then the
   Destination Connection ID needs to be extracted to calculate the
   Initial secret using the version-specific Initial salt, as described
   in Section 5.2 of [QUIC-TLS]. The length of the connection ID is
   indicated in the sixth byte of the header, followed by the connection
   ID itself.

   To determine the end of the header and find the start of the payload,
   the packet number length, the Source Connection ID length, and the
   token length need to be extracted. The packet number length is
   defined by the seventh and eighth bits of the header as described in
   Section 17.2 of [QUIC-TRANSPORT], but is obfuscated as described in
   Section 5.4 of [QUIC-TLS]. The Source Connection ID length is
   specified in the byte after the Destination Connection ID. The token
   length, which follows the Source Connection ID, is a variable-length
   integer as specified in Section 16 of [QUIC-TRANSPORT].

   After decryption, the client's Initial packet can be parsed to detect
   the CRYPTO frame that contains the TLS ClientHello, which then can be
   parsed similarly to TLS over TCP connections. The client's Initial
   packet may contain other frames, so the first bytes of each frame
   need to be checked to identify the frame type and, if needed, to skip
   over the frame. Note that the length of the frames is dependent on
   the frame type.
   In QUIC version 1, the packet is expected to contain only CRYPTO
   frames and optionally PADDING frames. PADDING frames, each
   consisting of a single zero byte, may occur before, after, or between
   CRYPTO frames. There might be multiple CRYPTO frames. Finally, an
   extension might define additional frame types which could be present.

   Note that subsequent Initial packets might contain a Destination
   Connection ID other than the one used to generate the Initial secret.
   Therefore, attempts to decrypt these packets using the procedure
   above might fail unless the Initial secret is retained by the
   observer.

3.5. Flow Association

   The QUIC connection ID (see Section 2.6) is designed to allow a
   coordinating on-path device, such as a load-balancer, to associate
   two flows when one of the endpoints changes address or port. This
   change can be due to NAT rebinding or address migration.

   The connection ID must change upon intentional address change by an
   endpoint, and connection ID negotiation is encrypted, so it is not
   possible for a passive observer to link intended changes of address
   using the connection ID.

   When one endpoint unintentionally changes its address, as is the case
   with NAT rebinding, an on-path observer may be able to use the
   connection ID to associate the flow on the new address with the flow
   on the old address.

   A network function that attempts to use the connection ID to
   associate flows must be robust to the failure of this technique.
   Since the connection ID may change multiple times during the lifetime
   of a connection, packets with the same five-tuple but different
   connection IDs might or might not belong to the same connection.
   Likewise, packets with the same connection ID but different five-
   tuples might not belong to the same connection, either.
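
   The caution above can be made concrete with a small sketch. The
   class and method names below are hypothetical, and the sketch assumes
   the observer has already extracted a Destination Connection ID from
   each packet. A connection ID seen on a new five-tuple is treated as
   a hint that two flows may be associated, never as proof, and a flow
   without a match is simply treated as new.

```python
from typing import Dict, Optional, Tuple

# (source address, source port, destination address, destination port,
#  protocol); the layout is an assumption made for this example.
FiveTuple = Tuple[str, int, str, int, str]


class FlowAssociator:
    """Best-effort association of flows by connection ID (a hint only)."""

    def __init__(self) -> None:
        self._last_tuple_by_cid: Dict[bytes, FiveTuple] = {}

    def observe(self, five_tuple: FiveTuple,
                dcid: bytes) -> Optional[FiveTuple]:
        """Record a packet; return a possibly related earlier five-tuple.

        Returns None when the connection ID is empty, has not been seen,
        or was last seen on this same five-tuple.
        """
        if not dcid:
            return None  # a zero-length connection ID carries no hint
        previous = self._last_tuple_by_cid.get(dcid)
        self._last_tuple_by_cid[dcid] = five_tuple
        if previous is None or previous == five_tuple:
            return None
        return previous  # candidate only; may be another connection
```

   Because the result is only a candidate, a caller would be expected to
   keep working correctly when the association is missing or wrong.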

   Connection IDs should be treated as opaque; see Section 4.4 for
   caveats regarding connection ID selection at servers.

3.6. Flow Teardown

   QUIC does not expose the end of a connection; the only indication to
   on-path devices that a flow has ended is that packets are no longer
   observed. Stateful devices on path such as NATs and firewalls must
   therefore use idle timeouts to determine when to drop state for QUIC
   flows; see Section 4.2.

3.7. Flow Symmetry Measurement

   QUIC explicitly exposes which side of a connection is a client and
   which side is a server during the handshake. In addition, the
   symmetry of a flow (whether primarily client-to-server, primarily
   server-to-client, or roughly bidirectional, as input to basic traffic
   classification techniques) can be inferred through the measurement of
   data rate in each direction. While QUIC traffic is protected and
   ACKs may be padded, padding is not required.

3.8. Round-Trip Time (RTT) Measurement

   The round-trip time of QUIC flows can be inferred by observation once
   per flow, during the handshake, as in passive TCP measurement; this
   requires parsing of the QUIC packet header and recognition of the
   handshake, as illustrated in Section 2.4. It can also be inferred
   during the flow's lifetime, if the endpoints use the spin bit
   facility described below and in Section 17.3.1 of [QUIC-TRANSPORT].

3.8.1. Measuring Initial RTT

   In the common case, the delay between the client's Initial packet
   (containing the TLS ClientHello) and the server's Initial packet
   (containing the TLS ServerHello) represents the RTT component on the
   path between the observer and the server. The delay between the
   server's first Handshake packet and the Handshake packet sent by the
   client represents the RTT component on the path between the observer
   and the client.
   While the client may send 0-RTT packets after the Initial packet
   during connection re-establishment, these can be ignored for RTT
   measurement purposes.

   Handshake RTT can be measured by adding the client-to-observer and
   observer-to-server RTT components together. This measurement
   necessarily includes any transport- and application-layer delay (the
   latter mainly caused by the asymmetric crypto operations associated
   with the TLS handshake) at both sides.

3.8.2. Using the Spin Bit for Passive RTT Measurement

   The spin bit provides a version-specific method to measure per-flow
   RTT from observation points on the network path throughout the
   duration of a connection. See Section 17.4 of [QUIC-TRANSPORT] for
   the definition of the spin bit in Version 1 of QUIC. Endpoint
   participation in spin bit signaling is optional. That is, while its
   location is fixed in this version of QUIC, an endpoint can
   unilaterally choose to not support "spinning" the bit.

   Use of the spin bit for RTT measurement by devices on path is only
   possible when both endpoints enable it. Some endpoints may disable
   use of the spin bit by default, others only in specific deployment
   scenarios, e.g., for servers and clients where the RTT would reveal
   the presence of a VPN or proxy. To avoid making these connections
   identifiable based on the usage of the spin bit, all endpoints
   randomly disable "spinning" for at least one eighth of connections,
   even if otherwise enabled by default. An endpoint not participating
   in spin bit signaling for a given connection can use a fixed spin
   value for the duration of the connection, or can set the bit randomly
   on each packet sent.

   When in use and a QUIC flow sends data continuously, the latency spin
   bit in each direction changes value once per round-trip time (RTT).
   An on-path observer can observe the time difference between edges
   (changes from 1 to 0 or 0 to 1) in the spin bit signal in a single
   direction to measure one sample of end-to-end RTT. This mechanism
   follows the principles of protocol measurability laid out in [IPIM].

   Note that this measurement, as with passive RTT measurement for TCP,
   includes any transport protocol delay (e.g., delayed sending of
   acknowledgements) and/or application layer delay (e.g., waiting for a
   response to be generated). It therefore provides devices on path a
   good instantaneous estimate of the RTT as experienced by the
   application.

   However, application-limited and flow-control-limited senders can
   have application and transport layer delay, respectively, that are
   much greater than network RTT. When the sender is application-
   limited and, e.g., only sends a small amount of periodic application
   traffic, where that period is longer than the RTT, measuring the spin
   bit provides information about the application period, not the
   network RTT.

   Since the spin bit logic at each endpoint considers only samples from
   packets that advance the largest packet number, signal generation
   itself is resistant to reordering. However, reordering can cause
   problems at an observer by causing spurious edge detection and
   therefore inaccurate (i.e., lower) RTT estimates, if reordering
   occurs across a spin-bit flip in the stream.

   Simple heuristics based on the observed data rate per flow or changes
   in the RTT series can be used to reject bad RTT samples due to lost
   or reordered edges in the spin signal, as well as application or flow
   control limitation; for example, QoF [TMA-QOF] rejects component RTTs
   significantly higher than RTTs over the history of the flow. These
   heuristics may use the handshake RTT as an initial RTT estimate for a
   given flow.
   Usually such heuristics would also detect if the spin is either
   constant or randomly set for a connection.

   An on-path observer that can see traffic in both directions (from
   client to server and from server to client) can also use the spin bit
   to measure "upstream" and "downstream" component RTT; i.e., the
   component of the end-to-end RTT attributable to the paths between the
   observer and the server and the observer and the client,
   respectively. It does this by measuring the delay between a spin
   edge observed in the upstream direction and that observed in the
   downstream direction, and vice versa.

   Raw RTT samples generated using these techniques can be processed in
   various ways to generate useful network performance metrics. A
   simple linear smoothing or moving minimum filter can be applied to
   the stream of RTT samples to get a more stable estimate of
   application-experienced RTT. RTT samples measured from the spin bit
   can also be used to generate RTT distribution information, including
   minimum RTT (which approximates network RTT over longer time windows)
   and RTT variance (which approximates jitter as seen by the
   application).

4. Specific Network Management Tasks

   In this section, we review specific network management and
   measurement techniques and how QUIC's design impacts them.

4.1. Passive Network Performance Measurement and Troubleshooting

   Limited RTT measurement is possible by passive observation of QUIC
   traffic; see Section 3.8. No passive measurement of loss is possible
   with the present wire image. Extremely limited observation of
   upstream congestion may be possible via the observation of CE
   markings on ECN-enabled QUIC traffic.

4.2. Stateful Treatment of QUIC Traffic

   Stateful treatment of QUIC traffic (e.g., at a firewall or NAT
   middlebox) is possible through QUIC traffic and version
   identification (Section 3.1) and observation of the handshake for
   connection confirmation (Section 3.2). The lack of any visible end-
   of-flow signal (Section 3.6) means that this state must be purged
   either through timers or through least-recently-used eviction,
   depending on application requirements.

   While QUIC has no clear network-visible end-of-connection signal and
   therefore does require timer-based state removal, the QUIC handshake
   indicates confirmation by both ends of a valid bidirectional
   transmission. As soon as the handshake has completed, timers should
   be set long enough to also allow for short idle time during a valid
   transmission.

   [RFC4787] requires a timeout that is not less than 2 minutes for most
   UDP traffic. However, in practice, timers are sometimes lower, in
   the range of 30 to 60 seconds. In contrast, [RFC5382] recommends a
   timeout of more than 2 hours for TCP, given that TCP is a connection-
   oriented protocol with well-defined closure semantics.

   Even though QUIC has explicitly been designed to tolerate NAT
   rebindings, decreasing the NAT timeout is not recommended, as it may
   negatively impact application performance or incentivize endpoints to
   send very frequent keep-alive packets. Instead it is recommended,
   even when lower timers are used for other UDP traffic, to use a timer
   of at least two minutes for QUIC traffic.

   If state is removed too early, this could lead to black-holing of
   incoming packets after a short idle period. To detect this
   situation, a timer at the client needs to expire before a re-
   establishment can happen (if at all), which would lead to
   unnecessarily long delays in an otherwise working connection.

   Furthermore, not all endpoints use routing architectures where
   connections will survive a port or address change. So even when the
   client revives the connection, a NAT rebinding can cause a routing
   mismatch where a packet is not even delivered to the server that
   might support address migration. For these reasons, the limits in
   [RFC4787] are important to avoid black-holing of packets (and hence
   avoid interrupting the flow of data to the client), especially where
   devices are able to distinguish QUIC traffic from other UDP payloads.

   The QUIC header optionally contains a connection ID which could
   provide additional entropy beyond the 5-tuple. The QUIC handshake
   needs to be observed in order to understand whether the connection ID
   is present and what length it has. However, connection IDs may be
   renegotiated after the handshake, and this renegotiation is not
   visible to the path. Therefore, using the connection ID as a flow
   key field for stateful treatment of flows is not recommended, as
   connection ID changes will cause undetectable and unrecoverable loss
   of state in the middle of a connection. Specifically, the use of the
   connection ID for functions that require state to make a forwarding
   decision is not viable, as it will break connectivity or at minimum
   cause long timeout-based delays before this problem is detected by
   the endpoints and the connection can potentially be re-established.

   Use of connection IDs is specifically discouraged for NAT
   applications. If a NAT hits an operational limit, it is recommended
   to drop the initial packets of a flow instead (see also Section 4.5),
   which potentially triggers a fallback to TCP. Use of the connection
   ID to multiplex multiple connections on the same IP address/port pair
   is not a viable solution, as it risks connectivity breakage if the
   connection ID changes.

4.3. Address Rewriting to Ensure Routing Stability

   While QUIC's migration capability makes it possible for a server to
   survive address changes, this does not work if the routers or
   switches in the server infrastructure route using the address-port
   4-tuple. If infrastructure routes on addresses only, NAT rebinding
   or address migration will cause packets to be delivered to the wrong
   server. [QUIC_LB] describes a way to address this problem by
   coordinating the selection and use of connection IDs between load-
   balancers and servers.

   Applying address translation at a middlebox to maintain a stable
   address-port mapping for flows based on connection ID might seem like
   a solution to this problem. However, hiding information about the
   change of the IP address or port conceals important and security-
   relevant information from QUIC endpoints and as such would facilitate
   amplification attacks (see Section 9 of [QUIC-TRANSPORT]). A NAT
   function that hides peer address changes prevents the other end from
   detecting and mitigating attacks, as the endpoint cannot verify
   connectivity to the new address using QUIC PATH_CHALLENGE and
   PATH_RESPONSE frames.

   In addition, a change of IP address or port is also an input signal
   to other internal mechanisms in QUIC. When a path change is
   detected, path-dependent variables like congestion control parameters
   will be reset, protecting the new path from overload.

   Therefore, the use of address rewriting to ensure routing stability
   can open QUIC up to various attacks, as it conceals client address
   changes, and as such masks important signals that drive security
   mechanisms.

4.4. Server Cooperation with Load Balancers

   In the case of networking architectures that include load balancers,
   the connection ID can be used as a way for the server to signal
   information about the desired treatment of a flow to the load
   balancers.
   Guidance on assigning connection IDs is given in
   [QUIC-APPLICABILITY]. [QUIC_LB] describes a system for coordinating
   selection and use of connection IDs between load-balancers and
   servers.

4.5. Filtering Behavior

   [RFC4787] describes possible packet filtering behaviors that relate
   to NATs but are often also applied in other scenarios where packet
   filtering is desired. Though the guidance there holds, a
   particularly unwise behavior is to admit a handful of UDP packets and
   then make a decision as to whether or not to filter the flow. QUIC
   applications are encouraged to fail over to TCP if early packets do
   not arrive at their destination [I-D.ietf-quic-applicability], as
   QUIC is based on UDP and there are known blocks of UDP traffic (see
   Section 4.6). Admitting a few packets allows the QUIC endpoint to
   determine that the path accepts QUIC. Sudden drops afterwards will
   result in slow and costly timeouts before abandoning the connection.

4.6. UDP Blocking or Throttling

   Today, UDP is the most prevalent DDoS vector, since it is easy for
   compromised non-admin applications to send a flood of large UDP
   packets (while with TCP the attacker gets throttled by the congestion
   controller) or to craft reflection and amplification attacks. Some
   networks therefore block UDP traffic. With increased deployment of
   QUIC, there is also an increased need to allow UDP traffic on ports
   used for QUIC. However, if UDP is generally enabled on these ports,
   UDP flood attacks may also use the same ports. One possible response
   to this threat is to throttle UDP traffic on the network, allocating
   a fixed portion of the network capacity to UDP and blocking UDP
   datagrams over that cap. As the portion of QUIC traffic compared to
   TCP is also expected to increase over time, using such a limit is not
   recommended, but if done, limits might need to be adapted
   dynamically.

   Further, if UDP traffic is desired to be throttled, it is recommended
   to block individual QUIC flows entirely rather than dropping packets
   randomly. When the handshake is blocked, QUIC-capable applications
   may fail over to TCP. However, blocking a random fraction of QUIC
   packets across 4-tuples will allow many QUIC handshakes to complete,
   preventing a TCP failover, but the connections will suffer from
   severe packet loss (see also Section 4.5). Therefore, UDP throttling
   should be realized by per-flow policing as opposed to per-packet
   policing. Note that this per-flow policing should be stateless to
   avoid problems with stateful treatment of QUIC flows (see
   Section 4.2), for example, by blocking a portion of the space of
   values of a hash function over the addresses and ports in the UDP
   datagram. While QUIC endpoints are often able to survive address
   changes, e.g., by NAT rebindings, blocking a portion of the traffic
   based on 5-tuple hashing increases the risk of black-holing an active
   connection when the address changes.

4.7. DDoS Detection and Mitigation

   On-path observation of the transport headers of packets can be used
   for various security functions. For example, Denial of Service (DoS)
   and Distributed DoS (DDoS) attacks against the infrastructure or
   against an endpoint can be detected and mitigated by characterising
   anomalous traffic. Other uses include support for security audits
   (e.g., verifying the compliance with ciphersuites); client and
   application fingerprinting for inventory; and alerts for network
   intrusion detection and other next generation firewall functions.

   Current practices in detection and mitigation of DDoS attacks
   generally involve classification of incoming traffic (as packets,
   flows, or some other aggregate) into "good" (productive) and "bad"
   (DDoS) traffic, and then differential treatment of this traffic to
   forward only good traffic. This operation is often done in a
   separate specialized mitigation environment through which all traffic
   is filtered; a generalized architecture for separation of concerns in
   mitigation is given in [DOTS-ARCH].

   Efficient classification of this DDoS traffic in the mitigation
   environment is key to the success of this approach. Limited first-
   packet garbage detection as in Section 3.1.2 and stateful tracking of
   QUIC traffic as in Section 4.2 above may be useful during
   classification.

   Note that the use of a connection ID to support connection migration
   renders 5-tuple based filtering insufficient to detect active flows
   and requires more state to be maintained by DDoS defense systems if
   support of migration of QUIC flows is desired. For the common case
   of NAT rebinding, where the client's address changes without the
   client's intent or knowledge, DDoS defense systems can detect a
   change in the client's endpoint address by linking flows based on the
   server's connection IDs. However, QUIC's linkability resistance
   ensures that a deliberate connection migration is accompanied by a
   change in the connection ID. In this case, the connection ID cannot
   be used to distinguish valid, active traffic from new attack traffic.

   It is also possible for endpoints to directly support security
   functions such as DoS classification and mitigation. Endpoints can
   cooperate with an in-network device directly, e.g., by sharing
   information about connection IDs.

   Another potential method could use an on-path network device that
   relies on pattern inferences in the traffic and heuristics or machine
   learning instead of processing observed header information.

   However, it is questionable whether connection migrations must be
   supported during a DDoS attack. While unintended migration without a
   connection ID change can be more easily supported, it might be
   acceptable to not support migrations of active QUIC connections that
   are not visible to the network functions performing the DDoS
   detection. As soon as the connection blocking is detected by the
   client, the client may be able to rely on the fast resumption
   mechanism provided by QUIC. When clients migrate to a new path, they
   should be prepared for the migration to fail and attempt to reconnect
   quickly.

   Beyond in-network DDoS protection mechanisms, TCP syncookies
   [RFC4937] are a well-established method of mitigating some kinds of
   TCP DDoS attacks. QUIC Retry packets are the functional analogue to
   syncookies, forcing clients to prove possession of their IP address
   before committing server state. However, there are safeguards in
   QUIC against unsolicited injection of these packets by intermediaries
   who do not have consent of the end server. See [QUIC_LB] for
   standard ways for intermediaries to send Retry packets on behalf of
   consenting servers.

4.8. Quality of Service handling and ECMP

   It is expected that any QoS handling in the network, e.g., based on
   use of DiffServ Code Points (DSCPs) [RFC2475] as well as Equal-Cost
   Multi-Path (ECMP) routing, is applied on a per-flow basis (and not
   per-packet), such that all packets belonging to the same QUIC
   connection get uniform treatment.
   Using ECMP to distribute packets from a single flow across multiple
   network paths or any other non-uniform treatment of packets belonging
   to the same connection could result in variations in order, delivery
   rate, and drop rate. As feedback about loss or delay of each packet
   is used as input to the congestion controller, these variations could
   adversely affect performance.

   Depending on the loss recovery mechanism implemented, QUIC may be
   more tolerant of packet re-ordering than traditional TCP traffic (see
   Section 2.7). However, it cannot be known by the network which exact
   recovery mechanism is used and therefore reordering tolerance should
   be considered as unknown.

4.9. Handling ICMP Messages

   Datagram Packetization Layer PMTU Discovery (DPLPMTUD) can be used by
   QUIC to probe for the supported PMTU. DPLPMTUD optionally uses ICMP
   messages (e.g., IPv6 Packet Too Big messages). Given known attacks
   with the use of ICMP messages, the use of DPLPMTUD in QUIC has been
   designed to safely use but not rely on receiving ICMP feedback (see
   Section 14.2.1 of [QUIC-TRANSPORT]).

   Networks are recommended to forward these ICMP messages and retain as
   much of the original packet as possible without exceeding the minimum
   MTU for the IP version when generating ICMP messages as recommended
   in [RFC1812] and [RFC4443].

4.10. Guiding Path MTU

   Some networks support 1500-byte packets, but can only do so by
   fragmenting at a lower layer before traversing a smaller MTU segment,
   and then reassembling. This is permissible even when the IP layer is
   IPv6 or IPv4 with the DF bit set, because it occurs below the IP
   layer. However, this process can add to compute and memory costs,
   leading to a bottleneck that limits network capacity.
   In such networks this generates a desire to influence a majority of
   senders to use smaller packets, so that the limited reassembly
   capacity is not exceeded.

   For TCP, MSS clamping (Section 3.2 of [RFC4459]) is often used to
   change the sender's maximum TCP segment size, but QUIC requires a
   different approach. Section 14 of [QUIC-TRANSPORT] advises senders
   to probe larger sizes using Datagram Packetization Layer PMTU
   Discovery ([DPLPMTUD]) or Path Maximum Transmission Unit Discovery
   (PMTUD: [RFC1191] and [RFC8201]). This mechanism will encourage
   senders to approach the maximum size, which could cause fragmentation
   with a network segment that they may not be aware of.

   If path performance is limited when sending larger packets, an on-
   path device should support a maximum packet size for a specific
   transport flow and then consistently drop all packets that exceed the
   configured size when the inner IPv4 packet has DF set, or IPv6 is
   used. Endpoints can cache PMTU information between IP flows, in the
   IP-layer cache, so short-term consistency between the PMTU for flows
   can help avoid an endpoint using a PMTU that is inefficient.

   Networks with configurations that would lead to fragmentation of
   large packets should drop such packets rather than fragmenting them.
   Network operators who plan to implement a more selective policy may
   start by focussing on QUIC. QUIC flows cannot always be easily
   distinguished from other UDP traffic, but we assume at least some
   portion of QUIC traffic can be identified (see Section 3.1). For
   QUIC endpoints using DPLPMTUD it is recommended for the path to drop
   a packet larger than the supported size. A QUIC probe packet is used
   to discover the PMTU. If lost, this does not impact the flow of QUIC
   data.

   IPv4 routers generate an ICMP message when a packet is dropped
   because the link MTU was exceeded.
   [RFC8504] specifies how an IPv6 node generates an ICMPv6 Packet Too
   Big message (PTB) in this case. PMTUD relies upon an endpoint
   receiving such PTB messages [RFC8201], whereas DPLPMTUD does not rely
   upon these messages, but can still optionally use them to improve
   performance (Section 4.6 of [DPLPMTUD]).

   Since a network cannot know in advance which discovery method a QUIC
   endpoint is using, it should always send a PTB message in addition to
   dropping the oversized packet. A generated PTB message should be
   compliant with the validation requirements of Section 14.2.1 of
   [QUIC-TRANSPORT], otherwise it will be ignored by DPLPMTUD. This
   will likely provide the right signal for the endpoint to keep the
   packet size small and thereby avoid network fragmentation for that
   flow entirely.

5. IANA Considerations

   This document has no actions for IANA.

6. Security Considerations

   QUIC is an encrypted and authenticated transport. That means, once
   the cryptographic handshake is complete, QUIC endpoints discard most
   packets that are not authenticated, greatly limiting the ability of
   an attacker to interfere with existing connections.

   However, some information is still observable, as supporting
   manageability of QUIC traffic inherently involves tradeoffs with the
   confidentiality of QUIC's control information; this entire document
   is therefore security-relevant.

   More security considerations for QUIC are discussed in
   [QUIC-TRANSPORT] and [QUIC-TLS], generally considering active or
   passive attackers in the network as well as attacks on specific QUIC
   mechanisms.

   Version Negotiation packets do not contain any mechanism to prevent
   version downgrade attacks. However, future versions of QUIC that use
   Version Negotiation packets are required to define a mechanism that
   is robust against version downgrade attacks.
   Therefore a network node should not attempt to impact version
   selection, as version downgrade may result in connection failure.

7. Contributors

   The following people have contributed text to sections of this
   document:

   *  Dan Druta

   *  Martin Duke

   *  Igor Lubashev

   *  David Schinazi

   *  Gorry Fairhurst

   *  Chris Box

8. Acknowledgments

   Thanks to Thomas Fossati, Jana Iyengar, and Marcus Ihlar for their
   early reviews and feedback. Special thanks also to Martin Thomson
   and Martin Duke for their detailed reviews and input. And thanks to
   Sean Turner, Mike Bishop, Ian Swett, and Nick Banks for their last
   call reviews.

   This work is partially supported by the European Commission under
   Horizon 2020 grant agreement no. 688421 Measurement and Architecture
   for a Middleboxed Internet (MAMI), and by the Swiss State Secretariat
   for Education, Research, and Innovation under contract no. 15.0268.
   This support does not imply endorsement.

9. References

9.1. Normative References

   [QUIC-TLS] Thomson, M. and S. Turner, "Using TLS to Secure QUIC",
              Work in Progress, Internet-Draft, draft-ietf-quic-tls-34,
              14 January 2021.

   [QUIC-TRANSPORT]
              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
              and Secure Transport", Work in Progress, Internet-Draft,
              draft-ietf-quic-transport-34, 14 January 2021.

9.2. Informative References

   [DOTS-ARCH]
              Mortensen, A., Reddy, T., Andreasen, F., Teague, N., and
              R. Compton, "DDoS Open Threat Signaling (DOTS)
              Architecture", Work in Progress, Internet-Draft,
              draft-ietf-dots-architecture-18, 6 March 2020.

   [DPLPMTUD] Fairhurst, G., Jones, T., Tüxen, M., Rüngeler, I., and T.
              Völker, "Packetization Layer Path MTU Discovery for
              Datagram Transports", RFC 8899, DOI 10.17487/RFC8899,
              September 2020.

   [I-D.ietf-quic-applicability]
              Kuehlewind, M.
              and B. Trammell, "Applicability of the QUIC
              Transport Protocol", Work in Progress, Internet-Draft,
              draft-ietf-quic-applicability-11, 21 April 2021.

   [IPIM]     Allman, M., Beverly, R., and B. Trammell, "In-Protocol
              Internet Measurement (arXiv preprint 1612.02902)", 9
              December 2016.

   [QUIC-APPLICABILITY]
              Kuehlewind, M. and B. Trammell, "Applicability of the QUIC
              Transport Protocol", Work in Progress, Internet-Draft,
              draft-ietf-quic-applicability-11, 21 April 2021.

   [QUIC-HTTP]
              Bishop, M., "Hypertext Transfer Protocol Version 3
              (HTTP/3)", Work in Progress, Internet-Draft,
              draft-ietf-quic-http-34, 2 February 2021.

   [QUIC-INVARIANTS]
              Thomson, M., "Version-Independent Properties of QUIC",
              Work in Progress, Internet-Draft,
              draft-ietf-quic-invariants-13, 14 January 2021.

   [QUIC-RECOVERY]
              Iyengar, J. and I. Swett, "QUIC Loss Detection and
              Congestion Control", Work in Progress, Internet-Draft,
              draft-ietf-quic-recovery-34, 14 January 2021.

   [QUIC_LB]  Duke, M. and N. Banks, "QUIC-LB: Generating Routable QUIC
              Connection IDs", Work in Progress, Internet-Draft,
              draft-ietf-quic-load-balancers-06, 4 February 2021.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              DOI 10.17487/RFC1191, November 1990.

   [RFC1812]  Baker, F., Ed., "Requirements for IP Version 4 Routers",
              RFC 1812, DOI 10.17487/RFC1812, June 1995.

   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
              and W. Weiss, "An Architecture for Differentiated
              Services", RFC 2475, DOI 10.17487/RFC2475, December 1998.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, Ed., "Internet
              Control Message Protocol (ICMPv6) for the Internet
              Protocol Version 6 (IPv6) Specification", STD 89,
              RFC 4443, DOI 10.17487/RFC4443, March 2006.

   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with
              In-the-Network Tunneling", RFC 4459,
              DOI 10.17487/RFC4459, April 2006.

   [RFC4787]  Audet, F., Ed. and C. Jennings, "Network Address
              Translation (NAT) Behavioral Requirements for Unicast
              UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January
              2007.

   [RFC4937]  Arberg, P. and V. Mammoliti, "IANA Considerations for PPP
              over Ethernet (PPPoE)", RFC 4937, DOI 10.17487/RFC4937,
              June 2007.

   [RFC5382]  Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P.
              Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
              RFC 5382, DOI 10.17487/RFC5382, October 2008.

   [RFC6066]  Eastlake 3rd, D., "Transport Layer Security (TLS)
              Extensions: Extension Definitions", RFC 6066,
              DOI 10.17487/RFC6066, January 2011.

   [RFC7301]  Friedl, S., Popov, A., Langley, A., and E. Stephan,
              "Transport Layer Security (TLS) Application-Layer Protocol
              Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301,
              July 2014.

   [RFC7605]  Touch, J., "Recommendations on Using Assigned Transport
              Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605,
              August 2015.

   [RFC7983]  Petit-Huguenin, M. and G. Salgueiro, "Multiplexing Scheme
              Updates for Secure Real-time Transport Protocol (SRTP)
              Extension for Datagram Transport Layer Security (DTLS)",
              RFC 7983, DOI 10.17487/RFC7983, September 2016.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

   [RFC8504]  Chown, T., Loughney, J., and T. Winters, "IPv6 Node
              Requirements", BCP 220, RFC 8504, DOI 10.17487/RFC8504,
              January 2019.

   [TLS-ECH]  Rescorla, E., Oku, K., Sullivan, N., and C. A. Wood, "TLS
              Encrypted Client Hello", Work in Progress, Internet-Draft,
              draft-ietf-tls-esni-10, 8 March 2021.

   [TMA-QOF]  Trammell, B., Gugelmann, D., and N. Brownlee, "Inline Data
              Integrity Signals for Passive Measurement (in Proc. TMA
              2014)", April 2014.

   [WIRE-IMAGE]
              Trammell, B. and M. Kuehlewind, "The Wire Image of a
              Network Protocol", RFC 8546, DOI 10.17487/RFC8546, April
              2019.

Appendix A. Distinguishing IETF QUIC and Google QUIC Versions

   This section contains algorithms that allow parsing versions from
   both Google QUIC and IETF QUIC. These mechanisms will become
   irrelevant when IETF QUIC is fully deployed and Google QUIC is
   deprecated.

   Note that other than this appendix, nothing in this document applies
   to Google QUIC. The purpose of this appendix is merely to
   distinguish IETF QUIC from any versions of Google QUIC.

   This appendix uses the following conventions:

   *  array[i] - one byte at index i of array

   *  array[i:j] - subset of array starting with index i (inclusive) up
      to j-1 (inclusive)

   *  array[i:] - subset of array starting with index i (inclusive) up
      to the end of the array

   Conceptually, a Google QUIC version is an opaque 32-bit field. When
   we refer to a version with four printable characters, we use its
   ASCII representation: for example, Q050 refers to {'Q', '0', '5',
   '0'} which is equal to {0x51, 0x30, 0x35, 0x30}. Otherwise, we use
   its hexadecimal representation: for example, 0xff00001d refers to
   {0xff, 0x00, 0x00, 0x1d}.

   QUIC versions that start with 'Q' or 'T' followed by three digits are
   Google QUIC versions. Versions up to and including 43 are documented
   by . Versions Q046, Q050, T050, and T051 are not fully documented,
   but this appendix should contain enough information to allow parsing
   Client Hellos for those versions.

   To extract the version number itself, one needs to look at the first
   byte of the QUIC packet, in other words the first byte of the UDP
   payload.
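
   As a runnable companion to this appendix's pseudocode, the following
   Python sketch applies the same first-byte test and the 'Q'/'T'
   naming rule described above. It is an illustration under stated
   assumptions, not a reference implementation: the function names
   extract_version and is_google_quic_version are hypothetical, errors
   are reported with ValueError instead of abort(), and an explicit
   length check for truncated packets is added.

```python
def extract_version(packet: bytes) -> bytes:
    """Return the 4-byte version field of a UDP payload, or raise ValueError."""
    if not packet:
        raise ValueError("Empty packet")
    first_byte = packet[0]
    if first_byte & 0x80:
        # Most significant bit set: the version is in bytes 1..4.
        version = packet[1:5]
    elif (first_byte & 0x08) and not (first_byte & 0x40):
        if not (first_byte & 0x01):
            raise ValueError("Packet without version")
        # The 0x08 bit is known to be set in this branch, so the
        # appendix pseudocode always selects packet[9:13] here.
        version = packet[9:13]
    else:
        raise ValueError("Packet without version")
    if len(version) != 4:
        # Assumption: a packet too short to hold the version is rejected.
        raise ValueError("Truncated packet")
    return version


def is_google_quic_version(version: bytes) -> bool:
    """True for 'Q' or 'T' followed by three ASCII digits, e.g. b"Q050"."""
    return (
        len(version) == 4
        and version[:1] in (b"Q", b"T")
        and version[1:].isdigit()
    )


# Usage: a made-up payload whose first byte has the high bit set.
sample = bytes([0xC0]) + b"Q050" + bytes(8)
sample_version = extract_version(sample)
sample_is_google = is_google_quic_version(sample_version)
```

   A version such as 0xff00001d fails the naming rule and would
   therefore not be treated as a Google QUIC version by this sketch.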

   first_byte = packet[0]
   // Bits are numbered from most significant (bit1) to least (bit8).
   first_byte_bit1 = ((first_byte & 0x80) != 0)
   first_byte_bit2 = ((first_byte & 0x40) != 0)
   first_byte_bit3 = ((first_byte & 0x20) != 0)
   first_byte_bit4 = ((first_byte & 0x10) != 0)
   first_byte_bit5 = ((first_byte & 0x08) != 0)
   first_byte_bit6 = ((first_byte & 0x04) != 0)
   first_byte_bit7 = ((first_byte & 0x02) != 0)
   first_byte_bit8 = ((first_byte & 0x01) != 0)
   if (first_byte_bit1) {
     // High bit set: the version follows the first byte.
     version = packet[1:5]
   } else if (first_byte_bit5 && !first_byte_bit2) {
     if (!first_byte_bit8) {
       abort("Packet without version")
     }
     // As written, first_byte_bit5 is already known to be set here,
     // so the else branch below is never taken.
     if (first_byte_bit5) {
       version = packet[9:13]
     } else {
       version = packet[5:9]
     }
   } else {
     abort("Packet without version")
   }

A.1. Extracting the CRYPTO frame

   counter = 0
   // Skip leading zero bytes (PADDING frames, type 0x00).
   while (payload[counter] == 0) {
     counter += 1
   }
   first_nonzero_payload_byte = payload[counter]
   fnz_payload_byte_bit3 = ((first_nonzero_payload_byte & 0x20) != 0)

   // The first non-zero byte is expected to be frame type 0x06.
   if (first_nonzero_payload_byte != 0x06) {
     abort("Unexpected frame")
   }
   // The crypto data is expected to start at stream offset 0.
   if (payload[counter+1] != 0x00) {
     abort("Unexpected crypto stream offset")
   }
   counter += 2
   if ((payload[counter] & 0xc0) == 0) {
     // Two high bits clear: the length fits in one byte.
     crypto_data_length = payload[counter]
     counter += 1
   } else {
     // Otherwise a two-byte length is assumed. Read as a QUIC
     // variable-length integer, the two high bits of the first byte
     // encode the size of the field and are not part of the value.
     crypto_data_length = payload[counter:counter+2]
     counter += 2
   }
   crypto_data = payload[counter:counter+crypto_data_length]
   ParseTLS(crypto_data)

Authors' Addresses

   Mirja Kuehlewind
   Ericsson

   Email: mirja.kuehlewind@ericsson.com

   Brian Trammell
   Google Switzerland GmbH
   Gustav-Gull-Platz 1
   CH- 8004 Zurich
   Switzerland

   Email: ietf@trammell.ch