Network Working Group                                      M. Kuehlewind
Internet-Draft                                                  Ericsson
Intended status: Informational                               B. Trammell
Expires: 6 March 2022                            Google Switzerland GmbH
                                                        2 September 2021

              Manageability of the QUIC Transport Protocol
                    draft-ietf-quic-manageability-13

Abstract

   This document discusses manageability of the QUIC transport protocol,
   focusing on the implications of QUIC's design and wire image on
   network operations involving QUIC traffic. Its intended audience is
   network operators and equipment vendors who rely on the use of
   transport-aware network functions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 6 March 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Features of the QUIC Wire Image
     2.1.  QUIC Packet Header Structure
     2.2.  Coalesced Packets
     2.3.  Use of Port Numbers
     2.4.  The QUIC Handshake
     2.5.  Integrity Protection of the Wire Image
     2.6.  Connection ID and Rebinding
     2.7.  Packet Numbers
     2.8.  Version Negotiation and Greasing
   3.  Network-Visible Information about QUIC Flows
     3.1.  Identifying QUIC Traffic
       3.1.1.  Identifying Negotiated Version
       3.1.2.  First Packet Identification for Garbage Rejection
     3.2.  Connection Confirmation
     3.3.  Distinguishing Acknowledgment Traffic
     3.4.  Server Name Indication (SNI)
       3.4.1.  Extracting Server Name Indication (SNI) Information
     3.5.  Flow Association
     3.6.  Flow Teardown
     3.7.  Flow Symmetry Measurement
     3.8.  Round-Trip Time (RTT) Measurement
       3.8.1.  Measuring Initial RTT
       3.8.2.  Using the Spin Bit for Passive RTT Measurement
   4.  Specific Network Management Tasks
     4.1.  Passive Network Performance Measurement and Troubleshooting
     4.2.  Stateful Treatment of QUIC Traffic
     4.3.  Address Rewriting to Ensure Routing Stability
     4.4.  Server Cooperation with Load Balancers
     4.5.  Filtering Behavior
     4.6.  UDP Blocking, Throttling, and NAT Binding
     4.7.  DDoS Detection and Mitigation
     4.8.  Quality of Service Handling and ECMP Routing
     4.9.  Handling ICMP Messages
     4.10. Guiding Path MTU
   5.  IANA Considerations
   6.  Security Considerations
   7.  Contributors
   8.  Acknowledgments
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   QUIC [QUIC-TRANSPORT] is a new transport protocol that is
   encapsulated in UDP. QUIC integrates TLS [QUIC-TLS] to encrypt all
   payload data and most control information. QUIC version 1 was
   designed primarily as a transport for HTTP, with the resulting
   protocol being known as HTTP/3 [QUIC-HTTP].

   This document provides guidance for network operations that manage
   QUIC traffic.
   This includes guidance on how to interpret and utilize
   information that is exposed by QUIC to the network, requirements and
   assumptions of the QUIC design with respect to network treatment, and
   a description of how common network management practices will be
   impacted by QUIC.

   QUIC is an end-to-end transport protocol. No information in the
   protocol header, even that which can be inspected, is mutable by the
   network. This is achieved through integrity protection of the wire
   image [WIRE-IMAGE]. Encryption of most control signaling means that
   less information is visible to the network than is the case with TCP.

   Integrity protection can also simplify troubleshooting, because none
   of the nodes on the network path can modify transport layer
   information. However, it means in-network operations that depend on
   modification of data are not possible without the cooperation of a
   QUIC endpoint. This might be possible with the introduction of a
   proxy which authenticates as an endpoint. Proxy operations are not
   in scope for this document.

   Network management is not a one-size-fits-all endeavour: practices
   considered necessary or even mandatory within enterprise networks
   with certain compliance requirements, for example, would be
   impermissible on other networks without those requirements. This
   document therefore does not make any specific recommendations as to
   which practices should or should not be applied; for each practice,
   it describes what is and is not possible with the QUIC transport
   protocol as defined.

2.  Features of the QUIC Wire Image

   This section discusses those aspects of the QUIC transport protocol
   that have an impact on the design and operation of devices that
   forward QUIC packets.
   This section is therefore primarily
   considering the unencrypted part of QUIC's wire image [WIRE-IMAGE],
   which is defined as the information available in the packet header in
   each QUIC packet, and the dynamics of that information. Since QUIC
   is a versioned protocol, the wire image of the header format can also
   change from version to version. However, the field that identifies
   the QUIC version in some packets, and the format of the Version
   Negotiation Packet, are both inspectable and invariant
   [QUIC-INVARIANTS].

   This document describes version 1 of the QUIC protocol, whose wire
   image is fully defined in [QUIC-TRANSPORT] and [QUIC-TLS]. Features
   of the wire image described herein may change in future versions of
   the protocol, except when specified as an invariant
   [QUIC-INVARIANTS], and cannot be used to identify QUIC as a protocol
   or to infer the behavior of future versions of QUIC.

2.1.  QUIC Packet Header Structure

   QUIC packets may have either a long header or a short header. The
   first bit of the QUIC header is the Header Form bit, and indicates
   which type of header is present. The purpose of this bit is
   invariant across QUIC versions.

   The long header exposes more information. It contains a version
   number, as well as source and destination connection IDs for
   associating packets with a QUIC connection. The definition and
   location of these fields in the QUIC long header are invariant for
   future versions of QUIC, although future versions of QUIC may provide
   additional fields in the long header [QUIC-INVARIANTS].

   In version 1 of QUIC, the long header is used during connection
   establishment to transmit crypto handshake data, perform version
   negotiation, retry, and send 0-RTT data.

   Short headers contain only an optional destination connection ID and
   the spin bit for RTT measurement. In version 1 of QUIC, they are
   used after connection establishment.
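
   As an illustration of the invariant long header fields described
   above, the following is a minimal sketch in Python of how an on-path
   observer might read them. The names LongHeader and parse_long_header
   are hypothetical, and the byte layout (one first byte, a 4-byte
   version, then each connection ID preceded by a 1-byte length) is
   assumed from [QUIC-INVARIANTS]. Everything after the source
   connection ID is version-specific and is left untouched.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LongHeader:
    version: int
    dcid: bytes
    scid: bytes
    rest_offset: int  # index of the first version-specific byte


def parse_long_header(datagram: bytes) -> Optional[LongHeader]:
    """Return the invariant long header fields, or None if the datagram
    does not start with a complete long header."""
    if len(datagram) < 7:          # 1 + 4 + two 1-byte CID lengths
        return None
    if not datagram[0] & 0x80:     # Header Form bit clear: short header
        return None
    version = int.from_bytes(datagram[1:5], "big")
    pos = 5
    dcid_len = datagram[pos]
    pos += 1
    if pos + dcid_len + 1 > len(datagram):
        return None
    dcid = datagram[pos:pos + dcid_len]
    pos += dcid_len
    scid_len = datagram[pos]
    pos += 1
    if pos + scid_len > len(datagram):
        return None
    scid = datagram[pos:pos + scid_len]
    pos += scid_len
    return LongHeader(version, dcid, scid, pos)
```

   Returning None for a cleared Header Form bit does not mean the
   datagram is a QUIC short header packet; it only means that no long
   header fields can be read from it.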

   The following information is exposed in QUIC packet headers in all
   versions of QUIC:

   *  version number: the version number is present in the long header,
      and identifies the version used for that packet. During Version
      Negotiation (see Section 17.2.1 of [QUIC-TRANSPORT] and
      Section 2.8), the version number field has a special value
      (0x00000000) that identifies the packet as a Version Negotiation
      packet. QUIC version 1 uses version 0x00000001. Operators should
      expect to observe packets with other version numbers as a result
      of various Internet experiments, future standards, and greasing
      ([RFC7801]). All deployed versions are maintained in an IANA
      registry (see Section 22.2 of [QUIC-TRANSPORT]).

   *  source and destination connection ID: short and long packet
      headers carry a destination connection ID, a variable-length field
      that can be used to identify the connection associated with a QUIC
      packet, for load-balancing and NAT rebinding purposes; see
      Section 4.4 and Section 2.6. Long packet headers additionally
      carry a source connection ID. The source connection ID
      corresponds to the destination connection ID the source would like
      to have on packets sent to it, and is only present on long packet
      headers. On long header packets, the length of the connection IDs
      is also present; on short header packets, the length of the
      destination connection ID is implicit.

   In version 1 of QUIC, the following additional information is
   exposed:

   *  "fixed bit": The second-most-significant bit of the first octet of
      most QUIC packets of the current version is set to 1, enabling
      endpoints to demultiplex with other UDP-encapsulated protocols.
      Even though this bit is fixed in the version 1 specification,
      endpoints might use an extension that varies the bit. Therefore,
      observers cannot reliably use it as an identifier for QUIC.

   *  latency spin bit: The third-most-significant bit of the first
      octet in the short packet header for version 1. The spin bit is
      set by endpoints such that tracking edge transitions can be used
      to passively observe end-to-end RTT. See Section 3.8.2 for
      further details.

   *  header type: The long header has a 2 bit packet type field
      following the Header Form and fixed bits. Header types correspond
      to stages of the handshake; see Section 17.2 of [QUIC-TRANSPORT]
      for details.

   *  length: The length of the remaining QUIC packet after the length
      field, present on long headers. This field is used to implement
      coalesced packets during the handshake (see Section 2.2).

   *  token: Initial packets may contain a token, a variable-length
      opaque value optionally sent from client to server, used for
      validating the client's address. Retry packets also contain a
      token, which can be used by the client in an Initial packet on a
      subsequent connection attempt. The length of the token is
      explicit in both cases.

   Retry (Section 17.2.5 of [QUIC-TRANSPORT]) and Version Negotiation
   (Section 17.2.1 of [QUIC-TRANSPORT]) packets are not encrypted or
   obfuscated in any way. For other kinds of packets, version 1 of QUIC
   cryptographically obfuscates other information in the packet headers:

   *  packet number: All packets except Version Negotiation and Retry
      packets have an associated packet number; however, this packet
      number is encrypted, and therefore not of use to on-path
      observers. The offset of the packet number can be decoded in long
      headers, while it is implicit (depending on destination connection
      ID length) in short headers. The length of the packet number is
      cryptographically obfuscated.

   *  key phase: The Key Phase bit, present in short headers, specifies
      the keys used to encrypt the packet to support key rotation. The
      Key Phase bit is cryptographically obfuscated.

2.2.  Coalesced Packets

   Multiple QUIC packets may be coalesced into a single UDP datagram,
   with a datagram carrying one or more long header packets followed by
   zero or one short header packets. When packets are coalesced, the
   Length fields in the long headers are used to separate QUIC packets;
   see Section 12.2 of [QUIC-TRANSPORT]. The Length field is variable
   length, and its position in the header is also variable depending on
   the length of the source and destination connection ID; see
   Section 17.2 of [QUIC-TRANSPORT].

2.3.  Use of Port Numbers

   Applications that have a mapping for TCP as well as QUIC are expected
   to use the same port number for both services. However, as for all
   other IETF transports [RFC7605], there is no guarantee that a
   specific application will use a given registered port, or that a
   given port carries traffic belonging to the respective registered
   service, especially when application layer information is encrypted.
   For example, [QUIC-HTTP] specifies the use of Alt-Svc for discovery
   of HTTP/3 services on other ports.

   Further, as QUIC has a connection ID, it is also possible to maintain
   multiple QUIC connections over one 5-tuple. However, if the
   connection ID is zero-length, all packets of the 5-tuple likely
   belong to the same QUIC connection.

2.4.  The QUIC Handshake

   New QUIC connections are established using a handshake, which is
   distinguishable on the wire and contains some information that can be
   passively observed.

   To illustrate the information visible in the QUIC wire image during
   the handshake, we first show the general communication pattern
   visible in the UDP datagrams containing the QUIC handshake, then
   examine each of the datagrams in detail.

   The QUIC handshake can normally be recognized on the wire through
   four flights of datagrams labelled "Client Initial", "Server
   Initial", "Client Completion", and "Server Completion", in the
   illustration shown in Figure 1.

   Packets in the handshake belong to three separate cryptographic and
   transport contexts ("Initial", which contains observable payload, and
   "Handshake" and "1-RTT", which do not). QUIC packets in separate
   contexts during the handshake can be coalesced (see Section 2.2) in
   order to reduce the number of UDP datagrams sent during the
   handshake. QUIC packets can be lost and reordered, so packets within
   a flight might not be sent close in time, though the sequence of the
   flights will not change, because one flight depends upon the peer's
   previous flight.

   As shown here, the client can send 0-RTT data as soon as it has sent
   its Client Hello, and the server can send 1-RTT data as soon as it
   has sent its Server Hello.

   Client                                    Server
     |                                          |
     +----Client Initial----------------------->|
     +----(zero or more 0RTT)------------------>|
     |                                          |
     |<-----------------------Server Initial----+
     |<---------(1RTT encrypted data starts)----+
     |                                          |
     +----Client Completion-------------------->|
     +----(1RTT encrypted data starts)--------->|
     |                                          |
     |<--------------------Server Completion----+
     |                                          |

   Figure 1: General communication pattern visible in the QUIC handshake

   A handshake starts with the client sending one or more datagrams
   containing Initial packets as shown in Figure 2, which elicits the
   Server Initial response as shown in Figure 3 typically containing
   three types of packets: Initial packet(s) with the beginning of the
   server's side of the TLS handshake, Handshake packet(s) with the rest
   of the server's portion of the TLS handshake, and 1-RTT packet(s), if
   present.
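
   Because a flight such as the Server Initial can carry several
   coalesced packets, an observer has to walk the Length fields
   (Section 2.2) to find the packet boundaries. The following Python
   sketch shows one way this might be done for version 1 only. The
   function names are hypothetical, and the field layout (variable-
   length integer encoding, the token preceding Length in Initial
   packets, Retry carrying no Length) is assumed from Sections 16 and
   17.2 of [QUIC-TRANSPORT]. The packet type labels are derived from
   the unprotected type bits of the first octet.

```python
from typing import List, Tuple

QUIC_V1 = 0x00000001
# Long header packet types for version 1 (bits 0x30 of the first octet).
LONG_TYPES = {0: "Initial", 1: "0-RTT", 2: "Handshake", 3: "Retry"}


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a QUIC variable-length integer; return (value, next_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    size = 1 << (buf[pos] >> 6)  # top two bits select 1, 2, 4, or 8 bytes
    if pos + size > len(buf):
        raise ValueError("truncated varint")
    mask = (1 << (8 * size - 2)) - 1
    return int.from_bytes(buf[pos:pos + size], "big") & mask, pos + size


def split_coalesced(datagram: bytes) -> List[Tuple[str, bytes]]:
    """Split one UDP payload into (label, packet_bytes) pairs."""
    packets = []
    pos = 0
    while pos < len(datagram):
        start, first = pos, datagram[pos]
        if not first & 0x80:
            # Header Form bit clear: no Length field is available, so
            # whatever remains (short header packet or padding) is
            # treated as one unit.
            packets.append(("short", datagram[start:]))
            break
        if len(datagram) - pos < 7:
            raise ValueError("truncated long header")
        if int.from_bytes(datagram[pos + 1:pos + 5], "big") != QUIC_V1:
            raise ValueError("only the version 1 layout is handled here")
        pos += 5
        for _ in range(2):  # DCID, then SCID: one length byte plus value
            if pos >= len(datagram):
                raise ValueError("truncated connection ID")
            pos += 1 + datagram[pos]
        label = LONG_TYPES[(first >> 4) & 0x03]
        if label == "Retry":  # Retry has no Length field
            packets.append((label, datagram[start:]))
            break
        if label == "Initial":  # token length and token precede Length
            token_len, pos = read_varint(datagram, pos)
            pos += token_len
        length, pos = read_varint(datagram, pos)
        end = pos + length
        if end > len(datagram):
            raise ValueError("Length field exceeds datagram")
        packets.append((label, datagram[start:end]))
        pos = end
    return packets
```

   The sketch deliberately refuses other versions rather than guessing,
   since only the fields up to the source connection ID are invariant.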

   The Client Completion flight contains at least one Handshake packet
   and could also include an Initial packet.

   Datagrams that contain an Initial packet (Client Initial, Server
   Initial, and some Client Completion) contain at least 1200 octets of
   UDP payload. This protects against amplification attacks and
   verifies that the network path meets the requirements for the minimum
   QUIC IP packet size; see Section 14 of [QUIC-TRANSPORT]. This is
   accomplished by either adding PADDING frames within the Initial
   packet, coalescing other packets with the Initial packet, or leaving
   unused payload in the UDP packet after the Initial packet. A network
   path needs to be able to forward at least this size of packet for
   QUIC to be used.

   The content of Initial packets is encrypted using Initial Secrets,
   which are derived from a per-version constant and the client's
   destination connection ID; they are therefore observable by any
   on-path device that knows the per-version constant and considered
   visible in this illustration. The content of QUIC Handshake packets
   is encrypted using keys established during the initial handshake
   exchange, and is therefore not visible.

   Initial, Handshake, and 1-RTT packets belong to different
   cryptographic and transport contexts. The Client Completion
   (Figure 4) and the Server Completion (Figure 5) flights conclude the
   Initial and Handshake contexts by sending final acknowledgments and
   CRYPTO frames.

   +----------------------------------------------------------+
   | UDP header (source and destination UDP ports)            |
   +----------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                 |  |
   +----------------------------------------------------------+  |
   | | TLS Client Hello (incl. TLS SNI)                     | |  |
   +----------------------------------------------------------+  |
   | QUIC PADDING frames                                      |  |
   +----------------------------------------------------------+<-+

         Figure 2: Example Client Initial datagram without 0-RTT

   A Client Initial packet exposes the version, source and destination
   connection IDs without encryption. The payload of the Initial packet
   is obfuscated using the Initial secret. The complete TLS Client
   Hello, including any TLS Server Name Indication (SNI) present, is
   sent in one or more CRYPTO frames across one or more QUIC Initial
   packets.

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                   |  |
   +------------------------------------------------------------+  |
   | TLS Server Hello                                           |  |
   +------------------------------------------------------------+  |
   | QUIC ACK frame (acknowledging client hello)                |  |
   +------------------------------------------------------------+<-+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably CRYPTO frames)               |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

           Figure 3: Coalesced Server Initial datagram pattern

   The Server Initial datagram also exposes version number, source and
   destination connection IDs in the clear; the payload of the Initial
   packet(s) is obfuscated using the Initial secret.
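
   To make concrete why Initial packet payloads count as observable,
   the following Python sketch derives the client and server Initial
   secrets from the destination connection ID of the client's first
   Initial packet, using only the standard library. It is an
   illustration under stated assumptions, not a reference
   implementation: the function names are hypothetical, the salt value
   is the version 1 constant as recalled from Section 5.2 of
   [QUIC-TLS] and should be checked against that specification, and
   the further steps needed to remove header and packet protection are
   omitted.

```python
import hashlib
import hmac
from typing import Tuple

# Assumed per-version constant for QUIC version 1, see [QUIC-TLS].
# Other versions use different salts.
INITIAL_SALT_V1 = bytes.fromhex("38762cf7f55934b34d179ae6a4c80cadccbb7f0a")


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract with SHA-256."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand_label(secret: bytes, label: str, length: int) -> bytes:
    """TLS 1.3 HKDF-Expand-Label with an empty context, using SHA-256."""
    full_label = b"tls13 " + label.encode("ascii")
    info = (length.to_bytes(2, "big")
            + bytes([len(full_label)]) + full_label
            + b"\x00")  # zero-length context
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(secret, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]


def initial_secrets(client_dcid: bytes) -> Tuple[bytes, bytes]:
    """Return (client_initial_secret, server_initial_secret)."""
    initial = hkdf_extract(INITIAL_SALT_V1, client_dcid)
    return (hkdf_expand_label(initial, "client in", 32),
            hkdf_expand_label(initial, "server in", 32))
```

   Both inputs are visible on the wire or published, which is why this
   protection is described as obfuscation rather than confidentiality.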

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | QUIC ACK frame (acknowledging Server Initial)              |  |
   +------------------------------------------------------------+<-+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably CRYPTO/ACK frames)           |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

         Figure 4: Coalesced Client Completion datagram pattern

   The Client Completion flight does not expose any additional
   information; however, as the destination connection ID is
   server-selected, it usually is not the same ID as in the Client
   Initial. Client Completion flights contain 1-RTT packets, which
   indicate that the handshake has completed on the client (see
   Section 3.2) and can be used for three-way handshake RTT estimation
   as in Section 3.8.

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably ACK frame)                   |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

         Figure 5: Coalesced Server Completion datagram pattern

   Similar to Client Completion, Server Completion also exposes no
   additional information; observing it serves only to determine that
   the handshake has completed.

   When the client uses 0-RTT connection resumption, the Client Initial
   flight can also include one or more 0-RTT packets, as shown in
   Figure 6.

   +----------------------------------------------------------+
   | UDP header (source and destination UDP ports)            |
   +----------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                 |  |
   +----------------------------------------------------------+  |
   | TLS Client Hello (incl. TLS SNI)                         |  |
   +----------------------------------------------------------+<-+
   | QUIC long header (type = 0RTT, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | 0-RTT encrypted payload                                  |  |
   +----------------------------------------------------------+<-+

           Figure 6: Coalesced 0-RTT Client Initial datagram

   When a 0-RTT packet is coalesced with an Initial packet, the datagram
   will be padded to 1200 bytes.
   Additional datagrams containing only
   0-RTT packets with long headers can be sent after the client Initial
   packet(s), containing more 0-RTT data. The amount of 0-RTT protected
   data that can be sent in the first flight is limited by the initial
   congestion window, typically to around 10 packets (see Section 7.2 of
   [QUIC-RECOVERY]).

2.5.  Integrity Protection of the Wire Image

   As soon as the cryptographic context is established, all information
   in the QUIC header, including exposed information, is integrity
   protected. Further, information that was exposed in packets sent
   before the cryptographic context was established is validated during
   the cryptographic handshake. Therefore, devices on path cannot alter
   any information or bits in QUIC packets. Such alterations would
   cause the integrity check to fail, which results in the receiver
   discarding the packet. Some parts of Initial packets could be
   altered by removing and re-applying the authenticated encryption
   without immediate discard at the receiver. However, the
   cryptographic handshake validates most fields and any modifications
   in those fields will result in connection establishment failing
   later.

2.6.  Connection ID and Rebinding

   The connection ID in the QUIC packet headers allows association of
   QUIC packets using information independent of the five-tuple. This
   allows rebinding of a connection after one of the endpoints
   experienced an address change - usually the client. Further, it can
   be used by in-network devices to ensure that related 5-tuple flows
   are appropriately balanced together.

   Client and server each choose a connection ID during the handshake;
   for example, a server might request that a client use a connection
   ID, whereas the client might choose a zero-length value.
   Connection
   IDs for either endpoint may change during the lifetime of a
   connection, with the new connection ID being supplied via encrypted
   frames (see Section 5.1 of [QUIC-TRANSPORT]). Therefore, observing a
   new connection ID does not necessarily indicate a new connection.

   [QUIC_LB] specifies algorithms for encoding the server mapping in a
   connection ID in order to share this information with selected
   on-path devices such as load balancers. Server mappings should only
   be exposed to selected entities. Uncontrolled exposure would allow
   linkage of multiple IP addresses to the same host if the server also
   supports migration, which opens an attack vector on specific servers
   or pools. The best way to obscure an encoding is to appear random to
   any other observers, which is most rigorously achieved with
   encryption. As a result, any attempt to infer information from
   specific parts of a connection ID is unlikely to be useful.

2.7.  Packet Numbers

   The Packet Number field is always present in the QUIC packet header
   in version 1; however, it is always encrypted. The encryption key
   for packet number protection on Initial packets -- which are sent
   before cryptographic context establishment -- is specific to the QUIC
   version, while packet number protection on subsequent packets uses
   secrets derived from the end-to-end cryptographic context. Packet
   numbers are therefore not part of the wire image that is visible to
   on-path observers.

2.8.  Version Negotiation and Greasing

   Version Negotiation packets are used by the server to indicate that a
   requested version from the client is not supported (see Section 6 of
   [QUIC-TRANSPORT]). Version Negotiation packets are not intrinsically
   protected, but future QUIC versions will use later encrypted messages
   to verify that they were authentic.
Therefore, any modification of 543 this list will be detected and may cause the endpoints to terminate 544 the connection attempt. 546 Also note that the list of versions in the Version Negotiation packet 547 may contain reserved versions. This mechanism is used to avoid 548 ossification in the implementation on the selection mechanism. 549 Further, a client may send an Initial packet with a reserved version 550 number to trigger version negotiation. In the Version Negotiation 551 packet, the connection IDs of the client's Initial packet are 552 reflected to provide a proof of return-routability. Therefore, 553 changing this information will also cause the connection to fail. 555 QUIC is expected to evolve rapidly, so new versions, both 556 experimental and IETF standard versions, will be deployed on the 557 Internet more often than with traditional Internet- and transport- 558 layer protocols. Using a particular version number to recognize 559 valid QUIC traffic is likely to persistently miss a fraction of QUIC 560 flows and completely fail in the near future, and is therefore not 561 recommended. In addition, due to the speed of evolution of the 562 protocol, devices that attempt to distinguish QUIC traffic from non- 563 QUIC traffic for purposes of network admission control should admit 564 all QUIC traffic regardless of version. 566 3. Network-Visible Information about QUIC Flows 568 This section addresses the different kinds of observations and 569 inferences that can be made about QUIC flows by a passive observer in 570 the network based on the wire image in Section 2. Here we assume a 571 bidirectional observer (one that can see packets in both directions 572 in the sequence in which they are carried on the wire) unless noted, 573 but typically without access to any keying information. 575 3.1. 
Identifying QUIC Traffic 577 The QUIC wire image is not specifically designed to be 578 distinguishable from other UDP traffic by a passive observer in the 579 network. 581 The only application binding defined by the IETF QUIC WG is HTTP/3 582 [QUIC-HTTP] at the time of this writing; however, many other 583 applications are currently being defined and deployed over QUIC, so 584 an assumption that all QUIC traffic is HTTP/3 is not valid. HTTP/3 585 uses UDP port 443 by convention but various methods can be used to 586 specify alternate port numbers. Simple assumptions about whether a 587 given flow is using QUIC based upon a UDP port number may therefore 588 not hold; see also Section 5 of [RFC7605]. 590 While the second-most-significant bit (0x40) of the first octet is 591 set to 1 in most QUIC packets of the current version (see Section 2.1 592 and Section 17 of [QUIC-TRANSPORT]), this method of recognizing QUIC 593 traffic is not reliable. First, it only provides one bit of 594 information and is prone to collision with UDP-based protocols other 595 than those considered in [RFC7983]. Second, this feature of the wire 596 image is not invariant [QUIC-INVARIANTS] and may change in future 597 versions of the protocol, or even be negotiated during the handshake 598 via the use of an extension. 600 Even though transport parameters transmitted in the client's Initial 601 packet are observable by the network, they cannot be modified by the 602 network without causing connection failure. Further, the reply from 603 the server cannot be observed, so observers on the network cannot 604 know which parameters are actually in use. 606 3.1.1. 
Identifying Negotiated Version

An in-network observer assuming that a set of packets belongs to a QUIC flow might infer the version number in use by observing the handshake: for QUIC version 1, if the version number in the Initial packet from a client is the same as the version number in the Initial packet of the server response, that version has been accepted by both endpoints to be used for the rest of the connection.

The negotiated version cannot be identified for flows for which a handshake is not observed, such as in the case of connection migration; however, it might be possible to associate a flow with a flow for which a version has been identified; see Section 3.5.

3.1.2. First Packet Identification for Garbage Rejection

A related question is whether the first packet of a given flow on a port known to be associated with QUIC is a valid QUIC packet. This determination supports in-network filtering of garbage UDP packets (reflection attacks, random backscatter, etc.). While heuristics based on the first byte of the packet (packet type) could be used to separate valid from invalid first packet types, the deployment of such heuristics is not recommended, as bits in the first byte may have different meanings in future versions of the protocol.

3.2. Connection Confirmation

This document focuses on QUIC version 1, and this Connection Confirmation section applies only to packets belonging to QUIC version 1 flows; for purposes of on-path observation, it assumes that these packets have been identified as such through the observation of a version number exchange as described above.

Connection establishment uses Initial and Handshake packets containing a TLS handshake, and Retry packets that do not contain parts of the handshake. Connection establishment can therefore be detected using heuristics similar to those used to detect TLS over TCP.
A client initiating a connection may also send data in 0-RTT packets directly after the Initial packet containing the TLS ClientHello. Since packets may be reordered or lost in the network, 0-RTT packets could be seen before the Initial packet.

Note that in this version of QUIC, clients send Initial packets before servers do, servers send Handshake packets before clients do, and only clients send Initial packets with tokens. Therefore, an endpoint can be identified as a client or server by an on-path observer. An attempted connection after Retry can be detected by correlating the contents of the Retry packet with the Token and the Destination Connection ID fields of the new Initial packet.

3.3. Distinguishing Acknowledgment Traffic

Some deployed in-network functions distinguish pure-acknowledgment (ACK) packets from packets carrying upper-layer data in order to attempt to enhance performance, for example by queueing ACKs differently or manipulating ACK signaling [RFC3449]. Distinguishing ACK packets is possible in TCP, but is not supported by QUIC, since acknowledgment signaling is carried inside QUIC's encrypted payload, and ACK manipulation is impossible. Specifically, heuristics attempting to distinguish ACK-only packets from payload-carrying packets based on packet size are likely to fail, and their use to infer the internals of QUIC's operation is not recommended, as those mechanisms can change, e.g., due to the use of extensions.

3.4. Server Name Indication (SNI)

The client's TLS ClientHello may contain a Server Name Indication (SNI) [RFC6066] extension, by which the client reveals the name of the server it intends to connect to, in order to allow the server to present a certificate based on that name.
It may also contain an Application-Layer Protocol Negotiation (ALPN) [RFC7301] extension, by which the client exposes the names of application-layer protocols it supports; an observer can deduce that one of those protocols will be used if the connection continues.

Work is currently underway in the TLS working group to encrypt the contents of the ClientHello in TLS 1.3 [TLS-ECH]. This would make SNI-based application identification impossible by on-path observation for QUIC and other protocols that use TLS.

3.4.1. Extracting Server Name Indication (SNI) Information

If the ClientHello is not encrypted, SNI can be derived from the client's Initial packet(s) by calculating the Initial secret to decrypt the packet payload and parsing the QUIC CRYPTO frame(s) containing the TLS ClientHello.

As both the derivation of the Initial secret and the structure of the Initial packet itself are version-specific, the first step is always to parse the version number (the second through fifth bytes of the long header). Note that only long header packets carry the version number, so it is necessary to also check whether the first bit of the QUIC packet is set to 1, indicating a long header.

Note that proprietary QUIC versions that were deployed before standardization might not set the first bit in a QUIC long header packet to 1. However, it is expected that these versions will gradually disappear over time.

When the version has been identified as QUIC version 1, the packet type needs to be verified as an Initial packet by checking that the third and fourth bits of the header are both set to 0. Then the Destination Connection ID needs to be extracted from the packet. The Initial secret is calculated using the version-specific Initial salt, as described in Section 5.2 of [QUIC-TLS].
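
As a rough illustration of the checks described so far, the following Python sketch tests for a long header, reads the version, verifies the Initial packet type, extracts the Destination Connection ID (whose length is carried in the sixth byte of the header), and derives the Initial secret with HKDF-Extract over SHA-256. The function names are hypothetical and are not taken from any specification; the salt is the version 1 value from Section 5.2 of [QUIC-TLS]. Deriving the client's packet protection keys from this secret and removing header protection are not shown.

```python
import hashlib
import hmac

# Initial salt for QUIC version 1 (Section 5.2 of [QUIC-TLS]).
INITIAL_SALT_V1 = bytes.fromhex("38762cf7f55934b34d179ae6a4c80cadccbb7f0a")
QUIC_V1 = 0x00000001


def parse_initial_dcid(datagram: bytes):
    """Return the Destination Connection ID of a QUIC v1 Initial packet,
    or None if the datagram does not look like one.
    (Hypothetical helper name, for illustration only.)"""
    if len(datagram) < 6:
        return None
    first = datagram[0]
    if not first & 0x80:              # first bit clear: short header, no version
        return None
    version = int.from_bytes(datagram[1:5], "big")   # second through fifth bytes
    if version != QUIC_V1:
        return None
    if first & 0x30:                  # third and fourth bits must both be 0
        return None
    dcid_len = datagram[5]            # sixth byte: connection ID length
    if dcid_len > 20 or len(datagram) < 6 + dcid_len:
        return None
    return datagram[6:6 + dcid_len]


def initial_secret(dcid: bytes) -> bytes:
    """HKDF-Extract(salt, dcid) with SHA-256, i.e. HMAC-SHA256 keyed by the salt."""
    return hmac.new(INITIAL_SALT_V1, dcid, hashlib.sha256).digest()
```

An observer would call parse_initial_dcid() on the first client datagram of a flow and, if it returns a connection ID, keep the result of initial_secret() for that flow.
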
The length of the connection ID is indicated in the 6th byte of the header followed by the connection ID itself.

Note that subsequent Initial packets might contain a Destination Connection ID other than the one used to generate the Initial secret. Therefore, attempts to decrypt these packets using the procedure above might fail unless the Initial secret is retained by the observer.

To determine the end of the packet header and find the start of the payload, the packet number length, the source connection ID length, and the token length need to be extracted. The packet number length is defined by the seventh and eighth bits of the header as described in Section 17.2 of [QUIC-TRANSPORT], but is obfuscated as described in Section 5.4 of [QUIC-TLS]. The source connection ID length is specified in the byte after the destination connection ID. The token length, which follows the source connection ID, is a variable-length integer as specified in Section 16 of [QUIC-TRANSPORT].

After decryption, the client's Initial packet(s) can be parsed to detect the CRYPTO frame(s) that contain the TLS ClientHello, which then can be parsed similarly to TLS over TCP connections. Note that there can be multiple CRYPTO frames spread out over one or more Initial packets, and they might not be in order, so reassembling the CRYPTO stream by parsing offsets and lengths is required. Further, the client's Initial packet(s) may contain other frames, so the first bytes of each frame need to be checked to identify the frame type, and the frame skipped over if needed. Note that the length of the frames is dependent on the frame type; see Section 19 of [QUIC-TRANSPORT]. For example, PADDING frames, each consisting of a single zero byte, may occur before, after, or between CRYPTO frames. However, extensions might define additional frame types.
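
The reassembly step can be sketched as follows, assuming the Initial payloads have already been decrypted. This Python sketch decodes variable-length integers, skips PADDING, PING, and ACK frames, collects CRYPTO frames by offset, and returns the contiguous prefix of the CRYPTO stream. The helper names are illustrative only, and the sketch deliberately gives up on any frame type it does not recognize.

```python
def read_varint(buf: bytes, pos: int):
    """Decode a QUIC variable-length integer (Section 16 of [QUIC-TRANSPORT]).
    Returns (value, position of the next byte)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    first = buf[pos]
    length = 1 << (first >> 6)        # two-bit prefix selects 1, 2, 4, or 8 bytes
    if pos + length > len(buf):
        raise ValueError("truncated varint")
    value = first & 0x3F
    for b in buf[pos + 1:pos + length]:
        value = (value << 8) | b
    return value, pos + length


def crypto_stream(payloads):
    """Reassemble the in-order prefix of the CRYPTO stream from a list of
    decrypted Initial packet payloads (hypothetical helper, not a full parser)."""
    chunks = {}
    for payload in payloads:
        pos = 0
        while pos < len(payload):
            ftype, pos = read_varint(payload, pos)
            if ftype in (0x00, 0x01):             # PADDING, PING: type only
                continue
            if ftype in (0x02, 0x03):             # ACK (0x03 adds ECN counts)
                _, pos = read_varint(payload, pos)        # largest acknowledged
                _, pos = read_varint(payload, pos)        # ACK delay
                count, pos = read_varint(payload, pos)    # ACK range count
                _, pos = read_varint(payload, pos)        # first ACK range
                for _ in range(2 * count + (3 if ftype == 0x03 else 0)):
                    _, pos = read_varint(payload, pos)
                continue
            if ftype == 0x06:                     # CRYPTO: offset, length, data
                offset, pos = read_varint(payload, pos)
                length, pos = read_varint(payload, pos)
                if pos + length > len(payload):
                    raise ValueError("truncated CRYPTO frame")
                chunks[offset] = payload[pos:pos + length]
                pos += length
                continue
            raise ValueError(f"unknown frame type {ftype:#x}")
    stream = bytearray()
    for offset in sorted(chunks):
        if offset > len(stream):                  # gap: later data not usable yet
            break
        stream[offset:offset + len(chunks[offset])] = chunks[offset]
    return bytes(stream)
```

The bytes returned by crypto_stream() can then be handed to an ordinary TLS handshake message parser to locate the ClientHello and its SNI extension.
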
If an unknown frame type is encountered, it is impossible to know the length of that frame, which prevents skipping over it, and therefore parsing fails.

3.5. Flow Association

The QUIC connection ID (see Section 2.6) is designed to allow a coordinating on-path device, such as a load-balancer, to associate two flows when one of the endpoints changes address. This change can be due to NAT rebinding or address migration.

The connection ID must change upon intentional address change by an endpoint, and connection ID negotiation is encrypted, so it is not possible for a passive observer to link intended changes of address using the connection ID.

When one endpoint's address unintentionally changes, as is the case with NAT rebinding, an on-path observer may be able to use the connection ID to associate the flow on the new address with the flow on the old address.

A network function that attempts to use the connection ID to associate flows must be robust to the failure of this technique. Since the connection ID may change multiple times during the lifetime of a connection, packets with the same five-tuple but different connection IDs might or might not belong to the same connection. Likewise, packets with the same connection ID but different five-tuples might not belong to the same connection, either.

Connection IDs should be treated as opaque; see Section 4.4 for caveats regarding connection ID selection at servers.

3.6. Flow Teardown

QUIC does not expose the end of a connection; the only indication to on-path devices that a flow has ended is that packets are no longer observed. Stateful devices on path such as NATs and firewalls must therefore use idle timeouts to determine when to drop state for QUIC flows; see Section 4.2.

3.7.
Flow Symmetry Measurement

QUIC explicitly exposes which side of a connection is a client and which side is a server during the handshake. In addition, the symmetry of a flow (whether primarily client-to-server, primarily server-to-client, or roughly bidirectional, as input to basic traffic classification techniques) can be inferred through the measurement of data rate in each direction. While QUIC traffic is protected and ACKs may be padded, padding is not required.

3.8. Round-Trip Time (RTT) Measurement

The round-trip time (RTT) of QUIC flows can be inferred by observation once per flow, during the handshake, as in passive TCP measurement; this requires parsing of the QUIC packet header and recognition of the handshake, as illustrated in Section 2.4. It can also be inferred during the flow's lifetime, if the endpoints use the spin bit facility described below and in Section 17.3.1 of [QUIC-TRANSPORT].

3.8.1. Measuring Initial RTT

In the common case, the delay between the client's Initial packet (containing the TLS ClientHello) and the server's Initial packet (containing the TLS ServerHello) represents the RTT component on the path between the observer and the server. The delay between the server's first Handshake packet and the Handshake packet sent by the client represents the RTT component on the path between the observer and the client. While the client may send 0-RTT packets after the Initial packet during connection re-establishment, these can be ignored for RTT measurement purposes.

Handshake RTT can be measured by adding the client-to-observer and observer-to-server RTT components together. This measurement necessarily includes any transport- and application-layer delay (the latter mainly caused by the asymmetric crypto operations associated with the TLS handshake) at both sides.

3.8.2.
Using the Spin Bit for Passive RTT Measurement

The spin bit provides a version-specific method to measure per-flow RTT from observation points on the network path throughout the duration of a connection. See Section 17.4 of [QUIC-TRANSPORT] for the definition of the spin bit in Version 1 of QUIC. Endpoint participation in spin bit signaling is optional. That is, while its location is fixed in this version of QUIC, an endpoint can unilaterally choose to not support "spinning" the bit.

Use of the spin bit for RTT measurement by devices on path is only possible when both endpoints enable it. Some endpoints may disable use of the spin bit by default, others only in specific deployment scenarios, e.g. for servers and clients where the RTT would reveal the presence of a VPN or proxy. To avoid making these connections identifiable based on the usage of the spin bit, all endpoints randomly disable "spinning" for at least one eighth of connections, even if otherwise enabled by default. An endpoint not participating in spin bit signaling for a given connection can use a fixed spin value for the duration of the connection, or can set the bit randomly on each packet sent.

When in use, the latency spin bit in each direction changes value once per RTT any time that both endpoints are sending packets continuously. An on-path observer can observe the time difference between edges (changes from 1 to 0 or 0 to 1) in the spin bit signal in a single direction to measure one sample of end-to-end RTT. This mechanism follows the principles of protocol measurability laid out in [IPIM].

Note that this measurement, as with passive RTT measurement for TCP, includes any transport protocol delay (e.g., delayed sending of acknowledgments) and/or application layer delay (e.g., waiting for a response to be generated).
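
A minimal sketch of this edge-based measurement, in Python, is shown here. It assumes the observer has already isolated the packets of one flow in a single direction and recorded a timestamp and the first byte of each; the function name and input format are hypothetical. It performs no rejection of samples distorted by loss, reordering, or endpoints that do not spin the bit.

```python
def spin_rtt_samples(packets):
    """Return RTT samples from (timestamp, first_byte) pairs observed in one
    direction of a single flow. Each sample is the time between two
    consecutive spin bit edges. (Illustrative helper, not from any spec.)"""
    samples = []
    last_spin = None     # spin value of the previous short-header packet
    last_edge = None     # timestamp of the most recent edge
    for timestamp, first_byte in packets:
        if first_byte & 0x80:
            continue                      # long header: carries no spin bit
        spin = (first_byte >> 5) & 1      # spin bit is 0x20 in QUIC version 1
        if last_spin is not None and spin != last_spin:
            if last_edge is not None:
                samples.append(timestamp - last_edge)
            last_edge = timestamp
        last_spin = spin
    return samples
```

With timestamps in milliseconds, a flow whose spin bit flips at 50, 100, and 150 ms yields two samples of 50 ms each; the first edge only starts the clock.
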
The spin bit measurement therefore provides devices on path a good instantaneous estimate of the RTT as experienced by the application.

However, application-limited and flow-control-limited senders can have application and transport layer delay, respectively, that are much greater than network RTT. When the sender is application-limited and, e.g., sends only a small amount of periodic application traffic, where that period is longer than the RTT, measuring the spin bit provides information about the application period, not the network RTT.

Since the spin bit logic at each endpoint considers only samples from packets that advance the largest packet number, signal generation itself is resistant to reordering. However, reordering can cause problems at an observer by causing spurious edge detection and therefore inaccurate (i.e., lower) RTT estimates, if reordering occurs across a spin-bit flip in the stream.

Simple heuristics based on the observed data rate per flow or changes in the RTT series can be used to reject bad RTT samples due to lost or reordered edges in the spin signal, as well as application or flow control limitation; for example, QoF [TMA-QOF] rejects component RTTs significantly higher than RTTs over the history of the flow. These heuristics may use the handshake RTT as an initial RTT estimate for a given flow. Usually such heuristics would also detect if the spin is either constant or randomly set for a connection.

An on-path observer that can see traffic in both directions (from client to server and from server to client) can also use the spin bit to measure "upstream" and "downstream" component RTT; i.e., the component of the end-to-end RTT attributable to the paths between the observer and the server and the observer and the client, respectively.
The observer does this by measuring the delay between a spin edge observed in the upstream direction and that observed in the downstream direction, and vice versa.

Raw RTT samples generated using these techniques can be processed in various ways to generate useful network performance metrics. A simple linear smoothing or moving minimum filter can be applied to the stream of RTT samples to get a more stable estimate of application-experienced RTT. RTT samples measured from the spin bit can also be used to generate RTT distribution information, including minimum RTT (which approximates network RTT over longer time windows) and RTT variance (which approximates jitter as seen by the application).

4. Specific Network Management Tasks

In this section, we review specific network management and measurement techniques and how QUIC's design impacts them.

4.1. Passive Network Performance Measurement and Troubleshooting

Limited RTT measurement is possible by passive observation of QUIC traffic; see Section 3.8. No passive measurement of loss is possible with the present wire image. Limited observation of upstream congestion may be possible via the observation of CE markings on ECN-enabled QUIC traffic.

On-path devices can also make measurements of RTT, loss and other performance metrics when information is carried in an additional network-layer packet header (Section 6 of [I-D.ietf-tsvwg-transport-encrypt] describes use of operations, administration and management (OAM) information). Using network-layer approaches also has the advantage that common observation and analysis tools can be consistently used by multiple transport protocols; however, these techniques are often limited to measurements within one or multiple cooperating domains.

4.2.
Stateful Treatment of QUIC Traffic

Stateful treatment of QUIC traffic (e.g., at a firewall or NAT middlebox) is possible through QUIC traffic and version identification (Section 3.1) and observation of the handshake for connection confirmation (Section 3.2). The lack of any visible end-of-flow signal (Section 3.6) means that this state must be purged either through timers or through least-recently-used eviction, depending on application requirements.

While QUIC has no clear network-visible end-of-connection signal and therefore does require timer-based state removal, the QUIC handshake indicates confirmation by both ends of a valid bidirectional transmission. As soon as the handshake has completed, timers should be set long enough to also allow for short idle time during a valid transmission.

[RFC4787] requires a network state timeout that is not less than 2 minutes for most UDP traffic. However, in practice, a QUIC endpoint can experience lower timeouts, in the range of 30 to 60 seconds.

In contrast, [RFC5382] recommends a state timeout of more than 2 hours for TCP, given that TCP is a connection-oriented protocol with well-defined closure semantics. Even though QUIC has explicitly been designed to tolerate NAT rebindings, decreasing the NAT timeout is not recommended, as it may negatively impact application performance or incentivize endpoints to send very frequent keep-alive packets.

The recommendation is therefore that, even when lower state timeouts are used for other UDP traffic, a state timeout of at least two minutes ought to be used for QUIC traffic.

If state is removed too early, this could lead to black-holing of incoming packets after a short idle period.
To detect this situation, a timer at the client needs to expire before a re-establishment can happen (if at all), which would lead to unnecessarily long delays in an otherwise working connection.

Furthermore, not all endpoints use routing architectures where connections will survive a port or address change. So even when the client revives the connection, a NAT rebinding can cause a routing mismatch where a packet is not even delivered to the server that might support address migration. For these reasons, the limits in [RFC4787] are important to avoid black-holing of packets (and hence avoid interrupting the flow of data to the client), especially where devices are able to distinguish QUIC traffic from other UDP payloads.

The QUIC header optionally contains a connection ID which could provide additional entropy beyond the 5-tuple. The QUIC handshake needs to be observed in order to understand whether the connection ID is present and what length it has. However, connection IDs may be renegotiated after the handshake, and this renegotiation is not visible to the path. Therefore, using the connection ID as a flow key field for stateful treatment of flows is not recommended, as connection ID changes will cause undetectable and unrecoverable loss of state in the middle of a connection. Specifically, the use of the connection ID for functions that require state to make a forwarding decision is not viable, as it will break connectivity or at minimum cause long timeout-based delays before this problem is detected by the endpoints and the connection can potentially be re-established.

Use of connection IDs is specifically discouraged for NAT applications. If a NAT hits an operational limit, it is recommended instead to drop the initial packets of a flow (see also Section 4.5), which potentially triggers a fallback to TCP.
Use of the connection ID to multiplex multiple connections on the same IP address/port pair is not a viable solution, as it risks connectivity breakage if the connection ID changes.

4.3. Address Rewriting to Ensure Routing Stability

While QUIC's migration capability makes it possible for a connection to survive client address changes, this does not work if the routers or switches in the server infrastructure route using the address-port 4-tuple. If infrastructure routes on addresses only, NAT rebinding or address migration will cause packets to be delivered to the wrong server. [QUIC_LB] describes a way to address this problem by coordinating the selection and use of connection IDs between load-balancers and servers.

Applying address translation at a middlebox to maintain a stable address-port mapping for flows based on connection ID might seem like a solution to this problem. However, hiding information about the change of the IP address or port conceals important and security-relevant information from QUIC endpoints and as such would facilitate amplification attacks (see Section 9 of [QUIC-TRANSPORT]). A NAT function that hides peer address changes prevents the other end from detecting and mitigating attacks, as the endpoint cannot verify connectivity to the new address using QUIC PATH_CHALLENGE and PATH_RESPONSE frames.

In addition, a change of IP address or port is also an input signal to other internal mechanisms in QUIC. When a path change is detected, path-dependent variables like congestion control parameters will be reset, protecting the new path from overload.

4.4. Server Cooperation with Load Balancers

In the case of networking architectures that include load balancers, the connection ID can be used as a way for the server to signal information about the desired treatment of a flow to the load balancers.
Guidance on assigning connection IDs is given in [QUIC-APPLICABILITY]. [QUIC_LB] describes a system for coordinating selection and use of connection IDs between load-balancers and servers.

4.5. Filtering Behavior

[RFC4787] describes possible packet filtering behaviors that relate to NATs but is often also used in other scenarios where packet filtering is desired. Though the guidance there holds, a particularly unwise behavior admits a handful of UDP packets and then makes a decision as to whether or not to filter later packets in the same connection. QUIC applications are encouraged to fail over to TCP if early packets do not arrive at their destination [I-D.ietf-quic-applicability], as QUIC is based on UDP and there are known blocks of UDP traffic (see Section 4.6). Admitting a few packets allows the QUIC endpoint to determine that the path accepts QUIC. Sudden drops afterwards will result in slow and costly timeouts before abandoning the connection.

4.6. UDP Blocking, Throttling, and NAT Binding

Today, UDP is the most prevalent DDoS vector, since it is easy for compromised non-admin applications to send a flood of large UDP packets (while with TCP the attacker gets throttled by the congestion controller) or to craft reflection and amplification attacks. Some networks therefore block UDP traffic. With increased deployment of QUIC, there is also an increased need to allow UDP traffic on ports used for QUIC. However, if UDP is generally enabled on these ports, UDP flood attacks may also use the same ports. One possible response to this threat is to throttle UDP traffic on the network, allocating a fixed portion of the network capacity to UDP and blocking UDP datagrams over that cap.
As the portion of QUIC traffic compared to TCP is also expected to increase over time, using such a limit is not recommended; if one is used, it might need to be adapted dynamically.

Further, if UDP traffic is to be throttled, it is recommended to block individual QUIC flows entirely rather than dropping packets indiscriminately. When the handshake is blocked, QUIC-capable applications may fail over to TCP. However, blocking a random fraction of QUIC packets across 4-tuples will allow many QUIC handshakes to complete, preventing a TCP failover, but these connections will suffer from severe packet loss (see also Section 4.5). Therefore, UDP throttling should be realized by per-flow policing, as opposed to per-packet policing. Note that this per-flow policing should be stateless to avoid problems with stateful treatment of QUIC flows (see Section 4.2), for example, by blocking a portion of the space of values of a hash function over the addresses and ports in the UDP datagram. While QUIC endpoints are often able to survive address changes, e.g., due to NAT rebindings, blocking a portion of the traffic based on 5-tuple hashing increases the risk of black-holing an active connection when the address changes.

Note that some source ports are assumed to be reflection attack vectors by some servers; see Section 8.1 of [I-D.ietf-quic-applicability]. As a result, NAT binding to these source ports can result in that traffic being blocked.

4.7. DDoS Detection and Mitigation

On-path observation of the transport headers of packets can be used for various security functions. For example, Denial of Service (DoS) and Distributed DoS (DDoS) attacks against the infrastructure or against an endpoint can be detected and mitigated by characterising anomalous traffic.
Other uses include support for security audits (e.g., verifying the compliance with ciphersuites); client and application fingerprinting for inventory; and alerting for network intrusion detection and other next generation firewall functions.

Current practices in detection and mitigation of DDoS attacks generally involve classification of incoming traffic (as packets, flows, or some other aggregate) into "good" (productive) and "bad" (DDoS) traffic, and then differential treatment of this traffic to forward only good traffic. This operation is often done in a separate specialized mitigation environment through which all traffic is filtered; a generalized architecture for separation of concerns in mitigation is given in [DOTS-ARCH].

Efficient classification of this DDoS traffic in the mitigation environment is key to the success of this approach. Limited first-packet garbage detection as in Section 3.1.2 and stateful tracking of QUIC traffic as in Section 4.2 above may be useful during classification.

Note that the use of a connection ID to support connection migration renders 5-tuple based filtering insufficient to detect active flows and requires more state to be maintained by DDoS defense systems if support of migration of QUIC flows is desired. For the common case of NAT rebinding, where the client's address changes without the client's intent or knowledge, DDoS defense systems can detect a change in the client's endpoint address by linking flows based on the server's connection IDs. However, QUIC's linkability resistance ensures that a deliberate connection migration is accompanied by a change in the connection ID. In this case, the connection ID cannot be used to distinguish valid, active traffic from new attack traffic.

It is also possible for endpoints to directly support security functions such as DoS classification and mitigation. Endpoints can cooperate with an in-network device directly, e.g., by sharing information about connection IDs.

Another potential method could use an on-path network device that relies on pattern inferences in the traffic and heuristics or machine learning instead of processing observed header information.

However, it is questionable whether connection migrations must be supported during a DDoS attack. While unintended migration without a connection ID change can be more easily supported, it might be acceptable to not support migrations of active QUIC connections that are not visible to the network functions performing the DDoS detection. As soon as the connection blocking is detected by the client, the client may be able to rely on the fast resumption mechanism provided by QUIC. When clients migrate to a new path, they should be prepared for the migration to fail and attempt to reconnect quickly.

Beyond in-network DDoS protection mechanisms, TCP syncookies [RFC4937] are a well-established method of mitigating some kinds of TCP DDoS attacks. QUIC Retry packets are the functional analogue to syncookies, forcing clients to prove possession of their IP address before committing server state. However, there are safeguards in QUIC against unsolicited injection of these packets by intermediaries who do not have consent of the end server. See [QUIC_LB] for standard ways for intermediaries to send Retry packets on behalf of consenting servers.

4.8. Quality of Service Handling and ECMP Routing

It is expected that any QoS handling in the network, e.g.
based on use of DiffServ Code Points (DSCPs) [RFC2475] as well as Equal-Cost Multi-Path (ECMP) routing, is applied on a per-flow basis (and not per-packet) such that all packets belonging to the same QUIC connection get uniform treatment.

Using ECMP to distribute packets from a single flow across multiple network paths or any other non-uniform treatment of packets belonging to the same connection could result in variations in order, delivery rate, and drop rate. As feedback about loss or delay of each packet is used as input to the congestion controller, these variations could adversely affect performance. Depending on the loss recovery mechanism implemented, QUIC may be more tolerant of packet reordering than traditional TCP traffic (see Section 2.7). However, the recovery mechanism used by a flow cannot be known by the network and therefore reordering tolerance should be considered as unknown.

4.9. Handling ICMP Messages

Datagram Packetization Layer PMTU Discovery (DPLPMTUD) can be used by QUIC to probe for the supported PMTU. DPLPMTUD optionally uses ICMP messages (e.g., IPv6 Packet Too Big messages). Given known attacks with the use of ICMP messages, the use of DPLPMTUD in QUIC has been designed to safely use but not rely on receiving ICMP feedback (see Section 14.2.1 of [QUIC-TRANSPORT]).

Networks are recommended to forward these ICMP messages and retain as much of the original packet as possible without exceeding the minimum MTU for the IP version when generating ICMP messages as recommended in [RFC1812] and [RFC4443].

4.10. Guiding Path MTU

Some network segments support 1500-byte packets, but can only do so by fragmenting at a lower layer before traversing a network segment with a smaller MTU, and then reassembling within the network segment.
   This is permissible even when the IP layer is IPv6 or IPv4 with the
   DF bit set, because fragmentation occurs below the IP layer. However,
   this process can add to compute and memory costs, leading to a
   bottleneck that limits network capacity. In such networks, this
   generates a desire to influence a majority of senders to use smaller
   packets, to avoid exceeding limited reassembly capacity.

   For TCP, MSS clamping (Section 3.2 of [RFC4459]) is often used to
   change the sender's TCP maximum segment size, but QUIC requires a
   different approach. Section 14 of [QUIC-TRANSPORT] advises senders
   to probe larger sizes using Datagram Packetization Layer PMTU
   Discovery ([DPLPMTUD]) or Path Maximum Transmission Unit Discovery
   (PMTUD: [RFC1191] and [RFC8201]). This mechanism encourages senders
   to approach the maximum packet size, which could then cause
   fragmentation within a network segment of which they may not be
   aware.

   If path performance is limited when forwarding larger packets, an
   on-path device should support a maximum packet size for a specific
   transport flow and then consistently drop all packets that exceed the
   configured size when the inner IPv4 packet has DF set, or IPv6 is
   used.

   Networks with configurations that would lead to fragmentation of
   large packets within a network segment should drop such packets
   rather than fragmenting them. Network operators who plan to
   implement a more selective policy may start by focusing on QUIC.

   QUIC flows cannot always be easily distinguished from other UDP
   traffic, but we assume at least some portion of QUIC traffic can be
   identified (see Section 3.1). For networks supporting QUIC, it is
   recommended that a path drop any packet larger than the
   fragmentation size. When a QUIC endpoint uses DPLPMTUD, it will use
   a QUIC probe packet to discover the PMTU.
   If this probe is lost, it will not impact the flow of QUIC data.

   IPv4 routers generate an ICMP message when a packet is dropped
   because the link MTU was exceeded. [RFC8504] specifies how an IPv6
   node generates an ICMPv6 Packet Too Big message (PTB) in this case.
   PMTUD relies upon an endpoint receiving such PTB messages [RFC8201],
   whereas DPLPMTUD does not rely upon these messages, but can still
   optionally use them to improve performance (Section 4.6 of
   [DPLPMTUD]).

   A network cannot know in advance which discovery method is used by a
   QUIC endpoint, so it should send a PTB message in addition to
   dropping an oversized packet. A generated PTB message should be
   compliant with the validation requirements of Section 14.2.1 of
   [QUIC-TRANSPORT]; otherwise, it will be ignored for PMTU discovery.
   This provides a signal to the endpoint to prevent the packet size
   from growing too large, which can entirely avoid network segment
   fragmentation for that flow.

   Endpoints can cache PMTU information in the IP-layer cache. This
   short-term consistency between the PMTU for flows can help avoid an
   endpoint using a PMTU that is inefficient. The IP cache can also
   influence the PMTU value of other IP flows that use the same path
   [RFC8201][RFC8899], including IP packets carrying protocols other
   than QUIC. The representation of an IP path is
   implementation-specific [RFC8201].

5.  IANA Considerations

   This document has no actions for IANA.

6.  Security Considerations

   QUIC is an encrypted and authenticated transport. That means, once
   the cryptographic handshake is complete, QUIC endpoints discard most
   packets that are not authenticated, greatly limiting the ability of
   an attacker to interfere with existing connections.

   However, some information is still observable, as supporting
   manageability of QUIC traffic inherently involves tradeoffs with the
   confidentiality of QUIC's control information; this entire document
   is therefore security-relevant.

   More security considerations for QUIC are discussed in
   [QUIC-TRANSPORT] and [QUIC-TLS], generally considering active or
   passive attackers in the network as well as attacks on specific QUIC
   mechanisms.

   Version Negotiation packets do not contain any mechanism to prevent
   version downgrade attacks. However, future versions of QUIC that use
   Version Negotiation packets are required to define a mechanism that
   is robust against version downgrade attacks. Therefore, a network
   node should not attempt to impact version selection, as version
   downgrade may result in connection failure.

7.  Contributors

   The following people have contributed significant text to and/or
   feedback on this document:

   *  Chris Box

   *  Dan Druta

   *  David Schinazi

   *  Gorry Fairhurst

   *  Ian Swett

   *  Igor Lubashev

   *  Jana Iyengar

   *  Jared Mauch

   *  Lars Eggert

   *  Lucas Purdue

   *  Marcus Ilhar

   *  Mark Nottingham

   *  Martin Duke

   *  Martin Thomson

   *  Matt Joras

   *  Mike Bishop

   *  Nick Banks

   *  Thomas Fossati

   *  Sean Turner

8.  Acknowledgments

   This work was partially supported by the European Commission under
   Horizon 2020 grant agreement no. 688421 Measurement and Architecture
   for a Middleboxed Internet (MAMI), and by the Swiss State Secretariat
   for Education, Research, and Innovation under contract no. 15.0268.
   This support does not imply endorsement.

9.  References

9.1.  Normative References

   [QUIC-TLS] Thomson, M. and S. Turner, "Using TLS to Secure QUIC",
              Work in Progress, Internet-Draft, draft-ietf-quic-tls-34,
              14 January 2021.

   [QUIC-TRANSPORT]
              Iyengar, J. and M.
              Thomson, "QUIC: A UDP-Based Multiplexed and Secure
              Transport", Work in Progress, Internet-Draft,
              draft-ietf-quic-transport-34, 14 January 2021.

9.2.  Informative References

   [DOTS-ARCH]
              Mortensen, A., Ed., Reddy.K, T., Ed., Andreasen, F.,
              Teague, N., and R. Compton, "DDoS Open Threat Signaling
              (DOTS) Architecture", RFC 8811, DOI 10.17487/RFC8811,
              August 2020.

   [DPLPMTUD] Fairhurst, G., Jones, T., Tüxen, M., Rüngeler, I., and T.
              Völker, "Packetization Layer Path MTU Discovery for
              Datagram Transports", RFC 8899, DOI 10.17487/RFC8899,
              September 2020.

   [I-D.ietf-quic-applicability]
              Kuehlewind, M. and B. Trammell, "Applicability of the QUIC
              Transport Protocol", Work in Progress, Internet-Draft,
              draft-ietf-quic-applicability-12, 30 June 2021.

   [I-D.ietf-tsvwg-transport-encrypt]
              Fairhurst, G. and C. Perkins, "Considerations around
              Transport Header Confidentiality, Network Operations, and
              the Evolution of Internet Transport Protocols", Work in
              Progress, Internet-Draft,
              draft-ietf-tsvwg-transport-encrypt-21, 20 April 2021.

   [IPIM]     Allman, M., Beverly, R., and B. Trammell, "In-Protocol
              Internet Measurement (arXiv preprint 1612.02902)",
              9 December 2016.

   [QUIC-APPLICABILITY]
              Kuehlewind, M. and B. Trammell, "Applicability of the QUIC
              Transport Protocol", Work in Progress, Internet-Draft,
              draft-ietf-quic-applicability-12, 30 June 2021.

   [QUIC-HTTP]
              Bishop, M., "Hypertext Transfer Protocol Version 3
              (HTTP/3)", Work in Progress, Internet-Draft,
              draft-ietf-quic-http-34, 2 February 2021.

   [QUIC-INVARIANTS]
              Thomson, M., "Version-Independent Properties of QUIC",
              Work in Progress, Internet-Draft,
              draft-ietf-quic-invariants-13, 14 January 2021.

   [QUIC-RECOVERY]
              Iyengar, J. and I.
              Swett, "QUIC Loss Detection and Congestion Control", Work
              in Progress, Internet-Draft, draft-ietf-quic-recovery-34,
              14 January 2021.

   [QUIC_LB]  Duke, M. and N. Banks, "QUIC-LB: Generating Routable QUIC
              Connection IDs", Work in Progress, Internet-Draft,
              draft-ietf-quic-load-balancers-07, 9 July 2021.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              DOI 10.17487/RFC1191, November 1990.

   [RFC1812]  Baker, F., Ed., "Requirements for IP Version 4 Routers",
              RFC 1812, DOI 10.17487/RFC1812, June 1995.

   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
              and W. Weiss, "An Architecture for Differentiated
              Services", RFC 2475, DOI 10.17487/RFC2475, December 1998.

   [RFC3449]  Balakrishnan, H., Padmanabhan, V., Fairhurst, G., and M.
              Sooriyabandara, "TCP Performance Implications of Network
              Path Asymmetry", BCP 69, RFC 3449, DOI 10.17487/RFC3449,
              December 2002.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, Ed., "Internet
              Control Message Protocol (ICMPv6) for the Internet
              Protocol Version 6 (IPv6) Specification", STD 89,
              RFC 4443, DOI 10.17487/RFC4443, March 2006.

   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with
              In-the-Network Tunneling", RFC 4459,
              DOI 10.17487/RFC4459, April 2006.

   [RFC4787]  Audet, F., Ed. and C. Jennings, "Network Address
              Translation (NAT) Behavioral Requirements for Unicast
              UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January
              2007.

   [RFC4937]  Arberg, P. and V. Mammoliti, "IANA Considerations for PPP
              over Ethernet (PPPoE)", RFC 4937, DOI 10.17487/RFC4937,
              June 2007.

   [RFC5382]  Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P.
              Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
              RFC 5382, DOI 10.17487/RFC5382, October 2008.

   [RFC6066]  Eastlake 3rd, D., "Transport Layer Security (TLS)
              Extensions: Extension Definitions", RFC 6066,
              DOI 10.17487/RFC6066, January 2011.

   [RFC7301]  Friedl, S., Popov, A., Langley, A., and E. Stephan,
              "Transport Layer Security (TLS) Application-Layer Protocol
              Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301,
              July 2014.

   [RFC7605]  Touch, J., "Recommendations on Using Assigned Transport
              Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605,
              August 2015.

   [RFC7801]  Dolmatov, V., Ed., "GOST R 34.12-2015: Block Cipher
              "Kuznyechik"", RFC 7801, DOI 10.17487/RFC7801, March 2016.

   [RFC7983]  Petit-Huguenin, M. and G. Salgueiro, "Multiplexing Scheme
              Updates for Secure Real-time Transport Protocol (SRTP)
              Extension for Datagram Transport Layer Security (DTLS)",
              RFC 7983, DOI 10.17487/RFC7983, September 2016.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

   [RFC8504]  Chown, T., Loughney, J., and T. Winters, "IPv6 Node
              Requirements", BCP 220, RFC 8504, DOI 10.17487/RFC8504,
              January 2019.

   [RFC8899]  Fairhurst, G., Jones, T., Tüxen, M., Rüngeler, I., and T.
              Völker, "Packetization Layer Path MTU Discovery for
              Datagram Transports", RFC 8899, DOI 10.17487/RFC8899,
              September 2020.

   [TLS-ECH]  Rescorla, E., Oku, K., Sullivan, N., and C. A. Wood, "TLS
              Encrypted Client Hello", Work in Progress, Internet-Draft,
              draft-ietf-tls-esni-13, 12 August 2021.

   [TMA-QOF]  Trammell, B., Gugelmann, D., and N. Brownlee, "Inline Data
              Integrity Signals for Passive Measurement (in Proc. TMA
              2014)", April 2014.

   [WIRE-IMAGE]
              Trammell, B. and M. Kuehlewind, "The Wire Image of a
              Network Protocol", RFC 8546, DOI 10.17487/RFC8546, April
              2019.

Authors' Addresses

   Mirja Kuehlewind
   Ericsson

   Email: mirja.kuehlewind@ericsson.com

   Brian Trammell
   Google Switzerland GmbH
   Gustav-Gull-Platz 1
   CH-8004 Zurich
   Switzerland

   Email: ietf@trammell.ch