Network Working Group                                      M. Kuehlewind
Internet-Draft                                                  Ericsson
Intended status: Informational                               B. Trammell
Expires: 1 January 2022                          Google Switzerland GmbH
                                                            30 June 2021

              Manageability of the QUIC Transport Protocol
                    draft-ietf-quic-manageability-12

Abstract

   This document discusses manageability of the QUIC transport protocol,
   focusing on the implications of QUIC's design and wire image on
   network operations involving QUIC traffic. Its intended audience is
   network operators and equipment vendors who rely on the use of
   transport-aware network functions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 1 January 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document. Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Features of the QUIC Wire Image
     2.1.  QUIC Packet Header Structure
     2.2.  Coalesced Packets
     2.3.  Use of Port Numbers
     2.4.  The QUIC Handshake
     2.5.  Integrity Protection of the Wire Image
     2.6.  Connection ID and Rebinding
     2.7.  Packet Numbers
     2.8.  Version Negotiation and Greasing
   3.  Network-Visible Information about QUIC Flows
     3.1.  Identifying QUIC Traffic
       3.1.1.  Identifying Negotiated Version
       3.1.2.  First Packet Identification for Garbage Rejection
     3.2.  Connection Confirmation
     3.3.  Distinguishing Acknowledgment Traffic
     3.4.  Server Name Indication (SNI)
       3.4.1.  Extracting Server Name Indication (SNI) Information
     3.5.  Flow Association
     3.6.  Flow Teardown
     3.7.  Flow Symmetry Measurement
     3.8.  Round-Trip Time (RTT) Measurement
       3.8.1.  Measuring Initial RTT
       3.8.2.  Using the Spin Bit for Passive RTT Measurement
   4.  Specific Network Management Tasks
     4.1.  Passive Network Performance Measurement and Troubleshooting
     4.2.  Stateful Treatment of QUIC Traffic
     4.3.  Address Rewriting to Ensure Routing Stability
     4.4.  Server Cooperation with Load Balancers
     4.5.  Filtering Behavior
     4.6.  UDP Blocking or Throttling
     4.7.  DDoS Detection and Mitigation
     4.8.  Quality of Service Handling and ECMP Routing
     4.9.  Handling ICMP Messages
     4.10. Guiding Path MTU
   5.  IANA Considerations
   6.  Security Considerations
   7.  Contributors
   8.  Acknowledgments
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   QUIC [QUIC-TRANSPORT] is a new transport protocol that is
   encapsulated in UDP. QUIC integrates TLS [QUIC-TLS] to encrypt all
   payload data and most control information. QUIC version 1 was
   designed primarily as a transport for HTTP, with the resulting
   protocol being known as HTTP/3 [QUIC-HTTP].

   This document provides guidance for network operations that manage
   QUIC traffic.
   This includes guidance on how to interpret and utilize
   information that is exposed by QUIC to the network, requirements and
   assumptions of the QUIC design with respect to network treatment, and
   a description of how common network management practices will be
   impacted by QUIC.

   QUIC is an end-to-end transport protocol. No information in the
   protocol header, even that which can be inspected, is mutable by the
   network. This is achieved through integrity protection of the wire
   image [WIRE-IMAGE]. Encryption of most control signaling means that
   less information is visible to the network than is the case with TCP.

   Integrity protection can also simplify troubleshooting, because none
   of the nodes on the network path can modify transport layer
   information. However, it means in-network operations that depend on
   modification of data are not possible without the cooperation of a
   QUIC endpoint. This might be possible with the introduction of a
   proxy which authenticates as an endpoint. Proxy operations are not
   in scope for this document.

   Network management is not a one-size-fits-all endeavour: practices
   considered necessary or even mandatory within enterprise networks
   with certain compliance requirements, for example, would be
   impermissible on other networks without those requirements. This
   document therefore does not make any specific recommendations as to
   which practices should or should not be applied; for each practice,
   it describes what is and is not possible with the QUIC transport
   protocol as defined.

2.  Features of the QUIC Wire Image

   This section discusses those aspects of the QUIC transport protocol
   that have an impact on the design and operation of devices that
   forward QUIC packets.
   This section therefore primarily
   considers the unencrypted part of QUIC's wire image [WIRE-IMAGE],
   which is defined as the information available in the packet header in
   each QUIC packet, and the dynamics of that information. Since QUIC
   is a versioned protocol, the wire image of the header format can also
   change from version to version. However, the field that identifies
   the QUIC version in some packets, and the format of the Version
   Negotiation Packet, are both inspectable and invariant
   [QUIC-INVARIANTS].

   This document describes version 1 of the QUIC protocol, whose wire
   image is fully defined in [QUIC-TRANSPORT] and [QUIC-TLS]. Features
   of the wire image described herein may change in future versions of
   the protocol, except when specified as an invariant
   [QUIC-INVARIANTS], and cannot be used to identify QUIC as a protocol
   or to infer the behavior of future versions of QUIC.

2.1.  QUIC Packet Header Structure

   QUIC packets may have either a long header or a short header. The
   first bit of the QUIC header is the Header Form bit, and indicates
   which type of header is present. The purpose of this bit is
   invariant across QUIC versions.

   The long header exposes more information. It contains a version
   number, as well as source and destination connection IDs for
   associating packets with a QUIC connection. The definition and
   location of these fields in the QUIC long header are invariant for
   future versions of QUIC, although future versions of QUIC may provide
   additional fields in the long header [QUIC-INVARIANTS].

   In version 1 of QUIC, the long header is used during connection
   establishment to transmit crypto handshake data, perform version
   negotiation, retry, and send 0-RTT data.

   Short headers contain only an optional destination connection ID and
   the spin bit for RTT measurement. In version 1 of QUIC, they are
   used after connection establishment.
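
   As an illustration of the version-independent long header fields
   described above, the following is a minimal Python sketch of how an
   observer might read them from a UDP payload. The names
   LongHeaderInfo and parse_invariant_header are hypothetical and are
   not defined by any QUIC specification. The sketch assumes the long
   header layout of [QUIC-INVARIANTS]: one octet carrying the Header
   Form bit, a four-octet version, and each connection ID preceded by a
   one-octet length. A successful parse does not prove that a datagram
   is QUIC.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LongHeaderInfo:
    """Version-independent long header fields (hypothetical container)."""
    version: int
    dcid: bytes
    scid: bytes
    is_version_negotiation: bool


def parse_invariant_header(datagram: bytes) -> Optional[LongHeaderInfo]:
    """Return the invariant long header fields, or None.

    None is returned for short header packets (Header Form bit clear)
    and for inputs too short to hold the fields.
    """
    # Need the first octet, four version octets, and the DCID length.
    if len(datagram) < 6 or not datagram[0] & 0x80:
        return None
    version = int.from_bytes(datagram[1:5], "big")
    pos = 5
    dcid_len = datagram[pos]
    pos += 1
    # The DCID must be followed by at least the SCID length octet.
    if len(datagram) < pos + dcid_len + 1:
        return None
    dcid = datagram[pos:pos + dcid_len]
    pos += dcid_len
    scid_len = datagram[pos]
    pos += 1
    if len(datagram) < pos + scid_len:
        return None
    scid = datagram[pos:pos + scid_len]
    # Version 0x00000000 marks a Version Negotiation packet.
    return LongHeaderInfo(version, dcid, scid, version == 0)
```

   Only fields that are invariant across versions are read here, so the
   sketch deliberately ignores the remaining bits of the first octet.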

   The following information is exposed in QUIC packet headers in all
   versions of QUIC:

   *  version number: the version number is present in the long header,
      and identifies the version used for that packet. During Version
      Negotiation (see Section 17.2.1 of [QUIC-TRANSPORT] and
      Section 2.8), the version number field has a special value
      (0x00000000) that identifies the packet as a Version Negotiation
      packet. QUIC version 1 uses version 0x00000001. Operators should
      expect to observe packets with other version numbers as a result
      of various Internet experiments, future standards, and greasing
      ([RFC7801]). All deployed versions are maintained in an IANA
      registry (see Section 22.2 of [QUIC-TRANSPORT]).

   *  source and destination connection ID: short and long packet
      headers carry a destination connection ID, a variable-length field
      that can be used to identify the connection associated with a QUIC
      packet, for load-balancing and NAT rebinding purposes; see
      Section 4.4 and Section 2.6. Long packet headers additionally
      carry a source connection ID. The source connection ID
      corresponds to the destination connection ID the source would like
      to have on packets sent to it, and is only present on long packet
      headers. On long header packets, the length of the connection IDs
      is also present; on short header packets, the length of the
      destination connection ID is implicit.

   In version 1 of QUIC, the following additional information is
   exposed:

   *  "fixed bit": The second-most-significant bit of the first octet of
      most QUIC packets of the current version is set to 1, enabling
      endpoints to demultiplex with other UDP-encapsulated protocols.
      Even though this bit is fixed in the version 1 specification,
      endpoints might use an extension that varies the bit. Therefore,
      observers cannot reliably use it as an identifier for QUIC.

   *  latency spin bit: The third-most-significant bit of the first
      octet in the short packet header for version 1. The spin bit is
      set by endpoints such that tracking edge transitions can be used
      to passively observe end-to-end RTT. See Section 3.8.2 for
      further details.

   *  header type: The long header has a 2-bit packet type field
      following the Header Form and fixed bits. Header types correspond
      to stages of the handshake; see Section 17.2 of [QUIC-TRANSPORT]
      for details.

   *  length: The length of the remaining QUIC packet after the length
      field, present on long headers. This field is used to implement
      coalesced packets during the handshake (see Section 2.2).

   *  token: Initial packets may contain a token, a variable-length
      opaque value optionally sent from client to server, used for
      validating the client's address. Retry packets also contain a
      token, which can be used by the client in an Initial packet on a
      subsequent connection attempt. The length of the token is
      explicit in both cases.

   Retry (Section 17.2.5 of [QUIC-TRANSPORT]) and Version Negotiation
   (Section 17.2.1 of [QUIC-TRANSPORT]) packets are not encrypted or
   obfuscated in any way. For other kinds of packets, version 1 of QUIC
   cryptographically obfuscates other information in the packet headers:

   *  packet number: All packets except Version Negotiation and Retry
      packets have an associated packet number; however, this packet
      number is encrypted, and therefore not of use to on-path
      observers. The offset of the packet number can be decoded in long
      headers, while it is implicit (depending on destination connection
      ID length) in short headers. The length of the packet number is
      cryptographically obfuscated.

   *  key phase: The Key Phase bit, present in short headers, specifies
      the keys used to encrypt the packet to support key rotation. The
      Key Phase bit is cryptographically obfuscated.

2.2.  Coalesced Packets

   Multiple QUIC packets may be coalesced into a single UDP datagram,
   with a datagram carrying one or more long header packets followed by
   zero or one short header packets. When packets are coalesced, the
   Length fields in the long headers are used to separate QUIC packets;
   see Section 12.2 of [QUIC-TRANSPORT]. The Length field is variable
   length, and its position in the header is also variable depending on
   the length of the source and destination connection ID; see
   Section 17.2 of [QUIC-TRANSPORT].

2.3.  Use of Port Numbers

   Applications that have a mapping for TCP as well as QUIC are expected
   to use the same port number for both services. However, as for all
   other IETF transports [RFC7605], there is no guarantee that a
   specific application will use a given registered port, or that a
   given port carries traffic belonging to the respective registered
   service, especially when application layer information is encrypted.
   For example, [QUIC-HTTP] specifies the use of Alt-Svc for discovery
   of HTTP/3 services on other ports.

   Further, as QUIC has a connection ID, it is also possible to maintain
   multiple QUIC connections over one 5-tuple. However, if the
   connection ID is zero-length, all packets of the 5-tuple likely
   belong to the same QUIC connection.

2.4.  The QUIC Handshake

   New QUIC connections are established using a handshake, which is
   distinguishable on the wire and contains some information that can be
   passively observed.

   To illustrate the information visible in the QUIC wire image during
   the handshake, we first show the general communication pattern
   visible in the UDP datagrams containing the QUIC handshake, then
   examine each of the datagrams in detail.

   The QUIC handshake can normally be recognized on the wire through
   four flights of datagrams labelled "Client Initial", "Server
   Initial", "Client Completion", and "Server Completion", as
   illustrated in Figure 1.

   Packets in the handshake belong to three separate cryptographic and
   transport contexts ("Initial", which contains observable payload, and
   "Handshake" and "1-RTT", which do not). QUIC packets in separate
   contexts during the handshake can be coalesced (see Section 2.2) in
   order to reduce the number of UDP datagrams sent during the
   handshake. QUIC packets can be lost and reordered, so packets within
   a flight might not be sent close in time, though the sequence of the
   flights will not change, because one flight depends upon the peer's
   previous flight.

   As shown here, the client can send 0-RTT data as soon as it has sent
   its Client Hello, and the server can send 1-RTT data as soon as it
   has sent its Server Hello.

   Client                                    Server
     |                                          |
     +----Client Initial----------------------->|
     +----(zero or more 0RTT)------------------>|
     |                                          |
     |<-----------------------Server Initial----+
     |<---------(1RTT encrypted data starts)----+
     |                                          |
     +----Client Completion-------------------->|
     +----(1RTT encrypted data starts)--------->|
     |                                          |
     |<--------------------Server Completion----+
     |                                          |

   Figure 1: General communication pattern visible in the QUIC handshake

   A handshake starts with the client sending one or more datagrams
   containing Initial packets as shown in Figure 2, which elicits the
   Server Initial response as shown in Figure 3 typically containing
   three types of packets: Initial packet(s) with the beginning of the
   server's side of the TLS handshake, Handshake packet(s) with the rest
   of the server's portion of the TLS handshake, and 1-RTT packet(s), if
   present.
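
   Because several packet types can share one datagram, an observer
   that wants to list them has to step through the long headers using
   their Length fields (Section 2.2). The following Python sketch shows
   one way to do so for QUIC version 1 only. The function names are
   hypothetical. The 2-bit packet type values (0 = Initial, 1 = 0-RTT,
   2 = Handshake, 3 = Retry) and the variable-length integer encoding
   are taken from [QUIC-TRANSPORT]. Octets that do not start a long
   header are reported as "short/other", since a short header packet
   cannot be told apart from trailing padding without further context.

```python
from typing import List, Tuple

# Long header packet types in QUIC version 1 (2-bit field, mask 0x30).
LONG_TYPES = {0: "Initial", 1: "0-RTT", 2: "Handshake", 3: "Retry"}


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a QUIC variable-length integer; return (value, next_pos)."""
    first = buf[pos]
    size = 1 << (first >> 6)          # top two bits select 1, 2, 4, or 8 octets
    if pos + size > len(buf):
        raise ValueError("truncated variable-length integer")
    value = first & 0x3F
    for octet in buf[pos + 1:pos + size]:
        value = (value << 8) | octet
    return value, pos + size


def split_coalesced(datagram: bytes) -> List[Tuple[str, int]]:
    """List (kind, size in octets) for each packet in a UDP payload."""
    packets: List[Tuple[str, int]] = []
    pos = 0
    try:
        while pos < len(datagram):
            start = pos
            first = datagram[pos]
            if not first & 0x80:
                # No Length field: the packet runs to the end of the datagram.
                packets.append(("short/other", len(datagram) - start))
                break
            version = int.from_bytes(datagram[pos + 1:pos + 5], "big")
            if version != 1:
                # The layout after the connection IDs is version-specific.
                packets.append(("unknown version", len(datagram) - start))
                break
            kind = LONG_TYPES[(first >> 4) & 0x3]
            pos += 5
            pos += 1 + datagram[pos]   # DCID length octet and DCID
            pos += 1 + datagram[pos]   # SCID length octet and SCID
            if kind == "Retry":
                # Retry has no Length field and is not followed by others.
                packets.append((kind, len(datagram) - start))
                break
            if kind == "Initial":
                token_len, pos = read_varint(datagram, pos)
                pos += token_len
            length, pos = read_varint(datagram, pos)
            pos += length              # protected packet number and payload
            if pos > len(datagram):
                raise ValueError("Length field exceeds datagram")
            packets.append((kind, pos - start))
    except IndexError:
        raise ValueError("truncated long header") from None
    return packets
```

   Applied to a datagram following the pattern of Figure 3, such a walk
   would be expected to yield an Initial entry, a Handshake entry, and a
   final "short/other" entry.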

   The Client Completion flight contains at least one Handshake packet
   and could also include an Initial packet.

   Datagrams that contain an Initial packet (Client Initial, Server
   Initial, and some Client Completion) contain at least 1200 octets of
   UDP payload. This protects against amplification attacks and
   verifies that the network path meets the requirements for the minimum
   QUIC IP packet size; see Section 14 of [QUIC-TRANSPORT]. This is
   accomplished by either adding PADDING frames within the Initial
   packet, coalescing other packets with the Initial packet, or leaving
   unused payload in the UDP packet after the Initial packet. A network
   path needs to be able to forward at least this size of packet for
   QUIC to be used.

   The content of Initial packets is encrypted using Initial Secrets,
   which are derived from a per-version constant and the client's
   destination connection ID; they are therefore observable by any
   on-path device that knows the per-version constant and are considered
   visible in this illustration. The content of QUIC Handshake packets
   is encrypted using keys established during the initial handshake
   exchange, and is therefore not visible.

   Initial, Handshake, and 1-RTT packets belong to different
   cryptographic and transport contexts. The Client Completion
   (Figure 4) and Server Completion (Figure 5) flights conclude the
   Initial and Handshake contexts, by sending final acknowledgments and
   CRYPTO frames.

   +----------------------------------------------------------+
   | UDP header (source and destination UDP ports)            |
   +----------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                 |  |
   +----------------------------------------------------------+  |
   | | TLS Client Hello (incl. TLS SNI)                     | |  |
   +----------------------------------------------------------+  |
   | QUIC PADDING frames                                      |  |
   +----------------------------------------------------------+<-+

         Figure 2: Example Client Initial datagram without 0-RTT

   A Client Initial packet exposes the version, source and destination
   connection IDs without encryption. The payload of the Initial packet
   is obfuscated using the Initial secret. The complete TLS Client
   Hello, including any TLS Server Name Indication (SNI) present, is
   sent in one or more CRYPTO frames across one or more QUIC Initial
   packets.

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID)   (Length)
   +------------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                   |  |
   +------------------------------------------------------------+  |
   | TLS Server Hello                                           |  |
   +------------------------------------------------------------+  |
   | QUIC ACK frame (acknowledging client hello)                |  |
   +------------------------------------------------------------+<-+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably CRYPTO frames)               |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

           Figure 3: Coalesced Server Initial datagram pattern

   The Server Initial datagram also exposes version number, source and
   destination connection IDs in the clear; the payload of the Initial
   packet(s) is obfuscated using the Initial secret.
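
   To make concrete why Initial packets are "considered visible", the
   following Python sketch derives the client and server Initial
   secrets from the destination connection ID of the client's first
   Initial packet, using only the standard library. It is a sketch
   under stated assumptions, not a reference implementation. The
   function names are hypothetical. The salt is intended to be the
   version 1 constant from Section 5.2 of [QUIC-TLS] and should be
   checked against that document before use. The labels "client in"
   and "server in" and the use of HKDF with SHA-256 follow the same
   section. Reading the payload additionally requires deriving the
   packet protection keys, removing header protection, and AEAD
   decryption, none of which is shown.

```python
import hashlib
import hmac
from typing import Tuple

# Assumed per-version constant (initial salt) for QUIC version 1;
# verify against Section 5.2 of [QUIC-TLS] before relying on it.
INITIAL_SALT_V1 = bytes.fromhex("38762cf7f55934b34d179ae6a4c80cadccbb7f0a")


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract with SHA-256 (RFC 5869)."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand_label(secret: bytes, label: bytes, length: int) -> bytes:
    """TLS 1.3 HKDF-Expand-Label with an empty context (RFC 8446)."""
    full_label = b"tls13 " + label
    info = (length.to_bytes(2, "big")
            + bytes([len(full_label)]) + full_label
            + b"\x00")                      # zero-length context
    out = b""
    block = b""
    counter = 1
    while len(out) < length:
        block = hmac.new(secret, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]


def initial_secrets(client_dcid: bytes) -> Tuple[bytes, bytes]:
    """Return (client_initial_secret, server_initial_secret)."""
    initial = hkdf_extract(INITIAL_SALT_V1, client_dcid)
    return (hkdf_expand_label(initial, b"client in", 32),
            hkdf_expand_label(initial, b"server in", 32))
```

   Note that both directions use the destination connection ID from
   the client's first Initial packet, so an observer that missed that
   packet cannot derive the secrets from the Server Initial alone.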

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID)   (Length)
   +------------------------------------------------------------+  |
   | QUIC ACK frame (acknowledging Server Initial)              |  |
   +------------------------------------------------------------+<-+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably CRYPTO/ACK frames)           |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

         Figure 4: Coalesced Client Completion datagram pattern

   The Client Completion flight does not expose any additional
   information; however, as the destination connection ID is
   server-selected, it is usually not the same ID as in the Client
   Initial. Client Completion flights contain 1-RTT packets, which
   indicate that the handshake has completed on the client (see
   Section 3.2) and can be used for three-way handshake RTT estimation
   as in Section 3.8.

   +------------------------------------------------------------+
   | UDP header (source and destination UDP ports)              |
   +------------------------------------------------------------+
   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
   +------------------------------------------------------------+  |
   | encrypted payload (presumably ACK frame)                   |  |
   +------------------------------------------------------------+<-+
   | QUIC short header                                          |
   +------------------------------------------------------------+
   | 1-RTT encrypted payload                                    |
   +------------------------------------------------------------+

         Figure 5: Coalesced Server Completion datagram pattern

   Similar to Client Completion, Server Completion also exposes no
   additional information; observing it serves only to determine that
   the handshake has completed.

   When the client uses 0-RTT connection resumption, the Client Initial
   flight can also include one or more 0-RTT packets, as shown in
   Figure 6.

   +----------------------------------------------------------+
   | UDP header (source and destination UDP ports)            |
   +----------------------------------------------------------+
   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
   +----------------------------------------------------------+  |
   | QUIC CRYPTO frame header                                 |  |
   +----------------------------------------------------------+  |
   | TLS Client Hello (incl. TLS SNI)                         |  |
   +----------------------------------------------------------+<-+
   | QUIC long header (type = 0RTT, Version, DCID, SCID)    (Length)
   +----------------------------------------------------------+  |
   | 0-RTT encrypted payload                                  |  |
   +----------------------------------------------------------+<-+

            Figure 6: Coalesced 0-RTT Client Initial datagram

   When a 0-RTT packet is coalesced with an Initial packet, the datagram
   will be padded to 1200 bytes.
   Additional datagrams containing only
   0-RTT packets with long headers can be sent after the client Initial
   packet(s), containing more 0-RTT data. The amount of 0-RTT protected
   data that can be sent in the first flight is limited by the initial
   congestion window, typically to around 10 packets (see Section 7.2 of
   [QUIC-RECOVERY]).

2.5.  Integrity Protection of the Wire Image

   As soon as the cryptographic context is established, all information
   in the QUIC header, including exposed information, is integrity
   protected. Further, information that was exposed in packets sent
   before the cryptographic context was established is validated during
   the cryptographic handshake. Therefore, devices on path cannot alter
   any information or bits in QUIC packets. Such alterations would
   cause the integrity check to fail, which results in the receiver
   discarding the packet. Some parts of Initial packets could be
   altered by removing and re-applying the authenticated encryption
   without immediate discard at the receiver. However, the
   cryptographic handshake validates most fields and any modifications
   in those fields will result in connection establishment failing
   later.

2.6.  Connection ID and Rebinding

   The connection ID in the QUIC packet headers allows association of
   QUIC packets using information independent of the five-tuple. This
   allows rebinding of a connection after one of the endpoints
   experienced an address change - usually the client. Further, it can
   be used by in-network devices to ensure that related 5-tuple flows
   are appropriately balanced together.

   Client and server each choose a connection ID during the handshake;
   for example, a server might request that a client use a connection
   ID, whereas the client might choose a zero-length value.
   Connection
   IDs for either endpoint may change during the lifetime of a
   connection, with the new connection ID being supplied via encrypted
   frames (see Section 5.1 of [QUIC-TRANSPORT]). Therefore, observing a
   new connection ID does not necessarily indicate a new connection.

   [QUIC_LB] specifies algorithms for encoding the server mapping in a
   connection ID in order to share this information with selected
   on-path devices such as load balancers. Server mappings should only
   be exposed to selected entities. Uncontrolled exposure would allow
   linkage of multiple IP addresses to the same host if the server also
   supports migration, which opens an attack vector on specific servers
   or pools. The best way to obscure an encoding is to appear random to
   any other observers, which is most rigorously achieved with
   encryption. As a result, any attempt to infer information from
   specific parts of a connection ID is unlikely to be useful.

2.7.  Packet Numbers

   The Packet Number field is always present in the QUIC packet header
   in version 1; however, it is always encrypted. The encryption key
   for packet number protection on Initial packets -- which are sent
   before cryptographic context establishment -- is specific to the QUIC
   version, while packet number protection on subsequent packets uses
   secrets derived from the end-to-end cryptographic context. Packet
   numbers are therefore not part of the wire image that is visible to
   on-path observers.

2.8.  Version Negotiation and Greasing

   Version Negotiation packets are used by the server to indicate that a
   requested version from the client is not supported (see Section 6 of
   [QUIC-TRANSPORT]). Version Negotiation packets are not intrinsically
   protected, but future QUIC versions will use later encrypted messages
   to verify that they were authentic.
   Therefore, any modification of
   this list will be detected and may cause the endpoints to terminate
   the connection attempt.

   Also note that the list of versions in the Version Negotiation packet
   may contain reserved versions. This mechanism is used to avoid
   ossification in the implementation of the selection mechanism.
   Further, a client may send an Initial packet with a reserved version
   number to trigger version negotiation. In the Version Negotiation
   packet, the connection IDs of the client's Initial packet are
   reflected to provide a proof of return-routability. Therefore,
   changing this information will also cause the connection to fail.

   QUIC is expected to evolve rapidly, so new versions, both
   experimental and IETF standard versions, will be deployed on the
   Internet more often than with traditional Internet- and
   transport-layer protocols. Using a particular version number to
   recognize valid QUIC traffic is likely to persistently miss a
   fraction of QUIC flows and completely fail in the near future, and is
   therefore not recommended. In addition, due to the speed of
   evolution of the protocol, devices that attempt to distinguish QUIC
   traffic from non-QUIC traffic for purposes of network admission
   control should admit all QUIC traffic regardless of version.

3.  Network-Visible Information about QUIC Flows

   This section addresses the different kinds of observations and
   inferences that can be made about QUIC flows by a passive observer in
   the network based on the wire image in Section 2. Here we assume a
   bidirectional observer (one that can see packets in both directions
   in the sequence in which they are carried on the wire) unless noted,
   but typically without access to any keying information.

3.1.  Identifying QUIC Traffic

   The QUIC wire image is not specifically designed to be
   distinguishable from other UDP traffic by a passive observer in the
   network.

   The only application binding defined by the IETF QUIC WG is HTTP/3
   [QUIC-HTTP] at the time of this writing; however, many other
   applications are currently being defined and deployed over QUIC, so
   an assumption that all QUIC traffic is HTTP/3 is not valid. HTTP/3
   uses UDP port 443 by convention but various methods can be used to
   specify alternate port numbers. Simple assumptions about whether a
   given flow is using QUIC based upon a UDP port number may therefore
   not hold; see also Section 5 of [RFC7605].

   While the second-most-significant bit (0x40) of the first octet is
   set to 1 in most QUIC packets of the current version (see Section 2.1
   and Section 17 of [QUIC-TRANSPORT]), this method of recognizing QUIC
   traffic is not reliable. First, it only provides one bit of
   information and is prone to collision with UDP-based protocols other
   than those considered in [RFC7983]. Second, this feature of the wire
   image is not invariant [QUIC-INVARIANTS] and may change in future
   versions of the protocol, or even be negotiated during the handshake
   via the use of an extension.

   Even though transport parameters transmitted in the client's Initial
   packet are observable by the network, they cannot be modified by the
   network without causing connection failure. Further, the reply from
   the server cannot be observed, so observers on the network cannot
   know which parameters are actually in use.

3.1.1.  Identifying Negotiated Version

   An in-network observer assuming that a set of packets belongs to a
   QUIC flow might infer the version number in use by observing the
   handshake: for QUIC version 1, if the version number in the Initial
   packet from a client is the same as the version number in the Initial
   packet of the server response, that version has been accepted by both
   endpoints to be used for the rest of the connection.

   The negotiated version cannot be identified for flows for which a
   handshake is not observed, such as in the case of connection
   migration; however, it might be possible to associate a flow with a
   flow for which a version has been identified; see Section 3.5.

3.1.2.  First Packet Identification for Garbage Rejection

   A related question is whether the first packet of a given flow on a
   port known to be associated with QUIC is a valid QUIC packet. This
   determination supports in-network filtering of garbage UDP packets
   (reflection attacks, random backscatter, etc.). While heuristics
   based on the first byte of the packet (packet type) could be used to
   separate valid from invalid first packet types, the deployment of
   such heuristics is not recommended, as bits in the first byte may
   have different meanings in future versions of the protocol.

3.2.  Connection Confirmation

   This document focuses on QUIC version 1, and this Connection
   Confirmation section applies only to packets belonging to QUIC
   version 1 flows; for purposes of on-path observation, it assumes that
   these packets have been identified as such through the observation of
   a version number exchange as described above.

   Connection establishment uses Initial and Handshake packets
   containing a TLS handshake, and Retry packets that do not contain
   parts of the handshake. Connection establishment can therefore be
   detected using heuristics similar to those used to detect TLS over
   TCP.
A client initiating a connection may also send data in 0-RTT 644 packets directly after the Initial packet containing the TLS 645 ClientHello. Since packets may be reordered or lost in the network, 0-RTT 646 packets could be seen before the Initial packet. 648 Note that in this version of QUIC, clients send Initial packets 649 before servers do, servers send Handshake packets before clients do, 650 and only clients send Initial packets with tokens. Therefore, an 651 endpoint can be identified as a client or server by an on-path 652 observer. An attempted connection after Retry can be detected by 653 correlating the contents of the Retry packet with the Token and the 654 Destination Connection ID fields of the new Initial packet. 656 3.3. Distinguishing Acknowledgment Traffic 658 Some deployed in-network functions distinguish pure-acknowledgment 659 (ACK) packets from packets carrying upper-layer data in order to 660 attempt to enhance performance, for example by queueing ACKs 661 differently or manipulating ACK signaling [RFC3449]. Distinguishing 662 ACK packets is possible in TCP, but is not supported by QUIC, since 663 acknowledgment signaling is carried inside QUIC's encrypted payload, 664 and ACK manipulation is impossible. Specifically, heuristics 665 attempting to distinguish ACK-only packets from payload-carrying 666 packets based on packet size are likely to fail, and are not 667 recommended as a way to infer internals of QUIC's operation, 668 as those mechanisms can change, e.g., due to the use of extensions. 670 3.4. Server Name Indication (SNI) 672 The client's TLS ClientHello may contain a Server Name Indication 673 (SNI) [RFC6066] extension, by which the client reveals the name of 674 the server it intends to connect to, in order to allow the server to 675 present a certificate based on that name. 
It may also contain an 676 Application-Layer Protocol Negotiation (ALPN) [RFC7301] extension, by 677 which the client exposes the names of application-layer protocols it 678 supports; an observer can deduce that one of those protocols will be 679 used if the connection continues. 681 Work is currently underway in the TLS working group to encrypt the 682 contents of the ClientHello in TLS 1.3 [TLS-ECH]. This would make 683 SNI-based application identification impossible by on-path 684 observation for QUIC and other protocols that use TLS. 686 3.4.1. Extracting Server Name Indication (SNI) Information 688 If the ClientHello is not encrypted, SNI can be derived from the 689 client's Initial packet by calculating the Initial secret to decrypt 690 the packet payload and parsing the QUIC CRYPTO frame(s) containing 691 the TLS ClientHello. 693 As both the derivation of the Initial secret and the structure of the 694 Initial packet itself are version-specific, the first step is always 695 to parse the version number (the second through fifth bytes of the 696 long header). Note that only long header packets carry the version 697 number, so it is necessary to also check if the first bit of the QUIC 698 packet is set to 1, indicating a long header. 700 Note that proprietary QUIC versions, that have been deployed before 701 standardization, might not set the first bit in a QUIC long header 702 packet to 1. However, it is expected that these versions will 703 gradually disappear over time. 705 When the version has been identified as QUIC version 1, the packet 706 type needs to be verified as an Initial packet by checking that the 707 third and fourth bits of the header are both set to 0. Then the 708 Destination Connection ID needs to be extracted from the packet. The 709 Initial secret is calculated using the version-specific Initial salt, 710 as described in Section 5.2 of [QUIC-TLS]. 
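As an illustration of the Initial secret calculation mentioned above, the following is a minimal sketch for QUIC version 1 only, using the Python standard library. The function names (hkdf_extract, hkdf_expand_label, client_initial_secret) are hypothetical and chosen for this example. The sketch stops at the client Initial secret; deriving the packet protection key, IV, and header protection key from it, and the decryption itself, are not shown. Appendix A of [QUIC-TLS] provides test vectors against which an implementation can be checked.

```python
import hashlib
import hmac

# Initial salt for QUIC version 1 (Section 5.2 of [QUIC-TLS]).
# Other versions use other salts, so the version must be checked first.
INITIAL_SALT_V1 = bytes.fromhex("38762cf7f55934b34d179ae6a4c80cadccbb7f0a")


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract with SHA-256 (RFC 5869)."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand_label(secret: bytes, label: bytes, length: int) -> bytes:
    """TLS 1.3 HKDF-Expand-Label with an empty context (RFC 8446)."""
    full_label = b"tls13 " + label
    # HkdfLabel: 2-byte output length, length-prefixed label,
    # length-prefixed (here empty) context.
    info = (length.to_bytes(2, "big")
            + bytes([len(full_label)]) + full_label
            + b"\x00")
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(secret, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def client_initial_secret(dcid: bytes) -> bytes:
    """Derive the client Initial secret from the Destination Connection ID
    carried in the client's first Initial packet."""
    initial_secret = hkdf_extract(INITIAL_SALT_V1, dcid)
    return hkdf_expand_label(initial_secret, b"client in", 32)
```

An observer would call client_initial_secret() with the Destination Connection ID of the client's first Initial packet and retain the result for later Initial packets of the same connection.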
The length of the 711 connection ID is indicated in the 6th byte of the header followed by 712 the connection ID itself. 714 Note that subsequent Initial packets might contain a Destination 715 Connection ID other than the one used to generate the Initial secret. 716 Therefore, attempts to decrypt these packets using the procedure 717 above might fail unless the Initial secret is retained by the 718 observer. 720 To determine the end of the header and find the start of the payload, 721 the packet number length, the source connection ID length, and the 722 token length need to be extracted. The packet number length is 723 defined by the seventh and eighth bits of the header as described in 724 Section 17.2 of [QUIC-TRANSPORT], but is obfuscated as described in 725 Section 5.4 of [QUIC-TLS]. The source connection ID length is 726 specified in the byte after the destination connection ID. The token 727 length, which follows the source connection ID, is a variable-length 728 integer as specified in Section 16 of [QUIC-TRANSPORT]. 730 After decryption, the client's Initial packet can be parsed to detect 731 the CRYPTO frame(s) that contain the TLS ClientHello, which then can 732 be parsed similarly to TLS over TCP connections. Note that there can 733 be multiple CRYPTO frames, and they might not be in order, so 734 reassembling the CRYPTO stream by parsing offsets and lengths is 735 required. Further, the client's Initial packet may contain other 736 frames, so the first bytes of each frame need to be checked to 737 identify the frame type and, if needed, to skip over the frame. Note that 738 the length of the frames is dependent on the frame type; see 739 Section 19 of [QUIC-TRANSPORT]. For example, PADDING frames, each 740 consisting of a single zero byte, may occur before, after, or between 741 CRYPTO frames. However, extensions might define additional frame 742 types. 
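The frame walk described above can be sketched as follows. This is an illustrative example, not a complete parser: it assumes the payload of a version 1 Initial packet has already been decrypted, it handles only PADDING, PING, ACK, and CRYPTO frames, and the function names (read_varint, reassemble_crypto) are hypothetical. Any other frame type is treated as unknown and causes the walk to fail.

```python
def read_varint(buf: bytes, pos: int):
    """Decode a QUIC variable-length integer (Section 16 of
    [QUIC-TRANSPORT]); returns (value, next_position)."""
    if pos >= len(buf):
        raise ValueError("truncated variable-length integer")
    first = buf[pos]
    length = 1 << (first >> 6)      # 1, 2, 4, or 8 bytes
    if pos + length > len(buf):
        raise ValueError("truncated variable-length integer")
    value = first & 0x3F
    for byte in buf[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, pos + length


def reassemble_crypto(payload: bytes) -> bytes:
    """Walk the frames of a decrypted Initial payload and return the
    contiguous prefix of the CRYPTO stream starting at offset 0."""
    chunks = {}
    pos = 0
    while pos < len(payload):
        frame_type, pos = read_varint(payload, pos)
        if frame_type in (0x00, 0x01):          # PADDING, PING: type only
            continue
        if frame_type in (0x02, 0x03):          # ACK
            _, pos = read_varint(payload, pos)  # Largest Acknowledged
            _, pos = read_varint(payload, pos)  # ACK Delay
            range_count, pos = read_varint(payload, pos)
            _, pos = read_varint(payload, pos)  # First ACK Range
            for _ in range(range_count):
                _, pos = read_varint(payload, pos)  # Gap
                _, pos = read_varint(payload, pos)  # ACK Range Length
            if frame_type == 0x03:              # three ECN counts
                for _ in range(3):
                    _, pos = read_varint(payload, pos)
            continue
        if frame_type == 0x06:                  # CRYPTO
            offset, pos = read_varint(payload, pos)
            length, pos = read_varint(payload, pos)
            if pos + length > len(payload):
                raise ValueError("truncated CRYPTO frame")
            chunks[offset] = payload[pos:pos + length]
            pos += length
            continue
        # The length of an unknown frame cannot be determined.
        raise ValueError("unknown frame type 0x%x" % frame_type)

    stream = bytearray()
    for offset in sorted(chunks):
        if offset > len(stream):                # gap: later data unusable
            break
        data = chunks[offset]
        stream[offset:offset + len(data)] = data
    return bytes(stream)
```

The returned bytes can then be handed to a TLS handshake message parser to locate the ClientHello and its SNI extension; a ClientHello that spans several Initial packets would additionally require collecting chunks across packets.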
If an unknown frame type is encountered, it is impossible to 743 know the length of that frame which prevents skipping over it, and 744 therefore parsing fails. 746 3.5. Flow Association 748 The QUIC connection ID (see Section 2.6) is designed to allow a 749 coordinating on-path device, such as a load-balancer, to associate 750 two flows when one of the endpoints changes address. This change can 751 be due to NAT rebinding or address migration. 753 The connection ID must change upon intentional address change by an 754 endpoint, and connection ID negotiation is encrypted, so it is not 755 possible for a passive observer to link intended changes of address 756 using the connection ID. 758 When one endpoint's address unintentionally changes, as is the case 759 with NAT rebinding, an on-path observer may be able to use the 760 connection ID to associate the flow on the new address with the flow 761 on the old address. 763 A network function that attempts to use the connection ID to 764 associate flows must be robust to the failure of this technique. 765 Since the connection ID may change multiple times during the lifetime 766 of a connection, packets with the same five-tuple but different 767 connection IDs might or might not belong to the same connection. 768 Likewise, packets with the same connection ID but different five- 769 tuples might not belong to the same connection, either. 771 Connection IDs should be treated as opaque; see Section 4.4 for 772 caveats regarding connection ID selection at servers. 774 3.6. Flow Teardown 776 QUIC does not expose the end of a connection; the only indication to 777 on-path devices that a flow has ended is that packets are no longer 778 observed. Stateful devices on path such as NATs and firewalls must 779 therefore use idle timeouts to determine when to drop state for QUIC 780 flows; see Section 4.2. 782 3.7. 
Flow Symmetry Measurement 784 QUIC explicitly exposes which side of a connection is a client and 785 which side is a server during the handshake. In addition, the 786 symmetry of a flow (whether primarily client-to-server, primarily 787 server-to-client, or roughly bidirectional, as input to basic traffic 788 classification techniques) can be inferred through the measurement of 789 data rate in each direction. While QUIC traffic is protected and 790 ACKs may be padded, padding is not required. 792 3.8. Round-Trip Time (RTT) Measurement 794 The round-trip time (RTT) of QUIC flows can be inferred by 795 observation once per flow, during the handshake, as in passive TCP 796 measurement; this requires parsing of the QUIC packet header and 797 recognition of the handshake, as illustrated in Section 2.4. It can 798 also be inferred during the flow's lifetime, if the endpoints use the 799 spin bit facility described below and in Section 17.3.1 of 800 [QUIC-TRANSPORT]. 802 3.8.1. Measuring Initial RTT 804 In the common case, the delay between the client's Initial packet 805 (containing the TLS ClientHello) and the server's Initial packet 806 (containing the TLS ServerHello) represents the RTT component on the 807 path between the observer and the server. The delay between the 808 server's first Handshake packet and the Handshake packet sent by the 809 client represents the RTT component on the path between the observer 810 and the client. While the client may send 0-RTT packets after the 811 Initial packet during connection re-establishment, these can be 812 ignored for RTT measurement purposes. 814 Handshake RTT can be measured by adding the client-to-observer and 815 observer-to-server RTT components together. This measurement 816 necessarily includes any transport- and application-layer delay (the 817 latter mainly caused by the asymmetric crypto operations associated 818 with the TLS handshake) at both sides. 820 3.8.2. 
Using the Spin Bit for Passive RTT Measurement 822 The spin bit provides a version-specific method to measure per-flow 823 RTT from observation points on the network path throughout the 824 duration of a connection. See Section 17.4 of [QUIC-TRANSPORT] for 825 the definition of the spin bit in Version 1 of QUIC. Endpoint 826 participation in spin bit signaling is optional. That is, while its 827 location is fixed in this version of QUIC, an endpoint can 828 unilaterally choose to not support "spinning" the bit. 830 Use of the spin bit for RTT measurement by devices on path is only 831 possible when both endpoints enable it. Some endpoints may disable 832 use of the spin bit by default, others only in specific deployment 833 scenarios, e.g. for servers and clients where the RTT would reveal 834 the presence of a VPN or proxy. To avoid making these connections 835 identifiable based on the usage of the spin bit, all endpoints 836 randomly disable "spinning" for at least one eighth of connections, 837 even if otherwise enabled by default. An endpoint not participating 838 in spin bit signaling for a given connection can use a fixed spin 839 value for the duration of the connection, or can set the bit randomly 840 on each packet sent. 842 When in use, the latency spin bit in each direction changes value 843 once per RTT any time that both endpoints are sending packets 844 continuously. An on-path observer can observe the time difference 845 between edges (changes from 1 to 0 or 0 to 1) in the spin bit signal 846 in a single direction to measure one sample of end-to-end RTT. This 847 mechanism follows the principles of protocol measurability laid out 848 in [IPIM]. 850 Note that this measurement, as with passive RTT measurement for TCP, 851 includes any transport protocol delay (e.g., delayed sending of 852 acknowledgments) and/or application layer delay (e.g., waiting for a 853 response to be generated). 
It therefore provides devices on path a 854 good instantaneous estimate of the RTT as experienced by the 855 application. 857 However, application-limited and flow-control-limited senders can 858 have application and transport layer delay, respectively, that are 859 much greater than network RTT. When the sender is application- 860 limited and, e.g., only sends a small amount of periodic application 861 traffic, where that period is longer than the RTT, measuring the spin 862 bit provides information about the application period, not the 863 network RTT. 865 Since the spin bit logic at each endpoint considers only samples from 866 packets that advance the largest packet number, signal generation 867 itself is resistant to reordering. However, reordering can cause 868 problems at an observer by causing spurious edge detection and 869 therefore inaccurate (i.e., lower) RTT estimates, if reordering 870 occurs across a spin-bit flip in the stream. 872 Simple heuristics based on the observed data rate per flow or changes 873 in the RTT series can be used to reject bad RTT samples due to lost 874 or reordered edges in the spin signal, as well as application or flow 875 control limitation; for example, QoF [TMA-QOF] rejects component RTTs 876 significantly higher than RTTs over the history of the flow. These 877 heuristics may use the handshake RTT as an initial RTT estimate for a 878 given flow. Usually such heuristics would also detect if the spin is 879 either constant or randomly set for a connection. 881 An on-path observer that can see traffic in both directions (from 882 client to server and from server to client) can also use the spin bit 883 to measure "upstream" and "downstream" component RTT; i.e., the 884 component of the end-to-end RTT attributable to the paths between the 885 observer and the server and the observer and the client, 886 respectively. 
It does this by measuring the delay between a spin 887 edge observed in the upstream direction and that observed in the 888 downstream direction, and vice versa. 890 Raw RTT samples generated using these techniques can be processed in 891 various ways to generate useful network performance metrics. A 892 simple linear smoothing or moving minimum filter can be applied to 893 the stream of RTT samples to get a more stable estimate of 894 application-experienced RTT. RTT samples measured from the spin bit 895 can also be used to generate RTT distribution information, including 896 minimum RTT (which approximates network RTT over longer time windows) 897 and RTT variance (which approximates jitter as seen by the 898 application). 900 4. Specific Network Management Tasks 902 In this section, we review specific network management and 903 measurement techniques and how QUIC's design impacts them. 905 4.1. Passive Network Performance Measurement and Troubleshooting 907 Limited RTT measurement is possible by passive observation of QUIC 908 traffic; see Section 3.8. No passive measurement of loss is possible 909 with the present wire image. Limited observation of upstream 910 congestion may be possible via the observation of CE markings on ECN- 911 enabled QUIC traffic. 913 On-path devices can also make measurements of RTT, loss and other 914 performance metrics when information is carried in an additional 915 network-layer packet header (Section 6 of 916 [I-D.ietf-tsvwg-transport-encrypt] describes use of operations, 917 administration and management (OAM) information). Using network- 918 layer approaches also has the advantage that common observation and 919 analysis tools can be consistently used by multiple transport 920 protocols; however, these techniques are often limited to 921 measurements within one or multiple cooperating domains. 923 4.2. 
Stateful Treatment of QUIC Traffic 925 Stateful treatment of QUIC traffic (e.g., at a firewall or NAT 926 middlebox) is possible through QUIC traffic and version 927 identification (Section 3.1) and observation of the handshake for 928 connection confirmation (Section 3.2). The lack of any visible end- 929 of-flow signal (Section 3.6) means that this state must be purged 930 either through timers or through least-recently-used eviction, 931 depending on application requirements. 933 While QUIC has no clear network-visible end-of-connection signal and 934 therefore does require timer-based state removal, the QUIC handshake 935 indicates confirmation by both ends of a valid bidirectional 936 transmission. As soon as the handshake has completed, timers should be 937 set long enough to also allow for short idle periods during a valid 938 transmission. 940 [RFC4787] requires a network state timeout that is not less than 2 941 minutes for most UDP traffic. However, in practice, a QUIC endpoint 942 can experience lower timeouts, in the range of 30 to 60 seconds. 944 In contrast, [RFC5382] recommends a state timeout of more than 2 945 hours for TCP, given that TCP is a connection-oriented protocol with 946 well-defined closure semantics. Even though QUIC has explicitly 947 been designed to tolerate NAT rebindings, decreasing the NAT timeout 948 is not recommended, as it may negatively impact application 949 performance or incentivize endpoints to send very frequent keep-alive 950 packets. 952 The recommendation is therefore that, even when lower state timeouts 953 are used for other UDP traffic, a state timeout of at least two 954 minutes ought to be used for QUIC traffic. 956 If state is removed too early, this could lead to black-holing of 957 incoming packets after a short idle period. 
To detect this 958 situation, a timer at the client needs to expire before a re- 959 establishment can happen (if at all), which would lead to unnecessarily 960 long delays in an otherwise working connection. 962 Furthermore, not all endpoints use routing architectures where 963 connections will survive a port or address change. So even when the 964 client revives the connection, a NAT rebinding can cause a routing 965 mismatch where a packet is not even delivered to the server that 966 might support address migration. For these reasons, the limits in 967 [RFC4787] are important to avoid black-holing of packets (and hence 968 avoid interrupting the flow of data to the client), especially where 969 devices are able to distinguish QUIC traffic from other UDP payloads. 971 The QUIC header optionally contains a connection ID which could 972 provide additional entropy beyond the 5-tuple. The QUIC handshake 973 needs to be observed in order to understand whether the connection ID 974 is present and what length it has. However, connection IDs may be 975 renegotiated after the handshake, and this renegotiation is not 976 visible to the path. Therefore, using the connection ID as a flow 977 key field for stateful treatment of flows is not recommended as 978 connection ID changes will cause undetectable and unrecoverable loss 979 of state in the middle of a connection. Specifically, the use of the 980 connection ID for functions that require state to make a forwarding 981 decision is not viable as it will break connectivity or at minimum 982 cause long timeout-based delays before this problem is detected by 983 the endpoints and the connection can potentially be re-established. 985 Use of connection IDs is specifically discouraged for NAT 986 applications. If a NAT hits an operational limit, it is recommended 987 to rather drop the initial packets of a flow (see also Section 4.5), 988 which potentially triggers a fallback to TCP. 
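The stateful treatment recommended in this section (state keyed on the 5-tuple rather than on the connection ID, and removed only after an idle period of at least two minutes) might be sketched as follows. The class and method names are hypothetical, and a real device would also need capacity limits and an eviction policy, which are omitted here.

```python
import time
from typing import Dict, Optional, Tuple

# (source address, source port, destination address, destination port,
#  protocol); the connection ID is deliberately not part of the key.
FiveTuple = Tuple[str, int, str, int, str]

# At least two minutes, as recommended in this section.
QUIC_IDLE_TIMEOUT_SECONDS = 120.0


class FlowTable:
    """Per-flow state with timer-based removal."""

    def __init__(self, idle_timeout: float = QUIC_IDLE_TIMEOUT_SECONDS):
        self.idle_timeout = idle_timeout
        self._last_seen: Dict[FiveTuple, float] = {}

    def touch(self, flow: FiveTuple, now: Optional[float] = None) -> None:
        """Record that a packet of this flow was observed."""
        self._last_seen[flow] = time.monotonic() if now is None else now

    def is_active(self, flow: FiveTuple, now: Optional[float] = None) -> bool:
        """True if the flow was seen within the idle timeout."""
        now = time.monotonic() if now is None else now
        seen = self._last_seen.get(flow)
        return seen is not None and now - seen < self.idle_timeout

    def expire(self, now: Optional[float] = None) -> int:
        """Drop state for idle flows; returns how many were removed."""
        now = time.monotonic() if now is None else now
        stale = [flow for flow, seen in self._last_seen.items()
                 if now - seen >= self.idle_timeout]
        for flow in stale:
            del self._last_seen[flow]
        return len(stale)
```

Passing an explicit value for the now parameter, as in table.touch(flow, now=0.0), makes the timeout behavior easy to exercise without waiting in real time.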
Use of the connection 989 ID to multiplex multiple connections on the same IP address/port pair 990 is not a viable solution as it risks connectivity breakage, in case 991 the connection ID changes. 993 4.3. Address Rewriting to Ensure Routing Stability 995 While QUIC's migration capability makes it possible for a connection 996 to survive client address changes, this does not work if the routers 997 or switches in the server infrastructure route using the address-port 998 4-tuple. If infrastructure routes on addresses only, NAT rebinding 999 or address migration will cause packets to be delivered to the wrong 1000 server. [QUIC_LB] describes a way to address this problem by 1001 coordinating the selection and use of connection IDs between load- 1002 balancers and servers. 1004 Applying address translation at a middlebox to maintain a stable 1005 address-port mapping for flows based on connection ID might seem like 1006 a solution to this problem. However, hiding information about the 1007 change of the IP address or port conceals important and security- 1008 relevant information from QUIC endpoints and as such would facilitate 1009 amplification attacks (see Section 9 of [QUIC-TRANSPORT]). A NAT 1010 function that hides peer address changes prevents the other end from 1011 detecting and mitigating attacks as the endpoint cannot verify 1012 connectivity to the new address using QUIC PATH_CHALLENGE and 1013 PATH_RESPONSE frames. 1015 In addition, a change of IP address or port is also an input signal 1016 to other internal mechanisms in QUIC. When a path change is 1017 detected, path-dependent variables like congestion control parameters 1018 will be reset protecting the new path from overload. 1020 4.4. Server Cooperation with Load Balancers 1022 In the case of networking architectures that include load balancers, 1023 the connection ID can be used as a way for the server to signal 1024 information about the desired treatment of a flow to the load 1025 balancers. 
Guidance on assigning connection IDs is given in 1026 [QUIC-APPLICABILITY]. [QUIC_LB] describes a system for coordinating 1027 selection and use of connection IDs between load-balancers and 1028 servers. 1030 4.5. Filtering Behavior 1032 [RFC4787] describes possible packet filtering behaviors that relate 1033 to NATs but are often also used in other scenarios where packet 1034 filtering is desired. Though the guidance there holds, a 1035 particularly unwise behavior admits a handful of UDP packets and then 1036 makes a decision as to whether or not to filter later packets in the same 1037 connection. QUIC applications are encouraged to fail over to TCP if 1038 early packets do not arrive at their destination 1039 [I-D.ietf-quic-applicability], as QUIC is based on UDP and there are 1040 known blocks of UDP traffic (see Section 4.6). Admitting a few 1041 packets allows the QUIC endpoint to determine that the path accepts 1042 QUIC. Sudden drops afterwards will result in slow and costly 1043 timeouts before abandoning the connection. 1045 4.6. UDP Blocking or Throttling 1047 Today, UDP is the most prevalent DDoS vector, since it is easy for 1048 compromised non-admin applications to send a flood of large UDP 1049 packets (while with TCP the attacker gets throttled by the congestion 1050 controller) or to craft reflection and amplification attacks. Some 1051 networks therefore block UDP traffic. With increased deployment of 1052 QUIC, there is also an increased need to allow UDP traffic on ports 1053 used for QUIC. However, if UDP is generally enabled on these ports, 1054 UDP flood attacks may also use the same ports. One possible response 1055 to this threat is to throttle UDP traffic on the network, allocating 1056 a fixed portion of the network capacity to UDP and blocking UDP 1057 datagrams over that cap. 
As the portion of QUIC traffic compared to 1058 TCP is also expected to increase over time, using such a limit is not 1059 recommended; if it is used, limits might need to be adapted dynamically. 1061 Further, if UDP traffic is desired to be throttled, it is recommended 1062 to block individual QUIC flows entirely rather than dropping packets 1063 indiscriminately. When the handshake is blocked, QUIC-capable 1064 applications may fail over to TCP. However, blocking a random 1065 fraction of QUIC packets across 4-tuples will allow many QUIC 1066 handshakes to complete, preventing a TCP failover, but these 1067 connections will suffer from severe packet loss (see also 1068 Section 4.5). Therefore, UDP throttling should be realized by per- 1069 flow policing, as opposed to per-packet policing. Note that this 1070 per-flow policing should be stateless to avoid problems with stateful 1071 treatment of QUIC flows (see Section 4.2), for example blocking a 1072 portion of the space of values of a hash function over the addresses 1073 and ports in the UDP datagram. While QUIC endpoints are often able 1074 to survive address changes, e.g., due to NAT rebindings, blocking a 1075 portion of the traffic based on 5-tuple hashing increases the risk of 1076 black-holing an active connection when the address changes. 1078 4.7. DDoS Detection and Mitigation 1080 On-path observation of the transport headers of packets can be used 1081 for various security functions. For example, Denial of Service (DoS) 1082 and Distributed DoS (DDoS) attacks against the infrastructure or 1083 against an endpoint can be detected and mitigated by characterising 1084 anomalous traffic. Other uses include support for security audits 1085 (e.g., verifying the compliance with ciphersuites); client and 1086 application fingerprinting for inventory; and to provide alerts for 1087 network intrusion detection and other next generation firewall 1088 functions. 
1090 Current practices in detection and mitigation of DDoS attacks 1091 generally involve classification of incoming traffic (as packets, 1092 flows, or some other aggregate) into "good" (productive) and "bad" 1093 (DDoS) traffic, and then differential treatment of this traffic to 1094 forward only good traffic. This operation is often done in a 1095 separate specialized mitigation environment through which all traffic 1096 is filtered; a generalized architecture for separation of concerns in 1097 mitigation is given in [DOTS-ARCH]. 1099 Efficient classification of this DDoS traffic in the mitigation 1100 environment is key to the success of this approach. Limited first- 1101 packet garbage detection as in Section 3.1.2 and stateful tracking of 1102 QUIC traffic as in Section 4.2 above may be useful during 1103 classification. 1105 Note that the use of a connection ID to support connection migration 1106 renders 5-tuple based filtering insufficient to detect active flows 1107 and requires more state to be maintained by DDoS defense systems if 1108 support of migration of QUIC flows is desired. For the common case 1109 of NAT rebinding, where the client's address changes without the 1110 client's intent or knowledge, DDoS defense systems can detect a 1111 change in the client's endpoint address by linking flows based on the 1112 server's connection IDs. However, QUIC's linkability resistance 1113 ensures that a deliberate connection migration is accompanied by a 1114 change in the connection ID. In this case, the connection ID can not 1115 be used to distinguish valid, active traffic from new attack traffic. 1117 It is also possible for endpoints to directly support security 1118 functions such as DoS classification and mitigation. Endpoints can 1119 cooperate with an in-network device directly by e.g. sharing 1120 information about connection IDs. 
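One way such cooperation could look is sketched below. This is purely illustrative and not a mechanism defined by QUIC or by this document: it assumes that the server shares its currently active connection IDs with the mitigation device over some out-of-band channel and that all of these connection IDs have a fixed, known length. The class and method names are hypothetical.

```python
from typing import Optional, Set


class SharedCidFilter:
    """Allow-list of connection IDs shared by a cooperating server."""

    def __init__(self, cid_length: int):
        # Assumption: the server uses connection IDs of one fixed length
        # and has communicated that length to this device.
        self.cid_length = cid_length
        self._active: Set[bytes] = set()

    def share(self, cid: bytes) -> None:
        """Called when the server announces an active connection ID."""
        if len(cid) != self.cid_length:
            raise ValueError("unexpected connection ID length")
        self._active.add(cid)

    def retire(self, cid: bytes) -> None:
        """Called when the server retires a connection ID."""
        self._active.discard(cid)

    def short_header_dcid(self, datagram: bytes) -> Optional[bytes]:
        """Extract the Destination Connection ID from a short header
        packet; returns None for long header or truncated packets."""
        if not datagram or datagram[0] & 0x80:
            return None
        dcid = datagram[1:1 + self.cid_length]
        return dcid if len(dcid) == self.cid_length else None

    def matches_active_connection(self, datagram: bytes) -> bool:
        """True if a packet from a previously unseen address carries a
        connection ID that the server has reported as active."""
        dcid = self.short_header_dcid(datagram)
        return dcid is not None and dcid in self._active
```

A device using such a filter could admit short header packets arriving from a new address when matches_active_connection() returns True, instead of classifying them as attack traffic.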
1122 Another potential method could use an on-path network device that 1123 relies on pattern inferences in the traffic and heuristics or machine 1124 learning instead of processing observed header information. 1126 However, it is questionable whether connection migrations must be 1127 supported during a DDoS attack. While unintended migration without a 1128 connection ID change can be more easily supported, it might be 1129 acceptable to not support migrations of active QUIC connections that 1130 are not visible to the network functions performing the DDoS 1131 detection. As soon as the connection blocking is detected by the 1132 client, the client may be able to rely on the fast resumption 1133 mechanism provided by QUIC. When clients migrate to a new path, they 1134 should be prepared for the migration to fail and attempt to reconnect 1135 quickly. 1137 Beyond in-network DDoS protection mechanisms, TCP syncookies 1138 [RFC4937] are a well-established method of mitigating some kinds of 1139 TCP DDoS attacks. QUIC Retry packets are the functional analogue to 1140 syncookies, forcing clients to prove possession of their IP address 1141 before committing server state. However, there are safeguards in 1142 QUIC against unsolicited injection of these packets by intermediaries 1143 who do not have consent of the end server. See [QUIC_LB] for 1144 standard ways for intermediaries to send Retry packets on behalf of 1145 consenting servers. 1147 4.8. Quality of Service Handling and ECMP Routing 1149 It is expected that any QoS handling in the network, e.g. based on 1150 use of DiffServ Code Points (DSCPs) [RFC2475] as well as Equal-Cost 1151 Multi-Path (ECMP) routing, is applied on a per-flow basis (and not 1152 per-packet), such that all packets belonging to the same QUIC 1153 connection get uniform treatment. 
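Per-flow (rather than per-packet) treatment is commonly obtained by hashing the addresses and ports of each datagram, so that every packet of a flow maps to the same choice without any per-flow state. The following minimal sketch illustrates the idea for ECMP path selection; the function name select_path and the use of SHA-256 are assumptions made for this example, and deployed routers typically use much cheaper hash functions. The same stateless hashing idea underlies the per-flow policing discussed in Section 4.6.

```python
import hashlib
from typing import Sequence


def select_path(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                paths: Sequence[str], protocol: int = 17) -> str:
    """Pick one of several equal-cost paths from a hash over the
    addresses, ports, and protocol number (17 is UDP). All packets with
    the same values map to the same path, so no per-flow state is kept."""
    if not paths:
        raise ValueError("at least one path is required")
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(paths)
    return paths[index]
```

Note that the mapping changes whenever the client's address or port changes, e.g., after a NAT rebinding, so packets of one QUIC connection can still end up on different paths over its lifetime.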
1155 Using ECMP to distribute packets from a single flow across multiple 1156 network paths or any other non-uniform treatment of packets belonging to 1157 the same connection could result in variations in order, delivery 1158 rate, and drop rate. As feedback about loss or delay of each packet 1159 is used as input to the congestion controller, these variations could 1160 adversely affect performance. Depending on the loss recovery 1161 mechanism implemented, QUIC may be more tolerant of packet re- 1162 ordering than traditional TCP traffic (see Section 2.7). However, 1163 the recovery mechanism used by a flow cannot be known by the network 1164 and therefore reordering tolerance should be considered as unknown. 1166 4.9. Handling ICMP Messages 1168 Datagram Packetization Layer PMTU Discovery (DPLPMTUD) can be used by 1169 QUIC to probe for the supported PMTU. DPLPMTUD optionally uses ICMP 1170 messages (e.g., IPv6 Packet Too Big messages). Given known attacks 1171 with the use of ICMP messages, the use of DPLPMTUD in QUIC has been 1172 designed to safely use but not rely on receiving ICMP feedback (see 1173 Section 14.2.1 of [QUIC-TRANSPORT]). 1175 Networks are recommended to forward these ICMP messages and retain as 1176 much of the original packet as possible without exceeding the minimum 1177 MTU for the IP version when generating ICMP messages as recommended 1178 in [RFC1812] and [RFC4443]. 1180 4.10. Guiding Path MTU 1182 Some network segments support 1500-byte packets, but can only do so 1183 by fragmenting at a lower layer before traversing a network segment 1184 with a smaller MTU, and then reassembling within the network segment. 1185 This is permissible even when the IP layer is IPv6 or IPv4 with the 1186 DF bit set, because fragmentation occurs below the IP layer. However, 1187 this process can add to compute and memory costs, leading to a 1188 bottleneck that limits network capacity. 
In such networks this 1189 generates a desire to influence a majority of senders to use smaller 1190 packets, to avoid exceeding limited reassembly capacity. 1192 For TCP, MSS clamping (Section 3.2 of [RFC4459]) is often used to 1193 change the sender's TCP maximum segment size, but QUIC requires a 1194 different approach. Section 14 of [QUIC-TRANSPORT] advises senders 1195 to probe larger sizes using Datagram Packetization Layer PMTU 1196 Discovery ([DPLPMTUD]) or Path Maximum Transmission Unit Discovery 1197 (PMTUD: [RFC1191] and [RFC8201]). This mechanism encourages senders 1198 to approach the maximum packet size, which could then cause 1199 fragmentation within a network segment of which they may not be 1200 aware. 1202 If path performance is limited when forwarding larger packets, an on- 1203 path device should support a maximum packet size for a specific 1204 transport flow and then consistently drop all packets that exceed the 1205 configured size when the inner IPv4 packet has DF set, or IPv6 is 1206 used. 1208 Networks with configurations that would lead to fragmentation of 1209 large packets within a network segment should drop such packets 1210 rather than fragmenting them. Network operators who plan to 1211 implement a more selective policy may start by focusing on QUIC. 1213 QUIC flows cannot always be easily distinguished from other UDP 1214 traffic, but we assume at least some portion of QUIC traffic can be 1215 identified (see Section 3.1). For networks supporting QUIC, it is 1216 recommended that a path drops any packet larger than the 1217 fragmentation size. When a QUIC endpoint uses DPLPMTUD, it will use 1218 a QUIC probe packet to discover the PMTU. If this probe is lost, it 1219 will not impact the flow of QUIC data. 1221 IPv4 routers generate an ICMP message when a packet is dropped 1222 because the link MTU was exceeded. [RFC8504] specifies how an IPv6 1223 node generates an ICMPv6 Packet Too Big message (PTB) in this case. 
1224 PMTUD relies upon an endpoint receiving such PTB messages [RFC8201], 1225 whereas DPLPMTUD does not rely upon these messages, but still can 1226 optionally use these to improve performance (Section 4.6 of 1227 [DPLPMTUD]). 1229 A network cannot know in advance which discovery method is used by a 1230 QUIC endpoint, so it should send a PTB message in addition to 1231 dropping an oversized packet. A generated PTB message should be 1232 compliant with the validation requirements of Section 14.2.1 of 1233 [QUIC-TRANSPORT], otherwise it will be ignored for PMTU discovery. 1234 This provides a signal to the endpoint to prevent the packet size 1235 from growing too large, which can entirely avoid network segment 1236 fragmentation for that flow. 1238 Endpoints can cache PMTU information in the IP-layer cache. This 1239 short-term consistency between the PMTU for flows can help avoid an 1240 endpoint using a PMTU that is inefficient. The IP cache can also 1241 influence the PMTU value of other IP flows that use the same path 1242 [RFC8201][RFC8899], including IP packets carrying protocols other 1243 than QUIC. The representation of an IP path is implementation- 1244 specific [RFC8201]. 1246 5. IANA Considerations 1248 This document has no actions for IANA. 1250 6. Security Considerations 1252 QUIC is an encrypted and authenticated transport. That means, once 1253 the cryptographic handshake is complete, QUIC endpoints discard most 1254 packets that are not authenticated, greatly limiting the ability of 1255 an attacker to interfere with existing connections. 1257 However, some information is still observable, as supporting 1258 manageability of QUIC traffic inherently involves tradeoffs with the 1259 confidentiality of QUIC's control information; this entire document 1260 is therefore security-relevant.
1262 More security considerations for QUIC are discussed in 1263 [QUIC-TRANSPORT] and [QUIC-TLS], generally considering active or 1264 passive attackers in the network as well as attacks on specific QUIC 1265 mechanisms. 1267 Version Negotiation packets do not contain any mechanism to prevent 1268 version downgrade attacks. However, future versions of QUIC that use 1269 Version Negotiation packets are required to define a mechanism that 1270 is robust against version downgrade attacks. Therefore, a network 1271 node should not attempt to impact version selection, as version 1272 downgrade may result in connection failure. 1274 7. Contributors 1276 The following people have contributed text to sections of this 1277 document: 1279 * Dan Druta 1281 * Martin Duke 1283 * Igor Lubashev 1285 * David Schinazi 1287 * Gorry Fairhurst 1289 * Chris Box 1291 8. Acknowledgments 1293 Thanks to Thomas Fossati, Jana Iyengar, and Marcus Ihlar for their early 1294 reviews and feedback. Special thanks also to Martin Thomson and 1295 Martin Duke for their detailed reviews and input. And thanks to Sean 1296 Turner, Mike Bishop, Ian Swett, and Nick Banks for their last call 1297 reviews. 1299 This work is partially supported by the European Commission under 1300 Horizon 2020 grant agreement no. 688421 Measurement and Architecture 1301 for a Middleboxed Internet (MAMI), and by the Swiss State Secretariat 1302 for Education, Research, and Innovation under contract no. 15.0268. 1303 This support does not imply endorsement. 1305 9. References 1307 9.1. Normative References 1309 [QUIC-TLS] Thomson, M. and S. Turner, "Using TLS to Secure QUIC", 1310 Work in Progress, Internet-Draft, draft-ietf-quic-tls-34, 1311 14 January 2021, . 1314 [QUIC-TRANSPORT] 1315 Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed 1316 and Secure Transport", Work in Progress, Internet-Draft, 1317 draft-ietf-quic-transport-34, 14 January 2021, 1318 . 1321 9.2.
Informative References 1323 [DOTS-ARCH] 1324 Mortensen, A., Ed., Reddy.K, T., Ed., Andreasen, F., 1325 Teague, N., and R. Compton, "DDoS Open Threat Signaling 1326 (DOTS) Architecture", RFC 8811, DOI 10.17487/RFC8811, 1327 August 2020, . 1329 [DPLPMTUD] Fairhurst, G., Jones, T., Tüxen, M., Rüngeler, I., and T. 1330 Völker, "Packetization Layer Path MTU Discovery for 1331 Datagram Transports", RFC 8899, DOI 10.17487/RFC8899, 1332 September 2020, . 1334 [I-D.ietf-quic-applicability] 1335 Kuehlewind, M. and B. Trammell, "Applicability of the QUIC 1336 Transport Protocol", Work in Progress, Internet-Draft, 1337 draft-ietf-quic-applicability-11, 21 April 2021, 1338 . 1341 [I-D.ietf-tsvwg-transport-encrypt] 1342 Fairhurst, G. and C. Perkins, "Considerations around 1343 Transport Header Confidentiality, Network Operations, and 1344 the Evolution of Internet Transport Protocols", Work in 1345 Progress, Internet-Draft, draft-ietf-tsvwg-transport- 1346 encrypt-21, 20 April 2021, 1347 . 1350 [IPIM] Allman, M., Beverly, R., and B. Trammell, "In-Protocol 1351 Internet Measurement (arXiv preprint 1612.02902)", 9 1352 December 2016, . 1354 [QUIC-APPLICABILITY] 1355 Kuehlewind, M. and B. Trammell, "Applicability of the QUIC 1356 Transport Protocol", Work in Progress, Internet-Draft, 1357 draft-ietf-quic-applicability-11, 21 April 2021, 1358 . 1361 [QUIC-HTTP] 1362 Bishop, M., "Hypertext Transfer Protocol Version 3 1363 (HTTP/3)", Work in Progress, Internet-Draft, draft-ietf- 1364 quic-http-34, 2 February 2021, 1365 . 1368 [QUIC-INVARIANTS] 1369 Thomson, M., "Version-Independent Properties of QUIC", 1370 Work in Progress, Internet-Draft, draft-ietf-quic- 1371 invariants-13, 14 January 2021, 1372 . 1375 [QUIC-RECOVERY] 1376 Iyengar, J. and I. Swett, "QUIC Loss Detection and 1377 Congestion Control", Work in Progress, Internet-Draft, 1378 draft-ietf-quic-recovery-34, 14 January 2021, 1379 . 1382 [QUIC_LB] Duke, M. and N. 
Banks, "QUIC-LB: Generating Routable QUIC 1383 Connection IDs", Work in Progress, Internet-Draft, draft- 1384 ietf-quic-load-balancers-06, 4 February 2021, 1385 . 1388 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 1389 DOI 10.17487/RFC1191, November 1990, 1390 . 1392 [RFC1812] Baker, F., Ed., "Requirements for IP Version 4 Routers", 1393 RFC 1812, DOI 10.17487/RFC1812, June 1995, 1394 . 1396 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 1397 and W. Weiss, "An Architecture for Differentiated 1398 Services", RFC 2475, DOI 10.17487/RFC2475, December 1998, 1399 . 1401 [RFC3449] Balakrishnan, H., Padmanabhan, V., Fairhurst, G., and M. 1402 Sooriyabandara, "TCP Performance Implications of Network 1403 Path Asymmetry", BCP 69, RFC 3449, DOI 10.17487/RFC3449, 1404 December 2002, . 1406 [RFC4443] Conta, A., Deering, S., and M. Gupta, Ed., "Internet 1407 Control Message Protocol (ICMPv6) for the Internet 1408 Protocol Version 6 (IPv6) Specification", STD 89, 1409 RFC 4443, DOI 10.17487/RFC4443, March 2006, 1410 . 1412 [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- 1413 Network Tunneling", RFC 4459, DOI 10.17487/RFC4459, April 1414 2006, . 1416 [RFC4787] Audet, F., Ed. and C. Jennings, "Network Address 1417 Translation (NAT) Behavioral Requirements for Unicast 1418 UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January 1419 2007, . 1421 [RFC4937] Arberg, P. and V. Mammoliti, "IANA Considerations for PPP 1422 over Ethernet (PPPoE)", RFC 4937, DOI 10.17487/RFC4937, 1423 June 2007, . 1425 [RFC5382] Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P. 1426 Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142, 1427 RFC 5382, DOI 10.17487/RFC5382, October 2008, 1428 . 1430 [RFC6066] Eastlake 3rd, D., "Transport Layer Security (TLS) 1431 Extensions: Extension Definitions", RFC 6066, 1432 DOI 10.17487/RFC6066, January 2011, 1433 . 1435 [RFC7301] Friedl, S., Popov, A., Langley, A., and E. 
Stephan, 1436 "Transport Layer Security (TLS) Application-Layer Protocol 1437 Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301, 1438 July 2014, . 1440 [RFC7605] Touch, J., "Recommendations on Using Assigned Transport 1441 Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605, 1442 August 2015, . 1444 [RFC7801] Dolmatov, V., Ed., "GOST R 34.12-2015: Block Cipher 1445 "Kuznyechik"", RFC 7801, DOI 10.17487/RFC7801, March 2016, 1446 . 1448 [RFC7983] Petit-Huguenin, M. and G. Salgueiro, "Multiplexing Scheme 1449 Updates for Secure Real-time Transport Protocol (SRTP) 1450 Extension for Datagram Transport Layer Security (DTLS)", 1451 RFC 7983, DOI 10.17487/RFC7983, September 2016, 1452 . 1454 [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., 1455 "Path MTU Discovery for IP version 6", STD 87, RFC 8201, 1456 DOI 10.17487/RFC8201, July 2017, 1457 . 1459 [RFC8504] Chown, T., Loughney, J., and T. Winters, "IPv6 Node 1460 Requirements", BCP 220, RFC 8504, DOI 10.17487/RFC8504, 1461 January 2019, . 1463 [RFC8899] Fairhurst, G., Jones, T., Tüxen, M., Rüngeler, I., and T. 1464 Völker, "Packetization Layer Path MTU Discovery for 1465 Datagram Transports", RFC 8899, DOI 10.17487/RFC8899, 1466 September 2020, . 1468 [TLS-ECH] Rescorla, E., Oku, K., Sullivan, N., and C. A. Wood, "TLS 1469 Encrypted Client Hello", Work in Progress, Internet-Draft, 1470 draft-ietf-tls-esni-11, 14 June 2021, 1471 . 1474 [TMA-QOF] Trammell, B., Gugelmann, D., and N. Brownlee, "Inline Data 1475 Integrity Signals for Passive Measurement (in Proc. TMA 1476 2014)", April 2014. 1478 [WIRE-IMAGE] 1479 Trammell, B. and M. Kuehlewind, "The Wire Image of a 1480 Network Protocol", RFC 8546, DOI 10.17487/RFC8546, April 1481 2019, . 1483 Authors' Addresses 1485 Mirja Kuehlewind 1486 Ericsson 1488 Email: mirja.kuehlewind@ericsson.com 1490 Brian Trammell 1491 Google Switzerland GmbH 1492 Gustav-Gull-Platz 1 1493 CH- 8004 Zurich 1494 Switzerland 1496 Email: ietf@trammell.ch