QUIC                                                    B. Trammell, Ed.
Internet-Draft                                               P. De Vaere
Intended status: Informational                                ETH Zurich
Expires: June 16, 2018                                           R. Even
                                                                  Huawei
                                                             G. Fioccola
                                                          Telecom Italia
                                                              T. Fossati
                                                                   Nokia
                                                                M. Ihlar
                                                                Ericsson
                                                               A. Morton
                                                               AT&T Labs
                                                              E. Stephan
                                                                  Orange
                                                       December 13, 2017

        The Addition of a Spin Bit to the QUIC Transport Protocol
                       draft-trammell-quic-spin-01

Abstract

   This document summarizes work to date on the addition of a "spin
   bit", intended for explicit measurability of end-to-end RTT on QUIC
   flows. It proposes a detailed mechanism for the spin bit, describes
   how to use it to measure end-to-end latency, discusses corner cases
   and their workarounds in the measurement, describes experimental
   evaluation of the mechanism done to date, and examines the utility
   and privacy implications of the spin bit. As the overhead and risk
   associated with the spin bit are negligible, and the utility of a
   passive RTT measurement signal at higher resolution than once per
   flow is clear, this document advocates for the addition of the spin
   bit to the protocol.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 16, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document. Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  About This Document
   2.  The Spin Bit Mechanism
     2.1.  Proposed Short Header Format Including Spin Bit
   3.  Using the Spin Bit for Passive RTT Measurement
     3.1.  Limitations and Workarounds
     3.2.  Illustration
     3.3.  Experimental Evaluation
   4.  Use Cases for Passive RTT Measurement
     4.1.  Inter-domain Troubleshooting
     4.2.  Two-Point Intradomain Measurement
     4.3.  Bufferbloat Mitigation in Cellular Networks
     4.4.  Locating WiFi Problems in Home Networks
     4.5.  Internet Measurement Research
   5.  Alternate RTT Measurement Approaches for Diagnosing QUIC flows
     5.1.  Handshake RTT measurement
     5.2.  Parallel active measurement
     5.3.  Frequency Analysis
   6.  Greasing
   7.  Privacy and Security Considerations
   8.  Acknowledgments
   9.  Informative References
   Authors' Addresses

1.  Introduction

   The QUIC transport protocol [QUIC-TRANS] is a UDP-encapsulated
   protocol integrated with Transport Layer Security (TLS) [TLS] to
   encrypt most of its protocol internals, beyond those handshake
   packets needed to establish or resume a TLS session, and information
   required to reassemble QUIC streams (the packet number) and to route
   QUIC packets to the correct machine in a load-balancing situation
   (the connection ID). QUIC's wire image (see [WIRE-IMAGE]) exposes
   much less information about transport protocol state than TCP's wire
   image. Specifically, on-path observers of QUIC cannot see the
   sequence numbers, acknowledgement numbers, and timestamps that TCP
   exposes, so passive TCP loss and latency measurement techniques that
   rely on this information (e.g. [CACM-TCP], [TMA-QOF]) cannot be
   easily ported to QUIC.

   This document proposes a solution to this problem by adding a
   "latency spin bit" to the QUIC short header. This bit is designed
   solely for explicit passive measurability of the protocol. It
   provides one RTT sample per RTT to passive observers of QUIC traffic.
   This document describes the mechanism, how it can be added to QUIC,
   and how it can be used by passive measurement facilities to generate
   RTT samples. It explores potential corner cases and shortcomings of
   the mechanism and how they can be mitigated. It summarizes
   experimental results to date with an implementation of the spin bit
   built atop a recent QUIC implementation. It additionally describes
   use cases for passive RTT measurement at the resolution provided by
   the spin bit. It further reviews findings on privacy risk researched
   by the QUIC RTT Design Team, which was tasked by the IETF QUIC
   Working Group to determine the risk/utility tradeoff for the spin
   bit.

   The spin bit has low overhead, presents negligible privacy risk, and
   has clear utility in providing passive RTT measurability of QUIC that
   is far superior to QUIC's measurability without the spin bit, and
   equivalent to or better than TCP passive measurability.

1.1.  About This Document

   This document is maintained in the GitHub repository
   https://github.com/britram/draft-trammell-quic-spin, and the editor's
   copy is available online at
   https://britram.github.io/draft-trammell-quic-spin. Current open
   issues on the document can be seen at
   https://github.com/britram/draft-trammell-quic-spin/issues. Comments
   and suggestions on this document can be made by filing an issue
   there, or by contacting the editor.

2.  The Spin Bit Mechanism

   The latency spin bit enables latency monitoring from observation
   points on the network path. Each endpoint, client and server,
   maintains a spin value, 0 or 1, for each QUIC connection, and sets
   the spin bit on packets it sends for that connection to the
   appropriate value (below). It also maintains the highest packet
   number seen from its peer on the connection. The value is then
   determined at each endpoint as follows:

   o  The server initializes its spin value to 0. When it receives a
      packet from the client, if that packet has a short header and if
      it increments the highest packet number seen by the server from
      the client, it sets the spin value to the spin bit in the received
      packet.

   o  The client initializes its spin value to 0. When it receives a
      packet from the server, if the packet has a short header and if it
      increments the highest packet number seen by the client from the
      server, it sets the spin value to the opposite of the spin bit in
      the received packet.

   This procedure will cause the spin bit to change value in each
   direction once per round trip.
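
   The endpoint rules above can be sketched in a few lines of Python.
   This is an illustrative sketch, not part of the proposal: the class
   name SpinEndpoint, its fields, and the on_receive() interface are
   hypothetical, and it assumes that the highest packet number seen is
   tracked across all packets from the peer while only short-header
   packets update the spin value.

```python
from dataclasses import dataclass


@dataclass
class SpinEndpoint:
    """Hypothetical per-connection spin state for one endpoint."""

    is_client: bool
    spin: int = 0          # both endpoints initialize the spin value to 0
    highest_pn: int = -1   # highest packet number seen from the peer

    def on_receive(self, pn: int, short_header: bool, spin_bit: int) -> None:
        # Packets that do not advance the highest packet number seen
        # (reordered or duplicated packets) never change the spin value.
        if pn <= self.highest_pn:
            return
        self.highest_pn = pn
        # Assumption: long-header packets advance the packet number
        # but carry no spin bit, so they leave the spin value alone.
        if not short_header:
            return
        # The server reflects the received bit; the client inverts it.
        self.spin = (1 - spin_bit) if self.is_client else spin_bit

    def spin_for_next_packet(self) -> int:
        # Value to place in the spin bit of the next outgoing packet.
        return self.spin
```

   Feeding each endpoint's outgoing spin value to the other in turn
   makes the value flip once per simulated round trip, and a late
   packet with a lower packet number leaves the state untouched.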

   Observation points can estimate the network latency by observing
   these changes in the latency spin bit, as described in Section 3.
   See Section 3.2 for an illustration of this mechanism in action.

2.1.  Proposed Short Header Format Including Spin Bit

   Since it is possible to measure handshake RTT without a spin bit (see
   Section 5.1), it is sufficient to include the spin bit in the short
   packet header. This proposal suggests using the fourth most
   significant bit (0x10) of the first octet in the short header for the
   spin bit.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+
   |0|C|K|S|Type(4)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                     [Connection ID (64)]                      +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Packet Number (8/16/32)                 ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Protected Payload (*)                   ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 1: Short Header Format including proposed Spin Bit

   This will limit the number of available short packet types to 16.
   The short packet types will be redefined to the following values:

                      +------+--------------------+
                      | Type | Packet Number Size |
                      +------+--------------------+
                      | 0xD  | 4 octets           |
                      |      |                    |
                      | 0xE  | 2 octets           |
                      |      |                    |
                      | 0xF  | 1 octet            |
                      +------+--------------------+

     Table 1: Short Header Packet Types after Definition of Spin Bit

   Note that this proposal changes the short header as defined in
   [QUIC-TRANS] at the time of writing; regardless of where and how the
   spin bit is eventually defined, the key properties of the spin bit
   are (1) it's a single bit, (2) it spins as defined in Section 2, and
   (3) it appears only in the short header; i.e. after version
   negotiation and connection establishment are completed.

3.  Using the Spin Bit for Passive RTT Measurement

   When a QUIC flow is sending at full rate (i.e., neither application
   nor flow control limited), the latency spin bit in each direction
   changes value once per round-trip time (RTT). An on-path observer
   can observe the time difference between edges in the spin bit signal
   in a single direction to measure one sample of end-to-end RTT. Note
   that this measurement, as with passive RTT measurement for TCP,
   includes any transport protocol delay (e.g., delayed sending of
   acknowledgements) and/or application layer delay (e.g., waiting for a
   request to complete). It therefore provides devices on path a good
   instantaneous estimate of the RTT as experienced by the application.
   A simple linear smoothing or moving minimum filter can be applied to
   the stream of RTT information to get a more stable estimate.

   An on-path observer that can see traffic in both directions (from
   client to server and from server to client) can also use the spin bit
   to measure "upstream" and "downstream" component RTT; i.e., the
   component of the end-to-end RTT attributable to the paths between the
   observer and the server and the observer and the client,
   respectively. It does this by measuring the delay between a spin
   edge observed in the upstream direction and that observed in the
   downstream direction, and vice versa.

3.1.  Limitations and Workarounds

   Application-limited and flow-control-limited senders can have
   application and transport layer delay, respectively, that are much
   greater than network RTT. Therefore, the spin bit provides network
   latency information only when the sender is neither application nor
   flow control limited. When the sender is application-limited by
   periodic application traffic, where that period is longer than the
   RTT, measuring the spin bit provides information about the
   application period, not the RTT.
   Simple heuristics based on the observed data rate per flow or
   changes in the RTT series can be used to reject bad RTT samples due
   to application or flow control limitation.

   Since the spin bit logic at each endpoint considers only samples on
   packets that advance the largest packet number seen, signal
   generation itself is resistant to reordering. However, reordering
   can cause problems at an observer by causing spurious edge detection
   and therefore low RTT estimates, if reordering occurs across a spin
   bit flip in the stream. This can be probabilistically mitigated by
   the observer also tracking the low-order bits of the packet number,
   and rejecting edges that appear out-of-order [RFC4737].

3.2.  Illustration

   To illustrate the operation of the spin bit, we consider a simplified
   model of a single path between client and server as a queue with
   slots for five packets, and assume that both client and server send
   packets at a constant rate. If each packet moves one slot in the
   queue per clock tick, this network has an RTT of 10 ticks.

   Initially, during connection establishment, no packets with a spin
   bit are in flight, as shown in Figure 2.

   +--------+   -  -  -  -  -   +--------+
   |        |     -------->     |        |
   | Client |                   | Server |
   |        |     <--------     |        |
   +--------+   -  -  -  -  -   +--------+

     Figure 2: Initial state, no spin bit between client and server

   Either the server, the client, or both can begin sending packets with
   short headers after connection establishment, as shown in Figure 3;
   here, no spin edges are yet in transit.

   +--------+   0  0  -  -  -   +--------+
   |        |     -------->     |        |
   | Client |                   | Server |
   |        |     <--------     |        |
   +--------+   -  -  0  0  0   +--------+

      Figure 3: Client and server begin sending packets with spin 0

   Once the server's first 0-marked packet arrives at the client, the
   client sets its spin value to 1, and begins sending packets with the
   spin bit set, as shown in Figure 4. The spin edge is now in transit
   toward the server.

   +--------+   1  0  0  0  0   +--------+
   |        |     -------->     |        |
   | Client |                   | Server |
   |        |     <--------     |        |
   +--------+   0  0  0  0  0   +--------+

                   Figure 4: The bit begins spinning

   Five ticks later, this packet arrives at the server, which takes its
   spin value from it and reflects that value back on the next packet it
   sends, as shown in Figure 5. The spin edge is now in transit toward
   the client.

   +--------+   1  1  1  1  1   +--------+
   |        |     -------->     |        |
   | Client |                   | Server |
   |        |     <--------     |        |
   +--------+   0  0  0  0  1   +--------+

                Figure 5: Server reflects the spin edge

   Five ticks later, the 1-marked packet arrives at the client, which
   inverts its spin value and sends the inverted value on the next
   packet it sends, as shown in Figure 6.

   obs. points  X        Y
   +--------+   0  1  1  1  1   +--------+
   |        |     -------->     |        |
   | Client |                   | Server |
   |        |     <--------     |        |
   +--------+   1  1  1  1  1   +--------+
                         Y

                 Figure 6: Client inverts the spin edge

   Here we can also see how measurement works. An observer watching the
   signal at single observation point X in Figure 6 will see an edge
   every 10 ticks, i.e. once per RTT. An observer watching the signal
   at a symmetric observation point Y in Figure 6 will see a server-
   client edge 4 ticks after the client-server edge, and a client-server
   edge 6 ticks after the server-client edge, allowing it to compute
   component RTT.

   Figure 7 shows how this mechanism works in the presence of
   reordering.
   Here, packet C carries the spin edge, and packet B is reordered on
   the way to the client. In this case, the client will begin sending
   spin 1 after the arrival of C, and ignore the spin bit flip to 1 on
   packet B, since B < C; i.e. it does not increment the highest packet
   number seen.

   +--------+   0  0  0  0  0   +--------+
   |        |     -------->     |        |
   | Client |                   | Server |
   |        |     <--------     |        |
   +--------+   1  0  1  0  0   +--------+
           PN=  A  C  B  D  E

                     Figure 7: Handling reordering

3.3.  Experimental Evaluation

   We have evaluated the effectiveness of the spin bit in an emulated
   network environment. The spin bit was added to a fork of [MINQ],
   using the mechanism described in Section 2, but with the spin bit
   appearing in a measurement byte added to the header for passive
   measurability experiments. Spin bit measurement support was added to
   [MOKUMOKUREN]. Full results of these ongoing experiments are
   available online in [SPINBIT-REPORT], but we summarize our findings
   here.

   First, we confirm that the spin bit works as advertised: it provides
   one useful RTT sample per RTT to any passive observer of the flow.
   This sample tracks each sender's local instantaneous estimate of RTT
   as well as the expected RTT (i.e., defined by the emulation) fairly
   well. One surprising implication of this is that the spin bit
   provides _more_ information than is available by local estimation to
   an endpoint which is mostly receiving data frames and sending mainly
   ACKs, and as such can also be useful in purely endpoint-local
   observations of the RTT evolution during the flow. The spin bit also
   works correctly under moderate to heavy packet loss and jitter.

   Second, we confirm that the spin bit can be easily implemented
   without requiring deep integration into a QUIC implementation.
   Indeed, it could be implemented completely independently, as a shim,
   aside from the requirement that the spin bit value be integrity-
   protected along with the rest of the QUIC header.

   Third, we performed experiments focused on the intermittent-sender
   problem described in Section 3.1. We confirm that the spin bit does
   not provide useful RTT samples after the handshake when packets are
   only sent intermittently. Simple heuristics can be used to recognize
   this situation, however, and to reject these RTT samples. We also
   find that a simple sender-side heuristic can be used to determine
   whether a sample will be useful. If a sender sends a packet more
   than a specified delay (e.g. 1ms) after the last packet received by
   the client, it knows that any latency spin observation of that packet
   will be invalid. If a second "spin valid" bit were available, the
   sender could then mark that packet "spin invalid". Our experiments
   show that this simple heuristic and spin validity bit are successful
   in marking all packets whose RTT samples should be rejected.

   Fourth, we performed experiments focused on the reordering problem
   described in Section 3.1. We find that while reordering can cause
   spurious samples at a naive observer, two simple approaches can be
   used to reject spurious RTT samples due to reordering. First, a two-
   bit spin signal that always advances in a single direction (e.g. 00
   -> 01 -> 10 -> 11) successfully rejects all reordered samples,
   including under amounts of reordering that render the transport
   itself mostly useless. However, adding a bit is not necessary:
   having the observer keep the least significant bits of the packet
   number, and rejecting samples from packets that reverse the sequence
   [RFC4737], as suggested in Section 3.1, is essentially as successful
   as a two-bit spin signal in mitigating the effects of reordering on
   RTT measurement.
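
   The packet-number-based rejection described above can be sketched as
   a small one-direction observer in Python. This is a minimal sketch
   under stated assumptions, not the code used in the experiments: the
   names SpinObserver, on_packet() and spin_bit() are hypothetical, the
   spin bit is assumed to sit at 0x10 in the first octet as proposed in
   Section 2.1, and the observer is assumed to see a full, non-wrapping
   packet number rather than only its low-order bits.

```python
from typing import List, Optional

SPIN_MASK = 0x10  # fourth most significant bit of the first octet


def spin_bit(first_octet: int) -> int:
    """Extract the proposed spin bit from a short header's first octet."""
    return 1 if first_octet & SPIN_MASK else 0


class SpinObserver:
    """Hypothetical observer watching one direction of one flow."""

    def __init__(self) -> None:
        self.highest_pn: Optional[int] = None
        self.last_spin: Optional[int] = None
        self.last_edge_time: Optional[float] = None
        self.samples: List[float] = []

    def on_packet(self, timestamp: float, pn: int, first_octet: int) -> None:
        # Ignore packets that do not advance the packet number; an
        # out-of-order packet near a spin flip would otherwise look
        # like an extra edge and yield a spuriously low RTT sample.
        if self.highest_pn is not None and pn <= self.highest_pn:
            return
        self.highest_pn = pn
        spin = spin_bit(first_octet)
        if self.last_spin is not None and spin != self.last_spin:
            # A spin edge: time since the previous edge is one sample.
            if self.last_edge_time is not None:
                self.samples.append(timestamp - self.last_edge_time)
            self.last_edge_time = timestamp
        self.last_spin = spin
```

   A real observer would additionally have to handle truncated packet
   numbers that wrap, and would typically filter the resulting samples
   with the heuristics discussed in Section 3.1.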

   Fifth, we performed parallel active measurements using ping, as
   described in Section 5.2. In our emulated network, the ICMP packets
   and the QUIC packets traverse the same links with the same treatment,
   and share queues at each link, which mitigates most of the issues
   with ping. We find that while ping works as expected in measuring
   end-to-end RTT, it does not track the sender's estimate of RTT, and
   as such does not measure the RTT experienced by the application layer
   as well as the spin bit does.

   In summary, our experiments show that the spin bit is suitable for
   purpose, can be implemented with minimal disruption, and that most of
   the identified problems can be easily mitigated. See
   [SPINBIT-REPORT] for more.

4.  Use Cases for Passive RTT Measurement

   This section describes use cases for passive RTT measurement. Most
   of these are currently achieved with TCP, i.e., the matching of
   packets based on sequence and acknowledgment numbers, or timestamps
   and timestamp echoes, in order to generate upstream and downstream
   RTT samples which can be added to get end-to-end RTT. These use
   cases could be achieved with QUIC by replacing sequence/
   acknowledgement and timestamp analysis with spin bit analysis, as
   described in Section 3.

   In any case, the measurement methodology follows one of a few basic
   variants:

   o  The RTT evolution of a flow or a set of flows can be compared to
      baseline or expected RTT measurements for flows with the same
      characteristics in order to detect or localize latency issues in a
      specific network.

   o  The RTT evolution of a single flow can also be examined in detail
      to diagnose performance issues with that flow.

   o  The spin bit can be used to generate a large number of samples of
      RTT for a flow aggregate (e.g., all flows between two given
      networks) without regard to temporal evolution of the RTT, in
      order to examine the distribution of RTTs for a group of flows
      that should have similar RTT (e.g., because they should share the
      same path(s)).

4.1.  Inter-domain Troubleshooting

   Network access providers are often the first point of contact for
   their customers when network problems impact the performance of
   bandwidth-intensive and latency-sensitive applications such as video,
   regardless of whether the root cause lies within the access
   provider's network, the service provider's network, on the Internet
   paths between them, or within the customer's own network.

   Network performance is currently measured by on-path points of
   presence, which extract spatial delay and loss metrics [RFC6049]
   from fields of the transport layer (e.g. TCP) or of the application
   layer (e.g. RTP). This information is captured in the upper layers
   because neither the IP header nor the UDP header includes fields
   that allow the measurement of upstream and downstream delay and
   loss.

   Local network performance problems are detected with monitoring tools
   which observe the variation of upstream metrics and downstream
   metrics.

   Inter-domain troubleshooting relies on the same metrics but is not a
   pro-active task. It is a recursive process which homes in on the
   domain and link responsible for the failure. In practice, inter-
   domain troubleshooting is a communication process between the Network
   Operations Center (NOC) teams of the networks on the path, because
   the root cause of a problem is rarely located on a single network,
   so finding it requires cooperation and exchange of data between the
   NOCs.

   One example is troubleshooting a performance degradation resulting
   from a change of routing policy on one side of the path, which
   increases the burden on a defective line card of a device located
   somewhere on the path. The card's misbehavior introduces abnormal
   packet reordering only in the traffic exchanged at line rate.

   Other examples are similar in terms of cooperation requirements and
   the need to refer to measurements. NOCs need to share the same
   measurement metrics and to measure these metrics on the same fields
   of the packet to enable a minimal level of technical cooperation.

   Experimentation with the spin bit (Section 3.3) has shown that it can
   replace the current RTT measurement opportunities based on clear-text
   transport or application header fields with a standard approach for
   passively measuring upstream and downstream RTT, which are
   fundamental metrics for this diagnostic process.

4.2.  Two-Point Intradomain Measurement

   The spin bit is also useful as a basic signal for instantaneous
   measurement of the treatment of QUIC traffic within a single network.
   Though the primary design goal of the spin bit signal is to enable
   single-observer on-path measurement of end-to-end RTT, the spin bit
   can also be used by two cooperating observers with access to traffic
   flowing in the same direction as an alternate marking signal, as
   described in [ALT-MARK]. The only difference from alternate marking
   with a generated signal is that the size of the alternation will
   change with the flight size each RTT. However, these changes do not
   affect the applicability of the method, which works on each marking
   batch separately between two measurement points in the same
   direction. This two-point measurement is an additional feature
   enabled "for free" by the spin bit signal.
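
   As a rough illustration of the two-observer idea, the Python sketch
   below treats each run of packets with the same spin value as one
   marking batch and compares batches seen at two points in the same
   direction. It is a simplified sketch, not the [ALT-MARK] procedure
   itself: the function names batches() and segment_stats() and the
   trace format are hypothetical, and it assumes both observers see the
   flow from its first packet, that no batch is lost entirely, and that
   the first packet of each batch reaches both observers in order.

```python
from typing import List, Tuple

Packet = Tuple[float, int]  # (observation timestamp, spin bit)
Batch = Tuple[float, int]   # (timestamp of first packet, packet count)


def batches(trace: List[Packet]) -> List[Batch]:
    """Group consecutive packets with the same spin value into batches."""
    out: List[Batch] = []
    prev = None
    for ts, spin in trace:
        if spin != prev:
            # The spin value changed: a new marking batch starts here.
            out.append((ts, 1))
            prev = spin
        else:
            start, count = out[-1]
            out[-1] = (start, count + 1)
    return out


def segment_stats(first: List[Packet],
                  second: List[Packet]) -> List[Tuple[float, int]]:
    """Per-batch (delay, packets lost) between two observers.

    Delay is taken between the first packets of matching batches and
    loss as the difference in per-batch packet counts.
    """
    return [
        (b_ts - a_ts, a_n - b_n)
        for (a_ts, a_n), (b_ts, b_n) in zip(batches(first), batches(second))
    ]
```

   Because batch k at the first observer is matched to batch k at the
   second purely by position, any violation of the assumptions above
   misaligns the comparison; a deployable tool would need a more robust
   way to match batches.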

   With more than one observer in the same direction, it can be useful
   to segment the RTT and deduce the contribution to the RTT of the
   portion of the network between two on-path observers. This can be
   easily performed by calculating the delay between two or more
   measurement points in a single direction by applying [ALT-MARK]. In
   this way, packet loss, delay and delay variation can be measured for
   each segment of the network depending on the number and distribution
   of the available on-path observation points. When these observation
   points are applied at network borders, the alternate-marking signal
   can be used to measure the performance of QUIC traffic within a
   network operator's own domain of responsibility.

4.3.  Bufferbloat Mitigation in Cellular Networks

   Cellular networks consist of multiple Radio Access Networks (RANs)
   where mobile devices are attached to base stations. It is common
   that base stations from different vendors and different generations
   are deployed in the same cellular network.

   Due to the dynamic nature of RANs, base stations have typically been
   provisioned with large buffers to maximize throughput despite rapid
   changes in capacity. As a side effect, bufferbloat has become a
   common issue in such networks [WWMM-BLOAT].

   An effective way of mitigating bufferbloat without sacrificing too
   much throughput is to deploy Active Queue Management (AQM) in
   bottleneck routers and base stations. However, due to the variation
   in deployed base stations it is not always possible to enable AQM at
   the bottlenecks without massive infrastructure investments.

   An alternative approach is to deploy AQM as a network function in a
   more centralized location than the traditional bottleneck nodes.
   Such an AQM monitors the RTT progression of flows and drops or marks
   packets when the measured latency is indicative of congestion.
   Such a function can also detect misbehaving flows and reduce the
   negative impact they have on the network.

4.4.  Locating WiFi Problems in Home Networks

   Many residential networks use WiFi (802.11) on the last segment, and
   WiFi signal strength degradation manifests in high first-hop delay,
   due to the fact that the MAC layer will retransmit packets lost at
   that layer. Measuring the RTT between endpoints on the customer
   network and parts of the service provider's own infrastructure (which
   have predictable delay characteristics) can be used to isolate this
   cause of performance problems.

   The network provider can measure the RTT and packet loss in the home
   gateway, or at an upstream point if there is no access to the home
   gateway. A problem in the WiFi network is identified by seeing high
   delay and low packet loss.

   These measurements are particularly useful for traffic which is
   latency sensitive, such as interactive video applications. However,
   since high latency is often correlated with other network-layer
   issues such as chronic interconnect congestion [IMC-CONGESTION],
   they are also useful for general troubleshooting of network layer
   issues in an interdomain setting.

   In this case, multiple RTT samples per flow are useful less for
   observing intraflow behavior, and more for generating sufficient
   samples for a given aggregate to make a high-quality measurement.

4.5.  Internet Measurement Research

   As a large, distributed, engineered system with no centralized
   control, the Internet has emergent properties of interest to the
   research community not just for purely scientific curiosity, but also
   to provide applicable guidance to Internet engineering, Internet
   protocol design and development, network operations, and policy
   development.
   Latency measurements in particular are both an active area of
   research and an important tool for certain measurement studies (see,
   e.g. [IMC-TCPSIG], from the most recent Internet Measurement
   Conference). While much of this work is currently done with active
   measurements, the ability to generate latency samples passively or
   using a hybrid measurement approach (i.e., through passive
   observation of purpose-generated active measurement traffic; see
   [RFC7799]) can drastically increase the efficiency and scalability
   of these studies. A latency spin bit would make these techniques
   applicable to QUIC, as well.

5.  Alternate RTT Measurement Approaches for Diagnosing QUIC flows

   There are three broad alternatives to explicit signaling for passive
   measurement of the RTT experienced by QUIC flows.

5.1.  Handshake RTT measurement

   The first of these is handshake RTT measurement. As described in
   [QUIC-MGT], the packets of the QUIC handshake are distinguishable on
   the wire in such a way that they can be used for one RTT measurement
   sample per flow: the delay between the client initial and the server
   cleartext packet can be used to measure "upstream" RTT (between the
   observer and the server), and the delay between the server cleartext
   packet and the next client cleartext packet can be used to measure
   "downstream" RTT (between the client and the observer). When RTT
   measurements are used in large aggregates (all flows traversing a
   large link, for example), a methodology based on handshake RTT could
   be used to generate sufficient samples for some purposes without the
   spin bit.

   However, this methodology would rely on the assumption that the
   difference between handshake RTT and nominal in-flow RTT is
   negligible.
Specifically, (1) any additional delay required to compute any cryptographic parameters must be negligible with respect to network RTT; (2) any additional delay required to establish state along the path must be negligible with respect to network RTT; and (3) network treatment of initial packets in a flow must be identical to that of later packets in the flow. When these assumptions cannot be shown to hold, spin-bit based RTT measurement is preferable to handshake RTT measurement, even for applications for which handshake RTT measurement would otherwise be suitable.

5.2.  Parallel active measurement

The second alternative is parallel active measurement: using ICMP Echo Request and Reply [RFC0792] [RFC4433], a dedicated measurement protocol like TWAMP [RFC5357], or a separate diagnostic QUIC flow to measure RTT. Regardless of protocol, the active measurement must be initiated by a client on the same network as the client of the QUIC flow(s) of interest, or a network close by in the Internet topology, toward the server. Note that there is no guarantee that ICMP flows will receive the same network treatment as the flows under study, both due to differential treatment of ICMP traffic and due to ECMP routing (see, e.g., [TOKYO-PING]). TWAMP and QUIC diagnostic flows, though both use UDP, have similar issues regarding ECMP. However, in situations where the entity doing the measurement can guarantee that the active measurement traffic will traverse the subpaths of interest (e.g., residential access network measurement under a network architecture and business model where the network operator owns the CPE), active measurement can be used to generate RTT samples at the cost of at least two non-productive packets sent through the network per sample.

5.3.
Frequency Analysis

The third alternative, proposed during the QUIC RTT design team process, relies on the inter-packet spacing to convey information about the RTT, and would therefore allow measurements confined to a single direction of transmission, as described in [CARRA-RTT].

We evaluated the applicability of this work to passive RTT measurement in QUIC, and found it wanting. We assembled a toolchain, as described in [NOSPIN], that allowed evaluation of a critical aspect of the [CARRA-RTT] method: extraction of inter-packet times of real packet streams and the analysis of frequencies present in the packet stream using the Lomb-Scargle Periodogram. Several streams were evaluated, as summarized below:

o  It seems that Carra et al. [CARRA-RTT] took the noisy and low-confidence results of a statistical process (no RTT-related frequency has been detected even after using very low alpha confidence) and added heuristics with sliding-window averaging to infer the fundamental frequency and RTT present in a unidirectional stream.

o  There appear to be several limitations on the streams that are applicable. Streams with long RTT (~50 ms) are more likely to be suitable (having a better match between packet rate and relatively low frequencies to detect).

o  None of the TCP streams analysed (to date) possess a sufficient packet rate such that the measured fundamental frequency or the multiples of the fundamental are actually within the detectable range.

o  "Ideal" interarrival time streams were simulated with uniform sampling and period. The Lomb-Scargle Periodogram is surprisingly unable to detect the fundamental frequency at 100 Hz from the constant 10 ms packet spacing.

o  It is not clear whether IETF QUIC protocol streams will possess the same inter-packet arrival time features as TCP streams. Also, Carra et al.
note that their process may not work if the TCP stream encounters a bottleneck, which would be an essential circumstance for network troubleshooting. Mobile networks with time-slot service disciplines would likely cause issues similar to those of a bottleneck, by imposing their time-slot interval on the spacing of most packets.

o  The Carra et al. [CARRA-RTT] calculation of minimum and maximum frequencies that can be detected may not be applicable when the inter-arrival times both are the signal being detected and govern the non-uniform sampling frequency.

6.  Greasing

Routes, congestion levels and therefore latency between two fixed QUIC endpoints, as well as the shape of individual application flows, fluctuate in ways that are not totally predictable by an on-path observer. In general, there is no a priori pattern for the spin-bit distribution that will always materialise on a certain flow aggregate, even for a single user.

There has been discussion in the QUIC working group that greasing could be a strategy to counter an evil access provider that might gate access to its users on a valid spin bit signal. Let's accept for a moment this threat model and consider the practical case of a home gateway that temporarily misbehaves, for example draining its queues more slowly than it normally would while a firmware download is in progress. It would be ill-considered for an access provider (even a malicious one) to block, or otherwise interfere with, QUIC flows originating from behind that CPE solely based on the fact that RTTs are now different from "usual". In fact, providing a numerical assessment of what such "usual" RTT looks like would necessarily include many paths with different lengths, and considerable RTT variability within any fixed path, which is clearly beyond most ISPs' reach.
But even assuming it were, there is a simple cost-benefit counterargument here that the same effect (i.e., gating traffic from or to a given user based on observed traffic patterns) could be achieved with much cheaper and more effective means (e.g., [SHBAIR]).

So, the potential for ossification appears to be extremely low. Since it depends on so much external noise, the spin-bit result variability is self-greasing to an extent. In fact, implementing explicit greasing around the spin-bit might even be harmful as it would potentially erode confidence in the veracity of the signal.

However, if a greasing algorithm is really needed (for example, if we want to reuse the bit with different semantics in the future, i.e., the spin-bit is not included in the header invariants), one very simple implementation would be as follows: each server will refuse to spin its bit on a per-flow basis with a given probability p, instead leaving it stuck to a randomly chosen value, 0 or 1. The client will then end up leaving its bit stuck to the opposite value, or could detect this condition and also pick a randomly chosen stuck value. The value chosen for p must be small enough to let the spin-bit mechanics work, and large enough that a stuck bit is not seen as an error rather than an intentional protocol feature.

7.  Privacy and Security Considerations

The privacy considerations for the latency spin bit are essentially the same as those for passive RTT measurement in general.

A concern was raised during the discussion of this feature within the QUIC working group and the QUIC RTT Design Team that high-resolution RTT information might be usable for geolocation.
However, an evaluation based on RTT samples taken over 13,780 paths in the Internet from RIPE Atlas anchoring measurements [TRILAT] shows that the magnitude and uncertainty of RTT data limit the resolution of geolocation information that can be derived from Internet RTT to national or continental scale; i.e., less resolution than is generally available from free, open IP geolocation databases.

One reason for the inaccuracy of geolocation from network RTT is that Internet backbone transmission facilities do not follow the great-circle path between major nodes. Instead, major geographic features and the efficiency of connecting adjacent major cities both influence the facility routing. An evaluation of ~3500 measurements on a mesh of 25 backbone nodes in the continental United States shows that 85% had an RTT-to-great-circle error of 3 ms or more, making location within US state boundaries ambiguous [CONUS].

Therefore, in the general case, when an endpoint's IP address is known, RTT information provides negligible additional information.

RTT information may be used to infer the occupancy of queues along a path; indeed, this is part of its utility for performance measurement and diagnostics. When a link on a given path has excessive buffering (on the order of hundreds of milliseconds or more), such that the difference in delay between an empty queue and a full queue dwarfs normal variance and RTT along the path, RTT variance during the lifetime of a flow can be used to infer the presence of traffic on the bottleneck link. In practice, however, this is not a concern for passive measurement of congestion-controlled traffic, since any observer in a situation to observe RTT passively need not infer the presence of the traffic, as it can observe it directly.
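
The queue-occupancy inference described above can be sketched as follows. This is a minimal illustration only, assuming that the smallest RTT sample seen on a flow approximates the empty-queue path delay; the function names and the 100 ms threshold are hypothetical and do not come from this document.

```python
from typing import List


def queueing_delay_estimates(rtt_ms: List[int]) -> List[int]:
    """Return each RTT sample's excess over the smallest sample.

    Assumption: the minimum observed RTT approximates the path delay
    with empty queues, so the excess approximates queueing delay.
    """
    if not rtt_ms:
        return []
    base = min(rtt_ms)
    return [sample - base for sample in rtt_ms]


def queue_busy_flags(rtt_ms: List[int], threshold_ms: int) -> List[bool]:
    """Flag samples whose estimated queueing delay reaches the threshold."""
    return [excess >= threshold_ms
            for excess in queueing_delay_estimates(rtt_ms)]


# Hypothetical per-flow RTT samples in milliseconds, on a path with a
# heavily buffered bottleneck link.
samples = [40, 42, 41, 240, 310, 45]
excess = queueing_delay_estimates(samples)
busy = queue_busy_flags(samples, threshold_ms=100)
```

With these made-up samples, only the two samples that are hundreds of milliseconds above the minimum are flagged, which matches the condition stated above that the signal is usable only when buffering dwarfs normal RTT variance.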

In addition, since RTT information contains application as well as network delay, patterns in RTT variance from minimum, and therefore application delay, can be used to infer or fingerprint application-layer behavior. However, as with the case above, this is not a concern with passive measurement, since the packet size and interarrival time sequence, which is also directly observable, carries more information than the RTT variance sequence.

We therefore conclude that the high-resolution, per-flow exposure of RTT for passive measurement as provided by the spin bit poses negligible marginal risk to privacy.

As shown in Section 2, the spin bit can be implemented separately from the rest of the mechanisms of the QUIC transport protocol, as it requires no access to any state other than that observable in the QUIC packet header itself. We recommend that implementations take advantage of this property, to reduce the risk that errors in the implementation could leak private transport protocol state through the spin bit.

Since the spin bit is disconnected from transport mechanics, a QUIC endpoint implementing the spin bit that has a model of the actual network RTT and a target RTT to expose can "lie" about its spin bit transitions, by anticipating or delaying observed transitions, even without coordination with, or the collusion of, the other endpoint. This is not the case with TCP, which requires coordination and collusion to expose false information via its sequence and acknowledgment numbers and its timestamp option. When passive measurement is used for purposes where one endpoint might gain a material advantage by representing a false RTT, e.g., SLA verification or enforcement of telecommunications regulations, this situation raises a question about the trustworthiness of spin bit RTT measurements.
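
To illustrate why an on-path observer cannot tell an honest spin signal from a manipulated one, the following minimal sketch assumes an observer that sees (time, spin value) pairs in one direction and takes the time between successive spin edges as one RTT sample. The function names, the synthetic sender, and all numeric values are hypothetical and serve only as an illustration; times are integer milliseconds to keep the arithmetic exact.

```python
from typing import Iterable, List, Tuple


def spin_edge_intervals(packets: Iterable[Tuple[int, int]]) -> List[int]:
    """Return the time between successive spin-bit edges in one direction.

    `packets` holds (arrival_time_ms, spin_bit) pairs in arrival order.
    Each returned interval is one RTT sample as seen by the observer.
    """
    samples = []
    last_spin = None
    last_edge_ms = None
    for t_ms, spin in packets:
        if last_spin is not None and spin != last_spin:
            # The spin value changed: this packet marks an edge.
            if last_edge_ms is not None:
                samples.append(t_ms - last_edge_ms)
            last_edge_ms = t_ms
        last_spin = spin
    return samples


def synthetic_stream(edge_period_ms: int, packet_gap_ms: int,
                     duration_ms: int) -> List[Tuple[int, int]]:
    """Hypothetical sender whose spin bit flips every `edge_period_ms`."""
    return [(t_ms, (t_ms // edge_period_ms) % 2)
            for t_ms in range(0, duration_ms, packet_gap_ms)]


# Honest endpoint on a 40 ms path: edges follow the real RTT.
honest = spin_edge_intervals(synthetic_stream(40, 5, 400))
# Endpoint that delays its transitions to expose 60 ms on the same path.
lying = spin_edge_intervals(synthetic_stream(60, 5, 400))
```

Both streams yield a clean, self-consistent series of samples (40 ms and 60 ms respectively), so nothing in the spin signal itself reveals the manipulation to the observer.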

This issue must be appreciated by users of spin bit information, but mitigation is simple, as QUIC implementations designed to lie about RTT through spin bit modification can easily be detected. A lying server can be contacted by an honest client under the control of a verifying party, and the client's RTT estimate compared with the spin-bit exposed estimate. Though in the general case, it is impossible to verify explicit path signals with two complicit endpoints (see [WIRE-IMAGE]), a lying server/client pair may be subject to dynamic analysis along paths with known RTTs. We consider the ease of verification of lying in situations where this would be prohibited by regulation or contract, combined with the consequences of violation of said regulation or contract, to be a sufficient incentive in the general case not to do it.

8.  Acknowledgments

Many thanks to Christian Huitema, who originally proposed the spin bit as pull request 609 on [QUIC-TRANS]. Thanks to Tobias Buehler for feedback on the draft. Special thanks to the QUIC RTT Design Team for discussions leading especially to the measurement limitations and privacy and security considerations sections.

This work is partially supported by the European Commission under Horizon 2020 grant agreement no. 688421 Measurement and Architecture for a Middleboxed Internet (MAMI), and by the Swiss State Secretariat for Education, Research, and Innovation under contract no. 15.0268. This support does not imply endorsement.

9.  Informative References

[ALT-MARK]  Fioccola, G., Capello, A., Cociglio, M., Castaldelli, L., Chen, M., Zheng, L., Mirsky, G., and T. Mizrahi, "Alternate Marking method for passive and hybrid performance monitoring", draft-ietf-ippm-alt-mark-14 (work in progress), December 2017.

[CACM-TCP]  Strowes, S., "Passively Measuring TCP Round-Trip Times (in Communications of the ACM)", October 2013.

[CARRA-RTT]  Carra, D., Avrachenkov, K., Alouf, S., Blanc, A., Nain, P., and G. Post, "Passive Online RTT Estimation for Flow-Aware Routers Using One-Way Traffic (NETWORKING 2010, LNCS 6091, pp. 109-121)", 2010.

[CONUS]  Morton, A., "Comparison of Backbone Node RTT and Great Circle Distances (https://github.com/acmacm/CONUS-RTT)", September 2017.

[IMC-CONGESTION]  Luckie, M., Dhamdhere, A., Clark, D., Huffaker, B., and k. claffy, "Challenges in Inferring Internet Interdomain Congestion (in Proc. ACM IMC 2014)", November 2014.

[IMC-TCPSIG]  Sundaresan, S., Dhamdhere, A., Allman, M., and k. claffy, "TCP Congestion Signatures (in Proc. ACM IMC 2017)", n.d.

[MINQ]  Rescorla, E., "MINQ, a simple Go implementation of QUIC (https://github.com/ekr/minq)", November 2017.

[MOKUMOKUREN]  Trammell, B., "Mokumokuren, a lightweight flow meter using gopacket (https://github.com/britram/mokumokuren)", November 2017.

[NOSPIN]  Morton, A., "Description of a tool chain to evaluate Unidirectional Passive RTT measurement (and results) (https://github.com/acmacm/PassiveRTT)", October 2017.

[QUIC-MGT]  Kuehlewind, M. and B. Trammell, "Manageability of the QUIC Transport Protocol", draft-ietf-quic-manageability-01 (work in progress), October 2017.

[QUIC-TRANS]  Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed and Secure Transport", draft-ietf-quic-transport-08 (work in progress), December 2017.

[RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5, RFC 792, DOI 10.17487/RFC0792, September 1981.

[RFC4433]  Kulkarni, M., Patel, A., and K. Leung, "Mobile IPv4 Dynamic Home Agent (HA) Assignment", RFC 4433, DOI 10.17487/RFC4433, March 2006.

[RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov, S., and J. Perser, "Packet Reordering Metrics", RFC 4737, DOI 10.17487/RFC4737, November 2006.

[RFC5357]  Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J. Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)", RFC 5357, DOI 10.17487/RFC5357, October 2008.

[RFC6049]  Morton, A. and E. Stephan, "Spatial Composition of Metrics", RFC 6049, DOI 10.17487/RFC6049, January 2011.

[RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799, May 2016.

[SHBAIR]  Shbair, W., Cholez, T., Francois, J., and I. Chrisment, "A multi-level framework to identify HTTPS services (in Proc. IEEE/IFIP NOMS)", April 2016.

[SPINBIT-REPORT]  De Vaere, P., "Latency Spinbit Implementation Experience (https://devae.re/f/eth/quic/spinbit_report/)", November 2017.

[TLS]  Rescorla, E., "The Transport Layer Security (TLS) Protocol Version 1.3", draft-ietf-tls-tls13-22 (work in progress), November 2017.

[TMA-QOF]  Trammell, B., Gugelmann, D., and N. Brownlee, "Inline Data Integrity Signals for Passive Measurement (in Proc. TMA 2014)", April 2014.

[TOKYO-PING]  Pelsser, C., Cittadini, L., Vissicchio, S., and R. Bush, "From Paris to Tokyo - On the Suitability of ping to Measure Latency (In Proc. ACM IMC 2014)", October 2014.

[TRILAT]  Trammell, B., "On the Suitability of RTT Measurements for Geolocation (https://github.com/britram/trilateration/blob/paper-rev-1/paper.ipynb)", August 2017.

[WIRE-IMAGE]  Trammell, B. and M. Kuehlewind, "The Wire Image of a Network Protocol", draft-trammell-wire-image-01 (work in progress), December 2017.

[WWMM-BLOAT]  Alfredsson, S., Giudice, G., Garcia, J., Brunstrom, A., Cicco, L., and S.
Mascolo, "Impact of TCP Congestion Control on Bufferbloat in Cellular Networks (in Proc. IEEE WoWMoM 2013)", June 2013.

Authors' Addresses

Brian Trammell (editor)
ETH Zurich

Email: ietf@trammell.ch

Piet De Vaere
ETH Zurich

Email: piet@devae.re

Roni Even
Huawei

Email: roni.even@huawei.com

Giuseppe Fioccola
Telecom Italia

Email: giuseppe.fioccola@telecomitalia.it

Thomas Fossati
Nokia

Email: thomas.fossati@nokia.com

Marcus Ihlar
Ericsson

Email: marcus.ihlar@ericsson.com

Al Morton
AT&T Labs

Email: acmorton@att.com

Emile Stephan
Orange

Email: emile.stephan@orange.com