TSVWG                                                        M. Cociglio
Internet-Draft                                            Telecom Italia
Intended status: Experimental                                G. Fioccola
Expires: May 7, 2020                                 Huawei Technologies
                                                           F. Bulgarella
                                                                R. Sisto
                                                   Politecnico di Torino
                                                        November 4, 2019


      New Spin bit enabled measurements with one or two more bits
              draft-cfb-tsvwg-spinbit-new-measurements-00

Abstract

   This document introduces additional measurements that use the same
   spin bit signal defined in [I-D.trammell-tsvwg-spin] and
   [I-D.trammell-ippm-spin].  The spin bit signal alone is not enough
   to correctly evaluate the RTT of a flow in every network condition.
   To solve this problem, this document proposes an additional
   validation signal called the delay bit, similar to the Valid Edge
   Counter (VEC) but using just one bit instead of two.  An alternative
   with two bits is also introduced, with a so-called loss bit.  More
   generally, a loss signal is defined to measure packet loss, and two
   alternatives are presented, with one bit and with two bits.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 7, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Spin bit and Delay bit mechanism
     2.1.  Delay Sample generation
       2.1.1.  The recovery process
     2.2.  Delay Sample reflection
   3.  Using the Spin bit and Delay bit for Hybrid RTT Measurement
     3.1.  End-to-end RTT measurement
     3.2.  Half-RTT measurement
     3.3.  Intra-domain RTT measurement
   4.  Observer's algorithm and Waiting Interval
   5.  Adding a Loss signal for Packet loss measurement
     5.1.  Round Trip Packet Loss measurement
   6.  RTT dependent Packet Loss using one bit loss signal
     6.1.  Observer's logic for one bit loss signal
   7.  RTT independent Packet Loss using two bits loss signal
     7.1.  Observer's logic for two bits loss signal
   8.  Protocols
     8.1.  QUIC
     8.2.  TCP
   9.  Security Considerations
   10. Acknowledgements
   11. IANA Considerations
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   Both [I-D.trammell-tsvwg-spin] and [I-D.trammell-ippm-spin] define an
   explicit per-flow transport-layer signal for hybrid measurement of
   end-to-end RTT.  This signal consists of three bits: a spin bit,
   which oscillates once per end-to-end RTT, and a two-bit Valid Edge
   Counter (VEC), which compensates for loss and reordering of the spin
   bit to increase fidelity of the signal in less than ideal network
   conditions.

   This document introduces the delay bit, a single-bit signal that
   passive observers can use together with the spin bit to measure the
   RTT of a network flow, avoiding the spin bit ambiguities that arise
   as soon as network conditions deteriorate.  Unlike the spin bit,
   which is set in every packet transmitted on the network, the delay
   bit is set only once per round trip.

   This document defines a hybrid measurement (RFC 7799 [RFC7799]) path
   signal to be embedded into a transport layer protocol, explicitly
   intended for exposing end-to-end RTT to measurement devices on path.

   The document introduces a mechanism applicable to any transport-layer
   protocol, then explains how to bind the signal to a variety of IETF
   transport protocols, and in particular to QUIC and TCP.

   The application of the spin bit to QUIC is described in
   [I-D.ietf-quic-spin-exp], which adds the spin bit only (without the
   VEC) to QUIC for experimentation purposes.

   Note that both the spin bit and the delay bit are inspired by RFC
   8321 [RFC8321].
   This is also mentioned in [I-D.trammell-quic-spin].

2.  Spin bit and Delay bit mechanism

   The main idea is to have a single packet, with a second marked bit
   (the delay bit), that bounces between client and server during the
   entire lifetime of the connection.  This single packet is called the
   Delay Sample.

   A simple observer placed at an intermediate point, tracking the delay
   sample and the corresponding timestamp in every spin bit period, can
   measure the end-to-end round trip delay of the connection.  As with
   the spin bit and the VEC, it is possible to carry out other types of
   measurements.  The next paragraphs give an overview of the observer
   capabilities.

   To describe the delay sample mechanism in detail, two phases of the
   delay bit lifetime have to be distinguished: initialization and
   reflection.  The initialization is the generation of the delay
   sample, while the reflection realizes the bouncing of this single
   packet between the two endpoints.

   The next figure describes the Delay bit mechanism: the first bit is
   the spin bit and the second one is the delay bit.

      +--------+   --  --  --  --  --   +--------+
      |        |      ----------->      |        |
      | Client |                        | Server |
      |        |      <-----------      |        |
      +--------+   --  --  --  --  --   +--------+

      (a) No traffic at beginning.

      +--------+   00  00  01  --  --   +--------+
      |        |      ----------->      |        |
      | Client |                        | Server |
      |        |      <-----------      |        |
      +--------+   --  --  --  --  --   +--------+

      (b) The Client starts sending data and
       sets the first packet as Delay Sample.

      +--------+   00  00  00  00  00   +--------+
      |        |      ----------->      |        |
      | Client |                        | Server |
      |        |      <-----------      |        |
      +--------+   --  --  01  00  00   +--------+

      (c) The Server starts sending data
       and reflects the Delay Sample.

      +--------+   10  10  11  00  00   +--------+
      |        |      ----------->      |        |
      | Client |                        | Server |
      |        |      <-----------      |        |
      +--------+   00  00  00  00  00   +--------+

      (d) The Client inverts the spin bit and
       reflects the Delay Sample.

      +--------+   10  10  10  10  10   +--------+
      |        |      ----------->      |        |
      | Client |                        | Server |
      |        |      <-----------      |        |
      +--------+   00  00  11  10  10   +--------+

      (e) The Server reflects the Delay Sample.

      +--------+   00  00  01  10  10   +--------+
      |        |      ----------->      |        |
      | Client |                        | Server |
      |        |      <-----------      |        |
      +--------+   10  10  10  10  10   +--------+

      (f) The Client inverts the spin bit
       again and reflects the Delay Sample.

                    Figure 1: Spin bit and Delay bit

2.1.  Delay Sample generation

   During this first phase, the endpoints play different roles.  First
   of all, exactly one delay sample must be bouncing per round trip
   period (and therefore per spin bit period).  For this reason, and to
   simplify the general algorithm, just one of the two endpoints is in
   charge of generating the delay sample:

   o  the client, when the connection starts and the spin bit is set to
      0, initializes the delay bit of the first packet to 1, so that it
      becomes the delay sample for that marking period.  Only this
      packet is marked with the delay bit set to 1 for this round trip
      period; the other ones carry only the spin bit;

   o  the server never initializes the delay bit to 1; its only task is
      to reflect the incoming delay bit into the next outgoing packet,
      and only if certain conditions occur.

   Theoretically, in the absence of network impairments, the delay
   sample should bounce between client and server continuously, for the
   entire duration of the connection.
   In practice, that is highly unlikely, mainly for two reasons:

   1)  the packet carrying the delay bit might be lost during its
       journey on the network, which is unreliable by definition;

   2)  one of the two endpoints could stop or delay sending data because
       the application is limiting the amount of traffic transmitted.

   To deal with these problems, the algorithm provides a procedure to
   regenerate the delay sample and to inform a possible observer that a
   problem has occurred and that the measurement has to be restarted.

2.1.1.  The recovery process

   In order to relieve the server from tasks that go beyond the mere
   reflection of the sample, the recovery process also belongs to the
   client.  A fundamental assumption is that a delay sample is strictly
   related to its spin bit period.  Following this rule, the client
   verifies that every spin bit period ends with its delay sample.  If
   that does not happen and a marking period terminates without a delay
   sample, the client waits a further empty period; then, in the
   following period, it reinitializes the mechanism by setting the delay
   bit of the first outgoing packet to 1, making it the new delay
   sample.  The empty period is needed to inform the intermediate points
   that there was an issue and that a new delay measurement session is
   starting.

2.2.  Delay Sample reflection

   The reflection is the process that enables the bouncing of the delay
   sample between client and server.  The behavior of the two endpoints
   is slightly different.  By default the delay bit is set to 0; the
   only exception is the client generating a new delay sample, as
   described above.

   Server side reflection: when a packet with the delay bit set to 1
   arrives, the server marks the first packet in the opposite direction
   as the delay sample, if it has the same spin bit value.
   If it has the opposite spin bit value, the sample is considered lost.

   Client side reflection: when a packet with the delay bit set to 1
   arrives, the client marks the first packet in the opposite direction
   as the delay sample, if it has the opposite spin bit value.  If it
   has the same spin bit value, the sample is considered lost.

   In both cases, if the outgoing marked packet is transmitted with a
   delay greater than a predetermined threshold after the reception of
   the incoming delay sample (1ms by default), reflection is aborted and
   the sample is considered lost.

   It is noteworthy that, unlike the VEC, for which the reflection
   always concerns the edge of the period, here reflection takes place
   for the packet that is carrying the delay bit regardless of its
   position within the period.  For this reason the spin bit validation
   condition above is necessary, in order to identify and discard those
   samples that, due to reordering, might move to a contiguous period.
   Furthermore, the threshold on the retransmission delay of the sample
   eliminates all those measurements which, due to lack of traffic on
   the endpoints, would be overestimated.  Thus, the maximum estimation
   error, without considering any other delays due to flow control,
   would amount to twice the threshold (e.g. 2ms) per measurement, in
   the worst case.

3.  Using the Spin bit and Delay bit for Hybrid RTT Measurement

   With the spin bit alone it is necessary to validate, or at least
   heuristically evaluate, the goodness of an edge.  The delay sample,
   instead, can be used by an intermediate observer as a simple
   demarcator between a period and the following one, eliminating the
   ambiguities in the calculation of the RTT found when analyzing the
   spin bit only.
   The measurement types that can be obtained by observing the delay
   sample are exactly the same as those achievable with the spin bit
   only (with or without the VEC).

3.1.  End-to-end RTT measurement

   The delay sample generation process ensures that only one packet
   marked with the delay bit set to 1 runs back and forth on the wire
   between two endpoints per round trip time.  Therefore, in order to
   determine the end-to-end RTT of a QUIC flow, an on-path passive
   observer can simply compute the time difference between two delay
   samples observed in a single direction.  Note that a measurement, to
   be valid, must use the timestamps of two consecutive delay samples
   belonging to adjacent spin-bit periods.  For this reason, an
   observer, in addition to intercepting and analyzing the packets
   containing the delay bit set to 1, must maintain awareness of each
   spin period, so as to be able to assign each delay sample to its
   period and, at the same time, identify those periods that do not
   contain one.

3.2.  Half-RTT measurement

   An on-path passive observer that is sniffing traffic in both
   directions -- from client to server and from server to client -- can
   also use the delay sample to measure the "upstream" and "downstream"
   RTT components.  Also known as the half-RTT measurement, it
   represents the components of the end-to-end RTT concerning the paths
   between the client and the observer (upstream), and between the
   observer and the server (downstream).  The observer does this by
   measuring the delay between a delay sample observed in the downstream
   direction and the one observed in the upstream direction, and vice
   versa.  Also in this case, it should verify that the two delay
   samples belong to two adjacent periods, for the upstream component,
   or to the same period, for the downstream component.

3.3.  Intra-domain RTT measurement

   Taking advantage of the half-RTT measurements, it is also possible to
   calculate the intra-domain RTT, which is the portion of the entire
   RTT used by a QUIC flow to traverse the network of a provider (or
   part of it).  To achieve this result, two observers, able to watch
   traffic in both directions, must be employed simultaneously at the
   ingress and egress of the network to be measured.  At this point, to
   determine the delay between the two observers, it is enough to
   subtract the two computed upstream (or downstream) RTT components.

   The spin bit is a signal generated by alternate marking, and the only
   difference from RFC 8321 [RFC8321] is the size of the alternation,
   which changes with the flight size each RTT.  So it can be useful to
   segment the RTT and deduce the contribution to the RTT of the portion
   of the network between two on-path observers.  This can be easily
   done by calculating the delay between two or more measurement points
   in a single direction by applying RFC 8321 [RFC8321].

4.  Observer's algorithm and Waiting Interval

   Given below is a formal summary of how the observer behaves every
   time a delay sample is detected.  A packet containing the delay bit
   set to 1:

   o  if it has the same spin bit value as the current period and no
      delay sample was detected in the previous period, then it can be
      used as a left edge (i.e. to start measuring an RTT sample), but
      not as a right edge (i.e. to complete an RTT measurement since
      the last edge).  If the observation point is symmetric (i.e. it
      can see both upstream and downstream packets in the flow) and in
      the current period a delay sample was detected in the opposite
      direction (i.e. in the upstream direction), the packet can also be
      used to compute the downstream RTT component.

   o  if it has the same spin bit value as the current period and a
      delay sample was detected in the previous period, then it can be
      used as both a left and a right edge, and to compute the RTT
      components in both directions.

   As stated previously, every time an empty period is detected, the
   observer must restart the measurement process and consider the next
   delay sample as the beginning of a new measurement, and therefore as
   a left edge.  As a result, being able to assign the delay sample to
   the corresponding spin period becomes a crucial factor for the proper
   functioning of the entire algorithm.

   Since the division into periods is obtained from the spin bit square
   wave, the presence of spurious spin edges -- caused by packet
   reordering -- would inevitably lead the observer to overestimate the
   number of periods actually present in the transmission.  This results
   in a greater number of empty periods detected and a consequent
   decrease in the number of RTT samples that can be obtained.
   Therefore, in order to maximize the performance of the whole
   algorithm, the observer must implement a mechanism to filter out
   spurious spin edges.

   To address this problem, the waiting interval is introduced.  Every
   time a spin bit edge is detected, the observer starts a time interval
   during which it rejects every potential spurious edge observed on the
   wire.  At the end of the interval it starts accepting changes in the
   spin bit value again.  The protection against spurious edges depends
   on the size of the interval itself.  For instance, an interval of 5ms
   is able to filter out edges that have been reordered by a maximum of
   5ms.
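
   The observer rules and the waiting interval described above can be
   sketched as follows.  This is only an illustration under stated
   assumptions, not part of the proposal: the class DelayBitObserver and
   all of its field names are hypothetical, the observer watches a
   single direction, times are expressed in milliseconds, and a spin
   change seen inside the waiting interval is simply skipped.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DelayBitObserver:
    """Hypothetical single-direction observer; all times are in ms."""
    waiting_interval_ms: float = 5.0        # example value used in the text
    spin: Optional[int] = None              # spin value of the current period
    edge_ms: float = float("-inf")          # time of the last accepted edge
    sample_ms: Optional[float] = None       # delay sample of current period
    prev_sample_ms: Optional[float] = None  # delay sample of previous period
    rtts_ms: List[float] = field(default_factory=list)

    def on_packet(self, t_ms: float, spin: int, delay: int) -> None:
        if self.spin is None:
            # The first packet of the flow opens the first period.
            self.spin, self.edge_ms = spin, t_ms
        elif spin != self.spin:
            if t_ms - self.edge_ms < self.waiting_interval_ms:
                # Spin change inside the waiting interval: assumed to be a
                # reordered packet of the previous period, so it is skipped.
                return
            # Accepted edge: a new period starts.  If the period that just
            # ended was empty, prev_sample_ms becomes None and the next
            # delay sample can only be used as a left edge.
            self.spin, self.edge_ms = spin, t_ms
            self.prev_sample_ms, self.sample_ms = self.sample_ms, None
        if delay == 1 and self.sample_ms is None:
            self.sample_ms = t_ms                    # usable as a left edge
            if self.prev_sample_ms is not None:      # also a right edge
                self.rtts_ms.append(t_ms - self.prev_sample_ms)


obs = DelayBitObserver()
trace = [  # (time in ms, spin bit, delay bit)
    (0.0, 0, 1), (10.0, 0, 0),
    (40.0, 1, 1),
    (42.0, 0, 0),    # late spin-0 packet, inside the 5 ms waiting interval
    (50.0, 1, 0),
    (80.0, 0, 1),
    (120.0, 1, 0),   # period without a delay sample
    (160.0, 0, 1),   # left edge only: the previous period was empty
    (200.0, 1, 1),
]
for t, s, d in trace:
    obs.on_packet(t, s, d)
print(obs.rtts_ms)   # [40.0, 40.0, 40.0]
```

   In this trace the reordered packet at 42 ms does not open a new
   period, and the empty period starting at 120 ms suppresses the sample
   that would otherwise have been computed at 160 ms.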
   Clearly, the mechanism works only for intervals smaller than the RTT
   of the observed connection (if the RTT is smaller than the waiting
   interval, the observer cannot measure the RTT).

5.  Adding a Loss signal for Packet loss measurement

   It is also possible to introduce a mechanism that evaluates packet
   loss together with the delay measurement.  This can be achieved by
   introducing the loss signal, a one-bit or two-bit signal whose
   purpose is to mark a variable number of packets (from live traffic)
   that are exchanged twice between the endpoints, realizing a
   two-round-trip reflection.  The overall exchange comprises:

   o  The client first selects, generates and then transmits to the
      server a first train of packets, setting the loss bit to 1;

   o  The server, upon reception from the client of each one of the
      packets included in the first train, reflects to the client a
      respective second train of packets of the same size as the first
      train received, setting the loss bit to 1;

   o  The client, upon reception from the server of each one of the
      packets included in the second train, reflects to the server a
      respective third train of packets of the same size as the second
      train received, setting the loss bit to 1;

   o  The server, upon reception from the client of each one of the
      packets included in the third train, finally reflects to the
      client a respective fourth train of packets of the same size as
      the third train received, setting the loss bit to 1.

   Packets belonging to the first round (first and second train)
   represent the Generation Phase, while those belonging to the second
   round (third and fourth train) represent the Reflection Phase.

   A passive on-path observer, placed in either direction, can trivially
   count and compare the number of marked packets seen during the two
   mentioned phases (i.e.
   the first and third or the second and the fourth trains of packets,
   depending on which direction is observed) and estimate the loss rate
   experienced by the connection.  This process is repeated continuously
   to obtain more measurements as long as the endpoints exchange
   traffic.  These measurements can be called Round Trip (RT) losses.

   The general algorithm shown above gives an idea of the underlying
   principles but is not enough to make the whole process work properly.

   Firstly, there is the issue that packet rates in the two directions
   may be different.  Therefore, the right number of packets to be
   marked has to be chosen in order to avoid congestion of marked
   packets in the slowest traffic direction.  As a consequence, this
   number is inevitably equal to the number of packets that actually
   transit in the slowest direction.  This problem can be easily
   addressed by a method wherein the two endpoints of a communication
   exchange marked packets interleaved with unmarked packets.  From an
   implementation point of view, this result can be achieved by
   introducing a single token system that adjusts the number of outgoing
   marked packets.  Basically, the token is enabled every time a packet
   arrives and disabled when a marked packet is transmitted.  Since the
   creation of the initial train of marked packets is carried out by the
   client, the management and use of this single token is also assigned
   to it; the client in fact "calculates" the correct number of packets
   to be marked each time.

   Secondly, a mechanism to individually identify each train of packets
   must be provided to enable the observer to distinguish between trains
   belonging to different phases (Generation and Reflection).  On this
   point, different approaches are used depending on the number of bits
   of the loss signal; they are discussed in the next sections.

5.1.  Round Trip Packet Loss measurement

   Since the measurements are performed on a portion of the traffic
   exchanged between client and server, the observer calculates the
   end-to-end Round Trip Packet Loss that, statistically, will be equal
   to the loss rate experienced by the connection along the entire
   network path.  This measurement is simply referred to as the Round
   Trip Packet Loss (RTPL).

   In addition, this methodology allows the Half-RTPL measurement and
   the Intra-domain RTPL measurement, in the same way as described in
   the previous sections for RTT measurement.

6.  RTT dependent Packet Loss using one bit loss signal

   The single bit loss signal is implemented using just one bit: marked
   packets have this bit set to 1, whereas unmarked ones have it set to
   0.  This solution requires a working spin-bit signal, which is used
   to separate different trains of packets.  In particular, a "pause" of
   at least one empty spin-bit period is introduced between each phase
   of the algorithm.  In this way an on-path observer can determine
   whether a phase (and therefore a train of packets) has ended and a
   new one is starting.

   The client is in charge of almost the entire complexity of the
   algorithm.  Its task can be summarized in four points:

   1.  The client starts generating marked packets for two consecutive
       spin-bit periods; it maintains a generation token that is enabled
       every time a packet arrives and disabled when another one is
       forwarded.  When this token is disabled, the generation process
       is paused (i.e. outgoing packets are transmitted unmarked) and
       resumes as soon as its value returns true, which happens as soon
       as a packet is received.  In addition, at the end of the first
       spin-bit period spent in generation, the reflection counter is
       unlocked to start counting incoming marked packets, which will be
       reflected later;

   2.  When the generation is completed, the client waits to see in
       input an empty spin-bit period so as to be sure that everyone has
       seen at least that empty period.  This period will be used by the
       observer as a divider between generated and reflected packets.
       During this phase, all the outgoing packets are forwarded with
       the loss bit set to 0.  The reflection counter is still
       incremented every time a marked packet arrives;

   3.  The client starts reflecting marked packets until the reflection
       counter is zeroed; the generation token is also used (in the same
       way) during this phase to avoid congestion in the slowest traffic
       direction.  In addition, at the end of the first spin-bit period
       spent in reflection, the reflection counter is locked to prevent
       incoming reflected packets from incrementing it;

   4.  When the reflection is completed, the client waits to see in
       input an empty spin-bit period so as to be sure that everyone has
       seen at least that empty period.  This period will be used by the
       observer as a divider between reflected and newly generated
       packets.  During this phase, all the outgoing packets are
       forwarded with the loss bit set to 0.  The whole process then
       restarts from the first point.

   As previously anticipated, the server simply reflects each incoming
   marked packet sent by the client.  It maintains a simple counter that
   is incremented every time a marked packet arrives and decremented
   when a marked one is sent in the opposite direction.

   This one bit loss signal methodology replicates and exposes the RTT
   of the connection on the wire in any case, both when the spin bit and
   the delay bit are used and when they are disabled.

6.1.  Observer's logic for one bit loss signal

   The on-path observer, placed in either direction, counts marked
   packets and separates different trains by detecting the empty
   spin-bit periods between them (one or more).
   Then, it simply computes the difference between a Generation train
   and a Reflection train to produce a statistical measurement of the
   Round Trip Packet Loss (RTPL) and of the connection end-to-end loss
   rate.

   Here is an example.  Packets are represented by two digits (the first
   one is the spin bit, the second one is the loss bit):

        Generation          Pause           Reflection       Pause
   ____________________ ______________ ____________________ ________
  |                    |              |                    |        |
   01 01 00 01 11 10 11 00 00 10 10 10 01 00 01 01 10 11 10 00 00 10

                 Figure 2: one bit loss signal example

   Note that 5 marked packets have been generated, of which 4 have been
   reflected.

7.  RTT independent Packet Loss using two bits loss signal

   An RTT independent version of this algorithm requires two bits and
   does not rely on the spin-bit signal to enable pause detection.  That
   is because packets generated and reflected by the client are marked
   using two different marking values, thus removing the need to
   introduce a pause between them.  Furthermore, instead of generating
   marked packets for the duration of two spin-bit periods (as seen in
   the one bit loss signal), a fixed duration for the generation phase
   can be used (e.g., 100ms).

   In this way, no information related to the RTT of the connection is
   exposed on the wire.

   Using a two bits loss signal, four possible values can be used inside
   each packet (i.e. 0 to 3).  During the Generation phase, marked
   packets have the loss value set to 1 whereas unmarked ones have it
   set to 0.  During the Reflection phase, instead, marked packets have
   the loss value set to 2 whereas unmarked ones have it set to 3.  By
   doing so, even unmarked packets have their own alternate marking
   that can be used by intermediate points to compute the one-way loss
   rate between them (RFC 8321 [RFC8321]).

   Even in this case, the client is in charge of almost the entire
   complexity of the algorithm.
Its task can be summarized in 2 584 different points: 586 1. The client generates marked packets (i.e. with loss bits set to 587 1) for 100ms; it also maintains a generation token that is 588 enabled every time a packet arrives and disabled when another one 589 is forwarded. When this token is disabled, the generation 590 process is paused (i.e. outgoing packets are transmitted unmarked 591 with the loss bits set to 0) and resumes as soon as its value 592 returns true. 594 2. When the generation is completed, the client starts reflecting 595 marked packets (i.e. with loss bits set to 2) until the 596 reflection counter is zeroed and for at least 100ms. The 597 generation token is also used during this phase to avoid 598 congestion on the slowest traffic direction; however, in this 599 case, "unmarked" packets are transmitted with the loss bit set to 600 3. The whole process restarts going back to the first point. 602 Independently of the current phase of the algorithm, the reflection 603 counter is increased every time a packet carrying a loss value equal 604 to 1 arrives. Moreover, depending on the connection RTT, the client 605 should vary the duration of the generation phase to different values. 606 For example, for connection below 100ms of RTT the client generates 607 for 100ms; for connection below 300ms of RTT it generates for 300ms 608 and for connection below 1s of RTT it generates for 1000ms. This is 609 necessary to ensure that the client has already received generated 610 marked packets before the beginning of the reflection phase. 612 As regards the role of the server, it simply reflects each incoming 613 marked packet sent by the client. It maintains two different 614 counters for generated and reflected packets (i.e. loss bits to 1 and 615 2) in concomitance with a mechanism to reflect in output the same 616 number of marked packets in the same order of arrival (with at most 617 the reordering of packets arrived out of sequence). 619 7.1. 
Observer's logic for two bits loss signal 621 The on-path observer, placed in any direction, counts marked packets 622 belonging to different phases simply looking at the loss value 623 carried by each packet (therefore, it does not look at the spin-bit 624 value anymore). Then, in the same way seen for the previous one bit 625 algorithm, it simply computes the difference between a Generation 626 train and a Reflection train to produce a statistical measurement of 627 the Round Trip Packet Loss (RTPL) and of the connection end-to-end 628 loss rate. Moreover, it can also count unmarked packets and, 629 cooperating with a second observer placed in the same direction, 630 compute the one-way loss rate between two intermediate points using 631 the alternate marking methodology (RFC 8321 [RFC8321]). 633 Here is an example. Packets are represented by a single digit 634 corresponding to the carried two-bits loss value (0 to 3): 636 Generation Reflection Generation 637 _____________________ _____________________ _____________________ 638 | | | | 639 1 1 0 1 1 1 1 0 1 1 0 2 2 2 2 3 2 3 3 2 3 3 1 1 0 1 0 0 1 1 0 1 0 641 Figure 3: two bits loss signal example 643 Note that 8 marked packets have been generated of which 6 reflected; 644 then again 6 marked packets are generated. 646 8. Protocols 648 8.1. QUIC 650 The binding of the delay bit signal to QUIC is partially described in 651 [I-D.ietf-quic-spin-exp], which adds the spin bit only to the QUIC 652 protocol. From an implementation point of view, the delay bit is 653 placed in the partially unencrypted (but authenticated) QUIC header, 654 alongside the spin bit, occupying one of the two bits left reserved 655 for future experiments. As things stand, according to 656 [I-D.ietf-quic-transport], the proposed scheme of the first header's 657 byte would be 01SDRKPP. 
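
As an illustration only, the 01SDRKPP layout can be read with plain bit
masks.  The following Python sketch is not part of any specification:
the function name parse_first_byte and the mask names are hypothetical,
and it assumes that the D bit is readable by an on-path observer, as
the paragraph above proposes.

```python
# Bit masks for the first byte of a QUIC short header, read left to
# right as 0 1 S D R K P P. The position of D follows the layout
# proposed in the text; it is an assumption, not a QUIC definition.
HEADER_FORM_MASK = 0x80  # 0 in a short header
FIXED_BIT_MASK = 0x40    # always 1
SPIN_MASK = 0x20         # S: spin bit
DELAY_MASK = 0x10        # D: delay bit (hypothetical placement)


def parse_first_byte(first_byte: int) -> dict:
    """Extract the fixed, spin and delay bits from a short-header byte.

    Raises ValueError if the value is not one octet or if the header
    form bit marks a long header, which carries no spin or delay bit.
    """
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("first_byte must fit in one octet")
    if first_byte & HEADER_FORM_MASK:
        raise ValueError("long header: no spin or delay bit present")
    return {
        "fixed": bool(first_byte & FIXED_BIT_MASK),
        "spin": (first_byte & SPIN_MASK) >> 5,
        "delay": (first_byte & DELAY_MASK) >> 4,
    }


if __name__ == "__main__":
    # 0b01110000: short header, fixed bit set, S = 1, D = 1
    print(parse_first_byte(0b01110000))
```

An observer would call such a function on the first byte of each
short-header packet and feed the S and D values to its delay
measurement logic.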

Regarding the loss signal, since the use of the spin bit is not
mandatory and many connections may not have it spinning, two different
configurations are proposed:

o  If the spin-bit IS enabled (i.e., the RTT is already exposed on the
   wire), use the 1 bit loss signal alongside the delay bit to improve
   the accuracy of delay measurements; in this configuration, the
   proposed layout of the first header byte would be 01SDLKPP;

o  If the spin-bit IS NOT enabled, use the 2 bits loss signal just to
   measure the connection loss rate without exposing any RTT related
   information on the wire; in this configuration, the proposed layout
   of the first header byte would be 01SLLKPP.

This implies that an observer must be able to determine whether the
spin bit is active and correctly spinning or not (choosing,
accordingly, the right version of packet loss measurement to be
used).

8.2.  TCP

The signal can be added to TCP by defining bit 4 of bytes 13-14 of the
TCP header to carry the spin bit, and possibly bits 5 and 6 to carry
additional information, like the delay bit and the 1 bit loss signal
(or the two bits loss signal).

9.  Security Considerations

The privacy considerations for the hybrid RTT measurement signal are
essentially the same as those for passive RTT measurement in general.

10.  Acknowledgements

tbc

11.  IANA Considerations

tbc

12.  References

12.1.  Normative References

[I-D.ietf-quic-spin-exp]
           Trammell, B. and M. Kuehlewind, "The QUIC Latency Spin
           Bit", draft-ietf-quic-spin-exp-01 (work in progress),
           October 2018.

[I-D.ietf-quic-transport]
           Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
           and Secure Transport", draft-ietf-quic-transport-23 (work
           in progress), September 2019.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119,
           DOI 10.17487/RFC2119, March 1997.

[RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
           Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
           May 2016.

[RFC8321]  Fioccola, G., Ed., Capello, A., Cociglio, M., Castaldelli,
           L., Chen, M., Zheng, L., Mirsky, G., and T. Mizrahi,
           "Alternate-Marking Method for Passive and Hybrid
           Performance Monitoring", RFC 8321, DOI 10.17487/RFC8321,
           January 2018.

12.2.  Informative References

[I-D.trammell-ippm-spin]
           Trammell, B., "An Explicit Transport-Layer Signal for
           Hybrid RTT Measurement", draft-trammell-ippm-spin-00 (work
           in progress), January 2019.

[I-D.trammell-quic-spin]
           Trammell, B., Vaere, P., Even, R., Fioccola, G., Fossati,
           T., Ihlar, M., Morton, A., and S. Emile, "Adding Explicit
           Passive Measurability of Two-Way Latency to the QUIC
           Transport Protocol", draft-trammell-quic-spin-03 (work in
           progress), May 2018.

[I-D.trammell-tsvwg-spin]
           Trammell, B., "A Transport-Independent Explicit Signal for
           Hybrid RTT Measurement", draft-trammell-tsvwg-spin-00
           (work in progress), July 2018.

Authors' Addresses

Mauro Cociglio
Telecom Italia
Via Reiss Romoli, 274
Torino  10148
Italy

Email: mauro.cociglio@telecomitalia.it

Giuseppe Fioccola
Huawei Technologies
Riesstrasse, 25
Munich  80992
Germany

Email: giuseppe.fioccola@huawei.com

Fabio Bulgarella
Politecnico di Torino

Email: fabio.bulgarella@guest.telecomitalia.it

Riccardo Sisto
Politecnico di Torino
Corso Duca degli Abruzzi, 24
Torino  10129
Italy

Email: riccardo.sisto@polito.it