IPPM                                                         M. Cociglio
Internet-Draft                                            Telecom Italia
Intended status: Experimental                                G. Fioccola
Expires: January 4, 2021                             Huawei Technologies
                                                                 M. Nilo
                                                           F. Bulgarella
                                                          Telecom Italia
                                                                R. Sisto
                                                   Politecnico di Torino
                                                            July 3, 2020

            Client-Server Explicit Performance Measurements
                 draft-cfb-ippm-spinbit-measurements-02

Abstract

   This document introduces an additional single-bit signal to enhance
   the performance of the spin bit [I-D.trammell-ippm-spin] in the
   presence of network impairments and application-limited flows.  In
   addition, it defines two new explicit per-flow transport-layer
   signals for hybrid measurement of connection loss rate.  The former
   is a spin-bit dependent signal and uses a single bit.  The latter is
   a standalone solution based on a two-bit loss signal and on alternate
   marking RFC 8321 [RFC8321].

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 4, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.  Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Spin bit and Delay bit mechanism
     2.1.  Delay Sample generation
       2.1.1.  The recovery process
     2.2.  Delay Sample reflection
   3.  Using the Spin bit and Delay bit for Hybrid RTT Measurement
     3.1.  End-to-end RTT measurement
     3.2.  Half-RTT measurement
     3.3.  Intra-domain RTT measurement
   4.  Observer's algorithm and Waiting Interval
   5.  Adding a Loss signal for Packet loss measurement
     5.1.  Round Trip Packet Loss measurement
   6.  Packet Loss using one bit loss signal
     6.1.  Observer's logic for one bit loss signal
   7.  Two Bits packet loss measurement using alternate marking
     7.1.  Setting the square bit (Q) on outgoing packets
     7.2.  Setting the reflection square bit (R) on outgoing packets
       7.2.1.  Determining the completion of an incoming marking period
     7.3.  Observer's logic and passive loss measurements
       7.3.1.  Upstream one-way loss
       7.3.2.  Three-quarters connection loss
       7.3.3.  Full one-way loss in the opposite direction
       7.3.4.  Half round-trip loss
       7.3.5.  Downstream one-way loss
     7.4.  Enhancement of reflection period size computation
     7.5.  Improvement of the resilience to out of sequence
   8.  Protocols
     8.1.  QUIC
     8.2.  TCP
   9.  Security Considerations
   10. Acknowledgements
   11. IANA Considerations
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   Both [I-D.trammell-tsvwg-spin] and [I-D.trammell-ippm-spin] define an
   explicit per-flow transport-layer signal for hybrid measurement of
   end-to-end RTT.  This signal consists of three bits: a spin bit,
   which oscillates once per end-to-end RTT, and a two-bit Valid Edge
   Counter (VEC), which compensates for loss and reordering of the spin
   bit to increase fidelity of the signal in less than ideal network
   conditions.

   This document introduces the delay bit, a single-bit signal that
   passive observers can use together with the spin bit to measure the
   RTT of a network flow, avoiding the spin bit ambiguities that arise
   as soon as network conditions deteriorate.  Unlike the spin bit,
   which is set in every packet transmitted on the network, the delay
   bit is set only once per round trip.

   Regarding loss rate measurement, two new algorithms are introduced.
   The first algorithm enables end-to-end round trip loss rate
   measurement using a single-bit signal called the loss bit.  This
   signal is used to mark a train of packets (a portion of traffic)
   which bounces back and forth two times between the endpoints,
   realizing a two round trip reflection.  A passive on-path observer,
   placed in either direction, can simply count and compare the number
   of marked packets seen during the two reflections, statistically
   estimating the loss rate experienced by the connection.  The second
   algorithm uses a double square signal and RFC 8321 [RFC8321] to mark
   the whole traffic exchanged between endpoints.  This solution
   enables different types of measurements, providing a complete
   picture of connection loss events.

   This document defines hybrid measurement RFC 7799 [RFC7799] path
   signals to be embedded into a transport layer protocol, explicitly
   intended for exposing end-to-end RTT and loss rate information to
   measurement devices on path.

   The document introduces mechanisms applicable to any transport-layer
   protocol, then explains how to bind the signals to a variety of IETF
   transport protocols, and in particular to QUIC and TCP.

   The application of the spin bit to QUIC is described in
   [I-D.ietf-quic-spin-exp] which adds the spin bit to QUIC for
   experimentation purposes.

   Note that the spin bit, delay bit and loss bits explained in this
   document are inspired by RFC 8321 [RFC8321].  This is also mentioned
   in [I-D.trammell-quic-spin].

   Note that additional details about the Performance Measurements for
   QUIC are also described in the paper [ANRW19-PM-QUIC].

2.  Spin bit and Delay bit mechanism

   The main idea is to have a single packet, with a second marked bit
   (the delay bit), that bounces between client and server during the
   entire connection life.  This single packet is called Delay Sample.

   A simple observer placed at an intermediate point, tracking the delay
   sample and the relative timestamp in every spin bit period, can
   measure the end-to-end round trip delay of the connection.  In the
   same way as seen with the spin bit, it is possible to carry out other
   types of measurements using this additional bit.  The next paragraphs
   give an overview of the observer capabilities.

   In order to describe the delay sample working mechanism in detail, we
   have to distinguish two different phases in the delay bit lifetime:
   initialization and reflection.  The initialization is the generation
   of the delay sample, while the reflection realizes the bounce
   behavior of this single packet between the two endpoints.

   The next figure describes the Delay bit mechanism: the first bit is
   the spin bit and the second one is the delay bit.

         +--------+   --  --  --  --  --   +--------+
         |        |      ----------->      |        |
         | Client |                        | Server |
         |        |      <-----------      |        |
         +--------+   --  --  --  --  --   +--------+

                    (a) No traffic at beginning.

         +--------+   00  00  01  --  --   +--------+
         |        |      ----------->      |        |
         | Client |                        | Server |
         |        |      <-----------      |        |
         +--------+   --  --  --  --  --   +--------+

               (b) The Client starts sending data and
                sets the first packet as Delay Sample.

         +--------+   00  00  00  00  00   +--------+
         |        |      ----------->      |        |
         | Client |                        | Server |
         |        |      <-----------      |        |
         +--------+   --  --  01  00  00   +--------+

                 (c) The Server starts sending data
                   and reflects the Delay Sample.

         +--------+   10  10  11  00  00   +--------+
         |        |      ----------->      |        |
         | Client |                        | Server |
         |        |      <-----------      |        |
         +--------+   00  00  00  00  00   +--------+

               (d) The Client inverts the spin bit and
                     reflects the Delay Sample.

         +--------+   10  10  10  10  10   +--------+
         |        |      ----------->      |        |
         | Client |                        | Server |
         |        |      <-----------      |        |
         +--------+   00  00  11  10  10   +--------+

              (e) The Server reflects the Delay Sample.

         +--------+   00  00  01  10  10   +--------+
         |        |      ----------->      |        |
         | Client |                        | Server |
         |        |      <-----------      |        |
         +--------+   10  10  10  10  10   +--------+

                  (f) The client reverts the spin
                 bit and reflects the Delay Sample.

                  Figure 1: Spin bit and Delay bit

2.1.  Delay Sample generation

   During this first phase, endpoints play different roles.  First of
   all, a single delay sample must be bouncing per round trip period
   (and so per spin bit period).  Accordingly, and in order to simplify
   the general algorithm, the generation of the delay sample is the
   responsibility of just one of the two endpoints:

   o  the client, when the connection starts and the spin bit is set to
      0, initializes the delay bit of the first packet to 1, so it
      becomes the delay sample for that marking period.  Only this
      packet is marked with the delay bit set to 1 for this round trip
      period; the other ones will carry only the spin bit;

   o  the server never initializes the delay bit to 1; its only task is
      to reflect the incoming delay bit into the next outgoing packet
      only if certain conditions occur.

   Theoretically, in the absence of network impairments, the delay
   sample should bounce between client and server continuously, for the
   entire duration of the connection.
   In practice, this is highly unlikely, mainly for two reasons:

   1) the packet carrying the delay bit might be lost during its journey
      on the network, which is unreliable by definition;

   2) one of the two endpoints could stop or delay sending data because
      the application is limiting the amount of traffic transmitted.

   To deal with these problems, the algorithm provides a procedure to
   regenerate the delay sample and to inform a possible observer that a
   problem has occurred and that the measurement has to be restarted.

2.1.1.  The recovery process

   In order to relieve the server from tasks that go beyond the mere
   reflection of the sample, the recovery process is also assigned to
   the client.  A fundamental assumption is that a delay sample is
   strictly related to its spin bit period.  Considering this rule, the
   client verifies that every spin bit period ends with its delay
   sample.  If that does not happen and a marking period terminates
   without a delay sample, the client waits a further empty period;
   then, in the following period, it reinitializes the mechanism by
   setting the delay bit of the first outgoing packet to 1, making it
   the new delay sample.  The empty period is needed to inform the
   intermediate points that there was an issue and a new delay
   measurement session is starting.

2.2.  Delay Sample reflection

   The reflection is the process that enables the bouncing of the delay
   sample between client and server.  The behavior of the two endpoints
   is slightly different.  By default the delay bit is set to 0, with
   the exception of the client which, as described above, generates a
   new delay sample.

   Server side reflection: when a packet with the delay bit set to 1
   arrives, the server marks the first packet in the opposite direction
   as the delay sample if it has the same spin bit value.  If it has
   the opposite spin bit value, the sample is considered lost.

   Client side reflection: when a packet with the delay bit set to 1
   arrives, the client marks the first packet in the opposite direction
   as the delay sample if it has the opposite spin bit value.  If it
   has the same spin bit value, the sample is considered lost.

   In both cases, if the outgoing marked packet is transmitted with a
   delay greater than a predetermined threshold after the reception of
   the incoming delay sample (1ms by default), reflection is aborted and
   this sample is considered lost.

   Note that reflection takes place for the packet that is carrying the
   delay bit regardless of its position within the period.  For this
   reason it is necessary to introduce the spin bit validation condition
   above in order to identify and discard those samples that, due to
   reordering, might move to a contiguous period.  Furthermore, by
   introducing a threshold for the retransmission delay of the sample,
   it is possible to eliminate all those measurements which, due to
   lack of traffic on the endpoints, would be overestimated.  Thus, the
   maximum estimation error, without considering any other delays due to
   flow control, would amount to twice the threshold (e.g. 2ms) per
   measurement, in the worst case.

3.  Using the Spin bit and Delay bit for Hybrid RTT Measurement

   Unlike what happens with the spin bit, for which it is necessary to
   validate or at least heuristically evaluate the goodness of an edge,
   the delay sample can be used by an intermediate observer as a simple
   demarcator between a period and the following one, eliminating the
   ambiguities in the calculation of the RTT found with the analysis of
   the spin bit only.  The measurement types that can be obtained from
   the observation of the delay sample are exactly the same as those
   achievable with the spin bit only.

3.1.  End-to-end RTT measurement

   The delay sample generation process ensures that only one packet
   marked with the delay bit set to 1 runs back and forth on the wire
   between two endpoints per round trip time.  Therefore, in order to
   determine the end-to-end RTT measurement of a QUIC flow, an on-path
   passive observer can simply compute the time difference between two
   delay samples observed in a single direction.  Note that a
   measurement, to be valid, must take into account the difference in
   time between the timestamps of two consecutive delay samples
   belonging to adjacent spin-bit periods.  For this reason, an
   observer, in addition to intercepting and analyzing the packets
   containing the delay bit set to 1, must maintain awareness of each
   spin period in such a way as to be able to assign each delay sample
   to its period and, at the same time, identify those periods that do
   not contain it.

          =======================|======================>
          = **********     -----Obs---->     ********** =
          = * Client *                       * Server * =
          = **********     <------------     ********** =
          <==============================================

                        (a) client-server RTT

          ==============================================>
          = **********     ------------>     ********** =
          = * Client *                       * Server * =
          = **********     <----Obs-----     ********** =
          <======================|=======================

                        (b) server-client RTT

               Figure 2: Round-trip time (both direction)

3.2.  Half-RTT measurement

   An on-path passive observer that is sniffing traffic in both
   directions -- from client to server and from server to client -- can
   also use the delay sample to measure "upstream" and "downstream" RTT
   components.  Also known as the half-RTT measurement, it represents
   the components of the end-to-end RTT concerning the paths between the
   client and the observer (upstream), and the observer and the server
   (downstream).
   The observer does this by measuring the delay between a delay sample
   observed in the downstream direction and the one observed in the
   upstream direction, and vice versa.  Also in this case, it should
   verify that the two delay samples belong to two adjacent periods, for
   the upstream component, or to the same period for the downstream
   component.

          =======================>
          = **********     ------|----->     **********
          = * Client *          Obs          * Server *
          = **********     <-----|------     **********
          <=======================

                    (a) client-observer half-RTT

                                 =======================>
            **********     ------|----->     ********** =
            * Client *          Obs          * Server * =
            **********     <-----|------     ********** =
                                 <=======================

                    (b) observer-server half-RTT

            Figure 3: Half Round-trip time (both direction)

3.3.  Intra-domain RTT measurement

   Taking advantage of the half-RTT measurements it is also possible to
   calculate the intra-domain RTT, which is the portion of the entire
   RTT used by a QUIC flow to traverse the network of a provider (or
   part of it).  To achieve this result two observers, able to watch
   traffic in both directions, must be employed simultaneously at
   ingress and egress of the network to be measured.  At this point, to
   determine the delay between the two observers, it is enough to
   subtract the two computed upstream (or downstream) RTT components.
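
   The subtraction described above can be sketched as follows.  This is
   only an illustration, not part of the specification: the names
   HalfRtt and intra_domain_rtt are hypothetical, and the sketch assumes
   that both observers have already derived their upstream and
   downstream components (in milliseconds) from the same flow over the
   same spin-bit periods.

```python
from dataclasses import dataclass


@dataclass
class HalfRtt:
    """Half-RTT components computed by one symmetric observer (hypothetical type)."""
    upstream_ms: float    # client <-> observer component
    downstream_ms: float  # observer <-> server component


def intra_domain_rtt(ingress: HalfRtt, egress: HalfRtt) -> float:
    """Round-trip delay between two observers.

    Assumes `ingress` is the observer closer to the client and `egress`
    the one closer to the server, so the egress upstream component is
    the larger of the two.
    """
    return egress.upstream_ms - ingress.upstream_ms


def intra_domain_rtt_from_downstream(ingress: HalfRtt, egress: HalfRtt) -> float:
    """Same quantity, obtained from the downstream components instead."""
    return ingress.downstream_ms - egress.downstream_ms
```

   With an end-to-end RTT of 50 ms, an ingress observer measuring
   10 ms upstream and an egress observer measuring 25 ms upstream, both
   functions attribute 15 ms to the portion of network between the two
   observers.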

     =========================================>
     = =====================>
     = = **********      ---|-->           ---|-->     **********
     = = * Client *        Obs               Obs       * Server *
     = = **********      <--|---           <--|---     **********
     = <=====================
     <=========================================

            (a) client-observer RTT components (half-RTTs)

                            ==================>
         **********      ---|-->           ---|-->     **********
         * Client *        Obs               Obs       * Server *
         **********      <--|---           <--|---     **********
                            <==================

             (b) the intra-domain RTT resulting from the
               subtraction of the above RTT components

   Figure 4: Intra-domain Round-trip time (client-observer: upstream)

   The spin bit is an alternate marking generated signal and the only
   difference from RFC 8321 [RFC8321] is the size of the alternation,
   which changes with the flight size each RTT.  So it can be useful
   to segment the RTT and deduce the contribution to the RTT of the
   portion of the network between two on-path observers; this can be
   easily performed by calculating the delay between two or more
   measurement points on a single direction by applying RFC 8321
   [RFC8321].

4.  Observer's algorithm and Waiting Interval

   Given below is a formal summary of the functioning of the observer
   every time a delay sample is detected.  A packet containing the delay
   bit set to 1:

   o  if it has the same spin bit value as the current period and no
      delay sample was detected in the previous period, then it can be
      used as a left edge (i.e. to start measuring an RTT sample), but
      not as a right edge (i.e. to complete an RTT measurement since
      the last edge).  If the observation point is symmetric (i.e. it
      can see both upstream and downstream packets in the flow) and in
      the current period a delay sample was detected in the opposite
      direction (i.e. in the upstream direction), the packet can also be
      used to compute the downstream RTT component.

   o  if it has the same spin bit value as the current period and a
      delay sample was detected in the previous period, then it can be
      used both as a left and as a right edge, and to compute the RTT
      component in both directions.

   As stated previously, every time an empty period is detected, the
   observer must restart the measurement process and consider the next
   delay sample that arrives as the beginning of a new measurement, and
   therefore as a left edge.  As a result, being able to assign the
   delay sample to the corresponding spin period becomes a crucial
   factor for the proper functioning of the entire algorithm.

   Considering that the division into periods is realized by exploiting
   the spin bit square wave, it is easy to understand that the presence
   of spurious spin edges -- caused by packet reordering -- would
   inevitably lead the observer to overestimate the number of periods
   actually present in the transmission.  This results in a greater
   number of empty periods detected and a consequent decrease in the
   number of RTT samples actually achievable.  Therefore, in order to
   maximize the performance of the whole algorithm, the observer must
   implement a mechanism to filter out spurious spin edges.

   To address this problem the waiting interval is introduced.
   Basically, every time a spin bit edge is detected, the observer sets
   a time interval during which it rejects every potential spurious
   edge observed on the wire.  At the end of the interval it starts
   again to accept changes in the spin bit value.  This guarantees
   protection against spurious edges in proportion to the size of the
   interval itself.  For instance, an interval of 5ms is able to filter
   out edges that have been reordered by a maximum of 5ms.
   Clearly, the mechanism does its job only for intervals smaller than
   the RTT of the observed connection (if the RTT is smaller than the
   waiting interval the observer can't measure the RTT).

5.  Adding a Loss signal for Packet loss measurement

   It is also possible to introduce a mechanism to evaluate the packet
   loss together with the delay measurement.  This can be achieved by
   introducing the loss signal, a single-bit signal whose purpose is to
   mark a variable number of packets (from live traffic) which are
   exchanged two times between the endpoints, realizing a two round-trip
   reflection.  The overall exchange comprises:

   o  The client first selects, generates and then transmits to the
      server a first train of packets, with the loss bit set to 1;

   o  The server, upon reception from the client of each one of the
      packets included in the first train, reflects to the client a
      respective second train of packets of the same size as the first
      train received, with the loss bit set to 1;

   o  The client, upon reception from the server of each one of the
      packets included in the second train, reflects to the server a
      respective third train of packets of the same size as the second
      train received, with the loss bit set to 1;

   o  The server, upon reception from the client of each one of the
      packets included in the third train, finally reflects to the
      client a respective fourth train of packets of the same size as
      the third train received, with the loss bit set to 1.

   Packets belonging to the first round (first and second train)
   represent the Generation Phase while those belonging to the second
   round (third and fourth train) represent the Reflection Phase.

   A passive on-path observer, placed in either direction, can simply
   count and compare the number of marked packets seen during the two
   mentioned phases (i.e. the first and third or the second and the
   fourth trains of packets, depending on which direction is observed)
   and estimate the loss rate experienced by the connection.  This
   process is repeated continuously to obtain more measurements as long
   as the endpoints exchange traffic.  These measurements can be called
   Round Trip (RT) losses.

   The general algorithm shown above gives an idea of its underlying
   principles but is not enough to make the whole process work
   properly.

   Firstly, there is the issue that packet rates in the two directions
   may be different.  Therefore, the right number of packets to be
   marked has to be chosen in order to avoid their congestion on the
   slowest traffic direction.  As a consequence, this number is
   inevitably equal to the number of packets that actually transit in
   the slowest direction.  This problem can be easily addressed by a
   method wherein the two endpoints of a communication exchange marked
   packets interleaved with unmarked packets.  From an implementation
   point of view, this result can be achieved by introducing a single
   token system that adjusts the number of outgoing marked packets.
   Basically, the token is enabled every time a packet arrives and
   disabled when a marked packet is transmitted.  Since the creation of
   the initial train of marked packets is carried out by the client, the
   management and use of this single token is also assigned to it, which
   in fact "calculates" the correct number of packets to be marked each
   time.

   Secondly, a mechanism to individually identify each train of packets
   must be provided to enable the observer to distinguish between trains
   belonging to different phases (Generation and Reflection).

5.1.  Round Trip Packet Loss measurement

   Since the measurements are performed on a portion of the traffic
   exchanged between client and server, the observer calculates the end-
   to-end Round Trip Packet Loss that, statistically, will be equal to
   the loss rate experienced by the connection along the entire network
   path.  So this measurement can be simply referred to as the Round
   Trip Packet Loss (RTPL).

          =======================|======================>
          = **********     -----Obs---->     ********** =
          = * Client *                       * Server * =
          = **********     <------------     ********** =
          <==============================================

                        (a) client-server RTPL

          ==============================================>
          = **********     ------------>     ********** =
          = * Client *                       * Server * =
          = **********     <----Obs-----     ********** =
          <======================|=======================

                        (b) server-client RTPL

           Figure 5: Round-trip packet loss (both direction)

   In addition, this methodology allows the Half-RTPL measurement and
   the Intra-domain RTPL measurement, in the same way as described in
   the previous sections for RTT measurement.
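
   The comparison performed by the observer can be sketched as follows.
   This is a minimal illustration under stated assumptions, not a
   normative definition: the function name round_trip_packet_loss is
   hypothetical, the counts are the marked packets the observer saw in
   one direction for a Generation train and for the corresponding
   Reflection train, and normalizing the difference by the Generation
   count is an assumption of this sketch (the text only states that the
   two counts are compared).

```python
def round_trip_packet_loss(generation_count: int, reflection_count: int) -> float:
    """Estimate the RTPL ratio from marked-packet counts seen in one direction."""
    if generation_count <= 0:
        raise ValueError("the Generation train must contain at least one marked packet")
    # Miscounted or misattributed packets could make the difference
    # negative; this sketch clamps it to zero rather than report a
    # negative loss.
    lost = max(generation_count - reflection_count, 0)
    return lost / generation_count
```

   For instance, an observer that counts 5 marked packets in a
   Generation train and 4 in the following Reflection train would
   estimate a round trip loss of 20% for that sample.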

          =======================>
          = **********     ------|----->     **********
          = * Client *          Obs          * Server *
          = **********     <-----|------     **********
          <=======================

                    (a) client-observer half-RTPL

                                 =======================>
            **********     ------|----->     ********** =
            * Client *          Obs          * Server * =
            **********     <-----|------     ********** =
                                 <=======================

                    (b) observer-server half-RTPL

         Figure 6: Half Round-trip packet loss (both direction)

                        =========================================>
                                          =====================> =
     **********      ---|-->           ---|-->     **********  = =
     * Client *        Obs               Obs       * Server *  = =
     **********      <--|---           <--|---     **********  = =
                                          <===================== =
                        <=========================================

           (a) observer-server RTPL components (half-RTPLs)

                        ==================>
     **********      ---|-->           ---|-->     **********
     * Client *        Obs               Obs       * Server *
     **********      <--|---           <--|---     **********
                        <==================

             (b) the intra-domain RTPL resulting from the
               subtraction of the above RTPL components

     Figure 7: Intra-domain Round-trip packet loss (observer-server)

6.  Packet Loss using one bit loss signal

   The single-bit loss signal, whose basic mechanism was outlined in
   the previous section, is implemented using just one bit: marked
   packets have this bit set to 1, whereas unmarked ones have it set to
   0.  This solution requires a working spin-bit signal, used to
   separate different trains of packets.  In particular, a "pause" of
   at least one empty spin-bit period is introduced between each phase
   of the algorithm.  In this way an on-path observer can determine
   that a phase (and therefore a train of packets) has ended and a new
   one is starting.

   The client is in charge of almost the entire complexity of the
   algorithm.  Its task can be summarized in 4 points:

   1.  The client starts generating marked packets for two consecutive
       spin-bit periods; it maintains a generation token that is enabled
       every time a packet arrives and disabled when another one is
       forwarded.  When this token is disabled, the generation process
       is paused (i.e. outgoing packets are transmitted unmarked) and
       resumes as soon as the token is enabled again, which happens as
       soon as a packet is received.  In addition, at the end of the
       first spin-bit period spent in generation, the reflection counter
       is unlocked to start counting incoming marked packets which will
       be later reflected;

   2.  When the generation is completed, the client waits until it sees
       an empty spin-bit period in input, so as to be sure that everyone
       has seen at least that empty period.  This period will be used by
       the observer as a divider between generated and reflected
       packets.  During this phase, all the outgoing packets are
       forwarded with the loss bit set to 0.  The reflection counter is
       still incremented every time a marked packet arrives;

   3.  The client starts reflecting marked packets until the reflection
       counter is zeroed; the generation token is also used (in the same
       way) during this phase to avoid congestion on the slowest traffic
       direction.  In addition, at the end of the first spin-bit period
       spent in reflection, the reflection counter is locked to prevent
       incoming reflected packets from incrementing it;

   4.  When the reflection is completed, the client waits until it sees
       an empty spin-bit period in input, so as to be sure that everyone
       has seen at least that empty period.  This period will be used by
       the observer as a divider between reflected and newly generated
       packets.  During this phase, all the outgoing packets are
       forwarded with the loss bit set to 0.  The whole process restarts
       going back to the first point.
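
   The interplay between the generation token and the reflection counter
   in points 1 and 3 can be sketched as follows.  This is a simplified
   illustration, not a complete client: the class and method names are
   hypothetical, the detection of spin-bit periods and of the pauses in
   points 2 and 4 is left out, and the caller is assumed to lock and
   unlock the counter at the period boundaries named above.  The sketch
   also assumes that the token is consumed only when a marked packet is
   sent.

```python
class ClientLossMarker:
    """Hypothetical helper deciding the loss bit of outgoing client packets."""

    def __init__(self) -> None:
        self.token = False             # generation token
        self.reflection_counter = 0    # marked packets still to be reflected
        self.counter_unlocked = False  # set by the caller at period boundaries

    def on_receive(self, loss_bit: int) -> None:
        """Any incoming packet enables the token; marked ones may be counted."""
        self.token = True
        if loss_bit == 1 and self.counter_unlocked:
            self.reflection_counter += 1

    def mark_generation(self) -> int:
        """Loss bit for the next outgoing packet during generation (point 1)."""
        if self.token:
            self.token = False
            return 1
        return 0  # token disabled: generation is paused, send unmarked

    def mark_reflection(self) -> int:
        """Loss bit for the next outgoing packet during reflection (point 3)."""
        if self.token and self.reflection_counter > 0:
            self.token = False
            self.reflection_counter -= 1
            return 1
        return 0
```

   Gating both phases on the same token keeps the number of marked
   packets bounded by the number of packets received, which is how the
   text avoids congesting the slowest traffic direction.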

   As anticipated above, the server simply reflects each incoming
   marked packet sent by the client. It maintains a simple counter that
   is incremented every time a marked packet arrives and decremented
   every time a marked packet is sent in the opposite direction.

6.1.  Observer's logic for one bit loss signal

   The on-path observer, which can be placed on either direction,
   counts marked packets and separates successive trains by detecting
   the empty spin-bit periods (one or more) between them. It then
   computes the difference between the size of a Generation train and
   the size of a Reflection train to produce a statistical measurement
   of the Round Trip Packet Loss (RTPL) and of the connection
   end-to-end loss rate.

   Here is an example. Each packet is represented by two digits (the
   first one is the spin bit, the second one is the loss bit):

         Generation          Pause           Reflection        Pause
    ____________________ ______________ ____________________ ________
   |                    |              |                    |        |
    01 01 00 01 11 10 11 00 00 10 10 10 01 00 01 01 10 11 10 00 00 10

                  Figure 8: one bit loss signal example

   Note that 5 marked packets were generated, of which only 4 were
   reflected.

7.  Two Bits packet loss measurement using alternate marking

   An alternative methodology, based on the classical alternate marking
   method [RFC8321], can be deployed to enable passive packet loss
   measurement in a connection-oriented communication. This section
   explains its fundamentals and all the metrics that can be obtained
   by exploiting this mechanism.

   Two new loss bits are introduced:

   o  Square Bit (Q): this bit is toggled every N outgoing packets,
      generating a square signal like the one used by the alternate
      marking methodology [RFC8321].

   o  Reflection Square Bit (R): this bit is used to reflect the
      incoming square signal (the one generated by the opposite
      endpoint) according to the algorithm explained in Section 7.2; in
      a nutshell, it is used to report the losses found in the opposite
      transmission channel.

7.1.  Setting the square bit (Q) on outgoing packets

   The sQuare value is initialized to 0 and is applied to the Q-bit of
   every outgoing packet. The sQuare value is toggled after sending N
   packets (e.g. 64). By doing so, each endpoint splits its outgoing
   traffic into blocks of N packets, with consecutive blocks having a
   different "packet color" as defined by [RFC8321]. A single block of
   N packets is called a "marking period". Observation points can
   estimate upstream losses by counting the number of packets included
   in a marking period of the produced square signal.

7.2.  Setting the reflection square bit (R) on outgoing packets

   Unlike the sQuare signal, for which packets are transmitted in
   blocks of fixed size, the Reflection square signal (which is also an
   alternate marking signal) produces blocks of packets whose size
   varies according to these simple rules:

   o  when the transmission of a new block starts, its size is set
      equal to the size of the last marking period whose reception has
      been completed;

   o  if, before the transmission of the block ends, the reception of
      at least one further marking period is completed, the size of the
      block is updated to the average size of the further received
      marking periods.

   Implementation details follow.

   The Reflection square value is initialized to 0 and is applied to the
   R-bit of every outgoing packet. The Reflection square value is
   toggled for the first time when the completion of a marking period is
   detected in the incoming sQuare signal (produced by the opposite node
   using the Q-bit). When this happens, the number of packets (p)
   detected within this first marking period is used to generate a
   reflection square signal which, at first, toggles every M=p packets.
   This new signal produces blocks of M packets (marked using the
   R-bit), each of which is called a "reflection marking period".

   The M value is then updated every time a completed marking period is
   received in the incoming sQuare signal, following this formula:
   M=round(avg(p)).

   The parameter avg(p) is the average number of packets in a marking
   period, computed over all the marking periods received since the
   beginning of the current reflection marking period.

   Looking at the R-bit, observation points have a clear indication of
   the losses experienced over the entire opposite channel plus those
   that occurred on the path from the sender up to them (if losses
   occur on this latter portion of the path).

7.2.1.  Determining the completion of an incoming marking period

   A simple sQuare bit transition cannot be used to determine the
   completion of a marking period. Indeed, packet reordering can
   generate spurious edges in the sQuare signal. To address this
   problem, a marking period is considered ended when at least X
   packets (e.g. 5) with the reverse marking (i.e. belonging to the
   following marking period) have been received.

   The same approach can be used by observation points to clean both
   the sQuare and the Reflection square signals.

7.3.  Observer's logic and passive loss measurements

   Since both the sQuare and the Reflection square bits are toggled at
   most every N packets (except for the first transition of the R-bit,
   as explained before), an on-path observer can trivially count the
   number of packets in each marking block and, knowing the value of N,
   can estimate the amount of loss experienced by the connection.
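
   As an illustration of this counting, the following Python sketch
   splits a sequence of observed Q-bit values into completed marking
   periods, using the threshold of X reverse-marked packets from
   Section 7.2.1, and then derives a loss estimate as one minus the
   average block size divided by N. The function names and the sample
   values are hypothetical, and the sketch assumes that the observer
   starts counting at a block boundary.

```python
def completed_block_sizes(q_bits, threshold=5):
    """Return the sizes of the completed marking periods in q_bits.

    A period is considered ended only once `threshold` packets with the
    opposite marking have been seen, so that isolated reordered packets
    do not create spurious edges.
    """
    sizes = []
    current = None          # Q value of the period being counted
    in_current = 0          # packets counted for the current period
    in_next = 0             # packets already seen for the next period
    for bit in q_bits:
        if current is None:
            current = bit
        if bit == current:
            in_current += 1
        else:
            in_next += 1
            if in_next >= threshold:
                sizes.append(in_current)
                current, in_current, in_next = bit, in_next, 0
    return sizes


def loss_rate(sizes, n):
    """Estimate the loss rate as 1 - avg(size) / n (None if no data)."""
    if not sizes:
        return None
    return 1 - (sum(sizes) / len(sizes)) / n


# Hypothetical observation with N = 8 and X = 2: one reordered packet
# in the first block and one packet lost in the second block.
observed = [0] * 7 + [1, 0] + [1] * 6 + [0] * 8 + [1] * 2
sizes = completed_block_sizes(observed, threshold=2)
estimate = loss_rate(sizes, n=8)
```

   The last, still incomplete block is deliberately left out of the
   estimate, since its final size is not yet known.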
   Different metrics can be measured depending on which directions the
   observer is monitoring.

   One direction observer:

   o  upstream one-way loss: the loss between the sender and the
      observation point

   o  "three-quarters" connection loss: the loss between the receiver
      and the sender in the opposite direction plus the loss between the
      sender and the observation point in the observed direction

   o  full one-way loss in the opposite direction: the loss between the
      receiver and the sender in the opposite direction

   Two directions observer (the same metrics as above, applied to both
   directions, plus):

   o  client-observer half round-trip loss: the loss between the client
      and the observation point in both directions

   o  observer-server half round-trip loss: the loss between the
      observation point and the server in both directions

   o  downstream one-way loss: the loss between the observation point
      and the receiver (valid for both directions)

7.3.1.  Upstream one-way loss

   Since packets are continuously Q-bit marked in alternate blocks of
   size N, an on-path observer that knows the value of N can estimate
   the amount of loss that occurred between the sender and itself after
   observing at least N packets. The upstream one-way loss rate
   ("uowl") is one minus the average number of packets in a block of
   packets with the same Q value ("p") divided by N
   ("uowl=1-avg(p)/N").

        =====================>
        **********     -----Obs---->     **********
        * Client *                       * Server *
        **********     <------------     **********

              (a) in client-server channel (uowl_up)

        **********     ------------>     **********
        * Client *                       * Server *
        **********     <----Obs-----     **********
                             <=====================

              (b) in server-client channel (uowl_down)

                  Figure 9: Upstream one-way loss

7.3.2.  Three-quarters connection loss

   Except for the very first block, in which there is nothing to reflect
   (a complete marking period has not yet been received), packets are
   continuously R-bit marked in alternate blocks of size less than or
   equal to N. Knowing the value of N, an on-path observer can
   estimate the amount of loss that occurred in the whole opposite
   channel plus the loss between the sender and itself in the observed
   channel. As for the previous metric, the "three-quarters" connection
   loss rate ("tql") is one minus the average number of packets in a
   block of packets with the same R value ("t") divided by N
   ("tql=1-avg(t)/N").

        =======================>
        = **********     -----Obs---->     **********
        = * Client *                       * Server *
        = **********     <------------     **********
        <============================================

              (a) in client-server channel (tql_up)

          ============================================>
          **********     ------------>     ********** =
          * Client *                       * Server * =
          **********     <----Obs-----     ********** =
                               <=======================

              (b) in server-client channel (tql_down)

               Figure 10: Three-quarters connection loss

   The following metrics are derived from these first two metrics.

7.3.3.  Full one-way loss in the opposite direction

   Using the previous metrics, the full one-way loss can be computed:

      fowl_down = tql_up - uowl_up

      fowl_up = tql_down - uowl_down

        **********     -----Obs---->     **********
        * Client *                       * Server *
        **********     <------------     **********
        <==========================================

              (a) in client-server channel (fowl_down)

        ==========================================>
        **********     ------------>     **********
        * Client *                       * Server *
        **********     <----Obs-----     **********

              (b) in server-client channel (fowl_up)

         Figure 11: Full one-way loss in the opposite direction

7.3.4.  Half round-trip loss

   Using the previous metrics, the two half round-trip loss measurements
   can be computed:

      hrtl_co = tql_up - uowl_down

      hrtl_os = tql_down - uowl_up

        =======================>
        = **********     ------|----->     **********
        = * Client *          Obs          * Server *
        = **********     <-----|------     **********
        <=======================

          (a) client-observer half round-trip loss (hrtl_co)

                               =======================>
          **********     ------|----->     ********** =
          * Client *          Obs          * Server * =
          **********     <-----|------     ********** =
                               <=======================

          (b) observer-server half round-trip loss (hrtl_os)

          Figure 12: Half Round-trip loss (both directions)

7.3.5.  Downstream one-way loss

   Using the previous metrics, the downstream one-way loss can be
   computed:

      dowl_up = hrtl_os - uowl_down

      dowl_down = hrtl_co - uowl_up

                             =====================>
        **********     ------|----->     **********
        * Client *          Obs          * Server *
        **********     <-----|------     **********

              (a) in client-server channel (dowl_up)

        **********     ------|----->     **********
        * Client *          Obs          * Server *
        **********     <-----|------     **********
        <=====================

              (b) in server-client channel (dowl_down)

                 Figure 13: Downstream one-way loss

7.4.  Enhancement of reflection period size computation

   The rounding function used in the computation of M introduces
   errors. However, these errors can be minimized by storing the
   rounding residue each time M is computed and applying it when M is
   computed for the following reflection marking period.

   This is achieved by introducing a new parameter, r_avg, in the
   previous formula for M. The new formula is M=round(avg(p)+r_avg),
   where r_avg is computed as the unrounded value of M minus the
   rounded value of M; its initial value is 0.

7.5.  Improvement of the resilience to out-of-sequence packets

   Since endpoints can clearly identify reordered packets, this
   information can be used to absorb out-of-sequence packets in the
   incoming sQuare signal, even after the marking period threshold (see
   Section 7.2.1) has been reached.

   This is achieved by updating the size of the current reflection
   block while it is being transmitted. The reflection block size is
   updated every time an incoming reordered packet belonging to the
   previous marking period is detected. This can be done if and only if
   the transmission of the current reflection block is in progress and
   no packets of the following marking period (Q-bit) have been
   received.

8.  Protocols

8.1.  QUIC

   The binding of the delay bit signal to QUIC is partially described in
   [I-D.ietf-quic-transport], which adds the spin bit to the first byte
   of the short packet header, leaving two reserved bits for future
   experiments.

   To implement the additional signals discussed in this document, the
   first byte of the short packet header can be modified as follows:

      the delay bit (D) can be placed in the first reserved bit (i.e.
      the fourth most significant bit, 0x10) and the loss bit (L) in
      the second reserved bit (i.e. the fifth most significant bit,
      0x08); the proposed scheme is:

          0 1 2 3 4 5 6 7
         +-+-+-+-+-+-+-+-+
         |0|1|S|D|L|K|P|P|
         +-+-+-+-+-+-+-+-+

                           Figure 14: scheme 1

      alternatively, the standalone two bits loss signal (QR) can be
      placed in both reserved bits; the proposed scheme, in this case,
      is:

          0 1 2 3 4 5 6 7
         +-+-+-+-+-+-+-+-+
         |0|1|S|Q|R|K|P|P|
         +-+-+-+-+-+-+-+-+

                           Figure 15: scheme 2

8.2.  TCP

   The signals can be added to TCP by defining bit 4 of bytes 13-14 of
   the TCP header to carry the spin bit and, optionally, bits 5 and 6
   to carry additional information, such as the delay bit and the one
   bit loss signal (or the two bits loss signal).

9.  Security Considerations

   The privacy considerations for the hybrid RTT measurement signal are
   essentially the same as those for passive RTT measurement in general.

10.  Acknowledgements

   tbc

11.  IANA Considerations

   tbc

12.  References

12.1.  Normative References

   [I-D.ietf-quic-spin-exp]
              Trammell, B. and M. Kuehlewind, "The QUIC Latency Spin
              Bit", draft-ietf-quic-spin-exp-01 (work in progress),
              October 2018.

   [I-D.ietf-quic-transport]
              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
              and Secure Transport", draft-ietf-quic-transport-29 (work
              in progress), June 2020.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
              Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
              May 2016.

   [RFC8321]  Fioccola, G., Ed., Capello, A., Cociglio, M., Castaldelli,
              L., Chen, M., Zheng, L., Mirsky, G., and T. Mizrahi,
              "Alternate-Marking Method for Passive and Hybrid
              Performance Monitoring", RFC 8321, DOI 10.17487/RFC8321,
              January 2018.

12.2.  Informative References

   [ANRW19-PM-QUIC]
              ACM/IRTF Applied Networking Research Workshop 2019
              (ANRW'19), "Performance measurements of QUIC
              communications", DOI 10.1145/3340301.3341127, 2019.

   [I-D.trammell-ippm-spin]
              Trammell, B., "An Explicit Transport-Layer Signal for
              Hybrid RTT Measurement", draft-trammell-ippm-spin-00 (work
              in progress), January 2019.

   [I-D.trammell-quic-spin]
              Trammell, B., Vaere, P., Even, R., Fioccola, G., Fossati,
              T., Ihlar, M., Morton, A., and S. Emile, "Adding Explicit
              Passive Measurability of Two-Way Latency to the QUIC
              Transport Protocol", draft-trammell-quic-spin-03 (work in
              progress), May 2018.

   [I-D.trammell-tsvwg-spin]
              Trammell, B., "A Transport-Independent Explicit Signal for
              Hybrid RTT Measurement", draft-trammell-tsvwg-spin-00
              (work in progress), July 2018.

Authors' Addresses

   Mauro Cociglio
   Telecom Italia
   Via Reiss Romoli, 274
   Torino  10148
   Italy

   Email: mauro.cociglio@telecomitalia.it

   Giuseppe Fioccola
   Huawei Technologies
   Riesstrasse, 25
   Munich  80992
   Germany

   Email: giuseppe.fioccola@huawei.com

   Massimo Nilo
   Telecom Italia
   Via Reiss Romoli, 274
   Torino  10148
   Italy

   Email: massimo.nilo@telecomitalia.it

   Fabio Bulgarella
   Telecom Italia
   Via Reiss Romoli, 274
   Torino  10148
   Italy

   Email: fabio.bulgarella@guest.telecomitalia.it

   Riccardo Sisto
   Politecnico di Torino
   Corso Duca degli Abruzzi, 24
   Torino  10129
   Italy

   Email: riccardo.sisto@polito.it