IPPM                                                         M. Cociglio
Internet-Draft                                            Telecom Italia
Intended status: Experimental                                G. Fioccola
Expires: January 4, 2021                             Huawei Technologies
                                                                 M. Nilo
                                                           F. Bulgarella
                                                          Telecom Italia
                                                                R. Sisto
                                                   Politecnico di Torino
                                                            July 3, 2020

            Client-Server Explicit Performance Measurements
                 draft-cfb-ippm-spinbit-measurements-02

Abstract

   This document introduces an additional single-bit signal that
   improves the performance of the spin bit [I-D.trammell-ippm-spin] in
   the presence of network impairments and application-limited flows.
   In addition, it defines two new explicit per-flow transport-layer
   signals for hybrid measurement of connection loss rate.  The former
   is a spin-bit dependent signal and uses a single bit.  The latter is
   a standalone solution based on a two-bit loss signal and on
   alternate marking, RFC 8321 [RFC8321].

27	Requirements Language

29	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
30	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
31	   document are to be interpreted as described in RFC 2119 [RFC2119].

33	Status of This Memo

35	   This Internet-Draft is submitted in full conformance with the
36	   provisions of BCP 78 and BCP 79.

38	   Internet-Drafts are working documents of the Internet Engineering
39	   Task Force (IETF).  Note that other groups may also distribute
40	   working documents as Internet-Drafts.  The list of current Internet-
41	   Drafts is at https://datatracker.ietf.org/drafts/current/.

43	   Internet-Drafts are draft documents valid for a maximum of six months
44	   and may be updated, replaced, or obsoleted by other documents at any
45	   time.  It is inappropriate to use Internet-Drafts as reference
46	   material or to cite them other than as "work in progress."

48	   This Internet-Draft will expire on January 4, 2021.

50	Copyright Notice

52	   Copyright (c) 2020 IETF Trust and the persons identified as the
53	   document authors.  All rights reserved.

55	   This document is subject to BCP 78 and the IETF Trust's Legal
56	   Provisions Relating to IETF Documents
57	   (https://trustee.ietf.org/license-info) in effect on the date of
58	   publication of this document.  Please review these documents
59	   carefully, as they describe your rights and restrictions with respect
60	   to this document.  Code Components extracted from this document must
61	   include Simplified BSD License text as described in Section 4.e of
62	   the Trust Legal Provisions and are provided without warranty as
63	   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Spin bit and Delay bit mechanism
     2.1.  Delay Sample generation
       2.1.1.  The recovery process
     2.2.  Delay Sample reflection
   3.  Using the Spin bit and Delay bit for Hybrid RTT Measurement
     3.1.  End-to-end RTT measurement
     3.2.  Half-RTT measurement
     3.3.  Intra-domain RTT measurement
   4.  Observer's algorithm and Waiting Interval
   5.  Adding a Loss signal for Packet loss measurement
     5.1.  Round Trip Packet Loss measurement
   6.  Packet Loss using one bit loss signal
     6.1.  Observer's logic for one bit loss signal
   7.  Two Bits packet loss measurement using alternate marking
     7.1.  Setting the square bit (Q) on outgoing packets
     7.2.  Setting the reflection square bit (R) on outgoing packets
       7.2.1.  Determining the completion of an incoming marking
               period
     7.3.  Observer's logic and passive loss measurements
       7.3.1.  Upstream one-way loss
       7.3.2.  Three-quarters connection loss
       7.3.3.  Full one-way loss in the opposite direction
       7.3.4.  Half round-trip loss
       7.3.5.  Downstream one-way loss
     7.4.  Enhancement of reflection period size computation
     7.5.  Improvement of the resilience to out of sequence
   8.  Protocols
     8.1.  QUIC
     8.2.  TCP
   9.  Security Considerations
   10. Acknowledgements
   11. IANA Considerations
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

105	1.  Introduction

107	   Both [I-D.trammell-tsvwg-spin] and [I-D.trammell-ippm-spin] define an
108	   explicit per-flow transport-layer signal for hybrid measurement of
109	   end-to-end RTT.  This signal consists of three bits: a spin bit,
110	   which oscillates once per end-to-end RTT, and a two-bit Valid Edge
111	   Counter (VEC), which compensates for loss and reordering of the spin
112	   bit to increase fidelity of the signal in less than ideal network
113	   conditions.

   This document introduces the delay bit, a single-bit signal that
   passive observers can use together with the spin bit to measure the
   RTT of a network flow, avoiding the spin bit ambiguities that arise
   as soon as network conditions deteriorate.  Unlike the spin bit,
   which is set in every packet transmitted on the network, the delay
   bit is set only once per round trip.

   Regarding loss rate measurement, two new algorithms are introduced.
   The first algorithm enables end-to-end round-trip loss rate
   measurement using a single-bit signal called the loss bit.  This
   signal is used to mark a train of packets (a portion of the traffic)
   that bounces back and forth twice between the endpoints, realizing a
   two round-trip reflection.  A passive on-path observer, placed in
   either direction, can simply count and compare the number of marked
   packets seen during the two reflections to obtain a statistical
   estimate of the loss rate experienced by the connection.  The second
   algorithm uses a double square signal and RFC 8321 [RFC8321] to mark
   the whole traffic exchanged between the endpoints.  This solution
   enables different types of measurements, providing a complete
   picture of connection loss events.

137	   This document defines hybrid measurement RFC 7799 [RFC7799] path
138	   signals to be embedded into a transport layer protocol, explicitly
139	   intended for exposing end-to-end RTT and loss rate information to
140	   measurement devices on path.

142	   The document introduces mechanisms applicable to any transport-layer
143	   protocol, then explains how to bind the signals to a variety of IETF
144	   transport protocols, and in particular to QUIC and TCP.

146	   The application of the spin bit to QUIC is described in
147	   [I-D.ietf-quic-spin-exp] which adds the spin bit to QUIC for
148	   experimentation purposes.

   Note that the spin bit, delay bit and loss bits described in this
   document are inspired by RFC 8321 [RFC8321].  This is also mentioned
   in [I-D.trammell-quic-spin].

154	   Note that additional details about the Performance Measurements for
155	   QUIC are also described in the paper [ANRW19-PM-QUIC].

157	2.  Spin bit and Delay bit mechanism

159	   The main idea is to have a single packet, with a second marked bit
160	   (the delay bit), that bounces between client and server during the
161	   entire connection life.  This single packet is called Delay Sample.

163	   A simple observer placed in an intermediate point, tracking the delay
164	   sample and the relative timestamp in every spin bit period, can
165	   measure the end-to-end round trip delay of the connection.  In the
166	   same way as seen with the spin bit, it is possible to carry out other
167	   types of measurements using this additional bit.  The next paragraphs
168	   give an overview of the observer capabilities.

170	   In order to describe the delay sample working mechanism in detail, we
171	   have to distinguish two different phases which take part in the delay
172	   bit lifetime: initialization and reflection.  The initialization is
173	   the generation of the delay sample, while the reflection realizes the
174	   bounce behavior of this single packet between the two endpoints.

176	   The next figure describes the Delay bit mechanism: the first bit is
177	   the spin bit and the second one is the delay bit.

179	      +--------+   --  --  --  --  --   +--------+
180	      |        |       ----------->     |        |
181	      | Client |                        | Server |
182	      |        |      <-----------      |        |
183	      +--------+   --  --  --  --  --   +--------+

185	      (a) No traffic at beginning.

187	      +--------+   00  00  01  --  --   +--------+
188	      |        |       ----------->     |        |
189	      | Client |                        | Server |
190	      |        |      <-----------      |        |
191	      +--------+   --  --  --  --  --   +--------+
192	       (b) The Client starts sending data and
193	        sets the first packet as Delay Sample.

195	      +--------+   00  00  00  00  00   +--------+
196	      |        |       ----------->     |        |
197	      | Client |                        | Server |
198	      |        |      <-----------      |        |
199	      +--------+   --  --  01  00  00   +--------+

201	       (c) The Server starts sending data
202	        and reflects the Delay Sample.

204	      +--------+   10  10  11  00  00   +--------+
205	      |        |       ----------->     |        |
206	      | Client |                        | Server |
207	      |        |      <-----------      |        |
208	      +--------+   00  00  00  00  00   +--------+

210	      (d) The Client inverts the spin bit and
211	       reflects the Delay Sample.

213	      +--------+   10  10  10  10  10   +--------+
214	      |        |       ----------->     |        |
215	      | Client |                        | Server |
216	      |        |      <-----------      |        |
217	      +--------+   00  00  11  10  10   +--------+

219	      (e) The Server reflects the Delay Sample.

221	      +--------+   00  00  01  10  10   +--------+
222	      |        |       ----------->     |        |
223	      | Client |                        | Server |
224	      |        |      <-----------      |        |
225	      +--------+   10  10  10  10  10   +--------+

227	      (f) The client reverts the spin
228	       bit and reflects the Delay Sample.

230	                     Figure 1: Spin bit and Delay bit

232	2.1.  Delay Sample generation

   During this first phase, the endpoints play different roles.  First
   of all, a single delay sample must be bouncing per round-trip period
   (and so per spin bit period).  For this reason, and in order to
   simplify the general algorithm, the generation of the delay sample
   is the responsibility of just one of the two endpoints:

   o  the client, when the connection starts and the spin bit is set to
      0, initializes the delay bit of the first packet to 1, so that it
      becomes the delay sample for that marking period.  Only this
      packet is marked with the delay bit set to 1 for this round-trip
      period; the other ones carry only the spin bit;

   o  the server never initializes the delay bit to 1; its only task is
      to reflect the incoming delay bit into the next outgoing packet,
      and only if certain conditions occur.

   Theoretically, in the absence of network impairments, the delay
   sample should bounce between client and server continuously, for the
   entire duration of the connection.  In practice, this is highly
   unlikely, mainly for two reasons:

   1) the packet carrying the delay bit might be lost during its journey
   on the network, which is unreliable by definition;

   2) one of the two endpoints could stop or delay sending data because
   the application is limiting the amount of traffic transmitted.

   To deal with these problems, the algorithm provides a procedure to
   regenerate the delay sample and to inform a possible observer that a
   problem has occurred and that the measurement therefore has to be
   restarted.

265	2.1.1.  The recovery process

267	   In order to relieve the server from tasks that go beyond the mere
268	   reflection of the sample, even in this case the recovery process
269	   belongs to the client.  A fundamental assumption is that a delay
270	   sample is strictly related to its spin bit period.  Considering this
271	   rule, the client verifies that every spin bit period ends with its
272	   delay sample.  If that does not happen and a marking period
273	   terminates without a delay sample, the client waits a further empty
274	   period; then, in the following period, it reinitializes the mechanism
275	   by setting the delay bit of the first outgoing packet to 1, making it
276	   the new delay sample.  The empty period is needed to inform the
277	   intermediate points that there was an issue and a new delay
278	   measurement session is starting.
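
   As an illustration only, the client-side generation and recovery
   rules described above can be sketched as a three-state machine.  The
   class and method names are hypothetical, and "sample_seen" is
   assumed to mean that the client sent a delay sample (generated or
   reflected) during the period that just ended; the draft does not
   prescribe an implementation.

```python
from enum import Enum, auto


class State(Enum):
    GENERATE = auto()  # mark the first packet of the next period
    RUNNING = auto()   # a delay sample is expected to be bouncing
    EMPTY = auto()     # deliberately leave one period without a sample


class ClientDelayGenerator:
    """Hypothetical sketch of delay sample generation and recovery."""

    def __init__(self):
        # At connection start the first packet becomes the delay sample.
        self.state = State.GENERATE

    def start_period(self) -> bool:
        """Return True if the first outgoing packet of the new spin
        period must carry a freshly generated delay bit set to 1."""
        if self.state is State.GENERATE:
            self.state = State.RUNNING
            return True
        return False

    def end_period(self, sample_seen: bool) -> None:
        """Record whether the period that just ended had its sample."""
        if self.state is State.RUNNING and not sample_seen:
            # The sample was lost: wait one further, empty period.
            self.state = State.EMPTY
        elif self.state is State.EMPTY:
            # The empty period is over: regenerate in the next period.
            self.state = State.GENERATE
```

   With this sketch, a period that ends without its delay sample is
   followed by one empty period, and a new delay sample is generated in
   the period after that.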

280	2.2.  Delay Sample reflection

   The reflection is the process that enables the bouncing of the delay
   sample between client and server.  The behavior of the two endpoints
   is slightly different.  By default the delay bit is set to 0; the
   only exception is the client when, as described above, it generates
   a new delay sample.

   Server side reflection: when a packet with the delay bit set to 1
   arrives, the server marks the first packet in the opposite direction
   as the delay sample if that packet has the same spin bit value as
   the incoming one.  If it has the opposite spin bit value, the sample
   is considered lost.

   Client side reflection: when a packet with the delay bit set to 1
   arrives, the client marks the first packet in the opposite direction
   as the delay sample if that packet has the opposite spin bit value
   to the incoming one.  If it has the same spin bit value, the sample
   is considered lost.

   In both cases, if the outgoing marked packet is transmitted with a
   delay greater than a predetermined threshold (1 ms by default) after
   the reception of the incoming delay sample, reflection is aborted
   and the sample is considered lost.

   Note that reflection takes place for the packet that is carrying the
   delay bit regardless of its position within the period.  For this
   reason the validation condition above is necessary: it identifies
   and discards those samples that, due to reordering, might move to a
   contiguous period.  Furthermore, introducing a threshold for the
   retransmission delay of the sample eliminates all those measurements
   that, due to lack of traffic on the endpoints, would be
   overestimated.  Thus, the maximum estimation error, without
   considering any other delays due to flow control, amounts to twice
   the threshold (e.g. 2 ms) per measurement, in the worst case.
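
   The reflection rules of this section can be sketched as a single
   decision function.  This is a minimal sketch, not a normative
   algorithm: the function name, its parameters and the use of seconds
   as the time unit are assumptions made for illustration; only the
   1 ms default threshold and the spin bit comparisons come from the
   text above.

```python
REFLECTION_THRESHOLD_S = 0.001  # 1 ms default threshold from the text


def reflected_delay_bit(role, in_spin, in_delay, in_time,
                        out_spin, out_time,
                        threshold=REFLECTION_THRESHOLD_S):
    """Return the delay bit (0 or 1) for the first outgoing packet sent
    after an incoming packet.  All names here are hypothetical.

    role      -- "client" or "server"
    in_*      -- spin bit, delay bit and arrival time of the incoming packet
    out_*     -- spin bit and send time of the first outgoing packet
    """
    if in_delay != 1:
        return 0  # nothing to reflect: the delay bit defaults to 0
    if out_time - in_time > threshold:
        return 0  # reflected too late: the sample is considered lost
    same_spin = (out_spin == in_spin)
    if role == "server":
        # The server reflects only within the same spin period.
        return 1 if same_spin else 0
    if role == "client":
        # The client reflects only into the next (inverted) period.
        return 0 if same_spin else 1
    raise ValueError("role must be 'client' or 'server'")
```

   Keeping the threshold as a parameter makes it easy to experiment
   with values other than the default.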

315	3.  Using the Spin bit and Delay bit for Hybrid RTT Measurement

   With the spin bit alone, an observer has to validate, or at least
   heuristically evaluate, the goodness of each edge.  The delay
   sample, instead, can be used by an intermediate observer as a simple
   demarcator between one period and the next, eliminating the
   ambiguities in the RTT calculation that arise when only the spin bit
   is analyzed.  The types of measurement that can be obtained by
   observing the delay sample are exactly the same as those achievable
   with the spin bit alone.

326	3.1.  End-to-end RTT measurement

   The delay sample generation process ensures that only one packet
   marked with the delay bit set to 1 runs back and forth on the wire
   between two endpoints per round-trip time.  Therefore, in order to
   determine the end-to-end RTT of a QUIC flow, an on-path passive
   observer can simply compute the time difference between two delay
   samples observed in a single direction.  Note that a measurement, to
   be valid, must be the difference between the timestamps of two
   consecutive delay samples belonging to adjacent spin-bit periods.
   For this reason, an observer, in addition to intercepting and
   analyzing the packets containing the delay bit set to 1, must keep
   track of each spin period so that it can assign each delay sample to
   its period and, at the same time, identify those periods that do not
   contain one.

343	           =======================|======================>
344	           = **********     -----Obs---->     ********** =
345	           = * Client *                       * Server * =
346	           = **********     <------------     ********** =
347	           <==============================================

349	                     (a) client-server RTT

351	           ==============================================>
352	           = **********     ------------>     ********** =
353	           = * Client *                       * Server * =
354	           = **********     <----Obs-----     ********** =
355	           <======================|=======================

357	                     (b) server-client RTT

                Figure 2: Round-trip time (both directions)
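
   The single-direction measurement described in this section can be
   sketched as follows.  The sketch assumes that the observer has
   already assigned each packet to a spin period and numbered the
   periods consecutively; the tuple layout and the function name are
   hypothetical.

```python
def rtt_samples(observations):
    """Compute end-to-end RTT samples from one traffic direction.

    observations -- iterable of (timestamp, period, delay_bit) tuples in
                    time order, where `period` is the index of the spin
                    period the observer assigned to the packet.
    """
    samples = []
    previous = None  # (timestamp, period) of the last delay sample seen
    for timestamp, period, delay_bit in observations:
        if delay_bit != 1:
            continue  # only delay samples delimit a measurement
        # A measurement is valid only between adjacent spin periods;
        # a gap means at least one period carried no delay sample.
        if previous is not None and period == previous[1] + 1:
            samples.append(timestamp - previous[0])
        previous = (timestamp, period)
    return samples
```

   For example, with timestamps in milliseconds, delay samples seen at
   0 (period 0), 40 (period 1), 120 (period 3) and 160 (period 4)
   yield two samples of 40 ms each; the pair (40, 120) is discarded
   because period 2 was empty.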

361	3.2.  Half-RTT measurement

   An on-path passive observer that is sniffing traffic in both
   directions -- from client to server and from server to client -- can
   also use the delay sample to measure the "upstream" and "downstream"
   RTT components.  These half-RTT measurements represent the
   components of the end-to-end RTT concerning the paths between the
   client and the observer (upstream), and the observer and the server
   (downstream).  The observer obtains them by measuring the delay
   between a delay sample observed in the downstream direction and the
   one observed in the upstream direction, and vice versa.  In this
   case too, it should verify that the two delay samples belong to two
   adjacent periods, for the upstream component, or to the same period,
   for the downstream component.

376	           =======================>
377	           = **********     ------|----->     **********
378	           = * Client *          Obs          * Server *
379	           = **********     <-----|------     **********
380	           <=======================

382	                  (a) client-observer half-RTT

384	                                  =======================>
385	             **********     ------|----->     ********** =
386	             * Client *          Obs          * Server * =
387	             **********     <-----|------     ********** =
388	                                  <=======================

390	                  (b) observer-server half-RTT

              Figure 3: Half Round-trip time (both directions)
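
   A possible way to derive the two components is sketched below.  It
   assumes that the observer keeps, for each direction, the timestamp
   of the delay sample seen in each spin period, and that both
   directions share the same period numbering.  The names "c2s"
   (client to server) and "s2c" (server to client) are hypothetical.

```python
def half_rtts(c2s, s2c):
    """Return (upstream, downstream) half-RTT samples keyed by period.

    c2s -- dict: period -> timestamp of the delay sample seen travelling
           from client to server
    s2c -- dict: period -> timestamp of the delay sample seen travelling
           from server to client
    """
    # Observer-server component: the server reflects within the same
    # period, so both samples must belong to the same period.
    downstream = {p: s2c[p] - c2s[p] for p in c2s if p in s2c}
    # Client-observer component: the client reflects into the next
    # period, so the two samples belong to adjacent periods.
    upstream = {p + 1: c2s[p + 1] - s2c[p] for p in s2c if p + 1 in c2s}
    return upstream, downstream
```

   When both components are available for the same round trip, their
   sum equals the end-to-end RTT measured in a single direction.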

394	3.3.  Intra-domain RTT measurement

396	   Taking advantage of the half-RTT measurements it is also possible to
397	   calculate the intra-domain RTT which is the portion of the entire RTT
398	   used by a QUIC flow to traverse the network of a provider (or part of
399	   it).  To achieve this result two observers, able to watch traffic in
400	   both directions, must be employed simultaneously at ingress and
401	   egress of the network to be measured.  At this point, to determine
402	   the delay between the two observers, it is enough to subtract the two
403	   computed upstream (or downstream) RTT components.

405	         =========================================>
406	         = =====================>
407	         = = **********      ---|-->           ---|-->      **********
408	         = = * Client *         Obs               Obs       * Server *
409	         = = **********      <--|---           <--|---      **********
410	         = <=====================
411	         <=========================================

413	                  (a) client-observer RTT components (half-RTTs)

415	                                ==================>
416	             **********      ---|-->           ---|-->      **********
417	             * Client *         Obs               Obs       * Server *
418	             **********      <--|---           <--|---      **********
419	                                <==================

421	                  (b) the intra-domain RTT resulting from the
422	                  subtraction of the above RTT components

424	    Figure 4: Intra-domain Round-trip time (client-observer: upstream)
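
   As a worked example of the subtraction described above, the sketch
   below takes the client-observer (upstream) half-RTT measured by each
   of the two observers.  The function and parameter names are
   hypothetical, and the values in the usage note are invented purely
   for illustration.

```python
def intra_domain_rtt(upstream_at_ingress, upstream_at_egress):
    """Return the RTT portion spent between two observers.

    upstream_at_ingress -- client-observer half-RTT measured by the
                           observer closer to the client
    upstream_at_egress  -- client-observer half-RTT measured by the
                           observer closer to the server
    Both values must use the same time unit.
    """
    return upstream_at_egress - upstream_at_ingress
```

   For instance, if the observer near the client measures an upstream
   component of 12 ms and the observer near the server measures 28 ms,
   the intra-domain RTT is 16 ms.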

   The spin bit is a signal generated by alternate marking; the only
   difference from RFC 8321 [RFC8321] is the size of the alternation,
   which changes with the flight size each RTT.  It can therefore be
   used to segment the RTT and deduce the contribution to the RTT of
   the portion of the network between two on-path observers.  This is
   easily done by calculating the delay between two or more measurement
   points in a single direction by applying RFC 8321 [RFC8321].

435	4.  Observer's algorithm and Waiting Interval

   Given below is a formal summary of how the observer behaves every
   time a delay sample is detected.  A packet containing the delay bit
   set to 1:

   o  if it has the same spin bit value as the current period and no
      delay sample was detected in the previous period, then it can be
      used as a left edge (i.e. to start measuring an RTT sample), but
      not as a right edge (i.e. to complete an RTT measurement since
      the last edge).  If the observation point is symmetric (i.e. it
      can see both upstream and downstream packets in the flow) and in
      the current period a delay sample was detected in the opposite
      direction (i.e. in the upstream direction), the packet can also be
      used to compute the downstream RTT component.

   o  if it has the same spin bit value as the current period and a
      delay sample was detected in the previous period, then it can be
      used both as a left edge and as a right edge, and to compute the
      RTT components in both directions.

   As stated previously, every time an empty period is detected, the
   observer must restart the measurement process and treat the next
   delay sample that arrives as the beginning of a new measurement,
   that is, as a left edge.  As a result, being able to assign the
   delay sample to the corresponding spin period becomes a crucial
   factor for the proper functioning of the entire algorithm.

   Considering that the division into periods is realized by exploiting
   the spin bit square wave, it is easy to understand that the presence
   of spurious spin edges -- caused by packet reordering -- would
   inevitably lead the observer to overestimate the number of periods
   actually present in the transmission.  This results in a greater
   number of empty periods detected and a consequent decrease in the
   number of RTT samples actually achievable.  Therefore, in order to
   maximize the performance of the whole algorithm, the observer must
   implement a mechanism to filter out spurious spin edges.

   To address this problem, a waiting interval is introduced.
   Basically, every time a spin bit edge is detected, the observer sets
   a time interval during which it rejects every potential spurious
   edge observed on the wire.  At the end of the interval it starts
   accepting changes in the spin bit value again.  This protects
   against spurious edges to an extent that depends on the size of the
   interval itself.  For instance, an interval of 5 ms is able to
   filter out edges that have been reordered by a maximum of 5 ms.
   Clearly, the mechanism only works for intervals smaller than the RTT
   of the observed connection (if the RTT is smaller than the waiting
   interval, the observer cannot measure the RTT).
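
   A minimal sketch of the waiting interval follows.  It assumes that
   the observer processes packets of one direction in arrival order as
   (timestamp, spin bit) pairs; the function name and the data layout
   are hypothetical.

```python
def filter_spin_edges(packets, waiting_interval):
    """Return the timestamps of the spin bit edges that are accepted.

    packets          -- iterable of (timestamp, spin) pairs in arrival
                        order, for a single traffic direction
    waiting_interval -- time after an accepted edge during which any
                        further spin bit change is rejected
    """
    edges = []
    current_spin = None
    hold_until = float("-inf")  # end of the current waiting interval
    for timestamp, spin in packets:
        if current_spin is None:
            current_spin = spin  # first packet fixes the initial value
            continue
        if spin != current_spin and timestamp >= hold_until:
            # Genuine edge: accept it and open a new waiting interval.
            edges.append(timestamp)
            current_spin = spin
            hold_until = timestamp + waiting_interval
        # A change seen inside the waiting interval is treated as a
        # reordered packet and ignored.
    return edges
```

   With a 5 ms interval, a reordered packet that arrives 1 ms after a
   genuine edge and still carries the old spin value does not open a
   new period.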

485	5.  Adding a Loss signal for Packet loss measurement

   It is also possible to introduce a mechanism to evaluate packet loss
   together with the delay measurement.  This can be achieved by
   introducing the loss signal, a single-bit signal whose purpose is to
   mark a variable number of packets (from live traffic) that are
   exchanged twice between the endpoints, realizing a two round-trip
   reflection.  The overall exchange comprises:

   o  The client first selects, generates and then transmits to the
      server a first train of packets, setting the loss bit to 1;

   o  The server, upon reception from the client of each one of the
      packets included in the first train, reflects to the client a
      respective second train of packets of the same size as the first
      train received, setting the loss bit to 1;

   o  The client, upon reception from the server of each one of the
      packets included in the second train, reflects to the server a
      respective third train of packets of the same size as the second
      train received, setting the loss bit to 1;

   o  The server, upon reception from the client of each one of the
      packets included in the third train, finally reflects to the
      client a respective fourth train of packets of the same size as
      the third train received, setting the loss bit to 1.

512	   Packets belonging to the first round (first and second train)
513	   represent the Generation Phase while those belonging to the second
514	   round (third and fourth train) represent the Reflection Phase.

   A passive on-path observer, placed in either direction, can simply
   count and compare the number of marked packets seen during the two
   phases (i.e. the first and third trains, or the second and fourth
   trains, depending on which direction is observed) and estimate the
   loss rate experienced by the connection.  This process is repeated
   continuously to obtain more measurements as long as the endpoints
   exchange traffic.  These measurements can be called Round Trip (RT)
   losses.
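
   The comparison made by the observer can be sketched as below.  The
   text does not give an explicit formula, so the ratio used here (the
   fraction of Generation Phase packets that are missing from the
   Reflection Phase) is an assumption, as are the function and
   parameter names.

```python
def round_trip_loss(generation_count, reflection_count):
    """Estimate the round-trip loss rate seen by a one-way observer.

    generation_count -- marked packets counted in the Generation Phase
                        train visible in the observed direction
    reflection_count -- marked packets counted in the corresponding
                        Reflection Phase train
    """
    if generation_count <= 0:
        raise ValueError("no marked packets seen in the Generation Phase")
    # Packets present in the first train but missing one round trip
    # later are attributed to loss along the round trip.
    return (generation_count - reflection_count) / generation_count
```

   For example, counting 100 marked packets in the Generation Phase
   and 97 in the Reflection Phase gives an estimated round-trip loss
   of 3%.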

   The general algorithm shown above gives an idea of its underlying
   principles but is not enough to make the whole process work
   properly.

   Firstly, there is the issue that packet rates in the two directions
   may be different.  Therefore, the number of packets to be marked has
   to be chosen so as to avoid congestion of marked packets in the
   slower traffic direction.  As a consequence, this number is
   necessarily equal to the number of packets that transit in the
   slower direction.  This problem can be easily addressed by a method
   wherein the two endpoints of a communication exchange marked packets
   interleaved with unmarked packets.  From an implementation point of
   view, this result can be achieved by introducing a single token
   system that adjusts the number of outgoing marked packets.
   Basically, the token is enabled every time a packet arrives and
   disabled when a marked packet is transmitted.  Since the creation of
   the initial train of marked packets is carried out by the client,
   the management and use of this single token is also assigned to the
   client, which in effect "calculates" the correct number of packets
   to be marked each time.
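
   The single token described above can be sketched as follows.  This
   is only an illustration of the enable/disable rule; the class and
   method names are hypothetical, and the sketch ignores how trains
   are delimited and identified.

```python
class MarkingToken:
    """Hypothetical single token kept by the client."""

    def __init__(self):
        self.enabled = False

    def on_packet_received(self):
        # Every incoming packet enables the token.
        self.enabled = True

    def loss_bit_for_outgoing(self):
        """Return the loss bit for the next outgoing packet and
        disable the token when a marked packet is transmitted."""
        if self.enabled:
            self.enabled = False
            return 1
        return 0


def mark_sequence(events):
    """Apply the token to a string of events: 'R' for a received
    packet, 'S' for a sent packet.  Returns the loss bits of the
    sent packets, in order."""
    token = MarkingToken()
    bits = []
    for event in events:
        if event == "R":
            token.on_packet_received()
        elif event == "S":
            bits.append(token.loss_bit_for_outgoing())
    return bits
```

   Because at most one packet is marked per packet received, the
   number of marked packets cannot exceed the number of packets that
   arrived from the other direction, which interleaves marked and
   unmarked packets when the client sends faster than it receives.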

546	   Secondly, a mechanism to individually identify each train of packets
547	   must be provided to enable the observer to distinguish between trains
548	   belonging to different phases (Generation and Reflection).

550	5.1.  Round Trip Packet Loss measurement

   Since the measurements are performed on a portion of the traffic
   exchanged between client and server, the observer calculates an
   end-to-end Round Trip Packet Loss that, statistically, will be equal
   to the loss rate experienced by the connection along the entire
   network path.  This measurement is referred to simply as the Round
   Trip Packet Loss (RTPL).

559	           =======================|======================>
560	           = **********     -----Obs---->     ********** =
561	           = * Client *                       * Server * =
562	           = **********     <------------     ********** =
563	           <==============================================

565	                     (a) client-server RTPL

567	           ==============================================>
568	           = **********     ------------>     ********** =
569	           = * Client *                       * Server * =
570	           = **********     <----Obs-----     ********** =
571	           <======================|=======================

573	                     (b) server-client RTPL

575	            Figure 5: Round-trip packet loss (both directions)

577	   In addition, this methodology allows the Half-RTPL measurement and
578	   the Intra-domain RTPL measurement, in the same way as described in
579	   the previous sections for RTT measurement.

581	           =======================>
582	           = **********     ------|----->     **********
583	           = * Client *          Obs          * Server *
584	           = **********     <-----|------     **********
585	           <=======================

587	                  (a) client-observer half-RTPL

589	                                  =======================>
590	             **********     ------|----->     ********** =
591	             * Client *          Obs          * Server * =
592	             **********     <-----|------     ********** =
593	                                  <=======================

595	                  (b) observer-server half-RTPL

597	         Figure 6: Half Round-trip packet loss (both directions)

599	                             =========================================>
600	                                               =====================> =
601	          **********      ---|-->           ---|-->      ********** = =
602	          * Client *         Obs               Obs       * Server * = =
603	          **********      <--|---           <--|---      ********** = =
604	                                               <===================== =
605	                             <=========================================

607	               (a) observer-server RTPL components (half-RTPLs)

609	                             ==================>
610	          **********      ---|-->           ---|-->      **********
611	          * Client *         Obs               Obs       * Server *
612	          **********      <--|---           <--|---      **********
613	                             <==================

615	               (b) the intra-domain RTPL resulting from the
616	               subtraction of the above RTPL components

618	      Figure 7: Intra-domain Round-trip packet loss (observer-server)

620	6.  Packet Loss using one bit loss signal

622	   The single bit loss signal, whose basic mechanism was generalized in
623	   the previous section, is implemented using just one bit: marked
624	   packets have this bit set to 1, whereas unmarked ones have it set to
625	   0.  This solution requires a working spin-bit signal used to separate
626	   different trains of packets.  In particular, a "pause" of at least
627	   one empty spin-bit period is introduced between each phase of the
628	   algorithm.  In this way, an on-path observer can determine when a
629	   phase (and therefore a train of packets) has ended and a new one is
630	   starting.

632	   The client bears almost the entire complexity of the algorithm.
633	   Its task can be summarized in four steps:

635	   1.  The client generates marked packets for two consecutive
636	       spin-bit periods; it maintains a generation token that is enabled
637	       every time a packet arrives and disabled when a marked packet is
638	       sent.  While the token is disabled, generation is paused (i.e.
639	       outgoing packets are transmitted unmarked); it resumes as soon as
640	       the token is enabled again, which happens when the next packet is
641	       received.  In addition, at the end of the first spin-bit period
642	       spent in generation, the reflection counter is unlocked to start
643	       counting incoming marked packets, which will later be
644	       reflected;

646	   2.  When generation is completed, the client waits until it observes
647	       an empty spin-bit period in the incoming traffic, so as to be
648	       sure that at least that empty period has been seen along the
649	       whole path.  The observer uses this period as a divider between
650	       generated and reflected packets.  During this phase, all outgoing
651	       packets are sent with the loss bit set to 0.  The reflection
652	       counter is still incremented every time a marked packet arrives;

654	   3.  The client reflects marked packets until the reflection counter
655	       reaches zero; the generation token is also used (in the same
656	       way) during this phase to avoid congestion in the slower traffic
657	       direction.  In addition, at the end of the first spin-bit period
658	       spent in reflection, the reflection counter is locked so that
659	       incoming reflected packets do not increment it;

661	   4.  When reflection is completed, the client again waits until it
662	       observes an empty spin-bit period in the incoming traffic, so as
663	       to be sure that at least that empty period has been seen along
664	       the whole path.  The observer uses this period as a divider
665	       between reflected and newly generated packets.  During this
666	       phase, all outgoing packets are sent with the loss bit set to 0.
667	       The whole process then restarts from the first step.

669	   As previously anticipated, the server simply reflects each incoming
670	   marked packet sent by the client.  It maintains a simple counter that
671	   is incremented every time a marked packet arrives and decremented
672	   when a marked one is sent in the opposite direction.
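
   The server-side behaviour described above amounts to a single
   counter.  The following Python sketch is illustrative only; the
   class and method names are hypothetical and are not defined by this
   document.

```python
class ServerReflector:
    """Reflects each incoming marked packet (hypothetical names)."""

    def __init__(self):
        self.pending = 0  # marked packets received but not yet reflected

    def on_packet_received(self, loss_bit):
        # The counter is incremented every time a marked packet arrives.
        if loss_bit == 1:
            self.pending += 1

    def next_loss_bit(self):
        """Return the loss bit to set on the next outgoing packet."""
        if self.pending > 0:
            # The counter is decremented when a marked packet is sent back.
            self.pending -= 1
            return 1
        return 0
```

   Keeping a counter rather than a flag lets the server reflect every
   marked packet even when several of them arrive before it has a
   packet to send.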

674	6.1.  Observer's logic for one bit loss signal

676	   The on-path observer, placed in either direction, counts marked
677	   packets and separates different trains by detecting the empty
678	   spin-bit periods (one or more) between them.  It then computes the
679	   difference between a Generation train and a Reflection train to
680	   produce a statistical measurement of the Round Trip Packet Loss
681	   (RTPL) and of the connection end-to-end loss rate.

683	   Here is an example.  Packets are represented by two digits (first one
684	   is the spin bit, second one is the loss bit):

686	          Generation          Pause           Reflection       Pause
687	     ____________________ ______________ ____________________ ________
688	    |                    |              |                    |        |
689	     01 01 00 01 11 10 11 00 00 10 10 10 01 00 01 01 10 11 10 00 00 10

691	                   Figure 8: one bit loss signal example

693	   Note that 5 marked packets were generated, 4 of which were reflected.
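
   The counting performed on the example of Figure 8 can be reproduced
   with the short Python sketch below.  It is an illustration under
   stated assumptions: spin-bit periods are delimited purely by changes
   of the spin-bit value (no reordering is handled), the first train is
   taken to be the Generation train, and expressing the loss as a rate
   relative to the generated packets is an assumption of this sketch.
   The function name is hypothetical.

```python
from itertools import groupby


def marked_trains(packets):
    """Count marked packets per train.

    packets is a sequence of (spin_bit, loss_bit) tuples.  Trains are
    separated by one or more spin-bit periods without marked packets.
    """
    trains = []
    current = 0
    # Each run of equal spin-bit values is treated as one spin-bit period.
    for _spin, period in groupby(packets, key=lambda pkt: pkt[0]):
        marked = sum(loss for _s, loss in period)
        if marked > 0:
            current += marked
        elif current > 0:
            # An empty period closes the train seen so far.
            trains.append(current)
            current = 0
    if current > 0:
        # Trailing train not yet closed by an empty period.
        trains.append(current)
    return trains


FIGURE_8 = "01 01 00 01 11 10 11 00 00 10 10 10 01 00 01 01 10 11 10 00 00 10"
packets = [(int(tok[0]), int(tok[1])) for tok in FIGURE_8.split()]
generated, reflected = marked_trains(packets)
rtpl = (generated - reflected) / generated  # 5 generated, 4 reflected
```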

695	7.  Two Bits packet loss measurement using alternate marking

697	   An alternative methodology, based on the classical alternate marking
698	   method [RFC8321], can be deployed to enable passive packet loss
699	   measurement in connection-oriented communication.  This section
700	   explains its fundamentals and the metrics that can be obtained by
701	   exploiting this mechanism.

703	   Two new loss bits are introduced:

705	   o  Square Bit (Q): this bit is toggled every N outgoing packets
706	      generating a square signal as already seen in the alternate
707	      marking methodology RFC 8321 [RFC8321].

709	   o  Reflection Square Bit (R): this bit is used to reflect the
710	      incoming square signal (the one generated by the opposite
711	      endpoint) according to the algorithm explained in Section 7.2; in
712	      a nutshell, it is used to report the losses found in the opposite
713	      transmission channel.

715	7.1.  Setting the square bit (Q) on outgoing packets

717	   The sQuare value is initialized to 0 and is applied to the Q-bit of
718	   every outgoing packet.  The sQuare value is toggled after sending N
719	   packets (e.g. 64).  By doing so, each endpoint splits its outgoing
720	   traffic into blocks of N packets with different "packet color" as
721	   defined by RFC 8321 [RFC8321].  A single block of N packets is
722	   called a "marking period".  Observation points can estimate upstream
723	   losses by counting the number of packets included in a marking
724	   period of the produced square signal.
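
   As an illustration, the Q-bit marking rule can be sketched in Python
   as follows.  The class and method names are hypothetical; N=64 is
   the example value given above.

```python
class SquareMarker:
    """Generates the sQuare signal on outgoing packets (hypothetical names)."""

    def __init__(self, n=64):
        self.n = n      # marking period length N
        self.q = 0      # the sQuare value is initialized to 0
        self.sent = 0   # packets already sent in the current marking period

    def next_q_bit(self):
        """Return the Q-bit value for the next outgoing packet."""
        if self.sent == self.n:
            # Toggle after sending N packets and start a new marking period.
            self.q ^= 1
            self.sent = 0
        self.sent += 1
        return self.q
```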

726	7.2.  Setting the reflection square bit (R) on outgoing packets

728	   Unlike the sQuare signal, for which packets are transmitted in
729	   blocks of fixed size, the Reflection square signal (which is also an
730	   alternate marking signal) produces blocks of packets whose size
731	   varies according to these simple rules:

733	   o  when the transmission of a new block starts, its size is set equal
734	      to the size of the last marking period whose reception has been
735	      completed;

737	   o  if, before transmission of the block is terminated, the reception
738	      of at least one further marking period is completed, the size of
739	      the block is updated to the average size of the further received
740	      marking periods.  Implementation details follow.

742	   The Reflection square value is initialized to 0 and is applied to the
743	   R-bit of every outgoing packet.  The Reflection square value is
744	   toggled for the first time when the completion of a marking period is
745	   detected in the incoming sQuare signal (produced by the opposite node
746	   using the Q-bit).  When this happens, the number of packets (p),
747	   detected within this first marking period, is used to generate a
748	   reflection square signal which toggles every M=p packets (at first).
749	   This new signal produces blocks of M packets (marked using the
750	   R-bit), each of which is called a "reflection marking period".

752	   The M value is then updated every time a completed marking period in
753	   the incoming sQuare signal is received, following this formula:
754	   M=round(avg(p)).

756	   The parameter avg(p) is the average number of packets in a marking
757	   period computed considering all the marking periods received since
758	   the beginning of the current reflection marking period.
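
   The rules above can be sketched in Python as follows.  This is one
   possible reading of the text, not a reference implementation: the
   names are hypothetical, the rounding mode (half up) is an assumption
   since the text only says round(), and the completion of incoming
   marking periods is assumed to be signalled by the caller.

```python
import math


def round_half_up(x):
    # Assumption: round halves up; the text does not fix the rounding mode.
    return math.floor(x + 0.5)


class ReflectionSquare:
    """Generates the Reflection square signal (hypothetical names)."""

    def __init__(self):
        self.r = 0          # the Reflection square value is initialized to 0
        self.m = None       # block size M, unknown until a period completes
        self.last_p = None  # size of the last completed incoming period
        self.sizes = []     # periods completed during the current block
        self.sent = 0       # packets sent in the current reflection block

    def on_period_completed(self, p):
        """Call when an incoming marking period of p packets is completed."""
        self.last_p = p
        if self.m is None:
            # First toggle: start reflecting with M = p.
            self.r ^= 1
            self.m = p
            self.sent = 0
        else:
            # M = round(avg(p)) over the periods received in this block.
            self.sizes.append(p)
            self.m = round_half_up(sum(self.sizes) / len(self.sizes))

    def next_r_bit(self):
        """Return the R-bit value for the next outgoing packet."""
        if self.m is not None and self.sent >= self.m:
            # New block: its size is that of the last completed period.
            self.r ^= 1
            self.m = self.last_p
            self.sizes = []
            self.sent = 0
        self.sent += 1
        return self.r
```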

760	   By looking at the R-bit, observation points get a clear indication
761	   of the losses experienced by the entire opposite channel plus those
762	   that occurred in the path from the sender up to them (if losses
763	   occur in this latter portion of the path).

765	7.2.1.  Determining the completion of an incoming marking period

767	   A simple sQuare bit transition cannot be used to determine the
768	   completion of a marking period.  Indeed, packet reordering can lead
769	   to the generation of spurious edges in the sQuare signal.  To address
770	   this problem, a marking period is considered ended when at least X
771	   packets (e.g. 5) with reverse marking (i.e. belonging to the
772	   following marking period) have been received.
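
   A minimal Python sketch of this threshold rule is given below.  The
   names are hypothetical and X=5 is the example value from the text.
   The sketch deliberately ignores packets of the previous period that
   arrive after the threshold has been reached; that case is discussed
   separately in this document.

```python
class PeriodDetector:
    """Detects completed marking periods in an incoming sQuare signal."""

    def __init__(self, x=5):
        self.x = x         # threshold X of reverse-marked packets
        self.current = 0   # Q value of the period being received
        self.count = 0     # packets counted in the current period
        self.reverse = 0   # packets seen so far with the reverse marking

    def on_packet(self, q):
        """Process one packet; return the size of a completed period or None."""
        if q == self.current:
            self.count += 1
            return None
        self.reverse += 1
        if self.reverse < self.x:
            # Possibly a spurious edge caused by reordering: keep waiting.
            return None
        # X reverse-marked packets seen: the current period is completed.
        completed = self.count
        self.current = q
        self.count = self.reverse  # they belong to the new period
        self.reverse = 0
        return completed
```

   With X=2, for example, the sequence of Q values 0 0 0 1 0 1 yields
   one completed period of 4 packets: the single early packet with Q=1
   does not end the period.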

774	   This same approach can be used by observation points to clean both
775	   sQuare and Reflection square signals.

777	7.3.  Observer's logic and passive loss measurements

779	   Since both sQuare and Reflection square bits are toggled at most
780	   every N packets (except for the first transition of the R-bit as
781	   explained before), an on-path observer can trivially count the number
782	   of packets of each marking block and, knowing the value of N, can
783	   estimate the amount of loss experienced by the connection.  Different
784	   metrics can be measured depending on which direction(s) the observer
785	   monitors.

787	   One-direction observer:

789	   o  upstream one-way loss: the loss between the sender and the
790	      observation point

792	   o  "three-quarters" connection loss: the loss between the receiver
793	      and the sender in the opposite direction plus the loss between the
794	      sender and the observation point in the observed direction

796	   o  full one-way loss in the opposite direction: the loss between the
797	      receiver and the sender in the opposite direction

799	   Two-direction observer (the same metrics as above, applied to both
800	   directions, plus):

802	   o  client-observer half round-trip loss: the loss between the client
803	      and the observation point in both directions

805	   o  observer-server half round-trip loss: the loss between the
806	      observation point and the server in both directions

808	   o  downstream one-way loss: the loss between the observation point
809	      and the receiver (valid for both directions)

811	7.3.1.  Upstream one-way loss

813	   Since packets are continuously Q-bit marked in alternating blocks of
814	   size N, an on-path observer that knows N can estimate the amount of
815	   loss that occurred from the sender up to it after observing at
816	   least N packets.  The upstream one-way loss rate ("uowl") is one
817	   minus the average number of packets in a block of packets with the
818	   same Q value ("p") divided by N ("uowl=1-avg(p)/N").

820	          =====================>
821	          **********     -----Obs---->     **********
822	          * Client *                       * Server *
823	          **********     <------------     **********

825	            (a) in client-server channel (uowl_up)

827	          **********     ------------>     **********
828	          * Client *                       * Server *
829	          **********     <----Obs-----     **********
830	                               <=====================

832	            (b) in server-client channel (uowl_down)

834	                      Figure 9: Upstream one-way loss
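
   As a worked illustration of the formula uowl=1-avg(p)/N, the
   following Python sketch computes the loss rate from a list of
   observed block sizes.  The function name and the sample block sizes
   are hypothetical.

```python
def loss_rate(block_sizes, n):
    """Return 1 - avg(p)/N for the observed block sizes p."""
    if not block_sizes:
        raise ValueError("at least one observed block is required")
    avg_p = sum(block_sizes) / len(block_sizes)
    return 1 - avg_p / n


# Hypothetical observation: four Q-bit blocks seen with N = 64.
uowl = loss_rate([64, 62, 63, 61], 64)  # avg(p) = 62.5
```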

836	7.3.2.  Three-quarters connection loss

838	   Except for the very first block, in which there is nothing to
839	   reflect (a complete marking period has not yet been received),
840	   packets are continuously R-bit marked in alternating blocks of size
841	   less than or equal to N.  Knowing N, an on-path observer can
842	   estimate the loss that occurred in the whole opposite channel plus
843	   the loss from the sender up to it in the observed channel.  As for
844	   the previous metric, the "three-quarters" connection loss rate
845	   ("tql") is one minus the average number of packets in a block of
846	   packets with the same R value ("t") divided by N ("tql=1-avg(t)/N").

848	        =======================>
849	        = **********     -----Obs---->     **********
850	        = * Client *                       * Server *
851	        = **********     <------------     **********
852	        <============================================

854	            (a) in client-server channel (tql_up)

856	          ============================================>
857	          **********     ------------>     ********** =
858	          * Client *                       * Server * =
859	          **********     <----Obs-----     ********** =
860	                               <=======================

862	            (b) in server-client channel (tql_down)

864	                 Figure 10: Three-quarters connection loss

866	   The following metrics derive from these first two metrics.

868	7.3.3.  Full one-way loss in the opposite direction

870	   Using the previous metrics, full one-way loss can be computed:

872	   fowl_down = tql_up - uowl_up

874	   fowl_up = tql_down - uowl_down

876	          **********     -----Obs---->     **********
877	          * Client *                       * Server *
878	          **********     <------------     **********
879	          <==========================================

881	            (a) in client-server channel (fowl_down)

883	          ==========================================>
884	          **********     ------------>     **********
885	          * Client *                       * Server *
886	          **********     <----Obs-----     **********

888	            (b) in server-client channel (fowl_up)

890	          Figure 11: Full one-way loss in the opposite direction

892	7.3.4.  Half round-trip loss

894	   Using the previous metrics, the two half round-trip loss measurements
895	   can be computed:

897	   hrtl_co = tql_up - uowl_down

899	   hrtl_os = tql_down - uowl_up

901	        =======================>
902	        = **********     ------|----->     **********
903	        = * Client *          Obs          * Server *
904	        = **********     <-----|------     **********
905	        <=======================

907	            (a) client-observer half round-trip loss (hrtl_co)

909	                               =======================>
910	          **********     ------|----->     ********** =
911	          * Client *          Obs          * Server * =
912	          **********     <-----|------     ********** =
913	                               <=======================

915	            (b) observer-server half round-trip loss (hrtl_os)

917	            Figure 12: Half Round-trip loss (both directions)

919	7.3.5.  Downstream one-way loss

921	   Using the previous metrics, downstream one-way loss can be computed:

923	   dowl_up = hrtl_os - uowl_down

925	   dowl_down = hrtl_co - uowl_up

926	                               =====================>
927	          **********     ------|----->     **********
928	          * Client *          Obs          * Server *
929	          **********     <-----|------     **********

931	             (a) in client-server channel (dowl_up)

933	          **********     ------|----->     **********
934	          * Client *          Obs          * Server *
935	          **********     <-----|------     **********
936	          <=====================

938	             (b) in server-client channel (dowl_down)

940	                    Figure 13: Downstream one-way loss
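
   The derived metrics of Sections 7.3.3 to 7.3.5 can be computed from
   the four directly measured values as in the Python sketch below.
   The function name and the sample loss rates are hypothetical; the
   formulas are those given in the text.

```python
def derive_loss_metrics(tql_up, tql_down, uowl_up, uowl_down):
    """Derive the remaining loss metrics from tql and uowl values."""
    hrtl_co = tql_up - uowl_down
    hrtl_os = tql_down - uowl_up
    return {
        "fowl_down": tql_up - uowl_up,
        "fowl_up": tql_down - uowl_down,
        "hrtl_co": hrtl_co,
        "hrtl_os": hrtl_os,
        "dowl_up": hrtl_os - uowl_down,
        "dowl_down": hrtl_co - uowl_up,
    }


# Hypothetical measurements from a two-direction observer.
metrics = derive_loss_metrics(
    tql_up=0.05, tql_down=0.04, uowl_up=0.01, uowl_down=0.02
)
```

   With these sample values, each full one-way loss equals the sum of
   the upstream and downstream one-way losses in the same direction,
   which is a useful consistency check on the formulas.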

942	7.4.  Enhancement of reflection period size computation

944	   The rounding function used in the computation of M introduces
945	   errors.  However, these errors can be minimized by storing the
946	   rounding error each time M is computed, and applying it during the
947	   computation of the M value in the following reflection marking
948	   period.

950	   This can be achieved by introducing a new parameter, r_avg, in the
951	   previous M formula.  The new formula is M=round(avg(p)+r_avg), where
952	   r_avg is computed as the unrounded M minus the rounded M; its
953	   initial value is 0.
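
   The following Python sketch shows one way to carry the rounding
   residual from one computation of M to the next.  It rests on two
   assumptions not fixed by the text: the "not rounded M" is taken to
   be avg(p)+r_avg, and halves are rounded up.  The names are
   hypothetical.

```python
import math


class ReflectionSizeRounder:
    """Computes M=round(avg(p)+r_avg) and carries the residual forward."""

    def __init__(self):
        self.r_avg = 0.0  # the initial value of r_avg is 0

    def next_m(self, avg_p):
        unrounded = avg_p + self.r_avg
        m = math.floor(unrounded + 0.5)  # assumption: round half up
        # r_avg is the unrounded M minus the rounded M.
        self.r_avg = unrounded - m
        return m


rounder = ReflectionSizeRounder()
sizes = [rounder.next_m(62.4) for _ in range(3)]  # 62, 63, 62
```

   Without the correction, three periods averaging 62.4 packets would
   each be reflected as 62; with it, the total reflected (187) stays
   closer to the total received (187.2).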

955	7.5.  Improvement of the resilience to out-of-sequence packets

957	   Since endpoints have a clear indication of reordered packets, this
958	   information can be used to absorb out-of-sequence packets in the
959	   incoming sQuare wave, even when the marking period threshold (see
960	   Section 7.2.1) has been reached.

962	   This can be achieved by updating the size of the current reflection
963	   block while it is being transmitted.  The reflection block size is
964	   then updated every time an incoming reordered packet of the previous
965	   marking period is detected.  This can be done if and only if the
966	   transmission of the current reflection block is in progress and no
967	   packets of the following marking period (Q-bit) have been received.

969	8.  Protocols

971	8.1.  QUIC

973	   The binding of the delay bit signal to QUIC is partially described in
974	   [I-D.ietf-quic-transport], which adds the spin bit to the first byte
975	   of the short packet header, leaving two reserved bits for future
976	   experiments.

978	   To implement the additional signals discussed in this document, the
979	   first byte of the short packet header can be modified as follows:

981	      the delay bit (D) can be placed in the first reserved bit (i.e.
982	      the fourth most significant bit _0x10_), while the loss bit (L)
983	      can be placed in the second reserved bit (i.e. the fifth most
984	      significant bit _0x08_); the proposed scheme is:

986	          0 1 2 3 4 5 6 7
987	         +-+-+-+-+-+-+-+-+
988	         |0|1|S|D|L|K|P|P|
989	         +-+-+-+-+-+-+-+-+

991	                            Figure 14: scheme 1

993	      alternatively, the standalone two bits loss signal (QR) can be
994	      placed in both reserved bits; the proposed scheme, in this case,
995	      is:

997	          0 1 2 3 4 5 6 7
998	         +-+-+-+-+-+-+-+-+
999	         |0|1|S|Q|R|K|P|P|
1000	         +-+-+-+-+-+-+-+-+

1002	                            Figure 15: scheme 2
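
   For illustration, the bit positions of the two schemes can be
   handled with simple masks, as in the Python sketch below.  The mask
   values follow the figures above (0x20 for S, 0x10 for D or Q, 0x08
   for L or R); the constant and function names are hypothetical, and
   the sketch operates on the first byte as laid out in the figures.

```python
SPIN_BIT = 0x20       # S, third most significant bit
FIRST_RESERVED = 0x10   # D in scheme 1, Q in scheme 2
SECOND_RESERVED = 0x08  # L in scheme 1, R in scheme 2


def get_bit(first_byte, mask):
    """Return 1 if the bit selected by mask is set in first_byte, else 0."""
    return 1 if first_byte & mask else 0


def set_bit(first_byte, mask, value):
    """Return first_byte with the bit selected by mask set to value."""
    if value:
        return first_byte | mask
    return first_byte & ~mask & 0xFF


# Example: short header first byte with only the fixed bit (0x40) set.
byte = set_bit(0x40, FIRST_RESERVED, 1)    # set D (or Q)
byte = set_bit(byte, SECOND_RESERVED, 1)   # set L (or R)
```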

1004	8.2.  TCP

1006	   The signals can be added to TCP by defining bit 4 of bytes 13-14 of
1007	   the TCP header to carry the spin bit, and possibly bits 5 and 6 to
1008	   carry additional information, such as the delay bit and the one-bit
1009	   loss signal (or the two-bit loss signal).

1011	9.  Security Considerations

1013	   The privacy considerations for the hybrid RTT measurement signal are
1014	   essentially the same as those for passive RTT measurement in general.

1016	10.  Acknowledgements

1018	   tbc

1020	11.  IANA Considerations

1022	   tbc

1024	12.  References

1026	12.1.  Normative References

1028	   [I-D.ietf-quic-spin-exp]
1029	              Trammell, B. and M. Kuehlewind, "The QUIC Latency Spin
1030	              Bit", draft-ietf-quic-spin-exp-01 (work in progress),
1031	              October 2018.

1033	   [I-D.ietf-quic-transport]
1034	              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
1035	              and Secure Transport", draft-ietf-quic-transport-29 (work
1036	              in progress), June 2020.

1038	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1039	              Requirement Levels", BCP 14, RFC 2119,
1040	              DOI 10.17487/RFC2119, March 1997,
1041	              <https://www.rfc-editor.org/info/rfc2119>.

1043	   [RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
1044	              Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
1045	              May 2016, <https://www.rfc-editor.org/info/rfc7799>.

1047	   [RFC8321]  Fioccola, G., Ed., Capello, A., Cociglio, M., Castaldelli,
1048	              L., Chen, M., Zheng, L., Mirsky, G., and T. Mizrahi,
1049	              "Alternate-Marking Method for Passive and Hybrid
1050	              Performance Monitoring", RFC 8321, DOI 10.17487/RFC8321,
1051	              January 2018, <https://www.rfc-editor.org/info/rfc8321>.

1053	12.2.  Informative References

1055	   [ANRW19-PM-QUIC]
1056	              ACM/IRTF Applied Networking Research Workshop 2019
1057	              (ANRW'19), "Performance measurements of QUIC
1058	              communications", DOI 10.1145/3340301.3341127, 2019.

1060	   [I-D.trammell-ippm-spin]
1061	              Trammell, B., "An Explicit Transport-Layer Signal for
1062	              Hybrid RTT Measurement", draft-trammell-ippm-spin-00 (work
1063	              in progress), January 2019.

1065	   [I-D.trammell-quic-spin]
1066	              Trammell, B., Vaere, P., Even, R., Fioccola, G., Fossati,
1067	              T., Ihlar, M., Morton, A., and S. Emile, "Adding Explicit
1068	              Passive Measurability of Two-Way Latency to the QUIC
1069	              Transport Protocol", draft-trammell-quic-spin-03 (work in
1070	              progress), May 2018.

1072	   [I-D.trammell-tsvwg-spin]
1073	              Trammell, B., "A Transport-Independent Explicit Signal for
1074	              Hybrid RTT Measurement", draft-trammell-tsvwg-spin-00
1075	              (work in progress), July 2018.

1077	Authors' Addresses

1079	   Mauro Cociglio
1080	   Telecom Italia
1081	   Via Reiss Romoli, 274
1082	   Torino  10148
1083	   Italy

1085	   Email: mauro.cociglio@telecomitalia.it

1087	   Giuseppe Fioccola
1088	   Huawei Technologies
1089	   Riesstrasse, 25
1090	   Munich  80992
1091	   Germany

1093	   Email: giuseppe.fioccola@huawei.com

1095	   Massimo Nilo
1096	   Telecom Italia
1097	   Via Reiss Romoli, 274
1098	   Torino  10148
1099	   Italy

1101	   Email: massimo.nilo@telecomitalia.it

1103	   Fabio Bulgarella
1104	   Telecom Italia
1105	   Via Reiss Romoli, 274
1106	   Torino  10148
1107	   Italy

1109	   Email: fabio.bulgarella@guest.telecomitalia.it

1110	   Riccardo Sisto
1111	   Politecnico di Torino
1112	   Corso Duca degli Abruzzi, 24
1113	   Torino  10129
1114	   Italy

1116	   Email: riccardo.sisto@polito.it