idnits 2.17.1 

draft-ietf-ippm-alt-mark-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 701: '...correlation mechanism SHOULD be in use...'
     RFC 2119 keyword, line 756: '...   SHOULD provide a way to configure t...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (June 26, 2017) is 2490 days in the past.  Is this
     intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC5905' is mentioned on line 724, but not defined

  == Missing Reference: 'IEEE1588' is mentioned on line 725, but not defined

  ** Obsolete normative reference: RFC 2679 (Obsoleted by RFC 7679)

  ** Obsolete normative reference: RFC 2680 (Obsoleted by RFC 7680)

  == Outdated reference: A later version (-05) exists of
     draft-bryant-mpls-sfl-framework-04

  == Outdated reference: A later version (-12) exists of
     draft-ietf-bier-mpls-encapsulation-07

  == Outdated reference: A later version (-15) exists of
     draft-ietf-bier-pmmm-oam-01

  == Outdated reference: A later version (-07) exists of
     draft-ietf-mpls-flow-ident-04

  == Outdated reference: A later version (-10) exists of
     draft-ietf-mpls-rfc6374-sfl-00

  == Outdated reference: A later version (-14) exists of
     draft-mirsky-sfc-pmamm-00


     Summary: 3 errors (**), 0 flaws (~~), 9 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                   G. Fioccola, Ed.
3	Internet-Draft                                           A. Capello, Ed.
4	Intended status: Experimental                                M. Cociglio
5	Expires: December 28, 2017                                L. Castaldelli
6	                                                          Telecom Italia
7	                                                            M. Chen, Ed.
8	                                                           L. Zheng, Ed.
9	                                                     Huawei Technologies
10	                                                          G. Mirsky, Ed.
11	                                                                     ZTE
12	                                                         T. Mizrahi, Ed.
13	                                                                 Marvell
14	                                                           June 26, 2017

16	 Alternate Marking method for passive and hybrid performance monitoring
17	                      draft-ietf-ippm-alt-mark-05

19	Abstract

21	   This document describes a method to perform packet loss, delay and
22	   jitter measurements on live traffic.  This method is based on the
23	   Alternate Marking (Coloring) technique.  A report on the operational
24	   experiment done at Telecom Italia is presented in order to give an
25	   example and show the method's applicability.  This technique can be
26	   applied in various situations as detailed in this document and could
27	   be considered passive or hybrid depending on the application.

29	Status of This Memo

31	   This Internet-Draft is submitted in full conformance with the
32	   provisions of BCP 78 and BCP 79.

34	   Internet-Drafts are working documents of the Internet Engineering
35	   Task Force (IETF).  Note that other groups may also distribute
36	   working documents as Internet-Drafts.  The list of current Internet-
37	   Drafts is at http://datatracker.ietf.org/drafts/current/.

39	   Internet-Drafts are draft documents valid for a maximum of six months
40	   and may be updated, replaced, or obsoleted by other documents at any
41	   time.  It is inappropriate to use Internet-Drafts as reference
42	   material or to cite them other than as "work in progress."

44	   This Internet-Draft will expire on December 28, 2017.

46	Copyright Notice

48	   Copyright (c) 2017 IETF Trust and the persons identified as the
49	   document authors.  All rights reserved.

51	   This document is subject to BCP 78 and the IETF Trust's Legal
52	   Provisions Relating to IETF Documents
53	   (http://trustee.ietf.org/license-info) in effect on the date of
54	   publication of this document.  Please review these documents
55	   carefully, as they describe your rights and restrictions with respect
56	   to this document.  Code Components extracted from this document must
57	   include Simplified BSD License text as described in Section 4.e of
58	   the Trust Legal Provisions and are provided without warranty as
59	   described in the Simplified BSD License.

61	Table of Contents

63	   1.  Introduction
64	   2.  Overview of the method
65	   3.  Detailed description of the method
66	     3.1.  Packet loss measurement
67	     3.2.  Timing aspects
68	     3.3.  One-way delay measurement
69	       3.3.1.  Single marking methodology
70	       3.3.2.  Double marking methodology
71	     3.4.  Delay variation measurement
72	   4.  Considerations
73	     4.1.  Synchronization
74	     4.2.  Data Correlation
75	     4.3.  Packet Re-ordering
76	   5.  Implementation and deployment
77	     5.1.  Report on the operational experiment at Telecom Italia
78	       5.1.1.  Coloring the packets
79	       5.1.2.  Counting the packets
80	       5.1.3.  Collecting data and calculating packet loss
81	       5.1.4.  Metric transparency
82	     5.2.  IP flow performance measurement (IPFPM)
83	     5.3.  OAM Passive Performance Measurement
84	     5.4.  RFC6374 Use Case
85	     5.5.  Application to active performance measurement
86	   6.  Hybrid measurement
87	   7.  Compliance with RFC6390 guidelines
88	   8.  Security Considerations
89	   9.  Conclusions
90	   10. IANA Considerations
91	   11. Acknowledgements
92	   12. References
93	     12.1.  Normative References
94	     12.2.  Informative References
95	   Authors' Addresses

97	1.  Introduction

99	   Nowadays, most of the traffic in Service Providers' networks carries
100	   content that is highly sensitive to packet loss [RFC2680], delay
101	   [RFC2679], and jitter [RFC3393].

103	   In view of this scenario, Service Providers need methodologies and
104	   tools to monitor and measure network performance with adequate
105	   accuracy, in order to constantly control the quality of experience
106	   perceived by their customers.  In addition, performance
107	   monitoring provides useful information for improving network
108	   management (e.g.  isolation of network problems, troubleshooting,
109	   etc.).

111	   A lot of work related to OAM, which also includes performance
112	   monitoring techniques, has been done by Standards Developing
113	   Organizations (SDOs): [RFC7276] provides a good overview of existing
114	   OAM mechanisms defined in IETF, ITU-T and IEEE.  Considering IETF, a
115	   lot of work has been done on fault detection and connectivity
116	   verification, while less effort has been dedicated so far to
117	   performance monitoring.  The IPPM WG has defined standard metrics to
118	   measure network performance; however, the methods developed in this
119	   WG mainly focus on active measurement techniques.  More
120	   recently, the MPLS WG has defined mechanisms for measuring packet
121	   loss, one-way and two-way delay, and delay variation in MPLS
122	   networks [RFC6374], but their applicability to passive measurements
123	   has some limitations, especially for pure connection-less networks.

125	   The lack of adequate tools to measure packet loss with the desired
126	   accuracy drove an effort to design a new method for the performance
127	   monitoring of live traffic, possibly easy to implement and deploy.
128	   The effort led to the method described in this document: basically,
129	   it is a passive performance monitoring technique, potentially
130	   applicable to any kind of packet based traffic, including Ethernet,
131	   IP, and MPLS, both unicast and multicast.  The method addresses
132	   primarily packet loss measurement, but it can be easily extended to
133	   one-way delay and delay variation measurements as well.

135	   The method has been explicitly designed for passive measurements but
136	   it can also be used with active probes.  Passive measurements are
137	   usually more easily understood by customers and provide a much better
138	   accuracy, especially for packet loss measurements.

140	   RFC 7799 [RFC7799] defines passive and hybrid methods of measurement.
141	   In particular, Passive Methods of Measurement are based solely on
142	   observations of an undisturbed and unmodified packet stream of
143	   interest; Hybrid Methods are Methods of Measurement that use a
144	   combination of Active Methods and Passive Methods.

146	   Taking these definitions into consideration, the Alternate Marking
147	   Method can be considered Hybrid or Passive depending on the case.  In case
148	   the marking field is obtained by changing existing field values of
149	   the packets (e.g.  DSCP field), the technique is Hybrid.  In case the
150	   marking field is dedicated, reserved and included in the protocol
151	   specification, the Alternate Marking technique can be considered
152	   Passive (e.g.  RFC6374 Synonymous Flow Label or OAM Marking Bits in
153	   BIER Header).

155	   This document is organized as follows:

157	   o  Section 2 gives an overview of the method, including a comparison
158	      with different measurement strategies;

160	   o  Section 3 describes the method in detail;

162	   o  Section 4 reports considerations about synchronization, data
163	      correlation and packet re-ordering;

165	   o  Section 5 reports examples of implementation and deployment of the
166	      method.  Furthermore the operational experiment done at Telecom
167	      Italia is described;

169	   o  Section 6 introduces Hybrid measurement aspects;

171	   o  Section 7 is about the Compliance with RFC6390 guidelines;

173	   o  Section 8 includes some security aspects;

175	   o  Section 9 finally summarizes some concluding remarks.

177	2.  Overview of the method

179	   In order to perform packet loss measurements on a live traffic flow,
180	   different approaches exist.  The most intuitive one consists in
181	   numbering the packets, so that each router that receives the flow can
182	   immediately detect a missing packet.  This approach, though very
183	   simple in theory, is not simple to achieve: it requires the insertion
184	   of a sequence number into each packet and the devices must be able to
185	   extract the number and check it in real time.  Such a task can be
186	   difficult to implement on live traffic: if UDP is used as the
187	   transport protocol, the sequence number is not available; on the
188	   other hand, if a higher layer sequence number (e.g. in the RTP
189	   header) is used, extracting that information from each packet and
190	   processing it in real time could overload the device.

192	   An alternate approach is to count the number of packets sent on one
193	   end, the number of packets received on the other end, and to compare
194	   the two values.  This operation is much simpler to implement, but
195	   requires that the devices performing the measurement are in sync: in
196	   order to compare two counters it is required that they refer exactly
197	   to the same set of packets.  Since a flow is continuous and cannot be
198	   stopped when a counter has to be read, it could be difficult to
199	   determine exactly when to read the counter.  A possible solution to
200	   overcome this problem is to virtually split the flow into consecutive
201	   blocks by periodically inserting a delimiter so that each counter
202	   refers exactly to the same block of packets.  The delimiter could be
203	   for example a special packet inserted artificially into the flow.
204	   However, delimiting the flow using specific packets has some
205	   limitations.  First, it requires generating additional packets within
206	   the flow and requires the equipment to be able to process those
207	   packets.  In addition, the method is vulnerable to out of order
208	   reception of delimiting packets and, to a lesser extent, to their
209	   loss.

211	   The method proposed in this document follows the second approach, but
212	   it doesn't use additional packets to virtually split the flow in
213	   blocks.  Instead, it "colors" the packets so that the packets
214	   belonging to the same block will have the same color, whilst
215	   consecutive blocks will have different colors.  Each change of color
216	   represents a sort of auto-synchronization signal that guarantees the
217	   consistency of measurements taken by different devices along the
218	   path.

220	   Figure 1 represents a very simple network and shows how the method
221	   can be used to measure packet loss on different network segments: by
222	   enabling the measurement on several interfaces along the path, it is
223	   possible to perform link monitoring, node monitoring or end-to-end
224	   monitoring.  The method is flexible enough to measure packet loss on
225	   any segment of the network and can be used to isolate the faulty
226	   element.

228	                            Traffic flow
229	        ========================================================>
230	          +------+       +------+       +------+       +------+
231	      ---<>  R1  <>-----<>  R2  <>-----<>  R3  <>-----<>  R4  <>---
232	          +------+       +------+       +------+       +------+
233	          .              .      .              .       .      .
234	          .              .      .              .       .      .
235	          .              <------>              <------->      .
236	          .          Node Packet Loss      Link Packet Loss   .
237	          .                                                   .
238	          <--------------------------------------------------->
239	                           End-to-End Packet loss

241	                     Figure 1: Available measurements

243	3.  Detailed description of the method

245	   This section describes in detail how the method operates.  A special
246	   emphasis is given to the measurement of packet loss, which represents
247	   the core application of the method, but applicability to delay and
248	   jitter measurements is also considered.

250	3.1.  Packet loss measurement

252	   The basic idea is to virtually split traffic flows into consecutive
253	   blocks: each block represents a measurable entity unambiguously
254	   recognizable by all network devices along the path.  By counting the
255	   number of packets in each block and comparing the values measured by
256	   different network devices along the path, it is possible to measure
257	   the packet loss that occurred in any single block between any two points.

259	   As discussed in the previous section, a simple way to create the
260	   blocks is to "color" the traffic (two colors are sufficient) so that
261	   packets belonging to different consecutive blocks will have different
262	   colors.  Whenever the color changes, the previous block terminates
263	   and the new one begins.  Hence, all the packets belonging to the same
264	   block will have the same color and packets of different consecutive
265	   blocks will have different colors.  The number of packets in each
266	   block depends on the criterion used to create the blocks: if the
267	   color is switched after a fixed number of packets, then each block
268	   will contain the same number of packets (except for any losses); but
269	   if the color is switched according to a fixed timer, then the number
270	   of packets may be different in each block depending on the packet
271	   rate.
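
   The two block-creation criteria described above can be illustrated
   with a minimal sketch.  The function names, the 10-second timer and
   the block size of 2 are assumptions made for this example only; they
   are not defined by this document.

```python
def color_by_timer(t: float, block_duration: float) -> str:
    """Color in force at time t when the color flips every block_duration."""
    block_index = int(t // block_duration)
    return "A" if block_index % 2 == 0 else "B"


def color_by_count(packet_index: int, block_size: int) -> str:
    """Color of the packet_index-th packet (0-based) with fixed-size blocks."""
    return "A" if (packet_index // block_size) % 2 == 0 else "B"


# Packets sent at irregular times (seconds): with a timer, the number of
# packets per block depends on the packet rate.
send_times = [0.5, 3.0, 9.9, 10.0, 12.0, 25.0]
timer_colors = [color_by_timer(t, 10.0) for t in send_times]

# With a fixed packet count, every block holds the same number of packets.
count_colors = [color_by_count(i, 2) for i in range(6)]
```

   In this sketch the timer-based blocks contain three, two and one
   packets, while the count-based blocks always contain two.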

273	   The following figure shows what a flow looks like when it is split into
274	   traffic blocks with colored packets.

276	   A: packet with A coloring
277	   B: packet with B coloring

279	            |           |           |           |           |
280	            |           |    Traffic flow       |           |
281	    ------------------------------------------------------------------->
282	     BBBBBBB AAAAAAAAAAA BBBBBBBBBBB AAAAAAAAAAA BBBBBBBBBBB AAAAAAA
283	    ------------------------------------------------------------------->
284	       ...  |  Block 5  |  Block 4  |  Block 3  |  Block 2  |  Block 1
285	            |           |           |           |           |

287	                        Figure 2: Traffic coloring

289	   Figure 3 shows how the method can be used to measure link packet loss
290	   between two adjacent nodes.

292	   Referring to the figure, let's assume we want to monitor the packet
293	   loss on the link between two routers: router R1 and router R2.
294	   According to the method, the traffic is colored alternately with
295	   two different colors, A and B.  Whenever the color changes, the
296	   transition generates a sort of square-wave signal, as depicted in the
297	   following figure.

299	   Color A   ----------+           +-----------+           +----------
300	                       |           |           |           |
301	   Color B             +-----------+           +-----------+
302	              Block n        ...      Block 3     Block 2     Block 1
303	            <---------> <---------> <---------> <---------> <--------->

305	                                Traffic flow
306	            ===========================================================>
307	   Color ...AAAAAAAAAAA BBBBBBBBBBB AAAAAAAAAAA BBBBBBBBBBB AAAAAAA...
308	            ===========================================================>

310	                 Figure 3: Computation of link packet loss

312	   Traffic coloring could be done by R1 itself or by an upstream router.
313	   R1 needs two counters, C(A)R1 and C(B)R1, on its egress interface:
314	   C(A)R1 counts the packets with color A and C(B)R1 counts those with
315	   color B.  As long as traffic is colored A, only counter C(A)R1 will
316	   be incremented, while C(B)R1 is not incremented; vice versa, when the
317	   traffic is colored as B, only C(B)R1 is incremented.  C(A)R1 and
318	   C(B)R1 can be used as reference values to determine the packet loss
319	   from R1 to any other measurement point down the path.  Router R2,
320	   similarly, will need two counters on its ingress interface, C(A)R2
321	   and C(B)R2, to count the packets received on that interface and
322	   colored with color A and B respectively.  When an A block ends, it is
323	   possible to compare C(A)R1 and C(A)R2 and calculate the packet loss
324	   within the block; similarly, when the successive B block terminates,
325	   it is possible to compare C(B)R1 with C(B)R2, and so on for every
326	   successive block.
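
   The counter handling just described can be sketched as follows.  The
   class ColorCounters and the packet sequences are hypothetical and
   serve only to illustrate the idea; a real device would keep such
   counters per interface, typically in hardware.

```python
from collections import Counter


class ColorCounters:
    """Per-color packet counters kept on one interface (hypothetical helper)."""

    def __init__(self) -> None:
        self.counts = Counter()

    def observe(self, color: str) -> None:
        # Only the counter matching the packet's color is incremented.
        self.counts[color] += 1

    def read_and_reset(self, color: str) -> int:
        # Read the counter of a finished block, then reset it to zero.
        value = self.counts[color]
        self.counts[color] = 0
        return value


r1_egress = ColorCounters()    # plays the role of C(A)R1 and C(B)R1
r2_ingress = ColorCounters()   # plays the role of C(A)R2 and C(B)R2

sent = ["A"] * 5 + ["B"] * 4        # colors of the packets leaving R1
received = ["A"] * 5 + ["B"] * 3    # one B-colored packet is lost on the link

for color in sent:
    r1_egress.observe(color)
for color in received:
    r2_ingress.observe(color)

# Compare the counters of the same color once the block has ended.
loss_a = r1_egress.read_and_reset("A") - r2_ingress.read_and_reset("A")
loss_b = r1_egress.read_and_reset("B") - r2_ingress.read_and_reset("B")
```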

328	   Likewise, by using two counters on the R2 egress interface it is possible
329	   to count the packets sent out of that interface and use them as
330	   reference values to calculate the packet loss from R2 to any
331	   measurement point downstream of R2.

333	   Using a fixed timer for color switching offers better control over
334	   the method: the (time) length of the blocks can be chosen large
335	   enough to simplify the collection and the comparison of measures
336	   taken by different network devices.  It is preferable not to read the
337	   counters immediately after the color switch: some
338	   packets could arrive out of order and increment the counter
339	   associated with the previous block (color), so it is worth waiting for
340	   some time.  A safe choice is to wait L/2 time units (where L is the
341	   duration of each block) after the color switch before reading the
342	   counter of the previous color, which minimizes the chance of reading
343	   a counter that is still running.  The drawback is that
344	   the longer the duration of the block, the less frequently the
345	   measurement can be taken.
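
   As a small illustration of the L/2 rule, the sketch below computes
   when the counter of a finished block would be read.  The block
   duration of 300 seconds and the function name are assumptions chosen
   for the example.

```python
def counter_read_time(switch_time: float, block_duration: float) -> float:
    """Time at which to read the counter of the block that ended at switch_time."""
    return switch_time + block_duration / 2


L = 300.0                                  # block duration in seconds (illustrative)
switches = [L * k for k in range(1, 4)]    # color switches at 300, 600 and 900 s
read_times = [counter_read_time(t, L) for t in switches]
```

   Each read falls in the middle of the following block, when the
   counter of the previous color is expected to be still.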

347	   The following table shows how the counters can be used to calculate
348	   the packet loss between R1 and R2.  The first column lists the
349	   sequence of traffic blocks while the other columns contain the
350	   counters of A-colored packets and B-colored packets for R1 and R2.
351	   In this example, we assume that the values of the counters are reset
352	   to zero whenever a block ends and its associated counter has been
353	   read: with this assumption, the table shows only relative values,
354	   that is, the exact number of packets of each color within each block.
355	   If the values of the counters were not reset, the table would contain
356	   cumulative values, but the relative values could be determined simply
357	   by difference from the value of the previous block of the same color.
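
   If the counters are never reset, the per-block values can be
   recovered by difference, as in the following sketch.  The cumulative
   readings are derived from the A-block values of Table 1 (375 and 382
   on R1, 375 and 381 on R2); the helper name is hypothetical.

```python
def per_block_values(cumulative: list[int]) -> list[int]:
    """Turn successive readings of a never-reset counter into per-block counts."""
    previous = 0
    result = []
    for reading in cumulative:
        result.append(reading - previous)
        previous = reading
    return result


# Cumulative C(A) readings taken after blocks 1 and 3.
c_a_r1 = [375, 757]
c_a_r2 = [375, 756]

loss = [
    sent - received
    for sent, received in zip(per_block_values(c_a_r1), per_block_values(c_a_r2))
]
```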

359	   The color is switched on the basis of a fixed timer (not shown in the
360	   table), so the number of packets in each block is different.

362	           +-------+--------+--------+--------+--------+------+
363	           | Block | C(A)R1 | C(B)R1 | C(A)R2 | C(B)R2 | Loss |
364	           +-------+--------+--------+--------+--------+------+
365	           | 1     | 375    | 0      | 375    | 0      | 0    |
366	           |       |        |        |        |        |      |
367	           | 2     | 0      | 388    | 0      | 388    | 0    |
368	           |       |        |        |        |        |      |
369	           | 3     | 382    | 0      | 381    | 0      | 1    |
370	           |       |        |        |        |        |      |
371	           | 4     | 0      | 377    | 0      | 374    | 3    |
372	           |       |        |        |        |        |      |
373	           | ...   | ...    | ...    | ...    | ...    | ...  |
374	           |       |        |        |        |        |      |
375	           | 2n    | 0      | 387    | 0      | 387    | 0    |
376	           |       |        |        |        |        |      |
377	           | 2n+1  | 379    | 0      | 377    | 0      | 2    |
378	           +-------+--------+--------+--------+--------+------+

380	       Table 1: Evaluation of counters for packet loss measurements

382	   During an A block (blocks 1, 3 and 2n+1), all the packets are
383	   A-colored, therefore the C(A) counters are incremented to the number
384	   seen on the interface, while C(B) counters are zero.  Vice versa,
385	   during a B block (blocks 2, 4 and 2n), all the packets are B-colored:
386	   C(A) counters are zero, while C(B) counters are incremented.

388	   When a block ends (because of color switching) the relative counters
389	   stop incrementing and it is possible to read them, compare the values
390	   measured on router R1 and R2 and calculate the packet loss within
391	   that block.

393	   For example, looking at the table above, during the first block
394	   (A-colored), C(A)R1 and C(A)R2 have the same value (375), which
395	   corresponds to the exact number of packets of the first block (no
396	   loss).  Also during the second block (B-colored) R1 and R2 counters
397	   have the same value (388), which corresponds to the number of packets
398	   of the second block (no loss).  During blocks three and four, R1 and
399	   R2 counters are different, meaning that some packets have been lost:
400	   in the example, one single packet (382-381) was lost during block
401	   three and three packets (377-374) were lost during block four.
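
   The arithmetic of Table 1 can be reproduced with a short sketch.
   The tuple layout and the function name are choices made for this
   example; the counter values are those of the table.

```python
# Rows of Table 1: (block, C(A)R1, C(B)R1, C(A)R2, C(B)R2)
table_1 = [
    ("1", 375, 0, 375, 0),
    ("2", 0, 388, 0, 388),
    ("3", 382, 0, 381, 0),
    ("4", 0, 377, 0, 374),
    ("2n", 0, 387, 0, 387),
    ("2n+1", 379, 0, 377, 0),
]


def block_loss(row) -> int:
    """Packets lost between R1 and R2 within one block."""
    _, a_r1, b_r1, a_r2, b_r2 = row
    # Only one color is active per block, so the other pair contributes zero.
    return (a_r1 - a_r2) + (b_r1 - b_r2)


losses = {row[0]: block_loss(row) for row in table_1}
```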

403	   The method applied to R1 and R2 can be extended to any other router
404	   and applied to more complex networks, as long as the measurement is
405	   enabled on the path followed by the traffic flow(s) being observed.

407	3.2.  Timing aspects

409	   This document introduces two color switching methods: one is based on
410	   a fixed number of packets, the other is based on a fixed timer.  The
411	   method based on a fixed timer is preferable because it is more
412	   deterministic, and it will be considered in the rest of the document.

414	   To account for the clock error between network devices R1 and R2,
415	   the devices must be synchronized to the same clock reference with an
416	   accuracy of +/- L/2 time units, where L is the time duration of the
417	   block, so that each colored packet can be assigned to the right batch by
418	   each router.  This is because the minimum time distance between two
419	   packets of the same color but belonging to different batches is L
420	   time units.

422	   In practice, packets can also arrive out of order at batch boundaries,
423	   an effect strictly related to the delay between measurement points.  This
424	   means that, even leaving clock error aside, we wait L/2 after color
425	   switching to be sure to read a still counter.

427	   In summary, we need to take into account two contributions: the clock
428	   error between network devices and the interval we need to wait to
429	   avoid out-of-order packets caused by network delay.

431	   The following figure explains both issues.

433	   ...BBBBBBBBB | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | BBBBBBBBB...
434	                |<======================================>|
435	                |                   L                    |
436	   ...=========>|<==================><==================>|<==========...
437	                |       L/2                   L/2        |
438	                |<===>|                            |<===>|
439	                   d  |                            |   d
440	                      |<==========================>|
441	                       available counting interval

443	                         Figure 4: Timing aspects

445	   It is assumed that all network devices are synchronized to a common
446	   reference time with an accuracy of +/- A/2.  Thus, the difference
447	   between the clock values of any two network devices is bounded by A.

449	   The guardband d is given by:

451	   d = A + D_max - D_min,
452	   where A is the clock accuracy, D_max is an upper bound on the network
453	   delay between the network devices, and D_min is a lower bound on the
454	   delay.

456	   The available counting interval is L - 2d, which must be > 0.

458	   The condition that must be satisfied, which is a requirement on the
459	   synchronization accuracy, is:

461	   d < L/2.
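
   The guardband and the resulting condition can be checked with a few
   lines of code.  The numeric values below (in milliseconds) are
   illustrative assumptions, not values taken from this document, and
   the function names are hypothetical.

```python
def guardband(clock_accuracy: float, d_max: float, d_min: float) -> float:
    """d = A + D_max - D_min, with all terms in the same time unit."""
    return clock_accuracy + d_max - d_min


def counting_interval(block_duration: float, d: float) -> float:
    """Available counting interval L - 2d; it must be greater than zero."""
    return block_duration - 2 * d


# Illustrative values in milliseconds.
A, D_MAX, D_MIN = 2, 50, 5
d = guardband(A, D_MAX, D_MIN)

ok_long = d < 10_000 / 2                   # L = 10 s: d < L/2 holds
ok_short = d < 80 / 2                      # L = 80 ms: d < L/2 fails
interval = counting_interval(10_000, d)    # counting interval for L = 10 s
```

   With these numbers, a 10-second block leaves most of the block
   available for reading the counters, whereas an 80-millisecond block
   is too short for the assumed clock accuracy and delay spread.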

463	3.3.  One-way delay measurement

465	   The same principle used to measure packet loss can be applied also to
466	   one-way delay measurement.  There are three alternatives, as
467	   described hereinafter.

469	3.3.1.  Single marking methodology

471	   The alternation of colors can be used as a time reference to
472	   calculate the delay.  Whenever the color changes (that means that a
473	   new block has started) a network device can store the timestamp of
474	   the first packet of the new block; that timestamp can be compared
475	   with the timestamp of the same packet on a second router to compute
476	   packet delay.  Considering Figure 2, R1 stores a timestamp TS(A1)R1
477	   when it sends the first packet of block 1 (A-colored), a timestamp
478	   TS(B2)R1 when it sends the first packet of block 2 (B-colored) and so
479	   on for every other block.  R2 performs the same operation on the
480	   receiving side, recording TS(A1)R2, TS(B2)R2 and so on.  Since the
481	   timestamps refer to specific packets (the first packet of each block)
482	   we are sure that timestamps compared to compute delay refer to the
483	   same packets.  By comparing TS(A1)R1 with TS(A1)R2 (and similarly
484	   TS(B2)R1 with TS(B2)R2 and so on) it is possible to measure the delay
485	   between R1 and R2.  In order to have more measurements, it is
486	   possible to take and store more timestamps, referring to other
487	   packets within each block.

489	   In order to coherently compare timestamps collected on different
490	   routers, the network nodes must be in sync.  Furthermore, a
491	   measurement is valid only if no packet loss occurs and if packet
492	   misordering can be avoided, otherwise the first packet of a block on
493	   R1 could be different from the first packet of the same block on R2
494	   (for instance, if that packet is lost between R1 and R2 or it arrives after
495	   the next one).
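
   A minimal sketch of the single marking approach follows.  It assumes
   synchronized clocks, no packet loss and no reordering between R1 and
   R2, as required above.  The observation lists (timestamps in
   microseconds) and the function name are hypothetical.

```python
def first_packet_timestamps(observations):
    """Return (color, timestamp) for the first packet of each block.

    observations: list of (timestamp_us, color) in the order seen on one
    router.  A new block starts whenever the color differs from that of
    the previous packet.
    """
    firsts = []
    previous_color = None
    for timestamp_us, color in observations:
        if color != previous_color:
            firsts.append((color, timestamp_us))
            previous_color = color
    return firsts


# Hypothetical observations of the same packets on R1 (sending side)
# and R2 (receiving side).
r1 = [(1000, "A"), (1400, "A"), (2000, "B"), (2600, "B"), (3000, "A")]
r2 = [(4100, "A"), (4450, "A"), (5050, "B"), (5700, "B"), (6020, "A")]

delays_us = [
    t2 - t1
    for (c1, t1), (c2, t2) in zip(
        first_packet_timestamps(r1), first_packet_timestamps(r2)
    )
    if c1 == c2
]
```

   If the first packet of a block were lost or overtaken, the two
   routers would timestamp different packets and the computed delay
   would be wrong, which is the limitation noted in the text.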

497	   The following table shows how timestamps can be used to calculate the
498	   delay between R1 and R2.  The first column lists the sequence of
499	   blocks while other columns contain the timestamp referring to the
500	   first packet of each block on R1 and R2.  The delay is computed as a
501	   difference between timestamps.  For the sake of simplicity, all the
502	   values are expressed in milliseconds.

504	      +-------+---------+---------+---------+---------+-------------+
505	      | Block | TS(A)R1 | TS(B)R1 | TS(A)R2 | TS(B)R2 | Delay R1-R2 |
506	      +-------+---------+---------+---------+---------+-------------+
507	      | 1     | 12.483  | -       | 15.591  | -       | 3.108       |
508	      |       |         |         |         |         |             |
509	      | 2     | -       | 6.263   | -       | 9.288   | 3.025       |
510	      |       |         |         |         |         |             |
511	      | 3     | 27.556  | -       | 30.512  | -       | 2.956       |
512	      |       |         |         |         |         |             |
513	      | 4     | -       | 18.113  | -       | 21.269  | 3.156       |
514	      |       |         |         |         |         |             |
515	      | ...   | ...     | ...     | ...     | ...     | ...         |
516	      |       |         |         |         |         |             |
517	      | 2n    | 77.463  | -       | 80.501  | -       | 3.038       |
518	      |       |         |         |         |         |             |
519	      | 2n+1  | -       | 24.333  | -       | 27.433  | 3.100       |
520	      +-------+---------+---------+---------+---------+-------------+

522	         Table 2: Evaluation of timestamps for delay measurements

524	   The first row shows timestamps taken on R1 and R2 respectively and
525	   referring to the first packet of block 1 (which is A-colored).  Delay
526	   can be computed as a difference between the timestamp on R2 and the
527	   timestamp on R1.  Similarly, the second row shows timestamps (in
528	   milliseconds) taken on R1 and R2 and referring to the first packet of
529	   block 2 (which is B-colored).  Comparing timestamps taken on
530	   different nodes in the network and referring to the same packets
531	   (identified using the alternation of colors) it is possible to
532	   measure delay on different network segments.
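
	   The computation shown in Table 2 can be sketched as follows.  This is
	   only an illustration: the values are the first three rows of the
	   table, while the names SAMPLES and one_way_delay are hypothetical
	   and not part of the method.

```python
from decimal import Decimal

# Timestamps (ms) of the first packet of each block, copied from Table 2.
# Each entry: (block label, color, timestamp on R1, timestamp on R2).
SAMPLES = [
    ("1", "A", Decimal("12.483"), Decimal("15.591")),
    ("2", "B", Decimal("6.263"), Decimal("9.288")),
    ("3", "A", Decimal("27.556"), Decimal("30.512")),
]


def one_way_delay(ts_r1, ts_r2):
    """Delay R1-R2 is the timestamp on R2 minus the timestamp on R1."""
    return ts_r2 - ts_r1


delays = {block: one_way_delay(t1, t2) for block, _color, t1, t2 in SAMPLES}
for block, delay in delays.items():
    print(f"block {block}: {delay} ms")
```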

534	   For the sake of simplicity, in the above example a single measurement
535	   is provided within a block, taking into account only the first packet
536	   of each block.  The number of measurements can be easily increased by
537	   considering multiple packets in the block: for instance, a timestamp
538	   could be taken every N packets, thus generating multiple delay
539	   measurements.  Taking this to the limit, in principle the delay could
540	   be measured for each packet, by taking and comparing the
541	   corresponding timestamps (possible but impractical from an
542	   implementation point of view).

544	3.3.1.1.  Mean delay

546	   As mentioned before, the method described above for measuring the
547	   delay is sensitive to out of order reception of packets.  In order to
548	   overcome this problem, a different approach has been considered: it
549	   is based on the concept of mean delay.  The mean delay is calculated
550	   by considering the average arrival time of the packets within a
551	   single block.  The network device locally stores a timestamp for each
552	   packet received within a single block: summing all the timestamps and
553	   dividing by the total number of packets received, the average arrival
554	   time for that block of packets can be calculated.  By subtracting the
555	   average arrival times of two adjacent devices it is possible to
556	   calculate the mean delay between those nodes.  When computing the
557	   mean delay, the measurement error could be increased by accumulating
558	   the measurement errors of many packets.  This method is robust to
559	   out-of-order packets and also to packet loss (only a small error is
560	   introduced).  Moreover, it greatly reduces the number of timestamps
561	   (only one per block for each network device) that have to be
562	   collected by the management system.  On the other hand, it only gives
563	   one measure for the duration of the block (e.g., 5 minutes), and it
564	   doesn't give the minimum, maximum and median delay values (RFC 6703
565	   [RFC6703]).  This limitation could be overcome by reducing the
566	   duration of the block (e.g., from 5 minutes to a few seconds), which
567	   implies a highly optimized implementation of the method.
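
	   A minimal sketch of the mean delay computation is given below.  The
	   timestamps and the function names are hypothetical.  The example
	   only illustrates that the result does not depend on which packet
	   produced which timestamp, so out-of-order arrival does not change
	   it.

```python
def average_arrival_time(timestamps):
    """Average of the per-packet timestamps stored for one block."""
    return sum(timestamps) / len(timestamps)


def mean_delay(ts_upstream, ts_downstream):
    """Difference of the average arrival times on two adjacent nodes."""
    return average_arrival_time(ts_downstream) - average_arrival_time(ts_upstream)


# Hypothetical timestamps (ms) for one block of four packets on R1.
r1 = [10.0, 11.0, 12.0, 13.0]
# Timestamps on R2, listed per packet, with in-order arrival.
r2_in_order = [13.0, 14.0, 15.0, 16.0]
# Same arrival instants on R2, but packets arrived in a different order.
r2_reordered = [14.0, 13.0, 16.0, 15.0]

print(mean_delay(r1, r2_in_order))
print(mean_delay(r1, r2_reordered))
```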

569	   By summing the mean delays of the two directions of a path, it is
570	   also possible to measure the two-way mean delay (round-trip delay).

572	3.3.2.  Double marking methodology

574	   The Single marking methodology for one-way delay measurement is
575	   sensitive to out of order reception of packets.  The first approach
576	   to overcome this problem is described above and is based on the
577	   concept of mean delay.  But the limitation of mean delay is that it
578	   doesn't give information about the distribution of delay values for
579	   the duration of the block.  Additionally, it may be useful to have
580	   not only the mean delay but also the minimum, maximum and median
581	   delay values and, in wider terms, to know more about the statistical
582	   distribution of delay values.  So in order to have more information
583	   about the delay and to overcome out of order issues, a different
584	   approach can be introduced: it is based on double marking
585	   methodology.

587	   Basically, the idea is to use the first marking to create the
588	   alternate flow and, within this colored flow, a second marking to
589	   select the packets for measuring delay/jitter.  The first marking is
590	   needed for packet loss and mean delay measurement.  The second
591	   marking creates a new set of marked packets that are fully identified
592	   over the network, so that a network device can store the timestamps
593	   of these packets; these timestamps can be compared with the
594	   timestamps of the same packets on a second router to compute packet
595	   delay values for each packet.  The number of measurements can be
596	   easily increased by changing the frequency of the second marking.
597	   But the frequency of the second marking must not be too high in order
598	   to avoid out-of-order issues.  Between packets with the second
599	   marking there should be a safety time gap (e.g., this gap could be,
600	   at the minimum, the mean network delay calculated with the previous
601	   methodology) to avoid out of order issues and also to have a number
602	   of measurement packets that is rate independent.  If a second marking
603	   packet is lost, the delay measurement for the considered block is
604	   corrupted and should be discarded.

606	   Mean delay is calculated on all the packets of a sample and is a
607	   simple computation to be performed for single marking method.  In
608	   some cases the mean delay measure is not sufficient to characterize
609	   the sample, and more statistics of delay extent data are needed, e.g.
610	   percentiles, variance and median delay values.  The conventional
611	   range (maximum-minimum) should be avoided for several reasons,
612	   including instability of the maximum delay due to the influence of
613	   outliers.  RFC 5481 [RFC5481] section 6.5 highlights how the 99.9th
614	   percentile of delay and delay variation is more helpful to
615	   performance planners.  To overcome this drawback the idea is to
616	   couple the mean delay measure for the entire batch with double
617	   marking method, where a subset of batch packets is selected for
618	   extensive delay calculation by using a second marking.  In this way
619	   it is possible to perform a detailed analysis on these double marked
620	   packets.  Please note that there are classic algorithms for median
621	   and variance calculation, but they are out of the scope of this
622	   document.  The comparison between the mean delay for the entire batch
623	   and the mean delay on these double marked packets gives useful
624	   information since it is possible to understand if the double marking
625	   measurements are actually representative of the delay trends.
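
	   The kind of analysis that becomes possible on double marked packets
	   can be sketched as follows.  All values are hypothetical, and the
	   Python standard library is used here only as one possible way to
	   obtain the statistics; the document does not mandate any algorithm.

```python
import statistics

# Hypothetical per-packet delays (ms) measured on double marked packets.
double_marked_delays = [3.0, 3.1, 2.9, 3.2, 3.0, 3.4, 3.1, 2.8]
# Hypothetical mean delay (ms) of the entire batch (single marking).
batch_mean_delay = 3.05


def summarize(delays):
    """Basic statistics that the mean delay alone cannot provide."""
    return {
        "min": min(delays),
        "max": max(delays),
        "mean": statistics.mean(delays),
        "median": statistics.median(delays),
        "variance": statistics.pvariance(delays),
    }


summary = summarize(double_marked_delays)
# A small gap suggests that the double marked subset is representative
# of the delay of the whole batch.
mean_gap = abs(summary["mean"] - batch_mean_delay)
print(summary, mean_gap)
```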

627	3.4.  Delay variation measurement

629	   Similarly to one-way delay measurement (both for single marking and
630	   double marking), the method can also be used to measure the inter-
631	   arrival jitter.  We refer to the definition in RFC 3393 [RFC3393].
632	   The alternation of colors, for single marking method, can be used as
633	   a time reference to measure delay variations.  In case of double
634	   marking, the time reference is given by the second marked packets.
635	   Considering the example depicted in Figure 2, R1 stores a timestamp
636	   TS(A)R1 whenever it sends the first packet of a block and R2 stores a
637	   timestamp TS(B)R2 whenever it receives the first packet of a block.
638	   The inter-arrival jitter can be easily derived from one-way delay
639	   measurement, by evaluating the delay variation of consecutive
640	   samples.

642	   The concept of mean delay can also be applied to delay variation, by
643	   evaluating the average variation of the interval between consecutive
644	   packets of the flow from R1 to R2.
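
	   As an illustration, the delay variation of consecutive samples can
	   be derived from the one-way delays of the first four rows of
	   Table 2.  The function name delay_variation is hypothetical; the
	   sketch simply takes the difference between each delay and the
	   previous one.

```python
from decimal import Decimal


def delay_variation(delays):
    """Differences between the one-way delays of consecutive samples."""
    return [later - earlier for earlier, later in zip(delays, delays[1:])]


# One-way delays (ms) from the first four rows of Table 2.
delays = [Decimal("3.108"), Decimal("3.025"), Decimal("2.956"), Decimal("3.156")]
variations = delay_variation(delays)
print(variations)
```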

646	4.  Considerations

648	   This section highlights some considerations about the methodology.

650	4.1.  Synchronization

652	   The Alternate Marking technique does not require a strong
653	   synchronization, especially for packet loss and two-way delay
654	   measurement.  Only one-way delay measurement requires network devices
655	   to have synchronized clocks.

657	   The color switching is the reference for all the network devices, and
658	   the only requirement to be achieved is that all network devices have
659	   to recognize the right batch along the path.

661	   If the length of the measurement period is L time units, then all
662	   network devices must be synchronized to the same clock reference with
663	   an accuracy of +/- L/2 time units (without considering network
664	   delay).  This level of accuracy guarantees that all network devices
665	   consistently match the color bit to the correct block.  For example,
666	   if the color is toggled every second (L = 1 second), then clocks
667	   must be synchronized with an accuracy of +/- 0.5 second to a common
668	   time reference.

670	   This synchronization requirement can be satisfied even with a
671	   relatively inaccurate synchronization method.  This is true for
672	   packet loss and two-way delay measurement; for one-way delay
673	   measurement, instead, clock synchronization must be accurate.

675	   Therefore, a system that uses only packet loss and two-way delay
676	   measurement does not require precise synchronization.  This is
677	   because the value of the clocks of network devices does not affect
678	   the computation of the two-way delay measurement.

680	4.2.  Data Correlation

682	   Data Correlation is the mechanism to compare counters and timestamps
683	   for packet loss, delay and delay variation calculation.  It could be
684	   performed in several ways depending on the alternate marking
685	   application and use case.

687	   o  A possibility is to use a centralized solution using Network
688	      Management System (NMS) to correlate data;

690	   o  Another possibility is to define a protocol based distributed
691	      solution, by defining a new protocol or by extending the existing
692	      protocols (e.g., RFC6374, TWAMP, OWAMP) in order to communicate
693	      the counters and timestamps between nodes.

695	   In the following paragraphs an example data correlation mechanism is
696	   explained and could be used independently of the adopted solutions.

698	   When data is collected on the upstream and downstream node, e.g.,
699	   packet counts for packet loss measurement or timestamps for packet
700	   delay measurement, and periodically reported to or pulled by other
701	   nodes or NMS, a certain data correlation mechanism SHOULD be in use
702	   to help the nodes or NMS to tell whether any two or more packet
703	   counts are related to the same block of markers, or any two
704	   timestamps are related to the same marked packet.

706	   The alternate marking method described in this document literally
707	   splits the packets of the measured flow into different measurement
708	   blocks; in addition, a Block Number (BN) could be assigned to each
709	   such measurement block.  The BN is generated each time a node reads
710	   the data (packet counts or timestamps), and is associated with each
711	   packet count and timestamp reported to or pulled by other nodes or
712	   NMS.  The value of BN could be calculated as the modulo of the local
713	   time (when the data are read) and the interval of the marking time
714	   period.

716	   When the nodes or NMS see, for example, the same BN associated with
717	   two packet counts from an upstream and a downstream node, they
718	   consider that these two packet counts correspond to the same block,
719	   i.e., that they belong to the same block of markers from the
720	   upstream and downstream node.  The assumption of
721	   this BN mechanism is that the measurement nodes are time
722	   synchronized.  This requires the measurement nodes to have a certain
723	   time synchronization capability (e.g., the Network Time Protocol
724	   (NTP) [RFC5905], or the IEEE 1588 Precision Time Protocol (PTP)
725	   [IEEE1588]).  Synchronization aspects are further discussed in
726	   Section 4.1.
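
	   The matching of packet counts by BN can be sketched as follows.
	   This is an illustration only: the BN values are taken as given by
	   the reports, and the function name and the sample data are
	   hypothetical.

```python
def correlate_packet_counts(upstream, downstream):
    """Pair packet counts that carry the same BN and return loss per BN.

    Both arguments map a BN to the packet count reported with it.
    A BN reported by only one node is skipped (no peer data yet).
    """
    common = sorted(upstream.keys() & downstream.keys())
    return {bn: upstream[bn] - downstream[bn] for bn in common}


# Hypothetical reports collected from an upstream and a downstream node.
upstream_report = {100: 5000, 101: 4800, 102: 5100}
downstream_report = {100: 5000, 101: 4797}

loss = correlate_packet_counts(upstream_report, downstream_report)
print(loss)
```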

728	4.3.  Packet Re-ordering

730	   Due to ECMP, packet re-ordering is very common in IP networks.  The
731	   accuracy of marking based PM, especially packet loss measurement, may
732	   be affected by packet re-ordering.  Take a look at the following
733	   example:

735	   Block   :    1    |    2    |    3    |    4    |    5    |...
736	   --------|---------|---------|---------|---------|---------|---
737	   Node R1 : AAAAAAA | BBBBBBB | AAAAAAA | BBBBBBB | AAAAAAA |...
738	   Node R2 : AAAAABB | AABBBBA | AAABAAA | BBBBBBA | ABAAABA |...

740	                        Figure 5: Packet Reordering

745	   Most packet re-ordering occurs at the edge of adjacent blocks, and
746	   it is easy to handle if the interval of each block is sufficiently
747	   large.  Then, it can be assumed that packets with a different marker
748	   belong to the block that they are closest to.  If the interval is
749	   small, it is difficult and sometimes impossible to determine to
750	   which block a packet belongs.  In the above example, for the packet
751	   with the marker "B" in block 3, there is no safe way to tell whether
752	   the packet belongs to block 2 or block 4.

754	   Choosing a proper interval is important; how to choose a proper
755	   interval is out of the scope of this document.  But an implementation
756	   SHOULD provide a way to configure the interval and allow a certain
757	   degree of packet re-ordering.

759	5.  Implementation and deployment

761	   The methodology described in the previous sections can be applied in
762	   various situations.  Basically, the Alternate Marking technique could
763	   be used in many cases for performance measurement.  The only
764	   requirement is to select and mark the flow to be monitored; in this
765	   way packets are batched by the sender and each batch is alternately
766	   marked such that it can be easily recognized by the receiver.

768	   An example of implementation and deployment is explained in the next
769	   section, just to clarify how the method can work.

771	5.1.  Report on the operational experiment at Telecom Italia

773	   The method described in this document, also called PNPM (Packet
774	   Network Performance Monitoring), has been invented and engineered in
775	   Telecom Italia and it's currently being used in Telecom Italia's
776	   network.  The methodology has been applied by leveraging functions
777	   and tools available on IP routers and it's currently being used to
778	   monitor packet loss in some portions of Telecom Italia's network.
779	   The application of the method to delay measurement is currently being
780	   evaluated in Telecom Italia's labs.  This section describes how the
781	   features currently available on existing routing platforms can be
782	   used to apply the method, in order to give an example of
783	   implementation and deployment.

785	   The fundamental steps for this implementation of the method can be
786	   summarized in the following items:

788	   o  coloring the packets;

790	   o  counting the packets;

792	   o  collecting data and calculating the packet loss;

794	   o  metric transparency.

796	   Before going deeper into the implementation details, it's worth
797	   mentioning two different strategies that can be used when
798	   implementing the method:

800	   o  flow-based: the flow-based strategy is used when only a limited
801	      number of traffic flows need to be monitored.  This could be the
802	      case, for example, of IPTV channels or other specific applications
803	      traffic with high QoS requirements (e.g., Mobile Backhauling
804	      traffic).  According to this strategy, only a subset of the flows
805	      is colored.  Counters for packet loss measurements can be
806	      instantiated for each single flow, or for the set as a whole,
807	      depending on the desired granularity.  A relevant problem with
808	      this approach is the necessity to know in advance the path
809	      followed by flows that are subject to measurement.  Path rerouting
810	      and traffic load-balancing increase the issue complexity,
811	      especially for unicast traffic.  The problem is easier to solve
812	      for multicast traffic where load balancing is seldom used,
813	      especially for IPTV traffic where static joins are frequently used
814	      to force traffic forwarding and replication.  Another application
815	      is on Mobile Backhauling, implemented with a VPN MPLS in Telecom
816	      Italia's network, where the monitoring is between the Provider
817	      Edge nodes of the VPN MPLS.

819	   o  link-based: measurements are performed on all the traffic on a
820	      link by link basis.  The link could be a physical link or a
821	      logical link (for instance an Ethernet VLAN or an MPLS PW).
822	      Counters could be instantiated for the traffic as a whole or for
823	      each traffic class (in case it is desired to monitor each class
824	      separately), but in the second case a pair of counters is needed
825	      for each class.

827	   The current implementation in Telecom Italia uses the first strategy.
828	   As mentioned, the flow-based measurement requires the identification
829	   of the flow to be monitored and the discovery of the path followed by
830	   the selected flow.  It is possible to monitor a single flow or
831	   multiple flows grouped together, but in this case measurement is
832	   consistent only if all the flows in the group follow the same path.
833	   Moreover, a Service Provider should be aware that, if a measurement
834	   is performed by grouping many flows, it is not possible to determine
835	   exactly which flow was affected by packet loss.  In order to have
836	   measures per single flow it is necessary to configure counters for
837	   each specific flow.  Once the flow(s) to be monitored have been
838	   identified, it is necessary to configure the monitoring on the proper
839	   nodes.  Configuring the monitoring means configuring the policy to
840	   intercept the traffic and configuring the counters to count the
841	   packets.  To have just an end-to-end monitoring, it is sufficient to
842	   enable the monitoring on the first and the last hop routers of the
843	   path: the mechanism is completely transparent to intermediate nodes
844	   and independent from the path followed by traffic flows.  On the
845	   contrary, to monitor the flow on a hop-by-hop basis along its whole
846	   path it is necessary to enable the monitoring on every node from the
847	   source to the destination.  In case the exact path followed by the
848	   flow is not known a priori (i.e. the flow has multiple paths to reach
849	   the destination) it is necessary to enable the monitoring system on
850	   every path: counters on interfaces traversed by the flow will report
851	   packet count, counters on other interfaces will be null.

853	5.1.1.  Coloring the packets

855	   The coloring operation is fundamental in order to create packet
856	   blocks.  This implies choosing where to activate the coloring and how
857	   to color the packets.

859	   In case of flow-based measurements, it is desirable, in general, to
860	   have a single coloring node because it is easier to manage and
861	   doesn't raise any risk of conflict (consider the case where two nodes
862	   color the same flow).  Thus it is advantageous to color the flow as
863	   close as possible to the source.  In addition, coloring a flow close
864	   to the source allows an end-to-end measure if a measurement point is
865	   enabled on the last-hop router as well.  The only requirement is that
866	   the coloring must change periodically and every node along the path
867	   must be able to identify unambiguously the colored packets.  For
868	   link-based measurements, all traffic needs to be colored when
869	   transmitted on the link.  If the traffic had already been colored,
870	   then it has to be re-colored because the color must be consistent on
871	   the link.  This means that each hop along the path must (re-)color
872	   the traffic; the color is not required to be consistent along
873	   different links.

875	   Traffic coloring can be implemented by setting a specific bit in the
876	   packet header and changing the value of that bit periodically.  With
877	   current router implementations, only QoS related fields and features
878	   offer the required flexibility to set bits in the packet header.  In
879	   case a Service Provider only uses the three most significant bits of
880	   the DSCP field (corresponding to IP Precedence) for QoS
881	   classification and queuing, it is possible to use the two least
882	   significant bits of the DSCP field (bit 0 and bit 1) to implement the
883	   method without affecting QoS policies.  One of the two bits (bit 0)
884	   could be used to identify flows subject to traffic monitoring (set to
885	   1 if the flow is under monitoring, otherwise it is set to 0), while
886	   the second (bit 1) can be used for coloring the traffic (switching
887	   between values 0 and 1, corresponding to color A and B) and creating
888	   the blocks.
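
	   The use of the two bits can be sketched as follows.  The sketch
	   assumes that the DSCP is handled as a 6-bit integer and that "bit 0"
	   and "bit 1" are counted from the least significant bit; the
	   constant and function names are hypothetical.  The restore function
	   assumes that both bits were 0 before coloring.

```python
MONITOR_BIT = 0x01  # "bit 0": flow is under monitoring (assumed LSB)
COLOR_BIT = 0x02    # "bit 1": 0 = color A, 1 = color B
UPPER_BITS = 0x3C   # the four most significant bits of the 6-bit DSCP


def color_dscp(dscp, color):
    """Set the monitoring flag and the color, keeping the upper bits."""
    if color not in ("A", "B"):
        raise ValueError("color must be 'A' or 'B'")
    value = (dscp & UPPER_BITS) | MONITOR_BIT
    if color == "B":
        value |= COLOR_BIT
    return value


def restore_dscp(dscp):
    """Clear both bits (assumes they were 0 before coloring)."""
    return dscp & UPPER_BITS


original = 0b101000  # example DSCP value using only the upper bits
print(bin(color_dscp(original, "A")), bin(color_dscp(original, "B")))
```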

890	   In practice, coloring the traffic using the DSCP field can be
891	   implemented by configuring on the router output interface an access
892	   list that intercepts the flow(s) to be monitored and applies to them
893	   a policy that sets the DSCP field accordingly.  Since traffic
894	   coloring has to be switched between the two values over time, the
895	   policy needs to be modified periodically: an automatic script can be
896	   used to perform this task on the basis of a fixed timer.  In Telecom
897	   Italia's implementation this timer is set to 5 minutes: this value
898	   proved to be a good compromise between measurement frequency and
899	   stability of the measurement (i.e. possibility to collect all the
900	   measures referring to the same block).

902	5.1.2.  Counting the packets

904	   Assuming that the coloring of the packets is performed only by the
905	   source node, the nodes between source and destination (included) have
906	   to count the colored packets that they receive and forward: this
907	   operation can be enabled on every router along the path or only on a
908	   subset, depending on which network segment is being monitored (a
909	   single link, a particular metro area, the backbone, the whole path).

911	   Since the color switches periodically between two values, two
912	   counters (one for each value) are needed: one counter for packets
913	   with color A and one counter for packets with color B.  For each flow
914	   (or group of flows) being monitored and for every interface where the
915	   monitoring is active, a pair of counters is needed.  For example,
916	   in order to monitor separately 3 flows on a router with 4 interfaces
917	   involved, 24 counters are needed (2 counters for each of the 3 flows
918	   on each of the 4 interfaces).  If traffic is colored using the DSCP
919	   field, as in Telecom Italia's implementation, an access-list that
920	   matches specific DSCP values can be used to count the packets of the
921	   flow(s) being monitored.
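
	   The counter arithmetic of the example above can be sketched as
	   follows.  The flow and interface names are hypothetical; the sketch
	   only shows that one counter per (flow, interface, color) is needed.

```python
from itertools import product

flows = ["flow1", "flow2", "flow3"]       # hypothetical flow names
interfaces = ["if0", "if1", "if2", "if3"]  # hypothetical interface names
colors = ["A", "B"]

# One counter for each (flow, interface, color) combination.
counters = {key: 0 for key in product(flows, interfaces, colors)}


def count_packet(flow, interface, color):
    """Increment the counter matching a received colored packet."""
    counters[(flow, interface, color)] += 1


count_packet("flow1", "if0", "A")
print(len(counters))
```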

923	   In case of link-based measurements the behaviour is similar except
924	   that coloring and counting operations are performed on a link by link
925	   basis at each endpoint of the link.

927	   Another important aspect to take into consideration is when to read
928	   the counters: in order to count the exact number of packets of a
929	   block the routers must perform this operation when that block has
930	   ended: in other words, the counter for color A must be read when the
931	   current block has color B, in order to be sure that the value of the
932	   counter is stable.  This task can be accomplished in two ways.  The
933	   general approach suggests reading the counters periodically, many
934	   times during a block duration, and comparing these successive
935	   readings: when the counter stops incrementing, it means that the
936	   current block has ended and its value can be processed safely.
937	   Alternatively, if the coloring operation is performed on the basis of
938	   a fixed timer, it is possible to configure the reading of the
939	   counters according to that timer: for example, if each block is 5
940	   minutes long, reading the counter for color A every 5 minutes in the
941	   middle of the subsequent block (with color B) is a safe choice.  A
942	   sufficient margin should be considered between the end of a block and
943	   the reading of the counter, in order to take into account any out-of-
944	   order packets.  The choice of a 5-minute timer for color switching
945	   was also inspired by these considerations.
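
	   Both ways of deciding when a counter can be read are sketched
	   below.  The sketch assumes that coloring starts with color A at
	   time 0 and uses 5-minute blocks; the function names are
	   hypothetical, and a real implementation would also need the safety
	   margin mentioned above.

```python
BLOCK_SECONDS = 300  # 5-minute blocks, as in the example above


def current_color(elapsed_seconds):
    """Color applied 'elapsed_seconds' after coloring started with A."""
    return "A" if (elapsed_seconds // BLOCK_SECONDS) % 2 == 0 else "B"


def readable_color(elapsed_seconds):
    """Timer approach: the counter of the other color is stable."""
    return "B" if current_color(elapsed_seconds) == "A" else "A"


def has_stopped(readings):
    """Polling approach: stable once two successive readings match."""
    return len(readings) >= 2 and readings[-1] == readings[-2]


# In the middle of the second block (color B), counter A can be read.
print(readable_color(450))
```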

947	5.1.3.  Collecting data and calculating packet loss

949	   The nodes enabled to perform performance monitoring collect the value
950	   of the counters, but they are not able to directly use this
951	   information to measure packet loss, because they only have their own
952	   samples.  For this reason, an external Network Management System
953	   (NMS) is required to collect and process data and to perform packet
954	   loss calculation.  The NMS compares the values of counters from
955	   different nodes and can calculate if some packets were lost (even a
956	   single packet) and also where packets were lost.
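
	   The comparison performed by the NMS can be sketched as follows.
	   The node names and counter values are hypothetical; all counts are
	   assumed to refer to the same block and color and to be listed in
	   path order.

```python
def loss_per_segment(path_counts):
    """Loss on each segment between consecutive measurement points.

    path_counts is an ordered list of (node, packet count) pairs.
    """
    return [
        (f"{up}-{down}", up_count - down_count)
        for (up, up_count), (down, down_count) in zip(path_counts, path_counts[1:])
    ]


# Hypothetical counters read for one block on four nodes along the path.
counts = [("R1", 1000), ("R2", 1000), ("R3", 998), ("R4", 998)]

segments = loss_per_segment(counts)
end_to_end = counts[0][1] - counts[-1][1]
print(segments, end_to_end)
```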

958	   The value of the counters needs to be transmitted to the NMS as soon
959	   as it has been read.  This can be accomplished by using SNMP or FTP
960	   and can be done in Push Mode or Polling Mode.  In the first case,
961	   each router periodically sends the information to the NMS, in the
962	   latter case it is the NMS that periodically polls routers to collect
963	   information.  In any case, the NMS has to collect all the relevant
964	   values from all the routers within one cycle of the timer (5
965	   minutes).

967	   If link-based measurement is used, it would be possible to use a
968	   protocol to exchange values of counters between the two endpoints in
969	   order to let them perform the packet loss calculation for each
970	   traffic direction.  A similar approach could be complicated if
971	   applied to a flow-based measurement.

973	5.1.4.  Metric transparency

975	   In Telecom Italia's implementation the source node colors the packets
976	   with a policy that is modified periodically via an automatic script
977	   in order to alternate the DSCP field of the packets.  The nodes
978	   between source and destination (included) have to count with an
979	   access-list the colored packets that they receive and forward.

981	   Moreover the destination node has an important role: the colored
982	   packets are intercepted and a policy restores and sets the DSCP field
983	   of all the packets to the initial value.  In this way the metric is
984	   transparent because outside the section of the network under
985	   monitoring the traffic flow is unchanged.

987	   In such a case, thanks to this restoring technique, network elements
988	   outside the Alternate Marking monitoring domain (e.g. the two
989	   Provider Edge nodes of the Mobile Backhauling VPN MPLS) are totally
990	   unaware that packets were marked.  So this restoring technique makes
991	   Alternate Marking completely transparent outside its monitoring
992	   domain.

994	5.2.  IP flow performance measurement (IPFPM)

996	   This application of the marking method is described in
997	   [I-D.chen-ippm-coloring-based-ipfpm-framework].

999	5.3.  OAM Passive Performance Measurement

1001	   In [I-D.ietf-bier-mpls-encapsulation] two OAM bits from Bit Index
1002	   Explicit Replication (BIER) Header are reserved for the passive
1003	   performance measurement marking method.  [I-D.ietf-bier-pmmm-oam]
1004	   details the measurement for multicast service over BIER domain.

1006	   [I-D.mirsky-sfc-pmamm] describes how the alternate marking method can
1007	   be used as the passive performance measurement method in a Service
1008	   Function Chaining (SFC) domain.

1010	   The application of the marking method to Network Virtualization
1011	   Overlays (NVO3) protocols is a work in progress.

1013	5.4.  RFC6374 Use Case

1015	   RFC6374 [RFC6374] uses the LM packet as the packet accounting
1016	   demarcation point.  Unfortunately this gives rise to a number of
1017	   problems that may lead to significant packet accounting errors in
1018	   certain situations.  [I-D.ietf-mpls-flow-ident] discusses the desired
1019	   capabilities for MPLS flow identification in order to perform a
1020	   better in-band performance monitoring of user data packets.  A method
1021	   of accomplishing identification is Synonymous Flow Labels (SFL)
1022	   introduced in [I-D.bryant-mpls-sfl-framework], while
1023	   [I-D.ietf-mpls-rfc6374-sfl] describes RFC6374 performance
1024	   measurements with SFL.

1026	5.5.  Application to active performance measurement

1028	   [I-D.fioccola-ippm-alt-mark-active] describes how to extend the
1029	   existing Active Measurement Protocol, in order to implement alternate
1030	   marking methodology.  [I-D.fioccola-ippm-rfc6812-alt-mark-ext]
1031	   describes an extension to the Cisco SLA Protocol Measurement-Type
1032	   UDP-Measurement.

1034	6.  Hybrid measurement

1036	   The method has been explicitly designed for passive measurements but
1037	   it can also be used with active measurements.  In order to have both
1038	   end to end measurements and intermediate measurements (hybrid
1039	   measurements) two end points can exchange artificial traffic flows
1040	   and apply alternate marking over these flows.  In the intermediate
1041	   points artificial traffic is managed in the same way as real traffic
1042	   and measured as specified before.  So the application of the marking
1043	   method can also simplify the active measurement, as explained in
1044	   [I-D.fioccola-ippm-alt-mark-active].

1046	7.  Compliance with RFC6390 guidelines

1048	   RFC6390 [RFC6390] defines a framework and a process for developing
1049	   Performance Metrics for protocols above and below the IP layer (such
1050	   as IP-based applications that operate over reliable or datagram
1051	   transport protocols).

1053	   This document doesn't aim to propose a new Performance Metric but a
1054	   new method of measurement for a few Performance Metrics that have
1055	   already been standardized.  Nevertheless, it's worth applying
1056	   [RFC6390] guidelines to the present document, in order to provide a
1057	   more complete and coherent description of the proposed method.  We
1058	   used a subset of the Performance Metric Definition template defined
1059	   by [RFC6390].

1061	   o  Metric name and description: as already stated, this document
1062	      doesn't propose any new Performance Metric.  On the contrary, it
1063	      describes a novel method for measuring packet loss [RFC2680].  The
1064	      same concept, with small differences, can also be used to measure
1065	      delay [RFC2679], and jitter [RFC3393].  The document mainly
1066	      describes the applicability to packet loss measurement.

1068	   o  Method of Measurement or Calculation: according to the method
1069	      described in the previous sections, the number of packets lost is
1070	      calculated by subtracting the value of the counter on the
1071	      destination node from the value of the counter on the source node.
1072	      Both counters must refer to the same color.  The calculation is
1073	      performed when the value of the counters is in a steady state.

1075	   o  Units of Measurement: the method calculates and reports the exact
1076	      number of packets sent by the source node and not received by the
1077	      destination node.

1079	   o  Measurement Points: the measurement can be performed between
1080	      adjacent nodes, on a per-link basis, or along a multi-hop path,
1081	      provided that the traffic under measurement follows that path.  In
1082	      case of a multi-hop path, the measurements can be performed both
1083	      end-to-end and hop-by-hop.

1085	   o  Measurement Timing: the method has a constraint on the frequency
1086	      of measurements.  In order to perform a measurement, the counter
1087	      must be in a steady state: this happens when the traffic is being
1088	      colored with the alternate color; for example, in the Telecom
1089	      Italia application of the method the time interval is set to 5
1090	      minutes.

1092	   o  Implementation: the Telecom Italia application of the method uses
1093	      two encodings of the DSCP field to color the packets; this enables
1094	      the use of policy configurations on the router to color the
1095	      packets and accordingly configure the counter for each color.  The
1096	      path followed by traffic being measured should be known in advance
1097	      in order to configure the counters along the path and be able to
1098	      compare the correct values.

1100	   o  Use and Applications: the method can be used to measure packet
1101	      loss with high precision on live traffic; moreover, by combining
1102	      end-to-end and per-link measurements, the method is useful to
1103	      pinpoint the single link that is experiencing loss events.

1105	   o  Reporting Model: the value of the counters has to be sent to a
1106	      centralized management system that performs the calculations;
1107	      such samples must contain a reference to the time interval they
1108	      refer to, so that the management system can perform the correct
1109	      correlation; the samples have to be sent while the corresponding
1110	      counter is in a steady state (within a time interval), otherwise
1111	      the value of the sample should be stored locally.

1113	   o  Dependencies: the values of the counters have to be correlated to
1114	      the time interval they refer to; moreover, since the Telecom
1115	      Italia application of the method is based on DSCP values, there
1116	      are significant dependencies on the usage of the DSCP field: it
1117	      must be possible to rely on unused DSCP values without affecting
1118	      QoS-related configuration and behavior; moreover, the intermediate
1119	      nodes must not change the value of the DSCP field, so as not to
1120	      alter the measurement.

1122	   o  Organization of Results: the method of measurement produces
1123	      singletons.

1125	   o  Parameters: currently, the main parameter of the method is the
1126	      time interval used to alternate the colors and read the counters.
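
   The calculation, correlation, and localization steps listed above can
   be sketched as follows.  This is a minimal illustration in Python and
   not part of the method's specification: the Sample record, its field
   names, and the functions packets_lost and per_link_loss are
   hypothetical, and the node names and counter values are invented for
   the example.  The sketch assumes that every sample was read while the
   counter was in a steady state and that the path followed by the
   traffic is known in advance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One counter reading reported to the management system (hypothetical)."""
    node: str       # measurement point that produced the reading
    interval: int   # index of the marking interval the counter refers to
    color: str      # color of the counted packets, e.g. "A" or "B"
    packets: int    # counter value, read in steady state


def packets_lost(upstream: Sample, downstream: Sample) -> int:
    """Packets counted upstream but not downstream, for one interval and color."""
    if (upstream.interval, upstream.color) != (downstream.interval, downstream.color):
        raise ValueError("samples must refer to the same interval and color")
    return upstream.packets - downstream.packets


def per_link_loss(path, samples, interval, color):
    """Loss on each link of a known path; raises KeyError if a node did not report."""
    by_node = {
        s.node: s
        for s in samples
        if s.interval == interval and s.color == color
    }
    return {
        (a, b): packets_lost(by_node[a], by_node[b])
        for a, b in zip(path, path[1:])
    }


# Invented example: three nodes on one path, interval 7, color "A".
path = ["R1", "R2", "R3"]
samples = [
    Sample("R1", 7, "A", 1000),
    Sample("R2", 7, "A", 1000),
    Sample("R3", 7, "A", 997),
]
losses = per_link_loss(path, samples, interval=7, color="A")
print(losses)                 # {('R1', 'R2'): 0, ('R2', 'R3'): 3}
print(sum(losses.values()))   # end-to-end loss along the path: 3
```

   In this sketch the end-to-end loss equals the sum of the per-link
   values, so a non-zero entry identifies the link where packets were
   dropped.  Rejecting samples whose interval or color differ is one
   possible way to enforce the correlation requirement stated above.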

1128	8.  Security Considerations

1130	   This document specifies a method to perform measurements in the
1131	   context of a Service Provider's network and has not been developed to
1132	   conduct Internet measurements, so it does not directly affect
1133	   Internet security nor applications which run on the Internet.
1134	   However, implementation of this method must be mindful of security
1135	   and privacy concerns.

1137	   There are two types of security concerns: potential harm caused by
1138	   the measurements and potential harm to the measurements.  Regarding
1139	   the first point, the measurements described in this document
1140	   are passive, so there are no packets injected into the network
1141	   causing potential harm to the network itself and to data traffic.
1142	   Nevertheless, the method implies modifications on the fly to the IP
1143	   header of data packets: this must be performed in a way that doesn't
1144	   alter the quality of service experienced by packets subject to
1145	   measurements and that preserves stability and performance of routers
1146	   doing the measurements.  The measurements themselves could be harmed
1147	   by routers altering the marking of the packets, or by an attacker
1148	   injecting artificial traffic.  Authentication techniques, such as
1149	   digital signatures, may be used where appropriate to guard against
1150	   injected traffic attacks.

1152	   The privacy concerns of network measurement are limited because the
1153	   method only relies on information contained in the IP header without
1154	   any release of user data.

1156	   The measurement itself may be affected by routers (or other network
1157	   devices) along the path of IP packets intentionally altering the
1158	   value of marking bits of packets.  As mentioned above, the mechanism
1159	   specified in this document is just in the context of one Service
1160	   Provider's network, and thus the routers (or other network devices)
1161	   are locally administered and this type of attack can be avoided.

1163	   One of the main security threats in OAM protocols is network
1164	   reconnaissance; an attacker can gather information about the network
1165	   performance by passively eavesdropping on OAM messages.  The
1166	   advantage of the methods described in this document is that the
1167	   marking bits are the only information that is exchanged between the
1168	   network devices.  Therefore, passive eavesdropping on data plane
1169	   traffic does not allow attackers to gain information about the
1170	   network performance.

1172	   Delay attacks are another potential threat in the context of this
1173	   document.  Delay measurement is performed using a specific packet in
1174	   each block, marked by a dedicated color bit.  Therefore, a man-in-
1175	   the-middle attacker can selectively induce synthetic delay only to
1176	   delay-colored packets, causing systematic error in the delay
1177	   measurements.  As discussed in previous sections, the methods
1178	   described in this document rely on an underlying time synchronization
1179	   protocol.  Thus, by attacking the time protocol an attacker can
1180	   potentially compromise the integrity of the measurement.  A detailed
1181	   discussion about the threats against time protocols and how to
1182	   mitigate them is presented in RFC 7384 [RFC7384].

1184	9.  Conclusions

1186	   The advantages of the method described in this document are:

1188	   o  easy implementation: it can be implemented using features already
1189	      available on major routing platforms;

1191	   o  low computational effort: the additional load on processing is
1192	      negligible;

1194	   o  accurate packet loss measurement: single packet loss granularity
1195	      is achieved with a passive measurement;

1197	   o  potential applicability to any kind of packet/frame-based
1198	      traffic: Ethernet, IP, MPLS, etc., both unicast and multicast;

1200	   o  robustness: the method can tolerate out-of-order packets and it's
1201	      not based on "special" packets whose loss could have a negative
1202	      impact;

1204	   o  no interoperability issues: the features required to implement the
1205	      method are available on all current routing platforms.

1207	   The method doesn't raise any specific need for protocol extension,
1208	   but it could be further improved by means of some extension to
1209	   existing protocols.  Specifically, the use of DiffServ bits for
1210	   coloring the packets may not be a viable solution in some cases: a
1211	   standard method to color the packets for this specific application
1212	   could be beneficial.

1214	10.  IANA Considerations

1216	   There are no IANA actions required.

1218	11.  Acknowledgements

1220	   The previous IETF drafts about this technique were
1221	   [I-D.cociglio-mboned-multicast-pm] and [I-D.tempia-opsawg-p3m].
1222	   There are some references to this methodology in other IETF works
1223	   (e.g., [I-D.ietf-mpls-flow-ident], [I-D.bryant-mpls-sfl-framework],
1224	   [I-D.ietf-mpls-rfc6374-sfl], [I-D.ietf-bier-mpls-encapsulation],
1225	   [I-D.ietf-bier-pmmm-oam], and
1226	   [I-D.chen-ippm-coloring-based-ipfpm-framework]).

1228	   In addition, the authors would like to thank Alberto Tempia Bonda,
1229	   Domenico Laforgia, Daniele Accetta and Mario Bianchetti for their
1230	   contribution to the definition and the implementation of the method.

1232	12.  References

1234	12.1.  Normative References

1236	   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
1237	              Delay Metric for IPPM", RFC 2679, DOI 10.17487/RFC2679,
1238	              September 1999, <http://www.rfc-editor.org/info/rfc2679>.

1240	   [RFC2680]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
1241	              Packet Loss Metric for IPPM", RFC 2680,
1242	              DOI 10.17487/RFC2680, September 1999,
1243	              <http://www.rfc-editor.org/info/rfc2680>.

1245	   [RFC3393]  Demichelis, C. and P. Chimento, "IP Packet Delay Variation
1246	              Metric for IP Performance Metrics (IPPM)", RFC 3393,
1247	              DOI 10.17487/RFC3393, November 2002,
1248	              <http://www.rfc-editor.org/info/rfc3393>.

1250	12.2.  Informative References

1252	   [I-D.bryant-mpls-sfl-framework]
1253	              Bryant, S., Chen, M., Li, Z., Swallow, G., Sivabalan, S.,
1254	              and G. Mirsky, "Synonymous Flow Label Framework", draft-
1255	              bryant-mpls-sfl-framework-04 (work in progress), April
1256	              2017.

1258	   [I-D.chen-ippm-coloring-based-ipfpm-framework]
1259	              Chen, M., Zheng, L., Mirsky, G., Fioccola, G., and T.
1260	              Mizrahi, "IP Flow Performance Measurement Framework",
1261	              draft-chen-ippm-coloring-based-ipfpm-framework-06 (work in
1262	              progress), March 2016.

1264	   [I-D.cociglio-mboned-multicast-pm]
1265	              Cociglio, M., Capello, A., Bonda, A., and L. Castaldelli,
1266	              "A method for IP multicast performance monitoring", draft-
1267	              cociglio-mboned-multicast-pm-01 (work in progress),
1268	              October 2010.

1270	   [I-D.fioccola-ippm-alt-mark-active]
1271	              Fioccola, G., Clemm, A., Bryant, S., Cociglio, M.,
1272	              Chandramouli, M., and A. Capello, "Alternate Marking
1273	              Extension to Active Measurement Protocol", draft-fioccola-
1274	              ippm-alt-mark-active-01 (work in progress), March 2017.

1276	   [I-D.fioccola-ippm-rfc6812-alt-mark-ext]
1277	              Fioccola, G., Clemm, A., Cociglio, M., Chandramouli, M.,
1278	              and A. Capello, "Alternate Marking Extension to Cisco SLA
1279	              Protocol RFC6812", draft-fioccola-ippm-rfc6812-alt-mark-
1280	              ext-01 (work in progress), March 2016.

1282	   [I-D.ietf-bier-mpls-encapsulation]
1283	              Wijnands, I., Rosen, E., Dolganow, A., Tantsura, J.,
1284	              Aldrin, S., and I. Meilik, "Encapsulation for Bit Index
1285	              Explicit Replication in MPLS and non-MPLS Networks",
1286	              draft-ietf-bier-mpls-encapsulation-07 (work in progress),
1287	              June 2017.

1289	   [I-D.ietf-bier-pmmm-oam]
1290	              Mirsky, G., Zheng, L., Chen, M., and G. Fioccola,
1291	              "Performance Measurement (PM) with Marking Method in Bit
1292	              Index Explicit Replication (BIER) Layer", draft-ietf-bier-
1293	              pmmm-oam-01 (work in progress), January 2017.

1295	   [I-D.ietf-mpls-flow-ident]
1296	              Bryant, S., Pignataro, C., Chen, M., Li, Z., and G.
1297	              Mirsky, "MPLS Flow Identification Considerations", draft-
1298	              ietf-mpls-flow-ident-04 (work in progress), February 2017.

1300	   [I-D.ietf-mpls-rfc6374-sfl]
1301	              Bryant, S., Chen, M., Li, Z., Swallow, G., Sivabalan, S.,
1302	              Mirsky, G., and G. Fioccola, "RFC6374 Synonymous Flow
1303	              Labels", draft-ietf-mpls-rfc6374-sfl-00 (work in
1304	              progress), June 2017.

1306	   [I-D.mirsky-sfc-pmamm]
1307	              Mirsky, G. and G. Fioccola, "Performance Measurement (PM)
1308	              with Alternate Marking Method in Service Function Chaining
1309	              (SFC) Domain", draft-mirsky-sfc-pmamm-00 (work in
1310	              progress), April 2017.

1312	   [I-D.tempia-opsawg-p3m]
1313	              Capello, A., Cociglio, M., Castaldelli, L., and A. Bonda,
1314	              "A packet based method for passive performance
1315	              monitoring", draft-tempia-opsawg-p3m-04 (work in
1316	              progress), February 2014.

1318	   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
1319	              Applicability Statement", RFC 5481, DOI 10.17487/RFC5481,
1320	              March 2009, <http://www.rfc-editor.org/info/rfc5481>.

1322	   [RFC6374]  Frost, D. and S. Bryant, "Packet Loss and Delay
1323	              Measurement for MPLS Networks", RFC 6374,
1324	              DOI 10.17487/RFC6374, September 2011,
1325	              <http://www.rfc-editor.org/info/rfc6374>.

1327	   [RFC6390]  Clark, A. and B. Claise, "Guidelines for Considering New
1328	              Performance Metric Development", BCP 170, RFC 6390,
1329	              DOI 10.17487/RFC6390, October 2011,
1330	              <http://www.rfc-editor.org/info/rfc6390>.

1332	   [RFC6703]  Morton, A., Ramachandran, G., and G. Maguluri, "Reporting
1333	              IP Network Performance Metrics: Different Points of View",
1334	              RFC 6703, DOI 10.17487/RFC6703, August 2012,
1335	              <http://www.rfc-editor.org/info/rfc6703>.

1337	   [RFC7276]  Mizrahi, T., Sprecher, N., Bellagamba, E., and Y.
1338	              Weingarten, "An Overview of Operations, Administration,
1339	              and Maintenance (OAM) Tools", RFC 7276,
1340	              DOI 10.17487/RFC7276, June 2014,
1341	              <http://www.rfc-editor.org/info/rfc7276>.

1343	   [RFC7384]  Mizrahi, T., "Security Requirements of Time Protocols in
1344	              Packet Switched Networks", RFC 7384, DOI 10.17487/RFC7384,
1345	              October 2014, <http://www.rfc-editor.org/info/rfc7384>.

1347	   [RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
1348	              Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
1349	              May 2016, <http://www.rfc-editor.org/info/rfc7799>.

1351	Authors' Addresses

1353	   Giuseppe Fioccola (editor)
1354	   Telecom Italia
1355	   Via Reiss Romoli, 274
1356	   Torino  10148
1357	   Italy

1359	   Email: giuseppe.fioccola@telecomitalia.it

1361	   Alessandro Capello (editor)
1362	   Telecom Italia
1363	   Via Reiss Romoli, 274
1364	   Torino  10148
1365	   Italy

1367	   Email: alessandro.capello@telecomitalia.it

1369	   Mauro Cociglio
1370	   Telecom Italia
1371	   Via Reiss Romoli, 274
1372	   Torino  10148
1373	   Italy

1375	   Email: mauro.cociglio@telecomitalia.it

1377	   Luca Castaldelli
1378	   Telecom Italia
1379	   Via Reiss Romoli, 274
1380	   Torino  10148
1381	   Italy

1383	   Email: luca.castaldelli@telecomitalia.it

1385	   Mach(Guoyi) Chen (editor)
1386	   Huawei Technologies

1388	   Email: mach.chen@huawei.com

1390	   Lianshu Zheng (editor)
1391	   Huawei Technologies

1393	   Email: vero.zheng@huawei.com

1394	   Greg Mirsky (editor)
1395	   ZTE
1396	   USA

1398	   Email: gregimirsky@gmail.com

1400	   Tal Mizrahi (editor)
1401	   Marvell
1402	   6 Hamada st.
1403	   Yokneam
1404	   Israel

1406	   Email: talmi@marvell.com