2	MBONED                                                       M. Cociglio
3	Internet-Draft                                                A. Capello
4	Intended status: Experimental                            A. Tempia Bonda
5	Expires: April 25, 2011                                   L. Castaldelli
6	                                                          Telecom Italia
7	                                                        October 22, 2010

9	            A method for IP multicast performance monitoring
10	               draft-cociglio-mboned-multicast-pm-01.txt

12	Abstract

14	   This document defines a method to accomplish performance monitoring
15	   measurements on live IP flows, including packet loss, one-way delay
16	   and jitter.  The proposed method is applicable to both unicast and
17	   multicast traffic, but only IP multicast streams are considered in
18	   this document.  The method can be implemented using tools and
19	   features already available on IP routers and does not require any
20	   protocol extension.  For this reason, it does not raise any
21	   interoperability issue.  However, the method could be further
22	   improved by means of some extension to existing protocols, but this
23	   aspect is left for further study and it is out of the scope of the
24	   document.

26	Status of this Memo

28	   This Internet-Draft is submitted in full conformance with the
29	   provisions of BCP 78 and BCP 79.

31	   Internet-Drafts are working documents of the Internet Engineering
32	   Task Force (IETF).  Note that other groups may also distribute
33	   working documents as Internet-Drafts.  The list of current Internet-
34	   Drafts is at http://datatracker.ietf.org/drafts/current/.

36	   Internet-Drafts are draft documents valid for a maximum of six months
37	   and may be updated, replaced, or obsoleted by other documents at any
38	   time.  It is inappropriate to use Internet-Drafts as reference
39	   material or to cite them other than as "work in progress."

41	   This Internet-Draft will expire on April 25, 2011.

43	Copyright Notice

45	   Copyright (c) 2010 IETF Trust and the persons identified as the
46	   document authors.  All rights reserved.

48	   This document is subject to BCP 78 and the IETF Trust's Legal
49	   Provisions Relating to IETF Documents
50	   (http://trustee.ietf.org/license-info) in effect on the date of
51	   publication of this document.  Please review these documents
52	   carefully, as they describe your rights and restrictions with respect
53	   to this document.  Code Components extracted from this document must
54	   include Simplified BSD License text as described in Section 4.e of
55	   the Trust Legal Provisions and are provided without warranty as
56	   described in the Simplified BSD License.

58	Table of Contents

60	   1.  Introduction
61	   2.  Terminology
62	   3.  Principle of the method
63	   4.  Characteristics of the method
64	   5.  Detailed description of the method
65	     5.1.  Packet Loss
66	     5.2.  One-way Delay
67	     5.3.  Inter-arrival jitter
68	   6.  Deployment considerations
69	     6.1.  Multicast Flow Identification
70	     6.2.  Path Discovery
71	     6.3.  Flow Marking
72	     6.4.  Monitoring Nodes
73	     6.5.  Management System
74	     6.6.  Scalability
75	     6.7.  Interoperability
76	   7.  Security Considerations
77	   8.  IANA Considerations
78	   9.  Acknowledgements
79	   10. Informative References
80	   Authors' Addresses

82	1.  Introduction

84	   The deployment of video services managed by Service Providers has had
85	   two main consequences:

87	   o  a widespread adoption of IP multicast to carry live TV channels

89	   o  a strong effort to guarantee a user experience comparable to
90	      traditional TV broadcasting services

92	   The second point implies a reinforced interest in performance
93	   monitoring techniques, including packet loss, delay and jitter
94	   measurements.  As discussed in
95	   [I-D.bipi-mboned-ip-multicast-pm-requirement], these techniques
96	   should satisfy a few fundamental requirements:

98	   o  applicability to real traffic

100	   o  availability of packet loss, delay and jitter measurements

102	   o  the ability to take both end-to-end and segment-by-segment
103	      measurements, in order to enable fault localization

105	   o  scalability

107	   o  few interoperability issues

109	   Currently available tools do not meet all of these requirements,
110	   which motivates the work on a new solution.

112	   The method described in this document makes it possible to perform
113	   packet loss, delay and jitter measurements on real IP multicast
114	   streams, on an end-to-end or segment-by-segment basis.  In the basic
115	   proposal there are no interoperability issues, since the method does
116	   not require any extension to existing protocols and can be
117	   implemented using tools already available on major routing platforms.

119	2.  Terminology

121	   Terminology used in this document:

123	   o  CB Bit (Control Bit): bit used to "mark" traffic to be monitored

125	   o  Block: sequence of consecutive packets with the CB set to the same
126	      value

128	   o  MI (Marking Interval): duration of a block (it defines the
129	      frequency at which CB is changed)

131	   o  PI (Polling Interval): it defines the frequency at which
132	      performance information is collected

134	   o  NMS: Network Management System

136	3.  Principle of the method

138	   In order to perform packet loss measurements on real traffic flows,
139	   it is generally required to include a sequence number in the packet
140	   header and to have equipment able to extract the sequence number and
141	   check in real time whether packets are missing.  Such an approach can
142	   be difficult to implement on real traffic.  If UDP is used as the
143	   transport protocol, no sequence number is available.  If instead a
144	   higher-layer sequence number (e.g. in the RTP header) is used,
145	   extracting it from every packet and performing the calculation in
146	   real time can place a heavy load on the equipment.

149	   The method proposed in this document is a simple and efficient way to
150	   measure packet loss on real traffic streams, without numbering
151	   packets or overloading network equipment.  The basic idea is to
152	   consider the traffic being measured as a sequence of blocks made of
153	   consecutive packets.  Blocks can be defined based on the number of
154	   packets (each block contains a configured fixed number of packets) or
155	   on their duration (e.g. blocks are 5 minutes long and the number of
156	   packets in each block varies with the flow rate).  In any
157	   case blocks must be recognizable unambiguously on every node along
158	   the path: by counting on a node the number of packets in each block
159	   and comparing the values with those measured by a different router
160	   along the path, it is possible to measure packet loss (if any)
161	   between the two nodes.

163	   Figure 1 represents a simple multicast forwarding tree made of 6
164	   nodes and 3 receivers.

166	    +-----+                                       +--------+
167	    | SRC |                             +--------<>   R5   <>---  Recv 1
168	    +-----+                             |         +--------+
169	       |                                |
170	       |                                |
171	   +---<>---+      +--------+      +---<>---+     +--------+
172	   |   R1   <>----<>   R2   <>----<>   R3   <>---<>   R6   <>---  Recv 2
173	   +--------+      +---<>---+      +--------+     +--------+
174	                        |
175	                        |
176	                        |          +--------+
177	                        +---------<>   R4   <>---  Recv 3
178	                                   +--------+
179	    <>   Interface
180	   ----  Link

182	              Figure 1: Example of multicast forwarding tree

184	   Blocks of consecutive packets are identified using some information
185	   in the packet flow itself, for instance a field in the packet header
186	   that can take two different values.  The first-hop router (R1 in
187	   Figure 1) sets this field and changes it periodically (e.g. every 5
188	   minutes or every 100000 packets), alternating the two values and
189	   creating a sequence of blocks.  All the packets within a block have
190	   the field set to the same value and all the packets within the
191	   following block have the field set to the other value.  If blocks
192	   are defined on a time basis, the number of packets in each block is
193	   not fixed, but depends on the flow rate.  However, since blocks are
194	   created on the first-hop router and not modified along the path, all
195	   the nodes should count the same number of packets within the same
196	   block (if no packet loss occurs).  By counting the number of packets
197	   in the block on each node and comparing those values, it is possible
198	   to detect packet loss with single-packet precision and to identify
199	   where the loss occurred.
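
   As an illustration only, the following Python sketch mimics this
   principle on two in-memory lists of marking values, one observed
   upstream and one downstream.  The function names are hypothetical.
   The sketch assumes that packets are not reordered across block
   boundaries and that no block is lost in its entirety; a real node
   would keep counters rather than store per-packet information.

```python
from itertools import groupby


def block_sizes(marks):
    """Group consecutive equal marking values into (value, count) blocks."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(marks)]


def per_block_loss(upstream_marks, downstream_marks):
    """Return the number of packets lost in each block between two nodes."""
    upstream = block_sizes(upstream_marks)
    downstream = block_sizes(downstream_marks)
    # The two nodes must have seen the same alternation of marking values.
    if [v for v, _ in upstream] != [v for v, _ in downstream]:
        raise ValueError("block sequences do not line up")
    return [u - d for (_, u), (_, d) in zip(upstream, downstream)]


# Three blocks of five packets each; one packet of the second block is lost.
sent = [0] * 5 + [1] * 5 + [0] * 5
received = [0] * 5 + [1] * 4 + [0] * 5
losses = per_block_loss(sent, received)
```

   In this made-up example the second block is one packet short
   downstream, so the loss is attributed to that block only.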

201	   In the following, blocks are assumed to be defined on a time basis.

203	   The same approach can also be used to measure one-way delay and
204	   inter-arrival jitter.  In this case, the transition from a block to
205	   the following one is used as a time reference to calculate the delay
206	   between any two nodes in the network.  Time synchronization is
207	   required in order to have a consistent delay measurement.

209	   Inter-arrival jitter can be easily estimated from delay measures and
210	   does not necessarily require synchronization between the nodes.

212	4.  Characteristics of the method

214	   The method described in this document fulfills all the requirements
215	   listed in Section 1.  In addition, it has the following
216	   advantages:

218	   o  easy implementation (use of features already available on major
219	      routing platforms)

221	   o  low computational effort

223	   o  highly precise packet loss measurement (single packet loss
224	      granularity)

226	   o  applicability to any kind of IP traffic (unicast and multicast)

228	   o  independence from the flow bit rate

230	   o  independence from higher level protocols (e.g. RTP) or video
231	      coding (e.g. MPEG)

233	   o  no interoperability issues

235	   Figure 2 represents a subtree of the multicast forwarding tree
236	   depicted in Figure 1 and shows how the method can be used to measure
237	   packet loss (or one-way delay and inter-arrival jitter) on different
238	   network segments.

240	    +-----+
241	    | SRC |
242	    +-----+
243	       |
244	       |
245	   +---<>---+      +--------+     +--------+      +--------+
246	   |   R1   <>----<>   R2   <>---<>   R3   <>----<>   R6   <>---  Recv 2
247	   +--------+      +--------+     +--------+      +--------+
248	       .           .        .              .      .        .
249	       .           .        .              .      .        .
250	       .           <-------->              <------>        .
251	       .       Node Packet Loss        Link Packet Loss    .
252	       .                                                   .
253	       <--------------------------------------------------->
254	                        End-to-End Packet loss

256	                     Figure 2: Available measurements

258	   By applying the method on different interfaces along the multicast
259	   distribution tree, it is possible to measure packet loss across a
260	   single link, across a node (e.g. due to queuing management) or end-
261	   to-end.  In general, it is possible to monitor any segment of the
262	   network.
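
   The following Python sketch illustrates how per-segment losses could
   be derived once every measurement point has reported its packet
   count for the same completed block.  The function names and the
   counts are hypothetical and are not taken from this document.

```python
def segment_losses(points):
    """points: ordered (measurement_point, packets_counted_in_block) pairs.

    Returns the loss between each pair of adjacent measurement points.
    """
    return {
        (a, b): count_a - count_b
        for (a, count_a), (b, count_b) in zip(points, points[1:])
    }


def end_to_end_loss(points):
    """Loss between the first and the last measurement point."""
    return points[0][1] - points[-1][1]


# Hypothetical counts for one completed block along R1 -> R2 -> R3 -> R6.
block_counts = [
    ("R1 egress", 1000),
    ("R2 ingress", 1000),
    ("R2 egress", 998),   # 2 packets dropped inside R2 (node loss)
    ("R3 ingress", 998),
    ("R3 egress", 998),
    ("R6 ingress", 997),  # 1 packet dropped on the R3-R6 link (link loss)
    ("R6 egress", 997),
]
losses = segment_losses(block_counts)
```

   The sum of the per-segment losses equals the end-to-end loss, which
   is what allows a loss to be localized to a node or a link.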

264	5.  Detailed description of the method

266	   This section describes in more detail how the method is applied to
267	   measure packet loss, one-way delay and jitter in packet-switched
268	   networks.

270	5.1.  Packet Loss

272	   Figure 3 shows how the method described in this document can be used
273	   to measure the packet loss across a link between two adjacent nodes.
274	   For example, referring to Figure 1, suppose we are interested in
275	   monitoring the packet loss on the link between R1 and R2.  According
276	   to the method briefly described in Section 3, since router R1 is the
277	   first-hop router, it is responsible for marking the field in the
278	   packet header.  As discussed before, a single bit is sufficient for
279	   this purpose: the bit used to mark the traffic is called the Control
280	   Bit (CB bit).  By alternating between the values 0 and 1 from one
281	   period to the next, the Control Bit generates a sort of square-wave
282	   signal and the original traffic flow is converted into a sequence of
283	   blocks.  The half-period T/2 of the square wave is called the Marking
284	   Interval (MI) and corresponds to the duration of a single block.  The
285	   action of "marking" the traffic (setting the Control Bit) can be
286	   executed on the ingress interface of R1.  On the egress interface of
287	   R1 two counters, named C(0)R1 and C(1)R1, count the number of packets
288	   with the CB bit set to 0 and 1 respectively.  As long as traffic is
289	   marked with 0, only counter C(0)R1 is incremented while C(1)R1 does
290	   not change.  Counters C(0)R1 and C(1)R1 can be used as reference
291	   values to determine the packet loss from R1 to R2 (or to other nodes
292	   along the path toward the destination).
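
   A minimal Python sketch of time-based marking and of the two
   counters on the marking node is given below.  It assumes that the
   marking starts with CB=0 at time zero and that time is expressed in
   seconds; the function names are hypothetical.

```python
def control_bit(elapsed_s, marking_interval_s):
    """Square wave: 0 during the first block, 1 during the second, ..."""
    return int(elapsed_s // marking_interval_s) % 2


def count_marked(packet_times_s, marking_interval_s):
    """Mark each packet by its send time and return [C(0), C(1)]."""
    counters = [0, 0]
    for t in packet_times_s:
        counters[control_bit(t, marking_interval_s)] += 1
    return counters


MI = 300  # Marking Interval of 5 minutes, expressed in seconds
# One packet per second for 10 minutes: two full blocks of 300 packets.
c0, c1 = count_marked(range(600), MI)
```

   The counters are cumulative: each one grows only while its own
   marking value is in use, and is frozen during the other block.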

294	   Router R2, similarly, instantiates on its ingress interface two
295	   counters, C(0)R2 and C(1)R2, to count the number of packets received
296	   with the CB bit set to 0 and 1 respectively.  By comparing C(0)R1
297	   with C(0)R2 and C(1)R1 with C(1)R2 and repeating this operation on
298	   every block, it is possible to detect the number of packets lost on
299	   the link between R1 and R2.

301	   Similarly, by using two counters on the R2 egress interface and on
302	   every other interface along the path, it is possible to determine
303	   packet loss on every network segment and therefore to detect where
304	   packet losses occur.

306	                       T/2                 T
307	                     <------>        <-------------->
308	                            +-------+       +-------+
309	                            |       |       |       |
310	                    +-------+       +-------+       +-------
311	     Control Bit    0000000011111111000000001111111100000000

313	                      Block   Block   Block   Block   Block
314	                     <------><------><------><------><------>

316	               +---------+                            +---------+
317	     -------> <>    R1   <> -----------------------> <>   R2    <> --->
318	               +---------+                            +---------+

320	      Figure 3: Application of the method to compute link packet loss

322	   The method doesn't require any synchronization in the network, as the
323	   traffic flow implicitly carries the synchronization in the
324	   alternation of values of the Control Bit.

326	   Table 1 shows an example of the use of router counters to calculate
327	   the packet loss between R1 and R2.  Time is expressed in minutes and
328	   we assume that counter values are checked on each router every two
329	   minutes (it doesn't matter if R1 and R2 are not synchronized).  We
330	   also assume that the Marking Interval is 5 minutes, meaning that the
331	   CB bit changes every 5 minutes.

333	   The columns contain the values of C(0) and C(1) for both R1 and R2;
334	   in particular, the table shows the values they assume every 2
335	   minutes.  Counters increase according to the Control Bit: when CB is
336	   0, only C(0) increases and C(1) is still; when CB is 1, only C(1)
337	   increases and C(0) is still.  Packet loss calculation must be
338	   performed when a counter is stable, because this means that a block
339	   has ended and the number of packets within that block can be counted
340	   exactly.

342	               +------+--------+--------+--------+--------+
343	               | Time | C(0)R1 | C(1)R1 | C(0)R2 | C(1)R2 |
344	               +------+--------+--------+--------+--------+
345	               | 0    | 0      | 0      | 0      | 0      |
346	               |      |        |        |        |        |
347	               | 2    | 112    | 0      | 110    | 0      |
348	               |      |        |        |        |        |
349	               | 4    | 234    | 0      | 237    | 0      |
350	               |      |        |        |        |        |
351	               | 6    | 277    | 103    | 277    | 101    |
352	               |      |        |        |        |        |
353	               | 8    | 277    | 212    | 277    | 210    |
354	               |      |        |        |        |        |
355	               | 10   | 277    | 259    | 277    | 256    |
356	               |      |        |        |        |        |
357	               | 12   | 403    | 262    | 401    | 261    |
358	               |      |        |        |        |        |
359	               | 14   | 827    | 262    | 819    | 261    |
360	               +------+--------+--------+--------+--------+

362	       Table 1: Evaluation of counters for packet loss measurements

364	   For example, looking at Table 1, traffic is initially marked with
365	   CB=0 because only C(0)R1 and C(0)R2 increase, while C(1) counters are
366	   still.  At minute 6, C(1) counters have started moving while C(0)
367	   counters have stopped (in fact at minute 8 they have the same values
368	   they had at minute 6): this means that the block with CB=0 has ended
369	   and the flow is now being marked with CB=1.  Hence the value of C(0)
370	   counters gives the exact number of packets transmitted in that block.
371	   Comparing C(0)R1 and C(0)R2 at minute 8, it is possible to verify
372	   whether any packet of the first block was lost on the link between R1
373	   and R2 (in the case shown in the table C(0)R1 = C(0)R2 = 277, meaning
374	   that no packets were lost).  At minute 12, C(0) counters have started
375	   moving again while C(1) counters have stopped (at minute 14 they have
376	   the same values they had at minute 12): this means that the block
377	   with CB=1 has ended and the flow is being marked again with CB=0.
378	   The value of C(1) counters gives the exact number of packets
379	   transmitted in the block that has just ended.  Comparing C(1)R1 and
380	   C(1)R2 at minute 14, it is possible to verify whether any packet of
381	   that block was lost (this time C(1)R1 = 262 and C(1)R2 = 261, meaning
382	   that 1 packet was lost).
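
   The reasoning applied to Table 1 can be sketched in Python as
   follows.  The counter samples are copied from Table 1; the function
   names are hypothetical.  The sketch assumes that a counter which is
   unchanged between two consecutive polls belongs to a completed block
   (a flow that simply pauses would look the same) and it ignores
   counter wrap-around.

```python
def plateaus(samples):
    """Return the cumulative values at which a polled counter stopped."""
    found = []
    last = samples[0]  # baseline value before the first block
    for previous, current in zip(samples, samples[1:]):
        if current == previous and current != last:
            found.append(current)
            last = current
    return found


def block_sizes(samples):
    """Convert plateau values into per-block packet counts."""
    sizes, base = [], samples[0]
    for value in plateaus(samples):
        sizes.append(value - base)
        base = value
    return sizes


def block_losses(upstream_samples, downstream_samples):
    """Per-block loss for the blocks completed on both routers."""
    return [
        up - down
        for up, down in zip(block_sizes(upstream_samples),
                            block_sizes(downstream_samples))
    ]


# Counter values polled every two minutes, copied from Table 1.
c0_r1 = [0, 112, 234, 277, 277, 277, 403, 827]
c1_r1 = [0, 0, 0, 103, 212, 259, 262, 262]
c0_r2 = [0, 110, 237, 277, 277, 277, 401, 819]
c1_r2 = [0, 0, 0, 101, 210, 256, 261, 261]

loss_cb0 = block_losses(c0_r1, c0_r2)
loss_cb1 = block_losses(c1_r1, c1_r2)
```

   Counter values read while a block is still in progress (for example
   234 on R1 and 237 on R2 at minute 4) are never compared, which is why
   the two routers do not need to be polled at the same instant.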

384	   The same method can be applied to more complex networks, as long as
385	   the measurement is enabled on the path followed by the traffic flow
386	   being analyzed.

388	5.2.  One-way Delay

390	   The method to measure one-way delay builds directly on the packet
391	   loss method.  The event when the marking changes from 0 to 1 or vice
392	   versa is used as a time reference to calculate the delay.
393	   Considering again the example depicted in Figure 1, R1 records every
394	   change in the marking as an event, by storing a timestamp TS R1 every
395	   time it sends the first packet of a block.  R2 does the same,
396	   recording TS R2 every time it receives the first packet of a block.
397	   By comparing TS R1 and TS R2 it is possible to calculate the delay
398	   between R1 and R2.

400	   In order to coherently compare the timestamps collected on different
401	   routers, synchronization is required in the network.  Moreover, the
402	   measurement can be considered valid only if no packet loss occurred.
403	   If some packets are lost it is possible that the first packet of a
404	   block on R1 is not the first packet of the same block on R2.

406	   In more detail, whenever an interface sends or receives the first
407	   packet of a block (that is, a packet with the Control Bit set to 0 or
408	   1 while previous packets were marked with the opposite value), a
409	   timestamp should be recorded.  By comparing timestamps recorded on
410	   different nodes in the network, it is possible to calculate the delay
411	   on each network segment.  As stated before, synchronization is
412	   required to get a reliable delay measurement.

414	   Table 2 considers the same example of Figure 1, but both packet loss
415	   and one-way delay are now measured.  Time is expressed in minutes,
416	   while timestamps are expressed in seconds with millisecond resolution
417	   (hours and minutes are omitted for simplicity).  We assume that
418	   counters and timestamps are checked on each router every two minutes
419	   and that the Marking Interval is 5 minutes.  Routers R1 and R2,
420	   besides incrementing counters C(0) and C(1), now also record a
421	   timestamp whenever the corresponding counter begins incrementing
422	   (i.e. the first packet of a block is sent/received).

424	   +-------+-----+--------+-----+--------+-----+--------+-----+--------+
425	   | Time  | R1  | TS0 R1 | R1  | TS1 R1 | R2  | TS0 R2 | R2  | TS1 R2 |
426	   | (min) | C0  | (sec)  | C1  | (sec)  | C0  | (sec)  | C1  | (sec)  |
427	   +-------+-----+--------+-----+--------+-----+--------+-----+--------+
428	   | 0     | 0   | -      | 0   | -      | 0   | -      | 0   | -      |
429	   |       |     |        |     |        |     |        |     |        |
430	   | 2     | 112 | 7.483  | 0   | -      | 110 | 7.487  | 0   | -      |
431	   |       |     |        |     |        |     |        |     |        |
432	   | 4     | 234 | -      | 0   | -      | 237 | -      | 0   | -      |
433	   |       |     |        |     |        |     |        |     |        |
434	   | 6     | 277 | -      | 103 | 3.621  | 277 | -      | 101 | 3.626  |
435	   |       |     |        |     |        |     |        |     |        |
436	   | 8     | 277 | -      | 212 | -      | 277 | -      | 210 | -      |
437	   |       |     |        |     |        |     |        |     |        |
438	   | 10    | 277 | -      | 259 | -      | 277 | -      | 256 | -      |
439	   |       |     |        |     |        |     |        |     |        |
440	   | 12    | 403 | 5.752  | 262 | -      | 401 | 5.757  | 262 | -      |
441	   |       |     |        |     |        |     |        |     |        |
442	   | 14    | 827 | -      | 262 | -      | 819 | -      | 262 | -      |
443	   +-------+-----+--------+-----+--------+-----+--------+-----+--------+

445	          Table 2: Evaluation of counters for delay measurements

447	   At minute 2, C(0) counters have started moving on both routers and
448	   the first timestamp (relative to the first packet with CB=0) is
449	   recorded: R1 timestamp is 7.483, R2 timestamp is 7.487.  Notice that
450	   those timestamps refer to the same packet because the first packet of
451	   the block is the same on both routers (if no packet loss has
452	   occurred): therefore they can be compared and, if we assume that R1
453	   and R2 are synchronized, they can be used to measure the delay
454	   between R1 and R2 (4 msec).  At minute 6 the marking has changed,
455	   C(0) counters have stopped and C(1) counters have started moving: it
456	   means that a new block with CB=1 has started, therefore R1 and R2
457	   record a new timestamp.  The new timestamp refers to the first packet
458	   of the block with CB=1 (which is the same packet on both routers).
459	   R1 timestamp is 3.621, R2 timestamp is 3.626; again, the two values
460	   are comparable and the delay is 5 msec.
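
   The delay computation on the Table 2 timestamps can be sketched in
   Python as follows.  The function name is hypothetical.  The sketch
   assumes that the two routers are synchronized and that both
   timestamps of a pair fall within the same minute, since hours and
   minutes are omitted in the table.  Decimal arithmetic is used only to
   avoid binary floating-point rounding on the millisecond digits.

```python
from decimal import Decimal


def one_way_delay_ms(ts_upstream_s, ts_downstream_s, packets_lost=0):
    """Delay in ms between two nodes, or None if the block had losses."""
    if packets_lost != 0:
        # The first packet of the block may differ on the two nodes.
        return None
    delta_s = Decimal(ts_downstream_s) - Decimal(ts_upstream_s)
    return int(delta_s * 1000)


# (R1 timestamp, R2 timestamp) pairs in seconds, copied from Table 2.
samples = [("7.483", "7.487"), ("3.621", "3.626"), ("5.752", "5.757")]
delays_ms = [one_way_delay_ms(r1, r2) for r1, r2 in samples]
```

   The first two results correspond to the 4 msec and 5 msec delays
   discussed above; the third pair is the timestamp recorded at
   minute 12 in Table 2.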

462	   It is possible to perform more than one delay measurement per period
463	   by taking not only the timestamp of the first packet of each block,
464	   but also the timestamps of other packets within the same block.  The
465	   only requirement is that the packets triggering the timestamps are
466	   the same on every router along the path.

468	5.3.  Inter-arrival jitter

470	   Like the one-way delay measurement, the method to evaluate
471	   inter-arrival jitter builds directly on the packet loss method.

473	   Again, the event when the marking changes from 0 to 1 or vice versa
474	   is used as a time reference to record timestamps: considering the
475	   example depicted in Figure 1, R1 will store a timestamp TS R1 every
476	   time it sends the first packet of a block and R2 will record a
477	   timestamp TS R2 every time it receives the first packet of a block.

479	   The inter-arrival jitter can be easily derived from one-way delay
480	   measurements.  For example, it is possible to evaluate the jitter by
481	   calculating the delay variation between two consecutive samples:
482	   considering the values shown in Table 2, since the measured delay is
483	   4 msec for the first sample and 5 msec for the second sample, the
484	   derived jitter is 1 msec.

486	   In this case, synchronization in the network is not strictly required
487	   because a constant clock offset between the nodes cancels out when
488	   two consecutive delay samples are subtracted.
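
   A short Python sketch of this calculation follows.  The delay
   samples are those obtained from the timestamps in Table 2; the
   function name and the 250 ms clock offset are hypothetical, and
   taking the absolute value of the variation is a choice of this
   sketch.

```python
def jitter_ms(delays_ms):
    """Absolute delay variation between consecutive delay samples."""
    return [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]


delays = [4, 5, 5]                  # one-way delays from Table 2, in ms
skewed = [d + 250 for d in delays]  # same samples seen with a 250 ms offset

jitter = jitter_ms(delays)
jitter_with_offset = jitter_ms(skewed)
```

   Both lists of jitter values are identical, which shows why a fixed
   offset between the clocks of the two nodes does not affect the
   result.  A clock offset that drifts between samples would not cancel
   in the same way.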

489	6.  Deployment considerations

491	   This section describes some aspects that should be taken into account
492	   when the method is deployed in a real network.  For the sake of
493	   simplicity, we consider a network scenario where only packet loss is
494	   being measured, but all the considerations are valid and can be
495	   easily extended to one-way delay and inter-arrival jitter measurement
496	   as well.

498	6.1.  Multicast Flow Identification

500	   The first thing to do in order to monitor multicast traffic in a real
501	   network is to identify the flow to be monitored.  The method
502	   described in this document is able to monitor a single multicast
503	   stream or multiple flows grouped together; in the latter case the
504	   measurement is consistent only if all the flows in the group follow
505	   the same path.  Moreover, a network operator must be aware that, if
506	   the measurement is performed on many streams, it is not possible to
507	   determine exactly which flow was affected by packet loss (all the
508	   flows are considered as a single stream by the monitoring system).

510	6.2.  Path Discovery

512	   Once the multicast stream(s) to be monitored have been identified, it
513	   is important to enable the monitoring system on the proper nodes.
514	   For end-to-end monitoring only, it is sufficient to enable the
515	   monitoring system on the first and last-hop routers of the path: the
516	   mechanism is completely transparent to intermediate nodes and
517	   independent of the path followed by multicast streams.  Conversely,
518	   to monitor the flow along its whole path and on every segment (every
519	   node and link) it is necessary to enable monitoring on every node
520	   from the source to the destination.  For this purpose it isn't
521	   strictly required to know the exact path followed by the flow.  If,
522	   for example, the flow has multiple paths to reach a destination, it
523	   is sufficient to enable the monitoring system on every path; a
524	   Management System will then process just the right information (or it
525	   will process all the counters but some of them will be zero, meaning
526	   that the considered flow is not flowing through the corresponding
527	   interface).

529	6.3.  Flow Marking

531	   Once the multicast stream is identified and its path is known, it is
532	   necessary to "mark" the flow so as to create packet blocks.  This
533	   means choosing where to activate the marking and how to "mark"
534	   packets.

535	   Regarding the first point, it is desirable, in general, to have a
536	   single marking node because it is simpler to manage and doesn't raise
537	   any risk of conflict (consider the case where two nodes mark the same
538	   flow).  For this purpose it is necessary to mark the flow as close as
539	   possible to the multicast source, e.g. on the first router downstream
540	   of the multicast sources, where all the multicast streams can be
541	   marked.  In addition, marking a flow close to the source allows an
542	   end-to-end measurement if a measurement point is enabled on the
543	   last-hop router as well.  Theoretically, the flow could be marked
544	   before the first-hop router, directly by the sources: in this case
545	   the first-hop router just needs to count the packets of each block
546	   and acts as an intermediate node.  The only requirement is that the
547	   marking must change periodically and every node along the path must
548	   be able to identify marked packets unambiguously.

550	   If, on the other hand, many marking nodes are required, it is
551	   important that each marking node marks different flows so as to avoid
552	   "marking conflicts" that would invalidate measurements.

554	   Regarding the second point, as described in Section 5.1, a field in
555	   the IP header could be sufficient for this purpose.  As an example,
556	   it is possible to use the two least significant bits of the DSCP
557	   field (bit 0 and bit 1).  One of them (bit 0) is always set to 1 and
558	   is used to identify the flow to be measured; the other one (bit 1) is
559	   changed periodically and alternates between 0 and 1.  This way the
560	   traffic flow is transformed into a sequence of blocks, where all the
561	   packets of a block have bit 1 of the DSCP field set to the same
562	   value (0 or 1).  Of course, marking can be based on the DSCP field
563	   if differentiated packet scheduling is not based on that field and,
564	   for instance, is based only on the IP Precedence bits.
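
   The bit manipulation described above can be sketched as follows.
   This is only an illustration, not part of the method's definition:
   it assumes that "bit 0" is the least significant bit of the 6-bit
   DSCP value and "bit 1" the next one, and the names FLOW_BIT,
   ALTERNATING_BIT, mark_dscp and classify are hypothetical.

```python
# Assumption: bit 0 is the least significant bit of the 6-bit DSCP value.
FLOW_BIT = 0b01         # always 1 on packets of the flow to be measured
ALTERNATING_BIT = 0b10  # changed periodically, alternates between 0 and 1
DSCP_MASK = 0x3F        # DSCP is a 6-bit field


def mark_dscp(dscp: int, value: int) -> int:
    """Return `dscp` with the flow bit set and the alternating bit set to `value`."""
    if not 0 <= dscp <= DSCP_MASK:
        raise ValueError("DSCP must fit in 6 bits")
    if value not in (0, 1):
        raise ValueError("alternating bit value must be 0 or 1")
    # Clear the two marking bits, then set them as required.
    cleared = dscp & ~(FLOW_BIT | ALTERNATING_BIT) & DSCP_MASK
    return cleared | FLOW_BIT | (ALTERNATING_BIT if value else 0)


def classify(dscp: int):
    """Return the alternating bit (0 or 1) of a marked packet, or None if unmarked."""
    if not dscp & FLOW_BIT:
        return None
    return 1 if dscp & ALTERNATING_BIT else 0
```

   With these definitions, all packets of one block carry the same
   value returned by classify(), and unmarked traffic is ignored.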

566	   In practice, marking with the DSCP field can be performed by
567	   configuring on the first-hop router an access list that intercepts
568	   the flow(s) to be measured and a policy that sets the DSCP field
569	   accordingly.  The flows to be measured can easily be changed by
570	   modifying the access list.  Moreover, since traffic marking must
571	   change to create traffic blocks, it is necessary to change the policy
572	   periodically: this can be done, for example, with an automatic script
573	   that periodically modifies the configuration.
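
   As a minimal sketch of the timing logic such a script might use,
   the function below derives the marking value from the current time.
   It assumes that marking intervals are aligned to a common time
   origin and have a fixed length; the function name and parameters
   are hypothetical, and the router-specific step that actually
   rewrites the policy is deliberately left out.

```python
def marking_value(now_s: float, marking_interval_s: float) -> int:
    """Return the alternating bit (0 or 1) to apply at time `now_s`.

    Assumption: intervals start at time 0 and each lasts
    `marking_interval_s` seconds, so consecutive intervals alternate.
    """
    if marking_interval_s <= 0:
        raise ValueError("marking interval must be positive")
    block_index = int(now_s // marking_interval_s)
    return block_index % 2
```

   A script would call this once per interval and push the new policy
   only when the returned value differs from the one in force.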

575	6.4.  Monitoring Nodes

577	   The operation of marking flows to be monitored can be accomplished by
578	   a single node, namely the first-hop router.  Intermediate nodes are
579	   not required to perform any particular operation other than counting
580	   the marked packets they receive and forward: this operation can be
581	   enabled on every router along the multicast forwarding tree or just
582	   on a small subset, depending on which network segment is to be
583	   monitored (a single link, a particular metro area, the backbone, the
584	   whole path).

586	   The operation of counting packets on intermediate nodes is very
587	   simple and can be accomplished, e.g., by configuring an access list
588	   that intercepts packets belonging to the multicast group being
589	   monitored with certain DSCP values (those configured on the first-hop
590	   router and used to mark the flow).  This way only "marked" packets
591	   will be counted.  Since marking alternates periodically between two
592	   values, two counters (one for each value) are needed for each flow
593	   being monitored: one counter for packets with CB = 0 and one counter
594	   for packets with CB = 1.
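
   The two-counter scheme can be illustrated with the sketch below.
   It is a software model only (on a router the counting would be done
   by access lists, as described above); the class name, the use of
   group addresses as flow identifiers and the DSCP bit positions
   (bit 0 as flow bit, bit 1 as CB) are assumptions made for the
   example.

```python
class MarkedPacketCounter:
    """Keeps two counters (CB = 0 and CB = 1) per monitored multicast group."""

    def __init__(self, monitored_groups):
        # counters[group] == [packets with CB = 0, packets with CB = 1]
        self.counters = {group: [0, 0] for group in monitored_groups}

    def observe(self, group: str, dscp: int) -> None:
        """Count one packet if it belongs to a monitored group and is marked."""
        if group not in self.counters:
            return  # flow not monitored on this node
        if not dscp & 0b01:
            return  # flow bit not set: packet is not marked
        cb = (dscp >> 1) & 1
        self.counters[group][cb] += 1
```

   Unmonitored groups and unmarked packets leave the counters
   untouched, which mirrors the behaviour of an access list that
   matches only the configured DSCP values.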

596	   Marking and counting are two decoupled operations: it is possible to
597	   mark all the multicast flows at the source but monitor just one or a
598	   few flows, by enabling counters only for the intended streams.

600	6.5.  Management System

602	   Nodes on which performance monitoring is enabled collect counters
603	   for multicast flows, but they cannot use this information to measure
604	   packet loss by themselves, because they only have local information
605	   and lack a global view of the network.  For this reason an external
606	   Network Management System (NMS) is required to collect and process
607	   the data and to perform the packet loss calculation.  The NMS
608	   compares the counter values from different nodes and is then able to
609	   determine whether any packets were lost (even a single packet) and
610	   also where they were lost.
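
   The comparison performed by the NMS can be sketched as below.  The
   example assumes that the counter values passed in all refer to the
   same block of the same flow and are listed in path order from the
   marking node towards a receiver; the function name and the node
   names are hypothetical.

```python
def locate_loss(path_counts):
    """Return (upstream_node, downstream_node, lost_packets) for each
    pair of adjacent measurement points whose counters differ.

    `path_counts` is an ordered list of (node_name, packet_count)
    tuples for one block, from the marking node towards the receiver.
    """
    losses = []
    for (up_node, up_count), (down_node, down_count) in zip(
        path_counts, path_counts[1:]
    ):
        if up_count != down_count:
            losses.append((up_node, down_node, up_count - down_count))
    return losses
```

   For instance, counts of 1000, 1000 and 998 on three consecutive
   nodes would point to two packets lost between the second and the
   third node.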

612	   The information collected by the routers (counter values) needs to be
613	   transferred to the NMS periodically.  This can be accomplished, e.g.,
614	   via FTP or TFTP and can be done in Push Mode or Polling Mode.  In the
615	   first case, each router periodically sends the information it
616	   collects to the NMS; in the latter case, the NMS periodically polls
617	   the routers to collect the information.  In any case, the Polling
618	   Interval (PI) should satisfy the Shannon sampling theorem with
619	   respect to the Marking Interval (MI): PI < MI / 2.  This means that
620	   the Management System should collect, during every Marking Interval,
621	   at least two samples of each counter (in order to determine whether
622	   the counter is incrementing or is stationary in that interval).
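
   The constraint on the Polling Interval, and one possible way of
   using the two samples per Marking Interval, can be sketched as
   follows.  The helper names are hypothetical, and treating two equal
   consecutive samples as the final value of a block is an assumption
   of this example (it would also be true for a flow that is
   temporarily silent).

```python
def polling_interval_ok(pi_s: float, mi_s: float) -> bool:
    """Check the condition PI < MI / 2 stated in the text."""
    return pi_s < mi_s / 2


def stationary_value(samples):
    """Return the first counter value seen in two consecutive samples,
    or None if the counter was incrementing between every pair."""
    for previous, current in zip(samples, samples[1:]):
        if previous == current:
            return current
    return None
```

   With MI = 300 s, a Polling Interval of 100 s satisfies the
   condition, while 150 s does not.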

624	6.6.  Scalability

626	   This section describes what is needed on a node to enable the
627	   performance measurement system, in order to assess the scalability
628	   of the method.

630	   Regarding the marking, it is preferable to have a single marking node
631	   for reasons explained in Section 6.3.  The marking can be easily
632	   performed on a single multicast flow as well as on the entire
633	   multicast traffic.  What is needed for example is a single policy
634	   that marks all the intended traffic with a specific DSCP value: this
635	   operation doesn't raise any scalability issue, since it is generally
636	   performed by routers for QoS purposes.

638	   Regarding the counting, what is needed are two counters for every
639	   flow (or group of flows) being monitored and for every interface
640	   where the monitoring system is activated.  For example, in order to
641	   monitor 3 multicast flows on a router with 4 interfaces involved, 24
642	   counters are needed (2 counters for each of the 3 flows on each of
643	   the 4 interfaces).  If access lists are used to count packets, a
644	   single ACL can be used to count packets of many flows (access list
645	   entries will increase with the number of flows), but a different
646	   access list is required on every interface.
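
   The arithmetic of the example above can be written as a small
   helper; the function name is hypothetical, and the factor of two
   reflects the pair of counters (CB = 0 and CB = 1) kept for each
   flow.

```python
COUNTERS_PER_FLOW = 2  # one counter for CB = 0, one for CB = 1


def counters_needed(flows: int, interfaces: int) -> int:
    """Number of counters on a node monitoring `flows` flows (or groups
    of flows) on each of `interfaces` interfaces."""
    if flows < 0 or interfaces < 0:
        raise ValueError("flows and interfaces must be non-negative")
    return COUNTERS_PER_FLOW * flows * interfaces
```

   For 3 flows on 4 interfaces this gives the 24 counters mentioned in
   the text.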

648	   The number of counters and access lists can easily increase with the
649	   number of flows and interfaces; however, monitoring is not required
650	   on every interface (it should be activated only on interfaces
651	   belonging to the multicast forwarding tree).  Besides, it can be
652	   sufficient to monitor a few flows to have a monitoring system that
653	   spans the whole network, because multicast flows follow the shortest
654	   path, which is usually the same for all the streams (except in case
655	   of multiple equal-cost paths); flows using the same path are
656	   therefore likely to show similar performance results.

658	6.7.  Interoperability

660	   The method described in this document doesn't raise any
661	   interoperability issue, since it doesn't require any new protocol or
662	   any kind of interaction among nodes.  Traffic marking can be
663	   performed by a single node, while counting of packets is performed
664	   locally by each router and the correlation between counters is done
665	   by an external NMS.

667	   The only requirement is that every node should be able to identify
668	   marked flows, but, as explained in Sections 6.3 and 6.4, this can be
669	   accomplished using simple functions that don't raise any
670	   interoperability issue and are already available on major routing
671	   platforms.

673	7.  Security Considerations

675	   This document specifies a method to perform measurements in the
676	   context of a Service Provider's network and has not been developed to
677	   conduct Internet measurements, so it does not directly affect
678	   Internet security or applications that run on the Internet.
679	   However, implementations of this method must be mindful of security
680	   and privacy concerns.

682	   There are two types of security concerns: potential harm caused by
683	   the measurements and potential harm to the measurements.  Regarding
684	   the first point, the measurements described in this document are
685	   passive, so no packets are injected into the network that could
686	   harm the network itself or the data traffic.  Nevertheless, the
687	   method implies on-the-fly modifications to the IP header of data
688	   packets: this must be performed in a way that doesn't alter the
689	   quality of service experienced by packets subject to measurements
690	   and that preserves the stability and performance of the routers
691	   doing the measurements.  The measurements themselves could be harmed
692	   by routers altering the marking of the packets, or by an attacker
693	   injecting artificial traffic.  Authentication techniques, such as
694	   digital signatures, may be used where appropriate to guard against
695	   injected traffic attacks.

697	   The privacy concerns of network measurement are limited because the
698	   method only relies on information contained in the IP header without
699	   any release of user data.

701	8.  IANA Considerations

703	   There are no IANA actions required.

705	9.  Acknowledgements

707	   The authors would like to thank Domenico Laforgia, Daniele Accetta
708	   and Mario Bianchetti for their contribution to the definition and the
709	   implementation of the method.  The authors would also like to thank
710	   Paolo Fasano and Matteo Cravero for their useful suggestions.

712	10.  Informative References

714	   [I-D.bipi-mboned-ip-multicast-pm-requirement]
715	              Bianchetti, M., Picciano, G., Chen, M., and J. Qiu,
716	              "Requirements for IP multicast performance monitoring",
717	              draft-bipi-mboned-ip-multicast-pm-requirement-00 (work in
718	              progress), July 2009.

720	Authors' Addresses

722	   Mauro Cociglio
723	   Telecom Italia
724	   Via Reiss Romoli, 274
725	   Torino  10148
726	   Italy

728	   Email: mauro.cociglio@telecomitalia.it

730	   Alessandro Capello
731	   Telecom Italia
732	   Via Reiss Romoli, 274
733	   Torino  10148
734	   Italy

736	   Email: alessandro.capello@telecomitalia.it

738	   Alberto Tempia Bonda
739	   Telecom Italia
740	   Via Reiss Romoli, 274
741	   Torino  10148
742	   Italy

744	   Email: alberto.tempiabonda@telecomitalia.it

746	   Luca Castaldelli
747	   Telecom Italia
748	   Via Reiss Romoli, 274
749	   Torino  10148
750	   Italy

752	   Email: luca.castaldelli@telecomitalia.it