idnits 2.17.1 draft-ietf-ippm-alt-mark-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 677: '...correlation mechanism SHOULD be in use...' RFC 2119 keyword, line 732: '... SHOULD provide a way to configure t...' Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (February 8, 2017) is 2627 days in the past. Is this intentional? 
Checking references for intended status: Experimental ---------------------------------------------------------------------------- == Missing Reference: 'RFC5905' is mentioned on line 700, but not defined == Missing Reference: 'IEEE1588' is mentioned on line 701, but not defined ** Obsolete normative reference: RFC 2679 (Obsoleted by RFC 7679) ** Obsolete normative reference: RFC 2680 (Obsoleted by RFC 7680) == Outdated reference: A later version (-04) exists of draft-bryant-mpls-rfc6374-sfl-03 == Outdated reference: A later version (-05) exists of draft-bryant-mpls-sfl-framework-02 == Outdated reference: A later version (-01) exists of draft-fioccola-ippm-alt-mark-active-00 == Outdated reference: A later version (-12) exists of draft-ietf-bier-mpls-encapsulation-06 == Outdated reference: A later version (-15) exists of draft-ietf-bier-pmmm-oam-01 == Outdated reference: A later version (-07) exists of draft-ietf-mpls-flow-ident-03 Summary: 3 errors (**), 0 flaws (~~), 9 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group G. Fioccola, Ed. 3 Internet-Draft A. Capello, Ed. 4 Intended status: Experimental M. Cociglio 5 Expires: August 12, 2017 L. Castaldelli 6 Telecom Italia 7 M. Chen, Ed. 8 L. Zheng, Ed. 9 Huawei Technologies 10 G. Mirsky, Ed. 11 ZTE 12 T. Mizrahi, Ed. 13 Marvell 14 February 8, 2017 16 Alternate Marking method for passive performance monitoring 17 draft-ietf-ippm-alt-mark-03 19 Abstract 21 This document describes a passive method to perform packet loss, 22 delay and jitter measurements on live traffic. This method is based 23 on Alternate Marking (Coloring) technique. A report on the 24 operational experiment done at Telecom Italia is explained in order 25 to give an example and show the method applicability. 
This technique 26 can be applied in various situations as detailed in this document. 28 Status of This Memo 30 This Internet-Draft is submitted in full conformance with the 31 provisions of BCP 78 and BCP 79. 33 Internet-Drafts are working documents of the Internet Engineering 34 Task Force (IETF). Note that other groups may also distribute 35 working documents as Internet-Drafts. The list of current Internet- 36 Drafts is at http://datatracker.ietf.org/drafts/current/. 38 Internet-Drafts are draft documents valid for a maximum of six months 39 and may be updated, replaced, or obsoleted by other documents at any 40 time. It is inappropriate to use Internet-Drafts as reference 41 material or to cite them other than as "work in progress." 43 This Internet-Draft will expire on August 12, 2017. 45 Copyright Notice 47 Copyright (c) 2017 IETF Trust and the persons identified as the 48 document authors. All rights reserved. 50 This document is subject to BCP 78 and the IETF Trust's Legal 51 Provisions Relating to IETF Documents 52 (http://trustee.ietf.org/license-info) in effect on the date of 53 publication of this document. Please review these documents 54 carefully, as they describe your rights and restrictions with respect 55 to this document. Code Components extracted from this document must 56 include Simplified BSD License text as described in Section 4.e of 57 the Trust Legal Provisions and are provided without warranty as 58 described in the Simplified BSD License. 60 Table of Contents 62 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 63 2. Overview of the method . . . . . . . . . . . . . . . . . . . 4 64 3. Detailed description of the method . . . . . . . . . . . . . 5 65 3.1. Packet loss measurement . . . . . . . . . . . . . . . . . 5 66 3.1.1. Timing aspects . . . . . . . . . . . . . . . . . . . 9 67 3.2. One-way delay measurement . . . . . . . . . . . . . . . . 10 68 3.2.1. Single marking methodology . . . . . . . . . . . . . 10 69 3.2.2. 
Double marking methodology . . . . . . . . . . . . . 12 70 3.3. Delay variation measurement . . . . . . . . . . . . . . . 14 71 4. Considerations . . . . . . . . . . . . . . . . . . . . . . . 14 72 4.1. Synchronization . . . . . . . . . . . . . . . . . . . . . 14 73 4.2. Data Correlation . . . . . . . . . . . . . . . . . . . . 15 74 4.3. Packet Re-ordering . . . . . . . . . . . . . . . . . . . 16 75 5. Implementation and deployment . . . . . . . . . . . . . . . . 16 76 5.1. Report on the operational experiment at Telecom Italia . 17 77 5.1.1. Coloring the packets . . . . . . . . . . . . . . . . 18 78 5.1.2. Counting the packets . . . . . . . . . . . . . . . . 19 79 5.1.3. Collecting data and calculating packet loss . . . . . 20 80 5.1.4. Metric transparency . . . . . . . . . . . . . . . . . 21 81 5.2. IP flow performance measurement (IPFPM) . . . . . . . . . 21 82 5.3. Performance Measurement Marking Method in BIER Domain . . 21 83 5.4. Overlay OAM Passive Performance Measurement . . . . . . . 21 84 5.5. RFC6374 Use Case . . . . . . . . . . . . . . . . . . . . 22 85 5.6. Application to active performance measurement . . . . . . 22 86 6. Hybrid measurement . . . . . . . . . . . . . . . . . . . . . 22 87 7. Compliance with RFC6390 guidelines . . . . . . . . . . . . . 22 88 8. Security Considerations . . . . . . . . . . . . . . . . . . . 24 89 9. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 25 90 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 26 91 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 26 92 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 26 93 12.1. Normative References . . . . . . . . . . . . . . . . . . 26 94 12.2. Informative References . . . . . . . . . . . . . . . . . 27 95 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 29 97 1. Introduction 99 Nowadays, most of the traffic in Service Providers' networks carries 100 real time content. 
This content is highly sensitive to packet 101 loss [RFC2680], while interactive content is sensitive to delay 102 [RFC2679] and jitter [RFC3393]. 104 In view of this scenario, Service Providers need methodologies and 105 tools to monitor and measure network performance with adequate 106 accuracy, in order to constantly control the quality of experience 107 perceived by their customers. In addition, performance 108 monitoring provides useful information for improving network 109 management (e.g. isolation of network problems, troubleshooting, 110 etc.). 112 A lot of work related to OAM, which also includes performance 113 monitoring techniques, has been done by Standards Developing 114 Organizations (SDOs): [RFC7276] provides a good overview of existing 115 OAM mechanisms defined in IETF, ITU-T and IEEE. Considering IETF, a 116 lot of work has been done on fault detection and connectivity 117 verification, while less effort has been dedicated so far to 118 performance monitoring. The IPPM WG has defined standard metrics to 119 measure network performance; however, the methods developed in this 120 WG mainly focus on active measurement techniques. More 121 recently, the MPLS WG has defined mechanisms for measuring packet 122 loss, one-way and two-way delay, and delay variation in MPLS 123 networks [RFC6374], but their applicability to passive measurements 124 has some limitations, especially for pure connection-less networks. 126 The lack of adequate tools to measure packet loss with the desired 127 accuracy drove an effort to design a new method for the performance 128 monitoring of live traffic, preferably one that is easy to implement and deploy. 129 The effort led to the method described in this document: basically, 130 it is a passive performance monitoring technique, potentially 131 applicable to any kind of packet based traffic, including Ethernet, 132 IP, and MPLS, both unicast and multicast. 
The method addresses 133 primarily packet loss measurement, but it can be easily extended to 134 one-way delay and delay variation measurements as well. 136 The method has been explicitly designed for passive measurements but 137 it can also be used with active probes. Passive measurements are 138 usually more easily understood by customers and provide much better 139 accuracy, especially for packet loss measurements. 141 This document is organized as follows: 143 o Section 2 gives an overview of the method, including a comparison 144 with different measurement strategies; 146 o Section 3 describes the method in detail; 148 o Section 4 reports considerations about synchronization, data 149 correlation and packet re-ordering; 151 o Section 5 reports examples of implementation and deployment of the 152 method, including the operational experiment done at Telecom 153 Italia; 155 o Section 8 covers some security aspects; 157 o Section 9 summarizes some concluding remarks. 159 2. Overview of the method 161 In order to perform packet loss measurements on a live traffic flow, 162 different approaches exist. The most intuitive one consists of 163 numbering the packets, so that each router that receives the flow can 164 immediately detect a missing packet. This approach, though very 165 simple in theory, is not simple to achieve: it requires the insertion 166 of a sequence number into each packet and the devices must be able to 167 extract the number and check it in real time. Such a task can be 168 difficult to implement on live traffic: if UDP is used as the 169 transport protocol, the sequence number is not available; on the 170 other hand, if a higher layer sequence number (e.g. in the RTP 171 header) is used, extracting that information from each packet and 172 processing it in real time could overload the device. 
174 An alternate approach is to count the number of packets sent on one 175 end, the number of packets received on the other end, and to compare 176 the two values. This operation is much simpler to implement, but 177 requires that the devices performing the measurement are in sync: in 178 order to compare two counters it is required that they refer exactly 179 to the same set of packets. Since a flow is continuous and cannot be 180 stopped when a counter has to be read, it could be difficult to 181 determine exactly when to read the counter. A possible solution to 182 overcome this problem is to virtually split the flow in consecutive 183 blocks by inserting periodically a delimiter so that each counter 184 refers exactly to the same block of packets. The delimiter could be 185 for example a special packet inserted artificially into the flow. 186 However, delimiting the flow using specific packets has some 187 limitations. First, it requires generating additional packets within 188 the flow and requires the equipment to be able to process those 189 packets. In addition, the method is vulnerable to out of order 190 reception of delimiting packets and, to a lesser extent, to their 191 loss. 193 The method proposed in this document follows the second approach, but 194 it doesn't use additional packets to virtually split the flow in 195 blocks. Instead, it "colors" the packets so that the packets 196 belonging to the same block will have the same color, whilst 197 consecutive blocks will have different colors. Each change of color 198 represents a sort of auto-synchronization signal that guarantees the 199 consistency of measurements taken by different devices along the 200 path. 
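As a rough illustration of the coloring idea, the following Python sketch marks packets with two alternating colors and compares the per-block packet counts seen at two measurement points. The function names (`color_packets`, `count_blocks`), the fixed block size and the simulated drops are assumptions made only for this example; they are not defined by this document, and the sketch does not handle the loss of an entire block.

```python
from itertools import groupby

def color_packets(n_packets, block_size):
    """Assign color 'A' or 'B' to each packet, switching every block_size packets."""
    return ["A" if (i // block_size) % 2 == 0 else "B" for i in range(n_packets)]

def count_blocks(colors):
    """Count packets in each run of identical color (one run = one block)."""
    return [(color, sum(1 for _ in run)) for color, run in groupby(colors)]

sent = color_packets(n_packets=30, block_size=10)
# Hypothetical loss: the packets at positions 12 and 25 are dropped in transit.
received = [c for i, c in enumerate(sent) if i not in (12, 25)]

sent_blocks = count_blocks(sent)
recv_blocks = count_blocks(received)
# Loss per block is the difference between the two counters of the same block.
loss_per_block = [s - r for (_, s), (_, r) in zip(sent_blocks, recv_blocks)]
print(loss_per_block)
```

Because each color change delimits a block, both ends count the same set of packets without inserting any extra delimiter packet into the flow.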
202 Figure 1 represents a very simple network and shows how the method 203 can be used to measure packet loss on different network segments: by 204 enabling the measurement on several interfaces along the path, it is 205 possible to perform link monitoring, node monitoring or end-to-end 206 monitoring. The method is flexible enough to measure packet loss on 207 any segment of the network and can be used to isolate the faulty 208 element.

210   Traffic flow
211   ========================================================>
212       +------+       +------+       +------+       +------+
213   ---<>  R1  <>-----<>  R2  <>-----<>  R3  <>-----<>  R4  <>---
214       +------+       +------+       +------+       +------+
215       .              .      .              .       .      .
216       .              .      .              .       .      .
217       .              <------>              <------->      .
218       .          Node Packet Loss      Link Packet Loss   .
219       .                                                   .
220       <--------------------------------------------------->
221                      End-to-End Packet loss

223 Figure 1: Available measurements

225 3. Detailed description of the method 227 This section describes in detail how the method operates. Special 228 emphasis is given to the measurement of packet loss, which represents 229 the core application of the method, but applicability to delay and 230 jitter measurements is also considered. 232 3.1. Packet loss measurement 234 The basic idea is to virtually split traffic flows into consecutive 235 blocks: each block represents a measurable entity unambiguously 236 recognizable by all network devices along the path. By counting the 237 number of packets in each block and comparing the values measured by 238 different network devices along the path, it is possible to measure 239 the packet loss that occurred in any single block between any two points. 241 As discussed in the previous section, a simple way to create the 242 blocks is to "color" the traffic (two colors are sufficient) so that 243 packets belonging to different consecutive blocks will have different 244 colors. Whenever the color changes, the previous block terminates 245 and the new one begins. 
Hence, all the packets belonging to the same 246 block will have the same color and packets of different consecutive 247 blocks will have different colors. The number of packets in each 248 block depends on the criterion used to create the blocks: if the 249 color is switched after a fixed number of packets, then each block 250 will contain the same number of packets (except for any losses); but 251 if the color is switched according to a fixed timer, then the number 252 of packets may be different in each block depending on the packet 253 rate. 255 The following figure shows what a flow looks like when it is split into 256 traffic blocks with colored packets.

258 A: packet with A coloring
259 B: packet with B coloring

261         |           |           |           |           |
262         |           |    Traffic flow       |           |
263  ------------------------------------------------------------------->
264  BBBBBBB AAAAAAAAAAA BBBBBBBBBBB AAAAAAAAAAA BBBBBBBBBBB AAAAAAA
265  ------------------------------------------------------------------->
266     ... |  Block 5  |  Block 4  |  Block 3  |  Block 2  |  Block 1
267         |           |           |           |           |

269 Figure 2: Traffic coloring

271 Figure 3 shows how the method can be used to measure link packet loss 272 between two adjacent nodes. 274 Referring to the figure, let's assume we want to monitor the packet 275 loss on the link between two routers: router R1 and router R2. 276 According to the method, the traffic is colored alternately with 277 two different colors, A and B. Whenever the color changes, the 278 transition generates a sort of square-wave signal, as depicted in the 279 following figure.

281 Color A   ----------+           +-----------+           +----------
282                     |           |           |           |
283 Color B             +-----------+           +-----------+
284            Block n       ...       Block 3     Block 2     Block 1
285          <---------> <---------> <---------> <---------> <--------->

287                           Traffic flow
288          ===========================================================>
289 Color ...AAAAAAAAAAA BBBBBBBBBBB AAAAAAAAAAA BBBBBBBBBBB AAAAAAA...
290          ===========================================================>

292 Figure 3: Computation of link packet loss

294 Traffic coloring could be done by R1 itself or by an upstream router. 295 R1 needs two counters, C(A)R1 and C(B)R1, on its egress interface: 296 C(A)R1 counts the packets with color A and C(B)R1 counts those with 297 color B. As long as traffic is colored A, only counter C(A)R1 will 298 be incremented, while C(B)R1 is not incremented; vice versa, when the 299 traffic is colored as B, only C(B)R1 is incremented. C(A)R1 and 300 C(B)R1 can be used as reference values to determine the packet loss 301 from R1 to any other measurement point down the path. Router R2, 302 similarly, will need two counters on its ingress interface, C(A)R2 303 and C(B)R2, to count the packets received on that interface and 304 colored with color A and B respectively. When an A block ends, it is 305 possible to compare C(A)R1 and C(A)R2 and calculate the packet loss 306 within the block; similarly, when the successive B block terminates, 307 it is possible to compare C(B)R1 with C(B)R2, and so on for every 308 successive block. 310 Likewise, by using two counters on the R2 egress interface it is possible 311 to count the packets sent out of the R2 interface and use them as 312 reference values to calculate the packet loss from R2 to any 313 measurement point downstream of R2. 315 Using a fixed timer for color switching offers better control over 316 the method: the (time) length of the blocks can be chosen large 317 enough to simplify the collection and the comparison of measures 318 taken by different network devices. It is preferable not to read the 319 value of the counters immediately after the color switch: some 320 packets could arrive out of order and increment the counter 321 associated with the previous block (color), so it is worth waiting for 322 some time. 
A safe choice is to wait L/2 time units (where L is the 323 duration of each block) after the color switch before reading the still 324 counter of the previous color, so that the possibility of reading a running 325 counter instead of a still one is minimized. The drawback is that 326 the longer the duration of the block, the less frequently 327 measurements can be taken. 329 The following table shows how the counters can be used to calculate 330 the packet loss between R1 and R2. The first column lists the 331 sequence of traffic blocks while the other columns contain the 332 counters of A-colored packets and B-colored packets for R1 and R2. 333 In this example, we assume that the values of the counters are reset 334 to zero whenever a block ends and its associated counter has been 335 read: with this assumption, the table shows only relative values, 336 that is the exact number of packets of each color within each block. 337 If the values of the counters were not reset, the table would contain 338 cumulative values, but the relative values could be determined simply 339 by difference from the value of the previous block of the same color. 341 The color is switched on the basis of a fixed timer (not shown in the 342 table), so the number of packets in each block is different.

344 +-------+--------+--------+--------+--------+------+
345 | Block | C(A)R1 | C(B)R1 | C(A)R2 | C(B)R2 | Loss |
346 +-------+--------+--------+--------+--------+------+
347 | 1     | 375    | 0      | 375    | 0      | 0    |
348 |       |        |        |        |        |      |
349 | 2     | 0      | 388    | 0      | 388    | 0    |
350 |       |        |        |        |        |      |
351 | 3     | 382    | 0      | 381    | 0      | 1    |
352 |       |        |        |        |        |      |
353 | 4     | 0      | 377    | 0      | 374    | 3    |
354 |       |        |        |        |        |      |
355 | ...   | ...    | ...    | ...    | ...    | ...  |
356 |       |        |        |        |        |      |
357 | n     | 0      | 387    | 0      | 387    | 0    |
358 |       |        |        |        |        |      |
359 | n+1   | 379    | 0      | 377    | 0      | 2    |
360 +-------+--------+--------+--------+--------+------+

362 Table 1: Evaluation of counters for packet loss measurements

364 During an A block (blocks 1, 3 and n+1), all the packets are 365 A-colored, therefore the C(A) counters are incremented by the number of packets 366 seen on the interface, while C(B) counters are zero. Vice versa, 367 during a B block (blocks 2, 4 and n), all the packets are B-colored: 368 C(A) counters are zero, while C(B) counters are incremented. 370 When a block ends (because of color switching) the related counters 371 stop incrementing and it is possible to read them, compare the values 372 measured on router R1 and R2 and calculate the packet loss within 373 that block. 375 For example, looking at the table above, during the first block 376 (A-colored), C(A)R1 and C(A)R2 have the same value (375), which 377 corresponds to the exact number of packets of the first block (no 378 loss). Also during the second block (B-colored) R1 and R2 counters 379 have the same value (388), which corresponds to the number of packets 380 of the second block (no loss). During blocks three and four, R1 and 381 R2 counters are different, meaning that some packets have been lost: 382 in the example, one single packet (382-381) was lost during block 383 three and three packets (377-374) were lost during block four. 385 The method applied to R1 and R2 can be extended to any other router 386 and applied to more complex networks, as long as the measurement is 387 enabled on the path followed by the traffic flow(s) being observed. 389 3.1.1. Timing aspects 391 This document introduces two color switching methods: one is based on 392 a fixed number of packets, the other is based on a fixed timer. The 393 method based on a fixed timer is preferable because it is more 394 deterministic, and it will be considered in the rest of the document. 
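A minimal sketch of timer-based color switching follows, assuming each device derives the current block and color from its own clock. The function names, the choice of color A for even-numbered blocks, and the use of floor(t / L) as the block index are assumptions for illustration only; the document does not mandate a specific derivation. The last helper reflects the suggestion in Section 3.1 to read a counter L/2 time units after the color switch.

```python
def block_index(t, period):
    """Index of the block that contains time t, for blocks lasting `period` seconds."""
    return int(t // period)

def color_at(t, period):
    """Color in use at time t: 'A' in even blocks, 'B' in odd blocks (assumed convention)."""
    return "A" if block_index(t, period) % 2 == 0 else "B"

def safe_read_time(block, period):
    """Time at which the counter of `block` can be read: L/2 after the block has ended."""
    return (block + 1) * period + period / 2

L = 10.0  # block duration in seconds (example value)
print(color_at(3.0, L), color_at(12.5, L), color_at(20.0, L))
print(safe_read_time(0, L))
```

With this convention every device that shares the clock reference switches color at the same multiples of L, which is what makes the fixed-timer method deterministic.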
396 Considering the clock error between network devices R1 and R2, 397 they must be synchronized to the same clock reference with an 398 accuracy of +/- L/2 time units, where L is the time duration of the 399 block, so that each colored packet can be assigned to the right batch by 400 each router. This is because the minimum time distance between two 401 packets of the same color but belonging to different batches is L 402 time units. 404 In practice, packets may also arrive out of order at batch boundaries, 405 depending on the delay between measurement points. This means 406 that, without considering clock error, we wait L/2 after color 407 switching to be sure to read a still counter. 409 In summary we need to take into account two contributions: clock 410 error between network devices and the interval we need to wait to 411 avoid out-of-order packets caused by network delay. 413 The following figure explains both issues.

415 ...BBBBBBBBB | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | BBBBBBBBB...
416              |<======================================>|
417              |                   L                    |
418 ...=========>|<==================><==================>|<==========...
419              |        L/2                 L/2         |
420              |<===>|                            |<===>|
421                 d  |                            |  d
422                    |<==========================>|
423                     available counting interval

425 Figure 4: Timing aspects

427 It is assumed that all network devices are synchronized to a common 428 reference time with an accuracy of +/- A/2. Thus, the difference 429 between the clock values of any two network devices is bounded by A. 431 The guardband d is given by: 433 d = A + D_max - D_min, 435 where A is the clock accuracy, D_max is an upper bound on the network 436 delay between the network devices, and D_min is a lower bound on the 437 delay. 439 The available counting interval is L - 2d, which must be > 0. 441 The resulting condition, which is a requirement on the 442 synchronization accuracy, is: 444 d < L/2. 446 3.2. 
One-way delay measurement 448 The same principle used to measure packet loss can also be applied to 449 one-way delay measurement. There are three alternatives, as 450 described below. 452 3.2.1. Single marking methodology 454 The alternation of colors can be used as a time reference to 455 calculate the delay. Whenever the color changes (meaning that a 456 new block has started) a network device can store the timestamp of 457 the first packet of the new block; that timestamp can be compared 458 with the timestamp of the same packet on a second router to compute 459 packet delay. Considering Figure 2, R1 stores a timestamp TS(A1)R1 460 when it sends the first packet of block 1 (A-colored), a timestamp 461 TS(B2)R1 when it sends the first packet of block 2 (B-colored) and so 462 on for every other block. R2 performs the same operation on the 463 receiving side, recording TS(A1)R2, TS(B2)R2 and so on. Since the 464 timestamps refer to specific packets (the first packet of each block) 465 we are sure that timestamps compared to compute delay refer to the 466 same packets. By comparing TS(A1)R1 with TS(A1)R2 (and similarly 467 TS(B2)R1 with TS(B2)R2 and so on) it is possible to measure the delay 468 between R1 and R2. In order to have more measurements, it is 469 possible to take and store more timestamps, referring to other 470 packets within each block. 472 In order to coherently compare timestamps collected on different 473 routers, the clocks of the network nodes must be synchronized. Furthermore, a 474 measurement is valid only if no packet loss occurs and if packet 475 misordering can be avoided, otherwise the first packet of a block on 476 R1 could be different from the first packet of the same block on R2 477 (e.g. if that packet is lost between R1 and R2 or it arrives after 478 the next one). 480 The following table shows how timestamps can be used to calculate the 481 delay between R1 and R2. 
The first column lists the sequence of 482 blocks while the other columns contain the timestamp referring to the 483 first packet of each block on R1 and R2. The delay is computed as a 484 difference between timestamps. For the sake of simplicity, all the 485 values are expressed in milliseconds.

487 +-------+---------+---------+---------+---------+-------------+
488 | Block | TS(A)R1 | TS(B)R1 | TS(A)R2 | TS(B)R2 | Delay R1-R2 |
489 +-------+---------+---------+---------+---------+-------------+
490 | 1     | 12.483  | -       | 15.591  | -       | 3.108       |
491 |       |         |         |         |         |             |
492 | 2     | -       | 6.263   | -       | 9.288   | 3.025       |
493 |       |         |         |         |         |             |
494 | 3     | 27.556  | -       | 30.512  | -       | 2.956       |
495 |       |         |         |         |         |             |
496 | 4     | -       | 18.113  | -       | 21.269  | 3.156       |
497 |       |         |         |         |         |             |
498 | ...   | ...     | ...     | ...     | ...     | ...         |
499 |       |         |         |         |         |             |
500 | n     | 77.463  | -       | 80.501  | -       | 3.038       |
501 |       |         |         |         |         |             |
502 | n+1   | -       | 24.333  | -       | 27.433  | 3.100       |
503 +-------+---------+---------+---------+---------+-------------+

505 Table 2: Evaluation of timestamps for delay measurements

507 The first row shows timestamps taken on R1 and R2 respectively and 508 referring to the first packet of block 1 (which is A-colored). Delay 509 can be computed as the difference between the timestamp on R2 and the 510 timestamp on R1. Similarly, the second row shows timestamps (in 511 milliseconds) taken on R1 and R2 and referring to the first packet of 512 block 2 (which is B-colored). By comparing timestamps taken on 513 different nodes in the network and referring to the same packets 514 (identified using the alternation of colors) it is possible to 515 measure delay on different network segments. 517 For the sake of simplicity, in the above example a single measurement 518 is provided within a block, taking into account only the first packet 519 of each block. 
The number of measurements can be easily increased by 520 considering multiple packets in the block: for instance, a timestamp 521 could be taken every N packets, thus generating multiple delay 522 measurements. Taking this to the limit, in principle the delay could 523 be measured for each packet, by taking and comparing the 524 corresponding timestamps (possible but impractical from an 525 implementation point of view). 527 3.2.1.1. Mean delay 529 As mentioned before, the method described above for measuring the 530 delay is sensitive to out of order reception of packets. In order to 531 overcome this problem, a different approach has been considered: it 532 is based on the concept of mean delay. The mean delay is calculated 533 by considering the average arrival time of the packets within a 534 single block. The network device locally stores a timestamp for each 535 packet received within a single block: summing all the timestamps and 536 dividing by the total number of packets received, the average arrival 537 time for that block of packets can be calculated. By subtracting the 538 average arrival times of two adjacent devices it is possible to 539 calculate the mean delay between those nodes. This method is robust 540 to out of order packets and also to packet loss (only a small error 541 is introduced). Moreover, it greatly reduces the number of 542 timestamps (only one per block for each network device) that have to 543 be collected by the management system. On the other hand, it only 544 gives one measure for the duration of the block (e.g. 5 minutes), and 545 it doesn't give the minimum, maximum and median delay values (RFC 546 6703 [RFC6703]). This limitation could be overcome by reducing the 547 duration of the block (e.g. from 5 minutes to a few seconds), which 548 requires a highly optimized implementation of the method. 
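The mean delay computation described above can be sketched as follows. The function names and the timestamp values are hypothetical and chosen only for illustration; timestamps are assumed to come from synchronized clocks and to be expressed in milliseconds.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def mean_delay(ts_upstream, ts_downstream):
    """Mean delay for one block: difference of the average timestamps at two nodes."""
    return mean(ts_downstream) - mean(ts_upstream)

# Hypothetical timestamps (ms) of the four packets of one block, as sent by R1.
ts_r1 = [0.0, 10.0, 20.0, 30.0]
# Per-packet delays of 3, 13, 2 and 3 ms: the third packet overtakes the second,
# so R2 records the arrivals in the order 3, 22, 23, 33.
ts_r2 = [3.0, 22.0, 23.0, 33.0]

print(mean_delay(ts_r1, ts_r2))
```

The result equals the average of the individual delays even though R2 never matches an arrival to a specific packet, which is why reordering within the block does not affect the measurement.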
550 By summing the mean delays of the two directions of a path, it is 551 also possible to measure the two-way mean delay (round-trip delay). 553 3.2.2. Double marking methodology 555 The Single marking methodology for one-way delay measurement is 556 sensitive to out of order reception of packets. The first approach 557 to overcome this problem was described above and is based on the 558 concept of mean delay. But the limitation of mean delay is that it 559 doesn't give information about the distribution of delay values for the 560 duration of the block. Additionally it may be useful to have not 561 only the mean delay but also the minimum and maximum delay values 562 and, in wider terms, to know more about the statistical distribution of 563 delay values. So in order to have more information about the delay 564 and to overcome out of order issues, a different approach can be 565 introduced: it is based on double marking methodology. 567 Basically, the idea is to use the first marking to create the 568 alternate flow and, within this colored flow, a second marking to 569 select the packets for measuring delay/jitter. The first marking is 570 needed for packet loss and mean delay measurement. The second 571 marking creates a new set of marked packets that are fully identified 572 over the network, so that a network device can store the timestamps 573 of these packets; these timestamps can be compared with the 574 timestamps of the same packets on a second router to compute packet 575 delay values for each packet. The number of measurements can be 576 easily increased by changing the frequency of the second marking. 577 However, the frequency of the second marking must not be too high in order 578 to avoid out of order issues. Between packets with the second 579 marking there should be a security time gap (e.g. 
this gap could be, 580 at the minimum, the mean network delay calculated with the previous 581 methodology) to avoid out of order issues and also to have a number 582 of measurement packets that is rate independent. If a second marking 583 packet is lost, the delay measurement for the considered block is 584 corrupted and should be discarded. 586 Mean delay is calculated on all the packets of a batch and is a 587 simple computation to be performed for the single marking method. In 588 some cases the mean delay measure may not be enough, when more detailed delay 589 data are needed (e.g. minimum, maximum, variance and median 590 delay values for each block). To overcome this drawback the idea is 591 to couple the mean delay measure for the entire batch with the double 592 marking method, where a subset of batch packets is selected for 593 extensive delay calculation by using a second marking. In this way 594 it is possible to measure the minimum, the maximum, the variance and 595 the median in order to perform a detailed analysis on these double 596 marked packets. Please note that there are classic algorithms for 597 median and variance calculation, but they are out of the scope of this 598 document. The comparison between the mean delay for the entire batch 599 and the mean delay on these double marked packets gives useful 600 information, since it makes it possible to understand if the double marking 601 measurements are actually representative of the delay trends. 603 3.3. Delay variation measurement 605 Similarly to one-way delay measurement (both for single marking and 606 double marking), the method can also be used to measure the inter- 607 arrival jitter. We refer to the definition in RFC 3393 [RFC3393]. 608 The alternation of colors, for the single marking method, can be used as 609 a time reference to measure delay variations. In case of double 610 marking, the time reference is given by the second marked packets. 
611 Considering the example depicted in Figure 2, R1 stores a timestamp 612 TS(A)R1 whenever it sends the first packet of a block and R2 stores a 613 timestamp TS(B)R2 whenever it receives the first packet of a block. 614 The inter-arrival jitter can be easily derived from one-way delay 615 measurement, by evaluating the delay variation of consecutive 616 samples. 618 The concept of mean delay can also be applied to delay variation, by 619 evaluating the average variation of the interval between consecutive 620 packets of the flow from R1 to R2. 622 4. Considerations 624 This section highlights some considerations about the methodology. 626 4.1. Synchronization 628 The Alternate Marking technique does not require a strong 629 synchronization, especially for packet loss and two-way delay 630 measurement. Only one-way delay measurement requires network devices 631 to have accurately synchronized clocks. 633 The color switching is the reference for all the network devices, and 634 the only requirement to be achieved is that all network devices have 635 to recognize the right batch along the path. 637 If the length of the measurement period is L time units, then all 638 network devices must be synchronized to the same clock reference with 639 an accuracy of +/- L/2 time units (without considering network 640 delay). This level of accuracy guarantees that all network devices 641 consistently match the color bit to the correct block. For example, 642 if the color is toggled every second (L = 1 second), then clocks 643 must be synchronized with an accuracy of +/- 0.5 second to a common 644 time reference. 646 This synchronization requirement can be satisfied even with a 647 relatively inaccurate synchronization method. This is true for 648 packet loss and two-way delay measurement; for one-way delay 649 measurement, instead, clock synchronization must be accurate. 651 Therefore, a system that uses only packet loss and two-way delay 652 measurement does not require accurate synchronization.
This is because the 653 value of the clocks of network devices does not affect the 654 computation of the two-way delay measurement. 656 4.2. Data Correlation 658 Data Correlation is the mechanism to compare counters and timestamps 659 for packet loss, delay and delay variation calculation. It could be 660 performed in several ways depending on the alternate marking 661 application and use case. 663 o A possibility is to use a centralized solution using a Network 664 Management System (NMS) to correlate data; 666 o Another possibility is to define a protocol-based distributed 667 solution, by defining a new protocol or by extending the existing 668 protocols (e.g. RFC6374, TWAMP, OWAMP) in order to communicate 669 the counters and timestamps between nodes. 671 In the following paragraphs an example data correlation mechanism is 672 explained; it could be used independently of the adopted solution. 674 When data is collected on the upstream and downstream node, e.g., 675 packet counts for packet loss measurement or timestamps for packet 676 delay measurement, and periodically reported to or pulled by other 677 nodes or NMS, a certain data correlation mechanism SHOULD be in use 678 to help the nodes or NMS to tell whether any two or more packet 679 counts are related to the same block of markers, or any two 680 timestamps are related to the same marked packet. 682 The alternate marking method described in this document literally 683 splits the packets of the measured flow into different measurement 684 blocks; in addition, a Block Number (BN) could be assigned to each such 685 measurement block. The BN is generated each time a node reads the 686 data (packet counts or timestamps), and is associated with each 687 packet count and timestamp reported to or pulled by other nodes or 688 NMS. The value of BN could be calculated as the modulo of the local 689 time (when the data are read) and the interval of the marking time 690 period.
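As a rough illustration of the Block Number idea, the sketch below derives a BN from the local read time and the marking interval. The text above speaks of "the modulo of the local time and the interval"; this sketch assumes that the intended result is the index of the marking period in which the data were read, i.e. the integer quotient of the local time by the interval, so that two time-synchronized nodes reading during the same period obtain the same BN. The function name, the timestamps in seconds and the 300-second interval are illustrative assumptions, not part of the method.

```python
def block_number(read_time_s: float, interval_s: float) -> int:
    """Hypothetical helper: index of the marking period containing read_time_s.

    Assumption: the BN is the integer quotient of the local time by the
    marking interval, so it changes exactly once per marking period.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return int(read_time_s // interval_s)


# Two nodes read their counters at slightly different local times that fall
# within the same 300-second marking period.
upstream_bn = block_number(1_486_512_010.0, 300.0)
downstream_bn = block_number(1_486_512_140.0, 300.0)

# Equal BNs mean that the two readings can be correlated.
same_block = upstream_bn == downstream_bn
```

A reading taken in the following period yields the next BN, so counts carrying different BNs are not compared with each other.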
692 When the nodes or NMS see, for example, the same BN associated with two 693 packet counts from an upstream and a downstream node respectively, they 694 consider that these two packet counts correspond to the same 695 block, i.e. that these two packet counts belong to the same block of 696 markers from the upstream and downstream node. The assumption of 697 this BN mechanism is that the measurement nodes are time 698 synchronized. This requires the measurement nodes to have a certain 699 time synchronization capability (e.g., the Network Time Protocol 700 (NTP) [RFC5905], or the IEEE 1588 Precision Time Protocol (PTP) 701 [IEEE1588]). Synchronization aspects are further discussed in 702 Section 4.1. 704 4.3. Packet Re-ordering 706 Due to ECMP, packet re-ordering is very common in IP networks. The 707 accuracy of marking-based performance measurement (PM), especially packet loss measurement, may 708 be affected by packet re-ordering. Take a look at the following 709 example:
711 Block   :    1    |    2    |    3    |    4    |    5    |...
712 --------|---------|---------|---------|---------|---------|---
713 Node R1 : AAAAAAA | BBBBBBB | AAAAAAA | BBBBBBB | AAAAAAA |...
714 Node R2 : AAAAABB | AABBBBA | AAABAAA | BBBBBBA | ABAAABA |...
716 Figure 5: Packet Reordering 718 In the following paragraphs an example of data correlation mechanism 719 is explained and could be used independently of the adopted solutions. 721 Most packet re-ordering occurs at the edge of adjacent blocks, 722 and it is easy to handle if the interval of each block is 723 sufficiently large. It can then be assumed that packets with a 724 different marker belong to the block that they are closest to. If 725 the interval is small, it is difficult and sometimes impossible to 726 determine to which block a packet belongs. In the example above, for the 727 packet with the marker "B" in block 3, there is no safe way to 728 tell whether the packet belongs to block 2 or block 4.
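The nearest-block rule described above can be sketched as follows. This is only an illustration under stated assumptions: blocks are numbered from 1 as in Figure 5, block 1 has color A, arrival times are measured from the start of block 1, and the function name and the margin parameter are hypothetical. A packet whose color does not match the current block is assigned to the adjacent block whose boundary is closer; when both boundaries are equally close (within the margin), the sketch reports the packet as ambiguous instead of guessing.

```python
from typing import Optional


def assign_block(arrival_time: float, color: str, interval: float,
                 margin: float = 0.0) -> Optional[int]:
    """Hypothetical helper: block (1-based) that a received packet belongs to.

    Returns None when the packet cannot be safely attributed to a block.
    """
    current = int(arrival_time // interval) + 1      # block in progress
    expected = "A" if current % 2 == 1 else "B"      # odd blocks carry color A
    if color == expected:
        return current
    if current == 1:
        return 2                                     # no earlier block exists
    offset = arrival_time - (current - 1) * interval
    to_previous = offset                             # distance to start of block
    to_next = interval - offset                      # distance to end of block
    if abs(to_previous - to_next) <= margin:
        return None                                  # ambiguous, as in block 3
    return current - 1 if to_previous < to_next else current + 1


# Figure 5 modeled with 7 time units per block, one packet per time unit
# (the packet in position i arrives at time i + 0.5).
late_a = assign_block(7.5, "A", 7.0)       # "A" at the start of block 2 -> block 1
early_a = assign_block(13.5, "A", 7.0)     # "A" at the end of block 2 -> block 3
ambiguous = assign_block(17.5, "B", 7.0)   # "B" in the middle of block 3 -> None
```

A larger margin makes the rule more conservative: more packets near the middle of a block are reported as ambiguous rather than attributed to a neighbor.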
730 Choosing a proper interval is important, but how to choose a proper 731 interval is out of the scope of this document. An implementation 732 SHOULD provide a way to configure the interval and allow a certain 733 degree of packet re-ordering. 735 5. Implementation and deployment 737 The methodology described in the previous sections can be applied in 738 various situations. Basically, the Alternate Marking technique could be 739 used in many cases for performance measurement. The only requirement 740 is to select and mark the flow to be monitored; in this way packets 741 are batched by the sender and each batch is alternately marked such 742 that it can be easily recognized by the receiver. 744 An example of implementation and deployment is explained in the next 745 section, just to clarify how the method can work. 747 5.1. Report on the operational experiment at Telecom Italia 749 The method described in this document, also called PNPM (Packet 750 Network Performance Monitoring), was invented and engineered at 751 Telecom Italia. 752 The methodology has been applied by leveraging functions 753 and tools available on IP routers, and it is currently being used to 754 monitor packet loss in some portions of Telecom Italia's network. 755 The application of the method to delay measurement is currently being 756 evaluated in Telecom Italia's labs. This section describes how the 757 features currently available on existing routing platforms can be 758 used to apply the method, in order to give an example of 759 implementation and deployment. 761 The fundamental steps for this implementation of the method can be 762 summarized in the following items: 764 o coloring the packets; 766 o counting the packets; 768 o collecting data and calculating the packet loss; 770 o metric transparency.
772 Before going deeper into the implementation details, it's worth 773 mentioning two different strategies that can be used when 774 implementing the method: 776 o flow-based: the flow-based strategy is used when only a limited 777 number of traffic flows need to be monitored. This could be the 778 case, for example, of IPTV channels or other specific application 779 traffic with high QoS requirements (e.g., Mobile Backhauling 780 traffic). According to this strategy, only a subset of the flows 781 is colored. Counters for packet loss measurements can be 782 instantiated for each single flow, or for the set as a whole, 783 depending on the desired granularity. A relevant problem with 784 this approach is the necessity to know in advance the path 785 followed by flows that are subject to measurement. Path rerouting 786 and traffic load-balancing increase the complexity of the issue, 787 especially for unicast traffic. The problem is easier to solve 788 for multicast traffic where load balancing is seldom used, 789 especially for IPTV traffic where static joins are frequently used 790 to force traffic forwarding and replication. Another application 791 is on Mobile Backhauling, implemented with a VPN MPLS in Telecom 792 Italia's network; in this case the problem with unicast traffic is 793 overcome by monitoring just the two Provider Edge nodes of the VPN 794 MPLS. 796 o link-based: measurements are performed on all the traffic on a 797 link-by-link basis. The link could be a physical link or a 798 logical link (for instance an Ethernet VLAN or an MPLS PW). 799 Counters could be instantiated for the traffic as a whole or for 800 each traffic class (in case it is desired to monitor each class 801 separately), but in the second case a pair of counters is needed 802 for each class. 804 The current implementation in Telecom Italia uses the first strategy.
805 As mentioned, the flow-based measurement requires the identification 806 of the flow to be monitored and the discovery of the path followed by 807 the selected flow. It is possible to monitor a single flow or 808 multiple flows grouped together, but in the latter case measurement is 809 consistent only if all the flows in the group follow the same path. 810 Moreover, a Service Provider should be aware that, if a measurement 811 is performed by grouping many flows, it is not possible to determine 812 exactly which flow was affected by packet loss. In order to have 813 measures per single flow it is necessary to configure counters for 814 each specific flow. Once the flow(s) to be monitored have been 815 identified, it is necessary to configure the monitoring on the proper 816 nodes. Configuring the monitoring means configuring the policy to 817 intercept the traffic and configuring the counters to count the 818 packets. To have just an end-to-end monitoring, it is sufficient to 819 enable the monitoring on the first and the last hop routers of the 820 path: the mechanism is completely transparent to intermediate nodes 821 and independent from the path followed by traffic flows. On the 822 contrary, to monitor the flow on a hop-by-hop basis along its whole 823 path it is necessary to enable the monitoring on every node from the 824 source to the destination. In case the exact path followed by the 825 flow is not known a priori (e.g., the flow has multiple paths to reach 826 the destination) it is necessary to enable the monitoring system on 827 every path: counters on interfaces traversed by the flow will report 828 packet counts, while counters on other interfaces will be null. 830 5.1.1. Coloring the packets 832 The coloring operation is fundamental in order to create packet 833 blocks. This implies choosing where to activate the coloring and how 834 to color the packets.
836 In case of flow-based measurements, it is desirable, in general, to 837 have a single coloring node because it is easier to manage and 838 doesn't raise any risk of conflict (consider the case where two nodes 839 color the same flow). Thus it is necessary to color the flow as 840 close as possible to the source. In addition, coloring a flow close 841 to the source allows an end-to-end measurement if a measurement point is 842 enabled on the last-hop router as well. The only requirement is that 843 the coloring must change periodically and every node along the path 844 must be able to identify unambiguously the colored packets. For 845 link-based measurements, all traffic needs to be colored when 846 transmitted on the link. If the traffic had already been colored, 847 then it has to be re-colored because the color must be consistent on 848 the link. This means that each hop along the path must (re-)color 849 the traffic; the color is not required to be consistent along 850 different links. 852 Traffic coloring can be implemented by setting a specific bit in the 853 packet header and changing the value of that bit periodically. With 854 current router implementations, only QoS-related fields and features 855 offer the required flexibility to set bits in the packet header. In 856 case a Service Provider only uses the three most significant bits of 857 the DSCP field (corresponding to IP Precedence) for QoS 858 classification and queuing, it is possible to use the two least 859 significant bits of the DSCP field (bit 0 and bit 1) to implement the 860 method without affecting QoS policies. One of the two bits (bit 0) 861 could be used to identify flows subject to traffic monitoring (set to 862 1 if the flow is under monitoring, otherwise it is set to 0), while 863 the second (bit 1) can be used for coloring the traffic (switching 864 between values 0 and 1, corresponding to color A and B) and creating 865 the blocks.
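The bit layout just described can be illustrated with a minimal sketch. It assumes a 6-bit DSCP value in which bit 0 is the least significant bit (the monitoring flag) and bit 1 is the next one (the color); the constant and function names are hypothetical and do not correspond to any router configuration syntax.

```python
MONITOR_BIT = 0b000001   # bit 0 (assumed least significant): 1 = flow monitored
COLOR_BIT = 0b000010     # bit 1: 0 = color A, 1 = color B
QOS_MASK = 0b111000      # three most significant bits (IP Precedence)


def mark_dscp(dscp: int, monitored: bool, color: str) -> int:
    """Return the DSCP value with the monitoring and color bits set."""
    if not 0 <= dscp <= 0b111111:
        raise ValueError("DSCP is a 6-bit field")
    if color not in ("A", "B"):
        raise ValueError("color must be 'A' or 'B'")
    marked = dscp & ~(MONITOR_BIT | COLOR_BIT) & 0b111111  # clear both bits
    if monitored:
        marked |= MONITOR_BIT
        if color == "B":
            marked |= COLOR_BIT
    return marked


def read_marking(dscp: int):
    """Return (monitored, color) as seen by a node along the path."""
    return bool(dscp & MONITOR_BIT), "B" if dscp & COLOR_BIT else "A"


original = 0b101000                          # a class using only IP Precedence
block_a = mark_dscp(original, True, "A")     # monitored, color A
block_b = mark_dscp(original, True, "B")     # monitored, color B

# The bits used for QoS classification are left untouched by the marking.
qos_preserved = (block_a & QOS_MASK) == (block_b & QOS_MASK) == (original & QOS_MASK)
```

Because only the two low-order bits change, classification and queuing based on the three most significant bits behave the same for both colors.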
867 In practice, coloring the traffic using the DSCP field can be 868 implemented by configuring on the router output interface an access 869 list that intercepts the flow(s) to be monitored and applies to them 870 a policy that sets the DSCP field accordingly. Since traffic 871 coloring has to be switched between the two values over time, the 872 policy needs to be modified periodically: an automatic script can be 873 used to perform this task on the basis of a fixed timer. In Telecom 874 Italia's implementation this timer is set to 5 minutes: this value 875 proved to be a good compromise between measurement frequency and 876 stability of the measurement (i.e., the possibility to collect all the 877 measures referring to the same block). 879 5.1.2. Counting the packets 881 Assuming that the coloring of the packets is performed only by the 882 source node, the nodes between source and destination (included) have 883 to count the colored packets that they receive and forward: this 884 operation can be enabled on every router along the path or only on a 885 subset, depending on which network segment is being monitored (a 886 single link, a particular metro area, the backbone, the whole path). 888 Since the color switches periodically between two values, two 889 counters (one for each value) are needed: one counter for packets 890 with color A and one counter for packets with color B. For each flow 891 (or group of flows) being monitored and for every interface where the 892 monitoring is active, a pair of counters is needed. For example, 893 in order to monitor separately 3 flows on a router with 4 interfaces 894 involved, 24 counters are needed (2 counters for each of the 3 flows 895 on each of the 4 interfaces). If traffic is colored using the DSCP 896 field, as in Telecom Italia's implementation, an access-list that 897 matches specific DSCP values can be used to count the packets of the 898 flow(s) being monitored.
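The counter arithmetic above can be sketched as a small table keyed by flow, interface and color. The flow and interface names are placeholders chosen for illustration, and the helper functions are hypothetical; on a real router the counters would be attached to access-lists rather than kept in a dictionary.

```python
from itertools import product

COLORS = ("A", "B")


def make_counters(flows, interfaces):
    """One counter per (flow, interface, color) combination, all starting at 0."""
    return {key: 0 for key in product(flows, interfaces, COLORS)}


def count_packet(counters, flow, interface, color):
    """Increment the counter matching a colored packet seen on an interface."""
    counters[(flow, interface, color)] += 1


flows = ["flow1", "flow2", "flow3"]          # placeholder flow names
interfaces = ["if0", "if1", "if2", "if3"]    # placeholder interface names
counters = make_counters(flows, interfaces)

total = len(counters)   # 2 colors x 3 flows x 4 interfaces

count_packet(counters, "flow1", "if0", "A")
```

Monitoring the three flows as a single group would instead need one pair of counters per interface, at the cost of losing per-flow granularity.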
900 In case of link-based measurements the behaviour is similar except 901 that coloring and counting operations are performed on a link-by-link 902 basis at each endpoint of the link. 904 Another important aspect to take into consideration is when to read 905 the counters: in order to count the exact number of packets of a 906 block the routers must perform this operation when that block has 907 ended: in other words, the counter for color A must be read when the 908 current block has color B, in order to be sure that the value of the 909 counter is stable. This task can be accomplished in two ways. The 910 general approach is to read the counters periodically, many 911 times during a block duration, and to compare these successive 912 readings: when the counter stops incrementing, it means that the current 913 block has ended and its value can be processed safely. 914 Alternatively, if the coloring operation is performed on the basis of 915 a fixed timer, it is possible to configure the reading of the 916 counters according to that timer: for example, if each block is 5 917 minutes long, reading the counter for color A every 5 minutes in the 918 middle of the subsequent block (with color B) is a safe choice. A 919 sufficient margin should be considered between the end of a block and 920 the reading of the counter, in order to take into account any out-of- 921 order packets. The choice of a 5-minute timer for color switching 922 was also inspired by these considerations. 924 5.1.3. Collecting data and calculating packet loss 926 The nodes enabled to perform performance monitoring collect the value 927 of the counters, but they are not able to directly use this 928 information to measure packet loss, because they only have their own 929 samples. For this reason, an external Network Management System 930 (NMS) is required to collect and process data and to perform packet 931 loss calculation.
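A minimal sketch of the calculation performed by the NMS is given below. It assumes that the counts being compared refer to the same flow, the same color and the same block, and that they were read after the block ended; the node names, the sample values and the function names are illustrative only.

```python
def packet_loss(upstream_count: int, downstream_count: int) -> int:
    """Packets counted upstream but not downstream for the same block."""
    return upstream_count - downstream_count


def loss_per_segment(path_counts):
    """Loss on each hop of an ordered list of (node, count) samples."""
    return [
        ((node_a, node_b), packet_loss(count_a, count_b))
        for (node_a, count_a), (node_b, count_b) in zip(path_counts, path_counts[1:])
    ]


# Counters for color A of one block, collected from three nodes on the path.
samples = [("R1", 1000), ("R2", 1000), ("R3", 997)]

end_to_end = packet_loss(samples[0][1], samples[-1][1])
segments = loss_per_segment(samples)
```

Comparing adjacent nodes rather than only the first and last one is what allows the segment where the packets were dropped to be identified.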
The NMS compares the values of counters from 932 different nodes and can determine whether some packets were lost (even a 933 single packet) and also where packets were lost. 935 The value of the counters needs to be transmitted to the NMS as soon 936 as it has been read. This can be accomplished by using SNMP or FTP 937 and can be done in Push Mode or Polling Mode. In the first case, 938 each router periodically sends the information to the NMS; in the 939 latter case it is the NMS that periodically polls routers to collect 940 information. In any case, the NMS has to collect all the relevant 941 values from all the routers within one cycle of the timer (5 942 minutes). 944 If link-based measurement is used, it would be possible to use a 945 protocol to exchange values of counters between the two endpoints in 946 order to let them perform the packet loss calculation for each 947 traffic direction. A similar approach could be complicated if 948 applied to a flow-based measurement. 950 5.1.4. Metric transparency 952 In Telecom Italia's implementation the source node colors the packets 953 with a policy that is modified periodically via an automatic script 954 in order to alternate the DSCP field of the packets. The nodes 955 between source and destination (included) have to count with an 956 access-list the colored packets that they receive and forward. 958 Moreover the destination node has an important role: the colored 959 packets are intercepted and a policy restores the DSCP field 960 of all the packets to the initial value. In this way the metric is 961 transparent because outside the section of the network under 962 monitoring the traffic flow is unchanged. 964 In such a case, thanks to this restoring technique, network elements 965 outside the Alternate Marking monitoring domain (e.g. the two 966 Provider Edge nodes of the Mobile Backhauling VPN MPLS) are totally 967 unaware that packets were marked.
So this restoring technique makes 968 Alternate Marking completely transparent outside its monitoring 969 domain. 971 5.2. IP flow performance measurement (IPFPM) 973 This application of the marking method is described in 974 [I-D.chen-ippm-coloring-based-ipfpm-framework]. 976 5.3. Performance Measurement Marking Method in BIER Domain 978 In [I-D.ietf-bier-mpls-encapsulation] two OAM bits from the Bit Index 979 Explicit Replication (BIER) Header are reserved for the passive 980 performance measurement marking method. [I-D.ietf-bier-pmmm-oam] 981 details the measurement for multicast service over a BIER domain. 983 5.4. Overlay OAM Passive Performance Measurement 985 The Overlay OAM Design Team is considering the preliminary OAM 986 requirements from NVO3, BIER, and SFC. The Marking Method is the 987 preferred passive method to measure performance. 989 [I-D.ooamdt-rtgwg-ooam-requirement] and 990 [I-D.ooamdt-rtgwg-oam-gap-analysis] explain this item in depth. 992 5.5. RFC6374 Use Case 994 RFC6374 [RFC6374] uses the LM packet as the packet accounting 995 demarcation point. Unfortunately this gives rise to a number of 996 problems that may lead to significant packet accounting errors in 997 certain situations. [I-D.ietf-mpls-flow-ident] discusses the desired 998 capabilities for MPLS flow identification in order to perform a 999 better in-band performance monitoring of user data packets. A method 1000 of accomplishing identification is Synonymous Flow Labels (SFL) 1001 introduced in [I-D.bryant-mpls-sfl-framework], while 1002 [I-D.bryant-mpls-rfc6374-sfl] describes RFC6374 performance 1003 measurements with SFL. 1005 5.6. Application to active performance measurement 1007 [I-D.fioccola-ippm-alt-mark-active] describes how to extend the 1008 existing Active Measurement Protocol, in order to implement alternate 1009 marking methodology. [I-D.fioccola-ippm-rfc6812-alt-mark-ext] 1010 describes an extension to the Cisco SLA Protocol Measurement-Type 1011 UDP-Measurement. 1013 6.
Hybrid measurement 1015 The method has been explicitly designed for passive measurements but 1016 it can also be used with active measurements. In order to have both 1017 end-to-end measurements and intermediate measurements (hybrid 1018 measurements) two end points can exchange artificial traffic flows 1019 and apply alternate marking over these flows. In the intermediate 1020 points artificial traffic is managed in the same way as real traffic 1021 and measured as specified before. So the application of the marking 1022 method can also simplify the active measurement, as explained in 1023 [I-D.fioccola-ippm-alt-mark-active]. 1025 7. Compliance with RFC6390 guidelines 1027 RFC6390 [RFC6390] defines a framework and a process for developing 1028 Performance Metrics for protocols above and below the IP layer (such 1029 as IP-based applications that operate over reliable or datagram 1030 transport protocols). 1032 This document doesn't aim to propose a new Performance Metric but a 1033 new method of measurement for a few Performance Metrics that have 1034 already been standardized. Nevertheless, it's worth applying 1035 [RFC6390] guidelines to the present document, in order to provide a 1036 more complete and coherent description of the proposed method. We 1037 used a subset of the Performance Metric Definition template defined 1038 by [RFC6390]. 1040 o Metric name and description: as already stated, this document 1041 doesn't propose any new Performance Metric. Rather, it 1042 describes a novel method for measuring packet loss [RFC2680]. The 1043 same concept, with small differences, can also be used to measure 1044 delay [RFC2679], and jitter [RFC3393]. The document mainly 1045 describes the applicability to packet loss measurement.
1047 o Method of Measurement or Calculation: according to the method 1048 described in the previous sections, the number of packets lost is 1049 calculated by subtracting the value of the counter on the destination 1050 node from the value of the counter on the source node. Both 1051 counters must refer to the same color. The calculation is 1052 performed when the value of the counters is in a steady state. 1054 o Units of Measurement: the method calculates and reports the exact 1055 number of packets sent by the source node and not received by the 1056 destination node. 1058 o Measurement Points: the measurement can be performed between 1059 adjacent nodes, on a per-link basis, or along a multi-hop path, 1060 provided that the traffic under measurement follows that path. In 1061 case of a multi-hop path, the measurements can be performed both 1062 end-to-end and hop-by-hop. 1064 o Measurement Timing: the method has a constraint on the frequency 1065 of measurements. In order to perform a measurement, the counter must 1066 be in a steady state: this happens when the traffic is being 1067 colored with the alternate color; for example in the Telecom 1068 Italia application of the method the time interval is set to 5 1069 minutes. 1071 o Implementation: the Telecom Italia application of the method uses 1072 two encodings of the DSCP field to color the packets; this enables 1073 the use of policy configurations on the router to color the 1074 packets and accordingly configure the counter for each color. The 1075 path followed by traffic being measured should be known in advance 1076 in order to configure the counters along the path and be able to 1077 compare the correct values. 1079 o Use and Applications: the method can be used to measure packet 1080 loss with high precision on live traffic; moreover, by combining 1081 end-to-end and per-link measurements, the method is useful to 1082 pinpoint the single link that is experiencing loss events.
1084 o Reporting Model: the value of the counters has to be sent to a 1085 centralized management system that performs the calculations; such 1086 samples must contain a reference to the time interval they refer 1087 to, so that the management system can perform the correct 1088 correlation; the samples have to be sent while the corresponding 1089 counter is in a steady state (within a time interval), otherwise 1090 the value of the sample should be stored locally. 1092 o Dependencies: the values of the counters have to be correlated to 1093 the time interval they refer to; moreover, since the Telecom 1094 Italia application of the method is based on DSCP values, there 1095 are significant dependencies on the usage of the DSCP field: it 1096 must be possible to rely on unused DSCP values without affecting 1097 QoS-related configuration and behavior; moreover, the intermediate 1098 nodes must not change the value of the DSCP field, so as not to alter the 1099 measurement. 1101 o Organization of Results: the method of measurement produces 1102 singletons. 1104 o Parameters: currently, the main parameter of the method is the 1105 time interval used to alternate the colors and read the counters. 1107 8. Security Considerations 1109 This document specifies a method to perform measurements in the 1110 context of a Service Provider's network and has not been developed to 1111 conduct Internet measurements, so it does not directly affect 1112 Internet security nor applications which run on the Internet. 1113 However, implementation of this method must be mindful of security 1114 and privacy concerns. 1116 There are two types of security concerns: potential harm caused by 1117 the measurements and potential harm to the measurements. Regarding 1118 the first point, the measurements described in this document 1119 are passive, so there are no packets injected into the network 1120 causing potential harm to the network itself and to data traffic.
1121 Nevertheless, the method implies modifications on the fly to the IP 1122 header of data packets: this must be performed in a way that doesn't 1123 alter the quality of service experienced by packets subject to 1124 measurements and that preserves stability and performance of routers 1125 doing the measurements. The measurements themselves could be harmed 1126 by routers altering the marking of the packets, or by an attacker 1127 injecting artificial traffic. Authentication techniques, such as 1128 digital signatures, may be used where appropriate to guard against 1129 injected traffic attacks. 1131 The privacy concerns of network measurement are limited because the 1132 method only relies on information contained in the IP header without 1133 any release of user data. 1135 The measurement itself may be affected by routers (or other network 1136 devices) along the path of IP packets intentionally altering the 1137 value of marking bits of packets. As mentioned above, the mechanism 1138 specified in this document is just in the context of one Service 1139 Provider's network, and thus the routers (or other network devices) 1140 are locally administered and this type of attack can be avoided. 1142 One of the main security threats in OAM protocols is network 1143 reconnaissance; an attacker can gather information about the network 1144 performance by passively eavesdropping on OAM messages. The 1145 advantage of the methods described in this document is that the 1146 marking bits are the only information that is exchanged between the 1147 network devices. Therefore, passive eavesdropping on data plane 1148 traffic does not allow attackers to gain information about the 1149 network performance. 1151 Delay attacks are another potential threat in the context of this 1152 document. Delay measurement is performed using a specific packet in 1153 each block, marked by a dedicated color bit.
Therefore, a man-in- 1154 the-middle attacker can selectively induce synthetic delay only to 1155 delay-colored packets, causing systematic error in the delay 1156 measurements. As discussed in previous sections, the methods 1157 described in this document rely on an underlying time synchronization 1158 protocol. Thus, by attacking the time protocol an attacker can 1159 potentially compromise the integrity of the measurement. A detailed 1160 discussion about the threats against time protocols and how to 1161 mitigate them is presented in RFC 7384 [RFC7384]. 1163 9. Conclusions 1165 The advantages of the method described in this document are: 1167 o easy implementation: it can be implemented using features already 1168 available on major routing platforms; 1170 o low computational effort: the additional load on processing is 1171 negligible; 1173 o accurate packet loss measurement: single packet loss granularity 1174 is achieved with a passive measurement; 1176 o potential applicability to any kind of packet/frame -based 1177 traffic: Ethernet, IP, MPLS, etc., both unicast and multicast; 1179 o robustness: the method can tolerate out of order packets and it's 1180 not based on "special" packets whose loss could have a negative 1181 impact; 1183 o no interoperability issues: the features required to implement the 1184 method are available on all current routing platforms. 1186 The method doesn't raise any specific need for protocol extension, 1187 but it could be further improved by means of some extension to 1188 existing protocols. Specifically, the use of DiffServ bits for 1189 coloring the packets could not be a viable solution in some cases: a 1190 standard method to color the packets for this specific application 1191 could be beneficial. 1193 10. IANA Considerations 1195 There are no IANA actions required. 1197 11. Acknowledgements 1199 The previous IETF drafts about this technique were: 1200 [I-D.cociglio-mboned-multicast-pm] and [I-D.tempia-opsawg-p3m]. 
1201 There are some references to this methodology in other IETF works 1202 (e.g. [I-D.ietf-mpls-flow-ident], [I-D.bryant-mpls-sfl-framework] 1203 [I-D.bryant-mpls-rfc6374-sfl], [I-D.ietf-bier-mpls-encapsulation], 1204 [I-D.ietf-bier-pmmm-oam] 1205 [I-D.chen-ippm-coloring-based-ipfpm-framework]). 1207 In addition the authors would like to thank Domenico Laforgia, 1208 Daniele Accetta and Mario Bianchetti for their contribution to the 1209 definition and the implementation of the method. 1211 12. References 1213 12.1. Normative References 1215 [RFC2679] Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way 1216 Delay Metric for IPPM", RFC 2679, DOI 10.17487/RFC2679, 1217 September 1999, . 1219 [RFC2680] Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way 1220 Packet Loss Metric for IPPM", RFC 2680, 1221 DOI 10.17487/RFC2680, September 1999, 1222 . 1224 [RFC3393] Demichelis, C. and P. Chimento, "IP Packet Delay Variation 1225 Metric for IP Performance Metrics (IPPM)", RFC 3393, 1226 DOI 10.17487/RFC3393, November 2002, 1227 . 1229 12.2. Informative References 1231 [I-D.bryant-mpls-rfc6374-sfl] 1232 Bryant, S., Chen, M., Li, Z., Swallow, G., Sivabalan, S., 1233 Mirsky, G., and G. Fioccola, "RFC6374 Synonymous Flow 1234 Labels", draft-bryant-mpls-rfc6374-sfl-03 (work in 1235 progress), October 2016. 1237 [I-D.bryant-mpls-sfl-framework] 1238 Bryant, S., Chen, M., Li, Z., Swallow, G., Sivabalan, S., 1239 and G. Mirsky, "Synonymous Flow Label Framework", draft- 1240 bryant-mpls-sfl-framework-02 (work in progress), October 1241 2016. 1243 [I-D.chen-ippm-coloring-based-ipfpm-framework] 1244 Chen, M., Zheng, L., Mirsky, G., Fioccola, G., and T. 1245 Mizrahi, "IP Flow Performance Measurement Framework", 1246 draft-chen-ippm-coloring-based-ipfpm-framework-06 (work in 1247 progress), March 2016. 1249 [I-D.cociglio-mboned-multicast-pm] 1250 Cociglio, M., Capello, A., Bonda, A., and L. 
              Castaldelli, "A method for IP multicast performance
              monitoring", draft-cociglio-mboned-multicast-pm-01 (work
              in progress), October 2010.

   [I-D.fioccola-ippm-alt-mark-active]
              Fioccola, G., Clemm, A., Cociglio, M., Chandramouli, M.,
              and A. Capello, "Alternate Marking Extension to Active
              Measurement Protocol",
              draft-fioccola-ippm-alt-mark-active-00 (work in progress),
              July 2016.

   [I-D.fioccola-ippm-rfc6812-alt-mark-ext]
              Fioccola, G., Clemm, A., Cociglio, M., Chandramouli, M.,
              and A. Capello, "Alternate Marking Extension to Cisco SLA
              Protocol RFC6812",
              draft-fioccola-ippm-rfc6812-alt-mark-ext-01 (work in
              progress), March 2016.

   [I-D.ietf-bier-mpls-encapsulation]
              Wijnands, I., Rosen, E., Dolganow, A., Tantsura, J.,
              Aldrin, S., and I. Meilik, "Encapsulation for Bit Index
              Explicit Replication in MPLS and non-MPLS Networks",
              draft-ietf-bier-mpls-encapsulation-06 (work in progress),
              December 2016.

   [I-D.ietf-bier-pmmm-oam]
              Mirsky, G., Zheng, L., Chen, M., and G. Fioccola,
              "Performance Measurement (PM) with Marking Method in Bit
              Index Explicit Replication (BIER) Layer",
              draft-ietf-bier-pmmm-oam-01 (work in progress), January
              2017.

   [I-D.ietf-mpls-flow-ident]
              Bryant, S., Pignataro, C., Chen, M., Li, Z., and G.
              Mirsky, "MPLS Flow Identification Considerations",
              draft-ietf-mpls-flow-ident-03 (work in progress), January
              2017.

   [I-D.ooamdt-rtgwg-oam-gap-analysis]
              Mirsky, G., Nordmark, E., Pignataro, C., Kumar, N., Kumar,
              D., Chen, M., Yizhou, L., Mozes, D., Networks, J., and I.
              Bagdonas, "Operations, Administration and Maintenance
              (OAM) for Overlay Networks: Gap Analysis",
              draft-ooamdt-rtgwg-oam-gap-analysis-02 (work in progress),
              July 2016.

   [I-D.ooamdt-rtgwg-ooam-requirement]
              Kumar, N., Pignataro, C., Kumar, D., Mirsky, G., Chen, M.,
              Nordmark, E., Networks, J., and D.
              Mozes, "Overlay OAM Requirements",
              draft-ooamdt-rtgwg-ooam-requirement-02 (work in progress),
              January 2017.

   [I-D.tempia-opsawg-p3m]
              Capello, A., Cociglio, M., Castaldelli, L., and A. Bonda,
              "A packet based method for passive performance
              monitoring", draft-tempia-opsawg-p3m-04 (work in
              progress), February 2014.

   [RFC6374]  Frost, D. and S. Bryant, "Packet Loss and Delay
              Measurement for MPLS Networks", RFC 6374,
              DOI 10.17487/RFC6374, September 2011.

   [RFC6390]  Clark, A. and B. Claise, "Guidelines for Considering New
              Performance Metric Development", BCP 170, RFC 6390,
              DOI 10.17487/RFC6390, October 2011.

   [RFC6703]  Morton, A., Ramachandran, G., and G. Maguluri, "Reporting
              IP Network Performance Metrics: Different Points of View",
              RFC 6703, DOI 10.17487/RFC6703, August 2012.

   [RFC7276]  Mizrahi, T., Sprecher, N., Bellagamba, E., and Y.
              Weingarten, "An Overview of Operations, Administration,
              and Maintenance (OAM) Tools", RFC 7276,
              DOI 10.17487/RFC7276, June 2014.

   [RFC7384]  Mizrahi, T., "Security Requirements of Time Protocols in
              Packet Switched Networks", RFC 7384, DOI 10.17487/RFC7384,
              October 2014.
Authors' Addresses

   Giuseppe Fioccola (editor)
   Telecom Italia
   Via Reiss Romoli, 274
   Torino  10148
   Italy

   Email: giuseppe.fioccola@telecomitalia.it


   Alessandro Capello (editor)
   Telecom Italia
   Via Reiss Romoli, 274
   Torino  10148
   Italy

   Email: alessandro.capello@telecomitalia.it


   Mauro Cociglio
   Telecom Italia
   Via Reiss Romoli, 274
   Torino  10148
   Italy

   Email: mauro.cociglio@telecomitalia.it


   Luca Castaldelli
   Telecom Italia
   Via Reiss Romoli, 274
   Torino  10148
   Italy

   Email: luca.castaldelli@telecomitalia.it


   Mach(Guoyi) Chen (editor)
   Huawei Technologies

   Email: mach.chen@huawei.com


   Lianshu Zheng (editor)
   Huawei Technologies

   Email: vero.zheng@huawei.com


   Greg Mirsky (editor)
   ZTE
   USA

   Email: gregimirsky@gmail.com


   Tal Mizrahi (editor)
   Marvell
   6 Hamada st.
   Yokneam
   Israel

   Email: talmi@marvell.com