IPPM Working Group                                      G. Fioccola, Ed.
Internet-Draft                                       Huawei Technologies
Intended status: Experimental                                M. Cociglio
Expires: September 24, 2020                               Telecom Italia
                                                                A. Sapio
                                                                R. Sisto
                                                   Politecnico di Torino
                                                          March 23, 2020


 Multipoint Alternate Marking method for passive and hybrid performance
                               monitoring
                 draft-ietf-ippm-multipoint-alt-mark-09

Abstract

   The Alternate Marking method, as presented in RFC 8321, can be
   applied only to point-to-point flows because it assumes that all the
   packets of the flow measured on one node are measured again by a
   single second node. This document generalizes and expands this
   methodology to measure any kind of unicast flows, whose packets can
   follow several different paths in the network, in wider terms a
   multipoint-to-multipoint network. For this reason the technique here
   described is called Multipoint Alternate Marking.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 24, 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
     2.1.  Correlation with RFC5644
   3.  Flow classification
   4.  Multipoint Performance Measurement
     4.1.  Monitoring Network
   5.  Multipoint Packet Loss
   6.  Network Clustering
     6.1.  Algorithm for Cluster partition
   7.  Timing Aspects
   8.  Multipoint Delay and Delay Variation
     8.1.  Delay measurements on multipoint paths basis
       8.1.1.  Single Marking measurement
     8.2.  Delay measurements on single packets basis
       8.2.1.  Single and Double Marking measurement
       8.2.2.  Hashing selection method
   9.  A Closed Loop Performance Management approach
   10. Examples of application
   11. Security Considerations
   12. Acknowledgements
   13. IANA Considerations
   14. References
     14.1.  Normative References
     14.2.  Informative References
   Authors' Addresses

1.  Introduction

   The Alternate Marking method, as described in RFC 8321 [RFC8321], is
   applicable to a point-to-point path. The extension proposed in this
   document applies to the most general case of multipoint-to-multipoint
   path and enables flexible and adaptive performance measurements in a
   managed network.

   The Alternate Marking methodology described in RFC 8321 [RFC8321]
   allows the synchronization of the measurements in different points by
   dividing the packet flow into batches. So it is possible to get
   coherent counters and show what is happening in every marking period
   for each monitored flow. The monitoring parameters are the packet
   counter and timestamps of a flow for each marking period. Note that
   additional details about the applicability of the Alternate Marking
   methodology are described both in RFC 8321 [RFC8321] and in the paper
   [IEEE-Network-PNPM].

   Some applications of the Alternate Marking method involve a large
   number of monitored flows and nodes. Multipoint Alternate Marking
   aims to reduce these numbers and makes the performance monitoring
   more flexible in case a detailed analysis is not needed. For
   instance, with n measurement points and m monitored flows, the
   number of packet counters for each time interval is on the order of
   n*m*2 (one per color). The number of measurement points and
   monitored flows may vary and depends on the portion of the network
   being monitored (core network, metro network, access network) and on
   the granularity (for each service, each customer). So if both n and
   m are high, the number of packet counters grows considerably, and
   Multipoint Alternate Marking offers a tool to control these
   parameters.

   The approach presented in this document is applied only to unicast
   flows and not to multicast.
   Broadcast, Unknown-unicast, and Multicast (BUM) traffic is not
   considered here, because traffic replication is not covered by the
   Multipoint Alternate Marking method. Furthermore, the method is
   applicable to anycast flows, and Equal-Cost MultiPath (ECMP) paths
   can also be easily monitored with this technique.

   In short, RFC 8321 [RFC8321] applies to point-to-point unicast flows
   and BUM traffic, while this document and its Clustered Alternate
   Marking method are valid for multipoint-to-multipoint unicast flows,
   anycast and ECMP flows.

   The Alternate Marking method can therefore be extended to any kind of
   multipoint-to-multipoint path, and the network clustering approach
   presented in this document formalizes how to implement this
   extension and allows flexible and optimized performance measurement
   support for network management in every situation.

   Without network clustering, it is possible to apply Alternate Marking
   only to the whole network or to a single flow. Instead, with network
   clustering, it is possible to use the partition of the network into
   clusters at different levels in order to obtain the needed degree of
   detail. In some circumstances it is possible to monitor a Multipoint
   Network by analysing the Network Clustering, without examining it in
   depth. In case of problems (packet loss is measured or the delay is
   too high), the filtering criteria can be made more specific in order
   to perform a detailed analysis by using a different combination of
   clusters, up to a per-flow measurement as described in RFC 8321
   [RFC8321].

   This approach fits very well with the Closed Loop Network and
   Software Defined Network (SDN) paradigm, where the SDN Orchestrator
   and the SDN Controllers are the brains of the network and can manage
   flow control to the switches and routers and, in the same way, can
   calibrate the performance measurements depending on the desired
   accuracy. An SDN Controller Application can orchestrate how
   accurately the network performance monitoring is set up by applying
   the Multipoint Alternate Marking as described in this document.

   It is important to underline that, as an extension of RFC 8321
   [RFC8321], this is a methodology draft, so the mechanism that can be
   used to transmit the counters and the timestamps is out of scope here
   and the implementation is open. Several options are possible, e.g.
   [I-D.zhou-ippm-enhanced-alternate-marking].

   Note that, as for RFC 8321 [RFC8321], the fragmented packets case can
   be managed with this methodology if fragmentation happens outside the
   portion of the monitored network.

2.  Terminology

   The definitions of the basic terms are identical to those found in
   Alternate Marking (RFC 8321 [RFC8321]). It is to be remembered that
   RFC 8321 [RFC8321] is valid for point-to-point unicast flows and BUM
   traffic.

   The important new terms that need to be explained are listed below:

   Multipoint Alternate Marking: Extension to RFC 8321 [RFC8321],
      valid for multipoint-to-multipoint unicast flows, anycast and ECMP
      flows. It can also be referred to as Clustered Alternate Marking;

   Flow definition: The concept of flow is generalized in this
      document.
      The identification fields are selected without any
      constraints and, in general, the flow can be a
      multipoint-to-multipoint flow, as a result of aggregate
      point-to-point flows;

   Monitoring Network: it is identified with the nodes of the network
      that are the measurement points (MPs) and the links that are the
      connections between MPs. The Monitoring Network graph depends on
      the flow definition, so it can represent a specific flow or the
      entire network topology as an aggregate of all the flows;

   Cluster: smallest identifiable subnetwork of the entire Monitoring
      Network graph that still satisfies the condition that the number
      of packets that go in is the same as the number that go out;

   Multipoint metrics: packet loss, delay and delay variation are
      extended to the case of multipoint flows. It is possible to
      compute these metrics on multipoint paths basis in order to
      associate the measurements to a cluster, to a combination of
      clusters or to the entire monitored network. For delay and delay
      variation, it is also possible to define the metrics on a single
      packet basis, which means that the multipoint path is used to
      easily couple packets between input and output nodes of a
      multipoint path.

   The next section highlights the correlation with the terms used in
   RFC 5644 [RFC5644].

2.1.  Correlation with RFC5644

   RFC 5644 [RFC5644] is limited to active measurements using a single
   source packet or stream, and observations of corresponding packets
   along the path (spatial), at one or more destinations (one-to-group),
   or both.

   Instead, the scope of this memo is to define multiparty metrics for
   passive and hybrid measurements in a group-to-group topology with
   multiple sources and destinations.

   RFC 5644 [RFC5644] introduces metric names that can be reused also
   here but have to be extended and rephrased to be applied to the
   Alternate Marking schema:

   a.  the multiparty metrics are not only one-to-group metrics but can
       be also group-to-group metrics;

   b.  the spatial metrics, used for measuring the performance of
       segments of a source to destination path, are applied here to
       group-to-group segments (called Clusters).

3.  Flow classification

   A unicast flow is identified by all the packets having a set of
   common characteristics. This definition is inspired by RFC 7011
   [RFC7011].

   As an example, by considering a flow as all the packets sharing the
   same source IP address or the same destination IP address, it is easy
   to understand that the resulting pattern will not be a point-to-point
   connection, but a point-to-multipoint or multipoint-to-point
   connection.

   In general a flow can be defined by a set of selection rules used to
   match a subset of the packets processed by the network device. These
   rules specify a set of layer-3 and layer-4 header fields
   (Identification Fields) and the relative values that must be found in
   matching packets.

   The choice of the identification fields directly affects the type of
   paths that the flow would follow in the network. In fact, it is
   possible to relate a set of identification fields with the pattern of
   the resulting graphs, as listed in Figure 1.

   A TCP 5-tuple usually identifies flows following either a single path
   or a point-to-point multipath (in case of load balancing). On the
   contrary, a single source address selects aggregate flows following a
   point-to-multipoint pattern, while a multipoint-to-point pattern can
   be the result of matching on a single destination address. In case a
   selection rule and its reverse are used for bidirectional
   measurements, they can correspond to a point-to-multipoint in one
   direction and a multipoint-to-point in the opposite direction.
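
   The relation between identification fields and flow pattern can be
   illustrated with a minimal Python sketch. It is only an
   illustration: the Packet record, the rule format (a dictionary of
   field names and required values) and the sample addresses are
   assumptions made for this example, and the pattern is inferred only
   from the number of distinct source and destination addresses, so
   multipath inside the network is ignored.

```python
from collections import namedtuple

# Hypothetical packet record carrying the 5 identification fields.
Packet = namedtuple("Packet", "src dst proto sport dport")

def matches(rule, packet):
    """True if every identification field named in the rule has the required value."""
    return all(getattr(packet, field) == value for field, value in rule.items())

def flow_pattern(rule, packets):
    """Classify the selected flow by its distinct sources and destinations."""
    selected = [p for p in packets if matches(rule, p)]
    many_sources = len({p.src for p in selected}) > 1
    many_destinations = len({p.dst for p in selected}) > 1
    if many_sources and many_destinations:
        return "multipoint-to-multipoint"
    if many_destinations:
        return "point-to-multipoint"
    if many_sources:
        return "multipoint-to-point"
    return "point-to-point"

# Sample traffic using documentation address ranges (assumed values).
PACKETS = [
    Packet("192.0.2.1", "198.51.100.1", "tcp", 1000, 80),
    Packet("192.0.2.1", "198.51.100.2", "tcp", 1001, 80),
    Packet("192.0.2.2", "198.51.100.1", "tcp", 1002, 80),
]

print(flow_pattern({"src": "192.0.2.1"}, PACKETS))     # single source address
print(flow_pattern({"dst": "198.51.100.1"}, PACKETS))  # single destination address
print(flow_pattern({}, PACKETS))                       # no constraint at all
```

   In this sketch a rule on the source address alone selects a
   point-to-multipoint flow, a rule on the destination address alone
   selects a multipoint-to-point flow, and an empty rule selects the
   whole traffic as a multipoint-to-multipoint flow.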

   So the flows to be monitored are selected at the monitoring points
   using packet selection rules, which can also change the pattern of
   the monitored network.

   Note that, more generally, the flow can be defined at different
   levels based on the encapsulation considered, and additional
   conditions that are not in the packet header can also be included as
   part of the matching criteria.

   The Alternate Marking method is applicable only to a single path (and
   partially to a one-to-one multipath), so the extension proposed in
   this document is suitable also for the most general case of
   multipoint-to-multipoint, which embraces all the other patterns of
   Figure 1.

       point-to-point single path
           +------+      +------+      +------+
       ---<>  R1  <>----<>  R2  <>----<>  R3  <>---
           +------+      +------+      +------+

       point-to-point multipath
                         +------+
                        <>  R2  <>
                       / +------+ \
                      /            \
           +------+  /              \  +------+
       ---<>  R1  <>                  <>  R4  <>---
           +------+  \              /  +------+
                      \            /
                       \ +------+ /
                        <>  R3  <>
                         +------+

       point-to-multipoint
                                    +------+
                                   <>  R4  <>---
                                  / +------+
                        +------+ /
                       <>  R2  <>
                      / +------+ \
           +------+  /            \ +------+
       ---<>  R1  <>               <>  R5  <>---
           +------+  \              +------+
                      \ +------+
                       <>  R3  <>
                        +------+ \
                                  \ +------+
                                   <>  R6  <>---
                                    +------+

       multipoint-to-point
           +------+
       ---<>  R1  <>
           +------+ \
                     \ +------+
                      <>  R4  <>
                     / +------+ \
           +------+ /            \ +------+
       ---<>  R2  <>              <>  R6  <>---
           +------+              / +------+
                       +------+ /
                      <>  R5  <>
                     / +------+
           +------+ /
       ---<>  R3  <>
           +------+

       multipoint-to-multipoint
           +------+                +------+
       ---<>  R1  <>              <>  R6  <>---
           +------+ \            / +------+
                     \ +------+ /
                      <>  R4  <>
                       +------+ \
           +------+              \ +------+
       ---<>  R2  <>              <>  R7  <>---
           +------+ \            / +------+
                     \ +------+ /
                      <>  R5  <>
                     / +------+ \
           +------+ /            \ +------+
       ---<>  R3  <>              <>  R8  <>---
           +------+                +------+

                      Figure 1: Flow classification

   The
   case of unicast flow is considered in the previous figure. The
   anycast flow is also in scope because there is no replication and
   only a single node from the anycast group receives the traffic, so
   it can be viewed as a special case of unicast flow. Furthermore, an
   ECMP flow is in scope by definition, since it is a
   point-to-multipoint unicast flow.

4.  Multipoint Performance Measurement

   By using the Alternate Marking method only point-to-point paths can
   be monitored. To have an IP (TCP/UDP) flow that follows a
   point-to-point path, we have to fix a specific value for each of the
   5 identification fields (IP Source, IP Destination, Transport
   Protocol, Source Port, Destination Port).

   Multipoint Alternate Marking enables the performance measurement for
   multipoint flows selected by identification fields without any
   constraints (even the entire network production traffic). It is also
   possible to use multiple marking points for the same monitored flow.

4.1.  Monitoring Network

   The Monitoring Network is deduced from the Production Network, by
   identifying the nodes of the graph that are the measurement points,
   and the links that are the connections between measurement points.

   Some techniques can help with building the monitoring network (for
   example, [I-D.ietf-ippm-route]). In general there are different
   options: the monitoring network can be obtained by considering all
   the possible paths for the traffic, or by periodically checking the
   traffic (e.g. daily, weekly, monthly) and updating the graph as
   appropriate, but this is up to the Network Management System (NMS)
   configuration.

   So a graph model of the monitoring network can be built according to
   the Alternate Marking method: the monitored interfaces and links are
   identified.
   Only the measurement points and links where the traffic
   has flowed have to be represented in the graph.

   The following figure shows a simple example of a Monitoring Network
   graph:

                                             +------+
                                            <>  R6  <>---
                                           / +------+
                    +------+     +------+ /
                   <>  R2  <>---<>  R4  <>
                  / +------+ \   +------+ \
                 /            \            \ +------+
       +------+ /   +------+   \ +------+   <>  R7  <>---
   ---<>  R1  <>---<>  R3  <>---<>  R5  <>   +------+
       +------+ \   +------+ \   +------+ \
                 \            \            \ +------+
                  \            \            <>  R8  <>---
                   \            \            +------+
                    \            \
                     \            \ +------+
                      \            <>  R9  <>---
                       \            +------+
                        \
                         \ +------+
                          <> R10  <>---
                           +------+

                   Figure 2: Monitoring Network Graph

   Each monitoring point is characterized by the packet counter that
   refers only to a marking period of the monitored flow.

   The same is applicable also for the delay, but it will be described
   in the following sections.

5.  Multipoint Packet Loss

   Since all the packets of the considered flow leaving the network have
   previously entered the network, the number of packets counted by all
   the input nodes is always greater than or equal to the number of
   packets counted by all the output nodes. Non-initial fragments are
   not considered here.

   The Alternate Marking method is assumed to be in use. If no packet
   loss occurs in the marking period, and all the input and output
   points of the network domain to be monitored are measurement points,
   the sum of the number of packets on all the ingress interfaces equals
   the number on the egress interfaces for the monitored flow. In this
   circumstance, the intermediate measurement points only have the task
   of splitting the measurement.

   It is possible to define the Network Packet Loss of one monitored
   flow for a single period: in a packet network, the number of lost
   packets is the number of packets counted by the input nodes minus the
   number of packets counted by the output nodes. This is true for
   every packet flow in each marking period.

   The Monitored Network Packet Loss with n input nodes and m output
   nodes is given by:

   PL = (PI1 + PI2 +...+ PIn) - (PO1 + PO2 +...+ POm)

   where:

   PL is the Network Packet Loss (number of lost packets)

   PIi is the Number of packets flowed through the i-th Input node in
   this period

   POj is the Number of packets flowed through the j-th Output node in
   this period

   The equation is applied on a per-time-interval basis and on a
   per-flow basis:

      The reference interval is the Alternate Marking period as defined
      in RFC 8321 [RFC8321].

      The flow definition is generalized here: as described before, a
      multipoint packet flow is considered and the identification
      fields can be selected without any constraints.

6.  Network Clustering

   The previous equation can determine the number of packets lost
   globally in the monitored network, exploiting only the data provided
   by the counters in the input and output nodes.

   In addition it is also possible to leverage the data provided by the
   other counters in the network to converge on the smallest
   identifiable subnetworks where the losses occur. These subnetworks
   are named Clusters.

   A Cluster graph is a subnetwork of the entire Monitoring Network
   graph that still satisfies the packet loss equation (introduced in
   the previous section) where PL in this case is the number of packets
   lost in the Cluster. As for the entire Monitoring Network graph, the
   Cluster is defined on a per-flow basis.

   For this reason a Cluster should contain all the arcs emanating from
   its input nodes and all the arcs terminating at its output nodes.
   This ensures that we can count all the packets (and only those)
   exiting an input node again at the output node, whatever path they
   follow.
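
   As a worked example of the packet loss equation, which applies both
   to the entire monitored network and to a single Cluster, the
   following Python sketch evaluates PL for one marking period. The
   function name and the counter values are hypothetical and serve only
   to illustrate the arithmetic.

```python
def network_packet_loss(input_counts, output_counts):
    """PL = (PI1 + ... + PIn) - (PO1 + ... + POm) for one marking period."""
    return sum(input_counts) - sum(output_counts)

# Hypothetical counters of one monitored flow in one marking period.
pi = [1000, 750, 250]  # packets counted at the n = 3 input nodes
po = [1200, 795]       # packets counted at the m = 2 output nodes

pl = network_packet_loss(pi, po)
print("lost packets:", pl)  # 2000 packets in, 1995 packets out
```

   The same function can be reused for any subnetwork, as long as the
   counters cover all of its input nodes and all of its output nodes.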

   In a completely monitored unidirectional network (a network where
   every network interface is monitored), each network device
   corresponds to a Cluster and each physical link corresponds to two
   Clusters (one for each device).

   Clusters can have different sizes depending on the flow filtering
   criteria adopted.

   Moreover, Clusters can sometimes be simplified. For example, when
   two monitored interfaces are separated by a single router (one is the
   input interface, the other is the output interface, and the router
   has only these two interfaces), instead of counting exactly twice,
   upon entering and leaving, it is possible to consider a single
   measurement point (in this case we do not care about the internal
   packet loss of the router).

   It is worth highlighting that it might also be convenient to define
   Clusters based on the topological information and applicable to all
   the possible flows in the monitored network.

6.1.  Algorithm for Cluster partition

   A simple algorithm can be applied in order to split our monitoring
   network into Clusters. This can be done for each direction
   separately. The Cluster partition is based on the Monitoring Network
   Graph, which can be valid for a specific flow or can also be general
   and valid for the entire network topology.

   It is a two-step algorithm:

   o  Group the links where there is the same starting node;

   o  Join the grouped links with at least one ending node in common.

   Considering that the links are unidirectional, the first step
   consists of listing all the links as connections between two nodes
   and grouping the links that have the same starting node. Note that
   it is possible to start from any link and the procedure works anyway.
   Following this classification, the second step consists of joining,
   where applicable, the groups classified in the first step by looking
   at the ending nodes.
   If different groups have at least one common ending node,
   they are put together and belong to the same set. After the
   application of the two steps of the algorithm, each one of the
   composed sets of links together with the endpoint nodes constitutes a
   Cluster.

   In our monitoring network graph example it is possible to identify
   the Clusters partition by applying this two-step algorithm.

   The first step identifies the following groups:

   1.  Group 1: (R1-R2), (R1-R3), (R1-R10)

   2.  Group 2: (R2-R4), (R2-R5)

   3.  Group 3: (R3-R5), (R3-R9)

   4.  Group 4: (R4-R6), (R4-R7)

   5.  Group 5: (R5-R8)

   And then, the second step builds the Clusters partition (in
   particular we can underline that Group 2 and Group 3 connect
   together, since R5 is in common):

   1.  Cluster 1: (R1-R2), (R1-R3), (R1-R10)

   2.  Cluster 2: (R2-R4), (R2-R5), (R3-R5), (R3-R9)

   3.  Cluster 3: (R4-R6), (R4-R7)

   4.  Cluster 4: (R5-R8)

   The flow direction here considered is from left to right. For the
   opposite direction the same way of reasoning can be applied and, in
   this example, you get the same Clusters partition.
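
   The two-step algorithm can be sketched in Python as follows. This
   is a minimal illustration, not a reference implementation: the
   function name and the representation of each unidirectional link as
   a (start, end) pair are assumptions made for the example. The link
   list is the one of the monitoring network graph example, taken in
   the left-to-right direction.

```python
from collections import defaultdict

def partition_clusters(links):
    """Partition unidirectional links (start, end) into Clusters."""
    # Step 1: group the links that have the same starting node.
    groups = defaultdict(list)
    for start, end in links:
        groups[start].append((start, end))

    # Step 2: join the groups that share at least one ending node.
    clusters = []  # list of (set of ending nodes, list of links)
    for group in groups.values():
        ends = {end for _, end in group}
        joined = list(group)
        kept = []
        for other_ends, other_links in clusters:
            if ends & other_ends:
                ends |= other_ends
                joined = other_links + joined
            else:
                kept.append((other_ends, other_links))
        kept.append((ends, joined))
        clusters = kept
    return [cluster_links for _, cluster_links in clusters]

# Links of the example Monitoring Network graph, left to right.
EXAMPLE_LINKS = [
    ("R1", "R2"), ("R1", "R3"), ("R1", "R10"),
    ("R2", "R4"), ("R2", "R5"),
    ("R3", "R5"), ("R3", "R9"),
    ("R4", "R6"), ("R4", "R7"),
    ("R5", "R8"),
]

for number, cluster in enumerate(partition_clusters(EXAMPLE_LINKS), start=1):
    print(f"Cluster {number}:", ", ".join(f"({a}-{b})" for a, b in cluster))
```

   Applied to the example, the sketch joins the groups starting at R2
   and R3 because they share the ending node R5, and it returns the
   same four Clusters listed above.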

   In the end the following 4 Clusters are obtained:

   Cluster 1
                    +------+
                   <>  R2  <>---
                  / +------+
                 /
       +------+ /   +------+
   ---<>  R1  <>---<>  R3  <>---
       +------+ \   +------+
                 \
                  \
                   \
                    \
                     \
                      \
                       \
                        \
                         \ +------+
                          <> R10  <>---
                           +------+

   Cluster 2
       +------+     +------+
   ---<>  R2  <>---<>  R4  <>---
       +------+ \   +------+
                 \
       +------+   \ +------+
   ---<>  R3  <>---<>  R5  <>---
       +------+ \   +------+
                 \
                  \
                   \
                    \
                     \ +------+
                      <>  R9  <>---
                       +------+

   Cluster 3
                   +------+
                  <>  R6  <>---
                 / +------+
       +------+ /
   ---<>  R4  <>
       +------+ \
                 \ +------+
                  <>  R7  <>---
                   +------+

   Cluster 4
       +------+
   ---<>  R5  <>
       +------+ \
                 \ +------+
                  <>  R8  <>---
                   +------+

                       Figure 3: Clusters example

   There are Clusters with more than two nodes and two-node Clusters.
   In the two-node Clusters the loss is on the link (Cluster 4). In
   Clusters with more than two nodes the loss is in the Cluster, but we
   cannot know on which link (Cluster 1, 2, 3).

   In this way the calculation of packet loss can be made on Cluster
   basis. Note that the packet counters for each marking period make it
   possible to calculate the packet rate on Cluster basis, so Committed
   Information Rate (CIR) and Excess Information Rate (EIR) could also
   be deduced on Cluster basis.

   By combining some Clusters in a new connected subnetwork (called
   Super Cluster), the Packet Loss Rule is still true.

   In this way, in a very large network there is no need to configure
   detailed filter criteria to inspect the traffic. You can check a
   multipoint network and, in case of problems, you can go deep with a
   step-by-step cluster analysis, but only for the cluster or
   combination of clusters where the problem happens.
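
   The step-by-step cluster analysis can be sketched in Python as
   follows. The sketch assumes that each input node of a Cluster
   exposes a counter of the packets it sent into the Cluster (SENT) and
   that each output node exposes a counter of the packets it received
   (RECEIVED). These names and all the counter values are hypothetical
   and are not taken from this document.

```python
# The four Clusters of the example, described by their links.
CLUSTERS = {
    "Cluster 1": [("R1", "R2"), ("R1", "R3"), ("R1", "R10")],
    "Cluster 2": [("R2", "R4"), ("R2", "R5"), ("R3", "R5"), ("R3", "R9")],
    "Cluster 3": [("R4", "R6"), ("R4", "R7")],
    "Cluster 4": [("R5", "R8")],
}

# Hypothetical per-node counters for one marking period.
SENT = {"R1": 1000, "R2": 400, "R3": 500, "R4": 250, "R5": 447}
RECEIVED = {"R2": 400, "R3": 500, "R10": 100, "R4": 250, "R5": 447,
            "R9": 200, "R6": 150, "R7": 100, "R8": 447}

def cluster_loss(links, sent, received):
    """Packets lost inside one Cluster: sum over inputs minus sum over outputs."""
    input_nodes = {start for start, _ in links}
    output_nodes = {end for _, end in links}
    return (sum(sent[node] for node in input_nodes)
            - sum(received[node] for node in output_nodes))

LOSS = {name: cluster_loss(links, SENT, RECEIVED)
        for name, links in CLUSTERS.items()}
LOSSY = [name for name, lost in LOSS.items() if lost > 0]

for name, lost in LOSS.items():
    print(name, "lost packets:", lost)
print("Clusters to inspect further:", LOSSY)
```

   With these sample counters only Cluster 2 shows a loss, so a more
   detailed analysis would be needed only for the links of that
   Cluster.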

   In summary, once a flow is defined, the algorithm to build the
   Cluster Partition considers all the possible links and nodes crossed
   by the given flow, even if there is no traffic. It is based on
   topological information. So, if the flow does not enter or traverse
   all the nodes, the counters have a non-zero value for the involved
   nodes and a zero value for the other nodes without traffic; in the
   end, all the formulas are still valid.

   The algorithm described above is an Iterative clustering algorithm,
   but it is also possible to apply a Recursive clustering algorithm by
   using the node-node adjacency matrix representation
   ([IEEE-ACM-ToN-MPNPM]).

   The complete and mathematical analysis of the possible Algorithms for
   Cluster partition, including the considerations in terms of
   efficiency and a comparison between the different methods, is in the
   paper [IEEE-ACM-ToN-MPNPM].

7.  Timing Aspects

   It is important to consider the timing aspects, since out-of-order
   packets occur and have to be handled as described in RFC 8321
   [RFC8321]. But, in a multi-source situation an additional issue has
   to be considered. With a multipoint path, the egress nodes will
   receive alternate marked packets in random order from different
   ingress nodes, and this must not affect the measurement.

   So, if we analyse a multipoint-to-multipoint path with more than one
   marking node, it is important to recognize the reference measurement
   interval. In general the measurement interval for describing the
   results is the interval of the marking node that is most aligned
   with the start of the measurement, as reported in the following
   figure.

   Note that the mark switching approach based on a fixed timer is
   considered in this document.

         time -> start         stop
         T(R1)   |-------------|
         T(R2)     |-------------|
         T(R3)        |------------|

                     Figure 4: Measurement Interval

   In the figure it is assumed that the node with the earliest clock
   (R1) identifies the right starting and ending time of the
   measurement, but it is just an assumption and other possibilities
   could occur. So, in this case, T(R1) is the measurement interval and
   its recognition is essential in order to be compatible and make
   comparison with other active/passive/hybrid Packet Loss metrics.

   When we expand to multipoint-to-multipoint flows, we have to consider
   that all source nodes mark the traffic and this adds more complexity.

   Regarding the timing aspects of the methodology, RFC 8321 [RFC8321]
   already describes two contributions that are taken into account: the
   clock error between network devices and the network delay between
   measurement points.

   But we should now consider an additional contribution. Since all
   source nodes mark the traffic, the source measurement intervals can
   be of different lengths and with different offsets, and this mismatch
   m can be added to d, as shown in the following figure.

   ...BBBBBBBBB | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | BBBBBBBBB...
                |<======================================>|
                |                   L                    |
   ...=========>|<==================><==================>|<==========...
                |        L/2                 L/2         |
                |<=><===>|                      |<===><=>|
                  m   d  |                      |  d   m
                         |<====================>|
                       available counting interval

             Figure 5: Timing Aspects for Multipoint paths

   So the misalignment between the marking source routers gives an
   additional constraint and the value of m is added to d (that already
   includes clock error and network delay).

   Thus, three different possible contributions are considered: clock
   error between network devices, network delay between measurement
   points and the misalignment between the marking source routers.
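
   Figure 5 can also be read numerically: on each side of the marking
   period of length L, a guard time of m + d is lost, so the available
   counting interval is L - 2m - 2d. The following Python sketch only
   illustrates this arithmetic; the function names and the sample
   values (in milliseconds) are assumptions made for the example.

```python
def available_counting_interval(period, mismatch, delay_and_clock_error):
    """L - 2m - 2d, with all three arguments in the same time unit."""
    return period - 2 * mismatch - 2 * delay_and_clock_error

def period_is_usable(period, mismatch, delay_and_clock_error):
    """The method works only if the available counting interval is positive."""
    return available_counting_interval(period, mismatch, delay_and_clock_error) > 0

# Hypothetical values in milliseconds.
print(available_counting_interval(1000, 50, 100))  # L = 1 s leaves room to count
print(period_is_usable(1000, 50, 100))
print(period_is_usable(200, 50, 60))               # guard times exceed the period
```

   With a 1000 ms marking period the sample guard times leave 700 ms
   for counting, while with a 200 ms period they leave no usable
   interval at all.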

   In the end, the condition that must be satisfied to enable the method
   to function properly is that the available counting interval must be
   > 0, and that means:

   L - 2m - 2d > 0.

   This formula needs to be verified for each measurement point on the
   multipoint path, where m is the misalignment between the marking
   source routers, while d, already introduced in RFC 8321 [RFC8321],
   takes into account clock error and network delay between network
   nodes. Therefore, the mismatch between measurement intervals must
   satisfy this condition.

   Note that the timing considerations are valid for both packet loss
   and delay measurements.

8.  Multipoint Delay and Delay Variation

   The same line of reasoning can be applied to Delay and Delay
   Variation. Similarly to the delay measurements defined in RFC 8321
   [RFC8321], the marking batches anchor the samples to a particular
   period and this is the time reference that can be used. It is
   important to highlight that both delay and delay variation
   measurements make sense in a multipoint path. The Delay Variation is
   calculated by considering the same packets selected for measuring the
   Delay.

   In general, it is possible to perform delay and delay variation
   measurements on multipoint paths basis or on single packets basis:

   o  Delay measurements on multipoint paths basis means that the delay
      value is representative of an entire multipoint path (e.g. whole
      multipoint network, a cluster or a combination of clusters).

   o  Delay measurements on a single packet basis means that you can use
      multipoint path just to easily couple packets between input and
      output nodes of a multipoint path, as it is described in the
      following sections.

8.1.  Delay measurements on multipoint paths basis

8.1.1.  Single Marking measurement

   Mean delay and mean delay variation measurements can also be
   generalized to the case of multipoint flows.
   It is possible to compute the average one-way delay of packets in
   one block, in a cluster, or in the entire monitored network.

   The average latency can be measured as the difference between the
   weighted averages of the mean timestamps of the sets of output and
   input nodes.  This means that, in the calculation, it is possible to
   weigh the timestamps by considering the number of packets for each
   endpoint.

8.2.  Delay measurements on single packets basis

8.2.1.  Single and Double Marking measurement

   Delay and delay variation measurements relative to only one picked
   packet per period (both single and double marked) can be performed in
   the Multipoint scenario with some limitations:

      Single marking based on the first/last packet of the interval
      would not work, because it would not be possible to agree on the
      first packet of the interval.

      Double marking or multiplexed marking would work, but each
      measurement would only give information about the delay of a
      single path.  However, by repeating the measurement multiple
      times, it is possible to get information about all the paths in
      the multipoint flow.  This can be done in the case of a
      point-to-multipoint path, but it is more difficult to achieve in
      the case of a multipoint-to-multipoint path because of the
      multiple source routers.

   If we want to perform a delay measurement for more than one picked
   packet in the same marking period and, especially, if we want to get
   delay measurements on a multipoint-to-multipoint basis, neither the
   single nor the double marking method is useful in the Multipoint
   scenario, since they would not be representative of the entire flow.
   The packets can follow different paths with various delays, and in
   general it can be very difficult to recognize marked packets in a
   multipoint-to-multipoint path, especially when there is more than
   one per period.

   A desirable option is to monitor simultaneously all the paths of a
   multipoint path in the same marking period and, for this purpose,
   hashing can be used as reported in the next Section.

8.2.2.  Hashing selection method

   RFC 5474 [RFC5474] and RFC 5475 [RFC5475] introduce sampling and
   filtering techniques for IP Packet Selection.

   The hash-based selection methodologies for delay measurement can work
   in a multipoint-to-multipoint path and can be used either coupled to
   mean delay or stand-alone.

   [I-D.mizrahi-ippm-compact-alternate-marking] introduces how to use
   the Hash method (RFC 5474 [RFC5474] and RFC 5475 [RFC5475]) combined
   with the Alternate Marking method for point-to-point flows.  It is
   also called Mixed Hashed Marking: the coupling of the marking method
   and the hashing technique is very useful because the marking batches
   anchor the samples selected with hashing, and this simplifies the
   correlation of the hashing packets along the path.

   It is possible to use a basic hash or a dynamic hash method.  One of
   the challenges of the basic approach is that the frequency of the
   sampled packets may vary considerably.  For this reason, the dynamic
   approach has been introduced for point-to-point flows in order to
   have the desired and almost fixed number of samples for each
   measurement period.  In hash-based sampling, Alternate Marking is
   used to create periods, so that hash-based samples are divided into
   batches, which allows the selected samples to be anchored to their
   period.  Moreover, in dynamic hash-based sampling, the number of
   samples in each marking period is bounded by dynamically adapting
   the length of the hash value.  This can be realized by choosing the
   maximum number of samples (NMAX) to be caught in a marking period.
   The algorithm starts with only a few hash bits, which permits
   selecting a greater percentage of packets (e.g.
   with 0 bits of hash all the packets are sampled, with 1 bit of hash
   half of the packets are sampled, and so on).  When the number of
   selected packets reaches NMAX, a hashing bit is added.  As a
   consequence, the sampling proceeds at half of the original rate, and
   the packets already selected that do not match the new hash are
   discarded.  This step can be repeated iteratively.  It is assumed
   that each sample includes the timestamp (used for delay measurement)
   and the hash value, allowing the management system to match the
   samples received from the two measurement points.  The dynamic
   process statistically converges at the end of a marking period, and
   the final number of selected samples is between NMAX/2 and NMAX.
   Therefore, the dynamic approach paces the sampling rate, making it
   possible to bound the number of sampled packets per sampling period.

   In a multipoint environment, the behaviour is similar to that of a
   point-to-point flow.  In particular, in the context of a
   multipoint-to-multipoint flow, the dynamic hash could be the solution
   to perform delay measurements on specific packets and to overcome the
   single and double marking limitations.

   The management system receives the samples, including the timestamps
   and the hash value, from all the MPs, and this happens both for
   point-to-point and for multipoint-to-multipoint flows.  Then the
   longest hash used by the MPs is deduced, and it is applied to couple
   the timestamps of the same packets at 2 MPs of a point-to-point path
   or at the input and output MPs of a Cluster (or a Super Cluster or
   the entire network).  But some considerations are needed: if there
   is no packet loss, the set of input samples is always equal to the
   set of output samples.
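
   The dynamic hash procedure and the coupling step described above can
   be sketched as follows.  This is a minimal Python illustration under
   stated assumptions: SHA-256 truncated to 32 bits stands in for the
   hash function (none is prescribed here), a packet is selected when
   its low-order hash bits are all zero, and the class and function
   names are hypothetical.

```python
import hashlib


def packet_hash(packet_id: bytes) -> int:
    # Assumption: 32 bits of SHA-256 stand in for the unspecified hash.
    return int.from_bytes(hashlib.sha256(packet_id).digest()[:4], "big")


class DynamicHashSampler:
    """Per-MP sampler that keeps fewer than nmax samples per period."""

    def __init__(self, nmax: int):
        self.nmax = nmax
        self.bits = 0       # 0 hash bits: every packet is selected
        self.samples = {}   # full hash value -> timestamp

    def _selected(self, h: int) -> bool:
        # Selected when the low `bits` bits of the hash are all zero.
        return h & ((1 << self.bits) - 1) == 0

    def observe(self, packet_id: bytes, timestamp: float) -> None:
        h = packet_hash(packet_id)
        if not self._selected(h):
            return
        self.samples[h] = timestamp
        # On reaching NMAX, add a hash bit and discard the samples that
        # no longer match; repeat if the set is still too large.
        while len(self.samples) >= self.nmax and self.bits < 32:
            self.bits += 1
            self.samples = {k: v for k, v in self.samples.items()
                            if self._selected(k)}


def couple(inp: DynamicHashSampler, out: DynamicHashSampler) -> dict:
    """Return {hash: one-way delay} for packets sampled at both MPs."""
    # The management system applies the longest hash used by the MPs
    # and matches the samples on the full hash value.
    mask = (1 << max(inp.bits, out.bits)) - 1
    return {h: out.samples[h] - t for h, t in inp.samples.items()
            if h & mask == 0 and h in out.samples}
```

   When two samplers observe the same packets, they end the period with
   the same hash length and the same sample set, so every input sample
   finds its output counterpart.  The sketch ignores hash collisions,
   which a real implementation would need to handle.
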
   In case of packet loss, the set of output samples can be a subset of
   the input samples, but the method still works because, at the end,
   it is easy to couple the input and output timestamps of each caught
   packet using the hash (in particular, the "unused part of the hash",
   which should be different for each packet).

   Therefore, the basic hash is logically similar to the double marking
   method, and in the case of a point-to-point path, double marking and
   basic hash selection are equivalent.  The dynamic approach scales the
   number of measurements per interval, and it would seem that double
   marking would also work well if we reduced the interval length, but
   this can be done only for a point-to-point path and not for a
   multipoint path, where we cannot couple the picked packets.  So, in
   general, if we want to get delay measurements on a
   multipoint-to-multipoint path basis and want to select more than one
   packet per period, double marking cannot be used because we would
   not be able to couple the picked packets between input and output
   nodes.  On the other hand, we can do that by using hashing selection.

9.  A Closed Loop Performance Management approach

   The Multipoint Alternate Marking framework that is introduced in this
   document adds flexibility to Performance Management (PM) because it
   can reduce the order of magnitude of the packet counters.  This
   allows an SDN Orchestrator to supervise, control, and manage PM in
   large networks.

   The monitoring network can be considered as a whole or can be split
   into Clusters, which are the smallest subnetworks (group-to-group
   segments), maintaining the packet loss property for each subnetwork.
   They can also be combined in new connected subnetworks at different
   levels depending on the detail we want to achieve.

   An SDN Controller or a Network Management System (NMS) can calibrate
   Performance Measurements since they are aware of the network
   topology.  They can start without examining the network in depth.
   If necessary (packet loss is measured or the delay is too high), the
   filtering criteria can be immediately reconfigured in order to
   partition the network by using Clusters and/or different
   combinations of Clusters.  In this way, the problem can be localized
   in a specific Cluster or in a single combination of Clusters, and a
   more detailed analysis can be performed step-by-step by successive
   approximation up to a point-to-point flow detailed analysis.  This is
   the so-called Closed Loop.

   This approach can be called Network Zooming and can be performed in
   two different ways:

   1) change the traffic filter and select more detailed flows;

   2) activate new measurement points by defining more specific
      clusters.

   The Network Zooming approach implies that some filters or rules are
   changed and that there is a transient time to wait until the new
   network configuration takes effect.  This time can be determined by
   the Network Orchestrator/Controller, based on the network conditions.

   For example, if the Network Zooming identifies the performance
   problem for the traffic coming from a specific source, we need to
   recognize the marked signal from this specific source node and its
   relative path.  For this purpose, we can activate all the available
   measurement points and better specify the flow filter criteria (i.e.,
   5-tuple).  As an alternative, it can be enough to select packets from
   the specific source for delay measurements, and in this case it is
   possible to apply the hashing technique as mentioned in the previous
   sections.
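
   The successive-approximation idea behind Network Zooming can be
   sketched as a recursive search over a hierarchy of Clusters.  This
   is only an illustration: the dictionary layout, the function names,
   and the assumption that all counters for one marking period are
   already available (a real controller would reconfigure filters and
   wait for the transient time between steps) are hypothetical.

```python
def packet_loss(cluster, counters):
    # Loss of a cluster in one marking period: packets counted at its
    # input measurement points minus packets counted at its outputs.
    return (sum(counters[n] for n in cluster["inputs"])
            - sum(counters[n] for n in cluster["outputs"]))


def zoom(cluster, counters, threshold=0):
    """Return the names of the finest sub-clusters showing loss."""
    if packet_loss(cluster, counters) <= threshold:
        return []  # nothing to localize below this level
    culprits = [name
                for child in cluster.get("children", [])
                for name in zoom(child, counters, threshold)]
    # If no child explains the loss, report this cluster itself.
    return culprits or [cluster["name"]]


# Hypothetical topology: the whole network split into three clusters.
network = {
    "name": "network", "inputs": ["in1", "in2"], "outputs": ["out1", "out2"],
    "children": [
        {"name": "A", "inputs": ["in1", "in2"], "outputs": ["mid1", "mid2"]},
        {"name": "B", "inputs": ["mid1"], "outputs": ["out1"]},
        {"name": "C", "inputs": ["mid2"], "outputs": ["out2"]},
    ],
}
counters = {"in1": 100, "in2": 50, "mid1": 100, "mid2": 50,
            "out1": 100, "out2": 47}
print(zoom(network, counters))
```

   The search stops at the first level where no loss is seen, which
   mirrors starting "without examining in depth" and refining only when
   a problem is measured.
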

   [I-D.song-opsawg-ifit-framework] defines an architecture where the
   centralized Data Collector and Network Management can apply the
   intelligent and flexible Alternate Marking algorithm as previously
   described.

   As for RFC 8321 [RFC8321], it is possible to classify the traffic and
   mark a portion of the total traffic.  For each period, the packet
   rate and bandwidth are calculated from the number of packets.  In
   this way, the Network Orchestrator becomes aware if the traffic rate
   exceeds limits.  In addition, more precision can be obtained by
   reducing the marking period; indeed, some implementations use a
   marking period of 1 sec or less.

   In addition, an SDN Controller could also collect the measurement
   history.

   It is important to mention that the Multipoint Alternate Marking
   framework also helps Traffic Visualization.  Indeed, this methodology
   is very useful to identify which path or which cluster is crossed by
   the flow.

10.  Examples of application

   There are application fields where it may be useful to take into
   consideration the Multipoint Alternate Marking:

   o  VPN: The IP traffic is selected on an IP source basis in both
      directions.  At the endpoint WAN interface, all the output traffic
      is counted in a single flow.  The input traffic is composed of all
      the other flows aggregated by source address.  So, by considering
      n endpoints, the monitored flows are n (each flow with 1 ingress
      point and (n-1) egress points) instead of n*(n-1) flows (each
      flow with 1 ingress point and 1 egress point);

   o  Mobile Backhaul: LTE traffic is selected, in the Up direction, by
      the EnodeB source address and, in the Down direction, by the
      EnodeB destination address because the packets are sent from the
      Mobile Packet Core to the EnodeB.
      So the monitored flow is only one per EnodeB in both directions;

   o  Over The Top (OTT) services: The traffic is selected, in the Down
      direction, by the source addresses of the packets sent by OTT
      Servers and, in the opposite direction (Up), by the destination IP
      addresses of the same Servers.  So the monitoring is based on a
      single flow per OTT Server in both directions;

   o  Enterprise SD-WAN: SD-WAN makes it possible to connect remote
      branch offices to Data Centers and build higher-performance WANs.
      A centralized controller is used to set policies and prioritize
      traffic.  The SD-WAN takes into account these policies and the
      availability of network bandwidth to route traffic.  This helps
      ensure that application performance meets service level
      agreements (SLAs).  This methodology can also help the path
      selection for the WAN connection based on per-Cluster and per-flow
      performance.

   Note that the list is just an example and it is not exhaustive.  More
   applications are possible.

11.  Security Considerations

   This document specifies a method to perform measurements that does
   not directly affect Internet security nor applications that run on
   the Internet.  However, implementation of this method must be mindful
   of security and privacy concerns, as explained in RFC 8321 [RFC8321].

12.  Acknowledgements

   The authors would like to thank Al Morton, Tal Mizrahi, and Rachel
   Huang for their valuable contributions.

13.  IANA Considerations

   This memo makes no requests of IANA.

14.  References

14.1.  Normative References

   [RFC5474]  Duffield, N., Ed., Chiou, D., Claise, B., Greenberg, A.,
              Grossglauser, M., and J. Rexford, "A Framework for Packet
              Selection and Reporting", RFC 5474, DOI 10.17487/RFC5474,
              March 2009.

   [RFC5475]  Zseby, T., Molina, M., Duffield, N., Niccolini, S., and F.
              Raspall, "Sampling and Filtering Techniques for IP Packet
              Selection", RFC 5475, DOI 10.17487/RFC5475, March 2009.

   [RFC5644]  Stephan, E., Liang, L., and A. Morton, "IP Performance
              Metrics (IPPM): Spatial and Multicast", RFC 5644,
              DOI 10.17487/RFC5644, October 2009.

   [RFC8321]  Fioccola, G., Ed., Capello, A., Cociglio, M., Castaldelli,
              L., Chen, M., Zheng, L., Mirsky, G., and T. Mizrahi,
              "Alternate-Marking Method for Passive and Hybrid
              Performance Monitoring", RFC 8321, DOI 10.17487/RFC8321,
              January 2018.

14.2.  Informative References

   [I-D.ietf-ippm-route]
              Alvarez-Hamelin, J., Morton, A., Fabini, J., Pignataro,
              C., and R. Geib, "Advanced Unidirectional Route Assessment
              (AURA)", draft-ietf-ippm-route-07 (work in progress),
              December 2019.

   [I-D.mizrahi-ippm-compact-alternate-marking]
              Mizrahi, T., Arad, C., Fioccola, G., Cociglio, M., Chen,
              M., Zheng, L., and G. Mirsky, "Compact Alternate Marking
              Methods for Passive and Hybrid Performance Monitoring",
              draft-mizrahi-ippm-compact-alternate-marking-05 (work in
              progress), July 2019.

   [I-D.song-opsawg-ifit-framework]
              Song, H., Qin, F., Chen, H., Jin, J., and J. Shin,
              "In-situ Flow Information Telemetry",
              draft-song-opsawg-ifit-framework-11 (work in progress),
              March 2020.

   [I-D.zhou-ippm-enhanced-alternate-marking]
              Zhou, T., Fioccola, G., Li, Z., Lee, S., and M. Cociglio,
              "Enhanced Alternate Marking Method",
              draft-zhou-ippm-enhanced-alternate-marking-04 (work in
              progress), October 2019.

   [IEEE-ACM-ToN-MPNPM]
              IEEE/ACM TRANSACTION ON NETWORKING, "Multipoint Passive
              Monitoring in Packet Networks",
              DOI 10.1109/TNET.2019.2950157, 2019.

   [IEEE-Network-PNPM]
              IEEE Network, "AM-PM: Efficient Network Telemetry using
              Alternate Marking", DOI 10.1109/MNET.2019.1800152, 2019.

   [RFC7011]  Claise, B., Ed., Trammell, B., Ed., and P.
              Aitken, "Specification of the IP Flow Information Export
              (IPFIX) Protocol for the Exchange of Flow Information",
              STD 77, RFC 7011, DOI 10.17487/RFC7011, September 2013.

Authors' Addresses

   Giuseppe Fioccola (editor)
   Huawei Technologies
   Riesstrasse, 25
   Munich  80992
   Germany

   Email: giuseppe.fioccola@huawei.com


   Mauro Cociglio
   Telecom Italia
   Via Reiss Romoli, 274
   Torino  10148
   Italy

   Email: mauro.cociglio@telecomitalia.it


   Amedeo Sapio
   Politecnico di Torino
   Corso Duca degli Abruzzi, 24
   Torino  10129
   Italy

   Email: amedeo.sapio@polito.it


   Riccardo Sisto
   Politecnico di Torino
   Corso Duca degli Abruzzi, 24
   Torino  10129
   Italy

   Email: riccardo.sisto@polito.it