ippm                                                        R. Geib, Ed.
Internet-Draft                                          Deutsche Telekom
Intended status: Standards Track                       December 23, 2020
Expires: June 26, 2021

              A Connectivity Monitoring Metric for IPPM
              draft-ietf-ippm-connectivity-monitoring-00

Abstract

   Within a Segment Routing domain, segment routed measurement packets
   can be sent along pre-determined paths.  This enables new kinds of
   measurements.
   Connectivity monitoring allows supervision of the state and
   performance of a connection or a (sub)path from one or a few central
   monitoring systems.  This document specifies a suitable type-P
   connectivity monitoring metric.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 26, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  A brief segment routing connectivity monitoring framework
   3.  Network topology requirements
   4.
       Singleton Definition for Type-P-SR-Path-Connectivity-and-
         Congestion
     4.1.  Metric Name
     4.2.  Metric Parameters
     4.3.  Metric Units
     4.4.  Definition
     4.5.  Discussion
     4.6.  Methodologies
     4.7.  Errors and Uncertainties
     4.8.  Reporting the Metric
   5.  Singleton Definition for Type-P-SR-Path-Round-Trip-Delay-
       Estimate
   6.  IANA Considerations
   7.  Security Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Author's Address

1.  Introduction

   Within a Segment Routing domain, measurement packets can be sent
   along pre-determined segment routed paths [RFC8402].  A segment
   routed path may consist of pre-determined sub-paths, specific router
   interfaces or a combination of both.  A measurement path may also
   consist of sub-paths spanning multiple routers, given that all
   segments addressing a desired path are available and known at the SR
   domain edge interface.

   A Path Monitoring System or PMS (see [RFC8403]) is a dedicated
   central Segment Routing (SR) domain monitoring device (as compared to
   a distributed monitoring approach based on router data and functions
   only).
   Individual sub-paths or point-to-point connections are monitored for
   different purposes.  An IGP exchanges hello messages between
   neighbors to keep routing alive and to adapt it swiftly to topology
   changes.  Network operators may be interested in monitoring
   connectivity and congestion of interfaces or sub-paths at a timescale
   of seconds, minutes or hours.  In both cases, the periodicity is
   significantly smaller than that of commodity interface monitoring
   based on router counters, which may be collected on a minute
   timescale to keep the processor and monitoring-data load low.

   The IPPM architecture was a first step in that direction [RFC2330].
   Commodity IPPM solutions require dedicated measurement systems, a
   large number of measurement agents and synchronised clocks.
   Monitoring a domain from edge to edge by commodity IPPM solutions
   increases the scalability of the monitoring system, but localising
   the site of a detected change in network behaviour may then require
   network tomography methods.

   The IPPM Metrics for Measuring Connectivity offer generic
   connectivity metrics [RFC2678].  These metrics measure connectivity
   between end nodes without making any assumption about the paths
   between them.  The metric and the type-p packet specified by this
   document follow a different approach: they are designed to monitor
   connectivity and performance of a specific single link or path
   segment.  The underlying definition of connectivity is partially the
   same: a packet not reaching a destination indicates a loss of
   connectivity.  An IGP re-route may indicate the loss of a link, while
   it might not cause loss of connectivity between end systems.  The
   metric specified here detects a link loss if the end-to-end delay
   along the new route differs from that of the original path.

   A Segment Routing PMS is part of an SR domain.
   The PMS is IGP topology aware, covering the IP and (if present) the
   MPLS layer topology [RFC8402].  This allows PMS measurement packets
   to be steered along arbitrary pre-determined concatenated sub-paths,
   identified by suitable Segment IDs.  Basically, the SR connectivity
   metric specified by this document requires the set-up of a number of
   constrained, overlaid measurement loops (or measurement paths).  The
   delay of the packets sent along each of these measurement loops is
   measured.  A single congested interface or a single loss of
   connectivity of a monitored sub-path causes a delay change on several
   measurement paths.  Any single event of that type on one of the
   monitored sub-paths changes the delays of a unique subset of
   measurement loops.  The number of measurement loops may be limited to
   one per sub-path (or connection) to be monitored if a hub-and-spoke-
   like sub-path topology as described below is monitored.  In addition
   to the information revealed by a commodity ICMP ping measurement, the
   metrics and methods specified here identify the location of a
   congested interface.  To do so, tomography assumptions and methods
   are combined, first to plan the overlaid SR measurement loop set-up
   and later to evaluate the captured delay measurements.

   There's another difference as compared with commodity ping: the
   measurement loop packets remain in the data plane of the routers
   passed.  These need to forward the measurement packets without any
   additional processing.

   It is recommended to consider automated measurement loop set-ups.
   The methods proposed here are error-prone if the topology and
   measurement loop design isn't followed properly.  While the details
   of an automated set-up are not within the scope of this document,
   some formal definitions of the constraints to be respected are given.
   This document specifies a type-p metric determining properties of an
   SR path which allow monitoring of connectivity and congestion of
   interfaces, and which further allow locating the path or interface
   that caused a change in the reported type-p metric.  This document is
   limited to the MPLS layer, but the methodology may be applied within
   SR domains or MPLS domains in general.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  A brief segment routing connectivity monitoring framework

   The Segment Routing IGP topology information consists of the IP and
   (if present) the MPLS layer topology.  The minimum SR topology
   information consists of Node Segment Identifiers (Node-SIDs), each
   identifying an SR router.  The IGP exchange of Adjacency-SIDs
   [RFC8667], which identify local interfaces to adjacent nodes, is
   optional.  It is RECOMMENDED to distribute Adj-SIDs in a domain
   operating a PMS to monitor connectivity as specified below.  If Adj-
   SIDs aren't available, [RFC8029] provides methods to steer packets
   along desired paths by the proper choice of an MPLS Echo-request IP
   destination address.  A detailed description of [RFC8029] methods as
   a replacement for Adj-SIDs is out of scope of this document.

   An active round-trip measurement between two adjacent nodes is a
   simple method to monitor the connectivity of a connecting link.  If
   multiple links are operational between two adjacent nodes and only a
   single one fails, a single plain round-trip measurement may fail to
   notice that, or to identify which link has failed.  A round-trip
   measurement also fails to identify which interface is congested, even
   if only a single link connects two adjacent nodes.
   Segment Routing enables the set-up of extended measurement loops.
   Several different measurement loops can be set up to form a partial
   overlay.  If done properly, any network change impacts more than a
   single measurement loop's round-trip delay (or causes drops of
   packets of more than one loop).  Randomly chosen measurement loop
   paths including the interfaces or paths to be monitored may fail to
   produce the desired unique result patterns; hence commodity network
   tomography methods aren't applicable here [CommodityTomography].  The
   approach pursued here uses a pre-specified measurement loop overlay
   design.

   A centralised monitoring approach doesn't require report collection
   and result correlation from two (or more) receivers (the measured
   delays of different measurement loops still need to be correlated).

   An additional property of the measurement path set-up specified below
   is that it allows estimation of the packet round-trip and the one-way
   delay of a monitored sub-path.  The delay along a single link is not
   perfectly symmetric.  Packet processing causes small delay
   differences per interface and direction.  These cause an error which
   can't be quantified or removed by the specified method.  Quantifying
   this error requires a different measurement set-up.  As this would
   introduce additional measurement loops, packets and evaluations, the
   cost in terms of reduced scalability is not felt to be worth the
   benefit in measurement accuracy.  IPPM metrics prefer precision to
   accuracy, and the mentioned processing differences are relatively
   stable, resulting in relatively precise delay estimates for each
   monitored sub-path.

   An example hub and spoke network, operated as an SR domain, is shown
   below.
   The included PMS is supposed to monitor the connectivity of all 6
   links (a very generic kind of sub-path) attaching the spoke nodes
   L050, L060 and L070 to the hub nodes L100 and L200.

        +---+         +----+     +----+
        |PMS|         |L100|-----|L050|
        +---+         +----+\   /+----+
          |          /      \ \_/_____
          |         /        \ /      \+----+
        +----+/     \/_          +----|L060|
        |L300|     /   |/        +----+
        +----+\   /   /\_
               \ /   /   \
                \+----+   /    +----+
                 |L200|-----|L070|
                 +----+     +----+

          Hub and spoke connectivity verification with a PMS

                               Figure 1

   The SID values are picked for convenient reading only.  Node-SID 100
   identifies L100, Node-SID 300 identifies L300, and so on.  Adj-SID
   10050: adjacency L100 to L050; Adj-SID 10060: adjacency L100 to L060;
   Adj-SID 60200: adjacency L060 to L200; and so on (note that Adj-SIDs
   are locally assigned per node interface, meaning two per link).

   Monitoring the 6 links between hub nodes Ln00 (where n=1,2) and spoke
   nodes L0m0 (where m=5,6,7) requires 6 measurement loops, which have
   the following properties:

   o  Each measurement loop follows a single round trip from one hub
      Ln00 to one spoke L0m0 (e.g., between L100 and L050).

   o  Each measurement loop passes two more links: one between the same
      hub Ln00 and another spoke L0m0, and from there to the alternate
      hub Ln00 (e.g., between L100 and L060 and then from L060 to L200).

   o  Every monitored link is passed only once by a single round-trip
      measurement loop, and further only once unidirectionally by each
      of two other loops.  These unidirectional measurement loop
      sections forward packets in opposing directions along the
      monitored link.  In the end, three measurement loops pass each
      single monitored link (sub-path).  In Figure 1, e.g., one
      measurement loop makes a round trip L100 to L050 and back (M1, see
      below), a second loop passes L100 to L050 only (M3) and a third
      loop passes L050 to L100 only (M6).
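The coverage rule above can be checked mechanically.  The following Python sketch is illustrative only; it encodes the hop lists of the six measurement loops M1 - M6 of the example (their full SR paths are enumerated below) and classifies, per monitored link, which loop passes it round trip and which two pass it in one direction each:

```python
# Illustrative check of the overlay design rule: every monitored link is
# passed round trip by exactly one loop and once in each direction by
# two other loops.  Hop lists follow the example topology of Figure 1.
LOOPS = {
    "M1": [("L100", "L050"), ("L050", "L100"), ("L100", "L060"), ("L060", "L200")],
    "M2": [("L100", "L060"), ("L060", "L100"), ("L100", "L070"), ("L070", "L200")],
    "M3": [("L100", "L070"), ("L070", "L100"), ("L100", "L050"), ("L050", "L200")],
    "M4": [("L200", "L050"), ("L050", "L200"), ("L200", "L060"), ("L060", "L100")],
    "M5": [("L200", "L060"), ("L060", "L200"), ("L200", "L070"), ("L070", "L100")],
    "M6": [("L200", "L070"), ("L070", "L200"), ("L200", "L050"), ("L050", "L100")],
}

def coverage(hub, spoke):
    """Return ([round-trip loops], [hub->spoke only], [spoke->hub only])."""
    down, up = (hub, spoke), (spoke, hub)
    both = sorted(m for m, h in LOOPS.items() if down in h and up in h)
    d_only = sorted(m for m, h in LOOPS.items() if down in h and up not in h)
    u_only = sorted(m for m, h in LOOPS.items() if up in h and down not in h)
    return both, d_only, u_only

# Each monitored link must be covered by exactly 1 + 1 + 1 loops:
for hub in ("L100", "L200"):
    for spoke in ("L050", "L060", "L070"):
        assert all(len(part) == 1 for part in coverage(hub, spoke))
```

For the link L100 - L050 this yields M1 round trip, M3 in "downlink" and M6 in "uplink" direction, matching the description above.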
   Note that any 6 links connecting two to five nodes can be monitored
   that way too.  Further note that the measurement loop overlay chosen
   is optimised for 6 links and a hub and spoke topology of two to five
   nodes.  The 'one measurement loop per measured sub-path' paradigm
   only works under these conditions.

   The above overlay scheme results in 6 measurement loops for the given
   example.  The start and end of each measurement loop is PMS to L300
   to L100 or L200, and a similar sub-path on the return leg.  These
   parts of the measurement loops are omitted here for brevity (some
   discussion may be found below).  The following delays are measured
   along the SR paths of each measurement loop:

   1.  M1 is the delay along L100 -> L050 -> L100 -> L060 -> L200

   2.  M2 is the delay along L100 -> L060 -> L100 -> L070 -> L200

   3.  M3 is the delay along L100 -> L070 -> L100 -> L050 -> L200

   4.  M4 is the delay along L200 -> L050 -> L200 -> L060 -> L100

   5.  M5 is the delay along L200 -> L060 -> L200 -> L070 -> L100

   6.  M6 is the delay along L200 -> L070 -> L200 -> L050 -> L100

   An example of a loop segment stack consisting of Node-SID segments
   allowing capture of M1 is (top to bottom): 100 | 050 | 100 | 060 |
   200 | PMS.

   An example of a stack of Adj-SID segments for the loop resulting in
   M1 is (top to bottom): 100 | 10050 | 50100 | 10060 | 60200 | PMS.  As
   can be seen, the Node-SIDs 100 and PMS are present at the top and
   bottom of the segment stack.  Their purpose is to transport the
   packet from the PMS to the start of the measurement loop at L100 and
   to return it to the PMS from its end.
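The Adj-SID stack above can be derived mechanically from a loop's hop list.  A minimal Python sketch, assuming the illustrative SID numbering of this example (the Adj-SID digits are formed by concatenating the two node numbers; real Adj-SIDs are assigned independently by each node):

```python
# Hypothetical segment stack construction for a measurement loop, using
# the illustrative Adj-SID numbering of Figure 1 (e.g. Adj-SID 10050 is
# the adjacency L100 -> L050, i.e. the concatenated node numbers).
def node_num(node: str) -> int:
    # "L050" -> 50, "L100" -> 100
    return int(node.lstrip("L"))

def adj_sid(src: str, dst: str) -> int:
    # Illustrative numbering only; not a real Adj-SID allocation scheme.
    return int(f"{node_num(src)}{node_num(dst)}")

def loop_stack(hops, ingress_node_sid, pms_node_sid="PMS"):
    """Label stack, top to bottom: Node-SID of the ingress hub, one
    Adj-SID per monitored hop, and a Node-SID returning to the PMS."""
    return [ingress_node_sid] + [adj_sid(a, b) for a, b in hops] + [pms_node_sid]

m1_hops = [("L100", "L050"), ("L050", "L100"), ("L100", "L060"), ("L060", "L200")]
print(loop_stack(m1_hops, 100))  # [100, 10050, 50100, 10060, 60200, 'PMS']
```

The printed stack reproduces the M1 Adj-SID example of the text.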
   Evaluation of the measurement loop round-trip delays M1 - M6 allows
   detection of the following state changes of the monitored sub-paths:

   o  If the loops are set up using Node-SIDs only, any single complete
      loss of connectivity caused by a failing single link between any
      Ln00 and any L0m0 node briefly disturbs three loops (and changes
      their measured delays).  The traffic to the Node-SIDs is rerouted
      (in the case of a single link's loss, no node is completely
      disconnected in the example network).

   o  If the loops are set up using Adj-SIDs only, any single complete
      loss of connectivity caused by a failing single link between any
      Ln00 and any L0m0 node terminates the traffic along three
      measurement loops.  The packets of all three loops will be dropped
      until the link gets back into service.  Traffic to Adj-SIDs is not
      rerouted.  Note that Node-SIDs may be used to forward the
      measurement packets from the PMS to the hub node where the first
      sub-path to be monitored begins, and from the hub node receiving
      the measurement packet from the last monitored sub-path back to
      the PMS.

   o  Any congested single interface between any Ln00 and any L0m0 node
      impacts the measured delay of only two measurement loops.

   o  As an example, the formula for a single link (sub-path) Round Trip
      Delay (RTD) is:

         4 * RTD_L100-L050-L100 = 3*M1 + M3 + M6 - M2 - M4 - M5

      This formula is reproducible for all other links: sum three times
      the RTD measured along the loop passing the monitored link of
      interest in round-trip fashion, and add the RTDs of the two
      measurement loops passing the link of interest in a single
      direction only.  From this sum subtract the RTDs measured on all
      loops not passing the monitored link of interest.  The result is
      four times the RTD of the monitored link of interest.
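The per-link RTD calculation generalises as sketched below.  This is illustrative only: the loop-to-link mapping is the one implied by Figure 1 and the loop list above, the delay values are invented, and per-direction link delays are assumed symmetric (the text above notes that asymmetry introduces a small residual error):

```python
# Sketch: recover each monitored link's RTD from the six loop delays.
# For every link, LINK_LOOPS names (round-trip loop, hub->spoke loop,
# spoke->hub loop), per the Figure 1 overlay.
LINK_LOOPS = {
    ("L100", "L050"): ("M1", "M3", "M6"),
    ("L100", "L060"): ("M2", "M1", "M4"),
    ("L100", "L070"): ("M3", "M2", "M5"),
    ("L200", "L050"): ("M4", "M6", "M3"),
    ("L200", "L060"): ("M5", "M4", "M1"),
    ("L200", "L070"): ("M6", "M5", "M2"),
}

def link_rtd(link, delays):
    """4 * RTD = 3*M_roundtrip + M_down + M_up - sum(remaining loops)."""
    rt, down, up = LINK_LOOPS[link]
    others = set(delays) - {rt, down, up}
    return (3 * delays[rt] + delays[down] + delays[up]
            - sum(delays[m] for m in others)) / 4.0

# Invented loop delays, consistent with symmetric per-direction link
# delays of 5 (L100-L050), 7, 9, 4, 6 and 8 (L200-L070) per direction:
delays = {"M1": 23.0, "M2": 31.0, "M3": 27.0, "M4": 21.0, "M5": 29.0, "M6": 25.0}
print(link_rtd(("L100", "L050"), delays))  # 10.0, i.e. the L100-L050 RTD
```

For the L100 - L050 link this reproduces the formula of the last bullet: (3*23 + 27 + 25 - 31 - 21 - 29) / 4 = 10.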
   A closer look reveals that each single event of interest for the
   proposed metric (a loss of connectivity or a case of congestion)
   impacts only a single set of measurement loops, which can be
   determined a priori.  If, e.g., connectivity is lost between L200 and
   L050, measurement loops (3), (4) and (6) indicate a change in the
   measured delay.

   As a second example: if the interface L070 to L100 is congested,
   measurement loops (3) and (5) indicate a change in the measured
   delay.  Without listing all events: every single loss of connectivity
   and every single congestion event influences the delay measurements
   of a unique set of measurement loops only.

   Assume that the measurement loops are set up while there's no
   congestion.  In that case, the congestion-free RTDs of all monitored
   links can be calculated as shown above.  A single congestion event
   adds queuing delay to the RTDs measured by two specific measurement
   loops.  The two impacted measurement loops allow identification of
   the congested interface and calculation of the queue depth in terms
   of seconds.  As an example, assume a queue of an average depth of 20
   ms builds up at interface L200 to L070 after the uncongested
   measurement interval T0.  The measurement loops M5 and M6 are the
   only ones passing the interface in that direction.  Both M5 and M6
   indicate a congestion delay of +20 ms during measurement interval T1,
   while M1 - M4 indicate no change.  The location of the congested
   interface is determined by the combination of the two (and only two)
   measurement loops M5 and M6 showing an increased delay.  The average
   queue depth = ( M5[T1] - M5[T0] + M6[T1] - M6[T0] ) / 2.

   As mentioned, there's a constant delay added for each measurement
   loop, which is the delay of the path traversed from PMS -> L100 plus
   L200 -> PMS.
   Please note that this added delay appears twice in the formula
   resulting in the monitored link delay estimate of the example
   network; it then amounts to RTD(PMS -> L100) + RTD(L200 -> PMS).
   Both RTDs can be directly measured by two additional measurements,
   Cor1 = RTD(PMS -> L100 -> PMS) and Cor2 = RTD(PMS -> L200 -> PMS).
   With the uncorrected monitored link RTD formula from above,
   4*linkRTDx_uncor = 3*Mx + My + Mz - Ms - Mt - Mu, the corrected value
   is 4*linkRTDx = 4*linkRTDx_uncor - Cor1 - Cor2.

   If the interface between the PMS and L100/L200 is congested, all
   measurement loops M1 - M6 as well as Cor1 and Cor2 will see a change.
   A congested interface of a monitored link doesn't impact the RTDs
   captured by Cor1 and Cor2.

   The measurement loops may also be set up between the hub nodes L100
   and L200, if that's preferred and supported by the nodes.  In that
   case, the above formulas apply without correction.

3.  Network topology requirements

   The metric and methods specified below can be applied in networks
   with a hub and spoke topology.  A single network change of type loss
   of connectivity or congestion can be detected.  The nodes don't have
   to be dedicated hub or spoke devices; this is just a topology
   requirement.  In detail, the topology MUST meet the following
   constraints:

   o  The SR domain sub-paths to be monitored create a hub and spoke
      topology with a PMS connected to all hub nodes.  The PMS may
      reside in a hub.

   o  Exactly 6 (six) sub-paths are monitored.

   o  The monitored sub-paths connect at least two and no more than five
      nodes.

   o  Every spoke node MUST have at least one path to every hub node.

   o  Every spoke node MUST be connected to at least one (or more) hub
      node(s) by two monitored sub-paths.

   o  Sub-paths between spokes can't be monitored and therefore are out
      of scope (the overlay measurement loops can't be set up as
      desired).
   Shared resources, like a Shared Risk Link Group (e.g., a single fiber
   bundle) or a shared queue passed by several logical links, need to be
   considered during set-up.  Shared resources may either be desired or
   to be avoided.  As an example, if a set of logical links shares one
   parental scheduler queue, it is sufficient to monitor a single
   logical connection to monitor the state of that parental scheduler.

4.  Singleton Definition for Type-P-SR-Path-Connectivity-and-Congestion

4.1.  Metric Name

   Type-P-SR-SubPath-Connectivity

4.2.  Metric Parameters

   o  Src, the IP address of a source host

   o  Dst, the IP address of a destination host if IP routing is
      applicable; in the case of MPLS routing, a diagnostic address as
      specified by [RFC8029]

   o  T, a time

   o  L, a packet length in bits.  The packets of a Type-P packet stream
      from which the sample Path-Connectivity-and-Congestion metric is
      taken MUST all be of the same length.

   o  MLA, a stack of Segment IDs determining a Monitoring Loop.  The
      Segment IDs MUST be chosen so that a singleton type-p packet
      passes one single monitored sub-path_a bidirectionally, one
      monitored sub-path_b unidirectionally and one monitored sub-path_c
      unidirectionally, where sub-path_a, -_b and -_c MUST NOT be
      identical and MUST NOT share properties to be monitored.

   o  P, the specification of the packet type, over and above the source
      and destination addresses

   o  DS, a constant time interval between two type-P packets, in units
      of seconds

4.3.  Metric Units

   A sequence of consecutive time values.

4.4.  Definition

   A moving average of AV time values per measurement path is compared
   by a change point detection algorithm.  The temporal packet spacing
   value DS represents the smallest period within which a change in
   connectivity or congestion may be detected.
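As an illustration of this definition, a minimal moving-average change point test might look as follows.  This is a sketch only: the window size AV, the threshold value and the 2-vs-3-path classification are assumptions drawn from the framework of Section 2, not a normative algorithm:

```python
# Illustrative sketch: compare the moving average of the newest AV delay
# samples per measurement path against the preceding AV samples, then
# map the set of changed paths to an event type (two changed paths
# suggest congestion, three suggest a loss of connectivity).
def changed(samples, av=5, threshold=0.005):
    """True if the two adjacent AV-sample means differ by more than
    threshold (seconds)."""
    if len(samples) < 2 * av:
        return False
    recent = sum(samples[-av:]) / av
    baseline = sum(samples[-2 * av:-av]) / av
    return abs(recent - baseline) > threshold

def classify(delay_series, av=5, threshold=0.005):
    """delay_series maps a measurement path name to its delay samples."""
    hit = {m for m, s in delay_series.items() if changed(s, av, threshold)}
    if len(hit) == 2:
        return ("congestion", hit)
    if len(hit) == 3:
        return ("connectivity-change", hit)
    return ("no-single-event", hit)
```

With the Section 2 example, delay series in which only M5 and M6 jump by 20 ms would classify as congestion located at the one interface those two loops share.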
   A single loss of connectivity of a sub-path between two nodes affects
   three different measurement paths.  Depending on the value chosen for
   DS, packet loss might occur (note that the moving average evaluation
   needs to span a longer period than the convergence time;
   alternatively, packet loss visible along the three measurement paths
   may serve as an evaluation criterion).  After routing convergence,
   the type-p packets along the three measurement paths show a change in
   delay.

   Congestion of a single interface of a sub-path connecting two nodes
   affects two different measurement paths.  The type-p packets along
   the two congested measurement paths show an additional change in
   delay.

4.5.  Discussion

   Detection of multiple losses of monitored sub-path connectivity, or
   of congestion of multiple monitored sub-paths, may be possible.
   These cases have not been investigated, but may occur in the case of
   Shared Risk Link Groups.  Monitoring Shared Risk Link Groups and sub-
   paths with multiple failures and congestion is not within the scope
   of this document.

4.6.  Methodologies

   For the given type-p, the methodology is as follows:

   o  The set of measurement paths MUST be routed in a way that each
      single loss of connectivity and each case of single interface
      congestion on one of the sub-paths passed by a type-p packet
      creates a unique pattern: the type-p packets belonging to a
      specific subset of all configured measurement paths indicate a
      change in the measured delay.  As a minimum, each sub-path to be
      monitored MUST be passed

      *  by one measurement_path_1 and its type-p packet bidirectionally

      *  by one measurement_path_2 and its type-p packet in "downlink"
         direction

      *  by one measurement_path_3 and its type-p packet in "uplink"
         direction

   o  "Uplink" and "Downlink" have no architectural relevance.
      The terms are chosen to express that the packets of
      measurement_path_2 and measurement_path_3 pass the monitored sub-
      path unidirectionally in opposing directions.
      Measurement_path_1, measurement_path_2 and measurement_path_3 MUST
      NOT be identical.

   o  All measurement paths SHOULD terminate between identical sender
      and receiver interfaces.  It is recommended to connect the sender
      and receiver as closely to the paths to be monitored as possible.
      Each intermediate sub-path between the sender and receiver on the
      one hand and the sub-paths to be monitored on the other is an
      additional source of errors requiring separate monitoring.

   o  Segment Routed domains supporting Node- and Adj-SIDs should enable
      the monitoring path set-up as specified.  Other routing protocols
      may be used as well, but the monitoring path set-up might be
      complex or impossible.

   o  Pre-compute how the two- and three-measurement-path delay changes
      correlate to sub-path connectivity and congestion patterns.
      Absolute change values aren't required; a simultaneous change of
      two or three particular measurement paths is.

   o  Ensure that the temporal resolution of the measurement clock
      allows reliable capture of a unique delay value for each
      configured measurement path while sub-path connectivity is
      complete and no congestion is present.

   o  Synchronised clocks are not strictly required, as the metric
      evaluates differences in delay.  Changes in clock synchronisation
      SHOULD NOT be close to the time interval within which changes in
      connectivity or congestion should be monitored.

   o  At the Src host, select Src and Dst IP addresses, and address
      information to route the type-p packet along one of the configured
      measurement paths.  Form a test packet of Type-P with these
      addresses.

   o  Configure the Dst host access to receive the packet.
   o  At the Src host, place a timestamp, a sequence number and a unique
      identifier of the measurement path in the prepared Type-P packet,
      and send it towards Dst.

   o  Capture the one-way delay and determine packet loss by the metrics
      specified by [RFC7679] and [RFC7680], respectively, and store the
      result for the path.

   o  If two or three sub-paths indicate a change in delay, report a
      change in connectivity or congestion status as pre-computed above.

   Note that monitoring 6 sub-paths requires setting up 6 monitoring
   paths as shown in the figure above.

4.7.  Errors and Uncertainties

   Sources of error are:

   o  Measurement paths whose delays don't indicate a change after sub-
      path connectivity changed.

   o  A timestamp resolution which is too coarse or inaccurate for the
      delays measured along the different monitoring paths.

   o  Multiple simultaneous occurrences of sub-path connectivity loss
      and congestion.

   o  Loss of connectivity and congestion along sub-paths connecting the
      measurement device(s) with the sub-paths to be monitored.

4.8.  Reporting the Metric

   The metric reports loss of connectivity of a monitored sub-path or
   congestion of an interface, and identifies the sub-path and, in the
   case of congestion, the direction of traffic.

   The temporal resolution of the detected events depends on the spacing
   interval of the packets transmitted per measurement path.  An
   identical sending interval is chosen for every measurement path.  As
   a rule of thumb, an event is reliably detected if a sample consists
   of at least 5 probes indicating the same underlying change in
   behavior.  Depending on the underlying event, either two or three
   measurement paths are impacted.
   At least two consecutively received measurement packets per
   measurement path should suffice to indicate a change.  The values
   chosen for an operational network will have to reflect the
   scalability constraints of a PMS measurement interface.  As an
   example, a PMS may work reliably if no more than one measurement
   packet is transmitted per millisecond.  Further, the measurement is
   configured so that the measurement packets return to the sender
   interface.  Assume that groups of 6 links are always monitored, as
   described above, by 6 measurement paths.  If one packet is sent per
   measurement path within 500 ms, up to 498 links can be monitored with
   a reliable temporal resolution of roughly one second per detected
   event.

   Note that the per-group measurement packet spacing, the measurement
   loop delay differences and the latency caused by congestion impact
   the reporting interval.  If each measurement path of a single 6-link
   monitoring group is addressed in consecutive milliseconds (within the
   500 ms interval), and the sum of the maximum physical delay of the
   per-group measurement paths and the latency possibly added by
   congestion is below 490 ms, the one-second reports reliably capture 4
   packets of two different measurement paths if two measurement paths
   are congested, or 6 packets of three different measurement paths if a
   link is lost.

   A variety of reporting options exist, if scalability issues and
   network properties are respected.

5.  Singleton Definition for Type-P-SR-Path-Round-Trip-Delay-Estimate

   This section will be added in a later version, if there's interest in
   picking up this work.

6.  IANA Considerations

   If standardised, the metric will require an entry in the IPPM metric
   registry.

7.  Security Considerations

   This draft specifies how to use methods specified or described within
   [RFC8402] and [RFC8403].  It does not introduce new or additional SR
   features.
   The security considerations of both references apply here too.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC2678]  Mahdavi, J. and V. Paxson, "IPPM Metrics for Measuring
              Connectivity", RFC 2678, DOI 10.17487/RFC2678,
              September 1999, <https://www.rfc-editor.org/info/rfc2678>.

   [RFC7679]  Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton,
              Ed., "A One-Way Delay Metric for IP Performance Metrics
              (IPPM)", STD 81, RFC 7679, DOI 10.17487/RFC7679,
              January 2016, <https://www.rfc-editor.org/info/rfc7679>.

   [RFC7680]  Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton,
              Ed., "A One-Way Loss Metric for IP Performance Metrics
              (IPPM)", STD 82, RFC 7680, DOI 10.17487/RFC7680,
              January 2016, <https://www.rfc-editor.org/info/rfc7680>.

   [RFC8029]  Kompella, K., Swallow, G., Pignataro, C., Ed., Kumar, N.,
              Aldrin, S., and M. Chen, "Detecting Multiprotocol Label
              Switched (MPLS) Data-Plane Failures", RFC 8029,
              DOI 10.17487/RFC8029, March 2017,
              <https://www.rfc-editor.org/info/rfc8029>.

   [RFC8402]  Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
              Decraene, B., Litkowski, S., and R. Shakir, "Segment
              Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
              July 2018, <https://www.rfc-editor.org/info/rfc8402>.

   [RFC8667]  Previdi, S., Ed., Ginsberg, L., Ed., Filsfils, C.,
              Bashandy, A., Gredler, H., and B. Decraene, "IS-IS
              Extensions for Segment Routing", RFC 8667,
              DOI 10.17487/RFC8667, December 2019,
              <https://www.rfc-editor.org/info/rfc8667>.

8.2.  Informative References

   [CommodityTomography]
              Lakhina, A., Papagiannaki, K., Crovella, M., Diot, C.,
              Kolaczyk, E., and N. Taft, "Structural analysis of network
              traffic flows", 2004.

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330,
              DOI 10.17487/RFC2330, May 1998,
              <https://www.rfc-editor.org/info/rfc2330>.

   [RFC8403]  Geib, R., Ed., Filsfils, C., Pignataro, C., Ed., and N.
              Kumar, "A Scalable and Topology-Aware MPLS Data-Plane
              Monitoring System", RFC 8403, DOI 10.17487/RFC8403,
              July 2018, <https://www.rfc-editor.org/info/rfc8403>.

Author's Address

   Ruediger Geib (editor)
   Deutsche Telekom
   Heinrich Hertz Str. 3-7
   Darmstadt  64295
   Germany

   Phone: +49 6151 5812747
   Email: Ruediger.Geib@telekom.de