ippm                                                        R. Geib, Ed.
Internet-Draft                                          Deutsche Telekom
Intended status: Standards Track                       December 23, 2020
Expires: June 26, 2021

              A Connectivity Monitoring Metric for IPPM
              draft-ietf-ippm-connectivity-monitoring-00

Abstract

   Within a Segment Routing domain, segment routed measurement packets
   can be sent along pre-determined paths.  This enables new kinds of
   measurements.
   Connectivity monitoring allows supervision of the state and
   performance of a connection or a (sub)path from one or a few central
   monitoring systems.  This document specifies a suitable type-P
   connectivity monitoring metric.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 26, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  A brief segment routing connectivity monitoring framework
   3.  Network topology requirements
   4.
       Singleton Definition for Type-P-SR-Path-Connectivity-and-
         Congestion
     4.1.  Metric Name
     4.2.  Metric Parameters
     4.3.  Metric Units
     4.4.  Definition
     4.5.  Discussion
     4.6.  Methodologies
     4.7.  Errors and Uncertainties
     4.8.  Reporting the Metric
   5.  Singleton Definition for Type-P-SR-Path-Round-Trip-Delay-
       Estimate
   6.  IANA Considerations
   7.  Security Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Author's Address

1.  Introduction

   Within a Segment Routing domain, measurement packets can be sent
   along pre-determined segment routed paths [RFC8402].  A segment
   routed path may consist of pre-determined sub-paths, specific router
   interfaces or a combination of both.  A measurement path may also
   consist of sub-paths spanning multiple routers, given that all
   segments addressing a desired path are available and known at the SR
   domain edge interface.

   A Path Monitoring System or PMS (see [RFC8403]) is a dedicated
   central Segment Routing (SR) domain monitoring device (as compared to
   a distributed monitoring approach based on router data and functions
   only).
   Individual sub-paths or point-to-point connections are monitored for
   different purposes.  An IGP exchanges hello messages between
   neighbors to keep routing alive and to adapt it swiftly to topology
   changes.  Network operators may be interested in monitoring
   connectivity and congestion of interfaces or sub-paths at a timescale
   of seconds, minutes or hours.  In both cases, the periodicity is
   significantly smaller than that of commodity interface monitoring
   based on router counters, which may be collected on a minute
   timescale to keep the processor and monitoring-data load low.

   The IPPM architecture was a first step in that direction [RFC2330].
   Commodity IPPM solutions require dedicated measurement systems, a
   large number of measurement agents and synchronised clocks.
   Monitoring a domain from edge to edge by commodity IPPM solutions
   increases the scalability of the monitoring system, but localising
   the site of a detected change in network behaviour may then require
   network tomography methods.

   The IPPM Metrics for Measuring Connectivity offer generic
   connectivity metrics [RFC2678].  These metrics measure connectivity
   between end nodes without making any assumption about the paths
   between them.  The metric and the type-p packet specified by this
   document follow a different approach: they are designed to monitor
   connectivity and performance of a specific single link or path
   segment.  The underlying definition of connectivity is partially the
   same: a packet not reaching a destination indicates a loss of
   connectivity.  An IGP re-route may indicate the loss of a link, while
   it might not cause loss of connectivity between end systems.  The
   metric specified here detects a link loss if the end-to-end delay
   along the new route differs from that of the original path.

   A Segment Routing PMS is part of an SR domain.
   The PMS is IGP topology aware, covering the IP and (if present) the
   MPLS layer topology [RFC8402].  This allows PMS measurement packets
   to be steered along arbitrary pre-determined concatenated sub-paths,
   identified by suitable Segment IDs.  Basically, the SR connectivity
   metric specified by this document requires the set-up of a number of
   constrained, overlaid measurement loops (or measurement paths).  The
   delay of the packets sent along each of these measurement loops is
   measured.  A single congested interface or a single loss of
   connectivity of a monitored sub-path causes a delay change on several
   measurement paths.  Any single event of that type on one of the
   monitored sub-paths changes the delays of a unique subset of
   measurement loops.  The number of measurement loops may be limited to
   one per sub-path (or connection) to be monitored if a hub-and-spoke-
   like sub-path topology as described below is monitored.  In addition
   to the information revealed by a commodity ICMP ping measurement, the
   metrics and methods specified here identify the location of a
   congested interface.  To do so, tomography assumptions and methods
   are combined, first to plan the overlaid SR measurement loop set-up
   and later to evaluate the captured delay measurements.

   There's another difference as compared with commodity ping: the
   measurement loop packets remain in the data plane of the routers
   passed.  These need to forward the measurement packets without any
   additional processing.

   It is recommended to consider automated measurement loop set-ups.
   The methods proposed here are error-prone if the topology and
   measurement loop design isn't followed properly.  While the details
   of an automated set-up are not within the scope of this document,
   some formal definitions of the constraints to be respected are given.
   This document specifies a type-p metric determining properties of an
   SR path which allow monitoring of connectivity and congestion of
   interfaces, and which further allow locating the path or interface
   that caused a change in the reported type-p metric.  This document is
   limited to the MPLS layer, but the methodology may be applied within
   SR domains or MPLS domains in general.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  A brief segment routing connectivity monitoring framework

   The Segment Routing IGP topology information consists of the IP and
   (if present) the MPLS layer topology.  The minimum SR topology
   information consists of Node Segment Identifiers (Node-SIDs), each
   identifying an SR router.  The IGP exchange of Adjacency-SIDs
   [RFC8667], which identify local interfaces to adjacent nodes, is
   optional.  It is RECOMMENDED to distribute Adj-SIDs in a domain
   operating a PMS to monitor connectivity as specified below.  If Adj-
   SIDs aren't available, [RFC8029] provides methods to steer packets
   along desired paths by the proper choice of an MPLS Echo-request IP
   destination address.  A detailed description of [RFC8029] methods as
   a replacement for Adj-SIDs is out of scope of this document.

   An active round-trip measurement between two adjacent nodes is a
   simple method to monitor the connectivity of a connecting link.  If
   multiple links are operational between two adjacent nodes and only a
   single one fails, a single plain round-trip measurement may fail to
   notice that, or to identify which link has failed.  A round-trip
   measurement also fails to identify which interface is congested, even
   if only a single link connects two adjacent nodes.
   Segment Routing enables the set-up of extended measurement loops.
   Several different measurement loops can be set up to form a partial
   overlay.  If done properly, any network change impacts more than a
   single measurement loop's round-trip delay (or causes drops of
   packets of more than one loop).  Randomly chosen measurement loop
   paths including the interfaces or paths to be monitored may fail to
   produce the desired unique result patterns; hence commodity network
   tomography methods aren't applicable here [CommodityTomography].  The
   approach pursued here uses a pre-specified measurement loop overlay
   design.

   A centralised monitoring approach doesn't require report collection
   and result correlation from two (or more) receivers (the measured
   delays of different measurement loops still need to be correlated).

   An additional property of the measurement path set-up specified below
   is that it allows estimation of the packet round-trip and the one-way
   delay of a monitored sub-path.  The delay along a single link is not
   perfectly symmetric.  Packet processing causes small delay
   differences per interface and direction.  These cause an error which
   can't be quantified or removed by the specified method.  Quantifying
   this error requires a different measurement set-up.  As this would
   introduce additional measurement loops, packets and evaluations, the
   cost in terms of reduced scalability is not felt to be worth the
   benefit in measurement accuracy.  IPPM metrics prefer precision to
   accuracy, and the mentioned processing differences are relatively
   stable, resulting in relatively precise delay estimates for each
   monitored sub-path.

   An example hub and spoke network, operated as an SR domain, is shown
   below.
   The included PMS is supposed to monitor the connectivity of all 6
   links (a very generic kind of sub-path) attaching the spoke nodes
   L050, L060 and L070 to the hub nodes L100 and L200.

        +---+         +----+     +----+
        |PMS|         |L100|-----|L050|
        +---+         +----+\   /+----+
          |          /      \ \_/_____
          |         /        \ /      \+----+
        +----+/     \/_          +----|L060|
        |L300|     /   |/        +----+
        +----+\   /   /\_
               \ /   /   \
                \+----+   /    +----+
                 |L200|-----|L070|
                 +----+     +----+

          Hub and spoke connectivity verification with a PMS

                               Figure 1

   The SID values are picked for convenient reading only.  Node-SID 100
   identifies L100, Node-SID 300 identifies L300, and so on.  Adj-SID
   10050: adjacency L100 to L050; Adj-SID 10060: adjacency L100 to L060;
   Adj-SID 60200: adjacency L060 to L200; and so on (note that Adj-SIDs
   are locally assigned per node interface, meaning two per link).

   Monitoring the 6 links between hub nodes Ln00 (where n=1,2) and spoke
   nodes L0m0 (where m=5,6,7) requires 6 measurement loops, which have
   the following properties:

   o  Each measurement loop follows a single round trip from one hub
      Ln00 to one spoke L0m0 (e.g., between L100 and L050).

   o  Each measurement loop passes two more links: one between the same
      hub Ln00 and another spoke L0m0, and from there to the alternate
      hub Ln00 (e.g., between L100 and L060 and then from L060 to L200).

   o  Every monitored link is passed only once by a single round-trip
      measurement loop, and further only once unidirectionally by each
      of two other loops.  These unidirectional measurement loop
      sections forward packets in opposing directions along the
      monitored link.  In the end, three measurement loops pass each
      single monitored link (sub-path).  In Figure 1, e.g., one
      measurement loop makes a round trip L100 to L050 and back (M1, see
      below), a second loop passes L100 to L050 only (M3) and a third
      loop passes L050 to L100 only (M6).
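The coverage rule above can be checked mechanically.  The following Python sketch is illustrative only; it encodes the hop lists of the six measurement loops M1 - M6 of the example (their full SR paths are enumerated below) and classifies, per monitored link, which loop passes it round trip and which two pass it in one direction each:

```python
# Illustrative check of the overlay design rule: every monitored link is
# passed round trip by exactly one loop and once in each direction by
# two other loops.  Hop lists follow the example topology of Figure 1.
LOOPS = {
    "M1": [("L100", "L050"), ("L050", "L100"), ("L100", "L060"), ("L060", "L200")],
    "M2": [("L100", "L060"), ("L060", "L100"), ("L100", "L070"), ("L070", "L200")],
    "M3": [("L100", "L070"), ("L070", "L100"), ("L100", "L050"), ("L050", "L200")],
    "M4": [("L200", "L050"), ("L050", "L200"), ("L200", "L060"), ("L060", "L100")],
    "M5": [("L200", "L060"), ("L060", "L200"), ("L200", "L070"), ("L070", "L100")],
    "M6": [("L200", "L070"), ("L070", "L200"), ("L200", "L050"), ("L050", "L100")],
}

def coverage(hub, spoke):
    """Return ([round-trip loops], [hub->spoke only], [spoke->hub only])."""
    down, up = (hub, spoke), (spoke, hub)
    both = sorted(m for m, h in LOOPS.items() if down in h and up in h)
    d_only = sorted(m for m, h in LOOPS.items() if down in h and up not in h)
    u_only = sorted(m for m, h in LOOPS.items() if up in h and down not in h)
    return both, d_only, u_only

# Each monitored link must be covered by exactly 1 + 1 + 1 loops:
for hub in ("L100", "L200"):
    for spoke in ("L050", "L060", "L070"):
        assert all(len(part) == 1 for part in coverage(hub, spoke))
```

For the link L100 - L050 this yields M1 round trip, M3 in "downlink" and M6 in "uplink" direction, matching the description above.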
   Note that any 6 links connecting two to five nodes can be monitored
   that way too.  Further note that the measurement loop overlay chosen
   is optimised for 6 links and a hub and spoke topology of two to five
   nodes.  The 'one measurement loop per measured sub-path' paradigm
   only works under these conditions.

   The above overlay scheme results in 6 measurement loops for the given
   example.  The start and end of each measurement loop is PMS to L300
   to L100 or L200, and a similar sub-path on the return leg.  These
   parts of the measurement loops are omitted here for brevity (some
   discussion may be found below).  The following delays are measured
   along the SR paths of each measurement loop:

   1.  M1 is the delay along L100 -> L050 -> L100 -> L060 -> L200

   2.  M2 is the delay along L100 -> L060 -> L100 -> L070 -> L200

   3.  M3 is the delay along L100 -> L070 -> L100 -> L050 -> L200

   4.  M4 is the delay along L200 -> L050 -> L200 -> L060 -> L100

   5.  M5 is the delay along L200 -> L060 -> L200 -> L070 -> L100

   6.  M6 is the delay along L200 -> L070 -> L200 -> L050 -> L100

   An example of a loop segment stack consisting of Node-SID segments
   allowing capture of M1 is (top to bottom): 100 | 050 | 100 | 060 |
   200 | PMS.

   An example of a stack of Adj-SID segments for the loop resulting in
   M1 is (top to bottom): 100 | 10050 | 50100 | 10060 | 60200 | PMS.  As
   can be seen, the Node-SIDs 100 and PMS are present at the top and
   bottom of the segment stack.  Their purpose is to transport the
   packet from the PMS to the start of the measurement loop at L100 and
   to return it to the PMS from its end.
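The Adj-SID stack above can be derived mechanically from a loop's hop list.  A minimal Python sketch, assuming the illustrative SID numbering of this example (the Adj-SID digits are formed by concatenating the two node numbers; real Adj-SIDs are assigned independently by each node):

```python
# Hypothetical segment stack construction for a measurement loop, using
# the illustrative Adj-SID numbering of Figure 1 (e.g. Adj-SID 10050 is
# the adjacency L100 -> L050, i.e. the concatenated node numbers).
def node_num(node: str) -> int:
    # "L050" -> 50, "L100" -> 100
    return int(node.lstrip("L"))

def adj_sid(src: str, dst: str) -> int:
    # Illustrative numbering only; not a real Adj-SID allocation scheme.
    return int(f"{node_num(src)}{node_num(dst)}")

def loop_stack(hops, ingress_node_sid, pms_node_sid="PMS"):
    """Label stack, top to bottom: Node-SID of the ingress hub, one
    Adj-SID per monitored hop, and a Node-SID returning to the PMS."""
    return [ingress_node_sid] + [adj_sid(a, b) for a, b in hops] + [pms_node_sid]

m1_hops = [("L100", "L050"), ("L050", "L100"), ("L100", "L060"), ("L060", "L200")]
print(loop_stack(m1_hops, 100))  # [100, 10050, 50100, 10060, 60200, 'PMS']
```

The printed stack reproduces the M1 Adj-SID example of the text.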
   Evaluation of the measurement loop round-trip delays M1 - M6 allows
   detection of the following state changes of the monitored sub-paths:

   o  If the loops are set up using Node-SIDs only, any single complete
      loss of connectivity caused by a failing single link between any
      Ln00 and any L0m0 node briefly disturbs three loops (and changes
      their measured delays).  The traffic to the Node-SIDs is rerouted
      (in the case of a single link's loss, no node is completely
      disconnected in the example network).

   o  If the loops are set up using Adj-SIDs only, any single complete
      loss of connectivity caused by a failing single link between any
      Ln00 and any L0m0 node terminates the traffic along three
      measurement loops.  The packets of all three loops will be dropped
      until the link gets back into service.  Traffic to Adj-SIDs is not
      rerouted.  Note that Node-SIDs may be used to forward the
      measurement packets from the PMS to the hub node where the first
      sub-path to be monitored begins, and from the hub node receiving
      the measurement packet from the last monitored sub-path back to
      the PMS.

   o  Any congested single interface between any Ln00 and any L0m0 node
      impacts the measured delay of only two measurement loops.

   o  As an example, the formula for a single link (sub-path) Round Trip
      Delay (RTD) is:

         4 * RTD_L100-L050-L100 = 3*M1 + M3 + M6 - M2 - M4 - M5

      This formula is reproducible for all other links: sum three times
      the RTD measured along the loop passing the monitored link of
      interest in round-trip fashion, and add the RTDs of the two
      measurement loops passing the link of interest in a single
      direction only.  From this sum subtract the RTDs measured on all
      loops not passing the monitored link of interest.  The result is
      four times the RTD of the monitored link of interest.
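The per-link RTD calculation generalises as sketched below.  This is illustrative only: the loop-to-link mapping is the one implied by Figure 1 and the loop list above, the delay values are invented, and per-direction link delays are assumed symmetric (the text above notes that asymmetry introduces a small residual error):

```python
# Sketch: recover each monitored link's RTD from the six loop delays.
# For every link, LINK_LOOPS names (round-trip loop, hub->spoke loop,
# spoke->hub loop), per the Figure 1 overlay.
LINK_LOOPS = {
    ("L100", "L050"): ("M1", "M3", "M6"),
    ("L100", "L060"): ("M2", "M1", "M4"),
    ("L100", "L070"): ("M3", "M2", "M5"),
    ("L200", "L050"): ("M4", "M6", "M3"),
    ("L200", "L060"): ("M5", "M4", "M1"),
    ("L200", "L070"): ("M6", "M5", "M2"),
}

def link_rtd(link, delays):
    """4 * RTD = 3*M_roundtrip + M_down + M_up - sum(remaining loops)."""
    rt, down, up = LINK_LOOPS[link]
    others = set(delays) - {rt, down, up}
    return (3 * delays[rt] + delays[down] + delays[up]
            - sum(delays[m] for m in others)) / 4.0

# Invented loop delays, consistent with symmetric per-direction link
# delays of 5 (L100-L050), 7, 9, 4, 6 and 8 (L200-L070) per direction:
delays = {"M1": 23.0, "M2": 31.0, "M3": 27.0, "M4": 21.0, "M5": 29.0, "M6": 25.0}
print(link_rtd(("L100", "L050"), delays))  # 10.0, i.e. the L100-L050 RTD
```

For the L100 - L050 link this reproduces the formula of the last bullet: (3*23 + 27 + 25 - 31 - 21 - 29) / 4 = 10.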
   A closer look reveals that each single event of interest for the
   proposed metric (a loss of connectivity or a case of congestion)
   impacts only a single set of measurement loops, which can be
   determined a priori.  If, e.g., connectivity is lost between L200 and
   L050, measurement loops (3), (4) and (6) indicate a change in the
   measured delay.

   As a second example: if the interface L070 to L100 is congested,
   measurement loops (3) and (5) indicate a change in the measured
   delay.  Without listing all events: every single loss of connectivity
   and every single congestion event influences the delay measurements
   of a unique set of measurement loops only.

   Assume that the measurement loops are set up while there's no
   congestion.  In that case, the congestion-free RTDs of all monitored
   links can be calculated as shown above.  A single congestion event
   adds queuing delay to the RTDs measured by two specific measurement
   loops.  The two impacted measurement loops allow identification of
   the congested interface and calculation of the queue depth in terms
   of seconds.  As an example, assume a queue of an average depth of 20
   ms builds up at interface L200 to L070 after the uncongested
   measurement interval T0.  The measurement loops M5 and M6 are the
   only ones passing the interface in that direction.  Both M5 and M6
   indicate a congestion delay of +20 ms during measurement interval T1,
   while M1 - M4 indicate no change.  The location of the congested
   interface is determined by the combination of the two (and only two)
   measurement loops M5 and M6 showing an increased delay.  The average
   queue depth = ( M5[T1] - M5[T0] + M6[T1] - M6[T0] ) / 2.

   As mentioned, there's a constant delay added for each measurement
   loop, which is the delay of the path traversed from PMS -> L100 plus
   L200 -> PMS.
   Please note that this added delay appears twice in the formula
   resulting in the monitored link delay estimate of the example
   network; it then amounts to RTD(PMS -> L100) + RTD(L200 -> PMS).
   Both RTDs can be directly measured by two additional measurements,
   Cor1 = RTD(PMS -> L100 -> PMS) and Cor2 = RTD(PMS -> L200 -> PMS).
   With the uncorrected monitored link RTD formula from above,
   4*linkRTDx_uncor = 3*Mx + My + Mz - Ms - Mt - Mu, the corrected value
   is 4*linkRTDx = 4*linkRTDx_uncor - Cor1 - Cor2.

   If the interface between the PMS and L100/L200 is congested, all
   measurement loops M1 - M6 as well as Cor1 and Cor2 will see a change.
   A congested interface of a monitored link doesn't impact the RTDs
   captured by Cor1 and Cor2.

   The measurement loops may also be set up between the hub nodes L100
   and L200, if that's preferred and supported by the nodes.  In that
   case, the above formulas apply without correction.

3.  Network topology requirements

   The metric and methods specified below can be applied in networks
   with a hub and spoke topology.  A single network change of type loss
   of connectivity or congestion can be detected.  The nodes don't have
   to be dedicated hub or spoke devices; this is just a topology
   requirement.  In detail, the topology MUST meet the following
   constraints:

   o  The SR domain sub-paths to be monitored create a hub and spoke
      topology with a PMS connected to all hub nodes.  The PMS may
      reside in a hub.

   o  Exactly 6 (six) sub-paths are monitored.

   o  The monitored sub-paths connect at least two and no more than five
      nodes.

   o  Every spoke node MUST have at least one path to every hub node.

   o  Every spoke node MUST be connected to at least one (or more) hub
      node(s) by two monitored sub-paths.

   o  Sub-paths between spokes can't be monitored and therefore are out
      of scope (the overlay measurement loops can't be set up as
      desired).
   Shared resources, like a Shared Risk Link Group (e.g., a single fiber
   bundle) or a shared queue passed by several logical links, need to be
   considered during set-up.  Shared resources may either be desired or
   to be avoided.  As an example, if a set of logical links shares one
   parental scheduler queue, it is sufficient to monitor a single
   logical connection to monitor the state of that parental scheduler.

4.  Singleton Definition for Type-P-SR-Path-Connectivity-and-Congestion

4.1.  Metric Name

   Type-P-SR-SubPath-Connectivity

4.2.  Metric Parameters

   o  Src, the IP address of a source host

   o  Dst, the IP address of a destination host if IP routing is
      applicable; in the case of MPLS routing, a diagnostic address as
      specified by [RFC8029]

   o  T, a time

   o  L, a packet length in bits.  The packets of a Type-P packet stream
      from which the sample Path-Connectivity-and-Congestion metric is
      taken MUST all be of the same length.

   o  MLA, a stack of Segment IDs determining a Monitoring Loop.  The
      Segment IDs MUST be chosen so that a singleton type-p packet
      passes one single monitored sub-path_a bidirectionally, one
      monitored sub-path_b unidirectionally and one monitored sub-path_c
      unidirectionally, where sub-path_a, -_b and -_c MUST NOT be
      identical and MUST NOT share properties to be monitored.

   o  P, the specification of the packet type, over and above the source
      and destination addresses

   o  DS, a constant time interval between two type-P packets, in units
      of seconds

4.3.  Metric Units

   A sequence of consecutive time values.

4.4.  Definition

   A moving average of AV time values per measurement path is compared
   by a change point detection algorithm.  The temporal packet spacing
   value DS represents the smallest period within which a change in
   connectivity or congestion may be detected.
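As an illustration of this definition, a minimal moving-average change point test might look as follows.  This is a sketch only: the window size AV, the threshold value and the 2-vs-3-path classification are assumptions drawn from the framework of Section 2, not a normative algorithm:

```python
# Illustrative sketch: compare the moving average of the newest AV delay
# samples per measurement path against the preceding AV samples, then
# map the set of changed paths to an event type (two changed paths
# suggest congestion, three suggest a loss of connectivity).
def changed(samples, av=5, threshold=0.005):
    """True if the two adjacent AV-sample means differ by more than
    threshold (seconds)."""
    if len(samples) < 2 * av:
        return False
    recent = sum(samples[-av:]) / av
    baseline = sum(samples[-2 * av:-av]) / av
    return abs(recent - baseline) > threshold

def classify(delay_series, av=5, threshold=0.005):
    """delay_series maps a measurement path name to its delay samples."""
    hit = {m for m, s in delay_series.items() if changed(s, av, threshold)}
    if len(hit) == 2:
        return ("congestion", hit)
    if len(hit) == 3:
        return ("connectivity-change", hit)
    return ("no-single-event", hit)
```

With the Section 2 example, delay series in which only M5 and M6 jump by 20 ms would classify as congestion located at the one interface those two loops share.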
   A single loss of connectivity of a sub-path between two nodes affects
   three different measurement paths.  Depending on the value chosen for
   DS, packet loss might occur (note that the moving average evaluation
   needs to span a longer period than the convergence time;
   alternatively, packet loss visible along the three measurement paths
   may serve as an evaluation criterion).  After routing convergence,
   the type-p packets along the three measurement paths show a change in
   delay.

   Congestion of a single interface of a sub-path connecting two nodes
   affects two different measurement paths.  The type-p packets along
   the two congested measurement paths show an additional change in
   delay.

4.5.  Discussion

   Detection of multiple losses of monitored sub-path connectivity, or
   of congestion of multiple monitored sub-paths, may be possible.
   These cases have not been investigated, but may occur in the case of
   Shared Risk Link Groups.  Monitoring Shared Risk Link Groups and sub-
   paths with multiple failures and congestion is not within the scope
   of this document.

4.6.  Methodologies

   For the given type-p, the methodology is as follows:

   o  The set of measurement paths MUST be routed in a way that each
      single loss of connectivity and each case of single interface
      congestion on one of the sub-paths passed by a type-p packet
      creates a unique pattern: the type-p packets belonging to a
      specific subset of all configured measurement paths indicate a
      change in the measured delay.  As a minimum, each sub-path to be
      monitored MUST be passed

      *  by one measurement_path_1 and its type-p packet bidirectionally

      *  by one measurement_path_2 and its type-p packet in "downlink"
         direction

      *  by one measurement_path_3 and its type-p packet in "uplink"
         direction

   o  "Uplink" and "Downlink" have no architectural relevance.
      The terms are chosen to express that the packets of
      measurement_path_2 and measurement_path_3 pass the monitored sub-
      path unidirectionally in opposing directions.
      Measurement_path_1, measurement_path_2 and measurement_path_3 MUST
      NOT be identical.

   o  All measurement paths SHOULD terminate between identical sender
      and receiver interfaces.  It is recommended to connect the sender
      and receiver as closely to the paths to be monitored as possible.
      Each intermediate sub-path between the sender and receiver on the
      one hand and the sub-paths to be monitored on the other is an
      additional source of errors requiring separate monitoring.

   o  Segment Routed domains supporting Node- and Adj-SIDs should enable
      the monitoring path set-up as specified.  Other routing protocols
      may be used as well, but the monitoring path set-up might be
      complex or impossible.

   o  Pre-compute how the two- and three-measurement-path delay changes
      correlate to sub-path connectivity and congestion patterns.
      Absolute change values aren't required; a simultaneous change of
      two or three particular measurement paths is.

   o  Ensure that the temporal resolution of the measurement clock
      allows reliable capture of a unique delay value for each
      configured measurement path while sub-path connectivity is
      complete and no congestion is present.

   o  Synchronised clocks are not strictly required, as the metric
      evaluates differences in delay.  Changes in clock synchronisation
      SHOULD NOT be close to the time interval within which changes in
      connectivity or congestion should be monitored.

   o  At the Src host, select Src and Dst IP addresses, and address
      information to route the type-p packet along one of the configured
      measurement paths.  Form a test packet of Type-P with these
      addresses.

   o  Configure the Dst host access to receive the packet.
   o  At the Src host, place a timestamp, a sequence number and a unique
      identifier of the measurement path in the prepared Type-P packet,
      and send it towards Dst.

   o  Capture the one-way delay and determine packet loss by the metrics
      specified by [RFC7679] and [RFC7680], respectively, and store the
      result for the path.

   o  If two or three sub-paths indicate a change in delay, report a
      change in connectivity or congestion status as pre-computed above.

   Note that monitoring 6 sub-paths requires setting up 6 monitoring
   paths as shown in the figure above.

4.7.  Errors and Uncertainties

   Sources of error are:

   o  Measurement paths whose delays don't indicate a change after sub-
      path connectivity changed.

   o  A timestamp resolution which is too coarse or inaccurate for the
      delays measured along the different monitoring paths.

   o  Multiple simultaneous occurrences of sub-path connectivity loss
      and congestion.

   o  Loss of connectivity and congestion along sub-paths connecting the
      measurement device(s) with the sub-paths to be monitored.

4.8.  Reporting the Metric

   The metric reports loss of connectivity of a monitored sub-path or
   congestion of an interface, and identifies the sub-path and, in the
   case of congestion, the direction of traffic.

   The temporal resolution of the detected events depends on the spacing
   interval of the packets transmitted per measurement path.  An
   identical sending interval is chosen for every measurement path.  As
   a rule of thumb, an event is reliably detected if a sample consists
   of at least 5 probes indicating the same underlying change in
   behavior.  Depending on the underlying event, either two or three
   measurement paths are impacted.
   At least two consecutively received measurement packets per
   measurement path should suffice to indicate a change.  The values
   chosen for an operational network will have to reflect the
   scalability constraints of a PMS measurement interface.  As an
   example, a PMS may work reliably if no more than one measurement
   packet is transmitted per millisecond.  Further, the measurement is
   configured so that the measurement packets return to the sender
   interface.  Assume that groups of 6 links are always monitored, as
   described above, by 6 measurement paths.  If one packet is sent per
   measurement path within 500 ms, up to 498 links can be monitored with
   a reliable temporal resolution of roughly one second per detected
   event.

   Note that the per-group measurement packet spacing, the measurement
   loop delay differences and the latency caused by congestion impact
   the reporting interval.  If each measurement path of a single 6-link
   monitoring group is addressed in consecutive milliseconds (within the
   500 ms interval), and the sum of the maximum physical delay of the
   per-group measurement paths and the latency possibly added by
   congestion is below 490 ms, the one-second reports reliably capture 4
   packets of two different measurement paths if two measurement paths
   are congested, or 6 packets of three different measurement paths if a
   link is lost.

   A variety of reporting options exist, if scalability issues and
   network properties are respected.

5.  Singleton Definition for Type-P-SR-Path-Round-Trip-Delay-Estimate

   This section will be added in a later version, if there's interest in
   picking up this work.

6.  IANA Considerations

   If standardised, the metric will require an entry in the IPPM metric
   registry.

7.  Security Considerations

   This draft specifies how to use methods specified or described within
   [RFC8402] and [RFC8403].  It does not introduce new or additional SR
   features.
   The security considerations of both references apply here too.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC2678]  Mahdavi, J. and V. Paxson, "IPPM Metrics for Measuring
              Connectivity", RFC 2678, DOI 10.17487/RFC2678,
              September 1999, <https://www.rfc-editor.org/info/rfc2678>.

   [RFC7679]  Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton,
              Ed., "A One-Way Delay Metric for IP Performance Metrics
              (IPPM)", STD 81, RFC 7679, DOI 10.17487/RFC7679,
              January 2016, <https://www.rfc-editor.org/info/rfc7679>.

   [RFC7680]  Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton,
              Ed., "A One-Way Loss Metric for IP Performance Metrics
              (IPPM)", STD 82, RFC 7680, DOI 10.17487/RFC7680,
              January 2016, <https://www.rfc-editor.org/info/rfc7680>.

   [RFC8029]  Kompella, K., Swallow, G., Pignataro, C., Ed., Kumar, N.,
              Aldrin, S., and M. Chen, "Detecting Multiprotocol Label
              Switched (MPLS) Data-Plane Failures", RFC 8029,
              DOI 10.17487/RFC8029, March 2017,
              <https://www.rfc-editor.org/info/rfc8029>.

   [RFC8402]  Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
              Decraene, B., Litkowski, S., and R. Shakir, "Segment
              Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
              July 2018, <https://www.rfc-editor.org/info/rfc8402>.

   [RFC8667]  Previdi, S., Ed., Ginsberg, L., Ed., Filsfils, C.,
              Bashandy, A., Gredler, H., and B. Decraene, "IS-IS
              Extensions for Segment Routing", RFC 8667,
              DOI 10.17487/RFC8667, December 2019,
              <https://www.rfc-editor.org/info/rfc8667>.

8.2.  Informative References

   [CommodityTomography]
              Lakhina, A., Papagiannaki, K., Crovella, M., Diot, C.,
              Kolaczyk, E., and N. Taft, "Structural analysis of network
              traffic flows", 2004.

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330,
              DOI 10.17487/RFC2330, May 1998,
              <https://www.rfc-editor.org/info/rfc2330>.

   [RFC8403]  Geib, R., Ed., Filsfils, C., Pignataro, C., Ed., and N.
              Kumar, "A Scalable and Topology-Aware MPLS Data-Plane
              Monitoring System", RFC 8403, DOI 10.17487/RFC8403,
              July 2018, <https://www.rfc-editor.org/info/rfc8403>.

Author's Address

   Ruediger Geib (editor)
   Deutsche Telekom
   Heinrich Hertz Str. 3-7
   Darmstadt  64295
   Germany

   Phone: +49 6151 5812747
   Email: Ruediger.Geib@telekom.de