spring                                                      R. Geib, Ed.
Internet-Draft                                          Deutsche Telekom
Intended status: Informational                               C. Filsfils
Expires: January 3, 2015                                    C. Pignataro
                                                                N. Kumar
                                                     Cisco Systems, Inc.
                                                            July 2, 2014


  Use case for a scalable and topology aware MPLS data plane monitoring
                                 system
                    draft-geib-spring-oam-usecase-02

Abstract

   This document describes features and a use case of a path monitoring
   system.
   Segment based routing enables a scalable and simple method to
   monitor data plane liveliness of the complete set of paths belonging
   to a single domain.  Compared with legacy MPLS ping and path trace,
   MPLS topology awareness reduces management and control plane
   involvement of OAM measurements while enabling new OAM features.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 3, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  An MPLS topology aware path monitoring system
   3.  SR based OAM use case illustration
     3.1.  Use-case 1 - LSP dataplane liveliness detection and
           monitoring
     3.2.  Use-case 2 - Monitoring a remote bundle
     3.3.  Use-Case 3 - Fault localization
   4.  Failure Notification from PMS to LERi
   5.  Applying SR to monitor LDP paths
   6.  PMS monitoring of different Segment ID types
   7.  Connectivity Verification using PMS
   8.  Extensions of related standards
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgement
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   It is essential for a network operator to monitor all the forwarding
   paths observed by the transported user packets.  The monitoring flow
   must be forwarded in the data plane in a way similar to user packets.
   Problem localization is required.

   This document describes a solution to this problem statement and
   illustrates it with use-cases.

   The solution is described for a single IGP MPLS domain.

   The solution applies to monitoring of LDP LSPs as well as to
   monitoring of Segment Routed LSPs.  Segment Routing simplifies the
   solution by the use of IGP-based signalled segments as specified by
   [ID.sr-isis].  Thus a centralised monitoring unit is MPLS topology
   aware in a Segment Routed domain and this topology awareness is used
   for OAM purposes.
   The MPLS path monitoring system described by this document can also
   be realised with technology predating Segment Routing (SR).  Making
   such a monitoring system aware of a domain's complete MPLS topology
   then requires, e.g., management plane access.  To avoid the use of
   stale MPLS label information, the IGP must be monitored and the MPLS
   topology must be aligned with the IGP topology in a timely manner.
   Enhancing IGPs to exchange MPLS topology information significantly
   simplifies and stabilises such an MPLS path monitoring system.

   This document adopts the terminology and framework described in
   [ID.sr-archi].  It further adopts the editorial simplification
   explained in Section 1.2 of the segment routing use-cases
   [ID.sr-use].

   The proposed solution offers several benefits for network monitoring.
   A single centralized monitoring device is able to monitor the
   complete set of a domain's forwarding paths.  OAM packets never leave
   the data plane.  Legacy path trace is still required.  In addition to
   Segment Routing related IGP extensions, RFC 4379 features should also
   be extended to support detection of SR routed paths.  They should
   further be enhanced to support all deployed IP/MPLS entropy options.
   In an IPv6 domain, an MPLS like tree trace functionality is
   desirable.

   Faults can be localized:

   o  by IGP LSA analysis.

   o  by correlation between different probes.

   o  by MPLS traceroute and adapted ping messages.

   The proposed solution requires topology awareness as well as a
   suitable security architecture.  Topology awareness is an essential
   part of link state IGPs.  Adding MPLS topology awareness to an IGP
   speaking device hence enables a simple and scalable data plane
   monitoring mechanism.

   MPLS OAM offers flexible features to recognise and execute data paths
   of an MPLS domain.
   By utilising the ECMP related tool set of RFC 4379 [RFC4379], a
   segment based routing LSP monitoring system may:

   o  easily detect ECMP functionality and properties of paths at data
      level.

   o  construct monitoring packets executing desired paths also if ECMP
      is present.

   o  limit the MPLS label stack of an OAM packet to a minimum of 3
      labels.

   MPLS OAM supports detection and execution of ECMP paths quite well.
   This document is focused on MPLS path monitoring.

   Alternatively, any path may be executed by building suitable label
   stacks.  This allows path execution without ECMP awareness.

   The MPLS path monitoring system may be a specialised system residing
   at a single interface of the domain to be monitored.  As long as
   measurement packets return to this interface, or to another interface
   connecting to a specialised OAM system, the MPLS monitoring system is
   the single entity pushing monitoring packet label stacks.  Concerns
   about router label stack pushing capabilities don't apply in this
   case.

   First drafts discussing requirements, extensions of RFC 4379 and
   possible solutions to allow SR usage as described by this document
   are available, see [ID.sr-4379ext] and [ID.sr-oam_detect].

2.  An MPLS topology aware path monitoring system

   An MPLS path monitoring system (PMS) which is able to learn the IGP
   LSDB (including the SIDs) is able to build a measurement packet which
   executes any arbitrary chain of paths.  A node connected to an SR
   domain is MPLS topology aware (the node knows all related IP
   addresses, MPLS SIDs and labels).

   Let us describe how the PMS can check the liveliness of the MPLS
   transport path between LER i and LER j and then monitor it.

   The PMS may do so by sending packets carrying the following MPLS
   label stack information:

   o  Top Label: a path from PMS to LER i.  This is expressed as Node
      SID of LER i.

   o  Next Label: the path that needs to be monitored from LER i to LER
      j.  If this path is a single physical interface (or a bundle of
      connected interfaces), it can be expressed by the related
      Adjacency SID.  If the shortest path from LER i to LER j is
      supposed to be monitored, the Node SID of LER j can be used.
      Another option is to insert a list of segments expressing the
      desired path (hop by hop as an extreme case).  If LER i pushes a
      stack of labels based on an SR policy decision and this stack of
      LSPs is to be monitored, the PMS needs an interface to collect the
      information enabling it to address this SR created path.

   o  Next Label or address: the path back to the PMS.  Likely, no
      further segment/label is required here.  Indeed, once the packet
      reaches LER j, the 'steering' part of the solution is done and the
      probe just needs to return to the PMS.  This is best achieved by
      popping the MPLS stack and revealing a probe packet with the PMS
      as destination address (note that in this case, the source and
      destination addresses could be the same).  If an IP address is
      applied, no SID/label has to be assigned to the PMS (if it is a
      host/server residing in an IP subnet outside the MPLS domain).

   Note: if the PMS is an IP host not connected to the MPLS domain, the
   PMS can send its probe with the list of SIDs/labels onto a suitable
   tunnel providing MPLS access to a router which is part of the
   monitored MPLS domain.

3.  SR based OAM use case illustration

3.1.  Use-case 1 - LSP dataplane liveliness detection and monitoring

      +---+     +----+     +-----+
      |PMS|     |LSR1|-----|LER i|
      +---+     +----+     +-----+
        |      /      \    /
        |     /        \__/
      +-----+/           /|
      |LER m|           / |
      +-----+\         /  \
              \       /    \
               \+----+     +-----+
                |LSR2|-----|LER j|
                +----+     +-----+

     Example of a PMS based LSP dataplane liveness detection and
                              monitoring

                               Figure 1

   For the sake of simplicity, let's assume that all the nodes are
   configured with the same SRGB [ID.sr-archi], as described by Section
   1.2 of [ID.sr-use].

   Let's assign the following Node SIDs to the nodes of the figure: PMS
   = 10, LER i = 20, LER j = 30.

   The aim is to check liveliness of the path LER i to LER j and to
   monitor availability of that path afterwards.  The PMS does this by
   creating a measurement packet with the following label stack (top to
   bottom): 20 - 30 - 10.

   LER m forwards the packet received from the PMS to LSR1.  Assuming
   Penultimate Hop Popping to be deployed, LSR1 pops the top label and
   forwards the packet to LER i.  There the top label has the value 30
   and LER i forwards the packet to LER j.  This is done by
   transmitting the packet via LSR1 or LSR2.  The LSR will again pop the
   top label.  LER j will forward the packet, now carrying the top label
   10, to the PMS (and it will pass an LSR and LER m).

   A few observations on the example given in Figure 1:

   o  The path PMS to LER i must be available.  This path must be
      detectable, but it is usually sufficient to apply an SPF based
      path.

   o  If ECMP is deployed, it may be desired to measure along both
      possible paths which a packet may use between LER i and LER j.  To
      do so, in a first step the PMS sends MPLS OAM packets to execute a
      so called tree trace between LER i and LER j and stores the IP
      destination addresses required to execute each detected path.
      This method of dealing with load balancing paths requires the
      smallest label stacks if long term monitoring of paths is applied
      after the tree trace completion.

   o  The path LER j to PMS must be available.  This path must be
      detectable, but it is usually sufficient to apply an SPF based
      path.

   Once the MPLS paths (Node SIDs) and the required IP address
   information have been detected, the path LER i to LER j can be
   monitored by the PMS.  Monitoring doesn't require MPLS OAM
   functionality, it is purely based on forwarding.  To ensure reliable
   results, the PMS should be aware of any changes in IGP or MPLS
   topology.  Further, changes in ECMP functionality at LER i will
   impact results.  Either the PMS should be notified of such changes or
   they should be limited to planned maintenance.  After a topology
   change, MPLS OAM will be useful to detect the impact of the change.

   Determining a path to be executed prior to a measurement may also be
   done by setting up a label stack including all Node SIDs along that
   path (if LSR1 has Node SID 40 in the example and it should be passed
   between LER i and LER j, the label stack is 20 - 40 - 30 - 10).  The
   advantage of this method is that it does not involve MPLS OAM
   functionality and it is independent of ECMP functionalities.  The
   method is still able to monitor all link combinations of all paths of
   an MPLS domain.  If correct forwarding along the desired paths has to
   be checked, RFC 4379 functionality should be applied also in this
   case.

   Obviously, the PMS is able to check and monitor data plane liveliness
   of all LSPs in the domain.  The PMS may be a router, but could also
   be a dedicated monitoring system.  If measurement system reliability
   is an issue, more than a single PMS may be connected to the MPLS
   domain.
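
   The label stacks used in this use case can be sketched in a few
   lines of code.  The following Python fragment is only an
   illustration under stated assumptions: the function and variable
   names are hypothetical, the Node SID values are those of the
   example (40 being the value the text assumes for the intermediate
   node), and the fragment computes label stacks only, it does not
   build or send packets.

```python
from itertools import permutations

# Node SIDs as assigned in the example; 40 is the value the text assumes
# for the intermediate node of the explicit-path variant.
NODE_SID = {"PMS": 10, "LER i": 20, "LER j": 30, "LSR1": 40}


def probe_label_stack(ingress, egress, via=(), pms="PMS", sids=NODE_SID):
    """Return the probe's label stack, top label first.

    ingress/egress delimit the monitored path, via lists optional
    intermediate nodes, and the PMS Node SID closes the loop.
    """
    hops = [ingress, *via, egress, pms]
    return [sids[node] for node in hops]


def stacks_for_all_pairs(lers, sids=NODE_SID):
    """Map every ordered LER pair to the stack monitoring that direction."""
    return {
        (a, b): probe_label_stack(a, b, sids=sids)
        for a, b in permutations(lers, 2)
    }


if __name__ == "__main__":
    print(probe_label_stack("LER i", "LER j"))                  # [20, 30, 10]
    print(probe_label_stack("LER i", "LER j", via=("LSR1",)))  # [20, 40, 30, 10]
    print(stacks_for_all_pairs(["LER i", "LER j"]))
```

   Enumerating ordered pairs reflects that each direction of a path is
   monitored by its own probe.  With ECMP present, the SPF based stack
   alone does not select a specific path; that still requires the tree
   trace or the explicit hop list described above.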

   Monitoring an MPLS domain by a PMS based on SR offers the option of
   monitoring complete MPLS domains with little effort and excellent
   scalability.  Data plane failure detection by circulating monitoring
   packets can be executed at any time.  The PMS further executes MPLS
   OAM functions everywhere in the MPLS domain.  It does not require
   access to LSR/LER management interfaces to do so.  MPLS traceroutes
   as specified above should be executed only during off peak times (and
   then with limited parallel MPLS ping/trace load).

3.2.  Use-case 2 - Monitoring a remote bundle

      +---+     _    +--+                    +-------+
      |   |    { }   |  |---991---L1---662---|       |
      |PMS|--{     }-|R1|---992---L2---663---|R2 (72)|
      |   |    {_}   |  |---993---L3---664---|       |
      +---+          +--+                    +-------+

          SR based probing of all the links of a remote bundle

                               Figure 2

   R1 addresses Lx by the Adjacency SID 99x, while R2 addresses Lx by
   the Adjacency SID 66(x+1).

   In the above figure, the PMS needs to assess the dataplane
   availability of all the links within a remote bundle connected to
   routers R1 and R2.

   The monitoring system retrieves the SID/label information from the
   IGP LSDB and pushes the following segment list/label stack onto its
   IP probe (whose source and destination addresses are the address of
   the PMS): {72, 662, 992, 664}.

   The PMS sends the probe to its connected router.  If the connected
   router is not SR compliant, a tunneling technique can be used to
   tunnel the probe and its MPLS stack to the first SR router.  The
   MPLS/SR domain then forwards the probe to R2 (72 is the Node SID of
   R2).  R2 forwards the probe to R1 over link L1 (Adjacency SID 662).
   R1 forwards the probe to R2 over link L2 (Adjacency SID 992).  R2
   forwards the probe to R1 over link L3 (Adjacency SID 664).  R1 then
   forwards the IP probe to the PMS as per classic IP forwarding.

3.3.  Use-Case 3 - Fault localization

   In the previous example, a uni-directional fault on the middle link
   in the direction from R2 to R1 would be localized by sending the
   following two probes with respective segment lists:

   o  72, 662, 992, 664

   o  72, 663, 992, 664

   The first probe would succeed while the second would fail.
   Correlation of the measurements reveals that the only difference is
   the use of Adjacency SID 663 of the middle link from R2 to R1 in the
   unsuccessful measurement.  Assuming the second probe has been routed
   correctly, the fault must have occurred in R2, which didn't forward
   the packet to the interface identified by its Adjacency SID 663.

4.  Failure Notification from PMS to LERi

   On detecting any failure in the path liveliness, the PMS may use any
   out-of-band mechanism to signal the failure to LERi.  This document
   does not propose any specific mechanism and operators can choose any
   existing or new approach.

   Alternatively, the operator may log the failure in a local monitoring
   system and take the necessary action by manual intervention.

5.  Applying SR to monitor LDP paths

   An SR based PMS connected to an MPLS domain consisting of LERs and
   LSRs which all support SR and LDP in parallel may use SR paths to
   transmit packets to and from the start and end points of the LDP
   paths to be monitored.  In the example of Figure 1, the label stack
   top to bottom may be as follows, when sent by the PMS:

   o  Top: SR based Node SID of LER i at LER m.

   o  Next: LDP label identifying the path to LER j at LER i.

   o  Bottom: SR based Node SID identifying the path to the PMS at LER
      j.

   While the mixed operation shown here still requires the PMS to be
   aware of the LER LDP-MPLS topology, the PMS may learn the SR MPLS
   topology by IGP and use this information.

6.  PMS monitoring of different Segment ID types

   MPLS SR topology awareness should allow the PMS to monitor liveliness
   of most types of SIDs (this may not be recommendable if a SID
   identifies an inter domain interface).

   To match control plane information with data plane information,
   RFC 4379 should be enhanced to allow collection of the data needed to
   check all relevant types of Segment IDs.

7.  Connectivity Verification using PMS

   While the PMS based use cases explained in Section 3 are sufficient
   to provide a continuity check between LER i and LER j, they may not
   help to perform connectivity verification.  In some cases, like data
   plane programming corruption, it is possible that a transit node
   between LER i and LER j erroneously removes the top segment ID and
   forwards the packet to the PMS based on the bottom segment ID,
   leading to a falsified path liveliness indication at the PMS.

   There are various methods to perform basic connectivity verification,
   like intermittently setting the TTL to 1 in the bottom label so that
   LER j selectively performs connectivity verification.  A detailed
   explanation will be added in a later version.

8.  Extensions of related standards

   RFC 4379 functions should be extended to support Flow- and Entropy
   Label based ECMP.  Further, an RFC 4379 like functionality may be
   desirable for IPv6 networks.

9.  IANA Considerations

   This memo includes no request to IANA.

10.  Security Considerations

   As mentioned in the introduction, a PMS monitoring packet should
   never leave the domain where it originated.  It therefore should
   never use stale MPLS or IGP routing information.  Further, assigning
   different label ranges for different purposes may be useful.  A well
   known global service level range may be excluded from utilisation
   within PMS measurement packets.  These ideas shouldn't start a
   discussion.
   They rather should point out that such a discussion is required when
   SR based OAM mechanisms like a PMS are standardised.

11.  Acknowledgement

   The authors would like to thank Nobo Akiya for his contribution.

12.  References

12.1.  Normative References

   [RFC4379]  Kompella, K. and G. Swallow, "Detecting Multi-Protocol
              Label Switched (MPLS) Data Plane Failures", RFC 4379,
              February 2006.

12.2.  Informative References

   [ID.sr-4379ext]
              IETF, "Label Switched Path (LSP) Ping/Trace for Segment
              Routing Networks Using MPLS Dataplane", IETF,
              http://datatracker.ietf.org/doc/
              draft-kumar-mpls-spring-lsp-ping/, 2013.

   [ID.sr-archi]
              IETF, "Segment Routing Architecture", IETF,
              https://datatracker.ietf.org/doc/
              draft-filsfils-rtgwg-segment-routing/, 2013.

   [ID.sr-isis]
              IETF, "IS-IS Extensions for Segment Routing", IETF,
              http://datatracker.ietf.org/doc/
              draft-previdi-isis-segment-routing-extensions/, 2013.

   [ID.sr-oam_detect]
              IETF, "Detecting Multi-Protocol Label Switching (MPLS)
              Data Plane Failures in Source Routed LSPs", IETF,
              http://datatracker.ietf.org/doc/
              draft-kini-spring-mpls-lsp-ping/, 2013.

   [ID.sr-use]
              IETF, "Segment Routing Use Cases", IETF,
              http://datatracker.ietf.org/doc/
              draft-filsfils-rtgwg-segment-routing-use-cases/, 2013.

Authors' Addresses

   Ruediger Geib (editor)
   Deutsche Telekom
   Heinrich Hertz Str. 3-7
   Darmstadt, 64295
   Germany

   Phone: +49 6151 5812747
   Email: Ruediger.Geib@telekom.de


   Clarence Filsfils
   Cisco Systems, Inc.
   Brussels,
   Belgium

   Phone:
   Email: cfilsfil@cisco.com


   Carlos Pignataro
   Cisco Systems, Inc.
   7200 Kit Creek Road
   Research Triangle Park, NC 27709-4987
   US

   Email: cpignata@cisco.com


   Nagendra Kumar
   Cisco Systems, Inc.
   7200 Kit Creek Road
   Research Triangle Park, NC 27709
   US

   Email: naikumar@cisco.com