idnits 2.17.1

draft-ietf-rtgwg-mofrr-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 566 has weird spacing: '...lineaux cedex...'

  -- The document date (January 17, 2014) is 3752 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC5036' is defined on line 510, but no explicit
     reference was found in the text

  -- Obsolete informational reference (is this intentional?): RFC 4601
     (Obsoleted by RFC 7761)

     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2 Network Working Group                                           A. Karan
3 Internet-Draft                                               C. Filsfils
4 Intended status: Informational                              D. Farinacci
5 Expires: July 21, 2014                                 IJ. Wijnands, Ed.
6                                                      Cisco Systems, Inc.
7                                                              B. Decraene
8                                                           France Telecom
9                                                                U. Joorde
10                                                        Deutsche Telekom
11                                                                      W.
                                                              Henderickx
12                                                        Alcatel-Lucent
13                                                      January 17, 2014

15                      Multicast only Fast Re-Route
16                       draft-ietf-rtgwg-mofrr-03

18 Abstract

20    As IPTV deployments grow in number and size, service providers are
21    looking for solutions that minimize the service disruption due to
22    faults in the IP network carrying the packets for these services.
23    This draft describes a mechanism for minimizing packet loss in a
24    network when node or link failures occur. Multicast only Fast
25    Re-Route (MoFRR) works by making simple enhancements to multicast
26    routing protocols such as PIM and mLDP.

28 Status of this Memo

30    This Internet-Draft is submitted in full conformance with the
31    provisions of BCP 78 and BCP 79.

33    Internet-Drafts are working documents of the Internet Engineering
34    Task Force (IETF). Note that other groups may also distribute
35    working documents as Internet-Drafts. The list of current
36    Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

38    Internet-Drafts are draft documents valid for a maximum of six months
39    and may be updated, replaced, or obsoleted by other documents at any
40    time. It is inappropriate to use Internet-Drafts as reference
41    material or to cite them other than as "work in progress."

43    This Internet-Draft will expire on July 21, 2014.

45 Copyright Notice

47    Copyright (c) 2014 IETF Trust and the persons identified as the
48    document authors. All rights reserved.

50    This document is subject to BCP 78 and the IETF Trust's Legal
51    Provisions Relating to IETF Documents
52    (http://trustee.ietf.org/license-info) in effect on the date of
53    publication of this document. Please review these documents
54    carefully, as they describe your rights and restrictions with respect
55    to this document. Code Components extracted from this document must
56    include Simplified BSD License text as described in Section 4.e of
57    the Trust Legal Provisions and are provided without warranty as
58    described in the Simplified BSD License.
60 Table of Contents

62    1.  Introduction
63      1.1.  Conventions used in this document
64      1.2.  Terminology
65    2.  Basic Overview
66    3.  Upstream Multicast Hop Selection
67      3.1.  PIM
68      3.2.  mLDP
69    4.  Topologies for MoFRR
70      4.1.  Dual-Plane Topology
71    5.  Detecting Failures
72    6.  ECMP-mode MoFRR
73    7.  Non-ECMP-mode MoFRR
74    8.  Keep It Simple Principle
75    9.  Capacity Planning for MoFRR
76    10. Other Applications
77    11. Security Considerations
78    12. Acknowledgments
79    13. Contributor Addresses
80    14. References
81      14.1.  Normative References
82      14.2.  Informative References
83    Authors' Addresses

85 1.  Introduction

87    Multiple techniques have been developed and deployed to improve
88    service guarantees, both for multicast video traffic and Video on
89    Demand traffic. Most existing solutions are geared towards finding
90    an alternate path around one or more failed network elements (link,
91    node, path failures).

93    This draft describes a mechanism for minimizing packet loss in a
94    network when node or link failures occur. Multicast only Fast
95    Re-Route (MoFRR) works by making simple changes to the way selected
96    routers use multicast protocols such as PIM and mLDP. No changes to
97    the protocols themselves are required. With MoFRR, in many cases,
98    multicast routing protocols do not necessarily have to depend on or
99    wait for unicast routing protocols to detect network failures.

101    On a merge point, MoFRR logic determines a primary Upstream Multicast
102    Hop (UMH) and a secondary UMH and joins the tree via both
103    simultaneously. Data packets are received over the primary and
104    secondary paths. Only the packets from the primary UMH are accepted
105    and forwarded down the tree; the packets from the secondary UMH are
106    discarded. The UMH determination is different for PIM and mLDP and is
107    explained later in this document. When a failure is detected on the
108    path to the primary UMH, the repair occurs by changing the secondary
109    UMH into the primary and the primary into the secondary. Since the
110    repair is local, it is fast, greatly improving convergence times in
111    the event of node or link failures on the path to the primary UMH.

113 1.1.  Conventions used in this document

115    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
116    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
117    document are to be interpreted as described in RFC 2119 [RFC2119].

119 1.2.  Terminology

121    MoFRR : Multicast only Fast Re-Route.

123    ECMP : Equal Cost Multi-Path.

125    mLDP : Multi-point Label Distribution Protocol.

127    PIM : Protocol Independent Multicast.

129    UMH : Upstream Multicast Hop, a candidate next-hop that can be used
130       to reach the root of the tree.

132    tree : Either a PIM (S,G)/(*,G) tree or an mLDP P2MP or MP2MP LSP.

134    OIF : Outgoing InterFace, an interface used to forward multicast
135       packets down the tree towards the receivers.

138    LFA : Loop Free Alternate, a candidate UMH that can be used for the
139       secondary MoFRR path.

141 2.  Basic Overview

143    The basic idea of MoFRR is for a merge point router to join a
144    multicast tree via two divergent upstream paths in order to get
145    maximum redundancy. The two divergent paths SHOULD NOT merge
146    upstream; otherwise, the maximal redundancy is compromised. Sometimes
147    the topology guarantees maximal redundancy; other times, additional
148    configuration or techniques are needed to enforce it, as described
149    later in this document.

151    A merge point router should only accept and forward on one of the
152    upstream paths at a time in order to avoid duplicate packet
153    forwarding. The selection of the primary and secondary UMH is done
154    by the MoFRR logic and is normally based on unicast routing to find
155    loop-free candidates.

157    Note that the impact of the additional data on the network is
158    mitigated when tree membership is densely populated. When a part of
159    the network has redundant data flowing, join latency for new joining
160    members is reduced because it is likely that a tree merge point is
161    not far away.

163 3.  Upstream Multicast Hop Selection

165    An Upstream Multicast Hop (UMH) is a candidate next-hop that can be
166    used to reach the root of the tree. This is normally based on
167    unicast routing to find loop-free candidate(s). With MoFRR
168    procedures, we select a primary and a backup UMH. The procedures for
169    determining the UMH are different for PIM and mLDP. See below.

171 3.1.  PIM

173    The UMH selection in PIM is also known as the Reverse Path Forwarding
174    (RPF) procedure.
Based on a unicast route lookup on either the
175    Source address or Rendezvous Point (RP) [RFC4601], an upstream
176    interface is selected for sending the PIM Joins/Prunes and accepting
177    the multicast packets. The interface the packets are received on is
178    used to pass or fail the RPF check. If packets are received on an
179    interface that was not selected by the RPF procedure, or that is not
180    the primary, the packets are discarded.

182 3.2.  mLDP

184    The UMH selection in mLDP also depends on unicast routing, but the
185    difference with PIM is that the acceptance of multicast packets is
186    based on MPLS labels and independent of the interface the packet is
187    received on. Using the procedures as defined in [RFC6388], an
188    upstream Label Switched Router (LSR) is elected. The upstream LSR
189    that was elected for a Label Switched Path (LSP) gets a unique local
190    MPLS Label allocated. Multicast packets are only forwarded if the
191    MPLS label matches the MPLS label that was allocated for that LSP's
192    (primary) upstream LSR.

194 4.  Topologies for MoFRR

196    MoFRR works best in the topologies illustrated in the figures below.
197    MoFRR may be enabled on any router in the network. In the figures
198    below, MoFRR is shown enabled on the Provider Edge (PE) routers to
199    illustrate one way in which the technology may be deployed.

201 4.1.  Dual-Plane Topology

202                  S
203             P   / \ P
204                /   \
205         ^    G1     R1    ^
206         P   /        \    P
207            /          \
208           G2----------R2    ^
209           | \         | \   P
210      ^    |  \        |  \
211      P    |   G3----------R3
212           |    |      |    |
213           |    |      |    | ^
214           G4---|------R4   | P
215      ^      \  |        \  |
216      P       \ |         \ |
217               G5----------R5
218          ^     |           |    ^
219          P     |           |    P
220                |           |
221               Gi           Ri
222                \ \__    ^  /|
223                 \   \   S1/ |   ^
224                ^ \  ^\   /  |P2
225                P1 \ S2\_/__ |
226                    \   /   \|
227                     PE1     PE2
228           P = Primary path
229           S = Secondary path

231    FIG1. Two-Plane Network Design

233    The topology has two planes, a primary plane and a secondary plane
234    that are fully disjoint from each other all the way into the POPs.
235    This two-plane design is common in service provider networks as it
236    eliminates single points of failure in their core network. The links
237    marked P indicate the normal path of how the PIM joins flow from the
238    POPs towards the source of the network. Multicast streams,
239    especially for the densely watched channels, typically flow along
240    both the planes in the network anyway.

242    The only change MoFRR adds to this is on the links marked S, where
243    the PE routers join a secondary path to their secondary ECMP UMH. As
244    a result of this, each PE router receives two copies of the same
245    stream, one from the primary plane and the other from the secondary
246    plane. As a result of normal UMH behavior, the multicast stream
247    received over the primary path is accepted and forwarded to the
248    downstream receivers. The copy of the stream received from the
249    secondary UMH is discarded.

251    When a router detects a routing failure on the path to its
252    primary UMH, it will switch to the secondary UMH and accept packets
253    for that stream. If the failure is repaired, the router may switch
254    back. The primary and secondary UMHs have only local context and not
255    end-to-end context.

257    As one can see, MoFRR achieves faster convergence by pre-building
258    the secondary multicast tree and receiving the traffic on that
259    secondary path. The example discussed above is a simple case where
260    there are two ECMP paths from each PE device towards the source, one
261    along the primary plane and one along the secondary. In cases where
262    the topology is asymmetric or is a ring, this ECMP nature does not
263    hold, and additional rules have to be taken into account to choose
264    when and where to join the secondary path.

266    MoFRR is appealing in such topologies for the following reasons:

268    1.  Ease of deployment and simplicity: the functionality is only
269        required on the PE devices although it may be configured on all
270        routers in the topology.
Furthermore, each PE device can be
271        enabled separately. PEs not enabled for MoFRR do not see any
272        change or degradation. Inter-operability testing is not required
273        as there are no PIM or mLDP protocol changes.

275    2.  End-to-end failure detection and recovery: any failure along the
276        path from the source to the PE can be detected and repaired with
277        the secondary disjoint stream.

279    3.  Capacity Efficiency: as illustrated in the previous example, the
280        Multicast trees corresponding to IPTV channels cover the backbone
281        and distribution topology in a very dense manner. As a
282        consequence, the secondary paths graft into the normal Multicast
283        trees (i.e., trees signaled by PIM or mLDP without MoFRR extension)
284        at the aggregation level and hence do not demand any extra
285        capacity either on the distribution links or in the backbone.
286        They simply use the capacity that is normally used, without any
287        duplication. This is different from conventional FRR mechanisms
288        which often duplicate the capacity requirements (the backup path
289        crosses links/nodes which already carry the primary/normal tree
290        and hence twice as much capacity is required).

292    4.  Loop free: the secondary path join is sent on an ECMP disjoint
293        path. By definition, the neighbor receiving this request is
294        closer to the source and hence will not cause a loop.

296    The topology we just analyzed is very frequent and can be modeled as
297    per Fig 2. The PE has two ECMP disjoint paths to the source. Each
298    ECMP path uses a disjoint plane of the network.

300              Source
301              /    \
302          Plane1  Plane2
303             |      |
304             A1     A2
305              \    /
306                PE

308    FIG2. PE is dual-homed to Dual-Plane Backbone

310    Another frequent topology is described in Fig 3. PEs are grouped in
311    pairs. In each pair, each PE is connected to a different plane.
312    Each PE has one single shortest-path to a source (via its connected
313    plane). There is no ECMP like in Fig 2.
However, there is clearly a
314    way to provide MoFRR benefits as each PE can offer a disjoint
315    secondary path to the other plane PE (via the disjoint path).

317    The MoFRR secondary neighbor selection process needs to be extended
318    in this case as one cannot simply rely on using an ECMP path as
319    secondary neighbor. This extension is referred to as the non-ECMP
320    extension and is described later in the document.

322              Source
323              /    \
324          Plane1  Plane2
325             |      |
326             A1     A2
327             |      |
328            PE1----PE2

330    FIG3. PEs are connected in pairs to Dual-Plane Backbone

332 5.  Detecting Failures

334    Once the two paths are established, the next step is detecting a
335    failure on the primary path to know when to switch to the backup
336    path.

338    The first (and simplest) option is to detect when a directly
339    connected link that is used as MoFRR UMH goes down. This
340    option can be used in combination with the other options as
341    documented below. A 50 msec switchover is possible.

343    A second option consists of comparing the packets received on the
344    primary and secondary streams but only forwarding one of them -- the
345    first one received, no matter which interface it is received on.
346    Zero packet loss is possible for RTP-based streams.

348    A third option assumes a minimum known packet rate for a given data
349    stream. If no packet is received on the primary RPF interface within
350    the interval implied by that rate, the router assumes primary path
351    failure and switches to the secondary RPF interface. A 50 msec
352    switchover is possible.

353    A fourth option leverages the significant improvements of the IGP
354    convergence speed. When the primary path to the source is withdrawn
355    by the IGP, the MoFRR-enabled router switches over to the backup
356    path; the UMH is changed to the secondary UMH. Since the secondary
357    path is already in place, and assuming it is disjoint from the
358    primary path, convergence times would not include the time required
359    to build a new tree and hence are smaller.
Realistic availability
360    requirements (sub-second to sub-200 msec) should be achievable.

362 6.  ECMP-mode MoFRR

364    If the IGP installs two ECMP paths to the source and if the Multicast
365    tree is enabled for ECMP-Mode MoFRR, the router installs them as
366    primary and secondary UMH. Only packets received from the primary
367    UMH path are processed. Packets received from the secondary UMH are
368    dropped.

370    The selected primary UMH should be the same as if the MoFRR extension
371    were not enabled.

373    If more than two ECMP paths exist, two are selected as primary and
374    secondary UMH. Information from the IGP link-state topology could be
375    leveraged to optimize this selection.

377    Note that MoFRR does not restrict the number of UMH paths that are
378    joined. Implementations may use as many paths as are configured.

380 7.  Non-ECMP-mode MoFRR

381              Source S
382               /   \
383              /     \
384             Backbone
385            |        |
386            |        |
387            |        |
388            X--------N

390    Fig5. Non-ECMP-Mode MoFRR

392    X is configured for MoFRR for a Multicast tree
393    R(X) is the primary UMH to S for X
394    N is a neighbor of X
395    R(N) is the LFA UMH to S for X

397    Router X in FIG5 has one primary path R(X) and one secondary LFA path
398    R(N) to reach the source. Whether N is an LFA path from X to S is
399    determined by the procedures documented in [RFC5286]. A
400    router X configured for non-ECMP-mode MoFRR for a Multicast tree
401    joins a primary path to its primary UMH R(X) and a secondary path to
402    LFA UMH N. Router X MUST stop joining the secondary path if the
403    condition described below occurs.

405    Consider the example in FIG3. If PE1 and PE2 have received an IGMP
406    request for a Multicast tree, they will both join the primary path on
407    their plane and a secondary path to the neighbor PE.
If their
408    receivers leave at the same time, it could be possible for the
409    Multicast tree on PE1 and PE2 to never get deleted as the PEs refresh
410    each other via the secondary path joins (remember that a secondary
411    path join is not distinguishable from a primary join). In order to
412    prevent control-plane loops, a router MUST NOT set up a secondary
413    path to an LFA UMH if this UMH is the only member in the OIF list.

415 8.  Keep It Simple Principle

417    Many Service Providers devise their topology such that PEs have
418    disjoint paths to the multicast sources. MoFRR leverages the
419    existence of these disjoint paths without any PIM or mLDP protocol
420    modification. Interoperability testing is thus not required. In
421    such topologies, MoFRR only needs to be deployed on the PE devices.
422    Each PE device can be enabled one by one. PEs not enabled for MoFRR
423    do not see any change or degradation.

425    Multicast streams with tight SLA requirements are often characterized
426    by a continuous high packet rate (SD video has a continuous
427    interpacket gap of ~3 msec). MoFRR simply leverages the stream
428    characteristic to detect any failures along the primary branch and
429    switch over to the secondary branch in a few tens of msec.

431 9.  Capacity Planning for MoFRR

433    As for LFA FRR (draft-ietf-rtgwg-lfa-applicability-00), MoFRR
434    applicability is topology dependent.

436    In this document, we have described two very frequent designs (Fig 2
437    and Fig 3) which provide maximum MoFRR benefits.

439    Designers with topologies different from Fig 2 and Fig 3 can still
440    benefit from MoFRR thanks to the use of capacity planning tools.

442    Such tools are able to simulate the ability of each PE to build two
443    disjoint branches of the same tree, for hundreds of PEs and
444    hundreds of sources.

446    This allows the designer to assess the MoFRR protection coverage of
447    a given network, for a set of sources.
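
The coverage assessment described above can be illustrated with a short sketch. It is not taken from any capacity planning tool. It assumes the topology is given as a list of undirected links, interprets "disjoint" as node-disjoint, and uses hypothetical names (disjoint_path_count, mofrr_coverage). It counts disjoint PE-to-source paths with a unit-capacity max-flow over a node-split graph and reports the fraction of PEs that have at least two such paths.

```python
from collections import deque


def disjoint_path_count(links, pe, source, limit=2):
    """Count node-disjoint paths from pe to source, capped at `limit`."""
    cap = {}  # residual capacity per directed edge
    adj = {}  # adjacency, including residual (reverse) edges

    def add_edge(u, v, c):
        cap[(u, v)] = cap.get((u, v), 0) + c
        cap.setdefault((v, u), 0)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Split every router into an "in" and an "out" half joined by a
    # capacity-1 edge, so no two paths can share an intermediate router.
    for node in {n for link in links for n in link}:
        inner = limit if node in (pe, source) else 1
        add_edge((node, "in"), (node, "out"), inner)
    for a, b in links:  # each undirected link is usable in both directions
        add_edge((a, "out"), (b, "in"), 1)
        add_edge((b, "out"), (a, "in"), 1)

    start, goal = (pe, "in"), (source, "out")
    found = 0
    while found < limit:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {start: None}
        queue = deque([start])
        while queue and goal not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if goal not in parent:
            break
        v = goal
        while parent[v] is not None:  # push one unit of flow along the path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        found += 1
    return found


def mofrr_coverage(links, pes, source):
    """Return (fraction of protected PEs, list of protected PEs)."""
    protected = [pe for pe in pes
                 if disjoint_path_count(links, pe, source) >= 2]
    return len(protected) / len(pes), protected


if __name__ == "__main__":
    # Dual-plane example in the spirit of FIG2, plus a hypothetical
    # single-homed PE3 that hangs off A1 only.
    demo_links = [("S", "P1"), ("S", "P2"), ("P1", "A1"), ("P2", "A2"),
                  ("A1", "PE"), ("A2", "PE"), ("A1", "PE3")]
    print(mofrr_coverage(demo_links, ["PE", "PE3"], "S"))
```

A PE that is reported as unprotected is one for which a secondary branch would have to share a router with the primary branch. A real tool would also take IGP metrics and the actual UMH selection rules into account, which this sketch ignores.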

449    If the protection coverage is deemed insufficient, the designer can
450    use such a tool to optimize the topology (add links, change IGP
451    metrics).

453 10.  Other Applications

455    While all the examples in this document show the MoFRR applicability
456    on PE devices, it is clear that MoFRR could be enabled on aggregation
457    or core routers.

459    MoFRR may also be attractive in Data Center network configurations.
460    With the advent of lower-cost Ethernet and increasing port density in
461    routers, there is more meshed connectivity than ever before. When
462    using a 3-level design with access, distribution, and core layers in
463    a Data Center, there is a lot of inexpensive bandwidth connecting the
464    layers. This lends itself to more opportunities for ECMP paths at
465    multiple layers, which allows for multiple layers of redundancy
466    protecting against link and node failures at each layer with minimal
467    redundancy cost.

468    Redundancy costs are reduced because only one packet is forwarded at
469    every link along the primary and secondary data paths, so there is no
470    duplication of data on any link, thereby providing make-before-break
471    protection at a very small cost.

473    Alternate methods to detect failures such as MPLS-OAM or BFD may be
474    considered.

476    The MoFRR principle may be applied to MVPNs.

478 11.  Security Considerations

480    There are no security considerations for this design other than what
481    is already in the main PIM specification [RFC4601] and mLDP
482    specification [RFC6388].

484 12.  Acknowledgments

486    The authors would like to thank John Zwiebel, Greg Shepherd and Dave
487    Oran for their review of the draft.

489 13.  Contributor Addresses

491    Below is a list of other contributing authors in alphabetical order:

493    Nicolai Leymann
494    Deutsche Telekom
495    Winterfeldtstrasse 21
496    Berlin 10781
497    DE
498    Email: N.Leymann@telekom.de

500    Jeff Tantsura
501    Ericsson
502    300 Holger Way
503    San Jose CA 95134
504    USA

506 14.  References

508 14.1.
Normative References

510    [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
511               Specification", RFC 5036, October 2007.

513    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
514               Requirement Levels", BCP 14, RFC 2119, March 1997.

516    [RFC5286]  Atlas, A. and A. Zinin, "Basic Specification for IP Fast
517               Reroute: Loop-Free Alternates", RFC 5286, September 2008.

519 14.2.  Informative References

521    [RFC4601]  Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
522               "Protocol Independent Multicast - Sparse Mode (PIM-SM):
523               Protocol Specification (Revised)", RFC 4601, August 2006.

525    [RFC6388]  Wijnands, IJ., Minei, I., Kompella, K., and B. Thomas,
526               "Label Distribution Protocol Extensions for
527               Point-to-Multipoint and Multipoint-to-Multipoint Label
528               Switched Paths", RFC 6388, November 2011.

530 Authors' Addresses

532    Apoorva Karan
533    Cisco Systems, Inc.
534    3750 Cisco Way
535    San Jose CA, 95134
536    USA

538    Email: apoorva@cisco.com

540    Clarence Filsfils
541    Cisco Systems, Inc.
542    De kleetlaan 6a
543    Diegem BRABANT 1831
544    Belgium

546    Email: cfilsfil@cisco.com

548    Dino Farinacci
549    Cisco Systems, Inc.
550    425 East Tasman Drive
551    San Jose CA, 95134
552    USA

554    Email: dino@cisco.com

555    IJsbrand Wijnands (editor)
556    Cisco Systems, Inc.
557    De Kleetlaan 6a
558    Diegem 1831
559    BE

561    Email: ice@cisco.com

563    Bruno Decraene
564    France Telecom
565    38-40 rue du General Leclerc
566    Issy Moulineaux cedex 9, 92794
567    FR

569    Email: bruno.decraene@orange.com

571    Uwe Joorde
572    Deutsche Telekom
573    Hammer Str. 216-226
574    Muenster D-48153
575    DE

577    Email: Uwe.Joorde@telekom.de

579    Wim Henderickx
580    Alcatel-Lucent
581    Copernicuslaan 50
582    Antwerp 2018
583    Belgium

585    Email: wim.henderickx@alcatel-lucent.com