Network Working Group                                           A. Karan
Internet-Draft                                               C. Filsfils
Intended status: Informational                              D. Farinacci
Expires: April 5, 2013                                 IJ. Wijnands, Ed.
                                                     Cisco Systems, Inc.
                                                             B. Decraene
                                                          France Telecom
                                                               U. Joorde
                                                        Deutsche Telekom
                                                           W. Henderickx
                                                          Alcatel-Lucent
                                                        October 02, 2012


                      Multicast only Fast Re-Route
                       draft-ietf-rtgwg-mofrr-00

Abstract

   As IPTV deployments grow in number and size, service providers are
   looking for solutions that minimize the service disruption due to
   faults in the IP network carrying the packets for these services.
   This draft describes a mechanism for minimizing packet loss in a
   network when node or link failures occur.  Multicast only Fast
   Re-Route (MoFRR) works by making simple enhancements to multicast
   routing protocols such as PIM and mLDP.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 5, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions used in this document
     1.2.  Terminology
   2.  Basic Overview
   3.  Upstream Multicast Hop Selection
     3.1.  PIM
     3.2.  mLDP
   4.  Topologies for MoFRR
     4.1.  Dual-Plane Topology
   5.  Detecting Failures
   6.  ECMP-mode MoFRR
   7.  Non-ECMP-mode MoFRR
     7.1.  Variation
   8.  Keep It Simple Principle
   9.  Capacity Planning for MoFRR
   10. Other Applications
   11. Security Considerations
   12. Acknowledgments
   13. Contributing authors
   14. References
     14.1.  Normative References
     14.2.  Informative References
   Authors' Addresses

1.  Introduction

   Multiple techniques have been developed and deployed to improve
   service guarantees, both for multicast video traffic and Video on
   Demand traffic.  Most existing solutions are geared towards finding
   an alternate path around one or more failed network elements (link,
   node, path failures).

   This draft describes a mechanism for minimizing packet loss in a
   network when node or link failures occur.  Multicast only Fast
   Re-Route (MoFRR) works by making simple changes to the way selected
   routers use multicast protocols such as PIM and mLDP.  No changes to
   the protocols themselves are required.  With MoFRR, in many cases,
   multicast routing protocols do not have to depend on or wait for
   unicast routing protocols to detect network failures.

   On a merge point, MoFRR logic determines a primary Upstream Multicast
   Hop (UMH) and a secondary UMH and joins the tree via both
   simultaneously.  Data packets are received over the primary and
   secondary paths.  Only the packets from the primary UMH are accepted
   and forwarded down the tree; the packets from the secondary UMH are
   discarded.  The UMH determination is different for PIM and mLDP and
   is explained later in this document.  When a failure is detected on
   the path to the primary UMH, the repair occurs by changing the
   secondary UMH into the primary and the primary into the secondary.
   Since the repair is local, it is fast, greatly improving convergence
   times in the event of node or link failures on the path to the
   primary UMH.

1.1.  Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.2.  Terminology

   MoFRR :  Multicast only Fast Re-Route.

   ECMP :  Equal Cost Multi-Path.

   mLDP :  Multi-point Label Distribution Protocol.

   PIM :  Protocol Independent Multicast.

   UMH :  Upstream Multicast Hop, a candidate next-hop that can be used
      to reach the root of the tree.

   tree :  Either a PIM (S,G)/(*,G) tree or an mLDP P2MP or MP2MP LSP.

   OIF :  Outgoing InterFace, an interface used to forward multicast
      packets down the tree towards the receivers.  The tree is either
      a PIM (S,G)/(*,G) tree or an mLDP P2MP or MP2MP LSP.

2.  Basic Overview

   The basic idea of MoFRR is for a merge point router to join a
   multicast tree via two divergent upstream paths in order to get
   maximum redundancy.  The two divergent paths SHOULD never merge
   upstream; otherwise, the maximal redundancy is compromised.
   Sometimes the topology guarantees maximal redundancy; at other times,
   additional configuration or techniques are needed to enforce it, as
   discussed later in this document.

   A merge point router should only accept and forward on one of the
   upstream paths at a time in order to avoid duplicate packet
   forwarding.  The selection of the primary and secondary UMH is done
   by the MoFRR logic and is normally based on unicast routing to find
   loop-free candidates.

   Note that the impact of the additional data on the network is
   mitigated when tree membership is densely populated.  When a part of
   the network has redundant data flowing, join latency for newly
   joining members is reduced because it is likely that a tree merge
   point is not far away.

3.  Upstream Multicast Hop Selection

   An Upstream Multicast Hop (UMH) is a candidate next-hop that can be
   used to reach the root of the tree.  This is normally based on
   unicast routing to find loop-free candidate(s).  With MoFRR
   procedures, we select a primary and a backup UMH.  The procedures for
   determining the UMH are different for PIM and mLDP, as described
   below.

3.1.  PIM

   The UMH selection in PIM is also known as the Reverse Path Forwarding
   (RPF) procedure.
   Based on a unicast route lookup on either the Source address or
   Rendezvous Point (RP) [RFC4601], an upstream interface is selected
   both for sending the PIM Joins/Prunes and for accepting the multicast
   packets.  The interface the packets are received on is used to pass
   or fail the RPF check.  If packets are received on an interface that
   was not selected by the RPF procedure, or that is not the primary,
   the packets are discarded.

3.2.  mLDP

   The UMH selection in mLDP also depends on unicast routing, but the
   difference with PIM is that the acceptance of multicast packets is
   based on MPLS labels and is independent of the interface the packet
   is received on.  Using the procedures defined in [RFC6388], an
   upstream Label Switched Router (LSR) is elected.  The upstream LSR
   that was elected for a Label Switched Path (LSP) gets a unique local
   MPLS label allocated.  Multicast packets are only forwarded if the
   MPLS label matches the MPLS label that was allocated for that LSP's
   (primary) upstream LSR.

4.  Topologies for MoFRR

   MoFRR works best in the topologies illustrated in the figures below.
   MoFRR may be enabled on any router in the network.  In the figures
   below, MoFRR is shown enabled on the Provider Edge (PE) routers to
   illustrate one way in which the technology may be deployed.

4.1.  Dual-Plane Topology

                         S
                    P   / \   P
                       /   \
                  ^   G1   R1   ^
                  P  /       \  P
                    /         \
                   G2----------R2   ^
                   | \         | \  P
               ^   |  \        |  \
               P   |   G3----------R3
                   |    |      |    |
                   |    |      |    |  ^
                   G4---|------R4   |  P
                ^    \  |        \  |
                P     \ |         \ |
                       G5----------R5
                    ^   |           |  ^
                    P   |           |  P
                        |           |
                       Gi          Ri
                       \ \__    ^  /|
                        \   \   S1/ | ^
                      ^  \  ^\   /  |P2
                      P1  \ S2\_/__ |
                           \   /   \|
                            PE1     PE2

                   P = Primary path
                   S = Secondary path

                     FIG1.  Two-Plane Network Design

   The topology has two planes, a primary plane and a secondary plane
   that are fully disjoint from each other all the way into the POPs.
   This two-plane design is common in service provider networks as it
   eliminates single points of failure in their core network.  The links
   marked P indicate the normal path along which the PIM joins flow from
   the POPs towards the source.  Multicast streams, especially for the
   densely watched channels, typically flow along both planes in the
   network anyway.

   The only change MoFRR adds to this is on the links marked S, where
   the PE routers join a secondary path to their secondary ECMP UMH.  As
   a result, each PE router receives two copies of the same stream, one
   from the primary plane and the other from the secondary plane.  As a
   result of normal UMH behavior, the multicast stream received over the
   primary path is accepted and forwarded to the downstream receivers.
   The copy of the stream received from the secondary UMH is discarded.

   When a router detects a routing failure on the path to its primary
   UMH, it will switch to the secondary UMH and accept packets for that
   stream.  If the failure is repaired, the router may switch back.  The
   primary and secondary UMHs have only local context and not end-to-end
   context.

   As one can see, MoFRR achieves faster convergence by pre-building the
   secondary multicast tree and receiving the traffic on that secondary
   path.  The example discussed above is a simple case where there are
   two ECMP paths from each PE device towards the source, one along the
   primary plane and one along the secondary.  In cases where the
   topology is asymmetric or is a ring, this ECMP nature does not hold,
   and additional rules have to be taken into account to choose when and
   where to join the secondary path.

   MoFRR is appealing in such topologies for the following reasons:

   1.  Ease of deployment and simplicity: the functionality is only
       required on the PE devices, although it may be configured on all
       routers in the topology.
       Furthermore, each PE device can be enabled separately.  PEs not
       enabled for MoFRR do not see any change or degradation.
       Interoperability testing is not required as there is no PIM or
       mLDP protocol change.

   2.  End-to-end failure detection and recovery: any failure along the
       path from the source to the PE can be detected and repaired with
       the secondary disjoint stream.

   3.  Capacity efficiency: as illustrated in the previous example, the
       multicast trees corresponding to IPTV channels cover the backbone
       and distribution topology in a very dense manner.  As a
       consequence, the secondary paths graft into the normal multicast
       trees (i.e., trees signaled by PIM or mLDP without the MoFRR
       extension) at the aggregation level and hence do not demand any
       extra capacity either on the distribution links or in the
       backbone.  They simply use the capacity that is normally used,
       without any duplication.  This is different from conventional FRR
       mechanisms, which often duplicate the capacity requirements (the
       backup path crosses links/nodes which already carry the
       primary/normal tree, and hence twice as much capacity is
       required).

   4.  Loop free: the secondary path join is sent on an ECMP disjoint
       path.  By definition, the neighbor receiving this request is
       closer to the source and hence will not cause a loop.

   The topology we just analyzed is very common and can be modeled as in
   FIG2.  The PE has two ECMP disjoint paths to the source.  Each ECMP
   path uses a disjoint plane of the network.

                            Source
                           /      \
                      Plane1      Plane2
                         |          |
                         A1         A2
                           \       /
                              PE

              FIG2.  PE is dual-homed to Dual-Plane Backbone

   Another common topology is described in FIG3.  PEs are grouped in
   pairs.  In each pair, each PE is connected to a different plane.
   Each PE has a single shortest path to a source (via its connected
   plane).  There is no ECMP as in FIG2.
   However, there is clearly a way to provide MoFRR benefits, as each PE
   can offer a disjoint secondary path to the PE on the other plane.

   The MoFRR secondary neighbor selection process needs to be extended
   in this case, as one cannot simply rely on using an ECMP path as the
   secondary neighbor.  This extension is referred to as the non-ECMP
   extension and is described later in the document.

                            Source
                           /      \
                      Plane1      Plane2
                         |          |
                         A1         A2
                         |          |
                        PE1--------PE2

         FIG3.  PEs are connected in pairs to Dual-Plane Backbone

5.  Detecting Failures

   Once the two paths are established, the next step is detecting a
   failure on the primary path to know when to switch to the backup
   path.

   A first option consists of comparing the packets received on the
   primary and secondary streams but forwarding only one of them: the
   first one received, no matter which interface it is received on.
   Zero packet loss is possible for RTP-based streams.

   A second option assumes a minimum known packet rate for a given data
   stream.  If no packet is received on the primary RPF interface within
   the interval implied by that rate, the router assumes primary path
   failure and switches to the secondary RPF interface.  A 50 msec
   switchover is possible.

   A third option leverages the significant improvements in IGP
   convergence speed.  When the primary path to the source is withdrawn
   by the IGP, the MoFRR-enabled router switches over to the backup
   path; the UMH is changed to the secondary UMH.  Since the secondary
   path is already in place, and assuming it is disjoint from the
   primary path, convergence times do not include the time required to
   build a new tree and hence are smaller.  Realistic availability
   requirements (sub-second to sub-200 msec) should be achievable.

   A fourth option consists of leveraging connected link failure.  This
   option makes sense when MoFRR is deployed across the network (not
   only at the PE).

6.  ECMP-mode MoFRR

   If the IGP installs two ECMP paths to the source and if the multicast
   tree is enabled for ECMP-mode MoFRR, the router installs them as
   primary and secondary UMH.  Only packets received from the primary
   UMH path are processed.  Packets received from the secondary UMH are
   dropped.

   The selected primary UMH should be the same as if the MoFRR extension
   was not enabled.

   If more than two ECMP paths exist, two are selected as primary and
   secondary UMH.  Information from the IGP link-state topology could be
   leveraged to optimize this selection.

   Note that MoFRR does not restrict the number of UMH paths that are
   joined.  Implementations may use as many paths as are configured.

7.  Non-ECMP-mode MoFRR

                           Source S
                            /    \
                           /      \
                           Backbone
                           |      |
                           |      |
                           |      |
                           X--------N

                      FIG4.  Non-ECMP-Mode MoFRR

   X is configured for MoFRR for a multicast tree
   R(X) is X's UMH to S
   N is a neighbor of X
   R(N) is N's UMH to S
   xs represents the IGP metric from X to S
   ns represents the IGP metric from N to S
   xn represents the IGP metric from X to N
   nx represents the IGP metric from N to X

   A router X configured for non-ECMP-mode MoFRR for a multicast tree
   joins a primary path to its primary UMH R(X) and a secondary path to
   UMH N if the following three conditions are met.

   C1: xs < xn + ns
   C2: ns < nx + xs
   C3: X cannot join the secondary path via N if N is the only member
       of the OIF list

   The first condition ensures that N is not on the primary branch from
   X to S.

   The second condition ensures that X is not on the primary branch from
   N to S.

   These two conditions ensure that, at least locally, the two paths are
   disjoint.

   The third condition is required to break control-plane loops which
   could occur in some scenarios.
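
   The three conditions above can be illustrated with a short sketch.
   It is not part of the mechanism's specification: the names
   IgpMetrics and may_join_secondary are hypothetical, the metric nx
   (from N to X) is assumed to be known to X, and the OIF list is
   modeled as a plain collection of neighbor or interface names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class IgpMetrics:
    """IGP metrics as named in the text (hypothetical container)."""
    xs: int  # metric from X to S
    ns: int  # metric from N to S
    xn: int  # metric from X to N
    nx: int  # metric from N to X (assumed available to X)


def may_join_secondary(m: IgpMetrics, oif_list: Iterable[str],
                       neighbor: str) -> bool:
    """Return True if X may join a secondary path via `neighbor`."""
    # C1: N is not on the primary branch from X to S.
    c1 = m.xs < m.xn + m.ns
    # C2: X is not on the primary branch from N to S.
    c2 = m.ns < m.nx + m.xs
    # C3: refuse when N is the only member of the OIF list.
    c3 = set(oif_list) != {neighbor}
    return c1 and c2 and c3


if __name__ == "__main__":
    pair = IgpMetrics(xs=10, ns=10, xn=5, nx=5)
    # A real receiver interface is present next to N in the OIF list.
    print(may_join_secondary(pair, ["access-if", "N"], "N"))
    # Only N is left in the OIF list, so C3 blocks the secondary join.
    print(may_join_secondary(pair, ["N"], "N"))
```

   The sketch evaluates C3 literally, so an empty OIF list passes it;
   an implementation would normally hold no state for such a tree and
   may treat that case differently.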

   For example, in FIG3, if PE1 and PE2 have received an IGMP request
   for a multicast tree, they will both join the primary path on their
   plane and a secondary path to the neighbor PE.  If their receivers
   leave at the same time, the multicast tree on PE1 and PE2 might never
   get deleted, as each PE refreshes the other via the secondary path
   joins.  (Remember that a secondary path join is not distinguishable
   from a primary join; MoFRR does not require any PIM or mLDP protocol
   modification.)

   A control-plane loop occurs when two nodes keep state forever due to
   joining the secondary path to each other.  This condition is not
   acceptable, as no real receiver is connected to the nodes (directly
   via IGMP or indirectly via PIM).  Condition C3 prevents this case: it
   stops the mutual refresh of secondary joins precisely when no real
   receiver is connected.

7.1.  Variation

   Condition C3 can be removed if condition C2 is restricted as follows:

   C2p: ns < xs

   This ensures that X will only join the secondary path to a neighbor N
   that is strictly closer to the source than X is.  By reciprocity, N
   will thus never join the secondary path for the same multicast tree
   via X.  The "strictly smaller than" comparison is key here.

   Note that this non-ECMP-mode MoFRR variation does not support the
   square topology and hence is less preferred.

8.  Keep It Simple Principle

   Many Service Providers devise their topology such that PEs have
   disjoint paths to the multicast sources.  MoFRR leverages the
   existence of these disjoint paths without any PIM or mLDP protocol
   modification.  Interoperability testing is thus not required.  In
   such topologies, MoFRR only needs to be deployed on the PE devices.
   Each PE device can be enabled one by one.  PEs not enabled for MoFRR
   do not see any change or degradation.

   Multicast streams with tight SLA requirements are often characterized
   by a continuous high packet rate (SD video has a continuous
   interpacket gap of about 3 msec).  MoFRR simply leverages this stream
   characteristic to detect any failure along the primary branch and
   switch over to the secondary branch in a few tens of msec.

9.  Capacity Planning for MoFRR

   As for LFA FRR (draft-ietf-rtgwg-lfa-applicability-00), MoFRR
   applicability is topology dependent.

   In this document, we have described two very common designs (FIG2 and
   FIG3) which provide maximum MoFRR benefits.

   Designers with topologies different from FIG2 and FIG3 can still
   benefit from MoFRR thanks to the use of capacity planning tools.

   Such tools are able to simulate the ability of each PE to build two
   disjoint branches of the same tree.  This can be done for hundreds of
   PEs and hundreds of sources.

   This makes it possible to assess the MoFRR protection coverage of a
   given network for a set of sources.

   If the protection coverage is deemed insufficient, the designer can
   use such a tool to optimize the topology (add links, change IGP
   metrics).

10.  Other Applications

   While all the examples in this document show the MoFRR applicability
   on PE devices, it is clear that MoFRR could be enabled on aggregation
   or core routers.

   MoFRR may also be attractive in data center network configurations.
   With the advent of lower-cost Ethernet and increasing port density in
   routers, there is more meshed connectivity than ever before.  When
   using a three-level design with access, distribution, and core layers
   in a data center, there is a lot of inexpensive bandwidth connecting
   the layers.  This lends itself to more opportunities for ECMP paths
   at multiple layers.  This allows for multiple layers of redundancy,
   protecting against link and node failure at each layer with minimal
   redundancy cost.

   Redundancy costs are reduced because only one packet is forwarded on
   every link along the primary and secondary data paths, so there is no
   duplication of data on any link, thereby providing make-before-break
   protection at a very small cost.

   Alternate methods to detect failures, such as MPLS-OAM or BFD, may be
   considered.

   The MoFRR principle may be applied to MVPNs.

11.  Security Considerations

   There are no security considerations for this design other than what
   is already in the main PIM specification [RFC4601] and mLDP
   specification [RFC6388].

12.  Acknowledgments

   The authors would like to thank John Zwiebel, Greg Shepherd and Dave
   Oran for their review of the draft.

13.  Contributing authors

   Below is a list of other contributing authors in alphabetical order:

   Nicolai Leymann
   Deutsche Telekom
   Winterfeldtstrasse 21
   Berlin 10781
   DE
   Email: N.Leymann@telekom.de

14.  References

14.1.  Normative References

   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
              Specification", RFC 5036, October 2007.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

14.2.  Informative References

   [RFC4601]  Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
              "Protocol Independent Multicast - Sparse Mode (PIM-SM):
              Protocol Specification (Revised)", RFC 4601, August 2006.

   [RFC6388]  Wijnands, IJ., Minei, I., Kompella, K., and B. Thomas,
              "Label Distribution Protocol Extensions for Point-to-
              Multipoint and Multipoint-to-Multipoint Label Switched
              Paths", RFC 6388, November 2011.

Authors' Addresses

   Apoorva Karan
   Cisco Systems, Inc.
   3750 Cisco Way
   San Jose CA, 95134
   USA

   Email: apoorva@cisco.com


   Clarence Filsfils
   Cisco Systems, Inc.
   De Kleetlaan 6a
   Diegem BRABANT 1831
   Belgium

   Email: cfilsfil@cisco.com


   Dino Farinacci
   Cisco Systems, Inc.
   425 East Tasman Drive
   San Jose CA, 95134
   USA

   Email: dino@cisco.com


   IJsbrand Wijnands (editor)
   Cisco Systems, Inc.
   De Kleetlaan 6a
   Diegem 1831
   BE

   Email: ice@cisco.com


   Bruno Decraene
   France Telecom
   38-40 rue du General Leclerc
   Issy Moulineaux cedex 9, 92794
   FR

   Email: bruno.decraene@orange-ftgroup.com


   Uwe Joorde
   Deutsche Telekom
   Hammer Str. 216-226
   Muenster D-48153
   DE

   Email: Uwe.Joorde@telekom.de


   Wim Henderickx
   Alcatel-Lucent
   Copernicuslaan 50
   Antwerp 2018
   Belgium

   Email: wim.henderickx@alcatel-lucent.com