2 Routing Area Working Group A. Atlas, Ed. 3 Internet-Draft R. Kebler 4 Intended status: Standards Track Juniper Networks 5 Expires: September 3, 2012 IJ. Wijnands 6 Cisco Systems, Inc. 7 A. Csaszar 8 G. Enyedi 9 Ericsson 10 March 2, 2012 12 An Architecture for Multicast Protection Using Maximally Redundant Trees 13 draft-atlas-rtgwg-mrt-mc-arch-00 15 Abstract 17 Failure protection is desirable for multicast traffic, whether 18 signaled via PIM or mLDP. Different mechanisms are suitable for 19 different use-cases and deployment scenarios. This document 20 describes the architecture for global protection (aka multicast live-live) and for local protection (aka fast-reroute).
23 The general methods for global protection and local protection using 24 alternate-trees are dependent upon the use of Maximally Redundant 25 Trees. Local protection can also tunnel traffic in unicast tunnels 26 to take advantage of the routing and fast-reroute mechanisms 27 available for IP/LDP unicast destinations. 29 The failures protected against are single link or node failures. 30 While the basic architecture might support protection against shared 31 risk group failures, algorithms to dynamically compute MRTs 32 supporting this are for future study. 34 Status of this Memo 36 This Internet-Draft is submitted in full conformance with the 37 provisions of BCP 78 and BCP 79. 39 Internet-Drafts are working documents of the Internet Engineering 40 Task Force (IETF). Note that other groups may also distribute 41 working documents as Internet-Drafts. The list of current Internet- 42 Drafts is at http://datatracker.ietf.org/drafts/current/. 44 Internet-Drafts are draft documents valid for a maximum of six months 45 and may be updated, replaced, or obsoleted by other documents at any 46 time. It is inappropriate to use Internet-Drafts as reference 47 material or to cite them other than as "work in progress." 48 This Internet-Draft will expire on September 3, 2012. 50 Copyright Notice 52 Copyright (c) 2012 IETF Trust and the persons identified as the 53 document authors. All rights reserved. 55 This document is subject to BCP 78 and the IETF Trust's Legal 56 Provisions Relating to IETF Documents 57 (http://trustee.ietf.org/license-info) in effect on the date of 58 publication of this document. Please review these documents 59 carefully, as they describe your rights and restrictions with respect 60 to this document. Code Components extracted from this document must 61 include Simplified BSD License text as described in Section 4.e of 62 the Trust Legal Provisions and are provided without warranty as 63 described in the Simplified BSD License. 
65 Table of Contents 67 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 68 1.1. Maximally Redundant Trees (MRTs) . . . . . . . . . . . . . 4 69 1.2. MRTs and Multicast . . . . . . . . . . . . . . . . . . . . 6 70 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 71 3. Use-Cases and Applicability . . . . . . . . . . . . . . . . . 8 72 4. Global Protection: Multicast Live-Live . . . . . . . . . . . . 9 73 4.1. Creation of MRMTs . . . . . . . . . . . . . . . . . . . . 10 74 4.2. Traffic Self-Identification . . . . . . . . . . . . . . . 11 75 4.2.1. Merging MRMTs for PIM if Traffic Doesn't 76 Self-Identify . . . . . . . . . . . . . . . . . . . . 12 77 4.3. Convergence Behavior . . . . . . . . . . . . . . . . . . . 13 78 4.4. Inter-area/level Behavior . . . . . . . . . . . . . . . . 14 79 4.4.1. Inter-area Node Protection with 2 border routers . . . 15 80 4.4.2. Inter-area Node Protection with > 2 Border Routers . . 16 81 4.5. PIM . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 82 4.5.1. Traffic Handling: RPF Checks . . . . . . . . . . . . . 17 83 4.6. mLDP . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 84 5. Local Repair: Fast-Reroute . . . . . . . . . . . . . . . . . . 17 85 5.1. PLR-driven Unicast Tunnels . . . . . . . . . . . . . . . . 18 86 5.1.1. Learning the MPs . . . . . . . . . . . . . . . . . . . 19 87 5.1.2. Using Unicast Tunnels and Indirection . . . . . . . . 19 88 5.1.3. MP Alternate Traffic Handling . . . . . . . . . . . . 20 89 5.1.4. Merge Point Reconvergence . . . . . . . . . . . . . . 21 90 5.1.5. PLR termination of alternate traffic . . . . . . . . . 21 91 5.2. MP-driven Unicast Tunnels . . . . . . . . . . . . . . . . 21 92 5.3. MP-driven Alternate Trees . . . . . . . . . . . . . . . . 22 93 5.3.1. PIM details for Alternate-Trees . . . . . . . . . . . 25 94 5.3.2. mLDP details for Alternate-Trees . . . . . . . . . . . 25 95 5.3.3. Traffic Handling by PLR . . . . . . . . . . . . . . . 25 96 5.4. 
Methods Compared for PIM . . . . . . . . . . . . . . . . . 26 97 5.5. Methods Compared for mLDP . . . . . . . . . . . . . . . . 26 98 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 26 99 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 27 100 8. Security Considerations . . . . . . . . . . . . . . . . . . . 27 101 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 27 102 9.1. Normative References . . . . . . . . . . . . . . . . . . . 27 103 9.2. Informative References . . . . . . . . . . . . . . . . . . 27 104 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 28 106 1. Introduction 108 This document describes how the algorithms in 109 [I-D.enyedi-rtgwg-mrt-frr-algorithm], which are used in 110 [I-D.ietf-rtgwg-mrt-frr-architecture] for unicast IP/LDP fast- 111 reroute, can be used to provide protection for multicast traffic. It 112 specifically applies to multicast state signaled by PIM[RFC4601] or 113 mLDP[RFC6388]. There are additional protocols that depend upon these 114 (e.g. VPLS, mVPN, etc.) and consideration of the applicability to 115 such traffic will be in a future version. 117 In this document, global protection is used to refer to the method of 118 having two maximally disjoint multicast trees where traffic may be 119 sent on both and resolved by the receiver. This is similar to the 120 ability with RSVP-TE LSPs to have a primary and a hot standby, except 121 that it can operate in 1+1 mode. This capability is also referred to 122 as multicast live-live and is a generalized form of that discussed in 123 [I-D.karan-mofrr]. In this document, local protection refers to the 124 method of having alternate ways of reaching the pre-identified merge 125 points upon detection of a local failure. This capability is also 126 referred to as fast-reroute. 
128 This document describes the general architecture, framework, and 129 trade-offs of the different approaches to solving these general 130 problems. It will recommend how to generally provide global 131 protection and local protection for mLDP and PIM traffic. Where 132 protocol extensions are necessary, they will be defined in separate 133 documents as follows. 135 o Global 1+1 Protection Using PIM 137 o Global 1+1 Protection Using mLDP 139 o Local Protection Using mLDP: 140 [I-D.wijnands-mpls-mldp-node-protection] describes 141 how to provide node protection and the necessary extensions using 142 targeted LDP sessions. 144 o Local Protection Using PIM 146 1.1. Maximally Redundant Trees (MRTs) 148 Maximally Redundant Trees (MRTs) are described in 149 [I-D.enyedi-rtgwg-mrt-frr-algorithm]; here we give only a brief 150 description of the concept. A pair of MRTs is a pair of directed 151 spanning trees (red and blue) with a common root, directed so 152 that each node can be reached from the root on both trees. Moreover, 153 these trees are redundant: they are constructed so that no 154 single link or node failure can separate any node from the 155 root on both trees, unless that failed link or node splits the 156 network into completely separated components (e.g. the link or node 157 was a cut-edge or cut-vertex). 159 Although for multicast the arcs (directed links) are directed away 160 from the root instead of towards the root, the same MRT computations 161 are used and apply. This is similar to how multicast uses unicast 162 routing's next-hops as the upstream hops. Thus this definition 163 slightly differs from the one presented in 164 [I-D.enyedi-rtgwg-mrt-frr-algorithm], since the arcs are directed 165 away from and not towards the root. When we need two paths towards a 166 given destination and not two away from it (e.g.
for unicast detours 167 for local repair solutions), we only need to reverse the arcs from 168 how they are used for the unicast routing case; thus constructing 169 MRTs towards or away from the root is the same problem. A pair of 170 MRTs is depicted in Figure 1. 172 [E]---[D]---| |---[J] 173 | | | | | 174 | | | | | 175 [R] [F] [C]---[G] | 176 | | | | | 177 | | | | | 178 [A]---[B]---| |---[H] 180 (a) a network 182 [E]<--[D]---| |-->[J] [E]<--[D] [J] 183 ^ | | | | ^ ^ 184 | V V | | | | 185 [R] [F] [C]-->[G] | [R] [F] [C]-->[G] | 186 | | | ^ ^ | | 187 V V V | | | | 188 [A]<--[B] [H] [A]-->[B]---| |-->[H] 190 (b) Blue MRT of root R (c) Red MRT of root R 192 Figure 1: A network and two MRTs found in it 194 It is important to realize that this redundancy criterion does not 195 imply that, after a failure, either of the MRTs remains intact, since 196 a node failure must affect any spanning tree. Redundancy here means 197 that there will be one set of nodes that can be reached along the 198 blue MRT and another set that remains reachable 199 along the red MRT. As an example, suppose that node F goes down; 200 that would separate B and A on the blue MRT and D and E on the red 201 MRT. Naturally, it is possible that the intersection of these two 202 sets is not empty; e.g. C, G, H and J will remain reachable on both 203 MRTs. Additionally, observe that a single link can be used in both 204 of the trees in different directions, so even a link failure can cut 205 both trees. In this example, the failure of link F<->B leads to the 206 same reachability sets. 208 Finally, it is critical to recall that a pair of MRTs is always 209 constructed together; they are not SPTs. While it would be useful 210 to have an algorithm that could find a redundant pair for a given 211 tree (e.g. for the SPT), that is impossible in general. Moreover, if 212 there is a failure and at least one of the trees changes, the other 213 tree may need to change as well.
Therefore, even if a node still 214 receives the traffic along the red tree, it cannot keep the old red 215 tree and simply construct a blue pair for it; there can be reconfiguration 216 in cases when traditional shortest-path-based thinking would not 217 expect it. To converge to a new pair of disjoint MRTs, it is 218 generally necessary to update both the blue MRT and the red MRT. 220 The two MRTs provide two separate forwarding topologies that can be 221 used in addition to the default shortest-path-tree (SPT) forwarding 222 topology (usually MT-ID 0). There is a Blue MRT forwarding topology 223 represented by one MT-ID; similarly, there is a Red MRT forwarding 224 topology represented by a different MT-ID. Naturally, a multicast 225 protocol is required to use the forwarding topology information to 226 build the desired multicast trees. The multicast protocol can simply 227 request the appropriate upstream interfaces, including the MT-ID when 228 needed. 230 1.2. MRTs and Multicast 232 Maximally Redundant Trees (MRTs) provide two advantages for protecting 233 multicast traffic. First, for global protection, MRTs are precisely 234 what needs to be computed to have maximally redundant multicast 235 distribution trees. Second, for local repair, MRTs ensure that there 236 will be protection to the merge points; the certainty of a path from any 237 merge point to the PLR that avoids the failed node allows for the 238 creation of alternate trees. 240 A known disadvantage of MRTs, and redundant trees in general, is that 241 the trees do not necessarily provide shortest detour paths. Modeling 242 is underway to investigate and compare the MRT lengths for the 243 different algorithm options [I-D.enyedi-rtgwg-mrt-frr-algorithm]. 245 2. Terminology 246 2-connected: A graph that has no cut-vertices. This is a graph 247 that requires two nodes to be removed before the network is 248 partitioned. 250 2-connected cluster: A maximal set of nodes that are 2-connected.
252 2-edge-connected: A network graph where at least two links must be 253 removed to partition the network. 255 ADAG: Almost Directed Acyclic Graph - a graph that, if all links 256 incoming to the root were removed, would be a DAG. 258 block: Either a 2-connected cluster, a cut-edge, or an isolated 259 vertex. 261 cut-link: A link whose removal partitions the network. A cut-link 262 by definition must be connected between two cut-vertices. If 263 there are multiple parallel links, then they are referred to as 264 cut-links in this document if removing the set of parallel links 265 would partition the network. 267 cut-vertex: A vertex whose removal partitions the network. 269 DAG: Directed Acyclic Graph - a graph where all links are directed 270 and there are no cycles in it. 272 GADAG: Generalized ADAG - a graph that is the combination of the 273 ADAGs of all blocks. 275 Maximally Redundant Trees (MRT): A pair of trees where the path 276 from any node X to the root R along the first tree and the path 277 from the same node X to the root along the second tree share the 278 minimum number of nodes and the minimum number of links. Each 279 such shared node is a cut-vertex. Any shared links are cut-links. 280 Any RT is an MRT but many MRTs are not RTs. 282 Maximally Redundant Multicast Trees (MRMT): A pair of multicast 283 trees built of the sub-set of MRTs that is needed to reach all 284 interested receivers. 286 network graph: A graph that reflects the network topology where all 287 links connect exactly two nodes and broadcast links have been 288 transformed into the standard pseudo-node representation. 290 Redundant Trees (RT): A pair of trees where the path from any node 291 X to the root R along the first tree is node-disjoint with the 292 path from the same node X to the root along the second tree. 293 These can be computed in 2-connected graphs. 
295 Merge Point (MP): For local repair, a router at which the alternate 296 traffic rejoins the primary multicast tree. For global 297 protection, a router which receives traffic on multiple trees and 298 must decide which stream to forward on. 300 Point of Local Repair (PLR): The router that detects a local 301 failure and decides whether and when to forward traffic on 302 appropriate alternates. 304 MT-ID: Multi-topology identifier. The default shortest-path-tree 305 topology is MT-ID 0. 307 MultiCast Ingress (MCI): Multicast Ingress, the node where the 308 multicast stream enters the current transport technology (MPLS-mLDP 309 or IP-PIM) domain. This may be the router attached to the 310 multicast source, the PIM Rendezvous Point (RP), or the mLDP Root 311 node address. 313 Upstream Multicast Hop (UMH): Upstream Multicast Hop, a candidate 314 next-hop that can be used to reach the MCI of the tree. 316 Stream Selection: The process by which a router determines which of 317 the multiple primary multicast streams to accept and forward. The 318 router can decide on a packet-by-packet basis or simply 319 per-stream. This is done for global protection 1+1 and described in 320 [I-D.karan-mofrr]. 322 MultiCast Egress (MCE): Multicast Egress, a node where the 323 multicast stream exits the current transport technology (MPLS-mLDP 324 or IP-PIM) domain. This is usually a receiving router that 325 may forward the multicast traffic on towards receivers based upon 326 IGMP or other technology. 328 3. Use-Cases and Applicability 330 Protection of multicast streams has gained importance with the use of 331 multicast to distribute video, including live video such as IP-TV. 332 There are a number of different scenarios and uses of multicast that 333 require protection. A few preliminary examples are described below.
335 o When video is distributed via IP or MPLS for a cable application, 336 it is desirable to have global protection 1+1 so that the 337 customer-perceived impact is limited. A QAM can join two 338 multicast groups and determine which stream to use based upon the 339 stream quality. A network implementing this may be custom-engineered 340 for this particular purpose. 342 o In financial markets, stock ticker data is distributed via 343 multicast. The loss of data can have a significant financial 344 impact. Depending on the network, either global protection 1+1 or 345 local protection can minimize the impact. 347 o Several IP-multicast-based solutions exist for updating the software 348 or firmware of large numbers of end-user or operator-owned networking 349 equipment. Since IP multicast uses 350 datagram transport, recovering lost data is cumbersome and 351 decreases the advantages offered by multicast. Solutions may rely 352 on sending the updates several times; a properly protected network 353 may mean that fewer repetitions are required. Other solutions 354 rely on the recipient asking for lost data segments explicitly 355 on-demand. A network failure could cause data loss for a significant 356 number of receivers, which in turn would start requesting the 357 lost data in a burst that could overload the server. Properly 358 engineered multicast fast-reroute would minimize such impacts. 360 o Some providers offer multicast VPN services to their customers. 361 SLAs between the customer and provider may set low packet-loss 362 requirements. In such cases, interruptions longer than the outage 363 timescales targeted by FRR could cause direct financial losses for 364 the provider. 366 Global protection 1+1 uses maximally redundant multicast trees 367 (MRMTs) to simultaneously distribute a multicast stream on both 368 MRMTs. The disadvantage is the extra state and bandwidth 369 requirements of always sending the traffic twice. The advantage is
The advantage is 370 that the latency of each MRMT can be known and the receiver can 371 select the best stream. 373 Local protection provides a patch around the fault while the 374 multicast tree reconverges. When PLR replication is used, there is 375 no extra multicast state in the network, but the bandwidth 376 requirements vary based upon how many potential merge-points must be 377 provided. When alternate-trees are used, there is extra multicast 378 state, but the bandwidth required on a link can be limited to no 379 more than one copy of the primary multicast tree traffic and one copy of 380 the alternate-tree traffic. 382 4. Global Protection: Multicast Live-Live 384 In MoFRR [I-D.karan-mofrr], the idea of joining both a primary and a 385 secondary tree is introduced with the requirement that the primary 386 and secondary trees be link and node disjoint. This works well for 387 networks where there are dual-planes, as explained in 388 [I-D.karan-mofrr]. For other networks, it is still desirable to have 389 two disjoint multicast trees and allow a receiver to join both and 390 make its own decision about which traffic to accept. 392 Using MRTs gives the ability to guarantee that the two trees are as 393 disjoint as possible and are dynamically recomputed whenever the topology 394 changes. The MRTs used are rooted at the MultiCast Ingress (MCI). 395 One multicast tree is created using the Blue MRT forwarding topology. 396 The second multicast tree is created using the Red MRT forwarding 397 topology. This can be accomplished by specifying the appropriate 398 MT-ID associated with each forwarding topology. 400 There are four different aspects of using MRTs for 1+1 Global 401 Protection that are necessary to consider. They are as follows. 403 1. Creation of the maximally redundant multicast trees (MRMTs) based 404 upon the forwarding topologies. 406 2. Traffic Identification: How to handle traffic when the two MRMTs 407 overlap due to a cut-vertex or cut-link. 409 3.
Convergence: How to converge after a network change and get back 410 to a protected state. 412 4. Inter-area/inter-level Behavior: How to compute and use MRMTs 413 when the multicast source is outside the area/level and how to 414 provide border-router protection. 416 4.1. Creation of MRMTs 418 The creation of the two maximally redundant multicast trees occurs as 419 described below. This assumes that the next-hops to the MCI 420 associated with the Blue and Red forwarding topologies have already 421 been computed and stored. 423 1. A receiving router determines that it wants to join both the Blue 424 tree and the Red tree. The details of how it makes this decision 425 are not covered in this document and could be based on 426 configuration, additional protocols, etc. 428 2. The router selects among the Blue next-hops an Upstream Multicast 429 Hop (UMH) to reach the MCI node. The router joins the tree 430 towards the selected UMH, including a multi-topology id (MT-ID) 431 identifying the Blue MRT. 433 3. The router selects among the Red next-hops an Upstream Multicast 434 Hop (UMH) to reach the MCI node. The router joins the tree 435 towards the selected UMH, including a multi-topology id (MT-ID) 436 identifying the Red MRT. 438 4. When a router receives a tree setup request specifying a 439 particular MT-ID (e.g. Color), the router selects among the 440 Color next-hops to the MCI a UMH node, creates the necessary 441 multicast state, and joins the tree towards the UMH node. 443 4.2. Traffic Self-Identification 445 Two maximally redundant trees will share any cut-vertices and 446 cut-links in the network. In the multicast global protection 1+1 case, 447 this means that the potential single failures of the other nodes and 448 links in the network are still protected against. If a cut-vertex 449 cannot associate traffic with a particular MRMT, then the traffic would 450 be incorrectly replicated to both MRMTs, resulting in complete 451 duplication of traffic. An example of such MRTs is given earlier in
An example of such MRTs is given earlier in 452 Figure 1 and repeated below in Figure 2, where there are two cut- 453 vertices C and G and a cut-link C<->G. 455 [E]---[D]---| |---[J] 456 | | | | | 457 | | | | | 458 [R] [F] [C]---[G] | 459 | | | | | 460 | | | | | 461 [A]---[B]---| |---[H] 463 (a) a network 465 [E]<--[D]---| |-->[J] [E]<--[D] [J] 466 ^ | | | | ^ ^ 467 | V V | | | | 468 [R] [F] [C]-->[G] | [R] [F] [C]-->[G] | 469 | | | ^ ^ | | 470 V V V | | | | 471 [A]<--[B] [H] [A]-->[B]---| |-->[H] 473 (b) Blue MRT of root R (c) Red MRT of root R 475 Figure 2: A network and two MRTs found in it 477 In this example, traffic from the multicast source R to a receiver G, 478 J, or H will cross link C<->G on both the Blue and Red MRMTs. When 479 this occurs, there are several different possibilities depending upon 480 protocol. 482 mLDP: Different label bindings will be created for the Blue and Red 483 MRMTs. As specified in [I-D.iwijnand-mpls-mldp-multi-topology], 484 the P2MP FEC Element will use the MT IP Address Family to encode 485 the Root node address and MRT T-ID. Each MRMT will therefore have 486 a different P2MP FEC Element and be assigned an independent label. 488 PIM: There are three different ways to handle IP traffic forwarded 489 based upon PIM when that traffic will overlap on a link. 491 A. Different Groups: If different multicast groups are used for 492 each MRMT, then the traffic clearly indicates which MRMT it 493 belongs to. In this case, traffic on the Blue MRMT would use 494 multicast group G-blue and traffic on the Red MRMT would use 495 multicast group G-red. 497 B. Different Source Loopbacks: Another option is to use different 498 IP addresses for the source S, so S might announce S-red and 499 S-blue. In this case, traffic on the Blue MRMT would have an 500 IP source of S-blue and traffic on the Red MRMT would have an 501 IP source of S-red. 503 C. 
Stream Selection and Merging: The third option, described in 504 Section 4.2.1, is to have a router that gets (S,G) Joins for 505 both the Blue MT-ID and the Red MT-ID merge those into a 506 single tree. The router may need to select which upstream 507 stream to use, just as if it were a receiving router. 509 There are three options presented for PIM. The most appropriate will 510 depend upon the deployment scenario as well as router capabilities. 512 4.2.1. Merging MRMTs for PIM if Traffic Doesn't Self-Identify 514 When traffic doesn't self-identify, the cut-vertices must follow 515 specific rules to avoid traffic duplication. This section describes 516 the behavior that allows the same (S,G) to be used for both the 517 Blue MT-ID and Red MT-ID (e.g. when the traffic doesn't self-identify 518 as to its MT-ID). 520 The behavior described in this section differs from the conflict 521 resolution described in [RFC6420] because these rules apply to the 522 Global Protection 1+1 case. Specifically, it is not sufficient for an 523 upstream router to pick only one of the two MT-IDs to join, because 524 that does not maximize the protection provided. 526 As described in [RFC6420], a router that receives (S,G) Joins for 527 both the Blue MT-ID and the Red MT-ID can merge the set of downstream 528 interfaces in its forwarding entry. Unlike the procedures defined in 529 [RFC6420], the router must send a Join upstream for each MT-ID. If a 530 router has different upstream interfaces for these MRMTs, then the 531 router will need to do stream selection and forward the selected 532 stream to its outgoing interfaces, just as if it were an MCE. The 533 stream selection methods for detecting failures and handling traffic 534 discard are described in [I-D.karan-mofrr]. 536 This method does not work if the MRMTs merge on a common LAN with 537 different upstream routers. In this case, the traffic cannot be 538 distinguished on the LAN and will result in duplication on the LAN.
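The per-router merge and stream-selection behavior of Section 4.2.1 can be sketched in a few lines. The following Python model is purely illustrative (the class, interface names, and failure hook are hypothetical, and real failure detection would follow [I-D.karan-mofrr]); it captures the key rule that downstream interfaces are merged while one upstream Join is kept per MT-ID.

```python
# Hypothetical sketch of the Section 4.2.1 merge behavior for PIM when
# traffic does not self-identify. Names and structures are illustrative.

BLUE, RED = "blue-mrt", "red-mrt"

class MergedEntry:
    """Forwarding state for one (S,G) joined on both MT-IDs."""
    def __init__(self):
        self.downstream = set()   # merged outgoing interfaces
        self.upstream = {}        # MT-ID -> upstream interface (UMH)
        self.active = BLUE        # currently accepted stream

    def add_join(self, mt_id, downstream_if, umh_if):
        # Merge downstream interfaces across MT-IDs, but keep one
        # upstream Join per MT-ID (unlike plain RFC 6420 resolution).
        self.downstream.add(downstream_if)
        self.upstream[mt_id] = umh_if

    def joins_to_send(self):
        # A Join must be sent upstream for EACH MT-ID to preserve
        # maximal protection.
        return [(mt_id, umh) for mt_id, umh in self.upstream.items()]

    def accept(self, mt_id):
        # Stream selection: forward only the active stream and discard
        # the other, as an MCE would.
        return mt_id == self.active

    def on_failure(self, failed_mt_id):
        # Failure detection (as in MoFRR) switches the active stream.
        if failed_mt_id == self.active:
            self.active = RED if self.active == BLUE else BLUE

entry = MergedEntry()
entry.add_join(BLUE, downstream_if="ge-0/0/1", umh_if="ge-0/0/2")
entry.add_join(RED, downstream_if="ge-0/0/1", umh_if="ge-0/0/3")
assert len(entry.joins_to_send()) == 2     # one Join per MT-ID
assert entry.accept(BLUE) and not entry.accept(RED)
entry.on_failure(BLUE)
assert entry.accept(RED)
```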
539 The normal PIM Assert procedure would stop one of the upstream 540 routers from transmitting duplicates onto the LAN once the duplication is 541 detected. This, in turn, may cause the duplicate stream to be pruned 542 back to the source. Thus, end-to-end protection in this case of the 543 MRMTs converging on a single LAN with different upstream interfaces 544 can only be accomplished by the methods of traffic 545 self-identification. 547 4.3. Convergence Behavior 549 It is necessary to handle topology changes and get back to having two 550 MRMTs that provide global protection. To understand the requirements 551 and what can be computed, recall the following facts. 553 a. It is not generally possible to compute a single tree that is 554 maximally redundant to an existing tree. 556 b. The pair of MRTs must be computed simultaneously. 558 c. After a single link or node failure, there is one set of nodes 559 that can be reached from the root on the Blue MRMT and a second 560 set of nodes that can be reached from the root on the Red MRMT. 561 If the failed node or link wasn't a cut-vertex or cut-edge, all nodes will be 562 in at least one of these two sets. 564 To gracefully converge, it is necessary never to have a router where 565 both its Red MRMT and Blue MRMT are broken. There are three 566 different ways in which this could be done. These options are being 567 more fully explored to see which is most practical and provides the 568 best set of trade-offs. 570 Ordered Convergence When a single failure occurs, each receiver 571 determines whether it was affected or unaffected. First, the 572 affected receivers identify the broken MRMT color (e.g. blue) and 573 join the MRMT via their new UMH for that MRT color. Once the 574 affected receivers receive confirmation that the new MRMT has been 575 successfully created back to the MCI, the affected receivers 576 switch to using that MRMT.
The affected receivers tear down the 577 old broken MRMT state and join the MRMT via their new UMH for the 578 other MRT color (e.g. red). Finally, once the affected receivers 579 receive confirmation that the new MRMT has been successfully 580 created back to the MCI, the affected receivers can tear down the 581 old working MRMT state. Once the affected receivers have updated 582 their state, the unaffected receivers also need to do the same 583 staging - first joining the MRMT via their new UMH for the Blue 584 MRT, waiting for confirmation, switching to using traffic from the 585 Blue MRMT, tearing down the old Blue MRMT state, joining the MRMT 586 via their new UMH for the Red MRT, waiting for confirmation, and 587 tearing down the old Red MRMT state. There are complexities 588 remaining, such as determining how an Unaffected Receiver decides 589 that the Affected Receivers are done. When the topology change 590 isn't a failure, all receivers are unaffected and the same process 591 can apply. 593 Protocol Make-Before-Break In the control plane, a router joins the 594 tree on the new Blue topology but does not stop receiving traffic 595 on the old Blue topology. Once traffic is observed from the new 596 Blue UMH, the router accepts traffic on the new Blue UMH and 597 removes the old Blue UMH. This behavior can happen simultaneously 598 with both Blue and Red forwarding topologies. An advantage is 599 that it works regardless of the type of topology change and 600 existing traffic streams aren't broken. Another advantage is that 601 the complexity is limited and this method is well understood. The 602 disadvantage is that the number of traffic-affecting events 603 depends upon the number of hops to the MCI. 605 Multicast Source Make-Before-Break On a topology change, routers 606 would create new MRMTs using new MRT forwarding state, leaving 607 the old MRMTs as they are.
After the new MRMTs are complete, the 608 multicast source could switch from sending on the old MRMTs to 609 sending on the new MRMTs. After a time, the old MRMTs could be 610 torn down. There are a number of details to still investigate. 612 4.4. Inter-area/level Behavior 614 A source outside of the IGP area/level can be treated as a proxy 615 node. When the join request reaches a border router (whether ABR for 616 OSPF or LBR for ISIS), that border router needs to determine whether 617 to use the Blue or Red forwarding topology in the next selected area/ 618 level. 620 |-------------------| 621 | | 622 |---[S]---| [BR1]-----[ X ] | 623 | | | | | 624 [ A ]-----[ B ] | | | 625 | | [ Y ]-----[BR2]--(proxy for S) 626 | | 627 [BR1]-----[BR2] (b) Area 10 628 Y's Red next-hop: BR1 629 (a) Area 0 Y's Blue next-hop: BR2 630 Red Next-Hops to S 631 BR1's is BR2 632 BR2's is B 633 B's is S 635 Blue Next-Hops to S 636 BR1's is A 637 BR2's is BR1 638 A's is S 640 Figure 3: Inter-area Selection - next-hops towards S 642 Achieving maximally node-disjoint trees across multiple areas is hard 643 due to the information-hiding and abstraction. If there is only one 644 border router, it is trivial but protection of the border router is 645 not possible. With exactly 2 border routers, inter-area/level node 646 protection is reasonably straightforward but can require that the BR 647 rewrite the (S,G) for PIM. With more than 2 border routers, inter- 648 area node protection is possible at the cost of additional bandwidth 649 and border router complexity. These two solutions are described in 650 the following sub-sections. 652 4.4.1. Inter-area Node Protection with 2 border routers 654 If there are exactly two border routers between the areas, then the 655 solution and necessary computation is straightforward. In that 656 specific case, each BR knows that only the other BR must specifically 657 be avoided in the second area when a forwarding topology is selected. 
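The two-border-router case reduces to a simple check: the BR picks the MT color whose path toward the source avoids the other BR. The following is a minimal sketch of that decision using the Figure 3 paths; the function name and path-list representation are illustrative only and are not part of any specification.

```python
# Hypothetical sketch: a border router choosing between the Blue and
# Red MRT forwarding topologies so that the path toward the source
# avoids the other border router (per Section 4.4.1).

def select_mt_color(paths_by_color, avoid_node):
    """Return the MT color whose path avoids 'avoid_node', or None.

    paths_by_color: dict mapping color -> ordered list of node names
    on that colored forwarding topology toward the source.
    """
    for color, path in paths_by_color.items():
        if avoid_node not in path:
            return color
    return None  # avoid_node is a cut vertex: no avoiding path exists

# Paths from Figure 3 (Area 0), toward source S:
paths = {
    "red":  ["BR1", "BR2", "B", "S"],   # BR1's Red path traverses BR2
    "blue": ["BR1", "A", "S"],          # BR1's Blue path avoids BR2
}

# BR1 received a Red join from Area 10 and must avoid BR2 in Area 0:
print(select_mt_color(paths, "BR2"))    # -> blue
```

This matches the text's example: BR1 modifies a Red join from Area 10 to use the Blue MT-ID in Area 0.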
658 As described in [I-D.enyedi-rtgwg-mrt-frr-algorithm], it is possible 659 for a node X to determine whether the Red or Blue forwarding topology 660 should be used to reach a node D while avoiding another node Y. 662 The results of this computation and the resulting changes in MT-ID 663 from Red to Blue or Blue to Red are illustrated in Figure 3. It 664 shows an example where BR1 must modify joins received from Area 10 665 for the Red MT-ID to use the Blue MT-ID in Area 0. Similarly, BR2 666 must modify joins received from Area 10 for the Blue MT-ID to use the 667 Red MT-ID in Area 0. 669 For mLDP, modifying the MT-ID in the control-plane is all that is 670 needed. For PIM, if the same (S,G) is used for both the Blue MT-ID 671 and the Red MT-ID, then only control-plane changes are needed. 672 However, for PIM, if different group IDs (e.g. G-red and G-blue) or 673 different source loopback addresses (S-red and S-blue) are used, it 674 is necessary to modify the traffic to reflect the MT-ID included in 675 the join message received on that interface. An alternative could be 676 to use an MPLS label that indicates the MT-ID instead of different 677 group IDs or source loopback addresses. 679 To summarize the necessary logic, when BR1 receives a join from a 680 neighbor in area N for a destination D in area M on the Color MT-ID, 681 BR1: 683 a. Identifies BR2, the other border router attached to the proxy 684 node in area N. 685 b. Determines which forwarding topology avoids BR2 on the path to D 686 in area M; refer to that as the Color-2 MT-ID. 688 c. Uses the Color-2 MT-ID to determine the next-hops to D. When a 689 join is sent upstream, the MT-ID used is that for Color-2. 691 4.4.2. Inter-area Node Protection with > 2 Border Routers 693 If there are more than two BRs between areas, then the problem of 694 ensuring inter-area node-disjointness is not solved.
Instead, once a 695 request to join the multicast tree has been received by a BR from an 696 area that isn't closest to the multicast source, the BR must join 697 both the Red MT-ID and the Blue MT-ID in the area closest to the 698 multicast source. Regardless of what single link or node failure 699 happens, each BR will receive the multicast stream. Then, the BR can 700 use the stream-selection techniques specified in [I-D.karan-mofrr] to 701 pick either the Blue or Red stream and forward it to downstream 702 routers in the other area. Each of the BRs for the other area should 703 be attached to a proxy-node representing the other area. 705 This approach ensures that a BR will receive the multicast stream in 706 the closest area as long as the single link or node failure isn't a 707 single point of failure. Thus, each area or level is independently 708 protected. The BR is required to be able to select among the 709 multicast streams and, if necessary for PIM, translate the traffic to 710 contain the correct (S,G) for forwarding. 712 4.5. PIM 714 Capabilities need to be exchanged to determine that a neighbor 715 supports using MRT forwarding topologies with PIM. Additional 716 signaling extensions to PIM are not necessary to support Global 717 Protection. [RFC6420] already defines how to specify an MT-ID as a 718 Join Attribute. 720 4.5.1. Traffic Handling: RPF Checks 722 For PIM, RPF checks would still be enabled by the control plane. The 723 control plane can program different forwarding entries on the G-blue 724 incoming interface and on the G-red incoming interface. The other 725 interfaces would still discard both G-blue and G-red traffic. 727 The receiver would still need to detect failures and handle traffic 728 discarding as specified in [I-D.karan-mofrr]. 730 4.6. mLDP 732 Capabilities need to be exchanged to determine that a neighbor 733 supports using MRT forwarding topologies with mLDP.
The basic 734 mechanisms for mLDP to support multi-topology are already described 735 in [I-D.iwijnand-mpls-mldp-multi-topology]. It may be desirable to 736 extend the capability defined in that draft to indicate that MRT is 737 or is not supported. 739 5. Local Repair: Fast-Reroute 741 Local repair for multicast traffic is different from unicast in 742 several important ways. 744 o There is more than a single final destination. The full set of 745 receiving routers may not be known by the PLR and may be extremely 746 large. Therefore, it makes sense to repair to the immediate next- 747 hops for link-repair and the next-next-hops for node-repair. 748 These are the potential merge points (MPs). 750 o If a failure cannot be positively identified as a node-failure, 751 then it is important to repair to the immediate next-hops since 752 they may have receivers attached. 754 o If a failure cannot be positively identified as a link-failure and 755 node protection is desired, then it is important to repair to the 756 next-next-hops since they may not receive traffic from the 757 immediate next-hops. 759 o Updating multicast forwarding state may take significantly longer 760 than updating unicast state, since the multicast state is updated 761 tree by tree based on control-plane signaling. 763 o For tunnel-based IP/LDP approaches, neither the PLR nor the MP may 764 be able to specify which interface the alternate traffic will 765 arrive at the MP on. The simplest reason is that unicast 766 forwarding includes the use of ECMP and the path selection is 767 based upon internal router behavior for all paths between the PLR 768 and the MP. 770 For multicast fast-reroute, there are three different mechanisms that 771 can be used. As long as the necessary signaling is available, these 772 methods can be combined in the same network and even for the same PLR 773 and failure point. 775 PLR-driven Unicast Tunnels: The PLR learns the set of MPs that need 776 protection.
On a failure, the PLR replicates the traffic and 777 tunnels it to each MP using the unicast route. If desired, an 778 RSVP-TE tunnel could be used instead of relying upon unicast 779 routing. 781 MP-driven Unicast Tunnels: Each MP learns the identity of the PLR. 782 Before failure, each MP independently signals to the PLR the 783 desire for protection and other information to use. On a failure, 784 the PLR replicates the traffic and tunnels it to each MP using the 785 unicast route. If desired, an RSVP-TE tunnel could be used 786 instead of relying upon unicast routing. 788 MP-driven Alternate Trees: Each MP learns the identity of the PLR 789 and the failure point (node and interface) to be protected 790 against. Each MP selects an upstream interface and forwarding 791 topology where the path will avoid the failure point; each MP 792 signals a join towards that upstream interface to create that 793 state. 795 Each of these options is described in more detail in their respective 796 sections. Then the methods are compared and contrasted for PIM and 797 for mLDP. 799 5.1. PLR-driven Unicast Tunnels 801 With PLR-driven unicast tunnels, the PLR learns the set of merge 802 points (MPs) and, on a locally detected failure, uses the existing 803 unicast routing to tunnel the multicast traffic to those merge 804 points. The failure being protected against may be link or node 805 failure. If unicast forwarding can provide an SRLG-protecting 806 alternate, then SRLG-protection is also possible. 808 There are five aspects to making this work. 810 1. PLR needs to learn the MPs and their associated MPLS labels to 811 create protection state. 813 2. Unicast routing has to offer alternates or have dedicated tunnels 814 to reach the MPs. The PLR encapsulates the multicast traffic and 815 directs it to be forwarded via unicast routing. 817 3. The MP must identify alternate traffic and decide when to accept 818 and forward it or drop it. 820 4. 
When the MP reconverges, it must move to its new UMH using make- 821 before-break so that traffic loss is minimized. 823 5. The PLR must know when to stop sending traffic on the alternates. 825 5.1.1. Learning the MPs 827 If link-protection is all that is desired, then the PLR already knows 828 the identities of the MPs. For node-protection, this is not 829 sufficient. In the PLR-driven case, there is no direct communication 830 possible between the PLR and the next-next-hops on the multicast 831 tree. (For mLDP, when targeted LDP sessions are used, this is 832 considered to be MP-driven and is covered in Section 5.2.) 834 In addition to learning the identities of the MPs, the PLR must also 835 learn the MPLS label, if any, associated with each MP. For mLDP, a 836 different label should be supplied for the alternate traffic; this 837 allows the MP to distinguish between the primary and alternate 838 traffic. For PIM, an MPLS label is used to identify that traffic is 839 the alternate. The unicast tunnel used to send traffic to the MP may 840 have penultimate-hop-popping done; thus without an explicit MPLS 841 label, there is no certainty that a packet could be conclusively 842 identified as primary traffic or as alternate traffic. 844 A router must tell its UMH the identity of all downstream multicast 845 routers, and their associated alternate labels, on the particular 846 multicast tree. This clearly requires protocol extensions. The 847 extensions for PIM are given in [I-D.kebler-pim-mrt-protection]. 849 5.1.2. Using Unicast Tunnels and Indirection 851 The PLR must encapsulate the multicast traffic and tunnel it towards 852 each MP. The key point is how that traffic then reaches the MP. 853 There are basically two possibilities. It is possible that a 854 dedicated RSVP-TE tunnel exists and can be used to reach the MP for 855 just this traffic; such an RSVP-TE tunnel would be explicitly routed 856 to avoid the failure point. 
The second possibility is that the 857 packet is tunneled via LDP and uses unicast routing. The second case 858 is explored here. 860 It is necessary to assume that unicast LDP fast-reroute 861 [I-D.ietf-rtgwg-mrt-frr-architecture][RFC5714][RFC5286] is supported 862 by the PLR. Since multicast convergence takes longer than unicast 863 convergence, the PLR may have two different routes to the MP over 864 time. When the failure happens, the PLR will have an alternate, 865 whether LFA or MRT, to reach the MP. Then the unicast routing 866 converges and the PLR will have a new primary route to the MP. Once 867 the routing has converged, it is important that alternate traffic is 868 no longer carried on the MRT forwarding topologies. This rule allows 869 the MRT forwarding topologies to reconverge and be available for the 870 next failure. Therefore, it is also necessary for the tunneled 871 multicast traffic to move from the alternate route to the new primary 872 route when the PLR reconverges. Therefore, the tunneled multicast 873 traffic should use indirection to obtain the unicast routing's 874 current next-hops to the MP. If physical indirection is not 875 feasible, then when the unicast LIB is updated, the associated 876 multicast alternate tunnel state should be as well. 878 When the PLR detects a local failure, the PLR replicates each 879 multicast packet, swaps or adds the alternate MPLS label needed by 880 the MP, and finally pushes the appropriate label for the MP based 881 upon the outgoing interface selected by the unicast routing. 883 For PIM, if no alternate labels are supplied by the MPs, then the 884 multicast traffic could be tunneled in IP. This would require 885 unicast IP fast-reroute. 887 5.1.3. MP Alternate Traffic Handling 889 A potential Merge Point must determine when and if to accept 890 alternate traffic. There are two critical components to this 891 decision. First, the MP must know the state of all links to its UMH. 
892 This allows the MP to determine whether the multicast stream could be 893 received from the UMH. Second, the MP must be able to distinguish 894 between a normal multicast tree packet and an alternate packet. 896 The logic is similar for PIM and mLDP, but in PIM there is only one 897 RPF-interface or interface of interest to the UMH. In mLDP, all the 898 directly connected interfaces to the UMH are of interest. When the 899 MP detects a local failure, if that interface was the last connected 900 to the UMH and used for the multicast group, then the MP must rapidly 901 switch from accepting the normal multicast tree traffic to accepting 902 the alternate traffic. This rapid change must happen within 903 approximately the same 50 milliseconds that the PLR takes to switch 904 to sending traffic on the alternate, and for the same reasons. It 905 does no good for the PLR to send alternate traffic if the MP doesn't 906 accept it when it is needed. 908 The MP can identify alternate traffic based upon the MPLS label. 909 This will be the alternate label that the MP supplied to its UMH for 910 this purpose. 912 5.1.4. Merge Point Reconvergence 914 After a failure, the MP will want to join the multicast tree 915 according to the new topology. It is critical that the MP does this 916 in a way that minimizes the traffic disruption. Whenever paths 917 change, there is also the possibility for a traffic-affecting event 918 due to different latencies. However, traffic impact above that 919 should be avoided. 921 The MP must do make-before-break. Until the MP knows that its new 922 UMH is fully connected to the MCI, the MP should continue to accept 923 its old alternate traffic. The MP could learn that the new UMH is 924 sufficient either via control-plane mechanisms or in a data-driven 925 fashion. In the latter case, the reception of traffic from the new 926 UMH can trigger the change-over.
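A minimal sketch of this make-before-break behavior at the MP follows, with hypothetical event hooks for control-plane confirmation, data arrival from the new UMH, and a backstop timer; none of these hooks are defined by PIM or mLDP, and the names are illustrative.

```python
# Hypothetical sketch of the merge point's make-before-break logic
# (Section 5.1.4): keep accepting alternate traffic until the new
# UMH is known to deliver the stream, via either a control-plane
# confirmation or observed data traffic, with a timeout backstop
# for trees with long quiet periods.
import time

class MergePointState:
    def __init__(self, switch_timeout_s=10.0):
        self.using_new_umh = False
        self.deadline = time.monotonic() + switch_timeout_s

    def _switch(self):
        # Move to the new UMH and stop accepting alternate traffic.
        self.using_new_umh = True

    def on_control_plane_confirmation(self):
        self._switch()

    def on_packet_from_new_umh(self):
        # Data-driven trigger: traffic arriving from the new UMH
        # shows it is fully connected back to the MCI.
        self._switch()

    def on_timer_tick(self):
        # Force the switch if the tree has been quiet too long.
        if time.monotonic() >= self.deadline:
            self._switch()

    def accept_alternate_traffic(self):
        return not self.using_new_umh

mp = MergePointState()
assert mp.accept_alternate_traffic()
mp.on_packet_from_new_umh()
assert not mp.accept_alternate_traffic()
```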
If the data-driven approach is used, a 927 time-out to force the switch should apply to handle multicast trees 928 that have long quiet periods. 930 5.1.5. PLR termination of alternate traffic 932 The PLR sends traffic on the alternates for a configurable time-out. 933 There is no clean way for the next-hop routers and/or next-next-hop 934 routers to indicate that the traffic is no longer needed. 936 If better control were desired, each MP could tell its UMH what the 937 desired time-out is. The UMH could forward this to the PLR as well. 938 Then the PLR could send alternate traffic to different MPs based upon 939 the MP's individual timer. This would only be an advantage if some 940 of the MPs were expected to have a longer multicast reconvergence 941 time than others - either due to load or router capabilities. 943 5.2. MP-driven Unicast Tunnels 945 MP-driven unicast tunnels are only relevant for mLDP where targeted 946 LDP sessions are feasible. For PIM, there is no mechanism to 947 communicate beyond a router's immediate neighbors; these techniques 948 could work for link-protection, but even then there would not be a 949 way of requesting that the PLR should stop sending traffic. 951 There are three differences for MP-driven unicast tunnels from PLR- 952 driven unicast tunnels. 954 1. The MPs learn the identity of the PLR from their UMH. The PLR 955 does not learn the identities of the MPs. 957 2. The MPs create direct connections to the PLR and communicate 958 their alternate labels. 960 3. When the MPs have converged, each explicitly tells the PLR to 961 stop sending alternate traffic. 963 The first means that a router communicates its UMH to all its 964 downstream multicast hops. Then each MP communicates to the PLR(s) 965 (1 for link-protection and 1 for node-protection) and indicates the 966 multicast tree that protection is desired for and the associated 967 alternate label. 
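The PLR-side bookkeeping for MP-driven protection can be sketched as follows. The class and method names are hypothetical, standing in for state the PLR would build from signaling received over targeted LDP sessions.

```python
# Hypothetical sketch of the PLR side of MP-driven unicast tunnels
# (Section 5.2): each MP registers for protection with an alternate
# label, and explicitly withdraws when it no longer needs alternate
# traffic. No names here come from any real protocol API.

class PlrProtectionTable:
    def __init__(self):
        # (multicast_tree, mp_id) -> alternate MPLS label
        self.protected = {}

    def register(self, tree, mp_id, alt_label):
        # MP signals desire for protection over a targeted session.
        self.protected[(tree, mp_id)] = alt_label

    def withdraw(self, tree, mp_id):
        # MP has reconverged: stop sending it alternate traffic.
        self.protected.pop((tree, mp_id), None)

    def replication_targets(self, tree):
        # On a local failure, replicate to each registered MP,
        # tunneling with its alternate label via unicast routing.
        return {mp: lbl for (t, mp), lbl in self.protected.items()
                if t == tree}

plr = PlrProtectionTable()
plr.register("mldp-tree-1", "MP1", 3001)
plr.register("mldp-tree-1", "MP2", 3002)
plr.withdraw("mldp-tree-1", "MP1")
print(plr.replication_targets("mldp-tree-1"))  # {'MP2': 3002}
```

The explicit withdraw is what distinguishes this from the PLR-driven case, where the PLR instead stops on a configurable timer.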
969 When the PLR learns about a new MP, it adds that MP and associated 970 information to the set of MPs to be protected. On a failure, the PLR 971 behaves the same as for the PLR-driven unicast tunnels. 973 After the failure, the MP reconverges using make-before-break. Then 974 the MP explicitly communicates to the PLR(s) that alternate traffic 975 is no longer needed for that multicast tree. When the node- 976 protecting PLR hasn't changed for an MP, it may be necessary to 977 withdraw the old alternate label, which tells the PLR to stop 978 transmitting alternate traffic, and then provide a new alternate 979 label. 981 5.3. MP-driven Alternate Trees 983 For some networks, it is highly desirable not to have the PLR perform 984 replication to each MP. PLR replication can cause substantial 985 congestion on links used by alternates to different MPs. At the same 986 time, it is also desirable to have minimal extra state created in the 987 network. This can be resolved by creating alternate-trees that can 988 protect multiple multicast groups as a bypass-alternate-tree. An 989 alternate-tree can also be created per multicast group, PLR and 990 failure point. 992 It is not possible to merge alternate-trees for different PLRs or for 993 different neighbors. This is shown in Figure 4 where G can't select 994 an acceptable upstream node on the alternate tree that doesn't 995 violate either the need to avoid C (for PLR A) or D (for PLR B).
997 |-------[S]--------| Alternate from A must avoid C 998 V V Alternate from B must avoid D 999 [A]------[E]-------[B] 1000 | | | 1001 V | V 1002 |--[C]------[F]-------[D]---| 1003 | | | | 1004 | |-------[G]--------| | 1005 | | | 1006 | | | 1007 |->[R1]-----[H]-------[R2]<-| 1009 (a) Multicast tree from S 1010 S->A->C->R1 and S->B->D->R2 1012 Figure 4: Alternate Trees from PLR A and B can't be merged 1014 An MP that joins an alternate-tree for a particular multicast stream 1015 should not expect or request PLR-replicated tunneled alternate 1016 traffic for that same multicast stream. 1018 Each alternate-tree is identified by the PLR which sources the 1019 traffic and the failure point (node and link) (FP) to be avoided. 1020 Different multicast groups with the same PLR and FP may have 1021 different sets of MPs - but at most they will include the FP (for 1022 link protection) and the neighbors of the FP other than the PLR. 1023 For a bypass-alternate-tree to work, it must be acceptable to 1024 temporarily send a multicast group's traffic to FP's neighbors that 1025 do not need it. This is the trade-off required to reduce alternate- 1026 tree state and use bypass-alternate-trees. As discussed in 1027 Section 5.1.3, a potential MP can determine whether to accept 1028 alternate traffic based upon the state of its normal upstream links. 1029 Alternate traffic for a group the MP hasn't joined can just be 1030 discarded. 1032 [S]......[PLR]--[ A ] 1033 | | | 1034 1| |2 | 1035 [ FP]--[MP3] 1036 | \ | 1037 | \ | 1038 [MP1]--[MP2] 1040 Figure 5: Alternate Tree Scenario 1042 For any router, knowing the PLR and the FP to avoid will force 1043 selection of either the Blue MRT or the Red MRT. It is possible that 1044 the FP doesn't actually appear in either MRT path, but the FP will 1045 always be in either the set of nodes that might be used for the Blue 1046 MRT path or the set of nodes that might be used for the Red MRT path.
1047 The FP's membership in one of the sets is a function of the partial 1048 ordering and topological ordering created by the MRT algorithm and is 1049 consistent between routers in the network graph. 1051 To create an alternate-tree, the following must happen: 1053 1. For node-protection, the MP learns from its upstream (the FP) the 1054 node-id of its upstream (the PLR) and, optionally, a link 1055 identifier for the link used to the PLR. The link-id is only 1056 needed for traffic handling in PIM, since mLDP can have targeted 1057 sessions between the MP and the PLR. 1059 2. For link-protection, the MP needs to know the node-id of its 1060 upstream (the PLR) and, optionally, its identifier for the link 1061 used to the PLR. 1063 3. The MP determines whether to use the Blue or Red forwarding 1064 topology to reach the PLR while avoiding the FP and associated 1065 interface. This gives the MP its alternate-tree upstream 1066 interface. 1068 4. The MP signals a backup-join to its alternate-tree upstream 1069 interface. The backup-join specifies the PLR, FP and, for PIM, 1070 the FP-PLR link identifier. If the alternate-tree is not to be 1071 used as a bypass-alternate-tree, then the multicast group (e.g. 1072 (S,G) or Opaque-Value) must be specified. 1074 5. A router that receives a backup-join and is not the PLR needs to 1075 create multicast state and send a backup-join towards the PLR on 1076 the appropriate Blue or Red forwarding topology as is locally 1077 determined to avoid the FP and FP-PLR link. 1079 6. Backup-joins for the same (PLR, FP, PLR-FP link-id) that 1080 reference the same multicast group can be merged into a single 1081 alternate-tree. Similarly, backup-joins for the same (PLR, FP, 1082 PLR-FP link-id) that reference no multicast group can be merged 1083 into a single alternate-tree. 1085 7. 
When the PLR receives the backup-join, it associates either the 1086 specified multicast group with that alternate-tree, if such is 1087 given, or associates all multicast groups that go to the FP via 1088 the specified FP-PLR link with the alternate-tree. 1090 As an example, consider Figure 5. FP would send a backup-join to MP3 1091 indicating (PLR, FP, PLR-FP link-1). MP3 sends a backup-join to A. 1092 MP1 sends a backup-join to MP2 and MP2 sends a backup-join to MP3. 1094 It is necessary that traffic on each alternate-tree self-identify as 1095 to which alternate-tree it is part of. This is because an alternate- 1096 tree for a multicast-group and a particular (PLR, FP, PLR-FP link-id) 1097 can easily overlap with an alternate-tree for the same multicast 1098 group and a different (PLR, FP, PLR-FP link-id). The best way of 1099 doing this depends upon whether PIM or mLDP is being used. 1101 5.3.1. PIM details for Alternate-Trees 1103 For PIM, the (S,G) of the IP packet is a globally unique identifier 1104 and is understood. To identify the alternate-tree, the most 1105 straightforward way is to use MPLS labels distributed in the PIM 1106 backup-join messages. An MP can use the incoming label to indicate 1107 the set of RPF-interfaces for which the traffic may be an alternate. 1108 If the alternate-tree isn't a bypass-alternate-tree, then only one 1109 RPF interface is referenced. If the alternate-tree is a bypass- 1110 alternate-tree, then multiple RPF-interfaces (parallel links to FP) 1111 might be intended. Alternate-tree traffic may cross an interface 1112 multiple times - either because the interface is a broadcast 1113 interface and different downstream-assigned labels are provided 1114 and/or because an MP may provide different labels. 1116 5.3.2. mLDP details for Alternate-Trees 1118 For mLDP, if bypass-alternate-trees are used, then the PLR must 1119 provide upstream-assigned labels for each multicast stream.
The MP 1120 provides the label for the alternate-tree; if the alternate-tree is 1121 not a bypass-alternate-tree, this label also describes the multicast 1122 stream. If the alternate-tree is a bypass-alternate-tree, then it 1123 provides the context for the PLR-assigned labels for each multicast 1124 stream. If there are targeted LDP sessions between the PLR and the 1125 MPs, then the PLR could provide the necessary upstream-assigned 1126 labels. 1128 5.3.3. Traffic Handling by PLR 1130 An issue is how long the PLR should continue to send alternate 1131 traffic. With an alternate-tree, the PLR can know to 1132 stop forwarding alternate traffic on the alternate-tree when that 1133 alternate-tree's state is torn down. This provides a clear signal 1134 that alternate traffic is no longer needed. 1136 5.4. Methods Compared for PIM 1138 The two approaches that are feasible for PIM are PLR-driven Unicast 1139 Tunnels and MP-driven Alternate-Trees. 1141 +-------------------------+-------------------+---------------------+ 1142 | Aspect | PLR-driven | MP-driven | 1143 | | Unicast Tunnels | Alternate-Trees | 1144 +-------------------------+-------------------+---------------------+ 1145 | Worst-case Traffic | 1 + number of MPs | 2 | 1146 | Replication Per Link | | | 1147 | PLR alternate-traffic | timer-based | control-plane | 1148 | | | terminated | 1149 | Extra multicast state | none | per (PLR,FP,S) for | 1150 | | | bypass mode | 1151 +-------------------------+-------------------+---------------------+ 1153 Which approach is preferred may be network-dependent. It should also 1154 be possible to use both in the same network. 1156 5.5. Methods Compared for mLDP 1158 All three approaches are feasible for mLDP. Below is a brief 1159 comparison of various aspects of each.
1161 +-------------------+---------------+-------------+-----------------+ 1162 | Aspect | MP-driven | PLR-driven | MP-driven | 1163 | | Unicast | Unicast | Alternate-Trees | 1164 | | Tunnels | Tunnels | | 1165 +-------------------+---------------+-------------+-----------------+ 1166 | Worst-case | 1 + number of | 1 + number | 2 | 1167 | Traffic | MPs | of MPs | | 1168 | Replication Per | | | | 1169 | Link | | | | 1170 | PLR | control-plane | timer-based | control-plane | 1171 | alternate-traffic | terminated | | terminated | 1172 | Extra multicast | none | none | per (PLR,FP,S) | 1173 | state | | | for bypass mode | 1174 +-------------------+---------------+-------------+-----------------+ 1176 6. Acknowledgements 1178 The authors would like to thank Kishore Tiruveedhula, Santosh Esale, 1179 and Maciek Konstantynowicz for their suggestions and review. 1181 7. IANA Considerations 1183 This document includes no request to IANA. 1185 8. Security Considerations 1187 This architecture is not currently believed to introduce new security 1188 concerns. 1190 9. References 1192 9.1. Normative References 1194 [I-D.enyedi-rtgwg-mrt-frr-algorithm] 1195 Atlas, A. and A. Csaszar, "Algorithms for computing 1196 Maximally Redundant Trees for IP/LDP Fast- Reroute", 1197 draft-enyedi-rtgwg-mrt-frr-algorithm-00 (work in 1198 progress), October 2011. 1200 [I-D.ietf-rtgwg-mrt-frr-architecture] 1201 Atlas, A., Kebler, R., Konstantynowicz, M., Csaszar, A., 1202 White, R., and M. Shand, "An Architecture for IP/LDP Fast- 1203 Reroute Using Maximally Redundant Trees", 1204 draft-ietf-rtgwg-mrt-frr-architecture-00 (work in 1205 progress), January 2012. 1207 [RFC4601] Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas, 1208 "Protocol Independent Multicast - Sparse Mode (PIM-SM): 1209 Protocol Specification (Revised)", RFC 4601, August 2006. 1211 [RFC6388] Wijnands, IJ., Minei, I., Kompella, K., and B.
Thomas, 1212 "Label Distribution Protocol Extensions for Point-to- 1213 Multipoint and Multipoint-to-Multipoint Label Switched 1214 Paths", RFC 6388, November 2011. 1216 [RFC6420] Cai, Y. and H. Ou, "PIM Multi-Topology ID (MT-ID) Join 1217 Attribute", RFC 6420, November 2011. 1219 9.2. Informative References 1221 [I-D.iwijnand-mpls-mldp-multi-topology] 1222 Wijnands, I. and K. Raza, "mLDP Extensions for Multi 1223 Topology Routing", 1224 draft-iwijnand-mpls-mldp-multi-topology-01 (work in 1225 progress), January 2012. 1227 [I-D.karan-mofrr] 1228 Karan, A., Filsfils, C., Farinacci, D., Decraene, B., 1229 Leymann, N., and T. Telkamp, "Multicast only Fast Re- 1230 Route", draft-karan-mofrr-01 (work in progress), 1231 March 2011. 1233 [I-D.kebler-pim-mrt-protection] 1234 Kebler, R., Atlas, A., Wijnands, IJ., and G. Enyedi, "PIM 1235 Extensions for Protection Using Maximally Redundant 1236 Trees", draft-kebler-pim-mrt-protection-00 (work in 1237 progress), March 2012. 1239 [I-D.wijnands-mpls-mldp-node-protection] 1240 Wijnands, I., Rosen, E., Raza, K., Tantsura, J., and A. 1241 Atlas, "mLDP Node Protection", 1242 draft-wijnands-mpls-mldp-node-protection-00 (work in 1243 progress), February 2012. 1245 [RFC5286] Atlas, A. and A. Zinin, "Basic Specification for IP Fast 1246 Reroute: Loop-Free Alternates", RFC 5286, September 2008. 1248 [RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework", 1249 RFC 5714, January 2010. 1251 Authors' Addresses 1253 Alia Atlas (editor) 1254 Juniper Networks 1255 10 Technology Park Drive 1256 Westford, MA 01886 1257 USA 1259 Email: akatlas@juniper.net 1261 Robert Kebler 1262 Juniper Networks 1263 10 Technology Park Drive 1264 Westford, MA 01886 1265 USA 1267 Email: rkebler@juniper.net 1268 IJsbrand Wijnands 1269 Cisco Systems, Inc. 
1271 Email: ice@cisco.com 1273 Andras Csaszar 1274 Ericsson 1275 Konyves Kalman krt 11 1276 Budapest 1097 1277 Hungary 1279 Email: Andras.Csaszar@ericsson.com 1281 Gabor Sandor Enyedi 1282 Ericsson 1283 Konyves Kalman krt 11. 1284 Budapest 1097 1285 Hungary 1287 Email: Gabor.Sandor.Enyedi@ericsson.com