2 Routing Area Working Group A. Atlas, Ed. 3 Internet-Draft R. Kebler 4 Intended status: Standards Track Juniper Networks 5 Expires: September 3, 2012 IJ. Wijnands 6 Cisco Systems, Inc. 7 A. Csaszar 8 G. Enyedi 9 Ericsson 10 March 2, 2012 12 An Architecture for Multicast Protection Using Maximally Redundant Trees 13 draft-atlas-rtgwg-mrt-mc-arch-00 15 Abstract 17 Failure protection is desirable for multicast traffic, whether 18 signaled via PIM or mLDP. Different mechanisms are suitable for 19 different use-cases and deployment scenarios. This document 20 describes the architecture for global protection (aka multicast live-live) and for local protection (aka fast-reroute).
23 The general methods for global protection and local protection using 24 alternate-trees are dependent upon the use of Maximally Redundant 25 Trees. Local protection can also tunnel traffic in unicast tunnels 26 to take advantage of the routing and fast-reroute mechanisms 27 available for IP/LDP unicast destinations. 29 The failures protected against are single link or node failures. 30 While the basic architecture might support protection against shared 31 risk group failures, algorithms to dynamically compute MRTs 32 supporting this are for future study. 34 Status of this Memo 36 This Internet-Draft is submitted in full conformance with the 37 provisions of BCP 78 and BCP 79. 39 Internet-Drafts are working documents of the Internet Engineering 40 Task Force (IETF). Note that other groups may also distribute 41 working documents as Internet-Drafts. The list of current Internet- 42 Drafts is at http://datatracker.ietf.org/drafts/current/. 44 Internet-Drafts are draft documents valid for a maximum of six months 45 and may be updated, replaced, or obsoleted by other documents at any 46 time. It is inappropriate to use Internet-Drafts as reference 47 material or to cite them other than as "work in progress." 48 This Internet-Draft will expire on September 3, 2012. 50 Copyright Notice 52 Copyright (c) 2012 IETF Trust and the persons identified as the 53 document authors. All rights reserved. 55 This document is subject to BCP 78 and the IETF Trust's Legal 56 Provisions Relating to IETF Documents 57 (http://trustee.ietf.org/license-info) in effect on the date of 58 publication of this document. Please review these documents 59 carefully, as they describe your rights and restrictions with respect 60 to this document. Code Components extracted from this document must 61 include Simplified BSD License text as described in Section 4.e of 62 the Trust Legal Provisions and are provided without warranty as 63 described in the Simplified BSD License. 
65 Table of Contents 67 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 68 1.1. Maximally Redundant Trees (MRTs) . . . . . . . . . . . . . 4 69 1.2. MRTs and Multicast . . . . . . . . . . . . . . . . . . . . 6 70 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 71 3. Use-Cases and Applicability . . . . . . . . . . . . . . . . . 8 72 4. Global Protection: Multicast Live-Live . . . . . . . . . . . . 9 73 4.1. Creation of MRMTs . . . . . . . . . . . . . . . . . . . . 10 74 4.2. Traffic Self-Identification . . . . . . . . . . . . . . . 11 75 4.2.1. Merging MRMTs for PIM if Traffic Doesn't 76 Self-Identify . . . . . . . . . . . . . . . . . . . . 12 77 4.3. Convergence Behavior . . . . . . . . . . . . . . . . . . . 13 78 4.4. Inter-area/level Behavior . . . . . . . . . . . . . . . . 14 79 4.4.1. Inter-area Node Protection with 2 border routers . . . 15 80 4.4.2. Inter-area Node Protection with > 2 Border Routers . . 16 81 4.5. PIM . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 82 4.5.1. Traffic Handling: RPF Checks . . . . . . . . . . . . . 17 83 4.6. mLDP . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 84 5. Local Repair: Fast-Reroute . . . . . . . . . . . . . . . . . . 17 85 5.1. PLR-driven Unicast Tunnels . . . . . . . . . . . . . . . . 18 86 5.1.1. Learning the MPs . . . . . . . . . . . . . . . . . . . 19 87 5.1.2. Using Unicast Tunnels and Indirection . . . . . . . . 19 88 5.1.3. MP Alternate Traffic Handling . . . . . . . . . . . . 20 89 5.1.4. Merge Point Reconvergence . . . . . . . . . . . . . . 21 90 5.1.5. PLR termination of alternate traffic . . . . . . . . . 21 91 5.2. MP-driven Unicast Tunnels . . . . . . . . . . . . . . . . 21 92 5.3. MP-driven Alternate Trees . . . . . . . . . . . . . . . . 22 93 5.3.1. PIM details for Alternate-Trees . . . . . . . . . . . 25 94 5.3.2. mLDP details for Alternate-Trees . . . . . . . . . . . 25 95 5.3.3. Traffic Handling by PLR . . . . . . . . . . . . . . . 25 96 5.4. 
Methods Compared for PIM . . . . . . . . . . . . . . . . . 26 97 5.5. Methods Compared for mLDP . . . . . . . . . . . . . . . . 26 98 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 26 99 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 27 100 8. Security Considerations . . . . . . . . . . . . . . . . . . . 27 101 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 27 102 9.1. Normative References . . . . . . . . . . . . . . . . . . . 27 103 9.2. Informative References . . . . . . . . . . . . . . . . . . 27 104 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 28 106 1. Introduction 108 This document describes how the algorithms in 109 [I-D.enyedi-rtgwg-mrt-frr-algorithm], which are used in 110 [I-D.ietf-rtgwg-mrt-frr-architecture] for unicast IP/LDP fast- 111 reroute, can be used to provide protection for multicast traffic. It 112 specifically applies to multicast state signaled by PIM[RFC4601] or 113 mLDP[RFC6388]. There are additional protocols that depend upon these 114 (e.g. VPLS, mVPN, etc.) and consideration of the applicability to 115 such traffic will be in a future version. 117 In this document, global protection is used to refer to the method of 118 having two maximally disjoint multicast trees where traffic may be 119 sent on both and resolved by the receiver. This is similar to the 120 ability with RSVP-TE LSPs to have a primary and a hot standby, except 121 that it can operate in 1+1 mode. This capability is also referred to 122 as multicast live-live and is a generalized form of that discussed in 123 [I-D.karan-mofrr]. In this document, local protection refers to the 124 method of having alternate ways of reaching the pre-identified merge 125 points upon detection of a local failure. This capability is also 126 referred to as fast-reroute. 
128 This document describes the general architecture, framework, and 129 trade-offs of the different approaches to solving these general 130 problems. It will recommend how to generally provide global 131 protection and local protection for mLDP and PIM traffic. Where 132 protocol extensions are necessary, they will be defined in separate 133 documents as follows. 135 o Global 1+1 Protection Using PIM 137 o Global 1+1 Protection Using mLDP 139 o Local Protection Using mLDP: 140 [I-D.wijnands-mpls-mldp-node-protection] describes 141 how to provide node protection and the necessary extensions using 142 targeted LDP sessions. 144 o Local Protection Using PIM 146 1.1. Maximally Redundant Trees (MRTs) 148 Maximally Redundant Trees (MRTs) are described in 149 [I-D.enyedi-rtgwg-mrt-frr-algorithm]; here we give only a brief 150 description of the concept. A pair of MRTs is a pair of directed 151 spanning trees (red and blue) with a common root, directed so 152 that each node can be reached from the root on both trees. Moreover, 153 these trees are redundant: they are constructed so that no 154 single link or node failure can separate any node from the 155 root on both trees, unless that failed link or node splits the 156 network into completely separated components (e.g. the link or node 157 was a cut-edge or cut-vertex). 159 Although for multicast the arcs (directed links) are directed away 160 from the root instead of towards the root, the same MRT computations 161 are used and apply. This is similar to how multicast uses unicast 162 routing's next-hops as the upstream hops. Thus this definition 163 slightly differs from the one presented in 164 [I-D.enyedi-rtgwg-mrt-frr-algorithm], since the arcs are directed 165 away from and not towards the root. When we need two paths towards a 166 given destination and not two away from it (e.g.
for unicast detours 167 for local repair solutions), we only need to reverse the arcs from 168 how they are used for the unicast routing case; thus constructing 169 MRTs towards or away from the root is the same problem. A pair of 170 MRTs is depicted in Figure 1. 172 [E]---[D]---| |---[J] 173 | | | | | 174 | | | | | 175 [R] [F] [C]---[G] | 176 | | | | | 177 | | | | | 178 [A]---[B]---| |---[H] 180 (a) a network 182 [E]<--[D]---| |-->[J] [E]<--[D] [J] 183 ^ | | | | ^ ^ 184 | V V | | | | 185 [R] [F] [C]-->[G] | [R] [F] [C]-->[G] | 186 | | | ^ ^ | | 187 V V V | | | | 188 [A]<--[B] [H] [A]-->[B]---| |-->[H] 190 (b) Blue MRT of root R (c) Red MRT of root R 192 Figure 1: A network and two MRTs found in it 194 It is important to realize that this redundancy criterion does not 195 imply that, after a failure, either of the MRTs remains intact, since 196 a node failure must affect any spanning tree. Redundancy here means 197 that there will be one set of nodes that can be reached along the 198 blue MRT and another set that remains reachable 199 along the red MRT. As an example, suppose that node F goes down; 200 that would separate B and A on the blue MRT and D and E on the red 201 MRT. Naturally, it is possible that the intersection of these two 202 sets is not empty; e.g. C, G, H and J will remain reachable on both 203 MRTs. Additionally, observe that a single link can be used in both 204 of the trees in different directions, so even a link failure can cut 205 both trees. In this example, the failure of link F<->B leads to the 206 same reachability sets. 208 Finally, it is critical to recall that a pair of MRTs is always 209 constructed together; they are not SPTs. While it would be useful 210 to have an algorithm that could find a redundant pair for a given 211 tree (e.g. for the SPT), that is impossible in general. Moreover, if 212 there is a failure and at least one of the trees changes, the other 213 tree may need to change as well.
Therefore, even if a node still 214 receives the traffic along the red tree, it cannot keep the old red 215 tree and simply construct a blue pair for it; there can be reconfiguration 216 in cases when traditional shortest-path-based thinking would not 217 expect it. To converge to a new pair of disjoint MRTs, it is 218 generally necessary to update both the blue MRT and the red MRT. 220 The two MRTs provide two separate forwarding topologies that can be 221 used in addition to the default shortest-path-tree (SPT) forwarding 222 topology (usually MT-ID 0). There is a Blue MRT forwarding topology 223 represented by one MT-ID; similarly, there is a Red MRT forwarding 224 topology represented by a different MT-ID. Naturally, a multicast 225 protocol is required to use the forwarding topology information to 226 build the desired multicast trees. The multicast protocol can simply 227 request the appropriate upstream interfaces, including the MT-ID when 228 needed. 230 1.2. MRTs and Multicast 232 Maximally Redundant Trees (MRTs) provide two advantages for protecting 233 multicast traffic. First, for global protection, MRTs are precisely 234 what needs to be computed to have maximally redundant multicast 235 distribution trees. Second, for local repair, MRTs ensure that there 236 will be protection to the merge points; the certainty of a path from any 237 merge point to the PLR that avoids the failed node allows for the 238 creation of alternate trees. 240 A known disadvantage of MRTs, and redundant trees in general, is that 241 the trees do not necessarily provide shortest detour paths. Modeling 242 is underway to investigate and compare the MRT lengths for the 243 different algorithm options [I-D.enyedi-rtgwg-mrt-frr-algorithm]. 245 2. Terminology 246 2-connected: A graph that has no cut-vertices. This is a graph 247 that requires two nodes to be removed before the network is 248 partitioned. 250 2-connected cluster: A maximal set of nodes that are 2-connected.
252 2-edge-connected: A network graph where at least two links must be 253 removed to partition the network. 255 ADAG: Almost Directed Acyclic Graph - a graph that, if all links 256 incoming to the root were removed, would be a DAG. 258 block: Either a 2-connected cluster, a cut-edge, or an isolated 259 vertex. 261 cut-link: A link whose removal partitions the network. A cut-link 262 by definition must be connected between two cut-vertices. If 263 there are multiple parallel links, then they are referred to as 264 cut-links in this document if removing the set of parallel links 265 would partition the network. 267 cut-vertex: A vertex whose removal partitions the network. 269 DAG: Directed Acyclic Graph - a graph where all links are directed 270 and there are no cycles in it. 272 GADAG: Generalized ADAG - a graph that is the combination of the 273 ADAGs of all blocks. 275 Maximally Redundant Trees (MRT): A pair of trees where the path 276 from any node X to the root R along the first tree and the path 277 from the same node X to the root along the second tree share the 278 minimum number of nodes and the minimum number of links. Each 279 such shared node is a cut-vertex. Any shared links are cut-links. 280 Any RT is an MRT but many MRTs are not RTs. 282 Maximally Redundant Multicast Trees (MRMT): A pair of multicast 283 trees built of the sub-set of MRTs that is needed to reach all 284 interested receivers. 286 network graph: A graph that reflects the network topology where all 287 links connect exactly two nodes and broadcast links have been 288 transformed into the standard pseudo-node representation. 290 Redundant Trees (RT): A pair of trees where the path from any node 291 X to the root R along the first tree is node-disjoint with the 292 path from the same node X to the root along the second tree. 293 These can be computed in 2-connected graphs. 
295 Merge Point (MP): For local repair, a router at which the alternate 296 traffic rejoins the primary multicast tree. For global 297 protection, a router which receives traffic on multiple trees and 298 must decide which stream to forward on. 300 Point of Local Repair (PLR): The router that detects a local 301 failure and decides whether and when to forward traffic on 302 appropriate alternates. 304 MT-ID: Multi-topology identifier. The default shortest-path-tree 305 topology is MT-ID 0. 307 MultiCast Ingress (MCI): Multicast Ingress, the node where the 308 multicast stream enters the current transport technology (MPLS-mLDP 309 or IP-PIM) domain. This may be the router attached to the 310 multicast source, the PIM Rendezvous Point (RP), or the mLDP Root 311 node address. 313 Upstream Multicast Hop (UMH): Upstream Multicast Hop, a candidate 314 next-hop that can be used to reach the MCI of the tree. 316 Stream Selection: The process by which a router determines which of 317 the multiple primary multicast streams to accept and forward. The 318 router can decide on a packet-by-packet basis or simply 319 per-stream. This is done for global protection 1+1 and described in 320 [I-D.karan-mofrr]. 322 MultiCast Egress (MCE): Multicast Egress, a node where the 323 multicast stream exits the current transport technology (MPLS-mLDP 324 or IP-PIM) domain. This is usually a receiving router that 325 may forward the multicast traffic on towards receivers based upon 326 IGMP or other technology. 328 3. Use-Cases and Applicability 330 Protection of multicast streams has gained importance with the use of 331 multicast to distribute video, including live video such as IP-TV. 332 There are a number of different scenarios and uses of multicast that 333 require protection. A few preliminary examples are described below.
335 o When video is distributed via IP or MPLS for a cable application, 336 it is desirable to have global protection 1+1 so that the 337 customer-perceived impact is limited. A QAM can join two 338 multicast groups and determine which stream to use based upon the 339 stream quality. A network implementing this may be custom-engineered 340 for this particular purpose. 342 o In financial markets, stock ticker data is distributed via 343 multicast. The loss of data can have a significant financial 344 impact. Depending on the network, either global protection 1+1 or 345 local protection can minimize the impact. 347 o Several IP-multicast-based solutions exist for updating the software 348 or firmware of large numbers of end-user or operator-owned networking 349 equipment. Since IP multicast uses 350 datagram transport, recovering lost data is cumbersome and 351 decreases the advantages offered by multicast. Solutions may rely 352 on sending the updates several times; a properly protected network 353 may mean that fewer repetitions are required. Other solutions 354 rely on the recipient asking for lost data segments explicitly 355 on-demand. A network failure could cause data loss for a significant 356 number of receivers, which in turn would start requesting the 357 lost data in a burst that could overload the server. Properly 358 engineered multicast fast-reroute would minimize such impacts. 360 o Some providers offer multicast VPN services to their customers. 361 SLAs between the customer and provider may set low packet-loss 362 requirements. In such cases, interruptions longer than the outage 363 timescales targeted by FRR could cause direct financial losses for 364 the provider. 366 Global protection 1+1 uses maximally redundant multicast trees 367 (MRMTs) to simultaneously distribute a multicast stream on both 368 MRMTs. The disadvantage is the extra state and bandwidth 369 requirements of always sending the traffic twice. The advantage is
The advantage is 370 that the latency of each MRMT can be known and the receiver can 371 select the best stream. 373 Local protection provides a patch around the fault while the 374 multicast tree reconverges. When PLR replication is used, there is 375 no extra multicast state in the network, but the bandwidth 376 requirements vary based upon how many potential merge-points must be 377 provided. When alternate-trees are used, there is extra multicast 378 state, but the bandwidth required on a link can be limited to no 379 more than one copy of the primary multicast tree traffic and one copy of 380 the alternate-tree traffic. 382 4. Global Protection: Multicast Live-Live 384 In MoFRR [I-D.karan-mofrr], the idea of joining both a primary and a 385 secondary tree is introduced with the requirement that the primary 386 and secondary trees be link and node disjoint. This works well for 387 networks where there are dual-planes, as explained in 388 [I-D.karan-mofrr]. For other networks, it is still desirable to have 389 two disjoint multicast trees and allow a receiver to join both and 390 make its own decision about which traffic to accept. 392 Using MRTs gives the ability to guarantee that the two trees are as 393 disjoint as possible and are dynamically recomputed whenever the topology 394 changes. The MRTs used are rooted at the MultiCast Ingress (MCI). 395 One multicast tree is created using the Blue MRT forwarding topology. 396 The second multicast tree is created using the Red MRT forwarding 397 topology. This can be accomplished by specifying the appropriate 398 MT-ID associated with each forwarding topology. 400 There are four different aspects of using MRTs for 1+1 Global 401 Protection that are necessary to consider. They are as follows. 403 1. Creation of the maximally redundant multicast trees (MRMTs) based 404 upon the forwarding topologies. 406 2. Traffic Identification: How to handle traffic when the two MRMTs 407 overlap due to a cut-vertex or cut-link. 409 3.
Convergence: How to converge after a network change and get back 410 to a protected state. 412 4. Inter-area/inter-level Behavior: How to compute and use MRMTs 413 when the multicast source is outside the area/level and how to 414 provide border-router protection. 416 4.1. Creation of MRMTs 418 The creation of the two maximally redundant multicast trees occurs as 419 described below. This assumes that the next-hops to the MCI 420 associated with the Blue and Red forwarding topologies have already 421 been computed and stored. 423 1. A receiving router determines that it wants to join both the Blue 424 tree and the Red tree. The details of how it makes this decision 425 are not covered in this document and could be based on 426 configuration, additional protocols, etc. 428 2. The router selects among the Blue next-hops an Upstream Multicast 429 Hop (UMH) to reach the MCI node. The router joins the tree 430 towards the selected UMH, including a multi-topology id (MT-ID) 431 identifying the Blue MRT. 433 3. The router selects among the Red next-hops an Upstream Multicast 434 Hop (UMH) to reach the MCI node. The router joins the tree 435 towards the selected UMH, including a multi-topology id (MT-ID) 436 identifying the Red MRT. 438 4. When a router receives a tree setup request specifying a 439 particular MT-ID (e.g. Color), the router selects among the 440 Color next-hops to the MCI a UMH node, creates the necessary 441 multicast state, and joins the tree towards the UMH node. 443 4.2. Traffic Self-Identification 445 Two maximally redundant trees will share any cut-vertices and 446 cut-links in the network. In the multicast global protection 1+1 case, 447 this means that the potential single failures of the other nodes and 448 links in the network are still protected against. If a cut-vertex 449 cannot associate traffic with a particular MRMT, then the traffic would 450 be incorrectly replicated to both MRMTs, resulting in complete 451 duplication of traffic. An example of such MRTs is given earlier in
An example of such MRTs is given earlier in 452 Figure 1 and repeated below in Figure 2, where there are two cut- 453 vertices C and G and a cut-link C<->G. 455 [E]---[D]---| |---[J] 456 | | | | | 457 | | | | | 458 [R] [F] [C]---[G] | 459 | | | | | 460 | | | | | 461 [A]---[B]---| |---[H] 463 (a) a network 465 [E]<--[D]---| |-->[J] [E]<--[D] [J] 466 ^ | | | | ^ ^ 467 | V V | | | | 468 [R] [F] [C]-->[G] | [R] [F] [C]-->[G] | 469 | | | ^ ^ | | 470 V V V | | | | 471 [A]<--[B] [H] [A]-->[B]---| |-->[H] 473 (b) Blue MRT of root R (c) Red MRT of root R 475 Figure 2: A network and two MRTs found in it 477 In this example, traffic from the multicast source R to a receiver G, 478 J, or H will cross link C<->G on both the Blue and Red MRMTs. When 479 this occurs, there are several different possibilities depending upon 480 protocol. 482 mLDP: Different label bindings will be created for the Blue and Red 483 MRMTs. As specified in [I-D.iwijnand-mpls-mldp-multi-topology], 484 the P2MP FEC Element will use the MT IP Address Family to encode 485 the Root node address and MRT T-ID. Each MRMT will therefore have 486 a different P2MP FEC Element and be assigned an independent label. 488 PIM: There are three different ways to handle IP traffic forwarded 489 based upon PIM when that traffic will overlap on a link. 491 A. Different Groups: If different multicast groups are used for 492 each MRMT, then the traffic clearly indicates which MRMT it 493 belongs to. In this case, traffic on the Blue MRMT would use 494 multicast group G-blue and traffic on the Red MRMT would use 495 multicast group G-red. 497 B. Different Source Loopbacks: Another option is to use different 498 IP addresses for the source S, so S might announce S-red and 499 S-blue. In this case, traffic on the Blue MRMT would have an 500 IP source of S-blue and traffic on the Red MRMT would have an 501 IP source of S-red. 503 C. 
Stream Selection and Merging: The third option, described in 504 Section 4.2.1, is to have a router that gets (S,G) Joins for 505 both the Blue MT-ID and the Red MT-ID merge those into a 506 single tree. The router may need to select which upstream 507 stream to use, just as if it were a receiving router. 509 There are three options presented for PIM. The most appropriate will 510 depend upon the deployment scenario as well as router capabilities. 512 4.2.1. Merging MRMTs for PIM if Traffic Doesn't Self-Identify 514 When traffic doesn't self-identify, the cut-vertices must follow 515 specific rules to avoid traffic duplication. This section describes 516 the behavior that allows the same (S,G) to be used for both the 517 Blue MT-ID and Red MT-ID (e.g. when the traffic doesn't self-identify 518 as to its MT-ID). 520 The behavior described in this section differs from the conflict 521 resolution described in [RFC6420] because these rules apply to the 522 Global Protection 1+1 case. Specifically, it is not sufficient for an 523 upstream router to pick only one of the two MT-IDs to join, because 524 that does not maximize the protection provided. 526 As described in [RFC6420], a router that receives (S,G) Joins for 527 both the Blue MT-ID and the Red MT-ID can merge the set of downstream 528 interfaces in its forwarding entry. Unlike the procedures defined in 529 [RFC6420], the router must send a Join upstream for each MT-ID. If a 530 router has different upstream interfaces for these MRMTs, then the 531 router will need to do stream selection and forward the selected 532 stream to its outgoing interfaces, just as if it were an MCE. The 533 stream selection methods for detecting failures and handling traffic 534 discard are described in [I-D.karan-mofrr]. 536 This method does not work if the MRMTs merge on a common LAN with 537 different upstream routers. In this case, the traffic cannot be 538 distinguished on the LAN and will result in duplication on the LAN.
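The per-router merge and stream-selection behavior of Section 4.2.1 can be sketched in a few lines. The following Python model is purely illustrative (the class, interface names, and failure hook are hypothetical, and real failure detection would follow [I-D.karan-mofrr]); it captures the key rule that downstream interfaces are merged while one upstream Join is kept per MT-ID.

```python
# Hypothetical sketch of the Section 4.2.1 merge behavior for PIM when
# traffic does not self-identify. Names and structures are illustrative.

BLUE, RED = "blue-mrt", "red-mrt"

class MergedEntry:
    """Forwarding state for one (S,G) joined on both MT-IDs."""
    def __init__(self):
        self.downstream = set()   # merged outgoing interfaces
        self.upstream = {}        # MT-ID -> upstream interface (UMH)
        self.active = BLUE        # currently accepted stream

    def add_join(self, mt_id, downstream_if, umh_if):
        # Merge downstream interfaces across MT-IDs, but keep one
        # upstream Join per MT-ID (unlike plain RFC 6420 resolution).
        self.downstream.add(downstream_if)
        self.upstream[mt_id] = umh_if

    def joins_to_send(self):
        # A Join must be sent upstream for EACH MT-ID to preserve
        # maximal protection.
        return [(mt_id, umh) for mt_id, umh in self.upstream.items()]

    def accept(self, mt_id):
        # Stream selection: forward only the active stream and discard
        # the other, as an MCE would.
        return mt_id == self.active

    def on_failure(self, failed_mt_id):
        # Failure detection (as in MoFRR) switches the active stream.
        if failed_mt_id == self.active:
            self.active = RED if self.active == BLUE else BLUE

entry = MergedEntry()
entry.add_join(BLUE, downstream_if="ge-0/0/1", umh_if="ge-0/0/2")
entry.add_join(RED, downstream_if="ge-0/0/1", umh_if="ge-0/0/3")
assert len(entry.joins_to_send()) == 2     # one Join per MT-ID
assert entry.accept(BLUE) and not entry.accept(RED)
entry.on_failure(BLUE)
assert entry.accept(RED)
```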
539 The normal PIM Assert procedure would stop one of the upstream 540 routers from transmitting duplicates onto the LAN once the duplication is 541 detected. This, in turn, may cause the duplicate stream to be pruned 542 back to the source. Thus, end-to-end protection in this case of the 543 MRMTs converging on a single LAN with different upstream interfaces 544 can only be accomplished by the methods of traffic 545 self-identification. 547 4.3. Convergence Behavior 549 It is necessary to handle topology changes and get back to having two 550 MRMTs that provide global protection. To understand the requirements 551 and what can be computed, recall the following facts. 553 a. It is not generally possible to compute a single tree that is 554 maximally redundant to an existing tree. 556 b. The pair of MRTs must be computed simultaneously. 558 c. After a single link or node failure, there is one set of nodes 559 that can be reached from the root on the Blue MRMT and a second 560 set of nodes that can be reached from the root on the Red MRMT. 561 If the failed node or link wasn't a cut-vertex or cut-edge, all nodes will be 562 in at least one of these two sets. 564 To gracefully converge, it is necessary never to have a router where 565 both its Red MRMT and Blue MRMT are broken. There are three 566 different ways in which this could be done. These options are being 567 more fully explored to see which is most practical and provides the 568 best set of trade-offs. 570 Ordered Convergence When a single failure occurs, each receiver 571 determines whether it was affected or unaffected. First, the 572 affected receivers identify the broken MRMT color (e.g. blue) and 573 join the MRMT via their new UMH for that MRT color. Once the 574 affected receivers receive confirmation that the new MRMT has been 575 successfully created back to the MCI, the affected receivers 576 switch to using that MRMT.
The affected receivers tear down the 577 old broken MRMT state and join the MRMT via their new UMH for the 578 other MRT color (e.g. red). Finally, once the affected receivers 579 receive confirmation that the new MRMT has been successfully 580 created back to the MCI, the affected receivers can tear down the 581 old working MRMT state. Once the affected receivers have updated 582 their state, the unaffected receivers also need to do the same 583 staging - first joining the MRMT via their new UMH for the Blue 584 MRT, waiting for confirmation, switching to using traffic from the 585 Blue MRMT, tearing down the old Blue MRMT state, joining the MRMT 586 via their new UMH for the Red MRT, waiting for confirmation, and 587 tearing down the old Red MRMT state. There are complexities 588 remaining, such as determining how an Unaffected Receiver decides 589 that the Affected Receivers are done. When the topology change 590 isn't a failure, all receivers are unaffected and the same process 591 can apply. 593 Protocol Make-Before-Break In the control plane, a router joins the 594 tree on the new Blue topology but does not stop receiving traffic 595 on the old Blue topology. Once traffic is observed from the new 596 Blue UMH, the router accepts traffic on the new Blue UMH and 597 removes the old Blue UMH. This behavior can happen simultaneously 598 with both Blue and Red forwarding topologies. An advantage is 599 that it works regardless of the type of topology change and 600 existing traffic streams aren't broken. Another advantage is that 601 the complexity is limited and this method is well understood. The 602 disadvantage is that the number of traffic-affecting events 603 depends upon the number of hops to the MCI. 605 Multicast Source Make-Before-Break On a topology change, routers 606 would create new MRMTs using new MRT forwarding state, leaving 607 the old MRMTs as they are.
After the new MRMTs are complete, the 608 multicast source could switch from sending on the old MRMTs to 609 sending on the new MRMTs. After a time, the old MRMTs could be 610 torn down. There are a number of details to still investigate. 612 4.4. Inter-area/level Behavior 614 A source outside of the IGP area/level can be treated as a proxy 615 node. When the join request reaches a border router (whether ABR for 616 OSPF or LBR for ISIS), that border router needs to determine whether 617 to use the Blue or Red forwarding topology in the next selected area/ 618 level. 620 |-------------------| 621 | | 622 |---[S]---| [BR1]-----[ X ] | 623 | | | | | 624 [ A ]-----[ B ] | | | 625 | | [ Y ]-----[BR2]--(proxy for S) 626 | | 627 [BR1]-----[BR2] (b) Area 10 628 Y's Red next-hop: BR1 629 (a) Area 0 Y's Blue next-hop: BR2 630 Red Next-Hops to S 631 BR1's is BR2 632 BR2's is B 633 B's is S 635 Blue Next-Hops to S 636 BR1's is A 637 BR2's is BR1 638 A's is S 640 Figure 3: Inter-area Selection - next-hops towards S 642 Achieving maximally node-disjoint trees across multiple areas is hard 643 due to the information-hiding and abstraction. If there is only one 644 border router, it is trivial but protection of the border router is 645 not possible. With exactly 2 border routers, inter-area/level node 646 protection is reasonably straightforward but can require that the BR 647 rewrite the (S,G) for PIM. With more than 2 border routers, inter- 648 area node protection is possible at the cost of additional bandwidth 649 and border router complexity. These two solutions are described in 650 the following sub-sections. 652 4.4.1. Inter-area Node Protection with 2 border routers 654 If there are exactly two border routers between the areas, then the 655 solution and necessary computation is straightforward. In that 656 specific case, each BR knows that only the other BR must specifically 657 be avoided in the second area when a forwarding topology is selected. 
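The two-border-router case reduces to a simple check: the BR picks the MT color whose path toward the source avoids the other BR. The following is a minimal sketch of that decision using the Figure 3 paths; the function name and path-list representation are illustrative only and are not part of any specification.

```python
# Hypothetical sketch: a border router choosing between the Blue and
# Red MRT forwarding topologies so that the path toward the source
# avoids the other border router (per Section 4.4.1).

def select_mt_color(paths_by_color, avoid_node):
    """Return the MT color whose path avoids 'avoid_node', or None.

    paths_by_color: dict mapping color -> ordered list of node names
    on that colored forwarding topology toward the source.
    """
    for color, path in paths_by_color.items():
        if avoid_node not in path:
            return color
    return None  # avoid_node is a cut vertex: no avoiding path exists

# Paths from Figure 3 (Area 0), toward source S:
paths = {
    "red":  ["BR1", "BR2", "B", "S"],   # BR1's Red path traverses BR2
    "blue": ["BR1", "A", "S"],          # BR1's Blue path avoids BR2
}

# BR1 received a Red join from Area 10 and must avoid BR2 in Area 0:
print(select_mt_color(paths, "BR2"))    # -> blue
```

This matches the text's example: BR1 modifies a Red join from Area 10 to use the Blue MT-ID in Area 0.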
658 As described in [I-D.enyedi-rtgwg-mrt-frr-algorithm], it is possible 659 for a node X to determine whether the Red or Blue forwarding topology 660 should be used to reach a node D while avoiding another node Y. 662 The results of this computation and the resulting changes in MT-ID 663 from Red to Blue or Blue to Red are illustrated in Figure 3. It 664 shows an example where BR1 must modify joins received from Area 10 665 for the Red MT-ID to use the Blue MT-ID in Area 0. Similarly, BR2 666 must modify joins received from Area 10 for the Blue MT-ID to use the 667 Red MT-ID in Area 0. 669 For mLDP, modifying the MT-ID in the control-plane is all that is 670 needed. For PIM, if the same (S,G) is used for both the Blue MT-ID 671 and the Red MT-ID, then only control-plane changes are needed. 672 However, for PIM, if different group IDs (e.g. G-red and G-blue) or 673 different source loopback addresses (S-red and S-blue) are used, it 674 is necessary to modify the traffic to reflect the MT-ID included in 675 the join message received on that interface. An alternative could be 676 to use an MPLS label that indicates the MT-ID instead of different 677 group IDs or source loopback addresses. 679 To summarize the necessary logic, when BR1 receives a join from a 680 neighbor in area N for a destination D in area M on the Color MT-ID, 681 BR1: 683 a. Identifies BR2, the other border router attached to the proxy 684 node in area N. 685 b. Determines which forwarding topology avoids BR2 on the path to D 686 in area M; refer to that as the Color-2 MT-ID. 688 c. Uses the Color-2 MT-ID to determine the next-hops to D. When a 689 join is sent upstream, the MT-ID used is that for Color-2. 691 4.4.2. Inter-area Node Protection with > 2 Border Routers 693 If there are more than two BRs between areas, then the problem of 694 ensuring inter-area node-disjointness is not solved.
Instead, once a 695 request to join the multicast tree has been received by a BR from an 696 area that isn't closest to the multicast source, the BR must join 697 both the Red MT-ID and the Blue MT-ID in the area closest to the 698 multicast source. Regardless of what single link or node failure 699 happens, each BR will receive the multicast stream. Then, the BR can 700 use the stream-selection techniques specified in [I-D.karan-mofrr] to 701 pick either the Blue or Red stream and forward it to downstream 702 routers in the other area. Each of the BRs for the other area should 703 be attached to a proxy-node representing the other area. 705 This approach ensures that a BR will receive the multicast stream in 706 the closest area as long as the single link or node failure isn't a 707 single point of failure. Thus, each area or level is independently 708 protected. The BR is required to be able to select among the 709 multicast streams and, if necessary for PIM, translate the traffic to 710 contain the correct (S,G) for forwarding. 712 4.5. PIM 714 Capabilities need to be exchanged to determine that a neighbor 715 supports using MRT forwarding topologies with PIM. Additional 716 signaling extensions to PIM are not necessary to support Global 717 Protection. [RFC6420] already defines how to specify an MT-ID as a 718 Join Attribute. 720 4.5.1. Traffic Handling: RPF Checks 722 For PIM, RPF checks would still be enabled by the control plane. The 723 control plane can program different forwarding entries on the G-blue 724 incoming interface and on the G-red incoming interface. The other 725 interfaces would still discard both G-blue and G-red traffic. 727 The receiver would still need to detect failures and handle traffic 728 discarding as specified in [I-D.karan-mofrr]. 730 4.6. mLDP 732 Capabilities need to be exchanged to determine that a neighbor 733 supports using MRT forwarding topologies with mLDP.
The basic 734 mechanisms for mLDP to support multi-topology are already described 735 in [I-D.iwijnand-mpls-mldp-multi-topology]. It may be desirable to 736 extend the capability defined in that draft to indicate that MRT is 737 or is not supported. 739 5. Local Repair: Fast-Reroute 741 Local repair for multicast traffic is different from unicast in 742 several important ways. 744 o There is more than a single final destination. The full set of 745 receiving routers may not be known by the PLR and may be extremely 746 large. Therefore, it makes sense to repair to the immediate next- 747 hops for link-repair and the next-next-hops for node-repair. 748 These are the potential merge points (MPs). 750 o If a failure cannot be positively identified as a node-failure, 751 then it is important to repair to the immediate next-hops since 752 they may have receivers attached. 754 o If a failure cannot be positively identified as a link-failure and 755 node protection is desired, then it is important to repair to the 756 next-next-hops since they may not receive traffic from the 757 immediate next-hops. 759 o Updating multicast forwarding state may take significantly longer 760 than updating unicast state, since the multicast state is updated 761 tree by tree based on control-plane signaling. 763 o For tunnel-based IP/LDP approaches, neither the PLR nor the MP may 764 be able to specify which interface the alternate traffic will 765 arrive at the MP on. The simplest reason is that unicast 766 forwarding includes the use of ECMP and the path selection is 767 based upon internal router behavior for all paths between the PLR 768 and the MP. 770 For multicast fast-reroute, there are three different mechanisms that 771 can be used. As long as the necessary signaling is available, these 772 methods can be combined in the same network and even for the same PLR 773 and failure point. 775 PLR-driven Unicast Tunnels: The PLR learns the set of MPs that need 776 protection.
On a failure, the PLR replicates the traffic and 777 tunnels it to each MP using the unicast route. If desired, an 778 RSVP-TE tunnel could be used instead of relying upon unicast 779 routing. 781 MP-driven Unicast Tunnels: Each MP learns the identity of the PLR. 782 Before failure, each MP independently signals to the PLR the 783 desire for protection and other information to use. On a failure, 784 the PLR replicates the traffic and tunnels it to each MP using the 785 unicast route. If desired, an RSVP-TE tunnel could be used 786 instead of relying upon unicast routing. 788 MP-driven Alternate Trees: Each MP learns the identity of the PLR 789 and the failure point (node and interface) to be protected 790 against. Each MP selects an upstream interface and forwarding 791 topology where the path will avoid the failure point; each MP 792 signals a join towards that upstream interface to create that 793 state. 795 Each of these options is described in more detail in their respective 796 sections. Then the methods are compared and contrasted for PIM and 797 for mLDP. 799 5.1. PLR-driven Unicast Tunnels 801 With PLR-driven unicast tunnels, the PLR learns the set of merge 802 points (MPs) and, on a locally detected failure, uses the existing 803 unicast routing to tunnel the multicast traffic to those merge 804 points. The failure being protected against may be link or node 805 failure. If unicast forwarding can provide an SRLG-protecting 806 alternate, then SRLG-protection is also possible. 808 There are five aspects to making this work. 810 1. PLR needs to learn the MPs and their associated MPLS labels to 811 create protection state. 813 2. Unicast routing has to offer alternates or have dedicated tunnels 814 to reach the MPs. The PLR encapsulates the multicast traffic and 815 directs it to be forwarded via unicast routing. 817 3. The MP must identify alternate traffic and decide when to accept 818 and forward it or drop it. 820 4. 
When the MP reconverges, it must move to its new UMH using make- 821 before-break so that traffic loss is minimized. 823 5. The PLR must know when to stop sending traffic on the alternates. 825 5.1.1. Learning the MPs 827 If link-protection is all that is desired, then the PLR already knows 828 the identities of the MPs. For node-protection, this is not 829 sufficient. In the PLR-driven case, there is no direct communication 830 possible between the PLR and the next-next-hops on the multicast 831 tree. (For mLDP, when targeted LDP sessions are used, this is 832 considered to be MP-driven and is covered in Section 5.2.) 834 In addition to learning the identities of the MPs, the PLR must also 835 learn the MPLS label, if any, associated with each MP. For mLDP, a 836 different label should be supplied for the alternate traffic; this 837 allows the MP to distinguish between the primary and alternate 838 traffic. For PIM, an MPLS label is used to identify that traffic is 839 the alternate. The unicast tunnel used to send traffic to the MP may 840 have penultimate-hop-popping done; thus without an explicit MPLS 841 label, there is no certainty that a packet could be conclusively 842 identified as primary traffic or as alternate traffic. 844 A router must tell its UMH the identity of all downstream multicast 845 routers, and their associated alternate labels, on the particular 846 multicast tree. This clearly requires protocol extensions. The 847 extensions for PIM are given in [I-D.kebler-pim-mrt-protection]. 849 5.1.2. Using Unicast Tunnels and Indirection 851 The PLR must encapsulate the multicast traffic and tunnel it towards 852 each MP. The key point is how that traffic then reaches the MP. 853 There are basically two possibilities. It is possible that a 854 dedicated RSVP-TE tunnel exists and can be used to reach the MP for 855 just this traffic; such an RSVP-TE tunnel would be explicitly routed 856 to avoid the failure point. 
The second possibility is that the 857 packet is tunneled via LDP and uses unicast routing. The second case 858 is explored here. 860 It is necessary to assume that unicast LDP fast-reroute 861 [I-D.ietf-rtgwg-mrt-frr-architecture][RFC5714][RFC5286] is supported 862 by the PLR. Since multicast convergence takes longer than unicast 863 convergence, the PLR may have two different routes to the MP over 864 time. When the failure happens, the PLR will have an alternate, 865 whether LFA or MRT, to reach the MP. Then the unicast routing 866 converges and the PLR will have a new primary route to the MP. Once 867 the routing has converged, it is important that alternate traffic is 868 no longer carried on the MRT forwarding topologies. This rule allows 869 the MRT forwarding topologies to reconverge and be available for the 870 next failure. Therefore, it is also necessary for the tunneled 871 multicast traffic to move from the alternate route to the new primary 872 route when the PLR reconverges. Therefore, the tunneled multicast 873 traffic should use indirection to obtain the unicast routing's 874 current next-hops to the MP. If physical indirection is not 875 feasible, then when the unicast LIB is updated, the associated 876 multicast alternate tunnel state should be as well. 878 When the PLR detects a local failure, the PLR replicates each 879 multicast packet, swaps or adds the alternate MPLS label needed by 880 the MP, and finally pushes the appropriate label for the MP based 881 upon the outgoing interface selected by the unicast routing. 883 For PIM, if no alternate labels are supplied by the MPs, then the 884 multicast traffic could be tunneled in IP. This would require 885 unicast IP fast-reroute. 887 5.1.3. MP Alternate Traffic Handling 889 A potential Merge Point must determine when and if to accept 890 alternate traffic. There are two critical components to this 891 decision. First, the MP must know the state of all links to its UMH. 
892 This allows the MP to determine whether the multicast stream could be 893 received from the UMH. Second, the MP must be able to distinguish 894 between a normal multicast tree packet and an alternate packet. 896 The logic is similar for PIM and mLDP, but in PIM there is only one 897 RPF-interface or interface of interest to the UMH. In mLDP, all the 898 directly connected interfaces to the UMH are of interest. When the 899 MP detects a local failure, if that interface was the last connected 900 to the UMH and used for the multicast group, then the MP must rapidly 901 switch from accepting the normal multicast tree traffic to accepting 902 the alternate traffic. This rapid change must happen within 903 approximately the same 50 milliseconds that the PLR takes to switch 904 to sending traffic on the alternate, and for the same reasons. It 905 does no good for the PLR to send alternate traffic if the MP doesn't 906 accept it when it is needed. 908 The MP can identify alternate traffic based upon the MPLS label. 909 This will be the alternate label that the MP supplied to its UMH for 910 this purpose. 912 5.1.4. Merge Point Reconvergence 914 After a failure, the MP will want to join the multicast tree 915 according to the new topology. It is critical that the MP does this 916 in a way that minimizes the traffic disruption. Whenever paths 917 change, there is also the possibility for a traffic-affecting event 918 due to different latencies. However, traffic impact above that 919 should be avoided. 921 The MP must do make-before-break. Until the MP knows that its new 922 UMH is fully connected to the MCI, the MP should continue to accept 923 its old alternate traffic. The MP could learn that the new UMH is 924 sufficient either via control-plane mechanisms or in a data-driven 925 fashion. In the latter case, the reception of traffic from the new 926 UMH can trigger the change-over.
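A minimal sketch of this make-before-break behavior at the MP follows, with hypothetical event hooks for control-plane confirmation, data arrival from the new UMH, and a backstop timer; none of these hooks are defined by PIM or mLDP, and the names are illustrative.

```python
# Hypothetical sketch of the merge point's make-before-break logic
# (Section 5.1.4): keep accepting alternate traffic until the new
# UMH is known to deliver the stream, via either a control-plane
# confirmation or observed data traffic, with a timeout backstop
# for trees with long quiet periods.
import time

class MergePointState:
    def __init__(self, switch_timeout_s=10.0):
        self.using_new_umh = False
        self.deadline = time.monotonic() + switch_timeout_s

    def _switch(self):
        # Move to the new UMH and stop accepting alternate traffic.
        self.using_new_umh = True

    def on_control_plane_confirmation(self):
        self._switch()

    def on_packet_from_new_umh(self):
        # Data-driven trigger: traffic arriving from the new UMH
        # shows it is fully connected back to the MCI.
        self._switch()

    def on_timer_tick(self):
        # Force the switch if the tree has been quiet too long.
        if time.monotonic() >= self.deadline:
            self._switch()

    def accept_alternate_traffic(self):
        return not self.using_new_umh

mp = MergePointState()
assert mp.accept_alternate_traffic()
mp.on_packet_from_new_umh()
assert not mp.accept_alternate_traffic()
```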
If the data-driven approach is used, a 927 time-out to force the switch should apply to handle multicast trees 928 that have long quiet periods. 930 5.1.5. PLR termination of alternate traffic 932 The PLR sends traffic on the alternates for a configurable time-out. 933 There is no clean way for the next-hop routers and/or next-next-hop 934 routers to indicate that the traffic is no longer needed. 936 If better control were desired, each MP could tell its UMH what the 937 desired time-out is. The UMH could forward this to the PLR as well. 938 Then the PLR could send alternate traffic to different MPs based upon 939 the MP's individual timer. This would only be an advantage if some 940 of the MPs were expected to have a longer multicast reconvergence 941 time than others - either due to load or router capabilities. 943 5.2. MP-driven Unicast Tunnels 945 MP-driven unicast tunnels are only relevant for mLDP where targeted 946 LDP sessions are feasible. For PIM, there is no mechanism to 947 communicate beyond a router's immediate neighbors; these techniques 948 could work for link-protection, but even then there would not be a 949 way of requesting that the PLR should stop sending traffic. 951 There are three differences for MP-driven unicast tunnels from PLR- 952 driven unicast tunnels. 954 1. The MPs learn the identity of the PLR from their UMH. The PLR 955 does not learn the identities of the MPs. 957 2. The MPs create direct connections to the PLR and communicate 958 their alternate labels. 960 3. When the MPs have converged, each explicitly tells the PLR to 961 stop sending alternate traffic. 963 The first means that a router communicates its UMH to all its 964 downstream multicast hops. Then each MP communicates to the PLR(s) 965 (1 for link-protection and 1 for node-protection) and indicates the 966 multicast tree that protection is desired for and the associated 967 alternate label. 
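The PLR-side bookkeeping for MP-driven protection can be sketched as follows. The class and method names are hypothetical, standing in for state the PLR would build from signaling received over targeted LDP sessions.

```python
# Hypothetical sketch of the PLR side of MP-driven unicast tunnels
# (Section 5.2): each MP registers for protection with an alternate
# label, and explicitly withdraws when it no longer needs alternate
# traffic. No names here come from any real protocol API.

class PlrProtectionTable:
    def __init__(self):
        # (multicast_tree, mp_id) -> alternate MPLS label
        self.protected = {}

    def register(self, tree, mp_id, alt_label):
        # MP signals desire for protection over a targeted session.
        self.protected[(tree, mp_id)] = alt_label

    def withdraw(self, tree, mp_id):
        # MP has reconverged: stop sending it alternate traffic.
        self.protected.pop((tree, mp_id), None)

    def replication_targets(self, tree):
        # On a local failure, replicate to each registered MP,
        # tunneling with its alternate label via unicast routing.
        return {mp: lbl for (t, mp), lbl in self.protected.items()
                if t == tree}

plr = PlrProtectionTable()
plr.register("mldp-tree-1", "MP1", 3001)
plr.register("mldp-tree-1", "MP2", 3002)
plr.withdraw("mldp-tree-1", "MP1")
print(plr.replication_targets("mldp-tree-1"))  # {'MP2': 3002}
```

The explicit withdraw is what distinguishes this from the PLR-driven case, where the PLR instead stops on a configurable timer.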
969 When the PLR learns about a new MP, it adds that MP and associated 970 information to the set of MPs to be protected. On a failure, the PLR 971 behaves the same as for the PLR-driven unicast tunnels. 973 After the failure, the MP reconverges using make-before-break. Then 974 the MP explicitly communicates to the PLR(s) that alternate traffic 975 is no longer needed for that multicast tree. When the node- 976 protecting PLR hasn't changed for an MP, it may be necessary to 977 withdraw the old alternate label, which tells the PLR to stop 978 transmitting alternate traffic, and then provide a new alternate 979 label. 981 5.3. MP-driven Alternate Trees 983 For some networks, it is highly desirable not to have the PLR perform 984 replication to each MP. PLR replication can cause substantial 985 congestion on links used by alternates to different MPs. At the same 986 time, it is also desirable to have minimal extra state created in the 987 network. This can be resolved by creating alternate-trees that can 988 protect multiple multicast groups as a bypass-alternate-tree. An 989 alternate-tree can also be created per multicast group, PLR and 990 failure point. 992 It is not possible to merge alternate-trees for different PLRs or for 993 different neighbors. This is shown in Figure 4 where G can't select 994 an acceptable upstream node on the alternate tree that doesn't 995 violate either the need to avoid C (for PLR A) or D (for PLR B).
997 |-------[S]--------| Alternate from A must avoid C 998 V V Alternate from B must avoid D 999 [A]------[E]-------[B] 1000 | | | 1001 V | V 1002 |--[C]------[F]-------[D]---| 1003 | | | | 1004 | |-------[G]--------| | 1005 | | | 1006 | | | 1007 |->[R1]-----[H]-------[R2]<-| 1009 (a) Multicast tree from S 1010 S->A->C->R1 and S->B->D->R2 1012 Figure 4: Alternate Trees from PLR A and B can't be merged 1014 An MP that joins an alternate-tree for a particular multicast stream 1015 should not expect or request PLR-replicated tunneled alternate 1016 traffic for that same multicast stream. 1018 Each alternate-tree is identified by the PLR which sources the 1019 traffic and the failure point (node and link) (FP) to be avoided. 1020 Different multicast groups with the same PLR and FP may have 1021 different sets of MPs - but at most they will include the FP (for 1022 link protection) and the neighbors of the FP other than the PLR. 1023 For a bypass-alternate-tree to work, it must be acceptable to 1024 temporarily send a multicast group's traffic to FP's neighbors that 1025 do not need it. This is the trade-off required to reduce alternate- 1026 tree state and use bypass-alternate-trees. As discussed in 1027 Section 5.1.3, a potential MP can determine whether to accept 1028 alternate traffic based upon the state of its normal upstream links. 1029 Alternate traffic for a group the MP hasn't joined can just be 1030 discarded. 1032 [S]......[PLR]--[ A ] 1033 | | | 1034 1| |2 | 1035 [ FP]--[MP3] 1036 | \ | 1037 | \ | 1038 [MP1]--[MP2] 1040 Figure 5: Alternate Tree Scenario 1042 For any router, knowing the PLR and the FP to avoid will force 1043 selection of either the Blue MRT or the Red MRT. It is possible that 1044 the FP doesn't actually appear in either MRT path, but the FP will 1045 always be in either the set of nodes that might be used for the Blue 1046 MRT path or the set of nodes that might be used for the Red MRT path.
1047 The FP's membership in one of the sets is a function of the partial 1048 ordering and topological ordering created by the MRT algorithm and is 1049 consistent between routers in the network graph. 1051 To create an alternate-tree, the following must happen: 1053 1. For node-protection, the MP learns from its upstream (the FP) the 1054 node-id of its upstream (the PLR) and, optionally, a link 1055 identifier for the link used to the PLR. The link-id is only 1056 needed for traffic handling in PIM, since mLDP can have targeted 1057 sessions between the MP and the PLR. 1059 2. For link-protection, the MP needs to know the node-id of its 1060 upstream (the PLR) and, optionally, its identifier for the link 1061 used to the PLR. 1063 3. The MP determines whether to use the Blue or Red forwarding 1064 topology to reach the PLR while avoiding the FP and associated 1065 interface. This gives the MP its alternate-tree upstream 1066 interface. 1068 4. The MP signals a backup-join to its alternate-tree upstream 1069 interface. The backup-join specifies the PLR, FP and, for PIM, 1070 the FP-PLR link identifier. If the alternate-tree is not to be 1071 used as a bypass-alternate-tree, then the multicast group (e.g. 1072 (S,G) or Opaque-Value) must be specified. 1074 5. A router that receives a backup-join and is not the PLR needs to 1075 create multicast state and send a backup-join towards the PLR on 1076 the appropriate Blue or Red forwarding topology as is locally 1077 determined to avoid the FP and FP-PLR link. 1079 6. Backup-joins for the same (PLR, FP, PLR-FP link-id) that 1080 reference the same multicast group can be merged into a single 1081 alternate-tree. Similarly, backup-joins for the same (PLR, FP, 1082 PLR-FP link-id) that reference no multicast group can be merged 1083 into a single alternate-tree. 1085 7. 
When the PLR receives the backup-join, it associates either the 1086 specified multicast group with that alternate-tree, if such is 1087 given, or associates all multicast groups that go to the FP via 1088 the specified FP-PLR link with the alternate-tree. 1090 As an example, consider Figure 5. FP would send a backup-join to MP3 1091 indicating (PLR, FP, PLR-FP link-1). MP3 sends a backup-join to A. 1092 MP1 sends a backup-join to MP2 and MP2 sends a backup-join to MP3. 1094 It is necessary that traffic on each alternate-tree self-identify as 1095 to which alternate-tree it is part of. This is because an alternate- 1096 tree for a multicast-group and a particular (PLR, FP, PLR-FP link-id) 1097 can easily overlap with an alternate-tree for the same multicast 1098 group and a different (PLR, FP, PLR-FP link-id). The best way of 1099 doing this depends upon whether PIM or mLDP is being used. 1101 5.3.1. PIM details for Alternate-Trees 1103 For PIM, the (S,G) of the IP packet is a globally unique identifier 1104 and is understood. To identify the alternate-tree, the most 1105 straightforward way is to use MPLS labels distributed in the PIM 1106 backup-join messages. An MP can use the incoming label to indicate 1107 the set of RPF-interfaces for which the traffic may be an alternate. 1108 If the alternate-tree isn't a bypass-alternate-tree, then only one 1109 RPF interface is referenced. If the alternate-tree is a bypass- 1110 alternate-tree, then multiple RPF-interfaces (parallel links to FP) 1111 might be intended. Alternate-tree traffic may cross an interface 1112 multiple times - either because the interface is a broadcast 1113 interface and different downstream-assigned labels are provided 1114 and/or because an MP may provide different labels. 1116 5.3.2. mLDP details for Alternate-Trees 1118 For mLDP, if bypass-alternate-trees are used, then the PLR must 1119 provide upstream-assigned labels for each multicast stream.
The MP 1120 provides the label for the alternate-tree; if the alternate-tree is 1121 not a bypass-alternate-tree, this label also describes the multicast 1122 stream. If the alternate-tree is a bypass-alternate-tree, then it 1123 provides the context for the PLR-assigned labels for each multicast 1124 stream. If there are targeted LDP sessions between the PLR and the 1125 MPs, then the PLR could provide the necessary upstream-assigned 1126 labels. 1128 5.3.3. Traffic Handling by PLR 1130 An issue is how long the PLR should continue to send alternate 1131 traffic. With an alternate-tree, the PLR can know to 1132 stop forwarding alternate traffic on the alternate-tree when that 1133 alternate-tree's state is torn down. This provides a clear signal 1134 that alternate traffic is no longer needed. 1136 5.4. Methods Compared for PIM 1138 The two approaches that are feasible for PIM are PLR-driven Unicast 1139 Tunnels and MP-driven Alternate-Trees. 1141 +-------------------------+-------------------+---------------------+ 1142 | Aspect | PLR-driven | MP-driven | 1143 | | Unicast Tunnels | Alternate-Trees | 1144 +-------------------------+-------------------+---------------------+ 1145 | Worst-case Traffic | 1 + number of MPs | 2 | 1146 | Replication Per Link | | | 1147 | PLR alternate-traffic | timer-based | control-plane | 1148 | | | terminated | 1149 | Extra multicast state | none | per (PLR,FP,S) for | 1150 | | | bypass mode | 1151 +-------------------------+-------------------+---------------------+ 1153 Which approach is preferred may be network-dependent. It should also 1154 be possible to use both in the same network. 1156 5.5. Methods Compared for mLDP 1158 All three approaches are feasible for mLDP. Below is a brief 1159 comparison of various aspects of each.
1161 +-------------------+---------------+-------------+-----------------+ 1162 | Aspect | MP-driven | PLR-driven | MP-driven | 1163 | | Unicast | Unicast | Alternate-Trees | 1164 | | Tunnels | Tunnels | | 1165 +-------------------+---------------+-------------+-----------------+ 1166 | Worst-case | 1 + number of | 1 + number | 2 | 1167 | Traffic | MPs | of MPs | | 1168 | Replication Per | | | | 1169 | Link | | | | 1170 | PLR | control-plane | timer-based | control-plane | 1171 | alternate-traffic | terminated | | terminated | 1172 | Extra multicast | none | none | per (PLR,FP,S) | 1173 | state | | | for bypass mode | 1174 +-------------------+---------------+-------------+-----------------+ 1176 6. Acknowledgements 1178 The authors would like to thank Kishore Tiruveedhula, Santosh Esale, 1179 and Maciek Konstantynowicz for their suggestions and review. 1181 7. IANA Considerations 1183 This document includes no request to IANA. 1185 8. Security Considerations 1187 This architecture is not currently believed to introduce new security 1188 concerns. 1190 9. References 1192 9.1. Normative References 1194 [I-D.enyedi-rtgwg-mrt-frr-algorithm] 1195 Atlas, A. and A. Csaszar, "Algorithms for computing 1196 Maximally Redundant Trees for IP/LDP Fast- Reroute", 1197 draft-enyedi-rtgwg-mrt-frr-algorithm-00 (work in 1198 progress), October 2011. 1200 [I-D.ietf-rtgwg-mrt-frr-architecture] 1201 Atlas, A., Kebler, R., Konstantynowicz, M., Csaszar, A., 1202 White, R., and M. Shand, "An Architecture for IP/LDP Fast- 1203 Reroute Using Maximally Redundant Trees", 1204 draft-ietf-rtgwg-mrt-frr-architecture-00 (work in 1205 progress), January 2012. 1207 [RFC4601] Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas, 1208 "Protocol Independent Multicast - Sparse Mode (PIM-SM): 1209 Protocol Specification (Revised)", RFC 4601, August 2006. 1211 [RFC6388] Wijnands, IJ., Minei, I., Kompella, K., and B.
Thomas, 1212 "Label Distribution Protocol Extensions for Point-to- 1213 Multipoint and Multipoint-to-Multipoint Label Switched 1214 Paths", RFC 6388, November 2011. 1216 [RFC6420] Cai, Y. and H. Ou, "PIM Multi-Topology ID (MT-ID) Join 1217 Attribute", RFC 6420, November 2011. 1219 9.2. Informative References 1221 [I-D.iwijnand-mpls-mldp-multi-topology] 1222 Wijnands, I. and K. Raza, "mLDP Extensions for Multi 1223 Topology Routing", 1224 draft-iwijnand-mpls-mldp-multi-topology-01 (work in 1225 progress), January 2012. 1227 [I-D.karan-mofrr] 1228 Karan, A., Filsfils, C., Farinacci, D., Decraene, B., 1229 Leymann, N., and T. Telkamp, "Multicast only Fast Re- 1230 Route", draft-karan-mofrr-01 (work in progress), 1231 March 2011. 1233 [I-D.kebler-pim-mrt-protection] 1234 Kebler, R., Atlas, A., Wijnands, IJ., and G. Enyedi, "PIM 1235 Extensions for Protection Using Maximally Redundant 1236 Trees", draft-kebler-pim-mrt-protection-00 (work in 1237 progress), March 2012. 1239 [I-D.wijnands-mpls-mldp-node-protection] 1240 Wijnands, I., Rosen, E., Raza, K., Tantsura, J., and A. 1241 Atlas, "mLDP Node Protection", 1242 draft-wijnands-mpls-mldp-node-protection-00 (work in 1243 progress), February 2012. 1245 [RFC5286] Atlas, A. and A. Zinin, "Basic Specification for IP Fast 1246 Reroute: Loop-Free Alternates", RFC 5286, September 2008. 1248 [RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework", 1249 RFC 5714, January 2010. 1251 Authors' Addresses 1253 Alia Atlas (editor) 1254 Juniper Networks 1255 10 Technology Park Drive 1256 Westford, MA 01886 1257 USA 1259 Email: akatlas@juniper.net 1261 Robert Kebler 1262 Juniper Networks 1263 10 Technology Park Drive 1264 Westford, MA 01886 1265 USA 1267 Email: rkebler@juniper.net 1268 IJsbrand Wijnands 1269 Cisco Systems, Inc. 
1271 Email: ice@cisco.com 1273 Andras Csaszar 1274 Ericsson 1275 Konyves Kalman krt 11 1276 Budapest 1097 1277 Hungary 1279 Email: Andras.Csaszar@ericsson.com 1281 Gabor Sandor Enyedi 1282 Ericsson 1283 Konyves Kalman krt 11. 1284 Budapest 1097 1285 Hungary 1287 Email: Gabor.Sandor.Enyedi@ericsson.com