idnits 2.17.1 

draft-ietf-bess-mvpn-fast-failover-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (February 1, 2020) is 1545 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

     No issues found here.

     Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Network Working Group                                      T. Morin, Ed.
3	Internet-Draft                                                    Orange
4	Intended status: Standards Track                          R. Kebler, Ed.
5	Expires: August 4, 2020                                 Juniper Networks
6	                                                          G. Mirsky, Ed.
7	                                                               ZTE Corp.
8	                                                        February 1, 2020

10	                  Multicast VPN fast upstream failover
11	                 draft-ietf-bess-mvpn-fast-failover-09

13	Abstract

15	   This document defines multicast VPN extensions and procedures that
16	   allow fast failover for upstream failures, by allowing downstream PEs
17	   to take into account the status of Provider-Tunnels (P-tunnels) when
18	   selecting the upstream PE for a VPN multicast flow, and extending BGP
19	   MVPN routing so that a C-multicast route can be advertised toward a
20	   standby upstream PE.

22	Requirements Language

24	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
25	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
26	   "OPTIONAL" in this document are to be interpreted as described in BCP
27	   14 [RFC2119] [RFC8174] when, and only when, they appear in all
28	   capitals, as shown here.

30	Status of This Memo

32	   This Internet-Draft is submitted in full conformance with the
33	   provisions of BCP 78 and BCP 79.

35	   Internet-Drafts are working documents of the Internet Engineering
36	   Task Force (IETF).  Note that other groups may also distribute
37	   working documents as Internet-Drafts.  The list of current Internet-
38	   Drafts is at https://datatracker.ietf.org/drafts/current/.

40	   Internet-Drafts are draft documents valid for a maximum of six months
41	   and may be updated, replaced, or obsoleted by other documents at any
42	   time.  It is inappropriate to use Internet-Drafts as reference
43	   material or to cite them other than as "work in progress."

45	   This Internet-Draft will expire on August 4, 2020.

47	Copyright Notice

49	   Copyright (c) 2020 IETF Trust and the persons identified as the
50	   document authors.  All rights reserved.

52	   This document is subject to BCP 78 and the IETF Trust's Legal
53	   Provisions Relating to IETF Documents
54	   (https://trustee.ietf.org/license-info) in effect on the date of
55	   publication of this document.  Please review these documents
56	   carefully, as they describe your rights and restrictions with respect
57	   to this document.  Code Components extracted from this document must
58	   include Simplified BSD License text as described in Section 4.e of
59	   the Trust Legal Provisions and are provided without warranty as
60	   described in the Simplified BSD License.

62	Table of Contents

64	   1.  Introduction
65	   2.  Terminology
66	   3.  UMH Selection based on tunnel status
67	     3.1.  Determining the status of a tunnel
68	       3.1.1.  mVPN tunnel root tracking
69	       3.1.2.  PE-P Upstream link status
70	       3.1.3.  P2MP RSVP-TE tunnels
71	       3.1.4.  Leaf-initiated P-tunnels
72	       3.1.5.  (C-S, C-G) counter information
73	       3.1.6.  BFD Discriminator Attribute
74	       3.1.7.  Per PE-CE link BFD Discriminator
75	   4.  Standby C-multicast route
76	     4.1.  Downstream PE behavior
77	     4.2.  Upstream PE behavior
78	     4.3.  Reachability determination
79	     4.4.  Inter-AS
80	       4.4.1.  Inter-AS procedures for downstream PEs, ASBR fast
81	               failover
82	       4.4.2.  Inter-AS procedures for ASBRs
83	   5.  Hot Root Standby
84	   6.  Duplicate packets
85	   7.  IANA Considerations
86	     7.1.  BFD Discriminator
87	     7.2.  BFD Discriminator Extension Type
88	   8.  Security Considerations
89	   9.  Acknowledgments
90	   10. Contributor Addresses
91	   11. References
92	     11.1.  Normative References
93	     11.2.  Informative References
94	   Authors' Addresses

96	1.  Introduction

98	   In the context of multicast in BGP/MPLS VPNs, it is desirable to
99	   provide mechanisms allowing fast recovery of connectivity after
100	   different types of failures.  This document addresses failures of
101	   elements in the provider network that are upstream of PEs connected
102	   to VPN sites with receivers.

104	   Section 3 describes local procedures allowing an egress PE (a PE
105	   connected to a receiver site) to take into account the status of
106	   P-tunnels to determine the Upstream Multicast Hop (UMH) for a given
107	   (C-S, C-G).  This method does not provide a "fast failover" solution
108	   when used alone, but can be used with the following sections for a
109	   "fast failover" solution.

111	   Section 4 describes protocol extensions that can speed up failover by
112	   not requiring any multicast VPN routing message exchange at recovery
113	   time.

115	   Moreover, Section 5 describes a "hot leaf standby" mechanism that
116	   uses a combination of these two mechanisms.  This approach has
117	   similarities with the solution described in [RFC7431] to improve
118	   failover times when PIM routing is used in a network given some
119	   topology and metric constraints.

121	2.  Terminology

123	   The terminology used in this document is the terminology defined in
124	   [RFC6513] and [RFC6514].

126	   x-PMSI: I-PMSI or S-PMSI

128	3.  UMH Selection based on tunnel status

130	   Current multicast VPN specifications [RFC6513], section 5.1, describe
131	   the procedures used by a multicast VPN downstream PE to determine
132	   what the upstream multicast hop (UMH) is for a given (C-S, C-G).

134	   The procedure described here is an OPTIONAL procedure that consists
135	   of having a downstream PE take into account the status of P-tunnels
136	   rooted at each possible upstream PE.  Because PEs could arrive at
137	   different conclusions regarding the state of a tunnel, the procedures
138	   described in Section 9.1.1 of [RFC6513] MUST be used when using
139	   inclusive tunnels.

141	   For a given downstream PE and a given VRF, the P-tunnel corresponding
142	   to a given upstream PE for a given (C-S, C-G) state is the S-PMSI
143	   tunnel advertised by that upstream PE for this (C-S, C-G) and
144	   imported into that VRF, or if there isn't any such S-PMSI, the I-PMSI
145	   tunnel advertised by that PE and imported into that VRF.

147	   There are three options specified in Section 5.1 of [RFC6513] for a
148	   downstream PE to select an Upstream PE.

150	   o  The first two options select the Upstream PE from a candidate PE
151	      set either based on IP address or a hashing algorithm.  When used
152	      together with the optional procedure of considering the P-tunnel
153	      status as in this document, a candidate upstream PE is included in
154	      the set if it either:

156	      A.  advertises a PMSI bound to a tunnel, where the specified
157	          tunnel is either up or not known to be down, or

159	      B.  does not advertise any x-PMSI applicable to the given
160	          (C-S, C-G) but has associated a VRF Route Import BGP attribute
161	          with the unicast VPN route for S (this is necessary to avoid
162	          incorrectly invalidating a UMH PE that would use a policy
163	          where no I-PMSI is advertised for a given VRF and where only
164	          S-PMSI are used, the S-PMSI advertisement being possibly done
165	          only after the upstream PE receives a C-multicast route for
166	          (C-S, C-G)/(C-*, C-G) to be carried over the advertised
167	          S-PMSI).

169	      If the resulting candidate set is empty, then the procedure is
170	      repeated without considering the P-tunnel status.

172	   o  The third option uses the installed UMH Route (i.e., the "best"
173	      route towards the C-root) as the Selected UMH Route, and its
174	      originating PE is the selected Upstream PE.  With the optional
175	      procedure of considering P-tunnel status as in this document, the
176	      Selected UMH Route is the best one among those whose originating
177	      PE's P-tunnel is not "down".  If that does not exist, the
178	      installed UMH Route is selected regardless of the P-tunnel status.
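
The selection rule for the first two options can be sketched in Python. This is a minimal illustration, not part of the specification: the `Candidate` and `TunnelStatus` types are hypothetical, and the tie-breaker (numerically highest PE address, all candidates in one address family) is an assumed reading of the address-based option in Section 5.1 of [RFC6513].

```python
from dataclasses import dataclass
from enum import Enum
from ipaddress import ip_address
from typing import Iterable, Optional


class TunnelStatus(Enum):
    UP = "up"
    DOWN = "down"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Candidate:
    pe_address: str                        # originator of the UMH route
    tunnel_status: Optional[TunnelStatus]  # None: no applicable x-PMSI advertised
    has_vrf_route_import: bool = True      # VRF Route Import on the route for S


def is_eligible(c: Candidate) -> bool:
    """Apply conditions A and B from the list above."""
    if c.tunnel_status is None:
        return c.has_vrf_route_import                # condition B
    return c.tunnel_status is not TunnelStatus.DOWN  # condition A


def select_upstream_pe(candidates: Iterable[Candidate]) -> Optional[Candidate]:
    pool = list(candidates)
    eligible = [c for c in pool if is_eligible(c)]
    # Empty filtered set: repeat the selection ignoring P-tunnel status.
    chosen_from = eligible or pool
    if not chosen_from:
        return None
    # Assumed tie-breaker: numerically highest PE address.
    return max(chosen_from, key=lambda c: ip_address(c.pe_address))
```

When every candidate's tunnel is down, the function falls back to the unfiltered set, which mirrors the "procedure is repeated without considering the P-tunnel status" rule.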

180	3.1.  Determining the status of a tunnel

182	   Different factors can be considered to determine the "status" of a
183	   P-tunnel and are described in the following sub-sections.  The
184	   optional procedures proposed in this section do not require all
185	   downstream PEs to apply the same rules to define what the status
186	   of a P-tunnel is (please see Section 6), and some of them will
187	   produce a result that may be different for different downstream PEs.
188	   Thus, what is called the "status" of a P-tunnel in this section is
189	   not a characteristic of the tunnel in itself, but is the status of
190	   the tunnel, *as seen from a particular downstream PE*.  Additionally,
191	   some of the following methods determine the ability of a downstream
192	   PE to receive traffic on the P-tunnel rather than the status of the
193	   P-tunnel itself.  That could be referred to as "P-tunnel
194	   reception status", but for simplicity, we will use the terminology of
195	   P-tunnel "status" for all of these methods.

197	   Depending on the criteria used to determine the status of a P-tunnel,
198	   there may be an interaction with another resiliency mechanism used
199	   for the P-tunnel itself, and the UMH update may happen immediately or
200	   may need to be delayed.  Each particular case is covered in each
201	   separate sub-section below.

203	3.1.1.  mVPN tunnel root tracking

205	   A condition to consider that the status of a P-tunnel is up is that
206	   the root of the tunnel, as determined in the PMSI tunnel attribute,
207	   is reachable through unicast routing tables.  In this case, the
208	   downstream PE can immediately update its UMH when the reachability
209	   condition changes.

211	   That is similar to BGP next-hop tracking for VPN routes, except that
212	   the address considered is not the BGP next-hop address, but the root
213	   address in the PMSI tunnel attribute.

215	   If BGP next-hop tracking is done for VPN routes and the root address
216	   of a given tunnel happens to be the same as the next-hop address in
217	   the BGP auto-discovery route advertising the tunnel, then using this
218	   mechanism for the tunnel will not bring any specific benefit.
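
As a rough illustration of root tracking, the following Python sketch checks whether a tunnel root is covered by any prefix in a unicast routing table. The function names and the flat list of prefixes are assumptions made for the example; a real implementation would consult the router's RIB and would likely ignore default routes.

```python
from ipaddress import ip_address, ip_network
from typing import Iterable


def root_is_reachable(root: str, unicast_prefixes: Iterable[str]) -> bool:
    """True if the root address from the PMSI tunnel attribute is covered
    by at least one unicast prefix (hypothetical flat routing table)."""
    addr = ip_address(root)
    return any(addr in ip_network(prefix) for prefix in unicast_prefixes)


def tracking_adds_value(root: str, bgp_next_hop: str) -> bool:
    """Where BGP next-hop tracking is already done, root tracking only
    helps when the root differs from the A-D route's next hop."""
    return ip_address(root) != ip_address(bgp_next_hop)
```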

220	3.1.2.  PE-P Upstream link status

222	   A condition to consider a tunnel status as Up can be that the last-
223	   hop link of the P-tunnel is up.

225	   Using this method when a fast restoration mechanism (such as MPLS FRR
226	   [RFC4090]) is in place for the link requires careful consideration
227	   and coordination of defect detection intervals for the link and the
228	   tunnel.  In many cases, it is not practical to use both methods at
229	   the same time.

231	3.1.3.  P2MP RSVP-TE tunnels

233	   For P-tunnels of type P2MP MPLS-TE, the status of the P-tunnel is
234	   considered up if the sub-LSP to this downstream PE is in Up state.
235	   The determination of whether a P2MP RSVP-TE LSP is in Up state
236	   requires Path and Resv state for the LSP and is based on procedures
237	   specified in [RFC4875].  As a result, the downstream PE can
238	   immediately update its UMH when the reachability condition changes.

240	   When signaling state for a P2MP TE LSP is removed (e.g., if the
241	   ingress of the P2MP TE LSP sends a PathTear message) or the P2MP TE
242	   LSP changes state from Up to Down as determined by procedures in
243	   [RFC4875], the status of the corresponding P-tunnel SHOULD be re-
244	   evaluated.  If the P-tunnel transitions from Up to Down state, the
245	   upstream PE that is the ingress of the P-tunnel SHOULD NOT be
246	   considered a valid UMH.

248	3.1.4.  Leaf-initiated P-tunnels

250	   An upstream PE SHOULD be removed from the UMH candidate list for a
251	   given (C-S, C-G) if the P-tunnel (I-PMSI or S-PMSI) for this
252	   (C-S, C-G) is leaf-triggered (PIM, mLDP), but for some reason,
253	   internal to the protocol, the upstream one-hop branch of the tunnel
254	   from P to PE cannot be built.  As a result, the downstream PE can
255	   immediately update its UMH when the reachability condition changes.

257	3.1.5.  (C-S, C-G) counter information

259	   In cases where the downstream node can be configured so that the
260	   maximum inter-packet time is known for all the multicast flows mapped
261	   on a P-tunnel, the local per-(C-S, C-G) traffic counter information
262	   for traffic received on this P-tunnel can be used to determine the
263	   status of the P-tunnel.

265	   When such a procedure is used, in the context where fast restoration
266	   mechanisms are used for the P-tunnels, a configurable timer MUST be
267	   set on the downstream PE to wait before updating the UMH, to let
268	   the P-tunnel restoration mechanism complete.  It is RECOMMENDED to
269	   provide a reasonable default value for this timer.  An implementation
270	   SHOULD use three seconds as the default value for this timer.

272	   This method can be applicable, for instance, when a (C-S, C-G) flow
273	   is mapped on an S-PMSI.

275	   In cases where this mechanism is used in conjunction with the method
276	   described in Section 5, no prior knowledge of the rate of the
277	   multicast streams is required; downstream PEs can compare reception
278	   on the two P-tunnels to determine when one of them is down.
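
The counter-based check and its hold timer could look roughly like the Python sketch below. The `FlowMonitor` class and its method names are hypothetical; the three-second default comes from the text above, and timestamps are passed in explicitly so the logic is easy to test.

```python
from dataclasses import dataclass

DEFAULT_HOLD_SECONDS = 3.0  # default value suggested in the text above


@dataclass
class FlowMonitor:
    """Tracks one (C-S, C-G) counter on one P-tunnel (illustrative only)."""
    max_inter_packet_time: float             # configured per flow, in seconds
    hold_time: float = DEFAULT_HOLD_SECONDS  # wait for P-tunnel restoration
    last_count: int = 0
    last_progress: float = 0.0               # time the counter last advanced

    def observe(self, packet_count: int, now: float) -> None:
        # Record the time only when the counter has actually advanced.
        if packet_count != self.last_count:
            self.last_count = packet_count
            self.last_progress = now

    def tunnel_suspect(self, now: float) -> bool:
        # Silence longer than the known maximum inter-packet time.
        return now - self.last_progress > self.max_inter_packet_time

    def should_update_umh(self, now: float) -> bool:
        # Update the UMH only after the hold timer has also elapsed.
        silent_for = now - self.last_progress
        return silent_for > self.max_inter_packet_time + self.hold_time
```

Separating `tunnel_suspect` from `should_update_umh` keeps the detection step distinct from the delayed reaction that fast restoration mechanisms require.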

280	3.1.6.  BFD Discriminator Attribute

282	   P-tunnel status MAY be derived from the status of a multipoint BFD
283	   session [RFC8562] whose discriminator is advertised along with an
284	   x-PMSI A-D route.

286	   This document defines the format and ways of using a new BGP
287	   attribute called the "BFD Discriminator".  It is an optional
288	   transitive BGP attribute.  The format of this attribute is defined as
289	   follows:

291	       0                   1                   2                   3
292	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
293	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
294	      |    BFD Mode   |                  Reserved                     |
295	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
296	      |                       BFD Discriminator                       |
297	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
298	      ~                         Optional TLVs                         ~
299	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

301	                 Format of the BFD Discriminator Attribute

303	   Where:

305	      BFD Mode is a one-octet field.  This specification defines the
306	      P2MP value (TBA3); see Section 7.1.

308	      Reserved field is three octets long, and the value MUST be zeroed
309	      on transmission and ignored on receipt.

311	      BFD Discriminator is a four-octet field.

313	      Optional TLVs is an optional variable-length field that MAY be
314	      used in the BFD Discriminator attribute for future extensions.
315	      TLVs MAY be included in a sequential or nested manner.  Each TLV
316	      consists of:

318	      *  a one-octet Type field (Section 7.2)

320	      *  a one-octet field with the length of the Value field in octets

322	      *  a variable-length Value field.

324	      The length of a TLV MUST be aligned on a four-octet boundary.

326	   The BFD Discriminator attribute SHALL be considered malformed if its
327	   length is not a non-zero multiple of four.  If malformed, the UPDATE
328	   message SHALL be handled using the approach of "treat-as-withdraw"
329	   per [RFC7606].
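
The attribute layout and the malformed-length rule can be illustrated with Python's `struct` module. This is a sketch under stated assumptions: `P2MP_MODE = 1` is a placeholder because the real value is still "TBA3", the function names are invented, and rejecting values shorter than the eight fixed octets is an extra check not spelled out in the text.

```python
import struct

P2MP_MODE = 1  # placeholder only: the actual code point is "TBA3"
FIXED_PART = "!B3xI"  # BFD Mode, 3 reserved octets, BFD Discriminator


class MalformedAttribute(ValueError):
    """Caller would apply treat-as-withdraw per [RFC7606]."""


def encode_bfd_discriminator(mode: int, discriminator: int, tlvs: bytes = b"") -> bytes:
    # "3x" emits three zero octets, so Reserved is zeroed on transmission.
    value = struct.pack(FIXED_PART, mode, discriminator) + tlvs
    if len(value) % 4 != 0:
        raise MalformedAttribute("TLVs must keep four-octet alignment")
    return value


def decode_bfd_discriminator(value: bytes):
    # Rule from the text: length must be a non-zero multiple of four.
    if len(value) == 0 or len(value) % 4 != 0:
        raise MalformedAttribute("length is not a non-zero multiple of four")
    # Assumption: the fixed part (8 octets) must be fully present.
    if len(value) < struct.calcsize(FIXED_PART):
        raise MalformedAttribute("fixed part truncated")
    # "3x" skips the Reserved octets, so they are ignored on receipt.
    mode, discriminator = struct.unpack_from(FIXED_PART, value)
    return mode, discriminator, value[8:]
```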

331	3.1.6.1.  Upstream PE Procedures

333	   When it is desired to track the P-tunnel status using a p2mp BFD
334	   session, the Upstream PE:

336	   o  MUST initiate a BFD session and set bfd.SessionType =
337	      MultipointHead as described in [RFC8562];

339	   o  when transmitting BFD Control packets, MUST use as the destination
340	      IP address one of the internal loopback addresses from the 127/8
341	      range for IPv4 or one of the IPv4-mapped IPv6 loopback addresses
342	      from the ::ffff:127.0.0.0/104 range for IPv6;

344	   o  MUST use the IP address of the Upstream PE as source IP address
345	      when transmitting BFD control packets;

347	   o  MUST include the BFD Discriminator attribute in the x-PMSI A-D
348	      Route with the value set to My Discriminator value;

350	   o  MUST periodically transmit BFD control packets over the x-PMSI
351	      tunnel.
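
The destination-address rule in the list above can be checked with Python's standard `ipaddress` module. The function name is an assumption made for this sketch; the two ranges are those given in the text.

```python
from ipaddress import ip_address, ip_network

IPV4_LOOPBACK = ip_network("127.0.0.0/8")
IPV6_MAPPED_LOOPBACK = ip_network("::ffff:127.0.0.0/104")


def valid_bfd_destination(address: str) -> bool:
    """True if the address is an acceptable destination for p2mp BFD
    control packets sent over the x-PMSI tunnel."""
    addr = ip_address(address)
    allowed = IPV4_LOOPBACK if addr.version == 4 else IPV6_MAPPED_LOOPBACK
    return addr in allowed
```

Note that the IPv6 loopback `::1` is outside the IPv4-mapped range and is therefore rejected by this check.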

353	   If the tracking of the P-tunnel by using a p2mp BFD session is
354	   enabled after the x-PMSI A-D route has been already advertised, the
355	   x-PMSI A-D Route MUST be re-sent with precisely the same attributes
356	   as before and the BFD Discriminator attribute included.

358	   If the x-PMSI A-D route is advertised with P-tunnel status tracked
359	   using the p2mp BFD session and it is desired to stop tracking
360	   P-tunnel status using BFD, then:

362	   o  x-PMSI A-D Route MUST be re-sent with precisely the same
363	      attributes as before, but the BFD Discriminator attribute MUST be
364	      excluded;

366	   o  the p2mp BFD session SHOULD be deleted.

368	3.1.6.2.  Downstream PE Procedures

370	   Upon receiving the BFD Discriminator attribute in the x-PMSI A-D
371	   Route, the Downstream PE:

373	   o  MUST associate the received BFD discriminator value with the
374	      P-tunnel originating from the Root PE and the IP address of the
375	      Upstream PE;

377	   o  MUST create p2mp BFD session and set bfd.SessionType =
378	      MultipointTail as described in [RFC8562];

380	   o  MUST use the source IP address of the BFD control packet, the
381	      value of the BFD Discriminator field, and the identifier of the
382	      x-PMSI tunnel on which the BFD control packet was received to
383	      properly demultiplex BFD sessions.

385	   After the state of the p2mp BFD session is up, i.e., bfd.SessionState
386	   == Up, the session state will then be used to track the health of the
387	   P-tunnel.

389	   According to [RFC8562], if the Downstream PE receives Down or
390	   AdminDown in the State field of the BFD control packet or the
391	   Detection Timer associated with the BFD session expires, the BFD
392	   session is down, i.e., bfd.SessionState == Down.  When the BFD
393	   session state is Down, then the P-tunnel associated with the BFD
394	   session MUST be declared down.  As a result, the Downstream PE MAY
395	   initiate a switchover of the traffic from the Primary Upstream PE to
396	   the Standby Upstream PE only if the Standby Upstream PE is deemed
397	   available.  A different p2mp BFD session MAY be used to monitor the
398	   state of the P-tunnel from the Standby Upstream PE.
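
A simplified Python model of the Downstream PE side is sketched below. It is not an implementation of [RFC8562]: the class and method names are hypothetical, states are plain strings, and the choice to declare the tunnel down only after the session has been Up at least once is one reading of the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Demultiplexing key: (source IP, BFD discriminator, x-PMSI tunnel identifier)
SessionKey = Tuple[str, int, str]


@dataclass
class TailSession:
    state: str = "Down"        # simplified bfd.SessionState
    has_been_up: bool = False  # assumption: track health only after first Up


@dataclass
class DownstreamPE:
    sessions: Dict[SessionKey, TailSession] = field(default_factory=dict)

    def learn_discriminator(self, upstream_ip: str, discriminator: int,
                            tunnel_id: str) -> SessionKey:
        """Called when an x-PMSI A-D route carries the attribute."""
        key = (upstream_ip, discriminator, tunnel_id)
        self.sessions.setdefault(key, TailSession())
        return key

    def on_control_packet(self, source_ip: str, my_discriminator: int,
                          tunnel_id: str, state: str) -> Optional[TailSession]:
        session = self.sessions.get((source_ip, my_discriminator, tunnel_id))
        if session is None:
            return None  # no matching session: packet is not processed
        if state == "Up":
            session.state = "Up"
            session.has_been_up = True
        elif state in ("Down", "AdminDown"):
            session.state = "Down"
        return session

    def on_detection_timeout(self, key: SessionKey) -> None:
        self.sessions[key].state = "Down"

    def tunnel_declared_down(self, key: SessionKey) -> bool:
        session = self.sessions[key]
        return session.has_been_up and session.state == "Down"

    def may_switch_over(self, primary_key: SessionKey,
                        standby_available: bool) -> bool:
        return self.tunnel_declared_down(primary_key) and standby_available
```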

400	   If the Downstream PE's P-tunnel is already up when the Downstream PE
401	   receives the new x-PMSI A-D Route with BFD Discriminator attribute,
402	   the Downstream PE MUST accept the x-PMSI A-D Route and associate the
403	   value of BFD Discriminator field with the P-tunnel.  The Upstream PE
404	   MUST follow procedures listed above in this section to bring the p2mp
405	   BFD session up and use it to monitor the state of the associated
406	   P-tunnel.

408	   If the Downstream PE's P-tunnel is already up, its state being
409	   monitored by the p2mp BFD session, and the Downstream PE receives the
410	   new x-PMSI A-D Route without the BFD Discriminator attribute, the
411	   Downstream PE:

413	   o  MUST accept the x-PMSI A-D Route;

415	   o  MUST stop processing BFD control packets for this p2mp BFD
416	      session;

418	   o  SHOULD delete the p2mp BFD session associated with the P-tunnel;

420	   o  SHOULD NOT switch the traffic to the Standby Upstream PE.

422	3.1.7.  Per PE-CE link BFD Discriminator

424	   The following approach is defined in response to the detection by the
425	   upstream PE of a PE-CE link failure.  Even though the provider tunnel
426	   is still up, it is desired for the downstream PEs to switch to a
427	   backup upstream PE.  To achieve that, if the upstream PE detects that
428	   its PE-CE link has failed, it SHOULD set the bfd.LocalDiag of the
429	   p2mp BFD session to Concatenated Path Down and/or Reverse
430	   Concatenated Path Down (per Section 6.8.17 of [RFC5880]), unless it
431	   switches to a new PE-CE link within the time of
432	   bfd.DesiredMinTxInterval for the p2mp BFD session (in that case the
433	   upstream PE will start tracking the status of the new PE-CE link).
434	   When a downstream PE receives that bfd.LocalDiag code, it acts as if
435	   the tunnel itself had failed and tries to switch to a backup PE.
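
The diagnostic-code handling might be sketched as follows. The numeric values 6 and 8 are the [RFC5880] diagnostic codes for Concatenated Path Down and Reverse Concatenated Path Down; the function names, the microsecond units, and the use of "less than or equal" for "within the time" are assumptions of this example.

```python
from enum import IntEnum
from typing import Optional


class Diag(IntEnum):
    NO_DIAGNOSTIC = 0
    CONCATENATED_PATH_DOWN = 6
    REVERSE_CONCATENATED_PATH_DOWN = 8


def diag_after_pe_ce_failure(switch_time_us: Optional[int],
                             desired_min_tx_interval_us: int) -> Diag:
    """Upstream PE side. switch_time_us is how long it took to move to a
    new PE-CE link, or None if no replacement link was found."""
    if switch_time_us is not None and switch_time_us <= desired_min_tx_interval_us:
        return Diag.NO_DIAGNOSTIC  # recovered in time; track the new link
    return Diag.CONCATENATED_PATH_DOWN


def treat_as_tunnel_failure(local_diag: int) -> bool:
    """Downstream PE side: these codes are handled like a tunnel failure."""
    return local_diag in (Diag.CONCATENATED_PATH_DOWN,
                          Diag.REVERSE_CONCATENATED_PATH_DOWN)
```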

437	4.  Standby C-multicast route

439	   The procedures described below are limited to the case where the site
440	   that contains C-S is connected to two or more PEs; to simplify the
441	   description, the case of dual-homing is described.  The procedures
442	   require all the PEs of that MVPN to follow the UMH selection, as
443	   specified in [RFC6513], whether the PE is selected based on its IP
444	   address, the hashing algorithm described in Section 5.1.3 of
445	   [RFC6513], or the Installed UMH Route.  The procedures assume that if
446	   a site of a given MVPN that contains C-S is dual-homed to two PEs,
447	   then all the other sites of that MVPN would have two unicast VPN
448	   routes (VPN-IPv4 or VPN-IPv6) to C-S, each with its own RD.

450	   As long as C-S is reachable via both PEs, a given downstream PE will
451	   select one of the PEs connected to C-S as its Upstream PE for C-S.
452	   We will refer to the other PE connected to C-S as the "Standby
453	   Upstream PE".  Note that if the connectivity to C-S through the
454	   Primary Upstream PE becomes unavailable, then the PE will select the
455	   Standby Upstream PE as its Upstream PE for C-S.  When the Primary PE
456	   later becomes available, then the PE will select the Primary Upstream
457	   PE again as its Upstream PE.  Such behavior is referred to as
458	   "revertive" behavior and MUST be supported.  Non-revertive behavior
459	   would refer to the behavior of continuing to select the backup PE as
460	   the UMH even after the Primary has come up.  This non-revertive
461	   behavior MAY also be supported by an implementation and would be
462	   enabled through some configuration.
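
The difference between revertive and non-revertive behavior can be captured in a few lines of Python. The function and its string return values are purely illustrative; how availability is determined is covered elsewhere in the document.

```python
from typing import Optional


def select_upstream(primary_available: bool, standby_available: bool,
                    current: Optional[str], revertive: bool = True) -> Optional[str]:
    """Return "primary", "standby", or None for a given C-S."""
    # Non-revertive: stay on the standby PE for as long as it keeps working.
    if not revertive and current == "standby" and standby_available:
        return "standby"
    # Revertive (the mandatory behavior): prefer the primary when available.
    if primary_available:
        return "primary"
    if standby_available:
        return "standby"
    return None
```

With `revertive=False`, a downstream PE that has already failed over keeps using the standby PE after the primary recovers; with the default it moves back.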

464	   For readability, in the following sub-sections, the procedures are
465	   described for BGP C-multicast Source Tree Join routes, but they apply
466	   equally to BGP C-multicast Shared Tree Join routes failover for the
467	   case where the customer RP is dual-homed (substitute "C-RP" for
468	   "C-S").

470	4.1.  Downstream PE behavior

472	   When a (downstream) PE connected to some site of an MVPN needs to
473	   send a C-multicast route (C-S, C-G), then following the procedures
474	   specified in Section "Originating C-multicast routes by a PE" of
475	   [RFC6514], the PE sends the C-multicast route with an RT that
476	   identifies the Upstream PE selected by the PE originating the route.
477	   As long as C-S is reachable via the Primary Upstream PE, the Upstream
478	   PE is the Primary Upstream PE.  If C-S is reachable only via the
479	   Standby Upstream PE, then the Upstream PE is the Standby Upstream PE.

481	   If C-S is reachable via both the Primary and the Standby Upstream PE,
482	   then in addition to sending the C-multicast route with an RT that
483	   identifies the Primary Upstream PE, the PE also originates and sends
484	   a C-multicast route with an RT that identifies the Standby Upstream
485	   PE.  This route that has the semantics of being a 'standby'
486	   C-multicast route is further called a "Standby BGP C-multicast
487	   route", and is constructed as follows:

489	   o  the NLRI is constructed as for the original C-multicast route,
490	      except that the RD is the same as if the C-multicast route was
491	      built using the standby PE as the UMH (it will carry the RD
492	      associated with the unicast VPN route advertised by the standby
493	      PE for S and a Route Target derived from the standby PE's UMH
494	      route's VRF RT Import EC);

496	   o  the route SHOULD carry the "Standby PE" BGP Community (this is a
497	      new BGP Community, see Section 7).

499	   The normal and the standby C-multicast routes MUST have their Local
500	   Preference attribute adjusted so that, if two C-multicast routes with
501	   the same NLRI are received by a BGP peer, one carrying the "Standby PE"
502	   community and the other one *not* carrying the "Standby PE"
503	   community, then preference is given to the one *not* carrying the
504	   "Standby PE" community.  Such a situation can happen when, for
505	   instance, due to transient unicast routing inconsistencies or lack of
506	   support of the Standby PE community, two different downstream PEs
507	   consider different upstream PEs to be the primary one; in that case,
508	   without any precaution taken, both upstream PEs would process a
509	   standby C-multicast route and possibly stop forwarding at the same
510	   time.  For this purpose, routes that carry the "Standby PE" BGP
511	   Community MUST have the LOCAL_PREF attribute set to zero.

513	   Note that, when a PE advertises such a Standby C-multicast join for a
514	   (C-S, C-G) it MUST join the corresponding P-tunnel.
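
The construction of a Standby C-multicast route and the resulting preference between two routes with the same NLRI can be modeled as below. The `CMulticastRoute` type, the string encodings of the RD and RT, and the default LOCAL_PREF of 100 for normal routes are assumptions of this sketch; only the zero LOCAL_PREF and the "Standby PE" community come from the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

STANDBY_PE = "Standby PE"


@dataclass(frozen=True)
class CMulticastRoute:
    rd: str                # RD of the unicast VPN route from the chosen PE
    source: str            # C-S
    group: str             # C-G
    route_target: str      # derived from the chosen PE's VRF Route Import
    communities: FrozenSet[str] = frozenset()
    local_pref: int = 100  # assumed default for normal routes

    @property
    def nlri(self) -> Tuple[str, str, str]:
        return (self.rd, self.source, self.group)


def make_standby_route(source: str, group: str,
                       standby_rd: str, standby_rt: str) -> CMulticastRoute:
    """Build the route as if the standby PE were the UMH, then mark it."""
    return CMulticastRoute(standby_rd, source, group, standby_rt,
                           communities=frozenset({STANDBY_PE}), local_pref=0)


def best_route(routes: Iterable[CMulticastRoute]) -> CMulticastRoute:
    """Among routes with the same NLRI, the higher LOCAL_PREF wins, so a
    route without the "Standby PE" community is preferred."""
    return max(routes, key=lambda r: r.local_pref)
```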

516	   If at some later point the local PE determines that C-S is no longer
517	   reachable through the Primary Upstream PE, the Standby Upstream PE
518	   becomes the Upstream PE, and the local PE re-sends the C-multicast
519	   route with an RT that identifies the Standby Upstream PE, except that
520	   now the route does not carry the Standby PE BGP Community (which
521	   results in replacing the old route with a new route, with the only
522	   difference between these routes being the presence/absence of the
523	   Standby PE BGP Community).  Also, a LOCAL_PREF attribute MUST be set
524	   to zero.

526	4.2.  Upstream PE behavior

528	   When a PE receives a C-multicast route for a particular (C-S, C-G),
529	   and the RT carried in the route results in importing the route into a
530	   particular VRF on the PE, if the route carries the Standby PE BGP
531	   Community, then the PE performs as follows:

533	      when the PE determines (the use of the particular method to detect
534	      the failure is outside the scope of this document) that C-S is not
535	      reachable through some other PE, the PE SHOULD install VRF PIM
536	      state corresponding to this Standby BGP C-multicast route (the
537	      result will be that a PIM Join message will be sent to the CE
538	      towards C-S, and that the PE will receive (C-S, C-G) traffic), and
539	      the PE SHOULD forward (C-S, C-G) traffic received by the PE to
540	      other PEs through a P-tunnel rooted at the PE.

542	   Furthermore, irrespective of whether C-S carried in that route is
543	   reachable through some other PE:

545	   a) based on local policy, as soon as the PE receives this Standby BGP
546	      C-multicast route, the PE MAY install VRF PIM state corresponding
547	      to this BGP Source Tree Join route (the result will be that Join
548	      messages will be sent to the CE toward C-S, and that the PE will
549	      receive (C-S, C-G) traffic)

551	   b) based on local policy, as soon as the PE receives this Standby BGP
552	      C-multicast route, the PE MAY forward (C-S, C-G) traffic to other
553	      PEs through a P-tunnel independently of the reachability of C-S
554	      through some other PE. [note that this implies also doing (a)]

556	   Doing neither (a) nor (b) for a given (C-S, C-G) is called "cold root
557	   standby".

559	   Doing (a) but not (b) for a given (C-S, C-G) is called "warm root
560	   standby".

562	   Doing (b) (which implies also doing (a)) for a given (C-S, C-G) is
563	   called "hot root standby".
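
The three modes can be summarized in a short Python sketch. The enum and function names are invented for illustration, and the sketch assumes the upstream PE follows the SHOULD-level behavior described above when C-S is no longer reachable through the other PE.

```python
from enum import Enum


class RootStandby(Enum):
    COLD = "cold"
    WARM = "warm"
    HOT = "hot"


def classify(joins_on_receipt: bool, forwards_on_receipt: bool) -> RootStandby:
    """Map local policy choices (a) and (b) to a standby mode."""
    if forwards_on_receipt:  # (b), which implies (a)
        return RootStandby.HOT
    if joins_on_receipt:     # (a) only
        return RootStandby.WARM
    return RootStandby.COLD


def installs_pim_state(mode: RootStandby, c_s_reachable_via_other_pe: bool) -> bool:
    # Warm and hot modes join toward C-S as soon as the Standby route arrives.
    return mode is not RootStandby.COLD or not c_s_reachable_via_other_pe


def forwards_traffic(mode: RootStandby, c_s_reachable_via_other_pe: bool) -> bool:
    # Only hot mode forwards before the other PE is seen to have failed.
    return mode is RootStandby.HOT or not c_s_reachable_via_other_pe
```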

565	   Note that, if an upstream PE uses an S-PMSI only policy, it shall
566	   advertise an S-PMSI for a (C-S, C-G) as soon as it receives a
567	   C-multicast route for (C-S, C-G), normal or Standby; i.e., it shall
568	   not wait for receiving a non-Standby C-multicast route before
569	   advertising the corresponding S-PMSI.

571	   Section 9.3.2 of [RFC6514] describes the procedures of sending a
572	   Source-Active A-D route as a result of receiving the C-multicast
573	   route.  These procedures should be followed for both the normal and
574	   Standby C-multicast routes.

576	4.3.  Reachability determination

578	   The standby PE can use the following information to determine that
579	   C-S can or cannot be reached through the primary PE:

581	   o  presence/absence of a unicast VPN route toward C-S

583	   o  supposing that the standby PE is the egress of the tunnel rooted
584	      at the Primary PE, the standby PE can determine the reachability
585	      of C-S through the Primary PE based on the status of this tunnel,
586	      determined using the same criteria as those described in
587	      Section 3.1 (without using the UMH selection procedures of
588	      Section 3);

590	   o  other mechanisms MAY be used.
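
   A minimal sketch of how a standby PE might combine the first two
   inputs is given below.  Requiring both the unicast VPN route and an
   Up tunnel is an assumption of this sketch; this document only lists
   the inputs that can be used.  The function name and its parameters
   are hypothetical.

```python
from typing import Optional


def c_s_reachable_via_primary(has_unicast_vpn_route: bool,
                              primary_tunnel_up: Optional[bool] = None) -> bool:
    """Return True if C-S is considered reachable through the primary PE.

    has_unicast_vpn_route -- a unicast VPN route toward C-S is present
    primary_tunnel_up     -- status of the tunnel rooted at the primary
                             PE, or None when the standby PE is not an
                             egress of that tunnel and cannot observe it
    """
    if not has_unicast_vpn_route:
        return False
    if primary_tunnel_up is None:
        # No tunnel status available: rely on the unicast route alone.
        return True
    # Assumption of this sketch: both inputs must agree on reachability.
    return primary_tunnel_up
```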

592	4.4.  Inter-AS

594	   If the non-segmented inter-AS approach is used, the procedures in
595	   section 4 can be applied.

597	   When multicast VPNs are used in an inter-AS context with the
598	   segmented inter-AS approach described in section 8.2 of [RFC6514],
599	   the procedures in this section can be applied.

601	   A pre-requisite for the procedures described below to be applied for
602	   a source of a given MVPN is:

604	   o  that any PE of this MVPN receives two (or more) Inter-AS I-PMSI
605	      auto-discovery routes advertised by the AS of the source

607	   o  that these Inter-AS I-PMSI auto-discovery routes have distinct
608	      Route Distinguishers (as described in item "(2)" of section 9.2 of
609	      [RFC6514]).

611	   As an example, these conditions will be satisfied when the source is
612	   dual-homed to an AS that connects to the receiver AS through two
613	   ASBRs using auto-configured RDs.

615	4.4.1.  Inter-AS procedures for downstream PEs, ASBR fast failover

617	   The following procedure is applied by downstream PEs of an AS, for a
618	   source S in a remote AS.

620	   In addition to choosing an Inter-AS I-PMSI auto-discovery route
621	   advertised from the AS of the source to construct a C-multicast
622	   route, as described in Section 11.1.3 of [RFC6514], a downstream PE
623	   will choose a second Inter-AS I-PMSI auto-discovery route advertised
624	   from the AS of the source and use this route to construct and
625	   advertise a Standby C-multicast route (a C-multicast route carrying
626	   the Standby extended community), as described in Section 4.1.
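
   The selection of the two routes could be sketched as follows.  The
   InterAsIPmsiRoute type and the pick_primary_and_standby function are
   hypothetical.  Taking the first candidate as the primary choice is a
   placeholder for the selection of Section 11.1.3 of [RFC6514], and
   the RD values in the usage lines are illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class InterAsIPmsiRoute:
    rd: str          # Route Distinguisher carried in the A-D route
    source_as: int   # Source AS carried in the A-D route


def pick_primary_and_standby(
    routes: List[InterAsIPmsiRoute], source_as: int
) -> Tuple[Optional[InterAsIPmsiRoute], Optional[InterAsIPmsiRoute]]:
    """Pick the A-D routes used for the normal and Standby C-multicast routes."""
    candidates = [r for r in routes if r.source_as == source_as]
    if not candidates:
        return None, None
    # Placeholder for the RFC 6514 selection: take the first candidate.
    primary = candidates[0]
    # The second route must carry a distinct RD (see the prerequisites).
    standby = next((r for r in candidates[1:] if r.rd != primary.rd), None)
    return primary, standby


routes = [
    InterAsIPmsiRoute("192.0.2.1:1", 65001),
    InterAsIPmsiRoute("192.0.2.2:1", 65001),
    InterAsIPmsiRoute("198.51.100.1:1", 65002),
]
primary, standby = pick_primary_and_standby(routes, 65001)
```

   When only one matching route exists, the sketch returns no standby
   choice, which corresponds to the prerequisites of this section not
   being met.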

628	4.4.2.  Inter-AS procedures for ASBRs

630	   When an upstream ASBR receives a C-multicast route, and at least one
631	   of the RTs of the route matches one of the ASBR Import RTs, the ASBR
632	   that supports this specification MUST locate an Inter-AS I-PMSI A-D
633	   route whose RD and Source AS respectively match the RD and Source AS
634	   carried in the C-multicast route.  If a match is found and the
635	   C-multicast route carries the Standby PE BGP Community, then the ASBR
636	   MUST perform as follows:

638	   o  if the route was received over iBGP and its LOCAL_PREF attribute
639	      is set to zero, then it MUST be re-advertised in eBGP with a MED
640	      attribute (MULTI_EXIT_DISC) set to the highest possible value
641	      (0xffff)

643	   o  if the route was received over eBGP and its MED attribute is set
644	      to 0xffff, then it MUST be re-advertised in iBGP with a LOCAL_PREF
645	      attribute set to zero

647	   Other ASBR procedures are applied without modification.
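
   The two re-advertisement rules above can be sketched as a function
   that returns the attribute to set.  The sketch assumes that the
   matching Inter-AS I-PMSI A-D route has already been located; the
   type and function names are hypothetical, and the constants are the
   values given in the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional

STANDBY_MED = 0xFFFF      # MED value given in the text above
STANDBY_LOCAL_PREF = 0    # LOCAL_PREF value given in the text above


@dataclass(frozen=True)
class ReceivedCMulticastRoute:
    carries_standby_community: bool
    received_over: str                 # "ibgp" or "ebgp"
    local_pref: Optional[int] = None   # carried on iBGP sessions
    med: Optional[int] = None          # may be carried on eBGP sessions


def standby_readvertisement_attrs(
    route: ReceivedCMulticastRoute,
) -> Dict[str, int]:
    """Return the attribute to set when re-advertising a Standby route.

    An empty dict means that neither rule applies and the other ASBR
    procedures are used without modification.
    """
    if not route.carries_standby_community:
        return {}
    if (route.received_over == "ibgp"
            and route.local_pref == STANDBY_LOCAL_PREF):
        # Received over iBGP with LOCAL_PREF zero: re-advertise in eBGP.
        return {"MED": STANDBY_MED}
    if route.received_over == "ebgp" and route.med == STANDBY_MED:
        # Received over eBGP with MED 0xffff: re-advertise in iBGP.
        return {"LOCAL_PREF": STANDBY_LOCAL_PREF}
    return {}
```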

649	5.  Hot Root Standby

651	   The mechanisms defined in Section 4 and Section 3 can be used
652	   together as follows.

654	   The principle is that, for a given VRF (or possibly only for a given
655	   (C-S, C-G)):

657	   o  downstream PEs advertise a Standby BGP C-multicast route (based on
658	      Section 4)

660	   o  upstream PEs use the "hot standby" optional behavior and thus will
661	      forward traffic for a given multicast state as soon as they have
662	      either a (primary) BGP C-multicast route or a Standby BGP
663	      C-multicast route for that state (or both)

665	   o  downstream PEs accept traffic from the primary or standby tunnel,
666	      based on the status of the tunnel (based on Section 3)
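
   The downstream PE behavior in the last item can be sketched as
   below.  Preferring the primary tunnel whenever it is Up is an
   assumption of this sketch, and the function name and its return
   values are hypothetical.

```python
from typing import Optional


def tunnel_to_accept(primary_tunnel_up: bool,
                     standby_tunnel_up: bool) -> Optional[str]:
    """Return the tunnel from which (C-S, C-G) traffic is accepted.

    Exactly one tunnel is selected at a time, so traffic arriving on
    the other tunnel is not forwarded to CEs.
    """
    if primary_tunnel_up:
        # Assumption: the primary tunnel is preferred whenever it is Up.
        return "primary"
    if standby_tunnel_up:
        return "standby"
    return None
```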

668	   Other combinations of the mechanisms proposed in Section 4 and
669	   Section 3 are for further study.

671	   Note that the same level of protection would be achievable with a
672	   simple C-multicast Source Tree Join route advertised to both the
673	   primary and secondary upstream PEs (carrying as Route Target extended
674	   communities, the values of the VRF Route Import attribute of each VPN
675	   route from each upstream PE).  The advantage of using the Standby
676	   semantic is that, supposing that downstream PEs always advertise
677	   a Standby C-multicast route to the secondary upstream PE, it allows
678	   the protection level to be chosen through a change of configuration
679	   on the secondary upstream PE, without requiring any reconfiguration
680	   of all the downstream PEs.

682	6.  Duplicate packets

684	   The multicast VPN specification [RFC6513] requires that a PE forward
685	   to CEs only the packets coming from the expected upstream PE
686	   (Section 9.1 of [RFC6513]).

688	   Readers should note that compliance with this part of the
689	   multicast VPN specification is especially important when two
690	   distinct upstream PEs may forward the same traffic on P-tunnels at
691	   the same time in the steady state.  That will be the case when
692	   "hot root standby" mode is used (Section 4), and it can also be
693	   the case if the procedures of Section 3 are used and (a) the rules
694	   determining the status of a tree are not the same on two distinct
695	   downstream PEs or (b) the rule determining the status of a tree
696	   depends on conditions local to a PE (e.g., the PE-P upstream link
697	   being up).

699	7.  IANA Considerations

701	   IANA is requested to allocate the BGP "Standby PE" community value
702	   (TBA1) from the Border Gateway Protocol (BGP) Well-known Communities
703	   registry.

705	7.1.  BFD Discriminator

707	   This document defines a new BGP optional transitive attribute, called
708	   "BFD Discriminator".  IANA is requested to allocate a codepoint
709	   (TBA2) in the "BGP Path Attributes" registry to the BFD Discriminator
710	   attribute.

712	   IANA is requested to create a new BFD Mode sub-registry in the Border
713	   Gateway Protocol (BGP) Parameters registry as described in Table 1.

715	           +---------+-------------------------+---------------+
716	           | Range   | Registration Procedures | Note          |
717	           +---------+-------------------------+---------------+
718	           | 0-249   |     Standards Action    |               |
719	           | 250-253 |  Specification Required | Experimental  |
720	           | 254     |       Private Use       |               |
721	           | 255     |     Standards Action    |               |
722	           +---------+-------------------------+---------------+

724	                      Table 1: BFD Mode Sub-registry

726	   IANA is requested to allocate the following values from the BFD Mode
727	   sub-registry as defined in Table 2.

729	               +-------+------------------+---------------+
730	               | Value | Description      | Reference     |
731	               +-------+------------------+---------------+
732	               | 0     | Reserved         | This document |
733	               | TBA3  | P2MP BFD Session | This document |
734	               | 255   | Reserved         | This document |
735	               +-------+------------------+---------------+

737	                             Table 2: BFD Mode

739	7.2.  BFD Discriminator Extension Type

741	   IANA is requested to create a new BFD Discriminator Extension Type
742	   sub-registry in the Border Gateway Protocol (BGP) Parameters registry
743	   as described in Table 3.

745	            +---------+-------------+-------------------------+
746	            | Value   | Description | Reference               |
747	            +---------+-------------+-------------------------+
748	            | 0       |   Reserved  |                         |
749	            | 1-191   |  Unassigned | IETF Review             |
750	            | 192-251 |  Unassigned | First Come First Served |
751	            | 252-254 |  Unassigned | Private Use             |
752	            | 255     |   Reserved  |                         |
753	            +---------+-------------+-------------------------+

755	          Table 3: BFD Discriminator Extension Type Sub-registry

757	8.  Security Considerations

759	   This document describes procedures based on [RFC6513] and [RFC6514]
760	   and hence shares the security considerations presented in those
761	   specifications.

763	   This document makes use of BFD, as defined in [RFC8562], which, in
764	   turn, is based on [RFC5880].  Security considerations relevant to
765	   each protocol are discussed in the respective protocol
766	   specifications.

768	9.  Acknowledgments

770	   The authors want to thank Greg Reaume, Eric Rosen, Jeffrey Zhang, and
771	   Zheng (Sandy) Zhang for their reviews, useful comments, and helpful
772	   suggestions.

774	10.  Contributor Addresses

776	   Below is a list of other contributing authors in alphabetical order:

778	      Rahul Aggarwal
779	      Arktan

781	      Email: raggarwa_1@yahoo.com

783	      Nehal Bhau
784	      Cisco

786	      Email: NBhau@cisco.com

788	      Clayton Hassen
789	      Bell Canada
790	      2955 Virtual Way
791	      Vancouver
792	      CANADA

794	      Email: Clayton.Hassen@bell.ca

796	      Wim Henderickx
797	      Nokia
798	      Copernicuslaan 50
799	      Antwerp  2018
800	      Belgium

802	      Email: wim.henderickx@nokia.com

804	      Pradeep Jain
805	      Nokia
806	      701 E Middlefield Rd
807	      Mountain View, CA  94043
808	      USA

810	      Email: pradeep.jain@nokia.com

812	      Jayant Kotalwar
813	      Nokia
814	      701 E Middlefield Rd
815	      Mountain View, CA  94043
816	      USA

818	      Email: Jayant.Kotalwar@nokia.com

820	      Praveen Muley
821	      Nokia
822	      701 East Middlefield Rd
823	      Mountain View, CA  94043
824	      U.S.A.

826	      Email: praveen.muley@nokia.com

828	      Ray (Lei) Qiu
829	      Juniper Networks
830	      1194 North Mathilda Ave.
831	      Sunnyvale, CA  94089
832	      U.S.A.

834	      Email: rqiu@juniper.net

836	      Yakov Rekhter
837	      Juniper Networks
838	      1194 North Mathilda Ave.
839	      Sunnyvale, CA  94089
840	      U.S.A.

842	      Email: yakov@juniper.net

844	      Kanwar Singh
845	      Nokia
846	      701 E Middlefield Rd
847	      Mountain View, CA  94043
848	      USA

850	      Email: kanwar.singh@nokia.com

852	11.  References

854	11.1.  Normative References

856	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
857	              Requirement Levels", BCP 14, RFC 2119,
858	              DOI 10.17487/RFC2119, March 1997,
859	              <https://www.rfc-editor.org/info/rfc2119>.

861	   [RFC4875]  Aggarwal, R., Ed., Papadimitriou, D., Ed., and S.
862	              Yasukawa, Ed., "Extensions to Resource Reservation
863	              Protocol - Traffic Engineering (RSVP-TE) for Point-to-
864	              Multipoint TE Label Switched Paths (LSPs)", RFC 4875,
865	              DOI 10.17487/RFC4875, May 2007,
866	              <https://www.rfc-editor.org/info/rfc4875>.

868	   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
869	              (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010,
870	              <https://www.rfc-editor.org/info/rfc5880>.

872	   [RFC6513]  Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/
873	              BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February
874	              2012, <https://www.rfc-editor.org/info/rfc6513>.

876	   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
877	              Encodings and Procedures for Multicast in MPLS/BGP IP
878	              VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012,
879	              <https://www.rfc-editor.org/info/rfc6514>.

881	   [RFC7606]  Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
882	              Patel, "Revised Error Handling for BGP UPDATE Messages",
883	              RFC 7606, DOI 10.17487/RFC7606, August 2015,
884	              <https://www.rfc-editor.org/info/rfc7606>.

886	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
887	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
888	              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

890	   [RFC8562]  Katz, D., Ward, D., Pallagatti, S., Ed., and G. Mirsky,
891	              Ed., "Bidirectional Forwarding Detection (BFD) for
892	              Multipoint Networks", RFC 8562, DOI 10.17487/RFC8562,
893	              April 2019, <https://www.rfc-editor.org/info/rfc8562>.

895	11.2.  Informative References

897	   [RFC4090]  Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast
898	              Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
899	              DOI 10.17487/RFC4090, May 2005,
900	              <https://www.rfc-editor.org/info/rfc4090>.

902	   [RFC7431]  Karan, A., Filsfils, C., Wijnands, IJ., Ed., and B.
903	              Decraene, "Multicast-Only Fast Reroute", RFC 7431,
904	              DOI 10.17487/RFC7431, August 2015,
905	              <https://www.rfc-editor.org/info/rfc7431>.

907	Authors' Addresses

909	   Thomas Morin (editor)
910	   Orange
911	   2, avenue Pierre Marzin
912	   Lannion  22307
913	   France

915	   Email: thomas.morin@orange-ftgroup.com

917	   Robert Kebler (editor)
918	   Juniper Networks
919	   1194 North Mathilda Ave.
920	   Sunnyvale, CA  94089
921	   U.S.A.

923	   Email: rkebler@juniper.net

924	   Greg Mirsky (editor)
925	   ZTE Corp.

927	   Email: gregimirsky@gmail.com