Network Working Group                                      T. Morin, Ed.
Internet-Draft                                                    Orange
Intended status: Standards Track                          R. Kebler, Ed.
Expires: November 5, 2018                               Juniper Networks
                                                          G. Mirsky, Ed.
                                                               ZTE Corp.
                                                             May 4, 2018


                  Multicast VPN fast upstream failover
                 draft-ietf-bess-mvpn-fast-failover-03

Abstract

   This document defines multicast VPN extensions and procedures that
   allow fast failover for upstream failures, by allowing downstream PEs
   to take into account the status of Provider-Tunnels (P-tunnels) when
   selecting the upstream PE for a VPN multicast flow, and extending BGP
   MVPN routing so that a C-multicast route can be advertised toward a
   standby upstream PE.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 5, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.  Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  UMH Selection based on tunnel status
     3.1.  Determining the status of a tunnel
       3.1.1.  mVPN tunnel root tracking
       3.1.2.  PE-P Upstream link status
       3.1.3.  P2MP RSVP-TE tunnels
       3.1.4.  Leaf-initiated P-tunnels
       3.1.5.  (S, G) counter information
       3.1.6.  BFD Discriminator
       3.1.7.  Per PE-CE link BFD Discriminator
   4.  Standby C-multicast route
     4.1.  Downstream PE behavior
     4.2.  Upstream PE behavior
     4.3.  Reachability determination
     4.4.  Inter-AS
       4.4.1.  Inter-AS procedures for downstream PEs, ASBR fast
               failover
       4.4.2.  Inter-AS procedures for ASBRs
   5.  Hot leaf standby
   6.  Duplicate packets
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgments
   10. Contributor Addresses
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   In the context of multicast in BGP/MPLS VPNs, it is desirable to
   provide mechanisms allowing fast recovery of connectivity on
   different types of failures.  This document addresses failures of
   elements in the provider network that are upstream of PEs connected
   to VPN sites with receivers.

   Section 3 describes local procedures allowing an egress PE (a PE
   connected to a receiver site) to take into account the status of
   P-tunnels to determine the Upstream Multicast Hop (UMH) for a given
   (C-S, C-G).  This method does not provide a "fast failover" solution
   when used alone, but can be used with the following sections for a
   "fast failover" solution.

   Section 4 describes protocol extensions that can speed up failover by
   not requiring any multicast VPN routing message exchange at recovery
   time.

   Moreover, Section 5 describes a "hot leaf standby" mechanism that
   uses a combination of these two mechanisms.  This approach has
   similarities with the solution described in [RFC7431] to improve
   failover times when PIM routing is used in a network given some
   topology and metric constraints.

2.  Terminology

   The terminology used in this document is the terminology defined in
   [RFC6513] and [RFC6514].

   x-PMSI:  I-PMSI or S-PMSI

3.  UMH Selection based on tunnel status

   Current multicast VPN specifications [RFC6513], section 5.1, describe
   the procedures used by a multicast VPN downstream PE to determine
   what the upstream multicast hop (UMH) is for a given (C-S,C-G).
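
   As a rough, non-normative illustration of the tunnel-aware selection
   that the rest of this section specifies, the sketch below orders UMH
   candidates so that PEs whose P-tunnel is not known to be down come
   first and PEs whose P-tunnel is "down" are kept only as a last
   resort.  The names UmhCandidate, TunnelStatus, order_candidates and
   select_umh are hypothetical, and the input list is assumed to be
   already sorted by the basic [RFC6513] preference.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class TunnelStatus(Enum):
    NOT_KNOWN_DOWN = "not-known-down"
    DOWN = "down"


@dataclass
class UmhCandidate:
    pe: str                                # upstream PE identifier (illustrative)
    tunnel_status: Optional[TunnelStatus]  # None: no applicable x-PMSI advertised
    has_vrf_route_import: bool = True      # VRF Route Import on the unicast VPN route for S


def order_candidates(candidates: List[UmhCandidate]) -> List[UmhCandidate]:
    """Stable partition of candidates: preferred group first, 'down' group last.

    Assumption: `candidates` is already in basic RFC 6513 preference order.
    """
    def preferred(c: UmhCandidate) -> bool:
        if c.tunnel_status is None:
            # No x-PMSI for this (C-S,C-G): still eligible if the unicast
            # VPN route for S carries a VRF Route Import attribute.
            return c.has_vrf_route_import
        return c.tunnel_status is TunnelStatus.NOT_KNOWN_DOWN

    first = [c for c in candidates if preferred(c)]
    last_resort = [c for c in candidates if c.tunnel_status is TunnelStatus.DOWN]
    return first + last_resort


def select_umh(candidates: List[UmhCandidate]) -> Optional[UmhCandidate]:
    """Return the first candidate after tunnel-aware ordering, if any."""
    ordered = order_candidates(candidates)
    return ordered[0] if ordered else None
```

   Keeping the "down" candidates at the end of the list, rather than
   discarding them, mirrors the graceful fallback to the basic UMH
   selection when every tunnel is (possibly wrongly) seen as down.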

   The procedure described here is an OPTIONAL procedure that consists
   of having a downstream PE take into account the status of P-tunnels
   rooted at each possible upstream PE, for including or not including
   each given PE in the list of candidate UMHs for a given (C-S,C-G)
   state.  The result is that, if a P-tunnel is "down" (see
   Section 3.1), the PE that is the root of the P-tunnel will not be
   considered for UMH selection, which will cause the downstream PE to
   fail over to the upstream PE that is next in the list of candidates.

   A downstream PE monitors the status of the tunnels of UMHs that are
   ahead of the current one.  Whenever the downstream PE determines that
   one of these tunnels is no longer known to be down, the PE selects
   the UMH corresponding to that tunnel as the new UMH.

   More precisely, UMH determination for a given (C-S,C-G) will consider
   the UMH candidates in the following order:

   o  first, the UMH candidates that either (a) advertise a PMSI bound
      to a tunnel, where the specified tunnel is not known to be down or
      (b) do not advertise any x-PMSI applicable to the given (C-S,C-G)
      but have associated a VRF Route Import BGP attribute to the
      unicast VPN route for S (this is necessary to avoid incorrectly
      invalidating a UMH PE that would use a policy where no I-PMSI is
      advertised for a given VRF and where only S-PMSIs are used, the
      S-PMSI advertisement being possibly done only after the upstream
      PE receives a C-multicast route for (C-S, C-G)/(C-*, C-G) to be
      carried over the advertised S-PMSI)

   o  second, the UMH candidates that advertise a PMSI bound to a tunnel
      that is "down"; these will thus be used as a last resort to
      ensure a graceful fallback to the basic MVPN UMH selection
      procedures in the hypothetical case where a false negative would
      occur when determining the status of all tunnels

   For a given downstream PE and a given VRF, the P-tunnel
   corresponding to a given upstream PE for a given (C-S,C-G) state is
   the S-PMSI tunnel advertised by that upstream PE for this (C-S,C-G)
   and imported into that VRF, or if there isn't any such S-PMSI, the
   I-PMSI tunnel advertised by that PE and imported into that VRF.

   Note that this document assumes that if a site of a given MVPN that
   contains C-S is dual-homed to two PEs, then all the other sites of
   that MVPN would have two unicast VPN routes (VPN-IPv4 or VPN-IPv6)
   to C-S, each with its own RD.

3.1.  Determining the status of a tunnel

   Different factors can be considered to determine the "status" of a
   P-tunnel and are described in the following sub-sections.  The
   procedure proposed here also allows downstream PEs to apply
   different rules to define what the status of a P-tunnel is (please
   see Section 6), and some of these rules will produce a result that
   may be different for different downstream PEs.  Thus what is called
   the "status" of a P-tunnel in this section is not a characteristic
   of the tunnel in itself, but is the status of the tunnel, *as seen
   from a particular downstream PE*.  Additionally, some of the
   following methods determine the ability of a downstream PE to
   receive traffic on the P-tunnel rather than the status of the
   P-tunnel itself.  This could be referred to as "P-tunnel reception
   status", but for simplicity, we will use the terminology of P-tunnel
   "status" for all of these methods.

   Depending on the criteria used to determine the status of a P-tunnel,
   there may be an interaction with another resiliency mechanism used
   for the P-tunnel itself, and the UMH update may happen immediately or
   may need to be delayed.  Each particular case is covered in each
   separate sub-section below.

3.1.1.  mVPN tunnel root tracking

   A condition to consider that the status of a P-tunnel is up is that
   the root of the tunnel, as determined in the PMSI tunnel attribute,
   is reachable through unicast routing tables.  In this case, the
   downstream PE can immediately update its UMH when the reachability
   condition changes.

   This is similar to BGP next-hop tracking for VPN routes, except that
   the address considered is not the BGP next-hop address, but the root
   address in the PMSI tunnel attribute.

   If BGP next-hop tracking is done for VPN routes and the root address
   of a given tunnel happens to be the same as the next-hop address in
   the BGP auto-discovery route advertising the tunnel, then this
   mechanism may be omitted for this tunnel, as it will not bring any
   specific benefit.

3.1.2.  PE-P Upstream link status

   A condition to consider a tunnel status as Up can be that the last-
   hop link of the P-tunnel is up.

   This method should not be used when there is a fast restoration
   mechanism (such as MPLS FRR [RFC4090]) in place for the link.

3.1.3.  P2MP RSVP-TE tunnels

   For P-tunnels of type P2MP MPLS-TE, the status of the P-tunnel is
   considered up if one or more of the P2MP RSVP-TE LSPs, identified by
   the P-tunnel Attribute, are in Up state.  The determination of
   whether a P2MP RSVP-TE LSP is in Up state requires Path and Resv
   state for the LSP and is based on procedures in [RFC4875].  In this
   case, the downstream PE can immediately update its UMH when the
   reachability condition changes.

   When signaling state for a P2MP TE LSP is removed (e.g., if the
   ingress of the P2MP TE LSP sends a PathTear message) or the P2MP TE
   LSP changes state from Up to Down as determined by procedures in
   [RFC4875], the status of the corresponding P-tunnel SHOULD be
   re-evaluated.
   If the P-tunnel transitions from Up to Down state, the upstream PE
   that is the ingress of the P-tunnel SHOULD NOT be considered a valid
   UMH.

3.1.4.  Leaf-initiated P-tunnels

   A PE can be removed from the UMH candidate list for a given (S, G)
   if the P-tunnel for this (S, G) (I-PMSI or S-PMSI, depending) is
   leaf-triggered (PIM, mLDP), but for some reason internal to the
   protocol the upstream one-hop branch of the tunnel from P to PE
   cannot be built.  In this case, the downstream PE can immediately
   update its UMH when the reachability condition changes.

3.1.5.  (S, G) counter information

   In cases where the downstream node can be configured so that the
   maximum inter-packet time is known for all the multicast flows mapped
   on a P-tunnel, the local per-(C-S,C-G) traffic counter information
   for traffic received on this P-tunnel can be used to determine the
   status of the P-tunnel.

   When such a procedure is used, in the context where fast restoration
   mechanisms are used for the P-tunnels, downstream PEs should be
   configured to wait before updating the UMH, to let the P-tunnel
   restoration mechanism happen.  A configurable timer MUST be provided
   for this purpose, and it is recommended to provide a reasonable
   default value for this timer.

   This method can be applicable, for instance, when an (S, G) flow is
   mapped on an S-PMSI.

   In cases where this mechanism is used in conjunction with Hot leaf
   standby, no prior knowledge of the rate of the multicast streams is
   required; downstream PEs can compare reception on the two P-tunnels
   to determine when one of them is down.

3.1.6.  BFD Discriminator

   P-tunnel status can be derived from the status of a multipoint BFD
   session [I-D.ietf-bfd-multipoint] whose discriminator is advertised
   along with an x-PMSI A-D route.

   This document defines the format and ways of using a new BGP
   attribute called the "BGP-BFD attribute".  This is an optional
   transitive BGP attribute.  The format of this attribute is defined as
   follows:

          +-------------------------------+
          |      Flags (1 octet)          |
          +-------------------------------+
          |  BFD Discriminator (4 octets) |
          +-------------------------------+

   The Flags field has the following format:

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |   reserved    |
          +-+-+-+-+-+-+-+-+

3.1.6.1.  Upstream PE Procedures

   When it is desired to track the P-tunnel status using a p2mp BFD
   session, the Upstream PE:

   o  MUST initiate a BFD session and set bfd.SessionType =
      MultipointHead as described in [I-D.ietf-bfd-multipoint];

   o  MUST use [Ed.note] address as destination IP address when
      transmitting BFD control packets;

   o  MUST use the IP address of the Upstream PE as source IP address
      when transmitting BFD control packets;

   o  MUST include the BGP-BFD Attribute in the x-PMSI A-D Route with
      BFD Discriminator value set to My Discriminator value.

   If tracking of the P-tunnel by using a p2mp BFD session is to be
   enabled after the P-tunnel has already been signaled, then the
   procedure described above MUST be followed.  Note that the x-PMSI
   A-D Route MUST be re-sent with exactly the same attributes as before
   and the BGP-BFD Attribute included.

   If the P-tunnel is already signaled and its status is tracked using
   the p2mp BFD session, and it is desired to stop tracking P-tunnel
   status using BFD, then:

   o  the x-PMSI A-D Route MUST be re-sent with exactly the same
      attributes as before, but the BGP-BFD Attribute MUST be excluded;

   o  the p2mp BFD session SHOULD be deleted.

3.1.6.2.  Downstream PE Procedures

   On receiving the BGP-BFD Attribute in the x-PMSI A-D Route, the
   Downstream PE:

   o  MUST associate the received BFD discriminator value with the
      P-tunnel originating from the Root PE;

   o  MUST create a p2mp BFD session and set bfd.SessionType =
      MultipointTail as described in [I-D.ietf-bfd-multipoint];

   o  MUST use the source IP address of a BFD control packet and the
      value of BFD Discriminator from the BGP-BFD Attribute to properly
      demultiplex BFD sessions.

   After the state of the p2mp BFD session is up, i.e.,
   bfd.SessionState = Up, the session state will then be used to track
   the health of the P-tunnel.

   According to [I-D.ietf-bfd-multipoint], if the Downstream PE receives
   Down or AdminDown in the State field of the BFD control packet, or
   the Detection Timer associated with the BFD session expires, the BFD
   session state is down, i.e., bfd.SessionState = Down.  When the BFD
   session state is Down, the P-tunnel associated with the BFD session
   MUST be declared down.  Then the Downstream PE MAY initiate a
   switchover of the traffic from the Primary Upstream PE to the
   Standby Upstream PE.

   If the Downstream PE's P-tunnel is already up when the Downstream PE
   receives the new x-PMSI A-D Route with BGP-BFD Attribute, the
   Downstream PE MUST accept the x-PMSI A-D Route and associate the
   value of BFD Discriminator field with the P-tunnel.  The Upstream PE
   MUST follow procedures listed above in this section to bring the p2mp
   BFD session up and use it to monitor the state of the associated
   P-tunnel.
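
   The attribute layout and the tail-side behavior described in this
   section can be sketched as follows.  This is a minimal,
   non-normative illustration: it covers only the attribute value (not
   the BGP attribute header, whose type code is not given here),
   assumes network byte order as is usual for BGP, and models the
   MultipointTail state in a simplified form.  The names
   encode_bgp_bfd, decode_bgp_bfd and TailSession are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import Tuple

# Attribute value: Flags (1 octet) followed by BFD Discriminator (4 octets).
# Network byte order is an assumption of this sketch.
_BGP_BFD = struct.Struct("!BI")


def encode_bgp_bfd(discriminator: int, flags: int = 0) -> bytes:
    """Build the 5-octet BGP-BFD attribute value (flags are reserved)."""
    return _BGP_BFD.pack(flags, discriminator)


def decode_bgp_bfd(value: bytes) -> Tuple[int, int]:
    """Return (flags, discriminator) from a 5-octet attribute value."""
    if len(value) != _BGP_BFD.size:
        raise ValueError("BGP-BFD attribute value must be 5 octets")
    return _BGP_BFD.unpack(value)


@dataclass
class TailSession:
    """Simplified MultipointTail session bound to one P-tunnel."""
    root_pe: str            # expected source IP address of control packets
    discriminator: int      # value learned from the BGP-BFD Attribute
    detection_time: float   # seconds; illustrative parameter
    state: str = "Down"
    was_up: bool = False
    last_rx: float = 0.0

    def on_control_packet(self, src: str, my_disc: int, state: str, now: float) -> None:
        # Demultiplex on (source address, discriminator) as the text requires.
        if (src, my_disc) != (self.root_pe, self.discriminator):
            return
        if state in ("Down", "AdminDown"):
            self.state = "Down"
        else:
            self.state, self.was_up, self.last_rx = "Up", True, now

    def poll(self, now: float) -> str:
        # Detection Timer expiry moves the session to Down.
        if self.state == "Up" and now - self.last_rx > self.detection_time:
            self.state = "Down"
        return self.state

    def tunnel_down(self) -> bool:
        # The session tracks tunnel health only once it has been Up.
        return self.was_up and self.state == "Down"
```

   A downstream PE could call tunnel_down() as one input to the UMH
   selection of Section 3; whether it then switches over remains a MAY.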

   If the Downstream PE's P-tunnel is already up, its state being
   monitored by the p2mp BFD session, and the Downstream PE receives the
   new x-PMSI A-D Route without the BGP-BFD Attribute, the Downstream
   PE:

   o  MUST accept the x-PMSI A-D Route;

   o  MUST stop receiving BFD control packets for this p2mp BFD session;

   o  SHOULD delete the p2mp BFD session associated with the P-tunnel;

   o  SHOULD NOT switch the traffic to the Standby Upstream PE.

   When such a procedure is used, in the context where fast restoration
   mechanisms are used for the P-tunnels, leaf PEs should be configured
   to wait before updating the UMH, to let the P-tunnel restoration
   mechanism happen.  A configurable timer MUST be provided for this
   purpose, and it is recommended to provide a reasonable default value
   for this timer.

3.1.7.  Per PE-CE link BFD Discriminator

   The following approach is defined for fast failover in response to
   the detection of PE-CE link failures, in which UMH selection for a
   given C-multicast route takes into account the state of the BFD
   session associated with the state of the upstream PE-CE link.

3.1.7.1.  Upstream PE Procedures

   For each protected PE-CE link, the upstream PE initiates a multipoint
   BFD session [I-D.ietf-bfd-multipoint] as MultipointHead toward
   downstream PEs.  A downstream PE monitors the state of the p2mp
   session as MultipointTail and MAY interpret transition of the BFD
   session into Down state as the indication of the associated PE-CE
   link being down.

   For SSM groups, the upstream PE advertises an (S, G) S-PMSI A-D
   route or wildcard (S,*) S-PMSI A-D route for each received SSM
   (S, G) C-multicast route for which protection is desired.  For each
   ASM (S, G) C-multicast route for which protection is desired, the
   upstream PE advertises an (S, G) S-PMSI A-D route.
   For each ASM (*,G) C-multicast route for which protection is desired,
   the upstream PE advertises a wildcard (*,G) S-PMSI A-D route.  Note
   that all S-PMSI A-D routes can signal the same P-tunnel, so there is
   no need for a new P-tunnel for each S-PMSI A-D route.  The multicast
   flows for which protection is desired are controlled by
   configuration/policy on the upstream PE.  The protected link is the
   RPF PE-CE interface towards the src/RP.  The upstream PE advertises
   the BFD discriminator of the protected link in the S-PMSI A-D route.
   If the route to the src/RP changes such that the RPF interface is
   changed to be a new PE-CE interface, then the upstream PE will update
   the S-PMSI A-D route with the BGP-BFD Attribute included, so that the
   value of the BFD Discriminator is associated with the new RPF link.

3.1.7.2.  Downstream PE Procedures

   If an S-PMSI A-D route bound to a given C-multicast is signaled with
   a multipoint BFD session, then the upstream PE is considered during
   UMH selection for the C-multicast if and only if the corresponding
   BFD session is not in state Down, i.e., bfd.SessionState != Down.
   Whenever the state of the BFD session changes to Down, the Provider
   Tunnel will be considered down, and the downstream PE will switch to
   the backup Provider Tunnel.  Note that the Provider Tunnel is
   considered down only for the C-multicast states that match an
   S-PMSI A-D route that included the BGP-BFD Attribute with the BFD
   Discriminator of the p2mp BFD session that is down.

4.  Standby C-multicast route

   The procedures described below are limited to the case where the site
   that contains C-S is connected to exactly two PEs.  The procedures
   require all the PEs of that MVPN to follow the single forwarder PE
   selection, as specified in [RFC6513].
   The procedures assume that if a site of a given MVPN that contains
   C-S is dual-homed to two PEs, then all the other sites of that MVPN
   would have two unicast VPN routes (VPN-IPv4 or VPN-IPv6) to C-S, each
   with its own RD.

   As long as C-S is reachable via both PEs, a given downstream PE will
   select one of the PEs connected to C-S as its Upstream PE with
   respect to C-S.  We will refer to the other PE connected to C-S as
   the "Standby Upstream PE".  Note that if the connectivity to C-S
   through the Primary Upstream PE becomes unavailable, then the PE will
   select the Standby Upstream PE as its Upstream PE with respect to
   C-S.  When the Primary PE later becomes available, then the PE will
   select the Primary Upstream PE again as its Upstream PE.  This is
   referred to as "revertive" behavior and MUST be supported.  Non-
   revertive behavior would refer to the behavior of continuing to
   select the backup PE as the UMH even after the Primary has come up.
   This non-revertive behavior can also be optionally supported by an
   implementation and would be enabled through some configuration.

   For readability, in the following sub-sections, the procedures are
   described for BGP C-multicast Source Tree Join routes, but they apply
   equally to BGP C-multicast Shared Tree Join routes failover for the
   case where the customer RP is dual-homed (substitute "C-RP" for
   "C-S").

4.1.  Downstream PE behavior

   When a (downstream) PE connected to some site of an MVPN needs to
   send a C-multicast route (C-S, C-G), then following the procedures
   specified in Section "Originating C-multicast routes by a PE" of
   [RFC6514] the PE sends the C-multicast route with an RT that
   identifies the Upstream PE selected by the PE originating the route.
   As long as C-S is reachable via the Primary Upstream PE, the Upstream
   PE is the Primary Upstream PE.
   If C-S is reachable only via the Standby Upstream PE, then the
   Upstream PE is the Standby Upstream PE.

   If C-S is reachable via both the Primary and the Standby Upstream PE,
   then in addition to sending the C-multicast route with an RT that
   identifies the Primary Upstream PE, the PE also originates and sends
   a C-multicast route with an RT that identifies the Standby Upstream
   PE.  This route, which has the semantics of being a 'standby'
   C-multicast route, is further called a "Standby BGP C-multicast
   route", and is constructed as follows:

   o  the NLRI is constructed as the original C-multicast route, except
      that the RD is the same as if the C-multicast route was built
      using the standby PE as the UMH (it will carry the RD associated
      to the unicast VPN route advertised by the standby PE for S)

   o  SHOULD carry the "Standby PE" BGP Community (this is a new BGP
      Community, see Section 7)

   The normal and the standby C-multicast routes must have their Local
   Preference attribute adjusted so that, if two C-multicast routes with
   the same NLRI are received by a BGP peer, one carrying the "Standby
   PE" community and the other one *not* carrying the "Standby PE"
   community, then preference is given to the one *not* carrying the
   "Standby PE" community.  Such a situation can happen when, for
   instance, due to transient unicast routing inconsistencies, two
   different downstream PEs consider different upstream PEs to be the
   primary one; in that case, without any precaution taken, both
   upstream PEs would process a standby C-multicast route and possibly
   stop forwarding at the same time.  For this purpose, routes that
   carry the "Standby PE" BGP Community MUST have the LOCAL_PREF
   attribute set to zero.

   Note that, when a PE advertises such a Standby C-multicast join for
   an (S, G), it must join the corresponding P-tunnel.
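
   The construction rules above can be illustrated with a small, non-
   normative sketch.  The route is modeled as a plain record rather
   than as encoded BGP; STANDBY_PE is a placeholder string because the
   community value is not yet allocated (see Section 7); the default
   LOCAL_PREF of 100 for the normal route is an assumption of this
   sketch; and all names (UpstreamPE, CMulticastRoute, build_routes,
   promote, best) are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Iterable, Tuple

STANDBY_PE = "STANDBY-PE"  # placeholder: the real community value is not allocated here


@dataclass(frozen=True)
class UpstreamPE:
    rd: str            # RD of the unicast VPN route this PE advertises for S
    route_import: str  # value used as the RT of the C-multicast route


@dataclass(frozen=True)
class CMulticastRoute:
    rd: str
    c_s: str
    c_g: str
    rt: str
    communities: FrozenSet[str] = frozenset()
    local_pref: int = 100  # assumed default for the normal route


def build_routes(c_s: str, c_g: str, primary: UpstreamPE,
                 standby: UpstreamPE) -> Tuple[CMulticastRoute, CMulticastRoute]:
    """Return (normal, standby) C-multicast routes for (C-S, C-G)."""
    normal = CMulticastRoute(primary.rd, c_s, c_g, primary.route_import)
    backup = CMulticastRoute(standby.rd, c_s, c_g, standby.route_import,
                             communities=frozenset({STANDBY_PE}),
                             local_pref=0)  # MUST be zero for Standby routes
    return normal, backup


def promote(route: CMulticastRoute) -> CMulticastRoute:
    """Re-originate a Standby route as a normal one (community removed)."""
    return replace(route, communities=route.communities - {STANDBY_PE},
                   local_pref=100)


def best(routes: Iterable[CMulticastRoute]) -> CMulticastRoute:
    """Among routes with the same NLRI, prefer the higher LOCAL_PREF."""
    return max(routes, key=lambda r: r.local_pref)
```

   With LOCAL_PREF set to zero on the Standby route, a receiver that
   holds both variants of the same NLRI prefers the one without the
   community, which is the outcome the text requires.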

   If at some later point the local PE determines that C-S is no longer
   reachable through the Primary Upstream PE, the Standby Upstream PE
   becomes the Upstream PE, and the local PE re-sends the C-multicast
   route with an RT that identifies the Standby Upstream PE, except that
   now the route does not carry the Standby PE BGP Community (which
   results in replacing the old route with a new route, with the only
   difference between these routes being the presence/absence of the
   Standby PE BGP Community).

4.2.  Upstream PE behavior

   When a PE receives a C-multicast route for a particular (C-S, C-G),
   and the RT carried in the route results in importing the route into a
   particular VRF on the PE, if the route carries the Standby PE BGP
   Community, then the PE performs as follows:

      when the PE determines that C-S is not reachable through some
      other PE, the PE SHOULD install VRF PIM state corresponding to
      this Standby BGP C-multicast route (the result will be that a PIM
      Join message will be sent to the CE towards C-S, and that the PE
      will receive (C-S,C-G) traffic), and the PE SHOULD forward (C-S,
      C-G) traffic received by the PE to other PEs through a P-tunnel
      rooted at the PE.

   Furthermore, irrespective of whether C-S carried in that route is
   reachable through some other PE:

   a) based on local policy, as soon as the PE receives this Standby BGP
      C-multicast route, the PE MAY install VRF PIM state corresponding
      to this BGP Source Tree Join route (the result will be that Join
      messages will be sent to the CE toward C-S, and that the PE will
      receive (C-S,C-G) traffic)

   b) based on local policy, as soon as the PE receives this Standby BGP
      C-multicast route, the PE MAY forward (C-S, C-G) traffic to other
      PEs through a P-tunnel independently of the reachability of C-S
      through some other PE.
      [note that this implies also doing (a)]

   Doing neither (a) nor (b) for a given (C-S,C-G) is called "cold root
   standby".

   Doing (a) but not (b) for a given (C-S,C-G) is called "warm root
   standby".

   Doing (b) (which implies also doing (a)) for a given (C-S,C-G) is
   called "hot root standby".

   Note that, if an upstream PE uses an S-PMSI only policy, it shall
   advertise an S-PMSI for an (S, G) as soon as it receives a
   C-multicast route for (S, G), normal or Standby; i.e., it shall not
   wait for receiving a non-Standby C-multicast route before advertising
   the corresponding S-PMSI.

   Section 9.3.2 of [RFC6514] describes the procedures of sending a
   Source-Active A-D route as a result of receiving the C-multicast
   route.  These procedures should be followed for both the normal and
   Standby C-multicast routes.

4.3.  Reachability determination

   The standby PE can use the following information to determine that
   C-S can or cannot be reached through the primary PE:

   o  presence/absence of a unicast VPN route toward C-S

   o  supposing that the standby PE is an egress of the tunnel rooted at
      the Primary PE, the standby PE can determine the reachability of
      C-S through the Primary PE based on the status of this tunnel,
      determined using the same criteria as the ones described in
      Section 3.1 (without using the UMH selection procedures of
      Section 3)

   o  other mechanisms MAY be used

4.4.  Inter-AS

   If the non-segmented inter-AS approach is used, the procedures in
   section 4 can be applied.

   When multicast VPNs are used in an inter-AS context with the
   segmented inter-AS approach described in section 8.2 of [RFC6514],
   the procedures in this section can be applied.

   A prerequisite for the procedures described below to be applied for
   a source of a given MVPN is:

   o  that any PE of this MVPN receives two (or more) Inter-AS I-PMSI
      auto-discovery routes advertised by the AS of the source

   o  that these Inter-AS I-PMSI auto-discovery routes have distinct
      Route Distinguishers (as described in item "(2)" of section 9.2 of
      [RFC6514]).

   As an example, these conditions will be satisfied when the source is
   dual-homed to an AS that connects to the receiver AS through two
   ASBRs using auto-configured RDs.

4.4.1.  Inter-AS procedures for downstream PEs, ASBR fast failover

   The following procedure is applied by downstream PEs of an AS, for a
   source S in a remote AS.

   In addition to choosing an Inter-AS I-PMSI auto-discovery route
   advertised from the AS of the source to construct a C-multicast
   route, as described in section 11.1.3 of [RFC6514], a downstream PE
   will choose a second Inter-AS I-PMSI auto-discovery route advertised
   from the AS of the source and use this route to construct and
   advertise a Standby C-multicast route (C-multicast route carrying the
   Standby extended community) as described in Section 4.1.

4.4.2.  Inter-AS procedures for ASBRs

   When an upstream ASBR receives a C-multicast route, and at least one
   of the RTs of the route matches one of the ASBR Import RTs, the ASBR
   locates an Inter-AS I-PMSI A-D route whose RD and Source AS match
   the RD and Source AS carried in the C-multicast route.
   If the match is found, and the C-multicast route carries the Standby
   PE BGP Community, then the ASBR performs as follows:

   o  if the route was received over iBGP, the route is expected to have
      a LOCAL_PREF attribute set to zero and it should be re-advertised
      in eBGP with a MED attribute (MULTI_EXIT_DISC) set to the highest
      possible value (0xffff)

   o  if the route was received over eBGP, the route is expected to have
      a MED attribute set to 0xffff and should be re-advertised in iBGP
      with a LOCAL_PREF attribute set to zero

   Other ASBR procedures are applied without modification.

5.  Hot leaf standby

   The mechanisms defined in Section 4 and Section 3 can be used
   together as follows.

   The principle is that, for a given VRF (or possibly only for a given
   C-S,C-G):

   o  downstream PEs advertise a Standby BGP C-multicast route (based on
      Section 4)

   o  upstream PEs use the "hot standby" optional behavior and thus will
      forward traffic for a given multicast state as soon as they have
      either a (primary) BGP C-multicast route or a Standby BGP
      C-multicast route for that state (or both)

   o  downstream PEs accept traffic from the primary or standby tunnel,
      based on the status of the tunnel (based on Section 3)

   Other combinations of the mechanisms proposed in Section 4 and
   Section 3 are for further study.

   Note that the same level of protection would be achievable with a
   simple C-multicast Source Tree Join route advertised to both the
   primary and secondary upstream PEs (carrying as Route Target extended
   communities, the values of the VRF Route Import attribute of each VPN
   route from each upstream PEs).
   The advantage of using the Standby semantic is that, provided
   downstream PEs always advertise a Standby C-multicast route to the
   secondary upstream PE, the protection level can be chosen through a
   configuration change on the secondary upstream PE alone, without
   reconfiguring any of the downstream PEs.

6.  Duplicate packets

   The multicast VPN specification [RFC6513] requires that a PE forward
   to CEs only the packets coming from the expected upstream PE
   (Section 9.1 of [RFC6513]).

   We draw the reader's attention to the fact that compliance with this
   part of the multicast VPN specification is especially important when
   two distinct upstream PEs are liable to forward the same traffic on
   P-tunnels at the same time in the steady state.  This will be the
   case when "hot root standby" mode is used (Section 4), and can also
   be the case if the procedures of Section 3 are used and (a) the
   rules determining the status of a tree are not the same on two
   distinct downstream PEs or (b) the rule determining the status of a
   tree depends on conditions local to a PE (e.g., the PE-P upstream
   link being up).

7.  IANA Considerations

   Allocation is expected from IANA for the BGP "Standby PE" community.
   (TBC)

   [Note to RFC Editor: this section may be removed on publication as
   an RFC.]

8.  Security Considerations

9.  Acknowledgments

   The authors want to thank Greg Reaume, Eric Rosen, and Jeffrey Zhang
   for their review and useful feedback.

10.  Contributor Addresses

   Below is a list of other contributing authors in alphabetical order:

   Rahul Aggarwal
   Arktan

   Email: raggarwa_1@yahoo.com

   Nehal Bhau
   Alcatel-Lucent, Inc.
   701 E Middlefield Rd
   Mountain View, CA 94043
   USA

   Email: Nehal.Bhau@alcatel-lucent.com

   Clayton Hassen
   Bell Canada
   2955 Virtual Way
   Vancouver
   CANADA

   Email: Clayton.Hassen@bell.ca

   Wim Henderickx
   Alcatel-Lucent
   Copernicuslaan 50
   Antwerp 2018
   Belgium

   Email: wim.henderickx@alcatel-lucent.com

   Pradeep Jain
   Alcatel-Lucent, Inc.
   701 E Middlefield Rd
   Mountain View, CA 94043
   USA

   Email: pradeep.jain@alcatel-lucent.com

   Jayant Kotalwar
   Alcatel-Lucent, Inc.
   701 E Middlefield Rd
   Mountain View, CA 94043
   USA

   Email: Jayant.Kotalwar@alcatel-lucent.com

   Praveen Muley
   Alcatel-Lucent
   701 East Middlefield Rd
   Mountain View, CA 94043
   U.S.A.

   Email: praveen.muley@alcatel-lucent.com

   Ray (Lei) Qiu
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   U.S.A.

   Email: rqiu@juniper.net

   Yakov Rekhter
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   U.S.A.

   Email: yakov@juniper.net

   Kanwar Singh
   Alcatel-Lucent, Inc.
   701 E Middlefield Rd
   Mountain View, CA 94043
   USA

   Email: kanwar.singh@alcatel-lucent.com

11.  References

11.1.  Normative References

   [I-D.ietf-bfd-multipoint]
              Katz, D., Ward, D., Networks, J., and G. Mirsky, "BFD for
              Multipoint Networks", draft-ietf-bfd-multipoint-16 (work
              in progress), April 2018.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4875]  Aggarwal, R., Ed., Papadimitriou, D., Ed., and S.
              Yasukawa, Ed., "Extensions to Resource Reservation
              Protocol - Traffic Engineering (RSVP-TE) for Point-to-
              Multipoint TE Label Switched Paths (LSPs)", RFC 4875,
              DOI 10.17487/RFC4875, May 2007.

   [RFC6513]  Rosen, E., Ed. and R.
              Aggarwal, Ed., "Multicast in MPLS/BGP IP VPNs", RFC 6513,
              DOI 10.17487/RFC6513, February 2012.

   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

11.2.  Informative References

   [RFC4090]  Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast
              Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
              DOI 10.17487/RFC4090, May 2005.

   [RFC7431]  Karan, A., Filsfils, C., Wijnands, IJ., Ed., and B.
              Decraene, "Multicast-Only Fast Reroute", RFC 7431,
              DOI 10.17487/RFC7431, August 2015.

Authors' Addresses

   Thomas Morin (editor)
   Orange
   2, avenue Pierre Marzin
   Lannion 22307
   France

   Email: thomas.morin@orange-ftgroup.com

   Robert Kebler (editor)
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   U.S.A.

   Email: rkebler@juniper.net

   Greg Mirsky (editor)
   ZTE Corp.

   Email: gregimirsky@gmail.com