Network Working Group                                      T. Morin, Ed.
Internet-Draft                                                    Orange
Intended status: Standards Track                          R. Kebler, Ed.
Expires: August 4, 2020                                 Juniper Networks
                                                          G. Mirsky, Ed.
                                                               ZTE Corp.
                                                        February 1, 2020

                  Multicast VPN fast upstream failover
                 draft-ietf-bess-mvpn-fast-failover-09

Abstract

   This document defines multicast VPN extensions and procedures that
   allow fast failover for upstream failures, by allowing downstream PEs
   to take into account the status of Provider-Tunnels (P-tunnels) when
   selecting the upstream PE for a VPN multicast flow, and extending BGP
   MVPN routing so that a C-multicast route can be advertised toward a
   standby upstream PE.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 4, 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  UMH Selection based on tunnel status
     3.1.  Determining the status of a tunnel
       3.1.1.  mVPN tunnel root tracking
       3.1.2.  PE-P Upstream link status
       3.1.3.  P2MP RSVP-TE tunnels
       3.1.4.  Leaf-initiated P-tunnels
       3.1.5.  (C-S, C-G) counter information
       3.1.6.  BFD Discriminator Attribute
       3.1.7.  Per PE-CE link BFD Discriminator
   4.  Standby C-multicast route
     4.1.  Downstream PE behavior
     4.2.  Upstream PE behavior
     4.3.  Reachability determination
     4.4.  Inter-AS
       4.4.1.  Inter-AS procedures for downstream PEs, ASBR fast
               failover
       4.4.2.  Inter-AS procedures for ASBRs
   5.  Hot Root Standby
   6.  Duplicate packets
   7.  IANA Considerations
     7.1.  BFD Discriminator
     7.2.  BFD Discriminator Extension Type
   8.
   Security Considerations
   9.  Acknowledgments
   10. Contributor Addresses
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   In the context of multicast in BGP/MPLS VPNs, it is desirable to
   provide mechanisms allowing fast recovery of connectivity after
   different types of failures.  This document addresses failures of
   elements in the provider network that are upstream of PEs connected
   to VPN sites with receivers.

   Section 3 describes local procedures allowing an egress PE (a PE
   connected to a receiver site) to take into account the status of
   P-tunnels to determine the Upstream Multicast Hop (UMH) for a given
   (C-S, C-G).  This method does not provide a "fast failover" solution
   when used alone, but can be used with the following sections for a
   "fast failover" solution.

   Section 4 describes protocol extensions that can speed up failover by
   not requiring any multicast VPN routing message exchange at recovery
   time.

   Moreover, Section 5 describes a "hot leaf standby" mechanism that
   uses a combination of these two mechanisms.  This approach has
   similarities with the solution described in [RFC7431] to improve
   failover times when PIM routing is used in a network given some
   topology and metric constraints.

2.  Terminology

   The terminology used in this document is the terminology defined in
   [RFC6513] and [RFC6514].

   x-PMSI:  I-PMSI or S-PMSI

3.
UMH Selection based on tunnel status

   Current multicast VPN specifications [RFC6513], section 5.1, describe
   the procedures used by a multicast VPN downstream PE to determine
   what the upstream multicast hop (UMH) is for a given (C-S, C-G).

   The procedure described here is an OPTIONAL procedure that consists
   of having a downstream PE take into account the status of P-tunnels
   rooted at each possible upstream PE.  Because different PEs could
   arrive at different conclusions regarding the state of the tunnel,
   procedures described in Section 9.1.1 of [RFC6513] MUST be used when
   using inclusive tunnels.

   For a given downstream PE and a given VRF, the P-tunnel corresponding
   to a given upstream PE for a given (C-S, C-G) state is the S-PMSI
   tunnel advertised by that upstream PE for this (C-S, C-G) and
   imported into that VRF, or if there isn't any such S-PMSI, the I-PMSI
   tunnel advertised by that PE and imported into that VRF.

   There are three options specified in Section 5.1 of [RFC6513] for a
   downstream PE to select an Upstream PE.

   o  The first two options select the Upstream PE from a candidate PE
      set either based on IP address or a hashing algorithm.  When used
      together with the optional procedure of considering the P-tunnel
      status as in this document, a candidate upstream PE is included in
      the set if it either:

      A.  advertises a PMSI bound to a tunnel, where the specified
          tunnel is not known to be down, or

      B.
          does not advertise any x-PMSI applicable to the given
          (C-S, C-G) but has associated a VRF Route Import BGP attribute
          to the unicast VPN route for S (this is necessary to avoid
          incorrectly invalidating a UMH PE that would use a policy
          where no I-PMSI is advertised for a given VRF and where only
          S-PMSIs are used, the S-PMSI advertisement being possibly done
          only after the upstream PE receives a C-multicast route for
          (C-S, C-G)/(C-*, C-G) to be carried over the advertised
          S-PMSI).

      If the resulting candidate set is empty, then the procedure is
      repeated without considering the P-tunnel status.

   o  The third option uses the installed UMH Route (i.e., the "best"
      route towards the C-root) as the Selected UMH Route, and its
      originating PE is the selected Upstream PE.  With the optional
      procedure of considering P-tunnel status as in this document, the
      Selected UMH Route is the best one among those whose originating
      PE's P-tunnel is not "down".  If that does not exist, the
      installed UMH Route is selected regardless of the P-tunnel status.

3.1.  Determining the status of a tunnel

   Different factors can be considered to determine the "status" of a
   P-tunnel and are described in the following sub-sections.  The
   optional procedures proposed in this section do not require all
   downstream PEs to apply the same rules to define what the status of
   a P-tunnel is (please see Section 6), and some of these rules will
   produce a result that may be different for different downstream PEs.
   Thus, what is called the "status" of a P-tunnel in this section is
   not a characteristic of the tunnel in itself, but is the status of
   the tunnel *as seen from a particular downstream PE*.  Additionally,
   some of the following methods determine the ability of a downstream
   PE to receive traffic on the P-tunnel rather than the status of the
   P-tunnel itself.
   That could be referred to as "P-tunnel
   reception status", but for simplicity, we will use the terminology of
   P-tunnel "status" for all of these methods.

   Depending on the criteria used to determine the status of a P-tunnel,
   there may be an interaction with another resiliency mechanism used
   for the P-tunnel itself, and the UMH update may happen immediately or
   may need to be delayed.  Each particular case is covered in each
   separate sub-section below.

3.1.1.  mVPN tunnel root tracking

   A condition to consider that the status of a P-tunnel is up is that
   the root of the tunnel, as determined in the PMSI tunnel attribute,
   is reachable through unicast routing tables.  In this case, the
   downstream PE can immediately update its UMH when the reachability
   condition changes.

   That is similar to BGP next-hop tracking for VPN routes, except that
   the address considered is not the BGP next-hop address, but the root
   address in the PMSI tunnel attribute.

   If BGP next-hop tracking is done for VPN routes and the root address
   of a given tunnel happens to be the same as the next-hop address in
   the BGP auto-discovery route advertising the tunnel, then using this
   mechanism for the tunnel will not bring any specific benefit.

3.1.2.  PE-P Upstream link status

   A condition to consider a tunnel status as Up can be that the last-
   hop link of the P-tunnel is up.

   Using this method when a fast restoration mechanism (such as MPLS FRR
   [RFC4090]) is in place for the link requires careful consideration
   and coordination of defect detection intervals for the link and the
   tunnel.  In many cases, it is not practical to use both methods at
   the same time.

3.1.3.  P2MP RSVP-TE tunnels

   For P-tunnels of type P2MP MPLS-TE, the status of the P-tunnel is
   considered up if the sub-LSP to this downstream PE is in Up state.
   The determination of whether a P2MP RSVP-TE LSP is in Up state
   requires Path and Resv state for the LSP and is based on procedures
   specified in [RFC4875].  As a result, the downstream PE can
   immediately update its UMH when the reachability condition changes.

   When signaling state for a P2MP TE LSP is removed (e.g., if the
   ingress of the P2MP TE LSP sends a PathTear message) or the P2MP TE
   LSP changes state from Up to Down as determined by procedures in
   [RFC4875], the status of the corresponding P-tunnel SHOULD be re-
   evaluated.  If the P-tunnel transitions from Up to Down state, the
   upstream PE that is the ingress of the P-tunnel SHOULD NOT be
   considered a valid UMH.

3.1.4.  Leaf-initiated P-tunnels

   An upstream PE SHOULD be removed from the UMH candidate list for a
   given (C-S, C-G) if the P-tunnel (I-PMSI or S-PMSI) for this
   (C-S, C-G) is leaf-triggered (PIM, mLDP) but, for some reason
   internal to the protocol, the upstream one-hop branch of the tunnel
   from P to PE cannot be built.  As a result, the downstream PE can
   immediately update its UMH when the reachability condition changes.

3.1.5.  (C-S, C-G) counter information

   In cases where the downstream node can be configured so that the
   maximum inter-packet time is known for all the multicast flows mapped
   on a P-tunnel, the local per-(C-S, C-G) traffic counter information
   for traffic received on this P-tunnel can be used to determine the
   status of the P-tunnel.

   When such a procedure is used in a context where fast restoration
   mechanisms are used for the P-tunnels, a timer MUST be configured on
   the downstream PE to wait before updating the UMH, to let the
   P-tunnel restoration mechanism take effect.  It is RECOMMENDED to
   provide a reasonable default value for this timer.  An implementation
   SHOULD use three seconds as the default value for this timer.
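
   As an illustration only, the counter-based check and its hold timer
   might be sketched as follows.  The class and method names and the
   use of caller-supplied timestamps are assumptions made for this
   sketch and are not defined by this document; only the three-second
   default comes from the text above.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_HOLD_SECONDS = 3.0  # default value suggested in the text above


@dataclass
class CounterTunnelMonitor:
    """Hypothetical per-(C-S, C-G) monitor; not an API from this document."""

    max_inter_packet: float              # configured maximum inter-packet time
    hold: float = DEFAULT_HOLD_SECONDS   # wait before updating the UMH
    last_packet_at: Optional[float] = None

    def packet_received(self, now: float) -> None:
        """Record that the per-flow counter advanced at time `now`."""
        self.last_packet_at = now

    def traffic_missing(self, now: float) -> bool:
        """True once silence exceeds the configured inter-packet bound."""
        if self.last_packet_at is None:
            return False  # nothing received yet, so nothing to compare
        return now - self.last_packet_at > self.max_inter_packet

    def should_update_umh(self, now: float) -> bool:
        """True only after the hold timer has also run out, leaving time
        for any P-tunnel restoration mechanism to repair the tunnel."""
        if self.last_packet_at is None:
            return False
        silence = now - self.last_packet_at
        return silence > self.max_inter_packet + self.hold
```

   In this sketch, a flow with a 0.5 second inter-packet bound is
   flagged as missing shortly after the bound passes, but the UMH is
   only updated once the silence has lasted more than 3.5 seconds.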
   This method can be applicable, for instance, when a (C-S, C-G) flow
   is mapped on an S-PMSI.

   In cases where this mechanism is used in conjunction with the method
   described in Section 5, no prior knowledge of the rate of the
   multicast streams is required; downstream PEs can compare reception
   on the two P-tunnels to determine when one of them is down.

3.1.6.  BFD Discriminator Attribute

   P-tunnel status MAY be derived from the status of a multipoint BFD
   session [RFC8562] whose discriminator is advertised along with an
   x-PMSI A-D route.

   This document defines the format and ways of using a new BGP
   attribute called the "BFD Discriminator".  It is an optional
   transitive BGP attribute.  The format of this attribute is defined as
   follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    BFD Mode   |                   Reserved                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       BFD Discriminator                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                         Optional TLVs                         ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Format of the BFD Discriminator Attribute

   Where:

      BFD Mode is a one-octet field.  This specification defines the
      P2MP value (TBA3); see Section 7.1.

      Reserved field is three octets long, and the value MUST be zeroed
      on transmission and ignored on receipt.

      BFD Discriminator is a four-octet field.

      Optional TLVs is the optional variable-length field that MAY be
      used in the BFD Discriminator attribute for future extensions.
      TLVs MAY be included in a sequential or nested manner.  Each TLV
      consists of:

      *  one octet-long field of the TLV's Type value (Section 7.2)

      *  one octet-long field of the length of the Value field in octets

      *  variable length Value field.
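
   The layout above can be decoded with a short routine such as the
   following sketch.  The function and type names are hypothetical, the
   TLVs are walked sequentially only (nested TLVs are left as opaque
   values), and TLV alignment is not enforced; the overall length check
   reflects the malformed-attribute rule of this section.  Callers
   would be expected to map the raised error to the "treat-as-withdraw"
   handling.

```python
import struct
from typing import List, NamedTuple, Tuple


class BfdDiscriminatorAttr(NamedTuple):
    """Decoded attribute value; a hypothetical container for this sketch."""

    mode: int                       # BFD Mode, one octet
    discriminator: int              # BFD Discriminator, four octets
    tlvs: List[Tuple[int, bytes]]   # (Type, Value) pairs, in order


def parse_bfd_discriminator(value: bytes) -> BfdDiscriminatorAttr:
    """Parse the attribute value (the octets after the BGP attribute header)."""
    # The fixed part is 8 octets; the total must be a non-zero multiple of 4.
    if len(value) < 8 or len(value) % 4 != 0:
        raise ValueError("malformed BFD Discriminator attribute")

    # "!B3xI": one octet of mode, three reserved octets that are skipped
    # (ignored on receipt), then a 32-bit discriminator in network order.
    mode, discriminator = struct.unpack("!B3xI", value[:8])

    tlvs: List[Tuple[int, bytes]] = []
    offset = 8
    while offset < len(value):
        if offset + 2 > len(value):
            raise ValueError("truncated TLV header")
        tlv_type = value[offset]
        tlv_len = value[offset + 1]   # length of the Value field only
        end = offset + 2 + tlv_len
        if end > len(value):
            raise ValueError("TLV value runs past end of attribute")
        tlvs.append((tlv_type, value[offset + 2:end]))
        offset = end

    return BfdDiscriminatorAttr(mode, discriminator, tlvs)
```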
   The length of a TLV MUST be aligned on a four-octet boundary.

   The BFD Discriminator attribute SHALL be considered malformed if its
   length is not a non-zero multiple of four.  If malformed, the UPDATE
   message SHALL be handled using the approach of "treat-as-withdraw"
   per [RFC7606].

3.1.6.1.  Upstream PE Procedures

   When it is desired to track the P-tunnel status using a p2mp BFD
   session, the Upstream PE:

   o  MUST initiate a BFD session and set bfd.SessionType =
      MultipointHead as described in [RFC8562];

   o  when transmitting BFD Control packets, MUST use as the destination
      IP address one of the internal loopback addresses from the 127/8
      range for IPv4 or one of the IPv4-mapped IPv6 loopback addresses
      from the ::ffff:127.0.0.0/104 range for IPv6;

   o  MUST use the IP address of the Upstream PE as source IP address
      when transmitting BFD control packets;

   o  MUST include the BFD Discriminator attribute in the x-PMSI A-D
      Route with the value set to the My Discriminator value;

   o  MUST periodically transmit BFD control packets over the x-PMSI
      tunnel.

   If the tracking of the P-tunnel by using a p2mp BFD session is
   enabled after the x-PMSI A-D route has already been advertised, the
   x-PMSI A-D Route MUST be re-sent with precisely the same attributes
   as before and the BFD Discriminator attribute included.

   If the x-PMSI A-D route is advertised with P-tunnel status tracked
   using the p2mp BFD session and it is desired to stop tracking
   P-tunnel status using BFD, then:

   o  the x-PMSI A-D Route MUST be re-sent with precisely the same
      attributes as before, but the BFD Discriminator attribute MUST be
      excluded;

   o  the p2mp BFD session SHOULD be deleted.

3.1.6.2.
Downstream PE Procedures

   Upon receiving the BFD Discriminator attribute in the x-PMSI A-D
   Route, the Downstream PE:

   o  MUST associate the received BFD discriminator value with the
      P-tunnel originating from the Root PE and the IP address of the
      Upstream PE;

   o  MUST create a p2mp BFD session and set bfd.SessionType =
      MultipointTail as described in [RFC8562];

   o  MUST use the source IP address of the BFD control packet, the
      value of the BFD Discriminator field, and the identifier of the
      x-PMSI tunnel on which the BFD control packet was received to
      properly demultiplex BFD sessions.

   After the state of the p2mp BFD session is up, i.e., bfd.SessionState
   == Up, the session state will then be used to track the health of the
   P-tunnel.

   According to [RFC8562], if the Downstream PE receives Down or
   AdminDown in the State field of the BFD control packet, or the
   Detection Timer associated with the BFD session expires, the BFD
   session is down, i.e., bfd.SessionState == Down.  When the BFD
   session state is Down, then the P-tunnel associated with the BFD
   session MUST be declared down.  As a result, the Downstream PE MAY
   initiate a switchover of the traffic from the Primary Upstream PE to
   the Standby Upstream PE only if the Standby Upstream PE is deemed
   available.  A different p2mp BFD session MAY be used to monitor the
   state of the P-tunnel from the Standby Upstream PE.

   If the Downstream PE's P-tunnel is already up when the Downstream PE
   receives the new x-PMSI A-D Route with the BFD Discriminator
   attribute, the Downstream PE MUST accept the x-PMSI A-D Route and
   associate the value of the BFD Discriminator field with the P-tunnel.
   The Downstream PE MUST follow procedures listed above in this section
   to bring the p2mp BFD session up and use it to monitor the state of
   the associated P-tunnel.
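
   The demultiplexing and state-tracking rules above can be pictured
   with the following simplified sketch.  It is not a BFD
   implementation: the class names, the string-valued states, the
   string tunnel identifier, and the caller-supplied timestamps are all
   assumptions made for illustration, and the `ever_up` flag is one
   possible reading of the rule that the session is used for tracking
   only after it has reached the Up state.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (source IP address, BFD Discriminator value, x-PMSI tunnel identifier)
SessionKey = Tuple[str, int, str]


@dataclass
class TailSession:
    """Hypothetical MultipointTail session record."""

    detection_time: float            # seconds before the Detection Timer expires
    state: str = "Down"
    last_rx: Optional[float] = None
    ever_up: bool = False            # tracking starts once the session was Up


class TailSessionTable:
    def __init__(self) -> None:
        self._sessions: Dict[SessionKey, TailSession] = {}

    def add(self, key: SessionKey, detection_time: float) -> None:
        """Create a session when the attribute is received in an A-D route."""
        self._sessions[key] = TailSession(detection_time)

    def on_control_packet(self, key: SessionKey, remote_state: str,
                          now: float) -> bool:
        """Demultiplex a received packet; False means no session matched."""
        session = self._sessions.get(key)
        if session is None:
            return False
        session.last_rx = now
        if remote_state in ("Down", "AdminDown"):
            session.state = "Down"
        else:
            session.state = "Up"
            session.ever_up = True
        return True

    def tunnel_declared_down(self, key: SessionKey, now: float) -> bool:
        """True when the P-tunnel tied to this session must be declared down."""
        session = self._sessions[key]
        if not session.ever_up:
            return False  # session not yet used to track the P-tunnel
        expired = (session.last_rx is not None
                   and now - session.last_rx > session.detection_time)
        if expired:
            session.state = "Down"
        return session.state == "Down"
```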
   If the Downstream PE's P-tunnel is already up, its state being
   monitored by the p2mp BFD session, and the Downstream PE receives the
   new x-PMSI A-D Route without the BFD Discriminator attribute, the
   Downstream PE:

   o  MUST accept the x-PMSI A-D Route;

   o  MUST stop processing BFD control packets for this p2mp BFD
      session;

   o  SHOULD delete the p2mp BFD session associated with the P-tunnel;

   o  SHOULD NOT switch the traffic to the Standby Upstream PE.

3.1.7.  Per PE-CE link BFD Discriminator

   The following approach is defined in response to the detection by the
   upstream PE of a PE-CE link failure.  Even though the provider tunnel
   is still up, it is desired for the downstream PEs to switch to a
   backup upstream PE.  To achieve that, if the upstream PE detects that
   its PE-CE link fails, it SHOULD set the bfd.LocalDiag of the p2mp BFD
   session to Concatenated Path Down and/or Reverse Concatenated Path
   Down (per section 6.8.17 of [RFC5880]), unless it switches to a new
   PE-CE link within the time of bfd.DesiredMinTxInterval for the p2mp
   BFD session (in that case the upstream PE will start tracking the
   status of the new PE-CE link).  When a downstream PE receives that
   bfd.LocalDiag code, it treats the situation as if the tunnel itself
   had failed and tries to switch to a backup PE.

4.  Standby C-multicast route

   The procedures described below are limited to the case where the site
   that contains C-S is connected to two or more PEs, though, to
   simplify the description, the case of dual-homing is described.  The
   procedures require all the PEs of that MVPN to follow the UMH
   selection, as specified in [RFC6513], whether the PE is selected
   based on its IP address, the hashing algorithm described in section
   5.1.3 of [RFC6513], or the Installed UMH Route.
   The procedures assume that if a
   site of a given MVPN that contains C-S is dual-homed to two PEs, then
   all the other sites of that MVPN would have two unicast VPN routes
   (VPN-IPv4 or VPN-IPv6) to C-S, each with its own RD.

   As long as C-S is reachable via both PEs, a given downstream PE will
   select one of the PEs connected to C-S as its Upstream PE for C-S;
   this PE is referred to as the "Primary Upstream PE".  We will refer
   to the other PE connected to C-S as the "Standby Upstream PE".  Note
   that if the connectivity to C-S through the Primary Upstream PE
   becomes unavailable, then the PE will select the Standby Upstream PE
   as its Upstream PE for C-S.  When the Primary PE later becomes
   available, then the PE will select the Primary Upstream PE again as
   its Upstream PE.  Such behavior is referred to as "revertive"
   behavior and MUST be supported.  Non-revertive behavior would refer
   to the behavior of continuing to select the backup PE as the UMH even
   after the Primary has come up.  This non-revertive behavior MAY also
   be supported by an implementation and would be enabled through some
   configuration.

   For readability, in the following sub-sections, the procedures are
   described for BGP C-multicast Source Tree Join routes, but they apply
   equally to BGP C-multicast Shared Tree Join routes failover for the
   case where the customer RP is dual-homed (substitute "C-RP" for
   "C-S").

4.1.  Downstream PE behavior

   When a (downstream) PE connected to some site of an MVPN needs to
   send a C-multicast route (C-S, C-G), then following the procedures
   specified in Section "Originating C-multicast routes by a PE" of
   [RFC6514] the PE sends the C-multicast route with an RT that
   identifies the Upstream PE selected by the PE originating the route.
   As long as C-S is reachable via the Primary Upstream PE, the Upstream
   PE is the Primary Upstream PE.
   If C-S is reachable only via the Standby
   Upstream PE, then the Upstream PE is the Standby Upstream PE.

   If C-S is reachable via both the Primary and the Standby Upstream PE,
   then in addition to sending the C-multicast route with an RT that
   identifies the Primary Upstream PE, the PE also originates and sends
   a C-multicast route with an RT that identifies the Standby Upstream
   PE.  This route, which has the semantics of being a 'standby'
   C-multicast route, is further called a "Standby BGP C-multicast
   route", and is constructed as follows:

   o  the NLRI is constructed as the original C-multicast route, except
      that the RD is the same as if the C-multicast route was built
      using the standby PE as the UMH (it will carry the RD associated
      with the unicast VPN route advertised by the standby PE for S and
      a Route Target derived from the standby PE's UMH route's VRF RT
      Import EC);

   o  SHOULD carry the "Standby PE" BGP Community (this is a new BGP
      Community, see Section 7).

   The normal and the standby C-multicast routes MUST have their Local
   Preference attribute adjusted so that, if two C-multicast routes with
   the same NLRI are received by a BGP peer, one carrying the "Standby
   PE" community and the other one *not* carrying the "Standby PE"
   community, then preference is given to the one *not* carrying the
   "Standby PE" community.  Such a situation can happen when, for
   instance, due to transient unicast routing inconsistencies or lack of
   support of the Standby PE community, two different downstream PEs
   consider different upstream PEs to be the primary one; in that case,
   without any precaution taken, both upstream PEs would process a
   standby C-multicast route and possibly stop forwarding at the same
   time.  For this purpose, routes that carry the "Standby PE" BGP
   Community MUST have the LOCAL_PREF attribute set to zero.
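
   The effect of the LOCAL_PREF rule can be seen in the following
   sketch.  The route model is heavily simplified and every name in it
   is hypothetical; in particular, the community is represented by a
   placeholder string because its value (TBA1) is not yet assigned, and
   the LOCAL_PREF of 100 used for the normal route is an assumption
   (a commonly used default), not a value set by this document.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

STANDBY_PE = "standby-pe"          # placeholder; the real value is TBA1
ASSUMED_NORMAL_LOCAL_PREF = 100    # assumption, not specified in this document


@dataclass(frozen=True)
class CMulticastRoute:
    """Simplified C-multicast route: NLRI reduced to (RD, C-S, C-G)."""

    nlri: Tuple[str, str, str]
    route_target: str
    communities: FrozenSet[str] = frozenset()
    local_pref: int = ASSUMED_NORMAL_LOCAL_PREF


def make_standby_route(standby_rd: str, standby_rt: str,
                       c_s: str, c_g: str) -> CMulticastRoute:
    """Build a Standby BGP C-multicast route toward the standby PE.

    The RD and RT are those that would be used if the standby PE were the
    UMH; the route carries the Standby PE community and LOCAL_PREF zero.
    """
    return CMulticastRoute(
        nlri=(standby_rd, c_s, c_g),
        route_target=standby_rt,
        communities=frozenset({STANDBY_PE}),
        local_pref=0,
    )


def preferred(routes: Iterable[CMulticastRoute]) -> CMulticastRoute:
    """Pick among routes with the same NLRI by highest LOCAL_PREF only."""
    return max(routes, key=lambda route: route.local_pref)
```

   With these definitions, a normal route and a standby route that
   share the same NLRI resolve in favor of the normal route, which is
   the outcome the paragraph above requires.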
   Note that, when a PE advertises such a Standby C-multicast join for a
   (C-S, C-G), it MUST join the corresponding P-tunnel.

   If at some later point the local PE determines that C-S is no longer
   reachable through the Primary Upstream PE, the Standby Upstream PE
   becomes the Upstream PE, and the local PE re-sends the C-multicast
   route with an RT that identifies the Standby Upstream PE, except that
   now the route does not carry the Standby PE BGP Community (which
   results in replacing the old route with a new route, with the only
   difference between these routes being the presence/absence of the
   Standby PE BGP Community).  Also, a LOCAL_PREF attribute MUST be set
   to zero.

4.2.  Upstream PE behavior

   When a PE receives a C-multicast route for a particular (C-S, C-G),
   and the RT carried in the route results in importing the route into a
   particular VRF on the PE, if the route carries the Standby PE BGP
   Community, then the PE proceeds as follows:

      when the PE determines (the use of the particular method to detect
      the failure is outside the scope of this document) that C-S is not
      reachable through some other PE, the PE SHOULD install VRF PIM
      state corresponding to this Standby BGP C-multicast route (the
      result will be that a PIM Join message will be sent to the CE
      towards C-S, and that the PE will receive (C-S, C-G) traffic), and
      the PE SHOULD forward (C-S, C-G) traffic received by the PE to
      other PEs through a P-tunnel rooted at the PE.
   Furthermore, irrespective of whether C-S carried in that route is
   reachable through some other PE:

   a)  based on local policy, as soon as the PE receives this Standby
       BGP C-multicast route, the PE MAY install VRF PIM state
       corresponding to this BGP Source Tree Join route (the result will
       be that Join messages will be sent to the CE toward C-S, and that
       the PE will receive (C-S, C-G) traffic)

   b)  based on local policy, as soon as the PE receives this Standby
       BGP C-multicast route, the PE MAY forward (C-S, C-G) traffic to
       other PEs through a P-tunnel independently of the reachability of
       C-S through some other PE.  [note that this implies also doing
       (a)]

   Doing neither (a) nor (b) for a given (C-S, C-G) is called "cold root
   standby".

   Doing (a) but not (b) for a given (C-S, C-G) is called "warm root
   standby".

   Doing (b) (which implies also doing (a)) for a given (C-S, C-G) is
   called "hot root standby".

   Note that, if an upstream PE uses an S-PMSI only policy, it shall
   advertise an S-PMSI for a (C-S, C-G) as soon as it receives a
   C-multicast route for (C-S, C-G), normal or Standby; i.e., it shall
   not wait for receiving a non-Standby C-multicast route before
   advertising the corresponding S-PMSI.

   Section 9.3.2 of [RFC6514] describes the procedures of sending a
   Source-Active A-D route as a result of receiving the C-multicast
   route.  These procedures should be followed for both the normal and
   Standby C-multicast routes.

4.3.
Reachability determination

   The standby PE can use the following information to determine that
   C-S can or cannot be reached through the primary PE:

   o  presence/absence of a unicast VPN route toward C-S

   o  supposing that the standby PE is the egress of the tunnel rooted
      at the Primary PE, the standby PE can determine the reachability
      of C-S through the Primary PE based on the status of this tunnel,
      determined using the same criteria as the ones described in
      Section 3.1 (without using the UMH selection procedures of
      Section 3);

   o  other mechanisms MAY be used.

4.4.  Inter-AS

   If the non-segmented inter-AS approach is used, the procedures in
   section 4 can be applied.

   When multicast VPNs are used in an inter-AS context with the
   segmented inter-AS approach described in section 8.2 of [RFC6514],
   the procedures in this section can be applied.

   A pre-requisite for the procedures described below to be applied for
   a source of a given MVPN is:

   o  that any PE of this MVPN receives two (or more) Inter-AS I-PMSI
      auto-discovery routes advertised by the AS of the source

   o  that these Inter-AS I-PMSI auto-discovery routes have distinct
      Route Distinguishers (as described in item "(2)" of section 9.2 of
      [RFC6514]).

   As an example, these conditions will be satisfied when the source is
   dual-homed to an AS that connects to the receiver AS through two
   ASBRs using auto-configured RDs.

4.4.1.  Inter-AS procedures for downstream PEs, ASBR fast failover

   The following procedure is applied by downstream PEs of an AS, for a
   source S in a remote AS.
   In addition to choosing an Inter-AS I-PMSI auto-discovery route
   advertised from the AS of the source to construct a C-multicast
   route, as described in section 11.1.3 of [RFC6514], a downstream PE
   will choose a second Inter-AS I-PMSI auto-discovery route advertised
   from the AS of the source and use this route to construct and
   advertise a Standby C-multicast route (C-multicast route carrying the
   Standby PE BGP Community) as described in Section 4.1.

4.4.2.  Inter-AS procedures for ASBRs

   When an upstream ASBR receives a C-multicast route, and at least one
   of the RTs of the route matches one of the ASBR's Import RTs, the
   ASBR that supports this specification MUST locate an Inter-AS I-PMSI
   A-D route whose RD and Source AS respectively match the RD and Source
   AS carried in the C-multicast route.  If the match is found, and the
   C-multicast route carries the Standby PE BGP Community, then the ASBR
   MUST proceed as follows:

   o  if the route was received over iBGP and its LOCAL_PREF attribute
      is set to zero, then it MUST be re-advertised in eBGP with a MED
      attribute (MULTI_EXIT_DISC) set to the highest possible value
      (0xffff)

   o  if the route was received over eBGP and its MED attribute is set
      to 0xffff, then it MUST be re-advertised in iBGP with a LOCAL_PREF
      attribute set to zero

   Other ASBR procedures are applied without modification.

5.  Hot Root Standby

   The mechanisms defined in Section 4 and Section 3 can be
   used together as follows.
   The principle is that, for a given VRF (or possibly only for a given
   (C-S, C-G)):

   o  downstream PEs advertise a Standby BGP C-multicast route (based
      on Section 4)

   o  upstream PEs use the "hot standby" optional behavior and thus
      will forward traffic for a given multicast state as soon as they
      have either a (primary) BGP C-multicast route or a Standby BGP
      C-multicast route for that state (or both)

   o  downstream PEs accept traffic from the primary or standby tunnel,
      based on the status of the tunnel (based on Section 3)

   Other combinations of the mechanisms proposed in Section 4 and
   Section 3 are for further study.

   Note that the same level of protection would be achievable with a
   simple C-multicast Source Tree Join route advertised to both the
   primary and secondary upstream PEs (carrying, as Route Target
   extended communities, the values of the VRF Route Import attribute
   of each VPN route from each upstream PE).  The advantage of using
   the Standby semantic is that, supposing that downstream PEs always
   advertise a Standby C-multicast route to the secondary upstream PE,
   the protection level can be chosen through a change of configuration
   on the secondary upstream PE, without requiring any reconfiguration
   of the downstream PEs.

6.  Duplicate packets

   The multicast VPN specification [RFC6513] requires that a PE forward
   to CEs only the packets coming from the expected upstream PE
   (Section 9.1 of [RFC6513]).

   We draw the reader's attention to the fact that compliance with this
   part of the multicast VPN specification is especially important when
   two distinct upstream PEs may forward the same traffic on P-tunnels
   at the same time in the steady state.
   This will be the case when "hot root standby" mode is used
   (Section 4), and can also be the case if the procedures of Section 3
   are used and (a) the rules determining the status of a tree are not
   the same on two distinct downstream PEs or (b) the rule determining
   the status of a tree depends on conditions local to a PE (e.g., the
   PE-P upstream link being up).

7.  IANA Considerations

   IANA is requested to allocate the BGP "Standby PE" community value
   (TBA1) from the Border Gateway Protocol (BGP) Well-known Communities
   registry.

7.1.  BFD Discriminator

   This document defines a new BGP optional transitive attribute,
   called "BFD Discriminator".  IANA is requested to allocate a
   codepoint (TBA2) in the "BGP Path Attributes" registry to the BFD
   Discriminator attribute.

   IANA is requested to create a new BFD Mode sub-registry in the
   Border Gateway Protocol (BGP) Parameters registry as described in
   Table 1.

          +---------+-------------------------+---------------+
          | Range   | Registration Procedures | Note          |
          +---------+-------------------------+---------------+
          | 0-249   | Standards Action        |               |
          | 250-253 | Specification Required  | Experimental  |
          | 254     | Private Use             |               |
          | 255     | Standards Action        |               |
          +---------+-------------------------+---------------+

                     Table 1: BFD Mode Sub-registry

   IANA is requested to allocate the following values from the BFD Mode
   sub-registry as defined in Table 2.

               +-------+------------------+---------------+
               | Value | Description      | Reference     |
               +-------+------------------+---------------+
               | 0     | Reserved         | This document |
               | TBA3  | P2MP BFD Session | This document |
               | 255   | Reserved         | This document |
               +-------+------------------+---------------+

                           Table 2: BFD Mode

7.2.
BFD Discriminator Extension Type

   IANA is requested to create a new BFD Discriminator Extension Type
   sub-registry in the Border Gateway Protocol (BGP) Parameters
   registry as described in Table 3.

           +---------+-------------+-------------------------+
           | Value   | Description | Reference               |
           +---------+-------------+-------------------------+
           | 0       | Reserved    |                         |
           | 1-191   | Unassigned  | IETF Review             |
           | 192-251 | Unassigned  | First Come First Served |
           | 252-254 | Unassigned  | Private Use             |
           | 255     | Reserved    |                         |
           +---------+-------------+-------------------------+

         Table 3: BFD Discriminator Extension Type Sub-registry

8.  Security Considerations

   This document describes procedures based on [RFC6513] and [RFC6514]
   and hence shares the security considerations described in those
   specifications.

   This document makes use of BFD, as defined in [RFC8562], which, in
   turn, is based on [RFC5880].  Security considerations relevant to
   each protocol are discussed in the respective protocol
   specifications.

9.  Acknowledgments

   The authors want to thank Greg Reaume, Eric Rosen, Jeffrey Zhang,
   and Zheng (Sandy) Zhang for their reviews, useful comments, and
   helpful suggestions.

10.
Contributor Addresses

   Below is a list of other contributing authors in alphabetical order:

   Rahul Aggarwal
   Arktan

   Email: raggarwa_1@yahoo.com

   Nehal Bhau
   Cisco

   Email: NBhau@cisco.com

   Clayton Hassen
   Bell Canada
   2955 Virtual Way
   Vancouver
   CANADA

   Email: Clayton.Hassen@bell.ca

   Wim Henderickx
   Nokia
   Copernicuslaan 50
   Antwerp 2018
   Belgium

   Email: wim.henderickx@nokia.com

   Pradeep Jain
   Nokia
   701 E Middlefield Rd
   Mountain View, CA 94043
   USA

   Email: pradeep.jain@nokia.com

   Jayant Kotalwar
   Nokia
   701 E Middlefield Rd
   Mountain View, CA 94043
   USA

   Email: Jayant.Kotalwar@nokia.com

   Praveen Muley
   Nokia
   701 East Middlefield Rd
   Mountain View, CA 94043
   U.S.A.

   Email: praveen.muley@nokia.com

   Ray (Lei) Qiu
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   U.S.A.

   Email: rqiu@juniper.net

   Yakov Rekhter
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   U.S.A.

   Email: yakov@juniper.net

   Kanwar Singh
   Nokia
   701 E Middlefield Rd
   Mountain View, CA 94043
   USA

   Email: kanwar.singh@nokia.com

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4875]  Aggarwal, R., Ed., Papadimitriou, D., Ed., and S.
              Yasukawa, Ed., "Extensions to Resource Reservation
              Protocol - Traffic Engineering (RSVP-TE) for Point-to-
              Multipoint TE Label Switched Paths (LSPs)", RFC 4875,
              DOI 10.17487/RFC4875, May 2007.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010.

   [RFC6513]  Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/
              BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February
              2012.
   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012.

   [RFC7606]  Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
              Patel, "Revised Error Handling for BGP UPDATE Messages",
              RFC 7606, DOI 10.17487/RFC7606, August 2015.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8562]  Katz, D., Ward, D., Pallagatti, S., Ed., and G. Mirsky,
              Ed., "Bidirectional Forwarding Detection (BFD) for
              Multipoint Networks", RFC 8562, DOI 10.17487/RFC8562,
              April 2019.

11.2.  Informative References

   [RFC4090]  Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast
              Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
              DOI 10.17487/RFC4090, May 2005.

   [RFC7431]  Karan, A., Filsfils, C., Wijnands, IJ., Ed., and B.
              Decraene, "Multicast-Only Fast Reroute", RFC 7431,
              DOI 10.17487/RFC7431, August 2015.

Authors' Addresses

   Thomas Morin (editor)
   Orange
   2, avenue Pierre Marzin
   Lannion 22307
   France

   Email: thomas.morin@orange-ftgroup.com

   Robert Kebler (editor)
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   U.S.A.

   Email: rkebler@juniper.net

   Greg Mirsky (editor)
   ZTE Corp.

   Email: gregimirsky@gmail.com