Network Working Group                                      T. Morin, Ed.
Internet-Draft                                                    Orange
Intended status: Standards Track                          R. Kebler, Ed.
Expires: August 18, 2019                                Juniper Networks
                                                          G. Mirsky, Ed.
                                                               ZTE Corp.
8 February 14, 2019 10 Multicast VPN fast upstream failover 11 draft-ietf-bess-mvpn-fast-failover-05 13 Abstract 15 This document defines multicast VPN extensions and procedures that 16 allow fast failover for upstream failures, by allowing downstream PEs 17 to take into account the status of Provider-Tunnels (P-tunnels) when 18 selecting the upstream PE for a VPN multicast flow, and extending BGP 19 MVPN routing so that a C-multicast route can be advertised toward a 20 standby upstream PE. 22 Requirements Language 24 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 25 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 26 "OPTIONAL" in this document are to be interpreted as described in BCP 27 14 [RFC2119] [RFC8174] when, and only when, they appear in all 28 capitals, as shown here. 30 Status of This Memo 32 This Internet-Draft is submitted in full conformance with the 33 provisions of BCP 78 and BCP 79. 35 Internet-Drafts are working documents of the Internet Engineering 36 Task Force (IETF). Note that other groups may also distribute 37 working documents as Internet-Drafts. The list of current Internet- 38 Drafts is at https://datatracker.ietf.org/drafts/current/. 40 Internet-Drafts are draft documents valid for a maximum of six months 41 and may be updated, replaced, or obsoleted by other documents at any 42 time. It is inappropriate to use Internet-Drafts as reference 43 material or to cite them other than as "work in progress." 45 This Internet-Draft will expire on August 18, 2019. 47 Copyright Notice 49 Copyright (c) 2019 IETF Trust and the persons identified as the 50 document authors. All rights reserved. 52 This document is subject to BCP 78 and the IETF Trust's Legal 53 Provisions Relating to IETF Documents 54 (https://trustee.ietf.org/license-info) in effect on the date of 55 publication of this document. 
Please review these documents 56 carefully, as they describe your rights and restrictions with respect 57 to this document. Code Components extracted from this document must 58 include Simplified BSD License text as described in Section 4.e of 59 the Trust Legal Provisions and are provided without warranty as 60 described in the Simplified BSD License. 62 Table of Contents 64 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 65 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 66 3. UMH Selection based on tunnel status . . . . . . . . . . . . 3 67 3.1. Determining the status of a tunnel . . . . . . . . . . . 4 68 3.1.1. mVPN tunnel root tracking . . . . . . . . . . . . . . 5 69 3.1.2. PE-P Upstream link status . . . . . . . . . . . . . . 5 70 3.1.3. P2MP RSVP-TE tunnels . . . . . . . . . . . . . . . . 5 71 3.1.4. Leaf-initiated P-tunnels . . . . . . . . . . . . . . 6 72 3.1.5. (C-S, C-G) counter information . . . . . . . . . . . 6 73 3.1.6. BFD Discriminator . . . . . . . . . . . . . . . . . . 6 74 3.1.7. Per PE-CE link BFD Discriminator . . . . . . . . . . 9 75 4. Standby C-multicast route . . . . . . . . . . . . . . . . . . 10 76 4.1. Downstream PE behavior . . . . . . . . . . . . . . . . . 11 77 4.2. Upstream PE behavior . . . . . . . . . . . . . . . . . . 12 78 4.3. Reachability determination . . . . . . . . . . . . . . . 13 79 4.4. Inter-AS . . . . . . . . . . . . . . . . . . . . . . . . 13 80 4.4.1. Inter-AS procedures for downstream PEs, ASBR fast 81 failover . . . . . . . . . . . . . . . . . . . . . . 14 82 4.4.2. Inter-AS procedures for ASBRs . . . . . . . . . . . . 14 83 5. Hot leaf standby . . . . . . . . . . . . . . . . . . . . . . 15 84 6. Duplicate packets . . . . . . . . . . . . . . . . . . . . . . 15 85 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 86 8. Security Considerations . . . . . . . . . . . . . . . . . . . 16 87 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 16 88 10. 
Contributor Addresses . . . . . . . . . . . . . . . . . . . . 16 89 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 18 90 11.1. Normative References . . . . . . . . . . . . . . . . . . 18 91 11.2. Informative References . . . . . . . . . . . . . . . . . 19 92 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 19 94 1. Introduction 96 In the context of multicast in BGP/MPLS VPNs, it is desirable to 97 provide mechanisms allowing fast recovery of connectivity on 98 different types of failures. This document addresses failures of 99 elements in the provider network that are upstream of PEs connected 100 to VPN sites with receivers. 102 Section 3 describes local procedures allowing an egress PE (a PE 103 connected to a receiver site) to take into account the status of 104 P-tunnels to determine the Upstream Multicast Hop (UMH) for a given 105 (C-S, C-G). This method does not provide a "fast failover" solution 106 when used alone, but can be used with the following sections for a 107 "fast failover" solution. 109 Section 4 describes protocol extensions that can speed up failover by 110 not requiring any multicast VPN routing message exchange at recovery 111 time. 113 Moreover, section 5 describes a "hot leaf standby" mechanism, that 114 uses a combination of these two mechanisms. This approach has 115 similarities with the solution described in [RFC7431] to improve 116 failover times when PIM routing is used in a network given some 117 topology and metric constraints. 119 2. Terminology 121 The terminology used in this document is the terminology defined in 122 [RFC6513] and [RFC6514]. 124 x-PMSI: I-PMSI or S-PMSI 126 3. UMH Selection based on tunnel status 128 Current multicast VPN specifications [RFC6513], section 5.1, describe 129 the procedures used by a multicast VPN downstream PE to determine 130 what the upstream multicast hop (UMH) is for a given (C-S,C-G). 
   The procedure described here is OPTIONAL.  A downstream PE takes
   into account the status of the P-tunnels rooted at each possible
   upstream PE when deciding whether to include that PE in the list of
   candidate UMHs for a given (C-S,C-G) state.  As a result, if a
   P-tunnel is "down" (see Section 3.1), the PE that is the root of the
   P-tunnel is not considered for UMH selection, which causes the
   downstream PE to fail over to the upstream PE that is next in the
   list of candidates.  If the rules used to determine the state of the
   P-tunnel are not consistent across all PEs, then some PEs may arrive
   at a different conclusion regarding the state of the tunnel.  In
   such a scenario, the procedures described in Section 9.1.1 of
   [RFC6513] MUST be used.

   A downstream PE monitors the status of the tunnels of UMHs that are
   ahead of the current one in the list of candidates.  Whenever the
   downstream PE determines that one of these tunnels is no longer
   known to be down, the PE selects the UMH corresponding to that
   tunnel as the new UMH.
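   The exclusion rule can be pictured with a minimal sketch.  It
   assumes that the candidate list is already ordered by the basic
   [RFC6513] UMH preference and that each entry records whether its
   P-tunnel is known to be down.  The names Candidate, tunnel_down and
   select_umh are hypothetical and are not defined by this document.
   When every tunnel is known to be down, the sketch falls back to the
   basic selection so that a false negative does not leave the flow
   without a UMH.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    pe: str            # upstream PE identifier (hypothetical field)
    tunnel_down: bool  # True only when the P-tunnel is known to be down


def select_umh(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Pick the UMH from a list ordered by basic RFC 6513 preference.

    Assumption: the caller has already ordered `candidates`.
    """
    # Prefer the first candidate whose P-tunnel is not known to be down.
    for candidate in candidates:
        if not candidate.tunnel_down:
            return candidate
    # Last resort: every tunnel is reported down, so fall back to the
    # basic selection (the most preferred candidate), if there is one.
    return candidates[0] if candidates else None
```

   Re-running select_umh whenever a tunnel status changes also gives
   the reversion described above: as soon as a more preferred
   candidate is no longer known to be down, it is returned again.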
   More precisely, UMH determination for a given (C-S,C-G) will
   consider the UMH candidates in the following order:

   o  first, the UMH candidates that either (a) advertise a PMSI bound
      to a tunnel, where the specified tunnel is not known to be down,
      or (b) do not advertise any x-PMSI applicable to the given
      (C-S,C-G) but have associated a VRF Route Import BGP attribute
      with the unicast VPN route for S (this is necessary to avoid
      incorrectly invalidating a UMH PE that uses a policy where no
      I-PMSI is advertised for a given VRF and where only S-PMSIs are
      used, the S-PMSI advertisement possibly being done only after
      the upstream PE receives a C-multicast route for
      (C-S, C-G)/(C-*, C-G) to be carried over the advertised S-PMSI)

   o  second, the UMH candidates that advertise a PMSI bound to a
      tunnel that is "down"; these are used as a last resort to ensure
      a graceful fallback to the basic MVPN UMH selection procedures
      in the hypothetical case where a false negative occurs when
      determining the status of all tunnels

   For a given downstream PE and a given VRF, the P-tunnel
   corresponding to a given upstream PE for a given (C-S,C-G) state is
   the S-PMSI tunnel advertised by that upstream PE for this (C-S,C-G)
   and imported into that VRF, or if there isn't any such S-PMSI, the
   I-PMSI tunnel advertised by that PE and imported into that VRF.

   Note that this document assumes that if a site of a given MVPN that
   contains C-S is dual-homed to two PEs, then all the other sites of
   that MVPN would have two unicast VPN routes (VPN-IPv4 or VPN-IPv6)
   to C-S, each with its own RD.

3.1.  Determining the status of a tunnel

   Different factors can be considered to determine the "status" of a
   P-tunnel and are described in the following sub-sections.
   The optional procedures proposed in this section do not require all
   downstream PEs to apply the same rules to define the status of a
   P-tunnel (see Section 6), and some of these rules produce a result
   that may differ from one downstream PE to another.  Thus what is
   called the "status" of a P-tunnel in this section is not a
   characteristic of the tunnel in itself, but is the status of the
   tunnel *as seen from a particular downstream PE*.  Additionally,
   some of the following methods determine the ability of a downstream
   PE to receive traffic on the P-tunnel rather than the status of the
   P-tunnel itself.  This could be referred to as "P-tunnel reception
   status", but for simplicity, we will use the terminology of
   P-tunnel "status" for all of these methods.

   Depending on the criteria used to determine the status of a
   P-tunnel, there may be an interaction with another resiliency
   mechanism used for the P-tunnel itself, and the UMH update may
   happen immediately or may need to be delayed.  Each case is covered
   in its own sub-section below.

3.1.1.  mVPN tunnel root tracking

   A condition for considering the status of a P-tunnel to be up is
   that the root of the tunnel, as determined in the PMSI tunnel
   attribute, is reachable through unicast routing tables.  In this
   case, the downstream PE can immediately update its UMH when the
   reachability condition changes.

   This is similar to BGP next-hop tracking for VPN routes, except
   that the address considered is not the BGP next-hop address, but
   the root address in the PMSI tunnel attribute.

   If BGP next-hop tracking is done for VPN routes and the root
   address of a given tunnel happens to be the same as the next-hop
   address in the BGP auto-discovery route advertising the tunnel,
   then this mechanism may be omitted for this tunnel, as it will not
   bring any specific benefit.
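   As an illustration only, the reachability condition of tunnel root
   tracking could be evaluated as in the sketch below.  It assumes
   that the unicast routing table is available as a list of prefixes
   in text form; the function name root_reachable and the use of
   documentation prefixes are assumptions made for the example.
   Whether a default route should count as reachability is a local
   policy question that the sketch does not address.

```python
import ipaddress
from typing import Iterable


def root_reachable(root: str, routes: Iterable[str]) -> bool:
    """Return True if some unicast route covers the tunnel root address.

    `root` is the root address taken from the PMSI tunnel attribute;
    `routes` is a hypothetical view of the unicast routing table.
    """
    address = ipaddress.ip_address(root)
    for prefix in routes:
        network = ipaddress.ip_network(prefix)
        # Only compare prefixes of the same address family as the root.
        if network.version == address.version and address in network:
            return True
    return False
```

   A downstream PE would re-evaluate this condition whenever its
   unicast routing table changes and update its UMH accordingly.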
3.1.2.  PE-P Upstream link status

   A condition for considering a tunnel status as Up can be that the
   last-hop link of the P-tunnel is up.

   This method should not be used when there is a fast restoration
   mechanism (such as MPLS FRR [RFC4090]) in place for the link.

3.1.3.  P2MP RSVP-TE tunnels

   For P-tunnels of type P2MP MPLS-TE, the status of the P-tunnel is
   considered up if one or more of the P2MP RSVP-TE LSPs, identified
   by the P-tunnel Attribute, are in Up state.  The determination of
   whether a P2MP RSVP-TE LSP is in Up state requires Path and Resv
   state for the LSP and is based on procedures in [RFC4875].  In this
   case, the downstream PE can immediately update its UMH when the
   reachability condition changes.

   When signaling state for a P2MP TE LSP is removed (e.g., if the
   ingress of the P2MP TE LSP sends a PathTear message) or the P2MP TE
   LSP changes state from Up to Down as determined by procedures in
   [RFC4875], the status of the corresponding P-tunnel SHOULD be
   re-evaluated.  If the P-tunnel transitions from Up to Down state,
   the upstream PE that is the ingress of the P-tunnel SHOULD NOT be
   considered a valid UMH.

3.1.4.  Leaf-initiated P-tunnels

   A PE can be removed from the UMH candidate list for a given
   (C-S, C-G) if the P-tunnel (I-PMSI or S-PMSI, depending) for this
   (C-S, C-G) is leaf triggered (PIM, mLDP), but for some reason
   internal to the protocol the upstream one-hop branch of the tunnel
   from P to PE cannot be built.  In this case, the downstream PE can
   immediately update its UMH when the reachability condition changes.

3.1.5.  (C-S, C-G) counter information

   In cases where the downstream node can be configured so that the
   maximum inter-packet time is known for all the multicast flows
   mapped on a P-tunnel, the local per-(C-S,C-G) traffic counter
   information for traffic received on this P-tunnel can be used to
   determine the status of the P-tunnel.
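   One possible way to turn a traffic counter into a status is
   sketched below.  It assumes that the per-(C-S,C-G) packet counter
   is sampled periodically and that times are expressed in
   milliseconds.  The class FlowMonitor and its parameter max_gap_ms
   (the configured maximum inter-packet time) are hypothetical names
   used only for this example.

```python
class FlowMonitor:
    """Track one (C-S,C-G) packet counter received on a P-tunnel."""

    def __init__(self, max_gap_ms: int, now_ms: int, count: int = 0) -> None:
        self.max_gap_ms = max_gap_ms  # configured maximum inter-packet time
        self.last_count = count       # last counter value observed
        self.last_change_ms = now_ms  # when the counter last advanced

    def sample(self, count: int, now_ms: int) -> bool:
        """Record a counter reading; return True while the flow looks alive."""
        if count != self.last_count:
            self.last_count = count
            self.last_change_ms = now_ms
        # The flow is suspect once the counter has been idle for longer
        # than the configured maximum inter-packet time.
        return (now_ms - self.last_change_ms) <= self.max_gap_ms
```

   A False result only says that no packet was counted within the
   configured interval; how that is mapped to a P-tunnel status is
   left to the implementation.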
   When such a procedure is used in a context where fast restoration
   mechanisms are used for the P-tunnels, downstream PEs should be
   configured to wait before updating the UMH, to give the P-tunnel
   restoration mechanism time to act.  A configurable timer MUST be
   provided for this purpose, and it is recommended to provide a
   reasonable default value for this timer.

   This method can be applicable, for instance, when a (C-S, C-G) flow
   is mapped on an S-PMSI.

   In cases where this mechanism is used in conjunction with Hot leaf
   standby, no prior knowledge of the rate of the multicast streams is
   required; downstream PEs can compare reception on the two P-tunnels
   to determine when one of them is down.

3.1.6.  BFD Discriminator

   P-tunnel status can be derived from the status of a multipoint BFD
   session [I-D.ietf-bfd-multipoint] whose discriminator is advertised
   along with an x-PMSI A-D route.

   This document defines the format and ways of using a new BGP
   attribute called the "BGP-BFD attribute".  This is an optional
   transitive BGP attribute.  The format of this attribute is defined
   as follows:

      +-------------------------------+
      |        Flags (1 octet)        |
      +-------------------------------+
      | BFD Discriminator (4 octets)  |
      +-------------------------------+

   The Flags field has the following format:

       0 1 2 3 4 5 6 7
      +-+-+-+-+-+-+-+-+
      |   reserved    |
      +-+-+-+-+-+-+-+-+

3.1.6.1.
Upstream PE Procedures 306 When it is desired to track the P-tunnel status using p2mp BFD 307 session, the Upstream PE: 309 o MUST initiate BFD session and set bfd.SessionType = MultipointHead 310 as described in [I-D.ietf-bfd-multipoint]; 312 o MUST use address in 127.0.0.0/8 range for IPv4 or in 313 0:0:0:0:0:FFFF:7F00:0/104 range for IPv6 as destination IP address 314 when transmitting BFD control packets; 316 o MUST use the IP address of the Upstream PE as source IP address 317 when transmitting BFD control packets; 319 o MUST include the BGP-BFD Attribute in the x-PMSI A-D Route with 320 BFD Discriminator value set to My Discriminator value; 322 o MUST periodically transmit BFD control packets over the x-PMSI 323 tunnel. 325 If tracking of the P-tunnel by using a p2mp BFD session is to be 326 enabled after the P-tunnel has been already signaled, then the 327 procedure described above MUST be followed. Note that x-PMSI A-D 328 Route MUST be re-sent with exactly the same attributes as before and 329 the BGP-BFD Attribute included. 331 If P-tunnel is already signaled, and P-tunnel status tracked using 332 the p2mp BFD session and it is desired to stop tracking P-tunnel 333 status using BFD, then: 335 o x-PMSI A-D Route MUST be re-sent with exactly the same attributes 336 as before, but the BGP-BFD Attribute MUST be excluded; 338 o the p2mp BFD session SHOULD be deleted. 340 3.1.6.2. 
Downstream PE Procedures

   Upon receiving the BGP-BFD Attribute in the x-PMSI A-D Route, the
   Downstream PE:

   o  MUST associate the received BFD discriminator value with the
      P-tunnel originating from the Root PE and the IP address of the
      Upstream PE;

   o  MUST create a p2mp BFD session and set bfd.SessionType =
      MultipointTail as described in [I-D.ietf-bfd-multipoint];

   o  MUST use the source IP address of the BFD control packet, the
      value of the BFD Discriminator field, and the identifier of the
      x-PMSI tunnel on which the BFD control packet was received to
      properly demultiplex BFD sessions.

   After the state of the p2mp BFD session is up, i.e.,
   bfd.SessionState == Up, the session state will then be used to
   track the health of the P-tunnel.

   According to [I-D.ietf-bfd-multipoint], if the Downstream PE
   receives Down or AdminDown in the State field of the BFD control
   packet, or if the Detection Timer associated with the BFD session
   expires, the BFD session state is down, i.e., bfd.SessionState ==
   Down.  When the BFD session state is Down, the P-tunnel associated
   with the BFD session MUST be declared down.  The Downstream PE MAY
   then initiate a switchover of the traffic from the Primary Upstream
   PE to the Standby Upstream PE, but only if the Standby Upstream PE
   is deemed available.  A different p2mp BFD session MAY monitor the
   state of the Standby Upstream PE.

   If the Downstream PE's P-tunnel is already up when the Downstream
   PE receives the new x-PMSI A-D Route with the BGP-BFD Attribute,
   the Downstream PE MUST accept the x-PMSI A-D Route and associate
   the value of the BFD Discriminator field with the P-tunnel.  The
   Downstream PE MUST follow the procedures listed above in this
   section to bring the p2mp BFD session up and use it to monitor the
   state of the associated P-tunnel.
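   The attribute layout and the demultiplexing rule can be
   illustrated with a short sketch.  It is not an implementation of
   [I-D.ietf-bfd-multipoint]: the state handling is deliberately
   simplified, network byte order is assumed for the 4-octet
   discriminator, and all names (parse_bgp_bfd_attribute, TailSession,
   TailSessionTable, tunnel_declared_down) are hypothetical.  The
   sketch also assumes that a P-tunnel is declared down only after
   its session has been Up at least once, which is one reading of the
   text above.

```python
import struct
from typing import Dict, Optional, Tuple

# Demultiplexing key: (source IP, BFD discriminator, x-PMSI tunnel id).
Key = Tuple[str, int, str]


def parse_bgp_bfd_attribute(value: bytes) -> Tuple[int, int]:
    """Split the 5-octet attribute value into (flags, discriminator)."""
    if len(value) != 5:
        raise ValueError("expected 1 octet of flags + 4 octets of discriminator")
    flags, discriminator = struct.unpack("!BI", value)
    return flags, discriminator


class TailSession:
    def __init__(self) -> None:
        self.state = "Down"    # simplified bfd.SessionState
        self.seen_up = False   # health is tracked only once the session was Up


class TailSessionTable:
    def __init__(self) -> None:
        self._sessions: Dict[Key, TailSession] = {}

    def bind(self, source_ip: str, discriminator: int, tunnel_id: str) -> None:
        """Create a MultipointTail session for an advertised discriminator."""
        self._sessions[(source_ip, discriminator, tunnel_id)] = TailSession()

    def on_control_packet(self, source_ip: str, my_discriminator: int,
                          tunnel_id: str, pkt_state: str) -> Optional[str]:
        """Demultiplex a received control packet and update the session."""
        session = self._sessions.get((source_ip, my_discriminator, tunnel_id))
        if session is None:
            return None  # no matching session: the packet is ignored
        if pkt_state == "Up":
            session.state = "Up"
            session.seen_up = True
        elif pkt_state in ("Down", "AdminDown"):
            session.state = "Down"
        return session.state

    def on_detection_timeout(self, key: Key) -> None:
        """Detection Timer expiry also brings the session Down."""
        session = self._sessions.get(key)
        if session is not None:
            session.state = "Down"

    def tunnel_declared_down(self, key: Key) -> bool:
        session = self._sessions.get(key)
        return session is not None and session.seen_up and session.state == "Down"
```

   Keeping the three-part key in one table mirrors the requirement
   that the source address, the discriminator, and the tunnel
   identifier are all needed to select the right session.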
380 If the Downstream PE's P-tunnel is already up, its state being 381 monitored by the p2mp BFD session, and the Downstream PE receives the 382 new x-PMSI A-D Route without the BGP-BFD Attribute, the Downstream 383 PE: 385 o MUST accept the x-PMSI A-D Route; 387 o MUST stop receiving BFD control packets for this p2mp BFD session; 389 o SHOULD delete the p2mp BFD session associated with the P-tunnel; 391 o SHOULD NOT switch the traffic to the Standby Upstream PE. 393 In such a scenario, in the context where fast restoration mechanisms 394 are used for the P-tunnels, leaf PEs should be configured to wait 395 before updating the UMH, to let the P-tunnel restoration mechanism 396 happen. A configurable timer MUST be provided for this purpose, and 397 it is RECOMMENDED to provide a reasonable default value for this 398 timer. 400 3.1.7. Per PE-CE link BFD Discriminator 402 The following approach is defined for the fast failover in response 403 to the detection of PE-CE link failures, in which UMH selection for a 404 given C-multicast route takes into account the state of the BFD 405 session associated with the state of the upstream PE-CE link. 407 3.1.7.1. Upstream PE Procedures 409 For each protected PE-CE link, the upstream PE initiates a multipoint 410 BFD session [I-D.ietf-bfd-multipoint] as MultipointHead toward 411 downstream PEs. A downstream PE monitors the state of the p2mp 412 session as MultipointTail and MAY interpret transition of the BFD 413 session into Down state as the indication of the associated PE-CE 414 link being down. 416 For SSM groups, the upstream PE advertises an (C-S, C-G) S-PMSI A-D 417 route or wildcard (S,*) S-PMSI A-D route for each received SSM (C-S, 418 C-G) C-multicast route for which protection is desired. For each ASM 419 (C-S, C-G) C-multicast route for which protection is desired, the 420 upstream PE advertises a (C-S, C-G) S-PMSI A-D route. 
   For each ASM (*,G) C-multicast route for which protection is
   desired, the upstream PE advertises a wildcard (*,G) S-PMSI A-D
   route.  Note that all S-PMSI A-D routes can signal the same
   P-tunnel, so there is no need for a new P-tunnel for each S-PMSI
   A-D route.  The multicast flows for which protection is desired are
   controlled by configuration/policy on the upstream PE.  The
   protected link is the RPF PE-CE interface towards the src/RP.  The
   upstream PE advertises the BFD Discriminator of the protected link
   in the S-PMSI A-D route.  If the route to the src/RP changes such
   that the RPF interface becomes a new PE-CE interface, then the
   upstream PE will update the S-PMSI A-D route, with the BGP-BFD
   Attribute included, so that the previously advertised value of the
   BFD Discriminator is associated with the new RPF link.

3.1.7.2.  Downstream PE Procedures

   If an S-PMSI A-D route bound to a given C-multicast route is
   signaled with a multipoint BFD session, then the upstream PE is
   considered during UMH selection for the C-multicast route if and
   only if the corresponding BFD session is not in state Down, i.e.,
   bfd.SessionState != Down.  Whenever the state of the BFD session
   changes to Down, the Provider Tunnel will be considered down, and
   the downstream PE MAY switch to the backup Provider Tunnel, but
   only if the backup Provider Tunnel is deemed available.  A
   dedicated p2mp BFD session MAY monitor the state of the backup
   Provider Tunnel.  Note that the Provider Tunnel is considered down
   only for the C-multicast states that match an S-PMSI A-D route
   which included the BGP-BFD Attribute with the BFD Discriminator of
   the p2mp BFD session that is down.

4.  Standby C-multicast route

   The procedures described below are limited to the case where the
   site that contains C-S is connected to two or more PEs; to simplify
   the description, only the case of dual-homing is described.
   The procedures require all the PEs of that MVPN to follow the UMH
   selection, as specified in [RFC6513], whether the PE is selected
   based on its IP address, the hashing algorithm described in section
   5.1.3 of [RFC6513], or the Installed UMH Route.  The procedures
   assume that if a site of a given MVPN that contains C-S is
   dual-homed to two PEs, then all the other sites of that MVPN would
   have two unicast VPN routes (VPN-IPv4 or VPN-IPv6) to C-S, each
   with its own RD.

   As long as C-S is reachable via both PEs, a given downstream PE
   will select one of the PEs connected to C-S as its Upstream PE with
   respect to C-S (the "Primary Upstream PE").  We will refer to the
   other PE connected to C-S as the "Standby Upstream PE".  Note that
   if the connectivity to C-S through the Primary Upstream PE becomes
   unavailable, then the PE will select the Standby Upstream PE as its
   Upstream PE with respect to C-S.  When the Primary Upstream PE
   later becomes available, the PE will select the Primary Upstream PE
   again as its Upstream PE.  This is referred to as "revertive"
   behavior and MUST be supported.  Non-revertive behavior refers to
   continuing to select the backup PE as the UMH even after the
   Primary has come up.  This non-revertive behavior can optionally be
   supported by an implementation and would be enabled through some
   configuration.

   For readability, in the following sub-sections, the procedures are
   described for BGP C-multicast Source Tree Join routes, but they
   apply equally to failover of BGP C-multicast Shared Tree Join
   routes for the case where the customer RP is dual-homed (substitute
   "C-RP" for "C-S").

4.1.
Downstream PE behavior

   When a (downstream) PE connected to some site of an MVPN needs to
   send a C-multicast route (C-S, C-G), then following the procedures
   specified in Section "Originating C-multicast routes by a PE" of
   [RFC6514], the PE sends the C-multicast route with an RT that
   identifies the Upstream PE selected by the PE originating the
   route.  As long as C-S is reachable via the Primary Upstream PE,
   the Upstream PE is the Primary Upstream PE.  If C-S is reachable
   only via the Standby Upstream PE, then the Upstream PE is the
   Standby Upstream PE.

   If C-S is reachable via both the Primary and the Standby Upstream
   PE, then in addition to sending the C-multicast route with an RT
   that identifies the Primary Upstream PE, the PE also originates and
   sends a C-multicast route with an RT that identifies the Standby
   Upstream PE.  This route, which has the semantics of being a
   'standby' C-multicast route, is further called a "Standby BGP
   C-multicast route", and is constructed as follows:

   o  the NLRI is constructed as for the original C-multicast route,
      except that the RD is the same as if the C-multicast route was
      built using the standby PE as the UMH (it will carry the RD
      associated with the unicast VPN route advertised by the standby
      PE for S and a Route Target derived from the standby PE's UMH
      route's VRF RT Import EC);

   o  the route SHOULD carry the "Standby PE" BGP Community (this is a
      new BGP Community, see Section 7).

   The normal and the standby C-multicast routes must have their Local
   Preference attribute adjusted so that, if two C-multicast routes
   with the same NLRI are received by a BGP peer, one carrying the
   "Standby PE" community and the other one *not* carrying the
   "Standby PE" community, then preference is given to the one *not*
   carrying the "Standby PE" community.
Such a situation can happen when, for 517 instance, due to transient unicast routing inconsistencies, two 518 different downstream PEs consider different upstream PEs to be the 519 primary one; in that case, without any precaution taken, both 520 upstream PEs would process a standby C-multicast route and possibly 521 stop forwarding at the same time. For this purpose, routes that 522 carry the "Standby PE" BGP Community MUST have the LOCAL_PREF 523 attribute set to zero. 525 Note that, when a PE advertises such a Standby C-multicast join for 526 an (C-S, C-G) it must join the corresponding P-tunnel. 528 If at some later point the local PE determines that C-S is no longer 529 reachable through the Primary Upstream PE, the Standby Upstream PE 530 becomes the Upstream PE, and the local PE re-sends the C-multicast 531 route with RT that identifies the Standby Upstream PE, except that 532 now the route does not carry the Standby PE BGP Community (which 533 results in replacing the old route with a new route, with the only 534 difference between these routes being the presence/absence of the 535 Standby PE BGP Community). 537 4.2. Upstream PE behavior 539 When a PE receives a C-multicast route for a particular (C-S, C-G), 540 and the RT carried in the route results in importing the route into a 541 particular VRF on the PE, if the route carries the Standby PE BGP 542 Community, then the PE performs as follows: 544 when the PE determines that C-S is not reachable through some 545 other PE, the PE SHOULD install VRF PIM state corresponding to 546 this Standby BGP C-multicast route (the result will be that a PIM 547 Join message will be sent to the CE towards C-S, and that the PE 548 will receive (C-S,C-G) traffic), and the PE SHOULD forward (C-S, 549 C-G) traffic received by the PE to other PEs through a P-tunnel 550 rooted at the PE. 
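   The default reaction of an upstream PE to a C-multicast route can
   be summarized by the following sketch.  It covers only the rule
   stated just above, treats the SHOULD-level actions as taken, and
   ignores any local policy.  The names Actions and
   on_c_multicast_route are hypothetical, and reachability of C-S
   through another PE is assumed to be supplied by the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Actions:
    install_pim_state: bool    # install VRF PIM state (Join sent toward C-S)
    forward_on_p_tunnel: bool  # forward (C-S,C-G) traffic on a local P-tunnel


def on_c_multicast_route(standby: bool, cs_reachable_via_other_pe: bool) -> Actions:
    """Decide what an upstream PE does for a received C-multicast route.

    `standby` is True when the route carries the Standby PE BGP Community.
    """
    if not standby:
        # Normal C-multicast route: regular processing, join and forward.
        return Actions(install_pim_state=True, forward_on_p_tunnel=True)
    # Standby route: act only once C-S is no longer reachable through
    # some other PE.
    active = not cs_reachable_via_other_pe
    return Actions(install_pim_state=active, forward_on_p_tunnel=active)
```

   With this default, a PE holding only a Standby route stays idle
   while C-S remains reachable through another PE and starts joining
   and forwarding when that reachability is lost.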
   Furthermore, irrespective of whether C-S carried in that route is
   reachable through some other PE:

   a) based on local policy, as soon as the PE receives this Standby
      BGP C-multicast route, the PE MAY install VRF PIM state
      corresponding to this BGP Source Tree Join route (the result
      will be that Join messages will be sent to the CE toward C-S,
      and that the PE will receive (C-S,C-G) traffic)

   b) based on local policy, as soon as the PE receives this Standby
      BGP C-multicast route, the PE MAY forward (C-S, C-G) traffic to
      other PEs through a P-tunnel independently of the reachability
      of C-S through some other PE (note that this implies also doing
      (a))

   Doing neither (a) nor (b) for a given (C-S,C-G) is called "cold
   root standby".

   Doing (a) but not (b) for a given (C-S,C-G) is called "warm root
   standby".

   Doing (b) (which implies also doing (a)) for a given (C-S,C-G) is
   called "hot root standby".

   Note that, if an upstream PE uses an S-PMSI only policy, it shall
   advertise an S-PMSI for a (C-S, C-G) as soon as it receives a
   C-multicast route for (C-S, C-G), normal or Standby; i.e., it shall
   not wait to receive a non-Standby C-multicast route before
   advertising the corresponding S-PMSI.

   Section 9.3.2 of [RFC6514] describes the procedures for sending a
   Source-Active A-D route as a result of receiving the C-multicast
   route.  These procedures should be followed for both the normal and
   Standby C-multicast routes.

4.3.
Reachability determination 588 The standby PE can use the following information to determine that 589 C-S can or cannot be reached through the primary PE: 591 o presence/absence of a unicast VPN route toward C-S 593 o supposing that the standby PE is an egress of the tunnel rooted at 594 the Primary PE, the standby PE can determine the reachability of 595 C-S through the Primary PE based on the status of this tunnel, 596 determined thanks to the same criteria as the ones described in 597 Section 3.1 (without using the UMH selection procedures of 598 Section 3); 600 o other mechanisms MAY be used. 602 4.4. Inter-AS 604 If the non-segmented inter-AS approach is used, the procedures in 605 section 4 can be applied. 607 When multicast VPNs are used in an inter-AS context with the 608 segmented inter-AS approach described in section 8.2 of [RFC6514], 609 the procedures in this section can be applied. 611 A pre-requisite for the procedures described below to be applied for 612 a source of a given MVPN is: 614 o that any PE of this MVPN receives two Inter-AS I-PMSI auto- 615 discovery routes advertised by the AS of the source (or more) 617 o that these Inter-AS I-PMSI auto-discovery routes have distinct 618 Route Distinguishers (as described in item "(2)" of section 9.2 of 619 [RFC6514]). 621 As an example, these conditions will be satisfied when the source is 622 dual-homed to an AS that connects to the receiver AS through two ASBR 623 using auto-configured RDs. 625 4.4.1. Inter-AS procedures for downstream PEs, ASBR fast failover 627 The following procedure is applied by downstream PEs of an AS, for a 628 source S in a remote AS. 
   In addition to choosing an Inter-AS I-PMSI auto-discovery route
   advertised from the AS of the source to construct a C-multicast
   route, as described in section 11.1.3 of [RFC6514], a downstream PE
   will choose a second Inter-AS I-PMSI auto-discovery route
   advertised from the AS of the source and use this route to
   construct and advertise a Standby C-multicast route (a C-multicast
   route carrying the Standby PE BGP Community) as described in
   Section 4.1.

4.4.2.  Inter-AS procedures for ASBRs

   When an upstream ASBR receives a C-multicast route, and at least
   one of the RTs of the route matches one of the ASBR's Import RTs,
   the ASBR locates an Inter-AS I-PMSI A-D route whose RD and Source
   AS match the RD and Source AS carried in the C-multicast route.  If
   the match is found, and the C-multicast route carries the Standby
   PE BGP Community, then the ASBR performs as follows:

   o  if the route was received over iBGP, the route is expected to
      have a LOCAL_PREF attribute set to zero and it should be
      re-advertised in eBGP with a MED attribute (MULTI_EXIT_DISC) set
      to the highest possible value (0xffffffff)

   o  if the route was received over eBGP, the route is expected to
      have a MED attribute set to 0xffffffff and should be
      re-advertised in iBGP with a LOCAL_PREF attribute set to zero

   Other ASBR procedures are applied without modification.

5.  Hot leaf standby

   The mechanisms defined in Section 3 and Section 4 can be used
   together as follows.
   The principle is that, for a given VRF (or possibly only for a given
   C-S,C-G):

   o  downstream PEs advertise a Standby BGP C-multicast route (based on
      Section 4)

   o  upstream PEs use the "hot standby" optional behavior and thus will
      forward traffic for a given multicast state as soon as they have
      either a (primary) BGP C-multicast route or a Standby BGP
      C-multicast route for that state (or both)

   o  downstream PEs accept traffic from the primary or standby tunnel,
      based on the status of the tunnel (based on Section 3)

   Other combinations of the mechanisms proposed in Section 4 and
   Section 3 are for further study.

   Note that the same level of protection would be achievable with a
   simple C-multicast Source Tree Join route advertised to both the
   primary and secondary upstream PEs (carrying, as Route Target
   extended communities, the values of the VRF Route Import attribute of
   each VPN route from each upstream PE).  The advantage of using the
   Standby semantic is that, supposing that downstream PEs always
   advertise a Standby C-multicast route to the secondary upstream PE,
   it allows the protection level to be chosen through a change of
   configuration on the secondary upstream PE, without requiring any
   reconfiguration of the downstream PEs.

6.  Duplicate packets

   Multicast VPN specifications [RFC6513] require that a PE forward to
   CEs only the packets coming from the expected upstream PE
   (Section 9.1 of [RFC6513]).

   We draw the reader's attention to the fact that compliance with
   this part of the multicast VPN specifications is especially
   important when two distinct upstream PEs may forward the same
   traffic on P-tunnels at the same time in the steady state.
   This will be the case when "hot root standby" mode is used
   (Section 4), and can also be the case if the procedures of Section 3
   are used and (a) the rules determining the status of a tree are not
   the same on two distinct downstream PEs or (b) the rule determining
   the status of a tree depends on conditions local to a PE (e.g., the
   PE-P upstream link being up).

7.  IANA Considerations

   Allocation is expected from IANA for the BGP "Standby PE" community.
   (TBC)

8.  Security Considerations

9.  Acknowledgments

   The authors want to thank Greg Reaume, Eric Rosen, Jeffrey Zhang, and
   Zheng (Sandy) Zhang for their reviews, useful comments, and helpful
   suggestions.

10.  Contributor Addresses

   Below is a list of other contributing authors in alphabetical order:

      Rahul Aggarwal
      Arktan

      Email: raggarwa_1@yahoo.com

      Nehal Bhau
      Alcatel-Lucent, Inc.
      701 E Middlefield Rd
      Mountain View, CA 94043
      USA

      Email: Nehal.Bhau@alcatel-lucent.com

      Clayton Hassen
      Bell Canada
      2955 Virtual Way
      Vancouver
      CANADA

      Email: Clayton.Hassen@bell.ca

      Wim Henderickx
      Alcatel-Lucent
      Copernicuslaan 50
      Antwerp 2018
      Belgium

      Email: wim.henderickx@alcatel-lucent.com

      Pradeep Jain
      Alcatel-Lucent, Inc.
      701 E Middlefield Rd
      Mountain View, CA 94043
      USA

      Email: pradeep.jain@alcatel-lucent.com

      Jayant Kotalwar
      Alcatel-Lucent, Inc.
      701 E Middlefield Rd
      Mountain View, CA 94043
      USA

      Email: Jayant.Kotalwar@alcatel-lucent.com

      Praveen Muley
      Alcatel-Lucent
      701 East Middlefield Rd
      Mountain View, CA 94043
      U.S.A.

      Email: praveen.muley@alcatel-lucent.com

      Ray (Lei) Qiu
      Juniper Networks
      1194 North Mathilda Ave.
      Sunnyvale, CA 94089
      U.S.A.

      Email: rqiu@juniper.net

      Yakov Rekhter
      Juniper Networks
      1194 North Mathilda Ave.
      Sunnyvale, CA 94089
      U.S.A.
      Email: yakov@juniper.net

      Kanwar Singh
      Alcatel-Lucent, Inc.
      701 E Middlefield Rd
      Mountain View, CA 94043
      USA

      Email: kanwar.singh@alcatel-lucent.com

11.  References

11.1.  Normative References

   [I-D.ietf-bfd-multipoint]
              Katz, D., Ward, D., Pallagatti, S., and G. Mirsky, "BFD
              for Multipoint Networks", draft-ietf-bfd-multipoint-19
              (work in progress), December 2018.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4875]  Aggarwal, R., Ed., Papadimitriou, D., Ed., and S.
              Yasukawa, Ed., "Extensions to Resource Reservation
              Protocol - Traffic Engineering (RSVP-TE) for Point-to-
              Multipoint TE Label Switched Paths (LSPs)", RFC 4875,
              DOI 10.17487/RFC4875, May 2007.

   [RFC6513]  Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/
              BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February
              2012.

   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

11.2.  Informative References

   [RFC4090]  Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast
              Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
              DOI 10.17487/RFC4090, May 2005.

   [RFC7431]  Karan, A., Filsfils, C., Wijnands, IJ., Ed., and B.
              Decraene, "Multicast-Only Fast Reroute", RFC 7431,
              DOI 10.17487/RFC7431, August 2015.

Authors' Addresses

   Thomas Morin (editor)
   Orange
   2, avenue Pierre Marzin
   Lannion 22307
   France

   Email: thomas.morin@orange-ftgroup.com

   Robert Kebler (editor)
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   U.S.A.
   Email: rkebler@juniper.net

   Greg Mirsky (editor)
   ZTE Corp.

   Email: gregimirsky@gmail.com