BESS Working Group                                     M. MacKenzie, Ed.
Internet-Draft                                              P. Brissette
Intended status: Standards Track                                   Cisco
Expires: January 12, 2022                                  S. Matsushima
                                                                Softbank
                                                           July 11, 2021


               EVPN multi-homing support for L3 services
                draft-mackenzie-bess-evpn-l3mh-proto-00

Abstract

   This document brings the machinery and solution providing higher
   network availability and load balancing benefits of EVPN
   Multi-Chassis Link Aggregation Group (MC-LAG) to various L3 services
   delivered by EVPN.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119] and
   RFC 8174 [RFC8174].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference material or
   to cite them other than as "work in progress."

   This Internet-Draft will expire on January 12, 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Problems with unicast load-balancing from core to CE
     1.2.  Problems with multicast from core to CE
     1.3.  Problems with IGP adjacencies over the LAG port
     1.4.  Problems with supporting multiple subnets on same ES in
           all active mode
     1.5.  Acronyms
     1.6.  Requirements
   2.  Solution
     2.1.  Mapping of L3VRF to EVPN EVI
     2.2.  Mapping for L3 Interface to ESI
     2.3.  Mapping for L3 Sub-Interface to Attachment Circuit ID
     2.4.  Route sync for ARP/ND
       2.4.1.  Local adjacency (ARP/ND) learning
       2.4.2.  Remote ARP/ND learning
     2.5.  Route sync for IGMP
       2.5.1.  Local IGMP Join/Leave learning
       2.5.2.  Remote IGMP Join/Leave learning
     2.6.  Customer Subnet Route sync using Route-type(5)
     2.7.  Mapping for VLAN to ETAG
   3.  Extensions to RT-2, RT-5, RT-7 and RT-8
   4.  Convergence Considerations
   5.  Overall Advantages
   6.  Security Considerations
   7.  IANA Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   Resilient L3VPN service to a CE requires multiple service PEs to run
   an MC-LAG mechanism, which previously required a proprietary ICL
   control plane link between them.

   This proposed extension to [RFC7432] brings EVPN based MC-LAG
   all-active multi-homing load-balancing to various services (L2 and
   L3) delivered by EVPN.  Although this solution is also applicable to
   some L2 service use cases (for example, Centralized Gateway), this
   document focuses on the L3VPN [RFC4364] use case to provide
   examples.

   EVPN MC-LAG is completely transparent to a CE device, and provides
   link and node level redundancy with load-balancing using the existing
   BGP control plane required by the L3 services.

   For example, the L3VPN service can be MPLS, VXLAN or SRv6 based, and
   does not require EVPN signaling to remote neighbors.  The EVPN
   signaling will be limited to the redundant service PEs sharing an
   Ethernet Segment Identifier (ESI).  This will be used to synchronize
   ARP/ND, multicast Join/Leave, and IGP routes, replacing the need for
   an ICL link.

                          +-----+
                          | PE3 |
                          +-----+
                       +-----------+
                       |  MPLS/IP  |
                       |   CORE    |
                       +-----------+
                     +-----+   +-----+
                     | PE1 |   | PE2 |
                     +-----+   +-----+
                        |         |
                        I1       I2
                         \       /
                          \     /
                           +---+
                           |CE1|
                           +---+

                    Figure 1: EVPN MC-LAG Topology

   Figure 1 shows an MC-LAG multi-homing topology where PE1 and PE2 are
   part of the same redundancy group providing multi-homing to CE1 via
   interfaces I1 and I2.  Interfaces I1 and I2 are Bundle-Ethernet
   interfaces running LACP.  The CE device can be a layer-2 or layer-3
   device connecting to the redundant PEs over a single LACP LAG port.
   In the case of a layer-3 CE device, this document looks to solve the
   case of an IGP adjacency between PEs and CE, but further study is
   needed to support BGP PE to CE protocols.  The core, shown as IP or
   MPLS enabled, provides a wide range of L3 services.  MC-LAG
   multi-homing functionality is decoupled from those services in the
   core and focuses on providing multi-homing to the CE.

   To deliver resilient layer-3 services and provide traffic
   load-balancing towards the access, the two service PEs will advertise
   layer-3 reachability towards the layer-3 core and will both be
   eligible to receive traffic and forward it towards the access.

1.1.  Problems with unicast load-balancing from core to CE

   The layer-2 hashing performed by the CE over its LAG port means that
   it is possible for only one service PE to populate its ARP/ND cache.
   Take for example PE1 and PE2 from Figure 1.  If CE1's ARP/ND
   responses happen to always hash over I1 towards PE1, then PE2's
   ARP/ND table will be empty.  Since unicast traffic from remote PEs
   can be received by either service PE, traffic that reaches service
   PE2 will not find an ARP/ND entry matching the host IP address and
   will be dropped until ARP/ND resolves the adjacency.
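
   The failure mode above can be illustrated with a toy model.  This is
   a minimal sketch in Python under stated assumptions, not a
   description of any real CE: the CRC32-based lag_member function, the
   arp_tables structure and the addresses are all hypothetical, and
   real LAG hashing is implementation specific.  It only shows that a
   deterministic per-flow hash sends every ARP reply of a flow to the
   same PE.

```python
import zlib

def lag_member(src_mac: str, dst_mac: str, members: list) -> str:
    """Pick a LAG member link from a flow hash (hypothetical CRC32 hash)."""
    key = f"{src_mac}->{dst_mac}".encode()
    return members[zlib.crc32(key) % len(members)]

# Link I1 terminates on PE1 and link I2 on PE2, as in Figure 1.
LINK_TO_PE = {"I1": "PE1", "I2": "PE2"}

# Per-PE adjacency tables: host IP -> host MAC.
arp_tables = {"PE1": {}, "PE2": {}}

def ce_sends_arp_reply(host_ip: str, host_mac: str, gateway_mac: str) -> str:
    """Deliver one ARP reply over the LAG and return the PE that learned it."""
    link = lag_member(host_mac, gateway_mac, sorted(LINK_TO_PE))
    pe = LINK_TO_PE[link]
    arp_tables[pe][host_ip] = host_mac
    return pe

# The same flow always hashes to the same link, so repeated ARP replies
# refresh one PE while the other PE never learns the adjacency.
# Illustrative addresses only (documentation MAC range).
learners = {
    ce_sends_arp_reply("10.0.1.2", "00:00:5e:00:53:01", "00:00:5e:00:53:ff")
    for _ in range(5)
}
print("PEs that learned the adjacency:", sorted(learners))
```

   Whichever PE the hash selects, the other PE's table stays empty,
   which is the gap that the route sync between peer PEs is meant to
   close.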

   If the CE's hash implementation always sends the ARP/ND response
   towards PE1, the resolution on PE2 will never happen and traffic
   load-balanced to PE2 will be black-holed.

   The route sync solution is described in Section 2.4.

1.2.  Problems with multicast from core to CE

   Similar to the unicast behavior above, multicast IGMP join messages
   from the CE over the LAG link may always hash to a single PE.

   When PIM runs on both redundant layer-3 PEs that service multicast
   for the same access segment, PIM elects only one of the PEs as the
   PIM Designated Router (DR) using the PIM DR election algorithm
   [RFC7761].  The PIM DR is responsible for tracking local multicast
   listeners and forwarding traffic to those listeners.  The PIM DR is
   also responsible for sending local Join/Prune messages towards the
   RP or source.

   For example, if in Figure 1 PE2 is the elected PIM DR, but CE IGMP
   join messages are hashed to I1 towards PE1, then multicast traffic
   will not be attracted to this service pair, as PE2 will not send a
   PIM Join on behalf of the CE.

   In order to ensure that the PIM DR always has all the MCAST route(s)
   and is able to forward PIM Join/Prune messages towards the RP,
   BGP-EVPN multicast route-sync will be leveraged to synchronize the
   learned MCAST route(s) to the DR.

   When a fail-over occurs, multicast state is already programmed on
   the newly elected DR service PE, which assumes responsibility for
   routing and forwarding all of the traffic.

   The multicast route sync solution is described in Section 2.5.

1.3.  Problems with IGP adjacencies over the LAG port

   A layer-3 CE device/router that connects to the redundant PEs may
   establish an IGP adjacency on the bundle port.  In this case, the
   adjacency will be formed with only one of the PEs, and the IGP
   customer route(s) will only be present on that PE.

   This prevents this use case from getting the load-balancing benefits
   of redundant PEs, as only one PE will be aware of the customer
   routes and advertise them to the core.

                      <---------+
                                | IGP Adj
         +-------+              |
         |       | 1.1.1.1/24   |
         |  PE1  +-----------+  |
         |       |           |  |
         |       |           |  +
         +-------+           |
                             |
            +                |  +------+
        RT5 |              L |  |  CE  +------>H1
       Sync |              A +->+      |
            v              G |  |      |
                             |  |      +------>R1
         +-------+           |  +------+
         |       |           |  1.1.1.2/24
         |  PE2  +-----------+
         |       | 1.1.1.1/24
         |       |
         +-------+

                Figure 2: IGP Adjacency over LAG Port

   Figure 2 provides an example of this use case, where the CE forms an
   IGP adjacency with PE1 (for example, IS-IS or OSPF) and advertises
   its H1 and R1 routes into the IP-VRF of PE1.  PE1 may then
   redistribute these IGP routes into the core as an L3 service.  Any
   remote PE will only be aware of the service from PE1, and cannot
   load-balance through PE2 as well.

   Further study is required in order to support the case of BGP PE to
   CE protocols.

   A solution to this is described in Section 2.6.

1.4.  Problems with supporting multiple subnets on same ES in all
      active mode

   In the case where the L3 service is L3VPN such as [RFC4364], it is
   likely that the CE device is a layer-2 switch supporting multiple
   subnets through the use of VLANs.  In addition, each VLAN may be
   associated with a different customer VRF.

   When ARP/ND routes are synchronized between the PEs for ARP proxy
   support using RT-2, a problem similar to the one described in
   Section 1.1 of [I-D.sajassi-bess-evpn-ac-aware-bundling] is
   encountered.  The PE receiving RT-2 is unable to determine which
   sub-interface the ARP/ND entry is associated with.

   When IGMP routes are synchronized between the PEs using RT-7 and
   RT-8, a problem similar to the one described in Section 1.2 of
   [I-D.sajassi-bess-evpn-ac-aware-bundling] is encountered.
   The PE receiving RT-7 and RT-8 is unable to determine which
   sub-interface the IGMP join is associated with.

   This document proposes to use the solution defined by Section 4 of
   [I-D.sajassi-bess-evpn-ac-aware-bundling] to solve both these cases.
   All route sync messages (RT-2, RT-5, RT-7, RT-8) will carry an
   Attachment Circuit Identifier Extended Community to signal which
   sub-interface the routes were learnt on.

1.5.  Acronyms

   BD:  Broadcast Domain.  As per [RFC7432], an EVI consists of a single
      or multiple BDs.  In case of VLAN-bundle and VLAN-aware bundle
      service model, an EVI contains multiple BDs.

   DF:  Designated Forwarder.

   DR:  Designated Router.

   EC:  BGP Extended Community.

   ES:  Ethernet Segment.  When a customer site (device or network) is
      connected to one or more PEs via a set of Ethernet links, then
      that set of links is referred to as an 'Ethernet Segment'.

   ESI:  Ethernet Segment Identifier.  A unique non-zero identifier that
      identifies an Ethernet Segment is called an 'Ethernet Segment
      Identifier'.

   ETAG:  Ethernet Tag.  An Ethernet tag identifies a particular
      broadcast domain, e.g., a VLAN.  An EVPN instance consists of one
      or more broadcast domains.

   EVI:  An EVPN instance spanning the Provider Edge (PE) devices
      participating in that EVPN.

   ICL:  Inter Chassis Link.

   IGMP:  Internet Group Management Protocol.

   IP-VRF:  A VPN Routing and Forwarding table for IP routes on a PE.
      The IP routes could be populated by EVPN and IP-VPN address
      families.  An IP-VRF is also an instantiation of a layer 3 VPN in
      a PE.

   L3AA:  All-Active Redundancy Mode for Layer 3 services.  When all PEs
      attached to an Ethernet segment are allowed to forward known
      unicast traffic to/from that Ethernet segment for a given VLAN,
      then the Ethernet segment is defined to be operating in All-Active
      redundancy mode.

   MAC-VRF:  A Virtual Routing and Forwarding table for Media Access
      Control (MAC) addresses on a PE.  A MAC-VRF is also an
      instantiation of an EVI in a PE.

   MC-LAG:  Multi-Chassis Link Aggregation Group.

   PE:  Provider Edge.

   PIM:  Protocol Independent Multicast.

   RT-2:  EVPN route type 2, i.e., MAC/IP advertisement route, as
      defined in [RFC7432].

   RT-5:  EVPN route type 5, i.e., IP Prefix route, as defined in
      Section 3 of [I-D.ietf-bess-evpn-prefix-advertisement].

   RT-7:  EVPN route type 7, i.e., Multicast Join Synch Route, as
      defined in Section 9.2 of [I-D.ietf-bess-evpn-igmp-mld-proxy].

   RT-8:  EVPN route type 8, i.e., Multicast Leave Synch Route, as
      defined in Section 9.3 of [I-D.ietf-bess-evpn-igmp-mld-proxy].

1.6.  Requirements

   1.  The multi-homing solution MUST support Layer-3 access
       interfaces.

   2.  The multi-homing solution MUST support Layer-3 access
       sub-interfaces.

   3.  The solution MUST support unicast and multicast VPN services.

   4.  The solution SHOULD support IGP synchronization.

   5.  The solution SHOULD support unicast and multicast GRT services.

   6.  The solution MUST support all-active load-balancing mode.

   7.  The solution MAY support single-active load-balancing mode.

   8.  The solution MUST support port-active load-balancing mode.

2.  Solution

   +------
   |      +-------+  .1 10.0.0.1/24
   | PE1 ||  BE1  +---------------------------------+
   |     ||  ESI-1|                                 |
   |     ||       |  .2 10.0.0.1/24                 |
   |     ||       +-------------------------+       |
   |      +-------+                         |       |
   |                                        |       |
   |      +-------+  10.0.1.1/24            |       |
   |     ||  BE2  +------------------+      |       |
   |     ||  ESI-2|                  |      |       |
   |     ||       |                 +v----+ |       |
   |     ||       |                 |CE1  | |       |
   |      +-------+                 |.2   | |       |
   +------                          |CUST1| |       |
                                    +^----+ |       |
   +------                           |     +v-----+-v----+
   |      +-------+  10.0.1.1/24     |     |SW1   |      +-->H1(.2)
   | PE2 ||  BE2  +------------------+     |CUST2 |CUST1 |
   |     ||  ESI-2|                        +^-----+-^----+
   |     ||       |                         |       |
   |     ||       |                         |       |
   |      +-------+                         |       |
   |                                        |       |
   |      +-------+  .2 10.0.0.1/24         |       |
   |     ||  BE1  +-------------------------+       |
   |     ||  ESI-1|                                 |
   |     ||       |  .1 10.0.0.1/24                 |
   |     ||       +---------------------------------+
   |      +-------+
   +------

   PE(1,2):
     CUST1-VRF: EVI 1
     CUST2-VRF: EVI 2

   SW1:
     CUST1-Subnet1: 10.0.0.2/24 (VLAN 1)
     CUST2-Subnet1: 10.0.0.2/24 (VLAN 2)

   CE1:
     CUST1-Subnet2: 10.0.1.2/24

       Figure 3: ARP/ND MAC-IP route-sync over different VRF(s)

   Consider the Figure 3 topology, where two AC-aware bundling service
   interfaces are supported.  On the first bundling interface, BE1, PE1
   and PE2 share a LAG interface with switch 1 (SW1) and serve two
   separate (but overlapping) subnets, one for customer 1 and one for
   customer 2.  CUST1 Subnet 1 is resolved over sub-interface VLAN 1
   (.1), and CUST2 Subnet 1 is resolved over sub-interface VLAN 2 (.2).

   On the second bundling interface, BE2, both PEs share a LAG
   interface with Customer Edge device 1 (CE1), which carries only a
   single customer (CUST1) subnet on the native VLAN.

   Main interface BE1 on PE1 and PE2 is shared by customers 1 and 2, and
   is represented by ESI-1.

   Main interface BE2 on PE1 and PE2 is only used by customer 1, and is
   represented by ESI-2.

   If we focus on CUST1 for now, there are 2 cases visible.

   Case 1: For CE1, if its ARP responses hash towards PE2, then PE1
   will be unaware of its presence.  For PE2 to synchronize this
   information to PE1, in addition to the CE1 IP address (10.0.1.2) and
   MAC address (m1), 2 additional unique identifiers are needed:

   1.  IP-VRF.  The CUST1 VRF is represented by EVI ID 1.

   2.  Interface.  The BE2 interface is represented by ESI-2.

   Case 2: For Host 1 (H1), if its ARP responses hash towards PE2, then
   PE1 will be unaware of its presence.  For PE2 to synchronize this
   information to PE1, in addition to the H1 IP address (10.0.0.2) and
   MAC address (m2), 3 additional unique identifiers are required:

   1.  IP-VRF.  The CUST1 VRF is represented by EVI ID 1.

   2.  Main Interface.  The BE1 interface is represented by ESI-1.

   3.  Sub-Interface.  Subnet/VLAN 1 is represented by Attachment
       Circuit ID 1.

2.1.  Mapping of L3VRF to EVPN EVI

   A separate EVPN instance will be configured for each layer-3 VRF and
   be marked for route-sync only.  Each L3-VRF will have a unique
   associated EVI ID.  The multi-homed peer PEs MUST have the same
   configured EVI to layer-3 VRF mapping.  This mapping also extends to
   the GRT, where a unique EVI ID can be assigned to support non-VPN
   layer-3 services.  Mis-configuration detection across peering PEs is
   left for further study.

   When an EVPN instance is created as route-sync only, a MAC-VRF table
   is created to store all advertised routes.  Local MAC learning may be
   disabled, as this feature does not require MAC-only RT-2
   advertisements.

   This EVI is applicable to the multi-homed peer PEs only.

   The EVPN instance will be responsible for populating the following
   layer-3 VRF tables from routes synced from the peer PE:

   o  ARP/ND

   o  IGMP

   o  IP (for customer subnets learned from IGP adjacency)

   In the example of Figure 3, route-syncs from VRF CUST1 will carry an
   EVI-RT BGP Extended Community (EC) with EVI 1, and route-syncs from
   VRF CUST2 will carry EVI 2.
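
   The identifier resolution used in Cases 1 and 2 and in the EVI
   mapping above can be sketched as follows.  This is a minimal
   illustration in Python, not part of the protocol: the names
   SyncIdentifiers, EVI_TO_VRF, ESI_TO_INTERFACE and
   resolve_local_context are hypothetical, the table contents are taken
   from Figure 3, and forming the sub-interface name by appending the
   AC-ID to the main interface name is an assumption made for the
   example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class SyncIdentifiers:
    """Identifiers carried alongside a synced adjacency (hypothetical model)."""
    evi: int                      # identifies the layer-3 VRF
    esi: str                      # identifies the main (bundle) interface
    ac_id: Optional[int] = None   # identifies the sub-interface, if any

# Both multi-homed peer PEs must be configured with the same mappings.
EVI_TO_VRF = {1: "CUST1-VRF", 2: "CUST2-VRF"}
ESI_TO_INTERFACE = {"ESI-1": "BE1", "ESI-2": "BE2"}

def resolve_local_context(ids: SyncIdentifiers) -> Optional[Tuple[str, str]]:
    """Return (VRF, interface) for a synced route, or None if not local."""
    vrf = EVI_TO_VRF.get(ids.evi)
    main_if = ESI_TO_INTERFACE.get(ids.esi)
    if vrf is None or main_if is None:
        return None
    # Assumption: sub-interface name is "<main interface>.<AC-ID>".
    interface = main_if if ids.ac_id is None else f"{main_if}.{ids.ac_id}"
    return vrf, interface

# Case 1: CE1 (10.0.1.2) needs the EVI and the ESI only.
case1 = resolve_local_context(SyncIdentifiers(evi=1, esi="ESI-2"))
# Case 2: H1 (10.0.0.2) additionally needs the Attachment Circuit ID.
case2 = resolve_local_context(SyncIdentifiers(evi=1, esi="ESI-1", ac_id=1))
print(case1, case2)
```

   Because CUST1 and CUST2 use overlapping subnets on BE1, the host
   address alone cannot select the table to populate.  In this sketch
   the EVI selects the VRF and the AC-ID selects the sub-interface.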

2.2.  Mapping for L3 Interface to ESI

   The ESI represents the L3 LAG interface between PE and CEs.  This ESI
   is signalled using RT-4 with the ES-Import Route Target as described
   in Section 8.1.1 of [RFC7432] so that the service PE peers can
   discover each other's common ES.

   In the example of Figure 3, route-syncs from interface BE1 carry an
   ES-Import RT EC with ESI 1.

2.3.  Mapping for L3 Sub-Interface to Attachment Circuit ID

   The Attachment Circuit ID represents the sub-interface subnet on the
   L3 LAG interface between PE and CEs.  The AC-ID is signalled using
   RT-2, RT-5, RT-7 and RT-8 by attaching the Attachment Circuit ID
   Extended Community as described in Section 6.1 of
   [I-D.sajassi-bess-evpn-ac-aware-bundling].

   In the example of Figure 3, route-syncs from sub-interface BE1.1
   (VLAN 1) carry an Attachment-Circuit-ID EC with ID 1.

2.4.  Route sync for ARP/ND

   This document proposes solving the issue described in Section 1.1
   using RT-2 IP/MAC route sync as described in Section 10 of [RFC7432]
   with a modification described below.

2.4.1.  Local adjacency (ARP/ND) learning

   Local ARP/ND learning will trigger an RT-2 route sync to any peer PE.
   There is no need for local MAC learning or sync over the L3
   interface, only adjacencies.  The MAC-only RT-2 route SHOULD NOT be
   advertised to the peer PE.

   Section 9.1 of [RFC7432] describes different mechanisms to learn
   adjacency routes locally.

   o  An ARP/ND Sync route MUST carry exactly one ES-Import Route Target
      extended community, the one that corresponds to the ES on which
      the ARP or ND was received.

   o  It MUST also carry exactly one EVI-RT EC, the one that corresponds
      to the EVI on which the ARP or ND was received.  The EVI maps to
      the layer-3 VRF.  See Section 9.5 of
      [I-D.ietf-bess-evpn-igmp-mld-proxy] for details on how to encode
      and construct the EVI-RT EC.

   o  In the case where the PE supports AC-aware bundling, it MUST also
      carry one Attachment Circuit ID Extended Community.  The circuit
      ID maps to the sub-interface (or subnet) on which this route was
      received.  For details on how to encode and construct this
      Extended Community, see Section 6.1 of
      [I-D.sajassi-bess-evpn-ac-aware-bundling].

2.4.2.  Remote ARP/ND learning

   When consuming a remote layer-3 RT-2 sync route:

   o  BGP only imports layer-3 sync route(s) when both the ES-Import
      and EVI-RT extended communities match those locally configured.

   o  The layer-3 VRF is derived from the matching EVI.

   o  The main interface is derived from the ESI.

   o  The VLAN / sub-interface is derived from the AC-ID provided in the
      Attachment-Circuit-ID extended community.

   o  The combination of ES-Import and EVI-RT allows BGP to import
      layer-3 sync route(s) only on PE(s) that are attached to the same
      ESI and have the respective EVI.

2.5.  Route sync for IGMP

   This document proposes solving the issue described in Section 1.2
   using RT-7 and RT-8 route sync as described by
   [I-D.ietf-bess-evpn-igmp-mld-proxy].

   A local IGMP join or leave will trigger an RT-7/8 route sync to the
   peer PE.

2.5.1.  Local IGMP Join/Leave learning

   An IGMP Join or Leave will trigger an RT-7/8 route sync to any peer
   PE.

   Section 9.1 of [RFC7432] describes different mechanisms to learn
   adjacency routes locally.

   o  A Multicast Join or Leave Sync route MUST carry exactly one
      ES-Import Route Target extended community, the one that
      corresponds to the ES on which the IGMP Join or Leave was
      received.

   o  It MUST also carry exactly one EVI-RT EC, the one that corresponds
      to the EVI on which the IGMP Join or Leave was received.  The EVI
      maps to the layer-3 VRF.  See Section 9.5 of
      [I-D.ietf-bess-evpn-igmp-mld-proxy] for details on how to encode
      and construct the EVI-RT EC.

   o  In the case where the PE supports AC-aware bundling, it MUST also
      carry one Attachment Circuit ID Extended Community.  The circuit
      ID maps to the sub-interface (or subnet) on which this route was
      received.  For details on how to encode and construct this
      Extended Community, see Section 6.1 of
      [I-D.sajassi-bess-evpn-ac-aware-bundling].

   o  The combination of ES-Import and EVI-RT allows BGP to import
      Multicast Join and Leave synch route(s) only on PE(s) that are
      attached to the same ESI and have the respective EVI.

2.5.2.  Remote IGMP Join/Leave learning

   When consuming a remote multicast RT-7 or RT-8 sync route:

   o  BGP only imports multicast sync route(s) when both the ES-Import
      and EVI-RT extended communities match those locally configured.

   o  The layer-3 VRF is derived from the matching EVI.

   o  The main interface is derived from the ESI.

   o  The VLAN / sub-interface is derived from the AC-ID provided in the
      Attachment-Circuit-ID extended community.

2.6.  Customer Subnet Route sync using Route-type(5)

   Section 3 of [I-D.ietf-bess-evpn-prefix-advertisement] provides a
   mechanism to synchronize layer-3 customer subnets between the PEs in
   order to solve the problem described in Section 1.3.

   Using Figure 2 as an example, if PE1 forms the IGP adjacency with
   the CE, it will be the only PE with knowledge of the customer subnet
   R1.  BGP on PE1 will then advertise R1 to remote PEs using L3-VPN
   signalling.

   PE2 has the same ES connection to the CE and could provide load
   balancing to remote PEs, but because it has not formed an IGP
   adjacency with the CE, it is not aware of the customer subnet R1.

   This can be solved by PE1 signaling R1 to PE2 using an RT-5 synch
   route.  BGP on PE2 can then advertise the customer subnet R1 towards
   the core as if it were locally learned through the IGP, and provide
   load-balancing from the remote PEs.
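
   The import rule shared by Sections 2.4.2 and 2.5.2 and the RT-5
   synch route described above can be sketched as follows.  This is a
   minimal illustration in Python under stated assumptions: the
   SyncRoute type and the should_import and install_rt5 functions are
   hypothetical, the ES-Import RT and EVI-RT are modelled as a plain
   ESI label and EVI ID rather than encoded extended communities, and
   the prefix used for R1 and the ESI/EVI values are invented for the
   example because Figure 2 does not specify them.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

@dataclass(frozen=True)
class SyncRoute:
    """Simplified sync route as seen by the receiving peer PE."""
    route_type: int                 # 2, 5, 7 or 8
    es_import: str                  # ES-Import RT, modelled as an ESI label
    evi_rt: int                     # EVI-RT EC, modelled as the EVI ID
    prefix: str                     # host address or customer subnet
    gateway: Optional[str] = None   # RT-5 gateway (next-hop) address

def should_import(route: SyncRoute, local_esis: Set[str],
                  evi_to_vrf: Dict[int, str]) -> bool:
    """Import only when BOTH the ES-Import and the EVI-RT match locally."""
    return route.es_import in local_esis and route.evi_rt in evi_to_vrf

def install_rt5(route: SyncRoute, local_esis: Set[str],
                evi_to_vrf: Dict[int, str],
                rib: Dict[str, Dict[str, Optional[str]]]) -> bool:
    """Install a synced customer subnet into the VRF selected by the EVI."""
    if route.route_type != 5 or not should_import(route, local_esis, evi_to_vrf):
        return False
    vrf = evi_to_vrf[route.evi_rt]
    rib.setdefault(vrf, {})[route.prefix] = route.gateway
    return True

# PE2's local configuration (assumed values).
pe2_esis = {"ESI-1"}
pe2_evi_to_vrf = {1: "CUST1-VRF"}
pe2_rib: Dict[str, Dict[str, Optional[str]]] = {}

# PE1 syncs customer subnet R1 (hypothetical prefix) with the CE as gateway.
r1 = SyncRoute(route_type=5, es_import="ESI-1", evi_rt=1,
               prefix="192.0.2.0/24", gateway="1.1.1.2")
installed = install_rt5(r1, pe2_esis, pe2_evi_to_vrf, pe2_rib)
print(installed, pe2_rib)
```

   Once the subnet is present in PE2's VRF, PE2 can advertise it
   towards the core in the same way as a locally learned IGP route.
   That advertisement step is outside this sketch.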

   The route-type(5) will carry the ESI as well as the gateway address
   GW (prefix next-hop address).

   The same mapping mechanism will be used as for ARP/ND and IGMP sync:
   the EVI determines the L3-VRF, the ESI carried with the
   route-type(5) provides the main interface, and the gateway address
   provides the next hop.

2.7.  Mapping for VLAN to ETAG

   Another possible way of signalling the VLAN/sub-interface between
   service PE peers is to use the Ethernet Tag (ETAG) ID value in RT-2,
   RT-5, RT-7 and RT-8, as opposed to the Attachment Circuit Extended
   Community.

   This will not work with VLAN-aware bundling mode, but as that is a
   layer-2 mode, this should not prevent the use of ETAGs for L3
   services.

3.  Extensions to RT-2, RT-5, RT-7 and RT-8

   This document proposes extending the use case of Extended
   Communities already defined in other drafts to the route types RT-2,
   RT-5, RT-7 and RT-8.

   o  EVI-RT Extended Community as defined in Section 9.5 of
      [I-D.ietf-bess-evpn-igmp-mld-proxy].

   o  Attachment Circuit ID Extended Community as defined in Section 6.1
      of [I-D.sajassi-bess-evpn-ac-aware-bundling].

4.  Convergence Considerations

5.  Overall Advantages

   The use of EVPN MC-LAG all-active multi-homing brings the following
   benefits to L3 BGP services:

   o  Open standards based per-interface all-active redundancy
      mechanism that eliminates the need to run ICCP and LDP.

   o  Agnostic of underlay technology (MPLS, VXLAN, SRv6) and associated
      services (L3, L3-VPN).

   o  Replaces the legacy MC-LAG ICCP-based solution, and offers the
      following additional benefit:

      *  Fast convergence with mass-withdraw is possible with EVPN,
         with no equivalent in ICCP.

   o  Requires only signalling already defined in existing EVPN RFCs
      [RFC7432] and drafts [I-D.ietf-bess-evpn-igmp-mld-proxy],
      [I-D.sajassi-bess-evpn-ac-aware-bundling], and
      [I-D.ietf-bess-evpn-prefix-advertisement].

   o  Removes the need for an ICL link.

6.  Security Considerations

   The same Security Considerations described in [RFC7432] are valid for
   this document.

7.  IANA Considerations

   There are no IANA considerations.

8.  References

8.1.  Normative References

   [I-D.ietf-bess-evpn-igmp-mld-proxy]
              Sajassi, A., Thoria, S., Mishra, M., Patel, K., Drake, J.,
              and W. Lin, "IGMP and MLD Proxy for EVPN", draft-ietf-
              bess-evpn-igmp-mld-proxy-09 (work in progress), April
              2021.

   [I-D.ietf-bess-evpn-prefix-advertisement]
              Rabadan, J., Henderickx, W., Drake, J., Lin, W., and A.
              Sajassi, "IP Prefix Advertisement in EVPN", draft-ietf-
              bess-evpn-prefix-advertisement-11 (work in progress), May
              2018.

   [I-D.sajassi-bess-evpn-ac-aware-bundling]
              Sajassi, A., Mishra, M., Thoria, S., Brissette, P.,
              Rabadan, J., and J. Drake, "AC-Aware Bundling Service
              Interface in EVPN", draft-sajassi-bess-evpn-ac-aware-
              bundling-03 (work in progress), February 2021.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

8.2.  Informative References

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
              2006.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.

   [RFC7761]  Fenner, B., Handley, M., Holbrook, H., Kouvelas, I.,
              Parekh, R., Zhang, Z., and L. Zheng, "Protocol Independent
              Multicast - Sparse Mode (PIM-SM): Protocol Specification
              (Revised)", STD 83, RFC 7761, DOI 10.17487/RFC7761, March
              2016.

Authors' Addresses

   Michael MacKenzie (editor)
   Cisco Systems

   Email: mimacken@cisco.com


   Patrice Brissette
   Cisco Systems

   Email: pbrisset@cisco.com


   Satoru Matsushima
   Softbank

   Email: satoru.matsushima@g.softbank.co.jp