BESS Workgroup                                                J. Rabadan
Internet Draft                                              S. Sathappan
Intended status: Standards Track                           W. Henderickx
                                                         S. Palislamovic
R. Shekhar                                                Alcatel-Lucent
A. Lohiya
Juniper
                                                              A. Sajassi
                                                                  D. Cai
                                                                   Cisco

Expires: January 7, 2016                                    July 6, 2015


            Interconnect Solution for EVPN Overlay networks
                  draft-ietf-bess-dci-evpn-overlay-01

Abstract

   This document describes how Network Virtualization Overlay networks
   (NVO) can be connected to a Wide Area Network (WAN) in order to
   extend the layer-2 connectivity required for some tenants. The
   solution analyzes the interaction between NVO networks running EVPN
   and other L2VPN technologies used in the WAN, such as VPLS/PBB-VPLS
   or EVPN/PBB-EVPN, and proposes a solution for the interworking
   between both.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on January 7, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Decoupled Interconnect solution for EVPN overlay networks
     2.1. Interconnect requirements
     2.2. VLAN-based hand-off
     2.3. PW-based (Pseudowire-based) hand-off
     2.4. Multi-homing solution on the GWs
     2.5. Gateway Optimizations
       2.5.1. Use of the Unknown MAC route to reduce unknown flooding
       2.5.2. MAC address advertisement control
       2.5.3. ARP flooding control
       2.5.4. Handling failures between GW and WAN Edge routers
   3. Integrated Interconnect solution for EVPN overlay networks
     3.1. Interconnect requirements
     3.2. VPLS Interconnect for EVPN-Overlay networks
       3.2.1. Control/Data Plane setup procedures on the GWs
       3.2.2. Multi-homing procedures on the GWs
     3.3. PBB-VPLS Interconnect for EVPN-Overlay networks
       3.3.1. Control/Data Plane setup procedures on the GWs
       3.3.2. Multi-homing procedures on the GWs
     3.4. EVPN-MPLS Interconnect for EVPN-Overlay networks
       3.4.1. Control Plane setup procedures on the GWs
       3.4.2. Data Plane setup procedures on the GWs
       3.4.3. Multi-homing procedures on the GWs
       3.4.4. Impact on MAC Mobility procedures
       3.4.5. Gateway optimizations
       3.4.6. Benefits of the EVPN-MPLS Interconnect solution
     3.5. PBB-EVPN Interconnect for EVPN-Overlay networks
       3.5.1. Control/Data Plane setup procedures on the GWs
       3.5.2. Multi-homing procedures on the GWs
       3.5.3. Impact on MAC Mobility procedures
       3.5.4. Gateway optimizations
     3.6. EVPN-VXLAN Interconnect for EVPN-Overlay networks
       3.6.1. Globally unique VNIs in the Interconnect network
       3.6.2. Downstream assigned VNIs in the Interconnect network
   5. Conventions and Terminology
   6. Security Considerations
   7. IANA Considerations
   8. References
     8.1. Normative References
     8.2. Informative References
   9. Acknowledgments
   10. Contributors
   11. Authors' Addresses

1. Introduction

   [EVPN-Overlays] discusses the use of EVPN as the control plane for
   Network Virtualization Overlay (NVO) networks, where VXLAN, NVGRE or
   MPLS over GRE can be used as possible data plane encapsulation
   options.

   While this model provides a scalable and efficient multi-tenant
   solution within the Data Center, it might not be easily extended to
   the WAN in some cases due to the requirements and existing deployed
   technologies. For instance, a Service Provider might have an already
   deployed (PBB-)VPLS or (PBB-)EVPN network that must be used to
   interconnect Data Centers and WAN VPN users. A Gateway (GW) function
   is required in these cases.

   This document describes an Interconnect solution for EVPN overlay
   networks, assuming that the NVO Gateway (GW) and the WAN Edge
   functions can be decoupled in two separate systems or integrated into
   the same system. The former option will be referred to as "Decoupled
   Interconnect solution" throughout the document whereas the latter one
   will be referred to as "Integrated Interconnect solution".

2. Decoupled Interconnect solution for EVPN overlay networks

   This section describes the interconnect solution when the GW and WAN
   Edge functions are implemented in different systems. Figure 1 depicts
   the reference model described in this section.

                                   +--+
                                   |CE|
                                   +--+
                                    |
                                  +----+
                             +----| PE |----+
           +---------+       |    +----+    |       +---------+
   +----+  |       +---+  +----+          +----+  +---+       |  +----+
   |NVE1|--|       |   |  |WAN |          |WAN |  |   |       |--|NVE3|
   +----+  |       |GW1|--|Edge|          |Edge|--|GW3|       |  +----+
           |       +---+  +----+          +----+  +---+       |
           |  NVO-1  |       |     WAN      |       |  NVO-2  |
           |       +---+  +----+          +----+  +---+       |
           |       |   |  |WAN |          |WAN |  |   |       |
   +----+  |       |GW2|--|Edge|          |Edge|--|GW4|       |  +----+
   |NVE2|--|       +---+  +----+          +----+  +---+       |--|NVE4|
   +----+  +---------+       |              |       +---------+  +----+
                             +--------------+

   |<-EVPN-Overlay-->|<-VLAN->|<-WAN L2VPN->|<--PW-->|<--EVPN-Overlay->|
                      hand-off               hand-off

                  Figure 1 Decoupled Interconnect model

   The following section describes the interconnect requirements for
   this model.

2.1. Interconnect requirements

   This proposed Interconnect architecture will normally be deployed in
   networks where the EVPN-Overlay and WAN providers are different
   entities and a clear demarcation is needed. The solution must observe
   the following requirements:

   o A simple connectivity hand-off must be provided between the EVPN-
     Overlay network provider and the WAN provider so that QoS and
     security enforcement are easily accomplished.

   o The solution must be independent of the L2VPN technology deployed
     in the WAN.

   o Multi-homing between GW and WAN Edge routers is required. Per-
     service load balancing MUST be supported. Per-flow load balancing
     MAY be supported but it is not a strong requirement since a
     deterministic path per service is usually required for easy QoS
     and security enforcement.

   o Ethernet OAM and Connectivity Fault Management (CFM) functions must
     be supported between the EVPN-Overlay network and the WAN network.

   o The following optimizations MAY be supported at the GW:
     + Flooding reduction of unknown unicast traffic sourced from the DC
       Network Virtualization Edge devices (NVEs).
     + Control of the WAN MAC addresses advertised to the DC.
     + ARP flooding control for the requests coming from the WAN.

2.2. VLAN-based hand-off

   In this option, the hand-off between the GWs and the WAN Edge routers
   is based on 802.1Q VLANs. This is illustrated in Figure 1 (between
   the GWs in NVO-1 and the WAN Edge routers). Each MAC-VRF in the GW is
   connected to a different VSI/MAC-VRF instance in the WAN Edge router
   by using a different C-TAG VLAN ID or a different combination of
   S/C-TAG VLAN IDs that matches at both sides.

   This option provides the best possible demarcation between the DC and
   WAN providers and it does not require control plane interaction
   between both providers.
   The disadvantage of this model is the
   provisioning overhead since the service must be mapped to an S/C-TAG
   VLAN ID combination at both the GW and the WAN Edge routers.

   In this model, the GW acts as a regular Network Virtualization Edge
   (NVE) towards the DC. Its control plane, data plane procedures and
   interactions are described in [EVPN-Overlays].

   The WAN Edge router acts as a (PBB-)VPLS or (PBB-)EVPN PE with
   attachment circuits (ACs) to the GWs. Its functions are described in
   [RFC4761][RFC4762][RFC6074] or [RFC7432][PBB-EVPN].

2.3. PW-based (Pseudowire-based) hand-off

   If MPLS can be enabled between the GW and the WAN Edge router, a PW-
   based Interconnect solution can be deployed. In this option the
   hand-off between both routers is based on FEC128-based PWs or FEC129-
   based PWs (for a greater level of network automation). Note that this
   model still provides a clear demarcation boundary between DC and WAN,
   and security/QoS policies may be applied on a per PW basis. This
   model provides better scalability than a C-TAG based hand-off and
   less provisioning overhead than a combined C/S-TAG hand-off. The
   PW-based hand-off interconnect is illustrated in Figure 1 (between
   the NVO-2 GWs and the WAN Edge routers).

   In this model, besides the usual MPLS procedures between GW and WAN
   Edge router, the GW MUST support an interworking function in each
   MAC-VRF that requires extension to the WAN:

   o If a FEC128-based PW is used between the MAC-VRF (GW) and the VSI
     (WAN Edge), the provisioning of the VCID for such PW MUST be
     supported on the MAC-VRF and must match the VCID used in the peer
     VSI at the WAN Edge router.

   o If BGP Auto-discovery [RFC6074] and FEC129-based PWs are used
     between the GW MAC-VRF and the WAN Edge VSI, the provisioning of
     the VPLS-ID MUST be supported on the MAC-VRF and must match the
     VPLS-ID used in the WAN Edge VSI.

2.4. Multi-homing solution on the GWs

   As already discussed, single-active multi-homing, i.e. per-service
   load-balancing multi-homing, MUST be supported in this type of
   interconnect. All-active multi-homing may be considered in future
   revisions of this document.

   The GWs will be provisioned with a unique ESI per WAN interconnect
   and the hand-off attachment circuits or PWs between the GW and the
   WAN Edge router will be assigned to such ESI. The ESI will be
   administratively configured on the GWs according to the procedures in
   [RFC7432]. This Interconnect ESI will be referred to as "I-ESI"
   hereafter.

   The solution (on the GWs) MUST follow the single-active multi-homing
   procedures as described in [EVPN-Overlays] for the provisioned I-ESI,
   i.e. Ethernet A-D routes per ESI and per EVI will be advertised to
   the DC NVEs. The MAC addresses learnt (in the data plane) on the
   hand-off links will be advertised with the I-ESI encoded in the ESI
   field.

2.5. Gateway Optimizations

   The following features MAY be supported on the GW in order to
   optimize the control plane and data plane in the DC.

2.5.1. Use of the Unknown MAC route to reduce unknown flooding

   The use of EVPN in the NVO networks brings a significant number of
   benefits as described in [EVPN-Overlays]. There are however some
   potential issues that SHOULD be addressed when the DC EVIs are
   connected to the WAN VPN instances.

   The first issue is the additional unknown unicast flooding created in
   the DC due to the unknown MACs existing beyond the GW. In virtualized
   DCs where all the MAC addresses are learnt in the control/management
   plane, unknown unicast flooding is significantly reduced. This is no
   longer true if the GW is connected to a layer-2 domain with data
   plane learning.

   The solution suggested in this document is based on the use of an
   "Unknown MAC route" that is advertised by the Designated Forwarder
   GW. The Unknown MAC route is a regular EVPN MAC/IP Advertisement
   route where the MAC Address Length is set to 48 and the MAC address
   to 00:00:00:00:00:00 (IP length is set to 0).

   If this procedure is used, when an EVI is created in the GWs and the
   Designated Forwarder (DF) is elected, the DF will send the Unknown
   MAC route. The NVEs supporting this concept will prune their unknown
   unicast flooding list and will only send the unknown unicast packets
   to the owner of the Unknown MAC route. Note that the I-ESI will be
   encoded in the ESI field of the NLRI so that regular multi-homing
   procedures can be applied to this unknown MAC too (e.g. backup-path).

2.5.2. MAC address advertisement control

   Another issue derived from the EVI interconnect to the WAN layer-2
   domain is the potentially massive MAC advertisement into the DC. All
   the MAC addresses learnt from the WAN on the hand-off attachment
   circuits or PWs must be advertised by BGP EVPN. Even if optimized BGP
   techniques like RT-constraint are used, the number of MAC addresses
   to advertise or withdraw (in case of failure) from the GWs can be
   difficult to control and overwhelming for the DC network, especially
   when the NVEs reside in the hypervisors.

   This document proposes the addition of administrative options so that
   the user can enable/disable the advertisement of MAC addresses learnt
   from the WAN as well as the advertisement of the Unknown MAC route
   from the DF GW. In cases where all the DC MAC addresses are learnt in
   the control/management plane, the GW may disable the advertisement of
   WAN MAC addresses. Any frame with unknown destination MAC will be
   exclusively sent to the Unknown MAC route owner(s).

2.5.3. ARP flooding control

   Another optimization mechanism, naturally provided by EVPN in the
   GWs, is the Proxy ARP/ND function. The GWs SHOULD build a Proxy
   ARP/ND cache table as per [RFC7432]. When the active GW receives an
   ARP/ND request/solicitation coming from the WAN, the GW does a Proxy
   ARP/ND table lookup and replies as long as the information is
   available in its table.

   This mechanism is especially recommended on the GWs since it protects
   the DC network from external ARP/ND-flooding storms.

2.5.4. Handling failures between GW and WAN Edge routers

   Link/PE failures MUST be handled on the GWs as specified in
   [RFC7432]. The GW detecting the failure will withdraw the EVPN routes
   as per [RFC7432].

   Individual AC/PW failures should be detected by OAM mechanisms. For
   instance:

   o If the Interconnect solution is based on a VLAN hand-off,
     802.1ag/Y.1731 Ethernet-CFM MAY be used to detect individual AC
     failures on both the GW and the WAN Edge router. An individual AC
     failure will trigger the withdrawal of the corresponding A-D per
     EVI route as well as the MACs learnt on that AC.

   o If the Interconnect solution is based on a PW hand-off, the LDP PW
     Status bits TLV MAY be used to detect individual PW failures on
     both the GW and the WAN Edge router.

3. Integrated Interconnect solution for EVPN overlay networks

   When the DC and the WAN are operated by the same administrative
   entity, the Service Provider can decide to integrate the GW and WAN
   Edge PE functions in the same router for obvious CAPEX and OPEX
   saving reasons. This is illustrated in Figure 2. Note that this model
   does not provide an explicit demarcation link between DC and WAN
   anymore.

                                   +--+
                                   |CE|
                                   +--+
                                    |
                                  +----+
                             +----| PE |----+
           +---------+       |    +----+    |       +---------+
   +----+  |       +---+                          +---+       |  +----+
   |NVE1|--|       |   |                          |   |       |--|NVE3|
   +----+  |       |GW1|                          |GW3|       |  +----+
           |       +---+                          +---+       |
           |  NVO-1  |             WAN              |  NVO-2  |
           |       +---+                          +---+       |
           |       |   |                          |   |       |
   +----+  |       |GW2|                          |GW4|       |  +----+
   |NVE2|--|       +---+                          +---+       |--|NVE4|
   +----+  +---------+       |              |       +---------+  +----+
                             +--------------+

   |<--EVPN-Overlay--->|<-----VPLS--->|<---EVPN-Overlay-->|
                       |<--PBB-VPLS-->|
       Interconnect -> |<-EVPN-MPLS-->|
          options      |<--EVPN-Ovl-->|
                       |<--PBB-EVPN-->|

                 Figure 2 Integrated Interconnect model

3.1. Interconnect requirements

   The solution must observe the following requirements:

   o The GW function must provide control plane and data plane
     interworking between the EVPN-overlay network and the L2VPN
     technology supported in the WAN, i.e. (PBB-)VPLS or (PBB-)EVPN, as
     depicted in Figure 2.

   o Multi-homing MUST be supported. Single-active multi-homing with
     per-service load balancing MUST be implemented. All-active multi-
     homing, i.e. per-flow load-balancing, MUST be implemented as long
     as the technology deployed in the WAN supports it.

   o If EVPN is deployed in the WAN, the MAC Mobility, Static MAC
     protection and other procedures (e.g. proxy-arp) described in
     [RFC7432] must be supported end-to-end.

   o Any type of inclusive multicast tree MUST be independently
     supported in the WAN as per [RFC7432], and in the DC as per [EVPN-
     Overlays].

3.2. VPLS Interconnect for EVPN-Overlay networks

3.2.1. Control/Data Plane setup procedures on the GWs

   Regular MPLS tunnels and TLDP/BGP sessions will be set up to the WAN
   PEs and RRs as per [RFC4761][RFC4762][RFC6074] and overlay tunnels
   and EVPN will be set up as per [EVPN-Overlays]. Note that different
   route-targets for the DC and for the WAN are normally required. A
   single type-1 RD per service can be used.
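
   As a rough illustration of the RD/RT note above, the sketch below
   encodes a type-1 RD (2-byte type field, 4-byte IPv4 administrator,
   2-byte assigned number, following the RD layout in RFC 4364) and
   pairs it with separate DC and WAN route-targets. The function name,
   addresses, AS numbers and dictionary keys are assumptions made for
   the example; they do not come from this document.

```python
import ipaddress
import struct


def encode_type1_rd(ipv4: str, assigned: int) -> bytes:
    """Encode a type-1 RD: type (2 bytes) + IPv4 admin (4) + number (2)."""
    if not 0 <= assigned <= 0xFFFF:
        raise ValueError("assigned number must fit in 16 bits")
    admin = int(ipaddress.IPv4Address(ipv4))
    # "!" selects network byte order without padding: 2 + 4 + 2 = 8 bytes
    return struct.pack("!HIH", 1, admin, assigned)


# Hypothetical per-service provisioning on one GW: a single RD for the
# service, but different route-targets towards the DC and the WAN.
service = {
    "rd": encode_type1_rd("192.0.2.1", 100),
    "rt_dc": "64500:1100",
    "rt_wan": "64500:2100",
}
```

   The same RD value can be attached to routes sent in either
   direction, while the two route-targets keep the DC and WAN import
   policies apart.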

   In order to support multi-homing, the GWs will be provisioned with an
   I-ESI (see section 2.4) that will be unique per interconnection. All
   the [RFC7432] procedures are still followed for the I-ESI, e.g. any
   MAC address learnt from the WAN will be advertised to the DC with the
   I-ESI in the ESI field.

   A MAC-VRF per EVI will be created in each GW. The MAC-VRF will have
   two different types of tunnel bindings instantiated in two different
   split-horizon-groups:

   o VPLS PWs will be instantiated in the "WAN split-horizon-group".

   o Overlay tunnel bindings (e.g. VXLAN, NVGRE) will be instantiated
     in the "DC split-horizon-group".

   Attachment circuits are also supported on the same MAC-VRF, but they
   will not be part of any of the above split-horizon-groups.

   Traffic received in a given split-horizon-group will never be
   forwarded to a member of the same split-horizon-group.

   As far as BUM flooding is concerned, a flooding list will be created
   with the sub-list created by the inclusive multicast routes and the
   sub-list created for VPLS in the WAN. BUM frames received from a
   local attachment circuit will be flooded to both sub-lists. BUM
   frames received from the DC or the WAN will be forwarded to the
   flooding list observing the split-horizon-group rule described above.

   Note that the GWs are not allowed to have an EVPN binding and a PW to
   the same far-end within the same MAC-VRF in order to avoid loops and
   packet duplication. This is described in [EVPN-VPLS-INTEGRATION].

   The optimization procedures described in section 2.5 can also be
   applied to this model.

3.2.2. Multi-homing procedures on the GWs

   Single-active multi-homing MUST be supported on the GWs. All-active
   multi-homing is not supported by VPLS.

   All the single-active multi-homing procedures as described by [EVPN-
   Overlays] will be followed for the I-ESI.
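
   The split-horizon-group rule from section 3.2.1 can be sketched as
   below. This is a minimal model under stated assumptions: the Binding
   class, the flood_targets function and the binding names are
   hypothetical, and DF/non-DF blocking is deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Optional

WAN_SHG = "WAN split-horizon-group"  # VPLS PWs
DC_SHG = "DC split-horizon-group"    # overlay bindings (VXLAN, NVGRE)


@dataclass(frozen=True)
class Binding:
    name: str
    shg: Optional[str]  # None models a local attachment circuit


def flood_targets(ingress: Binding, bindings: List[Binding]) -> List[str]:
    """Return the names of the bindings a BUM frame is flooded to."""
    targets = []
    for b in bindings:
        if b == ingress:
            continue  # never send a frame back where it came from
        if ingress.shg is not None and b.shg == ingress.shg:
            continue  # same split-horizon-group: do not forward
        targets.append(b.name)
    return targets


# Hypothetical MAC-VRF with two PWs, two VXLAN bindings and one AC
MAC_VRF = [
    Binding("pw-to-pe1", WAN_SHG),
    Binding("pw-to-pe2", WAN_SHG),
    Binding("vxlan-to-nve1", DC_SHG),
    Binding("vxlan-to-nve2", DC_SHG),
    Binding("ac-1", None),
]
```

   In this model a frame arriving on a PW reaches the overlay bindings
   and the AC but no other PW, while a frame arriving on the AC reaches
   both sub-lists.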

   The non-DF GW for the I-ESI will block the transmission and reception
   of all the bindings in the "WAN split-horizon-group" for BUM and
   unicast traffic.

3.3. PBB-VPLS Interconnect for EVPN-Overlay networks

3.3.1. Control/Data Plane setup procedures on the GWs

   In this case, there is no impact on the procedures described in
   [RFC7041] for the B-component. However the I-component instances
   become EVI instances with EVPN-Overlay bindings and potentially local
   attachment circuits. M MAC-VRF instances can be multiplexed into the
   same B-component instance. This option provides significant savings
   in terms of PWs to be maintained in the WAN.

   The I-ESI concept described in section 3.2.1 will also be used for
   the PBB-VPLS-based Interconnect.

   B-component PWs and I-component EVPN-overlay bindings established to
   the same far-end will be compared. The following rules will be
   observed:

   o Attempts to set up a PW between the two GWs within the B-
     component context will never be blocked.

   o If a PW exists between two GWs for the B-component and an
     attempt is made to set up an EVPN binding on an I-component linked
     to that B-component, the EVPN binding will be kept operationally
     down. Note that the BGP EVPN routes will still be valid but not
     used.

   o The EVPN binding will only be up and used as long as there is no
     PW to the same far-end in the corresponding B-component. The EVPN
     bindings in the I-components will be brought down before the PW in
     the B-component is brought up.

   The optimization procedures described in section 2.5 can also be
   applied to this Interconnect option.

3.3.2. Multi-homing procedures on the GWs

   Single-active multi-homing MUST be supported on the GWs.

   All the single-active multi-homing procedures as described by [EVPN-
   Overlays] will be followed for the I-ESI for each EVI instance
   connected to the B-component.

3.4. EVPN-MPLS Interconnect for EVPN-Overlay networks

   If EVPN for MPLS tunnels, EVPN-MPLS hereafter, is supported in the
   WAN, an end-to-end EVPN solution can be deployed. The following
   sections describe the proposed solution as well as the impact
   required on the [RFC7432] procedures.

3.4.1. Control Plane setup procedures on the GWs

   The GWs MUST establish separate BGP sessions for sending/receiving
   EVPN routes to/from the DC and to/from the WAN. Normally each GW will
   set up one (two) BGP EVPN session(s) to the DC RR(s) and one (two)
   session(s) to the WAN RR(s). The same route-distinguisher (RD) per
   MAC-VRF can be used for the EVPN service routes sent to both the WAN
   and DC RRs. In contrast, although reusing the same value is possible,
   different route-targets are expected to be handled for the same EVI
   in the WAN and the DC. Note that the EVPN service routes sent to the
   DC RRs will normally include an [RFC5512] BGP encapsulation extended
   community with a different tunnel type than the one sent to the WAN
   RRs.

   As in the other discussed options, an I-ESI will be configured on the
   GWs for multi-homing. This I-ESI represents the WAN to the DC but
   also the DC to the WAN.

   Received EVPN routes will never be reflected on the GWs but consumed
   and re-advertised (if needed):

   o Ethernet A-D routes, ES routes and Inclusive Multicast routes
     are consumed by the GWs and processed locally for the
     corresponding [RFC7432] procedures.

   o MAC/IP advertisement routes will be received, imported and if
     they become active in the MAC-VRF MAC FIB, the information will
     be re-advertised as new routes with the following fields:

     + The RD will be the GW's RD for the MAC-VRF.

     + The ESI will be set to the I-ESI.

     + The Ethernet-tag value will be kept from the received NLRI.

     + The MAC length, MAC address, IP Length and IP address values
       will be kept from the received NLRI.

     + The MPLS label will be a local 20-bit value (when sent to the
       WAN) or a DC-global 24-bit value (when sent to the DC).

     + The appropriate Route-Targets (RTs) and [RFC5512] BGP
       Encapsulation extended community will be used according to
       [EVPN-Overlays].

   The GWs will also generate the following local EVPN routes that will
   be sent to the DC and WAN, with their corresponding RTs and [RFC5512]
   BGP Encapsulation extended community values:

   o ES route for the I-ESI.

   o Ethernet A-D routes per ESI and EVI for the I-ESI. The A-D per-
     EVI routes sent to the WAN and the DC will have consistent
     Ethernet-Tag values.

   o Inclusive Multicast routes with independent tunnel type value
     for the WAN and DC. E.g. a P2MP LSP may be used in the WAN
     whereas ingress replication may be used in the DC. The routes
     sent to the WAN and the DC will have a consistent Ethernet-Tag.

   o MAC/IP advertisement routes for MAC addresses learned in local
     attachment circuits. Note that these routes will not include the
     I-ESI, but ESI=0 or different from 0 for local Ethernet Segments
     (ES). The routes sent to the WAN and the DC will have a
     consistent Ethernet-Tag.

   Assuming GW1 and GW2 are peer GWs of the same DC, each GW will
   generate two sets of local service routes: Set-DC will be sent to the
   DC RRs and will include A-D per EVI, Inclusive Multicast and MAC/IP
   routes for the DC encapsulation and RT. Set-WAN will be sent to the
   WAN RRs and will include the same routes but using the WAN RT and
   encapsulation. GW1 and GW2 will receive each other's set-DC and set-
   WAN. This is the expected behavior on GW1 and GW2 for locally
   generated routes:

   o Inclusive multicast routes: when setting up the flooding lists
     for a given MAC-VRF, each GW will include its DC peer GW only in
     the EVPN-overlay flooding list (by default) and not the EVPN-
     MPLS flooding list.
     That is, GW2 will import two Inclusive
     Multicast routes from GW1 (from set-DC and set-WAN) but will
     only consider one of the two, with the set-DC route having
     higher priority.

   o MAC/IP advertisement routes for local attachment circuits: as
     above, the GW will select only one, with the route from the
     set-DC having higher priority.

3.4.2. Data Plane setup procedures on the GWs

   The procedure explained at the end of the previous section will make
   sure there are no loops or packet duplication between the GWs of the
   same DC (for frames generated from local ACs) since only one EVPN
   binding per EVI will be set up in the data plane between the two
   nodes. That binding will by default be added to the EVPN-overlay
   flooding list.

   As for the rest of the EVPN tunnel bindings, they will be added to
   one of the two flooding lists that each GW sets up for the same MAC-
   VRF:

   o EVPN-overlay flooding list (composed of bindings to the remote
     NVEs or multicast tunnel to the NVEs).

   o EVPN-MPLS flooding list (composed of MP2P or LSM tunnel to the
     remote PEs).

   Each flooding list will be part of a separate split-horizon-group:
   the WAN split-horizon-group or the DC split-horizon-group. Traffic
   generated from a local AC can be flooded to both
   split-horizon-groups. Traffic from a binding of a split-horizon-group
   can be flooded to the other split-horizon-group and local ACs, but
   never to a member of its own split-horizon-group.

   When either GW1 or GW2 receives a BUM frame on an overlay tunnel, it
   will perform a tunnel IP SA lookup to determine if the packet's
   origin is the peer DC GW, i.e. GW2 or GW1 respectively. If the packet
   is coming from the peer DC GW, it MUST only be flooded to local
   attachment circuits and not to the WAN split-horizon-group (the
   assumption is that the peer GW would have sent the BUM packet to the
   WAN directly).

3.4.3. Multi-homing procedures on the GWs

   Single-active as well as all-active multi-homing MUST be supported.

   All the multi-homing procedures as described by [RFC7432] will be
   followed for the DF election for I-ESI, as well as the backup-path
   (single-active) and aliasing (all-active) procedures on the remote
   PEs/NVEs. The following changes are required at the GW with respect
   to the I-ESI:

   o Single-active multi-homing; assuming a WAN split-horizon-group,
     a DC split-horizon-group and local ACs on the GWs:

     + Forwarding behavior on the non-DF: the non-DF MUST NOT forward
       BUM or unicast traffic received from a given split-horizon-
       group to a member of its own split-horizon-group or to the
       other split-horizon-group. Only forwarding to local ACs is
       allowed (as long as they are not part of an ES for which the
       node is non-DF).

     + Forwarding behavior on the DF: the DF MUST NOT forward BUM or
       unicast traffic received from a given split-horizon-group to a
       member of its own split-horizon-group or to the non-DF.
       Forwarding to the other split-horizon-group (except the non-
       DF) and local ACs is allowed (as long as the ACs are not part
       of an ES for which the node is non-DF).

   o All-active multi-homing; assuming a WAN split-horizon-group, a
     DC split-horizon-group and local ACs on the GWs:

     + Forwarding behavior on the non-DF: the non-DF follows the same
       behavior as the non-DF in the single-active case but only for
       BUM traffic. Unicast traffic received from a split-horizon-
       group MUST NOT be forwarded to a member of its own split-
       horizon-group but can be forwarded normally to the other
       split-horizon-group and local ACs. If a known unicast packet
       is identified as a "flooded" packet, the procedures for BUM
       traffic MUST be followed.

     + Forwarding behavior on the DF: the DF follows the same
       behavior as the DF in the single-active case but only for BUM
       traffic.
Unicast traffic received from a split-horizon-group 677 MUST NOT be forwarded to a member of its own split-horizon- 678 group but can be forwarded normally to the other split- 679 horizon-group and local ACs. If a known unicast packet is 680 identified as a "flooded" packet, the procedures for BUM 681 traffic MUST be followed. 683 o No ESI label is required to be signaled for I-ESI for its use by 684 the non-DF in the data path. This is possible because the non-DF 685 and the DF will never forward BUM traffic (coming from a split- 686 horizon-group) to each other. 688 3.4.4. Impact on MAC Mobility procedures 690 Since the MAC/IP Advertisement routes are not reflected in the GWs 691 but rather consumed and re-advertised if active, the MAC Mobility 692 procedures can be constrained to each domain (DC or WAN) and resolved 693 within each domain. In other words, if a MAC moves within the DC, the 694 GW MUST NOT re-advertise the route to the WAN with a change in the 695 sequence number. Only when the MAC moves from the WAN domain to the 696 DC domain (or from one DC to another) the GW will re-advertise the 697 MAC with a higher sequence number in the MAC Mobility extended 698 community. In respect to the MAC Mobility procedures described in 699 [RFC7432] the MAC addresses learned from the NVEs in the local DC or 700 on the local ACs will be considered as local. 702 The sequence numbers MUST NOT be propagated between domains. The 703 sticky bit indication in the MAC Mobility extended community MUST be 704 propagated between domains. 706 3.4.5. Gateway optimizations 708 All the Gateway optimizations described in section 2.5 MAY be applied 709 to the GWs when the Interconnect is based on EVPN-MPLS. 711 In particular, the use of the Unknown MAC route, as described in 712 section 2.5.1, reduces the unknown flooding in the DC but also solves 713 some transient packet duplication issues in cases of all-active 714 multi-homing. This is explained in the following paragraph. 

716      Consider the diagram in Figure 2 for EVPN-MPLS Interconnect and all-
717      active multi-homing, and the following sequence:

719      a) MAC Address M1 is advertised from NVE3 in EVI-1.

721      b) GW3 and GW4 learn M1 for EVI-1 and re-advertise M1 to the WAN
722         with I-ESI-2 in the ESI field.

724      c) GW1 and GW2 learn M1 and install GW3/GW4 as next-hops following
725         the EVPN aliasing procedures.

727      d) Before NVE1 learns M1, a packet arrives at NVE1 with
728         destination M1. The packet is subsequently flooded.

730      e) Since both GW1 and GW2 know M1, they both forward the packet to
731         the WAN (hence creating packet duplication), unless there is an
732         indication in the data plane that the packet from NVE1 has been
733         flooded. If the GWs signal the same VNI/VSID for MAC/IP
734         advertisement and inclusive multicast routes for EVI-1, such
735         data plane indication does not exist.

737      This undesired situation can be avoided by the use of the Unknown-
738      MAC-route. If this route is used, the NVEs will prune their unknown
739      unicast flooding list, and the non-DF GW will not receive unknown
740      unicast packets; only the DF will. This solves the packet duplication
741      issue described above.

743   3.4.6. Benefits of the EVPN-MPLS Interconnect solution

744      Besides retaining the EVPN attributes between Data Centers and
745      throughout the WAN, the EVPN-MPLS Interconnect solution on the GWs
746      has some benefits compared to pure BGP EVPN RR or Inter-AS model B
747      solutions without a gateway:

749      o The solution supports the connectivity of local attachment
750        circuits on the GWs.

752      o Different data plane encapsulations can be supported in the DC
753        and the WAN.

755      o Optimized multicast solution, with independent inclusive
756        multicast trees in DC and WAN.

758      o MPLS Label aggregation: for the case where MPLS labels are
759        signaled from the NVEs for MAC/IP Advertisement routes, this
760        solution provides label aggregation. A remote PE MAY receive a
761        single label per GW MAC-VRF as opposed to a label per NVE/MAC-
762        VRF connected to the GW MAC-VRF. For instance, in Figure 2, the PE
763        would receive only one label for all the routes advertised for a
764        given MAC-VRF from GW1, as opposed to a label per NVE/MAC-VRF.

766      o The GW will not propagate MAC mobility for the MACs moving
767        within a DC. Intra-DC mobility is resolved by the NVEs in the
768        DC. The MAC Mobility procedures on the GWs are only required in
769        case of mobility across DCs.

771      o Proxy-ARP/ND function on the GWs can be leveraged to reduce
772        ARP/ND flooding in the DC and/or in the WAN.

774   3.5. PBB-EVPN Interconnect for EVPN-Overlay networks

776      [PBB-EVPN] is yet another Interconnect option. It requires the use of
777      GWs where I-components and associated B-components are EVI
778      instances.

780   3.5.1. Control/Data Plane setup procedures on the GWs

782      EVPN will run independently in both components, the I-component MAC-
783      VRF and B-component MAC-VRF. Compared to [PBB-EVPN], the DC C-MACs
784      are no longer learnt in the data plane on the GW but in the control
785      plane through EVPN running on the I-component. Remote C-MACs coming
786      from remote PEs are still learnt in the data plane. B-MACs in the B-
787      component will be assigned and advertised following the procedures
788      described in [PBB-EVPN].

790      An I-ESI will be configured on the GWs for multi-homing, but it will
791      only be used in the EVPN control plane for the I-component EVI. No
792      non-reserved ESIs will be used in the control plane of the B-
793      component EVI as per [PBB-EVPN].

795      The rest of the control plane procedures will follow [RFC7432] for
796      the I-component EVI and [PBB-EVPN] for the B-component EVI.

798      From the data plane perspective, the I-component and B-component EVPN
799      bindings established to the same far-end will be compared and the I-
800      component EVPN-overlay binding will be kept down following the rules
801      described in section 3.3.1.

803   3.5.2. Multi-homing procedures on the GWs

805      Single-active as well as all-active multi-homing MUST be supported.

807      The forwarding behavior of the DF and non-DF will be changed based on
808      the description outlined in section 3.4.3, only replacing the "WAN
809      split-horizon-group" with the B-component.

811   3.5.3. Impact on MAC Mobility procedures

813      C-MACs learnt from the B-component will be advertised in EVPN within
814      the I-component EVI scope. If the C-MAC was previously known in the
815      I-component database, EVPN would advertise the C-MAC with a higher
816      sequence number, as per [RFC7432]. From a Mobility perspective and
817      the related procedures described in [RFC7432], the C-MACs learnt from
818      the B-component are considered local.

820   3.5.4. Gateway optimizations

822      All the considerations explained in section 3.4.5 are applicable to
823      the PBB-EVPN Interconnect option.

825   3.6. EVPN-VXLAN Interconnect for EVPN-Overlay networks

827      If EVPN for Overlay tunnels is supported in the WAN and a GW function
828      is required, an end-to-end EVPN solution can be deployed. This
829      section focuses on the specific case of EVPN for VXLAN (EVPN-VXLAN
830      hereafter) and the impact on the [RFC7432] procedures.

832      This use-case assumes that NVEs need to use the VNIs or VSIDs as
833      globally unique identifiers within a data center, and a Gateway needs
834      to be employed at the edge of the data center network to translate
835      the VNI or VSID when crossing the network boundaries. This GW
836      function provides VNI and tunnel IP address translation. The use-case
837      in which local downstream assigned VNIs or VSIDs can be used (like
838      MPLS labels) is described by [EVPN-Overlays].
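
The VNI and tunnel IP address translation performed by the GW can be
sketched as follows. This is only an illustration under stated
assumptions, not part of the specification: the names Packet and
GatewayTranslator, the documentation IP addresses, and the one-to-one
mapping check are hypothetical, and the VNI values 10 and 100 are the
example values used in section 3.6.1.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class Packet:
    """Simplified overlay packet: only the outer fields the GW rewrites."""
    vni: int
    tunnel_src_ip: str
    payload: bytes = b""


class GatewayTranslator:
    """Hypothetical GW helper holding a DC-VNI to Interconnect-VNI mapping."""

    def __init__(self, gw_ip: str, dc_to_interconnect: Dict[int, int]):
        self.gw_ip = gw_ip
        self.dc_to_ic = dict(dc_to_interconnect)
        # Reverse mapping, used for traffic arriving from the Interconnect.
        self.ic_to_dc = {ic: dc for dc, ic in self.dc_to_ic.items()}
        if len(self.ic_to_dc) != len(self.dc_to_ic):
            raise ValueError("VNI mapping must be one-to-one")

    def to_interconnect(self, pkt: Packet) -> Packet:
        """DC to Interconnect: translate the VNI and replace the NVE's
        tunnel source IP with the GW's own IP address."""
        return replace(pkt, vni=self.dc_to_ic[pkt.vni],
                       tunnel_src_ip=self.gw_ip)

    def to_dc_vni(self, interconnect_vni: int) -> int:
        """Interconnect to DC: return the DC-side VNI. Tunnel source
        handling in this direction is not modelled in this sketch."""
        return self.ic_to_dc[interconnect_vni]


gw1 = GatewayTranslator("192.0.2.1", {10: 100})
from_nve = Packet(vni=10, tunnel_src_ip="198.51.100.3")
to_wan = gw1.to_interconnect(from_nve)
print(to_wan.vni, to_wan.tunnel_src_ip, gw1.to_dc_vni(100))
```

The NVEs in this sketch only ever see VNI 10; the mapping table lives
on the GW alone, which mirrors the statement that only GWs handle
multiple VNI spaces.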

840      While VNIs are globally significant within each DC, there are two
841      possibilities in the Interconnect network:

843      a) Globally unique VNIs in the Interconnect network:
844         In this case, the GWs and PEs in the Interconnect network will
845         agree on a common VNI for a given EVI. The RT to be used in the
846         Interconnect network can be auto-derived from the agreed
847         Interconnect VNI. The VNI used inside each DC MAY be the same
848         as the Interconnect VNI.

850      b) Downstream assigned VNIs in the Interconnect network:
851         In this case, the GWs and PEs MUST use the proper RTs to
852         import/export the EVPN routes. Note that even if the VNI is
853         downstream assigned in the Interconnect network, and unlike
854         option B, it only identifies the pair and
855         not the pair. The VNI used inside
856         each DC MAY be the same as the Interconnect VNI. GWs SHOULD
857         support multiple VNI spaces per EVI (one per Interconnect
858         network they are connected to).

860      In both options, NVEs inside a DC only have to be aware of a single
861      VNI space, and only GWs will handle the complexity of managing
862      multiple VNI spaces. In addition to VNI translation above, the GWs
863      will provide translation of the tunnel source IP for the packets
864      generated from the NVEs, using their own IP address. GWs will use
865      that IP address as the BGP next-hop in all the EVPN updates to the
866      Interconnect network.

868      The following sections provide more details about these two options.

870   3.6.1. Globally unique VNIs in the Interconnect network

872      Considering Figure 2, if a host H1 in NVO-1 needs to communicate with
873      a host H2 in NVO-2, and assuming that different VNIs are used in each
874      DC for the same EVI, e.g. VNI-10 in NVO-1 and VNI-20 in NVO-2, then
875      the VNIs must be translated to a common Interconnect VNI (e.g. VNI-
876      100) on the GWs. Each GW is provisioned with a VNI translation
877      mapping so that it can translate the VNI in the control plane when
878      sending BGP EVPN route updates to the Interconnect network. In other
879      words, GW1 and GW2 must be configured to map VNI-10 to VNI-100 in the
880      BGP update messages for H1's MAC route. This mapping is also used to
881      translate the VNI in the data plane in both directions, that is, VNI-
882      10 to VNI-100 when the packet is received from NVO-1 and the reverse
883      mapping from VNI-100 to VNI-10 when the packet is received from the
884      remote NVO-2 network and needs to be forwarded to NVO-1.

886      The procedures described in section 3.4 will be followed, considering
887      that the VNIs advertised/received by the GWs will be translated
888      accordingly.

890   3.6.2. Downstream assigned VNIs in the Interconnect network

892      In this case, if a host H1 in NVO-1 needs to communicate with a host
893      H2 in NVO-2, and assuming that different VNIs are used in each DC for
894      the same EVI, e.g. VNI-10 in NVO-1 and VNI-20 in NVO-2, then the VNIs
895      must be translated as in section 3.6.1. However, in this case, there
896      is no need to translate to a common Interconnect VNI on the GWs. Each
897      GW can translate the VNI received in an EVPN update to a locally
898      assigned VNI advertised to the Interconnect network. Each GW can use
899      a different Interconnect VNI, hence this VNI does not need to be
900      agreed upon by all the GWs and PEs of the Interconnect network.

902      The procedures described in section 3.4 will be followed, taking the
903      considerations above for the VNI translation.

905   5. Conventions and Terminology

907      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
908      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
909      document are to be interpreted as described in RFC 2119 [RFC2119].

911      AC: Attachment Circuit

913      BUM: it refers to the Broadcast, Unknown unicast and Multicast
914      traffic

916      DF: Designated Forwarder

918      GW: Gateway or Data Center Gateway

920      DCI: Data Center Interconnect

922      ES: Ethernet Segment

924      ESI: Ethernet Segment Identifier

926      I-ESI: Interconnect ESI defined on the GWs for multi-homing to/from
927      the WAN

929      EVI: EVPN Instance

931      MAC-VRF: it refers to an EVI instance in a particular node

933      NVE: Network Virtualization Edge

935      PW: Pseudowire

936      RD: Route-Distinguisher

938      RT: Route-Target

940      TOR: Top-Of-Rack switch

942      VNI/VSID: refers to VXLAN/NVGRE virtual identifiers

944      VSI: Virtual Switch Instance or VPLS instance in a particular PE

946   6. Security Considerations

948      This section will be completed in future versions.

950   7. IANA Considerations

952   8. References

954   8.1. Normative References

956      [RFC4761] Kompella, K., Ed., and Y. Rekhter, Ed., "Virtual Private LAN
957      Service (VPLS) Using BGP for Auto-Discovery and Signaling", RFC 4761,
958      DOI 10.17487/RFC4761, January 2007.

961      [RFC4762] Lasserre, M., Ed., and V. Kompella, Ed., "Virtual Private
962      LAN Service (VPLS) Using Label Distribution Protocol (LDP)
963      Signaling", RFC 4762, DOI 10.17487/RFC4762, January 2007.

966      [RFC6074] Rosen, E., Davie, B., Radoaca, V., and W. Luo,
967      "Provisioning, Auto-Discovery, and Signaling in Layer 2 Virtual
968      Private Networks (L2VPNs)", RFC 6074, DOI 10.17487/RFC6074, January
969      2011.

971      [RFC7041] Balus, F., Ed., Sajassi, A., Ed., and N. Bitar, Ed.,
972      "Extensions to the Virtual Private LAN Service (VPLS) Provider Edge
973      (PE) Model for Provider Backbone Bridging", RFC 7041, DOI
974      10.17487/RFC7041, November 2013.

977      [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
978      Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet
979      VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015.

982   8.2. Informative References

984      [PBB-EVPN] Sajassi et al., "PBB-EVPN", draft-ietf-l2vpn-pbb-evpn-10,
985      work in progress, May, 2015

987      [EVPN-Overlays] Sajassi-Drake et al., "A Network Virtualization
988      Overlay Solution using EVPN", draft-ietf-bess-evpn-overlay-01.txt,
989      work in progress, February, 2015

991      [EVPN-VPLS-INTEGRATION] Sajassi et al., "(PBB-)EVPN Seamless
992      Integration with (PBB-)VPLS", draft-ietf-bess-evpn-vpls-integration-
993      00.txt, work in progress, February, 2015

995   9. Acknowledgments

997      The authors would like to thank Neil Hart for their valuable comments
998      and feedback.

1000  10. Contributors

1002     In addition to the authors listed on the front page, the following
1003     co-authors have also contributed to this document:

1005     Florin Balus
1006     John Drake

1008  11. Authors' Addresses

1010     Jorge Rabadan
1011     Alcatel-Lucent
1012     777 E. Middlefield Road
1013     Mountain View, CA 94043 USA
1014     Email: jorge.rabadan@alcatel-lucent.com

1016     Senthil Sathappan
1017     Alcatel-Lucent
1018     Email: senthil.sathappan@alcatel-lucent.com

1020     Wim Henderickx
1021     Alcatel-Lucent
1022     Email: wim.henderickx@alcatel-lucent.com

1024     Senad Palislamovic
1025     Alcatel-Lucent
1026     Email: senad.palislamovic@alcatel-lucent.com

1028     Ali Sajassi
1029     Cisco
1030     Email: sajassi@cisco.com

1032     Ravi Shekhar
1033     Juniper
1034     Email: rshekhar@juniper.net

1036     Anil Lohiya
1037     Juniper
1038     Email: alohiya@juniper.net

1040     Dennis Cai
1041     Cisco Systems
1042     Email: dcai@cisco.com