INTERNET-DRAFT                                             Sami Boutros
Intended Status: Informational                                   VMware

                                                             Ali Sajassi
                                                             Samer Salam
                                                              Dennis Cai
                                                             Samir Thoria
                                                           Cisco Systems

                                                            Tapraj Singh
                                                              John Drake
                                                        Juniper Networks

                                                           Jeff Tantsura
                                                                Ericsson

Expires: April 24, 2017                                 October 21, 2016


                          VXLAN DCI Using EVPN
                   draft-boutros-bess-vxlan-evpn-02.txt

Abstract

   This document describes how Ethernet VPN (EVPN) technology can be
   used to interconnect VXLAN or NVGRE networks over an MPLS/IP
   network. The purpose is to provide intra-subnet connectivity at
   Layer 2 and control-plane separation among the interconnected VXLAN
   or NVGRE networks. The scope of the learning of host MAC addresses
   in a VXLAN or NVGRE network is limited to data-plane learning in
   this document.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.
   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1      Introduction
   1.1    Terminology
   2.     Requirements
   2.1.   Control Plane Separation among VXLAN/NVGRE Networks
   2.2    All-Active Multi-homing
   2.3    Layer 2 Extension of VNIs/VSIDs over the MPLS/IP Network
   2.4    Support for Integrated Routing and Bridging (IRB)
   3.     Solution Overview
   3.1.   Redundancy and All-Active Multi-homing
   4.     EVPN Routes
   4.1.   BGP MAC Advertisement Route
   4.2.   Ethernet Auto-Discovery Route
   4.3.   Per VPN Route Targets
   4.4    Inclusive Multicast Route
   4.5.   Unicast Forwarding
   4.6.   Handling Multicast
   4.6.2.    Multicast Stitching with Per-VNI Load Balancing
   4.6.2.1   PIM SM Operation
   5.     NVGRE
   6.     Use Cases Overview
   6.1.   Homogeneous Network DCI Interconnect Use Cases
   6.1.1.    VNI Base Mode EVPN Service Use Case
   6.1.2.    VNI Bundle Service Use Case Scenario
   6.1.3.    VNI Translation Use Case
   6.2.   Heterogeneous Network DCI Use Case Scenarios
   6.2.1.    VXLAN VLAN Interworking over EVPN Use Case Scenario
   7.     Acknowledgements
   8.     Security Considerations
   9.     IANA Considerations
   10.    References
   10.1   Normative References
   10.2   Informative References
   Authors' Addresses

1 Introduction

   [EVPN] introduces a solution for multipoint L2VPN services, with
   advanced multi-homing capabilities, using a BGP control plane over
   the core MPLS/IP network. [VXLAN] defines a tunneling scheme to
   overlay Layer 2 networks on top of Layer 3 networks. [VXLAN] allows
   for optimal forwarding of Ethernet frames with support for
   multipathing of unicast and multicast traffic. VXLAN uses UDP/IP
   encapsulation for tunneling.
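   As a concrete illustration of the encapsulation above, the minimal
   Python sketch below builds the 8-octet VXLAN header of [VXLAN]; the
   constant and helper names are illustrative assumptions, not part of
   any implementation.

      import struct

      VXLAN_UDP_PORT = 4789    # IANA-assigned UDP destination port

      def vxlan_header(vni):
          """Build the 8-octet VXLAN header: flags with the I (VNI
          present) bit set, then a 24-bit VNI, per RFC 7348."""
          assert 0 <= vni < 2**24
          return struct.pack("!B3xI", 0x08, vni << 8)

      # A tenant Ethernet frame is then tunneled as:
      #   outer IP / UDP (dst port 4789) / vxlan_header(vni) / frame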
   In this document, we discuss how Ethernet VPN (EVPN) technology can
   be used to interconnect VXLAN or NVGRE networks over an MPLS/IP
   network. This is achieved by terminating the VXLAN tunnel at the
   hand-off points, performing data-plane MAC learning of customer
   traffic, and providing intra-subnet connectivity for the customers
   at Layer 2 across the MPLS/IP core. The solution maintains control-
   plane separation among the interconnected VXLAN or NVGRE networks.
   The scope of the learning of host MAC addresses in a VXLAN or NVGRE
   network is limited to data-plane learning in this document. The
   distribution of MAC addresses in the control plane using BGP in a
   VXLAN or NVGRE network is outside the scope of this document; it is
   covered in [EVPN-OVERLAY].

1.1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   LDP:   Label Distribution Protocol.
   MAC:   Media Access Control.
   MPLS:  Multi Protocol Label Switching.
   OAM:   Operations, Administration and Maintenance.
   PE:    Provider Edge Node.
   PW:    PseudoWire.
   TLV:   Type, Length, and Value.
   VPLS:  Virtual Private LAN Services.
   VXLAN: Virtual eXtensible Local Area Network.
   VTEP:  VXLAN Tunnel End Point.
   VNI:   VXLAN Network Identifier (or VXLAN Segment ID).
   ToR:   Top of Rack switch.
   LACP:  Link Aggregation Control Protocol.

2. Requirements

2.1. Control Plane Separation among VXLAN/NVGRE Networks

   It is required to maintain control-plane separation for the underlay
   networks (e.g., among the various VXLAN/NVGRE networks) being
   interconnected over the MPLS/IP network. This ensures the following
   characteristics:

   - scalability of the IGP control plane in large deployments and
     fault-domain localization, where link or node failures in one site
     do not trigger re-convergence in remote sites.

   - scalability of multicast trees as the number of interconnected
     networks scales.

2.2 All-Active Multi-homing

   It is important to allow for all-active multi-homing of the
   VXLAN/NVGRE network to the MPLS/IP network, where traffic from a
   VTEP can arrive at any of the PEs and can be forwarded accordingly
   over the MPLS/IP network. Furthermore, traffic destined to a VTEP
   can be received over the MPLS/IP network at any of the PEs connected
   to the VXLAN/NVGRE network and be forwarded accordingly. The
   solution MUST support all-active multi-homing to a VXLAN/NVGRE
   network.

2.3 Layer 2 Extension of VNIs/VSIDs over the MPLS/IP Network

   It is required to extend the VXLAN VNIs or NVGRE VSIDs over the
   MPLS/IP network to provide intra-subnet connectivity between the
   hosts (e.g., VMs) at Layer 2.

2.4 Support for Integrated Routing and Bridging (IRB)

   The data center WAN edge node is required to support integrated
   routing and bridging in order to accommodate both inter-subnet
   routing and intra-subnet bridging for a given VNI/VSID. For example,
   inter-subnet switching is required when a remote host connected to
   an enterprise IP-VPN site wants to access an application residing on
   a VM.

3. Solution Overview

   Every VXLAN/NVGRE network that is connected to the MPLS/IP core runs
   an independent instance of the IGP control plane. Each PE
   participates in the IGP control-plane instance of its VXLAN/NVGRE
   network.

   Each PE node terminates the VXLAN or NVGRE data-plane encapsulation,
   where each VNI or VSID is mapped to a bridge domain. The PE performs
   data-plane MAC learning on the traffic received from the VXLAN/NVGRE
   network.
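   The following Python sketch illustrates, under assumed data
   structures (BridgeDomain and bridge_domains are hypothetical names,
   not from this draft), how a PE could map a decapsulated packet's VNI
   to a bridge domain and learn the inner source MAC against the
   originating VTEP:

      class BridgeDomain:
          """One bridge domain per VNI on the PE."""
          def __init__(self, vni):
              self.vni = vni
              self.mac_table = {}   # host MAC -> VTEP IP (data plane)

      bridge_domains = {}           # VNI -> BridgeDomain

      def learn_from_vxlan(vni, outer_src_ip, inner_frame):
          bd = bridge_domains.setdefault(vni, BridgeDomain(vni))
          mac_sa = inner_frame[6:12]           # inner Ethernet source MAC
          bd.mac_table[mac_sa] = outer_src_ip  # learn against source VTEP
          return bd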
   Each PE node implements EVPN or PBB-EVPN to distribute in BGP either
   the client MAC addresses learnt over the VXLAN tunnel, in the case
   of EVPN, or the PEs' B-MAC addresses, in the case of PBB-EVPN. In
   the PBB-EVPN case, client MAC addresses continue to be learnt in the
   data plane.

   Each PE node encapsulates the Ethernet frames with MPLS when sending
   the packets over the MPLS core, and with the VXLAN or NVGRE tunnel
   header when sending the packets over the VXLAN or NVGRE network.

                               +--------------+
                               |              |
    +-----+ +---------+ +----+ |     MPLS     | +----+ +---------+ +-----+
    |VTEP1|-|         |-|PE1 |-|              |-|PE3 |-|         |-|VTEP3|
    +-----+ |  VXLAN  | +----+ |              | +----+ |  VXLAN  | +-----+
    +-----+ |         | +----+ |   Backbone   | +----+ |         | +-----+
    |VTEP2|-|         |-|PE2 |-|              |-|PE4 |-|         |-|VTEP4|
    +-----+ +---------+ +----+ |              | +----+ +---------+ +-----+
                               +--------------+

    |<--- Underlay IGP --->|<--Overlay BGP-->|<--- Underlay IGP --->| CP

    |<------- VXLAN ------>|<------MPLS----->|<------- VXLAN ------>| DP

    Legend:  CP = Control Plane View
             DP = Data Plane View

        Figure 1: Interconnecting VXLAN Networks with VXLAN-EVPN

3.1. Redundancy and All-Active Multi-homing

   When a VXLAN network is multi-homed to two or more PEs, and provided
   that these PEs have the same IGP distance to a given NVE, the
   solution MUST support load-balancing of traffic between the NVE and
   the MPLS network among all the multi-homed PEs. This maximizes the
   use of the bisectional bandwidth of the VXLAN network. One of the
   main capabilities of EVPN/PBB-EVPN is the support for all-active
   multi-homing, where known unicast traffic to/from a multi-homed site
   can be forwarded by any of the PEs attached to that site. This
   ensures optimal usage of multiple paths and load balancing.
   EVPN/PBB-EVPN, through its DF election and split-horizon filtering
   mechanisms, ensures that no packet duplication or forwarding loops
   result in such scenarios. In this solution, the VXLAN network is
   treated as a multi-homed site for the purpose of EVPN operation.

   Since the context of this solution is VXLAN networks with a data-
   plane learning paradigm, it is important for the multi-homing
   mechanism to ensure stability of the MAC forwarding tables at the
   NVEs, while supporting all-active forwarding at the PEs. For
   example, in Figure 1 above, if each PE uses a distinct IP address
   for its VTEP tunnel, then for a given VNI, when an NVE learns a
   host's MAC address against the originating VTEP source address, its
   MAC forwarding table will keep flip-flopping among the VTEP
   addresses of the local PEs. This is because a flow associated with
   the same host MAC address can arrive at any of the PE devices. In
   order to ensure that there is no flip-flopping of MAC-to-VTEP
   address associations, an IP Anycast address MUST be used as the VTEP
   address on all PEs multi-homed to a given VXLAN network. The use of
   an IP Anycast address has two advantages:

   a) It prevents any flip-flopping in the forwarding tables for the
      MAC-to-VTEP associations.

   b) It enables load-balancing via ECMP for DCI traffic among the
      multi-homed PEs.
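   To make the flip-flop behavior concrete, the toy Python sketch below
   counts MAC table moves at an NVE; all addresses and names are
   hypothetical illustration, not part of the solution specification.

      def learn(mac_table, mac, vtep_ip):
          """Learn mac against vtep_ip; return True if the entry moved."""
          changed = mac_table.get(mac) != vtep_ip
          mac_table[mac] = vtep_ip
          return changed

      # Distinct VTEP addresses on PE1/PE2: the same host MAC keeps
      # moving as its flows arrive via different multi-homed PEs.
      table = {}
      moves = sum(learn(table, "00:11:22:33:44:55", pe)
                  for pe in ["192.0.2.1", "192.0.2.2", "192.0.2.1"])
      print(moves)      # 3 -> the entry flip-flops on every flow

      # Anycast VTEP address shared by PE1/PE2: one stable entry.
      table = {}
      moves = sum(learn(table, "00:11:22:33:44:55", pe)
                  for pe in ["192.0.2.10", "192.0.2.10", "192.0.2.10"])
      print(moves)      # 1 -> learnt once, never moves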
   In the baseline [EVPN] draft, all-active multi-homing is described
   for a multi-homed device (MHD) using [LACP], and single-active
   multi-homing is described for a multi-homed network (MHN) using
   [802.1Q]. In this draft, all-active multi-homing is described for a
   VXLAN MHN. This implies some changes to the filtering, which are
   described in detail in the multicast section (Section 4.6.2).

   The filtering used for BUM traffic of all-active multi-homing in
   [EVPN] is asymmetric: the BUM traffic from the MPLS/IP network
   towards the multi-homed site is filtered on the non-DF PE(s) and
   passes through the DF PE. There is no filtering of BUM traffic
   originating from the multi-homed site because of the use of Ethernet
   Link Aggregation: the MHD hashes the BUM traffic to only a single
   link. However, in this solution, because BUM traffic can arrive at
   both PEs in both the core-to-site and site-to-core directions, the
   filtering needs to be symmetric, just like the filtering of BUM
   traffic for single-active multi-homing (on a per service
   instance/VLAN basis).

4. EVPN Routes

   This solution leverages the same BGP routes and attributes defined
   in [EVPN], adapted as follows:

4.1. BGP MAC Advertisement Route

   This route and its associated modes are used to distribute the
   customer MAC addresses learnt in the data plane over the VXLAN
   tunnel, in the case of EVPN, or the provider Backbone MAC addresses,
   in the case of PBB-EVPN.

   In the case of EVPN, the Ethernet Tag ID of this route is set to
   zero for VNI-based mode, where there is a one-to-one mapping between
   a VNI and an EVI. In this case, there is no need to carry the VNI in
   the MAC advertisement route because the BD ID can be derived from
   the RT associated with this route. However, for VNI-aware bundle
   mode, where multiple VNIs can be mapped to the same EVI, the
   Ethernet Tag ID MUST be set to the VNI. At the receiving PE, the BD
   ID is derived from the combination of RT + VNI - e.g., the RT
   identifies the associated EVI on that PE, and the VNI identifies the
   corresponding BD ID within that EVI.

   In VNI-aware bundle services, the Ethernet Tag field can be set to a
   normalized value that maps to the VNI; this would make the VNI value
   of local significance within each data center. The data plane needs
   to map to this normalized VNI value and carry it in the IP/VXLAN
   packets exchanged between the DCI gateways.

4.2. Ethernet Auto-Discovery Route

   When EVPN is used, the application of this route is as specified in
   [EVPN]. However, when PBB-EVPN is used, there is no need for this
   route, per [PBB-EVPN].

4.3. Per VPN Route Targets

   VXLAN-EVPN uses the same set of route targets defined in [EVPN].

4.4 Inclusive Multicast Route

   The EVPN Inclusive Multicast route is used for auto-discovery of PE
   devices participating in the same tenant virtual network, identified
   by a VNI, over the MPLS network. It also enables the stitching of
   the IP multicast trees, which are local to each VXLAN site, with the
   Label Switched Multicast (LSM) trees of the MPLS network.

   The Inclusive Multicast Route is encoded as follows:

   - Ethernet Tag ID is set to zero for VNI-based mode and to the VNI
     for VNI-aware bundle mode.

   - Originating Router's IP Address is set to one of the PE's IP
     addresses.

   All other fields are set as defined in [EVPN].

   Please see Section 4.6, "Handling Multicast".
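   As a sketch only, the helper below assembles the Inclusive Multicast
   Ethernet Tag NLRI fields following the [EVPN] (RFC 7432) layout (RD,
   Ethernet Tag ID, IP address length in bits, originating router's IP
   address), with the Ethernet Tag set per this section; the function
   name and the raw 8-octet RD are simplifying assumptions.

      import ipaddress
      import struct

      def inclusive_multicast_nlri(rd, vni, originator_ip,
                                   vni_aware_bundle):
          ethernet_tag = vni if vni_aware_bundle else 0  # Section 4.4
          ip = ipaddress.ip_address(originator_ip)
          return (rd                                    # 8-octet RD
                  + struct.pack("!I", ethernet_tag)     # Ethernet Tag ID
                  + struct.pack("!B", ip.max_prefixlen) # addr len (bits)
                  + ip.packed)                          # originating IP

      nlri = inclusive_multicast_nlri(b"\x00" * 8, 5000, "198.51.100.1",
                                      vni_aware_bundle=True)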
4.5. Unicast Forwarding

   Host MAC addresses are learnt in the data plane from the VXLAN
   network and associated with the corresponding VTEP, identified by
   the source IP address. Host MAC addresses are learnt in the control
   plane if EVPN is implemented over the MPLS/IP core, or in the data
   plane if PBB-EVPN is implemented over the MPLS core. When host MAC
   addresses are learnt in the data plane over the MPLS/IP core (in the
   case of PBB-EVPN), they are associated with their corresponding
   B-MAC addresses.

   L2 unicast traffic destined to the VXLAN network is encapsulated
   with the IP/UDP header and the corresponding customer bridge VNI.

   L2 unicast traffic destined to the MPLS/IP network is encapsulated
   with the MPLS label.

4.6. Handling Multicast

   Each VXLAN network independently builds its P2MP or MP2MP shared
   multicast trees. A P2MP or MP2MP tree is built for one or more VNIs
   local to the VXLAN network.

   In the MPLS/IP network, multiple options are available for the
   delivery of multicast traffic:

   - Ingress replication
   - LSM with Inclusive trees
   - LSM with Aggregate Inclusive trees
   - LSM with Selective trees
   - LSM with Aggregate Selective trees

   When LSM is used, the trees are P2MP.

   The PE nodes are responsible for stitching the IP multicast trees on
   the access side to the ingress replication tunnels or LSM trees in
   the MPLS/IP core. The stitching must ensure that the following
   characteristics are maintained at all times:

   1. Avoiding Packet Duplication: In the case where the VXLAN network
      is multi-homed to multiple PE nodes, if all of the PE nodes
      forward the same multicast frame, then packet duplication arises.
      This applies to multicast traffic both from site to core and from
      core to site.

   2. Avoiding Forwarding Loops: In the case of VXLAN network multi-
      homing, the solution must ensure that a multicast frame forwarded
      by a given PE to the MPLS core is not forwarded back by another
      PE (in the same VXLAN network) to the VXLAN network of origin.
      The same applies to traffic in the core-to-site direction.

   The following approach of per-VNI load balancing can guarantee
   proper stitching that meets the above requirements.

4.6.2. Multicast Stitching with Per-VNI Load Balancing

   To set up multicast trees in the VXLAN network for DC applications,
   PIM Bidir can be of special interest because it reduces the amount
   of multicast state in the network significantly. Furthermore, it
   alleviates any special processing for the RPF check, since PIM Bidir
   doesn't require any RPF check. The RP for PIM Bidir can be any of
   the spine nodes. Multiple trees can be built (e.g., one tree rooted
   per spine node) for efficient load-balancing within the network. All
   PEs participating in the multi-homing of the VXLAN network join all
   the trees. Therefore, for a given tree, all PEs receive BUM traffic.
   DF election procedures of [EVPN] are used to ensure that only
   traffic to/from a single PE is forwarded, thus avoiding packet
   duplication and forwarding loops. For load-balancing of BUM traffic,
   when a PE or an NVE wants to send BUM traffic over the VXLAN
   network, it selects one of the trees based on its VNI and forwards
   all the traffic for that VNI on that tree.
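   A minimal sketch of the two selections just described, assuming the
   modulo-based service carving of [EVPN] (RFC 7432, Section 8.5) is
   applied with the VNI playing the role of the Ethernet Tag, as this
   section states; the function names are illustrative only.

      import ipaddress

      def select_tree(vni, trees):
          """Hash all BUM traffic of a VNI onto a single tree."""
          return trees[vni % len(trees)]

      def is_df(local_pe, candidate_pes, vni):
          """True if local_pe is the DF for this VNI among the PEs
          multi-homed to the VXLAN network (ordered by IP address)."""
          ordered = sorted(candidate_pes, key=ipaddress.ip_address)
          return ordered[vni % len(ordered)] == local_pe

      # Only the DF for VNI 5000 forwards its BUM traffic, in both the
      # site-to-core and core-to-site directions.
      pes = ["192.0.2.1", "192.0.2.2"]
      assert sum(is_df(pe, pes, 5000) for pe in pes) == 1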
   Multicast traffic from VXLAN/NVGRE is first subjected to filtering
   based on the DF election procedures of [EVPN], using the VNI as the
   Ethernet Tag. This is similar in principle to the filtering in
   [EVPN]; however, instead of a VLAN ID, the VNI is used for
   filtering, and instead of an 802.1Q frame, it is a VXLAN-
   encapsulated packet. On the DF PE, where the multicast traffic is
   allowed to be forwarded, the VNI is used to select a bridge domain.
   After the packet is decapsulated, an L2 lookup is performed based on
   the host MAC DA. It should be noted that MAC learning is performed
   in the data plane for the traffic received from the VXLAN/NVGRE
   network, and the host MAC SA is learnt against the source VTEP
   address.

   The PE nodes connected to a multi-homed VXLAN network perform BGP DF
   election to decide which PE node is responsible for forwarding
   multicast traffic associated with a given VNI. A PE forwards
   multicast traffic for a given VNI only when it is the DF for this
   VNI. This forwarding rule applies in both the site-to-core and
   core-to-site directions.

4.6.2.1 PIM SM Operation

   With PIM SM, multicast traffic from core to site could be dropped,
   since a transit router may decide that the RPF path towards the
   anycast address source is toward a PE node that is not the DF.
   Therefore, the PE nodes, whether DF or not, have to forward
   multicast traffic from core to site.

   The operation works as follows:

   Initially, the PE nodes connected to the multi-homed VXLAN network,
   as well as the VTEPs, join towards the RP for the multicast group
   for a particular VXLAN.

   When BUM traffic needs to be flooded from core to site, all the PE
   nodes connected to the multi-homed VXLAN network send PIM register
   messages to the RP. The multicast flow is identified as (anycast
   address, group) in the register message, and the source address of
   the PIM-SM register message should be a unique address on the PE
   node, not the anycast address.

   Upon receiving a register message, the RP sends a join for the
   (anycast address, group), routed towards the closest PE, which could
   be either the DF or a non-DF. This PE switches to sending traffic
   natively. Upon receiving the native traffic, the RP sends register-
   stop messages to the other PEs that keep sending register messages,
   given that only one PE will get the (anycast address, group) join.

   When the VTEPs receive traffic from the RP, they send (anycast
   address, group) joins, routed towards the PE closest to each VTEP.
   This starts native forwarding on multiple PE nodes connected to the
   VXLAN network, but each VTEP or transit router will only accept
   multicast traffic from one of the multi-homed PE nodes.

   If the PIM state times out when multicast traffic stops for a period
   of time, the next flooded packet triggers the above process again.

   It is to be noted that before the RP receives the first natively
   sent packet from one particular PE node connected to the multi-homed
   VXLAN network, packets encapsulated in the register messages from
   all PEs will be forwarded by the RP, causing duplication.

   A possible optimization is for all PE nodes connected to the multi-
   homed VXLAN network to send null-registers periodically to maintain
   the PIM state at the RP, instead of encapsulating flooded packets in
   register messages.

   The site-to-core operations for flooding BUM traffic are still
   subject to DF election per VNI, as described above.
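   The toy Python model below walks through the register exchange just
   described; it is an assumption-laden simplification for exposition
   (class and addresses are hypothetical), not protocol pseudocode.

      class ToyRP:
          """Toy RP: joins toward the anycast source on the first
          register, and register-stops PEs not on the native path."""
          def __init__(self, closest_pe):
              self.closest_pe = closest_pe  # PE that unicast routing
              self.joined = False           # to the anycast VTEP picks
          def on_register(self, pe_unique_addr, flow):
              if not self.joined:
                  self.joined = True  # send (anycast, G) join; routing
                                      # delivers it to closest_pe only
              if pe_unique_addr == self.closest_pe:
                  return "join received -> switch to native forwarding"
              return "register-stop once native traffic reaches the RP"

      rp = ToyRP(closest_pe="192.0.2.2")
      flow = ("198.51.100.10", "239.1.1.1")  # (anycast address, group)
      for pe in ["192.0.2.1", "192.0.2.2"]:
          print(pe, "->", rp.on_register(pe, flow))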
5. NVGRE

   Just like VXLAN, all of the above specification applies to NVGRE
   [NVGRE], replacing the VNI with the Virtual Subnet Identifier (VSID)
   and the VTEP with the NVGRE Endpoint.

6. Use Cases Overview

6.1. Homogeneous Network DCI Interconnect Use Cases

   This covers DCI interconnect of two or more VXLAN-based data centers
   over an MPLS-enabled EVPN core.

6.1.1. VNI Base Mode EVPN Service Use Case

   This use case handles the EVPN service where there is a one-to-one
   mapping between a VNI and an EVI. The Ethernet Tag ID of the EVPN
   BGP NLRI should be set to zero. The BD ID can be derived from the RT
   associated with the EVI/VNI.

   +---+                                                         +---+
   | H1| +---+ +-------+ +---+ +---------+ +---+ +-------+ +---+ | H3|
   | M1|-|   |-|       |-|PE1|-|         |-|PE3|-|       |-|   |-| M3|
   +---+ |   | |       | +---+ |MPLS Core| +---+ |       | |   | +---+
   +---+ |NVE| | VXLAN |       | (EVPN)  |       | VXLAN | |NVE| +---+
   | H2| | 1 | |       | +---+ |         | +---+ |       | | 2 | | H4|
   | M2|-|   |-|       |-|PE2|-|         |-|PE4|-|       |-|   |-| M4|
   +---+ +---+ +-------+ +---+ +---------+ +---+ +-------+ +---+ +---+

   +--------+------+--------+------+--------+------+--------+--------+
   |Original|VXLAN |Original|MPLS  |Original|VXLAN |Original|Original|
   |Ethernet|Header|Ethernet|Header|Ethernet|Header|Ethernet|Ethernet|
   |Frame   |      |Frame   |      |Frame   |      |Frame   |Frame   |
   +--------+------+--------+------+--------+------+--------+--------+
   |<---Data Center Site 1-->|<---EVPN Core--->|<--Data Center Site 2-->|

              Figure 2: VNI Base Service Packet Flow
              (VNI base service: one VNI mapped to one EVI)

   H1, H2, H3, and H4 are hosts, and their associated MAC addresses are
   M1, M2, M3, and M4. PE1, PE2, PE3, and PE4 are the VXLAN-EVPN
   gateways. NVE1 and NVE2 are the originators of the VXLAN-based
   network.

   When host H1 in data center site 1 communicates with H3 in data
   center site 2, H1 forms a Layer 2 packet with source IP address IP1
   and source MAC M1, destination IP IP3 and destination MAC M3
   (assuming that ARP resolution has already happened). NVE1 learns the
   source MAC and looks up the destination MAC in the bridge domain.
   Based on the MAC lookup, the frame needs to be sent to the VXLAN
   network. VXLAN encapsulation is added to the original Ethernet
   frame, and the frame is sent over the VXLAN tunnel. The frame
   arrives at PE1. PE1 (i.e., the VXLAN gateway) identifies that the
   frame is a VXLAN frame. The VXLAN header is decapsulated, and a
   destination MAC lookup is done in the bridge-domain table of the
   EVI. The lookup of the destination MAC results in the EVPN unicast
   next hop. This next hop is used for identifying the labels (tunnel
   label and service label) to be added over the EVPN core. Similar
   processing is done on the other side of the DCI.

6.1.2. VNI Bundle Service Use Case Scenario

   In the case of VNI-aware bundle service mode, multiple VNIs are
   mapped to one EVI. The Ethernet Tag ID must be set to the VNI ID in
   the EVPN BGP NLRIs. MPLS label allocation in this use case scenario
   can be done either on a per-EVI or on a per-<EVI, VNI> basis. If
   MPLS label allocation is done on a per-EVI basis, then in the data
   path there is a need to push a VLAN tag for identifying the bridge
   domain at the egress PE, so that the destination MAC address lookup
   can be done in that bridge domain.
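   The sketch below contrasts the two egress lookups just described;
   the label values, table layout, and function name are hypothetical
   illustration only.

      # Hypothetical egress label table entries:
      label_table = {
          100: {"mode": "per-evi-vni", "bd": "BD-5000"},
          200: {"mode": "per-evi", "evi": {10: "BD-10", 20: "BD-20"}},
      }

      def egress_bridge_domain(label, vlan_tag=None):
          entry = label_table[label]
          if entry["mode"] == "per-evi-vni":
              return entry["bd"]        # label alone identifies the BD
          # Per-EVI label: the VLAN tag pushed by the ingress PE selects
          # the bridge domain within the EVI before the MAC DA lookup.
          return entry["evi"][vlan_tag]

      assert egress_bridge_domain(100) == "BD-5000"
      assert egress_bridge_domain(200, vlan_tag=20) == "BD-20"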
6.1.3. VNI Translation Use Case

   +---+                                                         +---+
   | H1| +---+ +-------+ +---+ +---------+ +---+ +-------+ +---+ | H3|
   | M1|-|   |-|       |-|PE1|-|         |-|PE3|-|       |-|   |-| M3|
   +---+ |   | |       | +---+ |MPLS Core| +---+ |       | |   | +---+
   +---+ |NVE| | VXLAN |       | (EVPN)  |       | VXLAN | |NVE| +---+
   | H2| | 1 | |       | +---+ |         | +---+ |       | | 2 | | H4|
   | M2|-|   |-|       |-|PE2|-|         |-|PE4|-|       |-|   |-| M4|
   +---+ +---+ +-------+ +---+ +---------+ +---+ +-------+ +---+ +---+
         |<---VNI ID A--->|<------EVI-A------->|<---VNI ID B--->|

              Figure 3: VNI Translation Use Case Scenario

   There are two or more data center sites. These data center sites
   might use different VNI IDs for the same service. For example, a
   service uses "VNI_ID_A" at data center site 1 and "VNI_ID_B" for the
   same service at data center site 2. VNI ID A is terminated at the
   ingress EVPN PE, and VNI ID B is encapsulated at the egress EVPN PE.

6.2. Heterogeneous Network DCI Use Case Scenarios

   Data center sites are upgraded slowly, so a heterogeneous network
   DCI solution is required as a migration approach from traditional
   data centers to VXLAN-based data centers. For example, data center
   site 1 is upgraded to VXLAN, but data center sites 2 and 3 are still
   Layer 2/VLAN-based data centers. For these use cases, it is required
   to provide VXLAN-VLAN interworking over the EVPN core.

6.2.1. VXLAN VLAN Interworking Over EVPN Use Case Scenario

   The new data center site is a VXLAN-based data center site, but the
   older data center sites are still VLAN based.

   +---+                                                         +---+
   | H1| +---+ +-------+ +---+ +---------+ +---+ +-------+ +---+ | H3|
   | M1|-|   |-|       |-|PE1|-|         |-|PE3|-|       |-|   |-| M3|
   +---+ |   | |       | +---+ |MPLS Core| +---+ |  L2   | |   | +---+
   +---+ |NVE| | VXLAN |       | (EVPN)  |       |Network| |NVE| +---+
   | H2| | 1 | |       | +---+ |         | +---+ |       | | 2 | | H4|
   | M2|-|   |-|       |-|PE2|-|         |-|PE4|-|       |-|   |-| M4|
   +---+ +---+ +-------+ +---+ +---------+ +---+ +-------+ +---+ +---+
   |<---Data Center Site 1-->|<---EVPN Core--->|<--Data Center Site 2-->|

   +-----+  +------+-----+  +------+------+-----+  +------+-----+  +-----+
   |L2   |  |VXLAN |L2   |  |MPLS  |VLAN  |L2   |  |VLAN  |L2   |  |L2   |
   |Frame|  |Header|Frame|  |Header|Header|Frame|  |Header|Frame|  |Frame|
   +-----+  +------+-----+  +------+------+-----+  +------+-----+  +-----+

          Figure 4: VXLAN VLAN Interworking over EVPN Use Case

   If a service is represented via VXLAN at one data center site and
   via VLAN at other data center sites, then it is recommended to model
   the service as a VNI-based EVPN service. The BGP NLRIs will always
   advertise the Ethernet Tag ID as '0' in BGP routes. The advantage of
   this approach is that there is no requirement to do VNI
   normalization at the EVPN core. VNI ID A is terminated at the
   ingress EVPN PE, and VLAN ID B is encapsulated at the egress EVPN
   PE.

7. Acknowledgements

   The authors would like to acknowledge Wen Lin's contributions to
   this document.

8. Security Considerations

   There are no additional security aspects that need to be discussed
   here.

9. IANA Considerations

   This document has no IANA actions.

10. References

10.1 Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.
10.2 Informative References

   [802.1Q]   IEEE, "Bridges and Bridged Networks", IEEE Std 802.1Q.

   [EVPN]     Sajassi, A., Ed., et al., "BGP MPLS-Based Ethernet VPN",
              RFC 7432, February 2015.

   [EVPN-OVERLAY]
              Sajassi, A., et al., "A Network Virtualization Overlay
              Solution using EVPN", draft-ietf-bess-evpn-overlay, work
              in progress.

   [LACP]     IEEE, "Link Aggregation", IEEE Std 802.1AX.

   [NVGRE]    Garg, P., Ed., and Y. Wang, Ed., "NVGRE: Network
              Virtualization Using Generic Routing Encapsulation", RFC
              7637, September 2015.

   [PBB-EVPN] Sajassi, A., Ed., et al., "Provider Backbone Bridging
              Combined with Ethernet VPN (PBB-EVPN)", RFC 7623,
              September 2015.

   [VXLAN]    Mahalingam, M., Dutt, D., et al., "Virtual eXtensible
              Local Area Network (VXLAN): A Framework for Overlaying
              Virtualized Layer 2 Networks over Layer 3 Networks", RFC
              7348, August 2014.

Authors' Addresses

   Sami Boutros
   VMware, Inc.
   Email: sboutros@vmware.com

   Ali Sajassi
   Cisco Systems
   Email: sajassi@cisco.com

   Samer Salam
   Cisco Systems
   Email: ssalam@cisco.com

   Dennis Cai
   Cisco Systems
   Email: dcai@cisco.com

   Tapraj Singh
   Juniper Networks
   Email: tsingh@juniper.net

   John Drake
   Juniper Networks
   Email: jdrake@juniper.net

   Samir Thoria
   Cisco Systems
   Email: sthoria@cisco.com

   Jeff Tantsura
   Ericsson
   Email: jeff.tantsura@ericsson.com