idnits 2.17.1 draft-ietf-bess-evpn-prefix-advertisement-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (September 13, 2016) is 2775 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC7432' is mentioned on line 1327, but not defined == Missing Reference: 'RFC5512' is mentioned on line 1235, but not defined ** Obsolete undefined reference: RFC 5512 (Obsoleted by RFC 9012) == Missing Reference: 'RFC2119' is mentioned on line 1318, but not defined == Outdated reference: A later version (-12) exists of draft-ietf-bess-evpn-overlay-04 == Outdated reference: A later version (-15) exists of draft-ietf-bess-evpn-inter-subnet-forwarding-01 Summary: 1 error (**), 0 flaws (~~), 6 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 BESS Workgroup J. Rabadan, Ed. 3 Internet Draft W. Henderickx 4 S. Palislamovic 5 Intended status: Standards Track Nokia 7 A. 
Isaac 8 J. Drake 9 W. Lin 10 Juniper 12 A. Sajassi 13 Cisco 15 Expires: March 17, 2017 September 13, 2016 17 IP Prefix Advertisement in EVPN 18 draft-ietf-bess-evpn-prefix-advertisement-03 20 Abstract 22 EVPN provides a flexible control plane that allows intra-subnet 23 connectivity in an IP/MPLS and/or an NVO-based network. In NVO 24 networks, there is also a need for a dynamic and efficient inter- 25 subnet connectivity across Tenant Systems and End Devices that can be 26 physical or virtual and may not support their own routing protocols. 27 This document defines a new EVPN route type for the advertisement of 28 IP Prefixes and explains some use-case examples where this new route- 29 type is used. 31 Status of this Memo 33 This Internet-Draft is submitted in full conformance with the 34 provisions of BCP 78 and BCP 79. 36 Internet-Drafts are working documents of the Internet Engineering 37 Task Force (IETF), its areas, and its working groups. Note that 38 other groups may also distribute working documents as Internet- 39 Drafts. 41 Internet-Drafts are draft documents valid for a maximum of six months 42 and may be updated, replaced, or obsoleted by other documents at any 43 time. It is inappropriate to use Internet-Drafts as reference 44 material or to cite them other than as "work in progress." 45 The list of current Internet-Drafts can be accessed at 46 http://www.ietf.org/ietf/1id-abstracts.txt 48 The list of Internet-Draft Shadow Directories can be accessed at 49 http://www.ietf.org/shadow.html 51 This Internet-Draft will expire on March 17, 2017. 53 Copyright Notice 55 Copyright (c) 2016 IETF Trust and the persons identified as the 56 document authors. All rights reserved. 58 This document is subject to BCP 78 and the IETF Trust's Legal 59 Provisions Relating to IETF Documents 60 (http://trustee.ietf.org/license-info) in effect on the date of 61 publication of this document. 
Please review these documents 62 carefully, as they describe your rights and restrictions with respect 63 to this document. Code Components extracted from this document must 64 include Simplified BSD License text as described in Section 4.e of 65 the Trust Legal Provisions and are provided without warranty as 66 described in the Simplified BSD License. 68 Table of Contents 70 1. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 3 71 2. Introduction and problem statement . . . . . . . . . . . . . . 4 72 2.1 Inter-subnet connectivity requirements in Data Centers . . . 4 73 2.2 The requirement for a new EVPN route type . . . . . . . . . 7 74 3. The BGP EVPN IP Prefix route . . . . . . . . . . . . . . . . . 8 75 3.1 IP Prefix Route encoding . . . . . . . . . . . . . . . . . . 9 76 4. Benefits of using the EVPN IP Prefix route . . . . . . . . . . 11 77 5. IP Prefix overlay index use-cases . . . . . . . . . . . . . . . 12 78 5.1 TS IP address overlay index use-case . . . . . . . . . . . . 12 79 5.2 Floating IP overlay index use-case . . . . . . . . . . . . . 15 80 5.3 ESI overlay index ("Bump in the wire") use-case . . . . . . 16 81 5.4 IP-VRF-to-IP-VRF model . . . . . . . . . . . . . . . . . . . 19 82 5.4.1 Interface-less IP-VRF-to-IP-VRF model . . . . . . . . . 20 83 5.4.2 Interface-full IP-VRF-to-IP-VRF with core-facing IRB . . 23 84 5.4.3 Interface-full IP-VRF-to-IP-VRF with unnumbered 85 core-facing IRB . . . . . . . . . . . . . . . . . . . . 25 86 6. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . 28 87 7. Conventions used in this document . . . . . . . . . . . . . . . 29 88 8. Security Considerations . . . . . . . . . . . . . . . . . . . . 29 89 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 29 90 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 30 91 10.1 Normative References . . . . . . . . . . . . . . . . . . . 30 92 10.2 Informative References . . . . . . . . . . . . . . . . . . 30 94 11. 
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 30 95 12. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 30 96 13. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 30 98 1. Terminology 100 GW IP: Gateway IP Address 102 IPL: IP address length 104 IRB: Integrated Routing and Bridging interface 106 ML: MAC address length 108 NVE: Network Virtualization Edge 110 TS: Tenant System 112 VA: Virtual Appliance 114 RT-2: EVPN route type 2, i.e. MAC/IP advertisement route 116 RT-5: EVPN route type 5, i.e. IP Prefix route 118 AC: Attachment Circuit 120 Overlay index: object used in the IP Prefix route, as described in 121 this document. It can be an IP address in the tenant space or an ESI, 122 and identifies a pointer yielded by the IP route lookup at the 123 routing context importing the route. An overlay index always needs a 124 recursive route resolution on the NVE receiving the IP Prefix route, 125 so that the NVE knows to which egress NVE it needs to forward the 126 packets. 128 Underlay next-hop: IP address sent by BGP along with any EVPN route, 129 i.e. BGP next-hop. It identifies the NVE sending the route and it is 130 used at the receiving NVE as the VXLAN destination VTEP or NVGRE 131 destination end-point. 133 Ethernet NVO tunnel: it refers to Network Virtualization Overlay 134 tunnels with Ethernet payload. Examples of this type of tunnels are 135 VXLAN or nvGRE. 137 IP NVO tunnel: it refers to Network Virtualization Overlay tunnels 138 with IP payload (no MAC header in the payload). Examples of IP NVO 139 tunnels are VXLAN GPE or MPLSoGRE (both with IP payload). 141 2. Introduction and problem statement 143 Inter-subnet connectivity is required for certain tenants within the 144 Data Center. [EVPN-INTERSUBNET] defines some fairly common inter- 145 subnet forwarding scenarios where TSes can exchange packets with TSes 146 located in remote subnets. 
In order to meet this requirement, 147 [EVPN-INTERSUBNET] describes how MAC/IPs encoded in TS RT-2 routes 148 are not only used to populate MAC-VRF and overlay ARP tables, but 149 also IP-VRF tables with the encoded TS host routes (/32 or /128). In 150 some cases, EVPN may advertise IP Prefixes and therefore provide 151 aggregation in the IP-VRF tables, as opposed to programming individual 152 host routes. This document complements the scenarios described in 153 [EVPN-INTERSUBNET] and defines how EVPN may be used to advertise IP 154 Prefixes. 156 Section 2.1 describes the inter-subnet connectivity requirements in 157 Data Centers. Section 2.2 explains why a new EVPN route type is 158 required for IP Prefix advertisements. Once the need for a new EVPN 159 route type is justified, sections 3, 4 and 5 will describe this route 160 type and how it is used in some specific use cases. 162 2.1 Inter-subnet connectivity requirements in Data Centers 164 [RFC7432] is used as the control plane for a Network Virtualization 165 Overlay (NVO3) solution in Data Centers (DC), where Network 166 Virtualization Edge (NVE) devices can be located in Hypervisors or 167 TORs, as described in [EVPN-OVERLAY]. 169 If we use the term Tenant System (TS) to designate a physical or 170 virtual system identified by MAC and IP addresses, and connected to 171 an EVPN instance, the following considerations apply: 173 o The Tenant Systems may be Virtual Machines (VMs) that generate 174 traffic from their own MAC and IP. 176 o The Tenant Systems may be Virtual Appliance entities (VAs) that 177 forward traffic to/from IP addresses of different End Devices 178 sitting behind them. 180 o These VAs can be firewalls, load balancers, NAT devices, other 181 appliances or virtual gateways with virtual routing instances. 183 o These VAs do not have their own routing protocols and hence 184 rely on the EVPN NVEs to advertise the routes on their behalf. 
186 o In all these cases, the VA will forward traffic to the Data 187 Center using its own source MAC but the source IP will be the 188 one associated with the End Device sitting behind it or a 189 translated IP address (part of a public NAT pool) if the VA is 190 performing NAT. 192 o Note that the same IP address could exist behind two of these 193 TSes. One example of this would be certain appliance resiliency 194 mechanisms, where a virtual IP or floating IP can be owned by 195 one of the two VAs running the resiliency protocol (the master 196 VA). VRRP is one particular example of this. Another example 197 is multi-homed subnets, i.e. the same subnet is connected to 198 two VAs. 200 o Although these VAs provide IP connectivity to VMs and subnets 201 behind them, they do not always have their own IP interface 202 connected to the EVPN NVE, e.g. layer-2 firewalls are examples 203 of VAs not supporting IP interfaces. 205 The following figure illustrates some of the examples described 206 above. 
208 NVE1 209 +-----------+ 210 TS1(VM)--|(MAC-VRF10)|-----+ 211 IP1/M1 +-----------+ | DGW1 212 +---------+ +-------------+ 213 | |----|(MAC-VRF10) | 214 SN1---+ NVE2 | | | IRB1\ | 215 | +-----------+ | | | (IP-VRF)|---+ 216 SN2---TS2(VA)--|(MAC-VRF10)|-| | +-------------+ _|_ 217 | IP2/M2 +-----------+ | VXLAN/ | ( ) 218 IP4---+ <-+ | nvGRE | DGW2 ( WAN ) 219 | | | +-------------+ (___) 220 vIP23 (floating) | |----|(MAC-VRF10) | | 221 | +---------+ | IRB2\ | | 222 SN1---+ <-+ NVE3 | | | | (IP-VRF)|---+ 223 | IP3/M3 +-----------+ | | | +-------------+ 224 SN3---TS3(VA)--|(MAC-VRF10)|---+ | | 225 | +-----------+ | | 226 IP5---+ | | 227 | | 228 NVE4 | | NVE5 +--SN5 229 +---------------------+ | | +-----------+ | 230 IP6------|(MAC-VRF1) | | +-|(MAC-VRF10)|--TS4(VA)--SN6 231 | \ | | +-----------+ | 232 | (IP-VRF) |--+ ESI4 +--SN7 233 | / \IRB3 | 234 |---|(MAC-VRF2)(MAC-VRF10)| 235 SN4| +---------------------+ 237 Figure 1 DC inter-subnet use-cases 239 Where: 241 NVE1, NVE2, NVE3, NVE4, NVE5, DGW1 and DGW2 share the same EVI for a 242 particular tenant. EVI-10 is comprised of the collection of MAC-VRF10 243 instances defined in all the NVEs. All the hosts connected to EVI-10 244 belong to the same IP subnet. The hosts connected to EVI-10 are 245 listed below: 247 o TS1 is a VM that generates/receives traffic from/to IP1, where 248 IP1 belongs to the EVI-10 subnet. 250 o TS2 and TS3 are Virtual Appliances (VA) that generate/receive 251 traffic from/to the subnets and hosts sitting behind them 252 (SN1, SN2, SN3, IP4 and IP5). Their IP addresses (IP2 and IP3) 253 belong to the EVI-10 subnet and they can also generate/receive 254 traffic. When these VAs receive packets destined to their own 255 MAC addresses (M2 and M3) they will route the packets to the 256 proper subnet or host. These VAs do not support routing 257 protocols to advertise the subnets connected to them and can 258 move to a different server and NVE when the Cloud Management 259 System decides to do so. 
These VAs may also support redundancy 260 mechanisms for some subnets, similar to VRRP, where a floating 261 IP is owned by the master VA and only the master VA forwards 262 traffic to a given subnet. E.g.: vIP23 in figure 1 is a 263 floating IP that can be owned by TS2 or TS3 depending on which 264 one is the master. Only the master will forward traffic to SN1. 266 o Integrated Routing and Bridging interfaces IRB1, IRB2 and IRB3 267 have their own IP addresses that belong to the EVI-10 subnet 268 too. These IRB interfaces connect the EVI-10 subnet to Virtual 269 Routing and Forwarding (IP-VRF) instances that can route the 270 traffic to other connected subnets for the same tenant (within 271 the DC or at the other end of the WAN). 273 o TS4 is a layer-2 VA that provides connectivity to subnets SN5, 274 SN6 and SN7, but does not have an IP address itself in the 275 EVI-10. TS4 is connected to a physical port on NVE5 assigned 276 to Ethernet Segment Identifier 4. 278 All the above DC use cases require inter-subnet forwarding and 279 therefore the individual host routes and subnets: 281 a) MUST be advertised from the NVEs (since VAs and VMs do not run 282 routing protocols) and 283 b) MAY be associated to an overlay index that can be a VA IP address, 284 a floating IP address or an ESI. 286 2.2 The requirement for a new EVPN route type 288 [RFC7432] defines a MAC/IP route (also referred to as RT-2) where a MAC 289 address can be advertised together with an IP address length (IPL) 290 and IP address (IP). While a variable IPL might have been used to 291 indicate the presence of an IP prefix in a route type 2, there are 292 several specific use cases in which using this route type to deliver 293 IP Prefixes is not suitable. 295 One example of such use cases is the "floating IP" example described 296 in section 2.1. 
In this example we need to decouple the advertisement 297 of the prefixes from the advertisement of the floating IP (vIP23 in 298 figure 1) and MAC associated to it, otherwise the solution becomes 299 highly inefficient and does not scale. 301 E.g.: if we are advertising 1k prefixes from M2 (using RT-2) and the 302 floating IP owner changes from M2 to M3, we would need to withdraw 1k 303 routes from M2 and re-advertise 1k routes from M3. However, if we use 304 a separate route type, we can advertise the 1k routes associated to 305 the floating IP address (vIP23) and only one RT-2 for advertising the 306 ownership of the floating IP, i.e. vIP23 and M2 in the route type 2. 307 When the floating IP owner changes from M2 to M3, a single RT-2 308 withdraw/update is required to indicate the change. The remote DGW 309 will not change any of the 1k prefixes associated to vIP23, but will 310 only update the ARP resolution entry for vIP23 (now pointing at M3). 312 Other reasons to decouple the IP Prefix advertisement from the MAC/IP 313 route are listed below: 315 o Clean identification, operation and troubleshooting of IP 316 Prefixes, not subject to interpretation and independent of the 317 IPL and the IP value. E.g.: a default IP route 0.0.0.0/0 must 318 always be easily and clearly distinguished from the absence of 319 IP information. 321 o MAC address information must not be compared by BGP when 322 selecting two IP Prefix routes. If IP Prefixes were to be 323 advertised using MAC/IP routes, the MAC information would 324 always be present and part of the route key. 326 o IP Prefix routes must not be subject to MAC/IP route 327 procedures such as MAC mobility or aliasing. Prefixes 328 advertised from two different ESIs do not mean mobility; MACs 329 advertised from two different ESIs do mean mobility. Similarly, 330 load balancing for IP prefixes is achieved through IP 331 mechanisms such as ECMP, and not through MAC route mechanisms 332 such as aliasing. 
334 o NVEs that do not require processing IP Prefixes must have an 335 easy way to identify an update with an IP Prefix and ignore 336 it, rather than processing the MAC/IP route to find out only 337 later that it carries a Prefix that must be ignored. 339 The following sections describe how EVPN is extended with a new route 340 type for the advertisement of IP prefixes and how this route is used 341 to address the current and future inter-subnet connectivity 342 requirements existing in the Data Center. 344 3. The BGP EVPN IP Prefix route 346 The current BGP EVPN NLRI as defined in [RFC7432] is shown below: 348 +-----------------------------------+ 349 | Route Type (1 octet) | 350 +-----------------------------------+ 351 | Length (1 octet) | 352 +-----------------------------------+ 353 | Route Type specific (variable) | 354 +-----------------------------------+ 356 Where the route type field can contain one of the following specific 357 values: 359 + 1 - Ethernet Auto-Discovery (A-D) route 361 + 2 - MAC/IP advertisement route 363 + 3 - Inclusive Multicast Route 365 + 4 - Ethernet Segment Route 367 This document defines an additional route type that will be used for 368 the advertisement of IP Prefixes: 370 + 5 - IP Prefix Route 372 The support for this new route type is OPTIONAL. 374 Since this new route type is OPTIONAL, an implementation not 375 supporting it MUST ignore the route, based on the unknown route type 376 value. 378 The detailed encoding of this route and associated procedures are 379 described in the following sections. 
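The rule that an implementation ignores a route whose type it does not support can be sketched as follows. This is only an illustration, assuming the generic NLRI layout shown above (1-octet Route Type, 1-octet Length, variable body); the function names iter_evpn_nlri and supported_routes are hypothetical and are not defined by this document or by [RFC7432].

```python
from typing import Iterator, List, Set, Tuple

# Route types listed in this section; 5 is the IP Prefix route.
ROUTE_TYPE_IP_PREFIX = 5


def iter_evpn_nlri(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (route_type, body) for each EVPN NLRI found in buf."""
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < 2:
            raise ValueError("truncated EVPN NLRI header")
        route_type = buf[offset]        # Route Type (1 octet)
        length = buf[offset + 1]        # Length (1 octet)
        body = buf[offset + 2:offset + 2 + length]
        if len(body) != length:
            raise ValueError("truncated EVPN NLRI body")
        yield route_type, body
        offset += 2 + length


def supported_routes(buf: bytes, supported: Set[int]) -> List[Tuple[int, bytes]]:
    """Keep only routes whose type is supported; skip the rest unparsed."""
    return [(t, b) for t, b in iter_evpn_nlri(buf) if t in supported]


# Hypothetical example: an NVE that only supports route types 1 to 4
# receives one type 5 route followed by one type 2 route.
example = bytes([ROUTE_TYPE_IP_PREFIX, 2, 0xAA, 0xBB, 2, 1, 0xCC])
kept = supported_routes(example, {1, 2, 3, 4})
```

Because the Length octet is read before the body is interpreted, the unsupported route can be skipped without knowing its internal encoding.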
381 3.1 IP Prefix Route encoding 383 An IP Prefix advertisement route NLRI consists of the following 384 fields: 386 +---------------------------------------+ 387 | RD (8 octets) | 388 +---------------------------------------+ 389 |Ethernet Segment Identifier (10 octets)| 390 +---------------------------------------+ 391 | Ethernet Tag ID (4 octets) | 392 +---------------------------------------+ 393 | IP Prefix Length (1 octet) | 394 +---------------------------------------+ 395 | IP Prefix (4 or 16 octets) | 396 +---------------------------------------+ 397 | GW IP Address (4 or 16 octets) | 398 +---------------------------------------+ 399 | MPLS Label (3 octets) | 400 +---------------------------------------+ 402 Where: 404 o RD, Ethernet Tag ID and MPLS Label fields will be used as 405 defined in [RFC7432] and [EVPN-OVERLAY]. 407 o The Ethernet Segment Identifier will be a non-zero 10-byte 408 identifier if the ESI is used as an overlay index. It will be 409 zero otherwise. 411 o The IP Prefix Length can be set to a value between 0 and 32 412 (bits) for ipv4 and between 0 and 128 for ipv6. 414 o The IP Prefix will be a 32 or 128-bit field (ipv4 or ipv6). 416 o The GW IP (Gateway IP Address) will be a 32 or 128-bit field 417 (ipv4 or ipv6), and will encode an overlay IP index for the IP 418 Prefixes. The GW IP field SHOULD be zero if it is not used as 419 an overlay index. 421 o The MPLS Label field is encoded as 3 octets, where the high- 422 order 20 bits contain the label value. The value SHOULD be 423 null when the IP Prefix route is used for a recursive lookup 424 resolution. 426 o The total route length will indicate the type of prefix (ipv4 427 or ipv6) and the type of GW IP address (ipv4 or ipv6). Note 428 that the IP Prefix + the GW IP should have a length of either 429 64 or 256 bits, but never 160 bits (ipv4 and ipv6 mixed values 430 are not allowed). 
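The field layout above can be illustrated with a small encoder and decoder for the route-type-specific part of the NLRI. This is a sketch under stated assumptions, not a reference implementation: the class name IPPrefixRoute and its methods are hypothetical, the RD and ESI are treated as opaque byte strings, and the body lengths of 34 and 58 octets are simply the sums of the field sizes listed above for IPv4 and IPv6.

```python
import ipaddress
import struct
from dataclasses import dataclass
from typing import Union

IPNetwork = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]
IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


@dataclass
class IPPrefixRoute:
    rd: bytes            # Route Distinguisher, 8 octets (opaque here)
    esi: bytes           # Ethernet Segment Identifier, 10 octets
    eth_tag: int         # Ethernet Tag ID, 4 octets
    prefix: IPNetwork    # IP Prefix Length + IP Prefix
    gw_ip: IPAddress     # GW IP Address (zero when not an overlay index)
    label: int           # 20-bit label value

    def encode(self) -> bytes:
        if len(self.rd) != 8 or len(self.esi) != 10:
            raise ValueError("RD must be 8 octets and ESI 10 octets")
        if self.prefix.version != self.gw_ip.version:
            raise ValueError("mixed IPv4/IPv6 prefix and GW IP not allowed")
        if any(self.esi) and int(self.gw_ip) != 0:
            raise ValueError("at most one overlay index (ESI or GW IP)")
        if not 0 <= self.label < 2 ** 20:
            raise ValueError("label must fit in 20 bits")
        # The label value occupies the high-order 20 bits of 3 octets.
        label_field = (self.label << 4).to_bytes(3, "big")
        return (self.rd + self.esi
                + struct.pack("!IB", self.eth_tag, self.prefix.prefixlen)
                + self.prefix.network_address.packed
                + self.gw_ip.packed + label_field)

    @classmethod
    def decode(cls, body: bytes) -> "IPPrefixRoute":
        # The total length tells IPv4 (34 octets) from IPv6 (58 octets).
        if len(body) == 34:
            alen, net, adr = 4, ipaddress.IPv4Network, ipaddress.IPv4Address
        elif len(body) == 58:
            alen, net, adr = 16, ipaddress.IPv6Network, ipaddress.IPv6Address
        else:
            raise ValueError("unexpected IP Prefix route length")
        eth_tag, plen = struct.unpack("!IB", body[18:23])
        prefix = net((body[23:23 + alen], plen))
        gw_ip = adr(body[23 + alen:23 + 2 * alen])
        label = int.from_bytes(body[-3:], "big") >> 4
        return cls(body[0:8], body[8:18], eth_tag, prefix, gw_ip, label)


# Hypothetical values: SN1 = 10.1.1.0/24 with GW IP overlay index 192.0.2.2.
route = IPPrefixRoute(bytes(8), bytes(10), 0,
                      ipaddress.ip_network("10.1.1.0/24"),
                      ipaddress.ip_address("192.0.2.2"), 0)
wire = route.encode()
```

The encoder rejects a route that carries both a non-zero ESI and a non-zero GW IP, which mirrors the "single overlay index at most" rule stated in this section.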
432 The Eth-Tag ID, IP Prefix Length and IP Prefix will be part of the 433 route key used by BGP to compare routes. The rest of the fields will 434 not be part of the route key. 436 The route will contain a single overlay index at most, i.e. if the 437 ESI field is different from zero, the GW IP field will be zero, and 438 vice versa. The following table shows the different inter-subnet use- 439 cases described in this document and the corresponding coding of the 440 overlay index in the route type 5 (RT-5). The IP-VRF-to-IP-VRF or IRB 441 forwarding on NVEs case is a special use-case, where there may be no 442 need for overlay index, since the actual next-hop is given by the BGP 443 next-hop. When an overlay index is present in the RT-5, the receiving 444 NVE will need to perform a recursive route resolution to find out to 445 which egress NVE to forward the packets. 447 +----------------------------+--------------------------------------+ 448 | Use-case | Overlay Index in the RT-5 BGP update | 449 +----------------------------+--------------------------------------+ 450 | TS IP address | Overlay GW IP Address | 451 | Floating IP address | Overlay GW IP Address | 452 | "Bump in the wire" | ESI | 453 | IP-VRF-to-IP-VRF | Overlay GW IP, MAC or N/A | 454 +----------------------------+--------------------------------------+ 456 4. Benefits of using the EVPN IP Prefix route 458 This section clarifies the different functions accomplished by the 459 EVPN RT-2 and RT-5 routes, and provides a list of benefits derived 460 from using a separate route type for the advertisement of IP Prefixes 461 in EVPN. 463 [RFC7432] describes the content of the BGP EVPN RT-2 specific NLRI, 464 i.e. MAC/IP Advertisement Route, where the IP address length (IPL) 465 and IP address (IP) of a specific advertised MAC are encoded. The 466 subject of the MAC advertisement route is the MAC address (M) and MAC 467 address length (ML) encoded in the route. 
The MAC mobility and other 468 procedures are defined around that MAC address. The IP address 469 information carries the host IP address required for the ARP 470 resolution of the MAC according to [RFC7432] and the host route to be 471 programmed in the IP-VRF [EVPN-INTERSUBNET]. 473 The BGP EVPN route type 5 defined in this document, i.e. IP Prefix 474 Advertisement route, decouples the advertisement of IP prefixes from 475 the advertisement of any MAC address related to it. This brings some 476 major benefits to NVO-based networks where certain inter-subnet 477 forwarding scenarios are required. Some of those benefits are: 479 a) Upon receiving a route type 2 or type 5, an egress NVE can easily 480 distinguish MACs and IPs from IP Prefixes. E.g. an IP prefix with 481 IPL=32 being advertised from two different ingress NVEs (as RT-5) 482 can be identified as such and be imported in the designated 483 routing context as two ECMP routes, as opposed to two MACs 484 competing for the same IP. 486 b) Similarly, upon receiving a route, an ingress NVE not supporting 487 processing of IP Prefixes can easily ignore the update, based on 488 the route type. 490 c) A MAC route includes the ML, M, IPL and IP in the route key that 491 is used by BGP to compare routes, whereas for IP Prefix routes, 492 only IPL and IP (as well as Ethernet Tag ID) are part of the route 493 key. Advertised IP Prefixes are imported into the designated 494 routing context, where there is no MAC information associated to 495 IP routes. In the example illustrated in figure 1, subnet SN1 496 should be advertised by NVE2 and NVE3 and interpreted by DGW1 as 497 the same route coming from two different next-hops, regardless of 498 the MAC address associated to TS2 or TS3. This is easily 499 accomplished in the RT-5 by including only the IP information in 500 the route key. 
502 d) By decoupling the MAC from the IP Prefix advertisement procedures, 503 we can leave the IP Prefix advertisements out of the MAC mobility 504 procedures defined in [RFC7432] for MACs. In addition, this allows 505 us to have an indirection mechanism for IP Prefixes advertised 506 from a MAC/IP that can move between hypervisors. E.g. if there are 507 1,000 prefixes sitting behind TS2 (figure 1), NVE2 will advertise 508 all those prefixes in RT-5 routes associated to the overlay index 509 IP2. Should TS2 move to a different NVE, a single MAC/IP 510 advertisement route withdraw for the M2/IP2 route from NVE2 will 511 invalidate the 1,000 prefixes, as opposed to having to wait for each 512 individual prefix to be withdrawn. This may be easily accomplished 513 by using IP Prefix routes that are not tied to a MAC address, and 514 using a different MAC/IP route to advertise the location and 515 resolution of the overlay index to a MAC address. 517 5. IP Prefix overlay index use-cases 519 The IP Prefix route can use a GW IP or an ESI as an overlay index as 520 well as no overlay index whatsoever. This section describes some use- 521 cases for these index types. 523 5.1 TS IP address overlay index use-case 525 The following figure illustrates an example of inter-subnet 526 forwarding for subnets sitting behind Virtual Appliances (on TS2 and 527 TS3). 
529 SN1---+ NVE2 DGW1 530 | +-----------+ +---------+ +-------------+ 531 SN2---TS2(VA)--|(MAC-VRF10)|-| |----|(MAC-VRF10) | 532 | IP2/M2 +-----------+ | | | IRB1\ | 533 IP4---+ | | | (IP-VRF)|---+ 534 | | +-------------+ _|_ 535 | VXLAN/ | ( ) 536 | nvGRE | DGW2 ( WAN ) 537 SN1---+ NVE3 | | +-------------+ (___) 538 | IP3/M3 +-----------+ | |----|(MAC-VRF10) | | 539 SN3---TS3(VA)--|(MAC-VRF10)|-| | | IRB2\ | | 540 | +-----------+ +---------+ | (IP-VRF)|---+ 541 IP5---+ +-------------+ 543 Figure 2 TS IP address use-case 545 An example of inter-subnet forwarding between subnet SN1/24 and a 546 subnet sitting in the WAN is described below. NVE2, NVE3, DGW1 and 547 DGW2 are running BGP EVPN. TS2 and TS3 do not support routing 548 protocols, only a static route to forward the traffic to the WAN. 550 (1) NVE2 advertises the following BGP routes on behalf of TS2: 552 o Route type 2 (MAC/IP route) containing: ML=48, M=M2, IPL=32, 553 IP=IP2 and [RFC5512] BGP Encapsulation Extended Community with 554 the corresponding Tunnel-type. 556 o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1, 557 ESI=0, GW IP address=IP2. 559 (2) NVE3 advertises the following BGP routes on behalf of TS3: 561 o Route type 2 (MAC/IP route) containing: ML=48, M=M3, IPL=32, 562 IP=IP3 (and BGP Encapsulation Extended Community). 564 o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1, 565 ESI=0, GW IP address=IP3. 567 (3) DGW1 and DGW2 import both received routes based on the 568 route-targets: 570 o Based on the MAC-VRF10 route-target in DGW1 and DGW2, the 571 MAC/IP route is imported and M2 is added to the MAC-VRF10 572 along with its corresponding tunnel information. For instance, 573 if VXLAN is used, the VTEP will be derived from the MAC/IP 574 route BGP next-hop (underlay next-hop) and VNI from the MPLS 575 Label1 field. IP2 - M2 is added to the ARP table. 
577 o Based on the MAC-VRF10 route-target in DGW1 and DGW2, the IP 578 Prefix route is also imported and SN1/24 is added to the IP- 579 VRF with overlay index IP2 pointing at the local MAC-VRF10. 580 Should ECMP be enabled in the IP-VRF, SN1/24 would also be 581 added to the routing table with overlay index IP3. 583 (4) When DGW1 receives a packet from the WAN with destination IPx, 584 where IPx belongs to SN1/24: 586 o A destination IP lookup is performed on the DGW1 IP-VRF 587 routing table and overlay index=IP2 is found. Since IP2 is an 588 overlay index a recursive route resolution is required for 589 IP2. 591 o IP2 is resolved to M2 in the ARP table, and M2 is resolved to 592 the tunnel information given by the MAC-VRF FIB (e.g. remote 593 VTEP and VNI for the VXLAN case). 595 o The IP packet destined to IPx is encapsulated with: 597 . Source inner MAC = IRB1 MAC. 599 . Destination inner MAC = M2. 601 . Tunnel information provided by the MAC-VRF (VNI, VTEP IPs 602 and MACs for the VXLAN case). 604 (5) When the packet arrives at NVE2: 606 o Based on the tunnel information (VNI for the VXLAN case), the 607 MAC-VRF10 context is identified for a MAC lookup. 609 o Encapsulation is stripped-off and based on a MAC lookup 610 (assuming MAC forwarding on the egress NVE), the packet is 611 forwarded to TS2, where it will be properly routed. 613 (6) Should TS2 move from NVE2 to NVE3, MAC Mobility procedures will 614 be applied to the MAC route IP2/M2, as defined in [RFC7432]. 615 Route type 5 prefixes are not subject to MAC mobility procedures, 616 hence no changes in the DGW IP-VRF routing table will occur for 617 TS2 mobility, i.e. all the prefixes will still be pointing at IP2 618 as overlay index. There is an indirection for e.g. SN1/24, which 619 still points at overlay index IP2 in the routing table, but IP2 620 will be simply resolved to a different tunnel, based on the 621 outcome of the MAC mobility procedures for the MAC/IP route 622 IP2/M2. 
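The recursive resolution in steps (4) and (6) can be sketched as three table lookups. This is a minimal illustration, assuming simple dictionaries for the IP-VRF routing table, the ARP table and the MAC-VRF FIB; the names resolve, Tunnel and Encap, and all addresses, MACs and the VNI below, are hypothetical values chosen for the example and do not come from this document.

```python
import ipaddress
from typing import Dict, NamedTuple, Optional


class Tunnel(NamedTuple):
    vtep: str   # underlay next-hop taken from the RT-2 BGP next-hop
    vni: int    # VNI taken from the RT-2 label field


class Encap(NamedTuple):
    dst_mac: str     # destination inner MAC
    tunnel: Tunnel   # NVO tunnel towards the egress NVE


def resolve(dst: str,
            ip_vrf: Dict[ipaddress.IPv4Network, str],
            arp: Dict[str, str],
            mac_fib: Dict[str, Tunnel]) -> Optional[Encap]:
    """Longest-prefix match, then overlay index -> MAC -> tunnel."""
    addr = ipaddress.ip_address(dst)
    matches = [p for p in ip_vrf if addr in p]
    if not matches:
        return None
    gw_ip = ip_vrf[max(matches, key=lambda p: p.prefixlen)]
    mac = arp.get(gw_ip)                 # overlay index IP2 -> M2
    if mac is None:
        return None
    tunnel = mac_fib.get(mac)            # M2 -> VTEP and VNI
    if tunnel is None:
        return None
    return Encap(mac, tunnel)


# Hypothetical values standing in for SN1, IP2, M2, NVE2 and NVE3.
SN1 = ipaddress.ip_network("10.1.1.0/24")
IP2, M2 = "192.0.2.2", "00:00:5e:00:53:02"
NVE2, NVE3 = Tunnel("198.51.100.2", 10), Tunnel("198.51.100.3", 10)

ip_vrf = {SN1: IP2}      # installed from the RT-5
arp = {IP2: M2}          # installed from the RT-2
mac_fib = {M2: NVE2}     # installed from the RT-2

before = resolve("10.1.1.7", ip_vrf, arp, mac_fib)
mac_fib[M2] = NVE3       # TS2 moves: only the RT-2 derived state changes
after = resolve("10.1.1.7", ip_vrf, arp, mac_fib)
```

In this sketch the move of TS2 touches only mac_fib; the ip_vrf entry for SN1 still points at overlay index IP2, which is the indirection described in step (6).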
624 Note that in the opposite direction, TS2 will send traffic based on 625 its static-route next-hop information (IRB1 and/or IRB2), and regular 626 EVPN procedures will be applied. 628 5.2 Floating IP overlay index use-case 630 Sometimes Tenant Systems (TS) work in active/standby mode where an 631 upstream floating IP - owned by the active TS - is used as the 632 overlay index to get to some subnets behind. This redundancy mode, 633 already introduced in section 2.1 and 2.2, is illustrated in Figure 634 3. 636 NVE2 DGW1 637 +-----------+ +---------+ +-------------+ 638 +---TS2(VA)--|(MAC-VRF10)|-| |----|(MAC-VRF10) | 639 | IP2/M2 +-----------+ | | | IRB1\ | 640 | <-+ | | | (IP-VRF)|---+ 641 | | | | +-------------+ _|_ 642 SN1 vIP23 (floating) | VXLAN/ | ( ) 643 | | | nvGRE | DGW2 ( WAN ) 644 | <-+ NVE3 | | +-------------+ (___) 645 | IP3/M3 +-----------+ | |----|(MAC-VRF10) | | 646 +---TS3(VA)--|(MAC-VRF10)|-| | | IRB2\ | | 647 +-----------+ +---------+ | (IP-VRF)|---+ 648 +-------------+ 650 Figure 3 Floating IP overlay index for redundant TS 652 In this example, assuming TS2 is the active TS and owns IP23: 654 (1) NVE2 advertises the following BGP routes for TS2: 656 o Route type 2 (MAC/IP route) containing: ML=48, M=M2, IPL=32, 657 IP=IP23 (and BGP Encapsulation Extended Community). 659 o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1, 660 ESI=0, GW IP address=IP23. 662 (2) NVE3 advertises the following BGP routes for TS3: 664 o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1, 665 ESI=0, GW IP address=IP23. 667 (3) DGW1 and DGW2 import both received routes based on the route- 668 target: 670 o M2 is added to the MAC-VRF10 FIB along with its corresponding 671 tunnel information. For the VXLAN use case, the VTEP will be 672 derived from the MAC/IP route BGP next-hop and VNI from the 673 VNI/VSID field. IP23 - M2 is added to the ARP table. 
675 o SN1/24 is added to the IP-VRF in DGW1 and DGW2 with overlay 676 index IP23 pointing at the local MAC-VRF10. 678 (4) When DGW1 receives a packet from the WAN with destination IPx, 679 where IPx belongs to SN1/24: 681 o A destination IP lookup is performed on the DGW1 IP-VRF 682 routing table and overlay index=IP23 is found. Since IP23 is 683 an overlay index, a recursive route resolution for IP23 is 684 required. 686 o IP23 is resolved to M2 in the ARP table, and M2 is resolved to 687 the tunnel information given by the MAC-VRF (remote VTEP and 688 VNI for the VXLAN case). 690 o The IP packet destined to IPx is encapsulated with: 692 . Source inner MAC = IRB1 MAC. 694 . Destination inner MAC = M2. 696 . Tunnel information provided by the MAC-VRF FIB (VNI, VTEP 697 IPs and MACs for the VXLAN case). 699 (5) When the packet arrives at NVE2: 701 o Based on the tunnel information (VNI for the VXLAN case), the 702 MAC-VRF10 context is identified for a MAC lookup. 704 o Encapsulation is stripped-off and based on a MAC lookup 705 (assuming MAC forwarding on the egress NVE), the packet is 706 forwarded to TS2, where it will be properly routed. 708 (6) When the redundancy protocol running between TS2 and TS3 appoints 709 TS3 as the new active TS for SN1, TS3 will now own the floating 710 IP23 and will signal this new ownership (GARP message or 711 similar). Upon receiving the new owner's notification, NVE3 will 712 issue a route type 2 for M3-IP23. DGW1 and DGW2 will update their 713 ARP tables with the new MAC resolving the floating IP. No changes 714 are carried out in the IP-VRF routing table. 716 5.3 ESI overlay index ("Bump in the wire") use-case 718 Figure 5 illustrates an example of inter-subnet forwarding for an IP 719 Prefix route that carries a subnet SN1 and uses an ESI as an overlay 720 index (ESI23). 
In this use-case, TS2 and TS3 are layer-2 VA devices 721 without any IP address that can be included as an overlay index in 722 the GW IP field of the IP Prefix route. Their MAC addresses are M2 723 and M3 respectively, and both devices are connected to EVI-10. Note that IRB1 and 724 IRB2 (in DGW1 and DGW2 respectively) have IP addresses in a subnet 725 different from SN1. 727 NVE2 DGW1 728 M2 +-----------+ +---------+ +-------------+ 729 +---TS2(VA)--|(MAC-VRF10)|-| |----|(MAC-VRF10) | 730 | ESI23 +-----------+ | | | IRB1\ | 731 | + | | | (IP-VRF)|---+ 732 | | | | +-------------+ _|_ 733 SN1 | | VXLAN/ | ( ) 734 | | | nvGRE | DGW2 ( WAN ) 735 | + NVE3 | | +-------------+ (___) 736 | ESI23 +-----------+ | |----|(MAC-VRF10) | | 737 +---TS3(VA)--|(MAC-VRF10)|-| | | IRB2\ | | 738 M3 +-----------+ +---------+ | (IP-VRF)|---+ 739 +-------------+ 741 Figure 5 ESI overlay index use-case 743 Since neither TS2 nor TS3 can run any routing protocol or has an IP 744 address assigned, an ESI, i.e. ESI23, will be provisioned on the 745 attachment ports of NVE2 and NVE3. This model supports VA redundancy 746 in a way similar to the one described in section 5.2 for the floating 747 IP overlay index use-case, only using the EVPN Ethernet A-D route 748 instead of the MAC advertisement route to advertise the location of 749 the overlay index. The procedure is explained below: 751 (1) NVE2 advertises the following BGP routes for TS2: 753 o Route type 1 (Ethernet A-D route for EVI-10) containing: 754 ESI=ESI23 and the corresponding tunnel information (VNI/VSID 755 field), as well as the BGP Encapsulation Extended Community as 756 per [EVPN-OVERLAY]. 758 o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1, 759 ESI=ESI23, GW IP address=0. The Router's MAC Extended 760 Community defined in [EVPN-INTERSUBNET] is added and carries 761 the MAC address (M2) associated to the TS behind which SN1 762 sits.
764 (2) NVE3 advertises the following BGP routes for TS3: 766 o Route type 1 (Ethernet A-D route for EVI-10) containing: 767 ESI=ESI23 and the corresponding tunnel information (VNI/VSID 768 field), as well as the BGP Encapsulation Extended Community. 770 o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1, 771 ESI=ESI23, GW IP address=0. The Router's MAC Extended Community 772 is added and carries the MAC address (M3) associated to the TS 773 behind which SN1 sits. 775 (3) DGW1 and DGW2 import the received routes based on the route- 776 target: 778 o The tunnel information to get to ESI23 is installed in DGW1 779 and DGW2. For the VXLAN use case, the VTEP will be derived 780 from the Ethernet A-D route BGP next-hop and VNI from the 781 VNI/VSID field (see [EVPN-OVERLAY]). 783 o SN1/24 is added to the IP-VRF in DGW1 and DGW2 with overlay 784 index ESI23. 786 (4) When DGW1 receives a packet from the WAN with destination IPx, 787 where IPx belongs to SN1/24: 789 o A destination IP lookup is performed on the DGW1 IP-VRF 790 routing table and overlay index=ESI23 is found. Since ESI23 is 791 an overlay index, a recursive route resolution is required to 792 find the egress NVE where ESI23 resides. 794 o The IP packet destined to IPx is encapsulated with: 796 . Source inner MAC = IRB1 MAC. 798 . Destination inner MAC = M2 (this MAC will be obtained 799 from the Router's MAC Extended Community received along 800 with the RT-5 for SN1). 802 . Tunnel information for the NVO tunnel is provided by the 803 Ethernet A-D route per-EVI for ESI23 (VNI and VTEP IP for 804 the VXLAN case). 806 (5) When the packet arrives at NVE2: 808 o Based on the tunnel demultiplexer information (VNI for the 809 VXLAN case), the MAC-VRF10 context is identified for a MAC 810 lookup (assuming MAC disposition model) or the VNI MAY 811 directly identify the egress interface (for a label or VNI 812 disposition model).
814 o Encapsulation is stripped off and based on a MAC lookup 815 (assuming MAC forwarding on the egress NVE) or a VNI lookup 816 (in case of VNI forwarding), the packet is forwarded to TS2, 817 where it will be forwarded to SN1. 819 (6) If the redundancy protocol running between TS2 and TS3 follows an 820 active/standby model and there is a failure, appointing TS3 as 821 the new active TS for SN1, TS3 will now own the connectivity to 822 SN1 and will signal this new ownership. Upon receiving the new 823 owner's notification, NVE3's AC will become active and issue a 824 route type 1 for ESI23, whereas NVE2 will withdraw its Ethernet 825 A-D route for ESI23. DGW1 and DGW2 will update their tunnel 826 information to resolve ESI23. The destination inner MAC will be 827 changed to M3. 829 5.4 IP-VRF-to-IP-VRF model 831 This use-case is similar to the scenario described in "IRB forwarding 832 on NVEs for Tenant Systems" in [EVPN-INTERSUBNET]; however, the new 833 requirement here is the advertisement of IP Prefixes as opposed to 834 only host routes. 836 In the examples described in sections 5.1, 5.2 and 5.3, the MAC-VRF 837 instance can connect IRB interfaces and any other Tenant Systems 838 connected to it. EVPN provides connectivity for: 840 1. Traffic destined to the IRB IP interfaces as well as 842 2. Traffic destined to IP subnets sitting behind the TS, e.g. SN1 or 843 SN2. 845 In order to provide connectivity for (1), MAC/IP routes (RT-2) are 846 needed so that IRB MACs and IPs can be distributed. Connectivity type 847 (2) is accomplished by the exchange of IP Prefix routes (RT-5) for 848 IPs and subnets sitting behind certain overlay indexes, e.g. GW IP or 849 ESI. 851 In some cases, IP Prefix routes may be advertised for subnets and IPs 852 sitting behind an IRB. We refer to this use-case as the "IP-VRF-to- 853 IP-VRF" model.
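As an illustration only, the split between RT-5 (prefix plus overlay
index) and RT-2 (location of the overlay index) can be sketched in
Python for the GW IP case walked through in section 5.2. This is a
toy model and not part of the specification: the class and method
names (Dgw, import_rt5, import_rt2, resolve) are hypothetical, and
the concrete addresses standing in for SN1 (198.51.100.0/24) and
IP23 (192.0.2.23) are assumptions taken from documentation ranges.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Tunnel:
    vtep: str  # derived from the BGP next-hop of the RT-2
    vni: int   # derived from the VNI/VSID field of the RT-2


class Dgw:
    """Toy DGW state; all names are hypothetical, not from the draft."""

    def __init__(self) -> None:
        self.ip_vrf: Dict = {}                # prefix -> GW IP overlay index (RT-5)
        self.arp: Dict[str, str] = {}         # overlay IP -> MAC (RT-2)
        self.mac_fib: Dict[str, Tunnel] = {}  # MAC -> tunnel info (RT-2)

    def import_rt5(self, prefix: str, gw_ip: str) -> None:
        # The RT-5 only installs the prefix and its overlay index.
        self.ip_vrf[ip_network(prefix)] = gw_ip

    def import_rt2(self, mac: str, ip: str, vtep: str, vni: int) -> None:
        # The RT-2 populates the MAC-VRF FIB and the ARP table.
        self.mac_fib[mac] = Tunnel(vtep, vni)
        self.arp[ip] = mac

    def resolve(self, dst: str) -> Optional[Tuple[str, Tunnel]]:
        # Longest-prefix match in the IP-VRF, then recursive resolution:
        # overlay index (GW IP) -> MAC (ARP) -> tunnel (MAC-VRF FIB).
        addr = ip_address(dst)
        matches = [p for p in self.ip_vrf if addr in p]
        if not matches:
            return None
        gw_ip = self.ip_vrf[max(matches, key=lambda p: p.prefixlen)]
        mac = self.arp.get(gw_ip)
        if mac is None or mac not in self.mac_fib:
            return None
        return mac, self.mac_fib[mac]


dgw1 = Dgw()
dgw1.import_rt5("198.51.100.0/24", gw_ip="192.0.2.23")    # SN1 via IP23
dgw1.import_rt2("M2", "192.0.2.23", vtep="NVE2", vni=10)  # TS2 is active
before = dgw1.resolve("198.51.100.5")
dgw1.import_rt2("M3", "192.0.2.23", vtep="NVE3", vni=10)  # TS3 takes over
after = dgw1.resolve("198.51.100.5")
```

In this sketch the switchover from TS2 to TS3 only touches the ARP
table and the MAC-VRF FIB; the IP-VRF entry for the prefix is left
unchanged, which mirrors step (6) of section 5.2.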
855 [EVPN-INTERSUBNET] defines an asymmetric IRB model and a symmetric 856 IRB model, based on the required lookups at the ingress and egress 857 NVE: the asymmetric model requires an ip-lookup and a mac-lookup at 858 the ingress NVE, whereas only a mac-lookup is needed at the egress 859 NVE; the symmetric model requires ip and mac lookups at both the ingress 860 and egress NVEs. From that perspective, the IP-VRF-to-IP-VRF use-case 861 described in this section is a symmetric IRB model. Note that in an 862 IP-VRF-to-IP-VRF scenario, a PE may not be configured with any MAC- 863 VRF for a given tenant, in which case it will only be doing IP 864 lookups and forwarding for that tenant. 866 Based on the way the IP-VRFs are interconnected, there are three 867 different IP-VRF-to-IP-VRF scenarios identified and described in this 868 document: 870 1) Interface-less model 871 2) Interface-full with core-facing IRB model 872 3) Interface-full with unnumbered core-facing IRB model 874 5.4.1 Interface-less IP-VRF-to-IP-VRF model 876 Figure 6 will be used for the description of this model. 878 NVE1(M1) 879 +------------+ 880 IP1+----|(MAC-VRF1) | DGW1(M3) 881 | \ | +---------+ +--------+ 882 | (IP-VRF)|----| |-|(IP-VRF)|----+ 883 | / | | | +--------+ | 884 +---|(MAC-VRF2) | | | _+_ 885 | +------------+ | | ( ) 886 SN1| | VXLAN/ | ( WAN ) 887 | NVE2(M2) | nvGRE/ | (___) 888 | +------------+ | MPLS | + 889 +---|(MAC-VRF2) | | | DGW2(M4) | 890 | \ | | | +--------+ | 891 | (IP-VRF)|----| |-|(IP-VRF)|----+ 892 | / | +---------+ +--------+ 893 SN2+----|(MAC-VRF3) | 894 +------------+ 896 Figure 6 Interface-less IP-VRF-to-IP-VRF model 898 In this case, the requirements are the following: 900 a) The NVEs and DGWs must provide connectivity between hosts in SN1, 901 SN2, IP1 and hosts sitting at the other end of the WAN. 903 b) The IP-VRF instances in the NVE/DGWs are directly connected 904 through NVO tunnels, and no IRBs and/or MAC-VRF instances are 905 defined at the core.
907 c) The solution must provide layer-3 connectivity among the IP-VRFs 908 for Ethernet NVO tunnels, for instance, VXLAN or nvGRE. 910 d) The solution may provide layer-3 connectivity among the IP-VRFs 911 for IP NVO tunnels, for example, VXLAN GPE (with IP payload). 913 In order to meet the above requirements, the EVPN route type 5 will 914 be used to advertise the IP Prefixes, along with the Router's MAC 915 Extended Community as defined in [EVPN-INTERSUBNET] if the 916 advertising NVE/DGW uses Ethernet NVO tunnels. Each NVE/DGW will 917 advertise an RT-5 for each of its prefixes with the following fields: 919 o RD as per [RFC7432]. 921 o Eth-Tag ID=0 assuming VLAN-based service. 923 o IP address length and IP address, as explained in the previous 924 sections. 926 o GW IP address= SHOULD be set to 0. 928 o ESI=0 930 o MPLS label or VNI corresponding to the IP-VRF. 932 Each RT-5 will be sent with a route-target identifying the tenant 933 (IP-VRF) and two BGP extended communities: 935 o The first one is the BGP Encapsulation Extended Community, as 936 per [RFC5512], identifying the tunnel type. 938 o The second one is the Router's MAC Extended Community as per 939 [EVPN-INTERSUBNET] containing the MAC address associated to 940 the NVE advertising the route. This MAC address identifies the 941 NVE/DGW and MAY be re-used for all the IP-VRFs in the NVE. The 942 Router's MAC Extended Community MUST be sent if the route is 943 associated to an Ethernet NVO tunnel, for instance, VXLAN. If 944 the route is associated to an IP NVO tunnel, for instance 945 VXLAN GPE with IP payload, the Router's MAC Extended Community 946 SHOULD NOT be sent. 948 The following example illustrates the procedure to advertise and 949 forward packets to SN1/24 (ipv4 prefix advertised from NVE1) for 950 VXLAN tunnels: 952 (1) NVE1 advertises the following BGP route: 954 o Route type 5 (IP Prefix route) containing: 956 . IPL=24, IP=SN1, VNI=10. 958 . GW IP= SHOULD be set to 0. 960 . 
[RFC5512] BGP Encapsulation Extended Community with Tunnel- 961 type=VXLAN. 963 . Router's MAC Extended Community that contains M1. 965 . Route-target identifying the tenant (IP-VRF). 967 (2) DGW1 imports the received routes from NVE1: 969 o DGW1 installs SN1/24 in the IP-VRF identified by the RT-5 970 route-target. 972 o Since GW IP=0 and the VNI is a valid value, DGW1 will use the 973 VNI and next-hop of the RT-5, as well as the MAC address 974 conveyed in the Router's MAC Extended Community (as inner 975 destination MAC address) to encapsulate the routed IP packets. 977 (3) When DGW1 receives a packet from the WAN with destination IPx, 978 where IPx belongs to SN1/24: 980 o A destination IP lookup is performed on the DGW1 IP-VRF 981 routing table. The lookup yields SN1/24. 983 o Since the RT-5 for SN1/24 had a GW IP=0 and a valid VNI and 984 next-hop (used as destination VTEP), DGW1 will not need a 985 recursive lookup to resolve the route. 987 o The IP packet destined to IPx is encapsulated with: Source 988 inner MAC = DGW1 MAC, Destination inner MAC = M1, Source outer 989 IP (source VTEP) = DGW1 IP, Destination outer IP (destination 990 VTEP) = NVE1 IP. 992 (4) When the packet arrives at NVE1: 994 o NVE1 will identify the IP-VRF for an IP-lookup based on the 995 VNI. 997 o An IP lookup is performed in the routing context, where SN1 998 turns out to be a local subnet associated to MAC-VRF2. A 999 subsequent lookup in the ARP table and the MAC-VRF FIB will 1000 provide the forwarding information for the packet in MAC-VRF2. 1002 The implementation of this Interface-less model is REQUIRED. 1004 5.4.2 Interface-full IP-VRF-to-IP-VRF with core-facing IRB 1006 Figure 7 will be used for the description of this model. 
1008 NVE1 1009 +------------+ DGW1 1010 IP1+----+(MAC-VRF1) | +---------------+ +------------+ 1011 | \ (core) (core) | 1012 |(IP-VRF)(MAC-VRF) (MAC-VRF)(IP-VRF)|-----+ 1013 | / IRB(IP1/M1) IRB(IP3/M3) | | 1014 +---+(MAC-VRF2) | | | +------------+ _+_ 1015 | +------------+ | | ( ) 1016 SN1| | VXLAN/ | ( WAN ) 1017 | NVE2 | nvGRE/ | (___) 1018 | +------------+ | MPLS | DGW2 + 1019 +---+(MAC-VRF2) | | | +------------+ | 1020 | \ (core) (core) | | 1021 |(IP-VRF)(MAC-VRF) (MAC-VRF)(IP-VRF)|-----+ 1022 | / IRB(IP2/M2) IRB(IP4/M4) | 1023 SN2+----+(MAC-VRF3) | +---------------+ +------------+ 1024 +------------+ 1026 Figure 7 Interface-full with core-facing IRB model 1028 In this model, the requirements are the following: 1030 a) As in section 5.4.1, the NVEs and DGWs must provide connectivity 1031 between hosts in SN1, SN2, IP1 and hosts sitting at the other end 1032 of the WAN. 1034 b) However, the NVE/DGWs are now connected through Ethernet NVO 1035 tunnels terminated in core-MAC-VRF instances. The IP-VRFs use IRB 1036 interfaces for their connectivity to the core MAC-VRFs. 1038 c) Each core-facing IRB has an IP and a MAC address, where the IP 1039 address must be reachable from other NVEs or DGWs. 1041 d) The core EVI is composed of the NVE/DGW MAC-VRFs and may contain 1042 other MAC-VRFs without IRB interfaces. Those non-IRB MAC-VRFs will 1043 typically connect TSes that need layer-3 connectivity to remote 1044 subnets. 1046 e) The solution must provide layer-3 connectivity for Ethernet NVO 1047 tunnels, for instance, VXLAN or nvGRE. 1049 EVPN type 5 routes will be used to advertise the IP Prefixes, whereas 1050 EVPN RT-2 routes will advertise the MAC/IP addresses of each core- 1051 facing IRB interface. Each NVE/DGW will advertise an RT-5 for each of 1052 its prefixes with the following fields: 1054 o RD as per [RFC7432]. 1056 o Eth-Tag ID=0 assuming VLAN-based service. 1058 o IP address length and IP address, as explained in the previous 1059 sections.
1061 o GW IP address=IRB-IP (this is the overlay index that will be 1062 used for the recursive route resolution). 1064 o ESI=0 1066 o MPLS label or VNI corresponding to the IP-VRF. Note that the 1067 value SHOULD be zero since the RT-5 route requires a recursive 1068 lookup resolution to an RT-2 route. The MPLS label or VNI to 1069 be used when forwarding packets will be derived from the RT- 1070 2's MPLS Label1 field. 1072 Each RT-5 will be sent with a route-target identifying the tenant 1073 (IP-VRF). The Router's MAC Extended Community SHOULD NOT be sent in 1074 this case. 1076 The following example illustrates the procedure to advertise and 1077 forward packets to SN1/24 (ipv4 prefix advertised from NVE1) for 1078 VXLAN tunnels: 1080 (1) NVE1 advertises the following BGP routes: 1082 o Route type 5 (IP Prefix route) containing: 1084 . IPL=24, IP=SN1, VNI= SHOULD be set to 0. 1086 . GW IP=IP1 (core-facing IRB's IP) 1088 . Route-target identifying the tenant (IP-VRF). 1090 o Route type 2 (MAC/IP route for the core-facing IRB) 1091 containing: 1093 . ML=48, M=M1, IPL=32, IP=IP1, VNI=10. 1095 . A [RFC5512] BGP Encapsulation Extended Community with 1096 Tunnel-type= VXLAN. 1098 . Route-target identifying the tenant. This route-target MAY 1099 be the same as the one used with the RT-5. 1101 (2) DGW1 imports the received routes from NVE1: 1103 o DGW1 installs SN1/24 in the IP-VRF identified by the RT-5 1104 route-target. 1106 . Since GW IP is different from zero, the GW IP (IP1) will be 1107 used as the overlay index for the recursive route resolution 1108 to the RT-2 carrying IP1. 1110 (3) When DGW1 receives a packet from the WAN with destination IPx, 1111 where IPx belongs to SN1/24: 1113 o A destination IP lookup is performed on the DGW1 IP-VRF 1114 routing table. The lookup yields SN1/24, which is associated 1115 to the overlay index IP1. The forwarding information is 1116 derived from the RT-2 received for IP1. 
1118 o The IP packet destined to IPx is encapsulated with: Source 1119 inner MAC = M3, Destination inner MAC = M1, Source outer IP 1120 (source VTEP) = DGW1 IP, Destination outer IP (destination 1121 VTEP) = NVE1 IP. 1123 (4) When the packet arrives at NVE1: 1125 o NVE1 will identify the IP-VRF for an IP-lookup based on the 1126 VNI and the inner MAC DA. 1128 o An IP lookup is performed in the routing context, where SN1 1129 turns out to be a local subnet associated to MAC-VRF2. A 1130 subsequent lookup in the ARP table and the MAC-VRF FIB will 1131 provide the forwarding information for the packet in MAC-VRF2. 1133 The implementation of the Interface-full with core-facing IRB model 1134 is REQUIRED. 1136 5.4.3 Interface-full IP-VRF-to-IP-VRF with unnumbered core-facing IRB 1138 Figure 8 will be used for the description of this model. Note that 1139 this model is similar to the one described in section 5.4.2, only 1140 without IP addresses on the core-facing IRB interfaces. 1142 NVE1 1143 +------------+ DGW1 1144 IP1+----+(MAC-VRF1) | +---------------+ +------------+ 1145 | \ (core) (core) | 1146 |(IP-VRF)(MAC-VRF) (MAC-VRF)(IP-VRF)|-----+ 1147 | / IRB(M1)| | IRB(M3) | | 1148 +---+(MAC-VRF2) | | | +------------+ _+_ 1149 | +------------+ | | ( ) 1150 SN1| | VXLAN/ | ( WAN ) 1151 | NVE2 | nvGRE/ | (___) 1152 | +------------+ | MPLS | DGW2 + 1153 +---+(MAC-VRF2) | | | +------------+ | 1154 | \ (core) (core) | | 1155 |(IP-VRF)(MAC-VRF) (MAC-VRF)(IP-VRF)|-----+ 1156 | / IRB(M2)| | IRB(M4) | 1157 SN2+----+(MAC-VRF3) | +---------------+ +------------+ 1158 +------------+ 1160 Figure 8 Interface-full with unnumbered core-facing IRB model 1162 In this model, the requirements are the following: 1164 a) As in sections 5.4.1 and 5.4.2, the NVEs and DGWs must provide 1165 connectivity between hosts in SN1, SN2, IP1 and hosts sitting at 1166 the other end of the WAN.
1168 b) As in section 5.4.2, the NVE/DGWs are connected through Ethernet 1169 NVO tunnels terminated in core-MAC-VRF instances. The IP-VRFs use 1170 IRB interfaces for their connectivity to the core MAC-VRFs. 1172 c) However, each core-facing IRB has a MAC address only, and no IP 1173 address (that is why the model refers to an 'unnumbered' core- 1174 facing IRB). In this model, there is no need to have IP 1175 reachability to the core-facing IRB interfaces themselves and 1176 there is a requirement to save IP addresses on those interfaces. 1178 d) As in section 5.4.2, the core EVI is composed of the NVE/DGW MAC- 1179 VRFs and may contain other MAC-VRFs. 1181 e) As in section 5.4.2, the solution must provide layer-3 1182 connectivity for Ethernet NVO tunnels, for instance, VXLAN or 1183 nvGRE. 1185 This model will also make use of the RT-5 recursive resolution. EVPN 1186 type 5 routes will advertise the IP Prefixes along with the Router's 1187 MAC Extended Community used for the recursive lookup, whereas EVPN 1188 RT-2 routes will advertise the MAC addresses of each core-facing IRB 1189 interface (this time without an IP). Each NVE/DGW will advertise an 1190 RT-5 for each of its prefixes with the following fields: 1192 o RD as per [RFC7432]. 1194 o Eth-Tag ID=0 assuming VLAN-based service. 1196 o IP address length and IP address, as explained in the previous 1197 sections. 1199 o GW IP address= SHOULD be set to 0. 1201 o ESI=0 1203 o MPLS label or VNI corresponding to the IP-VRF. Note that the 1204 value SHOULD be zero since the RT-5 route requires a recursive 1205 lookup resolution to an RT-2 route. The MPLS label or VNI to 1206 be used when forwarding packets will be derived from the RT- 1207 2's MPLS Label1 field. 1209 Each RT-5 will be sent with a route-target identifying the tenant 1210 (IP-VRF) and the Router's MAC Extended Community containing the MAC 1211 address associated to the core-facing IRB interface.
This MAC address MAY 1212 be re-used for all the IP-VRFs in the NVE. 1214 The following example illustrates the procedure to advertise and 1215 forward packets to SN1/24 (ipv4 prefix advertised from NVE1) for 1216 VXLAN tunnels: 1218 (1) NVE1 advertises the following BGP routes: 1220 o Route type 5 (IP Prefix route) containing: 1222 . IPL=24, IP=SN1, VNI= SHOULD be set to 0. 1224 . GW IP= SHOULD be set to 0. 1226 . Router's MAC Extended Community containing M1 (this will be 1227 used for the recursive lookup to an RT-2). 1229 . Route-target identifying the tenant (IP-VRF). 1231 o Route type 2 (MAC route for the core-facing IRB) containing: 1233 . ML=48, M=M1, IPL=0, VNI=10. 1235 . A [RFC5512] BGP Encapsulation Extended Community with 1236 Tunnel-type=VXLAN. 1238 . Route-target identifying the tenant. This route-target MAY 1239 be the same as the one used with the RT-5. 1241 (2) DGW1 imports the received routes from NVE1: 1243 o DGW1 installs SN1/24 in the IP-VRF identified by the RT-5 1244 route-target. 1246 . The MAC contained in the Router's MAC Extended Community 1247 sent along with the RT-5 (M1) will be used as the overlay 1248 index for the recursive route resolution to the RT-2 1249 carrying M1. 1251 (3) When DGW1 receives a packet from the WAN with destination IPx, 1252 where IPx belongs to SN1/24: 1254 o A destination IP lookup is performed on the DGW1 IP-VRF 1255 routing table. The lookup yields SN1/24, which is associated 1256 to the overlay index M1. The forwarding information is derived 1257 from the RT-2 received for M1. 1259 o The IP packet destined to IPx is encapsulated with: Source 1260 inner MAC = M3, Destination inner MAC = M1, Source outer IP 1261 (source VTEP) = DGW1 IP, Destination outer IP (destination 1262 VTEP) = NVE1 IP. 1264 (4) When the packet arrives at NVE1: 1266 o NVE1 will identify the IP-VRF for an IP-lookup based on the 1267 VNI and the inner MAC DA.
1269 o An IP lookup is performed in the routing context, where SN1 1270 turns out to be a local subnet associated to MAC-VRF2. A 1271 subsequent lookup in the ARP table and the MAC-VRF FIB will 1272 provide the forwarding information for the packet in MAC-VRF2. 1274 The implementation of the Interface-full with unnumbered core-facing 1275 IRB model is OPTIONAL. 1277 6. Conclusions 1279 An EVPN route (type 5) for the advertisement of IP Prefixes is 1280 described in this document. This new route type has a differentiated 1281 role from the RT-2 route and addresses all the Data Center (or NVO- 1282 based networks in general) inter-subnet connectivity scenarios in 1283 which an IP Prefix advertisement is required. Using this new RT-5, an 1284 IP Prefix may be advertised along with an overlay index that can be a 1285 GW IP address, a MAC or an ESI, or without an overlay index, in which 1286 case the BGP next-hop will point at the egress NVE and the MAC in the 1287 Router's MAC Extended Community will provide the inner MAC 1288 destination address to be used. As discussed throughout the document, 1289 the EVPN RT-2 does not meet the requirements for all the DC use 1290 cases, therefore this EVPN route type 5 is required. 1292 The EVPN route type 5 decouples the IP Prefix advertisements from the 1293 MAC/IP route advertisements in EVPN, hence: 1295 a) Allows the clean and clear advertisements of ipv4 or ipv6 prefixes 1296 in an NLRI with no MAC addresses in the route key, so that only IP 1297 information is used in BGP route comparisons. 1299 b) Since the route type is different from the MAC/IP Advertisement 1300 route, the advertisement of prefixes will be excluded from all the 1301 procedures defined for the advertisement of VM MACs, e.g. MAC 1302 Mobility or aliasing. As a result of that, the current [RFC7432] 1303 procedures do not need to be modified. 
1305 c) Allows a flexible implementation where the prefix can be linked to 1306 different types of overlay indexes: overlay IP addresses, overlay 1307 MAC addresses, overlay ESIs, underlay IP next-hops, etc. 1309 d) An EVPN implementation not requiring IP Prefixes can simply 1310 discard them by looking at the route type value. An unknown route 1311 type MUST be ignored by the receiving NVE/PE. 1313 7. Conventions used in this document 1315 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 1316 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 1317 document are to be interpreted as described in RFC 2119 [RFC2119]. 1319 8. Security Considerations 1321 The security considerations discussed in [RFC7432] apply to this 1322 document. 1324 9. IANA Considerations 1326 This document requests the allocation of value 5 in the "EVPN Route 1327 Types" registry defined by [RFC7432] and modification of the registry 1328 as follows: 1330 Value Description Reference 1331 5 IP Prefix route [this document] 1332 6-255 Unassigned 1334 10. References 1336 10.1 Normative References 1338 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 1339 Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 2006. 1342 [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., 1343 Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet 1344 VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015. 1347 10.2 Informative References 1349 [EVPN-OVERLAY] Sajassi-Drake et al., "A Network Virtualization 1350 Overlay Solution using EVPN", draft-ietf-bess-evpn-overlay-04.txt, 1351 work in progress, June, 2016 1353 [EVPN-INTERSUBNET] Sajassi et al., "IP Inter-Subnet Forwarding in 1354 EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-01.txt, work in 1355 progress, October, 2015 1357 11. Acknowledgments 1359 The authors would like to thank Mukul Katiyar for their valuable 1360 feedback and contributions.
The following people also helped 1361 improve this document with their feedback: Tony Przygienda and 1362 Thomas Morin. 1364 12. Contributors 1366 In addition to the authors listed on the front page, the following 1367 co-authors have also contributed to this document: 1369 Senthil Sathappan 1370 Florin Balus 1372 13. Authors' Addresses 1374 Jorge Rabadan (Editor) 1375 Nokia 1376 777 E. Middlefield Road 1377 Mountain View, CA 94043 USA 1378 Email: jorge.rabadan@nokia.com 1380 Wim Henderickx 1381 Nokia 1382 Email: wim.henderickx@nokia.com 1384 Aldrin Isaac 1385 Juniper 1386 Email: aisaac@juniper.net 1388 Senad Palislamovic 1389 Nokia 1390 Email: senad.palislamovic@nokia.com 1392 John E. Drake 1393 Juniper 1394 Email: jdrake@juniper.net 1396 Ali Sajassi 1397 Cisco 1398 Email: sajassi@cisco.com 1400 Wen Lin 1401 Juniper 1402 Email: wlin@juniper.net