2 BESS Workgroup J. Rabadan, Ed. 3 Internet Draft W. Henderickx 4 Intended status: Standards Track Nokia 6 J. Drake 7 W. Lin 8 Juniper 10 A. 
Sajassi 11 Cisco 13 Expires: November 19, 2018 May 18, 2018 15 IP Prefix Advertisement in EVPN 16 draft-ietf-bess-evpn-prefix-advertisement-11 18 Abstract 20 The BGP MPLS-based Ethernet VPN (EVPN) [RFC7432] mechanism provides a 21 flexible control plane that allows intra-subnet connectivity in an 22 MPLS and/or NVO (Network Virtualization Overlay) [RFC7365] network. 23 In some networks, there is also a need for a dynamic and efficient 24 inter-subnet connectivity across Tenant Systems and End Devices that 25 can be physical or virtual and do not necessarily participate in 26 dynamic routing protocols. This document defines a new EVPN route 27 type for the advertisement of IP Prefixes and explains some use-case 28 examples where this new route-type is used. 30 Status of this Memo 32 This Internet-Draft is submitted in full conformance with the 33 provisions of BCP 78 and BCP 79. 35 Internet-Drafts are working documents of the Internet Engineering 36 Task Force (IETF), its areas, and its working groups. Note that 37 other groups may also distribute working documents as Internet- 38 Drafts. 40 Internet-Drafts are draft documents valid for a maximum of six months 41 and may be updated, replaced, or obsoleted by other documents at any 42 time. It is inappropriate to use Internet-Drafts as reference 43 material or to cite them other than as "work in progress." 45 The list of current Internet-Drafts can be accessed at 46 http://www.ietf.org/ietf/1id-abstracts.txt 48 The list of Internet-Draft Shadow Directories can be accessed at 49 http://www.ietf.org/shadow.html 51 This Internet-Draft will expire on November 19, 2018. 53 Copyright Notice 55 Copyright (c) 2018 IETF Trust and the persons identified as the 56 document authors. All rights reserved. 58 This document is subject to BCP 78 and the IETF Trust's Legal 59 Provisions Relating to IETF Documents 60 (http://trustee.ietf.org/license-info) in effect on the date of 61 publication of this document. 
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document. Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1. Introduction
     1.1 Terminology
   2. Problem Statement
     2.1 Inter-Subnet Connectivity Requirements in Data Centers
     2.2 The Need for the EVPN IP Prefix Route
   3. The BGP EVPN IP Prefix Route
     3.1 IP Prefix Route Encoding
     3.2 Overlay Indexes and Recursive Lookup Resolution
   4. Overlay Index Use-Cases
     4.1 TS IP Address Overlay Index Use-Case
     4.2 Floating IP Overlay Index Use-Case
     4.3 Bump-in-the-Wire Use-Case
     4.4 IP-VRF-to-IP-VRF Model
       4.4.1 Interface-less IP-VRF-to-IP-VRF Model
       4.4.2 Interface-ful IP-VRF-to-IP-VRF with SBD IRB
       4.4.3 Interface-ful IP-VRF-to-IP-VRF with Unnumbered SBD IRB
   5. Security Considerations
   6. IANA Considerations
   7. References
     7.1 Normative References
     7.2 Informative References
   8. Acknowledgments
   9. Contributors
   10. Authors' Addresses

1. Introduction

   [RFC7365] provides a framework for Data Center (DC) Network
   Virtualization over Layer 3 and specifies that the Network
   Virtualization Edge devices (NVEs) must provide layer 2 and layer 3
   virtualized network services in multi-tenant DCs. [RFC8365] discusses
   the use of EVPN as the technology of choice to provide layer 2 or
   intra-subnet services in these DCs. This document, along with
   [EVPN-INTERSUBNET], specifies the use of EVPN for layer 3 or
   inter-subnet connectivity services.

   [EVPN-INTERSUBNET] defines some fairly common inter-subnet forwarding
   scenarios where TSes can exchange packets with TSes located in remote
   subnets. To achieve this, [EVPN-INTERSUBNET] describes how MAC/IPs
   encoded in TS RT-2 routes are used to populate not only MAC-VRF and
   overlay ARP tables, but also IP-VRF tables with the encoded TS host
   routes (/32 or /128). In some cases, EVPN may advertise IP Prefixes
   and therefore provide aggregation in the IP-VRF tables, as opposed to
   propagating individual host routes. This document complements the
   scenarios described in [EVPN-INTERSUBNET] and defines how EVPN may be
   used to advertise IP Prefixes. Interoperability between EVPN and
   L3VPN [RFC4364] IP Prefix routes is outside the scope of this
   document.

   Section 2.1 describes the inter-subnet connectivity requirements in
   Data Centers. Section 2.2 explains why a new EVPN route type is
   required for IP Prefix advertisements. Sections 3 and 4 describe this
   route type and how it is used in some specific use cases.
125 1.1 Terminology 127 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 128 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 129 "OPTIONAL" in this document are to be interpreted as described in BCP 130 14 [RFC2119] [RFC8174] when, and only when, they appear in all 131 capitals, as shown here. 133 AC: Attachment Circuit. 135 ARP: Address Resolution Protocol. 137 BD: Broadcast Domain. As per [RFC7432], an EVI consists of a single 138 or multiple BDs. In case of VLAN-bundle and VLAN-based service 139 models (see [RFC7432]), a BD is equivalent to an EVI. In case of 140 VLAN-aware bundle service model, an EVI contains multiple BDs. 141 Also, in this document, BD and subnet are equivalent terms. 143 BD Route Target: refers to the Broadcast Domain assigned Route Target 144 [RFC4364]. In case of VLAN-aware bundle service model, all the BD 145 instances in the MAC-VRF share the same Route Target. 147 BT: Bridge Table. The instantiation of a BD in a MAC-VRF, as per 148 [RFC7432]. 150 DGW: Data Center Gateway. 152 Ethernet A-D route: Ethernet Auto-Discovery (A-D) route, as per 153 [RFC7432]. 155 Ethernet NVO tunnel: refers to Network Virtualization Overlay tunnels 156 with Ethernet payload. Examples of this type of tunnels are VXLAN 157 or GENEVE. 159 EVI: EVPN Instance spanning the NVE/PE devices that are participating 160 on that EVPN, as per [RFC7432]. 162 EVPN: Ethernet Virtual Private Networks, as per [RFC7432]. 164 GRE: Generic Routing Encapsulation. 166 GW IP: Gateway IP Address. 168 IPL: IP Prefix Length. 170 IP NVO tunnel: it refers to Network Virtualization Overlay tunnels 171 with IP payload (no MAC header in the payload). 173 IP-VRF: A VPN Routing and Forwarding table for IP routes on an 174 NVE/PE. The IP routes could be populated by EVPN and IP-VPN 175 address families. An IP-VRF is also an instantiation of a layer 3 176 VPN in an NVE/PE. 178 IRB: Integrated Routing and Bridging interface. 
It connects an IP-VRF 179 to a BD (or subnet). 181 MAC-VRF: A Virtual Routing and Forwarding table for Media Access 182 Control (MAC) addresses on an NVE/PE, as per [RFC7432]. A MAC-VRF 183 is also an instantiation of an EVI in an NVE/PE. 185 ML: MAC address length. 187 ND: Neighbor Discovery Protocol. 189 NVE: Network Virtualization Edge. 191 GENEVE: Generic Network Virtualization Encapsulation, [GENEVE]. 193 NVO: Network Virtualization Overlays. 195 RT-2: EVPN route type 2, i.e., MAC/IP advertisement route, as defined 196 in [RFC7432]. 198 RT-5: EVPN route type 5, i.e., IP Prefix route. As defined in Section 199 3. 201 SBD: Supplementary Broadcast Domain. A BD that does not have any ACs, 202 only IRB interfaces, and it is used to provide connectivity among 203 all the IP-VRFs of the tenant. The SBD is only required in IP-VRF- 204 to-IP-VRF use-cases (see Section 4.4.). 206 SN: Subnet. 208 TS: Tenant System. 210 VA: Virtual Appliance. 212 VNI: Virtual Network Identifier. As in [RFC8365], the term is used as 213 a representation of a 24-bit NVO instance identifier, with the 214 understanding that VNI will refer to a VXLAN Network Identifier in 215 VXLAN, or Virtual Network Identifier in GENEVE, etc. unless it is 216 stated otherwise. 218 VTEP: VXLAN Termination End Point, as in [RFC7348]. 220 VXLAN: Virtual Extensible LAN, as in [RFC7348]. 222 This document also assumes familiarity with the terminology of 223 [RFC7432], [RFC8365] and [RFC7365]. 225 2. Problem Statement 227 This Section describes the inter-subnet connectivity requirements in 228 Data Centers and why a specific route type to advertise IP Prefixes 229 is needed. 231 2.1 Inter-Subnet Connectivity Requirements in Data Centers 233 [RFC7432] is used as the control plane for a Network Virtualization 234 Overlay (NVO) solution in Data Centers (DC), where Network 235 Virtualization Edge (NVE) devices can be located in Hypervisors or 236 Top of Rack switches (ToRs), as described in [RFC8365]. 
238 The following considerations apply to Tenant Systems (TS) that are 239 physical or virtual systems identified by MAC and maybe IP addresses 240 and connected to BDs by Attachment Circuits: 242 o The Tenant Systems may be Virtual Machines (VMs) that generate 243 traffic from their own MAC and IP. 245 o The Tenant Systems may be Virtual Appliance entities (VAs) that 246 forward traffic to/from IP addresses of different End Devices 247 sitting behind them. 249 o These VAs can be firewalls, load balancers, NAT devices, other 250 appliances or virtual gateways with virtual routing instances. 252 o These VAs do not necessarily participate in dynamic routing 253 protocols and hence rely on the EVPN NVEs to advertise the 254 routes on their behalf. 256 o In all these cases, the VA will forward traffic to other TSes 257 using its own source MAC but the source IP will be the one 258 associated to the End Device sitting behind or a translated IP 259 address (part of a public NAT pool) if the VA is performing 260 NAT. 262 o Note that the same IP address and endpoint could exist behind 263 two of these TSes. One example of this would be certain 264 appliance resiliency mechanisms, where a virtual IP or 265 floating IP can be owned by one of the two VAs running the 266 resiliency protocol (the master VA). Virtual Router Redundancy 267 Protocol (VRRP), RFC5798, is one particular example of this. 268 Another example is multi-homed subnets, i.e., the same subnet 269 is connected to two VAs. 271 o Although these VAs provide IP connectivity to VMs and subnets 272 behind them, they do not always have their own IP interface 273 connected to the EVPN NVE, e.g., layer 2 firewalls are 274 examples of VAs not supporting IP interfaces. 276 Figure 1 illustrates some of the examples described above. 
278 NVE1 279 +-----------+ 280 TS1(VM)--| (BD-10) |-----+ 281 IP1/M1 +-----------+ | DGW1 282 +---------+ +-------------+ 283 | |----| (BD-10) | 284 SN1---+ NVE2 | | | IRB1\ | 285 | +-----------+ | | | (IP-VRF)|---+ 286 SN2---TS2(VA)--| (BD-10) |-| | +-------------+ _|_ 287 | IP2/M2 +-----------+ | VXLAN/ | ( ) 288 IP4---+ <-+ | GENEVE | DGW2 ( WAN ) 289 | | | +-------------+ (___) 290 vIP23 (floating) | |----| (BD-10) | | 291 | +---------+ | IRB2\ | | 292 SN1---+ <-+ NVE3 | | | | (IP-VRF)|---+ 293 | IP3/M3 +-----------+ | | | +-------------+ 294 SN3---TS3(VA)--| (BD-10) |---+ | | 295 | +-----------+ | | 296 IP5---+ | | 297 | | 298 NVE4 | | NVE5 +--SN5 299 +---------------------+ | | +-----------+ | 300 IP6------| (BD-1) | | +-| (BD-10) |--TS4(VA)--SN6 301 | \ | | +-----------+ | 302 | (IP-VRF) |--+ ESI4 +--SN7 303 | / \IRB3 | 304 |---| (BD-2) (BD-10) | 305 SN4| +---------------------+ 307 Figure 1 DC inter-subnet use-cases 309 Where: 311 NVE1, NVE2, NVE3, NVE4, NVE5, DGW1 and DGW2 share the same BD for a 312 particular tenant. BD-10 is comprised of the collection of BD 313 instances defined in all the NVEs. All the hosts connected to BD-10 314 belong to the same IP subnet. The hosts connected to BD-10 are listed 315 below: 317 o TS1 is a VM that generates/receives traffic from/to IP1, where IP1 318 belongs to the BD-10 subnet. 320 o TS2 and TS3 are Virtual Appliances (VA) that send/receive traffic 321 from/to the subnets and hosts sitting behind them (SN1, SN2, SN3, 322 IP4 and IP5). Their IP addresses (IP2 and IP3) belong to the BD-10 323 subnet and they can also generate/receive traffic. When these VAs 324 receive packets destined to their own MAC addresses (M2 and M3) 325 they will route the packets to the proper subnet or host. These VAs 326 do not support routing protocols to advertise the subnets connected 327 to them and can move to a different server and NVE when the Cloud 328 Management System decides to do so. 
These VAs may also support 329 redundancy mechanisms for some subnets, similar to VRRP, where a 330 floating IP is owned by the master VA and only the master VA 331 forwards traffic to a given subnet. E.g.,: vIP23 in Figure 1 is a 332 floating IP that can be owned by TS2 or TS3 depending on which 333 system is the master. Only the master will forward traffic to SN1. 335 o Integrated Routing and Bridging interfaces IRB1, IRB2 and IRB3 have 336 their own IP addresses that belong to the BD-10 subnet too. These 337 IRB interfaces connect the BD-10 subnet to Virtual Routing and 338 Forwarding (IP-VRF) instances that can route the traffic to other 339 subnets for the same tenant (within the DC or at the other end of 340 the WAN). 342 o TS4 is a layer 2 VA that provides connectivity to subnets SN5, SN6 343 and SN7, but does not have an IP address itself in the BD-10. TS4 344 is connected to a port on NVE5 assigned to Ethernet Segment 345 Identifier 4. 347 For a BD that an ingress NVE is attached to, "Overlay Index" is 348 defined as an identifier that the ingress EVPN NVE requires in order 349 to forward packets to a subnet or host in a remote subnet. As an 350 example, vIP23 (Figure 1) is an Overlay Index that any NVE attached 351 to BD-10 needs to know in order to forward packets to SN1. IRB3 IP 352 address is an Overlay Index required to get to SN4, and ESI4 353 (Ethernet Segment Identifier 4) is an Overlay Index needed to forward 354 traffic to SN5. In other words, the Overlay Index is a next-hop in 355 the overlay address space that can be an IP address, a MAC address or 356 an ESI. When advertised along with an IP Prefix, the Overlay Index 357 requires a recursive resolution to find out to what egress NVE the 358 EVPN packets need to be sent. 
   All the DC use cases in Figure 1 require inter-subnet forwarding.
   Therefore, the individual host routes and subnets:

   a) must be advertised from the NVEs (since VAs and VMs do not
      participate in dynamic routing protocols) and
   b) may be associated with an Overlay Index that can be a VA IP
      address, a floating IP address, a MAC address or an ESI. The
      Overlay Index is further discussed in Section 3.2.

2.2 The Need for the EVPN IP Prefix Route

   [RFC7432] defines a MAC/IP route (also referred to as RT-2) where a
   MAC address can be advertised together with an IP address length and
   IP address (IP). While a variable IP address length might have been
   used to indicate the presence of an IP prefix in a route type 2,
   there are several specific use cases in which using this route type
   to deliver IP Prefixes is not suitable.

   One example of such use cases is the "floating IP" example described
   in Section 2.1. In this example, the advertisement of the prefixes
   needs to be decoupled from the advertisement of the MAC address of
   either M2 or M3; otherwise, the solution becomes highly inefficient
   and does not scale.

   For example, if 1,000 prefixes are advertised from M2 (using RT-2)
   and the floating IP owner changes from M2 to M3, 1,000 routes would
   be withdrawn from M2 and 1,000 routes would be readvertised from M3.
   However, if a separate route type is used, the 1,000 routes can be
   advertised as associated with the floating IP address (vIP23), and
   only one RT-2 is needed to advertise the ownership of the floating
   IP, i.e., vIP23 and M2 in the route type 2. When the floating IP
   owner changes from M2 to M3, a single RT-2 withdraw/update is
   required to indicate the change. The remote DGW will not change any
   of the 1,000 prefixes associated with vIP23, but will only update the
   ARP resolution entry for vIP23 (now pointing at M3).
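   The scaling argument above can be checked with a few lines of
   arithmetic. The following Python sketch is illustrative only: the
   function name and the idea of counting one "route change" per
   withdrawn or advertised route are assumptions made for this example,
   not part of the specification.

```python
def route_changes_on_owner_move(num_prefixes: int, decoupled: bool) -> int:
    """Count route changes when the floating IP owner moves from M2 to M3.

    Hypothetical model: each withdrawn or (re)advertised route counts as one.
    """
    if decoupled:
        # Prefixes point at the floating IP (vIP23); only the single RT-2
        # that binds vIP23 to a MAC address has to be updated.
        return 1
    # Prefixes are tied to the owner's MAC: every route is withdrawn from
    # M2 and then readvertised from M3.
    return 2 * num_prefixes


if __name__ == "__main__":
    print(route_changes_on_owner_move(1000, decoupled=False))
    print(route_changes_on_owner_move(1000, decoupled=True))
```

   With 1,000 prefixes the coupled model touches every prefix twice,
   while the decoupled model touches only the RT-2 for vIP23.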
396 An EVPN route (type 5) for the advertisement of IP Prefixes is 397 described in this document. This new route type has a differentiated 398 role from the RT-2 route and addresses the Data Center (or NVO-based 399 networks in general) inter-subnet connectivity scenarios described in 400 this document. Using this new RT-5, an IP Prefix may be advertised 401 along with an Overlay Index that can be a GW IP address, a MAC or an 402 ESI, or without an Overlay Index, in which case the BGP next-hop will 403 point at the egress NVE/ASBR/ABR and the MAC in the Router's MAC 404 Extended Community will provide the inner MAC destination address to 405 be used. As discussed throughout the document, the EVPN RT-2 does not 406 meet the requirements for all the DC use cases, therefore this EVPN 407 route type 5 is required. 409 The EVPN route type 5 decouples the IP Prefix advertisements from the 410 MAC/IP route advertisements in EVPN, hence: 412 a) Allows the clean and clear advertisements of IPv4 or IPv6 prefixes 413 in an NLRI (Network Layer Reachability Information message) with 414 no MAC addresses. 416 b) Since the route type is different from the MAC/IP Advertisement 417 route, the current [RFC7432] procedures do not need to be 418 modified. 420 c) Allows a flexible implementation where the prefix can be linked to 421 different types of Overlay/Underlay Indexes: overlay IP address, 422 overlay MAC addresses, overlay ESI, underlay BGP next-hops, etc. 424 d) An EVPN implementation not requiring IP Prefixes can simply 425 discard them by looking at the route type value. 427 The following Sections describe how EVPN is extended with a route 428 type for the advertisement of IP prefixes and how this route is used 429 to address the inter-subnet connectivity requirements existing in the 430 Data Center. 432 3. 
The BGP EVPN IP Prefix Route

   The BGP EVPN NLRI as defined in [RFC7432] is shown below:

       +-----------------------------------+
       |    Route Type (1 octet)           |
       +-----------------------------------+
       |     Length (1 octet)              |
       +-----------------------------------+
       | Route Type specific (variable)    |
       +-----------------------------------+

                    Figure 2 BGP EVPN NLRI

   This document defines an additional route type (RT-5) in the IANA
   EVPN Route Types registry [EVPNRouteTypes], to be used for the
   advertisement of EVPN routes using IP Prefixes:

   Value: 5
   Description: IP Prefix Route

   According to Section 5.4 in [RFC7606], a node that doesn't recognize
   the Route Type 5 (RT-5) will ignore it. Therefore, an NVE following
   this document can still be attached to a BD to which an NVE ignoring
   RT-5s is also attached. Regular [RFC7432] procedures would apply in
   that case for both NVEs. In case two or more NVEs are attached to
   different BDs of the same tenant, they MUST support RT-5 for the
   proper Inter-Subnet Forwarding operation of the tenant.

   The detailed encoding of this route and associated procedures are
   described in the following Sections.

3.1 IP Prefix Route Encoding

   An IP Prefix Route Type for IPv4 has the Length field set to 34 and
   consists of the following fields:

       +---------------------------------------+
       |      RD (8 octets)                    |
       +---------------------------------------+
       |Ethernet Segment Identifier (10 octets)|
       +---------------------------------------+
       |  Ethernet Tag ID (4 octets)           |
       +---------------------------------------+
       |  IP Prefix Length (1 octet, 0 to 32)  |
       +---------------------------------------+
       |  IP Prefix (4 octets)                 |
       +---------------------------------------+
       |  GW IP Address (4 octets)             |
       +---------------------------------------+
       |  MPLS Label (3 octets)                |
       +---------------------------------------+

             Figure 3 EVPN IP Prefix route NLRI for IPv4

   An IP Prefix Route Type for IPv6 has the Length field set to 58 and
   consists of the following fields:

       +---------------------------------------+
       |      RD (8 octets)                    |
       +---------------------------------------+
       |Ethernet Segment Identifier (10 octets)|
       +---------------------------------------+
       |  Ethernet Tag ID (4 octets)           |
       +---------------------------------------+
       |  IP Prefix Length (1 octet, 0 to 128) |
       +---------------------------------------+
       |  IP Prefix (16 octets)                |
       +---------------------------------------+
       |  GW IP Address (16 octets)            |
       +---------------------------------------+
       |  MPLS Label (3 octets)                |
       +---------------------------------------+

             Figure 4 EVPN IP Prefix route NLRI for IPv6

   Where:

   o The Length field of the BGP EVPN NLRI for an EVPN IP Prefix route
     MUST be either 34 (if IPv4 addresses are carried) or 58 (if IPv6
     addresses are carried). The IP Prefix and Gateway IP Address MUST
     be from the same IP address family.

   o Route Distinguisher (RD) and Ethernet Tag ID MUST be used as
     defined in [RFC7432] and [RFC8365]. In particular, the RD is unique
     per MAC-VRF (or IP-VRF). The MPLS Label field is set to either an
     MPLS label or a VNI, as described in [RFC8365] for other EVPN route
     types.

   o The Ethernet Segment Identifier MUST be a non-zero 10-octet
     identifier if the ESI is used as an Overlay Index (see the
     definition of Overlay Index in Section 3.2). It MUST be all bytes
     zero otherwise. The ESI format is described in [RFC7432].

   o The IP Prefix Length can be set to a value between 0 and 32 (bits)
     for IPv4 and between 0 and 128 for IPv6, and specifies the number
     of bits in the Prefix. The value MUST NOT be greater than 128.

   o The IP Prefix is a 4 or 16-octet field (IPv4 or IPv6).

   o The GW (Gateway) IP Address field is a 4 or 16-octet field (IPv4 or
     IPv6), and will encode a valid IP address as an Overlay Index for
     the IP Prefixes. The GW IP field MUST be all bytes zero if it is
     not used as an Overlay Index. Refer to Section 3.2 for the
     definition and use of the Overlay Index.

   o The MPLS Label field is encoded as 3 octets, where the high-order
     20 bits contain the label value, as per [RFC7432]. When sending,
     the label value SHOULD be zero if recursive resolution based on
     overlay index is used. If the received MPLS Label value is zero,
     the route MUST contain an Overlay Index and the ingress NVE/PE MUST
     do recursive resolution to find the egress NVE/PE. If the received
     Label is zero and the route does not contain an Overlay Index, it
     MUST be treat-as-withdraw [RFC7606].

   The RD, Ethernet Tag ID, IP Prefix Length and IP Prefix are part of
   the route key used by BGP to compare routes. The rest of the fields
   are not part of the route key.

   An IP Prefix Route MAY be sent along with a Router's MAC Extended
   Community (defined in [EVPN-INTERSUBNET]) to carry the MAC address
   that is used as the overlay index. Note that the MAC address may be
   that of a TS.
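   The field layout above can be exercised with a short Python sketch.
   This is a minimal illustration, not a BGP implementation: the function
   names, the dictionary returned by the decoder, and the choice to treat
   the last 3 octets as an MPLS label (rather than a VNI) are assumptions
   made for this example. Only the field sizes, the Length values 34 and
   58, and the same-address-family rule come from the text.

```python
import ipaddress
import struct

ROUTE_TYPE_IP_PREFIX = 5  # RT-5


def encode_rt5(rd: bytes, esi: bytes, eth_tag: int,
               prefix: str, gw_ip: str, label: int) -> bytes:
    """Build Route Type, Length and the RT-5 specific fields (hypothetical helper)."""
    net = ipaddress.ip_network(prefix)
    gw = ipaddress.ip_address(gw_ip)
    if net.version != gw.version:
        raise ValueError("IP Prefix and GW IP must be from the same family")
    if len(rd) != 8 or len(esi) != 10:
        raise ValueError("RD is 8 octets and ESI is 10 octets")
    if not 0 <= label < (1 << 20):
        raise ValueError("label value must fit in 20 bits")
    body = (
        rd
        + esi
        + struct.pack("!IB", eth_tag, net.prefixlen)  # Ethernet Tag ID, IPL
        + net.network_address.packed                  # 4 or 16 octets
        + gw.packed                                   # 4 or 16 octets
        + (label << 4).to_bytes(3, "big")             # label in high-order 20 bits
    )
    return struct.pack("!BB", ROUTE_TYPE_IP_PREFIX, len(body)) + body


def decode_rt5(nlri: bytes) -> dict:
    """Parse one RT-5 NLRI; raises ValueError on a malformed route."""
    route_type, length = nlri[0], nlri[1]
    if route_type != ROUTE_TYPE_IP_PREFIX:
        raise ValueError("not an IP Prefix route")
    if length not in (34, 58) or len(nlri) != 2 + length:
        raise ValueError("Length must be 34 (IPv4) or 58 (IPv6)")
    alen = 4 if length == 34 else 16
    body = nlri[2:]
    eth_tag, ipl = struct.unpack("!IB", body[18:23])
    if ipl > 8 * alen:
        raise ValueError("IP Prefix Length too large for address family")
    return {
        "rd": body[:8],
        "esi": body[8:18],
        "eth_tag": eth_tag,
        "ipl": ipl,
        "ip": ipaddress.ip_address(body[23:23 + alen]),
        "gw_ip": ipaddress.ip_address(body[23 + alen:23 + 2 * alen]),
        "label": int.from_bytes(body[23 + 2 * alen:], "big") >> 4,
    }
```

   For instance, encoding the documentation prefix 192.0.2.0/24 with an
   all-zero RD and ESI yields a Length octet of 34, and an IPv6 prefix
   such as 2001:db8::/32 yields 58.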
   As described in Section 3.2, certain data combinations in a received
   route would imply a "treat-as-withdraw" handling of the route
   [RFC7606].

3.2 Overlay Indexes and Recursive Lookup Resolution

   RT-5 routes support recursive lookup resolution through the use of
   Overlay Indexes as follows:

   o An Overlay Index can be an ESI, an IP address in the address space
     of the tenant, or a MAC address, and it is used by an NVE as the
     next-hop for a given IP Prefix. An Overlay Index always needs a
     recursive route resolution on the NVE/PE that installs the RT-5
     into one of its IP-VRFs, so that the NVE knows to which egress
     NVE/PE it needs to forward the packets. It is important to note
     that recursive resolution of the Overlay Index applies upon
     installation into an IP-VRF, and not upon BGP propagation (for
     instance, on an ASBR). Also, as a result of the recursive
     resolution, the egress NVE/PE is not necessarily the same NVE that
     originated the RT-5.

   o The Overlay Index is indicated along with the RT-5 in the ESI
     field, GW IP field or Router's MAC Extended Community, depending on
     whether the IP Prefix next-hop is an ESI, IP address or MAC address
     in the tenant space. The Overlay Index for a given IP Prefix is set
     by local policy at the NVE that originates an RT-5 for that IP
     Prefix (typically managed by the Cloud Management System).

   o In order to enable the recursive lookup resolution at the ingress
     NVE, an NVE that is a possible egress NVE for a given Overlay Index
     must originate a route advertising itself as the BGP next hop on
     the path to the system denoted by the Overlay Index. For instance:

     . If an NVE receives an RT-5 that specifies an Overlay Index, the
       NVE cannot use the RT-5 in its IP-VRF unless (or until) it can
       recursively resolve the Overlay Index.
     . If the RT-5 specifies an ESI as the Overlay Index, recursive
       resolution can only be done if the NVE has received and installed
       an RT-1 (Auto-Discovery per-EVI) route specifying that ESI.
     . If the RT-5 specifies a GW IP address as the Overlay Index,
       recursive resolution can only be done if the NVE has received and
       installed an RT-2 (MAC/IP route) specifying that IP address in
       the IP address field of its NLRI.
     . If the RT-5 specifies a MAC address as the Overlay Index,
       recursive resolution can only be done if the NVE has received and
       installed an RT-2 (MAC/IP route) specifying that MAC address in
       the MAC address field of its NLRI.

     Note that the RT-1 or RT-2 routes needed for the recursive
     resolution may arrive before or after the given RT-5 route.

   o Irrespective of the recursive resolution, if there is no IGP or BGP
     route to the BGP next-hop of an RT-5, BGP MUST NOT install the RT-5
     even if the Overlay Index can be resolved.

   o The ESI and GW IP fields may both be zero at the same time.
     However, they MUST NOT both be non-zero at the same time. A route
     containing a non-zero GW IP and a non-zero ESI (at the same time)
     SHOULD be treat-as-withdraw [RFC7606].

   o If either the ESI or GW IP are non-zero, then the non-zero one is
     the Overlay Index, regardless of whether the Router's MAC Extended
     Community is present or the value of the Label. In case the GW IP
     is the Overlay Index (hence ESI is zero), the Router's MAC Extended
     Community is ignored if present.

   o A route where ESI, GW IP, MAC and Label are all zero at the same
     time SHOULD be treat-as-withdraw.

   The indirection provided by the Overlay Index and its recursive
   lookup resolution is required to achieve fast convergence in case of
   a failure of the object represented by the Overlay Index (see the
   example described in Section 2.2).
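   The resolution rules above can be sketched in Python as a lookup
   against the routes an NVE has already installed. This is a simplified
   model under stated assumptions: the function name, the three
   dictionaries (keyed by ESI, IP address and MAC address, each mapping
   to the egress NVE learned from the corresponding RT-1 or RT-2), and
   the string labels for the Overlay Index kind are all hypothetical and
   exist only for illustration.

```python
from typing import Dict, Optional, Tuple


def resolve_rt5_egress(
    overlay_index: Tuple[str, str],   # (kind, value); kind in esi/gw_ip/mac/none
    bgp_next_hop: str,
    next_hop_reachable: bool,
    rt1_by_esi: Dict[str, str],       # ESI -> egress NVE from installed RT-1s
    rt2_by_ip: Dict[str, str],        # IP  -> egress NVE from installed RT-2s
    rt2_by_mac: Dict[str, str],       # MAC -> egress NVE from installed RT-2s
) -> Optional[str]:
    """Return the egress NVE for an RT-5, or None if it cannot be used yet."""
    # No IGP or BGP route to the RT-5's BGP next-hop: do not install the
    # route, even if the Overlay Index itself could be resolved.
    if not next_hop_reachable:
        return None
    kind, value = overlay_index
    # Without an Overlay Index there is no recursion; the BGP next-hop
    # is the actual next-hop.
    if kind == "none":
        return bgp_next_hop
    tables = {"esi": rt1_by_esi, "gw_ip": rt2_by_ip, "mac": rt2_by_mac}
    # A miss means the RT-1 or RT-2 has not arrived (yet), so the RT-5
    # stays unusable until it does.
    return tables[kind].get(value)
```

   Because the lookup is repeated whenever the RT-1 or RT-2 tables
   change, a move of the Overlay Index (for example, a floating IP
   changing owner) updates the egress NVE without touching the RT-5s.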
   Table 1 shows the different RT-5 field combinations allowed by this
   specification and what Overlay Index must be used by the receiving
   NVE/PE in each case. Those cases where there is no Overlay Index are
   indicated as "None" in Table 1. If there is no Overlay Index, the
   receiving NVE/PE will not perform any recursive resolution, and the
   actual next-hop is given by the RT-5's BGP next-hop.

   +----------+----------+----------+------------+----------------+
   | ESI      | GW IP    | MAC*     | Label      | Overlay Index  |
   |--------------------------------------------------------------|
   | Non-Zero | Zero     | Zero     | Don't Care | ESI            |
   | Non-Zero | Zero     | Non-Zero | Don't Care | ESI            |
   | Zero     | Non-Zero | Zero     | Don't Care | GW IP          |
   | Zero     | Zero     | Non-Zero | Zero       | MAC            |
   | Zero     | Zero     | Non-Zero | Non-Zero   | MAC or None**  |
   | Zero     | Zero     | Zero     | Non-Zero   | None***        |
   +----------+----------+----------+------------+----------------+

          Table 1 - RT-5 fields and Indicated Overlay Index

   Table NOTES:

   *   MAC with Zero value means no Router's MAC extended community is
       present along with the RT-5. Non-Zero indicates that the extended
       community is present and carries a valid MAC address. The
       encoding of a MAC address MUST be the 6-octet MAC address
       specified by [802.1Q] and [802.1D-REV]. Examples of invalid MAC
       addresses are broadcast or multicast MAC addresses. The route
       MUST be treat-as-withdraw in case of an invalid MAC address. The
       presence of the Router's MAC extended community alone is not
       enough to indicate the use of the MAC address as the Overlay
       Index, since the extended community can be used for other
       purposes.

   **  In this case, the Overlay Index may be the RT-5's MAC address or
       None, depending on the local policy of the receiving NVE/PE. Note
       that the advertising NVE/PE that sets the Overlay Index SHOULD
       advertise an RT-2 for the MAC Overlay Index if there are
       receiving NVE/PEs configured to use the MAC as the Overlay Index.
       This case in Table 1 is used in the IP-VRF-to-IP-VRF
       implementations described in 4.4.1 and 4.4.3. The support of a
       MAC Overlay Index in this model is OPTIONAL.

   *** The Overlay Index is None. This is a special case used for
       IP-VRF-to-IP-VRF where the NVE/PEs are connected by IP NVO
       tunnels as opposed to Ethernet NVO tunnels.

   If the combination of ESI, GW IP, MAC and Label in the received RT-5
   is different from the combinations shown in Table 1, the router will
   process the route as per the rules described at the beginning of this
   Section (3.2).

   Table 2 shows the different inter-subnet use-cases described in this
   document and the corresponding coding of the Overlay Index in the
   route type 5 (RT-5).

   +---------+---------------------+----------------------------+
   | Section | Use-case            | Overlay Index in the RT-5  |
   +---------+---------------------+----------------------------+
   | 4.1     | TS IP address       | GW IP                      |
   | 4.2     | Floating IP address | GW IP                      |
   | 4.3     | "Bump in the wire"  | ESI or MAC                 |
   | 4.4     | IP-VRF-to-IP-VRF    | GW IP, MAC or None         |
   +---------+---------------------+----------------------------+

   Table 2 - Use-cases and Overlay Indexes for Recursive Resolution

   The above use-cases are representative of the different Overlay
   Indexes supported by RT-5 (GW IP, ESI, MAC or None).

4. Overlay Index Use-Cases

   This Section describes some use-cases for the Overlay Index types
   used with the IP Prefix route. Although the examples use IPv4
   Prefixes and subnets, the descriptions of the RT-5 are valid for the
   same cases with IPv6, only replacing the IP Prefixes, IPL and GW IP
   by the corresponding IPv6 values.
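   Before turning to the individual use-cases, the field combinations of
   Table 1 and the rules of Section 3.2 can be summarized as a small
   decision function. The Python sketch below is one possible reading
   under stated assumptions: the function name, the boolean parameters
   and the `prefer_mac` flag (standing in for the receiving NVE/PE's
   local policy in the "MAC or None" row) are hypothetical, and MAC
   validity checking is assumed to have been done already.

```python
def indicated_overlay_index(esi_nonzero: bool, gw_ip_nonzero: bool,
                            mac_present: bool, label_nonzero: bool,
                            prefer_mac: bool = False) -> str:
    """Map received RT-5 fields to the Overlay Index a receiver would use."""
    # ESI and GW IP must not both be non-zero.
    if esi_nonzero and gw_ip_nonzero:
        return "treat-as-withdraw"
    # A non-zero ESI or GW IP is the Overlay Index regardless of the
    # Router's MAC Extended Community or the Label value.
    if esi_nonzero:
        return "ESI"
    if gw_ip_nonzero:
        return "GW IP"
    if mac_present:
        if not label_nonzero:
            return "MAC"
        # "MAC or None": decided by the receiver's local policy.
        return "MAC" if prefer_mac else "None"
    # No ESI, GW IP or MAC: only valid with a non-zero Label.
    return "None" if label_nonzero else "treat-as-withdraw"
```

   Each row of Table 1 corresponds to one call; for example, a zero ESI
   and MAC with a non-zero GW IP gives "GW IP" whatever the Label is.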

4.1 TS IP Address Overlay Index Use-Case

   Figure 5 illustrates an example of inter-subnet forwarding for
   subnets sitting behind Virtual Appliances (on TS2 and TS3).

   IP4---+           NVE2                            DGW1
         |        +-----------+ +---------+    +-------------+
   SN2---TS2(VA)--|  (BD-10)  |-|         |----|  (BD-10)    |
         | IP2/M2 +-----------+ |         |    |    IRB1\    |
    -+---+                      |         |    |     (IP-VRF)|---+
     |                          |         |    +-------------+  _|_
    SN1                         |  VXLAN/ |                    (   )
     |                          |  GENEVE |         DGW2      ( WAN )
    -+---+           NVE3       |         |    +-------------+ (___)
         | IP3/M3 +-----------+ |         |----|  (BD-10)    |   |
   SN3---TS3(VA)--|  (BD-10)  |-|         |    |    IRB2\    |   |
         |        +-----------+ +---------+    |     (IP-VRF)|---+
   IP5---+                                     +-------------+

                    Figure 5 TS IP address use-case

   An example of inter-subnet forwarding between subnet SN1, which uses
   a 24-bit IP prefix (written as SN1/24 hereafter), and a subnet
   sitting in the WAN is described below. NVE2, NVE3, DGW1 and DGW2 are
   running BGP EVPN. TS2 and TS3 do not participate in dynamic routing
   protocols, and they only have a static route to forward the traffic
   to the WAN. SN1/24 is dual-homed to NVE2 and NVE3.

   In this case, a GW IP is used as an Overlay Index. Although a
   different Overlay Index type could have been used, this use-case
   assumes that the operator knows the VA's IP addresses beforehand,
   whereas the VA's MAC address is unknown and the VA's ESI is zero.
   Because of this, the GW IP is the suitable Overlay Index to be used
   with the RT-5s. The NVEs know the GW IP to be used for a given Prefix
   by policy.

   (1) NVE2 advertises the following BGP routes on behalf of TS2:

       o Route type 2 (MAC/IP route) containing: ML=48 (MAC Address
         Length), M=M2 (MAC Address), IPL=32 (IP Prefix Length), IP=IP2
         and [RFC5512] BGP Encapsulation Extended Community with the
         corresponding Tunnel type. The MAC and IP addresses may be
         learned via ARP snooping.

       o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
         ESI=0, GW IP address=IP2.
         The prefix and GW IP are learned by policy.

   (2) Similarly, NVE3 advertises the following BGP routes on behalf of
       TS3:

       o Route type 2 (MAC/IP route) containing: ML=48, M=M3, IPL=32,
         IP=IP3 (and BGP Encapsulation Extended Community).

       o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
         ESI=0, GW IP address=IP3.

   (3) DGW1 and DGW2 import both received routes based on the Route
       Targets:

       o Based on the BD-10 Route Target in DGW1 and DGW2, the MAC/IP
         route is imported and M2 is added to the BD-10 along with its
         corresponding tunnel information. For instance, if VXLAN is
         used, the VTEP will be derived from the MAC/IP route BGP
         next-hop and VNI from the MPLS Label1 field. IP2 - M2 is added
         to the ARP table. Similarly, M3 is added to BD-10 and IP3 - M3
         to the ARP table.

       o Based on the BD-10 Route Target in DGW1 and DGW2, the IP
         Prefix route is also imported and SN1/24 is added to the
         IP-VRF with Overlay Index IP2 pointing at the local BD-10. In
         this example, it is assumed that the RT-5 from NVE2 is
         preferred over the RT-5 from NVE3. If both routes were equally
         preferable and ECMP enabled, SN1/24 would also be added to the
         routing table with Overlay Index IP3.

   (4) When DGW1 receives a packet from the WAN with destination IPx,
       where IPx belongs to SN1/24:

       o A destination IP lookup is performed on the DGW1 IP-VRF
         routing table and Overlay Index=IP2 is found. Since IP2 is an
         Overlay Index, a recursive route resolution is required for
         IP2.

       o IP2 is resolved to M2 in the ARP table, and M2 is resolved to
         the tunnel information given by the BD FIB (e.g., remote VTEP
         and VNI for the VXLAN case).

       o The IP packet destined to IPx is encapsulated with:

           . Source inner MAC = IRB1 MAC.

           . Destination inner MAC = M2.

           . Tunnel information provided by the BD (VNI, VTEP IPs and
             MACs for the VXLAN case).

   (5) When the packet arrives at NVE2:

       o Based on the tunnel information (VNI for the VXLAN case), the
         BD-10 context is identified for a MAC lookup.

       o Encapsulation is stripped off and based on a MAC lookup
         (assuming MAC forwarding on the egress NVE), the packet is
         forwarded to TS2, where it will be properly routed.

   (6) Should TS2 move from NVE2 to NVE3, MAC Mobility procedures will
       be applied to the MAC route IP2/M2, as defined in [RFC7432].
       Route type 5 prefixes are not subject to MAC mobility procedures,
       hence no changes in the DGW IP-VRF routing table will occur for
       TS2 mobility, i.e., all the prefixes will still be pointing at
       IP2 as Overlay Index. There is an indirection: SN1/24, for
       example, still points at Overlay Index IP2 in the routing table,
       but IP2 is simply resolved to a different tunnel, based on the
       outcome of the MAC mobility procedures for the MAC/IP route
       IP2/M2.

   Note that in the opposite direction, TS2 will send traffic based on
   its static-route next-hop information (IRB1 and/or IRB2), and regular
   EVPN procedures will be applied.

4.2 Floating IP Overlay Index Use-Case

   Sometimes Tenant Systems (TS) work in active/standby mode where an
   upstream floating IP - owned by the active TS - is used as the
   Overlay Index to get to some subnets behind. This redundancy mode,
   already introduced in Sections 2.1 and 2.2, is illustrated in
   Figure 6.
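
   The indirection used with a GW IP Overlay Index, both for TS
   mobility in Section 4.1 and for the floating IP of this Section, can
   be sketched in Python as a three-table lookup. This is a simplified
   illustration under stated assumptions, not an implementation: the
   class DgwTables and its attribute names are hypothetical, prefixes
   are exact-match dictionary keys rather than longest-prefix-match
   entries, and the VNI value 10 is an arbitrary example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class DgwTables:
    """Hypothetical, simplified DGW state (exact-match keys, no LPM)."""
    # IP-VRF: prefix -> GW IP Overlay Index (populated from RT-5s)
    ip_vrf: Dict[str, str] = field(default_factory=dict)
    # ARP table: IP -> MAC (populated from RT-2s)
    arp: Dict[str, str] = field(default_factory=dict)
    # BD FIB: MAC -> (remote VTEP, VNI) (populated from RT-2s)
    bd_fib: Dict[str, Tuple[str, int]] = field(default_factory=dict)

    def resolve(self, prefix: str) -> Optional[Tuple[str, str, int]]:
        """Recursively resolve prefix -> GW IP -> MAC -> tunnel."""
        gw_ip = self.ip_vrf.get(prefix)
        mac = self.arp.get(gw_ip) if gw_ip is not None else None
        tunnel = self.bd_fib.get(mac) if mac is not None else None
        if tunnel is None:
            return None
        vtep, vni = tunnel
        return mac, vtep, vni


dgw1 = DgwTables()
dgw1.ip_vrf["SN1/24"] = "vIP23"        # from the RT-5 (GW IP = vIP23)
dgw1.arp["vIP23"] = "M2"               # from NVE2's RT-2 (TS2 active)
dgw1.bd_fib["M2"] = ("NVE2", 10)       # VNI 10 is an assumed value
before = dgw1.resolve("SN1/24")

# TS3 becomes active: only RT-2 derived state changes.
del dgw1.bd_fib["M2"]                  # NVE2 withdraws its RT-2
dgw1.arp["vIP23"] = "M3"               # NVE3 advertises an RT-2
dgw1.bd_fib["M3"] = ("NVE3", 10)
after = dgw1.resolve("SN1/24")
```

   In this sketch the IP-VRF entry for SN1/24 is never touched; only
   the ARP and BD FIB entries derived from RT-2 routes change, which is
   the property that the Overlay Index indirection is meant to provide.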

                        NVE2                            DGW1
                     +-----------+ +---------+    +-------------+
     +---TS2(VA)--|  (BD-10)  |-|         |----|  (BD-10)    |
     |     IP2/M2 +-----------+ |         |    |    IRB1\    |
     |      <-+                 |         |    |     (IP-VRF)|---+
     |        |                 |         |    +-------------+  _|_
    SN1    vIP23 (floating)     |  VXLAN/ |                    (   )
     |        |                 |  GENEVE |         DGW2      ( WAN )
     |      <-+      NVE3       |         |    +-------------+ (___)
     |     IP3/M3 +-----------+ |         |----|  (BD-10)    |   |
     +---TS3(VA)--|  (BD-10)  |-|         |    |    IRB2\    |   |
                  +-----------+ +---------+    |     (IP-VRF)|---+
                                               +-------------+

           Figure 6 Floating IP Overlay Index for redundant TS

   In this use-case, a GW IP is used as an Overlay Index for the same
   reasons as in 4.1. However, this GW IP is a floating IP that belongs
   to the active TS. Assuming TS2 is the active TS and owns vIP23:

   (1) NVE2 advertises the following BGP routes for TS2:

       o Route type 2 (MAC/IP route) containing: ML=48, M=M2, IPL=32,
         IP=vIP23 (and BGP Encapsulation Extended Community). The MAC
         and IP addresses may be learned via ARP snooping.

       o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
         ESI=0, GW IP address=vIP23. The prefix and GW IP are learned
         by policy.

   (2) NVE3 advertises the following BGP route for TS3 (it does not
       advertise an RT-2 for vIP23/M3):

       o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
         ESI=0, GW IP address=vIP23. The prefix and GW IP are learned
         by policy.

   (3) DGW1 and DGW2 import both received routes based on the Route
       Target:

       o M2 is added to the BD-10 FIB along with its corresponding
         tunnel information. For the VXLAN use case, the VTEP will be
         derived from the MAC/IP route BGP next-hop and VNI from the
         VNI field. vIP23 - M2 is added to the ARP table.

       o SN1/24 is added to the IP-VRF in DGW1 and DGW2 with Overlay
         index vIP23 pointing at M2 in the local BD-10.

   (4) When DGW1 receives a packet from the WAN with destination IPx,
       where IPx belongs to SN1/24:

       o A destination IP lookup is performed on the DGW1 IP-VRF
         routing table and Overlay Index=vIP23 is found. Since vIP23 is
         an Overlay Index, a recursive route resolution for vIP23 is
         required.

       o vIP23 is resolved to M2 in the ARP table, and M2 is resolved
         to the tunnel information given by the BD (remote VTEP and VNI
         for the VXLAN case).

       o The IP packet destined to IPx is encapsulated with:

           . Source inner MAC = IRB1 MAC.

           . Destination inner MAC = M2.

           . Tunnel information provided by the BD FIB (VNI, VTEP IPs
             and MACs for the VXLAN case).

   (5) When the packet arrives at NVE2:

       o Based on the tunnel information (VNI for the VXLAN case), the
         BD-10 context is identified for a MAC lookup.

       o Encapsulation is stripped off and based on a MAC lookup
         (assuming MAC forwarding on the egress NVE), the packet is
         forwarded to TS2, where it will be properly routed.

   (6) When the redundancy protocol running between TS2 and TS3 appoints
       TS3 as the new active TS for SN1, TS3 will now own the floating
       vIP23 and will signal this new ownership, using a gratuitous ARP
       REPLY message (explained in [RFC5227]) or similar. Upon receiving
       the new owner's notification, NVE3 will issue a route type 2 for
       M3-vIP23 and NVE2 will withdraw the RT-2 for M2-vIP23. DGW1 and
       DGW2 will update their ARP tables with the new MAC resolving the
       floating IP. No changes are made in the IP-VRF routing table.

4.3 Bump-in-the-Wire Use-Case

   Figure 7 illustrates an example of inter-subnet forwarding for an IP
   Prefix route that carries a subnet SN1. In this use-case, TS2 and TS3
   are layer 2 VA devices without any IP address that can be included as
   an Overlay Index in the GW IP field of the IP Prefix route. Their MAC
   addresses are M2 and M3 respectively, and both are connected to
   BD-10.
   Note that IRB1 and IRB2 (in DGW1 and DGW2 respectively) have IP
   addresses in a subnet different from SN1.

                        NVE2                            DGW1
              M2  +-----------+ +---------+    +-------------+
     +---TS2(VA)--|  (BD-10)  |-|         |----|  (BD-10)    |
     |      ESI23 +-----------+ |         |    |    IRB1\    |
     |        +                 |         |    |     (IP-VRF)|---+
     |        |                 |         |    +-------------+  _|_
    SN1       |                 |  VXLAN/ |                    (   )
     |        |                 |  GENEVE |         DGW2      ( WAN )
     |        +      NVE3       |         |    +-------------+ (___)
     |      ESI23 +-----------+ |         |----|  (BD-10)    |   |
     +---TS3(VA)--|  (BD-10)  |-|         |    |    IRB2\    |   |
              M3  +-----------+ +---------+    |     (IP-VRF)|---+
                                               +-------------+

                  Figure 7 Bump-in-the-wire use-case

   Since neither TS2 nor TS3 can participate in any dynamic routing
   protocol and neither has an IP address assigned, there are two
   potential Overlay Index types that can be used when advertising SN1:

   a) an ESI, i.e., ESI23, that can be provisioned on the attachment
      ports of NVE2 and NVE3, as shown in Figure 7.
   b) or the VA's MAC address, that can be added to NVE2 and NVE3 by
      policy.

   The advantage of using an ESI as Overlay Index, as opposed to the
   VA's MAC address, is that the forwarding to the egress NVE can be
   done purely based on the state of the AC in the ES (notified by the
   Ethernet A-D per-EVI route) and all the EVPN multi-homing redundancy
   mechanisms can be reused. For instance, the [RFC7432] mass-withdrawal
   mechanism for fast failure detection and propagation can be used.
   This Section assumes that an ESI Overlay Index is used in this
   use-case, but it does not prevent the use of the VA's MAC address as
   an Overlay Index. If a MAC is used as Overlay Index, the control
   plane must follow the procedures described in Section 4.4.3.
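
   The ESI-based resolution outlined above, and detailed in the
   procedure that follows, can be sketched in Python. The fragment is
   illustrative only: the class names AdPerEviRoute and PrefixRoute,
   the function resolve_via_esi and the VNI value 10 are assumptions
   made for this example. It models the rule that the RT-5 selected for
   forwarding is the one advertised by the NVE that also advertised the
   Ethernet A-D per-EVI route for the ESI.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class AdPerEviRoute:
    """Simplified Ethernet A-D per-EVI route (RT-1)."""
    esi: str
    nve: str   # advertising NVE (BGP next-hop)
    vni: int


@dataclass(frozen=True)
class PrefixRoute:
    """Simplified RT-5 carrying an ESI Overlay Index."""
    prefix: str
    esi: str
    router_mac: str  # TS MAC from the Router's MAC Extended Community
    nve: str         # advertising NVE


def resolve_via_esi(prefix: str,
                    rt5_routes: Iterable[PrefixRoute],
                    ad_routes: Iterable[AdPerEviRoute]
                    ) -> Optional[Tuple[str, str, int]]:
    """Return (inner destination MAC, egress NVE, VNI) or None.

    The egress NVE is the one with an active A-D per-EVI route for
    the ESI; the inner MAC comes from that NVE's RT-5.
    """
    rt5_list = list(rt5_routes)
    for ad in ad_routes:
        for rt5 in rt5_list:
            if (rt5.prefix == prefix and rt5.esi == ad.esi
                    and rt5.nve == ad.nve):
                return rt5.router_mac, ad.nve, ad.vni
    return None


rt5s = [PrefixRoute("SN1/24", "ESI23", "M2", "NVE2"),
        PrefixRoute("SN1/24", "ESI23", "M3", "NVE3")]

# TS2 active: only NVE2 advertises the A-D per-EVI route.
active_ts2 = resolve_via_esi("SN1/24", rt5s,
                             [AdPerEviRoute("ESI23", "NVE2", 10)])
# After a switchover, NVE3 advertises it and NVE2 withdraws its own.
active_ts3 = resolve_via_esi("SN1/24", rt5s,
                             [AdPerEviRoute("ESI23", "NVE3", 10)])
```

   Both RT-5s stay in place across the switchover; the presence of the
   A-D per-EVI route alone decides which egress NVE and inner
   destination MAC are used.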

   The model supports VA redundancy in a similar way to the one
   described in Section 4.2 for the floating IP Overlay Index use-case,
   except that it uses the EVPN Ethernet A-D per-EVI route instead of
   the MAC advertisement route to advertise the location of the Overlay
   Index. The procedure is explained below:

   (1) Assuming TS2 is the active TS in ESI23, NVE2 advertises the
       following BGP routes:

       o Route type 1 (Ethernet A-D route for BD-10) containing:
         ESI=ESI23 and the corresponding tunnel information (VNI
         field), as well as the BGP Encapsulation Extended Community as
         per [RFC8365].

       o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
         ESI=ESI23, GW IP address=0. The Router's MAC Extended
         Community defined in [EVPN-INTERSUBNET] is added and carries
         the MAC address (M2) associated to the TS behind which SN1
         sits. M2 may be learned by policy; however, the MAC in the
         Extended Community is preferred if sent with the route.

   (2) NVE3 advertises the following BGP route for TS3 (no AD per-EVI
       route is advertised):

       o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
         ESI=ESI23, GW IP address=0. The Router's MAC Extended
         Community is added and carries the MAC address (M3) associated
         to the TS behind which SN1 sits. M3 may be learned by policy;
         however, the MAC in the Extended Community is preferred if
         sent with the route.

   (3) DGW1 and DGW2 import the received routes based on the Route
       Target:

       o The tunnel information to get to ESI23 is installed in DGW1
         and DGW2. For the VXLAN use case, the VTEP will be derived
         from the Ethernet A-D route BGP next-hop and VNI from the
         VNI/VSID field (see [RFC8365]).

       o The RT-5 coming from the NVE that advertised the RT-1 is
         selected and SN1/24 is added to the IP-VRF in DGW1 and DGW2
         with Overlay Index ESI23 and MAC = M2.

   (4) When DGW1 receives a packet from the WAN with destination IPx,
       where IPx belongs to SN1/24:

       o A destination IP lookup is performed on the DGW1 IP-VRF
         routing table and Overlay Index=ESI23 is found. Since ESI23 is
         an Overlay Index, a recursive route resolution is required to
         find the egress NVE where ESI23 resides.

       o The IP packet destined to IPx is encapsulated with:

           . Source inner MAC = IRB1 MAC.

           . Destination inner MAC = M2 (this MAC will be obtained
             from the Router's MAC Extended Community received along
             with the RT-5 for SN1). Note that the Router's MAC
             Extended Community is used in this case to carry the TS's
             MAC address, as opposed to the NVE/PE's MAC address.

           . Tunnel information for the NVO tunnel is provided by the
             Ethernet A-D route per-EVI for ESI23 (VNI and VTEP IP for
             the VXLAN case).

   (5) When the packet arrives at NVE2:

       o Based on the tunnel demultiplexer information (VNI for the
         VXLAN case), the BD-10 context is identified for a MAC lookup
         (assuming MAC-based disposition model [RFC7432]) or the VNI
         may directly identify the egress interface (for an MPLS-based
         disposition model, which in this context is a VNI-based
         disposition model).

       o Encapsulation is stripped off and based on a MAC lookup
         (assuming MAC forwarding on the egress NVE) or a VNI lookup
         (in case of VNI forwarding), the packet is forwarded to TS2,
         where it will be forwarded to SN1.

   (6) If the redundancy protocol running between TS2 and TS3 follows an
       active/standby model and there is a failure, appointing TS3 as
       the new active TS for SN1, TS3 will now own the connectivity to
       SN1 and will signal this new ownership. Upon receiving the new
       owner's notification, NVE3's AC will become active and issue a
       route type 1 for ESI23, whereas NVE2 will withdraw its Ethernet
       A-D route for ESI23.
       DGW1 and DGW2 will update their tunnel information to resolve
       ESI23. The destination inner MAC will be changed to M3.

4.4 IP-VRF-to-IP-VRF Model

   This use-case is similar to the scenario described in "IRB forwarding
   on NVEs for Tenant Systems" in [EVPN-INTERSUBNET]; however, the new
   requirement here is the advertisement of IP Prefixes as opposed to
   only host routes.

   In the examples described in Sections 4.1, 4.2 and 4.3, the BD
   instance can connect IRB interfaces and any other Tenant Systems
   connected to it. EVPN provides connectivity for:

   1. Traffic destined to the IRB or TS IP interfaces as well as

   2. Traffic destined to IP subnets sitting behind the TS, e.g., SN1 or
      SN2.

   In order to provide connectivity for (1), MAC/IP routes (RT-2) are
   needed so that IRB or TS MACs and IPs can be distributed.
   Connectivity type (2) is accomplished by the exchange of IP Prefix
   routes (RT-5) for IPs and subnets sitting behind certain Overlay
   Indexes, e.g., GW IP or ESI or TS MAC.

   In some cases, IP Prefix routes may be advertised for subnets and IPs
   sitting behind an IRB. This use-case is referred to as the
   "IP-VRF-to-IP-VRF" model.

   [EVPN-INTERSUBNET] defines an asymmetric IRB model and a symmetric
   IRB model, based on the required lookups at the ingress and egress
   NVE: the asymmetric model requires an IP lookup and a MAC lookup at
   the ingress NVE, whereas only a MAC lookup is needed at the egress
   NVE; the symmetric model requires IP and MAC lookups at both the
   ingress and egress NVEs. From that perspective, the IP-VRF-to-IP-VRF
   use-case described in this Section is a symmetric IRB model.

   Note that, in an IP-VRF-to-IP-VRF scenario, out of the many subnets
   that a tenant may have, it may be the case that only a few are
   attached to a given NVE/PE's IP-VRF.
   In order to provide inter-subnet connectivity among the set of
   NVE/PEs where the tenant is connected, a new SBD is created on all of
   them if recursive resolution is needed. This SBD is instantiated as a
   regular BD (with no ACs) in each NVE/PE and has an IRB interface that
   connects the SBD to the IP-VRF. The IRB interface's IP or MAC address
   is used as the overlay index for recursive resolution.

   Depending on the existence and characteristics of the SBD and IRB
   interfaces for the IP-VRFs, there are three different
   IP-VRF-to-IP-VRF scenarios identified and described in this document:

   1) Interface-less model: no SBD and no overlay indexes required.
   2) Interface-ful with SBD IRB model: it requires SBD, as well as GW
      IP addresses as overlay indexes.
   3) Interface-ful with unnumbered SBD IRB model: it requires SBD, as
      well as MAC addresses as overlay indexes.

   Inter-subnet IP multicast is outside the scope of this document.

4.4.1 Interface-less IP-VRF-to-IP-VRF Model

   Figure 8 will be used for the description of this model.

                   NVE1(M1)
            +------------+
    IP1+----|  (BD-1)    |                DGW1(M3)
            |      \     |    +---------+ +--------+
            |    (IP-VRF)|----|         |-|(IP-VRF)|----+
            |      /     |    |         | +--------+    |
        +---|  (BD-2)    |    |         |              _+_
        |   +------------+    |         |             (   )
     SN1|                     |  VXLAN/ |            ( WAN )--H1
        |          NVE2(M2)   |  GENEVE/|             (___)
        |   +------------+    |  MPLS   |               +
        +---|  (BD-2)    |    |         | DGW2(M4)      |
            |       \    |    |         | +--------+    |
            |    (IP-VRF)|----|         |-|(IP-VRF)|----+
            |       /    |    +---------+ +--------+
    SN2+----|  (BD-3)    |
            +------------+

             Figure 8 Interface-less IP-VRF-to-IP-VRF model

   In this case:

   a) The NVEs and DGWs must provide connectivity between hosts in SN1,
      SN2, IP1 and hosts sitting at the other end of the WAN, for
      example, H1. It is assumed that the DGWs import/export IP and/or
      VPN-IP routes from/to the WAN.

   b) The IP-VRF instances in the NVE/DGWs are directly connected
      through NVO tunnels, and no IRBs and/or BD instances are
      instantiated to connect the IP-VRFs.

   c) The solution must provide layer 3 connectivity among the IP-VRFs
      for Ethernet NVO tunnels, for instance, VXLAN or GENEVE.

   d) The solution may provide layer 3 connectivity among the IP-VRFs
      for IP NVO tunnels, for example, GENEVE (with IP payload).

   In order to meet the above requirements, the EVPN route type 5 will
   be used to advertise the IP Prefixes, along with the Router's MAC
   Extended Community as defined in [EVPN-INTERSUBNET] if the
   advertising NVE/DGW uses Ethernet NVO tunnels. Each NVE/DGW will
   advertise an RT-5 for each of its prefixes with the following fields:

       o RD as per [RFC7432].

       o Ethernet Tag ID=0.

       o IP Prefix Length and IP address, as explained in the previous
         Sections.

       o GW IP address=0.

       o ESI=0

       o MPLS label or VNI corresponding to the IP-VRF.

   Each RT-5 will be sent with a Route Target identifying the tenant
   (IP-VRF) and may be sent with two BGP extended communities:

       o The first one is the BGP Encapsulation Extended Community, as
         per [RFC5512], identifying the tunnel type.

       o The second one is the Router's MAC Extended Community as per
         [EVPN-INTERSUBNET] containing the MAC address associated to
         the NVE advertising the route. This MAC address identifies the
         NVE/DGW and MAY be reused for all the IP-VRFs in the NVE. The
         Router's MAC Extended Community must be sent if the route is
         associated to an Ethernet NVO tunnel, for instance, VXLAN. If
         the route is associated to an IP NVO tunnel, for instance
         GENEVE with IP payload, the Router's MAC Extended Community
         should not be sent.
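
   As a rough illustration of the field settings listed above, the
   following Python sketch builds an interface-less RT-5. The class
   InterfaceLessRt5 and the function build_interface_less_rt5 are
   hypothetical names chosen for this example, the RD and Route Target
   are omitted for brevity, and the check on the label reflects the
   non-zero Label rows of Table 1 rather than any normative encoding
   rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InterfaceLessRt5:
    """Simplified RT-5 as advertised in the interface-less model."""
    ip_prefix: str
    prefix_len: int
    label: int                  # MPLS label or VNI of the IP-VRF
    router_mac: Optional[str]   # Router's MAC Extended Community, if any
    ethernet_tag: int = 0
    esi: int = 0
    gw_ip: int = 0


def build_interface_less_rt5(ip_prefix: str, prefix_len: int, label: int,
                             nve_mac: str,
                             ethernet_nvo: bool) -> InterfaceLessRt5:
    """Build the RT-5; the Router's MAC is attached only for
    Ethernet NVO tunnels and left out for IP NVO tunnels."""
    if label == 0:
        # Assumption of this sketch: with GW IP=0 and ESI=0, Table 1
        # only lists combinations with a non-zero Label for this model.
        raise ValueError("interface-less RT-5 needs a non-zero label/VNI")
    router_mac = nve_mac if ethernet_nvo else None
    return InterfaceLessRt5(ip_prefix, prefix_len, label, router_mac)


# NVE1 advertising SN1/24 over VXLAN (Ethernet NVO tunnel).
vxlan_route = build_interface_less_rt5("SN1", 24, 10, "M1",
                                       ethernet_nvo=True)
# The same prefix over an IP NVO tunnel (e.g., GENEVE with IP payload).
ip_nvo_route = build_interface_less_rt5("SN1", 24, 10, "M1",
                                        ethernet_nvo=False)
```

   The two resulting routes correspond to the last two rows of Table 1:
   MAC present with a non-zero Label, and MAC absent with a non-zero
   Label.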

   The following example illustrates the procedure to advertise and
   forward packets to SN1/24 (IPv4 prefix advertised from NVE1):

   (1) NVE1 advertises the following BGP route:

       o Route type 5 (IP Prefix route) containing:

           . IPL=24, IP=SN1, Label=10.

           . GW IP= set to 0.

           . [RFC5512] BGP Encapsulation Extended Community.

           . Router's MAC Extended Community that contains M1.

           . Route Target identifying the tenant (IP-VRF).

   (2) DGW1 imports the received routes from NVE1:

       o DGW1 installs SN1/24 in the IP-VRF identified by the RT-5
         Route Target.

       o Since GW IP=ESI=0, the Label is a non-zero value and the local
         policy indicates this interface-less model, DGW1 will use the
         Label and next-hop of the RT-5, as well as the MAC address
         conveyed in the Router's MAC Extended Community (as inner
         destination MAC address) to set up the forwarding state and
         later encapsulate the routed IP packets.

   (3) When DGW1 receives a packet from the WAN with destination IPx,
       where IPx belongs to SN1/24:

       o A destination IP lookup is performed on the DGW1 IP-VRF
         routing table. The lookup yields SN1/24.

       o Since the RT-5 for SN1/24 had a GW IP=ESI=0, a non-zero Label
         and next-hop and the model is interface-less, DGW1 will not
         need a recursive lookup to resolve the route.

       o The IP packet destined to IPx is encapsulated with: Source
         inner MAC = DGW1 MAC, Destination inner MAC = M1, Source outer
         IP (tunnel source IP) = DGW1 IP, Destination outer IP (tunnel
         destination IP) = NVE1 IP. The Source and Destination inner
         MAC addresses are not needed if IP NVO tunnels are used.

   (4) When the packet arrives at NVE1:

       o NVE1 will identify the IP-VRF for an IP lookup based on the
         Label (the Destination inner MAC is not needed to identify the
         IP-VRF).

       o An IP lookup is performed in the routing context, where SN1
         turns out to be a local subnet associated to BD-2. A
         subsequent lookup in the ARP table and the BD FIB will provide
         the forwarding information for the packet in BD-2.

   The model described above is called Interface-less model since the
   IP-VRFs are connected directly through tunnels and do not require
   those tunnels to be terminated in SBDs, as is the case in Sections
   4.4.2 and 4.4.3.

4.4.2 Interface-ful IP-VRF-to-IP-VRF with SBD IRB

   Figure 9 will be used for the description of this model.

                NVE1
           +------------+                       DGW1
   IP10+---+(BD-1)      | +---------------+ +------------+
           |  \         | |               | |            |
           |(IP-VRF)-(SBD)|               |(SBD)-(IP-VRF)|-----+
           |  /    IRB(IP1/M1)         IRB(IP3/M3)       |     |
       +---+(BD-2)      | |               | +------------+    _+_
       |   +------------+ |               |                  (   )
    SN1|                  |     VXLAN/    |                 ( WAN )--H1
       |        NVE2      |     GENEVE/   |                  (___)
       |   +------------+ |     MPLS      |     DGW2           +
       +---+(BD-2)      | |               | +------------+     |
           |  \         | |               | |            |     |
           |(IP-VRF)-(SBD)|               |(SBD)-(IP-VRF)|-----+
           |  /    IRB(IP2/M2)         IRB(IP4/M4)       |
   SN2+----+(BD-3)      | +---------------+ +------------+
           +------------+

               Figure 9 Interface-ful with SBD IRB model

   In this model:

   a) As in Section 4.4.1, the NVEs and DGWs must provide connectivity
      between hosts in SN1, SN2, IP10 and hosts sitting at the other end
      of the WAN.

   b) However, the NVE/DGWs are now connected through Ethernet NVO
      tunnels terminated in the SBD instance. The IP-VRFs use IRB
      interfaces for their connectivity to the SBD.

   c) Each SBD IRB has an IP and a MAC address, where the IP address
      must be reachable from other NVEs or DGWs.

   d) The SBD is attached to all the NVE/DGWs in the tenant domain BDs.

   e) The solution must provide layer 3 connectivity for Ethernet NVO
      tunnels, for instance, VXLAN or GENEVE (with Ethernet payload).

   EVPN type 5 routes will be used to advertise the IP Prefixes, whereas
   EVPN RT-2 routes will advertise the MAC/IP addresses of each SBD IRB
   interface. Each NVE/DGW will advertise an RT-5 for each of its
   prefixes with the following fields:

       o RD as per [RFC7432].

       o Ethernet Tag ID=0.

       o IP Prefix Length and IP address, as explained in the previous
         Sections.

       o GW IP address=IRB-IP of the SBD (this is the Overlay Index
         that will be used for the recursive route resolution).

       o ESI=0

       o Label value should be zero since the RT-5 route requires a
         recursive lookup resolution to an RT-2 route. It is ignored on
         reception, and, when forwarding packets, the MPLS label or VNI
         from the RT-2's MPLS Label1 field is used.

   Each RT-5 will be sent with a Route Target identifying the tenant
   (IP-VRF). The Router's MAC Extended Community should not be sent in
   this case.

   The following example illustrates the procedure to advertise and
   forward packets to SN1/24 (IPv4 prefix advertised from NVE1):

   (1) NVE1 advertises the following BGP routes:

       o Route type 5 (IP Prefix route) containing:

           . IPL=24, IP=SN1, Label SHOULD be set to 0.

           . GW IP=IP1 (SBD IRB's IP)

           . Route Target identifying the tenant (IP-VRF).

       o Route type 2 (MAC/IP route for the SBD IRB) containing:

           . ML=48, M=M1, IPL=32, IP=IP1, Label=10.

           . A [RFC5512] BGP Encapsulation Extended Community.

           . Route Target identifying the SBD. This Route Target may be
             the same as the one used with the RT-5.

   (2) DGW1 imports the received routes from NVE1:

       o DGW1 installs SN1/24 in the IP-VRF identified by the RT-5
         Route Target.

           . Since GW IP is different from zero, the GW IP (IP1) will
             be used as the Overlay Index for the recursive route
             resolution to the RT-2 carrying IP1.

   (3) When DGW1 receives a packet from the WAN with destination IPx,
       where IPx belongs to SN1/24:

       o A destination IP lookup is performed on the DGW1 IP-VRF
         routing table. The lookup yields SN1/24, which is associated
         to the Overlay Index IP1. The forwarding information is
         derived from the RT-2 received for IP1.

       o The IP packet destined to IPx is encapsulated with: Source
         inner MAC = M3, Destination inner MAC = M1, Source outer IP
         (source VTEP) = DGW1 IP, Destination outer IP (destination
         VTEP) = NVE1 IP.

   (4) When the packet arrives at NVE1:

       o NVE1 will identify the IP-VRF for an IP lookup based on the
         Label and the inner MAC DA.

       o An IP lookup is performed in the routing context, where SN1
         turns out to be a local subnet associated to BD-2. A
         subsequent lookup in the ARP table and the BD FIB will provide
         the forwarding information for the packet in BD-2.

   The model described above is called 'Interface-ful with SBD IRB
   model' because the tunnels connecting the DGWs and NVEs need to be
   terminated into the SBD. The SBD is connected to the IP-VRFs via SBD
   IRB interfaces, and that allows the recursive resolution of RT-5s to
   GW IP addresses.

4.4.3 Interface-ful IP-VRF-to-IP-VRF with Unnumbered SBD IRB

   Figure 10 will be used for the description of this model. Note that
   this model is similar to the one described in Section 4.4.2, only
   without IP addresses on the SBD IRB interfaces.

                NVE1
           +------------+                       DGW1
   IP1+----+(BD-1)      | +---------------+ +------------+
           |  \         | |               | |            |
           |(IP-VRF)-(SBD)|               (SBD)-(IP-VRF) |-----+
           |  /    IRB(M1)|               | IRB(M3)      |     |
       +---+(BD-2)      | |               | +------------+    _+_
       |   +------------+ |               |                  (   )
    SN1|                  |     VXLAN/    |                 ( WAN )--H1
       |        NVE2      |     GENEVE/   |                  (___)
       |   +------------+ |     MPLS      |     DGW2           +
       +---+(BD-2)      | |               | +------------+     |
           |  \         | |               | |            |     |
           |(IP-VRF)-(SBD)|               (SBD)-(IP-VRF) |-----+
           |  /    IRB(M2)|               | IRB(M4)      |
   SN2+----+(BD-3)      | +---------------+ +------------+
           +------------+

         Figure 10 Interface-ful with unnumbered SBD IRB model

   In this model:

   a) As in Sections 4.4.1 and 4.4.2, the NVEs and DGWs must provide
      connectivity between hosts in SN1, SN2, IP1 and hosts sitting at
      the other end of the WAN.

   b) As in Section 4.4.2, the NVE/DGWs are connected through Ethernet
      NVO tunnels terminated in the SBD instance. The IP-VRFs use IRB
      interfaces for their connectivity to the SBD.

   c) However, each SBD IRB has a MAC address only, and no IP address
      (that is why the model refers to an 'unnumbered' SBD IRB). In this
      model, there is no need to have IP reachability to the SBD IRB
      interfaces themselves and there is a requirement to limit the
      number of IP addresses used.

   d) As in Section 4.4.2, the SBD is composed of all the NVE/DGW BDs of
      the tenant that need inter-subnet-forwarding.

   e) As in Section 4.4.2, the solution must provide layer 3
      connectivity for Ethernet NVO tunnels, for instance, VXLAN or
      GENEVE (with Ethernet payload).

   This model will also make use of the RT-5 recursive resolution. EVPN
   type 5 routes will advertise the IP Prefixes along with the Router's
   MAC Extended Community used for the recursive lookup, whereas EVPN
   RT-2 routes will advertise the MAC addresses of each SBD IRB
   interface (this time without an IP).
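
   The recursive resolution with a MAC Overlay Index can be sketched in
   Python as follows. This is an illustrative fragment only: the
   function encapsulate_via_mac_overlay, the dictionary-based tables
   and the label value 10 are assumptions of the example, and prefixes
   are matched exactly instead of by longest-prefix match.

```python
from typing import Dict, Optional, Tuple


def encapsulate_via_mac_overlay(prefix: str,
                                ip_vrf: Dict[str, str],
                                sbd_fib: Dict[str, Tuple[str, int]],
                                local_irb_mac: str,
                                local_vtep: str) -> Optional[Dict[str, object]]:
    """Resolve prefix -> MAC Overlay Index -> tunnel, then return the
    encapsulation parameters, or None if resolution fails.

    ip_vrf:  prefix -> MAC taken from the RT-5 Router's MAC
             Extended Community.
    sbd_fib: MAC -> (remote VTEP, label/VNI) taken from the RT-2
             advertised for the SBD IRB (MAC only, IPL=0).
    """
    overlay_mac = ip_vrf.get(prefix)
    if overlay_mac is None:
        return None
    tunnel = sbd_fib.get(overlay_mac)
    if tunnel is None:
        # No RT-2 for the MAC Overlay Index: the RT-5 is unresolved.
        return None
    remote_vtep, label = tunnel
    return {
        "inner_src_mac": local_irb_mac,
        "inner_dst_mac": overlay_mac,
        "outer_src_ip": local_vtep,
        "outer_dst_ip": remote_vtep,
        "label": label,
    }


# DGW1 state after importing NVE1's RT-5 (SN1/24, Router's MAC = M1)
# and NVE1's RT-2 for M1 (label 10 is an assumed value).
dgw1_ip_vrf = {"SN1/24": "M1"}
dgw1_sbd_fib = {"M1": ("NVE1 IP", 10)}
encap = encapsulate_via_mac_overlay("SN1/24", dgw1_ip_vrf, dgw1_sbd_fib,
                                    local_irb_mac="M3",
                                    local_vtep="DGW1 IP")
```

   Compared with the GW IP Overlay Index, no ARP step is involved: the
   MAC carried with the RT-5 is looked up directly in the SBD FIB
   populated by the RT-2.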

   Each NVE/DGW will advertise an RT-5 for each of its prefixes with the
   same fields as described in 4.4.2 except for:

       o GW IP address= set to 0.

   Each RT-5 will be sent with a Route Target identifying the tenant
   (IP-VRF) and the Router's MAC Extended Community containing the MAC
   address associated to the SBD IRB interface. This MAC address may be
   reused for all the IP-VRFs in the NVE.

   The example is similar to the one in Section 4.4.2:

   (1) NVE1 advertises the following BGP routes:

       o Route type 5 (IP Prefix route) containing the same values as
         in the example in Section 4.4.2, except for:

           . GW IP SHOULD be set to 0.

           . Router's MAC Extended Community containing M1 (this will
             be used for the recursive lookup to an RT-2).

       o Route type 2 (MAC route for the SBD IRB) with the same values
         as in Section 4.4.2 except for:

           . ML=48, M=M1, IPL=0, Label=10.

   (2) DGW1 imports the received routes from NVE1:

       o DGW1 installs SN1/24 in the IP-VRF identified by the RT-5
         Route Target.

           . The MAC contained in the Router's MAC Extended Community
             sent along with the RT-5 (M1) will be used as the Overlay
             Index for the recursive route resolution to the RT-2
             carrying M1.

   (3) When DGW1 receives a packet from the WAN with destination IPx,
       where IPx belongs to SN1/24:

       o A destination IP lookup is performed on the DGW1 IP-VRF
         routing table. The lookup yields SN1/24, which is associated
         to the Overlay Index M1. The forwarding information is derived
         from the RT-2 received for M1.

       o The IP packet destined to IPx is encapsulated with: Source
         inner MAC = M3, Destination inner MAC = M1, Source outer IP
         (source VTEP) = DGW1 IP, Destination outer IP (destination
         VTEP) = NVE1 IP.

   (4) When the packet arrives at NVE1:

       o NVE1 will identify the IP-VRF for an IP lookup based on the
         Label and the inner MAC DA.

       o An IP lookup is performed in the routing context, where SN1
         turns out to be a local subnet associated to BD-2. A
         subsequent lookup in the ARP table and the BD FIB will provide
         the forwarding information for the packet in BD-2.

   The model described above is called Interface-ful with unnumbered SBD
   IRB model (as in Section 4.4.2), only this time the SBD IRB does not
   have an IP address.

5. Security Considerations

   This document provides a set of procedures to achieve Inter-Subnet
   Forwarding across NVEs or PEs attached to a group of BDs that belong
   to the same tenant (or VPN). The security considerations discussed in
   [RFC7432] apply to the Intra-Subnet Forwarding or communication
   within each of those BDs. In addition, the security considerations in
   [RFC4364] should also be understood, since this document and
   [RFC4364] may be used in similar applications.

   Contrary to [RFC4364], this document does not describe PE/CE route
   distribution techniques, but rather considers the CEs as TSes or VAs
   that do not run dynamic routing protocols. This can be considered a
   security advantage, since dynamic routing protocols can be blocked on
   the NVE/PE ACs, not allowing the tenant to interact with the
   infrastructure's dynamic routing protocols.

   In this document, the RT-5 may use a regular BGP Next Hop for its
   resolution or an Overlay Index that requires a recursive resolution
   to a different EVPN route (an RT-2 or an RT-1). In the latter case,
   it is worth noting that any action that ends up filtering or
   modifying the RT-2/RT-1 routes used to convey the Overlay Indexes
   will modify the resolution of the RT-5 and therefore the forwarding
   of packets to the remote subnet.

6. IANA Considerations

   This document requests value 5 in the [EVPNRouteTypes] registry
   defined by [RFC7432]:

   Value     Description         Reference
   5         IP Prefix route     [this document]

7. References

7.1 Normative References

   [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
   Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet
   VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015.

   [RFC5512] Mohapatra, P. and E. Rosen, "The BGP Encapsulation
   Subsequent Address Family Identifier (SAFI) and the BGP Tunnel
   Encapsulation Attribute", RFC 5512, DOI 10.17487/RFC5512, April 2009.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
   Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March
   1997.

   [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC2119
   Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

   [RFC8365] Sajassi-Drake et al., "A Network Virtualization Overlay
   Solution using EVPN", RFC 8365, DOI 10.17487/RFC8365, March 2018.

   [EVPN-INTERSUBNET] Sajassi et al., "IP Inter-Subnet Forwarding in
   EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-03.txt, work in
   progress, February 2017.

   [EVPNRouteTypes] IANA EVPN Route Type registry,
   https://www.iana.org/assignments/evpn

7.2 Informative References

   [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
   Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 2006.

   [RFC7606] Chen, E., Scudder, J., Mohapatra, P., and K. Patel,
   "Revised Error Handling for BGP UPDATE Messages", RFC 7606, August
   2015.

   [802.1D-REV] "IEEE Standard for Local and metropolitan area networks
   - Media Access Control (MAC) Bridges", IEEE Std. 802.1D, June 2004.

   [802.1Q] "IEEE Standard for Local and metropolitan area networks -
   Media Access Control (MAC) Bridges and Virtual Bridged Local Area
   Networks", IEEE Std 802.1Q(tm), 2014 Edition, November 2014.

   [RFC7365] Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y.
   Rekhter, "Framework for Data Center (DC) Network Virtualization", RFC
   7365, DOI 10.17487/RFC7365, October 2014.

   [RFC5227] Cheshire, S., "IPv4 Address Conflict Detection", RFC 5227,
   DOI 10.17487/RFC5227, July 2008.

   [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
   L., Sridhar, T., Bursell, M., and C. Wright, "Virtual eXtensible
   Local Area Network (VXLAN): A Framework for Overlaying Virtualized
   Layer 2 Networks over Layer 3 Networks", RFC 7348, DOI
   10.17487/RFC7348, August 2014.

   [GENEVE] Gross, J., Ed., Ganga, I., Ed., and T. Sridhar, Ed.,
   "Geneve: Generic Network Virtualization Encapsulation", Work in
   Progress, draft-ietf-nvo3-geneve-06, March 2018.

8. Acknowledgments

   The authors would like to thank Mukul Katiyar and Jeffrey Zhang for
   their valuable feedback and contributions. The following people also
   helped improve this document with their feedback: Tony Przygienda
   and Thomas Morin. Special thanks to Eric Rosen for his detailed
   review, which really helped improve the readability and clarify the
   concepts. Thank you to Alvaro Retana for his thorough review.

9. Contributors

   In addition to the authors listed on the front page, the following
   co-authors have also contributed to this document:

   Senthil Sathappan
   Florin Balus
   Aldrin Isaac
   Senad Palislamovic
   Samir Thoria

10. Authors' Addresses

   Jorge Rabadan (Editor)
   Nokia
   777 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: jorge.rabadan@nokia.com

   Wim Henderickx
   Nokia
   Email: wim.henderickx@nokia.com

   John E. Drake
   Juniper
   Email: jdrake@juniper.net

   Ali Sajassi
   Cisco
   Email: sajassi@cisco.com

   Wen Lin
   Juniper
   Email: wlin@juniper.net