idnits 2.17.1

draft-sajassi-bess-secure-evpn-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 7 instances of lines with control characters in the document.

  ** The abstract seems to contain references ([RFC8365], [RFC7432]), which
     it shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 20, 2018) is 2015 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'EVPN-PREFIX' is mentioned on line 175, but not
     defined

  == Missing Reference: 'RFC7365' is mentioned on line 199, but not defined

  == Missing Reference: 'RFC7296' is mentioned on line 671, but not defined

  == Missing Reference: 'SA' is mentioned on line 448, but not defined

  == Missing Reference: 'KE' is mentioned on line 449, but not defined

  == Missing Reference: 'Ni' is mentioned on line 446, but not defined

  == Missing Reference: 'IKEv2-IANA' is mentioned on line 459, but not
     defined

  == Unused Reference: 'IKEV2-IANA' is defined on line 780, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC7606' is defined on line 807, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-22) exists of
     draft-ietf-idr-tunnel-encaps-03

  == Outdated reference: A later version (-01) exists of
     draft-carrel-ipsecme-controller-ike-00

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IKEV2-IANA'

  == Outdated reference: A later version (-16) exists of
     draft-ietf-nvo3-geneve-06

     Summary: 2 errors (**), 0 flaws (~~), 13 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2 BESS Workgroup                                           A. Sajassi, Ed.
3 INTERNET-DRAFT                                               A. Banerjee
4 Intended Status: Standards Track                               S. Thoria
5                                                                D. Carrel
6                                                                  B. Weis
7                                                                    Cisco

9 Expires: May 20, 2019                                   October 20, 2018

11                              Secure EVPN
12                   draft-sajassi-bess-secure-evpn-00

14 Abstract

16 The applications of EVPN-based solutions ([RFC7432] and [RFC8365])
17 have become pervasive in Data Center, Service Provider, and
18 Enterprise segments.
EVPN is being used for fabric overlays and inter-
19 site connectivity in the Data Center market segment, for Layer-2,
20 Layer-3, and IRB VPN services in the Service Provider market segment,
21 and for fabric overlay and WAN connectivity in Enterprise networks.
22 For Data Center and Enterprise applications, there is a need to
23 provide inter-site and WAN connectivity over the public Internet in a
24 secured manner with the same level of privacy, integrity, and
25 authentication for tenant's traffic as IPsec tunneling using IKEv2.
26 This document presents a solution where BGP point-to-multipoint
27 signaling is leveraged for key and policy exchange among PE devices
28 to create private pair-wise IPsec Security Associations without IKEv2
29 point-to-point signaling or any other direct peer-to-peer session
30 establishment messages.

32 Status of this Memo

34 This Internet-Draft is submitted to IETF in full conformance with the
35 provisions of BCP 78 and BCP 79.

37 Internet-Drafts are working documents of the Internet Engineering
38 Task Force (IETF), its areas, and its working groups. Note that
39 other groups may also distribute working documents as
40 Internet-Drafts.

42 Internet-Drafts are draft documents valid for a maximum of six months
43 and may be updated, replaced, or obsoleted by other documents at any
44 time. It is inappropriate to use Internet-Drafts as reference
45 material or to cite them other than as "work in progress."
46 The list of current Internet-Drafts can be accessed at
47 http://www.ietf.org/1id-abstracts.html

49 The list of Internet-Draft Shadow Directories can be accessed at
50 http://www.ietf.org/shadow.html

52 Copyright and License Notice

54 Copyright (c) 2014 IETF Trust and the persons identified as the
55 document authors. All rights reserved.

57 This document is subject to BCP 78 and the IETF Trust's Legal
58 Provisions Relating to IETF Documents
59 (http://trustee.ietf.org/license-info) in effect on the date of
60 publication of this document. Please review these documents
61 carefully, as they describe your rights and restrictions with respect
62 to this document. Code Components extracted from this document must
63 include Simplified BSD License text as described in Section 4.e of
64 the Trust Legal Provisions and are provided without warranty as
65 described in the Simplified BSD License.

67 Table of Contents

69 1 Introduction
70 2 Requirements
71    2.1 Tenant's Layer-2 and Layer-3 data & control traffic
72    2.2 Tenant's Unicast & Multicast Data Protection
73    2.3 P2MP Signaling for SA setup and Maintenance
74    2.3 Granularity of Security Association Tunnels
75    2.4 Support for Policy and DH-Group List
76 3 Solution Description
77    3.1 Distribution of Public Keys and Policies
78       3.1.1 Minimum Set
79       3.1.2 Single Policy
80       3.1.3 Policy-list & DH-group-list
81    3.2 Initial IPsec SAs Generation
82    3.3 Re-Keying
83    3.4 IPsec Databases
84 4 Encapsulation
85    4.1 Standard ESP Encapsulation
86    4.2 ESP Encapsulation within UDP packet
87 5 BGP Encoding
88    5.1 ESP Notify Sub-TLV
89    5.2 ESP Key Exchange Sub-TLV
90    5.3 ESP Nonce Sub-TLV
91    5.3 ESP Proposals Sub-TLV
92 6 Applicability to other VPN types
93 7 Acknowledgements
94 8 Security Considerations
95 9 IANA Considerations
96 10 References
97    10.1 Normative References
98    10.2 Informative References
99 Authors' Addresses

101 Terminology

103 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
104 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
105 "OPTIONAL" in this document are to be interpreted as described in BCP
106 14 [RFC2119] [RFC8174] when, and only when, they appear in all
107 capitals, as shown here.

109 AC: Attachment Circuit.

111 ARP: Address Resolution Protocol.

113 BD: Broadcast Domain. As per [RFC7432], an EVI consists of a single
114 or multiple BDs. In case of VLAN-bundle and VLAN-based service models
115 (see [RFC7432]), a BD is equivalent to an EVI. In case of VLAN-aware
116 bundle service model, an EVI contains multiple BDs. Also, in this
117 document, BD and subnet are equivalent terms.

119 BD Route Target: refers to the Broadcast Domain assigned Route Target
120 [RFC4364]. In case of VLAN-aware bundle service model, all the BD
121 instances in the MAC-VRF share the same Route Target.

123 BT: Bridge Table. The instantiation of a BD in a MAC-VRF, as per
124 [RFC7432].

126 DGW: Data Center Gateway.

128 Ethernet A-D route: Ethernet Auto-Discovery (A-D) route, as per
129 [RFC7432].

131 Ethernet NVO tunnel: refers to Network Virtualization Overlay tunnels
132 with Ethernet payload. Examples of this type of tunnel are VXLAN and
133 GENEVE.

135 EVI: EVPN Instance spanning the NVE/PE devices that are participating
136 on that EVPN, as per [RFC7432].

138 EVPN: Ethernet Virtual Private Networks, as per [RFC7432].

140 GRE: Generic Routing Encapsulation.

142 GW IP: Gateway IP Address.

144 IPL: IP Prefix Length.

146 IP NVO tunnel: refers to Network Virtualization Overlay tunnels
147 with IP payload (no MAC header in the payload).

149 IP-VRF: A VPN Routing and Forwarding table for IP routes on an
150 NVE/PE. The IP routes could be populated by EVPN and IP-VPN address
151 families. An IP-VRF is also an instantiation of a layer 3 VPN in an
152 NVE/PE.

154 IRB: Integrated Routing and Bridging interface. It connects an IP-VRF
155 to a BD (or subnet).

157 MAC-VRF: A Virtual Routing and Forwarding table for Media Access
158 Control (MAC) addresses on an NVE/PE, as per [RFC7432]. A MAC-VRF is
159 also an instantiation of an EVI in an NVE/PE.

161 ML: MAC address length.

163 ND: Neighbor Discovery Protocol.

165 NVE: Network Virtualization Edge.

167 GENEVE: Generic Network Virtualization Encapsulation, [GENEVE].

169 NVO: Network Virtualization Overlays.

171 RT-2: EVPN route type 2, i.e., MAC/IP advertisement route, as defined
172 in [RFC7432].

174 RT-5: EVPN route type 5, i.e., IP Prefix route, as defined in Section
175 3 of [EVPN-PREFIX].

177 SBD: Supplementary Broadcast Domain. A BD that does not have any ACs,
178 only IRB interfaces, and it is used to provide connectivity among all
179 the IP-VRFs of the tenant. The SBD is only required in IP-VRF-to-IP-
180 VRF use-cases (see Section 4.4.).

182 SN: Subnet.

184 TS: Tenant System.

186 VA: Virtual Appliance.

188 VNI: Virtual Network Identifier.
As in [RFC8365], the term is used as
189 a representation of a 24-bit NVO instance identifier, with the
190 understanding that VNI will refer to a VXLAN Network Identifier in
191 VXLAN, or Virtual Network Identifier in GENEVE, etc. unless it is
192 stated otherwise.

194 VTEP: VXLAN Termination End Point, as in [RFC7348].

196 VXLAN: Virtual Extensible LAN, as in [RFC7348].

198 This document also assumes familiarity with the terminology of
199 [RFC7432], [RFC8365] and [RFC7365].

201 1 Introduction

203 The applications of EVPN-based solutions have become pervasive in
204 Data Center, Service Provider, and Enterprise segments. EVPN is being
205 used for fabric overlays and inter-site connectivity in the Data
206 Center market segment, for Layer-2, Layer-3, and IRB VPN services in
207 the Service Provider market segment, and for fabric overlay and WAN
208 connectivity in the Enterprise networks. For Data Center and
209 Enterprise applications, there is a need to provide inter-site and
210 WAN connectivity over the public Internet in a secured manner with the
211 same level of privacy, integrity, and authentication for tenant's
212 traffic as provided by IPsec tunneling using IKEv2. This document
213 presents a solution where BGP point-to-multipoint signaling is
214 leveraged for key and policy exchange among PE devices to create
215 private pair-wise IPsec Security Associations without IKEv2 point-to-
216 point signaling or any other direct peer-to-peer session
217 establishment messages.

219 EVPN uses BGP as the control-plane protocol for distribution of
220 information needed for discovery of PEs participating in a VPN,
221 discovery of PEs participating in a redundancy group, customer MAC
222 addresses and IP prefixes/addresses, aliasing information, tunnel
223 encapsulation types, multicast tunnel types, multicast group
224 memberships, and other information.
The advantages of using the BGP control
225 plane in EVPN are well understood, including the following:

227 1) A full mesh of BGP sessions among PE devices can be avoided by
228 using a Route Reflector (RR), where a PE only needs to set up a single
229 BGP session between itself and the RR as opposed to setting up N BGP
230 sessions to N other remote PEs, thereby reducing the number of BGP
231 sessions from O(N^2) to O(N) in the network. Furthermore, RR
232 hierarchy can be leveraged to scale the number of BGP routes on the
233 RR.

235 2) MP-BGP route filtering and constrained route distribution can be
236 leveraged to ensure that the control-plane traffic for a given VPN is
237 only distributed to the PEs participating in that VPN.

239 For setting up a point-to-point security association (i.e., an IPsec
240 tunnel) between a pair of EVPN PEs, it is important to leverage the BGP
241 point-to-multipoint signaling architecture using the RR, along with its
242 route filtering and constraint mechanisms, to achieve the performance
243 and the scale needed for a large number of security associations (IPsec
244 tunnels) along with their frequent re-keying requirements. Using BGP
245 signaling along with the RR (instead of a peer-to-peer protocol such as
246 IKEv2) reduces the number of message exchanges needed for SA
247 establishment and maintenance from O(N^2) to O(N) in the network.

250 2 Requirements

252 The requirements for secured EVPN are captured in the following
253 subsections.

255 2.1 Tenant's Layer-2 and Layer-3 data & control traffic

257 Tenant's layer-2 and layer-3 data and control traffic SHALL be
258 protected by IPsec cryptographic methods. This implies that not only
259 tenant's data traffic SHALL be protected by IPsec but also tenant's
260 control and routing information that is advertised in BGP SHALL
261 be protected by IPsec. This in turn implies that the BGP session SHALL be
262 protected by IPsec.

264 2.2 Tenant's Unicast & Multicast Data Protection

266 Tenant's layer-2 and layer-3 unicast traffic SHALL be protected by
267 IPsec. In addition, tenant's layer-2 broadcast, unknown
268 unicast, and multicast traffic as well as tenant's layer-3 multicast
269 traffic SHALL be protected by IPsec when ingress replication or
270 assisted replication is used. The use of BGP P2MP signaling for
271 setting up P2MP SAs in P2MP multicast tunnels is for future study.

273 2.3 P2MP Signaling for SA setup and Maintenance

275 BGP P2MP signaling SHALL be used for IPsec SA setup and maintenance.
276 The BGP signaling SHALL follow the P2MP signaling framework per
277 [CONTROLLER-IKE] for IPsec SA setup and maintenance in order to
278 reduce the number of message exchanges from O(N^2) to O(N) among the
279 participant PE devices.

281 2.3 Granularity of Security Association Tunnels

283 The solution SHALL support the setup and maintenance of IPsec SAs at
284 the following levels of granularity:

286 1) Per pair of PEs: A single IPsec tunnel between a pair of PEs to be
287 used for all tenants' traffic supported by the pair of PEs.

289 2) Per tenant: A single IPsec tunnel per tenant per pair of PEs. For
290 example, if there are 1000 tenants supported on a pair of PEs, then
291 1000 IPsec tunnels are required between that pair of PEs.

293 3) Per subnet: A single IPsec tunnel per subnet (e.g., per VLAN/EVI)
294 of a tenant on a pair of PEs.

296 4) Per pair of IP addresses: A single IPsec tunnel per pair of IP
297 addresses of a tenant on a pair of PEs.

299 5) Per pair of MAC addresses: A single IPsec tunnel per pair of MAC
300 addresses of a tenant on a pair of PEs.

302 2.4 Support for Policy and DH-Group List

304 The solution SHALL support a single policy and DH group for all SAs
305 as well as multiple policies and DH groups among the SAs.
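
The scaling argument behind the O(N^2) to O(N) requirement, and the per-tenant tunnel count in the granularity list, can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the draft: the function names are hypothetical, and it counts signaling sessions only (the resulting IPsec SAs themselves remain pairwise).

```python
def full_mesh_sessions(n: int) -> int:
    """Signaling sessions when every PE peers directly with every other PE.

    Each unordered pair of PEs needs one session: n * (n - 1) / 2, i.e. O(N^2).
    """
    return n * (n - 1) // 2


def route_reflector_sessions(n: int) -> int:
    """Signaling sessions when each PE keeps a single session to one RR: O(N)."""
    return n


def tunnels_between_pe_pair(tenants: int, per_tenant: bool) -> int:
    """IPsec tunnels between one pair of PEs.

    Per-PE-pair granularity needs a single tunnel; per-tenant granularity
    needs one tunnel per tenant shared by that pair.
    """
    return tenants if per_tenant else 1


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, full_mesh_sessions(n), route_reflector_sessions(n))
```

With 100 PEs, direct peering needs 4950 sessions while the RR model needs 100; with 1000 tenants on a pair of PEs, per-tenant granularity needs 1000 tunnels between that pair, matching the example in the granularity list.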

307 3 Solution Description

309 This solution uses BGP P2MP signaling where an originating PE only
310 sends a message to the Route Reflector (RR) and then the RR reflects that
311 message to the interested recipient PEs. The framework for such
312 signaling is described in [CONTROLLER-IKE] and it is referred to as the
313 device-to-controller trust model. This trust model is significantly
314 different from the traditional peer-to-peer trust model where a P2P
315 signaling protocol such as IKEv2 [RFC7296] is used in which the PE
316 devices directly authenticate each other and agree upon security
317 policy and keying material to protect communications between
318 themselves. The device-to-controller trust model leverages P2MP
319 signaling via the controller (e.g., the RR) to achieve much better
320 scale and performance for establishment and maintenance of a large
321 number of pairwise Security Associations (SAs) among the PEs.

323 This device-to-controller trust model first secures the control
324 channel between each device and the controller using a peer-to-peer
325 protocol such as IKEv2 [RFC7296] to establish P2P SAs between each PE
326 and the RR. It then uses this secured control channel for P2MP
327 signaling in establishment of P2P SAs between a pair of PE devices.

329 Each PE advertises to other PEs via the RR the information needed in
330 establishment of pair-wise SAs between itself and every other remote
331 PE. These pieces of information are sent as Sub-TLVs of the IPsec tunnel
332 type in the BGP Tunnel Encapsulation attribute. These Sub-TLVs are
333 detailed in section 5 and they are based on the IKEv2 specification
334 [RFC7296]. The IPsec tunnel TLV along with its Sub-TLVs is sent
335 along with the BGP route (NLRI) for a given level of granularity.

337 If only a single SA is required per pair of PE devices to multiplex
338 user traffic for all tenants, then the IPsec tunnel TLV is advertised
339 along with the IPv4 or IPv6 NLRI representing the loopback address of the
340 originating PE. It should be noted that this is not a VPN route but
341 rather an IPv4 or IPv6 route.

343 If an SA is required per tenant between a pair of PE devices, then the
344 IPsec tunnel TLV can be advertised along with the EVPN IMET route
345 representing the tenant or can be advertised along with a new EVPN
346 route representing the tenant.

348 If an SA is required per tenant's subnet (e.g., per VLAN) between a
349 pair of PE devices, then the IPsec tunnel TLV is advertised along with
350 the EVPN IMET route.

352 If an SA is required between a pair of tenant's devices represented by
353 a pair of IP addresses, then the IPsec tunnel TLV is advertised along
354 with the EVPN IP Prefix Advertisement route or EVPN MAC/IP Advertisement
355 route.

357 If an SA is required between a pair of tenant's devices represented by
358 a pair of MAC addresses, then the IPsec tunnel TLV is advertised along
359 with the EVPN MAC/IP Advertisement route.

361 If an SA is required between a pair of tenant's devices represented by
362 a VLAN or a port, then the IPsec tunnel TLV is advertised along with the EVPN
363 Ethernet AD route.

365 3.1 Distribution of Public Keys and Policies

367 One of the requirements for this solution is to support a single DH
368 group and a single policy for all SAs as well as to support multiple
369 DH groups and policies among the SAs. The following subsections
370 describe what pieces of information (what Sub-TLVs) need to be
371 exchanged to support a single DH group and a single policy versus
372 multiple DH groups and multiple policies.

374 3.1.1 Minimum Set

376 For SA establishment, at the minimum, a PE needs to advertise to
377 other PEs its ID, a notification to indicate if this is its initial
378 contact, key exchange including DH public number and DH group, and
379 Nonce. When a single policy is used among all SAs, it is assumed that
380 this single policy is configured by the management system in all the
381 PE devices and thus there is no need to signal it. The information
382 that needs to be signaled (using RFC7296 notation) is:

384 ID, [N(INITIAL_CONTACT),] KE, Ni; where

386 ID payload is defined in section 3.5 of [RFC7296]
387 N (Notify) Payload in section 3.10 of [RFC7296]
388 KE (Key Exchange) payload in section 3.4 of [RFC7296]
389 Ni (Nonce) payload in section 3.9 of [RFC7296]

391 The KE payload contains the DH public number and also identifies which DH
392 group to use. The ID sub-TLV would not be needed in BGP because the tunnel
393 attribute already carries the originator ID. Section 5 details these sub-
394 TLVs as part of the IPsec tunnel TLV in the BGP Tunnel Encapsulation
395 Attribute.

397 3.1.2 Single Policy

399 If a single policy needs to be signaled per tenant or per
400 subnet among a set of PEs, then in addition to the information
401 described in section 3.1.1, the Security Association sub-TLV needs to be
402 signaled as well. The payload for this sub-TLV is defined in section
403 3.3 of [RFC7296] and detailed in section 5.3.

405 ID, [N(INITIAL_CONTACT),] SA, KE, Ni

407 SA (Security Association) payload in section 3.3 of [RFC7296]

409 A single SA payload identifies a single IPsec policy. One important
410 restriction on the SA Payload is that a standard IKE SA payload can
411 contain multiple transforms; however, [CONTROLLER-IKE] restricts the
412 SA payload to only a single transform for each transform type as
413 described in section A.3.1 of [CONTROLLER-IKE].

415 3.1.3 Policy-list & DH-group-list

417 There can be scenarios for which there is a need to have multiple
418 policy options.
This can happen when there is a need for policy
419 change and smooth migration among all PE devices to the new policy is
420 required. It can also happen if different PE devices have different
421 capabilities within the network. In these scenarios, PE devices need
422 to be able to choose the correct policy to use for each other. This
423 multi-policy scheme is described in section 6 of [CONTROLLER-IKE]. In
424 order to support this multi-policy feature, a PE device MUST
425 distribute a policy list. This list consists of multiple distinct
426 policies in order of preference, where the first policy is the most
427 preferred one. The receiving PE selects the policy by taking the
428 received list (starting with the first policy) and comparing that
429 against its own list and choosing the first one found in common. If
430 there is no match, this indicates a configuration error and the PEs
431 MUST NOT establish new SAs until a message is received that does
432 produce a match.

434 Furthermore, when a device supports more than one DH group, then a
435 unique DH public number MUST be specified for each in order of
436 preference. The selection of which DH group to use follows the same
437 logic as Policy selection, using the receiver's list order until a
438 match is found in the initiator's list.

440 In order to support multi-policy, a policy list is signaled in
441 addition to the information described in section 3.1.1. Furthermore,
442 in order to support multi-DH-groups, a DH group list along with its
443 nonce list is signaled instead of a single DH group and a single
444 nonce as described in section 3.1.1.

446 ID, [N(INITIAL_CONTACT),] [SA], [KE], [Ni]

448 [SA] list of IPsec policies (i.e., list of SA payloads)
449 [KE] list of KE payloads

451 3.2 Initial IPsec SAs Generation

453 The procedure for generation of initial IPsec SAs is described in
454 section 3 of [CONTROLLER-IKE]. This section gives a summary of it in
455 the context of BGP signaling.
When a PE device first comes up and wants
456 to set up an IPsec SA between itself and each of the interested remote
457 PEs, it generates a DH pair for each of its intended IPsec SAs
458 using an algorithm defined in the IKEv2 Diffie-Hellman Group
459 Transform IDs [IKEv2-IANA]. The originating PE distributes the DH public
460 value along with a nonce (using the IPsec Tunnel TLV in the Tunnel
461 Encapsulation Attribute) to other remote PEs via the RR. Each
462 receiving PE uses this DH public number and the corresponding nonce
463 in creation of an IPsec SA pair to the originating PE - i.e., an
464 outbound SA and an inbound SA. The detailed procedures are described in
465 section 5.2 of [CONTROLLER-IKE].

467 3.3 Re-Keying

469 A PE can initiate re-keying at any time due to local time or volume
470 based policy or because a cipher counter is nearing its final
471 value. The rekey process is performed individually for each remote
472 PE. If rekeying is performed with multiple PEs simultaneously, then
473 the decision process and rules described for rekeying are applied
474 independently for each PE. Section 4 of [CONTROLLER-IKE] describes
475 this rekeying process in detail and gives examples for a single
476 IPsec device (e.g., a single PE) rekey versus multiple PE devices
477 rekeying simultaneously.

479 3.4 IPsec Databases

481 The Peer Authorization Database (PAD), the Security Policy Database
482 (SPD), and the Security Association Database (SAD) all need to be
483 set up as defined in the IPsec Security Architecture [RFC4301].
484 Section 5 of [CONTROLLER-IKE] gives a summary description of how
485 these databases are set up for the controller-based model where the key is
486 exchanged via P2MP signaling via the controller (e.g., the RR) and
487 the policy can be either signaled via the RR (in case of multiple
488 policies) or configured by the management station (in case of single
489 policy).
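
The policy selection rule of section 3.1.3 (walk the received list in order and pick the first entry also present in the local list) can be sketched as follows. This is an illustration under stated assumptions, not the draft's or [CONTROLLER-IKE]'s implementation: policies are modeled as opaque strings, and the function name and the example policy labels are hypothetical.

```python
from typing import Optional, Sequence


def select_policy(received: Sequence[str], local: Sequence[str]) -> Optional[str]:
    """Return the first policy in the received list that is also configured locally.

    The received list is in the sender's order of preference (most preferred
    first). None means there is no policy in common, which the text treats as
    a configuration error: no new SA is established until a later message
    produces a match.
    """
    local_set = set(local)
    for policy in received:
        if policy in local_set:
            return policy
    return None


if __name__ == "__main__":
    # Hypothetical policy labels, for illustration only.
    received = ["aes256-gcm", "aes128-gcm"]
    local = ["aes128-gcm", "aes256-gcm"]
    print(select_policy(received, local))
```

The order of the local list does not affect the result here; only the order of the received list does, which mirrors the wording of the policy rule in section 3.1.3.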

491 4 Encapsulation

493 The vast majority of encapsulations for Network Virtualization Overlay
494 (NVO) networks in deployment are based on UDP/IP, with the UDP destination
495 port ID indicating the type of NVO encapsulation (e.g., VxLAN, GPE,
496 GENEVE, GUE) and the UDP source port ID representing flow entropy for
497 load-balancing of the traffic within the fabric based on an n-tuple that
498 includes the UDP header. When encrypting NVO encapsulated packets using
499 IP Encapsulating Security Payload (ESP), the following two options
500 can be used: a) adding a UDP header before the ESP header (i.e., UDP
501 header in clear) and b) no UDP header before the ESP header (i.e.,
502 standard ESP encapsulation). The following subsections describe these
503 encapsulations in further detail.

505 4.1 Standard ESP Encapsulation

507 When standard IP Encapsulating Security Payload (ESP) is used
508 (without outer UDP header) for encryption of NVO packets, it is used
509 in transport mode as depicted below. When such encapsulation is used,
510 the Tunnel Type of the Tunnel Encapsulation TLV is set to ESP-Transport
511 and the Tunnel Type of the Encapsulation Extended Community is set to the NVO
512 encapsulation type (e.g., VxLAN, GENEVE, GPE, etc.). This implies
513 that the customer packets are first encapsulated using the NVO
514 encapsulation type and then further encapsulated & encrypted
515 using ESP-Transport mode.

517  +-+-+-+-+-+-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+-+-+-+-+
518  | MAC Header            |    | MAC Header            |
519  +-+-+-+-+-+-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+-+-+-+-+
520  | Eth Type = IPv4/IPv6  |    | Eth Type = IPv4/IPv6  |
521  +-+-+-+-+-+-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+-+-+-+-+
522  | IP Header             |    | IP Header             |
523  | Protocol = UDP        |    | Protocol = ESP        |
524  +-----------------------+    +-----------------------+
525  | UDP Header            |    | ESP Header            |
526  | Dest Port = VxLAN     |    +-----------------------+
527  +-----------------------+    | UDP Header            |
528  | VxLAN Header          |    | Dest Port = VxLAN     |
529  +-----------------------+    +-----------------------+
530  | Inner MAC Header      |    | VxLAN Header          |
531  +-----------------------+    +-----------------------+
532  | Inner Eth Payload     |    | Inner MAC Header      |
533  +-----------------------+    +-----------------------+
534  | CRC                   |    | Inner Eth Payload     |
535  +-----------------------+    +-----------------------+
536                               | ESP Trailer (NP=UDP)  |
537                               +-----------------------+
538                               | CRC                   |
539                               +-----------------------+

541 Figure 3: VxLAN Encapsulation within ESP

543 4.2 ESP Encapsulation within UDP packet

545 In scenarios where NAT traversal is required ([RFC3948]) or where
546 load balancing using the UDP header is required, ESP encapsulation
547 within a UDP packet as depicted in the following figure is used. The
548 ESP for NVO applications is in transport mode. The outer UDP header
549 (before the ESP header) has its source port set to flow entropy and
550 its destination port set to 4500 (indicating that an ESP header follows). A
551 non-zero SPI value in the ESP header implies that this is a data packet
552 (i.e., it is not an IKE packet). The Next Protocol field in the ESP
553 trailer indicates that what follows the ESP header is a UDP header. This
554 inner UDP header has a destination port ID that identifies the NVO
555 encapsulation type (e.g., VxLAN). Optimization of this packet format
556 where only a single UDP header is used (only the outer UDP header) is
557 for future study.
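
The outer headers described in section 4.2 can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the function names, the CRC32-based entropy hash, and the choice of the 49152-65535 source port range are assumptions made for this example, and the zero UDP checksum assumes an IPv4 outer header (where [RFC3948] recommends transmitting it as zero). Encryption itself is out of scope here.

```python
import struct
import zlib

ESP_IN_UDP_PORT = 4500  # destination port indicating that an ESP header follows


def entropy_source_port(flow_key: bytes) -> int:
    """Map an inner-flow key to a UDP source port in 49152-65535.

    Assumption: any stable hash of the inner flow works; CRC32 is used here
    only because it is in the standard library.
    """
    return 49152 + (zlib.crc32(flow_key) % 16384)


def outer_udp_and_esp_header(flow_key: bytes, spi: int, seq: int,
                             esp_body_len: int) -> bytes:
    """Build the outer UDP header followed by the ESP header (SPI, sequence).

    esp_body_len is the number of octets after the ESP header: encrypted
    payload, ESP trailer, and integrity check value.
    """
    if spi == 0:
        # A non-zero SPI marks a data packet rather than an IKE packet.
        raise ValueError("SPI must be non-zero for an ESP data packet")
    udp_len = 8 + 8 + esp_body_len  # UDP header + ESP header + ESP body
    udp = struct.pack("!HHHH", entropy_source_port(flow_key),
                      ESP_IN_UDP_PORT, udp_len, 0)  # checksum 0 (IPv4 assumed)
    esp = struct.pack("!II", spi, seq)
    return udp + esp
```

The same flow key always yields the same source port, so packets of one inner flow stay on one path while different flows spread across the fabric.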

559 When such encapsulation is used, the Tunnel Type of the Tunnel
560 Encapsulation TLV is set to ESP-in-UDP-Transport and the Tunnel Type
561 of the Encapsulation Extended Community is set to the NVO encapsulation type
562 (e.g., VxLAN, GENEVE, GPE, etc.). This implies that the customer
563 packets are first encapsulated using the NVO encapsulation type and then
564 further encapsulated & encrypted using ESP-in-UDP with
565 Transport mode.

567 [RFC3948]

569  +-+-+-+-+-+-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+-+-+-+-+
570  | MAC Header            |    | MAC Header            |
571  +-+-+-+-+-+-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+-+-+-+-+
572  | Eth Type = IPv4/IPv6  |    | Eth Type = IPv4/IPv6  |
573  +-+-+-+-+-+-+-+-+-+-+-+-+    +-+-+-+-+-+-+-+-+-+-+-+-+
574  | IP Header             |    | IP Header             |
575  | Protocol = UDP        |    | Protocol = UDP        |
576  +-----------------------+    +-----------------------+
577  | UDP Header            |    | UDP Header            |
578  | Dest Port = VxLAN     |    | Dest Port = 4500(ESP) |
579  +-----------------------+    +-----------------------+
580  | VxLAN Header          |    | ESP Header            |
581  +-----------------------+    +-----------------------+
582  | Inner MAC Header      |    | UDP Header            |
583  +-----------------------+    | Dest Port = VxLAN     |
584  | Inner Eth Payload     |    +-----------------------+
585  +-----------------------+    | VxLAN Header          |
586  | CRC                   |    +-----------------------+
587  +-----------------------+    | Inner MAC Header      |
588                               +-----------------------+
589                               | Inner Eth Payload     |
590                               +-----------------------+
591                               | ESP Trailer (NP=UDP)  |
592                               +-----------------------+
593                               | CRC                   |
594                               +-----------------------+
595 Figure 4: VxLAN Encapsulation within ESP Within UDP

597 5 BGP Encoding

599 This document defines two new Tunnel Types along with their associated
600 sub-TLVs for the Tunnel Encapsulation Attribute [TUNNEL-ENCAP]. These
601 tunnel types correspond to ESP-Transport and ESP-in-UDP-Transport as
602 described in section 4. The following sub-TLVs apply to both tunnel
603 types unless stated otherwise.
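
The sub-TLVs in the subsections that follow reuse IKEv2 payload layouts from [RFC7296]. As a rough sketch of what the Key Exchange and Nonce layouts look like on the wire, the code below packs the generic payload header (Next Payload, Critical bit, Payload Length) followed by the payload body. The function names are hypothetical, and the BGP sub-TLV type and length wrapper is deliberately omitted because this document does not assign those code points; whether the sub-TLV value carries the generic header exactly as packed here is an assumption based on the figures in section 5.

```python
import struct


def ikev2_payload(next_payload: int, body: bytes, critical: bool = False) -> bytes:
    """Prefix a payload body with the 4-octet IKEv2 generic payload header.

    Payload Length counts the header itself plus the body.
    """
    flags = 0x80 if critical else 0x00  # C bit is the top bit; the rest is reserved
    return struct.pack("!BBH", next_payload, flags, 4 + len(body)) + body


def key_exchange_payload(dh_group: int, public_value: bytes,
                         next_payload: int = 0) -> bytes:
    """Key Exchange payload: DH group number, 16 reserved bits, key exchange data."""
    body = struct.pack("!HH", dh_group, 0) + public_value
    return ikev2_payload(next_payload, body)


def nonce_payload(nonce: bytes, next_payload: int = 0) -> bytes:
    """Nonce payload: the nonce data must be 16 to 256 octets long."""
    if not 16 <= len(nonce) <= 256:
        raise ValueError("nonce data must be between 16 and 256 octets")
    return ikev2_payload(next_payload, nonce)
```

For example, a 64-octet public value for DH group 19 produces a 72-octet Key Exchange payload, and a 32-octet nonce produces a 36-octet Nonce payload.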

605 5.1 ESP Notify Sub-TLV
606 This sub-TLV corresponds to the Notify payload of the IPsec Encapsulating
607 Security Payload protocol as defined in IKEv2 [RFC7296]. This payload
608 is defined and described in section 3.10 of [RFC7296].

610  0                   1                   2                   3
611  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
612 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
613 | Next Payload  |C|  Reserved   |         Payload Length        |
614 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
615 |  Protocol ID  |   SPI Size    |      Notify Message Type      |
616 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
617 |                                                               |
618 ~                Security Parameter Index (SPI)                 ~
619 |                                                               |
620 +---------------------------------------------------------------+
621 |                                                               |
622 ~                       Notification Data                       ~
623 |                                                               |
624 +---------------------------------------------------------------+

626 Figure 5: Notify Payload Format

628 5.2 ESP Key Exchange Sub-TLV

630 This sub-TLV corresponds to the Key Exchange payload of the IPsec
631 Encapsulating Security Payload protocol as defined in IKEv2
632 [RFC7296]. This payload is defined and described in section 3.4 of
633 [RFC7296].

635  0                   1                   2                   3
636  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
637 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
638 | Next Payload  |C|  Reserved   |         Payload Length        |
639 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
640 |  Diffie-Hellman Group Number  |           Reserved            |
641 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
642 |                                                               |
643 ~                       Key Exchange Data                       ~
644 |                                                               |
645 +---------------------------------------------------------------+

647 Figure 6: Key Exchange Payload Format

649 5.3 ESP Nonce Sub-TLV

651 This sub-TLV corresponds to the Nonce payload of the IPsec Encapsulating
652 Security Payload protocol as defined in IKEv2 [RFC7296]. This payload
653 is defined and described in section 3.9 of [RFC7296].

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Next Payload  |C|  Reserved   |         Payload Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                          Nonce Data                           ~
   |                                                               |
   +---------------------------------------------------------------+

                  Figure 7: Nonce Payload Format

5.4 ESP Proposals Sub-TLV

   This sub-TLV corresponds to the Security Association payload, which
   carries the proposals, for the IPsec Encapsulating Security Payload
   (ESP) protocol, as defined in IKEv2 [RFC7296]. The payload is
   described in Section 3.3 of [RFC7296].

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Next Payload  |C|  Reserved   |         Payload Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                          <Proposals>                          ~
   |                                                               |
   +---------------------------------------------------------------+

               Figure 8: Security Association Payload

   Proposals (Variable) - one or more proposal substructures

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Last Substruc |   Reserved    |        Proposal Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Proposal Num  |  Protocol ID  |   SPI Size    | Num Transforms|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                        SPI (Variable)                         ~
   |                                                               |
   +---------------------------------------------------------------+
   |                                                               |
   ~                         <Transforms>                          ~
   |                                                               |
   +---------------------------------------------------------------+

                  Figure 9: Proposal Substructure

6 Applicability to other VPN types

   Although P2MP BGP signaling for the establishment and maintenance
   of SAs among PE devices is described in this document in the context
   of EVPN, there is no reason why it cannot be extended to other
   VPN technologies such as IP-VPN [RFC4364], VPLS ([RFC4761] and
   [RFC4762]), and MVPN ([RFC6513] and [RFC6514]) with ingress
   replication. EVPN was chosen because of its pervasiveness in DC, SP,
   and Enterprise applications and because of its ability to support SA
   establishment at different granularity levels: per PE, per tenant,
   per subnet, per Ethernet Segment, per IP address, and per MAC. Other
   VPN technologies support fewer granularity levels. For example, VPLS
   supports only per-PE and per-subnet granularity. At the per-PE
   granularity level, the mechanism is the same for all VPN
   technologies, because the IPsec tunnel type (and its associated TLV
   and sub-TLVs) is sent along with the PE's loopback IPv4 (or IPv6)
   address. For VPLS, if per-subnet (per bridge domain) granularity
   needs to be supported, the IPsec tunnel type and TLV are sent along
   with the VPLS AD route.

   The following table lists the levels of granularity that each VPN
   technology can support and the BGP route used to signal each one.

   +---------------+-------------+-------------+-----------+---------+
   | Functionality |    EVPN     |   IP-VPN    |   MVPN    |  VPLS   |
   +---------------+-------------+-------------+-----------+---------+
   | per PE        |IPv4/v6 route|IPv4/v6 route|IPv4/v6 rte| IPv4/v6 |
   +---------------+-------------+-------------+-----------+---------+
   | per tenant    |IMET (or new)|lpbk (or new)|  I-PMSI   |   N/A   |
   +---------------+-------------+-------------+-----------+---------+
   | per subnet    |    IMET     |     N/A     |    N/A    | VPLS AD |
   +---------------+-------------+-------------+-----------+---------+
   | per IP        |EVPN RT2/RT5 |  VPN IP rt  |*,G or S,G |   N/A   |
   +---------------+-------------+-------------+-----------+---------+
   | per MAC       |  EVPN RT2   |     N/A     |    N/A    |   N/A   |
   +---------------+-------------+-------------+-----------+---------+

7 Acknowledgements

8 Security Considerations

9 IANA Considerations

   A new transitive Extended Community with Type 0x06 and Sub-Type TBD,
   for the EVPN Attachment Circuit Extended Community, needs to be
   allocated by IANA.

10 References

10.1 Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
             2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
             May 2017.

   [RFC7432] Sajassi et al., "BGP MPLS-Based Ethernet VPN", RFC 7432,
             February 2015.

   [RFC8365] Sajassi et al., "A Network Virtualization Overlay Solution
             Using Ethernet VPN (EVPN)", RFC 8365, March 2018.

   [TUNNEL-ENCAP]
             Rosen et al., "The BGP Tunnel Encapsulation Attribute",
             draft-ietf-idr-tunnel-encaps-03, November 2016.

   [CONTROLLER-IKE]
             Carrel et al., "IPsec Key Exchange using a Controller",
             draft-carrel-ipsecme-controller-ike-00, July 2018.

   [RFC3948] Huttunen et al., "UDP Encapsulation of IPsec ESP Packets",
             RFC 3948, January 2005.

   [IKEV2-IANA]
             IANA, "Internet Key Exchange Version 2 (IKEv2)
             Parameters", February 2016,
             www.iana.org/assignments/ikev2-parameters/ikev2-parameters.xhtml.

   [RFC4301] Kent, S. and K. Seo, "Security Architecture for the
             Internet Protocol", RFC 4301, DOI 10.17487/RFC4301,
             December 2005.

10.2 Informative References

   [RFC4364] Rosen, E., et al., "BGP/MPLS IP Virtual Private Networks
             (VPNs)", RFC 4364, February 2006.

   [RFC4761] Kompella, K., et al., "Virtual Private LAN Service (VPLS)
             Using BGP for Auto-Discovery and Signaling", RFC 4761,
             January 2007.

   [RFC4762] Lasserre, M., et al., "Virtual Private LAN Service (VPLS)
             Using Label Distribution Protocol (LDP) Signaling",
             RFC 4762, January 2007.

   [RFC6513] Rosen, E., et al., "Multicast in MPLS/BGP IP VPNs",
             RFC 6513, February 2012.

   [RFC6514] Aggarwal, R., et al., "BGP Encodings and Procedures for
             Multicast in MPLS/BGP IP VPNs", RFC 6514, February 2012.

   [RFC7606] Chen, E., Scudder, J., Mohapatra, P., and K. Patel,
             "Revised Error Handling for BGP UPDATE Messages",
             RFC 7606, August 2015.

   [802.1Q]  "IEEE Standard for Local and metropolitan area networks -
             Media Access Control (MAC) Bridges and Virtual Bridged
             Local Area Networks", IEEE Std 802.1Q(tm), 2014 Edition,
             November 2014.

   [RFC7348] Mahalingam, M., et al., "Virtual eXtensible Local Area
             Network (VXLAN): A Framework for Overlaying Virtualized
             Layer 2 Networks over Layer 3 Networks", RFC 7348,
             DOI 10.17487/RFC7348, August 2014.

   [GENEVE]  Gross, J., et al., "Geneve: Generic Network Virtualization
             Encapsulation", Work in Progress,
             draft-ietf-nvo3-geneve-06, March 2018.

Authors' Addresses

   Ali Sajassi
   Cisco
   Email: sajassi@cisco.com

   Ayan Banerjee
   Cisco
   Email: ayabaner@cisco.com

   Samir Thoria
   Cisco
   Email: sthoria@cisco.com

   David Carrel
   Cisco
   Email: carrel@cisco.com

   Brian Weis
   Cisco
   Email: bew@cisco.com