idnits 2.17.1 draft-sajassi-bess-secure-evpn-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([RFC8365], [RFC7432]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (March 11, 2019) is 1872 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'EVPN-PREFIX' is mentioned on line 177, but not defined == Missing Reference: 'RFC7365' is mentioned on line 201, but not defined == Missing Reference: 'RFC7296' is mentioned on line 755, but not defined == Missing Reference: 'IKEv2-IANA' is mentioned on line 487, but not defined == Missing Reference: 'RFC5114' is mentioned on line 699, but not defined == Missing Reference: 'RFC4753' is mentioned on line 704, but not defined ** Obsolete undefined reference: RFC 4753 (Obsoleted by RFC 5903) == Unused Reference: 'IKEV2-IANA' is defined on line 837, but no explicit reference was found in the text == Unused Reference: 'RFC7606' is defined on line 864, but no explicit reference was found in the text == Outdated reference: A later version (-22) exists of draft-ietf-idr-tunnel-encaps-03 == Outdated reference: A later version (-01) exists of draft-carrel-ipsecme-controller-ike-00 -- Possible downref: Non-RFC (?) normative reference: ref. 'IKEV2IANA' -- Possible downref: Non-RFC (?) normative reference: ref. 'IKEV2-IANA' == Outdated reference: A later version (-16) exists of draft-ietf-nvo3-geneve-06 Summary: 2 errors (**), 0 flaws (~~), 12 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 BESS Workgroup A. Sajassi, Ed. 3 INTERNET-DRAFT A. Banerjee 4 Intended Status: Standards Track S. Thoria 5 D. Carrel 6 B. 
Weis 7 Cisco 9 Expires: September 11, 2019 March 11, 2019 11 Secure EVPN 12 draft-sajassi-bess-secure-evpn-01 14 Abstract 16 The applications of EVPN-based solutions ([RFC7432] and [RFC8365]) 17 have become pervasive in Data Center, Service Provider, and 18 Enterprise segments. It is being used for fabric overlays and inter- 19 site connectivity in the Data Center market segment, for Layer-2, 20 Layer-3, and IRB VPN services in the Service Provider market segment, 21 and for fabric overlay and WAN connectivity in Enterprise networks. 22 For Data Center and Enterprise applications, there is a need to 23 provide inter-site and WAN connectivity over public Internet in a 24 secured manner with same level of privacy, integrity, and 25 authentication for tenant's traffic as IPsec tunneling using IKEv2. 26 This document presents a solution where BGP point-to-multipoint 27 signaling is leveraged for key and policy exchange among PE devices 28 to create private pair-wise IPsec Security Associations without IKEv2 29 point-to-point signaling or any other direct peer-to-peer session 30 establishment messages. 32 Status of this Memo 34 This Internet-Draft is submitted to IETF in full conformance with the 35 provisions of BCP 78 and BCP 79. 37 Internet-Drafts are working documents of the Internet Engineering 38 Task Force (IETF), its areas, and its working groups. Note that 39 other groups may also distribute working documents as 40 Internet-Drafts. 42 Internet-Drafts are draft documents valid for a maximum of six months 43 and may be updated, replaced, or obsoleted by other documents at any 44 time. It is inappropriate to use Internet-Drafts as reference 45 material or to cite them other than as "work in progress." 
46 The list of current Internet-Drafts can be accessed at 47 http://www.ietf.org/1id-abstracts.html 49 The list of Internet-Draft Shadow Directories can be accessed at 50 http://www.ietf.org/shadow.html 52 Copyright and License Notice 54 Copyright (c) 2014 IETF Trust and the persons identified as the 55 document authors. All rights reserved. 57 This document is subject to BCP 78 and the IETF Trust's Legal 58 Provisions Relating to IETF Documents 59 (http://trustee.ietf.org/license-info) in effect on the date of 60 publication of this document. Please review these documents 61 carefully, as they describe your rights and restrictions with respect 62 to this document. Code Components extracted from this document must 63 include Simplified BSD License text as described in Section 4.e of 64 the Trust Legal Provisions and are provided without warranty as 65 described in the Simplified BSD License. 67 Table of Contents 69 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 6 70 2 Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . 7 71 2.1 Tenant's Layer-2 and Layer-3 data & control traffic . . . . 7 72 2.2 Tenant's Unicast & Multicast Data Protection . . . . . . . . 7 73 2.3 P2MP Signaling for SA setup and Maintenance . . . . . . . . 7 74 2.4 Granularity of Security Association Tunnels . . . . . . . . 7 75 2.5 Support for Policy and DH-Group List . . . . . . . . . . . . 8 76 3 Solution Description . . . . . . . . . . . . . . . . . . . . . 8 77 3.1 Inheritance of Security Policies . . . . . . . . . . . . . . 9 78 3.2 Distribution of Public Keys and Policies . . . . . . . . . 10 79 3.2.1 Minimal DIM . . . . . . . . . . . . . . . . . . . . . . 10 80 3.2.2 Multiple Policies . . . . . . . . . . . . . . . . . . . 10 81 3.2.2.1 Multiple DH-groups . . . . . . . . . . . . . . . . 11 82 3.2.2.2 Multiple or Single ESP SA policies . . . . . . . . 11 83 3.3 Initial IPsec SAs Generation . . . . . . . . . . . . . . . 11 84 3.4 Re-Keying . . . . . . . . . . . . 
. . . . . . . . . . . . . 12 85 3.5 IPsec Databases . . . . . . . . . . . . . . . . . . . . . . 12 86 4 Encapsulation . . . . . . . . . . . . . . . . . . . . . . . . . 12 87 4.1 Standard ESP Encapsulation . . . . . . . . . . . . . . . . . 13 88 4.2 ESP Encapsulation within UDP packet . . . . . . . . . . . . 13 89 5 BGP Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . 15 90 5.1 The Base (Minimal Set) DIM Sub-TLV . . . . . . . . . . . . . 15 91 5.2 Key Exchange Sub-TLV . . . . . . . . . . . . . . . . . . . . 16 92 5.3 ESP SA Proposals Sub-TLV . . . . . . . . . . . . . . . . . . 17 93 5.3.1 Transform Substructure . . . . . . . . . . . . . . . . . 17 94 6 Applicability to other VPN types . . . . . . . . . . . . . . . 18 95 7 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 19 96 8 Security Considerations . . . . . . . . . . . . . . . . . . . . 19 97 9 IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 19 98 10 References . . . . . . . . . . . . . . . . . . . . . . . . . . 19 99 10.1 Normative References . . . . . . . . . . . . . . . . . . . 19 100 10.2 Informative References . . . . . . . . . . . . . . . . . . 20 101 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 21 103 Terminology 105 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 106 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 107 "OPTIONAL" in this document are to be interpreted as described in BCP 108 14 [RFC2119] [RFC8174] when, and only when, they appear in all 109 capitals, as shown here. 111 AC: Attachment Circuit. 113 ARP: Address Resolution Protocol. 115 BD: Broadcast Domain. As per [RFC7432], an EVI consists of a single 116 or multiple BDs. In case of VLAN-bundle and VLAN-based service models 117 (see [RFC7432]), a BD is equivalent to an EVI. In case of VLAN-aware 118 bundle service model, an EVI contains multiple BDs. Also, in this 119 document, BD and subnet are equivalent terms. 
121 BD Route Target: refers to the Broadcast Domain assigned Route Target 122 [RFC4364]. In case of VLAN-aware bundle service model, all the BD 123 instances in the MAC-VRF share the same Route Target. 125 BT: Bridge Table. The instantiation of a BD in a MAC-VRF, as per 126 [RFC7432]. 128 DGW: Data Center Gateway. 130 Ethernet A-D route: Ethernet Auto-Discovery (A-D) route, as per 131 [RFC7432]. 133 Ethernet NVO tunnel: refers to Network Virtualization Overlay tunnels 134 with Ethernet payload. Examples of this type of tunnels are VXLAN or 135 GENEVE. 137 EVI: EVPN Instance spanning the NVE/PE devices that are participating 138 on that EVPN, as per [RFC7432]. 140 EVPN: Ethernet Virtual Private Networks, as per [RFC7432]. 142 GRE: Generic Routing Encapsulation. 144 GW IP: Gateway IP Address. 146 IPL: IP Prefix Length. 148 IP NVO tunnel: it refers to Network Virtualization Overlay tunnels 149 with IP payload (no MAC header in the payload). 151 IP-VRF: A VPN Routing and Forwarding table for IP routes on an 152 NVE/PE. The IP routes could be populated by EVPN and IP-VPN address 153 families. An IP-VRF is also an instantiation of a layer 3 VPN in an 154 NVE/PE. 156 IRB: Integrated Routing and Bridging interface. It connects an IP-VRF 157 to a BD (or subnet). 159 MAC-VRF: A Virtual Routing and Forwarding table for Media Access 160 Control (MAC) addresses on an NVE/PE, as per [RFC7432]. A MAC-VRF is 161 also an instantiation of an EVI in an NVE/PE. 163 ML: MAC address length. 165 ND: Neighbor Discovery Protocol. 167 NVE: Network Virtualization Edge. 169 GENEVE: Generic Network Virtualization Encapsulation, [GENEVE]. 171 NVO: Network Virtualization Overlays. 173 RT-2: EVPN route type 2, i.e., MAC/IP advertisement route, as defined 174 in [RFC7432]. 176 RT-5: EVPN route type 5, i.e., IP Prefix route. As defined in Section 177 3 of [EVPN-PREFIX]. 179 SBD: Supplementary Broadcast Domain. 
A BD that does not have any ACs, 180 only IRB interfaces, and it is used to provide connectivity among all 181 the IP-VRFs of the tenant. The SBD is only required in IP-VRF- to-IP- 182 VRF use-cases (see Section 4.4.). 184 SN: Subnet. 186 TS: Tenant System. 188 VA: Virtual Appliance. 190 VNI: Virtual Network Identifier. As in [RFC8365], the term is used as 191 a representation of a 24-bit NVO instance identifier, with the 192 understanding that VNI will refer to a VXLAN Network Identifier in 193 VXLAN, or Virtual Network Identifier in GENEVE, etc. unless it is 194 stated otherwise. 196 VTEP: VXLAN Termination End Point, as in [RFC7348]. 198 VXLAN: Virtual Extensible LAN, as in [RFC7348]. 200 This document also assumes familiarity with the terminology of 201 [RFC7432], [RFC8365] and [RFC7365]. 203 1 Introduction 205 The applications of EVPN-based solutions have become pervasive in 206 Data Center, Service Provider, and Enterprise segments. It is being 207 used for fabric overlays and inter-site connectivity in the Data 208 Center market segment, for Layer-2, Layer-3, and IRB VPN services in 209 the Service Provider market segment, and for fabric overlay and WAN 210 connectivity in the Enterprise networks. For Data Center and 211 Enterprise applications, there is a need to provide inter-site and 212 WAN connectivity over public Internet in a secured manner with the 213 same level of privacy, integrity, and authentication for tenant's 214 traffic as used in IPsec tunneling using IKEv2. This document 215 presents a solution where BGP point-to-multipoint signaling is 216 leveraged for key and policy exchange among PE devices to create 217 private pair-wise IPsec Security Associations without IKEv2 point-to- 218 point signaling or any other direct peer-to-peer session 219 establishment messages. 
221 EVPN uses BGP as its control-plane protocol for distribution of 222 information needed for discovery of PEs participating in a VPN, 223 discovery of PEs participating in a redundancy group, customer MAC 224 addresses and IP prefixes/addresses, aliasing information, tunnel 225 encapsulation types, multicast tunnel types, multicast group 226 memberships, and other information. The advantages of using a BGP control 227 plane in EVPN are well understood, including the following: 229 1) A full mesh of BGP sessions among PE devices can be avoided by 230 using a Route Reflector (RR), where a PE only needs to set up a single 231 BGP session between itself and the RR as opposed to setting up N BGP 232 sessions to N other remote PEs, thereby reducing the number of BGP 233 sessions from O(N^2) to O(N) in the network. Furthermore, RR 234 hierarchy can be leveraged to scale the number of BGP routes on the 235 RR. 237 2) MP-BGP route filtering and constrained route distribution can be 238 leveraged to ensure that the control-plane traffic for a given VPN is 239 only distributed to the PEs participating in that VPN. 241 For setting up a point-to-point security association (i.e., IPsec 242 tunnel) between a pair of EVPN PEs, it is important to leverage the BGP 243 point-to-multipoint signaling architecture using the RR along with its 244 route filtering and constraint mechanisms to achieve the performance 245 and the scale needed for a large number of security associations (IPsec 246 tunnels) along with their frequent re-keying requirements. Using BGP 247 signaling along with the RR (instead of a peer-to-peer protocol such as 248 IKEv2) reduces the number of message exchanges needed for SA 249 establishment and maintenance from O(N^2) to O(N) in the network. 251 2 Requirements 253 The requirements for secured EVPN are captured in the following 254 subsections.
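As a rough illustration of the O(N^2) versus O(N) scaling argument that motivates these requirements, the sketch below counts control sessions for a full mesh and for a single Route Reflector. The function names are hypothetical and not part of this document. The counts cover sessions only; the number of messages per IKEv2 exchange or per BGP update is not modeled.

```python
def full_mesh_sessions(n: int) -> int:
    """Pairwise sessions when each of n PEs peers directly with every other PE."""
    return n * (n - 1) // 2


def route_reflector_sessions(n: int) -> int:
    """Sessions when each of n PEs peers only with one Route Reflector."""
    return n


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, full_mesh_sessions(n), route_reflector_sessions(n))
```

The same pairwise count applies to peer-to-peer key negotiation: the number of PE pairs grows quadratically, while the number of advertisements each PE originates toward the RR does not depend on the number of peers.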
256 2.1 Tenant's Layer-2 and Layer-3 data & control traffic 258 Tenant's layer-2 and layer-3 data and control traffic must be 259 protected by IPsec cryptographic methods. This implies not only 260 tenant's data traffic must be protected by IPsec but also tenant's 261 control and routing information that are advertised in BGP must also 262 be protected by IPsec. This in turn implies that BGP session must be 263 protected by IPsec. 265 2.2 Tenant's Unicast & Multicast Data Protection 267 Tenant's layer-2 and layer-3 unicast traffic must be protected by 268 IPsec. In addition to that, tenant's layer-2 broadcast, unknown 269 unicast, and multicast traffic as well as tenant's layer-3 multicast 270 traffic must be protected by IPsec when ingress replication or 271 assisted replication are used. The use of BGP P2MP signaling for 272 setting up P2MP SAs in P2MP multicast tunnels is for future study. 274 2.3 P2MP Signaling for SA setup and Maintenance 276 BGP P2MP signaling must be used for IPsec SAs setup and maintenance. 277 The BGP signaling must follow P2MP signaling framework per 278 [CONTROLLER-IKE] for IPsec SAs setup and maintenance in order to 279 reduce the number of message exchanges from O(N^2) to O(N) among the 280 participant PE devices. 282 2.4 Granularity of Security Association Tunnels 284 The solution must support the setup and maintenance of IPsec SAs at 285 the following level of granularities: 287 1) Per PE: A single IPsec tunnel between a pair of PEs to be used for 288 all tenants' traffic supported by the pair of PEs. 290 2) Per tenant: A single IPsec tunnel per tenant per pair of PEs. For 291 example, if there are 1000 tenants supported on a pair of PEs, then 292 1000 IPsec tunnels are required between that pair of PEs. 294 3) Per subnet: A single IPsec tunnel per subnet (e.g., per VLAN/EVI) 295 of a tenant on a pair of PEs. 297 4) Per IP address: A single IPsec tunnel per pair of IP addresses of 298 a tenant on a pair of PEs. 
300 5) Per MAC address: A single IPsec tunnel per pair of MAC addresses 301 of a tenant on a pair of PEs. 303 6) Per Attachment Circuit: A single IPsec tunnel per pair of 304 Attachment Circuits between a pair of PEs. 306 2.5 Support for Policy and DH-Group List 308 The solution must support a single policy and DH group for all SAs as 309 well as multiple policies and DH groups among the SAs. 311 3 Solution Description 313 This solution uses BGP P2MP signaling where an originating PE only 314 sends a message to the Route Reflector (RR) and then the RR reflects 315 that message to the interested recipient PEs. The framework for such 316 signaling is described in [CONTROLLER-IKE] and it is referred to as 317 the device-to-controller trust model. This trust model is significantly 318 different from the traditional peer-to-peer trust model where a P2P 319 signaling protocol such as IKEv2 [RFC7296] is used in which the PE 320 devices directly authenticate each other and agree upon security 321 policy and keying material to protect communications between 322 themselves. The device-to-controller trust model leverages P2MP 323 signaling via the controller (e.g., the RR) to achieve much better 324 scale and performance for establishment and maintenance of a large 325 number of pair-wise Security Associations (SAs) among the PEs. 327 This device-to-controller trust model first secures the control 328 channel between each device and the controller using a peer-to-peer 329 protocol such as IKEv2 [RFC7296] to establish P2P SAs between each PE 330 and the RR. It then uses this secured control channel for P2MP 331 signaling in establishment of P2P SAs between each pair of PE 332 devices. 334 Each PE advertises to other PEs via the RR the information needed in 335 establishment of pair-wise SAs between itself and every other remote 336 PE. These pieces of information are sent as Sub-TLVs of the IPsec tunnel 337 type in the BGP Tunnel Encapsulation attribute.
These Sub-TLVs are 338 detailed in section 5 and are based on the DIM message components 339 from [CONTROLLER-IKE] and the IKEv2 specification [RFC7296]. The 340 IPsec tunnel TLVs along with its Sub-TLVs are sent along with the BGP 341 route (NLRI) for a given level of granularity. 343 If only a single SA is required per pair of PE devices to multiplex 344 user traffic for all tenants, then IPsec tunnel TLV is advertised 345 along with IPv4 or IPv6 NLRI representing loopback address of the 346 originating PE. It should be noted that this is not a VPN route but 347 rather an IPv4 or IPv6 route. 349 If a SA is required per tenant between a pair of PE devices, then 350 IPsec tunnel TLV can be advertised along with EVPN IMET route 351 representing the tenant or can be advertised along with a new EVPN 352 route representing the tenant. 354 If a SA is required per tenant's subnet (e.g., per VLAN) between a 355 pair of PE devices, then IPsec tunnel TLV is advertised along with 356 EVPN IMET route. 358 If a SA is required between a pair of tenant's devices represented by 359 a pair of IP addresses, then IPsec tunnel TLV is advertised along 360 with EVPN IP Prefix Advertisement Route or EVPN MAC/IP Advertisement 361 route. 363 If a SA is required between a pair of tenant's devices represented by 364 a pair of MAC addresses, then IPsec tunnel TLV is advertised along 365 with EVPN MAC/IP Advertisement route. 367 If a SA is required between a pair of Attachment Circuits (ACs) on 368 two PE devices (where an AC can be represented by ), then 369 IPsec tunnel TLV is advertised along with EVPN Ethernet AD route. 371 3.1 Inheritance of Security Policies 373 Operationally, it is easy to configure a security association between 374 a pair of PEs using BGP signaling. This is the default security 375 association that is used for traffic that flows between peers. 
376 However, in the event a finer granularity of security association 377 is desired on the traffic flows, it is possible to set up SAs between 378 a pair of tenants, a pair of subnets within a tenant, a pair of IPs 379 within a subnet, and a pair of MACs within a subnet using the 380 appropriate EVPN routes as described above. In the event that there are 381 no security TLVs associated with an EVPN route, there is a strict 382 order in the manner security associations are inherited for such a 383 route. This results in an EVPN route inheriting the security 384 associations of the parent in a hierarchical fashion. For example, 385 traffic between an IP pair is protected using security TLVs announced 386 along with the EVPN IP Prefix Advertisement Route or EVPN MAC/IP 387 Advertisement route as a first choice. If such TLVs are missing from 388 the associated route, then one checks to see if the subnets the IPs 389 are associated with have security TLVs with the EVPN IMET route. If 390 they are present, those associations are used in securing the 391 traffic. In their absence, the peer security associations are 392 used. The order in which security associations are inherited is from 393 the most granular to the coarsest, namely, IP/MAC associated TLVs with the 394 EVPN route being the first preference, followed by the subnet, the tenant, 395 and the peer associations, in that order. 397 It should be noted that when a security association is made it is 398 possible for it to be re-used by a large number of traffic flows. For 399 example, a tenant security association may be associated with a 400 number of child subnet routes. Clearly it is mandatory to keep a 401 tenant security association alive if there are one or more subnet 402 routes that want to use that association. Logically, the security 403 associations between a pair of entities create a single secure 404 tunnel.
It is thus possible to classify the incoming traffic in the 405 most granular sense {IP/MAC, subnet, tenant, peer} to a particular 406 secure tunnel that falls within its route hierarchy. The policy that 407 is applied to such traffic is independent from its use of an existing 408 or a new secure tunnel. It is clear that since any number of 409 classified traffic flows can use a security association, such a 410 security association will not be torn down, if at least there is one 411 policy using such a secure tunnel. 413 3.2 Distribution of Public Keys and Policies 415 One of the requirements for this solution is to support a single DH 416 group and a single policy for all SAs as well as to support multiple 417 DH groups and policies among the SAs. The following subsections 418 describe what pieces of information (what Sub-TLVs) are needed to be 419 exchanged to support a single DH group and a single policy versus 420 multiple DH groups and multiple policies. 422 3.2.1 Minimal DIM 424 For SA establishment, at the minimum, a PE needs to advertise to 425 other PEs, its DIM values as specified in [CONTROLLER-IKE]. These 426 include: 428 ID Tunnel ID 429 N Nonce 430 RC Rekey Counter 431 I Indication of initial policy distribution 432 KE DH public value. 434 When this minimal set of DIM values is sent, then it is assumed that 435 all peer PEs share the same policy for which DH group to use, as well 436 as which IPSec SA policy to employ. Section 5.1 defines the Minimal 437 DIM sub-TLV as part of IPsec tunnel TLV in BGP Tunnel Encapsulation 438 Attribute. 440 3.2.2 Multiple Policies 441 There can be scenarios for which there is a need to have multiple 442 policy options. This can happen when there is a need for policy 443 change and smooth migration among all PE devices to the new policy is 444 required. It can also happen if different PE devices have different 445 capabilities within the network. 
In these scenarios, PE devices need 446 to be able to choose the correct policy to use for each other. This 447 multi-policy scheme is described in section 6 of [CONTROLLER-IKE]. In 448 order to support this multi-policy feature, a PE device MUST 449 distribute a policy list. This list consists of multiple distinct 450 policies in order of preference, where the first policy is the most 451 preferred one. The receiving PE selects the policy by taking the 452 received list (starting with the first policy) and comparing that 453 against its own list and choosing the first one found in common. If 454 there is no match, this indicates a configuration error and the PEs 455 MUST NOT establish new SAs until a message is received that does 456 produce a match. 458 3.2.2.1 Multiple DH-groups 460 It can be the case that not all peers use the same DH group. When 461 multiple DH groups are supported, the peer may include multiple KE 462 Sub-TLVs. The order of the KE Sub-TLVs determines the preference. 463 The preference and selection methods are specified in Section 6 of 464 [CONTROLLER-IKE]. 466 3.2.2.2 Multiple or Single ESP SA policies 468 In order to specify an ESP SA Policy, a DIM may include one or more 469 SA Sub-TLVs. When all peers are configured by a controller with the 470 same ESP SA policy, they MAY leave the SA out of the DIM. This 471 minimizes messaging when group configuration is static and known. 472 However, it may also be desirable to include the SA. If a single SA 473 is included, the peer is indicating what ESP SA policy it uses, but 474 is not willing to negotiate. If multiple SA Sub-TLVs are included, 475 the peer is indicating that it is willing to negotiate. The order of 476 the SA Sub-TLVs determines the preference. The preference and 477 selection methods are specified in Section 6 of [CONTROLLER-IKE]. 479 3.3 Initial IPsec SAs Generation 481 The procedure for generation of initial IPsec SAs is described in 482 section 3 of [CONTROLLER-IKE]. 
This section gives a summary of it in 483 the context of BGP signaling. When a PE device first comes up and wants 484 to set up an IPsec SA between itself and each of the interested remote 485 PEs, it generates a DH pair along for each [what word here? 486 "tennant"?] using an algorithm defined in the IKEv2 Diffie-Hellman 487 Group Transform IDs [IKEv2-IANA]. The originating PE distributes the 488 DH public value along with the other values in the DIM (using the IPsec 489 Tunnel TLV in the Tunnel Encapsulation Attribute) to other remote PEs via 490 the RR. Each receiving PE uses this DH public value and the 491 corresponding nonce in creation of an IPsec SA pair to the originating 492 PE - i.e., an outbound SA and an inbound SA. The detailed procedures 493 are described in section 5.2 of [CONTROLLER-IKE]. 495 3.4 Re-Keying 497 A PE can initiate re-keying at any time due to a local time- or volume- 498 based policy or because a cipher counter is nearing its final 499 value. The rekey process is performed individually for each remote 500 PE. If rekeying is performed with multiple PEs simultaneously, then 501 the decision process and rules described in this rekey are performed 502 independently for each PE. Section 4 of [CONTROLLER-IKE] describes 503 this rekeying process in detail and gives examples for a single 504 IPsec device (e.g., a single PE) rekeying versus multiple PE devices 505 rekeying simultaneously. 507 3.5 IPsec Databases 509 The Peer Authorization Database (PAD), the Security Policy Database 510 (SPD), and the Security Association Database (SAD) all need to be 511 set up as defined in the IPsec Security Architecture [RFC4301].
512 Section 5 of [CONTROLLER-IKE] gives a summary description of how 513 these databases are set up for the controller-based model where keys are 514 exchanged via P2MP signaling via the controller (i.e., the RR) and 515 the policy can be either signaled via the RR (in case of multiple 516 policies) or configured by the management station (in case of single 517 policy). 519 4 Encapsulation 521 The vast majority of encapsulations for Network Virtualization Overlay 522 (NVO) networks in deployment are based on UDP/IP with the UDP destination 523 port ID indicating the type of NVO encapsulation (e.g., VxLAN, GPE, 524 GENEVE, GUE) and the UDP source port ID representing flow entropy for 525 load-balancing of the traffic within the fabric based on an n-tuple that 526 includes the UDP header. When encrypting NVO encapsulated packets using 527 IP Encapsulating Security Payload (ESP), the following two options 528 can be used: a) adding a UDP header before the ESP header (e.g., UDP 529 header in clear) and b) no UDP header before the ESP header (e.g., 530 standard ESP encapsulation). The following subsections describe these 531 encapsulations in further detail. 533 4.1 Standard ESP Encapsulation 535 When standard IP Encapsulating Security Payload (ESP) is used 536 (without an outer UDP header) for encryption of NVO packets, it is used 537 in transport mode as depicted below. When such encapsulation is used, 538 for BGP signaling, the Tunnel Type of the Tunnel Encapsulation TLV is set 539 to ESP-Transport and the Tunnel Type of the Encapsulation Extended 540 Community is set to the NVO encapsulation type (e.g., VxLAN, GENEVE, GPE, 541 etc.). This implies that the customer packets are first encapsulated 542 using the NVO encapsulation type and are then further encapsulated and 543 encrypted using ESP in transport mode.
545   +-+-+-+-+-+-+-+-+-+-+-+-+   +-+-+-+-+-+-+-+-+-+-+-+-+
546   |      MAC Header       |   |      MAC Header       |
547   +-+-+-+-+-+-+-+-+-+-+-+-+   +-+-+-+-+-+-+-+-+-+-+-+-+
548   | Eth Type = IPv4/IPv6  |   | Eth Type = IPv4/IPv6  |
549   +-+-+-+-+-+-+-+-+-+-+-+-+   +-+-+-+-+-+-+-+-+-+-+-+-+
550   |       IP Header       |   |       IP Header       |
551   |    Protocol = UDP     |   |    Protocol = ESP     |
552   +-----------------------+   +-----------------------+
553   |      UDP Header       |   |      ESP Header       |
554   |   Dest Port = VxLAN   |   +-----------------------+
555   +-----------------------+   |      UDP Header       |
556   |     VxLAN Header      |   |   Dest Port = VxLAN   |
557   +-----------------------+   +-----------------------+
558   |   Inner MAC Header    |   |     VxLAN Header      |
559   +-----------------------+   +-----------------------+
560   |   Inner Eth Payload   |   |   Inner MAC Header    |
561   +-----------------------+   +-----------------------+
562   |          CRC          |   |   Inner Eth Payload   |
563   +-----------------------+   +-----------------------+
564                               | ESP Trailer (NP=UDP)  |
565                               +-----------------------+
566                               |          CRC          |
567                               +-----------------------+
569 Figure 3: VxLAN Encapsulation within ESP
571 4.2 ESP Encapsulation within UDP packet
573 In scenarios where NAT traversal is required ([RFC3948]) or where 574 load balancing using the UDP header is required, ESP encapsulation 575 within a UDP packet, as depicted in the following figure, is used. The 576 ESP for NVO applications is in transport mode. The outer UDP header 577 (before the ESP header) has its source port set to flow entropy and 578 its destination port set to 4500 (indicating that an ESP header follows). A 579 non-zero SPI value in the ESP header implies that this is a data packet 580 (i.e., it is not an IKE packet). The Next Protocol field in the ESP 581 trailer indicates that what follows the ESP header is a UDP header. This 582 inner UDP header has a destination port ID that identifies the NVO 583 encapsulation type (e.g., VxLAN). Optimization of this packet format 584 where only a single UDP header is used (only the outer UDP header) is 585 for future study.
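The port and SPI rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not a normative procedure: the function names and the mapping of flow entropy onto the dynamic port range are hypothetical, and the UDP checksum is simply left at zero. The four-zero-byte Non-ESP Marker and the single 0xFF keepalive octet are taken from [RFC3948]. Port 4789 is the IANA-assigned VXLAN port.

```python
import struct

NAT_T_PORT = 4500   # UDP destination port for UDP-encapsulated ESP (RFC 3948)
VXLAN_PORT = 4789   # IANA-assigned VXLAN destination port (RFC 7348)
IPPROTO_UDP = 17    # ESP trailer Next Protocol value when a UDP header follows


def outer_udp_header(flow_entropy: int, payload_len: int) -> bytes:
    """Build the clear-text outer UDP header that precedes the ESP header."""
    # Hypothetical mapping: fold the entropy value into the dynamic port range.
    src_port = 49152 + (flow_entropy % 16384)
    length = 8 + payload_len          # 8-byte UDP header plus payload
    checksum = 0                      # left at zero in this sketch
    return struct.pack("!HHHH", src_port, NAT_T_PORT, length, checksum)


def classify_port_4500_payload(udp_payload: bytes) -> str:
    """Classify what follows a UDP header whose destination port is 4500."""
    if udp_payload == b"\xff":
        return "nat-keepalive"        # single 0xFF octet
    if len(udp_payload) < 4:
        return "invalid"
    (spi,) = struct.unpack("!I", udp_payload[:4])
    # Four zero bytes are the Non-ESP Marker (IKE); a non-zero SPI is ESP data.
    return "ike" if spi == 0 else "esp-data"


def is_vxlan_inside_esp(next_protocol: int, inner_dst_port: int) -> bool:
    """After decryption: the ESP trailer says UDP and the inner port says VxLAN."""
    return next_protocol == IPPROTO_UDP and inner_dst_port == VXLAN_PORT
```

A receiver would apply classify_port_4500_payload to the outer UDP payload first, and only after ESP processing consult the Next Protocol field and the inner destination port.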
587 When such encapsulation is used, for BGP signaling, the Tunnel Type 588 of the Tunnel Encapsulation TLV is set to ESP-in-UDP-Transport and the 589 Tunnel Type of the Encapsulation Extended Community is set to the NVO 590 encapsulation type (e.g., VxLAN, GENEVE, GPE, etc.). This implies 591 that the customer packets are first encapsulated using the NVO 592 encapsulation type and are then further encapsulated and encrypted 593 using ESP-in-UDP in transport mode.
595                                       [RFC3948]
596   +-+-+-+-+-+-+-+-+-+-+-+-+   +-+-+-+-+-+-+-+-+-+-+-+-+
597   |      MAC Header       |   |      MAC Header       |
598   +-+-+-+-+-+-+-+-+-+-+-+-+   +-+-+-+-+-+-+-+-+-+-+-+-+
599   | Eth Type = IPv4/IPv6  |   | Eth Type = IPv4/IPv6  |
600   +-+-+-+-+-+-+-+-+-+-+-+-+   +-+-+-+-+-+-+-+-+-+-+-+-+
601   |       IP Header       |   |       IP Header       |
602   |    Protocol = UDP     |   |    Protocol = UDP     |
603   +-----------------------+   +-----------------------+
604   |      UDP Header       |   |      UDP Header       |
605   |   Dest Port = VxLAN   |   | Dest Port = 4500(ESP) |
606   +-----------------------+   +-----------------------+
607   |     VxLAN Header      |   |      ESP Header       |
608   +-----------------------+   +-----------------------+
609   |   Inner MAC Header    |   |      UDP Header       |
610   +-----------------------+   |   Dest Port = VxLAN   |
611   |   Inner Eth Payload   |   +-----------------------+
612   +-----------------------+   |     VxLAN Header      |
613   |          CRC          |   +-----------------------+
614   +-----------------------+   |   Inner MAC Header    |
615                               +-----------------------+
616                               |   Inner Eth Payload   |
617                               +-----------------------+
618                               | ESP Trailer (NP=UDP)  |
619                               +-----------------------+
620                               |          CRC          |
621                               +-----------------------+
622 Figure 4: VxLAN Encapsulation within ESP Within UDP
624 5 BGP Encoding
626 This document defines two new Tunnel Types along with their associated 627 sub-TLVs for the Tunnel Encapsulation Attribute [TUNNEL-ENCAP]. These 628 tunnel types correspond to ESP-Transport and ESP-in-UDP-Transport as 629 described in section 4. The following sub-TLVs apply to both tunnel 630 types unless stated otherwise.

632  5.1 The Base (Minimal Set) DIM Sub-TLV

634     The Base DIM is described in 3.2.1. One and only one Base DIM may be
635     sent in the IPsec Tunnel TLV.

637      0                   1                   2                   3
638      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
639     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
640     |           ID Length           | Nonce Length  |I|    Flags    |
641     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
642     |                             Rekey                             |
643     |                            Counter                            |
644     +---------------------------------------------------------------+
645     |                                                               |
646     ~ Originator ID + (Tenant ID) + (Subnet ID) + (Tenant Address)  ~
647     |                                                               |
648     +---------------------------------------------------------------+
649     |                                                               |
650     ~                          Nonce Data                           ~
651     |                                                               |
652     +---------------------------------------------------------------+

654                    Figure 5: The Base DIM Sub-TLV

656     ID Length (16 bits) is the length of the Originator ID + (Tenant ID)
657     + (Subnet ID) + (Tenant Address) in bytes.

659     Nonce Length (8 bits) is the length of the Nonce Data in bytes.

661     I (1 bit) is the initial contact flag from [CONTROLLER-IKE].

663     Flags (7 bits) are reserved and MUST be set to zero on transmit and
664     ignored on receipt.

666     The Rekey Counter is a 64-bit rekey counter as specified in
667     [CONTROLLER-IKE].

669     The Originator ID + (Tenant ID) + (Subnet ID) + (Tenant Address) is
670     the tunnel identifier and uniquely identifies the tunnel. Depending
671     on the granularity of the tunnel, the fields in parentheses may be
672     omitted; e.g., for a tunnel at the PE level of granularity, only the
673     Originator ID is required.

675     The Nonce Data is the nonce described in [CONTROLLER-IKE]. Its
676     length is a multiple of 32 bits. Nonce lengths should be chosen to
677     meet the minimum requirements described in IKEv2 [RFC7296].

679  5.2 Key Exchange Sub-TLV

681     The KE Sub-TLV is described in 3.2.1 and 3.2.2.1. A KE is always
682     required. One or more KE Sub-TLVs may be included in the IPsec
683     Tunnel TLV.

685      0                   1                   2                   3
686      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
687     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
688     |   Diffie-Hellman Group Num    |           Reserved            |
689     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
690     |                                                               |
691     ~                       Key Exchange Data                       ~
692     |                                                               |
693     +---------------------------------------------------------------+

695                    Figure 6: Key Exchange Sub-TLV

697     Diffie-Hellman Group Num (16 bits) identifies the Diffie-Hellman
698     group in which the Key Exchange Data was computed. Diffie-Hellman group
699     numbers are discussed in IKEv2 [RFC7296] Appendix B and [RFC5114].

701     The Key Exchange payload is constructed by copying one's Diffie-
702     Hellman public value into the "Key Exchange Data" portion of the
703     payload. The length of the Diffie-Hellman public value is described
704     for MODP groups in [RFC7296] and for ECP groups in [RFC4753].

706  5.3 ESP SA Proposals Sub-TLV

708     The SA Sub-TLV is described in 3.2.2.2. Zero or more SA Sub-TLVs may
709     be included in the IPsec Tunnel TLV.

711      0                   1                   2                   3
712      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
713     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
714     |Num Transforms |                   Reserved                    |
715     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
716     |                                                               |
717     ~                          Transforms                           ~
718     |                                                               |
719     +---------------------------------------------------------------+

721                    Figure 8: ESP SA Proposals Sub-TLV

723     Num Transforms is the number of transforms included.

725     Reserved is not used and MUST be set to zero on transmit and MUST be
726     ignored on receipt.

728  5.3.1 Transform Substructure
729      0                   1                   2                   3
730      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
731     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
732     |     Transform Attr Length     |Transform Type |   Reserved    |
733     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
734     |         Transform ID          |           Reserved            |
735     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
736     |                                                               |
737     ~                     Transform Attributes                      ~
738     |                                                               |
739     +---------------------------------------------------------------+

741                    Figure 9: Transform Substructure Sub-TLV

743     The Transform Attr Length is the length of the Transform Attributes
744     field.

746     The Transform Type is from Section 3.3.2 of [RFC7296] and
747     [IKEV2IANA]. Only the values ENCR, INTEG, and ESN are allowed.

749     The Transform ID specifies the transform identification value from
750     [IKEV2IANA].

752     Reserved is unused and MUST be set to zero on transmit and MUST be
753     ignored on receipt.

755     The Transform Attributes are taken directly from Section 3.3.5 of [RFC7296].

757  6 Applicability to other VPN types

759     Although P2MP BGP signaling for establishment and maintenance of SAs
760     among PE devices is described in this document in the context of EVPN,
761     there is no reason why it cannot be extended to other VPN
762     technologies such as IP-VPN [RFC4364], VPLS [RFC4761] and [RFC4762],
763     and MVPN [RFC6513] and [RFC6514] with ingress replication. EVPN
764     has been chosen because of its pervasiveness in DC, SP, and
765     Enterprise applications and because of its ability to support SA
766     establishment at different granularity levels such as: per PE, per
767     tenant, per subnet, per Ethernet Segment, per IP address, and per
768     MAC. For other VPN technology types, fewer granularity
769     levels can be supported. For example, for VPLS, only the granularity
770     of per PE and per subnet can be supported. For the per-PE granularity
771     level, the mechanism is the same among all the VPN technologies, as
772     the IPsec tunnel type (and its associated TLV and sub-TLVs) is sent
773     along with the PE's loopback IPv4 (or IPv6) address. For VPLS, if
774     per-subnet (per bridge domain) granularity level needs to be
775     supported, then the IPsec tunnel type and TLV are sent along with
776     the VPLS AD route.

778     The following table lists what level of granularity can be supported
779     by a given VPN technology and with what BGP route.

781     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
782     | Functionality |    EVPN     |   IP-VPN    |   MVPN    |  VPLS   |
783     +---------------+-------------+-------------+-----------+---------+
784     |    per PE     |IPv4/v6 route|IPv4/v6 route|IPv4/v6 rte| IPv4/v6 |
785     +---------------+-------------+-------------+-----------+---------+
786     |  per tenant   |IMET (or new)|lpbk (or new)|  I-PMSI   |   N/A   |
787     +---------------+-------------+-------------+-----------+---------+
788     |  per subnet   |    IMET     |     N/A     |    N/A    | VPLS AD |
789     +---------------+-------------+-------------+-----------+---------+
790     |    per IP     |EVPN RT2/RT5 |  VPN IP rt  | *,G or S,G|   N/A   |
791     +---------------+-------------+-------------+-----------+---------+
792     |    per MAC    |  EVPN RT2   |     N/A     |    N/A    |   N/A   |
793     +---------------+-------------+-------------+-----------+---------+

795  7 Acknowledgements

797  8 Security Considerations

799  9 IANA Considerations

801     A new transitive extended community Type of 0x06 and Sub-Type of TBD
802     for EVPN Attachment Circuit Extended Community needs to be allocated
803     by IANA.

805  10 References

807  10.1 Normative References

809     [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
810         Requirement Levels", BCP 14, RFC 2119, March 1997.

812     [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119
813         Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May
814         2017.

816     [RFC7432] Sajassi et al., "BGP MPLS-Based Ethernet VPN", RFC 7432,
817         February 2015.

819     [RFC8365] Sajassi et al., "A Network Virtualization Overlay Solution
820         Using Ethernet VPN (EVPN)", RFC 8365, March 2018.

822     [TUNNEL-ENCAP] Rosen et al., "The BGP Tunnel Encapsulation
823         Attribute", draft-ietf-idr-tunnel-encaps-03, November
824         2016.

826     [CONTROLLER-IKE] Carrel et al., "IPsec Key Exchange using a
827         Controller", draft-carrel-ipsecme-controller-ike-00, July
828         2018.

830     [IKEV2IANA] IANA, "Internet Key Exchange Version 2 (IKEv2)
831         Parameters".

834     [RFC3948] Huttunen et al., "UDP Encapsulation of IPsec ESP Packets",
835         RFC 3948, January 2005.

837     [IKEV2-IANA] IANA, "Internet Key Exchange Version 2 (IKEv2)
838         Parameters", February 2016,
839         www.iana.org/assignments/ikev2-parameters/ikev2-
840         parameters.xhtml.

842     [RFC4301] Kent, S. and K. Seo, "Security Architecture for the
843         Internet Protocol", RFC 4301, DOI 10.17487/RFC4301,
844         December 2005.

846  10.2 Informative References

848     [RFC4364] Rosen, E., et al., "BGP/MPLS IP Virtual Private Networks
849         (VPNs)", RFC 4364, February 2006.

851     [RFC4761] Kompella, K., et al., "Virtual Private LAN Service (VPLS)
852         Using BGP for Auto-Discovery and Signaling", RFC 4761, January 2007.

854     [RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN Service (VPLS)
855         Using Label Distribution Protocol (LDP) Signaling", RFC 4762, January
856         2007.

858     [RFC6513] Rosen, E., et al., "Multicast in MPLS/BGP IP VPNs", RFC
859         6513, February 2012.

861     [RFC6514] Rosen, E., et al., "BGP Encodings and Procedures for
862         Multicast in MPLS/BGP IP VPNs", RFC 6514, February 2012.

864     [RFC7606] Chen, E., Scudder, J., Mohapatra, P., and K. Patel,
865         "Revised Error Handling for BGP UPDATE Messages", RFC 7606, August
866         2015.

868     [802.1Q] "IEEE Standard for Local and metropolitan area networks -
869         Media Access Control (MAC) Bridges and Virtual Bridged Local Area
870         Networks", IEEE Std 802.1Q(tm), 2014 Edition, November 2014.

872     [RFC7348] Mahalingam, M., et al., "Virtual eXtensible Local Area
873         Network (VXLAN): A Framework for Overlaying Virtualized Layer 2
874         Networks over Layer 3 Networks", RFC 7348, DOI 10.17487/RFC7348,
875         August 2014.

877     [GENEVE] Gross, J., et al., "Geneve: Generic Network Virtualization
878         Encapsulation", Work in Progress, draft-ietf-nvo3-geneve-06, March
879         2018.

881  Authors' Addresses

883     Ali Sajassi
884     Cisco
885     Email: sajassi@cisco.com

887     Ayan Banerjee
888     Cisco
889     Email: ayabaner@cisco.com

891     Samir Thoria
892     Cisco
893     Email: sthoria@cisco.com

895     David Carrel
896     Cisco
897     Email: carrel@cisco.com

899     Brian Weis
900     Cisco
901     Email: bew@cisco.com