idnits 2.17.1 draft-sajassi-bess-secure-evpn-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([RFC8365], [RFC7432]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. == There are 2 instances of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (July 13, 2020) is 1382 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'EVPN-PREFIX' is mentioned on line 186, but not defined == Missing Reference: 'RFC7365' is mentioned on line 210, but not defined == Missing Reference: 'RFC7296' is mentioned on line 819, but not defined == Missing Reference: 'IKEv2-IANA' is mentioned on line 552, but not defined == Missing Reference: 'RFC5114' is mentioned on line 762, but not defined == Missing Reference: 'RFC4753' is mentioned on line 767, but not defined ** Obsolete undefined reference: RFC 4753 (Obsoleted by RFC 5903) == Unused Reference: 'IKEV2-IANA' is defined on line 901, but no explicit reference was found in the text == Unused Reference: 'RFC7606' is defined on line 928, but no explicit reference was found in the text == Outdated reference: A later version (-22) exists of draft-ietf-idr-tunnel-encaps-03 == Outdated reference: A later version (-01) exists of draft-carrel-ipsecme-controller-ike-00 -- Possible downref: Non-RFC (?) normative reference: ref. 'IKEV2IANA' -- Possible downref: Non-RFC (?) normative reference: ref. 'IKEV2-IANA' == Outdated reference: A later version (-16) exists of draft-ietf-nvo3-geneve-06 Summary: 2 errors (**), 0 flaws (~~), 13 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 BESS Workgroup A. Sajassi, Ed. 3 INTERNET-DRAFT A. Banerjee 4 Intended Status: Standards Track S. Thoria 5 D. Carrel 6 Cisco 7 B. Weis 8 Individual 9 J. 
Drake 10 Juniper 12 Expires: January 13, 2021 July 13, 2020 14 Secure EVPN 15 draft-sajassi-bess-secure-evpn-03 17 Abstract 19 The applications of EVPN-based solutions ([RFC7432] and [RFC8365]) 20 have become pervasive in Data Center, Service Provider, and 21 Enterprise segments. It is being used for fabric overlays and inter- 22 site connectivity in the Data Center market segment, for Layer-2, 23 Layer-3, and IRB VPN services in the Service Provider market segment, 24 and for fabric overlay and WAN connectivity in Enterprise networks. 25 For Data Center and Enterprise applications, there is a need to 26 provide inter-site and WAN connectivity over public Internet in a 27 secured manner with same level of privacy, integrity, and 28 authentication for tenant's traffic as IPsec tunneling using IKEv2. 29 This document presents a solution where BGP point-to-multipoint 30 signaling is leveraged for key and policy exchange among PE devices 31 to create private pair-wise IPsec Security Associations without IKEv2 32 point-to-point signaling or any other direct peer-to-peer session 33 establishment messages. 35 Status of this Memo 37 This Internet-Draft is submitted to IETF in full conformance with the 38 provisions of BCP 78 and BCP 79. 40 Internet-Drafts are working documents of the Internet Engineering 41 Task Force (IETF), its areas, and its working groups. Note that 42 other groups may also distribute working documents as 43 Internet-Drafts. 45 Internet-Drafts are draft documents valid for a maximum of six months 46 and may be updated, replaced, or obsoleted by other documents at any 47 time. It is inappropriate to use Internet-Drafts as reference 48 material or to cite them other than as "work in progress." 
50 The list of current Internet-Drafts can be accessed at 51 http://www.ietf.org/1id-abstracts.html 53 The list of Internet-Draft Shadow Directories can be accessed at 54 http://www.ietf.org/shadow.html 56 Copyright and License Notice 58 Copyright (c) 2014 IETF Trust and the persons identified as the 59 document authors. All rights reserved. 61 This document is subject to BCP 78 and the IETF Trust's Legal 62 Provisions Relating to IETF Documents 63 (http://trustee.ietf.org/license-info) in effect on the date of 64 publication of this document. Please review these documents 65 carefully, as they describe your rights and restrictions with respect 66 to this document. Code Components extracted from this document must 67 include Simplified BSD License text as described in Section 4.e of 68 the Trust Legal Provisions and are provided without warranty as 69 described in the Simplified BSD License.
71 Table of Contents
73    1  Introduction
74    2  Requirements
75      2.1  Tenant's Layer-2 and Layer-3 data & control traffic
76      2.2  Tenant's Unicast & Multicast Data Protection
77      2.3  P2MP Signaling for SA setup and Maintenance
78      2.4  Granularity of Security Association Tunnels
79      2.5  Support for Policy and DH-Group List
80    3  BGP Component
81      3.1  Zero Touch Bring-up (ZTB)
82      3.2  Configuration Management
83      3.3  Orchestration
84      3.4  Signaling
85    4  Solution Description
86      4.1  Inheritance of Security Policies
87      4.2  Distribution of Public Keys and Policies
88        4.2.1  Minimal DIM
89        4.2.2  Multiple Policies
90          4.2.2.1  Multiple DH-groups
91          4.2.2.2  Multiple or Single ESP SA policies
92      4.3  Initial IPsec SAs Generation
93      4.4  Re-Keying
94      4.5  IPsec Databases
95    5  Encapsulation
96      5.1  Standard ESP Encapsulation
97      5.2  ESP Encapsulation within UDP packet
98    6  BGP Encoding
99      6.1  The Base (Minimal Set) DIM Sub-TLV
100     6.2  Key Exchange Sub-TLV
101     6.3  ESP SA Proposals Sub-TLV
102       6.3.1  Transform Substructure
103   7  Applicability to other VPN types
104   8  Acknowledgements
105   9  Security Considerations
106   10  IANA Considerations
107   11  References
108     11.1  Normative References
109     11.2  Informative References
110   Authors' Addresses
112 Terminology 114 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 115 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 116 "OPTIONAL" in this document are to be interpreted as described in BCP 117 14 [RFC2119] [RFC8174] when, and only when, they appear in all 118 capitals, as shown here. 120 AC: Attachment Circuit. 122 ARP: Address Resolution Protocol.
124 BD: Broadcast Domain. As per [RFC7432], an EVI consists of a single 125 or multiple BDs. In case of VLAN-bundle and VLAN-based service models 126 (see [RFC7432]), a BD is equivalent to an EVI. In case of VLAN-aware 127 bundle service model, an EVI contains multiple BDs. Also, in this 128 document, BD and subnet are equivalent terms. 130 BD Route Target: refers to the Broadcast Domain assigned Route Target 131 [RFC4364]. In case of VLAN-aware bundle service model, all the BD 132 instances in the MAC-VRF share the same Route Target. 134 BT: Bridge Table. The instantiation of a BD in a MAC-VRF, as per 135 [RFC7432]. 137 DGW: Data Center Gateway. 139 Ethernet A-D route: Ethernet Auto-Discovery (A-D) route, as per 140 [RFC7432]. 142 Ethernet NVO tunnel: refers to Network Virtualization Overlay tunnels 143 with Ethernet payload. Examples of this type of tunnels are VXLAN or 144 GENEVE. 146 EVI: EVPN Instance spanning the NVE/PE devices that are participating 147 on that EVPN, as per [RFC7432]. 149 EVPN: Ethernet Virtual Private Networks, as per [RFC7432]. 151 GRE: Generic Routing Encapsulation. 153 GW IP: Gateway IP Address. 155 IPL: IP Prefix Length. 157 IP NVO tunnel: it refers to Network Virtualization Overlay tunnels 158 with IP payload (no MAC header in the payload). 160 IP-VRF: A VPN Routing and Forwarding table for IP routes on an 161 NVE/PE. The IP routes could be populated by EVPN and IP-VPN address 162 families. An IP-VRF is also an instantiation of a layer 3 VPN in an 163 NVE/PE. 165 IRB: Integrated Routing and Bridging interface. It connects an IP-VRF 166 to a BD (or subnet). 168 MAC-VRF: A Virtual Routing and Forwarding table for Media Access 169 Control (MAC) addresses on an NVE/PE, as per [RFC7432]. A MAC-VRF is 170 also an instantiation of an EVI in an NVE/PE. 172 ML: MAC address length. 174 ND: Neighbor Discovery Protocol. 176 NVE: Network Virtualization Edge. 178 GENEVE: Generic Network Virtualization Encapsulation, [GENEVE]. 
180 NVO: Network Virtualization Overlays. 182 RT-2: EVPN route type 2, i.e., MAC/IP advertisement route, as defined 183 in [RFC7432]. 185 RT-5: EVPN route type 5, i.e., IP Prefix route, as defined in Section 186 3 of [EVPN-PREFIX]. 188 SBD: Supplementary Broadcast Domain. A BD that does not have any ACs, 189 only IRB interfaces, and it is used to provide connectivity among all 190 the IP-VRFs of the tenant. The SBD is only required in IP-VRF-to-IP- 191 VRF use-cases (see Section 4.4.). 193 SN: Subnet. 195 TS: Tenant System. 197 VA: Virtual Appliance. 199 VNI: Virtual Network Identifier. As in [RFC8365], the term is used as 200 a representation of a 24-bit NVO instance identifier, with the 201 understanding that VNI will refer to a VXLAN Network Identifier in 202 VXLAN, or Virtual Network Identifier in GENEVE, etc. unless it is 203 stated otherwise. 205 VTEP: VXLAN Tunnel End Point, as in [RFC7348]. 207 VXLAN: Virtual Extensible LAN, as in [RFC7348]. 209 This document also assumes familiarity with the terminology of 210 [RFC7432], [RFC8365] and [RFC7365]. 212 1 Introduction 214 The applications of EVPN-based solutions have become pervasive in 215 Data Center, Service Provider, and Enterprise segments. EVPN is being 216 used for fabric overlays and inter-site connectivity in the Data 217 Center market segment, for Layer-2, Layer-3, and IRB VPN services in 218 the Service Provider market segment, and for fabric overlay and WAN 219 connectivity in the Enterprise networks. For Data Center and 220 Enterprise applications, there is a need to provide inter-site and 221 WAN connectivity over the public Internet in a secured manner with the 222 same level of privacy, integrity, and authentication for tenant's 223 traffic as that provided by IPsec tunneling using IKEv2.
This document 224 presents a solution where BGP point-to-multipoint signaling is 225 leveraged for key and policy exchange among PE devices to create 226 private pair-wise IPsec Security Associations without IKEv2 point-to- 227 point signaling or any other direct peer-to-peer session 228 establishment messages. 230 EVPN uses BGP as control-plane protocol for distribution of 231 information needed for discovery of PEs participating in a VPN, 232 discovery of PEs participating in a redundancy group, customer MAC 233 addresses and IP prefixes/addresses, aliasing information, tunnel 234 encapsulation types, multicast tunnel types, multicast group 235 memberships, and other information. The advantages of using BGP control 236 plane in EVPN are well understood including the following: 238 1) A full mesh of BGP sessions among PE devices can be avoided by 239 using Route Reflector (RR) where a PE only needs to set up a single 240 BGP session between itself and the RR as opposed to setting up N BGP 241 sessions to N other remote PEs, thereby reducing the number of BGP 242 sessions from O(N^2) to O(N) in the network. Furthermore, RR 243 hierarchy can be leveraged to scale the number of BGP routes on the 244 RR. 246 2) MP-BGP route filtering and constrained route distribution can be 247 leveraged to ensure that the control-plane traffic for a given VPN is 248 only distributed to the PEs participating in that VPN. 250 For setting up a point-to-point security association (i.e., IPsec 251 tunnel) between a pair of EVPN PEs, it is important to leverage BGP 252 point-to-multipoint signaling architecture using the RR along with its 253 route filtering and constraint mechanisms to achieve the performance 254 and the scale needed for a large number of security associations (IPsec 255 tunnels) along with their frequent re-keying requirements.
Using BGP 256 signaling along with the RR (instead of a peer-to-peer protocol such as 257 IKEv2) reduces the number of message exchanges needed for SA 258 establishment and maintenance from O(N^2) to O(N) in the network. 260 2 Requirements 262 The requirements for secured EVPN are captured in the following 263 subsections. 265 2.1 Tenant's Layer-2 and Layer-3 data & control traffic 267 Tenant's layer-2 and layer-3 data and control traffic must be 268 protected by IPsec cryptographic methods. This implies that not only 269 must tenant's data traffic be protected by IPsec, but tenant's 270 control and routing information that is advertised in BGP must also 271 be protected by IPsec. This in turn implies that the BGP session must be 272 protected by IPsec. 274 2.2 Tenant's Unicast & Multicast Data Protection 276 Tenant's layer-2 and layer-3 unicast traffic must be protected by 277 IPsec. In addition to that, tenant's layer-2 broadcast, unknown 278 unicast, and multicast traffic as well as tenant's layer-3 multicast 279 traffic must be protected by IPsec when ingress replication or 280 assisted replication is used. The use of BGP P2MP signaling for 281 setting up P2MP SAs in P2MP multicast tunnels is for future study. 283 2.3 P2MP Signaling for SA setup and Maintenance 285 BGP P2MP signaling must be used for IPsec SA setup and maintenance. 286 The BGP signaling must follow the P2MP signaling framework per 287 [CONTROLLER-IKE] for IPsec SA setup and maintenance in order to 288 reduce the number of message exchanges from O(N^2) to O(N) among the 289 participant PE devices. 291 2.4 Granularity of Security Association Tunnels 293 The solution must support the setup and maintenance of IPsec SAs at 294 the following levels of granularity: 296 1) Per PE: A single IPsec tunnel between a pair of PEs to be used for 297 all tenants' traffic supported by the pair of PEs. 299 2) Per tenant: A single IPsec tunnel per tenant per pair of PEs.
For 300 example, if there are 1000 tenants supported on a pair of PEs, then 301 1000 IPsec tunnels are required between that pair of PEs. 303 3) Per subnet: A single IPsec tunnel per subnet (e.g., per VLAN/EVI) 304 of a tenant on a pair of PEs. 306 4) Per IP address: A single IPsec tunnel per pair of IP addresses of 307 a tenant on a pair of PEs. 309 5) Per MAC address: A single IPsec tunnel per pair of MAC addresses 310 of a tenant on a pair of PEs. 312 6) Per Attachment Circuit: A single IPsec tunnel per pair of 313 Attachment Circuits between a pair of PEs. 315 2.5 Support for Policy and DH-Group List 317 The solution must support a single policy and DH group for all SAs as 318 well as multiple policies and DH groups among the SAs. 320 3 BGP Component 322 The architecture that encompasses the device-to-controller trust model 323 has several components, among which is the signaling component. Secure 324 EVPN Signaling, as defined in this document, is the BGP signaling 325 component of the overall Architecture. We will briefly describe this 326 Architecture here to facilitate understanding of how Secure EVPN 327 fits into the overall architecture. The Architecture describes the 328 components needed to create BGP-based SD-WANs and how these 329 components work together. Our intention is to list these components 330 here along with brief descriptions and to describe the 331 Architecture in detail in a separate document, which will specify the 332 other parts of the architecture besides the BGP 333 signaling component described in this document. 335 The Architecture consists of four components. These components are 336 Zero Touch Bring-up, Configuration Management, Orchestration, and 337 Signaling. In addition to these components, secure communications 338 must be provided between the edge nodes and all servers/devices 339 providing the architecture components.
341 3.1 Zero Touch Bring-up (ZTB) 343 The first component is a zero touch capability that allows an edge 344 device to find and join its SD-WAN with little to no assistance other 345 than power and network connectivity. The goal is to use existing 346 work in this area. The requirement is that an edge device can 347 locate the ZTB server/component of its SD-WAN controller in a secure 348 manner and proceed to receive its configuration. 350 3.2 Configuration Management 352 After an edge device joins its SD-WAN, it needs to be configured. 354 Configuration covers all device configuration, not just the 355 configuration related to Secure EVPN. The previous Zero Touch Bring- 356 up component will have directed the edge device, either directly or 357 indirectly, to its configuration server/component. One example of a 358 configuration server is the I2NSF Controller. After a device has been 359 configured, it can engage in the next two components. Configuration 360 may include updates over time and is not a one-time-only component. 362 3.3 Orchestration 364 This component is optional. It allows for more dynamic updates of 365 configuration and statistics information. Orchestration can be more 366 dynamic than configuration. 368 3.4 Signaling 370 Signaling is the component described in this document. The 371 functionality of a Route Reflector is well understood. Here we 372 describe the signaling component of BGP SD-WAN Architecture and the 373 BGP extension/signaling for IPsec key management and policy. 375 4 Solution Description 377 This solution uses BGP P2MP signaling where an originating PE 378 sends a message only to the Route Reflector (RR) and then the RR reflects 379 that message to the interested recipient PEs. The framework for such 380 signaling is described in [CONTROLLER-IKE] and it is referred to as 381 the device-to-controller trust model.
This trust model is significantly 382 different from the traditional peer-to-peer trust model, in which a P2P 383 signaling protocol such as IKEv2 [RFC7296] is used and the PE 384 devices directly authenticate each other and agree upon security 385 policy and keying material to protect communications between 386 themselves. The device-to-controller trust model leverages P2MP 387 signaling via the controller (e.g., the RR) to achieve much better 388 scale and performance for establishment and maintenance of a large 389 number of pair-wise Security Associations (SAs) among the PEs. 391 This device-to-controller trust model first secures the control 392 channel between each device and the controller using a peer-to-peer 393 protocol such as IKEv2 [RFC7296] to establish P2P SAs between each PE 394 and the RR. It then uses this secured control channel for P2MP 395 signaling in establishment of P2P SAs between each pair of PE 396 devices. 398 Each PE advertises to other PEs via the RR the information needed in 399 establishment of pair-wise SAs between itself and every other remote 400 PE. These pieces of information are sent as Sub-TLVs of IPsec tunnel 401 type in BGP Tunnel Encapsulation attribute. These Sub-TLVs are 402 detailed in Section 6 and are based on the DIM message components 403 from [CONTROLLER-IKE] and the IKEv2 specification [RFC7296]. The 404 IPsec tunnel TLV and its Sub-TLVs are sent along with the BGP 405 route (NLRI) for a given level of granularity. 407 If only a single SA is required per pair of PE devices to multiplex 408 user traffic for all tenants, then the IPsec tunnel TLV is advertised 409 along with IPv4 or IPv6 NLRI representing the loopback address of the 410 originating PE. It should be noted that this is not a VPN route but 411 rather an IPv4 or IPv6 route.
413 If an SA is required per tenant between a pair of PE devices, then 414 the IPsec tunnel TLV can be advertised along with EVPN IMET route 415 representing the tenant or can be advertised along with a new EVPN 416 route representing the tenant. 418 If an SA is required per tenant's subnet (e.g., per VLAN) between a 419 pair of PE devices, then the IPsec tunnel TLV is advertised along with 420 EVPN IMET route. 422 If an SA is required between a pair of tenant's devices represented by 423 a pair of IP addresses, then the IPsec tunnel TLV is advertised along 424 with EVPN IP Prefix Advertisement Route or EVPN MAC/IP Advertisement 425 route. 427 If an SA is required between a pair of tenant's devices represented by 428 a pair of MAC addresses, then the IPsec tunnel TLV is advertised along 429 with EVPN MAC/IP Advertisement route. 431 If an SA is required between a pair of Attachment Circuits (ACs) on 432 two PE devices (where an AC can be represented by ), then 433 the IPsec tunnel TLV is advertised along with EVPN Ethernet AD route. 435 4.1 Inheritance of Security Policies 437 Operationally, it is easy to configure a security association between 438 a pair of PEs using BGP signaling. This is the default security 439 association that is used for traffic that flows between peers. 440 However, in the event that a finer granularity of security association 441 is desired on the traffic flows, it is possible to set up SAs between 442 a pair of tenants, a pair of subnets within a tenant, a pair of IPs 443 within a subnet, and a pair of MACs within a subnet using the 444 appropriate EVPN routes as described above. In the event that there are 445 no security TLVs associated with an EVPN route, there is a strict 446 order in the manner security associations are inherited for such a 447 route. This results in an EVPN route inheriting the security 448 associations of the parent in a hierarchical fashion.
For example, 449 traffic between an IP pair is protected using security TLVs announced 450 along with the EVPN IP Prefix Advertisement Route or EVPN MAC/IP 451 Advertisement route as a first choice. If such TLVs are missing from 452 the associated route, then one checks to see if the subnets the IPs 453 are associated with have security TLVs with the EVPN IMET route. If 454 they are present, those associations are used in securing the 455 traffic. In their absence, the peer security associations are 456 used. The order in which security associations are inherited is from 457 the most granular to the coarsest, namely, IP/MAC associated TLVs with the 458 EVPN route being the first preference, and the subnet, the tenant, 459 and the peer associations preferred in that order. 461 It should be noted that when a security association is made it is 462 possible for it to be re-used by a large number of traffic flows. For 463 example, a tenant security association may be associated with a 464 number of child subnet routes. Clearly it is mandatory to keep a 465 tenant security association alive if there are one or more subnet 466 routes that want to use that association. Logically, the security 467 associations between a pair of entities create a single secure 468 tunnel. It is thus possible to classify the incoming traffic in the 469 most granular sense {IP/MAC, subnet, tenant, peer} to a particular 470 secure tunnel that falls within its route hierarchy. The policy that 471 is applied to such traffic is independent of its use of an existing 472 or a new secure tunnel. It is clear that since any number of 473 classified traffic flows can use a security association, such a 474 security association will not be torn down as long as at least one 475 policy is using that secure tunnel.
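The inheritance order described in Section 4.1 can be illustrated with a minimal sketch. This is not part of the specification: the level names, the `resolve_sa` helper, and the identifiers in the sample data are hypothetical, and the lookup assumes that each route which carried security TLVs has been recorded under a (level, identifier) key.

```python
from typing import Dict, Optional, Tuple

# Granularity levels from most granular to coarsest, following the
# {IP/MAC, subnet, tenant, peer} hierarchy described in Section 4.1.
LEVELS = ("ip_mac", "subnet", "tenant", "peer")

Key = Tuple[str, str]  # (level, identifier); identifiers are hypothetical


def resolve_sa(flow: Dict[str, str], advertised: Dict[Key, str]) -> Optional[Key]:
    """Return the (level, identifier) whose SA the flow inherits, or None."""
    for level in LEVELS:
        key = (level, flow[level])
        if key in advertised:  # security TLVs were received at this level
            return key
    return None


# Only the peer route and one subnet route carried security TLVs.
advertised = {
    ("peer", "PE2"): "peer-wide SA",
    ("subnet", "vlan-10"): "SA for VLAN 10",
}
flow_a = {"ip_mac": "192.0.2.5", "subnet": "vlan-10", "tenant": "t1", "peer": "PE2"}
flow_b = {"ip_mac": "192.0.2.9", "subnet": "vlan-20", "tenant": "t1", "peer": "PE2"}

print(resolve_sa(flow_a, advertised))  # ('subnet', 'vlan-10')
print(resolve_sa(flow_b, advertised))  # ('peer', 'PE2')
```

Walking the levels in a fixed order keeps the lookup independent of how many flows share an SA, which matches the observation that one association may serve many classified flows.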
477 4.2 Distribution of Public Keys and Policies 479 One of the requirements for this solution is to support a single DH 480 group and a single policy for all SAs as well as to support multiple 481 DH groups and policies among the SAs. The following subsections 482 describe what pieces of information (what Sub-TLVs) need to be 483 exchanged to support a single DH group and a single policy versus 484 multiple DH groups and multiple policies. 486 4.2.1 Minimal DIM 488 For SA establishment, at the minimum, a PE needs to advertise to 489 other PEs its DIM values as specified in [CONTROLLER-IKE]. These 490 include:
492    ID   Tunnel ID
493    N    Nonce
494    RC   Rekey Counter
495    I    Indication of initial policy distribution
496    KE   DH public value
498 When this minimal set of DIM values is sent, it is assumed that 499 all peer PEs share the same policy for which DH group to use, as well 500 as which IPsec SA policy to employ. Section 6.1 defines the Minimal 501 DIM sub-TLV as part of IPsec tunnel TLV in BGP Tunnel Encapsulation 502 Attribute. 504 4.2.2 Multiple Policies 506 There can be scenarios for which there is a need to have multiple 507 policy options. This can happen when there is a need for policy 508 change and smooth migration among all PE devices to the new policy is 509 required. It can also happen if different PE devices have different 510 capabilities within the network. In these scenarios, PE devices need 511 to be able to choose the correct policy to use for each other. This 512 multi-policy scheme is described in section 6 of [CONTROLLER-IKE]. In 513 order to support this multi-policy feature, a PE device MUST 514 distribute a policy list. This list consists of multiple distinct 515 policies in order of preference, where the first policy is the most 516 preferred one. The receiving PE selects the policy by taking the 517 received list (starting with the first policy) and comparing that 518 against its own list and choosing the first one found in common.
If 519 there is no match, this indicates a configuration error and the PEs 520 MUST NOT establish new SAs until a message is received that does 521 produce a match. 523 4.2.2.1 Multiple DH-groups 525 It can be the case that not all peers use the same DH group. When 526 multiple DH groups are supported, the peer may include multiple KE 527 Sub-TLVs. The order of the KE Sub-TLVs determines the preference. 528 The preference and selection methods are specified in Section 6 of 529 [CONTROLLER-IKE]. 531 4.2.2.2 Multiple or Single ESP SA policies 533 In order to specify an ESP SA Policy, a DIM may include one or more 534 SA Sub-TLVs. When all peers are configured by a controller with the 535 same ESP SA policy, they MAY leave the SA out of the DIM. This 536 minimizes messaging when group configuration is static and known. 537 However, it may also be desirable to include the SA. If a single SA 538 is included, the peer is indicating what ESP SA policy it uses, but 539 is not willing to negotiate. If multiple SA Sub-TLVs are included, 540 the peer is indicating that it is willing to negotiate. The order of 541 the SA Sub-TLVs determines the preference. The preference and 542 selection methods are specified in Section 6 of [CONTROLLER-IKE]. 544 4.3 Initial IPsec SAs Generation 546 The procedure for generation of initial IPsec SAs is described in 547 section 3 of [CONTROLLER-IKE]. This section gives a summary of it in 548 the context of BGP signaling. When a PE device first comes up and wants 549 to set up an IPsec SA between itself and each of the interested remote 550 PEs, it generates a DH pair along for each [what word here? 551 "tennant"?] using an algorithm defined in the IKEv2 Diffie-Hellman 552 Group Transform IDs [IKEv2-IANA]. The originating PE distributes the 553 DH public value along with the other values in the DIM (using IPsec 554 Tunnel TLV in Tunnel Encapsulation Attribute) to other remote PEs via 555 the RR.
Each receiving PE uses this DH public number and the 556 corresponding nonce to create an IPsec SA pair to the originating 557 PE, i.e., an outbound SA and an inbound SA. The detailed procedures 558 are described in section 5.2 of [CONTROLLER-IKE]. 560 4.4 Re-Keying 562 A PE can initiate re-keying at any time due to local time- or volume- 563 based policy or because a cipher counter is nearing its final 564 value. The rekey process is performed individually for each remote 565 PE. If rekeying is performed with multiple PEs simultaneously, then 566 the decision process and rules described for rekeying are applied 567 independently for each PE. Section 4 of [CONTROLLER-IKE] describes 568 this rekeying process in detail and gives examples for a single 569 IPsec device (e.g., a single PE) rekey versus multiple PE devices 570 rekeying simultaneously. 572 4.5 IPsec Databases 574 The Peer Authorization Database (PAD), the Security Policy Database 575 (SPD), and the Security Association Database (SAD) all need to be 576 set up as defined in the IPsec Security Architecture [RFC4301]. 577 Section 5 of [CONTROLLER-IKE] gives a summary description of how 578 these databases are set up for the controller-based model where keys are 579 exchanged via P2MP signaling via the controller (i.e., the RR) and 580 the policy can be either signaled via the RR (in case of multiple 581 policies) or configured by the management station (in case of single 582 policy). 584 5 Encapsulation 585 The vast majority of encapsulations for Network Virtualization Overlay 586 (NVO) networks in deployment are based on UDP/IP with the UDP destination 587 port ID indicating the type of NVO encapsulation (e.g., VxLAN, GPE, 588 GENEVE, GUE) and the UDP source port ID representing flow entropy for 589 load-balancing of the traffic within the fabric based on an n-tuple that 590 includes the UDP header.
When encrypting NVO encapsulated packets using 591 IP Encapsulating Security Payload (ESP), the following two options 592 can be used: a) adding a UDP header before the ESP header (i.e., UDP 593 header in clear) and b) no UDP header before the ESP header (i.e., 594 standard ESP encapsulation). The following subsections describe these 595 encapsulations in further detail. 597 5.1 Standard ESP Encapsulation 599 When standard IP Encapsulating Security Payload (ESP) is used 600 (without outer UDP header) for encryption of NVO packets, it is used 601 in transport mode as depicted below. When such encapsulation is used, 602 for BGP signaling, the Tunnel Type of Tunnel Encapsulation TLV is set 603 to ESP-Transport and the Tunnel Type of Encapsulation Extended 604 Community is set to NVO encapsulation type (e.g., VxLAN, GENEVE, GPE, 605 etc.). This implies that the customer packets are first encapsulated 606 using the NVO encapsulation type and are then further encapsulated and 607 encrypted using ESP-Transport mode.

609     +-+-+-+-+-+-+-+-+-+-+-+-+          +-+-+-+-+-+-+-+-+-+-+-+-+
610     |     MAC Header        |          |     MAC Header        |
611     +-+-+-+-+-+-+-+-+-+-+-+-+          +-+-+-+-+-+-+-+-+-+-+-+-+
612     | Eth Type = IPv4/IPv6  |          | Eth Type = IPv4/IPv6  |
613     +-+-+-+-+-+-+-+-+-+-+-+-+          +-+-+-+-+-+-+-+-+-+-+-+-+
614     |    IP Header          |          |    IP Header          |
615     |    Protocol = UDP     |          |    Protocol = ESP     |
616     +-----------------------+          +-----------------------+
617     |    UDP Header         |          |    ESP Header         |
618     |    Dest Port = VxLAN  |          +-----------------------+
619     +-----------------------+          |    UDP Header         |
620     |    VxLAN Header       |          |    Dest Port = VxLAN  |
621     +-----------------------+          +-----------------------+
622     |  Inner MAC Header     |          |    VxLAN Header       |
623     +-----------------------+          +-----------------------+
624     |  Inner Eth Payload    |          |  Inner MAC Header     |
625     +-----------------------+          +-----------------------+
626     |    CRC                |          |  Inner Eth Payload    |
627     +-----------------------+          +-----------------------+
628                                        | ESP Trailer (NP=UDP)  |
629                                        +-----------------------+
630                                        |    CRC                |
631                                        +-----------------------+

633               Figure 3: VxLAN Encapsulation within ESP

635  5.2 ESP Encapsulation within UDP packet

637     In scenarios where NAT traversal is required ([RFC3948]) or where
638     load balancing using the UDP header is required, ESP encapsulation
639     within a UDP packet, as depicted in the following figure, is used. The
640     ESP for NVO applications is in transport mode. The outer UDP header
641     (before the ESP header) has its source port set to flow entropy and
642     its destination port set to 4500 (indicating that an ESP header
643     follows). A non-zero SPI value in the ESP header implies that this is a
644     data packet (i.e., it is not an IKE packet). The Next Protocol field in
645     the ESP trailer indicates that what follows the ESP header is a UDP
646     header. This inner UDP header has a destination port ID that identifies
647     the NVO encapsulation type (e.g., VxLAN). Optimization of this packet
648     format where only a single UDP header is used (only the outer UDP
649     header) is for future study.
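
   As an illustration only, the outer header rules of Section 5.2 can
   be sketched in Python as follows. The sketch builds the outer UDP
   header (source port carrying flow entropy, destination port 4500)
   followed by the SPI and sequence number of the ESP header, and
   classifies a received UDP payload by checking for a non-zero SPI.
   The function names, the CRC-32 based entropy hash, the 49152-65535
   source port range, and the zero UDP checksum (an IPv4-style choice)
   are assumptions made for this example; none of them is specified by
   this document.

```python
import struct
import zlib

ESP_IN_UDP_PORT = 4500  # outer UDP destination port named in Section 5.2


def entropy_source_port(flow_key: bytes) -> int:
    """Hypothetical entropy function: hash the inner flow into 49152-65535."""
    return 49152 + zlib.crc32(flow_key) % 16384


def build_outer_headers(flow_key: bytes, spi: int, seq: int, rest_len: int) -> bytes:
    """Return the outer UDP header followed by the 8-byte ESP header.

    rest_len is the number of ESP bytes that follow the SPI and sequence
    number (encrypted inner UDP/NVO packet, ESP trailer, and ICV).
    """
    if not 0 < spi <= 0xFFFFFFFF:
        raise ValueError("a data packet needs a non-zero 32-bit SPI")
    udp_length = 8 + 8 + rest_len  # UDP header + SPI/sequence number + rest
    if udp_length > 0xFFFF:
        raise ValueError("packet too large for the 16-bit UDP length field")
    # Checksum is left as zero here (assumption; acceptable for IPv4 only).
    udp = struct.pack("!HHHH", entropy_source_port(flow_key),
                      ESP_IN_UDP_PORT, udp_length, 0)
    esp = struct.pack("!II", spi, seq)
    return udp + esp


def is_esp_data(udp_payload: bytes) -> bool:
    """True when the bytes after the outer UDP header start with a non-zero SPI."""
    if len(udp_payload) < 4:
        return False
    (spi,) = struct.unpack("!I", udp_payload[:4])
    return spi != 0
```

   A sender would call build_outer_headers() once per packet with a
   key derived from the inner flow, so that packets of one flow keep
   the same source port while different flows spread across the
   fabric. A receiver on port 4500 would use is_esp_data() to separate
   ESP data packets from IKE packets, whose first four bytes are zero.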

651     When such encapsulation is used, for BGP signaling, the Tunnel Type
652     of the Tunnel Encapsulation TLV is set to ESP-in-UDP-Transport and the
653     Tunnel Type of the Encapsulation Extended Community is set to the NVO
654     encapsulation type (e.g., VxLAN, GENEVE, GPE). This implies
655     that customer packets are first encapsulated using the NVO
656     encapsulation type and are then further encapsulated and encrypted
657     using ESP-in-UDP in transport mode.

659     [RFC3948]

661     +-+-+-+-+-+-+-+-+-+-+-+-+          +-+-+-+-+-+-+-+-+-+-+-+-+
662     |     MAC Header        |          |     MAC Header        |
663     +-+-+-+-+-+-+-+-+-+-+-+-+          +-+-+-+-+-+-+-+-+-+-+-+-+
664     | Eth Type = IPv4/IPv6  |          | Eth Type = IPv4/IPv6  |
665     +-+-+-+-+-+-+-+-+-+-+-+-+          +-+-+-+-+-+-+-+-+-+-+-+-+
666     |    IP Header          |          |    IP Header          |
667     |    Protocol = UDP     |          |    Protocol = UDP     |
668     +-----------------------+          +-----------------------+
669     |    UDP Header         |          |    UDP Header         |
670     |    Dest Port = VxLAN  |          | Dest Port = 4500(ESP) |
671     +-----------------------+          +-----------------------+
672     |    VxLAN Header       |          |    ESP Header         |
673     +-----------------------+          +-----------------------+
674     |  Inner MAC Header     |          |    UDP Header         |
675     +-----------------------+          |    Dest Port = VxLAN  |
676     |  Inner Eth Payload    |          +-----------------------+
677     +-----------------------+          |    VxLAN Header       |
678     |    CRC                |          +-----------------------+
679     +-----------------------+          |  Inner MAC Header     |
680                                        +-----------------------+
681                                        |  Inner Eth Payload    |
682                                        +-----------------------+
683                                        | ESP Trailer (NP=UDP)  |
684                                        +-----------------------+
685                                        |    CRC                |
686                                        +-----------------------+
687          Figure 4: VxLAN Encapsulation within ESP Within UDP

689  6 BGP Encoding

691     This document defines two new Tunnel Types, along with their associated
692     sub-TLVs, for the Tunnel Encapsulation Attribute [TUNNEL-ENCAP]. These
693     tunnel types correspond to ESP-Transport and ESP-in-UDP-Transport as
694     described in Section 5. The following sub-TLVs apply to both tunnel
695     types unless stated otherwise.

697  6.1 The Base (Minimal Set) DIM Sub-TLV
698     The Base DIM is described in Section 3.2.1. One and only one Base DIM
699     may be sent in the IPsec Tunnel TLV.

701      0                   1                   2                   3
702      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
703     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
704     |          ID Length            |  Nonce Length |I|   Flags     |
705     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
706     |                             Rekey                             |
707     |                            Counter                            |
708     +---------------------------------------------------------------+
709     |                                                               |
710     ~  Originator ID + (Tenant ID) + (Subnet ID) + (Tenant Address) ~
711     |                                                               |
712     +---------------------------------------------------------------+
713     |                                                               |
714     ~                          Nonce Data                           ~
715     |                                                               |
716     +---------------------------------------------------------------+

718                   Figure 5: The Base DIM Sub-TLV

720     ID Length (16 bits) is the length of the Originator ID + (Tenant ID)
721     + (Subnet ID) + (Tenant Address) in bytes.

723     Nonce Length (8 bits) is the length of the Nonce Data in bytes.

725     I (1 bit) is the initial contact flag from [CONTROLLER-IKE].

727     Flags (7 bits) are reserved and MUST be set to zero on transmit and
728     ignored on receipt.

730     The Rekey Counter is a 64-bit rekey counter as specified in
731     [CONTROLLER-IKE].

733     The Originator ID + (Tenant ID) + (Subnet ID) + (Tenant Address) is
734     the tunnel identifier and uniquely identifies the tunnel. Depending
735     on the granularity of the tunnel, the fields in () may not be used,
736     i.e., for a tunnel at the PE level of granularity, only Originator ID
737     is required.

739     The Nonce Data is the nonce described in [CONTROLLER-IKE]. Its
740     length is a multiple of 32 bits. Nonce lengths should be chosen to
741     meet minimum requirements described in IKEv2 [RFC7296].

743  6.2 Key Exchange Sub-TLV
744     The KE Sub-TLV is described in Sections 3.2.1 and 3.2.2.1. A KE is
745     always required. One or more KE Sub-TLVs may be included in the IPsec
746     Tunnel TLV.

748      0                   1                   2                   3
749      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
750     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
751     |   Diffie-Hellman Group Num    |           Reserved            |
752     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
753     |                                                               |
754     ~                       Key Exchange Data                       ~
755     |                                                               |
756     +---------------------------------------------------------------+

758                   Figure 6: Key Exchange Sub-TLV

760     Diffie-Hellman Group Num (16 bits) identifies the Diffie-Hellman
761     group in which the Key Exchange Data was computed. Diffie-Hellman group
762     numbers are discussed in IKEv2 [RFC7296] Appendix B and [RFC5114].

764     The Key Exchange payload is constructed by copying one's
765     Diffie-Hellman public value into the "Key Exchange Data" portion of the
766     payload. The length of the Diffie-Hellman public value is described
767     for MODP groups in [RFC7296] and for ECP groups in [RFC4753].

769  6.3 ESP SA Proposals Sub-TLV

771     The SA Sub-TLV is described in Section 3.2.2.2. Zero or more SA
772     Sub-TLVs may be included in the IPsec Tunnel TLV.

774      0                   1                   2                   3
775      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
776     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
777     | Num Transforms|                    Reserved                   |
778     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
779     |                                                               |
780     ~                          Transforms                           ~
781     |                                                               |
782     +---------------------------------------------------------------+

784                 Figure 7: ESP SA Proposals Sub-TLV

786     Num Transforms is the number of transforms included.

788     Reserved is not used and MUST be set to zero on transmit and MUST be
789     ignored on receipt.

791  6.3.1 Transform Substructure

793      0                   1                   2                   3
794      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
795     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
796     |     Transform Attr Length     |Transform Type |   Reserved    |
797     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
798     |         Transform ID          |           Reserved            |
799     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
800     |                                                               |
801     ~                     Transform Attributes                      ~
802     |                                                               |
803     +---------------------------------------------------------------+

805              Figure 8: Transform Substructure Sub-TLV

807     The Transform Attr Length is the length of the Transform Attributes
808     field.

810     The Transform Type is from Section 3.3.2 of [RFC7296] and
811     [IKEV2IANA]. Only the values ENCR, INTEG, and ESN are allowed.

813     The Transform ID specifies the transform identification value from
814     [IKEV2IANA].

816     Reserved is unused and MUST be zero on transmit and MUST be ignored
817     on receipt.

819     The Transform Attributes are taken directly from Section 3.3.5 of [RFC7296].

821  7 Applicability to other VPN types

823     Although P2MP BGP signaling for establishment and maintenance of SAs
824     among PE devices is described in this document in the context of EVPN,
825     there is no reason why it cannot be extended to other VPN
826     technologies such as IP-VPN [RFC4364], VPLS [RFC4761] & [RFC4762],
827     and MVPN [RFC6513] & [RFC6514] with ingress replication. EVPN
828     has been chosen because of its pervasiveness in DC, SP, and
829     Enterprise applications and because of its ability to support SA
830     establishment at different granularity levels such as per PE, per
831     tenant, per subnet, per Ethernet Segment, per IP address, and per
832     MAC. For other VPN technology types, far fewer granularity
833     levels can be supported. For example, for VPLS, only the granularity
834     of per PE and per subnet can be supported. For the per-PE granularity
835     level, the mechanism is the same among all the VPN technologies, as the
836     IPsec tunnel type (and its associated TLV and sub-TLVs) is sent
837     along with the PE's loopback IPv4 (or IPv6) address.
        For VPLS, if
838     per-subnet (per bridge domain) granularity level needs to be
839     supported, then the IPsec tunnel type and TLV are sent along with
840     the VPLS AD route.

842     The following table lists what level of granularity can be supported
843     by a given VPN technology and with what BGP route.

845     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
846     | Functionality | EVPN        | IP-VPN      | MVPN      | VPLS    |
847     +---------------+-------------+-------------+-----------+---------+
848     | per PE        |IPv4/v6 route|IPv4/v6 route|IPv4/v6 rte|IPv4/v6  |
849     +---------------+-------------+-------------+-----------+---------+
850     | per tenant    |IMET (or new)|lpbk (or new)| I-PMSI    | N/A     |
851     +---------------+-------------+-------------+-----------+---------+
852     | per subnet    | IMET        | N/A         | N/A       | VPLS AD |
853     +---------------+-------------+-------------+-----------+---------+
854     | per IP        |EVPN RT2/RT5 | VPN IP rt   | *,G or S,G| N/A     |
855     +---------------+-------------+-------------+-----------+---------+
856     | per MAC       | EVPN RT2    | N/A         | N/A       | N/A     |
857     +---------------+-------------+-------------+-----------+---------+

859  8 Acknowledgements

861  9 Security Considerations

863  10 IANA Considerations

865     A new transitive extended community Type of 0x06 and Sub-Type of TBD
866     for EVPN Attachment Circuit Extended Community needs to be allocated
867     by IANA.

869  11 References

871  11.1 Normative References

873     [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
874               Requirement Levels", BCP 14, RFC 2119, March 1997.

876     [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC2119
877               Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May
878               2017.

880     [RFC7432] Sajassi et al., "BGP MPLS Based Ethernet VPN", RFC 7432,
881               February 2015.

883     [RFC8365] Sajassi et al., "A Network Virtualization Overlay Solution
884               Using Ethernet VPN (EVPN)", RFC 8365, March 2018.

886     [TUNNEL-ENCAP] Rosen et al., "The BGP Tunnel Encapsulation
887               Attribute", draft-ietf-idr-tunnel-encaps-03, November
888               2016.

890     [CONTROLLER-IKE] Carrel et al., "IPsec Key Exchange using a
891               Controller", draft-carrel-ipsecme-controller-ike-00, July
892               2018.

894     [IKEV2IANA] IANA, "Internet Key Exchange Version 2 (IKEv2)
895               Parameters", .

898     [RFC3948] Huttunen et al., "UDP Encapsulation of IPsec ESP Packets",
899               RFC 3948, January 2005.

901     [IKEV2-IANA] IANA, "Internet Key Exchange Version 2 (IKEv2)
902               Parameters", February 2016,
903               www.iana.org/assignments/ikev2-parameters/ikev2-parameters.xhtml.

906     [RFC4301] Kent, S. and K. Seo, "Security Architecture for the
907               Internet Protocol", RFC 4301, DOI 10.17487/RFC4301,
908               December 2005.

910  11.2 Informative References

912     [RFC4364] Rosen, E., et al., "BGP/MPLS IP Virtual Private Networks
913               (VPNs)", RFC 4364, February 2006.

915     [RFC4761] Kompella, K., et al., "Virtual Private LAN Service (VPLS)
916               Using BGP for Auto-Discovery and Signaling", RFC 4761, January 2007.

918     [RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN Service (VPLS)
919               Using Label Distribution Protocol (LDP) Signaling", RFC 4762, January
920               2007.

922     [RFC6513] Rosen, E., et al., "Multicast in MPLS/BGP IP VPNs", RFC
923               6513, February 2012.

925     [RFC6514] Rosen, E., et al., "BGP Encodings and Procedures for
926               Multicast in MPLS/BGP IP VPNs", RFC 6514, February 2012.

928     [RFC7606] Chen, E., Scudder, J., Mohapatra, P., and K. Patel,
929               "Revised Error Handling for BGP UPDATE Messages", RFC 7606, August
930               2015, .

932     [802.1Q] "IEEE Standard for Local and metropolitan area networks -
933               Media Access Control (MAC) Bridges and Virtual Bridged Local Area
934               Networks", IEEE Std 802.1Q(tm), 2014 Edition, November 2014.

936     [RFC7348] Mahalingam, M., et al., "Virtual eXtensible Local Area
937               Network (VXLAN): A Framework for Overlaying Virtualized Layer 2
938               Networks over Layer 3 Networks", RFC 7348, DOI 10.17487/RFC7348,
939               August 2014.

941     [GENEVE] Gross, J., et al., "Geneve: Generic Network Virtualization
942               Encapsulation", Work in Progress, draft-ietf-nvo3-geneve-06, March
943               2018.

945  Authors' Addresses

947     Ali Sajassi
948     Cisco
949     Email: sajassi@cisco.com

951     Ayan Banerjee
952     Cisco
953     Email: ayabaner@cisco.com

955     Samir Thoria
956     Cisco
957     Email: sthoria@cisco.com

959     David Carrel
960     Cisco
961     Email: carrel@cisco.com

963     Brian Weis
964     Individual
965     Email: bew.stds@gmail.com

967     John Drake
968     Juniper
969     Email: jdrake@juniper.net