idnits 2.17.1 draft-dm-net2cloud-gap-analysis-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 19 instances of too long lines in the document, the longest one being 11 characters in excess of 72. ** The abstract seems to contain references ([Net2Cloud-problem]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (July 2, 2018) is 2123 days in the past. Is this intentional? 
Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: 'Net2Cloud-problem' is mentioned on line 460, but not defined == Missing Reference: 'RFC2332' is mentioned on line 129, but not defined == Missing Reference: 'Tunnel-Encap' is mentioned on line 229, but not defined == Missing Reference: 'Tunnel-Encaps' is mentioned on line 217, but not defined == Missing Reference: 'MEF-Cloud' is mentioned on line 294, but not defined == Missing Reference: 'RFC6325' is mentioned on line 432, but not defined == Unused Reference: 'RFC2119' is defined on line 476, but no explicit reference was found in the text == Unused Reference: 'RFC8192' is defined on line 481, but no explicit reference was found in the text == Unused Reference: 'RFC5521' is defined on line 484, but no explicit reference was found in the text == Unused Reference: 'ITU-T-X1036' is defined on line 503, but no explicit reference was found in the text == Outdated reference: A later version (-01) exists of draft-rosen-bess-secure-l3vpn-00 == Outdated reference: A later version (-07) exists of draft-dm-net2cloud-problem-statement-02 Summary: 2 errors (**), 0 flaws (~~), 13 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group L. Dunbar 2 Internet Draft A. Malis 3 Intended status: Informational Huawei 4 Expires: January 2019 6 July 2, 2018 8 Gap Analysis of Interconnecting Underlay with Cloud Overlay 9 draft-dm-net2cloud-gap-analysis-00 11 Abstract 13 This document analyzes the technological gaps when using SD-WAN to 14 interconnect workloads & apps hosted in various locations, 15 especially cloud data centers when the network service providers do 16 not have or have limited physical infrastructure to reach the 17 locations [Net2Cloud-problem]. 
19 Status of this Memo 21 This Internet-Draft is submitted in full conformance with the 22 provisions of BCP 78 and BCP 79. 24 This Internet-Draft is submitted in full conformance with the 25 provisions of BCP 78 and BCP 79. This document may not be modified, 26 and derivative works of it may not be created, except to publish it 27 as an RFC and to translate it into languages other than English. 29 Internet-Drafts are working documents of the Internet Engineering 30 Task Force (IETF), its areas, and its working groups. Note that 31 other groups may also distribute working documents as Internet- 32 Drafts. 34 Internet-Drafts are draft documents valid for a maximum of six 35 months and may be updated, replaced, or obsoleted by other documents 36 at any time. It is inappropriate to use Internet-Drafts as 37 reference material or to cite them other than as "work in progress." 39 The list of current Internet-Drafts can be accessed at 40 http://www.ietf.org/ietf/1id-abstracts.txt 41 The list of Internet-Draft Shadow Directories can be accessed at 42 http://www.ietf.org/shadow.html 44 This Internet-Draft will expire on December 2, 2018. 46 Copyright Notice 48 Copyright (c) 2018 IETF Trust and the persons identified as the 49 document authors. All rights reserved. 51 This document is subject to BCP 78 and the IETF Trust's Legal 52 Provisions Relating to IETF Documents 53 (http://trustee.ietf.org/license-info) in effect on the date of 54 publication of this document. Please review these documents 55 carefully, as they describe your rights and restrictions with 56 respect to this document. Code Components extracted from this 57 document must include Simplified BSD License text as described in 58 Section 4.e of the Trust Legal Provisions and are provided without 59 warranty as described in the Simplified BSD License. 61 Table of Contents 63 1. Introduction...................................................3 64 2. 
Conventions used in this document..............................3 65 3. Gap Analysis of CPEs Registration Protocol.....................4 66 4. Gap Analysis in aggregating VPN paths and Internet paths.......4 67 4.1. Gap analysis of Using BGP to cover SD-WAN paths...........6 68 4.2. Gaps in preventing attacks to CPEs from their Internet ports 69 ...............................................................7 70 5. Gap analysis of CPEs not directly connected to VPN PEs.........8 71 5.1. Gap Analysis of Floating PEs to connect to Remote CPEs...10 72 5.2. NAT Traversing...........................................10 73 5.3. Complication of use BGP between PE and remote CPEs via 74 Internet......................................................10 75 5.4. Designated Forwarder to the remote edges.................11 76 5.5. Traffic Path Management..................................12 77 6. Manageability Considerations..................................12 78 7. Security Considerations.......................................12 79 8. IANA Considerations...........................................12 80 9. References....................................................12 81 9.1. Normative References.....................................13 82 9.2. Informative References...................................13 84 10. Acknowledgments..............................................14 86 1. Introduction 88 [Net2Cloud-Problem] describes the problems that enterprises face today 89 in transitioning their IT infrastructure to support the digital economy, 90 such as connecting enterprises' branch offices to dynamic workloads 91 in Cloud DCs. 93 This document analyzes the technological gaps in interconnecting 94 dynamic workloads & apps hosted in various locations, especially in 95 cloud data centers that the network service providers have no, or 96 only limited, physical infrastructure to reach. 98 2.
Conventions used in this document 100 Cloud DC: Off-Premise Data Centers that usually host applications 101 and workloads owned by different organizations or 102 tenants. 104 Controller: Used interchangeably with SD-WAN controller; it manages 105 SD-WAN overlay path creation/deletion and monitors the 106 path conditions between two sites. 108 CPE-Based VPN: A secure virtual private network formed among CPEs, 109 as distinct from the more commonly used PE-based 110 VPNs. 112 OnPrem: On Premises data centers and branch offices. 114 SD-WAN: Software Defined Wide Area Network, which can mean many 115 different things. In this document, "SD-WAN" refers to 116 the solutions specified by ONUG (Open Network User 117 Group), which build point-to-point IPsec overlay paths 118 between two end-points (or branch offices) that need to 119 intercommunicate. 121 3. Gap Analysis of CPEs Registration Protocol 123 SD-WAN, conceived in ONUG (Open Network User Group) a few years ago 124 as a way to aggregate multiple connections between any two points, has 125 emerged as an on-demand technology to securely interconnect the 126 OnPrem branches with the workloads instantiated in Cloud DCs that do 127 not have a co-located MPLS VPN PE or have very limited bandwidth. 129 Some SD-WAN networks use the NHRP protocol [RFC2332] to register SD- 130 WAN endpoints with a "Controller" (or NHRP server), which then has 131 the ability to map a private VPN address to a public IP address of 132 the destination node. DSVPN [DSVPN] or DMVPN [DMVPN] is then used to 133 establish tunnels among SD-WAN endpoints. 135 NHRP was originally intended for ATM address resolution, and as a 136 result it lacks many attributes that are necessary for dynamic 137 registration of end point CPEs with the controller, such as: 139 - Location identifier, such as Site Identifier, System ID, and/or Port ID. 140 - CPE attached GW information.
When a CPE is instantiated within a Cloud DC, 141 this is the Cloud DC operator's GW to which the CPE is attached. 142 - Private <-> Public address mapping, which is needed when the CPEs use 143 private addresses. 144 - IPsec configuration parameters (from controller to CPEs). 146 4. Gap Analysis in aggregating VPN paths and Internet paths 148 Most likely, enterprises, especially large ones, already have their 149 CPEs interconnected by provider VPNs, such as EVPN, L2VPN, or L3VPN. 150 The L2VPN or L3VPN can also be formed among all the CPEs directly 151 attached to PEs, which is referred to as CPE based VPN as shown in 152 the following diagram. The commonly used CPE based VPNs have CPEs 153 directly attached to PEs via VLANs (Ethernet). Therefore, the 154 communication is secure. BGP is used to distribute routes among 155 CPEs. 157 +---+ 158 |RR | EVPN MAC/IP BGP updates 159 +======+---+===========+ 160 // \\ 161 // <-----EVPN-VxLAN----> \\ 162 +-+--+ ++-+ ++-+ +--+-+ 163 | CPE|--|PE| |PE+--+ CPE| 164 +--| 1 | |1 | |x | | c |---+ 165 +-+--+ ++-+ ++-+ +----+ 166 | | 167 | VPN +-+---+ +----+ 168 +--------+ | Network | PE3 | |CPE | 169 | CPE | | | |- --| 3 | 170 | c | +-----+ +-+---+ +----+ 171 +------+-+-------+ PE4 |-----+ 172 +---+-+ 174 === or \\ indicates control plane communications 176 Figure 1: L2 or L3 VPNs over IP WAN 178 To use SD-WAN to aggregate Internet paths with the VPN paths, the 179 CPEs need to have some ports connected to PEs and other ports 180 connected to the internet. NHRP & DSVPN/DMVPN can be used for the 181 CPEs to be registered with their SD-WAN Controllers to establish 182 secure tunnels among relevant CPEs. 184 That means the CPEs need to participate in two separate control 185 planes: EVPN & BGP for the CPE based VPN via links directly attached to 186 PEs, and NHRP & DSVPN/DMVPN for the Internet paths. Two separate control planes not only add 187 complexity to CPEs, but also increase operational cost.
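The two-control-plane burden described above can be illustrated with a minimal sketch. It is not part of any specification: the class names, the port labels ("ge0", "ge1"), and the helper function are hypothetical, and the only thing taken from the text is the mapping of PE-attached links to EVPN & BGP and of Internet-facing ports to NHRP & DSVPN/DMVPN.

```python
from dataclasses import dataclass
from enum import Enum


class Attachment(Enum):
    PE = "directly attached to a PE"
    INTERNET = "facing the internet"


class ControlPlane(Enum):
    EVPN_BGP = "EVPN & BGP"
    NHRP_DSVPN_DMVPN = "NHRP & DSVPN/DMVPN"


# Mapping taken from the text: PE-attached links use EVPN & BGP,
# Internet-facing ports use NHRP & DSVPN/DMVPN.
CONTROL_PLANE_FOR = {
    Attachment.PE: ControlPlane.EVPN_BGP,
    Attachment.INTERNET: ControlPlane.NHRP_DSVPN_DMVPN,
}


@dataclass(frozen=True)
class Port:
    name: str  # hypothetical port label, for illustration only
    attachment: Attachment


def required_control_planes(ports):
    """Return the set of control planes a CPE with these ports must run."""
    return {CONTROL_PLANE_FOR[port.attachment] for port in ports}


# A CPE with only PE-attached ports versus one that aggregates both path types.
vpn_only_cpe = [Port("ge0", Attachment.PE)]
aggregating_cpe = [Port("ge0", Attachment.PE), Port("ge1", Attachment.INTERNET)]

for label, ports in (("VPN only", vpn_only_cpe), ("aggregating", aggregating_cpe)):
    planes = sorted(cp.value for cp in required_control_planes(ports))
    print(label, "->", planes)
```

In this sketch, a CPE needs the second control plane as soon as it has a single Internet-facing port, which is the added complexity the text refers to.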
189 +---------Internet paths--------------+ 190 | | 191 | +---+ | 192 | |RR | | 193 | +======+---+===========+ | 194 | // \\ | 195 | // <-----EVPN-VxLAN----> \\ | 196 | +-+--+ ++-+ ++-+ +--+-+ (|) 197 | | CPE|--|PE| |PE+--+ CPE| (|) 198 +--| 1 | |1 | |x | | c |---+ 199 +-+--+ ++-+ ++-+ +----+ 200 | | 201 | VPN +-+---+ +----+ 202 +--------+ | Network | PE3 | |CPE | 203 | CPE | | | |- --| 3 | 204 | c | +-----+ +-+---+ +----+ 205 +------+-+-------+ PE4 |-----+ 206 +---+-+ 207 Figure 2: CPEs interconnected by VPN paths and Internet Paths 209 4.1. Gap analysis of Using BGP to cover SD-WAN paths 211 Since BGP is widely deployed, it is desirable to consider using BGP 212 to control the SD-WAN paths instead of NHRP, DSVPN/DMVPN. This 213 section analyzes the gaps in using BGP to control SD-WAN. 215 RFC5512 and [Tunnel-Encap] describe methods for end points to 216 advertise tunnel information and to trigger tunnel establishment. 217 RFC5512 & [Tunnel-Encaps] have the Endpoint Address to indicate the IPv4 218 or IPv6 address format and the Tunnel Encapsulation attribute to indicate 219 different encapsulation formats, such as L2TPv3, GRE, VxLAN, IP in 220 IP, etc. There are sub-TLVs to describe the detailed tunnel 221 information for each of the encapsulations. 223 There is also a Color sub-TLV to describe customer specified 224 information about the tunnels (which can be creatively used for SD-WAN). 226 To express support for multiple Encap types, multiple Extended 227 communities with SAFI value = 7 can be used. 229 Here are some of the gaps in using RFC5512 and [Tunnel-Encap] to 230 control SD-WAN: 232 - Does not have fields to carry detailed information about the remote CPE, 233 such as Site-ID, System-ID, and Port-ID. 235 - Does not have a proper field to express IPsec configuration 236 information from the "Controller" (which can be the RR) to CPEs. 237 - Does not have a proper way for two peer CPEs to negotiate IPsec keys based 238 on the configuration sent from the Controller.
239 - Does not have a field for the UDP NAT private address <-> public address mapping. 240 - CPEs tend to communicate with only a few other CPEs; not all the CPEs need to 241 form mesh connections. Using BGP, CPEs can easily be flooded with 242 information about other CPEs with which they never need to communicate. 243 NHRP only sends the relevant information for the interested end 244 points for establishing tunnels. Therefore, some form of 245 "Registration" method is needed. 247 [VPN-over-Internet] describes a way to securely interconnect CPEs 248 via IPsec using BGP. This method is useful; however, it still misses 249 some aspects needed to aggregate CPE based VPN paths with internet paths 250 that interconnect the CPEs. In addition: 252 - The draft assumes that CPEs "register" with the RR. However, it does not 253 say how. Should "NHRP" (modified version) be considered? In SD-WAN, Zero 254 Touch Provisioning is expected. It is not acceptable to require manual 255 configuration on the RR of which CPEs it controls. 256 - The draft assumes that the CPE and RR are connected by an IPsec tunnel. With 257 zero touch provisioning, we need an automatic way to synchronize the 258 IPsec SA between CPE and RR. The draft assumes: 259 A CPE must also be provisioned with whatever additional information 260 is needed in order to set up an IPsec SA with each of the red RRs 262 - IPsec requires periodic refreshment of the keys. How is the 263 refreshment synchronized among multiple nodes? 264 - IPsec usually sends only configuration parameters to the two end points and 265 lets the two end points negotiate the key. Here the RR is assumed to be 266 responsible for creating the key for all end points. When one end point 267 is compromised, all other connections are impacted. 269 4.2. Gaps in preventing attacks to CPEs from their Internet ports 271 When CPEs have ports facing the internet, those ports bring in the security 272 risk of potential DDoS attacks on the CPEs. 273 I.e.
the CPE resources are attacked by unwanted traffic. 275 To mitigate this security risk, it is necessary to enable 276 Anti-DDoS features on those CPEs to prevent major DDoS attacks. 278 5. Gap analysis of CPEs not directly connected to VPN PEs 280 Because of the ephemeral property of the selected Cloud DCs, an 281 enterprise or its network service provider may not have direct 282 links to the Cloud DCs that are optimal for hosting the enterprise's 283 specific workloads/Apps. Under those circumstances, SD-WAN is a very 284 flexible choice to interconnect the enterprise on-premises data 285 centers & branch offices to its desired Cloud DCs. 287 However, SD-WAN paths over public internet can have unpredictable 288 performance, especially over long distances and across state/country 289 boundaries. Therefore, it is highly desirable to place as much of 290 each SD-WAN path as possible over a service provider VPN (e.g. the 291 enterprise's existing VPN) that has a guaranteed SLA, to minimize the 292 distance/segments over the public internet. 294 MEF Cloud Service Architecture [MEF-Cloud] also describes a use case 295 of network operators needing to use SD-WAN over LTE or public 296 internet for last mile access where they do not have physical 297 infrastructure. 299 Under those scenarios, one or both of the SD-WAN end points may not 300 be directly attached to the PEs of an SR Domain. 302 When SD-WAN is used to connect the enterprise's existing sites with the 303 workloads in a Cloud DC, the CPEs at those sites have to 304 be upgraded to support SD-WAN. If the workloads in the Cloud DC need to 305 be connected to many sites, the upgrade process can be very 306 expensive. 308 [Net2Cloud-Problem] describes a hybrid network approach that 309 integrates SD-WAN with traditional MPLS-based VPNs, to extend the 310 existing MPLS-based VPNs to the Cloud DC Workloads over the access 311 paths that are not under the VPN provider control.
To make this 312 work properly, a small number of the PEs of the MPLS VPN can be 313 designated to connect to the remote workloads via SD-WAN secure 314 IPsec tunnels. Those designated PEs are shown as fPE (floating PE 315 or smart PE) in Figure 3 below. Once the secure IPsec tunnels are 316 established, the workloads in the Cloud DC can be reached by the 317 enterprise's VPN without upgrading all of the enterprise's existing 318 CPEs. The only CPE that needs to support SD-WAN would be a 319 virtualized CPE instantiated within the cloud DC. 321 +--------+ +--------+ 322 | Host-a +--+ +----| Host-b | 323 | | | (') | | 324 +--------+ | +-----------+ ( ) +--------+ 325 | +-+--+ ++-+ ++-+ +--+-+ (_) 326 | | CPE|--|PE| |PE+--+ CPE| | 327 +--| | | | | | | |---+ 328 +-+--+ ++-+ ++-+ +----+ 329 / | | 330 / | MPLS +-+---+ +--+-++--------+ 331 +------+-+ | Network |fPE-1| |CPE || Host | 332 | Host | | | |- --| || d | 333 | c | +-----+ +-+---+ +--+-++--------+ 334 +--------+ |fPE-2|-----+ 335 +---+-+ (|) 336 (|) (|) SD-WAN 337 (|) (|) over any access 338 +=\======+=========+ 339 // \ | Cloud DC \\ 340 // \ ++-----+ \\ 341 +Remote| 342 | CPE | 343 +-+----+ 344 ----+-------+-------+----- 345 | | 346 +---+----+ +---+----+ 347 | Remote | | Remote | 348 | App-1 | | App-2 | 349 +--------+ +--------+ 351 Figure 3: VPN Extension to Cloud DC 353 In Figure 3 above, the optimal Cloud DC to host the workloads (due 354 to proximity, capacity, pricing, or other criteria chosen by the 355 enterprises) does not happen to have a direct connection to the PEs 356 of the MPLS VPN that interconnects the enterprise's existing sites. 358 5.1. Gap Analysis of Floating PEs to connect to Remote CPEs 360 To extend MPLS VPN to remote CPEs, it is necessary to establish 361 secure tunnels (such as IPsec tunnels) between the Floating PEs and 362 the remote CPEs.
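As a rough illustration of the tunnel state this implies, the sketch below enumerates the secure tunnels needed between floating PEs and remote CPEs. It assumes, purely for illustration, that every floating PE selected for a Cloud DC terminates a tunnel to every remote CPE in that Cloud DC; the text does not mandate this. The class and function names are hypothetical, and the node names are borrowed from Figure 3.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class IpsecTunnel:
    fpe: str         # floating PE end of the tunnel
    remote_cpe: str  # remote (most likely virtualized) CPE end of the tunnel


def required_tunnels(floating_pes, remote_cpes):
    """List one tunnel per (floating PE, remote CPE) pair.

    Assumption: a full set of pairings. A real deployment may select
    only a subset of these pairs.
    """
    return [IpsecTunnel(fpe, cpe) for fpe, cpe in product(floating_pes, remote_cpes)]


# Node names follow Figure 3: two floating PEs and one remote CPE.
tunnels = required_tunnels(["fPE-1", "fPE-2"], ["Remote CPE"])
for tunnel in tunnels:
    print(f"{tunnel.fpe} <=IPsec=> {tunnel.remote_cpe}")
```

Under this assumption the number of tunnels grows as the product of the two sets, which is one reason the selection of floating PEs matters.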
364 Gap: 366 Even though a set of PEs can be manually selected to act as the 367 floating PEs for a specific cloud data center, there are no standard 368 protocols for those PEs to interact with the remote CPEs (most 369 likely virtualized) instantiated in the third party cloud data 370 centers (such as exchanging performance information or route 371 information). 373 When there is more than one fPE available for use (as there should 374 be for resiliency or the ability to support multiple cloud DCs 375 scattered geographically), it is not straightforward to designate an egress 376 fPE to remote CPEs based on applications. There is too much 377 application traffic traversing the PEs for them to 378 recognize the applications carried in the payload. 380 5.2. NAT Traversing 382 Most cloud DCs only assign private IP addresses to the workloads 383 instantiated. Therefore, the traffic to/from the workload usually 384 needs to traverse NAT. 386 5.3. Complication of use BGP between PE and remote CPEs via Internet 388 Even though the EBGP (external BGP) Multihop method can be used to 389 connect peers that are not directly connected to each other, there 390 are still some complications/gaps in extending BGP from MPLS VPN PEs 391 to remote CPEs via any access paths (e.g. internet): 393 The EBGP Multi-hop scheme requires static configuration on both peers. 394 To use EBGP between a PE and remote CPEs, the PE has to be 395 statically configured with "next-hop" to the IP addresses of the 396 CPEs. When remote CPEs, especially remote virtualized CPEs, are 397 dynamically instantiated or removed, the Multi-hop EBGP configuration on the PE 398 has to be changed accordingly. 400 Gap: 402 Egress peering engineering (EPE) is not enough. Running BGP on 403 virtualized CPE in Cloud DC requires GRE tunnels to be established 404 first, which requires address and key management for the remote 405 CPEs.
RFC 7024 (Virtual Hub & Spoke) and Hierarchical VPN are not 406 enough. 408 A method is also needed to automatically trigger configuration changes 409 on the PE when remote CPEs are instantiated, moved (IP address 410 change), or deleted. 412 The EBGP Multi-hop scheme does not have an embedded security mechanism. 413 The PE and remote CPEs need a secure communication channel when 414 connected via the public internet. 416 Remote CPEs, if instantiated in a Cloud DC, might have to traverse NAT 417 to reach the PE. It is not clear how BGP can be used between devices 418 outside the NAT and the entities behind the NAT. It is not clear how 419 to configure the Next Hop on the PEs to reach private addresses. 421 5.4. Designated Forwarder to the remote edges 423 Among multiple floating PEs available for a remote CPE, multicast 424 traffic from the remote CPE towards the MPLS VPN can be sent 425 back to the remote CPE, because the PE that receives the multicast/broadcast 426 frame forwards it to other PEs, which in 427 turn send it to all attached CPEs. This process may cause a traffic 428 loop. 430 Therefore, it is necessary to designate one floating PE as the CPE's 431 Designated Forwarder, similar to TRILL's Appointed Forwarders 432 [RFC6325]. 434 Gap: the MPLS VPN does not have features like TRILL's Appointed 435 Forwarders. 437 5.5. Traffic Path Management 439 When there are multiple floating PEs that have established IPsec 440 tunnels to the remote CPE, the remote CPE can forward the outbound 441 traffic to the Designated Forwarder PE, which in turn forwards the 442 traffic to the egress PEs towards the destinations. However, it is not 443 straightforward for the egress PE to send the return traffic back to 444 the Designated Forwarder PE. 446 Example of return path management using Figure 3 above: 448 - fPE-1 is desired for communication between App-1 <-> Host-a due to 449 latency, pricing or other criteria. 450 - fPE-2 is desired for communication between App-1 <-> Host-b. 452 6.
Manageability Considerations 454 TBD 456 7. Security Considerations 458 The intention of this draft is to identify the gaps between current and 459 proposed SD-WAN approaches and the requirements identified in 460 [Net2Cloud-problem]. 462 Several of these approaches have gaps in meeting enterprise 463 security requirements when tunneling their traffic over the 464 Internet, as is the general intention of SD-WAN. See the 465 individual sections above for further discussion of these security 466 gaps. 468 8. IANA Considerations 470 This document requires no IANA actions. RFC Editor: Please remove 471 this section before publication. 473 9. References 474 9.1. Normative References 476 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 477 Requirement Levels", BCP 14, RFC 2119, March 1997. 479 9.2. Informative References 481 [RFC8192] S. Hares, et al, "Interface to Network Security Functions 482 (I2NSF) Problem Statement and Use Cases", July 2017 484 [RFC5521] P. Mohapatra, E. Rosen, "The BGP Encapsulation Subsequent 485 Address Family Identifier (SAFI) and the BGP Tunnel 486 Encapsulation Attribute", April 2009. 488 [Tunnel-Encap] E. Rosen, et al, "The BGP Tunnel Encapsulation 489 Attribute", draft-ietf-idr-tunnel-encaps-09, Feb 2018. 491 [VPN-over-Internet] E. Rosen, "Provide Secure Layer L3VPNs over 492 Public Infrastructure", draft-rosen-bess-secure-l3vpn-00, 493 work-in-progress, July 2018 495 [DMVPN] Dynamic Multi-point VPN: 496 https://www.cisco.com/c/en/us/products/security/dynamic- 497 multipoint-vpn-dmvpn/index.html 499 [DSVPN] Dynamic Smart VPN: 500 http://forum.huawei.com/enterprise/en/thread-390771-1- 501 1.html 503 [ITU-T-X1036] ITU-T Recommendation X.1036, "Framework for creation, 504 storage, distribution and enforcement of policies for 505 network security", Nov 2007. 507 [Net2Cloud-Problem] L. Dunbar and A.
Malis, "Seamless Interconnect 508 Underlay to Cloud Overlay Problem Statement", draft-dm- 509 net2cloud-problem-statement-02, June 2018 511 10. Acknowledgments 513 Acknowledgements to xxx for his review and contributions. 515 This document was prepared using 2-Word-v2.0.template.dot. 517 Authors' Addresses 519 Linda Dunbar 520 Huawei 521 Email: Linda.Dunbar@huawei.com 523 Andrew G. Malis 524 Huawei 525 Email: agmalis@gmail.com