Network Working Group                                          L. Dunbar
Internet-Draft                                                  A. Malis
Intended status: Informational                                    Huawei
Expires: October 30, 2018                                 April 30, 2018


       Gap Analysis of VPN Extension to Dynamic Cloud Data Center
              draft-dm-vpn-ext-to-cloud-dc-gap-analysis-01

Abstract

   This document analyzes the technological gaps that must be addressed
   before existing VPNs can securely connect to dynamic workloads
   hosted in cloud data centers that do not have the VPN's PEs co-
   located [Dynamic-CloudDC].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   This document may not be modified, and derivative works of it may
   not be created, except to publish it as an RFC and to translate it
   into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."
   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on October 30, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Connecting OnPrem DCs and Branches to Dynamic Workloads in Cloud
      DCs
   4. Gap Analysis
      4.1. Floating PEs to Connect to Remote Edges
      4.2. Need for a Secure Channel into the Cloud DC
      4.3. Virtual Network Differentiation within One IPsec Tunnel
      4.4. NAT Traversal
      4.5. Complications of Using BGP between PEs and Remote CPEs via
           the Internet
      4.6. Controller-Facilitated Route Distribution
      4.7. Designated Forwarder to the Remote Edges
      4.8. Traffic Path Management
      4.9. Smart PE
   5. Manageability Considerations
   6. Security Considerations
   7. IANA Considerations
   8. References
      8.1. Normative References
      8.2. Informative References
   9. Acknowledgments

1. Introduction

   [Dynamic-CloudDC] describes the problems that today's state-of-the-
   art VPN technologies face in connecting enterprise branch offices to
   dynamic workloads in Cloud DCs.  This document analyzes the
   technological gaps that must be addressed before an existing VPN can
   securely connect to dynamic workloads hosted in cloud data centers
   that do not have the VPN's PEs co-located.

2. Conventions used in this document

   Cloud DC:   Off-premises data centers that usually host applications
               and workloads owned by different organizations or
               tenants.

   Controller: Used interchangeably with "SD-WAN controller"; it
               manages the creation and deletion of SD-WAN overlay
               paths and monitors the path conditions between two
               sites.

   OnPrem:     On-premises data centers and branch offices.

   SD-WAN:     Software-Defined Wide Area Network.  The term is used in
               many ways; in this document, "SD-WAN" refers to the
               solutions described by ONUG (Open Networking User
               Group), which build point-to-point IPsec overlay paths
               between any two endpoints (e.g., branch offices) that
               need to intercommunicate.
3. Connecting OnPrem DCs and Branches to Dynamic Workloads in Cloud DCs

   With the advent of widely available third-party cloud data centers
   in diverse geographic locations and the advancement of tools for
   monitoring and predicting application behaviors, it is technically
   feasible for enterprises to instantiate applications and workloads
   in the Cloud DCs that are geographically closest to their end users.
   Doing so can improve the overall end-user experience.

   However, those Cloud DCs might not have co-located PEs for the
   commonly deployed VPNs (e.g., L2VPN, L3VPN) that interconnect the
   enterprise's branch offices and on-premises data centers.

   SD-WAN, conceived in ONUG a few years ago, has emerged as an on-
   demand technology to securely interconnect any two locations, and in
   theory it can connect OnPrem branches with workloads instantiated in
   Cloud DCs that do not have an MPLS VPN PE co-located.  However, to
   use SD-WAN to connect an enterprise's existing sites with the
   workloads in a Cloud DC, the CPEs at those existing sites have to be
   upgraded to support SD-WAN.  If the workloads in the Cloud DC need
   to be connected to many sites, this upgrade process can be very
   expensive.

   [Dynamic-CloudDC] describes a hybrid approach, referred to as "VPN
   extension to dynamic Cloud DC" throughout this document, that
   integrates SD-WAN with traditional MPLS-based VPNs to connect OnPrem
   locations with Cloud DC workloads with minimal changes to the
   existing CPEs.

   The VPN extension to dynamic workloads in Cloud DCs assumes that the
   workloads in a Cloud DC can be temporary or may be migrated to
   different DCs over time, and therefore cannot justify the cost of
   adding new PEs to the existing MPLS VPN just to reach the Cloud DC.

   To extend the existing MPLS VPN to a Cloud DC over access paths that
   are not under the VPN provider's control, a small number of the MPLS
   VPN's PEs can be designated to connect to the remote workloads via
   SD-WAN secure IPsec tunnels.  Those designated PEs are shown as fPEs
   (floating PEs, or smart PEs) in Figure 1 below.  Once the secure
   IPsec tunnels are established, the workloads in the Cloud DC can be
   reached by the enterprise's VPN without upgrading all of the
   enterprise's existing CPEs.  The only CPE that needs to support SD-
   WAN is a virtualized CPE instantiated within the Cloud DC.
   +--------+   +-----+   +------------------+   +-----+   +--------+
   | Host-a +---+     |   |                  |   |     +---+ Host-b |
   +--------+   | CPE +---+ PE            PE +---+ CPE |   +--------+
   +--------+   |     |   |                  |   +-----+
   | Host-c +---+     |   |     MPLS VPN     |
   +--------+   +-----+   |     Network      |   +-----+   +--------+
                          |               PE +---+ CPE +---+ Host-d |
                          |                  |   +-----+   +--------+
                          |  fPE-1    fPE-2  |
                          +----+--------+----+
                               (|)      (|)        SD-WAN over
                               (|)      (|)        any access
                       +=======+========+=============+
                       |       |        |             |
                       |    +--+--------+--+          |
                       |    |  Remote CPE  |   Cloud  |
                       |    +-+----------+-+     DC   |
                       |      |          |            |
                       |  +---+----+ +---+----+       |
                       |  | Remote | | Remote |       |
                       |  | App-1  | | App-2  |       |
                       |  +--------+ +--------+       |
                       +==============================+

                  Figure 1: VPN Extension to Cloud DC

   In Figure 1 above, the optimal Cloud DC to host the workloads (due
   to proximity, capacity, pricing, or other criteria chosen by the
   enterprise) does not happen to have a direct connection to the PEs
   of the MPLS VPN that interconnects the enterprise's existing sites.

4. Gap Analysis

4.1. Floating PEs to Connect to Remote Edges

   When an enterprise's MPLS VPN does not have PEs co-located with the
   Cloud DC that is the optimal location to host its workloads, a small
   set of PEs can be designated as "floating PEs" (fPEs) to connect to
   the (virtualized) CPEs in the Cloud DC via SD-WAN IPsec tunnels over
   any access path, such as the public Internet, LTE, or others.

   The SD-WAN IPsec tunnels can be established as long as the PEs have
   the following properties:

   - They are able to terminate IPsec tunnels.

   - The performance between the PE and the remote CPE (or the
     virtualized CPE in the Cloud DC) can be measured, for example the
     round-trip delay via the Two-Way Active Measurement Protocol
     (TWAMP) [RFC5357], so that a more intelligent selection can be
     made when multiple PEs are available for the connection (a sketch
     of such a selection appears at the end of this section).

   - They have sufficient capacity to route traffic to and from the
     remote CPEs in the Cloud DC.

   Gap:

   Even though a set of PEs can be manually selected to act as the
   floating PEs for a specific cloud data center, there are no standard
   protocols for those PEs to interact with the remote CPEs (most
   likely virtualized) instantiated in third-party cloud data centers,
   for example to exchange performance or route information.

   Some SD-WAN networks use the NHRP protocol [RFC2332] to register SD-
   WAN endpoints with an NHRP server, which can then map a private VPN
   address to the public IP address of the destination node (i.e., a
   PE).  However, not all CPEs in cloud data centers support NHRP
   registration for the set of private addresses of the workloads
   instantiated in the data center, and they have no way to be
   automatically configured with the address of the NHRP server.
   Without the proper address of the CPE in the Cloud DC, it is
   difficult for an "optimal" fPE to act as the SD-WAN conduit to the
   DC.

   When there is more than one fPE available (as there should be, for
   resiliency or to support multiple geographically scattered Cloud
   DCs), multi-homing from the remote CPE in the cloud to the VPN has
   unresolved issues.
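   The selection among multiple candidate fPEs is a local decision and
   is not standardized today.  The following Python sketch illustrates
   one possible selection policy, assuming the round-trip delay toward
   the remote CPE has already been measured (for example with TWAMP
   [RFC5357]) and that each candidate PE's spare capacity is known.
   The data model, the capacity threshold, and all names are
   illustrative assumptions, not part of any protocol.

      # Illustrative sketch only: pick a floating PE for a remote
      # (virtualized) CPE from a set of candidates, using measured
      # round-trip delay and spare capacity.  Attribute names and the
      # threshold are assumptions; the measurements themselves would
      # come from a tool such as TWAMP.

      from dataclasses import dataclass
      from typing import List, Optional

      @dataclass
      class CandidateFPE:
          name: str
          supports_ipsec: bool        # can terminate IPsec tunnels
          rtt_ms: float               # delay to the remote CPE
          spare_capacity_mbps: float  # headroom toward the Cloud DC

      def select_fpe(candidates: List[CandidateFPE],
                     min_mbps: float = 100.0) -> Optional[CandidateFPE]:
          """Pick the lowest-delay candidate with enough capacity."""
          eligible = [c for c in candidates
                      if c.supports_ipsec
                      and c.spare_capacity_mbps >= min_mbps]
          return min(eligible, key=lambda c: c.rtt_ms, default=None)

      if __name__ == "__main__":
          fpes = [CandidateFPE("fPE-1", True, 18.0, 400.0),
                  CandidateFPE("fPE-2", True, 11.0, 80.0)]
          best = select_fpe(fpes)
          print(best.name if best else "no eligible fPE")  # "fPE-1"

   In this example fPE-2 has the lower delay but not enough spare
   capacity, so fPE-1 is chosen; a real policy could weigh delay,
   capacity, and cost differently.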
4.2. Need for a Secure Channel into the Cloud DC

   Today, the common network connection to a Cloud DC is an IPsec
   tunnel terminated at the Cloud DC gateway, which then depends on the
   Cloud DC's internal network to reach the leased compute and storage
   resources, or the virtual private cloud, within the Cloud DC.

   Some enterprises prefer to have secure tunnels that extend all the
   way to their own workloads hosted in the cloud, to increase their
   own security control over those workloads.  Since the OnPrem
   workloads or applications might not have application-layer security,
   the end-to-end secure path would run from either the OnPrem CPE or
   the PE into the virtual CPEs in the Cloud DC.

4.3. Virtual Network Differentiation within One IPsec Tunnel

   When multiple virtual networks in a Cloud DC are to be connected to
   the enterprise's existing VPN, it is desirable for traffic from
   those virtual networks to share the same IPsec tunnel between the
   PEs and the Cloud DC gateway.  It is therefore necessary to
   differentiate traffic belonging to different virtual networks within
   one IPsec tunnel.

4.4. NAT Traversal

   Most Cloud DCs assign only private IP addresses to the workloads
   they instantiate.  Therefore, traffic to and from the workloads
   usually needs to traverse a NAT.

4.5. Complications of Using BGP between PEs and Remote CPEs via the
     Internet

   Even though the EBGP (external BGP) multi-hop mechanism can be used
   to connect peers that are not directly connected to each other,
   there are still complications and gaps in extending BGP from MPLS
   VPN PEs to remote CPEs over arbitrary access paths (e.g., the
   Internet).

   The EBGP multi-hop scheme requires static configuration on both
   peers.  To use EBGP between a PE and remote CPEs, the PE has to be
   statically configured with the IP addresses of the CPEs as next
   hops.  When remote CPEs, especially remote virtualized CPEs, are
   dynamically instantiated or removed, the multi-hop EBGP
   configuration on the PE has to be changed accordingly.

   Gap:

   Egress peering engineering (EPE) is not enough.  Running BGP on a
   virtualized CPE in a Cloud DC requires GRE tunnels to be established
   first, which in turn requires address and key management for the
   remote CPEs.  RFC 7024 (Virtual Hub-and-Spoke) and hierarchical VPN
   are not enough either.

   A method is also needed to automatically trigger configuration
   changes on the PE when remote CPEs are instantiated, moved (i.e.,
   their IP addresses change), or deleted; a sketch of such automation
   is given at the end of this section.

   The EBGP multi-hop scheme has no embedded security mechanism.  The
   PE and the remote CPEs need a secure communication channel when they
   are connected via the public Internet.

   Remote CPEs, if instantiated in a Cloud DC, might have to traverse a
   NAT to reach the PE.  It is not clear how BGP can be used between
   devices outside the NAT and entities behind the NAT, nor how to
   configure the next hop on the PEs to reach private addresses.
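   None of this is automated by a standard mechanism today.  The
   following Python sketch only illustrates the style of automation the
   gap above calls for, assuming a controller that learns the lifecycle
   events of the remote virtual CPEs by some out-of-band means.  The
   pseudo-CLI text, the AS numbers, and the push callback are
   illustrative assumptions, not any vendor's or any standard's actual
   interface.

      # Illustrative sketch only: a hypothetical controller re-derives
      # the static EBGP multi-hop neighbor configuration for a PE
      # whenever a remote virtual CPE is instantiated, moved (new IP
      # address), or deleted.  The rendered text is generic pseudo-CLI,
      # not any vendor's actual syntax.

      from typing import Callable, Dict

      def ebgp_multihop_config(pe_asn: int,
                               cpes: Dict[str, int]) -> str:
          """Render neighbor statements for the known remote CPEs.

          cpes maps each CPE's reachable (post-NAT) address to its ASN.
          """
          lines = [f"router bgp {pe_asn}"]
          for addr, asn in sorted(cpes.items()):
              lines.append(f"  neighbor {addr} remote-as {asn}")
              lines.append(f"  neighbor {addr} ebgp-multihop 255")
          return "\n".join(lines)

      class CpeRegistry:
          """Tracks remote CPEs and pushes fresh config on each change."""

          def __init__(self, pe_asn: int, push: Callable[[str], None]):
              self.pe_asn = pe_asn
              self.push = push                # delivers config to the PE
              self.cpes: Dict[str, int] = {}  # CPE address -> CPE ASN

          def cpe_up(self, addr: str, asn: int) -> None:
              """A remote virtual CPE came up or changed its address."""
              self.cpes[addr] = asn
              self.push(ebgp_multihop_config(self.pe_asn, self.cpes))

          def cpe_down(self, addr: str) -> None:
              """A remote virtual CPE was deleted."""
              self.cpes.pop(addr, None)
              self.push(ebgp_multihop_config(self.pe_asn, self.cpes))

      if __name__ == "__main__":
          registry = CpeRegistry(pe_asn=64512, push=print)
          registry.cpe_up("203.0.113.10", 64999)  # CPE instantiated
          registry.cpe_down("203.0.113.10")       # CPE deleted

   Even with such automation, the other gaps noted above (securing the
   BGP session and traversing the NAT) remain.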
4.6. Controller-Facilitated Route Distribution

   Some remote applications and workloads hosted in third-party Cloud
   DCs may only need to communicate with a small number of subnets (or
   virtual networks) at a limited number of the enterprise's VPN sites.
   Running an IGP among the remote (virtual) CPE in the Cloud DC and
   all of the VPN sites to establish a full-mesh routing table for
   every site could be overkill.

   Instead of running an IGP with all the other sites, the remote CPEs
   can register their attached hosts with the controller (e.g., via
   NHRP), which in turn passes the addresses attached to the remote
   edges only to the relevant PEs/CPEs that need to communicate with
   those remote edges.  A sketch of this model is given at the end of
   this section.

   Gap:

   A complicating issue is that the remote CPEs are not directly
   connected to any of the PEs of the MPLS VPN.  This may make it
   difficult to use either an IGP or BGP as a method to distribute
   routes within the VPN to reach a particular private address within a
   data center.  However, route distribution may be possible once an
   IPsec tunnel has been established.  This needs to be investigated.
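   As a rough illustration of the controller-facilitated model
   described above, the following Python sketch shows a controller that
   accepts prefix registrations from a remote CPE and relays each
   prefix only to the VPN sites that have declared an interest in it.
   The subscribe/register calls are assumptions made for the
   illustration; they do not correspond to NHRP messages or to any
   existing protocol, and how the pushed routes would actually be
   installed on a PE is exactly the open question described in the gap
   above.

      # Illustrative sketch only: selective, controller-facilitated
      # route distribution.  A remote CPE registers the prefixes of its
      # attached workloads, and the controller relays each prefix only
      # to the sites that subscribed to it, instead of flooding every
      # VPN site.

      from collections import defaultdict
      from typing import Dict, List, Set, Tuple

      class RouteController:
          def __init__(self) -> None:
              # prefix -> sites (PEs/CPEs) that need to reach it
              self.interest: Dict[str, Set[str]] = defaultdict(set)
              # site -> (prefix, next-hop) pairs pushed so far
              self.pushed: Dict[str, List[Tuple[str, str]]] = \
                  defaultdict(list)

          def subscribe(self, site: str, prefix: str) -> None:
              """A VPN site declares interest in a remote prefix."""
              self.interest[prefix].add(site)

          def register(self, remote_cpe: str, prefix: str) -> None:
              """A remote CPE registers a workload prefix it reaches."""
              for site in self.interest.get(prefix, ()):
                  # push the route only to the sites that asked for it
                  self.pushed[site].append((prefix, remote_cpe))

      if __name__ == "__main__":
          ctrl = RouteController()
          ctrl.subscribe("PE-3", "10.1.1.0/24")       # one interested site
          ctrl.register("remote-CPE", "10.1.1.0/24")  # workload comes up
          print(ctrl.pushed["PE-3"])  # [('10.1.1.0/24', 'remote-CPE')]

   The point of the sketch is only the selectivity: sites that never
   subscribed do not learn the Cloud DC prefix, so the routing state
   stays proportional to the actual communication needs.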
4.7. Designated Forwarder to the Remote Edges

   When multiple floating PEs are available for a remote CPE, multicast
   or broadcast traffic sent from the remote CPE towards the MPLS VPN
   can come back to the remote CPE: the PE that receives the broadcast
   frame forwards the multicast/broadcast frame to the other PEs, which
   in turn send it to all of their attached CPEs.  This process may
   cause a traffic loop.

   Therefore, it is necessary to designate one floating PE as the CPE's
   Designated Forwarder, similar to TRILL's Appointed Forwarders
   [RFC6325].

   Gap: the MPLS VPN does not have a feature like TRILL's Appointed
   Forwarders.

4.8. Traffic Path Management

   When multiple floating PEs have established IPsec tunnels to the
   remote CPE, the remote CPE can forward its outbound traffic to the
   Designated Forwarder PE, which in turn forwards the traffic to the
   egress PEs towards the destinations.  However, it is not
   straightforward for an egress PE to send the return traffic back via
   the desired floating PE.

   Example of return-path management, using Figure 1 above:

   - fPE-1 is preferred for communication between App-1 and Host-a, due
     to latency, pricing, or other criteria.

   - fPE-2 is preferred for communication between App-1 and Host-b.

4.9. Smart PE

   A Smart PE is a PE that can interact with the remote CPE (or the
   controller) to learn the communication peers and patterns of the
   services hosted in the third-party DC.  With that learned
   information, the Smart PE can intelligently manage the transport
   paths within the MPLS-based VPN (for example, choosing optimized
   egress PEs) based on the delay and QoS measurements among the
   different PEs, in order to support the high-SLA requests from the
   CPE.

   Gap: a protocol is needed to select Smart PEs.

5. Manageability Considerations

   TBD.

6. Security Considerations

   TBD.

7. IANA Considerations

   This document requires no IANA actions.  RFC Editor: Please remove
   this section before publication.

8. References

8.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

8.2. Informative References

   [RFC2332] Luciani, J., Katz, D., Piscitello, D., Cole, B., and N.
             Doraswamy, "NBMA Next Hop Resolution Protocol (NHRP)",
             RFC 2332, April 1998.

   [RFC5357] Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
             Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
             RFC 5357, October 2008.

   [RFC6325] Perlman, R., Eastlake 3rd, D., Dutt, D., Gai, S., and A.
             Ghanwani, "Routing Bridges (RBridges): Base Protocol
             Specification", RFC 6325, July 2011.

   [RFC8192] Hares, S., Lopez, D., Zarny, M., Jacquenet, C., Kumar, R.,
             and J. Jeong, "Interface to Network Security Functions
             (I2NSF): Problem Statement and Use Cases", RFC 8192, July
             2017.

   [ITU-T-X1036] ITU-T Recommendation X.1036, "Framework for creation,
             storage, distribution and enforcement of policies for
             network security", November 2007.

   [Dynamic-CloudDC] Dunbar, L. and A. Malis, "Dynamic Cloud Data
             Center VPN Problem Statement", November 2017.

9. Acknowledgments

   Acknowledgements to xxx for his review and contributions.

Authors' Addresses

   Linda Dunbar
   Huawei
   Email: Linda.Dunbar@huawei.com

   Andrew G. Malis
   Huawei
   Email: agmalis@gmail.com