| < draft-ietf-rtgwg-net2cloud-problem-statement-05.txt | draft-ietf-rtgwg-net2cloud-problem-statement-06.txt > | |||
|---|---|---|---|---|
| Network Working Group L. Dunbar | Network Working Group L. Dunbar | |||
| Internet Draft Futurewei | Internet Draft Futurewei | |||
| Intended status: Informational Andy Malis | Intended status: Informational Andy Malis | |||
| Expires: March 2020 Independent | Expires: August 5, 2020 Independent | |||
| C. Jacquenet | C. Jacquenet | |||
| Orange | Orange | |||
| M. Toy | M. Toy | |||
| Verizon | Verizon | |||
| November 1, 2019 | February 5, 2020 | |||
| Dynamic Networks to Hybrid Cloud DCs Problem Statement | Dynamic Networks to Hybrid Cloud DCs Problem Statement | |||
| draft-ietf-rtgwg-net2cloud-problem-statement-05 | draft-ietf-rtgwg-net2cloud-problem-statement-06 | |||
| Abstract | Abstract | |||
| This document describes the problems that enterprises face today | This document describes the problems that enterprises face today | |||
| when interconnecting their branch offices with dynamic workloads in | when interconnecting their branch offices with dynamic workloads in | |||
| third party data centers (a.k.a. Cloud DCs). There can be many | third party data centers (a.k.a. Cloud DCs). There can be many | |||
| problems associated with network connecting to or among Clouds, many | problems associated with network connecting to or among Clouds, many | |||
| of which probably are out of the IETF scope. The objective of this | of which probably are out of the IETF scope. The objective of this | |||
| document is to identify some of the problems that need additional | document is to identify some of the problems that need additional | |||
| work in IETF Routing area. Other problems are out of the scope of | work in IETF Routing area. Other problems are out of the scope of | |||
| skipping to change at page 2, line 21 | skipping to change at page 2, line 21 | |||
| months and may be updated, replaced, or obsoleted by other documents | months and may be updated, replaced, or obsoleted by other documents | |||
| at any time. It is inappropriate to use Internet-Drafts as | at any time. It is inappropriate to use Internet-Drafts as | |||
| reference material or to cite them other than as "work in progress." | reference material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html | http://www.ietf.org/shadow.html | |||
| This Internet-Draft will expire on April 1, 2009. | This Internet-Draft will expire on August 5, 2020. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2019 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with | carefully, as they describe your rights and restrictions with | |||
| respect to this document. Code Components extracted from this | respect to this document. Code Components extracted from this | |||
| document must include Simplified BSD License text as described in | document must include Simplified BSD License text as described in | |||
| Section 4.e of the Trust Legal Provisions and are provided without | Section 4.e of the Trust Legal Provisions and are provided without | |||
| warranty as described in the Simplified BSD License. | warranty as described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction...................................................3 | 1. Introduction...................................................3 | |||
| 1.1. On the evolution of Cloud DC connectivity.................3 | 1.1. Key Characteristics of Cloud Services:....................3 | |||
| 1.2. The role of SD-WAN techniques in Cloud DC connectivity....4 | 1.2. Connecting to Cloud Services..............................3 | |||
| 2. Definition of terms............................................4 | 1.3. The role of SD-WAN in connecting to Cloud Services........4 | |||
| 3. Interconnecting Enterprise Sites with Cloud DCs................5 | 2. Definition of terms............................................5 | |||
| 3.1. Multiple connections to workloads in a Cloud DC...........6 | 3. High Level Issues of Connecting to Multi-Cloud.................6 | |||
| 3.2. Interconnect Private and Public Cloud DCs.................7 | 3.1. Security Issues...........................................6 | |||
| 3.3. Desired Properties for Networks that interconnect Hybrid | 3.2. Authorization and Identity Management.....................6 | |||
| Clouds.........................................................8 | 3.3. API abstraction...........................................7 | |||
| 4. Multiple Clouds Interconnection................................9 | 3.4. DNS for Cloud Resources...................................8 | |||
| 4.1. Multi-Cloud Interconnection...............................9 | 3.5. NAT for Cloud Services....................................8 | |||
| 4.2. Desired Properties for Multi-Cloud Interconnection.......11 | 3.6. Cloud Discovery...........................................9 | |||
| 5. Problems with MPLS-based VPNs extending to Hybrid Cloud DCs...11 | 4. Interconnecting Enterprise Sites with Cloud DCs................9 | |||
| 6. Problem with using IPsec tunnels to Cloud DCs.................13 | 4.1. Sites to Cloud DC........................................10 | |||
| 6.1. Complexity of multi-point any-to-any interconnection.....13 | 4.2. Inter-Cloud Interconnection..............................12 | |||
| 6.2. Poor performance over long distance......................14 | 5. Problems with MPLS-based VPNs extending to Hybrid Cloud DCs...13 | |||
| 6.3. Scaling Issues with IPsec Tunnels........................14 | 6. Problem with using IPsec tunnels to Cloud DCs.................15 | |||
| 7. Problems of Using SD-WAN to connect to Cloud DCs..............15 | 6.1. Scaling Issues with IPsec Tunnels........................15 | |||
| 7.1. SD-WAN among branch offices vs. interconnect to Cloud DCs15 | 6.2. Poor performance over long distance......................15 | |||
| 8. End-to-End Security Concerns for Data Flows...................18 | 7. Problems of Using SD-WAN to connect to Cloud DCs..............16 | |||
| 9. Requirements for Dynamic Cloud Data Center VPNs...............18 | 7.1. More Complexity to Edge Nodes............................16 | |||
| 10. Security Considerations......................................19 | 7.2. Edge WAN Port Management.................................17 | |||
| 11. IANA Considerations..........................................19 | 7.3. Forwarding based on Application..........................17 | |||
| 12. References...................................................19 | 8. End-to-End Security Concerns for Data Flows...................17 | |||
| 12.1. Normative References....................................19 | 9. Requirements for Dynamic Cloud Data Center VPNs...............17 | |||
| 10. Security Considerations......................................18 | ||||
| 11. IANA Considerations..........................................18 | ||||
| 12. References...................................................18 | ||||
| 12.1. Normative References....................................18 | ||||
| 12.2. Informative References..................................19 | 12.2. Informative References..................................19 | |||
| 13. Acknowledgments..............................................20 | 13. Acknowledgments..............................................19 | |||
| 1. Introduction | 1. Introduction | |||
| 1.1. On the evolution of Cloud DC connectivity | 1.1. Key Characteristics of Cloud Services: | |||
| The ever-increasing use of cloud applications for communication | Key characteristics of Cloud Services are on-demand, scalable, | |||
| services change the way corporate business works and shares | highly available, and usage-based billing. Cloud Services, such as | |||
| information. Such cloud applications use resources hosted in third | compute, storage, network functions (most likely virtual), third | |||
| party DCs that also host services for other customers. | party managed applications, etc. are usually hosted and managed by third parties Cloud Operators. Here are some examples of Cloud network | |||
| functions: Virtual Firewall services, Virtual private network | ||||
| services, Virtual PBX services including voice and video | ||||
| conferencing systems, etc. Cloud Data Center (DC) is shared | ||||
| infrastructure that hosts the Cloud Services to many customers. | ||||
| With the advent of widely available third-party cloud DCs in diverse | 1.2. Connecting to Cloud Services | |||
| geographic locations and the advancement of tools for monitoring and | ||||
| predicting application behaviors, it is technically feasible for | ||||
| enterprises to instantiate applications and workloads in locations | ||||
| that are geographically closest to their end-users. Such proximity | ||||
| improves end-to-end latency and overall user experience. Conversely, | ||||
| an enterprise can easily shutdown applications and workloads | ||||
| whenever end-users are in motion (thereby modifying the networking | ||||
| connection of subsequently relocated applications and workloads). In | ||||
| addition, an enterprise may wish to take advantage of more and more | ||||
| business applications offered by third party private cloud DCs. | ||||
| Most of those enterprise branch offices & on-premises data centers | With the advent of widely available third-party cloud DCs and | |||
| are already connected via VPNs, such as MPLS-based L2VPNs and | services in diverse geographic locations and the advancement of | |||
| L3VPNs. Then connecting to the cloud-hosted resources may not be | tools for monitoring and predicting application behaviors, it is | |||
| straightforward if the provider of the VPN service does not have | very attractive for enterprises to instantiate applications and | |||
| direct connections to the corresponding cloud DCs. Under those | workloads in locations that are geographically closest to their end- | |||
| circumstances, the enterprise can upgrade the CPEs deployed in its | users. Such proximity can improve end-to-end latency and overall | |||
| various premises to utilize SD-WAN techniques to reach cloud | user experience. Conversely, an enterprise can easily shut down | |||
| resources (without any assistance from the VPN service provider), or | applications and workloads whenever end-users are in motion (thereby | |||
| wait for their VPN service provider to make new agreements with data | modifying the networking connection of subsequently relocated | |||
| center providers to connect to the cloud resources. Either way has | applications and workloads). In addition, enterprises may wish to | |||
| additional infrastructure and operational costs. | take advantage of more and more business applications offered by | |||
| cloud operators. | ||||
| In addition, more enterprises are moving towards hybrid cloud DCs, | The networks that interconnect hybrid cloud DCs must address the | |||
| i.e. owned or operated by different Cloud operators, to maximize the | following requirements: | |||
| benefits of geographical proximity, elasticity and special features | - High availability to access all workloads in the desired cloud | |||
| offered by different cloud DCs. | DCs. | |||
| Many enterprises include cloud in their disaster recovery | ||||
| strategy, such as enforcing periodic backup policies within the | ||||
| cloud, or running backup applications in the Cloud. | ||||
| 1.2. The role of SD-WAN techniques in Cloud DC connectivity | - Global reachability from different geographical zones, thereby | |||
| facilitating the proximity of applications as a function of the | ||||
| end users' location, to improve latency. | ||||
| - Elasticity: prompt connection to newly instantiated | ||||
| applications at Cloud DCs when usages increase and prompt | ||||
| release of connection after applications at locations being | ||||
| removed when demands change. | ||||
| - Scalable security management. | ||||
| This document discusses the issues associated with connecting | 1.3. The role of SD-WAN in connecting to Cloud Services | |||
| enterprise's workloads/applications instantiated in multiple third- | ||||
| party data centers (a.k.a. Cloud DCs) and its on-prem data centers. | ||||
| Very often, the actual Cloud DCs that host the | ||||
| workloads/applications can be transient. | ||||
| SD-WAN, initially launched to maximize bandwidths between locations | Some of the characteristics of SD-WAN [SDWAN-BGP-USAGE], such as | |||
| by aggregating multiple paths managed by different service | network augmentation and forwarding based on application IDs instead | |||
| providers, has expanded to include flexible, on-demand, application- | of destination IP addresses, are essential for | |||
| based connections established over any networks to access dynamic | connecting to on-demand Cloud services. | |||
| workloads in Cloud DCs. | ||||
| Therefore, this document discusses the use of SD-WAN techniques to | Issues associated with using SD-WAN for connecting to Cloud services | |||
| improve enterprise-to-cloud DC and cloud DC-to-cloud DC | are also discussed in this document. | |||
| connectivity. | ||||
| 2. Definition of terms | 2. Definition of terms | |||
| Cloud DC: Third party Data Centers that usually host applications | Cloud DC: Third party Data Centers that usually host applications | |||
| and workload owned by different organizations or | and workload owned by different organizations or | |||
| tenants. | tenants. | |||
| Controller: Used interchangeably with SD-WAN controller to manage | Controller: Used interchangeably with SD-WAN controller to manage | |||
| SD-WAN overlay path creation/deletion and monitoring the | SD-WAN overlay path creation/deletion and monitoring the | |||
| path conditions between two or more sites. | path conditions between two or more sites. | |||
| skipping to change at page 5, line 41 | skipping to change at page 6, line 5 | |||
| (depending on user provided policies). | (depending on user provided policies). | |||
| VPC: Virtual Private Cloud is a virtual network dedicated to | VPC: Virtual Private Cloud is a virtual network dedicated to | |||
| one client account. It is logically isolated from other | one client account. It is logically isolated from other | |||
| virtual networks in a Cloud DC. Each client can launch | virtual networks in a Cloud DC. Each client can launch | |||
| his/her desired resources, such as compute, storage, or | his/her desired resources, such as compute, storage, or | |||
| network functions into his/her VPC. Most Cloud | network functions into his/her VPC. Most Cloud | |||
| operators' VPCs only support private addresses, some | operators' VPCs only support private addresses, some | |||
| support IPv4 only, others support IPv4/IPv6 dual stack. | support IPv4 only, others support IPv4/IPv6 dual stack. | |||
| 3. Interconnecting Enterprise Sites with Cloud DCs | 3. High Level Issues of Connecting to Multi-Cloud | |||
| 3.1. Multiple connections to workloads in a Cloud DC | ||||
| There are many problems associated with connecting to hybrid Cloud | ||||
| Services, many of which are out of the IETF scope. This section | ||||
| identifies some of the high-level problems that can be addressed by | ||||
| the IETF, especially by the Routing area. Other problems are out of | ||||
| the scope of this document. This section by no means covers all | ||||
| problems for connecting to Hybrid Cloud Services, e.g. difficulty in | ||||
| managing cloud spending is not discussed here. | ||||
| 3.1. Security Issues | ||||
| Cloud Services are built upon shared infrastructure and are therefore | ||||
| not secure by nature. Security has been a primary, and valid, concern | ||||
| from the start of cloud computing: you are unable to see the exact | ||||
| location where your data is stored or being processed. Headlines | ||||
| highlighting data breaches, compromised credentials, broken | ||||
| authentication, hacked interfaces and APIs, and account hijacking | ||||
| haven't helped alleviate concerns. | ||||
| Secure user identity management, authentication, and access control | ||||
| mechanisms are important. Developing appropriate security | ||||
| measures can enhance the confidence needed by enterprises to | ||||
| fully take advantage of Cloud Services. | ||||
| 3.2. Authorization and Identity Management | ||||
| One of the more prominent challenges for Cloud Services is Identity | ||||
| Management and Authorization. The Authorization not only includes | ||||
| user authorization, but also the authorization of API calls by | ||||
| applications from different Cloud DCs managed by different Cloud | ||||
| Operators. In addition, there is authorization for Workload | ||||
| Migration, Data Migration, and Workload Management. | ||||
| There are many types of users in cloud environments, e.g. end users | ||||
| who access applications hosted in Cloud DCs, and Cloud-resource users | ||||
| who are responsible for setting permissions for the resources based | ||||
| on roles, access lists, IP addresses, domains, etc. | ||||
| There are many types of Cloud authorization, including MAC | ||||
| (Mandatory Access Control) - where each app owns individual access | ||||
| permissions, DAC (Discretionary Access Control) - where each app | ||||
| requests permissions from an external permissions app, RBAC (Role- | ||||
| based Access Control) - where the authorization service owns roles | ||||
| with different privileges on the cloud service, and ABAC (Attribute- | ||||
| based Access Control) - where access is based on request attributes | ||||
| and policies. | ||||
| The IETF hasn't yet developed a comprehensive specification for | ||||
| Identity management and data models for Cloud Authorizations. | ||||
| 3.3. API abstraction | ||||
| Different Cloud Operators have different APIs to access their Cloud | ||||
| resources, security functions, the NAT, etc. | ||||
| It is difficult to move applications built with one Cloud operator's | ||||
| APIs to another operator's Cloud. However, it is highly desirable to | ||||
| have a single and consistent way to manage the networks and | ||||
| respective security policies for interconnecting applications hosted | ||||
| in different Cloud DCs. | ||||
| The desired property would be having a single network fabric to | ||||
| which different Cloud DCs and enterprise's multiple sites can be | ||||
| attached or detached, with a common interface for setting desired | ||||
| policies. | ||||
| The difficulty of connecting applications in different Clouds might | ||||
| stem from the fact that the Cloud operators are direct competitors. | ||||
| Usually, traffic flowing out of Cloud DCs incurs charges. Therefore, | ||||
| direct communications between applications in different Cloud DCs | ||||
| can be more expensive than intra-Cloud communications. | ||||
| It is desirable to have a common API shim layer or abstraction for | ||||
| different Cloud providers to make it easier to move applications | ||||
| from one Cloud DC to another. | ||||
| 3.4. DNS for Cloud Resources | ||||
| DNS name resolution is essential for on-premises and cloud-based | ||||
| resources. For customers with hybrid workloads, which include on- | ||||
| premises and cloud-based resources, extra steps are necessary to | ||||
| configure DNS to work seamlessly across both environments. | ||||
| Cloud operators have their own DNS to resolve resources within their | ||||
| Cloud DCs and to well-known public domains. Cloud's DNS can be | ||||
| configured to forward queries to customer managed authoritative DNS | ||||
| servers hosted on-premises, and to respond to DNS queries forwarded | ||||
| by on-premises DNS servers. | ||||
| For enterprises utilizing Cloud services from different cloud | ||||
| operators, it is necessary to establish policies and rules on | ||||
| how and where to forward DNS queries. When applications in one Cloud | ||||
| need to communicate with applications hosted in another Cloud, | ||||
| there could be DNS queries from one Cloud DC being forwarded to the | ||||
| enterprise's on-premises DNS, which in turn are forwarded to the DNS | ||||
| service in another Cloud. Needless to say, configuration can be | ||||
| complex depending on the application communication patterns. | ||||
| 3.5. NAT for Cloud Services | ||||
| Cloud resources, such as VM instances, are usually assigned | ||||
| private IP addresses. By configuration, some private subnets can | ||||
| have the NAT function to reach external networks and some | ||||
| private subnets are internal to the Cloud only. | ||||
| Different Cloud operators support different levels of NAT functions. | ||||
| For example, AWS NAT Gateway does not currently support connections | ||||
| towards or from VPC Endpoints, VPN, AWS Direct Connect, or VPC | ||||
| Peering: | ||||
| https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-nat-gateway.html#nat-gateway-other-services | ||||
| AWS Direct Connect/VPN/VPC Peering does not currently support any | ||||
| NAT functionality. | ||||
| Google's Cloud NAT allows Google Cloud virtual machine (VM) | ||||
| instances without external IP addresses and private Google | ||||
| Kubernetes Engine (GKE) clusters to connect to the Internet. Cloud | ||||
| NAT implements outbound NAT in conjunction with a default route to | ||||
| allow instances to reach the Internet. It does not implement inbound | ||||
| NAT. Hosts outside of the VPC network can only respond to established | ||||
| connections initiated by instances inside the Google Cloud; they | ||||
| cannot initiate their own, new connections to Cloud instances via | ||||
| NAT. | ||||
| For enterprises with applications running in different Cloud DCs, | ||||
| NAT has to be properly configured in each Cloud DC and in their | ||||
| own on-premises DC. | ||||
| 3.6. Cloud Discovery | ||||
| One of the concerns of using Cloud services is not being aware of | ||||
| where the resource is actually located, especially since Cloud | ||||
| operators can move application instances from one place to another. | ||||
| When applications in the Cloud communicate with on-premises | ||||
| applications, it may not be clear where the Cloud applications are | ||||
| located or to which VPCs they belong. | ||||
| It is highly desirable to have tools to discover cloud services in | ||||
| much the same way as you would discover your on-premises | ||||
| infrastructure. A significant difference is that cloud discovery | ||||
| uses the cloud vendor's API to extract data on your cloud services, | ||||
| rather than the direct access used in scanning your on-premises | ||||
| infrastructure. | ||||
| Standard data models, APIs or tools can alleviate concerns of | ||||
| enterprises utilizing Cloud Resources, e.g. having a Cloud service | ||||
| scan that connects to the API of the cloud provider and collects | ||||
| information directly. | ||||
| 4. Interconnecting Enterprise Sites with Cloud DCs | ||||
| Considering that many enterprises already have existing VPNs (e.g. | ||||
| MPLS based L2VPN or L3VPN) interconnecting branch offices & on- | ||||
| premises data centers, connecting to Cloud services will involve a | ||||
| mix of different types of networks. When an enterprise's existing VPN | ||||
| service providers do not have direct connections to the | ||||
| corresponding cloud DCs that the enterprise prefers to use, the | ||||
| enterprise has to face additional infrastructure and operational | ||||
| costs to utilize Cloud services. | ||||
| 4.1. Sites to Cloud DC | ||||
| Most Cloud operators offer some type of network gateway through | Most Cloud operators offer some type of network gateway through | |||
| which an enterprise can reach their workloads hosted in the Cloud | which an enterprise can reach their workloads hosted in the Cloud | |||
| DCs. For example, AWS (Amazon Web Services) offers the following | DCs. AWS (Amazon Web Services) offers the following options to reach | |||
| options to reach workloads in AWS Cloud DCs: | workloads in AWS Cloud DCs: | |||
| - AWS Internet gateway allows communication between instances in | - AWS Internet gateway allows communication between instances in | |||
| AWS VPC and the internet. | AWS VPC and the internet. | |||
| - AWS Virtual gateway (vGW) where IPsec tunnels [RFC6071] are | - AWS Virtual gateway (vGW) where IPsec tunnels [RFC6071] are | |||
| established between an enterprise's own gateway and AWS vGW, so | established between an enterprise's own gateway and AWS vGW, so | |||
| that the communications between those gateways can be secured | that the communications between those gateways can be secured | |||
| from the underlay (which might be the public Internet). | from the underlay (which might be the public Internet). | |||
| - AWS Direct Connect, which allows enterprises to purchase direct | - AWS Direct Connect, which allows enterprises to purchase direct | |||
| connect from network service providers to get a private leased | connect from network service providers to get a private leased | |||
| line interconnecting the enterprise's gateway(s) and the AWS | line interconnecting the enterprise's gateway(s) and the AWS | |||
| Direct Connect routers. In addition, an AWS Transit Gateway can | Direct Connect routers. In addition, an AWS Transit Gateway can | |||
| be used to interconnect multiple VPCs in different Availability | be used to interconnect multiple VPCs in different Availability | |||
| Zones. AWS Transit Gateway acts as a hub that controls how | Zones. AWS Transit Gateway acts as a hub that controls how | |||
| traffic is forwarded among all the connected networks which act | traffic is forwarded among all the connected networks which act | |||
| like spokes. | like spokes. | |||
| As an example, some branch offices of an enterprise can connect | Microsoft's ExpressRoute allows extension of a private network to | |||
| over the Internet to reach AWS's vGW via IPsec tunnels. Other branch | any of the Microsoft cloud services, including Azure and Office365. | |||
| offices of the same enterprise can connect to AWS DirectConnect via | ExpressRoute is configured using Layer 3 routing. Customers can opt | |||
| a private network (without any encryption). It is important for | for redundancy by provisioning dual links from their location to two | |||
| enterprises to be able to observe the specific behaviors when | Microsoft Enterprise edge routers (MSEEs) located within a third- | |||
| connected by different connections. | party ExpressRoute peering location. The BGP routing protocol is | |||
| then set up over WAN links to provide redundancy to the cloud. This | ||||
| redundancy is maintained from the peering data center into | ||||
| Microsoft's cloud network. | ||||
| Figure below shows an example of some tenants' workloads are | Google's Cloud Dedicated Interconnect offers similar network | |||
| connectivity options as AWS and Microsoft. One distinct difference, | ||||
| however, is that Google's service allows customers access to the | ||||
| entire global cloud network by default. It does this by connecting | ||||
| your on-premises network with the Google Cloud using BGP and Google | ||||
| Cloud Routers to provide optimal paths to the different regions of | ||||
| the global cloud infrastructure. | ||||
| The figure below shows an example in which some of a tenant's workloads are | ||||
| accessible via a virtual router connected by AWS Internet Gateway; | accessible via a virtual router connected by AWS Internet Gateway; | |||
| some are accessible via AWS vGW, and others are accessible via AWS | some are accessible via AWS vGW, and others are accessible via AWS | |||
| Direct Connect. vR1 uses IPsec to establish secure tunnels over the | Direct Connect. | |||
| Internet to avoid paying extra fees for the IPsec features provided | ||||
| by AWS vGW. Some tenants can deploy separate virtual routers to | ||||
| connect to internet traffic and to traffic from the secure channels | ||||
| from vGW and DirectConnect, e.g. vR1 & vR2. Others may have one | ||||
| virtual router connecting to both types of traffic. Customer Gateway | ||||
| can be customer owned router or ports physically connected to AWS | ||||
| Direct Connect GW. | ||||
| Different types of access require different levels of security | ||||
| functions. Sometimes it is not visible to end customers which type | ||||
| of network access is used for a specific application instance. To | ||||
| get better visibility, separate virtual routers (e.g. vR1 & vR2) can | ||||
| be deployed to differentiate traffic to/from different cloud GWs. It | ||||
| is important for some enterprises to be able to observe the specific | ||||
| behaviors when connected by different connections. | ||||
| The Customer Gateway can be a customer-owned router or ports | ||||
| physically connected to the AWS Direct Connect GW. | ||||
| +------------------------+ | +------------------------+ | |||
| | ,---. ,---. | | | ,---. ,---. | | |||
| | (TN-1 ) ( TN-2)| | | (TN-1 ) ( TN-2)| | |||
| | `-+-' +---+ `-+-' | | | `-+-' +---+ `-+-' | | |||
| | +----|vR1|----+ | | | +----|vR1|----+ | | |||
| | ++--+ | | | ++--+ | | |||
| | | +-+----+ | | | +-+----+ | |||
| | | /Internet\ For External | | | /Internet\ For External | |||
| | +-------+ Gateway +---------------------- | | +-------+ Gateway +---------------------- | |||
| | \ / to reach via Internet | | \ / to reach via Internet | |||
| skipping to change at page 7, line 38 | skipping to change at page 12, line 5 | |||
| | | +-+----+ +------+ | | | +-+----+ +------+ | |||
| | | / \ For Direct /customer\ | | | / \ For Direct /customer\ | |||
| | +-------+ Gateway +----------+ gateway | | | +-------+ Gateway +----------+ gateway | | |||
| | \ / Connect \ / | | \ / Connect \ / | |||
| | +-+----+ +------+ | | +-+----+ +------+ | |||
| | | | | | | |||
| +------------------------+ | +------------------------+ | |||
| Figure 1: Examples of Multiple Cloud DC connections. | Figure 1: Examples of Multiple Cloud DC connections. | |||
| 3.2. Interconnect Private and Public Cloud DCs | 4.2. Inter-Cloud Interconnection | |||
| It is likely that hybrid designs will become the rule for cloud | ||||
| services, as more enterprises see the benefits of integrating public | ||||
| and private cloud infrastructures. However, enabling the growth of | ||||
| hybrid cloud deployments in the enterprise requires fast and safe | ||||
| interconnection between public and private cloud services. | ||||
| For an enterprise to connect to applications & workloads hosted in | ||||
| multiple Cloud DCs, the enterprise can use IPsec tunnels established | ||||
| over the Internet or a (virtualized) leased line service to connect | ||||
| its on-premises gateways to each of the Cloud DC's gateways, virtual | ||||
| routers instantiated in the Cloud DCs, or any other suitable design | ||||
| (including a combination thereof). | ||||
| Some enterprises prefer to instantiate their own virtual | ||||
| CPEs/routers inside the Cloud DC to connect the workloads within the | ||||
| Cloud DC. Then an overlay path is established between customer | ||||
| gateways to the virtual CPEs/routers for reaching the workloads | ||||
| inside the cloud DC. | ||||
| 3.3. Desired Properties for Networks that interconnect Hybrid Clouds | ||||
| The networks that interconnect hybrid cloud DCs must address the | ||||
| following requirements: | ||||
| - High availability to access all workloads in the desired cloud | ||||
| DCs. | ||||
| Many enterprises include cloud infrastructures in their | ||||
| disaster recovery strategy, e.g., by enforcing periodic backup | ||||
| policies within the cloud, or by running backup applications in | ||||
| the Cloud, etc. Therefore, the connection to the cloud DCs may | ||||
| not be permanent, but rather needs to be on-demand. | ||||
| - Global reachability from different geographical zones, thereby | ||||
| facilitating the proximity of applications as a function of the | ||||
| end users' location, to improve latency. | ||||
| - Elasticity: prompt connection to newly instantiated | ||||
| applications at Cloud DCs when usage increases, and prompt | ||||
| release of connections after applications at some locations are | ||||
| removed when demand changes. | ||||
| Some enterprises have front-end web portals running in cloud | ||||
| DCs and database servers in their on-premises DCs. Those Front- | ||||
| end web portals need to be reachable from the public Internet. | ||||
| The backend connection to the sensitive data in database | ||||
| servers hosted in the on-premises DCs might need secure | ||||
| connections. | ||||
| - Scalable security management. IPsec is commonly used to | ||||
| interconnect cloud gateways with CPEs deployed in the | ||||
| enterprise premises. For enterprises with a large number of | ||||
| branch offices, managing the IPsec's Security Associations | ||||
| among many nodes can be very difficult. | ||||
| 4. Multiple Clouds Interconnection | ||||
| 4.1. Multi-Cloud Interconnection | ||||
| Enterprises today can instantiate their workloads or applications in | ||||
| Cloud DCs owned by different Cloud providers, e.g. AWS, Azure, | ||||
| GoogleCloud, Oracle, etc. Interconnecting those workloads involves | ||||
| three parties: The Enterprise, its network service providers, and | ||||
| the Cloud providers. | ||||
| All Cloud Operators offer secure ways to connect enterprises' on- | ||||
| prem sites/DCs with their Cloud DCs. | ||||
| Some Cloud Operators allow enterprises to connect via private | ||||
| networks. For example, AWS's DirectConnect allows enterprises to use 3rd party provided private Layer 2 path from enterprises' GW to AWS | ||||
| DirectConnect GW. Microsoft's ExpressRoute allows extension of a | ||||
| private network to any of the Microsoft cloud services, including | ||||
| Azure and Office365. ExpressRoute is configured using Layer 3 | ||||
| routing. Customers can opt for redundancy by provisioning dual links | ||||
| from their location to two Microsoft Enterprise edge routers (MSEEs) | ||||
| located within a third-party ExpressRoute peering location. The BGP | ||||
| routing protocol is then set up over WAN links to provide redundancy | ||||
| to the cloud. This redundancy is maintained from the peering data | ||||
| center into Microsoft's cloud network. | ||||
| Google's Cloud Dedicated Interconnect offers similar network | ||||
| connectivity options as AWS and Microsoft. One distinct difference, | ||||
| however, is that Google's service allows customers access to the | ||||
| entire global cloud network by default. It does this by connecting | ||||
| your on-premises network with the Google Cloud using BGP and Google | ||||
| Cloud Routers to provide optimal paths to the different regions of | ||||
| the global cloud infrastructure. | ||||
| All those connectivity options are between Cloud providers' DCs and | The connectivity options to Cloud DCs described in the previous | |||
| the Enterprises, but not between cloud DCs. For example, to connect | section are for reaching Cloud providers' DCs, but not between cloud | |||
| applications in AWS Cloud to applications in Azure Cloud, there must | DCs. When applications in AWS Cloud need to communicate with | |||
| be a third-party gateway (physical or virtual) to interconnect the | applications in Azure, today's practice requires a third-party | |||
| AWS's Layer 2 DirectConnect path with Azure's Layer 3 ExpressRoute. | gateway (physical or virtual) to interconnect the AWS's Layer 2 | |||
| DirectConnect path with Azure's Layer 3 ExpressRoute. | ||||
| Enterprises can also instantiate their own virtual routers in | Enterprises can also instantiate their own virtual routers in | |||
| different Cloud DCs and administer IPsec tunnels among them, which | different Cloud DCs and administer IPsec tunnels among them, which | |||
| by itself is not a trivial task. Or by leveraging open source VPN | by itself is not a trivial task. Or by leveraging open source VPN | |||
| software such as strongSwan, you create an IPSec connection to the | software such as strongSwan, you create an IPSec connection to the | |||
| Azure gateway using a shared key. The strong swan instance within | Azure gateway using a shared key. The StrongSwan instance within AWS | |||
| AWS not only can connect to Azure but can also be used to facilitate | not only can connect to Azure but can also be used to facilitate | |||
| traffic to other nodes within the AWS VPC by configuring forwarding | traffic to other nodes within the AWS VPC by configuring forwarding | |||
| and using appropriate routing rules for the VPC. Most Cloud | and using appropriate routing rules for the VPC. | |||
| operators, such as AWS VPC or Azure VNET, use non-globally routable | ||||
| CIDR from private IPv4 address ranges as specified by RFC1918. To | Most Cloud operators, such as AWS VPC or Azure VNET, use non- | |||
| establish IPsec tunnel between two Cloud DCs, it is necessary to | globally routable CIDR from private IPv4 address ranges as specified | |||
| exchange Public routable addresses for applications in different | by RFC1918. To establish IPsec tunnel between two Cloud DCs, it is | |||
| Cloud DCs. [BGP-SDWAN] describes one method. Other methods are worth | necessary to exchange Public routable addresses for applications in | |||
| exploring. | different Cloud DCs. [BGP-SDWAN] describes one method. Other methods | |||
| are worth exploring. | ||||
| In summary, here are some approaches, available now (which might | In summary, here are some approaches, available now (which might | |||
| change in the future), to interconnect workloads among different | change in the future), to interconnect workloads among different | |||
| Cloud DCs: | Cloud DCs: | |||
| a) Utilize Cloud DC provided inter/intra-cloud connectivity | a) Utilize Cloud DC provided inter/intra-cloud connectivity | |||
| services (e.g., AWS Transit Gateway) to connect workloads | services (e.g., AWS Transit Gateway) to connect workloads | |||
| instantiated in multiple VPCs. Such services are provided with | instantiated in multiple VPCs. Such services are provided with | |||
| the cloud gateway to connect to external networks (e.g., AWS | the cloud gateway to connect to external networks (e.g., AWS | |||
| DirectConnect Gateway). | DirectConnect Gateway). | |||
| b) Hairpin all traffic through the customer gateway, meaning all | b) Hairpin all traffic through the customer gateway, meaning all | |||
| workloads are directly connected to the customer gateway, so | workloads are directly connected to the customer gateway, so | |||
| that communications among workloads within one Cloud DC must | that communications among workloads within one Cloud DC must | |||
| traverse through the customer gateway. | traverse through the customer gateway. | |||
| c) Establish direct tunnels among different VPCs (AWS' Virtual | c) Establish direct tunnels among different VPCs (AWS' Virtual | |||
| Private Clouds) and VNET (Azure's Virtual Networks) via | Private Clouds) and VNET (Azure's Virtual Networks) via | |||
| client's own virtual routers instantiated within Cloud DCs. | client's own virtual routers instantiated within Cloud DCs. | |||
| DMVPN (Dynamic Multipoint Virtual Private Network) or DSVPN | DMVPN (Dynamic Multipoint Virtual Private Network) or DSVPN | |||
| (Dynamic Smart VPN) techniques can be used to establish direct | (Dynamic Smart VPN) techniques can be used to establish direct | |||
| multi-point-to-point or multi-point-to-multi-point tunnels | multi-point-to-point or multi-point-to-multi-point tunnels | |||
| among those client's own virtual routers. | among those client's own virtual routers. | |||
| Approach a) usually does not work if Cloud DCs are owned and managed | Approach a) usually does not work if Cloud DCs are owned and managed | |||
| by different Cloud providers. | by different Cloud providers. | |||
| skipping to change at page 11, line 12 ¶ | skipping to change at page 13, line 30 ¶ | |||
| There are many differences between virtual routers in Public Cloud | There are many differences between virtual routers in Public Cloud | |||
| DCs and the nodes in an NBMA network. NHRP cannot be used for | DCs and the nodes in an NBMA network. NHRP cannot be used for | |||
| registering virtual routers in Cloud DCs unless an extension of such | registering virtual routers in Cloud DCs unless an extension of such | |||
| protocols is developed for that purpose, e.g. taking NAT or dynamic | protocols is developed for that purpose, e.g. taking NAT or dynamic | |||
| addresses into consideration. Therefore, DMVPN and/or DSVPN cannot | addresses into consideration. Therefore, DMVPN and/or DSVPN cannot | |||
| be used directly for connecting workloads in hybrid Cloud DCs. | be used directly for connecting workloads in hybrid Cloud DCs. | |||
| Other protocols such as BGP can be used, as described in [BGP- | Other protocols such as BGP can be used, as described in [BGP- | |||
| SDWAN]. | SDWAN]. | |||
| 4.2. Desired Properties for Multi-Cloud Interconnection | ||||
| Different Cloud Operators have different APIs to access their Cloud | ||||
| resources. It is difficult to move applications built by one Cloud | ||||
| operator's APIs to another. However, it is highly desirable to have | ||||
| a single and consistent way to manage the networks and respective | ||||
| security policies for interconnecting applications hosted in | ||||
| different Cloud DCs. | ||||
| The desired property would be having a single network fabric to | ||||
| which different Cloud DCs and enterprise's multiple sites can be | ||||
| attached or detached, with a common interface for setting desired | ||||
| policies. SDWAN is positioned to become that network fabric enabling | ||||
| Cloud DCs to be dynamically attached or detached. But the reality is | ||||
| that different Cloud Operators have different access methods, and | ||||
| Cloud DCs might be geographically far apart. More Cloud connectivity | ||||
| problems are described in the subsequent sections. | ||||
| The difficulty of connecting applications in different Clouds might | ||||
| stem from the fact that they are direct competitors. Usually | ||||
| traffic flowing out of Cloud DCs incurs charges. Therefore, direct | ||||
| communications between applications in different Cloud DCs can be | ||||
| more expensive than intra Cloud communications. | ||||
| 5. Problems with MPLS-based VPNs extending to Hybrid Cloud DCs | 5. Problems with MPLS-based VPNs extending to Hybrid Cloud DCs | |||
| Traditional MPLS-based VPNs have been widely deployed as an | Traditional MPLS-based VPNs have been widely deployed as an | |||
| effective way to support businesses and organizations that require | effective way to support businesses and organizations that require | |||
| network performance and reliability. MPLS shifted the burden of | network performance and reliability. MPLS shifted the burden of | |||
| managing a VPN service from enterprises to service providers. The | managing a VPN service from enterprises to service providers. The | |||
| CPEs attached to MPLS VPNs are also simpler and less expensive, | CPEs attached to MPLS VPNs are also simpler and less expensive, | |||
| since they do not need to manage routes to remote sites; they simply | because they do not need to manage routes to remote sites; they | |||
| pass all outbound traffic to the MPLS VPN PEs to which the CPEs are | simply pass all outbound traffic to the MPLS VPN PEs to which the | |||
| attached (albeit multi-homing scenarios require more processing | CPEs are attached (albeit multi-homing scenarios require more | |||
| logic on CPEs). MPLS has addressed the problems of scale, | processing logic on CPEs). MPLS has addressed the problems of | |||
| availability, and fast recovery from network faults, and | scale, availability, and fast recovery from network faults, and | |||
| incorporated traffic-engineering capabilities. | incorporated traffic-engineering capabilities. | |||
| However, traditional MPLS-based VPN solutions are sub-optimized for | However, traditional MPLS-based VPN solutions are sub-optimized for | |||
| connecting end-users to dynamic workloads/applications in cloud DCs | connecting end-users to dynamic workloads/applications in cloud DCs | |||
| because: | because: | |||
| - The Provider Edge (PE) nodes of the enterprise's VPNs might not | - The Provider Edge (PE) nodes of the enterprise's VPNs might not | |||
| have direct connections to third party cloud DCs that are used | have direct connections to third party cloud DCs that are used | |||
| for hosting workloads with the goal of providing an easy access | for hosting workloads with the goal of providing an easy access | |||
| to enterprises' end-users. | to enterprises' end-users. | |||
| - It usually takes some time to deploy provider edge (PE) routers | - It takes some time to deploy provider edge (PE) routers at new | |||
| at new locations. When enterprise's workloads are changed from | locations. When enterprise's workloads are changed from one | |||
| one cloud DC to another (i.e., removed from one DC and re- | cloud DC to another (i.e., removed from one DC and re- | |||
| instantiated to another location when demand changes), the | instantiated to another location when demand changes), the | |||
| enterprise branch offices need to be connected to the new cloud | enterprise branch offices need to be connected to the new cloud | |||
| DC, but the network service provider might not have PEs located | DC, but the network service provider might not have PEs located | |||
| at the new location. | at the new location. | |||
| One of the main drivers for moving workloads into the cloud is | One of the main drivers for moving workloads into the cloud is | |||
| the widely available cloud DCs at geographically diverse | the widely available cloud DCs at geographically diverse | |||
| locations, where apps can be instantiated so that they can be | locations, where apps can be instantiated so that they can be | |||
| as close to their end-users as possible. When the user base | as close to their end-users as possible. When the user base | |||
| changes, the applications may be migrated to a new cloud DC | changes, the applications may be migrated to a new cloud DC | |||
| skipping to change at page 12, line 43 ¶ | skipping to change at page 14, line 41 ¶ | |||
| to connect to a Cloud provider at multiple locations. The | to connect to a Cloud provider at multiple locations. The | |||
| connection locations often correspond to gateways of different | connection locations often correspond to gateways of different | |||
| Cloud DC locations from the Cloud provider. The different | Cloud DC locations from the Cloud provider. The different | |||
| Cloud DCs are interconnected by the Cloud provider's own | Cloud DCs are interconnected by the Cloud provider's own | |||
| internal network. At each connection location (gateway), the | internal network. At each connection location (gateway), the | |||
| Cloud provider uses BGP to advertise all of the prefixes in the | Cloud provider uses BGP to advertise all of the prefixes in the | |||
| enterprise's VPC, regardless of which Cloud DC a given prefix | enterprise's VPC, regardless of which Cloud DC a given prefix | |||
| is actually in. This can result in inefficient routing for the | is actually in. This can result in inefficient routing for the | |||
| end-to-end data path. | end-to-end data path. | |||
| - Extensive usage of Overlay by Cloud DCs: | ||||
| Many cloud DCs use an overlay to connect their gateways to the | ||||
| workloads located inside the DC. There is currently no standard | ||||
| that specifies the interworking between the Cloud Overlay and | ||||
| the enterprise's existing underlay networks. One of the | ||||
| characteristics of overlay networks is that some of the WAN | ||||
| ports of the edge nodes connect to third party networks. There | ||||
| is therefore a need to propagate WAN port information to remote | ||||
| authorized peers in third party network domains in addition to | ||||
| route propagation. Such an exchange cannot happen before | ||||
| communication between peers is properly secured. | ||||
| Another roadblock is the lack of a standard way to express and | Another roadblock is the lack of a standard way to express and | |||
| enforce consistent security policies for workloads that not only use | enforce consistent security policies for workloads that not only use | |||
| virtual addresses, but which are also very likely hosted in | virtual addresses, but which are also very likely hosted in | |||
| different locations within the Cloud DC [RFC8192]. The current VPN | different locations within the Cloud DC [RFC8192]. The current VPN | |||
| path computation and bandwidth allocation schemes may not be | path computation and bandwidth allocation schemes may not be | |||
| flexible enough to address the need for enterprises to rapidly | flexible enough to address the need for enterprises to rapidly | |||
| connect to dynamically instantiated (or removed) workloads and | connect to dynamically instantiated (or removed) workloads and | |||
| applications regardless of their location/nature (i.e., third party | applications regardless of their location/nature (i.e., third party | |||
| cloud DCs). | cloud DCs). | |||
| 6. Problem with using IPsec tunnels to Cloud DCs | 6. Problem with using IPsec tunnels to Cloud DCs | |||
| As described in the previous section, many Cloud operators expose | As described in the previous section, many Cloud operators expose | |||
| their gateways for external entities (which can be enterprises | their gateways for external entities (which can be enterprises | |||
| themselves) to directly establish IPsec tunnels. Enterprises can | themselves) to directly establish IPsec tunnels. Enterprises can | |||
| also instantiate virtual routers within Cloud DCs to connect to | also instantiate virtual routers within Cloud DCs to connect to | |||
| their on-premises devices via IPsec tunnels. If there is only one | their on-premises devices via IPsec tunnels. | |||
| enterprise location that needs to reach the Cloud DC, an IPsec | ||||
| tunnel is a very convenient solution. | ||||
| However, many medium-to-large enterprises usually have multiple | ||||
| sites and multiple data centers. For workloads and apps hosted in | ||||
| cloud DCs, multiple sites need to communicate securely with those | ||||
| cloud workloads and apps. This section documents some of the issues | ||||
| associated with using IPsec tunnels to connect enterprise premises | ||||
| with cloud gateways. | ||||
| 6.1. Complexity of multi-point any-to-any interconnection | ||||
| The dynamic workload instantiated in cloud DC needs to communicate | 6.1. Scaling Issues with IPsec Tunnels | |||
| with multiple branch offices and on-premises data centers. Most | ||||
| enterprises need multi-point interconnection among multiple | ||||
| locations, which can be provided by means of MPLS L2/L3 VPNs. | ||||
| Using IPsec overlay paths to connect all branches & on-premises data | If there is only one enterprise location that needs to reach the | |||
| centers to cloud DCs requires CPEs to manage routing among Cloud DCs | Cloud DC, an IPsec tunnel is a very convenient solution. | |||
| gateways and the CPEs located at other branch locations, which can | ||||
| dramatically increase the complexity of the design, possibly at the | ||||
| cost of jeopardizing the CPE performance. | ||||
| The complexity of requiring CPEs to maintain routing among other | However, many medium-to-large enterprises have multiple sites and | |||
| CPEs is one of the reasons why enterprises migrated from Frame Relay | multiple data centers. For multiple sites to communicate with | |||
| based services to MPLS-based VPN services. | workloads and apps hosted in cloud DCs, Cloud DC gateways have to | |||
| maintain many IPsec tunnels to all those locations. In addition, | ||||
| each of those IPsec Tunnels requires pair-wise periodic key | ||||
| refreshment. For a company with hundreds or thousands of locations, | ||||
| there could be hundreds (or even thousands) of IPsec tunnels | ||||
| terminating at the cloud DC gateway, which is very processing | ||||
| intensive. That is why many cloud operators only allow a limited | ||||
| number of (IPsec) tunnels & bandwidth to each customer. | ||||
| MPLS-based VPNs have their PEs directly connected to the CPEs. | Alternatively, you could use a solution like group encryption where | |||
| Therefore, CPEs only need to forward all traffic to the directly | a single IPsec SA is necessary at the GW but the drawback is key | |||
| attached PEs, which are therefore responsible for enforcing the | distribution and maintenance of a key server, etc. | |||
| routing policy within the corresponding VPNs. Even for multi-homed | ||||
| CPEs, the CPEs only need to forward traffic among the directly | ||||
| connected PEs. However, when using IPsec tunnels between CPEs and | ||||
| Cloud DCs, the CPEs need to compute, select, establish and maintain | ||||
| routes for traffic to be forwarded to Cloud DCs, to remote CPEs via | ||||
| VPN, or directly. | ||||
| 6.2. Poor performance over long distance | 6.2. Poor performance over long distance | |||
| When enterprise CPEs or gateways are far away from cloud DC gateways | When enterprise CPEs or gateways are far away from cloud DC gateways | |||
| or across country/continent boundaries, performance of IPsec tunnels | or across country/continent boundaries, performance of IPsec tunnels | |||
| over the public Internet can be problematic and unpredictable. Even | over the public Internet can be problematic and unpredictable. Even | |||
| though there are many monitoring tools available to measure delay | though there are many monitoring tools available to measure delay | |||
| and various performance characteristics of the network, the | and various performance characteristics of the network, the | |||
| measurement for paths over the Internet is passive and past | measurement for paths over the Internet is passive and past | |||
| measurements may not represent future performance. | measurements may not represent future performance. | |||
| Many cloud providers can replicate workloads in different available | Many cloud providers can replicate workloads in different available | |||
| zones. An App instantiated in a cloud DC closest to clients may have | zones. An App instantiated in a cloud DC closest to clients may have | |||
| to cooperate with another App (or its mirror image) in another | to cooperate with another App (or its mirror image) in another | |||
| region or database server(s) in the on-premises DC. This kind of | region or database server(s) in the on-premises DC. This kind of | |||
| coordination requires predictable networking behavior/performance | coordination requires predictable networking behavior/performance | |||
| among those locations. | among those locations. | |||
| 6.3. Scaling Issues with IPsec Tunnels | ||||
| IPsec can achieve secure overlay connections between two locations | ||||
| over any underlay network, e.g., between CPEs and Cloud DC Gateways. | ||||
| If there is only one enterprise location connected to the cloud | ||||
| gateway, a small number of IPsec tunnels can be configured on-demand | ||||
| between the on-premises DC and the Cloud DC, which is an easy and | ||||
| flexible solution. | ||||
| However, for multiple enterprise locations to reach workloads hosted | ||||
| in cloud DCs, the cloud DC gateway needs to maintain multiple IPsec | ||||
| tunnels to all those locations (e.g., as a hub & spoke topology). | ||||
| For a company with hundreds or thousands of locations, there could | ||||
| be hundreds (or even thousands) of IPsec tunnels terminating at the | ||||
| cloud DC gateway, which is not only very expensive (because Cloud | ||||
| Operators usually charge their customers based on connections), but | ||||
| can be very processing intensive for the gateway. Many cloud | ||||
| operators only allow a limited number of (IPsec) tunnels & bandwidth | ||||
| to each customer. Alternatively, you could use a solution like | ||||
| group encryption where a single IPsec SA is necessary at the GW but | ||||
| the drawback here is key distribution and maintenance of a key | ||||
| server, etc. | ||||
| 7. Problems of Using SD-WAN to connect to Cloud DCs | 7. Problems of Using SD-WAN to connect to Cloud DCs | |||
| SD-WAN can establish parallel paths over multiple underlay networks | ||||
| between two locations on-demand, for example, to support the | ||||
| connections established between two CPEs interconnected by a | ||||
| traditional MPLS VPN ([RFC4364] or [RFC4664]) or by IPsec [RFC6071] | ||||
| tunnels. | ||||
| SD-WAN lets enterprises augment their current VPN network with cost- | SD-WAN lets enterprises augment their current VPN network with cost- | |||
| effective, readily available Broadband Internet connectivity, | effective, readily available Broadband Internet connectivity, | |||
| enabling some traffic offloading to paths over the Internet | enabling some traffic offloading to paths over the Internet | |||
| according to differentiated, possibly application-based traffic | according to differentiated, possibly application-based traffic | |||
| forwarding policies, or when the MPLS VPN connection between the two | forwarding policies, or when the MPLS VPN connection between the two | |||
| locations is congested, or otherwise undesirable or unavailable. | locations is congested, or otherwise undesirable or unavailable. | |||
| 7.1. SD-WAN among branch offices vs. interconnect to Cloud DCs | 7.1. More Complexity to Edge Nodes | |||
| SD-WAN interconnection of branch offices is not as simple as it | ||||
| appears. For an enterprise with multiple sites, using SD-WAN overlay | ||||
| paths among sites requires each CPE to manage all the addresses that | ||||
| local hosts have the potential to reach, i.e., map internal VPN | ||||
| addresses to appropriate SD-WAN paths. This is similar to the | ||||
| complexity of Frame Relay based VPNs, where each CPE needed to | ||||
| maintain mesh routing for all destinations if they were to avoid an | ||||
| extra hop through a hub router. Even though SD-WAN CPEs can get | ||||
| assistance from a central controller (instead of running a routing | ||||
| protocol) to resolve the mapping between destinations and SD-WAN | ||||
| paths, SD-WAN CPEs are still responsible for routing table | ||||
| maintenance as remote destinations change their attachments, e.g., | ||||
| the dynamic workload in other DCs are de-commissioned or added. | ||||
| Even though originally envisioned for interconnecting branch | Augmenting transport path is not as simple as it appears. For an | |||
| offices, SD-WAN offers a very attractive way for enterprises to | enterprise with multiple sites, CPE managed overlay paths among | |||
| connect to Cloud DCs. | sites requires each CPE to manage all the addresses that local hosts | |||
| have potential to reach, i.e., map internal VPN addresses to | ||||
| appropriate Overlay paths. This is similar to the complexity of | ||||
| Frame Relay based VPNs, where each CPE needed to maintain mesh | ||||
| routing for all destinations if they were to avoid an extra hop | ||||
| through a hub router. Even with the assistance from a central | ||||
| controller (instead of running a routing protocol) to resolve the | ||||
| mapping between destinations and SD-WAN paths, SD-WAN CPEs are still | ||||
| responsible for routing table maintenance as remote destinations | ||||
| change their attachments, e.g., the dynamic workload in other DCs | ||||
| are de-commissioned or added. | ||||
| The SD-WAN for interconnecting branch offices and the SD-WAN for | In addition, overlay paths for interconnecting branch offices are | |||
| interconnecting to Cloud DCs have some differences: | different from connecting to Cloud DCs: | |||
| - SD-WAN for interconnecting branch offices usually have two end- | - Overlay paths interconnecting branch offices usually have two | |||
| points (e.g., CPEs) controlled by one entity (e.g., a | end-points (e.g. CPEs) controlled by one entity (e.g. | |||
| controller or management system operated by the enterprise). | controllers or management systems operated by the enterprise). | |||
| - SD-WAN for Cloud DC interconnects may consider CPEs owned or | - Connecting to Cloud DC may consist of CPEs owned or managed by | |||
| managed by the enterprise, while remote end-points are being | the enterprise, and the remote end-points being managed or | |||
| managed or controlled by Cloud DCs (For the ease of | controlled by Cloud DCs. | |||
| description, let's call such CPEs asymmetrically-managed CPEs). | ||||
| - Cloud DCs may have different entry points (or devices) with one | 7.2. Edge WAN Port Management | |||
| entry point that terminates a private direct connection (based | ||||
| upon a leased line for example) and other entry points being | ||||
| devices terminating the IPsec tunnels, as shown in Figure 2. | ||||
| Therefore, the SD-WAN design becomes asymmetric. | An SDWAN edge node can have WAN ports connected to different | |||
| +------------------------+ | networks or public internet managed by different operators. | |||
| | ,---. ,---. | | There is therefore a need to propagate WAN port property to | |||
| | (TN-1 ) ( TN-2)| TN: Tenant applications/workloads | remote authorized peers in third party network domains in | |||
| | `-+-' +---+ `-+-' | | addition to route propagation. Such an exchange cannot happen | |||
| | +----|vR1|----+ | | before communication between peers is properly secured. | |||
| | ++--+ | | ||||
| | | +-+----+ | ||||
| | | /Internet\ One path via | ||||
| | +-------+ Gateway +---------------------+ | ||||
| | \ / Internet \ | ||||
| | +-+----+ \ | ||||
| +------------------------+ \ | ||||
| \ | ||||
| +------------------------+ native traffic \ | ||||
| | ,---. ,---. | without encryption| | ||||
| | (TN-3 ) ( TN-4)| | | ||||
| | `-+-' +--+ `-+-' | | +------+ | ||||
| | +----|vR|-----+ | +----+ CPE | | ||||
| | ++-+ | | +------+ | ||||
| | | +-+----+ | | ||||
| | | / virtual\ One path via IPsec Tunnel | | ||||
| | +-------+ Gateway +-------------------------- + | ||||
| | \ / Encrypted traffic over| | ||||
| | +-+----+ public network | | ||||
| +------------------------+ | | ||||
| | | ||||
| +------------------------+ | | ||||
| | ,---. ,---. | Native traffic | | ||||
| | (TN-5 ) ( TN-6)| without encryption | | ||||
| | `-+-' +--+ `-+-' | over secure network| | ||||
| | +----|vR|-----+ | | | ||||
| | ++-+ | | | ||||
| | | +-+----+ +------+ | | ||||
| | | / \ Via Direct /customer\ | | ||||
| | +-------+ Gateway +----------+ gateway |-----+ | ||||
| | \ / Connect \ / | ||||
| | +-+----+ +------+ | ||||
| +------------------------+Customer GW has physical connection to AWS GW | ||||
| Figure 2: Different Underlays to Reach Cloud DC | 7.3. Forwarding based on Application | |||
| Forwarding based on application IDs instead of based on | ||||
| destination IP addresses is often referred to as Application based | ||||
| Segmentation. If the Applications have unique IP addresses, then | ||||
| the Application Based Segmentation can be achieved by propagating | ||||
| different BGP UPDATE messages to different nodes, as described in | ||||
| [BGP-SDWAN-USAGE]. If the Application cannot be uniquely | ||||
| identified by the IP addresses, more work is needed. | ||||
| 8. End-to-End Security Concerns for Data Flows | 8. End-to-End Security Concerns for Data Flows | |||
| When IPsec tunnels established from enterprise on-premises CPEs | When IPsec tunnels established from enterprise on-premises CPEs | |||
| are terminated at the Cloud DC gateway where the workloads or | are terminated at the Cloud DC gateway where the workloads or | |||
| applications are hosted, some enterprises have concerns regarding | applications are hosted, some enterprises have concerns regarding | |||
| traffic to/from their workload being exposed to others behind the | traffic to/from their workload being exposed to others behind the | |||
| data center gateway (e.g., exposed to other organizations that | data center gateway (e.g., exposed to other organizations that | |||
| have workloads in the same data center). | have workloads in the same data center). | |||
| To ensure that traffic to/from workloads is not exposed to | To ensure that traffic to/from workloads is not exposed to | |||
| End of changes. 47 change blocks. | ||||
| 374 lines changed or deleted | 352 lines changed or added | |||