| < draft-ietf-rtgwg-net2cloud-problem-statement-09.txt | draft-ietf-rtgwg-net2cloud-problem-statement-10.txt > | |||
|---|---|---|---|---|
| Network Working Group L. Dunbar | ||||
| Internet Draft Futurewei | ||||
| Intended status: Informational Andy Malis | ||||
| Expires: November 1, 2020 Independent | ||||
| C. Jacquenet | ||||
| Orange | ||||
| M. Toy | ||||
| Verizon | ||||
| May 1, 2020 | ||||
| Network Working Group L. Dunbar | Dynamic Networks to Hybrid Cloud DCs Problem Statement | |||
| Internet Draft Futurewei | draft-ietf-rtgwg-net2cloud-problem-statement-10 | |||
| Intended status: Informational Andy Malis | ||||
| Expires: September 16, 2020 Independent | ||||
| C. Jacquenet | ||||
| Orange | ||||
| M. Toy | ||||
| Verizon | ||||
| March 16, 2020 | ||||
| Dynamic Networks to Hybrid Cloud DCs Problem Statement | Abstract | |||
| draft-ietf-rtgwg-net2cloud-problem-statement-09 | ||||
| Abstract | This document describes the problems that enterprises face today | |||
| when interconnecting their branch offices with dynamic workloads in | ||||
| third party data centers (a.k.a. Cloud DCs). There can be many | ||||
| problems associated with networks connecting to or among Clouds, many | ||||
| of which are probably outside the IETF's scope. The objective of this | ||||
| document is to identify some of the problems that need additional | ||||
| work in the IETF Routing area. Other problems are out of the scope of | ||||
| this document. | ||||
| This document describes the problems that enterprises face today | This document focuses on the network problems that many enterprises | |||
| when interconnecting their branch offices with dynamic workloads in | face when they have workloads, applications, and data split among | |||
| third party data centers (a.k.a. Cloud DCs). There can be many | different data centers, especially for those enterprises with | |||
| problems associated with networks connecting to or among Clouds, many | multiple sites that are already interconnected by VPNs (e.g., MPLS | |||
| of which are probably outside the IETF's scope. The objective of this | L2VPN/L3VPN). | |||
| document is to identify some of the problems that need additional | ||||
| work in the IETF Routing area. Other problems are out of the scope of | ||||
| this document. | ||||
| It examines some of the approaches to interconnecting cloud DCs with | Current operational problems are examined to determine whether there | |||
| enterprises' on-premises DCs and branch offices. This document also | is a need to improve existing protocols or whether a new protocol is | |||
| describes some of the network problems that many enterprises face | necessary to solve them. | |||
| when they have workloads, applications, and data split among different | ||||
| data centers, especially for those enterprises with multiple sites | ||||
| that are already interconnected by VPNs (e.g., MPLS L2VPN/L3VPN). | ||||
| Current operational problems are examined to determine whether there | Status of this Memo | |||
| is a need to improve existing protocols or whether a new protocol is | ||||
| necessary to solve them. | ||||
| Status of this Memo | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | ||||
| This Internet-Draft is submitted in full conformance with the | Internet-Drafts are working documents of the Internet Engineering | |||
| provisions of BCP 78 and BCP 79. | Task Force (IETF), its areas, and its working groups. Note that | |||
| other groups may also distribute working documents as Internet- | ||||
| Drafts. | ||||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are draft documents valid for a maximum of six | |||
| Task Force (IETF), its areas, and its working groups. Note that | months and may be updated, replaced, or obsoleted by other documents | |||
| other groups may also distribute working documents as Internet- | at any time. It is inappropriate to use Internet-Drafts as | |||
| Drafts. | reference material or to cite them other than as "work in progress." | |||
| Internet-Drafts are draft documents valid for a maximum of six | The list of current Internet-Drafts can be accessed at | |||
| months and may be updated, replaced, or obsoleted by other documents | http://www.ietf.org/ietf/1id-abstracts.txt | |||
| at any time. It is inappropriate to use Internet-Drafts as | ||||
| reference material or to cite them other than as "work in progress." | ||||
| The list of current Internet-Drafts can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/shadow.html | |||
| The list of Internet-Draft Shadow Directories can be accessed at | This Internet-Draft will expire on October 1, 2020. | |||
| http://www.ietf.org/shadow.html | ||||
| This Internet-Draft will expire on August 16, 2020. | Copyright Notice | |||
| Copyright Notice | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | ||||
| Copyright (c) 2020 IETF Trust and the persons identified as the | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| document authors. All rights reserved. | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | ||||
| publication of this document. Please review these documents | ||||
| carefully, as they describe your rights and restrictions with | ||||
| respect to this document. Code Components extracted from this | ||||
| document must include Simplified BSD License text as described in | ||||
| Section 4.e of the Trust Legal Provisions and are provided without | ||||
| warranty as described in the Simplified BSD License. | ||||
| This document is subject to BCP 78 and the IETF Trust's Legal | Table of Contents | |||
| Provisions Relating to IETF Documents | ||||
| (http://trustee.ietf.org/license-info) in effect on the date of | ||||
| publication of this document. Please review these documents | ||||
| carefully, as they describe your rights and restrictions with | ||||
| respect to this document. Code Components extracted from this | ||||
| document must include Simplified BSD License text as described in | ||||
| Section 4.e of the Trust Legal Provisions and are provided without | ||||
| warranty as described in the Simplified BSD License. | ||||
| Table of Contents | 1. Introduction...................................................3 | |||
| 1.1. Key Characteristics of Cloud Services:....................3 | ||||
| 1.2. Connecting to Cloud Services..............................3 | ||||
| 1.3. The role of SD-WAN in connecting to Cloud Services........4 | ||||
| 2. Definition of terms............................................4 | ||||
| 3. High Level Issues of Connecting to Multi-Cloud.................6 | ||||
| 3.1. Security Issues...........................................6 | ||||
| 3.2. Authorization and Identity Management.....................6 | ||||
| 3.3. API abstraction...........................................7 | ||||
| 3.4. DNS for Cloud Resources...................................8 | ||||
| 3.5. NAT for Cloud Services....................................9 | ||||
| 3.6. Cloud Discovery...........................................9 | ||||
| 4. Interconnecting Enterprise Sites with Cloud DCs...............10 | ||||
| 4.1. Sites to Cloud DC........................................10 | ||||
| 4.2. Inter-Cloud Interconnection..............................12 | ||||
| 5. Problems with MPLS-based VPNs extending to Hybrid Cloud DCs...14 | ||||
| 6. Problem with using IPsec tunnels to Cloud DCs.................15 | ||||
| 6.1. Scaling Issues with IPsec Tunnels........................15 | ||||
| 6.2. Poor performance over long distance......................16 | ||||
| 7. Problems of Using SD-WAN to connect to Cloud DCs..............16 | ||||
| 7.1. More Complexity to Edge Nodes............................17 | ||||
| 7.2. Edge WAN Port Management.................................17 | ||||
| 7.3. Forwarding based on Application..........................18 | ||||
| 8. End-to-End Security Concerns for Data Flows...................18 | ||||
| 9. Requirements for Dynamic Cloud Data Center VPNs...............18 | ||||
| 10. Security Considerations......................................19 | ||||
| 11. IANA Considerations..........................................19 | ||||
| 12. References...................................................19 | ||||
| 12.1. Normative References....................................19 | ||||
| 12.2. Informative References..................................19 | ||||
| 13. Acknowledgments..............................................20 | ||||
| 1. Introduction...................................................3 | 1. Introduction | |||
| 1.1. Key Characteristics of Cloud Services:....................3 | ||||
| 1.2. Connecting to Cloud Services..............................3 | ||||
| 1.3. The role of SD-WAN in connecting to Cloud Services........4 | ||||
| 2. Definition of terms............................................5 | ||||
| 3. High Level Issues of Connecting to Multi-Cloud.................6 | ||||
| 3.1. Security Issues...........................................6 | ||||
| 3.2. Authorization and Identity Management.....................6 | ||||
| 3.3. API abstraction...........................................7 | ||||
| 3.4. DNS for Cloud Resources...................................8 | ||||
| 3.5. NAT for Cloud Services....................................9 | ||||
| 3.6. Cloud Discovery...........................................9 | ||||
| 4. Interconnecting Enterprise Sites with Cloud DCs...............10 | ||||
| 4.1. Sites to Cloud DC........................................10 | ||||
| 4.2. Inter-Cloud Interconnection..............................12 | ||||
| 5. Problems with MPLS-based VPNs extending to Hybrid Cloud DCs...14 | ||||
| 6. Problem with using IPsec tunnels to Cloud DCs.................15 | ||||
| 6.1. Scaling Issues with IPsec Tunnels........................15 | ||||
| 6.2. Poor performance over long distance......................16 | ||||
| 7. Problems of Using SD-WAN to connect to Cloud DCs..............16 | ||||
| 7.1. More Complexity to Edge Nodes............................17 | ||||
| 7.2. Edge WAN Port Management.................................17 | ||||
| 7.3. Forwarding based on Application..........................18 | ||||
| 8. End-to-End Security Concerns for Data Flows...................18 | ||||
| 9. Requirements for Dynamic Cloud Data Center VPNs...............18 | ||||
| 10. Security Considerations......................................19 | ||||
| 11. IANA Considerations..........................................19 | ||||
| 12. References...................................................19 | ||||
| 12.1. Normative References....................................19 | ||||
| 12.2. Informative References..................................19 | ||||
| 13. Acknowledgments..............................................20 | ||||
| 1. Introduction | 1.1. Key Characteristics of Cloud Services: | |||
| 1.1. Key Characteristics of Cloud Services: | Key characteristics of Cloud Services are on-demand, scalable, | |||
| highly available, and usage-based billing. Cloud Services, such as | ||||
| compute, storage, network functions (most likely virtual), third | ||||
| party managed applications, etc., are usually hosted and managed by | ||||
| third-party Cloud Operators. Here are some examples of Cloud | ||||
| network functions: Virtual Firewall services, Virtual private | ||||
| network services, Virtual PBX services including voice and video | ||||
| conferencing systems, etc. A Cloud Data Center (DC) is shared | ||||
| infrastructure that hosts Cloud Services for many customers. | ||||
| Key characteristics of Cloud Services are on-demand, scalable, | 1.2. Connecting to Cloud Services | |||
| highly available, and usage-based billing. Cloud Services, such as | ||||
| compute, storage, network functions (most likely virtual), third | ||||
| party managed applications, etc., are usually hosted and managed by | ||||
| third-party Cloud Operators. Here are some examples of Cloud | ||||
| network functions: Virtual Firewall services, Virtual private | ||||
| network services, Virtual PBX services including voice and video | ||||
| conferencing systems, etc. A Cloud Data Center (DC) is shared | ||||
| infrastructure that hosts Cloud Services for many customers. | ||||
| 1.2. Connecting to Cloud Services | With the advent of widely available third-party cloud DCs and | |||
| services in diverse geographic locations and the advancement of | ||||
| tools for monitoring and predicting application behaviors, it is | ||||
| very attractive for enterprises to instantiate applications and | ||||
| workloads in locations that are geographically closest to their end- | ||||
| users. Such proximity can improve end-to-end latency and overall | ||||
| user experience. Conversely, an enterprise can easily shut down | ||||
| applications and workloads whenever end-users are in motion (thereby | ||||
| modifying the networking connection of subsequently relocated | ||||
| applications and workloads). In addition, enterprises may wish to | ||||
| take advantage of more and more business applications offered by | ||||
| cloud operators. | ||||
| With the advent of widely available third-party cloud DCs and | The networks that interconnect hybrid cloud DCs must address the | |||
| services in diverse geographic locations and the advancement of | following requirements: | |||
| tools for monitoring and predicting application behaviors, it is | - to access all workloads in the desired cloud DCs. | |||
| very attractive for enterprises to instantiate applications and | Many enterprises include cloud in their disaster recovery | |||
| workloads in locations that are geographically closest to their end- | strategy, such as enforcing periodic backup policies within the | |||
| users. Such proximity can improve end-to-end latency and overall | cloud, or running backup applications in the Cloud. | |||
| user experience. Conversely, an enterprise can easily shut down | ||||
| applications and workloads whenever end-users are in motion (thereby | ||||
| modifying the networking connection of subsequently relocated | ||||
| applications and workloads). In addition, enterprises may wish to | ||||
| take advantage of more and more business applications offered by | ||||
| cloud operators. | ||||
| The networks that interconnect hybrid cloud DCs must address the | - Global reachability from different geographical zones, thereby | |||
| following requirements: | facilitating the proximity of applications as a function of the | |||
| - High availability to access all workloads in the desired cloud | end users' location, to improve latency. | |||
| DCs. | - Elasticity: prompt connection to newly instantiated | |||
| Many enterprises include cloud in their disaster recovery | applications at Cloud DCs when usage increases, and prompt | |||
| strategy, such as enforcing periodic backup policies within the | release of connections once applications at those locations | |||
| cloud, or running backup applications in the Cloud. | are removed when demand changes. | |||
| - Scalable security management. | ||||
| - Global reachability from different geographical zones, thereby | 1.3. The role of SD-WAN in connecting to Cloud Services | |||
| facilitating the proximity of applications as a function of the | ||||
| end users' location, to improve latency. | ||||
| - Elasticity: prompt connection to newly instantiated | ||||
| applications at Cloud DCs when usage increases, and prompt | ||||
| release of connections once applications at those locations | ||||
| are removed when demand changes. | ||||
| - Scalable security management. | ||||
| 1.3. The role of SD-WAN in connecting to Cloud Services | Some of the characteristics of SD-WAN [SDWAN-BGP-USAGE], such as | |||
| network augmentation and forwarding based on application IDs rather | ||||
| than destination IP addresses, are essential for | ||||
| connecting to on-demand Cloud services. | ||||
| Some of the characteristics of SD-WAN [SDWAN-BGP-USAGE], such as | Issues associated with using SD-WAN for connecting to Cloud services | |||
| network augmentation and forwarding based on application IDs rather | are also discussed in this document. | |||
| than destination IP addresses, are essential for | ||||
| connecting to on-demand Cloud services. | ||||
| Issues associated with using SD-WAN for connecting to Cloud services | 2. Definition of terms | |||
| are also discussed in this document. | ||||
| 2. Definition of terms | Cloud DC: Third party Data Centers that usually host applications | |||
| and workloads owned by different organizations or | ||||
| tenants. | ||||
| Cloud DC: Third party Data Centers that usually host applications | Controller: Used interchangeably with SD-WAN controller to manage | |||
| and workloads owned by different organizations or | SD-WAN overlay path creation/deletion and to monitor the | |||
| tenants. | path conditions between two or more sites. | |||
| Controller: Used interchangeably with SD-WAN controller to manage | DSVPN: Dynamic Smart Virtual Private Network. DSVPN is a secure | |||
| SD-WAN overlay path creation/deletion and to monitor the | network that exchanges data between sites without | |||
| path conditions between two or more sites. | needing to pass traffic through an organization's | |||
| headquarters virtual private network (VPN) server or | ||||
| router. | ||||
| DSVPN: Dynamic Smart Virtual Private Network. DSVPN is a secure | Heterogeneous Cloud: applications and workloads split among Cloud | |||
| network that exchanges data between sites without | DCs owned or managed by different operators. | |||
| needing to pass traffic through an organization's | ||||
| headquarters virtual private network (VPN) server or | ||||
| router. | ||||
| Heterogeneous Cloud: applications and workloads split among Cloud | Hybrid Clouds: Hybrid Clouds refers to an enterprise using its own | |||
| DCs owned or managed by different operators. | on-premises DCs in addition to Cloud services provided | |||
| by one or more cloud operators (e.g., AWS, Azure, | ||||
| Google, Salesforce, SAP, etc.). | ||||
| Hybrid Clouds: Hybrid Clouds refers to an enterprise using its own | SD-WAN: Software Defined Wide Area Network. In this document, | |||
| on-premises DCs in addition to Cloud services provided | "SD-WAN" refers to the solutions of pooling WAN | |||
| by one or more cloud operators (e.g., AWS, Azure, | bandwidth from multiple underlay networks to get better | |||
| Google, Salesforce, SAP, etc.). | WAN bandwidth management, visibility & control. When the | |||
| underlay networks are private networks, traffic can | ||||
| traverse without additional encryption; when the | ||||
| underlay networks are public, such as Internet, some | ||||
| traffic needs to be encrypted when traversing through | ||||
| (depending on user provided policies). | ||||
| SD-WAN: Software Defined Wide Area Network. In this document, | VPC: Virtual Private Cloud is a virtual network dedicated to | |||
| SD-WAN refers to the solutions of pooling WAN | one client account. It is logically isolated from other | |||
| bandwidth from multiple underlay networks to get better | virtual networks in a Cloud DC. Each client can launch | |||
| WAN bandwidth management, visibility & control. When the | his/her desired resources, such as compute, storage, or | |||
| underlay networks are private networks, traffic can | network functions into his/her VPC. Most Cloud | |||
| traverse without additional encryption; when the | operators' VPCs only support private addresses, some | |||
| underlay networks are public, such as Internet, some | support IPv4 only, others support IPv4/IPv6 dual stack. | |||
| traffic needs to be encrypted when traversing through | ||||
| (depending on user provided policies). | ||||
| VPC: Virtual Private Cloud is a virtual network dedicated to | 3. High Level Issues of Connecting to Multi-Cloud | |||
| one client account. It is logically isolated from other | ||||
| virtual networks in a Cloud DC. Each client can launch | ||||
| his/her desired resources, such as compute, storage, or | ||||
| network functions into his/her VPC. Most Cloud | ||||
| operators' VPCs only support private addresses, some | ||||
| support IPv4 only, others support IPv4/IPv6 dual stack. | ||||
| 3. High Level Issues of Connecting to Multi-Cloud | There are many problems associated with connecting to hybrid Cloud | |||
| Services, many of which are out of the IETF scope. This section | ||||
| identifies some of the high-level problems that can be addressed by | ||||
| the IETF, especially by the Routing area. Other problems are out of the | ||||
| scope of this document. This section by no means covers all | ||||
| problems for connecting to Hybrid Cloud Services, e.g. difficulty in | ||||
| managing cloud spending is not discussed here. | ||||
| There are many problems associated with connecting to hybrid Cloud | 3.1. Security Issues | |||
| Services, many of which are out of the IETF scope. This section | ||||
| identifies some of the high level problems that can be addressed by | ||||
| the IETF, especially by the Routing area. Other problems are out of the | ||||
| scope of this document. This section by no means covers all | ||||
| problems for connecting to Hybrid Cloud Services, e.g. difficulty in | ||||
| managing cloud spending is not discussed here. | ||||
| 3.1. Security Issues | Cloud Services is built upon shared infrastructure, therefore not | |||
| secure by nature. Security has been a primary, and valid, concern | ||||
| from the start of cloud computing, e.g., not being able to see the | ||||
| exact location where the data are stored or to trace access to them. | ||||
| Headlines highlighting data breaches, compromised credentials, | ||||
| broken authentication, hacked interfaces and APIs, and account hijacking | ||||
| haven't helped alleviate concerns. | ||||
| Cloud Services is built upon shared infrastructure, therefore not | Many Cloud operators offer monitoring services for data stored in | |||
| secure by nature. Security has been a primary, and valid, concern | Clouds, such as AWS CloudTrail, Azure Monitor, and many third-party | |||
| from the start of cloud computing: you are unable to see the exact | monitoring tools to improve visibility to data stored in Clouds. But | |||
| location where your data is stored or being processed. Headlines | there are still underlying security concerns about illegitimate | |||
| highlighting data breaches, compromised credentials, broken | access to data and workloads. | |||
| authentication, hacked interfaces and APIs, and account hijacking | ||||
| haven't helped alleviate concerns. | ||||
| Secure user identity management, authentication, and access control | Secure user identity management, authentication, and access control | |||
| mechanisms are important. Developing appropriate security | mechanisms are important. Developing appropriate security | |||
| measures can enhance the confidence needed by enterprises to | measures can enhance the confidence needed by enterprises to | |||
| fully take advantage of Cloud Services. | fully take advantage of Cloud Services. | |||
| 3.2. Authorization and Identity Management | 3.2. Authorization and Identity Management | |||
| One of the more prominent challenges for Cloud Services is Identity | One of the more prominent challenges for Cloud Services is Identity | |||
| Management and Authorization. The Authorization not only includes | Management and Authorization. The Authorization not only includes | |||
| user authorization, but also the authorization of API calls by | user authorization, but also the authorization of API calls by | |||
| applications from different Cloud DCs managed by different Cloud | applications from different Cloud DCs managed by different Cloud | |||
| Operators. In addition, there is authorization for Workload | Operators. In addition, there is authorization for Workload | |||
| Migration, Data Migration, and Workload Management. | Migration, Data Migration, and Workload Management. | |||
| There are many types of users in cloud environments, e.g. end users | There are many types of users in cloud environments, e.g. end users | |||
| for accessing applications hosted in Cloud DCs, Cloud-resource users | for accessing applications hosted in Cloud DCs, Cloud-resource users | |||
| who are responsible for setting permissions for the resources based | who are responsible for setting permissions for the resources based | |||
| on roles, access lists, IP addresses, domains, etc. | on roles, access lists, IP addresses, domains, etc. | |||
| There are many types of Cloud authorizations: including MAC | There are many types of Cloud authorizations: including MAC | |||
| (Mandatory Access Control) where each app owns individual access | (Mandatory Access Control) - where each app owns individual access | |||
| permissions, DAC (Discretionary Access Control) where each app | permissions, DAC (Discretionary Access Control) - where each app | |||
| requests permissions from an external permissions app, RBAC (Role- | requests permissions from an external permissions app, RBAC (Role- | |||
| based Access Control) where the authorization service owns roles | based Access Control) - where the authorization service owns roles | |||
| with different privileges on the cloud service, and ABAC (Attribute- | with different privileges on the cloud service, and ABAC (Attribute- | |||
| based Access Control) where access is based on request attributes | based Access Control) - where access is based on request attributes | |||
| and policies. | and policies. | |||
| The IETF hasn't yet developed a comprehensive specification for Identity | The IETF hasn't yet developed a comprehensive specification for Identity | |||
| management and data models for Cloud Authorizations. | management and data models for Cloud Authorizations. | |||
| 3.3. API abstraction | 3.3. API abstraction | |||
| Different Cloud Operators have different APIs to access their Cloud | Different Cloud Operators have different APIs to access their Cloud | |||
| resources, security functions, the NAT, etc. | resources, security functions, the NAT, etc. | |||
| It is difficult to move applications built by one Cloud operator's | It is difficult to move applications built by one Cloud operator's | |||
| APIs to another. However, it is highly desirable to have a single | APIs to another. However, it is highly desirable to have a single | |||
| and consistent way to manage the networks and respective security | and consistent way to manage the networks and respective security | |||
| policies for interconnecting applications hosted in different Cloud | policies for interconnecting applications hosted in different Cloud | |||
| DCs. | DCs. | |||
| The desired property would be having a single network fabric to | The desired property would be having a single network fabric to | |||
| which different Cloud DCs and an enterprise's multiple sites can be | which different Cloud DCs and an enterprise's multiple sites can be | |||
| attached or detached, with a common interface for setting desired | attached or detached, with a common interface for setting desired | |||
| policies. | policies. | |||
| The difficulty of connecting applications in different Clouds might | The difficulty of connecting applications in different Clouds might | |||
| stem from the fact that the Cloud operators are direct competitors. | stem from the fact that the Cloud operators are direct competitors. | |||
| Usually, traffic flowing out of Cloud DCs incurs charges. Therefore, direct | Usually, traffic flowing out of Cloud DCs incurs charges. Therefore, direct | |||
| communications between applications in different Cloud DCs can be | communications between applications in different Cloud DCs can be | |||
| more expensive than intra Cloud communications. | more expensive than intra Cloud communications. | |||
| It is desirable to have a common API shim layer or abstraction for | It is desirable to have a common API shim layer or abstraction for | |||
| different Cloud providers to make it easier to move applications | different Cloud providers to make it easier to move applications | |||
| from one Cloud DC to another. | from one Cloud DC to another. | |||
| 3.4. DNS for Cloud Resources | 3.4. DNS for Cloud Resources | |||
| DNS name resolution is essential for on-premises and cloud-based | DNS name resolution is essential for on-premises and cloud-based | |||
| resources. For customers with hybrid workloads, which include on- | resources. For customers with hybrid workloads, which include on- | |||
| premises and cloud-based resources, extra steps are necessary to | premises and cloud-based resources, extra steps are necessary to | |||
| configure DNS to work seamlessly across both environments. | configure DNS to work seamlessly across both environments. | |||
| Cloud operators have their own DNS to resolve resources within their | Cloud operators have their own DNS to resolve resources within their | |||
| Cloud DCs and to well-known public domains. Cloud's DNS can be | Cloud DCs and to well-known public domains. Cloud's DNS can be | |||
| configured to forward queries to customer managed authoritative DNS | configured to forward queries to customer managed authoritative DNS | |||
| servers hosted on-premises, and to respond to DNS queries forwarded | servers hosted on-premises, and to respond to DNS queries forwarded | |||
| by on-premises DNS servers. | by on-premises DNS servers. | |||
| For enterprises utilizing Cloud services by different cloud | For enterprises utilizing Cloud services by different cloud | |||
| operators, it is necessary to establish policies and rules on | operators, it is necessary to establish policies and rules on | |||
| how and where to forward DNS queries. When applications in one Cloud | how and where to forward DNS queries. When applications in one Cloud | |||
| need to communicate with applications hosted in another Cloud, | need to communicate with applications hosted in another Cloud, | |||
| there could be DNS queries from one Cloud DC being forwarded to the | there could be DNS queries from one Cloud DC being forwarded to the | |||
| enterprise's on-premises DNS, which in turn are forwarded to the DNS | enterprise's on-premises DNS, which in turn are forwarded to the DNS | |||
| service in another Cloud. Needless to say, configuration can be | service in another Cloud. Needless to say, configuration can be | |||
| complex depending on the application communication patterns. | complex depending on the application communication patterns. | |||
| However, even with carefully managed policies and configurations, | However, even with carefully managed policies and configurations, | |||
| collisions can still occur. If you use an internal name like .cloud | collisions can still occur. If you use an internal name like .cloud | |||
| and then want your services to be available via or within some other | and then want your services to be available via or within some other | |||
| cloud provider which also uses .cloud, then it can't work. | cloud provider which also uses .cloud, then it can't work. | |||
| Therefore, it is better to use the global domain name even when an | Therefore, it is better to use the global domain name even when an | |||
| organization does not make all its namespace globally resolvable. An | organization does not make all its namespace globally resolvable. An | |||
| organization's globally unique DNS can include subdomains that | organization's globally unique DNS can include subdomains that | |||
| cannot be resolved at all outside certain restricted paths, zones | cannot be resolved at all outside certain restricted paths, zones | |||
| that resolve differently based on the origin of the query, and zones | that resolve differently based on the origin of the query, and zones | |||
| that resolve the same globally for all queries from any source. | that resolve the same globally for all queries from any source. | |||
| Globally unique names do not equate to globally resolvable names or | Globally unique names do not equate to globally resolvable names or | |||
| even global names that resolve the same way from every perspective. | even global names that resolve the same way from every perspective. | |||
| Globally unique names do prevent any possibility of collision at the | Globally unique names do prevent any possibility of collision at the | |||
| present or in the future and they make DNSSEC trust manageable. | present or in the future and they make DNSSEC trust manageable. | |||
| Consider using a registered and fully qualified domain name (FQDN) | Consider using a registered and fully qualified domain name (FQDN) | |||
| from global DNS as the root for enterprise and other internal | from global DNS as the root for enterprise and other internal | |||
| namespaces. | namespaces. | |||
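
The recommendation above, to root internal namespaces under a registered and globally unique domain, can be checked mechanically. The following is a minimal Python sketch and is not part of the draft: the domain `example.com` and the helper names are hypothetical, chosen only for illustration. It flags internal names (such as ones ending in `.cloud`) that do not sit under the organization's registered domain and could therefore collide with another party's use of the same label.

```python
def normalize(name):
    """Lower-case a DNS name and split it into labels, ignoring a trailing dot."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def is_under_registered_domain(name, registered):
    """Return True if `name` equals `registered` or is a subdomain of it.

    The comparison is done label by label, so "badexample.com" is not
    mistaken for a subdomain of "example.com".
    """
    name_labels = normalize(name)
    reg_labels = normalize(registered)
    if not reg_labels or len(name_labels) < len(reg_labels):
        return False
    return name_labels[-len(reg_labels):] == reg_labels


def find_collision_risks(internal_names, registered):
    """List internal names that are NOT rooted under the registered domain."""
    return [n for n in internal_names
            if not is_under_registered_domain(n, registered)]


if __name__ == "__main__":
    # Hypothetical internal names; "example.com" stands in for the
    # organization's registered domain.
    names = ["app.corp.example.com.", "svc.cloud", "db.internal.example.com"]
    print(find_collision_risks(names, "example.com"))
```

A name that passes this check is globally unique, but as the text notes, that does not mean it is globally resolvable.
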
| 3.5. NAT for Cloud Services | 3.5. NAT for Cloud Services | |||
| Cloud resources, such as VM instances, are usually assigned with | Cloud resources, such as VM instances, are usually assigned with | |||
| private IP addresses. By configuration, some private subnets can | private IP addresses. By configuration, some private subnets can | |||
| have the NAT function to reach out to external network and some | have the NAT function to reach out to external network and some | |||
| private subnets are internal to Cloud only. | private subnets are internal to Cloud only. | |||
| Different Cloud operators support different levels of NAT functions. | Different Cloud operators support different levels of NAT functions. | |||
| For example, AWS NAT Gateway does not currently support connections | For example, AWS NAT Gateway does not currently support connections | |||
| towards or from VPC Endpoints, VPN, AWS Direct Connect, or VPC | towards or from VPC Endpoints, VPN, AWS Direct Connect, or VPC | |||
| Peering (see | Peering (see | |||
| https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-nat-gateway.html#nat-gateway-other-services). | https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-nat-gateway.html#nat-gateway-other-services). | |||
| AWS Direct | AWS Direct | |||
| Connect/VPN/VPC Peering does not currently support any NAT | Connect/VPN/VPC Peering does not currently support any NAT | |||
| functionality. | functionality. | |||
| Google's Cloud NAT allows Google Cloud virtual machine (VM) | Google's Cloud NAT allows Google Cloud virtual machine (VM) | |||
| instances without external IP addresses and private Google | instances without external IP addresses and private Google | |||
| Kubernetes Engine (GKE) clusters to connect to the Internet. Cloud | Kubernetes Engine (GKE) clusters to connect to the Internet. Cloud | |||
| NAT implements outbound NAT in conjunction with a default route to | NAT implements outbound NAT in conjunction with a default route to | |||
| allow instances to reach the Internet. It does not implement inbound | allow instances to reach the Internet. It does not implement inbound | |||
| NAT. Hosts outside of VPC network can only respond to established | NAT. Hosts outside of VPC network can only respond to established | |||
| connections initiated by instances inside the Google Cloud; they | connections initiated by instances inside the Google Cloud; they | |||
| cannot initiate their own, new connections to Cloud instances via | cannot initiate their own, new connections to Cloud instances via | |||
| NAT. | NAT. | |||
| For enterprises with applications running in different Cloud DCs, | For enterprises with applications running in different Cloud DCs, | |||
| proper configuration of NAT have to be performed in Cloud DC and in | proper configuration of NAT has to be performed in Cloud DC and in | |||
| their own on-premise DC. | their own on-premise DC. | |||
| 3.6. Cloud Discovery | 3.6. Cloud Discovery | |||
| One of the concerns of using Cloud services is not being aware of where the | One of the concerns of using Cloud services is not being aware of where the | |||
| resource is actually located, especially since Cloud operators can move | resource is actually located, especially since Cloud operators can move | |||
| application instances from one place to another. When applications | application instances from one place to another. When applications | |||
| in Cloud communicate with on-premise applications, it may not be | in Cloud communicate with on-premise applications, it may not be | |||
| clear where the Cloud applications are located or to which VPCs they | clear where the Cloud applications are located or to which VPCs they | |||
| belong. | belong. | |||
| It is highly desirable to have tools to discover cloud services in | It is highly desirable to have tools to discover cloud services in | |||
| much the same way as you would discover your on-premises | much the same way as you would discover your on-premises | |||
| infrastructure. A significant difference is that cloud discovery | infrastructure. A significant difference is that cloud discovery | |||
| uses the cloud vendor's API to extract data on your cloud services, | uses the cloud vendor's API to extract data on your cloud services, | |||
| rather than the direct access used in scanning your on-premises | rather than the direct access used in scanning your on-premises | |||
| infrastructure. | infrastructure. | |||
| Standard data models, APIs or tools can alleviate concerns of | Standard data models, APIs or tools can alleviate concerns of | |||
| enterprise utilizing Cloud Resources, e.g. having a Cloud service | enterprise utilizing Cloud Resources, e.g. having a Cloud service | |||
| scan that connects to the API of the cloud provider and collects | scan that connects to the API of the cloud provider and collects | |||
| information directly. | information directly. | |||
| 4. Interconnecting Enterprise Sites with Cloud DCs | 4. Interconnecting Enterprise Sites with Cloud DCs | |||
| Considering that many enterprises already have existing VPNs (e.g. | Considering that many enterprises already have existing VPNs (e.g. | |||
| MPLS based L2VPN or L3VPN) interconnecting branch offices & | MPLS based L2VPN or L3VPN) interconnecting branch offices & | |||
| on-premises data centers, connecting to Cloud services will involve a mix of | on-premises data centers, connecting to Cloud services will involve a mix of | |||
| different types of networks. When an enterprise's existing VPN | different types of networks. When an enterprise's existing VPN | |||
| service providers do not have direct connections to the | service providers do not have direct connections to the | |||
| corresponding cloud DCs that the enterprise prefers to use, the | corresponding cloud DCs that the enterprise prefers to use, the | |||
| enterprise has to face additional infrastructure and operational | enterprise has to face additional infrastructure and operational | |||
| costs to utilize Cloud services. | costs to utilize Cloud services. | |||
| 4.1. Sites to Cloud DC | 4.1. Sites to Cloud DC | |||
| Most Cloud operators offer some type of network gateway through | Most Cloud operators offer some type of network gateway through | |||
| which an enterprise can reach their workloads hosted in the Cloud | which an enterprise can reach their workloads hosted in the Cloud | |||
| DCs. AWS (Amazon Web Services) offers the following options to reach | DCs. AWS (Amazon Web Services) offers the following options to reach | |||
| workloads in AWS Cloud DCs: | workloads in AWS Cloud DCs: | |||
| - AWS Internet gateway allows communication between instances in | - AWS Internet gateway allows communication between instances in | |||
| AWS VPC and the internet. | AWS VPC and the internet. | |||
| - AWS Virtual gateway (vGW) where IPsec tunnels [RFC6071] are | - AWS Virtual gateway (vGW) where IPsec tunnels [RFC6071] are | |||
| established between an enterprise's own gateway and AWS vGW, so | established between an enterprise's own gateway and AWS vGW, so | |||
| that the communications between those gateways can be secured | that the communications between those gateways can be secured | |||
| from the underlay (which might be the public Internet). | from the underlay (which might be the public Internet). | |||
| - AWS Direct Connect, which allows enterprises to purchase direct | - AWS Direct Connect, which allows enterprises to purchase direct | |||
| connect from network service providers to get a private leased | connect from network service providers to get a private leased | |||
| line interconnecting the enterprise's gateway(s) and the AWS | line interconnecting the enterprise's gateway(s) and the AWS | |||
| Direct Connect routers. In addition, an AWS Transit Gateway can | Direct Connect routers. In addition, an AWS Transit Gateway can | |||
| be used to interconnect multiple VPCs in different Availability | be used to interconnect multiple VPCs in different Availability | |||
| Zones. AWS Transit Gateway acts as a hub that controls how | Zones. AWS Transit Gateway acts as a hub that controls how | |||
| traffic is forwarded among all the connected networks which act | traffic is forwarded among all the connected networks which act | |||
| like spokes. | like spokes. | |||
| Microsoft's ExpressRoute allows extension of a private network to | Microsoft's ExpressRoute allows extension of a private network to | |||
| any of the Microsoft cloud services, including Azure and Office365. | any of the Microsoft cloud services, including Azure and Office365. | |||
| ExpressRoute is configured using Layer 3 routing. Customers can opt | ExpressRoute is configured using Layer 3 routing. Customers can opt | |||
| for redundancy by provisioning dual links from their location to two | for redundancy by provisioning dual links from their location to two | |||
| Microsoft Enterprise edge routers (MSEEs) located within a third- | Microsoft Enterprise edge routers (MSEEs) located within a third- | |||
| party ExpressRoute peering location. The BGP routing protocol is | party ExpressRoute peering location. The BGP routing protocol is | |||
| then set up over WAN links to provide redundancy to the cloud. This | then set up over WAN links to provide redundancy to the cloud. This | |||
| redundancy is maintained from the peering data center into | redundancy is maintained from the peering data center into | |||
| Microsoft's cloud network. | Microsoft's cloud network. | |||
| Google's Cloud Dedicated Interconnect offers similar network | Google's Cloud Dedicated Interconnect offers similar network | |||
| connectivity options as AWS and Microsoft. One distinct difference, | connectivity options as AWS and Microsoft. One distinct difference, | |||
| however, is that Google's service allows customers access to the | however, is that Google's service allows customers access to the | |||
| entire global cloud network by default. It does this by connecting | entire global cloud network by default. It does this by connecting | |||
| your on-premises network with the Google Cloud using BGP and Google | your on-premises network with the Google Cloud using BGP and Google | |||
| Cloud Routers to provide optimal paths to the different regions of | Cloud Routers to provide optimal paths to the different regions of | |||
| the global cloud infrastructure. | the global cloud infrastructure. | |||
| The figure below shows an example in which some of a tenant's workloads are | The figure below shows an example in which some of a tenant's workloads are | |||
| accessible via a virtual router connected by AWS Internet Gateway; | accessible via a virtual router connected by AWS Internet Gateway; | |||
| some are accessible via AWS vGW, and others are accessible via AWS | some are accessible via AWS vGW, and others are accessible via AWS | |||
| Direct Connect. | Direct Connect. | |||
| Different types of access require different levels of security | Different types of access require different levels of security | |||
| functions. Sometimes it is not visible to end customers which type | functions. Sometimes it is not visible to end customers which type | |||
| of network access is used for a specific application instance. To | of network access is used for a specific application instance. To | |||
| get better visibility, separate virtual routers (e.g. vR1 & vR2) can | get better visibility, separate virtual routers (e.g. vR1 & vR2) can | |||
| be deployed to differentiate traffic to/from different cloud GWs. It | be deployed to differentiate traffic to/from different cloud GWs. It | |||
| is important for some enterprises to be able to observe the specific | is important for some enterprises to be able to observe the specific | |||
| behaviors when connected by different connections. | behaviors when connected by different connections. | |||
| The Customer Gateway can be a customer-owned router or ports physically | The Customer Gateway can be a customer-owned router or ports physically | |||
| connected to AWS Direct Connect GW. | connected to AWS Direct Connect GW. | |||
| +------------------------+ | +------------------------+ | |||
| | ,---. ,---. | | | ,---. ,---. | | |||
| | (TN-1 ) ( TN-2)| | | (TN-1 ) ( TN-2)| | |||
| | `-+-' +---+ `-+-' | | | `-+-' +---+ `-+-' | | |||
| | +----|vR1|----+ | | | +----|vR1|----+ | | |||
| | ++--+ | | | ++--+ | | |||
| | | +-+----+ | | | +-+----+ | |||
| | | /Internet\ For External | | | /Internet\ For External | |||
| | +-------+ Gateway +---------------------- | | +-------+ Gateway +---------------------- | |||
| | \ / to reach via Internet | | \ / to reach via Internet | |||
| | +-+----+ | | +-+----+ | |||
| | | | | | | |||
| | ,---. ,---. | | | ,---. ,---. | | |||
| | (TN-1 ) ( TN-2)| | | (TN-1 ) ( TN-2)| | |||
| | `-+-' +---+ `-+-' | | | `-+-' +---+ `-+-' | | |||
| | +----|vR2|----+ | | | +----|vR2|----+ | | |||
| | ++--+ | | | ++--+ | | |||
| | | +-+----+ | | | +-+----+ | |||
| | | / virtual\ For IPsec Tunnel | | | / virtual\ For IPsec Tunnel | |||
| | +-------+ Gateway +---------------------- | | +-------+ Gateway +---------------------- | |||
| | | \ / termination | | | \ / termination | |||
| | | +-+----+ | | | +-+----+ | |||
| | | | | | | | | |||
| | | +-+----+ +------+ | | | +-+----+ +------+ | |||
| | | / \ For Direct /customer\ | | | / \ For Direct /customer\ | |||
| | +-------+ Gateway +----------+ gateway | | | +-------+ Gateway +----------+ gateway | | |||
| | \ / Connect \ / | | \ / Connect \ / | |||
| | +-+----+ +------+ | | +-+----+ +------+ | |||
| | | | | | | |||
| +------------------------+ | +------------------------+ | |||
| Figure 1: Examples of Multiple Cloud DC connections. | Figure 1: Examples of Multiple Cloud DC connections. | |||
| 4.2. Inter-Cloud Interconnection | 4.2. Inter-Cloud Interconnection | |||
| The connectivity options to Cloud DCs described in the previous | The connectivity options to Cloud DCs described in the previous | |||
| section are for reaching Cloud providers' DCs, but not between cloud | section are for reaching Cloud providers' DCs, but not between cloud | |||
| DCs. When applications in AWS Cloud need to communicate with | DCs. When applications in AWS Cloud need to communicate with | |||
| applications in Azure, today's practice requires a third-party | applications in Azure, today's practice requires a third-party | |||
| gateway (physical or virtual) to interconnect AWS's Layer 2 | gateway (physical or virtual) to interconnect AWS's Layer 2 | |||
| DirectConnect path with Azure's Layer 3 ExpressRoute. | DirectConnect path with Azure's Layer 3 ExpressRoute. | |||
| Enterprises can also instantiate their own virtual routers in | Enterprises can also instantiate their own virtual routers in | |||
| different Cloud DCs and administer IPsec tunnels among them, which | different Cloud DCs and administer IPsec tunnels among them, which | |||
| by itself is not a trivial task. Alternatively, open source VPN | by itself is not a trivial task. Alternatively, open source VPN | |||
| software such as strongSwan can be used to create an IPsec connection to the | software such as strongSwan can be used to create an IPsec connection to the | |||
| Azure gateway using a shared key. The strongSwan instance within AWS | Azure gateway using a shared key. The strongSwan instance within AWS | |||
| not only can connect to Azure but can also be used to facilitate | not only can connect to Azure but can also be used to facilitate | |||
| traffic to other nodes within the AWS VPC by configuring forwarding | traffic to other nodes within the AWS VPC by configuring forwarding | |||
| and using appropriate routing rules for the VPC. | and using appropriate routing rules for the VPC. | |||
| Most Cloud operators, such as AWS VPC or Azure VNET, use non- | Most Cloud operators, such as AWS VPC or Azure VNET, use non- | |||
| globally routable CIDR from private IPv4 address ranges as specified | globally routable CIDR from private IPv4 address ranges as specified | |||
| by RFC1918. To establish an IPsec tunnel between two Cloud DCs, it is | by RFC1918. To establish an IPsec tunnel between two Cloud DCs, it is | |||
| necessary to exchange publicly routable addresses for applications in | necessary to exchange publicly routable addresses for applications in | |||
| different Cloud DCs. [BGP-SDWAN] describes one method. Other methods | different Cloud DCs. [BGP-SDWAN] describes one method. Other methods | |||
| are worth exploring. | are worth exploring. | |||
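
To illustrate why publicly routable addresses have to be exchanged, the following minimal Python sketch (not part of the draft) uses the standard `ipaddress` module to test whether a prefix falls inside the private ranges of RFC 1918. The function names and the sample prefixes are hypothetical. A prefix for which `is_rfc1918` returns True cannot be reached directly over the public Internet, so it cannot serve as an IPsec tunnel endpoint between two Cloud DCs.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(cidr):
    """Return True if the IPv4 prefix lies entirely inside an RFC 1918 block."""
    net = ipaddress.ip_network(cidr)
    if net.version != 4:
        return False
    return any(net.subnet_of(block) for block in RFC1918_BLOCKS)


def private_prefixes(cidrs):
    """Filter a list of prefixes down to those that are RFC 1918 private."""
    return [c for c in cidrs if is_rfc1918(c)]


if __name__ == "__main__":
    # Hypothetical prefixes: two private VPC/VNET ranges and one public range.
    sample = ["10.1.0.0/16", "172.31.0.0/16", "198.51.100.0/24"]
    print(private_prefixes(sample))
```
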
| In summary, here are some approaches, available now (which might | In summary, here are some approaches, available now (which might | |||
| change in the future), to interconnect workloads among different | change in the future), to interconnect workloads among different | |||
| Cloud DCs: | Cloud DCs: | |||
| a) | a) Utilize Cloud DC provided inter/intra-cloud connectivity | |||
| Utilize Cloud DC provided inter/intra-cloud connectivity | services (e.g., AWS Transit Gateway) to connect workloads | |||
| services (e.g., AWS Transit Gateway) to connect workloads | instantiated in multiple VPCs. Such services are provided with | |||
| instantiated in multiple VPCs. Such services are provided with | the cloud gateway to connect to external networks (e.g., AWS | |||
| the cloud gateway to connect to external networks (e.g., AWS | DirectConnect Gateway). | |||
| DirectConnect Gateway). | b) Hairpin all traffic through the customer gateway, meaning all | |||
| b) | workloads are directly connected to the customer gateway, so | |||
| Hairpin all traffic through the customer gateway, meaning all | that communications among workloads within one Cloud DC must | |||
| workloads are directly connected to the customer gateway, so | traverse through the customer gateway. | |||
| that communications among workloads within one Cloud DC must | c) Establish direct tunnels among different VPCs (AWS' Virtual | |||
| traverse through the customer gateway. | Private Clouds) and VNET (Azure's Virtual Networks) via | |||
| c) | client's own virtual routers instantiated within Cloud DCs. | |||
| Establish direct tunnels among different VPCs (AWS' Virtual | DMVPN (Dynamic Multipoint Virtual Private Network) or DSVPN | |||
| Private Clouds) and VNET (Azure's Virtual Networks) via | (Dynamic Smart VPN) techniques can be used to establish direct | |||
| client's own virtual routers instantiated within Cloud DCs. | multipoint-to-point or multipoint-to-multipoint tunnels | |||
| DMVPN (Dynamic Multipoint Virtual Private Network) or DSVPN | among those client's own virtual routers. | |||
| (Dynamic Smart VPN) techniques can be used to establish direct | ||||
| multipoint-to-point or multipoint-to-multipoint tunnels | ||||
| among those client's own virtual routers. | ||||
| Approach a) usually does not work if Cloud DCs are owned and managed | Approach a) usually does not work if Cloud DCs are owned and managed | |||
| by different Cloud providers. | by different Cloud providers. | |||
| Approach b) creates additional transmission delay plus incurring | Approach b) creates additional transmission delay plus incurring | |||
| cost when exiting Cloud DCs. | cost when exiting Cloud DCs. | |||
| For Approach c), DMVPN or DSVPN use NHRP (Next Hop Resolution | For Approach c), DMVPN or DSVPN use NHRP (Next Hop Resolution | |||
| Protocol) [RFC2735] so that spoke nodes can register their IP | Protocol) [RFC2735] so that spoke nodes can register their IP | |||
| addresses & WAN ports with the hub node. The IETF ION | addresses & WAN ports with the hub node. The IETF ION | |||
| (Internetworking over NBMA (non-broadcast multiple access)) WG | (Internetworking over NBMA (non-broadcast multiple access)) WG | |||
| standardized NHRP for address resolution in connection-oriented NBMA | standardized NHRP for address resolution in connection-oriented NBMA | |||
| networks (such as ATM) more than two decades ago. | networks (such as ATM) more than two decades ago. | |||
| There are many differences between virtual routers in Public Cloud | There are many differences between virtual routers in Public Cloud | |||
| DCs and the nodes in an NBMA network. NHRP cannot be used for | DCs and the nodes in an NBMA network. NHRP cannot be used for | |||
| registering virtual routers in Cloud DCs unless an extension of such | registering virtual routers in Cloud DCs unless an extension of such | |||
| protocols is developed for that purpose, e.g. taking NAT or dynamic | protocols is developed for that purpose, e.g. taking NAT or dynamic | |||
| addresses into consideration. Therefore, DMVPN and/or DSVPN cannot | addresses into consideration. Therefore, DMVPN and/or DSVPN cannot | |||
| be used directly for connecting workloads in hybrid Cloud DCs. | be used directly for connecting workloads in hybrid Cloud DCs. | |||
| Other protocols such as BGP can be used, as described in [BGP- | Other protocols such as BGP can be used, as described in [BGP- | |||
| SDWAN]. | SDWAN]. | |||
| 5. Problems with MPLS-based VPNs extending to Hybrid Cloud DCs | 5. Problems with MPLS-based VPNs extending to Hybrid Cloud DCs | |||
| Traditional MPLS-based VPNs have been widely deployed as an | Traditional MPLS-based VPNs have been widely deployed as an | |||
| effective way to support businesses and organizations that require | effective way to support businesses and organizations that require | |||
| network performance and reliability. MPLS shifted the burden of | network performance and reliability. MPLS shifted the burden of | |||
| managing a VPN service from enterprises to service providers. The | managing a VPN service from enterprises to service providers. The | |||
| CPEs attached to MPLS VPNs are also simpler and less expensive, | CPEs attached to MPLS VPNs are also simpler and less expensive, | |||
| because they do not need to manage routes to remote sites; they | because they do not need to manage routes to remote sites; they | |||
| simply pass all outbound traffic to the MPLS VPN PEs to which the | simply pass all outbound traffic to the MPLS VPN PEs to which the | |||
| CPEs are attached (albeit multi-homing scenarios require more | CPEs are attached (albeit multi-homing scenarios require more | |||
| processing logic on CPEs). MPLS has addressed the problems of | processing logic on CPEs). MPLS has addressed the problems of | |||
| scale, availability, and fast recovery from network faults, and | scale, availability, and fast recovery from network faults, and | |||
| incorporated traffic-engineering capabilities. | incorporated traffic-engineering capabilities. | |||
| However, traditional MPLS-based VPN solutions are sub-optimized for | However, traditional MPLS-based VPN solutions are sub-optimized for | |||
| connecting end-users to dynamic workloads/applications in cloud DCs | connecting end-users to dynamic workloads/applications in cloud DCs | |||
| because: | because: | |||
| - The Provider Edge (PE) nodes of the enterprise's VPNs might not | - The Provider Edge (PE) nodes of the enterprise's VPNs might not | |||
| have direct connections to third party cloud DCs that are used | have direct connections to third party cloud DCs that are used | |||
| for hosting workloads with the goal of providing easy access | for hosting workloads with the goal of providing easy access | |||
| to enterprises' end-users. | to enterprises' end-users. | |||
| - It takes some time to deploy provider edge (PE) routers at new | - It takes some time to deploy provider edge (PE) routers at new | |||
| locations. When an enterprise's workloads are moved from one | locations. When an enterprise's workloads are moved from one | |||
| cloud DC to another (i.e., removed from one DC and re- | cloud DC to another (i.e., removed from one DC and re- | |||
| instantiated to another location when demand changes), the | instantiated to another location when demand changes), the | |||
| enterprise branch offices need to be connected to the new cloud | enterprise branch offices need to be connected to the new cloud | |||
| DC, but the network service provider might not have PEs located | DC, but the network service provider might not have PEs located | |||
| at the new location. | at the new location. | |||
| One of the main drivers for moving workloads into the cloud is | One of the main drivers for moving workloads into the cloud is | |||
| the widely available cloud DCs at geographically diverse | the widely available cloud DCs at geographically diverse | |||
| locations, where apps can be instantiated so that they can be | locations, where apps can be instantiated so that they can be | |||
| as close to their end-users as possible. When the user base | as close to their end-users as possible. When the user base | |||
| changes, the applications may be migrated to a new cloud DC | changes, the applications may be migrated to a new cloud DC | |||
| location closest to the new user base. | location closest to the new user base. | |||
| - Most of the cloud DCs do not expose their internal networks. An | - Most of the cloud DCs do not expose their internal networks. An | |||
| enterprise with a hybrid cloud deployment can use an MPLS-VPN | enterprise with a hybrid cloud deployment can use an MPLS-VPN | |||
| to connect to a Cloud provider at multiple locations. The | to connect to a Cloud provider at multiple locations. The | |||
| connection locations often correspond to gateways of different | connection locations often correspond to gateways of different | |||
| Cloud DC locations from the Cloud provider. The different | Cloud DC locations from the Cloud provider. The different | |||
| Cloud DCs are interconnected by the Cloud provider's own | Cloud DCs are interconnected by the Cloud provider's own | |||
| internal network. At each connection location (gateway), the | internal network. At each connection location (gateway), the | |||
| Cloud provider uses BGP to advertise all of the prefixes in the | Cloud provider uses BGP to advertise all of the prefixes in the | |||
| enterprise's VPC, regardless of which Cloud DC a given prefix | enterprise's VPC, regardless of which Cloud DC a given prefix | |||
| is actually in. This can result in inefficient routing for the | is actually in. This can result in inefficient routing for the | |||
| end-to-end data path. | end-to-end data path. | |||
| Another roadblock is the lack of a standard way to express and | Another roadblock is the lack of a standard way to express and | |||
| enforce consistent security policies for workloads that not only use | enforce consistent security policies for workloads that not only use | |||
| virtual addresses, but are also very likely hosted in | virtual addresses, but are also very likely hosted in | |||
| different locations within the Cloud DC [RFC8192]. The current VPN | different locations within the Cloud DC [RFC8192]. The current VPN | |||
| path computation and bandwidth allocation schemes may not be | path computation and bandwidth allocation schemes may not be | |||
| flexible enough to address the need for enterprises to rapidly | flexible enough to address the need for enterprises to rapidly | |||
| connect to dynamically instantiated (or removed) workloads and | connect to dynamically instantiated (or removed) workloads and | |||
| applications regardless of their location/nature (i.e., third party | applications regardless of their location/nature (i.e., third party | |||
| cloud DCs). | cloud DCs). | |||
| 6. Problem with using IPsec tunnels to Cloud DCs | 6. Problem with using IPsec tunnels to Cloud DCs | |||
| As described in the previous section, many Cloud operators expose | As described in the previous section, many Cloud operators expose | |||
| their gateways for external entities (which can be enterprises | their gateways for external entities (which can be enterprises | |||
| themselves) to directly establish IPsec tunnels. Enterprises can | themselves) to directly establish IPsec tunnels. Enterprises can | |||
| also instantiate virtual routers within Cloud DCs to connect to | also instantiate virtual routers within Cloud DCs to connect to | |||
| their on-premises devices via IPsec tunnels. | their on-premises devices via IPsec tunnels. | |||
| 6.1. Scaling Issues with IPsec Tunnels | 6.1. Scaling Issues with IPsec Tunnels | |||
| If there is only one enterprise location that needs to reach the | If there is only one enterprise location that needs to reach the | |||
| Cloud DC, an IPsec tunnel is a very convenient solution. | Cloud DC, an IPsec tunnel is a very convenient solution. | |||
| However, many medium-to-large enterprises have multiple sites and | However, many medium-to-large enterprises have multiple sites and | |||
| multiple data centers. For multiple sites to communicate with | multiple data centers. For multiple sites to communicate with | |||
| workloads and apps hosted in cloud DCs, Cloud DC gateways have to | workloads and apps hosted in cloud DCs, Cloud DC gateways have to | |||
| maintain many IPsec tunnels to all those locations. In addition, | maintain many IPsec tunnels to all those locations. In addition, | |||
| each of those IPsec Tunnels requires pair-wise periodic key | each of those IPsec Tunnels requires pair-wise periodic key | |||
| refreshment. For a company with hundreds or thousands of locations, | refreshment. For a company with hundreds or thousands of locations, | |||
| there could be hundreds (or even thousands) of IPsec tunnels | there could be hundreds (or even thousands) of IPsec tunnels | |||
| terminating at the cloud DC gateway, which is very processing | terminating at the cloud DC gateway, which is very processing | |||
| intensive. That is why many cloud operators only allow a limited | intensive. That is why many cloud operators only allow a limited | |||
| number of (IPsec) tunnels & bandwidth to each customer. | number of (IPsec) tunnels & bandwidth to each customer. | |||
| Alternatively, you could use a solution like group encryption where | Alternatively, you could use a solution like group encryption where | |||
| a single IPsec SA is necessary at the GW but the drawback is key | a single IPsec SA is necessary at the GW but the drawback is key | |||
| distribution and maintenance of a key server, etc. | distribution and maintenance of a key server, etc. | |||
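
The scaling argument above can be made concrete with a little arithmetic. The following is a rough Python sketch under simplifying assumptions that are not from the draft: one IPsec tunnel (counted as one SA) per enterprise site, a fixed SA lifetime after which each SA is rekeyed, and, for the group encryption case, a single SA at the gateway. The function name and the sample values (1000 sites, a 3600-second lifetime) are hypothetical.

```python
def gateway_load(sites, sa_lifetime_seconds, group_encryption=False):
    """Estimate the load on one Cloud DC gateway.

    Returns a tuple (sa_count, rekeys_per_hour).
    With pair-wise tunnels the gateway holds one SA per site and each SA
    is refreshed once per lifetime. With group encryption the gateway is
    modeled as holding a single SA (key server costs are not modeled).
    """
    sa_count = 1 if group_encryption else sites
    rekeys_per_hour = sa_count * 3600 / sa_lifetime_seconds
    return sa_count, rekeys_per_hour


if __name__ == "__main__":
    # Hypothetical enterprise with 1000 sites and a one-hour SA lifetime.
    print(gateway_load(1000, 3600))
    print(gateway_load(1000, 3600, group_encryption=True))
```

Both the SA state and the rekey rate at the gateway grow linearly with the number of sites in the pair-wise case, which is consistent with cloud operators capping the number of tunnels per customer.
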
| 6.2. Poor performance over long distance | 6.2. Poor performance over long distance | |||
| When enterprise CPEs or gateways are far away from cloud DC gateways | When enterprise CPEs or gateways are far away from cloud DC gateways | |||
| or across country/continent boundaries, performance of IPsec tunnels | or across country/continent boundaries, performance of IPsec tunnels | |||
| over the public Internet can be problematic and unpredictable. Even | over the public Internet can be problematic and unpredictable. Even | |||
| though there are many monitoring tools available to measure delay | though there are many monitoring tools available to measure delay | |||
| and various performance characteristics of the network, the | and various performance characteristics of the network, the | |||
| measurement for paths over the Internet is passive and past | measurement for paths over the Internet is passive and past | |||
| measurements may not represent future performance. | measurements may not represent future performance. | |||
| Many cloud providers can replicate workloads in different availability | Many cloud providers can replicate workloads in different availability | |||
| zones. An App instantiated in a cloud DC closest to clients may have | zones. An App instantiated in a cloud DC closest to clients may have | |||
| to cooperate with another App (or its mirror image) in another | to cooperate with another App (or its mirror image) in another | |||
| region or database server(s) in the on-premises DC. This kind of | region or database server(s) in the on-premises DC. This kind of | |||
| coordination requires predictable networking behavior/performance | coordination requires predictable networking behavior/performance | |||
| among those locations. | among those locations. | |||
| 7. Problems of Using SD-WAN to connect to Cloud DCs | 7. Problems of Using SD-WAN to connect to Cloud DCs | |||
| SD-WAN lets enterprises augment their current VPN network with cost- | SD-WAN lets enterprises augment their current VPN network with cost- | |||
| effective, readily available Broadband Internet connectivity, | effective, readily available Broadband Internet connectivity, | |||
| enabling some traffic offloading to paths over the Internet | enabling some traffic offloading to paths over the Internet | |||
| according to differentiated, possibly application-based traffic | according to differentiated, possibly application-based traffic | |||
| forwarding policies, or when the MPLS VPN connection between the two | forwarding policies, or when the MPLS VPN connection between the two | |||
| locations is congested, or otherwise undesirable or unavailable. | locations is congested, or otherwise undesirable or unavailable. | |||
| 7.1. More Complexity to Edge Nodes | 7.1. More Complexity to Edge Nodes | |||
| Augmenting transport paths is not as simple as it appears. For an | Augmenting transport paths is not as simple as it appears. For an | |||
| enterprise with multiple sites, CPE-managed overlay paths among | enterprise with multiple sites, CPE-managed overlay paths among | |||
| sites require each CPE to manage all the addresses that local hosts | sites require each CPE to manage all the addresses that local hosts | |||
| can potentially reach, i.e., map internal VPN addresses to | can potentially reach, i.e., map internal VPN addresses to | |||
| appropriate Overlay paths. This is similar to the complexity of | appropriate Overlay paths. This is similar to the complexity of | |||
| Frame Relay based VPNs, where each CPE needed to maintain mesh | Frame Relay based VPNs, where each CPE needed to maintain mesh | |||
| routing for all destinations if they were to avoid an extra hop | routing for all destinations if they were to avoid an extra hop | |||
| through a hub router. Even with the assistance from a central | through a hub router. Even with the assistance from a central | |||
| controller (instead of running a routing protocol) to resolve the | controller (instead of running a routing protocol) to resolve the | |||
| mapping between destinations and SD-WAN paths, SD-WAN CPEs are still | mapping between destinations and SD-WAN paths, SD-WAN CPEs are still | |||
| responsible for routing table maintenance as remote destinations | responsible for routing table maintenance as remote destinations | |||
| change their attachments, e.g., when dynamic workloads in other DCs | change their attachments, e.g., when dynamic workloads in other DCs | |||
| are decommissioned or added. | are decommissioned or added. | |||
| In addition, overlay paths interconnecting branch offices differ | In addition, overlay paths interconnecting branch offices differ | |||
| from those connecting to Cloud DCs: | from those connecting to Cloud DCs: | |||
| - Overlay paths interconnecting branch offices usually have two | - Overlay paths interconnecting branch offices usually have two | |||
| end-points (e.g., CPEs) controlled by one entity (e.g., | end-points (e.g., CPEs) controlled by one entity (e.g., | |||
| controllers or management systems operated by the enterprise). | controllers or management systems operated by the enterprise). | |||
| - Overlay paths connecting to a Cloud DC may involve CPEs owned or | - Overlay paths connecting to a Cloud DC may involve CPEs owned or | |||
| managed by the enterprise, with the remote end-points managed or | managed by the enterprise, with the remote end-points managed or | |||
| controlled by the Cloud DC. | controlled by the Cloud DC. | |||
| 7.2. Edge WAN Port Management | 7.2. Edge WAN Port Management | |||
| An SD-WAN edge node can have WAN ports connected to different | An SD-WAN edge node can have WAN ports connected to different | |||
| networks, or to the public Internet, managed by different operators. | networks, or to the public Internet, managed by different operators. | |||
| There is therefore a need to propagate WAN port properties to | There is therefore a need to propagate WAN port properties to | |||
| remote authorized peers in third party network domains in | remote authorized peers in third party network domains in | |||
| addition to route propagation. Such an exchange cannot happen | addition to route propagation. Such an exchange cannot happen | |||
| before communication between peers is properly secured. | before communication between peers is properly secured. | |||
| 7.3. Forwarding based on Application | 7.3. Forwarding based on Application | |||
| Forwarding based on application IDs instead of on | Forwarding based on application IDs instead of on | |||
| destination IP addresses is often referred to as Application Based | destination IP addresses is often referred to as Application Based | |||
| Segmentation. If the Applications have unique IP addresses, then | Segmentation. If the Applications have unique IP addresses, then | |||
| Application Based Segmentation can be achieved by propagating | Application Based Segmentation can be achieved by propagating | |||
| different BGP UPDATE messages to different nodes, as described in | different BGP UPDATE messages to different nodes, as described in | |||
| [BGP-SDWAN-USAGE]. If the Application cannot be uniquely | [BGP-SDWAN-USAGE]. If the Application cannot be uniquely | |||
| identified by the IP addresses, more work is needed. | identified by the IP addresses, more work is needed. | |||
| 8. End-to-End Security Concerns for Data Flows | 8. End-to-End Security Concerns for Data Flows | |||
| When IPsec tunnels established from enterprise on-premises CPEs | When IPsec tunnels established from enterprise on-premises CPEs | |||
| are terminated at the Cloud DC gateway where the workloads or | are terminated at the Cloud DC gateway where the workloads or | |||
| applications are hosted, some enterprises have concerns regarding | applications are hosted, some enterprises have concerns regarding | |||
| traffic to/from their workload being exposed to others behind the | traffic to/from their workload being exposed to others behind the | |||
| data center gateway (e.g., exposed to other organizations that | data center gateway (e.g., exposed to other organizations that | |||
| have workloads in the same data center). | have workloads in the same data center). | |||
| To ensure that traffic to/from workloads is not exposed to | To ensure that traffic to/from workloads is not exposed to | |||
| unwanted entities, IPsec tunnels may go all the way to the | unwanted entities, IPsec tunnels may go all the way to the | |||
| workload (servers, or VMs) within the DC. | workload (servers, or VMs) within the DC. | |||
| 9. Requirements for Dynamic Cloud Data Center VPNs | 9. Requirements for Dynamic Cloud Data Center VPNs | |||
| In order to address the aforementioned issues, any solution for | In order to address the aforementioned issues, any solution for | |||
| enterprise VPNs that includes connectivity to dynamic workloads or | enterprise VPNs that includes connectivity to dynamic workloads or | |||
| applications in cloud data centers should satisfy a set of | applications in cloud data centers should satisfy a set of | |||
| requirements: | requirements: | |||
| - The solution should allow enterprises to take advantage of the | - The solution should allow enterprises to take advantage of the | |||
| current state-of-the-art in VPN technology, in both traditional | current state-of-the-art in VPN technology, in both traditional | |||
| MPLS-based VPNs and IPsec-based VPNs (or any combination | MPLS-based VPNs and IPsec-based VPNs (or any combination | |||
| thereof) that run over the public Internet. | thereof) that run over the public Internet. | |||
| - The solution should not require an enterprise to upgrade all | - The solution should not require an enterprise to upgrade all | |||
| their existing CPEs. | their existing CPEs. | |||
| - The solution should support scalable IPsec key management among | - The solution should support scalable IPsec key management among | |||
| all nodes involved in DC interconnect schemes. | all nodes involved in DC interconnect schemes. | |||
| - The solution needs to support easy and fast, on-the-fly, VPN | - The solution needs to support easy and fast, on-the-fly, VPN | |||
| connections to dynamic workloads and applications in third | connections to dynamic workloads and applications in third | |||
| party data centers, and easily allow these workloads to migrate | party data centers, and easily allow these workloads to migrate | |||
| both within a data center and between data centers. | both within a data center and between data centers. | |||
| - The solution should allow VPNs to provide bandwidth and other | - The solution should allow VPNs to provide bandwidth and other | |||
| performance guarantees. | performance guarantees. | |||
| - The solution should be cost-effective for enterprises | - The solution should be cost-effective for enterprises | |||
| incorporating dynamic cloud-based applications and workloads into | incorporating dynamic cloud-based applications and workloads into | |||
| their existing VPN environment. | their existing VPN environment. | |||
| 10. Security Considerations | 10. Security Considerations | |||
| The draft discusses security requirements as a part of the problem | The draft discusses security requirements as a part of the problem | |||
| space, particularly in sections 4, 5, and 8. | space, particularly in sections 4, 5, and 8. | |||
| Solution drafts resulting from this work will address security | Solution drafts resulting from this work will address security | |||
| concerns inherent to the solution(s), including both protocol | concerns inherent to the solution(s), including both protocol | |||
| aspects and the importance (for example) of securing workloads in | aspects and the importance (for example) of securing workloads in | |||
| cloud DCs and the use of secure interconnection mechanisms. | cloud DCs and the use of secure interconnection mechanisms. | |||
| 11. IANA Considerations | 11. IANA Considerations | |||
| This document requires no IANA actions. RFC Editor: Please remove | This document requires no IANA actions. RFC Editor: Please remove | |||
| this section before publication. | this section before publication. | |||
| 12. References | 12. References | |||
| 12.1. Normative References | 12.1. Normative References | |||
| 12.2. Informative References | 12.2. Informative References | |||
| [RFC2735] B. Fox, et al NHRP Support for Virtual Private | [RFC2735] B. Fox, et al "NHRP Support for Virtual Private | |||
| networks . Dec. 1999. | networks". Dec. 1999. | |||
| [RFC8192] S. Hares, et al Interface to Network Security Functions | [RFC8192] S. Hares, et al "Interface to Network Security Functions | |||
| (I2NSF) Problem Statement and Use Cases , July 2017 | (I2NSF) Problem Statement and Use Cases", July 2017 | |||
| [ITU-T-X1036] ITU-T Recommendation X.1036, Framework for creation, | [ITU-T-X1036] ITU-T Recommendation X.1036, "Framework for creation, | |||
| storage, distribution and enforcement of policies for | storage, distribution and enforcement of policies for | |||
| network security , Nov 2007. | network security", Nov 2007. | |||
| [RFC6071] S. Frankel and S. Krishnan, IP Security (IPsec) and | [RFC6071] S. Frankel and S. Krishnan, "IP Security (IPsec) and | |||
| Internet Key Exchange (IKE) Document Roadmap , Feb 2011. | Internet Key Exchange (IKE) Document Roadmap", Feb 2011. | |||
| [RFC4364] E. Rosen and Y. Rekhter, BGP/MPLS IP Virtual Private | [RFC4364] E. Rosen and Y. Rekhter, "BGP/MPLS IP Virtual Private | |||
| Networks (VPNs) , Feb 2006 | Networks (VPNs)", Feb 2006 | |||
| [RFC4664] L. Andersson and E. Rosen, Framework for Layer 2 Virtual | [RFC4664] L. Andersson and E. Rosen, "Framework for Layer 2 Virtual | |||
| Private Networks (L2VPNs) , Sept 2006. | Private Networks (L2VPNs)", Sept 2006. | |||
| [BGP-SDWAN] L. Dunbar, et al. BGP Extension for SDWAN Overlay | [BGP-SDWAN] L. Dunbar, et al. "BGP Extension for SDWAN Overlay | |||
| Networks , draft-dunbar-idr-bgp-sdwan-overlay-ext-03, | Networks", draft-dunbar-idr-bgp-sdwan-overlay-ext-03, | |||
| work-in-progress, Nov 2018. | work-in-progress, Nov 2018. | |||
| 13. Acknowledgments | 13. Acknowledgments | |||
| Many thanks to Alia Atlas, Chris Bowers, Paul Vixie, Paul Ebersman, | Many thanks to Alia Atlas, Chris Bowers, Paul Vixie, Paul Ebersman, | |||
| Timothy Morizot, Ignas Bagdonas, Michael Huang, Liu Yuan Jiao, | Timothy Morizot, Ignas Bagdonas, Michael Huang, Liu Yuan Jiao, | |||
| Katherine Zhao, and Jim Guichard for the discussion and | Katherine Zhao, and Jim Guichard for the discussion and | |||
| contributions. | contributions. | |||
| Authors Addresses | Authors' Addresses | |||
| Linda Dunbar | Linda Dunbar | |||
| Futurewei | Futurewei | |||
| Email: Linda.Dunbar@futurewei.com | Email: Linda.Dunbar@futurewei.com | |||
| Andrew G. Malis | Andrew G. Malis | |||
| Independent | Independent | |||
| Email: agmalis@gmail.com | Email: agmalis@gmail.com | |||
| Christian Jacquenet | Christian Jacquenet | |||
| Orange | Orange | |||
| Rennes, 35000 | Rennes, 35000 | |||
| France | France | |||
| Email: Christian.jacquenet@orange.com | Email: Christian.jacquenet@orange.com | |||
| Mehmet Toy | Mehmet Toy | |||
| Verizon | Verizon | |||
| One Verizon Way | One Verizon Way | |||
| Basking Ridge, NJ 07920 | Basking Ridge, NJ 07920 | |||
| Email: mehmet.toy@verizon.com | Email: mehmet.toy@verizon.com | |||
| End of changes. 142 change blocks. | ||||
| 665 lines changed or deleted | 665 lines changed or added | |||