idnits 2.17.1

draft-ietf-ipsecme-ad-vpn-problem-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The document has examples using IPv4 documentation addresses according
     to RFC6890, but does not use any IPv6 documentation addresses.  Maybe
     there should be IPv6 examples, too?

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 16, 2013) is 3909 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

     No issues found here.

     Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

IPsecME Working Group                                          V. Manral
Internet-Draft                                                        HP
Intended status: Informational                                  S. Hanna
Expires: January 17, 2014                                        Juniper
                                                           July 16, 2013

         Auto Discovery VPN Problem Statement and Requirements
                  draft-ietf-ipsecme-ad-vpn-problem-09

Abstract

This document describes the problem of enabling a large number of systems to communicate directly using IPsec to protect the traffic between them. It then expands on the requirements for such a solution.

Manual configuration of all possible tunnels is too cumbersome in many such cases. In other cases, the IP addresses of endpoints change or the endpoints may be behind NAT gateways, making static configuration impossible. The Auto Discovery VPN solution will address these requirements.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on January 17, 2014.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1.  Introduction
  1.1.  Terminology
  1.2.  Conventions Used in This Document
2.  Use Cases
  2.1.  Endpoint-to-Endpoint VPN Use Case
  2.2.  Gateway-to-Gateway VPN Use Case
  2.3.  Endpoint-to-Gateway VPN Use Case
3.  Inadequacy of Existing Solutions
  3.1.  Exhaustive Configuration
  3.2.  Star Topology
  3.3.  Proprietary Approaches
4.  Requirements
  4.1.  Gateway and Endpoint Requirements
5.  Security Considerations
6.  IANA Considerations
7.  Acknowledgements
8.  Normative References
Authors' Addresses

1.  Introduction

IPsec [RFC4301] is used in several different cases, including tunnel-mode site-to-site VPNs and Remote Access VPNs. Both tunneling modes for IPsec gateways and host-to-host transport mode are supported in this document.

The subject of this document is the problem presented by large-scale deployments of IPsec and the requirements on a solution to address the problem. Such a deployment may be a large collection of VPN gateways connecting various sites, a large number of remote endpoints connecting to a number of gateways or to each other, or a mix of the two. The gateways and endpoints may belong to a single administrative domain or to several domains with a trust relationship.

Section 4.4 of RFC 4301 describes the major IPsec databases needed for IPsec processing. Each tunnel requires extensive configuration, so manually configuring a system of many gateways and endpoints becomes infeasible and inflexible.

The difficulty is that much of the configuration mentioned in RFC 4301 is required to set up a Security Association. IKE implementations need to know the identity and credentials of all possible peer systems, as well as the addresses of hosts and/or networks behind them. A simplified mechanism for dynamically establishing point-to-point tunnels is needed. Section 2 contains several use cases that motivate this effort.

1.1.  Terminology

ADVPN - Auto Discovery Virtual Private Network (ADVPN) is a VPN solution that enables a large number of systems to communicate directly, with minimal configuration and operator intervention, using IPsec to protect communication between them.

Endpoint - A device that implements IPsec for its own traffic but does not act as a gateway.

Gateway - A network device that implements IPsec to protect traffic flowing through the device.

Point-to-Point - Communication between two parties without active participation (e.g., encryption or decryption) by any other parties.

Hub - The central point in a star topology/dynamic full mesh topology, or one of the central points in a full mesh style VPN, i.e., a gateway to which multiple other hubs or spokes connect. The hubs usually forward traffic coming from encrypted links to other encrypted links, i.e., there are no devices connected to them in the clear.

Spoke - The endpoint in a star topology/dynamic full mesh topology, or a gateway that forwards traffic from multiple cleartext devices to other hubs or spokes, where some of those devices are connected to it in the clear (i.e., it encrypts data coming from cleartext devices and forwards it to the ADVPN).

ADVPN Peer - Any member of an ADVPN, including gateways, endpoints, hubs, and spokes.

Star topology - The topology in which there is direct connectivity only between the hub and each spoke, and communication between two spokes happens through the hub.

Allied and Federated Environments - Environments in which multiple different organizations have a close association and need to connect to each other.

Full Mesh topology - The topology in which there is direct connectivity between every spoke and every other spoke, without the traffic between the spokes having to be redirected through an intermediate hub device.

Dynamic Full Mesh topology - The topology in which direct connections exist in a hub-and-spoke manner, but dynamic connections are created/removed between the spokes on an as-needed basis.

Security Association (SA) - Defined in [RFC4301].

1.2.  Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2.  Use Cases

This section presents the key use cases for large-scale point-to-point VPN.

In all of these use cases, the participants (endpoints and gateways) may be from a single organization (administrative domain) or from multiple organizations with an established trust relationship. When multiple organizations are involved, products from multiple vendors are employed, so open standards are needed to provide interoperability. Establishing communications between participants with no established trust relationship is out of scope for this effort.

2.1.  Endpoint-to-Endpoint VPN Use Case

Two endpoints wish to communicate securely via a point-to-point Security Association (SA).

The need for secure endpoint-to-endpoint communications is often driven by a need to employ high-bandwidth, low-latency local connectivity instead of using slow, expensive links to remote gateways. For example, two users in close proximity may wish to place a direct, secure video or voice call without needing to send the call through remote gateways, which would add latency to the call, consume precious remote bandwidth, and increase overall costs. Such a use case also enables connectivity when both users are behind NAT gateways. Such a use case ought to allow for seamless connectivity even as endpoints roam, even if they are moving out from behind a NAT gateway, from behind one NAT gateway to behind another, or from a standalone position to behind a NAT gateway.

In a star topology, when two endpoints communicate, they need a mechanism for authentication, such that they do not expose themselves to impersonation by the other spoke endpoint.

2.2.  Gateway-to-Gateway VPN Use Case

A typical Enterprise traffic model follows a star topology, with the gateways connecting to each other using IPsec tunnels.

However, for voice and other rich media traffic that requires a lot of bandwidth or is performance sensitive, the traffic tromboning (taking a suboptimal path) to the hub can create traffic bottlenecks on the hub and can lead to an increase in cost. A fully meshed solution would make best use of the available network capacity and performance, but the deployment of a fully meshed solution involves considerable configuration, especially when a large number of nodes are involved. It is for this purpose that spoke-to-spoke tunnels are dynamically created and torn down. For reasons of cost and manual error reduction, it is desired that there be minimal configuration on each gateway.
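
The configuration burden described above can be made concrete with a small counting sketch. It is only an illustration, not part of any ADVPN proposal: it assumes a single hub for the star case and one statically configured tunnel per directly connected pair of gateways, and the function name static_tunnels is hypothetical.

```python
def static_tunnels(n_gateways: int, topology: str) -> int:
    """Count the tunnels that would need static configuration.

    Assumptions (for illustration only): a star has exactly one hub,
    which is included in n_gateways, and every directly connected pair
    of gateways needs one configured tunnel.
    """
    if n_gateways < 2:
        return 0
    if topology == "star":
        # Each spoke has a single tunnel to the hub.
        return n_gateways - 1
    if topology == "full_mesh":
        # Every unordered pair of gateways has its own tunnel.
        return n_gateways * (n_gateways - 1) // 2
    raise ValueError(f"unknown topology: {topology}")


for n in (10, 100, 1000):
    print(n, static_tunnels(n, "star"), static_tunnels(n, "full_mesh"))
```

Under these assumptions the star grows linearly with the number of gateways while the full mesh grows quadratically, which is why a full mesh is attractive for traffic but impractical to configure by hand.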

The solution ought to work in cases where the endpoints are in different administrative domains, albeit ones that have an existing trust relationship (for example, two organizations collaborating on a project may wish to join their networks while retaining independent control over configuration). It is highly desirable that the solution work for the star, full mesh, and dynamic full mesh topologies.

The solution ought to also address the case where gateways use dynamic IP addresses.

Additionally, the routing implications of gateway-to-gateway communication need to be addressed. In the simple case, selectors provide sufficient information for a gateway to forward traffic appropriately. In other cases, additional tunneling (e.g., Generic Routing Encapsulation (GRE)) and routing (e.g., Open Shortest Path First (OSPF)) protocols are run over IPsec tunnels, and the configuration impact on those protocols needs to be considered. There is also the case where Layer-3 Virtual Private Networks (L3VPNs) operate over IPsec tunnels.

When two gateways communicate, they need to use a mechanism for authentication, such that they do not expose themselves to the risk of impersonation by other entities.

2.3.  Endpoint-to-Gateway VPN Use Case

A mobile endpoint ought to be able to use the most efficient gateway as it roams on the Internet.

A mobile user roaming on the Internet may connect to a gateway that, because of roaming, is no longer the most efficient gateway to use (the reasons could be cost, efficiency, latency, or some other factor). The mobile user ought to be able to discover and then connect to the current most efficient gateway in a seamless way without having to bring down the connection.

3.  Inadequacy of Existing Solutions

Several solutions exist for the problems described above.
However, none of these solutions is adequate, as described here.

3.1.  Exhaustive Configuration

One simple solution is to configure all gateways and endpoints in advance with all the information needed to determine which gateway or endpoint is optimal and to establish an SA with that gateway or endpoint. However, this solution does not scale in a large network with hundreds of thousands of gateways and endpoints, especially when multiple administrative domains are involved and things are rapidly changing (e.g., mobile endpoints). Such a solution is also limited by the smallest endpoint/gateway, as the same exhaustive configuration has to be applied on all endpoints/gateways. A more dynamic, secure, and scalable system for establishing SAs between gateways is needed.

3.2.  Star Topology

The most common way to address a part of this problem today is to use what has been termed a "star topology". In this case, one or a few gateways are defined as "hub gateways", while the rest of the systems (whether endpoints or gateways) are defined as "spokes". The spokes never connect to other spokes. They only open tunnels with the hub gateways. Also, for a large number of gateways in one administrative domain, one gateway may be defined as the hub, and the rest of the gateways and remote access clients connect only to that gateway.

This solution, however, is complicated by the case where the spokes use dynamic IP addresses and DNS with dynamic updates needs to be used. It is also desired that there be minimal to no configuration on the hub as the number of spokes increases and new spokes are added and deleted randomly.

Another problem with the star topology is that it creates a high load on the hub gateways as well as on the connection between the spokes and the hub. This load is both in processing power and in network bandwidth.
A single packet in the hub-and-spoke scenario can be encrypted and decrypted multiple times. It would be much preferable if these gateways and clients could initiate tunnels between them, bypassing the hub gateways. Additionally, the path bandwidth to these hub gateways may be lower than that of the path between the spokes. For example, two remote access users may be in the same building with high-speed wifi (for example, at an IETF meeting). Channeling their conversation through the hub gateways of their respective employers seems extremely wasteful, and it also offers lower bandwidth.

The challenge is to build large-scale, IPsec-protected networks that can dynamically change with minimal administrative overhead.

3.3.  Proprietary Approaches

Several vendors offer proprietary solutions to these problems. However, these solutions offer no interoperability between equipment from one vendor and another. This means that they are generally restricted to use within one organization, and it is harder to move off such solutions as the features are not standardized. Besides, multiple organizations cannot be expected to all choose the same equipment vendor.

4.  Requirements

This section defines the requirements on which the solution will be based.

4.1.  Gateway and Endpoint Requirements

1.  For any network topology (star, full mesh, and dynamic full mesh), when a new gateway or endpoint is added, removed, or changed, configuration changes are minimized as follows. Adding or removing a spoke in the topology MUST NOT require configuration changes to hubs other than the one to which the spoke was connected and SHOULD NOT require configuration changes to the hub to which the spoke was connected. The changes also MUST NOT require configuration changes in other spokes.

    Specifically, when evaluating potential proposals, we will compare them by looking at how many endpoints or gateways must be reconfigured when a new gateway or endpoint is added, removed, or changed, and how substantial this reconfiguration is, in addition to the amount of static configuration required.

    This requirement is driven by use cases 2.1 and 2.2 and by the scaling limitations pointed out in Section 3.1.

2.  ADVPN Peers MUST allow IPsec tunnels to be set up with other members of the ADVPN without any configuration changes, even when peer addresses get updated every time the device comes up. This implies that Security Policy Database (SPD) entries or other configuration based on a peer IP address will need to be automatically updated, avoided, or handled in some manner to avoid a need to manually update policy whenever an address changes.

3.  In many cases, additional tunneling protocols (e.g., GRE) or routing protocols (e.g., OSPF) are run over the IPsec tunnels. Gateways MUST allow for the operation of tunneling and routing protocols over spoke-to-spoke IPsec tunnels with minimal or no configuration impact. The ADVPN solution SHOULD NOT increase the amount of information required to configure protocols running over IPsec tunnels.

4.  In the full mesh and dynamic full mesh topologies, spokes MUST allow for direct communication with other spoke gateways and endpoints. In the star topology mode, direct communication between spokes MUST be disallowed.

    This requirement is driven by use cases 2.1 and 2.2 and by the limitations of a star topology pointed out in Section 3.2.

5.  ADVPN Peers MUST NOT have a way to get the long-term authentication credentials of any other ADVPN Peers. The compromise of an Endpoint MUST NOT affect the security of communications between other ADVPN Peers.
    The compromise of a Gateway SHOULD NOT affect the security of the communications between ADVPN Peers not associated with that Gateway.

    This requirement is driven by use case 2.1. ADVPN Peers (especially spokes) become compromised fairly often. The compromise of one ADVPN Peer SHOULD NOT affect the security of other unrelated ADVPN Peers.

6.  Gateways SHOULD allow for seamless handoff of sessions when endpoints are roaming, even if they cross policy boundaries. This means that the data traffic is minimally affected even as the handoff happens. External factors such as firewalls and NAT boxes that are part of the overall deployment when ADVPN is deployed will not be considered part of this solution.

    Such endpoint roaming may affect not only the endpoint-to-endpoint SA but also the relationship between the endpoints and gateways (such as when an endpoint roams to a new network that is handled by a different gateway).

    This requirement is driven by use case 2.1. Today's endpoints are mobile and transition often between different networks (from 4G to WiFi and among various WiFi networks).

7.  Gateways SHOULD allow for easy handoff of a session to another gateway, to optimize latency, bandwidth, load balancing, availability, or other factors, based on policy.

    This ability to migrate traffic from one gateway to another applies regardless of whether the gateways in question are hubs or spokes. It even applies in the case where a gateway (hub or spoke) moves in the network, as may happen with a vehicle-based network.

    This requirement is driven by use case 2.3.

8.  Gateways and endpoints MUST have the capability to participate in an ADVPN even when they are located behind NAT boxes. However, in some cases they may be deployed in such a way that they will not be fully reachable behind a NAT box.
    It is especially difficult to handle cases where the hub is behind a NAT box. Where two endpoints are both behind separate NATs, communication between these spokes SHOULD be supported using workarounds such as port forwarding by the NAT, or by detecting when two spokes are behind uncooperative NATs and using a hub in that case.

    This requirement is driven by use cases 2.1 and 2.2. Endpoints are often behind NATs, and gateways sometimes are. IPsec SHOULD continue to work seamlessly regardless, using ADVPN techniques whenever possible and providing graceful fallback to hub-and-spoke techniques as needed.

9.  Changes such as establishing a new IPsec SA SHOULD be reportable and manageable. However, creating a MIB or other management technique is not within scope for this effort.

    This requirement is driven by manageability concerns for all the use cases, especially use case 2.2. As IPsec networks become more dynamic, management tools become more essential.

10. To support allied and federated environments, endpoints and gateways from different organizations SHOULD be able to connect to each other.

    This requirement is driven by demand for all the use cases in federated and allied environments.

11. The administrator of the ADVPN SHOULD allow for the configuration of a star, full mesh, or partial full mesh topology, based on which tunnels are allowed to be set up.

    This requirement is driven by demand for all the use cases in federated and allied environments.

12. The ADVPN solution SHOULD be able to scale for multicast traffic.

    This requirement is driven by use case 2.2, where the amount of rich media multicast traffic is increasing.

13. The ADVPN solution SHOULD allow for easy monitoring, logging, and reporting of the dynamic changes, to help in troubleshooting such environments.

    This requirement is driven by demand for all the use cases in federated and allied environments.

14. There is also the case where L3VPNs operate over IPsec tunnels, for example, Provider Edge (PE) based VPNs. An ADVPN MUST support L3VPN as an application protected by the IPsec tunnels.

    This requirement is driven by demand for all the use cases in federated and allied environments.

15. The ADVPN solution SHOULD allow the enforcement of per-peer QoS in both the star and the full mesh topologies.

    This requirement is driven by demand for all the use cases in federated and allied environments.

16. The ADVPN solution SHOULD take care to not let the hub be a single point of failure.

    This requirement is driven by demand for all the use cases in federated and allied environments.

5.  Security Considerations

This is a problem statement and requirements document for the ADVPN solution and in itself does not introduce any new security concerns. The solution to the problems presented in this draft may involve dynamic updates to databases defined by RFC 4301, such as the Security Policy Database (SPD) or the Peer Authorization Database (PAD).

RFC 4301 is silent about the way these databases are populated, and it is implied that these databases are static and pre-configured by a human. Allowing dynamic updates to these databases must be thought out carefully, because it allows the protocol to alter the security policy that the IPsec endpoints implement.

One obvious attack to watch out for is stealing traffic to a particular site. Suppose the IP address for www.example.com is 192.0.2.10. If we add an entry to an IPsec endpoint's SPD that says that traffic to 192.0.2.10 is protected through peer Gw-Mallory, then this allows Gw-Mallory to either pretend to be www.example.com or to proxy and read all traffic to that site.
Updates to this database require a clear trust model.

Hubs can be a single point of failure that can cause loss of connectivity for the entire system, which can be a significant security issue. Any ADVPN solution design should take care of these concerns.

6.  IANA Considerations

No actions are required from IANA for this informational document.

7.  Acknowledgements

Many people have contributed to the development of this problem statement, and many more will probably do so before we are done with it. While we cannot thank all contributors, some have played an especially prominent role. Yoav Nir, Yaron Sheffer, Jorge Coronel Mendoza, Chris Ulliott, and John Veizades wrote the document upon which this draft was based. Geoffrey Huang, Toby Mao, Suresh Melam, Praveen Sathyanarayan, Andreas Steffen, Brian Weis, and Lou Berger provided essential input.

8.  Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4301]  Kent, S. and K. Seo, "Security Architecture for the Internet Protocol", RFC 4301, December 2005.

Authors' Addresses

Vishwas Manral
Hewlett-Packard Co.
19111 Pruneridge Ave.
Cupertino, CA 95113
USA

Email: vishwas.manral@hp.com

Steve Hanna
Juniper Networks, Inc.
1194 N. Mathilda Ave.
Sunnyvale, CA 94089
USA

Email: shanna@juniper.net