idnits 2.17.1 draft-ietf-l3vpn-end-system-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The abstract seems to contain references ([RFC4364]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. == There are 11 instances of lines with private range IPv4 addresses in the document. If these are generic example addresses, they should be changed to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x, 198.51.100.x or 203.0.113.x. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 308: '... VPN Forwarder MAY be co-located in ...' RFC 2119 keyword, line 342: '...irtual MAC address which SHOULD be the...' RFC 2119 keyword, line 344: '... virtual MAC address SHALL default to the VRRP [RFC5798] virtual...' RFC 2119 keyword, line 406: '... MUST implement the following functi...' RFC 2119 keyword, line 432: '...he VPN Forwarder MAY support the abili...' (17 more instances...) 
Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (September 18, 2014) is 3500 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC4271' is defined on line 1006, but no explicit reference was found in the text -- Obsolete informational reference (is this intentional?): RFC 5575 (Obsoleted by RFC 8955) == Outdated reference: A later version (-11) exists of draft-ietf-mpls-in-udp-03 Summary: 3 errors (**), 0 flaws (~~), 4 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group P. Marques 3 Internet-Draft Juniper Networks 4 Intended status: Standards Track L. Fang 5 Expires: March 22, 2015 Microsoft 6 N. Sheth 7 Juniper Networks 8 M. Napierala 9 AT&T Labs 10 N. Bitar 11 Verizon 12 September 18, 2014 14 BGP-signaled end-system IP/VPNs. 15 draft-ietf-l3vpn-end-system-03 17 Abstract 19 This document describes a solution in which the control plane 20 protocol specified in BGP/MPLS IP VPNs [RFC4364] is used to provide a 21 Virtual Network service to end-systems. These end-systems may be 22 used to provide network services or may directly host end-to-end 23 applications. 25 Status of This Memo 27 This Internet-Draft is submitted in full conformance with the 28 provisions of BCP 78 and BCP 79. 30 Internet-Drafts are working documents of the Internet Engineering 31 Task Force (IETF). Note that other groups may also distribute 32 working documents as Internet-Drafts. 
The list of current Internet- 33 Drafts is at http://datatracker.ietf.org/drafts/current/. 35 Internet-Drafts are draft documents valid for a maximum of six months 36 and may be updated, replaced, or obsoleted by other documents at any 37 time. It is inappropriate to use Internet-Drafts as reference 38 material or to cite them other than as "work in progress." 40 This Internet-Draft will expire on March 22, 2015. 42 Copyright Notice 44 Copyright (c) 2014 IETF Trust and the persons identified as the 45 document authors. All rights reserved. 47 This document is subject to BCP 78 and the IETF Trust's Legal 48 Provisions Relating to IETF Documents 49 (http://trustee.ietf.org/license-info) in effect on the date of 50 publication of this document. Please review these documents 51 carefully, as they describe your rights and restrictions with respect 52 to this document. Code Components extracted from this document must 53 include Simplified BSD License text as described in Section 4.e of 54 the Trust Legal Provisions and are provided without warranty as 55 described in the Simplified BSD License. 57 Table of Contents 59 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 60 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 2 61 2. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 3 62 3. Applicability of BGP IP VPNs . . . . . . . . . . . . . . . . 4 63 4. Virtual network end-points . . . . . . . . . . . . . . . . . 7 64 5. VPN Forwarder . . . . . . . . . . . . . . . . . . . . . . . . 8 65 6. XMPP signaling protocol . . . . . . . . . . . . . . . . . . . 10 66 7. End-System Route Server behavior . . . . . . . . . . . . . . 16 67 8. Operational Model . . . . . . . . . . . . . . . . . . . . . . 17 68 9. Security Considerations . . . . . . . . . . . . . . . . . . . 20 69 10. XML schema . . . . . . . . . . . . . . . . . . . . . . . . . 20 70 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 22 71 12. References . . . . . . . 
. . . . . . . . . . . . . . . . . . 22 72 12.1. Normative References . . . . . . . . . . . . . . . . . . 22 73 12.2. Informational References . . . . . . . . . . . . . . . . 23 74 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 23 76 1. Introduction 78 This document describes the requirements for a network virtualization 79 solution that provides an IP service to end-system virtual 80 interfaces. It then discusses how the BGP IP VPNs [RFC4364] control 81 plane can be used to provide a solution that meets these 82 requirements. Subsequent sections provide a detailed discussion of 83 the control and forwarding plane components. 85 In BGP IP VPNs, Customer Edge (CE) interfaces connect to a Provider 86 Edge (PE) device which provides both the control plane and VPN 87 encapsulation functions required to implement a Virtual Network 88 service. This document decouples the control plane and forwarding 89 functionality of the PE device in order to enable the forwarding 90 functionality to be implemented in multiple devices. For instance, 91 the forwarding function can be implemented directly on the operating 92 system of application servers or network appliances. 94 1.1. Terminology 96 This document makes use of the following terms: 98 End-System: A compute node whose primary function is to run 99 applications. It is assumed that end-systems support multiple 100 application instances (e.g. virtual-machines), each with its 101 independent network configuration. 103 End-System Route Server: A software application that implements the 104 control plane functionality of a BGP IP VPN PE device and an XMPP 105 server that interacts with VPN Forwarders. 107 Virtual Interface: An interface in an end-system that is used by a 108 virtual machine or by applications. It performs the role of a CE 109 interface in a BGP IP VPN network. 111 VPN Forwarder: The forwarding component of a BGP IP VPN PE device. 
112 This functionality may be co-located with the virtual interface or 113 implemented by an external device. 115 2. Requirements 117 Network Virtualization is used in both service provider and 118 enterprise networks to support multi-tenancy and network-based access 119 control. It may also be used to facilitate application instance 120 mobility. 122 Multi-tenancy allows a physical network to provide services to 123 multiple "customers" or "tenants", whether these are external 124 entities in the case of a Service Provider providing managed VPN 125 services or internal departments sharing an IT facility. Multi- 126 tenancy requires isolation of traffic and routing information between 127 tenants. 129 Within a tenant, it is often required to create multiple distinct 130 virtual networks, in order to be able to provide network-based access 131 control. In this service model, each virtual network behaves as a 132 "Closed User Group" (CUG) of virtual interfaces that are allowed to 133 exchange traffic freely, while traffic between virtual networks is 134 subject to access controls. This scenario can be found in 135 enterprise campus networks, branch offices and data-centers. 137 When network access control is used, it is often the case that 138 the traffic patterns are such that there is significantly more traffic 139 crossing a CUG boundary than staying within such a boundary. As an 140 example, in campus networks it is common to segregate users into CUGs 141 based on some classification such as the user's department. Campus 142 networks often see traffic patterns in which almost all the traffic 143 flows northbound to the data-center or internet boundaries. Similar 144 traffic patterns can be found in multi-tier applications in IT data- 145 centers. 147 Virtual interfaces are often configured to expect the concept of IP 148 subnet to match their closed user group. 
A network virtualization 149 solution should be able to provide this concept of IP subnet 150 regardless of whether the underlying implementation uses a multi- 151 access network or not. 153 Virtual interfaces should be able to directly access multiple closed 154 user groups without needing to traverse a gateway. Network access 155 policy should allow this access whether the source and destination 156 CUGs for a particular traffic flow belong to the same tenant or 157 different tenants. It is often the case that infrastructure services 158 are provided to multiple tenants. One such example is voice-over-IP 159 gateway services for branch offices. 161 Independently, but often associated with the previous two functions, 162 IP mobility is another network function that can be implemented using 163 network virtualization. By abstracting the externally visible 164 network address from the underlying infrastructure address, mobility 165 can be implemented without having to resort to home agents or large L2 166 broadcast domains. 168 IP Mobility requires the ability to "move" a virtual interface 169 without disrupting its TCP (or UDP) transport sessions. This 170 requires a mechanism that can efficiently communicate the mappings 171 between logical and physical addressing. 173 IP Mobility can be a result of devices physically moving (e.g., a 174 WiFi enabled laptop) or workload being diverted between physical 175 systems such as network appliances or application servers. 177 3. Applicability of BGP IP VPNs 179 BGP IP VPNs [RFC4364] is the industry de-facto standard for providing 180 "closed user group" functionality in WAN environments. It is used by 181 service providers in environments where several millions of routes 182 are present. It supports both isolated VPNs as well as overlapping 183 VPNs (often referred to as "extranets"). 
185 The BGP IP VPN control plane has been designed to be able to 186 distribute the mapping between virtual address and location (next- 187 hop) to the subset of network nodes for which this information is 188 relevant, whenever that mapping changes. This provides an efficient 189 mechanism to address IP mobility requirements as compared to methods 190 that depend on a (cached) mapping request from the end-systems. 192 In its traditional usage in Service Provider networks, BGP IP VPN 193 functionality is implemented in a Provider Edge (PE) device that 194 combines BGP signaling and VRF-based forwarding 195 functions. In practice, most PE devices in current use are multi- 196 component systems with the signaling and forwarding functionality 197 actually implemented in different processors attached to an internal 198 network. 200 This document assumes a similar separation of functionality in which 201 software appliances, the End-System Route Servers, implement the 202 control plane functionality of a PE device and a VPN Forwarder 203 implements the forwarding function usually found in a PE device 204 "line-card". The VPN Forwarder functionality may be co-located with 205 the end-system (e.g., implemented in the hypervisor switch or host OS 206 network drivers) or it may be external, for instance residing in a 207 data-center switch or specialized appliance. 209 Operationally, BGP IP VPN technology has several important 210 characteristics: 212 It has a high level of aggregation between customer interfaces and 213 managed entities (Provider Edge devices). 215 It defines VPNs as policies, allowing an interface to directly 216 exchange traffic with multiple VPNs and allowing for the topology 217 of the virtual network to be modified by modifying the policy 218 configuration. 220 It scales horizontally in terms of event propagation. 
By 221 increasing the number of signaling devices implementing the PE 222 control plane, it is possible to decrease the load on each 223 signaling device when it comes to propagating events that 224 originate in a specific location and must be propagated across the 225 network. 227 The last point is particularly relevant to the convergence 228 characteristics required for large scale deployments. BGP's 229 hierarchical route distribution capabilities allow a deployment to 230 divide the workload by increasing the number of End-System Route 231 Servers. 233 As an example, consider a topology in which 100 End-System Route 234 Servers are deployed in a network each serving a subset of the VPN 235 forwarding elements. The Route Servers inter-connect to two top- 236 level BGP Route Reflectors [RFC4456]. 238 If an event (i.e. a VPN route change) needs to be propagated from a 239 specific end-system to 10,000 clients randomly distributed across the 240 network, each of the End-System Route Servers must generate 100 241 updates to its respective downstream clients. 243 If this topology is modified such that another 100 End-System Route 244 Servers are added, each Route Server is then responsible for 245 generating 50 client updates. This example illustrates the linear 246 scaling properties of BGP: doubling the number of Route Servers (i.e. 247 the processing capacity) halves the number of updates 248 generated by each (i.e. load at each processing node). 250 The same horizontal scaling techniques can be applied to the Route 251 Reflector layer in the example above by subsetting the VPN Route 252 space according to some pre-defined criteria (for instance VPN route 253 target) and using a pair of Route Reflectors per subset. 255 In the previous example we assumed a dense membership in which all 256 Route Servers have local clients that are interested in a particular 257 event. BGP also optimizes the route distribution for sparse events. 
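The dense-membership figures in the example above follow from a simple division. A minimal sketch of that arithmetic, assuming clients are spread evenly across Route Servers (the function name is hypothetical and not part of this specification):

```python
def updates_per_route_server(clients: int, route_servers: int) -> float:
    """Average number of client updates one Route Server generates for a
    single event, assuming an even spread of clients (an assumption)."""
    if route_servers <= 0:
        raise ValueError("route_servers must be positive")
    return clients / route_servers


# 10,000 interested clients behind 100 Route Servers, then behind 200.
before = updates_per_route_server(10_000, 100)
after = updates_per_route_server(10_000, 200)
print(before, after)
```

Doubling the number of Route Servers halves the per-server figure, which is the linear scaling property described in the text.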
258 The Route Target Constraint [RFC4684] extension builds an optimal 259 distribution tree for message propagation based on VPN membership. 260 It ensures that only the PEs with local receivers for a particular 261 event receive it, which also decreases the total load on the upstream 262 BGP speaker. 264 In the WAN environment, BGP IP VPN control plane scaling is focused 265 not primarily on route convergence times but on memory footprint of 266 embedded devices. While memory footprint does not have a similar 267 linear scaling behavior, memory technology available to software 268 appliances is often at 10x the scale of what is commonly found in WAN 269 environments. 271 The functionality present in the BGP IP VPN control plane addresses 272 the requirements specified in the previous section. Specifically, it 273 supports multiple potentially overlapping "groups", regular or "hub 274 and spoke" topologies and the scaling characteristics necessary. 276 The BGP IP VPN control plane supports not only the definition of 277 "closed user-groups" (VPNs in its terminology) but also the 278 propagation of inter-VPN traffic policies [RFC5575]. 280 Note that the signaling protocol itself is rather agnostic of the 281 encapsulation used on the wire as long as this encapsulation has the 282 ability to carry a 20 bit label. 284 Several network environments use a network infrastructure that is 285 only capable of providing an IP unicast service. In order to support 286 them, implementations of this document should support the MPLS in GRE 287 [RFC4023] encapsulation. Other encapsulations are possible, 288 including UDP based encapsulations [I-D.ietf-mpls-in-udp]. 290 4. Virtual network end-points 292 This document assumes that end-systems support one or more virtual 293 network interfaces in addition to a physical interface that is 294 associated with the underlying network infrastructure. 
Virtual 295 network interfaces can be associated with a restricted list of 296 applications via OS-dependent mechanisms, a Virtual Machine (VM), or 297 they can be used to provide network connectivity to all user 298 applications in the same way that a "VPN tunnel" interface is used to 299 provide access between an end-system (e.g., a laptop) and a remote 300 corporate network. 302 From an IP address assignment point of view, a virtual network 303 interface is addressed out of the virtual IP topology and associated 304 with a "closed user group" or VPN, while the physical interface of 305 the machine is addressed in the network infrastructure topology. 307 A virtual network interface is connected to a VPN Forwarder. This 308 VPN Forwarder MAY be co-located in the end-system or external. 310 Both static and dynamic IP address allocation can be supported. The 311 latter assumes that the VPN Forwarder implements a DHCP relay or DHCP 312 proxy functionality. 314 Traffic that ingresses or egresses through a virtual network 315 interface is routed at the VPN Forwarder which acts as the first-hop 316 router (in the virtual topology). The IP configuration on the client 317 side of this virtual network interface (e.g., in the guest OS) can 318 follow one of two models: 320 point-to-point interface model. 322 multipoint interface model. 324 In a point-to-point interface model, the VPN client routing table 325 (e.g., on the guest OS) contains the following routing entries: a 326 host route to the local IP address, a host route to the first-hop 327 router via the virtual interface and a default route to the first-hop 328 router. This is the model typically used in "VPN tunnel" 329 configurations or other access technologies such as cable deployments 330 or DSL. When this model is used, the first-hop router IP address is 331 a link-local address that is the same on all first-hop routers across 332 a specific deployment. 
This first-hop IP address should not change 333 when a virtual interface moves between different machines. 335 In a multi-point interface model, the VPN client routing table (e.g., 336 on the guest OS) contains the following routing entries: a host route 337 to the local IP address, a subnet route to the local interface and 338 optionally a default route to a specific router address within that 339 subnet. In this model, the VPN client IP stack will issue address 340 resolution requests for any IP addresses it considers to be directly 341 attached to the subnet. The VPN Forwarder shall answer all address 342 resolution requests with a virtual MAC address which SHOULD be the 343 same across all VPN Forwarders in a specific deployment. This 344 virtual MAC address SHALL default to the VRRP [RFC5798] virtual 345 router MAC address for Virtual Router Identifier (VRID) 1. 347 When the virtual topology first-hop router resides on the same 348 physical machine, the host OS is responsible for mapping the virtual 349 interface to a VPN specific routing table (without taking L2 350 addresses into consideration). In this case the mac-addresses known 351 to the guest OS are not used on the wire. 353 When the virtual topology first-hop router resides in an external 354 system (e.g., the first-hop switch) the virtual interface shall be 355 identified by the combination of the mac-address assigned to the physical 356 interface of the end-system and an 802.1Q VLAN tag. The first-hop 357 switch should use a virtual router MAC address to answer any address 358 resolution queries. 360 Whenever an external VPN Forwarder is used and resiliency is desired, 361 the external VPN Forwarder should be redundant. It is desirable to 362 use VRRP as a mechanism to control the flow of traffic between the 363 end-system and the external VPN Forwarder. VRRP already defines the 364 necessary procedures to elect a single forwarder for a LAN. 
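For reference, the default virtual MAC address mentioned above can be derived from the VRID. The sketch below assumes the IEEE 802 MAC layout given in RFC 5798 (00-00-5E-00-01-{VRID} for IPv4 and 00-00-5E-00-02-{VRID} for IPv6); the helper name is hypothetical.

```python
def vrrp_virtual_mac(vrid: int, ipv6: bool = False) -> str:
    """Return the VRRP virtual router MAC address for a given VRID,
    following the RFC 5798 layout 00-00-5E-00-{01|02}-{VRID}."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1..255")
    family_octet = 0x02 if ipv6 else 0x01
    return "00:00:5e:00:%02x:%02x" % (family_octet, vrid)


# Default L2 address a VPN Forwarder would answer with (VRID 1, IPv4).
default_mac = vrrp_virtual_mac(1)
print(default_mac)
```

Because the address depends only on the VRID, every VPN Forwarder in a deployment computes the same value without any coordination.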
366 This specification uses the VRRP virtual router MAC address as the 367 default L2 address for the VPN Forwarder as a client virtual 368 interface may move between locations where redundancy may not be 369 present. 371 While the VRRP Virtual Router MAC will be used to answer any address 372 resolution request made by the virtual interface client (e.g., the 373 guest VM) this does not imply that a single default router is elected 374 per virtual IP subnet. The ingress VPN Forwarder will perform an IP 375 forwarding decision based on the destination IP address of the 376 (payload) traffic. 378 VRRP router election is only relevant in selecting the VPN Forwarder 379 associated with a specific machine, when external forwarders are in 380 use. 382 5. VPN Forwarder 384 In this solution, the Host OS/Hypervisor in the end-system must 385 participate in the virtual network service. Given an end-system with 386 multiple virtual interfaces, these virtual interfaces must be mapped 387 onto the network by the guest OS such that applications on one 388 virtual interface are not able to send traffic to networks they are not 389 authorized to communicate with or to use source addresses not assigned 390 to the virtual interface. 392 When VPN forwarder functionality is implemented by the Host OS/ 393 Hypervisor, intermediate systems in the network do not require any 394 knowledge of the virtual network topology. This can simplify the 395 design and operation of the physical network. 397 When it is not possible or desirable to add the VPN forwarding 398 functionality to the end-system, it may be implemented by an external 399 system, typically located as close as possible to the end-system 400 itself. 402 Both models, co-located and external VPN Forwarder, can co-exist in a 403 deployment. 
405 In order to implement the BGP IP VPN Forwarder functionality a device 406 MUST implement the following functionality: 408 Support for multiple "Virtual Routing and Forwarding" (VRF) 409 tables; 411 VRF route entries map prefixes in the virtual network topology 412 to a next-hop containing an infrastructure IP address and a 413 20-bit label allocated by the destination Forwarder. The VRF 414 table lookup follows the standard IP lookup (best-match) 415 algorithm. 417 Associate an end-system virtual interface with a specific VRF 418 table; 420 When the Forwarder is co-located with the end-system, this 421 association is implemented by an internal mechanism. When the 422 Forwarder is external the association is performed using the 423 mac-address of the end-system and an IEEE 802.1Q tag that 424 identifies the virtual interface within the end-system. 426 Encapsulate outgoing traffic (end-system to network) according to 427 the result of the VRF lookup; 429 Associate incoming packets (network to end-system) to a VRF 430 according to the 20-bit label contained in the packet; 432 The VPN Forwarder MAY support the ability to associate multiple 433 virtual interfaces with the same VRF. When that is the case, locally 434 originated routes, that is, IP routes to the local virtual interfaces, 435 SHALL NOT be used to forward outbound traffic (from the virtual 436 interfaces to the outside) unless a route advertisement has been 437 received that matches that specific IP prefix and next-hop 438 information. 440 As an example, if a given VRF contains two virtual interfaces, 441 "veth0" and "veth1", with the addresses 10.0.1.1/32 and 10.0.1.2/32 442 respectively, the initial forwarding state must be initialized such 443 that traffic from either of these interfaces does not match the 444 other's routing table entry. It may for instance match a default 445 route advertised by a remote system. 
Traffic received from other VPN 446 Forwarders, however, must be delivered to the correct local 447 interface. If at a subsequent stage a route is received from the 448 Route Server such that 10.0.1.2/32 has a next-hop with the IP address 449 of the local host and the correct label, the system may subsequently 450 install a local routing table entry that delivers traffic directly to 451 the "veth1" interface. This means that forwarding table entries 452 apply to downstream only by default. This capability can be used to 453 implement a hub-and-spoke topology, if required. 455 The 20-bit label which is associated with a virtual-interface is of 456 local significance only and SHOULD be allocated by the VPN Forwarder. 458 When an external VPN Forwarder is used the end-system MUST associate 459 each virtual interface with a VLAN [IEEE.802-1Q] that is unique on 460 the end-system. The switching infrastructure MUST be configured such 461 that multi-destination frames sourced from an end-system are only 462 delivered to VPN Forwarders used by this end-system and not to other 463 end-systems. 465 6. XMPP signaling protocol 467 End-System Route Servers must be aware of VPN membership on each 468 Forwarder as well as what IP addresses are currently associated with 469 each virtual interface. 471 VPN Forwarders must receive VPN route information from which to 472 populate their forwarding tables. External VPN Forwarders also need 473 to receive the virtual interface and IP address events from the end- 474 system for which they are VPN forwarders. In this case the end- 475 system assigns an 802.1Q VLAN tag to each virtual interface and 476 communicates that information to the Forwarder. 478 In order to exchange this information this specification uses the 479 XMPP [RFC6120] protocol along with the Publish-Subscribe [pubsub] 480 extension. 482 VPN forwarders (both co-located and external) establish XMPP sessions 483 with End-System Route Servers, acting as XMPP clients. 
When an 484 external VPN Forwarder is used, end-systems establish XMPP sessions 485 with VPN Forwarders. External VPN Forwarders act as XMPP servers for 486 end-systems which are associated with them. 488 A VPN Forwarder MAY connect to multiple End-System Route Servers for 489 reliability. In this case it SHOULD publish its information to each 490 of the Route Servers. It MAY choose to subscribe to VPN routing 491 information once only from one of the available gateways. 493 The information advertised by an XMPP client SHOULD be deleted after 494 a configurable timeout, when the session closes. This timeout should 495 default to 60 seconds.
497               +---------+             +--------+
498               |   RS    | ----------- |  BGP   |
499               +---------+             +--------+
500               //          \          /
501            XMPP            \        /
502           //                \      /
503 +------------+               \    /
504 | end-system |                \  /
505 +------------+                 \/
506           \\                   /\
507            XMPP               /  \
508               \\             /    \
509               +---------+   /      \  +--------+
510               |   RS    | ----------- |  BGP   |
511               +---------+             +--------+
513 The figure above represents a typical configuration in which an end- 514 system with a co-located VPN Forwarder is directly connected to two 515 End-System Route Servers, which are in turn connected to multiple 516 BGP speakers which may be other L3VPN PEs or BGP route reflectors. 518 In deployment the number of End-System Route Servers used will depend 519 on the desired Route Server to VPN Forwarder ratio which affects the 520 convergence time of event propagation. 522 The XMPP "jid" used by the client shall be a string that uniquely 523 identifies it in its administrative domain. This specification 524 recommends the use of the hostname (when unique) or an IP address in 525 its string representation. 527 Each VPN shall be identified by a 128 octet ASCII character string. 529 When external Forwarders are used, their control software operates as an 530 XMPP server processing requests from end-systems and as a client of 531 one or more End-System Route Servers. 
The control software relays to 532 the End-System Route Server(s) VPN membership messages it receives 533 from the end-system. VPN routing information received from the Route 534 Server(s) SHOULD NOT be propagated to the end-system. 536 When a virtual interface is created on an end-system, the host 537 operating-system software shall generate an XMPP Subscribe message to 538 its server (the End-System Route Server or external VPN Forwarder). 540 Subscription request from co-located VPN Forwarder to Route Server: 542 546 547 548 549 1 550 551 552 554 The request above instructs the End-System Route Server to start 555 populating the client's VRF table with any routing information that 556 is available for this VPN. The XMPP node 'vpn-customer-name' is 557 assumed to be implicitly created by the End-System Route Server. 558 Creation of a virtual interface may precede any IP address becoming 559 active on the interface, as is the case with VM instantiation. 561 The optional "instance-id" element allows the VPN Forwarder to 562 specify a unique 16-bit index that can be used by the Route Server 563 to automatically assign a Route Distinguisher (RD) to any route 564 subsequently advertised by the VPN Forwarder. In a scenario where 565 the VPN Forwarder is advertising reachability information to multiple 566 Route Servers it is desirable for reachability information to have an 567 RD composed of the VPN Forwarder identifier (e.g. IPv4 address) and 568 the "instance-id". 570 Subscription request from end-system to external VPN Forwarder: 572 576 577 578 579 580 vlan-id 581 582 583 584 586 When an external VPN Forwarder is used the end-system should include 587 the VLAN identifier it assigned to the virtual interface as a 588 subscription option. 590 When an IP address is added to a virtual interface, the end-system 591 will generate an XMPP Publish request. 
593 Publish request from VPN Forwarder to End-System Route Server: 595 597 to='network-control@domain.org' 598 id='request1'> 599 600 601 602 603 604 1 605
'vpn-ip-address/32'
606
607 608 609 1 610
'infrastructure-ip-address'
611 612
613
614 1 615
616
617
618
619
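As an illustration of the kind of stanza a VPN Forwarder might assemble for the publish request, the following is a minimal sketch only. The pubsub namespace is the one defined by the XMPP Publish-Subscribe extension; the payload namespace, the payload element names ("entry", "prefix", "next-hop", "label"), the sender jid, the addresses and the label value are all hypothetical placeholders and are not taken from this specification's schema.

```python
import xml.etree.ElementTree as ET

PUBSUB_NS = "http://jabber.org/protocol/pubsub"  # XEP-0060 namespace
ROUTE_NS = "urn:example:vpn-route"  # hypothetical payload namespace


def build_publish(from_jid, to_jid, request_id, vpn_node,
                  prefix, next_hop, label):
    """Build an <iq type='set'> carrying a pubsub <publish> for one route.
    Payload element names are illustrative assumptions."""
    iq = ET.Element("iq", {"type": "set", "from": from_jid,
                           "to": to_jid, "id": request_id})
    pubsub = ET.SubElement(iq, f"{{{PUBSUB_NS}}}pubsub")
    publish = ET.SubElement(pubsub, f"{{{PUBSUB_NS}}}publish",
                            {"node": vpn_node})
    item = ET.SubElement(publish, f"{{{PUBSUB_NS}}}item", {"id": prefix})
    entry = ET.SubElement(item, f"{{{ROUTE_NS}}}entry")
    ET.SubElement(entry, f"{{{ROUTE_NS}}}prefix").text = prefix
    ET.SubElement(entry, f"{{{ROUTE_NS}}}next-hop").text = next_hop
    ET.SubElement(entry, f"{{{ROUTE_NS}}}label").text = str(label)
    return iq


request = build_publish(
    from_jid="forwarder1@domain.org",    # hypothetical sender jid
    to_jid="network-control@domain.org",
    request_id="request1",
    vpn_node="vpn-customer-name",
    prefix="10.0.1.1/32",
    next_hop="192.0.2.10",               # hypothetical infrastructure address
    label=10000,                         # hypothetical locally allocated label
)
print(ET.tostring(request, encoding="unicode"))
```

The collection 'node' attribute carries the VPN identifier, so the same helper can be reused for every VRF the Forwarder participates in.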
621 The End-System Route Server will convert the information received in 622 the 'publish' request into the corresponding BGP route information 623 such that: 625 It associates the specific request with a local VRF which it 626 resolves by using a combination of the originator jid and the 627 collection 'node' attribute. 629 It creates a BGP VPN route with a 'Route Distinguisher' (RD) which 630 contains a unique 32-bit value per end-system plus the 16-bit 631 instance-id specified in the subscribe message, the specified IP 632 prefix and 'label' received from the VPN Forwarder as the Network 633 Layer Reachability Information (NLRI). 635 The BGP next-hop address is set to the address of the VPN 636 Forwarder. 638 It optionally associates the route with an extended community TBD 639 containing a sequence number of the route advertisement. 641 Conversely, when an interface operational status is determined to be 642 down or an IP address is unconfigured the VPN forwarder generates an 643 XMPP retract message to withdraw the route advertisement. 645 Retract request from VPN Forwarder to End-System Route Server: 647 651 652 653 654 655 656 658 Update notification from Route Server to VPN Forwarder: 660 661 662 663 664 665 666 1 667
'vpn-ip-address>/32'
668
669 670 671 1 672
'infrastructure-ip-address'
673 674
675
676 1 677
678
679 680 ... 681 682
683
684
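   The Route Distinguisher construction described above (a 32-bit
   per-end-system value plus the 16-bit instance-id) can be sketched as
   follows.  This is only an illustration: it assumes the 32-bit value
   is the VPN Forwarder's IPv4 address and that the RD uses the Type 1
   encoding of [RFC4364] (2-byte type, 4-byte IPv4 address, 2-byte
   assigned number).  The function names are hypothetical and do not
   come from this document.

```python
import ipaddress
import struct


def make_route_distinguisher(forwarder_ipv4: str, instance_id: int) -> bytes:
    """Encode an 8-byte Type 1 RD: type (2) | IPv4 address (4) | number (2).

    Assumption: the VPN Forwarder identifier is its IPv4 address and the
    assigned number is the 16-bit instance-id from the subscribe message.
    """
    if not 0 <= instance_id <= 0xFFFF:
        raise ValueError("instance-id must fit in 16 bits")
    admin = int(ipaddress.IPv4Address(forwarder_ipv4))
    # "!" selects network byte order with no padding between fields.
    return struct.pack("!HIH", 1, admin, instance_id)


def format_rd(rd: bytes) -> str:
    """Render a Type 1 RD in the conventional 'address:number' form."""
    rd_type, admin, number = struct.unpack("!HIH", rd)
    if rd_type != 1:
        raise ValueError("only Type 1 RDs are handled in this sketch")
    return f"{ipaddress.IPv4Address(admin)}:{number}"


rd = make_route_distinguisher("192.168.1.1", 1)
print(rd.hex())       # 0001c0a801010001
print(format_rd(rd))  # 192.168.1.1:1
```

   Because the instance-id is chosen by the VPN Forwarder, two Route
   Servers that receive the same subscription would derive the same RD
   under these assumptions.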
   Notifications should be generated whenever a VPN route is added,
   modified or deleted.

   Note that the Update from the Route Server to the VPN Forwarder does
   not contain the "jid" of the destination end-system.  The "from"
   attribute in the 'message' element contains a "jid" associated with
   the Route Servers in the domain.  The XMPP messages are
   point-to-point in nature, between a Forwarder and a Route Server,
   even when a single XMPP publish request from a Forwarder causes the
   Route Server to generate one or more event notifications.

   When multiple possible routes exist for a given VPN IP address within
   a VRF, it is the responsibility of the Route Server to select the
   best path to advertise to the Forwarder.

   When routes are withdrawn, the End-System Route Server generates an
   item "retract" request.

   Route advertisements can have an optional sequence number, which
   helps the Route Server determine the most recent route advertisement.
   The sequence number is determined by a mechanism external to this
   document.  One example is to use time synchronization between compute
   nodes to have a globally coordinated timestamp.  This timestamp can
   be used to identify the time of interface creation on the compute
   node.

   Routes can also be associated with a "local-preference" attribute.
   This attribute maps to the BGP attribute of the same name for the
   purposes of route selection.

7.  End-System Route Server behavior

   End-System Route Servers SHALL support the BGP address families:
   VPN-IPv4 (1, 128), VPN-IPv6 (2, 128) and RT-Constraint (1, 132)
   [RFC4684].

   When an End-System Route Server receives a request to create or
   modify a VPN route, it SHALL generate a BGP VPN route advertisement
   with the corresponding information.

   It is assumed that the End-System Route Servers have information
   regarding the mapping between the tuple ('end-system',
   'vpn-customer-names') and the BGP Route Targets used to import and
   export information from the associated VRFs.  This mapping is known
   via an out-of-band mechanism not specified in this document.

   Whenever the End-System Route Server receives an XMPP subscription
   request, it SHALL consult its RT-Constraint Routing Information Base
   (RIB).  If the Route Server does not have a locally originated
   RT-Constraint route that corresponds to the vpn-name present in the
   request, it SHALL create one and generate the corresponding BGP route
   advertisement.  This route advertisement should only be withdrawn
   when there are no more downstream XMPP clients subscribed to the VPN.

   End-System Route Servers SHOULD automatically assign a BGP route
   distinguisher per VPN routing table.

8.  Operational Model

   In the simplest case, a VPN is a collection of systems that are
   allowed to exchange traffic with each other and only with each other.
   Since all the forwarding tables in this VPN have the same routing
   entries, they are often referred to as symmetrical VPNs.

   In order to better illustrate the operation of the protocol, we
   consider a simple example in which "host 1" and "host 2" both contain
   a virtual interface that is a member of the same VPN.

   Each of these hosts has an XMPP session with an End-System Route
   Server, RS1 and RS2 in our example, and these Route Servers are part
   of the same BGP mesh.

   When a virtual interface is created on "host 1", the local XMPP
   client generates an XMPP subscription message to its respective Route
   Server.  This message contains a VPN identifier that has been
   assigned by the provisioning system.
   The Route Server maps that identifier to a BGP IP VPN configuration
   which contains the list of import and export route targets to be used
   for that particular VRF.

   Once the interface is operational, "host 1" will publish any IP
   addresses that are configured on the respective virtual interface.
   This will in turn cause the End-System Route Server to advertise
   these (directly or indirectly) to any other BGP speaker on the
   network which is connected to an attachment point of that VPN.

      +--------+       +------------+       +----------+
      | host 1 | <===> | End-System | <===> | BGP mesh |
      +--------+       |Route Server|       +----------+
                       +------------+

                             Figure 1

      +----------------+-------------+-------+-----------+
      | VPN IP address | NEXT-HOP    | label | Known via |
      +----------------+-------------+-------+-----------+
      | 10.1.1.1/32    | 192.168.1.1 | 10000 | XMPP      |
      |                |             |       |           |
      | 10.1.1.2/32    | 192.168.2.1 | 20000 | BGP       |
      +----------------+-------------+-------+-----------+

                VPN Routing table on Route Server

                             Table 1

   The table above represents the contents of the VRF routing table on
   RS1 after the IPv4 address 10.1.1.1 has been added to the virtual
   interface on host 1.  It assumes that there is another attachment
   point for this VPN with the IPv4 address of 10.1.1.2.  Host 1 has an
   infrastructure IP address of 192.168.1.1 configured on its physical
   interface while host 2 has IP address 192.168.2.1.

   The contents of the VRF routing table in the End-System Route Servers
   are advertised via XMPP Update notifications sent to host 1.  This
   information is then used by the host to populate the forwarding table
   associated with that VPN.
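
   The way a host might turn Update and retract notifications into
   forwarding-table entries can be sketched as follows.  This is a
   minimal illustration, not a defined procedure: the Route class and
   both helper functions are hypothetical, and the sketch assumes that
   a route whose next hop equals the host's own infrastructure address
   is installed as a local ("localhost") entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str    # VPN IP prefix, e.g. "10.1.1.1/32"
    next_hop: str  # infrastructure address of the host owning the prefix
    label: int     # label assigned by that host


def apply_update(vrf: dict, route: Route, local_address: str) -> None:
    """Install or overwrite the forwarding entry for route.prefix."""
    # Assumption: routes pointing at this host are marked as local.
    host = "localhost" if route.next_hop == local_address else route.next_hop
    vrf[route.prefix] = (host, route.label)


def apply_retract(vrf: dict, prefix: str) -> None:
    """Remove the entry for a withdrawn prefix, if one is present."""
    vrf.pop(prefix, None)


# Replay the two routes of Table 1 on host 1 (192.168.1.1).
vrf_host1 = {}
for r in (Route("10.1.1.1/32", "192.168.1.1", 10000),
          Route("10.1.1.2/32", "192.168.2.1", 20000)):
    apply_update(vrf_host1, r, local_address="192.168.1.1")

for prefix, (host, label) in vrf_host1.items():
    print(prefix, host, label)
```

   Under these assumptions, replaying the two routes of Table 1 yields
   the same entries as the VRF table on host 1 (Table 2).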

                  +--------+                 +--------+
   app -- veth0 --| host 1 |=== [network] ===| host 2 |-- veth0 -- app
                  +--------+                 +--------+

   IP pkt ===> GRE encap ===> [IP net] ===> GRE decap ===> IP pkt
               [192.168.2.1, 20]            map 20 to veth0

                             Figure 2

      +----------------+--------------+-------+
      | VPN IP address | Host address | label |
      +----------------+--------------+-------+
      | 10.1.1.1/32    | localhost    | 10000 |
      |                |              |       |
      | 10.1.1.2/32    | 192.168.2.1  | 20000 |
      +----------------+--------------+-------+

                     VRF table on host1

                             Table 2

   When an application that uses the virtual interface on host 1
   generates packets with a destination IP address of 10.1.1.2, these
   are routed by the VPN Forwarder implemented in the Host OS.  The
   packets are encapsulated with a header that contains a 20-bit label
   assigned by host 2.

   In the case where the virtual interface on the host is associated
   with a guest OS, this guest OS has had its address resolution queries
   answered with the Virtual Router MAC address.  As a result, that is
   the address it uses as the destination MAC address in packets it
   originates.  This MAC address is not present on the encapsulated
   packet.

   End-System Route Servers are software applications that implement
   both the BGP IP VPN PE control plane and XMPP server functionality.
   These applications are not in the forwarding plane and do not need to
   be co-located with a network device.

   Network devices MAY have direct BGP sessions to the End-System Route
   Servers.  For instance, a router or security appliance that supports
   BGP/MPLS IP VPNs over GRE may use its existing functionality to
   inter-operate directly with a collection of Virtual Machines or other
   network appliances that support this specification.

   End-System Route Servers implement the VRF import policy and export
   policy functionality that is associated with PE routers in standard
   BGP IP/VPN deployments.
   VPN Forwarders receive forwarding information after policy and route
   selection is applied.  These are unqualified routes in a specific VRF
   rather than VPN routing information qualified by a Route
   Distinguisher and with a set of Route Targets.

   A symmetrical VPN uses VRF import and VRF export policies that
   contain a single Route Target, where the Route Target used for both
   import and export is the same.

   Different VPN topologies can be created by manipulating the VRF
   import and export configuration, including "hub-and-spoke" topologies
   or overlapping VPNs.

   An example of a hub-and-spoke VPN configuration is one where all the
   traffic from the VPN clients must be redirected through a middle-box
   for inspection.  Assume that the virtual interfaces of a particular
   user are configured to be in the VPN "tenant1".  At an initial stage
   this "tenant1" VPN is symmetrical and uses a single Route Target in
   both its import and export policies.  The middle-box functionality
   can be incrementally deployed by defining a new VPN, "tenant1-hub",
   and an associated Route Target, accompanied by a change in the
   End-System Route Server configuration such that VPN "tenant1" only
   imports routes with the Route Target associated with the hub.  The
   "hub" VPN is assumed to advertise a prefix that covers all the VPN
   clients' IP addresses.  The "hub" VPN imports the VPN routes in order
   for it to be able to generate the XMPP updates to the "hub"
   end-system.  This information is required for the return traffic from
   the hub to the spokes (the VPN clients).  In such a scenario, a
   single physical interface can connect the middle-box to the clients
   in a given VPN, which appear logically as downstream from it.  Such a
   middle-box would often require connectivity to multiple VPNs, for
   instance an "outside" VPN which provides external connectivity to one
   or more "inside" VPNs.
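
   The effect of the import and export policies in the symmetrical and
   hub-and-spoke cases above can be sketched as follows.  This is an
   illustration under stated assumptions: a VRF imports a route when the
   route carries at least one of the VRF's import Route Targets, the
   Vrf class and helper names are hypothetical, and the Route Target
   values are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vrf:
    name: str
    import_rts: frozenset
    export_rts: frozenset


def imports(vrf: Vrf, route_rts: frozenset) -> bool:
    """A route is imported if it carries any of the VRF's import RTs."""
    return bool(vrf.import_rts & route_rts)


def learns_from(receiver: Vrf, sender: Vrf) -> bool:
    """True if routes exported by 'sender' are imported by 'receiver'."""
    return imports(receiver, sender.export_rts)


# Hypothetical Route Target values, chosen only for this example.
RT_TENANT1 = "target:64512:1"
RT_HUB = "target:64512:2"

# Initial stage: symmetrical VPN, same RT for import and export.
symmetrical = Vrf("tenant1", frozenset({RT_TENANT1}), frozenset({RT_TENANT1}))

# Hub-and-spoke stage: tenant1 imports only the hub's Route Target,
# while the hub imports the tenant1 routes for the return traffic.
spoke = Vrf("tenant1", frozenset({RT_HUB}), frozenset({RT_TENANT1}))
hub = Vrf("tenant1-hub", frozenset({RT_TENANT1}), frozenset({RT_HUB}))

print(learns_from(symmetrical, symmetrical))  # True
print(learns_from(spoke, spoke))              # False
print(learns_from(spoke, hub))                # True
print(learns_from(hub, spoke))                # True
```

   In the hub-and-spoke stage, the spokes no longer import each other's
   routes, so traffic between them follows the covering prefix
   advertised by the hub.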

   The functionality defined in this document in which the BGP IP VPN PE
   functionality is split into its control (End-System Route Servers)
   and forwarding (VPN Forwarder) components is fully interoperable with
   existing BGP IP VPN PEs.

   This makes it possible to reuse existing systems.  For example, at
   the edge of a data-center facility it may be desirable to use an
   existing router or appliance that aggregates IP VPN routing
   information and/or provides IP based services such as stateful packet
   inspection.

   Such a system can be configured, based on existing functionality, to
   suppress more specific routes than a specified aggregate while
   advertising the aggregate with a BGP NEXT_HOP containing the PE's IP
   address and a locally assigned label corresponding to a VRF where the
   more specific routes are present.

9.  Security Considerations

   The signaling protocol defines the access control policies for each
   virtual interface and any guest application associated with it.  It
   is important to secure the end-system access to End-System Route
   Servers and the BGP infrastructure itself.

   The XMPP session between end-systems and the Route Servers MUST use
   mutual authentication.  One possible strategy is to distribute
   pre-signed certificates to end-systems which are presented as proof
   of authorization to the Route Server.

   BGP sessions MUST be authenticated.  This document recommends that
   BGP speaking systems filter traffic on port 179 such that only IP
   addresses which are known to participate in the BGP signaling
   protocol are allowed.

   As a security measure, it is recommended that virtual and
   infrastructure topologies never be allowed to exchange traffic
   directly.  The infrastructure network containing the end-systems is
   typically isolated from the outside world.

10.  XML schema

   The following schema defines the XML elements that are used to
   communicate unicast reachability information between the Route Server
   and VPN Forwarder:

      [XML schema listing lost in extraction]

11.  Acknowledgements

   Yakov Rekhter has contributed to this document by providing detailed
   feedback and suggestions.  The authors would also like to thank
   Thomas Morin for his comments.

   Amit Shukla and Ping Pan contributed to earlier versions of this
   document.

12.  References

12.1.  Normative References

   [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
              MPLS in IP or Generic Routing Encapsulation (GRE)", RFC
              4023, March 2005.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, February 2006.

   [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
              Reflection: An Alternative to Full Mesh Internal BGP
              (IBGP)", RFC 4456, April 2006.

   [RFC4684]  Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
              R., Patel, K., and J. Guichard, "Constrained Route
              Distribution for Border Gateway Protocol/MultiProtocol
              Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
              Private Networks (VPNs)", RFC 4684, November 2006.

   [RFC5798]  Nadas, S., "Virtual Router Redundancy Protocol (VRRP)
              Version 3 for IPv4 and IPv6", RFC 5798, March 2010.

   [RFC6120]  Saint-Andre, P., "Extensible Messaging and Presence
              Protocol (XMPP): Core", RFC 6120, March 2011.

   [pubsub]   Millard, P., Saint-Andre, P., and R. Meijer, "Publish-
              Subscribe", XEP 0060, July 2010.

12.2.  Informational References

   [RFC5575]  Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J.,
              and D. McPherson, "Dissemination of Flow Specification
              Rules", RFC 5575, August 2009.

   [I-D.ietf-mpls-in-udp]
              Building, K., Sheth, N., Yong, L., Pignataro, C., and F.
              Yongbing, "Encapsulating MPLS in UDP", draft-ietf-mpls-in-
              udp-03 (work in progress), September 2013.

   [IEEE.802-1Q]
              Institute of Electrical and Electronics Engineers, "Local
              and Metropolitan Area Networks: Virtual Bridged Local Area
              Networks", IEEE Std 802.1Q-2005, May 2006.

Authors' Addresses

   Pedro Marques
   Juniper Networks
   1133 Innovation Way
   Sunnyvale, CA  94089

   Email: roque@juniper.net


   Luyuan Fang
   Microsoft
   5600 148th Ave NE
   Redmond, WA  98052

   Email: lufang@microsoft.com


   Nischal Sheth
   Juniper Networks
   1133 Innovation Way
   Sunnyvale, CA  94089

   Email: nsheth@juniper.net


   Maria Napierala
   AT&T Labs
   200 Laurel Avenue
   Middletown, NJ  07748

   Email: mnapierala@att.com


   Nabil Bitar
   Verizon
   40 Sylvan Rd.
   Waltham, MA  02145

   Email: nabil.bitar@verizon.com