INTERNET-DRAFT                                           Maria Napierala
Intended Status: Informational                                      AT&T
Expires: April 2, 2015                                       Luyuan Fang
                                                               Microsoft

                                                         October 2, 2014

        Requirements for Extending BGP/MPLS VPNs to End-Systems
            draft-ietf-l3vpn-end-system-requirements-00.txt

Abstract

The proven scalability and extensibility of BGP/MPLS IP VPN (IP VPN)
technology have made it an attractive candidate for data center/cloud
virtualization. A virtualized end-system environment imposes
additional requirements on MPLS/BGP VPN technology.
This document provides the requirements for extending IP VPN
technology (in original or modified versions) into the
end-systems/hosts, such as a server in a data center.

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

Copyright and License Notice

Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Table of Contents

1   Introduction
  1.1  Terminology
2.  Application of MPLS/BGP VPNs to End-Systems
  2.1.  End-System CE and PE Functions
  2.2.  PE Control Plane Function
3.  VPN Communication Requirements
  3.1.  Unicast IPv4 and IPv6
  3.2.  Multicast/VPN Broadcast IPv4 and IPv6
  3.3.  IP Subnet Support
4.  Multi-Tenancy Requirements
5.  Decoupling of Virtualized Networking from Physical Infrastructure
6.  Decoupling of Layer 3 Virtualization from Layer 2 Topology
7.  Requirements for Encapsulation of Virtual Payloads
  7.1.  Encapsulation Methods
  7.2.  Routing of Virtual Payloads
8.  Optimal Forwarding of Traffic
9.  IP Mobility
  9.1.  IP Addressing of Virtual Hosts
  9.2.  Network Layer-Based Mobility
  9.3.  Routing Convergence Requirements
10. Inter-operability with Existing MPLS/BGP VPNs
11. BGP Requirements in a Virtualized Environment
  11.1.  BGP Convergence and Routing Consistency
    11.1.1.  BGP IP Mobility Requirements
  11.2.  Optimization of Route Distribution
12. Service chaining
  12.1.  Load Balancing
  12.2.  Symmetric Service Chain Support
  12.3.  Packet Header Transforming Services
13. Security Considerations
13. IANA Considerations
14. References
  14.1.  Normative References
  14.2.  Informative References
Acknowledgements
Authors' Addresses

Requirements Language

Although this document is not a protocol specification, the key words
"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document
are to be interpreted as described in RFC 2119 [RFC2119].

1 Introduction

Enterprise networks are increasingly being consolidated and
outsourced in an effort to improve the deployment time of services as
well as reduce operational costs. This coincides with an increasing
demand for compute, storage, and network resources from applications.
Logical abstraction of these resources is needed for improved
scalability and cost efficiency. This is referred to as server,
storage, and network virtualization. It can be implemented in all
layers of computer systems or networks. The virtualized loads are
executed or transferred over a common physical infrastructure.
Compute nodes running guest operating systems are often executed as
Virtual Machines (VMs).

This document defines requirements for a network virtualization
solution that provides secure IP VPN connectivity to virtual
resources on end-systems operating in a multi-tenant shared physical
infrastructure. The requirements address the needs of virtual
resources, defined as Virtual Machines, applications, and appliances
that require only IP connectivity. Non-IP communication is addressed
by other solutions and is not in scope of this document.

The technical solutions to support these requirements are work in
progress in the IETF [I-D.ietf-l3vpn-end-system],
[I-D.fang-l3vpn-virtual-pe]. The solutions may be referred to as
End-System solutions or virtual PE (vPE) solutions in different
documents.

1.1 Terminology

Term         Definition
-----------  --------------------------------------------------
AS           Autonomous System
CE           Customer Edge router
End-System   A device where Guest OS, Host OS/Hypervisor reside
GRE          Generic Routing Encapsulation
Hypervisor   Virtual Machine Manager
IaaS         Infrastructure as a Service
PE           Provider Edge router
RT           Route Target
RTC          RT Constraint
SDN          Software Defined Network
ToR          Top-of-Rack switch
VM           Virtual Machine
vPE          virtual Provider Edge Router
VPN          Virtual Private Network

2. Application of MPLS/BGP VPNs to End-Systems

MPLS/BGP VPN technology [RFC4364] has proven able to scale to a large
number of VPNs (tens of thousands) and customer routes (millions)
while providing for aggregated management capability. In traditional
WAN deployments of BGP IP VPNs, a Customer Edge (CE) is a physical
device, residing at a customer's location, connected to a Provider
Edge (PE), residing in a Service Provider's location. CE devices are
logically part of a customer's VPN while PE routers are logically
part of the SP's network. In a traditional MPLS/BGP VPN deployment, a
CE device is a router and it is a routing peer of a PE to which it is
attached via an attachment circuit. In addition, the forwarding
function and control function of a Provider Edge (PE) device co-exist
within a single physical router.

MPLS/BGP VPN technology can be evolved and adapted to new virtualized
environments by implementing the VPN forwarding edge functionality on
the end-system hosts, thereby extending VPN service directly to
end-systems.

2.1. End-System CE and PE Functions

When an end-system attaches to an MPLS/BGP VPN, the CE corresponds to
a non-routing host that can reside in a Virtual Machine or be an
application residing on the end-system itself.

As in traditional MPLS/BGP VPN deployments, it is undesirable for the
end-system VPN forwarding knowledge to extend to the transport
network infrastructure. Hence, optimally, with regard to forwarding,
the end-system should become both the CE and the PE simultaneously.

The network virtualization solution should also support deployments
where it is not possible or not desirable to co-locate the PE and CE
functionality. In such deployments the PE may be implemented on an
external device with remote CE attachments. This external PE device
should be as close as possible to the end-system where the CE
resides. The external PE devices that attach to a particular VPN need
to know, for each attachment circuit leading to that VPN, the host
address that is reachable over that attachment circuit. The
end-system MPLS/BGP VPN solution must specify a method to convey this
information from the end-system to the PE.

The same network virtualization solution should support deployments
with a mix of internal PE (co-located with the CE) and external PE
(i.e., remote CE) implementations.

2.2. PE Control Plane Function

It is current practice to implement MPLS/BGP VPN PE forwarding and
control functions in different processors of the same device and to
use internal (proprietary) communication between those processors.
Typically, the PE control functionality is implemented in one (or
very few) components of a device and the PE forwarding functionality
is implemented in multiple components of the same device (a.k.a.,
"line cards").

In an end-system environment, a single end-system effectively
corresponds to a line card in a traditional PE router. For scalable
and cost-effective deployment of end-system MPLS/BGP VPNs, the PE
forwarding function should be decoupled from the PE control function
such that the former can be implemented on multiple standalone
devices. This separation of functionality will allow for implementing
the end-system PE forwarding on multiple end-system devices, for
example, in operating systems of application servers or network
appliances. Moreover, the separation of PE forwarding and control
plane functions allows the PE control plane function to be itself
virtualized and run as an application in an end-system.

3. VPN Communication Requirements

3.1. Unicast IPv4 and IPv6

A network virtualization solution should be able to provide IPv4 and
IPv6 unicast connectivity between hosts in the same and different
subnets without any assumptions regarding the underlying media layer.

3.2. Multicast/VPN Broadcast IPv4 and IPv6

Furthermore, multicast transmission, i.e., allowing IP applications
to send packets to a group of IPv4 or IPv6 addresses, should be
supported. The multicast service should also support delivery of
traffic to all endpoints of a given VPN even if those endpoints have
not sent any control messages indicating the need to receive that
traffic. In other words, the multicast service should be capable of
delivering IP broadcast traffic in a virtual topology. A solution for
supporting VPN multicast and VPN broadcast must not require that the
underlying transport network supports IP multicast transmission
service.

3.3. IP Subnet Support

In some deployments, Virtual Machines or applications are configured
to belong to an IP subnet. A network virtualization solution should
support grouping of virtual resources into IP subnets regardless of
whether the underlying implementation uses a multi-access network or
not. While some applications may expect to find other peers in a
particular user-defined IP subnet, this does not imply the need to
provide a layer 2 service that preserves MAC addresses. An end-system
network virtualization solution should be able to provide IP
(unicast, multicast, VPN broadcast) connectivity between hosts in the
same and different subnets without any assumptions regarding the
underlying media layer.

4. Multi-Tenancy Requirements

One of the main goals of network virtualization is to provide traffic
and routing isolation between different virtual components that share
a common physical infrastructure. Networks use various VPN
technologies to isolate disjoint groups of virtual resources. Some
use VLANs [IEEE.802-1Q] as a VPN technology, others use layer 3 based
solutions, often with proprietary control planes. Service Providers
are interested in interoperability and in openly documented protocols
rather than in proprietary solutions.

A collection of virtual resources might provide external or internal
services. Such a collection may serve an external "customer" or an
internal "tenant" to whom a Service Provider provides service(s). In
MPLS/BGP VPN terminology, a collection of virtual resources dedicated
to a process or application corresponds to a VPN.

A network virtualization multi-tenancy solution should support the
following:

- Tenant or application isolation, in data plane and control plane,
  while sharing the same underlying physical network.
  Tenants should be able to independently select and deploy their
  choice of IP address space: public or private IPv4 and/or IPv6.

- Multiple distinct VPNs per tenant. A tenant's inter-VPN traffic
  should be allowed to cross VPN boundaries, subject to access
  controls and/or routing policies.

- Inter-VPN communication, subject to access policies. Typically,
  VPNs that belong to different external tenants do not communicate
  with each other directly but they should be allowed to access
  shared services or shared network resources. It is often the case
  that SP infrastructure services are provided to multiple tenants,
  for example voice-over-IP gateway services or video-conferencing
  services for branch offices.

- A VM or application end-point should be able to directly access
  multiple VPNs without a need to traverse a gateway.

An end-system network virtualization solution should support both
isolated VPNs and overlapping VPNs (often referred to as
"extranets"). It should also support any-to-any and hub-and-spoke
topologies.

5. Decoupling of Virtualized Networking from Physical Infrastructure

One of the main goals in designing a large-scale transport network is
to minimize the cost and complexity of its "fabric" by delegating the
virtual resource communication processing to the network edge. It has
been proven (in the Internet and in large MPLS/BGP VPN deployments)
that moving complexity to the network edge while keeping the network
core simple has very good scaling properties.

The transport network infrastructure should not maintain any
information that pertains to the virtual resources in end-systems.
Decoupling of virtualized networking from the physical infrastructure
has the following advantages: 1) provides better scalability; 2)
simplifies the design and operation; 3) reduces network cost.

Decoupling of virtualized networking from the underlying physical
network consists of the following:

- Separation between the virtualized segments (i.e., interfaces
  associated with virtual resources) and the physical network (i.e.,
  physical interfaces associated with network infrastructure).

- Separation of the virtual network IP address space from the
  physical infrastructure network IP address space. In the case of a
  transport other than IP, for example MPLS or Ethernet, the
  infrastructure address refers to the Subnetwork Point of
  Attachment (SNPA) address in a given multi-access network.

- The physical infrastructure addresses should be routable (or
  switchable) in the underlying transport network, while the virtual
  network addresses should be routable only in the virtual network.

- The virtual network control plane should be decoupled from the
  underlying transport network.

6. Decoupling of Layer 3 Virtualization from Layer 2 Topology

The layer 3 approach to network virtualization dictates that the
virtualized communication should be routed, not bridged. The layer 3
virtualization solution should be decoupled from the layer 2
topology. Thus, there should be no dependency on VLANs and layer 2
broadcast.

In solutions that depend on layer 2 broadcast domains, host-to-host
communication is established based on flooding and data plane MAC
learning. Layer 2 MAC information has to be maintained on every
switch where a given VLAN is present. Even if some solutions are able
to minimize data plane MAC learning and/or unicast flooding, they
still rely on MAC learning at the network edge and on maintaining the
MAC addresses on every switch where the layer 2 VPN is present.

The MAC addresses known to the guest OS in an end-system are not
relevant to IP services and introduce unnecessary overhead.
Hence, the MAC addresses associated with virtual resources should not
be used in the virtual layer 3 networks. Rather, only what is
significant to IP communication, namely the IP addresses of the
virtual machines and application endpoints, should be maintained by
the virtual networks.

7. Requirements for Encapsulation of Virtual Payloads

In order to scale the transport networks, the virtual network
payloads must be encapsulated with headers that are routable (or
switchable) in the physical network infrastructure. The IP addresses
of the virtual resources are not to be advertized within the physical
infrastructure address space.

The encapsulation (and de-capsulation) function should be implemented
on a device as close to the virtualized resources as possible. Since
the hypervisors in the end-systems are the devices at the network
edge, they are the optimal location for the encap/decap
functionality.

The network virtualization solution should also support deployments
where it is not possible or not desirable to implement the virtual
payload encapsulation in the hypervisor/Host OS. In such deployments
the encap/decap functionality may be implemented in an external
device. The external device implementing the encap/decap
functionality should be as close as possible to the end-system
itself. The same network virtualization solution should support
deployments with both internal (in a hypervisor) and external
(outside of a hypervisor) encap/decap devices.

Whenever the virtual forwarding functionality is implemented in an
external device, the virtual service itself must be delivered to an
end-system such that switching elements connecting the end-system to
the encap/decap device are not aware of the virtual topology.

7.1. Encapsulation Methods

MPLS/VPN technology based on [RFC4364] specifies that different
encapsulation methods could be used for connecting PE routers, namely
Label Switched Paths (LSPs), IP tunneling, and GRE tunneling.

If LSPs are used in the transport network, they could be signaled
with LDP, in which case host (/32) routes to all PE routers must be
propagated throughout the network, or with RSVP-TE, in which case a
full mesh of RSVP-TE tunnels is required. The label forwarding tables
can also be constructed using SDN controllers without the need for
distributed signaling protocols.

If the transport network is only IP-capable, then MPLS in IP or MPLS
in GRE [RFC4023] encapsulation could be used. Due to the route
aggregation property of IP protocols, with IP/GRE encapsulation the
PE host routes do not have to be present in the transport network.

7.2. Routing of Virtual Payloads

A device implementing the encap/decap functionality acts as the
first-hop router in the virtual topology.

In a layer 3 end-system virtual network, IP packets should reach the
first-hop router in one IP hop, regardless of whether the first-hop
router is an end-system itself (i.e., a hypervisor/Host OS) or an
external (to the end-system) device. The first-hop router should
always perform an IP lookup on every packet it receives from a
virtual machine or an application. The first-hop router should
encapsulate the packets and route them towards the destination
end-system.

8. Optimal Forwarding of Traffic

The network virtualization solutions that optimize for the maximum
utilization of compute and storage resources require that those
resources may be located anywhere in the network. The physical and
logical spreading of appliances and workloads implies a very
significant increase in the infrastructure bandwidth consumption.
In order to be efficient in terms of traffic forwarding, the
virtualized networking solutions must assure that packets traverse
the transport network only once.

It must also be possible to send the traffic directly from one
end-system to another end-system without traversing a midpoint
router.

9. IP Mobility

Another reason for network virtualization is the need to support IP
mobility. IP mobility means that IP addresses used for communication
within or between applications can be located anywhere across the
virtual network. Using a virtual topology, i.e., abstracting the
externally visible network address from the underlying infrastructure
address, is an effective way to solve the IP mobility problem.

IP mobility consists of a device physically moving (e.g., a roaming
wireless device) or a workload being transferred from one physical
server/appliance to another. IP mobility requires preserving the
device's active network connections (e.g., TCP and higher-level
sessions). Such mobility is also referred to as "live" migration with
respect to a Virtual Machine. IP mobility is highly desirable for
many reasons such as efficient and flexible resource sharing, data
center migration, disaster recovery, server redundancy, or service
bursting.

9.1. IP Addressing of Virtual Hosts

To accommodate live mobility of a virtual machine (or a device), it
is desirable to assign to it a semi-permanent IP address that remains
with the VM/device as it moves. The semi-permanent IP address can be
configured through the VM or device configuration process or by means
of DHCP.

9.2. Network Layer-Based Mobility

When dealing with IP-only applications it is not only sufficient but
optimal to forward the traffic based on layer 3 (network layer)
rather than on layer 2 (data-link layer) information.
The MAC addresses of devices or applications are irrelevant to IP
services and introduce unnecessary overhead and complications when
devices or VMs move. For example, when a VM moves between physical
servers, the MAC learning tables in the switches must be updated.
Moreover, it is possible that the VM's MAC address might need to
change in its new location. In an IP-based network virtualization
solution, a device or workload move is handled by an IP route
advertisement.

9.3. Routing Convergence Requirements

IP mobility has to be transparent to applications and to any external
entity interacting with the applications. This implies that the
network connectivity restoration time is critical. The transport
sessions can typically survive several seconds of disruption;
however, applications may have sub-second latency requirements for
their correct operation.

To minimize the disruption to established communication during
workload or device mobility, the control plane of a network
virtualization solution should be able to differentiate the
activation of a workload in a new location from the advertisement of
its route to the network. This will enable the remote end-points to
update their routing tables prior to the workload's migration and
will allow the traffic to be tunneled via the workload's old
location.

10. Inter-operability with Existing MPLS/BGP VPNs

Service Providers want to tie their server-based offerings to their
MPLS/BGP VPN services. MPLS/BGP VPNs provide secure and
latency-optimized remote connectivity to the virtualized resources in
the SP's data center. The Service Provider-based VPN access can
provide additional capabilities compared with public internet access,
such as QoS, OAM, multicast service, VoIP service, video
conferencing, and wireless connectivity.

MPLS/BGP VPN customers may require simultaneous access to resources
in both the SP's and their own data centers.

Service Providers want to "spin up" the L3VPN access to data center
VPNs as dynamically as they spin up compute and other virtualized
resources.

The network virtualization solution should be fully inter-operable
with MPLS/BGP VPNs, including:

- Inter-AS MPLS/BGP VPN Options A, B, and C [RFC4364].

- BGP/MPLS VPN-capable network devices (such as routers and network
  appliances) should be able to participate directly in a virtual
  network that spans end-systems.

- The network devices should be able to participate in isolated
  collections of end-systems, i.e., in isolated VPNs, as well as in
  overlapping VPNs (called "extranets" in BGP/MPLS VPN terminology).

- The network devices should be able to participate in any-to-any
  and hub-and-spoke end-system topologies.

When connecting an end-system VPN to other networks, it should not be
necessary to advertize the specific host routes but rather the
aggregated routing information. A BGP/MPLS VPN-capable router or
appliance can be used to aggregate a VPN's IP routing information and
advertize the aggregated prefixes. The aggregated prefixes should be
advertized with the router/appliance IP address as the BGP next-hop
and with a locally assigned aggregate 20-bit label. The aggregate
label should trigger a destination IP lookup in its corresponding VRF
on all the packets entering the virtual network.

The inter-connection of end-system VPNs with traditional VPNs
requires an integrated control plane and unified orchestration of
network and end-system resources.

11. BGP Requirements in a Virtualized Environment

11.1. BGP Convergence and Routing Consistency

BGP was designed to carry a very large amount of routing information
but it is not a very fast converging protocol.
In addition, the routing protocols, including BGP, have traditionally
favored convergence (i.e., responsiveness to route change due to
failure or policy change) over routing consistency. Routing
consistency means that a router forwards a packet strictly along the
path adopted by the upstream routers. When responsiveness is favored,
a router applies a received update immediately to its forwarding
table before propagating the update to other routers, including those
that potentially depend upon the outcome of the update. The route
change responsiveness comes at the cost of routing blackholes and
loops.

Routing consistency in virtualized environments is important because
multiple workloads can be simultaneously moved between different
physical servers due to maintenance activities, for example. If
packets sent by the applications that are being moved are dropped
(because they do not follow a live path), the active network
connections will be dropped. To minimize the disruption to the
established communications during VM migration or device mobility,
live path continuity is required.

11.1.1. BGP IP Mobility Requirements

In IP mobility, the network connectivity restoration time is
critical. In fact, Service Provider networks already use routing and
forwarding plane techniques that support fast failure restoration by
pre-installing a backup path to a given destination. These techniques
allow traffic to be forwarded almost continuously using an indirect
forwarding path or a tunnel to a given destination, and hence are
referred to as "local repair". The traffic forwarding path is
restored locally at the destination's old location while the network
converges to a backup path. Eventually, the network converges to an
optimal path and bypasses the local repair.
BGP assists in the local repair techniques by advertizing multiple
paths, and not only the best path, to a given destination.

11.2. Optimization of Route Distribution

When virtual networks are triggered based on the IP communication,
the Route Target Constraint extension [RFC4684] of BGP should be used
to optimize the route distribution for sparse virtual network events.
This technique ensures that only those VPN forwarders that have local
participants in a particular data plane event receive its routing
information. This also decreases the total load on the upstream BGP
speakers.

12. Service chaining

A service chain is a deployment in which a sequence of appliances
mediates traffic between networks. In fact, traffic from one virtual
network may go through an arbitrary graph of service nodes before
reaching another virtual network. Service chains can contain a
mixture of virtual services (implemented as VMs on compute nodes) and
physical services (hosted on service nodes). Network appliances tend
to be designed to operate on an "inside/outside" interface model.
Applications of this type do not terminate traffic and are
transparent to packets. In an SDN approach, the service chain is
configured and managed in software that adds and removes services
from the chain in an automated way. It is a requirement that service
chaining is supported on devices using MPLS/BGP VPN technology for
virtual networking.

Connecting appliances in a sequence has been done for many years
using VLANs. However, "service-chaining" cannot be implemented
without solving the problem of how to bring traffic from a routed
network into the set of appliances. The issue is always how to
attract the traffic in and forward it out of the service-chain, i.e.,
how to integrate the service-chain with routing.
By using the same mechanism to route traffic in and out of a service chain as well as through its intermediate hops, the implementation of service chains is significantly simplified.

One solution, currently a work in progress in the IETF, is [I-D.rfernando-l3vpn-service-chaining].

12.1. Load Balancing

One of the main requirements of service-chaining is horizontal scaling of a service in a service-chain to tens or hundreds of instances. When the MPLS/BGP VPN routing instance (or VRF) construct is used to implement service chaining, load balancing is built in. The load balancing corresponds to BGP multipath, where multiple routes for a single prefix are installed in a routing instance. The multiple BGP routes in the routing table translate to Equal Cost Multi-Path (ECMP) in the forwarding plane. The hash used in the load balancing algorithm can be per packet, per flow or per prefix. The forwarding plane should support load balancing over several hundred next-hops.

Load balancing should support deployments where both virtual and physical service appliances are present. It should support deployments where virtual service instances are spread across the same and different end-systems/hosts.

12.2. Symmetric Service Chain Support

If a service function is stateful, it is required that forward flows and reverse flows always pass through the same service instance. ECMP does not provide this capability, since the hash calculation will see different input data for the same flow in the forward and reverse directions. Additionally, if the number of service instances changes, either to expand/decrease capacity or due to an instance failure, the hash table in ECMP is recalculated, and most flows will be redirected to a different service instance, causing user session disruption.
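
The two properties discussed in this section (forward/reverse symmetry and limited disruption when the instance set changes) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not a mechanism specified by this document: the 5-tuple layout, the function names symmetric_key and pick_instance, and the instance names are all hypothetical, and rendezvous (highest random weight) hashing is chosen here simply as one well-known way to keep most flows in place when an instance is added or removed.

```python
import hashlib
from typing import Sequence, Tuple

# Hypothetical flow description: (src_ip, dst_ip, protocol, src_port, dst_port)
Flow = Tuple[str, str, int, int, int]


def symmetric_key(flow: Flow) -> str:
    """Build a key that is identical for both directions of a flow.

    The two (address, port) endpoints are sorted, so swapping source
    and destination yields the same key.
    """
    src_ip, dst_ip, proto, src_port, dst_port = flow
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return f"{proto}|{low[0]}:{low[1]}|{high[0]}:{high[1]}"


def pick_instance(flow: Flow, instances: Sequence[str]) -> str:
    """Select a service instance using rendezvous hashing.

    Each instance gets a pseudo-random weight derived from the flow key
    and the instance name; the instance with the highest weight wins.
    Removing an instance only remaps the flows that instance was serving.
    """
    key = symmetric_key(flow)

    def weight(instance: str) -> int:
        digest = hashlib.sha256(f"{key}|{instance}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(instances, key=weight)


if __name__ == "__main__":
    instances = ["svc-1", "svc-2", "svc-3", "svc-4"]  # hypothetical names
    forward = ("10.0.0.1", "192.0.2.7", 6, 40000, 443)
    reverse = ("192.0.2.7", "10.0.0.1", 6, 443, 40000)
    # Both directions of the flow select the same instance.
    print(pick_instance(forward, instances) == pick_instance(reverse, instances))
```

With a plain "hash modulo N" table, changing N remaps most flows. In this sketch, a flow changes instance only if its current instance is removed or if a newly added instance happens to win for that flow. A service that rewrites the header fields used in the key (see Section 12.3) would need additional handling that this sketch does not cover.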

It is a requirement that a service chaining solution provide symmetric forward/reverse paths for flows and minimal traffic disruption when service instances are added to or removed from a set of instances.

12.3. Packet Header Transforming Services

A service in a service chain might perform an action that changes the packet header information, e.g., the packet's source address (as performed by a NAT service). In order to support the reverse traffic flow in this case, the routing and forwarding information has to be modified such that the traffic can be directed via the instances of the transforming service. For example, the original routes with a source prefix (Network-A) are replaced with a route that has a prefix that includes all the possible addresses that the source address could be mapped to. In the case of network address translation, this would correspond to the NAT pool.

It is a requirement that a service chaining solution support services that manipulate packet headers.

13. Security Considerations

This document presents the requirements for end-system MPLS/BGP VPNs. The security considerations for traditional MPLS/BGP VPN deployments are described in Section 13 of [RFC4364]. The additional security issues associated with deployments using MPLS-in-GRE or MPLS-in-IP encapsulations are described in Section 8 of [RFC4023]. In addition, [RFC4111] provides general IP VPN security guidelines.

The additional security requirements specific to end-system MPLS/BGP VPNs are as follows:

- An end-system MPLS/BGP VPN solution should guarantee that packets originating from a specific end-system virtual interface are accepted only if the corresponding VPN IP host is present on that end-system.

- The virtual network must ensure that traffic arriving at the egress end-system was sent from the correct ingress end-system.

- One virtual host or VM should not be able to impersonate another, either during steady-state operation or during live migration.

The security considerations for specific solutions will be documented in the relevant documents.

13. IANA Considerations

This document contains no new IANA considerations.

14. References

14.1. Normative References

[RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed., "Encapsulating MPLS in IP or Generic Routing Encapsulation (GRE)", RFC 4023, March 2005.

[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

[RFC4684] Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk, R., Patel, K., and J. Guichard, "Constrained Route Distribution for Border Gateway Protocol/MultiProtocol Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual Private Networks (VPNs)", RFC 4684, November 2006.

[IEEE.802-1Q] Institute of Electrical and Electronics Engineers, "Local and Metropolitan Area Networks: Virtual Bridged Local Area Networks", IEEE Std 802.1Q-2005, May 2006.

14.2. Informative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4111] Fang, L., Ed., "Security Framework for Provider-Provisioned Virtual Private Networks (PPVPNs)", RFC 4111, July 2005.

[I-D.ietf-l3vpn-end-system] Marques, P., Fang, L., Pan, P., Shukla, A., Napierala, M., "BGP-signaled end-system IP/VPNs", draft-ietf-l3vpn-end-system, work in progress.

[I-D.fang-l3vpn-virtual-pe] Fang, L., Ward, D., Fernando, R., Napierala, M., Bitar, N., Rao, D., Rijsman, B., So, N., "BGP IP VPN Virtual PE", draft-fang-l3vpn-virtual-pe, work in progress.

[I-D.rfernando-l3vpn-service-chaining] Fernando, R., Rao, D., Fang, L., Napierala, M., So, N., draft-rfernando-l3vpn-service-chaining, work in progress.

Acknowledgements

The authors would like to thank Pedro Marques and Han Nguyen for their comments and suggestions.

Authors' Addresses

Maria Napierala
AT&T
200 Laurel Avenue
Middletown, NJ 07748
Email: mnapierala@att.com

Luyuan Fang
Microsoft
5600 148th Ave NE
Redmond, WA 98052
Email: lufang@microsoft.com