Internet Research Task Force (IRTF)                            Z. Qiang
Internet Draft                                                 Ericsson
Intended status: Informational                         October 27, 2014
Expires: April 2015

                            Elasticity VNF
                 draft-zu-nfvrg-elasticity-vnf-00.txt

Status of this Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

This Internet-Draft will expire on April 27, 2015.

Copyright Notice

Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Simplified BSD License.

Abstract

This draft is an analysis of NFV applications based on the NFV
architecture, use cases and requirements. The purpose of this
analysis is to identify issues related to NFV characteristics. The
analysis focuses on the elasticity VNF with predictable performance,
reliability and security. Only issues which are unique to NFV are
discussed in this document.

Table of Contents

1. Introduction
2. Conventions used in this document
3. Terminology
4. Network Function Virtualization
   4.1. NFV Requirements
   4.2. NFV Use Cases
      4.2.1. Network Function Virtualization Infrastructure
      4.2.2. Telecom Network Functions Migration
5. Elasticity in a Distributed Cloud
   5.1. Elasticity VNF
   5.2. Elasticity VNF set
6. Elasticity with Predictable Performance
   6.1. Predictable Performance
   6.2. Hardware virtualization features
   6.3. Network Overlay
7. Elasticity with Reliability
8. Elasticity with Security
9. Security Considerations
10. IANA Considerations
11. References
   11.1. Normative References
   11.2. Informative References
12. Acknowledgments

1. Introduction

Network Functions Virtualization (NFV) is a network architecture
concept that proposes using IT virtualization technologies to
virtualize entire classes of network node functions into building
blocks that may be connected, or chained, together to create
communication services. NFV aims to transform the way operators
architect networks by evolving standard IT virtualization
technology to consolidate network equipment types onto industry
standard high volume servers, switches and storage, which could be
located in a variety of NFVI PoPs including DCs, network nodes and
end user premises. An important part of controlling the NFV
environment should also be done through automated network
management and orchestration.

This draft is an analysis of NFV applications based on the NFV
architecture, use cases and requirements. The purpose of this
analysis is to identify issues related to NFV characteristics. The
analysis focuses on the elasticity VNF with predictable performance,
reliability and security. Only issues which are unique to NFV are
discussed in this document.
The intention is to identify what is missing and what needs to be
addressed in terms of protocol / solution specifications, which may
be potential work for the IETF.

The reader is assumed to be familiar with the terminology as defined
in the NFV document [nfv-tem].

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119 [RFC2119].

In this document, these words will appear with that interpretation
only when in ALL CAPS. Lower case uses of these words are not to be
interpreted as carrying RFC-2119 significance.

3. Terminology

This document uses the same terminology as found in the NFV end to
end architecture [nfv-tem]:

Network Function Consumer: a Network Function Consumer (NFC) is the
consumer of virtual network functions. It can be an individual
user, a home user or an enterprise user.

NFV: network function virtualization. NFV technology uses commodity
servers to replace the dedicated hardware boxes for network
functions, for example, home gateway, enterprise access router,
carrier grade NAT, etc., so as to improve reusability, allow more
vendors into the market, and reduce time to market. The NFV
architecture includes an NFV Control and Management Plane
(orchestrator) to manage the virtual network functions and the
infrastructure resources.

NF: A functional building block within an operator's network
infrastructure, which has well-defined external interfaces and a
well-defined functional behavior. Note that the totality of all
network functions constitutes the entire network and services
infrastructure of an operator/service provider. In practical terms,
a Network Function is today often a network node or physical
appliance.


Network Function Provider: a Network Function Provider (NFP)
provides virtual network function software.

Network Service Provider (NSP): a company or organization that
provides a network service on a commercial basis to third parties. A
network service is a composition of network functions and is defined
by its functional and behavioral specification. The NSP operates the
NFV Control Plane.

NFV Infrastructure (NFVI): the computing, storage and network
resources used to implement the virtual network function. A high
performance acceleration platform is also part of it.

VNF: virtual network function, an implementation of an executable
software program that constitutes the whole or a part of an NF and
that can be deployed on a virtualization infrastructure.

VM: virtual machine, a program and configuration of part of a host
computer server. Note that the virtual machine inherits the
properties of its host computer server, e.g. location and network
interfaces.

NFV Control and Management Plane (NFVCMP): an NFV Control and
Management Plane is operated by an NSP and orchestrates the NFV.

4. Network Function Virtualization

4.1. NFV Requirements

Many virtualization requirements are described by NFV in [nfv-req].
The following highlights a few NFV requirements which are relevant
to this document:

- Portability: VNF portability is a reasonable generic
  virtualization requirement. It allows VNF mobility across
  different but standard multi-vendor environments. However, moving
  a VNF within the NFV framework while meeting the Service Level
  Agreement (SLA) requirements, including performance, reliability
  and security, could be a challenge.
- Performance: Virtualization adds processing overhead and
  increases latency.
  For latency-sensitive VNFs, achieving predictable low-latency
  performance is a major concern for NFV.
- Elasticity: The NFV elasticity requirement allows the VNF to be
  scaled within the NFVI. Within the NFV framework, it is important
  to support VNF scaling while meeting the SLA requirements,
  including performance, reliability and security.
- Resiliency: Resiliency is a mandatory requirement for an NFV
  network, including both the control plane and data plane.
  Necessary mechanisms must be provided to improve service
  availability and fault management.
- Security: Traditional telecom network functions are developed on
  dedicated hardware located in an isolated network. Security is
  provided by the underlay network. When moving a VNF into a DC
  network with shared infrastructure, security becomes a big
  concern.
- Service Continuity: During VNF failover, migration, mobility and
  upgrade, service downtime may be unavoidable. In NFV, service
  continuity must be supported, which means the provided service
  must be restored once the VNF instance is updated / replaced /
  recovered. This procedure includes the restoration of any ongoing
  data sessions, and it shall be transparent to the user of the NFV
  service.

4.2. NFV Use Cases

Multiple use cases are described by NFV in [nfv-uc]. The following
highlights some of the NFV use cases.

4.2.1. Network Function Virtualization Infrastructure

Network Function Virtualization Infrastructure as a Service
(NFVIaaS), Virtual Network Function as a Service (VNFaaS) and
Virtual Network Platform as a Service (VNPaaS) are the NFV use cases
which describe how telecom operators would like to build up their
telecom cloud infrastructure using virtualization.

Network Function Virtualization Infrastructure (NFVI) is the
totality of all hardware and software components which build up the
environment in which VNFs are deployed.
The NFVI can span several locations. The network providing
connectivity between these locations is regarded as part of the
NFVI.

NFVIaaS is a generic IaaS plus NaaS requirement which allows the
telecom operator to build up a VNF cloud on top of its own DC
infrastructure and any external DC infrastructure. This allows a
telecom operator to migrate some of its network functions into a
3rd party DC when needed. Furthermore, a larger telecom operator
may have multiple DCs in different geographic locations. The
operator may want to set up multiple vDCs, where each vDC may span
several of its physical DC locations. Each vDC is defined for
providing one specific function, e.g. Telco Cloud.

VNFaaS focuses more on the enterprise network, which may have its
own cloud infrastructure with some specific services / applications
running. VNFaaS allows the enterprise to merge and/or extend its
specific services / applications into a 3rd party commercial DC
provided by a telecom operator. With VNFaaS, the enterprise does
not need to manage and control the NFVI or the VNF. However, NFV
performance and portability considerations will apply to
deployments that strive to meet high performance and low latency
requirements.

With VNPaaS, the mobile network traffic, including WiFi traffic, is
routed based on the APN to a specific packet data service server
over the mobile packet core network. Applications running at the
packet data service server may be provided by the enterprise, and
it is possible to have an interface to route the traffic into an
enterprise network. But the infrastructure hosting the application
is fully under the control of the operator.
However, the enterprise has full admin control of the application
and needs to apply all configurations on its own, potentially via a
vDC-like management interface with support of the hosting operator.

All the above use cases need solutions for the operator to share the
infrastructure resources with 3rd parties. Therefore cross domain
orchestration with access control is needed. Besides, the
infrastructure resource management needs to provide a mechanism to
isolate the traffic, not only based on the traffic type, but also
between different operators and enterprises.

4.2.2. Telecom Network Functions Migration

Virtualization of telecom network functions, including Mobile Core
Network functions, IMS functions, Mobile base station functions,
Content Delivery Network (CDN) functions, Home Environment
functions, and Fixed Access Network functions, is described in the
NFV use case document [nfv-uc]. In addition, VNF Forwarding Graphs
are another use case, which describes how user data packets are
forwarded by traversing more than one operator service chain
function, such as DPI, firewall and content filtering, before
reaching the service server.

Migrating the telecom functions includes moving the control plane,
data plane and service network into a cloud based network and using
cloud based protocols to control the data plane. Service continuity,
network security, service availability, and resiliency in both the
control plane and data plane must be ensured during this migration.

5. Elasticity in a Distributed Cloud

Today the usage of personal devices, e.g. smartphones, for internet
service traffic, telecom specific service access, and access to the
corporate network has increased significantly. At the same time,
telecom operators are under pressure to accommodate the increased
service traffic in a fine-grained manner.
Services provided by the telecom network must be delivered in an
environment of increased security, compliance, and auditing
requirements, along with a traffic load that may change
dramatically over time. Providing self-service provisioning in the
telecom cloud requires elastic scaling of the VNF based on the
dynamic service traffic load and on resource management, e.g. of
computing, storage, and networking.

Existing telecom network functions may not be cloud ready yet. Most
of the network functions are stateful and run on either specific
hardware or one big VM. They are not designed to tolerate system
failures across many VMs. These network functions are also very
difficult to configure, scale, update, etc.

Re-engineering may be needed to enable virtualization, e.g.
software adaptation to decouple software from hardware. To be cloud
ready, the telecom network functions need to be redesigned as
stateless functions able to run as multiple instances on small VMs,
which can provide higher application availability. Dynamic scaling
of the application can then be achieved by adding more VMs to the
system.

Virtualization provides the elasticity to scale up / down and scale
out / in with guaranteed computational resources, security
isolation and API access for provisioning it all, without any of the
overhead of managing physical servers. However, there are still many
optimizations which can be used to avoid the growing overhead.

5.1. Elasticity VNF

A Virtualized Network Function (VNF) is an implementation of a
network function that can be deployed on a Network Function
Virtualization Infrastructure.

For a large telecom operator, multiple vDCs may be created across
multiple physical data centers. Each vDC is defined for providing
one specific function, e.g. Telco Cloud.
As the infrastructure resources used by one vDC may be located in
different geographic locations, the network performance may differ
depending on the host and location at which a VM is placed.

VNF capacity may be limited if the VNF can only be scaled within one
network zone, e.g. within one DC in one geographic location. As an
NFVI may span multiple data centers, it is possible to scale an
elasticity VNF across different network zones if needed. For
cross-DC scaling, it is a mandatory requirement to provide the same
level of SLA, including performance, reliability and security.

5.2. Elasticity VNF set

In an NFV network, the services provided by the VNFs normally need
to process user data packets with several selected VNF instances
before delivering them to their destination. The VNF set is an NFV
specific concept. It is a collection of VNFs with unspecified
network connectivity between them. When VNFs work as a VNF set, the
service session is set up among a group of VNFs. For instance, when
mobile users set up a PDN connection for IMS services, multiple
network entities are involved along the PDN connection, including
eNB, Serving GW, PDN GW, P-CSCF, S-CSCF, etc. Another example is
service function chaining, where a service chain refers to one or
more service processing functions which are chained in a specific
order to provide a composite service.

In a telecom cloud, a service session may traverse multiple stateful
and stateless VNF functions of a VNF set. With distributed NFV, it
may cross multiple DCs. In such a cloud, the east-west traffic is
much heavier than the north-south traffic.

When placing a VNF application, it is better to spread the
applications across a wide network zone, which may give better
availability.
However, a wide network zone also increases the network latency,
which can be significant. VNF application dependencies shall be
considered when placing VNFs into the DC.

When scaling, VNFs are not scaled only in relation to the compute
and storage domains. The VNF functions may need to be grouped
together, with auto-scaling techniques applied to the entire group.
The scaling policies, e.g. the ratio between the different VNFs,
need to be applied on the VNF set in aggregate to control the
scaling process.

6. Elasticity with Predictable Performance

6.1. Predictable Performance

High performance, low-latency VNFs are expected in the NFV
framework. The NFVI metrics cover any kind of metric generated by
the NFVI, including not only CPU load on a VM and CPU load on a
host, but also the interrupt rate handled by the hypervisor and
network latency / packet loss.

Virtualization adds overhead which impacts performance. This extra
distortion shall be avoided or, at least, minimized. Achieving
predictable, low-latency performance is a big concern for NFV.

An operator may wish to run standard tests and use the results to
provide KPIs of the VNF. A significant part of a VNF vendor's
performance guarantees will depend on the choice of the
virtualization technology.

6.2. Hardware virtualization features

The virtualization layer is expected to add minimal overhead and
deliver predictable performance between a minimum and maximum
threshold for latency and jitter, which are far more important.
Lightweight virtualization, e.g. container or bare metal, may be
considered for performance sensitive VNF applications. In addition,
it is important to support hardware virtualization features (e.g.
SR-IOV) in order to provide some performance improvement.
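
As a rough illustration of the predictable-performance notion
described above, the sketch below checks a set of latency samples
against a minimum and maximum latency threshold and a jitter bound.
The function name, the sample values, the thresholds, and the choice
of jitter as the mean absolute difference between consecutive
samples are all assumptions made for this example; none of them is
defined by this document.

```python
from statistics import mean


def is_predictable(latencies_ms, min_ms, max_ms, max_jitter_ms):
    """Return True if every sample lies within [min_ms, max_ms] and
    the jitter stays at or below max_jitter_ms.

    Jitter is taken here (an assumption of this sketch) as the mean
    absolute difference between consecutive latency samples.
    """
    if len(latencies_ms) < 2:
        raise ValueError("need at least two latency samples")
    within_bounds = all(min_ms <= s <= max_ms for s in latencies_ms)
    diffs = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
    return within_bounds and mean(diffs) <= max_jitter_ms


# Hypothetical measurements, in milliseconds.
steady = [1.0, 1.2, 1.1, 1.3, 1.2]
bursty = [1.0, 4.8, 1.1, 4.9, 1.0]

print(is_predictable(steady, 0.5, 5.0, 0.5))
print(is_predictable(bursty, 0.5, 5.0, 0.5))
```

Both sample sets stay inside the latency window, but only the first
one also stays inside the jitter bound, which is why a bound on
latency alone would not be enough to call the performance
predictable.
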
Many VNFs require direct access to the device hardware so that they
can offload functionality with throughput rates of millions of
packets a second. Another alternative, which may be more attractive
for latency-sensitive applications, is using non-hypervisor
virtualization, including bare metal and Linux containers.

Optimization is needed to drive high-throughput network workloads
associated with functions such as traffic filtering, NAT and
firewalling. To avoid performance bottlenecks, the virtualization
layer shall have a suitably architected I/O stack.

6.3. Network Overlay

A network overlay adds overhead when forwarding data packets.
Reference [vxlan-p] is a VXLAN performance testing report which
indicates that overlay performance is a concern. Avoiding overlay
connections may be an option which is more attractive for
latency-sensitive applications.

Furthermore, additional network latency may be added when traversing
cross-DC overlay connections. To avoid any additional network
latency, all the functions of a VNF set may be placed in the same
low-latency network zone, e.g. the same host or the same DC.
However, when the capacity limit of the network zone is reached,
scaling out one VNF into another network zone may be needed. In this
case, as the service session has to traverse the same path, ping-pong
traffic between the network zones cannot be avoided. Depending on
the network overlay technologies used for the cross network zone
connection, the additional network latency can vary. In other words,
the network performance may become unpredictable.

7. Elasticity with Reliability

Resiliency is a mandatory requirement for an NFV network, including
both the control plane and data plane. Necessary mechanisms must be
provided to improve service availability and fault management.
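
To make the service availability goal above more concrete, the
following sketch estimates the availability of a group of redundant
VNF instances and the resulting yearly downtime. It assumes that
instances fail independently and that the service stays up as long
as at least one instance is up. Both are simplifying assumptions
made for illustration, and the per-instance availability of 0.99 is
a hypothetical figure, not a value taken from the NFV requirements.

```python
def pool_availability(instance_availability, instances):
    """Availability of a pool that is up while at least one instance
    is up, assuming independent failures (a simplification)."""
    if not 0.0 <= instance_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if instances < 1:
        raise ValueError("need at least one instance")
    return 1.0 - (1.0 - instance_availability) ** instances


def downtime_minutes_per_year(availability):
    """Expected downtime in minutes over a 365-day year."""
    return (1.0 - availability) * 365 * 24 * 60


# Hypothetical per-instance availability of 0.99.
for n in (1, 2, 3):
    a = pool_availability(0.99, n)
    print(f"{n} instance(s): availability={a:.6f}, "
          f"downtime={downtime_minutes_per_year(a):.1f} min/year")
```

Under these assumptions each added instance cuts the expected
downtime by a factor of one hundred. Real deployments would see
less benefit wherever failures are correlated, for example when
instances share a host or a network zone.
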

With virtualization, the use of VNFs can pose additional challenges
to the reliability of the provided services. A VNF instance
typically does not have built-in reliability mechanisms on its host
(i.e., a general purpose server). Instead, there are more risk
factors, such as software failure at various levels including
hypervisors and virtual machines, hardware failure, and instance
migration, that may make a VNF instance unreliable. Even for cloud
ready NFV applications, an HA solution may still be needed, as
components such as the storage or the load balancer may fail. A
service restoration solution is still needed.

One alternative to improve VNF resiliency is to take a snapshot of
the VM periodically. At VNF failure, the network can restore the VM
on the same or a different host using the stored snapshot. However,
the provided service is down during the snapshot recovery, and this
downtime is much longer than what NFV can tolerate. NFV has a
completely different level of reliability requirements, e.g.
recovery time, compared to enterprise cloud applications.

To improve network function resiliency, some kind of high
availability (HA) solution may be needed for the NFV network, which
has the potential to minimize the service downtime at failure.

VNF reliability can be achieved by eliminating any single point of
failure through redundancy of resources, normally including enough
excess capacity in the design to compensate for the performance
decline and even failure of individual resources. That is, a group
of VNF instances providing the same function works as a network
function cluster or pool, which provides protection (e.g. failover)
for the applications and therefore increased availability.

8. Elasticity with Security

TBD

9. Security Considerations

This is a discussion paper which provides input for NFV related
discussions and in itself does not introduce any new security
concerns.

10. IANA Considerations

No actions are required from IANA for this informational document.

11. References

11.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
          Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2234] Crocker, D. and P. Overell (Editors), "Augmented BNF for
          Syntax Specifications: ABNF", RFC 2234, Internet Mail
          Consortium and Demon Internet Ltd., November 1997.

11.2. Informative References

[nfv-arch] Network Functions Virtualization Infrastructure
           Architecture Overview; GS NFV INF 001.

[nfv-rel]  Network Function Virtualization (NFV) Resiliency
           Requirements; ETSI GS NFV-REL 001.

[nfv-uc]   Network Function Virtualization (NFV) Use Cases; ETSI GS
           NFV 001.

[nfv-req]  Network Function Virtualization (NFV) Virtualization
           Requirements; ETSI GS NFV 004.

[nfv-sec]  Network Function Virtualization (NFV) NFV Security Problem
           Statement; ETSI NFV-SEC 001.

[nfv-tem]  Network Function Virtualization (NFV) Terminology for Main
           Concepts in NFV; ETSI GS NFV 003.

[vxlan-p]  Problem Statement for VxLAN Performance Test,
           draft-liu-nvo3-ps-vxlan-perfomance, (work in progress).

12. Acknowledgments

Many people have contributed to the development of this document and
many more will probably do so before we are done with it. While we
cannot thank all contributors, some have played an especially
prominent role. The following have provided essential input: Suresh
Krishnan.

Authors' Addresses

Zu Qiang
Ericsson
8400, boul. Decarie
Ville Mont-Royal, QC,
Canada

Email: Zu.Qiang@Ericsson.com