Network Working Group                                             L. Xia
Internet-Draft                                                     Q. Wu
Intended status: Standards Track                                  Huawei
Expires: May 11, 2015                                            D. King
                                                    Lancaster University
                                                               H. Yokota
                                                                KDDI Lab
                                                                 N. Khan
                                                                 Verizon
                                                       November 11, 2014

       Requirements and Use Cases for Virtual Network Functions
                    draft-xia-vnfpool-use-cases-02

Abstract

   Network function appliances such as subscriber termination,
   firewalls, tunnel switching, intrusion detection, and routing are
   currently provided using dedicated network function hardware.
   As network functions are migrated from dedicated hardware platforms
   into a virtualized environment, a set of use cases with application-
   specific resilience requirements begins to emerge.

   These use cases and requirements cover a broad range of capabilities
   and objectives, which will require detailed investigation and
   documentation in order to identify relevant architecture, protocol
   and procedure solutions to ensure resilience of user services using
   virtualized functions.

   This document provides an analysis of the key reliability
   requirements for applications and functions that may be hosted
   within a virtualized environment.  These NFV engineering
   requirements are based on a variety of use cases and goals, which
   include reliability, scalability, performance, operation and
   automation.

   Note that this document is not intended to provide or recommend
   protocol solutions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 11, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
      1.1. Network Function Virtualization (NFV) Effort  . . . . . .  4
      1.2. Virtual Network Functions (VNF) Resilience Requirements .  4
           1.2.1. Service Continuity . . . . . . . . . . . . . . . .  5
           1.2.2. Topological Transparency . . . . . . . . . . . . .  5
           1.2.3. Load Balancing or Scaling  . . . . . . . . . . . .  5
   2. Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  5
   3. Virtual Network Function (VNF) Pool Architecture . . . . . . .  7
      3.1. VNF Instance Resilience Objectives  . . . . . . . . . . .  8
      3.2. Resilience of Network Connectivity  . . . . . . . . . . .  8
      3.3. Service Continuity  . . . . . . . . . . . . . . . . . . .  9
   4. General Resilience Requirements For VNF Use Cases  . . . . . .  9
      4.1. Resilience for Stateful Service . . . . . . . . . . . . .  9
           4.1.1. State Synchronization  . . . . . . . . . . . . . . 10
      4.2. Auto Scale of Virtual Network Function Instances  . . . . 11
      4.3. Reliable Network Connectivity between Network Nodes . . . 12
      4.4. Existing Operating Virtual Network Function Instance
           Replacement . . . . . . . . . . . . . . . . . . . . . . . 13
      4.5. Combining Different VNF Functions (a VNF Set) . . . . . . 14
      4.6. VNF Resilience Classes  . . . . . . . . . . . . . . . . . 15
      4.7. Multi-tier Network Service  . . . . . . . . . . . . . . . 15
   5. IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 17
   6. Security Considerations  . . . . . . . . . . . . . . . . . . . 17
   7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 17
      7.1.
Normative References . . . . . . . . . . . . . . . . . . 17
      7.2. Informative References  . . . . . . . . . . . . . . . . . 17
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . . 17

1. Introduction

   Network virtualization technologies are finding increasing support
   among network and Data Center (DC) operators.  This is due to
   demonstrable capital cost reduction and operational energy savings,
   simplification of service management, potential for increased
   network and service resiliency, network automation, and service and
   traffic elasticity.

   Within traditional DC networks, various middleboxes, including FW
   (Firewall), NAT (Network Address Translation), LB (Load Balancer),
   WOC (WAN Optimization Controller), etc., are used to provide network
   functions, traffic control and optimization.  Each function is an
   essential part of the entire operator and DC network, and the
   overall service chain (the required traffic path for user traffic)
   combines these functions and capabilities.

   Currently, a significant number of network functions are being
   migrated into virtualized entities; in essence, the middlebox
   capability is implemented in software on commodity hardware using
   well-defined, industry-standard servers.  This allows the creation,
   modification, deletion, scaling, and migration of single network
   functions or groups of them, across few or many servers.

   These virtual network functions (VNF) may be location independent,
   i.e., they may exist across distributed or centralized DC hardware.
   This architecture will pose new issues and great challenges to
   automated provisioning across the DC network, while maintaining
   high availability, fault tolerance, load balancing, and a plethora
   of other requirements, some of which are technology and policy
   based.
   Today, architecture and protocol mechanisms exist for the management
   and operation of server hardware supporting applications; these
   hardware resources are known as server node pools, which may be
   accessed by other servers and clients.  These server node pools have
   a well-established set of requirements related to management,
   availability, scalability and performance.

   [I-D.zong-vnfpool-problem-statement] provides an overview of the
   problems related to the reliability of a VNF set, and also briefly
   introduces a VNF pooling architecture.  This document provides an
   analysis of the key reliability requirements for applications and
   functions that may be hosted within a virtualized environment.
   These Network Functions Virtualization (NFV) engineering
   requirements are based on a variety of use cases and goals, which
   include reliability, scalability, performance, operation and
   automation.

   This document is not intended to provide or recommend solutions.
   The intention of this document is to present an agreed set of
   objectives and use cases for providing network functions using
   virtualized instances, and to identify key requirements across those
   use cases.

1.1. Network Function Virtualization (NFV) Effort

   NFV, an initiative started within the European Telecommunications
   Standards Institute (ETSI), aims to transform the way that network
   operators architect networks by evolving standard IT virtualization
   technology to consolidate many network equipment types onto
   industry-standard high-volume servers, switches and storage.
   The objectives for NFV being specified within the ETSI organization
   include:

   o Rapid service innovation through software-based deployment and
     operationalization of network functions and end-to-end services;

   o Improved operational efficiencies resulting from common automation
     and operating procedures;

   o Reduced power usage achieved by migrating workloads and powering
     down unused hardware;

   o Standardized and open interfaces between network functions and
     their management entities so that such decoupled network elements
     can be provided by different players;

   o Greater flexibility in assigning Virtual Network Functions (VNF)
     to hardware;

   o Improved capital efficiencies compared with dedicated hardware
     implementations.

1.2. Virtual Network Functions (VNF) Resilience Requirements

   Deployment of NFV-based services will require the transition of
   resilient capabilities from physical network nodes, which are
   typically highly available, to entities running Virtual Network
   Functions (VNFs) on an abstracted pool of hardware resources.

   Thus, it is critical to ensure that end-to-end user services, which
   may require a variety of virtualized functions, are reliable and, in
   the event of failure, support seamless failover when required to
   negate or minimize the impact on user services.

   A number of requirements have been discussed and documented within
   the NFV Industry Specification Group (ISG) working groups, including
   [ETSI-HA-USECASE], and are highlighted in the following
   sub-sections.

1.2.1. Service Continuity

   VNFs provide the capability to execute and operate network functions
   on varying types of Virtual Machines (VMs), and subsequently
   physical equipment.  It should be possible to inherently provide
   resiliency at the function level as well as at the physical level.

   Network Functions (NFs) are assigned session IDs, Sequence IDs and
   Authentication IDs.
   This information may be static, dynamic or
   temporal, so it will need to be replicated and maintained as needed
   for failure scenarios.

   A hardware entity such as a storage server or networking node is
   assigned a unique MAC address, which is often pre-configured
   (hardware encoded) and static.

   In the event of a failure of, or capacity limits (memory and CPU)
   on, the hardware hosting VMs and therefore VNFs, it may be necessary
   to move VNFs to another VM and/or hardware platform.  Therefore,
   service continuity must be maintained with no or negligible impact
   on users of the services being provided by the NFs.

1.2.2. Topological Transparency

   Redundant systems are typically configured as active and standby
   nodes, running a specific NF in the same LAN segment.  It is
   possible that they are assigned duplicate IP addresses, and
   sometimes the same MAC address as well.  In the event of an active
   node failure, the standby node can take over transparently.  This
   architecture should be supported by any eventual solution.

   In order to achieve topological transparency and seamless hand-over,
   the dependent nodes should replicate and maintain the necessary
   information so that in the event of failure the standby node takes
   over the service without any disruption to the users.

1.2.3. Load Balancing or Scaling

   When load balancing or scaling sessions, a working session may be
   moved to a new VNF instance, or indeed to a new VM on another
   hardware platform.  Again, service continuity must be maintained.

2. Terminology

   The following terms have been defined by the ETSI Industry
   Specification Group (ISG) responsible for the specification of NFV,
   and are reused in this document:

   Network Function (NF): A functional building block within a network
      infrastructure, which has well-defined external interfaces and a
      functional behavior.
      In practical terms, a Network Function is
      today often a network node or physical appliance.

   NFV Orchestrator: The NFV Orchestrator is in charge of the network-
      wide orchestration and management of NFV Infrastructure (NFVI)
      and resources.  The NFV Orchestrator has control and visibility
      of all VNFs running inside the NFVI.  The NFV Orchestrator
      provides a GUI and external NFV interfaces to the outside world
      to interact with the orchestration software.

   Service Continuity: The continuous delivery of service in
      conformance with service, functional and behavioral
      specifications and SLA requirements, both in the control and data
      planes, for any initiated transaction or session until its full
      completion, even in the event of intervening exceptions or
      anomalies, whether scheduled or unscheduled, malicious,
      intentional or unintentional.  From an end-user perspective,
      service continuity implies continuation of ongoing communication
      sessions with multiple media traversing different network domains
      (access, aggregation, and core network) or different user
      equipment.

   Hypervisor: Software running on a server that allows multiple VMs to
      run on the same physical server.  The hypervisor manages and
      provides network connectivity to virtual machines [RFC7365].

   Network Functions Virtualization (NFV): Moving network functions
      from dedicated hardware platforms onto industry-standard high-
      volume servers, switches and storage.

   Set-top Box (STB): This device contains audio and video decoders and
      is intended to connect to a variety of home user devices, media
      servers and televisions.

   Virtual Machine (VM): Software abstraction of underlying hardware.

   Virtual Application (VA): A Virtual Application is the more general
      term for a piece of software which can be loaded into a Virtual
      Machine.  A VNF is just one type of VA amongst many others, some
      of which may not relate to any VNF (e.g.
      SW-tools or NFV-Infra-internal
      applications).

   Virtualized Network Function (VNF): A VNF provides the same
      functional behavior and interfaces as the equivalent network
      function, but is deployed as software instances building on top
      of a virtualization layer.

   The VNF Problem Statement [I-D.zong-vnfpool-problem-statement]
   defines the terms reliability, VNF, VNF Pool, VNF Pool Manager, and
   VNF Set.  This draft also uses these definitions.  In addition to
   the terms described above, this document also uses the following
   additional terminology:

   VNF Pool: a group of VNF instances providing the same network
      function.

   VNF Pool Manager: an entity that manages a VNF pool, and interacts
      with the service control entity to provide the network function.

   VNF Set: a group of VNF instances that can be used to build network
      services.

3. Virtual Network Function (VNF) Pool Architecture

   Shifting towards virtual network functions presents a number of
   challenges and requirements; this document focuses on those related
   to network function availability and reliability.  In large DC
   environments, a virtual server may need to deal with traffic from
   millions of hosts.  This represents a significant scaling challenge
   for virtual network function deployment and operation.

                        +------------------+
                        | NFV Orchestrator |
                        +------------------+
                           ^            ^
                           |            |
                +----------+            +----------+
                |                                  |
                v                                  v
       +----------------+                +----------------+
       |VNF Pool Manager|<-------------->|VNF Pool Manager|
       +----------------+                +----------------+
                ^                                  ^
                |                                  |
                v                                  v
   +------------------------------+  +------------------------------+
   | +----------+    +----------+ |  | +----------+    +----------+ |
   | |   VNF    |    |   VNF    | |  | |   VNF    |    |   VNF    | |
   | | Instance |... | Instance |<+--+>| Instance |... 
| Instance | |
   | +----------+    +----------+ |  | +----------+    +----------+ |
   |           VNF Pool           |  |           VNF Pool           |
   +------------------------------+  +------------------------------+

              Figure 1: Typical VNF Pool Network Architecture

   As shown in Figure 1, the overall architecture of a VNF Pool-based
   network includes:

   o VNF Instances

   o VNF Pool

   o VNF Pool Manager

   RSerPool [RFC5351] has a similar architecture for providing high
   availability and load balancing; however, RSerPool was designed only
   to manage physical servers and cannot deal with VNF instances.

3.1. VNF Instance Resilience Objectives

   In order to manage VNF-based nodes and provide fault tolerance and
   load sharing across nodes, the VNF instances may be initiated and
   established as logical elements.  A set of VNFs providing the same
   service type is known as a VNF Pool, while a group of network
   functions (FW, LB, DPI) running on multiple VNFs is known as a VNF
   Set.

   Considering the reliability requirements of a VNF-based node
   architecture, it should support several key points, detailed below:

   o Application resource monitoring and health checking;

   o Automatic detection of application failure;

   o Failover to another VNF instance;

   o Transparency to other VNF instances;

   o Isolation and reporting of failures;

   o Replication of state for active/standby network functions.

3.2. Resilience of Network Connectivity

   The other category of reliability requirements concerns the network
   connectivity between any two VNFs, across a VNF set, or between VNF
   Pool Managers.

   The connectivity between the VNF Pool Manager and the VNF instances
   is used to provide registration service for the VNF Set.  A set of
   VNF Pool Managers might be configured to provide reliable
   registration.
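   The reliable-registration behaviour described above (a VNF instance
   registering with a configured set of Pool Managers, with fallback
   when the assigned one is unreachable) can be sketched as follows.
   This is a minimal illustration only, not a specified protocol or
   API; the class and function names are invented for the example.

```python
class PoolManager:
    """Toy stand-in for a VNF Pool Manager's registrar interface."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.registry = set()

    def register(self, instance_id):
        if not self.reachable:
            raise ConnectionError(self.name + " unreachable")
        self.registry.add(instance_id)
        return self.name


def register_with_failover(instance_id, managers):
    """Try each configured Pool Manager in turn until one accepts the
    registration; fail only if the whole set is unreachable."""
    for pm in managers:
        try:
            return pm.register(instance_id)
        except ConnectionError:
            continue  # fail over to the next configured Pool Manager
    raise RuntimeError("no reachable VNF Pool Manager")


managers = [PoolManager("pm-1", reachable=False), PoolManager("pm-2")]
print(register_with_failover("vnf-instance-7", managers))  # prints "pm-2"
```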
   When a VNF instance cannot obtain a registration response from its
   assigned VNF Pool Manager, it should be capable of failing over to
   another VNF Pool Manager.  Connectivity from the VNF Pool Manager to
   the VNF instances may also be monitored periodically.

   The connectivity between Pool Managers is used to maintain
   synchronization of data between VNFs located in different VNF Pools
   or VNF Sets.  This allows every Pool Manager to acquire and maintain
   overall information about all VNFs, and allows the Pool Managers to
   protect each other.

   For all types of network connectivity discussed previously, the key
   reliability requirements stay consistent and include:

   o Automatic detection of link failure;

   o Failover to another usable link;

   o Automated routing recovery.

3.3. Service Continuity

   It is critical to ensure end-to-end service continuity over both
   physical and virtual infrastructure.  A number of requirements exist
   for maintaining user services in the event of network or VNF
   instance failure; these include:

   o Storage and transfer of state information within the VNFs;

   o VNF capacity (memory and CPU) limitations per instance to avoid
     overbooking, and failure of end-to-end services;

   o Automated recovery of end-to-end services after failure
     situations.

4. General Resilience Requirements For VNF Use Cases

4.1. Resilience for Stateful Service

   The service continuity use case provided by the European
   Telecommunications Standards Institute (ETSI) Network Functions
   Virtualization (NFV) Industry Specification Group (ISG)
   [NFV-REL-REQ] describes virtual middlebox appliances providing
   layer-3 to layer-7 services that may be required to maintain
   stateful information, e.g., a stateful vFW.  In the case of hardware
   failure or processing overload of a VNF, in addition to replacing
   the VNF, it is necessary to move its key state information to the
   new VNF for service continuity.
   See Figure 2 (Resilience for Stateful Service) for clarification.

   If there are multiple vFWs on one VM and not enough resources are
   available at the time of failure, two strategies can be taken: one
   is to move as many vFWs as possible to a new location according to
   the available resources; the other is to suspend one or more running
   VNFs at the new location and move all vFWs from the failed hardware
   to it.

          MAC, IP, VLAN,
          Session id, Sequence No, ...
        +-----------------+-----------------+
        |    *************************************
        |    *            |        |Limited |    *        |
        |    *            |        |Resource|    * Suspend|
        |    *            V        |        |    *        V
     +--+-+ +-*--+     +--V-+    +----+   +--V-+ +-V--+ +----+
     |vFw1| |vFw1|     |vFw1|    |vFw2|   |vFw1| |vFw1| |vFw3|
     +----+ +----+     +----+    +----+   +----+ +----+ +----+
     +------------+   +------------+    +-------------------+
     |     VM     |   |     VM     |    |        VM         |
     +------------+   +------------+    +-------------------+
     +------------+   +------------+    +-------------------+
  /-\|            |   |            |    |                   |
  | ||  vServer   |   |  vServer   |    |      vServer      |
  \-/|            |   |            |    |                   |
     +------------+   +------------+    +-------------------+
      Hardware
      Failure

              Figure 2: Resilience for Stateful Service

   In both scenarios, the following requirements need to be satisfied:

   o Supporting maintenance of state information;

   o Supporting transfer of state information;

   o Supporting moving a VNF from one VM to another VM;

   o Supporting moving a subset of VNFs;

   o Seamlessly switching user traffic to alternative VMs and VNFs.

4.1.1. State Synchronization

   As identified in Section 4.1 (Resilience for Stateful Service),
   there is a requirement for state synchronization.  A failure of a
   vFW would result in the loss of active connections transiting the
   node.  Any connection-oriented or secure sessions, including
   enterprise and financial transactions, may be critical, and losing
   them would result in the loss of data.
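   The state-synchronization requirement can be illustrated with a
   small sketch: an active vFW replicates its connection table to a
   standby so that a failover does not drop established sessions.  The
   data structures and names below are illustrative assumptions, not
   part of any defined interface.

```python
class VFW:
    """Toy stateful firewall instance holding a connection table."""

    def __init__(self, name):
        self.name = name
        self.connections = {}  # session id -> per-connection state


def synchronize_state(active, standby):
    """Replicate the active vFW's connection table to the standby; in
    practice this would run continuously, shipping state deltas."""
    standby.connections = dict(active.connections)


def fail_over(active, standby):
    """Promote the standby; it can only take over transparently if the
    session state was replicated beforehand."""
    assert standby.connections == active.connections
    return standby


active, standby = VFW("vfw-active"), VFW("vfw-standby")
active.connections["tcp:192.0.2.1:443"] = {"seq": 1024, "state": "ESTABLISHED"}
synchronize_state(active, standby)
new_active = fail_over(active, standby)  # established sessions survive
```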
   If required, it should be possible to ensure that the VNF Pool
   infrastructure minimizes or negates session data loss if a vFW
   fails.  Prior to the failure, the vFW might advertise and
   synchronize the connection information transiting its node.
   Synchronizing connection state to other vFWs acting as standby nodes
   would provide fast failover and minimal connection interruption to
   users.

   This synchronization mechanism should be supported at the NFV
   infrastructure level; that is, ideally each application does not
   need to implement the redundancy procedures itself (reserving a VM
   resource, instantiating one or more backup server(s), copying the
   state, keeping them in sync, etc.).  Also, such state can be
   embedded in each VNF or stored in external virtual storage, which
   should be supported by the NFV infrastructure.

4.2. Auto Scale of Virtual Network Function Instances

   Adjusting resources to achieve dynamic scaling of VMs is described
   in the ETSI [NFV-INF-UC] use case and in [NFV-REL-REQ].  As shown in
   Figure 3, if more service requests come to a VNF than one physical
   node can accommodate, processing overload occurs.  In this case,
   moving the VNF instance to another physical node with the same
   resource constraints will create a similar overload situation.  A
   more desirable approach is to replicate the VNF instance to one or
   more new VNF instances and at the same time distribute the incoming
   requests across those VNF instances.

   In a scenario where a particular VNF requires increased resource
   allocation to improve overall application performance, the network
   function might be distributed across multiple VMs.  To guarantee
   performance improvement, the hypervisor dynamically adjusts (scaling
   up or scaling down) the resources allocated to each VNF in line with
   the current or predicted performance needs.
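   The scale-out decision described above (replicate instances rather
   than move to an equally constrained node, then spread requests
   across the replicas) can be sketched as follows.  The capacity model
   and instance naming are simplifying assumptions for illustration.

```python
import math


def required_instances(request_rate, per_instance_capacity):
    """Replicas needed to serve the offered load without overload."""
    return max(1, math.ceil(request_rate / per_instance_capacity))


def auto_scale(instances, request_rate, per_instance_capacity):
    """Replicate (or retire) VNF instances to match demand, then
    distribute incoming requests round-robin across the pool."""
    target = required_instances(request_rate, per_instance_capacity)
    while len(instances) < target:
        instances.append("vIPS/IDS-%d" % (len(instances) + 1))  # replicate
    del instances[target:]  # scale back down when load drops
    distribution = [instances[i % len(instances)] for i in range(request_rate)]
    return instances, distribution


# 250 requests/s against a 100 requests/s per-instance capacity
pool, spread = auto_scale(["vIPS/IDS-1"], 250, 100)
print(pool)  # three instances now share the load
```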
                                            +--------------+
   +-------------------+                    |     NFV      |
   |                   |                    |Management and|
   |                   |<==================>|Orchestration |
   |  +---------+      |                    |    Entity    |
   |  |   #1    |      |                    +--------------+
   |  | vIPS/IDS|      |                           /\
   |  +---------+      |                           ||      +---------+
   |  |  VM #1  |------|---+                       || <----|End User1|
   |  +---------+      |   |                       ||      +---------+
   |                   |   |                  +----\/---+
   |  +---------+      |   |                  |         |  +---------+
   |  |   #2    |      |   |                  |         |<-|End User2|
   |  | vIPS/IDS|      |   |                  |         |  +---------+
   |  +---------+      |   |                  | Service |  +---------+
   |  |  VM #2  |------|----------------------| Router  |<-|End User3|
   |  +---------+      |   |                  |         |  +---------+
   |                   |   |                  |         |  +---------+
   |  +---------+      |   |                  |         |<-|End User4|
   |  |   #3    |      |   |                  |         |  +---------+
   |  | vIPS/IDS|      |   |                  |         |  +---------+
   |  +---------+      |   |                  |         |<-|End User5|
   |  |  VM #3  |------|---+                  +---------+  +---------+
   |  +---------+      |   :
   |                   |
   +-------------------+   :

      Figure 3: Auto Scaling of Virtual Network Function Instances

   In this case, the following requirements need to be satisfied:

   o Monitoring/fault detection/diagnosis/recovery - appropriate
     mechanisms for monitoring/fault detection/diagnosis/recovery of
     all components and their states after virtualization, e.g. VNF,
     hardware, hypervisor;

   o Resource scaling - elastic, service-aware resource allocation to
     network functions.

4.3. Reliable Network Connectivity between Network Nodes

   In the reliable network connectivity between VNFs use case provided
   by ETSI [NFV-INF-UC], the management and orchestration entities must
   be informed of changes in network connectivity resources between
   VNFs.  For example, some network connectivity resources may be
   temporarily put in power-saving mode when not in use.  This change
   is not desirable since it may have a great impact on reachability
   and topology.
   As another example, a network connectivity resource may be
   temporarily in a fault state and come back into an active state,
   while some other network connectivity resource becomes permanently
   faulty and is not available for use.

                 +----------------+
                 |NFV Orchestrator|
                 +----------------+

                                   Web
        vDPI        vCache          vFW         vNATPT
      +--------+   +--------+    +--------+    +--------+
      | +----+ |   | +----+ |    | +-++-+ |    | +----+ |
    --|-|    |-|---|-|    |-|----|-| || |-|----|-|    |-|<----
      | +----+ |   | +----+ |    | +-++-+ |    | +----+ |    |
      |        |   |        |    |        |    |        |    v
      | +----+ |   | +----+ |    | +-++-+ |    | +----+ | ,--,--,--.
    <-|-|    |-|---|-|    |-|----|-| || |-|----|-|    |-|( Internet )
      | +----+ |   | +----+ |    | +-++-+ |    | +----+ | `--'--'--'
      +--------+   +--------+    +--------+    +--------+

                 Figure 4: Reliable Network connectivity

   In this case, the following requirements need to be satisfied:

   o Quick detection of link failures;

   o Adding or removing VNF instances;

   o Adding or removing network links between VNFs.

4.4. Existing Operating Virtual Network Function Instance Replacement

   In the replacement of an existing operating VNF instance use case
   provided by ETSI [NFV-INF-UC], the Management and Orchestration
   entity may be configured to support virtualized network function
   replacement.  For example, the Network Service Provider has a
   virtual firewall that is operating.  When the operating vFW
   overloads or fails, the Management and Orchestration entity
   determines that this vFW instance needs to be replaced by another
   vFW instance.
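   The replacement workflow for an overloaded or failed vFW (verify
   spare capacity, instantiate a new instance, move state and traffic,
   retire the old instance) could be sketched as below.  The host and
   instance records are assumptions made for the example, not part of
   the ETSI use case itself.

```python
def replace_vnf(old_instance, hosts, state):
    """Replace an operating vFW: find a host with spare capacity,
    launch a new instance there, transfer state and traffic, then
    retire the old instance."""
    # 1. Verify that capacity is available for a new instance somewhere.
    host = next((h for h in hosts if h["free_slots"] > 0), None)
    if host is None:
        raise RuntimeError("no capacity available for a replacement vFW")
    # 2. Instantiate the new VNF at that location.
    host["free_slots"] -= 1
    new_instance = {"name": "vfw-new@" + host["name"], "active": True}
    # 3. Transfer state and redirect traffic input/output connections.
    new_instance["state"] = state
    # 4. Pause or delete the old instance.
    old_instance["active"] = False
    return new_instance


hosts = [{"name": "host1", "free_slots": 0}, {"name": "host2", "free_slots": 4}]
old = {"name": "vfw-old", "active": True}
new = replace_vnf(old, hosts, {"sessions": 3})
```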
             +----------------+
             |NFV Orchestrator|
             +---|---------|--+
                 |         |
     Create      |         | Report Stats
     and launch  |         | (Traffic, CPU
     new vFW     |         | Failure..)
                 |         |
       +---------|--+   +--|---------+
       |Host2       |   |Host1       |
       |            |   |            |
       | +---++---+ |   | +---++---+ |
       | |vFW||vFW| |   | |vFW||vFW| |
       | +---++---+ |   | +---++---+ |
       | +---++---+ |   | +---++---+ |
       | |vFW||vFW| |   | |vFW||vFW| |
       | +---++---+ |   | +---++---+ |
       +------------+   +------------+

                Figure 5: Existing vFW replacement

   In this case, the following requirements need to be satisfied:

   o Verifying that capacity is available for a new instance of the VNF
     at some location;

   o Instantiating the new instance of the VNF at that location;

   o Transferring the traffic input and output connections from the old
     instance to the new instance.  This may require transfer of state
     between the instances, and reconfiguration of redundancy
     mechanisms;

   o Pausing or deleting the old VNF instance.

4.5. Combining Different VNF Functions (a VNF Set)

   A VNF Set is used to assemble a collection of network functions
   together to support a type of user or end-to-end service.
   Connectivity between the VNFs is known as a VNF Forwarding Graph (a
   graph of logical links connecting VNFs together for steering traffic
   between network functions).  To support the reliability of an
   end-to-end service, in addition to satisfying the aforementioned
   basic use case requirements, a VNF Set presents further reliability
   requirements, as follows:

   o As a whole, any failures (e.g., VNF failures, link failures,
     performance degradation) of a VNF Set can be detected and
     recovered from in time;

   o Keeping the VNF order and relations unchanged when the VNF Set is
     updated;

   o The integrated VNF Set performance is not degraded after it is
     updated.

4.6.
VNF Resilience Classes

   Different end-to-end services (e.g., web, video, financial backend)
   have different classes of resilience requirements for the VNFs.

   The use of class-based resiliency to achieve service resiliency
   SLAs, without "building to peak", is critical for operators.

   VNF resilience classes can be specified by attributes and metrics
   such as the following:

   o Whether the VNF needs state synchronization;

   o Fault Detection and Restoration Time Objective (e.g., real-time,
     near-real-time, non-real-time) and metrics;

   o Service availability metrics;

   o Service Quality metrics;

   o Service reliability;

   o Service Latency metrics for components.

   [More description is needed.]

4.7. Multi-tier Network Service

   Many network services require multiple network functions to be
   performed sequentially on data packets.  A traditional model for a
   multi-tier service is shown below, where for each network function,
   all instances connect to a corresponding entrance point (e.g. LB)
   responsible for sending/receiving data packets to/from the selected
   instance(s), and for steering the data packets between different
   network functions.

                       Service (e.g. VOIP, Web)
   +--------------+  +--------------+       +--------------+
   |  function#1  |  |  function#2  |       |  function#n  |
   | +----------+ |  | +----------+ |       | +----------+ |
   | | Instance | |  | | Instance | |... 
...| | Instance | |
   | +----------+ |  | +----------+ |       | +----------+ |
   |   |data      |  |   |data      |       |   |data      |
   |   |conn      |  |   |conn      |       |   |conn      |
   | +----------+ |  | +----------+ |       | +----------+ |
   | | Entrance | |  | | Entrance | |       | | Entrance | |
   | |  Point   | |  | |  Point   | |       | |  Point   | |
   | +----------+ |  | +----------+ |       | +----------+ |
   +-----+--------+  +-------+------+       +-------+------+
         |data conn          |data conn            |
         +-------------------+---------------------+

                      Figure 7: Multi-tier Service

   Such a model works well when all instances of the same network
   function are topologically close to each other.  However, VNF
   instances are highly distributed in DC networks, Network Operator
   networks and even customer premises.  When VNF instances are
   topologically far from each other, there could be many network
   links/nodes between them for transferring the data packets.  For two
   different VNF instances, it is possible that they are on the same
   physical server, but their entrance points are many links/nodes
   away.  To improve network efficiency, it is desirable to establish
   direct data connections between VNF instances, as shown below:

                       Service (e.g. VOIP, Web)
   +----------+           +----------+           +----------+
   |  VNF#1   | data conn |  VNF#2   | data conn |  VNF#n   |
   | Instance |-----------| Instance |- ... ... -| Instance |
   +----------+           +----------+           +----------+
        ^
        | Virtualization
   +--------------------------------------------------------+
   |                Virtualization Platform                 |
   +--------------------------------------------------------+

             Figure 8: VNF Instances Direct Connection

   In this case, the following requirements need to be satisfied:

   o End-to-end failure detection of VNFs or links for a multi-tier
     service;

   o Keeping the running service unaffected during VNF instance
     transition or failure in the VNF instance direct-connection model.

5.
IANA Considerations

   This document has no actions for IANA.

6. Security Considerations

   TBD.

7. References

7.1. Normative References

7.2. Informative References

   [NFV-INF-UC]
              "Network Functions Virtualisation Infrastructure
              Architecture Part 2: Use Cases", ISG INF Use Case, June
              2013.

   [ETSI-HA-USECASE]
              "Network Function Virtualisation; Use Cases", ISG NFV
              Use Case, June 2013.

   [NFV-REL-REQ]
              "Network Function Virtualisation Resiliency
              Requirements", ISG REL Requirements, June 2013.

   [I-D.zong-vnfpool-problem-statement]
              Zong, N., "Problem Statement for Reliable Virtualized
              Network Function (VNF) Pool", July 2014.

   [RFC7365]  Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y.
              Rekhter, "Framework for Data Center (DC) Network
              Virtualization", RFC 7365, October 2014.

   [RFC5351]  Lei, P., Ong, L., Tuexen, M., and T. Dreibholz, "An
              Overview of Reliable Server Pooling Protocols", RFC 5351,
              September 2008.

Authors' Addresses

   Liang Xia (Frank)
   Huawei
   101 Software Avenue, Yuhua District
   Nanjing, Jiangsu  210012
   China

   Email: frank.xialiang@huawei.com

   Qin Wu
   Huawei
   101 Software Avenue, Yuhua District
   Nanjing, Jiangsu  210012
   China

   Email: bill.wu@huawei.com

   Daniel King
   Lancaster University
   UK

   Email: d.king@lancaster.ac.uk

   Hidetoshi Yokota
   KDDI Lab
   Japan

   Email: yokota@kddilabs.jp

   Naseem Khan
   Verizon
   USA

   Email: naseem.a.khan@verizon.com