Network Working Group                                             T. Kim
Internet-Draft                                                   E. Paik
Intended status: Informational                                        KT
Expires: April 21, 2016                                 October 19, 2015


 Considerations for Benchmarking High Availability of NFV Infrastructure
                       draft-kim-bmwg-ha-nfvi-00

Abstract

   This document lists additional considerations and strategies for
   benchmarking high availability of NFV infrastructure when network
   functions are virtualized and performed in NFV infrastructure.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 21, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Considerations for Benchmarking High Availability of NFV
       Infrastructure
     2.1.  Definitions for High Availability Benchmarking Test
     2.2.  Configuration Parameters for Benchmarking Test
   3.  High Availability Benchmarking test strategies
     3.1.  Single Point of Failure Check
     3.2.  Failover Time Check
   4.  Security Considerations
   5.  IANA Considerations
   6.  Normative References
   Authors' Addresses

1.  Introduction

   As both the volume and the variety of traffic increase massively,
   operators are adopting SDN and NFV, the new networking paradigm, in
   order to secure scalability and flexibility.  Service providers and
   vendors are developing SDN and NFV solutions and VNFs (Virtual
   Network Functions) to reduce CAPEX and OPEX, focusing on increasing
   the scalability and flexibility of the network through programmable
   networking.

   To replace legacy network devices with VNFs and to select the
   fittest one among the various vendor products, operators want to
   ensure the availability and resiliency of the VNF products and their
   infrastructures.  There are also concerns about failures that cannot
   be measured.

   Among VNFs, vEPC (virtual Evolved Packet Core) is getting much
   attention, and some telecommunications companies have already
   deployed vEPC partially.  Currently in 4G mobile communication,
   availability reaches 99.9999%, which corresponds to a downtime of
   about 31.5 seconds per year.  Therefore, VNFs like vEPC must
   guarantee six nines to replace dedicated hardware network functions.
   From the telecommunication company's point of view, availability is
   the most important feature, so benchmarking tests for the high
   availability of VNFs and NFV infrastructure are also important.
   This document investigates considerations for benchmarking the high
   availability of NFV infrastructure.

2.  Considerations for Benchmarking High Availability of NFV
    Infrastructure

   This section defines and lists considerations which must be addressed
   to benchmark the high availability of VNFs from the NFV
   infrastructure perspective.

2.1.  Definitions for High Availability Benchmarking Test

   Generally, availability is defined as follows, where MTBF stands for
   Mean Time Between Failures and MTTR stands for Mean Time To Recovery.

      Availability = MTBF / (MTBF + MTTR)

   A failover procedure consists of the following steps:

      Failure -> Detection -> Isolation -> Recovery

   Therefore, the failover time is measured starting from the time when
   the failure happens.

2.2.  Configuration Parameters for Benchmarking Test

   o  Types of VNFs; the following differ depending on the type of VNF:

      1.  What kind of operations they perform

      2.  How much CPU, memory, and storage they need

      3.  What kind of traffic pattern they usually face

   o  The specification of the physical machine on which the VMs run

   o  The mapping ratio of hardware resources to the VMs (virtual
      machines) where the VNF runs, such as vCPU:pCPU (virtual CPU to
      physical CPU), vMEM:pMEM (virtual memory to physical memory), and
      vNICs, as shown in the figure at the end of this section.

   o  Types of hypervisor and the different limitations of their roles

   o  Cloud design pattern of NFVI

   o  The composition of network functions in VNFs: for example, in
      some vEPC implementations, PGW (Packet Data Network Gateway) and
      SGW (Serving Gateway) are combined, or PGW, SGW, and MME are
      combined.
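
   As a minimal sketch of the definitions in Section 2.1, and not part
   of any benchmarking specification, the following Python fragment
   computes availability from MTBF and MTTR, the yearly downtime implied
   by an availability target, and the failover time broken down by
   phase.  The function names, the 365-day year, and the sample values
   are illustrative assumptions that do not come from this document.

```python
# Illustrative helpers only; all names here are hypothetical.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def availability(mtbf: float, mttr: float) -> float:
    """Availability = MTBF / (MTBF + MTTR); both values in the same unit."""
    if mtbf < 0 or mttr < 0 or mtbf + mttr == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf / (mtbf + mttr)


def downtime_per_year(avail: float) -> float:
    """Downtime in seconds per year implied by an availability in [0, 1]."""
    if not 0.0 <= avail <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - avail) * SECONDS_PER_YEAR


def failover_time(t_failure: float, t_detection: float,
                  t_isolation: float, t_recovery: float) -> dict:
    """Split a failover into phases.

    All timestamps are seconds on one common clock and must follow the
    order Failure -> Detection -> Isolation -> Recovery.  The total is
    counted from the failure itself, not from its detection.
    """
    stamps = [t_failure, t_detection, t_isolation, t_recovery]
    if stamps != sorted(stamps):
        raise ValueError("timestamps are out of order")
    return {
        "detection": t_detection - t_failure,
        "isolation": t_isolation - t_detection,
        "recovery": t_recovery - t_isolation,
        "total": t_recovery - t_failure,
    }


if __name__ == "__main__":
    # Six nines leaves roughly half a minute of downtime per year.
    print(round(downtime_per_year(0.999999), 1))  # 31.5
    # Sample timestamps (assumed values) for one failover event.
    print(failover_time(0.0, 0.2, 0.5, 1.5)["total"])  # 1.5
```

   Counting the total from the failure timestamp rather than from the
   detection timestamp keeps the detection delay inside the measured
   failover time, which matches the failover procedure given in
   Section 2.1.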

   +---------------+                   +---------------+
   | vCPU for VNF1 |                   |               |
   +---------------+                   | vCPU for VNF2 |
   +---------------+                   |               | +---------------+
   | vCPU for VNF2 |                   +---------------+ | vCPU for VNF1 |
   +---------------+                                     +---------------+
   +---------------+ +---------------+ +---------------+ +---------------+
   | vCPU for VNF3 | | vCPU for VNF2 | | vCPU for VNF3 | | vCPU for VNF3 |
   +---------------+ +---------------+ +---------------+ +---------------+
   +---------------+ +---------------+ +---------------+ +---------------+
   |    pCPU 1     | |    pCPU 2     | |    pCPU 3     | |    pCPU 4     |
   +---------------+ +---------------+ +---------------+ +---------------+

3.  High Availability Benchmarking test strategies

   This section discusses benchmarking test strategies for the high
   availability of NFV infrastructure.  For the continuity of services,
   the following two checks must be performed.

3.1.  Single Point of Failure Check

   All devices and software can potentially fail; therefore, redundancy
   is mandatory.  First, the redundancy implementation of every single
   point of the NFV infrastructure must be tested, as shown below.

   o  Hardware

      *  Power supply

      *  CPU

      *  MEM

      *  Storage

      *  Network: NICs, ports, LAN cables, etc.

   o  Software

      *  The redundancy of VNFs

      *  The redundancy of VNF paths

      *  The redundancy of OvS

      *  The redundancy of vNICs

      *  The redundancy of VMs

   +--------------------------------------------------------------+
   |                       Physical Machine                       |
   |                                                              |
   |                                                              |
   |  +--------------------------------------------------------+  |
   |  |                Virtual Network Function                |  |
   |  +--------------------------------------------------------+  |
   |  +--------------------------------------------------------+  |
   |  |                    Virtual Machine                     |  |
   |  +--------------------------------------------------------+  |
   |  +--------------------------------------------------------+  |
   |  |                     Virtual Bridge                     |  |
   |  +--------------------------------------------------------+  |
   |  +--------------------------------------------------------+  |
   |  |                       Hypervisor                       |  |
   |  +--------------------------------------------------------+  |
   |  +--------------------------------------------------------+  |
   |  |                    Operating System                    |  |
   |  +--------------------------------------------------------+  |
   |  +--------------------------------------------------------+  |
   |  |                    Generic Hardware                    |  |
   |  +--------------------------------------------------------+  |
   +--------------------------------------------------------------+

3.2.  Failover Time Check

   Even though the components of NFV infrastructure are redundant,
   failover time can be long.  For example, when a failure happens, the
   failed VNF stops and should be replaced by a backup VNF, but the time
   needed to switch to the new VNF can vary depending on whether the VNF
   is stateless or stateful.  In other words, redundancy alone does not
   guarantee high availability; a short failover time is also required
   to reach high availability.  This section discusses a strategy for
   measuring failover time.

   In order to measure the failover time precisely, the time when the
   failure happens must be defined.
   The following are three candidate criteria for the time when the
   failure happens.

   1.  The time when the failure actually happens

   2.  The time when the failure is detected by the manager or
       controller

   3.  The time when the failure event is reported to the operator

   Because the actual operations in VNFs and NFV infrastructure start to
   change when the failure happens, the precise time of the failure must
   be criterion 1.  Failover time is therefore measured starting from
   the time when the failure happens at a point in the NFV
   infrastructure or in the VNF itself.

4.  Security Considerations

   TBD.

5.  IANA Considerations

   No IANA Action is requested at this time.

6.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

Authors' Addresses

   Taekhee Kim
   KT
   Infra R&D Lab. KT
   17 Woomyeon-dong, Seocho-gu
   Seoul  137-792
   Korea

   Phone: +82-2-526-6688
   Fax:   +82-2-526-5200
   Email: taekhee.kim@kt.com


   EunKyoung Paik
   KT
   Infra R&D Lab. KT
   17 Woomyeon-dong, Seocho-gu
   Seoul  137-792
   Korea

   Phone: +82-2-526-5233
   Fax:   +82-2-526-5200
   Email: eun.paik@kt.com
   URI:   http://mmlab.snu.ac.kr/~eun/