Network Working Group                                          A. Morton
Internet-Draft                                                 AT&T Labs
Intended status: Informational                          October 26, 2014
Expires: April 29, 2015

 Considerations for Benchmarking Virtual Network Functions and Their
                             Infrastructure
                    draft-morton-bmwg-virtual-net-02

Abstract

   The Benchmarking Methodology Working Group (BMWG) has traditionally
   conducted laboratory characterization of dedicated physical
   implementations of internetworking functions.  This memo
   investigates the additional considerations that apply when network
   functions are virtualized and performed on commodity off-the-shelf
   hardware.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].
Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 29, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Scope
   3.  Considerations for Hardware and Testing
       3.1.  Hardware Components
       3.2.  Configuration Parameters
       3.3.  Testing Strategies
   4.  Benchmarking Considerations
       4.1.  Comparison with Physical Network Functions
       4.2.  Continued Emphasis on Black-Box Benchmarks
       4.3.  New Benchmarks
       4.4.  Assessment of Benchmark Coverage
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgements
   8.  References
       8.1.  Normative References
       8.2.  Informative References
   Author's Address

1.  Introduction

   The Benchmarking Methodology Working Group (BMWG) has traditionally
   conducted laboratory characterization of dedicated physical
   implementations of internetworking functions.  The black-box
   benchmarks of Throughput, Latency, Forwarding Rates, and others have
   served our industry for many years.  [RFC1242] and [RFC2544] are the
   cornerstones of this work.

   An emerging set of service provider and vendor development goals is
   to reduce costs while increasing the flexibility of network devices,
   and to drastically accelerate their deployment.  Network Function
   Virtualization (NFV) promises to achieve these goals, and has
   therefore garnered much attention.
   It now seems certain that some network functions will be
   virtualized, following the success of cloud computing and virtual
   desktops, which has been supported by sufficient network path
   capacity, performance, and widespread deployment; many of the same
   techniques will help achieve NFV.

   See http://www.etsi.org/technologies-clusters/technologies/nfv for
   more background; for example, the white papers there may be a useful
   starting place.  The Performance and Portability Best Practices
   [NFV.PER001] are particularly relevant to BMWG.  Work-in-progress
   documents are available in the Open Area at
   http://docbox.etsi.org/ISG/NFV/Open/Latest_Drafts/, including drafts
   describing infrastructure aspects and service quality.

2.  Scope

   BMWG will consider the new topic of Virtual Network Functions (VNFs)
   and their related infrastructure, to ensure that common issues are
   recognized from the start, using background materials from industry
   and SDOs (e.g., IETF, ETSI NFV).

   This memo investigates the additional methodological considerations
   necessary when benchmarking VNFs instantiated and hosted on
   commodity off-the-shelf (COTS) hardware.  An essential consideration
   is benchmarking both physical and virtual network functions, thereby
   allowing direct comparison.

   A clearly related goal is to investigate benchmarks for the capacity
   of a COTS platform to host a plurality of VNF instances.  Existing
   networking technology benchmarks will also be considered for
   adaptation to NFV and closely associated technologies.

   A non-goal is any overlap with traditional computer benchmark
   development and its specific metrics (SPECmark suites such as
   SPECCPU).

   A colossal non-goal is any form of architecture development related
   to NFV and associated technologies in BMWG, consistent with BMWG's
   practice since it began work in 1989.

3.  Considerations for Hardware and Testing

   This section lists the new considerations that must be addressed to
   benchmark VNF(s) and their supporting infrastructure.

3.1.  Hardware Components

   New hardware devices will become part of the test set-up:

   1.  High-volume server platforms (COTS, possibly with virtualization
       technology enhancements).

   2.  Storage systems with large capacity, high speed, and high
       reliability.

   3.  Network interface ports specially designed for the efficient
       service of many virtual NICs.

   4.  High-capacity Ethernet switches.

   Labs conducting comparisons of different VNFs may be able to use the
   same hardware platform over many studies, until the steady march of
   innovation overtakes its capabilities (as happens with a lab's
   traffic generation and testing devices today).

3.2.  Configuration Parameters

   It will be necessary to configure and document the settings for the
   entire COTS platform, including:

   o  number of server blades (shelf occupation)

   o  CPUs

   o  caches

   o  storage system

   o  I/O

   as well as the configuration of the facilities that host the VNF
   itself:

   o  hypervisor

   o  virtual machine

   o  infrastructure virtual network

   and, finally, the VNF itself, with items such as:

   o  specific function being implemented in the VNF

   o  number of VNF components in the service function chain

   o  number of physical interfaces and links transited in the service
      function chain

   A sketch of one way to capture these settings in a machine-readable
   record appears below.
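   The following Python sketch illustrates one possible machine-
   readable record of the three configuration layers listed above.  All
   field names and values are hypothetical; a real test campaign would
   substitute its own schema.

      import json

      # Hypothetical record of the full test set-up; the three
      # top-level keys mirror the three layers listed in Section 3.2.
      test_setup = {
          "cots_platform": {
              "server_blades": 4,          # shelf occupation
              "cpus_per_blade": 2,
              "cache_per_cpu_mb": 20,
              "storage_system": "example-array-01",
              "io": "2 x 10GbE, SR-IOV enabled",
          },
          "vnf_host": {
              "hypervisor": "example-hypervisor 1.0",
              "virtual_machine": {"vcpus": 4, "ram_gb": 8},
              "infrastructure_virtual_network": "overlay, VLAN 100",
          },
          "vnf": {
              "function": "virtual firewall",  # function in the VNF
              "sfc_components": 3,     # VNF components in the chain
              "physical_links_transited": 2,
          },
      }

      # Store the record alongside the benchmark results, so that
      # every reported number can be traced to a fully documented
      # configuration.
      print(json.dumps(test_setup, indent=2))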
3.3.  Testing Strategies

   The concept of characterizing performance at capacity limits may
   change.  For example:

   1.  It may be more representative of system capacity to characterize
       the case where the Virtual Machines (VMs) hosting the VNFs are
       operating at 50% utilization, and are therefore sharing the
       "real" processing power across many VMs.

   2.  Another important case stems from the need to partition
       functions.  A noisy neighbor (a VM hosting a VNF in an infinite
       loop) would ideally be isolated, and the performance of the
       other VMs would continue according to their specifications.

   3.  System errors will likely occur as transients, implying a
       distribution of performance characteristics with a long tail
       (as with latency) and leading to the need for longer-term tests
       of each combination of configuration and test parameters.  (A
       sketch of tail-sensitive reporting appears after this list.)

   4.  The desire for elasticity and flexibility among network
       functions will include tests where there is constant flux in
       the set of VM instances.  Requests for new VMs, and releases of
       VMs hosting VNFs that are no longer needed, would be a normal
       operational condition.

   5.  All physical things can fail, so benchmarking efforts can also
       examine recovery aided by the virtual architecture, with
       different approaches to resiliency.
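   The sketch below illustrates the tail-sensitive reporting motivated
   by item 3: rather than a mean alone, a long benchmark run is
   summarized by high percentiles.  The latency samples here are
   synthetic stand-ins for externally measured values.

      import random

      def percentile(sorted_samples, p):
          # Approximate nearest-rank percentile of an ascending-sorted
          # list, for 0 < p <= 100.
          k = max(0, min(len(sorted_samples) - 1,
                         round(p / 100.0 * len(sorted_samples)) - 1))
          return sorted_samples[k]

      # Synthetic latencies (milliseconds): mostly fast, with
      # occasional transient spikes that produce a long tail.
      samples = sorted(
          random.expovariate(2.0) if random.random() < 0.99
          else 10.0 + random.expovariate(0.5)
          for _ in range(100_000)
      )

      for p in (50, 95, 99, 99.9):
          print("p%-5s %.2f ms" % (p, percentile(samples, p)))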
4.  Benchmarking Considerations

   This section discusses considerations related to benchmarks
   applicable to VNFs and their associated technologies.

4.1.  Comparison with Physical Network Functions

   In order to compare the performance of virtual designs and
   implementations with their physical counterparts, identical
   benchmarks must be used.  Since BMWG has already developed
   specifications for many network functions, existing benchmarks will
   be re-used through references, while allowing for the possibility of
   benchmark curation during the development of new methodologies.
   Consideration should be given to quantifying the number of parallel
   VNFs required to achieve performance comparable to a given physical
   device, or to determining whether some limit of scale is reached
   before the VNFs can achieve a comparable level.

4.2.  Continued Emphasis on Black-Box Benchmarks

   When the network functions under test are based on Open Source code,
   there may be a tendency to rely on internal measurements to some
   extent, especially when the externally observable phenomena only
   support an inference of internal events (such as routing protocol
   convergence).  However, external observations remain essential as
   the basis for benchmarks.  Internal observations with a fixed
   specification and interpretation may be provided in parallel, for
   example to assist the development of operations procedures when the
   technology is deployed.  Internal metrics and measurements from Open
   Source implementations may be the only direct source of performance
   results in a desired dimension, but corroborating external
   observations are still required to assure that the integrity of
   measurement discipline was maintained for all reported results.

   A related aspect of benchmark development arises when the scope
   includes multiple approaches to a common function under the same
   benchmark.  For example, there are many ways to arrange for the
   activation of a network path between interface points, and the
   activation times can be compared if the start-to-stop activation
   interval has a generic and unambiguous definition.  Thus, generic
   benchmark definitions are preferred over technology- or protocol-
   specific definitions where possible.

4.3.  New Benchmarks

   There will be new classes of benchmarks needed for network design
   and for assistance in developing operational practices (possibly
   including automated management and orchestration at deployment
   scale).  Examples follow in the paragraphs below, many of which are
   prompted by the goals of increased elasticity and flexibility of the
   network functions, along with accelerated deployment times.

   Time to deploy VNFs: In cases where the COTS hardware is already
   deployed and ready for service, it is valuable to know the response
   time when a management system is tasked with "standing up" hundreds
   of virtual machines and the VNFs they will host.

   Time to migrate VNFs: In cases where a rack or shelf of hardware
   must be removed from active service, it is valuable to know the
   response time when a management system is tasked with "migrating"
   some number of virtual machines, and the VNFs they currently host,
   to alternate hardware that will remain in service.

   Time to create a virtual network in the COTS infrastructure: This is
   a somewhat simplified version of existing benchmarks for convergence
   time, in that the process is initiated by a request from
   (centralized or distributed) control, rather than inferred from
   network events (such as link failure).  The successful response time
   would remain dependent on dataplane observations to confirm that the
   network is ready to perform; a sketch of such an externally observed
   activation measurement appears below.
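   The following sketch illustrates the generic, black-box measurement
   style discussed in Sections 4.2 and 4.3: the start-to-stop interval
   runs from the control request until the data plane demonstrably
   forwards.  The two callables are hypothetical stand-ins for a
   controller API and an external traffic probe; neither is defined by
   this memo.

      import time

      def measure_activation(request_activation,
                             dataplane_probe_succeeds,
                             timeout_s=60.0, poll_interval_s=0.1):
          # Returns the seconds elapsed from the activation request to
          # the first successful dataplane observation, or None if the
          # activation is not confirmed within the timeout.
          start = time.monotonic()
          request_activation()                # start of the interval
          while time.monotonic() - start < timeout_s:
              if dataplane_probe_succeeds():  # ready to perform
                  return time.monotonic() - start
              time.sleep(poll_interval_s)
          return None

   Because the harness relies only on the external request and an
   external probe, the same measurement applies equally to physical and
   virtual implementations, supporting the direct comparison called for
   in Section 4.1.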
4.4.  Assessment of Benchmark Coverage

   It can be useful to organize benchmarks according to their
   applicable lifecycle stage and the performance criteria they intend
   to assess.  The table below provides a way to organize benchmarks
   such that there is a clear indication of coverage for each
   intersection of lifecycle stage and performance criterion.

   |---------------+---------+----------+-------------|
   |               |  SPEED  | ACCURACY | RELIABILITY |
   |---------------+---------+----------+-------------|
   | Activation    |         |          |             |
   |---------------+---------+----------+-------------|
   | Operation     |         |          |             |
   |---------------+---------+----------+-------------|
   | De-activation |         |          |             |
   |---------------+---------+----------+-------------|

   For example, the "Time to deploy VNFs" benchmark described above
   would be placed at the intersection of Activation and Speed, making
   it clear that there are other potential performance criteria to
   benchmark, such as the percentage of unsuccessful VM/VNF stand-ups
   in a set of 100 attempts.  This example emphasizes that the
   Activation and De-activation lifecycle stages are key areas for NFV
   and its related infrastructure, and it encourages expansion beyond
   traditional benchmarks for normal operation.  Thus, reviewing
   benchmark coverage using this table (sometimes called the 3x3
   matrix) can be a worthwhile exercise in BMWG.

   Comment/Discussion:

   In one of the first applications of the 3x3 matrix in BMWG, we
   discovered that metrics on measured size, capacity, or scale do not
   easily match one of the three columns above.  There are three
   alternatives to resolve this:

   1.  Add a column, Scalability, but then metrics would be expected in
       most of the Activation, Operation, and De-activation rows (which
       may not be the case).

   2.  Include scalability under Reliability: this fits the user
       perspective of the 3x3 matrix, because the size or capacity of a
       device contributes to the likelihood that a request will be
       blocked, or that operation will be unreliable in an overload
       state.

   3.  Keep size, capacity, and scale metrics separate from the 3x3
       matrix, and present the results for key benchmarks in different
       versions of the matrix, where the title of each matrix provides
       the details of configuration and scale.

   Alternative 3 would address a discussion comment from IETF-90, so it
   seems to cover a range of desired features.  A small sketch of
   tracking coverage with the matrix appears below.
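   As a simple illustration of the coverage assessment, the sketch
   below records benchmarks in the nine cells of the matrix and lists
   the intersections that remain empty; the benchmark names are
   illustrative only.

      STAGES = ("Activation", "Operation", "De-activation")
      CRITERIA = ("Speed", "Accuracy", "Reliability")

      # One list of benchmark names per cell of the 3x3 matrix.
      coverage = {(s, c): [] for s in STAGES for c in CRITERIA}
      coverage[("Activation", "Speed")].append("Time to deploy VNFs")
      coverage[("Activation", "Reliability")].append(
          "Percentage of unsuccessful VM/VNF stand-ups")

      # Report the intersections that still lack benchmarks.
      for (stage, criterion), benchmarks in sorted(coverage.items()):
          if not benchmarks:
              print("No benchmark yet for", stage, "x", criterion)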
5.  Security Considerations

   Benchmarking activities as described in this memo are limited to
   technology characterization of a Device Under Test / System Under
   Test (DUT/SUT) using controlled stimuli in a laboratory environment,
   with dedicated address space and the constraints specified in the
   sections above.

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network or misroute traffic to the test
   management network.

   Further, benchmarking is performed on a "black-box" basis, relying
   solely on measurements observable external to the DUT/SUT.

   Special capabilities SHOULD NOT exist in the DUT/SUT specifically
   for benchmarking purposes.  Any implications for network security
   arising from the DUT/SUT SHOULD be identical in the lab and in
   production networks.

6.  IANA Considerations

   No IANA action is requested at this time.

7.  Acknowledgements

   The author acknowledges an encouraging conversation on this topic
   with Mukhtiar Shaikh and Ramki Krishnan in November 2013.
   Bhuvaneswaran Vengainathan, Bhavani Parise, and Ilya Varlashkin have
   provided useful suggestions to expand these considerations.

8.  References

8.1.  Normative References

   [NFV.PER001]
              "Network Function Virtualization: Performance and
              Portability Best Practices", ETSI Group Specification GS
              NFV-PER 001 V1.1.1, June 2014.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330, May
              1998.

   [RFC2544]  Bradner, S. and J. McQuaid, "Benchmarking Methodology for
              Network Interconnect Devices", RFC 2544, March 1999.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC2680]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Packet Loss Metric for IPPM", RFC 2680, September 1999.

   [RFC2681]  Almes, G., Kalidindi, S., and M. Zekauskas, "A Round-trip
              Delay Metric for IPPM", RFC 2681, September 1999.

   [RFC3393]  Demichelis, C. and P. Chimento, "IP Packet Delay
              Variation Metric for IP Performance Metrics (IPPM)",
              RFC 3393, November 2002.

   [RFC3432]  Raisanen, V., Grotefeld, G., and A. Morton, "Network
              performance measurement with periodic streams", RFC 3432,
              November 2002.

   [RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov,
              S., and J. Perser, "Packet Reordering Metrics", RFC 4737,
              November 2006.

   [RFC5357]  Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
              Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
              RFC 5357, October 2008.

   [RFC5905]  Mills, D., Martin, J., Burbank, J., and W. Kasch,
              "Network Time Protocol Version 4: Protocol and Algorithms
              Specification", RFC 5905, June 2010.

8.2.  Informative References

   [RFC1242]  Bradner, S., "Benchmarking terminology for network
              interconnection devices", RFC 1242, July 1991.

   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
              Applicability Statement", RFC 5481, March 2009.

   [RFC6248]  Morton, A., "RFC 4148 and the IP Performance Metrics
              (IPPM) Registry of Metrics Are Obsolete", RFC 6248, April
              2011.

   [RFC6390]  Clark, A. and B. Claise, "Guidelines for Considering New
              Performance Metric Development", BCP 170, RFC 6390,
              October 2011.

Author's Address

   Al Morton
   AT&T Labs
   200 Laurel Avenue South
   Middletown, NJ 07748
   USA

   Phone: +1 732 420 1571
   Fax:   +1 732 368 1192
   Email: acmorton@att.com
   URI:   http://home.comcast.net/~acmacm/