Network Working Group                                          A. Morton
Internet-Draft                                                 AT&T Labs
Intended status: Informational                          February 2, 2015
Expires: August 6, 2015


   Considerations for Benchmarking Virtual Network Functions and Their
                              Infrastructure
                     draft-morton-bmwg-virtual-net-03

Abstract

   The Benchmarking Methodology Working Group (BMWG) has traditionally
   conducted laboratory characterization of dedicated physical
   implementations of internetworking functions.  This memo
   investigates additional considerations when network functions are
   virtualized and performed on commodity off-the-shelf hardware.

NOTES:

   3.4  Added interactions/dependencies within resource domains

   4.3  Added new metrics for characterization: PDV, reordering, mean
        delay, etc.

   4.4  Resolved the question of capacity and the 3x3 Matrix

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].
Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 6, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Scope
   3.  Considerations for Hardware and Testing
     3.1.  Hardware Components
     3.2.  Configuration Parameters
     3.3.  Testing Strategies
     3.4.  Attention to Shared Resources
   4.  Benchmarking Considerations
     4.1.  Comparison with Physical Network Functions
     4.2.  Continued Emphasis on Black-Box Benchmarks
     4.3.  New Benchmarks and Related Metrics
     4.4.  Assessment of Benchmark Coverage
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Author's Address

1.  Introduction

   The Benchmarking Methodology Working Group (BMWG) has traditionally
   conducted laboratory characterization of dedicated physical
   implementations of internetworking functions.  The black-box
   benchmarks of Throughput, Latency, Forwarding Rates, and others have
   served our industry for many years.  [RFC1242] and [RFC2544] are the
   cornerstones of this work.

   An emerging set of service provider and vendor development goals is
   to reduce costs while increasing the flexibility of network devices,
   and to drastically accelerate their deployment.  Network Function
   Virtualization (NFV) promises to achieve these goals and has
   therefore garnered much attention.
   It now seems certain that some network functions will be
   virtualized, following the success of cloud computing and virtual
   desktops, which are supported by sufficient network path capacity,
   performance, and widespread deployment; many of the same techniques
   will help to achieve NFV.

   See http://www.etsi.org/technologies-clusters/technologies/nfv for
   more background; for example, the white papers there may be a useful
   starting place.  The Performance and Portability Best Practices
   [NFV.PER001] are particularly relevant to BMWG.  Work-in-progress
   documents are available in the Open Area
   http://docbox.etsi.org/ISG/NFV/Open/Latest_Drafts/, including drafts
   describing infrastructure aspects and service quality.

2.  Scope

   BMWG will consider the new topic of Virtual Network Functions and
   related Infrastructure to ensure that common issues are recognized
   from the start, using background materials from industry and SDOs
   (e.g., IETF, ETSI NFV).

   This memo investigates the additional methodological considerations
   necessary when benchmarking VNFs instantiated and hosted on
   commodity off-the-shelf (COTS) hardware.  An essential consideration
   is benchmarking both physical and virtual network functions, thereby
   allowing direct comparison.

   A clearly related goal is to investigate benchmarks for the capacity
   of COTS hardware to host a plurality of VNF instances.  Existing
   networking technology benchmarks will also be considered for
   adaptation to NFV and closely associated technologies.

   A non-goal is any overlap with traditional computer benchmark
   development and its specific metrics (SPECmark suites such as
   SPECCPU).

   A colossal non-goal is any form of architecture development related
   to NFV and associated technologies in BMWG, consistent with BMWG's
   practice since it began work in 1989.

3.  Considerations for Hardware and Testing

   This section lists the new considerations that must be addressed to
   benchmark VNF(s) and their supporting infrastructure.

3.1.  Hardware Components

   New hardware devices will become part of the test set-up:

   1.  High-volume server platforms (COTS, possibly with virtualization
       technology enhancements).

   2.  Storage systems with large capacity, high speed, and high
       reliability.

   3.  Network interface ports specially designed to serve many virtual
       NICs efficiently.

   4.  High-capacity Ethernet switches.

   Labs conducting comparisons of different VNFs may be able to use the
   same hardware platform over many studies, until the steady march of
   innovation overtakes its capabilities (as happens with the lab's
   traffic generation and testing devices today).

3.2.  Configuration Parameters

   It will be necessary to configure and document the settings for the
   entire COTS platform, including:

   o  number of server blades (shelf occupation)

   o  CPUs

   o  caches

   o  storage system

   o  I/O

   as well as the configuration of the virtual components that host the
   VNF itself:

   o  hypervisor

   o  virtual machine

   o  infrastructure virtual network

   and finally, the VNF itself, with items such as:

   o  the specific function being implemented in the VNF

   o  the number of VNF components in the service function chain

   o  the number of physical interfaces and links transited in the
      service function chain
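   As a non-normative illustration, the settings above could be
   captured in a structured record that accompanies each set of
   benchmark results.  The sketch below (in Python, with purely
   hypothetical field names, since this memo defines no schema) shows
   one possible way a lab might organize such a record.

      # Illustrative configuration record for a benchmark report.
      # All field names are hypothetical; this memo defines no schema.
      from dataclasses import dataclass
      from typing import List

      @dataclass
      class PlatformConfig:
          server_blades: int     # shelf occupation
          cpus: str              # CPU model and core count
          caches: List[str]      # cache configuration
          storage: str           # storage system
          io: str                # I/O configuration

      @dataclass
      class VirtualizationConfig:
          hypervisor: str        # hypervisor type and version
          virtual_machine: str   # vCPU/memory profile per VM
          virtual_network: str   # infrastructure virtual network

      @dataclass
      class VNFConfig:
          function: str          # specific function implemented in the VNF
          chain_components: int  # VNF components in the service chain
          physical_links: int    # physical interfaces/links transited

      @dataclass
      class TestSetup:
          platform: PlatformConfig
          virtualization: VirtualizationConfig
          vnf: VNFConfig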
3.3.  Testing Strategies

   The concept of characterizing performance at capacity limits may
   change.  For example:

   1.  It may be more representative of system capacity to characterize
       the case where the Virtual Machines (VMs) hosting the VNFs are
       operating at 50% utilization, and are therefore sharing the
       "real" processing power across many VMs.

   2.  Another important case stems from the need to partition
       functions.  A noisy neighbor (a VM hosting a VNF in an infinite
       loop) would ideally be isolated, and the performance of the
       other VMs would continue according to their specifications.

   3.  System errors will likely occur as transients, implying a
       distribution of performance characteristics with a long tail
       (as with latency) and leading to the need for longer-term tests
       of each set of configuration and test parameters.

   4.  The desire for elasticity and flexibility among network
       functions will motivate tests in which the set of VM instances
       is in constant flux.  Requests for new VMs and releases of VMs
       hosting VNFs that are no longer needed would be a normal
       operational condition.

   5.  All physical things can fail, and benchmarking efforts can also
       examine recovery aided by the virtual architecture, with
       different approaches to resiliency.

3.4.  Attention to Shared Resources

   Since many components of the new NFV Infrastructure are virtual,
   test set-up design requires prior knowledge of the interactions and
   dependencies among the various resource domains in the System Under
   Test (SUT).  For example, a virtual machine performing the role of a
   traditional tester function, such as generating and/or receiving
   traffic, should avoid sharing any SUT resources with the Device
   Under Test (DUT).  Otherwise, the results will have unexpected
   dependencies not encountered in physical device benchmarking.  The
   shared-resource aspect of test design remains one of the critical
   challenges to overcome in a reasonable way to produce useful
   results.

4.  Benchmarking Considerations

   This section discusses considerations related to benchmarks
   applicable to VNFs and their associated technologies.

4.1.  Comparison with Physical Network Functions

   In order to compare the performance of virtual designs and
   implementations with their physical counterparts, identical
   benchmarks must be used.  Since BMWG has already developed
   specifications for many network functions, existing benchmarks will
   be re-used through references, while allowing for the possibility of
   benchmark curation during the development of new methodologies.
   Consideration should be given to quantifying the number of parallel
   VNFs required to achieve performance comparable to a given physical
   device, or to determining whether some limit of scale was reached
   before the VNFs could achieve the comparable level.
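   As a non-normative sketch of that quantification (in Python, with
   hypothetical names and example numbers), a lab might scan the
   aggregate results measured at increasing VNF counts and report
   either the count at which parity with the physical device is
   reached, or the best level achieved before the scale limit:

      # Illustrative only: estimate how many parallel VNF instances are
      # needed to match a physical device's benchmark result, or report
      # the level reached at the scale limit if parity is never achieved.
      def vnf_instances_for_parity(physical_result, aggregate_results):
          """aggregate_results[i] is the result measured with i+1 VNFs."""
          for count, aggregate in enumerate(aggregate_results, start=1):
              if aggregate >= physical_result:
                  return count, aggregate
          return None, aggregate_results[-1]   # scale limit reached first

      # Example: a physical device forwards 10 Mpps; the VNF aggregate
      # levels off near 7.6 Mpps at 8 instances, so parity is not reached.
      print(vnf_instances_for_parity(
          10e6, [1.5e6, 3.0e6, 4.4e6, 5.6e6, 6.6e6, 7.2e6, 7.5e6, 7.6e6]))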
4.2.  Continued Emphasis on Black-Box Benchmarks

   When the network functions under test are based on Open Source code,
   there may be a tendency to rely on internal measurements to some
   extent, especially when the externally observable phenomena only
   support an inference of internal events (such as routing protocol
   convergence).  However, external observations remain essential as
   the basis for benchmarks.  Internal observations with a fixed
   specification and interpretation may be provided in parallel, for
   example to assist the development of operations procedures when the
   technology is deployed.  Internal metrics and measurements from Open
   Source implementations may be the only direct source of performance
   results in a desired dimension, but corroborating external
   observations are still required to assure that measurement
   discipline was maintained for all reported results.

   A related aspect of benchmark development arises when the scope
   includes multiple approaches to a common function under the same
   benchmark.  For example, there are many ways to arrange for
   activation of a network path between interface points, and the
   activation times can be compared if the start-to-stop activation
   interval has a generic and unambiguous definition.  Thus, generic
   benchmark definitions are preferred over technology- or protocol-
   specific definitions where possible.

4.3.  New Benchmarks and Related Metrics

   New classes of benchmarks will be needed for network design and to
   assist in developing operational practices (possibly automated
   management and orchestration at deployment scale).  Examples follow
   in the paragraphs below; many are prompted by the goals of increased
   elasticity and flexibility of the network functions, along with
   accelerated deployment times.

   Time to deploy VNFs:  In cases where the COTS hardware is already
   deployed and ready for service, it is valuable to know the response
   time when a management system is tasked with "standing up" hundreds
   of virtual machines and the VNFs they will host.

   Time to migrate VNFs:  In cases where a rack or shelf of hardware
   must be removed from active service, it is valuable to know the
   response time when a management system is tasked with "migrating"
   some number of virtual machines and the VNFs they currently host to
   alternate hardware that will remain in service.

   Time to create a virtual network in the COTS infrastructure:  This
   is a somewhat simplified version of existing benchmarks for
   convergence time, in that the process is initiated by a request from
   (centralized or distributed) control, rather than inferred from
   network events (such as link failure).  The successful response time
   would remain dependent on data-plane observations to confirm that
   the network is ready to perform.

   Also, it appears to be valuable to measure traditional packet
   transfer performance metrics during the assessment of traditional
   and new benchmarks, including metrics that may be used to support
   service engineering, such as the Spatial Composition metrics found
   in [RFC6049].  Examples include the mean one-way delay in
   Section 4.1 of [RFC6049], Packet Delay Variation (PDV) [RFC5481],
   and Packet Reordering [RFC4737] [RFC4689].
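   As a non-normative illustration of how two of the metrics cited
   above might be computed from a stream of one-way delay measurements,
   the Python sketch below (with hypothetical names, and lost packets
   marked as None) takes the mean over the finite delay samples, in the
   spirit of the mean one-way delay of [RFC6049], and derives PDV
   samples relative to the minimum delay of the stream, following the
   convention described in [RFC5481]:

      # Illustrative only: mean one-way delay and PDV from delay samples.
      def mean_one_way_delay(delays):
          finite = [d for d in delays if d is not None]   # drop lost packets
          return sum(finite) / len(finite)

      def pdv_samples(delays):
          finite = [d for d in delays if d is not None]
          d_min = min(finite)
          return [d - d_min for d in finite]   # variation vs. minimum delay

      def pdv_percentile(delays, p=99.9):
          samples = sorted(pdv_samples(delays))
          index = min(int(len(samples) * p / 100.0), len(samples) - 1)
          return samples[index]

      delays = [0.0102, 0.0101, None, 0.0115, 0.0103]   # seconds; None = lost
      print(mean_one_way_delay(delays), pdv_percentile(delays, 99.9))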
4.4.  Assessment of Benchmark Coverage

   It can be useful to organize benchmarks according to their
   applicable lifecycle stage and the performance criteria they intend
   to assess.  The table below provides a way to organize benchmarks
   such that there is a clear indication of coverage for each
   intersection of lifecycle stage and performance criterion.

   |---------------+-----------+------------+---------------|
   |               |   SPEED   |  ACCURACY  |  RELIABILITY  |
   |---------------+-----------+------------+---------------|
   | Activation    |           |            |               |
   |---------------+-----------+------------+---------------|
   | Operation     |           |            |               |
   |---------------+-----------+------------+---------------|
   | De-activation |           |            |               |
   |---------------+-----------+------------+---------------|

   For example, the "Time to deploy VNFs" benchmark described above
   would be placed at the intersection of Activation and Speed, making
   it clear that there are other potential performance criteria to
   benchmark, such as the "percentage of unsuccessful VM/VNF stand-ups"
   in a set of 100 attempts.  This example emphasizes that the
   Activation and De-activation lifecycle stages are key areas for NFV
   and related infrastructure, and it encourages expansion beyond
   traditional benchmarks for normal operation.  Thus, reviewing
   benchmark coverage using this table (sometimes called the 3x3
   matrix) can be a worthwhile exercise in BMWG.

   In one of the first applications of the 3x3 matrix in BMWG, it was
   discovered that metrics for measured size, capacity, or scale do not
   easily match one of the three columns above.  There are three
   possibilities to resolve this:

   o  Add a column, Scalability; but then metrics would be expected in
      most of the Activation, Operation, and De-activation stages
      (which may not be the case).

   o  Include scalability under Reliability: this fits the user
      perspective of the 3x3 matrix, because the size or capacity of a
      device contributes to the likelihood that a request will be
      blocked, or that operation will be unreliable in an overload
      state.

   o  Keep size, capacity, and scale metrics separate from the 3x3
      matrix.

   After some discussion, including with some of the original
   developers of the 3x3 matrix, it is suggested to keep capacity
   metrics separate from the 3x3 matrix and to list them separately.
   This approach encourages use of the 3x3 matrix to organize reports
   of results, where the capacity at which the various metrics were
   measured would be included in the title of the matrix (and results
   for multiple capacities would appear in separate 3x3 matrices, if
   there were sufficient measurements/results to organize in that way).
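   As a non-normative sketch of this reporting approach (in Python,
   with hypothetical benchmark names and values), results could be
   organized as cells of a matrix whose title carries the capacity at
   which they were measured, keeping size/capacity/scale metrics out of
   the cells themselves:

      # Illustrative only: a 3x3 matrix report keyed by lifecycle stage
      # and performance criterion; the measurement capacity appears in
      # the title, per the approach suggested above.
      STAGES = ("Activation", "Operation", "De-activation")
      CRITERIA = ("Speed", "Accuracy", "Reliability")

      def empty_matrix(title):
          return {"title": title,
                  "cells": {s: {c: [] for c in CRITERIA} for s in STAGES}}

      report = empty_matrix("Hypothetical VNF, measured at 100 VM instances")
      report["cells"]["Activation"]["Speed"].append(
          {"benchmark": "Time to deploy VNFs", "seconds": 42.0})
      report["cells"]["Activation"]["Reliability"].append(
          {"benchmark": "Unsuccessful VM/VNF stand-ups", "percent": 2.0})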
5.  Security Considerations

   Benchmarking activities as described in this memo are limited to
   technology characterization of a Device Under Test/System Under Test
   (DUT/SUT) using controlled stimuli in a laboratory environment, with
   dedicated address space and the constraints specified in the
   sections above.

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network, or misroute traffic to the test
   management network.

   Further, benchmarking is performed on a "black-box" basis, relying
   solely on measurements observable external to the DUT/SUT.

   Special capabilities SHOULD NOT exist in the DUT/SUT specifically
   for benchmarking purposes.  Any implications for network security
   arising from the DUT/SUT SHOULD be identical in the lab and in
   production networks.

6.  IANA Considerations

   No IANA action is requested at this time.

7.  Acknowledgements

   The author acknowledges an encouraging conversation on this topic
   with Mukhtiar Shaikh and Ramki Krishnan in November 2013.  Bhavani
   Parise and Ilya Varlashkin have provided useful suggestions to
   expand these considerations.  Bhuvaneswaran Vengainathan has already
   tried the 3x3 matrix with the SDN controller draft and has
   contributed to many discussions.  Scott Bradner quickly pointed out
   shared-resource dependencies in an early vSwitch measurement
   proposal, and the topic was included here as a key consideration.

8.  References

8.1.  Normative References

   [NFV.PER001]
              "Network Function Virtualization: Performance and
              Portability Best Practices", ETSI Group Specification GS
              NFV-PER 001 V1.1.1, June 2014.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330,
              May 1998.

   [RFC2544]  Bradner, S. and J. McQuaid, "Benchmarking Methodology for
              Network Interconnect Devices", RFC 2544, March 1999.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC2680]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Packet Loss Metric for IPPM", RFC 2680, September 1999.

   [RFC2681]  Almes, G., Kalidindi, S., and M. Zekauskas, "A Round-trip
              Delay Metric for IPPM", RFC 2681, September 1999.

   [RFC3393]  Demichelis, C. and P. Chimento, "IP Packet Delay
              Variation Metric for IP Performance Metrics (IPPM)",
              RFC 3393, November 2002.

   [RFC3432]  Raisanen, V., Grotefeld, G., and A. Morton, "Network
              performance measurement with periodic streams", RFC 3432,
              November 2002.

   [RFC4689]  Poretsky, S., Perser, J., Erramilli, S., and S. Khurana,
              "Terminology for Benchmarking Network-layer Traffic
              Control Mechanisms", RFC 4689, October 2006.

   [RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov,
              S., and J. Perser, "Packet Reordering Metrics", RFC 4737,
              November 2006.

   [RFC5357]  Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
              Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
              RFC 5357, October 2008.

   [RFC5905]  Mills, D., Martin, J., Burbank, J., and W. Kasch,
              "Network Time Protocol Version 4: Protocol and Algorithms
              Specification", RFC 5905, June 2010.

8.2.  Informative References

   [RFC1242]  Bradner, S., "Benchmarking terminology for network
              interconnection devices", RFC 1242, July 1991.

   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
              Applicability Statement", RFC 5481, March 2009.

   [RFC6049]  Morton, A. and E. Stephan, "Spatial Composition of
              Metrics", RFC 6049, January 2011.

   [RFC6248]  Morton, A., "RFC 4148 and the IP Performance Metrics
              (IPPM) Registry of Metrics Are Obsolete", RFC 6248,
              April 2011.

   [RFC6390]  Clark, A. and B. Claise, "Guidelines for Considering New
              Performance Metric Development", BCP 170, RFC 6390,
              October 2011.

Author's Address

   Al Morton
   AT&T Labs
   200 Laurel Avenue South
   Middletown, NJ 07748
   USA

   Phone: +1 732 420 1571
   Fax:   +1 732 368 1192
   Email: acmorton@att.com
   URI:   http://home.comcast.net/~acmacm/