BMWG                                                          S. Kommu
Internet Draft                                                  VMware
Intended status: Informational                               B. Basler
Expires: January 2017                                           VMware
                                                               J. Rapp
                                                                VMware
                                                          July 8, 2016

    Considerations for Benchmarking Network Virtualization Platforms
                         draft-bmwg-nvp-00.txt

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

This Internet-Draft will expire on January 8, 2017.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Abstract

Current network benchmarking methodologies are focused on physical networking components and do not consider the actual application-layer traffic patterns, and hence do not reflect the traffic that virtual networking components work with. The purpose of this document is to distinguish and highlight benchmarking considerations when testing and evaluating virtual networking components in the data center.

Table of Contents

   1. Introduction...................................................2
   2. Conventions used in this document..............................3
   3. Definitions....................................................3
      3.1. System Under Test.........................................3
      3.2. Network Virtualization Platform...........................4
      3.3. Micro-services............................................5
   4. Scope..........................................................5
      4.1. Virtual Networking for Datacenter Applications............6
      4.2. Interaction with Physical Devices.........................6
   5. Interaction with Physical Devices..............................6
      5.1. Server Architecture Considerations........................9
   6. Security Considerations.......................................11
   7. IANA Considerations...........................................12
   8. Conclusions...................................................12
   9. References....................................................12
      9.1. Normative References.....................................12
      9.2. Informative References...................................12
   Appendix A. Partial List of Parameters to Document...............13

1. Introduction

Datacenter virtualization that includes both compute and network virtualization is growing rapidly as the industry continues to look for ways to improve productivity and flexibility while at the same time cutting costs. Network virtualization is comparatively new and is expected to grow tremendously, similar to compute virtualization.
There are multiple vendors and solutions out in the market, each with their own benchmarks to showcase why a particular solution is better than another. Hence, there is a need for a vendor- and product-agnostic way to benchmark multivendor solutions, to help with comparison and to make informed decisions when it comes to selecting the right network virtualization solution.

Applications have traditionally been segmented using VLANs and ACLs between the VLANs. This model does not scale because of the 4K scale limitation of VLANs. Overlays such as VXLAN were designed to address the limitations of VLANs.

With VXLAN, applications are segmented based on the VXLAN encapsulation (specifically the VNI field in the VXLAN header), which is similar to the VLAN ID in the 802.1Q VLAN tag, but without the 4K scale limitation of VLANs. For a more detailed discussion on this subject please refer to RFC 7364 [RFC7364], "Problem Statement: Overlays for Network Virtualization".

VXLAN is just one of several Network Virtualization Overlays (NVO). Some of the others include STT, Geneve and NVGRE. STT and Geneve have expanded on the capabilities of VXLAN. Please refer to the IETF's NVO3 working group <https://datatracker.ietf.org/wg/nvo3/documents/> for more information.

Modern application architectures, such as Micro-services, are going beyond the three-tier application models such as web, app and db. Benchmarks MUST consider whether the proposed solution is able to scale up to the demands of such applications and not just a three-tier architecture.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

In this document, these words will appear with that interpretation only when in ALL CAPS.
Lower case uses of these words are not to be interpreted as carrying the significance described in RFC 2119.

3. Definitions

3.1. System Under Test (SUT)

Traditional hardware-based networking devices generally use the device under test (DUT) model of testing. In this model, apart from any allowed configuration, the DUT is a black box from a testing perspective. This method works for hardware-based networking devices since the device itself is not influenced by any components outside the DUT.

Virtual networking components cannot leverage the DUT model of testing, as the DUT is not just the virtual device but also includes the hardware components that host the virtual device. Hence, the SUT model MUST be used instead of the traditional DUT model.

With the SUT model, the virtual networking component, along with all software and hardware components that host the virtual networking component, MUST be considered as part of the SUT.

Virtual networking components may also work with higher-level TCP segments, such as the large segments handled by TCP Segmentation Offload (TSO). In contrast, all physical switches and routers, including the ones that act as initiators for NVOs, work with L2/L3 packets.

Please refer to Section 5, Figure 1 for a visual representation of the System Under Test in the case of intra-host testing, and Section 5, Figure 2 for the System Under Test in the case of inter-host testing.

3.2. Network Virtualization Platform

This document does not focus on Network Function Virtualization.

Network Function Virtualization (NFV) focuses on being independent of networking hardware while providing the same functionality. In the case of NFV, traditional benchmarking methodologies recommended by the IETF may be used. The IETF document "Considerations for Benchmarking Virtual Network Functions and Their Infrastructure" [1] addresses benchmarking NFV.
Network Virtualization Platforms, apart from providing hardware-agnostic network functions, also leverage performance optimizations provided by the TCP stacks of hypervisors.

Network Virtualization Platforms are architected differently when compared to NFV and are not limited by the packet size constraints, via MTU, that exist for both NFV and hardware-based network platforms.

NVPs leverage TCP stack optimizations such as TSO that enable NVPs to work with much larger payloads, up to 64K, unlike their NFV counterparts.

Because of the difference in the payload, and thus the overall segment sizes, normal benchmarking methods are not relevant to NVPs.

Instead, newer methods that take into account the built-in advantages of TCP-provided optimizations MUST be used for testing Network Virtualization Platforms.

3.3. Micro-services

Traditional monolithic application architectures, such as the three-tier web, app and db architecture, are hitting scale and deployment limits for modern use cases.

Micro-services follow the classic Unix style of small applications, each with a single responsibility.

These small applications are designed with the following characteristics:

   . Each application does only one thing - like Unix tools

   . Small enough that it could be rewritten instead of maintained

   . Embedded with a simple web container

   . Packaged as a single executable

   . Installed as a daemon

   . Each of these applications is completely separate

   . Interact via a uniform interface

   . REST (over HTTP/HTTPS) being the most common

With the Micro-services architecture, a single web app of the three-tier application model could now have 100s of smaller applications, each dedicated to just one job.
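To illustrate how small such a single-responsibility service can be, the sketch below implements a hypothetical "inventory" service with only the Python standard library. The service name, REST path and backing data are assumptions made for this example; they are not taken from this document.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical backing data for the example service.
INVENTORY = {"sku-1": 12, "sku-2": 0}

def handle_get(path):
    """The service's one job: answer GET /inventory/<sku> with JSON."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "inventory" and parts[1] in INVENTORY:
        return 200, json.dumps({"sku": parts[1],
                                "quantity": INVENTORY[parts[1]]})
    return 404, json.dumps({"error": "not found"})

class Handler(BaseHTTPRequestHandler):
    """A simple embedded web container exposing the uniform REST interface."""
    def do_GET(self):
        status, body = handle_get(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body.encode())

# Packaged as a single executable and installed as a daemon, the whole
# service amounts to one line:
#
#   HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```

A deployment of hundreds of such services, each on its own segment, is what pushes the overlay scale requirements discussed below.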
These 100s of small, single-responsibility services MUST be secured into their own segments - hence pushing the scale boundaries of the overlay, both from a simple segmentation perspective and from a security perspective.

4. Scope

This document does not address Network Function Virtualization, which has been covered already by previous IETF documents (https://datatracker.ietf.org/doc/draft-ietf-bmwg-virtual-net/?include_text=1); the focus of this document is the Network Virtualization Platform, where the network functions are an intrinsic part of the hypervisor's TCP stack, working closer to the application layer and leveraging performance optimizations such as TSO/RSS provided by the TCP stack and the underlying hardware.

4.1. Virtual Networking for Datacenter Applications

While virtualization is growing beyond the datacenter, this document focuses on virtual networking for east-west traffic within datacenter applications only. For example, in a three-tier application such as web, app and db, this document focuses on the east-west traffic between web and app. It does not address north-south web traffic accessed from outside the datacenter. A future document would address north-south traffic flows.

This document addresses scale requirements for modern application architectures such as Micro-services: benchmarks MUST consider whether the proposed solution is able to scale up to the demands of micro-services application models, which have 100s of small services communicating on standard ports such as http/https using protocols such as REST.

4.2. Interaction with Physical Devices

Virtual network components cannot be tested independent of other components within the system.
For example, unlike a physical router or a firewall, where the tests can be focused solely on the device, when testing a virtual router or firewall, multiple other devices may become part of the system under test. Hence, the characteristics of these other traditional networking components - switches, routers, LB, FW etc. - MUST be considered:

   . Hashing method used

   . Over-subscription rate

   . Throughput available

   . Latency characteristics

5. Interaction with Physical Devices

In virtual environments, the System Under Test (SUT) may often share resources and reside on the same physical hardware with other components involved in the tests. Hence, the SUT MUST be clearly defined. In these tests, a single hypervisor may host multiple servers, switches, routers, firewalls etc.

Intra-host testing: Intra-host testing helps in reducing the number of components involved in a test. For example, intra-host testing would help focus on the System Under Test - the logical switch and the hardware that is running the hypervisor that hosts the logical switch - and eliminate other components. Because of the nature of virtual infrastructures, with multiple elements being hosted on the same physical infrastructure, influence from other components cannot be completely ruled out. For example, unlike in physical infrastructures, logical routing or the distributed firewall MUST NOT be benchmarked independent of logical switching. The System Under Test definition MUST include all components involved with that particular test.
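One way to make the SUT definition explicit and repeatable is to record it as a machine-readable manifest published alongside the results. The sketch below is illustrative only; the field names and example values are assumptions for this example, not prescribed by this document.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class PhysicalDevice:
    # The characteristics listed above, recorded per physical device.
    role: str              # e.g. "ToR switch" (placeholder)
    hashing_method: str    # e.g. "L3/L4 5-tuple ECMP" (placeholder)
    oversubscription: str  # e.g. "3:1" (placeholder)
    throughput_gbps: float
    latency_usec: float

@dataclass
class SUTManifest:
    test_name: str
    hypervisor_build: str
    virtual_components: list = field(default_factory=list)
    # Empty for intra-host tests; lists the fabric for inter-host tests.
    physical_devices: list = field(default_factory=list)

    def to_json(self):
        return json.dumps(asdict(self), indent=2)

# Intra-host example: no physical fabric inside the SUT boundary.
intra_host = SUTManifest(
    test_name="logical switch throughput, intra-host",
    hypervisor_build="example-hv-1234",  # placeholder value
    virtual_components=["logical switch", "VM1", "VM2"],
)
print(intra_host.to_json())
```

For an inter-host test, the same manifest would additionally carry one `PhysicalDevice` entry per fabric element, keeping the SUT boundary unambiguous across test runs.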
   +---------------------------------------------------+
   |                 System Under Test                 |
   | +-----------------------------------------------+ |
   | |                  Hypervisor                   | |
   | |                                               | |
   | |                +-------------+                | |
   | |                |     NVP     |                | |
   | | +-----+        |   Switch/   |        +-----+ | |
   | | | VM1 |<------>|   Router/   |<------>| VM2 | | |
   | | +-----+   VW   |  Firewall/  |   VW   +-----+ | |
   | |                |    etc.,    |                | |
   | |                +-------------+                | |
   | +-----------------------------------------------+ |
   +---------------------------------------------------+

   Legend
   VM: Virtual Machine
   VW: Virtual Wire

            Figure 1 Intra-Host System Under Test

Inter-host testing: Inter-host testing helps in profiling the underlying network interconnect performance. For example, when testing logical switching, inter-host testing would not only test the logical switch component but also any other devices that are part of the physical data center fabric that connects the two hypervisors. The System Under Test MUST be well defined to help with repeatability of tests. The System Under Test definition, in the case of inter-host testing, MUST include all components, including the underlying network fabric.
Figure 2 is a visual representation of the system under test for inter-host testing.

   +---------------------------------------------------+
   |                 System Under Test                 |
   | +-----------------------------------------------+ |
   | |                  Hypervisor                   | |
   | |                +-------------+                | |
   | |                |     NVP     |                | |
   | | +-----+        |   Switch/   |        +-----+ | |
   | | | VM1 |<------>|   Router/   |<------>| VM2 | | |
   | | +-----+   VW   |  Firewall/  |   VW   +-----+ | |
   | |                |    etc.,    |                | |
   | |                +-------------+                | |
   | +-----------------------------------------------+ |
   |                         ^                         |
   |                         | Network Cabling         |
   |                         v                         |
   | +-----------------------------------------------+ |
   | |        Physical Networking Components         | |
   | |      switches, routers, firewalls etc.,       | |
   | +-----------------------------------------------+ |
   |                         ^                         |
   |                         | Network Cabling         |
   |                         v                         |
   | +-----------------------------------------------+ |
   | |                  Hypervisor                   | |
   | |                +-------------+                | |
   | |                |     NVP     |                | |
   | | +-----+        |   Switch/   |        +-----+ | |
   | | | VM1 |<------>|   Router/   |<------>| VM2 | | |
   | | +-----+   VW   |  Firewall/  |   VW   +-----+ | |
   | |                |    etc.,    |                | |
   | |                +-------------+                | |
   | +-----------------------------------------------+ |
   +---------------------------------------------------+

   Legend
   VM: Virtual Machine
   VW: Virtual Wire

            Figure 2 Inter-Host System Under Test

Virtual components have a direct dependency on the physical infrastructure that is hosting these resources. Hardware characteristics of the physical host impact the performance of the virtual components. The components that are being tested, and the impact of the other hardware components within the hypervisor on the performance of the SUT, MUST be documented. Virtual component performance is influenced by the physical hardware components within the hypervisor. Access to various offloads, such as TCP segmentation offload, may have a significant impact on performance.
Firmware and driver differences may also significantly impact results, based on whether the specific driver leverages any hardware-level offloads offered. Hence, all physical components of the physical server running the hypervisor that hosts the virtual components MUST be documented, along with the firmware and driver versions of all the components used, to help ensure repeatability of test results. For example, the BIOS configuration of the server MUST be documented, as some of those settings are designed to improve performance. Please refer to Appendix A for a partial list of parameters to document.

5.1. Server Architecture Considerations

When testing physical networking components, the approach taken is to consider the device as a black box. With virtual infrastructure, this approach no longer helps, as the virtual networking components are an intrinsic part of the hypervisor they are running on and are directly impacted by the server architecture used. Server hardware components define the capabilities of the virtual networking components. Hence, the server architecture MUST be documented in detail to help with repeatability of tests, and the entire set of hardware and software components becomes the SUT.

5.1.1. Frame format/sizes within the Hypervisor

The Maximum Transmission Unit (MTU) limits the frame sizes of physical network components. The most common maximum supported MTU for physical devices is 9000 bytes, although 1500 bytes is the standard MTU. Physical network testing and NFV use these MTU sizes for testing. However, the virtual networking components that live inside a hypervisor may work with much larger segments because of the availability of hardware- and software-based offloads. Hence, testing based on the normal, smaller packet sizes is not relevant for performance testing of virtual networking components.
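To see why small-packet benchmarks misrepresent an NVP, consider how many wire frames a single TSO send becomes once the NIC segments it. The arithmetic below is a sketch assuming IPv4 and TCP with no options (40 bytes of headers per frame) and ignoring Ethernet framing overhead.

```python
import math

def wire_frames(payload_bytes, mtu, l3l4_headers=40):
    """Frames emitted when the NIC segments one large TSO send.

    Assumes IPv4 + TCP with no options (40 bytes of headers per frame);
    Ethernet preamble/header/FCS overhead is ignored for simplicity.
    """
    mss = mtu - l3l4_headers          # TCP payload carried per wire frame
    return math.ceil(payload_bytes / mss)

tso_send = 64 * 1024                  # one 64K segment handed to the NIC

print(wire_frames(tso_send, mtu=1500))  # 45 frames at the standard MTU
print(wire_frames(tso_send, mtu=9000))  # 8 frames with jumbo frames
```

The hypervisor-resident NVP handles one 64K unit where the physical fabric sees tens of frames, which is why a packets/sec view of the virtual component is misleading.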
All the TCP-related configuration, such as the TSO size and the number of RSS queues, MUST be documented, along with any other physical NIC-related configuration.

Virtual network components work closer to the application layer than the physical networking components do. Hence, virtual network components work with segments of a type and size that are often not the same as those the physical network works with. Hence, testing virtual network components MUST be done with application-layer segments instead of physical-network-layer packets.

5.1.2. Baseline testing with Logical Switch

The logical switch is often an intrinsic component of the test system, along with any other hardware and software components used for testing. Also, other logical components cannot be tested independent of the logical switch.

5.1.3. Tunnel encap/decap outside the hypervisor

Logical network components may also have a performance impact based on the functionality available within the physical fabric. A physical fabric that supports NVO encap/decap is one such case that has considerable impact on performance. Any such functionality that exists in the physical fabric MUST be part of the test result documentation to ensure repeatability of tests. In this case, the SUT MUST include the physical fabric.

5.1.4. SUT Hypervisor Profile

Physical networking equipment has well-defined physical resource characteristics, such as the type and number of ASICs/SoCs used, the amount of memory, and the type and number of processors. Virtual networking components' performance is dependent on the physical hardware that hosts the hypervisor. Hence, the physical hardware usage, which is part of the SUT, for a given test MUST be documented - for example, CPU usage when running a logical router.

CPU usage changes based on the type of hardware available within the physical server.
For example, TCP Segmentation Offload greatly reduces CPU usage by offloading the segmentation process to the NIC on the sender side. Receive Side Scaling offers a similar benefit on the receive side. Hence, the availability and status of such hardware MUST be documented, along with the actual CPU/memory usage when the virtual networking components have access to such offload-capable hardware.

Following is a partial list of components that MUST be documented - both in terms of what is available and what is used by the SUT:

   . CPU - type, speed, available instruction sets (e.g. AES-NI)

   . Memory - type, amount

   . Storage - type, amount

   . NIC cards - type, number of ports, offloads available/used, drivers, firmware (if applicable), HW revision

   . Libraries such as DPDK, if available and used

   . Number and type of VMs used for testing, and for each VM:

      o vCPUs

      o RAM

      o Storage

      o Network driver

      o Any prioritization of VM resources

      o Operating system type, version and kernel, if applicable

      o TCP configuration changes, if any

      o MTU

   . Test tool

      o Workload type

      o Protocol being tested

      o Number of threads

      o Version of tool

   . For inter-hypervisor tests,

      o Physical network devices that are part of the test

   . Note: For inter-hypervisor tests, the system under test is no longer only the virtual component that is being tested; the entire fabric that connects the virtual components becomes part of the system under test.

6. Security Considerations

Benchmarking activities as described in this memo are limited to technology characterization of a Device Under Test/System Under Test (DUT/SUT) using controlled stimuli in a laboratory environment, with dedicated address space and the constraints specified in the sections above.
The benchmarking network topology will be an independent test setup and MUST NOT be connected to devices that may forward the test traffic into a production network, or misroute traffic to the test management network.

Further, benchmarking is performed on a "black-box" basis, relying solely on measurements observable external to the DUT/SUT.

Special capabilities SHOULD NOT exist in the DUT/SUT specifically for benchmarking purposes. Any implications for network security arising from the DUT/SUT SHOULD be identical in the lab and in production networks.

7. IANA Considerations

No IANA action is requested at this time.

8. Conclusions

Network Virtualization Platforms, because of their proximity to the application layer and because they can take advantage of TCP stack optimizations, do not function on a packets/sec basis. Hence, traditional benchmarking methods, while still relevant for Network Function Virtualization, are not designed to test Network Virtualization Platforms. Also, advances in application architectures, such as micro-services, bring new challenges and require benchmarking not just of throughput and latency but also of scale. New benchmarking methods that are designed to take advantage of the TCP optimizations are needed to accurately benchmark the performance of Network Virtualization Platforms.

9. References

9.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC7364] Narten, T., Gray, E., Black, D., Fang, L., Kreeger, L.,
             and M. Napierala, "Problem Statement: Overlays for
             Network Virtualization", RFC 7364, October 2014,
             <https://datatracker.ietf.org/doc/rfc7364/>.

   [nv03]    IETF Network Virtualization Overlays (nvo3) Working
             Group, <https://datatracker.ietf.org/wg/nvo3/documents/>.

9.2. Informative References

   [1]       Morton, A., "Considerations for Benchmarking Virtual
             Network Functions and Their Infrastructure",
             draft-ietf-bmwg-virtual-net-03,
             <https://datatracker.ietf.org/doc/draft-ietf-bmwg-virtual-net/?include_text=1>.

Appendix A. Partial List of Parameters to Document

A.1. CPU

   CPU Vendor
   CPU Number
   CPU Architecture
   # of Sockets (CPUs)
   # of Cores
   Clock Speed (GHz)
   Max Turbo Freq. (GHz)
   Cache per CPU (MB)
   # of Memory Channels
   Chipset
   Hyperthreading (BIOS Setting)
   Power Management (BIOS Setting)
   VT-d

A.2. Memory

   Memory Speed (MHz)
   DIMM Capacity (GB)
   # of DIMMs
   DIMM configuration
   Total DRAM (GB)

A.3. NIC

   Vendor
   Model
   Port Speed (Gbps)
   Ports
   PCIe Version
   PCIe Lanes
   Bonded
   Bonding Driver
   Kernel Module Name
   Driver Version
   VXLAN TSO Capable
   VXLAN RSS Capable
   Ring Buffer Size RX
   Ring Buffer Size TX

A.4. Hypervisor

   Hypervisor Name
   Version/Build
   Based on
   Hotfixes/Patches
   OVS Version/Build
   IRQ balancing
   vCPUs per VM
   Modifications to HV
   Modifications to HV TCP stack
   Number of VMs
   IP MTU
   Flow control TX (send pause)
   Flow control RX (honor pause)
   Encapsulation Type

A.5. Guest VM

   Guest OS & Version
   Modifications to VM
   IP MTU Guest VM (Bytes)
   Test tool used
   Number of NetPerf Instances
   Total Number of Streams
   Guest RAM (GB)

A.6. Overlay Network Physical Fabric

   Vendor
   Model
   # and Type of Ports
   Software Release
   Interface Configuration
   Interface/Ethernet MTU (Bytes)
   Flow control TX (send pause)
   Flow control RX (honor pause)

A.7. Gateway Network Physical Fabric

   Vendor
   Model
   # and Type of Ports
   Software Release
   Interface Configuration
   Interface/Ethernet MTU (Bytes)
   Flow control TX (send pause)
   Flow control RX (honor pause)

Authors' Addresses

   Samuel Kommu
   VMware
   3401 Hillview Ave
   Palo Alto, CA, 94304

   Email: skommu@vmware.com

   Benjamin Basler
   VMware
   3401 Hillview Ave
   Palo Alto, CA, 94304

   Email: bbasler@vmware.com

   Jacob Rapp
   VMware
   3401 Hillview Ave
   Palo Alto, CA, 94304

   Email: jrapp@vmware.com