INTERNET-DRAFT

BMWG                                                           S. Kommu
Internet-Draft                                                   VMware
Intended status: Informational                                  J. Rapp
Expires: Sep 2019                                                VMware
                                                           Mar 11, 2019

     Considerations for Benchmarking Network Virtualization Platforms
                       draft-skommu-bmwg-nvp-03.txt

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.
   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on September 11, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Abstract

   Current network benchmarking methodologies are focused on physical
   networking components and do not consider the actual application
   layer traffic patterns, and hence do not reflect the traffic that
   virtual networking components work with when using network
   virtualization overlays (NVO3).  The purpose of this document is to
   distinguish and highlight benchmarking considerations when testing
   and evaluating virtual networking components in the data center.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Definitions
      3.1. System Under Test (SUT)
      3.2. Network Virtualization Platform
      3.3. Microservices
   4. Scope
      4.1.1. Scenario 1
      4.1.2. Scenario 2
      4.1.3. Learning
      4.1.4. Flow Optimization
      4.1.5. Out of scope
      4.2. Virtual Networking for Datacenter Applications
      4.3. Interaction with Physical Devices
   5. NVP Benchmarking Considerations
      5.1. Learning
      5.2. Traffic Flow Optimizations
         5.2.1. Fast Path
         5.2.2. Dedicated cores / Co-processors
         5.2.3. Prioritizing and de-prioritizing active flows
      5.3. Server Architecture Considerations
         5.3.1. NVE Component considerations
         5.3.2. Frame format/sizes within the Hypervisor
         5.3.3. Baseline testing with Logical Switch
         5.3.4. Repeatability
         5.3.5. Tunnel encap/decap outside the Hypervisor
         5.3.6. SUT Hypervisor Profile
      5.4. Benchmarking Tools Considerations
         5.4.1. Considerations for NVE
         5.4.2. Considerations for Split-NVE
   6. Control Plane Scale Considerations
      6.1.1. VM Events
      6.1.2. Scale
      6.1.3. Control Plane Performance at Scale
   7. Security Considerations
   8. IANA Considerations
   9. Conclusions
   10. References
      10.1. Normative References
      10.2. Informative References
   11. Acknowledgments
   Appendix A. Partial List of Parameters to Document
      A.1. CPU
      A.2. Memory
      A.3. NIC
      A.4. Hypervisor
      A.5. Guest VM
      A.6. Overlay Network Physical Fabric
      A.7. Gateway Network Physical Fabric
      A.8. Metrics

1. Introduction

   Datacenter virtualization that includes both compute and network
   virtualization is growing rapidly as the industry continues to look
   for ways to improve productivity and flexibility while at the same
   time cutting costs.  Network virtualization is comparatively new and
   is expected to grow as rapidly as compute virtualization has.
   Multiple vendors and solutions are already in the market.
   Each vendor often has its own recommendations on how to benchmark its
   solutions, making it difficult to perform an apples-to-apples
   comparison between different solutions.  Hence the need for a
   vendor-, product- and cloud-agnostic way to benchmark network
   virtualization solutions, to help with comparisons and with making
   informed decisions when selecting a network virtualization solution.

   Applications have traditionally been segmented using VLANs, with ACLs
   between the VLANs.  This model does not scale because of the 4K limit
   on the number of VLANs.  Overlays such as VXLAN were designed to
   address this limitation.

   With VXLAN, applications are segmented based on the VXLAN
   encapsulation (specifically the VNI field in the VXLAN header), which
   is similar to the VLAN ID in the 802.1Q VLAN tag, but without the 4K
   scale limitation of VLANs.  For a more detailed discussion of this
   subject, please refer to RFC 7364, "Problem Statement: Overlays for
   Network Virtualization".

   VXLAN is just one of several Network Virtualization Overlays (NVO).
   Others include STT, Geneve and NVGRE.  STT and Geneve have expanded
   on the capabilities of VXLAN.  Please refer to the IETF NVO3 working
   group <https://datatracker.ietf.org/wg/nvo3/documents/> for more
   information.

   Modern application architectures such as microservices, because of
   the IP-based connectivity within the application, place higher
   demands on networking and security than traditional three-tier
   application models such as web, app and db.  Benchmarks MUST
   consider whether the proposed solution is able to scale up to the
   demands of such applications, and not just of a three-tier
   architecture.

   The benchmarks will utilize the terminology and definitions of the
   NVO3 working group, including RFC 8014 and RFC 8394.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3. Definitions

3.1. System Under Test (SUT)

   Traditional hardware-based networking devices generally use the
   device under test (DUT) model of testing.  In this model, apart from
   any allowed configuration, the DUT is a black box from a testing
   perspective.  This method works for hardware-based networking
   devices since the device itself is not influenced by any components
   outside the DUT.

   Virtual networking components cannot leverage the DUT model of
   testing, as the DUT is not just the virtual device but also includes
   the hardware components used to host the virtual device.

   Hence, the System Under Test (SUT) model MUST be used instead of the
   traditional device under test model.

   With the SUT model, the virtual networking component, along with all
   software and hardware components that host the virtual networking
   component, MUST be considered part of the SUT.

   Virtual networking components, because of their dependency on the
   underlying hardware and other software components, may end up
   leveraging NIC offload benefits such as TCP Segmentation Offload
   (TSO), Large Receive Offload (LRO) and Rx/Tx filters.  Such
   underlying hardware- and software-level features, even though they
   may not be part of the virtual networking stack itself, MUST be
   considered and documented.  Note: physical switches and routers,
   including the ones that act as initiators for NVOs, work with L2/L3
   packets and may not be able to leverage TCP enhancements such as
   TSO.
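   As a hedged illustration of how such offload settings might be
   captured for SUT documentation, the sketch below parses the feature
   list produced by a command such as "ethtool -k <interface>" on
   Linux.  The sample output, interface name and field names are
   illustrative assumptions, not content defined by this draft:

```python
# Hypothetical sketch: record NIC offload settings for SUT documentation.
# Assumes Linux-style "ethtool -k" output; the sample below is illustrative.

SAMPLE_ETHTOOL_OUTPUT = """\
Features for eth0:
tcp-segmentation-offload: on
large-receive-offload: off
rx-checksumming: on [fixed]
tx-checksumming: on
"""

def parse_offloads(ethtool_output: str) -> dict:
    """Parse 'feature: on/off [fixed]' lines into a dict for test reports."""
    features = {}
    for line in ethtool_output.splitlines():
        if ":" not in line or line.endswith(":"):
            continue  # skip blank lines and the "Features for eth0:" header
        name, _, state = line.partition(":")
        features[name.strip()] = state.split()[0] == "on"
    return features

if __name__ == "__main__":
    offloads = parse_offloads(SAMPLE_ETHTOOL_OUTPUT)
    # e.g. attach the TSO/LRO state to the benchmark report
    print(offloads["tcp-segmentation-offload"])  # True
```

   In a real test run the output of the actual tool would be captured
   per interface and archived next to the results, so that a later run
   can confirm the offload state was identical.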
   Please refer to Section 5, Figure 2 for a visual representation of
   the System Under Test in the case of intra-host testing, and to
   Section 5, Figure 3 for the System Under Test in the case of
   inter-host testing.

3.2. Network Virtualization Platform

   This document focuses on the Network Virtualization Overlay platform
   as outlined in RFC 8014 and on use cases from RFC 8394.

   Network Virtualization Platforms function closer to the application
   layer and are able to work not only with L2/L3 packets but also with
   segments that leverage TCP optimizations such as Large Segment
   Offload (LSO).

   NVPs leverage TCP stack optimizations such as TCP Segmentation
   Offload (TSO) and Large Receive Offload (LRO), which enable NVPs to
   work with much larger payloads, of up to 64K, unlike their NFV
   counterparts.

   This difference in payload size translates into one operation per
   64K of payload in an NVP versus roughly 40 operations for the same
   amount of payload in an NFV, which has to divide it into MTU-sized
   packets, and results in a considerable difference in performance
   between NFV and NVP.

   Please refer to Figure 1 for a pictorial representation of this
   primary difference between NVP and NFV for a 64K payload
   segment/packet on a network with the MTU set to 1500 bytes.

   Note: payload sizes in Figure 1 are approximate.

   NVP (1 segment)              NFV (40 packets)

   Segment 1                    Packet 1
   +-------------------------+  +-------------------------+
   | Headers                 |  | Headers                 |
   | +---------------------+ |  | +---------------------+ |
   | | Payload - up to 64K | |  | | Payload < 1500      | |
   | +---------------------+ |  | +---------------------+ |
   +-------------------------+  +-------------------------+

                                Packet 2
                                +-------------------------+
                                | Headers                 |
                                | +---------------------+ |
                                | | Payload < 1500      | |
                                | +---------------------+ |
                                +-------------------------+

                                            .
                                            .
                                            .

                                Packet 40
                                +-------------------------+
                                | Headers                 |
                                | +---------------------+ |
                                | | Payload < 1500      | |
                                | +---------------------+ |
                                +-------------------------+

                  Figure 1: Payload, NVP vs NFV

   Hence, normal benchmarking methods are not relevant for NVPs.
   Instead, newer methods that leverage TCP optimizations MUST be used
   for testing Network Virtualization Platforms.

3.3. Microservices

   Moving from traditional monolithic application architectures, such
   as the three-tier web, app and db architecture, to a microservices
   model opens the networking and security stacks up to new scale and
   performance related challenges.  At a high level, in a microservices
   model, a traditional monolithic app that may use a few IPs is broken
   down into hundreds of individual one-responsibility-only
   applications, each with its own connectivity and security
   requirements.  These hundreds of small one-responsibility-only
   microservices need their own IPs and must also be secured into their
   own segments, pushing the scale boundaries of the overlay from both
   a simple segmentation perspective and a security perspective.

   For more details regarding microservices, please refer to the
   Wikipedia article on microservices:
   https://en.wikipedia.org/wiki/Microservices

4. Scope

   The focus of this document is the Network Virtualization Platform in
   two separate scenarios, as outlined in RFC 8014 section 4, Network
   Virtualization Edge (NVE), and RFC 8394 section 1.1, Split-NVE, and
   the associated learning phase:

4.1.1. Scenario 1

   RFC 8014 Section 4.1, "NVE Co-located with server hypervisor": the
   entire NVE functionality will typically be implemented as part of
   the hypervisor and/or virtual switch on the server.

4.1.2. Scenario 2

   RFC 8394 Section 1.1, "Split-NVE: A type of NVE (Network
   Virtualization Edge) where the functionalities are split across an
   end device supporting virtualization and an external network
   device."

4.1.3. Learning

   Address learning rate is a key contributor to the overall
   performance of the SUT, especially in microservices type use cases
   where a large number of end-points are created and destroyed on
   demand.

4.1.4. Flow Optimization

   There are several flow optimization algorithms designed to help
   improve latency or throughput.  These optimizations MUST be
   documented.

4.1.5. Out of scope

   This document does not address Network Function Virtualization
   (NFV), which has already been covered by previous IETF documents
   (https://datatracker.ietf.org/doc/draft-ietf-bmwg-virtual-
   net/?include_text=1).

   Network Function Virtualization focuses on being independent of
   networking hardware while providing the same functionality.  In the
   case of NFV, traditional benchmarking methodologies recommended by
   the IETF may be used.  The IETF document "Considerations for
   Benchmarking Virtual Network Functions and Their Infrastructure"
   addresses benchmarking NFVs.

   Typical NFV implementations emulate, in software, the
   characteristics and features of physical switches.  They are similar
   to any physical L2/L3 switch from the perspective of packet size,
   which is typically enforced based on the maximum transmission unit
   used.

4.2. Virtual Networking for Datacenter Applications

   This document focuses on virtual networking for east-west traffic
   within an on-prem datacenter and/or cloud.  For example, in a three-
   tier app such as web, app and db, this document focuses on the east-
   west traffic between web and app.
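   As a back-of-envelope illustration of why the microservices model
   described earlier stresses segment and address scale far more than a
   three-tier model, consider the sketch below.  All figures in it are
   hypothetical assumptions for illustration, not measurements or
   requirements from this document:

```python
# Hypothetical back-of-envelope estimate contrasting the endpoint and
# segment scale of a three-tier app with a microservices decomposition.
# Service and replica counts below are illustrative assumptions only.

def endpoint_scale(services: int, replicas_per_service: int) -> dict:
    """Each service gets its own segment; each replica needs its own IP."""
    return {
        "segments": services,
        "endpoints": services * replicas_per_service,
    }

three_tier = endpoint_scale(services=3, replicas_per_service=4)
microservices = endpoint_scale(services=300, replicas_per_service=4)

print(three_tier["endpoints"])     # 12
print(microservices["endpoints"])  # 1200
```

   Even with identical replica counts, the microservices decomposition
   multiplies both the number of overlay segments and the number of
   addresses the SUT must learn, which is why learning rate appears as
   a first-class benchmark consideration in this document.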
   This document addresses the scale requirements of modern application
   architectures such as microservices, considering whether the
   proposed solution is able to scale up to the demands of
   microservices application models, which typically have hundreds of
   small services communicating on standard ports such as http/https,
   using protocols such as REST.

4.3. Interaction with Physical Devices

   Virtual network components MUST NOT be tested independent of other
   components within the system.  For example, unlike with a physical
   router or a firewall, where the tests can be focused solely on the
   device, when testing a virtual router or firewall multiple other
   devices may become part of the SUT.  Hence the characteristics of
   these other traditional networking switches and routers, load
   balancers, firewalls etc. MUST be considered:

   o  Hashing method used

   o  Over-subscription rate

   o  Throughput available

   o  Latency characteristics

5. NVP Benchmarking Considerations

   In virtual environments, the SUT may often share resources and
   reside on the same physical hardware as other components involved in
   the tests.  Hence the SUT MUST be clearly documented.  In these
   tests, a single hypervisor may host multiple servers, switches,
   routers, firewalls etc.

   Intra-host testing: Intra-host testing helps in reducing the number
   of components involved in a test.  For example, intra-host testing
   would help focus on the System Under Test (the logical switch and
   the hardware running the hypervisor that hosts the logical switch)
   and eliminate other components.  Because of the nature of virtual
   infrastructures, with multiple elements hosted on the same physical
   infrastructure, influence from other components cannot be completely
   ruled out.  For example, unlike in physical infrastructures, logical
   routing or a distributed firewall MUST NOT be benchmarked
   independent of logical switching.
   The System Under Test definition MUST include all components
   involved in that particular test.

   +---------------------------------------------------+
   | System Under Test                                 |
   | +-----------------------------------------------+ |
   | | Hyper-Visor                                   | |
   | |                                               | |
   | |                +-------------+                | |
   | |                |     NVP     |                | |
   | | +-----+        |   Switch/   |        +-----+ | |
   | | | VM1 |<------>|   Router/   |<------>| VM2 | | |
   | | +-----+   VW   |  Firewall/  |   VW   +-----+ | |
   | |                |    etc.     |                | |
   | |                +-------------+                | |
   | |                                               | |
   | | Legend                                        | |
   | |   VM: Virtual Machine                         | |
   | |   VW: Virtual Wire                            | |
   | +-----------------------------------------------+ |
   +---------------------------------------------------+

              Figure 2: Intra-Host System Under Test

   In the above figure, we only address the NVE co-located with the
   hypervisor.

   Inter-host testing: Inter-host testing helps in profiling the
   performance of the underlying network interconnect.  For example,
   when testing logical switching, inter-host testing would test not
   only the logical switch component but also any other devices that
   are part of the physical data center fabric connecting the two
   hypervisors.  The System Under Test MUST be well defined to help
   with repeatability of tests.  In the case of inter-host testing, the
   System Under Test definition MUST include all components, including
   the underlying network fabric.

   Figure 3 is a visual representation of the System Under Test for
   inter-host testing.
   +---------------------------------------------------+
   | System Under Test                                 |
   | +-----------------------------------------------+ |
   | | Hyper-Visor                                   | |
   | |                +-------------+                | |
   | |                |     NVP     |                | |
   | | +-----+        |   Switch/   |        +-----+ | |
   | | | VM1 |<------>|   Router/   |<------>| VM2 | | |
   | | +-----+   VW   |  Firewall/  |   VW   +-----+ | |
   | |                |    etc.     |                | |
   | |                +-------------+                | |
   | +-----------------------------------------------+ |
   |                        ^                          |
   |                        | Network Cabling          |
   |                        v                          |
   | +-----------------------------------------------+ |
   | | Physical Networking Components                | |
   | | (switches, routers, firewalls etc.)           | |
   | +-----------------------------------------------+ |
   |                        ^                          |
   |                        | Network Cabling          |
   |                        v                          |
   | +-----------------------------------------------+ |
   | | Hyper-Visor                                   | |
   | |                +-------------+                | |
   | |                |     NVP     |                | |
   | | +-----+        |   Switch/   |        +-----+ | |
   | | | VM1 |<------>|   Router/   |<------>| VM2 | | |
   | | +-----+   VW   |  Firewall/  |   VW   +-----+ | |
   | |                |    etc.     |                | |
   | |                +-------------+                | |
   | +-----------------------------------------------+ |
   +---------------------------------------------------+
     Legend
       VM: Virtual Machine
       VW: Virtual Wire

              Figure 3: Inter-Host System Under Test

   Virtual components have a direct dependency on the physical
   infrastructure that hosts them.  Hardware characteristics of the
   physical host impact the performance of the virtual components.  The
   components being tested, and the impact of the other hardware
   components within the hypervisor on the performance of the SUT, MUST
   be documented.  Virtual component performance is influenced by the
   physical hardware components within the hypervisor.  Access to
   various offloads, such as TCP segmentation offload, may have a
   significant impact on performance.
   Firmware and driver differences may also significantly impact
   results, depending on whether the specific driver leverages any
   hardware-level offloads offered.  Packet processing could be
   executed on shared or dedicated cores on the main processor, or via
   a dedicated co-processor or embedded processor on the NIC.

   Hence, all physical components of the physical server running the
   hypervisor that hosts the virtual components MUST be documented,
   along with the firmware and driver versions of all components used,
   to help ensure repeatability of test results.  For example, the BIOS
   configuration of the server MUST be documented, as some of those
   settings are designed to improve performance.  Please refer to
   Appendix A for a partial list of parameters to document.

5.1. Learning

   The SUT needs to learn all the addresses before running any tests.
   The address learning rate MUST be considered in the overall
   performance metrics, because it has a high impact in microservices-
   based use cases where there is a huge churn of end points as they
   are created and destroyed on demand.  In these cases, both the
   throughput at steady state and the time taken to reach steady state
   MUST be tested and documented.

5.2. Traffic Flow Optimizations

   Several mechanisms are employed to optimize traffic flows.  The
   following are some examples:

5.2.1. Fast Path

   A single flow may go through various switching, routing and
   firewalling decisions.  While in the standard model every single
   packet has to go through the entire process/pipeline, some
   optimizations make this decision for the first packet of a flow,
   store the resulting state, and leverage it to skip the process for
   the rest of the packets of the same flow.

5.2.2. Dedicated cores / Co-processors

   Packet processing is a CPU-intensive workload.
   Some NVEs may use dedicated cores or a co-processor primarily for
   packet processing instead of sharing the cores used for the actual
   workloads.  Such cases MUST be documented.  Tests MUST be performed
   with both shared and dedicated cores, and the results and the
   differences between them MUST be documented.

5.2.3. Prioritizing and de-prioritizing active flows

   Certain algorithms may prioritize or de-prioritize traffic flows
   based purely on their network characteristics, such as the length of
   the flow, for example de-prioritizing a long-lived flow.  This could
   change the performance of a flow over a period of time.  Such
   optimizations MUST be documented, and tests MUST include long-lived
   flows to help capture the change in performance for such flows.
   Tests MUST note the point at which performance changes.

5.3. Server Architecture Considerations

   When testing physical networking components, the approach taken is
   to consider the device as a black box.  With virtual infrastructure,
   this approach no longer helps, as the virtual networking components
   are an intrinsic part of the hypervisor they are running on and are
   directly impacted by the server architecture used.  Server hardware
   components define the capabilities of the virtual networking
   components.  Hence, the server architecture MUST be documented in
   detail to help with repeatability of tests, and the entire set of
   hardware and software components becomes the SUT.

5.3.1. NVE Component considerations

5.3.1.1. NVE co-located

   The components of a co-located NVE may be hypervisor based,
   offloaded entirely to the NIC card, or a hybrid model.  In the case
   of the hypervisor-based model, they may run in user space or kernel
   space.  Further, they may use dedicated cores, shared cores or, in
   some cases, dedicated co-processors.  All the components and the
   processes used MUST be documented.

5.3.1.2. NVE split

   The NVE split scenario generally has three primary components, as
   documented in RFC 8394:

   "tNVE: Terminal-side NVE.  The portion of Split-NVE functionalities
   located on the end device supporting virtualization.  The tNVE
   interacts with a Tenant System through an internal interface in the
   end device."  The tNVE may be made up of either hypervisor-
   controlled components, such as hypervisor-provided switches, or NVE-
   controlled components, where the network functionality is not
   provided by the hypervisor.  In either case, the components used
   MUST be documented.

   "nNVE: Network-side NVE.  The portion of Split-NVE functionalities
   located on the network device that is directly or indirectly
   connected to the end device that contains the corresponding NVE.
   The nNVE normally performs encapsulation to and decapsulation from
   the overlay network."  All the functionality provided by the nNVE
   MUST be documented.

   "External NVE: The physical network device that contains the nNVE."
   The networking device hardware specifications MUST be documented.
   Please use Appendix A for an example of the specifications that MUST
   be documented.

   In either case, NVE co-located or NVE split, all the components MUST
   be documented.  Where possible, individual components MUST be tested
   independent of the entire system.  For example, where possible,
   hypervisor-provided switching functionality MUST be tested
   independent of the NVE.

   Per RFC 8014, "for the split-NVE case, protocols will be needed that
   allow the hypervisor and NVE to negotiate and set up the necessary
   state so that traffic sent across the access link between a server
   and the NVE can be associated with the correct virtual network
   instance."  Supported VM lifecycle events, from RFC 8394 section 2,
   MUST be documented as part of the benchmark process.
   This process MUST also include how the hypervisor and the external
   NVE have signaled each other to reach an agreement; for an example,
   see Section 2.1 ("VM Creation Event") of RFC 8394.  The process used
   to update the agreement status MUST also be documented.

   +---------------------------------------------------+
   |                 System Under Test                 |
   | +-----------------------------------------------+ |
   | |                  Hyper-Visor                  | |
   | |  +-----+        +-------------+               | |
   | |  | VM1 |<------>|    tNVE     |               | |
   | |  +-----+   VW   +-------------+               | |
   | |                        ^                      | |
   | |                        | TSI                  | |
   | |                        v         Switch       | |
   | |          +--------------------------+         | |
   | |          |       External NVE       |         | |
   | |          |  Router/Firewall/etc.,   |         | |
   | |          +--------------------------+         | |
   | |                        ^                      | |
   | |                        | TSI                  | |
   | |                        v                      | |
   | |  +-------------+        +-----+               | |
   | |  |    tNVE     |<------>| VM2 |               | |
   | |  +-------------+   VW   +-----+               | |
   | +-----------------------------------------------+ |
   +---------------------------------------------------+

   Legend
   VM:   Virtual Machine
   VW:   Virtual Wire
   TSI:  Tenant System Interface
   tNVE: Terminal-side NVE

   Figure 4: NVE Split co-located - System Under Test

   +---------------------------------------------------+
   |                 System Under Test                 |
   | +-----------------------------------------------+ |
   | |                  Hyper-Visor                  | |
   | |                 +-------------+               | |
   | |  +-----+        |     NVP     |               | |
   | |  | VM1 |<------>|  Interface  |               | |
   | |  +-----+   VW   +-------------+               | |
   | +-----------------------------------------------+ |
   |                         ^                         |
   |                         | Network Cabling         |
   |                         v                         |
   | +-----------------------------------------------+ |
   | |  Physical switches, routers, firewalls etc.,  | |
   | +-----------------------------------------------+ |
   |                         ^                         |
   |                         | Network Cabling         |
   |                         v                         |
   | +-----------------------------------------------+ |
   | |  Hyper-Visor/  +--------------------------+   | |
   | |  ToR Switch/   |        NVP Split         |   | |
   | |  NIC etc.,     |  Router/Firewall/etc.,   |   | |
   | |                +--------------------------+   | |
   | +-----------------------------------------------+ |
   |                         ^                         |
   |                         | Network Cabling         |
   |                         v                         |
   | +-----------------------------------------------+ |
   | |  Physical switches, routers, firewalls etc.,  | |
   | +-----------------------------------------------+ |
   |                         ^                         |
   |                         | Network Cabling         |
   |                         v                         |
   | +-----------------------------------------------+ |
   | |  Hyper-Visor   +-------------+                | |
   | |                |     NVP     |        +-----+ | |
   | |                |  Interface  |<------>| VM2 | | |
   | |                +-------------+   VW   +-----+ | |
   | +-----------------------------------------------+ |
   +---------------------------------------------------+

   Legend
   VM: Virtual Machine
   VW: Virtual Wire

   Figure 5: NVE Split not co-located - System Under Test

5.3.2. Frame format/sizes within the Hypervisor

   The Maximum Transmission Unit (MTU) limits the frame sizes of
   physical network components.  The most commonly supported maximum
   MTU on physical devices is 9000 bytes, while 1500 bytes is the
   standard MTU.  Physical network testing and NFV testing use these
   MTU sizes.  However, the virtual networking components that live
   inside a hypervisor may work with much larger segments because of
   the availability of hardware- and software-based offloads.  Hence,
   testing based on the usual smaller packet sizes is not relevant for
   performance testing of virtual networking components.  All
   TCP-related configuration, such as the TSO size and the number of
   RSS queues, MUST be documented along with any other physical NIC
   related configuration.

   NVE co-located may have a different performance profile when
   compared with NVE split, because the co-located NVE may have access
   to offloads that are not available when the packet has to traverse
   the physical link.  Such differences MUST be documented.

5.3.3. Baseline testing with Logical Switch

   The Logical Switch is often an intrinsic component of the test
   system, along with any other hardware and software components used
   for testing.  Other logical components cannot be tested
   independently of the Logical Switch.

5.3.4. Repeatability

   To ensure repeatability of the results in physical network component
   testing, much care is taken to ensure that tests are conducted with
   exactly the same parameters, for example the MAC addresses used.

   When testing NVP components with an application-layer test tool,
   there may be a number of components within the system that cannot be
   tuned or held in a desired state, for example the housekeeping
   functions of the underlying operating system.

   Hence, tests MUST be repeated a number of times, and each test case
   MUST be run for at least two minutes if the test tool provides such
   an option.  Results SHOULD be derived from multiple test runs.  The
   variance between runs SHOULD be documented.

5.3.5. Tunnel encap/decap outside the Hypervisor

   Logical network components may also have a performance impact
   depending on the functionality available within the physical fabric.
   A physical fabric that supports NVO encap/decap is one such case
   that may have a different performance profile.  Any such
   functionality that exists in the physical fabric MUST be part of the
   test result documentation to ensure repeatability of tests.  In this
   case, the SUT MUST include the physical fabric if it is being used
   for encap/decap operations.

5.3.6. SUT Hypervisor Profile

   Physical networking equipment has well-defined physical resource
   characteristics, such as the type and number of ASICs/SoCs used, the
   amount of memory, and the type and number of processors.  The
   performance of virtual networking components depends on the physical
   hardware that hosts the hypervisor.
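   Parts of this hardware profile can be captured mechanically.  The
   sketch below is an illustrative helper, not part of this memo's
   requirements: it assumes a Linux hypervisor host where the
   'ethtool' utility is available, and parses 'ethtool -k' output so
   that NIC offload availability and status can be recorded alongside
   the test results.

```python
import subprocess


def parse_offloads(ethtool_output: str) -> dict:
    """Parse 'ethtool -k <iface>' output.

    Returns {feature-name: (enabled, fixed)} where 'fixed' means the
    driver does not allow the setting to be changed.
    """
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        # Feature lines look like 'tcp-segmentation-offload: on' or
        # 'large-receive-offload: off [fixed]'; skip header lines.
        if not (value.startswith("on") or value.startswith("off")):
            continue
        features[name.strip()] = (value.startswith("on"), "[fixed]" in value)
    return features


def document_offloads(iface: str) -> dict:
    """Capture offload status for the SUT report (Linux + ethtool assumed).

    The interface name is an assumption about the test environment.
    """
    out = subprocess.run(["ethtool", "-k", iface],
                         capture_output=True, text=True, check=True).stdout
    return parse_offloads(out)
```

   The resulting map can be attached verbatim to the documentation
   described in Appendix A.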
   Hence, the physical hardware usage, which is part of the SUT, MUST
   be documented for a given test; for example, the CPU usage when
   running a logical router.

   CPU usage changes based on the type of hardware available within the
   physical server.  For example, TCP Segmentation Offload greatly
   reduces CPU usage by offloading the segmentation process to the NIC
   on the sender side.  Receive Side Scaling offers a similar benefit
   on the receive side.  Hence, the availability and status of such
   hardware MUST be documented, along with the actual CPU/memory usage
   when the virtual networking components have access to such
   offload-capable hardware.

   The following is a partial list of components that MUST be
   documented, both in terms of what is available and what is used by
   the SUT:

   o  CPU - type, speed, available instruction sets (e.g., AES-NI)

   o  Memory - type, amount

   o  Storage - type, amount

   o  NIC cards -

      *  Type

      *  Number of ports

      *  Offloads available/used - the following is a partial list of
         possible features:

         o  TCP Segmentation Offload

         o  Large Receive Offload

         o  Checksum Offloads

         o  Receive Side Scaling

         o  Other queuing mechanisms

      *  Drivers, firmware (if applicable)

      *  HW revision

   o  Libraries such as DPDK, if available and used

   o  Number and type of VMs used for testing, and:

      *  vCPUs

      *  RAM

      *  Storage

      *  Network driver

      *  Any prioritization of VM resources

      *  Operating system type, version, and kernel, if applicable

      *  TCP configuration changes, if any

      *  MTU

   o  Test tool

      *  Workload type

      *  Protocol being tested

      *  Number of threads

      *  Version of tool

   o  For inter-hypervisor tests,

      *  Physical network devices that are part of the test

         o  Note: For inter-hypervisor tests, the system under test is
            no longer only the virtual component being tested; the
            entire fabric that connects the virtual components becomes
            part of the system under test.

5.4. Benchmarking Tools Considerations

5.4.1. Considerations for NVE

   Virtual network components in the NVE work closer to the application
   layer than physical networking components do, which enables the
   virtual network components to take advantage of TCP optimizations
   such as TCP Segmentation Offload (TSO) and Large Receive Offload
   (LRO).  Because of these optimizations, virtual network components
   work with types and sizes of segments that are often not the same as
   those the physical network works with.  Hence, virtual network
   components MUST be tested with application-layer segments instead of
   physical network layer packets.  Testing MUST be done with
   application-layer testing tools such as iperf and netperf.

5.4.2. Considerations for Split-NVE

   In the Split-NVE case, since the components may not leverage any
   TCP-related optimizations, typical network test tools focused on
   packet processing MUST be used.  However, the tools used MUST be
   able to leverage Receive Side Scaling (RSS).

6. Control Plane Scale Considerations

   For a holistic approach to performance testing, control plane
   performance must also be considered.  While the previous sections
   focused on performance tests run after the SUT has reached a steady
   state, the following section focuses on tests that measure the time
   taken to bring the SUT to a steady state.

   In a physical network infrastructure, this could involve various
   stages, such as boot-up time, the time taken to apply configuration,
   BGP convergence time, etc.  In a virtual infrastructure, this
   involves many more components, which may also be distributed across
   multiple hosts.
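   Such time-to-steady-state measurements reduce to polling a readiness
   condition and recording the elapsed time.  The sketch below is a
   minimal illustration; 'check_online' is a hypothetical probe of the
   SUT-provided network (for example, a ping to the VM's address or a
   query to the NVP management API) and is an assumption, not an
   interface defined by this memo.

```python
import time


def time_until_ready(check_online, timeout_s=300.0, poll_interval_s=0.5):
    """Return seconds until check_online() is True, or None on timeout.

    check_online: hypothetical zero-argument probe of the SUT-provided
    network; the timeout and poll interval are illustrative defaults.
    """
    start = time.monotonic()
    deadline = start + timeout_s
    while time.monotonic() < deadline:
        if check_online():
            return time.monotonic() - start
        time.sleep(poll_interval_s)
    return None
```

   Repeating such a measurement across many VMs, and at increasing
   scale, yields the control plane benchmarks discussed below.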
   Some of the components are:

   o  VM Creation Event

   o  VM Migration Event

   o  How many total VMs can the SUT support?

   o  At what rate does the SUT allow creation of VMs?

   Please refer to Section 2 of RFC 8394 for various VM events and
   their definitions.  In the following section, we further clarify
   some of the terms used in that RFC.

   VM Creation

   For the purposes of NVP control plane testing, a VM Creation event
   occurs when a VM starts participating for the first time on an
   NVP-provided network.  This involves various actions on the tNVE and
   the NVP.  Please refer to Section 2.1 ("VM Creation Event") of
   RFC 8394 for more details.

   In order to rule out any hypervisor-imposed limitations, the System
   Under Test must first be profiled and baselined without the use of
   NVP components.  For the purposes of baselining the control plane,
   the VM used may have a very small footprint, such as DSL Linux,
   which runs in 16 MB of RAM.

   Once a baseline has been established for a single hypervisor (HV), a
   similar exercise MUST be done on multiple HVs to establish a
   baseline for the entire hypervisor domain.  However, it may not be
   practical to have that many physical hosts, and hence nested hosts
   may be used for this purpose.

6.1.1. VM Events

   The performance of various control plane activities associated with
   the System Under Test MUST be documented.
   o  VM Creation: Time taken to join the VMs to the SUT-provided
      network

   o  Policy Realization: Time taken for policy realization on the VM

   o  VM Migration: Time taken to migrate a VM from one SUT-provided
      network to another SUT-provided network

   For the test itself, the following process could be used:

   1.  Call the API to join the VM to the SUT-provided network.

   2.  Loop while incrementing a timer until the VM comes online on the
       SUT-provided network.

   Similarly, policy realization and VM migration may also be tested
   with a check on whether the VM is available or not, based on the
   type of policy that is applied.

6.1.2. Scale

   The SUT must also be tested to determine the maximum scale
   supported.  Scale can be multi-faceted, such as the following:

   o  Total # of VMs per host

   o  Total # of VMs per SUT domain

   o  Total # of hosts per SUT domain

   o  Total # of Logical Switches per SUT domain

      *  Total # of VMs per SUT-provided Logical Switch

         o  Per host

         o  Per SUT domain

   o  Total # of Logical Routers per SUT domain

      *  Total # of Logical Switches per Logical Router

      *  Total # of VMs on a single Logical Router

   o  Total # of firewall sections

   o  Total # of firewall rules per section

   o  Total # of firewall rules applied per VM

   o  Total # of firewall rules applied per host

   o  Total # of firewall rules per SUT

6.1.3. Control Plane Performance at Scale

   Benchmarking MUST also test and document the control plane
   performance at scale.  That is:

   o  Total # of VMs that can be created in parallel

      *  How long does the action take?

   o  Total # of VMs that can be migrated in parallel

      *  How long does the action take?

   o  Total amount of time taken to apply one firewall rule across all
      the VMs under a SUT

   o  Time taken to apply thousands of rules on a SUT

7. Security Considerations

   Benchmarking activities as described in this memo are limited to
   technology characterization of a Device Under Test/System Under Test
   (DUT/SUT) using controlled stimuli in a laboratory environment, with
   dedicated address space and the constraints specified in the
   sections above.

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network or misroute traffic to the test
   management network.

   Further, benchmarking is performed on a "black-box" basis, relying
   solely on measurements observable external to the DUT/SUT.

   Special capabilities SHOULD NOT exist in the DUT/SUT specifically
   for benchmarking purposes.  Any implications for network security
   arising from the DUT/SUT SHOULD be identical in the lab and in
   production networks.

8. IANA Considerations

   No IANA action is requested at this time.

9. Conclusions

   Network Virtualization Platforms, because of their proximity to the
   application layer and because they can take advantage of TCP stack
   optimizations, do not function on a packets-per-second basis.
   Hence, traditional benchmarking methods, while still relevant for
   Network Function Virtualization, are not designed to test Network
   Virtualization Platforms.  Also, advances in application
   architectures, such as micro-services, bring new challenges and
   require benchmarking not just of throughput and latency but also of
   scale.  New benchmarking methods that are designed to take advantage
   of TCP optimizations are needed to accurately benchmark the
   performance of Network Virtualization Platforms.

10. References

10.1. Normative References

   [RFC2119] S. Bradner, "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997,
             https://tools.ietf.org/html/rfc2119

   [RFC8174] B. Leiba, "Ambiguity of Uppercase vs Lowercase in RFC 2119
             Key Words", BCP 14, RFC 8174, May 2017,
             https://tools.ietf.org/html/rfc8174

   [RFC7364] T. Narten, E. Gray, D. Black, L. Fang, L. Kreeger, M.
             Napierala, "Problem Statement: Overlays for Network
             Virtualization", RFC 7364, October 2014,
             https://datatracker.ietf.org/doc/rfc7364/

   [RFC8014] D. Black, J. Hudson, L. Kreeger, M. Lasserre, T. Narten,
             "An Architecture for Data-Center Network Virtualization
             over Layer 3 (NVO3)", RFC 8014, December 2016,
             https://tools.ietf.org/html/rfc8014

   [RFC8394] Y. Li, D. Eastlake 3rd, L. Kreeger, T. Narten, D. Black,
             "Split Network Virtualization Edge (Split-NVE) Control-
             Plane Requirements", RFC 8394, May 2018,
             https://tools.ietf.org/html/rfc8394

   [nv03]    IETF Network Virtualization Overlays (nvo3) Working Group

10.2. Informative References

   [RFC8172] A. Morton, "Considerations for Benchmarking Virtual
             Network Functions and Their Infrastructure", RFC 8172,
             July 2017, https://tools.ietf.org/html/rfc8172

11. Acknowledgments

   This document was prepared using 2-Word-v2.0.template.dot.

Appendix A.  Partial List of Parameters to Document

A.1. CPU

   CPU Vendor

   CPU Number

   CPU Architecture

   # of Sockets (CPUs)

   # of Cores

   Clock Speed (GHz)

   Max Turbo Freq. (GHz)

   Cache per CPU (MB)

   # of Memory Channels

   Chipset

   Hyperthreading (BIOS Setting)

   Power Management (BIOS Setting)

   VT-d

   Shared vs. Dedicated packet processing

   User space vs. Kernel space packet processing

A.2. Memory

   Memory Speed (MHz)

   DIMM Capacity (GB)

   # of DIMMs

   DIMM configuration

   Total DRAM (GB)

A.3. NIC

   Vendor

   Model

   Port Speed (Gbps)

   Ports

   PCIe Version

   PCIe Lanes

   Bonded

   Bonding Driver

   Kernel Module Name

   Driver Version

   VXLAN TSO Capable

   VXLAN RSS Capable

   Ring Buffer Size RX

   Ring Buffer Size TX

A.4. Hypervisor

   Hypervisor Name

   Version/Build

   Based on

   Hotfixes/Patches

   OVS Version/Build

   IRQ balancing

   vCPUs per VM

   Modifications to HV

   Modifications to HV TCP stack

   Number of VMs

   IP MTU

   Flow control TX (send pause)

   Flow control RX (honor pause)

   Encapsulation Type

A.5. Guest VM

   Guest OS & Version

   Modifications to VM

   IP MTU Guest VM (Bytes)

   Test tool used

   Number of NetPerf Instances

   Total Number of Streams

   Guest RAM (GB)

A.6. Overlay Network Physical Fabric

   Vendor

   Model

   # and Type of Ports

   Software Release

   Interface Configuration

   Interface/Ethernet MTU (Bytes)

   Flow control TX (send pause)

   Flow control RX (honor pause)

A.7. Gateway Network Physical Fabric

   Vendor

   Model

   # and Type of Ports

   Software Release

   Interface Configuration

   Interface/Ethernet MTU (Bytes)

   Flow control TX (send pause)

   Flow control RX (honor pause)

A.8. Metrics

   Drops on the virtual infrastructure

   Drops on the physical underlay infrastructure

Authors' Addresses

   Samuel Kommu
   VMware
   3401 Hillview Ave
   Palo Alto, CA 94304

   Email: skommu@vmware.com

   Jacob Rapp
   VMware
   3401 Hillview Ave
   Palo Alto, CA 94304

   Email: jrapp@vmware.com