Benchmarking Methodology Working Group                            K. Sun
Internet-Draft                                       Soongsil University
Intended status: Informational                                   H. Yang
Expires: 15 May 2022                                                  KT
                                                                  J. Lee
                                                                 T. Ngoc
                                                                  Y. Kim
                                                     Soongsil University
                                                        11 November 2021

  Considerations for Benchmarking Network Performance in Containerized
                            Infrastructures
                 draft-dcn-bmwg-containerized-infra-07
Abstract

   This draft describes considerations for benchmarking network
   performance in containerized infrastructures.  In a containerized
   infrastructure, Virtualized Network Functions (VNFs) are deployed on
   an operating-system-level virtualization platform that abstracts the
   user namespace, as opposed to virtualization using a hypervisor.
   Consequently, the system configurations and networking scenarios for
   benchmarking partially change according to how resource allocation
   and network technologies are specified for containerized VNFs.  In
   this draft, we compare the state of the art in container networking
   architectures with networking on VM-based virtualized systems and
   provide several test scenarios for benchmarking network performance
   in containerized infrastructures.
Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 15 May 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.
Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   4
   3.  Containerized Infrastructure Overview . . . . . . . . . . . .   4
   4.  Networking Models in Containerized Infrastructure . . . . . .   8
     4.1.  Kernel-space vSwitch Models . . . . . . . . . . . . . . .   9
     4.2.  User-space vSwitch Models . . . . . . . . . . . . . . . .  10
     4.3.  Smart-NIC Acceleration Model  . . . . . . . . . . . . . .  10
   5.  Performance Impacts . . . . . . . . . . . . . . . . . . . . .  12
     5.1.  CPU Isolation / NUMA Affinity . . . . . . . . . . . . . .  12
     5.2.  Hugepages . . . . . . . . . . . . . . . . . . . . . . . .  12
     5.3.  Additional Considerations . . . . . . . . . . . . . . . .  13
   6.  Security Considerations . . . . . . . . . . . . . . . . . . .  13
   7.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  13
     7.1.  Informative References  . . . . . . . . . . . . . . . . .  13
   Appendix A.  Benchmarking Experience (Contiv-VPP)  . . . . . . . .  15
     A.1.  Benchmarking Environment  . . . . . . . . . . . . . . . .  15
     A.2.  Troubleshooting and Results . . . . . . . . . . . . . . .  19
   Appendix B.  Benchmarking Experience (SR-IOV with DPDK)  . . . . .  20
     B.1.  Benchmarking Environment  . . . . . . . . . . . . . . . .  21
   Appendix C.  Benchmarking Experience (Multi-pod Test)  . . . . . .  24
     C.1.  Benchmarking Overview . . . . . . . . . . . . . . . . . .  24
     C.2.  Hardware Configurations . . . . . . . . . . . . . . . . .  25
     C.3.  NUMA Allocation Scenario  . . . . . . . . . . . . . . . .  27
     C.4.  Traffic Generator Configurations  . . . . . . . . . . . .  27
     C.5.  Benchmark Results and Troubleshooting . . . . . . . . . .  27
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  28
1.  Introduction

   The Benchmarking Methodology Working Group (BMWG) has recently
   expanded its benchmarking scope from Physical Network Functions
   (PNFs) running on dedicated hardware systems to Network Function
   Virtualization (NFV) infrastructure and Virtualized Network
   Functions (VNFs).  [RFC8172] describes considerations for
   configuring NFV infrastructure and benchmarking metrics, and
   [RFC8204] gives guidelines for benchmarking the virtual switch that
   connects VNFs in the Open Platform for NFV (OPNFV).

   Recently, NFV infrastructure has evolved to include a lightweight
   virtualized platform called the containerized infrastructure, where
   VNFs share the same host Operating System (OS) and are logically
   isolated through separate namespaces.  While previous NFV
   infrastructure uses a hypervisor to allocate resources for Virtual
   Machines (VMs) and instantiate VNFs on them, the containerized
   infrastructure virtualizes resources without a hypervisor, making
   containers very lightweight and more efficient in infrastructure
   resource utilization than the VM-based NFV infrastructure.  When we
   consider benchmarking for VNFs in the containerized infrastructure,
   it may have a different System Under Test (SUT) and Device Under
   Test (DUT) configuration compared with both black-box benchmarking
   and VM-based NFV infrastructure as described in [RFC8172].
   Accordingly, additional configuration parameters and testing
   strategies may be required.

   In the containerized infrastructure, a VNF network is implemented by
   running both switch and router functions in the host system.  For
   example, internal communication between VNFs in the same host uses
   the L2 bridge function, while communication with external node(s)
   uses the L3 router function.  For container networking, the host
   system may use a virtual switch (vSwitch), but other options exist.
   [ETSI-TST-009] describes differences in networking structure between
   the VM-based and the containerized infrastructure.  Because of these
   differences, the deployment scenarios for testing network
   performance described in [RFC8204] may apply only partially to the
   containerized infrastructure, and other scenarios may be required.

   This draft aims to distinguish benchmarking of the containerized
   infrastructure from the previous benchmarking methodology for common
   NFV infrastructure.  As in [RFC8204], the networking principle of
   the containerized infrastructure is based on a virtual switch
   (vSwitch), but there are several options and acceleration
   technologies.  At the same time, it is important to uncover the
   impact that the resource isolation methods used in a containerized
   infrastructure have on benchmark performance.  In addition, this
   draft contains benchmarking experiences with various combinations of
   resource isolation methods and networking models that can serve as a
   reference for setting up and benchmarking containerized
   infrastructure.  Note that, although the detailed configurations of
   both infrastructures differ, the benchmarks and metrics defined in
   [RFC8172] can be equally applied to the containerized infrastructure
   from a generic-NFV point of view, and therefore defining additional
   metrics or methodologies is out of scope.
2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].  This
   document uses the terminology described in [RFC8172], [RFC8204],
   and [ETSI-TST-009].
3.  Containerized Infrastructure Overview

   For the benchmarking of the containerized infrastructure, as
   mentioned in [RFC8172], the basic approach is to reuse existing
   benchmarking methods developed within the BMWG.  The various network
   function specifications defined in the BMWG should still be applied
   to containerized VNFs (C-VNFs) for performance comparison with
   physical network functions and VM-based VNFs.  A major distinction
   between the containerized infrastructure and the VM-based
   infrastructure is the absence of a hypervisor.  Without a
   hypervisor, all C-VNFs share the same host resources, including but
   not limited to computing, storage, and networking resources, as
   well as the host Operating System (OS), kernel, and libraries.
   These architectural differences bring additional resource
   management considerations for benchmarking.

   In a common containerized infrastructure, thanks to the
   proliferation of Kubernetes, the pod is defined as the basic unit
   for orchestration and management that can host multiple containers.
   Based on that, [ETSI-TST-009] defines two test scenarios for
   container infrastructure as follows.

   o  Container2Container: Communication between containers running in
      the same pod.  It can be implemented via shared volumes or
      Inter-Process Communication (IPC).

   o  Pod2Pod: Communication between containers running in different
      pods.

   As mentioned in [RFC8204], the vSwitch is also an important aspect
   of the containerized infrastructure.  For Pod2Pod communication,
   every pod basically has only one virtual Ethernet (vETH) interface.
   This interface is connected to the vSwitch via a vETH pair for each
   container.  Besides Pod2Pod, the Pod2External scenario, in which a
   pod communicates with an external node, is also required.  In this
   case, the vSwitch SHOULD support gateway and Network Address
   Translation (NAT) functionalities.
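   As an illustration of the gateway/NAT functionality mentioned above,
   the following sketch models only the bookkeeping of a source NAT for
   Pod2External traffic.  The class name, addresses, and port range are
   hypothetical; this is a toy model, not the behavior of any
   particular vSwitch:

```python
# Minimal sketch of source NAT as a vSwitch would apply it for
# Pod2External traffic: pod-private source addresses are rewritten to
# the node address, and a port mapping is kept for return traffic.
# (Hypothetical class and addresses, for illustration only.)
class SimpleSNAT:
    def __init__(self, node_ip):
        self.node_ip = node_ip
        self.next_port = 30000       # hypothetical NAT port range start
        self.mappings = {}           # (pod_ip, pod_port) -> node_port
        self.reverse = {}            # node_port -> (pod_ip, pod_port)

    def outbound(self, pod_ip, pod_port):
        """Rewrite a pod source address to the node address."""
        key = (pod_ip, pod_port)
        if key not in self.mappings:
            self.mappings[key] = self.next_port
            self.reverse[self.next_port] = key
            self.next_port += 1
        return (self.node_ip, self.mappings[key])

    def inbound(self, node_port):
        """Map a reply arriving at the node port back to the pod."""
        return self.reverse.get(node_port)

nat = SimpleSNAT("192.0.2.10")
src = nat.outbound("10.244.1.5", 51000)   # pod -> external
back = nat.inbound(src[1])                # external reply -> pod
```

   Outbound packets from a pod are rewritten to the node address with a
   per-flow port, and replies arriving at that port are mapped back to
   the original pod address, which is the gateway behavior the text
   above requires of the vSwitch.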
   Figure 1 briefly shows the differences among network architectures
   based on deployment models.  On bare metal, C-VNFs can be deployed
   as a cluster called a Pod by Kubernetes; otherwise, each C-VNF can
   be deployed separately using Docker.  In the former case, there is
   only one external network interface, even if a Pod contains more
   than one C-VNF.  An additional deployment model considers a
   scenario in which C-VNFs or Pods are running on a VM.  In this
   draft, we define new terminologies: BMP, which is a Pod on bare
   metal, and VMP, which is a Pod on a VM.
   +---------------------------------------------------------------------+
   |                           Baremetal Node                            |
   | +--------------+ +--------------+ +--------------+ +--------------+ |
   | |              | |     POD      | |      VM      | |      VM      | |
   | |              | |+------------+| |+------------+| |  +--------+  | |
   | |   C-VNF(A)   | || C-VNFs(B)  || || C-VNFs(C)  || |  |PODs(D) |  | |
   | |              | |+------------+| |+-----^------+| |  +---^----+  | |
   | |              | |              | |      |       | |      |       | |
   | |   +------+   | |   +------+   | |  +---v--+    | |  +---v--+    | |
   | +---| veth |---+ +---| veth |---+ +--|virtio|----+ +--|virtio|----+ |
   |     +--^---+         +---^--+        +--^---+         +---^--+      |
   |        |                 |              |                 |         |
   |        |                 |           +--v---+         +---v--+      |
   |  +-----|-----------------|-----------|vhost |---------|vhost |---+  |
   |  |     |                 |           +--^---+         +---^--+   |  |
   |  |     |                 |   vSwitch    |                 |      |  |
   |  |  +--v---+         +---v--+        +--v---+         +---v--+   |  |
   |  | +| veth |---------| veth |--------| Tap  |---------| Tap  |+  |  |
   |  | |+--^---+         +---^--+        +--^---+         +---^--+|  |  |
   |  | |   |      Bridge     |              |                 |   |  |  |
   |  | +---|-----------------|--------------|-----------------|---+  |  |
   |  +-----|-----------------|--------------|-----------------|------+  |
   |        |  +---------+    |          +---|-----------------|---+     |
   |        |  |Container|    |          |   |    Hypervisor   |   |     |
   |        |  | Engine  |    |          |   |                 |   |     |
   |        |  +---------+    |          +---|-----------------|---+     |
   |        |                 | Host Kernel  |                 |         |
   |     +--v-----------------v--------------v-----------------v--+      |
   |     |                  physical network                      |      |
   |     +--------------------------------------------------------+      |
   +---------------------------------------------------------------------+

      Figure 1: Examples of Networking Architecture based on Deployment
        Models - (A)C-VNF on Baremetal (B)Pod on Baremetal(BMP) (C)C-VNF
        on VM (D)Pod on VM(VMP)
   [ETSI-TST-009] describes data plane test scenarios in a single
   host.  In that document, there are two scenarios for the
   containerized infrastructure: Container2Container, which is
   internal communication between two containers in the same Pod, and
   the Pod2Pod model, which is communication between two containers
   running in different Pods.  According to our new terminologies, we
   can call the Pod2Pod model the BMP2BMP scenario.  When we consider
   containers running on a VM as an additional deployment option,
   there can be more single-host test scenarios, as follows:
   o  BMP2VMP scenario

   +--------------------------------------------------------------------+
   | HOST                             +-----------------------------+   |
   |                                  |VM  +-------------------+    |   |
   |                                  |    |       C-VNF       |    |   |
   |   +--------------------+         |    | +--------------+  |    |   |
   |   |       C-VNF        |         |    | | Logical Port |  |    |   |
   |   |  +--------------+  |         |    +-+---^------^---+--+    |   |
   |   |  | Logical Port |  |         |     +----|------|----+      |   |
   |   +--+---^------^---+--+         |     |  Logical Port  |      |   |
   |          |      |                +-----+----^------^----+------+   |
   |          |      |                           |      |               |
   | +--------v------|---------------------------|------v-------------+ |
   | |               l---------------------------l                    | |
   | |                    Data Plane Networking                       | |
   | |                   (Kernel or User space)                       | |
   | +--------^-----------------------------------------^-------------+ |
   |          |                                         |               |
   |    +-----v-----+                             +-----v-----+         |
   |    | Phy Port  |                             | Phy Port  |         |
   |    +-----------+                             +-----------+         |
   +----------^-----------------------------------------^---------------+
              |                                         |
   +----------v-----------------------------------------v---------------+
   |                                                                    |
   |                         Traffic Generator                          |
   |                                                                    |
   +--------------------------------------------------------------------+

                Figure 2: Single Host Test Scenario - BMP2VMP
   o  VMP2VMP scenario

   +--------------------------------------------------------------------+
   | HOST                                                               |
   | +-----------------------------+    +-----------------------------+ |
   | |VM  +-------------------+    |    |VM  +-------------------+    | |
   | |    |       C-VNF       |    |    |    |       C-VNF       |    | |
   | |    | +--------------+  |    |    |    | +--------------+  |    | |
   | |    | | Logical Port |  |    |    |    | | Logical Port |  |    | |
   | |    +-+---^------^---+--+    |    |    +-+---^------^---+--+    | |
   | |     +----|------|----+      |    |     +----|------|----+      | |
   | |     |  Logical Port  |      |    |     |  Logical Port  |      | |
   | +-----+----^------^----+------+    +-----+----^------^----+------+ |
   |            |      |                           |      |             |
   | +----------v------|---------------------------|------v-----------+ |
   | |                 l---------------------------l                  | |
   | |                    Data Plane Networking                       | |
   | |                   (Kernel or User space)                       | |
   | +----------^-----------------------------------------^-----------+ |
   |            |                                         |             |
   |      +-----v-----+                             +-----v-----+       |
   |      | Phy Port  |                             | Phy Port  |       |
   |      +-----------+                             +-----------+       |
   +------------^-----------------------------------------^-------------+
                |                                         |
   +------------v-----------------------------------------v-------------+
   |                                                                    |
   |                         Traffic Generator                          |
   |                                                                    |
   +--------------------------------------------------------------------+

                Figure 3: Single Host Test Scenario - VMP2VMP
4.  Networking Models in Containerized Infrastructure

   Container networking services are provided as network plugins.
   Using them, network services are deployed in an environment isolated
   from the container runtime through the host namespace; a virtual
   interface is created, and an interface and IP address are allocated
   to the C-VNF.  Since the containerized infrastructure has a
   different network architecture depending on the plugins it uses, it
   is necessary to specify the plugin used in the infrastructure.  For
   Kubernetes infrastructure in particular, several Container Network
   Interface (CNI) plugins have been developed; they describe network
   configurations in JSON-formatted files, and plugins are instantiated
   in new namespaces.  When a CNI plugin is initiated, it pushes
   forwarding rules and networking policies to the existing vSwitch
   (e.g., Linux bridge, Open vSwitch), or creates its own switch
   functions to provide networking services.
   The container network model can be classified according to the
   location of the vSwitch component.  Some CNI plugins provide
   networking without a vSwitch component; however, this draft focuses
   on plugins that use vSwitch components.
4.1.  Kernel-space vSwitch Models

   +--------------------------------------------------------------------+
   | User Space                                                         |
   |   +-----------+                                +-----------+       |
   |   |   C-VNF   |                                |   C-VNF   |       |
   |   | +-------+ |                                | +-------+ |       |
   |   +-|  eth  |-+                                +-|  eth  |-+       |
   |     +---^---+                                    +---^---+         |
   |         |                                            |             |
   |         |      +----------------------------------+  |             |
   |         |      |                                  |  |             |
   |         |      |  Networking Controller / Agent   |  |             |
   |         |      |                                  |  |             |
   |         |      +----------------^^----------------+  |             |
   ----------|-----------------------||-------------------|-------------
   |     +---v---+                   ||               +---v---+         |
   |  +--| veth  |-------------------vv---------------| veth  |-----+   |
   |  |  +-------+        vSwitch Component           +-------+     |   |
   |  |           (OVS Kernel Datapath, Linux Bridge, ..)           |   |
   |  +--------------------------------^----------------------------+   |
   |                                   |                                |
   | Kernel Space              +-------v--------+                       |
   +---------------------------|      NIC       |-----------------------+
                               +----------------+

               Figure 4: Examples of Kernel-Space vSwitch Model
   Figure 4 shows the kernel-space vSwitch model.  In this model, the
   vSwitch component runs in kernel space, so data packets must be
   processed in the network stack of the host kernel before being
   transferred to the C-VNF running in user space.  Not only
   Pod2External but also Pod2Pod traffic is processed in the kernel
   space.  For dynamic networking configuration, forwarding policies
   can be pushed by the controller/agent located in user space.  In
   the case of Open vSwitch (OVS) [OVS], the first packet of a flow
   can be sent to the user-space agent (ovs-vswitchd) for a forwarding
   decision.  Kernel-space vSwitch models are listed below:

   o  Docker Network [Docker-network], Flannel Network [Flannel],
      Calico [Calico], OVS (Open vSwitch) [OVS], OVN (Open Virtual
      Network) [OVN]
4.2.  User-space vSwitch Models

   +--------------------------------------------------------------------+
   | User Space                                                         |
   |  +---------------+                            +---------------+    |
   |  |     C-VNF     |                            |     C-VNF     |    |
   |  | +-----------+ |    +------------------+    | +-----------+ |    |
   |  | |virtio-user| |    |    Networking    |    | |virtio-user| |    |
   |  +-|   / eth   |-+    | Controller/Agent |    +-|   / eth   |-+    |
   |    +-----^-----+      +--------^^--------+      +-----^-----+      |
   |          |                     ||                     |            |
   |    +-----v-----+               ||               +-----v-----+      |
   |    |vhost-user |               ||               |vhost-user |      |
   |  +-|  / veth   |---------------vv---------------|  / veth   |-+    |
   |  | +-----------+                                +-----------+ |    |
   |  |                         vSwitch                            |    |
   |  |                    +--------------+                        |    |
   |  +--------------------|  PMD Driver  |------------------------+    |
   |                       +-------^------+                             |
   |                               |                                    |
   --------------------------------|-------------------------------------
   |                               |                                    |
   | Kernel Space          +-------v--------+                           |
   +-----------------------|      NIC       |---------------------------+
                           +----------------+

               Figure 5: Examples of User-Space vSwitch Model
   Figure 5 shows the user-space vSwitch model, in which data packets
   from the physical network port bypass kernel processing and are
   delivered directly to the vSwitch running in user space.  This
   model is commonly considered a Data Plane Acceleration (DPA)
   technology, since it achieves a higher packet processing rate than
   a kernel-space network, whose packet throughput is limited.  To
   bypass the kernel and transfer packets directly to the vSwitch, the
   Data Plane Development Kit (DPDK) is essentially required.  With
   DPDK, an additional driver called a Poll Mode Driver (PMD) is
   created in the vSwitch.  A PMD must be created for each NIC
   separately.  User-space vSwitch models are listed below:

   o  ovs-dpdk [ovs-dpdk], vpp [vpp]
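   The defining behavior of a PMD is that it busy-polls the NIC
   receive rings instead of sleeping on interrupts.  The following toy
   model sketches that control flow in plain Python: the list stands in
   for an RX descriptor ring, and the bounded read mimics DPDK's
   burst-style receive.  This is an illustration of the polling
   pattern, not DPDK code:

```python
# Toy model of a poll-mode receive loop: the "driver" repeatedly polls
# the RX ring and processes whatever burst is available, never blocking
# on an interrupt.  Empty polls simply spin (which is why a PMD core
# shows 100% CPU even when idle).
def poll_loop(rx_ring, iterations, burst_size=32):
    processed = []
    for _ in range(iterations):
        burst = rx_ring[:burst_size]   # bounded, rte_eth_rx_burst-style read
        del rx_ring[:burst_size]
        if not burst:
            continue                   # empty poll: spin again
        processed.extend(burst)
    return processed

ring = ["pkt%d" % i for i in range(70)]
out = poll_loop(ring, 5)
print(len(out))  # 70: three non-empty polls (32 + 32 + 6), two empty spins
```

   The same structure explains why a PMD is created per NIC: each
   polling loop is bound to the receive rings of one device.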
4.3.  Smart-NIC Acceleration Model

   +------------------------------------------------------------------+
   | User Space                                                       |
   |      +-----------------+           +-----------------+           |
   |      |      C-VNF      |           |      C-VNF      |           |
   |      |  +-----------+  |           |  +-----------+  |           |
   |      +--| vf driver |--+           +--| vf driver |--+           |
   |         +-----^-----+                 +-----^-----+              |
   |               |                             |                    |
   ----------------|-----------------------------|---------------------
   |               |                             |                    |
   | Kernel Space  |                             |                    |
   |      +--------|-----------------------------|----------+        |
   |      |  +-----v-----+                 +-----v-----+    |        |
   |      |  |  virtual  |                 |  virtual  |    |        |
   |      |  | function  |       NIC       | function  |    |        |
   |      |  +-----^-----+                 +-----^-----+    |        |
   +------|        |                             |          |--------+
          |  +-----v-----------------------------v-----+    |
          |  |           Classify and Queue            |    |
          |  +-----------------------------------------+    |
          +-------------------------------------------------+

             Figure 6: Examples of Smart-NIC Acceleration Model
   Figure 6 shows the Smart-NIC acceleration model, which does not use
   a vSwitch component.  This model can be separated into two
   technologies.  One is Single-Root I/O Virtualization (SR-IOV)
   [SR-IOV], an extension of the PCIe specifications that enables
   multiple partitions running simultaneously within a system to share
   PCIe devices.  In the NIC, there are virtual replicas of PCI
   functions known as Virtual Functions (VFs), and each of them is
   directly connected to a container's network interface.  Using
   SR-IOV, data packets from external sources bypass both kernel and
   user space and are directly forwarded to the container's virtual
   network interface.

   The other Smart-NIC acceleration technology is the extended
   Berkeley Packet Filter (eBPF) [eBPF], which enables running
   sandboxed programs in the Linux kernel without changing kernel
   source code or loading kernel modules.  To accelerate data plane
   performance, an eXpress Data Path (XDP) program can be attached to
   a specific NIC to offload packet processing without burdening the
   host CPU.

   A Smart-NIC can be used together with a vSwitch network model to
   improve network performance.  In [userspace-cni], several
   combinations of user-space vSwitch models with SR-IOV are
   supported.  For eBPF with DPDK, the DPDK libraries for using eBPF
   can be found at [DPDK_eBPF].
5.  Performance Impacts

5.1.  CPU Isolation / NUMA Affinity

   CPU pinning enables benefits such as maximizing cache utilization,
   eliminating operating system thread scheduling overhead, and
   coordinating network I/O by guaranteeing resources.  This
   technology is very effective at avoiding the "noisy neighbor"
   problem, as already demonstrated in existing experience
   [Intel-EPA].

   With NUMA, performance increases not only for CPU and memory but
   also for the network, since a network interface is connected to the
   PCIe slot of a specific NUMA node and therefore has locality.
   Using NUMA requires a strong understanding of the VNF's memory
   requirements: if a VNF uses more memory than a single NUMA node
   contains, overhead will occur because memory spills over to another
   NUMA node.  Network performance can also change depending on
   whether the CNF is attached to the same NUMA node as the physical
   network interface.  There is benchmarking experience for cross-NUMA
   performance impacts [ViNePERF].  Those tests measured cross-NUMA
   performance in three scenarios depending on the location of the
   traffic generator and traffic endpoint.  The results verified the
   following:

   o  A single NUMA node serving multiple interfaces performs worse
      than the performance degradation caused by crossing NUMA nodes

   o  Performance is worse when a VNF shares CPUs across NUMA nodes
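   In practice, NUMA affinity between a CNF and its NIC can be checked
   from sysfs: /sys/class/net/<dev>/device/numa_node reports the NIC's
   node, and /sys/devices/system/node/node0/cpulist reports that
   node's CPUs in the kernel's cpulist format.  The helper below is a
   sketch of parsing that format; the example CPU sets and cpulist
   string are hypothetical:

```python
def parse_cpulist(s):
    """Parse a kernel cpulist string such as '0-3,8,10-11' into a set
    of CPU ids (the format used by sysfs cpulist files)."""
    cpus = set()
    for part in s.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def same_numa(cnf_cpus, node_cpulist):
    """True if every CPU pinned to the CNF belongs to the NIC's NUMA
    node (i.e., no cross-NUMA traffic path)."""
    return cnf_cpus <= parse_cpulist(node_cpulist)

# Hypothetical example: CNF pinned to CPUs 2 and 3; NUMA node 0 owns
# CPUs 0-19 and their hyperthread siblings 40-59.
local = same_numa({2, 3}, "0-19,40-59")
remote = same_numa({25}, "0-19,40-59")
```

   A benchmarking setup can use such a check to place the CNF and the
   traffic path deliberately on the same node, or deliberately across
   nodes, to reproduce the scenarios above.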
5.2.  Hugepages

   Hugepages configure a large memory page size in order to reduce the
   Translation Lookaside Buffer (TLB) miss rate and increase
   application performance.  This improves the performance of logical/
   virtual-to-physical address lookups performed by the CPU's memory
   management unit, and generally overall system performance.  In the
   containerized infrastructure, the container is isolated at the
   application level, and administrators can set hugepages at a more
   granular level (e.g., Kubernetes allows the use of 512 MB hugepages
   for a container as a default value).  Moreover, these pages are
   dedicated to the application and not shared with other processes,
   so the application uses the pages more efficiently.  From a network
   benchmarking point of view, however, the impact on general packet
   processing can be relatively negligible, and it may be necessary to
   consider the application level to measure the impact together.  In
   the case of a DPDK application, as reported in [Intel-EPA],
   hugepages were verified to improve network performance because the
   packet handling processes run within the application.
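   The TLB argument can be made concrete with back-of-the-envelope
   arithmetic: the number of pages (and thus page-table/TLB entries)
   needed to map a fixed-size buffer shrinks by the ratio of the page
   sizes.  The 2 GiB pool size below is a hypothetical example, not a
   measured configuration:

```python
def pages_needed(buffer_bytes, page_bytes):
    # Ceiling division: a partial page still consumes a full TLB entry.
    return -(-buffer_bytes // page_bytes)

pool = 2 * (1 << 30)  # hypothetical 2 GiB packet-buffer pool

entries = {name: pages_needed(pool, size)
           for name, size in [("4KiB", 4 << 10),
                              ("2MiB", 2 << 20),
                              ("1GiB", 1 << 30)]}
print(entries)  # {'4KiB': 524288, '2MiB': 1024, '1GiB': 2}
```

   Mapping the same pool with 2 MiB instead of 4 KiB pages cuts the
   number of translations by a factor of 512, which is why DPDK-style
   applications that keep packet buffers in hugepages see the benefit
   described above.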
5.3.  Additional Considerations

   When we consider benchmarking for not only containerized but also
   VM-based infrastructure and network functions, benchmarking
   scenarios may contain various operational use cases.  Traditional
   black-box benchmarking focuses on measuring the in-out performance
   of packets from physical network ports, since the hardware is
   tightly coupled with its function and only a single function runs
   on its dedicated hardware.  However, in the NFV environment, the
   physical network port will commonly be connected to multiple VNFs
   (e.g., multiple PVP test setup architectures are described in
   [ETSI-TST-009]) rather than dedicated to a single VNF.  Therefore,
   benchmarking scenarios should reflect operational considerations
   such as the number of VNFs or network services defined by a set of
   VNFs in a single host.  [service-density], which proposed a way of
   measuring the performance of multiple NFV service instances at
   varied service densities on a single host, is one example of these
   operational benchmarking aspects.

   In this regard, traffic for benchmark testing can be classified
   into two types.  One is North/South traffic and the other is East/
   West traffic.  In North/South traffic, data is received from other
   servers and routed through a VNF.  East/West traffic, on the other
   hand, is sent and received between containers deployed in the same
   server and can pass through multiple containers; one example is
   Service Function Chaining.  Since network acceleration technologies
   in a container environment accelerate different areas depending on
   the method used, performance differences may occur depending on
   traffic patterns.
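   The distinction between the two traffic types can be stated as a
   simple predicate over the locations of the two endpoints.  The host
   labels below are hypothetical; this is only a schematic restatement
   of the classification above:

```python
def classify(src_host, dst_host, sut_host):
    """North/South if one endpoint is outside the SUT host,
    East/West if both endpoints are containers on the SUT host."""
    if src_host == sut_host and dst_host == sut_host:
        return "East/West"
    return "North/South"

# Pod-to-pod on the same server (e.g., one hop of a service chain):
ew = classify("worker1", "worker1", "worker1")
# Traffic arriving from another server and routed through a VNF:
ns = classify("external", "worker1", "worker1")
```

   A benchmark plan can apply such a predicate to each flow in a
   scenario to verify that both traffic patterns are actually
   exercised.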
6.  Security Considerations

   TBD
7.  References

7.1.  Informative References

   [Calico]   "Project Calico", July 2019, .

   [Docker-network]
              "Docker, Libnetwork design", July 2019, .

   [DPDK_eBPF]
              "DPDK-Berkeley Packet Filter Library", August 2021, .

   [eBPF]     "eBPF, extended Berkeley Packet Filter", July 2019, .

   [ETSI-TST-009]
              "Network Functions Virtualisation (NFV) Release 3;
              Testing; Specification of Networking Benchmarks and
              Measurement Methods for NFVI", October 2018.

   [Flannel]  "flannel 0.10.0 Documentation", July 2019, .

   [Intel-EPA]
              Intel, "Enhanced Platform Awareness in Kubernetes",
              2018, .

   [OVN]      "How to use Open Virtual Networking with Kubernetes",
              July 2019, .

   [OVS]      "Open Virtual Switch", July 2019, .

   [ovs-dpdk] "Open vSwitch with DPDK", July 2019, .

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", RFC 2119, March 1997, .

   [RFC8172]  Morton, A., "Considerations for Benchmarking Virtual
              Network Functions and Their Infrastructure", RFC 8172,
              July 2017, .

   [RFC8204]  Tahhan, M., O'Mahony, B., and A. Morton, "Benchmarking
              Virtual Switches in the Open Platform for NFV (OPNFV)",
              RFC 8204, September 2017, .

   [service-density]
              Konstantynowicz, M. and P. Mikus, "NFV Service Density
              Benchmarking", March 2019, .

   [SR-IOV]   "SRIOV for Container-networking", July 2019, .

   [userspace-cni]
              "Userspace CNI Plugin", August 2021, .

   [ViNePERF] Anuket Project, "Cross-NUMA performance measurements
              with VSPERF", March 2019, .

   [vpp]      "VPP with Containers", July 2019, .
Appendix A.  Benchmarking Experience (Contiv-VPP)

A.1.  Benchmarking Environment

   In this test, our purpose was to evaluate the performance of a
   user-space-based networking model for container infrastructure and
   to determine the relationship between resource allocation and
   network performance.  For this purpose, we set up Contiv-VPP, which
   is one of the user-space-based network solutions for container
   infrastructure, and tested it as described below.

   o  Three physical servers for benchmarking
   +-------------------+----------------------+--------------------------+
   | Node Name         | Specification        | Description              |
   +-------------------+----------------------+--------------------------+
   | Container Control |- Intel(R) Xeon(R)    | Container Deployment     |
   | for Master        |  CPU E5-2690         | and Network Allocation   |
   |                   |  (2Socket X 12Core)  |- ubuntu 18.04            |
   |                   |- MEM 128G            |- Kubernetes Master       |
   |                   |- DISK 2T             |- CNI Controller          |
   |                   |- Control plane : 1G  |.. Contiv-VPP Controller  |
   |                   |                      |.. Contiv-VPP Agent       |
   +-------------------+----------------------+--------------------------+
   | Container Service |- Intel(R) Xeon(R)    | Container Service        |
   | for Worker        |  Gold 6148           |- ubuntu 18.04            |
   |                   |  (2Socket X 20Core)  |- Kubernetes Worker       |
   |                   |- MEM 128G            |- CNI Agent               |
   |                   |- DISK 2T             |.. Contiv-VPP Agent       |
   |                   |- Control plane : 1G  |                          |
   |                   |- Data plane : MLX 10G|                          |
   |                   |  (1NIC 2PORT)        |                          |
   +-------------------+----------------------+--------------------------+
   | Packet Generator  |- Intel(R) Xeon(R)    | Packet Generator         |
   |                   |  CPU E5-2690         |- CentOS 7                |
   |                   |  (2Socket X 12Core)  |- installed TRex 2.4      |
   |                   |- MEM 128G            |                          |
   |                   |- DISK 2T             |                          |
   |                   |- Control plane : 1G  |                          |
   |                   |- Data plane : MLX 10G|                          |
   |                   |  (1NIC 2PORT)        |                          |
   +-------------------+----------------------+--------------------------+

              Figure 7: Test Environment-Server Specification
674 o The architecture of benchmarking
675 +----+ +--------------------------------------------------------+
676 | | | Containerized Infrastructure Master Node |
677 | | | +-----------+ |
678 | <-------> 1G PORT 0 | |
679 | | | +-----------+ |
680 | | +--------------------------------------------------------+
681 | |
682 | | +--------------------------------------------------------+
683 | | | Containerized Infrastructure Worker Node |
684 | | | +---------------------------------+ |
685 | s | | +-----------+ | +------------+ +------------+ | |
686 | w <-------> 1G PORT 0 | | | 10G PORT 0 | | 10G PORT 1 | | |
687 | i | | +-----------+ | +------^-----+ +------^-----+ | |
688 | t | | +--------|----------------|-------+ |
689 | c | +-----------------------------|----------------|---------+
690 | h | | |
691 | | +-----------------------------|----------------|---------+
692 | | | Packet Generator Node | | |
693 | | | +--------|----------------|-------+ |
694 | | | +-----------+ | +------v-----+ +------v-----+ | |
695 | <-------> 1G PORT 0 | | | 10G PORT 0 | | 10G PORT 1 | | |
696 | | | +-----------+ | +------------+ +------------+ | |
697 | | | +---------------------------------+ |
698 | | | |
699 +----+ +--------------------------------------------------------+
701 Figure 8: Test Environment-Architecture
   o  Network model of Containerized Infrastructure (User space Model)
704 +---------------------------------------------+---------------------+
705 | NUMA 0 | NUMA 0 |
706 +---------------------------------------------|---------------------+
707 | Containerized Infrastructure Worker Node | |
708 | +---------------------------+ | +----------------+ |
709 | | POD1 | | | POD2 | |
710 | | +-------------+ | | | +-------+ | |
711 | | | | | | | | | | |
712 | | +--v---+ +---v--+ | | | +-v--+ +-v--+ | |
713 | | | eth1 | | eth2 | | | | |eth1| |eth2| | |
714 | | +--^---+ +---^--+ | | | +-^--+ +-^--+ | |
715 | +------|-------------|------+ | +---|-------|----+ |
716 | +--- | | | | |
717 | | +-------|---------------|------+ | |
718 | | | | +------|--------------+ |
719 | +----------|--------|-------|--------|----+ | |
720 | | v v v v | | |
721 | | +-tap10--tap11-+ +-tap20--tap21-+ | | |
722 | | | ^ ^ | | ^ ^ | | | |
723 | | | | VRF1 | | | | VRF2 | | | | |
724 | | +--|--------|--+ +--|--------|--+ | | |
725 | | | +-----+ | +---+ | | |
726 | | +-tap01--|--|-------------|----|---+ | | |
727 | | | +------v--v-+ VRF0 +----v----v-+ | | | |
728 | | +-| 10G ETH0/0|------| 10G ETH0/1|-+ | | |
729 | | +---^-------+ +-------^---+ | | |
730 | | +---v-------+ +-------v---+ | | |
731 | +---| DPDK PMD0 |------| DPDK PMD1 |------+ | |
732 | +---^-------+ +-------^---+ | User Space |
733 +---------|----------------------|------------|---------------------+
734 | +-----|----------------------|-----+ | Kernel Space |
735 +---| +---V----+ +----v---+ |------|---------------------+
736 | | PORT 0 | 10G NIC | PORT 1 | | |
737 | +---^----+ +----^---+ |
738 +-----|----------------------|-----+
739 +-----|----------------------|-----+
740 +---| +---V----+ +----v---+ |----------------------------+
741 | | | PORT 0 | 10G NIC | PORT 1 | | Packet Generator (Trex) |
742 | | +--------+ +--------+ | |
743 | +----------------------------------+ |
744 +-------------------------------------------------------------------+
746 Figure 9: Test Environment-Network Architecture
   We set up a Contiv-VPP network to benchmark the user space container
   network model in the containerized infrastructure worker node.  We
   configured the network interfaces on NUMA0 and created two network
   subnets, VRF1 and VRF2, to classify input and output data traffic,
   respectively.  We then assigned the two interfaces connected to
   VRF1 and VRF2, and configured a routing table to route Trex packets
   from the eth1 interface to the eth2 interface inside the POD.
A.2.  Troubleshooting and Results
   In this environment, we confirmed that the routing table did not
   work when we sent packets using the Trex packet generator.  The
   reason is that, while a kernel space based network processes IP
   forwarding rules in the kernel stack, in VPP the IP packet
   forwarding rules are processed only in vrf0, the default virtual
   routing and forwarding table (VRF0).  That is, the testing
   architecture above causes a problem because the vrf1 and vrf2
   interfaces could not route packets.  Based on this result, we
   assigned vrf0 and vrf1 to the PODs, and the resulting data flow is
   shown below.
768 +---------------------------------------------+---------------------+
769 | NUMA 0 | NUMA 0 |
770 +---------------------------------------------|---------------------+
771 | Containerized Infrastructure Worker Node | |
772 | +---------------------------+ | +----------------+ |
773 | | POD1 | | | POD2 | |
774 | | +-------------+ | | | +-------+ | |
775 | | +--v----+ +---v--+ | | | +-v--+ +-v--+ | |
776 | | | eth1 | | eth2 | | | | |eth1| |eth2| | |
777 | | +--^---+ +---^--+ | | | +-^--+ +-^--+ | |
778 | +------|-------------|------+ | +---|-------|----+ |
779 | +-------+ | | | | |
780 | | +-------------|---------------|------+ | |
781 | | | | +------|--------------+ |
782 | +-----|-------|-------------|--------|----+ | |
783 | | | | v v | | |
784 | | | | +-tap10--tap11-+ | | |
785 | | | | | ^ ^ | | | |
786 | | | | | | VRF1 | | | | |
787 | | | | +--|--------|--+ | | |
788 | | | | | +---+ | | |
789 | | +-*tap00--*tap01----------|----|---+ | | |
790 | | | +-V-------v-+ VRF0 +----v----v-+ | | | |
791 | | +-| 10G ETH0/0|------| 10G ETH0/1|-+ | | |
792 | | +-----^-----+ +------^----+ | | |
793 | | +-----v-----+ +------v----+ | | |
794 | +---|*DPDK PMD0 |------|*DPDK PMD1 |------+ | |
795 | +-----^-----+ +------^----+ | User Space |
796 +-----------|-------------------|-------------|---------------------+
797 v v
798 *- CPU pinning interface
799 Figure 10: Test Environment-Network Architecture(CPU Pinning)
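   The root cause found above, that routes installed only in the
   default VRF are invisible to lookups performed in other VRFs, can
   be modeled with a short sketch.  This is an illustrative Python
   model, not VPP code; the prefix and interface names are made up:

```python
from ipaddress import ip_address, ip_network

class Vrf:
    """Minimal model of an isolated per-VRF routing table (illustrative,
    not VPP code): a lookup never falls back to another VRF."""

    def __init__(self, name):
        self.name = name
        self.routes = {}  # ip_network -> next-hop interface name

    def add_route(self, prefix, nexthop):
        self.routes[ip_network(prefix)] = nexthop

    def lookup(self, dst):
        addr = ip_address(dst)
        matches = [p for p in self.routes if addr in p]
        if not matches:
            return None  # no fallback to the default VRF
        best = max(matches, key=lambda p: p.prefixlen)  # longest prefix wins
        return self.routes[best]

vrf0, vrf1 = Vrf("vrf0"), Vrf("vrf1")
vrf0.add_route("10.1.1.0/24", "tap01")  # route installed only in vrf0

print(vrf0.lookup("10.1.1.5"))  # tap01
print(vrf1.lookup("10.1.1.5"))  # None: vrf1 cannot see vrf0's route
```

   In the same way, a packet arriving on an interface bound to vrf1 or
   vrf2 never consulted the forwarding rules held in vrf0, which is
   why the original architecture failed.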
   We conducted benchmarking under three conditions:

   o  Basic VPP switch

   o  General Kubernetes (no CPU pinning)

   o  CMK Shared Mode / Exclusive Mode

   In the basic Kubernetes environment, all PODs share the host's
   CPUs.  In Shared mode, PODs share a pool of CPUs assigned to a
   specific set of PODs, while in Exclusive mode a specific POD has a
   dedicated CPU.  In our tests, Shared mode assigned two CPUs shared
   by several PODs, and Exclusive mode dedicated one CPU to each POD,
   independently.  The results are shown in Figure 11.  First, tests
   were conducted to determine the line rate of the VPP switch and
   the baseline Kubernetes performance.  After that, we applied
   NUMA-aware pinning to the network interface using Shared Mode and
   Exclusive Mode, on the same node and on different nodes,
   respectively.  In the Exclusive and Shared mode tests, we
   confirmed that Exclusive mode performed better than Shared mode
   when CPUs from the same NUMA node were assigned.  However, we also
   observed reduced performance on the path between the VPP switch
   and the POD, which affected the overall result.
+--------------------+---------------------+-------------+
| Model              | NUMA Mode (pinning) | Result(Gbps)|
+--------------------+---------------------+-------------+
|                    | N/A                 | 3.1         |
| Switch only        +---------------------+-------------+
|                    | same NUMA           | 9.8         |
+--------------------+---------------------+-------------+
| K8S Scheduler      | N/A                 | 1.5         |
+--------------------+---------------------+-------------+
|                    | same NUMA           | 4.7         |
| CMK-Exclusive Mode +---------------------+-------------+
|                    | Different NUMA      | 3.1         |
+--------------------+---------------------+-------------+
|                    | same NUMA           | 3.5         |
| CMK-Shared Mode    +---------------------+-------------+
|                    | Different NUMA      | 2.3         |
+--------------------+---------------------+-------------+
837 Figure 11: Test Results
Appendix B.  Benchmarking Experience (SR-IOV with DPDK)
840 B.1. Benchmarking Environment
   The purpose of this test is to evaluate the performance of the user
   space based model for container infrastructure and to examine the
   relationship between resource allocation and network performance.
   To this end, we set up SR-IOV combined with DPDK to bypass the
   kernel space in the container infrastructure, and tested based on
   that setup.
   o  Three physical servers for benchmarking
+-------------------+-------------------------+------------------------+
| Node Name         | Specification           | Description            |
+-------------------+-------------------------+------------------------+
| Container Control |- Intel(R) Core(TM)      | Container Deployment   |
| for Master        |  i5-6200U CPU           | and Network Allocation |
|                   |  (1socket x 4Core)      |- ubuntu 18.04          |
|                   |- MEM 8G                 |- Kubernetes Master     |
|                   |- DISK 500GB             |- CNI Controller        |
|                   |- Control plane : 1G     |  MULTUS CNI            |
|                   |                         |  SRIOV plugin with DPDK|
+-------------------+-------------------------+------------------------+
| Container Service |- Intel(R) Xeon(R)       | Container Service      |
| for Worker        |  E5-2620 v3 @ 2.4Ghz    |- Centos 7.7            |
|                   |  (1socket X 6Core)      |- Kubernetes Worker     |
|                   |- MEM 128G               |- CNI Agent             |
|                   |- DISK 2T                |  MULTUS CNI            |
|                   |- Control plane : 1G     |  SRIOV plugin with DPDK|
|                   |- Data plane : XL710-qda2|                        |
|                   |  (1NIC 2PORT- 40Gb)     |                        |
+-------------------+-------------------------+------------------------+
| Packet Generator  |- Intel(R) Xeon(R)       | Packet Generator       |
|                   |  Gold 6148 @ 2.4Ghz     |- CentOS 7.7            |
|                   |  (2Socket X 20Core)     |- installed Trex 2.4    |
|                   |- MEM 128G               |                        |
|                   |- DISK 2T                |                        |
|                   |- Control plane : 1G     |                        |
|                   |- Data plane : XL710-qda2|                        |
|                   |  (1NIC 2PORT- 40Gb)     |                        |
+-------------------+-------------------------+------------------------+
880 Figure 12: Test Environment-Server Specification
882 o The architecture of benchmarking
883 +----+ +--------------------------------------------------------+
884 | | | Containerized Infrastructure Master Node |
885 | | | +-----------+ |
886 | <-------> 1G PORT 0 | |
887 | | | +-----------+ |
888 | | +--------------------------------------------------------+
889 | |
890 | | +--------------------------------------------------------+
891 | | | Containerized Infrastructure Worker Node |
892 | | | +---------------------------------+ |
893 | s | | +-----------+ | +------------+ +------------+ | |
894 | w <-------> 1G PORT 0 | | | 40G PORT 0 | | 40G PORT 1 | | |
895 | i | | +-----------+ | +------^-----+ +------^-----+ | |
896 | t | | +--------|----------------|-------+ |
897 | c | +-----------------------------|----------------|---------+
898 | h | | |
899 | | +-----------------------------|----------------|---------+
900 | | | Packet Generator Node | | |
901 | | | +--------|----------------|-------+ |
902 | | | +-----------+ | +------v-----+ +------v-----+ | |
903 | <-------> 1G PORT 0 | | | 40G PORT 0 | | 40G PORT 1 | | |
904 | | | +-----------+ | +------------+ +------------+ | |
905 | | | +---------------------------------+ |
906 | | | |
907 +----+ +--------------------------------------------------------+
909 Figure 13: Test Environment-Architecture
   o  Network model of Containerized Infrastructure (User space Model)
912 +---------------------------------------------+---------------------+
913 | CMK shared core | CMK exclusive core |
914 +---------------------------------------------|---------------------+
915 | Containerized Infrastructure Worker Node | |
916 | +---------------------------+ | +----------------+ |
917 | | POD1 | | | POD2 | |
918 | | (testpmd) | | | (testpmd) | |
919 | | +-------------+ | | | +-------+ | |
920 | | | | | | | | | | |
921 | | +--v---+ +---v--+ | | | +-v--+ +-v--+ | |
922 | | | eth1 | | eth2 | | | | |eth1| |eth2| | |
923 | | +--^---+ +---^--+ | | | +-^--+ +-^--+ | |
924 | +------|-------------|------+ | +---|-------|----+ |
925 | | | | | | |
926 | +------ +-+ | | | |
927 | | +----|-----------------|------+ | |
928 | | | | +--------|--------------+ |
929 | | | | | | User Space|
930 +---------|------------|----|--------|--------|---------------------+
931 | | | | | | |
932 | +--+ +------| | | | |
933 | | | | | | Kernel Space|
934 +------|--------|-----------|--------|--------+---------------------+
935 | +----|--------|-----------|--------|-----+ | |
936 | | +--v--+ +--v--+ +--v--+ +--v--+ | | NIC|
937 | | | VF0 | | VF1 | | VF2 | | VF3 | | | |
938 | | +--|---+ +|----+ +----|+ +-|---+ | | |
939 | +----|------|---------------|-----|------+ | |
940 +---| +v------v+ +-v-----v+ |------|---------------------+
941 | | PORT 0 | 40G NIC | PORT 1 | |
942 | +---^----+ +----^---+ |
943 +-----|----------------------|-----+
944 +-----|----------------------|-----+
945 +---| +---V----+ +----v---+ |----------------------------+
946 | | | PORT 0 | 40G NIC | PORT 1 | | Packet Generator (Trex) |
947 | | +--------+ +--------+ | |
948 | +----------------------------------+ |
949 +-------------------------------------------------------------------+
951 Figure 14: Test Environment-Network Architecture
   We set up Multus CNI and the SR-IOV CNI with DPDK to benchmark the
   user space container network model in the containerized
   infrastructure worker node.  Multus CNI supports creating multiple
   interfaces for a container.  Traffic bypasses the kernel space via
   SR-IOV with DPDK.  We configured two CMK modes: shared core and
   exclusive core.  We created VFs for each network interface of a
   container.  Then, we set up TRex to route packets from eth1 to
   eth2 inside a POD.
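   VF creation as described above is typically driven through the
   kernel's sysfs interface.  The sketch below only builds the shell
   commands without executing them (a dry run); the PF name "ens1f0"
   and the VF count are example values, not our actual configuration:

```python
# Sketch of the sysfs steps used to create SR-IOV VFs.  Dry run only:
# commands are returned as strings, not executed (running them requires
# root and an SR-IOV capable NIC).

def sriov_vf_commands(pf, num_vfs):
    sysfs = "/sys/class/net/{}/device/sriov_numvfs".format(pf)
    return [
        "echo 0 > {}".format(sysfs),          # reset any existing VFs first
        "echo {} > {}".format(num_vfs, sysfs) # create the requested VFs
    ]

for cmd in sriov_vf_commands("ens1f0", 4):
    print(cmd)
```

   The resulting VFs (VF0..VF3 in Figure 14) are then handed to pods
   through the SR-IOV device plugin, with DPDK binding them in user
   space.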
Appendix C.  Benchmarking Experience (Multi-pod Test)
964 C.1. Benchmarking Overview
   The main goal of this experiment was to benchmark a multi-pod
   scenario, in which a packet traverses two pods.  Multus CNI was
   used to create the additional interfaces needed to forward packets
   between the two pods.  We compared two userspace-vSwitch network
   technologies: OVS-DPDK and VPP-memif.  Since VPP-memif uses a
   different packet forwarding mechanism based on a shared memory
   interface, it is expected that VPP-memif may provide higher
   performance than OVS-DPDK.  We also considered the NUMA impact for
   both cases, defining 6 scenarios depending on the CPU location of
   the vSwitch and the two pods.  Figure 15 shows the packet
   forwarding scenario in this test, where two pods run on the same
   host and the vSwitch delivers packets between them.
978 +----------------------------------------------------------------+
979 |Worker Node |
980 | +--------------------------------------------------------+ |
981 | |Kubernetes | |
982 | | +--------------+ +--------------+ | |
983 | | | pod1 | | pod2 | | |
984 | | | +--------+ | | +--------+ | | |
985 | | | | L2FWD | | | | L2FWD | | | |
986 | | | +---^--v-+ | | +--^--v--+ | | |
987 | | | | DPDK | | | | DPDK | | | |
988 | | | +---^--v-+ | | +--^--v--+ | | |
989 | | +------^--v----+ +-----^--v-----+ | |
990 | | ^ v ^ v | |
991 | | +------^--v>>>>>>>>>>>>>>>>>>>>>>>>>>>^--v-----+ | |
992 | | | ^ OVS-DPDK / VPP-memif vSwitch v | | |
993 | | +------^---------------------------------v-----+ | |
994 | | | ^ PMD Driver v | | |
995 | | +------^---------------------------------v-----+ | |
996 | | ^ v | |
997 | +----------^---------------------------------v-----------+ |
998 | ^ v |
999 | +----------^---------------------------------v---------+ |
1000 | | ^ 40G NIC v | |
1001 | | +------^-------+ +--------v-----+ | |
1002 +---|---| Port 0 |----------------| Port 1 |---|-----+
1003 | +------^-------+ +--------v-----+ |
1004 +----------^---------------------------------v---------+
1005 +------^-------+ +--------v-----+
1006 +-------| Port 0 |----------------| Port 1 |---------+
1007 | +------^-------+ +--------v-----+ |
1008 | Traffic Generator (TRex) |
1009 | |
1010 +----------------------------------------------------------------+
1012 Figure 15: Multi-pod Benchmarking Scenario
1014 C.2. Hardware Configurations
+-------------------+-------------------------+------------------------+
| Node Name         | Specification           | Description            |
+-------------------+-------------------------+------------------------+
| Container Control |- Intel(R) Xeon(R)       | Container Deployment   |
| for Master        |  E5-2620v3 @ 2.40GHz    | and Network Allocation |
|                   |  (1socket x 12Cores)    |- ubuntu 18.04          |
|                   |- MEM 32GB               |- Kubernetes Master     |
|                   |- DISK 1TB               |- CNI Controller        |
|                   |- NIC: Control plane: 1G |  - MULTUS CNI          |
|                   |- OS: CentOS Linux 7.9   |  - DPDK-OVS/VPP-memif  |
+-------------------+-------------------------+------------------------+
| Container Service |- Intel(R) Xeon(R)       |- Container dpdk-L2fwd  |
| for Worker        |  Gold 6148 @ 2.40GHz    |- Kubernetes Worker     |
|                   |  (2socket X 40Cores)    |- CNI Agent             |
|                   |- MEM 256GB              |  - Multus CNI          |
|                   |- DISK 2TB               |  - DPDK-OVS/VPP-memif  |
|                   |- NIC                    |                        |
|                   |  - Control plane: 1G    |                        |
|                   | - Data plane: XL710-qda2|                        |
|                   |  (1NIC 2PORT- 40Gb)     |                        |
|                   |- OS: CentOS Linux 7.9   |                        |
+-------------------+-------------------------+------------------------+
| Packet Generator  |- Intel(R) Xeon(R)       | Packet Generator       |
|                   |  Gold 6148 @ 2.4Ghz     |- Installed Trex v2.92  |
|                   |  (2Socket X 40Core)     |                        |
|                   |- MEM 256GB              |                        |
|                   |- DISK 2TB               |                        |
|                   |- NIC                    |                        |
|                   | - Data plane: XL710-qda2|                        |
|                   |  (1NIC 2PORT - 40Gb)    |                        |
|                   |- OS: CentOS Linux 7.9   |                        |
+-------------------+-------------------------+------------------------+
1048 Figure 16: Hardware Configurations for Multi-pod Benchmarking
   For the installation and configuration of CNIs, we used the
   userspace-cni network plugin.  Multus provides the ability to
   create multiple interfaces for each pod.  Both OVS-DPDK and
   VPP-memif bypass the kernel using the DPDK PMD driver.  For CPU
   isolation and NUMA allocation, we used Intel CMK in exclusive
   mode.  Since the Trex generator had been upgraded, we used the
   latest version of Trex (v2.92).
1057 C.3. NUMA Allocation Scenario
   To analyze the benchmarking impact of different NUMA allocations,
   we defined 6 scenarios depending on the location of the CPUs
   allocated to the two pods and the vSwitch.  We did not consider
   the cross-NUMA case, which allocates CPUs to a pod or switch such
   that two cores are located in different NUMA nodes.  The 6
   scenarios we considered are listed in Table 1.  Note that the NIC
   is attached to NUMA1.
+============+=========+=======+=======+
| Scenario # | vSwitch | pod1  | pod2  |
+============+=========+=======+=======+
| S1         | NUMA1   | NUMA0 | NUMA0 |
+------------+---------+-------+-------+
| S2         | NUMA1   | NUMA1 | NUMA1 |
+------------+---------+-------+-------+
| S3         | NUMA0   | NUMA0 | NUMA0 |
+------------+---------+-------+-------+
| S4         | NUMA0   | NUMA1 | NUMA1 |
+------------+---------+-------+-------+
| S5         | NUMA1   | NUMA1 | NUMA0 |
+------------+---------+-------+-------+
| S6         | NUMA0   | NUMA0 | NUMA1 |
+------------+---------+-------+-------+
1082 Table 1: NUMA Allocation Scenarios
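   The six scenarios in Table 1 correspond to the eight possible
   (vSwitch, pod1, pod2) placements on two NUMA nodes, reduced by
   treating the two pods as interchangeable.  This reading is our
   interpretation, sketched below for illustration:

```python
# Enumerate NUMA placements for (vSwitch, pod1, pod2) on two NUMA nodes.
# Deduplicating placements that differ only by swapping pod1 and pod2
# (an assumption made here) leaves 6 distinct cases, as in Table 1.
from itertools import product

def distinct_scenarios():
    seen, result = set(), []
    for vswitch, p1, p2 in product((0, 1), repeat=3):
        key = (vswitch, tuple(sorted((p1, p2))))  # pods interchangeable
        if key not in seen:
            seen.add(key)
            result.append((vswitch, p1, p2))
    return result

scenarios = distinct_scenarios()
print(len(scenarios))  # 6
```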
1084 C.4. Traffic Generator Configurations
   For multi-pod benchmarking, we discovered the Non Drop Rate (NDR)
   with a binary search algorithm.  Trex provides a command to
   discover the NDR for each test.  We also tested different Ethernet
   frame sizes, from 64 bytes to 1518 bytes.  We ran Trex with the
   following command:

   ./ndr --stl --port 0 1 -v --profile stl/bench.py \
         --prof-tun size=x --opt-bin-search
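   The NDR discovery performed by the ndr command can be sketched as
   a binary search over the offered load.  The device under test
   below is synthetic (a simple capacity cutoff chosen for
   illustration); Trex's actual --opt-bin-search logic is internal to
   Trex:

```python
# Sketch of NDR (Non Drop Rate) discovery by binary search over offered
# load.  The DUT is simulated: it drops packets above a capacity limit.

def dut_loss(rate_gbps, capacity_gbps=4.7):
    """Synthetic DUT: zero loss up to capacity, loss beyond it."""
    if rate_gbps <= capacity_gbps:
        return 0.0
    return (rate_gbps - capacity_gbps) / rate_gbps

def find_ndr(line_rate_gbps, max_loss=0.0, precision=0.01):
    lo, hi = 0.0, line_rate_gbps
    while hi - lo > precision:
        mid = (lo + hi) / 2
        if dut_loss(mid) <= max_loss:
            lo = mid   # no drops observed: push the rate up
        else:
            hi = mid   # drops observed: back off
    return lo

print(round(find_ndr(10.0), 1))  # converges to ~4.7 for the synthetic DUT
```

   Each iteration halves the search interval, so the NDR is found in
   O(log(line_rate / precision)) trials rather than a linear sweep.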
C.5.  Benchmark Results and Troubleshooting
   Table 2 shows the benchmarking results for OVS-DPDK and VPP-memif
   with 1518-byte packets, expressed as a percentage of line rate.
   From these results, we can say that VPP-memif performs better than
   OVS-DPDK, which stems from the difference in how packets are
   forwarded between the vSwitch and the pod.  Also, the impact of
   NUMA allocation is larger when the vSwitch and both pods are
   placed on the same NUMA node than when CPUs are allocated on the
   node to which the NIC is attached.
+==================+=======+=======+=======+=======+=======+=======+
| Networking Model | S1    | S2    | S3    | S4    | S5    | S6    |
+==================+=======+=======+=======+=======+=======+=======+
| OVS-DPDK         | 21.29 | 13.17 | 6.32  | 19.76 | 12.43 | 6.38  |
+------------------+-------+-------+-------+-------+-------+-------+
| vpp-memif        | 59.96 | 34.17 | 45.13 | 57.1  | 33.47 | 44.92 |
+------------------+-------+-------+-------+-------+-------+-------+
1113 Table 2: Multi-pod Benchmarking Results (% of Line Rate)
1115 Authors' Addresses
1117 Kyoungjae Sun
1118 Soongsil University
1119 369, Sangdo-ro, Dongjak-gu
1120 Seoul
1121 06978
1122 Republic of Korea
1124 Phone: +82 10 3643 5627
1125 Email: gomjae@dcn.ssu.ac.kr
1127 Hyunsik Yang
1128 KT
1129 KT Research Center 151
1130 Taebong-ro, Seocho-gu
1131 Seoul
1132 06763
1133 Republic of Korea
1135 Phone: +82 10 9005 7439
1136 Email: yangun@dcn.ssu.ac.kr
1138 Jangwon Lee
1139 Soongsil University
1140 369, Sangdo-ro, Dongjak-gu
1141 Seoul
1142 06978
1143 Republic of Korea
1145 Phone: +82 10 7448 4664
1146 Email: jangwon.lee@dcn.ssu.ac.kr
1147 Tran Minh Ngoc
1148 Soongsil University
1149 369, Sangdo-ro, Dongjak-gu
1150 Seoul
1151 06978
1152 Republic of Korea
1154 Phone: +82 2 820 0841
1155 Email: mipearlska1307@dcn.ssu.ac.kr
1157 Younghan Kim
1158 Soongsil University
1159 369, Sangdo-ro, Dongjak-gu
1160 Seoul
1161 06978
1162 Republic of Korea
1164 Phone: +82 10 2691 0904
1165 Email: younghak@ssu.ac.kr