idnits 2.17.1
draft-dcn-bmwg-containerized-infra-05.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
** The document seems to lack an IANA Considerations section. (See Section
2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
when there are no actions for IANA.)
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
== The document doesn't use any RFC 2119 keywords, yet has text resembling
RFC 2119 boilerplate text.
-- The document date (November 02, 2020) is 1269 days in the past. Is this
intentional?
Checking references for intended status: Informational
----------------------------------------------------------------------------
No issues found here.
Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Benchmarking Methodology Working Group K. Sun
3 Internet-Draft H. Yang
4 Intended status: Informational J. Lee
5 Expires: May 6, 2021 H. Nguyen
6 Y. Kim
7 Soongsil University
8 November 02, 2020
10 Considerations for Benchmarking Network Performance in Containerized
11 Infrastructures
12 draft-dcn-bmwg-containerized-infra-05
14 Abstract
16 This draft describes considerations for benchmarking network
17 performance in containerized infrastructures. In a containerized
18 infrastructure, Virtualized Network Functions (VNFs) are deployed
19 on an operating-system-level virtualization platform that isolates
20 user namespaces, as opposed to virtualization using a hypervisor.
21 Accordingly, the system configurations and networking scenarios
22 for benchmarking are partially changed by the way resources are
23 allocated and network technologies are specified for containerized
24 VNFs. In this draft, we compare the state of the art in container
25 networking architecture with networking on VM-based virtualized
26 systems, and provide several test scenarios for benchmarking
27 network performance in containerized infrastructures.
29 Status of This Memo
31 This Internet-Draft is submitted in full conformance with the
32 provisions of BCP 78 and BCP 79.
34 Internet-Drafts are working documents of the Internet Engineering
35 Task Force (IETF). Note that other groups may also distribute
36 working documents as Internet-Drafts. The list of current Internet-
37 Drafts is at https://datatracker.ietf.org/drafts/current/.
39 Internet-Drafts are draft documents valid for a maximum of six months
40 and may be updated, replaced, or obsoleted by other documents at any
41 time. It is inappropriate to use Internet-Drafts as reference
42 material or to cite them other than as "work in progress."
44 This Internet-Draft will expire on May 6, 2021.
46 Copyright Notice
48 Copyright (c) 2020 IETF Trust and the persons identified as the
49 document authors. All rights reserved.
51 This document is subject to BCP 78 and the IETF Trust's Legal
52 Provisions Relating to IETF Documents
53 (https://trustee.ietf.org/license-info) in effect on the date of
54 publication of this document. Please review these documents
55 carefully, as they describe your rights and restrictions with respect
56 to this document. Code Components extracted from this document must
57 include Simplified BSD License text as described in Section 4.e of
58 the Trust Legal Provisions and are provided without warranty as
59 described in the Simplified BSD License.
61 Table of Contents
63 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
64 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3
65 3. Benchmarking Considerations . . . . . . . . . . . . . . . . . 4
66 3.1. Comparison with the VM-based Infrastructure . . . . . . . 4
67 3.2. Container Networking Classification . . . . . . . . . . . 5
68 3.3. Resource Considerations . . . . . . . . . . . . . . . . . 8
69 4. Benchmarking Scenarios for the Containerized Infrastructure . 10
70 5. Additional Considerations . . . . . . . . . . . . . . . . . . 12
71 6. Benchmarking Experience(Contiv-VPP) . . . . . . . . . . . . . 13
72 6.1. Benchmarking Environment(Contiv-VPP) . . . . . . . . . . 13
73 6.2. Trouble shooting and Result . . . . . . . . . . . . . . . 17
74 7. Benchmarking Experiment(SR-IoV-DPDK) . . . . . . . . . . . . 19
75 7.1. Benchmarking Environment(SR-IoV-DPDK) . . . . . . . . . . 19
76 7.2. Trouble shooting and Result(SR-IoV-DPDK) . . . . . . . . 23
77 8. Security Considerations . . . . . . . . . . . . . . . . . . . 23
78 9. Acknowledgement . . . . . . . . . . . . . . . . . . . . . 23
79 10. Informative References . . . . . . . . . . . . . . . . . . . 23
80 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 24
82 1. Introduction
84 The Benchmarking Methodology Working Group (BMWG) has recently
85 expanded its benchmarking scope from Physical Network Functions
86 (PNFs) running on dedicated hardware systems to the Network
87 Function Virtualization (NFV) infrastructure and Virtualized
88 Network Functions (VNFs). [RFC8172] describes considerations for
89 configuring NFV infrastructure and benchmarking metrics, and
90 [RFC8204] gives guidelines for benchmarking a virtual switch that
91 connects VNFs in the Open Platform for NFV (OPNFV).
93 Recently NFV infrastructure has evolved to include a lightweight
94 virtualized platform called the containerized infrastructure, where
95 VNFs share the same host Operating System(OS) and they are logically
96 isolated in separate namespaces. While the previous NFV
97 infrastructure uses a hypervisor to allocate resources for Virtual
98 Machines (VMs) and instantiate VNFs on them, the containerized
99 infrastructure virtualizes resources without a hypervisor, thereby
100 making containers very lightweight and more efficient in
101 infrastructure resource utilization compared to the VM-based NFV
102 infrastructure. When we consider benchmarking for VNFs in the
103 containerized infrastructure, it may have a different System Under
104 Test(SUT) and Device Under Test(DUT) configuration compared with both
105 black-box benchmarking and VM-based NFV infrastructure as described
106 in [RFC8172]. Accordingly, additional configuration parameters and
107 testing strategies may be required.
109 In the containerized infrastructure, a VNF network is implemented by
110 running both switch and router functions in the host system. For
111 example, the internal communication between VNFs in the same host
112 uses the L2 bridge function, while communication with external
113 node(s) uses the L3 router function. For container networking, the
114 host system may use a virtual switch(vSwitch), but other options
115 exist. [ETSI-TST-009] describes differences in the networking
116 structure between the VM-based and the containerized
117 infrastructure. Because of these differences, the deployment
118 scenarios for testing network performance described in [RFC8204]
119 may be partially applied to the containerized infrastructure, but
120 other scenarios may be required.
122 In this draft, we describe differences and additional considerations
123 for benchmarking containerized infrastructure based on [RFC8172] and
124 [RFC8204]. In particular, we focus on differences in system
125 configuration parameters and networking configurations of the
126 containerized infrastructure compared with VM-based NFV
127 infrastructure. Note that, although the detailed configurations of
128 both infrastructures differ, the new benchmarks and metrics defined
129 in [RFC8172] can be equally applied in containerized infrastructure
130 from a generic-NFV point of view, and therefore defining additional
131 metrics or methodologies is out of scope.
133 2. Terminology
135 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
136 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
137 document are to be interpreted as described in [RFC2119]. This
138 document uses the terminology described in [RFC8172], [RFC8204],
139 and [ETSI-TST-009].
141 3. Benchmarking Considerations
143 3.1. Comparison with the VM-based Infrastructure
145 For the benchmarking of the containerized infrastructure, as
146 mentioned in [RFC8172], the basic approach is to reuse existing
147 benchmarking methods developed within the BMWG. Various network
148 function specifications defined in the BMWG should still be applied
149 to containerized VNFs (C-VNFs) for performance comparison with
150 physical network functions and VM-based VNFs.
152 +---------------------------------+ +--------------------------------+
153 |+--------------+ +--------------+| |+------------+ +------------+|
154 || Guest VM | | Guest VM || || Container | | Container ||
155 ||+------------+| |+------------+|| ||+----------+| |+----------+||
156 ||| APP || || APP ||| ||| APP || || APP |||
157 ||+------------+| |+------------+|| ||+----------+| |+----------+||
158 ||+------------+| |+------------+|| ||+----------+| |+----------+||
159 |||Guest Kernel|| ||Guest Kernel||| ||| Bin/Libs || || Bin/Libs |||
160 ||+------------+| |+------------+|| ||+----------+| |+----------+||
161 |+------^-------+ +-------^------+| |+-----^------+ +------^-----+|
162 |+------|-----------------|------+| |+-----|------------------|-----+|
163 || | Hypervisor | || || |+----------------+| ||
164 |+------|-----------------|------+| || ||Container Engine|| ||
165 |+------|-----------------|------+| || |+----------------+| ||
166 || | Host OS Kernel | || || | Host OS Kernel | ||
167 |+------|-----------------|-----+|| |+-----|------------------|-----+|
168 | +--v-----------------v--+ | | +---v------------------v---+ |
169 +----| physical network |----+ +--| physical network |--+
170 +-----------------------+ +--------------------------+
171 (a) VM-Based Infrastructure (b) Containerized Infrastructure
173 Figure 1: Comparison of NFV Infrastructures
175 In Figure 1, we describe two different NFV architectures: VM-based
176 and Containerized. A major distinction between the containerized and
177 the VM-based infrastructure is that with the former, all VNFs share
178 the same host resources including but not limited to computing,
179 storage and networking resources, as well as the host Operating
180 System(OS), kernel and libraries. The absence of the guest OS and
181 the hypervisor necessitates the following considerations for the
182 test environment:
184 o When we consider hardware configurations for the containerized
185 infrastructure, all components described in [RFC8172] can be part of
186 the test setup. While the capabilities of servers and storage should
187 meet the minimum requirements for testing, it is possible to deploy a
188 test environment with fewer capabilities than in the VM-based
189 infrastructure.
191 o Regarding configuration parameters, the containerized
192 infrastructure needs a container management system (e.g., Linux
193 Containers (LXC), Docker Engine) instead of a hypervisor.
195 o In the VM-based infrastructure, each VM manipulates packets in the
196 kernel of its guest OS through its own CPU threads, which are
197 virtualized and assigned by the hypervisor. On the other hand,
198 C-VNFs use the host CPU without virtualization. These different
199 CPU assignment methods may lead to different CPU utilization
200 characteristics in performance benchmarking.
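The host-level CPU assignment that C-VNFs rely on can be exercised directly from user space. A minimal sketch, assuming a Linux host: container runtimes typically apply cpusets via cgroups, and sched_setaffinity is used here only as a stand-in to illustrate pinning a process to a specific host CPU.

```python
import os

# Pin the current process (a stand-in for a C-VNF) to host CPU 0,
# analogous to the cpuset a container runtime would apply.
os.sched_setaffinity(0, {0})

# The effective affinity mask now contains only CPU 0.
print(os.sched_getaffinity(0))  # {0}
```

A benchmarking harness can use the same mechanism to ensure the traffic generator and the function under test do not share cores.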
202 o From a Memory Management Unit (MMU) point of view, the paging
203 process differs between the two environments. The main difference
204 lies in the isolated nature of the OS for VM-based VNFs. In the
205 containerized infrastructure, memory paging, which translates
206 virtual addresses to physical addresses, is handled directly by
207 the host. Thus, the memory usage of each C-VNF is more dependent
208 on the host resource capabilities than that of a VM-based VNF.
211 3.2. Container Networking Classification
213 Container networking services are provided as network plugins.
214 Using them, network services are deployed in an environment
215 isolated from the container runtime through the host namespace:
216 a virtual interface is created, and an interface and IP address
217 are allocated to the C-VNF. Since a containerized infrastructure
218 has a different network architecture depending on the plugins it
219 uses, it is necessary to specify which plugin the infrastructure
220 uses. Two models have been proposed for configuring network
221 interfaces for containers, as below:
223 o CNM (Container Networking Model), proposed by Docker, uses
224 libnetwork, which provides an interface between the Docker daemon
225 and network drivers.
227 o CNI (Container Network Interface), proposed by CoreOS, describes
228 network configuration files in JSON format, and plugins are
229 instantiated in new namespaces. Kubernetes uses CNI to provide
230 network services.
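For reference, a minimal CNI network configuration file in the JSON format mentioned above might look like the following; the network name, bridge name, and subnet are illustrative values, not taken from any particular deployment:

```json
{
  "cniVersion": "0.4.0",
  "name": "example-net",
  "type": "bridge",
  "bridge": "cni0",
  "ipam": {
    "type": "host-local",
    "subnet": "10.22.0.0/16"
  }
}
```

When a container is created, the runtime invokes the plugin named in "type" with this configuration to set up the container's interface and address.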
232 Regardless of whether CNM or CNI is used, the container network
233 model can be classified into a kernel-space network model and a
234 user-space network model, according to where the network service
235 is created. In the kernel-space network model, network interfaces
236 are created in kernel space, so data packets are processed in the
237 network stack of the host kernel before being transferred to the
238 C-VNF running in user space. In the user-space network model, on
239 the other hand, data packets from the physical network port bypass
240 kernel processing and are delivered directly to user space.
241 Specific technologies for each network model and examples of
242 network architectures are as follows:
244 o Kernel space network model: Docker Network[Docker-network], Flannel
245 Network[Flannel], Calico[Calico], OVS(OpenvSwitch)[OVS], OVN(Open
246 Virtual Network)[OVN], eBPF[eBPF]
248 +------------------------------------------------------------------+
249 | User Space |
250 | +-----------+ +-----------+ |
251 | | Container | | Container | |
252 | | +-------+ | | +-------+ | |
253 | +-| eth |-+ +-| eth |-+ |
254 | +--^----+ +----^--+ |
255 | | +------------------------------------------+ | |
256 | | | vSwitch | | |
257 | | | +--------------------------------------+ | | |
258 | | | | +--v---v---v--+ | | | |
259 | | | |bridge | tag[n] | | | | |
260 | | | | +--^-------^--+ | | | |
261 | | | +--^-------------|-------|-----------^-+ | | |
262 | | | | +---+ +---+ | | | |
263 | | | | +------ v-----+ +-------v----+ | | | |
264 | | | | |tunnel bridge| | flat bridge | | | | |
265 | | | | +------^------+ +-------^-----+ | | | |
266 | | +--- |--------|----------------|-------|---+ | |
267 ---------|------ |--------|----------------|-------|------|---------
268 | +----|-------|--------|----------------|-------|------|----+ |
269 | | +--v-------v--+ | | +--v------v--+ | |
270 | | | veth | | | | veth | | |
271 | | +---^---------+ | | +---^--------+ | |
272 | | Kernel Datapath | | | |
273 | +---------------------|----------------|-------------------+ |
274 | | | |
275 | Kernel Space +--v----------------v--+ |
276 +----------------------| NIC |--------------------+
277 +----------------------+
279 Figure 2: Examples of Kernel Space Network Model
281 o User space network model / Device pass-through model: SR-
282 IOV[SR-IOV]
283 +------------------------------------------------------------------+
284 | User Space |
285 | +-----------------+ +-----------------+ |
286 | | Container | | Container | |
287 | | +-------------+ | | +-------------+ | |
288 | +-| vf driver |-+ +-| vf driver |-+ |
289 | +-----^-------+ +------^------+ |
290 | | | |
291 -------------|---------------------------------------|--------------
292 | +---------+ +---------+ |
293 | +------|-------------------|------+ |
294 | | +----v-----+ +-----v----+ | |
295 | | | virtual | | virtual | | |
296 | | | function | | function | | |
297 | Kernel Space | +----^-----+ NIC +-----^----+ | |
298 +---------------| | | |----------------+
299 | +----v-------------------v----+ |
300 | | Classify and Queue | |
301 | +-----------------------------+ |
302 +---------------------------------+
304 Figure 3: Examples of User Space Network Model - Device Pass-through
306 o User space network model / vSwitch model: ovs-dpdk[ovs-dpdk],
307 vpp[vpp], netmap[netmap]
308 +------------------------------------------------------------------+
309 | User Space |
310 | +-----------------+ +-----------------+ |
311 | | Container | | Container | |
312 | | +-------------+ | | +-------------+ | |
313 | +-| virtio-user |-+ +-| virtio-user |-+ |
314 | +-----^-------+ +-------^-----+ |
315 | | | |
316 | +---------+ +---------+ |
317 | +-----------------|--------------------|-----------------+ |
318 | | vSwitch | | | |
319 | | +-------v-----+ +-----v-------+ | |
320 | | | virtio-user | | virtio-user | | |
321 | | +-------^-----+ +-----^-------+ | |
322 | | +------------|--------------------|-------------+ | |
323 | | | +--v--------------------v---+ | | |
324 | | |bridge | tag[n] | | | |
325 | | | +------------^--------------+ | | |
326 | | +----------------------|------------------------+ | |
327 | | +-------v--------+ | |
328 | | | dpdk0 bridge | | |
329 | | +-------^--------+ | |
330 | +---------------------------|----------------------------+ |
331 | +-------v--------+ |
332 | | DPDK PMD | |
333 | +-------^--------+ |
334 ---------------------------------|----------------------------------
335 | Kernel Space +-----v------+ |
336 +--------------------------| NIC |--------------------------+
337 +------------+
339 Figure 4: Examples of User Space Network Model - vSwitch Model using
340 DPDK
342 3.3. Resource Considerations
344 In the containerized infrastructure, resource utilization and
345 isolation may have different characteristics compared with the VM-
346 based infrastructure. Some details are listed as follows:
348 o Hugepage
350 A hugepage configures a larger memory page size in order to reduce
351 the Translation Lookaside Buffer (TLB) miss rate and increase
352 application performance. This speeds up the virtual-to-physical
353 address lookups performed by the CPU's memory management unit, and
354 generally improves overall system performance. In the VM-based
355 infrastructure, the host OS and the hypervisor can configure
356 hugepages depending on the guest OS. For example, guest VMs
357 running CentOS or RedHat Linux require hugepages of at least 1 GB.
358 Even though this is a large page, it backs not only the running
359 application but also the guest OS processes, so the actual memory
360 available to the application is smaller.
364 In the containerized infrastructure, the container is isolated at
365 the application level, and administrators can configure hugepages
366 at a more granular level (e.g., Kubernetes allows 512 MB hugepages
367 for a container as a default value). Moreover, such a page is
368 dedicated to the application rather than shared with other
369 processes, so the application uses pages more efficiently.
370 Therefore, even if the page size is smaller than in the VM case,
371 the effect of hugepages is larger, improving physical memory
372 utilization and increasing the number of functions in the host.
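The TLB-pressure argument above can be made concrete with a back-of-the-envelope calculation: the number of page-table entries (and thus TLB entries) needed to map a fixed region shrinks in proportion to the page size. The page sizes used are the common x86-64 ones; the helper function is ours.

```python
def pages_needed(region_bytes, page_bytes):
    # Number of pages (hence potential TLB entries) required
    # to map a memory region of the given size.
    return (region_bytes + page_bytes - 1) // page_bytes

GiB = 1 << 30
for label, size in [("4KiB", 4 << 10), ("2MiB", 2 << 20), ("1GiB", 1 << 30)]:
    print(label, pages_needed(1 * GiB, size))
# 4KiB 262144
# 2MiB 512
# 1GiB 1
```

Mapping 1 GB with 4 KiB pages needs 262,144 entries, versus 512 with 2 MiB hugepages and a single entry with a 1 GiB hugepage, which is why packet-processing workloads benefit so strongly.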
374 o NUMA
376 NUMA technology can be used in both the VM-based and the
377 containerized infrastructure. With NUMA, performance increases
378 not only for CPU and memory but also for networking, since a
379 network interface connected to the PCIe slot of a specific NUMA
380 node has locality. Using NUMA requires a strong understanding of
381 the VNF's memory requirements. If the VNF uses more memory than a
382 single NUMA node contains, overhead occurs because memory spills
383 over to another NUMA node.
384 In the VM-based infrastructure, the hypervisor can extract the
385 NUMA topology and schedule VM workloads accordingly. In the
386 containerized infrastructure, however, it is more difficult to
387 expose the NUMA topology to the container, and it is currently
388 hard to guarantee memory locality when a container is deployed to
389 a host that has multiple NUMA nodes. For that reason, the
390 instantiation of C-VNFs is somewhat non-deterministic and
391 apparently NUMA-node agnostic, which is to say that performance
392 will likely vary across instantiations. So, when using NUMA in
393 the containerized infrastructure, repeated instantiation and
394 testing are required to quantify the performance variation.
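A practical first step before reasoning about C-VNF placement is discovering which CPUs belong to which NUMA node. A small sketch that parses `lscpu`-style output; the sample text and the helper name are illustrative, not from any particular host:

```python
def parse_numa_nodes(lscpu_output):
    # Build a map of NUMA node id -> CPU list string from lines like
    # "NUMA node0 CPU(s):   0-19,40-59".
    nodes = {}
    for line in lscpu_output.splitlines():
        if line.startswith("NUMA node") and "CPU(s)" in line:
            name, cpus = line.split(":", 1)
            node_id = int(name.split("node")[1].split()[0])
            nodes[node_id] = cpus.strip()
    return nodes

sample = """NUMA node(s):        2
NUMA node0 CPU(s):   0-19,40-59
NUMA node1 CPU(s):   20-39,60-79"""

print(parse_numa_nodes(sample))
# {0: '0-19,40-59', 1: '20-39,60-79'}
```

Knowing this mapping lets a tester check whether the CPUs and the NIC's PCIe slot assigned to a C-VNF fall on the same node, which is precisely the locality the paragraph above describes.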
396 o RX/TX Multiple-Queue
398 RX/TX Multiple-Queue technology [Multique], which enables packet
399 sending/receiving processing to scale with the number of available
400 vCPUs of a guest VM, may be used to enhance network performance in
401 the VM-based infrastructure. However, this technology is not yet
402 supported in the containerized infrastructure.
404 4. Benchmarking Scenarios for the Containerized Infrastructure
406 Figure 5 briefly shows the differences in network architectures
407 based on deployment models. On bare metal, C-VNFs can be deployed
408 as a cluster called a Pod by Kubernetes; otherwise, each C-VNF can
409 be deployed separately using Docker. In the former case, there is
410 only one external network interface, even if a Pod contains more
411 than one C-VNF. An additional deployment model considers a
412 scenario in which C-VNFs or Pods run on a VM. In this draft, we
413 define new terminologies: BMP, a Pod on bare metal, and VMP, a Pod
414 on a VM.
416 +---------------------------------------------------------------------+
417 | Baremetal Node |
418 | +--------------+ +--------------+ +-------------- + +-------------+ |
419 | | | | POD | | VM | | VM | |
420 | | | |+------------+| |+-------------+| | +-------+ | |
421 | | C-VNF(A) | || C-VNFs(B) || || C-VNFs(C) || | |PODs(D)| | |
422 | | | |+------------+| |+-----^-------+| | +---^---+ | |
423 | | | | | | | | | | | |
424 | | +------+ | | +------+ | | +--v---+ | | +---v--+ | |
425 | +---| veth |---+ +---| veth |---+ +---|virtio|----+ +--|virtio|---+ |
426 | +--^---+ +---^--+ +--^---+ +---^--+ |
427 | | | | | |
428 | | | +--v---+ +---v--+ |
429 | +------|-----------------|------------|vhost |---------|vhost |---+ |
430 | | | | +--^---+ +---^--+ | |
431 | | | | | | | |
432 | | +--v---+ +---v--+ +--v---+ +---v--+ | |
433 | | +-| veth |---------| veth |---------| Tap |---------| Tap |-+ | |
434 | | | +--^---+ +---^--+ +--^---+ +---^--+ | | |
435 | | | | | vSwitch | | | | |
436 | | | +--|-----------------|---------------|-----------------|--+ | | |
437 | | +-| | | Bridge | | |-+ | |
438 | | +--|-----------------|---------------|-----------------|--+ | |
439 | | | +---------+ | +--|-----------------|---+ | |
440 | | | |Container| | | | Hypervisor | | | |
441 | | | | Engine | | | | | | | |
442 | | | +---------+ | +--|-----------------|---+ | |
443 | | | | Host Kernel | | | |
444 | +------|-----------------|---------------|-----------------|------+ |
445 | +--v-----------------v---------------v-----------------v--+ |
446 +-----| physical network |-----+
447 +---------------------------------------------------------+
449 Figure 5: Examples of Networking Architecture based on Deployment
450 Models - (A)C-VNF on Baremetal (B)Pod on Baremetal(BMP) (C)C-VNF on
451 VM (D)Pod on VM(VMP)
453 [ETSI-TST-009] describes data plane test scenarios in a single
454 host. That document defines two scenarios for the containerized
455 infrastructure: the Container2Container model, internal
456 communication between two containers in the same Pod, and the
457 Pod2Pod model, communication between two containers running in
458 different Pods. Using our new terminologies, the Pod2Pod model can
459 be called the BMP2BMP scenario. When we consider a container
460 running on a VM as an additional deployment option, there can be
461 more single-host test scenarios, as follows:
463 o BMP2VMP scenario
465 +---------------------------------------------------------------------+
466 | HOST +-----------------------------+ |
467 | |VM +-------------------+ | |
468 | | | C-VNF | | |
469 | +--------------------+ | | +--------------+ | | |
470 | | C-VNF | | | | Logical Port | | | |
471 | | +--------------+ | | +-+--^-------^---+--+ | |
472 | | | Logical Port | | | +----|-------|---+ | |
473 | +-+--^-------^---+---+ | | Logical Port | | |
474 | | | +---+----^-------^---+--------+ |
475 | | | | | |
476 | +----v-------|----------------------------|-------v-------------+ |
477 | | l----------------------------l | |
478 | | Data Plane Networking | |
479 | | (Kernel or User space) | |
480 | +----^--------------------------------------------^-------------+ |
481 | | | |
482 | +----v------+ +----v------+ |
483 | | Phy Port | | Phy Port | |
484 | +-----------+ +-----------+ |
485 +-------^--------------------------------------------^----------------+
486 | |
487 +-------v--------------------------------------------v----------------+
488 | |
489 | Traffic Generator |
490 | |
491 +---------------------------------------------------------------------+
493 Figure 6: Single Host Test Scenario - BMP2VMP
495 o VMP2VMP scenario
497 +---------------------------------------------------------------------+
498 | HOST |
499 | +-----------------------------+ +-----------------------------+ |
500 | |VM +-------------------+ | |VM +-------------------+ | |
501 | | | C-VNF | | | | C-VNF | | |
502 | | | +--------------+ | | | | +--------------+ | | |
503 | | | | Logical Port | | | | | | Logical Port | | | |
504 | | +-+--^-------^---+--+ | | +-+--^-------^---+--+ | |
505 | | +----|-------|---+ | | +----|-------|---+ | |
506 | | | Logical Port | | | | Logical Port | | |
507 | +---+----^-------^---+--------+ +---+----^-------^---+--------+ |
508 | | | | | |
509 | +--------v-------v------------------------|-------v-------------+ |
510 | | l------------------------l | |
511 | | Data Plane Networking | |
512 | | (Kernel or User space) | |
513 | +----^--------------------------------------------^-------------+ |
514 | | | |
515 | +----v------+ +----v------+ |
516 | | Phy Port | | Phy Port | |
517 | +-----------+ +-----------+ |
518 +-------^--------------------------------------------^----------------+
519 | |
520 +-------v--------------------------------------------v----------------+
521 | |
522 | Traffic Generator |
523 | |
524 +---------------------------------------------------------------------+
526 Figure 7: Single Host Test Scenario - VMP2VMP
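The single-host scenarios built from the two Pod deployment models above (BMP and VMP) can be enumerated mechanically; a small sketch, with a helper name of our own choosing:

```python
from itertools import combinations_with_replacement

def single_host_scenarios(models=("BMP", "VMP")):
    # Each unordered pair of deployment models yields one
    # single-host test scenario name, e.g. "BMP2VMP".
    return sorted(f"{a}2{b}" for a, b in combinations_with_replacement(models, 2))

print(single_host_scenarios())  # ['BMP2BMP', 'BMP2VMP', 'VMP2VMP']
```

The three results correspond to the Pod2Pod (BMP2BMP) scenario from [ETSI-TST-009] plus the two VM-hosted variants shown in Figures 6 and 7; adding further deployment models would grow the list quadratically.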
528 5. Additional Considerations
530 When we consider benchmarking for not only the containerized but
531 also the VM-based infrastructure and network functions,
532 benchmarking scenarios may contain various operational use cases.
533 Traditional black-box benchmarking focuses on measuring the in-out
534 performance of packets at physical network ports, since the
535 hardware is tightly coupled with its function and only a single
536 function runs on its dedicated hardware. However, in the NFV
537 environment, a physical network port will commonly be connected to
538 multiple VNFs (i.e., the multiple PVP test setup architectures
539 described in [ETSI-TST-009]) rather than dedicated to a single
540 VNF. Therefore, benchmarking scenarios should reflect operational
541 considerations such as the number of VNFs, or network services
542 defined by a set of VNFs, in a single host.
543 [service-density], which proposes a way of measuring the
544 performance of multiple NFV service instances at a varied service
545 density on a single host, is one example of these operational
546 benchmarking aspects.
547 Regarding the above, benchmark traffic can be classified into two
548 types: North/South traffic and East/West traffic. North/South
549 traffic follows an architecture in which data is received from
550 other servers and routed through a VNF. East/West traffic, on the
551 other hand, is sent and received between containers deployed in
552 the same server, and can pass through multiple containers; Service
553 Function Chaining is one example. Since network acceleration
554 technologies in a container environment accelerate different areas
555 depending on the method provided, performance differences may
556 occur depending on the traffic pattern.
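The North/South versus East/West distinction can be made concrete with a toy classifier that looks only at whether both flow endpoints sit on the same server; the host names and flow records are hypothetical:

```python
def classify_traffic(src_host, dst_host):
    # East/West: both endpoints are containers on the same server.
    # North/South: traffic enters or leaves via the physical port.
    return "East/West" if src_host == dst_host else "North/South"

flows = [("node-1", "node-1"), ("node-1", "node-2")]
print([classify_traffic(s, d) for s, d in flows])
# ['East/West', 'North/South']
```

A benchmarking plan would then pair each traffic class with the acceleration technique under test, since (as noted above) a given technique may accelerate only one of the two paths.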
559 6. Benchmarking Experience(Contiv-VPP)
561 6.1. Benchmarking Environment(Contiv-VPP)
563 The purpose of this test is to benchmark the user-space network
564 model for container infrastructure and to find the relationship
565 between resource allocation and network performance. To this end,
566 we set up Contiv-VPP, one of the user-space network solutions for
567 container infrastructure, and tested it as described below.
569 o Three physical servers for benchmarking
571 +-------------------+----------------------+--------------------------+
572 | Node Name | Specification | Description |
573 +-------------------+----------------------+--------------------------+
574 | Container Control |- Intel(R) Xeon(R) | Container Deployment |
575 | for Master | CPU E5-2690 | and Network Allocation |
576 | | (2Socket X 12Core) |- ubuntu 18.04 |
577 | |- MEM 128G |- Kubernetes Master |
578 | |- DISK 2T |- CNI Controller |
579 | |- Control plane : 1G |.. Contiv VPP Controller |
580 | | |.. Contiv VPP Agent |
581 +-------------------+----------------------+--------------------------+
582 | Container Service |- Intel(R) Xeon(R) | Container Service |
583 | for Worker | Gold 6148 |- ubuntu 18.04 |
584 | | (2socket X 20Core) |- Kubernetes Worker |
585 | |- MEM 128G |- CNI Agent |
586 | |- DISK 2T |.. Contiv VPP Agent |
587 | |- Control plane : 1G | |
588 | |- Data plane : MLX 10G| |
589 | | (1NIC 2PORT) | |
590 +-------------------+----------------------+--------------------------+
591 | Packet Generator |- Intel(R) Xeon(R) | Packet Generator |
592 | | CPU E5-2690 |- CentOS 7 |
593 | | (2Socket X 12Core) |- installed TRex 2.4 |
594 | |- MEM 128G | |
595 | |- DISK 2T | |
596 | |- Control plane : 1G | |
597 | |- Data plane : MLX 10G| |
598 | | (1NIC 2PORT) | |
599 +-------------------+----------------------+--------------------------+
601 Figure 8: Test Environment-Server Specification
603 o The architecture of benchmarking
604 +----+ +--------------------------------------------------------+
605 | | | Containerized Infrastructure Master Node |
606 | | | +-----------+ |
607 | <-------> 1G PORT 0 | |
608 | | | +-----------+ |
609 | | +--------------------------------------------------------+
610 | |
611 | | +--------------------------------------------------------+
612 | | | Containerized Infrastructure Worker Node |
613 | | | +---------------------------------+ |
614 | s | | +-----------+ | +------------+ +------------+ | |
615 | w <-------> 1G PORT 0 | | | 10G PORT 0 | | 10G PORT 1 | | |
616 | i | | +-----------+ | +------^-----+ +------^-----+ | |
617 | t | | +--------|----------------|-------+ |
618 | c | +-----------------------------|----------------|---------+
619 | h | | |
620 | | +-----------------------------|----------------|---------+
621 | | | Packet Generator Node | | |
622 | | | +--------|----------------|-------+ |
623 | | | +-----------+ | +------v-----+ +------v-----+ | |
624 | <-------> 1G PORT 0 | | | 10G PORT 0 | | 10G PORT 1 | | |
625 | | | +-----------+ | +------------+ +------------+ | |
626 | | | +---------------------------------+ |
627 | | | |
628 +----+ +--------------------------------------------------------+
630 Figure 9: Test Environment-Architecture
632 o Network model of Containerized Infrastructure (User Space Model)
633 +---------------------------------------------+---------------------+
634 |                   NUMA 0                    |       NUMA 1        |
635 +---------------------------------------------|---------------------+
636 | Containerized Infrastructure Worker Node | |
637 | +---------------------------+ | +----------------+ |
638 | | POD1 | | | POD2 | |
639 | | +-------------+ | | | +-------+ | |
640 | | | | | | | | | | |
641 | | +--v---+ +---v--+ | | | +-v--+ +-v--+ | |
642 | | | eth1 | | eth2 | | | | |eth1| |eth2| | |
643 | | +--^---+ +---^--+ | | | +-^--+ +-^--+ | |
644 | +------|-------------|------+ | +---|-------|----+ |
645 | +--- | | | | |
646 | | +-------|---------------|------+ | |
647 | | | | +------|--------------+ |
648 | +----------|--------|-------|--------|----+ | |
649 | | v v v v | | |
650 | | +-tap10--tap11-+ +-tap20--tap21-+ | | |
651 | | | ^ ^ | | ^ ^ | | | |
652 | | | | VRF1 | | | | VRF2 | | | | |
653 | | +--|--------|--+ +--|--------|--+ | | |
654 | | | +-----+ | +---+ | | |
655 | | +-tap01--|--|-------------|----|---+ | | |
656 | | | +------v--v-+ VRF0 +----v----v-+ | | | |
657 | | +-| 10G ETH0/0|------| 10G ETH0/1|-+ | | |
658 | | +---^-------+ +-------^---+ | | |
659 | | +---v-------+ +-------v---+ | | |
660 | +---| DPDK PMD0 |------| DPDK PMD1 |------+ |                     |
661 | +---^-------+ +-------^---+ | User Space |
662 +---------|----------------------|------------|---------------------+
663 |   +-----|----------------------|-----+      | Kernel Space       |
664 +---| +---V----+ +----v---+ |------|---------------------+
665 | | PORT 0 | 10G NIC | PORT 1 | | |
666 | +---^----+ +----^---+ |
667 +-----|----------------------|-----+
668 +-----|----------------------|-----+
669 +---| +---V----+ +----v---+ |----------------------------+
670 | | | PORT 0 | 10G NIC | PORT 1 | | Packet Generator (Trex) |
671 | | +--------+ +--------+ | |
672 | +----------------------------------+ |
673 +-------------------------------------------------------------------+
675 Figure 10: Test Environment-Network Architecture
677 We set up a Contiv-VPP network to benchmark the user space container
678 network model in the containerized infrastructure worker node. We
679 attached the network interface to NUMA 0 and created two VRFs with
680 different subnets, VRF1 and VRF2, to classify input and output data
681 traffic, respectively. We then attached the two POD interfaces to
682 VRF1 and VRF2 and configured a routing table to route Trex packets
683 from the eth1 interface to the eth2 interface inside the POD.
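The VRF configuration described above can be sketched with VPP CLI commands. This is an illustrative sketch only: the table ids, interface names (tap10, tap11), and subnets are assumptions, not the exact values used in the test.

```shell
# Create two non-default VRF tables in VPP (table ids assumed).
vppctl ip table add 1
vppctl ip table add 2

# Move the POD-facing tap interfaces out of the default VRF0
# into VRF1 (input) and VRF2 (output).
vppctl set interface ip table tap10 1
vppctl set interface ip table tap11 2

# Add a route in VRF1 toward the POD so that Trex traffic entering
# eth1 is forwarded and leaves via eth2 (addresses assumed).
vppctl ip route add 10.10.2.0/24 table 1 via 10.10.1.2 tap10
```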
685 6.2. Troubleshooting and Results
687 In this environment, we found that the routing table did not work
688 when we sent packets from the Trex packet generator. The reason is
689 that, in a kernel space based network configuration, IP forwarding
690 rules are processed at the kernel stack level, whereas in VPP the
691 IP packet forwarding rule is processed only in VRF0, the default
692 virtual routing and forwarding table. That is, the testing
693 architecture above fails because the VRF1 and VRF2 interfaces
694 cannot route packets. Based on this result, we assigned VRF0 and
695 VRF1 to the PODs; the resulting data flow is shown below.
697 +---------------------------------------------+---------------------+
698 |                   NUMA 0                    |       NUMA 1        |
699 +---------------------------------------------|---------------------+
700 | Containerized Infrastructure Worker Node | |
701 | +---------------------------+ | +----------------+ |
702 | | POD1 | | | POD2 | |
703 | | +-------------+ | | | +-------+ | |
704 | | +--v----+ +---v--+ | | | +-v--+ +-v--+ | |
705 | | | eth1 | | eth2 | | | | |eth1| |eth2| | |
706 | | +--^---+ +---^--+ | | | +-^--+ +-^--+ | |
707 | +------|-------------|------+ | +---|-------|----+ |
708 | +-------+ | | | | |
709 | | +-------------|---------------|------+ | |
710 | | | | +------|--------------+ |
711 | +-----|-------|-------------|--------|----+ | |
712 | | | | v v | | |
713 | | | | +-tap10--tap11-+ | | |
714 | | | | | ^ ^ | | | |
715 | | | | | | VRF1 | | | | |
716 | | | | +--|--------|--+ | | |
717 | | | | | +---+ | | |
718 | | +-*tap00--*tap01----------|----|---+ | | |
719 | | | +-V-------v-+ VRF0 +----v----v-+ | | | |
720 | | +-| 10G ETH0/0|------| 10G ETH0/1|-+ | | |
721 | | +-----^-----+ +------^----+ | | |
722 | | +-----v-----+ +------v----+ | | |
723 | +---|*DPDK PMD0 |------|*DPDK PMD1 |------+ |                     |
724 | +-----^-----+ +------^----+ | User Space |
725 +-----------|-------------------|-------------|---------------------+
726 v v
727 *- CPU pinning interface
729 Figure 11: Test Environment-Network Architecture(CPU Pinning)
731 We conducted benchmarking under three conditions: a basic VPP
732 switch, general Kubernetes (no CPU pinning), and CMK shared /
733 exclusive mode. In the basic Kubernetes environment, all PODs
734 share the host's CPUs. In shared mode, several PODs share a pool
735 of CPUs assigned to them; in exclusive mode, a specific CPU is
736 dedicated to a single POD. We assigned two CPUs to be shared by
737 several PODs in shared mode, and dedicated one CPU to each POD in
738 exclusive mode. The results are shown in Figure 12. First, we
739 measured the line rate of the VPP switch itself and the baseline
740 Kubernetes performance. After that, we applied NUMA pinning to
741 the network interface using shared mode and exclusive mode, on
742 the same NUMA node and on different NUMA nodes, respectively. In
743 these tests, exclusive mode showed better performance than shared
744 mode when CPUs from the same NUMA node were assigned. However, we
745 also confirmed that performance drops in the section between the
746 VPP switch and the POD, which affects the total result.
749 +--------------------+---------------------+-------------+
750 | Model | NUMA Mode (pinning)| Result(Gbps)|
751 +--------------------+---------------------+-------------+
752 | | N/A | 3.1 |
753 | Switch only |---------------------+-------------+
754 | | same NUMA | 9.8 |
755 +--------------------+---------------------+-------------+
756 | K8S Scheduler | N/A | 1.5 |
757 +--------------------+---------------------+-------------+
758 | | same NUMA | 4.7 |
759 | CMK-Exclusive Mode +---------------------+-------------+
760 | | Different NUMA | 3.1 |
761 +--------------------+---------------------+-------------+
762 | | same NUMA | 3.5 |
763 | CMK-shared Mode +---------------------+-------------+
764 | | Different NUMA | 2.3 |
765 +--------------------+---------------------+-------------+
767 Figure 12: Test Results
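The relative effect of pinning in Figure 12 can be quantified directly from the table; the short script below (values transcribed from Figure 12) expresses each configuration as a share of the switch-only same-NUMA line rate:

```python
# Throughput results (Gbps) transcribed from Figure 12.
results = {
    "switch-only (same NUMA)": 9.8,
    "k8s scheduler (no pinning)": 1.5,
    "CMK exclusive, same NUMA": 4.7,
    "CMK exclusive, different NUMA": 3.1,
    "CMK shared, same NUMA": 3.5,
    "CMK shared, different NUMA": 2.3,
}

baseline = results["switch-only (same NUMA)"]
for name, gbps in results.items():
    share = 100 * gbps / baseline
    print(f"{name}: {gbps} Gbps ({share:.0f}% of switch-only line rate)")
```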
769 7. Benchmarking Experiment (SR-IOV with DPDK)
771 7.1. Benchmarking Environment (SR-IOV with DPDK)
773 The purpose of this test is to measure the performance of the user
774 space based model in a containerized infrastructure and to identify
775 the relationship between resource allocation and network
776 performance. To this end, we set up SR-IOV combined with DPDK to
777 bypass the kernel space in the containerized infrastructure.
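In this model, packet forwarding inside the POD happens entirely in user space; Figure 15 shows DPDK's testpmd playing that role. A sketch of a typical invocation is below; the core list, memory-channel count, and forwarding mode are illustrative assumptions, not the exact parameters used in the test.

```shell
# Run DPDK testpmd inside the POD, forwarding between the two
# VF-backed ports (eth1/eth2 in Figure 15). -l selects CPU cores,
# -n the number of memory channels; values here are illustrative.
testpmd -l 1-3 -n 4 -- --forward-mode=mac --nb-cores=2 --auto-start
```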
779 o Three physical servers for benchmarking
781 +-------------------+-------------------------+------------------------+
782 | Node Name | Specification | Description |
783 +-------------------+-------------------------+------------------------+
784 | Container Control |- Intel(R) Core(TM)      | Container Deployment   |
785 | for Master | i5-6200U CPU | and Network Allocation |
786 | | (1socket x 4Core) |- ubuntu 18.04 |
787 | |- MEM 8G |- Kubernetes Master |
788 |                   |- DISK 500GB             |- CNI Controller        |
789 | |- Control plane : 1G | MULTUS CNI |
790 | | | SRIOV plugin with DPDK|
791 +-------------------+-------------------------+------------------------+
792 | Container Service |- Intel(R) Xeon(R)       | Container Service      |
793 | for Worker | E5-2620 v3 @ 2.4Ghz |- Centos 7.7 |
794 | | (1socket X 6Core) |- Kubernetes Worker |
795 | |- MEM 128G |- CNI Agent |
796 | |- DISK 2T | MULTUS CNI |
797 | |- Control plane : 1G | SRIOV plugin with DPDK|
798 | |- Data plane : XL710-qda2| |
799 | | (1NIC 2PORT- 40Gb) | |
800 +-------------------+-------------------------+------------------------+
801 | Packet Generator |- Intel(R) Xeon(R) | Packet Generator |
802 | | Gold 6148 @ 2.4Ghz |- CentOS 7.7 |
803 | | (2Socket X 20Core) |- installed Trex 2.4 |
804 | |- MEM 128G | |
805 | |- DISK 2T | |
806 | |- Control plane : 1G | |
807 | |- Data plane : XL710-qda2| |
808 | | (1NIC 2PORT- 40Gb) | |
809 +-------------------+-------------------------+------------------------+
811 Figure 13: Test Environment-Server Specification
813 o The architecture of benchmarking
814 +----+ +--------------------------------------------------------+
815 | | | Containerized Infrastructure Master Node |
816 | | | +-----------+ |
817 | <-------> 1G PORT 0 | |
818 | | | +-----------+ |
819 | | +--------------------------------------------------------+
820 | |
821 | | +--------------------------------------------------------+
822 | | | Containerized Infrastructure Worker Node |
823 | | | +---------------------------------+ |
824 | s | | +-----------+ | +------------+ +------------+ | |
825 | w <-------> 1G PORT 0 | | | 40G PORT 0 | | 40G PORT 1 | | |
826 | i | | +-----------+ | +------^-----+ +------^-----+ | |
827 | t | | +--------|----------------|-------+ |
828 | c | +-----------------------------|----------------|---------+
829 | h | | |
830 | | +-----------------------------|----------------|---------+
831 | | | Packet Generator Node | | |
832 | | | +--------|----------------|-------+ |
833 | | | +-----------+ | +------v-----+ +------v-----+ | |
834 |   <------->  1G PORT 0 |    | | 40G PORT 0 |   | 40G PORT 1 | | |
835 | | | +-----------+ | +------------+ +------------+ | |
836 | | | +---------------------------------+ |
837 | | | |
838 +----+ +--------------------------------------------------------+
840 Figure 14: Test Environment-Architecture
842 o Network model of Containerized Infrastructure (User Space Model)
843 +---------------------------------------------+---------------------+
844 | CMK shared core | CMK exclusive core |
845 +---------------------------------------------|---------------------+
846 | Containerized Infrastructure Worker Node | |
847 | +---------------------------+ | +----------------+ |
848 | | POD1 | | | POD2 | |
849 | | (testpmd) | | | (testpmd) | |
850 | | +-------------+ | | | +-------+ | |
851 | | | | | | | | | | |
852 | | +--v---+ +---v--+ | | | +-v--+ +-v--+ | |
853 | | | eth1 | | eth2 | | | | |eth1| |eth2| | |
854 | | +--^---+ +---^--+ | | | +-^--+ +-^--+ | |
855 | +------|-------------|------+ | +---|-------|----+ |
856 | | | | | | |
857 | +------ +-+ | | | |
858 | | +----|-----------------|------+ | |
859 | | | | +--------|--------------+ |
860 | | | | | | User Space|
861 +---------|------------|----|--------|--------|---------------------+
862 | | | | | | |
863 | +--+ +------| | | | |
864 |      |        |           |        |        |         Kernel Space|
865 +------|--------|-----------|--------|--------+---------------------+
866 | +----|--------|-----------|--------|-----+ | |
867 | | +--v--+ +--v--+ +--v--+ +--v--+ | | NIC|
868 | | | VF0 | | VF1 | | VF2 | | VF3 | | | |
869 | | +--|---+ +|----+ +----|+ +-|---+ | | |
870 | +----|------|---------------|-----|------+ | |
871 +---| +v------v+ +-v-----v+ |------|---------------------+
872 | | PORT 0 | 40G NIC | PORT 1 | |
873 | +---^----+ +----^---+ |
874 +-----|----------------------|-----+
875 +-----|----------------------|-----+
876 +---| +---V----+ +----v---+ |----------------------------+
877 | | | PORT 0 | 40G NIC | PORT 1 | | Packet Generator (Trex) |
878 | | +--------+ +--------+ | |
879 | +----------------------------------+ |
880 +-------------------------------------------------------------------+
882 Figure 15: Test Environment-Network Architecture
884 We set up Multus CNI and the SR-IOV CNI with DPDK to benchmark the
885 user space container network model in the containerized
886 infrastructure worker node. Multus CNI supports creating multiple
887 interfaces for a container, and SR-IOV with DPDK allows traffic to
888 bypass the kernel space. We configured the two CMK modes: shared
889 core and exclusive core. We created VFs for each network interface
890 of a container, and then set up Trex to route packets from eth1 to
891 eth2 in a POD.
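The Multus attachment described above is typically declared as a NetworkAttachmentDefinition that references a VF pool exposed by the SR-IOV device plugin. The manifest below is a hedged sketch: the resource name, network name, and CNI options are assumptions, not the exact manifest used in this test.

```yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-dpdk-net1
  annotations:
    # Binds this network to a VF pool advertised by the
    # SR-IOV device plugin (resource name assumed).
    k8s.v1.cni.cncf.io/resourceName: intel.com/intel_sriov_dpdk
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "sriov",
      "name": "sriov-dpdk-net1"
    }'
```

A POD then requests its extra interfaces (eth1 and eth2 in Figure 15) with an annotation such as k8s.v1.cni.cncf.io/networks: sriov-dpdk-net1,sriov-dpdk-net2.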
893 7.2. Troubleshooting and Results (SR-IOV with DPDK)
895 TBD
897 8. Security Considerations
899 TBD
901 9. Acknowledgement
903 We would like to thank Al, Maciek, and Luis, who reviewed and gave
904 comments on previous versions of this draft.
906 10. Informative References
908 [Calico] "Project Calico", July 2019,
909 .
911 [Docker-network]
912 "Docker, Libnetwork design", July 2019,
913 .
915 [eBPF] "eBPF, extended Berkeley Packet Filter", July 2019,
916 .
918 [ETSI-TST-009]
919 "Network Functions Virtualisation (NFV) Release 3;
920 Testing; Specification of Networking Benchmarks and
921 Measurement Methods for NFVI", October 2018.
923 [Flannel] "flannel 0.10.0 Documentation", July 2019,
924 .
926 [Multique]
927 "Multiqueue virtio-net", July 2019,
928 .
930 [netmap] "Netmap: a framework for fast packet I/O", July 2019,
931 .
933 [OVN] "How to use Open Virtual Networking with Kubernetes", July
934 2019, .
936 [OVS] "Open Virtual Switch", July 2019,
937 .
939 [ovs-dpdk]
940 "Open vSwitch with DPDK", July 2019,
941 .
944 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
945 Requirement Levels", RFC 2119, March 1997.
947 [RFC8172] Morton, A., "Considerations for Benchmarking Virtual
948 Network Functions and Their Infrastructure", RFC 8172,
949 July 2017.
951 [RFC8204] Tahhan, M., O'Mahony, B., and A. Morton, "Benchmarking
952 Virtual Switches in the Open Platform for NFV (OPNFV)",
953 RFC 8204, September 2017.
955 [service-density]
956 Konstantynowicz, M. and P. Mikus, "NFV Service Density
957 Benchmarking", March 2019, .
960 [SR-IOV] "SRIOV for Container-networking", July 2019,
961 .
963 [vpp] "VPP with Containers", July 2019, .
966 Authors' Addresses
968 Kyoungjae Sun
969 School of Electronic Engineering
970 Soongsil University
971 369, Sangdo-ro, Dongjak-gu
972 Seoul, Seoul 06978
973 Republic of Korea
975 Phone: +82 10 3643 5627
976 EMail: gomjae@dcn.ssu.ac.kr
977 Hyunsik Yang
978 School of Electronic Engineering
979 Soongsil University
980 369, Sangdo-ro, Dongjak-gu
981 Seoul, Seoul 06978
982 Republic of Korea
984 Phone: +82 10 9005 7439
985 EMail: yangun@dcn.ssu.ac.kr
987 Jangwon Lee
988 School of Electronic Engineering
989 Soongsil University
990 369, Sangdo-ro, Dongjak-gu
991 Seoul, Seoul 06978
992 Republic of Korea
994 Phone: +82 10 7448 4664
995 EMail: jangwon.lee@dcn.ssu.ac.kr
997 Quang Huy Nguyen
998 School of Electronic Engineering
999 Soongsil University
1000 369, Sangdo-ro, Dongjak-gu
1001 Seoul, Seoul 06978
1002 Republic of Korea
1004 Phone: +82 10 4281 0720
1005 EMail: huynq@dcn.ssu.ac.kr
1007 Younghan Kim
1008 School of Electronic Engineering
1009 Soongsil University
1010 369, Sangdo-ro, Dongjak-gu
1011 Seoul, Seoul 06978
1012 Republic of Korea
1014 Phone: +82 10 2691 0904
1015 EMail: younghak@ssu.ac.kr