idnits 2.17.1

draft-lencse-v6ops-transition-scalability-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 2 instances of lines with private range IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet has text resembling
     RFC 2119 boilerplate text.

  -- The document date (16 October 2021) is 922 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-04) exists of
     draft-ietf-v6ops-transition-comparison-00

  == Outdated reference: A later version (-04) exists of
     draft-lencse-bmwg-benchmarking-stateful-02

     Summary: 0 errors (**), 0 flaws (~~), 5 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

v6ops                                                        G.L. Lencse
Internet-Draft                               Szechenyi Istvan University
Intended status: Informational                           16 October 2021
Expires: 19 April 2022

        Scalability of IPv6 Transition Technologies for IPv4aaS
              draft-lencse-v6ops-transition-scalability-00

Abstract

   Several IPv6 transition technologies have been developed to provide
   customers with IPv4-as-a-Service (IPv4aaS) for ISPs with an IPv6-only
   access and/or core network.  All these technologies have their
   advantages and disadvantages, and depending on existing topology,
   skills, strategy and other preferences, one of these technologies may
   be the most appropriate solution for a network operator.

   This document examines the scalability of the five most prominent
   IPv4aaS technologies (464XLAT, Dual Stack Lite, Lightweight 4over6,
   MAP-E, MAP-T) considering two aspects: (1) how their performance
   scales up with the number of CPU cores, (2) how their performance
   degrades when the number of concurrent sessions is increased until
   the hardware limit is reached.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 19 April 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Scalability of iptables
     2.1.  Measurement Method
     2.2.  Performance scale up against the number of CPU cores
     2.3.  Performance degradation caused by the number of sessions
   3.  Acknowledgements
   4.  IANA Considerations
   5.  Security Considerations
   6.  References
     6.1.  Normative References
     6.2.  Informative References
   Appendix A.  Change Log
     A.1.  00
   Author's Address

1.  Introduction

   The IETF has standardized several IPv6 transition technologies
   [LEN2019] and has taken a neutral position, leaving the selection of
   the most appropriate ones to the market.
   [I-D.ietf-v6ops-transition-comparison] provides a comprehensive
   comparative analysis of the five most prominent IPv4aaS technologies
   to assist operators with this problem.  This document adds one more
   detail: measurement data regarding the scalability of the examined
   IPv4aaS technologies.

   Currently, this document contains only the scalability measurements
   of the iptables stateful NAT44 implementation.  It serves as a sample
   to test whether the disclosed results are (1) useful and (2)
   sufficient for network operators.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Scalability of iptables

2.1.  Measurement Method

   [RFC8219] has defined a benchmarking methodology for IPv6 transition
   technologies.  [I-D.lencse-bmwg-benchmarking-stateful] has amended it
   by addressing how to benchmark stateful NATxy gateways using
   pseudorandom port numbers recommended by [RFC4814].  It has defined a
   measurement procedure for the maximum connection establishment rate
   and reused the classic measurement procedures (throughput, latency,
   frame loss rate, etc.) from [RFC8219].  We used two of them, maximum
   connection establishment rate and throughput, to characterize the
   performance of the examined system.

   The scalability of iptables is examined in two aspects:

   *  How does its performance scale up with the number of CPU cores?

   *  How does its performance degrade when the number of concurrent
      sessions is increased?
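
   The first aspect can be quantified as a relative scale-up: the
   performance measured with n cores divided by n times the single-core
   performance, so that 1 means perfectly linear scaling.  This
   definition is an assumption inferred from the "rel. scale up" rows of
   Figure 2; it is not stated as a formula in this document, and the
   function name below is hypothetical.  A minimal Python sketch, using
   the iptables throughput medians reported in Figure 2:

```python
def relative_scale_up(perf_n: float, perf_1: float, cores: int) -> float:
    """Return measured performance relative to ideal linear scaling.

    Assumed definition: perf_n / (cores * perf_1); 1.0 means linear scale-up.
    """
    return perf_n / (cores * perf_1)


# Median iptables throughput from Figure 2 (unit: 1,000 packets per second).
throughput_kpps = {1: 414.9, 2: 742.3, 4: 1379.0, 8: 2336.0, 16: 4557.0}

for cores, kpps in throughput_kpps.items():
    ratio = relative_scale_up(kpps, throughput_kpps[1], cores)
    print(f"{cores:>2} cores: {ratio:.3f}")
```

   Under this assumed definition, the values computed for 2, 4, 8 and 16
   cores round to 0.895, 0.831, 0.704 and 0.686, which agrees with the
   "tp. rel. scale up" row of Figure 2.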

              +--------------------------------------+
     10.0.0.2 |Initiator                    Responder| 198.19.0.2
+-------------|                Tester                |<------------+
| private IPv4|                         [state table]| public IPv4 |
|             +--------------------------------------+             |
|                                                                  |
|             +--------------------------------------+             |
|    10.0.0.1 |                 DUT:                 | 198.19.0.1  |
+------------>|        Stateful NATxy gateway        |-------------+
  private IPv4|     [connection tracking table]      | public IPv4
              +--------------------------------------+

      Figure 1: Test setup for benchmarking stateful NATxy gateways

   The test setup shown in Figure 1 was used.  The two devices, the
   Tester and the DUT (Device Under Test), were both Dell PowerEdge R430
   servers having two 2.1GHz Intel Xeon E5-2683 v4 CPUs, 384GB 2400MHz
   DDR4 RAM and Intel 10G dual port X540 network adapters.  The NICs of
   the servers were interconnected by direct cables, and the CPU clock
   frequency was fixed at 2.1 GHz on both servers.  They ran the Debian
   9.13 Linux operating system with the 4.9.0-16-amd64 kernel.  The
   measurements were performed by siitperf [LEN2021] using the
   "stateful" branch (latest commit Aug. 16, 2021).  The DPDK version
   was 16.11.11-1+deb9u2.  The version of iptables was 1.6.0.

   The ratio of the number of connections in the connection tracking
   table to the value of the hashsize parameter of iptables
   significantly influences its performance.  Although the default
   setting is hashsize=nf_conntrack_max/8, we have usually set
   hashsize=nf_conntrack_max to increase the performance of iptables.
   This was crucial when a high number of connections was used, because
   then the execution time of the tests was dominated by the preliminary
   phase, in which several hundreds of millions of connections had to be
   established.  (In some cases, we had to use different settings due to
   memory limitations.  The tables presenting the results always contain
   these parameters.)
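
   The ratio discussed above can be illustrated with a short
   calculation.  The following Python sketch is only an illustration
   under two assumptions: every combination of a source port and a
   destination port yields exactly one connection, and the hash table
   size is a power of two.  The function name is hypothetical; the port
   range sizes and hash table sizes are taken from Figure 3.

```python
def load_factor(src_ports: int, dst_ports: int, hash_exp: int) -> float:
    """Average number of connections per hash bucket.

    Assumes one connection per (source port, destination port) pair
    and a hash table of 2**hash_exp buckets.
    """
    connections = src_ports * dst_ports
    return connections / 2 ** hash_exp


# (source ports, destination ports, log2 of hash table size) from Figure 3.
settings = [(2_500, 625, 21), (40_000, 10_000, 28), (40_000, 20_000, 28)]

for src, dst, exp in settings:
    print(f"{src * dst:>11,} connections, hash table 2^{exp}: "
          f"{load_factor(src, dst, exp):.3f} connections per bucket")
```

   With these inputs, the sketch yields 0.745, 1.490 and 2.980, which
   correspond to the "n.c./h.t.s." row of Figure 3.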

   The size of the port number pool is an important parameter of the
   benchmarking method for stateful NATxy gateways, thus it is also
   given for all tests.

2.2.  Performance scale up against the number of CPU cores

   To examine how the performance of iptables scales up with the number
   of CPU cores, the number of active CPU cores was set to 1, 2, 4, 8,
   16 using the "maxcpus=" kernel parameter.

   The number of connections was always 4,000,000 using 4,000 different
   source port numbers and 1,000 different destination port numbers.
   Both the connection tracking table size and the hash table size were
   set to 2^23.

   The error of the binary search was chosen to be lower than 0.1% of
   the expected results.  The experiments were executed 10 times.

   Besides the connection establishment rate and the throughput of
   iptables, the throughput of the IPv4 packet forwarding of the Linux
   kernel was also measured to provide a basis for comparison.

   The results are presented in Figure 2.  The unit for the maximum
   connection establishment rate is 1,000 connections per second.  The
   unit for throughput is 1,000 packets per second (measured with
   bidirectional traffic, and the number of all packets per second is
   displayed).

   num. CPU cores              1         2         4         8        16
   src ports               4,000     4,000     4,000     4,000     4,000
   dst ports               1,000     1,000     1,000     1,000     1,000
   num. conn.          4,000,000 4,000,000 4,000,000 4,000,000 4,000,000
   conntrack t. s.          2^23      2^23      2^23      2^23      2^23
   hash table size          2^23      2^23      2^23      2^23      2^23
   c.t.s/num.conn.         2.097     2.097     2.097     2.097     2.097
   num. experiments           10        10        10        10        10
   error                     100       100       100     1,000     1,000
   cps median              223.5     371.1     708.7     1,341     2,383
   cps min                 221.6     367.7     701.7     1,325     2,304
   cps max                 226.7     375.9     723.6     1,376     2,417
   cps rel. scale up           1     0.830     0.793     0.750     0.666
   throughput median       414.9     742.3     1,379     2,336     4,557
   throughput min          413.9     740.6     1,373     2,311     4,436
   throughput max          416.1     746.9     1,395     2,361     4,627
   tp. rel. scale up           1     0.895     0.831     0.704     0.686

   IPv4 packet forwarding (using the same port number ranges)
   error                     200       500     1,000     1,000     1,000
   throughput median       910.9     1,523     3,016     5,920    11,561
   throughput min          874.8     1,485     2,951     5,811    10,998
   throughput max          914.3     1,534     3,037     5,940    11,627
   tp. rel. scale up           1     0.836     0.828     0.812     0.793
   throughput ratio (%)     45.5      48.8      45.7      39.5      39.4

     Figure 2: Scale up of iptables against the number of CPU cores

   Whereas the throughput of IPv4 packet forwarding scaled up from
   0.91Mpps to 11.56Mpps, showing a relative scale up of 0.793, the
   throughput of iptables scaled up from 414.9kpps to 4,557kpps, showing
   a relative scale up of 0.686 (and the relative scale up of the
   maximum connection establishment rate is only 0.666).  On the one
   hand, this is the price of the stateful operation.  On the other
   hand, this result is quite good compared to the scale-up result of
   NSD (a high performance authoritative DNS server) presented in Table
   9 of [LEN2020], which is only 0.52 (1,454,661/177,432 = 8.2-fold
   performance using 16 cores), even though DNS is not a stateful
   technology.

2.3.  Performance degradation caused by the number of sessions

   To examine how the performance of iptables degrades with the number
   of connections in the connection tracking table, the number of
   connections was increased fourfold in each step by doubling the size
   of both the source port number range and the destination port number
   range.  Both the connection tracking table size and the hash table
   size were also increased fourfold.  However, we reached the limits of
   the hardware at 400,000,000 connections: we could not set the size of
   the hash table to 2^29 but only to 2^28.
   The same value was used at 800,000,000 connections too, where the
   number of connections was only doubled, because 1.6 billion
   connections would not fit into the memory.

   The error of the binary search was chosen to be lower than 0.1% of
   the expected results.  The experiments were executed 10 times (except
   for the very long-lasting measurements with 800,000,000 connections).

   The results are presented in Figure 3.  The unit for the maximum
   connection establishment rate is 1,000,000 connections per second.
   The unit for throughput is 1,000,000 packets per second (measured
   with bidirectional traffic, and the number of all packets per second
   is displayed).

   num. conn.         1.56M   6.25M     25M    100M    400M    800M
   src ports          2,500   5,000  10,000  20,000  40,000  40,000
   dst ports            625   1,250   2,500   5,000  10,000  20,000
   conntrack t. s.     2^21    2^23    2^25    2^27    2^29    2^30
   hash table size     2^21    2^23    2^25    2^27    2^28    2^28
   num. exp.             10      10      10      10      10       5
   error              1,000   1,000   1,000   1,000   1,000   1,000
   n.c./h.t.s.        0.745   0.745   0.745   0.745   1.490   2.980
   cps median         2.406   2.279   2.278   2.237   2.013   1.405
   cps min            2.358   2.226   2.226   2.124   1.983   1.390
   cps max            2.505   2.315   2.317   2.290   2.050   1.440
   throughput med.    5.326   4.369   4.510   4.516   4.244   3.689
   throughput min     5.217   4.240   3.994   4.373   4.217   3.670
   throughput max     5.533   4.408   4.572   4.537   4.342   3.709

    Figure 3: Performance of iptables against the number of sessions

   The performance of iptables shows degradation at 6.25M connections
   compared to 1.56M connections, very likely due to the exhaustion of
   the L3 cache of the CPU of the DUT.  Then the performance of iptables
   is fairly constant up to 100M connections.  A small performance
   decrease can be observed at 400M connections due to the lower hash
   table size.  A more significant performance decrease can be observed
   at 800M connections.
   It is caused by two factors:

   *  on average, about 3 connections were hashed to the same place

   *  non-NUMA-local memory was also used.

   We note that the CPU has 2 NUMA nodes: cores 0, 2, ... 14 belong to
   NUMA node 0, and cores 1, 3, ... 15 belong to NUMA node 1.  The
   maximum memory consumption with 400,000,000 connections was below
   150GB, thus it could be stored in NUMA local memory.

   Thus, we have pointed out important limitations of the stateful
   NAT44 technology:

   *  there is a performance decrease when approaching hardware limits

   *  there is a hardware limit, beyond which the system cannot handle
      the connections at all (e.g. 1600M connections would not fit into
      the memory).

   Therefore, we can conclude that, on the one hand, well-tailored
   hashing may guarantee an excellent scale-up of stateful NAT44
   regarding the number of connections over a wide range; on the other
   hand, stateful operation has its limits, resulting both in a
   performance decrease when approaching hardware limits and in the
   inability to handle more sessions when reaching the memory limits.

3.  Acknowledgements

   The measurements were carried out by remotely using the resources of
   NICT StarBED, 2-12 Asahidai, Nomi-City, Ishikawa 923-1211, Japan.
   The author would like to thank Shuuhei Takimoto for the possibility
   to use StarBED, as well as Satoru Gonno and Makoto Yoshida for their
   help and advice in StarBED usage related issues.

   The author would like to thank Ole Troan for his comments on the
   v6ops mailing list, made while the scalability measurements of
   iptables were intended to be a part of
   [I-D.ietf-v6ops-transition-comparison].

4.  IANA Considerations

   This document does not make any request to IANA.

5.  Security Considerations

   TBD.

6.  References

6.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4814]  Newman, D. and T. Player, "Hash and Stuffing: Overlooked
              Factors in Network Device Benchmarking", RFC 4814,
              DOI 10.17487/RFC4814, March 2007.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8219]  Georgescu, M., Pislaru, L., and G. Lencse, "Benchmarking
              Methodology for IPv6 Transition Technologies", RFC 8219,
              DOI 10.17487/RFC8219, August 2017.

6.2.  Informative References

   [I-D.ietf-v6ops-transition-comparison]
              Lencse, G., Martinez, J. P., Howard, L., Patterson, R.,
              and I. Farrer, "Pros and Cons of IPv6 Transition
              Technologies for IPv4aaS", Work in Progress, Internet-
              Draft, draft-ietf-v6ops-transition-comparison-00, 15 April
              2021.

   [I-D.lencse-bmwg-benchmarking-stateful]
              Lencse, G. and K. Shima, "Benchmarking Methodology for
              Stateful NATxy Gateways using RFC 4814 Pseudorandom Port
              Numbers", Work in Progress, Internet-Draft, draft-lencse-
              bmwg-benchmarking-stateful-02, 10 October 2021.

   [LEN2019]  Lencse, G. and Y. Kadobayashi, "Comprehensive Survey of
              IPv6 Transition Technologies: A Subjective Classification
              for Security Analysis", IEICE Transactions on
              Communications, vol. E102-B, no. 10, pp. 2021-2035, DOI:
              10.1587/transcom.2018EBR0002, 1 October 2019.

   [LEN2020]  Lencse, G., "Benchmarking Authoritative DNS
              Servers", IEEE Access, vol. 8, pp. 130224-130238, DOI:
              10.1109/ACCESS.2020.3009141, July 2020.

   [LEN2021]  Lencse, G., "Design and Implementation of a Software
              Tester for Benchmarking Stateless NAT64 Gateways", IEICE
              Transactions on Communications, DOI:
              10.1587/transcom.2019EBN0010, 2021.

Appendix A.  Change Log

A.1.  00

   Initial version: scale up of iptables.

Author's Address

   Gabor Lencse
   Szechenyi Istvan University
   Gyor
   Egyetem ter 1.
   H-9026
   Hungary

   Email: lencse@sze.hu