idnits 2.17.1 draft-ietf-bmwg-ipv6-tran-tech-benchmarking-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 7 instances of too long lines in the document, the longest one being 224 characters in excess of 72. == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. == There are 1 instance of lines with non-RFC3849-compliant IPv6 addresses in the document. If these are example addresses, they should be changed. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (March 17, 2016) is 2963 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: 'RFC 5569' is mentioned on line 232, but not defined == Missing Reference: 'RFC 2544' is mentioned on line 562, but not defined == Missing Reference: 'RFC6147' is mentioned on line 605, but not defined == Unused Reference: 'RFC5569' is defined on line 972, but no explicit reference was found in the text ** Obsolete normative reference: RFC 3511 (Obsoleted by RFC 9411) -- Obsolete informational reference (is this intentional?): RFC 6145 (Obsoleted by RFC 7915) Summary: 2 errors (**), 0 flaws (~~), 7 warnings (==), 2 comments (--). 
Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Benchmarking Working Group M. Georgescu 2 Internet Draft NAIST 3 Intended status: Informational G. Lencse 4 Expires: September 2016 Szechenyi Istvan University 5 March 17, 2016 7 Benchmarking Methodology for IPv6 Transition Technologies 8 draft-ietf-bmwg-ipv6-tran-tech-benchmarking-01.txt 10 Abstract 12 There are benchmarking methodologies addressing the performance of 13 network interconnect devices that are IPv4- or IPv6-capable, but the 14 IPv6 transition technologies are outside of their scope. This 15 document provides complementary guidelines for evaluating the 16 performance of IPv6 transition technologies. More specifically, 17 this document targets IPv6 transition technologies that employ 18 encapsulation or translation mechanisms, as dual-stack nodes can be 19 very well tested using the recommendations of RFC2544 and RFC5180. 20 The methodology also includes a tentative metric for benchmarking 21 load scalability. 23 Status of this Memo 25 This Internet-Draft is submitted in full conformance with the 26 provisions of BCP 78 and BCP 79. 28 Internet-Drafts are working documents of the Internet Engineering 29 Task Force (IETF), its areas, and its working groups. Note that 30 other groups may also distribute working documents as Internet- 31 Drafts. 33 Internet-Drafts are draft documents valid for a maximum of six 34 months and may be updated, replaced, or obsoleted by other documents 35 at any time. It is inappropriate to use Internet-Drafts as 36 reference material or to cite them other than as "work in progress." 38 The list of current Internet-Drafts can be accessed at 39 http://www.ietf.org/ietf/1id-abstracts.txt 41 The list of Internet-Draft Shadow Directories can be accessed at 42 http://www.ietf.org/shadow.html 44 This Internet-Draft will expire on September 17, 2016. 
46 Copyright Notice 48 Copyright (c) 2016 IETF Trust and the persons identified as the 49 document authors. All rights reserved. 58 This document is subject to BCP 78 and the IETF Trust's Legal 59 Provisions Relating to IETF Documents 60 (http://trustee.ietf.org/license-info) in effect on the date of 61 publication of this document. Please review these documents 62 carefully, as they describe your rights and restrictions with 63 respect to this document. Code Components extracted from this 64 document must include Simplified BSD License text as described in 65 Section 4.e of the Trust Legal Provisions and are provided without 66 warranty as described in the Simplified BSD License. 68 Table of Contents 70 1. Introduction...................................................3 71 1.1. IPv6 Transition Technologies..............................4 72 2. Conventions used in this document..............................5 73 3. Terminology....................................................6 74 4. Test Setup.....................................................6 75 4.1. Single translation Transition Technologies................7 76 4.2. Encapsulation/Double translation Transition Technologies..7 77 5. Test Traffic...................................................8 78 5.1. Frame Formats and Sizes...................................8 79 5.1.1. Frame Sizes to Be Used over Ethernet.................9 80 5.2. Protocol Addresses........................................9 81 5.3. Traffic Setup.............................................9 82 6. Modifiers.....................................................10 83 7. 
Benchmarking Tests............................................10 84 7.1. Throughput - [RFC2544]...................................10 85 7.2. Latency..................................................10 86 7.3. Packet Delay Variation...................................11 87 7.3.1. PDV.................................................11 88 7.3.2. IPDV................................................12 89 7.4. Frame Loss Rate - [RFC2544]..............................13 90 7.5. Back-to-back Frames - [RFC2544]..........................13 91 7.6. System Recovery - [RFC2544]..............................13 92 7.7. Reset - [RFC2544]........................................13 94 8. Additional Benchmarking Tests for Stateful IPv6 Transition 95 Technologies.....................................................13 96 8.1. Concurrent TCP Connection Capacity -[RFC3511]............13 97 8.2. Maximum TCP Connection Establishment Rate -[RFC3511].....13 98 9. DNS Resolution Performance....................................13 99 9.1. Test and Traffic Setup...................................14 100 9.2. Benchmarking DNS Resolution Performance..................15 101 9.2.1. Requirements for the Tester.........................16 102 10. Scalability..................................................17 103 10.1. Test Setup..............................................17 104 10.1.1. Single Translation Transition Technologies.........17 105 10.1.2. Encapsulation/Double Translation Transition 106 Technologies...............................................18 107 10.2. Benchmarking Performance Degradation....................18 108 10.2.1. Network performance degradation with simultaneous load 109 ...........................................................18 110 10.2.2. Network performance degradation with incremental load 111 ...........................................................19 112 11. Summarizing function and variation...........................20 113 12. 
Security Considerations......................................20 114 13. IANA Considerations..........................................20 115 14. References...................................................21 116 14.1. Normative References....................................21 117 14.2. Informative References..................................21 118 15. Acknowledgements.............................................23 119 Appendix A. Theoretical Maximum Frame Rates......................24 121 1. Introduction 123 The methodologies described in [RFC2544] and [RFC5180] help vendors 124 and network operators alike analyze the performance of IPv4 and 125 IPv6-capable network devices. The methodology presented in [RFC2544] 126 is mostly IP version independent, while [RFC5180] contains 127 complementary recommendations, which are specific to the latest IP 128 version, IPv6. However, [RFC5180] does not cover IPv6 transition 129 technologies. 131 IPv6 is not backwards compatible, which means that IPv4-only nodes 132 cannot directly communicate with IPv6-only nodes. To solve this 133 issue, IPv6 transition technologies have been proposed and 134 implemented. 136 This document presents benchmarking guidelines dedicated to IPv6 137 transition technologies. The benchmarking tests can provide insights 138 about the performance of these technologies, which can act as useful 139 feedback for developers, as well as for network operators going 140 through the IPv6 transition process. 142 The document also includes an approach to quantify load scalability. 143 Load scalability can be defined as a system's ability to gracefully 144 accommodate higher loads. Because poor scalability usually leads to 145 poor performance, the proposed approach is to quantify the load 146 scalability by measuring the performance degradation created by a 147 higher number of network flows. 149 1.1. 
IPv6 Transition Technologies 151 Two of the basic transition technologies, dual IP layer (also known 152 as dual stack) and encapsulation are presented in [RFC4213]. 153 IPv4/IPv6 Translation is presented in [RFC6144]. Most of the 154 transition technologies employ at least one variation of these 155 mechanisms. Some of the more complex ones (e.g. DSLite [RFC6333]) 156 are using all three. In this context, a generic classification of 157 the transition technologies can prove useful. 159 Tentatively, we can consider a production network transitioning to 160 IPv6 as being constructed using the following IP domains: 162 o Domain A: IPvX specific domain 164 o Core domain: which may be IPvY specific or dual-stack(IPvX and 165 IPvY) 167 o Domain B: IPvX specific domain 169 Note: X,Y are part of the {4,6} set. 171 According to the technology used for the core domain traversal the 172 transition technologies can be categorized as follows: 174 1. Single Translation: In this case, the production network is 175 assumed to have only two domains, Domain A and the Core domain. 176 The core domain is assumed to be IPvY specific. IPvX packets are 177 translated to IPvY at the edge between Domain A and the Core 178 domain. 180 2. Dual-stack: the core domain devices implement both IP protocols 182 3. Encapsulation: The production network is assumed to have all 183 three domains, Domains A and B are IPvX specific, while the core 184 domain is IPvY specific. An encapsulation mechanism is used to 185 traverse the core domain. The IPvX packets are encapsulated to 186 IPvY packets at the edge between Domain A and the Core domain. 187 Subsequently, the IPvY packets are decapsulated at the edge 188 between the Core domain and Domain B. 190 4. Double translation: The production network is assumed to have all 191 three domains, Domains A and B are IPvX specific, while the core 192 domain is IPvY specific. A translation mechanism is employed for 193 the traversal of the core network. 
The IPvX packets are 194 translated to IPvY packets at the edge between Domain A and the 195 Core domain. Subsequently, the IPvY packets are translated back 196 to IPvX at the edge between the Core domain and Domain B. 198 The performance of Dual-stack transition technologies can be fully 199 evaluated using the benchmarking methodologies presented by 200 [RFC2544] and [RFC5180]. Consequently, this document focuses on the 201 other 3 categories: Single translation, Encapsulation and Double 202 translation transition technologies. 204 Another important aspect by which the IPv6 transition technologies 205 can be categorized is their use of stateful or stateless mapping 206 algorithms. The technologies that use stateful mapping algorithms 207 (e.g. Stateful NAT64 [RFC6146]) create dynamic correlations between 208 IP addresses or {IP address, transport protocol, transport port 209 number} tuples, which are stored in a state table. For ease of 210 reference, the IPv6 transition technologies which employ stateful 211 mapping algorithms will be called stateful IPv6 transition 212 technologies. The efficiency with which the state table is managed 213 can be an important performance indicator for these technologies. 214 Hence, for the stateful IPv6 transition technologies additional 215 benchmarking tests are RECOMMENDED. 217 Table 1 contains the generic categories as well as associations with 218 some of the IPv6 transition technologies proposed in the IETF. 220 Table 1. 
IPv6 Transition Technologies Categories
221 +---+--------------------+------------------------------------+
222 |   | Generic category   | IPv6 Transition Technology         |
223 +---+--------------------+------------------------------------+
224 | 1 | Dual-stack         | Dual IP Layer Operations [RFC4213] |
225 +---+--------------------+------------------------------------+
226 | 2 | Single translation | NAT64 [RFC6146], IVI [RFC6219]     |
227 +---+--------------------+------------------------------------+
228 | 3 | Double translation | 464XLAT [RFC6877], MAP-T [RFC7599] |
229 +---+--------------------+------------------------------------+
230 | 4 | Encapsulation      | DSLite [RFC6333], MAP-E [RFC7597]  |
231 |   |                    | Lightweight 4over6 [RFC7596]       |
232 |   |                    | 6RD [RFC5569]                      |
233 +---+--------------------+------------------------------------+
234 2. Conventions used in this document 236 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 237 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 238 document are to be interpreted as described in [RFC2119]. 240 In this document, these words will appear with that interpretation 241 only when in ALL CAPS. Lower case uses of these words are not to be 242 interpreted as carrying [RFC2119] significance. 244 Although these terms are usually associated with protocol 245 requirements, in this document the terms are requirements for users and 246 systems that intend to implement the test conditions and claim 247 conformance with this specification. 249 3. Terminology 251 A number of terms used in this memo have been defined in other RFCs. 252 Please refer to those RFCs for definitions, testing procedures and 253 reporting formats. 
255 Throughput (Benchmark) - [RFC2544] 257 Frame Loss Rate (Benchmark) - [RFC2544] 259 Back-to-back Frames (Benchmark) - [RFC2544] 261 System Recovery (Benchmark) - [RFC2544] 263 Reset (Benchmark) - [RFC6201] 265 Concurrent TCP Connection Capacity (Benchmark) - [RFC3511] 267 Maximum TCP Connection Establishment Rate (Benchmark) - [RFC3511] 269 4. Test Setup 271 The test environment setup options recommended for IPv6 transition 272 technologies benchmarking are very similar to the ones presented in 273 Section 6 of [RFC2544]. In the case of the tester setup, the options 274 presented in [RFC2544] and [RFC5180] can be applied here as well. 275 However, the Device under test (DUT) setup options should be 276 explained in the context of the targeted categories of IPv6 277 transition technologies: Single translation, Double translation and 278 Encapsulation transition technologies. 280 Although both single tester and sender/receiver setups are 281 applicable to this methodology, the single tester setup will be used 282 to describe the DUT setup options. 284 For the test setups presented in this memo, dynamic routing SHOULD 285 be employed. However, the presence of routing and management frames 286 can represent unwanted background data that can affect the 287 benchmarking result. To that end, the procedures defined in 288 [RFC2544] (Sections 11.2 and 11.3) related to routing and management 289 frames SHOULD be used here as well. Moreover, the "Trial 290 description" recommendations presented in [RFC2544] (Section 23) are 291 valid for this memo as well. 293 In terms of route setup, the recommendations of [RFC2544] Section 13 294 are valid for this document as well assuming that an IPv6 version of 295 the routing packets shown in appendix C.2.6.2 is used. 297 4.1. Single translation Transition Technologies 299 For the evaluation of Single translation transition technologies, a 300 single DUT setup (see Figure 1) SHOULD be used. 
The DUT is 301 responsible for translating the IPvX packets into IPvY packets. In 302 this context, the tester device should be configured to support both 303 IPvX and IPvY.
305              +--------------------+
306              |                    |
307 +------------|IPvX   tester   IPvY|<-------------+
308 |            |                    |              |
309 |            +--------------------+              |
310 |                                                |
311 |            +--------------------+              |
312 |            |                    |              |
313 +----------->|IPvX     DUT    IPvY|--------------+
314              |                    |
315              +--------------------+
316 Figure 1. Test setup 1
318 4.2. Encapsulation/Double translation Transition Technologies 320 For evaluating the performance of Encapsulation and Double 321 translation transition technologies, a dual DUT setup (see Figure 2) 322 SHOULD be employed. The tester creates a network flow of IPvX 323 packets. The first DUT is responsible for the encapsulation or 324 translation of IPvX packets into IPvY packets. The IPvY packets are 325 decapsulated/translated back to IPvX packets by the second DUT and 326 forwarded to the tester.
328                       +--------------------+
329                       |                    |
330 +---------------------|IPvX   tester   IPvX|<------------------+
331 |                     |                    |                   |
332 |                     +--------------------+                   |
333 |                                                              |
334 |      +--------------------+      +--------------------+      |
335 |      |                    |      |                    |      |
336 +----->|IPvX   DUT 1   IPvY |----->|IPvY   DUT 2   IPvX |------+
337        |                    |      |                    |
338        +--------------------+      +--------------------+
339 Figure 2. Test setup 2
341 One of the limitations of the dual DUT setup is the inability to 342 reflect asymmetries in behavior between the DUTs. Considering this, 343 additional performance tests SHOULD be performed using the single 344 DUT setup. 346 Note: For encapsulation IPv6 transition technologies, in the single 347 DUT setup, in order to test the decapsulation efficiency, the tester 348 SHOULD be able to send IPvX packets encapsulated in IPvY. 350 5. Test Traffic 352 The test traffic represents the experimental workload and SHOULD 353 meet the requirements specified in this section. The requirements 354 apply to unicast IP traffic. 
Multicast IP traffic is outside 355 of the scope of this document. 357 5.1. Frame Formats and Sizes 359 [RFC5180] describes the frame size requirements for two commonly 360 used media types: Ethernet and SONET (Synchronous Optical Network). 361 [RFC2544] also covers other media types, such as token ring and 362 FDDI. The two documents can be referred to for the dual-stack 363 transition technologies. For the rest of the transition technologies 364 the frame overhead introduced by translation or encapsulation MUST 365 be considered. 367 The encapsulation/translation process generates different size 368 frames on different segments of the test setup. For instance, the 369 single translation transition technologies will create different 370 frame sizes on the receiving segment of the test setup, as IPvX 371 packets are translated to IPvY. This is not a problem if the 372 bandwidth of the employed media is not exceeded. To prevent 373 exceeding the limitations imposed by the media, the frame size 374 overhead needs to be taken into account when calculating the maximum 375 theoretical frame rates. The calculation method for Ethernet, as 376 well as a calculation example, is detailed in Appendix A. The 377 details of the media employed for the benchmarking tests MUST be 378 noted in all test reports. 380 In the context of frame size overhead, MTU recommendations are 381 needed in order to avoid frame loss due to MTU mismatch between the 382 virtual encapsulation/translation interfaces and the physical 383 network interface controllers (NICs). To avoid this situation, the 384 larger of the MTUs of the physical NICs and the virtual 385 encapsulation/translation interfaces SHOULD be set for all 386 interfaces of the DUT and tester. To be more specific, the minimum 387 IPv6 MTU size (1280 bytes) plus the encapsulation/translation 388 overhead is the RECOMMENDED value for the physical interfaces as 389 well as the virtual ones. 391 5.1.1. 
Frame Sizes to Be Used over Ethernet 393 Based on the recommendations of [RFC5180], the following frame sizes 394 SHOULD be used for benchmarking IPvX/IPvY traffic on Ethernet links: 395 64, 128, 256, 512, 1024, 1280, 1518, 1522, 2048, 4096, 8192 and 396 9216. 398 The theoretical maximum frame rates considering an example of frame 399 overhead are presented in Appendix A. 401 5.2. Protocol Addresses 403 The selected protocol addresses should follow the recommendations of 404 [RFC5180] (Section 5) for IPv6 and [RFC2544] (Section 12) for IPv4. 406 Note: testing traffic with extension headers might not be possible 407 for the transition technologies that employ translation. Proposed 408 IPvX/IPvY translation algorithms such as IP/ICMP translation 409 [RFC6145] do not support the use of extension headers. 411 5.3. Traffic Setup 413 Following the recommendations of [RFC5180], all tests described 414 SHOULD be performed with bi-directional traffic. Uni-directional 415 traffic tests MAY also be performed for a fine-grained performance 416 assessment. 418 Because of the simplicity of UDP, UDP measurements offer a more 419 reliable basis for comparison than other transport layer protocols. 420 Consequently, for the benchmarking tests described in Section 7 of 421 this document UDP traffic SHOULD be employed. 423 Considering that the stateful transition technologies need to manage 424 the state table for each connection, a connection-oriented transport 425 layer protocol needs to be used with the test traffic. Consequently, 426 TCP test traffic SHOULD be employed for the tests described in 427 Section 8 of this document. 429 6. Modifiers 431 The idea of testing under different operational conditions was first 432 introduced in [RFC2544] (Section 11) and represents an important 433 aspect of benchmarking network elements, as it emulates to some 434 extent the conditions of a production environment. [RFC5180] 435 describes complementary testing conditions specific to IPv6. 
Their 436 recommendations can be referred to for IPv6 transition technologies 437 testing as well. 439 7. Benchmarking Tests 441 The following sub-sections contain the list of all recommended 442 benchmarking tests. 444 7.1. Throughput - [RFC2544] 446 7.2. Latency 448 Objective: To determine the latency. Typical latency is based on the 449 definitions of latency from [RFC1242]. However, this memo provides a 450 new measurement procedure. 452 Procedure: Similar to [RFC2544], the throughput for the DUT at each of 453 the listed frame sizes SHOULD be determined. Send a stream of frames 454 at a particular frame size through the DUT at the determined 455 throughput rate to a specific destination. The stream SHOULD be at 456 least 120 seconds in duration. 458 Identifying tags SHOULD be included in at least 500 frames after 60 459 seconds. For each tagged frame, the time at which the frame was fully 460 transmitted (timestamp A) and the time at which the frame was 461 received (timestamp B) MUST be recorded. The latency is timestamp B 462 minus timestamp A as per the relevant definition from RFC 1242, 463 namely latency as defined for store and forward devices or latency 464 as defined for bit forwarding devices. 466 From the resulting (at least 500) latencies, two quantities SHOULD be 467 calculated. One is the typical latency, which SHOULD be calculated 468 with the following formula: 470 TL=Median(Li) 472 Where: TL - the reported typical latency of the stream 473 Li - the latency for tagged frame i 475 The other measure is the worst case latency, which SHOULD be 476 calculated with the following formula: 478 WCL=L99.9thPercentile 480 Where: WCL - the reported worst case latency 481 L99.9thPercentile - the 99.9th percentile of the latencies 482 measured for the stream 484 The test MUST be repeated at least 20 times with the reported 485 value being the median of the recorded values. 487 Reporting Format: The report MUST state which definition of latency 488 (from RFC 1242) was used for this test. 
The summarized latency 489 results SHOULD be reported in the format of a table with a row for 490 each of the tested frame sizes. There SHOULD be columns for the 491 frame size, the rate at which the latency test was run for that 492 frame size, for the media types tested, and for the resultant 493 typical latency and worst case latency values for each type of data stream tested. To account for the variation, the 1st and 99th 494 percentiles of the 20 iterations MAY be reported in two separate 495 columns. 497 7.3. Packet Delay Variation 499 Considering two of the metrics presented in [RFC5481], Packet Delay 500 Variation (PDV) and Inter Packet Delay Variation (IPDV), it is 501 RECOMMENDED to measure PDV. For a fine-grained analysis of delay 502 variation, IPDV measurements MAY be performed as well. 504 7.3.1. PDV 506 Objective: To determine the Packet Delay Variation as defined in 507 [RFC5481]. 509 Procedure: As described by [RFC2544], first determine the throughput 510 for the DUT at each of the listed frame sizes. Send a stream of 511 frames at a particular frame size through the DUT at the determined 512 throughput rate to a specific destination. The stream SHOULD be at 513 least 60 seconds in duration. Measure the One-way delay as described 514 by [RFC3393] for all frames in the stream. Calculate the PDV of the 515 stream using the formula: 517 PDV=D99.9thPercentile - Dmin 518 Where: D99.9thPercentile - the 99.9th Percentile (as 519 described in [RFC5481]) of the One-way delay for the stream 521 Dmin - the minimum One-way delay in the stream 523 As recommended in [RFC2544], the test MUST be repeated at least 20 524 times with the reported value being the median of the recorded values. Moreover, the 1st and 99th percentiles SHOULD be calculated to 525 account for the variation of the dataset. 
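The per-stream summaries described so far (typical latency, worst case latency and PDV) and the summary over the repeated trials can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the nearest-rank percentile method is an assumption, since the text above does not prescribe a particular percentile estimator.

```python
import math
import statistics
from fractions import Fraction


def percentile(values, p):
    """Nearest-rank percentile (assumed method; not mandated by the draft)."""
    if not values:
        raise ValueError("at least one sample is required")
    ordered = sorted(values)
    # Fraction keeps p = 99.9 exact, so the rank is not skewed by float rounding.
    rank = max(1, math.ceil(Fraction(str(p)) * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(latencies):
    """Return (TL, WCL): median and 99.9th percentile of the tagged-frame latencies."""
    return statistics.median(latencies), percentile(latencies, 99.9)


def pdv(one_way_delays):
    """PDV = D99.9thPercentile - Dmin for the one-way delays of one stream."""
    return percentile(one_way_delays, 99.9) - min(one_way_delays)


def summarize_trials(per_trial_values):
    """Median plus 1st and 99th percentiles over the (at least 20) repetitions."""
    return {
        "median": statistics.median(per_trial_values),
        "p1": percentile(per_trial_values, 1),
        "p99": percentile(per_trial_values, 99),
    }
```

A tester would call `pdv()` (or `latency_summary()`) once per trial and pass the resulting per-trial values to `summarize_trials()` to obtain the reported median and the optional percentile columns.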
527 Reporting Format: The PDV results SHOULD be reported in a table with 528 a row for each of the tested frame sizes and columns for the frame 529 size and the applied frame rate for the tested media types. Two columns for the 1st and 99th percentile values MAY be 530 displayed as well. Following the recommendations of [RFC5481], the 531 RECOMMENDED units of measurement are milliseconds. 533 7.3.2. IPDV 535 Objective: To determine the Inter Packet Delay Variation as defined 536 in [RFC5481]. 538 Procedure: As described by [RFC2544], first determine the throughput 539 for the DUT at each of the listed frame sizes. Send a stream of 540 frames at a particular frame size through the DUT at the determined 541 throughput rate to a specific destination. The stream SHOULD be at 542 least 60 seconds in duration. Measure the One-way delay as described 543 by [RFC3393] for all frames in the stream. Calculate the IPDV for 544 each of the frames using the formula: 546 IPDV(i)=D(i) - D(i-1) 548 Where: D(i) - the One-way delay of the i-th frame in the stream 550 D(i-1) - the One-way delay of the (i-1)-th frame in the stream 552 Given the nature of IPDV, reporting a single number might lead to 553 over-summarization. In this context, the report for each measurement 554 SHOULD include 3 values: Dmin, Dmed, and Dmax 556 Where: Dmin - the minimum One-way delay in the stream 558 Dmed - the median One-way delay of the stream 560 Dmax - the maximum One-way delay in the stream 562 As recommended in [RFC2544], the test MUST be repeated at least 20 563 times. To summarize the 20 repetitions, for each of the 3 values (Dmin, 564 Dmed and Dmax) the median value SHOULD be reported. 566 Reporting format: The median for the 3 proposed values SHOULD be 567 reported. The IPDV results SHOULD be reported in a table with a row 568 for each of the tested frame sizes. 
The columns SHOULD include the 569 frame size and associated frame rate for the tested media types and 570 sub-columns for the three proposed reported values. Following the 571 recommendations of [RFC5481], the RECOMMENDED units of measurement 572 are milliseconds. 574 7.4. Frame Loss Rate - [RFC2544] 576 7.5. Back-to-back Frames - [RFC2544] 578 7.6. System Recovery - [RFC2544] 580 7.7. Reset - [RFC2544] 582 8. Additional Benchmarking Tests for Stateful IPv6 Transition 583 Technologies 585 This section describes additional tests dedicated to the stateful 586 IPv6 transition technologies. For the tests described in this 587 section the DUT devices SHOULD follow the test setup and test 588 parameters recommendations presented in [RFC3511] (Sections 4, 5). 590 In addition to the IPv4/IPv6 transition function a network node can 591 have a firewall function. This document is targeting only the 592 network devices that do not have a firewall function, as this 593 function can be benchmarked using the recommendations of [RFC3511]. 594 Consequently, only the tests described in [RFC3511] (Sections 5.2, 595 5.3) are RECOMMENDED. Namely, the following additional tests SHOULD 596 be performed: 598 8.1. Concurrent TCP Connection Capacity -[RFC3511] 600 8.2. Maximum TCP Connection Establishment Rate -[RFC3511] 602 9. DNS Resolution Performance 604 This section describes benchmarking tests dedicated to DNS64 (see 605 [RFC6147]), used as DNS support for single translation technologies 606 such as NAT64. 608 9.1. Test and Traffic Setup 610 The test setup follows the setup proposed for single translation 611 IPv6 transition technologies in Figure 1. 
613 1:AAAA query +--------------------+
614 +------------|                    |<-------------+
615 |            |IPv6   Tester   IPv4|              |
616 |  +-------->|                    |----------+   |
617 |  |         +--------------------+ 3:empty  |   |
618 |  | 6:synt'd                       AAAA,    |   |
619 |  |   AAAA  +--------------------+ 5:valid A|   |
620 |  +---------|                    |<---------+   |
621 |            |IPv6     DUT    IPv4|              |
622 +----------->|     (DNS64)        |--------------+
623              +--------------------+ 2:AAAA query, 4:A query
625 The test traffic SHOULD follow these steps: 627 1. Query for the AAAA record of a domain name (from client to DNS64 628 server) 630 2. Query for the AAAA record of the same domain name (from DNS64 631 server to authoritative DNS server) 633 3. Empty AAAA record answer (from authoritative DNS server to DNS64 634 server) 636 4. Query for the A record of the same domain name (from DNS64 server 637 to authoritative DNS server) 639 5. Valid A record answer (from authoritative DNS server to DNS64 640 server) 642 6. Synthesized AAAA record answer (from DNS64 server to client) 644 The Tester plays the role of DNS client as well as authoritative DNS 645 server. It MAY be realized as a single physical device, or 646 alternatively, two physical devices MAY be used. 648 Please note that: 650 - If the DNS64 server implements caching and there is a cache hit 651 then step 1 is followed by step 6 (and steps 2 through 5 are 652 omitted). 654 - If the domain name has an AAAA record then it is returned in 655 step 3 by the authoritative DNS server, steps 4 and 5 are 656 omitted, and the DNS64 server does not synthesize an AAAA 657 record, but returns the received AAAA record to the client. 658 - As for the IP version used between the tester and the DUT, IPv6 659 MUST be used between the client and the DNS64 server (as a 660 DNS64 server provides service for an IPv6-only client), but 661 either IPv4 or IPv6 MAY be used between the DNS64 server and 662 the authoritative DNS server. 664 9.2. 
Benchmarking DNS Resolution Performance 666 Objective: To determine DNS64 performance by means of the number of 667 successfully processed DNS requests per second. 669 Procedure: Send a specific number of DNS queries at a specific rate 670 to the DUT and then count the replies received in time (within a 671 predefined timeout period from the sending time of the corresponding 672 query, having the default value 1 second) from the DUT. If the count 673 of sent queries is equal to the count of received replies, the rate 674 of the queries is raised and the test is rerun. If fewer replies are 675 received than queries were sent, the rate of the queries is reduced 676 and the test is rerun. The duration of the test SHOULD be at least 677 60 seconds to reduce the potential gain of a DNS64 server, which is 678 able to exhibit higher performance by storing the requests and thus 679 also utilizing the timeout period for answering them. For the same 680 reason, a timeout longer than 1 second SHOULD NOT be used. 682 The number of processed DNS queries per second is the fastest rate 683 at which the count of DNS replies sent by the DUT is equal to the 684 number of DNS queries sent to it by the test equipment. 686 The test SHOULD be repeated at least 20 times and the median and the 1st and 99th percentiles of the number of processed DNS queries per 687 second SHOULD be calculated. 689 Details and parameters: 691 1. Caching 692 First, all the DNS queries MUST contain different domain names (or 693 domain names MUST NOT be repeated before the cache of the DUT is 694 exhausted). Then new tests MAY be executed with 10%, 20%, 30%, etc. 695 domain names which are repeated (early enough to be still in the 696 cache). 698 2. Existence of AAAA record 699 First, all the DNS queries MUST contain domain names which do not 700 have an AAAA record and have exactly one A record. 702 Then new tests MAY be executed with 10%, 20%, 30%, etc. domain names 703 which have an AAAA record.
705 Please note that the two conditions above are orthogonal, thus all 706 their combinations are possible and MAY be tested. The testing with 707 0% repeated DNS names and with 0% existing AAAA record is REQUIRED 708 and the other combinations are OPTIONAL. 710 Reporting format: The primary result of the DNS64/DNS46 test is the 711 average of the number of processed DNS queries per second measured 712 with the above mentioned "0% + 0% combination". The average SHOULD 713 be complemented with the margin of error to show the stability of the result. If optional tests are done, the median and the 1st and 99th percentiles MAY be presented in a two dimensional table where 714 the dimensions are the proportion of the repeated domain names and 715 the proportion of the DNS names having AAAA records. The two table 716 headings SHOULD contain these percentage values. Alternatively, the 717 results MAY be presented as the corresponding two dimensional graph. 718 In this case the graph SHOULD show the median values with the 719 percentiles as error bars. From both the table and the graph, one 720 dimensional excerpts MAY be made at any given fixed percentage value 721 of the other dimension. In this case, the fixed value MUST be given 722 together with a one dimensional table or graph. 724 9.2.1. Requirements for the Tester 726 Before a Tester can be used for testing a DUT at rate r queries per 727 second with t seconds timeout, it MUST perform a self-test in order 728 to exclude the possibility that the poor performance of the Tester 729 itself influences the results. For performing a self-test, the 730 Tester is looped back (leaving out the DUT) and its authoritative DNS 731 server subsystem is configured to be able to answer all the AAAA 732 record queries. For passing the self-test, the Tester SHOULD be able 733 to answer AAAA record queries at 2*(r+delta) rate within 0.25*t 734 timeout, where the value of delta is at least 0.1.
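The self-test condition above reduces to simple arithmetic on r, t and delta. As a minimal sketch, the loopback targets could be computed as shown below. The function name self_test_targets is hypothetical, and treating delta as an additive term with a default of 0.1 is an assumption drawn from the wording above rather than a detail this memo specifies.

```python
def self_test_targets(r, t, delta=0.1):
    """Return (required_rate, allowed_timeout) for the Tester self-test.

    r     -- query rate (queries per second) planned for the DUT test
    t     -- timeout (seconds) planned for the DUT test
    delta -- safety term; the text requires a value of at least 0.1
             (treated here as an additive term, which is an assumption)
    """
    if r <= 0 or t <= 0:
        raise ValueError("r and t must be positive")
    if delta < 0.1:
        raise ValueError("delta must be at least 0.1")
    required_rate = 2 * (r + delta)  # rate the looped-back Tester must answer at
    allowed_timeout = 0.25 * t       # time budget for each self-test answer
    return required_rate, allowed_timeout


if __name__ == "__main__":
    rate, timeout = self_test_targets(r=1000, t=1.0)
    print(f"self-test rate: {rate} qps, timeout: {timeout} s")
```

A Tester that cannot sustain the returned rate within the returned timeout in loopback would not qualify for testing a DUT at rate r with timeout t.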
736 Explanation: When performing DNS64 testing, each AAAA record query 737 may result in at most two queries sent by the DUT: the first 738 is for an AAAA record and the second is for an A record 739 (they are both sent when there is no cache hit and also no AAAA 740 record exists). The parameters above guarantee that the 741 authoritative DNS server subsystem of the Tester is able to answer the 742 queries at the required frequency using up not more than half of 743 the timeout time. 745 Remark: a sample open-source test program, dns64perf++, is available 746 from [Dns64perf]. It implements only the client part of the Tester 747 and it should be used together with an authoritative DNS server 748 implementation, e.g. BIND, NSD or YADIFA. 750 10. Scalability 752 Scalability has often been discussed; however, in the context of 753 network devices, a formal definition or a measurement method has not 754 yet been proposed. 756 In this context, scalability can be defined as the ability of each 757 transition technology to accommodate network growth. 759 Poor scalability usually leads to poor performance. Considering 760 this, scalability can be measured by quantifying the network 761 performance degradation while the network grows. 763 The following subsections describe how the test setups can be 764 modified to create network growth and how the associated performance 765 degradation can be quantified. 767 10.1. Test Setup 769 The test setups defined in Section 3 have to be modified to create 770 network growth. 772 10.1.1. Single Translation Transition Technologies 774 In the case of single translation transition technologies the 775 network growth can be generated by increasing the number of network 776 flows generated by the tester machine (see Figure 3).
778                +-------------------------+
779   +------------|NF1                   NF1|<-------------+
780   |  +---------|NF2       tester      NF2|<----------+  |
781   |  |      ...|                         |           |  |
782   |  |   +-----|NFn                   NFn|<------+   |  |
783   |  |   |     +-------------------------+       |   |  |
784   |  |   |                                       |   |  |
785   |  |   |     +-------------------------+       |   |  |
786   |  |   +---->|NFn                   NFn|-------+   |  |
787   |  |      ...|           DUT           |           |  |
788   |  +-------->|NF2   (translator)    NF2|-----------+  |
789   +----------->|NF1                   NF1|--------------+
790                +-------------------------+
791                    Figure 3. Test setup 3
793 10.1.2. Encapsulation/Double Translation Transition Technologies 795 Similarly, for the encapsulation/double translation technologies a 796 multi-flow setup is recommended. Considering a multipoint-to-point 797 scenario, for most transition technologies, one of the edge nodes is 798 designed to support more than one connecting device. Hence, the 799 recommended test setup is an n:1 design, where n is the number of 800 client DUTs connected to the same server DUT (see Figure 4).
802                        +-------------------------+
803   +--------------------|NF1                   NF1|<--------------+
804   |  +-----------------|NF2       tester      NF2|<-----------+  |
805   |  |              ...|                         |            |  |
806   |  |   +-------------|NFn                   NFn|<-------+   |  |
807   |  |   |             +-------------------------+        |   |  |
808   |  |   |                                                |   |  |
809   |  |   |    +-----------------+    +---------------+    |   |  |
810   |  |   +--->| NFn  DUT n  NFn |--->|NFn         NFn| ---+   |  |
811   |  |        +-----------------+    |               |        |  |
812   |  |                ...            |               |        |  |
813   |  |        +-----------------+    |    DUT n+1    |        |  |
814   |  +------->| NF2  DUT 2  NF2 |--->|NF2         NF2|--------+  |
815   |           +-----------------+    |               |           |
816   |           +-----------------+    |               |           |
817   +---------->| NF1  DUT 1  NF1 |--->|NF1         NF1|-----------+
818               +-----------------+    +---------------+
819                        Figure 4. Test setup 4
821 This test setup can help to quantify the scalability of the server 822 device. However, for testing the scalability of the client DUTs 823 additional recommendations are needed.
824 For encapsulation transition technologies an m:n setup can be 825 created, where m is the number of flows applied to the same client 826 device and n the number of client devices connected to the same 827 server device. 828 For the translation based transition technologies the client devices 829 can be separately tested with n network flows using the test setup 830 presented in Figure 3. 832 10.2. Benchmarking Performance Degradation 834 10.2.1. Network performance degradation with simultaneous load 836 Objective: To quantify the performance degradation introduced by n 837 parallel and simultaneous network flows. 839 Procedure: First, the benchmarking tests presented in Section 6 have 840 to be performed for one network flow. 842 The same tests have to be repeated for n network flows, where the 843 network flows are started simultaneously. The performance 844 degradation of the X benchmarking dimension SHOULD be calculated as 845 relative performance change between the 1-flow results and the n- 846 flow results, using the following formula:
848            Xn - X1
849     Xpd = --------- * 100, where: X1 - result for 1-flow
850               X1                  Xn - result for n-flows
852 Reporting Format: The performance degradation SHOULD be expressed as 853 a percentage. The number of tested parallel flows n MUST be clearly 854 specified. For each of the performed benchmarking tests, there 855 SHOULD be a table containing a column for each frame size. The table 856 SHOULD also state the applied frame rate. 858 10.2.2. Network performance degradation with incremental load 860 Objective: To quantify the performance degradation introduced by n 861 parallel and incrementally started network flows. 863 Procedure: First, the benchmarking tests presented in Section 6 have 864 to be performed for one network flow. 866 The same tests have to be repeated for n network flows, where the 867 network flows are started incrementally in succession, each after 868 time T.
In other words, if flow i is started at time x, flow i+1 869 will be started at time x+T. Considering the time T, the time 870 duration of each iteration must be extended with the time necessary 871 to start all the flows, namely (n-1)*T. 873 The performance degradation of the X benchmarking dimension SHOULD 874 be calculated as relative performance change between the 1-flow 875 results and the n-flow results, using the formula 876 presented in Section 10.2.1. 878 Reporting Format: The performance degradation SHOULD be expressed as 879 a percentage. The number of tested parallel flows n MUST be clearly 880 specified. For each of the performed benchmarking tests, there 881 SHOULD be a table containing a column for each frame size. The table 882 SHOULD also state the applied frame rate and time duration T, used 883 as increment step between the network flows. The units of 884 measurement for T SHOULD be seconds. 886 11. Summarizing function and variation 888 To ensure the stability of the benchmarking scores obtained using 889 the tests presented in Sections 6-9, multiple test iterations are 890 recommended. Using a summarizing function (or measure of central 891 tendency) can be a simple and effective way to compare the results 892 obtained across different iterations. However, over-summarization is 893 an unwanted effect of reporting a single number. 895 Measuring the variation (dispersion index) can be used to counter 896 the over-summarization effect. Empirical data obtained following the 897 proposed methodology can also offer insights on which summarizing 898 function would fit better. 900 To that end, data presented in [ietf95pres] indicate the median as a suitable summarizing function and the 1st and 99th percentiles as 901 variation measures for DNS Resolution Performance and PDV. 903 For a fine grain analysis of the frequency distribution of the data, 904 histograms or cumulative distribution function plots can be 905 employed. 907 12.
Security Considerations 909 Benchmarking activities as described in this memo are limited to 910 technology characterization using controlled stimuli in a laboratory 911 environment, with dedicated address space and the constraints 912 specified in the sections above. 914 The benchmarking network topology will be an independent test setup 915 and MUST NOT be connected to devices that may forward the test 916 traffic into a production network, or misroute traffic to the test 917 management network. 919 Further, benchmarking is performed on a "black-box" basis, relying 920 solely on measurements observable external to the DUT/SUT. Special 921 capabilities SHOULD NOT exist in the DUT/SUT specifically for 922 benchmarking purposes. Any implications for network security arising 923 from the DUT/SUT SHOULD be identical in the lab and in production 924 networks. 926 13. IANA Considerations 928 The IANA has allocated the prefix 2001:0002::/48 [RFC5180] for IPv6 929 benchmarking. For IPv4 benchmarking, the 198.18.0.0/15 prefix was 930 reserved, as described in [RFC6890]. The two ranges are sufficient 931 for benchmarking IPv6 transition technologies. 933 14. References 935 14.1. Normative References 937 [RFC1242] Bradner, S., "Benchmarking Terminology for Network 938 Interconnection Devices", RFC 1242, July 1991. 940 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 941 Requirement Levels", BCP 14, RFC 2119, March 1997. 943 [RFC2544] Bradner, S., and J. McQuaid, "Benchmarking Methodology for 944 Network Interconnect Devices", RFC 2544, March 1999. 946 [RFC2647] Newman, D., "Benchmarking Terminology for Firewall 947 Devices", RFC 2647, August 1999. 949 [RFC3393] Demichelis, C. and P. Chimento, "IP Packet Delay Variation 950 Metric for IP Performance Metrics (IPPM)", RFC 3393, 951 November 2002. 953 [RFC3511] Hickman, B., Newman, D., Tadjudin, S. and T. Martin, 954 "Benchmarking Methodology for Firewall Performance", 955 RFC 3511, April 2003.
957 [RFC5180] Popoviciu, C., Hamza, A., Van de Velde, G., and D. 958 Dugatkin, "IPv6 Benchmarking Methodology for Network 959 Interconnect Devices", RFC 5180, May 2008. 961 [RFC5481] Morton, A., and B. Claise, "Packet Delay Variation 962 Applicability Statement", RFC 5481, March 2009. 964 [RFC6201] Asati, R., Pignataro, C., Calabria, F. and C. Olvera, 965 "Device Reset Characterization ", RFC 6201, March 2011. 967 14.2. Informative References 969 [RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms 970 for IPv6 Hosts and Routers", RFC 4213, October 2005. 972 [RFC5569] Despres, R., "IPv6 Rapid Deployment on IPv4 973 Infrastructures (6rd)", RFC 5569, DOI 10.17487/RFC5569, 974 January 2010, . 976 [RFC6144] Baker, F., Li, X., Bao, C., and K. Yin, "Framework for 977 IPv4/IPv6 Translation", RFC 6144, April 2011. 979 [RFC6145] Li, X., Bao, C., and F. Baker, "IP/ICMP Translation 980 Algorithm", RFC 6145, DOI 10.17487/RFC6145, April 2011, 981 . 983 [RFC6146] Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful 984 NAT64: Network Address and Protocol Translation from IPv6 985 Clients to IPv4 Servers", RFC 6146, DOI 10.17487/RFC6146, 986 April 2011, . 988 [RFC6219] Li, X., Bao, C., Chen, M., Zhang, H., and J. Wu, "The 989 China Education and Research Network (CERNET) IVI 990 Translation Design and Deployment for the IPv4/IPv6 991 Coexistence and Transition", RFC 6219, DOI 992 10.17487/RFC6219, May 2011, . 995 [RFC6333] Durand, A., Droms, R., Woodyatt, J., and Y. Lee, "Dual- 996 Stack Lite Broadband Deployments Following IPv4 997 Exhaustion", RFC 6333, August 2011. 999 [RFC6877] Mawatari, M., Kawashima, M., and C. Byrne, "464XLAT: 1000 Combination of Stateful and Stateless Translation", RFC 1001 6877, DOI 10.17487/RFC6877, April 2013, . 1004 [RFC6890] Cotton, M., Vegoda, L., Bonica, R., and B. Haberman, 1005 "Special-Purpose IP Address Registries", BCP 153, RFC6890, 1006 April 2013. 
1008 [RFC7596] Cui, Y., Sun, Q., Boucadair, M., Tsou, T., Lee, Y., and 1009 I. Farrer, "Lightweight 4over6: An Extension to the Dual- 1010 Stack Lite Architecture", RFC 7596, DOI 10.17487/RFC7596, 1011 July 2015, . 1013 [RFC7597] Troan, O., Ed., Dec, W., Li, X., Bao, C., Matsushima, S., 1014 Murakami, T., and T. Taylor, Ed., "Mapping of Address and 1015 Port with Encapsulation (MAP-E)", RFC 7597, DOI 1016 10.17487/RFC7597, July 2015, . 1019 [RFC7599] Li, X., Bao, C., Dec, W., Ed., Troan, O., Matsushima, S., 1020 and T. Murakami, "Mapping of Address and Port using 1021 Translation (MAP-T)", RFC 7599, DOI 10.17487/RFC7599, July 1022 2015, . 1024 [Dns64perf] Bakai, D., "A C++11 DNS64 performance tester", 1025 available: https://github.com/bakaid/dns64perfpp 1027 [ietf95pres] Georgescu, M., "Benchmarking Methodology for IPv6 1028 Transition Technologies", IETF 95, Buenos Aires, 1029 Argentina, April 3-8, 2016, available: [to appear] 1031 15. Acknowledgements 1033 The authors would like to thank Youki Kadobayashi and Hiroaki 1034 Hazeyama for their constant feedback and support. The thanks should 1035 be extended to the NECOMA project members for their continuous 1036 support. We would also like to thank Scott Bradner for the useful 1037 suggestions. We also note that portions of text from Scott's 1038 documents were used in this memo (e.g. Latency section). A big thank 1039 you to Al Morton and Fred Baker for their detailed review of the 1040 draft and very helpful suggestions. Other helpful comments and 1041 suggestions were offered by Bhuvaneswaran Vengainathan, Andrew 1042 McGregor, Nalini Elkins, Kaname Nishizuka, Yasuhiro Ohara, Masataka 1043 Mawatari, Kostas Pentikousis and Bela Almasi. A special thank you to 1044 the RFC Editor Team for their thorough editorial review and helpful 1045 suggestions. This document was prepared using 2-Word- 1046 v2.0.template.dot. 1048 Appendix A. 
Theoretical Maximum Frame Rates 1050 This appendix describes the recommended calculation formulas for the 1051 theoretical maximum frame rates to be employed over Ethernet as 1052 example media. The formula takes into account the frame size 1053 overhead created by the encapsulation or the translation process. 1054 For example, the 6in4 encapsulation described in [RFC4213] adds 20 1055 bytes of overhead to each frame. 1057 Considering X to be the frame size and O to be the frame size 1058 overhead created by the encapsulation or translation process, the 1059 maximum theoretical frame rate for Ethernet can be calculated using 1060 the following formula:
1062             Line Rate (bps)
1063     ------------------------------
1064     (8bits/byte)*(X+O+20)bytes/frame
1066 The calculation is based on the formula recommended by RFC5180 in 1067 Appendix A1. As an example, the frame rate recommended for testing a 1068 6in4 implementation over 10Mb/s Ethernet with 64-byte frames is:
1070             10,000,000(bps)
1071     ------------------------------      = 12,019 fps
1072     (8bits/byte)*(64+20+20)bytes/frame
1074 The complete list of recommended frame rates for 6in4 encapsulation 1075 can be found in the following table:
1077  +------------+---------+----------+-----------+------------+
1078  | Frame size | 10 Mb/s | 100 Mb/s | 1000 Mb/s | 10000 Mb/s |
1079  |  (bytes)   |  (fps)  |  (fps)   |   (fps)   |   (fps)    |
1080  +------------+---------+----------+-----------+------------+
1081  |     64     | 12,019  | 120,192  | 1,201,923 | 12,019,231 |
1082  |    128     |  7,440  |  74,405  |  744,048  | 7,440,476  |
1083  |    256     |  4,223  |  42,230  |  422,297  | 4,222,973  |
1084  |    512     |  2,264  |  22,645  |  226,449  | 2,264,493  |
1085  |    1024    |  1,175  |  11,748  |  117,481  | 1,174,812  |
1086  |    1280    |   947   |  9,470   |  94,697   |  946,970   |
1087  |    1518    |   802   |  8,023   |  80,231   |  802,311   |
1088  |    1522    |   800   |  8,003   |  80,026   |  800,256   |
1089  |    2048    |   599   |  5,987   |  59,866   |  598,659   |
1090  |    4096    |   302   |  3,022   |  30,222   |  302,224   |
1091  |    8192    |   152   |  1,518   |  15,185   |  151,846   |
1092  |    9216    |   135   |  1,350   |  13,505   |  135,048   |
1093  +------------+---------+----------+-----------+------------+
1095 Authors' Addresses 1096 Marius Georgescu 1097 Nara Institute of Science and Technology (NAIST) 1098 Takayama 8916-5 1099 Nara 1100 Japan 1102 Phone: +81 743 72 5216 1103 Email: liviumarius-g@is.naist.jp 1105 Gabor Lencse 1106 Szechenyi Istvan University 1107 Egyetem ter 1. 1108 Gyor 1109 Hungary 1111 Phone: +36 20 775 8267 1112 Email: lencse@sze.hu
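
As an illustration of the Appendix A calculation, the following sketch evaluates the frame-rate formula for the 6in4 case. The function name max_frame_rate and the rounding to the nearest whole frame are assumptions made for this example. The constant 20 is taken directly from the formula in Appendix A.

```python
ETHERNET_OVERHEAD = 20  # bytes per frame, the constant term in the Appendix A formula


def max_frame_rate(line_rate_bps, frame_size, overhead):
    """Theoretical maximum frame rate in frames per second.

    line_rate_bps -- media line rate in bits per second
    frame_size    -- frame size X in bytes
    overhead      -- overhead O in bytes added by encapsulation or translation
    """
    bits_per_frame = 8 * (frame_size + overhead + ETHERNET_OVERHEAD)
    return line_rate_bps / bits_per_frame


if __name__ == "__main__":
    SIXIN4_OVERHEAD = 20  # bytes, the 6in4 overhead stated in Appendix A
    for size in (64, 128, 256, 512, 1024, 1280, 1518):
        # Rounding to a whole frame is an assumption of this sketch.
        fps = round(max_frame_rate(10_000_000, size, SIXIN4_OVERHEAD))
        print(f"{size:>5} bytes: {fps:,} fps")
```

Passing a different overhead value for O gives the corresponding rates for other encapsulation or translation mechanisms.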