Network Working Group                                    B. Constantine
Internet Draft                                                     JDSU
Intended status: Informational                              R. Krishnan
Expires: December 2, 2015                       Brocade Communications
                                                           June 2, 2015

                    Traffic Management Benchmarking
               draft-ietf-bmwg-traffic-management-05.txt

Status of this Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute working
documents as Internet-Drafts. The list of current Internet-Drafts is
at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on December 2, 2015.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Abstract

This framework describes a practical methodology for benchmarking the
traffic management capabilities of networking devices (i.e. policing,
shaping, etc.). The goal is to provide a repeatable test method that
objectively compares performance of the device's traffic management
capabilities and to specify the means to benchmark traffic management
with representative application traffic.

Table of Contents

1. Introduction...................................................4
1.1.
Traffic Management Overview...............................4 55 1.2. DUT Lab Configuration and Testing Overview................5 56 2. Conventions used in this document..............................7 57 3. Scope and Goals................................................8 58 4. Traffic Benchmarking Metrics...................................9 59 4.1. Metrics for Stateless Traffic Tests......................10 60 4.2. Metrics for Stateful Traffic Tests.......................11 61 5. Tester Capabilities...........................................12 62 5.1. Stateless Test Traffic Generation........................13 63 5.1.1. Burst Hunt with Stateless Traffic...................13 64 5.2. Stateful Test Pattern Generation.........................13 65 5.2.1. TCP Test Pattern Definitions........................15 66 6. Traffic Benchmarking Methodology..............................16 67 6.1. Policing Tests...........................................16 68 6.1.1 Policer Individual Tests................................17 69 6.1.2 Policer Capacity Tests..............................18 70 6.1.2.1 Maximum Policers on Single Physical Port..........19 71 6.1.2.2 Single Policer on All Physical Ports..............20 72 6.1.2.3 Maximum Policers on All Physical Ports............21 73 6.2. Queue/Scheduler Tests....................................21 74 6.2.1 Queue/Scheduler Individual Tests........................21 75 6.2.1.1 Testing Queue/Scheduler with Stateless Traffic....21 76 6.2.1.2 Testing Queue/Scheduler with Stateful Traffic.....23 77 6.2.2 Queue / Scheduler Capacity Tests......................25 78 6.2.2.1 Multiple Queues / Single Port Active..............25 79 6.2.2.1.1 Strict Priority on Egress Port..................26 80 6.2.2.1.2 Strict Priority + Weighted Fair Queue (WFQ).....26 81 6.2.2.2 Single Queue per Port / All Ports Active..........27 82 6.2.2.3 Multiple Queues per Port, All Ports Active........27 83 6.3. Shaper tests.............................................28 84 6.3.1 Shaper Individual Tests...............................28 85 6.3.1.1 Testing Shaper with Stateless Traffic.............29 86 6.3.1.2 Testing Shaper with Stateful Traffic..............30 87 6.3.2 Shaper Capacity Tests.................................32 88 6.3.2.1 Single Queue Shaped, All Physical Ports Active....32 89 6.3.2.2 All Queues Shaped, Single Port Active.............32 90 6.3.2.3 All Queues Shaped, All Ports Active...............33 91 6.4. Concurrent Capacity Load Tests...........................34 92 7. Security Considerations.......................................34 93 8. IANA Considerations...........................................34 94 9. References....................................................35 95 9.1. Normative References.....................................35 96 9.2. Informative References...................................35 97 Appendix A: Open Source Tools for Traffic Management Testing.....36 98 Appendix B: Stateful TCP Test Patterns...........................37 99 Acknowledgments..................................................41 100 Authors' Addresses...............................................42 102 1. Introduction 104 Traffic management (i.e. policing, shaping, etc.) is an increasingly 105 important component when implementing network Quality of Service 106 (QoS). 108 There is currently no framework to benchmark these features 109 although some standards address specific areas which are described 110 in Section 1.1. 
This draft provides a framework to conduct repeatable traffic
management benchmarks for devices and systems in a lab environment.

Specifically, this framework defines the methods to characterize the
capacity of the following traffic management features in network
devices: classification, policing, queuing / scheduling, and traffic
shaping.

This benchmarking framework can also be used as a test procedure to
assist in the tuning of traffic management parameters before service
activation. In addition to Layer 2/3 (Ethernet / IP) benchmarking,
Layer 4 (TCP) test patterns are proposed by this draft in order to
more realistically benchmark end-user traffic.

1.1. Traffic Management Overview

In general, a device with traffic management capabilities performs
the following functions:

- Traffic classification: identifies traffic according to various
configuration rules (for example, IEEE 802.1Q Virtual LAN (VLAN),
Differentiated Services Code Point (DSCP), etc.) and marks this
traffic internally to the network device. Multiple external
priorities (DSCP, 802.1p, etc.) can map to the same priority in the
device.
- Traffic policing: limits the rate of traffic that enters a network
device according to the traffic classification. If the traffic
exceeds the provisioned limits, the traffic is either dropped or
remarked and forwarded on to the next network device.
- Traffic Scheduling: provides traffic classification within the
network device by directing packets to various types of queues and
applies a dispatching algorithm to assign the forwarding sequence of
packets.
- Traffic shaping: a traffic control technique that actively buffers
and smooths the output rate in an attempt to adapt bursty traffic to
the configured limits.
- Active Queue Management (AQM): AQM involves monitoring the status
of internal queues and proactively dropping (or remarking) packets,
which causes hosts using congestion-aware protocols to back off and
in turn alleviates queue congestion [AQM-RECO]. On the other hand,
classic traffic management techniques reactively drop (or remark)
packets based on a queue full condition. The benchmarking scenarios
for AQM are different and are outside the scope of this testing
framework.

Even though AQM is outside the scope of this framework, it should be
noted that the TCP metrics and TCP test patterns (defined in Sections
4.2 and 5.2, respectively) could be useful to test new AQM algorithms
(targeted to alleviate buffer bloat). Examples of these algorithms
include CoDel and PIE (draft-ietf-aqm-codel and draft-ietf-aqm-pie).

The following diagram is a generic model of the traffic management
capabilities within a network device. It is not intended to represent
all variations of manufacturer traffic management capabilities, but
to provide context for this test framework.
168 |----------| |----------------| |--------------| |----------| 169 | | | | | | | | 170 |Interface | |Ingress Actions | |Egress Actions| |Interface | 171 |Input | |(classification,| |(scheduling, | |Output | 172 |Queues | | marking, | | shaping, | |Queues | 173 | |-->| policing or |-->| active queue |-->| | 174 | | | shaping) | | management | | | 175 | | | | | remarking) | | | 176 |----------| |----------------| |--------------| |----------| 178 Figure 1: Generic Traffic Management capabilities of a Network Device 180 Ingress actions such as classification are defined in [RFC4689] 181 and include IP addresses, port numbers, DSCP, etc. In terms of 182 marking, [RFC2697] and [RFC2698] define a single rate and dual rate, 183 three color marker, respectively. 185 The Metro Ethernet Forum (MEF) specifies policing and shaping in 186 terms of Ingress and Egress Subscriber/Provider Conditioning 187 Functions in MEF12.1 [MEF-12.1]; Ingress and Bandwidth Profile 188 attributes in MEF10.2 [MEF-10.2] and MEF 26 [MEF-26]. 190 1.2 Lab Configuration and Testing Overview 192 The following is the description of the lab set-up for the traffic 193 management tests: 195 +--------------+ +-------+ +----------+ +-----------+ 196 | Transmitting | | | | | | Receiving | 197 | Test Host | | | | | | Test Host | 198 | |-----| Device|---->| Network |--->| | 199 | | | Under | | Delay | | | 200 | | | Test | | Emulator | | | 201 | |<----| |<----| |<---| | 202 | | | | | | | | 203 +--------------+ +-------+ +----------+ +-----------+ 205 As shown in the test diagram, the framework supports uni-directional 206 and bi-directional traffic management tests (where the transmitting 207 and receiving roles would be reversed on the return path). 209 This testing framework describes the tests and metrics for each of 210 the following traffic management functions: 211 - Classification 212 - Policing 213 - Queuing / Scheduling 214 - Shaping 216 The tests are divided into individual and rated capacity tests. 217 The individual tests are intended to benchmark the traffic management 218 functions according to the metrics defined in Section 4. The 219 capacity tests verify traffic management functions under the load of 220 many simultaneous individual tests and their flows. 222 This involves concurrent testing of multiple interfaces with the 223 specific traffic management function enabled, and increasing load to 224 the capacity limit of each interface. 226 As an example: a device is specified to be capable of shaping on all 227 of its egress ports. The individual test would first be conducted to 228 benchmark the specified shaping function against the metrics defined 229 in section 4. Then the capacity test would be executed to test the 230 shaping function concurrently on all interfaces and with maximum 231 traffic load. 233 The Network Delay Emulator (NDE) is required for TCP stateful tests 234 in order to allow TCP to utilize a significant size TCP window in its 235 control loop. 237 Also note that the Network Delay Emulator (NDE) SHOULD be passive in 238 nature such as a fiber spool. This is recommended to eliminate the 239 potential effects that an active delay element (i.e. test impairment 240 generator) may have on the test flows. In the case where a fiber 241 spool is not practical due to the desired latency, an active NDE MUST 242 be independently verified to be capable of adding the configured 243 delay without loss. In other words, the DUT would be removed and the 244 NDE performance benchmarked independently. 
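One way to record the result of this independent NDE verification is
sketched below in Python. It is a minimal sketch only: the configured
delay, tolerance, and measured values are hypothetical, and the final
line simply derives the BDP that the verified delay yields for the
stateful TCP tests using BDP = BB * RTT / 8, the relationship applied
later in Section 6.2.1.2.

   # Sketch (hypothetical values): check the NDE's measured delay per
   # frame size against the configured delay, then derive the BDP that
   # the configured RTT yields for the stateful TCP tests.

   CONFIGURED_RTT_S = 0.005          # 5 ms emulated RTT (example value)
   BOTTLENECK_BW_BPS = 100_000_000   # 100 Mbps bottleneck bandwidth (BB)
   TOLERANCE_S = 0.0005              # acceptable delay error (example value)

   # Hypothetical measured RTTs through the NDE, per frame size (bytes -> s)
   measured_rtt_s = {128: 0.00502, 512: 0.00501, 1500: 0.00505, 9600: 0.00507}

   for size, rtt in sorted(measured_rtt_s.items()):
       error = abs(rtt - CONFIGURED_RTT_S)
       status = "OK" if error <= TOLERANCE_S else "OUT OF TOLERANCE"
       print(f"{size:5d} bytes: measured RTT {rtt * 1000:.3f} ms ({status})")

   # BDP = BB * RTT / 8 (bytes), per the formula used in Section 6.2.1.2
   bdp_bytes = BOTTLENECK_BW_BPS * CONFIGURED_RTT_S / 8
   print(f"Expected BDP: {bdp_bytes / 1000:.1f} KB")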
246 Note that the NDE SHOULD be used only as emulated delay. Most NDEs 247 allow for per flow delay actions, emulating QoS prioritization. For 248 this framework, the NDE's sole purpose is simply to add delay to all 249 packets (emulate network latency). So to benchmark the performance of 250 the NDE, maximum offered load should be tested against the following 251 frame sizes: 128, 256, 512, 768, 1024, 1500,and 9600 bytes. The delay 252 accuracy at each of these packet sizes can then be used to calibrate 253 the range of expected Bandwidth Delay Product (BDP) for the TCP 254 stateful tests. 256 2. Conventions used in this document 258 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 259 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 260 document are to be interpreted as described in [RFC2119]. 262 The following acronyms are used: 264 AQM: Active Queue Management 266 BB: Bottleneck Bandwidth 268 BDP: Bandwidth Delay Product 270 BSA: Burst Size Achieved 272 CBS: Committed Burst Size 274 CIR: Committed Information Rate 276 DUT: Device Under Test 278 EBS: Excess Burst Size 280 EIR: Excess Information Rate 282 NDE: Network Delay Emulator 284 SP: Strict Priority Queuing 286 QL: Queue Length 288 QoS: Quality of Service 290 RTH: Receiving Test Host 292 RTT: Round Trip Time 294 SBB: Shaper Burst Bytes 296 SBI: Shaper Burst Interval 298 SR: Shaper Rate 300 SSB: Send Socket Buffer 302 Tc: CBS Time Interval 304 Te: EBS Time Interval 306 Ti Transmission Interval 308 TTH: Transmitting Test Host 309 TTP: TCP Test Pattern 311 TTPET: TCP Test Pattern Execution Time 313 3. Scope and Goals 315 The scope of this work is to develop a framework for benchmarking and 316 testing the traffic management capabilities of network devices in the 317 lab environment. These network devices may include but are not 318 limited to: 319 - Switches (including Layer 2/3 devices) 320 - Routers 321 - Firewalls 322 - General Layer 4-7 appliances (Proxies, WAN Accelerators, etc.) 324 Essentially, any network device that performs traffic management as 325 defined in section 1.1 can be benchmarked or tested with this 326 framework. 328 The primary goal is to assess the maximum forwarding performance 329 deemed to be within the provisioned traffic limits that a network 330 device can sustain without dropping or impairing packets, or 331 compromising the accuracy of multiple instances of traffic 332 management functions. This is the benchmark for comparison between 333 devices. 335 Within this framework, the metrics are defined for each traffic 336 management test but do not include pass / fail criterion, which is 337 not within the charter of BMWG. This framework provides the test 338 methods and metrics to conduct repeatable testing, which will 339 provide the means to compare measured performance between DUTs. 341 As mentioned in section 1.2, these methods describe the individual 342 tests and metrics for several management functions. It is also within 343 scope that this framework will benchmark each function in terms of 344 overall rated capacity. This involves concurrent testing of multiple 345 interfaces with the specific traffic management function enabled, up 346 to the capacity limit of each interface. 348 It is not within scope of this framework to specify the procedure for 349 testing multiple configurations of traffic management functions 350 concurrently. 
The multitude of possible combinations is almost unbounded and the
ability to identify functional "break points" would be almost
impossible.

However, section 6.4 provides suggestions for some profiles of
concurrent functions that would be useful to benchmark. The key
requirement for any concurrent test function is that tests MUST
produce reliable and repeatable results.

Also, it is not within scope to perform conformance testing. Tests
defined in this framework benchmark the traffic management functions
according to the metrics defined in section 4 and do not address any
conformance to standards related to traffic management.

The current specifications do not specify exact behavior or
implementation, and the specifications that do exist (cited in
Section 1.1) allow implementations to vary w.r.t. short term rate
accuracy and other factors. This is a primary driver for this
framework: to provide an objective means to compare vendor traffic
management functions.

Another goal is to devise methods that utilize flows with
congestion-aware transport (TCP) as part of the traffic load and
still produce repeatable results in the isolated test environment.
This framework will derive stateful test patterns (TCP or application
layer) that can also be used to further benchmark the performance of
applicable traffic management techniques such as queuing / scheduling
and traffic shaping. In cases where the network device is stateful in
nature (i.e. firewall, etc.), stateful test pattern traffic is
important to test along with stateless, UDP traffic in specific test
scenarios (i.e. applications using TCP transport and UDP VoIP, etc.).

As mentioned earlier in the document, repeatability of test results
is critical, especially considering the nature of stateful TCP
traffic. To this end, the stateful tests will use TCP test patterns
to emulate applications. This framework also provides guidelines for
application modeling and open source tools to achieve the repeatable
stimulus. Finally, TCP metrics from [RFC6349] MUST be measured for
each stateful test and provide the means to compare each repeated
test.

Even though the scope is targeted to TCP applications (i.e. Web,
Email, database, etc.), the framework could be applied to SCTP in
terms of test patterns. WebRTC, SS7 signaling, and 3GPP are examples
of protocols that use SCTP and that could be modeled with this
framework to benchmark SCTP's effect on traffic management
performance.

Also note that currently, this framework does not address tcpcrypt
(encrypted TCP) test patterns, although the metrics defined in
Section 4.2 can still be used since the metrics are based on TCP
retransmission and RTT measurements (versus any of the payload). Thus
if tcpcrypt becomes popular, it would be natural for benchmarkers to
consider encrypted TCP patterns and include them in test cases.

4. Traffic Benchmarking Metrics

The metrics to be measured during the benchmarks are divided into two
(2) sections: packet layer metrics used for the stateless traffic
testing and TCP layer metrics used for the stateful traffic testing.

4.1. Metrics for Stateless Traffic Tests

Stateless traffic measurements require that a sequence number and
time-stamp be inserted into the payload for lost packet analysis.
Delay analysis may be achieved by insertion of timestamps directly
into the packets or timestamps stored elsewhere (packet captures).
This framework does not specify the packet format to carry sequence
number or timing information.

However, [RFC4737] and [RFC4689] provide recommendations for sequence
tracking along with definitions of in-sequence and out-of-order
packets.

The following are the metrics that MUST be measured during the
stateless traffic benchmarking components of the tests:

- Burst Size Achieved (BSA): for the traffic policing and network
queue tests, the tester will be configured to send bursts to test
either the Committed Burst Size (CBS) or Excess Burst Size (EBS) of a
policer or the queue / buffer size configured in the DUT. The Burst
Size Achieved metric is a measure of the actual burst size received
at the egress port of the DUT with no lost packets. As an example, if
the configured CBS of a DUT is 64 KB and after the burst test only
63 KB can be achieved without packet loss, then 63 KB is the BSA.
Also, the average Packet Delay Variation (PDV, see below) as
experienced by the packets sent at the BSA burst size should be
recorded. This metric shall be reported in units of bytes, KBytes, or
MBytes.

- Lost Packets (LP): For all traffic management tests, the tester
will transmit the test packets into the DUT ingress port and the
number of packets received at the egress port will be measured. The
difference between packets transmitted into the ingress port and
received at the egress port is the number of lost packets as measured
at the egress port. These packets must have unique identifiers such
that only the test packets are measured. For cases where multiple
flows are transmitted from ingress to egress port (e.g. IP
conversations), each flow must have sequence numbers within the test
packet stream.

[RFC6703] and [RFC2680] describe the need to establish the time
threshold to wait before a packet is declared as lost, and this
threshold MUST be reported with the results. This metric shall be
reported as an integer number which cannot be negative. (see:
http://tools.ietf.org/html/rfc6703#section-4.1)

- Out of Order (OOO): in addition to the LP metric, the test packets
must be monitored for sequence. [RFC4689] defines the general
function of sequence tracking, as well as definitions for in-sequence
and out-of-order packets. Out-of-order packets will be counted per
[RFC4737]. This metric shall be reported as an integer number which
cannot be negative.

- Packet Delay (PD): the Packet Delay metric is the difference
between the timestamp of the received egress port packets and the
packets transmitted into the ingress port, as specified in [RFC1242].
The transmitting host and receiving host time must be in time sync
using NTP, GPS, etc. This metric SHALL be reported as a real number
of seconds, where a negative measurement usually indicates a time
synchronization problem between test devices.

- Packet Delay Variation (PDV): the Packet Delay Variation metric is
the variation between the timestamps of the received egress port
packets, as specified in [RFC5481]. Note that per [RFC5481], this PDV
is the variation of one-way delay across many packets in the traffic
flow.
Per the measurement formula in [RFC5481], select the high percentile
of 99%, and the units of measure will be a real number of seconds (a
negative value is not possible for PDV and would indicate a
measurement error).

- Shaper Rate (SR): The SR represents the average DUT output rate
(bps) over the test interval. The Shaper Rate is only applicable to
the traffic shaping tests.

- Shaper Burst Bytes (SBB): A traffic shaper will emit packets in
different size "trains"; these are frames transmitted "back-to-back",
respecting the mandatory inter-frame gap. This metric characterizes
the method by which the shaper emits traffic. Some shapers transmit
larger bursts per interval, and a burst of 1 packet would apply to
the extreme case of a shaper sending a CBR stream of single packets.
This metric SHALL be reported in units of bytes, KBytes, or MBytes.
Shaper Burst Bytes is only applicable to the traffic shaping tests.

- Shaper Burst Interval (SBI): the SBI is the time between shaper
emitted bursts and is measured at the DUT egress port. This metric
shall be reported as a real number of seconds. Shaper Burst Interval
is only applicable to the traffic shaping tests.

4.2. Metrics for Stateful Traffic Tests

The stateful metrics will be based on [RFC6349] TCP metrics and MUST
include:

- TCP Test Pattern Execution Time (TTPET): [RFC6349] defined the TCP
Transfer Time for bulk transfers, which is simply the measured time
to transfer bytes across single or concurrent TCP connections. The
TCP test patterns used in traffic management tests will include bulk
transfer and interactive applications. The interactive patterns
include instances such as HTTP business applications, database
applications, etc. The TTPET will be the measure of the time for a
single execution of a TCP Test Pattern (TTP). Average, minimum, and
maximum times will be measured or calculated and expressed as a real
number of seconds.

An example would be an interactive HTTP TTP session which should take
5 seconds on a GigE network with 0.5 millisecond latency. During ten
(10) executions of this TTP, the TTPET results might be: average of
6.5 seconds, minimum of 5.0 seconds, and maximum of 7.9 seconds.

- TCP Efficiency: after the execution of the TCP Test Pattern, TCP
Efficiency represents the percentage of Bytes that were not
retransmitted.

                       Transmitted Bytes - Retransmitted Bytes
   TCP Efficiency % = -----------------------------------------  X 100
                                  Transmitted Bytes

Transmitted Bytes are the total number of TCP Bytes to be transmitted
including the original and the retransmitted Bytes. These
retransmitted bytes should be recorded from the sender's TCP/IP stack
perspective, to avoid any misinterpretation that a reordered packet
is a retransmitted packet (as may be the case with packet decode
interpretation).

- Buffer Delay: represents the increase in RTT during a TCP test
versus the baseline DUT RTT (non-congested, inherent latency). RTT
and the technique to measure RTT (average versus baseline) are
defined in [RFC6349]. Referencing [RFC6349], the average RTT is
derived from the total of all measured RTTs during the actual test
sampled at every second divided by the test duration in seconds.
                                    Total RTTs during transfer
   Average RTT during transfer = -------------------------------
                                   Transfer duration in seconds

                     Average RTT during Transfer - Baseline RTT
   Buffer Delay % = -------------------------------------------- X 100
                                   Baseline RTT

Note that even though this was not explicitly stated in [RFC6349],
retransmitted packets should not be used in RTT measurements.

Also, the test results should record the average RTT in milliseconds
across the entire test duration and the number of samples.

5. Tester Capabilities

The testing capabilities of the traffic management test environment
are divided into two (2) sections: stateless traffic testing and
stateful traffic testing.

5.1. Stateless Test Traffic Generation

The test device MUST be capable of generating traffic at up to the
link speed of the DUT. The test device must be calibrated to verify
that it will not drop any packets. The test device's inherent PD and
PDV must also be calibrated and subtracted from the PD and PDV
metrics. The test device must support the encapsulation to be tested
such as IEEE 802.1Q VLAN, IEEE 802.1ad Q-in-Q, Multiprotocol Label
Switching (MPLS), etc. Also, the test device must allow control of
the classification techniques defined in [RFC4689] (i.e. IP address,
DSCP, TOS, etc.).

The open source tool "iperf" can be used to generate stateless UDP
traffic and is discussed in Appendix A. Since iperf is a software
based tool, there will be performance limitations at higher link
speeds (e.g. GigE, 10 GigE, etc.). Careful calibration of any test
environment using iperf is important. At higher link speeds, it is
recommended to use hardware based packet test equipment.

5.1.1 Burst Hunt with Stateless Traffic

A central theme for the traffic management tests is to benchmark the
specified burst parameter of a traffic management function, since
burst parameters of SLAs are specified in bytes. For testing
efficiency, it is recommended to include a burst hunt feature, which
automates the manual process of determining the maximum burst size
which can be supported by a traffic management function.

The burst hunt algorithm should start at the target burst size
(maximum burst size supported by the traffic management function) and
will send single bursts until it can determine the largest burst that
can pass without loss. If the target burst size passes, then the test
is complete. The hunt aspect occurs when the target burst size is not
achieved; the algorithm will drop down to a configured minimum burst
size and incrementally increase the burst until the maximum burst
supported by the DUT is discovered. The recommended granularity of
the incremental burst size increase is 1 KB.

Optionally, for a policer function, if the burst size passes, the
burst should be increased by increments of 1 KB to verify that the
policer is truly configured properly (or enabled at all).
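The following is a minimal sketch of this hunt logic in Python.
send_burst() is a hypothetical tester primitive: it transmits one
burst of the given size at line rate and reports whether the entire
burst was received at the DUT egress port without loss. The 1 KB step
matches the recommended granularity above; this is an illustration of
the procedure, not a normative implementation.

   # Sketch of the burst hunt procedure from Section 5.1.1.
   # send_burst(size_bytes) is a hypothetical tester primitive that
   # returns True if the whole burst exits the DUT egress without loss.

   STEP_BYTES = 1024  # recommended 1 KB granularity

   def burst_hunt(target_burst, min_burst, send_burst):
       """Return the largest burst size (bytes) that passes without loss."""
       # First try the target burst size (e.g. the configured CBS or QL).
       if send_burst(target_burst):
           return target_burst
       # Hunt: start at the configured minimum and increase in 1 KB steps
       # until a burst is lost; the previous size is the supported maximum.
       largest_passing = 0
       size = min_burst
       while size < target_burst:
           if not send_burst(size):
               break
           largest_passing = size
           size += STEP_BYTES
       return largest_passing

For the optional policer check, the returned size can then be
increased in further 1 KB steps to confirm that larger bursts are in
fact dropped or remarked.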
5.2. Stateful Test Pattern Generation

The TCP test host will have many of the same attributes as the TCP
test host defined in [RFC6349]. The TCP test device may be a standard
computer or a dedicated communications test instrument. In both
cases, it must be capable of emulating both a client and a server.

For any test using stateful TCP test traffic, the Network Delay
Emulator (NDE function from the lab set-up diagram) must be used in
order to provide a meaningful BDP. As referenced in section 1.2, the
target traffic rate and configured RTT MUST be verified independently
using just the NDE for all stateful tests (to ensure the NDE can
delay without loss).

The TCP test host MUST be capable of generating and receiving
stateful TCP test traffic at the full link speed of the DUT. As a
general rule of thumb, testing TCP Throughput at rates greater than
500 Mbps may require high performance server hardware or dedicated
hardware based test tools.

The TCP test host MUST allow adjusting both Send and Receive Socket
Buffer sizes. The Socket Buffers must be large enough to fill the BDP
for bulk transfer TCP test application traffic.

Measuring RTT and retransmissions per connection will generally
require a dedicated communications test instrument. In the absence of
dedicated hardware based test tools, these measurements may need to
be conducted with packet capture tools, i.e. conduct TCP Throughput
tests and analyze RTT and retransmissions in packet captures.

The TCP implementation used by the test host MUST be specified in the
test results (e.g. TCP New Reno, TCP options supported, etc.).
Additionally, the test results SHALL provide specific congestion
control algorithm details, as per [RFC3148].

While [RFC6349] defined the means to conduct throughput tests of TCP
bulk transfers, the traffic management framework will extend TCP test
execution into interactive TCP application traffic. Examples include
email, HTTP, business applications, etc. This interactive traffic is
bi-directional and can be chatty, meaning many turns in traffic
communication during the course of a transaction (versus the
relatively uni-directional flow of bulk transfer applications).

The test device must not only support bulk TCP transfer application
traffic but MUST also support chatty traffic. A valid stress test
SHOULD include both traffic types. This is due to the non-uniform,
bursty nature of chatty applications versus the relatively uniform
nature of bulk transfers (the bulk transfer smoothly stabilizes to
equilibrium state under lossless conditions).

While iperf is an excellent choice for TCP bulk transfer testing, the
netperf open source tool provides the ability to control the client
and server request / response behavior. The netperf-wrapper tool is a
Python wrapper to run multiple simultaneous netperf instances and
aggregate the results. Appendix A provides an overview of netperf /
netperf-wrapper and another open source application emulation tool,
iperf. As with any software based tool, the performance must be
qualified to the link speed to be tested. Hardware-based test
equipment should be considered for reliable results at higher link
speeds (e.g. 1 GigE, 10 GigE).

5.2.1. TCP Test Pattern Definitions

As mentioned in the goals of this framework, techniques are defined
to specify TCP traffic test patterns to benchmark traffic management
technique(s) and produce repeatable results. Some network devices,
such as firewalls, will not process stateless test traffic, which is
another reason why stateful TCP test traffic must be used.
An application could be fully emulated up to Layer 7; however, this
framework proposes that stateful TCP test patterns be used in order
to provide granular and repeatable control for the benchmarks. The
following diagram illustrates a simple Web Browsing application
(HTTP).

                     GET url
   Client    ------------------------>   Web

   Web            200 OK        100ms |

   Browser   <------------------------   Server

In this example, the Client Web Browser (Client) requests a URL and
then the Web Server delivers the web page content to the Client
(after a Server delay of 100 milliseconds). This asynchronous,
"request/response" behavior is intrinsic to most TCP based
applications such as Email (SMTP), File Transfers (FTP and SMB),
Database (SQL), Web Applications (SOAP), REST, etc. The impact to the
network elements is due to the multitudes of Clients and the variety
of bursty traffic, which stresses traffic management functions. The
actual emulation of the specific application protocols is not
required, and TCP test patterns can be defined to mimic the
application network traffic flows and produce repeatable results.

Application modeling techniques have been proposed in "3GPP2
C.R1002-0 v1.0", which provides examples to model the behavior of
HTTP, FTP, and WAP applications at the TCP layer. The models have
been defined with various mathematical distributions for the
Request/Response bytes and inter-request gap times. The model
definition format described in this work is the basis for the
guidelines provided in Appendix B and is also similar to formats used
by network modeling tools. Packet captures can also be used to
characterize application traffic and specify some of the test
patterns listed in Appendix B.

This framework does not specify a fixed set of TCP test patterns, but
does provide test cases that SHOULD be performed in Appendix B. Some
of these examples reflect those specified in "draft-ietf-bmwg-ca-
bench-meth-04", which suggests traffic mixes for a variety of
representative application profiles. Other examples are simply
well-known application traffic types such as HTTP.

6. Traffic Benchmarking Methodology

The traffic benchmarking methodology uses the test set-up from
section 1.2 and the metrics defined in section 4.

Each test SHOULD compare the network device's internal statistics
(available via command line management interface, SNMP, etc.) to the
measured metrics defined in section 4. This evaluates the accuracy of
the internal traffic management counters under individual test
conditions and capacity test conditions that are defined in each
subsection. This comparison is not intended to compare real-time
statistics, but the cumulative statistics reported after the test has
completed and device counters have updated (it is common for device
counters to update after a 10 second or greater interval).

From a device configuration standpoint, scheduling and shaping
functionality can be applied to logical ports such as a Link
Aggregation Group (LAG). This would result in the same scheduling and
shaping configuration applied to all the member physical ports. The
focus of this draft is only on tests at a physical port level.

The following sections provide the objective, procedure, metrics, and
reporting format for each test. For all test steps, the following
global parameters must be specified:
Test Runs (Tr). Defines the number of times the test needs to be run
to ensure accurate and repeatable results. The recommended value is a
minimum of 10.

Test Duration (Td). Defines the duration of a test iteration,
expressed in seconds. The recommended minimum value is 60 seconds.

The variability in the test results MUST be measured between Test
Runs, and if the variation is characterized as a significant portion
of the measured values, the next step may be to revise the methods to
achieve better consistency.

6.1. Policing Tests

A policer is defined as the entity performing the policing function.
The intent of the policing tests is to verify the policer performance
(i.e. CIR-CBS and EIR-EBS parameters). The tests will verify that the
network device can handle the CIR with CBS and the EIR with EBS and
will use back-to-back packet testing concepts from [RFC2544] (but
adapted to burst size algorithms and terminology). Also, [MEF-14],
[MEF-19], and [MEF-37] provide some basis for specific components of
this test. The burst hunt algorithm defined in section 5.1.1 can also
be used to automate the measurement of the CBS value.

The tests are divided into two (2) sections: individual policer tests
and then full capacity policing tests. It is important to benchmark
the basic functionality of the individual policer and then proceed
into the fully rated capacity of the device. This capacity may
include the number of policing policies per device and the number of
policers simultaneously active across all ports.

6.1.1 Policer Individual Tests

Objective:
Test a policer as defined by [RFC4115] or MEF 10.2, depending upon
the equipment's specification. In addition to verifying that the
policer allows the specified CBS and EBS bursts to pass, the policer
test MUST verify that the policer will remark or drop excess traffic,
and pass traffic at the specified CBS/EBS values.

Test Summary:
Policing tests should use stateless traffic. Stateful TCP test
traffic will generally be adversely affected by a policer in the
absence of traffic shaping. So while TCP traffic could be used, it is
more accurate to benchmark a policer with stateless traffic.

As an example for [RFC4115], consider a CBS and EBS of 64KB and CIR
and EIR of 100 Mbps on a 1GigE physical link (in color-blind mode). A
stateless traffic burst of 64KB would be sent into the policer at the
GigE rate. This equates to approximately a 0.512 millisecond burst
time (64 KB at 1 GigE). The traffic generator must space these bursts
to ensure that the aggregate throughput does not exceed the CIR. The
Ti between the bursts would equal CBS * 8 / CIR = 5.12 milliseconds
in this example.
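The arithmetic in this example can be captured in a short helper. The
sketch below is illustrative only; it assumes the 64 KB CBS is taken
as 64,000 bytes (to match the example's arithmetic) and expresses the
link rate and CIR in bits per second.

   # Sketch: compute the burst transmission time and the inter-burst
   # interval (Ti) used to keep the aggregate offered load at the CIR.
   # Values are the illustrative ones from the example above.

   CBS_BYTES = 64 * 1000          # 64 KB committed burst size
   LINK_RATE_BPS = 1_000_000_000  # 1 GigE ingress link rate
   CIR_BPS = 100_000_000          # 100 Mbps committed information rate

   burst_bits = CBS_BYTES * 8
   burst_time_s = burst_bits / LINK_RATE_BPS  # time to emit one burst at line rate
   ti_s = burst_bits / CIR_BPS                # Ti = CBS * 8 / CIR

   print(f"Burst time: {burst_time_s * 1000:.3f} ms")  # 0.512 ms
   print(f"Ti (burst spacing): {ti_s * 1000:.2f} ms")  # 5.12 ms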
Test Metrics:
The metrics defined in section 4.1 (BSA, LP, OOS, PD, and PDV) SHALL
be measured at the egress port and recorded.

Procedure:
1. Configure the DUT policing parameters for the desired CIR/EIR and
   CBS/EBS values to be tested

2. Configure the tester to generate a stateless traffic burst equal
   to CBS and an interval equal to Ti (CBS in bits / CIR)

3. Compliant Traffic Test: Generate bursts of CBS + EBS traffic into
   the policer ingress port and measure the metrics defined in
   section 4.1 (BSA, LP, OOS, PD, and PDV) at the egress port and
   across the entire Td (default 60 seconds duration)

4. Excess Traffic Test: Generate bursts greater than the CBS + EBS
   limit into the policer ingress port and verify that the policer
   only allowed the BSA bytes to exit the egress. The excess burst
   MUST be recorded and the recommended value is 1000 bytes.
   Additional tests beyond the simple color-blind example might
   include: color-aware mode, configurations where EIR is greater
   than CIR, etc.

Reporting Format:
The policer individual report MUST contain all results for each
CIR/EIR/CBS/EBS test run and a recommended format is as follows:

********************************************************
Test Configuration Summary: Tr, Td

DUT Configuration Summary: CIR, EIR, CBS, EBS

The results table should contain entries for each test run, (Test #1
to Test #Tr).

Compliant Traffic Test: BSA, LP, OOS, PD, and PDV

Excess Traffic Test: BSA
********************************************************

6.1.2 Policer Capacity Tests

Objective:
The intent of the capacity tests is to verify the policer performance
in a scaled environment with multiple ingress customer policers on
multiple physical ports. This test will benchmark the maximum number
of active policers as specified by the device manufacturer.

Test Summary:
The specified policing function capacity is generally expressed in
terms of the number of policers active on each individual physical
port as well as the number of unique policer rates that are utilized.
For all of the capacity tests, the benchmarking test procedure and
report format described in Section 6.1.1 for a single policer MUST be
applied to each of the physical port policers.

As an example, a Layer 2 switching device may specify that each of
the 32 physical ports can be policed using a pool of policing service
policies. The device may carry a single customer's traffic on each
physical port and a single policer is instantiated per physical port.
Another possibility is that a single physical port may carry multiple
customers, in which case many customer flows would be policed
concurrently on an individual physical port (separate policers per
customer on an individual port).

Test Metrics:
The metrics defined in section 4.1 (BSA, LP, OOS, PD, and PDV) SHALL
be measured at the egress port and recorded.

The following sections provide the specific test scenarios,
procedures, and reporting formats for each policer capacity test.

6.1.2.1 Maximum Policers on Single Physical Port Test

Test Summary:
The first policer capacity test will benchmark a single physical
port, maximum policers on that physical port.

Assume multiple categories of ingress policers at rates r1, r2,...,
rn. There are multiple customers on a single physical port. Each
customer could be represented by a single-tagged VLAN, double-tagged
VLAN, VPLS instance, etc. Each customer is mapped to a different
policer. Each of the policers can be of rates r1, r2,..., rn.

An example configuration would be:
- Y1 customers, policer rate r1
- Y2 customers, policer rate r2
- Y3 customers, policer rate r3
...
- Yn customers, policer rate rn

Some bandwidth on the physical port is dedicated for other traffic
(non-customer traffic); this includes network control protocol
traffic. There is a separate policer for the other traffic.
Typical deployments have 3 categories of policers; there may be some
deployments with more or fewer than 3 categories of ingress policers.

Test Procedure:
1. Configure the DUT policing parameters for the desired CIR/EIR and
   CBS/EBS values for each policer rate (r1-rn) to be tested

2. Configure the tester to generate a stateless traffic burst equal
   to CBS and an interval equal to Ti (CBS in bits / CIR) for each
   customer stream (Y1 - Yn). The encapsulation for each customer
   must also be configured according to the service tested (VLAN,
   VPLS, IP mapping, etc.).

3. Compliant Traffic Test: Generate bursts of CBS + EBS traffic into
   the policer ingress port for each customer traffic stream and
   measure the metrics defined in section 4.1 (BSA, LP, OOS, PD, and
   PDV) at the egress port for each stream and across the entire Td
   (default 30 seconds duration)

4. Excess Traffic Test: Generate bursts greater than the CBS + EBS
   limit into the policer ingress port for each customer traffic
   stream and verify that the policer only allowed the BSA bytes to
   exit the egress for each stream. The excess burst MUST be recorded
   and the recommended value is 1000 bytes.

Reporting Format:
The policer individual report MUST contain all results for each
CIR/EIR/CBS/EBS test run, per customer traffic stream.

A recommended format is as follows:

********************************************************
Test Configuration Summary: Tr, Td

Customer traffic stream Encapsulation: Map each stream to VLAN,
VPLS, IP address

DUT Configuration Summary per Customer Traffic Stream: CIR, EIR,
CBS, EBS

The results table should contain entries for each test run, (Test #1
to Test #Tr).

Customer Stream Y1-Yn (see note), Compliant Traffic Test: BSA, LP,
OOS, PD, and PDV

Customer Stream Y1-Yn (see note), Excess Traffic Test: BSA
********************************************************

Note: For each test run, there will be two (2) rows for each customer
stream: the compliant traffic result and the excess traffic result.

6.1.2.2 Single Policer on All Physical Ports

Test Summary:
The second policer capacity test involves a single policer function
per physical port with all physical ports active. In this test, there
is a single policer per physical port. The policer can have one of
the rates r1, r2,..., rn. All the physical ports in the networking
device are active.

Procedure:
The procedure is identical to 6.1.1; the configured parameters must
be reported per port, and the test report must include results per
measured egress port.

6.1.2.3 Maximum Policers on All Physical Ports

Finally, the third policer capacity test involves a combination of
the first and second capacity tests, namely maximum policers active
per physical port and all physical ports active.

Procedure:
Use the procedural method from 6.1.2.1; the configured parameters
must be reported per port, and the test report must include
per-stream results per measured egress port.

6.2. Queue and Scheduler Tests

Queues and traffic scheduling are closely related in that a queue's
priority dictates the manner in which the traffic scheduler transmits
packets out of the egress port.
Since device queues / buffers are generally an egress function, this
test framework will discuss testing at the egress (although the
technique can be applied to ingress side queues).

Similar to the policing tests, the tests are divided into two
sections: individual queue/scheduler function tests and then full
capacity tests.

6.2.1 Queue/Scheduler Individual Tests Overview

The various types of scheduling techniques include FIFO, Strict
Priority (SP), and Weighted Fair Queueing (WFQ), along with other
variations. This test framework recommends testing a minimum of these
three techniques, although it is at the discretion of the tester to
benchmark other device scheduling algorithms.

6.2.1.1 Queue/Scheduler with Stateless Traffic Test

Objective:
Verify that the configured queue and scheduling technique can handle
stateless traffic bursts up to the queue depth.

Test Summary:
A network device queue is memory based, unlike a policing function,
which is token or credit based. However, the same concepts from
section 6.1 can be applied to testing network device queues.

The device's network queue should be configured to the desired size
in KB (queue length, QL) and then stateless traffic should be
transmitted to test this QL.

A queue should be able to handle repetitive bursts with the
transmission gaps proportional to the bottleneck bandwidth. This gap
is referred to as the transmission interval (Ti). Ti can be defined
for the traffic bursts and is based on the QL and the Bottleneck
Bandwidth (BB) of the egress interface.

   Ti = QL * 8 / BB

Note that this equation is similar to the Ti required for
transmission into a policer (QL = CBS, BB = CIR). Also note that the
burst hunt algorithm defined in section 5.1.1 can also be used to
automate the measurement of the queue value.

The stateless traffic burst shall be transmitted at the link speed
and spaced within the Ti time interval. The metrics defined in
section 4.1 shall be measured at the egress port and recorded; the
primary result is to verify the BSA and that no packets are dropped.

The scheduling function must also be characterized to benchmark the
device's ability to schedule the queues according to the priority. An
example would be 2 levels of priority including SP and FIFO queueing.
Under a flow load greater than the egress port speed, the higher
priority packets should be transmitted without drops (and also
maintain low latency), while packets in the lower priority (or best
effort) queue may be dropped.

Test Metrics:
The metrics defined in section 4.1 (BSA, LP, OOS, PD, and PDV) SHALL
be measured at the egress port and recorded.

Procedure:
1. Configure the DUT queue length (QL) and scheduling technique
   (FIFO, SP, etc.) parameters

2. Configure the tester to generate a stateless traffic burst equal
   to QL and an interval equal to Ti (QL in bits / BB)
3. Generate bursts of QL traffic into the DUT and measure the metrics
   defined in section 4.1 (LP, OOS, PD, and PDV) at the egress port
   and across the entire Td (default 30 seconds duration)

Report Format:
The Queue/Scheduler Stateless Traffic individual report MUST contain
all results for each QL/BB test run and a recommended format is as
follows:

********************************************************
Test Configuration Summary: Tr, Td

DUT Configuration Summary: Scheduling technique, BB and QL

The results table should contain entries for each test run as
follows, (Test #1 to Test #Tr).

- LP, OOS, PD, and PDV
********************************************************

6.2.1.2 Testing Queue/Scheduler with Stateful Traffic

Objective:
Verify that the configured queue and scheduling technique can handle
stateful traffic bursts up to the queue depth.

Test Background and Summary:
To provide a more realistic benchmark and to test queues in layer 4
devices such as firewalls, stateful traffic testing is recommended
for the queue tests. Stateful traffic tests will also utilize the
Network Delay Emulator (NDE) from the network set-up configuration in
section 1.2.

The BDP of the TCP test traffic must be calibrated to the QL of the
device queue. Referencing [RFC6349], the BDP is equal to:

   BB * RTT / 8 (in bytes)

The NDE must be configured to an RTT value which is large enough to
allow the BDP to be greater than QL. An example test scenario is
defined below:

   - Ingress link = GigE
   - Egress link = 100 Mbps (BB)
   - QL = 32KB

   RTT(min) = QL * 8 / BB and would equal 2.56 ms (and the BDP = 32KB)

In this example, one (1) TCP connection with window size / SSB of
32KB would be required to test the QL of 32KB. This Bulk Transfer
Test can be accomplished using iperf as described in Appendix A.

Two types of TCP tests MUST be performed: Bulk Transfer test and
Micro Burst Test Pattern as documented in Appendix B. The Bulk
Transfer Test only bursts during the TCP Slow Start (or Congestion
Avoidance) state, while the Micro Burst test emulates application
layer bursting which may occur any time during the TCP connection.

Other test types SHOULD include: Simple Web Site, Complex Web Site,
Business Applications, Email, SMB/CIFS File Copy (which are also
documented in Appendix B).

Test Metrics:
The test results will be recorded per the stateful metrics defined in
section 4.2, primarily the TCP Test Pattern Execution Time (TTPET),
TCP Efficiency, and Buffer Delay.

Procedure:

1. Configure the DUT queue length (QL) and scheduling technique
   (FIFO, SP, etc.) parameters

2. Configure the test generator* with a profile of an emulated
   application traffic mixture

   - The application mixture MUST be defined in terms of percentage
     of the total bandwidth to be tested

   - The rate of transmission for each application within the mixture
     MUST also be configurable

   * The test generator MUST be capable of generating precise TCP
     test patterns for each application specified, to ensure
     repeatable results.
3. Generate application traffic between the ingress (client side) and
   egress (server side) ports of the DUT and measure the application
   throughput metrics (TTPET, TCP Efficiency, and Buffer Delay) per
   application stream and at the ingress and egress ports (across the
   entire Td, default 60 seconds duration).

Concerning application measurements, a couple of items require
clarification. An application session may consist of a single TCP
connection or multiple TCP connections.

For the single TCP connection application sessions, the application
throughput / metrics have a 1-1 relationship to the TCP connection
measurements.

If an application session (i.e. HTTP-based application) utilizes
multiple TCP connections, then all of the TCP connections are
aggregated in the application throughput measurement / metrics for
that application.

Then there is the case of multiple instances of an application
session (i.e. multiple FTPs emulating multiple clients). In this
situation, the test should measure / record each FTP application
session independently, tabulating the minimum, maximum, and average
for all FTP sessions.

Finally, application throughput measurements are based on Layer 4 TCP
throughput and do not include bytes retransmitted. The TCP Efficiency
metric MUST be measured during the test and provides a measure of
"goodput" during each test.

Reporting Format:
The Queue/Scheduler Stateful Traffic individual report MUST contain
all results for each traffic scheduler and QL/BB test run and a
recommended format is as follows:

********************************************************
Test Configuration Summary: Tr, Td

DUT Configuration Summary: Scheduling technique, BB and QL

Application Mixture and Intensities: this is the percent configured
of each application type

The results table should contain entries for each test run with
minimum, maximum, and average per application session as follows,
(Test #1 to Test #Tr)

- Per Application Throughput (bps) and TTPET
- Per Application Bytes In and Bytes Out
- Per Application TCP Efficiency and Buffer Delay
********************************************************
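As an illustration of how the per-application entries in this table
might be computed from raw per-session measurements, the following is
a minimal sketch. The session values are hypothetical, and the
formulas are simply the TCP Efficiency and Buffer Delay definitions
from Section 4.2.

   # Sketch: tabulate stateful metrics for one application across its
   # sessions, using the Section 4.2 formulas. All numbers are
   # hypothetical illustrations, not reference results.

   # Per-session raw measurements: (TTPET seconds, transmitted bytes,
   # retransmitted bytes, average RTT seconds)
   sessions = [
       (6.5, 1_000_000, 2_000, 0.012),
       (5.0, 1_000_000, 0, 0.010),
       (7.9, 1_000_000, 5_000, 0.014),
   ]
   BASELINE_RTT_S = 0.010  # non-congested, inherent DUT RTT (example)

   ttpets = [s[0] for s in sessions]
   tx = sum(s[1] for s in sessions)
   retx = sum(s[2] for s in sessions)
   avg_rtt = sum(s[3] for s in sessions) / len(sessions)

   tcp_efficiency_pct = (tx - retx) / tx * 100
   buffer_delay_pct = (avg_rtt - BASELINE_RTT_S) / BASELINE_RTT_S * 100

   print(f"TTPET min/avg/max: {min(ttpets):.1f}/"
         f"{sum(ttpets) / len(ttpets):.1f}/{max(ttpets):.1f} s")
   print(f"TCP Efficiency: {tcp_efficiency_pct:.2f} %")
   print(f"Buffer Delay: {buffer_delay_pct:.1f} %")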
1223 There are many types of priority schemes and combinations of 1224 priorities that are managed by the scheduler. The following 1225 sections specify the priority schemes that should be tested.
1227 6.2.2.1.1 Strict Priority on Egress Port
1229 Test Summary:
1230 For this test, Strict Priority (SP) scheduling on the egress 1231 physical port should be tested, and the benchmarking methodology 1232 specified in sections 6.2.1.1 and 6.2.1.2 (procedure, metrics, 1233 and reporting format) should be applied here. For a given 1234 priority, each ingress physical port should get a fair share of 1235 the egress physical port bandwidth.
1237 Since this is a capacity test, the configuration and report 1238 results format from 6.2.1.1 and 6.2.1.2 MUST also include:
1240 Configuration:
1241 - The number of physical ingress ports active during the test
1242 - The classification marking (DSCP, VLAN, etc.) for each physical 1243 ingress port
1244 - The traffic rate for stateless traffic and the traffic rate 1245 / mixture for stateful traffic for each physical ingress port
1247 Report results:
1248 - For each ingress port traffic stream, the achieved throughput 1249 rate and metrics at the egress port
1251 6.2.2.1.2 Strict Priority + Weighted Fair Queue (WFQ) on Egress Port
1253 Test Summary:
1254 For this test, Strict Priority (SP) and Weighted Fair Queue (WFQ) 1255 should be enabled simultaneously in the scheduler but on a single 1256 egress port. The benchmarking methodology specified in sections 1258 6.2.1.1 and 6.2.1.2 (procedure, metrics, and reporting format) 1259 should be applied here. Additionally, the egress port bandwidth 1260 sharing among weighted queues should be proportional to the assigned 1261 weights. For a given priority, each ingress physical port should get 1262 a fair share of the egress physical port bandwidth.
1264 Since this is a capacity test, the configuration and report results 1265 format from 6.2.1.1 and 6.2.1.2 MUST also include:
1267 Configuration:
1268 - The number of physical ingress ports active during the test
1269 - The classification marking (DSCP, VLAN, etc.) for each physical 1270 ingress port
1271 - The traffic rate for stateless traffic and the traffic rate / 1272 mixture for stateful traffic for each physical ingress port
1274 Report results:
1275 - For each ingress port traffic stream, the achieved throughput rate 1276 and metrics at each queue of the egress port (both the SP 1277 and WFQ queues).
1279 Example:
1280 - Egress Port SP Queue: throughput and metrics for ingress streams 1281 1-n
1282 - Egress Port WFQ Queue: throughput and metrics for ingress streams 1283 1-n
1285 6.2.2.2 Single Queue per Port / All Ports Active
1287 Test Summary:
1288 Traffic from multiple ingress physical ports is directed to the 1289 same egress physical port, which will cause oversubscription on the 1290 egress physical port. Also, the same amount of traffic is directed 1291 to each egress physical port.
1293 The benchmarking methodology specified in sections 6.2.1.1 1294 and 6.2.1.2 (procedure, metrics, and reporting format) should be 1295 applied here. Each ingress physical port should get a fair share of 1296 the egress physical port bandwidth. Additionally, each egress 1297 physical port should receive the same amount of traffic.
1299 Since this is a capacity test, the configuration and report results 1300 format from 6.2.1.1 and 6.2.1.2 MUST also include:
1302 Configuration:
1303 - The number of ingress ports active during the test
1304 - The number of egress ports active during the test
1305 - The classification marking (DSCP, VLAN, etc.) for each physical 1306 ingress port
1307 - The traffic rate for stateless traffic and the traffic rate / 1308 mixture for stateful traffic for each physical ingress port
1310 Report results:
1311 - For each egress port, the achieved throughput rate and metrics at 1312 the egress port queue for each ingress port stream.
1314 Example:
1315 - Egress Port 1: throughput and metrics for ingress streams 1-n
1316 - Egress Port n: throughput and metrics for ingress streams 1-n
1318 6.2.2.3 Multiple Queues per Port, All Ports Active
1320 Traffic from multiple ingress physical ports is directed to all 1321 queues of each egress physical port, which will cause 1322 oversubscription on the egress physical ports. Also, the same 1323 amount of traffic is directed to each egress physical port.
1325 The benchmarking methodology specified in sections 6.2.1.1 1326 and 6.2.1.2 (procedure, metrics, and reporting format) should be 1327 applied here. For a given priority, each ingress physical port 1328 should get a fair share of the egress physical port bandwidth. 1329 Additionally, each egress physical port should receive the same 1330 amount of traffic.
1332 Since this is a capacity test, the configuration and report results 1333 format from 6.2.1.1 and 6.2.1.2 MUST also include:
1335 Configuration:
1336 - The number of physical ingress ports active during the test
1337 - The classification marking (DSCP, VLAN, etc.) for each physical 1338 ingress port
1339 - The traffic rate for stateless traffic and the traffic rate / 1340 mixture for stateful traffic for each physical ingress port
1342 Report results:
1343 - For each egress port, the achieved throughput rate and metrics at 1344 each egress port queue for each ingress port stream.
1346 Example:
1347 - Egress Port 1, SP Queue: throughput and metrics for ingress 1348 streams 1-n
1349 - Egress Port 2, WFQ Queue: throughput and metrics for ingress 1350 streams 1-n
1351 .
1352 .
1353 - Egress Port n, SP Queue: throughput and metrics for ingress 1354 streams 1-n
1355 - Egress Port n, WFQ Queue: throughput and metrics for ingress 1356 streams 1-n
1358 6.3. Shaper tests
1360 A traffic shaper is memory based like a queue, but with the added 1361 intelligence of an active traffic scheduler. The same concepts from 1362 section 6.2 (Queue testing) can be applied to testing network device 1363 shapers.
1365 Again, the tests are divided into two sections: individual shaper 1366 benchmark tests and then full capacity shaper benchmark tests.
1368 6.3.1 Shaper Individual Tests Overview
1370 A traffic shaper generally has three (3) components that can be 1371 configured:
1373 - Ingress Queue bytes
1374 - Shaper Rate, bps
1375 - Burst Committed (Bc) and Burst Excess (Be), bytes
1377 The Ingress Queue holds burst traffic and the shaper then meters 1378 traffic out of the egress port according to the Shaper Rate and 1379 Bc/Be parameters. Shapers generally transmit into policers, so 1380 the idea is for the emitted traffic to conform to the policer's 1381 limits.
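The interaction among these components can be illustrated with a short, informative calculation. The following Python sketch is not part of the methodology; the parameter values mirror the example used in section 6.3.1.1 below (an Ingress Queue of 512,000 bytes and an SR of 50 Mbps), and the function and variable names are assumptions made purely for illustration.

   # Informative sketch only: relate the shaper Ingress Queue and
   # Shaper Rate (SR) to the time needed to drain a queued burst and
   # to the minimum interval between equal-sized ingress bursts.
   INGRESS_QUEUE_BYTES = 512_000    # assumed Ingress Queue depth
   SHAPER_RATE_BPS = 50_000_000     # assumed Shaper Rate (SR)

   def drain_time_s(queued_bytes, sr_bps):
       # Time for the shaper to meter a queued burst onto the wire.
       return queued_bytes * 8 / sr_bps

   def min_burst_interval_s(burst_bytes, sr_bps):
       # Smallest interval between bursts of burst_bytes so that the
       # long-term offered rate does not exceed the Shaper Rate.
       return burst_bytes * 8 / sr_bps

   burst = INGRESS_QUEUE_BYTES      # a single burst filling the queue
   print("Drain time: %.4f s" % drain_time_s(burst, SHAPER_RATE_BPS))
   print("Minimum inter-burst interval: %.4f s"
         % min_burst_interval_s(burst, SHAPER_RATE_BPS))

With these assumed values, both results are approximately 0.082 seconds (82 ms), which matches the multiple-burst interval derived in section 6.3.1.1.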
1383 6.3.1.1 Testing Shaper with Stateless Traffic
1385 Objective:
1386 Test a shaper by transmitting stateless traffic bursts into the 1387 shaper ingress port and verifying that the egress traffic is shaped 1388 according to the shaper traffic profile.
1390 Test Summary:
1391 The stateless traffic must be burst into the DUT ingress port and 1392 not exceed the Ingress Queue. The burst can be a single burst or 1393 multiple bursts. If multiple bursts are transmitted, then the 1394 Ti (Time interval) must be large enough so that the Shaper Rate is 1395 not exceeded. An example will clarify single and multiple burst 1396 test cases.
1398 In the example, the shaper's ingress and egress ports are both full 1399 duplex Gigabit Ethernet. The Ingress Queue is configured to be 1400 512,000 bytes, the Shaper Rate (SR) = 50 Mbps, and both Bc and Be are 1401 configured to be 32,000 bytes. For a single burst test, the 1402 transmitting test device would burst 512,000 bytes maximum into the 1403 ingress port and then stop transmitting.
1405 If a multiple burst test is to be conducted, then the burst bytes 1406 divided by the time interval between the 512,000 byte bursts must 1407 not exceed the Shaper Rate. The time interval (Ti) must adhere to 1408 a similar formula as described in section 6.2.1.1 for queues, namely:
1410 Ti = Ingress Queue x 8 / Shaper Rate
1412 For the example from the previous paragraph, Ti between bursts must 1413 be greater than 82 milliseconds (512,000 bytes x 8 / 50,000,000 bps). 1414 This yields an average rate of 50 Mbps so that the Ingress Queue 1415 would not overflow.
1417 Test Metrics:
1418 - The metrics defined in section 4.1 (LP, OOS, PDV, SR, SBB, SBI) 1419 SHALL be measured at the egress port and recorded.
1421 Procedure:
1422 1. Configure the DUT shaper ingress queue length (QL) and shaper 1423 egress rate (SR, Bc, Be) parameters
1425 2. Configure the tester to generate a stateless traffic burst equal 1426 to QL and an interval equal to Ti (QL in bits / SR)
1428 3. Generate bursts of QL traffic into the DUT and measure the metrics 1429 defined in section 4.1 (LP, OOS, PDV, SR, SBB, SBI) at the egress 1430 port and across the entire Td (default 30 seconds duration)
1432 Report Format:
1433 The Shaper Stateless Traffic individual report MUST contain all 1434 results for each QL/SR test run and a recommended format is as 1435 follows:
1436 ********************************************************
1437 Test Configuration Summary: Tr, Td
1439 DUT Configuration Summary: Ingress Burst Rate, QL, SR
1441 The results table should contain entries for each test run as 1442 follows, (Test #1 to Test #Tr).
1444 - LP, OOS, PDV, SR, SBB, SBI
1445 ********************************************************
1447 6.3.1.2 Testing Shaper with Stateful Traffic
1449 Objective:
1450 Test a shaper by transmitting stateful traffic bursts into the shaper 1451 ingress port and verifying that the egress traffic is shaped 1452 according to the shaper traffic profile.
1454 Test Summary:
1455 To provide a more realistic benchmark and to test queues in layer 4 1456 devices such as firewalls, stateful traffic testing is also 1457 recommended for the shaper tests. Stateful traffic tests will also 1458 utilize the Network Delay Emulator (NDE) from the network set-up 1459 configuration in section 2.
1461 The BDP of the TCP test traffic must be calculated as described in 1461 section 6.2.1.2.
To properly stress network buffers and the traffic 1463 shaping function, the cumulative TCP window should exceed the BDP, 1464 which will stress the shaper. BDP factors of 1.1 to 1.5 are 1465 recommended, but the values are at the discretion of the tester and 1466 should be documented.
1468 The cumulative TCP Window Sizes* (RWND at the receiving end & CWND 1469 at the transmitting end) equate to:
1471 TCP window size* for each connection x number of connections
1473 * as described in section 3 of [RFC6349], the SSB MUST be large 1474 enough to fill the BDP
1476 For example, if the BDP is equal to 256 Kbytes and a window size of 1477 64 Kbytes is used for each connection, then it would require four (4) 1478 connections to fill the BDP and 5-6 connections (to oversubscribe the 1479 BDP) to stress test the traffic shaping function.
1481 Two types of TCP tests MUST be performed: the Bulk Transfer test and the 1482 Micro Burst Test Pattern, as documented in Appendix B. The Bulk 1483 Transfer Test only bursts during the TCP Slow Start (or Congestion 1484 Avoidance) state, while the Micro Burst test emulates application 1485 layer bursting which may occur at any time during the TCP connection.
1487 Other test types SHOULD include: Simple Web Site, Complex Web Site, 1488 Business Applications, Email, and SMB/CIFS File Copy (which are also 1489 documented in Appendix B).
1491 Test Metrics:
1492 The test results will be recorded per the stateful metrics defined in 1493 section 4.2, primarily the TCP Test Pattern Execution Time (TTPET), 1494 TCP Efficiency, and Buffer Delay.
1496 Procedure:
1497 1. Configure the DUT shaper ingress queue length (QL) and shaper 1498 egress rate (SR, Bc, Be) parameters
1500 2. Configure the test generator* with a profile of an emulated 1501 application traffic mixture
1503 - The application mixture MUST be defined in terms of percentage 1504 of the total bandwidth to be tested
1506 - The rate of transmission for each application within the mixture 1507 MUST also be configurable
1509 * The test generator MUST be capable of generating precise TCP 1510 test patterns for each application specified, to ensure repeatable 1511 results.
1513 3. Generate application traffic between the ingress (client side) and 1514 egress (server side) ports of the DUT and measure the metrics 1515 (TTPET, TCP Efficiency, and Buffer Delay) per application stream 1516 and at the ingress and egress port (across the entire Td, default 1517 30 seconds duration).
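The connection sizing described in the Test Summary above can be illustrated with a short, informative calculation. The following Python sketch is not part of the methodology; the bottleneck bandwidth, RTT, per-connection window, and the 1.25 oversubscription factor are assumed values chosen only to reproduce the 256 KB BDP example.

   # Informative sketch only: size the number of TCP connections so
   # that the cumulative TCP window fills, and then oversubscribes,
   # the BDP.  All values below are illustrative assumptions.
   import math

   def connections_needed(bb_bps, rtt_s, window_bytes, bdp_factor=1.0):
       # Connections whose cumulative window reaches bdp_factor x BDP.
       bdp_bytes = bb_bps * rtt_s / 8
       return math.ceil(bdp_factor * bdp_bytes / window_bytes)

   BB = 100_000_000   # bottleneck bandwidth, bps
   RTT = 0.020        # 20 ms round-trip time configured on the NDE
   WIN = 64_000       # per-connection TCP window (SSB), bytes

   print("BDP (bytes):", BB * RTT / 8)
   print("Connections to fill the BDP:",
         connections_needed(BB, RTT, WIN))
   print("Connections to oversubscribe (1.25 x BDP):",
         connections_needed(BB, RTT, WIN, bdp_factor=1.25))

With these assumed values, the sketch reproduces the example above: the BDP is roughly 250 KB, four connections fill it, and five connections oversubscribe it.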
1519 Reporting Format:
1520 The Shaper Stateful Traffic individual report MUST contain all 1521 results for each traffic scheduler and QL/SR test run and a 1522 recommended format is as follows:
1524 ********************************************************
1525 Test Configuration Summary: Tr, Td
1527 DUT Configuration Summary: Ingress Burst Rate, QL, SR
1529 Application Mixture and Intensities: this is the percent configured 1530 of each application type
1531 The results table should contain entries for each test run with 1532 minimum, maximum, and average per application session as follows, 1533 (Test #1 to Test #Tr)
1535 - Per Application Throughput (bps) and TTPET
1536 - Per Application Bytes In and Bytes Out
1537 - Per Application TCP Efficiency and Buffer Delay
1538 ********************************************************
1540 6.3.2 Shaper Capacity Tests
1542 Objective:
1543 The intent of these scalability tests is to verify shaper performance 1544 in a scaled environment with shapers active on multiple queues on 1545 multiple egress physical ports. This test will benchmark the maximum 1546 number of shapers as specified by the device manufacturer.
1548 The following sections provide the specific test scenarios, 1549 procedures, and reporting formats for each shaper capacity test.
1551 6.3.2.1 Single Queue Shaped, All Physical Ports Active
1553 Test Summary:
1554 The first shaper capacity test involves per-port shaping with all 1555 physical ports active. Traffic from multiple ingress physical ports 1556 is directed to the same egress physical port, which will cause 1557 oversubscription on the egress physical port. Also, the same amount 1558 of traffic is directed to each egress physical port.
1560 The benchmarking methodology specified in Section 6.3.1 (procedure, 1561 metrics, and reporting format) should be applied here. Since this is 1562 a capacity test, the configuration and report results format from 1563 6.3.1 MUST also include:
1565 Configuration:
1566 - The number of physical ingress ports active during the test
1567 - The classification marking (DSCP, VLAN, etc.) for each physical 1568 ingress port
1569 - The traffic rate for stateless traffic and the traffic rate / 1570 mixture for stateful traffic for each physical ingress port
1571 - The shaper parameters (QL, SR, Bc, Be) of each shaped egress port
1573 Report results:
1574 - For each active egress port, the achieved throughput rate and 1575 shaper metrics for each ingress port traffic stream
1577 Example:
1578 - Egress Port 1: throughput and metrics for ingress streams 1-n
1579 - Egress Port n: throughput and metrics for ingress streams 1-n
1581 6.3.2.2 All Queues Shaped, Single Port Active
1583 Test Summary:
1584 The second shaper capacity test is conducted with all queues actively 1585 shaping on a single physical port. The benchmarking methodology 1586 described in the per-port shaping test (previous section) serves as the 1587 foundation for this test. Additionally, each of the SP queues on the 1588 egress physical port is configured with a shaper. For the highest 1589 priority queue, the maximum amount of bandwidth available is limited 1590 by the bandwidth of the shaper. For the lower priority queues, the 1591 maximum amount of bandwidth available is limited by the bandwidth of 1592 the shaper and traffic in higher priority queues.
1594 The benchmarking methodology specified in Section 6.3.1 (procedure, 1595 metrics, and reporting format) should be applied here.
Since this is 1596 a capacity test, the configuration and report results format from 1597 6.3.1 MUST also include:
1599 Configuration:
1600 - The number of physical ingress ports active during the test
1601 - The classification marking (DSCP, VLAN, etc.) for each physical 1602 ingress port
1603 - The traffic rate for stateless traffic and the traffic rate/mixture 1604 for stateful traffic for each physical ingress port
1605 - For the active egress port, each shaper queue's parameters (QL, SR, 1606 Bc, Be)
1608 Report results:
1609 - For each queue of the active egress port, the achieved throughput 1610 rate and shaper metrics for each ingress port traffic stream
1612 Example:
1613 - Egress Port High Priority Queue: throughput and metrics for 1614 ingress streams 1-n
1615 - Egress Port Lower Priority Queue: throughput and metrics for 1616 ingress streams 1-n
1618 6.3.2.3 All Queues Shaped, All Ports Active
1620 Test Summary:
1621 The third shaper capacity test (which is a combination of the 1622 tests in the previous two sections) is conducted with all queues actively 1623 shaping and all physical ports active.
1625 The benchmarking methodology specified in Section 6.3.1 (procedure, 1626 metrics, and reporting format) should be applied here. Since this is 1627 a capacity test, the configuration and report results format from 1628 6.3.1 MUST also include:
1630 Configuration:
1631 - The number of physical ingress ports active during the test
1632 - The classification marking (DSCP, VLAN, etc.) for each physical 1633 ingress port
1634 - The traffic rate for stateless traffic and the traffic rate / 1635 mixture for stateful traffic for each physical ingress port
1636 - For each of the active egress ports, the shaper port and per-queue 1637 parameters (QL, SR, Bc, Be)
1639 Report results:
1640 - For each queue of each active egress port, the achieved throughput 1641 rate and shaper metrics for each ingress port traffic stream
1643 Example:
1644 - Egress Port 1 High Priority Queue: throughput and metrics for 1645 ingress streams 1-n
1646 - Egress Port 1 Lower Priority Queue: throughput and metrics for 1647 ingress streams 1-n
1648 .
1649 - Egress Port n High Priority Queue: throughput and metrics for 1650 ingress streams 1-n
1651 - Egress Port n Lower Priority Queue: throughput and metrics for 1652 ingress streams 1-n
1654 6.4 Concurrent Capacity Load Tests
1656 As mentioned in the scope of this document, it is impossible to 1657 specify the various permutations of concurrent traffic management 1658 functions that should be tested in a device for capacity testing. 1659 However, some profiles are listed below which may be useful 1660 to test under capacity as well:
1662 - Policers on ingress and queuing on egress
1663 - Policers on ingress and shapers on egress (this is not intended for a 1664 flow to be policed and then shaped; rather, these would be two different 1665 flows tested at the same time)
1666 - etc.
1668 The test procedures and reporting formats from the previous 1669 sections may be modified to accommodate the capacity test profile.
1671 7. Security Considerations
1673 Documents of this type do not directly affect the security of the 1674 Internet or of corporate networks as long as benchmarking is not 1675 performed on devices or systems connected to production networks.
1677 Further, benchmarking is performed on a "black-box" basis, relying 1678 solely on measurements observable external to the DUT/SUT.
1680 Special capabilities SHOULD NOT exist in the DUT/SUT specifically for 1681 benchmarking purposes.
Any implications for network security arising 1682 from the DUT/SUT SHOULD be identical in the lab and in production 1683 networks.
1685 8. IANA Considerations
1687 This document does not require an IANA registration for ports 1688 dedicated to the TCP testing described in this document.
1690 9. References
1692 9.1. Normative References
1694 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1695 Requirement Levels", BCP 14, RFC 2119, March 1997.
1697 [RFC1242] S. Bradner, "Benchmarking Terminology for Network 1698 Interconnection Devices," RFC 1242, July 1991.
1700 [RFC2544] S. Bradner, "Benchmarking Methodology for Network 1701 Interconnect Devices," RFC 2544, March 1999.
1703 [RFC3148] M. Mathis et al., "A Framework for Defining Empirical 1704 Bulk Transfer Capacity Metrics," RFC 3148, July 2001.
1706 [RFC5481] A. Morton et al., "Packet Delay Variation Applicability 1707 Statement," RFC 5481, March 2009.
1709 [RFC6703] A. Morton et al., "Reporting IP Network Performance 1710 Metrics: Different Points of View," RFC 6703, August 2012.
1712 [RFC2680] G. Almes et al., "A One-way Packet Loss Metric for IPPM," 1713 RFC 2680, September 1999.
1715 [RFC4689] S. Poretsky et al., "Terminology for Benchmarking 1716 Network-layer Traffic Control Mechanisms," RFC 4689, 1717 October 2006.
1719 [RFC4737] A. Morton et al., "Packet Reordering Metrics," RFC 4737, 1720 November 2006.
1722 [RFC4115] O. Aboul-Magd et al., "A Differentiated Service Two-Rate, 1723 Three-Color Marker with Efficient Handling of in-Profile Traffic," 1724 RFC 4115, July 2005.
1726 [RFC6349] B. Constantine et al., "Framework for TCP Throughput 1727 Testing," RFC 6349, August 2011.
1729 9.2. Informative References
1731 [RFC2697] J. Heinanen et al., "A Single Rate Three Color Marker," 1732 RFC 2697, September 1999.
1734 [RFC2698] J. Heinanen et al., "A Two Rate Three Color Marker," 1735 RFC 2698, September 1999.
1737 [AQM-RECO] F. Baker et al., "IETF Recommendations Regarding 1738 Active Queue Management," August 2014, 1739 https://datatracker.ietf.org/doc/draft-ietf-aqm- 1740 recommendation/
1742 [MEF-10.2] "MEF 10.2: Ethernet Services Attributes Phase 2," October 1743 2009, http://metroethernetforum.org/PDF_Documents/ 1744 technical-specifications/MEF10.2.pdf
1746 [MEF-12.1] "MEF 12.1: Carrier Ethernet Network Architecture 1747 Framework -- 1748 Part 2: Ethernet Services Layer - Base Elements," April 1749 2010, https://www.metroethernetforum.org/Assets/Technical 1750 _Specifications/PDF/MEF12.1.pdf
1752 [MEF-26] "MEF 26: External Network Network Interface (ENNI) - 1753 Phase 1," January 2010, http://www.metroethernetforum.org 1754 /PDF_Documents/technical-specifications/MEF26.pdf
1756 [MEF-14] "Abstract Test Suite for Traffic Management Phase 1," 1757 https://www.metroethernetforum.org/Assets 1758 /Technical_Specifications/PDF/MEF_14.pdf
1760 [MEF-19] "Abstract Test Suite for UNI Type 1", 1761 https://www.metroethernetforum.org/Assets 1762 /Technical_Specifications/PDF/MEF_19.pdf
1764 [MEF-37] "Abstract Test Suite for ENNI", 1765 https://www.metroethernetforum.org/Assets 1766 /Technical_Specifications/PDF/MEF_37.pdf
1768 Appendix A: Open Source Tools for Traffic Management Testing
1770 This framework specifies that stateless and stateful behaviors SHOULD 1771 both be tested. Some open source tools that can be used to 1772 accomplish many of the tests proposed in this framework are: 1773 iperf, netperf (with netperf-wrapper), uperf, Tmix, 1774 TCP-incast-generator, and D-ITG (Distributed Internet Traffic 1775 Generator).
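As one illustration of how such tools might be scripted for repeatable runs, the following Python sketch wraps iperf (described in the paragraphs below) using the subprocess module. It is informative only: the server address, rates, window size, and durations are placeholders, and the option syntax shown is that of iperf version 2, which may differ in other versions or in other tools.

   # Informative sketch only: drive iperf for one stateless (UDP) run
   # and one bulk-transfer (TCP) run.  Hosts, rates, window size, and
   # durations are placeholder assumptions.
   import subprocess

   SERVER = "192.0.2.10"   # placeholder address of the iperf server

   def run(cmd):
       # Echo the command, run it, and return its standard output.
       print("+", " ".join(cmd))
       result = subprocess.run(cmd, capture_output=True, text=True,
                               check=True)
       return result.stdout

   # Stateless test traffic: UDP at a 50 Mbps target rate for 30 s.
   udp_report = run(["iperf", "-c", SERVER, "-u", "-b", "50M",
                     "-t", "30"])

   # Stateful bulk transfer: 4 parallel TCP connections, 64 KB window.
   tcp_report = run(["iperf", "-c", SERVER, "-w", "64K", "-P", "4",
                     "-t", "30"])

   print(udp_report)
   print(tcp_report)

The matching receiver would be started separately (for example, an iperf server listening in the corresponding UDP or TCP mode) before the client-side script is run.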
1777 Iperf can generate UDP or TCP based traffic; a client and server must 1778 both run the iperf software in the same traffic mode. The server is 1779 set up to listen and then the test traffic is controlled from the 1780 client. Both uni-directional and bi-directional concurrent testing 1781 are supported.
1783 The UDP mode can be used for the stateless traffic testing. The 1784 target bandwidth, packet size, UDP port, and test duration can be 1785 controlled. A report of bytes transmitted, packets lost, and delay 1786 variation is provided by the iperf receiver.
1788 Iperf (TCP mode), TCP-incast-generator, and D-ITG can be used for 1789 stateful traffic testing to test bulk transfer traffic. The TCP 1790 Window size (which is actually the SSB), the number of connections, 1791 the packet size, TCP port, and the test duration can be controlled. 1792 A report of bytes transmitted and throughput achieved is provided 1793 by the iperf sender, while TCP-incast-generator and D-ITG provide 1794 even more statistics.
1796 Netperf is a software application that provides network bandwidth 1797 testing between two hosts on a network. It supports Unix domain 1798 sockets, TCP, SCTP, DLPI and UDP via BSD Sockets. Netperf provides 1799 a number of predefined tests, e.g. to measure bulk (unidirectional) 1800 data transfer or request/response performance 1801 (http://en.wikipedia.org/wiki/Netperf). Netperf-wrapper is a Python 1802 script that runs multiple simultaneous netperf instances and 1803 aggregates the results.
1805 uperf uses a description (or model) of an application mixture and 1806 the tool generates the load according to the model descriptor. uperf 1807 is more flexible than Netperf in its ability to generate request 1808 / response application behavior within a single TCP connection. The 1809 application model descriptor can be based on empirical data, but 1810 currently the import of packet captures is not directly supported.
1812 Tmix is another application traffic emulation tool and uses packet 1813 captures directly to create the traffic profile. The packet trace is 1814 'reverse compiled' into a source-level characterization, called a 1815 connection vector, of each TCP connection present in the trace. While 1816 most widely used in the ns-2 simulation environment, Tmix also runs on 1817 Linux hosts.
1819 These open source tools' traffic generation capabilities facilitate 1820 the emulation of the TCP test patterns which are discussed in 1821 Appendix B.
1823 Appendix B: Stateful TCP Test Patterns
1825 This framework recommends at a minimum the following TCP test 1826 patterns since they are representative of real-world application 1827 traffic (section 5.2.1 describes some methods to derive other 1828 application-based TCP test patterns).
1830 - Bulk Transfer: generate concurrent TCP connections whose aggregate 1831 number of in-flight data bytes would fill the BDP. Guidelines 1832 from [RFC6349] are used to create this TCP traffic pattern.
1834 - Micro Burst: generate precise burst patterns within a single or 1835 multiple TCP connection(s). The idea is for TCP to establish 1836 equilibrium and then burst application bytes at defined sizes. The 1837 test tool must allow the burst size and burst time interval to be 1838 configurable.
1840 - Web Site Patterns: The HTTP traffic model from 1841 "3GPP2 C.R1002-0 v1.0" is referenced (Table 4.1.3.2.1) to develop 1842 these TCP test patterns.
In summary, the HTTP traffic model consists 1843 of the following parameters:
1844 - Main object size (Sm)
1845 - Embedded object size (Se)
1846 - Number of embedded objects per page (Nd)
1847 - Client processing time (Tcp)
1848 - Server processing time (Tsp)
1850 Web site test patterns are illustrated with the following examples:
1852 - Simple Web Site: mimic the request / response and object 1853 download behavior of a basic web site (small company).
1854 - Complex Web Site: mimic the request / response and object 1855 download behavior of a complex web site (ecommerce site).
1857 Referencing the HTTP traffic model parameters, the following table 1858 was derived (by analysis and experimentation) for Simple and Complex 1859 Web site TCP test patterns:

1861                            Simple        Complex
1862  Parameter                 Web Site      Web Site
1863  -----------------------------------------------------
1864  Main object               Ave. = 10KB   Ave. = 300KB
1865  size (Sm)                 Min. = 100B   Min. = 50KB
1866                            Max. = 500KB  Max. = 2MB

1868  Embedded object           Ave. = 7KB    Ave. = 10KB
1869  size (Se)                 Min. = 50B    Min. = 100B
1870                            Max. = 350KB  Max. = 1MB

1872  Number of embedded        Ave. = 5      Ave. = 25
1873  objects per page (Nd)     Min. = 2      Min. = 10
1874                            Max. = 10     Max. = 50

1876  Client processing         Ave. = 3s     Ave. = 10s
1877  time (Tcp)*               Min. = 1s     Min. = 3s
1878                            Max. = 10s    Max. = 30s

1880  Server processing         Ave. = 5s     Ave. = 8s
1881  time (Tsp)*               Min. = 1s     Min. = 2s
1882                            Max. = 15s    Max. = 30s

1884 * The client and server processing time is distributed across the 1885 transmission / receipt of all of the main and embedded objects
1887 To be clear, the parameters in this table are reasonable guidelines 1888 for the TCP test pattern traffic generation. The test tool can use 1889 fixed parameters for simpler tests and mathematical distributions for 1890 more complex tests. However, the test pattern must be repeatable to 1891 ensure that the benchmark results can be reliably compared.
1893 - Inter-active Patterns: While Web site patterns are inter-active 1894 to a degree, they mainly emulate the downloading of various 1895 complexity web sites. Inter-active patterns are more chatty in 1896 nature since there is a lot of user interaction with the servers. 1897 Examples include business applications such as PeopleSoft and Oracle, 1898 and consumer applications such as Facebook, IM, etc. For the inter- 1899 active patterns, the packet capture technique was used to 1900 characterize some business applications and also the email 1901 application.
1903 In summary, an inter-active application can be described by the 1904 following parameters:
1905 - Client message size (Scm)
1906 - Number of Client messages (Nc)
1907 - Server response size (Srs)
1908 - Number of server messages (Ns)
1909 - Client processing time (Tcp)
1910 - Server processing Time (Tsp)
1911 - File size upload (Su)*
1912 - File size download (Sd)*
1914 * The file size parameters account for attachments uploaded or 1915 downloaded and may not be present in all inter-active applications
1917 Again using packet capture as a means to characterize, the following 1918 table reflects the guidelines for Simple Business Application, 1919 Complex Business Application, eCommerce, and Email Send / Receive:

1921                      Simple        Complex
1922  Parameter           Biz. App.     Biz. App      eCommerce*    Email
1923  --------------------------------------------------------------------
1924  Client message      Ave. = 450B   Ave. = 2KB    Ave. = 1KB    Ave. = 200B
1925  size (Scm)          Min. = 100B   Min. = 500B   Min. = 100B   Min. = 100B
1926                      Max. = 1.5KB  Max. = 100KB  Max. = 50KB   Max. = 1KB
1928  Number of client    Ave. = 10     Ave. = 100    Ave. = 20     Ave. = 10
1929  messages (Nc)       Min. = 5      Min. = 50     Min. = 10     Min. = 5
1930                      Max. = 25     Max. = 250    Max. = 100    Max. = 25

1932  Client processing   Ave. = 10s    Ave. = 30s    Ave. = 15s    Ave. = 5s
1933  time (Tcp)**        Min. = 3s     Min. = 3s     Min. = 5s     Min. = 3s
1934                      Max. = 30s    Max. = 60s    Max. = 120s   Max. = 45s

1936  Server response     Ave. = 2KB    Ave. = 5KB    Ave. = 8KB    Ave. = 200B
1937  size (Srs)          Min. = 500B   Min. = 1KB    Min. = 100B   Min. = 150B
1938                      Max. = 100KB  Max. = 1MB    Max. = 50KB   Max. = 750B

1940  Number of server    Ave. = 50     Ave. = 200    Ave. = 100    Ave. = 15
1941  messages (Ns)       Min. = 10     Min. = 25     Min. = 15     Min. = 5
1942                      Max. = 200    Max. = 1000   Max. = 500    Max. = 40

1944  Server processing   Ave. = 0.5s   Ave. = 1s     Ave. = 2s     Ave. = 4s
1945  time (Tsp)**        Min. = 0.1s   Min. = 0.5s   Min. = 1s     Min. = 0.5s
1946                      Max. = 5s     Max. = 20s    Max. = 10s    Max. = 15s

1948 Simple Business Application, Complex Business Application, eCommerce, and Email Send / Receive 1949 (continued):

1951                      Simple        Complex
1952  Parameter           Biz. App.     Biz. App      eCommerce*    Email
1953  --------------------------------------------------------------------
1954  File size           Ave. = 50KB   Ave. = 100KB  Ave. = N/A    Ave. = 100KB
1955  upload (Su)         Min. = 2KB    Min. = 10KB   Min. = N/A    Min. = 20KB
1956                      Max. = 200KB  Max. = 2MB    Max. = N/A    Max. = 10MB

1958  File size           Ave. = 50KB   Ave. = 100KB  Ave. = N/A    Ave. = 100KB
1959  download (Sd)       Min. = 2KB    Min. = 10KB   Min. = N/A    Min. = 20KB
1960                      Max. = 200KB  Max. = 2MB    Max. = N/A    Max. = 10MB

1962 * eCommerce used a combination of packet capture techniques and 1963 reference traffic flows from "SPECweb2009" (need proper reference)
1964 ** The client and server processing time is distributed across the 1965 transmission / receipt of all of the messages. Client processing time 1966 consists mainly of the delay between user interactions (not machine 1967 processing).
1969 And again, the parameters in this table are the guidelines for the 1970 TCP test pattern traffic generation. The test tool can use fixed 1971 parameters for simpler tests and mathematical distributions for more 1972 complex tests. However, the test pattern must be repeatable to 1973 ensure that the benchmark results can be reliably compared.
1975 - SMB/CIFS File Copy: mimic a network file copy, both read and write. 1976 As opposed to FTP, which is a bulk transfer and is only flow 1977 controlled via TCP, SMB/CIFS divides a file into application blocks 1978 and utilizes application-level handshaking in addition to 1979 TCP flow control.
1981 In summary, an SMB/CIFS file copy can be described by the following 1982 parameters:
1983 - Client message size (Scm)
1984 - Number of client messages (Nc)
1985 - Server response size (Srs)
1986 - Number of Server messages (Ns)
1987 - Client processing time (Tcp)
1988 - Server processing time (Tsp)
1989 - Block size (Sb)
1991 The client and server messages are SMB control messages. The Block 1992 size is the data portion of the file transfer.
1994 Again using packet capture as a means to characterize, the following 1995 table reflects the guidelines for SMB/CIFS file copy:

1997                      SMB
1998  Parameter           File Copy
1999  ------------------------------
2000  Client message      Ave. = 450B
2001  size (Scm)          Min. = 100B
2002                      Max. = 1.5KB
2003  Number of client    Ave. = 10
2004  messages (Nc)       Min. = 5
2005                      Max. = 25
2006  Client processing   Ave. = 1ms
2007  time (Tcp)          Min. = 0.5ms
2008                      Max. = 2ms
2009  Server response     Ave. = 2KB
2010  size (Srs)          Min. = 500B
2011                      Max. = 100KB
2012  Number of server    Ave. = 10
2013  messages (Ns)       Min. = 10
2014                      Max. = 200
2015  Server processing   Ave. = 1ms
2016  time (Tsp)          Min. = 0.5ms
2017                      Max. = 2ms
2018  Block               Ave. = N/A
2019  Size (Sb)*          Min. = 16KB
2020                      Max. = 128KB

2022 * Depending upon the tested file size, the block size will be 2023 transferred n times to complete the file transfer. An example 2024 would be a 10 MB file test and a 64KB block size. In this case, 160 2025 blocks would be transferred after the control channel is opened 2026 between the client and server.
2028 Acknowledgments
2030 We would like to thank Al Morton for his continuous review and 2031 invaluable input to the document. We would also like to thank 2032 Scott Bradner for providing guidance early in the draft's 2033 conception in the area of the benchmarking scope of traffic management 2034 functions. Additionally, we would like to thank Tim Copley for his 2035 original input and David Taht, Gory Erg, and Toke Hoiland-Jorgensen for 2036 their review and input for the AQM group. And for the formal reviews 2037 of this document, we would like to thank Gilles Forget, 2038 Vijay Gurbani, Reinhard Schrage, and Bhuvaneswaran Vengainathan.
2040 Authors' Addresses
2042 Barry Constantine
2043 JDSU, Test and Measurement Division
2044 Germantown, MD 20876-7100, USA
2045 Phone: +1 240 404 2227
2046 Email: barry.constantine@jdsu.com
2048 Ram Krishnan
2049 Brocade Communications
2050 San Jose, 95134, USA
2051 Phone: +001-408-406-7890
2052 Email: ramk@brocade.com