Benchmarking Methodology Working Group B. Balarajah Internet-Draft C. Rossenhoevel Intended status: Informational EANTC AG Expires: September 23, 2018 March 22, 2018 Benchmarking Methodology for Network Security Device Performance draft-balarajah-bmwg-ngfw-performance-03 Abstract This document provides benchmarking terminology and methodology for next-generation network security devices including next-generation firewalls (NGFW), intrusion detection and prevention solutions (IDS/ IPS) and unified threat management (UTM) implementations. The document aims to strongly improve the applicability, reproducibility and transparency of benchmarks and to align the test methodology with today's increasingly complex layer 7 application use cases. The main areas covered in this document are test terminology, traffic profiles and benchmarking methodology for NGFWs to start with. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on September 23, 2018. Copyright Notice Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect Balarajah & RossenhoeveExpires September 23, 2018 [Page 1] Internet-Draft Benchmarking for NGFW performance March 2018 to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 3 3. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 4. Test Setup . . . . . . . . . . . . . . . . . . . . . . . . . 4 4.1. Testbed Configuration . . . . . . . . . . . . . . . . . . 4 4.2. DUT/SUT Configuration . . . . . . . . . . . . . . . . . . 5 4.3. Test Equipment Configuration . . . . . . . . . . . . . . 8 4.3.1. Client Configuration . . . . . . . . . . . . . . . . 9 4.3.2. Backend Server Configuration . . . . . . . . . . . . 10 4.3.3. Traffic Flow Definition . . . . . . . . . . . . . . . 11 4.3.4. Traffic Load Profile . . . . . . . . . . . . . . . . 12 5. Test Bed Considerations . . . . . . . . . . . . . . . . . . . 13 6. Reporting . . . . . . . . . . . . . . . . . . . . . . . . . . 14 6.1. Key Performance Indicators . . . . . . . . . . . . . . . 15 7. Benchmarking Tests . . . . . . . . . . . . . . . . . . . . . 16 7.1. Throughput Performance With NetSecOPEN Traffic Mix . . . 16 7.1.1. Objective . . . . . . . . . . . . . . . . . . . . . . 16 7.1.2. Test Setup . . . . . . . . . . . . . . . . . . . . . 17 7.1.3. Test Parameters . . . . . . . . . . . . . . . . . . . 17 7.1.4. Test Procedures and expected Results . . . . . . . . 19 7.2. Concurrent TCP Connection Capacity With HTTP Traffic . . 20 7.2.1. Objective . . . . . . . . . . . . . . . . . . . . . . 20 7.2.2. Test Setup . . . . . . . . . . . . . . . . . . . . . 20 7.2.3. Test Parameters . . . . . . . . . . . . . . . . . . . 20 7.2.4. Test Procedures and expected Results . . . . . . . . 22 7.3. TCP/HTTP Connections Per Second . . . . . . . . . . . . . 23 7.3.1. Objective . . . . . . . . . . . . . . . . . . . . . . 23 7.3.2. Test Setup . . . . . . . . . . . . . . . . . . . . . 24 7.3.3. Test Parameters . . . . . . . . . . . . . . . . . . . 24 7.3.4. Test Procedures and Expected Results . . . . . . . . 25 7.4. HTTP Transactions Per Second . . . . . . . . . . . . . . 26 7.4.1. Objective . . . . . . . . . . . . . . . . . . . . . . 26 7.5. HTTP Throughput . . . . . . . . . . . . . . . . . . . . . 27 7.5.1. Objective . . . . . . . . . . . . . . . . . . . . . . 27 7.6. HTTP Transaction Latency . . . . . . . . . . . . . . . . 27 7.6.1. Objective . . . . . . . . . . . . . . . . . . . . . . 27 7.7. Concurrent SSL/TLS Connection Capacity . . . . . . . . . 27 7.7.1. Objective . . . . . . . . . . . . . . . . . . . . . . 27 7.8. SSL/TLS Handshake Rate . . . . . . . . . . . . . . . . . 27 7.8.1. Objective . . . . . . . . . . . . . . . . . . . . . . 27 7.8.2. Test Setup . . . . . . . . . . . . . . . . . . . . . 28 Balarajah & RossenhoeveExpires September 23, 2018 [Page 2] Internet-Draft Benchmarking for NGFW performance March 2018 7.8.3. Test Parameters . . . . . . . . . . . . . . . . . . . 28 7.8.4. Test Procedures and expected Results . . . . . . . . 30 7.9. HTTPS Transaction Per Second . . . . . . . . . . . . . . 31 7.9.1. Objective . . . . . . . . . . . . . . . . . . . . . . 31 7.10. HTTPS Throughput . . . . . . . . . . . . . . . . . . . . 31 7.10.1. Objective . . . . . . . . . . . . . . . . . . . . . 31 7.11. HTTPS Transaction Latency . . . . . . . . . . . . . . . . 31 7.11.1. Objective . . . . . . . . . . . . . . . . . . . . . 31 8. Formal Syntax . . . . . . . . . . . . . . . . . . . . . . . . 31 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31 10. Security Considerations . . . . . . . . . . . . . . . . . . . 32 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 32 12. Normative References . . . . . . . . . . . . . . . . . . . . 32 Appendix A. NetSecOPEN Basic Traffic Mix . . . . . . . . . . . . 32 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 40 1. Introduction 15 years have passed since IETF recommended test methodology and terminology for firewalls initially (RFC 2647, RFC 3511). The requirements for network security element performance and effectiveness have increased tremendously since then. Security function implementations have evolved to more advanced areas and have diversified into intrusion detection and prevention, threat management, analysis of encrypted traffic, etc. In an industry of growing importance, well-defined and reproducible key performance indicators (KPIs) are increasingly needed: They enable fair and reasonable comparison of network security functions. All these reasons have led to the creation of a new next-generation firewall benchmarking document. 2. Requirements The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119] . 3. Scope This document provides testing terminology and testing methodology next-generation firewalls and related security functions. It covers two main areas: Performance benchmarks and security effectiveness testing. The document focuses on advanced, realistic, and reproducible testing methods. Additionally it describes test bed environments, test tool requirements and test result formats. Balarajah & RossenhoeveExpires September 23, 2018 [Page 3] Internet-Draft Benchmarking for NGFW performance March 2018 4. Test Setup Test setup defined in this document will be applicable to all of the benchmarking test cases described in Section 7. 4.1. Testbed Configuration Testbed configuration MUST ensure that any performance implications that are discovered during the benchmark testing aren't due to the inherent physical network limitations such as number of physical links and forwarding performance capabilities (throughput and latency) of the network devise in the testbed. For this reason, this document recommends to avoid external devices such as switch and router in the testbed as possible. In the typical deployment, the security devices (DUT/SUT) will not have a large number of entries in MAC or ARP tables, which impact the actual DUT/SUT performance due to MAC and ARP table lookup processes. Therefore, depend on number of used IP address in client and server side, it is recommended to connect Layer 3 device(s) between test equipment and DUT/SUT as shown in Figure 1. If the test equipment is capable to emulate layer 3 routing functionality and there is no need for test equipment ports aggregation, it is recommended to configure the test setup as shown in Figure 2. +-------------------+ +-----------+ +--------------------+ |Aggregation Switch/| | | | Aggregation Switch/| | Router +------+ DUT/SUT +------+ Router | | | | | | | +----------+--------+ +-----------+ +--------+-----------+ | | | | +-----------+-----------+ +-----------+-----------+ | | | | | +-------------------+ | | +-------------------+ | | | Emulated Router(s)| | | | Emulated Router(s)| | | | (Optional) | | | | (Optional) | | | +-------------------+ | | +-------------------+ | | +-------------------+ | | +-------------------+ | | | Clients | | | | Servers | | | +-------------------+ | | +-------------------+ | | | | | | Test Equipment | | Test Equipment | +-----------------------+ +-----------------------+ Figure 1: Testbed Setup - Option 1 Balarajah & RossenhoeveExpires September 23, 2018 [Page 4] Internet-Draft Benchmarking for NGFW performance March 2018 +-----------------------+ +-----------------------+ | +-------------------+ | +-----------+ | +-------------------+ | | | Emulated Router(s)| | | | | | Emulated Router(s)| | | | (Optional) | +----- DUT/SUT +-----+ (Optional) | | | +-------------------+ | | | | +-------------------+ | | +-------------------+ | +-----------+ | +-------------------+ | | | Clients | | | | Servers | | | +-------------------+ | | +-------------------+ | | | | | | Test Equipment | | Test Equipment | +-----------------------+ +-----------------------+ Figure 2: Testbed Setup - Option 2 4.2. DUT/SUT Configuration An unique DUT/SUT configuration MUST be used for all of the benchmarking tests described in Section 7. Since each DUT/SUT will have their own unique configuration, users SHOULD configure their device with the same parameters that would be used in the actual deployment of the device or a typical deployment. Also it is mandatory to enable all the security features on the DUT/SUT in order to achieve maximum security coverage for a specific deployment scenario. This document attempts to define the recommended security features which SHOULD be consistently enabled for all test cases. The table below describes the recommended sets of feature list which SHOULD be configured on the DUT/SUT. In order to improve repeatability, a summary of the DUT configuration including description of all enabled DUT/SUT features MUST be published with the benchmarking results. Balarajah & RossenhoeveExpires September 23, 2018 [Page 5] Internet-Draft Benchmarking for NGFW performance March 2018 +----------------------------------------------------+ | Device | +---------------------------------+---+---+---+------+ | | | | | | SSL | | NGFW |NGIPS|ADC|WAF|BPS|Broker| +-------------------------------------------------------------------+ | | |Included |Added to| Future test standards | |DUT Features |Feature|in initial|future | to be developed | | | |Scope |Scope | | +------------------------------------------------+---+---+---+------+ |SSL Inspection| x | | x | | | | | | +-------------------------------------------------------------------+ |IDS/IPS | x | x | | | | | | | +-------------------------------------------------------------------+ |Web Filtering | x | | x | | | | | | +-------------------------------------------------------------------+ |Antivirus | x | x | | | | | | | +-------------------------------------------------------------------+ |Anti Spyware | x | x | | | | | | | +-------------------------------------------------------------------+ |Anti Botnet | x | x | | | | | | | +-------------------------------------------------------------------+ |DLP | x | | x | | | | | | +-------------------------------------------------------------------+ |DDoS | x | | x | | | | | | +-------------------------------------------------------------------+ |Certificate | x | | x | | | | | | |Validation | | | | | | | | | +-------------------------------------------------------------------+ |Logging and | x | x | | | | | | | |Reporting | | | | | | | | | +-------------------------------------------------------------------+ |Application | x | x | | | | | | | |Identification| | | | | | | | | +----------------------+----------+--------+-----+---+---+---+------+ Table 1: DUT/SUT Feature List In addition, it is also recommended to configure a realistic number of access policy rules on the DUT/SUT. This document determines the number of access policy rules for three different class of DUT/SUT. The classification of the DUT/SUT MAY be based on its maximum supported throughput performance number. This document classifies the DUT/SUT in three different categories; namely small, medium and maximum. The recommended throughput values for the following classes are; Balarajah & RossenhoeveExpires September 23, 2018 [Page 6] Internet-Draft Benchmarking for NGFW performance March 2018 Small - supported throughput less than 5Gbit/s Medium - supported throughput greater than 5Gbit/s and less than 10Gbit/s Large - supported throughput greater than 10Gbit/s The access rule defined in the table 2 MUST be configured from top to bottom in correct order. The configured access policy rule MUST NOT block the test traffic used for the benchmarking test scenario. Balarajah & RossenhoeveExpires September 23, 2018 [Page 7] Internet-Draft Benchmarking for NGFW performance March 2018 +------------------------------------------------+------------------+ | | DUT/SUT | | | Classification | | | # Rules | +-----------+-----------+-----------------+------------+------+-----+ | | Match | | | | | | | Rules Type| Criteria| Description |Action|Small|Medium|Large| +-------------------------------------------------------------------+ |Application|Application|Any application |block | 10 | 20 | 50 | |layer | |traffic NOT | | | | | | | |included in the | | | | | | | |test traffic | | | | | +-------------------------------------------------------------------+ |Transport |Src IP and |Any src IP use in|block | 50 | 100 | 250 | |layer |TCP/UDP |the test AND any | | | | | | |Dst ports |dst ports NOT | | | | | | | |used in the test | | | | | | | |traffic | | | | | +-------------------------------------------------------------------+ |IP layer |Src/Dst IP |Any src/dst IP |block | 50 | 100 | 250 | | | |NOT used in the | | | | | | | |test | | | | | +-------------------------------------------------------------------+ |Application|Application|Applications |allow | 10 | 10 | 10 | |layer | |included in the | | | | | | | |test traffic | | | | | +-------------------------------------------------------------------+ |Transport |Src IP and |Half of the src |allow | 1 | 1 | 1 | |layer |TCP/UDP |IP used in the | | | | | | |Dst ports |test AND any dst | | | | | | | |ports used in the| | | | | | | |test traffic. One| | | | | | | |rule per subnet | | | | | +-------------------------------------------------------------------+ |IP layer |Src IP |The rest of the |allow | 1 | 1 | 1 | | | |src IP subnet | | | | | | | |range used in the| | | | | | | |test. One rule | | | | | | | |per subnet | | | | | +-----------+-----------------------------+------+-----+------+-----+ Table 2: DUT/SUT Access List 4.3. Test Equipment Configuration In general, test equipment allows configuring parameters in different protocol level. These parameters thereby influencing the traffic flows which will be offered and impacting performance measurements. Balarajah & RossenhoeveExpires September 23, 2018 [Page 8] Internet-Draft Benchmarking for NGFW performance March 2018 This document attempts to explicitly specify which test equipment parameters SHOULD be configurable, any such parameter(s) MUST be noted in the test report. 4.3.1. Client Configuration This section specifies which parameters SHOULD be considerable while configuring emulated clients using test equipment. Also this section specifies the recommended values for certain parameters. 4.3.1.1. TCP Stack Attributes The TCP stack SHOULD use a TCP Reno variant, which include congestion avoidance, back off and windowing, retransmission and recovery on every TCP connection between client and server endpoints. The default IPv4 and IPv6 MSS segments size MUST be set to 1460 bytes and 1440 bytes and a TX and RX receive windows of 32768 bytes. Delayed ACKs are permitted, but it SHOULD be limited to either a 200 msec delay timeout or 3000 in bytes before a forced ACK. Up to 3 retries SHOULD be allowed before a timeout event is declared. All traffic MUST set the TCP PSH flag to high. The source port range SHOULD be in the range of 1024 - 65535. Internal timeout SHOULD be dynamically scalable per RFC 793. 4.3.1.2. Client IP Address Space The sum of the client IP space SHOULD contain the following attributes. The traffic blocks SHOULD consist of multiple unique, continuous static address blocks. A default gateway is permitted. The IPv4 ToS byte should be set to '00'. The following equation can be used to determine the required total number of client IP address. Desired total number of client IP = Target throughput [Mbit/s] / Throughput per IP address [Mbit/s] (Idea 1) 6-7 Mbps per IP= 1,400-1,700 IPs per 10Gbit/s throughput (Idea 2) 0.1-0.2 Mbps per IP = 50,000-100,000 IPs per 10Gbit/s throughput Based on deployment and usecase scenario, client IP addresses SHOULD be distributed between IPv4 and IPv6 type. This document recommends using the following ratio(s) between IPv4 and IPv6: (Idea 1) 100 % IPv4, no IPv6 Balarajah & RossenhoeveExpires September 23, 2018 [Page 9] Internet-Draft Benchmarking for NGFW performance March 2018 (Idea 2) 80 % IPv4, 20 % IPv6 (Idea 3) 50 % IPv4, 50 % IPv6 (Idea 4) 0 % IPv4, 100 % IPv6 4.3.1.3. Emulated Web Browser Attributes The emulated web browser contains attributes that will materially affect how traffic is loaded. The objective is to emulate a modern, typical browser attributes to improve realism of the result set. For HTTP traffic emulation, the emulated browser must negotiate HTTP 1.1. HTTP persistency MAY be enabled depend on test scenario. The browser CAN open multiple TCP connections per Server endpoint IP at any time depending on how many sequential transactions are needed to be processed. Within the TCP connection multiple transactions can be processed if the emulated browser has available connections. The browser MUST advertise a User-Agent header. Headers will be sent uncompressed. The browser should enforce content length validation. For encrypted traffic, the following attributes shall define the negotiated encryption parameters. The tests must use TLSv1.2 or higher with a record size of 16383, commonly used cipher suite and key strength. Session reuse or ticket resumption may be used for subsequent connections to the same Server endpoint IP. The client endpoint must send TLS Extension SNI information when opening up a security tunnel. Server certificate validation should be disabled. Server certificate validation should be disabled. Cipher suite and certificate size should be defined in the parameter session of benchmarking tests. 4.3.2. Backend Server Configuration This document attempts to specify which parameters should be considerable while configuring emulated backend servers using test equipment. 4.3.2.1. TCP Stack Attributes The TCP stack SHOULD use a TCP Reno variant, which include congestion avoidance, back off and windowing, retransmission and recovery on every TCP connection between client and server endpoints. The default IPv4 MSS segment size MUST be set to 1460 bytes and a TX and RX receive windows of at least 32768 bytes. Delayed ACKs are permitted but SHOULD be limited to either a 200 msec delay timeout or 3000 in bytes before a forced ACK. Up to 3 retries SHOULD be allowed before a timeout event is declared. All traffic MUST set the TCP PSH Balarajah & RossenhoeveExpires September 23, 2018 [Page 10] Internet-Draft Benchmarking for NGFW performance March 2018 flag to high. The source port range SHOULD be in the range of 1024 - 65535. Internal timeout should be dynamically scalable per RFC 793. 4.3.2.2. Server Endpoint IP Addressing The server IP blocks should consist of unique, continuous static address blocks with one IP per Server FQDN endpoint per test port. The IPv4 ToS byte should be set to '00'. The source mac address of the server endpoints shall be the same emulating routed behavior. Each Server FQDN should have it's own unique IP address. The Server IP addressing should be fixed to the same number of FQDN entries. 4.3.2.3. HTTP / HTTPS Server Pool Endpoint Attributes The emulated server pool for HTTP should listen on TCP port 80 and emulated HTTP version 1.1 with persistence. For HTTPS server, the pool must have the same basic attributes of an HTTP server pool plus attributes for SSL/TLS. The server must advertise a server type. For HTTPS server, TLS 1.2 or higher must be used with a record size of 16383 bytes and ticket resumption or Session ID reuse enabled. The server must listen on port TCP 443. The server shall serve a certificate to the client. It is required that the HTTPS server also check Host SNI information with the Fully Qualified Domain Name (FQDN). Client certificate validation should be disabled. Cipher suite and certificate size should be defined in the parameter session of benchmarking tests. 4.3.3. Traffic Flow Definition The section describes the traffic pattern between the client and server endpoints. At the beginning of the test, the server endpoint initializes and will be in a ready to accept connection state including initialization of the TCP stack as well as bound HTTP and HTTPS servers. When a client endpoint is needed, it will initialize and be given attributes such as the MAC and IP address. The behavior of the client is to sweep though the given server IP space, sequentially generating a recognizable service by the DUT. Thus, a balanced, mesh between client endpoints and server endpoints will be generated in a client port server port combination. Each client endpoint performs the same actions as other endpoints, with the difference being the source IP of the client endpoint and the target server IP pool. The client shall use Fully Qualified Domain Names in Host Headers and for TLS 1.2 Server Name Indication (SNI). Balarajah & RossenhoeveExpires September 23, 2018 [Page 11] Internet-Draft Benchmarking for NGFW performance March 2018 4.3.3.1. Description of Intra-Client Behavior Client endpoints are independent of other clients that are concurrently executing. When a client endpoint initiate traffic, this section will describe how the steps though different services. Once initialized, the user should randomly hold (perform no operation) for a few milliseconds to allow for better randomization of start of client traffic. The client will then either open up a new TCP connection or connect to a TCP persistence stack still open to that specific server. At any point that the service profile may require encryption, a TLS 1.2 encryption tunnel will form presenting the URL request to the server. The server will then perform an SNI name check with the proposed FQDN compared to the domain embedded in the certificate. Only when correct, will the server process the object. The initial object to the server may not have a fixed size; its size is based on benchmarking tests described in Section 7. Multiple additional sub-URLs (Objects on the service page) may be requested simultaneously. This may or may not be to the same server IP as the initial URL. Each sub-object will also use a conical FQDN and URL path, as observed in the traffic mix used. 4.3.4. Traffic Load Profile The loading of traffic will be described in this section. The loading of an traffic load profile has five distinct phases: Init, ramp up, sustain, ramp down/close, and collection. Within the Init phase, test bed devices including the client and server endpoints should negotiate layer 2-3 connectivity such as MAC learning and ARP. Only after successful MAC learning or ARP resolution shall the test iteration move to the next phase. No measurements are made in this phase. The minimum recommended time for init phase is 5 seconds. During this phase the emulated clients SHOULD NOT initiate any sessions with the DUT/SUT, in contrast, the emulated servers should be ready to accept requests from DUT/SUT or from emulated clients. In the ramp up phase, the test equipment should start to generate the test traffic. It should use a set approximate number of unique client IP addresses actively to generate traffic. The traffic should ramp from zero to desired target objective. The target objective will be defined for each benchmarking test. The duration for the ramp up phase must be configured long enough, so that the test equipment do not overwhelm DUT/SUT's supported performance metrics namely; connection setup rate, concurrent connection and application transaction. The recommended time duration for the ramp up phase is 180- 300 seconds. No measurements are made in this phase. Balarajah & RossenhoeveExpires September 23, 2018 [Page 12] Internet-Draft Benchmarking for NGFW performance March 2018 In the sustain phase, the test equipment should keep to generate traffic t constant target value for a constant number of active client IPs. The recommended time duration for sustain phase is 600 seconds. This is the phase where measurements occur. In the ramp down/close phase, no new connection is established and no measurements are made. The recommend duration of this phase is between 180 to 300 seconds. The last phase is administrative and will be when the tester merges and collates the report data. 5. Test Bed Considerations This section recommends steps to control the test environment and test equipment, specifically focusing on virtualized environments and virtualized test equipment. 1. Ensure that any ancillary switching or routing functions between the system under test and the test equipment do not limit the performance of the traffic generator. This is specifically important for virtualized components (vSwitches, vRouters). 2. Verify that the performance of the test equipment matches and reasonably exceeds the expected maximum performance of the system under test. 3. Assert that the test bed characteristics are stable during the whole test session. A number of factors might influence stability specifically for virtualized test beds, for example additional work loads in a virtualized system, load balancing and movement of virtual machines during the test, or simple issues such as additional heat created by high workloads leading to an emergency CPU performance reduction. Test bed reference pre-tests help to ensure that the desired traffic generator aspects such as maximum throughput and the network performance metrics such as maximum latency and maximum packet loss are met. Once the desired maximum performance goals for the system under test have been identified, a safety margin of 10 % SHOULD be added for throughput and subtracted for maximum latency and maximum packet loss. Test bed preparation may be performed either by configuring the DUT in the most trivial setup (fast forwarding) or without presence of DUT. Balarajah & RossenhoeveExpires September 23, 2018 [Page 13] Internet-Draft Benchmarking for NGFW performance March 2018 6. Reporting This section describes how the final report should be formatted and presented. The final test report may have two major sections; Introduction and result sections. The following attributes should be present in the introduction section of the test report. 1. The name of the NetSecOPEN traffic mix (see Appendix A) must be prominent. 2. The time and date of the execution of the test must be prominent. 3. Summary of testbed software and Hardware details A. DUT Hardware/Virtual Configuration + This section should clearly identify the make and model of the DUT + iThe port interfaces, including speed and link information must be documented. + If the DUT is a virtual VNF, interface acceleration such as DPDK and SR-IOV must be documented as well as cores used, RAM used, and the pinning / resource sharing configuration. The Hypervisor and version must be documented. + Any additional hardware relevant to the DUT such as controllers must be documented B. DUT Software + The operating system name must be documented + The version must be documented + The specific configuration must be documented C. DUT Enabled Features + Specific features, such as logging, NGFW, DPI must be documented + iAttributes of those featured must be documented + Any additional relevant information about features must be documented Balarajah & RossenhoeveExpires September 23, 2018 [Page 14] Internet-Draft Benchmarking for NGFW performance March 2018 D. Test equipment hardware and software + Test equipment vendor name + Hardware details including model number, interface type + Test equipment firmware and test application software version 4. Results Summary / Executive Summary 1. Results should resemble a pyramid in how it is reported, with the introduction section documenting the summary of results in a prominent, easy to read block. 2. In the result section of the test report, the following attributes should be present for each test scenario. a. KPIs must be documented separately for each test scenario. The format of the KPI metrics should be presented as described in Section 6.1. b. The next level of detains should be graphs showing each of these metrics over the duration (sustain phase) of the test. This allows the user to see the measured performance stability changes over time. 6.1. Key Performance Indicators This section lists KPIs for overall benchmarking tests scenarios. All KPIs MUST be measured in whole period of sustain phase as described in Section 4.3.4. All KPIs MUST be measured from test equipment's result output. o TCP Concurrent Connection This key performance indicator will measure the average concurrent open TCP connections in the sustaining period. o TCP Connection Setup Rate This key performance indicator will measure the average established TCP connections per second in the sustaining period. For Session setup rate benchmarking test scenario, the KPI will measure average established and terminated TCP connections per second simultaneously. o Application Transaction Rate This key performance indicator will measure the average successful transactions per seconds in the sustaining period. Balarajah & RossenhoeveExpires September 23, 2018 [Page 15] Internet-Draft Benchmarking for NGFW performance March 2018 o TLS Handshake Rate This key performance indicator will measure the average TLS 1.2 or higher session formation rate within the sustaining period. o Throughput This key performance indicator will measure the average Layer 1 throughput within the sustaining period as well as average packets per seconds within the same period. The value of throughput should be presented in Gbps rounded to two places of precision with a more specific kbps in parenthesis. Optionally, goodput may also be logged as an average goodput rate measured over the same period. Goodput result shall also be presented in the same format as throughput. o URL Response time / Time to Last Byte (TTLB) This key performance indicator will measure the minimum, average and maximum per URL response time in the sustaining period as well as the average variance in the same period. o Application Transaction Time This key performance indicator will measure the minimum, average and maximum the amount of time to receive all objects from the server. o Time to First Byte (TTFB) This key performance indicator will measure minimum, average and maximum the time to first byte. TTFB is the elapsed time between sending the SYN packet from the client and receiving the first byte of application date from the DUT/SUT. TTFB SHOULD be expressed in millisecond. o TCP Connect Time This key performance indicator will measure minimum, average and maximum TCP connect time. It is elapsed between the time the client sends a SYN packet and the time it receives the SYN/ACK. TCP connect time SHOULD be expressed in millisecond. 7. Benchmarking Tests 7.1. Throughput Performance With NetSecOPEN Traffic Mix 7.1.1. Objective Using NetSecOPEN traffic mix, determine the maximum sustainable throughput performance supported by the DUT/SUT. (see Appendix A for details about traffic mix) Balarajah & RossenhoeveExpires September 23, 2018 [Page 16] Internet-Draft Benchmarking for NGFW performance March 2018 7.1.2. Test Setup Test bed setup MUST be configured as defined in Section 4. Any test scenario specific test bed configuration changes must be documented. 7.1.3. Test Parameters In this section, test scenario specific parameters SHOULD be defined. 7.1.3.1. DUT/SUT Configuration Parameters DUT/SUT parameters MUST conform to the requirements defined in Section 4.2. Any configuration changes for this specific test scenario MUST be documented. 7.1.3.2. Test Equipment Configuration Parameters Test equipment configuration parameters MUST conform to the requirements defined in Section 4.3. Following parameters MUST be noted for this test scenario: Client IP address range Server IP address range Traffic distribution ratio between IPv4 and IPv6 Traffic load objective or specification type (e.g. Throughput, SimUsers and etc.) Target throughput: It MAY be defined based on requirements. Otherwise it represents aggregated line rate of interface(s) used in the DUT/SUT Initial throughput: Initial throughput MAY be up to 10% of the "Target throughput" 7.1.3.3. Traffic Profile Test scenario MUST be run with a single application traffic mix profile (see Appendix A for details about traffic mix). The name of the NetSecOpen traffic mix MUST be documented. 7.1.3.4. Test Results Acceptance Criteria The following test Criteria is defined as test results acceptance criteria Balarajah & RossenhoeveExpires September 23, 2018 [Page 17] Internet-Draft Benchmarking for NGFW performance March 2018 a. Number of failed Application transaction MUST be 0.01%. b. Number of Terminated TCP connection due to unexpected TCP RST sent by DUT/SUT MUST be less than 0.01% c. Maximum deviation (max. dev) of application transaction time / TTLB (Time To Last Byte) MUST be less than X (The value for "X" will be finalyzed and updated in future draft release) The following equation MUST be used to calculate the deviation of application transaction time or TTLB. max. dev = max((avg_latency - min_latency),(max_latency - avg_latency)) / (Initial latency) Where, the initial latency is calculated using the following equation. For this calculation, the latency values (min', avg' and max') MUST be measured during test procedure step 1 as defined in Section 7.1.4.1. The variable latency represents application transaction time or TTLB. Initial latency:= min((avg' latency - min' latency) | (max' latency - avg' latency)) d. Maximum value of TCP connect time must be less than Xms (The value for "X" will be finalyzed and updated in future draft release). The definition for TCP connect time is found in Section 6.1. e. Maximum value of Time to First Byte must be less than 2* TCP connect time. Test Acceptance criteria for this test scenario MUST be monitored during the sustain phase of the traffic load profile only. 7.1.3.5. Measurement Following KPI metrics MUST be reported for this test scenario. Mandatory KPIs: average Throughput, maximum Concurrent TCP connection, TTLB/application transaction time (minimum, average and maximum) and average application transaction rate Optional KPIs: average TCP connection setup rate, average TLS handshake rate, TCP connect time and TTFB Balarajah & RossenhoeveExpires September 23, 2018 [Page 18] Internet-Draft Benchmarking for NGFW performance March 2018 7.1.4. Test Procedures and expected Results The test procedure is designed to measure the throughput performance of the DUT/SUT at the sustaining period of traffic load profile. The test procedure consists of three major steps. 7.1.4.1. Step 1: Test Initialization and Qualification Verify the link status of the all connected physical interfaces. All interfaces are expected to be "UP" status. Configure traffic load profile of the test equipment to generate test traffic at "initial throughput" rate as described in the parameters section. The DUT/SUT SHOULD reach the "initial throughput" during the sustain phase. Measure all KPI as defined in Section 7.1.3.5. The measured KPIs during the sustain phase MUST meet acceptance criteria "a" and "b" defined in Section 7.1.3.4. If the KPI metrics do not meet the acceptance criteria, the test procedure MUST NOT be continued to step 2. 7.1.4.2. Step 2: Test Run with Target Objective Configure test equipment to generate traffic at "Target throughput" rate defined in the parameter table. The test equipment SHOULD follow the traffic load profile definition as described in Section 4.3.4. The test equipment SHOULD start to measure and record all specified KPIs. The frequency of KPI metrics measurement MUST be less than 5 seconds. Continue the test until all traffic profile phases are completed. The DUT/SUT is expected to reach the desired target throughput during the sustain phase. In addition, the measured KPIs must meet all acceptance criteria. Follow the step 3, if the KPI metrics do not meet the acceptance criteria. 7.1.4.3. Step 3: Test Iteration with Binary Search Use binary search algorithm to configure the desired traffic load profile for each test iteration. Binary search algorithmn can be implemented using the parameter; Resolution =0.01* Target throughput and Backoff= 50%. Determine the maximum and average achievable throughput within the acceptance criteria. Balarajah & RossenhoeveExpires September 23, 2018 [Page 19] Internet-Draft Benchmarking for NGFW performance March 2018 7.2. Concurrent TCP Connection Capacity With HTTP Traffic 7.2.1. Objective Determine the maximum number of concurrent TCP connection that DUT/ SUT sustains when using HTTP traffic. 7.2.2. Test Setup Test bed setup SHOULD be configured as defined in Section 4. Any specific test bed configuration changes such as number of interfaces and interface type, etc. must be documented. 7.2.3. Test Parameters In this section, test scenario specific parameters SHOULD be defined. 7.2.3.1. DUT/SUT Configuration Parameters DUT/SUT parameters MUST conform to the requirements defined in Section 4.2. Any configuration changes for this specific test scenario MUST be documented. 7.2.3.2. Test Equipment Configuration Parameters Test equipment configuration parameters MUST conform to the requirements defined in Section 4.3. Following parameters MUST be noted for this test scenario: Client IP address range Server IP address range Traffic distribution ratio between IPv4 and IPv6 Traffic load objective or specification type (e.g Throughput, SimUsers and etc.) Target concurrent connection: It can be defined based on requirements Initial concurrent connection: 10% of "Target concurrent connection" Balarajah & RossenhoeveExpires September 23, 2018 [Page 20] Internet-Draft Benchmarking for NGFW performance March 2018 7.2.3.2.1. Client Configuration Parameters The client must negotiate HTTP 1.1 with persistence and each client can open multiple concurrent TCP connections per server endpoint IP. Test scenario SHOULD be run with a single traffic profile with following attributes: HTTP 1.1 with GET command requesting 10 Kbyte objects with random MIME type. The test equipment SHOULD perform HTTP transactions within each TCP connection subsequently. The frequency of transactions MUST be defined to achieve X% of total throughput that DUT can support. The suggested value of X is 25. It will be finalyzed and updated in the next draft version. During the sustain state of concurrent connection and traffic load , a minimal % of TCP connection SHOULD be closed and re-opened. 7.2.3.3. Test Results Acceptance Criteria The following test Criteria is defined as test results acceptance criteria a. Number of failed Application transaction MUST be less than 0.01% of attempt transaction. b. Number of Terminated TCP connection due to unexpected TCP RST sent by DUT/SUT MUST be less than 0.01% of total initiated TCP sessions. c. During the sustain phase, traffic should be forwarded constantly at the rate defined in the parameter Section 7.2.3. d. Maximum deviation (max. dev) of application transaction time / TTLB (Time To Last Byte) MUST be less than Xms (The value for "X" will be finalyzed and updated in future draft release). The following equation MUST be used to calculate the deviation of application transaction time or TTLB. max. dev = max((avg_latency - min_latency),(max_latency - avg_latency)) / (Initial latency) Where, the initial latency is calculated using the following equation. For this calculation, the latency values (min', avg' and max') MUST be measured during test procedure step 1 as defined in Section 7.1.4.1. Balarajah & RossenhoeveExpires September 23, 2018 [Page 21] Internet-Draft Benchmarking for NGFW performance March 2018 The variable latency represents application transaction time or TTLB. Initial latency:= min((avg' latency - min' latency) | (max' latency - avg' latency)) e. Maximum value of TCP connect time must be less than Xms (The value for "X" will be finalyzed and updated in future draft release). The definition for TCP connect time is found in Section 6.1. f. Maximum value of Time to First Byte must be less than 2* TCP connect time. Test Acceptance criteria for this test scenario MUST be monitored during the sustain phase of the traffic load profile only. 7.2.3.4. Measurement Following KPI metrics MUST be reported for this test scenario; average Throughput, max. Min. Avg. Concurrent TCP connection, TTLB/ application transaction time (minimum, average and maximum) and average application transaction rate. 7.2.4. Test Procedures and expected Results The test procedure is designed to measure the concurrent TCP connection capacity of the DUT/SUT at the sustaining period of traffic load profile. The test procedure consists of three major steps. This test procedure MAY be repeated multiple times with different IPv4 and IPv6 traffic distribution. 7.2.4.1. Step 1: Test Initialization and Qualification Verify the link status of the all connected physical interfaces. All interfaces are expected to be "UP" status. Configure traffic load profile of the test equipment to establish "initial concurrent connection" as defined in the parameters section. The traffic load profile should be defined as described in Section 4.3.4. The DUT/SUT SHOULD reach the "initial concurrent connection" during the sustain phase. The measured KPIs during the sustain phase MUST meet the acceptance criteria "a" and "b" defined in Section 7.2.3.3 Balarajah & RossenhoeveExpires September 23, 2018 [Page 22] Internet-Draft Benchmarking for NGFW performance March 2018 If the KPI metrics do not meet the acceptance criteria, the test procedure MUST NOT be continued to "Step 2". 7.2.4.2. Step 2: Test Run with Target Objective Configure test equipment to establish "Target concurrent connection" defined in the parameters table. The test equipment SHOULD follow the traffic load profile definition as described in Section 4.3.4. During the ramp up and sustain phase, the other KPIs such as throughput, TCP connection rate and application transaction MUST NOT reach to the maximum value that the DUT/SUT can support. Throughput, TCP connection rate and application transaction should not be reached more than X% of maximum value that DUT can support. The suggested value of X is 25. It will be finalyzed and updated in the next draft version. The test equipment SHOULD start to measure and record all specified KPIs. The frequency of measurement MUST be less than 5 seconds. Continue the test until all traffic profile phases are completed. The DUT/SUT is expected to reach the desired target concurrent connection at the sustain phase. In addition, the measured KPIs must meet all acceptance criteria. Follow the step 3, if the KPI metrics do not meet the acceptance criteria. 7.2.4.3. Step 3: Test Iteration with Binary Search Use binary search algorithm to configure the desired traffic load profile for each test iteration. Binary search algorithmn can be implemented using the parameter; Resolution =0.01* "Target concurrent connection" and Backoff= 50%. Determine the maximum and average achievable throughput within the acceptance criteria. 7.3. TCP/HTTP Connections Per Second 7.3.1. Objective Using HTTP traffic, determine the maximum and average value of TCP session establishment rate supported by the DUT/SUT. Test parameters and test test procedures will be added in the future release. Balarajah & RossenhoeveExpires September 23, 2018 [Page 23] Internet-Draft Benchmarking for NGFW performance March 2018 7.3.2. Test Setup Test bed setup SHOULD be configured as defined in section 4. Any specific test bed configuration changes such as number of interfaces and interface type, etc. must be documented. 7.3.3. Test Parameters In this section, test scenario specific parameters SHOULD be defined. 7.3.3.1. DUT/SUT Configuration Parameters DUT/SUT parameters MUST conform to the requirements defined in the section 4.2. Any configuration changes for this specific test scenario MUST be documented. 7.3.3.2. Test Equipment Configuration Parameters Test equipment configuration parameters MUST conform to the requirements defined in the section 4.3. Following parameters MUST be documented for this test scenario: - Client IP address range - Server IP address range - Traffic distribution ratio between IPv4 and IPv6 - Target connections per second - Initial connections per second: 10% of "Target connections per second" 7.3.3.2.1. Client Configuration Parameters' Test scenario SHOULD be run with a single traffic profile with following attributes: The client MUST negotiate HTTP 1.1 and close the connection immediately after completion of the transaction. HTTP 1.1 with GET command requesting a single 1, 16 or 64 Kbyte object. Balarajah & RossenhoeveExpires September 23, 2018 [Page 24] Internet-Draft Benchmarking for NGFW performance March 2018 7.3.3.3. Test Results Acceptance Criteria The following test Criteria is defined as test results acceptance criteria. a. Number of failed Application transaction MUST be less than 0.01% of attempt transaction. b. Number of Terminated TCP connection due to unexpected TCP RST sent by DUT/SUT MUST be less than 0.01% of total initiated TCP sessions c. During the sustain phase, traffic should be forwarded at a constant rate d. During the sustain phase, Average Time to TCP First Byte MUST be constant and not increase more than 10%. e. Concurrent TCP connection should be constant during steady state. This confirms the DUT opens and closes the session at the same rate. 7.3.3.4. Measurement Following KPI metrics MUST be reported for this test scenario. Mandatory KPIs: average TCP connections per second, average Throughput and Average Time to TCP First Byte. 7.3.4. Test Procedures and Expected Results The test procedure is designed to measure the TCP connection per second rate of the DUT/SUT at the sustaining period of traffic load profile. The test procedure consists of three major steps. This test procedure MAY be repeated multiple times with different IPv4 and IPv6 traffic distribution. 7.3.4.1. Step 1: Test Initialization and Qualification Verify the link status of the all connected physical interfaces. All interfaces are expected to be "UP" status. Configure traffic load profile of the test equipment to establish "initial connections per second" as defined in the parameters section. The traffic load profile CAN be defined as described in the section 4.3.4. Balarajah & RossenhoeveExpires September 23, 2018 [Page 25] Internet-Draft Benchmarking for NGFW performance March 2018 The DUT/SUT SHOULD reach the "initial connections per second" before the sustain phase. The measured KPIs during the sustain phase MUST meet the acceptance criteria a, b, c and d defined in section 7.3.3.3. If the KPI metrics do not meet the acceptance criteria, the test procedure MUST NOT be continued to "Step 2". 7.3.4.2. Step 2: Test Run with Target Objective Configure test equipment to establish "Target connections per second" defined in the parameters table. The test equipment SHOULD follow the traffic load profile definition as described in the section 4.3.4. During the ramp up and sustain phase, other KPIs such as throughput, TCP concurrent connections and application transaction MUST NOT reach to the maximum value the DUT/SUT can support. The test equipment SHOULD start to measure and record all specified KPIs. The frequency of measurement MUST be less than 5 seconds. Continue the test until all traffic profile phases are completed. The DUT/SUT is expected to reach the desired target connection per second rate at the sustain phase. In addition, the measured KPIs must meet all acceptance criteria. Follow the step 3, if the KPI metrics do not meet the acceptance criteria. 7.3.4.3. Step 3: Test Iteration with Binary Search Use binary search algorithm to configure the desired traffic load profile for each test iteration. Binary search algorithmn can be implemented using the parameter; Resolution =0.01* " Target connections per second " and Backoff= 50%. Determine the maximum and average achievable connections per second within the acceptance criteria. 7.4. HTTP Transactions Per Second 7.4.1. Objective Determine maximum and average HTTP transacton rate supported by the DUT/SUT. Balarajah & RossenhoeveExpires September 23, 2018 [Page 26] Internet-Draft Benchmarking for NGFW performance March 2018 Test parameters and test test procedures will be added in the future release. 7.5. HTTP Throughput 7.5.1. Objective Determine the average throughput performance of the DUT/SUT when using HTTP traffic. Test parameters and test test procedures will be added in the future release. 7.6. HTTP Transaction Latency 7.6.1. Objective Determine the minimum, average and maximum values of HTTP transaction latency at 80% throughput rate measured in "HTTP Throughput" test scenario. Test parameters and test test procedures will be added in the future release. 7.7. Concurrent SSL/TLS Connection Capacity 7.7.1. Objective Usin encrypted traffic (HTTPS), determine the maximum number of concurrent TCP connection that DUT/SUT sustains. Test parameters and test test procedures will be added in the future release. 7.8. SSL/TLS Handshake Rate 7.8.1. Objective Using HTTPS traffic, determine the maximum sustainable SSL/TLS session establishment rate supported by the DUT/SUT under different throughput load conditions. Test iterations MUST include common cipher suites and key strengths as well as forward looking stronger keys. Specific test iterations MUST include the following ciphers and keys: 1. ECHDE-ECDSA-AES128-GCM-SHA256 with Prime256v1 Balarajah & RossenhoeveExpires September 23, 2018 [Page 27] Internet-Draft Benchmarking for NGFW performance March 2018 2. ECDHE-RSA-AES128-GCM-SHA256 with RSA 2048 3. ECDHE-ECDSA-AES256-GCM-SHA384 with Secp384 4. ECDHE-RSA-AES256-GCM-SHA384 with RSA 3072 For each cipher suite and key strengths, test iterations MUST use a single HTTP transaction object size of 1KB, 16KB and 64KB to measure connections per second performance under a variety of DUT Security inspection load conditions. 7.8.2. Test Setup Test bed setup SHOULD be configured as defined in section 4. Any specific test bed configuration changes such as number of interfaces and interface type, etc. must be documented. 7.8.3. Test Parameters In this section, test scenario specific parameters SHOULD be defined. 7.8.3.1. DUT/SUT Configuration Parameters DUT/SUT parameters MUST conform to the requirements defined in the section 4.2. Any configuration changes for this specific test scenario MUST be documented. 7.8.3.2. Test Equipment Configuration Parameters Test equipment configuration parameters MUST conform to the requirements defined in the section 4.3. Following parameters MUST be documented for this test scenario: - Client IP address range - Server IP address range - Traffic distribution ratio between IPv4 and IPv6 - Target connections per second - Initial connections per second: 10% of "Target connections per second" Balarajah & RossenhoeveExpires September 23, 2018 [Page 28] Internet-Draft Benchmarking for NGFW performance March 2018 7.8.3.2.1. Client Configuration Parameters' Test scenario SHOULD be run with a single traffic profile with following attributes: The client MUST negotiate HTTPS 1.1 and close the connection immediately after completion of the transaction. HTTPS 1.1 with GET command requesting a single 1, 16 and 64 Kbyte object. Each client connection MUST perform a full handshake with server certificate and MUST NOT use session reuse or resumption. TLS record size MAY be optimized for the object size up to a record size of 16K. 7.8.3.3. Test Results Acceptance Criteria The following test Criteria is defined as test results acceptance criteria. a. Number of failed Application transaction MUST be less than 0.01% of attempt transaction. b. Number of Terminated TCP connection due to unexpected TCP RST sent by DUT/SUT MUST be less than 0.01% of total initiated TCP sessions. c. During the sustain phase, traffic should be forwarded at a constant rate. d. During the sustain phase, Average Time to TCP First Byte MUST be constant and not increase more than 10%. e. Concurrent TCP connection should be constant during steady state. This confirms the DUT opens and closes the session at the same rate. 7.8.3.4. Measurement Following KPI metrics MUST be reported for this test scenario. Mandatory KPIs: average TCP connections per second, average Throughput and Average Time to TCP First Byte. Balarajah & RossenhoeveExpires September 23, 2018 [Page 29] Internet-Draft Benchmarking for NGFW performance March 2018 7.8.4. Test Procedures and expected Results The test procedure is designed to measure the TCP connection per second rate of the DUT/SUT at the sustaining period of traffic load profile. The test procedure consists of three major steps. This test procedure MAY be repeated multiple times with different IPv4 and IPv6 traffic distribution. 7.8.4.1. Step 1: Test Initialization and Qualification Verify the link status of the all connected physical interfaces. All interfaces are expected to be "UP" status. Configure traffic load profile of the test equipment to establish "initial connections per second" as defined in the parameters section. The traffic load profile CAN be defined as described in the section 4.3.4. The DUT/SUT SHOULD reach the "initial connections per second" before the sustain phase. The measured KPIs during the sustain phase MUST meet the acceptance criteria a, b, c and d defined in section 7.8.3.3. If the KPI metrics do not meet the acceptance criteria, the test procedure MUST NOT be continued to "Step 2". 7.8.4.2. Step 2: Test Run with Target Objective Configure test equipment to establish "Target connections per second" defined in the parameters table. The test equipment SHOULD follow the traffic load profile definition as described in the section 4.3.4. During the ramp up and sustain phase, other KPIs such as throughput, TCP concurrent connections and application transaction MUST NOT reach to the maximum value the DUT/SUT can support. The test equipment SHOULD start to measure and record all specified KPIs. The frequency of measurement MUST be less than 5 seconds. Continue the test until all traffic profile phases are completed. The DUT/SUT is expected to reach the desired target connection per second rate at the sustain phase. In addition, the measured KPIs must meet all acceptance criteria. Follow the step 3, if the KPI metrics do not meet the acceptance criteria. Balarajah & RossenhoeveExpires September 23, 2018 [Page 30] Internet-Draft Benchmarking for NGFW performance March 2018 7.8.4.3. Step 3: Test Iteration with Binary Search Use binary search algorithm to configure the desired traffic load profile for each test iteration. Binary search algorithmn can be implemented using the parameter; Resolution =0.01* " Target connections per second " and Backoff= 50%. Determine the maximum and average achievable connections per second within the acceptance criteria. 7.9. HTTPS Transaction Per Second 7.9.1. Objective Determine maximum and average HTTPS transacton rate supported by the DUT/SUT. Test parameters and test test procedures will be added in the future release. 7.10. HTTPS Throughput 7.10.1. Objective Determine the average throughput performance of the DUT/SUT when using HTTPS traffic. Test parameters and test test procedures will be added in the future release. 7.11. HTTPS Transaction Latency 7.11.1. Objective Determine the minimum, average and maximum values of HTTPS transaction latency at 80% throughput rate measured in "HTTPS Throughput" test scenario. Test parameters and test test procedures will be added in the future release. 8. Formal Syntax 9. IANA Considerations This document makes no request of IANA. Balarajah & RossenhoeveExpires September 23, 2018 [Page 31] Internet-Draft Benchmarking for NGFW performance March 2018 Note to RFC Editor: this section may be removed on publication as an RFC. 10. Security Considerations Security consideration will be added in the future release. 11. Acknowledgements Acknowledgements will be added in the future release. 12. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, . Appendix A. NetSecOPEN Basic Traffic Mix A traffic mix for testing performance of next generation firewalls MUST scale to stress the DUT based on real-world conditions. In order to achieve this the following MUST be included: o Clients connecting to multiple different server FQDNs per application o Clients loading apps and pages with connections and objects in specific orders o Multiple unique certificates for HTTPS/TLS o A wide variety of different object sizes o Different URL paths o Mix of HTTP and HTTPS A traffic mix for testing performance of next generation firewalls MUST also facility application identification using different detection methods with and without decryption of the traffic. Such as: o HTTP HOST based application detection o HTTPS/TLS Server Name Indication (SNI) o Certificate Subject Common Name (CN) Balarajah & RossenhoeveExpires September 23, 2018 [Page 32] Internet-Draft Benchmarking for NGFW performance March 2018 The mix MUST be of sufficient complexity and volume to render differences in individual apps as statistically insignificant. For example, changes in like to like apps - such as one type of video service vs. another both consist of larger objects whereas one news site vs. another both typically have more connections then other apps because of trackers and embedded advertising content. To achieve sufficient complexity, a mix MUST have: o Thousands of URLs each client walks thru o Hundreds of FQDNs each client connects to o Hundreds of unique certificates for HTTPS/TLS o Thousands of different object sizes per client in orders matching applications The following is a description of what a popular application in an enterprise traffic mix contains. Table 3 lists the FQDNs, number of transactions and bytes transferred as an example client interacts with Office 365 Outlook, Word, Excel, Powerpoint, Sharepoint and Skype. +---------------------------------+------------+-------------+ | Office365 FQDN | Bytes | Transaction | +============================================================+ | r1.res.office365.com | 14,056,960 | 192 | +---------------------------------+------------+-------------+ | s1-word-edit-15.cdn.office.net | 6,731,019 | 22 | +---------------------------------+------------+-------------+ | company1-my.sharepoint.com | 6,269,492 | 42 | +---------------------------------+------------+-------------+ | swx.cdn.skype.com | 6,100,027 | 12 | +---------------------------------+------------+-------------+ | static.sharepointonline.com | 6,036,947 | 41 | +---------------------------------+------------+-------------+ | spoprod-a.akamaihd.net | 3,904,250 | 25 | +---------------------------------+------------+-------------+ | s1-excel-15.cdn.office.net | 2,767,941 | 16 | +---------------------------------+------------+-------------+ | outlook.office365.com | 2,047,301 | 86 | +---------------------------------+------------+-------------+ | shellprod.msocdn.com | 1,008,370 | 11 | +---------------------------------+------------+-------------+ | word-edit.officeapps.live.com | 932,080 | 25 | +---------------------------------+------------+-------------+ | res.delve.office.com | 760,146 | 2 | Balarajah & RossenhoeveExpires September 23, 2018 [Page 33] Internet-Draft Benchmarking for NGFW performance March 2018 +---------------------------------+------------+-------------+ | s1-powerpoint-15.cdn.office.net | 557,604 | 3 | +---------------------------------+------------+-------------+ | appsforoffice.microsoft.com | 511,171 | 5 | +---------------------------------+------------+-------------+ | powerpoint.officeapps.live.com | 471,625 | 14 | +---------------------------------+------------+-------------+ | excel.officeapps.live.com | 342,040 | 14 | +---------------------------------+------------+-------------+ | s1-officeapps-15.cdn.office.net | 331,343 | 5 | +---------------------------------+------------+-------------+ | webdir0a.online.lync.com | 66,930 | 15 | +---------------------------------+------------+-------------+ | portal.office.com | 13,956 | 1 | +---------------------------------+------------+-------------+ | config.edge.skype.com | 6,911 | 2 | +---------------------------------+------------+-------------+ | clientlog.portal.office.com | 6,608 | 8 | +---------------------------------+------------+-------------+ | webdir.online.lync.com | 4,343 | 5 | +---------------------------------+------------+-------------+ | graph.microsoft.com | 2,289 | 2 | +---------------------------------+------------+-------------+ | nam.loki.delve.office.com | 1,812 | 5 | +---------------------------------+------------+-------------+ | login.microsoftonline.com | 464 | 2 | +---------------------------------+------------+-------------+ | login.windows.net | 232 | 1 | +---------------------------------+------------+-------------+ Table 3: Office365 Clients MUST connect to multiple server FQDNs in the same order as real applications. Connections MUST be made when the client is interacting with the application and NOT first setup up all connections. Connections SHOULD stay open per client for subsequent transactions to the same FQDN similar to how a web browser behaves. Clients MUST use different URL Paths and Object sizes in orders as they are observed in real Applications. Clients MAY also setup multiple connections per FQDN to process multiple transactions in a sequence at the same time. Table 4 has a partial example sequence of the Office 365 Word application transactions. +---------------------------------+----------------------+----------+ | FQDN | URL Path | Object | | | | size | +===================================================================+ | company1-my.sharepoint.com | /personal... | 23,132 | Balarajah & RossenhoeveExpires September 23, 2018 [Page 34] Internet-Draft Benchmarking for NGFW performance March 2018 +---------------------------------+----------------------+----------+ | word-edit.officeapps.live.com | /we/WsaUpload.ashx | 2 | +---------------------------------+----------------------+----------+ | static.sharepointonline.com | /bld/.../blank.js | 454 | +---------------------------------+----------------------+----------+ | static.sharepointonline.com | /bld/.../ | 23,254 | | | initstrings.js | | +---------------------------------+----------------------+----------+ | static.sharepointonline.com | /bld/.../init.js | 292,740 | +---------------------------------+----------------------+----------+ | company1-my.sharepoint.com | /ScriptResource... | 102,774 | +---------------------------------+----------------------+----------+ | company1-my.sharepoint.com | /ScriptResource... | 40,329 | +---------------------------------+----------------------+----------+ | company1-my.sharepoint.com | /WebResource... | 23,063 | +---------------------------------+----------------------+----------+ | word-edit.officeapps.live.com | /we/wordeditorframe. | 60,657 | | | aspx... | | +---------------------------------+----------------------+----------+ | static.sharepointonline.com | /bld/_layouts/.../ | 454 | | | blank.js | | +---------------------------------+----------------------+----------+ | s1-word-edit-15.cdn.office.net | /we/s/.../ | 19,201 | | | EditSurface.css | | +---------------------------------+----------------------+----------+ | s1-word-edit-15.cdn.office.net | /we/s/.../ | 221,397 | | | WordEditor.css | | +---------------------------------+----------------------+----------+ | s1-officeapps-15.cdn.office.net | /we/s/.../ | 107,571 | | | Microsoft | | | | Ajax.js | | +---------------------------------+----------------------+----------+ | s1-word-edit-15.cdn.office.net | /we/s/.../ | 39,981 | | | wacbootwe.js | | +---------------------------------+----------------------+----------+ | s1-officeapps-15.cdn.office.net | /we/s/.../ | 51,749 | | | CommonIntl.js | | +---------------------------------+----------------------+----------+ | s1-word-edit-15.cdn.office.net | /we/s/.../ | 6,050 | | | Compat.js | | +---------------------------------+----------------------+----------+ | s1-word-edit-15.cdn.office.net | /we/s/.../ | 54,158 | | | Box4Intl.js | | +---------------------------------+----------------------+----------+ | s1-word-edit-15.cdn.office.net | /we/s/.../ | 24,946 | | | WoncaIntl.js | | +---------------------------------+----------------------+----------+ | s1-word-edit-15.cdn.office.net | /we/s/.../ | 53,515 | Balarajah & RossenhoeveExpires September 23, 2018 [Page 35] Internet-Draft Benchmarking for NGFW performance March 2018 | | WordEditorIntl.js | | +---------------------------------+----------------------+----------+ | s1-word-edit-15.cdn.office.net | /we/s/.../ | 1,978,712| | | WordEditorExp.js | | +---------------------------------+----------------------+----------+ | s1-word-edit-15.cdn.office.net | /we/s/.../jSanity.js | 10,912 | +---------------------------------+----------------------+----------+ | word-edit.officeapps.live.com | /we/OneNote.ashx | 145,708 | +---------------------------------+----------------------+----------+ Table 4: Office365 Word Transactions For application identification the HTTPS/TLS traffic MUST include realistic Certificate Subject Common Name (CN) data as well as Server Name Indications. For example, a DUT may detect Facebook Chat traffic by inspecting the certificate and detecting *.facebook.com in the certificate subject CN and subsequently detect the word chat in the FQDN 5-edge-chat.facebook.com and identify traffic on the connection to be Facebook Chat. Table 5 includes further examples in SNI and CN pairs for several FQDNs of Office 365. +------------------------------+----------------------------------+ |Server Name Indication (SNI) | Certificate Subject | | | Common Name (CN) | +=================================================================+ | r1.res.office365.com | *.res.outlook.com | +------------------------------+----------------------------------+ | login.windows.net | graph.windows.net | +------------------------------+----------------------------------+ | webdir0a.online.lync.com | *.online.lync.com | +------------------------------+----------------------------------+ | login.microsoftonline.com | stamp2.login.microsoftonline.com | +------------------------------+----------------------------------+ | webdir.online.lync.com | *.online.lync.com | +------------------------------+----------------------------------+ | graph.microsoft.com | graph.microsoft.com | +------------------------------+----------------------------------+ | outlook.office365.com | outlook.com | +------------------------------+----------------------------------+ | appsforoffice.microsoft.com | appsforoffice.microsoft.com | +------------------------------+----------------------------------+ Table 5: Office365 SNI and CN Pairs Examples Balarajah & RossenhoeveExpires September 23, 2018 [Page 36] Internet-Draft Benchmarking for NGFW performance March 2018 NetSecOPEN has provided a reference enterprise perimeter traffic mix with dozens of applications, hundreds of connections, and thousands of transactions. (link to spreadsheet with details) NetSecOPEN has provided a reference enterprise perimeter traffic mix with dozens of applications, hundreds of connections, and thousands of transactions. (link to spreadsheet with details) The enterprise perimeter traffic mix consists of 70% HTTPS and 30% HTTP by Bytes, 58% HTTPS and 42% HTTP by Transactions. By connections with a single connection per FQDN the mix consists of 43% HTTPS and 57% HTTP. With multiple connections per FQDN the HTTPS percentage is higher. Table 6 is a summary of the NetSecOPEN enterprise perimeter traffic mix sorted by bytes with unique FQDNs and transactions per applications. +------------------+-------+--------------+-------------+ | Application | FQDNs | Transactions | Bytes | +=======================================================+ | Office365 | 26 | 558 | 52,931,947 | +------------------+-------+--------------+-------------+ | Box | 4 | 90 | 23,276,089 | +------------------+-------+--------------+-------------+ | Salesforce | 6 | 365 | 23,137,548 | +------------------+-------+--------------+-------------+ | Gmail | 13 | 139 | 16,399,289 | +------------------+-------+--------------+-------------+ | Linkedin | 10 | 206 | 15,040,918 | +------------------+-------+--------------+-------------+ | DailyMotion | 8 | 77 | 14,751,514 | +------------------+-------+--------------+-------------+ | GoogleDocs | 2 | 71 | 14,205,476 | +------------------+-------+--------------+-------------+ | Wikia | 15 | 159 | 13,909,777 | +------------------+-------+--------------+-------------+ | Foxnews | 82 | 499 | 13,758,899 | +------------------+-------+--------------+-------------+ | Yahoo Finance | 33 | 254 | 13,134,011 | +------------------+-------+--------------+-------------+ | Youtube | 8 | 97 | 13,056,216 | +------------------+-------+--------------+-------------+ | Facebook | 4 | 207 | 12,726,231 | +------------------+-------+--------------+-------------+ | CNBC | 77 | 275 | 11,939,566 | +------------------+-------+--------------+-------------+ | Lightreading | 27 | 304 | 11,200,864 | Balarajah & RossenhoeveExpires September 23, 2018 [Page 37] Internet-Draft Benchmarking for NGFW performance March 2018 +------------------+-------+--------------+-------------+ | BusinessInsider | 16 | 142 | 11,001,575 | +------------------+-------+--------------+-------------+ | Alexa | 5 | 153 | 10,475,151 | +------------------+-------+--------------+-------------+ | CNN | 41 | 206 | 10,423,740 | +------------------+-------+--------------+-------------+ | Twitter Video | 2 | 72 | 10,112,820 | +------------------+-------+--------------+-------------+ | Cisco Webex | 1 | 213 | 9,988,417 | +------------------+-------+--------------+-------------+ | Slack | 3 | 40 | 9,938,686 | +------------------+-------+--------------+-------------+ | Google Maps | 5 | 191 | 8,771,873 | +------------------+-------+--------------+-------------+ | SpectrumIEEE | 7 | 145 | 8,682,629 | +------------------+-------+--------------+-------------+ | Yelp | 9 | 146 | 8,607,645 | +------------------+-------+--------------+-------------+ | Vimeo | 12 | 74 | 8,555,960 | +------------------+-------+--------------+-------------+ | Wikihow | 11 | 140 | 8,042,314 | +------------------+-------+--------------+-------------+ | Netflix | 3 | 31 | 7,839,256 | +------------------+-------+--------------+-------------+ | Instagram | 3 | 114 | 7,230,883 | +------------------+-------+--------------+-------------+ | Morningstar | 30 | 150 | 7,220,121 | +------------------+-------+--------------+-------------+ | Docusign | 5 | 68 | 6,972,738 | +------------------+-------+--------------+-------------+ | Twitter | 1 | 100 | 6,939,150 | +------------------+-------+--------------+-------------+ | Tumblr | 11 | 70 | 6,877,200 | +------------------+-------+--------------+-------------+ | Whatsapp | 3 | 46 | 6,829,848 | +------------------+-------+--------------+-------------+ | Imdb | 16 | 251 | 6,505,227 | +------------------+-------+--------------+-------------+ | NOAAgov | 1 | 44 | 6,316,283 | +------------------+-------+--------------+-------------+ | IndustryWeek | 23 | 192 | 6,242,403 | +------------------+-------+--------------+-------------+ | Spotify | 18 | 119 | 6,231,013 | +------------------+-------+--------------+-------------+ | AutoNews | 16 | 165 | 6,115,354 | +------------------+-------+--------------+-------------+ | Evernote | 3 | 47 | 6,063,168 | Balarajah & RossenhoeveExpires September 23, 2018 [Page 38] Internet-Draft Benchmarking for NGFW performance March 2018 +------------------+-------+--------------+-------------+ | NatGeo | 34 | 104 | 6,026,344 | +------------------+-------+--------------+-------------+ | BBC News | 18 | 156 | 5,898,572 | +------------------+-------+--------------+-------------+ | Investopedia | 38 | 241 | 5,792,038 | +------------------+-------+--------------+-------------+ | Pinterest | 8 | 102 | 5,658,994 | +------------------+-------+--------------+-------------+ | Succesfactors | 2 | 112 | 5,049,001 | +------------------+-------+--------------+-------------+ | AbaJournal | 6 | 93 | 4,985,626 | +------------------+-------+--------------+-------------+ | Pbworks | 4 | 78 | 4,670,980 | +------------------+-------+--------------+-------------+ | NetworkWorld | 42 | 153 | 4,651,354 | +------------------+-------+--------------+-------------+ | WebMD | 24 | 280 | 4,416,736 | +------------------+-------+--------------+-------------+ | OilGasJournal | 14 | 105 | 4,095,255 | +------------------+-------+--------------+-------------+ | Trello | 5 | 39 | 4,080,182 | +------------------+-------+--------------+-------------+ | BusinessWire | 5 | 109 | 4,055,331 | +------------------+-------+--------------+-------------+ | Dropbox | 5 | 17 | 4,023,469 | +------------------+-------+--------------+-------------+ | Nejm | 20 | 190 | 4,003,657 | +------------------+-------+--------------+-------------+ | OilGasDaily | 7 | 199 | 3,970,498 | +------------------+-------+--------------+-------------+ | Chase | 6 | 52 | 3,719,232 | +------------------+-------+--------------+-------------+ | MedicalNews | 6 | 117 | 3,634,187 | +------------------+-------+--------------+-------------+ | Marketwatch | 25 | 142 | 3,291,226 | +------------------+-------+--------------+-------------+ | Imgur | 5 | 48 | 3,189,919 | +------------------+-------+--------------+-------------+ | NPR | 9 | 83 | 3,184,303 | +------------------+-------+--------------+-------------+ | Onelogin | 2 | 31 | 3,132,707 | +------------------+-------+--------------+-------------+ | Concur | 2 | 50 | 3,066,326 | +------------------+-------+--------------+-------------+ | Service-now | 1 | 37 | 2,985,329 | +------------------+-------+--------------+-------------+ | Apple itunes | 14 | 80 | 2,843,744 | Balarajah & RossenhoeveExpires September 23, 2018 [Page 39] Internet-Draft Benchmarking for NGFW performance March 2018 +------------------+-------+--------------+-------------+ | BerkeleyEdu | 3 | 69 | 2,622,009 | +------------------+-------+--------------+-------------+ | MSN | 39 | 203 | 2,532,972 | +------------------+-------+--------------+-------------+ | Indeed | 3 | 47 | 2,325,197 | +------------------+-------+--------------+-------------+ | MayoClinic | 6 | 56 | 2,269,085 | +------------------+-------+--------------+-------------+ | Ebay | 9 | 164 | 2,219,223 | +------------------+-------+--------------+-------------+ | UCLAedu | 3 | 42 | 1,991,311 | +------------------+-------+--------------+-------------+ | ConstructionDive | 5 | 125 | 1,828,428 | +------------------+-------+--------------+-------------+ | EducationNews | 4 | 78 | 1,605,427 | +------------------+-------+--------------+-------------+ | BofA | 12 | 68 | 1,584,851 | +------------------+-------+--------------+-------------+ | ScienceDirect | 7 | 26 | 1,463,951 | +------------------+-------+--------------+-------------+ | Reddit | 8 | 55 | 1,441,909 | +------------------+-------+--------------+-------------+ | FoodBusinessNews | 5 | 49 | 1,378,298 | +------------------+-------+--------------+-------------+ | Amex | 8 | 42 | 1,270,696 | +------------------+-------+--------------+-------------+ | Weather | 4 | 50 | 1,243,826 | +------------------+-------+--------------+-------------+ | Wikipedia | 3 | 27 | 958,935 | +------------------+-------+--------------+-------------+ | Bing | 1 | 52 | 697,514 | +------------------+-------+--------------+-------------+ | ADP | 1 | 30 | 508,654 | +------------------+-------+--------------+-------------+ | | | | | +------------------+-------+--------------+-------------+ | Grand Total | 983 | 10021 | 569,819,095 | +------------------+-------+--------------+-------------+ Table 6: Summary of NetSecOPEN Enterprise Perimeter Traffic Mix Authors' Addresses Balarajah & RossenhoeveExpires September 23, 2018 [Page 40] Internet-Draft Benchmarking for NGFW performance March 2018 Balamuhunthan Balarajah EANTC AG Salzufer 14 Berlin 10587 Germany Email: balarajah@eantc.de Carsten Rossenhoevel EANTC AG Salzufer 14 Berlin 10587 Germany Email: cross@eantc.de Balarajah & RossenhoeveExpires September 23, 2018 [Page 41]