Network Working Group                                           N. Banks
Internet-Draft                                     Microsoft Corporation
Intended status: Experimental                          December 23, 2020
Expires: June 26, 2021

                            QUIC Performance
                   draft-banks-quic-performance-00

Abstract

   The QUIC performance protocol provides a simple, general-purpose
   protocol for testing the performance characteristics of a QUIC
   implementation.  With this protocol a generic server can support any
   number of client-driven performance tests and configurations.
   Standardizing the performance protocol allows for easy comparisons
   across different QUIC implementations.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.
   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 26, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terms and Definitions
   2.  Specification
     2.1.  Protocol Negotiation
     2.2.  Configuration
     2.3.  Streams
       2.3.1.  Encoding Server Response Size
       2.3.2.  Bidirectional vs Unidirectional Streams
   3.  Example Performance Scenarios
     3.1.  Single Connection Bulk Throughput
     3.2.  Requests Per Second
     3.3.  Handshakes Per Second
     3.4.  Throughput Fairness Index
     3.5.  Maximum Number of Idle Connections
   4.  Things to Note
     4.1.  What Data Should be Sent?
     4.2.  Ramp up Congestion Control or Not?
     4.3.  Disabling Encryption
   5.  Security Considerations
   6.  IANA Considerations
   7.  Normative References
   Author's Address

1.  Introduction

   The various QUIC implementations are still quite young and not
   exhaustively tested for many performance-heavy scenarios.  Some have
   done their own testing, but many are just starting this process.
   Additionally, most only test the performance between their own
   client and server.  The QUIC performance protocol aims to
   standardize the performance testing mechanisms.  This will hopefully
   achieve the following:

   o  Remove the need to redesign a performance test for each QUIC
      implementation.

   o  Provide standard test cases that can produce performance metrics
      that can be easily compared across different configurations and
      implementations.

   o  Allow for easy cross-implementation performance testing.

1.1.  Terms and Definitions

   The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Specification

   The sections below describe the mechanisms used by a client to
   connect to a QUIC perf server and execute various performance
   scenarios.

2.1.  Protocol Negotiation

   The ALPN used by the QUIC performance protocol is "perf".  It can be
   used on any UDP port, but UDP port 443 is the default if no other
   port is specified.  No SNI is required to connect, but it may be
   provided if the client wishes.

2.2.  Configuration

   TODO - Possible options: use the first stream to exchange
   configuration data OR use a custom transport parameter.

2.3.  Streams

   The performance protocol is primarily centered around sending and
   receiving data.  Streams are the primary vehicle for this.  All
   performance tests are client-driven:

   o  The client opens a stream.

   o  The client encodes the size of the requested server response.

   o  The client sends any data it wishes to.

   o  The client cleanly closes the stream with a FIN.

   When a server receives a stream, it does the following:

   o  The server accepts the new stream.

   o  The server processes the encoded response size.

   o  The server drains the rest of the client data.

   o  The server then sends any response payload that was requested.

   *Note* - Should the server wait for the FIN before replying?

2.3.1.  Encoding Server Response Size

   Every stream opened by the client uses the first 8 bytes of the
   stream data to encode a 64-bit unsigned integer in network byte
   order, indicating the length of data the client wishes the server
   to respond with.  An encoded value of zero is perfectly legal, and
   a value of MAX_UINT64 (0xFFFFFFFFFFFFFFFF) is used in practice to
   indicate an unlimited server response.  The client may then cancel
   the transfer at its convenience with a STOP_SENDING frame.
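   As a non-normative illustration, the prefix described above can be
   encoded and decoded with Python's struct module, where "!Q" denotes
   an unsigned 64-bit integer in network byte order (the helper name is
   purely illustrative):

```python
import struct

def decode_response_size(stream_prefix: bytes) -> int:
    # The first 8 bytes of every client-opened stream carry the
    # requested server response length as a 64-bit unsigned integer
    # in network (big-endian) byte order.
    if len(stream_prefix) < 8:
        raise ValueError("need the first 8 bytes of stream data")
    (size,) = struct.unpack("!Q", stream_prefix[:8])
    return size

# Zero is legal (no response requested) ...
assert decode_response_size(b"\x00" * 8) == 0
# ... and MAX_UINT64 practically means an unlimited response.
assert decode_response_size(b"\xff" * 8) == 0xFFFFFFFFFFFFFFFF
```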
   On the server side, any stream that is closed before all 8 bytes
   are received should simply be ignored and gracefully closed on the
   server's end (if applicable).

2.3.2.  Bidirectional vs Unidirectional Streams

   When a client uses a bidirectional stream to request a response
   payload from the server, the server sends the requested data on the
   same stream.  If no data is requested by the client, the server
   merely closes its side of the stream.

   When a client uses a unidirectional stream to request a response
   payload from the server, the server opens a new unidirectional
   stream to send the requested data.  If no data is requested by the
   client, the server need take no action.

3.  Example Performance Scenarios

   All stream-payload-based tests below can be run with either
   bidirectional or unidirectional streams.  Generally, the goal of
   all these performance tests is to measure the maximum load that can
   be achieved with the given QUIC implementation and hardware
   configuration.  To that end, the network is not expected to be the
   bottleneck in any of these tests, so appropriate network hardware
   must be used so as to not limit throughput.

3.1.  Single Connection Bulk Throughput

   Bulk data throughput on a single QUIC connection is probably the
   most common metric when first discussing the performance of a QUIC
   implementation.  It uses only a single QUIC connection.  It may be
   either an upload or a download, and it can be of any desired
   length.

   For an upload test, the client need only open a single stream,
   encode a zero server response size, send the upload payload, and
   then close (FIN) the stream.

   For a download test, the client again opens a single stream,
   encodes the server's response size (N bytes), and then closes the
   stream.
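   In both flows, the bytes the client writes on its stream are just
   the 8-byte response-size prefix followed by any upload payload.  A
   minimal sketch (the build_request helper is hypothetical, not part
   of the protocol):

```python
import struct

UNLIMITED = 0xFFFFFFFFFFFFFFFF  # practically "respond until cancelled"

def build_request(response_size: int, upload_payload: bytes = b"") -> bytes:
    # 8-byte network-byte-order response size, then any client data.
    return struct.pack("!Q", response_size) + upload_payload

# Upload test: request a zero-length response and send the payload.
upload_stream_data = build_request(0, b"\x00" * 1024)

# Download test: request N bytes back and send nothing else.
download_stream_data = build_request(10_000_000)
```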
   The total throughput rate is measured by the client, and is
   calculated by dividing the total bytes sent or received by the
   difference in time from when the client created its initial stream
   to the time the client received the server's FIN.

3.2.  Requests Per Second

   Another very common performance metric is the maximum number of
   requests per second (RPS) that a QUIC server can handle.  Unlike
   the bulk throughput test above, this test generally requires many
   parallel connections (possibly from multiple client machines) in
   order to saturate the server properly.  Several variables tend to
   directly affect the results of this test:

   o  The number of parallel connections.

   o  The size of the client's request.

   o  The size of the server's response.

   All of the above variables may be changed to measure the maximum
   RPS in the given scenario.

   The test starts with the client opening all parallel connections
   and waiting for them to be connected.  It is recommended to wait an
   additional couple of seconds for things to settle down.

   The client then starts sending "requests" on each connection.
   Specifically, the client should keep at least one request pending
   (preferably at least two) on each connection at all times.  When a
   request completes (the server's FIN is received), the client should
   immediately queue another request.

   The client continues to do this for a configured period of time.
   From the author's testing, ten seconds seems to be a good amount of
   time to reach a steady state.

   Finally, the client calculates the requests-per-second rate as the
   total number of requests completed divided by the total execution
   time of the request phase of the connection (not including the
   handshake and wait period).

3.3.  Handshakes Per Second

   Another metric that may reveal connection setup efficiency is
   handshakes per second.
   Multiple clients (possibly from multiple machines) set up QUIC
   connections with a single server and then close them with
   CONNECTION_CLOSE.  Variables that may affect the results are:

   o  The number of client machines.

   o  The number of connections a client can initiate in a second.

   o  The size of the ClientHello (long list of supported ciphers,
      versions, etc.).

   All of these variables may be changed to measure the maximum
   handshakes per second in a given scenario.

   The test starts with the multiple clients initiating connections
   and waiting for them to be connected to the single server on the
   other machine.  It is recommended to wait an additional couple of
   seconds for connections to settle down.

   The clients initiate as many connections as possible in order to
   saturate the server.  Once a client completes the handshake with
   the server, it terminates the connection by sending a
   CONNECTION_CLOSE to the server.  The total handshakes-per-second
   rate is calculated by dividing the total number of connections
   successfully established during the measurement period by the
   length of that period.

3.4.  Throughput Fairness Index

   Connection fairness reveals how throughput is allocated among
   connections.  One way to measure it is to establish hundreds or
   thousands of concurrent connections and request the same data block
   from a single server.  Variables that may impact the results are:

   o  The size of the data being requested.

   o  The number of concurrent connections.

   The test starts by establishing several hundred or thousand
   concurrent connections and downloading the same data block from the
   server simultaneously.

   The fairness index is calculated from the completion time of each
   connection and the size of the data block, in Jain's manner
   (https://www.cse.wustl.edu/~jain/atmf/ftp/af_fair.pdf).
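   Jain's fairness index over per-connection throughputs x_i is
   (sum x_i)^2 / (n * sum x_i^2): it is 1.0 when every connection gets
   equal throughput and approaches 1/n when a single connection
   dominates.  A sketch with illustrative numbers:

```python
def jains_fairness_index(throughputs):
    # (sum x)^2 / (n * sum x^2); 1.0 is perfectly fair, 1/n is the
    # worst case where one connection gets all the throughput.
    n = len(throughputs)
    total = sum(throughputs)
    return (total * total) / (n * sum(x * x for x in throughputs))

# Same data block, per-connection completion times in seconds
# (illustrative values only).
block_size = 100_000_000  # bytes
completion_times = [9.8, 10.1, 10.3, 24.7]
index = jains_fairness_index([block_size / t for t in completion_times])
```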
   Note that the relationship between fairness and whether the link is
   saturated is uncertain before any test.  Thus it is recommended
   that both cases be covered in the test.

   TODO: is it necessary to also provide tests on latency fairness in
   the multi-connection case?

3.5.  Maximum Number of Idle Connections

   TODO

4.  Things to Note

   There are a few important things to note when doing performance
   testing.

4.1.  What Data Should be Sent?

   Since the goal here is to measure the efficiency of the QUIC
   implementation and not any application protocol, the performance
   application layer should be as lightweight as possible.  To this
   end, the client and server application layer may use a single
   preallocated and initialized buffer that is queued for sending
   whenever any payload needs to be sent out.

4.2.  Ramp up Congestion Control or Not?

   When running performance tests that are CPU limited rather than
   network limited, the congestion control state ideally does not
   matter much.  That said, assuming the tests run long enough,
   congestion control should generally ramp up very quickly and not be
   a measurable factor in the resulting measurements.

4.3.  Disabling Encryption

   A common topic when talking about QUIC performance is the effect of
   its encryption.  The draft-banks-quic-disable-encryption draft
   specifies a way for encryption to be mutually negotiated off so
   that an A/B test can be made to measure the "cost of encryption" in
   QUIC.

5.  Security Considerations

   Since the performance protocol allows a client to trivially request
   that the server do a significant amount of work, it is generally
   advisable not to deploy a server running this protocol on the open
   internet.
   One possible mitigation for unauthenticated clients generating an
   unacceptable amount of work on the server would be to use client
   certificates to authenticate the clients first.

6.  IANA Considerations

   None.

7.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

Author's Address

   Nick Banks
   Microsoft Corporation

   Email: nibanks@microsoft.com