idnits 2.17.1 

draft-ietf-quic-qlog-main-schema-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (7 March 2022) is 780 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-quic-qlog-h3-events-01

  == Outdated reference: A later version (-07) exists of
     draft-ietf-quic-qlog-quic-events-01

  ** Downref: Normative reference to an Informational RFC: RFC 1952

  ** Downref: Normative reference to an Informational RFC: RFC 4180

  ** Downref: Normative reference to an Informational RFC: RFC 6839

  ** Obsolete normative reference: RFC 7049 (Obsoleted by RFC 8949)

  -- Duplicate reference: RFC7464, mentioned in 'RFC7464', was also mentioned
     in 'JSON-Text-Sequences'.

  ** Downref: Normative reference to an Informational RFC: RFC 7932

  ** Downref: Normative reference to an Informational RFC: RFC 8091

  -- Duplicate reference: RFC8259, mentioned in 'RFC8259', was also mentioned
     in 'JSON'.


     Summary: 7 errors (**), 0 flaws (~~), 3 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	QUIC                                                        R. Marx, Ed.
3	Internet-Draft                                                 KU Leuven
4	Intended status: Standards Track                       L. Niccolini, Ed.
5	Expires: 8 September 2022                                       Facebook
6	                                                         M. Seemann, Ed.
7	                                                           Protocol Labs
8	                                                            7 March 2022

10	                      Main logging schema for qlog
11	                  draft-ietf-quic-qlog-main-schema-02

13	Abstract

15	   This document describes a high-level schema for a standardized
16	   logging format called qlog.  This format allows easy sharing of data
17	   and the creation of reusable visualization and debugging tools.  The
18	   high-level schema in this document is intended to be protocol-
19	   agnostic.  Separate documents specify how the format should be used
20	   for specific protocol data.  The schema is also format-agnostic, and
21	   can be represented in, for example, JSON, CSV, or protobuf.

23	Status of This Memo

25	   This Internet-Draft is submitted in full conformance with the
26	   provisions of BCP 78 and BCP 79.

28	   Internet-Drafts are working documents of the Internet Engineering
29	   Task Force (IETF).  Note that other groups may also distribute
30	   working documents as Internet-Drafts.  The list of current Internet-
31	   Drafts is at https://datatracker.ietf.org/drafts/current/.

33	   Internet-Drafts are draft documents valid for a maximum of six months
34	   and may be updated, replaced, or obsoleted by other documents at any
35	   time.  It is inappropriate to use Internet-Drafts as reference
36	   material or to cite them other than as "work in progress."

38	   This Internet-Draft will expire on 8 September 2022.

40	Copyright Notice

42	   Copyright (c) 2022 IETF Trust and the persons identified as the
43	   document authors.  All rights reserved.

45	   This document is subject to BCP 78 and the IETF Trust's Legal
46	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
47	   license-info) in effect on the date of publication of this document.
48	   Please review these documents carefully, as they describe your rights
49	   and restrictions with respect to this document.  Code Components
50	   extracted from this document must include Revised BSD License text as
51	   described in Section 4.e of the Trust Legal Provisions and are
52	   provided without warranty as described in the Revised BSD License.

54	Table of Contents

56	   1.  Introduction
57	     1.1.  Notational Conventions
58	       1.1.1.  Schema definition
59	       1.1.2.  Serialization
60	   2.  Design goals
61	   3.  The high level qlog schema
62	     3.1.  Summary
63	     3.2.  traces
64	     3.3.  Individual Trace containers
65	       3.3.1.  Configuration
66	       3.3.2.  vantage_point
67	     3.4.  Field name semantics
68	       3.4.1.  Timestamps
69	       3.4.2.  Category and Event Type
70	       3.4.3.  Data
71	       3.4.4.  protocol_type
72	       3.4.5.  Triggers
73	       3.4.6.  group_id
74	       3.4.7.  common_fields
75	   4.  Guidelines for event definition documents
76	     4.1.  Event design guidelines
77	     4.2.  Event importance indicators
78	     4.3.  Custom fields
79	   5.  Generic events and data classes
80	     5.1.  Raw packet and frame information
81	     5.2.  Generic events
82	       5.2.1.  error
83	       5.2.2.  warning
84	       5.2.3.  info
85	       5.2.4.  debug
86	       5.2.5.  verbose
87	     5.3.  Simulation events
88	       5.3.1.  scenario
89	       5.3.2.  marker
90	   6.  Serializing qlog
91	     6.1.  qlog to JSON mapping
92	       6.1.1.  I-JSON
93	       6.1.2.  Truncated values
94	     6.2.  qlog to JSON Text Sequences mapping
95	       6.2.1.  Supporting JSON Text Sequences in tooling
96	     6.3.  Other optimized formatting options
97	       6.3.1.  Data structure optimizations
98	       6.3.2.  Compression
99	       6.3.3.  Binary formats
100	       6.3.4.  Overview and summary
101	     6.4.  Conversion between formats
102	   7.  Methods of access and generation
103	     7.1.  Set file output destination via an environment
104	           variable
105	     7.2.  Access logs via a well-known endpoint
106	   8.  Tooling requirements
107	   9.  Security and privacy considerations
108	   10. IANA Considerations
109	   11. References
110	     11.1.  Normative References
111	     11.2.  Informative References
112	   Appendix A.  Change Log
113	     A.1.  Since draft-ietf-quic-qlog-main-schema-01:
114	     A.2.  Since draft-ietf-quic-qlog-main-schema-00:
115	     A.3.  Since draft-marx-qlog-main-schema-draft-02:
116	     A.4.  Since draft-marx-qlog-main-schema-01:
117	     A.5.  Since draft-marx-qlog-main-schema-00:
118	   Appendix B.  Design Variations
119	   Appendix C.  Acknowledgements
120	   Authors' Addresses

122	1.  Introduction

124	   There is currently a lack of an easily usable, standardized endpoint
125	   logging format.  Especially for the use case of debugging and
126	   evaluating modern Web protocols and their performance, it is often
127	   difficult to obtain structured logs that provide adequate information
128	   for tasks like problem root cause analysis.

130	   This document aims to provide a high-level schema and harness that
131	   describes the general layout of an easily usable, shareable,
132	   aggregatable and structured logging format.  This high-level schema
133	   is protocol agnostic, with logging entries for specific protocols and
134	   use cases being defined in other documents (see for example
135	   [QLOG-QUIC] for QUIC and [QLOG-H3] for HTTP/3 and QPACK-related event
136	   definitions).

138	   The goal of this high-level schema is to provide amenities and
139	   default characteristics that each logging file should contain (or
140	   should be able to contain), such that generic and reusable toolsets
141	   can be created that can deal with logs from a variety of different
142	   protocols and use cases.

144	   As such, this document contains concepts such as versioning, metadata
145	   inclusion, log aggregation, event grouping and log file size
146	   reduction techniques.

148	   Feedback and discussion are welcome at
149	   https://github.com/quicwg/qlog.  Readers are advised to refer to the
150	   "editor's draft" at that URL for an up-to-date version of this
151	   document.

153	   Concrete examples of integrations of this schema in various
154	   programming languages can be found at
155	   https://github.com/quiclog/qlog/.

157	1.1.  Notational Conventions

159	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
160	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
161	   document are to be interpreted as described in [RFC2119].

163	1.1.1.  Schema definition

165	   To define events and data structures, all qlog documents use the
166	   Concise Data Definition Language [CDDL].  This document uses the
167	   basic syntax, the specific text, uint, float32, float64, bool, and
168	   any types, as well as the .default, .size, and .regexp control
169	   operators, the ~ unwrapping operator, and the $ extension point
170	   syntax from [CDDL].

172	   Additionally, this document defines the following custom types for
173	   clarity:

175	   ; CDDL's uint is defined as being 64-bit in size
176	   ; but for many protocol fields we want to be more restrictive
177	   ; and explicit
178	   uint8 = uint .size 1
179	   uint16 = uint .size 2
180	   uint32 = uint .size 4
181	   uint64 = uint .size 8

183	   ; an even-length lowercase string of hexadecimally encoded bytes
184	   ; examples: 82dc, 027339, 4cdbfd9bf0
185	   ; this is needed because the default CDDL binary string (bytes/bstr)
186	   ; is only CBOR and not JSON compatible
187	   hexstring = text .regexp "([0-9a-f]{2})*"

189	                 Figure 1: Additional CDDL type definitions
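The custom types in Figure 1 can be checked mechanically.  The following is a minimal Python sketch, not part of the qlog specification; the helper names is_hexstring and fits_uint are hypothetical.  It assumes that the CDDL .regexp operator matches the entire string, which is why re.fullmatch is used.

```python
import re

# Mirrors the hexstring regexp from Figure 1.
HEXSTRING_RE = re.compile(r"([0-9a-f]{2})*")

# Byte widths of the custom unsigned integer types from Figure 1.
UINT_SIZES = {"uint8": 1, "uint16": 2, "uint32": 4, "uint64": 8}


def is_hexstring(value):
    """Return True if value is an even-length lowercase hex string."""
    # fullmatch anchors the pattern to the whole string (assumed to match
    # the semantics of the CDDL .regexp control operator).
    return isinstance(value, str) and HEXSTRING_RE.fullmatch(value) is not None


def fits_uint(value, type_name):
    """Return True if value is a non-negative int that fits in type_name."""
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 <= value < 2 ** (8 * UINT_SIZES[type_name])
```

Note that the empty string is accepted as a hexstring, since the regexp allows zero repetitions of a byte pair.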

191	   The main CDDL syntax conventions used in this document that a reader
192	   should be aware of are:

194	   *  ? obj : this object is optional

196	   *  TypeName1 / TypeName2 : a union of these two types (object can be
197	      either type 1 OR type 2)

199	   *  obj: TypeName : this object has this concrete type

201	   *  obj: [* TypeName] : this object is an array of this type with
202	      minimum size of 0 elements

204	   *  obj: [+ TypeName] : this object is an array of this type with
205	      minimum size of 1 element

207	   *  TypeName = ... : defines a new type

209	   *  EnumName = "entry1" / "entry2" / "entry3" / ... : defines an enum

211	   *  StructName = { ... } : defines a new struct type

213	   *  ; : single-line comment

215	   *  * text => any : special syntax to indicate 0 or more fields that
216	      have a string key that maps to any value.  Used to indicate a
217	      generic JSON object.

219	   All timestamps and time-related values (e.g., offsets) in qlog are
220	   logged as float64 values expressed in milliseconds.

222	   Other qlog documents can define their own CDDL-compatible (struct)
223	   types (e.g., separately for each Packet type that a protocol
224	   supports).

226	1.1.2.  Serialization

228	   While the qlog schemas are format-agnostic, and can be serialized in
229	   many ways (e.g., JSON, CBOR, protobuf, ...), this document only
230	   describes how to employ [JSON], its subset [I-JSON], and its
231	   streamable derivative [JSON-Text-Sequences] as textual serialization
232	   options.  As such, examples are provided in [JSON].  Other documents
233	   may describe how to utilize other concrete serialization options,
234	   though tips and requirements for these are also listed in this
235	   document (Section 6).

237	2.  Design goals

239	   The main tenets for the qlog schema design are:

241	   *  Streamable, event-based logging

243	   *  Flexibility in the format, complexity in the tooling (e.g., few
244	      components are a MUST, tools need to deal with this)

246	   *  Extensible and pragmatic

248	   *  Aggregation and transformation friendly (e.g., the top-level
249	      element for the non-streaming format is a container for individual
250	      traces, group_ids can be used to tag events to a particular
251	      context)

253	   *  Metadata is stored together with event data

255	3.  The high level qlog schema

257	   A qlog file should be able to contain several individual traces and
258	   logs from multiple vantage points that are in some way related.  To
259	   that end, the top-level element in the qlog schema defines only a
260	   small set of "header" fields and an array of component traces.  For
261	   this document, the required "qlog_version" field MUST have a value of
262	   "0.3".

264	   Note:  there have been several previously broadly deployed qlog
265	      versions based on older drafts of this document (see draft-marx-
266	      qlog-main-schema).  The old values for the "qlog_version" field
267	      were "draft-00", "draft-01" and "draft-02".  When qlog was moved
268	      to the QUIC working group, we decided to switch to a new
269	      versioning scheme which is independent of individual draft
270	      document numbers.  However, we did start from 0.3, as conceptually
271	      0.0, 0.1 and 0.2 can map to draft-00, draft-01 and draft-02.

273	   As qlog can be serialized in a variety of ways, the "qlog_format"
274	   field is used to indicate which serialization option was chosen.  Its
275	   value MUST either be one of the options defined in this document
276	   (e.g., Section 6) or the field must be omitted entirely, in which
277	   case it assumes the default value of "JSON".

279	   In order to make it easier to parse and identify qlog files and their
280	   serialization format, the "qlog_version" and "qlog_format" fields and
281	   their values SHOULD be in the first 256 characters/bytes of the
282	   resulting log file.

284	   The qlog file's top-level structure is defined in Figure 2, and an
285	   example is shown in Figure 3.

287	   Definition:

289	   QlogFile = {
290	       qlog_version: text
291	       ? qlog_format: text .default "JSON"
292	       ? title: text
293	       ? description: text
294	       ? summary: Summary
295	       ? traces: [+ Trace / TraceError]
296	   }

298	                       Figure 2: QlogFile definition

300	   JSON serialization example:

302	   {
303	       "qlog_version": "0.3",
304	       "qlog_format": "JSON",
305	       "title": "Name of this particular qlog file (short)",
306	       "description": "Description for this group of traces (long)",
307	       "summary": {
308	           ...
309	       },
310	       "traces": [...]
311	   }

313	                         Figure 3: QlogFile example
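As an illustration of the top-level structure, the following Python sketch builds a QlogFile object with the header fields first and checks the 256-character recommendation.  The function names make_qlog_file and header_is_early are hypothetical.  The check assumes the default json.dumps formatting and counts characters rather than bytes.

```python
import json


def make_qlog_file(traces, title=None, description=None):
    """Build a QlogFile dict with the header fields inserted first."""
    qlog = {"qlog_version": "0.3", "qlog_format": "JSON"}
    if title is not None:
        qlog["title"] = title
    if description is not None:
        qlog["description"] = description
    if traces:
        # "traces" is optional, but needs at least one entry when present.
        qlog["traces"] = traces
    return qlog


def header_is_early(qlog, limit=256):
    """Check that qlog_version and qlog_format end within `limit` characters."""
    text = json.dumps(qlog)  # dicts keep insertion order when serialized
    for key in ("qlog_version", "qlog_format"):
        if key == "qlog_format" and key not in qlog:
            continue  # an omitted qlog_format defaults to "JSON"
        # Assumption: default json.dumps separators (", " and ": ").
        member = json.dumps(key) + ": " + json.dumps(qlog.get(key))
        position = text.find(member)
        if position < 0 or position + len(member) > limit:
            return False
    return True
```

A logger that writes the header fields after a long title or description would fail this check, which is the situation the recommendation is meant to avoid.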

315	3.1.  Summary

317	   In a real-life deployment with a large amount of generated logs, it
318	   can be useful to sort and filter logs based on some basic summarized
319	   or aggregated data (e.g., log length, packet loss rate, log location,
320	   presence of error events, ...).  The summary field (if present)
321	   SHOULD be on top of the qlog file, as this allows for the file to be
322	   processed in a streaming fashion (i.e., the implementation could just
323	   read up to and including the summary field and then only load the
324	   full logs that are deemed interesting by the user).

326	   As the summary field is highly deployment-specific, this document
327	   does not specify any default fields or their semantics.  Some
328	   examples of potential entries are shown in Figure 5.

330	   Definition:

332	   Summary = {
333	       ; summary can contain any type of custom information
334	       ; text here doesn't mean the type text,
335	       ; but the fact that keys/names in the objects are strings
336	       * text => any
337	   }

339	                        Figure 4: Summary definition

341	   JSON serialization example:

343	   {
344	       "trace_count": 1,
345	       "max_duration": 5006,
346	       "max_outgoing_loss_rate": 0.013,
347	       "total_event_count": 568,
348	       "error_count": 2
349	   }

351	                         Figure 5: Summary example
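A summary like the one above could be generated in a post-processing step.  The Python sketch below is one possible approach, not a required one.  The field names trace_count and total_event_count are taken from the example; the function names are hypothetical.  It assumes TraceError entries can be recognized by the absence of the required "events" field.

```python
def build_summary(traces):
    """Compute a small, deployment-specific summary for a list of traces."""
    # Trace requires "events"; TraceError entries do not have it.
    normal = [trace for trace in traces if "events" in trace]
    return {
        "trace_count": len(normal),
        "total_event_count": sum(len(trace["events"]) for trace in normal),
    }


def with_summary(qlog):
    """Return a copy of the qlog dict with "summary" placed before "traces"."""
    result = {key: value for key, value in qlog.items()
              if key not in ("summary", "traces")}
    traces = qlog.get("traces", [])
    result["summary"] = build_summary(traces)
    if traces:
        result["traces"] = traces
    return result
```

Placing "summary" ahead of "traces" keeps it near the top of the serialized file, so a reader can stop parsing after the summary.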

353	3.2.  traces

355	   It is often advantageous to group several related qlog traces
356	   together in a single file.  For example, we can simultaneously
357	   perform logging on the client, on the server and on a single point on
358	   their common network path.  For analysis, it is useful to aggregate
359	   these three individual traces together into a single file, so it can
360	   be uniquely stored, transferred and annotated.

362	   As such, the "traces" array contains a list of individual qlog
363	   traces.  Typical qlogs will only contain a single trace in this
364	   array.  These can later be combined into a single qlog file by taking
365	   the "traces" entry/entries for each qlog file individually and
366	   copying them to the "traces" array of a new, aggregated qlog file.
367	   This is typically done in a post-processing step.

369	   The "traces" array can thus contain not only normal traces (for the
370	   definition of the Trace type, see Section 3.3), but also "error"
371	   entries.  These indicate that we tried to find/convert a file for
372	   inclusion in the aggregated qlog, but there was an error during the
373	   process.  Rather than silently dropping the erroneous file, we can
374	   opt to explicitly include it in the qlog file as an entry in the
375	   "traces" array, as shown in Figure 6.

377	   Definition:

379	   TraceError = {
380	       error_description: text
381	       ; the original URI at which we attempted to find the file
382	       ? uri: text
383	       ? vantage_point: VantagePoint
384	   }

386	                      Figure 6: TraceError definition

388	   JSON serialization example:

390	   {
391	       "error_description": "File could not be found",
392	       "uri": "/srv/traces/today/latest.qlog",
393	       "vantage_point": { "type": "server" }
394	   }

396	                        Figure 7: TraceError example
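The aggregation step described in this section can be sketched in Python as follows.  This is an illustration under stated assumptions, not a reference implementation: the function name aggregate_qlogs is hypothetical, the inputs are assumed to be local JSON-serialized qlog files, and the exception text is used as the error_description.

```python
import json


def aggregate_qlogs(paths):
    """Combine the "traces" of several qlog files into one new qlog dict."""
    traces = []
    for path in paths:
        try:
            with open(path, encoding="utf-8") as handle:
                qlog = json.load(handle)
            traces.extend(qlog["traces"])
        except (OSError, ValueError, KeyError, TypeError) as error:
            # Record the failure as a TraceError entry instead of
            # silently dropping the file.
            traces.append({"error_description": str(error), "uri": path})
    return {"qlog_version": "0.3", "qlog_format": "JSON", "traces": traces}
```

Files that are missing, are not valid JSON, or lack a "traces" field each produce one TraceError entry, so the aggregated file still documents every source that was attempted.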

398	   Note that another way to combine events of different traces in a
399	   single qlog file is through the use of the "group_id" field,
400	   discussed in Section 3.4.6.

402	3.3.  Individual Trace containers

404	   The exact conceptual definition of a Trace can be fluid.  For
405	   example, a trace could contain all events for a single connection,
406	   for a single endpoint, for a single measurement interval, for a
407	   single protocol, etc.  As such, a Trace container contains some
408	   metadata in addition to the logged events, see Figure 8.

410	   In the normal use case however, a trace is a log of a single data
411	   flow collected at a single location or vantage point.  For example,
412	   for QUIC, a single trace only contains events for a single logical
413	   QUIC connection for either the client or the server.

415	   The semantics and context of the trace can mainly be deduced from the
416	   entries in the "common_fields" list and "vantage_point" field.

418	   Definition:

420	   Trace = {
421	       ? title: text
422	       ? description: text
423	       ? configuration: Configuration
424	       ? common_fields: CommonFields
425	       ? vantage_point: VantagePoint
426	       events: [* Event]
427	   }

429	                         Figure 8: Trace definition

431	   JSON serialization example:

433	   {
434	       "title": "Name of this particular trace (short)",
435	       "description": "Description for this trace (long)",
436	       "configuration": {
437	           "time_offset": 150
438	       },
439	       "common_fields": {
440	           "ODCID": "abcde1234",
441	           "time_format": "absolute"
442	       },
443	       "vantage_point": {
444	           "name": "backend-67",
445	           "type": "server"
446	       },
447	       "events": [...]
448	   }

450	                          Figure 9: Trace example

452	3.3.1.  Configuration

454	   We take into account that a qlog file is usually not used in
455	   isolation, but by means of various tools.  Especially when
456	   aggregating various traces together or preparing traces for a
457	   demonstration, one might wish to persist certain tool-based settings
458	   inside the qlog file itself.  For this, the configuration field is
459	   used.

461	   The configuration field can be viewed as a generic metadata field
462	   that tools can fill with their own fields, based on per-tool logic.
463	   It is best practice for tools to prefix each added field with their
464	   tool name to prevent collisions across tools.  This document only
465	   defines two optional, standard, tool-independent configuration
466	   settings: "time_offset" and "original_uris".

468	   Definition:

470	   Configuration = {
471	       ; time_offset is in milliseconds
472	       ? time_offset: float64
473	       ? original_uris: [* text]
474	       * text => any
475	   }

477	                    Figure 10: Configuration definition

479	   JSON serialization example:

481	   {
482	       "time_offset": 150,
483	       "original_uris": [
484	           "https://example.org/trace1.qlog",
485	           "https://example.org/trace2.qlog"
486	       ]
487	   }

489	                      Figure 11: Configuration example

491	3.3.1.1.  time_offset

493	   The time_offset field indicates by how many milliseconds the starting
494	   time of the current trace should be offset.  This is useful when
495	   comparing logs taken from various systems, where clocks might not be
496	   perfectly synchronous.  Users could use manual tools or automated
497	   logic to align traces in time and the found optimal offsets can be
498	   stored in this field for future usage.  The default value is 0.
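How a tool applies the offset is not prescribed here.  The Python sketch below shows one plausible interpretation: it assumes the offset is added to each event time and that the trace uses the "absolute" or "relative" time format (for "delta", only the first value would need shifting).  The function name effective_times is hypothetical.

```python
def effective_times(trace):
    """Return the event times of a trace shifted by its time_offset."""
    # A missing configuration or time_offset falls back to the default of 0.
    offset = trace.get("configuration", {}).get("time_offset", 0)
    # Assumption: the offset is added, and times are not delta-encoded.
    return [event["time"] + offset for event in trace["events"]]
```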

500	3.3.1.2.  original_uris

502	   The original_uris field is used when merging multiple individual qlog
503	   files or other source files (e.g., when converting .pcaps to qlog).
504	   It helps keep track of where certain data came from.  It is a
505	   simple array of strings.  It is an array instead of a single string,
506	   since a single qlog trace can be made up out of an aggregation of
507	   multiple component qlog traces as well.  The default value is an
508	   empty array.

510	3.3.1.3.  custom fields

512	   Tools can add optional custom metadata to the "configuration" field
513	   to store state and make it easier to share specific data viewpoints
514	   and view configurations.

516	   Two examples from the qvis toolset (https://qvis.edm.uhasselt.be) are
517	   shown in Figure 12.

519	   {
520	       "configuration" : {
521	           "qvis" : {
522	               "congestion_graph": {
523	                   "startX": 1000,
524	                   "endX": 2000,
525	                   "focusOnEventIndex": 124
526	               },

528	               "sequence_diagram" : {
529	                   "focusOnEventIndex": 555
530	               }
531	           }
532	       }
533	   }

535	               Figure 12: Custom configuration fields example

537	3.3.2.  vantage_point

539	   The vantage_point field describes the vantage point from which the
540	   trace originates, see Figure 13.  Each trace can have only a single
541	   vantage_point and thus all events in a trace MUST be from the
542	   perspective of this vantage_point.  To include events from multiple
543	   vantage_points, implementers can for example include multiple traces,
544	   split by vantage_point, in a single qlog file.

546	   Definitions:

548	   VantagePoint = {
549	       ? name: text
550	       type: VantagePointType
551	       ? flow: VantagePointType
552	   }

554	   ; client = endpoint which initiates the connection
555	   ; server = endpoint which accepts the connection
556	   ; network = observer in between client and server
557	   VantagePointType = "client" / "server" / "network" / "unknown"

559	                     Figure 13: VantagePoint definition

561	   JSON serialization examples:

563	   {
564	       "name": "aioquic client",
565	       "type": "client",
566	       "type": "client"

568	   {
569	       "name": "wireshark trace",
570	       "type": "network",
571	       "flow": "client"
572	   }

574	                      Figure 14: VantagePoint example

576	   The flow field is only required if the type is "network" (for
577	   example, the trace is generated from a packet capture).  It is used
578	   to disambiguate events like "packet sent" and "packet received".
579	   This is indicated explicitly because for multiple reasons (e.g.,
580	   privacy) data from which the flow direction can be otherwise inferred
581	   (e.g., IP addresses) might not be present in the logs.

583	   Meaning of the different values for the flow field:
584	   *  "client": the vantage point follows client data flow semantics
585	      (a "packet sent" event goes in the direction of the server).
586	   *  "server": the vantage point follows server data flow semantics
587	      (a "packet sent" event goes in the direction of the client).
588	   *  "unknown": the flow's direction is unknown.

590	   Depending on the context, tools confronted with "unknown" values in
591	   the vantage_point can either try to heuristically infer the semantics
592	   from protocol-level domain knowledge (e.g., in QUIC, the client
593	   always sends the first packet) or give the user the option to switch
594	   between client and server perspectives manually.
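The rules for interpreting the type and flow fields can be sketched in Python as follows.  The function names perspective and destination_of_sent_packet are hypothetical, and the sketch treats any value other than "client" or "server" as "unknown" rather than attempting the heuristics mentioned above.

```python
def perspective(vantage_point):
    """Return "client", "server" or "unknown" for a VantagePoint dict."""
    vp_type = vantage_point.get("type", "unknown")
    if vp_type == "network":
        # For in-network observers, flow disambiguates the direction.
        vp_type = vantage_point.get("flow", "unknown")
    return vp_type if vp_type in ("client", "server") else "unknown"


def destination_of_sent_packet(vantage_point):
    """Return which endpoint a "packet sent" event is travelling towards."""
    opposite = {"client": "server", "server": "client"}
    return opposite.get(perspective(vantage_point), "unknown")
```

A tool that receives "unknown" from such a helper could then fall back to protocol-level inference or to a manual switch in its user interface.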

596	3.4.  Field name semantics

598	   Inside of the "events" field of a qlog trace is a list of events
599	   logged by the endpoint.  Each event is specified as a generic object
600	   with a number of member fields and their associated data.  Depending
601	   on the protocol and use case, the exact member field names and their
602	   formats can differ across implementations.  This section lists the
603	   main, pre-defined and reserved field names with specific semantics
604	   and expected corresponding value formats.

606	   Each qlog event at minimum requires the "time" (Section 3.4.1),
607	   "name" (Section 3.4.2) and "data" (Section 3.4.3) fields.  Other
608	   typical fields are "time_format" (Section 3.4.1), "protocol_type"
609	   (Section 3.4.4), "trigger" (Section 3.4.5), and "group_id"
610	   (Section 3.4.6).  As especially these latter fields typically have
611	   identical values across individual event instances, they are normally
612	   logged separately in the "common_fields" (Section 3.4.7).

614	   The specific values for each of these fields and their semantics are
615	   defined in separate documents, specific per protocol or use case.
616	   For example: event definitions for QUIC, HTTP/3 and QPACK can be
617	   found in [QLOG-QUIC] and [QLOG-H3].

619	   Other fields are explicitly allowed by the qlog approach, and tools
620	   SHOULD allow for the presence of unknown event fields, but their
621	   semantics depend on the context of the log usage (e.g., for QUIC, the
622	   ODCID field is used), see [QLOG-QUIC].

624	   The definition of a qlog event and its component fields is shown in
625	   Figure 15, and an example is shown in Figure 16.

627	   Definition:

629	   Event = {
630	       time: float64
631	       name: text
632	       data: $ProtocolEventBody

634	       ? time_format: TimeFormat

636	       ? protocol_type: ProtocolType
637	       ? group_id: GroupID

639	       ; events can contain any amount of custom fields
640	       * text => any
641	   }

643	                        Figure 15: Event definition

645	   JSON serialization example:

647	   {
648	       "time": 1553986553572,

650	       "name": "transport:packet_sent",
651	       "data": { ... },

653	       "protocol_type": ["QUIC", "HTTP3"],
654	       "group_id": "127ecc830d98f9d54a42c4f0842aa87e181a",

656	       "time_format": "absolute",

658	       "ODCID": "127ecc830d98f9d54a42c4f0842aa87e181a"
659	   }

661	                          Figure 16: Event example
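A tool might perform a basic structural check on each event before processing it.  The Python sketch below is a hypothetical helper (event_problems is not defined by qlog).  It only checks the event object itself, does not consult "common_fields", and tolerates unknown fields, as this section recommends.

```python
REQUIRED_EVENT_FIELDS = ("time", "name", "data")
TIME_FORMATS = ("absolute", "delta", "relative")


def event_problems(event):
    """Return a list of problems found; an empty list means none were found."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED_EVENT_FIELDS if name not in event]
    if "time" in event:
        time = event["time"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(time, bool) or not isinstance(time, (int, float)):
            problems.append("time must be a number")
    if "name" in event and not isinstance(event["name"], str):
        problems.append("name must be a string")
    # An absent time_format is treated as the default, "absolute".
    if event.get("time_format", "absolute") not in TIME_FORMATS:
        problems.append("time_format must be absolute, delta or relative")
    return problems
```

Custom fields such as ODCID pass through unchecked, since their semantics depend on the protocol-specific documents.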

663	3.4.1.  Timestamps

665	   The "time" field indicates the timestamp at which the event
666	   occurred.  Its value is typically the Unix timestamp since the 1970
667	   epoch (number of milliseconds since midnight UTC, January 1, 1970,
668	   ignoring leap seconds).  However, qlog supports two more succinct
669	   timestamp formats to allow reducing file size.  The employed format
670	   is indicated in the "time_format" field, which allows one of three
671	   values: "absolute", "delta" or "relative".

673	   Definition:

675	   TimeFormat = "absolute" / "delta" / "relative"

677	                      Figure 17: TimeFormat definition

679	   *  Absolute: Include the full absolute timestamp with each event.
680	      This approach uses the largest amount of characters.  This is also
681	      the default value of the "time_format" field.

683	   *  Delta: Delta-encode each time value relative to the previously
684	      logged value.  The first event in a trace typically logs the full
685	      absolute timestamp.  This approach uses the smallest number of
686	      characters.

688	   *  Relative: Specify a full "reference_time" timestamp (typically
689	      this is done up-front in "common_fields", see Section 3.4.7) and
690	      include only relatively-encoded values based on this
691	      reference_time with each event.  The "reference_time" value is
692	      typically the first absolute timestamp.  This approach uses a
693	      moderate number of characters.

695	   The first option is good for stateless loggers, the second and third
696	   for stateful loggers.  The third option is generally preferred, since
697	   it produces smaller files while being easier to reason about.  An
698	   example for each option can be seen in Figure 18.

700	   The absolute approach will use:
701	   1500, 1505, 1522, 1588

703	   The delta approach will use:
704	   1500, 5, 17, 66

706	   The relative approach will:
707	   - set the reference_time to 1500 in "common_fields"
708	   - use: 0, 5, 22, 88

710	        Figure 18: Three different approaches for logging timestamps
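
   The three approaches in Figure 18 can be mapped back to absolute
   values by a tool.  The following is a minimal, purely illustrative
   sketch; the function name "to_absolute" and its parameters are
   hypothetical and not defined by qlog.  It assumes that, for the
   "delta" format, the first logged value is a full timestamp.

```python
from typing import Iterable, List, Optional


def to_absolute(times: Iterable[float],
                time_format: str = "absolute",
                reference_time: Optional[float] = None) -> List[float]:
    """Convert logged "time" values of one trace to absolute timestamps."""
    values = list(times)
    if time_format == "absolute":
        # Every event already carries the full timestamp.
        return values
    if time_format == "relative":
        if reference_time is None:
            raise ValueError('"relative" requires a reference_time')
        # Each value is an offset from the shared reference_time.
        return [reference_time + t for t in values]
    if time_format == "delta":
        # Assumption: the first value is a full timestamp (as in
        # Figure 18); each later value is an offset from its predecessor.
        result: List[float] = []
        current = 0.0
        for t in values:
            current += t
            result.append(current)
        return result
    raise ValueError(f"unknown time_format: {time_format!r}")
```

   With the values of Figure 18, both to_absolute([1500, 5, 17, 66],
   "delta") and to_absolute([0, 5, 22, 88], "relative", 1500) yield
   1500, 1505, 1522, 1588.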

712	   One of these options is typically chosen for the entire trace (put
713	   differently: each event has the same value for the "time_format"
714	   field).  Each event MUST include a timestamp in the "time" field.

716	   Events in each individual trace SHOULD be logged in strictly
717	   ascending timestamp order (though not necessarily absolute value, for
718	   the "delta" format).  Tools CAN sort all events on the timestamp
719	   before processing them, though they are not required to (as this
720	   could impose a significant processing overhead).  This can be a
721	   problem especially for multi-threaded and/or streaming loggers, which
722	   could consider using a separate postprocessor to order qlog events in
723	   time if a tool does not provide this feature.

725	   Timestamps do not have to use the UNIX epoch timestamp as their
726	   reference.  For example, for privacy reasons, any initial reference
727	   timestamp (for example "endpoint uptime in ms" or "time since
728	   connection start in ms") can be chosen.  Tools SHOULD NOT assume
729	   the ability to derive the absolute Unix timestamp from qlog
730	   traces, nor rely on them to relatively order events across two or
731	   more separate traces (in this case, clock drift should also be taken
732	   into account).

734	3.4.2.  Category and Event Type

736	   Events differ mainly in the type of metadata associated with them.
737	   To help identify a given event and how to interpret its metadata in
738	   the "data" field (see Section 3.4.3), each event has an associated
739	   "name" field.  This can be considered as a concatenation of two other
740	   fields, namely event "category" and event "type".

742	   Category allows a higher-level grouping of events per specific event
743	   type.  For example for QUIC and HTTP/3, the different categories
744	   could be "transport", "http", "qpack", and "recovery".  Within these
745	   categories, the event Type provides additional granularity.  For
746	   example for QUIC and HTTP/3, within the "transport" Category, there
747	   would be "packet_sent" and "packet_received" events.

749	   Logging category and type separately conceptually allows for fast and
750	   high-level filtering based on category and the re-use of event types
751	   across categories.  However, it also considerably inflates the log
752	   size and this flexibility is not used extensively in practice at the
753	   time of writing.

755	   As such, the default approach in qlog is to concatenate both field
756	   values using the ":" character in the "name" field, as can be seen in
757	   Figure 19.  As such, qlog category and type names MUST NOT include
758	   this character.

760	   JSON serialization using separate fields:
761	   {
762	       "category": "transport",
763	       "type": "packet_sent"
764	   }

766	   JSON serialization using ":" concatenated field:
767	   {
768	       "name": "transport:packet_sent"
769	   }

771	      Figure 19: Ways of logging category, type and name of an event.

773	   Certain serializations CAN emit category and type as separate fields,
774	   and qlog tools SHOULD be able to deal with both the concatenated
775	   "name" field, and the separate "category" and "type" fields.  Text-
776	   based serializations however are encouraged to employ the
777	   concatenated "name" field for efficiency.
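
   A tool that accepts both forms shown in Figure 19 could normalize
   them as sketched below.  This is a hedged illustration only; the
   helper name "category_and_type" is hypothetical.  It relies on the
   rule above that category and type names do not contain the ":"
   character.

```python
from typing import Mapping, Tuple


def category_and_type(event: Mapping[str, object]) -> Tuple[str, str]:
    """Return (category, type) for either way of logging an event name."""
    if "name" in event:
        # Concatenated form, e.g. "transport:packet_sent".
        category, sep, event_type = str(event["name"]).partition(":")
        if not sep:
            raise ValueError('"name" lacks the ":" separator')
        return category, event_type
    # Separate-field form.
    return str(event["category"]), str(event["type"])
```

   Both serializations in Figure 19 then produce the same pair,
   ("transport", "packet_sent").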

779	3.4.3.  Data

781	   The data field is a generic object.  It contains the per-event
782	   metadata and its form and semantics are defined per specific sort of
783	   event.  For example, data field value definitions for QUIC and
784	   HTTP/3 can be found in [QLOG-QUIC] and [QLOG-H3].

786	   This field is defined here as a CDDL extension point (a "socket" or
787	   "plug") named $ProtocolEventBody.  Other documents MUST properly
788	   extend this extension point when defining new data field content
789	   options to enable automated validation of aggregated qlog schemas.

791	   The only common field defined for the data field is the trigger
792	   field, which is discussed in Section 3.4.5.

794	   Definition:

796	   ; The ProtocolEventBody is any key-value map (e.g., JSON object)
797	   ; only the optional trigger field is defined in this document
798	   $ProtocolEventBody /= {
799	       ? trigger: text
800	       * text => any
801	   }
802	   ; event documents are intended to extend this socket by using:
803	   ; NewProtocolEvents = EventType1 / EventType2 / ... / EventTypeN
804	   ; $ProtocolEventBody /= NewProtocolEvents

806	                  Figure 20: ProtocolEventBody definition

808	   One purely illustrative example for a QUIC "packet_sent" event is
809	   shown in Figure 21:

811	   TransportPacketSent = {
812	       ? packet_size: uint16
813	       header: PacketHeader
814	       ? frames:[* QuicFrame]
815	       ? trigger: "pto_probe" / "retransmit_timeout" / "bandwidth_probe"
816	   }

818	   could be serialized as

820	   {
821	       packet_size: 1280,
822	       header: {
823	           packet_type: "1RTT",
824	           packet_number: 123
825	       },
826	       frames: [
827	           {
828	               frame_type: "stream",
829	               length: 1000,
830	               offset: 456
831	           },
832	           {
833	               frame_type: "padding"
834	           }
835	       ]
836	   }

838	    Figure 21: Example of the 'data' field for a QUIC packet_sent event

840	3.4.4.  protocol_type

842	   The "protocol_type" array field indicates to which protocols (or
843	   protocol "stacks") this event belongs.  This allows a single qlog
844	   file to aggregate traces of different protocols (e.g., a web server
845	   offering both TCP+HTTP/2 and QUIC+HTTP/3 connections).

847	   Definition:

849	   ProtocolType = [+ text]

851	                     Figure 22: ProtocolType definition

853	   For example, QUIC and HTTP/3 events have the "QUIC" and "HTTP3"
854	   protocol_type entry values, see [QLOG-QUIC] and [QLOG-H3].

856	   Typically however, all events in a single trace are of the same few
857	   protocols, and this array field is logged once in "common_fields",
858	   see Section 3.4.7.

860	3.4.5.  Triggers

862	   Sometimes, additional information is needed in the case where a
863	   single event can be caused by a variety of other events.  In the
864	   normal case, the context of the surrounding log messages gives a hint
865	   as to which of these other events was the cause.  However, in highly-
866	   parallel and optimized implementations, corresponding log messages
867	   might be separated in time.  Another option is to explicitly
868	   indicate these "triggers" in a high-level way per-event to get more
869	   fine-grained information without much additional overhead.

871	   In qlog, the optional "trigger" field contains a string value
872	   describing the reason (if any) for this event instance occurring,
873	   see Section 3.4.3.  While this "trigger" field could be a property
874	   of the qlog Event itself, it is instead a property of the "data"
875	   field.  This choice was made because many event types do not
876	   include a trigger value, and having the field at the Event-level
877	   would cause overhead in some serializations.  Additional information
878	   on the trigger can be added in the form of additional member fields
879	   of the "data" field value, yet this is highly implementation-
880	   specific, as are the trigger field's string values.

882	   One purely illustrative example of some potential triggers for QUIC's
883	   "packet_dropped" event is shown in Figure 23:

885	   TransportPacketDropped = {
886	       ? packet_type: PacketType
887	       ? raw_length: uint16

889	       ? trigger: "key_unavailable" / "unknown_connection_id" /
890	                  "decrypt_error" / "unsupported_version"
891	   }

893	                         Figure 23: Trigger example

895	3.4.6.  group_id

897	   As discussed in Section 3.3, a single qlog file can contain several
898	   traces taken from different vantage points.  However, a single trace
899	   from one endpoint can also contain events from a variety of sources.
900	   For example, a server implementation might choose to log events for
901	   all incoming connections in a single large (streamed) qlog file.  As
902	   such, we need a method for splitting up events belonging to separate
903	   logical entities.

905	   The simplest way to perform this splitting is by associating a "group
906	   identifier" to each event that indicates to which conceptual "group"
907	   each event belongs.  A post-processing step can then extract events
908	   per group.  However, this group identifier can be highly protocol and
909	   context-specific.  In the example above, we might use QUIC's
910	   "Original Destination Connection ID" to uniquely identify a
911	   connection.  As such, we might add an "ODCID" field to each event.
912	   However, a middlebox logging IP or TCP traffic might rather use four-
913	   tuples to identify connections, and add a "four_tuple" field.

915	   As such, to provide consistency and ease of tooling in cross-protocol
916	   and cross-context setups, qlog instead defines the common "group_id"
917	   field, which contains a string value.  Implementations are free to
918	   use their preferred string serialization for this field, so long as
919	   it contains a unique value per logical group.  Some examples can be
920	   seen in Figure 25.

922	   Definition:

924	   GroupID = text

926	                       Figure 24: GroupID definition

928	   JSON serialization example for events grouped by four tuples and QUIC
929	   connection IDs:

931	   events: [
932	       {
933	           time: 1553986553579,
934	           protocol_type: ["TCP", "TLS", "HTTP2"],
935	           group_id: "ip1=2001:67c:1232:144:9498:6df6:f450:110b,
936	                      ip2=2001:67c:2b0:1c1::198,port1=59105,port2=80",
937	           name: "transport:packet_received",
938	           data: { ... },
939	       },
940	       {
941	           time: 1553986553581,
942	           protocol_type: ["QUIC","HTTP3"],
943	           group_id: "127ecc830d98f9d54a42c4f0842aa87e181a",
944	           name: "transport:packet_sent",
945	           data: { ... },
946	       }
947	   ]

949	                         Figure 25: GroupID example
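
   The post-processing step that extracts events per group could look
   like the following minimal sketch.  The function name
   "split_by_group" and the "default_group" parameter are hypothetical;
   the latter stands in for a group_id that was logged once in
   "common_fields" rather than on each event.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Optional


def split_by_group(events: Iterable[Mapping[str, object]],
                   default_group: Optional[str] = None
                   ) -> Dict[Optional[str], List[Mapping[str, object]]]:
    """Bucket events by their "group_id", preserving the logged order."""
    groups: Dict[Optional[str], List[Mapping[str, object]]] = defaultdict(list)
    for event in events:
        # Events without their own group_id fall back to default_group.
        groups[event.get("group_id", default_group)].append(event)
    return dict(groups)
```

   Applied to the two events of Figure 25, this yields two groups of
   one event each, keyed by the four-tuple string and by the QUIC
   connection ID respectively.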

951	   Note that in some contexts (for example a Multipath transport
952	   protocol) it might make sense to add additional contextual per-event
953	   fields (for example "path_id"), rather than use the group_id field
954	   for that purpose.

956	   Note also that, typically, a single trace only contains events
957	   belonging to a single logical group (for example, an individual QUIC
958	   connection).  As such, instead of logging the "group_id" field with
959	   an identical value for each event instance, this field is typically
960	   logged once in "common_fields", see Section 3.4.7.

962	3.4.7.  common_fields

964	   As discussed in the previous sections, information for a typical qlog
965	   event varies in three main fields: "time", "name" and associated
966	   data.  Additionally, there are also several more advanced fields that
967	   allow mixing events from different protocols and contexts inside of
968	   the same trace (for example "protocol_type" and "group_id").  In most
969	   "normal" use cases however, the values of these advanced fields are
970	   consistent for each event instance (for example, a single trace
971	   contains events for a single QUIC connection).

973	   To reduce file size and make logging easier, qlog uses the
974	   "common_fields" list to indicate those fields and their values that
975	   are shared by all events in this component trace.  This prevents
976	   these fields from being logged for each individual event.  An example
977	   of this is shown in Figure 26.

979	   JSON serialization with repeated field values
980	   per-event instance:

982	   {
983	       events: [{
984	               group_id: "127ecc830d98f9d54a42c4f0842aa87e181a",
985	               protocol_type: ["QUIC","HTTP3"],
986	               time_format: "relative",
987	               reference_time: 1553986553572,

989	               time: 2,
990	               name: "transport:packet_received",
991	               data: { ... }
992	           },{
993	               group_id: "127ecc830d98f9d54a42c4f0842aa87e181a",
994	               protocol_type: ["QUIC","HTTP3"],
995	               time_format: "relative",
996	               reference_time: 1553986553572,

998	               time: 7,
999	               name: "http:frame_parsed",
1000	               data: { ... }
1001	           }
1002	       ]
1003	   }

1005	   JSON serialization with repeated field values instead
1006	   extracted to common_fields:

1008	   {
1009	       common_fields: {
1010	           group_id: "127ecc830d98f9d54a42c4f0842aa87e181a",
1011	           protocol_type: ["QUIC","HTTP3"],
1012	           time_format: "relative",
1013	           reference_time: 1553986553572
1014	       },
1015	       events: [
1016	           {
1017	               time: 2,
1018	               name: "transport:packet_received",
1019	               data: { ... }
1020	           },{
1021	               time: 7,
1022	               name: "http:frame_parsed",
1023	               data: { ... }
1024	           }
1025	       ]
1026	   }
1027	                      Figure 26: CommonFields example

1029	   The "common_fields" field is a generic dictionary of key-value pairs,
1030	   where the key is always a string and the value can be of any type,
1031	   but is typically also a string or number.  As such, unknown entries
1032	   in this dictionary MUST be disregarded by the user and tools (i.e.,
1033	   the presence of an unknown field is explicitly NOT an error).

1035	   The list of default qlog fields that are typically logged in
1036	   common_fields (as opposed to as individual fields per event instance)
1037	   is shown in the listing below:

1039	   Definition:

1041	   CommonFields = {
1042	       ? time_format: TimeFormat
1043	       ? reference_time: float64

1045	       ? protocol_type: ProtocolType
1046	       ? group_id: GroupID

1048	       * text => any
1049	   }

1051	                     Figure 27: CommonFields definition

1053	   Tools MUST be able to deal with these fields being defined either on
1054	   each event individually or combined in common_fields.  Note that if
1055	   at least one event in a trace has a different value for a given
1056	   field, this field MUST NOT be added to common_fields but instead
1057	   defined on each event individually.  Good examples of such fields
1058	   are "time" and "data", which are divergent by nature.
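
   One way for a tool to treat both layouts uniformly is to copy the
   "common_fields" entries onto every event before further processing.
   The sketch below is illustrative only; the function name
   "expand_common_fields" is hypothetical.  Letting a per-event value
   take precedence over "common_fields" is an assumption made here for
   robustness; in a conforming trace the two never conflict.

```python
from typing import Any, Dict, List, Mapping


def expand_common_fields(trace: Mapping[str, Any]) -> List[Dict[str, Any]]:
    """Return the trace's events with common_fields merged into each."""
    common = trace.get("common_fields", {})
    # Later keys win in a dict merge, so event-level values override
    # common_fields (assumption, see the text above).
    return [{**common, **event} for event in trace.get("events", [])]
```

   Applied to the second serialization in Figure 26, every resulting
   event carries "group_id", "protocol_type", "time_format" and
   "reference_time" next to its own "time", "name" and "data" fields.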

1060	4.  Guidelines for event definition documents

1062	   This document only defines the main schema for the qlog format.  This
1063	   is intended to be used together with specific, per-protocol event
1064	   definitions that specify the name (category + type) and data needed
1065	   for each individual event.  This is with the intent to allow the qlog
1066	   main schema to be easily re-used for several protocols.  Examples
1067	   include the QUIC event definitions [QLOG-QUIC] and HTTP/3 and QPACK
1068	   event definitions [QLOG-H3].

1070	   This section defines some basic annotations and concepts the creators
1071	   of event definition documents SHOULD follow to ensure a measure of
1072	   consistency, making it easier for qlog implementers to extrapolate
1073	   from one protocol to another.

1075	4.1.  Event design guidelines

1077	   TODO: pending QUIC working group discussion.  This text reflects the
1078	   initial (qlog draft 01 and 02) setup.

1080	   There are several ways of defining qlog events.  In practice, we have
1081	   seen two main types used so far: a) those that map directly to
1082	   concepts seen in the protocols (e.g., packet_sent) and b) those that
1083	   act as aggregating events that combine data from several possible
1084	   protocol behaviours or code paths into one (e.g., parameters_set).
1085	   The latter are typically used as a means to reduce the amount of
1086	   unique event definitions, as reflecting each possible protocol event
1087	   as a separate qlog entity would cause an explosion of event types.

1089	   Additionally, logging duplicate data is typically prevented as much
1090	   as possible.  For example, packet header values that remain
1091	   consistent across many packets are split into separate events (for
1092	   example spin_bit_updated or connection_id_updated for QUIC).

1094	   Finally, we have typically refrained from adding additional state
1095	   change events if those state changes can be directly inferred from
1096	   data on the wire (for example flow control limit changes) if the
1097	   implementation is bug-free and spec-compliant.  Exceptions have been
1098	   made for common events that benefit from being easily identifiable or
1099	   individually logged (for example packets_acked).

1101	4.2.  Event importance indicators

1103	   Depending on how events are designed, it may be that several events
1104	   allow the logging of similar or overlapping data.  For example the
1105	   separate QUIC connection_started event overlaps with the more generic
1106	   connection_state_updated.  In these cases, it is not always clear
1107	   which event should be logged or used, and which event should take
1108	   precedence if e.g., both are present and provide conflicting
1109	   information.

1111	   To aid in this decision making, we recommend that each event SHOULD
1112	   have an "importance indicator" with one of three values, in
1113	   decreasing order of importance and expected usage:

1115	   *  Core

1117	   *  Base

1119	   *  Extra
1120	   The "Core" events are the events that SHOULD be present in all qlog
1121	   files for a given protocol.  These are typically tied to basic packet
1122	   and frame parsing and creation, as well as listing basic internal
1123	   metrics.  Tool implementers SHOULD expect and add support for these
1124	   events, though SHOULD NOT expect all Core events to be present in
1125	   each qlog trace.

1127	   The "Base" events add additional debugging options and CAN be present
1128	   in qlog files.  Most of these can be implicitly inferred from data in
1129	   Core events (if those contain all their properties), but for many it
1130	   is better to log the events explicitly as well, making it clearer how
1131	   the implementation behaves.  These events are for example tied to
1132	   passing data around in buffers, to how internal state machines change
1133	   and help show when decisions are actually made based on received
1134	   data.  Tool implementers SHOULD at least add support for showing the
1135	   contents of these events, if they do not handle them explicitly.

1137	   The "Extra" events are considered mostly useful for low-level
1138	   debugging of the implementation, rather than the protocol.  They
1139	   allow more fine-grained tracking of internal behaviour.  As such,
1140	   they CAN be present in qlog files and tool implementers CAN add
1141	   support for these, but they are not required to.

1143	   Note that in some cases, implementers might not want to log for
1144	   example data content details in the "Core" events due to performance
1145	   or privacy considerations.  In this case, they SHOULD use (a subset
1146	   of) relevant "Base" events instead to ensure usability of the qlog
1147	   output.  As an example, implementations that do not log QUIC
1148	   packet_received events and thus also not which (if any) ACK frames
1149	   the packet contains, SHOULD log packets_acked events instead.

1151	   Finally, for event types whose data (partially) overlap with other
1152	   event types' definitions, where necessary the event definition
1153	   document should include explicit guidance on which to use in specific
1154	   situations.

1156	4.3.  Custom fields

1158	   Event definition documents are free to define new category and event
1159	   types, top-level fields (e.g., a per-event field indicating its
1160	   privacy properties or path_id in multipath protocols), as well as
1161	   values for the "trigger" property within the "data" field, or other
1162	   member fields of the "data" field, as they see fit.

1164	   However, they SHOULD NOT expect non-specialized tools to recognize or
1165	   visualize this custom data.  Tools SHOULD nonetheless make an effort
1166	   to visualize even unknown data if possible in the specific tool's
1167	   context.  If they do not, they MUST ignore these unknown fields.

1169	5.  Generic events and data classes

1171	   There are some event types and data classes that are common across
1172	   protocols, applications and use cases that benefit from being defined
1173	   in a single location.  This section specifies such common
1174	   definitions.

1176	5.1.  Raw packet and frame information

1178	   While qlog is a more high-level logging format, it also allows the
1179	   inclusion of most raw wire image information, such as byte lengths
1180	   and even raw byte values.  This can be useful when for example
1181	   investigating or tuning packetization behaviour or determining
1182	   encoding/framing overheads.  However, these fields are not always
1183	   necessary and can take up considerable space if logged for each
1184	   packet or frame.  They can also have a considerable privacy and
1185	   security impact.  As such, they are grouped in a separate optional
1186	   field called "raw" of type RawInfo (where applicable).

1188	   Definition:

1190	   RawInfo = {
1191	       ; the full byte length of the entity (e.g., packet or frame),
1192	       ; including headers and trailers
1193	       ? length: uint64

1195	       ; the byte length of the entity's payload,
1196	       ; without headers or trailers
1197	       ? payload_length: uint64

1199	       ; the contents of the full entity,
1200	       ; including headers and trailers
1201	       ? data: hexstring
1202	   }

1204	                       Figure 28: RawInfo definition

1206	   Note:  The RawInfo:data field can be truncated for privacy or
1207	      security purposes (for example excluding payload data), see
1208	      Section 6.1.2.  In this case, the length properties should still
1209	      indicate the non-truncated lengths.

1211	   Note:  We do not specify explicit header_length or trailer_length
1212	      fields.  In most protocols, header_length can be calculated by
1213	      subtracting the payload_length from the length (e.g., if
1214	      trailer_length is always 0).  In protocols with trailers (e.g.,
1215	      QUIC's AEAD tag), event definitions documents SHOULD define other
1216	      ways of logging the trailer_length to make the header_length
1217	      calculation possible.

1219	      The exact definitions of entities, headers, trailers and payloads
1220	      depend on the protocol used.  If this is non-trivial, event
1221	      definitions documents SHOULD include a clear explanation of how
1222	      entities are mapped into the RawInfo structure.
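
      The header_length calculation described in this note can be
      sketched as follows.  This is a hedged illustration; the function
      name "header_length" and the "trailer_length" parameter are
      hypothetical, with the latter assumed to be supplied by the event
      definition document or the tool (0 if the protocol has no
      trailer).

```python
from typing import Mapping, Optional


def header_length(raw: Mapping[str, int],
                  trailer_length: int = 0) -> Optional[int]:
    """Derive the header length from a RawInfo-like mapping.

    Returns None when "length" or "payload_length" was not logged,
    since both fields are optional in RawInfo.
    """
    length = raw.get("length")
    payload_length = raw.get("payload_length")
    if length is None or payload_length is None:
        return None
    result = length - payload_length - trailer_length
    if result < 0:
        raise ValueError("inconsistent RawInfo lengths")
    return result
```

      For example, a RawInfo with a length of 1280 and a payload_length
      of 1250 gives a header length of 30 without a trailer, and 14
      when a 16-byte trailer is assumed.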

1224	   Note:  Relatedly, many modern protocols use Variable-Length Integer
1225	      Encoded (VLIE) values in their headers, which are of a dynamic
1226	      length.  Because of this, we cannot deterministically reconstruct
1227	      the header encoding/length from non-RawInfo qlog data, as
1228	      implementations might not necessarily employ the most efficient
1229	      VLIE scheme for all values.  As such, to make exact size-analysis
1230	      possible, implementers should use explicit lengths in RawInfo
1231	      rather than reconstructing them from other qlog data.  Similarly,
1232	      tool developers should only utilize RawInfo (and related
1233	      information) in such tools to prevent errors.

1235	5.2.  Generic events

1237	   In typical logging setups, users utilize a discrete number of well-
1238	   defined logging categories, levels or severities to log freeform
1239	   (string) data.  This generic events category replicates this approach
1240	   to allow implementations to fully replace their existing text-based
1241	   logging by qlog.  This is done by providing events to log generic
1242	   strings for the typical well-known logging levels (error, warning,
1243	   info, debug, verbose).

1245	   For the events defined below, the "category" is "generic" and their
1246	   "type" is the name of the heading in lowercase (e.g., the "name" of
1247	   the error event is "generic:error").

1249	5.2.1.  error

1251	   Importance: Core

1253	   Used to log details of an internal error that might not get reflected
1254	   on the wire.

1256	   Definition:

1258	   GenericError = {
1259	       ? code: uint64
1260	       ? message: text
1261	   }

1263	                     Figure 29: GenericError definition

1265	5.2.2.  warning

1267	   Importance: Base

1269	   Used to log details of an internal warning that might not get
1270	   reflected on the wire.

1272	   Definition:

1274	   GenericWarning = {
1275	       ? code: uint64
1276	       ? message: text
1277	   }

1279	                    Figure 30: GenericWarning definition

1281	5.2.3.  info

1283	   Importance: Extra

1285	   Used mainly for implementations that want to use qlog as their one
1286	   and only logging format but still want to support unstructured string
1287	   messages.

1289	   Definition:

1291	   GenericInfo = {
1292	       message: text
1293	   }

1295	                     Figure 31: GenericInfo definition

1297	5.2.4.  debug

1299	   Importance: Extra

1301	   Used mainly for implementations that want to use qlog as their one
1302	   and only logging format but still want to support unstructured string
1303	   messages.

1305	   Definition:

1307	   GenericDebug = {
1308	       message: text
1309	   }

1311	                     Figure 32: GenericDebug definition

1313	5.2.5.  verbose

1315	   Importance: Extra

1317	   Used mainly for implementations that want to use qlog as their one
1318	   and only logging format but still want to support unstructured string
1319	   messages.

1321	   Definition:

1323	   GenericVerbose = {
1324	       message: text
1325	   }

1327	                    Figure 33: GenericVerbose definition
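
   As a purely illustrative sketch, an implementation replacing its
   text-based logging could build the generic events above with a
   single helper.  The function name "generic_event" and its parameters
   are hypothetical; the resulting structure follows the Event
   definition in Figure 15 and the generic definitions in this section.

```python
from typing import Any, Dict, Optional

# The five well-known logging levels listed in this section.
GENERIC_LEVELS = ("error", "warning", "info", "debug", "verbose")


def generic_event(time: float, level: str, message: str,
                  code: Optional[int] = None) -> Dict[str, Any]:
    """Build a qlog event in the "generic" category."""
    if level not in GENERIC_LEVELS:
        raise ValueError(f"unknown generic level: {level!r}")
    data: Dict[str, Any] = {"message": message}
    if code is not None:
        # Only GenericError and GenericWarning define a "code" field.
        if level not in ("error", "warning"):
            raise ValueError('"code" is only defined for error and warning')
        data["code"] = code
    return {"time": time, "name": f"generic:{level}", "data": data}
```

   For instance, generic_event(12, "error", "handshake failed", code=7)
   produces an event named "generic:error" whose data field holds the
   message and the code.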

1329	5.3.  Simulation events

1331	   When evaluating a protocol implementation, one typically sets up a
1332	   series of interoperability or benchmarking tests, in which the test
1333	   situations can change over time.  For example, the network bandwidth
1334	   or latency can vary during the test, or the network can be fully
1335	   disabled for a short time.  In these setups, it is useful to know
1336	   when exactly these conditions are triggered, to allow for proper
1337	   correlation with other events.

1339	   For the events defined below, the "category" is "simulation" and
1340	   their "type" is the name of the heading in lowercase (e.g., the
1341	   "name" of the scenario event is "simulation:scenario").

1343	5.3.1.  scenario

1345	   Importance: Extra

1347	   Used to specify which specific scenario is being tested at this
1348	   particular instance.  This could also be reflected in the top-level
1349	   qlog's summary or configuration fields, but having a separate event
1350	   allows easier aggregation of several simulations into one trace
1351	   (e.g., split by group_id).

1353	   Definition:

1355	   SimulationScenario = {
1356	       ? name: text
1357	       ? details: {* text => any }
1358	   }

1360	                  Figure 34: SimulationScenario definition

1362	5.3.2.  marker

1364	   Importance: Extra

1366	   Used to indicate when specific emulation conditions are triggered at
1367	   set times (e.g., at 3 seconds in 2% packet loss is introduced, at 10s
1368	   a NAT rebind is triggered).

1370	   Definition:

1372	   SimulationMarker = {
1373	       ? type: text
1374	       ? message: text
1375	   }

1377	                   Figure 35: SimulationMarker definition

1379	6.  Serializing qlog

1381	   This document and other related qlog schema definitions are
1382	   intentionally serialization-format agnostic.  This means that
1383	   implementers themselves can choose how to represent and serialize
1384	   qlog data practically on disk or on the wire.  Some examples of
1385	   possible formats are JSON, CBOR, CSV, protocol buffers, flatbuffers,
1386	   etc.

1388	   All these formats make certain tradeoffs between flexibility and
1389	   efficiency, with textual formats like JSON typically being more
1390	   flexible but also less efficient than binary formats like protocol
1391	   buffers.  The format choice will depend on the practical use case of
1392	   the qlog user.  For example, for use in day to day debugging, a
1393	   plaintext readable (yet relatively large) format like JSON is
1394	   probably preferred.  However, for use in production, a more optimized
1395	   yet restricted format can be better.  In this latter case, it will be
1396	   more difficult to achieve interoperability between qlog
1397	   implementations of various protocol stacks, as some custom or tweaked
1398	   events from one might not be compatible with the format of the other.
1399	   This will also reflect in tooling: not all tools will support all
1400	   formats.

1402	   This being said, the authors prefer JSON as the basis for storing
1403	   qlog, as it retains full flexibility and maximum interoperability.
1404	   Storage overhead can be managed well in practice by employing
1405	   compression.  For this reason, this document details how to
1406	   practically transform qlog schema definitions to [JSON], its subset
1407	   [I-JSON], and its streamable derivative [JSON-Text-Sequences].  We
1408	   discuss concrete options to bring down JSON size and processing
1409	   overheads in Section 6.3.

1411	   Because different deserializers/parsers are needed depending on the
1412	   employed format, the "qlog_format" field is used to indicate the
1413	   chosen serialization approach.  This field is always a string, but
1414	   can be made hierarchical by the use of the "." separator between
1415	   entries.  For example, a value of "JSON.optimizationA" can indicate
1416	   that a default JSON format is being used, but that a certain
1417	   optimization of type A was applied to the file as well (see also
1418	   Section 6.3).
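
   A tool could split a hierarchical "qlog_format" value as in the
   minimal sketch below.  The function name "parse_qlog_format" is
   hypothetical, and "optimizationA" is only the placeholder name used
   in the text above, not a defined optimization.

```python
from typing import List, Tuple


def parse_qlog_format(value: str = "JSON") -> Tuple[str, List[str]]:
    """Split a qlog_format string into its base format and sub-entries."""
    # Entries are separated by "."; the first one names the base format.
    base, *options = value.split(".")
    return base, options
```

   A value of "JSON.optimizationA" then yields the base format "JSON"
   with the single sub-entry "optimizationA", while a plain "JSON"
   yields an empty list of sub-entries.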

1420	6.1.  qlog to JSON mapping

1422	   When mapping qlog to normal JSON, the "qlog_format" field MUST have
1423	   the value "JSON".  This is also the default qlog serialization and
1424	   default value of this field.

1426	   When using normal JSON serialization, the file extension/suffix
1427	   SHOULD be ".qlog" and the Media Type (if any) SHOULD be "application/
1428	   qlog+json" per [RFC6839].

1430	   JSON files by definition ([RFC8259]) MUST utilize the UTF-8 encoding,
1431	   both for the file itself and the string values.

1433	   While not specifically required by the JSON specification, all qlog
1434	   field names in a JSON serialization MUST be lowercase.

1436	   In order to serialize CDDL-based qlog event and data structure
1437	   definitions to JSON, the official CDDL-to-JSON mapping defined in
1438	   Appendix E of [CDDL] SHOULD be employed.

1440	6.1.1.  I-JSON

1442	   For some use cases, it should be taken into account that not all
1443	   popular JSON parsers support the full JSON format.  Especially for
1444	   parsers integrated with the JavaScript programming language (e.g.,
1445	   Web browsers, NodeJS), users are recommended to stick to a JSON
1446	   subset dubbed [I-JSON] (or Internet-JSON).

1448	   One of the key limitations of JavaScript and thus I-JSON is that it
1449	   cannot represent full 64-bit integers in standard operating mode
1450	   (i.e., without using BigInt extensions), instead being limited to the
1451	   range of [-(2**53)+1, (2**53)-1].  In these circumstances, Appendix E
1452	   of [CDDL] recommends defining new CDDL types for int64 and uint64
1453	   that limit their values to this range.

1455	   While this can be sensible and workable for most use cases, some
1456	   protocols targeting qlog serialization (e.g., QUIC, HTTP/3), might
1457	   require full uint64 variables in some (rare) circumstances.  In these
1458	   situations, it should be allowed to also use the string-based
1459	   representation of uint64 values alongside the numerical
1460	   representation.  Concretely, the following definition of uint64
1461	   should override the original and (web-based) tools should take into
1462	   account that a uint64 field can be either a number or string.

1464	   uint64 = text / uint .size 8

1466	               Figure 36: Custom uint64 definition for I-JSON
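
   The following Python sketch illustrates how a tool might accept a
   uint64 field that is logged either as a number or as a string.  It
   assumes the string form is a base-10 representation, which this
   document does not specify; the name read_uint64 is hypothetical.

```python
MAX_UINT64 = 2**64 - 1


def read_uint64(value) -> int:
    """Accept a uint64 logged either as a JSON number or as a string."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("uint64 must be a number or a string")
    if isinstance(value, int):
        number = value
    elif isinstance(value, str):
        # Assumption: the string holds a decimal representation.
        number = int(value, 10)
    else:
        raise TypeError("uint64 must be a number or a string")
    if not 0 <= number <= MAX_UINT64:
        raise ValueError("value out of range for uint64")
    return number
```

   Tools written in JavaScript would need an equivalent step that maps
   the string form to a BigInt or a similar wide integer type.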

1468	6.1.2.  Truncated values

1470	   For some use cases (e.g., limiting file size, privacy), it can be
1471	   necessary not to log a full raw blob (using the hexstring type) but
1472	   instead a truncated value (for example, only the first 100 bytes of
1473	   an HTTP response body to be able to discern which file it actually
1474	   contained).  In these cases, the original byte-size length cannot be
1475	   obtained from the serialized value directly.

1477	   As such, all qlog schema definitions SHOULD include a separate,
1478	   length-indicating field for all fields of type hexstring they
1479	   specify, see for example Section 5.1.  This not only ensures the
1480	   original length can always be retrieved, but also allows the omission
1481	   of any raw value bytes of the field completely (e.g., out of privacy
1482	   or security considerations).

1484	   However, to reduce overhead when the full raw value is logged, the
1485	   extra length-indicating field can be left out.  As such,
1486	   tools MUST be able to deal with this situation and derive the length
1487	   of the field from the raw value if no separate length-indicating
1488	   field is present.  The main possible permutations are shown by
1489	   example in Figure 37.

1491	   // both the full raw value and its length are present
1492	   // (length is redundant)
1493	   {
1494	       "raw_length": 5,
1495	       "raw": "051428abff"
1496	   }

1498	   // only the raw value is present, indicating it
1499	   // represents the field's full value; the byte
1500	   // length is obtained by calculating raw.length / 2
1501	   {
1502	       "raw": "051428abff"
1503	   }

1505	   // only the length field is present, meaning the
1506	   // value was omitted
1507	   {
1508	       "raw_length": 5
1509	   }

1511	   // both fields are present and the lengths do not match:
1512	   // the value was truncated to the first three bytes.
1513	   {
1514	       "raw_length": 5,
1515	       "raw": "051428"
1516	   }

1518	          Figure 37: Example for serializing truncated hexstrings
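
   The four permutations of Figure 37 can be told apart in tooling as
   sketched below in Python.  The function raw_info and the keys of its
   result are hypothetical names chosen for illustration; only the
   "raw" and "raw_length" field names come from the figure.

```python
from typing import Optional


def raw_info(field: dict) -> dict:
    """Classify a logged hexstring field per the cases of Figure 37."""
    raw: Optional[str] = field.get("raw")
    declared: Optional[int] = field.get("raw_length")
    # Two hex characters encode one byte.
    logged_bytes = len(raw) // 2 if raw is not None else 0
    if declared is not None:
        length = declared
    elif raw is not None:
        # No separate length field: derive the length from the value.
        length = logged_bytes
    else:
        length = None
    return {
        "length": length,
        "logged_bytes": logged_bytes,
        "omitted": raw is None,
        "truncated": raw is not None and length > logged_bytes,
    }
```

   A tool would typically display "length" as the original size and use
   "truncated" or "omitted" to warn the user that the logged bytes are
   incomplete.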

1520	6.2.  qlog to JSON Text Sequences mapping

1522	   One of the downsides of using pure JSON is that it is inherently a
1523	   non-streamable format.  Put differently, it is not possible to simply
1524	   append new qlog events to a log file without "closing" this file at
1525	   the end by appending "]}]}".  Without these closing tags, most JSON
1526	   parsers will be unable to parse the file entirely.  As most platforms
1527	   do not provide a standard streaming JSON parser (which would be able
1528	   to deal with this problem), this document also provides a qlog
1529	   mapping to a streamable JSON format called JSON Text Sequences (JSON-
1530	   SEQ) ([RFC7464]).

1532	   When mapping qlog to JSON-SEQ, the "qlog_format" field MUST have the
1533	   value "JSON-SEQ".

1535	   When using JSON-SEQ serialization, the file extension/suffix SHOULD
1536	   be ".sqlog" (for "streaming" qlog) and the Media Type (if any) SHOULD
1537	   be "application/qlog+json-seq" per [RFC8091].

1539	   JSON Text Sequences are very similar to JSON, except that JSON
1540	   objects are serialized as individual records, each prefixed by an
1541	   ASCII Record Separator (<RS>, 0x1E), and each ending with an ASCII
1542	   Line Feed character (\n, 0x0A).  Note that each record can also
1543	   contain any amount of newlines in its body, as long as it ends with a
1544	   newline character before the next <RS> character.

1546	   Each qlog event is serialized and interpreted as an individual JSON
1547	   Text Sequence record, and can simply be appended as a new object at
1548	   the back of an event stream or log file.  Put differently, unlike
1549	   default JSON, it does not require a file to be wrapped as a full
1550	   object with "{ ... }" or "[... ]".

1552	   For this to work, some qlog definitions have to be adjusted however.
1553	   Mainly, events are no longer part of the "events" array in the Trace
1554	   object, but are instead logged separately from the qlog "header", as
1555	   indicated by the TraceSeq object in Figure 38.  Additionally, qlog's
1556	   JSON-SEQ mapping does not allow logging multiple individual traces in
1557	   a single qlog file.  As such, the QlogFile:traces field is replaced
1558	   by the singular QlogFileSeq:trace field, see Figure 39.  An example
1559	   can be seen in Figure 40.  Note that the "group_id" field can still
1560	   be used on a per-event basis to include events from conceptually
1561	   different sources in a single JSON-SEQ qlog file.

1563	   Definition:

1565	   TraceSeq = {
1566	       ? title: text
1567	       ? description: text
1568	       ? configuration: Configuration
1569	       ? common_fields: CommonFields
1570	       ? vantage_point: VantagePoint
1571	   }

1573	                       Figure 38: TraceSeq definition

1575	   Definition:

1577	   QlogFileSeq = {
1578	       qlog_format: "JSON-SEQ"

1580	       qlog_version: text
1581	       ? title: text
1582	       ? description: text
1583	       ? summary: Summary
1584	       trace: TraceSeq
1585	   }
1586	                     Figure 39: QlogFileSeq definition

1588	   JSON-SEQ serialization examples:

1590	   // list of qlog events, serialized in accordance with RFC 7464,
1591	   // starting with a Record Separator character and ending with a
1592	   // newline.
1593	   // For display purposes, Record Separators are rendered as <RS>

1595	   <RS>{
1596	       "qlog_version": "0.3",
1597	       "qlog_format": "JSON-SEQ",
1598	       "title": "Name of JSON Text Sequence qlog file (short)",
1599	       "description": "Description for this trace file (long)",
1600	       "summary": {
1601	           ...
1602	       },
1603	       "trace": {
1604	         "common_fields": {
1605	           "protocol_type": ["QUIC","HTTP3"],
1606	           "group_id":"127ecc830d98f9d54a42c4f0842aa87e181a",
1607	           "time_format":"relative",
1608	           "reference_time": 1553986553572
1609	         },
1610	         "vantage_point": {
1611	           "name":"backend-67",
1612	           "type":"server"
1613	         }
1614	       }
1615	   }
1616	   <RS>{"time": 2, "name": "transport:parameters_set", "data": { ... } }
1617	   <RS>{"time": 7, "name": "transport:packet_sent", "data": { ... } }
1618	   ...

1620	                        Figure 40: Top-level element

1622	   Note: while not specifically required by the JSON-SEQ specification,
1623	   all qlog field names in a JSON-SEQ serialization MUST be lowercase.

1625	   In order to serialize all other CDDL-based qlog event and data
1626	   structure definitions to JSON-SEQ, the official CDDL-to-JSON mapping
1627	   defined in Appendix E of [CDDL] SHOULD still be employed.

1629	6.2.1.  Supporting JSON Text Sequences in tooling

1631	   Note that JSON Text Sequences are not supported in most default
1632	   programming environments (unlike normal JSON).  However, several
1633	   custom JSON-SEQ parsing libraries exist in most programming languages
1634	   that can be used and the format is easy enough to parse with existing
1635	   implementations (i.e., by splitting the file into its component
1636	   records and feeding them to a normal JSON parser individually, as
1637	   each record by itself is a valid JSON object).
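
   As an illustration of the splitting approach described above, the
   following Python sketch parses a JSON-SEQ qlog using only a normal
   JSON parser.  The function name parse_json_seq is hypothetical, and
   the sketch assumes the whole file fits in memory and that every
   record is complete.

```python
import json

RS = "\x1e"  # ASCII Record Separator


def parse_json_seq(text: str) -> list:
    """Split a JSON Text Sequence into records and parse each one."""
    records = []
    for chunk in text.split(RS):
        chunk = chunk.strip()
        if chunk:  # skip the empty piece before the first <RS>
            records.append(json.loads(chunk))
    return records


sample = (
    '\x1e{"qlog_version": "0.3", "qlog_format": "JSON-SEQ", "trace": {}}\n'
    '\x1e{"time": 2, "name": "transport:parameters_set", "data": {}}\n'
)
header, *events = parse_json_seq(sample)
```

   Splitting on the raw separator is safe because JSON requires control
   characters inside strings to be escaped.  A file that is still being
   written can end in a partial record, which json.loads would reject;
   a more robust reader would catch that error for the final record.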

1639	6.3.  Other optimized formatting options

1641	   Both the JSON and JSON-SEQ formatting options described above are
1642	   serviceable in general small to medium scale (debugging) setups.
1643	   However, these approaches tend to be relatively verbose, leading to
1644	   larger file sizes.  Additionally, generalized JSON(-SEQ)
1645	   (de)serialization performance is typically (slightly) lower than that
1646	   of more optimized and predictable formats.  Both aspects make these
1647	   formats more challenging (though still practical
1648	   (https://qlog.edm.uhasselt.be/anrw/)) to use in large scale setups.

1650	   During the development of qlog, we compared a multitude of
1651	   alternative formatting and optimization options.  The results of this
1652	   study are summarized on the qlog github repository
1653	   (https://github.com/quiclog/internet-drafts/issues/30#issuecomment-
1654	   617675097).  The rest of this section discusses some of these
1655	   approaches implementations could choose and the expected gains and
1656	   tradeoffs inherent therein.  Tools SHOULD support mainly the
1657	   compression options listed in Section 6.3.2, as they provide the
1658	   largest wins for the least cost overall.

1660	   Over time, specific qlog formats and encodings can be created that
1661	   more formally define and combine some of the discussed optimizations
1662	   or add new ones.  We choose to define these schemes in separate
1663	   documents to keep the main qlog definition clean and generalizable,
1664	   as not all contexts require the same performance or flexibility as
1665	   others and qlog is intended to be a broadly usable and extensible
1666	   format (for example more flexibility is needed in earlier stages of
1667	   protocol development, while more performance is typically needed in
1668	   later stages).  This is also the main reason why the general qlog
1669	   format is the less optimized JSON instead of a more performant
1670	   option.

1672	   To be able to easily distinguish between these options in qlog
1673	   compatible tooling (without the need to have the user provide out-of-
1674	   band information or to (heuristically) parse and process files in a
1675	   multitude of ways, see also Section 8), we recommend using explicit
1676	   file extensions to indicate specific formats.  As there are no
1677	   standards in place for this type of extension to format mapping, we
1678	   employ a commonly used scheme here.  Our approach is to list the
1679	   applied optimizations in the extension in ascending order of
1680	   application (e.g., if a qlog file is first optimized with technique A
1681	   and then compressed with technique B, the resulting file would have
1682	   the extension ".(s)qlog.A.B").  This allows tooling to start at the
1683	   back of the extension to "undo" applied optimizations to finally
1684	   arrive at the expected qlog representation.
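
   The "start at the back" approach can be sketched in Python as
   follows.  The function name undo_steps is hypothetical, and the
   sketch assumes it is given a bare file name (no directory part) that
   contains a single "qlog" or "sqlog" component.

```python
def undo_steps(filename: str) -> tuple[str, list[str]]:
    """Return the base extension and the optimizations to undo, in order."""
    parts = filename.split(".")
    for index, part in enumerate(parts):
        if part in ("qlog", "sqlog"):
            applied = parts[index + 1:]
            # Optimizations are listed in order of application, so they
            # are undone from the back of the extension to the front.
            return part, list(reversed(applied))
    raise ValueError("not a qlog file name: " + filename)


print(undo_steps("trace.sqlog.cbor.gz"))
```

   For the example file name, a tool would first undo the "gz" step and
   then the "cbor" step before handing the result to its JSON-SEQ
   parser.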

1686	6.3.1.  Data structure optimizations

1688	   The first general category of optimizations is to alter the
1689	   representation of data within a JSON(-SEQ) qlog file to reduce file
1690	   size.

1692	   The first option is to employ a scheme similar to the CSV (comma
1693	   separated value [RFC4180]) format, which utilizes the concept of
1694	   column "headers" to prevent repeating field names for each datapoint
1695	   instance.  Concretely for JSON qlog, several field names are repeated
1696	   with each event (i.e., time, name, data).  These names could be
1697	   extracted into a separate list, after which qlog events could be
1698	   serialized as an array of values, as opposed to a full object.  This
1699	   approach was a key part of the original qlog format (prior to draft-
1700	   02) using the "event_fields" field.  However, tests showed that this
1701	   optimization only provided a mean file size reduction of 5% (100MB to
1702	   95MB) while significantly increasing the implementation complexity,
1703	   and this approach was abandoned in favor of the default JSON setup.
1704	   Implementations using this format should not employ a separate file
1705	   extension (as it still uses JSON), but rather employ a new value of
1706	   "JSON.namedheaders" (or "JSON-SEQ.namedheaders") for the
1707	   "qlog_format" field (see Section 3).
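
   To make the column "headers" idea concrete, the Python sketch below
   converts a list of event objects into arrays of values and back.
   The container layout is an assumption made for illustration; it is
   not the exact layout of the earlier "event_fields" drafts, and the
   function names are hypothetical.

```python
EVENT_FIELDS = ["time", "name", "data"]


def to_named_headers(events: list) -> dict:
    """Store the field names once and each event as an array of values."""
    rows = [[event.get(field) for field in EVENT_FIELDS] for event in events]
    return {"event_fields": list(EVENT_FIELDS), "events": rows}


def from_named_headers(packed: dict) -> list:
    """Rebuild full event objects from the header list and value arrays."""
    fields = packed["event_fields"]
    return [dict(zip(fields, row)) for row in packed["events"]]
```

   The sketch only handles the three fields named above; events that
   carry additional fields would need extra columns, which is part of
   the implementation complexity mentioned in this section.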

1709	   The second option is to replace field values and/or names with
1710	   indices into a (dynamic) lookup table.  This is a common compression
1711	   technique and can provide significant file size reductions (up to 50%
1712	   in our tests, 100MB to 50MB).  However, this approach is even more
1713	   difficult to implement efficiently and requires either including the
1714	   (dynamic) table in the resulting file (an approach taken by for
1715	   example Chromium's NetLog format
1716	   (https://www.chromium.org/developers/design-documents/network-stack/
1717	   netlog)) or defining a (static) table up-front and sharing this
1718	   between implementations.  Implementations using this approach should
1719	   not employ a separate file extension (as it still uses JSON), but
1720	   rather employ a new value of "JSON.dictionary" (or "JSON-
1721	   SEQ.dictionary") for the "qlog_format" field (see Section 3).

1723	   As both options proved difficult to implement, reduced qlog file
1724	   readability, and provided too little improvement compared to other
1725	   more straightforward options (for example Section 6.3.2), these
1726	   schemes are not inherently part of qlog.

1728	6.3.2.  Compression

1730	   The second general category of optimizations is to utilize a
1731	   (generic) compression scheme for textual data.  As qlog in the JSON(-
1732	   SEQ) format typically contains a large amount of repetition, off-the-
1733	   shelf (text) compression techniques typically succeed very well in
1734	   bringing down file sizes (regularly with up to two orders of
1735	   magnitude in our tests, even for "fast" compression levels).  As
1736	   such, utilizing compression is recommended before attempting other
1737	   optimization options, even though this might (somewhat) increase
1738	   processing costs due to the additional compression step.

1740	   The first option is to use GZIP compression ([RFC1952]).  This
1741	   generic compression scheme provides multiple compression levels
1742	   (providing a trade-off between compression speed and size reduction).
1743	   Utilized at level 6 (a medium setting thought to be applicable for
1744	   streaming compression of a qlog stream in commodity devices), gzip
1745	   compresses qlog JSON files to 7% of their initial size on average
1746	   (100MB to 7MB).  For this option, the file extension .(s)qlog.gz
1747	   SHOULD be used.  The "qlog_format" field should still reflect the
1748	   original JSON formatting of the qlog data (e.g., "JSON" or "JSON-
1749	   SEQ").

1751	   The second option is to use Brotli compression ([RFC7932]).  While
1752	   similar to gzip, this more recent compression scheme provides a
1753	   better efficiency.  It also allows multiple compression levels.
1754	   Utilized at level 4 (a medium setting thought to be applicable for
1755	   streaming compression of a qlog stream in commodity devices), brotli
1756	   compresses qlog JSON files to 7% of their initial size on average
1757	   (100MB to 7MB).  For this option, the file extension .(s)qlog.br
1758	   SHOULD be used.  The "qlog_format" field should still reflect the
1759	   original JSON formatting of the qlog data (e.g., "JSON" or "JSON-
1760	   SEQ").

1762	   Other compression algorithms of course exist (for example xz, zstd,
1763	   and lz4).  We mainly recommend gzip and brotli because of their
1764	   tweakable behaviour and wide support in web-based environments, which
1765	   we envision as the main tooling ecosystem (see also Section 8).
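
   As a sketch of the gzip option, the Python snippet below writes and
   reads the contents of a .qlog.gz file using the standard gzip module
   at level 6.  The function names are hypothetical.  Brotli is not
   shown because it is not part of the Python standard library.

```python
import gzip
import json


def dump_qlog_gz(qlog: dict, level: int = 6) -> bytes:
    """Serialize a qlog object to UTF-8 JSON and gzip-compress it."""
    text = json.dumps(qlog)
    return gzip.compress(text.encode("utf-8"), compresslevel=level)


def load_qlog_gz(blob: bytes) -> dict:
    """Undo the gzip step and parse the resulting JSON text."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))
```

   Note that the "qlog_format" value inside the file is left untouched
   by this step, in line with the text above.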

1767	6.3.3.  Binary formats

1769	   The third general category of optimizations is to use a more
1770	   optimized (often binary) format instead of the textual JSON format.
1771	   This approach inherently produces smaller files and often has better
1772	   (de)serialization performance.  However, the resultant files are no
1773	   longer human readable and some formats require hard tradeoffs between
1774	   flexibility and performance.

1776	   The first option is to use the CBOR (Concise Binary Object
1777	   Representation [RFC7049]) format.  For our purposes, CBOR can be
1778	   viewed as a straightforward binary variant of JSON.  As such, existing
1779	   JSON qlog files can be trivially converted to and from CBOR (though
1780	   slightly more work is needed for JSON-SEQ qlogs to convert them to
1781	   CBOR-SEQ, see [RFC8742]).  While CBOR thus does retain the full qlog
1782	   flexibility, it only provides a 25% file size reduction (100MB to
1783	   75MB) compared to textual JSON(-SEQ).  As CBOR support in programming
1784	   environments is not as widespread as that of textual JSON and the
1785	   format lacks human readability, CBOR was not chosen as the default
1786	   qlog format.  For this option, the file extension .(s)qlog.cbor
1787	   SHOULD be used.  The "qlog_format" field should still reflect the
1788	   original JSON formatting of the qlog data (e.g., "JSON" or "JSON-
1789	   SEQ").  The media type should indicate both whether JSON or JSON Text
1790	   Sequences are used, as well as whether CBOR or CBOR Sequences are
1791	   used (see the table below).

1793	   A second option is to use a more specialized binary format, such as
1794	   Protocol Buffers (https://developers.google.com/protocol-buffers)
1795	   (protobuf).  This format is battle-tested, has support for optional
1796	   fields and has libraries in most programming languages.  Still, it is
1797	   significantly less flexible than textual JSON or CBOR, as it relies
1798	   on a separate, pre-defined schema (a .proto file).  As such, it is
1799	   not possible to (easily) log new event types in protobuf files
1800	   without adjusting this schema as well, which has its own practical
1801	   challenges.  As qlog is intended to be a flexible, general purpose
1802	   format, this type of format was not chosen as its basic
1803	   serialization.  The lower flexibility does lead to significantly
1804	   reduced file sizes.  Our straightforward mapping of the qlog main
1805	   schema and QUIC/HTTP3 event types to protobuf created qlog files 24%
1806	   as large as the raw JSON equivalents (100MB to 24MB).  For this
1807	   option, the file extension .(s)qlog.protobuf SHOULD be used.  The
1808	   "qlog_format" field should reflect the different internal format, for
1809	   example: "qlog_format": "protobuf".

1811	   Note that binary formats can (and should) also be used in conjunction
1812	   with compression (see Section 6.3.2).  For example, CBOR compresses
1813	   well (to about 6% of the original textual JSON size (100MB to 6MB)
1814	   for both gzip and brotli) and so does protobuf (5% (gzip) to 3%
1815	   (brotli)).  However, these gains are similar to the ones achieved by
1816	   simply compressing the textual JSON equivalents directly (7%, see
1817	   Section 6.3.2).  As such, since compression is still needed to
1818	   achieve optimal file size reductions even with binary formats, we
1819	   feel the more flexible compressed textual JSON options are a better
1820	   default for the qlog format in general.

1822	6.3.4.  Overview and summary

1824	   In summary, textual JSON was chosen as the main qlog format due to
1825	   its high flexibility and because its inefficiencies can be largely
1826	   solved by the utilization of compression techniques (which are needed
1827	   to achieve optimal results with other formats as well).

1829	   Still, qlog implementers are free to define other qlog formats
1830	   depending on their needs and context of use.  These formats should be
1831	   described in their own documents, the discussion in this document
1832	   mainly acting as inspiration and high-level guidance.  Implementers
1833	   are encouraged to add concrete qlog formats and definitions to the
1834	   designated public repository (https://github.com/quiclog/qlog).

1836	   The following table provides an overview of all the discussed qlog
1837	   formatting options with examples:

1839	   +===============+===================+================+==============+
1840	   | format        | qlog_format       | extension      | media type   |
1841	   +===============+===================+================+==============+
1842	   | JSON          | JSON              | .qlog          | application/ |
1843	   | Section 6.1   |                   |                | qlog+json    |
1844	   +---------------+-------------------+----------------+--------------+
1845	   | JSON Text     | JSON-SEQ          | .sqlog         | application/ |
1846	   | Sequences     |                   |                | qlog+json-   |
1847	   | Section 6.2   |                   |                | seq          |
1848	   +---------------+-------------------+----------------+--------------+
1849	   | named         | JSON(-            | .(s)qlog       | application/ |
1850	   | headers       | SEQ).namedheaders |                | qlog+json(-  |
1851	   | Section       |                   |                | seq)         |
1852	   | 6.3.1         |                   |                |              |
1853	   +---------------+-------------------+----------------+--------------+
1854	   | dictionary    | JSON(-            | .(s)qlog       | application/ |
1855	   | Section       | SEQ).dictionary   |                | qlog+json(-  |
1856	   | 6.3.1         |                   |                | seq)         |
1857	   +---------------+-------------------+----------------+--------------+
1858	   | CBOR          | JSON(-SEQ)        | .(s)qlog.cbor  | application/ |
1859	   | Section       |                   |                | qlog+json(-  |
1860	   | 6.3.3         |                   |                | seq)+cbor(-  |
1861	   |               |                   |                | seq)         |
1862	   +---------------+-------------------+----------------+--------------+
1863	   | protobuf      | protobuf          | .qlog.protobuf | NOT          |
1864	   | Section       |                   |                | SPECIFIED BY |
1865	   | 6.3.3         |                   |                | IANA         |
1866	   +---------------+-------------------+----------------+--------------+
1867	   +---------------+-------------------+----------------+--------------+
1868	   | gzip          | no change         | .gz suffix     | application/ |
1869	   | Section       |                   |                | gzip         |
1870	   | 6.3.2         |                   |                |              |
1871	   +---------------+-------------------+----------------+--------------+
1872	   | brotli        | no change         | .br suffix     | NOT          |
1873	   | Section       |                   |                | SPECIFIED BY |
1874	   | 6.3.2         |                   |                | IANA         |
1875	   +---------------+-------------------+----------------+--------------+

1877	                                  Table 1

1879	6.4.  Conversion between formats

1881	   As discussed in the previous sections, a qlog file can be serialized
1882	   in a multitude of formats, each of which can conceivably be
1883	   transformed into or from one another without loss of information.
1884	   For example, a number of JSON-SEQ streamed qlogs could be combined
1885	   into a JSON formatted qlog for later processing.  Similarly, a
1886	   captured binary qlog could be transformed to JSON for easier
1887	   interpretation and sharing.
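
   The first example above, combining several JSON-SEQ qlogs into one
   JSON qlog, could look roughly like the Python sketch below.  It
   relies on the QlogFileSeq and TraceSeq definitions from Section 6.2.
   Taking the top-level metadata from the first input is an assumption
   of this sketch, and the function name combine_json_seq is
   hypothetical.

```python
import json

RS = "\x1e"  # ASCII Record Separator


def combine_json_seq(streams: list) -> dict:
    """Merge JSON-SEQ qlog texts into a single JSON-formatted qlog."""
    if not streams:
        raise ValueError("at least one JSON-SEQ qlog is required")
    combined = None
    for text in streams:
        records = [json.loads(c) for c in text.split(RS) if c.strip()]
        header, events = records[0], records[1:]
        if combined is None:
            # Assumption: file-level metadata comes from the first input.
            combined = {k: v for k, v in header.items() if k != "trace"}
            combined["qlog_format"] = "JSON"
            combined["traces"] = []
        # Each singular "trace" becomes one entry of the "traces" array,
        # with its separately logged events moved into "events".
        trace = dict(header.get("trace", {}))
        trace["events"] = events
        combined["traces"].append(trace)
    return combined
```

   The result can be written out with a normal JSON serializer and
   stored with the .qlog extension.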

1889	   Secondly, we can also consider other structured logging approaches
1890	   that contain similar (though typically not identical) data to qlog,
1891	   like raw packet capture files (for example .pcap files from tcpdump)
1892	   or endpoint-specific logging formats (for example the NetLog format
1893	   in Google Chrome).  These are sometimes the only options, if an
1894	   implementation cannot or will not support direct qlog output for any
1895	   reason, but does provide other internal or external (e.g.,
1896	   SSLKEYLOGFILE export to allow decryption of packet captures) logging
1897	   options.  For this second category, a (partial) transformation from/to
1898	   qlog can also be defined.

1900	   As such, when defining a new qlog serialization format or wanting to
1901	   utilize qlog-compatible tools with existing codebases lacking qlog
1902	   support, it is recommended to define and provide a concrete mapping
1903	   from one format to default JSON-serialized qlog.  Several of such
1904	   mappings exist.  Firstly, pcap2qlog (https://github.com/quiclog/
1905	   pcap2qlog) transforms QUIC and HTTP/3 packet capture files to qlog.
1906	   Secondly, netlog2qlog
1907	   (https://github.com/quiclog/qvis/tree/master/visualizations/src/
1908	   components/filemanager/netlogconverter) converts Chromium's internal
1909	   dictionary-encoded JSON format to qlog.  Finally, quictrace2qlog
1910	   (https://github.com/quiclog/quictrace2qlog) converts the older
1911	   quictrace format to JSON qlog.  Tools can then easily integrate with
1912	   these converters (either by incorporating them directly or for
1913	   example using them as a (web-based) API) so users can provide
1914	   different file types with ease.  For example, the qvis
1915	   (https://qvis.edm.uhasselt.be) toolsuite supports a multitude of
1916	   formats and qlog serializations.

1918	7.  Methods of access and generation

1920	   Different implementations will have different ways of generating and
1921	   storing qlogs.  However, there is still value in defining a few
1922	   default ways in which to steer this generation and access of the
1923	   results.

1925	7.1.  Set file output destination via an environment variable

1927	   To provide users control over where and how qlog files are created,
1928	   we define two environment variables.  The first, QLOGFILE, indicates
1929	   a full path to where an individual qlog file should be stored.  This
1930	   path MUST include the full file extension.  The second, QLOGDIR, sets
1931	   a general directory path in which qlog files should be placed.  This
1932	   path MUST include the directory separator character at the end.

1934	   In general, QLOGDIR should be preferred over QLOGFILE if an endpoint
1935	   is prone to generate multiple qlog files.  This can for example be
1936	   the case for a QUIC server implementation that logs each QUIC
1937	   connection in a separate qlog file.  An alternative that uses
1938	   QLOGFILE would be a QUIC server that logs all connections in a single
1939	   file and uses the "group_id" field (Section 3.4.6) to allow post-hoc
1940	   separation of events.

1942	   Implementations SHOULD provide support for QLOGDIR and MAY provide
1943	   support for QLOGFILE.

1945	   When using QLOGDIR, it is up to the implementation to choose an
1946	   appropriate naming scheme for the qlog files themselves.  The chosen
1947	   scheme will typically depend on the context or protocols used.  For
1948	   example, for QUIC, it is recommended to use the Original Destination
1949	   Connection ID (ODCID), followed by the vantage point type of the
1950	   logging endpoint.  Examples of all options for QUIC are shown in
1951	   Figure 41.

1953	   Command: QLOGFILE=/srv/qlogs/client.qlog quicclientbinary

1955	   Should result in the quicclientbinary executable logging a
1956	   single qlog file named client.qlog in the /srv/qlogs directory.
1957	   This is for example useful in tests when the client sets up
1958	   just a single connection and then exits.

1960	   Command: QLOGDIR=/srv/qlogs/ quicserverbinary

1962	   Should result in the quicserverbinary executable generating
1963	   several log files, one for each QUIC connection.
1964	   Given two QUIC connections, with ODCID values "abcde" and
1965	   "12345" respectively, this would result in two files:
1966	   /srv/qlogs/abcde_server.qlog
1967	   /srv/qlogs/12345_server.qlog

1969	   Command: QLOGFILE=/srv/qlogs/server.qlog quicserverbinary

1971	   Should result in the quicserverbinary executable logging
1972	   a single qlog file named server.qlog in the /srv/qlogs directory.
1973	   Given that the server handled two QUIC connections before it was
1974	   shut down, with ODCID values "abcde" and "12345" respectively,
1975	   this would result in event instances in the qlog file being
1976	   tagged with the "group_id" field with values "abcde" and "12345".

1978	     Figure 41: Environment variable examples for a QUIC implementation
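
   A QUIC implementation could resolve its output location from these
   variables roughly as in the Python sketch below.  This document does
   not state which variable wins when both are set; giving QLOGDIR
   precedence is an assumption of this sketch, and the function name
   qlog_output_path is hypothetical.

```python
import os
from typing import Mapping, Optional


def qlog_output_path(
    odcid: str,
    vantage_point: str,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Pick the qlog output path for one QUIC connection, if any."""
    qlog_dir = environ.get("QLOGDIR")
    if qlog_dir:
        # QLOGDIR already ends with the directory separator, so the
        # file name (ODCID plus vantage point type) is appended as-is.
        return f"{qlog_dir}{odcid}_{vantage_point}.qlog"
    # QLOGFILE holds a full path, including the file extension.
    return environ.get("QLOGFILE")
```

   Returning None when neither variable is set lets the caller treat
   qlog output as disabled.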

1980	7.2.  Access logs via a well-known endpoint

1982	   After generation, qlog implementers MAY make available generated logs
1983	   and traces on an endpoint (typically the server) via the following
1984	   .well-known URI:

1986	      .well-known/qlog/IDENTIFIER.extension

1988	   The IDENTIFIER variable depends on the context and the protocol.  For
1989	   example for QUIC, the lowercase Original Destination Connection ID
1990	   (ODCID) is recommended, as it can uniquely identify a connection.
1991	   Additionally, the extension depends on the chosen format (see
1992	   Section 6.3.4).  For example, for a QUIC connection with ODCID
1993	   "abcde", the endpoint for fetching its default JSON-formatted .qlog
1994	   file would be:

1996	      .well-known/qlog/abcde.qlog

1998	   Implementers SHOULD allow users to fetch logs for a given connection
1999	   on a 2nd, separate connection.  This helps prevent pollution of the
2000	   logs by fetching them over the same connection that one wishes to
2001	   observe through the log.  Ideally, for the QUIC use case, the logs
2002	   should also be approachable via an HTTP/2 or HTTP/1.1 endpoint (i.e.,
2003	   on TCP port 443), to for example aid debugging in the case where
2004	   QUIC/UDP is blocked on the network.

2006	   qlog implementers SHOULD NOT enable this .well-known endpoint in
2007	   typical production settings to prevent (malicious) users from
2008	   downloading logs from other connections.  Implementers are advised to
2009	   disable this endpoint by default and require specific actions from
2010	   the end users to enable it (and potentially qlog itself).
2011	   Implementers MUST also take into account the general privacy and
2012	   security guidelines discussed in Section 9 before exposing qlogs to
2013	   outside actors.

2015	8.  Tooling requirements

2017	   Tools ingesting qlog MUST indicate which qlog version(s), qlog
2018	   format(s), compression methods and potentially other input file
2019	   formats (for example .pcap) they support.  Tools SHOULD at least
2020	   support .qlog files in the default JSON format (Section 6.1).
2021	   Additionally, they SHOULD indicate exactly which values for and
2022	   properties of the name (category and type) and data fields they look
2023	   for to execute their logic.  Tools SHOULD perform a (high-level)
2024	   check if an input qlog file adheres to the expected qlog schema.  If
2025	   a tool determines a qlog file does not contain enough supported
2026	   information to correctly execute the tool's logic, it SHOULD generate
2027	   a clear error message to this effect.
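
	   The high-level check and the clear error message described above
	   could be sketched as follows in Python.  The sketch is
	   non-normative and assumes a JSON-formatted qlog file with top-level
	   "qlog_version" and "traces" fields and events carrying "time",
	   "name" and "data".  The supported version set, the QlogInputError
	   class and the check_qlog function are hypothetical names chosen for
	   illustration.

```python
import json

# Assumption: the version string(s) this hypothetical tool understands.
SUPPORTED_VERSIONS = {"0.3"}
REQUIRED_EVENT_FIELDS = ("time", "name", "data")


class QlogInputError(Exception):
    """Raised when an input file cannot be processed by this tool."""


def check_qlog(text: str) -> dict:
    """Perform a high-level sanity check and return the parsed file."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        raise QlogInputError(f"input is not valid JSON: {exc}") from exc
    if not isinstance(doc, dict):
        raise QlogInputError("top-level value must be a JSON object")

    version = doc.get("qlog_version")
    if version not in SUPPORTED_VERSIONS:
        raise QlogInputError(
            f"unsupported qlog_version {version!r}; "
            f"supported: {sorted(SUPPORTED_VERSIONS)}"
        )

    traces = doc.get("traces")
    if not isinstance(traces, list) or not traces:
        raise QlogInputError("file contains no traces to analyze")

    for t_index, trace in enumerate(traces):
        if not isinstance(trace, dict):
            raise QlogInputError(f"trace {t_index} is not a JSON object")
        for e_index, event in enumerate(trace.get("events", [])):
            missing = [f for f in REQUIRED_EVENT_FIELDS if f not in event]
            if missing:
                raise QlogInputError(
                    f"trace {t_index}, event {e_index} lacks {missing}"
                )
    return doc
```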

2029	   Tools MUST NOT produce breaking errors for any field names and/or
2030	   values in the qlog format that they do not recognize.  Tools SHOULD
2031	   indicate even unknown event occurrences within their context (e.g.,
2032	   marking unknown events on a timeline for manual interpretation by the
2033	   user).
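
	   A tolerant event dispatcher along these lines might look like the
	   following Python sketch.  It is only one possible approach: the
	   HANDLERS table, describe_packet_sent and build_timeline are
	   hypothetical names, and the single known event name and its "header"
	   layout are used purely as an example.

```python
from typing import Callable, Dict, List, Tuple


def describe_packet_sent(event: dict) -> str:
    # Unknown or missing fields are tolerated rather than treated as errors.
    header = event.get("data", {}).get("header", {})
    return f"sent {header.get('packet_type', '?')}"


# Events this hypothetical tool knows how to interpret.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "transport:packet_sent": describe_packet_sent,
}


def build_timeline(events: List[dict]) -> List[Tuple[float, str, str]]:
    """Return (time, name, label) rows, keeping unrecognized events."""
    rows = []
    for event in events:
        name = event.get("name", "")
        handler = HANDLERS.get(name)
        # Unrecognized events are marked, not dropped and not fatal.
        label = handler(event) if handler else "unknown event"
        rows.append((event.get("time"), name, label))
    return rows
```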

2035	   Tool authors should be aware that, depending on the logging
2036	   implementation, some events will not always be present in all traces.
2037	   For example, using a circular logging buffer of a fixed size, it
2038	   could be that the earliest events (e.g., connection setup events) are
2039	   later overwritten by "newer" events.  Alternatively, some events can
2040	   be intentionally omitted out of privacy or file size considerations.
2041	   Tool authors are encouraged to make their tools robust enough to
2042	   still provide adequate output for incomplete logs.
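
	   The circular buffer scenario can be illustrated with a short Python
	   sketch.  It is non-normative: ring_buffer_log and missing_events are
	   hypothetical helper names, and the event names are used only as
	   examples of events a tool might expect.

```python
from collections import deque
from typing import Iterable, List


def ring_buffer_log(events: Iterable[dict], capacity: int) -> List[dict]:
    """Simulate a logger that keeps only the newest `capacity` events."""
    buffer = deque(maxlen=capacity)
    for event in events:
        buffer.append(event)  # silently evicts the oldest entry when full
    return list(buffer)


def missing_events(events: Iterable[dict], expected: Iterable[str]) -> List[str]:
    """List the expected event names that do not occur in the trace."""
    present = {event.get("name") for event in events}
    return [name for name in expected if name not in present]


names = [
    "connectivity:connection_started",
    "transport:packet_sent",
    "transport:packet_received",
    "transport:packet_sent",
]
all_events = [{"time": float(i), "name": n} for i, n in enumerate(names)]
kept = ring_buffer_log(all_events, capacity=2)

# The setup event was overwritten; a robust tool reports this and continues.
for name in missing_events(kept, ["connectivity:connection_started"]):
    print(f"warning: no '{name}' event in trace; output may be partial")
```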

2044	9.  Security and privacy considerations

2046	   TODO: discuss privacy and security considerations (e.g., what NOT to
2047	   log, what to strip out of a log before sharing, ...)

2049	   TODO: strip out/don't log IPs, ports, specific CIDs, raw user data,
2050	   exact times, HTTP HEADERS (or at least :path), SNI values

2052	   TODO: see if there is merit in encrypting the logs and having the
2053	   server choose an encryption key (e.g., sent in transport parameters)

2055	   Good initial reference: Christian Huitema's blogpost
2056	   (https://huitema.wordpress.com/2020/07/21/scrubbing-quic-logs-for-
2057	   privacy/)

2059	10.  IANA Considerations

2061	   TODO: primarily the .well-known URI

2063	11.  References

2065	11.1.  Normative References

2067	   [CDDL]     Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
2068	              Definition Language (CDDL): A Notational Convention to
2069	              Express Concise Binary Object Representation (CBOR) and
2070	              JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
2071	              June 2019, <https://www.rfc-editor.org/rfc/rfc8610>.

2073	   [I-JSON]   Bray, T., Ed., "The I-JSON Message Format", RFC 7493,
2074	              DOI 10.17487/RFC7493, March 2015,
2075	              <https://www.rfc-editor.org/rfc/rfc7493>.

2077	   [JSON]     Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
2078	              Interchange Format", STD 90, RFC 8259,
2079	              DOI 10.17487/RFC8259, December 2017,
2080	              <https://www.rfc-editor.org/rfc/rfc8259>.

2082	   [JSON-Text-Sequences]
2083	              Williams, N., "JavaScript Object Notation (JSON) Text
2084	              Sequences", RFC 7464, DOI 10.17487/RFC7464, February 2015,
2085	              <https://www.rfc-editor.org/rfc/rfc7464>.

2087	   [QLOG-H3]  Marx, R., Ed., Niccolini, L., Ed., and M. Seemann, Ed.,
2088	              "HTTP/3 and QPACK event definitions for qlog", Work in
2089	              Progress, Internet-Draft, draft-ietf-quic-qlog-h3-events-
2090	              01, <https://datatracker.ietf.org/doc/html/draft-ietf-
2091	              quic-qlog-h3-events-01>.

2093	   [QLOG-QUIC]
2094	              Marx, R., Ed., Niccolini, L., Ed., and M. Seemann, Ed.,
2095	              "QUIC event definitions for qlog", Work in Progress,
2096	              Internet-Draft, draft-ietf-quic-qlog-quic-events-01,
2097	              <https://datatracker.ietf.org/doc/html/draft-ietf-quic-
2098	              qlog-quic-events-01>.

2100	   [RFC1952]  Deutsch, P., "GZIP file format specification version 4.3",
2101	              RFC 1952, DOI 10.17487/RFC1952, May 1996,
2102	              <https://www.rfc-editor.org/rfc/rfc1952>.

2104	   [RFC4180]  Shafranovich, Y., "Common Format and MIME Type for Comma-
2105	              Separated Values (CSV) Files", RFC 4180,
2106	              DOI 10.17487/RFC4180, October 2005,
2107	              <https://www.rfc-editor.org/rfc/rfc4180>.

2109	   [RFC6839]  Hansen, T. and A. Melnikov, "Additional Media Type
2110	              Structured Syntax Suffixes", RFC 6839,
2111	              DOI 10.17487/RFC6839, January 2013,
2112	              <https://www.rfc-editor.org/rfc/rfc6839>.

2114	   [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
2115	              Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
2116	              October 2013, <https://www.rfc-editor.org/rfc/rfc7049>.

2118	   [RFC7464]  Williams, N., "JavaScript Object Notation (JSON) Text
2119	              Sequences", RFC 7464, DOI 10.17487/RFC7464, February 2015,
2120	              <https://www.rfc-editor.org/rfc/rfc7464>.

2122	   [RFC7932]  Alakuijala, J. and Z. Szabadka, "Brotli Compressed Data
2123	              Format", RFC 7932, DOI 10.17487/RFC7932, July 2016,
2124	              <https://www.rfc-editor.org/rfc/rfc7932>.

2126	   [RFC8091]  Wilde, E., "A Media Type Structured Syntax Suffix for JSON
2127	              Text Sequences", RFC 8091, DOI 10.17487/RFC8091, February
2128	              2017, <https://www.rfc-editor.org/rfc/rfc8091>.

2130	   [RFC8259]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
2131	              Interchange Format", STD 90, RFC 8259,
2132	              DOI 10.17487/RFC8259, December 2017,
2133	              <https://www.rfc-editor.org/rfc/rfc8259>.

2135	11.2.  Informative References

2137	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
2138	              Requirement Levels", BCP 14, RFC 2119,
2139	              DOI 10.17487/RFC2119, March 1997,
2140	              <https://www.rfc-editor.org/rfc/rfc2119>.

2142	   [RFC8742]  Bormann, C., "Concise Binary Object Representation (CBOR)
2143	              Sequences", RFC 8742, DOI 10.17487/RFC8742, February 2020,
2144	              <https://www.rfc-editor.org/rfc/rfc8742>.

2146	Appendix A.  Change Log

2148	A.1.  Since draft-ietf-quic-qlog-main-schema-01:

2150	   *  Change the data definition language from TypeScript to CDDL (#143)

2152	A.2.  Since draft-ietf-quic-qlog-main-schema-00:

2154	   *  Changed the streaming serialization format from NDJSON to JSON
2155	      Text Sequences (#172)

2157	   *  Added Media Type definitions for various qlog formats (#158)

2159	   *  Changed to semantic versioning

2161	A.3.  Since draft-marx-qlog-main-schema-draft-02:

2163	   *  These changes were made in preparation for the adoption of the
2164	      drafts by the QUIC working group (#137)

2166	   *  Moved RawInfo, Importance, Generic events and Simulation events to
2167	      this document.

2169	   *  Added basic event definition guidelines

2171	   *  Made protocol_type an array instead of a string (#146)

2173	A.4.  Since draft-marx-qlog-main-schema-01:

2175	   *  Decoupled qlog from the JSON format and described a mapping
2176	      instead (#89)

2178	      -  Data types are now specified in this document and proper
2179	         definitions for fields were added in this format

2181	      -  64-bit numbers can now be either strings or numbers, with a
2182	         preference for numbers (#10)

2184	      -  binary blobs are now logged as lowercase hex strings (#39, #36)

2186	      -  added guidance to add length-specifiers for binary blobs (#102)

2188	   *  Removed "time_units" from Configuration.  All times are now in ms
2189	      instead (#95)

2191	   *  Removed the "event_fields" setup for a more straightforward JSON
2192	      format (#101,#89)

2194	   *  Added a streaming option using the NDJSON format (#109,#2,#106)

2196	   *  Described optional optimization options for implementers (#30)

2198	   *  Added QLOGDIR and QLOGFILE environment variables, clarified the
2199	      .well-known URL usage (#26,#33,#51)

2201	   *  Overall tightened up the text and added more examples

2203	A.5.  Since draft-marx-qlog-main-schema-00:

2205	   *  All field names are now lowercase (e.g., category instead of
2206	      CATEGORY)

2208	   *  Triggers are now properties on the "data" field value, instead of
2209	      separate field types (#23)

2211	   *  group_ids in common_fields is now also just group_id

2213	Appendix B.  Design Variations

2215	   *  Quic-trace (https://github.com/google/quic-trace) takes a slightly
2216	      different approach based on Protocol Buffers.

2218	   *  Spindump (https://github.com/EricssonResearch/spindump) also
2219	      defines a custom text-based format for in-network measurements.

2221	   *  Wireshark (https://www.wireshark.org/) also has a QUIC dissector
2222	      and its results can be transformed into a JSON output format using
2223	      tshark.

2225	   The idea is that qlog is able to encompass the use cases for all of
2226	   these alternate designs and that all tooling converges on the qlog
2227	   standard.

2229	Appendix C.  Acknowledgements

2231	   Much of the initial work by Robin Marx was done at Hasselt
2232	   University.

2234	   Thanks to Jana Iyengar, Brian Trammell, Dmitri Tikhonov, Stephen
2235	   Petrides, Jari Arkko, Marcus Ihlar, Victor Vasiliev, Mirja
2236	   Kuehlewind, Jeremy Laine and Lucas Pardue for their feedback and
2237	   suggestions.

2239	Authors' Addresses

2241	   Robin Marx (editor)
2242	   KU Leuven
2243	   Email: robin.marx@kuleuven.be

2245	   Luca Niccolini (editor)
2246	   Facebook
2247	   Email: lniccolini@fb.com

2249	   Marten Seemann (editor)
2250	   Protocol Labs
2251	   Email: marten@protocol.ai