idnits 2.17.1 

draft-mcquistin-augmented-ascii-diagrams-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (2 November 2020) is 1268 days in the past.  Is this
     intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC7405' is defined on line 1077, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-transport-27

  -- Obsolete informational reference (is this intentional?): RFC 7049
     (Obsoleted by RFC 8949)

  -- Obsolete informational reference (is this intentional?): RFC  793
     (Obsoleted by RFC 9293)


     Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Network Working Group                                       S. McQuistin
3	Internet-Draft                                                   V. Band
4	Intended status: Experimental                                   D. Jacob
5	Expires: 6 May 2021                                        C. S. Perkins
6	                                                   University of Glasgow
7	                                                         2 November 2020

9	  Describing Protocol Data Units with Augmented Packet Header Diagrams
10	              draft-mcquistin-augmented-ascii-diagrams-07

12	Abstract

14	   This document describes a machine-readable format for specifying the
15	   syntax of protocol data units within a protocol specification.  This
16	   format consists of a consistently formatted packet header
17	   diagram, followed by structured explanatory text.  It is designed to
18	   maintain human readability while enabling support for automated
19	   parser generation from the specification document.  This document is
20	   itself an example of how the format can be used.

22	Status of This Memo

24	   This Internet-Draft is submitted in full conformance with the
25	   provisions of BCP 78 and BCP 79.

27	   Internet-Drafts are working documents of the Internet Engineering
28	   Task Force (IETF).  Note that other groups may also distribute
29	   working documents as Internet-Drafts.  The list of current Internet-
30	   Drafts is at https://datatracker.ietf.org/drafts/current/.

32	   Internet-Drafts are draft documents valid for a maximum of six months
33	   and may be updated, replaced, or obsoleted by other documents at any
34	   time.  It is inappropriate to use Internet-Drafts as reference
35	   material or to cite them other than as "work in progress."

37	   This Internet-Draft will expire on 6 May 2021.

39	Copyright Notice

41	   Copyright (c) 2020 IETF Trust and the persons identified as the
42	   document authors.  All rights reserved.

44	   This document is subject to BCP 78 and the IETF Trust's Legal
45	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
46	   license-info) in effect on the date of publication of this document.
47	   Please review these documents carefully, as they describe your rights
48	   and restrictions with respect to this document.  Code Components
49	   extracted from this document must include Simplified BSD License text
50	   as described in Section 4.e of the Trust Legal Provisions and are
51	   provided without warranty as described in the Simplified BSD License.

53	Table of Contents

55	   1.  Introduction
56	   2.  Background
57	     2.1.  Limitations of Current Packet Format Diagrams
58	     2.2.  Formal languages in standards documents
59	   3.  Design Principles
60	   4.  Augmented Packet Header Diagrams
61	     4.1.  PDUs with Fixed and Variable-Width Fields
62	     4.2.  PDUs That Cross-Reference Previously Defined Fields
63	     4.3.  PDUs with Non-Contiguous Fields
64	     4.4.  PDUs with Constraints on Field Values
65	     4.5.  PDUs That Extend Sub-Structures
66	     4.6.  Storing Data for Parsing
67	     4.7.  Connecting Structures with Functions
68	     4.8.  Specifying Enumerated Types
69	     4.9.  Specifying Protocol Data Units
70	     4.10. Importing PDU Definitions from Other Documents
71	   5.  Open Issues
72	   6.  IANA Considerations
73	   7.  Security Considerations
74	   8.  Acknowledgements
75	   9.  Informative References
76	   Appendix A.  ABNF specification
77	     A.1.  Constraint Expressions
78	     A.2.  Augmented packet diagrams
79	   Appendix B.  Tooling & source code
80	   Authors' Addresses

82	1.  Introduction

84	   Packet header diagrams have become a widely used format for
85	   describing the syntax of binary protocols.  In otherwise largely
86	   textual documents, they allow for the visualisation of packet
87	   formats, reducing human error, and aiding in the implementation of
88	   parsers for the protocols that they specify.

90	   Figure 1 gives an example of how packet header diagrams are used to
91	   define binary protocol formats.  The format has an obvious structure:
92	   the diagram clearly delineates each field, showing its width and its
93	   position within the header.  This type of diagram is designed for
94	   human readers, but is consistent enough that it should be possible to
95	   develop a tool that generates a parser for the packet format from the
96	   diagram.

98	   :    0                   1                   2                   3
99	   :    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
100	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
101	   :   |          Source Port          |       Destination Port        |
102	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
103	   :   |                        Sequence Number                        |
104	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
105	   :   |                    Acknowledgment Number                      |
106	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
107	   :   |  Data |           |U|A|P|R|S|F|                               |
108	   :   | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
109	   :   |       |           |G|K|H|T|N|N|                               |
110	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
111	   :   |           Checksum            |         Urgent Pointer        |
112	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
113	   :   |                    Options                    |    Padding    |
114	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
115	   :   |                             data                              |
116	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

118	               Figure 1: TCP's header format (from [RFC793])
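
The observation that a parser could be generated from Figure 1 can be
made concrete with a hand-written sketch of the code such a tool might
emit for the fixed 20-byte part of the header. This is an illustration
only, not output of the tooling described in this document; the names
TcpHeader and parse_tcp_header are hypothetical, and Options, Padding,
and data are not handled.

```python
import struct
from typing import NamedTuple


class TcpHeader(NamedTuple):
    source_port: int
    destination_port: int
    sequence_number: int
    acknowledgment_number: int
    data_offset: int      # header length, in 32-bit words
    flags: int            # URG, ACK, PSH, RST, SYN, FIN in the low 6 bits
    window: int
    checksum: int
    urgent_pointer: int


def parse_tcp_header(data: bytes) -> TcpHeader:
    """Parse the fixed-width fields drawn in Figure 1 (first 20 bytes)."""
    if len(data) < 20:
        raise ValueError("need at least 20 bytes")
    # Network byte order: two 16-bit ports, two 32-bit numbers, then
    # four 16-bit words (offset/reserved/flags, window, checksum, urgent).
    (sport, dport, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", data[:20])
    return TcpHeader(
        source_port=sport,
        destination_port=dport,
        sequence_number=seq,
        acknowledgment_number=ack,
        data_offset=off_flags >> 12,   # top 4 bits of the word
        flags=off_flags & 0x3F,        # bottom 6 bits; Reserved is skipped
        window=window,
        checksum=checksum,
        urgent_pointer=urgent,
    )
```

Every shift, mask, and format character above restates a width that
is already visible in the diagram, which is why deriving such code
automatically is attractive.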

120	   Unfortunately, the format of such packet diagrams varies both within
121	   and between documents.  This variation makes it difficult to build
122	   tools to generate parsers from the specifications.  Better tooling
123	   could be developed if protocol specifications adopted a consistent
124	   format for their packet descriptions.  Indeed, this underpins the
125	   format described by this draft: we want to retain the benefits that
126	   packet header diagrams provide, while identifying the benefits of
127	   adopting a consistent format.

129	   This document describes a consistent packet header diagram format and
130	   accompanying structured text constructs that allow for the parsing
131	   process of protocol headers to be fully specified.  This provides
132	   support for the automatic generation of parser code.  Broad design
133	   principles that seek to maintain the primacy of human readability
134	   and flexibility in writing are described before the format itself
135	   is given.

137	   This document is itself an example of the approach that it describes,
138	   with the packet header diagrams and structured text format described
139	   by example.  Examples that do not form part of the protocol
140	   description language are marked by a colon at the beginning of each
141	   line; this prevents them from being parsed by the accompanying
142	   tooling.
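
As a minimal sketch of the colon convention, the filter below drops
example lines before any further processing. It assumes that a line
counts as an example when its first non-blank character is a colon;
the accompanying tooling may apply a different rule, and the function
name strip_examples is hypothetical.

```python
def strip_examples(lines):
    """Return only the lines that are not marked as examples.

    Assumption: a line is an example if, ignoring leading whitespace,
    it begins with a colon.
    """
    return [line for line in lines if not line.lstrip().startswith(":")]


document = [
    "   :   |          Source Port          |",
    "   Version (V): 4 bits.",
]
print(strip_examples(document))
```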

144	   This draft describes early work.  As consensus builds around the
145	   particular syntax of the format described, a formal ABNF
146	   specification (Appendix A) will be provided.

148	   Example specifications of a number of IETF protocols described using
149	   the Augmented Packet Header Diagram format are available.  These
150	   documents describe UDP [draft-mcquistin-augmented-udp-example], TCP
151	   [draft-mcquistin-augmented-tcp-example], and QUIC
152	   [draft-mcquistin-quic-augmented-diagrams].  Code that parses those
153	   documents and automatically generates parser code for the described
154	   protocols is described in Appendix B.

156	2.  Background

158	   This section begins by considering how packet header diagrams are
159	   used in existing documents.  This exposes the limitations that the
160	   current usage has in terms of machine-readability, guiding the design
161	   of the format that this document proposes.

163	   While this document focuses on the machine-readability of packet
164	   format diagrams, this section also discusses the use of other
165	   structured or formal languages within IETF documents.  Considering
166	   how and why these languages are used provides an instructive contrast
167	   to the relatively incremental approach proposed here.

169	2.1.  Limitations of Current Packet Format Diagrams
170	   :   The RESET_STREAM frame is as follows:
171	   :
172	   :    0                   1                   2                   3
173	   :    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
174	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
175	   :   |                        Stream ID (i)                        ...
176	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
177	   :   |  Application Error Code (16)  |
178	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
179	   :   |                        Final Size (i)                       ...
180	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
181	   :
182	   :   RESET_STREAM frames contain the following fields:
183	   :
184	   :   Stream ID:  A variable-length integer encoding of the Stream ID
185	   :      of the stream being terminated.
186	   :
187	   :   Application Protocol Error Code:  A 16-bit application protocol
188	   :      error code (see Section 20.1) which indicates why the stream
189	   :      is being closed.
190	   :
191	   :   Final Size: A variable-length integer indicating the final size
192	   :      of the stream by the RESET_STREAM sender, in unit of bytes.

194	     Figure 2: QUIC's RESET_STREAM frame format (from [QUIC-TRANSPORT])

196	   Packet header diagrams are frequently used in IETF standards to
197	   describe the format of binary protocols.  While there is no standard
198	   for how these diagrams should be formatted, they have a broadly
199	   similar structure, where the layout of a protocol data unit (PDU) or
200	   structure is shown in diagrammatic form, followed by a description
201	   list of the fields that it contains.  An example of this format,
202	   taken from the QUIC specification, is given in Figure 2.

204	   These packet header diagrams, and the accompanying descriptions, are
205	   formatted for human readers rather than for automated processing.  As
206	   a result, while there is rough consistency in how packet header
207	   diagrams are formatted, there are a number of limitations that make
208	   them difficult to work with programmatically:

210	   Inconsistent syntax:  There are two classes of consistency that are
211	      needed to support automated processing of specifications: internal
212	      consistency within a diagram or document, and external consistency
213	      across all documents.

215	      Figure 2 gives an example of internal inconsistency.  Here, the
216	      packet diagram shows a field labelled "Application Error Code",
217	      while the accompanying description lists the field as "Application
218	      Protocol Error Code".  The use of an abbreviated name is suitable
219	      for human readers, but makes parsing the structure difficult for
220	      machines.  Figure 3 gives a further example, where the description
221	      includes an "Option-Code" field that does not appear in the packet
222	      diagram; and where the description states that each field is 16
223	      bits in length, but the diagram shows the OPTION_RELAY_PORT as 13
224	      bits, and Option-Len as 19 bits.  Another example is [RFC6958],
225	      where the packet format diagram showing the structure of the
226	      Burst/Gap Loss Metrics Report Block shows the Number of Bursts
227	      field as being 12 bits wide but the corresponding text describes
228	      it as 16 bits.

230	      Comparing Figure 2 with Figure 3 exposes external inconsistency
231	      across documents.  While the packet format diagrams are broadly
232	      similar, the surrounding text is formatted differently.  If
233	      machine parsing is to be made possible, then this text must be
234	      structured consistently.

236	   Ambiguous constraints:  The constraints that are enforced on a
237	      particular field are often described ambiguously, or in a way that
238	      cannot be parsed easily.  In Figure 3, each of the three fields in
239	      the structure is constrained.  The first two fields ("Option-Code"
240	      and "Option-Len") are to be set to constant values (note the
241	      inconsistency in how these constraints are expressed in the
242	      description).  However, the third field ("Downstream Source Port")
243	      can take a value from a constrained set.  This constraint is
244	      expressed in prose that cannot readily be understood by machine.

246	   Poor linking between sub-structures:  Protocol data units and other
247	      structures are often composed of sub-structures that are defined
248	      elsewhere, either in the same document, or within another
249	      document.  Chaining these structures together is essential for
250	      machine parsing: the parsing process for a protocol data unit is
251	      only fully expressed if all elements can be parsed.

253	      Figure 2 highlights the difficulty that machine parsers have in
254	      chaining structures together.  Two fields ("Stream ID" and "Final
255	      Size") are described as being encoded as variable-length integers;
256	      this is a structure described elsewhere in the same document.
257	      Structured text is required both alongside the definition of the
258	      containing structure and with the definition of the sub-structure,
259	      to allow a parser to link the two together.

261	   Lack of extension and evolution syntax:  Protocols are often
262	      specified across multiple documents, either because the protocol
263	      explicitly includes extension points (e.g., profiles and payload
264	      format specifications in RTP [RFC3550]) or because the definition
265	      of a protocol data unit has changed and evolved over time.  As a
266	      result, it is essential that syntax be provided to allow for a
267	      complete definition of a protocol's parsing process to be
268	      constructed across multiple documents.

270	   :   The format of the "Relay Source Port Option" is shown below:
271	   :
272	   :    0                   1                   2                   3
273	   :    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
274	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
275	   :   |    OPTION_RELAY_PORT    |         Option-Len                  |
276	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
277	   :   |    Downstream Source Port     |
278	   :   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
279	   :
280	   :   Where:
281	   :
282	   :   Option-Code:  OPTION_RELAY_PORT. 16-bit value, 135.
283	   :
284	   :   Option-Len:  16-bit value to be set to 2.
285	   :
286	   :   Downstream Source Port:  16-bit value.  To be set by the IPv6
287	   :      relay either to the downstream relay agent's UDP source port
288	   :      used for the UDP packet, or to zero if only the local relay
289	   :      agent uses the non-DHCP UDP port (not 547).

291	        Figure 3: DHCPv6's Relay Source Port Option (from [RFC8357])
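
The width mismatch in Figure 3 is the kind of internal inconsistency
that a simple check can detect. The sketch below assumes the
conventional layout in which each bit occupies two characters, so a
cell of n characters between "|" delimiters spans (n + 1) / 2 bits.
The names field_widths, mismatches, ROW, and DESCRIBED are
hypothetical, and the described width of Option-Code is attached to
the diagram label OPTION_RELAY_PORT by hand, since the two names
differ.

```python
def field_widths(row: str) -> list[tuple[str, int]]:
    """Return (label, width in bits) for each field on one diagram row."""
    cells = row.strip().strip("|").split("|")
    return [(cell.strip(), (len(cell) + 1) // 2) for cell in cells]


def mismatches(row: str, described: dict[str, int]):
    """List (label, drawn bits, described bits) where the two disagree."""
    return [(label, drawn, described[label])
            for label, drawn in field_widths(row)
            if label in described and described[label] != drawn]


# First row of the diagram in Figure 3, rebuilt with explicit padding.
ROW = ("|" + " " * 4 + "OPTION_RELAY_PORT" + " " * 4 +
       "|" + " " * 9 + "Option-Len" + " " * 18 + "|")

# Widths stated in the description list that follows the diagram.
DESCRIBED = {"OPTION_RELAY_PORT": 16, "Option-Len": 16}

for label, drawn, stated in mismatches(ROW, DESCRIBED):
    print(f"{label}: drawn as {drawn} bits, described as {stated} bits")
```

A check of this kind is only possible when the labels in the diagram
and in the description can be matched mechanically, which is one
motivation for the stricter structure proposed in this document.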

293	2.2.  Formal languages in standards documents

295	   A small proportion of IETF standards documents contain structured and
296	   formal languages, including ABNF [RFC5234], ASN.1 [ASN1], C, CBOR
297	   [RFC7049], JSON, the TLS presentation language [RFC8446], YANG models
298	   [RFC7950], and XML.  While this broad range of languages may be
299	   problematic for the development of tooling to parse specifications,
300	   these, and other, languages serve a range of different use cases.
301	   ABNF, for example, is typically used to specify text protocols, while
302	   ASN.1 is used to specify data structure serialisation.  This document
303	   specifies a structured language for specifying the parsing of binary
304	   protocol data units.

306	3.  Design Principles

308	   The use of structures that are designed to support machine
309	   readability might interfere with the existing ways in
310	   which protocol specifications are used and authored.  To the extent
311	   that these existing uses are more important than machine readability,
312	   such interference must be minimised.

314	   In this section, the broad design principles that underpin the format
315	   described by this document are given.  However, these principles
316	   apply more generally to any approach that introduces structured and
317	   formal languages into standards documents.

319	   It should be noted that these are design principles: they expose the
320	   trade-offs that are inherent within any given approach.  Violating
321	   these principles is sometimes necessary and beneficial, and this
322	   document sets out the potential consequences of doing so.

324	   The central tenet that underpins these design principles is a
325	   recognition that the standardisation process is not broken, and so
326	   does not need to be fixed.  Failure to recognise this will likely
327	   lead to approaches that are incompatible with the standards process,
328	   or that will see limited adoption.  However, the standards process
329	   can be improved with appropriate approaches, as guided by the
330	   following broad design principles:

332	   Most readers are human:  Primarily, standards documents should be
333	      written for people, who require text and diagrams that they can
334	      understand.  Structures that cannot be easily parsed by people
335	      should be avoided, and if included, should be clearly delineated
336	      from human-readable content.

338	      Any approach that shifts this balance -- that is, that primarily
339	      targets machine readers -- is likely to be disruptive to the
340	      standardisation process, which relies upon discussion centred
341	      around documents written in prose.

343	   Writing tools are diverse:  Standards document writing is a
344	      distributed process that involves a diverse set of tools and
345	      workflows.  The introduction of machine-readable structures into
346	      specifications should not require that specific tools are used to
347	      produce standards documents, to ensure that disruption to existing
348	      workflows is minimised.  This does not preclude the development of
349	      optional, supplementary tools that aid in authoring machine-
350	      readable structures.

352	      The immediate impact of requiring specific tooling is that
353	      adoption is likely to be limited.  A long-term impact might be
354	      that authors whose workflows are incompatible might be alienated
355	      from the process.

357	   Canonical specifications:  As far as possible, machine-readable
358	      structures should not replicate the human readable specification
359	      of the protocol within the same document.  Machine-readable
360	      structures should form part of a canonical specification of the
361	      protocol.  Adding supplementary machine-readable structures, in
362	      parallel to the existing human readable text, is undesirable
363	      because it creates the potential for inconsistency.

365	      As an example, program code that describes how a protocol data
366	      unit can be parsed might be provided as an appendix within a
367	      standards document.  This code would provide a specification of
368	      the protocol that is separate from the prose description in the
369	      main body of the document.  This has the undesirable effect of
370	      introducing the potential for the program code to specify
371	      behaviour that the prose-based specification does not, and vice
372	      versa.

374	   Expressiveness:  Any approach should be expressive enough to capture
375	      the syntax and parsing process for the majority of binary
376	      protocols.  If a given language is not sufficiently expressive,
377	      then adoption is likely to be limited.  At the limits of what can
378	      be expressed by the language, authors are likely to revert to
379	      defining the protocol in prose: this undermines the broad goal of
380	      using structured and formal languages.  Equally, though,
381	      understandable specifications and ease of use are critical for
382	      adoption.  A tool that is simple to use and addresses the most
383	      common use cases might be preferred to a complex tool that
384	      addresses all use cases.

386	      It may be desirable to restrict expressiveness, however, to
387	      guarantee intrinsic safety, security, and computability properties
388	      of both the generated parser code for the protocol, and the parser
389	      of the description language itself.  In much the same way as the
390	      language-theoretic security ([LANGSEC]) community advocates for
391	      programming language design to be informed by the desired
392	      properties of the parsers for those languages, protocol designers
393	      should be aware of the implications of their design choices.  The
394	      expressiveness of the protocol description languages that they use
395	      to define their protocols can force such awareness.

397	      Broadly, those languages that have grammars which are more
398	      expressive tend to have parsers that are more complex and less
399	      safe.  As a result, while considering the other goals described in
400	      this document, protocol description languages should attempt to be
401	      minimally expressive, and either restrict protocol designs to
402	      those for which safe and secure parsers can be generated, or as a
403	      minimum, ensure that protocol designers are aware of the
404	      boundaries their designs cross, in terms of computability and
405	      decidability [SASSAMAN].

407	   Minimise required change:  Any approach should require as few changes
408	      as possible to the way that documents are formatted, authored, and
409	      published.  Forcing adoption of a particular structured or formal
410	      language is incompatible with the IETF's standardisation process:
411	      there are very few components of standards documents that are non-
412	      optional.

414	4.  Augmented Packet Header Diagrams

416	   The design principles described in Section 3 can largely be met by
417	   the existing uses of packet header diagrams.  These diagrams aid
418	   human readability, do not require new or specialised tools to write,
419	   do not split the specification into multiple parts, can express most
420	   binary protocol features, and require no changes to existing
421	   publication processes.

423	   However, as discussed in Section 2.1, there are limitations to how
424	   packet header diagrams are used that must be addressed if they are to
425	   be parsed by machine.  In this section, an augmented packet header
426	   diagram format is described.

428	   The concept is first illustrated by example.  This is appropriate,
429	   given the visual nature of the language.  In future drafts, these
430	   examples will be parsable using provided tools, and a formal
431	   specification of the augmented packet diagrams will be given in
432	   Appendix A.

434	4.1.  PDUs with Fixed and Variable-Width Fields

436	   The simplest PDU is one that contains only a set of fixed-width
437	   fields in a known order, with no optional fields or variation in the
438	   packet format.

440	   Some packet formats include variable-width fields, where the size of
441	   a field is either derived from the value of some previous field, or
442	   is unspecified and inferred from the total size of the packet and the
443	   size of the other fields.

445	   To ensure that there is no ambiguity, a PDU description can contain
446	   only one field whose length is unspecified.  The length of a single
447	   field, where all other fields are of known (but perhaps variable)
448	   length, can be inferred from the total size of the containing PDU.
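
The single-unspecified-field rule can be sketched as follows. Field
lengths are given in bits, with None standing for the one field whose
length is unspecified; the function name infer_lengths and this
representation are assumptions made for illustration, not part of the
format.

```python
from typing import Optional


def infer_lengths(pdu_bits: int,
                  lengths: list[Optional[int]]) -> list[int]:
    """Fill in the one unspecified length from the total PDU size."""
    unknown = [i for i, n in enumerate(lengths) if n is None]
    if len(unknown) > 1:
        # Two unknowns cannot be separated from a single total.
        raise ValueError("at most one field may have an unspecified length")
    resolved = list(lengths)
    if unknown:
        remaining = pdu_bits - sum(n for n in lengths if n is not None)
        if remaining < 0:
            raise ValueError("known fields exceed the PDU size")
        resolved[unknown[0]] = remaining
    return resolved


# A 64-bit PDU with two 16-bit fields and one unspecified field.
print(infer_lengths(64, [16, 16, None]))
```

With two or more unspecified fields the subtraction yields only their
combined size, so the split between them would be ambiguous.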

450	   A PDU description is introduced by the exact phrase "A/An _______ is
451	   formatted as follows:" at the end of a paragraph.  This is followed
452	   by the PDU description itself, as a packet diagram within an
453	   <artwork> element in the XML representation, starting with a header
454	   line to show the bit width of the diagram.  The description of the
455	   fields follows the diagram, as an XML <dl> list, after a paragraph
456	   containing the text "where:".
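
Because the introductory phrase is fixed, a tool can locate PDU
definitions with a simple pattern. The sketch below is one possible
reading of that rule, not the matching logic of the accompanying
tooling: it assumes the paragraph text has already been extracted from
the XML and that PDU names contain neither periods nor colons. The
names INTRO and pdu_name are hypothetical.

```python
import re
from typing import Optional

# "A" or "An", the PDU name, then the fixed phrase at the very end.
INTRO = re.compile(
    r"\b(?:An|A) (?P<name>[^.:]+) is formatted as follows:$")


def pdu_name(paragraph: str) -> Optional[str]:
    """Return the PDU name if the paragraph introduces a description."""
    text = " ".join(paragraph.split())   # collapse line breaks and spaces
    match = INTRO.search(text)
    return match.group("name") if match else None


print(pdu_name("For example, this can be illustrated using the IPv4\n"
               "Header Format [RFC791].  An IPv4 Header is formatted\n"
               "as follows:"))
```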

458	   PDU names must be unique, both within a document, and across all
459	   documents that are linked together (i.e., using the structured
460	   language defined in Section 4.10).

462	   Each field of the description starts with a <dt> tag comprising the
463	   field name and an optional short name in parentheses.  These are
464	   followed by a colon, the field length, an optional presence
465	   expression (described in Section 4.2), and a terminating period.  The
466	   following <dd> tag contains a prose description of the field.  Field
467	   names cannot be the same as a previously defined PDU name, and must
468	   be unique within a given structure definition.
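
The layout of a field label can be illustrated with a small parser
for the text of a <dt> element. This is a sketch under stated
assumptions: it handles only the field name, the optional short name,
and the length, and it does not interpret presence expressions, whose
syntax is given in Section 4.2. The names DT, CONSTANT, Field, and
parse_dt are hypothetical.

```python
import re
from typing import NamedTuple, Optional

DT = re.compile(
    r"^(?P<name>[^:(]+?)\s*(?:\((?P<short>[^)]+)\))?"
    r":\s*(?P<length>.+?)\.$")
CONSTANT = re.compile(r"^(\d+) (bits?|bytes?)$")
BITS_PER_UNIT = {"bit": 1, "bits": 1, "byte": 8, "bytes": 8}


class Field(NamedTuple):
    name: str
    short: Optional[str]
    length: str             # length text exactly as written
    bits: Optional[int]     # constant width in bits, if there is one


def parse_dt(text: str) -> Field:
    """Split a field label into name, short name, and length."""
    match = DT.match(text.strip())
    if match is None:
        raise ValueError(f"not a field label: {text!r}")
    length = match["length"]
    constant = CONSTANT.match(length)
    # Expressions such as "(IHL-5)*32 bits" are kept as text only.
    bits = int(constant[1]) * BITS_PER_UNIT[constant[2]] if constant else None
    return Field(match["name"], match["short"], length, bits)


print(parse_dt("Internet Header Length (IHL): 4 bits."))
print(parse_dt("Options: (IHL-5)*32 bits."))
```

Normalising constant lengths to bits lets byte-sized and bit-sized
fields be handled uniformly when laying out a structure.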

470	   For example, this can be illustrated using the IPv4 Header Format
471	   [RFC791].  An IPv4 Header is formatted as follows:

473	        0                   1                   2                   3
474	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
475	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
476	       |Version|   IHL |    DSCP   |ECN|         Total Length          |
477	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
478	       |         Identification        |Flags|     Fragment Offset     |
479	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
480	       | Time to Live  |    Protocol   |        Header Checksum        |
481	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
482	       |                         Source Address                        |
483	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
484	       |                      Destination Address                      |
485	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
486	       |                            Options                          ...
487	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
488	       |                                                               :
489	       :                            Payload                            :
490	       :                                                               |
491	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

493	   where:

495	   Version (V): 4 bits.  This is a fixed-width field, whose full label
496	      is shown in the diagram.  The field's width -- 4 bits -- is given
497	      in the label of the description list, separated from the field's
498	      label by a colon.

500	   Internet Header Length (IHL): 4 bits.  This is a shorter field, whose
501	      full label is too long to be shown in the diagram.  A short label
502	      (IHL) is used in the diagram, and this short label is provided, in
503	      parentheses, after the full label in the description list.

505	   Differentiated Services Code Point (DSCP): 6 bits.  This is a fixed-
506	      width field, as previously discussed.

508	   Explicit Congestion Notification (ECN): 2 bits.  This is a fixed-
509	      width field, as previously discussed.

511	   Total Length (TL): 2 bytes.  This is a fixed-width field, as
512	      previously discussed.  Where fields are an integral number of
513	      bytes in size, the field length can be given in bytes rather than
514	      in bits.

516	   Identification: 2 bytes.  This is a fixed-width field, as previously
517	      discussed.

519	   Flags: 3 bits.  This is a fixed-width field, as previously discussed.

521	   Fragment Offset: 13 bits.  This is a fixed-width field, as previously
522	      discussed.

524	   Time to Live (TTL): 1 byte.  This is a fixed-width field, as
525	      previously discussed.

527	   Protocol: 1 byte.  This is a fixed-width field, as previously
528	      discussed.

530	   Header Checksum: 2 bytes.  This is a fixed-width field, as previously
531	      discussed.

533	   Source Address: 32 bits.  This is a fixed-width field, as previously
534	      discussed.

536	   Destination Address: 32 bits.  This is a fixed-width field, as
537	      previously discussed.

539	   Options: (IHL-5)*32 bits.  This is a variable-length field, whose
540	      length is defined by the value of the field with short label IHL
541	      (Internet Header Length).  Constraint expressions can be used in
542	      place of constant values: the grammar for the expression language
543	      is defined in Appendix A.1.  Constraints can include a previously
544	      defined field's short or full label, where one has been defined.
545	      Short variable-length fields are indicated by "..." instead of a
546	      pipe at the end of the row.

548	   Payload: TL - ((IHL*32)/8) bytes.  This is a multi-row variable-
549	      length field, constrained by the values of fields TL and IHL.
550	      Instead of the "..." notation, ":" is used to indicate that the
551	      field is variable-length.  The use of ":" instead of "..."
552	      indicates the field is likely to be a longer, multi-row field.
553	      However, semantically, there is no difference: these different
554	      notations are for the benefit of human readers.
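
   The two constraint expressions above can be evaluated directly once
   IHL and TL have been read.  The following Python sketch is
   illustrative only: the function name and the error handling are
   assumptions, not part of the format.

```python
from typing import Tuple


def ipv4_variable_lengths(ihl: int, total_length: int) -> Tuple[int, int]:
    """Return (options_bits, payload_bytes) for given IHL and TL values."""
    # Options: (IHL-5)*32 bits
    options_bits = (ihl - 5) * 32
    # Payload: TL - ((IHL*32)/8) bytes
    payload_bytes = total_length - ((ihl * 32) // 8)
    # Assumption: a negative length means the header values are inconsistent.
    if options_bits < 0 or payload_bytes < 0:
        raise ValueError("IHL and TL give a negative field length")
    return options_bits, payload_bytes
```

   For example, IHL = 6 and TL = 60 give a 32-bit Options field and a
   36-byte Payload.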

556	4.2.  PDUs That Cross-Reference Previously Defined Fields

558	   Binary formats often reference sub-structures that have been defined
559	   earlier in the specification.  For example, in RTP [RFC3550], the
560	   Contributing Source Identifiers in an RTP Data Packet are defined as
561	   comprising a list of Source Identifier elements.  A Source Identifier
562	   is formatted as follows:

564	        0                   1                   2                   3
565	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
566	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
567	       |                               SSRC                            |
568	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

570	   where:

572	   SSRC: 32 bits.  This is a fixed-width field, as described previously.

574	   The following example shows how a Source Identifier can be referenced
575	   in the description of an RTP Data Packet.  It also shows how the
576	   presence of some fields in a format may be dependent on the values of
577	   an earlier field.

579	   An RTP Data Packet is formatted as follows:

581	        0                   1                   2                   3
582	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
583	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
584	       | V |P|X|  CC   |M|     PT      |       Sequence Number         |
585	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
586	       |                           Timestamp                           |
587	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
588	       |                Synchronization Source identifier              |
589	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
590	       |                [Contributing Source identifiers]              |
591	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
592	       |                       Header Extension                        |
593	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
594	       |                             Payload                           :
595	       :                                                               :
596	       :                                                               |
597	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
598	       |                           Padding             | Padding Count |
599	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

601	   where:

603	   Version (V): 2 bits.  This is a fixed-width field, as described
604	      previously.

606	   Padding (P): 1 bit.  This is a fixed-width field, as described
607	      previously.

609	   Extension (X): 1 bit.  This is a fixed-width field, as described
610	      previously.

612	   CSRC count (CC): 4 bits.  This is a fixed-width field, as described
613	      previously.

615	   Marker (M): 1 bit.  This is a fixed-width field, as described
616	      previously.

618	   Payload Type (PT): 7 bits.  This is a fixed-width field, as described
619	      previously.

621	   Sequence Number: 16 bits.  This is a fixed-width field, as
622	      described previously.

624	   Timestamp: 32 bits.  This is a fixed-width field, as described
625	      previously.

627	   Synchronization Source identifier: 1 Source Identifier.  This is a
628	      field whose structure is a previously defined PDU format (Source
629	      Identifier).  To indicate this, the width of the field is
630	      expressed in terms of cross-referenced structure.  When used in
631	      constraint expressions, PDU names refer to the length of that PDU
632	      structure.

634	   Contributing Source identifiers: CC Source Identifier.  Where a field
635	      is comprised of a sequence of previously defined structures,
636	      square brackets can be used to indicate this in the diagram.  The
637	      length of the sequence can be defined using the constraint
638	      expression grammar as described earlier.  Where the length is
639	      unknown, the type of each element of the sequence must be given in
640	      square brackets.

642	      In this example, both a PDU name (Source Identifier) and a field
643	      name (CC) are used in the constraint expression.  The PDU name
644	      refers to the length of the PDU, while the field name refers to
645	      the value of the field.  This is possible because field names
646	      cannot be the same as previously defined PDU names.

648	   Header Extension: 32 bits; present only when X == 1.  This is a field
649	      whose presence is predicated on an expression given using the
650	      constraint expression grammar described earlier.  Optional fields
651	      can be of any previously defined format (e.g., fixed- or variable-
652	      width).  Optional fields are indicated by the presence of ";
653	      present only when [expr]." at the end of the definition term
654	      (i.e., the text contained within the <dt> tag).

656	      [Note that this example deviates from the format as described in
657	      [RFC3550].  As specified in that document, the Header Extension
658	      would be a cross-referenced structure.  This is not shown here for
659	      brevity.]

661	   Payload.  The length of the Payload is not specified, and hence needs
662	      to be inferred from the total length of the packet and the lengths
663	      of the known fields.  There can only be one field of unspecified
664	      size in a PDU.

666	   Padding: PC bytes; present only when (P == 1) && (PC > 0).  This is a
667	      variable size field, with size dependent on a later field in the
668	      packet.  Fields can only depend on the value of a later field if
669	      they follow a field with unspecified size.

671	   Padding Count (PC): 1 byte; present only when P == 1.  This is a
672	      fixed-width field, as previously discussed.
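
   As a rough illustration of how the presence constraints and the
   unspecified-length Payload interact, the following Python sketch
   parses the simplified RTP Data Packet exactly as described in this
   section (a fixed 32-bit Header Extension, and a Padding Count byte
   that is separate from the PC bytes of Padding).  The function name
   and dictionary keys are assumptions made for the example; this is
   not a complete RFC 3550 parser.

```python
import struct
from typing import Any, Dict


def parse_rtp_data_packet(data: bytes) -> Dict[str, Any]:
    """Parse the simplified RTP Data Packet described in this section."""
    if len(data) < 12:
        raise ValueError("packet too short for the fixed-width fields")
    b0, b1, seq, timestamp = struct.unpack_from("!BBHI", data, 0)
    fields: Dict[str, Any] = {
        "V": b0 >> 6, "P": (b0 >> 5) & 1, "X": (b0 >> 4) & 1, "CC": b0 & 0x0F,
        "M": b1 >> 7, "PT": b1 & 0x7F,
        "Sequence Number": seq, "Timestamp": timestamp,
    }
    (fields["SSRC"],) = struct.unpack_from("!I", data, 8)
    offset = 12

    # Contributing Source identifiers: CC Source Identifier (32 bits each).
    csrc_end = offset + 4 * fields["CC"]
    if csrc_end > len(data):
        raise ValueError("packet too short for CC Source Identifiers")
    fields["CSRC"] = [struct.unpack_from("!I", data, o)[0]
                      for o in range(offset, csrc_end, 4)]
    offset = csrc_end

    # Header Extension: 32 bits; present only when X == 1.
    if fields["X"] == 1:
        if offset + 4 > len(data):
            raise ValueError("packet too short for Header Extension")
        fields["Header Extension"] = data[offset:offset + 4]
        offset += 4

    # Fields after the Payload are located by working back from the end.
    end = len(data)
    if fields["P"] == 1:
        if end - offset < 1:
            raise ValueError("packet too short for Padding Count")
        pc = data[end - 1]          # Padding Count (PC): 1 byte
        end -= 1
        fields["PC"] = pc
        if pc > 0:                  # Padding: present when P == 1 and PC > 0
            if end - offset < pc:
                raise ValueError("Padding Count exceeds remaining data")
            fields["Padding"] = data[end - pc:end]
            end -= pc

    # Payload: length inferred from what remains.
    fields["Payload"] = data[offset:end]
    return fields
```

   Because only one field has an unspecified size, the parser can fix
   the Payload boundaries from both ends of the packet.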

674	4.3.  PDUs with Non-Contiguous Fields

676	   In some binary formats, fields are striped across multiple non-
677	   contiguous bits.  This is often to allow for backwards compatibility
678	   with previous definitions of the same fields in earlier documents:
679	   striping in this way allows for careful use of the possible range of
680	   values.

682	   This format is illustrated using the STUN Message Type
683	   [draft-ietf-tram-stunbis-21].  A STUN Message Type is formatted as
684	   follows:

686	        0                   1
687	        0 1 2 3 4 5 6 7 8 9 0 1 2 3
688	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+
689	       |M|M|M|M|M|C|M|M|M|C|M|M|M|M|
690	       |B|A|9|8|7|1|6|5|4|0|3|2|1|0|
691	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+

693	   where:

695	   Method (M): 12 bits.  This field is comprised of multiple sub-fields
696	      (M0 through MB) as shown in the diagram.  That these sub-fields
697	      should be concatenated, after parsing, into a single field is
698	      indicated by their being labelled using the 'M' short field name
699	      followed by a single hexadecimal digit, with the least significant
700	      bit labelled with 0, and subsequent bits labelled in sequence.

702	   Class (C): 2 bits.  This field follows the same format as M described
703	      above.
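
   The concatenation rule can be sketched in Python as follows.  The
   label list is copied from the diagram above; the function name and
   the representation of the message type as a 14-bit integer (with
   the leftmost diagram bit as the most significant) are assumptions
   made for illustration.

```python
from typing import Dict, List

# Sub-field labels in diagram order, leftmost (most significant) bit first.
STUN_MESSAGE_TYPE_LABELS = ["MB", "MA", "M9", "M8", "M7", "C1", "M6",
                            "M5", "M4", "C0", "M3", "M2", "M1", "M0"]


def concatenate_subfields(value: int, labels: List[str]) -> Dict[str, int]:
    """Reassemble striped sub-fields into whole fields, keyed by short name."""
    fields: Dict[str, int] = {}
    width = len(labels)
    for position, label in enumerate(labels):
        bit = (value >> (width - 1 - position)) & 1
        short_name = label[:-1]        # e.g. "M" or "C"
        index = int(label[-1], 16)     # hexadecimal digit; 0 is the LSB
        fields[short_name] = fields.get(short_name, 0) | (bit << index)
    return fields
```

   For instance, the 14-bit value 0x0101 has only the C1 and M0 bits
   set, so it yields M = 1 and C = 2.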

705	4.4.  PDUs with Constraints on Field Values

707	   A PDU may be defined not only by the layout and type of its fields,
708	   but also by the value of those fields.  For example, field values may
709	   be constrained to be of a known exact value or to be within a range.
710	   More generally, our format enables a boolean expression to be
711	   attached to a field, which must be true for the PDU to be parsed
712	   successfully.

714	   This format is illustrated using the QUIC Long Header Packet format
715	   [QUIC-TRANSPORT].  A Long Header is formatted as follows:

717	    0                   1                   2                   3
718	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
719	   +-+-+-+-+-+-+-+-+
720	   |1|1| T | R | P |
721	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
722	   |                             Version                           |
723	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
724	   |    DCID Len   |
725	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
726	   |                 Destination Connection ID (DCID)            ...
727	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
728	   |    SCID Len   |
729	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
730	   |                  Source Connection ID (SCID)                ...
731	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

733	   where:

735	   Header Form (HF): 1 bit; HF == 1.  This is a fixed-width field,
736	      constrained to be of a known, exact value.  At most one field
737	      value constraint may be given, and if provided, it must be given
738	      as a boolean expression, separated by a semi-colon in the field
739	      definition name (i.e., the text contained within the <dt> tag).
740	      If present, a value constraint must follow the name, short name,
741	      and length of the field, but appear before any presence
742	      constraint, if applicable.  The order of the fields must be the
743	      same in both the diagram and description list.

745	   Fixed Bit (FB): 1 bit; FB == 1.  This is a fixed-width field, with a
746	      value constraint, as previously described.

748	   Long Packet Type (T): 2 bits.  This is a fixed-width field as
749	      previously described.

751	   Reserved Bits (R): 2 bits.  This is a fixed-width field as previously
752	      described.

754	   Packet Number Length (P): 2 bits.  This is a fixed-width field as
755	      previously described.

757	   Version: 32 bits.  This is a fixed-width field as previously
758	      described.

760	   DCID Len (DLen): 1 byte; DLen <= 20.  This is a fixed-width field,
761	      with a value constraint, as previously described.  Note that the
762	      constraint language is not limited to equality; it is defined
763	      fully in Appendix A.1.

765	   Destination Connection ID: DLen bytes.  This is a variable-width
766	      field as previously described.

768	   SCID Len (SLen): 1 byte; SLen <= 20.  This is a fixed-width field,
769	      with a value constraint, as previously described.

771	   Source Connection ID: SLen bytes.  This is a variable-width field as
772	      previously described.
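
   A parser derived from this description would check each value
   constraint as soon as the constrained field has been read.  The
   Python sketch below shows one way this might look; the function
   name, the dictionary keys, and the choice to raise ValueError on a
   failed constraint are assumptions, not behaviour mandated by the
   format.

```python
from typing import Any, Dict


def parse_long_header(data: bytes) -> Dict[str, Any]:
    """Parse a Long Header, enforcing the value constraints listed above."""
    pos = 0

    def take(n: int) -> bytes:
        # Consume n bytes, failing if the input is truncated.
        nonlocal pos
        if pos + n > len(data):
            raise ValueError("truncated Long Header")
        chunk = data[pos:pos + n]
        pos += n
        return chunk

    first = take(1)[0]
    fields: Dict[str, Any] = {
        "HF": first >> 7,
        "FB": (first >> 6) & 1,
        "T": (first >> 4) & 3,
        "R": (first >> 2) & 3,
        "P": first & 3,
    }
    if fields["HF"] != 1:
        raise ValueError("constraint HF == 1 failed")
    if fields["FB"] != 1:
        raise ValueError("constraint FB == 1 failed")

    fields["Version"] = int.from_bytes(take(4), "big")

    fields["DLen"] = take(1)[0]
    if fields["DLen"] > 20:
        raise ValueError("constraint DLen <= 20 failed")
    fields["DCID"] = take(fields["DLen"])      # DLen bytes

    fields["SLen"] = take(1)[0]
    if fields["SLen"] > 20:
        raise ValueError("constraint SLen <= 20 failed")
    fields["SCID"] = take(fields["SLen"])      # SLen bytes
    return fields
```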

774	4.5.  PDUs That Extend Sub-Structures

776	   A PDU may not only use or reference existing sub-structures, but may
777	   also extend them, adding new fields or enforcing different or
778	   additional constraints.

780	   Where a sub-structure is extended, the diagram may show the sub-
781	   structure as a block, labelled with the sub-structure name.  It may
782	   also be desirable to show the sub-structure diagram in full; in this
783	   case, the fields must be given in the same order and be of the same
784	   length.  New field constraints can be shown.  Similarly, in the
785	   description list, those fields inherited without change (i.e., with
786	   no change to their constraints) do not need to be repeated.  Those
787	   with different or additional constraints must be described, and the
788	   order of the fields in the description list must match that of the
789	   sub-structure and the containing structure.

791	   This format is illustrated using the QUIC Retry Packet format
792	   [QUIC-TRANSPORT].  A Retry Packet is formatted as follows:

794	     0                   1                   2                   3
795	     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
796	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
797	    |                                                               :
798	    :                          Long Header                          :
799	    :                                                               |
800	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
801	    |                          Retry Token                        ...
802	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
803	    |                                                               |
804	    +                                                               +
805	    |                                                               |
806	    +                     Retry Integrity Tag                       +
807	    |                                                               |
808	    +                                                               +
809	    |                                                               |
810	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

812	   where:

814	   Long Header (LH): 1 Long Header; LH.T == 3.  This field is a
815	      previously defined sub-structure.  Its constraints can access
816	      fields in that sub-structure.  In this example, the T field of the
817	      Long Header must be equal to 3.

819	   Retry Token.  This is a variable-length field as previously defined.

821	   Retry Integrity Tag: 128 bits.  This is a fixed-width field as
822	      previously defined.

824	   As shown, the Long Header packet sub-structure is included.  The
825	   Retry Packet enforces a new value constraint on the Long Packet Type
826	   (T) field.

828	4.6.  Storing Data for Parsing

830	   The parsing process may require data from previously parsed
831	   structures.  This means that data needs to be stored persistently
832	   throughout the process.  This data needs to be identified.

834	   That the value of a particular field be stored upon parsing is
835	   indicated by the exact phrase "On receipt, the value of <field name>
836	   is stored as <stored name>." being present at the end of the
837	   description of a field (i.e., at the end of the <dd> element).

839	   An Initial Packet is formatted as follows:

841	      0                   1                   2                   3
842	      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
843	     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
844	     |                                                               :
845	     :                          Long Header                          :
846	     :                                                               |
847	     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

849	   where:

851	   Long Header (LH): 1 Long Header; LH.T == 0.  This field is a sub-
852	      structure, with a constraint, as previously defined.  On receipt,
853	      the value of LH.DCID is stored as Initial DCID.

855	   In this example, the value of the DCID field of the Long Header sub-
856	   structure is stored as Initial DCID.
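
   One plausible way for tooling to hold such values is a small
   context object that persists across the parsing of successive
   structures.  The sketch below is an assumption about how this might
   be implemented: the class and function names are hypothetical, and
   the Long Header is assumed to have been parsed already into a
   dictionary with "T" and "DCID" entries.

```python
from typing import Any, Dict


class ParsingContext:
    """Holds values stored during parsing, keyed by their stored name."""

    def __init__(self) -> None:
        self.stored: Dict[str, Any] = {}

    def store(self, stored_name: str, value: Any) -> None:
        self.stored[stored_name] = value


def accept_initial_packet(long_header: Dict[str, Any],
                          context: ParsingContext) -> None:
    """Apply the Initial Packet constraint, then store LH.DCID."""
    if long_header["T"] != 0:
        raise ValueError("constraint LH.T == 0 failed")
    # "On receipt, the value of LH.DCID is stored as Initial DCID."
    context.store("Initial DCID", long_header["DCID"])
```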

858	4.7.  Connecting Structures with Functions

860	   The parsing or serialisation of some binary formats cannot be fully
861	   described without the use of functions.  These functions take
862	   arguments (values from another structure), perform some computation,
863	   and generate a new structure.

865	   Given the goal of fully capturing the parsing or serialisation of
866	   binary protocols, it is necessary to include the signature of these
867	   helper functions.

869	   Function signatures are described in <artwork> elements.  They are
870	   constructed as the word "func", followed by a space, then the name of
871	   the function.  This is immediately followed by a set of brackets
872	   containing a comma separated list of the function's parameters,
873	   formatted as "<parameter name>: <parameter type>".  This is followed
874	   by "->" and the return type of the function, followed by a colon.

876	   The body of the function is not captured, owing to the complexity of
877	   both capturing and translating arbitrary code.  As a result, it can
878	   be described in whichever format is most suitable for the document
879	   and its readership.

881	   Those values that are stored persistently, as defined in Section 4.6,
882	   are accessible by functions.

884	   As an example, the "apply_protection" function is defined as:

886	   func apply_protection(to: Unprotected Packet)
887	                   -> Protected Packet:
888	      apply packet protection to payload
889	      apply header protection to first_byte and packet_number
890	      construct appropriate Protected Packet based on first_byte
891	      return Protected Packet

893	   In this example, 'Unprotected Packet' and 'Protected Packet' are
894	   existing types.

896	   To indicate that a PDU is created from another by way of a function,
897	   the sentence "A/An <PDU name A> is parsed from a <PDU name B> using
898	   the <function name> function" is used.  This indicates that a PDU A
899	   is generated by passing PDU B into the named function.  The function
900	   must take a single parameter, of the same type as PDU B, and return a
901	   PDU A.

903	   To indicate that a PDU can be serialised to another by way of a
904	   function, the sentence "A/An <PDU name A> is serialised to a <PDU
905	   name B> using the <function name> function" is used.  This indicates
906	   that a PDU B is generated by passing PDU A into the named function.
907	   The function must take a single parameter, of the same type as PDU A,
908	   and return a PDU B.
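
   The signature syntax described above is regular enough to be
   recognised with a single pattern.  The following Python sketch is
   one possible reading of it; the pattern, the function name, and the
   tolerance of line breaks around "->" (as in the apply_protection
   example) are assumptions rather than a normative grammar.

```python
import re
from typing import List, Tuple

# "func", a space, the function name, bracketed parameters, "->",
# the return type, and a terminating colon.
_SIGNATURE = re.compile(
    r"func (?P<name>\w+)\((?P<params>[^)]*)\)\s*->\s*(?P<ret>[^:]+):"
)


def parse_signature(text: str) -> Tuple[str, List[Tuple[str, str]], str]:
    """Return (function name, [(parameter name, parameter type)], return type)."""
    match = _SIGNATURE.search(text)
    if match is None:
        raise ValueError("no function signature found")
    params: List[Tuple[str, str]] = []
    for item in match.group("params").split(","):
        if not item.strip():
            continue  # empty parameter list
        pname, sep, ptype = item.partition(":")
        if not sep:
            raise ValueError("parameter must be '<name>: <type>'")
        params.append((pname.strip(), ptype.strip()))
    return match.group("name"), params, match.group("ret").strip()
```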

910	4.8.  Specifying Enumerated Types

912	   In addition to the use of sub-structures, it is desirable to be
913	   able to define a type that may take the value of one of a set of
914	   alternative structures.

916	   The alternative structures that comprise an enumerated type are
917	   identified using the exact phrase "The <enumerated type name> is one
918	   of: <list of structure names>" where the list of structure names is a
919	   comma separated list (with the last element, if there is more than
920	   one element, preceded by 'or'), each optionally preceded by "a" or
921	   "an".  The structure names must be defined within the document or a
922	   linked document.

924	   Where an enumerated type has only two variants, an alternative phrase
925	   can be used: "The <enumerated type name> is either a <variant 1 name>
926	   or <variant 2 name>".  The names of the variants must be defined
927	   within the document or a linked document.

929	   A PING Frame is formatted as follows:

931	     0                   1                   2                   3
932	     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
933	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
934	    |       1       |
935	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

937	   where:

939	   Frame Type (FT): 1 Variable-Length Integer Encoding; FT.T == 1.
940	      Frame type, set to 1 for PING frames.

942	   A HANDSHAKE_DONE Frame is formatted as follows:

944	    0                   1                   2                   3
945	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
946	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
947	   |       30      |
948	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

950	   where:

952	   Frame Type (FT): 1 Variable-Length Integer Encoding; FT.T == 30.
953	      Frame type, set to 30 for HANDSHAKE_DONE frames.

955	   A Frame is either a PING Frame or a HANDSHAKE_DONE Frame.
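
   Tooling might recognise these phrases along the following lines.
   This Python sketch is an assumption about one workable approach:
   the patterns and the function name are hypothetical, and the
   leading word is allowed to be "The", "A", or "An" so that the
   example sentence above is also accepted.

```python
import re
from typing import List, Tuple

_ONE_OF = re.compile(
    r"^(?:The|An|A) (?P<name>.+?) is one of: (?P<variants>.+?)\.?$")
_EITHER = re.compile(
    r"^(?:The|An|A) (?P<name>.+?) is either (?P<variants>.+?)\.?$")
_SEPARATOR = re.compile(r",\s*(?:or\s+)?|\s+or\s+")
_ARTICLE = re.compile(r"^(?:an|a)\s+")


def parse_enumerated_type(sentence: str) -> Tuple[str, List[str]]:
    """Return (enumerated type name, [variant structure names])."""
    text = " ".join(sentence.split())   # undo any line wrapping
    match = _ONE_OF.match(text) or _EITHER.match(text)
    if match is None:
        raise ValueError("not an enumerated type phrase")
    variants = [_ARTICLE.sub("", v.strip())
                for v in _SEPARATOR.split(match.group("variants"))]
    if match.re is _EITHER and len(variants) != 2:
        raise ValueError("'is either' requires exactly two variants")
    return match.group("name"), variants
```

   A complete implementation would also check that every variant name
   is defined in the document or in a linked document.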

957	4.9.  Specifying Protocol Data Units

959	   A document will set out different structures that are not, on their
960	   own, protocol data units.  To capture the parsing or serialisation of
961	   a protocol, it is necessary to be able to identify or construct those
962	   packets that are valid PDUs.  As a result, it is necessary for the
963	   document to identify those structures that are PDUs.

965	   The PDUs that comprise a protocol are identified using the exact
966	   phrase "This document describes the <protocol name> protocol.  The
967	   <protocol name> protocol uses <list of PDU names>" where the list of
968	   PDU names is a comma separated list (with the last element, if there
969	   is more than one element, preceded by 'and'), each optionally
970	   preceded by "a" or "an".  The PDU names must be structure names
971	   defined in the document or a linked document.  The PDU names are
972	   pluralised in the list.  A document must contain exactly one instance
973	   of this phrase.

975	   This document describes the Example protocol.  The Example protocol
976	   uses Long Headers, STUN Message Types, IPv4 Headers, and RTP Data
977	   Packets.

979	4.10.  Importing PDU Definitions from Other Documents

981	   Protocols are often specified across multiple documents, either
982	   because the specification of a protocol's data units has changed over
983	   time, or because of explicit extension points contained in the
984	   protocol's original specification.  To allow a document to make use
985	   of a previous PDU definition, it is possible to import PDU
986	   definitions (written in the format described in this document) from
987	   other documents.

989	   A PDU definition is imported using the exact phrase "A/An ________ is
990	   formatted as described in <document identifier>".  The document
991	   identifier must refer, unambiguously, to an existing document.  An
992	   Internet-Draft is identified by its name.  RFCs are identified by
993	   "RFC" followed by their number.

995	5.  Open Issues

997	   *  Need a simple syntax for defining a list of identical objects, and
998	      a way of referring to the size of the enclosing packet.  The
999	      format cannot currently represent RFC 6716 section 3.2.3, and
1000	      should be able to (the underlying type system can do so).

1002	   *  Need some discussion about the checks that the tooling might
1003	      perform, and the implications of those checks.  For example, the
1004	      tooling checks for consistency between the diagram and the
1005	      description list of fields, ensuring that fields match by name and
1006	      width. -01 of this draft had a field that mismatched because of
1007	      case: is this something that the tooling should identify?  More
1008	      broadly, what is the trade-off between the rigour that the tooling
1009	      can enforce, and the flexibility desired/needed by authors?

1011	   *  Need to describe the rules governing the import of PDU definitions
1012	      from other documents.

1014	6.  IANA Considerations

1016	   This document contains no actions for IANA.

1018	7.  Security Considerations

1020	   Poorly implemented parsers are a frequent source of security
1021	   vulnerabilities in protocol implementations.  Structuring the
1022	   description of a protocol data unit so that a parser can be
1023	   automatically derived from the specification can reduce the
1024	   likelihood of vulnerable implementations.

1026	   As described in Section 3, the expressiveness of a protocol
1027	   description language has implications for the safety, security, and
1028	   computability properties of the parser for the protocol description
1029	   language itself, and on the generated parser code for the protocols
1030	   described using it.  The language-theoretic security ([LANGSEC])
1031	   community explores the security implications of programming language
1032	   design; the principles developed in that community should guide the
1033	   development of protocol description languages.

1035	8.  Acknowledgements

1037	   The authors would like to thank Marc Petit-Huguenin for extensive
1038	   feedback on the draft, including work on formalising the constraint
1039	   syntax as given in Appendix A.1.

1041	   The authors would like to thank David Southgate for preparing a
1042	   prototype implementation of some of the ideas described here.

1044	   This work has received funding from the UK Engineering and Physical
1045	   Sciences Research Council under grant EP/R04144X/1.

1047	9.  Informative References

1049	   [RFC8357]  Shen, N. and E. Chen, "Generalized UDP Source Port
1050	              for DHCP Relay", RFC 8357, March 2018,
1051	              <https://www.rfc-editor.org/info/rfc8357>.

1053	   [QUIC-TRANSPORT]
1054	              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
1055	              and Secure Transport", Work in Progress, Internet-Draft,
1056	              draft-ietf-quic-transport-27, 21 February 2020,
1057	              <http://www.ietf.org/internet-drafts/draft-ietf-quic-
1058	              transport-27.txt>.

1060	   [RFC6958]  Clark, A., Zhang, S., Zhao, J., and Q. Wu, "RTP Control
1061	              Protocol (RTCP) Extended Report (XR) Block for Burst/Gap
1062	              Loss Metric Reporting", RFC 6958, May 2013,
1063	              <https://www.rfc-editor.org/info/rfc6958>.

1065	   [RFC7950]  Bjorklund, M., "The YANG 1.1 Data Modeling Language",
1066	              RFC 7950, August 2016,
1067	              <https://www.rfc-editor.org/info/rfc7950>.

1069	   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
1070	              Version 1.3", RFC 8446, August 2018,
1071	              <https://www.rfc-editor.org/info/rfc8446>.

1073	   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
1074	              Specifications: ABNF", RFC 5234, January 2008,
1075	              <https://www.rfc-editor.org/info/rfc5234>.

1077	   [RFC7405]  Kyzivat, P., "Case-Sensitive String Support in ABNF",
1078	              RFC 7405, December 2014,
1079	              <https://www.rfc-editor.org/info/rfc7405>.

1081	   [ASN1]     ITU-T, "ITU-T Recommendation X.680, X.681, X.682, and
1082	              X.683", ITU-T Recommendation X.680, X.681, X.682, and
1083	              X.683.

1085	   [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
1086	              Representation (CBOR)", RFC 7049, October 2013,
1087	              <https://www.rfc-editor.org/info/rfc7049>.

1089	   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
1090	              Jacobson, "RTP: A Transport Protocol for Real-Time
1091	              Applications", RFC 3550, July 2003,
1092	              <https://www.rfc-editor.org/info/rfc3550>.

1094	   [draft-ietf-tram-stunbis-21]
1095	              Petit-Huguenin, M., Salgueiro, G., Rosenberg, J., Wing,
1096	              D., Mahy, R., and P. Matthews, "Session Traversal
1097	              Utilities for NAT (STUN)", Work in Progress, Internet-
1098	              Draft, draft-ietf-tram-stunbis-21, 21 March 2019,
1099	              <http://www.ietf.org/internet-drafts/draft-ietf-tram-
1100	              stunbis-21.txt>.

1102	   [RFC791]   Postel, J., "Internet Protocol", RFC 791, September 1981,
1103	              <https://www.rfc-editor.org/info/rfc791>.

1105	   [RFC793]   Postel, J., "Transmission Control Protocol", RFC 793,
1106	              September 1981, <https://www.rfc-editor.org/info/rfc793>.

1108	   [LANGSEC]  LANGSEC, "LANGSEC: Language-theoretic Security",
1109	              <http://langsec.org>.

1111	   [SASSAMAN] Sassaman, L., Patterson, M. L., Bratus, S., and A.
1112	              Shubina, "The Halting Problems of Network Stack
1113	              Insecurity", ;login: -- December 2011, Volume 36, Number
1114	              6, <https://www.usenix.org/publications/login/december-
1115	              2011-volume-36-number-6/halting-problems-network-stack-
1116	              insecurity>.

1118	   [draft-mcquistin-augmented-udp-example]
1119	              McQuistin, S., Band, V., Jacob, D., and C. S. Perkins,
1120	              "Describing UDP with Augmented Packet Header Diagrams",
1121	              Work in Progress, Internet-Draft, draft-mcquistin-
1122	              augmented-udp-example-00, 2 November 2020,
1123	              <http://www.ietf.org/internet-drafts/draft-mcquistin-
1124	              augmented-udp-00.txt>.

1126	   [draft-mcquistin-augmented-tcp-example]
1127	              McQuistin, S., Band, V., Jacob, D., and C. S. Perkins,
1128	              "Describing TCP with Augmented Packet Header Diagrams",
1129	              Work in Progress, Internet-Draft, draft-mcquistin-
1130	              augmented-tcp-example-00, 2 November 2020,
1131	              <http://www.ietf.org/internet-drafts/draft-mcquistin-
1132	              augmented-tcp-example-00.txt>.

1134	   [draft-mcquistin-quic-augmented-diagrams]
1135	              McQuistin, S., Band, V., Jacob, D., and C. S. Perkins,
1136	              "Describing QUIC's Protocol Data Units with Augmented
1137	              Packet Header Diagrams", Work in Progress, Internet-Draft,
1138	              draft-mcquistin-quic-augmented-diagrams-03, 2 November
1139	              2020, <http://www.ietf.org/internet-drafts/draft-
1140	              mcquistin-quic-augmented-diagrams-03.txt>.

1142	Appendix A.  ABNF specification

1144	A.1.  Constraint Expressions

1146	   constant = %x31-39 *(%x30-39)  ; natural numbers without leading 0s
1147	   short-name = ALPHA *(ALPHA / DIGIT / "-" / "_")
1148	   name = short-name *(" " short-name)
1149	   sp = [" "] ; optional space in expression
1150	   bool-expr = "(" sp bool-expr sp ")" /
1151	              "!" sp bool-expr /
1152	              bool-expr sp bool-op sp bool-expr /
1153	              bool-expr sp "?" sp expr sp ":" sp expr /
1154	              expr sp cmp-op sp expr
1155	   bool-op = "&&" / "||"
1156	   cmp-op = "==" / "!=" / "<" / "<=" / ">" / ">="
1157	   expr = "(" sp expr sp ")" /
1158	         expr sp op sp expr /
1159	         bool-expr "?" expr ":" expr /
1160	         name / short-name "." short-name /
1161	         constant
1162	   op = "+" / "-" / "*" / "/" / "%" / "^"
1163	   length = expr sp unit / "[" sp name sp "]"
1164	   unit = %s"bit" / %s"bits" / %s"byte" / %s"bytes" / name
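
   The lexical rules at the top of this grammar (constant, short-name,
   and name) translate directly into regular expressions.  The Python
   sketch below covers only those three rules; the helper names are
   assumptions, and the recursive bool-expr and expr rules are not
   handled.

```python
import re

# constant = %x31-39 *(%x30-39)
_CONSTANT = re.compile(r"[1-9][0-9]*")
# short-name = ALPHA *(ALPHA / DIGIT / "-" / "_")
_SHORT_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")
# name = short-name *(" " short-name)
_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_-]*(?: [A-Za-z][A-Za-z0-9_-]*)*")


def is_constant(text: str) -> bool:
    return _CONSTANT.fullmatch(text) is not None


def is_short_name(text: str) -> bool:
    return _SHORT_NAME.fullmatch(text) is not None


def is_name(text: str) -> bool:
    return _NAME.fullmatch(text) is not None
```

   For example, "IHL" is a short-name and "Source Identifier" is a
   name, whereas "007" is not a constant because of its leading
   zeros.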

1166	A.2.  Augmented packet diagrams

1168	   Future revisions of this draft will include an ABNF specification for
1169	   the augmented packet diagram format described in Section 4.  Such a
1170	   specification is omitted from this draft given that the format is
1171	   likely to change as its syntax is developed.  Given the visual nature
1172	   of the format, it is more appropriate for discussion to focus on the
1173	   examples given in Section 4.

1175	Appendix B.  Tooling & source code

1177	   The source for this draft is available from https://github.com/
1178	   glasgow-ipl/draft-mcquistin-augmented-ascii-diagrams.

1180	   The source code for tooling that can be used to parse this document
1181	   is available from https://github.com/glasgow-ipl/ips-protodesc-code.
1182	   This tooling supports the automatic generation of Rust parser code
1183	   from protocol descriptions written in the Augmented Packet Header
1184	   Diagram format.  It also provides test harnesses that demonstrate
1185	   that example descriptions of UDP
1186	   [draft-mcquistin-augmented-udp-example] and TCP
1187	   [draft-mcquistin-augmented-tcp-example] function as expected.

1189	Authors' Addresses

1191	   Stephen McQuistin
1192	   University of Glasgow
1193	   School of Computing Science
1194	   Glasgow
1195	   G12 8QQ
1196	   United Kingdom

1198	   Email: sm@smcquistin.uk

1200	   Vivian Band
1201	   University of Glasgow
1202	   School of Computing Science
1203	   Glasgow
1204	   G12 8QQ
1205	   United Kingdom

1207	   Email: vivianband0@gmail.com

1209	   Dejice Jacob
1210	   University of Glasgow
1211	   School of Computing Science
1212	   Glasgow
1213	   G12 8QQ
1214	   United Kingdom

1216	   Email: d.jacob.1@research.gla.ac.uk

1218	   Colin Perkins
1219	   University of Glasgow
1220	   School of Computing Science
1221	   Glasgow
1222	   G12 8QQ
1223	   United Kingdom

1225	   Email: csp@csperkins.org