idnits 2.17.1 draft-ietf-httpbis-header-structure-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([2], [3], [4], [5], [1]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (July 2, 2018) is 2123 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '<CODE BEGINS>' and '<CODE ENDS>' lines.
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '1' on line 1098 -- Looks like a reference, but probably isn't: '2' on line 1100 -- Looks like a reference, but probably isn't: '3' on line 1102 -- Looks like a reference, but probably isn't: '4' on line 1104 -- Looks like a reference, but probably isn't: '5' on line 1106 == Missing Reference: 'RFCxxxx' is mentioned on line 217, but not defined == Missing Reference: 'RFC3986' is mentioned on line 235, but not defined -- Looks like a reference, but probably isn't: '6' on line 1071 ** Obsolete normative reference: RFC 7230 (Obsoleted by RFC 9110, RFC 9112) -- Obsolete informational reference (is this intentional?): RFC 7231 (Obsoleted by RFC 9110) -- Obsolete informational reference (is this intentional?): RFC 7540 (Obsoleted by RFC 9113) Summary: 2 errors (**), 0 flaws (~~), 3 warnings (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 HTTP M. Nottingham 3 Internet-Draft Fastly 4 Intended status: Standards Track P-H. Kamp 5 Expires: January 3, 2019 The Varnish Cache Project 6 July 2, 2018 8 Structured Headers for HTTP 9 draft-ietf-httpbis-header-structure-07 11 Abstract 13 This document describes a set of data types and algorithms associated 14 with them that are intended to make it easier and safer to define and 15 handle HTTP header fields. It is intended for use by new 16 specifications of HTTP header fields as well as revisions of existing 17 header field specifications when doing so does not cause 18 interoperability issues. 
20 Note to Readers 22 _RFC EDITOR: please remove this section before publication_ 24 Discussion of this draft takes place on the HTTP working group 25 mailing list (ietf-http-wg@w3.org), which is archived at 26 https://lists.w3.org/Archives/Public/ietf-http-wg/ [1]. 28 Working Group information can be found at https://httpwg.github.io/ 29 [2]; source code and issues list for this draft can be found at 30 https://github.com/httpwg/http-extensions/labels/header-structure 31 [3]. 33 Tests for implementations are collected at https://github.com/httpwg/ 34 structured-header-tests [4]. 36 Implementations are tracked at https://github.com/httpwg/wiki/wiki/ 37 Structured-Headers [5]. 39 Status of This Memo 41 This Internet-Draft is submitted in full conformance with the 42 provisions of BCP 78 and BCP 79. 44 Internet-Drafts are working documents of the Internet Engineering 45 Task Force (IETF). Note that other groups may also distribute 46 working documents as Internet-Drafts. The list of current Internet- 47 Drafts is at https://datatracker.ietf.org/drafts/current/. 49 Internet-Drafts are draft documents valid for a maximum of six months 50 and may be updated, replaced, or obsoleted by other documents at any 51 time. It is inappropriate to use Internet-Drafts as reference 52 material or to cite them other than as "work in progress." 54 This Internet-Draft will expire on January 3, 2019. 56 Copyright Notice 58 Copyright (c) 2018 IETF Trust and the persons identified as the 59 document authors. All rights reserved. 61 This document is subject to BCP 78 and the IETF Trust's Legal 62 Provisions Relating to IETF Documents 63 (https://trustee.ietf.org/license-info) in effect on the date of 64 publication of this document. Please review these documents 65 carefully, as they describe your rights and restrictions with respect 66 to this document. 
Code Components extracted from this document must 67 include Simplified BSD License text as described in Section 4.e of 68 the Trust Legal Provisions and are provided without warranty as 69 described in the Simplified BSD License. 71 Table of Contents 73 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 74 1.1. Notational Conventions . . . . . . . . . . . . . . . . . 4 75 2. Defining New Structured Headers . . . . . . . . . . . . . . . 4 76 3. Structured Header Data Types . . . . . . . . . . . . . . . . 6 77 3.1. Dictionaries . . . . . . . . . . . . . . . . . . . . . . 6 78 3.2. Lists . . . . . . . . . . . . . . . . . . . . . . . . . . 6 79 3.3. Parameterised Lists . . . . . . . . . . . . . . . . . . . 7 80 3.4. Items . . . . . . . . . . . . . . . . . . . . . . . . . . 7 81 3.5. Integers . . . . . . . . . . . . . . . . . . . . . . . . 8 82 3.6. Floats . . . . . . . . . . . . . . . . . . . . . . . . . 8 83 3.7. Strings . . . . . . . . . . . . . . . . . . . . . . . . . 8 84 3.8. Identifiers . . . . . . . . . . . . . . . . . . . . . . . 9 85 3.9. Binary Content . . . . . . . . . . . . . . . . . . . . . 9 86 4. Structured Headers in HTTP/1 . . . . . . . . . . . . . . . . 10 87 4.1. Serialising Structured Headers into HTTP/1 . . . . . . . 10 88 4.2. Parsing HTTP/1 Header Fields into Structured Headers . . 14 89 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22 90 6. Security Considerations . . . . . . . . . . . . . . . . . . . 22 91 7. References . . . . . . . . . . . . . . . . . . . . . . . . . 22 92 7.1. Normative References . . . . . . . . . . . . . . . . . . 22 93 7.2. Informative References . . . . . . . . . . . . . . . . . 23 94 7.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 24 95 Appendix A. Frequently Asked Questions . . . . . . . . . . . . . 24 96 A.1. Why not JSON? . . . . . . . . . . . . . . . . . . . . . . 24 97 A.2. Structured Headers don't "fit" my data. . . . . . . . . . 25 98 Appendix B. Changes . . . . . 
. . . . . . . . . . . . . . . . . 25 99 B.1. Since draft-ietf-httpbis-header-structure-06 . . . . . . 25 100 B.2. Since draft-ietf-httpbis-header-structure-05 . . . . . . 25 101 B.3. Since draft-ietf-httpbis-header-structure-04 . . . . . . 26 102 B.4. Since draft-ietf-httpbis-header-structure-03 . . . . . . 26 103 B.5. Since draft-ietf-httpbis-header-structure-02 . . . . . . 26 104 B.6. Since draft-ietf-httpbis-header-structure-01 . . . . . . 26 105 B.7. Since draft-ietf-httpbis-header-structure-00 . . . . . . 26 106 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 27 108 1. Introduction 110 Specifying the syntax of new HTTP header fields is an onerous task; 111 even with the guidance in [RFC7231], Section 8.3.1, there are many 112 decisions - and pitfalls - for a prospective HTTP header field 113 author. 115 Once a header field is defined, bespoke parsers and serialisers often 116 need to be written, because each header has slightly different 117 handling of what looks like common syntax. 119 This document introduces a set of common data structures for use in 120 HTTP header field values to address these problems. In particular, 121 it defines a generic, abstract model for header field values, along 122 with a concrete serialisation for expressing that model in HTTP/1 123 [RFC7230] header fields. 125 HTTP headers that are defined as "Structured Headers" use the types 126 defined in this specification to define their syntax and basic 127 handling rules, thereby simplifying both their definition by 128 specification writers and handling by implementations. 130 Additionally, future versions of HTTP can define alternative 131 serialisations of the abstract model of these structures, allowing 132 headers that use it to be transmitted more efficiently without being 133 redefined. 
135 Note that it is not a goal of this document to redefine the syntax of 136 existing HTTP headers; the mechanisms described herein are only 137 intended to be used with headers that explicitly opt into them. 139 To specify a header field that is a Structured Header, see Section 2. 141 Section 3 defines a number of abstract data types that can be used in 142 Structured Headers. 144 Those abstract types can be serialised into and parsed from textual 145 headers - such as those used in HTTP/1 - using the algorithms 146 described in Section 4. 148 1.1. Notational Conventions 150 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 151 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 152 "OPTIONAL" in this document are to be interpreted as described in BCP 153 14 [RFC2119] [RFC8174] when, and only when, they appear in all 154 capitals, as shown here. 156 This document uses the Augmented Backus-Naur Form (ABNF) notation of 157 [RFC5234], including the VCHAR, DIGIT, ALPHA and DQUOTE rules from 158 that document. It also includes the OWS rule from [RFC7230]. 160 This document uses algorithms to specify parsing and serialisation 161 behaviours, and ABNF to illustrate expected syntax. 163 For parsing, implementations MUST follow the algorithms, but MAY vary 164 in implementation so long as the behaviours are indistinguishable from 165 specified behaviour. If there is disagreement between the parsing 166 algorithms and ABNF, the specified algorithms take precedence. 168 For serialisation, the ABNF illustrates the range of acceptable wire 169 representations with as much fidelity as possible, and the algorithms 170 define the recommended way to produce them. Implementations MAY vary 171 from the specified behaviour so long as the output still matches the 172 ABNF. 174 2. Defining New Structured Headers 176 To define an HTTP header as a structured header, its specification 177 needs to: 179 o Reference this specification.
Recipients and generators of the 180 header need to know that the requirements of this document are in 181 effect. 183 o Specify the header field's allowed syntax for values, in terms of 184 the types described in Section 3, along with their associated 185 semantics. Syntax definitions are encouraged to use the ABNF 186 rules beginning with "sh-" defined in this specification. 188 o Specify any additional constraints upon the syntax of the 189 structures used, as well as the consequences when those 190 constraints are violated. When Structured Headers parsing fails, 191 the header is discarded (see Section 4.2); in most situations, 192 header-specific constraints should do likewise. 194 Note that a header field definition cannot relax the requirements of 195 a structure or its processing, because doing so would preclude 196 handling by generic software; it can only add further 197 constraints. 199 For example: 201 # Foo-Example Header 203 The Foo-Example HTTP header field conveys information about how 204 much Foo the message has. 206 Foo-Example is a Structured Header [RFCxxxx]. Its value MUST be a 207 dictionary ([RFCxxxx], Section Y.Y). Its ABNF is: 209 Foo-Example = sh-dictionary 211 The dictionary MUST contain: 213 * Exactly one member whose key is "foo", and whose value is an 214 integer ([RFCxxxx], Section Y.Y), indicating the number of foos 215 in the message. 216 * Exactly one member whose key is "barUrls", and whose value is a 217 string ([RFCxxxx], Section Y.Y), conveying the Bar URLs for the 218 message. See below for processing requirements. 220 If the parsed header field does not contain both, it MUST be 221 ignored. 223 "foo" MUST be between 0 and 10, inclusive; other values MUST cause 224 the header to be ignored. 
226 "barUrls" contains a space-separated list of URI-references 227 ([RFC3986], Section 4.1): 229 barURLs = URI-reference *( 1*SP URI-reference ) 231 If a member of barURLs is not a valid URI-reference, it MUST cause 232 that value to be ignored. 234 If a member of barURLs is a relative reference ([RFC3986], 235 Section 4.2), it MUST be resolved ([RFC3986], Section 5) before 236 being used. 238 This specification defines minimums for the length or number of 239 various structures supported by Structured Headers implementations. 240 It does not specify maximum sizes in most cases, but header authors 241 should be aware that HTTP implementations do impose various limits on 242 the size of individual header fields, the total number of fields, 243 and/or the size of the entire header block. 245 3. Structured Header Data Types 247 This section defines the abstract value types that can be composed 248 into Structured Headers. The ABNF provided represents the on-wire 249 format in HTTP/1. 251 3.1. Dictionaries 253 Dictionaries are unordered maps of key-value pairs, where the keys 254 are identifiers (Section 3.8) and the values are items (Section 3.4). 255 There can be one or more members, and keys are required to be unique. 257 The ABNF for dictionaries is: 259 sh-dictionary = dict-member *( OWS "," OWS dict-member ) 260 dict-member = member-name "=" member-value 261 member-name = identifier 262 member-value = sh-item 264 In HTTP/1, keys and values are separated by "=" (without whitespace), 265 and key/value pairs are separated by a comma with optional 266 whitespace. For example: 268 Example-DictHeader: en="Applepie", da=*w4ZibGV0w6ZydGUK* 270 Typically, a header field specification will define the semantics of 271 individual keys, as well as whether their presence is required or 272 optional. Recipients MUST ignore keys that are undefined or unknown, 273 unless the header field's specification specifically disallows them. 
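The key-handling rule above, combined with the Foo-Example constraints from Section 2, might be applied to an already parsed dictionary roughly as follows. This is a minimal sketch, not part of the specification: the function name validate_foo_example is hypothetical, and it assumes the parsed dictionary is held as a plain Python dict whose integer and string items map to int and str.

```python
from typing import Any, Dict, Optional

# Keys defined by the hypothetical Foo-Example header from Section 2.
DEFINED_KEYS = ("foo", "barUrls")


def validate_foo_example(parsed: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the defined members, or None if the header must be ignored."""
    # Undefined or unknown keys are ignored rather than treated as errors.
    known = {key: value for key, value in parsed.items() if key in DEFINED_KEYS}

    # Both members are required; otherwise the header is ignored.
    if set(known) != set(DEFINED_KEYS):
        return None

    foo = known["foo"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(foo, bool) or not isinstance(foo, int):
        return None
    if not 0 <= foo <= 10:
        return None

    if not isinstance(known["barUrls"], str):
        return None
    return known


if __name__ == "__main__":
    print(validate_foo_example({"foo": 2, "barUrls": "/a /b", "extra": 1}))
    print(validate_foo_example({"foo": 11, "barUrls": "/a"}))
```

Validation of the individual URI-references inside "barUrls" is deliberately left out of this sketch.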
275 Parsers MUST support dictionaries containing at least 1024 key/value 276 pairs. 278 3.2. Lists 280 Lists are arrays of items (Section 3.4) with one or more members. 282 The ABNF for lists is: 284 sh-list = list-member *( OWS "," OWS list-member ) 285 list-member = sh-item 286 In HTTP/1, each member is separated by a comma and optional 287 whitespace. For example, a header field whose value is defined as a 288 list of strings could look like: 290 Example-StrListHeader: "foo", "bar", "It was the best of times." 292 Header specifications can constrain the types of individual values if 293 necessary. 295 Parsers MUST support lists containing at least 1024 members. 297 3.3. Parameterised Lists 299 Parameterised Lists are arrays of parameterised identifiers. 301 A parameterised identifier is an identifier (Section 3.8) with an 302 optional set of parameters, each parameter having an identifier and an 303 optional value that is an item (Section 3.4). Ordering between 304 parameters is not significant, and duplicate parameters MUST cause 305 parsing to fail. 307 The ABNF for parameterised lists is: 309 sh-param-list = param-id *( OWS "," OWS param-id ) 310 param-id = identifier *parameter 311 parameter = OWS ";" OWS param-name [ "=" param-value ] 312 param-name = identifier 313 param-value = sh-item 315 In HTTP/1, each param-id is separated by a comma and optional 316 whitespace (as in Lists), and the parameters are separated by 317 semicolons. For example: 319 Example-ParamListHeader: abc_123;a=1;b=2; cdef_456, ghi;q="9";r=w 321 Parsers MUST support parameterised lists containing at least 1024 322 members, and support members with at least 256 parameters. 324 3.4. Items 326 An item can be an integer (Section 3.5), float (Section 3.6), 327 string (Section 3.7), or binary content (Section 3.9). 329 The ABNF for items is: 331 sh-item = sh-integer / sh-float / sh-string / sh-binary 333 3.5. 
Integers 335 Integers have a range of -9,223,372,036,854,775,808 to 336 9,223,372,036,854,775,807 inclusive (i.e., a 64-bit signed integer). 338 The ABNF for integers is: 340 sh-integer = ["-"] 1*19DIGIT 342 For example: 344 Example-IntegerHeader: 42 346 3.6. Floats 348 Floats are integers with a fractional part, that can be stored as 349 IEEE 754 double precision numbers (binary64) ([IEEE754]). 351 The ABNF for floats is: 353 sh-float = ["-"] ( 354 DIGIT "." 1*14DIGIT / 355 2DIGIT "." 1*13DIGIT / 356 3DIGIT "." 1*12DIGIT / 357 4DIGIT "." 1*11DIGIT / 358 5DIGIT "." 1*10DIGIT / 359 6DIGIT "." 1*9DIGIT / 360 7DIGIT "." 1*8DIGIT / 361 8DIGIT "." 1*7DIGIT / 362 9DIGIT "." 1*6DIGIT / 363 10DIGIT "." 1*5DIGIT / 364 11DIGIT "." 1*4DIGIT / 365 12DIGIT "." 1*3DIGIT / 366 13DIGIT "." 1*2DIGIT / 367 14DIGIT "." 1DIGIT ) 369 For example, a header whose value is defined as a float could look 370 like: 372 Example-FloatHeader: 4.5 374 3.7. Strings 376 Strings are zero or more printable ASCII [RFC0020] characters (i.e., 377 the range 0x20 to 0x7E). Note that this excludes tabs, newlines, 378 carriage returns, etc. 380 The ABNF for strings is: 382 sh-string = DQUOTE *(chr) DQUOTE 383 chr = unescaped / escaped 384 unescaped = %x20-21 / %x23-5B / %x5D-7E 385 escaped = "\" ( DQUOTE / "\" ) 387 In HTTP/1 headers, strings are delimited with double quotes, using a 388 backslash ("\") to escape double quotes and backslashes. For 389 example: 391 Example-StringHeader: "hello world" 393 Note that strings only use DQUOTE as a delimiter; single quotes do 394 not delimit strings. Furthermore, only DQUOTE and "\" can be 395 escaped; other sequences MUST cause parsing to fail. 397 Unicode is not directly supported in this document, because it causes 398 a number of interoperability issues, and - with few exceptions - 399 header values do not require it. 
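The quoting and escaping rules for strings can be sketched as a pair of helper functions. This is an illustrative sketch only: the names serialise_string and parse_string are hypothetical, and the character range check follows the 0x20 to 0x7E range stated at the start of this section.

```python
from typing import Tuple


def _is_printable_ascii(ch: str) -> bool:
    """True for characters in the range 0x20 to 0x7E."""
    return 0x20 <= ord(ch) <= 0x7E


def serialise_string(value: str) -> str:
    """Wrap value in DQUOTEs, escaping backslashes and double quotes."""
    if not all(_is_printable_ascii(ch) for ch in value):
        raise ValueError("string contains non-printable or non-ASCII characters")
    # Escape backslashes first so the escapes added for quotes are not doubled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def parse_string(text: str) -> Tuple[str, str]:
    """Parse a quoted string; return (value, remaining input)."""
    if not text.startswith('"'):
        raise ValueError("string must start with DQUOTE")
    output = []
    pos = 1
    while pos < len(text):
        ch = text[pos]
        pos += 1
        if ch == "\\":
            # Only DQUOTE and backslash may be escaped.
            if pos >= len(text) or text[pos] not in ('"', "\\"):
                raise ValueError("invalid escape sequence")
            output.append(text[pos])
            pos += 1
        elif ch == '"':
            return "".join(output), text[pos:]
        elif not _is_printable_ascii(ch):
            raise ValueError("invalid character in string")
        else:
            output.append(ch)
    raise ValueError("unterminated string")


if __name__ == "__main__":
    wire = serialise_string('say "hi"')
    print(wire)
    print(parse_string(wire + ", 42"))
```

Returning the unparsed remainder from parse_string is a design choice for this sketch; it lets a caller continue with the next list member or dictionary value.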
401 When it is necessary for a field value to convey non-ASCII string 402 content, binary content (Section 3.9) SHOULD be specified, along with 403 a character encoding (preferably UTF-8). 405 Parsers MUST support strings with at least 1024 characters. 407 3.8. Identifiers 409 Identifiers are short textual identifiers; their abstract model is 410 identical to their expression in the textual HTTP serialisation. 411 Parsers MUST support identifiers with at least 64 characters. 413 The ABNF for identifiers is: 415 identifier = lcalpha *( lcalpha / DIGIT / "_" / "-"/ "*" / "/" ) 416 lcalpha = %x61-7A ; a-z 418 Note that identifiers can only contain lowercase letters. 420 3.9. Binary Content 422 Arbitrary binary content can be conveyed in Structured Headers. 424 The ABNF for binary content is: 426 sh-binary = "*" *(base64) "*" 427 base64 = ALPHA / DIGIT / "+" / "/" / "=" 428 In HTTP/1 headers, binary content is delimited with asterisks and 429 encoded using base64 ([RFC4648], Section 4). For example: 431 Example-BinaryHdr: *cHJldGVuZCB0aGlzIGlzIGJpbmFyeSBjb250ZW50Lg==* 433 Parsers MUST support binary content with at least 16384 octets after 434 decoding. 436 4. Structured Headers in HTTP/1 438 This section defines how to serialise and parse Structured Headers in 439 HTTP/1 textual header fields, and protocols compatible with them 440 (e.g., in HTTP/2 [RFC7540] before HPACK [RFC7541] is applied). 442 4.1. Serialising Structured Headers into HTTP/1 444 Given a structure defined in this specification: 446 1. If the structure is a dictionary, return the result of 447 Serialising a Dictionary (Section 4.1.1). 449 2. If the structure is a list, return the result of Serialising a 450 List (Section 4.1.2). 452 3. If the structure is a parameterised list, return the result of 453 Serialising a Parameterised List (Section 4.1.3). 455 4. If the structure is an item, return the result of Serialising an 456 Item (Section 4.1.4). 458 5. Otherwise, fail serialisation. 460 4.1.1. 
Serialising a Dictionary 462 Given a dictionary as input: 464 1. Let output be an empty string. 466 2. For each member mem of input: 468 1. Let name be the result of applying Serialising an Identifier 469 (Section 4.1.8) to mem's member-name. 471 2. Append name to output. 473 3. Append "=" to output. 475 4. Let value be the result of applying Serialising an Item 476 (Section 4.1.4) to mem's member-value. 478 5. Append value to output. 6. If more members remain in input: 1. Append a COMMA to output. 2. Append a single WS to output. 480 3. Return output. 482 4.1.2. Serialising a List 484 Given a list as input: 486 1. Let output be an empty string. 488 2. For each member mem of input: 490 1. Let value be the result of applying Serialising an Item 491 (Section 4.1.4) to mem. 493 2. Append value to output. 495 3. If more members remain in input: 497 1. Append a COMMA to output. 499 2. Append a single WS to output. 501 3. Return output. 503 4.1.3. Serialising a Parameterised List 505 Given a parameterised list as input: 507 1. Let output be an empty string. 509 2. For each member mem of input: 511 1. Let id be the result of applying Serialising an Identifier 512 (Section 4.1.8) to mem's identifier. 514 2. Append id to output. 516 3. For each parameter in mem's parameters: 1. Append ";" to output. 518 2. Let name be the result of applying Serialising an 519 Identifier (Section 4.1.8) to parameter's param-name. 521 3. Append name to output. 523 4. If parameter has a param-value: 525 1. Let value be the result of applying Serialising an 526 Item (Section 4.1.4) to parameter's param-value. 528 2. Append "=" to output. 530 3. Append value to output. 4. If more members remain in input: 1. Append a COMMA to output. 2. Append a single WS to output. 532 3. Return output. 534 4.1.4. Serialising an Item 536 Given an item as input: 538 1. If input is a type other than an integer, float, string or binary 539 content, fail serialisation. 541 2. Let output be an empty string. 543 3. If input is an integer, let value be the result of applying 544 Serialising an Integer (Section 4.1.5) to input. 546 4. If input is a float, let value be the result of applying 547 Serialising a Float (Section 4.1.6) to input. 549 5. 
If input is a string, let value be the result of applying 550 Serialising a String (Section 4.1.7) to input. 552 6. If input is binary content, let value be the result of applying 553 Serialising Binary Content (Section 4.1.9) to input. 555 7. Append value to output, and return output. 557 4.1.5. Serialising an Integer 559 Given an integer as input: 561 1. If input is not an integer in the range of 562 -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 563 inclusive, fail serialisation. 565 2. Let output be an empty string. 567 3. If input is less than (but not equal to) 0, append "-" to output. 569 4. Append input's numeric value represented in base 10 using only 570 decimal digits to output. 572 5. Return output. 574 4.1.6. Serialising a Float 576 Given a float as input: 578 1. If input is not an IEEE 754 double precision number, fail 579 serialisation. 581 2. Let output be an empty string. 583 3. If input is less than (but not equal to) 0, append "-" to output. 585 4. Append input's integer component represented in base 10 using 586 only decimal digits to output; if it is zero, append "0". 588 5. Append "." to output. 590 6. Append input's decimal component represented in base 10 using 591 only decimal digits to output; if it is zero, append "0". 593 7. Return output. 595 4.1.7. Serialising a String 597 Given a string as input: 599 1. If input is not a sequence of characters, or contains characters 600 outside the range allowed by VCHAR or SP, fail serialisation. 602 2. Let output be an empty string. 604 3. Append DQUOTE to output. 606 4. For each character char in input: 608 1. If char is "\" or DQUOTE: 610 1. Append "\" to output. 612 2. Append char to output, using ASCII encoding [RFC0020]. 614 5. Append DQUOTE to output. 616 6. Return output. 618 4.1.8. Serialising an Identifier 620 Given an identifier as input: 622 1. If input is not a sequence of characters, or contains characters 623 not allowed in Section 3.8, fail serialisation. 625 2. Let output be an empty string. 627 3. 
Append input to output, using ASCII encoding [RFC0020]. 629 4. Return output. 631 4.1.9. Serialising Binary Content 633 Given binary content as input: 635 1. If input is not a sequence of bytes, fail serialisation. 637 2. Let output be an empty string. 639 3. Append "*" to output. 641 4. Append the result of base64-encoding input as per [RFC4648], 642 Section 4, taking account of the requirements below. 644 5. Append "*" to output. 646 6. Return output. 648 The encoded data is required to be padded with "=", as per [RFC4648], 649 Section 3.2. 651 Likewise, encoded data SHOULD have pad bits set to zero, as per 652 [RFC4648], Section 3.5, unless it is not possible to do so due to 653 implementation constraints. 655 4.2. Parsing HTTP/1 Header Fields into Structured Headers 657 When a receiving implementation parses textual HTTP header fields 658 (e.g., in HTTP/1 or HTTP/2) that are known to be Structured Headers, 659 it is important that care be taken, as there are a number of edge 660 cases that can cause interoperability or even security problems. 661 This section specifies the algorithm for doing so. 663 Given an ASCII string input_string that represents the chosen 664 header's field-value, and header_type, one of "dictionary", "list", 665 "param-list", or "item", return the parsed header value. 667 1. Discard any leading OWS from input_string. 669 2. If header_type is "dictionary", let output be the result of 670 Parsing a Dictionary from Text (Section 4.2.1). 672 3. If header_type is "list", let output be the result of Parsing a 673 List from Text (Section 4.2.2). 675 4. If header_type is "param-list", let output be the result of 676 Parsing a Parameterised List from Text (Section 4.2.3). 678 5. Otherwise, let output be the result of Parsing an Item from Text 679 (Section 4.2.5). 681 6. Discard any leading OWS from input_string. 683 7. If input_string is not empty, fail parsing. 685 8. Otherwise, return output. 
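The eight steps above can be sketched as a small dispatcher. This is a sketch under stated assumptions rather than a reference implementation: parse_structured_header and parse_integer_item are hypothetical names, each sub-parser is assumed to return a (value, remaining input) pair instead of modifying input_string in place, and the integer-only item parser is a stand-in for the full algorithm in Section 4.2.5.

```python
from typing import Any, Callable, Dict, Tuple

OWS = " \t"  # optional whitespace: space and horizontal tab
Parser = Callable[[str], Tuple[Any, str]]


def parse_structured_header(
    input_string: str, header_type: str, parsers: Dict[str, Parser]
) -> Any:
    """Top-level parse: strip OWS, dispatch on type, reject trailing data."""
    rest = input_string.lstrip(OWS)                      # step 1
    if header_type in ("dictionary", "list", "param-list"):
        parser = parsers[header_type]                    # steps 2-4
    else:
        parser = parsers["item"]                         # step 5
    output, rest = parser(rest)
    rest = rest.lstrip(OWS)                              # step 6
    if rest:                                             # step 7
        raise ValueError("unexpected trailing characters")
    return output                                        # step 8


def parse_integer_item(text: str) -> Tuple[int, str]:
    """Stand-in item parser (assumption): handles integers only."""
    start = 1 if text.startswith("-") else 0
    end = start
    while end < len(text) and text[end] in "0123456789":
        end += 1
    if end == start:
        raise ValueError("expected an integer")
    return int(text[:end]), text[end:]


if __name__ == "__main__":
    PARSERS = {"item": parse_integer_item}
    print(parse_structured_header("  42 ", "item", PARSERS))  # prints 42
```

Raising an exception corresponds to "fail parsing"; a caller would catch it and discard the entire header field value.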
687 When generating input_string, parsers MUST combine all instances of 688 the target header field into one comma-separated field-value, as per 689 [RFC7230], Section 3.2.2; this assures that the header is processed 690 correctly. 692 For Lists, Parameterised Lists and Dictionaries, this has the effect 693 of correctly concatenating all instances of the header field. 695 Strings can but SHOULD NOT be split across multiple header instances, 696 because comma(s) inserted upon combination will become part of the 697 string output by the parser. 699 Integers, Floats and Binary Content cannot be split across multiple 700 headers because the inserted commas will cause parsing to fail. 702 If parsing fails - including when calling another algorithm - the 703 entire header field's value MUST be discarded. This is intentionally 704 strict, to improve interoperability and safety, and specifications 705 referencing this document cannot loosen this requirement. 707 Note that this has the effect of discarding any header field with 708 non-ASCII characters in input_string. 710 4.2.1. Parsing a Dictionary from Text 712 Given an ASCII string input_string, return a mapping of (identifier, 713 item). input_string is modified to remove the parsed value. 715 1. Let dictionary be an empty, unordered mapping. 717 2. While input_string is not empty: 719 1. Let this_key be the result of running Parse Identifier from 720 Text (Section 4.2.8) with input_string. 722 2. If dictionary already contains this_key, fail parsing. 724 3. Consume the first character of input_string; if it is not 725 "=", fail parsing. 727 4. Let this_value be the result of running Parse Item from Text 728 (Section 4.2.5) with input_string. 730 5. Add key this_key with value this_value to dictionary. 732 6. Discard any leading OWS from input_string. 734 7. If input_string is empty, return dictionary. 736 8. Consume the first character of input_string; if it is not 737 COMMA, fail parsing. 739 9. 
Discard any leading OWS from input_string. 741 10. If input_string is empty, fail parsing. 743 3. No structured data has been found; fail parsing. 745 4.2.2. Parsing a List from Text 747 Given an ASCII string input_string, return a list of items. 748 input_string is modified to remove the parsed value. 750 1. Let items be an empty array. 752 2. While input_string is not empty: 754 1. Let item be the result of running Parse Item from Text 755 (Section 4.2.5) with input_string. 757 2. Append item to items. 759 3. Discard any leading OWS from input_string. 761 4. If input_string is empty, return items. 763 5. Consume the first character of input_string; if it is not 764 COMMA, fail parsing. 766 6. Discard any leading OWS from input_string. 768 7. If input_string is empty, fail parsing. 770 3. No structured data has been found; fail parsing. 772 4.2.3. Parsing a Parameterised List from Text 774 Given an ASCII string input_string, return a list of parameterised 775 identifiers. input_string is modified to remove the parsed value. 777 1. Let items be an empty array. 779 2. While input_string is not empty: 781 1. Let item be the result of running Parse Parameterised 782 Identifier from Text (Section 4.2.4) with input_string. 784 2. Append item to items. 786 3. Discard any leading OWS from input_string. 788 4. If input_string is empty, return items. 790 5. Consume the first character of input_string; if it is not 791 COMMA, fail parsing. 793 6. Discard any leading OWS from input_string. 795 7. If input_string is empty, fail parsing. 797 3. No structured data has been found; fail parsing. 799 4.2.4. Parsing a Parameterised Identifier from Text 801 Given an ASCII string input_string, return an identifier with a 802 mapping of parameters. input_string is modified to remove the parsed 803 value. 805 1. Let primary_identifier be the result of Parsing an Identifier from 806 Text (Section 4.2.8) from input_string. 808 2. Let parameters be an empty, unordered mapping. 810 3. 
In a loop: 812 1. Discard any leading OWS from input_string. 814 2. If the first character of input_string is not ";", exit the 815 loop. 817 3. Consume a ";" character from the beginning of input_string. 819 4. Discard any leading OWS from input_string. 821 5. Let param_name be the result of Parsing an Identifier from 822 Text (Section 4.2.8) from input_string. 824 6. If param_name is already present in parameters, fail parsing. 826 7. Let param_value be a null value. 828 8. If the first character of input_string is "=": 830 1. Consume the "=" character at the beginning of 831 input_string. 833 2. Let param_value be the result of Parsing an Item from 834 Text (Section 4.2.5) from input_string. 836 9. Insert (param_name, param_value) into parameters. 838 4. Return the tuple (primary_identifier, parameters). 840 4.2.5. Parsing an Item from Text 842 Given an ASCII string input_string, return an item. input_string is 843 modified to remove the parsed value. 845 1. Discard any leading OWS from input_string. 847 2. If the first character of input_string is a "-" or a DIGIT, 848 process input_string as a number (Section 4.2.6) and return the 849 result. 851 3. If the first character of input_string is a DQUOTE, process 852 input_string as a string (Section 4.2.7) and return the result. 854 4. If the first character of input_string is "*", process 855 input_string as binary content (Section 4.2.9) and return the 856 result. 858 5. Otherwise, fail parsing. 860 4.2.6. Parsing a Number from Text 862 NOTE: This algorithm parses both Integers (Section 3.5) and Floats 863 (Section 3.6), and returns the corresponding structure. 865 1. Let type be "integer". 867 2. Let sign be 1. 869 3. Let input_number be an empty string. 871 4. If the first character of input_string is "-", remove it from 872 input_string and set sign to -1. 874 5. If input_string is empty, fail parsing. 876 6. If the first character of input_string is not a DIGIT, fail 877 parsing. 879 7. 
While input_string is not empty:

881          1.  Let char be the result of removing the first character of
882              input_string.

884          2.  If char is a DIGIT, append it to input_number.

886          3.  Else, if type is "integer" and char is ".", append char to
887              input_number and set type to "float".

889          4.  Otherwise, fail parsing.

891          5.  If type is "integer" and input_number contains more than 19
892              characters, fail parsing.

894          6.  If type is "float" and input_number contains more than 16
895              characters, fail parsing.

897      8.  If type is "integer":

899          1.  Parse input_number as an integer and let output_number be
900              the result.

902          2.  If output_number is outside the range defined in
903              Section 3.5, fail parsing.

905      9.  Otherwise:

907          1.  If the final character of input_number is ".", fail parsing.

909          2.  Parse input_number as a float and let output_number be the
910              result.

912      10. Return the product of output_number and sign.

914   4.2.7.  Parsing a String from Text

916      Given an ASCII string input_string, return an unquoted string.
917      input_string is modified to remove the parsed value.

919      1.  Let output_string be an empty string.

921      2.  If the first character of input_string is not DQUOTE, fail
922          parsing.

924      3.  Discard the first character of input_string.

926      4.  While input_string is not empty:

928          1.  Let char be the result of removing the first character of
929              input_string.

931          2.  If char is a backslash ("\"):

933              1.  If input_string is now empty, fail parsing.

935              2.  Else:

937                  1.  Let next_char be the result of removing the first
938                      character of input_string.

940                  2.  If next_char is neither DQUOTE nor "\", fail parsing.

942                  3.  Append next_char to output_string.

944          3.  Else, if char is DQUOTE, return output_string.

946          4.  Else, if char is in the range %x00-1f or %x7f (i.e., is not
947              in VCHAR or SP), fail parsing.

949          5.  Else, append char to output_string.

951      5.  Reached the end of input_string without a closing DQUOTE; fail
             parsing.

953   4.2.8.  Parsing an Identifier from Text

955      Given an ASCII string input_string, return an identifier.
input_string
956      is modified to remove the parsed value.

958      1.  If the first character of input_string is not lcalpha, fail
959          parsing.

961      2.  Let output_string be an empty string.

963      3.  While input_string is not empty:

965          1.  Let char be the result of removing the first character of
966              input_string.

968          2.  If char is not one of lcalpha, DIGIT, "_", "-", "*" or "/":

970              1.  Prepend char to input_string.

972              2.  Return output_string.

974          3.  Append char to output_string.

976      4.  Return output_string.

978   4.2.9.  Parsing Binary Content from Text

980      Given an ASCII string input_string, return binary content.
981      input_string is modified to remove the parsed value.

983      1.  If the first character of input_string is not "*", fail parsing.

985      2.  Discard the first character of input_string.

987      3.  Let b64_content be the result of removing content of input_string
988          up to but not including the first instance of the character "*".
989          If there is not a "*" character before the end of input_string,
990          fail parsing.

992      4.  Consume the "*" character at the beginning of input_string.

994      5.  If b64_content contains a character not included in ALPHA, DIGIT,
995          "+", "/" and "=", fail parsing.

997      6.  Let binary_content be the result of Base 64 Decoding [RFC4648]
998          b64_content, synthesising padding if necessary (note the
999          requirements about recipient behaviour below).

1001     7.  Return binary_content.

1003     As per [RFC4648], Section 3.2, it is RECOMMENDED that parsers reject
1004     encoded data that is not properly padded, although this might not be
1005     possible in some base64 implementations.

1007     Because some implementations of base64 do not allow rejection of
1008     encoded data that has non-zero pad bits (see [RFC4648], Section 3.5),
1009     parsers SHOULD NOT fail when it is present, unless they cannot be
1010     configured to handle it.
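As a rough illustration, the binary content steps of Section 4.2.9 could be sketched in Python as follows. This is a minimal sketch, not a reference implementation: the function name parse_binary_content and the choice to return the unparsed remainder alongside the result (rather than modifying input_string in place) are assumptions made for the example, and the standard library's base64.b64decode is used as the decoder.

```python
import base64
import string

# Characters permitted by step 5: ALPHA, DIGIT, "+", "/" and "=".
B64_CHARS = set(string.ascii_letters + string.digits + "+/=")


def parse_binary_content(input_string):
    """Hypothetical helper: return (binary_content, remaining_input).

    Raises ValueError wherever the algorithm says "fail parsing".
    """
    # Steps 1-2: the value must open with "*", which is discarded.
    if not input_string.startswith("*"):
        raise ValueError("binary content must start with '*'")
    rest = input_string[1:]

    # Steps 3-4: take everything up to the closing "*", then consume it.
    end = rest.find("*")
    if end == -1:
        raise ValueError("missing closing '*'")
    b64_content, rest = rest[:end], rest[end + 1:]

    # Step 5: reject characters outside the permitted set.
    if any(ch not in B64_CHARS for ch in b64_content):
        raise ValueError("character outside the base64 alphabet")

    # Step 6: synthesise missing padding, then decode.
    b64_content += "=" * (-len(b64_content) % 4)
    try:
        return base64.b64decode(b64_content, validate=True), rest
    except ValueError as exc:  # binascii.Error is a ValueError subclass
        raise ValueError("invalid base64 content") from exc
```

In this sketch an unpadded value such as "*aGVsbG8*" decodes to the same bytes as "*aGVsbG8=*" because padding is synthesised before decoding; a parser that follows the RECOMMENDED behaviour for padding would reject the unpadded form instead.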
1012     This specification does not relax the requirements in [RFC4648],
1013     Sections 3.1 and 3.3; therefore, parsers MUST fail on characters
1014     outside the base64 alphabet, and on line feeds in encoded data.

1016  5.  IANA Considerations

1018     This draft has no actions for IANA.

1020  6.  Security Considerations

1022     The size of most types defined by Structured Headers is not limited;
1023     as a result, extremely large header fields could be an attack vector
1024     (e.g., for resource consumption).  Most HTTP implementations limit
1025     the size of individual header fields as well as the overall
1026     header block size to mitigate such attacks.

1028     It is possible for parties with the ability to inject new HTTP header
1029     fields to change the meaning of a Structured Header.  In some
1030     circumstances, this will cause parsing to fail, but it is not
1031     possible to reliably fail in all such circumstances.

1033  7.  References

1035  7.1.  Normative References

1037     [RFC0020]  Cerf, V., "ASCII format for network interchange", STD 80,
1038                RFC 20, DOI 10.17487/RFC0020, October 1969.

1041     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1042                Requirement Levels", BCP 14, RFC 2119,
1043                DOI 10.17487/RFC2119, March 1997.

1046     [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
1047                Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

1050     [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
1051                Specifications: ABNF", STD 68, RFC 5234,
1052                DOI 10.17487/RFC5234, January 2008.

1055     [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
1056                Protocol (HTTP/1.1): Message Syntax and Routing",
1057                RFC 7230, DOI 10.17487/RFC7230, June 2014.

1060     [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
1061                2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
1062                May 2017.

1064  7.2.
Informative References

1066     [IEEE754]  IEEE, "IEEE Standard for Floating-Point Arithmetic",
1067                IEEE 754-2008, DOI 10.1109/IEEESTD.2008.4610935,
1068                ISBN 978-0-7381-5752-8, August 2008.

1071                See also http://grouper.ieee.org/groups/754/ [6].

1073     [RFC7231]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
1074                Protocol (HTTP/1.1): Semantics and Content", RFC 7231,
1075                DOI 10.17487/RFC7231, June 2014.

1078     [RFC7493]  Bray, T., Ed., "The I-JSON Message Format", RFC 7493,
1079                DOI 10.17487/RFC7493, March 2015.

1082     [RFC7540]  Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
1083                Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
1084                DOI 10.17487/RFC7540, May 2015.

1087     [RFC7541]  Peon, R. and H. Ruellan, "HPACK: Header Compression for
1088                HTTP/2", RFC 7541, DOI 10.17487/RFC7541, May 2015.

1091     [RFC8259]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
1092                Interchange Format", STD 90, RFC 8259,
1093                DOI 10.17487/RFC8259, December 2017.

1096  7.3.  URIs

1098     [1] https://lists.w3.org/Archives/Public/ietf-http-wg/

1100     [2] https://httpwg.github.io/

1102     [3] https://github.com/httpwg/http-extensions/labels/header-structure

1104     [4] https://github.com/httpwg/structured-header-tests

1106     [5] https://github.com/httpwg/wiki/wiki/Structured-Headers

1108  Appendix A.  Frequently Asked Questions

1110  A.1.  Why not JSON?

1112     Earlier proposals for structured headers were based upon JSON
1113     [RFC8259].  However, constraining its use to make it suitable for
1114     HTTP header fields required senders and recipients to implement
1115     specific additional handling.

1117     For example, JSON has specification issues around large numbers and
1118     objects with duplicate members.  Although advice for avoiding these
1119     issues is available (e.g., [RFC7493]), it cannot be relied upon.

1121     Likewise, JSON strings are by default Unicode strings, which have a
1122     number of potential interoperability issues (e.g., in comparison).
1123     Although implementers can be advised to avoid non-ASCII content where
1124     unnecessary, this is difficult to enforce.

1126     Another example is JSON's ability to nest content to arbitrary
1127     depths.  Since the resulting memory commitment might be unsuitable
1128     (e.g., in embedded and other limited server deployments), it's
1129     necessary to limit it in some fashion; however, existing JSON
1130     implementations have no such limits, and even if a limit is
1131     specified, it's likely that some header field definition will find a
1132     need to violate it.

1134     Because of JSON's broad adoption and implementation, it is difficult
1135     to impose such additional constraints across all implementations;
1136     some deployments would fail to enforce them, thereby harming
1137     interoperability.

1139     Since a major goal for Structured Headers is to improve
1140     interoperability and simplify implementation, these concerns led to a
1141     format that requires a dedicated parser and serialiser.

1143     Additionally, there were widely shared feelings that JSON doesn't
1144     "look right" in HTTP headers.

1146  A.2.  Structured Headers don't "fit" my data.

1148     Structured Headers intentionally limits the complexity of data
1149     structures, to ensure that it can be processed in a performant manner
1150     with little overhead.  This means that work is necessary to fit some
1151     data types into them.

1153     Sometimes, this can be achieved by creating limited substructures in
1154     values, and/or using more than one header.  For example, consider:

1156     Example-Thing: name="Widget", cost=89.2, descriptions="foo bar"
1157     Example-Description: foo; url="https://example.net"; context=123,
1158                          bar; url="https://example.org"; context=456

1160     Since the description contains a list of key/value pairs, we use a
1161     Parameterised List to represent them, with the identifier for each
1162     item in the list used to identify it in the "descriptions" member of
1163     the Example-Thing header.
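To make the example concrete, the following Python sketch shows one way the Example-Description value might be parsed, loosely following Sections 4.2.3, 4.2.4 and 4.2.8. It is only an illustration under stated assumptions: all function names are hypothetical, each function returns the unparsed remainder instead of modifying input_string, and parse_simple_item is a deliberately reduced stand-in for Section 4.2.5 that handles only integers and strings without escapes.

```python
import string

LCALPHA = set(string.ascii_lowercase)
IDENT_CHARS = LCALPHA | set(string.digits) | set("_-*/")
OWS = " \t"


def parse_identifier(s):
    """Section 4.2.8 (sketch): return (identifier, rest)."""
    if not s or s[0] not in LCALPHA:
        raise ValueError("identifier must start with lcalpha")
    i = 1
    while i < len(s) and s[i] in IDENT_CHARS:
        i += 1
    return s[:i], s[i:]


def parse_simple_item(s):
    """Reduced stand-in for Section 4.2.5: integers and plain strings only."""
    s = s.lstrip(OWS)
    if s.startswith('"'):
        end = s.find('"', 1)
        if end == -1 or "\\" in s[1:end]:
            raise ValueError("unterminated or escaped string (unsupported)")
        return s[1:end], s[end + 1:]
    start = 1 if s.startswith("-") else 0
    j = start
    while j < len(s) and s[j] in string.digits:
        j += 1
    if j == start:
        raise ValueError("unsupported item")
    return int(s[:j]), s[j:]


def parse_parameterised_identifier(s):
    """Section 4.2.4 (sketch): return ((identifier, parameters), rest)."""
    primary, s = parse_identifier(s)
    parameters = {}
    while True:
        s = s.lstrip(OWS)
        if not s.startswith(";"):
            break
        name, s = parse_identifier(s[1:].lstrip(OWS))
        if name in parameters:
            raise ValueError("duplicate parameter")
        value = None
        if s.startswith("="):
            value, s = parse_simple_item(s[1:])
        parameters[name] = value
    return (primary, parameters), s


def parse_parameterised_list(s):
    """Section 4.2.3 (sketch): return a list of (identifier, parameters)."""
    items = []
    while s:
        item, s = parse_parameterised_identifier(s)
        items.append(item)
        s = s.lstrip(OWS)
        if not s:
            return items
        if s[0] != ",":
            raise ValueError("expected COMMA")
        s = s[1:].lstrip(OWS)
        if not s:
            raise ValueError("trailing COMMA")
    raise ValueError("no structured data found")


parsed = parse_parameterised_list(
    'foo; url="https://example.net"; context=123, '
    'bar; url="https://example.org"; context=456'
)
```

Here parsed holds two tuples, one for "foo" and one for "bar", each paired with a dictionary of its "url" and "context" parameters; a processor could then look up each word of the "descriptions" member of Example-Thing among those identifiers.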
1165     When specifying more than one header, it's important to remember to
1166     describe what a processor's behaviour should be when one of the
1167     headers is missing.

1169     If you need to fit arbitrarily complex data into a header, Structured
1170     Headers is probably a poor fit for your use case.

1172  Appendix B.  Changes

1174     _RFC Editor: Please remove this section before publication._

1176  B.1.  Since draft-ietf-httpbis-header-structure-06

1178     o  Add a FAQ.

1180     o  Allow non-zero pad bits.

1182     o  Explicitly check for integers that violate constraints.

1184  B.2.  Since draft-ietf-httpbis-header-structure-05

1186     o  Reorganise specification to separate parsing out.

1188     o  Allow referencing specs to use ABNF.

1190     o  Define serialisation algorithms.

1192     o  Refine relationship between ABNF, parsing and serialisation
1193        algorithms.

1195  B.3.  Since draft-ietf-httpbis-header-structure-04

1197     o  Remove identifiers from item.

1199     o  Remove most limits on sizes.

1201     o  Refine number parsing.

1203  B.4.  Since draft-ietf-httpbis-header-structure-03

1205     o  Strengthen language around failure handling.

1207  B.5.  Since draft-ietf-httpbis-header-structure-02

1209     o  Split Numbers into Integers and Floats.

1211     o  Define number parsing.

1213     o  Tighten up binary parsing and give it an explicit end delimiter.

1215     o  Clarify that mappings are unordered.

1217     o  Allow zero-length strings.

1219     o  Improve string parsing algorithm.

1221     o  Improve limits in algorithms.

1223     o  Require parsers to combine header fields before processing.

1225     o  Throw an error on trailing garbage.

1227  B.6.  Since draft-ietf-httpbis-header-structure-01

1229     o  Replaced with draft-nottingham-structured-headers.

1231  B.7.  Since draft-ietf-httpbis-header-structure-00

1233     o  Added signed 64bit integer type.

1235     o  Drop UTF8, and settle on BCP137 ::EmbeddedUnicodeChar for
1236        h1-unicode-string.
1238     o  Change h1_blob delimiter to ":" since "'" is valid t_char

1240  Authors' Addresses

1242     Mark Nottingham
1243     Fastly

1245     Email: mnot@mnot.net
1246     URI:   https://www.mnot.net/

1248     Poul-Henning Kamp
1249     The Varnish Cache Project

1251     Email: phk@varnish-cache.org