HTTP                                                       M. Nottingham
Internet-Draft                                                    Fastly
Intended status: Standards Track                               P-H. Kamp
Expires: December 7, 2018                      The Varnish Cache Project
                                                            June 5, 2018


                      Structured Headers for HTTP
                 draft-ietf-httpbis-header-structure-06

Abstract

   This document describes a set of data types and algorithms associated
   with them that are intended to make it easier and safer to define and
   handle HTTP header fields. It is intended for use by new
   specifications of HTTP header fields as well as revisions of existing
   header field specifications when doing so does not cause
   interoperability issues.

Note to Readers

   _RFC EDITOR: please remove this section before publication_

   Discussion of this draft takes place on the HTTP working group
   mailing list (ietf-http-wg@w3.org), which is archived at
   https://lists.w3.org/Archives/Public/ietf-http-wg/ [1].

   Working Group information can be found at https://httpwg.github.io/
   [2]; source code and issues list for this draft can be found at
   https://github.com/httpwg/http-extensions/labels/header-structure
   [3].

   Tests for implementations are collected at
   https://github.com/httpwg/structured-header-tests [4].

   Implementations are tracked at
   https://github.com/httpwg/wiki/wiki/Structured-Headers [5].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 7, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Notational Conventions
   2. Defining New Structured Headers
   3. Structured Header Data Types
      3.1. Dictionaries
      3.2. Lists
      3.3. Parameterised Lists
      3.4. Items
      3.5. Integers
      3.6. Floats
      3.7. Strings
      3.8. Identifiers
      3.9. Binary Content
   4. Structured Headers in HTTP/1
      4.1. Serialising Structured Headers into HTTP/1
      4.2. Parsing HTTP/1 Header Fields into Structured Headers
   5. IANA Considerations
   6. Security Considerations
   7. References
      7.1. Normative References
      7.2. Informative References
      7.3. URIs
   Appendix A. Changes
      A.1. Since draft-ietf-httpbis-header-structure-05
      A.2. Since draft-ietf-httpbis-header-structure-04
      A.3. Since draft-ietf-httpbis-header-structure-03
      A.4. Since draft-ietf-httpbis-header-structure-02
      A.5. Since draft-ietf-httpbis-header-structure-01
      A.6. Since draft-ietf-httpbis-header-structure-00
   Authors' Addresses

1. Introduction

   Specifying the syntax of new HTTP header fields is an onerous task;
   even with the guidance in [RFC7231], Section 8.3.1, there are many
   decisions - and pitfalls - for a prospective HTTP header field
   author.

   Once a header field is defined, bespoke parsers and serialisers often
   need to be written, because each header has slightly different
   handling of what looks like common syntax.

   This document introduces a set of common data structures for use in
   HTTP header field values to address these problems. In particular,
   it defines a generic, abstract model for header field values, along
   with a concrete serialisation for expressing that model in HTTP/1
   [RFC7230] header fields.

   HTTP headers that are defined as "Structured Headers" use the types
   defined in this specification to define their syntax and basic
   handling rules, thereby simplifying both their definition by
   specification writers and handling by implementations.

   Additionally, future versions of HTTP can define alternative
   serialisations of the abstract model of these structures, allowing
   headers that use it to be transmitted more efficiently without being
   redefined.

   Note that it is not a goal of this document to redefine the syntax of
   existing HTTP headers; the mechanisms described herein are only
   intended to be used with headers that explicitly opt into them.

   To specify a header field that is a Structured Header, see Section 2.

   Section 3 defines a number of abstract data types that can be used in
   Structured Headers.

   Those abstract types can be serialised into and parsed from textual
   headers - such as those used in HTTP/1 - using the algorithms
   described in Section 4.

1.1. Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   This document uses the Augmented Backus-Naur Form (ABNF) notation of
   [RFC5234], including the DIGIT, ALPHA and DQUOTE rules from that
   document. It also includes the OWS rule from [RFC7230].

   This document uses algorithms to specify parsing and serialisation
   behaviours, and ABNF to illustrate expected syntax.

   For parsing, implementations MUST follow the algorithms, but MAY vary
   in implementation so long as the behaviours are indistinguishable
   from the specified behaviour. If there is disagreement between the
   parsing algorithms and ABNF, the specified algorithms take
   precedence.

   For serialisation, the ABNF illustrates the range of acceptable wire
   representations with as much fidelity as possible, and the algorithms
   define the recommended way to produce them. Implementations MAY vary
   from the specified behaviour so long as the output still matches the
   ABNF.

2. Defining New Structured Headers

   To define an HTTP header as a Structured Header, its specification
   needs to:

   o Reference this specification. Recipients and generators of the
     header need to know that the requirements of this document are in
     effect.

   o Specify the header field's allowed syntax for values, in terms of
     the types described in Section 3, along with their associated
     semantics. Syntax definitions are encouraged to use the ABNF
     rules beginning with "sh-" defined in this specification.

   o Specify any additional constraints upon the syntax of the
     structure used, as well as the consequences when those
     constraints are violated. When Structured Headers parsing fails,
     the header is discarded (see Section 4.2); in most situations,
     header-specific constraints should do likewise.

   Note that a header field definition cannot relax the requirements of
   a structure or its processing, because doing so would preclude
   handling by generic software; it can only add constraints.

   For example:

   # Foo-Example Header

   The Foo-Example HTTP header field conveys information about how
   much Foo the message has.

   Foo-Example is a Structured Header [RFCxxxx]. Its value MUST be a
   dictionary ([RFCxxxx], Section Y.Y). Its ABNF is:

     Foo-Example = sh-dictionary

   The dictionary MUST contain:

   * Exactly one member whose key is "foo", and whose value is an
     integer ([RFCxxxx], Section Y.Y), indicating the number of foos
     in the message.
   * Exactly one member whose key is "barUrls", and whose value is a
     string ([RFCxxxx], Section Y.Y), conveying the Bar URLs for the
     message. See below for processing requirements.

   If the parsed header field does not contain both, it MUST be
   ignored.

   "foo" MUST be between 0 and 10, inclusive; other values MUST cause
   the header to be ignored.

   "barUrls" contains a space-separated list of URI-references
   ([RFC3986], Section 4.1):

     barURLs = URI-reference *( 1*SP URI-reference )

   If a member of barURLs is not a valid URI-reference, it MUST cause
   that value to be ignored.

   If a member of barURLs is a relative reference ([RFC3986],
   Section 4.2), it MUST be resolved ([RFC3986], Section 5) before
   being used.

   This specification defines minimums for the length or number of
   various structures supported by Structured Headers implementations.
   It does not specify maximum sizes in most cases, but header authors
   should be aware that HTTP implementations do impose various limits on
   the size of individual header fields, the total number of fields,
   and/or the size of the entire header block.

3. Structured Header Data Types

   This section defines the abstract value types that can be composed
   into Structured Headers. The ABNF provided represents the on-wire
   format in HTTP/1.

3.1. Dictionaries

   Dictionaries are unordered maps of key-value pairs, where the keys
   are identifiers (Section 3.8) and the values are items (Section 3.4).
   There can be one or more members, and keys are required to be unique.

   The ABNF for dictionaries is:

   sh-dictionary = dict-member *( OWS "," OWS dict-member )
   dict-member   = member-name "=" member-value
   member-name   = identifier
   member-value  = sh-item

   In HTTP/1, keys and values are separated by "=" (without whitespace),
   and key/value pairs are separated by a comma with optional
   whitespace. For example:

   Example-DictHeader: en="Applepie", da=*w4ZibGV0w6ZydGU=*

   Typically, a header field specification will define the semantics of
   individual keys, as well as whether their presence is required or
   optional. Recipients MUST ignore keys that are undefined or unknown,
   unless the header field's specification specifically disallows them.

   Parsers MUST support dictionaries containing at least 1024 key/value
   pairs.

3.2. Lists

   Lists are arrays of items (Section 3.4) with one or more members.

   The ABNF for lists is:

   sh-list     = list-member *( OWS "," OWS list-member )
   list-member = sh-item

   In HTTP/1, each member is separated by a comma and optional
   whitespace. For example, a header field whose value is defined as a
   list of strings could look like:

   Example-StrListHeader: "foo", "bar", "It was the best of times."
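
   The HTTP/1 forms of dictionaries and lists described above can be
   sketched in a few lines of Python. This is only an illustration
   under stated assumptions, not a reference implementation: the
   function names are hypothetical, floats are left out, and the
   string, identifier and binary content rules are taken from
   Sections 3.7, 3.8 and 3.9. Because dictionaries are unordered, the
   member order produced here (Python's insertion order) carries no
   meaning.

```python
import base64
import re

# Identifier syntax from Section 3.8: a lowercase letter followed by
# lowercase letters, digits, "_", "-", "*" or "/".
IDENTIFIER = re.compile(r"[a-z][a-z0-9_\-*/]*")


def serialise_item(item):
    """Render an integer, string or binary item in its HTTP/1 form."""
    if isinstance(item, bool):
        # bool is a subclass of int in Python; it is not an item type.
        raise TypeError("booleans are not items")
    if isinstance(item, int):
        if not -(2 ** 63) <= item <= 2 ** 63 - 1:
            raise ValueError("integer out of range")
        return str(item)
    if isinstance(item, str):
        if any(not 0x20 <= ord(ch) <= 0x7E for ch in item):
            raise ValueError("strings are limited to printable ASCII")
        # Only backslash and DQUOTE are escaped.
        escaped = item.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    if isinstance(item, bytes):
        # Binary content is padded base64 between asterisks.
        return "*" + base64.b64encode(item).decode("ascii") + "*"
    raise TypeError("unsupported item type (floats omitted in this sketch)")


def serialise_list(members):
    """Join list members with a comma and a single space."""
    if not members:
        raise ValueError("lists have one or more members")
    return ", ".join(serialise_item(member) for member in members)


def serialise_dictionary(members):
    """Render key=value pairs, separated by a comma and a single space."""
    if not members:
        raise ValueError("dictionaries have one or more members")
    parts = []
    for key, value in members.items():
        if not IDENTIFIER.fullmatch(key):
            raise ValueError("invalid identifier: " + repr(key))
        parts.append(key + "=" + serialise_item(value))
    return ", ".join(parts)


print(serialise_list(["foo", "bar", "It was the best of times."]))
```

   Run as written, the final line prints the field value shown in the
   Example-StrListHeader example above.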

   Header specifications can constrain the types of individual values if
   necessary.

   Parsers MUST support lists containing at least 1024 members.

3.3. Parameterised Lists

   Parameterised Lists are arrays of parameterised identifiers.

   A parameterised identifier is an identifier (Section 3.8) with an
   optional set of parameters, each parameter having an identifier and
   an optional value that is an item (Section 3.4). Ordering between
   parameters is not significant, and duplicate parameters MUST cause
   parsing to fail.

   The ABNF for parameterised lists is:

   sh-param-list = param-id *( OWS "," OWS param-id )
   param-id      = identifier *parameter
   parameter     = OWS ";" OWS param-name [ "=" param-value ]
   param-name    = identifier
   param-value   = sh-item

   In HTTP/1, each param-id is separated by a comma and optional
   whitespace (as in Lists), and the parameters are separated by
   semicolons. For example:

   Example-ParamListHeader: abc_123;a=1;b=2; cdef_456, ghi;q="9";r="w"

   Parsers MUST support parameterised lists containing at least 1024
   members, and support members with at least 256 parameters.

3.4. Items

   An item can be an integer (Section 3.5), float (Section 3.6), string
   (Section 3.7), or binary content (Section 3.9).

   The ABNF for items is:

   sh-item = sh-integer / sh-float / sh-string / sh-binary

3.5. Integers

   Integers have a range of -9,223,372,036,854,775,808 to
   9,223,372,036,854,775,807 inclusive (i.e., a 64-bit signed integer).

   The ABNF for integers is:

   sh-integer = ["-"] 1*19DIGIT

   For example:

   Example-IntegerHeader: 42

3.6. Floats

   Floats are integers with a fractional part that can be stored as
   IEEE 754 double precision numbers (binary64) ([IEEE754]).

   The ABNF for floats is:

   sh-float = ["-"] (
                DIGIT "." 1*14DIGIT /
               2DIGIT "." 1*13DIGIT /
               3DIGIT "." 1*12DIGIT /
               4DIGIT "." 1*11DIGIT /
               5DIGIT "." 1*10DIGIT /
               6DIGIT "." 1*9DIGIT /
               7DIGIT "." 1*8DIGIT /
               8DIGIT "." 1*7DIGIT /
               9DIGIT "." 1*6DIGIT /
              10DIGIT "." 1*5DIGIT /
              11DIGIT "." 1*4DIGIT /
              12DIGIT "." 1*3DIGIT /
              13DIGIT "." 1*2DIGIT /
              14DIGIT "." 1DIGIT )

   For example, a header whose value is defined as a float could look
   like:

   Example-FloatHeader: 4.5

3.7. Strings

   Strings are zero or more printable ASCII [RFC0020] characters (i.e.,
   the range 0x20 to 0x7E). Note that this excludes tabs, newlines,
   carriage returns, etc.

   The ABNF for strings is:

   sh-string = DQUOTE *(chr) DQUOTE
   chr       = unescaped / escaped
   unescaped = %x20-21 / %x23-5B / %x5D-7E
   escaped   = "\" ( DQUOTE / "\" )

   In HTTP/1 headers, strings are delimited with double quotes, using a
   backslash ("\") to escape double quotes and backslashes. For
   example:

   Example-StringHeader: "hello world"

   Note that strings only use DQUOTE as a delimiter; single quotes do
   not delimit strings. Furthermore, only DQUOTE and "\" can be
   escaped; other sequences MUST cause parsing to fail.

   Unicode is not directly supported in this document, because it causes
   a number of interoperability issues, and - with few exceptions -
   header values do not require it.

   When it is necessary for a field value to convey non-ASCII string
   content, binary content (Section 3.9) SHOULD be specified, along with
   a character encoding (preferably, UTF-8).

   Parsers MUST support strings with at least 1024 characters.

3.8. Identifiers

   Identifiers are short textual identifiers; their abstract model is
   identical to their expression in the textual HTTP serialisation.
   Parsers MUST support identifiers with at least 64 characters.

   The ABNF for identifiers is:

   identifier = lcalpha *( lcalpha / DIGIT / "_" / "-" / "*" / "/" )
   lcalpha    = %x61-7A ; a-z

   Note that identifiers can only contain lowercase letters.

3.9. Binary Content

   Arbitrary binary content can be conveyed in Structured Headers.

   The ABNF for binary content is:

   sh-binary = "*" *(base64) "*"
   base64    = ALPHA / DIGIT / "+" / "/" / "="

   In HTTP/1 headers, binary content is delimited with asterisks and
   encoded using base64 ([RFC4648], Section 4). For example:

   Example-BinaryHdr: *cHJldGVuZCB0aGlzIGlzIGJpbmFyeSBjb250ZW50Lg==*

   Parsers MUST support binary content with at least 16384 octets after
   decoding.

4. Structured Headers in HTTP/1

   This section defines how to serialise and parse Structured Headers in
   HTTP/1 textual header fields, and protocols compatible with them
   (e.g., in HTTP/2 [RFC7540] before HPACK [RFC7541] is applied).

4.1. Serialising Structured Headers into HTTP/1

   Given a structure defined in this specification:

   1. If the structure is a dictionary, return the result of
      Serialising a Dictionary (Section 4.1.1).

   2. If the structure is a list, return the result of Serialising a
      List (Section 4.1.2).

   3. If the structure is a parameterised list, return the result of
      Serialising a Parameterised List (Section 4.1.3).

   4. If the structure is an item, return the result of Serialising an
      Item (Section 4.1.4).

   5. Otherwise, fail serialisation.

4.1.1. Serialising a Dictionary

   Given a dictionary as input:

   1. Let output be an empty string.

   2. For each member mem of input:

      1. Let name be the result of applying Serialising an Identifier
         (Section 4.1.8) to mem's member-name.

      2. Append name to output.

      3. Append "=" to output.

      4. Let value be the result of applying Serialising an Item
         (Section 4.1.4) to mem's member-value.

      5. Append value to output.

      6. If more members remain in input:

         1. Append a COMMA to output.

         2. Append a single WS to output.

   3. Return output.

4.1.2. Serialising a List

   Given a list as input:

   1. Let output be an empty string.

   2. For each member mem of input:

      1. Let value be the result of applying Serialising an Item
         (Section 4.1.4) to mem.

      2. Append value to output.

      3. If more members remain in input:

         1. Append a COMMA to output.

         2. Append a single WS to output.

   3. Return output.

4.1.3. Serialising a Parameterised List

   Given a parameterised list as input:

   1. Let output be an empty string.

   2. For each member mem of input:

      1. Let id be the result of applying Serialising an Identifier
         (Section 4.1.8) to mem's identifier.

      2. Append id to output.

      3. For each parameter in mem's parameters:

         1. Append ";" to output.

         2. Let name be the result of applying Serialising an
            Identifier (Section 4.1.8) to parameter's param-name.

         3. Append name to output.

         4. If parameter has a param-value:

            1. Let value be the result of applying Serialising an
               Item (Section 4.1.4) to parameter's param-value.

            2. Append "=" to output.

            3. Append value to output.

      4. If more members remain in input:

         1. Append a COMMA to output.

         2. Append a single WS to output.

   3. Return output.

4.1.4. Serialising an Item

   Given an item as input:

   1. If input is a type other than an integer, float, string or binary
      content, fail serialisation.

   2. Let output be an empty string.

   3. If input is an integer, let output be the result of applying
      Serialising an Integer (Section 4.1.5) to input.

   4. If input is a float, let output be the result of applying
      Serialising a Float (Section 4.1.6) to input.

   5. If input is a string, let output be the result of applying
      Serialising a String (Section 4.1.7) to input.

   6. If input is binary content, let output be the result of applying
      Serialising Binary Content (Section 4.1.9) to input.

   7. Return output.

4.1.5. Serialising an Integer

   Given an integer as input:

   1. If input is not an integer in the range of
      -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
      inclusive, fail serialisation.

   2. Let output be an empty string.

   3. If input is less than (but not equal to) 0, append "-" to output.

   4. Append input's numeric value represented in base 10 using only
      decimal digits to output.

   5. Return output.

4.1.6. Serialising a Float

   Given a float as input:

   1. If input is not an IEEE 754 double precision number, fail
      serialisation.

   2. Let output be an empty string.

   3. If input is less than (but not equal to) 0, append "-" to output.

   4. Append input's integer component represented in base 10 using
      only decimal digits to output; if it is zero, append "0".

   5. Append "." to output.

   6. Append input's decimal component represented in base 10 using
      only decimal digits to output; if it is zero, append "0".

   7. Return output.

4.1.7. Serialising a String

   Given a string as input:

   1. If input is not a sequence of characters, or contains characters
      outside the range allowed by the ABNF defined in Section 3.7,
      fail serialisation.

   2. Let output be an empty string.

   3. Append DQUOTE to output.

   4. For each character char in input:

      1. If char is "\" or DQUOTE:

         1. Append "\" to output.

      2. Append char to output, using ASCII encoding [RFC0020].

   5. Append DQUOTE to output.

   6. Return output.

4.1.8. Serialising an Identifier

   Given an identifier as input:

   1. If input is not a sequence of characters, or contains characters
      not allowed in Section 3.8, fail serialisation.

   2. Let output be an empty string.

   3. Append input to output, using ASCII encoding [RFC0020].

   4. Return output.

4.1.9. Serialising Binary Content

   Given binary content as input:

   1. If input is not a sequence of bytes, fail serialisation.

   2. Let output be an empty string.

   3. Append "*" to output.

   4. Append the result of base64-encoding input as per [RFC4648],
      Section 4, to output, taking account of the requirements below.

   5. Append "*" to output.

   6. Return output.

   The encoded data is required to be padded with "=", as per [RFC4648],
   Section 3.2. Likewise, encoded data is required to have pad bits set
   to zero, as per [RFC4648], Section 3.5.

4.2. Parsing HTTP/1 Header Fields into Structured Headers

   When a receiving implementation parses textual HTTP header fields
   (e.g., in HTTP/1 or HTTP/2) that are known to be Structured Headers,
   it is important that care be taken, as there are a number of edge
   cases that can cause interoperability or even security problems.
   This section specifies the algorithm for doing so.

   Given an ASCII string input_string that represents the chosen
   header's field-value, and header_type, one of "dictionary", "list",
   "param-list", or "item", return the parsed header value.

   1. Discard any leading OWS from input_string.

   2. If header_type is "dictionary", let output be the result of
      Parsing a Dictionary from Text (Section 4.2.1).

   3. If header_type is "list", let output be the result of Parsing a
      List from Text (Section 4.2.2).

   4. If header_type is "param-list", let output be the result of
      Parsing a Parameterised List from Text (Section 4.2.3).

   5. Otherwise, let output be the result of Parsing an Item from Text
      (Section 4.2.5).

   6. Discard any leading OWS from input_string.

   7. If input_string is not empty, fail parsing.

   8. Otherwise, return output.

   When generating input_string, parsers MUST combine all instances of
   the target header field into one comma-separated field-value, as per
   [RFC7230], Section 3.2.2; this ensures that the header is processed
   correctly.

   For Lists, Parameterised Lists and Dictionaries, this has the effect
   of correctly concatenating all instances of the header field.

   Strings can be, but SHOULD NOT be, split across multiple header
   instances, because comma(s) inserted upon combination will become
   part of the string output by the parser.

   Integers, Floats and Binary Content cannot be split across multiple
   headers because the inserted commas will cause parsing to fail.
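
   The effect of combining header instances can be illustrated with a
   minimal Python sketch. It is an assumption-laden illustration rather
   than a complete parser: it handles only header_type "item", and only
   strings and integers within that; the names combine_field_values,
   parse_item_header and ParseError are hypothetical, and instances are
   joined with a comma followed by a single space.

```python
import re


class ParseError(ValueError):
    """Raised when parsing fails; the caller discards the field value."""


def combine_field_values(instances):
    """Join every instance of one header field into one field-value."""
    return ", ".join(instances)


def parse_string(text):
    """Parse a quoted string; return (value, remaining text)."""
    if not text.startswith('"'):
        raise ParseError("expected DQUOTE")
    output = []
    pos = 1
    while pos < len(text):
        char = text[pos]
        pos += 1
        if char == "\\":
            if pos >= len(text):
                raise ParseError("dangling backslash")
            next_char = text[pos]
            pos += 1
            if next_char not in '"\\':
                raise ParseError("only DQUOTE and backslash can be escaped")
            output.append(next_char)
        elif char == '"':
            return "".join(output), text[pos:]
        else:
            output.append(char)
    raise ParseError("unterminated string")


def parse_integer(text):
    """Parse an integer of up to 19 digits; return (value, remaining text)."""
    match = re.match(r"-?[0-9]{1,19}", text)
    if match is None:
        raise ParseError("expected an integer")
    value = int(match.group())
    if not -(2 ** 63) <= value <= 2 ** 63 - 1:
        raise ParseError("integer out of range")
    return value, text[match.end():]


def parse_item_header(instances):
    """Top-level parse for an "item" header (strings and integers only)."""
    input_string = combine_field_values(instances)
    if not input_string.isascii():
        raise ParseError("non-ASCII input")
    input_string = input_string.lstrip(" \t")   # discard leading OWS
    if input_string.startswith('"'):
        output, rest = parse_string(input_string)
    else:
        output, rest = parse_integer(input_string)
    if rest.lstrip(" \t"):                      # anything left over fails
        raise ParseError("trailing characters")
    return output


# A string split across two instances silently gains the inserted comma.
print(parse_item_header(['"hello', 'world"']))
```

   In this sketch, a string sent as two instances parses successfully
   but contains the inserted comma and space, whereas an integer sent
   as the two instances "4" and "2" raises ParseError, because the
   combined value "4, 2" leaves trailing characters after the first
   number.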
695 If parsing fails - including when calling another algorithm - the 696 entire header field's value MUST be discarded. This is intentionally 697 strict, to improve interoperability and safety, and specifications 698 referencing this document cannot loosen this requirement. 700 Note that this has the effect of discarding any header field with 701 non-ASCII characters in input_string. 703 4.2.1. Parsing a Dictionary from Text 705 Given an ASCII string input_string, return a mapping of (identifier, 706 item). input_string is modified to remove the parsed value. 708 1. Let dictionary be an empty, unordered mapping. 710 2. While input_string is not empty: 712 1. Let this_key be the result of running Parse Identifier from 713 Text (Section 4.2.8) with input_string. 715 2. If dictionary already contains this_key, fail parsing. 717 3. Consume a "=" from input_string; if none is present, fail 718 parsing. 720 4. Let this_value be the result of running Parse Item from Text 721 (Section 4.2.5) with input_string. 723 5. Add key this_key with value this_value to dictionary. 725 6. Discard any leading OWS from input_string. 727 7. If input_string is empty, return dictionary. 729 8. Consume a COMMA from input_string; if no comma is present, 730 fail parsing. 732 9. Discard any leading OWS from input_string. 734 10. If input_string is empty, fail parsing. 736 3. No structured data has been found; fail parsing. 738 4.2.2. Parsing a List from Text 740 Given an ASCII string input_string, return a list of items. 741 input_string is modified to remove the parsed value. 743 1. Let items be an empty array. 745 2. While input_string is not empty: 747 1. Let item be the result of running Parse Item from Text 748 (Section 4.2.5) with input_string. 750 2. Append item to items. 752 3. Discard any leading OWS from input_string. 754 4. If input_string is empty, return items. 756 5. Consume a COMMA from input_string; if no comma is present, 757 fail parsing. 759 6. 
Discard any leading OWS from input_string. 761 7. If input_string is empty, fail parsing. 763 3. No structured data has been found; fail parsing. 765 4.2.3. Parsing a Parameterised List from Text 767 Given an ASCII string input_string, return a list of parameterised 768 identifiers. input_string is modified to remove the parsed value. 770 1. Let items be an empty array. 772 2. While input_string is not empty: 774 1. Let item be the result of running Parse Parameterised 775 Identifier from Text (Section 4.2.4) with input_string. 777 2. Append item to items. 779 3. Discard any leading OWS from input_string. 781 4. If input_string is empty, return items. 783 5. Consume a COMMA from input_string; if no comma is present, 784 fail parsing. 786 6. Discard any leading OWS from input_string. 788 7. If input_string is empty, fail parsing. 790 3. No structured data has been found; fail parsing. 792 4.2.4. Parsing a Parameterised Identifier from Text 794 Given an ASCII string input_string, return a identifier with an 795 mapping of parameters. input_string is modified to remove the parsed 796 value. 798 1. Let primary_identifier be the result of Parsing a Identifier from 799 Text (Section 4.2.8) from input_string. 801 2. Let parameters be an empty, unordered mapping. 803 3. In a loop: 805 1. Discard any leading OWS from input_string. 807 2. If the first character of input_string is not ";", exit the 808 loop. 810 3. Consume a ";" character from the beginning of input_string. 812 4. Discard any leading OWS from input_string. 814 5. let param_name be the result of Parsing a Identifier from 815 Text (Section 4.2.8) from input_string. 817 6. If param_name is already present in parameters, fail parsing. 819 7. Let param_value be a null value. 821 8. If the first character of input_string is "=": 823 1. Consume the "=" character at the beginning of 824 input_string. 826 2. Let param_value be the result of Parsing an Item from 827 Text (Section 4.2.5) from input_string. 829 9. 
Insert (param_name, param_value) into parameters. 831 4. Return the tuple (primary_identifier, parameters). 833 4.2.5. Parsing an Item from Text 835 Given an ASCII string input_string, return an item. input_string is 836 modified to remove the parsed value. 838 1. Discard any leading OWS from input_string. 840 2. If the first character of input_string is a "-" or a DIGIT, 841 process input_string as a number (Section 4.2.6) and return the 842 result. 844 3. If the first character of input_string is a DQUOTE, process 845 input_string as a string (Section 4.2.7) and return the result. 847 4. If the first character of input_string is "*", process 848 input_string as binary content (Section 4.2.9) and return the 849 result. 851 5. Otherwise, fail parsing. 853 4.2.6. Parsing a Number from Text 855 NOTE: This algorithm parses both Integers Section 3.5 and Floats 856 Section 3.6, and returns the corresponding structure. 858 1. Let type be "integer". 860 2. Let sign be 1. 862 3. Let input_number be an empty string. 864 4. If the first character of input_string is "-", remove it from 865 input_string and set sign to -1. 867 5. If input_string is empty, fail parsing. 869 6. If the first character of input_string is not a DIGIT, fail 870 parsing. 872 7. While input_string is not empty: 874 1. Let char be the result of removing the first character of 875 input_string. 877 2. If char is a DIGIT, append it to input_number. 879 3. Else, if type is "integer" and char is ".", append char to 880 input_number and set type to "float". 882 4. Otherwise, fail parsing. 884 5. If type is "integer" and input_number contains more than 19 885 characters, fail parsing. 887 6. If type is "float" and input_number contains more than 16 888 characters, fail parsing. 890 8. If type is "integer", parse input_number as an integer and let 891 output_number be the result. 893 9. Otherwise: 895 1. If the final character of input_number is ".", fail parsing. 897 2. 
            Parse input_number as a float and let output_number be the
            result.

   10.  Return the product of output_number and sign.

   Parsers that encounter an integer outside the range defined in
   Section 3.5 MUST fail parsing.  Therefore, the value
   "9223372036854775808" would be invalid.  Likewise, values that do not
   conform to the ABNF in Section 3.5 are invalid, and MUST fail
   parsing.

   Parsers that encounter a float that does not conform to the ABNF in
   Section 3.6 MUST fail parsing.

4.2.7.  Parsing a String from Text

   Given an ASCII string input_string, return an unquoted string.
   input_string is modified to remove the parsed value.

   1.  Let output_string be an empty string.

   2.  If the first character of input_string is not DQUOTE, fail
       parsing.

   3.  Discard the first character of input_string.

   4.  While input_string is not empty:

       1.  Let char be the result of removing the first character of
           input_string.

       2.  If char is a backslash ("\"):

           1.  If input_string is now empty, fail parsing.

           2.  Else:

               1.  Let next_char be the result of removing the first
                   character of input_string.

               2.  If next_char is not DQUOTE or "\", fail parsing.

               3.  Append next_char to output_string.

       3.  Else, if char is DQUOTE, return output_string.

       4.  Else, append char to output_string.

   5.  Otherwise (the end of input_string was reached without a closing
       DQUOTE), fail parsing.

4.2.8.  Parsing an Identifier from Text

   Given an ASCII string input_string, return an identifier.
   input_string is modified to remove the parsed value.

   1.  If the first character of input_string is not lcalpha, fail
       parsing.

   2.  Let output_string be an empty string.

   3.  While input_string is not empty:

       1.  Let char be the result of removing the first character of
           input_string.

       2.  If char is not one of lcalpha, DIGIT, "_", "-", "*" or "/":

           1.  Prepend char to input_string.

           2.  Return output_string.

       3.  Append char to output_string.

   4.  Return output_string.

4.2.9.
Parsing Binary Content from Text

   Given an ASCII string input_string, return binary content.
   input_string is modified to remove the parsed value.

   1.  If the first character of input_string is not "*", fail parsing.

   2.  Discard the first character of input_string.

   3.  Let b64_content be the result of removing content of input_string
       up to but not including the first instance of the character "*".
       If there is not a "*" character before the end of input_string,
       fail parsing.

   4.  Consume the "*" character at the beginning of input_string.

   5.  Let binary_content be the result of Base 64 Decoding [RFC4648]
       b64_content, synthesising padding if necessary (note the
       requirements about recipient behaviour below).

   6.  Return binary_content.

   As per [RFC4648], Section 3.2, it is RECOMMENDED that parsers reject
   encoded data that is not properly padded, although this might not be
   possible in some base64 implementations.

   As per [RFC4648], Section 3.5, it is RECOMMENDED that parsers fail on
   encoded data that has non-zero pad bits, although this might not be
   possible in some base64 implementations.

   This specification does not relax the requirements in [RFC4648],
   Sections 3.1 and 3.3; therefore, parsers MUST fail on characters
   outside the base64 alphabet, and on line feeds in encoded data.

5.  IANA Considerations

   This draft has no actions for IANA.

6.  Security Considerations

   The size of most types defined by Structured Headers is not limited;
   as a result, extremely large header fields could be an attack vector
   (e.g., for resource consumption).  Most HTTP implementations limit
   the size of individual header fields as well as the overall header
   block size to mitigate such attacks.

   It is possible for parties with the ability to inject new HTTP header
   fields to change the meaning of a Structured Header.
   In some circumstances, this will cause parsing to fail, but it is not
   possible to reliably fail in all such circumstances.

7.  References

7.1.  Normative References

   [RFC0020]  Cerf, V., "ASCII format for network interchange", STD 80,
              RFC 20, DOI 10.17487/RFC0020, October 1969.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008.

   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Message Syntax and Routing",
              RFC 7230, DOI 10.17487/RFC7230, June 2014.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

7.2.  Informative References

   [IEEE754]  IEEE, "IEEE Standard for Floating-Point Arithmetic",
              IEEE 754-2008, DOI 10.1109/IEEESTD.2008.4610935,
              ISBN 978-0-7381-5752-8, August 2008.

              See also http://grouper.ieee.org/groups/754/ [6].

   [RFC7231]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Semantics and Content", RFC 7231,
              DOI 10.17487/RFC7231, June 2014.

   [RFC7540]  Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
              DOI 10.17487/RFC7540, May 2015.

   [RFC7541]  Peon, R. and H. Ruellan, "HPACK: Header Compression for
              HTTP/2", RFC 7541, DOI 10.17487/RFC7541, May 2015.

7.3.
URIs

   [1] https://lists.w3.org/Archives/Public/ietf-http-wg/

   [2] https://httpwg.github.io/

   [3] https://github.com/httpwg/http-extensions/labels/header-structure

   [4] https://github.com/httpwg/structured-header-tests

   [5] https://github.com/httpwg/wiki/wiki/Structured-Headers

Appendix A.  Changes

A.1.  Since draft-ietf-httpbis-header-structure-05

   o  Reorganise specification to separate parsing out.

   o  Allow referencing specs to use ABNF.

   o  Define serialisation algorithms.

   o  Refine relationship between ABNF, parsing and serialisation
      algorithms.

A.2.  Since draft-ietf-httpbis-header-structure-04

   o  Remove identifiers from item.

   o  Remove most limits on sizes.

   o  Refine number parsing.

A.3.  Since draft-ietf-httpbis-header-structure-03

   o  Strengthen language around failure handling.

A.4.  Since draft-ietf-httpbis-header-structure-02

   o  Split Numbers into Integers and Floats.

   o  Define number parsing.

   o  Tighten up binary parsing and give it an explicit end delimiter.

   o  Clarify that mappings are unordered.

   o  Allow zero-length strings.

   o  Improve string parsing algorithm.

   o  Improve limits in algorithms.

   o  Require parsers to combine header fields before processing.

   o  Throw an error on trailing garbage.

A.5.  Since draft-ietf-httpbis-header-structure-01

   o  Replaced with draft-nottingham-structured-headers.

A.6.  Since draft-ietf-httpbis-header-structure-00

   o  Added signed 64bit integer type.

   o  Drop UTF8, and settle on BCP137 ::EmbeddedUnicodeChar for
      h1-unicode-string.

   o  Change h1_blob delimiter to ":" since "'" is valid t_char.

Authors' Addresses

   Mark Nottingham
   Fastly

   Email: mnot@mnot.net
   URI:   https://www.mnot.net/

   Poul-Henning Kamp
   The Varnish Cache Project

   Email: phk@varnish-cache.org
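
Non-normative illustration

   The number and string parsing algorithms (Sections 4.2.6 and 4.2.7)
   can be sketched in Python roughly as follows.  This is only a sketch
   under stated assumptions, not a reference implementation: the names
   ParseError, parse_number and parse_string are hypothetical, the
   integer range is assumed to be that of a signed 64-bit integer
   (consistent with the "9223372036854775808" example), and
   parse_string returns the remaining input alongside the value rather
   than modifying its argument.  Because step 7.4 of Section 4.2.6
   fails on any character that is neither a DIGIT nor the first ".",
   parse_number expects its whole input to be the number.

```python
import string


class ParseError(ValueError):
    """Hypothetical error raised wherever the text says "fail parsing"."""


DIGITS = set(string.digits)


def parse_number(input_string):
    """Section 4.2.6, step by step; returns an int or a float."""
    num_type = "integer"                      # step 1
    sign = 1                                  # step 2
    input_number = ""                         # step 3
    if input_string.startswith("-"):          # step 4
        input_string = input_string[1:]
        sign = -1
    if not input_string:                      # step 5
        raise ParseError("empty number")
    if input_string[0] not in DIGITS:         # step 6
        raise ParseError("number must start with a DIGIT")
    while input_string:                       # step 7
        char, input_string = input_string[0], input_string[1:]
        if char in DIGITS:
            input_number += char
        elif num_type == "integer" and char == ".":
            input_number += char
            num_type = "float"
        else:
            raise ParseError("unexpected character in number")
        if num_type == "integer" and len(input_number) > 19:
            raise ParseError("integer has more than 19 characters")
        if num_type == "float" and len(input_number) > 16:
            raise ParseError("float has more than 16 characters")
    if num_type == "integer":                 # step 8
        output_number = int(input_number)
    else:                                     # step 9
        if input_number.endswith("."):
            raise ParseError("float must not end with '.'")
        output_number = float(input_number)
    result = output_number * sign             # step 10
    # Assumed range check: signed 64-bit integers only.
    if num_type == "integer" and not -2**63 <= result <= 2**63 - 1:
        raise ParseError("integer out of range")
    return result


def parse_string(input_string):
    """Section 4.2.7; returns (output_string, remaining_input)."""
    output_string = ""                        # step 1
    if not input_string.startswith('"'):      # step 2
        raise ParseError("string must start with DQUOTE")
    input_string = input_string[1:]           # step 3
    while input_string:                       # step 4
        char, input_string = input_string[0], input_string[1:]
        if char == "\\":
            if not input_string:
                raise ParseError("dangling backslash")
            next_char, input_string = input_string[0], input_string[1:]
            if next_char not in ('"', "\\"):
                raise ParseError("invalid escape")
            output_string += next_char
        elif char == '"':
            return output_string, input_string
        else:
            output_string += char
    raise ParseError("unterminated string")   # step 5
```

   For example, parse_number("1.5") yields a float, while
   parse_number("1.") and parse_string('"abc') both raise ParseError.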