idnits 2.17.1

draft-ietf-httpbis-header-structure-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([2], [3], [4], [5], [1]),
     which it shouldn't.  Please replace those with straight textual mentions
     of the documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 23, 2018) is 2013 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 1182

  -- Looks like a reference, but probably isn't: '2' on line 1184

  -- Looks like a reference, but probably isn't: '3' on line 1186

  -- Looks like a reference, but probably isn't: '4' on line 1188

  -- Looks like a reference, but probably isn't: '5' on line 1190

  == Missing Reference: 'RFCxxxx' is mentioned on line 222, but not defined

  == Missing Reference: 'RFC3986' is mentioned on line 240, but not defined

  -- Looks like a reference, but probably isn't: '6' on line 1155

  ** Obsolete normative reference: RFC 7230 (Obsoleted by RFC 9110, RFC 9112)

  -- Obsolete informational reference (is this intentional?): RFC 7231
     (Obsoleted by RFC 9110)

  -- Obsolete informational reference (is this intentional?): RFC 7540
     (Obsoleted by RFC 9113)

     Summary: 2 errors (**), 0 flaws (~~), 3 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2 HTTP                                                       M. Nottingham
3 Internet-Draft                                                    Fastly
4 Intended status: Standards Track                               P-H. Kamp
5 Expires: April 26, 2019                        The Varnish Cache Project
6                                                         October 23, 2018

8                       Structured Headers for HTTP
9                  draft-ietf-httpbis-header-structure-08

11 Abstract

13    This document describes a set of data types and algorithms associated
14    with them that are intended to make it easier and safer to define and
15    handle HTTP header fields.  It is intended for use by new
16    specifications of HTTP header fields as well as revisions of existing
17    header field specifications when doing so does not cause
18    interoperability issues.
20 Note to Readers 22 _RFC EDITOR: please remove this section before publication_ 24 Discussion of this draft takes place on the HTTP working group 25 mailing list (ietf-http-wg@w3.org), which is archived at 26 https://lists.w3.org/Archives/Public/ietf-http-wg/ [1]. 28 Working Group information can be found at https://httpwg.github.io/ 29 [2]; source code and issues list for this draft can be found at 30 https://github.com/httpwg/http-extensions/labels/header-structure 31 [3]. 33 Tests for implementations are collected at https://github.com/httpwg/ 34 structured-header-tests [4]. 36 Implementations are tracked at https://github.com/httpwg/wiki/wiki/ 37 Structured-Headers [5]. 39 Status of This Memo 41 This Internet-Draft is submitted in full conformance with the 42 provisions of BCP 78 and BCP 79. 44 Internet-Drafts are working documents of the Internet Engineering 45 Task Force (IETF). Note that other groups may also distribute 46 working documents as Internet-Drafts. The list of current Internet- 47 Drafts is at https://datatracker.ietf.org/drafts/current/. 49 Internet-Drafts are draft documents valid for a maximum of six months 50 and may be updated, replaced, or obsoleted by other documents at any 51 time. It is inappropriate to use Internet-Drafts as reference 52 material or to cite them other than as "work in progress." 54 This Internet-Draft will expire on April 26, 2019. 56 Copyright Notice 58 Copyright (c) 2018 IETF Trust and the persons identified as the 59 document authors. All rights reserved. 61 This document is subject to BCP 78 and the IETF Trust's Legal 62 Provisions Relating to IETF Documents 63 (https://trustee.ietf.org/license-info) in effect on the date of 64 publication of this document. Please review these documents 65 carefully, as they describe your rights and restrictions with respect 66 to this document. 
Code Components extracted from this document must
67    include Simplified BSD License text as described in Section 4.e of
68    the Trust Legal Provisions and are provided without warranty as
69    described in the Simplified BSD License.

71 Table of Contents

73    1.  Introduction
74      1.1.  Notational Conventions
75    2.  Defining New Structured Headers
76    3.  Structured Header Data Types
77      3.1.  Dictionaries
78      3.2.  Lists
79      3.3.  Parameterised Lists
80      3.4.  Items
81      3.5.  Integers
82      3.6.  Floats
83      3.7.  Strings
84      3.8.  Identifiers
85      3.9.  Byte Sequences
86      3.10. Booleans
87    4.  Structured Headers in HTTP/1
88      4.1.  Serialising Structured Headers into HTTP/1
89      4.2.  Parsing HTTP/1 Header Fields into Structured Headers
90    5.  IANA Considerations
91    6.  Security Considerations
92    7.  References
93      7.1.  Normative References
94      7.2.  Informative References
95      7.3.  URIs
96    Appendix A.  Frequently Asked Questions
97      A.1.  Why not JSON?
98      A.2.  Structured Headers don't "fit" my data.
99    Appendix B.  Changes
100      B.1.  Since draft-ietf-httpbis-header-structure-07
101      B.2.  Since draft-ietf-httpbis-header-structure-06
102      B.3.  Since draft-ietf-httpbis-header-structure-05
103      B.4.  Since draft-ietf-httpbis-header-structure-04
104      B.5.  Since draft-ietf-httpbis-header-structure-03
105      B.6.  Since draft-ietf-httpbis-header-structure-02
106      B.7.  Since draft-ietf-httpbis-header-structure-01
107      B.8.  Since draft-ietf-httpbis-header-structure-00
108    Authors' Addresses

110 1.  Introduction

112    Specifying the syntax of new HTTP header fields is an onerous task;
113    even with the guidance in [RFC7231], Section 8.3.1, there are many
114    decisions - and pitfalls - for a prospective HTTP header field
115    author.

117    Once a header field is defined, bespoke parsers and serialisers often
118    need to be written, because each header has slightly different
119    handling of what looks like common syntax.

121    This document introduces a set of common data structures for use in
122    HTTP header field values to address these problems.  In particular,
123    it defines a generic, abstract model for header field values, along
124    with a concrete serialisation for expressing that model in HTTP/1
125    [RFC7230] header fields.

127    HTTP headers that are defined as "Structured Headers" use the types
128    defined in this specification to define their syntax and basic
129    handling rules, thereby simplifying both their definition by
130    specification writers and handling by implementations.

132    Additionally, future versions of HTTP can define alternative
133    serialisations of the abstract model of these structures, allowing
134    headers that use it to be transmitted more efficiently without being
135    redefined.
137    Note that it is not a goal of this document to redefine the syntax of
138    existing HTTP headers; the mechanisms described herein are only
139    intended to be used with headers that explicitly opt into them.

141    To specify a header field that is a Structured Header, see Section 2.

143    Section 3 defines a number of abstract data types that can be used in
144    Structured Headers.

146    Those abstract types can be serialised into and parsed from textual
147    headers - such as those used in HTTP/1 - using the algorithms
148    described in Section 4.

150 1.1.  Notational Conventions

152    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
153    "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
154    "OPTIONAL" in this document are to be interpreted as described in BCP
155    14 [RFC2119] [RFC8174] when, and only when, they appear in all
156    capitals, as shown here.

158    This document uses the Augmented Backus-Naur Form (ABNF) notation of
159    [RFC5234], including the VCHAR, SP, DIGIT, ALPHA and DQUOTE rules
160    from that document.  It also includes the OWS rule from [RFC7230].

162    This document uses algorithms to specify parsing and serialisation
163    behaviours, and ABNF to illustrate expected syntax in HTTP/1-style
164    header fields.

166    For parsing from HTTP/1 header fields, implementations MUST follow
167    the algorithms, but MAY vary in implementation so long as the
168    behaviours are indistinguishable from the specified behaviour.  If
169    there is disagreement between the parsing algorithms and ABNF, the
170    specified algorithms take precedence.  In some places, the algorithms
171    are "greedy" with whitespace, but this should not affect conformance.

173    For serialisation to HTTP/1 header fields, the ABNF illustrates the
174    range of acceptable wire representations with as much fidelity as
175    possible, and the algorithms define the recommended way to produce
176    them.
Implementations MAY vary from the specified behaviour so long
177    as the output still matches the ABNF.

179 2.  Defining New Structured Headers

181    To define an HTTP header as a Structured Header, its specification
182    needs to:

184    o  Reference this specification.  Recipients and generators of the
185       header need to know that the requirements of this document are in
186       effect.

188    o  Specify the header field's allowed syntax for values, in terms of
189       the types described in Section 3, along with their associated
190       semantics.  Syntax definitions are encouraged to use the ABNF
191       rules beginning with "sh-" defined in this specification.

193    o  Specify any additional constraints upon the syntax of the
194       structures used, as well as the consequences when those
195       constraints are violated.  When Structured Headers parsing fails,
196       the header is discarded (see Section 4.2); in most situations,
197       header-specific constraints should do likewise.

199    Note that a header field definition cannot relax the requirements of
200    a structure or its processing because doing so would preclude
201    handling by generic software; they can only add additional
202    constraints.

204    For example:

206    # Foo-Example Header

208    The Foo-Example HTTP header field conveys information about how
209    much Foo the message has.

211    Foo-Example is a Structured Header [RFCxxxx].  Its value MUST be a
212    dictionary ([RFCxxxx], Section Y.Y).  Its ABNF is:

214      Foo-Example = sh-dictionary

216    The dictionary MUST contain:

218    * Exactly one member whose key is "foo", and whose value is an
219      integer ([RFCxxxx], Section Y.Y), indicating the number of foos
220      in the message.
221    * Exactly one member whose key is "barUrls", and whose value is a
222      string ([RFCxxxx], Section Y.Y), conveying the Bar URLs for the
223      message.  See below for processing requirements.

225    If the parsed header field does not contain both, it MUST be
226    ignored.
228 "foo" MUST be between 0 and 10, inclusive; other values MUST cause 229 the header to be ignored. 231 "barUrls" contains a space-separated list of URI-references 232 ([RFC3986], Section 4.1): 234 barURLs = URI-reference *( 1*SP URI-reference ) 236 If a member of barURLs is not a valid URI-reference, it MUST cause 237 that value to be ignored. 239 If a member of barURLs is a relative reference ([RFC3986], 240 Section 4.2), it MUST be resolved ([RFC3986], Section 5) before 241 being used. 243 This specification defines minimums for the length or number of 244 various structures supported by Structured Headers implementations. 245 It does not specify maximum sizes in most cases, but header authors 246 should be aware that HTTP implementations do impose various limits on 247 the size of individual header fields, the total number of fields, 248 and/or the size of the entire header block. 250 3. Structured Header Data Types 252 This section defines the abstract value types that can be composed 253 into Structured Headers. The ABNF provided represents the on-wire 254 format in HTTP/1. 256 3.1. Dictionaries 258 Dictionaries are ordered maps of key-value pairs, where the keys are 259 identifiers (Section 3.8) and the values are items (Section 3.4). 260 There can be one or more members, and keys are required to be unique. 262 Implementations MUST provide access to dictionaries both by index and 263 by key. Specifications MAY use either means of accessing the 264 members. 266 The ABNF for dictionaries in HTTP/1 headers is: 268 sh-dictionary = dict-member *( OWS "," OWS dict-member ) 269 dict-member = member-name "=" member-value 270 member-name = sh-identifier 271 member-value = sh-item 273 In HTTP/1, keys and values are separated by "=" (without whitespace), 274 and key/value pairs are separated by a comma with optional 275 whitespace. 
For example:

277      Example-DictHeader: en="Applepie", da=*w4ZibGV0w6ZydGU=*

279    Typically, a header field specification will define the semantics of
280    individual keys, as well as whether their presence is required or
281    optional.  Recipients MUST ignore keys that are undefined or unknown,
282    unless the header field's specification specifically disallows them.

284    Parsers MUST support dictionaries containing at least 1024 key/value
285    pairs.

287 3.2.  Lists

289    Lists are arrays of items (Section 3.4) with one or more members.

291    The ABNF for lists in HTTP/1 headers is:

293    sh-list     = list-member *( OWS "," OWS list-member )
294    list-member = sh-item

295    In HTTP/1, each member is separated by a comma and optional
296    whitespace.  For example, a header field whose value is defined as a
297    list of strings could look like:

299      Example-StrListHeader: "foo", "bar", "It was the best of times."

301    Header specifications can constrain the types of individual values if
302    necessary.

304    Parsers MUST support lists containing at least 1024 members.

306 3.3.  Parameterised Lists

308    Parameterised Lists are arrays of parameterised identifiers.

310    A parameterised identifier is an identifier (Section 3.8) with an
311    optional set of parameters, each parameter having an identifier and
312    an optional value that is an item (Section 3.4).  Ordering between
313    parameters is not significant, and duplicate parameters MUST cause
314    parsing to fail.

316    The ABNF for parameterised lists in HTTP/1 headers is:

318    sh-param-list = param-id *( OWS "," OWS param-id )
319    param-id      = sh-identifier *parameter
320    parameter     = OWS ";" OWS param-name [ "=" param-value ]
321    param-name    = sh-identifier
322    param-value   = sh-item

324    In HTTP/1, each param-id is separated by a comma and optional
325    whitespace (as in Lists), and the parameters are separated by
326    semicolons.
For example:

328      Example-ParamListHeader: abc_123;a=1;b=2; cdef_456, ghi;q="9";r="w"

330    Parsers MUST support parameterised lists containing at least 1024
331    members, and support members with at least 256 parameters.

333 3.4.  Items

335    An item can be an integer (Section 3.5), float (Section 3.6),
336    string (Section 3.7), identifier (Section 3.8), byte sequence
337    (Section 3.9), or Boolean (Section 3.10).

339    The ABNF for items in HTTP/1 headers is:

341    sh-item = sh-integer / sh-float / sh-string / sh-identifier / sh-binary
342              / sh-boolean

344 3.5.  Integers

346    Integers have a range of -9,223,372,036,854,775,808 to
347    9,223,372,036,854,775,807 inclusive (i.e., a 64-bit signed integer).

349    The ABNF for integers in HTTP/1 headers is:

351    sh-integer = ["-"] 1*19DIGIT

353    For example:

355      Example-IntegerHeader: 42

357 3.6.  Floats

359    Floats are integers with a fractional part, that can be stored as
360    IEEE 754 double precision numbers (binary64) ([IEEE754]).

362    The ABNF for floats in HTTP/1 headers is:

364    sh-float    = ["-"] (
365                 DIGIT "." 1*14DIGIT /
366                2DIGIT "." 1*13DIGIT /
367                3DIGIT "." 1*12DIGIT /
368                4DIGIT "." 1*11DIGIT /
369                5DIGIT "." 1*10DIGIT /
370                6DIGIT "." 1*9DIGIT /
371                7DIGIT "." 1*8DIGIT /
372                8DIGIT "." 1*7DIGIT /
373                9DIGIT "." 1*6DIGIT /
374               10DIGIT "." 1*5DIGIT /
375               11DIGIT "." 1*4DIGIT /
376               12DIGIT "." 1*3DIGIT /
377               13DIGIT "." 1*2DIGIT /
378               14DIGIT "." 1DIGIT )

380    For example, a header whose value is defined as a float could look
381    like:

383      Example-FloatHeader: 4.5

385 3.7.  Strings

387    Strings are zero or more printable ASCII [RFC0020] characters (i.e.,
388    the range 0x20 to 0x7E).  Note that this excludes tabs, newlines,
389    carriage returns, etc.
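
As a rough illustration of the limits in Sections 3.5 and 3.6 and of the character range just given for strings, the following Python sketch checks candidate values against them. The function names are hypothetical and are not defined by this document, and the regular expressions are one possible transcription of the sh-integer and sh-float ABNF rather than a normative parser.

```python
import re

# One possible transcription of the ABNF (an assumption, not normative text).
_INTEGER = re.compile(r"-?[0-9]{1,19}")
_FLOAT = re.compile(r"-?([0-9]{1,14})\.([0-9]{1,14})")


def is_sh_integer(text):
    """True if text matches sh-integer and fits a 64-bit signed integer."""
    if _INTEGER.fullmatch(text) is None:
        return False
    return -(2 ** 63) <= int(text) <= 2 ** 63 - 1


def is_sh_float(text):
    """True if text matches sh-float: 1-14 integer digits, at least one
    fractional digit, and no more than 15 digits in total."""
    match = _FLOAT.fullmatch(text)
    if match is None:
        return False
    return len(match.group(1)) + len(match.group(2)) <= 15


def is_string_value(value):
    """True if every character is printable ASCII (0x20 to 0x7E)."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in value)
```

The float check relies on the observation that every alternative in the sh-float rule allows at most 15 digits overall, so a single length test can stand in for the fourteen alternatives.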
391 The ABNF for strings in HTTP/1 headers is: 393 sh-string = DQUOTE *(chr) DQUOTE 394 chr = unescaped / escaped 395 unescaped = %x20-21 / %x23-5B / %x5D-7E 396 escaped = "\" ( DQUOTE / "\" ) 398 In HTTP/1 headers, strings are delimited with double quotes, using a 399 backslash ("\") to escape double quotes and backslashes. For 400 example: 402 Example-StringHeader: "hello world" 404 Note that strings only use DQUOTE as a delimiter; single quotes do 405 not delimit strings. Furthermore, only DQUOTE and "\" can be 406 escaped; other sequences MUST cause parsing to fail. 408 Unicode is not directly supported in this document, because it causes 409 a number of interoperability issues, and - with few exceptions - 410 header values do not require it. 412 When it is necessary for a field value to convey non-ASCII string 413 content, a byte sequence (Section 3.9) SHOULD be specified, along 414 with a character encoding (preferably UTF-8). 416 Parsers MUST support strings with at least 1024 characters. 418 3.8. Identifiers 420 Identifiers are short textual identifiers; their abstract model is 421 identical to their expression in the textual HTTP serialisation. 422 Parsers MUST support identifiers with at least 64 characters. 424 The ABNF for identifiers in HTTP/1 headers is: 426 sh-identifier = lcalpha *( lcalpha / DIGIT / "_" / "-"/ "*" / "/" ) 427 lcalpha = %x61-7A ; a-z 429 Note that identifiers can only contain lowercase letters. 431 3.9. Byte Sequences 433 Byte sequences can be conveyed in Structured Headers. 435 The ABNF for a byte sequence in HTTP/1 headers is: 437 sh-binary = "*" *(base64) "*" 438 base64 = ALPHA / DIGIT / "+" / "/" / "=" 439 In HTTP/1 headers, a byte sequence is delimited with asterisks and 440 encoded using base64 ([RFC4648], Section 4). For example: 442 Example-BinaryHdr: *cHJldGVuZCB0aGlzIGlzIGJpbmFyeSBjb250ZW50Lg==* 444 Parsers MUST support byte sequences with at least 16384 octets after 445 decoding. 447 3.10. 
Booleans

449    Boolean values can be conveyed in Structured Headers.

451    The ABNF for a Boolean in HTTP/1 headers is:

453    sh-boolean = "!" boolean
454    boolean    = "T" / "F"

456    In HTTP/1 headers, a Boolean is indicated with a leading "!"
457    character.  For example:

459      Example-BoolHdr: !T

461 4.  Structured Headers in HTTP/1

463    This section defines how to serialise and parse Structured Headers in
464    HTTP/1 textual header fields, and protocols compatible with them
465    (e.g., in HTTP/2 [RFC7540] before HPACK [RFC7541] is applied).

467 4.1.  Serialising Structured Headers into HTTP/1

469    Given a structure defined in this specification:

471    1.  If the structure is a dictionary, return the result of
472        Serialising a Dictionary (Section 4.1.1).

474    2.  If the structure is a list, return the result of Serialising a
475        List (Section 4.1.2).

477    3.  If the structure is a parameterised list, return the result of
478        Serialising a Parameterised List (Section 4.1.3).

480    4.  If the structure is an item, return the result of Serialising an
481        Item (Section 4.1.4).

483    5.  Otherwise, fail serialisation.

485 4.1.1.  Serialising a Dictionary

487    Given a dictionary as input:

489    1.  Let output be an empty string.

491    2.  For each member mem of input:

493        1.  Let name be the result of applying Serialising an Identifier
494            (Section 4.1.8) to mem's member-name.

496        2.  Append name to output.

498        3.  Append "=" to output.

500        4.  Let value be the result of applying Serialising an Item
501            (Section 4.1.4) to mem's member-value.

503        5.  Append value to output.

505        6.  If more members remain in input:

507            1.  Append a COMMA to output.

509            2.  Append a single WS to output.

511    3.  Return output.

513 4.1.2.  Serialising a List

515    Given a list as input:

517    1.  Let output be an empty string.

519    2.  For each member mem of input:

521        1.  Let value be the result of applying Serialising an Item
522            (Section 4.1.4) to mem.

524        2.  Append value to output.

526        3.  If more members remain in input:

528            1.  Append a COMMA to output.
530            2.  Append a single WS to output.

532    3.  Return output.

534 4.1.3.  Serialising a Parameterised List

536    Given a parameterised list as input:

538    1.  Let output be an empty string.

540    2.  For each member mem of input:

542        1.  Let id be the result of applying Serialising an Identifier
543            (Section 4.1.8) to mem's identifier.

545        2.  Append id to output.

547        3.  For each parameter in mem's parameters:

549            1.  Append ";" to output.

551            2.  Let name be the result of applying Serialising an
552                Identifier (Section 4.1.8) to parameter's param-name.

554            3.  Append name to output.

556            4.  If parameter has a param-value:

558                1.  Let value be the result of applying Serialising an
559                    Item (Section 4.1.4) to parameter's param-value.

561                2.  Append "=" to output.

563                3.  Append value to output.

565        4.  If more members remain in input:

567            1.  Append a COMMA to output.

569            2.  Append a single WS to output.

571    3.  Return output.

573 4.1.4.  Serialising an Item

575    Given an item as input:

577    1.  If input is a type other than an integer, float, string,
578        identifier, byte sequence, or Boolean, fail serialisation.

580    2.  If input is an integer, return the result of applying Serialising
581        an Integer (Section 4.1.5) to input.

583    3.  If input is a float, return the result of applying Serialising a
584        Float (Section 4.1.6) to input.

586    4.  If input is a string, return the result of applying Serialising a
587        String (Section 4.1.7) to input.

589    5.  If input is an identifier, return the result of applying
590        Serialising an Identifier (Section 4.1.8) to input.

592    6.  If input is a Boolean, return the result of applying Serialising
593        a Boolean (Section 4.1.10) to input.

595    7.  Otherwise, return the result of applying Serialising a Byte
596        Sequence (Section 4.1.9) to input.

598 4.1.5.  Serialising an Integer

600    Given an integer as input:

602    1.  If input is not an integer in the range of
603        -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
604        inclusive, fail serialisation.

606    2.  Let output be an empty string.

608    3.
If input is less than (but not equal to) 0, append "-" to output.

610    4.  Append input's numeric value represented in base 10 using only
611        decimal digits to output.

613    5.  Return output.

615 4.1.6.  Serialising a Float

617    Given a float as input:

619    1.  If input is not an IEEE 754 double precision number, fail
620        serialisation.

622    2.  Let output be an empty string.

624    3.  If input is less than (but not equal to) 0, append "-" to output.

626    4.  Append input's integer component represented in base 10 using
627        only decimal digits to output; if it is zero, append "0".

629    5.  Append "." to output.

631    6.  Append input's decimal component represented in base 10 using
632        only decimal digits to output; if it is zero, append "0".

634    7.  Return output.

636 4.1.7.  Serialising a String

638    Given a string as input:

640    1.  If input is not a sequence of characters, or contains characters
641        outside the range allowed by VCHAR or SP, fail serialisation.

643    2.  Let output be an empty string.

645    3.  Append DQUOTE to output.

647    4.  For each character char in input:

649        1.  If char is "\" or DQUOTE:

651            1.  Append "\" to output.

653        2.  Append char to output, using ASCII encoding [RFC0020].

655    5.  Append DQUOTE to output.

657    6.  Return output.

659 4.1.8.  Serialising an Identifier

661    Given an identifier as input:

663    1.  If input is not a sequence of characters, or contains characters
664        not allowed in Section 3.8, fail serialisation.

666    2.  Let output be an empty string.

668    3.  Append input to output, using ASCII encoding [RFC0020].

670    4.  Return output.

672 4.1.9.  Serialising a Byte Sequence

674    Given a byte sequence as input:

676    1.  If input is not a sequence of bytes, fail serialisation.

678    2.  Let output be an empty string.

680    3.  Append "*" to output.

682    4.  Append the result of base64-encoding input as per [RFC4648],
683        Section 4, taking account of the requirements below.

685    5.  Append "*" to output.

687    6.  Return output.
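
The string and byte sequence steps in Sections 4.1.7 and 4.1.9 could be sketched in Python roughly as follows. This is a minimal sketch, not a reference implementation: the function names and the choice of ValueError to signal "fail serialisation" are assumptions made for illustration. It relies on the standard library's base64.b64encode, which emits the RFC 4648 Section 4 alphabet with "=" padding.

```python
import base64


def serialise_string(value):
    """Sketch of Section 4.1.7: quote a string, escaping DQUOTE and backslash."""
    # Step 1: only VCHAR and SP (0x20 to 0x7E) are allowed.
    if not isinstance(value, str) or any(
        not 0x20 <= ord(ch) <= 0x7E for ch in value
    ):
        raise ValueError("not a serialisable string")
    output = '"'
    for ch in value:
        if ch in ('\\', '"'):
            output += "\\"  # escape character goes before the char itself
        output += ch
    return output + '"'


def serialise_byte_sequence(data):
    """Sketch of Section 4.1.9: base64-encode and delimit with asterisks."""
    if not isinstance(data, (bytes, bytearray)):
        raise ValueError("not a sequence of bytes")
    encoded = base64.b64encode(bytes(data)).decode("ascii")
    return "*" + encoded + "*"


print(serialise_string("hello world"))
print(serialise_byte_sequence(b"pretend this is binary content."))
```

Encoding the bytes of "pretend this is binary content." should reproduce the Example-BinaryHdr value shown in Section 3.9.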
689 The encoded data is required to be padded with "=", as per [RFC4648], 690 Section 3.2. 692 Likewise, encoded data SHOULD have pad bits set to zero, as per 693 [RFC4648], Section 3.5, unless it is not possible to do so due to 694 implementation constraints. 696 4.1.10. Serialising a Boolean 698 Given a Boolean as input: 700 1. If input is not a boolean, fail serialisation. 702 2. Let output be an empty string. 704 3. Append "!" to output. 706 4. If input is true, append "T" to output. 708 5. If input is false, append "F" to output. 710 6. Return output. 712 4.2. Parsing HTTP/1 Header Fields into Structured Headers 714 When a receiving implementation parses textual HTTP header fields 715 (e.g., in HTTP/1 or HTTP/2) that are known to be Structured Headers, 716 it is important that care be taken, as there are a number of edge 717 cases that can cause interoperability or even security problems. 718 This section specifies the algorithm for doing so. 720 Given an ASCII string input_string that represents the chosen 721 header's field-value, and header_type, one of "dictionary", "list", 722 "param-list", or "item", return the parsed header value. 724 1. Discard any leading OWS from input_string. 726 2. If header_type is "dictionary", let output be the result of 727 Parsing a Dictionary from Text (Section 4.2.1). 729 3. If header_type is "list", let output be the result of Parsing a 730 List from Text (Section 4.2.2). 732 4. If header_type is "param-list", let output be the result of 733 Parsing a Parameterised List from Text (Section 4.2.3). 735 5. Otherwise, let output be the result of Parsing an Item from Text 736 (Section 4.2.5). 738 6. Discard any leading OWS from input_string. 740 7. If input_string is not empty, fail parsing. 742 8. Otherwise, return output. 
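
The top-level parsing steps above might be sketched in Python as shown here. Everything named in the sketch is an assumption for illustration: the ParseError class, the convention that each sub-parser returns a (value, remaining input) pair, and the parsers table are hypothetical, and parse_boolean_item is only a stand-in item parser that understands the Boolean syntax of Section 3.10.

```python
OWS = " \t"  # optional whitespace: SP and HTAB


class ParseError(ValueError):
    """Raised when parsing fails; the whole header value is then discarded."""


def parse_boolean_item(input_string):
    # Stand-in item parser (hypothetical): handles only "!T" and "!F".
    if input_string.startswith("!T"):
        return True, input_string[2:]
    if input_string.startswith("!F"):
        return False, input_string[2:]
    raise ParseError("not a Boolean item")


def parse_structured_header(input_string, header_type, parsers):
    # Step 1: discard leading OWS.
    input_string = input_string.lstrip(OWS)
    # Steps 2-5: dispatch on header_type, falling back to the item parser.
    if header_type in ("dictionary", "list", "param-list"):
        parser = parsers[header_type]
    else:
        parser = parsers["item"]
    output, remaining = parser(input_string)
    # Step 6: discard OWS left after the parsed value.
    remaining = remaining.lstrip(OWS)
    # Step 7: anything left over means the field is malformed.
    if remaining:
        raise ParseError("trailing characters after structured value")
    # Step 8: return the parsed value.
    return output


print(parse_structured_header("  !T \t", "item", {"item": parse_boolean_item}))
```

Passing the sub-parsers in as a table keeps the sketch self-contained; a fuller implementation would presumably supply parsers for dictionaries, lists and parameterised lists in the same shape.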
744 When generating input_string, parsers MUST combine all instances of 745 the target header field into one comma-separated field-value, as per 746 [RFC7230], Section 3.2.2; this assures that the header is processed 747 correctly. 749 For Lists, Parameterised Lists and Dictionaries, this has the effect 750 of correctly concatenating all instances of the header field. 752 Strings split across multiple header instances will have 753 unpredictable results, because comma(s) and whitespace inserted upon 754 combination will become part of the string output by the parser. 755 Since concatenation might be done by an upstream intermediary, the 756 results are not under the control of the serialiser or the parser. 758 Integers, Floats and Byte Sequences cannot be split across multiple 759 headers because the inserted commas will cause parsing to fail. 761 If parsing fails - including when calling another algorithm - the 762 entire header field's value MUST be discarded. This is intentionally 763 strict, to improve interoperability and safety, and specifications 764 referencing this document cannot loosen this requirement. 766 Note that this has the effect of discarding any header field with 767 non-ASCII characters in input_string. 769 4.2.1. Parsing a Dictionary from Text 771 Given an ASCII string input_string, return an ordered map of 772 (identifier, item). input_string is modified to remove the parsed 773 value. 775 1. Let dictionary be an empty, ordered map. 777 2. While input_string is not empty: 779 1. Let this_key be the result of running Parse Identifier from 780 Text (Section 4.2.8) with input_string. 782 2. If dictionary already contains this_key, fail parsing. 784 3. Consume the first character of input_string; if it is not 785 "=", fail parsing. 787 4. Let this_value be the result of running Parse Item from Text 788 (Section 4.2.5) with input_string. 790 5. Add key this_key with value this_value to dictionary. 792 6. Discard any leading OWS from input_string. 
794 7. If input_string is empty, return dictionary. 796 8. Consume the first character of input_string; if it is not 797 COMMA, fail parsing. 799 9. Discard any leading OWS from input_string. 801 10. If input_string is empty, fail parsing. 803 3. No structured data has been found; fail parsing. 805 4.2.2. Parsing a List from Text 807 Given an ASCII string input_string, return a list of items. 808 input_string is modified to remove the parsed value. 810 1. Let items be an empty array. 812 2. While input_string is not empty: 814 1. Let item be the result of running Parse Item from Text 815 (Section 4.2.5) with input_string. 817 2. Append item to items. 819 3. Discard any leading OWS from input_string. 821 4. If input_string is empty, return items. 823 5. Consume the first character of input_string; if it is not 824 COMMA, fail parsing. 826 6. Discard any leading OWS from input_string. 828 7. If input_string is empty, fail parsing. 830 3. No structured data has been found; fail parsing. 832 4.2.3. Parsing a Parameterised List from Text 834 Given an ASCII string input_string, return a list of parameterised 835 identifiers. input_string is modified to remove the parsed value. 837 1. Let items be an empty array. 839 2. While input_string is not empty: 841 1. Let item be the result of running Parse Parameterised 842 Identifier from Text (Section 4.2.4) with input_string. 844 2. Append item to items. 846 3. Discard any leading OWS from input_string. 848 4. If input_string is empty, return items. 850 5. Consume the first character of input_string; if it is not 851 COMMA, fail parsing. 853 6. Discard any leading OWS from input_string. 855 7. If input_string is empty, fail parsing. 857 3. No structured data has been found; fail parsing. 859 4.2.4. Parsing a Parameterised Identifier from Text 861 Given an ASCII string input_string, return an identifier with an 862 unordered map of parameters. input_string is modified to remove the 863 parsed value. 865 1. 
Let primary_identifier be the result of Parsing an Identifier
866        from Text (Section 4.2.8) from input_string.

868    2.  Let parameters be an empty, unordered map.

870    3.  In a loop:

872        1.  If the first character of input_string is not ";", exit the
873            loop.

875        2.  Consume a ";" character from the beginning of input_string.

877        3.  Discard any leading OWS from input_string.

879        4.  Let param_name be the result of Parsing an Identifier from
880            Text (Section 4.2.8) from input_string.

882        5.  If param_name is already present in parameters, fail parsing.

884        6.  Let param_value be a null value.

886        7.  If the first character of input_string is "=":

888            1.  Consume the "=" character at the beginning of
889                input_string.

891            2.  Let param_value be the result of Parsing an Item from
892                Text (Section 4.2.5) from input_string.

894        8.  Insert (param_name, param_value) into parameters.

896    4.  Return the tuple (primary_identifier, parameters).

898 4.2.5.  Parsing an Item from Text

900    Given an ASCII string input_string, return an item.  input_string is
901    modified to remove the parsed value.

903    1.  Discard any leading OWS from input_string.

905    2.  If the first character of input_string is a "-" or a DIGIT,
906        process input_string as a number (Section 4.2.6) and return the
907        result.

909    3.  If the first character of input_string is a DQUOTE, process
910        input_string as a string (Section 4.2.7) and return the result.

912    4.  If the first character of input_string is "*", process
913        input_string as a byte sequence (Section 4.2.9) and return the
914        result.

916    5.  If the first character of input_string is "!", process
917        input_string as a Boolean (Section 4.2.10) and return the result.

919    6.  If the first character of input_string is a lcalpha, process
920        input_string as an identifier (Section 4.2.8) and return the
921        result.

923    7.  Otherwise, fail parsing.

925 4.2.6.
             Parsing a Number from Text

927     NOTE: This algorithm parses both Integers (Section 3.5) and Floats
928     (Section 3.6), and returns the corresponding structure.

930     1.  Let type be "integer".

932     2.  Let sign be 1.

934     3.  Let input_number be an empty string.

936     4.  If the first character of input_string is "-", remove it from
937         input_string and set sign to -1.

939     5.  If input_string is empty, fail parsing.

941     6.  If the first character of input_string is not a DIGIT, fail
942         parsing.

944     7.  While input_string is not empty:

946         1.  Let char be the result of removing the first character of
947             input_string.

949         2.  If char is a DIGIT, append it to input_number.

951         3.  Else, if type is "integer" and char is ".", append char to
952             input_number and set type to "float".

954         4.  Otherwise, prepend char to input_string, and exit the loop.

956         5.  If type is "integer" and input_number contains more than 19
957             characters, fail parsing.

959         6.  If type is "float" and input_number contains more than 16
960             characters, fail parsing.

962     8.  If type is "integer":

964         1.  Parse input_number as an integer and let output_number be
965             the product of the result and sign.

967         2.  If output_number is outside the range defined in
968             Section 3.5, fail parsing.

970     9.  Otherwise:

972         1.  If the final character of input_number is ".", fail parsing.

974         2.  Parse input_number as a float and let output_number be the
975             product of the result and sign.

977     10. Return output_number.

979  4.2.7.  Parsing a String from Text

981     Given an ASCII string input_string, return an unquoted string.
982     input_string is modified to remove the parsed value.

984     1.  Let output_string be an empty string.

986     2.  If the first character of input_string is not DQUOTE, fail
987         parsing.

989     3.  Discard the first character of input_string.

991     4.  While input_string is not empty:

993         1.  Let char be the result of removing the first character of
994             input_string.

996         2.  If char is a backslash ("\"):

998             1.
                    If input_string is now empty, fail parsing.

1000            2.  Else:

1002                1.  Let next_char be the result of removing the first
1003                    character of input_string.

1005                2.  If next_char is not DQUOTE or "\", fail parsing.

1007                3.  Append next_char to output_string.

1009        3.  Else, if char is DQUOTE, return output_string.

1011        4.  Else, if char is in the range %x00-1f or %x7f (i.e., is not
1012            in VCHAR or SP), fail parsing.

1014        5.  Else, append char to output_string.

1016    5.  Reached the end of input_string without finding a closing DQUOTE;
1017        fail parsing.

1019 4.2.8.  Parsing an Identifier from Text

1021    Given an ASCII string input_string, return an identifier.
1022    input_string is modified to remove the parsed value.

1024    1.  If the first character of input_string is not lcalpha, fail
1025        parsing.

1027    2.  Let output_string be an empty string.

1029    3.  While input_string is not empty:

1031        1.  Let char be the result of removing the first character of
1032            input_string.

1034        2.  If char is not one of lcalpha, DIGIT, "_", "-", "*" or "/":

1036            1.  Prepend char to input_string.

1038            2.  Return output_string.

1040        3.  Append char to output_string.

1042    4.  Return output_string.

1044 4.2.9.  Parsing a Byte Sequence from Text

1046    Given an ASCII string input_string, return a byte sequence.
1047    input_string is modified to remove the parsed value.

1049    1.  If the first character of input_string is not "*", fail parsing.

1051    2.  Discard the first character of input_string.

1053    3.  Let b64_content be the result of removing content of input_string
1054        up to but not including the first instance of the character "*".
1055        If there is not a "*" character before the end of input_string,
1056        fail parsing.

1058    4.  Consume the "*" character at the beginning of input_string.

1060    5.  If b64_content contains a character not included in ALPHA, DIGIT,
1061        "+", "/" and "=", fail parsing.

1063    6.
            Let binary_content be the result of Base 64 Decoding [RFC4648]
1064        b64_content, synthesising padding if necessary (note the
1065        requirements about recipient behaviour below).

1067    7.  Return binary_content.

1069    Because some implementations of base64 do not allow rejection of
1070    encoded data that is not properly "=" padded (see [RFC4648],
1071    Section 3.2), parsers SHOULD NOT fail when padding is not present,
1072    unless they cannot be configured to do so.

1074    Because some implementations of base64 do not allow rejection of
1075    encoded data that has non-zero pad bits (see [RFC4648], Section 3.5),
1076    parsers SHOULD NOT fail when they are present, unless they cannot be
1077    configured to do so.

1079    This specification does not relax the requirements in [RFC4648],
1080    Sections 3.1 and 3.3; therefore, parsers MUST fail on characters
1081    outside the base64 alphabet, and on line feeds in encoded data.

1083 4.2.10.  Parsing a Boolean from Text

1085    Given an ASCII string input_string, return a Boolean. input_string is
1086    modified to remove the parsed value.

1088    1.  If the first character of input_string is not "!", fail parsing.

1090    2.  Discard the first character of input_string.

1092    3.  If the first character of input_string case-sensitively matches
1093        "T", discard the first character, and return true.

1095    4.  If the first character of input_string case-sensitively matches
1096        "F", discard the first character, and return false.

1098    5.  No value has matched; fail parsing.

1100 5.  IANA Considerations

1102    This draft has no actions for IANA.

1104 6.  Security Considerations

1106    The size of most types defined by Structured Headers is not limited;
1107    as a result, extremely large header fields could be an attack vector
1108    (e.g., for resource consumption). Most HTTP implementations limit
1109    the size of individual header fields as well as the overall
1110    header block size to mitigate such attacks.
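
   The following non-normative Python sketch shows one way a recipient
   might apply such a size limit before running the byte sequence
   algorithm of Section 4.2.9. The names MAX_FIELD_SIZE, ParseError and
   parse_byte_sequence are invented for this illustration and are not
   defined by this specification; the value 8192 is an arbitrary
   assumption, as is the choice to accept "=" only as trailing padding.

```python
import base64
import binascii
import string

# Hypothetical limit for this sketch; the specification defines no such value.
MAX_FIELD_SIZE = 8192

# ALPHA, DIGIT, "+", "/" and "=" (step 5 of Section 4.2.9).
B64_CHARS = frozenset(string.ascii_letters + string.digits + "+/=")


class ParseError(Exception):
    """Illustrative stand-in for "fail parsing"."""


def parse_byte_sequence(input_string):
    """Return (byte_sequence, remaining_input), or raise ParseError."""
    # Assumed size guard, applied before any parsing work is done.
    if len(input_string) > MAX_FIELD_SIZE:
        raise ParseError("input exceeds the configured size limit")
    # Steps 1-2: the value must open with "*".
    if not input_string.startswith("*"):
        raise ParseError("missing opening '*'")
    # Steps 3-4: take everything up to the closing "*", then consume it.
    end = input_string.find("*", 1)
    if end == -1:
        raise ParseError("missing closing '*'")
    b64_content = input_string[1:end]
    remaining = input_string[end + 1:]
    # Step 5: reject characters outside the permitted set.
    if any(ch not in B64_CHARS for ch in b64_content):
        raise ParseError("character outside the base64 alphabet")
    # Step 6: synthesise padding. Accepting "=" only at the end is a
    # choice made by this sketch.
    data = b64_content.rstrip("=")
    if "=" in data:
        raise ParseError("'=' in the middle of the encoded data")
    try:
        binary_content = base64.b64decode(data + "=" * (-len(data) % 4))
    except binascii.Error as exc:
        raise ParseError(str(exc)) from exc
    # Step 7: return the decoded bytes and the unparsed remainder.
    return binary_content, remaining
```

   In this sketch, parse_byte_sequence("*aGVsbG8*") yields the bytes
   "hello" and an empty remainder even though the "=" padding was
   omitted, while input without a closing "*" raises ParseError.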

1112    It is possible for parties with the ability to inject new HTTP header
1113    fields to change the meaning of a Structured Header. In some
1114    circumstances, this will cause parsing to fail, but it is not
1115    possible to reliably fail in all such circumstances.

1117 7.  References

1119 7.1.  Normative References

1121    [RFC0020]  Cerf, V., "ASCII format for network interchange", STD 80,
1122               RFC 20, DOI 10.17487/RFC0020, October 1969,
1123               .

1125    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1126               Requirement Levels", BCP 14, RFC 2119,
1127               DOI 10.17487/RFC2119, March 1997,
1128               .

1130    [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
1131               Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006,
1132               .

1134    [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
1135               Specifications: ABNF", STD 68, RFC 5234,
1136               DOI 10.17487/RFC5234, January 2008,
1137               .

1139    [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
1140               Protocol (HTTP/1.1): Message Syntax and Routing",
1141               RFC 7230, DOI 10.17487/RFC7230, June 2014,
1142               .

1144    [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
1145               2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
1146               May 2017, .

1148 7.2.  Informative References

1150    [IEEE754]  IEEE, "IEEE Standard for Floating-Point Arithmetic",
1151               IEEE 754-2008, DOI 10.1109/IEEESTD.2008.4610935,
1152               ISBN 978-0-7381-5752-8, August 2008,
1153               .

1155               See also http://grouper.ieee.org/groups/754/ [6].

1157    [RFC7231]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
1158               Protocol (HTTP/1.1): Semantics and Content", RFC 7231,
1159               DOI 10.17487/RFC7231, June 2014,
1160               .

1162    [RFC7493]  Bray, T., Ed., "The I-JSON Message Format", RFC 7493,
1163               DOI 10.17487/RFC7493, March 2015,
1164               .

1166    [RFC7540]  Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
1167               Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
1168               DOI 10.17487/RFC7540, May 2015,
1169               .

1171    [RFC7541]  Peon, R. and H.
                   Ruellan, "HPACK: Header Compression for
1172               HTTP/2", RFC 7541, DOI 10.17487/RFC7541, May 2015,
1173               .

1175    [RFC8259]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
1176               Interchange Format", STD 90, RFC 8259,
1177               DOI 10.17487/RFC8259, December 2017,
1178               .

1180 7.3.  URIs

1182    [1] https://lists.w3.org/Archives/Public/ietf-http-wg/

1184    [2] https://httpwg.github.io/

1186    [3] https://github.com/httpwg/http-extensions/labels/header-structure

1188    [4] https://github.com/httpwg/structured-header-tests

1190    [5] https://github.com/httpwg/wiki/wiki/Structured-Headers

1192 Appendix A.  Frequently Asked Questions

1194 A.1.  Why not JSON?

1196    Earlier proposals for structured headers were based upon JSON
1197    [RFC8259]. However, constraining its use to make it suitable for
1198    HTTP header fields required senders and recipients to implement
1199    specific additional handling.

1201    For example, JSON has specification issues around large numbers and
1202    objects with duplicate members. Although advice for avoiding these
1203    issues is available (e.g., [RFC7493]), it cannot be relied upon.

1205    Likewise, JSON strings are by default Unicode strings, which have a
1206    number of potential interoperability issues (e.g., in comparison).
1207    Although implementers can be advised to avoid non-ASCII content where
1208    unnecessary, this is difficult to enforce.

1210    Another example is JSON's ability to nest content to arbitrary
1211    depths. Since the resulting memory commitment might be unsuitable
1212    (e.g., in embedded and other limited server deployments), it's
1213    necessary to limit it in some fashion; however, existing JSON
1214    implementations have no such limits, and even if a limit is
1215    specified, it's likely that some header field definition will find a
1216    need to violate it.
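
   As a non-normative illustration of the issues described above
   (duplicate members, large numbers and nesting depth), the following
   sketch uses the json module from the Python standard library. It
   shows the behaviour of that one implementation only; other JSON
   parsers may behave differently, which is the interoperability
   concern. The helper nesting_depth is a name invented for this
   example.

```python
import json

# Duplicate members: this parser silently keeps the last value. Other
# parsers may keep the first value or reject the object.
duplicate = json.loads('{"a": 1, "a": 2}')

# Large numbers: Python keeps the integer exactly, but a parser that stores
# numbers as IEEE 754 doubles cannot tell 2**53 + 1 apart from 2**53.
exact = json.loads("9007199254740993")
as_double = float(exact)


def nesting_depth(value):
    """Count how many lists are nested along the first-element path."""
    depth = 0
    while isinstance(value, list):
        depth += 1
        value = value[0] if value else None
    return depth


# Nesting: the JSON grammar itself sets no limit on depth, so 200 levels
# of arrays parse without complaint here.
nested = json.loads("[" * 200 + "]" * 200)

print(duplicate, exact, int(as_double), nesting_depth(nested))
```

   A header format that defines its own parsing algorithm can fix each
   of these behaviours in the specification instead of leaving them to
   the implementation.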

1218    Because of JSON's broad adoption and implementation, it is difficult
1219    to impose such additional constraints across all implementations;
1220    some deployments would fail to enforce them, thereby harming
1221    interoperability.

1223    Since a major goal for Structured Headers is to improve
1224    interoperability and simplify implementation, these concerns led to a
1225    format that requires a dedicated parser and serialiser.

1227    Additionally, there were widely shared feelings that JSON doesn't
1228    "look right" in HTTP headers.

1230 A.2.  Structured Headers don't "fit" my data.

1232    Structured Headers intentionally limits the complexity of data
1233    structures, to ensure that they can be processed in a performant
1234    manner with little overhead. This means that work is necessary to
1235    fit some data types into them.

1237    Sometimes, this can be achieved by creating limited substructures in
1238    values, and/or using more than one header. For example, consider:

1240    Example-Thing: name="Widget", cost=89.2, descriptions="foo bar"
1241    Example-Description: foo; url="https://example.net"; context=123,
1242                         bar; url="https://example.org"; context=456

1244    Since the description contains a list of key/value pairs, we use a
1245    Parameterised List to represent them, with the identifier for each
1246    item in the list used to identify it in the "descriptions" member of
1247    the Example-Thing header.

1249    When specifying more than one header, it's important to remember to
1250    describe what a processor's behaviour should be when one of the
1251    headers is missing.

1253    If you need to fit arbitrarily complex data into a header, Structured
1254    Headers is probably a poor fit for your use case.

1256 Appendix B.  Changes

1258    _RFC Editor: Please remove this section before publication._

1260 B.1.  Since draft-ietf-httpbis-header-structure-07

1262    o  Make Dictionaries ordered mappings (#659).

1264    o  Changed "binary content" to "byte sequence" to align with Infra
1265       specification (#671).

1267    o  Changed "mapping" to "map" for #671.

1269    o  Don't fail if byte sequences aren't "=" padded (#658).

1271    o  Add Booleans (#683).

1273    o  Allow identifiers in items again (#629).

1275    o  Disallowed whitespace before items (#703).

1277    o  Explain the consequences of splitting a string across multiple
1278       headers (#686).

1280 B.2.  Since draft-ietf-httpbis-header-structure-06

1282    o  Add a FAQ.

1284    o  Allow non-zero pad bits.

1286    o  Explicitly check for integers that violate constraints.

1288 B.3.  Since draft-ietf-httpbis-header-structure-05

1290    o  Reorganise specification to separate parsing out.

1292    o  Allow referencing specs to use ABNF.

1294    o  Define serialisation algorithms.

1296    o  Refine relationship between ABNF, parsing and serialisation
1297       algorithms.

1299 B.4.  Since draft-ietf-httpbis-header-structure-04

1301    o  Remove identifiers from item.

1303    o  Remove most limits on sizes.

1305    o  Refine number parsing.

1307 B.5.  Since draft-ietf-httpbis-header-structure-03

1309    o  Strengthen language around failure handling.

1311 B.6.  Since draft-ietf-httpbis-header-structure-02

1313    o  Split Numbers into Integers and Floats.

1315    o  Define number parsing.

1317    o  Tighten up binary parsing and give it an explicit end delimiter.

1319    o  Clarify that mappings are unordered.

1321    o  Allow zero-length strings.

1323    o  Improve string parsing algorithm.

1325    o  Improve limits in algorithms.

1327    o  Require parsers to combine header fields before processing.

1329    o  Throw an error on trailing garbage.

1331 B.7.  Since draft-ietf-httpbis-header-structure-01

1333    o  Replaced with draft-nottingham-structured-headers.

1335 B.8.  Since draft-ietf-httpbis-header-structure-00

1337    o  Added signed 64bit integer type.

1339    o  Drop UTF8, and settle on BCP137 ::EmbeddedUnicodeChar for h1-
1340       unicode-string.

1342    o  Change h1_blob delimiter to ":" since "'" is valid t_char

1344 Authors' Addresses

1346    Mark Nottingham
1347    Fastly

1349    Email: mnot@mnot.net
1350    URI:   https://www.mnot.net/

1351    Poul-Henning Kamp
1352    The Varnish Cache Project

1354    Email: phk@varnish-cache.org