idnits 2.17.1

draft-ietf-httpbis-header-structure-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([2], [3], [4], [5], [1]),
     which it shouldn't.  Please replace those with straight textual mentions
     of the documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (May 30, 2018) is 2151 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 857

  -- Looks like a reference, but probably isn't: '2' on line 859

  -- Looks like a reference, but probably isn't: '3' on line 861

  -- Looks like a reference, but probably isn't: '4' on line 863

  -- Looks like a reference, but probably isn't: '5' on line 865

  == Missing Reference: 'RFCxxxx' is mentioned on line 203, but not defined

  -- Looks like a reference, but probably isn't: '6' on line 838

  ** Obsolete normative reference: RFC 7230 (Obsoleted by RFC 9110, RFC 9112)

  -- Obsolete informational reference (is this intentional?): RFC 7231
     (Obsoleted by RFC 9110)

  -- Obsolete informational reference (is this intentional?): RFC 7540
     (Obsoleted by RFC 9113)

     Summary: 2 errors (**), 0 flaws (~~), 2 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2  HTTP                                                       M. Nottingham
3  Internet-Draft                                                    Fastly
4  Intended status: Standards Track                               P-H. Kamp
5  Expires: December 1, 2018                      The Varnish Cache Project
6                                                              May 30, 2018

8                        Structured Headers for HTTP
9                   draft-ietf-httpbis-header-structure-05

11  Abstract

13     This document describes a set of data types and parsing algorithms
14     associated with them that are intended to make it easier and safer to
15     define and handle HTTP header fields.  It is intended for use by new
16     specifications of HTTP header fields as well as revisions of existing
17     header field specifications when doing so does not cause
18     interoperability issues.
20 Note to Readers 22 _RFC EDITOR: please remove this section before publication_ 24 Discussion of this draft takes place on the HTTP working group 25 mailing list (ietf-http-wg@w3.org), which is archived at 26 https://lists.w3.org/Archives/Public/ietf-http-wg/ [1]. 28 Working Group information can be found at https://httpwg.github.io/ 29 [2]; source code and issues list for this draft can be found at 30 https://github.com/httpwg/http-extensions/labels/header-structure 31 [3]. 33 Tests for implementations are collected at https://github.com/httpwg/ 34 structured-header-tests [4]. 36 Implementations are tracked at https://github.com/httpwg/wiki/wiki/ 37 Structured-Headers [5]. 39 Status of This Memo 41 This Internet-Draft is submitted in full conformance with the 42 provisions of BCP 78 and BCP 79. 44 Internet-Drafts are working documents of the Internet Engineering 45 Task Force (IETF). Note that other groups may also distribute 46 working documents as Internet-Drafts. The list of current Internet- 47 Drafts is at https://datatracker.ietf.org/drafts/current/. 49 Internet-Drafts are draft documents valid for a maximum of six months 50 and may be updated, replaced, or obsoleted by other documents at any 51 time. It is inappropriate to use Internet-Drafts as reference 52 material or to cite them other than as "work in progress." 54 This Internet-Draft will expire on December 1, 2018. 56 Copyright Notice 58 Copyright (c) 2018 IETF Trust and the persons identified as the 59 document authors. All rights reserved. 61 This document is subject to BCP 78 and the IETF Trust's Legal 62 Provisions Relating to IETF Documents 63 (https://trustee.ietf.org/license-info) in effect on the date of 64 publication of this document. Please review these documents 65 carefully, as they describe your rights and restrictions with respect 66 to this document. 
Code Components extracted from this document must
67     include Simplified BSD License text as described in Section 4.e of
68     the Trust Legal Provisions and are provided without warranty as
69     described in the Simplified BSD License.

71  Table of Contents

73     1.  Introduction
74       1.1.  Notational Conventions
75     2.  Defining New Structured Headers
76     3.  Parsing Textual Header Fields
77     4.  Structured Header Data Types
78       4.1.  Dictionaries
79       4.2.  Lists
80       4.3.  Parameterised Lists
81       4.4.  Items
82       4.5.  Integers
83       4.6.  Floats
84       4.7.  Strings
85       4.8.  Identifiers
86       4.9.  Binary Content
87     5.  IANA Considerations
88     6.  Security Considerations
89     7.  References
90       7.1.  Normative References
91       7.2.  Informative References
92       7.3.  URIs
93     Appendix A.  Changes
94       A.1.  Since draft-ietf-httpbis-header-structure-04
95       A.2.  Since draft-ietf-httpbis-header-structure-03
96       A.3.  Since draft-ietf-httpbis-header-structure-02
97       A.4.  Since draft-ietf-httpbis-header-structure-01
98       A.5.
Since draft-ietf-httpbis-header-structure-00
99     Authors' Addresses

101  1.  Introduction

103     Specifying the syntax of new HTTP header fields is an onerous task;
104     even with the guidance in [RFC7231], Section 8.3.1, there are many
105     decisions - and pitfalls - for a prospective HTTP header field
106     author.

108     Once a header field is defined, bespoke parsers for it often need to
109     be written, because each header has slightly different handling of
110     what looks like common syntax.

112     This document introduces a set of common data structures for use in
113     HTTP header field values to address these problems.  In particular,
114     it defines a generic, abstract model for header field values, along
115     with a concrete serialisation for expressing that model in textual
116     HTTP headers, as used by HTTP/1 [RFC7230] and HTTP/2 [RFC7540].

118     HTTP headers that are defined as "Structured Headers" use the types
119     defined in this specification to define their syntax and basic
120     handling rules, thereby simplifying both their definition and
121     parsing.

123     Additionally, future versions of HTTP can define alternative
124     serialisations of the abstract model of these structures, allowing
125     headers that use it to be transmitted more efficiently without being
126     redefined.

128     Note that it is not a goal of this document to redefine the syntax of
129     existing HTTP headers; the mechanisms described herein are only
130     intended to be used with headers that explicitly opt into them.

132     To specify a header field that is a Structured Header, see Section 2.

134     Section 4 defines a number of abstract data types that can be used in
135     Structured Headers.  Dictionaries and lists are only usable at the
136     "top" level, while the remaining types can appear at the
137     top level or inside those structures.
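As an informal illustration of the abstract model described above, the following Python sketch writes the data types as type aliases. It is not part of this specification; the alias names (Item, Identifier, Dictionary, ItemList, Parameters, ParameterisedList, TopLevel) and the example values are assumptions chosen for illustration only.

```python
from typing import Dict, List, Optional, Tuple, Union

# Illustrative aliases only; none of these names are defined by the draft.

# An item is an integer, float, string, or binary content (Section 4.4).
Item = Union[int, float, str, bytes]

# Identifiers are short lowercase textual tokens (Section 4.8).
Identifier = str

# Dictionaries map identifiers to items (Section 4.1).
Dictionary = Dict[Identifier, Item]

# Lists are arrays of items (Section 4.2).
ItemList = List[Item]

# A parameter has an identifier and an optional item value (Section 4.3).
Parameters = Dict[Identifier, Optional[Item]]
ParameterisedList = List[Tuple[Identifier, Parameters]]

# Any of the above can appear at the top level of a header value.
TopLevel = Union[Dictionary, ItemList, ParameterisedList, Item]

# Hypothetical values, loosely modelled on the examples in Section 4.
example_dictionary: Dictionary = {"foo": 1.23, "en": "Applepie"}
example_list: ItemList = ["foo", "bar", "It was the best of times."]
example_param_list: ParameterisedList = [
    ("abc_123", {"a": 1, "b": 2, "c": None}),  # "c" has no value
    ("def_456", {}),                           # no parameters at all
]
```

Modelling a valueless parameter as None is one possible choice; it mirrors the "null value" used by the parameter parsing algorithm in Section 4.3.2.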
139     Those abstract types can be serialised into textual headers - such as
140     those used in HTTP/1 and HTTP/2 - using the algorithms described in
141     Section 3.

143  1.1.  Notational Conventions

145     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
146     "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
147     "OPTIONAL" in this document are to be interpreted as described in BCP
148     14 [RFC2119] [RFC8174] when, and only when, they appear in all
149     capitals, as shown here.

151     This document uses the Augmented Backus-Naur Form (ABNF) notation of
152     [RFC5234], including the DIGIT, ALPHA and DQUOTE rules from that
153     document.  It also includes the OWS rule from [RFC7230].

155     This document uses algorithms to specify normative parsing
156     behaviours, and ABNF to illustrate the on-wire format expected.
157     Implementations MUST follow the normative algorithms, but MAY vary in
158     implementation so long as the behaviours are indistinguishable from
159     specified behaviour.  If there is disagreement between the algorithms
160     and ABNF, the specified algorithms take precedence.

162  2.  Defining New Structured Headers

164     An HTTP header that uses the structures in this specification needs to
165     be defined to do so explicitly; recipients and generators need to
166     know that the requirements of this document are in effect.  The
167     simplest way to do that is by referencing this document in its
168     definition.

170     The field's definition will also need to specify the field-value's
171     allowed syntax, in terms of the types described in Section 4, along
172     with their associated semantics.

174     A header field definition cannot relax or otherwise modify the
175     requirements of this specification, or change the nature of its data
176     structures; doing so would preclude handling by generic software.
178 However, header field authors are encouraged to clearly state 179 additional constraints upon the syntax, as well as the consequences 180 when those constraints are violated. When Structured Headers parsing 181 fails, the header is discarded (see Section 3); in most situations, 182 header-specific constraints should do likewise. 184 Such constraints could include additional structure inside those 185 defined here (e.g., a list of URLs [RFC3986] inside a string). 187 For example: 189 # Foo-Example Header 191 The Foo-Example HTTP header field conveys information about how 192 much Foo the message has. 194 Foo-Example is a Structured Header [RFCxxxx]. Its value MUST be a 195 dictionary ([RFCxxxx], Section Y.Y). 197 The dictionary MUST contain: 199 * Exactly one member whose key is "foo", and whose value is an 200 integer ([RFCxxxx], Section Y.Y), indicating the number of foos in 201 the message. 202 * Exactly one member whose key is "barUrls", and whose value is a 203 string ([RFCxxxx], Section Y.Y), conveying the Bar URLs for the 204 message. See below for processing requirements. 206 If the parsed header field does not contain both, it MUST be ignored. 208 "foo" MUST be between 0 and 10, inclusive; other values MUST cause 209 the header to be ignored. 211 "barUrls" contains a space-separated list of URI-references 212 ([RFC3986], Section 4.1): 214 barURLs = URI-reference *( 1*SP URI-reference ) 216 If a member of barURLs is not a valid URI-reference, it MUST cause 217 that value to be ignored. 219 If a member of barURLs is a relative reference ([RFC3986], 220 Section 4.2), it MUST be resolved ([RFC3986], Section 5) before being 221 used. 223 This specification defines minimums for the length or number of 224 various structures supported by Structured Headers implementations. 
225 It does not specify maximum sizes in most cases, but header authors 226 should be aware that HTTP implementations do impose various limits on 227 the size of individual header fields, the total number of fields, 228 and/or the size of the entire header block. 230 Note that specifications using Structured Headers do not re-specify 231 its ABNF or parsing algorithms; instead, they should be specified in 232 terms of its abstract data structures. 234 Also, empty header field values are not allowed, and therefore 235 parsing for them will fail. 237 3. Parsing Textual Header Fields 239 When a receiving implementation parses textual HTTP header fields 240 (e.g., in HTTP/1 or HTTP/2) that are known to be Structured Headers, 241 it is important that care be taken, as there are a number of edge 242 cases that can cause interoperability or even security problems. 243 This section specifies the algorithm for doing so. 245 Given an ASCII string input_string that represents the chosen 246 header's field-value, and header_type, one of "dictionary", "list", 247 "param-list", or "item", return the parsed header value. 249 1. Discard any leading OWS from input_string. 251 2. If header_type is "dictionary", let output be the result of 252 Parsing a Dictionary from Text (Section 4.1.1). 254 3. If header_type is "list", let output be the result of Parsing a 255 List from Text (Section 4.2.1). 257 4. If header_type is "param-list", let output be the result of 258 Parsing a Parameterised List from Text (Section 4.3.1). 260 5. Otherwise, let output be the result of Parsing an Item from Text 261 (Section 4.4.1). 263 6. Discard any leading OWS from input_string. 265 7. If input_string is not empty, fail parsing. 267 8. Otherwise, return output. 269 When generating input_string, parsers MUST combine all instances of 270 the target header field into one comma-separated field-value, as per 271 [RFC7230], Section 3.2.2; this assures that the header is processed 272 correctly. 
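The steps above can be sketched in Python as follows. This is a minimal, non-normative sketch under stated assumptions: the names ParseError, parse_integer_item, parse_list and parse_structured_header are hypothetical, only the "list" and "item" header types are handled, and the item parser is a simplified stand-in that accepts integers only rather than the full algorithm in Section 4.4.1.

```python
OWS = " \t"  # optional whitespace: SP / HTAB


class ParseError(ValueError):
    """Parsing failed; the caller is expected to discard the header."""


def parse_integer_item(s):
    """Simplified stand-in for item parsing: integers only.

    Returns (value, remaining_input).
    """
    s = s.lstrip(OWS)
    sign = 1
    if s.startswith("-"):
        sign, s = -1, s[1:]
    i = 0
    while i < len(s) and s[i] in "0123456789":
        i += 1
    if i == 0 or i > 19:
        raise ParseError("expected 1 to 19 digits")
    value = sign * int(s[:i])
    if not -2**63 <= value <= 2**63 - 1:
        raise ParseError("integer out of range")
    return value, s[i:]


def parse_list(s):
    """List parsing in the style of Section 4.2.1."""
    items = []
    while s:
        item, s = parse_integer_item(s)
        items.append(item)
        s = s.lstrip(OWS)
        if not s:
            return items, s
        if not s.startswith(","):
            raise ParseError("expected a comma")
        s = s[1:].lstrip(OWS)
        if not s:
            raise ParseError("trailing comma")
    raise ParseError("no structured data found")


def parse_structured_header(field_values, header_type):
    """Top-level algorithm; field_values holds every instance of the field."""
    # Combine all instances into one comma-separated field-value.
    input_string = ", ".join(field_values)
    if not input_string.isascii():
        raise ParseError("non-ASCII input")
    input_string = input_string.lstrip(OWS)             # step 1
    if header_type == "list":                           # step 3
        output, rest = parse_list(input_string)
    elif header_type == "item":                         # step 5
        output, rest = parse_integer_item(input_string)
    else:
        raise NotImplementedError(header_type)          # not in this sketch
    rest = rest.lstrip(OWS)                             # step 6
    if rest:                                            # step 7
        raise ParseError("unexpected trailing characters")
    return output                                       # step 8
```

With this sketch, parse_structured_header(["1, 2", "3"], "list") yields [1, 2, 3], whereas two instances "4" and "2" of an integer-valued header fail, because the comma inserted on combination is left over after the item is parsed.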
274 For Lists, Parameterised Lists and Dictionaries, this has the effect 275 of correctly concatenating all instances of the header field. 277 Strings can but SHOULD NOT be split across multiple header instances, 278 because comma(s) inserted upon combination will become part of the 279 string output by the parser. 281 Integers, Floats and Binary Content cannot be split across multiple 282 headers because the inserted commas will cause parsing to fail. 284 If parsing fails - including when calling another algorithm - the 285 entire header field's value MUST be discarded. This is intentionally 286 strict, to improve interoperability and safety, and specifications 287 referencing this document cannot loosen this requirement. 289 Note that this has the effect of discarding any header field with 290 non-ASCII characters in input_string. 292 4. Structured Header Data Types 294 This section defines the abstract value types that can be composed 295 into Structured Headers, along with the textual HTTP serialisations 296 of them. 298 4.1. Dictionaries 300 Dictionaries are unordered maps of key-value pairs, where the keys 301 are identifiers (Section 4.8) and the values are items (Section 4.4). 302 There can be one or more members, and keys are required to be unique. 304 In the textual HTTP serialisation, keys and values are separated by 305 "=" (without whitespace), and key/value pairs are separated by a 306 comma with optional whitespace. Duplicate keys MUST cause parsing to 307 fail. 309 dictionary = dict-member *( OWS "," OWS dict-member ) 310 dict-member = identifier "=" item 312 For example, a header field whose value is defined as a dictionary 313 could look like: 315 Example-DictHeader: foo=1.23, en="Applepie", da=*w4ZibGV0w6ZydGUK=* 317 Typically, a header field specification will define the semantics of 318 individual keys, as well as whether their presence is required or 319 optional. 
Recipients MUST ignore keys that are undefined or unknown, 320 unless the header field's specification specifically disallows them. 322 Parsers MUST support dictionaries containing at least 1024 key/value 323 pairs. 325 4.1.1. Parsing a Dictionary from Text 327 Given an ASCII string input_string, return a mapping of (identifier, 328 item). input_string is modified to remove the parsed value. 330 1. Let dictionary be an empty, unordered mapping. 332 2. While input_string is not empty: 334 1. Let this_key be the result of running Parse Identifier from 335 Text (Section 4.8.1) with input_string. 337 2. If dictionary already contains this_key, fail parsing. 339 3. Consume a "=" from input_string; if none is present, fail 340 parsing. 342 4. Let this_value be the result of running Parse Item from Text 343 (Section 4.4.1) with input_string. 345 5. Add key this_key with value this_value to dictionary. 347 6. Discard any leading OWS from input_string. 349 7. If input_string is empty, return dictionary. 351 8. Consume a COMMA from input_string; if no comma is present, 352 fail parsing. 354 9. Discard any leading OWS from input_string. 356 10. If input_string is empty, fail parsing. 358 3. No structured data has been found; fail parsing. 360 4.2. Lists 362 Lists are arrays of items (Section 4.4) with one or more members. 364 In the textual HTTP serialisation, each member is separated by a 365 comma and optional whitespace. 367 list = list-member *( OWS "," OWS list-member ) 368 list-member = item 370 For example, a header field whose value is defined as a list of 371 strings could look like: 373 Example-StrListHeader: "foo", "bar", "It was the best of times." 375 Parsers MUST support lists containing at least 1024 members. 377 4.2.1. Parsing a List from Text 379 Given an ASCII string input_string, return a list of items. 380 input_string is modified to remove the parsed value. 382 1. Let items be an empty array. 384 2. While input_string is not empty: 386 1. 
Let item be the result of running Parse Item from Text
387             (Section 4.4.1) with input_string.

389         2.  Append item to items.

391         3.  Discard any leading OWS from input_string.

393         4.  If input_string is empty, return items.

395         5.  Consume a COMMA from input_string; if no comma is present,
396             fail parsing.

398         6.  Discard any leading OWS from input_string.

400         7.  If input_string is empty, fail parsing.

402     3.  No structured data has been found; fail parsing.

404  4.3.  Parameterised Lists

406     Parameterised Lists are arrays of parameterised identifiers.

408     A parameterised identifier is an identifier (Section 4.8) with an
409     optional set of parameters, each parameter having an identifier and an
410     optional value that is an item (Section 4.4).  Ordering between
411     parameters is not significant, and duplicate parameters MUST cause
412     parsing to fail.

414     In the textual HTTP serialisation, each parameterised identifier is
415     separated by a comma and optional whitespace.  Parameters are
416     delimited from each other using semicolons (";"), and equals ("=")
417     delimits the parameter name from its value.

419     param-list = param-id *( OWS "," OWS param-id )
420     param-id   = identifier *( OWS ";" OWS identifier [ "=" item ] )

422     For example,

424     Example-ParamListHeader: abc_123;a=1;b=2; c, def_456, ghi;q="19";r=foo

425     Parsers MUST support parameterised lists containing at least 1024
426     members, and support members with at least 256 parameters.

428  4.3.1.  Parsing a Parameterised List from Text

430     Given an ASCII string input_string, return a list of parameterised
431     identifiers.  input_string is modified to remove the parsed value.

433     1.  Let items be an empty array.

435     2.  While input_string is not empty:

437         1.  Let item be the result of running Parse Parameterised
438             Identifier from Text (Section 4.3.2) with input_string.

440         2.  Append item to items.

442         3.  Discard any leading OWS from input_string.

444         4.  If input_string is empty, return items.

446         5.
Consume a COMMA from input_string; if no comma is present,
447             fail parsing.

449         6.  Discard any leading OWS from input_string.

451         7.  If input_string is empty, fail parsing.

453     3.  No structured data has been found; fail parsing.

455  4.3.2.  Parsing a Parameterised Identifier from Text

457     Given an ASCII string input_string, return an identifier with a
458     mapping of parameters.  input_string is modified to remove the parsed
459     value.

461     1.  Let primary_identifier be the result of Parsing an Identifier from
462         Text (Section 4.8.1) from input_string.

464     2.  Let parameters be an empty, unordered mapping.

466     3.  In a loop:

468         1.  Discard any leading OWS from input_string.

470         2.  If the first character of input_string is not ";", exit the
471             loop.

473         3.  Consume a ";" character from the beginning of input_string.

475         4.  Discard any leading OWS from input_string.

477         5.  Let param_name be the result of Parsing an Identifier from
478             Text (Section 4.8.1) from input_string.

480         6.  If param_name is already present in parameters, fail parsing.

482         7.  Let param_value be a null value.

484         8.  If the first character of input_string is "=":

486             1.  Consume the "=" character at the beginning of
487                 input_string.

489             2.  Let param_value be the result of Parsing an Item from
490                 Text (Section 4.4.1) from input_string.

492         9.  Insert (param_name, param_value) into parameters.

494     4.  Return the tuple (primary_identifier, parameters).

496  4.4.  Items

498     An item can be an integer (Section 4.5), float (Section 4.6),
499     string (Section 4.7), or binary content (Section 4.9).

501     item = integer / float / string / binary

503  4.4.1.  Parsing an Item from Text

505     Given an ASCII string input_string, return an item.  input_string is
506     modified to remove the parsed value.

508     1.  Discard any leading OWS from input_string.

510     2.  If the first character of input_string is a "-" or a DIGIT,
511         process input_string as a number (Section 4.5.1) and return the
512         result.

514     3.
If the first character of input_string is a DQUOTE, process
515         input_string as a string (Section 4.7.1) and return the result.

517     4.  If the first character of input_string is "*", process
518         input_string as binary content (Section 4.9.1) and return the
519         result.

521     5.  Otherwise, fail parsing.

523  4.5.  Integers

525     Abstractly, integers have a range of -9,223,372,036,854,775,808 to
526     9,223,372,036,854,775,807 inclusive (i.e., a 64-bit signed integer).

528     integer   = ["-"] 1*19DIGIT

530     Parsers that encounter an integer outside the range defined above
531     MUST fail parsing.  Therefore, the value "9223372036854775808" would
532     be invalid.  Likewise, values that do not conform to the ABNF above
533     are invalid, and MUST fail parsing.

535     For example, a header whose value is defined as an integer could look
536     like:

538     Example-IntegerHeader: 42

540  4.5.1.  Parsing a Number from Text

542     NOTE: This algorithm parses both Integers and Floats (Section 4.6), and
543     returns the corresponding structure.

545     1.  Let type be "integer".

547     2.  Let sign be 1.

549     3.  Let input_number be an empty string.

551     4.  If the first character of input_string is "-", remove it from
552         input_string and set sign to -1.

554     5.  If input_string is empty, fail parsing.

556     6.  If the first character of input_string is not a DIGIT, fail
557         parsing.

559     7.  While input_string is not empty:

561         1.  Let char be the result of removing the first character of
562             input_string.

564         2.  If char is a DIGIT, append it to input_number.

566         3.  Else, if type is "integer" and char is ".", append char to
567             input_number and set type to "float".

569         4.  Otherwise, fail parsing.

571         5.  If type is "integer" and input_number contains more than 19
572             characters, fail parsing.

574         6.  If type is "float" and input_number contains more than 16
575             characters, fail parsing.

577     8.  If type is "integer", parse input_number as an integer and let
578         output_number be the result.

580     9.  Otherwise:

582         1.
If the final character of input_number is ".", fail parsing.

584         2.  Parse input_number as a float and let output_number be the
585             result.

587     10. Return the product of output_number and sign.

589  4.6.  Floats

591     Abstractly, floats are integers with a fractional part, that can be
592     stored as IEEE 754 double precision numbers (binary64) ([IEEE754]).

594     The textual HTTP serialisation of floats allows a maximum of fifteen
595     digits between the integer and fractional part, with at least one
596     required on each side, along with an optional "-" indicating negative
597     numbers.

599     float    = ["-"] (
600                  DIGIT "." 1*14DIGIT /
601                 2DIGIT "." 1*13DIGIT /
602                 3DIGIT "." 1*12DIGIT /
603                 4DIGIT "." 1*11DIGIT /
604                 5DIGIT "." 1*10DIGIT /
605                 6DIGIT "." 1*9DIGIT /
606                 7DIGIT "." 1*8DIGIT /
607                 8DIGIT "." 1*7DIGIT /
608                 9DIGIT "." 1*6DIGIT /
609                10DIGIT "." 1*5DIGIT /
610                11DIGIT "." 1*4DIGIT /
611                12DIGIT "." 1*3DIGIT /
612                13DIGIT "." 1*2DIGIT /
613                14DIGIT "." 1DIGIT )

615     Values that do not conform to the ABNF above are invalid, and MUST
616     fail parsing.

618     For example, a header whose value is defined as a float could look
619     like:

621     Example-FloatHeader: 4.5

623     See Section 4.5.1 for the parsing algorithm for floats.

625  4.7.  Strings

627     Abstractly, strings are zero or more printable ASCII [RFC0020]
628     characters (i.e., the range 0x20 to 0x7E).  Note that this excludes
629     tabs, newlines, carriage returns, etc.

631     The textual HTTP serialisation of strings uses a backslash ("\") to
632     escape double quotes and backslashes in strings.

634     string    = DQUOTE *(chr) DQUOTE
635     chr       = unescaped / escaped
636     unescaped = %x20-21 / %x23-5B / %x5D-7E
637     escaped   = "\" ( DQUOTE / "\" )

639     For example, a header whose value is defined as a string could look
640     like:

642     Example-StringHeader: "hello world"

644     Note that strings only use DQUOTE as a delimiter; single quotes do
645     not delimit strings.
Furthermore, only DQUOTE and "\" can be
646     escaped; other sequences MUST cause parsing to fail.

648     Unicode is not directly supported in this document, because it causes
649     a number of interoperability issues, and - with few exceptions -
650     header values do not require it.

652     When it is necessary for a field value to convey non-ASCII string
653     content, binary content (Section 4.9) SHOULD be specified, along with
654     a character encoding (preferably, UTF-8).

656     Parsers MUST support strings with at least 1024 characters.

658  4.7.1.  Parsing a String from Text

660     Given an ASCII string input_string, return an unquoted string.
661     input_string is modified to remove the parsed value.

663     1.  Let output_string be an empty string.

665     2.  If the first character of input_string is not DQUOTE, fail
666         parsing.

668     3.  Discard the first character of input_string.

670     4.  While input_string is not empty:

672         1.  Let char be the result of removing the first character of
673             input_string.

675         2.  If char is a backslash ("\"):

677             1.  If input_string is now empty, fail parsing.

679             2.  Else:

681                 1.  Let next_char be the result of removing the first
682                     character of input_string.

684                 2.  If next_char is not DQUOTE or "\", fail parsing.

686                 3.  Append next_char to output_string.

688         3.  Else, if char is DQUOTE, return output_string.

690         4.  Else, append char to output_string.

692     5.  Otherwise, fail parsing.

694  4.8.  Identifiers

696     Identifiers are short textual identifiers; their abstract model is
697     identical to their expression in the textual HTTP serialisation.
698     Parsers MUST support identifiers with at least 64 characters.

700     identifier = lcalpha *( lcalpha / DIGIT / "_" / "-" / "*" / "/" )
701     lcalpha    = %x61-7A ; a-z

703     Note that identifiers can only contain lowercase letters.

705  4.8.1.  Parsing an Identifier from Text

707     Given an ASCII string input_string, return an identifier.  input_string
708     is modified to remove the parsed value.

710     1.
If the first character of input_string is not lcalpha, fail 711 parsing. 713 2. Let output_string be an empty string. 715 3. While input_string is not empty: 717 1. Let char be the result of removing the first character of 718 input_string. 720 2. If char is not one of lcalpha, DIGIT, "_", "-", "*" or "/": 722 1. Prepend char to input_string. 724 2. Return output_string. 726 3. Append char to output_string. 728 4. Return output_string. 730 4.9. Binary Content 732 Arbitrary binary content can be conveyed in Structured Headers. 734 The textual HTTP serialisation encodes the data using Base 64 735 Encoding [RFC4648], Section 4, and surrounds it with a pair of 736 asterisks ("*") to delimit from other content. 738 The encoded data is required to be padded with "=", as per [RFC4648], 739 Section 3.2. It is RECOMMENDED that parsers reject encoded data that 740 is not properly padded, although this might not be possible with some 741 base64 implementations. 743 Likewise, encoded data is required to have pad bits set to zero, as 744 per [RFC4648], Section 3.5. It is RECOMMENDED that parsers fail on 745 encoded data that has non-zero pad bits, although this might not be 746 possible with some base64 implementations. 748 This specification does not relax the requirements in [RFC4648], 749 Section 3.1 and 3.3; therefore, parsers MUST fail on characters 750 outside the base64 alphabet, and on line feeds in encoded data. 752 binary = "*" *(base64) "*" 753 base64 = ALPHA / DIGIT / "+" / "/" / "=" 755 For example, a header whose value is defined as binary content could 756 look like: 758 Example-BinaryHeader: *cHJldGVuZCB0aGlzIGlzIGJpbmFyeSBjb250ZW50Lg==* 759 Parsers MUST support binary content with at least 16384 octets after 760 decoding. 762 4.9.1. Parsing Binary Content from Text 764 Given an ASCII string input_string, return binary content. 765 input_string is modified to remove the parsed value. 767 1. If the first character of input_string is not "*", fail parsing. 769 2. 
Discard the first character of input_string.

771     3.  Let b64_content be the result of removing content of input_string
772         up to but not including the first instance of the character "*".
773         If there is not a "*" character before the end of input_string,
774         fail parsing.

776     4.  Consume the "*" character at the beginning of input_string.

778     5.  Let binary_content be the result of Base 64 Decoding [RFC4648]
779         b64_content, synthesising padding if necessary (note the
780         requirements about recipient behaviour in Section 4.9).

782     6.  Return binary_content.

784  5.  IANA Considerations

786     This draft has no actions for IANA.

788  6.  Security Considerations

790     The size of most types defined by Structured Headers is not limited;
791     as a result, extremely large header fields could be an attack vector
792     (e.g., for resource consumption).  Most HTTP implementations limit
793     the size of individual header fields as well as the overall
794     header block size to mitigate such attacks.

796     It is possible for parties with the ability to inject new HTTP header
797     fields to change the meaning of a Structured Header.  In some
798     circumstances, this will cause parsing to fail, but it is not
799     possible to reliably fail in all such circumstances.

801  7.  References

802  7.1.  Normative References

804     [RFC0020]  Cerf, V., "ASCII format for network interchange", STD 80,
805                RFC 20, DOI 10.17487/RFC0020, October 1969,
806                .

808     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
809                Requirement Levels", BCP 14, RFC 2119,
810                DOI 10.17487/RFC2119, March 1997,
811                .

813     [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
814                Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006,
815                .

817     [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
818                Specifications: ABNF", STD 68, RFC 5234,
819                DOI 10.17487/RFC5234, January 2008,
820                .

822     [RFC7230]  Fielding, R., Ed. and J.
Reschke, Ed., "Hypertext Transfer 823 Protocol (HTTP/1.1): Message Syntax and Routing", 824 RFC 7230, DOI 10.17487/RFC7230, June 2014, 825 . 827 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 828 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 829 May 2017, . 831 7.2. Informative References 833 [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", 834 IEEE 754-2008, DOI 10.1109/IEEESTD.2008.4610935, 835 ISBN 978-0-7381-5752-8, August 2008, 836 . 838 See also http://grouper.ieee.org/groups/754/ [6]. 840 [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform 841 Resource Identifier (URI): Generic Syntax", STD 66, 842 RFC 3986, DOI 10.17487/RFC3986, January 2005, 843 . 845 [RFC7231] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer 846 Protocol (HTTP/1.1): Semantics and Content", RFC 7231, 847 DOI 10.17487/RFC7231, June 2014, 848 . 850 [RFC7540] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext 851 Transfer Protocol Version 2 (HTTP/2)", RFC 7540, 852 DOI 10.17487/RFC7540, May 2015, 853 . 855 7.3. URIs 857 [1] https://lists.w3.org/Archives/Public/ietf-http-wg/ 859 [2] https://httpwg.github.io/ 861 [3] https://github.com/httpwg/http-extensions/labels/header-structure 863 [4] https://github.com/httpwg/structured-header-tests 865 [5] https://github.com/httpwg/wiki/wiki/Structured-Headers 867 Appendix A. Changes 869 A.1. Since draft-ietf-httpbis-header-structure-04 871 o Remove identifiers from item. 873 o Remove most limits on sizes. 875 o Refine number parsing. 877 A.2. Since draft-ietf-httpbis-header-structure-03 879 o Strengthen language around failure handling. 881 A.3. Since draft-ietf-httpbis-header-structure-02 883 o Split Numbers into Integers and Floats. 885 o Define number parsing. 887 o Tighten up binary parsing and give it an explicit end delimiter. 889 o Clarify that mappings are unordered. 891 o Allow zero-length strings. 893 o Improve string parsing algorithm. 895 o Improve limits in algorithms. 
897 o Require parsers to combine header fields before processing. 899 o Throw an error on trailing garbage. 901 A.4. Since draft-ietf-httpbis-header-structure-01 903 o Replaced with draft-nottingham-structured-headers. 905 A.5. Since draft-ietf-httpbis-header-structure-00 907 o Added signed 64bit integer type. 909 o Drop UTF8, and settle on BCP137 ::EmbeddedUnicodeChar for h1- 910 unicode-string. 912 o Change h1_blob delimiter to ":" since "'" is valid t_char 914 Authors' Addresses 916 Mark Nottingham 917 Fastly 919 Email: mnot@mnot.net 920 URI: https://www.mnot.net/ 922 Poul-Henning Kamp 923 The Varnish Cache Project 925 Email: phk@varnish-cache.org