idnits 2.17.1

draft-nottingham-http-structure-retrofit-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (7 October 2021) is 932 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 7231 (Obsoleted by RFC 9110)

  -- Duplicate reference: RFC8941, mentioned in 'STRUCTURED-FIELDS', was also
     mentioned in 'RFC8941'.

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                      M. Nottingham
Internet-Draft                                            7 October 2021
Intended status: Informational
Expires: 10 April 2022

                  Retrofit Structured Fields for HTTP
              draft-nottingham-http-structure-retrofit-00

Abstract

   This specification defines how a selection of existing HTTP fields
   can be handled as Structured Fields.

Note to Readers

   _RFC EDITOR: please remove this section before publication_

   The issues list for this draft can be found at
   https://github.com/mnot/I-D/labels/http-structure-retrofit.

   The most recent (often, unpublished) draft is at
   https://mnot.github.io/I-D/http-structure-retrofit/.

   Recent changes are listed at
   https://github.com/mnot/I-D/commits/gh-pages/http-structure-retrofit.

   See also the draft's current status in the IETF datatracker, at
   https://datatracker.ietf.org/doc/draft-nottingham-http-structure-retrofit/.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 10 April 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Notational Conventions
     1.2.  Compatible Fields
     1.3.  Mapped Fields
       1.3.1.  URLs
       1.3.2.  Dates
       1.3.3.  ETags
       1.3.4.  Links
       1.3.5.  Cookies
     1.4.  IANA Considerations
   2.  Security Considerations
   3.  Normative References
   Appendix A.  Data Supporting Field Compatibility
   Author's Address

1.  Introduction

   Structured Field Values for HTTP [STRUCTURED-FIELDS] introduced a
   data model with associated parsing and serialisation algorithms for
   HTTP field values.  Header fields that are defined as Structured
   Fields can realise a number of benefits, including:

   *  Improved interoperability and security: precisely defined parsing
      and serialisation algorithms are typically not available for
      fields defined with just ABNF and/or prose.

   *  Reuse of common implementations: many parsers for other fields are
      specific to a single field or a small family of fields.

   *  Canonical form: because a deterministic serialisation algorithm is
      defined for each type, Structured Fields have a canonical
      representation.

   *  Enhanced API support: a regular data model makes it easier to
      expose field values as a native data structure in implementations.

   *  Alternative serialisations: while [STRUCTURED-FIELDS] defines a
      textual serialisation of that data model, other, more efficient
      serialisations of the underlying data model are also possible.

   However, a field needs to be defined as a Structured Field for these
   benefits to be realised.  Many existing fields are not, and these
   make up the bulk of header and trailer fields seen in HTTP traffic on
   the Internet.

   This specification defines how a selection of existing HTTP fields
   can be handled as Structured Fields, so that these benefits can be
   realised -- thereby making them Retrofit Structured Fields.

   It does so using two techniques.  Section 1.2 lists compatible fields
   -- those that can be handled as if they were Structured Fields due to
   the similarity of their defined syntax to that in Structured Fields.
   Section 1.3 lists mapped fields -- those whose syntax needs to be
   transformed into an underlying data model which is then mapped into
   that defined by Structured Fields.

   While implementations can parse and serialise compatible fields as
   Structured Fields subject to the caveats in Section 1.2, a sender
   cannot generate mapped fields from Section 1.3 and expect them to be
   understood and acted upon by the recipient without prior negotiation.
   This specification does not define such a mechanism.

1.1.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.2.  Compatible Fields

   HTTP fields with the following names can usually have their values
   handled as Structured Fields according to the listed parsing and
   serialisation algorithms in [RFC8941], subject to the listed caveats.

   The listed types are chosen for compatibility with the defined syntax
   of the field as well as with actual Internet traffic (see
   Appendix A).  However, not all instances of these fields will
   successfully parse.  This might be because the field value is clearly
   invalid, or it might be because it is valid but not parseable as a
   Structured Field.

   As such, an application using this specification will need to
   consider how to handle these field values.  Depending on its
   requirements, it might be advisable to reject such values, treat them
   as opaque strings, or attempt to recover a structured value from them
   in an ad hoc fashion.
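
   The three handling options described above (reject, treat as opaque,
   recover ad hoc) can be illustrated with a minimal Python sketch.  It
   is not part of this specification: the names FieldResult,
   parse_sf_integer_item and handle_compatible_integer, and the policy
   strings, are hypothetical.  It only covers a bare Integer Item (as
   would be used for a field such as Age) and, as a simplification,
   does not handle Parameters.

```python
import re
from dataclasses import dataclass
from typing import Optional

# sf-integer per RFC 8941, Section 3.3.1: optional "-" then 1 to 15 digits.
_SF_INTEGER = re.compile(r"-?[0-9]{1,15}")


@dataclass
class FieldResult:
    structured: Optional[int]  # parsed Integer, when one could be obtained
    raw: Optional[str]         # original text, kept when parsing failed


def parse_sf_integer_item(value: str) -> int:
    """Parse a bare Integer Item; raise ValueError on anything else."""
    text = value.strip(" ")  # leading and trailing SP are discarded
    if not _SF_INTEGER.fullmatch(text):
        raise ValueError(f"not a Structured Field Integer: {value!r}")
    return int(text)


def handle_compatible_integer(value: str, policy: str = "opaque") -> FieldResult:
    """Apply one of three (hypothetical) failure policies to a field value."""
    try:
        return FieldResult(parse_sf_integer_item(value), None)
    except ValueError:
        if policy == "reject":
            raise  # the caller treats the field as invalid
        if policy == "recover":
            # Ad hoc recovery: use a leading run of digits, if there is one.
            match = re.match(r"\s*([0-9]{1,15})", value)
            if match:
                return FieldResult(int(match.group(1)), value)
        # Default: keep the value as an opaque string.
        return FieldResult(None, value)
```

   For instance, handle_compatible_integer("12abc") keeps the value as
   an opaque string, whereas the "recover" policy yields 12 alongside
   the original text.  Which policy is appropriate depends on how much
   the application relies on the field.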

   *  Accept - List

   *  Accept-Encoding - List

   *  Accept-Language - List

   *  Accept-Patch - List

   *  Accept-Ranges - List

   *  Access-Control-Allow-Credentials - Item

   *  Access-Control-Allow-Headers - List

   *  Access-Control-Allow-Methods - List

   *  Access-Control-Allow-Origin - Item

   *  Access-Control-Expose-Headers - List

   *  Access-Control-Max-Age - Item

   *  Access-Control-Request-Headers - List

   *  Access-Control-Request-Method - Item

   *  Age - Item

   *  Allow - List

   *  ALPN - List

   *  Alt-Svc - Dictionary

   *  Alt-Used - Item

   *  Cache-Control - Dictionary

   *  Connection - List

   *  Content-Encoding - List

   *  Content-Language - List

   *  Content-Length - List

   *  Content-Type - Item

   *  Cross-Origin-Resource-Policy - Item

   *  Expect - Item

   *  Expect-CT - Dictionary

   *  Host - Item

   *  Keep-Alive - Dictionary

   *  Origin - Item

   *  Pragma - Dictionary

   *  Prefer - Dictionary

   *  Preference-Applied - Dictionary

   *  Retry-After - Item

   *  Surrogate-Control - Dictionary

   *  TE - List

   *  Timing-Allow-Origin - List

   *  Trailer - List

   *  Transfer-Encoding - List

   *  Vary - List

   *  X-Content-Type-Options - Item

   *  X-Frame-Options - Item

   *  X-XSS-Protection - List

   Note the following caveats:

   Parameters:  HTTP parameter names are case-insensitive (as per
      Section 5.6.6 of [HTTP]), but Structured Fields require them to be
      all-lowercase.  Although the vast majority of parameters seen in
      typical traffic are all-lowercase, compatibility can be improved
      by force-lowercasing parameters when encountered.

   Empty Field Values:  Empty and whitespace-only field values are
      considered errors in Structured Fields.  For compatible fields, an
      empty field indicates that the field should be silently ignored.

   Alt-Svc:  Some ALPN tokens (e.g., h3-Q43) do not conform to key's
      syntax.  Since the final version of HTTP/3 uses the h3 token, this
      shouldn't be a long-term issue, although future tokens may again
      violate this assumption.

   Cache-Control, Expect-CT, Pragma, Prefer, Preference-Applied,
   Surrogate-Control:  These Dictionary-based fields consider the key to
      be case-insensitive, but Structured Fields require keys to be
      all-lowercase.  Although the vast majority of values seen in
      typical traffic are all-lowercase, compatibility can be improved
      by force-lowercasing these Dictionary keys when encountered.

   Content-Length:  Content-Length is defined as a List because it is
      not uncommon for implementations to mistakenly send multiple
      values.  See Section 8.6 of [HTTP] for handling requirements.

   Retry-After:  Only the delta-seconds form of Retry-After is
      supported; a Retry-After value containing an http-date will need
      to be either converted into delta-seconds or represented as a raw
      value.

1.3.  Mapped Fields

   HTTP fields with the following names can have their values
   represented in Structured Fields by mapping them into its data types
   and then serialising the result using an alternative field name.

   For example, the Date HTTP header field carries a string representing
   a date:

   Date: Sun, 06 Nov 1994 08:49:37 GMT

   Its value is more efficiently represented as an integer number of
   delta seconds from the Unix epoch (00:00:00 UTC on 1 January 1970,
   minus leap seconds).  Thus, the example above would be mapped as:

   SF-Date: 784111777

   As in Section 1.2, these fields are unable to represent values that
   are not Structured Fields, and so an application using this
   specification will need to consider how to support such values.
   Typically, serialising them using the original field name is
   sufficient.

   Each field name listed below indicates a replacement field name and a
   means of mapping its original value into a Structured Field.

1.3.1.  URLs

   The following field names (paired with their replacement field names)
   have values that can be represented as Structured Fields by
   considering the original field's value as a string.

   *  Content-Location - SF-Content-Location

   *  Location - SF-Location

   *  Referer - SF-Referer

   For example, a Location field could be represented as:

   SF-Location: "https://example.com/foo"

1.3.2.  Dates

   The following field names (paired with their replacement field names)
   have values that can be represented as Structured Fields by parsing
   their payload according to [RFC7231], Section 7.1.1.1, and
   representing the result as an integer number of seconds delta from
   the Unix Epoch (00:00:00 UTC on 1 January 1970, minus leap seconds).

   *  Date - SF-Date

   *  Expires - SF-Expires

   *  If-Modified-Since - SF-IMS

   *  If-Unmodified-Since - SF-IUS

   *  Last-Modified - SF-LM

   For example, an Expires field could be represented as:

   SF-Expires: 1571965240

1.3.3.  ETags

   The following field names (paired with their replacement field names)
   have values that can be represented as Structured Fields by
   representing the entity-tag as a string, and the weakness flag as a
   boolean "w" parameter on it, where true indicates that the entity-tag
   is weak; if false or unset, the entity-tag is strong.

   *  ETag - SF-ETag

   For example:

   SF-ETag: "abcdef"; w=?1

   If-None-Match is a list of the structure described above.

   *  If-None-Match - SF-INM

   For example:

   SF-INM: "abcdef"; w=?1, "ghijkl"

1.3.4.  Links

   The field-value of the Link header field [RFC8288] can be represented
   as a Structured Field by representing the URI-Reference as a string,
   and link-param as parameters.

   *  Link - SF-Link

   For example:

   SF-Link: "/terms"; rel="copyright"; anchor="#foo"

1.3.5.  Cookies

   The field-values of the Cookie and Set-Cookie fields [RFC6265] can be
   represented in Structured Fields as a List with parameters and a
   Dictionary, respectively.

   The serialisation is almost identical, except that the Expires
   parameter is always a string (as it can contain a comma), multiple
   cookie-strings can appear in Set-Cookie, and cookie-pairs are
   delimited in Cookie by a comma, rather than a semicolon.

   *  Set-Cookie - SF-Set-Cookie

   *  Cookie - SF-Cookie

   SF-Set-Cookie: lang=en-US; expires="Wed, 09 Jun 2021 10:18:14 GMT";
                  samesite=Strict
   SF-Cookie: SID=31d4d96e407aad42, lang=en-US

   *  ISSUE: explicitly convert Expires to an integer?
      https://github.com/mnot/I-D/issues/308

   *  ISSUE: dictionary keys cannot contain UC alpha.
      https://github.com/mnot/I-D/issues/312

   *  ISSUE: explicitly allow non-string content.
      https://github.com/mnot/I-D/issues/313

1.4.  IANA Considerations

   IANA is asked to register the following entries in the HTTP Field
   Name Registry with a status of "permanent" and referring to this
   document:

   *  SF-Content-Location

   *  SF-Location

   *  SF-Referer

   *  SF-Date

   *  SF-Expires

   *  SF-IMS

   *  SF-IUS

   *  SF-LM

   *  SF-ETag

   *  SF-INM

   *  SF-Link

   *  SF-Set-Cookie

   *  SF-Cookie

2.  Security Considerations

   Section 1.2 identifies existing HTTP fields that can be parsed and
   serialised with the algorithms defined in [STRUCTURED-FIELDS].
   Variances from other implementations might be exploitable,
   particularly if they allow an attacker to target one implementation
   in a chain (e.g., an intermediary).  However, given the considerable
   variance in parsers already deployed, convergence towards a single
   parsing algorithm is likely to have a net security benefit in the
   longer term.

   Section 1.3 defines alternative representations of existing fields.
   Because downstream consumers might interpret the message differently
   based upon whether they recognise the alternative representation,
   implementations are prohibited from generating such fields unless
   they have negotiated support for them with their peer.  This
   specification does not define such a mechanism, but any such
   definition needs to consider the implications of doing so carefully.

3.  Normative References

   [HTTP]     Fielding, R. T., Nottingham, M., and J. Reschke, "HTTP
              Semantics", Work in Progress, Internet-Draft,
              draft-ietf-httpbis-semantics-19, 12 September 2021.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC6265]  Barth, A., "HTTP State Management Mechanism", RFC 6265,
              DOI 10.17487/RFC6265, April 2011.

   [RFC7231]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Semantics and Content", RFC 7231,
              DOI 10.17487/RFC7231, June 2014.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8288]  Nottingham, M., "Web Linking", RFC 8288,
              DOI 10.17487/RFC8288, October 2017.

   [RFC8941]  Nottingham, M. and P-H. Kamp, "Structured Field Values for
              HTTP", RFC 8941, DOI 10.17487/RFC8941, February 2021.

   [STRUCTURED-FIELDS]
              Nottingham, M. and P-H. Kamp, "Structured Field Values for
              HTTP", RFC 8941, DOI 10.17487/RFC8941, February 2021.

Appendix A.  Data Supporting Field Compatibility

   To help guide decisions about compatible fields, the HTTP response
   headers captured by the HTTP Archive (https://httparchive.org) in
   September 2021 (representing more than 528,000,000 HTTP exchanges)
   were parsed as Structured Fields using the types listed in
   Section 1.2, with the indicated number of successful header
   instances, failures, and the resulting failure rate:

   Field name                          Successes /  Failures =     Rate
   accept                                  9,099 /        34 =   0.372%*
   accept-encoding                       116,708 /        58 =   0.050%*
   accept-language                       127,710 /        95 =   0.074%*
   accept-patch                              281 /         0 =   0.000%
   accept-ranges                     289,341,375 /     7,776 =   0.003%
   access-control-allow-credentials   36,159,371 /     2,671 =   0.007%
   access-control-allow-headers       25,980,519 /    23,181 =   0.089%
   access-control-allow-methods       32,071,437 /    17,424 =   0.054%
   access-control-allow-origin       165,719,859 /   130,247 =   0.079%
   access-control-expose-headers      20,787,683 /     1,973 =   0.009%
   access-control-max-age              9,549,494 /     9,846 =   0.103%
   access-control-request-headers        165,882 /       503 =   0.302%*
   access-control-request-method         346,135 /    30,680 =   8.142%*
   age                               107,395,872 /    36,649 =   0.034%
   allow                                 579,822 /       281 =   0.048%
   alt-svc                            56,773,977 / 4,914,119 =   7.966%
   cache-control                     395,402,834 / 1,146,080 =   0.289%
   connection                        112,017,641 /     3,491 =   0.003%
   content-encoding                  225,568,224 /       237 =   0.000%
   content-language                    3,339,291 /     1,744 =   0.052%
   content-length                    422,415,406 /       126 =   0.000%
   content-type                      503,950,894 /   507,133 =   0.101%
   cross-origin-resource-policy      102,483,430 /       799 =   0.001%
   expect                                      0 /        53 = 100.000%*
   expect-ct                          54,129,244 /    80,333 =   0.148%
   host                                   57,134 /     1,486 =   2.535%*
   keep-alive                         50,606,877 /     1,509 =   0.003%
   origin                                 32,438 /     1,396 =   4.126%*
   pragma                             66,321,848 /    97,328 =   0.147%
   preference-applied                        189 /         0 =   0.000%
   referrer-policy                    14,274,787 /     8,091 =   0.057%
   retry-after                           523,533 /     7,585 =   1.428%
   surrogate-control                     282,846 /       976 =   0.344%
   te                                          1 /         0 =   0.000%
   timing-allow-origin                91,979,983 /         8 =   0.000%
   trailer                                 1,171 /         0 =   0.000%
   transfer-encoding                  15,098,518 /         0 =   0.000%
   vary                              246,483,644 /    69,607 =   0.028%
   x-content-type-options            166,063,072 /   237,255 =   0.143%
   x-frame-options                    56,863,322 / 1,014,464 =   1.753%
   x-xss-protection                  132,739,109 /   347,133 =   0.261%

   Note that this data set only includes response headers, although some
   request headers are present, indicated with an asterisk (because, the
   Web).  Also, Dictionary and Parameter keys have not been
   force-lowercased, with the result that any values containing
   uppercase keys are considered to fail.

   The top thirty header fields in that data set that were not
   considered compatible are (* indicates that the field is mapped in
   Section 1.3):

   *  *date: 524,810,577

   *  server: 470,777,294

   *  *last-modified: 383,437,099

   *  *expires: 292,109,781

   *  *etag: 255,788,799

   *  strict-transport-security: 111,993,787

   *  x-cache: 70,713,258

   *  via: 55,983,914

   *  cf-ray: 54,556,881

   *  p3p: 54,479,183

   *  report-to: 54,056,804

   *  cf-cache-status: 53,536,789

   *  nel: 44,815,769

   *  x-powered-by: 37,281,354

   *  content-security-policy-report-only: 33,104,387

   *  *location: 30,533,957

   *  x-amz-cf-pop: 28,549,182

   *  x-amz-cf-id: 28,444,359

   *  content-security-policy: 25,404,401

   *  x-served-by: 23,277,252

   *  x-cache-hits: 21,842,899

   *  *link: 20,761,372

   *  x-timer: 18,780,130

   *  content-disposition: 18,516,671

   *  x-request-id: 16,048,668

   *  referrer-policy: 15,596,734

   *  x-cdn: 10,153,756

   *  x-amz-version-id: 9,786,024

   *  x-amz-request-id: 9,680,689

   *  x-dc: 9,557,728

Author's Address

   Mark Nottingham
   Prahran
   VIC
   Australia

   Email: mnot@mnot.net
   URI:   https://www.mnot.net/