Network Working Group                                         C. Bormann
Internet-Draft                                    Universität Bremen TZI
Intended status: Standards Track                             H. Birkholz
Expires: 26 August 2021                                   Fraunhofer SIT
                                                        22 February 2021

         On Media-Types, Content-Types, and related terminology
            draft-bormann-core-media-content-type-format-04

Abstract

   There is a lot of confusion about media-types, content-types, and
   related terminology.

   This memo is an attempt at clearing it up, so we can use consistent
   terminology in CoRE and related specifications. It also defines some
   ABNF that can be used in these specifications.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 26 August 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document. Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Media-Type
   3.  Content-Type
   4.  Content-Coding
   5.  Content-Format
   6.  Remaining ABNF
   7.  Abbreviations
   8.  Discussion
   9.  Suggested usage
     9.1.  COSE
     9.2.  SenML
     9.3.  ...
   10. IANA Considerations
   11. Security Considerations
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Acknowledgements
   Authors' Addresses

1.  Introduction

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   [RFC1590] introduced media types and their registration. That
   document took MIME types from [RFC1521] and gave them a new name. At
   that time, the term "media type" was often used just for the major
   type ("text", "audio"), and what we call a media-type now was the
   combination of a type and a subtype. This lives on in [RFC6838],
   which does not even have an ABNF [RFC5234] production for media type.
   [RFC6838]'s predecessor, [RFC4288], supplied the ABNF shown in
   Figure 1.

   |  type-name = reg-name
   |  subtype-name = reg-name
   |
   |  reg-name = 1*127reg-name-chars
   |  reg-name-chars = ALPHA / DIGIT / "!" /
   |                  "#" / "$" / "&" / "." /
   |                  "+" / "-" / "^" / "_"
   |
   |  Figure 1: ABNF for type and subtype, cited from RFC 4288

   [RFC6838], obsoleting [RFC4288], restricts the first character of a
   reg-name to alphanumeric. It contains the otherwise semantically
   equivalent ABNF shown in Figure 2, but adds prose comments that
   further limit the use of "." and "+".

   type-name = restricted-name
   subtype-name = restricted-name

   restricted-name = restricted-name-first *126restricted-name-chars
   restricted-name-first  = ALPHA / DIGIT
   restricted-name-chars  = ALPHA / DIGIT / "!" / "#" /
                            "$" / "&" / "-" / "^" / "_"
   restricted-name-chars =/ "." ; Characters before first dot always
                                ; specify a facet name
   restricted-name-chars =/ "+" ; Characters after last plus always
                                ; specify a structured syntax suffix

      Figure 2: ABNF for type and subtype, as defined in RFC 6838

2.  Media-Type

   Today, the term "media type" is generally used for a registered
   combination of a type-name and a subtype-name, as well as for the
   specification that defines the semantics of this combination. We
   further disambiguate by calling the former a _media type name_. An
   ABNF definition of "Media-Type-Name" is given in Figure 3:

   Media-Type-Name = type-name "/" subtype-name

                Figure 3: Definition of Media-Type-Name

   For the purposes of this memo, we define:

   Media-Type-Name:  A combination of a type-name and a subtype-name
      registered in [IANA.media-types], conventionally identified by the
      two names separated by a slash.

   (This leaves the term "Media Type" for the actual specification that
   is registered under the Media-Type-Name.)

3.  Content-Type

   Media types can have parameters [RFC6838], some of which are defined
   by the media type specification to be mandatory. In HTTP and many
   other protocols, media-type-names and parameters are then used
   together in a "Content-Type" header field. HTTP [RFC7231] uses the
   ABNF in Figure 4:

   |  Content-Type = media-type
   |  media-type = type "/" subtype *( OWS ";" OWS parameter )
   |  type = token
   |  subtype = token
   |  token = 1*tchar
   |  tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*"
   |        / "+" / "-" / "."
   |        / "^" / "_" / "`" / "|" / "~"
   |        / DIGIT / ALPHA
   |  OWS = *( SP / HTAB )
   |
   |  Figure 4: Content-Type ABNF from RFC 7231

   In the ABNF as established by [RFC2616], parts of which became
   [RFC7231], the rule name media-type is used for a Media-Type-Name
   with parameters attached. We don't follow this inclusive use of
   media-type; note that [RFC2616] was quite confused about this term by
   claiming (Section 3.7 of [RFC2616]):

      Media-type values are registered with the Internet Assigned Number
      Authority (IANA [19]).

   This clearly reverts to the understanding of Media-Type-Name we use.

   In order to resolve some of this confusion, we define as a separate
   term:

   Content-Type:  A Media-Type-Name, optionally associated with
      parameters (separated from the media type name and from each other
      by a semicolon).

   Removing the legacy HTAB characters now shunned in polite
   conversation, as well as some other cobwebs, we define the
   conventional textual representation of a Content-Type with the ABNF
   in Figure 5:

   Content-Type   = Media-Type-Name *( *SP ";" *SP parameter )
   parameter      = token "=" ( token / quoted-string )

   token          = 1*tchar
   tchar          = "!" / "#" / "$" / "%" / "&" / "'" / "*"
                  / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
                  / DIGIT / ALPHA
   quoted-string  = %x22 *qdtext %x22
   qdtext         = SP / %x21 / %x23-5B / %x5D-7E

                 Figure 5: Definition of Content-Type

   Note that there is a slight inconsistency between the "token" used
   here and the "reg-name"/"restricted-name" used above; since media
   type parameters probably will be defined within the guard rails set
   by [RFC7231], we need to use HTTP's more comprehensive definition
   here.

4.  Content-Coding

   Section 3.5 of [RFC2616] also introduced the term Content-Coding, a
   registered name for an encoding transformation that has been or can
   be applied to a representation:

   content-coding = token

         Figure 6: Definition of content-coding as in RFC 2616

   Confusingly, in HTTP the Content-Coding is then given in a header
   field called "Content-Encoding"; we *never* use this term (except
   when we are in error). Instead we define:

   Content-Coding:  a registered name for an encoding transformation
      that has been or can be applied to a representation.

   Content-Codings are registered in the HTTP Content Coding Registry, a
   subregistry of [IANA.http-parameters]. We often use the "identity"
   Content-Coding, which is the identity transformation, and often fail
   to identify that Content-Coding by name, instead calling it "no
   Content-Coding".

5.  Content-Format

   CoAP, in Section 1 of [RFC7252], defines a Content-Format as the
   combination of a Content-Type and a Content-Coding, identified by a
   numeric identifier defined in the "CoAP Content-Formats" registry (a
   subregistry of [IANA.core-parameters]), but in more confusing words
   (it did not have the benefit of the present specification).

   Content-Format:  the combination of a Content-Type and a Content-
      Coding, identified by a numeric identifier defined by the "CoAP
      Content-Formats" subregistry of [IANA.core-parameters].

   Note that there has not been a conventional string representation of
   just the combination of a Content-Type and a Content-Coding; Content-
   Formats so far always are identified by their registered Content-
   Format numbers.
   However, there are applications where that is useful
   [I-D.keranen-core-senml-data-ct], so we define:

   Content-Format = "0" / (POS-DIGIT *DIGIT)
   Content-Format-String = Content-Type ["@" content-coding]

            Figure 7: Definition of Content-Format/-String

   This allows the use of Content-Format-Strings such as
   "application/json@deflate" in place of the less self-describing
   content-format "11050", or other combinations that do not have a
   content-format number defined yet.

   Content-Format-Strings MUST NOT explicitly use the content-coding
   value of "identity" (i.e., if an identity content-coding is desired,
   the entire optional part including the "@" sign is left out).

   Note that a quoted string inside a content-type parameter might
   contain an "@" sign, so Content-Format-Strings cannot be parsed by
   simply splitting at the first "@".

6.  Remaining ABNF

   This specification uses the ABNF given in Figure 8, as originally
   defined in [RFC5234] and [RFC8866]:

   DIGIT     =  %x30-39           ; 0 – 9
   POS-DIGIT =  %x31-39           ; 1 – 9
   ALPHA     =  %x41-5A / %x61-7A ; A – Z / a – z
   SP        =  %x20

               Figure 8: Commonly Used ABNF Definitions

7.  Abbreviations

   Media type names are sometimes abbreviated as "mt", and Content-Types
   as "ct". We propose not to use those abbreviations: where the long
   form of the values can be used, the long form "Content-Type" can also
   be used to name them.

   For historical reasons, both [RFC6690] and [RFC7252] use the
   abbreviation "ct" for Content-Format (think first and last
   character).

   For Content-Coding, the abbreviation "cc" can be used.

8.  Discussion

   The ABNF given here is provisional and may need some more cleanup,
   such as unifying the various forms of reg-name, token, etc.

   (ABNF just shown for illustration is centered, in a blockquote, and
   tagged with "" in the XML, while the
   normative ABNF of this memo is left-aligned and tagged with
   "".)
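
   The parsing caveat from Section 5 (an "@" may occur inside a quoted
   parameter value) can be illustrated with a short, non-normative
   sketch. It transcribes the Content-Format-String ABNF of Figures 2,
   5, 6 and 7 into a Python regular expression. The function name
   parse_content_format_string and the case-insensitive check for
   "identity" are assumptions made for this illustration only; this
   memo does not define them.

```python
import re

# Hypothetical helper names; the patterns transcribe the ABNF of this memo.
TCHAR = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]"            # tchar (Figure 5)
TOKEN = TCHAR + "+"                               # token = 1*tchar
RESTRICTED = r"[A-Za-z0-9][A-Za-z0-9!#$&\-^_.+]{0,126}"  # restricted-name
QUOTED = r'"[ !#-\[\]-~]*"'                       # quoted-string, no escapes
PARAM = rf"{TOKEN}=(?:{TOKEN}|{QUOTED})"          # parameter
CONTENT_TYPE = rf"{RESTRICTED}/{RESTRICTED}(?: *; *{PARAM})*"
CONTENT_FORMAT_STRING = re.compile(
    rf"(?P<ct>{CONTENT_TYPE})(?:@(?P<cc>{TOKEN}))?"
)


def parse_content_format_string(text):
    """Split a Content-Format-String into (content_type, content_coding).

    content_coding is None when the optional "@" part is absent.
    """
    match = CONTENT_FORMAT_STRING.fullmatch(text)
    if match is None:
        raise ValueError(f"not a Content-Format-String: {text!r}")
    coding = match.group("cc")
    # Assumption: "identity" is compared case-insensitively; the memo
    # leaves case-insensitivity open for discussion.
    if coding is not None and coding.lower() == "identity":
        raise ValueError('"identity" MUST NOT be given explicitly')
    return match.group("ct"), coding


if __name__ == "__main__":
    print(parse_content_format_string("application/json@deflate"))
    print(parse_content_format_string('text/plain; note="a@b"@gzip'))
```

   Because "@" is not a tchar, it can only occur inside a quoted-string
   or as the separator, so matching against the full grammar finds the
   separator unambiguously. Splitting the second example at its first
   "@" would instead cut the quoted parameter value in half.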

   The XPath expression "//sourcecode[@type='abnf']/text()" can be used
   on the XML form of this specification to extract the ABNF defined
   here.

   We need to discuss case-insensitivity at some point, which is usually
   rather insensitive.

9.  Suggested usage

9.1.  COSE

   Section 3.1 of [RFC8152] defines a common COSE header parameter
   (number 3) called "content type" in the description, to indicate the
   type of the data in the payload or ciphertext fields.

   This header parameter can either be an unsigned integer, indicating a
   CoRE Content-Format number, or a text string. The latter alternative
   is only defined in general terms. It points to Section 4.2 of
   [RFC6838] for 'text values following the syntax of
   "<type-name>/<subtype-name>"...', but also discusses the use of
   parameters and subparameters; no ABNF or similarly detailed
   specification is provided. The text does not discuss the use of
   Content-Coding in the text string form, probably because nothing like
   the present document existed at the time, creating a weird gap
   compared with numeric Content-Format values. (The text only has
   trivial changes in its updated version in Section 3.1 of
   [I-D.ietf-cose-rfc8152bis-struct-15].)

   The present specification suggests using the production "Content-
   Format-String" as a more formal definition of the text string that
   can go into the "content type" (number 3) common header parameter in
   COSE.

9.2.  SenML

   As discussed above, Section 3 of [I-D.keranen-core-senml-data-ct]
   makes use of the present specification.

9.3.  ...

   (to be filled in along with further use cases)

10.  IANA Considerations

   While this memo talks a lot about IANA registries, it does not
   require any action from IANA.

11.  Security Considerations

   Confusion about terminology may, in the worst case, cause security
   problems, as can loosely defined syntax elements of a specification.
   No other security considerations are known to be raised by the
   present specification.

12.  References

12.1.  Normative References

   [IANA.core-parameters]
              IANA, "Constrained RESTful Environments (CoRE)
              Parameters".

   [IANA.http-parameters]
              IANA, "Hypertext Transfer Protocol (HTTP) Parameters".

   [IANA.media-types]
              IANA, "Media Types".

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

12.2.  Informative References

   [I-D.ietf-cose-rfc8152bis-struct-15]
              Schaad, J., "CBOR Object Signing and Encryption (COSE):
              Structures and Process", Work in Progress, Internet-Draft,
              draft-ietf-cose-rfc8152bis-struct-15, 1 February 2021.

   [I-D.keranen-core-senml-data-ct]
              Keranen, A. and C. Bormann, "SenML Data Value Content-
              Format Indication", Work in Progress, Internet-Draft,
              draft-keranen-core-senml-data-ct-02, 8 July 2019.

   [RFC1521]  Borenstein, N. and N. Freed, "MIME (Multipurpose Internet
              Mail Extensions) Part One: Mechanisms for Specifying and
              Describing the Format of Internet Message Bodies",
              RFC 1521, DOI 10.17487/RFC1521, September 1993.

   [RFC1590]  Postel, J., "Media Type Registration Procedure", RFC 1590,
              DOI 10.17487/RFC1590, March 1994.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616,
              DOI 10.17487/RFC2616, June 1999.

   [RFC4288]  Freed, N. and J. Klensin, "Media Type Specifications and
              Registration Procedures", RFC 4288, DOI 10.17487/RFC4288,
              December 2005.

   [RFC5234]  Crocker, D., Ed. and P.
              Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008.

   [RFC6690]  Shelby, Z., "Constrained RESTful Environments (CoRE) Link
              Format", RFC 6690, DOI 10.17487/RFC6690, August 2012.

   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
              Specifications and Registration Procedures", BCP 13,
              RFC 6838, DOI 10.17487/RFC6838, January 2013.

   [RFC7231]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Semantics and Content", RFC 7231,
              DOI 10.17487/RFC7231, June 2014.

   [RFC7252]  Shelby, Z., Hartke, K., and C. Bormann, "The Constrained
              Application Protocol (CoAP)", RFC 7252,
              DOI 10.17487/RFC7252, June 2014.

   [RFC8152]  Schaad, J., "CBOR Object Signing and Encryption (COSE)",
              RFC 8152, DOI 10.17487/RFC8152, July 2017.

   [RFC8866]  Begen, A., Kyzivat, P., Perkins, C., and M. Handley, "SDP:
              Session Description Protocol", RFC 8866,
              DOI 10.17487/RFC8866, January 2021.

Acknowledgements

   Matthias Kovatsch forced the authors to make up their minds about
   this. Ari Keränen forced them to write it up, then, and created a
   convincing use case of Content-Format-Strings. John Mattsson alerted
   us to a mistake. Alexey Melnikov suggested reviving this draft after
   a year of dormancy.

Authors' Addresses

   Carsten Bormann
   Universität Bremen TZI
   Postfach 330440
   D-28359 Bremen
   Germany

   Phone: +49-421-218-63921
   Email: cabo@tzi.org

   Henk Birkholz
   Fraunhofer SIT
   Rheinstrasse 75
   64295 Darmstadt
   Germany

   Email: henk.birkholz@sit.fraunhofer.de