Internet Engineering Task Force                              L. Masinter
Internet-Draft                                                     Adobe
Intended status: Informational                          October 25, 2010
Expires: April 28, 2011

                            MIME and the Web
                    draft-masinter-mime-web-info-01

Abstract

   This document describes some of the ways in which parts of the MIME
   system, originally designed for electronic mail, have been used in
   the Web, and some of the ways in which those uses have resulted in
   difficulties.
   Given this background and justification, this document
   then goes on to outline requirements for changes to MIME registries
   and practices for their use within W3C and IETF, in order to address
   those difficulties. Within IETF, a companion Best Current Practice
   document will be developed to specifically make some changes to the
   Internet Media Types and Charset registries.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 28, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. History
     2.1. Origins of MIME
     2.2. Introducing MIME into the Web
     2.3. Distributed Extensibility
   3. Problems with application to the Web
     3.1. Lack of clarity
     3.2. Differences between email and Web delivery
     3.3. The Rules Weren't Quite Followed
     3.4. Consequences
     3.5. The Down Side of Extensibility
   4. Additional considerations
     4.1. There are related problems with charsets
     4.2. Embedded, downloaded, launch independent application
     4.3. Additional Use Cases: Polyglot and Multiview
     4.4. Evolution, Versioning, Forking
     4.5. Content Negotiation
     4.6. Fragment identifiers
   5. Recommendations
     5.1. Internet Media Type registration
       5.1.1. MIME registry magic numbers for sniffing
       5.1.2. Scripting and scriptable content safety
       5.1.3. Fragment identifiers
       5.1.4. Application info
       5.1.5. File extensions in registry
     5.2. Sniffing
       5.2.1. Sniffing uses Media Type magic number
       5.2.2. Sniffing when there are multiple different definitions
       5.2.3. Sniffing charsets
       5.2.4. Sniffing security uses scriptability info
     5.3. Changes to IANA processes for MIME registries
     5.4. FTP specification
     5.5. Update some URI definitions
     5.6. Changes to W3C findings, processes
   6. Acknowledgements
   7. IANA Considerations
   8. Security Considerations
   9. Informative References
   Author's Address

1. Introduction

   This document was initially prompted by a set of discussions about
   Web architecture and the difficulties surrounding evolution of the
   Web, Internet Media Types, multiple specifications for a single media
   type, and related topics.

   The document gives some of the history of MIME and its introduction
   and use in the Web (Section 2). It then describes some of the current
   difficulties with the use of MIME in the Web context (Section 3).
   This background and context are then followed by a description of
   changes which would reduce some of those difficulties; the changes
   involve specifications, practices, and registries within IETF and W3C
   (Section 5). In particular, changes to the registry and maintenance
   procedures for MIME-related registries maintained by IANA are
   described.

   Currently, discussion of this document is suggested on the mailing
   list www-tag@w3c.org (mailing list open for subscription to all),
   archives at http://lists.w3.org/Archives/Public/www-tag/.

   NOTE: This document is still quite rough; some of the facts need to
   be checked, and many sections still need expansion. Any help with
   references and such would be appreciated.

2. History

2.1. Origins of MIME

   MIME ("Multipurpose Internet Mail Extensions") was invented
   originally for email, based on general principles of "messaging" (a
   foundational architecture framework). The role of MIME was to extend
   Internet email messaging from ASCII-only plain text to include other
   character sets, images, rich documents, etc. ([RFC1521], [RFC1522]).
   The basic architecture of complex content messaging is:

   o  Message sent from A to B.

   o  Message includes some data. Sender A includes standard 'headers'
      telling recipient B enough information that recipient B knows how
      sender A intends the message to be interpreted.

   o  Recipient B gets the message, interprets the headers, and uses
      them as information on how to interpret the data.

   MIME is a "tagging and bagging" specification:

   tagging: How to label content so the intent of how the content
      should be interpreted is known.

   bagging: How to wrap the content so the label is clear, or, if there
      are multiple parts to a single message, how to combine them.

   "MIME types" (renamed "Internet Media Types" in later specs
   [RFC2046]) are part of the "tagging" -- a way to describe the content
   of a message so that the description can be used to initiate
   interpretation of the message. The "Internet Media Type registry"
   (MIME type registry) is where someone can tell the world what a
   particular label means: the sender's intent for how recipients should
   process a message of that type, and, for senders, a description of a
   recipient's capabilities.

2.2. Introducing MIME into the Web

   The original World Wide Web (the 0.9 version of HTTP, see [RFC1945])
   didn't have "tagging and bagging" -- everything sent via HTTP was
   assumed to be HTML.
   However, at the time (early 1990's) other
   distributed information access systems, including Gopher (distributed
   menu system) and WAIS (remote access to document databases) were
   adding capabilities for accessing many things other than text and
   hypertext, and the WWW folks were considering type tagging. It was
   agreed that HTTP should use MIME as the vocabulary for talking about
   file types and character sets. The result was that HTTP 1.0 added
   the "content-type" header, following (more or less) MIME. Later, for
   content negotiation, additional uses of this technology (in 'Accept'
   headers) were also added.

   The differences in the use of Internet Media Types between email and
   HTTP were minor:

   o  default charset: HTTP specified ISO-8859-1 as the default
      character set, not US-ASCII.

   o  requirement for CRLF in plain text: in practice, web clients
      didn't restrict content to use CRLF in text/* MIME bodies.

   These minor differences have caused a lot of trouble.

2.3. Distributed Extensibility

   The real advantage of using Internet Media Types to label content
   was that the Web was no longer restricted to a single format. This
   one addition meant expanding from Global Hypertext to Global
   Hypermedia, as suggested in a 1992 email [connolly92]:

   +-------------------------------------------------------------------+
   | The Internet currently serves as the backbone for a global        |
   | hypertext. FTP and email provided a good start, and the gopher,   |
   | WWW, or WAIS clients and servers make wide area information       |
   | browsing simple. These systems even interoperate, with email      |
   | servers talking to FTP servers, WWW clients talking to gopher     |
   | servers, on and on.                                               |
   | This currently works quite well for text. But what should WWW     |
   | clients do as Gopher and WAIS servers begin to serve up pictures, |
   | sounds, movies, spreadsheet templates, postscript files, etc.?    |
   | It would be a shame for each to adopt its own multimedia typing   |
   | system.                                                           |
   | If they all adopt the MIME typing system (and as many other       |
   | features from MIME as are appropriate), we can step from global   |
   | hypertext to global hypermedia that much easier.                  |
   +-------------------------------------------------------------------+

   The fact that HTTP could reliably transport images of different
   formats, for example, allowed NCSA to add <img> to HTML. MIME
   allowed other document formats (Word, PDF, Postscript) and other
   kinds of hypermedia, as well as other applications, to be part of the
   Web. MIME was arguably the most important extensibility mechanism in
   the Web.

3. Problems with application to the Web

   Unfortunately, while the use of Internet Media Types for the Web
   added incredible power, a number of problems have arisen.

3.1. Lack of clarity

   Many people are confused about the purpose of MIME in the Web, its
   uses, and the meaning of Internet Media Types. Many W3C
   specifications, TAG findings, and Internet Media Type registrations
   make incorrect assumptions about the meaning and purposes of an
   Internet Media Type registration.

3.2. Differences between email and Web delivery

   Some of the differences between the application contexts of email and
   Web delivery determine different requirements:

   o  In the Web, the transfer of data is initiated differently than in
      email: the "messages" with labeled content are usually HTTP
      responses to a specific (GET) request (although the request is
      itself a message, GET has no content). In the most common case,
      then, the receiver knows more about the data before it has been
      sent.

   o  Clients would like to know more about the content before they
      retrieve it.
      The "tagging" is often not sufficient to know, for
      example, "can I interpret this if I retrieve it", because of
      versioning, capabilities, or dependencies on things like screen
      size or interaction capabilities of the recipient.

   o  Some content isn't delivered over HTTP (files on a local file
      system), or there is no opportunity for tagging (data delivered
      over FTP), and in those cases some other way is needed for
      determining file type.

   Operating systems used (and continue to evolve) different systems to
   determine the 'type' of something, different from the MIME tagging
   and bagging:

   o  'magic numbers': in many contexts, file types could be guessed
      pretty reliably by looking for headers.

   o  Originally Mac OS had a 4 character 'file type' and another 4
      character 'creator code' for file types.

   o  Windows evolved to use the "file extension" -- 3 letters (and then
      more) at the end of the file name -- as the initial determination
      of the overall type of a file. This practice has now extended to
      other systems.

   Information about these other ways of determining type (rather than
   by the content-type label) was gathered for the Internet Media Type
   registry; those registering types are encouraged to also describe
   'magic numbers', Mac file type, and common file extensions. However,
   since there was no formal use of that information, the quality of
   that information in the registry is haphazard.

   Finally, while tagging and bagging might be adequate for unilaterally
   initiated (one-way) messaging, a recipient might want to know whether
   it could handle the data before reading it in and interpreting it,
   and the Internet Media Types weren't enough to tell.

3.3. The Rules Weren't Quite Followed

   The behavior of the community hasn't matched the expectations held
   when the Internet Media Type registry was designed:

   o  Lots of file types aren't registered (no entry in IANA for file
      types).

   o  For those that are, the registration is incomplete or incorrect
      (people doing registration didn't understand 'magic number' or
      other fields).

   o  The actual content deployed or created by deployed software
      doesn't match the registration.

   In particular, Web implementations of Internet Media Types diverged
   from expected behavior:

   o  Browser implementors would be liberal in what they accepted, and
      use what looked like a file extension in the URL and/or magic
      number or other 'sniffing' techniques to decide file type, without
      assuming content-label was authoritative. This was necessary
      anyway for files that weren't delivered by HTTP.

   o  HTTP server implementors and administrators didn't supply ways of
      easily associating the 'intended' file type label with the file,
      resulting in files frequently being delivered with a label other
      than the one they would have chosen if they'd thought about it,
      and if browsers *had* assumed content-type was authoritative.
      Some popular servers had default configuration files that treated
      any unknown type as "text/plain" (plain text in ASCII). Since it
      didn't matter (the browsers worked anyway), it was hard to get
      this fixed.

   Incorrect senders coupled with liberal readers wind up feeding a
   vicious cycle rooted in the robustness principle ([WikiRobust],
   [RFC3117]).

3.4. Consequences

   The result, alas, is increased unreliability, in that:

   o  servers sending responses to browsers don't have a good guarantee
      that the browser won't "sniff" the content and decide to do
      something other than treat it as it is labeled.

   o  browsers receiving content don't have a good guarantee that the
      content isn't mis-labeled.

   o  intermediaries (gateways, proxies, caches, and other pieces of the
      Web infrastructure) don't have a good way of telling what the
      conversation means.

   This ambiguity and 'sniffing' also apply to packaged content in
   webapps ('bagging' but using ZIP rather than MIME multipart). (NOTE:
   NEEDS EXPANSION, REFERENCE TO WEBAPPS)

3.5. The Down Side of Extensibility

   Extensibility adds great power, and allows the Web to evolve without
   committee approval of every extension. For some (those who want to
   extend and their clients who want those extensions), this is power!
   For others (those who are building Web components or infrastructure),
   extensibility is a drawback -- it adds to the unreliability and
   variability of the Web experience. When senders use extensions that
   recipients aren't aware of, or implement incorrectly or incompletely,
   communication often fails. With messaging, this is a serious
   problem, although most 'rich text' documents are still delivered in
   multiple forms (using multipart/alternative).

   If your job is to support users of a popular browser, however, where
   each user has installed a different configuration of file handlers
   and extensibility mechanisms, MIME may appear to add unnecessary
   complexity and variable experience for users of all but the most
   popular types.

4. Additional considerations

   This section notes some additional considerations.

4.1. There are related problems with charsets

   MIME includes provisions not only for file 'types', but also,
   importantly, for the "character encoding" used by text types: for
   example, simple US-ASCII, Western European ISO-8859-1, Unicode UTF-8.
   A similar vicious cycle also happened with character set labels:
   mislabeled content happily processed correctly by liberal browsers
   encouraged more and more sites to proliferate text with mis-labeled
   character sets, to the point where browsers feel they *have* to
   second-guess the label. (NEEDS EXPANSION)

   There are sites that intentionally label content as iso-2022-jp or
   euc-jp when it is in fact one of the Microsoft extension charsets
   (e.g., for access to circled digits). This is an intentional misuse
   of the definitions of the charsets themselves -- definitions which
   originated at the national standards body level.

4.2. Embedded, downloaded, launch independent application

   The type might be determined not only for entire documents ("HTML"
   vs. "Word" vs. "PDF"), but also for embedded components of documents
   ("JPEG image" vs. "PNG image"). However, the use cases, requirements
   and likely operational impact of MIME handling are likely different
   for those use cases.

4.3. Additional Use Cases: Polyglot and Multiview

   There are some interesting additional use cases which add to the
   design requirements:

   o  "Polyglot" documents: A 'polyglot' document is data which can be
      treated as two different Internet Media Types, in the case where
      the meaning of the data is the same. This is part of a transition
      strategy to allow content providers (senders) to manage, produce,
      store, and deliver the same data, but with two different labels,
      and have it work equivalently with two different kinds of
      receivers (one of which knows one Internet Media Type, and
      another which knows a second one).
      This use case was part of
      the transition strategy from HTML to an XML-based XHTML, and also
      a way for a single service to offer both HTML-based and XML-based
      processing (e.g., same content useful for news articles and Web
      pages).

   o  "Multiview" documents: This use case seems similar but it's quite
      different. In this case, the same data has very different meaning
      when served as two different content-types, but that difference is
      intentional; for example, the same data served as text/html is a
      document, and served as an RDFa type is some specific data.

4.4. Evolution, Versioning, Forking

   The subject of format/language/type evolution is complex; this
   section is a little terse.

   Formats and their specifications evolve over time. There are several
   reasons for the evolution: innovation, compatibility with other
   implementations, attempts to gain control.

   Sometimes new evolutions are "compatible", although compatibility
   has several variations. It is part of the responsibility of the
   designer of a new version of a file type to try to ensure both
   forward and backward compatibility: that new documents work
   reasonably (with some fallback) with old viewers and that old
   documents work reasonably with new viewers. In some cases this is
   accomplished, in others not; in some cases, "works reasonably" is
   softened to "either works reasonably or gives clear warning about
   nature of problem (version mismatch)."

   In MIME, the 'tag', the Internet Media Type, corresponds to the
   versioned series. Internet Media Types do not identify a particular
   version of a file format. Rather, the general idea is that the
   Internet Media Type identifies the family, and also how you're
   supposed to otherwise find version information on a per-format
   basis.
   Many (most) file formats have an internal version indicator, with the
   idea that you only need a new Internet Media Type to designate a
   completely incompatible format. The notion of an "Internet Media
   Type" is very coarse-grained. The general approach to this has been
   that the actual Media Type includes provisions for version
   indicator(s) embedded in the content itself to determine more
   precisely the nature of how the data is to be interpreted. That is,
   the message itself contains further information.

   Unfortunately, much has gone wrong in this scenario as well --
   processors ignoring version indicators have encouraged content
   creators to not be careful to supply correct version indicators,
   leading to lots of content with wrong version indicators.

   Those updating an existing Internet Media Type registration to
   account for new versions are admonished to not make previously
   conforming documents non-conforming. This is harder to enforce than
   it would seem, because the previous specifications are not always
   accurate about what the Internet Media Type was used for in practice.

   (NOTE: MULTIPLE INCOMPATIBLE AUTHORITATIVE SPECS)

4.5. Content Negotiation

   The general idea of content negotiation is that when party A
   communicates with party B, and the message can be delivered in more
   than one format (or version, or configuration), there can be some way
   of negotiating: some way for A to communicate to B the available
   options, and for B to accept them or indicate preferences.

   Content negotiation happens all over. When one fax machine chirps to
   another when initially connecting, they are negotiating resolution,
   compression methods and so forth.
   In Internet mail, which is a one-way communication, the
   "negotiation" consists of the sender preparing
   and sending multiple versions of the message, one in text/html, one
   in text/plain, for example, in sender-preference order. The
   recipient then chooses the first version it can understand.

   HTTP added "Accept" and "Accept-Language" to allow content
   negotiation in HTTP GET, based on Internet Media Types, and there are
   other methods explained in the HTTP spec.

4.6. Fragment identifiers

   The Web added the notion of being able to address part of a content
   and not the whole content by adding a 'fragment identifier' to the
   URL that addressed the data. Of course, this originally made sense
   for the original Web with just HTML, but how would it apply to other
   content? The URL spec glibly noted that "the definition of the
   fragment identifier meaning depends on the Internet Media Type", but
   unfortunately, few of the Internet Media Type definitions included
   this information, and practices diverged greatly.

   If the interpretation of fragment identifiers depends on the MIME
   type, though, fragment identifiers become hard to use reliably when
   content negotiation is wanted, since the same identifier must then
   work across every format that might be returned.

5. Recommendations

   This section outlines the kinds of changes needed to bring the MIME
   system in line with current practice and to address the problems
   outlined above. The purpose of this text is not to specify the exact
   details of how changes can be accomplished, but rather to find broad
   agreement.

   We need a clear direction on how to make the Web more reliable, not
   less. We need a realistic transition plan from the unreliable Web to
   the more reliable one. Part of this is to encourage senders (Web
   servers) to mean what they say, and encourage recipients (browsers)
   to give preference to what the senders are sending.

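
   As a rough illustration of a recipient giving preference to the
   sender's label, the sketch below uses the declared content-type
   whenever one is present and well-formed, and falls back to magic
   numbers only when no usable label was sent. The function names, the
   small magic-number table, and the application/octet-stream fallback
   are assumptions made for this example; they are not taken from any
   registry or specification discussed in this memo.

```python
from typing import Optional

# Illustrative magic numbers only (an assumption of this sketch); a real
# table would come from the Internet Media Type registry.
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
]


def sniff(body: bytes) -> Optional[str]:
    """Guess a media type from leading bytes, or None if nothing matches."""
    for prefix, media_type in MAGIC_NUMBERS:
        if body.startswith(prefix):
            return media_type
    return None


def effective_media_type(content_type: Optional[str], body: bytes) -> str:
    """Prefer the sender's label; sniff only when no usable label was sent."""
    if content_type:
        # Drop parameters such as "; charset=utf-8" and normalize case.
        declared = content_type.split(";", 1)[0].strip().lower()
        if "/" in declared:
            return declared
    # Hypothetical fallback for unlabeled, unrecognized content.
    return sniff(body) or "application/octet-stream"
```

   In this sketch a response labeled "text/html" is treated as HTML
   even if its body begins with "%PDF-"; sniffing never overrides a
   well-formed label.
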

   We should try to create specifications for protocols and best
   practices that will lead the Web to more reliable and secure
   communication. To this end, we give an overall architectural
   approach to the use of MIME, and then specific advice for HTTP
   clients and servers, Web browsers in general, and proxies and other
   intermediaries. This advice should encourage behavior that continues
   to work with the already deployed infrastructure (of servers,
   browsers, and intermediaries) and that, if followed, also improves
   the operability, reliability and security of the Web.

   This section outlines requirements for standards and practices
   intended to address some of the difficulties. This is an early
   version, which mainly contains "strawman" proposals for changes. It
   is intended to stimulate discussion; the hope is that we can get
   agreement about the nature of the problems and the current situation
   before focusing in detail on possible solutions. Even so, having at
   least strawman proposals seems to be helpful. For some problems,
   additional changes to IETF and W3C specifications are also advisable;
   the expectations are briefly outlined here.

5.1. Internet Media Type registration

   Update the Internet Media Type registry and registration process.

5.1.1. MIME registry magic numbers for sniffing

   Be clearer about the relationship of 'magic numbers' to sniffing;
   review Internet Media Types already registered and update.

5.1.2. Scripting and scriptable content safety

   Be clearer about requiring Security Considerations to address risks
   of sniffing.

5.1.3. Fragment identifiers

   Problem: MIME type definitions don't talk about fragment identifiers.

   Require definition of fragment identifier applicability; add fragment
   identifier semantics.

5.1.4. Application info

   Problem: (hasn't been expanded)

   Could the 'applications that use this type' section be clearer
   about whether the file type is frequently for embedding (plug-in) or
   as a separate document with auto-launch (MIME handler), or should
   always be downloaded? Is there a separate issue for 'auto-play on
   download' vs. 'ask user for permission'?

5.1.5. File extensions in registry

   Problem: Sniffing needs to use file extensions too; signify which
   file extensions are useful for sniffing.

   Be clearer about file extension use and the relationship of file
   extensions to MIME handlers.

5.2. Sniffing

   Various new specifications discuss, promote or mandate the use of
   'sniffing' -- using the content of the data to supplement or even
   override the declared content-type or charset. Update these
   specifications.

5.2.1. Sniffing uses Media Type magic number

   Update the proposed Media Type sniffing document so that sniffing
   uses the MIME registry for 'magic numbers'.

5.2.2. Sniffing when there are multiple different definitions

   Address the issue of sniffing when there are multiple independent
   definitions of the same MIME type.

5.2.3. Sniffing charsets

   Update sniffing of charsets to use charset reference info.

5.2.4. Sniffing security uses scriptability info

   If the Internet Media Type registry is more explicit about which
   kinds of content contain what kind of scriptability access, then the
   specifications for sniffing can reference the Internet Media Type
   registry to determine what kinds of sniffing constitute a 'privilege
   upgrade'.

   Note that any sniffing can be a privilege upgrade if the recipient is
   buggy; bugs can be fixed, but spec violations remain a problem.

5.3. Changes to IANA processes for MIME registries

   Problem: Internet Media Type registries are hard to update, and there
   can be different definitions of the same MIME type.

   STRAWMAN: Allow commenting or easier update; not all Internet Media
   Type owners need or have all the information the Internet needs.
   Wiki for Internet Media Types as well as formal registry? Ability to
   add comments about deployed senders, deployed content, deployed
   receivers.

5.4. FTP specification

   Do FTP clients also change rules about guessing file types based on
   the OS of the FTP server?

5.5. Update some URI definitions

   The ftp and file schemes need sniffing, and http sometimes does; data
   defaults to text/plain rather than sniffing. Should this information
   be in the URI definitions?

5.6. Changes to W3C findings, processes

   Update TAG finding on authoritative metadata: is it possible to
   remove 'authority'?

   New: Add a MIME and Internet Media Type section to WebArch,
   referencing this memo.

   New: Add W3C Web architecture material on MIME in HTML to the W3C web
   site, referencing this memo.

   Reconsider other extensibility mechanisms (namespaces, for example):
   should they use MIME or something like it?

6. Acknowledgements

   This document is the result of discussions among many individuals in
   the IETF and W3C. Special thanks to Noah Mendelsohn.

7. IANA Considerations

   This document includes no specific changes to IANA registries or
   processes. However, it outlines several considerations for future
   explicit recommendations to IANA, to change Internet Media Type and
   Charset registries and the processes around their maintenance. IANA
   evaluation of the feasibility of these changed processes is required.

8. Security Considerations

   This document discusses some of the security issues resulting from
   use (and mis-use) of MIME content types in the Web.

9. Informative References

   [RFC1521]  Borenstein, N. and N. Freed, "MIME (Multipurpose Internet
              Mail Extensions) Part One: Mechanisms for Specifying and
              Describing the Format of Internet Message Bodies",
              RFC 1521.

   [RFC1522]  Moore, K., "MIME (Multipurpose Internet Mail Extensions)
              Part Two: Message Header Extensions for Non-ASCII Text",
              RFC 1522, September 1993.

   [RFC1945]  Berners-Lee, T., Fielding, R., and H. Nielsen, "Hypertext
              Transfer Protocol -- HTTP/1.0", RFC 1945, May 1996.

   [RFC2046]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part Two: Media Types", RFC 2046,
              November 1996.

   [RFC3117]  Rose, M., "On the Design of Application Protocols",
              RFC 3117, November 2001.

   [WikiRobust]
              "Robustness principle", 2010.

   [connolly92]
              Connolly, D., "Global Hypermedia", Oct 1992.

   [mime-sniff]
              Barth, A. and I. Hickson, "Media Type Sniffing", May 2010.

Author's Address

   Larry Masinter
   Adobe
   345 Park Ave.
   San Jose, 95110
   USA

   Phone: +1 408 536 3024
   Email: masinter@adobe.com
   URI:   http://larry.masinter.net