Working Group                                                   A. Barth
Internet-Draft                                             U.C. Berkeley
Expires: July 13, 2009                                        I. Hickson
                                                            Google, Inc.
                                                         January 9, 2009


                     Content-Type Processing Model
                       draft-abarth-mime-sniff-00

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on July 13, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Abstract

   Many Web servers supply incorrect Content-Type headers with their
   HTTP responses.  In order to be compatible with these Web servers,
   Web browsers must consider the content of HTTP responses as well as
   the Content-Type header when determining the effective mime type of
   the response.  This document describes an algorithm for determining
   the effective mime type of HTTP responses that balances security and
   compatibility considerations.

Table of Contents

   1.  Introduction
   2.  Metadata
   3.  Web Pages
   4.  Text or Binary
   5.  Unknown Type
   6.  Image
   7.  Feed or HTML
   Authors' Addresses

1.  Introduction

   The HTTP Content-Type header indicates the mime type of an HTTP
   response.  However, many HTTP servers supply a Content-Type that
   does not match the actual contents of the response.  Historically,
   Web browsers have tolerated these servers by examining the content
   of HTTP responses in addition to the Content-Type header to
   determine the effective mime type of the response.

   Without a clear specification of how to "sniff" the mime type, each
   browser vendor was forced to reverse engineer the behavior of the
   other browsers and to develop their own algorithm.  These divergent
   algorithms have led to a lack of interoperability between browsers
   and to security issues when the site intends an HTTP response to be
   interpreted as one mime type but the browser interprets the
   response as another mime type.

   These security issues are most severe when a Web site lets users
   upload files and then serves the contents of those files with a
   low-privilege mime type (such as text/plain or image/jpeg).  In the
   absence of mime sniffing, this user-generated content will not be
   able to run JavaScript, but if the browser treats the response as
   text/html, then the uploading user can mount a cross-site scripting
   attack by including JavaScript code in the uploaded file.

   This document describes a mime sniffing algorithm that carefully
   balances the compatibility needs of browser vendors with the security
   constraints.  The algorithm has been constructed with reference to
   mime sniffing algorithms present in popular Web browsers, an
   extensive database of Web content, and metrics collected from
   implementations deployed to a sizable number of Web users.

   Warning!  It is imperative that the algorithm in this document be
   followed exactly.  When a user agent uses different heuristics for
   content type detection than the server expects, security problems can
   occur.  For example, if a server believes that the client will treat
   a contributed file as an image (and thus treat it as benign), but a
   Web browser believes the content to be HTML (and thus execute any
   scripts contained therein), the end user can be exposed to malicious
   content, making the user vulnerable to cookie theft attacks and other
   cross-site scripting attacks.

2.  Metadata

   What explicit Content-Type metadata is associated with the resource
   (the resource's type information) depends on the protocol that was
   used to fetch the resource.

   For HTTP resources, only the first Content-Type HTTP header, if any,
   contributes any type information; the explicit type of the resource
   is then the value of that header, interpreted as described by the
   HTTP specifications.  If the Content-Type HTTP header is present but
   the value of the first such header cannot be interpreted as described
   by the HTTP specifications (e.g. because its value doesn't contain a
   U+002F SOLIDUS ('/') character), then the resource has no type
   information (even if there are multiple Content-Type HTTP headers and
   one of the other ones is syntactically correct).  [HTTP]

   For resources fetched from the file system, user agents should use
   platform-specific conventions, e.g. operating system extension/type
   mappings.

   Extensions must not be used for determining resource types for
   resources fetched over HTTP.

   For resources fetched over most other protocols, e.g. FTP, there is
   no type information.

   The algorithm for extracting an encoding from a Content-Type, given a
   string s, is as follows.  It either returns an encoding or nothing.

   1.  Find the first seven characters in s that are an ASCII
       case-insensitive match for the word "charset".  If no such match
       is found, return nothing.

   2.  Skip any U+0009, U+000A, U+000C, U+000D, or U+0020 characters
       that immediately follow the word 'charset' (there might not be
       any).

   3.  If the next character is not a U+003D EQUALS SIGN ('='), return
       nothing.

   4.  Skip any U+0009, U+000A, U+000C, U+000D, or U+0020 characters
       that immediately follow the equals sign (there might not be any).

   5.  Process the next character as follows:

       *  If it is a U+0022 QUOTATION MARK ('"') and there is a later
          U+0022 QUOTATION MARK ('"') in s, or

       *  If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027
          APOSTROPHE ("'") in s

             Return the string between this character and the next
             earliest occurrence of this character.

       *  If it is an unmatched U+0022 QUOTATION MARK ('"'),

       *  If it is an unmatched U+0027 APOSTROPHE ("'"), or

       *  If there is no next character

             Return nothing.

       *  Otherwise

             Return the string from this character to the first U+0009,
             U+000A, U+000C, U+000D, U+0020, or U+003B character or the
             end of s, whichever comes first.

   Note: The above algorithm is a willful violation of the HTTP
   specification.  [RFC2616]

3.  Web Pages

   The sniffed type of a resource must be found as follows:

   1.  If the user agent is configured to strictly obey Content-Type
       headers for this resource, then jump to the last step in this set
       of steps.

   2.  If the resource was fetched over an HTTP protocol and there is an
       HTTP Content-Type header and the value of the first such header
       has bytes that exactly match one of the following lines:

       +-------------------------------+--------------------------------+
       | Bytes in Hexadecimal          | Textual representation         |
       +-------------------------------+--------------------------------+
       | 74 65 78 74 2f 70 6c 61 69 6e | text/plain                     |
       +-------------------------------+--------------------------------+
       | 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=ISO-8859-1 |
       | 3b 20 63 68 61 72 73 65 74 3d |                                |
       | 49 53 4f 2d 38 38 35 39 2d 31 |                                |
       +-------------------------------+--------------------------------+
       | 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=iso-8859-1 |
       | 3b 20 63 68 61 72 73 65 74 3d |                                |
       | 69 73 6f 2d 38 38 35 39 2d 31 |                                |
       +-------------------------------+--------------------------------+
       | 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=UTF-8      |
       | 3b 20 63 68 61 72 73 65 74 3d |                                |
       | 55 54 46 2d 38                |                                |
       +-------------------------------+--------------------------------+

       ...then jump to the "text or binary" section below.

   3.  Let official type be the type given by the Content-Type metadata
       for the resource, ignoring parameters.  If there is no such type,
       jump to the "unknown type" section below.  Comparisons with this
       type, as defined by MIME specifications, are done in an ASCII
       case-insensitive manner.  [RFC2046]

   4.  If official type is "unknown/unknown" or "application/unknown",
       jump to the "unknown type" section below.

   5.  If official type ends in "+xml", or if it is either "text/xml" or
       "application/xml", then the sniffed type of the resource is
       official type; return that and abort these steps.

   6.  If official type is an image type supported by the user agent
       (e.g. "image/png", "image/gif", "image/jpeg", etc), then jump to
       the "images" section below, passing it the official type.

   7.  If official type is "text/html", then jump to the feed or HTML
       section below.

   8.  The sniffed type of the resource is official type.

4.  Text or Binary

   1.  The user agent may wait for 512 or more bytes of the resource to
       be available.

   2.  Let n be the smaller of either 512 or the number of bytes already
       available.

   3.  If n is 4 or more, and the first bytes of the resource match one
       of the following byte sets:

       +----------------------+--------------+
       | Bytes in Hexadecimal | Description  |
       +----------------------+--------------+
       | FE FF                | UTF-16BE BOM |
       | FF FE                | UTF-16LE BOM |
       | EF BB BF             | UTF-8 BOM    |
       +----------------------+--------------+

       ...then the sniffed type of the resource is "text/plain".  Abort
       these steps.

   4.  If none of the first n bytes of the resource are binary data
       bytes then the sniffed type of the resource is "text/plain".
       Abort these steps.

       +-------------------------+
       | Binary data byte ranges |
       +-------------------------+
       | 0x00 -- 0x08            |
       | 0x0B                    |
       | 0x0E -- 0x1A            |
       | 0x1C -- 0x1F            |
       +-------------------------+

   5.  If the first bytes of the resource match one of the byte
       sequences in the "pattern" column of the table in the unknown
       type section below, ignoring any rows whose cell in the
       "security" column says "scriptable" (or "n/a"), then the sniffed
       type of the resource is the type given in the corresponding cell
       in the "sniffed type" column on that row; abort these steps.

          Warning!  It is critical that this step not ever return a
          scriptable type (e.g. text/html), as otherwise that would
          allow a privilege escalation attack.

   6.  Otherwise, the sniffed type of the resource is
       "application/octet-stream".

5.  Unknown Type

   1.  The user agent may wait for 512 or more bytes of the resource to
       be available.

   2.  Let stream length be the smaller of either 512 or the number of
       bytes already available.

   3.  For each row in the table below:

       *  If the row has no "WS" bytes:

          1.  Let pattern length be the length of the pattern (number of
              bytes described by the cell in the second column of the
              row).

          2.  If stream length is smaller than pattern length then skip
              this row.

          3.  Apply the "and" operator to the first pattern length bytes
              of the resource and the given mask (the bytes in the cell
              of the first column of that row), and let the result be
              the data.

          4.  If the bytes of the data match the given pattern bytes
              exactly, then the sniffed type of the resource is the type
              given in the cell of the third column in that row; abort
              these steps.

       *  If the row has a "WS" byte:

          1.  Let index_pattern be an index into the mask and pattern
              byte strings of the row.

          2.  Let index_stream be an index into the byte stream being
              examined.

          3.  Loop: If index_stream points beyond the end of the byte
              stream, then this row doesn't match, skip this row.

          4.  Examine the index_streamth byte of the byte stream as
              follows:

              -  If the index_patternth byte of the pattern is a normal
                 hexadecimal byte and not a "WS" byte:

                    If the "and" operator, applied to the index_streamth
                    byte of the stream and the index_patternth byte of
                    the mask, yields a value different from the
                    index_patternth byte of the pattern, then skip this
                    row.

                    Otherwise, increment index_pattern to the next byte
                    in the mask and pattern and index_stream to the next
                    byte in the byte stream.

              -  Otherwise, if the index_patternth byte of the pattern
                 is a "WS" byte:

                    "WS" means "whitespace", and allows insignificant
                    whitespace to be skipped when sniffing for a type
                    signature.

                    If the index_streamth byte of the stream is one of
                    0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0C (ASCII FF),
                    0x0D (ASCII CR), or 0x20 (ASCII space), then
                    increment only the index_stream to the next byte in
                    the byte stream.

                    Otherwise, increment only the index_pattern to the
                    next byte in the mask and pattern.

          5.  If index_pattern does not point beyond the end of the mask
              and pattern byte strings, then jump back to the loop step
              in this algorithm.

          6.  Otherwise, the sniffed type of the resource is the type
              given in the cell of the third column in that row; abort
              these steps.

   4.  If none of the first stream length bytes of the resource are
       binary data bytes (see the table in Section 4) then the sniffed
       type of the resource is "text/plain".  Abort these steps.

   5.  Otherwise, the sniffed type of the resource is
       "application/octet-stream".
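
   The steps of this section can be sketched in Python as follows.
   This is a non-normative illustration, not a reference
   implementation.  The names row_matches, sniff_unknown and
   is_binary_byte, the use of None to stand for a "WS" pattern byte, the
   representation of a row as a (mask, pattern, sniffed type) tuple, and
   the assumption that both indexes start at the first byte are all
   choices made for this sketch.  DOCTYPE_ROW is the first row of the
   table in this section; EXAMPLE_WS_ROW is a made-up row that exists
   only to exercise the "WS" branch.

```python
from typing import Optional, Sequence

WS = None  # assumed marker for a "WS" pattern byte (sketch only)
WHITESPACE = {0x09, 0x0A, 0x0C, 0x0D, 0x20}


def is_binary_byte(b: int) -> bool:
    """True for bytes in the "binary data byte ranges" table."""
    return b <= 0x08 or b == 0x0B or 0x0E <= b <= 0x1A or 0x1C <= b <= 0x1F


def row_matches(stream: bytes, mask: Sequence[int],
                pattern: Sequence[Optional[int]]) -> bool:
    """Apply step 3 of this section to a single table row."""
    if WS not in pattern:
        # Row without "WS" bytes: masked prefix comparison.
        if len(stream) < len(pattern):
            return False
        return all((stream[i] & mask[i]) == pattern[i]
                   for i in range(len(pattern)))
    index_pattern = 0  # assumed to start at the first byte
    index_stream = 0
    while index_pattern < len(pattern):
        if index_stream >= len(stream):
            return False  # ran out of stream: row does not match
        if pattern[index_pattern] is WS:
            if stream[index_stream] in WHITESPACE:
                index_stream += 1   # skip insignificant whitespace
            else:
                index_pattern += 1  # whitespace run is over
        elif (stream[index_stream] & mask[index_pattern]) != pattern[index_pattern]:
            return False
        else:
            index_pattern += 1
            index_stream += 1
    return True


def sniff_unknown(data: bytes, rows) -> str:
    """Return the sniffed type for a resource with no usable type."""
    stream = data[:512]  # stream length is at most 512 bytes
    for mask, pattern, sniffed_type in rows:
        if row_matches(stream, mask, pattern):
            return sniffed_type
    if not any(is_binary_byte(b) for b in stream):
        return "text/plain"
    return "application/octet-stream"


# First row of the table in this section ("<!DOCTYPE HTML").
DOCTYPE_ROW = (
    bytes.fromhex("FF FF DF DF DF DF DF DF DF FF DF DF DF DF"),
    list(bytes.fromhex("3C 21 44 4F 43 54 59 50 45 20 48 54 4D 4C")),
    "text/html",
)

# Hypothetical row, NOT from the table: optional whitespace, then "AB"
# matched case-insensitively.  The mask byte under WS is ignored.
EXAMPLE_WS_ROW = (bytes([0xFF, 0xDF, 0xDF]), [WS, 0x41, 0x42], "example/ab")
```

   With these definitions, sniff_unknown(b"<!doctype html>", [DOCTYPE_ROW])
   yields "text/html" because the 0xDF mask bytes clear the ASCII case
   bit before comparison, while input that matches no row falls through
   to the binary data byte check.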

   The table used by the above algorithm is:

   +-------------------+-------------------+-----------------+------------+
   | Mask in Hex       | Pattern in Hex    | Sniffed type    | Security   |
   +-------------------+-------------------+-----------------+------------+
   | FF FF DF DF DF DF | 3C 21 44 4F 43 54 | text/html       | Scriptable |
   | DF DF DF FF DF DF | 59 50 45 20 48 54 |                 |            |
   | DF DF             | 4D 4C             |                 |            |
   |                                                                      |
   | Comment: The string ""),
            then increase pos by 3 and jump back to the previous step
            (the step labeled loop start) in the overall algorithm in
            this section.

        3.  Otherwise, increase pos by 1.

        4.  Return to step 2 in these substeps.

   8.   If s[pos] is 0x21 (ASCII "!"):

        1.  Increase pos by 1.

        2.  If s[pos] equals 0x3E, then increase pos by 1 and jump back
            to the step labeled loop start in the overall algorithm in
            this section.

        3.  Otherwise, return to step 1 in these substeps.

   9.   If s[pos] is 0x3F (ASCII "?"):

        1.  Increase pos by 1.

        2.  If s[pos] and s[pos+1] equal 0x3F and 0x3E respectively,
            then increase pos by 1 and jump back to the step labeled
            loop start in the overall algorithm in this section.

        3.  Otherwise, return to step 1 in these substeps.

   10.  Otherwise, if the bytes in s starting at pos match any of the
        sequences of bytes in the first column of the following table,
        then the user agent must follow the steps given in the
        corresponding cell in the second column of the same row.

   +----------------------+-----------------------------------+-----------+
   | Bytes in Hexadecimal | Requirement                       | Comment   |
   +----------------------+-----------------------------------+-----------+
   | 72 73 73             | The sniffed type of the resource  | "rss"     |
   |                      | is "application/rss+xml"; abort   |           |
   |                      | these steps.                      |           |
   +----------------------+-----------------------------------+-----------+
   | 66 65 65 64          | The sniffed type of the resource  | "feed"    |
   |                      | is "application/atom+xml"; abort  |           |
   |                      | these steps.                      |           |
   +----------------------+-----------------------------------+-----------+
   | 72 64 66 3A 52 44 46 | Continue to the next step in this | "rdf:RDF" |
   |                      | algorithm.                        |           |
   +----------------------+-----------------------------------+-----------+

        If none of the byte sequences above match the bytes in s
        starting at pos, then the sniffed type of the resource is
        "text/html".  Abort these steps.

   11.  ????  If, before the next ">", you find two xmlns* attributes
        with http://www.w3.org/1999/02/22-rdf-syntax-ns# and
        http://purl.org/rss/1.0/ as the namespaces, then the sniffed
        type of the resource is "application/rss+xml", abort these
        steps.  (maybe we only need to check for
        http://purl.org/rss/1.0/ actually) ????

   12.  Otherwise, the sniffed type of the resource is "text/html".

   For efficiency reasons, implementations may wish to implement this
   algorithm and the algorithm for detecting the character encoding of
   HTML documents in parallel.

Authors' Addresses

   Adam Barth
   University of California, Berkeley

   Email: abarth@eecs.berkeley.edu
   URI:   http://www.adambarth.com/


   Ian Hickson
   Google, Inc.

   Email: ian@hixie.ch
   URI:   http://ln.hixie.ch/