Working Group                                                   A. Barth
Internet-Draft                                             U.C. Berkeley
Expires: December 2, 2009                                     I. Hickson
                                                            Google, Inc.
                                                            May 31, 2009


                     Content-Type Processing Model
                       draft-abarth-mime-sniff-01

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 2, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   Many web servers supply incorrect Content-Type headers with their
   HTTP responses. In order to be compatible with these servers, user
   agents must consider the content of HTTP responses as well as the
   Content-Type header when determining the effective media type of the
   response. This document describes an algorithm for determining the
   effective media type of HTTP responses that balances security and
   compatibility considerations.

Table of Contents

   1.  Introduction
   2.  Metadata
   3.  Web Pages
   4.  Text or Binary
   5.  Unknown Type
   6.  Image
   7.  Feed or HTML
   Authors' Addresses

1.  Introduction

   The HTTP Content-Type header indicates the media type of an HTTP
   response. However, many HTTP servers supply a Content-Type that does
   not match the actual contents of the response. Historically, web
   browsers have tolerated these servers by examining the content of
   HTTP responses in addition to the Content-Type header to determine
   the effective media type of the response.

   Without a clear specification of how to "sniff" the media type, each
   user agent implementor was forced to reverse engineer the behavior of
   the other user agents and to develop their own algorithm. These
   divergent algorithms have led to a lack of interoperability between
   user agents and to security issues when the server intends an HTTP
   response to be interpreted as one media type but some user agents
   interpret the response as another media type.

   These security issues are most severe when an "honest" server lets
   potentially malicious users upload files and then serves the contents
   of those files with a low-privilege media type (such as text/plain or
   image/jpeg). (Malicious servers, of course, can specify an arbitrary
   media type in the Content-Type header.) In the absence of MIME
   sniffing, this user-generated content would not be interpreted as a
   high-privilege media type, such as text/html. However, if a user
   agent does interpret a low-privilege media type, such as image/gif,
   as a high-privilege media type, such as text/html, the user agent has
   created a privilege escalation vulnerability in the server. For
   example, a malicious user might be able to leverage content sniffing
   to mount a cross-site scripting attack by including JavaScript code
   in the uploaded file that a user agent treats as text/html.

   This document describes a content sniffing algorithm that carefully
   balances the compatibility needs of user agent implementors with the
   security constraints. The algorithm has been constructed with
   reference to content sniffing algorithms present in popular user
   agents, an extensive database of existing web content, and metrics
   collected from implementations deployed to a sizable number of users.

   WARNING! Whenever possible, user agents should avoid employing a
   content sniffing algorithm. However, if the user agent does employ a
   content sniffing algorithm, it is imperative that the algorithm in
   this document be followed exactly. When a user agent uses different
   heuristics for media type detection than the server expects, security
   problems can occur. For example, if a server believes that the
   client will treat a contributed file as an image (and thus treat it
   as benign), but a user agent believes the content to be HTML (and
   thus privileged to execute any scripts contained therein), an
   attacker might be able to steal the user's authentication credentials
   and mount other cross-site scripting attacks.

2.  Metadata

   What explicit Content-Type metadata is associated with the resource
   (the resource's type information) depends on the protocol that was
   used to fetch the resource.

   For HTTP resources, only the last Content-Type HTTP header, if any,
   contributes any type information; the official type of the resource
   is then the value of that header, interpreted as described by the
   HTTP specifications. If the Content-Type HTTP header is present but
   the value of the last such header cannot be interpreted as described
   by the HTTP specifications (e.g. because its value doesn't contain a
   U+002F SOLIDUS ('/') character), then the resource has no type
   information (even if there are multiple Content-Type HTTP headers and
   one of the other ones is syntactically correct).

   For resources fetched from the file system, user agents should use
   platform-specific conventions, e.g. operating system file extension/
   type mappings.
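
   The HTTP rule above (only the last Content-Type header contributes
   type information, and an uninterpretable last header yields none) can
   be sketched roughly as follows. This is a minimal illustration, not
   part of the algorithm: the function name type_information and the
   list-of-pairs header representation are hypothetical, and the
   validity check is reduced to the one example the text gives (a
   missing '/' character) rather than full HTTP parsing.

```python
from typing import Iterable, Optional, Tuple


def type_information(headers: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the value of the last Content-Type header, or None.

    Hypothetical helper: `headers` is assumed to be (name, value) pairs
    in the order they appeared in the HTTP response.
    """
    last = None
    for name, value in headers:
        # HTTP header names are case-insensitive.
        if name.strip().lower() == "content-type":
            last = value  # later headers replace earlier ones

    if last is None:
        return None  # no Content-Type header at all

    # Simplifying assumption: treat a value as uninterpretable only when
    # the part before any parameters lacks a U+002F SOLIDUS ('/').
    media_type = last.split(";", 1)[0].strip()
    if "/" not in media_type:
        # Earlier, syntactically valid headers are deliberately ignored.
        return None
    return last.strip()
```

   Note that a response carrying "text/html" followed by a malformed
   Content-Type header ends up with no type information under this rule,
   because the earlier header is never consulted.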

   File extensions MUST NOT be used for determining resource types for
   resources fetched over HTTP.

   For resources fetched over most other protocols, e.g. FTP, there is
   no type information.

   The algorithm for extracting an encoding from a Content-Type, given a
   string s, is as follows. It either returns an encoding or nothing.

   1.  Find the first seven characters in s that are an ASCII
       case-insensitive match for the word "charset". If no such match
       is found, return nothing.

   2.  Skip any U+0009, U+000A, U+000C, U+000D, or U+0020 characters
       that immediately follow the word 'charset' (there might not be
       any).

   3.  If the next character is not a U+003D EQUALS SIGN ('='), return
       nothing.

   4.  Skip any U+0009, U+000A, U+000C, U+000D, or U+0020 characters
       that immediately follow the equals sign (there might not be any).

   5.  Process the next character as follows:

       *  If it is a U+0022 QUOTATION MARK ('"') and there is a later
          U+0022 QUOTATION MARK ('"') in s, or

       *  If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027
          APOSTROPHE ("'") in s

             Return the string between this character and the next
             earliest occurrence of this character.

       *  If it is an unmatched U+0022 QUOTATION MARK ('"'),

       *  If it is an unmatched U+0027 APOSTROPHE ("'"), or

       *  If there is no next character

             Return nothing.

       *  Otherwise

             Return the string from this character to the first U+0009,
             U+000A, U+000C, U+000D, U+0020, or U+003B character or the
             end of s, whichever comes first.

   Note: The above algorithm is a willful violation of the HTTP
   specification. [RFC2616]

3.  Web Pages

   The /sniffed type/ of a resource MUST be found as follows:

   1.  Let /official type/ be the type given by the Content-Type
       metadata for the resource, ignoring parameters. Comparisons with
       this type, as defined by MIME specifications, are done in an
       ASCII case-insensitive manner. [RFC2046]

   2.  If the user agent is configured to strictly obey Content-Type
       headers for this resource, then jump to the last step in this set
       of steps.

   3.  If the resource was fetched over an HTTP protocol and there is an
       HTTP Content-Type header and the value of the last such header
       has bytes that exactly match one of the following lines:

       +-------------------------------+--------------------------------+
       | Bytes in Hexadecimal          | Textual Representation         |
       +-------------------------------+--------------------------------+
       | 74 65 78 74 2f 70 6c 61 69 6e | text/plain                     |
       +-------------------------------+--------------------------------+
       | 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=ISO-8859-1 |
       | 3b 20 63 68 61 72 73 65 74 3d |                                |
       | 49 53 4f 2d 38 38 35 39 2d 31 |                                |
       +-------------------------------+--------------------------------+
       | 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=iso-8859-1 |
       | 3b 20 63 68 61 72 73 65 74 3d |                                |
       | 69 73 6f 2d 38 38 35 39 2d 31 |                                |
       +-------------------------------+--------------------------------+
       | 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=UTF-8      |
       | 3b 20 63 68 61 72 73 65 74 3d |                                |
       | 55 54 46 2d 38                |                                |
       +-------------------------------+--------------------------------+

       ...then jump to the "text or binary" section below.

   4.  If there is no /official type/, jump to the "unknown type"
       section below.

   5.  If /official type/ is "unknown/unknown", "application/unknown",
       or "*/*", jump to the "unknown type" section below.

   6.  If /official type/ ends in "+xml", or if it is either "text/xml"
       or "application/xml", then the /sniffed type/ of the resource is
       /official type/; return that and abort these steps.

   7.  If /official type/ is an image type supported by the user agent
       (e.g. "image/png", "image/gif", "image/jpeg", etc), then jump to
       the "image" section below, passing it the /official type/.

   8.  If /official type/ is "text/html", then jump to the "feed or
       HTML" section below.

   9.  The /sniffed type/ of the resource is /official type/.

4.  Text or Binary

   1.  The user agent MAY wait for 512 or more bytes of the resource to
       be available.

   2.  Let n be the smaller of either 512 or the number of bytes already
       available.

   3.  If n is greater than or equal to 3, and the first 2 or 3 bytes of
       the resource match one of the following byte sequences:

       +----------------------+--------------+
       | Bytes in Hexadecimal | Description  |
       +----------------------+--------------+
       | FE FF                | UTF-16BE BOM |
       | FF FE                | UTF-16LE BOM |
       | EF BB BF             | UTF-8 BOM    |
       +----------------------+--------------+

       ...then the /sniffed type/ of the resource is "text/plain".
       Abort these steps.

   4.  If none of the first n bytes of the resource are binary data
       bytes then the /sniffed type/ of the resource is "text/plain".
       Abort these steps.

       +-------------------------+
       | Binary Data Byte Ranges |
       +-------------------------+
       | 0x00 -- 0x08            |
       | 0x0B                    |
       | 0x0E -- 0x1A            |
       | 0x1C -- 0x1F            |
       +-------------------------+

   5.  If the first bytes of the resource match one of the byte
       sequences in the "pattern" column of the table in the "unknown
       type" section below, ignoring any rows whose cell in the
       "security" column says "scriptable" (or "n/a"), then the /sniffed
       type/ of the resource is the type given in the corresponding cell
       in the "sniffed type" column on that row; abort these steps.

          WARNING! It is critical that this step not ever return a
          scriptable type (e.g. text/html), as otherwise that would
          allow a privilege escalation attack.

   6.  Otherwise, the /sniffed type/ of the resource is
       "application/octet-stream".

5.  Unknown Type

   1.  The user agent MAY wait for 512 or more bytes of the resource to
       be available.

   2.  Let /stream length/ be the smaller of either 512 or the number of
       bytes already available.

   3.  For each row in the table below:

       *  If the row has no "WS" bytes:

          1.  Let /pattern length/ be the length of the pattern (number
              of bytes described by the cell in the second column of the
              row).

          2.  If /stream length/ is smaller than /pattern length/ then
              skip this row.

          3.  Apply the "and" operator to the first /pattern length/
              bytes of the resource and the given mask (the bytes in the
              cell of the first column of that row), and let the result
              be the data.

          4.  If the bytes of the data match the given pattern bytes
              exactly, then the /sniffed type/ of the resource is the
              type given in the cell of the third column in that row;
              abort these steps.

       *  If the row has a "WS" byte:

          1.  Let /index pattern/ be an index into the mask and pattern
              byte strings of the row.

          2.  Let /index stream/ be an index into the byte stream being
              examined.

          3.  Loop: If /index stream/ points beyond the end of the byte
              stream, then this row doesn't match, skip this row.

          4.  Examine the /index stream/th byte of the byte stream as
              follows:

              -  If the /index pattern/th byte of the pattern is a
                 normal hexadecimal byte and not a "WS" byte:

                    If the "and" operator, applied to the /index
                    stream/th byte of the stream and the /index
                    pattern/th byte of the mask, yields a value
                    different from the /index pattern/th byte of the
                    pattern, then skip this row.

                    Otherwise, increment /index pattern/ to the next
                    byte in the mask and pattern and /index stream/ to
                    the next byte in the byte stream.

              -  Otherwise, if the /index pattern/th byte of the pattern
                 is a "WS" byte:

                    "WS" means "whitespace", and allows insignificant
                    whitespace to be skipped when sniffing for a type
                    signature.

                    If the /index stream/th byte of the stream is one of
                    0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0C (ASCII FF),
                    0x0D (ASCII CR), or 0x20 (ASCII space), then
                    increment only the /index stream/ to the next byte
                    in the byte stream.

                    Otherwise, increment only the /index pattern/ to the
                    next byte in the mask and pattern.

          5.  If /index pattern/ does not point beyond the end of the
              mask and pattern byte strings, then jump back to the loop
              step in this algorithm.

          6.  Otherwise, the /sniffed type/ of the resource is the type
              given in the cell of the third column in that row; abort
              these steps.

   4.  If none of the first /stream length/ bytes of the resource are
       binary data bytes then the /sniffed type/ of the resource is
       "text/plain". Abort these steps.

   5.  Otherwise, the /sniffed type/ of the resource is
       "application/octet-stream".

   The table used by the above algorithm is:

   +-------------------+-------------------+-----------------+------------+
   | Mask in Hex       | Pattern in Hex    | Sniffed Type    | Security   |
   +-------------------+-------------------+-----------------+------------+
   | FF FF DF DF DF DF | WS 3C 21 44 4F 43 | text/html       | Scriptable |
   | DF DF DF FF DF DF | 54 59 50 45 20 48 |                 |            |
   | DF DF             | 54 4D 4C          |                 |            |
   | Comment: ""),
            then increase pos by 3 and jump back to the previous step
            (the step labeled loop start) in the overall algorithm in
            this section.

        3.  Otherwise, increase pos by 1.

        4.  Return to step 2 in these substeps.

   8.   If s[pos] equals 0x21 (ASCII "!"):

        1.  Increase pos by 1.

        2.  If s[pos] equals 0x3E, then increase pos by 1 and jump back
            to the step labeled loop start in the overall algorithm in
            this section.

        3.  Otherwise, return to step 1 in these substeps.

   9.   If s[pos] equals 0x3F (ASCII "?"):

        1.  Increase pos by 1.

        2.  If s[pos] and s[pos+1] equal 0x3F and 0x3E respectively,
            then increase pos by 1 and jump back to the step labeled
            loop start in the overall algorithm in this section.

        3.  Otherwise, return to step 1 in these substeps.

   10.  Otherwise, if the bytes in s starting at pos match any of the
        sequences of bytes in the first column of the following table,
        then the user agent must follow the steps given in the
        corresponding cell in the second column of the same row.

   +----------------------+------------------------------------+---------+
   | Bytes in Hexadecimal | Requirement                        | Comment |
   +----------------------+------------------------------------+---------+
   | 72 73 73             | The /sniffed type/ of the resource | rss     |
   |                      | is "application/rss+xml"; abort    |         |
   |                      | these steps.                       |         |
   +----------------------+------------------------------------+---------+
   | 66 65 65 64          | The /sniffed type/ of the resource | feed    |
   |                      | is "application/atom+xml"; abort   |         |
   |                      | these steps.                       |         |
   +----------------------+------------------------------------+---------+
   | 72 64 66 3A 52 44 46 | Continue to the next step in this  | rdf:RDF |
   |                      | algorithm.                         |         |
   +----------------------+------------------------------------+---------+

        If none of the byte sequences above match the bytes in s
        starting at pos, then the /sniffed type/ of the resource is
        "text/html". Abort these steps.

   11.  Otherwise, the /sniffed type/ of the resource is "text/html".

   For efficiency reasons, implementations may wish to implement this
   algorithm and the algorithm for detecting the character encoding of
   HTML documents in parallel.

Authors' Addresses

   Adam Barth
   University of California, Berkeley

   Email: abarth@eecs.berkeley.edu
   URI:   http://www.adambarth.com/


   Ian Hickson
   Google, Inc.

   Email: ian@hixie.ch
   URI:   http://ln.hixie.ch/
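
   As an illustration of the algorithm for extracting an encoding from a
   Content-Type (Section 2), the five steps might be sketched as below.
   This is a non-normative sketch under stated assumptions: the function
   name extract_encoding is hypothetical, the input is taken to be an
   already-decoded string, and "nothing" is represented as None.

```python
import re
from typing import Optional

# U+0009, U+000A, U+000C, U+000D, U+0020
WHITESPACE = "\t\n\x0c\r "


def extract_encoding(s: str) -> Optional[str]:
    """Sketch of the charset extraction steps; returns None for 'nothing'."""
    # Step 1: first ASCII case-insensitive match for "charset".
    match = re.search("charset", s, re.IGNORECASE | re.ASCII)
    if match is None:
        return None
    pos = match.end()

    # Step 2: skip whitespace after the word.
    while pos < len(s) and s[pos] in WHITESPACE:
        pos += 1

    # Step 3: the next character must be '='.
    if pos >= len(s) or s[pos] != "=":
        return None
    pos += 1

    # Step 4: skip whitespace after the equals sign.
    while pos < len(s) and s[pos] in WHITESPACE:
        pos += 1

    # Step 5: process the next character.
    if pos >= len(s):
        return None  # there is no next character
    quote = s[pos]
    if quote in "\"'":
        end = s.find(quote, pos + 1)
        if end == -1:
            return None  # unmatched quotation mark or apostrophe
        return s[pos + 1:end]

    # Unquoted value: run to whitespace, ';' (U+003B), or the end of s.
    end = pos
    while end < len(s) and s[end] not in WHITESPACE + ";":
        end += 1
    return s[pos:end]
```

   For example, under these assumptions a value such as
   'text/html; charset = "iso-8859-1"' yields "iso-8859-1", while a
   value with an unmatched quote after the equals sign yields nothing.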