idnits 2.17.1

draft-ietf-websec-mime-sniff-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Sep 2009 rather than the newer Notice from 28 Dec 2009.  (See
     https://trustee.ietf.org/license-info/)


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (May 7, 2011) is 4737 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '0' on line 656

  -- Looks like a reference, but probably isn't: '1' on line 656

  -- Looks like a reference, but probably isn't: '2' on line 656

  ** Obsolete normative reference: RFC 2616 (Obsoleted by RFC 7230, RFC 7231,
     RFC 7232, RFC 7233, RFC 7234, RFC 7235)


     Summary: 4 errors (**), 0 flaws (~~), 2 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	None                                                            A. Barth
3	Internet-Draft                                                I. Hickson
4	Expires: November 8, 2011                                   Google, Inc.
5	                                                             May 7, 2011

7	                          Media Type Sniffing
8	                    draft-ietf-websec-mime-sniff-03

10	Abstract

12	   Many web servers supply incorrect Content-Type header fields with
13	   their HTTP responses.  In order to be compatible with these servers,
14	   user agents consider the content of HTTP responses as well as the
15	   Content-Type header fields when determining the effective media type
16	   of the response.  This document describes an algorithm for
17	   determining the effective media type of HTTP responses that balances
18	   security and compatibility considerations.

20	   Please send feedback on this draft to websec@ietf.org.

22	Status of this Memo

24	   This Internet-Draft is submitted to IETF in full conformance with the
25	   provisions of BCP 78 and BCP 79.

27	   Internet-Drafts are working documents of the Internet Engineering
28	   Task Force (IETF), its areas, and its working groups.  Note that
29	   other groups may also distribute working documents as Internet-
30	   Drafts.

32	   Internet-Drafts are draft documents valid for a maximum of six months
33	   and may be updated, replaced, or obsoleted by other documents at any
34	   time.  It is inappropriate to use Internet-Drafts as reference
35	   material or to cite them other than as "work in progress."

37	   The list of current Internet-Drafts can be accessed at
38	   http://www.ietf.org/ietf/1id-abstracts.txt.

40	   The list of Internet-Draft Shadow Directories can be accessed at
41	   http://www.ietf.org/shadow.html.

43	   This Internet-Draft will expire on November 8, 2011.

45	Copyright Notice

47	   Copyright (c) 2011 IETF Trust and the persons identified as the
48	   document authors.  All rights reserved.

50	   This document is subject to BCP 78 and the IETF Trust's Legal
51	   Provisions Relating to IETF Documents
52	   (http://trustee.ietf.org/license-info) in effect on the date of
53	   publication of this document.  Please review these documents
54	   carefully, as they describe your rights and restrictions with respect
55	   to this document.  Code Components extracted from this document must
56	   include Simplified BSD License text as described in Section 4.e of
57	   the Trust Legal Provisions and are provided without warranty as
58	   described in the BSD License.

60	Table of Contents

62	   1.  Introduction
63	   2.  Conventions
64	   3.  Metadata
65	   4.  Web Pages
66	   5.  Text or Binary
67	   6.  Unknown Type
68	     6.1.  Signature for MP4
69	   7.  Image
70	   8.  Video
71	   9.  Fonts
72	   10. Feed or HTML
73	   11. References
74	     11.1. Normative References
75	     11.2. Informative References
76	   Authors' Addresses

78	1.  Introduction

80	   The HTTP Content-Type header field indicates the media type of an
81	   HTTP response.  However, many HTTP servers supply a Content-Type that
82	   does not match the actual contents of the response.  Historically,
83	   web browsers have tolerated these servers by examining the content of
84	   HTTP responses in addition to the Content-Type header field to
85	   determine the effective media type of the response.

87	   Without a clear specification of how to "sniff" the media type, each
88	   user agent implementor was forced to reverse engineer the behavior of
89	   the other user agents and to develop their own algorithm.  These
90	   divergent algorithms have led to a lack of interoperability between
91	   user agents and to security issues when the server intends an HTTP
92	   response to be interpreted as one media type but some user agents
93	   interpret the response as another media type.

95	   These security issues are most severe when an "honest" server lets
96	   potentially malicious users upload files and then serves the contents
97	   of those files with a low-privilege media type (such as text/plain or
98	   image/jpeg).  (Malicious servers, of course, can specify an arbitrary
99	   media type in the Content-Type header field.)  In the absence of
100	   media type sniffing, this user-generated content would not be
101	   interpreted as a high-privilege media type, such as text/html.
102	   However, if a user agent does interpret a low-privilege media type,
103	   such as image/gif, as a high-privilege media type, such as text/html,
104	   the user agent has created a privilege escalation vulnerability in
105	   the server.  For example, a malicious user might be able to leverage
106	   content sniffing to mount a cross-site scripting attack by including
107	   JavaScript code in the uploaded file that a user agent treats as
108	   text/html.

110	   This document describes a content sniffing algorithm that carefully
111	   balances the compatibility needs of user agent implementors with the
112	   security constraints.  The algorithm has been constructed with
113	   reference to content sniffing algorithms present in popular user
114	   agents, an extensive database of existing web content, and metrics
115	   collected from implementations deployed to a sizable number of users
116	   [BarthCaballeroSong2009].

118	   WARNING!  Whenever possible, user agents SHOULD NOT employ a content
119	   sniffing algorithm.  However, if a user agent does employ a content
120	   sniffing algorithm, the user agent SHOULD use the algorithm in this
121	   document because using a different content sniffing algorithm than
122	   servers expect causes security problems.  For example, if a server
123	   believes that the client will treat a contributed file as an image
124	   (and thus treat it as benign), but a user agent believes the content
125	   to be HTML (and thus privileged to execute any scripts contained
126	   therein), an attacker might be able to steal the user's
127	   authentication credentials and mount other cross-site scripting
128	   attacks.

130	   Conformance requirements phrased as algorithms or specific steps MAY
131	   be implemented in any manner, so long as the end result is
132	   equivalent.  (In particular, the algorithms defined in this
133	   specification are intended to be easy to follow, and not intended to
134	   be performant.)

136	2.  Conventions

138	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
139	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
140	   document are to be interpreted as described in [RFC2119].

142	   Requirements phrased in the imperative as part of algorithms (such as
143	   "strip any leading space characters" or "return false and abort these
144	   steps") are to be interpreted with the meaning of the key word
145	   ("MUST", "SHOULD", "MAY", etc) used in introducing the algorithm.

147	   Conformance requirements phrased as algorithms or specific steps can
148	   be implemented in any manner, so long as the end result is
149	   equivalent.  In particular, the algorithms defined in this
150	   specification are intended to be easy to understand and are not
151	   intended to be performant.

153	3.  Metadata

155	   The explicit media type metadata associated with a sequence of
156	   octets depends on the protocol that was used to fetch the octets.

158	   For octets received via HTTP, the Content-Type HTTP header field, if
159	   present, indicates the media type.  Let the official-type be the
160	   media type indicated by the HTTP Content-Type header field, if
161	   present.  If the Content-Type header field is absent or if its value
162	   cannot be interpreted as a media type (e.g. because its value doesn't
163	   contain a U+002F SOLIDUS ('/') character), then there is no official-
164	   type.  (Such messages are invalid according to [RFC2616].)

166	      Note: If an HTTP response contains multiple Content-Type header
167	      fields, the user agent MUST use the textually last Content-Type
168	      header field to set the official-type.  For example, if the last
169	      Content-Type header field contains the value "foo", then there is
170	      no official media type because "foo" cannot be interpreted as a
171	      media type (even if the HTTP response contains another Content-
172	      Type header field that could be interpreted as a media type).
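
The rules above for HTTP can be sketched roughly as follows.  This is an
illustrative sketch only, not part of the algorithm in this document.  The
function name official_type and the representation of header fields as a
list of (name, value) pairs are assumptions made for the example, as is
the choice to drop parameters (such as "; charset=...") and to lowercase
the result so that later comparisons are ASCII case-insensitive.

```python
from typing import List, Optional, Tuple


def official_type(headers: List[Tuple[str, str]]) -> Optional[str]:
    """Return the official-type of an HTTP response, or None if there is none.

    `headers` is a hypothetical representation: (name, value) pairs in the
    order the header fields appeared in the response.
    """
    # Collect every Content-Type value, preserving textual order.
    values = [value for name, value in headers if name.lower() == "content-type"]
    if not values:
        return None  # header field absent: no official-type

    # Only the textually last Content-Type header field is considered,
    # even if an earlier one would have been a usable media type.
    last = values[-1]

    # Assumption: parameters are not part of the official-type.
    essence = last.split(";", 1)[0].strip()

    # A value without a U+002F SOLIDUS cannot be interpreted as a media type.
    type_, slash, subtype = essence.partition("/")
    if not slash or not type_ or not subtype:
        return None

    # Assumption: normalize case so comparisons are case-insensitive.
    return essence.lower()
```

For instance, a response carrying "text/html" followed by "foo" has no
official-type under this sketch, because only the last field is examined.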

174	   For octets fetched from the file system, user agents should use
175	   platform-specific conventions (e.g., operating system file extension/
176	   type mappings) to determine the official-type.

178	      Note: It is essential that file extensions are not used for
179	      determining the media type for octets fetched over HTTP because,
180	      in some cases, file extensions can be supplied by malicious
181	      parties.  For example, most PHP installations let the attacker
182	      append arbitrary path information to URLs (e.g.,
183	      http://example.com/foo.php/bar.html) and thereby determine the
184	      file extension.

186	   For octets fetched over some other protocols, e.g., FTP [RFC0959],
187	   there is no type information.

189	   Note: Comparisons between media types, as defined by MIME
190	   specifications, are done in an ASCII case-insensitive manner.
191	   [RFC2046]

193	4.  Web Pages

195	   The user agent MUST use the following algorithm to determine the
196	   sniffed-type of a sequence of octets:

198	   1.  If the user agent is configured to strictly obey the official-
199	       type, then let the sniffed-type be the official-type and abort
200	       these steps.

202	   2.  If the octets were fetched via HTTP and there is an HTTP Content-
203	       Type header field and the value of the last such header field has
204	       octets that *exactly* match the octets contained in one of the
205	       following lines:

207	      +-------------------------------+--------------------------------+
208	      | Bytes in Hexadecimal          | Textual Representation         |
209	      +-------------------------------+--------------------------------+
210	      | 74 65 78 74 2f 70 6c 61 69 6e | text/plain                     |
211	      +-------------------------------+--------------------------------+
212	      | 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=ISO-8859-1 |
213	      | 3b 20 63 68 61 72 73 65 74 3d |                                |
214	      | 49 53 4f 2d 38 38 35 39 2d 31 |                                |
215	      +-------------------------------+--------------------------------+
216	      | 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=iso-8859-1 |
217	      | 3b 20 63 68 61 72 73 65 74 3d |                                |
218	      | 69 73 6f 2d 38 38 35 39 2d 31 |                                |
219	      +-------------------------------+--------------------------------+
220	      | 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=UTF-8      |
221	      | 3b 20 63 68 61 72 73 65 74 3d |                                |
222	      | 55 54 46 2d 38                |                                |
223	      +-------------------------------+--------------------------------+

225	       ...then jump to the "text or binary" section below.

227	   3.  If there is no official-type, jump to the "unknown type" section
228	       below.

230	   4.  If the official-type is "unknown/unknown", "application/unknown",
231	       or "*/*", jump to the "unknown type" section below.

233	   5.  If the official-type ends in "+xml", or if it is either "text/
234	       xml" or "application/xml", then let the sniffed-type be the
235	       official-type and abort these steps.

237	   6.  If the official-type is an image type supported by the user agent
238	       (e.g., "image/png", "image/gif", "image/jpeg", etc), then jump to
239	       the "image" section below.

241	   7.  If the official-type is "text/html", then jump to the "feed or
242	       HTML" section below.

244	   8.  Let the sniffed-type be the official-type.
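
The eight steps above amount to a dispatch on the official-type.  The
sketch below is one possible reading, not a normative implementation.
The function name next_step, its parameters, the returned labels, and
the default set of supported image types are all assumptions for the
example; official is assumed to be already lowercased, and
raw_content_type is assumed to hold the exact octets of the last
Content-Type header field value.

```python
from typing import FrozenSet, Optional

# Step 2: header values that must match octet-for-octet.
TEXT_PLAIN_VALUES = {
    b"text/plain",
    b"text/plain; charset=ISO-8859-1",
    b"text/plain; charset=iso-8859-1",
    b"text/plain; charset=UTF-8",
}
UNKNOWN_TYPES = {"unknown/unknown", "application/unknown", "*/*"}
XML_TYPES = {"text/xml", "application/xml"}


def next_step(
    official: Optional[str],
    raw_content_type: Optional[bytes],
    strict: bool = False,
    supported_images: FrozenSet[str] = frozenset(
        {"image/png", "image/gif", "image/jpeg"}  # assumed example set
    ),
) -> str:
    """Return "official-type" or the name of the section to jump to."""
    if strict:                                   # step 1
        return "official-type"
    if raw_content_type in TEXT_PLAIN_VALUES:    # step 2
        return "text or binary"
    if official is None:                         # step 3
        return "unknown type"
    if official in UNKNOWN_TYPES:                # step 4
        return "unknown type"
    if official.endswith("+xml") or official in XML_TYPES:  # step 5
        return "official-type"
    if official in supported_images:             # step 6
        return "image"
    if official == "text/html":                  # step 7
        return "feed or HTML"
    return "official-type"                       # step 8
```

Note that step 2 compares raw octets, so a value such as
"text/plain; charset=utf-8" (lowercase charset) does not trigger the
"text or binary" rules in this sketch.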

246	5.  Text or Binary

248	   This section defines the *rules for distinguishing if a resource is
249	   text or binary*.

251	   1.  The user agent MAY wait for 512 or more octets to arrive.

253	          Note: Waiting for 512 octets to arrive causes the text-
254	          or-binary algorithm to be deterministic for a given sequence
255	          of octets.  However, in some cases, the user agent might need
256	          to wait an arbitrary length of time for these octets to
257	          arrive.  User agents SHOULD wait for 512 octets to arrive,
258	          when feasible.

260	   2.  Let n be the smaller of either 512 or the number of octets that
261	       have already arrived.

263	   3.  If n is greater than or equal to 3, and the first 2 or 3 octets
264	       match one of the following octet sequences:

266	                   +----------------------+--------------+
267	                   | Bytes in Hexadecimal | Description  |
268	                   +----------------------+--------------+
269	                   | FE FF                | UTF-16BE BOM |
270	                   | FF FE                | UTF-16LE BOM |
271	                   | EF BB BF             | UTF-8 BOM    |
272	                   +----------------------+--------------+

274	       ...then let the sniffed-type be "text/plain" and abort these
275	       steps.

277	   4.  If none of the first n octets are binary data octets then let the
278	       sniffed-type be "text/plain" and abort these steps.

280	                         +-------------------------+
281	                         | Binary Data Byte Ranges |
282	                         +-------------------------+
283	                         | 0x00 -- 0x08            |
284	                         | 0x0B                    |
285	                         | 0x0E -- 0x1A            |
286	                         | 0x1C -- 0x1F            |
287	                         +-------------------------+

289	   5.  If the first octets match one of the octet sequences in the
290	       "pattern" column of the table in the "unknown type" section
291	       below, ignoring any rows whose cell in the "security" column says
292	       "scriptable" (or "n/a"), then let the sniffed-type be the type
293	       given in the corresponding cell in the "sniffed type" column on
294	       that row and abort these steps.

296	          WARNING!  It is critical that this step not ever return a
297	          scriptable type (e.g., text/html), because otherwise that
298	          would allow a privilege escalation attack.

300	   6.  Otherwise, let the sniffed-type be "application/octet-stream" and
301	       abort these steps.
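
As a rough illustration of steps 2 through 6 above, the following sketch
may help.  It is not a definitive implementation.  The names
is_binary_octet and text_or_binary are assumptions, and step 5 is
represented by an optional, hypothetical callback safe_signature_type
that would consult the non-scriptable rows of the "unknown type" table;
by default that step is skipped.

```python
from typing import Callable, Optional

# Step 3: byte order marks that indicate text.
BOMS = (b"\xfe\xff", b"\xff\xfe", b"\xef\xbb\xbf")


def is_binary_octet(octet: int) -> bool:
    """True if the octet falls in one of the binary data byte ranges."""
    return (
        0x00 <= octet <= 0x08
        or octet == 0x0B
        or 0x0E <= octet <= 0x1A
        or 0x1C <= octet <= 0x1F
    )


def text_or_binary(
    data: bytes,
    safe_signature_type: Optional[Callable[[bytes], Optional[str]]] = None,
) -> str:
    """Return the sniffed-type for octets that arrived so far."""
    head = data[:512]  # step 2: n is the smaller of 512 and len(data)
    n = len(head)

    # Step 3: a leading BOM means text/plain.
    if n >= 3 and head.startswith(BOMS):
        return "text/plain"

    # Step 4: no binary data octets among the first n means text/plain.
    if not any(is_binary_octet(octet) for octet in head):
        return "text/plain"

    # Step 5 (hypothetical hook): only non-scriptable table rows may be
    # consulted here, so this must never return a type such as text/html.
    if safe_signature_type is not None:
        matched = safe_signature_type(head)
        if matched is not None:
            return matched

    # Step 6: fall back to a generic binary type.
    return "application/octet-stream"
```

Under this sketch, 0x1B (ASCII ESC) is not a binary data octet, since it
falls between the 0x0E -- 0x1A and 0x1C -- 0x1F ranges.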

303	6.  Unknown Type

305	   1.  The user agent MAY wait for 512 or more octets to arrive for the
306	       same reason as in the "text or binary" section above.

308	   2.  Let n be the smaller of either 512 or the number of octets that
309	       have already arrived.

311	   3.  For each row in the table below:

313	       *  If the row has no "WS" or "_>" octets:

315	          1.  Let pattern-length be the length of the pattern.

317	          2.  If n is smaller than pattern-length then skip this row.

319	          3.  Apply the bit-wise "and" operator to the first pattern-
320	              length octets and the given mask, and let the result be
321	              the masked-data.

323	          4.  If the octets of the masked-data match the given pattern
324	              octets exactly, then let the sniffed-type be the type
325	              given in the cell of the third column in that row and
326	              abort these steps.

328	       *  If the row has a "WS" octet or a "_>" octet:

330	          1.  Let index-pattern be an index into the mask and pattern
331	              octet strings of the row.

333	          2.  Let index-stream be an index into the octet stream being
334	              examined.

336	          3.  LOOP: If index-stream points beyond the end of the octet
337	              stream, then this row doesn't match; skip this row.

339	          4.  Examine the index-stream-th octet of the octet stream as
340	              follows:

342	              -  If the index-pattern-th octet of the pattern is a
343	                 normal hexadecimal octet and not a "WS" octet or a "_>"
344	                 octet:

346	                    If the bit-wise "and" operator, applied to the
347	                    index-stream-th octet of the stream and the index-
348	                    pattern-th octet of the mask, yields a value
349	                    different from the index-pattern-th octet of the
350	                    pattern, then skip this row.

352	                    Otherwise, increment index-pattern to the next octet
353	                    in the mask and pattern and index-stream to the next
354	                    octet in the octet stream.

356	              -  Otherwise, if the index-pattern-th octet of the pattern
357	                 is a "WS" octet:

359	                    "WS" means "whitespace", and allows insignificant
360	                    whitespace to be skipped when sniffing for a type
361	                    signature.

363	                    If the index-stream-th octet of the stream is one of
364	                    0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0C (ASCII FF),
365	                    0x0D (ASCII CR), or 0x20 (ASCII space), then
366	                    increment only the index-stream to the next octet in
367	                    the octet stream.

369	                    Otherwise, increment only the index-pattern to the
370	                    next octet in the mask and pattern.

372	              -  Otherwise, if the index-pattern-th octet of the pattern
373	                 is a "_>" octet:

375	                    "_>" means "space-or-bracket", and allows HTML tag
376	                    names to terminate with either a space or a greater
377	                    than sign.

379	                    If the index-stream-th octet of the stream is neither
380	                    0x20 (ASCII space) nor 0x3E (ASCII ">"), then skip
381	                    this row.

383	                    Otherwise, increment index-pattern to the next octet
384	                    in the mask and pattern and index-stream to the next
385	                    octet in the octet stream.

387	          5.  If index-pattern does not point beyond the end of the mask
388	              and pattern octet strings, then jump back to the LOOP step
389	              in this algorithm.

391	          6.  Otherwise, let the sniffed-type be the type given in the
392	              cell of the third column in that row and abort these
393	              steps.

395	   4.  If the first n octets match the signature for MP4 (as defined in
396	       Section 6.1), then let the sniffed-type be video/mp4 and abort
397	       these steps.

399	   5.  If none of the first n octets are binary data (as defined in the
400	       "text or binary" section), then let the sniffed-type be "text/
401	       plain" and abort these steps.

403	   6.  Otherwise, let the sniffed-type be "application/octet-stream" and
404	       abort these steps.
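
The two row-matching procedures in step 3 can be sketched as below.
This is an illustration under stated assumptions rather than a
reference implementation.  The function names, the Token type, and the
representation of a row as parallel mask and pattern lists (with the
strings "WS" and "_>" standing in for the special octets) are all
choices made for the example.  DOCTYPE_MASK and DOCTYPE_PATTERN
transcribe the first row of the table that follows.

```python
from typing import List, Sequence, Union

Token = Union[int, str]  # an octet value, or the marker "WS" or "_>"
WHITESPACE = {0x09, 0x0A, 0x0C, 0x0D, 0x20}  # TAB, LF, FF, CR, space

# First table row (text/html, scriptable), transcribed octet by octet.
DOCTYPE_MASK = [0xFF] * 3 + [0xDF] * 7 + [0xFF] + [0xDF] * 4 + [0xFF]
DOCTYPE_PATTERN: List[Token] = [
    "WS", 0x3C, 0x21, 0x44, 0x4F, 0x43, 0x54, 0x59,
    0x50, 0x45, 0x20, 0x48, 0x54, 0x4D, 0x4C, "_>",
]


def matches_plain_row(data: bytes, mask: Sequence[int], pattern: Sequence[int]) -> bool:
    """Row without "WS" or "_>" octets: mask the prefix and compare."""
    if len(data) < len(pattern):
        return False  # n is smaller than pattern-length: skip this row
    masked = bytes(d & m for d, m in zip(data, mask))
    return masked == bytes(pattern)


def matches_ws_row(data: bytes, mask: Sequence[int], pattern: Sequence[Token]) -> bool:
    """Row containing "WS" or "_>" octets: walk stream and pattern together."""
    data = data[:512]
    index_pattern = 0
    index_stream = 0
    while True:
        if index_stream >= len(data):  # LOOP: ran off the end of the stream
            return False
        octet = data[index_stream]
        token = pattern[index_pattern]
        if token == "WS":
            if octet in WHITESPACE:
                index_stream += 1      # skip insignificant whitespace
            else:
                index_pattern += 1     # whitespace run is over
        elif token == "_>":
            if octet not in (0x20, 0x3E):
                return False           # tag name not terminated by " " or ">"
            index_pattern += 1
            index_stream += 1
        else:
            if (octet & mask[index_pattern]) != token:
                return False
            index_pattern += 1
            index_stream += 1
        if index_pattern >= len(pattern):
            return True                # every pattern octet was matched
```

The 0xDF mask clears bit 0x20, which makes the comparison of ASCII
letters case-insensitive; this is why a lowercase "<!doctype html>"
matches the uppercase pattern octets in this sketch.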

406	   The table used by the above algorithm is:

408	+-------------------+-------------------+-----------------+------------+
409	| Mask in Hex       | Pattern in Hex    | Sniffed Type    | Security   |
410	+-------------------+-------------------+-----------------+------------+
411	| FF FF FF DF DF DF | WS 3C 21 44 4F 43 | text/html       | Scriptable |
412	| DF DF DF DF FF DF | 54 59 50 45 20 48 |                 |            |
413	| DF DF DF FF       | 54 4D 4C _>       |                 |            |
414	| Comment: <!DOCTYPE HTML               |                 |            |
418	| Comment:                 |                 |            |
422	| Comment:           |                 |            |
426	| Comment: