idnits 2.17.1

draft-ietf-websec-mime-sniff-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Sep 2009 rather than the newer Notice from 28 Dec 2009.  (See
     https://trustee.ietf.org/license-info/)


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 115: '...ble, user agents SHOULD NOT employ a c...'
     RFC 2119 keyword, line 117: '..., the user agent SHOULD use the algori...'
     RFC 2119 keyword, line 127: '...ed as algorithms or specific steps MAY...'
     RFC 2119 keyword, line 147: '..., the user agent MUST use the textuall...'
     RFC 2119 keyword, line 175: '... The user agent MUST use the followin...'
     (8 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (February 4, 2011) is 4831 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC2046' is mentioned on line 171, but not defined

  -- Looks like a reference, but probably isn't: '0' on line 634

  -- Looks like a reference, but probably isn't: '1' on line 634

  -- Looks like a reference, but probably isn't: '2' on line 634

  -- Possible downref: Non-RFC (?) normative reference: ref.
     'BarthCaballeroSong2009'


     Summary: 5 errors (**), 0 flaws (~~), 3 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	None                                                            A. Barth
3	Internet-Draft                                                I. Hickson
4	Expires: August 8, 2011                                     Google, Inc.
5	                                                        February 4, 2011

7	                          Media Type Sniffing
8	                    draft-ietf-websec-mime-sniff-02

10	Abstract

12	   Many web servers supply incorrect Content-Type header fields with
13	   their HTTP responses.  In order to be compatible with these servers,
14	   user agents consider the content of HTTP responses as well as the
15	   Content-Type header fields when determining the effective media type
16	   of the response.  This document describes an algorithm for
17	   determining the effective media type of HTTP responses that balances
18	   security and compatibility considerations.

20	   Please send feedback on this draft to websec@ietf.org.

22	Status of this Memo

24	   This Internet-Draft is submitted to IETF in full conformance with the
25	   provisions of BCP 78 and BCP 79.

27	   Internet-Drafts are working documents of the Internet Engineering
28	   Task Force (IETF), its areas, and its working groups.  Note that
29	   other groups may also distribute working documents as Internet-
30	   Drafts.

32	   Internet-Drafts are draft documents valid for a maximum of six months
33	   and may be updated, replaced, or obsoleted by other documents at any
34	   time.  It is inappropriate to use Internet-Drafts as reference
35	   material or to cite them other than as "work in progress."

37	   The list of current Internet-Drafts can be accessed at
38	   http://www.ietf.org/ietf/1id-abstracts.txt.

40	   The list of Internet-Draft Shadow Directories can be accessed at
41	   http://www.ietf.org/shadow.html.

43	   This Internet-Draft will expire on August 8, 2011.

45	Copyright Notice

47	   Copyright (c) 2011 IETF Trust and the persons identified as the
48	   document authors.  All rights reserved.

50	   This document is subject to BCP 78 and the IETF Trust's Legal
51	   Provisions Relating to IETF Documents
52	   (http://trustee.ietf.org/license-info) in effect on the date of
53	   publication of this document.  Please review these documents
54	   carefully, as they describe your rights and restrictions with respect
55	   to this document.  Code Components extracted from this document must
56	   include Simplified BSD License text as described in Section 4.e of
57	   the Trust Legal Provisions and are provided without warranty as
58	   described in the BSD License.

60	Table of Contents

62	   1.  Introduction
63	   2.  Metadata
64	   3.  Web Pages
65	   4.  Text or Binary
66	   5.  Unknown Type
67	     5.1.  Signature for H.264
68	   6.  Image
69	   7.  Video
70	   8.  Fonts
71	   9.  Feed or HTML
72	   10. References
73	   Authors' Addresses

75	1.  Introduction

77	   The HTTP Content-Type header field indicates the media type of an
78	   HTTP response.  However, many HTTP servers supply a Content-Type that
79	   does not match the actual contents of the response.  Historically,
80	   web browsers have tolerated these servers by examining the content of
81	   HTTP responses in addition to the Content-Type header field to
82	   determine the effective media type of the response.

84	   Without a clear specification of how to "sniff" the media type, each
85	   user agent implementor was forced to reverse engineer the behavior of
86	   the other user agents and to develop their own algorithm.  These
87	   divergent algorithms have led to a lack of interoperability between
88	   user agents and to security issues when the server intends an HTTP
89	   response to be interpreted as one media type but some user agents
90	   interpret the response as another media type.

92	   These security issues are most severe when an "honest" server lets
93	   potentially malicious users upload files and then serves the contents
94	   of those files with a low-privilege media type (such as text/plain or
95	   image/jpeg).  (Malicious servers, of course, can specify an arbitrary
96	   media type in the Content-Type header field.)  In the absence of
97	   media type sniffing, this user-generated content would not be
98	   interpreted as a high-privilege media type, such as text/html.
99	   However, if a user agent does interpret a low-privilege media type,
100	   such as image/gif, as a high-privilege media type, such as text/html,
101	   the user agent has created a privilege escalation vulnerability in
102	   the server.  For example, a malicious user might be able to leverage
103	   content sniffing to mount a cross-site scripting attack by including
104	   JavaScript code in the uploaded file that a user agent treats as
105	   text/html.

107	   This document describes a content sniffing algorithm that carefully
108	   balances the compatibility needs of user agent implementors with the
109	   security constraints.  The algorithm has been constructed with
110	   reference to content sniffing algorithms present in popular user
111	   agents, an extensive database of existing web content, and metrics
112	   collected from implementations deployed to a sizable number of users
113	   [BarthCaballeroSong2009].

115	   WARNING!  Whenever possible, user agents SHOULD NOT employ a content
116	   sniffing algorithm.  However, if a user agent does employ a content
117	   sniffing algorithm, the user agent SHOULD use the algorithm in this
118	   document because using a different content sniffing algorithm than
119	   servers expect causes security problems.  For example, if a server
120	   believes that the client will treat a contributed file as an image
121	   (and thus treat it as benign), but a user agent believes the content
122	   to be HTML (and thus privileged to execute any scripts contained
123	   therein), an attacker might be able to steal the user's
124	   authentication credentials and mount other cross-site scripting
125	   attacks.

127	   Conformance requirements phrased as algorithms or specific steps MAY
128	   be implemented in any manner, so long as the end result is
129	   equivalent.  (In particular, the algorithms defined in this
130	   specification are intended to be easy to follow, and not intended to
131	   be performant.)

133	2.  Metadata

135	   The explicit media type metadata associated with a sequence of
136	   octets depends on the protocol that was used to fetch the octets.

138	   For octets received via HTTP, the Content-Type HTTP header field, if
139	   present, indicates the media type.  Let the official-type be the
140	   media type indicated by the HTTP Content-Type header field, if
141	   present.  If the Content-Type header field is absent or if its value
142	   cannot be interpreted as a media type (e.g. because its value doesn't
143	   contain a U+002F SOLIDUS ('/') character), then there is no official-
144	   type.

146	      Note: If an HTTP response contains multiple Content-Type header
147	      fields, the user agent MUST use the textually last Content-Type
148	      header field to determine the official-type.  For example, if the
149	      last Content-Type header field contains the value "foo", then
150	      there is no official-type because "foo" cannot be interpreted as
151	      a media type (even if the HTTP response contains another Content-
152	      Type header field that could be interpreted as a media type).

154	   For octets fetched from the file system, user agents should use
155	   platform-specific conventions (e.g., operating system file extension/
156	   type mappings) to determine the official-type.

158	      Note: It is essential that file extensions are not used for
159	      determining the media type for octets fetched over HTTP because,
160	      in some cases, file extensions can be supplied by malicious
161	      parties.  For example, most PHP installations let the attacker
162	      append arbitrary path information to URLs (e.g.,
163	      http://example.com/foo.php/bar.html) and thereby determine the
164	      file extension.

166	   For octets fetched over some other protocols, e.g., FTP, there is no
167	   type information.

169	   Note: Comparisons between media types, as defined by MIME
170	   specifications, are done in an ASCII case-insensitive manner.
171	   [RFC2046]
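
   The HTTP rules in this section can be sketched as follows.  This is
   an illustrative sketch only, not part of the algorithm text.  The
   function name official_type and the representation of the header
   fields as a list of (name, value) pairs in wire order are
   assumptions made for the example.  The only validity check performed
   is the one the text names (presence of a U+002F SOLIDUS character);
   a real user agent may apply a stricter parse.

```python
from typing import List, Optional, Tuple


def official_type(headers: List[Tuple[str, str]]) -> Optional[str]:
    """Return the official-type of an HTTP response, or None if there is none.

    `headers` is a hypothetical list of (name, value) pairs in wire order.
    """
    # Header field names are case-insensitive; collect every Content-Type.
    values = [value for (name, value) in headers
              if name.lower() == "content-type"]
    if not values:
        return None  # header field absent: no official-type

    # Only the textually last Content-Type header field is considered,
    # even if an earlier one would have been a valid media type.
    last = values[-1]

    # Assumption: parameters such as "; charset=..." are not part of the
    # media type itself, so they are dropped before the check.
    media_type = last.split(";", 1)[0].strip()

    # A value without "/" cannot be interpreted as a media type.
    if "/" not in media_type:
        return None

    # Media types compare ASCII case-insensitively, so normalize to lowercase.
    return media_type.lower()
```

   Note that the sketch returns only the normalized media type.  The
   "Web Pages" algorithm also compares the raw header field value octet
   by octet, so an implementation along these lines would need to keep
   the raw value of the last header field as well.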

173	3.  Web Pages

175	   The user agent MUST use the following algorithm to determine the
176	   sniffed-type of a sequence of octets:

178	   1.  If the user agent is configured to strictly obey the official-
179	       type, then let the sniffed-type be the official-type and abort
180	       these steps.

182	   2.  If the octets were fetched via HTTP and there is an HTTP Content-
183	       Type header field and the value of the last such header field has
184	       octets that *exactly* match the octets contained in one of the
185	       following lines:

187	      +-------------------------------+--------------------------------+
188	      | Bytes in Hexadecimal          | Textual Representation         |
189	      +-------------------------------+--------------------------------+
190	      | 74 65 78 74 2f 70 6c 61 69 6e | text/plain                     |
191	      +-------------------------------+--------------------------------+
192	      | 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=ISO-8859-1 |
193	      | 3b 20 63 68 61 72 73 65 74 3d |                                |
194	      | 49 53 4f 2d 38 38 35 39 2d 31 |                                |
195	      +-------------------------------+--------------------------------+
196	      | 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=iso-8859-1 |
197	      | 3b 20 63 68 61 72 73 65 74 3d |                                |
198	      | 69 73 6f 2d 38 38 35 39 2d 31 |                                |
199	      +-------------------------------+--------------------------------+
200	      | 74 65 78 74 2f 70 6c 61 69 6e | text/plain; charset=UTF-8      |
201	      | 3b 20 63 68 61 72 73 65 74 3d |                                |
202	      | 55 54 46 2d 38                |                                |
203	      +-------------------------------+--------------------------------+

205	       ...then jump to the "text or binary" section below.

207	   3.  If there is no official-type, jump to the "unknown type" section
208	       below.

210	   4.  If the official-type is "unknown/unknown", "application/unknown",
211	       or "*/*", jump to the "unknown type" section below.

213	   5.  If the official-type ends in "+xml", or if it is either "text/
214	       xml" or "application/xml", then let the sniffed-type be the
215	       official-type and abort these steps.

217	   6.  If the official-type is an image type supported by the user agent
218	       (e.g., "image/png", "image/gif", or "image/jpeg"), then jump to
219	       the "image" section below.

221	   7.  If the official-type is "text/html", then jump to the "feed or
222	       HTML" section below.

224	   8.  Let the sniffed-type be the official-type.
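
   The dispatch performed by the steps above can be sketched as
   follows.  This is a non-normative illustration.  The function name
   next_step, its parameters, the returned labels, and the default set
   of supported image types are all assumptions made for the example.
   raw_content_type stands for the raw value of the last Content-Type
   header field (or None if there is none or the octets were not
   fetched via HTTP), and official_type for the official-type from the
   "Metadata" section.

```python
from typing import FrozenSet, Optional

# Step 2: header values that must match octet for octet (no normalization).
TEXT_OR_BINARY_VALUES = frozenset({
    b"text/plain",
    b"text/plain; charset=ISO-8859-1",
    b"text/plain; charset=iso-8859-1",
    b"text/plain; charset=UTF-8",
})

# Step 4: official-types that carry no real information.
UNKNOWN_TYPES = frozenset({"unknown/unknown", "application/unknown", "*/*"})

# Assumption: an example set of image types the user agent supports.
DEFAULT_IMAGES = frozenset({"image/png", "image/gif", "image/jpeg"})


def next_step(raw_content_type: Optional[bytes],
              official_type: Optional[str],
              strict: bool = False,
              supported_images: FrozenSet[str] = DEFAULT_IMAGES) -> str:
    """Return a label naming where the algorithm goes next.

    "official" means the sniffed-type is the official-type; the other
    labels name the section that continues the algorithm.
    """
    if strict:                                     # step 1
        return "official"
    if raw_content_type in TEXT_OR_BINARY_VALUES:  # step 2 (exact match)
        return "text-or-binary"
    if official_type is None:                      # step 3
        return "unknown-type"
    media_type = official_type.lower()             # case-insensitive compare
    if media_type in UNKNOWN_TYPES:                # step 4
        return "unknown-type"
    if media_type.endswith("+xml") or media_type in ("text/xml",
                                                     "application/xml"):
        return "official"                          # step 5
    if media_type in supported_images:             # step 6
        return "image"
    if media_type == "text/html":                  # step 7
        return "feed-or-html"
    return "official"                              # step 8
```

   Because step 2 is an exact octet comparison, a value such as
   "text/plain; charset=utf-8" (lowercase "utf-8") does not trigger the
   text-or-binary check in this sketch and falls through to step 8.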

226	4.  Text or Binary

228	   This section defines the *rules for distinguishing if a resource is
229	   text or binary*.

231	   1.  The user agent MAY wait for 512 or more octets to arrive.

233	          Note: Waiting for 512 octets to arrive causes the text-
234	          or-binary algorithm to be deterministic for a given sequence
235	          of octets.  However, in some cases, the user agent might need
236	          to wait an arbitrary length of time for these octets to
237	          arrive.  User agents SHOULD wait for 512 octets to arrive,
238	          when feasible.

240	   2.  Let n be the smaller of either 512 or the number of octets that
241	       have already arrived.

243	   3.  If n is greater than or equal to 3, and the first 2 or 3 octets
244	       match one of the following octet sequences:

246	                   +----------------------+--------------+
247	                   | Bytes in Hexadecimal | Description  |
248	                   +----------------------+--------------+
249	                   | FE FF                | UTF-16BE BOM |
250	                   | FF FE                | UTF-16LE BOM |
251	                   | EF BB BF             | UTF-8 BOM    |
252	                   +----------------------+--------------+

254	       ...then let the sniffed-type be "text/plain" and abort these
255	       steps.

257	   4.  If none of the first n octets are binary data octets then let the
258	       sniffed-type be "text/plain" and abort these steps.

260	                         +-------------------------+
261	                         | Binary Data Byte Ranges |
262	                         +-------------------------+
263	                         | 0x00 -- 0x08            |
264	                         | 0x0B                    |
265	                         | 0x0E -- 0x1A            |
266	                         | 0x1C -- 0x1F            |
267	                         +-------------------------+

269	   5.  If the first octets match one of the octet sequences in the
270	       "pattern" column of the table in the "unknown type" section
271	       below, ignoring any rows whose cell in the "security" column says
272	       "scriptable" (or "n/a"), then let the sniffed-type be the type
273	       given in the corresponding cell in the "sniffed type" column on
274	       that row and abort these steps.

276	          WARNING!  It is critical that this step not ever return a
277	          scriptable type (e.g., text/html), because otherwise that
278	          would allow a privilege escalation attack.

280	   6.  Otherwise, let the sniffed-type be "application/octet-stream" and
281	       abort these steps.
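
   Steps 2 through 6 above can be sketched as follows.  This is a
   non-normative illustration.  The function name text_or_binary is an
   assumption, and so is the safe_signature_match callback, which
   stands in for step 5: it is expected to consult only the
   non-scriptable rows of the table in the "unknown type" section and
   to return a media type or None.  The default callback never matches.

```python
from typing import Callable, Optional

# Step 3: byte order marks that identify the resource as text.
BOMS = (b"\xFE\xFF", b"\xFF\xFE", b"\xEF\xBB\xBF")

# Step 4: binary data octets are 0x00-0x08, 0x0B, 0x0E-0x1A and 0x1C-0x1F.
BINARY_OCTETS = frozenset(
    list(range(0x00, 0x09)) + [0x0B]
    + list(range(0x0E, 0x1B)) + list(range(0x1C, 0x20))
)


def text_or_binary(
    data: bytes,
    safe_signature_match: Callable[[bytes], Optional[str]] = lambda prefix: None,
) -> str:
    """Return the sniffed-type for octets labelled with a bare text/plain type."""
    prefix = data[:512]            # step 2: n = min(512, octets that arrived)
    n = len(prefix)

    # Step 3: a leading BOM means text, provided at least 3 octets arrived.
    if n >= 3 and prefix.startswith(BOMS):
        return "text/plain"

    # Step 4: no binary data octets among the first n octets means text.
    if not any(octet in BINARY_OCTETS for octet in prefix):
        return "text/plain"

    # Step 5: try non-scriptable signatures only (hypothetical callback).
    sniffed = safe_signature_match(prefix)
    if sniffed is not None:
        return sniffed

    # Step 6: everything else is treated as opaque binary data.
    return "application/octet-stream"
```

   The binary data ranges leave out TAB (0x09), LF (0x0A), FF (0x0C),
   CR (0x0D), and ESC (0x1B), so ordinary text files and terminal
   escape sequences are still classified as text/plain.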

283	5.  Unknown Type

285	   1.  The user agent MAY wait for 512 or more octets to arrive for the
286	       same reason as in the "text or binary" section above.

288	   2.  Let n be the smaller of either 512 or the number of octets that
289	       have already arrived.

291	   3.  For each row in the table below:

293	       *  If the row has no "WS" octets and no "_>" octets:

295	          1.  Let pattern-length be the length of the pattern.

297	          2.  If n is smaller than pattern-length then skip this row.

299	          3.  Apply the bit-wise "and" operator to the first pattern-
300	              length octets and the given mask, and let the result be
301	              the masked-data.

303	          4.  If the octets of the masked-data match the given pattern
304	              octets exactly, then let the sniffed-type be the type
305	              given in the cell of the third column in that row and
306	              abort these steps.

308	       *  If the row has a "WS" octet or a "_>" octet:

310	          1.  Let index-pattern be an index into the mask and pattern
311	              octet strings of the row.

313	          2.  Let index-stream be an index into the octet stream being
314	              examined.

316	          3.  LOOP: If index-stream points beyond the end of the octet
317	              stream, then this row doesn't match and skip this row.

319	          4.  Examine the index-stream-th octet of the octet stream as
320	              follows:

322	              -  If the index-pattern-th octet of the pattern is a
323	                 normal hexadecimal octet and not a "WS" octet or a "_>"
324	                 octet:

326	                    If the bit-wise "and" operator, applied to the
327	                    index-stream-th octet of the stream and the index-
328	                    pattern-th octet of the mask, yields a value
329	                    different from the index-pattern-th octet of the
330	                    pattern, then skip this row.

332	                    Otherwise, increment index-pattern to the next octet
333	                    in the mask and pattern and index-stream to the next
334	                    octet in the octet stream.

336	              -  Otherwise, if the index-pattern-th octet of the pattern
337	                 is a "WS" octet:

339	                    "WS" means "whitespace", and allows insignificant
340	                    whitespace to be skipped when sniffing for a type
341	                    signature.

343	                    If the index-stream-th octet of the stream is one of
344	                    0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0C (ASCII FF),
345	                    0x0D (ASCII CR), or 0x20 (ASCII space), then
346	                    increment only the index-stream to the next octet in
347	                    the octet stream.

349	                    Otherwise, increment only the index-pattern to the
350	                    next octet in the mask and pattern.

352	              -  Otherwise, if the index-pattern-th octet of the pattern
353	                 is a "_>" octet:

355	                    "_>" means "space-or-bracket", and allows HTML tag
356	                    names to terminate with either a space or a greater
357	                    than sign.

359	                    If the index-stream-th octet of the stream is
360	                    neither 0x20 (ASCII space) nor 0x3E (ASCII ">"),
361	                    then skip this row.

363	                    Otherwise, increment index-pattern to the next octet
364	                    in the mask and pattern and index-stream to the next
365	                    octet in the octet stream.

367	          5.  If index-pattern does not point beyond the end of the mask
368	              and pattern octet strings, then jump back to the LOOP step
369	              in this algorithm.

371	          6.  Otherwise, let the sniffed-type be the type given in the
372	              cell of the third column in that row and abort these
373	              steps.

375	   4.  If the first n octets match the signature for H264 (as defined in
376	       Section 5.1), then let the sniffed-type be video/H264 and abort
377	       these steps.

379	   5.  If none of the first n octets are binary data (as defined in the
380	       "text or binary" section), then let the sniffed-type be "text/
381	       plain" and abort these steps.

383	   6.  Otherwise, let the sniffed-type be "application/octet-stream" and
384	       abort these steps.
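
   The per-row matching in step 3 can be sketched as follows.  This is
   a non-normative illustration.  The names parse_cells and row_matches
   are assumptions, as is the choice to start both indexes at zero
   (the text does not state initial values) and to pass in only the
   octets that have already arrived.

```python
from typing import List, Sequence, Union

Token = Union[int, str]  # an octet value, or the marker "WS" or "_>"

WHITESPACE = frozenset({0x09, 0x0A, 0x0C, 0x0D, 0x20})  # TAB, LF, FF, CR, SP


def parse_cells(text: str) -> List[Token]:
    """Turn table text such as 'WS 3C 21' into ['WS', 0x3C, 0x21]."""
    return [t if t in ("WS", "_>") else int(t, 16) for t in text.split()]


def row_matches(data: bytes, mask: Sequence[int],
                pattern: Sequence[Token]) -> bool:
    """Return True if `data` matches one (mask, pattern) row of the table."""
    if "WS" not in pattern and "_>" not in pattern:
        # Fixed-length row: mask the first pattern-length octets and compare.
        if len(data) < len(pattern):
            return False
        return all((data[i] & mask[i]) == pattern[i]
                   for i in range(len(pattern)))

    index_pattern = index_stream = 0       # assumption: both start at zero
    while index_pattern < len(pattern):    # step 5: loop until pattern is used up
        if index_stream >= len(data):      # LOOP step: ran out of stream
            return False
        octet, want = data[index_stream], pattern[index_pattern]
        if want == "WS":
            if octet in WHITESPACE:
                index_stream += 1          # skip insignificant whitespace
            else:
                index_pattern += 1         # whitespace run is over
        elif want == "_>":
            if octet not in (0x20, 0x3E):  # tag name must end in SP or ">"
                return False
            index_pattern += 1
            index_stream += 1
        else:
            if (octet & mask[index_pattern]) != want:
                return False
            index_pattern += 1
            index_stream += 1
    return True                            # step 6: the row matched
```

   For the first row of the table, the 0xDF mask entries clear bit 0x20
   of each letter, which makes the comparison ASCII case-insensitive:

```python
MASK = parse_cells("FF FF FF DF DF DF DF DF DF DF FF DF DF DF DF FF")
PATTERN = parse_cells("WS 3C 21 44 4F 43 54 59 50 45 20 48 54 4D 4C _>")
row_matches(b"  \n<!doctype html>", MASK, PATTERN)   # matches
row_matches(b"<!DOCTYPE HTMLx", MASK, PATTERN)       # does not match
```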

386	   The table used by the above algorithm is:

388	+-------------------+-------------------+-----------------+------------+
389	| Mask in Hex       | Pattern in Hex    | Sniffed Type    | Security   |
390	+-------------------+-------------------+-----------------+------------+
391	| FF FF FF DF DF DF | WS 3C 21 44 4F 43 | text/html       | Scriptable |
392	| DF DF DF DF FF DF | 54 59 50 45 20 48 |                 |            |
393	| DF DF DF FF       | 54 4D 4C _>       |                 |            |
394	| Comment: <!DOCTYPE HTML               |                 |            |
398	| Comment:                 |                 |            |
402	| Comment:           |                 |            |
406	| Comment: