idnits 2.17.1 

draft-ietf-appsawg-xml-mediatypes-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 3) being 60 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC3023]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 189 has weird spacing: '...andards  tree ...'

  == Line 949 has weird spacing: '...  Since  the X...'

  == Line 1106 has weird spacing: '...related  media...'

  == Line 1352 has weird spacing: '... S. and  et al...'

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document date (November 2012) is 4173 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'ISO8859' is defined on line 1262, but no explicit
     reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. 'ASCII'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'CSS'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'HTTPbis'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'ISO8859'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'PNG'

  ** Obsolete normative reference: RFC 1652 (Obsoleted by RFC 6152)

  ** Obsolete normative reference: RFC 2445 (Obsoleted by RFC 5545)

  ** Obsolete normative reference: RFC 2616 (Obsoleted by RFC 7230, RFC 7231,
     RFC 7232, RFC 7233, RFC 7234, RFC 7235)

  ** Obsolete normative reference: RFC 3023 (Obsoleted by RFC 7303)

  ** Obsolete normative reference: RFC 3501 (Obsoleted by RFC 9051)

  ** Obsolete normative reference: RFC 4288 (Obsoleted by RFC 6838)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'SGML'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'TAGMIME'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'UML'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'XBase'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'XHTML'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'XLink'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'XML'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'XPointerElement'

  -- Possible downref: Non-RFC (?) normative reference: ref.
     'XPointerFramework'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'XPtrReg'

  -- Obsolete informational reference (is this intentional?): RFC 2376
     (Obsoleted by RFC 3023)


     Summary: 7 errors (**), 0 flaws (~~), 8 warnings (==), 18 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                          C. Lilley
3	Internet-Draft                                                       W3C
4	Intended status: Standards Track                               M. Murata
5	Expires: May 03, 2013                  International University of Japan
6	                                                             A. Melnikov
7	                                                              Isode Ltd.
8	                                                          H. S. Thompson
9	                                                 University of Edinburgh
10	                                                           November 2012

12	                            XML Media Types
13	                  draft-ietf-appsawg-xml-mediatypes-00

15	Abstract

17	   This specification standardizes three media types -- application/xml,
18	   application/xml-external-parsed-entity, and application/xml-dtd --
19	   for use in exchanging network entities that are related to the
20	   Extensible Markup Language (XML) while defining text/xml and text/
21	   xml-external-parsed-entity as aliases for the respective application/
22	   types.  This specification also standardizes a convention (using the
23	   suffix '+xml') for naming media types outside of these five types
24	   when those media types represent XML MIME entities.  XML MIME
25	   entities are currently exchanged via the HyperText Transfer Protocol
26	   on the World Wide Web, are an integral part of the WebDAV protocol
27	   for remote web authoring, and are expected to have utility in many
28	   domains.

30	   Major differences from [RFC3023] are alignment of charset handling
31	   for text/xml and text/xml-external-parsed-entity with application/
32	   xml, the addition of XPointer and XML Base as fragment identifiers
33	   and base URIs, respectively, mention of the XPointer Registry, and
34	   updating of many references.

36	Status of This Memo

38	   This Internet-Draft is submitted in full conformance with the
39	   provisions of BCP 78 and BCP 79.

41	   Internet-Drafts are working documents of the Internet Engineering
42	   Task Force (IETF).  Note that other groups may also distribute
43	   working documents as Internet-Drafts.  The list of current Internet-
44	   Drafts is at http://datatracker.ietf.org/drafts/current/.

46	   Internet-Drafts are draft documents valid for a maximum of six months
47	   and may be updated, replaced, or obsoleted by other documents at any
48	   time.  It is inappropriate to use Internet-Drafts as reference
49	   material or to cite them other than as "work in progress."

51	   This Internet-Draft will expire on May 03, 2013.

53	Copyright Notice

55	   Copyright (c) 2012 IETF Trust and the persons identified as the
56	   document authors.  All rights reserved.

58	   This document is subject to BCP 78 and the IETF Trust's Legal
59	   Provisions Relating to IETF Documents
60	   (http://trustee.ietf.org/license-info) in effect on the date of
61	   publication of this document.  Please review these documents
62	   carefully, as they describe your rights and restrictions with respect
63	   to this document.  Code Components extracted from this document must
64	   include Simplified BSD License text as described in Section 4.e of
65	   the Trust Legal Provisions and are provided without warranty as
66	   described in the Simplified BSD License.

68	Table of Contents

70	   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
71	   2.  Notational Conventions . . . . . . . . . . . . . . . . . . . .  4
72	   3.  XML Media Types  . . . . . . . . . . . . . . . . . . . . . . .  5
73	     3.1.  Application/xml Registration . . . . . . . . . . . . . . .  7
74	     3.2.  Text/xml Registration  . . . . . . . . . . . . . . . . . .  9
75	     3.3.  Application/xml-external-parsed-entity Registration  . . .  9
76	     3.4.  Text/xml-external-parsed-entity Registration . . . . . . . 10
77	     3.5.  Application/xml-dtd Registration . . . . . . . . . . . . . 11
78	     3.6.  Summary  . . . . . . . . . . . . . . . . . . . . . . . . . 11
79	   4.  The Byte Order Mark (BOM) and Conversions to/from the UTF-16
80	       Charset  . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
81	   5.  Fragment Identifiers . . . . . . . . . . . . . . . . . . . . . 12
82	   6.  The Base URI . . . . . . . . . . . . . . . . . . . . . . . . . 13
83	   7.  XML Versions . . . . . . . . . . . . . . . . . . . . . . . . . 13
84	   8.  A Naming Convention for XML-Based Media Types  . . . . . . . . 14
85	     8.1.  Referencing  . . . . . . . . . . . . . . . . . . . . . . . 16
86	   9.  Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
87	     9.1.  application/xml or text/xml with Omitted Charset and 8-bit
88	           MIME entity  . . . . . . . . . . . . . . . . . . . . . . . 17
89	     9.2.  application/xml or text/xml with Omitted Charset and 16-bit
90	           MIME entity  . . . . . . . . . . . . . . . . . . . . . . . 17
91	     9.3.  application/xml or text/xml with UTF-8 Charset . . . . . . 17
92	     9.4.  application/xml with UTF-16 Charset  . . . . . . . . . . . 18
93	     9.5.  text/xml with UTF-16 Charset . . . . . . . . . . . . . . . 18
94	     9.6.  application/xml with UTF-16BE Charset  . . . . . . . . . . 18
95	     9.7.  text/xml with UTF-16BE Charset . . . . . . . . . . . . . . 19
96	     9.8.  application/xml or text/xml with ISO-2022-KR Charset . . . 19
97	     9.9.  application/xml or text/xml with Omitted Charset, no
98	           Internal Encoding Declaration and UTF-8 Entity . . . . . . 19
99	     9.10. application/xml or text/xml with Omitted Charset and
100	           Internal Encoding Declaration  . . . . . . . . . . . . . . 20
101	     9.11. application/xml-external-parsed-entity or text/xml-external-
102	           parsed-entity with UTF-8 Charset . . . . . . . . . . . . . 20
103	     9.12. application/xml-external-parsed-entity with UTF-16 Charset 20
104	     9.13. application/xml-external-parsed-entity with UTF-16BE Chars 21
105	     9.14. application/xml-dtd  . . . . . . . . . . . . . . . . . . . 21
106	     9.15. application/mathml+xml . . . . . . . . . . . . . . . . . . 21
107	     9.16. application/xslt+xml . . . . . . . . . . . . . . . . . . . 21
108	     9.17. application/rdf+xml  . . . . . . . . . . . . . . . . . . . 22
109	     9.18. image/svg+xml  . . . . . . . . . . . . . . . . . . . . . . 22
110	     9.19. model/x3d+xml  . . . . . . . . . . . . . . . . . . . . . . 22
111	     9.20. INCONSISTENT EXAMPLE: text/xml with UTF-8 Charset  . . . . 22
112	     9.21. application/soap+xml . . . . . . . . . . . . . . . . . . . 23
113	   10. IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 23
114	   11. Security Considerations  . . . . . . . . . . . . . . . . . . . 23
115	   12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 25
116	     12.1.  Normative References  . . . . . . . . . . . . . . . . . . 25
117	     12.2.  Informative References  . . . . . . . . . . . . . . . . . 28
118	   Appendix A.  Why Use the '+xml' Suffix for XML-Based MIME Types? . 29
119	     A.1.  Why not just use text/xml or application/xml and let the XML
120	           processor dispatch to the correct application based on the
121	           referenced DTD?  . . . . . . . . . . . . . . . . . . . . . 29
122	     A.2.  Why not create a new subtree (e.g., image/xml.svg) to
123	           represent XML MIME types?  . . . . . . . . . . . . . . . . 30
124	     A.3.  Why not create a new top-level MIME type for XML-based media
125	           types? . . . . . . . . . . . . . . . . . . . . . . . . . . 30
126	     A.4.  Why not just have the MIME processor 'sniff' the content to
127	           determine whether it is XML? . . . . . . . . . . . . . . . 30
128	     A.5.  Why not use a MIME parameter to specify that a media type
129	           uses XML syntax? . . . . . . . . . . . . . . . . . . . . . 31
130	     A.6.  How about labeling with parameters in the other direction
131	           (e.g., application/xml; Content-Feature=iotp)? . . . . . . 31
132	     A.7.  How about a new superclass MIME parameter that is defined to
133	           apply to all MIME types (e.g., Content-Type:
134	           application/iotp; $superclass=xml)?  . . . . . . . . . . . 32
135	     A.8.  What about adding a new parameter to the Content-Disposition
136	           header or creating a new Content-Structure header to
137	           indicate XML syntax? . . . . . . . . . . . . . . . . . . . 32
138	     A.9.  How about a new Alternative-Content-Type header? . . . . . 32
139	     A.10. How about using a conneg tag instead (e.g., accept-features:
140	           (syntax=xml))? . . . . . . . . . . . . . . . . . . . . . . 32
141	     A.11. How about a third-level content-type, such as text/xml/rdf 33
142	     A.12. Why use the plus ('+') character for the suffix '+xml'?  . 33
143	     A.13. What is the semantic difference between application/foo and
144	           application/foo+xml? . . . . . . . . . . . . . . . . . . . 33
145	     A.14. What happens when an even better markup language (e.g.,
146	           EBML) is defined, or a new category of data? . . . . . . . 34
147	     A.15. Why must I use the '+xml' suffix for my new XML-based media
148	           type?  . . . . . . . . . . . . . . . . . . . . . . . . . . 34
149	   Appendix B.  Changes from RFC 3023 . . . . . . . . . . . . . . . . 34
150	   Appendix C.  Acknowledgements  . . . . . . . . . . . . . . . . . . 35
151	   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 35

153	1.  Introduction

155	   The World Wide Web Consortium has issued the Extensible Markup
156	   Language (XML) 1.0 specification [XML].  To enable the exchange of
157	   XML network entities, this specification standardizes three media
158	   types -- application/xml, application/xml-external-parsed-entity, and
159	   application/xml-dtd -- and two aliases, text/xml and text/xml-
160	   external-parsed-entity, as well as a naming convention for
161	   identifying XML-based MIME media types (using +xml).

163	   XML entities are currently exchanged on the World Wide Web, and XML
164	   is also used for property values and parameter marshalling by the
165	   WebDAV [RFC4918] protocol for remote web authoring.  Thus, there is
166	   a need for a media type to properly label the exchange of XML network
167	   entities.

169	   Although XML is a subset of the Standard Generalized Markup Language
170	   (SGML) ISO 8879 [SGML], which has been assigned the media types
171	   text/sgml and application/sgml, there are several reasons why use of
172	   text/sgml or application/sgml to label XML is inappropriate.  First,
173	   there exist many applications that can process XML, but that cannot
174	   process SGML, due to SGML's larger feature set.  Second, SGML
175	   applications cannot always process XML entities, because XML uses
176	   features of recent technical corrigenda to SGML.  Third, the
177	   definition of text/sgml and application/sgml in [RFC1874] includes
178	   parameters for SGML bit combination transformation format (SGML-
179	   bctf), and SGML boot attribute (SGML-boot).  Since XML does not use
180	   these parameters, it would be ambiguous if such parameters were given
181	   for an XML MIME entity.  For these reasons, the best approach for
182	   labeling XML network entities has been to provide new media types for
183	   XML.

185	   Since XML is an integral part of the WebDAV Distributed Authoring
186	   Protocol, and since World Wide Web Consortium Recommendations are
187	   assigned standards tree media types, and since similar media types
188	   (HTML, SGML) have been assigned standards tree media types, the XML
189	   media types were also placed in the standards tree [RFC3023].

191	   Similarly, XML has been used as a foundation for other media types,
192	   including types in every branch of the IETF media types tree.  To
193	   facilitate the processing of such types, media types based on XML,
194	   but that are not identified using application/xml (or text/xml),
195	   SHOULD be named using a suffix of '+xml' as described in Section 8.
196	   This will allow generic XML-based tools -- browsers, editors, search
197	   engines, and other processors -- to work with all XML-based media
198	   types.
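
   As a non-normative illustration of the convention described above, a
   generic tool could recognize XML-based media types with a check along
   the following lines.  The names is_xml_media_type and
   GENERIC_XML_TYPES are hypothetical and are not defined by this
   specification; this is a minimal sketch, not a complete media type
   parser.

```python
# Hypothetical helper: the generic types named in the paragraph above.
GENERIC_XML_TYPES = {"application/xml", "text/xml"}


def is_xml_media_type(content_type: str) -> bool:
    """Return True if a generic XML tool could reasonably try this type."""
    # Drop parameters such as "; charset=utf-8" and normalize case,
    # because media type names are case-insensitive.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in GENERIC_XML_TYPES:
        return True
    type_, _, subtype = media_type.partition("/")
    # Require a non-empty name in front of the '+xml' suffix.
    return bool(type_) and subtype.endswith("+xml") and len(subtype) > len("+xml")
```

   With this sketch, "image/svg+xml" and "Text/XML; charset=utf-8" are
   both treated as XML-based, while "application/json" is not.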

200	2.  Notational Conventions

202	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
203	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
204	   specification are to be interpreted as described in [RFC2119].

206	   As defined in [RFC2781] (informative), the three charsets "utf-16",
207	   "utf-16le", and "utf-16be" are used to label UTF-16 text.  In this
208	   specification, "the UTF-16 family" refers to those three charsets.
209	   By contrast, the phrases "utf-16" or UTF-16 in this specification
210	   refer specifically to the single charset "utf-16".

212	   As sometimes happens between two communities, both MIME and XML have
213	   defined the term entity, with different meanings.  Section 2.4 of
214	   [RFC2045] says:

216	      "The term 'entity' refers specifically to the MIME-defined header
217	      fields and contents of either a message or one of the parts in the
218	      body of a multipart entity."

220	   Section 4 of [XML] says:

222	      "An XML document may consist of one or many storage units.  These
223	      are called entities; they all have content and are all (except for
224	      the document entity and the external DTD subset) identified by
225	      entity name".

227	   In this specification, "XML MIME entity" is defined as the latter (an
228	   XML entity) encapsulated in the former (a MIME entity).

230	3.  XML Media Types

232	   This specification standardizes three media types related to XML MIME
233	   entities: application/xml (with text/xml as an alias), application/
234	   xml-external-parsed-entity (with text/xml-external-parsed-entity as
235	   an alias), and application/xml-dtd.  Registration information for
236	   these media types is described in the sections below.

238	   Within the XML specification, XML MIME entities can be classified
239	   into four types.  In the XML terminology, they are called "document
240	   entities", "external DTD subsets", "external parsed entities", and
241	   "external parameter entities".  The media types application/xml or
242	   text/xml MAY be used for "document entities", while application/xml-
243	   external-parsed-entity or text/xml-external-parsed-entity SHOULD be
244	   used for "external parsed entities".  Note that [RFC3023] (which this
245	   specification obsoletes) recommended the use of text/xml and text/
246	   xml-external-parsed-entity for document entities and external parsed
247	   entities, respectively, but described charset handling which differed
248	   from common implementation practice.  These media types are still
249	   commonly used, and this specification aligns the charset handling
250	   with industry practice.  The media type application/xml-dtd SHOULD be
251	   used for "external DTD subsets" or "external parameter entities".
252	   application/xml and text/xml MUST NOT be used for "external parameter
253	   entities" or "external DTD subsets", and MUST NOT be used for
254	   "external parsed entities" unless they are also well-formed "document
255	   entities" and are referenced as such.  Note that [RFC2376] (which is
256	   obsolete) allowed such usage, although in practice it is likely to
257	   have been rare.
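
   The recommendations in the paragraph above can be summarized as a
   lookup table.  The following is a non-normative sketch; the names
   RECOMMENDED_TYPES and media_types_for are hypothetical, and the
   table records only which labels are suggested for each kind of XML
   MIME entity, not the differing requirement levels (MAY, SHOULD).

```python
from typing import Dict, Tuple

# Kind of XML MIME entity -> media types suggested in the text above.
# The application/ type is listed first, followed by its text/ alias.
RECOMMENDED_TYPES: Dict[str, Tuple[str, ...]] = {
    "document entity": ("application/xml", "text/xml"),
    "external parsed entity": (
        "application/xml-external-parsed-entity",
        "text/xml-external-parsed-entity",
    ),
    "external DTD subset": ("application/xml-dtd",),
    "external parameter entity": ("application/xml-dtd",),
}


def media_types_for(entity_kind: str) -> Tuple[str, ...]:
    """Look up the suggested media types; raises KeyError if unknown."""
    return RECOMMENDED_TYPES[entity_kind]
```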

259	   Neither external DTD subsets nor external parameter entities parse as
260	   XML documents, and while some XML document entities may be used as
261	   external parsed entities and vice versa, there are many cases where
262	   the two are not interchangeable.  XML also has unparsed entities,
263	   internal parsed entities, and internal parameter entities, but they
264	   are not XML MIME entities.

266	   Application/xml and application/xml-external-parsed-entity are
267	   recommended.  Compared to [RFC2376] or [RFC3023], this specification
268	   alters the charset handling of text/xml and text/xml-external-parsed-
269	   entity, treating them no differently from the respective application/
270	   types.  The reasons are as follows:

272	      Conflicting specifications regarding the character encoding have
273	      caused confusion.  On the one hand, [RFC2046] specifies "The
274	      default character set, which must be assumed in the absence of a
275	      charset parameter, is US-ASCII."; Section 3.7.1 of [RFC2616]
276	      defines that "media subtypes of the 'text' type are defined to
277	      have a default charset value of 'ISO-8859-1'"; and [RFC2376] as
278	      well as [RFC3023] specify that the default charset is US-ASCII.

280	      On the other hand, implementors and users of XML parsers,
281	      following Appendix F of [XML], assume that the default is provided
282	      by the XML encoding declaration or BOM.  Note that this conflict
283	      did not exist for application/xml or application/xml-external-
284	      parsed-entity (see "Optional parameters" of application/xml
285	      registration in Section 3.1).

287	      The current situation, reflected in this specification, has been
288	      simplified by [RFC6657] updating [RFC2046] to remove the US-ASCII
289	      default.  Furthermore, in accordance with [RFC6657]'s other
290	      recommendations, [HTTPbis] changes [RFC2616] by removing the
291	      ISO-8859-1 default and not defining any default at all.

293	      The top-level media type "text" has some restrictions on MIME
294	      entities and they are described in [RFC2045] and [RFC2046].  In
295	      particular, for transports other than HTTP [RFC2616] or HTTPS
296	      (which uses a MIME-like mechanism), the UTF-16 family, UCS-4, and
297	      UTF-32 are not allowed.  However, Section 4.3.3 of [XML] says:

299	         "Each external parsed entity in an XML document may use a
300	         different encoding for its characters.  All XML processors MUST
301	         be able to read entities in both the UTF-8 and UTF-16
302	         encodings."

304	      Thus, although all XML processors can read entities in at least
305	      UTF-16, if an XML document or external parsed entity is encoded in
306	      such character encoding schemes, it could not be labeled as text/
307	      xml or text/xml-external-parsed-entity (except for HTTP).

309	      It is not possible to deprecate text/xml because it is widely used
310	      in practice, and implementations are largely interoperable,
311	      following the rules of Appendix F of [XML] and ignoring the
312	      requirements of [RFC3023].

314	   XML provides a general framework for defining sequences of structured
315	   data.  In some cases, it may be desirable to define new media types
316	   that use XML but define a specific application of XML, perhaps due to
317	   domain-specific display, editing, security considerations or runtime
318	   information.  Furthermore, such media types may allow UTF-8 or UTF-16
319	   only and prohibit other charsets.  This specification does not
320	   prohibit such media types and in fact expects them to proliferate.
321	   However, developers of such media types are STRONGLY RECOMMENDED to
322	   use this specification as a basis for their registration.  In
323	   particular, the charset parameter, if used, MUST agree with the
324	   encoding of the XML entity, as described in Section 8.1, in order to
325	   enhance interoperability.

327	   An XML document labeled as application/xml or text/xml, or with a
328	   +xml media type, might contain namespace declarations, stylesheet-
329	   linking processing instructions (PIs), schema information, or other
330	   declarations that might be used to suggest how the document is to be
331	   processed.  For example, a document might have the XHTML namespace
332	   and a reference to a CSS stylesheet.  Such a document might be
333	   handled by applications that would use this information to dispatch
334	   the document for appropriate processing.
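
   One way an application might perform such dispatch is to inspect the
   namespace of the document element.  The sketch below is
   non-normative and uses Python's standard xml.etree.ElementTree
   module; the handler names "xhtml-renderer" and
   "generic-xml-processor" are hypothetical placeholders, and a real
   application would also need to consider the security issues of
   parsing untrusted XML.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical mapping from root-element namespace to a handler name.
HANDLERS = {XHTML_NS: "xhtml-renderer"}


def root_namespace(xml_bytes: bytes) -> str:
    """Return the namespace URI of the document element, or ''."""
    root = ET.fromstring(xml_bytes)
    # ElementTree spells qualified names as "{namespace}local-name".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return ""


def choose_handler(xml_bytes: bytes) -> str:
    """Pick a handler by namespace, falling back to generic processing."""
    return HANDLERS.get(root_namespace(xml_bytes), "generic-xml-processor")
```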

336	3.1.  Application/xml Registration

338	   MIME media type name: application

340	   MIME subtype name: xml

342	   Mandatory parameters: none

344	   Optional parameters: charset

346	      The charset parameter MUST only be used when the charset is
347	      reliably known and agrees with the encoding declaration.  This
348	      information can be used by non-XML processors to determine
349	      authoritatively the charset of the XML MIME entity.  The charset
350	      parameter can also be used to provide protocol-specific
351	      operations, such as charset-based content negotiation in HTTP.

353	      "utf-8" [RFC3629] and "utf-16" [RFC2781] are the recommended
354	      values, representing the UTF-8 and UTF-16 charsets, respectively.
355	      These charsets are preferred since they are supported by all
356	      conforming processors of [XML].

358	      If an application/xml entity is received where the charset
359	      parameter is omitted, no information is being provided about the
360	      charset by the MIME Content-Type header.  Conforming XML
361	      processors MUST follow the requirements in section 4.3.3 of [XML]
362	      that directly address this contingency.  However, MIME processors
363	      that are not XML processors SHOULD NOT assume a default charset if
364	      the charset parameter is omitted from an application/xml entity.

366	      There are several reasons that the charset parameter is optionally
367	      allowed.  First, recent web servers have been improved so that
368	      users can specify the charset parameter.  Second, [RFC2130]
369	      (informative) specifies that the recommended specification scheme
370	      is the "charset" parameter.

372	      On the other hand, it has been argued that the charset parameter
373	      should be omitted and the mechanism described in Appendix F of
374	      [XML] (which is non-normative) should be solely relied on.  This
375	      approach would allow users to avoid configuration of the charset
376	      parameter; an XML document stored in a file is likely to contain a
377	      correct encoding declaration or BOM (if necessary), since the
378	      operating system does not typically provide charset information
379	      for files.  If users would like to rely on the encoding
380	      declaration or BOM and to hide charset information from protocols,
381	      they SHOULD omit the parameter.

383	      Since a receiving application can, with very high reliability,
384	      determine the encoding of an XML document by reading it, the XML
385	      encoding declaration SHOULD be provided.

387	   Encoding considerations: This media type MAY be encoded as
388	      appropriate for the charset and the capabilities of the underlying
389	      MIME transport.  For 7-bit transports, data in either UTF-8 or
390	      UTF-16 MUST be encoded in quoted-printable or base64.  For 8-bit
391	      clean transports (e.g., 8BITMIME [RFC1652] ESMTP or NNTP
392	      [RFC3977]), UTF-8 is not encoded, but the UTF-16 family MUST be
393	      encoded in base64.  For binary clean transports (e.g., HTTP
394	      [RFC2616]), no content-transfer-encoding is necessary.

396	   Security considerations: See Section 11.

398	   Interoperability considerations: XML has proven to be interoperable
399	      across WebDAV clients and servers, and for import and export from
400	      multiple XML authoring tools.  For maximum interoperability,
401	      validating processors are recommended.  Although non-validating
402	      processors may be more efficient, they are not required to handle
403	      all features of XML.  For further information, see sub-section 2.9
404	      "Standalone Document Declaration" and section 5 "Conformance" of
405	      [XML].

407	   Published specification: Extensible Markup Language (XML) 1.0 (Fifth
408	      Edition) [XML].

410	   Applications which use this media type: XML is device-, platform-,
411	      and vendor-neutral and is supported by a wide range of Web user
412	      agents, WebDAV [RFC4918] clients and servers, as well as XML
413	      authoring tools.

415	   Additional information:

417	      Magic number(s): None.

419	         Although no byte sequences can be counted on to always be
420	         present, XML MIME entities in ASCII-compatible charsets
421	         (including UTF-8) often begin with hexadecimal 3C 3F 78 6D 6C
422	         ("<?xml"), and those in UTF-16 often begin with hexadecimal FE
423	         FF 00 3C 00 3F 00 78 00 6D 00 6C or FF FE 3C 00 3F 00 78 00 6D
424	         00 6C 00 (the Byte Order Mark (BOM) followed by "<?xml").  For
425	         more information, see Appendix F of [XML].

427	      File extension(s): .xml

429	      Macintosh File Type Code(s): "TEXT"

431	   Person and email address for further information:

433	         MURATA Makoto (FAMILY Given) <eb2m-mrt@asahi-net.or.jp>

435	         Alexey Melnikov <alexey.melnikov@isode.com>

437	         Chris Lilley <chris@w3.org>

439	         Henry S. Thompson <ht@inf.ed.ac.uk>

441	   Intended usage: COMMON

443	   Author/Change controller: The XML specification is a work product of
444	      the World Wide Web Consortium's XML Working Group, and was edited
445	      by:

447	         Tim Bray <tbray@textuality.com>

449	         Jean Paoli <jeanpa@microsoft.com>

451	         C. M. Sperberg-McQueen <cmsmcq@uic.edu>

453	         Eve Maler <eve.maler@east.sun.com>

455	         Francois Yergeau <francois@yergeau.com>
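
   The byte sequences listed under "Magic number(s)" in the
   registration above can be checked as in the following non-normative
   sketch.  The function name sniff_xml_prefix and its return labels
   are hypothetical.  Because no byte sequence is guaranteed to be
   present, a result of None does not mean that the entity is not XML.

```python
from typing import Optional

# Byte sequences quoted from the "Magic number(s)" text above.
ASCII_DECL = bytes.fromhex("3C 3F 78 6D 6C")  # "<?xml"
UTF16_BE_DECL = bytes.fromhex("FE FF 00 3C 00 3F 00 78 00 6D 00 6C")
UTF16_LE_DECL = bytes.fromhex("FF FE 3C 00 3F 00 78 00 6D 00 6C 00")


def sniff_xml_prefix(data: bytes) -> Optional[str]:
    """Return a rough encoding-family hint, or None if nothing matched."""
    if data.startswith(UTF16_BE_DECL):
        return "utf-16 (big-endian BOM)"
    if data.startswith(UTF16_LE_DECL):
        return "utf-16 (little-endian BOM)"
    if data.startswith(ASCII_DECL):
        return "ascii-compatible"
    # Many XML entities carry neither a BOM nor an XML declaration.
    return None
```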

457	3.2.  Text/xml Registration

459	   text/xml is an alias for application/xml, as defined in Section 3.1
460	   above.
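
   Because text/xml is an alias, the encoding considerations given in
   Section 3.1 apply to it as well.  The following non-normative sketch
   restates those rules as a function.  The function name, the
   transport labels "7bit", "8bit", and "binary", and the choice of
   quoted-printable over base64 for UTF-8 on 7-bit transports are
   assumptions made for illustration; Section 3.1 permits either
   encoding in that case.

```python
from typing import Optional

UTF16_FAMILY = {"utf-16", "utf-16le", "utf-16be"}


def content_transfer_encoding(transport: str, charset: str) -> Optional[str]:
    """Suggest a Content-Transfer-Encoding; None means none is necessary."""
    charset = charset.lower()
    if charset != "utf-8" and charset not in UTF16_FAMILY:
        # The text in Section 3.1 only spells out UTF-8 and UTF-16.
        raise ValueError("charset not covered by this sketch: " + charset)
    if transport == "binary":
        return None  # e.g., HTTP: no content-transfer-encoding necessary
    if transport == "8bit":
        # UTF-8 is sent unencoded; the UTF-16 family needs base64.
        return "8bit" if charset == "utf-8" else "base64"
    if transport == "7bit":
        # Either quoted-printable or base64 is allowed; this is one choice.
        return "quoted-printable" if charset == "utf-8" else "base64"
    raise ValueError("unknown transport label: " + transport)
```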

462	3.3.  Application/xml-external-parsed-entity Registration

463	   MIME media type name: application

465	   MIME subtype name: xml-external-parsed-entity

467	   Mandatory parameters: none

469	   Optional parameters: charset

471	      The charset parameter of application/xml-external-parsed-entity is
472	      handled the same as that of application/xml as described in
473	      Section 3.1.

475	   Encoding considerations: Same as application/xml as described in
476	      Section 3.1.

478	   Security considerations: See Section 11.

480	   Interoperability considerations: XML external parsed entities are as
481	      interoperable as XML documents, though they have a less tightly
482	      constrained structure and therefore need to be referenced by XML
483	      documents for proper handling by XML processors.  Similarly, XML
484	      documents cannot be reliably used as external parsed entities
485	      because external parsed entities are prohibited from having
486	      standalone document declarations or DTDs.  Identifying XML
487	      external parsed entities with their own content type should
488	      enhance interoperability of both XML documents and XML external
489	      parsed entities.

491	   Published specification: Same as application/xml as described in
492	      Section 3.1.

494	   Applications which use this media type: Same as application/xml as
495	      described in Section 3.1.

497	   Additional information:

499	      Magic number(s): Same as application/xml as described in Section
500	         3.1.

502	      File extension(s): .xml or .ent

504	      Macintosh File Type Code(s): "TEXT"

506	   Person and email address for further information: Same as
507	      application/xml as described in Section 3.1.

509	   Intended usage: COMMON

511	   Author/Change controller: Same as application/xml as described in
512	      Section 3.1.

514	3.4.  Text/xml-external-parsed-entity Registration
515	   text/xml-external-parsed-entity is an alias for application/xml-
516	   external-parsed-entity, as defined in Section 3.3 above.

518	3.5.  Application/xml-dtd Registration

520	   MIME media type name: application

522	   MIME subtype name: xml-dtd

524	   Mandatory parameters: none

526	   Optional parameters: charset

528	      The charset parameter of application/xml-dtd is handled the same
529	      as that of application/xml as described in Section 3.1.

531	   Encoding considerations: Same as Section 3.1.

533	   Security considerations: See Section 11.

535	   Interoperability considerations: XML DTDs have proven to be
536	      interoperable by DTD authoring tools and XML browsers, among
537	      others.

539	   Published specification: Same as application/xml as described in
540	      Section 3.1.

542	   Applications which use this media type: DTD authoring tools handle
543	      external DTD subsets as well as external parameter entities.  XML
544	      browsers may also access external DTD subsets and external
545	      parameter entities.

547	   Additional information:

549	      Magic number(s): Same as application/xml as described in Section
550	         3.1.

552	      File extension(s): .dtd or .mod

554	      Macintosh File Type Code(s): "TEXT"

556	   Person and email address for further information: Same as
557	      application/xml as described in Section 3.1.

559	   Intended usage: COMMON

561	   Author/Change controller: Same as application/xml as described in
562	      Section 3.1.

564	3.6.  Summary
565	   o  If the charset parameter is omitted, conforming XML processors
566	      MUST follow the requirements in section 4.3.3 of [XML] or [XML1.1]
567	      as appropriate.

569	   o  If provided, the charset parameter MUST agree with the XML
570	      encoding declaration.

572	4.  The Byte Order Mark (BOM) and Conversions to/from the UTF-16 Charset

574	   Section 4.3.3 of [XML] specifies that XML MIME entities in the
575	   charset "utf-16" MUST begin with a byte order mark (BOM), which is a
576	   hexadecimal octet sequence 0xFE 0xFF (or 0xFF 0xFE, depending on
577	   endianness).  The XML Recommendation further states that the BOM is
578	   an encoding signature, and is not part of either the markup or the
579	   character data of the XML document.

581	   Due to the presence of the BOM, applications that convert XML from
582	   "utf-16" to a non-Unicode encoding MUST strip the BOM before
583	   conversion.  Similarly, when converting from another encoding into
584	   "utf-16", the BOM MUST be added after conversion is complete.

586	   In addition to the charset "utf-16", [RFC2781] introduces "utf-16le"
587	   (little endian) and "utf-16be" (big endian) as well.  The BOM is
588	   prohibited for these charsets.  When an XML MIME entity is encoded in
589	   "utf-16le" or "utf-16be", it MUST NOT begin with the BOM but SHOULD
590	   contain an encoding declaration.  Conversion from "utf-16" to "utf-
591	   16be" or "utf-16le" and conversion in the other direction MUST strip
592	   or add the BOM, respectively.
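
   The strip-and-add rules above can be sketched in Python as follows.
   This is a minimal illustration, not a normative procedure: the
   function names are hypothetical, and the sketch does not rewrite the
   encoding declaration inside the entity, which a real converter would
   also need to keep in agreement with the new charset.

```python
import codecs


def utf16_to_utf16be(entity: bytes) -> bytes:
    """Convert a "utf-16" entity (BOM required) to "utf-16be" (BOM prohibited)."""
    if entity.startswith(codecs.BOM_UTF16_BE):
        text = entity[2:].decode("utf-16-be")
    elif entity.startswith(codecs.BOM_UTF16_LE):
        text = entity[2:].decode("utf-16-le")
    else:
        raise ValueError('a "utf-16" XML MIME entity must begin with a BOM')
    # The "utf-16-be" codec never writes a BOM, so the BOM is stripped here.
    return text.encode("utf-16-be")


def utf16be_to_utf16(entity: bytes) -> bytes:
    """Convert a "utf-16be" entity (no BOM) to "utf-16" by adding a BOM."""
    if entity.startswith(codecs.BOM_UTF16_BE):
        raise ValueError('a "utf-16be" XML MIME entity must not begin with a BOM')
    # The octets are already big-endian, so only the BOM has to be prepended.
    return codecs.BOM_UTF16_BE + entity
```

   Explicit "-be" and "-le" codecs are used throughout so that the
   presence or absence of the BOM is always decided by the caller and
   never by platform byte order.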

594	5.  Fragment Identifiers

596	   Uniform Resource Identifiers (URIs) may contain fragment identifiers
597	   (see Section 3.5 of [RFC3986]).  Likewise, Internationalized Resource
598	   Identifiers (IRIs) [RFC3987] may contain fragment identifiers.

600	   The syntax and semantics of fragment identifiers for the XML media
601	   types defined in this specification are based on the
602	   [XPointerFramework] W3C Recommendation.  It allows simple names, and
603	   more complex constructions based on named schemes.  When the syntax
604	   of a fragment identifier part of any URI or IRI with a retrieved
605	   media type governed by this specification conforms to the syntax
606	   specified in [XPointerFramework], conformant applications MUST
607	   attempt to interpret such fragment identifiers as designating that
608	   part of the retrieved representation specified by
609	   [XPointerFramework] and whatever other specifications define any
610	   XPointer schemes used.  Conformant applications MUST support the
611	   'element' scheme as defined in [XPointerElement], but need not
612	   support other schemes.

614	   If an XPointer error is reported in the attempt to process the part,
615	   this specification does not define an interpretation for the part.

617	   A registry of XPointer schemes [XPtrReg] is maintained at the W3C.
618	   Unregistered schemes SHOULD NOT be used.

620	   See Section 8.1 for additional requirements which apply when an XML-
621	   based MIME media type follows the naming convention '+xml'.

623	   If [XPointerFramework] and [XPointerElement] are inappropriate for
624	   some XML-based media type, it SHOULD NOT follow the naming convention
625	   '+xml'.

627	   When a URI has a fragment identifier, it is encoded by a limited
628	   subset of the repertoire of US-ASCII [ASCII] characters, as defined
629	   in [RFC3986].  When an IRI contains a fragment identifier, it is
630	   encoded by a much wider repertoire of characters.  The conversion
631	   between IRI fragment identifiers and URI fragment identifiers is
632	   presented in Section 7 of [RFC3987].
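
   As a rough illustration of that conversion, the Python sketch below
   percent-encodes the UTF-8 octets of each non-ASCII character in an
   IRI fragment identifier and leaves ASCII characters untouched.  The
   function name is hypothetical, and the sketch assumes the input is
   already a syntactically valid, normalized IRI fragment; it is not a
   complete implementation of [RFC3987].

```python
def iri_fragment_to_uri_fragment(fragment: str) -> str:
    """Percent-encode the UTF-8 octets of each non-ASCII character."""
    out = []
    for ch in fragment:
        if ord(ch) < 0x80:
            # ASCII characters, including existing %XX escapes, are kept.
            out.append(ch)
        else:
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)
```

   For instance, a fragment consisting of the letter "r", U+00E9, and
   the letters "sum" would come out as "r%C3%A9sum", while a pure ASCII
   pointer such as "element(/1/2)" is returned unchanged.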

634	6.  The Base URI

636	   Section 5.1 of [RFC3986] specifies that the semantics of a relative
637	   URI reference embedded in a MIME entity is dependent on the base URI.
638	   The base URI is either (1) the base URI embedded in context, (2) the
639	   base URI from the encapsulating entity, (3) the base URI from the
640	   Retrieval URI, or (4) the default base URI, where (1) has the highest
641	   precedence.  [RFC3986] further specifies that the mechanism for
642	   embedding the base URI is dependent on the media type.

644	   The media type dependent mechanism for embedding the base URI in a
645	   MIME entity of type application/xml, text/xml, application/xml-
646	   external-parsed-entity or text/xml-external-parsed-entity is to use
647	   the xml:base attribute described in detail in [XBase].

649	   Note that the base URI may be embedded in a different MIME entity,
650	   since the default value for the xml:base attribute may be specified
651	   in an external DTD subset or external parameter entity.
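
   The interaction between the retrieval URI and embedded xml:base
   attributes can be sketched in Python as follows.  This is an
   illustrative sketch only: the helper name base_uris is hypothetical,
   the example URIs are made up, and defaults supplied by an external
   DTD subset (the case noted above) are not handled, because
   xml.etree.ElementTree does not read external DTD subsets.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def base_uris(root, retrieval_uri):
    """Map every element to its effective base URI."""
    result = {}

    def walk(elem, inherited):
        # An xml:base value is itself resolved against the inherited base;
        # an absent attribute leaves the inherited base unchanged.
        base = urljoin(inherited, elem.get(XML_BASE, ""))
        result[elem] = base
        for child in elem:
            walk(child, base)

    walk(root, retrieval_uri)
    return result
```

   A relative reference found in an element can then be resolved with
   urljoin(bases[element], reference), so that an embedded base URI
   takes precedence over the retrieval URI.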

653	7.  XML Versions

655	   application/xml, application/xml-external-parsed-entity,
656	   application/xml-dtd, text/xml, and text/xml-external-parsed-entity
657	   are to be used with [XML].  In all examples herein where
658	   version="1.0" is shown, it is understood that version="1.1" may also
659	   be used, providing the content does indeed conform to [XML1.1].

661	   The normative requirement of this specification upon XML is to follow
662	   the requirements of [XML], section 4.3.3.  Except for minor
663	   clarifications, that section is substantially identical from the
664	   first edition to the current (5th) edition of XML 1.0, and for XML
665	   1.1.  Therefore, this specification may be used with any version or
666	   edition of XML 1.0 or 1.1.

668	   Specifications and recommendations based on or referring to this RFC
669	   SHOULD indicate any limitations on the particular versions of XML to
670	   be used.  For example, a particular specification might indicate:
671	   "content MUST be represented using media-type application/xml, and
672	   the document must either (a) carry an XML declaration specifying
673	   version="1.0" or (b) omit the XML declaration, in which case per the
674	   XML recommendation the version defaults to 1.0".

676	8.  A Naming Convention for XML-Based Media Types

678	   This specification recommends the use of a naming convention (a
679	   suffix of '+xml') for identifying XML-based MIME media types,
680	   whatever their particular content may represent.  This allows the use
681	   of generic XML processors and technologies on a wide variety of
682	   different XML document types at a minimum cost, using existing
683	   frameworks for media type registration.

685	   Although the use of a suffix was not considered as part of the
686	   original MIME architecture, this choice is considered to provide the
687	   most functionality with the least potential for interoperability
688	   problems or lack of future extensibility.  The alternatives to the
689	   '+xml' suffix and the reason for its selection are described in
690	   Appendix A.

692	   As XML development continues, new XML document types are appearing
693	   rapidly.  Many of these XML document types would benefit from the
694	   identification possibilities of a more specific MIME media type than
695	   text/xml or application/xml can provide, and it is likely that many
696	   new media types for XML-based document types will be registered in
697	   the near and ongoing future.

699	   While the benefits of specific MIME types for particular types of XML
700	   documents are significant, all XML documents share common structures
701	   and syntax that make possible common processing.

703	   Some areas where 'generic' processing is useful include:

705	   o  Browsing - An XML browser can display any XML document with a
706	      provided [CSS] or [XSLT] style sheet, whatever the vocabulary of
707	      that document.

709	   o  Editing - Any XML editor can read, modify, and save any XML
710	      document.

712	   o  Fragment identification - XPointers (see Section 5) can work with
713	      any XML document, whatever vocabulary it uses.

715	   o  Hypertext linking - [XLink] hypertext linking is designed to
716	      connect any XML documents, regardless of vocabulary.

718	   o  Searching - XML-oriented search engines, web crawlers, agents, and
719	      query tools should be able to read XML documents and extract the
720	      names and content of elements and attributes even if the tools are
721	      ignorant of the particular vocabulary used for elements and
722	      attributes.

724	   o  Storage - XML-oriented storage systems, which keep XML documents
725	      internally in a parsed form, should similarly be able to process,
726	      store, and recreate any XML document.

728	   o  Well-formedness and validity checking - An XML processor can
729	      confirm that any XML document is well-formed and that it is valid
730	      (i.e., conforms to its declared DTD or Schema).

732	   When a new media type is introduced for an XML-based format, the name
733	   of the media type SHOULD end with '+xml'. This convention will allow
734	   applications that can process XML generically to detect that the MIME
735	   entity is supposed to be an XML document, verify this assumption by
736	   invoking some XML processor, and then process the XML document
737	   accordingly.  Applications may match for types that represent XML
738	   MIME entities by comparing the subtype to the pattern '*/*+xml'.  (Of
739	   course, 4 of the 5 media types defined in this specification -- text/
740	   xml, application/xml, text/xml-external-parsed-entity, and
741	   application/xml-external-parsed-entity -- also represent XML MIME
742	   entities while not conforming to the '*/*+xml' pattern.)
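
   A generic processor might implement this matching roughly as in the
   Python sketch below.  The function and constant names are
   hypothetical.  The explicit set holds the four types named above
   that represent XML MIME entities without carrying the suffix;
   application/xml-dtd is deliberately left out.

```python
# Types defined in this specification that are XML MIME entities
# but do not match the '*/*+xml' pattern.
XML_MEDIA_TYPES = {
    "text/xml",
    "application/xml",
    "text/xml-external-parsed-entity",
    "application/xml-external-parsed-entity",
}


def is_xml_media_type(content_type: str) -> bool:
    """Return True if the Content-Type value names an XML MIME entity."""
    # Drop parameters such as charset; media type names are case-insensitive.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_MEDIA_TYPES:
        return True
    top_level, slash, subtype = media_type.partition("/")
    # Require a non-empty top-level type and a subtype longer than the suffix.
    return bool(top_level and slash) and len(subtype) > 4 and subtype.endswith("+xml")
```

   A match only indicates that the entity is supposed to be XML; as
   the paragraph above says, the assumption still has to be verified
   by invoking an XML processor.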

744	      NOTE: Section 14.1 of HTTP [RFC2616] does not support Accept
745	      headers of the form "Accept: */*+xml" and so this header MUST NOT
746	      be used in this way.  Instead, content negotiation [RFC2703]
747	      could potentially be used if an XML-based MIME type were needed.

749	   Media types following the naming convention '+xml' SHOULD introduce
750	   the charset parameter for consistency, since XML-generic processing
751	   applies the same program for any such media type.  However, there are
752	   some cases in which the charset parameter need not be introduced.
753	   For example:

755	      When an XML-based media type is restricted to UTF-8, it is not
756	      necessary to introduce the charset parameter.  "UTF-8 only" is a
757	      generic principle and UTF-8 is the default of XML.

759	      When an XML-based media type is restricted to UTF-8 and UTF-16, it
760	      might not be unreasonable to omit the charset parameter.  Neither
761	      UTF-8 nor UTF-16 requires an XML encoding declaration.

763	      Note: Some argue that XML-based media types should not introduce
764	      the charset parameter, although others disagree.

766	   XML generic processing is not always appropriate for XML-based media
767	   types.  For example, authors of some such media types may wish that
768	   the types remain entirely opaque except to applications that are
769	   specifically designed to deal with that media type.  By NOT following
770	   the naming convention '+xml', such media types can avoid XML-generic
771	   processing.  Since generic processing will be useful in many cases,
772	   however -- including in some situations that are difficult to predict
773	   ahead of time -- those registering media types SHOULD use the '+xml'
774	   convention unless they have a particularly compelling reason not to.

776	   *HST: This paragraph needs updating once some pending RFCs are out
777	   there.* The registration process for these media types is described
778	   in [RFC4288] and [RFC4289].  The registrar for the IETF tree will
779	   encourage new XML-based media type registrations in the IETF tree to
780	   follow this guideline.  Registrars for other trees SHOULD follow this
781	   convention in order to ensure maximum interoperability of their XML-
782	   based documents.  Similarly, media subtypes that do not represent XML
783	   MIME entities MUST NOT be allowed to register with a '+xml' suffix.

785	8.1.  Referencing

787	   Registrations for new XML-based media types under top-level types
788	   SHOULD, in specifying the charset parameter and encoding
789	   considerations, define them as: "Same as [charset parameter /
790	   encoding considerations] of application/xml as specified in RFC
791	   XXXX."

793	   The use of the charset parameter is STRONGLY RECOMMENDED, since this
794	   information can be used by XML processors to determine
795	   authoritatively the charset of the XML MIME entity.  If there are
796	   some reasons not to follow this advice, they SHOULD be included as
797	   part of the registration.  As shown above, two such reasons are
798	   "UTF-8 only" or "UTF-8 or UTF-16 only".

800	   These registrations SHOULD specify that the XML-based media type
801	   being registered has all of the security considerations described in
802	   RFC XXXX plus any additional considerations specific to that media
803	   type.

805	   These registrations SHOULD also make reference to RFC XXXX in
806	   specifying magic numbers, base URIs, and use of the BOM.

808	   When these registrations use the '+xml' convention, they MUST also
809	   make reference to RFC XXXX in specifying fragment identifier syntax
810	   and semantics, and they MAY restrict the syntax to a specified subset
811	   of schemes, except that they MUST NOT disallow barenames or 'element'
812	   scheme pointers.  They MAY further require support for other
813	   registered schemes.  They also MAY add additional syntax (which MUST
814	   NOT overlap with [XPointerFramework] syntax) together with associated
815	   semantics, and MAY add additional semantics for barename XPointers
816	   which, as provided for in Section 5, will only apply when this
817	   specification does not define an interpretation.

819	   These registrations MAY reference the application/xml registration in
820	   RFC XXXX in specifying interoperability considerations, if these
821	   considerations are not overridden by issues specific to that media
822	   type.

824	9.  Examples

826	   The examples below give the value of the MIME Content-type header and
827	   the XML declaration (which includes the encoding declaration) inside
828	   the XML MIME entity.  For UTF-16 examples, the Byte Order Mark
829	   character is denoted as "{BOM}", and the XML declaration is assumed
830	   to come at the beginning of the XML MIME entity, immediately
831	   following the BOM.  Note that other MIME headers may be present, and
832	   the XML MIME entity may contain other data in addition to the XML
833	   declaration; the examples focus on the Content-type header and the
834	   encoding declaration for clarity.

836	9.1.  application/xml or text/xml with Omitted Charset and 8-bit MIME
837	      entity

839	   Content-type: application/xml or text/xml

841	   <?xml version="1.0" encoding="iso-8859-1"?>

843	   Since the charset parameter is not provided in the Content-Type
844	   header, XML processors MUST treat the "iso-8859-1" encoding as
845	   authoritative.  XML-unaware MIME processors SHOULD make no
846	   assumptions about the charset of the XML MIME entity.

848	9.2.  application/xml or text/xml with Omitted Charset and 16-bit MIME
849	      entity

851	   Content-type: application/xml or text/xml

853	   {BOM}<?xml version="1.0" encoding="utf-16"?>

855	   or

857	   {BOM}<?xml version="1.0"?>

859	   This example shows a 16-bit MIME entity with no charset parameter.
860	   Since the charset parameter is not provided in the Content-Type
861	   header, in this case XML processors MUST treat the "utf-16" encoding
862	   and/or the BOM as authoritative.  XML-unaware MIME processors SHOULD
863	   make no assumptions about the charset of the XML MIME entity.

865	   Omitting the charset parameter is NOT RECOMMENDED for application/xml
866	   when used with transports other than HTTP or HTTPS.  text/xml SHOULD
867	   NOT be used for 16-bit MIME with transports other than HTTP or HTTPS
868	   (see Section 9.5).

870	9.3.  application/xml or text/xml with UTF-8 Charset

872	   Content-type: application/xml or text/xml; charset="utf-8"

874	   <?xml version="1.0" encoding="utf-8"?>

876	   This is the recommended encoding for use with all the media types
877	   defined in this specification.  Since the charset parameter is
878	   provided, both MIME and XML processors MUST treat the enclosed entity
879	   as UTF-8 encoded.

881	   If sent using a 7-bit transport (e.g., SMTP [RFC5321]), the XML MIME
882	   entity MUST use a content-transfer-encoding of either quoted-
883	   printable or base64.  For an 8-bit clean transport (e.g., 8BITMIME
884	   ESMTP or NNTP), or a binary clean transport (e.g., HTTP), no content-
885	   transfer-encoding is necessary.

887	9.4.  application/xml with UTF-16 Charset

889	   Content-type: application/xml; charset="utf-16"

891	   {BOM}<?xml version="1.0" encoding="utf-16"?>

893	   or

895	   {BOM}<?xml version="1.0"?>

897	   If sent using a 7-bit transport (e.g., SMTP) or an 8-bit clean
898	   transport (e.g., 8BITMIME ESMTP or NNTP), the XML MIME entity MUST be
899	   encoded in quoted-printable or base64.  For a binary clean transport
900	   (e.g., HTTP), no content-transfer-encoding is necessary.

902	9.5.  text/xml with UTF-16 Charset

904	   Content-type: text/xml; charset="utf-16"

906	   {BOM}<?xml version='1.0' encoding='utf-16'?>

908	   or

910	   {BOM}<?xml version='1.0'?>

912	   This is possible only when the XML MIME entity is transmitted via
913	   HTTP or HTTPS, which use a MIME-like mechanism and are binary-clean
914	   protocols, hence do not perform CR and LF transformations and allow
915	   NUL octets.  As described in [RFC2781], the UTF-16 family MUST NOT be
916	   used with media types under the top-level type "text" except over
917	   HTTP or HTTPS (see section 19.4.1 of [RFC2616] for details).

919	   Since HTTP is binary clean, no content-transfer-encoding is
920	   necessary.

922	9.6.  application/xml with UTF-16BE Charset
923	   Content-type: application/xml; charset="utf-16be"

925	   <?xml version='1.0' encoding='utf-16be'?>

927	   Observe that there is no BOM.  Since the charset parameter is
928	   provided, MIME and XML processors MUST treat the enclosed entity as
929	   UTF-16BE encoded.

931	9.7.  text/xml with UTF-16BE Charset

933	   Content-type: text/xml; charset="utf-16be"

935	   <?xml version='1.0' encoding='utf-16be'?>

937	   Observe that there is no BOM.  As for UTF-16, this is possible
938	   only when the XML MIME entity is transmitted via HTTP.

940	9.8.  application/xml or text/xml with ISO-2022-KR Charset

942	   Content-type: application/xml; charset="iso-2022-kr"

944	   <?xml version="1.0" encoding="iso-2022-kr"?>

946	   This example shows the use of a Korean charset (e.g., Hangul) encoded
947	   following the specification in [RFC1557].  Since the charset
948	   parameter is provided, MIME processors MUST treat the enclosed entity
949	   as encoded per RFC 1557.  Since the XML MIME entity has an internal
950	   encoding declaration (this example does show such a declaration,
951	   which agrees with the charset parameter), XML processors MUST also
952	   treat the enclosed entity as encoded per RFC 1557.  Thus,
953	   interoperability is assured.

955	   Since ISO-2022-KR has been defined to use only 7 bits of data, no
956	   content-transfer-encoding is necessary with any transport.

958	9.9.  application/xml or text/xml with Omitted Charset, no Internal
959	      Encoding Declaration and UTF-8 Entity

961	   Content-type: application/xml or text/xml

963	   <?xml version='1.0'?>

965	   In this example, the charset parameter has been omitted, there is no
966	   internal encoding declaration, and there is no BOM.  Since there is
967	   no BOM, the XML processor follows the requirements in section 4.3.3,
968	   and optionally applies the mechanism described in Appendix F (which
969	   is non-normative) of [XML] to determine the charset encoding of
970	   UTF-8.  Although the XML MIME entity does not contain an encoding
971	   declaration, the encoding actually -is- UTF-8, so this is still a
972	   conforming XML MIME entity.

974	   An XML-unaware MIME processor SHOULD make no assumptions about the
975	   charset of the XML MIME entity.
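
   The kind of autodetection that Appendix F of [XML] describes can be
   sketched in Python as follows.  The function name and the returned
   labels are hypothetical, and the sketch only identifies an encoding
   family from the first four octets; a real processor would go on to
   read the encoding declaration, if any, using that family.

```python
import codecs


def sniff_encoding_family(entity: bytes) -> str:
    """Guess the encoding family of an XML entity from its first octets."""
    head = entity[:4]
    # With a BOM.  UCS-4 is tested first because its little-endian BOM
    # (FF FE 00 00) begins with the UTF-16 little-endian BOM (FF FE).
    if head in (codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE):
        return "UCS-4"
    if head.startswith(codecs.BOM_UTF8):
        return "UTF-8"
    if head.startswith(codecs.BOM_UTF16_BE):
        return "UTF-16BE"
    if head.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16LE"
    # Without a BOM: look for the encoded form of "<" or "<?xm".
    no_bom = {
        b"\x00\x00\x00\x3c": "UCS-4",
        b"\x3c\x00\x00\x00": "UCS-4",
        b"\x00\x3c\x00\x3f": "UTF-16BE",
        b"\x3c\x00\x3f\x00": "UTF-16LE",
        b"\x3c\x3f\x78\x6d": "ASCII-compatible",
        b"\x4c\x6f\xa7\x94": "EBCDIC",
    }
    # Anything else is treated as UTF-8 without an encoding declaration.
    return no_bom.get(head, "UTF-8")
```

   For the entity in this example the sketch reports an ASCII-
   compatible family; since the XML declaration then turns out to
   carry no encoding declaration, the processor concludes UTF-8.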

977	9.10.  application/xml or text/xml with Omitted Charset and Internal
978	       Encoding Declaration

980	   Content-type: application/xml or text/xml

982	   <?xml version='1.0' encoding="iso-10646-ucs-4"?>

984	   In this example, the charset parameter has been omitted, and there is
985	   no BOM.  However, the XML MIME entity does have an encoding
986	   declaration inside the XML MIME entity that specifies the entity's
987	   charset.  Following the requirements in section 4.3.3, and optionally
988	   applying the mechanism described in Appendix F (non-normative) of
989	   [XML], the XML processor determines the charset encoding of the XML
990	   MIME entity (in this example, UCS-4).

992	   An XML-unaware MIME processor SHOULD make no assumptions about the
993	   charset of the XML MIME entity.

995	9.11.  application/xml-external-parsed-entity or text/xml-external-
996	       parsed-entity with UTF-8 Charset

998	   Content-type: text/xml-external-parsed-entity or application/xml-
999	   external-parsed-entity; charset="utf-8"

1001	   <?xml encoding="utf-8"?>

1003	   Since the charset parameter is provided, MIME and XML processors MUST
1004	   treat the enclosed entity as UTF-8 encoded.

1006	   If sent using a 7-bit transport (e.g., SMTP), the XML MIME entity
1007	   MUST use a content-transfer-encoding of either quoted-printable or
1008	   base64.  For an 8-bit clean transport (e.g., 8BITMIME ESMTP or NNTP),
1009	   or a binary clean transport (e.g., HTTP) no content-transfer-encoding
1010	   is necessary.

1012	9.12.  application/xml-external-parsed-entity with UTF-16 Charset

1014	   Content-type: application/xml-external-parsed-entity;
1015	   charset="utf-16"

1017	   {BOM}<?xml encoding="utf-16"?>
1018	   or

1020	   {BOM}<?xml?>

1022	   Since the charset parameter is provided, MIME and XML processors MUST
1023	   treat the enclosed entity as UTF-16 encoded.

1025	   If sent using a 7-bit transport (e.g., SMTP) or an 8-bit clean
1026	   transport (e.g., 8BITMIME ESMTP or NNTP), the XML MIME entity MUST be
1027	   encoded in quoted-printable or base64.  For a binary clean transport
1028	   (e.g., HTTP), no content-transfer-encoding is necessary.

1030	9.13.  application/xml-external-parsed-entity with UTF-16BE Charset

1032	   Content-type: application/xml-external-parsed-entity; charset="utf-
1033	   16be"

1035	   <?xml encoding="utf-16be"?>

1037	   Since the charset parameter is provided, MIME and XML processors MUST
1038	   treat the enclosed entity as UTF-16BE encoded.

1040	9.14.  application/xml-dtd

1042	   Content-type: application/xml-dtd; charset="utf-8"

1044	   <?xml encoding="utf-8"?>

1046	   Charset "utf-8" is a recommended charset value for use with
1047	   application/xml-dtd.  Since the charset parameter is provided, MIME
1048	   and XML processors MUST treat the enclosed entity as UTF-8 encoded.

1050	9.15.  application/mathml+xml

1052	   Content-type: application/mathml+xml

1054	   <?xml version="1.0" ?>

1056	   MathML documents are XML documents whose content describes
1057	   mathematical information, as defined by [MathML].  As a format based
1058	   on XML, MathML documents SHOULD follow the '+xml' suffix convention
1059	   and use 'mathml+xml' in their MIME content-type identifier.  This
1060	   media type is registered at IANA and is fully defined in [MathML].

1062	9.16.  application/xslt+xml

1064	   Content-type: application/xslt+xml

1066	   <?xml version="1.0" ?>

1068	   Extensible Stylesheet Language (XSLT) documents are XML documents
1069	   whose content describes stylesheets for other XML documents, as
1070	   defined by [XSLT].  As a format based on XML, XSLT documents SHOULD
1071	   follow the '+xml' suffix convention and use 'xslt+xml' in their MIME
1072	   content-type identifier.  This media type has been registered at
1073	   IANA and is fully defined in [XSLT].

1075	9.17.  application/rdf+xml

1077	   Content-type: application/rdf+xml

1079	   <?xml version="1.0" ?>

1081	   Resources identified using the application/rdf+xml media type are XML
1082	   documents whose content describes RDF metadata.  This media type has
1083	   been registered at IANA and is fully defined in [RFC3870].

1085	9.18.  image/svg+xml

1087	   Content-type: image/svg+xml

1089	   <?xml version="1.0" ?>

1091	   Scalable Vector Graphics (SVG) documents are XML documents whose
1092	   content describes graphical information, as defined by [SVG].  As a
1093	   format based on XML, SVG documents SHOULD follow the '+xml' suffix
1094	   convention and use 'svg+xml' in their MIME content-type
1095	   identifier.  The image/svg+xml media type has been registered at
1096	   IANA and is fully defined in [SVG].

1098	9.19.  model/x3d+xml

1100	   Content-type: model/x3d+xml

1102	   <?xml version="1.0" ?>

1104	   X3D is derived from VRML and is used for 3D models.  Besides the XML
1105	   representation, it may also be serialised in classic VRML syntax and
1106	   using a fast infoset.  Separate, but clearly related media types are
1107	   used for these serialisations (model/x3d+vrml and model/
1108	   x3d+fastinfoset respectively).

1110	9.20.  INCONSISTENT EXAMPLE: text/xml with UTF-8 Charset
1111	   Content-type: text/xml; charset="utf-8"

1113	   <?xml version="1.0" encoding="iso-8859-1"?>

1115	   Since the charset parameter is provided in the Content-Type header
1116	   and differs from the XML encoding declaration, MIME and XML
1117	   processors will not interoperate.  MIME processors will treat the
1118	   enclosed entity as UTF-8 encoded.  That is, the "iso-8859-1" encoding
1119	   will be ignored.  XML processors, on the other hand, will ignore the
1120	   charset parameter and treat the XML entity as encoded in iso-8859-1.

1122	   Processors generating XML MIME entities MUST NOT label conflicting
1123	   charset information between the MIME Content-Type and the XML
1124	   declaration.  In particular, the addition of an explicit, site-wide
1125	   charset without inspecting the XML entity has frequently led to
1126	   interoperability problems.
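
   A generator could guard against this kind of mislabeling with a
   check along the lines of the Python sketch below.  The function
   names are hypothetical.  The sketch assumes an entity in an ASCII-
   compatible encoding (so the declaration can be matched as bytes),
   and it compares charset names literally after lowercasing, without
   resolving aliases.

```python
import re
from email.message import Message
from typing import Optional

# Matches the encoding declaration of an XML or text declaration that
# sits at the very start of an ASCII-compatible entity.
_ENCODING_DECL = re.compile(
    rb'^<\?xml[^>]*?\sencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def charset_param(content_type: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def declared_encoding(entity: bytes) -> Optional[str]:
    """Return the lowercased name from the encoding declaration, if any."""
    match = _ENCODING_DECL.match(entity)
    return match.group(1).decode("ascii").lower() if match else None


def labels_conflict(content_type: str, entity: bytes) -> bool:
    """True when both labels are present and name different charsets."""
    charset = charset_param(content_type)
    declared = declared_encoding(entity)
    return charset is not None and declared is not None and charset != declared
```

   Applied to the header and declaration of this example, the check
   reports a conflict; a site-wide charset default could be suppressed
   or corrected whenever it does.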

1128	9.21.  application/soap+xml

1130	   Content-type: application/soap+xml

1132	   <?xml version="1.0" ?>

1134	   Resources identified using the application/soap+xml media type are
1135	   SOAP 1.2 message envelopes that have been serialized with XML 1.0.
1136	   This media type has been registered at IANA and is fully defined in
1137	   [RFC3902].

1139	10.  IANA Considerations

1141	   As described in Section 8, this specification updates the [RFC4288]
1142	   and [RFC4289] registration process for XML-based MIME types.

1144	11.  Security Considerations

1146	   XML, as a subset of SGML, has all of the same security considerations
1147	   as specified in [RFC1874], and likely more, due to its ubiquitous
1148	   deployment.

1150	   To paraphrase section 3 of RFC 1874, XML MIME entities contain
1151	   information to be parsed and processed by the recipient's XML system.
1152	   These entities may contain and such systems may permit explicit
1153	   system level commands to be executed while processing the data.  To
1154	   the extent that an XML system will execute arbitrary command strings,
1155	   recipients of XML MIME entities may be at risk.  In general, it may
1156	   be possible to specify commands that perform unauthorized file
1157	   operations or make changes to the display processor's environment
1158	   that affect subsequent operations.

1160	   In general, any information stored outside of the direct control of
1161	   the user -- including CSS style sheets, XSL transformations, entity
1162	   declarations, and DTDs -- can be a source of insecurity, by either
1163	   obvious or subtle means.  For example, a tiny "whiteout attack"
1164	   modification made to a "master" style sheet could make words in
1165	   critical locations disappear in user documents, without directly
1166	   modifying the user document or the stylesheet it references.  Thus,
1167	   the security of any XML document is vitally dependent on all of the
1168	   documents recursively referenced by that document.

1170	   The entity lists and DTDs for XHTML 1.0 [XHTML], for instance, are
1171	   likely to be a commonly used set of information.  Many developers
1172	   will use and trust them, few of whom will know much about the level
1173	   of security on the W3C's servers, or on any similarly trusted
1174	   repository.

1176	   The simplest attack involves adding declarations that break
1177	   validation.  Adding extraneous declarations to a list of character
1178	   entities can effectively "break the contract" used by documents.  A
1179	   tiny change that produces a fatal error in a DTD could halt XML
1180	   processing on a large scale.  Extraneous declarations are fairly
1181	   obvious, but more sophisticated tricks, like changing attributes from
1182	   being optional to required, can be difficult to track down.  Perhaps
1183	   the most dangerous option available to crackers is redefining default
1184	   values for attributes: e.g., if developers have relied on defaulted
1185	   attributes for security, a relatively small change might expose
1186	   enormous quantities of information.

1188	   Apart from the structural possibilities, another option, "entity
1189	   spoofing," can be used to insert text into documents, vandalizing and
1190	   perhaps conveying an unintended message.  Because XML 1.0 permits
1191	   multiple entity declarations, and the first declaration takes
1192	   precedence, it's possible to insert malicious content where an entity
1193	   is used, such as by inserting the full text of Winnie the Pooh in
1194	   every occurrence of &mdash;.

1196	   Use of the digital signatures work currently underway by the xmldsig
1197	   working group may eventually ameliorate the dangers of referencing
1198	   external documents not under one's own control.

1200	   Use of XML is expected to be varied, and widespread.  XML is under
1201	   scrutiny by a wide range of communities for use as a common syntax
1202	   for community-specific metadata.  For example, the Dublin Core
1203	   [RFC5013] group is using XML for document metadata, and a new effort
1204	   has begun that is considering use of XML for medical information.
1205	   Other groups view XML as a mechanism for marshalling parameters for
1206	   remote procedure calls.  More uses of XML will undoubtedly arise.

1208	   Security considerations will vary by domain of use.  For example, XML
1209	   medical records will have much more stringent privacy and security
1210	   considerations than XML library metadata.  Similarly, use of XML as a
1211	   parameter marshalling syntax necessitates a case-by-case security
1212	   review.

1214	   XML may also have some of the same security concerns as plain text.
1215	   Like plain text, XML can contain escape sequences that, when
1216	   displayed, have the potential to change the display processor
1217	   environment in ways that adversely affect subsequent operations.
1218	   Possible effects include, but are not limited to, locking the
1219	   keyboard, changing display parameters so subsequent displayed text is
1220	   unreadable, or even changing display parameters to deliberately
1221	   obscure or distort subsequent displayed material so that its meaning
1222	   is lost or altered.  Display processors SHOULD either filter such
1223	   material from displayed text or else make sure to reset all important
1224	   settings after a given display operation is complete.

1226	   Some terminal devices have keys whose output, when pressed, can be
1227	   changed by sending the display processor a character sequence.  If
1228	   this is possible, the display of a text object containing such
1229	   character sequences could reprogram keys to perform some illicit or
1230	   dangerous action when the key is subsequently pressed by the user.
1231	   In some cases not only can keys be programmed, they can be triggered
1232	   remotely, making it possible for a text display operation to directly
1233	   perform some unwanted action.  As such, the ability to program keys
1234	   SHOULD be blocked either by filtering or by disabling the ability to
1235	   program keys entirely.

1237	   Note that it is also possible to construct XML documents that make
1238	   use of what XML terms "entity references" (using the XML meaning of
1239	   the term "entity" as described in Section 2), to construct repeated
1240	   expansions of text.  Recursive expansions are prohibited by [XML] and
1241	   XML processors are required to detect them.  However, even non-
1242	   recursive expansions may cause problems with the finite computing
1243	   resources of computers, if they are performed many times.  (Entity A
1244	   consists of 100 copies of entity B, which in turn consists of 100
1245	   copies of entity C, and so on.)
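
   The growth described above is simple to quantify.  The following
   Python sketch is illustrative only: the function name and its
   parameters are assumptions made for this example and are not defined
   by [XML] or by this specification.  It computes the number of
   characters produced by fully expanding a chain of nested,
   non-recursive entities.

```python
def expanded_size(levels: int, copies: int = 100, leaf_chars: int = 1) -> int:
    """Characters produced by expanding a chain of nested entities.

    Assumes `levels` entities, each consisting of `copies` references
    to the next entity in the chain, with the innermost entity holding
    `leaf_chars` literal characters.
    """
    if levels < 0 or copies < 0 or leaf_chars < 0:
        raise ValueError("arguments must be non-negative")
    # Each level multiplies the expanded text by `copies`.
    return leaf_chars * copies ** levels


if __name__ == "__main__":
    # Entity A = 100 copies of B, B = 100 copies of C, C = one character.
    for depth in range(1, 6):
        print(depth, expanded_size(depth))
```

   With the default of 100 copies per level, each additional level
   multiplies the output by 100, so a processor may want to bound the
   total expansion size rather than only the nesting depth.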

1247	12.  References

1249	12.1.  Normative References

1251	   [ASCII]    "US-ASCII.  Coded Character Set -- 7-Bit American Standard
1252	              Code for Information Interchange", ANSI X3.4-1986, 1986.

1254	   [CSS]      Bos, B., Lie, H.W., Lilley, C., and I. Jacobs, "Cascading
1255	              Style Sheets, level 2 (CSS2) Specification", World Wide
1256	              Web Consortium Recommendation REC-CSS2, May 1998, <http://
1257	              www.w3.org/TR/REC-CSS2/>.

1259	   [HTTPbis]  Fielding, R., "Hypertext Transfer Protocol -- HTTP/1.2?",
1260	              RFC ???, January 2013.

1262	   [ISO8859]  "ISO-8859.  International Standard -- Information
1263	              Processing -- 8-bit Single-Byte Coded Graphic Character
1264	              Sets -- Part 1: Latin alphabet No. 1, ISO-8859-1:1987",
1265	              1987.

1267	   [PNG]      Boutell, T., "PNG (Portable Network Graphics)
1268	              Specification", World Wide Web Consortium Recommendation
1269	              REC-png, October 1996, <http://www.w3.org/TR/REC-png>.

1271	   [RFC1652]  Klensin, J., Freed, N., Rose, M., Stefferud, E., and D.
1272	              Crocker, "SMTP Service Extension for 8bit-MIMEtransport",
1273	              RFC 1652, July 1994.

1275	   [RFC2045]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
1276	              Extensions (MIME) Part One: Format of Internet Message
1277	              Bodies", RFC 2045, November 1996.

1279	   [RFC2046]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
1280	              Extensions (MIME) Part Two: Media Types", RFC 2046,
1281	              November 1996.

1283	   [RFC2077]  Nelson, S.D., Parks, C., and Mitra, "The Model Primary
1284	              Content Type for Multipurpose Internet Mail Extensions",
1285	              RFC 2077, January 1997.

1287	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1288	              Requirement Levels", BCP 14, RFC 2119, March 1997.

1290	   [RFC2445]  Dawson, F. and D. Stenerson, "Internet Calendaring and
1291	              Scheduling Core Object Specification (iCalendar)", RFC
1292	              2445, November 1998.

1294	   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Nielsen, H.,
1295	              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
1296	              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

1298	   [RFC3023]  Murata, M., St.Laurent, S., and D. Kohn, "XML Media
1299	              Types", RFC 3023, January 2001.

1301	   [RFC3501]  Crispin, M., "Internet Message Access Protocol - Version
1302	              4rev1", RFC 3501, March 2003.

1304	   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
1305	              10646", RFC 3629, November 2003.

1307	   [RFC3977]  Feather, C., "Network News Transfer Protocol", RFC 3977,
1308	              October 2006.

1310	   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
1311	              Resource Identifier (URI): Generic Syntax", RFC 3986,
1312	              January 2005.

1314	   [RFC3987]  Duerst, M. and M. Suignard, "Internationalized Resource
1315	              Identifiers (IRIs)", RFC 3987, July 2005.

1317	   [RFC4288]  Freed, N. and J. Klensin, "Media Type Specifications and
1318	              Registration Procedures", RFC 4288, December 2005.

1320	   [RFC4289]  Freed, N. and J. Klensin, "Multipurpose Internet Mail
1321	              Extensions (MIME) Part Four: Registration Procedures", RFC
1322	              4289, December 2005.

1324	   [RFC4918]  Dusseault, L., "HTTP Extensions for Distributed Authoring
1325	              -- WEBDAV", RFC 4918, June 2007.

1327	   [RFC5321]  Klensin, J., "Simple Mail Transfer Protocol", RFC 5321,
1328	              October 2008.

1330	   [RFC6657]  Melnikov, A. and J. Reschke, "Update to MIME regarding
1331	              "charset" Parameter Handling in Textual Media Types", RFC
1332	              6657, July 2012, <http://www.rfc-editor.org/rfc/
1333	              rfc6657.txt>.

1335	   [SGML]     International Organization for Standardization,
1336	              "Information Processing -- Text and Office Systems --
1337	              Standard Generalized Markup Language (SGML)", ISO 8879,
1338	              October 1986.

1340	   [TAGMIME]  Bray, T., Ed., "Internet Media Type registration,
1341	              consistency of use", April 2004, <http://www.w3.org/2001/
1342	              tag/2004/0430-mime>.

1344	   [UML]      Object Management Group, "OMG Unified Modeling Language
1345	              Specification, Version 1.3", OMG Specification ad/
1346	              99-06-08, June 1999, <http://www.omg.org/uml/>.

1348	   [XBase]    Marsh, J. and R. Tobin, "XML Base", World Wide Web
1349	              Consortium Recommendation xmlbase, January 2009, <http://
1350	              www.w3.org/TR/xmlbase>.

1352	   [XHTML]    Pemberton, S., et al., "XHTML 1.0: The Extensible
1353	              HyperText Markup Language", World Wide Web Consortium
1354	              Recommendation xhtml1, December 1999, <http://www.w3.org/
1355	              TR/xhtml1>.

1357	   [XLink]    DeRose, S., Maler, E., Orchard, D., and N. Walsh, "XML
1358	              Linking Language (XLink) Version 1.1", World Wide Web
1359	              Consortium Recommendation xlink11, May 2010, <http://
1360	              www.w3.org/TR/xlink/>.

1362	   [XML1.1]   Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E.,
1363	              Yergeau, F., and J. Cowan, "Extensible Markup Language
1364	              (XML) 1.1", World Wide Web Consortium Recommendation REC-
1365	              xml, September 2006, <http://www.w3.org/TR/xml11>.

1367	   [XML]      Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E.,
1368	              and F. Yergeau, "Extensible Markup Language (XML) 1.0
1369	              (Fifth Edition)", World Wide Web Consortium Recommendation
1370	              REC-xml, November 2008, <http://www.w3.org/TR/REC-xml>.

1372	   [XPointerElement]
1373	              Grosso, P., Maler, E., Marsh, J., and N. Walsh, "XPointer
1374	              element() Scheme", World Wide Web Consortium
1375	              Recommendation REC-XPointer-Element, March 2003, <http://
1376	              www.w3.org/TR/xptr-element/>.

1378	   [XPointerFramework]
1379	              Grosso, P., Maler, E., Marsh, J., and N. Walsh, "XPointer
1380	              Framework", World Wide Web Consortium Recommendation REC-
1381	              XPointer-Framework, March 2003, <http://www.w3.org/TR/
1382	              xptr-framework/>.

1384	   [XPtrReg]  Hazael-Massieux, D., "XPointer Registry", 2005, <http://
1385	              www.w3.org/2005/04/xpointer-schemes/>.

1387	12.2.  Informative References

1389	   [MathML]   Carlisle, D., Ion, P., and R. Miner, "Mathematical Markup
1390	              Language (MathML) Version 3.0", World Wide Web Consortium
1391	              Recommendation MathML, October 2010, <http://www.w3.org/TR
1392	              /MathML/>.

1394	   [RFC1557]  Choi, U., Chon, K., and H. Park, "Korean Character
1395	              Encoding for Internet Messages", RFC 1557, December 1993.

1397	   [RFC1874]  Levinson, E., "SGML Media Types", RFC 1874, December 1995.

1399	   [RFC2130]  Weider, C., Preston, C., Simonsen, K., Alvestrand,
1400	              H., Atkinson, R., Crispin, M., and P. Svanberg, "The
1401	              Report of the IAB Character Set Workshop held 29 February
1402	              - 1 March, 1996", RFC 2130, April 1997.

1404	   [RFC2376]  Whitehead, E. and M. Murata, "XML Media Types", RFC 2376,
1405	              July 1998.

1407	   [RFC2703]  Klyne, G., "Protocol-independent Content Negotiation
1408	              Framework", RFC 2703, September 1999.

1410	   [RFC2781]  Hoffman, P. and F. Yergeau, "UTF-16, an encoding of ISO
1411	              10646", RFC 2781, February 2000.

1413	   [RFC2801]  Burdett, D., "Internet Open Trading Protocol - IOTP
1414	              Version 1.0", RFC 2801, April 2000.

1416	   [RFC3870]  Swartz, A., "application/rdf+xml Media Type Registration",
1417	              RFC 3870, September 2004.

1419	   [RFC3902]  Baker, M. and M. Nottingham, "The "application/soap+xml"
1420	              media type", RFC 3902, September 2004.

1422	   [RFC5013]  Kunze, J. and T. Baker, "Dublin Core Metadata for Resource
1423	              Discovery", RFC 5013, August 2007.

1425	   [SVG]      Dahlstroem, E., et al., "Scalable Vector Graphics
1426	              (SVG) 1.1 Specification (Second edition)", World Wide Web
1427	              Consortium Recommendation SVG, August 2011, <http://
1428	              www.w3.org/TR/SVG/>.

1430	   [XSLT]     Kay, M., "XSL Transformations (XSLT) Version 2.0", World
1431	              Wide Web Consortium Recommendation xslt20, January 2007,
1432	              <http://www.w3.org/TR/xslt20/>.

1434	Appendix A.  Why Use the '+xml' Suffix for XML-Based MIME Types?

1436	   Although the use of a suffix was not considered as part of the
1437	   original MIME architecture, this choice is considered to provide the
1438	   most functionality with the least potential for interoperability
1439	   problems or lack of future extensibility.  The alternatives to the
1440	   '+xml' suffix and the reason for its selection are described below.

1442	A.1.  Why not just use text/xml or application/xml and let the XML
1443	      processor dispatch to the correct application based on the
1444	      referenced DTD?

1446	   text/xml and application/xml remain useful in many situations,
1447	   especially for document-oriented applications that involve combining
1448	   XML with a stylesheet in order to present the data.  However, XML is
1449	   also used to define entirely new data types, and an XML-based format
1450	   such as image/svg+xml fits the definition of a MIME media type
1451	   exactly as well as image/png [PNG] does.  (Note that image/svg+xml
1452	   is not yet registered.) Although extra functionality is available for
1453	   MIME processors that are also XML processors, XML-based media types
1454	   -- even when treated as opaque, non-XML media types -- are just as
1455	   useful as any other media type and should be treated as such.

1457	   Since MIME dispatchers work off of the MIME type, use of text/xml or
1458	   application/xml to label discrete media types will hinder correct
1459	   dispatching and general interoperability.  Finally, many XML
1460	   documents use neither DTDs nor namespaces, yet are perfectly legal
1461	   XML.

1463	A.2.  Why not create a new subtree (e.g., image/xml.svg) to represent
1464	      XML MIME types?

1466	   The subtree under which a media type is registered -- IETF, vendor
1467	   (*/vnd.*), or personal (*/prs.*); see [RFC4288] and [RFC4289] for
1468	   details -- is completely orthogonal to whether the media type uses
1469	   XML syntax or not.  The suffix approach allows XML document types to
1470	   be identified within any subtree.  The vendor subtree, for example,
1471	   is likely to include a large number of XML-based document types.  By
1472	   using a suffix, rather than setting up a separate subtree, those
1473	   types may remain in the same location in the tree of MIME types that
1474	   they would have occupied had they not been based on XML.

1476	A.3.  Why not create a new top-level MIME type for XML-based media
1477	      types?

1479	   The top-level MIME type (e.g., model/* [RFC2077]) determines what
1480	   kind of content the type is, not what syntax it uses.  For example,
1481	   agents using image/* to signal acceptance of any image format should
1482	   certainly be given access to media type image/svg+xml, which is in
1483	   all respects a standard image subtype.  It just happens to use XML to
1484	   describe its syntax.  The two aspects of the media type are
1485	   completely orthogonal.

1487	   XML-based data types will most likely be registered in ALL top-level
1488	   categories.  Potential, though currently unregistered, examples could
1489	   include application/mathml+xml [MathML], model/uml+xml [UML], and
1490	   image/svg+xml [SVG].

1492	A.4.  Why not just have the MIME processor 'sniff' the content to
1493	      determine whether it is XML?

1495	   Rather than explicitly labeling XML-based media types, the processor
1496	   could look inside each type and see whether or not it is XML.  The
1497	   processor could also cache a list of XML-based media types.

1499	   Although this method might work acceptably for some mail
1500	   applications, it would fail completely in many other uses of MIME.
1501	   For instance, an XML-based web crawler would have no way of
1502	   determining whether a file is XML except to fetch it and check.  The
1503	   same issue applies in some IMAP4 [RFC3501] mail applications, where
1504	   the client first fetches the MIME type as part of the message
1505	   structure and then decides whether to fetch the MIME entity.
1506	   Requiring these fetches just to determine whether the MIME type is
1507	   XML could have significant bandwidth and latency disadvantages in
1508	   many situations.

1510	   Sniffing XML also isn't as simple as it might seem.  DOCTYPE
1511	   declarations aren't required, and they can appear fairly deep into a
1512	   document under certain unpreventable circumstances.  (E.g., the XML
1513	   declaration, comments, and processing instructions can occupy space
1514	   before the DOCTYPE declaration.) Even sniffing the DOCTYPE isn't
1515	   completely reliable, thanks to a variety of issues involving default
1516	   values for namespaces within external DTDs and overrides inside the
1517	   internal DTD.  Finally, the variety in potential character encodings
1518	   (something XML provides tools to deal with) also makes reliable
1519	   sniffing less likely.
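
   By contrast, a check based on the label alone needs no access to the
   content.  The following Python sketch is a minimal illustration under
   stated assumptions: the function name is hypothetical, and the check
   covers only text/xml, application/xml, and subtypes carrying the
   '+xml' suffix.  It is not a complete treatment of every XML-related
   media type.

```python
def is_xml_media_type(content_type: str) -> bool:
    """Return True if the media type label alone marks the content as XML.

    Only text/xml, application/xml, and '+xml'-suffixed subtypes are
    recognized; this is a simplifying assumption for illustration.
    """
    # Drop any parameters such as "; charset=utf-8"; media type names
    # are compared case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    top, sep, subtype = media_type.partition("/")
    if not sep or not top or not subtype:
        return False  # not of the form type/subtype
    if media_type in ("text/xml", "application/xml"):
        return True
    # The suffix must follow a non-empty subtype name.
    return subtype.endswith("+xml") and len(subtype) > len("+xml")


if __name__ == "__main__":
    for label in ("image/svg+xml", "image/png", "application/xml"):
        print(label, is_xml_media_type(label))
```

   Because the decision depends only on the label, a crawler or mail
   client could make it before deciding whether to fetch the entity.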

1521	A.5.  Why not use a MIME parameter to specify that a media type uses XML
1522	      syntax?

1524	   For example, one could use "Content-Type: application/iotp;
1525	   alternate-type=text/xml" or "Content-Type: application/iotp;
1526	   syntax=xml".

1528	   Section 5 of [RFC2045] says that "Parameters are modifiers of the
1529	   media subtype, and as such do not fundamentally affect the nature of
1530	   the content".  However, all XML-based media types are by their nature
1531	   always XML.  Parameters, as they have been defined in the MIME
1532	   architecture, are never invariant across all instantiations of a
1533	   media type.

1535	   More practically, very few if any MIME dispatchers and other MIME
1536	   agents support dispatching off of a parameter.  While MIME agents on
1537	   the receiving side will need to be updated in either case to support
1538	   (or fall back to) generic XML processing, it has been suggested that
1539	   it is easier to implement this functionality when acting off of the
1540	   media type rather than a parameter.  More important, sending agents
1541	   require no update to properly tag an image as "image/svg+xml", but
1542	   few if any sending agents currently support always tagging certain
1543	   content types with a parameter.

1545	A.6.  How about labeling with parameters in the other direction (e.g.,
1546	      application/xml; Content-Feature=iotp)?

1548	   This proposal fails under the simplest case, of a user with neither
1549	   knowledge of XML nor an XML-capable MIME dispatcher.  In that case,
1550	   the user's MIME dispatcher is likely to dispatch the content to an
1551	   XML processing application when the correct default behavior should
1552	   be to dispatch the content to the application responsible for the
1553	   content type (e.g., an ecommerce engine for application/iotp+xml
1554	   [RFC2801], once this media type is registered).

1556	   Note that even if the user had already installed the appropriate
1557	   application (e.g., the ecommerce engine), and that installation had
1558	   updated the MIME registry, many operating system level MIME
1559	   registries such as .mailcap in Unix and HKEY_CLASSES_ROOT in Windows
1560	   do not currently support dispatching off a parameter, and cannot
1561	   easily be upgraded to do so.  And, even if the operating system were
1562	   upgraded to support this, each MIME dispatcher would also separately
1563	   need to be upgraded.

1565	A.7.  How about a new superclass MIME parameter that is defined to apply
1566	      to all MIME types (e.g., Content-Type: application/iotp;
1567	      $superclass=xml)?

1569	   This combines the problems of Appendix A.5 and Appendix A.6.

1571	   If the sender attaches an image/svg+xml file to a message and
1572	   includes the instructions "Please copy the French text on the road
1573	   sign", someone with an XML-aware MIME client and an XML browser but
1574	   no support for SVG can still probably open the file and copy the
1575	   text.  By contrast, with superclasses, the sender must add superclass
1576	   support to her existing mailer AND the receiver must add superclass
1577	   support to his before this transaction can work correctly.

1579	   If the receiver comes to rely on the superclass tag being present and
1580	   applications are deployed relying on that tag (as always seems to
1581	   happen), then only upgraded senders will be able to interoperate with
1582	   those receiving applications.

1584	A.8.  What about adding a new parameter to the Content-Disposition
1585	      header or creating a new Content-Structure header to indicate XML
1586	      syntax?

1588	   This has nearly identical problems to Appendix A.7, in that it
1589	   requires both senders and receivers to be upgraded, and few if any
1590	   operating systems and MIME dispatchers support working off of
1591	   anything other than the MIME type.

1593	A.9.  How about a new Alternative-Content-Type header?

1595	   This is better than Appendix A.8, in that no extra functionality
1596	   needs to be added to a MIME registry to support dispatching of
1597	   information other than standard content types.  However, it still
1598	   requires both sender and receiver to be upgraded, and it will also
1599	   fail in many cases (e.g., web hosting to an outsourced server), where
1600	   the user can set MIME types (often through implicit mapping to file
1601	   extensions), but has no way of adding arbitrary HTTP headers.

1603	A.10.  How about using a conneg tag instead (e.g., accept-features:
1604	       (syntax=xml))?

1606	   When the conneg protocol is fully defined, this may potentially be a
1607	   reasonable thing to do.  But given the limited current state of
1608	   conneg [RFC2703] development, it is not a credible replacement for a
1609	   MIME-based solution.

1611	   Also, note that adding a content-type parameter doesn't work with
1612	   conneg either, since conneg only deals with media types, not their
1613	   parameters.  This is another illustration of the limits of parameters
1614	   for MIME dispatchers.

1616	A.11.  How about a third-level content-type, such as text/xml/rdf?

1618	   MIME explicitly defines two levels of content type, the top-level for
1619	   the kind of content and the second-level for the specific media type.
1620	   [RFC4288] and [RFC4289] extend this in an interoperable way by using
1621	   prefixes to specify separate trees for IETF, vendor, and personal
1622	   registrations.  This specification also extends the two-level type by
1623	   using the '+xml' suffix.  In both cases, processors that are unaware
1624	   of these later specifications treat them as opaque and continue to
1625	   interoperate.  By contrast, adding a third-level type would break the
1626	   current MIME architecture and cause numerous interoperability
1627	   failures.

1629	A.12.  Why use the plus ('+') character for the suffix '+xml'?

1631	   As specified in Section 5.1 of [RFC2045], a tspecial can't be used:

1633	      tspecials :=
1634	      "(" / ")" / "<" / ">" / "@" /
1635	      "," / ";" / ":" / "\" / <">
1636	      "/" / "[" / "]" / "?" / "="

1638	   It was thought that "." would not be a good choice since it is
1639	   already used as an additional hierarchy delimiter.  Also, "*" has a
1640	   common wildcard meaning, and "-" and "_" are common word separators
1641	   and easily confused.  The characters %'`#& are frequently used for
1642	   quoting or comments and so are not ideal.

1644	   That leaves: ~!$^+{}|
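
   The elimination above can be reproduced mechanically.  The following
   Python sketch is illustrative only; the constant and function names
   are assumptions made for this example.  It starts from the US-ASCII
   punctuation characters, removes the tspecials quoted above, and then
   removes the characters rejected in the preceding paragraph.

```python
import string

# The tspecials quoted above from Section 5.1 of RFC 2045.
TSPECIALS = set('()<>@,;:\\"/[]?=')

# Characters set aside in the text: hierarchy delimiter, wildcard,
# word separators, and quoting or comment characters.
REJECTED = set(".*-_" + "%'`#&")


def suffix_candidates() -> str:
    """Return the punctuation characters that survive both exclusions."""
    return "".join(
        c for c in string.punctuation
        if c not in TSPECIALS and c not in REJECTED
    )


if __name__ == "__main__":
    print(suffix_candidates())
```

   The result contains the same eight characters as the list above,
   although in US-ASCII order rather than the order shown.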

1646	   Note that "-" is used heavily in the current registry.  "$" and "_"
1647	   are used once each.  The others are currently unused.

1649	   It was thought that '+' expressed the semantics that a MIME type can
1650	   be treated (for example) as both scalable vector graphics AND ALSO as
1651	   XML; it is both simultaneously.

1653	A.13.  What is the semantic difference between application/foo and
1654	       application/foo+xml?

1656	   MIME processors that are unaware of XML will treat the '+xml' suffix
1657	   as completely opaque, so it is essential that no extra semantics be
1658	   assigned to its presence.  Therefore, application/foo and application
1659	   /foo+xml SHOULD be treated as completely independent media types.
1660	   Although, for example, text/calendar+xml could be an XML version of
1661	   text/calendar [RFC2445], it is possible that this (hypothetical) new
1662	   media type would include new semantics as well as new syntax, and in
1663	   any case, there would be many applications that support text/calendar
1664	   but had not yet been upgraded to support text/calendar+xml.

1666	A.14.  What happens when an even better markup language (e.g., EBML) is
1667	       defined, or a new category of data?

1669	   In the ten years that MIME has existed, XML is the first generic data
1670	   format that has seemed to justify special treatment, so it is hoped
1671	   that no further suffixes will be necessary.  However, if some are
1672	   later defined, and these documents were also XML, they would need to
1673	   specify that the '+xml' suffix is always the outermost suffix (e.g.,
1674	   application/foo+ebml+xml not application/foo+xml+ebml).  If they were
1675	   not XML, then they would use a regular suffix (e.g., application/
1676	   foo+ebml).

1678	A.15.  Why must I use the '+xml' suffix for my new XML-based media type?

1680	   You don't have to, but unless you have a good reason to explicitly
1681	   disallow generic XML processing, you should use the suffix so as not
1682	   to curtail the options of future users and developers.

1684	   Whether the inventors of a media type, today, design it for dispatch
1685	   to generic XML processing machinery (and most won't) is not the
1686	   critical issue.  The core notion is that the knowledge that some
1687	   media type happens to use XML syntax opens the door to unanticipated
1688	   kinds of processing beyond those envisioned by its inventors, and on
1689	   this basis identifying such encoding is a good and useful thing.

1691	   Developers of new media types are often tightly focused on a
1692	   particular type of processing that meets current needs.  But there is
1693	   no need to rule out generic processing as well, which could make your
1694	   media type more valuable over time.  It is believed that registering
1695	   with the '+xml' suffix will cause no interoperability problems
1696	   whatsoever, while it may enable significant new functionality and
1697	   interoperability now and in the future.  So, the conservative
1698	   approach is to include the '+xml' suffix.

1700	Appendix B.  Changes from RFC 3023

1702	   There are numerous and significant differences between this
1703	   specification and [RFC3023], which it obsoletes.  This appendix
1704	   summarizes the major differences only.

1706	   First, XPointer ([XPointerFramework] and [XPointerElement]) has been
1707	   added as fragment identifier syntax for "application/xml", and the
1708	   XPointer Registry ([XPtrReg]) mentioned.  Second, [XBase] has been
1709	   added as a mechanism for specifying base URIs.  Third, the language
1710	   regarding charsets was updated to correspond to the W3C TAG finding
1711	   "Internet Media Type registration, consistency of use" [TAGMIME].
1712	   Fourth, many references have been updated.

1714	Appendix C.  Acknowledgements

1716	   This specification reflects the input of numerous participants to the
1717	   ietf-xml-mime@imc.org mailing list, though any errors are the
1718	   responsibility of the authors.  Special thanks to:

1720	   Mark Baker, James Clark, Dan Connolly, Martin Duerst, Ned Freed,
1721	   Yaron Goland, Rick Jelliffe, Larry Masinter, David Megginson, Keith
1722	   Moore, Chris Newman, Gavin Nicol, Marshall Rose, Jim Whitehead and
1723	   participants of the XML activity and the TAG at the W3C.

1725	   Jim Whitehead and Simon St.Laurent are editors of [RFC2376] and
1726	   [RFC3023], respectively.

1728	Authors' Addresses

1730	   Chris Lilley
1731	   World Wide Web Consortium
1732	   2004, Route des Lucioles - B.P. 93 06902
1733	   Sophia Antipolis Cedex
1734	   France

1736	   Email: chris@w3.org
1737	   URI:   http://www.w3.org/People/chris/

1739	   MURATA Makoto (FAMILY Given)
1740	   International University of Japan

1742	   Email: eb2m-mrt@asahi-net.or.jp

1744	   Alexey Melnikov
1745	   Isode Ltd.

1747	   Email: alexey.melnikov@isode.com
1748	   URI:   http://www.melnikov.ca/

1750	   Henry S. Thompson
1751	   University of Edinburgh

1753	   Email: ht@inf.ed.ac.uk
1754	   URI:   http://www.ltg.ed.ac.uk/~ht/