idnits 2.17.1 

draft-ietf-822ext-mime-imt-01.txt:
  ** The Abstract section seems to be numbered


  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 318: '...y MIME text type MUST represent a line...'
     RFC 2119 keyword, line 320: '...in text MUST represent a line break.  ...'
     RFC 2119 keyword, line 402: '...racter encodings MUST use an appropria...'
     RFC 2119 keyword, line 861: '...undary delimiter MUST NOT appear insid...'
     RFC 2119 keyword, line 943: '...undary delimiter MUST occur at the beg...'
     (7 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 486 has weird spacing: '...of text  is "p...'

  == Line 956 has weird spacing: '...F (line  break...'

  == Line 1703 has weird spacing: '...ed, the  defau...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 5, 1995) is 10584 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'RFC-1341' on line 216 looks like a reference

  -- Missing reference section? 'RFC-1563' on line 331 looks like a reference

  -- Missing reference section? 'ISO-646' on line 382 looks like a reference

  -- Missing reference section? 'US-ASCII' on line 433 looks like a reference

  -- Missing reference section? 'ISO-8859' on line 436 looks like a reference

  -- Missing reference section? 'PCM' on line 533 looks like a reference

  -- Missing reference section? 'MPEG' on line 551 looks like a reference

  -- Missing reference section? 'POSTSCRIPT' on line 643 looks like a
     reference

  -- Missing reference section? 'POSTSCRIPT2' on line 644 looks like a
     reference

  -- Missing reference section? 'MIME-IMB' on line 878 looks like a reference

  -- Missing reference section? 'RFC-959' on line 1700 looks like a reference

  -- Missing reference section? 'RFC-783' on line 1695 looks like a reference


     Summary: 9 errors (**), 0 flaws (~~), 4 warnings (==), 14 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Network Working Group                     Nathaniel Borenstein
2	Internet Draft                                       Ned Freed
3	                           <draft-ietf-822ext-mime-imt-01.txt>

5	            Multipurpose Internet Mail Extensions
6	                       (MIME) Part Two:

8	                         Media Types

10	                         May 5, 1995

12	                     Status of this Memo

14	This document is an Internet-Draft.  Internet-Drafts are
15	working documents of the Internet Engineering Task Force
16	(IETF), its areas, and its working groups. Note that other
17	groups may also distribute working documents as Internet-
18	Drafts.

20	Internet-Drafts are draft documents valid for a maximum of six
21	months. Internet-Drafts may be updated, replaced, or obsoleted
22	by other documents at any time.  It is not appropriate to use
23	Internet-Drafts as reference material or to cite them other
24	than as a "working draft" or "work in progress".

26	To learn the current status of any Internet-Draft, please
27	check the 1id-abstracts.txt listing contained in the
28	Internet-Drafts Shadow Directories on ds.internic.net (US East
29	Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast),
30	or munnari.oz.au (Pacific Rim).

32	1.  Abstract

34	STD 11, RFC 822 defines a message representation protocol
35	that specifies considerable detail about US-ASCII message
36	headers but leaves the message content, or message body, as flat
37	US-ASCII text.  This set of documents, collectively called the
38	Multipurpose Internet Mail Extensions, or MIME, redefines the
39	format of messages to allow for
40	 (1)   textual message bodies in character sets other than
41	       US-ASCII,

43	 (2)   non-textual message bodies,

45	 (3)   multi-part message bodies, and

47	 (4)   textual header information in character sets other than
48	       US-ASCII.

50	These documents are based on earlier work documented in RFC
51	934, STD 11, and RFC 1049, but extend and revise that work.
52	Because RFC 822 said so little about message bodies, these
53	documents are largely orthogonal to (rather than a revision
54	of) RFC 822.

56	In particular, these documents are designed to provide
57	facilities to include multiple parts in a single message, to
58	represent body and header text in character sets other than
59	US-ASCII, to represent formatted multi-font text messages, to
60	represent non-textual material such as images and audio
61	fragments, and generally to facilitate later extensions
62	defining new types of Internet mail for use by cooperating
63	mail agents.

65	The initial document in this set, RFC MIME-IMB, specifies the
66	various headers used to describe the structure of MIME
67	messages. This second document defines the general structure
68	of the MIME media typing system and defines an initial set of
69	media types. The third document, RFC MIME-HEADERS, describes
70	extensions to RFC 822 to allow non-US-ASCII text data in
71	Internet mail header fields. The fourth document, RFC MIME-
72	REG, specifies various IANA registration procedures for MIME-
73	related entities.  The fifth and final document, RFC MIME-
74	CONF, describes MIME conformance criteria as well as providing
75	some illustrative examples of MIME message formats,
76	acknowledgements, and the bibliography.

78	These documents are revisions of RFCs 1521 and 1522, which
79	themselves were revisions of RFCs 1341 and 1342.  An appendix
80	in RFC MIME-CONF describes differences and changes from
81	previous versions.

83	2.  Table of Contents

85	1 Abstract
86	2 Table of Contents
87	3 Introduction
88	4 Definition of a Top-Level Media Type
89	5 Overview Of The Initial Top-Level Media Types
90	6 Discrete Media Type Values
91	6.1 Text Media Type
92	6.1.1 Representation of Line Breaks
93	6.1.2 Charset Parameter
94	6.1.3 Plain Subtype
95	6.1.4 Unrecognized Subtypes
96	6.2 Image Media Type
97	6.3 Audio Media Type
98	6.4 Video Media Type
99	6.5 Application Media Type
100	6.5.1 Octet-Stream Subtype
101	6.5.2 PostScript Subtype
102	6.5.3 Other Application Subtypes
103	7 Composite Media Type Values
104	7.1 Multipart Media Type
105	7.1.1 Common Syntax
106	7.1.2 Handling Nested Messages and Multiparts
107	7.1.3 Mixed Subtype
108	7.1.4 Alternative Subtype
109	7.1.5 Digest Subtype
110	7.1.6 Parallel Subtype
111	7.1.7 Other Multipart Subtypes
112	7.2 Message Media Type
113	7.2.1 RFC822 Subtype
114	7.2.2 Partial Subtype
115	7.2.2.1 Message Fragmentation and Reassembly
116	7.2.2.2 Fragmentation and Reassembly Example
117	7.2.3 External-Body Subtype
118	7.2.4 Other Message Subtypes
119	8 Experimental Media Type Values
120	9 Summary
121	10 Security Considerations
122	11 Authors' Addresses
123	A Collected Grammar
124	3.  Introduction

126	The first document in this set, RFC MIME-IMB, defines a number
127	of header fields, including Content-Type. The Content-Type
128	field is used to specify the nature of the data in the body of
129	an entity, by giving media type and subtype identifiers, and
130	by providing auxiliary information that may be required for
131	certain media types.  After the type and subtype names, the
132	remainder of the header field is simply a set of parameters,
133	specified in an attribute/value notation.  The ordering of
134	parameters is not significant.
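
As an informal illustration (not part of this specification), the
structure described above can be explored with the "email" package
of the Python standard library.  The helper name
describe_content_type and the "x-note" parameter are hypothetical
and exist only for this sketch.

```python
from email import message_from_string


def describe_content_type(header_block: str):
    """Return (type, subtype, parameters) for a block of header fields."""
    msg = message_from_string(header_block)
    # get_params() lists the "type/subtype" token first, followed by
    # each attribute/value pair; it returns None if the field is absent.
    params = dict((msg.get_params() or [])[1:])
    return msg.get_content_maintype(), msg.get_content_subtype(), params


# The same parameters in a different order describe the same media type.
first = describe_content_type(
    "Content-Type: text/plain; charset=iso-8859-1; x-note=demo\n\n")
second = describe_content_type(
    "Content-Type: text/plain; x-note=demo; charset=iso-8859-1\n\n")
same_meaning = first == second
```

Collecting the parameters into a dictionary is one simple way to
make their ordering irrelevant to later comparisons.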

136	In general, the top-level media type is used to declare the
137	general type of data, while the subtype specifies a specific
138	format for that type of data.  Thus, a media type of
139	"image/xyz" is enough to tell a user agent that the data is an
140	image, even if the user agent has no knowledge of the specific
141	image format "xyz".  Such information can be used, for
142	example, to decide whether or not to show a user the raw data
143	from an unrecognized subtype -- such an action might be
144	reasonable for unrecognized subtypes of text, but not for
145	unrecognized subtypes of image or audio.  For this reason,
146	registered subtypes of text, image, audio, and video should
147	not contain embedded information that is really of a different
148	type.  Such compound formats should be represented using the
149	"multipart" or "application" types.

151	Parameters are modifiers of the media subtype, and as such do
152	not fundamentally affect the nature of the content.  The set
153	of meaningful parameters depends on the media type and
154	subtype.  Most parameters are associated with a single
155	specific subtype.  However, a given top-level media type may
156	define parameters which are applicable to any subtype of that
157	type.  Parameters may be required by their defining media type
158	or subtype or they may be optional.  MIME implementations must
159	also ignore any parameters whose names they do not recognize.

161	MIME's Content-Type header field and media type mechanism have
162	been carefully designed to be extensible, and it is expected
163	that the set of media type/subtype pairs and their associated
164	parameters will grow significantly over time.  Several other
165	MIME entities, most notably the list of names of character
166	sets registered for MIME usage, are likely to have new values
167	defined over time.  In order to ensure that the set of such
168	values is developed in an orderly, well-specified, and public
169	manner, MIME sets up a registration process which uses the
170	Internet Assigned Numbers Authority (IANA) as a central
171	registry for MIME's extension areas.  The registration process
172	is described in a companion document, RFC MIME-REG.

174	The initial seven standard top-level media types are defined
175	and described in the remainder of this document.

177	4.  Definition of a Top-Level Media Type

179	The definition of a top-level media type consists of:

181	 (1)   a name and a description of the type, including
182	       criteria for whether a particular format would qualify
183	       under that type,

185	 (2)   the names and definitions of parameters, if any, which
186	       are defined for all subtypes of that type (including
187	       whether such parameters are required or optional),

189	 (3)   how a user agent and/or gateway should handle unknown
190	       subtypes of this type,

192	 (4)   general considerations on gatewaying objects of this
193	       top-level type, if any, and

195	 (5)   any restrictions on content-transfer-encodings for
196	       objects of this top-level type.

198	5.  Overview Of The Initial Top-Level Media Types

200	The five discrete top-level media types are:

202	 (1)   text -- textual information.  The subtype "plain" in
203	       particular indicates plain (unformatted) text.  No
204	       special software is required to get the full meaning of
205	       the text, aside from support for the indicated
206	       character set.  Other subtypes are to be used for
207	       enriched text in forms where application software may
208	       enhance the appearance of the text, but such software
209	       must not be required in order to get the general idea
210	       of the content.  Possible subtypes thus include any
211	       word processor format that can be read without
212	       resorting to software that understands the format.  In
213	       particular, formats that employ embedded binary
214	       formatting information are not considered directly
215	       readable.  A very simple and portable subtype,
216	       richtext, was defined in RFC 1341 [RFC-1341], with a
217	       further revision in RFC 1563 [RFC-1563] under the name
218	       "enriched".

220	 (2)   image -- image data.  Image requires a display device
221	       (such as a graphical display, a graphics printer, or a
222	       FAX machine) to view the information.  An initial
223	       subtype is defined for the widely-used image format
224	       JPEG.

226	 (3)   audio -- audio data.  Audio requires an audio output
227	       device (such as a speaker or a telephone) to "display"
228	       the contents.  An initial subtype "basic" is defined in
229	       this document.

231	 (4)   video -- video data.  Video requires the capability to
232	       display moving images, typically including specialized
233	       hardware and software.  An initial subtype "mpeg" is
234	       defined in this document.

236	 (5)   application -- some other kind of data, typically
237	       either uninterpreted binary data or information to be
238	       processed by an application.  The subtype "octet-
239	       stream" is to be used in the case of uninterpreted
240	       binary data, in which case the simplest recommended
241	       action is to offer to write the information into a file
242	       for the user.  The "PostScript" subtype is also defined
243	       for the transport of PostScript material.  Other
244	       expected uses for "application" include spreadsheets,
245	       data for mail-based scheduling systems, and languages
246	       for "active" (computational) messaging, and word
247	       processing formats that are not directly readable.
248	       Note that security considerations may exist for some
249	       types of application data, most notably
250	       application/PostScript and any form of active
251	       messaging.  These issues are discussed later in this
252	       document.

254	The two composite top-level media types are:

256	 (1)   multipart -- data consisting of multiple parts of
257	       independent data types.  Four subtypes are initially
258	       defined, including the basic "mixed" subtype specifying
259	       a generic mixed set of parts, "alternative" for
260	       representing the same data in multiple formats,
261	       "parallel" for parts intended to be viewed
262	       simultaneously, and "digest" for multipart entities in
263	       which each part has a default type of "message/rfc822".

265	 (2)   message -- an encapsulated message.  A body of media
266	       type "message" is itself all or part of some kind of
267	       message object.  Such objects may in turn contain other
268	       messages and body parts of their own.  The "rfc822"
269	       subtype is used when the encapsulated content is itself
270	       an RFC 822 message.  The "partial" subtype is defined
271	       for partial RFC 822 messages, to permit the fragmented
272	       transmission of bodies that are thought to be too large
273	       to be passed through transport facilities in one piece.
274	       Another subtype, "external-body", is defined for
275	       specifying large bodies by reference to an external
276	       data source.

278	It should be noted that the list of media type values given
279	here may be augmented in time, via the mechanisms described
280	above, and that the set of subtypes is expected to grow
281	substantially.
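
The split between discrete and composite types listed in this
section could be captured in a small lookup, sketched here in
Python.  The names DISCRETE_TYPES, COMPOSITE_TYPES and
classify_top_level are illustrative only, and folding the type name
to lower case is an assumption of this sketch.

```python
DISCRETE_TYPES = frozenset({"text", "image", "audio", "video", "application"})
COMPOSITE_TYPES = frozenset({"multipart", "message"})


def classify_top_level(media_type: str) -> str:
    """Classify a "type/subtype" string by its top-level type."""
    top = media_type.split("/", 1)[0].strip().lower()
    if top in DISCRETE_TYPES:
        return "discrete"
    if top in COMPOSITE_TYPES:
        return "composite"
    # Anything else is outside the initial seven types.
    return "unknown"
```

A MIME processor might use such a lookup to decide whether a body
needs further MIME parsing (composite) or is handed to a non-MIME
mechanism (discrete).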

283	6.  Discrete Media Type Values

285	Five of the seven initial media type values refer to discrete
286	bodies.  The content of such entities is handled by non-MIME
287	mechanisms; they are opaque to MIME processors.

289	6.1.  Text Media Type

291	The text media type is intended for sending material which is
292	principally textual in form.  A "charset" parameter may be
293	used to indicate the character set of the body text for some
294	text subtypes, notably including the subtype "text/plain",
295	which indicates plain (unformatted) text.  The default media
296	type for Internet mail if none is specified is "text/plain;
297	charset=us-ascii".

299	Beyond plain text, there are many formats for representing
300	what might be known as "extended text" -- text with embedded
301	formatting and presentation information.  An interesting
302	characteristic of many such representations is that they are
303	to some extent readable even without the software that
304	interprets them.  It is useful, then, to distinguish them, at
305	the highest level, from such unreadable data as images, audio,
306	or text represented in an unreadable form.  In the absence of
307	appropriate interpretation software, it is reasonable to show
308	subtypes of text to the user, while it is not reasonable to do
309	so with most nontextual data.

311	Such formatted textual data should be represented using
312	subtypes of text.  Plausible subtypes of text are typically
313	given by the common name of the representation format, e.g.,
314	"text/enriched" [RFC-1563].

316	6.1.1.  Representation of Line Breaks

318	The canonical form of any MIME text type MUST represent a line
319	break as a CRLF sequence.  Similarly, any occurrence of CRLF
320	in text MUST represent a line break.  Use of CR and LF outside
321	of line break sequences is also forbidden.

323	This rule applies regardless of format or character set or
324	sets involved.
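
A minimal Python sketch of these rules follows.  It assumes that
the local text marks line breaks with LF, CR, or CRLF, which is a
property of the local system and not something this document
states; the function names are hypothetical.

```python
import re

# Any local line-break convention: CRLF, bare CR, or bare LF.
_ANY_BREAK = re.compile(r"\r\n|\r|\n")
# A CR not followed by LF, or an LF not preceded by CR.
_BARE_CR_OR_LF = re.compile(r"\r(?!\n)|(?<!\r)\n")


def to_canonical_text(local_text: str) -> str:
    """Rewrite every local line break as a CRLF sequence."""
    return _ANY_BREAK.sub("\r\n", local_text)


def is_canonical_text(text: str) -> bool:
    """True if CR and LF occur only as part of CRLF sequences."""
    return _BARE_CR_OR_LF.search(text) is None
```

Conversion happens once, before any content transfer encoding is
applied; the check can be used on the receiving side to detect
text that was not canonicalized.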

326	NOTE: The proper interpretation of line breaks when a body is
327	displayed depends on the media type. In particular, while it
328	is appropriate to treat a line break as a transition to a new
329	line when displaying a text/plain body, this treatment is
330	actually incorrect for other subtypes of text like
331	text/enriched [RFC-1563].

333	6.1.2.  Charset Parameter

335	A critical parameter that may be specified in the Content-Type
336	field for text/plain data is the character set.  This is
337	specified with a "charset" parameter, as in:

339	  Content-type: text/plain; charset=iso-8859-1

341	Unlike some other parameter values, the values of the charset
342	parameter are NOT case sensitive.  The default character set,
343	which must be assumed in the absence of a charset parameter,
344	is US-ASCII.
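
As a rough sketch, the default and the case-insensitive comparison
might be handled as follows with the Python standard library.  The
function name effective_charset is hypothetical.

```python
from email import message_from_string


def effective_charset(header_block: str) -> str:
    """Return the lower-cased charset, or "us-ascii" if none is given."""
    msg = message_from_string(header_block)
    # get_content_charset() folds the value to lower case and returns
    # failobj when the field or the parameter is missing.
    return msg.get_content_charset(failobj="us-ascii")
```

Folding the value to one case before comparing it is what makes
"ISO-8859-1" and "iso-8859-1" equivalent in this sketch.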

346	The specification for any future subtypes of "text" must
347	specify whether or not they will also utilize a "charset"
348	parameter, and may possibly restrict its values as well.  When
349	used with a particular body, the semantics of the "charset"
350	parameter should be identical to those specified here for
351	"text/plain", i.e., the body consists entirely of characters
352	in the given charset.  In particular, definers of future text
353	subtypes should pay close attention to the implications of
354	multioctet character sets for their subtype definitions.

356	This RFC specifies the definition of the charset parameter for
357	the purposes of MIME to be the name of a character set, as
358	"character set" is defined in RFC MIME-IMB.  The rules regarding
359	line breaks detailed in the previous section must also be
360	observed -- a character set whose definition does not conform
361	to these rules cannot be used in a MIME text type.

363	An initial list of predefined character set names can be found
364	at the end of this section.  Additional character sets may be
365	registered with IANA as described in RFC MIME-REG.

367	Note that if the specified character set includes 8-bit data,
368	a Content-Transfer-Encoding header field and a corresponding
369	encoding on the data are required in order to transmit the
370	body via some mail transfer protocols, such as SMTP.
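
The following Python sketch shows one way a sender might decide
whether an encoding is needed.  It assumes the quoted-printable
encoding from the companion document is chosen; the function names
and the returned label strings are illustrative, not normative.

```python
import quopri


def has_8bit_octets(body: bytes) -> bool:
    """True if any octet has its high bit set."""
    return any(octet > 127 for octet in body)


def encode_for_7bit_transport(body: bytes):
    """Return (Content-Transfer-Encoding value, encoded body)."""
    if has_8bit_octets(body):
        return "quoted-printable", quopri.encodestring(body)
    return "7bit", body


raw = "caf\u00e9".encode("iso-8859-1")   # contains the octet 0xE9
cte, wire = encode_for_7bit_transport(raw)
```

The first element of the result would be placed in the
Content-Transfer-Encoding header field, and the second element
would be sent as the body.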

372	The default character set, US-ASCII, has been the subject of
373	some confusion and ambiguity in the past.  Not only were there
374	some ambiguities in the definition, there have been wide
375	variations in practice.  In order to eliminate such ambiguity
376	and variations in the future, it is strongly recommended that
377	new user agents explicitly specify a character set as a media
378	type parameter in the Content-Type header field.  "US-ASCII"
379	does not indicate an arbitrary 7-bit character code, but
380	specifies that the body uses character coding that uses the
381	exact correspondence of octets to characters specified in US-
382	ASCII.  National use variations of ISO 646 [ISO-646] are NOT
383	US-ASCII and their use in Internet mail is explicitly
384	discouraged.  The omission of the ISO 646 character set is
385	deliberate in this regard.  The character set name of "US-
386	ASCII" explicitly refers to ANSI X3.4-1986 [US-ASCII] only.

388	The character set name "ASCII" is reserved and must not be
389	used for any purpose.

391	NOTE: RFC 821 explicitly specifies "ASCII", and references an
392	earlier version of the American Standard.  Insofar as one of
393	the purposes of specifying a media type and character set is
394	to permit the receiver to unambiguously determine how the
395	sender intended the coded message to be interpreted, assuming
396	anything other than "strict ASCII" as the default would risk
397	unintentional and incompatible changes to the semantics of
398	messages now being transmitted.  This also implies that
399	messages containing characters coded according to national
400	variations on ISO 646, or using code-switching procedures
401	(e.g., those of ISO 2022), as well as 8-bit or multiple octet
402	character encodings MUST use an appropriate character set
403	specification to be consistent with this specification.

405	The complete US-ASCII character set is listed in ANSI X3.4-
406	1986. Note that the control characters including DEL (0-31,
407	127) have no defined meaning apart from the combination CRLF
408	(US-ASCII values 13 and 10) indicating a new line.  Two of the
409	characters have de facto meanings in wide use: FF (12) often
410	means "start subsequent text on the beginning of a new page";
411	and TAB or HT (9) often (though not always) means "move the
412	cursor to the next available column after the current position
413	where the column number is a multiple of 8 (counting the first
414	column as column 0)."  Apart from this, any use of the control
415	characters or DEL in a body must be part of a private
416	agreement between the sender and recipient.  Such private
417	agreements are discouraged and should be replaced by the other
418	capabilities of this document.
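
The de facto TAB behavior described above amounts to a small
piece of arithmetic, sketched here in Python.  The function name
is hypothetical, and columns are counted from 0 as in the text.

```python
def next_tab_stop(column: int, width: int = 8) -> int:
    """Return the first column after `column` that is a multiple of `width`."""
    return (column // width + 1) * width


# Python's str.expandtabs() applies the same convention.
expanded = "ab\tc".expandtabs(8)
```

Here "ab" ends at column 2, so the TAB advances to column 8 and
the "c" is placed there.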

420	NOTE:  Beyond US-ASCII, an enormous proliferation of character
421	sets is possible.  It is the opinion of the IETF working group
422	that a large number of character sets is NOT a good thing.  We
423	would prefer to specify a SINGLE character set that can be
424	used universally for representing all of the world's languages
425	in Internet mail.  Unfortunately, existing practice in several
426	communities seems to point to the continued use of multiple
427	character sets in the near future.  For this reason, we define
428	names for a small number of character sets for which a strong
429	constituent base exists.

431	The defined charset values are:

433	 (1)   US-ASCII -- as defined in ANSI X3.4-1986 [US-ASCII].

435	 (2)   ISO-8859-X -- where "X" is to be replaced, as
436	       necessary, for the parts of ISO-8859 [ISO-8859].  Note
437	       that the ISO 646 character sets have deliberately been
438	       omitted in favor of their 8859 replacements, which are
439	       the designated character sets for Internet mail.  As of
440	       the publication of this document, the legitimate values
441	       for "X" are the digits 1 through 9.

443	All of these character sets are used as pure 7- or 8-bit sets
444	without any shift or escape functions.  The meaning of shift
445	and escape sequences in these character sets is not defined.

447	The character sets specified above are the ones that were
448	relatively uncontroversial during the drafting of MIME.  This
449	document does not endorse the use of any particular character
450	set other than US-ASCII, and recognizes that the future
451	evolution of world character sets remains unclear.  It is
452	expected that in the future, additional character sets will be
453	registered for use in MIME.

455	Note that the character set used, if anything other than US-
456	ASCII, must always be explicitly specified in the Content-Type
457	field.

459	No other character set name may be used in Internet mail
460	without the publication of a formal specification and its
461	registration with IANA, except by private agreement, in which
462	case the character set name must begin with "X-".
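
The naming rules in this section could be checked along the
following lines.  This Python sketch knows only the names defined
in this document; it cannot consult the IANA registry, so the
"unregistered" result is an assumption meaning "not defined here
and not marked private".  The function name and result labels are
hypothetical.

```python
import re

# US-ASCII and ISO-8859-X with X in 1..9, compared without regard to case.
_DEFINED = re.compile(r"us-ascii|iso-8859-[1-9]")


def charset_name_status(name: str) -> str:
    """Classify a charset name under the rules of this section."""
    lowered = name.lower()
    if lowered == "ascii":
        return "reserved"        # must not be used for any purpose
    if _DEFINED.fullmatch(lowered):
        return "defined"
    if lowered.startswith("x-"):
        return "private"         # usable only by private agreement
    return "unregistered"
```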

464	Implementors are discouraged from defining new character sets
465	unless absolutely necessary.

467	The "charset" parameter has been defined primarily for the
468	purpose of textual data, and is described in this section for
469	that reason.  However, it is conceivable that non-textual data
470	might also wish to specify a charset value for some purpose,
471	in which case the same syntax and values should be used.

473	In general, composition software should always use the "lowest
474	common denominator" character set possible.  For example, if a
475	body contains only US-ASCII characters, it should be marked as
476	being in the US-ASCII character set, not ISO-8859-1, which,
477	like all the ISO-8859 family of character sets, is a superset
478	of US-ASCII.  More generally, if a widely-used character set
479	is a subset of another character set, and a body contains only
480	characters in the widely-used subset, it should be labelled as
481	being in that subset.  This will increase the chances that the
482	recipient will be able to view the resulting object correctly.
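
One possible way for composition software to apply this advice is
to try candidate character sets from narrowest to widest and label
the body with the first one that can represent it.  The Python
sketch below assumes a candidate list of US-ASCII followed by
ISO-8859-1; the function name and the candidate list are
illustrative choices.

```python
def narrowest_charset(text: str, candidates=("us-ascii", "iso-8859-1")):
    """Return the first candidate able to encode `text`, or None."""
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue
        return name
    # None of the candidates can represent the text.
    return None
```

A body containing only US-ASCII characters is thus labelled
US-ASCII even though ISO-8859-1 could also represent it.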

484	6.1.3.  Plain Subtype

486	The simplest and most important subtype of text  is "plain".
487	This indicates plain (unformatted) text.  The default media
488	type of "text/plain; charset=us-ascii" for Internet mail
489	describes existing Internet practice.  That is, it is the type
490	of body defined by RFC 822.

492	No other text subtype is defined by this document.

494	6.1.4.  Unrecognized Subtypes

496	Unrecognized subtypes of text should be treated as subtype
497	"plain" as long as the MIME implementation knows how to handle
498	the charset.  Unrecognized subtypes which also specify an
499	unrecognized charset should be treated as "application/octet-
500	stream".
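
The fallback rule above can be sketched in Python as follows.  The
sets of recognized subtypes and charsets are assumptions standing
in for whatever a given implementation supports, and the function
name is hypothetical.

```python
KNOWN_TEXT_SUBTYPES = {"plain"}                # assumption for this sketch
KNOWN_CHARSETS = {"us-ascii", "iso-8859-1"}    # assumption for this sketch


def effective_text_type(subtype: str, charset: str = "us-ascii") -> str:
    """Return the media type an implementation would act on."""
    subtype, charset = subtype.lower(), charset.lower()
    if subtype in KNOWN_TEXT_SUBTYPES:
        # Recognized subtype: outside the scope of the fallback rule.
        return "text/" + subtype
    if charset in KNOWN_CHARSETS:
        return "text/plain"
    return "application/octet-stream"
```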

502	6.2.  Image Media Type

504	A media type of "image" indicates that the body contains an
505	image.  The subtype names the specific image format.  These
506	names are not case sensitive. An initial subtype is "jpeg" for
507	the JPEG format using JFIF encoding.

509	The list of image subtypes given here is neither exclusive nor
510	exhaustive, and is expected to grow as more types are
511	registered with IANA, as described in RFC MIME-REG.

513	Unrecognized subtypes of image should at a minimum be treated
514	as "application/octet-stream".  Implementations may optionally
515	elect to pass subtypes of image that they do not specifically
516	recognize to a robust general-purpose image viewing
517	application, if such an application is available.

519	6.3.  Audio Media Type

521	A media type of "audio" indicates that the body contains audio
522	data.  Although there is not yet a consensus on an "ideal"
523	audio format for use with computers, there is a pressing need
524	for a format capable of providing interoperable behavior.

526	The initial subtype of "basic" is specified to meet this
527	requirement by providing an absolutely minimal lowest common
528	denominator audio format.  It is expected that richer formats
529	for higher quality and/or lower bandwidth audio will be
530	defined by a later document.

532	The content of the "audio/basic" subtype is single channel
533	audio encoded using 8-bit ISDN mu-law [PCM] at a sample rate
534	of 8000 Hz.
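
As a rough illustration of what this format implies for a
receiver, the following Python sketch computes the playing time
of a decoded "audio/basic" body and expands one mu-law octet to
a linear sample.  The function names are hypothetical, and the
expansion follows the commonly used G.711 reference arithmetic
rather than anything specified in this document.

```python
SAMPLE_RATE_HZ = 8000  # audio/basic: 8000 samples per second
BIAS = 0x84            # bias used by the G.711 mu-law expansion


def basic_audio_duration(body: bytes) -> float:
    """Seconds of sound in an audio/basic body (one octet per sample)."""
    return len(body) / SAMPLE_RATE_HZ


def mulaw_to_linear(octet: int) -> int:
    """Expand one 8-bit mu-law octet to a signed linear sample."""
    u = ~octet & 0xFF              # mu-law octets are stored complemented
    exponent = (u >> 4) & 0x07     # 3-bit segment number
    mantissa = u & 0x0F            # 4-bit step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if u & 0x80 else magnitude
```

Because the format is single channel with one octet per
sample, the duration is simply the octet count divided by the
sample rate.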

536	Unrecognized subtypes of audio should at a minimum be treated
537	as "application/octet-stream".  Implementations may optionally
538	elect to pass subtypes of audio that they do not specifically
539	recognize to a robust general-purpose audio playing
540	application, if such an application is available.

542	6.4.  Video Media Type

544	A media type of "video" indicates that the body contains a
545	time-varying-picture image, possibly with color and
546	coordinated sound.  The term "video" is used extremely
547	generically, rather than with reference to any particular
548	technology or format, and is not meant to preclude subtypes
549	such as animated drawings encoded compactly.  The subtype
550	"mpeg" refers to video coded according to the MPEG standard
551	[MPEG].

553	Note that although in general this document strongly
554	discourages the mixing of multiple media in a single body, it
555	is recognized that many so-called "video" formats include a
556	representation for synchronized audio, and this is explicitly
557	permitted for subtypes of "video".

559	Unrecognized subtypes of video should at a minimum be treated
560	as "application/octet-stream".  Implementations may optionally
561	elect to pass subtypes of video that they do not specifically
562	recognize to a robust general-purpose video display
563	application, if such an application is available.

565	6.5.  Application Media Type

567	The "application" media type is to be used for discrete data
568	which do not fit in any of the other categories, and
569	particularly for data to be processed by some type of
570	application program.  This is information which must be
571	processed by an application before it is viewable or usable by
572	a user.  Expected uses for the application media type include
573	file transfer, spreadsheets, data for mail-based scheduling
574	systems, and languages for "active" (computational) messages.
575	(The latter, in particular, can pose security problems which
576	must be understood by implementors, and are considered in
577	detail in the discussion of the application/PostScript media
578	type.)

580	For example, a meeting scheduler might define a standard
581	representation for information about proposed meeting dates.
582	An intelligent user agent would use this information to
583	conduct a dialog with the user, and might then send additional
584	material based on that dialog.  More generally, there have
585	been several "active" messaging languages developed in which
586	programs in a suitably specialized language are transported to
587	a remote location and automatically run in the recipient's
588	environment.

590	Such applications may be defined as subtypes of the
591	"application" media type. This document defines two subtypes:
592	octet-stream, and PostScript.

594	The subtype of application will often be the name of the
595	application for which the data are intended.  This does not
596	mean, however, that any application program name may be used
597	freely as a subtype of application.  Usage of any subtype
598	(other than subtypes beginning with "x-") must be registered
599	with IANA, as described in RFC MIME-REG.

601	6.5.1.  Octet-Stream Subtype

603	The "octet-stream" subtype is used to indicate that a body
604	contains arbitrary binary data.  The set of currently defined
605	parameters is:

607	 (1)   TYPE -- the general type or category of binary data.
608	       This is intended as information for the human recipient
609	       rather than for any automatic processing.

611	 (2)   PADDING -- the number of bits of padding that were
612	       appended to the bit-stream comprising the actual
613	       contents to produce the enclosed 8-bit byte-oriented
614	       data.  This is useful for enclosing a bit-stream in a
615	       body when the total number of bits is not a multiple of
616	       8.

618	Both of these parameters are optional.
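
The PADDING parameter can be illustrated with a short Python
sketch.  It packs a string of "0" and "1" characters into
octets and reports how many padding bits were appended.  The
helper name and the choice of zero bits for the padding are
assumptions made for this example only.

```python
def pack_bits(bits: str):
    """Pack a string of '0'/'1' characters into octets.

    Returns (data, padding), where padding is the number of bits
    appended to reach a multiple of 8.
    """
    padding = (8 - len(bits) % 8) % 8
    padded = bits + "0" * padding  # assumption: pad with zero bits
    data = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return data, padding


# Ten bits of content need six bits of padding to fill two octets.
data, padding = pack_bits("1011001110")
header = f"Content-Type: application/octet-stream; padding={padding}"
```

A receiver that needs the original bit-stream would drop the
last "padding" bits of the final octet.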

620	An additional parameter, "CONVERSIONS", was defined in RFC
621	1341 but has since been removed.  RFC 1341 also defined the
622	use of a "NAME" parameter which gave a suggested file name to
623	be used if the data were to be written to a file.  This has
624	been deprecated in anticipation of a separate Content-
625	Disposition header field, to be defined in a subsequent RFC.

627	The recommended action for an implementation that receives an
628	application/octet-stream object is to simply offer to put the
629	data in a file, with any Content-Transfer-Encoding undone, or
630	perhaps to use it as input to a user-specified process.
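
A minimal Python sketch of undoing the
Content-Transfer-Encoding before the data is offered to the
user follows.  The function name is hypothetical; the encoding
names are those defined in [MIME-IMB], and the decoders are
taken from the Python standard library.

```python
import base64
import quopri


def undo_transfer_encoding(data: bytes, encoding: str) -> bytes:
    """Return the raw octets of a body given its Content-Transfer-Encoding."""
    name = encoding.strip().lower()
    if name in ("7bit", "8bit", "binary"):
        return data  # identity encodings: nothing to undo
    if name == "base64":
        return base64.b64decode(data)
    if name == "quoted-printable":
        return quopri.decodestring(data)
    raise ValueError(f"unsupported Content-Transfer-Encoding: {encoding!r}")
```

The decoded octets would then be written to a file chosen by
the user, never to a location or program named by the message
itself.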

632	To reduce the danger of transmitting rogue programs, it is
633	strongly recommended that implementations NOT implement a
634	path-search mechanism whereby an arbitrary program named in
635	the Content-Type parameter (e.g., an "interpreter=" parameter)
636	is found and executed using the message body as input.

638	6.5.2.  PostScript Subtype

640	A media type of "application/postscript" indicates a
641	PostScript program.  Currently two variants of the PostScript
642	language are allowed; the original level 1 variant is
643	described in [POSTSCRIPT] and the more recent level 2 variant
644	is described in [POSTSCRIPT2].

646	PostScript is a registered trademark of Adobe Systems, Inc.
647	Use of the MIME media type "application/postscript" implies
648	recognition of that trademark and all the rights it entails.

650	The PostScript language definition provides facilities for
651	internal labelling of the specific language features a given
652	program uses.  This labelling, called the PostScript document
653	structuring conventions, or DSC, is very general and provides
654	substantially more information than just the language level.
655	The use of document structuring conventions, while not
656	required, is strongly recommended as an aid to
657	interoperability.  Documents which lack proper structuring
658	conventions cannot be tested to see whether or not they will
659	work in a given environment.  As such, some systems may assume
660	the worst and refuse to process unstructured documents.

662	The execution of general-purpose PostScript interpreters
663	entails serious security risks, and implementors are
664	discouraged from simply sending PostScript bodies to "off-
665	the-shelf" interpreters.  While it is usually safe to send
666	PostScript to a printer, where the potential for harm is
667	greatly constrained by typical printer environments,
668	implementors should consider all of the following before they
669	add interactive display of PostScript bodies to their MIME
670	readers.

672	The remainder of this section outlines some, though probably
673	not all, of the possible problems with the transport of
674	PostScript objects.

676	 (1)   Dangerous operations in the PostScript language
677	       include, but may not be limited to, the PostScript
678	       operators "deletefile", "renamefile", "filenameforall",
679	       and "file".  "File" is only dangerous when applied to
680	       something other than standard input or output.
681	       Implementations may also define additional nonstandard
682	       file operators; these may also pose a threat to
683	       security. "Filenameforall", the wildcard file search
684	       operator, may appear at first glance to be harmless.
685	       Note, however, that this operator has the potential to
686	       reveal information about what files the recipient has
687	       access to, and this information may itself be
688	       sensitive.  Message senders should avoid the use of
689	       potentially dangerous file operators, since these
690	       operators are quite likely to be unavailable in secure
691	       PostScript implementations.  Message receiving and
692	       displaying software should either completely disable
693	       all potentially dangerous file operators or take
694	       special care not to delegate any special authority to
695	       their operation.  These operators should be viewed as
696	       being done by an outside agency when interpreting
697	       PostScript documents.  Such disabling and/or checking
698	       should be done completely outside of the reach of the
699	       PostScript language itself; care should be taken to
700	       ensure that no method exists for re-enabling full-
701	       function versions of these operators.

703	 (2)   The PostScript language provides facilities for exiting
704	       the normal interpreter, or server, loop.  Changes made
705	       in this "outer" environment are customarily retained
706	       across documents, and may in some cases be retained
707	       semipermanently in nonvolatile memory.  The operators
708	       associated with exiting the interpreter loop have the
709	       potential to interfere with subsequent document
710	       processing.  As such, their unrestrained use
711	       constitutes a threat of service denial.  PostScript
712	       operators that exit the interpreter loop include, but
713	       may not be limited to, the exitserver and startjob
714	       operators.  Message sending software should not
715	       generate PostScript that depends on exiting the
716	       interpreter loop to operate, since the ability to exit
717	       will probably be unavailable in secure PostScript
718	       implementations.  Message receiving and displaying
719	       software should completely disable the ability to make
720	       retained changes to the PostScript environment by
721	       eliminating or disabling the "startjob" and
722	       "exitserver" operations.  If these operations cannot be
723	       eliminated or completely disabled, the password
724	       associated with them should at least be set to a hard-
725	       to-guess value.

727	 (3)   PostScript provides operators for setting system-wide
728	       and device-specific parameters.  These parameter
729	       settings may be retained across jobs and may
730	       potentially pose a threat to the correct operation of
731	       the interpreter.  The PostScript operators that set
732	       system and device parameters include, but may not be
733	       limited to, the "setsystemparams" and "setdevparams"
734	       operators.  Message sending software should not
735	       generate PostScript that depends on the setting of
736	       system or device parameters to operate correctly.  The
737	       ability to set these parameters will probably be
738	       unavailable in secure PostScript implementations.
739	       Message receiving and displaying software should
740	       disable the ability to change system and device
741	       parameters.  If these operators cannot be completely
742	       disabled, the password associated with them should at
743	       least be set to a hard-to-guess value.

745	 (4)   Some PostScript implementations provide nonstandard
746	       facilities for the direct loading and execution of
747	       machine code.  Such facilities are quite obviously open
748	       to substantial abuse.  Message sending software should
749	       not make use of such features.  Besides being totally
750	       hardware-specific, they are also likely to be
751	       unavailable in secure implementations of PostScript.
752	       Message receiving and displaying software should not
753	       allow such operators to be used if they exist.

755	 (5)   PostScript is an extensible language, and many, if not
756	       most, implementations of it provide a number of their
757	       own extensions.  This document does not deal with such
758	       extensions explicitly since they constitute an unknown
759	       factor.  Message sending software should not make use
760	       of nonstandard extensions; they are likely to be
761	       missing from some implementations.  Message receiving
762	       and displaying software should make sure that any
763	       nonstandard PostScript operators are secure and don't
764	       present any kind of threat.

766	 (6)   It is possible to write PostScript that consumes huge
767	       amounts of various system resources.  It is also
768	       possible to write PostScript programs that loop
769	       indefinitely.  Both types of programs have the
770	       potential to cause damage if sent to unsuspecting
771	       recipients.  Message-sending software should avoid the
772	       construction and dissemination of such programs, which
773	       is antisocial.  Message receiving and displaying
774	       software should provide appropriate mechanisms to abort
775	       processing of a document after a reasonable amount of
776	       time has elapsed. In addition, PostScript interpreters
777	       should be limited to the consumption of only a
778	       reasonable amount of any given system resource.

780	 (7)   It is possible to include raw binary information inside
781	       PostScript in various forms.  This is not recommended
782	       for use in Internet mail, both because it is not
783	       supported by all PostScript interpreters and because it
784	       significantly complicates the use of a MIME Content-
785	       Transfer-Encoding.  (Without such binary, PostScript
786	       may typically be viewed as line-oriented data.  The
787	       treatment of CRLF sequences becomes extremely
788	       problematic if binary and line-oriented data are mixed
789	       in a single PostScript data stream.)

791	 (8)   Finally, bugs may exist in some PostScript interpreters
792	       which could possibly be exploited to gain unauthorized
793	       access to a recipient's system.  Apart from noting this
794	       possibility, there is no specific action to take to
795	       prevent this, apart from the timely correction of such
796	       bugs if any are found.
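
Message sending software may wish to check its own output for
the operators named in items (1) through (3).  The following
Python sketch is a purely textual check under simplifying
assumptions: the function name is hypothetical, comments are
removed naively, and string contents are not parsed.  It is in
no way a substitute for disabling these operators inside the
interpreter, since PostScript programs can construct operator
names at run time.

```python
import re

# Operator names taken from items (1) through (3) above.
RISKY_OPERATORS = {
    "deletefile", "renamefile", "filenameforall", "file",
    "exitserver", "startjob", "setsystemparams", "setdevparams",
}


def flag_risky_operators(ps_source: str) -> set:
    """Return the risky operator names that appear literally in the source."""
    found = set()
    for line in ps_source.splitlines():
        code = line.split("%", 1)[0]  # naive: ignores "%" inside strings
        for name in re.findall(r"[A-Za-z]+", code):
            if name in RISKY_OPERATORS:
                found.add(name)
    return found
```

An empty result does not show that a program is safe; a
non-empty result only suggests that it is unlikely to run in a
secure PostScript implementation.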

798	6.5.3.  Other Application Subtypes

800	It is expected that many other subtypes of application will be
801	defined in the future.  MIME implementations must at a minimum
802	treat any unrecognized subtypes as being equivalent to
803	"application/octet-stream".
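
The fallback rules given so far for the discrete media types
can be gathered into one small Python sketch.  The function
name and the two sets of recognized values are hypothetical
and stand for whatever a particular implementation supports.

```python
# Assumption: the types and charsets this implementation can handle.
KNOWN_TYPES = {
    "text/plain", "image/jpeg", "audio/basic", "video/mpeg",
    "application/octet-stream", "application/postscript",
}
KNOWN_CHARSETS = {"us-ascii"}
DISCRETE_TYPES = {"text", "image", "audio", "video", "application"}


def effective_type(media_type: str, charset: str = "us-ascii") -> str:
    """Map a discrete media type to the type it should be treated as."""
    media_type = media_type.lower()
    top_level = media_type.split("/", 1)[0]
    if top_level not in DISCRETE_TYPES:
        raise ValueError("not a discrete media type: " + media_type)
    if media_type in KNOWN_TYPES:
        return media_type
    # Unrecognized text subtypes fall back to plain text only when
    # the charset can be handled.
    if top_level == "text" and charset.lower() in KNOWN_CHARSETS:
        return "text/plain"
    return "application/octet-stream"
```

Composite types are deliberately rejected here because their
fallback rules differ from those of the discrete types.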

805	7.  Composite Media Type Values

807	The remaining two of the seven initial Content-Type values
808	refer to composite entities.  Composite entities are handled
809	using MIME mechanisms -- a MIME processor typically handles
810	the body directly.

812	7.1.  Multipart Media Type

814	In the case of multiple part entities, in which one or more
815	different sets of data are combined in a single body, a
816	"multipart" media type field must appear in the entity's
817	header.  The body must then contain one or more "body parts,"
818	each preceded by a boundary delimiter line, and the last one
819	followed by a closing boundary delimiter line.  After its
820	boundary delimiter line, each body part then consists of a
821	header area, a blank line, and a body area.  Thus a body part
822	is similar to an RFC 822 message in syntax, but different in
823	meaning.

825	A body part is NOT to be interpreted as actually being an RFC
826	822 message.  To begin with, NO header fields are actually
827	required in body parts.  A body part that starts with a blank
828	line, therefore, is allowed and is a body part for which all
829	default values are to be assumed.  In such a case, the absence
830	of a Content-Type header usually indicates that the
831	corresponding body has a content-type of "text/plain;
832	charset=US-ASCII".

834	The only header fields that have defined meaning for body
835	parts are those the names of which begin with "Content-".  All
836	other header fields are generally to be ignored in body parts.
837	Although they should generally be retained if at all possible,
838	they may be discarded by gateways if necessary.  Such other
839	fields are permitted to appear in body parts but must not be
840	depended on.  "X-" fields may be created for experimental or
841	private purposes, with the recognition that the information
842	they contain may be lost at some gateways.

844	NOTE:  The distinction between an RFC 822 message and a body
845	part is subtle, but important.  A gateway between Internet and
846	X.400 mail, for example, must be able to tell the difference
847	between a body part that contains an image and a body part
848	that contains an encapsulated message, the body of which is a
849	JPEG image.  In order to represent the latter, the body part
850	must have "Content-Type: message/rfc822", and its body (after
851	the blank line) must be the encapsulated message, with its own
852	"Content-Type: image/jpeg" header field.  The use of similar
853	syntax facilitates the conversion of messages to body parts,
854	and vice versa, but the distinction between the two must be
855	understood by implementors.  (For the special case in which
856	all parts actually are messages, a "digest" subtype is also
857	defined.)

859	As stated previously, each body part is preceded by a boundary
860	delimiter line that contains the boundary delimiter.  The
861	boundary delimiter MUST NOT appear inside any of the
862	encapsulated parts, on a line by itself or as the prefix of
863	any line.  This implies that it is crucial that the composing
864	agent be able to choose and specify a unique boundary
865	parameter value that does not contain the boundary parameter
866	value of an enclosing multipart as a prefix.

868	All present and future subtypes of the "multipart" type must
869	use an identical syntax.  Subtypes may differ in their
870	semantics, and may impose additional restrictions on syntax,
871	but must conform to the required syntax for the multipart
872	type.  This requirement ensures that all conformant user
873	agents will at least be able to recognize and separate the
874	parts of any multipart entity, even those of an unrecognized
875	subtype.

877	As stated in the definition of the Content-Transfer-Encoding
878	field [MIME-IMB], no encoding other than "7bit", "8bit", or
879	"binary" is permitted for entities of type "multipart".  The
880	multipart boundary delimiters and header fields are always
881	represented as 7-bit US-ASCII in any case (though the header
882	fields may encode non-US-ASCII header text as per RFC MIME-
883	HEADERS) and data within the body parts can be encoded on a
884	part-by-part basis, with Content-Transfer-Encoding fields for
885	each appropriate body part.

887	7.1.1.  Common Syntax

889	This section defines a common syntax for subtypes of
890	multipart.  All subtypes of multipart must use this syntax.  A
891	simple example of a multipart message also appears in this
892	section.  An example of a more complex multipart message is
893	given in RFC MIME-CONF.

895	The Content-Type field for multipart entities requires one
896	parameter, "boundary". The boundary delimiter line is then
897	defined as a line consisting entirely of two hyphen characters
898	("-", decimal value 45) followed by the boundary parameter
899	value from the Content-Type header field, optional linear
900	whitespace, and a terminating CRLF.

902	NOTE:  The hyphens are for rough compatibility with the
903	earlier RFC 934 method of message encapsulation, and for ease
904	of searching for the boundaries in some implementations.
905	However, it should be noted that multipart messages are NOT
906	completely compatible with RFC 934 encapsulations; in
907	particular, they do not obey RFC 934 quoting conventions for
908	embedded lines that begin with hyphens.  This mechanism was
909	chosen over the RFC 934 mechanism because the latter causes
910	lines to grow with each level of quoting.  The combination of
911	this growth with the fact that SMTP implementations sometimes
912	wrap long lines made the RFC 934 mechanism unsuitable for use
913	in the event that deeply-nested multipart structuring is ever
914	desired.

916	WARNING TO IMPLEMENTORS:  The grammar for parameters on the
917	Content-type field is such that it is often necessary to
918	enclose the boundary parameter values in quotes on the
919	Content-type line.  This is not always necessary, but never
920	hurts. Implementors should be sure to study the grammar
921	carefully in order to avoid producing invalid Content-type
922	fields.  Thus, a typical multipart Content-Type header field
923	might look like this:

925	  Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08j34c0p

927	But the following is not valid:

929	  Content-Type: multipart/mixed; boundary=gc0pJq0M:08jU534c0p

931	(because of the colon) and must instead be represented as

933	  Content-Type: multipart/mixed; boundary="gc0pJq0M:08jU534c0p"

935	This Content-Type value indicates that the content consists of
936	one or more parts, each with a structure that is syntactically
937	identical to an RFC 822 message, except that the header area
938	is allowed to be completely empty, and that the parts are each
939	preceded by the line

941	  --gc0pJq0M:08jU534c0p

943	The boundary delimiter MUST occur at the beginning of a line,
944	i.e., following a CRLF, and the initial CRLF is considered to
945	be attached to the boundary delimiter line rather than part of
946	the preceding part.  The boundary may be followed by zero or
947	more characters of linear whitespace. It is then terminated by
948	either another CRLF and the header fields for the next part,
949	or by two CRLFs, in which case there are no header fields for
950	the next part.  If no Content-Type field is present, the part
951	is assumed to be of type "message/rfc822" in a
952	"multipart/digest" and of type "text/plain" otherwise.

954	NOTE:  The CRLF preceding the boundary delimiter line is
955	conceptually attached to the boundary so that it is possible
956	to have a part that does not end with a CRLF (line break).
957	Body parts that must be considered to end with line breaks,
958	therefore, must have two CRLFs preceding the boundary
959	delimiter line, the first of which is part of the preceding
960	body part, and the second of which is part of the
961	encapsulation boundary.

963	Boundary delimiters must not appear within the encapsulated
964	material, and must be no longer than 70 characters, not
965	counting the two leading hyphens.

967	The boundary delimiter line following the last body part is a
968	distinguished delimiter that indicates that no further body
969	parts will follow.  Such a delimiter line is identical to the
970	previous delimiter lines, with the addition of two more
971	hyphens after the boundary parameter value.

973	  --gc0pJq0M:08jU534c0p--

975	NOTE TO IMPLEMENTORS:  Boundary string comparisons must
976	compare the boundary value with the beginning of each
977	candidate line.  An exact match of the entire candidate line
978	is not required; it is sufficient that the boundary appear in
979	its entirety following the CRLF.
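
The comparison described in this note might be sketched in
Python as follows.  The function name and return values are
hypothetical.  The candidate line is assumed to have been
split off after the preceding CRLF, and only its beginning is
compared.

```python
def classify_line(line: bytes, boundary: str):
    """Return "delimiter", "close", or None for one candidate line."""
    prefix = b"--" + boundary.encode("ascii")
    if not line.startswith(prefix):
        return None
    rest = line[len(prefix):]
    # Two more hyphens after the boundary value mark the final delimiter.
    return "close" if rest.startswith(b"--") else "delimiter"
```

Trailing white space on the line does not affect the result,
because nothing after the boundary value (and the optional
closing hyphens) is examined.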

981	There appears to be room for additional information prior to
982	the first boundary delimiter line and following the final
983	boundary delimiter line.  These areas should generally be left
984	blank, and implementations must ignore anything that appears
985	before the first boundary delimiter line or after the last
986	one.

988	NOTE:  These "preamble" and "epilogue" areas are generally not
989	used because of the lack of proper typing of these parts and
990	the lack of clear semantics for handling these areas at
991	gateways, particularly X.400 gateways.  However, rather than
992	leaving the preamble area blank, many MIME implementations
993	have found this to be a convenient place to insert an
994	explanatory note for recipients who read the message with
995	pre-MIME software, since such notes will be ignored by MIME-
996	compliant software.

998	NOTE:  Because boundary delimiters must not appear in the body
999	parts being encapsulated, a user agent must exercise care to
1000	choose a unique boundary parameter value.  The boundary
1001	parameter value in the example above could have been the
1002	result of an algorithm designed to produce boundary delimiters
1003	with a very low probability of already existing in the data to
1004	be encapsulated without having to prescan the data.  Alternate
1005	algorithms might result in more "readable" boundary delimiters
1006	for a recipient with an old user agent, but would require more
1007	attention to the possibility that the boundary delimiter might
1008	appear at the beginning of some line in the encapsulated part.
1009	The simplest boundary delimiter line possible is something
1010	like "---", with a closing boundary delimiter line of "-----".

1012	As a very simple example, the following multipart message has
1013	two parts, both of them plain text, one of them explicitly
1014	typed and one of them implicitly typed:

1016	  From: Nathaniel Borenstein <nsb@bellcore.com>
1017	  To: Ned Freed <ned@innosoft.com>
1018	  Date: Sun, 21 Mar 1993 23:56:48 -0800 (PST)
1019	  Subject: Sample message
1020	  MIME-Version: 1.0
1021	  Content-type: multipart/mixed; boundary="simple boundary"

1023	  This is the preamble.  It is to be ignored, though it
1024	  is a handy place for composition agents to include an
1025	  explanatory note to non-MIME conformant readers.

1027	  --simple boundary

1029	  This is implicitly typed plain US-ASCII text.
1030	  It does NOT end with a linebreak.
1031	  --simple boundary
1032	  Content-type: text/plain; charset=us-ascii

1034	  This is explicitly typed plain US-ASCII text.
1035	  It DOES end with a linebreak.

1037	  --simple boundary--

1039	  This is the epilogue.  It is also to be ignored.

1041	The use of a media type of multipart in a body part within
1042	another multipart entity is explicitly allowed.  In such
1043	cases, for obvious reasons, care must be taken to ensure that
1044	each nested multipart entity uses a different boundary
1045	delimiter.  See RFC MIME-CONF for an example of nested
1046	multipart entities.

1048	The use of the multipart media type with only a single body
1049	part may be useful in certain contexts, and is explicitly
1050	permitted.

1052	The only mandatory global parameter for the multipart media
1053	type is the boundary parameter, which consists of 1 to 70
1054	characters from a set of characters known to be very robust
1055	through mail gateways, and NOT ending with white space. (If a
1056	boundary delimiter line appears to end with white space, the
1057	white space must be presumed to have been added by a gateway,
1058	and must be deleted.)  It is formally specified by the
1059	following BNF:

1061	  boundary := 0*69<bchars> bcharsnospace

1063	  bchars := bcharsnospace / " "

1065	  bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" /
1066	                   "+" / "_" / "," / "-" / "." /
1067	                   "/" / ":" / "=" / "?"
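
A direct transcription of this grammar into Python regular
expressions might look like the sketch below.  The names are
hypothetical.  The second expression lists the boundary
characters that are not "tspecials" in the parameter grammar
of [MIME-IMB], and is used to decide whether the value has to
be quoted in the Content-Type field.

```python
import re

BCHARSNOSPACE = r"0-9A-Za-z'()+_,\-./:=?"
# boundary := 0*69<bchars> bcharsnospace
BOUNDARY_RE = re.compile(rf"[{BCHARSNOSPACE} ]{{0,69}}[{BCHARSNOSPACE}]")
# Boundary characters that may appear in an unquoted parameter value.
TOKEN_RE = re.compile(r"[0-9A-Za-z'+_.\-]+")


def is_valid_boundary(value: str) -> bool:
    """True if value satisfies the boundary grammar (1 to 70 characters)."""
    return BOUNDARY_RE.fullmatch(value) is not None


def boundary_parameter(value: str) -> str:
    """Format the boundary parameter, quoting the value when required."""
    if not is_valid_boundary(value):
        raise ValueError("invalid boundary: " + repr(value))
    if TOKEN_RE.fullmatch(value):
        return "boundary=" + value
    return 'boundary="' + value + '"'
```

Quoting unconditionally would also be acceptable, since
quoting is never harmful.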

1069	Overall, the body of a multipart entity may be specified as
1070	follows:

1072	  dash-boundary := "--" boundary
1073	                   ; boundary taken from the value of
1074	                   ; boundary parameter of the
1075	                   ; Content-Type field.

1077	  multipart-body := [preamble CRLF]
1078	                    dash-boundary transport-padding CRLF
1079	                    body-part *encapsulation
1080	                    close-delimiter transport-padding
1081	                    [CRLF epilogue]

1083	  transport-padding := *LWSP-char
1084	                       ; Composers MUST NOT generate
1085	                       ; non-zero length transport
1086	                       ; padding, but receivers MUST
1087	                       ; be able to handle padding
1088	                       ; added by message transports.

1090	  encapsulation := delimiter transport-padding
1091	                   CRLF body-part

1093	  delimiter := CRLF dash-boundary

1095	  close-delimiter := delimiter "--"

1097	  preamble := discard-text

1099	  epilogue := discard-text

1101	  discard-text := *(*text CRLF) *text
1102	                  ; To be ignored upon receipt.

1104	  body-part := <"message" as defined in RFC 822, with all
1105	                header fields optional, not starting with the
1106	                specified dash-boundary, and with the
1107	                delimiter not occurring anywhere in the
1108	                body part.  Note that the semantics of a
1109	                part differ from the semantics of a message,
1110	                as described in the text.>

1112	IMPORTANT NOTE:  The free insertion of linear-white-space and
1113	RFC 822 comments between the elements shown in this BNF is NOT
1114	allowed since this BNF does not specify a structured header
1115	field.
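
The grammar above can be turned into a small splitter, shown
here as a Python sketch under simplifying assumptions.  The
function name is hypothetical, the whole body is assumed to be
in memory with CRLF line endings, and delimiter lines are
matched strictly as the boundary followed only by optional
closing hyphens and transport padding.

```python
import re


def split_multipart(body: bytes, boundary: str):
    """Split a multipart body into (preamble, parts, epilogue)."""
    dash = re.escape(b"--" + boundary.encode("ascii"))
    # delimiter := CRLF dash-boundary; the first one may start the body.
    delim = re.compile(rb"(?:\A|\r\n)" + dash + rb"(--)?[ \t]*(?=\r\n|\Z)")
    preamble, parts, epilogue = b"", [], b""
    part_start = None
    for match in delim.finditer(body):
        if part_start is None:
            preamble = body[:match.start()]
        else:
            parts.append(body[part_start:match.start()])
        if match.group(1):  # close-delimiter: the rest is the epilogue
            epilogue = body[match.end():]
            if epilogue.startswith(b"\r\n"):
                epilogue = epilogue[2:]
            return preamble, parts, epilogue
        part_start = match.end() + 2  # skip the CRLF ending the delimiter line
    if part_start is None:
        raise ValueError("no boundary delimiter line found")
    parts.append(body[part_start:])  # truncated entity: no close-delimiter
    return preamble, parts, epilogue
```

Each returned part still contains its own header area, blank
line, and body area.  The CRLF that precedes a delimiter is
treated as part of the delimiter, so a part ends with a line
break only when two CRLFs precede the next delimiter line.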

1117	NOTE:  In certain transport enclaves, RFC 822 restrictions
1118	such as the one that limits bodies to printable US-ASCII
1119	characters may not be in force.  (That is, the transport
1120	domains may resemble standard Internet mail transport as
1121	specified in RFC 821 and assumed by RFC 822, but without
1122	certain restrictions.) The relaxation of these restrictions
1123	should be construed as locally extending the definition of
1124	bodies, for example to include octets outside of the US-ASCII
1125	range, as long as these extensions are supported by the
1126	transport and adequately documented in the Content-Transfer-
1127	Encoding header field.  However, in no event are headers
1128	(either message headers or body-part headers) allowed to
1129	contain anything other than US-ASCII characters.

1131	NOTE:  Conspicuously missing from the multipart type is a
1132	notion of structured, related body parts.  In general, it
1133	seems premature to try to standardize interpart structure yet.
1134	It is recommended that those wishing to provide a more
1135	structured or integrated multipart messaging facility should
1136	define a subtype of multipart that is syntactically identical,
1137	but that always expects the inclusion of a distinguished part
1138	that can be used to specify the structure and integration of
1139	the other parts, probably referring to them by their Content-
1140	ID field.  If this approach is used, implementations that do
1141	not recognize the new subtype will treat it as the primary
1142	subtype (multipart/mixed) and will thus be able to show the
1143	user the parts that are recognized.

1145	7.1.2.  Handling Nested Messages and Multiparts

1147	The "message/rfc822" subtype defined in a subsequent section
1148	of this document has no terminating condition other than
1149	running out of data. Similarly, an improperly truncated
1150	multipart object may not have any terminating boundary marker,
1151	and can turn up operationally due to mail system malfunctions.

1153	It is essential that such objects be handled correctly when
1154	they are themselves embedded inside of another multipart
1155	structure.  MIME implementations are therefore required to
1156	recognize outer level boundary markers at ANY level of inner
1157	nesting.  It is not sufficient to only check for the next
1158	expected marker or other terminating condition.
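
One way to meet this requirement is to compare every line
against the boundaries of all enclosing multipart entities,
not just the innermost one.  The Python sketch below assumes
that the parser keeps a list of the currently open boundary
values, outermost first; the function name is hypothetical.

```python
def find_boundary_level(line: bytes, open_boundaries):
    """Return the nesting level whose boundary starts this line, or None.

    open_boundaries lists the boundary values of all enclosing
    multipart entities, outermost first (level 0).
    """
    for level, boundary in enumerate(open_boundaries):
        if line.startswith(b"--" + boundary.encode("ascii")):
            return level
    return None
```

When the returned level is lower than the current nesting
depth, every entity nested below that level has ended, whether
or not its own terminating condition was seen.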

1160	7.1.3.  Mixed Subtype

1162	The "mixed" subtype of multipart is intended for use when the
1163	body parts are independent and need to be bundled in a
1164	particular order.  Any multipart subtypes that an
1165	implementation does not recognize must be treated as being of
1166	subtype "mixed".

1168	7.1.4.  Alternative Subtype

1170	The multipart/alternative type is syntactically identical to
1171	multipart/mixed, but the semantics are different.  In
1172	particular, each of the parts is an "alternative" version of
1173	the same information.

1175	Systems should recognize that the content of the various parts
1176	is interchangeable.  Systems should choose the "best" type
1177	based on the local environment and preferences, in some cases
1178	even through user interaction.  As with multipart/mixed, the
1179	order of body parts is significant.  In this case, the
1180	alternatives appear in an order of increasing faithfulness to
1181	the original content.  In general, the best choice is the LAST
1182	part of a type supported by the recipient system's local
1183	environment.

1185	Multipart/alternative may be used, for example, to send a
1186	message in a fancy text format in such a way that it can
1187	easily be displayed anywhere:

1189	  From: Nathaniel Borenstein <nsb@bellcore.com>
1190	  To: Ned Freed <ned@innosoft.com>
1191	  Date: Mon, 22 Mar 1993 09:41:09 -0800 (PST)
1192	  Subject: Formatted text mail
1193	  MIME-Version: 1.0
1194	  Content-Type: multipart/alternative; boundary=boundary42

1196	  --boundary42
1197	  Content-Type: text/plain; charset=us-ascii

1199	    ... plain text version of message goes here ...

1201	  --boundary42
1202	  Content-Type: text/enriched

1204	    ... RFC 1563 text/enriched version of same message
1205	        goes here ...

1207	  --boundary42
1208	  Content-Type: application/x-whatever

1210	    ... fanciest version of same message goes here ...

1212	  --boundary42--

1214	In this example, users whose mail systems understood the
1215	"application/x-whatever" format would see only the fancy
1216	version, while other users would see only the enriched or
1217	plain text version, depending on the capabilities of their
1218	system.

1220	In general, user agents that compose multipart/alternative
1221	entities must place the body parts in increasing order of
1222	preference, that is, with the preferred format last.  For
1223	fancy text, the sending user agent should put the plainest
1224	format first and the richest format last.  Receiving user
1225	agents should pick and display the last format they are
1226	capable of displaying.  In the case where one of the
1227	alternatives is itself of type "multipart" and contains
1228	unrecognized sub-parts, the user agent may choose either to
1229	show that alternative, an earlier alternative, or both.
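
The selection rule for receiving user agents can be sketched in
Python as follows.  This is a minimal illustration, assuming the
media types of the body parts have already been extracted in
order; the function name choose_alternative and its arguments are
hypothetical.

```python
def choose_alternative(part_types, supported):
    """Return the index of the LAST part whose media type is
    supported, or None if no part can be displayed."""
    # Media type names are compared without regard to case.
    supported_lower = {t.lower() for t in supported}
    chosen = None
    for index, media_type in enumerate(part_types):
        if media_type.lower() in supported_lower:
            chosen = index  # a later part overrides an earlier one
    return chosen


parts = ["text/plain", "text/enriched", "application/x-whatever"]
best = choose_alternative(parts, {"text/plain", "text/enriched"})
```

With the three parts of the example above and a system that
handles only text/plain and text/enriched, the text/enriched part
is the one selected.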

1231	NOTE:  From an implementor's perspective, it might seem more
1232	sensible to reverse this ordering, and have the plainest
1233	alternative last.  However, placing the plainest alternative
1234	first is the friendliest possible option when
1235	multipart/alternative entities are viewed using a non-MIME-
1236	conformant viewer.  While this approach does impose some
1237	burden on conformant MIME viewers, interoperability with older
1238	mail readers was deemed to be more important in this case.

1240	It may be the case that some user agents, if they can
1241	recognize more than one of the formats, will prefer to offer
1242	the user the choice of which format to view.  This makes
1243	sense, for example, if a message includes both a nicely-
1244	formatted image version and an easily-edited text version.
1245	What is most critical, however, is that the user not
1246	automatically be shown multiple versions of the same data.
1247	Either the user should be shown the last recognized version or
1248	should be given the choice.

1250	NOTE ON THE SEMANTICS OF CONTENT-ID IN MULTIPART/ALTERNATIVE:
1251	Each part of a multipart/alternative entity represents the
1252	same data, but the mappings between the two are not
1253	necessarily without information loss.  For example,
1254	information is lost when translating ODA to PostScript or
1255	plain text.  It is recommended that each part should have a
1256	different Content-ID value in the case where the information
1257	content of the two parts is not identical.  And when the
1258	information content is identical -- for example, where several
1259	parts of type "message/external-body" specify alternate ways
1260	to access the identical data -- the same Content-ID field
1261	value should be used, to optimize any caching mechanisms that
1262	might be present on the recipient's end.  However, the
1263	Content-ID values used by the parts should NOT be the same
1264	Content-ID value that describes the multipart/alternative as a
1265	whole, if there is any such Content-ID field.  That is, one
1266	Content-ID value will refer to the multipart/alternative
1267	entity, while one or more other Content-ID values will refer
1268	to the parts inside it.

1270	7.1.5.  Digest Subtype

1272	This document defines a "digest" subtype of the multipart
1273	Content-Type.  This type is syntactically identical to
1274	multipart/mixed, but the semantics are different.  In
1275	particular, in a digest, the default Content-Type value for a
1276	body part is changed from "text/plain" to "message/rfc822".
1277	This is done to allow a more readable digest format that is
1278	largely compatible (except for the quoting convention) with
1279	RFC 934.

1281	A digest in this format might, then, look something like this:

1283	  From: Moderator-Address
1284	  To: Recipient-List
1285	  Date: Tue, 22 Mar 1994 13:34:51 +0000
1286	  Subject: Internet Digest, volume 42
1287	  MIME-Version: 1.0
1288	  Content-Type: multipart/digest;
1289	                boundary="---- next message ----"

1291	  ------ next message ----

1293	  From: someone-else
1294	  Date: Fri, 26 Mar 1993 11:13:32 +0200
1295	  Subject: my opinion

1297	    ...body goes here ...

1299	  ------ next message ----

1301	  From: someone-else-again
1302	  Date: Fri, 26 Mar 1993 10:07:13 -0500
1303	  Subject: my different opinion

1305	    ... another body goes here ...

1307	  ------ next message ------
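
The changed default can be sketched in Python as follows.  This is
a minimal illustration; the function name default_part_type is
hypothetical, and the enclosing media type is assumed to have been
extracted already, without its parameters.

```python
def default_part_type(enclosing_type):
    """Return the media type assumed for a body part that has no
    Content-Type field of its own."""
    if enclosing_type.strip().lower() == "multipart/digest":
        return "message/rfc822"
    return "text/plain"
```

In the digest above, neither body part carries a Content-Type
field, so each is treated as message/rfc822.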

1309	7.1.6.  Parallel Subtype

1311	This document defines a "parallel" subtype of the multipart
1312	Content-Type.  This type is syntactically identical to
1313	multipart/mixed, but the semantics are different.  In
1314	particular, in a parallel entity, the order of body parts is
1315	not significant.

1317	A common presentation of this type is to display all of the
1318	parts simultaneously on hardware and software that are capable
1319	of doing so.  However, composing agents should be aware that
1320	many mail readers will lack this capability and will show the
1321	parts serially in any event.

1323	7.1.7.  Other Multipart Subtypes

1325	Other multipart subtypes are expected in the future.  MIME
1326	implementations must in general treat unrecognized subtypes of
1327	multipart as being equivalent to "multipart/mixed".

1329	7.2.  Message Media Type

1331	It is frequently desirable, in sending mail, to encapsulate
1332	another mail message.  A special media type, "message", is
1333	defined to facilitate this.  In particular, the "rfc822"
1334	subtype of "message" is used to encapsulate RFC 822 messages.

1336	NOTE:  It has been suggested that subtypes of message might be
1337	defined for forwarded or rejected messages.  However,
1338	forwarded and rejected messages can be handled as multipart
1339	messages in which the first part contains any control or
1340	descriptive information, and a second part, of type
1341	message/rfc822, is the forwarded or rejected message.
1342	Composing rejection and forwarding messages in this manner
1343	will preserve the type information on the original message and
1344	allow it to be correctly presented to the recipient, and hence
1345	is strongly encouraged.

1347	Subtypes of message often impose restrictions on what
1348	encodings are allowed.  These restrictions are described in
1349	conjunction with each specific subtype.

1351	Mail gateways, relays, and other mail handling agents are
1352	commonly known to alter the top-level header of an RFC 822
1353	message.  In particular, they frequently add, remove, or
1354	reorder header fields.  Such alterations are explicitly
1355	forbidden for the encapsulated headers embedded in the bodies
1356	of messages of type "message."

1358	7.2.1.  RFC822 Subtype

1360	A media type of "message/rfc822" indicates that the body
1361	contains an encapsulated message, with the syntax of an RFC
1362	822 message.  However, unlike top-level RFC 822 messages, the
1363	restriction that each message/rfc822 body must include a
1364	"From", "Date", and at least one destination header is removed
1365	and replaced with the requirement that at least one of "From",
1366	"Subject", or "Date" must be present.

1368	No encoding other than "7bit", "8bit", or "binary" is
1369	permitted for body parts of type "message/rfc822".  The
1370	message header fields are always US-ASCII in any case, and
1371	data within the body can still be encoded, in which case the
1372	Content-Transfer-Encoding header field in the encapsulated
1373	message will reflect this.  Non-US-ASCII text in the headers
1374	of an encapsulated message can be specified using the
1375	mechanisms described in RFC MIME-HEADERS.

1377	It should be noted that, despite the use of the numbers "822",
1378	a message/rfc822 entity can include enhanced information as
1379	defined in this document.  In other words, a message/rfc822
1380	message may be a MIME message.

1382	7.2.2.  Partial Subtype

1384	The "partial" subtype is defined to allow large entities to be
1385	delivered as several separate pieces of mail and automatically
1386	reassembled by a receiving user agent.  (The concept is
1387	similar to IP fragmentation and reassembly in the basic
1388	Internet Protocols.)  This mechanism can be used when
1389	intermediate transport agents limit the size of individual
1390	messages that can be sent.  The media type "message/partial"
1391	thus indicates that the body contains a fragment of a larger
1392	entity.

1394	Three parameters must be specified in the Content-Type field
1395	of type message/partial:  The first, "id", is a unique
1396	identifier, as close to a world-unique identifier as possible,
1397	to be used to match the fragments together. (In general, the
1398	identifier is essentially a message-id; if placed in double
1399	quotes, it can be ANY message-id, in accordance with the BNF
1400	for "parameter" given earlier in this specification.)  The
1401	second, "number", an integer, is the fragment number, which
1402	indicates where this fragment fits into the sequence of
1403	fragments.  The third, "total", another integer, is the total
1404	number of fragments. This third subfield is required on the
1405	final fragment, and is optional (though encouraged) on the
1406	earlier fragments.  Note also that these parameters may be
1407	given in any order.

1409	Thus, the second piece of a 3-piece message may have either of
1410	the following header fields:

1412	  Content-Type: Message/Partial; number=2; total=3;
1413	                id="oc=jpbe0M2Yt4s@thumper.bellcore.com"

1415	  Content-Type: Message/Partial;
1416	                id="oc=jpbe0M2Yt4s@thumper.bellcore.com";
1417	                number=2

1419	But the third piece MUST specify the total number of
1420	fragments:

1422	  Content-Type: Message/Partial; number=3; total=3;
1423	                id="oc=jpbe0M2Yt4s@thumper.bellcore.com"

1425	Note that fragment numbering begins with 1, not 0.
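
A receiver can decide whether all fragments have arrived along the
following lines.  This Python sketch is illustrative only; it
assumes the fragments sharing one "id" have been collected into a
mapping from fragment number to the "total" value seen on that
fragment (None when absent), and the function name
fragments_complete is hypothetical.

```python
def fragments_complete(fragments):
    """fragments maps fragment number -> total (or None if the
    fragment carried no "total" parameter)."""
    totals = {t for t in fragments.values() if t is not None}
    if len(totals) != 1:
        # Either no fragment has stated the total yet, or the
        # fragments disagree about it.
        return False
    total = totals.pop()
    # Fragment numbering begins with 1, not 0.
    return set(fragments) == set(range(1, total + 1))
```

Because "total" is required only on the final fragment, the set
cannot be judged complete until a fragment carrying it has arrived.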

1427	When the fragments of an entity broken up in this manner are
1428	put together, the result is a complete MIME entity, which may
1429	have its own Content-Type header field, and thus may contain
1430	any other data type.

1432	7.2.2.1.  Message Fragmentation and Reassembly

1434	The semantics of a reassembled partial message must be those
1435	of the "inner" message, rather than of a message containing
1436	the inner message.  This makes it possible, for example, to
1437	send a large audio message as several partial messages, and
1438	still have it appear to the recipient as a simple audio
1439	message rather than as an encapsulated message containing an
1440	audio message.  That is, the encapsulation of the message is
1441	considered to be "transparent".

1443	When generating and reassembling the pieces of a
1444	message/partial message, the headers of the encapsulated
1445	message must be merged with the headers of the enclosing
1446	entities.  In this process the following rules must be
1447	observed:

1449	 (1)   All of the header fields from the initial enclosing
1450	       message, except those that start with "Content-" and
1451	       the specific header fields "Subject", "Message-ID",
1452	       "Encrypted", and "MIME-Version", must be copied, in
1453	       order, to the new message.

1455	 (2)   The header fields in the enclosed message which start
1456	       with "Content-", plus the "Subject", "Message-ID",
1457	       "Encrypted", and "MIME-Version" fields, must be
1458	       appended, in order, to the header fields of the new
1459	       message.  Any header fields in the enclosed message
1460	       which do not start with "Content-" (except for the
1461	       "Subject", "Message-ID", "Encrypted", and "MIME-
1462	       Version" fields) will be ignored and dropped.

1464	 (3)   All of the header fields from the second and any
1465	       subsequent enclosing messages are discarded by the
1466	       reassembly process.
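
The three rules above can be sketched in Python as follows.  This
is a minimal illustration, assuming header fields are held as
ordered lists of (name, value) pairs; the names merge_headers and
taken_from_inner are hypothetical.

```python
# Fields other than "Content-*" that are taken from the enclosed message.
INNER_FIELDS = {"subject", "message-id", "encrypted", "mime-version"}


def taken_from_inner(name):
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in INNER_FIELDS


def merge_headers(first_outer, inner):
    """first_outer: header fields of the FIRST enclosing message.
    inner: header fields of the enclosed message.
    Header fields of later fragments are never passed in (rule 3)."""
    # Rule 1: copy the outer fields, in order, except the listed ones.
    merged = [(n, v) for n, v in first_outer if not taken_from_inner(n)]
    # Rule 2: append the listed inner fields, in order; drop the rest.
    merged += [(n, v) for n, v in inner if taken_from_inner(n)]
    return merged
```

The result keeps fields such as "From", "To", and "Date" from the
first fragment, while "Subject", "Message-ID", "MIME-Version", and
all "Content-" fields come from the enclosed message.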

1468	7.2.2.2.  Fragmentation and Reassembly Example

1470	If an audio message is broken into two pieces, the first piece
1471	might look something like this:

1473	  X-Weird-Header-1: Foo
1474	  From: Bill@host.com
1475	  To: joe@otherhost.com
1476	  Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
1477	  Subject: Audio mail (part 1 of 2)
1478	  Message-ID: <id1@host.com>
1479	  MIME-Version: 1.0
1480	  Content-type: message/partial; id="ABC@host.com";
1481	                number=1; total=2

1483	  X-Weird-Header-1: Bar
1484	  X-Weird-Header-2: Hello
1485	  Message-ID: <anotherid@foo.com>
1486	  Subject: Audio mail
1487	  MIME-Version: 1.0
1488	  Content-type: audio/basic
1489	  Content-transfer-encoding: base64

1491	    ... first half of encoded audio data goes here ...

1493	and the second half might look something like this:

1495	  From: Bill@host.com
1496	  To: joe@otherhost.com
1497	  Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
1498	  Subject: Audio mail (part 2 of 2)
1499	  MIME-Version: 1.0
1500	  Message-ID: <id2@host.com>
1501	  Content-type: message/partial;
1502	                id="ABC@host.com"; number=2; total=2

1504	    ... second half of encoded audio data goes here ...

1506	Then, when the fragmented message is reassembled, the
1507	resulting message to be displayed to the user should look
1508	something like this:

1510	  X-Weird-Header-1: Foo
1511	  From: Bill@host.com
1512	  To: joe@otherhost.com
1513	  Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
1514	  Subject: Audio mail
1515	  Message-ID: <anotherid@foo.com>
1516	  MIME-Version: 1.0
1517	  Content-type: audio/basic
1518	  Content-transfer-encoding: base64

1520	    ... first half of encoded audio data goes here ...
1521	    ... second half of encoded audio data goes here ...

1523	Because data of type "message" may never be encoded in base64
1524	or quoted-printable, a problem might arise if message/partial
1525	entities are constructed in an environment that supports
1526	binary or 8-bit transport.  The problem is that the binary
1527	data would be split into multiple message/partial messages,
1528	each of them requiring binary transport.  If such messages
1529	were encountered at a gateway into a 7-bit transport
1530	environment, there would be no way to properly encode them for
1531	the 7-bit world, aside from waiting for all of the fragments,
1532	reassembling the inner message, and then encoding the
1533	reassembled data in base64 or quoted-printable.  Since it is
1534	possible that different fragments might go through different
1535	gateways, even this is not an acceptable solution.  For this
1536	reason, it is specified that MIME entities of type
1537	message/partial must always have a content-transfer-encoding
1538	of "7bit" (the default).  In particular, even in environments
1539	that support binary or 8-bit transport, the use of a content-
1540	transfer-encoding of "8bit" or "binary" is explicitly
1541	prohibited for entities of type message/partial.

1543	Because some message transfer agents may choose to
1544	automatically fragment large messages, and because such agents
1545	may use very different fragmentation thresholds, it is
1546	possible that the pieces of a partial message, upon
1547	reassembly, may prove themselves to comprise a partial
1548	message.  This is explicitly permitted.

1550	The inclusion of a "References" field in the headers of the
1551	second and subsequent pieces of a fragmented message that
1552	references the Message-Id on the previous piece may be of
1553	benefit to mail readers that understand and track references.
1554	However, the generation of such "References" fields is
1555	entirely optional.

1557	Finally, it should be noted that the "Encrypted" header field
1558	has been made obsolete by Privacy Enhanced Messaging (PEM)
1559	[RFC1421, RFC1422, RFC1423, and RFC1424], but the rules above
1560	are nevertheless believed to describe the correct way to treat
1561	it if it is encountered in the context of conversion to and
1562	from message/partial fragments.

1564	7.2.3.  External-Body Subtype

1566	The external-body subtype indicates that the actual body data
1567	are not included, but merely referenced.  In this case, the
1568	parameters describe a mechanism for accessing the external
1569	data.

1571	When an entity is of type "message/external-body", it consists
1572	of a header, two consecutive CRLFs, and the message header for
1573	the encapsulated message.  If another pair of consecutive
1574	CRLFs appears, this of course ends the message header for the
1575	encapsulated message.  However, since the encapsulated
1576	message's body is itself external, it does NOT appear in the
1577	area that follows.  For example, consider the following
1578	message:

1580	  Content-type: message/external-body;
1581	                access-type=local-file;
1582	                name="/u/nsb/Me.jpeg"

1584	  Content-type: image/jpeg
1585	  Content-ID: <id42@guppylake.bellcore.com>
1586	  Content-Transfer-Encoding: binary

1588	  THIS IS NOT REALLY THE BODY!

1590	The area at the end, which might be called the "phantom body",
1591	is ignored for most external-body messages.  However, it may
1592	be used to contain auxiliary information for some such
1593	messages, as indeed it is when the access-type is "mail-
1594	server".  The only access-type defined in this document that
1595	uses the phantom body is "mail-server", but other access-types
1596	may be defined in the future in other documents that use this
1597	area.

1599	The encapsulated headers in ALL message/external-body entities
1600	MUST include a Content-ID header field to give a unique
1601	identifier by which to reference the data.  This identifier
1602	may be used for caching mechanisms, and for recognizing the
1603	receipt of the data when the access-type is "mail-server".

1605	Note that, as specified here, the tokens that describe
1606	external-body data, such as file names and mail server
1607	commands, are required to be in the US-ASCII character set.
1608	If this proves problematic in practice, a new mechanism may be
1609	required as a future extension to MIME, either as newly
1610	defined access-types for message/external-body or by some
1611	other mechanism.

1613	As with message/partial, MIME entities of type
1614	message/external-body MUST have a content-transfer-encoding of
1615	"7bit" (the default).  In particular, even in environments that
1616	support binary or 8-bit transport, the use of a content-
1617	transfer-encoding of "8bit" or "binary" is explicitly
1618	prohibited for entities of type message/external-body.

1620	7.2.3.1.  General External-Body Parameters

1622	The parameters that may be used with any message/external-body
1623	are:

1625	 (1)   ACCESS-TYPE -- A word indicating the supported access
1626	       mechanism by which the file or data may be obtained.
1627	       This word is not case sensitive.  Values include, but
1628	       are not limited to, "FTP", "ANON-FTP", "TFTP", "LOCAL-
1629	       FILE", and "MAIL-SERVER".  Future values, except for
1630	       experimental values beginning with "X-", must be
1631	       registered with IANA, as described in RFC MIME-REG.
1632	       This parameter is unconditionally mandatory and MUST be
1633	       present on EVERY message/external-body.

1635	 (2)   EXPIRATION -- The date (in the RFC 822 "date-time"
1636	       syntax, as extended by RFC 1123 to permit 4 digits in
1637	       the year field) after which the existence of the
1638	       external data is not guaranteed.  This parameter may be
1639	       used with ANY access-type and is ALWAYS optional.

1641	 (3)   SIZE -- The size (in octets) of the data.  The intent
1642	       of this parameter is to help the recipient decide
1643	       whether or not to expend the necessary resources to
1644	       retrieve the external data.  Note that this describes
1645	       the size of the data in its canonical form, that is,
1646	       before any Content-Transfer-Encoding has been applied
1647	       or after the data have been decoded.  This parameter
1648	       may be used with ANY access-type and is ALWAYS
1649	       optional.

1651	 (4)   PERMISSION -- A case-insensitive field that indicates
1652	       whether or not it is expected that clients might also
1653	       attempt to overwrite the data.  By default, or if
1654	       permission is "read", the assumption is that they are
1655	       not, and that if the data is retrieved once, it is
1656	       never needed again.  If PERMISSION is "read-write",
1657	       this assumption is invalid, and any local copy must be
1658	       considered no more than a cache.  "Read" and "Read-
1659	       write" are the only defined values of permission.  This
1660	       parameter may be used with ANY access-type and is
1661	       ALWAYS optional.
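
A check of the general parameters might look like the following
Python sketch.  It is illustrative only: the parameters are assumed
to have been parsed into a dictionary of strings already, and the
function name general_params and the shape of its result are
hypothetical.

```python
def general_params(params):
    """Validate the general message/external-body parameters and
    return them in normalized form."""
    # Parameter names are matched without regard to case.
    lowered = {name.lower(): value for name, value in params.items()}
    if "access-type" not in lowered:
        raise ValueError("access-type is mandatory")
    # PERMISSION defaults to "read" and is case-insensitive.
    permission = lowered.get("permission", "read").lower()
    if permission not in ("read", "read-write"):
        raise ValueError("undefined permission: " + permission)
    # SIZE is the octet count of the data in canonical form.
    size = int(lowered["size"]) if "size" in lowered else None
    return {
        "access-type": lowered["access-type"].lower(),
        "permission": permission,
        "size": size,
        "expiration": lowered.get("expiration"),
    }
```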

1663	The precise semantics of the access-types defined here are
1664	described in the sections that follow.

1666	7.2.3.2.  The 'ftp' and 'tftp' Access-Types

1668	An access-type of FTP or TFTP indicates that the message body
1669	is accessible as a file using the FTP [RFC-959] or TFTP [RFC-
1670	783] protocols, respectively.  For these access-types, the
1671	following additional parameters are mandatory:

1673	 (1)   NAME -- The name of the file that contains the actual
1674	       body data.

1676	 (2)   SITE -- A machine from which the file may be obtained,
1677	       using the given protocol.  This must be a fully
1678	       qualified domain name, not a nickname.

1680	 (3)   Before any data are retrieved, using FTP, the user will
1681	       generally need to be asked to provide a login id and a
1682	       password for the machine named by the site parameter.
1683	       For security reasons, such an id and password are not
1684	       specified as content-type parameters, but must be
1685	       obtained from the user.

1687	In addition, the following parameters are optional:

1689	 (1)   DIRECTORY -- A directory from which the data named by
1690	       NAME should be retrieved.

1692	 (2)   MODE -- A case-insensitive string indicating the mode
1693	       to be used when retrieving the information.  The valid
1694	       values for access-type "TFTP" are "NETASCII", "OCTET",
1695	       and "MAIL", as specified by the TFTP protocol [RFC-
1696	       783].  The valid values for access-type "FTP" are
1697	       "ASCII", "EBCDIC", "IMAGE", and "LOCALn" where "n" is a
1698	       decimal integer, typically 8.  These correspond to the
1699	       representation types "A" "E" "I" and "L n" as specified
1700	       by the FTP protocol [RFC-959].  Note that "BINARY" and
1701	       "TENEX" are not valid values for MODE and that "OCTET"
1702	       or "IMAGE" or "LOCAL8" should be used instead.  If MODE
1703	       is not specified, the default value is "NETASCII" for
1704	       TFTP and "ASCII" otherwise.
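
The MODE rules can be sketched in Python as follows.  This is a
minimal illustration; the function name effective_mode is
hypothetical, and any access-type other than "TFTP" is assumed to
follow the FTP rules.

```python
import re

TFTP_MODES = {"NETASCII", "OCTET", "MAIL"}
FTP_MODES = {"ASCII", "EBCDIC", "IMAGE"}


def effective_mode(access_type, mode=None):
    """Return the transfer mode to use, applying the default when
    MODE is absent and rejecting values that are not valid."""
    is_tftp = access_type.upper() == "TFTP"
    if mode is None:
        return "NETASCII" if is_tftp else "ASCII"
    value = mode.upper()  # MODE is case-insensitive
    if is_tftp:
        valid = value in TFTP_MODES
    else:
        # "LOCALn", where n is a decimal integer, is also valid for FTP.
        valid = value in FTP_MODES or re.fullmatch(r"LOCAL[0-9]+", value) is not None
    if not valid:
        raise ValueError("invalid MODE for %s: %s" % (access_type, mode))
    return value
```

Note that "BINARY" is rejected for both protocols, as the text
above requires.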

1706	7.2.3.3.  The 'anon-ftp' Access-Type

1708	The "anon-ftp" access-type is identical to the "ftp" access
1709	type, except that the user need not be asked to provide a name
1710	and password for the specified site.  Instead, the ftp
1711	protocol will be used with login "anonymous" and a password
1712	that corresponds to the user's mail address.

1714	7.2.3.4.  The 'local-file' Access-Type

1716	An access-type of "local-file" indicates that the actual body
1717	is accessible as a file on the local machine.  Two additional
1718	parameters are defined for this access type:

1720	 (1)   NAME -- The name of the file that contains the actual
1721	       body data.  This parameter is mandatory for the
1722	       "local-file" access-type.

1724	 (2)   SITE -- A domain specifier for a machine or set of
1725	       machines that are known to have access to the data
1726	       file.  This optional parameter is used to describe the
1727	       locality of reference for the data, that is, the site
1728	       or sites at which the file is expected to be visible.
1729	       Asterisks may be used for wildcard matching to a part
1730	       of a domain name, such as "*.bellcore.com", to indicate
1731	       a set of machines on which the data should be directly
1732	       visible, while a single asterisk may be used to
1733	       indicate a file that is expected to be universally
1734	       available, e.g., via a global file system.
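
Matching a host against the SITE parameter might be sketched in
Python as follows.  This is illustrative only: the function name
site_matches is hypothetical, and the sketch assumes that an
asterisk may match any run of characters, including dots, which
this document does not state precisely.

```python
import re


def site_matches(pattern, hostname):
    """Return True if hostname falls within the SITE pattern."""
    # Escape the literal pieces and let each "*" match any run of
    # characters.  Domain names are compared without regard to case.
    pieces = [re.escape(piece) for piece in pattern.lower().split("*")]
    regex = ".*".join(pieces)
    return re.fullmatch(regex, hostname.lower()) is not None
```

Under these assumptions "*.bellcore.com" matches
"thumper.bellcore.com" but not "bellcore.com" itself, and a single
"*" matches every host.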

1736	7.2.3.5.  The 'mail-server' Access-Type

1738	The "mail-server" access-type indicates that the actual body
1739	is available from a mail server.  Two additional parameters
1740	are defined for this access-type:

1742	 (1)   SERVER -- The email address of the mail server from
1743	       which the actual body data can be obtained.  This
1744	       parameter is mandatory for the "mail-server" access-
1745	       type.

1747	 (2)   SUBJECT -- The subject that is to be used in the mail
1748	       that is sent to obtain the data.  Note that keying mail
1749	       servers on Subject lines is NOT recommended, but such
1750	       mail servers are known to exist.  This is an optional
1751	       parameter.

1753	Because mail servers accept a variety of syntaxes, some of
1754	which are multiline, the full command to be sent to a mail
1755	server is not included as a parameter in the content-type
1756	header field.  Instead, it is provided as the "phantom body"
1757	when the media type is message/external-body and the access-
1758	type is mail-server.

1760	Note that MIME does not define a mail server syntax.  Rather,
1761	it allows the inclusion of arbitrary mail server commands in
1762	the phantom body.  Implementations must include the phantom
1763	body in the body of the message they send to the mail server
1764	address to retrieve the relevant data.

1766	Unlike other access-types, mail-server access is asynchronous
1767	and will happen at an unpredictable time in the future.  For
1768	this reason, it is important that there be a mechanism by
1769	which the returned data can be matched up with the original
1770	message/external-body entity.  MIME mail servers must use the
1771	same Content-ID field on the returned message that was used in
1772	the original message/external-body entity, to facilitate such
1773	matching.

1775	7.2.3.6.  External-Body Security Issues

1777	Message/external-body entities give rise to two important
1778	security issues:

1780	 (1)   Accessing data via a message/external-body reference
1781	       effectively results in the message recipient performing
1782	       an operation that was specified by the message
1783	       originator.  It is therefore possible for the message
1784	       originator to trick a recipient into doing something
1785	       they would not have done otherwise.  For example, an
1786	       originator could specify an action that attempts
1787	       retrieval of material that the recipient is not
1788	       authorized to obtain, causing the recipient to
1789	       unwittingly violate some security policy.  For this
1790	       reason, user agents capable of resolving external
1791	       references must always take steps to describe the
1792	       action they are to take to the recipient and ask for
1793	       explicit permission prior to performing it.

1795	       The 'mail-server' access-type is particularly
1796	       vulnerable, in that it causes the recipient to send a
1797	       new message whose contents are specified by the
1798	       original message's originator.  Given the potential for
1799	       abuse, any such request messages that are constructed
1800	       should contain a clear indication that they were
1801	       generated automatically (e.g. in a Comments: header
1802	       field) in an attempt to resolve a MIME
1803	       message/external-body reference.

1805	 (2)   MIME will sometimes be used in environments that
1806	       provide some guarantee of message integrity and
1807	       authenticity.  If present, such guarantees may apply
1808	       only to the actual direct content of messages -- they
1809	       may or may not apply to data accessed through MIME's
1810	       message/external-body mechanism.  In particular, it may
1811	       be possible to subvert certain access mechanisms even
1812	       when the messaging system itself is secure.

1814	       It should be noted that this problem exists either with
1815	       or without the availability of MIME mechanisms.  A
1816	       casual reference to an FTP site containing a document
1817	       in the text of a secure message brings up similar
1818	       issues -- the only difference is that MIME provides for
1819	       automatic retrieval of such material, and users may
1820	       place unwarranted trust in such automatic retrieval
1821	       mechanisms.
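
The precautions in item (1) for the 'mail-server' access-type can
be sketched in Python as follows, using the standard library's
email.message.EmailMessage.  This is illustrative only: the
function name build_mail_server_request, the confirm callback, and
the wording of the Comments field are all assumptions, not
requirements of this document.

```python
from email.message import EmailMessage


def build_mail_server_request(server, phantom_body, user_address,
                              confirm, subject=None):
    """Build the request for a mail-server reference, or return
    None if the user declines."""
    # Describe the action and ask for explicit permission first.
    description = "Send mail to %s with body:\n%s" % (server, phantom_body)
    if not confirm(description):
        return None
    request = EmailMessage()
    request["From"] = user_address
    request["To"] = server
    if subject is not None:
        request["Subject"] = subject
    # Mark the message as automatically generated.
    request["Comments"] = ("Generated automatically to resolve a MIME "
                           "message/external-body reference")
    # The phantom body is sent as the body of the request.
    request.set_content(phantom_body)
    return request
```

The confirm argument is any callable that shows the description to
the user and returns True only when the user agrees.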

1823	7.2.3.7.  Examples and Further Explanations

1825	When the external-body mechanism is used in conjunction with
1826	the multipart/alternative media type it extends the
1827	functionality of multipart/alternative to include the case
1828	where the same object is provided in the same format but via
1829	different access mechanisms.  When this is done the originator
1830	of the message must order the parts first in terms of preferred
1831	formats and then by preferred access mechanisms.  The
1832	recipient's viewer should then evaluate the list both in terms
1833	of format and access mechanisms.

1835	With the emerging possibility of very wide-area file systems,
1836	it becomes very hard to know in advance the set of machines
1837	where a file will and will not be accessible directly from the
1838	file system.  Therefore it may make sense to provide both a
1839	file name, to be tried directly, and the name of one or more
1840	sites from which the file is known to be accessible.  An
1841	implementation can try to retrieve remote files using FTP or
1842	any other protocol, using anonymous file retrieval or
1843	prompting the user for the necessary name and password.  If an
1844	external body is accessible via multiple mechanisms, the
1845	sender may include multiple parts of type message/external-
1846	body within an entity of type multipart/alternative.

1848	However, the external-body mechanism is not intended to be
1849	limited to file retrieval, as shown by the mail-server
1850	access-type.  Beyond this, one can imagine, for example, using
1851	a video server for external references to video clips.

1853	The embedded message header fields which appear in the body of
1854	the message/external-body data must be used to declare the
1855	media type of the external body if it is anything other than
1856	plain US-ASCII text, since the external body does not have a
1857	header section to declare its type.  Similarly, any Content-
1858	transfer-encoding other than "7bit" must also be declared
1859	here.  Thus a complete message/external-body message,
1860	referring to a document in PostScript format, might look like
1861	this:

1863	  From: Whomever
1864	  To: Someone
1865	  Date: Whenever
1866	  Subject: whatever
1867	  MIME-Version: 1.0
1868	  Message-ID: <id1@host.com>
1869	  Content-Type: multipart/alternative; boundary=42
1870	  Content-ID: <id001@guppylake.bellcore.com>

1872	  --42
1873	  Content-Type: message/external-body; name="BodyFormats.ps";
1874	                site="thumper.bellcore.com"; mode="image";
1875	                access-type=ANON-FTP; directory="pub";
1876	                expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"

1878	  Content-type: application/postscript
1879	  Content-ID: <id42@guppylake.bellcore.com>

1881	  --42
1882	  Content-Type: message/external-body; access-type=local-file;
1883	                name="/u/nsb/writing/rfcs/RFC-MIME.ps";
1884	                site="thumper.bellcore.com";
1885	                expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"

1887	  Content-type: application/postscript
1888	  Content-ID: <id42@guppylake.bellcore.com>

1890	  --42
1891	  Content-Type: message/external-body;
1892	                access-type=mail-server;
1893	                server="listserv@bogus.bitnet";
1894	                expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"

1896	  Content-type: application/postscript
1897	  Content-ID: <id42@guppylake.bellcore.com>

1899	  get RFC-MIME.DOC

1901	  --42--

1903	Note that in the above examples, the default Content-
1904	transfer-encoding of "7bit" is assumed for the external
1905	PostScript data.

1907	Like the message/partial type, the message/external-body media
1908	type is intended to be transparent, that is, to convey the
1909	data type in the external body rather than to convey a message
1910	with a body of that type.  Thus the headers on the outer and
1911	inner parts must be merged using the same rules as for
1912	message/partial.  In particular, this means that the Content-
1913	type header is overridden, but the From and Subject headers
1914	are preserved.
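
The merge described above can be illustrated with a minimal Python
sketch.  It encodes only the two outcomes stated in this section:
the "Content-" fields of the outer entity are replaced by those
embedded in the message/external-body data, and all other outer
fields, such as From and Subject, are kept.  The function name
merge_external_body_headers and the (name, value) pair
representation are assumptions made for illustration.  The complete
rules are those given for message/partial and are not reproduced
here.

```python
def merge_external_body_headers(outer, inner):
    """Merge header fields for a message/external-body entity.

    outer and inner are lists of (name, value) pairs.  Simplifying
    assumption: only "Content-" fields are taken from the inner
    (embedded) header; every other field comes from the outer header.
    """
    def is_content_field(name):
        return name.lower().startswith("content-")

    kept = [(n, v) for n, v in outer if not is_content_field(n)]
    taken = [(n, v) for n, v in inner if is_content_field(n)]
    return kept + taken


outer = [
    ("From", "Whomever"),
    ("Subject", "whatever"),
    ("Content-Type", "message/external-body; access-type=local-file"),
]
inner = [
    ("Content-type", "application/postscript"),
    ("Content-ID", "<id42@guppylake.bellcore.com>"),
]
merged = merge_external_body_headers(outer, inner)
```

Here merged keeps From and Subject from the outer header, while the
media type seen by the recipient becomes application/postscript.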

1916	Note that since the external bodies are not transported along
1917	with the external body reference, they need not conform to
1918	transport limitations that apply to the reference itself. In
1919	particular, Internet mail transports may impose 7-bit and line
1920	length limits, but these do not automatically apply to binary
1921	external bodies. Thus a Content-Transfer-Encoding is not
1922	generally necessary, though it is permitted.

1924	Note that the body of a message of type "message/external-
1925	body" is governed by the basic syntax for an RFC 822 message.
1926	In particular, anything before the first consecutive pair of
1927	CRLFs is header information, while anything after it is body
1928	information, which is ignored for most access-types.
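
As a rough illustration of this rule, the following Python sketch
splits the body of a message/external-body entity at the first
blank line, using the standard library's email parser.  The
function name split_external_body and the sample BODY string
(adapted from the mail-server example above) are assumptions made
for illustration.  The defaults applied when a field is absent
follow the text: plain text for the media type and "7bit" for the
transfer encoding.

```python
from email import message_from_string


def split_external_body(body):
    """Split the body of a message/external-body entity.

    Returns (media_type, transfer_encoding, phantom_body).  Everything
    before the first blank line is parsed as header fields; the rest
    is the body text, which most access-types ignore.
    """
    embedded = message_from_string(body)
    # get_content_type() falls back to text/plain when no
    # Content-Type field is present.
    media_type = embedded.get_content_type()
    encoding = embedded.get("Content-Transfer-Encoding", "7bit")
    return media_type, encoding.strip().lower(), embedded.get_payload()


BODY = (
    "Content-type: application/postscript\n"
    "Content-ID: <id42@guppylake.bellcore.com>\n"
    "\n"
    "get RFC-MIME.DOC\n"
)

media_type, encoding, phantom = split_external_body(BODY)
```

For the mail-server access-type, the phantom body is the text of
the request to be sent to the server.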

1930	7.2.4.  Other Message Subtypes

1932	MIME implementations must in general treat unrecognized
1933	subtypes of message as being equivalent to
1934	"application/octet-stream".
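
A minimal Python sketch of this fallback follows.  The function
name effective_media_type and the KNOWN_MESSAGE_SUBTYPES set are
assumptions made for illustration; a real implementation would list
whichever message subtypes it actually supports.

```python
# Hypothetical set of message subtypes this implementation handles.
KNOWN_MESSAGE_SUBTYPES = {"rfc822", "partial", "external-body"}


def effective_media_type(content_type):
    """Return the media type to act on for a type/subtype string.

    Unrecognized subtypes of "message" are treated as
    application/octet-stream; other values pass through unchanged
    apart from case folding.
    """
    maintype, _, subtype = content_type.strip().lower().partition("/")
    if maintype == "message" and subtype not in KNOWN_MESSAGE_SUBTYPES:
        return "application/octet-stream"
    return maintype + "/" + subtype
```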

1936	8.  Experimental Media Type Values

1938	A media type value beginning with the characters "X-" is a
1939	private value, to be used by consenting systems by mutual
1940	agreement.  Any format without a rigorous and public
1941	definition must be named with an "X-" prefix, and publicly
1942	specified values shall never begin with "X-".  (Older versions
1943	of the widely used Andrew system use the "X-BE2" name, so new
1944	systems should probably choose a different name.)

1946	In general, the use of "X-" top-level types is strongly
1947	discouraged.  Implementors should invent subtypes of the
1948	existing types whenever possible. In many cases, a subtype of
1949	application will be more appropriate than a new top-level
1950	type.

1952	9.  Summary

1954	The five discrete media types provide a standardized
1955	mechanism for tagging messages or body parts as audio, image,
1956	or several other kinds of data.  The composite "multipart" and
1957	"message" media types allow mixing and hierarchical
1958	structuring of objects of different types in a single message.
1959	A distinguished parameter syntax allows further specification
1960	of data format details, particularly the specification of
1961	alternate character sets. Additional optional header fields
1962	provide mechanisms for certain extensions deemed desirable by
1963	many implementors. Finally, a number of useful media types are
1964	defined for general use by consenting user agents, notably
1965	message/partial and message/external-body.

1967	10.  Security Considerations

1969	Security issues are discussed in the context of the
1970	application/postscript type, the message/external-body type,
1971	and in RFC MIME-REG.  Implementors should pay special
1972	attention to the security implications of any media types that
1973	can cause the remote execution of any actions in the
1974	recipient's environment.  In such cases, the discussion of the
1975	application/postscript type may serve as a model for
1976	considering other media types with remote execution
1977	capabilities.

1979	11.  Authors' Addresses

1981	For more information, the authors of this document are best
1982	contacted via Internet mail:

1984	Nathaniel S. Borenstein
1985	First Virtual Holdings
1986	25 Washington Avenue
1987	Morristown, NJ 07960
1988	USA

1990	Email: nsb@nsb.fv.com
1991	Phone: +1 201 540 8967
1992	Fax:   +1 201 993 3032

1994	Ned Freed
1995	Innosoft International, Inc.
1996	1050 East Garvey Avenue South
1997	West Covina, CA 91790
1998	USA

2000	Email: ned@innosoft.com
2001	Phone: +1 818 919 3600
2002	Fax:   +1 818 919 3614

2004	MIME is a result of the work of the Internet Engineering Task
2005	Force Working Group on Email Extensions.  The chairman of that
2006	group, Greg Vaudreuil, may be reached at:

2008	Gregory M. Vaudreuil
2009	Tigon Corporation
2010	17060 Dallas Parkway
2011	Dallas, Texas 75248

2013	Email: greg.vaudreuil@ons.octel.com
2014	Phone: +1 214 733 2722

2015	               Appendix A -- Collected Grammar

2017	This appendix contains the complete BNF grammar for all the
2018	syntax specified by this document.

2020	By itself, however, this grammar is incomplete.  It refers to
2021	several entities that are defined by RFC 822.  Rather than
2022	reproduce those definitions here, and risk unintentional
2023	differences between the two, this document simply refers the
2024	reader to RFC 822 for the remaining definitions.  Wherever a
2025	term is undefined, it refers to the RFC 822 definition.

2027	  boundary := 0*69<bchars> bcharsnospace

2029	  bchars := bcharsnospace / " "

2031	  bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" /
2032	                   "+" / "_" / "," / "-" / "." /
2033	                   "/" / ":" / "=" / "?"

2035	  body-part := <"message" as defined in RFC 822, with all
2036	                header fields optional, not starting with the
2037	                specified dash-boundary, and with the
2038	                delimiter not occurring anywhere in the
2039	                body part.  Note that the semantics of a
2040	                part differ from the semantics of a message,
2041	                as described in the text.>

2043	  close-delimiter := delimiter "--"

2045	  dash-boundary := "--" boundary
2046	                   ; boundary taken from the value of
2047	                   ; boundary parameter of the
2048	                   ; Content-Type field.

2050	  delimiter := CRLF dash-boundary

2052	  discard-text := *(*text CRLF)
2053	                  ; To be ignored upon receipt.

2055	  encapsulation := delimiter transport-padding
2056	                   CRLF body-part

2058	  epilogue := discard-text

2060	  multipart-body := [preamble CRLF]
2061	                    dash-boundary transport-padding CRLF
2062	                    body-part *encapsulation
2063	                    close-delimiter transport-padding
2064	                    [CRLF epilogue]

2066	  preamble := discard-text

2068	  transport-padding := *LWSP-char
2069	                       ; Composers MUST NOT generate
2070	                       ; non-zero length transport
2071	                       ; padding, but receivers MUST
2072	                       ; be able to handle padding
2073	                       ; added by message transports.
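
The boundary, bchars, and bcharsnospace productions above can be
checked mechanically.  The following Python sketch is one possible
translation into a regular expression; the names is_valid_boundary
and dash_boundary are assumptions made for illustration, and the
sketch does not cover the rest of the grammar.

```python
import re

# bcharsnospace: DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_" /
#                "," / "-" / "." / "/" / ":" / "=" / "?"
_BCHARS_NOSPACE = r"0-9A-Za-z'()+_,\-./:=?"

# boundary := 0*69<bchars> bcharsnospace
# That is: up to 69 characters that may include space, followed by
# one final character that must not be a space (1 to 70 in total).
_BOUNDARY_RE = re.compile(
    "[" + _BCHARS_NOSPACE + " ]{0,69}[" + _BCHARS_NOSPACE + "]"
)


def is_valid_boundary(value):
    """Return True if value matches the boundary production."""
    return _BOUNDARY_RE.fullmatch(value) is not None


def dash_boundary(boundary):
    """Build the dash-boundary line content: "--" boundary."""
    if not is_valid_boundary(boundary):
        raise ValueError("not a valid boundary: %r" % (boundary,))
    return "--" + boundary
```

A composer could call is_valid_boundary before emitting a boundary
parameter, for example to reject a candidate value that ends in a
space or exceeds 70 characters.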